21 Apr, 2020

1 commit

  • [ Upstream commit 4faab8c446def7667adf1f722456c2f4c304069c ]

    The hsr code supports only protocol versions 0 and 1, but it doesn't
    validate the version it receives from userspace.

    Test commands:
    ip link add dummy0 type dummy
    ip link add dummy1 type dummy
    ip link add hsr0 type hsr slave1 dummy0 slave2 dummy1 version 4

    In the test commands, version 4 is invalid,
    so the command should fail.

    After this patch, the following error is reported:
    "Error: hsr: Only versions 0..1 are supported."

    Fixes: ee1c27977284 ("net/hsr: Added support for HSR v1")
    Signed-off-by: Taehee Yoo
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Taehee Yoo
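
For illustration, the range check described in the commit above can be modelled in plain userspace C. This is a minimal sketch, not the kernel patch: `hsr_check_version()` and `HSR_MAX_VERSION` are hypothetical names, and the out-parameter for the message is an assumption made so the example stays self-contained (the kernel presumably reports the message through its netlink error path).

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical constant: highest protocol version this sketch accepts. */
#define HSR_MAX_VERSION 1

/*
 * Validate a protocol version supplied by userspace.
 * Returns 0 when the version is 0 or 1. Otherwise returns -EINVAL and,
 * if errmsg is non-NULL, stores a human-readable reason in *errmsg.
 */
int hsr_check_version(unsigned int version, const char **errmsg)
{
	if (version > HSR_MAX_VERSION) {
		if (errmsg)
			*errmsg = "Only versions 0..1 are supported.";
		return -EINVAL;
	}
	return 0;
}
```

With this check in place, a request such as `version 4` is rejected before any device state is created.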
     

01 Apr, 2020

4 commits

  • [ Upstream commit 09e91dbea0aa32be02d8877bd50490813de56b9a ]

    The hsr module supports the node list and node status commands
    (HSR_C_GET_NODE_LIST and HSR_C_GET_NODE_STATUS).
    These commands send node information to userspace via generic netlink.
    In a non-init_net namespace, however, these commands are not allowed
    because the .netnsok flag is false.
    So there is no way to get node information in a non-init_net namespace.

    Fixes: f421436a591d ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)")
    Signed-off-by: Taehee Yoo
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Taehee Yoo
     
  • [ Upstream commit ca19c70f5225771c05bcdcb832b4eb84d7271c5e ]

    hsr_get_node_list() sends node addresses to userspace.
    If there are too many nodes, it can fail because the buffer is too small.
    To avoid this failure, a restart routine is added.

    Fixes: f421436a591d ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)")
    Signed-off-by: Taehee Yoo
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Taehee Yoo
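
The restart idea from the commit above can be sketched as a resumable dump: fill a fixed-size buffer with as many addresses as fit, remember where the dump stopped, and continue from there on the next call. This is a userspace model under stated assumptions, not the kernel code: `struct node_addr`, `dump_node_addrs()` and the cursor parameter are hypothetical, and real generic netlink messages carry attributes rather than raw bytes.

```c
#include <stddef.h>
#include <string.h>

#define ADDR_LEN 6 /* length of a MAC address in bytes */

/* Hypothetical stand-in for one entry of the node table. */
struct node_addr {
	unsigned char addr[ADDR_LEN];
};

/*
 * Copy as many node addresses as fit into buf, starting at index *cursor.
 * Returns the number of bytes written and advances *cursor, so the caller
 * can send the buffer and call again until *cursor == n_nodes.
 */
size_t dump_node_addrs(const struct node_addr *nodes, size_t n_nodes,
		       size_t *cursor, unsigned char *buf, size_t buf_len)
{
	size_t used = 0;

	/* Stop when the list is exhausted or no whole address fits. */
	while (*cursor < n_nodes && buf_len - used >= ADDR_LEN) {
		memcpy(buf + used, nodes[*cursor].addr, ADDR_LEN);
		used += ADDR_LEN;
		(*cursor)++;
	}
	return used;
}
```

A caller loops until the cursor reaches the number of nodes, sending one buffer per iteration, so a large node table no longer makes the whole request fail.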
     
  • [ Upstream commit 173756b86803655d70af7732079b3aa935e6ab68 ]

    hsr_get_node_{list/status}() are not called under rtnl_lock() because
    they are generic netlink callbacks.
    But they use __dev_get_by_index() without holding rtnl_lock(),
    so they may access unsafe data.
    To fix this, rcu_read_lock() and dev_get_by_index_rcu()
    are used instead of __dev_get_by_index().

    Fixes: f421436a591d ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)")
    Signed-off-by: Taehee Yoo
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Taehee Yoo
     
  • [ Upstream commit 3a303cfdd28d5f930a307c82e8a9d996394d5ebd ]

    port->hsr is used in hsr_handle_frame(), which is the
    rx_handler callback.
    The hsr master and slaves are initialized in hsr_add_port().
    This function initializes several pointers, including port->hsr, only
    after registering the rx_handler.
    So the rx_handler routine can use an uninitialized pointer.
    To fix this, the pointers should be initialized before
    registering the rx_handler.

    Test commands:
    ip netns del left
    ip netns del right
    modprobe -rv veth
    modprobe -rv hsr
    killall ping
    modprobe hsr
    ip netns add left
    ip netns add right
    ip link add veth0 type veth peer name veth1
    ip link add veth2 type veth peer name veth3
    ip link add veth4 type veth peer name veth5
    ip link set veth1 netns left
    ip link set veth3 netns right
    ip link set veth4 netns left
    ip link set veth5 netns right
    ip link set veth0 up
    ip link set veth2 up
    ip link set veth0 address fc:00:00:00:00:01
    ip link set veth2 address fc:00:00:00:00:02
    ip netns exec left ip link set veth1 up
    ip netns exec left ip link set veth4 up
    ip netns exec right ip link set veth3 up
    ip netns exec right ip link set veth5 up
    ip link add hsr0 type hsr slave1 veth0 slave2 veth2
    ip a a 192.168.100.1/24 dev hsr0
    ip link set hsr0 up
    ip netns exec left ip link add hsr1 type hsr slave1 veth1 slave2 veth4
    ip netns exec left ip a a 192.168.100.2/24 dev hsr1
    ip netns exec left ip link set hsr1 up
    ip netns exec left ip n a 192.168.100.1 dev hsr1 lladdr \
    fc:00:00:00:00:01 nud permanent
    ip netns exec left ip n r 192.168.100.1 dev hsr1 lladdr \
    fc:00:00:00:00:01 nud permanent
    for i in {1..100}
    do
    ip netns exec left ping 192.168.100.1 &
    done
    ip netns exec left hping3 192.168.100.1 -2 --flood &
    ip netns exec right ip link add hsr2 type hsr slave1 veth3 slave2 veth5
    ip netns exec right ip a a 192.168.100.3/24 dev hsr2
    ip netns exec right ip link set hsr2 up
    ip netns exec right ip n a 192.168.100.1 dev hsr2 lladdr \
    fc:00:00:00:00:02 nud permanent
    ip netns exec right ip n r 192.168.100.1 dev hsr2 lladdr \
    fc:00:00:00:00:02 nud permanent
    for i in {1..100}
    do
    ip netns exec right ping 192.168.100.1 &
    done
    ip netns exec right hping3 192.168.100.1 -2 --flood &
    while :
    do
    ip link add hsr0 type hsr slave1 veth0 slave2 veth2
    ip a a 192.168.100.1/24 dev hsr0
    ip link set hsr0 up
    ip link del hsr0
    done

    Splat looks like:
    [ 120.954938][ C0] general protection fault, probably for non-canonical address 0xdffffc0000000006: 0000 [#1]I
    [ 120.957761][ C0] KASAN: null-ptr-deref in range [0x0000000000000030-0x0000000000000037]
    [ 120.959064][ C0] CPU: 0 PID: 1511 Comm: hping3 Not tainted 5.6.0-rc5+ #460
    [ 120.960054][ C0] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
    [ 120.962261][ C0] RIP: 0010:hsr_addr_is_self+0x65/0x2a0 [hsr]
    [ 120.963149][ C0] Code: 44 24 18 70 73 2f c0 48 c1 eb 03 48 8d 04 13 c7 00 f1 f1 f1 f1 c7 40 04 00 f2 f2 f2 4
    [ 120.966277][ C0] RSP: 0018:ffff8880d9c09af0 EFLAGS: 00010206
    [ 120.967293][ C0] RAX: 0000000000000006 RBX: 1ffff1101b38135f RCX: 0000000000000000
    [ 120.968516][ C0] RDX: dffffc0000000000 RSI: ffff8880d17cb208 RDI: 0000000000000000
    [ 120.969718][ C0] RBP: 0000000000000030 R08: ffffed101b3c0e3c R09: 0000000000000001
    [ 120.972203][ C0] R10: 0000000000000001 R11: ffffed101b3c0e3b R12: 0000000000000000
    [ 120.973379][ C0] R13: ffff8880aaf80100 R14: ffff8880aaf800f2 R15: ffff8880aaf80040
    [ 120.974410][ C0] FS: 00007f58e693f740(0000) GS:ffff8880d9c00000(0000) knlGS:0000000000000000
    [ 120.979794][ C0] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 120.980773][ C0] CR2: 00007ffcb8b38f29 CR3: 00000000afe8e001 CR4: 00000000000606f0
    [ 120.981945][ C0] Call Trace:
    [ 120.982411][ C0]
    [ 120.982848][ C0] ? hsr_add_node+0x8c0/0x8c0 [hsr]
    [ 120.983522][ C0] ? rcu_read_lock_held+0x90/0xa0
    [ 120.984159][ C0] ? rcu_read_lock_sched_held+0xc0/0xc0
    [ 120.984944][ C0] hsr_handle_frame+0x1db/0x4e0 [hsr]
    [ 120.985597][ C0] ? hsr_nl_nodedown+0x2b0/0x2b0 [hsr]
    [ 120.986289][ C0] __netif_receive_skb_core+0x6bf/0x3170
    [ 120.992513][ C0] ? check_chain_key+0x236/0x5d0
    [ 120.993223][ C0] ? do_xdp_generic+0x1460/0x1460
    [ 120.993875][ C0] ? register_lock_class+0x14d0/0x14d0
    [ 120.994609][ C0] ? __netif_receive_skb_one_core+0x8d/0x160
    [ 120.995377][ C0] __netif_receive_skb_one_core+0x8d/0x160
    [ 120.996204][ C0] ? __netif_receive_skb_core+0x3170/0x3170
    [ ... ]

    Reported-by: syzbot+fcf5dd39282ceb27108d@syzkaller.appspotmail.com
    Fixes: c5a759117210 ("net/hsr: Use list_head (and rcu) instead of array for slave devices.")
    Signed-off-by: Taehee Yoo
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Taehee Yoo
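
The ordering rule from the commit above (initialize everything a callback reads, then make the callback reachable) can be shown with a small single-threaded model. All names here (`struct port_model`, `add_port_model()`, `handle_frame_model()`) are hypothetical, and the sketch deliberately ignores the memory-ordering guarantees that real concurrent code needs when it publishes a handler.

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-device private data. */
struct hsr_priv_model {
	int id;
};

struct port_model {
	struct hsr_priv_model *hsr;                 /* read by the handler */
	int (*rx_handler)(struct port_model *port); /* NULL until registered */
};

/* Handler that dereferences port->hsr, as a receive path would. */
int handle_frame_model(struct port_model *port)
{
	return port->hsr->id;
}

/* Set up a port: fields first, handler registration last. */
void add_port_model(struct port_model *port, struct hsr_priv_model *hsr)
{
	/* Initialize every field the handler reads ... */
	port->hsr = hsr;
	/* ... and only then expose the handler to callers. */
	port->rx_handler = handle_frame_model;
}
```

If the two assignments were swapped, a frame arriving between them would reach `handle_frame_model()` with `port->hsr` still unset, which is the failure mode the splat shows.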
     

11 Feb, 2020

1 commit

  • [ Upstream commit 2b5b8251bc9fe2f9118411f037862ee17cf81e97 ]

    hsr_port_get_rcu() can return NULL, so we need to be careful.

    general protection fault, probably for non-canonical address 0xdffffc0000000006: 0000 [#1] PREEMPT SMP KASAN
    KASAN: null-ptr-deref in range [0x0000000000000030-0x0000000000000037]
    CPU: 1 PID: 10249 Comm: syz-executor.5 Not tainted 5.5.0-syzkaller #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    RIP: 0010:__read_once_size include/linux/compiler.h:199 [inline]
    RIP: 0010:hsr_addr_is_self+0x86/0x330 net/hsr/hsr_framereg.c:44
    Code: 04 00 f3 f3 f3 65 48 8b 04 25 28 00 00 00 48 89 45 d0 31 c0 e8 6b ff 94 f9 4c 89 f2 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 3c 02 00 0f 85 75 02 00 00 48 8b 43 30 49 39 c6 49 89 47 c0 0f
    RSP: 0018:ffffc90000da8a90 EFLAGS: 00010206
    RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff87e0cc33
    RDX: 0000000000000006 RSI: ffffffff87e035d5 RDI: 0000000000000000
    RBP: ffffc90000da8b20 R08: ffff88808e7de040 R09: ffffed1015d2707c
    R10: ffffed1015d2707b R11: ffff8880ae9383db R12: ffff8880a689bc5e
    R13: 1ffff920001b5153 R14: 0000000000000030 R15: ffffc90000da8af8
    FS: 00007fd7a42be700(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000001b32338000 CR3: 00000000a928c000 CR4: 00000000001406e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:

    hsr_handle_frame+0x1c5/0x630 net/hsr/hsr_slave.c:31
    __netif_receive_skb_core+0xfbc/0x30b0 net/core/dev.c:5099
    __netif_receive_skb_one_core+0xa8/0x1a0 net/core/dev.c:5196
    __netif_receive_skb+0x2c/0x1d0 net/core/dev.c:5312
    process_backlog+0x206/0x750 net/core/dev.c:6144
    napi_poll net/core/dev.c:6582 [inline]
    net_rx_action+0x508/0x1120 net/core/dev.c:6650
    __do_softirq+0x262/0x98c kernel/softirq.c:292
    do_softirq_own_stack+0x2a/0x40 arch/x86/entry/entry_64.S:1082

    Fixes: c5a759117210 ("net/hsr: Use list_head (and rcu) instead of array for slave devices.")
    Signed-off-by: Eric Dumazet
    Reported-by: syzbot
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
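
A lookup that can return NULL needs a check at every call site before the result is dereferenced. The following userspace sketch shows that pattern only. It is not the patch itself: `struct dev_model`, `port_get()` and the `RX_*` results are hypothetical names chosen for the example.

```c
#include <stddef.h>

enum rx_result {
	RX_PASS,     /* frame is left to the rest of the stack */
	RX_CONSUMED  /* frame was handled by this layer */
};

struct port_model {
	int id;
};

struct dev_model {
	struct port_model *rx_handler_data; /* NULL when not a port */
};

/* Hypothetical lookup: may return NULL. */
struct port_model *port_get(const struct dev_model *dev)
{
	return dev->rx_handler_data;
}

/* Receive path that tolerates a missing port. */
enum rx_result handle_frame(const struct dev_model *dev)
{
	struct port_model *port = port_get(dev);

	if (!port)
		return RX_PASS; /* nothing to dereference: pass the frame on */

	return RX_CONSUMED;
}
```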
     

29 Jan, 2020

1 commit

  • commit 80892772c4edac88c538165d26a0105f19b61c1c upstream.

    A compilation error happens when building branch 5.5-rc7:

    In file included from net/hsr/hsr_main.c:12:0:
    net/hsr/hsr_main.h:194:20: error: two or more data types in declaration specifiers
    static inline void void hsr_debugfs_rename(struct net_device *dev)

    So remove one void.

    Fixes: 4c2d5e33dcd3 ("hsr: rename debugfs file when interface name is changed")
    Signed-off-by: xiaofeng.yan
    Acked-by: Taehee Yoo
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    xiaofeng.yan
     

18 Jan, 2020

4 commits

  • commit 04b69426d846cd04ca9acefff1ea39e1c64d2714 upstream.

    hsr slave interfaces don't have a debugfs directory.
    So hsr_debugfs_rename() shouldn't be called when the name of an hsr slave
    interface is changed.

    Test commands:
    ip link add dummy0 type dummy
    ip link add dummy1 type dummy
    ip link add hsr0 type hsr slave1 dummy0 slave2 dummy1
    ip link set dummy0 name ap

    Splat looks like:
    [21071.899367][T22666] ap: renamed from dummy0
    [21071.914005][T22666] ==================================================================
    [21071.919008][T22666] BUG: KASAN: slab-out-of-bounds in hsr_debugfs_rename+0xaa/0xb0 [hsr]
    [21071.923640][T22666] Read of size 8 at addr ffff88805febcd98 by task ip/22666
    [21071.926941][T22666]
    [21071.927750][T22666] CPU: 0 PID: 22666 Comm: ip Not tainted 5.5.0-rc2+ #240
    [21071.929919][T22666] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
    [21071.935094][T22666] Call Trace:
    [21071.935867][T22666] dump_stack+0x96/0xdb
    [21071.936687][T22666] ? hsr_debugfs_rename+0xaa/0xb0 [hsr]
    [21071.937774][T22666] print_address_description.constprop.5+0x1be/0x360
    [21071.939019][T22666] ? hsr_debugfs_rename+0xaa/0xb0 [hsr]
    [21071.940081][T22666] ? hsr_debugfs_rename+0xaa/0xb0 [hsr]
    [21071.940949][T22666] __kasan_report+0x12a/0x16f
    [21071.941758][T22666] ? hsr_debugfs_rename+0xaa/0xb0 [hsr]
    [21071.942674][T22666] kasan_report+0xe/0x20
    [21071.943325][T22666] hsr_debugfs_rename+0xaa/0xb0 [hsr]
    [21071.944187][T22666] hsr_netdev_notify+0x1fe/0x9b0 [hsr]
    [21071.945052][T22666] ? __module_text_address+0x13/0x140
    [21071.945897][T22666] notifier_call_chain+0x90/0x160
    [21071.946743][T22666] dev_change_name+0x419/0x840
    [21071.947496][T22666] ? __read_once_size_nocheck.constprop.6+0x10/0x10
    [21071.948600][T22666] ? netdev_adjacent_rename_links+0x280/0x280
    [21071.949577][T22666] ? __read_once_size_nocheck.constprop.6+0x10/0x10
    [21071.950672][T22666] ? lock_downgrade+0x6e0/0x6e0
    [21071.951345][T22666] ? do_setlink+0x811/0x2ef0
    [21071.951991][T22666] do_setlink+0x811/0x2ef0
    [21071.952613][T22666] ? is_bpf_text_address+0x81/0xe0
    [ ... ]

    Reported-by: syzbot+9328206518f08318a5fd@syzkaller.appspotmail.com
    Fixes: 4c2d5e33dcd3 ("hsr: rename debugfs file when interface name is changed")
    Signed-off-by: Taehee Yoo
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Taehee Yoo
     
  • commit 3ed0a1d563903bdb4b4c36c58c4d9c1bcb23a6e6 upstream.

    The supervision frame is an L2 frame.
    When a supervision frame is created, the hsr module doesn't set the
    network header.
    If a tap is enabled, dev_queue_xmit_nit() is called and checks
    network_header. If the network_header pointer wasn't set (or is invalid),
    it resets network_header and warns.
    To avoid this unnecessary warning message, the hsr module needs to
    reset network_header itself.

    Test commands:
    ip netns add nst
    ip link add veth0 type veth peer name veth1
    ip link add veth2 type veth peer name veth3
    ip link set veth1 netns nst
    ip link set veth3 netns nst
    ip link set veth0 up
    ip link set veth2 up
    ip link add hsr0 type hsr slave1 veth0 slave2 veth2
    ip a a 192.168.100.1/24 dev hsr0
    ip link set hsr0 up
    ip netns exec nst ip link set veth1 up
    ip netns exec nst ip link set veth3 up
    ip netns exec nst ip link add hsr1 type hsr slave1 veth1 slave2 veth3
    ip netns exec nst ip a a 192.168.100.2/24 dev hsr1
    ip netns exec nst ip link set hsr1 up
    tcpdump -nei veth0

    Splat looks like:
    [ 175.852292][ C3] protocol 88fb is buggy, dev veth0

    Fixes: f421436a591d ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)")
    Signed-off-by: Taehee Yoo
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Taehee Yoo
     
  • commit 4c2d5e33dcd3a6333a7895be3b542ff3d373177c upstream.

    An hsr interface has its own debugfs file, whose name is the same as the
    interface name.
    So when the interface name is changed, the debugfs file name should be
    changed too.

    Fixes: fc4ecaeebd26 ("net: hsr: add debugfs support for display node list")
    Signed-off-by: Taehee Yoo
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Taehee Yoo
     
  • commit c6c4ccd7f96993e106dfea7ef18127f972f2db5e upstream.

    In the current hsr code, when an hsr interface is created, it creates a
    debugfs directory directly under /sys/kernel/debug/.
    If a directory or file with the same name already exists there, creation
    fails.
    To reduce the chance of that failure,
    this patch adds a root directory.

    Test commands:
    ip link add dummy0 type dummy
    ip link add dummy1 type dummy
    ip link add hsr0 type hsr slave1 dummy0 slave2 dummy1

    Before this patch:
    /sys/kernel/debug/hsr0/node_table

    After this patch:
    /sys/kernel/debug/hsr/hsr0/node_table

    Signed-off-by: Taehee Yoo
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Taehee Yoo
     

09 Jan, 2020

3 commits

  • [ Upstream commit 92a35678ec075100ce666a2fb6969151affb0e5d ]

    hsr nodes are protected by RCU, but there is no write-side lock.
    Node insertions and deletions can run concurrently,
    so write-side locking is needed.

    Test commands:
    ip netns add nst
    ip link add veth0 type veth peer name veth1
    ip link add veth2 type veth peer name veth3
    ip link set veth1 netns nst
    ip link set veth3 netns nst
    ip link set veth0 up
    ip link set veth2 up
    ip link add hsr0 type hsr slave1 veth0 slave2 veth2
    ip a a 192.168.100.1/24 dev hsr0
    ip link set hsr0 up
    ip netns exec nst ip link set veth1 up
    ip netns exec nst ip link set veth3 up
    ip netns exec nst ip link add hsr1 type hsr slave1 veth1 slave2 veth3
    ip netns exec nst ip a a 192.168.100.2/24 dev hsr1
    ip netns exec nst ip link set hsr1 up

    for i in {0..9}
    do
    for j in {0..9}
    do
    for k in {0..9}
    do
    for l in {0..9}
    do
    arping 192.168.100.2 -I hsr0 -s 00:01:3$i:4$j:5$k:6$l -c1 &
    done
    done
    done
    done

    Splat looks like:
    [ 236.066091][ T3286] list_add corruption. next->prev should be prev (ffff8880a5940300), but was ffff8880a5940d0.
    [ 236.069617][ T3286] ------------[ cut here ]------------
    [ 236.070545][ T3286] kernel BUG at lib/list_debug.c:25!
    [ 236.071391][ T3286] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
    [ 236.072343][ T3286] CPU: 0 PID: 3286 Comm: arping Tainted: G W 5.5.0-rc1+ #209
    [ 236.073463][ T3286] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
    [ 236.074695][ T3286] RIP: 0010:__list_add_valid+0x74/0xd0
    [ 236.075499][ T3286] Code: 48 39 da 75 27 48 39 f5 74 36 48 39 dd 74 31 48 83 c4 08 b8 01 00 00 00 5b 5d c3 48 b
    [ 236.078277][ T3286] RSP: 0018:ffff8880aaa97648 EFLAGS: 00010286
    [ 236.086991][ T3286] RAX: 0000000000000075 RBX: ffff8880d4624c20 RCX: 0000000000000000
    [ 236.088000][ T3286] RDX: 0000000000000075 RSI: 0000000000000008 RDI: ffffed1015552ebf
    [ 236.098897][ T3286] RBP: ffff88809b53d200 R08: ffffed101b3c04f9 R09: ffffed101b3c04f9
    [ 236.099960][ T3286] R10: 00000000308769a1 R11: ffffed101b3c04f8 R12: ffff8880d4624c28
    [ 236.100974][ T3286] R13: ffff8880d4624c20 R14: 0000000040310100 R15: ffff8880ce17ee02
    [ 236.138967][ T3286] FS: 00007f23479fa680(0000) GS:ffff8880d9c00000(0000) knlGS:0000000000000000
    [ 236.144852][ T3286] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 236.145720][ T3286] CR2: 00007f4a14bab210 CR3: 00000000a61c6001 CR4: 00000000000606f0
    [ 236.146776][ T3286] Call Trace:
    [ 236.147222][ T3286] hsr_add_node+0x314/0x490 [hsr]
    [ 236.153633][ T3286] hsr_forward_skb+0x2b6/0x1bc0 [hsr]
    [ 236.154362][ T3286] ? rcu_read_lock_sched_held+0x90/0xc0
    [ 236.155091][ T3286] ? rcu_read_lock_bh_held+0xa0/0xa0
    [ 236.156607][ T3286] hsr_dev_xmit+0x70/0xd0 [hsr]
    [ 236.157254][ T3286] dev_hard_start_xmit+0x160/0x740
    [ 236.157941][ T3286] __dev_queue_xmit+0x1961/0x2e10
    [ 236.158565][ T3286] ? netdev_core_pick_tx+0x2e0/0x2e0
    [ ... ]

    Reported-by: syzbot+3924327f9ad5f4d2b343@syzkaller.appspotmail.com
    Fixes: f421436a591d ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)")
    Signed-off-by: Taehee Yoo
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Taehee Yoo
     
  • [ Upstream commit 1d19e2d53e8ed9e4c98fc95e0067492cda7288b0 ]

    hsr_dev_finalize() is called to create a new hsr interface.
    Some of its error handling is wrong.

    1. The return value of debugfs_create_{dir/file} is checked incorrectly.
    These functions don't return NULL. If an error occurs,
    they return an error pointer.
    So the code should check for an error pointer instead of NULL.

    2. The interface isn't unregistered if setting up the hsr interface fails.
    If initializing the hsr interface fails after register_netdevice(),
    unregister_netdevice() should be called.

    3. Ignore failure to create debugfs entries.
    If creating the debugfs directory or file fails, creating the hsr
    interface fails too. But debugfs doesn't affect the actual logic of the
    hsr module, so ignoring the failure is more correct and matches common
    practice.

    Fixes: c5a759117210 ("net/hsr: Use list_head (and rcu) instead of array for slave devices.")
    Signed-off-by: Taehee Yoo
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Taehee Yoo
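
Points 1 and 3 above can be sketched in userspace C. The helpers below are simplified stand-ins for the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() written for this example, and `create_dir_model()` and `setup_model()` are hypothetical. The sketch shows why a NULL check misses an error pointer and how a debugfs-style failure can be tolerated instead of aborting setup.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095 /* error pointers occupy the top 4095 addresses */

/* Userspace stand-ins for ERR_PTR(), IS_ERR() and PTR_ERR(). */
void *err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

bool is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

long ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

static int dummy_dir; /* placeholder object returned on success */

/* Hypothetical creator: reports failure as an error pointer, never NULL. */
void *create_dir_model(bool fail)
{
	return fail ? err_ptr(-ENOMEM) : (void *)&dummy_dir;
}

/*
 * Interface setup that treats the debug directory as optional.
 * Always returns 0; *have_debugfs tells whether the directory exists.
 */
int setup_model(bool debugfs_fails, bool *have_debugfs)
{
	void *dir = create_dir_model(debugfs_fails);

	/* A "dir == NULL" test would not catch the failure here. */
	*have_debugfs = !is_err(dir);
	return 0;
}
```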
     
  • [ Upstream commit 84bb59d773853bc2dda2ac1ef8474c40eb33a3c6 ]

    When the hsr module is being removed, debugfs_remove() is called to remove
    both the debugfs directory and file.

    When a module is being removed, its state is changed to
    MODULE_STATE_GOING and then exit() is called.
    At this point the module can no longer be held, so try_module_get()
    fails.

    debugfs's open() callback tries to hold the module if .owner is set.
    If that fails, a warning message is printed.

    CPU0                    CPU1
    delete_module()
    try_stop_module()
    hsr_exit()              open()

    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Taehee Yoo
     

18 Dec, 2019

1 commit

  • [ Upstream commit df95467b6d2bfce49667ee4b71c67249b01957f7 ]

    hsr_dev_xmit() calls hsr_port_get_hsr() to find the master node, which
    returns NULL if the master node is not in the list.
    But hsr_dev_xmit() doesn't check the returned pointer, so a NULL
    dereference can occur.

    Test commands:
    ip netns add nst
    ip link add veth0 type veth peer name veth1
    ip link add veth2 type veth peer name veth3
    ip link set veth1 netns nst
    ip link set veth3 netns nst
    ip link set veth0 up
    ip link set veth2 up
    ip link add hsr0 type hsr slave1 veth0 slave2 veth2
    ip a a 192.168.100.1/24 dev hsr0
    ip link set hsr0 up
    ip netns exec nst ip link set veth1 up
    ip netns exec nst ip link set veth3 up
    ip netns exec nst ip link add hsr1 type hsr slave1 veth1 slave2 veth3
    ip netns exec nst ip a a 192.168.100.2/24 dev hsr1
    ip netns exec nst ip link set hsr1 up
    hping3 192.168.100.2 -2 --flood &
    modprobe -rv hsr

    Splat looks like:
    [ 217.351122][ T1635] kasan: CONFIG_KASAN_INLINE enabled
    [ 217.352969][ T1635] kasan: GPF could be caused by NULL-ptr deref or user memory access
    [ 217.354297][ T1635] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
    [ 217.355507][ T1635] CPU: 1 PID: 1635 Comm: hping3 Not tainted 5.4.0+ #192
    [ 217.356472][ T1635] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
    [ 217.357804][ T1635] RIP: 0010:hsr_dev_xmit+0x34/0x90 [hsr]
    [ 217.373010][ T1635] Code: 48 8d be 00 0c 00 00 be 04 00 00 00 48 83 ec 08 e8 21 be ff ff 48 8d 78 10 48 ba 00 b
    [ 217.376919][ T1635] RSP: 0018:ffff8880cd8af058 EFLAGS: 00010202
    [ 217.377571][ T1635] RAX: 0000000000000000 RBX: ffff8880acde6840 RCX: 0000000000000002
    [ 217.379465][ T1635] RDX: dffffc0000000000 RSI: 0000000000000004 RDI: 0000000000000010
    [ 217.380274][ T1635] RBP: ffff8880acde6840 R08: ffffed101b440d5d R09: 0000000000000001
    [ 217.381078][ T1635] R10: 0000000000000001 R11: ffffed101b440d5c R12: ffff8880bffcc000
    [ 217.382023][ T1635] R13: ffff8880bffcc088 R14: 0000000000000000 R15: ffff8880ca675c00
    [ 217.383094][ T1635] FS: 00007f060d9d1740(0000) GS:ffff8880da000000(0000) knlGS:0000000000000000
    [ 217.384289][ T1635] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 217.385009][ T1635] CR2: 00007faf15381dd0 CR3: 00000000d523c001 CR4: 00000000000606e0
    [ 217.385940][ T1635] Call Trace:
    [ 217.386544][ T1635] dev_hard_start_xmit+0x160/0x740
    [ 217.387114][ T1635] __dev_queue_xmit+0x1961/0x2e10
    [ 217.388118][ T1635] ? check_object+0xaf/0x260
    [ 217.391466][ T1635] ? __alloc_skb+0xb9/0x500
    [ 217.392017][ T1635] ? init_object+0x6b/0x80
    [ 217.392629][ T1635] ? netdev_core_pick_tx+0x2e0/0x2e0
    [ 217.393175][ T1635] ? __alloc_skb+0xb9/0x500
    [ 217.393727][ T1635] ? rcu_read_lock_sched_held+0x90/0xc0
    [ 217.394331][ T1635] ? rcu_read_lock_bh_held+0xa0/0xa0
    [ 217.395013][ T1635] ? kasan_unpoison_shadow+0x30/0x40
    [ 217.395668][ T1635] ? __kasan_kmalloc.constprop.4+0xa0/0xd0
    [ 217.396280][ T1635] ? __kmalloc_node_track_caller+0x3a8/0x3f0
    [ 217.399007][ T1635] ? __kasan_kmalloc.constprop.4+0xa0/0xd0
    [ 217.400093][ T1635] ? __kmalloc_reserve.isra.46+0x2e/0xb0
    [ 217.401118][ T1635] ? memset+0x1f/0x40
    [ 217.402529][ T1635] ? __alloc_skb+0x317/0x500
    [ 217.404915][ T1635] ? arp_xmit+0xca/0x2c0
    [ ... ]

    Fixes: 311633b60406 ("hsr: switch ->dellink() to ->ndo_uninit()")
    Acked-by: Cong Wang
    Signed-off-by: Taehee Yoo
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Taehee Yoo
     

12 Jul, 2019

1 commit

  • Switching from ->priv_destructor to dellink() has an unexpected
    consequence: existing RCU readers, that is, hsr_port_get_hsr()
    callers, may still be able to read the port list.

    Instead of checking the return value of each hsr_port_get_hsr(),
    we can just move it to ->ndo_uninit() which is called after
    device unregister and synchronize_net(), and we still have RTNL
    lock there.

    Fixes: b9a1e627405d ("hsr: implement dellink to clean up resources")
    Fixes: edf070a0fb45 ("hsr: fix a NULL pointer deref in hsr_dev_xmit()")
    Reported-by: syzbot+097ef84cdc95843fbaa8@syzkaller.appspotmail.com
    Cc: Arvid Brodin
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Cong Wang
     

06 Jul, 2019

3 commits

  • hsr_port_get_hsr() could return NULL and kernel
    could crash:

    BUG: kernel NULL pointer dereference, address: 0000000000000010
    #PF: supervisor read access in kernel mode
    #PF: error_code(0x0000) - not-present page
    PGD 8000000074b84067 P4D 8000000074b84067 PUD 7057d067 PMD 0
    Oops: 0000 [#1] SMP PTI
    CPU: 0 PID: 754 Comm: a.out Not tainted 5.2.0-rc6+ #718
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-2.fc30 04/01/2014
    RIP: 0010:hsr_dev_xmit+0x20/0x31
    Code: 48 8b 1b eb e0 5b 5d 41 5c c3 66 66 66 66 90 55 48 89 fd 48 8d be 40 0b 00 00 be 04 00 00 00 e8 ee f2 ff ff 48 89 ef 48 89 c6 8b 40 10 48 89 45 10 e8 6c 1b 00 00 31 c0 5d c3 66 66 66 66 90
    RSP: 0018:ffffb5b400003c48 EFLAGS: 00010246
    RAX: 0000000000000000 RBX: ffff9821b4509a88 RCX: 0000000000000000
    RDX: ffff9821b4509a88 RSI: 0000000000000000 RDI: ffff9821bc3fc7c0
    RBP: ffff9821bc3fc7c0 R08: 0000000000000000 R09: 00000000000c2019
    R10: 0000000000000000 R11: 0000000000000002 R12: ffff9821bc3fc7c0
    R13: ffff9821b4509a88 R14: 0000000000000000 R15: 000000000000006e
    FS: 00007fee112a1800(0000) GS:ffff9821bd800000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000010 CR3: 000000006e9ce000 CR4: 00000000000406f0
    Call Trace:

    netdev_start_xmit+0x1b/0x38
    dev_hard_start_xmit+0x121/0x21e
    ? validate_xmit_skb.isra.0+0x19/0x1e3
    __dev_queue_xmit+0x74c/0x823
    ? lockdep_hardirqs_on+0x12b/0x17d
    ip6_finish_output2+0x3d3/0x42c
    ? ip6_mtu+0x55/0x5c
    ? mld_sendpack+0x191/0x229
    mld_sendpack+0x191/0x229
    mld_ifc_timer_expire+0x1f7/0x230
    ? mld_dad_timer_expire+0x58/0x58
    call_timer_fn+0x12e/0x273
    __run_timers.part.0+0x174/0x1b5
    ? mld_dad_timer_expire+0x58/0x58
    ? sched_clock_cpu+0x10/0xad
    ? mark_lock+0x26/0x1f2
    ? __lock_is_held+0x40/0x71
    run_timer_softirq+0x26/0x48
    __do_softirq+0x1af/0x392
    irq_exit+0x53/0xa2
    smp_apic_timer_interrupt+0x1c4/0x1d9
    apic_timer_interrupt+0xf/0x20

    Cc: Arvid Brodin
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Cong Wang
     
  • hsr_link_ops implements ->newlink() but not ->dellink(),
    so resources are not released after the device is removed,
    particularly the entries in self_node_db and node_db.

    So add ->dellink() implementation to replace the priv_destructor.
    This also makes the code slightly easier to understand.

    Reported-by: syzbot+c6167ec3de7def23d1e8@syzkaller.appspotmail.com
    Cc: Arvid Brodin
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Cong Wang
     
  • hsr_del_port() should release all the resources allocated
    in hsr_add_port().

    As a consequence of this change, hsr_for_each_port() is no
    longer safe to use with hsr_del_port(), so switch to
    list_for_each_entry_safe(), as we always hold the RTNL lock.

    Cc: Arvid Brodin
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Cong Wang
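
The reason a "safe" iterator is needed once deletion frees the entry can be shown with a plain singly linked list. This is a userspace sketch with hypothetical names (`struct port_node`, `port_add()`, `port_del_all()`). It mirrors the idea behind list_for_each_entry_safe(), which keeps a second cursor for the next entry, but it is not the kernel macro.

```c
#include <stddef.h>
#include <stdlib.h>

struct port_node {
	int id;
	struct port_node *next;
};

/* Prepend a port; returns the new head, or the old head if malloc fails. */
struct port_node *port_add(struct port_node *head, int id)
{
	struct port_node *n = malloc(sizeof(*n));

	if (!n)
		return head;
	n->id = id;
	n->next = head;
	return n;
}

/* Free every port and return how many were freed. */
size_t port_del_all(struct port_node **head)
{
	struct port_node *pos = *head;
	struct port_node *tmp;
	size_t freed = 0;

	while (pos) {
		tmp = pos->next; /* save the successor before pos is freed */
		free(pos);       /* reading pos->next after this would be a bug */
		pos = tmp;
		freed++;
	}
	*head = NULL;
	return freed;
}
```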
     

24 May, 2019

1 commit

  • Don't prune the master node in the hsr_prune_nodes function.
    Neither time_in[HSR_PT_SLAVE_A] nor time_in[HSR_PT_SLAVE_B]
    will ever be updated by hsr_register_frame_in for the master port.
    Thus, the master node will be repeatedly pruned, leading to
    repeated packet loss.
    This bug never appeared because the hsr_prune_nodes function
    was only called once. Commit 5150b45fd355
    ("net: hsr: Fix node prune function for forget time expiry") fixed
    that, which unveiled the issue described above.

    Fixes: 5150b45fd355 ("net: hsr: Fix node prune function for forget time expiry")
    Signed-off-by: Andreas Oetken
    Tested-by: Murali Karicheri
    Signed-off-by: David S. Miller

    Andreas Oetken
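
The pruning rule described above can be modelled as a loop that expires nodes by their last-seen time but always skips the local node. This is a userspace sketch under stated assumptions: `struct node_model`, its `is_self` flag and `prune_nodes()` are hypothetical, and timestamps are plain counters without the wraparound handling that kernel jiffies need.

```c
#include <stdbool.h>
#include <stddef.h>

enum { SLAVE_A, SLAVE_B, N_SLAVES };

struct node_model {
	bool is_self;                    /* the local (master) node */
	bool removed;                    /* set once the node is pruned */
	unsigned long time_in[N_SLAVES]; /* last frame seen per slave port */
};

/*
 * Mark nodes as removed when neither slave port has seen a frame from
 * them within forget_time. The local node never receives its own frames,
 * so its timestamps stay stale and it must be skipped explicitly.
 * Returns the number of nodes pruned by this call.
 */
size_t prune_nodes(struct node_model *nodes, size_t n,
		   unsigned long now, unsigned long forget_time)
{
	size_t pruned = 0;

	for (size_t i = 0; i < n; i++) {
		struct node_model *node = &nodes[i];
		unsigned long last;

		if (node->removed || node->is_self)
			continue;

		/* Use the most recent of the two per-port timestamps. */
		last = node->time_in[SLAVE_A];
		if (node->time_in[SLAVE_B] > last)
			last = node->time_in[SLAVE_B];

		if (now - last > forget_time) {
			node->removed = true;
			pruned++;
		}
	}
	return pruned;
}
```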
     

21 May, 2019

1 commit


28 Apr, 2019

1 commit

  • Add options to strictly validate messages and dump messages.
    Sometimes validating dump messages non-strictly may
    be required, so add an option for that as well.

    Since none of this can really be applied to existing commands,
    set the options everywhere using the following spatch:

    @@
    identifier ops;
    expression X;
    @@
    struct genl_ops ops[] = {
    ...,
    {
    .cmd = X,
    + .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
    ...
    },
    ...
    };

    For new commands one should just not copy the .validate 'opt-out'
    flags and thus get strict validation.

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     

16 Apr, 2019

3 commits


07 Apr, 2019

14 commits


22 Mar, 2019

1 commit

  • Since maxattr is common, the policy can't really differ sanely,
    so make it common as well.

    The only user that did in fact manage to make a non-common policy
    is taskstats, which has to be really careful about it (since it's
    still using a common maxattr!). This is no longer supported, but
    we can fake it using pre_doit.

    This reduces the size of e.g. nl80211.o (which has lots of commands):

       text    data     bss     dec     hex filename
     398745   14323    2240  415308   6564c net/wireless/nl80211.o (before)
     397913   14331    2240  414484   65314 net/wireless/nl80211.o (after)
    --------------------------------
       -832      +8       0    -824

    Which is obviously just 8 bytes for each command, and an added 8
    bytes for the new policy pointer. I'm not sure why the ops list is
    counted as .text though.

    Most of the code transformations were done using the following spatch:
    @ops@
    identifier OPS;
    expression POLICY;
    @@
    struct genl_ops OPS[] = {
    ...,
    {
    - .policy = POLICY,
    },
    ...
    };

    @@
    identifier ops.OPS;
    expression ops.POLICY;
    identifier fam;
    expression M;
    @@
    struct genl_family fam = {
    .ops = OPS,
    .maxattr = M,
    + .policy = POLICY,
    ...
    };

    This also gets rid of devlink_nl_cmd_region_read_dumpit() accessing
    the cb->data as ops, which we want to change in a later genl patch.

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg