19 Mar, 2019

35 commits

  • [ Upstream commit ae3b564179bfd06f32d051b9e5d72ce4b2a07c37 ]

    Several u->addr and u->path users are not holding any locks in
    common with unix_bind(). unix_state_lock() is useless for those
    purposes.

    u->addr is assign-once and *(u->addr) is fully set up by the time
    we set u->addr (all under unix_table_lock). u->path is also
    set in the same critical area, also before setting u->addr, and
    any unix_sock with ->path filled will have non-NULL ->addr.

    So setting ->addr with smp_store_release() is all we need for those
    "lockless" users - just have them fetch ->addr with smp_load_acquire()
    and don't even bother looking at ->path if they see NULL ->addr.

    Users of ->addr and ->path fall into several classes now:
    1) ones that do smp_load_acquire(u->addr) and access *(u->addr)
    and u->path only if smp_load_acquire() has returned non-NULL.
    2) places holding unix_table_lock. These are guaranteed that
    *(u->addr) is seen fully initialized. If unix_sock is in one of the
    "bound" chains, so's ->path.
    3) unix_sock_destructor() using ->addr is safe. All places
    that set u->addr are guaranteed to have seen all stores *(u->addr)
    while holding a reference to u and unix_sock_destructor() is called
    when (atomic) refcount hits zero.
    4) unix_release_sock() using ->path is safe. unix_bind()
    is serialized wrt unix_release() (normally - by struct file
    refcount), and for the instances that had ->path set by unix_bind()
    unix_release_sock() comes from unix_release(), so they are fine.
    Instances that had it set in unix_stream_connect() either end up
    attached to a socket (in unix_accept()), in which case the call
    chain to unix_release_sock() and serialization are the same as in
    the previous case, or they never get accept'ed and unix_release_sock()
    is called when the listener is shut down and its queue gets purged.
    In that case the listener's queue lock provides the barriers needed -
    unix_stream_connect() shoves our unix_sock into listener's queue
    under that lock right after having set ->path and eventual
    unix_release_sock() caller picks them from that queue under the
    same lock right before calling unix_release_sock().
    5) unix_find_other() use of ->path is pointless, but safe -
    it happens with successful lookup by (abstract) name, so ->path.dentry
    is guaranteed to be NULL there.
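
    The release/acquire publication described above can be sketched in
    userspace with C11 atomics standing in for smp_store_release() and
    smp_load_acquire(). This is only an illustration: the struct and function
    names are hypothetical and are not the kernel's.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the address and socket objects; not kernel types. */
struct addr_info {
    int len;
    char name[16];
};

struct sock_like {
    const char *path;                  /* plain field, written before addr */
    _Atomic(struct addr_info *) addr;  /* assign-once, NULL until bound */
};

/* Writer: set path first, then publish addr with release semantics so that
 * *a and path are visible to any reader that observes the non-NULL pointer. */
static void publish_addr(struct sock_like *u, struct addr_info *a,
                         const char *path)
{
    u->path = path;
    atomic_store_explicit(&u->addr, a, memory_order_release);
}

/* Lockless reader: acquire-load addr and look at path only if it is non-NULL. */
static const char *read_path(struct sock_like *u)
{
    struct addr_info *a = atomic_load_explicit(&u->addr, memory_order_acquire);

    if (!a)
        return NULL;   /* not bound yet: do not even look at path */
    return u->path;
}
```

    A reader that sees NULL simply treats the socket as unbound, which is why
    no lock shared with the writer is needed in this sketch.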

    earlier-variant-reviewed-by: "Paul E. McKenney"
    Signed-off-by: Al Viro
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Al Viro
     
  • [ Upstream commit d7cf4a3bf3a83c977a29055e1c4ffada7697b31f ]

    smc_poll() returns with mask bit EPOLLPRI if the connection urg_state
    is SMC_URG_VALID. Since SMC_URG_VALID is zero, smc_poll signals
    EPOLLPRI erroneously if called in state SMC_INIT before the connection
    is created, for instance in a non-blocking connect scenario.

    This patch switches to non-zero values for the urg states.
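
    A minimal sketch of why a zero-valued "valid" state is a problem: a
    zero-initialized connection compares equal to it. With non-zero values, as
    below, a zeroed structure matches no state. The names and the specific
    values are illustrative assumptions, not the kernel definitions.

```c
#include <stdbool.h>

/* Hypothetical urgent-data states; every value is deliberately non-zero. */
enum urg_state {
    URG_VALID  = 1,   /* urgent data present */
    URG_NOTYET = 2,   /* urgent data pending */
    URG_READ   = 3,   /* urgent data already read */
};

struct conn {
    enum urg_state urg_state;   /* 0 in a zeroed, not yet created connection */
};

/* Poll helper: report priority data only when the state is URG_VALID. */
static bool poll_has_urgent(const struct conn *c)
{
    return c->urg_state == URG_VALID;
}
```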

    Reviewed-by: Karsten Graul
    Fixes: de8474eb9d50 ("net/smc: urgent data support")
    Signed-off-by: Ursula Braun
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Ursula Braun
     
  • [ Upstream commit 3c963a3306eada999be5ebf4f293dfa3d3945487 ]

    This patch fixes a subtle PACKET_ORIGDEV regression which was a side
    effect of fixes introduced by:

    6a9e461f6fe4 bonding: pass link-local packets to bonding master also.

    ... to:

    b89f04c61efe bonding: deliver link-local packets with skb->dev set to link that packets arrived on

    While 6a9e461f6fe4 restored pre-b89f04c61efe presence of link-local
    packets on bonding masters (which is required e.g. by linux bridges
    participating in spanning tree or needed for lab-like setups created
    with group_fwd_mask) it also caused the originating device
    information to be lost due to cloning.

    Maciej Żenczykowski proposed another solution that doesn't require
    packet cloning and retains original device information - instead of
    returning RX_HANDLER_PASS for all link-local packets it's now limited
    only to packets from inactive slaves.

    At the same time, packets passed to bonding masters retain correct
    information about the originating device and PACKET_ORIGDEV can be used
    to determine it.

    This elegantly solves all issues so far:

    - link-local packets that were removed from bonding masters
    - LLDP daemons being forced to explicitly bind to slave interfaces
    - PACKET_ORIGDEV having no effect on bond interfaces

    Fixes: 6a9e461f6fe4 (bonding: pass link-local packets to bonding master also.)
    Reported-by: Vincent Bernat
    Signed-off-by: Michal Soltys
    Signed-off-by: Maciej Żenczykowski
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Michal Soltys
     
  • [ Upstream commit bf1dc8bad1d42287164d216d8efb51c5cd381b18 ]

    We need an RCU critical section around the rt6_info->from dereference,
    and proper annotation.

    Fixes: 4ed591c8ab44 ("net/ipv6: Allow onlink routes to have a device mismatch if it is the default route")
    Signed-off-by: Paolo Abeni
    Reviewed-by: David Ahern
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Paolo Abeni
     
  • [ Upstream commit 193f3685d0546b0cea20c99894aadb70098e47bf ]

    We must access rt6_info->from under RCU read lock: move the
    dereference under such lock, with proper annotation.

    v1 -> v2:
    - avoid using multiple, racy, fetch operations for rt->from

    Fixes: a68886a69180 ("net/ipv6: Make from in rt6_info rcu protected")
    Signed-off-by: Paolo Abeni
    Reviewed-by: David Ahern
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Paolo Abeni
     
  • [ Upstream commit 7cc9f7003a969d359f608ebb701d42cafe75b84a ]

    When running Docker with userns isolation e.g. --userns-remap="default"
    and spawning up some containers with CAP_NET_ADMIN under this realm, I
    noticed that link changes on ipvlan slave device inside that container
    can affect all devices from this ipvlan group which are in other net
    namespaces where the container should have no permission to make changes
    to, such as the init netns, for example.

    This effectively allows undoing ipvlan private mode and switching globally
    to bridge mode, where slaves can communicate directly without going through
    hostns, or switching the global operation mode (l2/l3/l3s) for everyone
    bound to the given ipvlan master device. The libnetwork plugin here creates
    an ipvlan master and an ipvlan slave in hostns, plus one slave per
    container that is moved into the container's netns upon creation event.

    * In hostns:

    # ip -d a
    [...]
    8: cilium_host@bond0: mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
    ipvlan mode l3 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
    inet 10.41.0.1/32 scope link cilium_host
    valid_lft forever preferred_lft forever
    [...]

    * Spawn container & change ipvlan mode setting inside of it:

    # docker run -dt --cap-add=NET_ADMIN --network cilium-net --name client -l app=test cilium/netperf
    9fff485d69dcb5ce37c9e33ca20a11ccafc236d690105aadbfb77e4f4170879c

    # docker exec -ti client ip -d a
    [...]
    10: cilium0@if4: mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
    ipvlan mode l3 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
    inet 10.41.197.43/32 brd 10.41.197.43 scope global cilium0
    valid_lft forever preferred_lft forever

    # docker exec -ti client ip link change link cilium0 name cilium0 type ipvlan mode l2

    # docker exec -ti client ip -d a
    [...]
    10: cilium0@if4: mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
    ipvlan mode l2 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
    inet 10.41.197.43/32 brd 10.41.197.43 scope global cilium0
    valid_lft forever preferred_lft forever

    * In hostns (mode switched to l2):

    # ip -d a
    [...]
    8: cilium_host@bond0: mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
    ipvlan mode l2 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
    inet 10.41.0.1/32 scope link cilium_host
    valid_lft forever preferred_lft forever
    [...]

    The same l3 -> l2 switch would also happen when creating another slave
    inside the container's network namespace and specifying the existing
    cilium0 link to derive the actual (bond0) master:

    # docker exec -ti client ip link add link cilium0 name cilium1 type ipvlan mode l2

    # docker exec -ti client ip -d a
    [...]
    2: cilium1@if4: mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
    ipvlan mode l2 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
    10: cilium0@if4: mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
    ipvlan mode l2 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
    inet 10.41.197.43/32 brd 10.41.197.43 scope global cilium0
    valid_lft forever preferred_lft forever

    * In hostns:

    # ip -d a
    [...]
    8: cilium_host@bond0: mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
    ipvlan mode l2 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
    inet 10.41.0.1/32 scope link cilium_host
    valid_lft forever preferred_lft forever
    [...]

    One way to mitigate this is to check for CAP_NET_ADMIN in the ipvlan
    master device's ns, and only then allow changing the mode or flags
    for all devices bound to it. The two cases above are disallowed
    after the patch.

    Signed-off-by: Daniel Borkmann
    Acked-by: Mahesh Bandewar
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Daniel Borkmann
     
  • [ Upstream commit 8c7a77267eec81dd81af8412f29e50c0b1082548 ]

    When a port is added to a team, its initial state is derived
    from netif_carrier_ok rather than netif_oper_up.
    If it is carrier up but operationally down at the time of being
    added, the port state.linkup will be set prematurely.
    port state.linkup should be set consistently using
    netif_oper_up rather than netif_carrier_ok.

    Fixes: f1d22a1e0595 ("team: account for oper state")
    Signed-off-by: George Wilkie
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    George Wilkie
     
  • [ Upstream commit f5b51fe804ec2a6edce0f8f6b11ea57283f5857b ]

    When a netdevice is unregistered, we flush the relevant exception
    via rt6_sync_down_dev() -> fib6_ifdown() -> fib6_del() -> fib6_del_route().

    Finally, we end up calling rt6_remove_exception(), where we release
    the relevant dst while keeping the references to the related fib6_info and
    dev. Such references should be released later, when the dst is
    destroyed.

    There are a number of caches that can keep the exception around for an
    unlimited amount of time - namely dst_cache, possibly even socket cache.
    As a result device unregistration may hang, as demonstrated by this script:

    ip netns add cl
    ip netns add rt
    ip netns add srv
    ip netns exec rt sysctl -w net.ipv6.conf.all.forwarding=1

    ip link add name cl_veth type veth peer name cl_rt_veth
    ip link set dev cl_veth netns cl
    ip -n cl link set dev cl_veth up
    ip -n cl addr add dev cl_veth 2001::2/64
    ip -n cl route add default via 2001::1

    ip -n cl link add tunv6 type ip6tnl mode ip6ip6 local 2001::2 remote 2002::1 hoplimit 64 dev cl_veth
    ip -n cl link set tunv6 up
    ip -n cl addr add 2013::2/64 dev tunv6

    ip link set dev cl_rt_veth netns rt
    ip -n rt link set dev cl_rt_veth up
    ip -n rt addr add dev cl_rt_veth 2001::1/64

    ip link add name rt_srv_veth type veth peer name srv_veth
    ip link set dev srv_veth netns srv
    ip -n srv link set dev srv_veth up
    ip -n srv addr add dev srv_veth 2002::1/64
    ip -n srv route add default via 2002::2

    ip -n srv link add tunv6 type ip6tnl mode ip6ip6 local 2002::1 remote 2001::2 hoplimit 64 dev srv_veth
    ip -n srv link set tunv6 up
    ip -n srv addr add 2013::1/64 dev tunv6

    ip link set dev rt_srv_veth netns rt
    ip -n rt link set dev rt_srv_veth up
    ip -n rt addr add dev rt_srv_veth 2002::2/64

    ip netns exec srv netserver & sleep 0.1
    ip netns exec cl ping6 -c 4 2013::1
    ip netns exec cl netperf -H 2013::1 -t TCP_STREAM -l 3 & sleep 1
    ip -n rt link set dev rt_srv_veth mtu 1400
    wait %2

    ip -n cl link del cl_veth

    This commit addresses the issue by purging all the references held by the
    exception at removal time, as we currently do for e.g. ipv6 pcpu dst entries.

    v1 -> v2:
    - re-order the code to avoid accessing dst and net after dst_dev_put()

    Fixes: 93531c674315 ("net/ipv6: separate handling of FIB entries from dst based routes")
    Signed-off-by: Paolo Abeni
    Reviewed-by: David Ahern
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Paolo Abeni
     
  • [ Upstream commit 97f0082a0592212fc15d4680f5a4d80f79a1687c ]

    Set rtm_table to RT_TABLE_COMPAT for ipv6 for tables > 255 to
    keep legacy software happy. This is similar to what was done for
    ipv4 in commit 709772e6e065 ("net: Fix routing tables with
    id > 255 for legacy software").
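
    A minimal sketch of the clamping this describes, assuming the 8-bit
    rtm_table field of struct rtmsg; the full 32-bit table id is carried
    separately in the RTA_TABLE attribute. The helper name is hypothetical;
    RT_TABLE_COMPAT is 252 in <linux/rtnetlink.h>.

```c
#include <stdint.h>

#define RT_TABLE_COMPAT 252   /* same value as in <linux/rtnetlink.h> */

/* Hypothetical helper: value to place in the 8-bit rtm_table field. */
static uint8_t legacy_rtm_table(uint32_t table_id)
{
    /* Ids that fit in 8 bits are reported as-is; larger ids would be
     * truncated, so report the compat marker instead. */
    return table_id < 256 ? (uint8_t)table_id : RT_TABLE_COMPAT;
}
```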

    Signed-off-by: Kalash Nainwal
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Kalash Nainwal
     
  • [ Upstream commit 6ff7b060535e87c2ae14dd8548512abfdda528fb ]

    KASAN found a use-after-free in fixed_mdio_bus_init().
    Commit 0c692d07842a ("drivers/net/phy/mdio_bus.c: call
    put_device on device_register() failure") calls put_device()
    when device_register() fails, which gives up the last reference
    to the device and allows mdiobus_release() to run,
    kfreeing the bus. However, most drivers call mdiobus_free()
    to free the bus when mdiobus_register() fails, so a
    use-after-free occurs when the bus is accessed again. This patch
    reverts that commit so that mdiobus_free() frees the bus.

    KASAN report details as below:

    BUG: KASAN: use-after-free in mdiobus_free+0x85/0x90 drivers/net/phy/mdio_bus.c:482
    Read of size 4 at addr ffff8881dc824d78 by task syz-executor.0/3524

    CPU: 1 PID: 3524 Comm: syz-executor.0 Not tainted 5.0.0-rc7+ #45
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
    Call Trace:
    __dump_stack lib/dump_stack.c:77 [inline]
    dump_stack+0xfa/0x1ce lib/dump_stack.c:113
    print_address_description+0x65/0x270 mm/kasan/report.c:187
    kasan_report+0x149/0x18d mm/kasan/report.c:317
    mdiobus_free+0x85/0x90 drivers/net/phy/mdio_bus.c:482
    fixed_mdio_bus_init+0x283/0x1000 [fixed_phy]
    ? 0xffffffffc0e40000
    ? 0xffffffffc0e40000
    ? 0xffffffffc0e40000
    do_one_initcall+0xfa/0x5ca init/main.c:887
    do_init_module+0x204/0x5f6 kernel/module.c:3460
    load_module+0x66b2/0x8570 kernel/module.c:3808
    __do_sys_finit_module+0x238/0x2a0 kernel/module.c:3902
    do_syscall_64+0x147/0x600 arch/x86/entry/common.c:290
    entry_SYSCALL_64_after_hwframe+0x49/0xbe
    RIP: 0033:0x462e99
    Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
    RSP: 002b:00007f6215c19c58 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
    RAX: ffffffffffffffda RBX: 000000000073bf00 RCX: 0000000000462e99
    RDX: 0000000000000000 RSI: 0000000020000080 RDI: 0000000000000003
    RBP: 00007f6215c19c70 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000246 R12: 00007f6215c1a6bc
    R13: 00000000004bcefb R14: 00000000006f7030 R15: 0000000000000004

    Allocated by task 3524:
    set_track mm/kasan/common.c:85 [inline]
    __kasan_kmalloc.constprop.3+0xa0/0xd0 mm/kasan/common.c:496
    kmalloc include/linux/slab.h:545 [inline]
    kzalloc include/linux/slab.h:740 [inline]
    mdiobus_alloc_size+0x54/0x1b0 drivers/net/phy/mdio_bus.c:143
    fixed_mdio_bus_init+0x163/0x1000 [fixed_phy]
    do_one_initcall+0xfa/0x5ca init/main.c:887
    do_init_module+0x204/0x5f6 kernel/module.c:3460
    load_module+0x66b2/0x8570 kernel/module.c:3808
    __do_sys_finit_module+0x238/0x2a0 kernel/module.c:3902
    do_syscall_64+0x147/0x600 arch/x86/entry/common.c:290
    entry_SYSCALL_64_after_hwframe+0x49/0xbe

    Freed by task 3524:
    set_track mm/kasan/common.c:85 [inline]
    __kasan_slab_free+0x130/0x180 mm/kasan/common.c:458
    slab_free_hook mm/slub.c:1409 [inline]
    slab_free_freelist_hook mm/slub.c:1436 [inline]
    slab_free mm/slub.c:2986 [inline]
    kfree+0xe1/0x270 mm/slub.c:3938
    device_release+0x78/0x200 drivers/base/core.c:919
    kobject_cleanup lib/kobject.c:662 [inline]
    kobject_release lib/kobject.c:691 [inline]
    kref_put include/linux/kref.h:67 [inline]
    kobject_put+0x146/0x240 lib/kobject.c:708
    put_device+0x1c/0x30 drivers/base/core.c:2060
    __mdiobus_register+0x483/0x560 drivers/net/phy/mdio_bus.c:382
    fixed_mdio_bus_init+0x26b/0x1000 [fixed_phy]
    do_one_initcall+0xfa/0x5ca init/main.c:887
    do_init_module+0x204/0x5f6 kernel/module.c:3460
    load_module+0x66b2/0x8570 kernel/module.c:3808
    __do_sys_finit_module+0x238/0x2a0 kernel/module.c:3902
    do_syscall_64+0x147/0x600 arch/x86/entry/common.c:290
    entry_SYSCALL_64_after_hwframe+0x49/0xbe

    The buggy address belongs to the object at ffff8881dc824c80
    which belongs to the cache kmalloc-2k of size 2048
    The buggy address is located 248 bytes inside of
    2048-byte region [ffff8881dc824c80, ffff8881dc825480)
    The buggy address belongs to the page:
    page:ffffea0007720800 count:1 mapcount:0 mapping:ffff8881f6c02800 index:0x0 compound_mapcount: 0
    flags: 0x2fffc0000010200(slab|head)
    raw: 02fffc0000010200 0000000000000000 0000000500000001 ffff8881f6c02800
    raw: 0000000000000000 00000000800f000f 00000001ffffffff 0000000000000000
    page dumped because: kasan: bad access detected

    Memory state around the buggy address:
    ffff8881dc824c00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    ffff8881dc824c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    >ffff8881dc824d00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    ^
    ffff8881dc824d80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    ffff8881dc824e00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

    Fixes: 0c692d07842a ("drivers/net/phy/mdio_bus.c: call put_device on device_register() failure")
    Signed-off-by: YueHaibing
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    YueHaibing
     
  • [ Upstream commit 797a22bd5298c2674d927893f46cadf619dad11d ]

    syzbot was able to trigger another soft lockup [1]

    I first thought it was the O(N^2) issue I mentioned in my
    prior fix (f657d22ee1f "net/x25: do not hold the cpu
    too long in x25_new_lci()"), but I eventually found
    that x25_bind() was not checking SOCK_ZAPPED state under
    socket lock protection.

    This means that multiple threads can end up calling
    x25_insert_socket() for the same socket, and corrupt x25_list
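
    A minimal userspace sketch of the idea, with a pthread mutex standing in
    for the socket lock: the "not yet bound" flag is tested and cleared inside
    the same critical section that performs the insertion, so a second bind
    on the same socket cannot insert it again. All names here are hypothetical.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a socket with a "zapped" (unbound) flag. */
struct x_sock {
    pthread_mutex_t lock;
    bool zapped;        /* true until the socket has been bound */
};

static int list_len;    /* stands in for the global socket list */

/* Returns 0 on the first bind and -1 on any later attempt. */
static int bind_once(struct x_sock *sk)
{
    int rc = 0;

    pthread_mutex_lock(&sk->lock);
    if (sk->zapped) {
        list_len++;          /* "insert" exactly once */
        sk->zapped = false;
    } else {
        rc = -1;             /* already bound: refuse a second insert */
    }
    pthread_mutex_unlock(&sk->lock);
    return rc;
}
```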

    [1]
    watchdog: BUG: soft lockup - CPU#0 stuck for 123s! [syz-executor.2:10492]
    Modules linked in:
    irq event stamp: 27515
    hardirqs last enabled at (27514): [] trace_hardirqs_on_thunk+0x1a/0x1c
    hardirqs last disabled at (27515): [] trace_hardirqs_off_thunk+0x1a/0x1c
    softirqs last enabled at (32): [] x25_get_neigh+0xa3/0xd0 net/x25/x25_link.c:336
    softirqs last disabled at (34): [] x25_find_socket+0x23/0x140 net/x25/af_x25.c:341
    CPU: 0 PID: 10492 Comm: syz-executor.2 Not tainted 5.0.0-rc7+ #88
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    RIP: 0010:__sanitizer_cov_trace_pc+0x4/0x50 kernel/kcov.c:97
    Code: f4 ff ff ff e8 11 9f ea ff 48 c7 05 12 fb e5 08 00 00 00 00 e9 c8 e9 ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 55 48 89 e5 8b 75 08 65 48 8b 04 25 40 ee 01 00 65 8b 15 38 0c 92 7e 81 e2
    RSP: 0018:ffff88806e94fc48 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff13
    RAX: 1ffff1100d84dac5 RBX: 0000000000000001 RCX: ffffc90006197000
    RDX: 0000000000040000 RSI: ffffffff86324bf3 RDI: ffff88806c26d628
    RBP: ffff88806e94fc48 R08: ffff88806c1c6500 R09: fffffbfff1282561
    R10: fffffbfff1282560 R11: ffffffff89412b03 R12: ffff88806c26d628
    R13: ffff888090455200 R14: dffffc0000000000 R15: 0000000000000000
    FS: 00007f3a107e4700(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007f3a107e3db8 CR3: 00000000a5544000 CR4: 00000000001406f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
    __x25_find_socket net/x25/af_x25.c:327 [inline]
    x25_find_socket+0x7d/0x140 net/x25/af_x25.c:342
    x25_new_lci net/x25/af_x25.c:355 [inline]
    x25_connect+0x380/0xde0 net/x25/af_x25.c:784
    __sys_connect+0x266/0x330 net/socket.c:1662
    __do_sys_connect net/socket.c:1673 [inline]
    __se_sys_connect net/socket.c:1670 [inline]
    __x64_sys_connect+0x73/0xb0 net/socket.c:1670
    do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
    entry_SYSCALL_64_after_hwframe+0x49/0xbe
    RIP: 0033:0x457e29
    Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 3d 01 f0 ff ff 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00
    RSP: 002b:00007f3a107e3c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
    RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457e29
    RDX: 0000000000000012 RSI: 0000000020000200 RDI: 0000000000000005
    RBP: 000000000073c040 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000246 R12: 00007f3a107e46d4
    R13: 00000000004be362 R14: 00000000004ceb98 R15: 00000000ffffffff
    Sending NMI from CPU 0 to CPUs 1:
    NMI backtrace for cpu 1
    CPU: 1 PID: 10493 Comm: syz-executor.3 Not tainted 5.0.0-rc7+ #88
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    RIP: 0010:__read_once_size include/linux/compiler.h:193 [inline]
    RIP: 0010:queued_write_lock_slowpath+0x143/0x290 kernel/locking/qrwlock.c:86
    Code: 4c 8d 2c 01 41 83 c7 03 41 0f b6 45 00 41 38 c7 7c 08 84 c0 0f 85 0c 01 00 00 8b 03 3d 00 01 00 00 74 1a f3 90 41 0f b6 55 00 38 d7 7c eb 84 d2 74 e7 48 89 df e8 cc aa 4e 00 eb dd be 04 00
    RSP: 0018:ffff888085c47bd8 EFLAGS: 00000206
    RAX: 0000000000000300 RBX: ffffffff89412b00 RCX: 1ffffffff1282560
    RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffff89412b00
    RBP: ffff888085c47c70 R08: 1ffffffff1282560 R09: fffffbfff1282561
    R10: fffffbfff1282560 R11: ffffffff89412b03 R12: 00000000000000ff
    R13: fffffbfff1282560 R14: 1ffff11010b88f7d R15: 0000000000000003
    FS: 00007fdd04086700(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007fdd04064db8 CR3: 0000000090be0000 CR4: 00000000001406e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
    queued_write_lock include/asm-generic/qrwlock.h:104 [inline]
    do_raw_write_lock+0x1d6/0x290 kernel/locking/spinlock_debug.c:203
    __raw_write_lock_bh include/linux/rwlock_api_smp.h:204 [inline]
    _raw_write_lock_bh+0x3b/0x50 kernel/locking/spinlock.c:312
    x25_insert_socket+0x21/0xe0 net/x25/af_x25.c:267
    x25_bind+0x273/0x340 net/x25/af_x25.c:703
    __sys_bind+0x23f/0x290 net/socket.c:1481
    __do_sys_bind net/socket.c:1492 [inline]
    __se_sys_bind net/socket.c:1490 [inline]
    __x64_sys_bind+0x73/0xb0 net/socket.c:1490
    do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
    entry_SYSCALL_64_after_hwframe+0x49/0xbe
    RIP: 0033:0x457e29

    Fixes: 90c27297a9bf ("X.25 remove bkl in bind")
    Signed-off-by: Eric Dumazet
    Cc: andrew hendry
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
     
  • [ Upstream commit 8511a653e9250ef36b95803c375a7be0e2edb628 ]

    Calculation of qp mtt size (in function mlx4_RST2INIT_wrapper)
    ultimately depends on function roundup_pow_of_two.

    If the amount of memory required by the QP is less than one page,
    roundup_pow_of_two is called with argument zero. In this case, the
    roundup_pow_of_two result is undefined.

    Calling roundup_pow_of_two with a zero argument resulted in the
    following stack trace:

    UBSAN: Undefined behaviour in ./include/linux/log2.h:61:13
    shift exponent 64 is too large for 64-bit type 'long unsigned int'
    CPU: 4 PID: 26939 Comm: rping Tainted: G OE 4.19.0-rc1
    Hardware name: Supermicro X9DR3-F/X9DR3-F, BIOS 3.2a 07/09/2015
    Call Trace:
    dump_stack+0x9a/0xeb
    ubsan_epilogue+0x9/0x7c
    __ubsan_handle_shift_out_of_bounds+0x254/0x29d
    ? __ubsan_handle_load_invalid_value+0x180/0x180
    ? debug_show_all_locks+0x310/0x310
    ? sched_clock+0x5/0x10
    ? sched_clock+0x5/0x10
    ? sched_clock_cpu+0x18/0x260
    ? find_held_lock+0x35/0x1e0
    ? mlx4_RST2INIT_QP_wrapper+0xfb1/0x1440 [mlx4_core]
    mlx4_RST2INIT_QP_wrapper+0xfb1/0x1440 [mlx4_core]

    Fix this by explicitly testing for zero, and returning one if the
    argument is zero (assuming that the next higher power of 2 in this case
    should be one).
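
    A minimal userspace sketch of a round-up helper with the explicit zero
    test described above. The function name is hypothetical and the loop is
    not how the kernel computes the result; it only illustrates the guard.

```c
#include <stdint.h>

/* Hypothetical helper: smallest power of two >= n, with 0 mapped to 1.
 * Assumes n <= 2^63 so that the result fits in 64 bits. */
static uint64_t roundup_pow2_or_one(uint64_t n)
{
    uint64_t p = 1;

    if (n == 0)
        return 1;      /* explicit zero case, no shift involved */
    while (p < n)
        p <<= 1;
    return p;
}
```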

    Fixes: c82e9aa0a8bc ("mlx4_core: resource tracking for HCA resources used by guests")
    Signed-off-by: Jack Morgenstein
    Signed-off-by: Tariq Toukan
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Jack Morgenstein
     
  • [ Upstream commit c07d27927f2f2e96fcd27bb9fb330c9ea65612d0 ]

    In procedures mlx4_cmd_use_events() and mlx4_cmd_use_polling(), we need to
    guarantee that there are no FW commands in progress on the comm channel
    (for VFs) or wrapped FW commands (on the PF) when SRIOV is active.

    We do this by also taking the slave_cmd_mutex when SRIOV is active.

    This is especially important when switching from event to polling, since we
    free the command-context array during the switch. If there are FW commands
    in progress (e.g., waiting for a completion event), the completion event
    handler will access freed memory.

    Since the decision to use comm_wait or comm_poll is taken before grabbing
    the event_sem/poll_sem in mlx4_comm_cmd_wait/poll, we must take the
    slave_cmd_mutex as well (to guarantee that the decision to use events or
    polling and the call to the appropriate cmd function are atomic).

    Fixes: a7e1f04905e5 ("net/mlx4_core: Fix deadlock when switching between polling and event fw commands")
    Signed-off-by: Jack Morgenstein
    Signed-off-by: Tariq Toukan
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Jack Morgenstein
     
  • [ Upstream commit e15ce4b8d11227007577e6dc1364d288b8874fbe ]

    As part of unloading a device, the driver switches from
    FW command event mode to FW command polling mode.

    Part of switching over to polling mode is freeing the command context array
    memory (unfortunately, currently, without NULLing the command context array
    pointer).

    The reset flow calls "complete" to complete all outstanding fw commands
    (if we are in event mode). The check for event vs. polling mode here
    is to test if the command context array pointer is NULL.

    If the reset flow is activated after the switch to polling mode, it will
    attempt (incorrectly) to complete all the commands in the context array --
    because the pointer was not NULLed when the driver switched over to polling
    mode.

    As a result, we have a use-after-free situation, which results in a
    kernel crash.

    For example:
    BUG: unable to handle kernel NULL pointer dereference at (null)
    IP: [] __wake_up_common+0x2e/0x90
    PGD 0
    Oops: 0000 [#1] SMP
    Modules linked in: netconsole nfsv3 nfs_acl nfs lockd grace ...
    CPU: 2 PID: 940 Comm: kworker/2:3 Kdump: loaded Not tainted 3.10.0-862.el7.x86_64 #1
    Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006 04/28/2016
    Workqueue: events hv_eject_device_work [pci_hyperv]
    task: ffff8d1734ca0fd0 ti: ffff8d17354bc000 task.ti: ffff8d17354bc000
    RIP: 0010:[] [] __wake_up_common+0x2e/0x90
    RSP: 0018:ffff8d17354bfa38 EFLAGS: 00010082
    RAX: 0000000000000000 RBX: ffff8d17362d42c8 RCX: 0000000000000000
    RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffff8d17362d42c8
    RBP: ffff8d17354bfa70 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000298 R11: ffff8d173610e000 R12: ffff8d17362d42d0
    R13: 0000000000000246 R14: 0000000000000000 R15: 0000000000000003
    FS: 0000000000000000(0000) GS:ffff8d1802680000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000000 CR3: 00000000f16d8000 CR4: 00000000001406e0
    Call Trace:
    [] complete+0x3c/0x50
    [] mlx4_cmd_wake_completions+0x70/0x90 [mlx4_core]
    [] mlx4_enter_error_state+0xe1/0x380 [mlx4_core]
    [] mlx4_comm_cmd+0x29b/0x360 [mlx4_core]
    [] __mlx4_cmd+0x441/0x920 [mlx4_core]
    [] ? __slab_free+0x81/0x2f0
    [] ? __radix_tree_lookup+0x84/0xf0
    [] mlx4_free_mtt_range+0x5b/0xb0 [mlx4_core]
    [] mlx4_mtt_cleanup+0x17/0x20 [mlx4_core]
    [] mlx4_free_eq+0xa7/0x1c0 [mlx4_core]
    [] mlx4_cleanup_eq_table+0xde/0x130 [mlx4_core]
    [] mlx4_unload_one+0x118/0x300 [mlx4_core]
    [] mlx4_remove_one+0x91/0x1f0 [mlx4_core]

    The fix is to set the command context array pointer to NULL after freeing
    the array.
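
    A minimal userspace sketch of the pattern: the pointer doubles as the
    "event mode" indicator, so it must be cleared when the array is freed.
    The struct and function names are hypothetical stand-ins, not the driver's.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the driver's command state. */
struct cmd_state {
    int *context;      /* non-NULL only while in event mode */
    int  ncontexts;
};

/* Switch to event mode: allocate the context array. */
static int use_events(struct cmd_state *c, int n)
{
    c->context = calloc((size_t)n, sizeof(*c->context));
    if (!c->context)
        return -1;
    c->ncontexts = n;
    return 0;
}

/* Switch to polling mode: free the array and clear the pointer. */
static void use_polling(struct cmd_state *c)
{
    free(c->context);
    c->context = NULL;   /* without this, the reset path sees a stale pointer */
    c->ncontexts = 0;
}

/* Reset path: marks every context complete and returns how many it touched. */
static int wake_completions(struct cmd_state *c)
{
    int i, woken = 0;

    if (!c->context)     /* polling mode: nothing to complete */
        return 0;
    for (i = 0; i < c->ncontexts; i++) {
        c->context[i] = 1;
        woken++;
    }
    return woken;
}
```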

    Fixes: f5aef5aa3506 ("net/mlx4_core: Activate reset flow upon fatal command cases")
    Signed-off-by: Jack Morgenstein
    Signed-off-by: Tariq Toukan
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Jack Morgenstein
     
  • [ Upstream commit 59cbf56fcd98ba2a715b6e97c4e43f773f956393 ]

    Same reasons as the ones explained in commit 4179cb5a4c92
    ("vxlan: test dev->flags & IFF_UP before calling netif_rx()").

    netif_rx() or gro_cells_receive() must be called under a strict contract.

    At device dismantle phase, core networking clears IFF_UP
    and flush_all_backlogs() is called after rcu grace period
    to make sure no incoming packet might be in a cpu backlog
    and still referencing the device.

    A similar protocol is used for gro_cells infrastructure, as
    gro_cells_destroy() will be called only after a full rcu
    grace period is observed after IFF_UP has been cleared.

    Most drivers call netif_rx() from their interrupt handler,
    and since the interrupts are disabled at device dismantle,
    netif_rx() does not have to check dev->flags & IFF_UP

    Virtual drivers do not have this guarantee, and must
    therefore make the check themselves.

    Otherwise we risk use-after-free and/or crashes.

    Fixes: d342894c5d2f ("vxlan: virtual extensible lan")
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
     
  • [ Upstream commit ad6c9986bcb627c7c22b8f9e9a934becc27df87c ]

    If we receive a packet while deleting a VXLAN device, there's a chance
    vxlan_rcv() is called at the same time as vxlan_dellink(). This is fine,
    except that vxlan_dellink() should never ever touch stuff that's still in
    use, such as the GRO cells list.

    Otherwise, vxlan_rcv() crashes while queueing packets via
    gro_cells_receive().

    Move the gro_cells_destroy() to vxlan_uninit(), which runs after the RCU
    grace period has elapsed and nothing needs the gro_cells anymore.

    This is now done in the same way as commit 8e816df87997 ("geneve: Use GRO
    cells infrastructure.") originally implemented for GENEVE.

    Reported-by: Jianlin Shi
    Fixes: 58ce31cca1ff ("vxlan: GRO support at tunnel layer")
    Signed-off-by: Stefano Brivio
    Reviewed-by: Sabrina Dubroca
    Reviewed-by: Eric Dumazet
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Stefano Brivio
     
  • [ Upstream commit 9d3e1368bb45893a75a5dfb7cd21fdebfa6b47af ]

    Commit 7716682cc58e ("tcp/dccp: fix another race at listener
    dismantle") let inet_csk_reqsk_queue_add() fail, and adjusted
    {tcp,dccp}_check_req() accordingly. However, TFO and syncookies
    weren't modified, thus leaking allocated resources on error.

    Contrary to tcp_check_req(), in both syncookies and TFO cases,
    we need to drop the request socket. Also, since the child socket is
    created with inet_csk_clone_lock(), we have to unlock it and drop an
    extra reference (->sk_refcount is initially set to 2 and
    inet_csk_reqsk_queue_add() drops only one ref).

    For TFO, we also need to revert the work done by tcp_try_fastopen()
    (with reqsk_fastopen_remove()).

    Fixes: 7716682cc58e ("tcp/dccp: fix another race at listener dismantle")
    Signed-off-by: Guillaume Nault
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Guillaume Nault
     
  • [ Upstream commit f2feaefdabb0a6253aa020f65e7388f07a9ed47c ]

    Since commit eeea10b83a13 ("tcp: add
    tcp_v4_fill_cb()/tcp_v4_restore_cb()"), tcp_vX_fill_cb is only called
    after tcp_filter(). That means, TCP_SKB_CB(skb)->end_seq still points to
    the IP-part of the cb.

    We thus should not muck with it, as this can trigger bugs (thanks
    syzkaller):
    [ 12.349396] ==================================================================
    [ 12.350188] BUG: KASAN: slab-out-of-bounds in ip6_datagram_recv_specific_ctl+0x19b3/0x1a20
    [ 12.351035] Read of size 1 at addr ffff88006adbc208 by task test_ip6_datagr/1799

    Setting end_seq is actually no longer necessary in tcp_filter as it gets
    initialized later on in tcp_vX_fill_cb.

    Cc: Eric Dumazet
    Fixes: eeea10b83a13 ("tcp: add tcp_v4_fill_cb()/tcp_v4_restore_cb()")
    Signed-off-by: Christoph Paasch
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Christoph Paasch
     
  • [ Upstream commit 6466e715651f9f358e60c5ea4880e4731325827f ]

    Returning 0 as inq to userspace indicates there is no more data to
    read, and the application needs to wait for EPOLLIN. For a connection
    that has received FIN from the remote peer, however, the application
    must continue reading until getting EOF (return value of 0
    from tcp_recvmsg) or an error, if edge-triggered epoll (EPOLLET) is
    being used. Otherwise, the application will never receive a new
    EPOLLIN, since there is no epoll edge after the FIN.

    Return 1 when there is no data left on the queue but the
    connection has received FIN, so that the applications continue
    reading.
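
    The rule described above can be written as a small pure function. This is a sketch under the assumption that the hint is derived from just two inputs; rx_state_sketch and inq_hint() are hypothetical names, not the kernel's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of the receive side of a connection. */
struct rx_state_sketch {
    uint32_t bytes_queued;        /* unread bytes on the receive queue */
    bool fin_received;            /* peer has sent FIN */
};

/*
 * Value reported to userspace as "inq": the number of unread bytes, or 1
 * when the queue is empty but a FIN has arrived, so that the application
 * keeps reading until it sees EOF.
 */
uint32_t inq_hint(const struct rx_state_sketch *rx)
{
    assert(rx != NULL);
    if (rx->bytes_queued == 0 && rx->fin_received)
        return 1;
    return rx->bytes_queued;
}
```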

    Fixes: b75eba76d3d72 (tcp: send in-queue bytes in cmsg upon read)
    Signed-off-by: Soheil Hassas Yeganeh
    Acked-by: Neal Cardwell
    Signed-off-by: Eric Dumazet
    Acked-by: Yuchung Cheng
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Soheil Hassas Yeganeh
     
  • [ Upstream commit 2e990dfd13974d9eae493006f42ffb48707970ef ]

    syzbot reported a NULL-ptr deref caused by that sched->init() in
    sctp_stream_init() set stream->rr_next = NULL.

    kasan: GPF could be caused by NULL-ptr deref or user memory access
    RIP: 0010:sctp_sched_rr_dequeue+0xd3/0x170 net/sctp/stream_sched_rr.c:141
    Call Trace:
    sctp_outq_dequeue_data net/sctp/outqueue.c:90 [inline]
    sctp_outq_flush_data net/sctp/outqueue.c:1079 [inline]
    sctp_outq_flush+0xba2/0x2790 net/sctp/outqueue.c:1205

    All sched info is saved in sout->ext now, in sctp_stream_init()
    sctp_stream_alloc_out() will not change it, there's no need to
    call sched->init() again, since sctp_outq_init() has already
    done it.

    Fixes: 5bbbbe32a431 ("sctp: introduce stream scheduler foundations")
    Reported-by: syzbot+4c9934f20522c0efd657@syzkaller.appspotmail.com
    Signed-off-by: Xin Long
    Acked-by: Neil Horman
    Acked-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Xin Long
     
  • [ Upstream commit 69ffaebb90369ce08657b5aea4896777b9d6e8fc ]

    rxrpc_get_client_conn() adds a new call to the front of the waiting_calls
    queue if the connection it's going to use already exists. This is bad as
    it allows calls to get starved out.

    Fix this by adding to the tail instead.

    Also change the other enqueue point in the same function to put it on the
    front (ie. when we have a new connection). This makes the point that in
    the case of a new connection the new call goes at the front (though it
    doesn't actually matter since the queue should be unoccupied).
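
    Why head insertion starves older calls can be seen with a toy array-backed queue. This is only an illustration; wait_queue_sketch and its helpers are hypothetical and unrelated to the rxrpc list implementation.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WAITING 8

struct wait_queue_sketch {
    int calls[MAX_WAITING];       /* calls[0] is served next */
    size_t len;
};

/* Append at the tail: arrival order is preserved. Returns 0, or -1 if full. */
int enqueue_tail(struct wait_queue_sketch *q, int call_id)
{
    assert(q != NULL);
    if (q->len == MAX_WAITING)
        return -1;
    q->calls[q->len++] = call_id;
    return 0;
}

/* Insert at the head: the newest call jumps ahead of all waiting calls. */
int enqueue_head(struct wait_queue_sketch *q, int call_id)
{
    assert(q != NULL);
    if (q->len == MAX_WAITING)
        return -1;
    for (size_t i = q->len; i > 0; i--)
        q->calls[i] = q->calls[i - 1];
    q->calls[0] = call_id;
    q->len++;
    return 0;
}

/* Remove and return the call at the front, or -1 if the queue is empty. */
int dequeue_front(struct wait_queue_sketch *q)
{
    assert(q != NULL);
    if (q->len == 0)
        return -1;
    int id = q->calls[0];
    for (size_t i = 1; i < q->len; i++)
        q->calls[i - 1] = q->calls[i];
    q->len--;
    return id;
}
```

    With enqueue_head(), a steady stream of new calls keeps the oldest call at the back indefinitely; with enqueue_tail() it is served first.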

    Fixes: 45025bceef17 ("rxrpc: Improve management and caching of client connection objects")
    Signed-off-by: David Howells
    Reviewed-by: Marc Dionne
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    David Howells
     
  • [ Upstream commit ee60ad219f5c7c4fb2f047f88037770063ef785f ]

    The race occurs in __mkroute_output() when 2 threads lookup a dst:

    CPU A                        CPU B
    find_exception()
                                 find_exception() [fnhe expires]
                                 ip_del_fnhe() [fnhe is deleted]
    rt_bind_exception()

    In rt_bind_exception() it will bind a deleted fnhe with the new dst, and
    this dst will get no chance to be freed. It causes a dev refcnt leak and
    consecutive dmesg warnings:

    unregister_netdevice: waiting for ethX to become free. Usage count = 1

    Special thanks to Jon for identifying the issue.

    This patch fixes it by setting fnhe_daddr to 0 in ip_del_fnhe() to stop
    binding the deleted fnhe with a new dst when checking fnhe's fnhe_daddr
    and daddr in rt_bind_exception().

    It works as both ip_del_fnhe() and rt_bind_exception() are protected by
    fnhe_lock and the fnhe is freed by kfree_rcu().
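
    The mechanism can be sketched in user space as a "tombstone" check: deletion zeroes the key, and binding compares the key first. Locking and RCU are left out, and the sketch assumes real destination addresses are non-zero. fnhe_sketch, fnhe_delete() and fnhe_bind() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a next-hop exception entry. */
struct fnhe_sketch {
    uint32_t daddr;               /* destination this entry was created for */
    bool bound;                   /* true once a dst has been attached */
};

/* Deleting an entry zeroes its key so stale holders can recognise it. */
void fnhe_delete(struct fnhe_sketch *f)
{
    assert(f != NULL);
    f->daddr = 0;
}

/* Bind only if the entry still belongs to daddr; returns false otherwise. */
bool fnhe_bind(struct fnhe_sketch *f, uint32_t daddr)
{
    assert(f != NULL);
    if (f->daddr != daddr)        /* entry was deleted (or is for another dst) */
        return false;
    f->bound = true;
    return true;
}
```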

    Fixes: deed49df7390 ("route: check and remove route cache when we get route")
    Signed-off-by: Jon Maxwell
    Signed-off-by: Xin Long
    Reviewed-by: David Ahern
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Xin Long
     
  • [ Upstream commit ae9819e339b451da7a86ab6fe38ecfcb6814e78a ]

    Hardware has the CBS (Credit Based Shaper) which affects only Q3
    and Q2. When updating the CBS settings, even if the driver does so
    after waiting for Tx DMA finished, there is a possibility that frame
    data still remains in TxFIFO.

    To avoid this, decrease TxFIFO depth of Q3 and Q2 to one.

    This patch has been exercised using netperf TCP_MAERTS, TCP_STREAM
    and UDP_STREAM tests run on an Ebisu board. No performance change was
    detected, outside of noise in the tests, both in terms of throughput and
    CPU utilisation.

    Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper")
    Signed-off-by: Masaru Nagai
    Signed-off-by: Kazuya Mizuguchi
    [simon: updated changelog]
    Signed-off-by: Simon Horman
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Masaru Nagai
     
  • [ Upstream commit 9417d81f4f8adfe20a12dd1fadf73a618cbd945d ]

    sk_setup_caps() is called to set sk->sk_dst_cache in pptp_connect,
    so we have to dst_release(sk->sk_dst_cache) in pptp_sock_destruct,
    otherwise, the dst refcnt will leak.

    It can be reproduced by this syz log:

    r1 = socket$pptp(0x18, 0x1, 0x2)
    bind$pptp(r1, &(0x7f0000000100)={0x18, 0x2, {0x0, @local}}, 0x1e)
    connect$pptp(r1, &(0x7f0000000000)={0x18, 0x2, {0x3, @remote}}, 0x1e)

    Consecutive dmesg warnings will occur:

    unregister_netdevice: waiting for lo to become free. Usage count = 1

    v1->v2:
    - use rcu_dereference_protected() instead of rcu_dereference_check(),
    as suggested by Eric.

    Fixes: 00959ade36ac ("PPTP: PPP over IPv4 (Point-to-Point Tunneling Protocol)")
    Reported-by: Xiumei Mu
    Signed-off-by: Xin Long
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Xin Long
     
  • [ Upstream commit ee74d0bd4325efb41e38affe5955f920ed973f23 ]

    In case x25_connect() fails and frees the socket neighbour,
    we also need to undo the change done to x25->state.

    Before my last bug fix, we had use-after-free so this
    patch fixes a latent bug.

    syzbot report :

    kasan: CONFIG_KASAN_INLINE enabled
    kasan: GPF could be caused by NULL-ptr deref or user memory access
    general protection fault: 0000 [#1] PREEMPT SMP KASAN
    CPU: 1 PID: 16137 Comm: syz-executor.1 Not tainted 5.0.0+ #117
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    RIP: 0010:x25_write_internal+0x1e8/0xdf0 net/x25/x25_subr.c:173
    Code: 00 40 88 b5 e0 fe ff ff 0f 85 01 0b 00 00 48 8b 8b 80 04 00 00 48 ba 00 00 00 00 00 fc ff df 48 8d 79 1c 48 89 fe 48 c1 ee 03 b6 34 16 48 89 fa 83 e2 07 83 c2 03 40 38 f2 7c 09 40 84 f6 0f
    RSP: 0018:ffff888076717a08 EFLAGS: 00010207
    RAX: ffff88805f2f2292 RBX: ffff8880a0ae6000 RCX: 0000000000000000
    kobject: 'loop5' (0000000018d0d0ee): kobject_uevent_env
    RDX: dffffc0000000000 RSI: 0000000000000003 RDI: 000000000000001c
    RBP: ffff888076717b40 R08: ffff8880950e0580 R09: ffffed100be5e46d
    R10: ffffed100be5e46c R11: ffff88805f2f2363 R12: ffff888065579840
    kobject: 'loop5' (0000000018d0d0ee): fill_kobj_path: path = '/devices/virtual/block/loop5'
    R13: 1ffff1100ece2f47 R14: 0000000000000013 R15: 0000000000000013
    FS: 00007fb88cf43700(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007f9a42a41028 CR3: 0000000087a67000 CR4: 00000000001406e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
    x25_release+0xd0/0x340 net/x25/af_x25.c:658
    __sock_release+0xd3/0x2b0 net/socket.c:579
    sock_close+0x1b/0x30 net/socket.c:1162
    __fput+0x2df/0x8d0 fs/file_table.c:278
    ____fput+0x16/0x20 fs/file_table.c:309
    task_work_run+0x14a/0x1c0 kernel/task_work.c:113
    get_signal+0x1961/0x1d50 kernel/signal.c:2388
    do_signal+0x87/0x1940 arch/x86/kernel/signal.c:816
    exit_to_usermode_loop+0x244/0x2c0 arch/x86/entry/common.c:162
    prepare_exit_to_usermode arch/x86/entry/common.c:197 [inline]
    syscall_return_slowpath arch/x86/entry/common.c:268 [inline]
    do_syscall_64+0x52d/0x610 arch/x86/entry/common.c:293
    entry_SYSCALL_64_after_hwframe+0x49/0xbe
    RIP: 0033:0x457f29
    Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 3d 01 f0 ff ff 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00
    RSP: 002b:00007fb88cf42c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
    RAX: fffffffffffffe00 RBX: 0000000000000003 RCX: 0000000000457f29
    RDX: 0000000000000012 RSI: 0000000020000080 RDI: 0000000000000004
    RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000246 R12: 00007fb88cf436d4
    R13: 00000000004be462 R14: 00000000004cec98 R15: 00000000ffffffff
    Modules linked in:

    Fixes: 95d6ebd53c79 ("net/x25: fix use-after-free in x25_device_event()")
    Signed-off-by: Eric Dumazet
    Cc: andrew hendry
    Reported-by: syzbot
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
     
  • [ Upstream commit 95d6ebd53c79522bf9502dbc7e89e0d63f94dae4 ]

    In case of failure x25_connect() does a x25_neigh_put(x25->neighbour)
    but forgets to clear x25->neighbour pointer, thus triggering use-after-free.

    Since the socket is visible in x25_list, we need to hold x25_list_lock
    to protect the operation.

    syzbot report :

    BUG: KASAN: use-after-free in x25_kill_by_device net/x25/af_x25.c:217 [inline]
    BUG: KASAN: use-after-free in x25_device_event+0x296/0x2b0 net/x25/af_x25.c:252
    Read of size 8 at addr ffff8880a030edd0 by task syz-executor003/7854

    CPU: 0 PID: 7854 Comm: syz-executor003 Not tainted 5.0.0+ #97
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:
    __dump_stack lib/dump_stack.c:77 [inline]
    dump_stack+0x172/0x1f0 lib/dump_stack.c:113
    print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187
    kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317
    __asan_report_load8_noabort+0x14/0x20 mm/kasan/generic_report.c:135
    x25_kill_by_device net/x25/af_x25.c:217 [inline]
    x25_device_event+0x296/0x2b0 net/x25/af_x25.c:252
    notifier_call_chain+0xc7/0x240 kernel/notifier.c:93
    __raw_notifier_call_chain kernel/notifier.c:394 [inline]
    raw_notifier_call_chain+0x2e/0x40 kernel/notifier.c:401
    call_netdevice_notifiers_info+0x3f/0x90 net/core/dev.c:1739
    call_netdevice_notifiers_extack net/core/dev.c:1751 [inline]
    call_netdevice_notifiers net/core/dev.c:1765 [inline]
    __dev_notify_flags+0x1e9/0x2c0 net/core/dev.c:7607
    dev_change_flags+0x10d/0x170 net/core/dev.c:7643
    dev_ifsioc+0x2b0/0x940 net/core/dev_ioctl.c:237
    dev_ioctl+0x1b8/0xc70 net/core/dev_ioctl.c:488
    sock_do_ioctl+0x1bd/0x300 net/socket.c:995
    sock_ioctl+0x32b/0x610 net/socket.c:1096
    vfs_ioctl fs/ioctl.c:46 [inline]
    file_ioctl fs/ioctl.c:509 [inline]
    do_vfs_ioctl+0xd6e/0x1390 fs/ioctl.c:696
    ksys_ioctl+0xab/0xd0 fs/ioctl.c:713
    __do_sys_ioctl fs/ioctl.c:720 [inline]
    __se_sys_ioctl fs/ioctl.c:718 [inline]
    __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
    do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
    entry_SYSCALL_64_after_hwframe+0x49/0xbe
    RIP: 0033:0x4467c9
    Code: e8 0c e8 ff ff 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 3d 01 f0 ff ff 0f 83 5b 07 fc ff c3 66 2e 0f 1f 84 00 00 00 00
    RSP: 002b:00007fdbea222d98 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
    RAX: ffffffffffffffda RBX: 00000000006dbc58 RCX: 00000000004467c9
    RDX: 0000000020000340 RSI: 0000000000008914 RDI: 0000000000000003
    RBP: 00000000006dbc50 R08: 00007fdbea223700 R09: 0000000000000000
    R10: 00007fdbea223700 R11: 0000000000000246 R12: 00000000006dbc5c
    R13: 6000030030626669 R14: 0000000000000000 R15: 0000000030626669

    Allocated by task 7843:
    save_stack+0x45/0xd0 mm/kasan/common.c:73
    set_track mm/kasan/common.c:85 [inline]
    __kasan_kmalloc mm/kasan/common.c:495 [inline]
    __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:468
    kasan_kmalloc+0x9/0x10 mm/kasan/common.c:509
    kmem_cache_alloc_trace+0x151/0x760 mm/slab.c:3615
    kmalloc include/linux/slab.h:545 [inline]
    x25_link_device_up+0x46/0x3f0 net/x25/x25_link.c:249
    x25_device_event+0x116/0x2b0 net/x25/af_x25.c:242
    notifier_call_chain+0xc7/0x240 kernel/notifier.c:93
    __raw_notifier_call_chain kernel/notifier.c:394 [inline]
    raw_notifier_call_chain+0x2e/0x40 kernel/notifier.c:401
    call_netdevice_notifiers_info+0x3f/0x90 net/core/dev.c:1739
    call_netdevice_notifiers_extack net/core/dev.c:1751 [inline]
    call_netdevice_notifiers net/core/dev.c:1765 [inline]
    __dev_notify_flags+0x121/0x2c0 net/core/dev.c:7605
    dev_change_flags+0x10d/0x170 net/core/dev.c:7643
    dev_ifsioc+0x2b0/0x940 net/core/dev_ioctl.c:237
    dev_ioctl+0x1b8/0xc70 net/core/dev_ioctl.c:488
    sock_do_ioctl+0x1bd/0x300 net/socket.c:995
    sock_ioctl+0x32b/0x610 net/socket.c:1096
    vfs_ioctl fs/ioctl.c:46 [inline]
    file_ioctl fs/ioctl.c:509 [inline]
    do_vfs_ioctl+0xd6e/0x1390 fs/ioctl.c:696
    ksys_ioctl+0xab/0xd0 fs/ioctl.c:713
    __do_sys_ioctl fs/ioctl.c:720 [inline]
    __se_sys_ioctl fs/ioctl.c:718 [inline]
    __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
    do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
    entry_SYSCALL_64_after_hwframe+0x49/0xbe

    Freed by task 7865:
    save_stack+0x45/0xd0 mm/kasan/common.c:73
    set_track mm/kasan/common.c:85 [inline]
    __kasan_slab_free+0x102/0x150 mm/kasan/common.c:457
    kasan_slab_free+0xe/0x10 mm/kasan/common.c:465
    __cache_free mm/slab.c:3494 [inline]
    kfree+0xcf/0x230 mm/slab.c:3811
    x25_neigh_put include/net/x25.h:253 [inline]
    x25_connect+0x8d8/0xde0 net/x25/af_x25.c:824
    __sys_connect+0x266/0x330 net/socket.c:1685
    __do_sys_connect net/socket.c:1696 [inline]
    __se_sys_connect net/socket.c:1693 [inline]
    __x64_sys_connect+0x73/0xb0 net/socket.c:1693
    do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
    entry_SYSCALL_64_after_hwframe+0x49/0xbe

    The buggy address belongs to the object at ffff8880a030edc0
    which belongs to the cache kmalloc-256 of size 256
    The buggy address is located 16 bytes inside of
    256-byte region [ffff8880a030edc0, ffff8880a030eec0)
    The buggy address belongs to the page:
    page:ffffea000280c380 count:1 mapcount:0 mapping:ffff88812c3f07c0 index:0x0
    flags: 0x1fffc0000000200(slab)
    raw: 01fffc0000000200 ffffea0002806788 ffffea00027f0188 ffff88812c3f07c0
    raw: 0000000000000000 ffff8880a030e000 000000010000000c 0000000000000000
    page dumped because: kasan: bad access detected

    Signed-off-by: Eric Dumazet
    Reported-by: syzbot+04babcefcd396fabec37@syzkaller.appspotmail.com
    Cc: andrew hendry
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
     
  • [ Upstream commit a843dc4ebaecd15fca1f4d35a97210f72ea1473b ]

    In check_6rd(), tunnel->ip6rd.relay_prefixlen may be equal to
    32, so UBSAN complains about the shift.

    UBSAN: Undefined behaviour in net/ipv6/sit.c:781:47
    shift exponent 32 is too large for 32-bit type 'unsigned int'
    CPU: 6 PID: 20036 Comm: syz-executor.0 Not tainted 4.19.27 #2
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1
    04/01/2014
    Call Trace:
    __dump_stack lib/dump_stack.c:77 [inline]
    dump_stack+0xca/0x13e lib/dump_stack.c:113
    ubsan_epilogue+0xe/0x81 lib/ubsan.c:159
    __ubsan_handle_shift_out_of_bounds+0x293/0x2e8 lib/ubsan.c:425
    check_6rd.constprop.9+0x433/0x4e0 net/ipv6/sit.c:781
    try_6rd net/ipv6/sit.c:806 [inline]
    ipip6_tunnel_xmit net/ipv6/sit.c:866 [inline]
    sit_tunnel_xmit+0x141c/0x2720 net/ipv6/sit.c:1033
    __netdev_start_xmit include/linux/netdevice.h:4300 [inline]
    netdev_start_xmit include/linux/netdevice.h:4309 [inline]
    xmit_one net/core/dev.c:3243 [inline]
    dev_hard_start_xmit+0x17c/0x780 net/core/dev.c:3259
    __dev_queue_xmit+0x1656/0x2500 net/core/dev.c:3829
    neigh_output include/net/neighbour.h:501 [inline]
    ip6_finish_output2+0xa36/0x2290 net/ipv6/ip6_output.c:120
    ip6_finish_output+0x3e7/0xa20 net/ipv6/ip6_output.c:154
    NF_HOOK_COND include/linux/netfilter.h:278 [inline]
    ip6_output+0x1e2/0x720 net/ipv6/ip6_output.c:171
    dst_output include/net/dst.h:444 [inline]
    ip6_local_out+0x99/0x170 net/ipv6/output_core.c:176
    ip6_send_skb+0x9d/0x2f0 net/ipv6/ip6_output.c:1697
    ip6_push_pending_frames+0xc0/0x100 net/ipv6/ip6_output.c:1717
    rawv6_push_pending_frames net/ipv6/raw.c:616 [inline]
    rawv6_sendmsg+0x2435/0x3530 net/ipv6/raw.c:946
    inet_sendmsg+0xf8/0x5c0 net/ipv4/af_inet.c:798
    sock_sendmsg_nosec net/socket.c:621 [inline]
    sock_sendmsg+0xc8/0x110 net/socket.c:631
    ___sys_sendmsg+0x6cf/0x890 net/socket.c:2114
    __sys_sendmsg+0xf0/0x1b0 net/socket.c:2152
    do_syscall_64+0xc8/0x580 arch/x86/entry/common.c:290
    entry_SYSCALL_64_after_hwframe+0x49/0xbe
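
    The message does not spell out the fix. One way to avoid the undefined shift reported above, shown here purely as an assumption, is to treat a shift count of 32 as a special case that yields 0. shr32_guarded() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Right-shift a 32-bit value by 'bits'. In C, shifting a 32-bit type by 32
 * or more is undefined behaviour, so that case is answered explicitly with
 * 0 (every bit shifted out) instead of being handed to the shift operator.
 */
uint32_t shr32_guarded(uint32_t value, unsigned int bits)
{
    assert(bits <= 32);           /* an IPv4 prefix length never exceeds 32 */
    return bits < 32 ? value >> bits : 0;
}
```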

    Signed-off-by: linmiaohe
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Miaohe Lin
     
  • [ Upstream commit 1e027960edfaa6a43f9ca31081729b716598112b ]

    syzbot found another add_timer() issue, this time in net/hsr [1]

    Let's use mod_timer() which is safe.

    [1]
    kernel BUG at kernel/time/timer.c:1136!
    invalid opcode: 0000 [#1] PREEMPT SMP KASAN
    CPU: 0 PID: 15909 Comm: syz-executor.3 Not tainted 5.0.0+ #97
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    kobject: 'loop2' (00000000f5629718): kobject_uevent_env
    RIP: 0010:add_timer kernel/time/timer.c:1136 [inline]
    RIP: 0010:add_timer+0x654/0xbe0 kernel/time/timer.c:1134
    Code: 0f 94 c5 31 ff 44 89 ee e8 09 61 0f 00 45 84 ed 0f 84 77 fd ff ff e8 bb 5f 0f 00 e8 07 10 a0 ff e9 68 fd ff ff e8 ac 5f 0f 00 0b e8 a5 5f 0f 00 0f 0b e8 9e 5f 0f 00 4c 89 b5 58 ff ff ff e9
    RSP: 0018:ffff8880656eeca0 EFLAGS: 00010246
    kobject: 'loop2' (00000000f5629718): fill_kobj_path: path = '/devices/virtual/block/loop2'
    RAX: 0000000000040000 RBX: 1ffff1100caddd9a RCX: ffffc9000c436000
    RDX: 0000000000040000 RSI: ffffffff816056c4 RDI: ffff88806a2f6cc8
    RBP: ffff8880656eed58 R08: ffff888067f4a300 R09: ffff888067f4abc8
    R10: 0000000000000000 R11: 0000000000000000 R12: ffff88806a2f6cc0
    R13: dffffc0000000000 R14: 0000000000000001 R15: ffff8880656eed30
    FS: 00007fc2019bf700(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000738000 CR3: 0000000067e8e000 CR4: 00000000001406f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
    hsr_check_announce net/hsr/hsr_device.c:99 [inline]
    hsr_check_carrier_and_operstate+0x567/0x6f0 net/hsr/hsr_device.c:120
    hsr_netdev_notify+0x297/0xa00 net/hsr/hsr_main.c:51
    notifier_call_chain+0xc7/0x240 kernel/notifier.c:93
    __raw_notifier_call_chain kernel/notifier.c:394 [inline]
    raw_notifier_call_chain+0x2e/0x40 kernel/notifier.c:401
    call_netdevice_notifiers_info+0x3f/0x90 net/core/dev.c:1739
    call_netdevice_notifiers_extack net/core/dev.c:1751 [inline]
    call_netdevice_notifiers net/core/dev.c:1765 [inline]
    dev_open net/core/dev.c:1436 [inline]
    dev_open+0x143/0x160 net/core/dev.c:1424
    team_port_add drivers/net/team/team.c:1203 [inline]
    team_add_slave+0xa07/0x15d0 drivers/net/team/team.c:1933
    do_set_master net/core/rtnetlink.c:2358 [inline]
    do_set_master+0x1d4/0x230 net/core/rtnetlink.c:2332
    do_setlink+0x966/0x3510 net/core/rtnetlink.c:2493
    rtnl_setlink+0x271/0x3b0 net/core/rtnetlink.c:2747
    rtnetlink_rcv_msg+0x465/0xb00 net/core/rtnetlink.c:5192
    netlink_rcv_skb+0x17a/0x460 net/netlink/af_netlink.c:2485
    rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5210
    netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
    netlink_unicast+0x536/0x720 net/netlink/af_netlink.c:1336
    netlink_sendmsg+0x8ae/0xd70 net/netlink/af_netlink.c:1925
    sock_sendmsg_nosec net/socket.c:622 [inline]
    sock_sendmsg+0xdd/0x130 net/socket.c:632
    sock_write_iter+0x27c/0x3e0 net/socket.c:923
    call_write_iter include/linux/fs.h:1869 [inline]
    do_iter_readv_writev+0x5e0/0x8e0 fs/read_write.c:680
    do_iter_write fs/read_write.c:956 [inline]
    do_iter_write+0x184/0x610 fs/read_write.c:937
    vfs_writev+0x1b3/0x2f0 fs/read_write.c:1001
    do_writev+0xf6/0x290 fs/read_write.c:1036
    __do_sys_writev fs/read_write.c:1109 [inline]
    __se_sys_writev fs/read_write.c:1106 [inline]
    __x64_sys_writev+0x75/0xb0 fs/read_write.c:1106
    do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
    entry_SYSCALL_64_after_hwframe+0x49/0xbe
    RIP: 0033:0x457f29
    Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 3d 01 f0 ff ff 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00
    RSP: 002b:00007fc2019bec78 EFLAGS: 00000246 ORIG_RAX: 0000000000000014
    RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457f29
    RDX: 0000000000000001 RSI: 00000000200000c0 RDI: 0000000000000003
    RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000246 R12: 00007fc2019bf6d4
    R13: 00000000004c4a60 R14: 00000000004dd218 R15: 00000000ffffffff

    Fixes: f421436a591d ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)")
    Signed-off-by: Eric Dumazet
    Reported-by: syzbot
    Cc: Arvid Brodin
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
     
  • [ Upstream commit 6caabe7f197d3466d238f70915d65301f1716626 ]

    If hsr_add_port(hsr, hsr_dev, HSR_PT_MASTER) fails to
    add the port, hsr_dev_finalize() returns res directly and forgets
    to free the node that was allocated in hsr_create_self_node(), and
    forgets to delete the node->mac_list linked in hsr->self_node_db.

    BUG: memory leak
    unreferenced object 0xffff8881cfa0c780 (size 64):
    comm "syz-executor.0", pid 2077, jiffies 4294717969 (age 2415.377s)
    hex dump (first 32 bytes):
    e0 c7 a0 cf 81 88 ff ff 00 02 00 00 00 00 ad de ................
    00 e6 49 cd 81 88 ff ff c0 9b 87 d0 81 88 ff ff ..I.............
    backtrace:
    [] hsr_dev_finalize+0x736/0x960 [hsr]
    [] hsr_newlink+0x2b2/0x3e0 [hsr]
    [] __rtnl_newlink+0xf1f/0x1600 net/core/rtnetlink.c:3182
    [] rtnl_newlink+0x66/0x90 net/core/rtnetlink.c:3240
    [] rtnetlink_rcv_msg+0x54e/0xb90 net/core/rtnetlink.c:5130
    [] netlink_rcv_skb+0x129/0x340 net/netlink/af_netlink.c:2477
    [] netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
    [] netlink_unicast+0x49a/0x650 net/netlink/af_netlink.c:1336
    [] netlink_sendmsg+0x88b/0xdf0 net/netlink/af_netlink.c:1917
    [] sock_sendmsg_nosec net/socket.c:621 [inline]
    [] sock_sendmsg+0xc3/0x100 net/socket.c:631
    [] __sys_sendto+0x33e/0x560 net/socket.c:1786
    [] __do_sys_sendto net/socket.c:1798 [inline]
    [] __se_sys_sendto net/socket.c:1794 [inline]
    [] __x64_sys_sendto+0xdd/0x1b0 net/socket.c:1794
    [] do_syscall_64+0x147/0x600 arch/x86/entry/common.c:290
    [] entry_SYSCALL_64_after_hwframe+0x49/0xbe
    [] 0xffffffffffffffff
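
    The usual shape of such a fix is an error path that unwinds what was already set up. The following user-space sketch shows that pattern only; hsr_sketch, create_self_node(), delete_self_node() and finalize_sketch() are hypothetical, and the failing step is simulated with a flag.

```c
#include <assert.h>
#include <stdlib.h>

struct self_node_sketch { int id; };

struct hsr_sketch {
    struct self_node_sketch *self_node;   /* NULL when no node is allocated */
};

/* Allocate the self node; returns 0 on success, -1 on failure. */
int create_self_node(struct hsr_sketch *h)
{
    assert(h != NULL);
    h->self_node = calloc(1, sizeof(*h->self_node));
    return h->self_node ? 0 : -1;
}

/* Release the self node and clear the pointer. */
void delete_self_node(struct hsr_sketch *h)
{
    assert(h != NULL);
    free(h->self_node);
    h->self_node = NULL;
}

/* Set up the device; 'port_add_fails' simulates the port-add step failing. */
int finalize_sketch(struct hsr_sketch *h, int port_add_fails)
{
    int res = create_self_node(h);

    if (res)
        return res;

    res = port_add_fails ? -1 : 0;        /* stand-in for adding the master port */
    if (res)
        goto err_del_node;

    return 0;

err_del_node:
    delete_self_node(h);                  /* undo the earlier allocation */
    return res;
}
```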

    Fixes: c5a759117210 ("net/hsr: Use list_head (and rcu) instead of array for slave devices.")
    Reported-by: Hulk Robot
    Signed-off-by: Mao Wenan
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Mao Wenan
     
  • [ Upstream commit deb6bfabdbb634e91f36a4e9cb00a7137d72d886 ]

    It has been observed that the tx queue may stall while downloading
    from certain web sites (for example, www.speedtest.net).

    The cause has been tracked down to a corner case where
    the tx interrupt vector was disabled automatically, but
    was not re-enabled later.

    The lan743x has two mechanisms to enable/disable individual
    interrupts. Interrupts can be enabled/disabled by individual
    source, and they can also be enabled/disabled by individual
    vector which has been mapped to the source. Both must be
    enabled for interrupts to work properly.

    The TX code path primarily uses the interrupt enable/disable of
    the TX source bit, while leaving the vector enabled all the time.

    However, while investigating this issue it was noticed that
    the driver requested the use of the vector auto clear feature.

    The test above revealed a case where the vector enable was
    cleared unintentionally.

    This patch fixes the issue by deleting the lines that request
    the vector auto clear feature to be used.

    Fixes: 23f0703c125b ("lan743x: Add main source files for new lan743x driver")
    Signed-off-by: Bryan Whitehead
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Bryan Whitehead
     
  • [ Upstream commit dd9d9f5907bb475f8b1796c47d4ecc7fb9b72136 ]

    It has been noticed that running the speed test at
    www.speedtest.net occasionally causes a kernel panic.

    Investigation revealed that under this test RX buffer allocation
    sometimes fails and returns NULL. But the lan743x driver did
    not handle this case.

    This patch fixes this issue by attempting to allocate a buffer
    before sending the new rx packet to the OS. If the allocation
    fails then the new rx packet is dropped and the existing buffer
    is reused in the DMA ring.
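
    A user-space sketch of the "allocate first, then hand off" ordering described above. It is not the lan743x code: rx_slot_sketch and rx_slot_process() are hypothetical, and allocation failure is simulated with a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define SKETCH_BUF_SIZE 2048

/* One entry of a hypothetical receive ring. */
struct rx_slot_sketch {
    void *buf;                    /* buffer currently owned by the ring */
};

/*
 * Process a received packet sitting in 'slot'. A replacement buffer is
 * allocated before the filled one is given away. Returns the filled buffer
 * (the caller now owns and frees it), or NULL if the packet was dropped and
 * the slot keeps its existing buffer.
 */
void *rx_slot_process(struct rx_slot_sketch *slot, bool alloc_fails)
{
    assert(slot != NULL && slot->buf != NULL);

    void *fresh = alloc_fails ? NULL : malloc(SKETCH_BUF_SIZE);
    if (!fresh)
        return NULL;              /* drop the packet, reuse the old buffer */

    void *filled = slot->buf;
    slot->buf = fresh;            /* the ring never holds a NULL buffer */
    return filled;
}
```

    Because the slot is only updated after the allocation succeeds, the ring always has a valid buffer to give back to the hardware.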

    Updates for v2:
    Two additional locations where the allocation was not checked
    have been changed to reuse the existing buffer.

    Fixes: 23f0703c125b ("lan743x: Add main source files for new lan743x driver")
    Signed-off-by: Bryan Whitehead
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Bryan Whitehead
     
  • [ Upstream commit 163d1c3d6f17556ed3c340d3789ea93be95d6c28 ]

    Back in 2013 Hannes took care of most of such leaks in commit
    bceaa90240b6 ("inet: prevent leakage of uninitialized memory to user in recv syscalls")

    But the bug in l2tp_ip6_recvmsg() has not been fixed.

    syzbot report :

    BUG: KMSAN: kernel-infoleak in _copy_to_user+0x16b/0x1f0 lib/usercopy.c:32
    CPU: 1 PID: 10996 Comm: syz-executor362 Not tainted 5.0.0+ #11
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:
    __dump_stack lib/dump_stack.c:77 [inline]
    dump_stack+0x173/0x1d0 lib/dump_stack.c:113
    kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:600
    kmsan_internal_check_memory+0x9f4/0xb10 mm/kmsan/kmsan.c:694
    kmsan_copy_to_user+0xab/0xc0 mm/kmsan/kmsan_hooks.c:601
    _copy_to_user+0x16b/0x1f0 lib/usercopy.c:32
    copy_to_user include/linux/uaccess.h:174 [inline]
    move_addr_to_user+0x311/0x570 net/socket.c:227
    ___sys_recvmsg+0xb65/0x1310 net/socket.c:2283
    do_recvmmsg+0x646/0x10c0 net/socket.c:2390
    __sys_recvmmsg net/socket.c:2469 [inline]
    __do_sys_recvmmsg net/socket.c:2492 [inline]
    __se_sys_recvmmsg+0x1d1/0x350 net/socket.c:2485
    __x64_sys_recvmmsg+0x62/0x80 net/socket.c:2485
    do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
    entry_SYSCALL_64_after_hwframe+0x63/0xe7
    RIP: 0033:0x445819
    Code: e8 6c b6 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 3d 01 f0 ff ff 0f 83 2b 12 fc ff c3 66 2e 0f 1f 84 00 00 00 00
    RSP: 002b:00007f64453eddb8 EFLAGS: 00000246 ORIG_RAX: 000000000000012b
    RAX: ffffffffffffffda RBX: 00000000006dac28 RCX: 0000000000445819
    RDX: 0000000000000005 RSI: 0000000020002f80 RDI: 0000000000000003
    RBP: 00000000006dac20 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006dac2c
    R13: 00007ffeba8f87af R14: 00007f64453ee9c0 R15: 20c49ba5e353f7cf

    Local variable description: ----addr@___sys_recvmsg
    Variable was created at:
    ___sys_recvmsg+0xf6/0x1310 net/socket.c:2244
    do_recvmmsg+0x646/0x10c0 net/socket.c:2390

    Bytes 0-31 of 32 are uninitialized
    Memory access of size 32 starts at ffff8880ae62fbb0
    Data copied to user address 0000000020000000

    Fixes: a32e0eec7042 ("l2tp: introduce L2TPv3 IP encapsulation support for IPv6")
    Signed-off-by: Eric Dumazet
    Reported-by: syzbot
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
     
  • [ Upstream commit 22c74764aa2943ecdf9f07c900d8a9c8ba6c9265 ]

    If a non-local multicast packet reaches ip_route_input_rcu() while
    the ingress device IPv4 private data (in_dev) is NULL, we end up
    doing a NULL pointer dereference in IN_DEV_MFORWARD().

    Since the later call to ip_route_input_mc() is going to fail if
    !in_dev, we can fail early in such a scenario and avoid the dangerous
    code path.

    v1 -> v2:
    - clarified the commit message, no code changes

    Reported-by: Tianhao Zhao
    Fixes: e58e41596811 ("net: Enable support for VRF with ipv4 multicast")
    Signed-off-by: Paolo Abeni
    Reviewed-by: David Ahern
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Paolo Abeni
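The early-exit idea in the commit above can be sketched as a small user-space model. This is only an illustration, not the kernel code: `in_dev_model`, `net_dev_model`, and `route_input_mc_model()` are hypothetical names, and the forwarding flag stands in for what IN_DEV_MFORWARD() would read from the real in_dev.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-device IPv4 private data (in_dev). */
struct in_dev_model {
    bool mc_forwarding;            /* models what IN_DEV_MFORWARD() reads */
};

/* Hypothetical stand-in for the ingress device. */
struct net_dev_model {
    struct in_dev_model *in_dev;   /* NULL once the IPv4 data is gone */
};

/*
 * Returns 0 if the packet would be handed on to multicast routing,
 * -EINVAL if it is rejected. "ours" models a packet for a group the
 * host has joined; "local_mcast" models a link-local multicast address.
 */
int route_input_mc_model(const struct net_dev_model *dev,
                         bool ours, bool local_mcast)
{
    assert(dev != NULL);

    const struct in_dev_model *in_dev = dev->in_dev;

    /* Fail early: every later check would dereference in_dev. */
    if (!in_dev)
        return -EINVAL;

    if (ours || (!local_mcast && in_dev->mc_forwarding))
        return 0;

    return -EINVAL;
}
```

The point of the sketch is the ordering: the NULL test comes before any read through in_dev, so the forwarding check can no longer be reached with a NULL pointer.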
     
  • [ Upstream commit 2a5ff07a0eb945f291e361aa6f6becca8340ba46 ]

    We keep receiving syzbot reports [1] that show that tunnels do not play
    the rcu/IFF_UP rules properly.

    At device dismantle phase, gro_cells_destroy() will be called
    only after a full rcu grace period is observed after IFF_UP
    has been cleared.

    This means that IFF_UP needs to be tested before queueing packets
    into netif_rx() or gro_cells.

    This patch implements the test in gro_cells_receive() because
    too many callers do not seem to bother enough.

    [1]
    BUG: unable to handle kernel paging request at fffff4ca0b9ffffe
    PGD 0 P4D 0
    Oops: 0000 [#1] PREEMPT SMP KASAN
    CPU: 0 PID: 21 Comm: kworker/u4:1 Not tainted 5.0.0+ #97
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Workqueue: netns cleanup_net
    RIP: 0010:__skb_unlink include/linux/skbuff.h:1929 [inline]
    RIP: 0010:__skb_dequeue include/linux/skbuff.h:1945 [inline]
    RIP: 0010:__skb_queue_purge include/linux/skbuff.h:2656 [inline]
    RIP: 0010:gro_cells_destroy net/core/gro_cells.c:89 [inline]
    RIP: 0010:gro_cells_destroy+0x19d/0x360 net/core/gro_cells.c:78
    Code: 03 42 80 3c 20 00 0f 85 53 01 00 00 48 8d 7a 08 49 8b 47 08 49 c7 07 00 00 00 00 48 89 f9 49 c7 47 08 00 00 00 00 48 c1 e9 03 80 3c 21 00 0f 85 10 01 00 00 48 89 c1 48 89 42 08 48 c1 e9 03
    RSP: 0018:ffff8880aa3f79a8 EFLAGS: 00010a02
    RAX: 00ffffffffffffe8 RBX: ffffe8ffffc64b70 RCX: 1ffff8ca0b9ffffe
    RDX: ffffc6505cffffe8 RSI: ffffffff858410ca RDI: ffffc6505cfffff0
    RBP: ffff8880aa3f7a08 R08: ffff8880aa3e8580 R09: fffffbfff1263645
    R10: fffffbfff1263644 R11: ffffffff8931b223 R12: dffffc0000000000
    R13: 0000000000000000 R14: ffffe8ffffc64b80 R15: ffffe8ffffc64b75
    kobject: 'loop2' (000000004bd7d84a): kobject_uevent_env
    FS: 0000000000000000(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: fffff4ca0b9ffffe CR3: 0000000094941000 CR4: 00000000001406f0
    Call Trace:
    kobject: 'loop2' (000000004bd7d84a): fill_kobj_path: path = '/devices/virtual/block/loop2'
    ip_tunnel_dev_free+0x19/0x60 net/ipv4/ip_tunnel.c:1010
    netdev_run_todo+0x51c/0x7d0 net/core/dev.c:8970
    rtnl_unlock+0xe/0x10 net/core/rtnetlink.c:116
    ip_tunnel_delete_nets+0x423/0x5f0 net/ipv4/ip_tunnel.c:1124
    vti_exit_batch_net+0x23/0x30 net/ipv4/ip_vti.c:495
    ops_exit_list.isra.0+0x105/0x160 net/core/net_namespace.c:156
    cleanup_net+0x3fb/0x960 net/core/net_namespace.c:551
    process_one_work+0x98e/0x1790 kernel/workqueue.c:2173
    worker_thread+0x98/0xe40 kernel/workqueue.c:2319
    kthread+0x357/0x430 kernel/kthread.c:246
    ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352
    Modules linked in:
    CR2: fffff4ca0b9ffffe
    [ end trace 513fc9c1338d1cb3 ]
    RIP: 0010:__skb_unlink include/linux/skbuff.h:1929 [inline]
    RIP: 0010:__skb_dequeue include/linux/skbuff.h:1945 [inline]
    RIP: 0010:__skb_queue_purge include/linux/skbuff.h:2656 [inline]
    RIP: 0010:gro_cells_destroy net/core/gro_cells.c:89 [inline]
    RIP: 0010:gro_cells_destroy+0x19d/0x360 net/core/gro_cells.c:78
    Code: 03 42 80 3c 20 00 0f 85 53 01 00 00 48 8d 7a 08 49 8b 47 08 49 c7 07 00 00 00 00 48 89 f9 49 c7 47 08 00 00 00 00 48 c1 e9 03 80 3c 21 00 0f 85 10 01 00 00 48 89 c1 48 89 42 08 48 c1 e9 03
    RSP: 0018:ffff8880aa3f79a8 EFLAGS: 00010a02
    RAX: 00ffffffffffffe8 RBX: ffffe8ffffc64b70 RCX: 1ffff8ca0b9ffffe
    RDX: ffffc6505cffffe8 RSI: ffffffff858410ca RDI: ffffc6505cfffff0
    RBP: ffff8880aa3f7a08 R08: ffff8880aa3e8580 R09: fffffbfff1263645
    R10: fffffbfff1263644 R11: ffffffff8931b223 R12: dffffc0000000000
    kobject: 'loop3' (00000000e4ee57a6): kobject_uevent_env
    R13: 0000000000000000 R14: ffffe8ffffc64b80 R15: ffffe8ffffc64b75
    FS: 0000000000000000(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: fffff4ca0b9ffffe CR3: 0000000094941000 CR4: 00000000001406f0

    Fixes: c9e6bc644e55 ("net: add gro_cells infrastructure")
    Signed-off-by: Eric Dumazet
    Reported-by: syzbot
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
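A minimal user-space sketch of the IFF_UP test described in the commit above. It is an assumption-laden model, not the kernel implementation: `dev_model`, `cells_receive_model()`, and the `MODEL_*` constants are hypothetical, and the real function performs this check inside an RCU read-side critical section, which the model omits.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_IFF_UP 0x1u          /* hypothetical stand-in for IFF_UP */

enum { MODEL_RX_SUCCESS = 0, MODEL_RX_DROP = 1 };

/* Hypothetical stand-in for a tunnel device and its per-CPU cell queue. */
struct dev_model {
    unsigned int flags;            /* MODEL_IFF_UP set while the device is up */
    unsigned long rx_dropped;      /* packets refused because the device was down */
    unsigned long queued;          /* packets accepted into the cell queue */
};

/*
 * Model of the receive path: refuse to queue once the "up" flag is
 * cleared, so nothing new can land in a queue that teardown is about
 * to purge and free.
 */
int cells_receive_model(struct dev_model *dev)
{
    assert(dev != NULL);

    if (!(dev->flags & MODEL_IFF_UP)) {
        dev->rx_dropped++;
        return MODEL_RX_DROP;
    }

    dev->queued++;
    return MODEL_RX_SUCCESS;
}
```

Putting the test in the single receive function, rather than in each caller, matches the rationale given in the commit message: the check then covers every tunnel driver that feeds packets through it.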
     
  • [ Upstream commit 6d2b0f02f5a07a4bf02e4cbc90d7eaa85cac2986 ]

    proc_exit_connector() reads ->real_parent without holding any lock.
    This is not safe, because the parent can go away at any moment, so use
    RCU to protect the access and ensure that this task is not released.

    [ 747.624551] ==================================================================
    [ 747.632946] BUG: KASAN: use-after-free in proc_exit_connector+0x1f7/0x310
    [ 747.640686] Read of size 4 at addr ffff88a0276988e0 by task sshd/2882
    [ 747.648032]
    [ 747.649804] CPU: 11 PID: 2882 Comm: sshd Tainted: G E 4.19.26-rc2 #11
    [ 747.658629] Hardware name: IBM x3550M4 -[7914OFV]-/00AM544, BIOS -[D7E142BUS-1.71]- 07/31/2014
    [ 747.668419] Call Trace:
    [ 747.671269] dump_stack+0xf0/0x19b
    [ 747.675186] ? show_regs_print_info+0x5/0x5
    [ 747.679988] ? kmsg_dump_rewind_nolock+0x59/0x59
    [ 747.685302] print_address_description+0x6a/0x270
    [ 747.691162] kasan_report+0x258/0x380
    [ 747.695835] ? proc_exit_connector+0x1f7/0x310
    [ 747.701402] proc_exit_connector+0x1f7/0x310
    [ 747.706767] ? proc_coredump_connector+0x2d0/0x2d0
    [ 747.712715] ? _raw_write_unlock_irq+0x29/0x50
    [ 747.718270] ? _raw_write_unlock_irq+0x29/0x50
    [ 747.723820] ? ___preempt_schedule+0x16/0x18
    [ 747.729193] ? ___preempt_schedule+0x16/0x18
    [ 747.734574] do_exit+0xa11/0x14f0
    [ 747.738880] ? mm_update_next_owner+0x590/0x590
    [ 747.744525] ? debug_show_all_locks+0x3c0/0x3c0
    [ 747.761448] ? ktime_get_coarse_real_ts64+0xeb/0x1c0
    [ 747.767589] ? lockdep_hardirqs_on+0x1a6/0x290
    [ 747.773154] ? check_chain_key+0x139/0x1f0
    [ 747.778345] ? check_flags.part.35+0x240/0x240
    [ 747.783908] ? __lock_acquire+0x2300/0x2300
    [ 747.789171] ? _raw_spin_unlock_irqrestore+0x59/0x70
    [ 747.795316] ? _raw_spin_unlock_irqrestore+0x59/0x70
    [ 747.801457] ? do_raw_spin_unlock+0x10f/0x1e0
    [ 747.806914] ? do_raw_spin_trylock+0x120/0x120
    [ 747.812481] ? preempt_count_sub+0x14/0xc0
    [ 747.817645] ? _raw_spin_unlock+0x2e/0x50
    [ 747.822708] ? __handle_mm_fault+0x12db/0x1fa0
    [ 747.828367] ? __pmd_alloc+0x2d0/0x2d0
    [ 747.833143] ? check_noncircular+0x50/0x50
    [ 747.838309] ? match_held_lock+0x7f/0x340
    [ 747.843380] ? check_noncircular+0x50/0x50
    [ 747.848561] ? handle_mm_fault+0x21a/0x5f0
    [ 747.853730] ? check_flags.part.35+0x240/0x240
    [ 747.859290] ? check_chain_key+0x139/0x1f0
    [ 747.864474] ? __do_page_fault+0x40f/0x760
    [ 747.869655] ? __audit_syscall_entry+0x4b/0x1f0
    [ 747.875319] ? syscall_trace_enter+0x1d5/0x7b0
    [ 747.880877] ? trace_raw_output_preemptirq_template+0x90/0x90
    [ 747.887895] ? trace_raw_output_sys_exit+0x80/0x80
    [ 747.893860] ? up_read+0x3b/0x90
    [ 747.898142] ? stop_critical_timings+0x260/0x260
    [ 747.903909] do_group_exit+0xe0/0x1c0
    [ 747.908591] ? __x64_sys_exit+0x30/0x30
    [ 747.913460] ? trace_raw_output_preemptirq_template+0x90/0x90
    [ 747.920485] ? tracer_hardirqs_on+0x270/0x270
    [ 747.925956] __x64_sys_exit_group+0x28/0x30
    [ 747.931214] do_syscall_64+0x117/0x400
    [ 747.935988] ? syscall_return_slowpath+0x2f0/0x2f0
    [ 747.941931] ? trace_hardirqs_off_thunk+0x1a/0x1c
    [ 747.947788] ? trace_hardirqs_on_caller+0x1d0/0x1d0
    [ 747.953838] ? lockdep_sys_exit+0x16/0x8e
    [ 747.958915] ? trace_hardirqs_off_thunk+0x1a/0x1c
    [ 747.964784] entry_SYSCALL_64_after_hwframe+0x49/0xbe
    [ 747.971021] RIP: 0033:0x7f572f154c68
    [ 747.975606] Code: Bad RIP value.
    [ 747.979791] RSP: 002b:00007ffed2dfaa58 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
    [ 747.989324] RAX: ffffffffffffffda RBX: 00007f572f431840 RCX: 00007f572f154c68
    [ 747.997910] RDX: 0000000000000001 RSI: 000000000000003c RDI: 0000000000000001
    [ 748.006495] RBP: 0000000000000001 R08: 00000000000000e7 R09: fffffffffffffee0
    [ 748.015079] R10: 00007f572f4387e8 R11: 0000000000000246 R12: 00007f572f431840
    [ 748.023664] R13: 000055a7f90f2c50 R14: 000055a7f96e2310 R15: 000055a7f96e2310
    [ 748.032287]
    [ 748.034509] Allocated by task 2300:
    [ 748.038982] kasan_kmalloc+0xa0/0xd0
    [ 748.043562] kmem_cache_alloc_node+0xf5/0x2e0
    [ 748.049018] copy_process+0x1781/0x4790
    [ 748.053884] _do_fork+0x166/0x9a0
    [ 748.058163] do_syscall_64+0x117/0x400
    [ 748.062943] entry_SYSCALL_64_after_hwframe+0x49/0xbe
    [ 748.069180]
    [ 748.071405] Freed by task 15395:
    [ 748.075591] __kasan_slab_free+0x130/0x180
    [ 748.080752] kmem_cache_free+0xc2/0x310
    [ 748.085619] free_task+0xea/0x130
    [ 748.089901] __put_task_struct+0x177/0x230
    [ 748.095063] finish_task_switch+0x51b/0x5d0
    [ 748.100315] __schedule+0x506/0xfa0
    [ 748.104791] schedule+0xca/0x260
    [ 748.108978] futex_wait_queue_me+0x27e/0x420
    [ 748.114333] futex_wait+0x251/0x550
    [ 748.118814] do_futex+0x75b/0xf80
    [ 748.123097] __x64_sys_futex+0x231/0x2a0
    [ 748.128065] do_syscall_64+0x117/0x400
    [ 748.132835] entry_SYSCALL_64_after_hwframe+0x49/0xbe
    [ 748.139066]
    [ 748.141289] The buggy address belongs to the object at ffff88a027698000
    [ 748.141289] which belongs to the cache task_struct of size 12160
    [ 748.156589] The buggy address is located 2272 bytes inside of
    [ 748.156589] 12160-byte region [ffff88a027698000, ffff88a02769af80)
    [ 748.171114] The buggy address belongs to the page:
    [ 748.177055] page:ffffea00809da600 count:1 mapcount:0 mapping:ffff888107d01e00 index:0x0 compound_mapcount: 0
    [ 748.189136] flags: 0x57ffffc0008100(slab|head)
    [ 748.194688] raw: 0057ffffc0008100 ffffea00809a3200 0000000300000003 ffff888107d01e00
    [ 748.204424] raw: 0000000000000000 0000000000020002 00000001ffffffff 0000000000000000
    [ 748.214146] page dumped because: kasan: bad access detected
    [ 748.220976]
    [ 748.223197] Memory state around the buggy address:
    [ 748.229128] ffff88a027698780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    [ 748.238271] ffff88a027698800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    [ 748.247414] >ffff88a027698880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    [ 748.256564] ^
    [ 748.264267] ffff88a027698900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    [ 748.273493] ffff88a027698980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    [ 748.282630] ==================================================================

    Fixes: b086ff87251b4a4 ("connector: add parent pid and tgid to coredump and exit events")
    Signed-off-by: Zhang Yu
    Signed-off-by: Li RongQing
    Acked-by: Evgeniy Polyakov
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Li RongQing
     

14 Mar, 2019

5 commits

  • Greg Kroah-Hartman
     
  • commit 400816f60c543153656ac74eaf7f36f6b7202378 upstream

    Skylake (and later) will receive a microcode update to address a TSX
    erratum. This microcode will, on execution of a TSX instruction
    (speculative or not) use (clobber) PMC3. This update will also provide
    a new MSR to change this behaviour along with a CPUID bit to enumerate
    the presence of this new MSR.

    When the MSR gets set, the microcode will no longer use PMC3 but will
    Force Abort every TSX transaction (upon executing COMMIT).

    When TSX Force Abort (TFA) is allowed (default), the MSR gets set when
    PMC3 gets scheduled and cleared when, after scheduling, PMC3 is
    unused.

    When TFA is not allowed, clear PMC3 from all constraints such that it
    will not get used.

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Peter Zijlstra (Intel)
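The two policies described in the commit above can be modelled as pure functions over counter bitmasks. This is a hedged sketch only: `MODEL_PMC3`, `tfa_msr_wanted()`, and `tfa_filter_constraint()` are hypothetical names, and the model decides whether the MSR should be set without touching any real MSR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 3 of a counter bitmask represents PMC3 (hypothetical model constant). */
#define MODEL_PMC3 (UINT32_C(1) << 3)

/*
 * TFA allowed: the force-abort MSR should be set exactly when PMC3 is
 * in use after scheduling, and cleared when PMC3 is unused.
 */
bool tfa_msr_wanted(uint32_t scheduled_counters)
{
    return (scheduled_counters & MODEL_PMC3) != 0;
}

/*
 * TFA not allowed: strip PMC3 from an event's constraint mask so the
 * scheduler never places an event on it.
 */
uint32_t tfa_filter_constraint(uint32_t idxmsk, bool allow_tfa)
{
    /* A constraint is expected to allow at least one counter. */
    assert(idxmsk != 0);

    return allow_tfa ? idxmsk : (idxmsk & ~MODEL_PMC3);
}
```

In this model the two settings trade off differently: allowing TFA keeps all four counters available at the cost of aborting TSX transactions while PMC3 is busy, whereas disallowing it keeps TSX working and leaves one counter fewer to schedule.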
     
  • commit 52f64909409c17adf54fcf5f9751e0544ca3a6b4 upstream

    Skylake systems will receive a microcode update to address a TSX
    erratum. This microcode will (by default) clobber PMC3 when TSX
    instructions are (speculatively or not) executed.

    It also provides an MSR to cause all TSX transactions to abort and
    preserve PMC3.

    Add the CPUID enumeration and MSR definition.

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Peter Zijlstra (Intel)
     
  • commit 11f8b2d65ca9029591c8df26bb6bd063c312b7fe upstream

    Such that we can re-use it.

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Peter Zijlstra (Intel)
     
  • commit d01b1f96a82e5dd7841a1d39db3abfdaf95f70ab upstream

    The cpuc data structure allocation is different between fake and real
    cpuc's; use the same code to init/free both.

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Peter Zijlstra (Intel)