12 Jun, 2020

1 commit

  • Mark switch cases where we are expecting to fall through.

    Fix the following warning through the use of the new
    pseudo-keyword fallthrough;

    arch/nios2/kernel/signal.c:254:12: warning: this statement may fall through [-Wimplicit-fallthrough=]
    254 | restart = -2;
    | ~~~~~~~~^~~~
    arch/nios2/kernel/signal.c:255:3: note: here
    255 | case ERESTARTNOHAND:
    | ^~~~

    Reported-by: Christian Brauner
    Signed-off-by: Gustavo A. R. Silva
    Signed-off-by: Ley Foon Tan

    Ley Foon Tan
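
A minimal userspace sketch of the pattern this commit applies, not the nios2 code itself. The `fallthrough` macro is a stand-in modelled on the kernel's pseudo-keyword and assumes a GCC or Clang that supports the `__fallthrough__` statement attribute; the enum and function names are hypothetical.

```c
/* Stand-in for the kernel's pseudo-keyword (assumption: GCC >= 7 or a
 * recent Clang; otherwise it degrades to an empty statement). */
#ifdef __has_attribute
# if __has_attribute(__fallthrough__)
#  define fallthrough __attribute__((__fallthrough__))
# endif
#endif
#ifndef fallthrough
# define fallthrough do {} while (0) /* fallthrough */
#endif

/* Hypothetical codes, used only for this illustration. */
enum demo_code { DEMO_FIRST, DEMO_SECOND, DEMO_NONE };

int demo_steps(enum demo_code code)
{
	int steps = 0;

	switch (code) {
	case DEMO_FIRST:
		steps += 2;
		fallthrough;	/* deliberate: FIRST also runs SECOND's work */
	case DEMO_SECOND:
		steps += 1;
		break;
	case DEMO_NONE:
		break;
	}
	return steps;
}
```

Built with `-Wimplicit-fallthrough`, the annotated case should no longer trigger the warning, while an unannotated fall through still would.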
     

01 Jun, 2020

5 commits

  • Linus Torvalds
     
  • Yes, staying within 80 columns is certainly still _preferred_. But
    it's not the hard limit that the checkpatch warnings imply, and other
    concerns can most certainly dominate.

    Increase the default limit to 100 characters. Not because 100
    characters is some hard limit either, but that's certainly a "what are
    you doing" kind of value and less likely to be about the occasional
    slightly longer lines.

    Miscellanea:

    - to avoid unnecessary whitespace changes in files, checkpatch will no
    longer emit a warning about line length when scanning files unless
    --strict is also used

    - Add a bit to coding-style about alignment to open parenthesis

    Signed-off-by: Joe Perches
    Signed-off-by: Linus Torvalds

    Joe Perches
     
  • Pull x86 fixes from Thomas Gleixner:
    "A pile of x86 fixes:

    - Prevent a memory leak in ioperm which was caused by the stupid
    assumption that the exit cleanup is always called for current,
    which is not the case when fork fails after taking a reference on
    the ioperm bitmap.

    - Fix an arithmetic overflow in the DMA code on 32bit systems

    - Fill gaps in the xstate copy with defaults instead of leaving them
    uninitialized

    - Revert: "Make __X32_SYSCALL_BIT be unsigned long" as it turned out
    that existing user space fails to build"

    * tag 'x86-urgent-2020-05-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/ioperm: Prevent a memory leak when fork fails
    x86/dma: Fix max PFN arithmetic overflow on 32 bit systems
    copy_xstate_to_kernel(): don't leave parts of destination uninitialized
    x86/syscalls: Revert "x86/syscalls: Make __X32_SYSCALL_BIT be unsigned long"

    Linus Torvalds
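
The pull message only says that a max PFN computation overflowed on 32-bit systems. The sketch below shows one way such an overflow can arise and be avoided. The exact expressions, the 4 KiB page size, and all names are assumptions for illustration; `uint32_t` stands in for a 32-bit `unsigned long`.

```c
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Overflowing form: the intermediate 4 * 2^30 equals 2^32, which wraps
 * to 0 in 32-bit unsigned arithmetic before the shift is applied. */
uint32_t demo_max_pfn_overflowing(void)
{
	uint32_t four_gib = (uint32_t)4 * 1024 * 1024 * 1024;

	return four_gib >> DEMO_PAGE_SHIFT;
}

/* Safe form: the same mathematical value, 2^(32 - 12), computed without
 * any intermediate that needs more than 32 bits. */
uint32_t demo_max_pfn_safe(void)
{
	return (uint32_t)1 << (32 - DEMO_PAGE_SHIFT);
}
```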
     
  • Pull scheduler fix from Thomas Gleixner:
    "A single scheduler fix preventing a crash in NUMA balancing.

    The current->mm check is not reliable as the mm might be temporary due
    to use_mm() in a kthread. Check for PF_KTHREAD explicitly"

    * tag 'sched-urgent-2020-05-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    sched/fair: Don't NUMA balance for kthreads

    Linus Torvalds
     
  • Pull networking fixes from David Miller:
    "Another week, another set of bug fixes:

    1) Fix pskb_pull length in __xfrm_transport_prep(), from Xin Long.

    2) Fix double xfrm_state put in esp{4,6}_gro_receive(), also from Xin
    Long.

    3) Re-arm discovery timer properly in mac80211 mesh code, from Linus
    Lüssing.

    4) Prevent buffer overflows in nf_conntrack_pptp debug code, from
    Pablo Neira Ayuso.

    5) Fix race in ktls code between tls_sw_recvmsg() and
    tls_decrypt_done(), from Vinay Kumar Yadav.

    6) Fix crashes on TCP fallback in MPTCP code, from Paolo Abeni.

    7) More validation is necessary of untrusted GSO packets coming from
    virtualization devices, from Willem de Bruijn.

    8) Fix endianness of bnxt_en firmware message length accesses, from
    Edwin Peer.

    9) Fix infinite loop in sch_fq_pie, from Davide Caratti.

    10) Fix lockdep splat in DSA by setting lockless TX in netdev features
    for slave ports, from Vladimir Oltean.

    11) Fix suspend/resume crashes in mlx5, from Mark Bloch.

    12) Fix use after free in bpf fmod_ret, from Alexei Starovoitov.

    13) ARP retransmit timer guard uses wrong offset, from Hangbin Liu.

    14) Fix leak in inetdev_init(), from Yang Yingliang.

    15) Don't try to use inet hash and unhash in l2tp code, results in
    crashes. From Eric Dumazet"

    * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (77 commits)
    l2tp: add sk_family checks to l2tp_validate_socket
    l2tp: do not use inet_hash()/inet_unhash()
    net: qrtr: Allocate workqueue before kernel_bind
    mptcp: remove msk from the token container at destruction time.
    mptcp: fix race between MP_JOIN and close
    mptcp: fix unblocking connect()
    net/sched: act_ct: add nat mangle action only for NAT-conntrack
    devinet: fix memleak in inetdev_init()
    virtio_vsock: Fix race condition in virtio_transport_recv_pkt
    drivers/net/ibmvnic: Update VNIC protocol version reporting
    NFC: st21nfca: add missed kfree_skb() in an error path
    neigh: fix ARP retransmit timer guard
    bpf, selftests: Add a verifier test for assigning 32bit reg states to 64bit ones
    bpf, selftests: Verifier bounds tests need to be updated
    bpf: Fix a verifier issue when assigning 32bit reg states to 64bit ones
    bpf: Fix use-after-free in fmod_ret check
    net/mlx5e: replace EINVAL in mlx5e_flower_parse_meta()
    net/mlx5e: Fix MLX5_TC_CT dependencies
    net/mlx5e: Properly set default values when disabling adaptive moderation
    net/mlx5e: Fix arch depending casting issue in FEC
    ...

    Linus Torvalds
     

31 May, 2020

12 commits

  • syzbot was able to trigger a crash after using an ISDN socket
    and fool l2tp.

    Fix this by making sure the UDP socket is of the proper family.

    BUG: KASAN: slab-out-of-bounds in setup_udp_tunnel_sock+0x465/0x540 net/ipv4/udp_tunnel.c:78
    Write of size 1 at addr ffff88808ed0c590 by task syz-executor.5/3018

    CPU: 0 PID: 3018 Comm: syz-executor.5 Not tainted 5.7.0-rc6-syzkaller #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:
    __dump_stack lib/dump_stack.c:77 [inline]
    dump_stack+0x188/0x20d lib/dump_stack.c:118
    print_address_description.constprop.0.cold+0xd3/0x413 mm/kasan/report.c:382
    __kasan_report.cold+0x20/0x38 mm/kasan/report.c:511
    kasan_report+0x33/0x50 mm/kasan/common.c:625
    setup_udp_tunnel_sock+0x465/0x540 net/ipv4/udp_tunnel.c:78
    l2tp_tunnel_register+0xb15/0xdd0 net/l2tp/l2tp_core.c:1523
    l2tp_nl_cmd_tunnel_create+0x4b2/0xa60 net/l2tp/l2tp_netlink.c:249
    genl_family_rcv_msg_doit net/netlink/genetlink.c:673 [inline]
    genl_family_rcv_msg net/netlink/genetlink.c:718 [inline]
    genl_rcv_msg+0x627/0xdf0 net/netlink/genetlink.c:735
    netlink_rcv_skb+0x15a/0x410 net/netlink/af_netlink.c:2469
    genl_rcv+0x24/0x40 net/netlink/genetlink.c:746
    netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline]
    netlink_unicast+0x537/0x740 net/netlink/af_netlink.c:1329
    netlink_sendmsg+0x882/0xe10 net/netlink/af_netlink.c:1918
    sock_sendmsg_nosec net/socket.c:652 [inline]
    sock_sendmsg+0xcf/0x120 net/socket.c:672
    ____sys_sendmsg+0x6e6/0x810 net/socket.c:2352
    ___sys_sendmsg+0x100/0x170 net/socket.c:2406
    __sys_sendmsg+0xe5/0x1b0 net/socket.c:2439
    do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295
    entry_SYSCALL_64_after_hwframe+0x49/0xb3
    RIP: 0033:0x45ca29
    Code: 0d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 3d 01 f0 ff ff 0f 83 db b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00
    RSP: 002b:00007effe76edc78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
    RAX: ffffffffffffffda RBX: 00000000004fe1c0 RCX: 000000000045ca29
    RDX: 0000000000000000 RSI: 0000000020000240 RDI: 0000000000000005
    RBP: 000000000078bf00 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
    R13: 000000000000094e R14: 00000000004d5d00 R15: 00007effe76ee6d4

    Allocated by task 3018:
    save_stack+0x1b/0x40 mm/kasan/common.c:49
    set_track mm/kasan/common.c:57 [inline]
    __kasan_kmalloc mm/kasan/common.c:495 [inline]
    __kasan_kmalloc.constprop.0+0xbf/0xd0 mm/kasan/common.c:468
    __do_kmalloc mm/slab.c:3656 [inline]
    __kmalloc+0x161/0x7a0 mm/slab.c:3665
    kmalloc include/linux/slab.h:560 [inline]
    sk_prot_alloc+0x223/0x2f0 net/core/sock.c:1612
    sk_alloc+0x36/0x1100 net/core/sock.c:1666
    data_sock_create drivers/isdn/mISDN/socket.c:600 [inline]
    mISDN_sock_create+0x272/0x400 drivers/isdn/mISDN/socket.c:796
    __sock_create+0x3cb/0x730 net/socket.c:1428
    sock_create net/socket.c:1479 [inline]
    __sys_socket+0xef/0x200 net/socket.c:1521
    __do_sys_socket net/socket.c:1530 [inline]
    __se_sys_socket net/socket.c:1528 [inline]
    __x64_sys_socket+0x6f/0xb0 net/socket.c:1528
    do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295
    entry_SYSCALL_64_after_hwframe+0x49/0xb3

    Freed by task 2484:
    save_stack+0x1b/0x40 mm/kasan/common.c:49
    set_track mm/kasan/common.c:57 [inline]
    kasan_set_free_info mm/kasan/common.c:317 [inline]
    __kasan_slab_free+0xf7/0x140 mm/kasan/common.c:456
    __cache_free mm/slab.c:3426 [inline]
    kfree+0x109/0x2b0 mm/slab.c:3757
    kvfree+0x42/0x50 mm/util.c:603
    __free_fdtable+0x2d/0x70 fs/file.c:31
    put_files_struct fs/file.c:420 [inline]
    put_files_struct+0x248/0x2e0 fs/file.c:413
    exit_files+0x7e/0xa0 fs/file.c:445
    do_exit+0xb04/0x2dd0 kernel/exit.c:791
    do_group_exit+0x125/0x340 kernel/exit.c:894
    get_signal+0x47b/0x24e0 kernel/signal.c:2739
    do_signal+0x81/0x2240 arch/x86/kernel/signal.c:784
    exit_to_usermode_loop+0x26c/0x360 arch/x86/entry/common.c:161
    prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
    syscall_return_slowpath arch/x86/entry/common.c:279 [inline]
    do_syscall_64+0x6b1/0x7d0 arch/x86/entry/common.c:305
    entry_SYSCALL_64_after_hwframe+0x49/0xb3

    The buggy address belongs to the object at ffff88808ed0c000
    which belongs to the cache kmalloc-2k of size 2048
    The buggy address is located 1424 bytes inside of
    2048-byte region [ffff88808ed0c000, ffff88808ed0c800)
    The buggy address belongs to the page:
    page:ffffea00023b4300 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0
    flags: 0xfffe0000000200(slab)
    raw: 00fffe0000000200 ffffea0002838208 ffffea00015ba288 ffff8880aa000e00
    raw: 0000000000000000 ffff88808ed0c000 0000000100000001 0000000000000000
    page dumped because: kasan: bad access detected

    Memory state around the buggy address:
    ffff88808ed0c480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    ffff88808ed0c500: 00 00 00 fc fc fc fc fc fc fc fc fc fc fc fc fc
    >ffff88808ed0c580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    ^
    ffff88808ed0c600: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    ffff88808ed0c680: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc

    Fixes: 6b9f34239b00 ("l2tp: fix races in tunnel creation")
    Fixes: fd558d186df2 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts")
    Signed-off-by: Eric Dumazet
    Cc: James Chapman
    Cc: Guillaume Nault
    Reported-by: syzbot
    Acked-by: Guillaume Nault
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • syzbot recently found a way to crash the kernel [1]

    Issue here is that inet_hash() & inet_unhash() are currently
    only meant to be used by TCP & DCCP, since only these protocols
    provide the needed hashinfo pointer.

    L2TP uses a single list (instead of a hash table)

    This old bug became an issue after commit 610236587600
    ("bpf: Add new cgroup attach type to enable sock modifications")
    since after this commit, sk_common_release() can be called
    while the L2TP socket is still considered 'hashed'.

    general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN
    KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
    CPU: 0 PID: 7063 Comm: syz-executor654 Not tainted 5.7.0-rc6-syzkaller #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    RIP: 0010:inet_unhash+0x11f/0x770 net/ipv4/inet_hashtables.c:600
    Code: 03 0f b6 04 02 84 c0 74 08 3c 03 0f 8e dd 04 00 00 48 8d 7d 08 44 8b 73 08 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 3c 02 00 0f 85 55 05 00 00 48 8d 7d 14 4c 8b 6d 08 48 b8 00 00
    RSP: 0018:ffffc90001777d30 EFLAGS: 00010202
    RAX: dffffc0000000000 RBX: ffff88809a6df940 RCX: ffffffff8697c242
    RDX: 0000000000000001 RSI: ffffffff8697c251 RDI: 0000000000000008
    RBP: 0000000000000000 R08: ffff88809f3ae1c0 R09: fffffbfff1514cc1
    R10: ffffffff8a8a6607 R11: fffffbfff1514cc0 R12: ffff88809a6df9b0
    R13: 0000000000000007 R14: 0000000000000000 R15: ffffffff873a4d00
    FS: 0000000001d2b880(0000) GS:ffff8880ae600000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00000000006cd090 CR3: 000000009403a000 CR4: 00000000001406f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
    sk_common_release+0xba/0x370 net/core/sock.c:3210
    inet_create net/ipv4/af_inet.c:390 [inline]
    inet_create+0x966/0xe00 net/ipv4/af_inet.c:248
    __sock_create+0x3cb/0x730 net/socket.c:1428
    sock_create net/socket.c:1479 [inline]
    __sys_socket+0xef/0x200 net/socket.c:1521
    __do_sys_socket net/socket.c:1530 [inline]
    __se_sys_socket net/socket.c:1528 [inline]
    __x64_sys_socket+0x6f/0xb0 net/socket.c:1528
    do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295
    entry_SYSCALL_64_after_hwframe+0x49/0xb3
    RIP: 0033:0x441e29
    Code: e8 fc b3 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 3d 01 f0 ff ff 0f 83 eb 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00
    RSP: 002b:00007ffdce184148 EFLAGS: 00000246 ORIG_RAX: 0000000000000029
    RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000441e29
    RDX: 0000000000000073 RSI: 0000000000000002 RDI: 0000000000000002
    RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
    R13: 0000000000402c30 R14: 0000000000000000 R15: 0000000000000000
    Modules linked in:
    ---[ end trace 23b6578228ce553e ]---
    RIP: 0010:inet_unhash+0x11f/0x770 net/ipv4/inet_hashtables.c:600
    Code: 03 0f b6 04 02 84 c0 74 08 3c 03 0f 8e dd 04 00 00 48 8d 7d 08 44 8b 73 08 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 3c 02 00 0f 85 55 05 00 00 48 8d 7d 14 4c 8b 6d 08 48 b8 00 00
    RSP: 0018:ffffc90001777d30 EFLAGS: 00010202
    RAX: dffffc0000000000 RBX: ffff88809a6df940 RCX: ffffffff8697c242
    RDX: 0000000000000001 RSI: ffffffff8697c251 RDI: 0000000000000008
    RBP: 0000000000000000 R08: ffff88809f3ae1c0 R09: fffffbfff1514cc1
    R10: ffffffff8a8a6607 R11: fffffbfff1514cc0 R12: ffff88809a6df9b0
    R13: 0000000000000007 R14: 0000000000000000 R15: ffffffff873a4d00
    FS: 0000000001d2b880(0000) GS:ffff8880ae600000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00000000006cd090 CR3: 000000009403a000 CR4: 00000000001406f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

    Fixes: 0d76751fad77 ("l2tp: Add L2TPv3 IP encapsulation (no UDP) support")
    Signed-off-by: Eric Dumazet
    Cc: James Chapman
    Cc: Andrii Nakryiko
    Reported-by: syzbot+3610d489778b57cc8031@syzkaller.appspotmail.com

    Eric Dumazet
     
  • A null pointer dereference in qrtr_ns_data_ready() is seen if a client
    opens a qrtr socket before qrtr_ns_init() can bind to the control port.
    When the control port is bound, the ENETRESET error will be broadcasted
    and clients will close their sockets. This results in DEL_CLIENT
    packets being sent to the ns and qrtr_ns_data_ready() being called
    without the workqueue being allocated.

    Allocate the workqueue before setting sk_data_ready and binding to the
    control port. This ensures that the work and workqueue structs are
    allocated and initialized before qrtr_ns_data_ready can be called.

    Fixes: 0c2204a4ad71 ("net: qrtr: Migrate nameservice to kernel from userspace")
    Signed-off-by: Chris Lew
    Reviewed-by: Bjorn Andersson
    Reviewed-by: Manivannan Sadhasivam
    Signed-off-by: David S. Miller

    Chris Lew
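
The ordering rule in this fix can be sketched in plain C: allocate everything a callback dereferences before publishing the callback. This is a simplified model with hypothetical types and names, not the qrtr code.

```c
#include <stddef.h>
#include <stdlib.h>

struct demo_queue {
	int pending;			/* number of queued work items */
};

struct demo_ns {
	struct demo_queue *wq;		/* must exist before data_ready can run */
	void (*data_ready)(struct demo_ns *ns);
};

/* Callback: would dereference NULL if it ran before wq was allocated. */
static void demo_data_ready(struct demo_ns *ns)
{
	ns->wq->pending++;
}

int demo_ns_init(struct demo_ns *ns)
{
	ns->data_ready = NULL;

	/* Step 1: allocate the state the callback depends on. */
	ns->wq = calloc(1, sizeof(*ns->wq));
	if (!ns->wq)
		return -1;

	/* Step 2: only now make the callback reachable. */
	ns->data_ready = demo_data_ready;
	return 0;
}

void demo_ns_exit(struct demo_ns *ns)
{
	/* Tear down in reverse order: unpublish first, then free. */
	ns->data_ready = NULL;
	free(ns->wq);
	ns->wq = NULL;
}
```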
     
  • Paolo Abeni says:

    ====================
    mptcp: a bunch of fixes

    This patch series pulls together a few fixes for MPTCP bugs observed while
    stress-testing with apache bench, forced to use MPTCP and multiple
    subflows.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Currently we remove the msk from the token container only
    via mptcp_close(). The MPTCP master socket can be destroyed
    also via other paths (e.g. if not yet accepted, when shutting
    down the listener socket). When we hit the latter scenario,
    dangling msk references are left in the token container,
    leading to memory corruption and/or UaF.

    This change addresses the issue by moving the token removal
    into the msk destructor.

    Fixes: 79c0949e9a09 ("mptcp: Add key generation and token tree")
    Signed-off-by: Paolo Abeni
    Reviewed-by: Mat Martineau
    Signed-off-by: David S. Miller

    Paolo Abeni
     
  • If a MP_JOIN subflow completes the 3whs while another
    CPU is closing the master msk, we can hit the
    following race:

    CPU1                                    CPU2

    close()
     mptcp_close
                                            subflow_syn_recv_sock
                                             mptcp_token_get_sock
                                             mptcp_finish_join
                                              inet_sk_state_load
      mptcp_token_destroy
      inet_sk_state_store(TCP_CLOSE)
      __mptcp_flush_join_list()
                                              mptcp_sock_graft
                                              list_add_tail
      sk_common_release
       sock_orphan()

    The MP_JOIN socket will be leaked. Additionally we can hit
    UaF for the msk 'struct socket' referenced via the 'conn'
    field.

    This change tries to address the issue by introducing some
    synchronization between the MP_JOIN 3whs and mptcp_close
    via the join_list spinlock. If we detect that the msk is closing,
    the MP_JOIN socket is closed, too.

    Fixes: f296234c98a8 ("mptcp: Add handling of incoming MP_JOIN requests")
    Signed-off-by: Paolo Abeni
    Reviewed-by: Mat Martineau
    Signed-off-by: David S. Miller

    Paolo Abeni
     
  • Currently non-blocking connect() on MPTCP sockets fails frequently.
    If mptcp_stream_connect() is invoked to complete a previously
    attempted non-blocking connection, it will still try to create
    the first subflow via __mptcp_socket_create(). If the 3whs is
    completed and the 'can_ack' flag is already set, the latter
    will fail with -EINVAL.

    This change addresses the issue by checking for a pending connect and
    delegating the completion to the first subflow. Additionally,
    update the msk addresses and sk_state only when needed.

    Fixes: 2303f994b3e1 ("mptcp: Associate MPTCP context with TCP socket")
    Signed-off-by: Paolo Abeni
    Reviewed-by: Mat Martineau
    Signed-off-by: David S. Miller

    Paolo Abeni
     
  • Currently the nat mangle action is added by comparing the inverted and
    original tuples. It is better to check the IPS_NAT_MASK flags first to
    avoid an unnecessary memcmp for non-NAT conntrack.

    Signed-off-by: wenxu
    Acked-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    wenxu
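
The optimisation amounts to testing a cheap flag before doing a byte-wise comparison. A minimal sketch follows; the flag values, struct layout and names are hypothetical and are not the conntrack definitions.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical status bits, for illustration only. */
#define DEMO_SRC_NAT	0x1u
#define DEMO_DST_NAT	0x2u
#define DEMO_NAT_MASK	(DEMO_SRC_NAT | DEMO_DST_NAT)

struct demo_tuple {
	unsigned char bytes[40];
};

struct demo_conn {
	unsigned int status;
	struct demo_tuple orig;			/* original direction */
	struct demo_tuple reply_inverted;	/* inverted reply direction */
};

bool demo_needs_nat_mangle(const struct demo_conn *ct)
{
	/* Cheap test first: a connection without any NAT bit set never
	 * needs mangling, so the memcmp below is skipped entirely. */
	if (!(ct->status & DEMO_NAT_MASK))
		return false;

	return memcmp(&ct->orig, &ct->reply_inverted,
		      sizeof(ct->orig)) != 0;
}
```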
     
  • When devinet_sysctl_register() fails, the memory allocated
    in neigh_parms_alloc() should be freed.

    Fixes: 20e61da7ffcf ("ipv4: fail early when creating netdev named all or default")
    Signed-off-by: Yang Yingliang
    Acked-by: Cong Wang
    Signed-off-by: David S. Miller

    Yang Yingliang
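
This kind of leak is usually closed with a goto-based unwind that releases, in reverse order, whatever was acquired before the failing step. A self-contained sketch with hypothetical names follows; the `register_ok` parameter merely simulates whether the later registration step succeeds.

```c
#include <stdlib.h>

static int demo_live_parms;	/* outstanding parms allocations, for the demo */

struct demo_parms {
	int unused;
};

struct demo_dev {
	struct demo_parms *parms;
};

static struct demo_parms *demo_parms_alloc(void)
{
	struct demo_parms *p = calloc(1, sizeof(*p));

	if (p)
		demo_live_parms++;
	return p;
}

static void demo_parms_release(struct demo_parms *p)
{
	free(p);
	demo_live_parms--;
}

struct demo_dev *demo_dev_init(int register_ok)
{
	struct demo_dev *dev = calloc(1, sizeof(*dev));

	if (!dev)
		return NULL;

	dev->parms = demo_parms_alloc();
	if (!dev->parms)
		goto out_free_dev;

	if (!register_ok)		/* simulated registration failure */
		goto out_release_parms;

	return dev;

out_release_parms:
	demo_parms_release(dev->parms);	/* the step a leaky version omits */
out_free_dev:
	free(dev);
	return NULL;
}

void demo_dev_destroy(struct demo_dev *dev)
{
	demo_parms_release(dev->parms);
	free(dev);
}

int demo_outstanding_parms(void)
{
	return demo_live_parms;
}
```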
     
  • When client on the host tries to connect(SOCK_STREAM, O_NONBLOCK) to the
    server on the guest, there will be a panic on a ThunderX2 (armv8a server):

    [ 463.718844] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
    [ 463.718848] Mem abort info:
    [ 463.718849] ESR = 0x96000044
    [ 463.718852] EC = 0x25: DABT (current EL), IL = 32 bits
    [ 463.718853] SET = 0, FnV = 0
    [ 463.718854] EA = 0, S1PTW = 0
    [ 463.718855] Data abort info:
    [ 463.718856] ISV = 0, ISS = 0x00000044
    [ 463.718857] CM = 0, WnR = 1
    [ 463.718859] user pgtable: 4k pages, 48-bit VAs, pgdp=0000008f6f6e9000
    [ 463.718861] [0000000000000000] pgd=0000000000000000
    [ 463.718866] Internal error: Oops: 96000044 [#1] SMP
    [...]
    [ 463.718977] CPU: 213 PID: 5040 Comm: vhost-5032 Tainted: G O 5.7.0-rc7+ #139
    [ 463.718980] Hardware name: GIGABYTE R281-T91-00/MT91-FS1-00, BIOS F06 09/25/2018
    [ 463.718982] pstate: 60400009 (nZCv daif +PAN -UAO)
    [ 463.718995] pc : virtio_transport_recv_pkt+0x4c8/0xd40 [vmw_vsock_virtio_transport_common]
    [ 463.718999] lr : virtio_transport_recv_pkt+0x1fc/0xd40 [vmw_vsock_virtio_transport_common]
    [ 463.719000] sp : ffff80002dbe3c40
    [...]
    [ 463.719025] Call trace:
    [ 463.719030] virtio_transport_recv_pkt+0x4c8/0xd40 [vmw_vsock_virtio_transport_common]
    [ 463.719034] vhost_vsock_handle_tx_kick+0x360/0x408 [vhost_vsock]
    [ 463.719041] vhost_worker+0x100/0x1a0 [vhost]
    [ 463.719048] kthread+0x128/0x130
    [ 463.719052] ret_from_fork+0x10/0x18

    The race condition is as follows:
    Task1                                Task2
    =====                                =====
    __sock_release                       virtio_transport_recv_pkt
      __vsock_release                      vsock_find_bound_socket (found sk)
        lock_sock_nested
        vsock_remove_sock
        sock_orphan
          sk_set_socket(sk, NULL)
        sk->sk_shutdown = SHUTDOWN_MASK
        ...
        release_sock
                                           lock_sock
                                           virtio_transport_recv_connecting
                                             sk->sk_socket->state (panic!)

    The root cause is that vsock_find_bound_socket can't hold the lock_sock,
    so there is a small race window between vsock_find_bound_socket() and
    lock_sock(). If __vsock_release() is running in another task,
    sk->sk_socket will be set to NULL inadvertently.

    This fixes it by checking sk->sk_shutdown (suggested by Stefano) after
    lock_sock since sk->sk_shutdown is set to SHUTDOWN_MASK under the
    protection of lock_sock_nested.

    Signed-off-by: Jia He
    Reviewed-by: Stefano Garzarella
    Signed-off-by: David S. Miller

    Jia He
     
  • Pull powerpc fixes from Michael Ellerman:

    - a fix for the recent change to how we restore non-volatile GPRs,
    which broke our emulation of reading from the DSCR (Data Stream
    Control Register).

    - a fix for the recent rewrite of interrupt/syscall exit in C, we need
    to exclude KCOV from that code, otherwise it can lead to
    unrecoverable faults.

    Thanks to Daniel Axtens.

    * tag 'powerpc-5.7-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
    powerpc/64s: Disable sanitisers for C syscall/interrupt entry/exit code
    powerpc/64s: Fix restore of NV GPRs after facility unavailable exception

    Linus Torvalds
     
  • Pull GPIO fixes from Linus Walleij:
    "Here are some (very) late fixes for GPIO, none of them very serious
    except the one tagged for stable for enabling IRQ on open drain lines:

    - Fix probing of mvebu chips without PWM

    - Fix error path on ida_get_simple() on the exar driver

    - Notify userspace properly about line status changes when flags are
    changed on lines.

    - Fix a sleeping while holding spinlock in the mellanox driver.

    - Fix return value of the PXA and Kona probe calls.

    - Fix IRQ locking of open drain lines, it is fine to have IRQs on
    open drain lines flagged for output"

    * tag 'gpio-v5.7-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
    gpio: fix locking open drain IRQ lines
    gpio: bcm-kona: Fix return value of bcm_kona_gpio_probe()
    gpio: pxa: Fix return value of pxa_gpio_probe()
    gpio: mlxbf2: Fix sleeping while holding spinlock
    gpiolib: notify user-space about line status changes after flags are set
    gpio: exar: Fix bad handling for ida_simple_get error path
    gpio: mvebu: Fix probing for chips without PWM

    Linus Torvalds
     

30 May, 2020

22 commits

  • VNIC protocol version is reported in big-endian format, but it
    is not byteswapped before logging. Fix that, and remove version
    comparison as only one protocol version exists at this time.

    Signed-off-by: Thomas Falcon
    Signed-off-by: David S. Miller

    Thomas Falcon
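
A small sketch of decoding a big-endian field before logging it. The 16-bit width and the function name are assumptions for illustration; `ntohs()` is the standard POSIX network-to-host conversion.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Decode a 16-bit big-endian field from a raw message buffer. */
uint16_t demo_read_be16(const unsigned char *wire)
{
	uint16_t raw;

	memcpy(&raw, wire, sizeof(raw));	/* bytes still in wire order */
	return ntohs(raw);			/* host order, safe to print */
}
```

Printing `raw` directly would show a byte-swapped value on a little-endian host, which is the symptom the commit describes.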
     
  • st21nfca_tm_send_atr_res() fails to call kfree_skb() in an error path.
    Add the missing function call to fix it.

    Fixes: 1892bf844ea0 ("NFC: st21nfca: Adding P2P support to st21nfca in Initiator & Target mode")
    Signed-off-by: Chuhong Yuan
    Signed-off-by: David S. Miller

    Chuhong Yuan
     
  • In commit 19e16d220f0a ("neigh: support smaller retrans_time settting")
    we add more accurate control for ARP and NS. But for ARP I forgot to
    update the latest guard in neigh_timer_handler(), then the next
    retransmit would be reset to jiffies + HZ/2 if we set the retrans_time
    less than 500ms. Fix it by setting the time_before() check to HZ/100.

    IPv6 does not have this issue.

    Reported-by: Jianwen Ji
    Fixes: 19e16d220f0a ("neigh: support smaller retrans_time settting")
    Signed-off-by: Hangbin Liu
    Signed-off-by: David S. Miller

    Hangbin Liu
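
The effect of the guard can be sketched in userspace. `demo_time_before()` imitates the wraparound-safe comparison style of the kernel's time_before(); the tick rate of 1000 per second and all names are assumptions for illustration.

```c
#include <stdbool.h>

#define DEMO_HZ 1000UL	/* assumption: 1000 ticks per second */

/* Wraparound-safe "a is earlier than b" on a free-running tick counter. */
static bool demo_time_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/* Next retransmit time, never sooner than now + min_gap ticks. */
unsigned long demo_next_retransmit(unsigned long now,
				   unsigned long retrans_ticks,
				   unsigned long min_gap)
{
	unsigned long next = now + retrans_ticks;

	if (demo_time_before(next, now + min_gap))
		next = now + min_gap;
	return next;
}
```

With a 100 ms retransmit time, a guard of `DEMO_HZ / 2` pushes the next retransmit out to 500 ms, while a guard of `DEMO_HZ / 100` leaves the requested 100 ms untouched.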
     
  • Saeed Mahameed says:

    ====================
    mlx5 fixes 2020-05-28

    This series introduces some fixes to mlx5 driver.

    v1->v2:
    - Fix bad sha1, Jakub.
    - Added one more patch by Pablo.
    net/mlx5e: replace EINVAL in mlx5e_flower_parse_meta()

    Nothing major. The only patch worth mentioning is the suspend/resume crash
    fix, which adds the missing pci device handlers. The fix is very
    straightforward and, as Dexuan already expressed, the patch is important for
    Azure users to avoid a crash on VM hibernation. The patch is marked for
    -stable v4.6 below.

    Conflict note:
    ('net/mlx5e: Fix MLX5_TC_CT dependencies') has a trivial one line conflict
    with current net-next, which can be resolved by simply using the line from
    net-next.

    Please pull and let me know if there is any problem.

    For -stable v4.6
    ('net/mlx5: Fix crash upon suspend/resume')

    For -stable v5.6
    ('net/mlx5e: replace EINVAL in mlx5e_flower_parse_meta()')
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Pull ARM SoC fixes from Arnd Bergmann:
    "This time there is one fix for the error path in the mediatek cmdq
    driver (used by their video driver) and a couple of devicetree fixes,
    mostly for 32-bit ARM, and fairly harmless:

    - On OMAP2 there were a few regressions in the ethernet drivers, one
    of them leading to an external abort trap

    - One Raspberry Pi version had a misconfigured LED

    - Interrupts on Broadcom NSP were slightly misconfigured

    - One i.MX6q board had issues with graphics mode setting

    - On mmp3 there are some minor fixes that were submitted for v5.8
    with a cc:stable tag, so I ended up picking them up here as well

    - The Mediatek Video Codec needs to run at a higher frequency than
    configured originally"

    * tag 'armsoc-fixes-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
    ARM: dts: mmp3: Drop usb-nop-xceiv from HSIC phy
    ARM: dts: mmp3-dell-ariel: Fix the SPI devices
    ARM: dts: mmp3: Use the MMP3 compatible string for /clocks
    ARM: dts: bcm: HR2: Fix PPI interrupt types
    ARM: dts: bcm2835-rpi-zero-w: Fix led polarity
    ARM: dts/imx6q-bx50v3: Set display interface clock parents
    soc: mediatek: cmdq: return send msg error code
    arm64: dts: mt8173: fix vcodec-enc clock
    ARM: dts: Fix wrong mdio clock for dm814x
    ARM: dts: am437x: fix networking on boards with ksz9031 phy
    ARM: dts: am57xx: fix networking on boards with ksz9031 phy

    Linus Torvalds
     
  • Alexei Starovoitov says:

    ====================
    pull-request: bpf 2020-05-29

    The following pull-request contains BPF updates for your *net* tree.

    We've added 6 non-merge commits during the last 7 day(s) which contain
    a total of 4 files changed, 55 insertions(+), 34 deletions(-).

    The main changes are:

    1) minor verifier fix for fmod_ret progs, from Alexei.

    2) af_xdp overflow check, from Bjorn.

    3) minor verifier fix for 32bit assignment, from John.

    4) powerpc has non-overlapping addr space, from Petr.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Pull ceph fixes from Ilya Dryomov:
    "Cache tiering and cap handling fixups, both marked for stable"

    * tag 'ceph-for-5.7-rc8' of git://github.com/ceph/ceph-client:
    ceph: flush release queue when handling caps for unknown inode
    libceph: ignore pool overlay and cache logic on redirects

    Linus Torvalds
     
  • Pull gfs2 fix from Andreas Gruenbacher:
    "Fix the previous, flawed gfs2_find_jhead commit"

    * tag 'gfs2-v5.7-rc7.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
    gfs2: Even more gfs2_find_jhead fixes

    Linus Torvalds
     
  • Pull arm64 fix from Catalin Marinas:
    "Ensure __cpu_up() returns an error if cpu_online() is false after
    waiting for completion on cpu_running"

    * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
    arm64/kernel: Fix return value when cpu_online() fails in __cpu_up()

    Linus Torvalds
     
  • Pull parisc fix from Helge Deller:
    "Fix a kernel panic at boot time for some HP-PARISC machines"

    * 'parisc-5.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
    parisc: Fix kernel panic in mem_init()

    Linus Torvalds
     
  • Pull iommu fixes from Joerg Roedel:

    - Two build fixes for issues introduced during the merge window

    - A fix for a reference count leak in an error path of
    iommu_group_alloc()

    * tag 'iommu-fixes-v5.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
    iommu: Fix reference count leak in iommu_group_alloc.
    x86: Hide the archdata.iommu field behind generic IOMMU_API
    ia64: Hide the archdata.iommu field behind generic IOMMU_API

    Linus Torvalds
     
  • Pull block fixes from Jens Axboe:
    "Two small fixes:

    - Revert a block change that mixed up the return values for non-mq
    devices

    - NVMe poll race fix"

    * tag 'block-5.7-2020-05-29' of git://git.kernel.dk/linux-block:
    Revert "block: end bio with BLK_STS_AGAIN in case of non-mq devs and REQ_NOWAIT"
    nvme-pci: avoid race between nvme_reap_pending_cqes() and nvme_poll()

    Linus Torvalds
     
  • Pull rdma fixes from Jason Gunthorpe:
    "Nothing profound here, just a last set of long standing bug fixes:

    - Incorrect error unwind in qib and pvrdma

    - User triggerable NULL pointer crash in mlx5 with ODP prefetch

    - syzkaller RCU race in uverbs

    - Rare double free crash in ipoib"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
    IB/ipoib: Fix double free of skb in case of multicast traffic in CM mode
    RDMA/core: Fix double destruction of uobject
    RDMA/pvrdma: Fix missing pci disable in pvrdma_pci_probe()
    RDMA/mlx5: Fix NULL pointer dereference in destroy_prefetch_work
    IB/qib: Call kobject_put() when kobject_init_and_add() fails

    Linus Torvalds
     
  • Added a verifier test for assigning 32bit reg states to
    64bit where 32bit reg holds a constant value of 0.

    Without the previous kernel verifier.c fix, the test in
    this patch will fail.

    Signed-off-by: Yonghong Song
    Signed-off-by: John Fastabend
    Signed-off-by: Alexei Starovoitov
    Link: https://lore.kernel.org/bpf/159077335867.6014.2075350327073125374.stgit@john-Precision-5820-Tower

    John Fastabend
     
  • After the previous fix for zero extension, test_verifier tests #65 and #66
    now fail. Before the fix we can see the alu32 mov op at insn 10

    10: R0_w=map_value(id=0,off=0,ks=8,vs=8,imm=0)
    R1_w=invP(id=0,
    smin_value=4294967168,smax_value=4294967423,
    umin_value=4294967168,umax_value=4294967423,
    var_off=(0x0; 0x1ffffffff),
    s32_min_value=-2147483648,s32_max_value=2147483647,
    u32_min_value=0,u32_max_value=-1)
    R10=fp0 fp-8_w=mmmmmmmm
    10: (bc) w1 = w1
    11: R0_w=map_value(id=0,off=0,ks=8,vs=8,imm=0)
    R1_w=invP(id=0,
    smin_value=0,smax_value=2147483647,
    umin_value=0,umax_value=4294967295,
    var_off=(0x0; 0xffffffff),
    s32_min_value=-2147483648,s32_max_value=2147483647,
    u32_min_value=0,u32_max_value=-1)
    R10=fp0 fp-8_w=mmmmmmmm

    After the fix, because 's32_min_value < 0' at insn 10, the following
    step 11 now has 'smax_value=U32_MAX', where before we pulled the
    s32_max_value bound into smax_value, as seen above in step 11 with
    smax_value=2147483647.

    10: R0_w=map_value(id=0,off=0,ks=8,vs=8,imm=0)
    R1_w=inv(id=0,
    smin_value=4294967168,smax_value=4294967423,
    umin_value=4294967168,umax_value=4294967423,
    var_off=(0x0; 0x1ffffffff),
    s32_min_value=-2147483648, s32_max_value=2147483647,
    u32_min_value=0,u32_max_value=-1)
    R10=fp0 fp-8_w=mmmmmmmm
    10: (bc) w1 = w1
    11: R0_w=map_value(id=0,off=0,ks=8,vs=8,imm=0)
    R1_w=inv(id=0,
    smin_value=0,smax_value=4294967295,
    umin_value=0,umax_value=4294967295,
    var_off=(0x0; 0xffffffff),
    s32_min_value=-2147483648, s32_max_value=2147483647,
    u32_min_value=0, u32_max_value=-1)
    R10=fp0 fp-8_w=mmmmmmmm

    As a result, at the failing instruction in step 14 we previously had
    the following:

    14: R0_w=map_value(id=0,off=0,ks=8,vs=8,imm=0)
    R1_w=inv(id=0,
    smin_value=72057594021150720,smax_value=72057594029539328,
    umin_value=72057594021150720,umax_value=72057594029539328,
    var_off=(0xffffffff000000; 0xffffff),
    s32_min_value=-16777216,s32_max_value=-1,
    u32_min_value=-16777216,u32_max_value=-1)
    R10=fp0 fp-8_w=mmmmmmmm
    14: (0f) r0 += r1

    We now have:

    14: R0_w=map_value(id=0,off=0,ks=8,vs=8,imm=0)
    R1_w=inv(id=0,
    smin_value=0,smax_value=72057594037927935,
    umin_value=0,umax_value=72057594037927935,
    var_off=(0x0; 0xffffffffffffff),
    s32_min_value=-2147483648,s32_max_value=2147483647,
    u32_min_value=0,u32_max_value=-1)
    R10=fp0 fp-8_w=mmmmmmmm
    14: (0f) r0 += r1

    In the original step 14, 'smin_value=72057594021150720' trips the logic
    in the verifier function check_reg_sane_offset(),

    if (smin >= BPF_MAX_VAR_OFF || smin <[...]>= 8

    Then the shift clears the top bit and smin_value is set to 0. Note we still
    have the smax_value in the fixed code, so any reads will fail. An alternative
    would be to have check_reg_sane_offset() test both smin and smax values.

    To fix the test we can omit the 'r1 >>= 8' at line 13. This changes the
    err string but keeps the intention of the test as suggested by the title,
    "check after truncation of boundary-crossing range". If the verifier logic
    changes, a different value is likely to be thrown in the error, or the error
    will no longer be thrown, forcing this test to be examined. With this change
    we see the new state at step 13.

    13: R0_w=map_value(id=0,off=0,ks=8,vs=8,imm=0)
    R1_w=invP(id=0,
    smin_value=-4294967168,smax_value=127,
    umin_value=0,umax_value=18446744073709551615,
    s32_min_value=-2147483648,s32_max_value=2147483647,
    u32_min_value=0,u32_max_value=-1)
    R10=fp0 fp-8_w=mmmmmmmm

    This gives the expected out-of-bounds error, "value -4294967168 makes
    map_value pointer be out of bounds". However, for the unpriv case we see a
    different error now because of the mixed signed bounds pointer arithmetic.
    This seems OK, so I've only added the unpriv_errstr for this. Another option
    may have been to do addition on r1 instead of subtraction, but I favor the
    approach above slightly.

    Signed-off-by: John Fastabend
    Signed-off-by: Alexei Starovoitov
    Acked-by: Yonghong Song
    Link: https://lore.kernel.org/bpf/159077333942.6014.14004320043595756079.stgit@john-Precision-5820-Tower

    John Fastabend
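
The range check that the commit above refers to can be sketched in isolation. This is a minimal illustration, not the kernel's code: `smin_is_sane()` and `check_examples()` are hypothetical names, and `MAX_VAR_OFF` is assumed to match the kernel's BPF_MAX_VAR_OFF of that era (1 << 29). It only models the signed-minimum half of the test, using the smin values quoted in the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror the kernel's BPF_MAX_VAR_OFF (1 << 29). */
#define MAX_VAR_OFF (1LL << 29)

/*
 * Hypothetical, simplified stand-in for the smin part of
 * check_reg_sane_offset(): a signed minimum offset is rejected when it
 * lies outside the open interval (-MAX_VAR_OFF, MAX_VAR_OFF).
 */
static bool smin_is_sane(int64_t smin)
{
	return !(smin >= MAX_VAR_OFF || smin <= -MAX_VAR_OFF);
}

/* Self-check with the smin values quoted in the commit message. */
static void check_examples(void)
{
	assert(!smin_is_sane(72057594021150720LL)); /* step 14 before the fix */
	assert(smin_is_sane(0));                    /* step 14 after the fix */
	assert(!smin_is_sane(-4294967168LL));       /* step 13 after the test change */
}
```

Under these assumptions, an smin of 0 passes the check even though smax is still large, which is why the test's expected error moved to the step 13 value.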
     
  • With the latest trunk llvm (llvm 11), I hit a verifier issue for
    test_prog subtest test_verif_scale1.

    The following simplified example illustrates the issue:
    w9 = 0 /* R9_w=inv0 */
    r8 = *(u32 *)(r1 + 80) /* __sk_buff->data_end */
    r7 = *(u32 *)(r1 + 76) /* __sk_buff->data */
    ......
    w2 = w9 /* R2_w=inv0 */
    r6 = r7 /* R6_w=pkt(id=0,off=0,r=0,imm=0) */
    r6 += r2 /* R6_w=inv(id=0) */
    r3 = r6 /* R3_w=inv(id=0) */
    r3 += 14 /* R3_w=inv(id=0) */
    if r3 > r8 goto end
    r5 = *(u32 *)(r6 + 0) /* R6_w=inv(id=0) */
    smax_value is assigned to be U32_MAX.
    The 64bit reg->smin_value is 0 and the 64bit register
    itself remains constant based on reg->var_off.

    In adjust_ptr_min_max_vals(), the verifier checks for a known constant,
    for which smin_val must be equal to smax_val. Since they are not equal,
    the verifier decides r6 is an unknown scalar, which causes the later failure.

    llvm 10 does not have this issue as it generates different code:
    w9 = 0 /* R9_w=inv0 */
    r8 = *(u32 *)(r1 + 80) /* __sk_buff->data_end */
    r7 = *(u32 *)(r1 + 76) /* __sk_buff->data */
    ......
    r6 = r7 /* R6_w=pkt(id=0,off=0,r=0,imm=0) */
    r6 += r9 /* R6_w=pkt(id=0,off=0,r=0,imm=0) */
    r3 = r6 /* R3_w=pkt(id=0,off=0,r=0,imm=0) */
    r3 += 14 /* R3_w=pkt(id=0,off=14,r=0,imm=0) */
    if r3 > r8 goto end
    ...

    To fix the above issue, we can include zero in the test condition for
    assigning the s32_max_value and s32_min_value to their 64-bit equivalents
    smax_value and smin_value.

    Further, fix the condition to avoid doing zero extension bounds checks
    when s32_min_value
    Signed-off-by: Alexei Starovoitov
    Acked-by: Yonghong Song
    Link: https://lore.kernel.org/bpf/159077331983.6014.5758956193749002737.stgit@john-Precision-5820-Tower

    John Fastabend
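
A minimal sketch of the bounds propagation described in the commit above, assuming a cut-down register state. `struct reg_bounds` and `assign_32_into_64()` are hypothetical names for illustration, not the kernel's definitions, and the handling of a negative s32_min_value is an assumption, since the sentence describing it is cut off in this listing. The point it shows is the inclusion of zero in the comparisons, so that a 32-bit constant 0 stays a known 64-bit constant.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, cut-down register state; the real verifier tracks far more. */
struct reg_bounds {
	int64_t smin_value, smax_value;
	int32_t s32_min_value, s32_max_value;
};

/* Carry 32-bit signed bounds into the 64-bit bounds after a zero-extending move. */
static void assign_32_into_64(struct reg_bounds *reg)
{
	/*
	 * Trust the 32-bit maximum only when the whole 32-bit range is
	 * non-negative. Using >= 0 rather than > 0 keeps a constant 0 exact.
	 */
	if (reg->s32_min_value >= 0 && reg->s32_max_value >= 0)
		reg->smax_value = reg->s32_max_value;
	else
		reg->smax_value = UINT32_MAX; /* worst case after zero extension */

	/* Assumption: a negative 32-bit minimum falls back to 0. */
	if (reg->s32_min_value >= 0)
		reg->smin_value = reg->s32_min_value;
	else
		reg->smin_value = 0;
}
```

With s32 bounds of [0, 0] this yields smin_value == smax_value == 0, the condition a known-constant check needs. With the full signed 32-bit range it yields [0, 4294967295], matching the step 11 state shown in the earlier test fix.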
     
  • Pull MMC fixes from Ulf Hansson:
    "MMC core:
    - Fix use-after-free issue for rpmb partition

    MMC host:
    - Fix quirk for broken CQE support"

    * tag 'mmc-v5.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
    mmc: block: Fix use-after-free issue for rpmb
    mmc: sdhci: Fix SDHCI_QUIRK_BROKEN_CQE

    Linus Torvalds
     
  • Pull sound fixes from Takashi Iwai:
    "Only a few last-minute small fixes: the change in ALSA core hwdep is
    about the undefined behavior of a bit shift, which is almost harmless
    but still worth picking up quickly.

    The rest are all device-specific fixes for HD-audio and USB-audio, and
    safe to apply at the late stage"

    * tag 'sound-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
    ALSA: hda/realtek - Add new codec supported for ALC287
    ALSA: usb-audio: Quirks for Gigabyte TRX40 Aorus Master onboard audio
    ALSA: usb-audio: mixer: volume quirk for ESS Technology Asus USB DAC
    ALSA: hda/realtek - Add a model for Thinkpad T570 without DAC workaround
    ALSA: hwdep: fix a left shifting 1 by 31 UB bug

    Linus Torvalds
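
The "left shifting 1 by 31" problem named in the hwdep fix is a general C pitfall and can be shown on its own. The helper below is a hypothetical illustration, not the ALSA code: with a 32-bit `int`, `1 << 31` shifts into the sign bit and is undefined behavior, while shifting an unsigned constant is well defined.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: return a 32-bit mask with only bit n set.
 * The caller must keep n in the range 0..31.
 */
static uint32_t bit_mask(unsigned int n)
{
	/*
	 * 1U is unsigned, so reaching bit 31 is well defined.
	 * Writing plain `1 << n` would overflow a 32-bit signed int
	 * when n is 31.
	 */
	return 1U << n;
}
```

The unsigned form costs nothing at run time; it only removes the case a compiler is allowed to treat as impossible.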
     
  • Pull clk fixes from Stephen Boyd:
    "Two fixes for the new SM8150 and SM8250 Qualcomm clk drivers to fix a
    randconfig build error and an incorrect parent mapping"

    * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
    clk: qcom: gcc: Fix parent for gpll0_out_even
    clk: qcom: sm8250 gcc depends on QCOM_GDSC

    Linus Torvalds
     
  • Fix the following issue:
    [ 436.749342] BUG: KASAN: use-after-free in bpf_trampoline_put+0x39/0x2a0
    [ 436.749995] Write of size 4 at addr ffff8881ef38b8a0 by task kworker/3:5/2243
    [ 436.750712]
    [ 436.752677] Workqueue: events bpf_prog_free_deferred
    [ 436.753183] Call Trace:
    [ 436.756483] bpf_trampoline_put+0x39/0x2a0
    [ 436.756904] bpf_prog_free_deferred+0x16d/0x3d0
    [ 436.757377] process_one_work+0x94a/0x15b0
    [ 436.761969]
    [ 436.762130] Allocated by task 2529:
    [ 436.763323] bpf_trampoline_lookup+0x136/0x540
    [ 436.763776] bpf_check+0x2872/0xa0a8
    [ 436.764144] bpf_prog_load+0xb6f/0x1350
    [ 436.764539] __do_sys_bpf+0x16d7/0x3720
    [ 436.765825]
    [ 436.765988] Freed by task 2529:
    [ 436.767084] kfree+0xc6/0x280
    [ 436.767397] bpf_trampoline_put+0x1fd/0x2a0
    [ 436.767826] bpf_check+0x6832/0xa0a8
    [ 436.768197] bpf_prog_load+0xb6f/0x1350
    [ 436.768594] __do_sys_bpf+0x16d7/0x3720

    prog->aux->trampoline = tr should be set only when prog is valid.
    Otherwise prog freeing will try to put trampoline via prog->aux->trampoline,
    but it may not point to a valid trampoline.

    Fixes: 6ba43b761c41 ("bpf: Attachment verification for BPF_MODIFY_RETURN")
    Signed-off-by: Alexei Starovoitov
    Signed-off-by: Daniel Borkmann
    Acked-by: KP Singh
    Link: https://lore.kernel.org/bpf/20200529043839.15824-2-alexei.starovoitov@gmail.com

    Alexei Starovoitov
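
The rule stated in the commit above, publish the trampoline pointer only once the program is known to be valid, can be sketched with a toy reference-counted object. All names here (`struct tramp`, `struct prog`, `prog_attach()`, and so on) are hypothetical stand-ins rather than the kernel's API; the sketch only shows the ordering that avoids a dangling pointer at free time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct tramp { int refcnt; };          /* hypothetical trampoline object */
struct prog  { struct tramp *tramp; }; /* hypothetical program state */

/* Allocate a trampoline holding one reference, or return NULL. */
static struct tramp *tramp_get(void)
{
	struct tramp *tr = calloc(1, sizeof(*tr));

	if (tr)
		tr->refcnt = 1;
	return tr;
}

/* Drop one reference and free the object when the last one goes away. */
static void tramp_put(struct tramp *tr)
{
	if (tr && --tr->refcnt == 0)
		free(tr);
}

/* Returns 0 on success, -1 on failure. `valid` models the verification result. */
static int prog_attach(struct prog *p, bool valid)
{
	struct tramp *tr = tramp_get();

	if (!tr)
		return -1;
	if (!valid) {
		/* Verification failed: drop our reference, leave p->tramp untouched. */
		tramp_put(tr);
		return -1;
	}
	/* Publish the pointer only after every check has passed. */
	p->tramp = tr;
	return 0;
}

/* Free path: p->tramp is either NULL or a live trampoline. */
static void prog_free(struct prog *p)
{
	tramp_put(p->tramp);
	p->tramp = NULL;
}
```

If the pointer were stored before the validity check, the failure path would free the trampoline while `p->tramp` still pointed at it, and the later `prog_free()` would touch freed memory, which is the pattern the KASAN report describes.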
     
  • The driver reports EINVAL to userspace through netlink on an invalid meta
    match. This is confusing since EINVAL is usually reserved for malformed
    netlink messages. Replace it with more meaningful codes.

    Fixes: 6d65bc64e232 ("net/mlx5e: Add mlx5e_flower_parse_meta support")
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Saeed Mahameed

    Pablo Neira Ayuso
     
  • Change MLX5_TC_CT config dependencies to include MLX5_ESWITCH instead of
    MLX5_CORE_EN && NET_SWITCHDEV, which are already required by MLX5_ESWITCH.
    Without this change, mlx5 fails to compile if the user disables MLX5_ESWITCH
    without also manually disabling MLX5_TC_CT.

    Fixes: 4c3844d9e97e ("net/mlx5e: CT: Introduce connection tracking")
    Signed-off-by: Vlad Buslov
    Reviewed-by: Roi Dayan
    Signed-off-by: Saeed Mahameed

    Vlad Buslov