17 Mar, 2021

5 commits

  • crypto_aead_encrypt returns
    Link: https://lore.kernel.org/r/20210309204137.823268-1-daniel.phan36@gmail.com
    Signed-off-by: Johannes Berg

    Daniel Phan
     
  • We observed some Cisco APs sending the following HE Operation IE in
    association responses:

    ff 0a 24 f4 3f 00 01 fc ff 00 00 00

    Its HE operation parameter is 0x003ff4, so the expected total length is
    7, which does not match the actual length of 10. This causes association
    to fail with "HE AP is missing HE Capability/operation."

    According to P802.11ax_D4 Table9-94, HE operation is extensible, and
    according to 802.11-2016 10.27.8, STA should discard the part beyond
    the maximum length and parse the truncated element.

    Allow HE operation element to be longer than expected to handle this
    case and future extensions.

    Fixes: e4d005b80dee ("mac80211: refactor extended element parsing")
    Signed-off-by: Brian Norris
    Signed-off-by: Yen-lin Lai
    Link: https://lore.kernel.org/r/20210223051926.2653301-1-yenlinlai@chromium.org
    Signed-off-by: Johannes Berg

    Brian Norris
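
A minimal sketch of the length check described in this commit, not the mac80211 code itself. The function names are hypothetical, and the flag bits and optional-field sizes are assumptions that follow the usual HE Operation layout. The parameters are modelled as a 32-bit little-endian word that also carries the BSS color byte, so the element above reads as 0x01003ff4.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed flag bits in the 32-bit HE Operation parameters word. */
#define HE_OPER_VHT_OPER_INFO 0x00004000u /* adds 3 bytes */
#define HE_OPER_CO_HOSTED_BSS 0x00008000u /* adds 1 byte  */
#define HE_OPER_6GHZ_OP_INFO  0x00020000u /* adds 5 bytes */

/* Extension ID (1) + parameters (4) + basic MCS/NSS set (2) + optional fields. */
static size_t he_oper_expected_len(uint32_t params)
{
	size_t len = 1 + 4 + 2;

	if (params & HE_OPER_VHT_OPER_INFO)
		len += 3;
	if (params & HE_OPER_CO_HOSTED_BSS)
		len += 1;
	if (params & HE_OPER_6GHZ_OP_INFO)
		len += 5;
	return len;
}

/* Strict check: rejects any element carrying trailing bytes. */
static bool he_oper_len_ok_strict(uint32_t params, size_t actual)
{
	return actual == he_oper_expected_len(params);
}

/* Relaxed check: accepts longer elements; the extra bytes are ignored. */
static bool he_oper_len_ok_relaxed(uint32_t params, size_t actual)
{
	return actual >= he_oper_expected_len(params);
}
```

With the element quoted above (expected length 7, actual length 10), the strict check fails and the relaxed check passes, while an element shorter than 7 bytes is still rejected.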
     
  • This probably came in through some refactoring and what is
    now a call to minstrel_ht_group_min_rate_offset(). Remove
    the unused variable.

    Reported-by: kernel test robot
    Acked-by: Felix Fietkau
    Link: https://lore.kernel.org/r/20210219105744.f2538a80f6cf.I3d53554c158d5b896ac07ea546bceac67372ec28@changeid
    Signed-off-by: Johannes Berg

    Johannes Berg
     
  • Clear beacon ie pointer and ie length after free
    in order to prevent double free.

    ==================================================================
    BUG: KASAN: double-free or invalid-free \
    in ieee80211_ibss_leave+0x83/0xe0 net/mac80211/ibss.c:1876

    CPU: 0 PID: 8472 Comm: syz-executor100 Not tainted 5.11.0-rc6-syzkaller #0
    Call Trace:
    __dump_stack lib/dump_stack.c:79 [inline]
    dump_stack+0x107/0x163 lib/dump_stack.c:120
    print_address_description.constprop.0.cold+0x5b/0x2c6 mm/kasan/report.c:230
    kasan_report_invalid_free+0x51/0x80 mm/kasan/report.c:355
    ____kasan_slab_free+0xcc/0xe0 mm/kasan/common.c:341
    kasan_slab_free include/linux/kasan.h:192 [inline]
    __cache_free mm/slab.c:3424 [inline]
    kfree+0xed/0x270 mm/slab.c:3760
    ieee80211_ibss_leave+0x83/0xe0 net/mac80211/ibss.c:1876
    rdev_leave_ibss net/wireless/rdev-ops.h:545 [inline]
    __cfg80211_leave_ibss+0x19a/0x4c0 net/wireless/ibss.c:212
    __cfg80211_leave+0x327/0x430 net/wireless/core.c:1172
    cfg80211_leave net/wireless/core.c:1221 [inline]
    cfg80211_netdev_notifier_call+0x9e8/0x12c0 net/wireless/core.c:1335
    notifier_call_chain+0xb5/0x200 kernel/notifier.c:83
    call_netdevice_notifiers_info+0xb5/0x130 net/core/dev.c:2040
    call_netdevice_notifiers_extack net/core/dev.c:2052 [inline]
    call_netdevice_notifiers net/core/dev.c:2066 [inline]
    __dev_close_many+0xee/0x2e0 net/core/dev.c:1586
    __dev_close net/core/dev.c:1624 [inline]
    __dev_change_flags+0x2cb/0x730 net/core/dev.c:8476
    dev_change_flags+0x8a/0x160 net/core/dev.c:8549
    dev_ifsioc+0x210/0xa70 net/core/dev_ioctl.c:265
    dev_ioctl+0x1b1/0xc40 net/core/dev_ioctl.c:511
    sock_do_ioctl+0x148/0x2d0 net/socket.c:1060
    sock_ioctl+0x477/0x6a0 net/socket.c:1177
    vfs_ioctl fs/ioctl.c:48 [inline]
    __do_sys_ioctl fs/ioctl.c:753 [inline]
    __se_sys_ioctl fs/ioctl.c:739 [inline]
    __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:739
    do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
    entry_SYSCALL_64_after_hwframe+0x44/0xa9

    Reported-by: syzbot+93976391bf299d425f44@syzkaller.appspotmail.com
    Signed-off-by: Markus Theil
    Link: https://lore.kernel.org/r/20210213133653.367130-1-markus.theil@tu-ilmenau.de
    Signed-off-by: Johannes Berg

    Markus Theil
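
A user-space sketch of the pattern this commit describes, with hypothetical names (`struct ibss_state`, `ibss_leave`): once the buffer is freed, the pointer and length are reset so that a second call through the same path frees nothing. `free(NULL)` is a no-op, just as `kfree(NULL)` is in the kernel.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-interface IBSS state. */
struct ibss_state {
	unsigned char *ie;
	size_t ie_len;
};

static void ibss_leave(struct ibss_state *s)
{
	free(s->ie);   /* safe when s->ie is already NULL */
	s->ie = NULL;  /* clear the stale pointer ...     */
	s->ie_len = 0; /* ... and the matching length     */
}
```

Calling `ibss_leave()` twice on the same state is then harmless, which is the property the double-free report shows was missing.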
     
  • Coverity reported the strange "if (~...)" condition that's
    always true. It suggested that ! was intended instead of ~,
    but upon further analysis I'm convinced that what really was
    intended was a comparison to 0xff/0xffff (in HT/VHT cases
    respectively), since this indicates that all of the rates
    are enabled.

    Change the comparison accordingly.

    I'm guessing this never really mattered because a reset to
    not having a rate mask is basically equivalent to having a
    mask that enables all rates.

    Reported-by: Colin Ian King
    Fixes: 2ffbe6d33366 ("mac80211: fix and optimize MCS mask handling")
    Fixes: b119ad6e726c ("mac80211: add rate mask logic for vht rates")
    Reviewed-by: Colin Ian King
    Link: https://lore.kernel.org/r/20210212112213.36b38078f569.I8546a20c80bc1669058eb453e213630b846e107b@changeid
    Signed-off-by: Johannes Berg

    Johannes Berg
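
A small sketch of why the "if (~...)" condition is always true, using hypothetical helper names. In C, an 8-bit operand is promoted to `int` before `~` is applied, so the complement of 0xff is 0xffffff00 and never zero.

```c
#include <stdbool.h>
#include <stdint.h>

/* Buggy form: integer promotion makes ~mask nonzero for every 8-bit value. */
static bool ht_mask_restricts_buggy(uint8_t mcs_mask)
{
	return (~mcs_mask) != 0;
}

/* Intended form for HT: a restriction exists unless all 8 rates are enabled. */
static bool ht_mask_restricts(uint8_t mcs_mask)
{
	return mcs_mask != 0xff;
}

/* Intended form for VHT: same idea with a 16-bit mask. */
static bool vht_mask_restricts(uint16_t mcs_mask)
{
	return mcs_mask != 0xffff;
}
```

With an all-ones mask the buggy helper still reports a restriction, while the comparison forms report none.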
     

16 Mar, 2021

5 commits

  • Currently, Linux computes the HMAC contained in ADD_ADDR sub-option using
    the Address Id and the IP Address, and hardcodes a destination port equal
    to zero. This is incorrect for ADD_ADDR with a port: account for the
    endpoint port when computing the HMAC, in compliance with RFC8684 §3.4.1.

    Fixes: 22fb85ffaefb ("mptcp: add port support for ADD_ADDR suboption writing")
    Reviewed-by: Mat Martineau
    Acked-by: Geliang Tang
    Signed-off-by: Davide Caratti
    Signed-off-by: David S. Miller

    Davide Caratti
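
A sketch of the HMAC input for an IPv4 ADD_ADDR, assuming the RFC 8684 layout of Address ID, IP address, and a 2-octet port that is zero when no port is advertised. The function name and signature are hypothetical; the HMAC computation itself is out of scope here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Build the message that is fed to the HMAC: id (1) | addr (4) | port (2).
 * addr is given in network byte order; port is given in host byte order
 * and written big-endian. Pass port 0 when the option carries no port.
 */
static size_t add_addr_hmac_msg(uint8_t out[7], uint8_t id,
				const uint8_t addr[4], uint16_t port)
{
	size_t i = 0;

	out[i++] = id;
	memcpy(&out[i], addr, 4);
	i += 4;
	out[i++] = (uint8_t)(port >> 8);
	out[i++] = (uint8_t)(port & 0xff);
	return i;
}
```

Hardcoding the last two octets to zero, as described above, yields a different message (and hence a different HMAC) whenever the endpoint port is nonzero.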
     
  • Currently tcp_check_req can be called with an obsolete req socket for which
    the big socket has already been created (because of a CPU race or early demux
    assigning the req socket to multiple packets in a GRO batch).

    Commit e0f9759f530bf789e984 ("tcp: try to keep packet if SYN_RCV race
    is lost") added a retry for the case when tcp_check_req is called for a
    PSH|ACK packet. But if the client sends RST+ACK immediately after the
    connection is established (it is performing a health check, for example),
    the retry does not occur. In that case tcp_check_req tries to close the
    req socket, leaving the big socket active.

    Fixes: e0f9759f530 ("tcp: try to keep packet if SYN_RCV race is lost")
    Signed-off-by: Alexander Ovechkin
    Reported-by: Oleg Senin
    Reviewed-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Alexander Ovechkin
     
  • Before calling tipc_aead_key_size(ptr), we need to ensure
    we have enough data to dereference ptr->keylen.

    We probably also want to make sure tipc_aead_key_size()
    won't overflow with malicious ptr->keylen values.

    Syzbot reported:

    BUG: KMSAN: uninit-value in __tipc_nl_node_set_key net/tipc/node.c:2971 [inline]
    BUG: KMSAN: uninit-value in tipc_nl_node_set_key+0x9bf/0x13b0 net/tipc/node.c:3023
    CPU: 0 PID: 21060 Comm: syz-executor.5 Not tainted 5.11.0-rc7-syzkaller #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:
    __dump_stack lib/dump_stack.c:79 [inline]
    dump_stack+0x21c/0x280 lib/dump_stack.c:120
    kmsan_report+0xfb/0x1e0 mm/kmsan/kmsan_report.c:118
    __msan_warning+0x5f/0xa0 mm/kmsan/kmsan_instr.c:197
    __tipc_nl_node_set_key net/tipc/node.c:2971 [inline]
    tipc_nl_node_set_key+0x9bf/0x13b0 net/tipc/node.c:3023
    genl_family_rcv_msg_doit net/netlink/genetlink.c:739 [inline]
    genl_family_rcv_msg net/netlink/genetlink.c:783 [inline]
    genl_rcv_msg+0x1319/0x1610 net/netlink/genetlink.c:800
    netlink_rcv_skb+0x6fa/0x810 net/netlink/af_netlink.c:2494
    genl_rcv+0x63/0x80 net/netlink/genetlink.c:811
    netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline]
    netlink_unicast+0x11d6/0x14a0 net/netlink/af_netlink.c:1330
    netlink_sendmsg+0x1740/0x1840 net/netlink/af_netlink.c:1919
    sock_sendmsg_nosec net/socket.c:652 [inline]
    sock_sendmsg net/socket.c:672 [inline]
    ____sys_sendmsg+0xcfc/0x12f0 net/socket.c:2345
    ___sys_sendmsg net/socket.c:2399 [inline]
    __sys_sendmsg+0x714/0x830 net/socket.c:2432
    __compat_sys_sendmsg net/compat.c:347 [inline]
    __do_compat_sys_sendmsg net/compat.c:354 [inline]
    __se_compat_sys_sendmsg+0xa7/0xc0 net/compat.c:351
    __ia32_compat_sys_sendmsg+0x4a/0x70 net/compat.c:351
    do_syscall_32_irqs_on arch/x86/entry/common.c:79 [inline]
    __do_fast_syscall_32+0x102/0x160 arch/x86/entry/common.c:141
    do_fast_syscall_32+0x6a/0xc0 arch/x86/entry/common.c:166
    do_SYSENTER_32+0x73/0x90 arch/x86/entry/common.c:209
    entry_SYSENTER_compat_after_hwframe+0x4d/0x5c
    RIP: 0023:0xf7f60549
    Code: 03 74 c0 01 10 05 03 74 b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 5a 59 c3 90 90 90 90 8d b4 26 00 00 00 00 8d b4 26 00 00 00 00
    RSP: 002b:00000000f555a5fc EFLAGS: 00000296 ORIG_RAX: 0000000000000172
    RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000020000200
    RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
    RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
    R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000

    Uninit was created at:
    kmsan_save_stack_with_flags mm/kmsan/kmsan.c:121 [inline]
    kmsan_internal_poison_shadow+0x5c/0xf0 mm/kmsan/kmsan.c:104
    kmsan_slab_alloc+0x8d/0xe0 mm/kmsan/kmsan_hooks.c:76
    slab_alloc_node mm/slub.c:2907 [inline]
    __kmalloc_node_track_caller+0xa37/0x1430 mm/slub.c:4527
    __kmalloc_reserve net/core/skbuff.c:142 [inline]
    __alloc_skb+0x2f8/0xb30 net/core/skbuff.c:210
    alloc_skb include/linux/skbuff.h:1099 [inline]
    netlink_alloc_large_skb net/netlink/af_netlink.c:1176 [inline]
    netlink_sendmsg+0xdbc/0x1840 net/netlink/af_netlink.c:1894
    sock_sendmsg_nosec net/socket.c:652 [inline]
    sock_sendmsg net/socket.c:672 [inline]
    ____sys_sendmsg+0xcfc/0x12f0 net/socket.c:2345
    ___sys_sendmsg net/socket.c:2399 [inline]
    __sys_sendmsg+0x714/0x830 net/socket.c:2432
    __compat_sys_sendmsg net/compat.c:347 [inline]
    __do_compat_sys_sendmsg net/compat.c:354 [inline]
    __se_compat_sys_sendmsg+0xa7/0xc0 net/compat.c:351
    __ia32_compat_sys_sendmsg+0x4a/0x70 net/compat.c:351
    do_syscall_32_irqs_on arch/x86/entry/common.c:79 [inline]
    __do_fast_syscall_32+0x102/0x160 arch/x86/entry/common.c:141
    do_fast_syscall_32+0x6a/0xc0 arch/x86/entry/common.c:166
    do_SYSENTER_32+0x73/0x90 arch/x86/entry/common.c:209
    entry_SYSENTER_compat_after_hwframe+0x4d/0x5c

    Fixes: e1f32190cf7d ("tipc: add support for AEAD key setting via netlink")
    Signed-off-by: Eric Dumazet
    Cc: Tuong Lien
    Cc: Jon Maloy
    Cc: Ying Xue
    Reported-by: syzbot
    Signed-off-by: David S. Miller

    Eric Dumazet
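
A sketch of the two checks this commit calls for, using a hypothetical structure and bound rather than the TIPC definitions: first make sure the attribute is long enough to contain `keylen` at all, then bound `keylen` before using it in a size computation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ALG_NAME_LEN 32
#define KEY_MAX_LEN  36 /* hypothetical upper bound on the key length */

/* Hypothetical layout: fixed header followed by keylen bytes of key. */
struct aead_key {
	char alg_name[ALG_NAME_LEN];
	uint32_t keylen;
	unsigned char key[];
};

static bool aead_key_attr_valid(const unsigned char *data, size_t attr_len)
{
	uint32_t keylen;

	/* 1. Too short to even hold keylen: reading it would touch
	 *    bytes the sender never provided. */
	if (attr_len < sizeof(struct aead_key))
		return false;

	memcpy(&keylen, data + offsetof(struct aead_key, keylen),
	       sizeof(keylen));

	/* 2. Bound keylen so header + keylen cannot overflow. */
	if (keylen > KEY_MAX_LEN)
		return false;

	/* 3. The attribute must carry the whole key. */
	return attr_len >= sizeof(struct aead_key) + keylen;
}
```

The order matters: the size computation is only meaningful once `keylen` is known to lie inside the received data.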
     
  • If pl->mac_ops->mac_finish() fails, phylink_err should report
    "mac_finish" instead of "mac_prepare".

    Fixes: b7ad14c2fe2d4 ("net: phylink: re-implement interface configuration with PCS")
    Signed-off-by: Ong Boon Leong
    Signed-off-by: David S. Miller

    Ong Boon Leong
     
  • "x25_close" is called by "hdlc_close" in "hdlc.c", which is called by
    hardware drivers' "ndo_stop" function.
    "x25_xmit" is called by "hdlc_start_xmit" in "hdlc.c", which is hardware
    drivers' "ndo_start_xmit" function.
    "x25_rx" is called by "hdlc_rcv" in "hdlc.c", which receives HDLC frames
    from "net/core/dev.c".

    "x25_close" races with "x25_xmit" and "x25_rx" because their callers race.

    However, we need to ensure that the LAPB APIs called in "x25_xmit" and
    "x25_rx" are called before "lapb_unregister" is called in "x25_close".

    This patch adds locking to ensure when "x25_xmit" and "x25_rx" are doing
    their work, "lapb_unregister" is not yet called in "x25_close".

    Reasons for not solving the racing between "x25_close" and "x25_xmit" by
    calling "netif_tx_disable" in "x25_close":
    1. We still need to solve the racing between "x25_close" and "x25_rx";
    2. The design of the HDLC subsystem assumes the HDLC hardware drivers
    have full control over the TX queue, and the HDLC protocol drivers (like
    this driver) have no control. Controlling the queue here in the protocol
    driver may interfere with hardware drivers' control of the queue.

    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Xie He
    Signed-off-by: David S. Miller

    Xie He
     

15 Mar, 2021

3 commits

  • flow_dissector_key_icmp::id is of type u16 (CPU byteorder), while
    the ICMP header carries its ID field in network byteorder.
    Sparse says:

    net/core/flow_dissector.c:178:43: warning: restricted __be16 degrades to integer

    Convert ID value to CPU byteorder when storing it into
    flow_dissector_key_icmp.

    Fixes: 5dec597e5cd0 ("flow_dissector: extract more ICMP information")
    Signed-off-by: Alexander Lobakin
    Signed-off-by: David S. Miller

    Alexander Lobakin
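
A user-space sketch of the conversion, with a hypothetical key structure. `ntohs()` plays the role that a big-endian-to-CPU helper plays in the kernel.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Hypothetical stand-in for the dissector key; id is in CPU byte order. */
struct icmp_key {
	uint16_t id;
};

/* wire_id_be is the ID exactly as it appears in the ICMP header. */
static void icmp_key_store_id(struct icmp_key *key, uint16_t wire_id_be)
{
	key->id = ntohs(wire_id_be);
}
```

On a big-endian machine the conversion is a no-op, which is why storing the raw value only misbehaves on little-endian hosts.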
     
  • struct sockaddr_qrtr has a 2-byte hole, and qrtr_recvmsg() currently
    does not clear it before copying kernel data to user space.

    It might be too late to name the hole since sockaddr_qrtr structure is uapi.

    BUG: KMSAN: kernel-infoleak in kmsan_copy_to_user+0x9c/0xb0 mm/kmsan/kmsan_hooks.c:249
    CPU: 0 PID: 29705 Comm: syz-executor.3 Not tainted 5.11.0-rc7-syzkaller #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:
    __dump_stack lib/dump_stack.c:79 [inline]
    dump_stack+0x21c/0x280 lib/dump_stack.c:120
    kmsan_report+0xfb/0x1e0 mm/kmsan/kmsan_report.c:118
    kmsan_internal_check_memory+0x202/0x520 mm/kmsan/kmsan.c:402
    kmsan_copy_to_user+0x9c/0xb0 mm/kmsan/kmsan_hooks.c:249
    instrument_copy_to_user include/linux/instrumented.h:121 [inline]
    _copy_to_user+0x1ac/0x270 lib/usercopy.c:33
    copy_to_user include/linux/uaccess.h:209 [inline]
    move_addr_to_user+0x3a2/0x640 net/socket.c:237
    ____sys_recvmsg+0x696/0xd50 net/socket.c:2575
    ___sys_recvmsg net/socket.c:2610 [inline]
    do_recvmmsg+0xa97/0x22d0 net/socket.c:2710
    __sys_recvmmsg net/socket.c:2789 [inline]
    __do_sys_recvmmsg net/socket.c:2812 [inline]
    __se_sys_recvmmsg+0x24a/0x410 net/socket.c:2805
    __x64_sys_recvmmsg+0x62/0x80 net/socket.c:2805
    do_syscall_64+0x9f/0x140 arch/x86/entry/common.c:48
    entry_SYSCALL_64_after_hwframe+0x44/0xa9
    RIP: 0033:0x465f69
    Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
    RSP: 002b:00007f43659d6188 EFLAGS: 00000246 ORIG_RAX: 000000000000012b
    RAX: ffffffffffffffda RBX: 000000000056bf60 RCX: 0000000000465f69
    RDX: 0000000000000008 RSI: 0000000020003e40 RDI: 0000000000000003
    RBP: 00000000004bfa8f R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000010060 R11: 0000000000000246 R12: 000000000056bf60
    R13: 0000000000a9fb1f R14: 00007f43659d6300 R15: 0000000000022000

    Local variable ----addr@____sys_recvmsg created at:
    ____sys_recvmsg+0x168/0xd50 net/socket.c:2550
    ____sys_recvmsg+0x168/0xd50 net/socket.c:2550

    Bytes 2-3 of 12 are uninitialized
    Memory access of size 12 starts at ffff88817c627b40
    Data copied to user address 0000000020000140

    Fixes: bdabad3e363d ("net: Add Qualcomm IPC router")
    Signed-off-by: Eric Dumazet
    Cc: Courtney Cavin
    Reported-by: syzbot
    Signed-off-by: David S. Miller

    Eric Dumazet
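
A sketch of how the hole arises and how zeroing closes it. The structure is a hypothetical mirror of the layout implied by the report above (12 bytes, bytes 2-3 unused), assuming a typical ABI where a 32-bit field is 4-byte aligned.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 2-byte family, 2-byte padding hole, then two 32-bit fields. */
struct qrtr_addr {
	uint16_t family;
	uint32_t node;
	uint32_t port;
};

static void qrtr_addr_fill(struct qrtr_addr *a, uint16_t family,
			   uint32_t node, uint32_t port)
{
	/* Zero the whole object first so the padding bytes never
	 * carry stale memory contents out to the caller. */
	memset(a, 0, sizeof(*a));
	a->family = family;
	a->node = node;
	a->port = port;
}
```

Assigning the three members alone would leave bytes 2-3 holding whatever was in memory before, which is exactly what the KMSAN report flags.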
     
  • There are two issues when handling the error case in com20020pci_probe():

    1. priv might not be initialized yet when com20020pci_remove() is called
    from com20020pci_probe(): priv is set at the very end, but the function
    can jump to error handling in the middle, leaving priv NULL.
    2. Memory leak: the net device is allocated in alloc_arcdev but not
    properly released if an error happens in the middle of the big for loop.

    [ 1.529110] BUG: kernel NULL pointer dereference, address: 0000000000000008
    [ 1.531447] RIP: 0010:com20020pci_remove+0x15/0x60 [com20020_pci]
    [ 1.536805] Call Trace:
    [ 1.536939] com20020pci_probe+0x3f2/0x48c [com20020_pci]
    [ 1.537226] local_pci_probe+0x48/0x80
    [ 1.539918] com20020pci_init+0x3f/0x1000 [com20020_pci]

    Signed-off-by: Tong Zhang
    Signed-off-by: David S. Miller

    Tong Zhang
     

14 Mar, 2021

2 commits

  • This commit fixes three spelling typos in devlink-dpipe.rst and
    devlink-port.rst.

    Signed-off-by: Eva Dengler
    Acked-by: Randy Dunlap
    Signed-off-by: David S. Miller

    Eva Dengler
     
  • When a QMI handle is initialized, an array of message handler
    structures is provided, defining how any received message should
    be handled based on its type and message ID. The QMI core code
    traverses this array when a message arrives and calls the function
    associated with the (type, msg_id) found in the array.

    The array is supposed to be terminated with an empty (all zero)
    entry though. Without it, an unsupported message will cause
    the QMI core code to go past the end of the array.

    Fix this bug, by properly terminating the message handler arrays
    provided when QMI handles are set up by the IPA driver.

    Fixes: 530f9216a9537 ("soc: qcom: ipa: AP/modem communications")
    Reported-by: Sujit Kautkar
    Signed-off-by: Alex Elder
    Reviewed-by: Bjorn Andersson
    Signed-off-by: David S. Miller

    Alex Elder
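
A sketch of a sentinel-terminated handler table and the kind of lookup that depends on it. All names are hypothetical, and the loop condition (stop at an entry whose function pointer is NULL) is an assumption about how such a traversal is typically written.

```c
#include <stddef.h>

struct msg_handler {
	int type;
	int msg_id;
	void (*fn)(void);
};

static void on_indication(void) { }
static void on_response(void) { }

static const struct msg_handler handlers[] = {
	{ 1, 0x20, on_indication },
	{ 2, 0x21, on_response },
	{ 0, 0, NULL }, /* terminator: without it the loop runs off the array */
};

static const struct msg_handler *
find_handler(const struct msg_handler *h, int type, int msg_id)
{
	/* The only stop condition is the all-zero sentinel entry. */
	for (; h->fn; h++)
		if (h->type == type && h->msg_id == msg_id)
			return h;
	return NULL;
}
```

A lookup for a supported message returns before the end of the table, so the missing terminator only shows up when an unsupported message arrives.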
     

13 Mar, 2021

11 commits

  • /tnguy/net-queue

    Tony Nguyen says:

    ====================
    Intel Wired LAN Driver Updates 2021-03-12

    This series contains updates to ice, i40e, ixgbe and igb drivers.

    Magnus adjusts the return value for xsk allocation for ice. This fixes
    reporting of napi work done and matches the behavior of other Intel NIC
    drivers for xsk allocations.

    Maciej moves storing of the rx_offset value to after the build_skb flag
    is set as this flag affects the offset value for ice, i40e, and ixgbe.

    Li RongQing resolves an issue where an Rx buffer can be reused
    prematurely with XDP redirect for igb.
    ====================

    David S. Miller
     
  • Tom wrote most of the driver code and his experience is valuable to us.
    Add him as a Reviewer so that patches will be Cc'ed and reviewed by him.

    Signed-off-by: Lijun Pan
    Signed-off-by: David S. Miller

    Lijun Pan
     
  • The join self tests previously used the '-c' command line option to
    enable creation of pcap files for the tests that run, but the change to
    allow running a subset of the join tests made overlapping use of that
    option.

    Restore the capture functionality with '-c' and move the syncookie test
    option to '-k'.

    Fixes: 1002b89f23ea ("selftests: mptcp: add command line arguments for mptcp_join.sh")
    Acked-and-tested-by: Geliang Tang
    Co-developed-by: Matthieu Baerts
    Signed-off-by: Matthieu Baerts
    Signed-off-by: Mat Martineau
    Signed-off-by: David S. Miller

    Mat Martineau
     
  • A recent change to MIPS ralink reset logic made it so mt7530 actually
    resets the switch on platforms such as mt7621 (where bit 2 is the reset
    line for the switch). That exposed an issue where the switch would not
    function properly in TRGMII mode after a reset.

    Reconfigure core clock in TRGMII mode to fix the issue.

    Tested on Ubiquiti ER-X (MT7621) with TRGMII mode enabled.

    Fixes: 3f9ef7785a9c ("MIPS: ralink: manage low reset lines")
    Signed-off-by: Ilya Lipnitskiy
    Signed-off-by: David S. Miller

    Ilya Lipnitskiy
     
  • The MPTCP_PUSH_PENDING define is 6 and these tests should be testing if
    BIT(6) is set.

    Fixes: c2e6048fa1cf ("mptcp: fix race in release_cb")
    Signed-off-by: Dan Carpenter
    Reviewed-by: Matthieu Baerts
    Signed-off-by: David S. Miller

    Dan Carpenter
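
A sketch of the difference between using a bit number as a mask and testing the bit it names. The macro and helper names are hypothetical; only the value 6 comes from the commit message.

```c
#include <stdbool.h>

#define BIT(n) (1UL << (n))
#define PUSH_PENDING 6 /* a bit number, not a mask */

/* Buggy: 6 is binary 110, so this tests bits 1 and 2. */
static bool push_pending_buggy(unsigned long flags)
{
	return (flags & PUSH_PENDING) != 0;
}

/* Intended: test bit 6 itself. */
static bool push_pending(unsigned long flags)
{
	return (flags & BIT(PUSH_PENDING)) != 0;
}
```

The buggy form misses the flag when only bit 6 is set and fires when unrelated low bits are set.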
     
  • The PHY driver entry for BCM50160 and BCM50610M calls
    bcm54xx_config_init() but does not call bcm54xx_config_clock_delay() in
    order to configure appropriate clock delays on the PHY. Fix that.

    Fixes: 733336262b28 ("net: phy: Allow BCM5481x PHYs to setup internal TX/RX clock delay")
    Signed-off-by: Florian Fainelli
    Signed-off-by: David S. Miller

    Florian Fainelli
     
  • The interrupt handler may set the flag to reset the mac in the future,
    but that flag is not cleared once the reset has occurred.

    Fixes: 10cbd6407609 ("ftgmac100: Rework NAPI & interrupts handling")
    Signed-off-by: Dylan Hung
    Acked-by: Benjamin Herrenschmidt
    Reviewed-by: Joel Stanley
    Signed-off-by: Joel Stanley
    Signed-off-by: David S. Miller

    Dylan Hung
     
  • The driver did not always clean up all allocated resources when probe
    failed. Fix the probe cleanup path to clean up everything that was
    allocated.

    Fixes: 57baf8cc70ea ("net: axienet: Handle deferred probe on clock properly")
    Signed-off-by: Robert Hancock
    Signed-off-by: David S. Miller

    Robert Hancock
     
  • The "backlog" argument in listen() specifies
    the maximum length of the pending connection queue,
    so the accept queue should be considered full
    if there are exactly "backlog" elements.

    Signed-off-by: liuyacan
    Signed-off-by: David S. Miller

    liuyacan
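
A sketch of the boundary condition, with hypothetical helper names. The commit message does not quote the old comparison; the strict "greater than" form below is an assumption about what it replaced.

```c
#include <stdbool.h>

/* Assumed old behaviour: the queue may grow to backlog + 1 entries. */
static bool acceptq_is_full_old(unsigned int queued, unsigned int backlog)
{
	return queued > backlog;
}

/* Behaviour described above: full at exactly backlog entries. */
static bool acceptq_is_full(unsigned int queued, unsigned int backlog)
{
	return queued >= backlog;
}
```

The two forms differ only when the queue holds exactly "backlog" connections.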
     
  • This reverts commit 2055a99da8a253a357bdfd359b3338ef3375a26c.

    This change rejects legitimate configurations.

    A slave doesn't need to exist nor implement ndo_slave_setup.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Igb needs a similar fix as commit 75aab4e10ae6a ("i40e: avoid
    premature Rx buffer reuse")

    The page recycle code incorrectly relied on the assumption that a page
    fragment could not be freed inside xdp_do_redirect(). Because of this
    assumption, page fragments that are still used by the stack/XDP redirect
    can be reused and overwritten.

    To avoid this, store the page count prior to invoking xdp_do_redirect().

    Longer explanation:

    Intel NICs have a recycle mechanism. The main idea is that a page is
    split into two parts. One part is owned by the driver, one part might
    be owned by someone else, such as the stack.

    t0: Page is allocated, and put on the Rx ring
                  +---------------
    used by NIC ->| upper buffer
    (rx_buffer)   +---------------
                  | lower buffer
                  +---------------
      page count  == USHRT_MAX
      rx_buffer->pagecnt_bias == USHRT_MAX

    t1: Buffer is received, and passed to the stack (e.g.)
                  +---------------
                  | upper buff (skb)
                  +---------------
    used by NIC ->| lower buffer
    (rx_buffer)   +---------------
      page count  == USHRT_MAX
      rx_buffer->pagecnt_bias == USHRT_MAX - 1

    t2: Buffer is received, and redirected
                  +---------------
                  | upper buff (skb)
                  +---------------
    used by NIC ->| lower buffer
    (rx_buffer)   +---------------

    Now, prior to calling xdp_do_redirect():
      page count  == USHRT_MAX
      rx_buffer->pagecnt_bias == USHRT_MAX - 2

    This means that buffer *cannot* be flipped/reused, because the skb is
    still using it.

    The problem arises when xdp_do_redirect() actually frees the
    segment. Then we get:
    page count == USHRT_MAX - 1
    rx_buffer->pagecnt_bias == USHRT_MAX - 2

    From a recycle perspective, the buffer can be flipped and reused,
    which means that the skb data area is passed to the Rx HW ring!

    To work around this, the page count is stored prior to calling
    xdp_do_redirect().

    Fixes: 9cbc948b5a20 ("igb: add XDP support")
    Signed-off-by: Li RongQing
    Reviewed-by: Alexander Duyck
    Tested-by: Vishakha Jambekar
    Signed-off-by: Tony Nguyen

    Li RongQing
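
A sketch of the recycle decision using the numbers from the timeline above. The helper name is hypothetical, and the rule "reusable when at most one reference is outstanding beyond the driver's bias" is an assumption that matches the counts given in the commit message.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * pgcnt is the page reference count, pagecnt_bias the driver's own
 * bookkeeping. A difference above 1 means someone else (here, the skb)
 * still holds part of the page, so it must not go back on the Rx ring.
 */
static bool rx_page_reusable(unsigned int pgcnt, unsigned int pagecnt_bias)
{
	return pgcnt - pagecnt_bias <= 1;
}
```

Evaluated with the count sampled before xdp_do_redirect() (USHRT_MAX against a bias of USHRT_MAX - 2), the page is correctly kept out of reuse. Evaluated with the count read afterwards (USHRT_MAX - 1), the same check wrongly allows reuse, which is why the count is stored up front.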
     

12 Mar, 2021

14 commits

  • ixgbe_rx_offset(), that is supposed to initialize the Rx buffer headroom,
    relies on __IXGBE_RX_BUILD_SKB_ENABLED flag.

    Currently, the callsite of mentioned function is placed incorrectly
    within ixgbe_setup_rx_resources() where Rx ring's build skb flag is not
    set yet. This causes the XDP_REDIRECT to be partially broken due to
    inability to create xdp_frame in the headroom space, as the headroom is
    0.

    Fix this by moving ixgbe_rx_offset() to ixgbe_configure_rx_ring() after
    the flag setting, which happens to be set in ixgbe_set_rx_buffer_len.

    Fixes: c0d4e9d223c5 ("ixgbe: store the result of ixgbe_rx_offset() onto ixgbe_ring")
    Signed-off-by: Maciej Fijalkowski
    Tested-by: Vishakha Jambekar
    Signed-off-by: Tony Nguyen

    Maciej Fijalkowski
     
  • ice_rx_offset(), that is supposed to initialize the Rx buffer headroom,
    relies on ICE_RX_FLAGS_RING_BUILD_SKB flag as well as XDP prog presence.

    Currently, the callsite of mentioned function is placed incorrectly
    within ice_setup_rx_ring() where Rx ring's build skb flag is not
    set yet. This causes the XDP_REDIRECT to be partially broken due to
    inability to create xdp_frame in the headroom space, as the headroom is
    0.

    Fix this by moving ice_rx_offset() to ice_setup_rx_ctx() after the flag
    setting.

    Fixes: f1b1f409bf79 ("ice: store the result of ice_rx_offset() onto ice_ring")
    Signed-off-by: Maciej Fijalkowski
    Tested-by: Kiran Bhandare
    Signed-off-by: Tony Nguyen

    Maciej Fijalkowski
     
  • i40e_rx_offset(), that is supposed to initialize the Rx buffer headroom,
    relies on I40E_RXR_FLAGS_BUILD_SKB_ENABLED flag.

    Currently, the callsite of mentioned function is placed incorrectly
    within i40e_setup_rx_descriptors() where Rx ring's build skb flag is not
    set yet. This causes the XDP_REDIRECT to be partially broken due to
    inability to create xdp_frame in the headroom space, as the headroom is
    0.

    For the record, below is the call graph:

    i40e_vsi_open
     i40e_vsi_setup_rx_resources
      i40e_setup_rx_descriptors
       i40e_rx_offset()

    Co-developed-by: Jesper Dangaard Brouer
    Signed-off-by: Jesper Dangaard Brouer
    Signed-off-by: Maciej Fijalkowski
    Acked-by: Jesper Dangaard Brouer
    Tested-by: Jesper Dangaard Brouer
    Tested-by: Kiran Bhandare
    Signed-off-by: Tony Nguyen

    Maciej Fijalkowski
     
  • Fix the wrong napi work done reporting in the xsk path of the ice
    driver. The code in the main Rx processing loop was written to assume
    that the buffer allocation code returns true if all allocations were
    successful and false if not. In contrast with all other Intel NIC xsk
    drivers, the ice_alloc_rx_bufs_zc() has the inverted logic messing up
    the work done reporting in the napi loop.

    This can be fixed either by inverting the return value from
    ice_alloc_rx_bufs_zc() in the function that uses this in an incorrect
    way, or by changing the return value of ice_alloc_rx_bufs_zc(). We
    chose the latter as it makes all the xsk allocation functions for
    Intel NICs behave in the same way. My guess is that it was this
    unexpected discrepancy that gave rise to this bug in the first place.

    Fixes: 5bb0c4b5eb61 ("ice, xsk: Move Rx allocation out of while-loop")
    Reported-by: Maciej Fijalkowski
    Signed-off-by: Magnus Karlsson
    Tested-by: Kiran Bhandare
    Signed-off-by: Tony Nguyen

    Magnus Karlsson
     
  • Maxim Mikityanskiy says:

    ====================
    Bugfixes for HTB

    The HTB offload feature introduced a few bugs in HTB. One affects the
    non-offload mode, preventing attaching qdiscs to HTB classes, and the
    other affects the error flow, when the netdev doesn't support the
    offload, but it was requested. This short series fixes them.
    ====================

    Acked-by: Cong Wang
    Signed-off-by: David S. Miller

    David S. Miller
     
  • htb_init may fail to do the offload if it's not supported or if a
    runtime error happens when allocating direct qdiscs. In those cases
    TC_HTB_CREATE command is not sent to the driver, however, htb_destroy
    gets called anyway and attempts to send TC_HTB_DESTROY.

    It shouldn't happen, because the driver didn't receive TC_HTB_CREATE,
    and also because the driver may not support ndo_setup_tc at all, while
    q->offload is true, and htb_destroy mistakenly thinks the offload is
    supported. Trying to call ndo_setup_tc in the latter case will lead to a
    NULL pointer dereference.

    This commit fixes the issues with htb_destroy by deferring assignment of
    q->offload until after the TC_HTB_CREATE command. The necessary cleanup
    of the offload entities is already done in htb_init.

    Reported-by: syzbot+b53a709f04722ca12a3c@syzkaller.appspotmail.com
    Fixes: d03b195b5aa0 ("sch_htb: Hierarchical QoS hardware offload")
    Suggested-by: Eric Dumazet
    Signed-off-by: Maxim Mikityanskiy
    Reviewed-by: Tariq Toukan
    Signed-off-by: David S. Miller

    Maxim Mikityanskiy
     
  • htb_select_queue assumes it's always the offload mode, and it ends up in
    calling ndo_setup_tc without any checks. It may lead to a NULL pointer
    dereference if ndo_setup_tc is not implemented, or to an error returned
    from the driver, which will prevent attaching qdiscs to HTB classes in
    the non-offload mode.

    This commit fixes the bug by adding the missing check to
    htb_select_queue. In the non-offload mode it will return sch->dev_queue,
    mimicking tc_modify_qdisc's behavior for the case where select_queue is
    not implemented.

    Reported-by: syzbot+b53a709f04722ca12a3c@syzkaller.appspotmail.com
    Fixes: d03b195b5aa0 ("sch_htb: Hierarchical QoS hardware offload")
    Signed-off-by: Maxim Mikityanskiy
    Reviewed-by: Tariq Toukan
    Signed-off-by: David S. Miller

    Maxim Mikityanskiy
     
  • Per the datasheet, when we clear the power down bit, the PHY remains in
    an internal reset state for 40us and then resumes normal operation.
    Account for that delay to avoid any issues in the future if
    genphy_resume() changes.

    Fixes: fe26821fa614 ("net: phy: broadcom: Wire suspend/resume for BCM54810")
    Signed-off-by: Florian Fainelli
    Signed-off-by: David S. Miller

    Florian Fainelli
     
  • setup_fritz() in avmfritz.c might fail with -EIO, and in this case
    isac.type and isac.write_reg are not initialized and remain 0 (NULL).
    A subsequent call to isac_release() will dereference isac->write_reg and
    crash.

    [ 1.737444] BUG: kernel NULL pointer dereference, address: 0000000000000000
    [ 1.737809] #PF: supervisor instruction fetch in kernel mode
    [ 1.738106] #PF: error_code(0x0010) - not-present page
    [ 1.738378] PGD 0 P4D 0
    [ 1.738515] Oops: 0010 [#1] SMP NOPTI
    [ 1.738711] CPU: 0 PID: 180 Comm: systemd-udevd Not tainted 5.12.0-rc2+ #78
    [ 1.739077] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-48-gd9c812dda519-prebuilt.qemu.org 04/01/2014
    [ 1.739664] RIP: 0010:0x0
    [ 1.739807] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
    [ 1.740200] RSP: 0018:ffffc9000027ba10 EFLAGS: 00010202
    [ 1.740478] RAX: 0000000000000000 RBX: ffff888102f41840 RCX: 0000000000000027
    [ 1.740853] RDX: 00000000000000ff RSI: 0000000000000020 RDI: ffff888102f41800
    [ 1.741226] RBP: ffffc9000027ba20 R08: ffff88817bc18440 R09: ffffc9000027b808
    [ 1.741600] R10: 0000000000000001 R11: 0000000000000001 R12: ffff888102f41840
    [ 1.741976] R13: 00000000fffffffb R14: ffff888102f41800 R15: ffff8881008b0000
    [ 1.742351] FS: 00007fda3a38a8c0(0000) GS:ffff88817bc00000(0000) knlGS:0000000000000000
    [ 1.742774] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 1.743076] CR2: ffffffffffffffd6 CR3: 00000001021ec000 CR4: 00000000000006f0
    [ 1.743452] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [ 1.743828] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [ 1.744206] Call Trace:
    [ 1.744339] isac_release+0xcc/0xe0 [mISDNipac]
    [ 1.744582] fritzpci_probe.cold+0x282/0x739 [avmfritz]
    [ 1.744861] local_pci_probe+0x48/0x80
    [ 1.745063] pci_device_probe+0x10f/0x1c0
    [ 1.745278] really_probe+0xfb/0x420
    [ 1.745471] driver_probe_device+0xe9/0x160
    [ 1.745693] device_driver_attach+0x5d/0x70
    [ 1.745917] __driver_attach+0x8f/0x150
    [ 1.746123] ? device_driver_attach+0x70/0x70
    [ 1.746354] bus_for_each_dev+0x7e/0xc0
    [ 1.746560] driver_attach+0x1e/0x20
    [ 1.746751] bus_add_driver+0x152/0x1f0
    [ 1.746957] driver_register+0x74/0xd0
    [ 1.747157] ? 0xffffffffc00d8000
    [ 1.747334] __pci_register_driver+0x54/0x60
    [ 1.747562] AVM_init+0x36/0x1000 [avmfritz]
    [ 1.747791] do_one_initcall+0x48/0x1d0
    [ 1.747997] ? __cond_resched+0x19/0x30
    [ 1.748206] ? kmem_cache_alloc_trace+0x390/0x440
    [ 1.748458] ? do_init_module+0x28/0x250
    [ 1.748669] do_init_module+0x62/0x250
    [ 1.748870] load_module+0x23ee/0x26a0
    [ 1.749073] __do_sys_finit_module+0xc2/0x120
    [ 1.749307] ? __do_sys_finit_module+0xc2/0x120
    [ 1.749549] __x64_sys_finit_module+0x1a/0x20
    [ 1.749782] do_syscall_64+0x38/0x90

    Signed-off-by: Tong Zhang
    Signed-off-by: David S. Miller

    Tong Zhang
     
  • In qlcnic_83xx_get_minidump_template, fw_dump->tmpl_hdr is freed by
    vfree(), but it is still used afterwards when extended is true.

    Fixes: 7061b2bdd620e ("qlogic: Deletion of unnecessary checks before two function calls")
    Signed-off-by: Lv Yunlong
    Signed-off-by: David S. Miller

    Lv Yunlong
     
  • Tony Nguyen says:

    ====================
    Intel Wired LAN Driver Updates 2021-03-11

    This series contains updates to igc and e1000e drivers.

    Sasha adds locking to reset task to prevent race condition for igc.

    Muhammad fixes reporting of supported pause frame as well as advertised
    pause frame for Tx/Rx off for igc.

    Andre fixes timestamp retrieval from the wrong timer for igc.

    Vitaly adds locking to reset task to prevent race condition for e1000e.

    Dinghao Liu adds a missed check to return on error in
    e1000_set_d0_lplu_state_82571.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Introduce the new function tw_prot_init (inspired by
    req_prot_init) to simplify "proto_register" function.

    tw_prot_cleanup will take care of a partially initialized
    timewait_sock_ops.

    Signed-off-by: Tonghao Zhang
    Reviewed-by: Alexander Duyck
    Signed-off-by: David S. Miller

    Tonghao Zhang
     
  • There is one e1e_wphy() call in e1000_set_d0_lplu_state_82571
    whose return value is captured but never checked.
    Check it and terminate the execution flow on error, just like the
    other e1e_wphy() calls in this function.

    Fixes: bc7f75fa9788 ("[E1000E]: New pci-express e1000 driver (currently for ICH9 devices only)")
    Signed-off-by: Dinghao Liu
    Acked-by: Sasha Neftin
    Tested-by: Dvora Fuxbrumer
    Signed-off-by: Tony Nguyen

    Dinghao Liu
     
  • A possible race condition was found in e1000_reset_task,
    after discovering a similar issue in igb driver via
    commit 024a8168b749 ("igb: reinit_locked() should be called
    with rtnl_lock").

    Added rtnl_lock() and rtnl_unlock() to avoid this.

    Fixes: bc7f75fa9788 ("[E1000E]: New pci-express e1000 driver (currently for ICH9 devices only)")
    Suggested-by: Jakub Kicinski
    Signed-off-by: Vitaly Lifshits
    Tested-by: Dvora Fuxbrumer
    Signed-off-by: Tony Nguyen

    Vitaly Lifshits