12 Mar, 2021

8 commits

  • Tony Nguyen says:

    ====================
    Intel Wired LAN Driver Updates 2021-03-11

    This series contains updates to igc and e1000e drivers.

    Sasha adds locking to reset task to prevent race condition for igc.

    Muhammad fixes reporting of supported pause frame as well as advertised
    pause frame for Tx/Rx off for igc.

    Andre fixes timestamp retrieval from the wrong timer for igc.

    Vitaly adds locking to reset task to prevent race condition for e1000e.

    Dinghao Liu adds a missed check to return on error in
    e1000_set_d0_lplu_state_82571.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Introduce the new function tw_prot_init (inspired by
    req_prot_init) to simplify "proto_register" function.

    tw_prot_cleanup will take care of a partially initialized
    timewait_sock_ops.

    Signed-off-by: Tonghao Zhang
    Reviewed-by: Alexander Duyck
    Signed-off-by: David S. Miller

    Tonghao Zhang
     
  • One e1e_wphy() call in e1000_set_d0_lplu_state_82571 captures its
    return value but never acts on it. Check the value and return on
    error, as is done for the other e1e_wphy() calls in this function.

    Fixes: bc7f75fa9788 ("[E1000E]: New pci-express e1000 driver (currently for ICH9 devices only)")
    Signed-off-by: Dinghao Liu
    Acked-by: Sasha Neftin
    Tested-by: Dvora Fuxbrumer
    Signed-off-by: Tony Nguyen

    Dinghao Liu
     
  • A possible race condition was found in e1000_reset_task,
    after discovering a similar issue in igb driver via
    commit 024a8168b749 ("igb: reinit_locked() should be called
    with rtnl_lock").

    Added rtnl_lock() and rtnl_unlock() to avoid this.

    Fixes: bc7f75fa9788 ("[E1000E]: New pci-express e1000 driver (currently for ICH9 devices only)")
    Suggested-by: Jakub Kicinski
    Signed-off-by: Vitaly Lifshits
    Tested-by: Dvora Fuxbrumer
    Signed-off-by: Tony Nguyen

    Vitaly Lifshits
     
  • The comment describing the timestamps layout in the packet buffer is
    wrong and the code is actually retrieving the timestamp in Timer 1
    reference instead of Timer 0. This hasn't been a big issue so far
    because hardware is configured to report both timestamps using Timer 0
    (see IGC_SRRCTL register configuration in igc_ptp_enable_rx_timestamp()
    helper). This patch fixes the comment and the code so we retrieve the
    timestamp in Timer 0 reference as expected.

    This patch also takes the opportunity to get rid of the hw.mac.type check
    since it is not required.

    Fixes: 81b055205e8ba ("igc: Add support for RX timestamping")
    Signed-off-by: Andre Guedes
    Signed-off-by: Vedang Patel
    Signed-off-by: Jithu Joseph
    Reviewed-by: Maciej Fijalkowski
    Tested-by: Dvora Fuxbrumer
    Signed-off-by: Tony Nguyen

    Andre Guedes
     
  • The Supported Pause Frame field always displays "No", even though the
    Advertised pause frame field shows the correct setting based on the pause
    parameters set via ethtool. Set the bit in link_ksettings to "Supported"
    for Pause Frame.

    Before output:
    Supported pause frame use: No

    Expected output:
    Supported pause frame use: Symmetric

    Fixes: 8c5ad0dae93c ("igc: Add ethtool support")
    Signed-off-by: Muhammad Husaini Zulkifli
    Reviewed-by: Malli C
    Tested-by: Dvora Fuxbrumer
    Acked-by: Sasha Neftin
    Signed-off-by: Tony Nguyen

    Muhammad Husaini Zulkifli
     
  • Fix Pause Frame advertising as reported via ethtool.
    Stop setting the "advertising" bit in link_ksettings in the default
    case, when Tx and Rx are both off and Auto Negotiate is off.

    Original output for the advertised link modes with Tx and Rx off:
    Advertised pause frame use: Symmetric Receive-only

    Expected output:
    Advertised pause frame use: No

    Fixes: 8c5ad0dae93c ("igc: Add ethtool support")
    Signed-off-by: Muhammad Husaini Zulkifli
    Reviewed-by: Malli C
    Acked-by: Sasha Neftin
    Tested-by: Dvora Fuxbrumer
    Signed-off-by: Tony Nguyen

    Muhammad Husaini Zulkifli
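
    The ethtool strings quoted in the two pause frame entries above can be
    read as a decoding of two link mode bits. The sketch below assumes the
    usual ethtool decoding of the Pause and Asym_Pause bits; the macro and
    function names are hypothetical and this is not the igc or ethtool
    source.

```c
#include <assert.h>

#define SKETCH_PAUSE      (1u << 0) /* stands in for the Pause bit */
#define SKETCH_ASYM_PAUSE (1u << 1) /* stands in for the Asym_Pause bit */

/* Map the two pause bits to the label printed after "pause frame use:". */
static const char *sketch_pause_label(unsigned int modes)
{
    assert((modes & ~(SKETCH_PAUSE | SKETCH_ASYM_PAUSE)) == 0);

    if (modes & SKETCH_PAUSE)
        return (modes & SKETCH_ASYM_PAUSE) ? "Symmetric Receive-only"
                                           : "Symmetric";
    return (modes & SKETCH_ASYM_PAUSE) ? "Transmit-only" : "No";
}
```

    Under that assumption, "Symmetric Receive-only" with Tx and Rx off means
    both bits were set, and the expected "No" means neither bit is set.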
     
  • This commit applies to igc_reset_task the same changes that
    were applied to the igb driver in commit 024a8168b749 ("igb:
    reinit_locked() should be called with rtnl_lock")
    and fixes a possible race in the reset subtask.

    Fixes: 0507ef8a0372 ("igc: Add transmit and receive fastpath and interrupt handlers")
    Suggested-by: Jakub Kicinski
    Signed-off-by: Sasha Neftin
    Tested-by: Dvora Fuxbrumer
    Signed-off-by: Tony Nguyen

    Sasha Neftin
     

11 Mar, 2021

31 commits

  • Similar to commit 92696286f3bb37ba50e4bd8d1beb24afb759a799 ("net:
    bcmgenet: Set phydev->dev_flags only for internal PHYs") we need to
    qualify the phydev->dev_flags based on whether the port is connected to
    an internal or external PHY otherwise we risk having a flags collision
    with a completely different interpretation depending on the driver.

    Fixes: aa9aef77c761 ("net: dsa: bcm_sf2: communicate integrated PHY revision to PHY driver")
    Signed-off-by: Florian Fainelli
    Signed-off-by: David S. Miller

    Florian Fainelli
     
  • The bcm_sf2 driver uses the b53 driver as a library but does not make
    use of the b53_setup() function, so it failed to inherit the
    vlan_filtering_is_global attribute. Fix this by moving the assignment to
    b53_switch_alloc() which is used by bcm_sf2.

    Fixes: 7228b23e68f7 ("net: dsa: b53: Let DSA handle mismatched VLAN filtering settings")
    Signed-off-by: Florian Fainelli
    Signed-off-by: David S. Miller

    Florian Fainelli
     
  • iproute2 package is well behaved, but malicious user space can
    provide illegal shift values and trigger UBSAN reports.

    Add stab parameter to red_check_params() to validate user input.

    syzbot reported:

    UBSAN: shift-out-of-bounds in ./include/net/red.h:312:18
    shift exponent 111 is too large for 64-bit type 'long unsigned int'
    CPU: 1 PID: 14662 Comm: syz-executor.3 Not tainted 5.12.0-rc2-syzkaller #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:
    __dump_stack lib/dump_stack.c:79 [inline]
    dump_stack+0x141/0x1d7 lib/dump_stack.c:120
    ubsan_epilogue+0xb/0x5a lib/ubsan.c:148
    __ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:327
    red_calc_qavg_from_idle_time include/net/red.h:312 [inline]
    red_calc_qavg include/net/red.h:353 [inline]
    choke_enqueue.cold+0x18/0x3dd net/sched/sch_choke.c:221
    __dev_xmit_skb net/core/dev.c:3837 [inline]
    __dev_queue_xmit+0x1943/0x2e00 net/core/dev.c:4150
    neigh_hh_output include/net/neighbour.h:499 [inline]
    neigh_output include/net/neighbour.h:508 [inline]
    ip6_finish_output2+0x911/0x1700 net/ipv6/ip6_output.c:117
    __ip6_finish_output net/ipv6/ip6_output.c:182 [inline]
    __ip6_finish_output+0x4c1/0xe10 net/ipv6/ip6_output.c:161
    ip6_finish_output+0x35/0x200 net/ipv6/ip6_output.c:192
    NF_HOOK_COND include/linux/netfilter.h:290 [inline]
    ip6_output+0x1e4/0x530 net/ipv6/ip6_output.c:215
    dst_output include/net/dst.h:448 [inline]
    NF_HOOK include/linux/netfilter.h:301 [inline]
    NF_HOOK include/linux/netfilter.h:295 [inline]
    ip6_xmit+0x127e/0x1eb0 net/ipv6/ip6_output.c:320
    inet6_csk_xmit+0x358/0x630 net/ipv6/inet6_connection_sock.c:135
    dccp_transmit_skb+0x973/0x12c0 net/dccp/output.c:138
    dccp_send_reset+0x21b/0x2b0 net/dccp/output.c:535
    dccp_finish_passive_close net/dccp/proto.c:123 [inline]
    dccp_finish_passive_close+0xed/0x140 net/dccp/proto.c:118
    dccp_terminate_connection net/dccp/proto.c:958 [inline]
    dccp_close+0xb3c/0xe60 net/dccp/proto.c:1028
    inet_release+0x12e/0x280 net/ipv4/af_inet.c:431
    inet6_release+0x4c/0x70 net/ipv6/af_inet6.c:478
    __sock_release+0xcd/0x280 net/socket.c:599
    sock_close+0x18/0x20 net/socket.c:1258
    __fput+0x288/0x920 fs/file_table.c:280
    task_work_run+0xdd/0x1a0 kernel/task_work.c:140
    tracehook_notify_resume include/linux/tracehook.h:189 [inline]

    Fixes: 8afa10cbe281 ("net_sched: red: Avoid illegal values")
    Signed-off-by: Eric Dumazet
    Reported-by: syzbot
    Signed-off-by: David S. Miller

    Eric Dumazet
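
    A minimal userspace sketch of the kind of validation described above:
    reject a user-supplied table if any entry is too large to be used as a
    shift exponent. The function name, the table size and the limit of 32
    are assumptions for illustration, not a copy of red_check_params().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_STAB_SIZE 256 /* assumed table size for this sketch */
#define SKETCH_MAX_SHIFT 32  /* assumed exclusive upper bound on a shift */

/* Return true only if every entry is safe to use as a shift exponent.
 * A NULL table means there is nothing to validate. */
static bool sketch_stab_is_valid(const uint8_t *stab)
{
    if (!stab)
        return true;

    for (size_t i = 0; i < SKETCH_STAB_SIZE; i++)
        if (stab[i] >= SKETCH_MAX_SHIFT)
            return false;

    return true;
}
```

    An entry such as the 111 seen in the report above would be rejected
    before it could reach a shift operation.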
     
  • mlx5-fixes-2021-03-10

    Signed-off-by: David S. Miller

    David S. Miller
     
  • BCM4908 uses 2 Gbps link between switch and the Ethernet interface.
    Without this BCM4908 devices were able to achieve only 2 x ~895 Mb/s.
    This allows handling e.g. NAT traffic with 940 Mb/s.

    Signed-off-by: Rafał Miłecki
    Acked-by: Florian Fainelli
    Signed-off-by: David S. Miller

    Rafał Miłecki
     
  • pxa168_eth_remove() firstly calls unregister_netdev(),
    then cancels a timeout work. unregister_netdev() shuts down a device
    interface and removes it from the kernel tables. If the timeout occurs
    in parallel, the timeout work (pxa168_eth_tx_timeout_task) performs stop
    and open of the device. It may lead to an inconsistent state and memory
    leaks.

    Found by Linux Driver Verification project (linuxtesting.org).

    Signed-off-by: Pavel Andrianov
    Signed-off-by: David S. Miller

    Pavel Andrianov
     
  • macvlan_count_rx() can be called from process context, so it is
    necessary to disable preemption before calling u64_stats_update_begin().

    syzbot was able to spot this on 32bit arch:

    WARNING: CPU: 1 PID: 4632 at include/linux/seqlock.h:271 __seqprop_assert include/linux/seqlock.h:271 [inline]
    WARNING: CPU: 1 PID: 4632 at include/linux/seqlock.h:271 __seqprop_assert.constprop.0+0xf0/0x11c include/linux/seqlock.h:269
    Modules linked in:
    Kernel panic - not syncing: panic_on_warn set ...
    CPU: 1 PID: 4632 Comm: kworker/1:3 Not tainted 5.12.0-rc2-syzkaller #0
    Hardware name: ARM-Versatile Express
    Workqueue: events macvlan_process_broadcast
    Backtrace:
    [] (dump_backtrace) from [] (show_stack+0x18/0x1c arch/arm/kernel/traps.c:252)
    r7:00000080 r6:60000093 r5:00000000 r4:8422a3c4
    [] (show_stack) from [] (__dump_stack lib/dump_stack.c:79 [inline])
    [] (show_stack) from [] (dump_stack+0xb8/0xe8 lib/dump_stack.c:120)
    [] (dump_stack) from [] (panic+0x130/0x378 kernel/panic.c:231)
    r7:830209b4 r6:84069ea4 r5:00000000 r4:844350d0
    [] (panic) from [] (__warn+0xb0/0x164 kernel/panic.c:605)
    r3:8404ec8c r2:00000000 r1:00000000 r0:830209b4
    r7:0000010f
    [] (__warn) from [] (warn_slowpath_fmt+0x68/0xd4 kernel/panic.c:628)
    r7:81363f70 r6:0000010f r5:83018e50 r4:00000000
    [] (warn_slowpath_fmt) from [] (__seqprop_assert include/linux/seqlock.h:271 [inline])
    [] (warn_slowpath_fmt) from [] (__seqprop_assert.constprop.0+0xf0/0x11c include/linux/seqlock.h:269)
    r8:5a109000 r7:0000000f r6:a568dac0 r5:89802300 r4:00000001
    [] (__seqprop_assert.constprop.0) from [] (u64_stats_update_begin include/linux/u64_stats_sync.h:128 [inline])
    [] (__seqprop_assert.constprop.0) from [] (macvlan_count_rx include/linux/if_macvlan.h:47 [inline])
    [] (__seqprop_assert.constprop.0) from [] (macvlan_broadcast+0x154/0x26c drivers/net/macvlan.c:291)
    r5:89802300 r4:8a927740
    [] (macvlan_broadcast) from [] (macvlan_process_broadcast+0x258/0x2d0 drivers/net/macvlan.c:317)
    r10:81364f78 r9:8a86d000 r8:8a9c7e7c r7:8413aa5c r6:00000000 r5:00000000
    r4:89802840
    [] (macvlan_process_broadcast) from [] (process_one_work+0x2d4/0x998 kernel/workqueue.c:2275)
    r10:00000008 r9:8404ec98 r8:84367a02 r7:ddfe6400 r6:ddfe2d40 r5:898dac80
    r4:8a86d43c
    [] (process_one_work) from [] (worker_thread+0x64/0x54c kernel/workqueue.c:2421)
    r10:00000008 r9:8a9c6000 r8:84006d00 r7:ddfe2d78 r6:898dac94 r5:ddfe2d40
    r4:898dac80
    [] (worker_thread) from [] (kthread+0x184/0x1a4 kernel/kthread.c:292)
    r10:85247e64 r9:898dac80 r8:80269d68 r7:00000000 r6:8a9c6000 r5:89a2ee40
    r4:8a97bd00
    [] (kthread) from [] (ret_from_fork+0x14/0x20 arch/arm/kernel/entry-common.S:158)
    Exception stack(0x8a9c7fb0 to 0x8a9c7ff8)

    Fixes: 412ca1550cbe ("macvlan: Move broadcasts into a work queue")
    Signed-off-by: Eric Dumazet
    Cc: Herbert Xu
    Reported-by: syzbot
    Acked-by: Herbert Xu
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • In the rare case that drop_monitor fails to register its probe on the
    'napi_poll' tracepoint, it will not deactivate its hysteresis timer as
    part of the error path. If the hysteresis timer was armed by the
    short-lived 'kfree_skb' probe and user space retries to initiate tracing, a
    warning will be emitted for trying to initialize an active object [1].

    Fix this by properly undoing all the operations that were done prior to
    probe registration, in both software and hardware code paths.

    Note that syzkaller managed to fail probe registration by injecting a
    slab allocation failure [2].

    [1]
    ODEBUG: init active (active state 0) object type: timer_list hint: sched_send_work+0x0/0x60 include/linux/list.h:135
    WARNING: CPU: 1 PID: 8649 at lib/debugobjects.c:505 debug_print_object+0x16e/0x250 lib/debugobjects.c:505
    Modules linked in:
    CPU: 1 PID: 8649 Comm: syz-executor.0 Not tainted 5.11.0-syzkaller #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    RIP: 0010:debug_print_object+0x16e/0x250 lib/debugobjects.c:505
    [...]
    Call Trace:
    __debug_object_init+0x524/0xd10 lib/debugobjects.c:588
    debug_timer_init kernel/time/timer.c:722 [inline]
    debug_init kernel/time/timer.c:770 [inline]
    init_timer_key+0x2d/0x340 kernel/time/timer.c:814
    net_dm_trace_on_set net/core/drop_monitor.c:1111 [inline]
    set_all_monitor_traces net/core/drop_monitor.c:1188 [inline]
    net_dm_monitor_start net/core/drop_monitor.c:1295 [inline]
    net_dm_cmd_trace+0x720/0x1220 net/core/drop_monitor.c:1339
    genl_family_rcv_msg_doit+0x228/0x320 net/netlink/genetlink.c:739
    genl_family_rcv_msg net/netlink/genetlink.c:783 [inline]
    genl_rcv_msg+0x328/0x580 net/netlink/genetlink.c:800
    netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2502
    genl_rcv+0x24/0x40 net/netlink/genetlink.c:811
    netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline]
    netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338
    netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927
    sock_sendmsg_nosec net/socket.c:652 [inline]
    sock_sendmsg+0xcf/0x120 net/socket.c:672
    ____sys_sendmsg+0x6e8/0x810 net/socket.c:2348
    ___sys_sendmsg+0xf3/0x170 net/socket.c:2402
    __sys_sendmsg+0xe5/0x1b0 net/socket.c:2435
    do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
    entry_SYSCALL_64_after_hwframe+0x44/0xae

    [2]
    FAULT_INJECTION: forcing a failure.
    name failslab, interval 1, probability 0, space 0, times 1
    CPU: 1 PID: 8645 Comm: syz-executor.0 Not tainted 5.11.0-syzkaller #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:
    dump_stack+0xfa/0x151
    should_fail.cold+0x5/0xa
    should_failslab+0x5/0x10
    __kmalloc+0x72/0x3f0
    tracepoint_add_func+0x378/0x990
    tracepoint_probe_register+0x9c/0xe0
    net_dm_cmd_trace+0x7fc/0x1220
    genl_family_rcv_msg_doit+0x228/0x320
    genl_rcv_msg+0x328/0x580
    netlink_rcv_skb+0x153/0x420
    genl_rcv+0x24/0x40
    netlink_unicast+0x533/0x7d0
    netlink_sendmsg+0x856/0xd90
    sock_sendmsg+0xcf/0x120
    ____sys_sendmsg+0x6e8/0x810
    ___sys_sendmsg+0xf3/0x170
    __sys_sendmsg+0xe5/0x1b0
    do_syscall_64+0x2d/0x70
    entry_SYSCALL_64_after_hwframe+0x44/0xae

    Fixes: 70c69274f354 ("drop_monitor: Initialize timer and work item upon tracing enable")
    Fixes: 8ee2267ad33e ("drop_monitor: Convert to using devlink tracepoint")
    Reported-by: syzbot+779559d6503f3a56213d@syzkaller.appspotmail.com
    Signed-off-by: Ido Schimmel
    Reviewed-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Ido Schimmel
     
  • Daniel Borkmann says:

    ====================
    pull-request: bpf 2021-03-10

    The following pull-request contains BPF updates for your *net* tree.

    We've added 8 non-merge commits during the last 5 day(s) which contain
    a total of 11 files changed, 136 insertions(+), 17 deletions(-).

    The main changes are:

    1) Reject bogus use of vmlinux BTF as map/prog creation BTF, from Alexei Starovoitov.

    2) Fix allocation failure splat in x86 JIT for large progs. Also fix overwriting
    percpu cgroup storage from tracing programs when nested, from Yonghong Song.

    3) Fix rx queue retrieval in XDP for multi-queue veth, from Maciej Fijalkowski.

    4) Fix bpf_check_mtu() helper API before freeze to have mtu_len as custom skb/xdp
    L3 input length, from Jesper Dangaard Brouer.

    5) Fix inode_storage's lookup_elem return value upon having bad fd, from Tal Lossos.

    6) Fix bpftool and libbpf cross-build on MacOS, from Georgi Valkov.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Syzbot reported suspicious RCU usage in nexthop_fib6_nh() when
    called from ipv6_route_seq_show(). The reason is that
    ipv6_route_seq_start() calls rcu_read_lock_bh(), while nexthop_fib6_nh()
    calls rcu_dereference_rtnl().
    The proposed fix is to add a variant of nexthop_fib6_nh() that uses
    rcu_dereference_bh_rtnl() for ipv6_route_seq_show().

    The reported trace is as follows:
    ./include/net/nexthop.h:416 suspicious rcu_dereference_check() usage!

    other info that might help us debug this:

    rcu_scheduler_active = 2, debug_locks = 1
    2 locks held by syz-executor.0/17895:
    at: seq_read+0x71/0x12a0 fs/seq_file.c:169
    at: seq_file_net include/linux/seq_file_net.h:19 [inline]
    at: ipv6_route_seq_start+0xaf/0x300 net/ipv6/ip6_fib.c:2616

    stack backtrace:
    CPU: 1 PID: 17895 Comm: syz-executor.0 Not tainted 4.15.0-syzkaller #0
    Call Trace:
    [] __dump_stack lib/dump_stack.c:17 [inline]
    [] dump_stack+0xd8/0x147 lib/dump_stack.c:53
    [] lockdep_rcu_suspicious+0x153/0x15d kernel/locking/lockdep.c:5745
    [] nexthop_fib6_nh include/net/nexthop.h:416 [inline]
    [] ipv6_route_native_seq_show net/ipv6/ip6_fib.c:2488 [inline]
    [] ipv6_route_seq_show+0x436/0x7a0 net/ipv6/ip6_fib.c:2673
    [] seq_read+0xccf/0x12a0 fs/seq_file.c:276
    [] proc_reg_read+0x10c/0x1d0 fs/proc/inode.c:231
    [] do_loop_readv_writev fs/read_write.c:714 [inline]
    [] do_loop_readv_writev fs/read_write.c:701 [inline]
    [] do_iter_read+0x49e/0x660 fs/read_write.c:935
    [] vfs_readv+0xfb/0x170 fs/read_write.c:997
    [] kernel_readv fs/splice.c:361 [inline]
    [] default_file_splice_read+0x487/0x9c0 fs/splice.c:416
    [] do_splice_to+0x129/0x190 fs/splice.c:879
    [] splice_direct_to_actor+0x256/0x890 fs/splice.c:951
    [] do_splice_direct+0x1dd/0x2b0 fs/splice.c:1060
    [] do_sendfile+0x597/0xce0 fs/read_write.c:1459
    [] SYSC_sendfile64 fs/read_write.c:1520 [inline]
    [] SyS_sendfile64+0x155/0x170 fs/read_write.c:1506
    [] do_syscall_64+0x1ff/0x310 arch/x86/entry/common.c:305
    [] entry_SYSCALL_64_after_hwframe+0x42/0xb7

    Fixes: f88d8ea67fbdb ("ipv6: Plumb support for nexthop object in a fib6_info")
    Reported-by: syzbot
    Signed-off-by: Wei Wang
    Cc: David Ahern
    Cc: Ido Schimmel
    Cc: Petr Machata
    Cc: Eric Dumazet
    Reviewed-by: Ido Schimmel
    Reviewed-by: David Ahern
    Signed-off-by: David S. Miller

    Wei Wang
     
  • Daniel Borkmann says:

    ====================
    Fix ip6ip6 crash for collect_md skbs

    Fix a NULL pointer deref panic I ran into for regular ip6ip6 tunnel devices
    when collect_md populated skbs were redirected to them for xmit. See patches
    for further details, thanks!
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • I ran into a crash where an ip6ip6 tunnel device which was /not/
    set to collect_md mode was receiving collect_md populated skbs for xmit.

    The BPF prog was populating the skb via bpf_skb_set_tunnel_key() which is
    assigning special metadata dst entry and then redirecting the skb to the
    device, taking ip6_tnl_start_xmit() -> ipxip6_tnl_xmit() -> ip6_tnl_xmit()
    and in the latter it performs a neigh lookup based on skb_dst(skb) where
    we trigger a NULL pointer dereference on dst->ops->neigh_lookup() since
    the md_dst_ops do not populate neigh_lookup callback with a fake handler.

    Transform the md_dst_ops into generic dst_blackhole_ops that can also be
    reused elsewhere when needed, and use them for the metadata dst entries as
    callback ops.

    Also, remove the dst_md_discard{,_out}() ops and rely on dst_discard{,_out}()
    from dst_init() which free the skb the same way modulo the splat. Given we
    will be able to recover just fine from there, avoid any potential splats
    if this ever gets triggered in the future (or worse, panic on warns when set).

    Fixes: f38a9eb1f77b ("dst: Metadata destinations")
    Signed-off-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Daniel Borkmann
     
  • Move generic blackhole dst ops to the core and use them from both
    ipv4_dst_blackhole_ops and ip6_dst_blackhole_ops where possible. No
    functional change otherwise. We need these also in other locations
    and having to define them over and over again is not great.

    Signed-off-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Daniel Borkmann
     
  • Fix 32-bit variable shift wrapping in dr_ste_v1_get_miss_addr.

    Fixes: a6098129c781 ("net/mlx5: DR, Add STEv1 setters and getters")
    Reported-by: Dan Carpenter
    Signed-off-by: Yevgeny Kliteynik
    Reviewed-by: Alex Vesker
    Signed-off-by: Saeed Mahameed

    Yevgeny Kliteynik
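
    A minimal sketch of 32-bit shift wrapping in general, assuming a value
    assembled from a low field and an 8-bit high field and then scaled by
    64. The function names and field widths are hypothetical; this is not
    the mlx5 code.

```c
#include <assert.h>
#include <stdint.h>

/* Buggy: all arithmetic happens in 32 bits, so anything shifted past
 * bit 31 is silently dropped before the result is widened. */
static uint64_t sketch_combine_buggy(uint32_t lo, uint32_t hi)
{
    assert(hi <= 0xFF);
    return (lo | (hi << 26)) << 6;
}

/* Fixed: widen each operand to 64 bits before shifting. */
static uint64_t sketch_combine_fixed(uint32_t lo, uint32_t hi)
{
    assert(hi <= 0xFF);
    return ((uint64_t)lo | ((uint64_t)hi << 26)) << 6;
}
```

    Both variants agree while the result fits in 32 bits and diverge as soon
    as the high field contributes bits above bit 31.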
     
  • When an SF id is unavailable, the code jumps to the wrong label, which
    accesses the sw id array outside of its range.
    Hence, when an SF id is not allocated, avoid accessing the array.

    Fixes: 8f0105418668 ("net/mlx5: SF, Add port add delete functionality")
    Signed-off-by: Shay Drory
    Reviewed-by: Parav Pandit
    Signed-off-by: Saeed Mahameed

    Shay Drory
     
  • The patch cited in the Fixes tag failed to free the allocated work.
    Fix it by freeing the work after work execution.

    Fixes: f3196bb0f14c ("net/mlx5: Introduce vhca state event notifier")
    Signed-off-by: Shay Drory
    Reviewed-by: Parav Pandit
    Signed-off-by: Saeed Mahameed

    Shay Drory
     
  • Fix vhca context size as defined by device interface specification.

    Fixes: f3196bb0f14c ("net/mlx5: Introduce vhca state event notifier")
    Signed-off-by: Parav Pandit
    Signed-off-by: Saeed Mahameed

    Parav Pandit
     
  • do_div() returns the remainder, while the cited patch wanted to use
    the quotient.
    Fix it by using the quotient.

    Fixes: 0e22bfb7c046 ("net/mlx5e: E-switch, Fix rate calculation for overflow")
    Signed-off-by: Parav Pandit
    Signed-off-by: Maor Dickman
    Signed-off-by: Saeed Mahameed

    Parav Pandit
     
  • 1. Don't set the ts_format bit to default when it is reserved (the
    device is then running in the old, free running mode).
    2. XRC doesn't have a CQ, therefore the ts format in the QP
    context should be default / free running.
    3. Set ts_format to WQ.

    Fixes: 2fe8d4b87802 ("RDMA/mlx5: Fail QP creation if the device can not support the CQE TS")
    Signed-off-by: Maor Gottlieb
    Signed-off-by: Saeed Mahameed

    Maor Gottlieb
     
  • QPs that do not care about the timestamp mode should set ts_format
    to default; otherwise QP creation could fail if the timestamp
    mode is not supported.

    Fixes: 2fe8d4b87802 ("RDMA/mlx5: Fail QP creation if the device can not support the CQE TS")
    Signed-off-by: Maor Gottlieb
    Signed-off-by: Saeed Mahameed

    Maor Gottlieb
     
  • Move priv memset from init to cleanup to avoid a double priv cleanup
    that can happen on profile change if rollback also fails.
    Add the missing cleanup flow in mlx5e_netdev_attach_profile().

    Fixes: c4d7eb57687f ("net/mxl5e: Add change profile method")
    Signed-off-by: Roi Dayan
    Signed-off-by: Saeed Mahameed

    Roi Dayan
     
  • VF tunnel TX traffic offload adds flows which forward to flow
    tables with a lower level, which isn't supported on all FW versions
    and may cause firmware to fail with a syndrome.

    Fix by enabling VF tunnel TX offload only if the flow table capability
    ignore_flow_level is enabled.

    Fixes: 10742efc20a4 ("net/mlx5e: VF tunnel TX traffic offloading")
    Signed-off-by: Maor Dickman
    Reviewed-by: Vlad Buslov
    Signed-off-by: Saeed Mahameed

    Maor Dickman
     
  • flow_attr->ip_version holds the IP version of the match to be done,
    inner or outer. When working with chains, decapsulation is done on
    chain0, and the next chain matches on the outer header, which is the
    original inner header and could be ipv4. Tunnel route resolution
    therefore cannot use that field to know which IP version the tunnel
    uses. Save tun_ip_version when parsing the tunnel match and use that
    instead.

    Fixes: a508728a4c8b ("net/mlx5e: VF tunnel RX traffic offloading")
    Signed-off-by: Roi Dayan
    Reviewed-by: Dmytro Linkin
    Signed-off-by: Saeed Mahameed

    Roi Dayan
     
  • Fix a bug of uninitialized pin index when trying to turn off PPS out.

    Fixes: de19cd6cc977 ("net/mlx5: Move some PPS logic into helper functions")
    Signed-off-by: Aya Levin
    Reviewed-by: Eran Ben Elisha
    Signed-off-by: Saeed Mahameed

    Aya Levin
     
  • The cited change added offload support for Geneve options without verifying
    the validity of the option masks. This caused offload of rules that match
    on Geneve options with all-zero class, type and data masks to fail.

    Fix by ignoring the match on Geneve options in case the option masks are
    all zero.

    Fixes: 9272e3df3023 ("net/mlx5e: Geneve, Add support for encap/decap flows offload")
    Signed-off-by: Maor Dickman
    Reviewed-by: Roi Dayan
    Reviewed-by: Oz Shlomo
    Reviewed-by: Yevgeny Kliteynik
    Signed-off-by: Saeed Mahameed

    Maor Dickman
     
  • Port timestamping for PTP can be enabled/disabled while the channels are
    closed. In that case mlx5e_safe_switch_channels is skipped, and the
    preactivate hook is called directly. However, if that hook returns an
    error, the channel parameters must be reverted back to their old values.
    This commit adds the missing handling for this case.

    Fixes: 145e5637d941 ("net/mlx5e: Add TX PTP port object support")
    Signed-off-by: Maxim Mikityanskiy
    Reviewed-by: Tariq Toukan
    Signed-off-by: Saeed Mahameed

    Maxim Mikityanskiy
     
  • Each RQ (including XSK RQs) takes a reference to the XDP program. When
    an XDP program is attached or detached, the channels and queues are
    recreated, however, there is a special flow for changing an active XDP
    program to another one. In that flow, channels and queues stay alive,
    but the refcounts of the old and new XDP programs are adjusted. This
    flow didn't increment refcount by the number of active XSK RQs, and this
    commit fixes it.

    Fixes: db05815b36cb ("net/mlx5e: Add XSK zero-copy support")
    Signed-off-by: Maxim Mikityanskiy
    Reviewed-by: Tariq Toukan
    Signed-off-by: Saeed Mahameed

    Maxim Mikityanskiy
     
  • When closing the PTP channel, set its pointer explicitly to NULL. The PTP
    channel is opened on demand, and the code verifies the pointer validity
    before access. Nullify it when closing the PTP channel to avoid unexpected
    behavior.

    Fixes: 145e5637d941 ("net/mlx5e: Add TX PTP port object support")
    Signed-off-by: Aya Levin
    Reviewed-by: Tariq Toukan
    Signed-off-by: Saeed Mahameed

    Aya Levin
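
    A minimal userspace sketch of the idiom described above, with
    hypothetical types and names: an object that is opened on demand and
    guarded by a pointer check must have that pointer cleared on close, or
    later checks will pass on freed memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct sketch_ptp {
    int placeholder;
};

struct sketch_channels {
    struct sketch_ptp *ptp; /* NULL while the PTP channel is closed */
};

static int sketch_open_ptp(struct sketch_channels *chs)
{
    assert(chs != NULL);
    if (chs->ptp)
        return 0; /* already open */
    chs->ptp = calloc(1, sizeof(*chs->ptp));
    return chs->ptp ? 0 : -1;
}

static void sketch_close_ptp(struct sketch_channels *chs)
{
    assert(chs != NULL);
    free(chs->ptp);
    chs->ptp = NULL; /* keeps later validity checks meaningful */
}

static bool sketch_ptp_is_open(const struct sketch_channels *chs)
{
    assert(chs != NULL);
    return chs->ptp != NULL;
}
```

    Because the pointer is cleared, closing twice is harmless in this
    sketch and the validity check reports the channel as closed.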
     
  • In addition to .get_ethtool_stats, add port PTP TX stats to
    .ndo_get_stats64.

    Fixes: 145e5637d941 ("net/mlx5e: Add TX PTP port object support")
    Signed-off-by: Aya Levin
    Reviewed-by: Tariq Toukan
    Signed-off-by: Saeed Mahameed

    Aya Levin
     
  • Since cited patch, MLX5E_REQUIRED_WQE_MTTS is not a power of two.
    Hence, usage of MLX5E_LOG_ALIGNED_MPWQE_PPW should be replaced,
    as it lost some accuracy. Use the designated macro to calculate
    the number of required MTTs.

    This makes sure the solution in cited patch works properly.

    While here, un-inline mlx5e_get_mpwqe_offset(), and remove the
    unused RQ parameter.

    Fixes: c3c9402373fe ("net/mlx5e: Add resiliency in Striding RQ mode for packets larger than MTU")
    Signed-off-by: Tariq Toukan
    Signed-off-by: Saeed Mahameed

    Tariq Toukan
     
  • The ICOSQ size should not go below MLX5E_PARAMS_MINIMUM_LOG_SQ_SIZE.
    Enforce this where it's missing.

    Signed-off-by: Tariq Toukan
    Reviewed-by: Maxim Mikityanskiy
    Reviewed-by: Saeed Mahameed
    Signed-off-by: Saeed Mahameed

    Tariq Toukan
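
    Enforcing a lower bound on a log2 queue size can be sketched as a
    simple clamp. The helper names, the minimum of 6 and the way the size is
    derived from an entry count are assumptions for illustration, not the
    mlx5e calculation.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MIN_LOG_SQ_SIZE 6 /* assumed minimum for this sketch */

/* Smallest log such that (1 << log) >= n. */
static uint8_t sketch_order_base_2(uint32_t n)
{
    assert(n > 0 && n <= (UINT32_C(1) << 31));

    uint8_t log = 0;
    while ((UINT32_C(1) << log) < n)
        log++;
    return log;
}

/* Never return a log size below the minimum, whatever was computed. */
static uint8_t sketch_icosq_log_size(uint32_t entries)
{
    uint8_t log = sketch_order_base_2(entries);

    return log < SKETCH_MIN_LOG_SQ_SIZE ? SKETCH_MIN_LOG_SQ_SIZE : log;
}
```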
     

10 Mar, 2021

1 commit

  • Pull networking fixes from David Miller:

    1) Fix transmissions in dynamic SMPS mode in ath9k, from Felix Fietkau.

    2) TX skb error handling fix in mt76 driver, also from Felix.

    3) Fix BPF_FETCH atomic in x86 JIT, from Brendan Jackman.

    4) Avoid double free of percpu pointers when freeing a cloned bpf prog.
    From Cong Wang.

    5) Use correct printf format for dma_addr_t in ath11k, from Geert
    Uytterhoeven.

    6) Fix resolve_btfids build with older toolchains, from Kun-Chuan
    Hsieh.

    7) Don't report truncated frames to mac80211 in mt76 driver, from
    Lorenzo Bianconi.

    8) Fix watchdog timeout on suspend/resume of stmmac, from Joakim Zhang.

    9) mscc ocelot needs NET_DEVLINK select in Kconfig, from Arnd Bergmann.

    10) Fix sign comparison bug in TCP_ZEROCOPY_RECEIVE getsockopt(), from
    Arjun Roy.

    11) Ignore routes with deleted nexthop object in mlxsw, from Ido
    Schimmel.

    12) Need to undo tcp early demux lookup sometimes in nf_nat, from
    Florian Westphal.

    13) Fix gro aggregation for udp encaps with zero csum, from Daniel
    Borkmann.

    14) Make sure to always use icmp*_ndo_send when necessary, from Jason A.
    Donenfeld.

    15) Fix TRSCER masks in sh_eth driver, from Sergey Shtylyov.

    16) Prevent overly huge skb allocations in qrtr, from Pavel Skripkin.

    17) Prevent rx ring consumer index loss of sync in enetc, from Vladimir
    Oltean.

    18) Make sure textsearch control block is large enough, from Willem de
    Bruijn.

    19) Revert MAC changes to r8152 leading to instability, from Hayes Wang.

    20) Advance iov in 9p even for empty reads, from Jisheng Zhang.

    21) Double hook unregister in nftables, from Pablo Neira Ayuso.

    22) Fix memleak in ixgbe, from Dinghao Liu.

    23) Avoid dups in pkt scheduler class dumps, from Maximilian Heyne.

    24) Various mptcp fixes from Florian Westphal, Paolo Abeni, and Geliang
    Tang.

    25) Fix DOI refcount bugs in cipso, from Paul Moore.

    26) One too many irqsave in ibmvnic, from Junlin Yang.

    27) Fix infinite loop with MPLS gso segmenting via virtio_net, from
    Balazs Nemeth.

    * git://git.kernel.org:/pub/scm/linux/kernel/git/netdev/net: (164 commits)
    s390/qeth: fix notification for pending buffers during teardown
    s390/qeth: schedule TX NAPI on QAOB completion
    s390/qeth: improve completion of pending TX buffers
    s390/qeth: fix memory leak after failed TX Buffer allocation
    net: avoid infinite loop in mpls_gso_segment when mpls_hlen == 0
    net: check if protocol extracted by virtio_net_hdr_set_proto is correct
    net: dsa: xrs700x: check if partner is same as port in hsr join
    net: lapbether: Remove netif_start_queue / netif_stop_queue
    atm: idt77252: fix null-ptr-dereference
    atm: uPD98402: fix incorrect allocation
    atm: fix a typo in the struct description
    net: qrtr: fix error return code of qrtr_sendmsg()
    mptcp: fix length of ADD_ADDR with port sub-option
    net: bonding: fix error return code of bond_neigh_init()
    net: enetc: allow hardware timestamping on TX queues with tc-etf enabled
    net: enetc: set MAC RX FIFO to recommended value
    net: davicom: Use platform_get_irq_optional()
    net: davicom: Fix regulator not turned off on driver removal
    net: davicom: Fix regulator not turned off on failed probe
    net: dsa: fix switchdev objects on bridge master mistakenly being applied on ports
    ...

    Linus Torvalds