18 Dec, 2019

10 commits

  • dev_hold() must always be called in rx_queue_add_kobject().
    Otherwise the usage count drops below 0 if kobject_init_and_add()
    fails.

    Fixes: b8eb718348b8 ("net-sysfs: Fix reference count leak in rx|netdev_queue_add_kobject")
    Reported-by: syzbot
    Cc: Tetsuo Handa
    Cc: David Miller
    Cc: Lukas Bulwahn
    Signed-off-by: Jouni Hogander
    Signed-off-by: David S. Miller

    Jouni Hogander
     
  • As flower rules are added, they are given a stats ID based on the number
    of rules that can be supported in firmware. Only after the initial
    allocation of all available IDs does the driver begin to reuse those that
    have been released.

    The initial allocation of IDs was modified to account for multiple memory
    units on the offloaded device. However, this introduced a bug whereby the
    counter that controls the IDs could be decremented before the ID was
    assigned (where it is further decremented). This means that the stats ID
    could be assigned as -1/0xffffffff, which is out of range.

    Fix this by only decrementing the main counter after the current ID has
    been assigned.

    Fixes: 467322e2627f ("nfp: flower: support multiple memory units for filter offloads")
    Signed-off-by: John Hurley
    Reviewed-by: Jakub Kicinski
    Signed-off-by: David S. Miller

    John Hurley
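
The ordering problem described in the nfp flower entry above can be illustrated with a small sketch. This is not the driver code: `struct id_pool`, `alloc_id_buggy()` and `alloc_id_fixed()` are hypothetical names, and the sketch only shows how decrementing an unsigned counter before deriving the ID from it can wrap to 0xffffffff.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical allocator state: 'remaining' counts IDs not yet handed out. */
struct id_pool {
    uint32_t remaining;
};

/* Buggy order: the counter is decremented first and the ID is then derived
 * by decrementing again, so the last allocation computes 0 - 1 and wraps. */
static uint32_t alloc_id_buggy(struct id_pool *p)
{
    p->remaining--;
    return p->remaining - 1;
}

/* Fixed order: derive the ID from the current counter, then decrement. */
static uint32_t alloc_id_fixed(struct id_pool *p)
{
    uint32_t id;

    assert(p->remaining > 0);   /* caller must not allocate from an empty pool */
    id = p->remaining - 1;
    p->remaining--;
    return id;
}

/* Example usage: allocate the last remaining ID with each variant. */
static void example_usage(void)
{
    struct id_pool a = { .remaining = 1 };
    struct id_pool b = { .remaining = 1 };

    assert(alloc_id_buggy(&a) == UINT32_MAX);   /* out of range */
    assert(alloc_id_fixed(&b) == 0);            /* valid last ID */
    assert(b.remaining == 0);
}
```

With one ID left, the buggy variant returns the wrapped value while the fixed variant returns 0 and leaves the pool empty.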
     
  • dsa_link_touch() is not exported or defined outside of the
    file it is in, so make it static to avoid the following warning:

    net/dsa/dsa2.c:127:17: warning: symbol 'dsa_link_touch' was not declared. Should it be static?

    Signed-off-by: Ben Dooks (Codethink)
    Signed-off-by: David S. Miller

    Ben Dooks (Codethink)
     
  • drivers/net/ethernet/atheros/ag71xx.c: In function 'ag71xx_probe':
    drivers/net/ethernet/atheros/ag71xx.c:1776:30: warning: passing argument 2 of
    'of_get_phy_mode' makes pointer from integer without a cast [-Wint-conversion]
    In file included from drivers/net/ethernet/atheros/ag71xx.c:33:
    ./include/linux/of_net.h:15:69: note: expected 'phy_interface_t *'
    {aka 'enum <anonymous> *'} but argument is of type 'int'

    Fixes: 0c65b2b90d13c1 ("net: of_get_phy_mode: Change API to solve int/unit warnings")
    Signed-off-by: Oleksij Rempel
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Oleksij Rempel
     
  • Fix missing '*' kernel-doc notation that causes this warning:

    ../include/linux/netdevice.h:1779: warning: bad line: spinlock

    Fixes: ab92d68fc22f ("net: core: add generic lockdep keys")
    Signed-off-by: Randy Dunlap
    Cc: Taehee Yoo
    Signed-off-by: David S. Miller

    Randy Dunlap
     
  • sk->sk_pacing_shift can be read and written without lock
    synchronization. This patch adds annotations to
    document this fact and avoid future syzbot complaints.

    This might also avoid unexpected false sharing
    in sk_pacing_shift_update(), as the compiler
    could remove the conditional check and always
    write over sk->sk_pacing_shift:

    if (sk->sk_pacing_shift != val)
            sk->sk_pacing_shift = val;

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
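
A userspace sketch of the annotated pattern from the sk_pacing_shift entry above. The `READ_ONCE()`/`WRITE_ONCE()` macros here are simplified stand-ins written for this example (they rely on the GCC/Clang `__typeof__` extension), and `struct sock_sketch` and `pacing_shift_update()` are hypothetical. The return value exists only so the behaviour can be checked.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel macros; an assumption for illustration.
 * The volatile access forces exactly one load or store at this point. */
#define READ_ONCE(x)      (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v)  (*(volatile __typeof__(x) *)&(x) = (v))

/* Hypothetical cut-down socket with only the field of interest. */
struct sock_sketch {
    unsigned char sk_pacing_shift;
};

/* Returns 1 if the field was written, 0 if the store was skipped.
 * Because both accesses are volatile, the compiler cannot drop the
 * comparison and turn this into an unconditional store. */
static int pacing_shift_update(struct sock_sketch *sk, int val)
{
    assert(sk != NULL);
    if (READ_ONCE(sk->sk_pacing_shift) == val)
        return 0;
    WRITE_ONCE(sk->sk_pacing_shift, val);
    return 1;
}

/* Example usage: an unchanged value skips the store, a new value writes. */
static void example_usage(void)
{
    struct sock_sketch sk = { .sk_pacing_shift = 10 };

    assert(pacing_shift_update(&sk, 10) == 0);
    assert(pacing_shift_update(&sk, 8) == 1);
    assert(sk.sk_pacing_shift == 8);
}
```

Skipping the store when the value is unchanged is what keeps the cache line from being dirtied needlessly, which is the false-sharing concern the commit message mentions.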
     
  • ql_alloc_large_buffers() has the usual RX buffer allocation
    loop where it allocates skbs and maps them for DMA. It also
    treats failure as a fatal error.

    There are (at least) three bugs in the error paths:

    1. ql_free_large_buffers() assumes that the lrg_buf[] entry for the
    first buffer that couldn't be allocated will have .skb == NULL.
    But the qla_buf[] array is not zero-initialised.

    2. ql_free_large_buffers() DMA-unmaps all skbs in lrg_buf[]. This is
    incorrect for the last allocated skb, if DMA mapping failed.

    3. Commit 1acb8f2a7a9f ("net: qlogic: Fix memory leak in
    ql_alloc_large_buffers") added a direct call to dev_kfree_skb_any()
    after the skb is recorded in lrg_buf[], so ql_free_large_buffers()
    will double-free it.

    The bugs are somewhat intertwined, so fix them all at once:

    * Clear each entry in qla_buf[] before attempting to allocate
    an skb for it. This goes half-way to fixing bug 1.
    * Set the .skb field only after the skb is DMA-mapped. This
    fixes the rest.

    Fixes: 1357bfcf7106 ("qla3xxx: Dynamically size the rx buffer queue ...")
    Fixes: 0f8ab89e825f ("qla3xxx: Check return code from pci_map_single() ...")
    Fixes: 1acb8f2a7a9f ("net: qlogic: Fix memory leak in ql_alloc_large_buffers")
    Signed-off-by: Ben Hutchings
    Signed-off-by: David S. Miller

    Ben Hutchings
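
The two rules from the qla3xxx entry above (clear each entry before allocating, record the buffer only after mapping succeeds) can be sketched in userspace. Everything here is hypothetical: `struct buf_cb`, `fake_map()`, `alloc_bufs()` and `free_bufs()` stand in for the driver's structures, with `malloc()` in place of skb allocation and a fake mapping step that fails on request.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NUM_BUFS 4

/* Hypothetical control block, loosely modelled on an lrg_buf[] entry. */
struct buf_cb {
    void *skb;      /* set only once the buffer is fully set up */
    int   mapped;   /* stands in for the DMA mapping state */
};

/* Hypothetical stand-in for DMA mapping; fails for index 'fail_at'. */
static int fake_map(int i, int fail_at)
{
    return i == fail_at ? -1 : 0;
}

/* Free path: stops at the first entry without a recorded buffer. */
static void free_bufs(struct buf_cb *cb)
{
    assert(cb != NULL);
    for (int i = 0; i < NUM_BUFS; i++) {
        if (!cb[i].skb)
            break;
        free(cb[i].skb);
        memset(&cb[i], 0, sizeof(cb[i]));
    }
}

static int alloc_bufs(struct buf_cb *cb, int fail_at)
{
    for (int i = 0; i < NUM_BUFS; i++) {
        void *skb;

        /* Clear the entry first so the free path sees .skb == NULL here. */
        memset(&cb[i], 0, sizeof(cb[i]));

        skb = malloc(64);
        if (!skb) {
            free_bufs(cb);
            return -1;
        }
        if (fake_map(i, fail_at)) {
            free(skb);       /* never recorded, so it is freed exactly once */
            free_bufs(cb);
            return -1;
        }
        cb[i].mapped = 1;
        cb[i].skb = skb;     /* record only after mapping succeeded */
    }
    return 0;
}
```

Starting from an array full of garbage, a mapping failure at index 2 leaves entries 0 to 2 cleared and frees each buffer once. A later fully successful run can be torn down with `free_bufs()`.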
     
  • syzbot reported a memory leak when an allocation fails within
    genradix_prealloc() for output streams. That's because
    genradix_prealloc() leaves initialized members initialized when the
    issue happens and SCTP stack will abort the current initialization but
    without cleaning up such members.

    The fix here is to always call genradix_free() when genradix_prealloc()
    fails, for output and also input streams, as it suffers from the same
    issue.

    Reported-by: syzbot+772d9e36c490b18d51d1@syzkaller.appspotmail.com
    Fixes: 2075e50caf5e ("sctp: convert to genradix")
    Signed-off-by: Marcelo Ricardo Leitner
    Tested-by: Xin Long
    Signed-off-by: David S. Miller

    Marcelo Ricardo Leitner
     
  • …rnel/git/kvalo/wireless-drivers

    Kalle Valo says:

    ====================
    wireless-drivers fixes for v5.5

    First set of fixes for v5.5. Fixing security issues, some regressions
    and a few major bugs.

    mwifiex

    * security fix for handling country Information Elements (CVE-2019-14895)

    * security fix for handling TDLS Information Elements

    ath9k

    * fix endian issue with ath9k_pci_owl_loader

    mt76

    * fix default mac address handling

    iwlwifi

    * fix merge damage which led to firmware crashing during boot on some devices

    * fix device initialisation regression on some devices
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>

    David S. Miller
     
  • Upon reusing the ptp_qoriq driver, the ptp_qoriq_free() function was
    used on the remove path to free any allocated resources.
    The ptp_qoriq IRQ is among these resources that are freed in
    ptp_qoriq_free() even though it is also a managed one (allocated using
    devm_request_threaded_irq).

    Drop the resource managed version of requesting the IRQ in order to not
    trigger a double free of the interrupt as below:

    [ 226.731005] Trying to free already-free IRQ 126
    [ 226.735533] WARNING: CPU: 6 PID: 749 at kernel/irq/manage.c:1707
    __free_irq+0x9c/0x2b8
    [ 226.743435] Modules linked in:
    [ 226.746480] CPU: 6 PID: 749 Comm: bash Tainted: G W
    5.4.0-03629-gfd7102c32b2c-dirty #912
    [ 226.755857] Hardware name: NXP Layerscape LX2160ARDB (DT)
    [ 226.761244] pstate: 40000085 (nZcv daIf -PAN -UAO)
    [ 226.766022] pc : __free_irq+0x9c/0x2b8
    [ 226.769758] lr : __free_irq+0x9c/0x2b8
    [ 226.773493] sp : ffff8000125039f0
    (...)
    [ 226.856275] Call trace:
    [ 226.858710] __free_irq+0x9c/0x2b8
    [ 226.862098] free_irq+0x30/0x70
    [ 226.865229] devm_irq_release+0x14/0x20
    [ 226.869054] release_nodes+0x1b0/0x220
    [ 226.872790] devres_release_all+0x34/0x50
    [ 226.876790] device_release_driver_internal+0x100/0x1c0

    Fixes: d346c9e86d86 ("dpaa2-ptp: reuse ptp_qoriq driver")
    Cc: Yangbo Lu
    Signed-off-by: Ioana Ciornei
    Reviewed-by: Yangbo Lu
    Signed-off-by: David S. Miller

    Ioana Ciornei
     

17 Dec, 2019

7 commits

  • …rnel/git/jberg/mac80211

    Johannes Berg says:

    ====================
    A handful of fixes:
    * disable AQL on most drivers, addressing the iwlwifi issues
    * fix double-free on network namespace changes
    * fix TID field in frames injected through monitor interfaces
    * fix ieee80211_calc_rx_airtime()
    * fix NULL pointer dereference in rfkill (and remove BUG_ON)
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>

    David S. Miller
     
  • Selecting MSCC_OCELOT_SWITCH is not possible when NET_VENDOR_MICROSEMI
    is disabled:

    WARNING: unmet direct dependencies detected for MSCC_OCELOT_SWITCH
    Depends on [n]: NETDEVICES [=y] && ETHERNET [=n] && NET_VENDOR_MICROSEMI [=n] && NET_SWITCHDEV [=y] && HAS_IOMEM [=y]
    Selected by [m]:
    - NET_DSA_MSCC_FELIX [=m] && NETDEVICES [=y] && HAVE_NET_DSA [=y] && NET_DSA [=y] && PCI [=y]

    Add a Kconfig dependency on NET_VENDOR_MICROSEMI, which also implies
    CONFIG_NETDEVICES.

    Depending on a vendor config violates menuconfig locality for the DSA
    driver, but is the smallest compromise since all other solutions are
    much more complicated (see [0]).

    [0] https://www.spinics.net/lists/netdev/msg618808.html

    Fixes: 56051948773e ("net: dsa: ocelot: add driver for Felix switch family")
    Signed-off-by: Arnd Bergmann
    Signed-off-by: Mao Wenan
    Signed-off-by: Vladimir Oltean
    Signed-off-by: David S. Miller

    Arnd Bergmann
     
  • In the implementation of gmac_setup_txqs() the allocated desc_ring is
    leaked if TX queue base is not aligned. Release it via
    dma_free_coherent.

    Fixes: 4d5ae32f5e1e ("net: ethernet: Add a driver for Gemini gigabit ethernet")
    Signed-off-by: Navid Emamdoost
    Reviewed-by: Linus Walleij
    Signed-off-by: David S. Miller

    Navid Emamdoost
     
  • There were several issues with 53568438e381 ("net: dsa: b53: Add support for port_egress_floods callback") that resulted in breaking connectivity for standalone ports:

    - both user and CPU ports must allow unicast and multicast forwarding by
    default, otherwise this just flat out breaks connectivity for
    standalone DSA ports
    - IP multicast is treated similarly as multicast, but has separate
    control registers
    - the UC, MC and IPMC lookup failure register offsets were wrong, and
    instead used bit values that are meaningful for the
    B53_IP_MULTICAST_CTRL register

    Fixes: 53568438e381 ("net: dsa: b53: Add support for port_egress_floods callback")
    Signed-off-by: Florian Fainelli
    Reviewed-by: Vivien Didelot
    Signed-off-by: David S. Miller

    Florian Fainelli
     
  • Stefano Garzarella says:

    ====================
    vsock/virtio: fix null-pointer dereference and related precautions

    This series mainly solves a possible null-pointer dereference in
    virtio_transport_recv_listen() introduced with the multi-transport
    support [PATCH 1].

    PATCH 2 adds a WARN_ON check for the same potential issue
    and a returned error in the virtio_transport_send_pkt_info() function
    to avoid crashing the kernel.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • virtio_transport_get_ops() and virtio_transport_send_pkt_info()
    can only be used on connecting/connected sockets, since a socket
    assigned to a transport is required.

    This patch adds a WARN_ON() on virtio_transport_get_ops() to check
    this requirement, and a comment and a returned error on
    virtio_transport_send_pkt_info().

    Signed-off-by: Stefano Garzarella
    Signed-off-by: David S. Miller

    Stefano Garzarella
     
  • With multi-transport support, listener sockets are not bound to any
    transport. So, calling virtio_transport_reset(), when an error
    occurs, on a listener socket produces the following null-pointer
    dereference:

    BUG: kernel NULL pointer dereference, address: 00000000000000e8
    #PF: supervisor read access in kernel mode
    #PF: error_code(0x0000) - not-present page
    PGD 0 P4D 0
    Oops: 0000 [#1] SMP PTI
    CPU: 0 PID: 20 Comm: kworker/0:1 Not tainted 5.5.0-rc1-ste-00003-gb4be21f316ac-dirty #56
    Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
    Workqueue: virtio_vsock virtio_transport_rx_work [vmw_vsock_virtio_transport]
    RIP: 0010:virtio_transport_send_pkt_info+0x20/0x130 [vmw_vsock_virtio_transport_common]
    Code: 1f 84 00 00 00 00 00 0f 1f 00 55 48 89 e5 41 57 41 56 41 55 49 89 f5 41 54 49 89 fc 53 48 83 ec 10 44 8b 76 20 e8 c0 ba fe ff 8b 80 e8 00 00 00 e8 64 e3 7d c1 45 8b 45 00 41 8b 8c 24 d4 02
    RSP: 0018:ffffc900000b7d08 EFLAGS: 00010282
    RAX: 0000000000000000 RBX: ffff88807bf12728 RCX: 0000000000000000
    RDX: ffff88807bf12700 RSI: ffffc900000b7d50 RDI: ffff888035c84000
    RBP: ffffc900000b7d40 R08: ffff888035c84000 R09: ffffc900000b7d08
    R10: ffff8880781de800 R11: 0000000000000018 R12: ffff888035c84000
    R13: ffffc900000b7d50 R14: 0000000000000000 R15: ffff88807bf12724
    FS: 0000000000000000(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00000000000000e8 CR3: 00000000790f4004 CR4: 0000000000160ef0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
    virtio_transport_reset+0x59/0x70 [vmw_vsock_virtio_transport_common]
    virtio_transport_recv_pkt+0x5bb/0xe50 [vmw_vsock_virtio_transport_common]
    ? detach_buf_split+0xf1/0x130
    virtio_transport_rx_work+0xba/0x130 [vmw_vsock_virtio_transport]
    process_one_work+0x1c0/0x300
    worker_thread+0x45/0x3c0
    kthread+0xfc/0x130
    ? current_work+0x40/0x40
    ? kthread_park+0x90/0x90
    ret_from_fork+0x35/0x40
    Modules linked in: sunrpc kvm_intel kvm vmw_vsock_virtio_transport vmw_vsock_virtio_transport_common irqbypass vsock virtio_rng rng_core
    CR2: 00000000000000e8
    ---[ end trace e75400e2ea2fa824 ]---

    This happens because virtio_transport_reset() calls
    virtio_transport_send_pkt_info() that can be used only on
    connecting/connected sockets.

    This patch fixes the issue, using virtio_transport_reset_no_sock()
    instead of virtio_transport_reset() when we are handling a listener
    socket.

    Fixes: c0cfa2d8a788 ("vsock: add multi-transports support")
    Signed-off-by: Stefano Garzarella
    Signed-off-by: David S. Miller

    Stefano Garzarella
     

16 Dec, 2019

5 commits

  • In rfkill_register, the struct rfkill pointer is first dereferenced
    and then checked for NULL. This patch removes the BUG_ON and returns
    an error to the caller in case rfkill is NULL.

    Signed-off-by: Aditya Pakki
    Link: https://lore.kernel.org/r/20191215153409.21696-1-pakki001@umn.edu
    Signed-off-by: Johannes Berg

    Aditya Pakki
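
A minimal sketch of the check-before-dereference ordering described in the rfkill entry above. `struct rfkill_sketch` and `register_sketch()` are hypothetical, and the choice of `-EINVAL` as the error code is an assumption; the log only says that an error is returned.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical cut-down object; not the kernel's struct rfkill. */
struct rfkill_sketch {
    const char *name;
    int registered;
};

static int register_sketch(struct rfkill_sketch *rfkill)
{
    /* Check the pointer first and report an error to the caller... */
    if (!rfkill)
        return -EINVAL;

    /* ...and only then touch its members. */
    if (!rfkill->name)
        return -EINVAL;

    rfkill->registered = 1;
    return 0;
}

/* Example usage: a NULL pointer yields an error instead of a crash. */
static void example_usage(void)
{
    struct rfkill_sketch r = { .name = "wlan", .registered = 0 };

    assert(register_sketch(NULL) == -EINVAL);
    assert(register_sketch(&r) == 0);
    assert(r.registered == 1);
}
```

Returning an error keeps the failure local to the caller, where an assertion such as BUG_ON would have taken the whole system down.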
     
  • In function xenvif_disconnect_queue(), the value of queue->rx_irq is
    zeroed *before* queue->task is stopped. Unfortunately that task may call
    notify_remote_via_irq(queue->rx_irq) and calling that function with a
    zero value results in a NULL pointer dereference in evtchn_from_irq().

    This patch simply re-orders things, stopping all tasks before zero-ing the
    irq values, thereby avoiding the possibility of the race.

    Fixes: 2ac061ce97f4 ("xen/netback: cleanup init and deinit code")
    Signed-off-by: Paul Durrant
    Acked-by: Wei Liu
    Signed-off-by: Jakub Kicinski

    Paul Durrant
     
  • Display the return code as decimal integer.

    Fixes: 55d7de9de6c3 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver")
    Signed-off-by: Cristian Birsan
    Signed-off-by: Jakub Kicinski

    Cristian Birsan
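
To show why a decimal specifier is easier to read for error codes, here is a small sketch. `format_ret()` is a hypothetical helper; the log does not state which specifier the driver used before, so the hexadecimal branch is included only as a contrast and assumes a 32-bit `unsigned int`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats a return code either as decimal or as hex. */
static void format_ret(char *buf, size_t len, int ret, int as_decimal)
{
    if (as_decimal)
        snprintf(buf, len, "ret = %d", ret);
    else
        snprintf(buf, len, "ret = %x", (unsigned int)ret);
}

/* Example usage with -110, the value of -ETIMEDOUT on Linux. */
static void example_usage(void)
{
    char buf[32];

    format_ret(buf, sizeof(buf), -110, 1);
    assert(strcmp(buf, "ret = -110") == 0);

    format_ret(buf, sizeof(buf), -110, 0);
    assert(strcmp(buf, "ret = ffffff92") == 0);  /* assumes 32-bit unsigned int */
}
```

The decimal form can be matched directly against the errno tables, while the hexadecimal form has to be converted back by hand.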
     
  • The sge_info debugfs collects offload queue info even when offload
    capability is disabled and leads to panic.

    [ 144.139871] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 144.139874] CR2: 0000000000000000 CR3: 000000082d456005 CR4: 00000000001606e0
    [ 144.139876] Call Trace:
    [ 144.139887] sge_queue_start+0x12/0x30 [cxgb4]
    [ 144.139897] seq_read+0x1d4/0x3d0
    [ 144.139906] full_proxy_read+0x50/0x70
    [ 144.139913] vfs_read+0x89/0x140
    [ 144.139916] ksys_read+0x55/0xd0
    [ 144.139924] do_syscall_64+0x5b/0x1d0
    [ 144.139933] entry_SYSCALL_64_after_hwframe+0x44/0xa9
    [ 144.139936] RIP: 0033:0x7f4b01493990

    Fix this crash by skipping the offload queue access in sge_qinfo when
    offload capability is disabled.

    Signed-off-by: Herat Ramani
    Signed-off-by: Vishal Kulkarni
    Signed-off-by: Jakub Kicinski

    Vishal Kulkarni
     
  • FASTOPEN setsockopt() or sendmsg() may switch the SMC socket to fallback
    mode. Once fallback mode is active, the native TCP socket functions are
    called. Nevertheless there is a small race window when FASTOPEN
    setsockopt/sendmsg runs in parallel to a connect() and switches the
    socket into fallback mode before connect() takes the sock lock.
    Make sure the SMC-specific connect setup is omitted in this case.

    This way a syzbot-reported refcount problem is fixed, triggered by
    different threads running non-blocking connect() and FASTOPEN_KEY
    setsockopt.

    Reported-by: syzbot+96d3f9ff6a86d37e44c8@syzkaller.appspotmail.com
    Fixes: 6d6dd528d5af ("net/smc: fix refcount non-blocking connect() -part 2")
    Signed-off-by: Ursula Braun
    Signed-off-by: Karsten Graul
    Signed-off-by: Jakub Kicinski

    Ursula Braun
     

15 Dec, 2019

16 commits

  • A mismerge between the following two commits:

    c678726305b9 ("net: phylink: ensure consistent phy interface mode")
    27755ff88c0e ("net: phylink: Add phylink_mac_link_{up, down} wrapper functions")

    resulted in the wrong interface being passed to the mac_link_up()
    function. Fix this up.

    Fixes: b4b12b0d2f02 ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
    Signed-off-by: Russell King
    Signed-off-by: Jakub Kicinski

    Russell King
     
  • This test only works when [1] is applied, which was rejected.

    Basically, the errors are reported and cleared. In this particular case of
    tls sockets, following reads will block.

    The test case was originally submitted with the rejected patch, but, then,
    was included as part of a different patchset, possibly by mistake.

    [1] https://lore.kernel.org/netdev/20191007035323.4360-2-jakub.kicinski@netronome.com/#t

    Thanks Paolo Pisati for pointing out the original patchset where this
    appeared.

    Fixes: 65190f77424d ("selftests/tls: add a test for fragmented messages")
    Reported-by: Paolo Pisati
    Signed-off-by: Thadeu Lima de Souza Cascardo
    Signed-off-by: Jakub Kicinski

    Thadeu Lima de Souza Cascardo
     
  • Taehee Yoo says:

    ====================
    gtp: fix several bugs in gtp module

    This patchset fixes several bugs in the GTP module.

    1. Do not allow adding a pdp context with a duplicate TID or ms_addr.
    In the current code, pdp contexts with a duplicate TID or ms_addr can be
    added, so the RX and TX paths cannot find the correct pdp context.

    2. Fix wrong condition in ->dumpit() callback.
    The ->dumpit() callback is called again if the dump packet size is too big.
    So, before returning, it saves the last position and then restarts from
    the last dump position.
    The TID value is used to find the last dump position.
    The GTP module allows adding a zero TID value, but the ->dumpit() callback
    ignores a zero TID value.
    So, the dump does not work correctly if the dump packet size is too big.

    3. Fix use-after-free in ipv4_pdp_find().
    The RX and TX paths always use gtp->tid_hash and gtp->addr_hash,
    but these hash pointers can be freed while packets are being processed.
    So, a use-after-free can occur.

    4. Fix panic because of zero size hashtable.
    The GTP hashtable size can be set by user-space.
    If hashsize is set to 0, the hashtable will not work and a panic will occur.
    ====================

    Signed-off-by: Jakub Kicinski

    Jakub Kicinski
     
  • GTP default hashtable size is 1024 and userspace could set specific
    hashtable size with IFLA_GTP_PDP_HASHSIZE. If hashtable size is set to 0
    from userspace, hashtable will not work and panic will occur.

    Fixes: 459aa660eb1d ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)")
    Signed-off-by: Taehee Yoo
    Signed-off-by: Jakub Kicinski

    Taehee Yoo
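
A sketch of why a zero hashtable size is fatal and one way to guard against it. The names `sanitize_hashsize()` and `bucket_for()` are hypothetical, and falling back to the default of 1024 is an assumption; the log states the default size but not how the fix treats a zero value.

```c
#include <assert.h>
#include <stdint.h>

#define DEFAULT_HASHSIZE 1024u   /* default size quoted in the entry above */

/* Assumed policy: treat a user-supplied 0 as "use the default". */
static uint32_t sanitize_hashsize(uint32_t requested)
{
    return requested ? requested : DEFAULT_HASHSIZE;
}

/* Bucket selection by modulo; a zero size would divide by zero here. */
static uint32_t bucket_for(uint32_t key, uint32_t hashsize)
{
    assert(hashsize != 0);
    return key % hashsize;
}

/* Example usage: a requested size of 0 ends up using 1024 buckets. */
static void example_usage(void)
{
    assert(sanitize_hashsize(0) == 1024);
    assert(sanitize_hashsize(16) == 16);
    assert(bucket_for(35, sanitize_hashsize(0)) == 35);
    assert(bucket_for(35, 16) == 3);
}
```

Validating the size once, where the value enters from userspace, keeps every later lookup free of a zero check.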
     
  • ipv4_pdp_find() is called in the TX packet path of GTP.
    ipv4_pdp_find() internally uses gtp->tid_hash to look up the pdp context.
    In the current code, gtp->tid_hash and gtp->addr_hash are freed by
    ->dellink(), which is gtp_dellink().
    But gtp_dellink() can be called while packets are still being processed.
    So, gtp_dellink() should not free gtp->tid_hash and gtp->addr_hash.
    Instead, dev->priv_destructor() is used, because this callback
    is called only after all packet processing has finished.

    Test commands:
    ip link add veth1 type veth peer name veth2
    ip a a 172.0.0.1/24 dev veth1
    ip link set veth1 up
    ip a a 172.99.0.1/32 dev lo

    gtp-link add gtp1 &

    gtp-tunnel add gtp1 v1 200 100 172.99.0.2 172.0.0.2
    ip r a 172.99.0.2/32 dev gtp1
    ip link set gtp1 mtu 1500

    ip netns add ns2
    ip link set veth2 netns ns2
    ip netns exec ns2 ip a a 172.0.0.2/24 dev veth2
    ip netns exec ns2 ip link set veth2 up
    ip netns exec ns2 ip a a 172.99.0.2/32 dev lo
    ip netns exec ns2 ip link set lo up

    ip netns exec ns2 gtp-link add gtp2 &
    ip netns exec ns2 gtp-tunnel add gtp2 v1 100 200 172.99.0.1 172.0.0.1
    ip netns exec ns2 ip r a 172.99.0.1/32 dev gtp2
    ip netns exec ns2 ip link set gtp2 mtu 1500

    hping3 172.99.0.2 -2 --flood &
    ip link del gtp1

    Splat looks like:
    [ 72.568081][ T1195] BUG: KASAN: use-after-free in ipv4_pdp_find.isra.12+0x130/0x170 [gtp]
    [ 72.568916][ T1195] Read of size 8 at addr ffff8880b9a35d28 by task hping3/1195
    [ 72.569631][ T1195]
    [ 72.569861][ T1195] CPU: 2 PID: 1195 Comm: hping3 Not tainted 5.5.0-rc1 #199
    [ 72.570547][ T1195] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
    [ 72.571438][ T1195] Call Trace:
    [ 72.571764][ T1195] dump_stack+0x96/0xdb
    [ 72.572171][ T1195] ? ipv4_pdp_find.isra.12+0x130/0x170 [gtp]
    [ 72.572761][ T1195] print_address_description.constprop.5+0x1be/0x360
    [ 72.573400][ T1195] ? ipv4_pdp_find.isra.12+0x130/0x170 [gtp]
    [ 72.573971][ T1195] ? ipv4_pdp_find.isra.12+0x130/0x170 [gtp]
    [ 72.574544][ T1195] __kasan_report+0x12a/0x16f
    [ 72.575014][ T1195] ? ipv4_pdp_find.isra.12+0x130/0x170 [gtp]
    [ 72.575593][ T1195] kasan_report+0xe/0x20
    [ 72.576004][ T1195] ipv4_pdp_find.isra.12+0x130/0x170 [gtp]
    [ 72.576577][ T1195] gtp_build_skb_ip4+0x199/0x1420 [gtp]
    [ ... ]
    [ 72.647671][ T1195] BUG: unable to handle page fault for address: ffff8880b9a35d28
    [ 72.648512][ T1195] #PF: supervisor read access in kernel mode
    [ 72.649158][ T1195] #PF: error_code(0x0000) - not-present page
    [ 72.649849][ T1195] PGD a6c01067 P4D a6c01067 PUD 11fb07067 PMD 11f939067 PTE 800fffff465ca060
    [ 72.652958][ T1195] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
    [ 72.653834][ T1195] CPU: 2 PID: 1195 Comm: hping3 Tainted: G B 5.5.0-rc1 #199
    [ 72.668062][ T1195] RIP: 0010:ipv4_pdp_find.isra.12+0x86/0x170 [gtp]
    [ ... ]
    [ 72.679168][ T1195] Call Trace:
    [ 72.679603][ T1195] gtp_build_skb_ip4+0x199/0x1420 [gtp]
    [ 72.681915][ T1195] ? ipv4_pdp_find.isra.12+0x170/0x170 [gtp]
    [ 72.682513][ T1195] ? lock_acquire+0x164/0x3b0
    [ 72.682966][ T1195] ? gtp_dev_xmit+0x35e/0x890 [gtp]
    [ 72.683481][ T1195] gtp_dev_xmit+0x3c2/0x890 [gtp]
    [ ... ]

    Fixes: 459aa660eb1d ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)")
    Signed-off-by: Taehee Yoo
    Signed-off-by: Jakub Kicinski

    Taehee Yoo
     
  • gtp_genl_dump_pdp() is the ->dumpit() callback of the GTP module and is
    used to dump pdp contexts. It is re-executed when the dump does not fit
    in a single packet.

    If the dump packet size is too big, it saves the current dump pointer
    (gtp interface pointer, bucket, TID value) and then restarts the dump
    from the last pointer.
    The current GTP code allows adding a zero TID pdp context, but the dump
    code ignores a zero TID value. So, the last dump pointer will not be found.

    In addition, this patch adds missing rcu_read_lock() in
    gtp_genl_dump_pdp().

    Fixes: 459aa660eb1d ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)")
    Signed-off-by: Taehee Yoo
    Signed-off-by: Jakub Kicinski

    Taehee Yoo
     
  • The GTP RX packet path looks up the pdp context by TID. If duplicate TID
    pdp contexts exist in the list, it cannot select the correct pdp context.
    So, the TID value should be unique.
    The GTP TX packet path looks up the pdp context by ms_addr. If duplicate
    ms_addr pdp contexts exist in the list, it cannot select the correct pdp
    context. So, the ms_addr value should be unique.

    Fixes: 459aa660eb1d ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)")
    Signed-off-by: Taehee Yoo
    Signed-off-by: Jakub Kicinski

    Taehee Yoo
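
The uniqueness rule from the entry above can be sketched with a flat table. `struct pdp_sketch`, `struct pdp_table` and `pdp_add()` are hypothetical, the linear scan replaces the module's hash lookups, and the error codes are assumptions chosen for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_PDP 8

/* Hypothetical pdp context holding only the two lookup keys. */
struct pdp_sketch {
    uint64_t tid;       /* key used by the RX path */
    uint32_t ms_addr;   /* key used by the TX path */
};

struct pdp_table {
    struct pdp_sketch ctx[MAX_PDP];
    int count;
};

/* Reject a new context if either key is already present. */
static int pdp_add(struct pdp_table *t, uint64_t tid, uint32_t ms_addr)
{
    for (int i = 0; i < t->count; i++) {
        if (t->ctx[i].tid == tid || t->ctx[i].ms_addr == ms_addr)
            return -EEXIST;
    }
    if (t->count == MAX_PDP)
        return -ENOSPC;

    t->ctx[t->count].tid = tid;
    t->ctx[t->count].ms_addr = ms_addr;
    t->count++;
    return 0;
}

/* Example usage: a duplicate in either key is refused. */
static void example_usage(void)
{
    struct pdp_table t = { .count = 0 };

    assert(pdp_add(&t, 200, 0x0a000001) == 0);
    assert(pdp_add(&t, 200, 0x0a000002) == -EEXIST);  /* duplicate TID */
    assert(pdp_add(&t, 201, 0x0a000001) == -EEXIST);  /* duplicate ms_addr */
    assert(pdp_add(&t, 201, 0x0a000002) == 0);
    assert(t.count == 2);
}
```

Checking both keys at insertion time means each lookup path can assume at most one match.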
     
  • After the recent fix in commit 1899bb325149 ("bonding: fix state
    transition issue in link monitoring"), the active-backup mode with
    miimon initially comes up fine, but after a link failure both members
    transition into the backup state.

    The following steps reproduce the scenario (eth1 and eth2 are the
    slaves of the bond):

    ip link set eth1 up
    ip link set eth2 down
    sleep 1
    ip link set eth2 up
    ip link set eth1 down
    cat /sys/class/net/eth1/bonding_slave/state
    cat /sys/class/net/eth2/bonding_slave/state

    Fixes: 1899bb325149 ("bonding: fix state transition issue in link monitoring")
    CC: Jay Vosburgh
    Signed-off-by: Mahesh Bandewar
    Acked-by: Jay Vosburgh
    Signed-off-by: Jakub Kicinski

    Mahesh Bandewar
     
  • Manish Chopra says:

    ====================
    bnx2x: bug fixes

    This series has two driver changes: one fixes some unexpected
    hardware behaviour caused during parity error recovery in the
    presence of SR-IOV VFs, and the other fixes resource
    management in the driver among the PFs configured on an engine.

    Please consider applying it to "net".

    V1->V2:
    =======
    Fix the compilation errors reported by kbuild test robot
    on the patch #1 with CONFIG_BNX2X_SRIOV=n
    ====================

    Signed-off-by: Jakub Kicinski

    Jakub Kicinski
     
  • The driver doesn't correctly calculate the total number of PFs
    configured on a given engine, which messes up resources in the PFs
    loaded on that engine. This leads the driver to exceed the per-engine
    limit on resources (like vlan filters etc.), which ends up
    with asserts from the firmware.

    Signed-off-by: Manish Chopra
    Signed-off-by: Ariel Elior
    Signed-off-by: Jakub Kicinski

    Manish Chopra
     
  • A parity error from the hardware causes the PF to lose the state
    of its VFs, due to the PF's internal reload and the hardware reset
    following the parity error. Restrict any configuration request from the
    VFs after the parity error, as it could cause unexpected hardware
    behavior. The only way for VFs to recover is to trigger FLR on the VFs
    and reload them.

    Signed-off-by: Manish Chopra
    Signed-off-by: Ariel Elior
    Signed-off-by: Jakub Kicinski

    Manish Chopra
     
  • Without the common part of the driver, the new file fails to link:

    drivers/net/ethernet/ti/cpsw_new.o: In function `cpsw_probe':
    cpsw_new.c:(.text+0x312c): undefined reference to `ti_cm_get_macid'

    Use the same Makefile hack as before, and build cpsw-common.o for
    any driver that needs it.

    Fixes: ed3525eda4c4 ("net: ethernet: ti: introduce cpsw switchdev based driver part 1 - dual-emac")
    Signed-off-by: Arnd Bergmann
    Reviewed-by: Grygorii Strashko
    Signed-off-by: Jakub Kicinski

    Arnd Bergmann
     
  • The new driver misses a dependency:

    drivers/net/ethernet/ti/cpsw_new.o: In function `cpsw_rx_handler':
    cpsw_new.c:(.text+0x259c): undefined reference to `__page_pool_put_page'
    cpsw_new.c:(.text+0x25d0): undefined reference to `page_pool_alloc_pages'
    drivers/net/ethernet/ti/cpsw_priv.o: In function `cpsw_fill_rx_channels':
    cpsw_priv.c:(.text+0x22d8): undefined reference to `page_pool_alloc_pages'
    cpsw_priv.c:(.text+0x2420): undefined reference to `__page_pool_put_page'
    drivers/net/ethernet/ti/cpsw_priv.o: In function `cpsw_create_xdp_rxqs':
    cpsw_priv.c:(.text+0x2624): undefined reference to `page_pool_create'
    drivers/net/ethernet/ti/cpsw_priv.o: In function `cpsw_run_xdp':
    cpsw_priv.c:(.text+0x2dc8): undefined reference to `__page_pool_put_page'

    Other drivers use 'select' for PAGE_POOL, so do the same here.

    Fixes: ed3525eda4c4 ("net: ethernet: ti: introduce cpsw switchdev based driver part 1 - dual-emac")
    Signed-off-by: Arnd Bergmann
    Acked-by: Ilias Apalodimas
    Acked-by: Jesper Dangaard Brouer
    Reviewed-by: Grygorii Strashko
    Signed-off-by: Jakub Kicinski

    Arnd Bergmann
     
  • Host can provide send indirection table messages anytime after RSS is
    enabled by calling rndis_filter_set_rss_param(). So the host provided
    table values may be overwritten by the initialization in
    rndis_set_subchannel().

    To prevent this problem, move the tx_table initialization before calling
    rndis_filter_set_rss_param().

    Fixes: a6fb6aa3cfa9 ("hv_netvsc: Set tx_table to equal weight after subchannels open")
    Signed-off-by: Haiyang Zhang
    Signed-off-by: Jakub Kicinski

    Haiyang Zhang
     
  • phylink requires the MAC to report when its link status changes when
    operating in inband modes. Failure to report link status changes
    means that phylink has no idea when the link events happen, which
    results in either the network interface's carrier remaining up or
    remaining permanently down.

    For example, with a fiber module, if the interface is brought up and
    link is initially established, taking the link down at the far end
    will cut the optical power. The SFP module's LOS asserts, we
    deactivate the link, and the network interface reports no carrier.

    When the far end is brought back up, the SFP module's LOS deasserts,
    but the MAC may be slower to establish link. If this happens (which
    in my tests is a certainty) then phylink never hears that the MAC
    has established link with the far end, and the network interface is
    stuck reporting no carrier. This means the interface is
    non-functional.

    Avoiding the link interrupt when we have phylink is basically not
    an option, so remove the !port->phylink from the test.

    Fixes: 4bb043262878 ("net: mvpp2: phylink support")
    Tested-by: Sven Auhagen
    Tested-by: Antoine Tenart
    Signed-off-by: Russell King
    Signed-off-by: Jakub Kicinski

    Russell King
     
  • Eric Dumazet says:

    ====================
    tcp: take care of empty skbs in write queue

    We understood recently that TCP sockets could have an empty
    skb at the tail of the write queue, leading to various problems.

    This patch series:

    1) Makes sure we do not send an empty packet since this
    was unintended and causing crashes in old kernels.

    2) Changes tcp_write_queue_empty() to not be fooled by
    the presence of an empty skb.

    3) Fixes a bug that could trigger suboptimal epoll()
    application behavior under memory pressure.
    ====================

    Signed-off-by: Jakub Kicinski

    Jakub Kicinski
     

14 Dec, 2019

2 commits

  • At the time commit ce5ec440994b ("tcp: ensure epoll edge trigger
    wakeup when write queue is empty") was added to the kernel,
    we still had a single write queue, combining rtx and write queues.

    Once we moved the rtx queue into a separate rb-tree, testing
    if sk_write_queue is empty has been suboptimal.

    Indeed, if we have packets in the rtx queue, we probably want
    to delay the EPOLLOUT generation until incoming packets
    free them, making room, but more importantly avoiding
    flooding the application with EPOLLOUT events.

    The solution is to use the tcp_rtx_and_write_queues_empty() helper.

    Fixes: 75c119afe14f ("tcp: implement rb-tree based retransmit queue")
    Signed-off-by: Eric Dumazet
    Cc: Jason Baron
    Cc: Neal Cardwell
    Acked-by: Soheil Hassas Yeganeh
    Signed-off-by: Jakub Kicinski

    Eric Dumazet
     
  • Due to how tcp_sendmsg() is implemented, we can have an empty
    skb at the tail of the write queue.

    Most [1] tcp_write_queue_empty() callers want to know if there is
    anything to send (payload and/or FIN).

    Instead of checking if the sk_write_queue is empty, we need
    to test if tp->write_seq == tp->snd_nxt.

    [1] tcp_send_fin() was the only caller that expected to
    see if an skb was in the write queue. I have changed the code
    to reuse the tcp_write_queue_tail() result.

    Signed-off-by: Eric Dumazet
    Cc: Neal Cardwell
    Acked-by: Soheil Hassas Yeganeh
    Signed-off-by: Jakub Kicinski

    Eric Dumazet
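
The difference between the two emptiness tests in the entry above can be sketched as follows. `struct tcp_sketch` and both helpers are hypothetical; `queued_skbs` is a plain counter standing in for the real write queue, which is a list of skbs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cut-down TCP state for illustration only. */
struct tcp_sketch {
    uint32_t write_seq;    /* sequence number just past the queued data */
    uint32_t snd_nxt;      /* next sequence number to be sent */
    int      queued_skbs;  /* skbs in the write queue, empty ones included */
};

/* Old test: fooled by an empty skb sitting at the tail of the queue. */
static bool queue_empty_old(const struct tcp_sketch *tp)
{
    return tp->queued_skbs == 0;
}

/* New test: nothing to send when everything queued has been sent. */
static bool queue_empty_new(const struct tcp_sketch *tp)
{
    return tp->write_seq == tp->snd_nxt;
}

/* Example usage: one empty skb queued, no unsent payload. */
static void example_usage(void)
{
    struct tcp_sketch tp = { .write_seq = 1000, .snd_nxt = 1000,
                             .queued_skbs = 1 };

    assert(!queue_empty_old(&tp));  /* reports "not empty" */
    assert(queue_empty_new(&tp));   /* correctly reports "nothing to send" */

    tp.write_seq = 1100;            /* 100 bytes of unsent payload queued */
    assert(!queue_empty_new(&tp));
}
```

Comparing sequence numbers answers the question callers actually ask, whether there is anything left to send, regardless of how many skbs happen to be linked into the queue.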