11 Feb, 2021

1 commit

  • Pull networking fixes from David Miller:
    "Another pile of networking fixes:

    1) ath9k build error fix from Arnd Bergmann

    2) dma memory leak fix in mediatek driver from Lorenzo Bianconi.

    3) bpf int3 kprobe fix from Alexei Starovoitov.

    4) bpf stackmap integer overflow fix from Bui Quang Minh.

    5) Add usb device ids for Cinterion MV31 to qmi_wwan driver, from
    Christoph Schemmel.

    6) Don't update deleted entry in xt_recent netfilter module, from
    Jozsef Kadlecsik.

    7) Use after free in nftables, fix from Pablo Neira Ayuso.

    8) Header checksum fix in flowtable from Sven Auhagen.

    9) Validate user controlled length in qrtr code, from Sabyrzhan
    Tasbolatov.

    10) Fix race in xen/netback, from Juergen Gross.

    11) New device ID in cxgb4, from Raju Rangoju.

    12) Fix ring locking in rxrpc release call, from David Howells.

    13) Don't return LAPB error codes from x25_open(), from Xie He.

    14) Missing error returns in gsi_channel_setup() from Alex Elder.

    15) Get skb_copy_and_csum_datagram working properly with odd segment
    sizes, from Willem de Bruijn.

    16) Missing RFS/RSS table init in enetc driver, from Vladimir Oltean.

    17) Do teardown on probe failure in DSA, from Vladimir Oltean.

    18) Fix compilation failures of txtimestamp selftest, from Vadim
    Fedorenko.

    19) Limit rx per-napi gro queue size to fix latency regression, from
    Eric Dumazet.

    20) dpaa_eth xdp fixes from Camelia Groza.

    21) Missing txq mode update when switching CBS off, in stmmac driver,
    from Mohammad Athari Bin Ismail.

    22) Failover pending logic fix in ibmvnic driver, from Sukadev
    Bhattiprolu.

    23) Null deref fix in vmw_vsock, from Norbert Slusarek.

    24) Missing verdict update in xdp paths of ena driver, from Shay
    Agroskin.

    25) seq_file iteration fix in sctp from Neil Brown.

    26) bpf 32-bit src register truncation fix on div/mod, from Daniel
    Borkmann.

    27) Fix jmp32 pruning in bpf verifier, from Daniel Borkmann.

    28) Fix locking in vsock_shutdown(), from Stefano Garzarella.

    29) Various missing index bound checks in hns3 driver, from Yufeng Mo.

    30) Flush ports on .phylink_mac_link_down() in dsa felix driver, from
    Vladimir Oltean.

    31) Don't mix up stp and mrp port states in bridge layer, from Horatiu
    Vultur.

    32) Fix locking during netif_tx_disable(), from Edwin Peer."

    * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (45 commits)
    bpf: Fix 32 bit src register truncation on div/mod
    bpf: Fix verifier jmp32 pruning decision logic
    bpf: Fix verifier jsgt branch analysis on max bound
    vsock: fix locking in vsock_shutdown()
    net: hns3: add a check for index in hclge_get_rss_key()
    net: hns3: add a check for tqp_index in hclge_get_ring_chain_from_mbx()
    net: hns3: add a check for queue_id in hclge_reset_vf_queue()
    net: dsa: felix: implement port flushing on .phylink_mac_link_down
    switchdev: mrp: Remove SWITCHDEV_ATTR_ID_MRP_PORT_STAT
    bridge: mrp: Fix the usage of br_mrp_port_switchdev_set_state
    net: watchdog: hold device global xmit lock during tx disable
    netfilter: nftables: relax check for stateful expressions in set definition
    netfilter: conntrack: skip identical origin tuple in same zone only
    vsock/virtio: update credit only if socket is not closed
    net: fix iteration for sctp transport seq_files
    net: ena: Update XDP verdict upon failure
    net/vmw_vsock: improve locking in vsock_connect_timeout()
    net/vmw_vsock: fix NULL pointer dereference
    ibmvnic: Clear failover_pending if unable to schedule
    net: stmmac: set TxQ mode back to DCB after disabling CBS
    ...

    Linus Torvalds
     

10 Feb, 2021

2 commits

  • Pablo Neira Ayuso says:

    ====================
    Netfilter fixes for net

    The following patchset contains Netfilter fixes for net:

    1) nf_conntrack_tuple_taken() needs to recheck zone for
    NAT clash resolution, from Florian Westphal.

    2) Restore support for stateful expressions when set definition
    specifies no stateful expressions.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • In vsock_shutdown() we touched some socket fields without holding the
    socket lock, such as 'state' and 'sk_flags'.

    Also, after the introduction of multi-transport, we are accessing
    'vsk->transport' in vsock_send_shutdown() without holding the lock
    and this call can be made while the connection is in progress, so
    the transport can change in the meantime.

    To avoid issues, we hold the socket lock when we enter
    vsock_shutdown() and release it when we leave.

    Among the transports that implement the 'shutdown' callback, only
    hyperv_transport acquired the lock. Since the caller now holds it,
    we no longer take it.

    Fixes: d021c344051a ("VSOCK: Introduce VM Sockets")
    Signed-off-by: Stefano Garzarella
    Signed-off-by: David S. Miller

    Stefano Garzarella
     

09 Feb, 2021

5 commits

  • The function br_mrp_port_switchdev_set_state was called both with MRP
    port state and STP port state, which is an issue because they don't
    match exactly.

    Therefore, update the function to be used only with STP port state and
    use the id SWITCHDEV_ATTR_ID_PORT_STP_STATE.

    STP is chosen over MRP because the drivers already implement
    SWITCHDEV_ATTR_ID_PORT_STP_STATE and the port STP state is already
    updated in SW.

    Fixes: 9a9f26e8f7ea30 ("bridge: mrp: Connect MRP API with the switchdev API")
    Fixes: fadd409136f0f2 ("bridge: switchdev: mrp: Implement MRP API for switchdev")
    Fixes: 2f1a11ae11d222 ("bridge: mrp: Add MRP interface.")
    Reported-by: Rasmus Villemoes
    Signed-off-by: Horatiu Vultur
    Signed-off-by: David S. Miller

    Horatiu Vultur
     
  • Restore the original behaviour where users are allowed to add an element
    with any stateful expression if the set definition specifies no stateful
    expressions. Make sure upper maximum number of stateful expressions of
    NFT_SET_EXPR_MAX is not reached.

    Fixes: 8cfd9b0f8515 ("netfilter: nftables: generalize set expressions support")
    Fixes: 48b0ae046ee9 ("netfilter: nftables: netlink support for several set element expressions")
    Signed-off-by: Pablo Neira Ayuso

    Pablo Neira Ayuso
     
  • The origin skip check needs to re-test the zone. Else, we might skip
    a colliding tuple in the reply direction.

    This only occurs when using 'directional zones' where origin tuples
    reside in different zones but the reply tuples share the same zone.

    This causes the new conntrack entry to be dropped at confirmation time
    because NAT clash resolution was elided.

    Fixes: 4e35c1cb9460240 ("netfilter: nf_nat: skip nat clash resolution for same-origin entries")
    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     
  • If the socket is closed or is being released, some resources used by
    virtio_transport_space_update() such as 'vsk->trans' may be released.

    To avoid a use after free bug we should only update the available credit
    when we are sure the socket is still open and we have the lock held.

    Fixes: 06a8fc78367d ("VSOCK: Introduce virtio_vsock_common.ko")
    Signed-off-by: Stefano Garzarella
    Acked-by: Michael S. Tsirkin
    Link: https://lore.kernel.org/r/20210208144454.84438-1-sgarzare@redhat.com
    Signed-off-by: Jakub Kicinski

    Stefano Garzarella
     
  • The sctp transport seq_file iterators take a reference to the transport
    in the ->start and ->next functions and release the reference in the
    ->show function. The preferred handling for such resources is to
    release them in the subsequent ->next or ->stop function call.

    Since Commit 1f4aace60b0e ("fs/seq_file.c: simplify seq_file iteration
    code and interface") there is no guarantee that ->show will be called
    after ->next, so this function can now leak references.

    So move the sctp_transport_put() call to ->next and ->stop.
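
The reference handling described above can be sketched in plain C. This is a minimal userspace model, not the sctp code: struct item, struct iter and every function name below are hypothetical stand-ins, and the refcount is a bare integer. The point it illustrates is that when ->next and ->stop drop the reference, a walk that never reaches a ->show step still leaves no reference behind.

```c
#include <stddef.h>

/* Hypothetical stand-in for a refcounted transport. */
struct item {
    int refs;
};

/* Hypothetical iterator state; 'held' is the entry currently referenced. */
struct iter {
    struct item *items;
    size_t count;
    size_t pos;
    struct item *held;
};

static void item_get(struct item *it) { it->refs++; }
static void item_put(struct item *it) { it->refs--; }

/* Drop whatever reference the iterator still holds. */
static void iter_release(struct iter *s)
{
    if (s->held) {
        item_put(s->held);
        s->held = NULL;
    }
}

/* ->start: take a reference on the entry at the current position. */
struct item *iter_start(struct iter *s)
{
    if (s->pos >= s->count)
        return NULL;
    s->held = &s->items[s->pos];
    item_get(s->held);
    return s->held;
}

/* ->next: release the previous entry, then reference the following one. */
struct item *iter_next(struct iter *s)
{
    iter_release(s);
    s->pos++;
    return iter_start(s);
}

/* ->stop: release the entry left over from ->start or ->next. */
void iter_stop(struct iter *s)
{
    iter_release(s);
}

/* Walk every entry without any ->show step; return the references left. */
int walk_without_show(struct item *items, size_t count)
{
    struct iter s = { items, count, 0, NULL };
    struct item *it;
    int leaked = 0;
    size_t i;

    for (it = iter_start(&s); it; it = iter_next(&s))
        ;
    iter_stop(&s);
    for (i = 0; i < count; i++)
        leaked += items[i].refs;
    return leaked;
}
```

Because no release depends on a ->show call, the leftover count does not change if the caller skips ->show for some or all entries.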

    Fixes: 1f4aace60b0e ("fs/seq_file.c: simplify seq_file iteration code and interface")
    Reported-by: Xin Long
    Signed-off-by: NeilBrown
    Acked-by: Marcelo Ricardo Leitner
    Signed-off-by: Jakub Kicinski

    NeilBrown
     

07 Feb, 2021

3 commits

  • A possible locking issue in vsock_connect_timeout() was recognized by
    Eric Dumazet which might cause a null pointer dereference in
    vsock_transport_cancel_pkt(). This patch ensures that
    vsock_transport_cancel_pkt() is called with the lock held, so that a
    race which could leave vsk->transport set to NULL cannot occur.

    Fixes: 380feae0def7 ("vsock: cancel packets when failing to connect")
    Reported-by: Eric Dumazet
    Signed-off-by: Norbert Slusarek
    Reviewed-by: Stefano Garzarella
    Link: https://lore.kernel.org/r/trinity-f8e0937a-cf0e-4d80-a76e-d9a958ba3ef1-1612535522360@3c-app-gmx-bap12
    Signed-off-by: Jakub Kicinski

    Norbert Slusarek
     
  • In vsock_stream_connect(), a thread will enter schedule_timeout().
    While being scheduled out, another thread can enter vsock_stream_connect()
    as well and set vsk->transport to NULL. In case a signal was sent, the
    first thread can leave schedule_timeout() and vsock_transport_cancel_pkt()
    will be called right after. Inside vsock_transport_cancel_pkt(), a null
    dereference will happen on transport->cancel_pkt.

    Fixes: c0cfa2d8a788 ("vsock: add multi-transports support")
    Signed-off-by: Norbert Slusarek
    Reviewed-by: Stefano Garzarella
    Link: https://lore.kernel.org/r/trinity-c2d6cede-bfb1-44e2-85af-1fbc7f541715-1612535117028@3c-app-gmx-bap12
    Signed-off-by: Jakub Kicinski

    Norbert Slusarek
     
  • …rnel/git/kvalo/wireless-drivers

    Kalle Valo says:

    ====================
    wireless-drivers fixes for v5.11

    Third, and most likely the last, set of fixes for v5.11. Two very
    small fixes.

    ath9k
    * fix build regression related to LEDS_CLASS

    mt76
    * fix a memory leak

    * tag 'wireless-drivers-2021-02-05' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers:
    mt76: dma: fix a possible memory leak in mt76_add_fragment()
    ath9k: fix build error with LEDS_CLASS=m
    ====================

    Link: https://lore.kernel.org/r/20210205163434.14D94C433ED@smtp.codeaurora.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

    Jakub Kicinski
     

06 Feb, 2021

2 commits

  • Commit c80794323e82 ("net: Fix packet reordering caused by GRO and
    listified RX cooperation") had the unfortunate effect of adding
    latencies in common workloads.

    Before the patch, GRO packets were immediately passed to
    upper stacks.

    After the patch, we can accumulate quite a lot of GRO
    packets (depending on NAPI budget).

    My fix is counting in napi->rx_count number of segments
    instead of number of logical packets.
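
As a rough illustration of counting segments rather than logical packets, here is a small userspace model. It is not the kernel's NAPI code: struct rx_batch, its limit field and rx_batch_add() are hypothetical names, and the flush is reduced to a counter.

```c
/* Hypothetical miniature of a per-NAPI receive batch. */
struct rx_batch {
    unsigned int rx_count;   /* segments queued since the last flush */
    unsigned int limit;      /* flush threshold (assumed value, set by caller) */
    unsigned int flushes;    /* how many times the batch went up the stack */
};

/* Queue one logical packet made of 'segs' segments; flush at the limit. */
void rx_batch_add(struct rx_batch *b, unsigned int segs)
{
    b->rx_count += segs;     /* count segments, not logical packets */
    if (b->rx_count >= b->limit) {
        b->flushes++;
        b->rx_count = 0;
    }
}
```

With a limit of 8, a single aggregated packet carrying 16 segments triggers a flush on its own, whereas counting it as one packet would let several such packets pile up first.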

    Fixes: c80794323e82 ("net: Fix packet reordering caused by GRO and listified RX cooperation")
    Signed-off-by: Eric Dumazet
    Bisected-by: John Sperbeck
    Tested-by: Jian Yang
    Cc: Maxim Mikityanskiy
    Reviewed-by: Saeed Mahameed
    Reviewed-by: Edward Cree
    Reviewed-by: Alexander Lobakin
    Link: https://lore.kernel.org/r/20210204213146.4192368-1-eric.dumazet@gmail.com
    Signed-off-by: Jakub Kicinski

    Eric Dumazet
     
  • Pull nfsd fix from Chuck Lever:
    "Fix non-page-aligned NFS READs"

    * tag 'nfsd-5.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
    SUNRPC: Fix NFS READs that start at non-page-aligned offsets

    Linus Torvalds
     

05 Feb, 2021

4 commits

  • Pablo Neira Ayuso says:

    ====================
    Netfilter fixes for net

    1) Fix combination of --reap and --update in xt_recent that triggers
    UAF, from Jozsef Kadlecsik.

    2) Fix current year in nft_meta selftest, from Fabian Frederick.

    3) Fix possible UAF in the netns destroy path of nftables.

    4) Fix incorrect checksum calculation when mangling ports in flowtable,
    from Sven Auhagen.

    * git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf:
    netfilter: flowtable: fix tcp and udp header checksum update
    netfilter: nftables: fix possible UAF over chains from packet path in netns
    selftests: netfilter: fix current year
    netfilter: xt_recent: Fix attempt to update deleted entry
    ====================

    Link: https://lore.kernel.org/r/20210205001727.2125-1-pablo@netfilter.org
    Signed-off-by: Jakub Kicinski

    Jakub Kicinski
     
  • Since teardown is supposed to undo the effects of the setup method, it
    should be called in the error path for dsa_switch_setup, not just in
    dsa_switch_teardown.
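
The error-path rule above can be shown with a generic setup function. This is a sketch under assumptions: struct sw_ops, switch_setup() and the demo_* helpers are invented for illustration and do not mirror the DSA structures.

```c
#include <stddef.h>

/* Hypothetical driver callbacks. */
struct sw_ops {
    int (*setup)(void *priv);
    void (*teardown)(void *priv);
    int (*register_bus)(void *priv);   /* a later step that may fail */
};

/* Run setup, then a later step; undo setup if the later step fails. */
int switch_setup(const struct sw_ops *ops, void *priv)
{
    int err;

    err = ops->setup(priv);
    if (err < 0)
        return err;

    err = ops->register_bus(priv);
    if (err < 0)
        goto teardown;

    return 0;

teardown:
    if (ops->teardown)
        ops->teardown(priv);
    return err;
}

/* Demo state and callbacks used to observe the behaviour. */
struct demo_state { int resources; };

static int demo_setup(void *priv)
{
    ((struct demo_state *)priv)->resources = 1;
    return 0;
}

static void demo_teardown(void *priv)
{
    ((struct demo_state *)priv)->resources = 0;
}

static int demo_bus_ok(void *priv) { (void)priv; return 0; }
static int demo_bus_fail(void *priv) { (void)priv; return -1; }

/* Resources still held after switch_setup(); 0 after a failure means no leak. */
int demo_resources_after_setup(int fail_late)
{
    struct demo_state st = { 0 };
    struct sw_ops ops = {
        demo_setup, demo_teardown,
        fail_late ? demo_bus_fail : demo_bus_ok
    };

    switch_setup(&ops, &st);
    return st.resources;
}
```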

    Fixes: 5e3f847a02aa ("net: dsa: Add teardown callback for drivers")
    Signed-off-by: Vladimir Oltean
    Reviewed-by: Andrew Lunn
    Reviewed-by: Florian Fainelli
    Link: https://lore.kernel.org/r/20210204163351.2929670-1-vladimir.oltean@nxp.com
    Signed-off-by: Jakub Kicinski

    Vladimir Oltean
     
  • When iteratively computing a checksum with csum_block_add, track the
    offset "pos" to correctly rotate in csum_block_add when offset is odd.

    The open coded implementation of skb_copy_and_csum_datagram did this.
    With the switch to __skb_datagram_iter calling csum_and_copy_to_iter,
    pos was reinitialized to 0 on each call.

    Bring back the pos by passing it along with the csum to the callback.
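
To see why the running offset matters, the following sketch computes a 16-bit ones'-complement checksum in chunks. It is a simplified userspace model with made-up names (csum_partial16, csum_block_add16, csum_chunked), working on folded 16-bit sums in big-endian word order rather than on the kernel's 32-bit __wsum. A block that starts at an odd byte offset has its bytes in the opposite halves of each 16-bit word, so its partial sum is byte-swapped before being added.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold an accumulator into 16 bits with end-around carry. */
static uint16_t fold(uint64_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* Ones'-complement sum of a buffer, as if it started at an even offset. */
uint16_t csum_partial16(const uint8_t *buf, size_t len)
{
    uint64_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)buf[i] << 8 | buf[i + 1];
    if (len & 1)
        sum += (uint32_t)buf[len - 1] << 8;   /* pad the odd tail with zero */
    return fold(sum);
}

/* Add the sum of a block that starts at byte 'offset' of the whole message. */
uint16_t csum_block_add16(uint16_t csum, uint16_t block, size_t offset)
{
    if (offset & 1)   /* odd start: the block's bytes sit in swapped halves */
        block = (uint16_t)((block << 8) | (block >> 8));
    return fold((uint64_t)csum + block);
}

/* Checksum 'buf' in pieces of 'chunk' bytes, carrying the position along. */
uint16_t csum_chunked(const uint8_t *buf, size_t len, size_t chunk)
{
    uint16_t csum = 0;
    size_t pos = 0;

    if (chunk == 0)
        chunk = len;
    while (pos < len) {
        size_t n = len - pos < chunk ? len - pos : chunk;

        csum = csum_block_add16(csum, csum_partial16(buf + pos, n), pos);
        pos += n;   /* resetting this to 0 per chunk would break odd sizes */
    }
    return csum;
}
```

With chunk sizes such as 1 or 3, every other block starts at an odd offset; the chunked result matches the single-pass sum only because pos is carried from one block to the next.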

    Changes v1->v2
    - pass csum value, instead of csump pointer (Alexander Duyck)

    Link: https://lore.kernel.org/netdev/20210128152353.GB27281@optiplex/
    Fixes: 950fcaecd5cc ("datagram: consolidate datagram copy to iter helpers")
    Reported-by: Oliver Graute
    Signed-off-by: Willem de Bruijn
    Reviewed-by: Alexander Duyck
    Reviewed-by: Eric Dumazet
    Link: https://lore.kernel.org/r/20210203192952.1849843-1-willemdebruijn.kernel@gmail.com
    Signed-off-by: Jakub Kicinski

    Willem de Bruijn
     
  • At the end of rxrpc_release_call(), rxrpc_cleanup_ring() is called to clear
    the Rx/Tx skbuff ring, but this doesn't lock the ring whilst it's accessing
    it. Unfortunately, rxrpc_resend() might be trying to retransmit a packet
    concurrently with this - and whilst it does lock the ring, this isn't
    protection against rxrpc_cleanup_call().

    Fix this by removing the call to rxrpc_cleanup_ring() from
    rxrpc_release_call(). rxrpc_cleanup_ring() will be called again anyway
    from rxrpc_cleanup_call(). The earlier call is just an optimisation to
    recycle skbuffs more quickly.

    Alternative solutions: rxrpc_release_call() could try to cancel the
    work item or wait for it to complete, or rxrpc_cleanup_ring() could lock
    when accessing the ring (which would require a bh lock).

    This can produce a report like the following:

    BUG: KASAN: use-after-free in rxrpc_send_data_packet+0x19b4/0x1e70 net/rxrpc/output.c:372
    Read of size 4 at addr ffff888011606e04 by task kworker/0:0/5
    ...
    Workqueue: krxrpcd rxrpc_process_call
    Call Trace:
    ...
    kasan_report.cold+0x79/0xd5 mm/kasan/report.c:413
    rxrpc_send_data_packet+0x19b4/0x1e70 net/rxrpc/output.c:372
    rxrpc_resend net/rxrpc/call_event.c:266 [inline]
    rxrpc_process_call+0x1634/0x1f60 net/rxrpc/call_event.c:412
    process_one_work+0x98d/0x15f0 kernel/workqueue.c:2275
    ...

    Allocated by task 2318:
    ...
    sock_alloc_send_pskb+0x793/0x920 net/core/sock.c:2348
    rxrpc_send_data+0xb51/0x2bf0 net/rxrpc/sendmsg.c:358
    rxrpc_do_sendmsg+0xc03/0x1350 net/rxrpc/sendmsg.c:744
    rxrpc_sendmsg+0x420/0x630 net/rxrpc/af_rxrpc.c:560
    ...

    Freed by task 2318:
    ...
    kfree_skb+0x140/0x3f0 net/core/skbuff.c:704
    rxrpc_free_skb+0x11d/0x150 net/rxrpc/skbuff.c:78
    rxrpc_cleanup_ring net/rxrpc/call_object.c:485 [inline]
    rxrpc_release_call+0x5dd/0x860 net/rxrpc/call_object.c:552
    rxrpc_release_calls_on_socket+0x21c/0x300 net/rxrpc/call_object.c:579
    rxrpc_release_sock net/rxrpc/af_rxrpc.c:885 [inline]
    rxrpc_release+0x263/0x5a0 net/rxrpc/af_rxrpc.c:916
    __sock_release+0xcd/0x280 net/socket.c:597
    ...

    The buggy address belongs to the object at ffff888011606dc0
    which belongs to the cache skbuff_head_cache of size 232

    Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code")
    Reported-by: syzbot+174de899852504e4a74a@syzkaller.appspotmail.com
    Reported-by: syzbot+3d1c772efafd3c38d007@syzkaller.appspotmail.com
    Signed-off-by: David Howells
    cc: Hillf Danton
    Link: https://lore.kernel.org/r/161234207610.653119.5287360098400436976.stgit@warthog.procyon.org.uk
    Signed-off-by: Jakub Kicinski

    David Howells
     

04 Feb, 2021

4 commits

  • syzbot found WARNING in qrtr_tun_write_iter [1] when write_iter length
    exceeds KMALLOC_MAX_SIZE causing order >= MAX_ORDER condition.

    Additionally, there is no check for 0 length write.

    [1]
    WARNING: mm/page_alloc.c:5011
    [..]
    Call Trace:
    alloc_pages_current+0x18c/0x2a0 mm/mempolicy.c:2267
    alloc_pages include/linux/gfp.h:547 [inline]
    kmalloc_order+0x2e/0xb0 mm/slab_common.c:837
    kmalloc_order_trace+0x14/0x120 mm/slab_common.c:853
    kmalloc include/linux/slab.h:557 [inline]
    kzalloc include/linux/slab.h:682 [inline]
    qrtr_tun_write_iter+0x8a/0x180 net/qrtr/tun.c:83
    call_write_iter include/linux/fs.h:1901 [inline]
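
The kind of check this calls for can be sketched in userspace C. Everything here is illustrative: alloc_write_buf() is a hypothetical helper, and DEMO_MAX_WRITE is an assumed cap standing in for the kernel's KMALLOC_MAX_SIZE, whose actual value depends on the configuration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Assumed cap for illustration only; not the kernel's KMALLOC_MAX_SIZE. */
#define DEMO_MAX_WRITE ((size_t)4 * 1024 * 1024)

/* Return a zeroed buffer, or NULL if len is zero or exceeds the cap. */
void *alloc_write_buf(size_t len)
{
    if (len == 0 || len > DEMO_MAX_WRITE)
        return NULL;   /* reject before the allocator ever sees the size */
    return calloc(1, len);
}
```

Rejecting the length up front keeps a caller-chosen size from reaching the allocator, which is where the warning in the trace above was raised.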

    Reported-by: syzbot+c2a7e5c5211605a90865@syzkaller.appspotmail.com
    Signed-off-by: Sabyrzhan Tasbolatov
    Link: https://lore.kernel.org/r/20210202092059.1361381-1-snovitoll@gmail.com
    Signed-off-by: Jakub Kicinski

    Sabyrzhan Tasbolatov
     
  • When updating the tcp or udp header checksum on port nat the function
    inet_proto_csum_replace2 is called with the last parameter pseudohdr as true.
    This leads to an error in the case that GRO is used and packets are
    split up in GSO. The tcp or udp checksum of all packets is incorrect.

    The error is probably masked due to the fact that most network drivers
    implement tcp/udp checksum offloading. It also only happens when GRO is
    applied and not on single packets.

    The error is most visible when using a pppoe connection which is not
    triggering the tcp/udp checksum offload.

    Fixes: ac2a66665e23 ("netfilter: add generic flow table infrastructure")
    Signed-off-by: Sven Auhagen
    Signed-off-by: Pablo Neira Ayuso

    Sven Auhagen
     
  • Although hooks are released via call_rcu(), chain and rule objects are
    immediately released while packets are still walking over these bits.

    This patch adds the .pre_exit callback which is invoked before
    synchronize_rcu() in the netns framework to stay safe.

    Remove a comment which is not valid anymore since the core does not use
    synchronize_net() anymore since 8c873e219970 ("netfilter: core: free
    hooks with call_rcu").

    Suggested-by: Florian Westphal
    Fixes: df05ef874b28 ("netfilter: nf_tables: release objects on netns destruction")
    Signed-off-by: Pablo Neira Ayuso

    Pablo Neira Ayuso
     
  • When both --reap and --update flag are specified, there's a code
    path at which the entry to be updated is reaped beforehand,
    which then leads to kernel crash. Reap only entries which won't be
    updated.

    Fixes kernel bugzilla #207773.

    Link: https://bugzilla.kernel.org/show_bug.cgi?id=207773
    Reported-by: Reindl Harald
    Fixes: 0079c5aee348 ("netfilter: xt_recent: add an entry reaper")
    Signed-off-by: Jozsef Kadlecsik
    Signed-off-by: Pablo Neira Ayuso

    Jozsef Kadlecsik
     

03 Feb, 2021

5 commits

  • Pull networking fixes from Jakub Kicinski:
    "Networking fixes for 5.11-rc7, including fixes from bpf and mac80211
    trees.

    Current release - regressions:

    - ip_tunnel: fix mtu calculation

    - mlx5: fix function calculation for page trees

    Previous releases - regressions:

    - vsock: fix the race conditions in multi-transport support

    - neighbour: prevent a dead entry from updating gc_list

    - dsa: mv88e6xxx: override existent unicast portvec in port_fdb_add

    Previous releases - always broken:

    - bpf, cgroup: two copy_{from,to}_user() warn_on_once splats for BPF
    cgroup getsockopt infra when user space is trying to race against
    optlen, from Loris Reiff.

    - bpf: add missing fput() in BPF inode storage map update helper

    - udp: ipv4: manipulate network header of NATed UDP GRO fraglist

    - mac80211: fix station rate table updates on assoc

    - r8169: work around RTL8125 UDP HW bug

    - igc: report speed and duplex as unknown when device is runtime
    suspended

    - rxrpc: fix deadlock around release of dst cached on udp tunnel"

    * tag 'net-5.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (36 commits)
    net: hsr: align sup_multicast_addr in struct hsr_priv to u16 boundary
    net: ipa: fix two format specifier errors
    net: ipa: use the right accessor in ipa_endpoint_status_skip()
    net: ipa: be explicit about endianness
    net: ipa: add a missing __iomem attribute
    net: ipa: pass correct dma_handle to dma_free_coherent()
    r8169: fix WoL on shutdown if CONFIG_DEBUG_SHIRQ is set
    net/rds: restrict iovecs length for RDS_CMSG_RDMA_ARGS
    net: mvpp2: TCAM entry enable should be written after SRAM data
    net: lapb: Copy the skb before sending a packet
    net/mlx5e: Release skb in case of failure in tc update skb
    net/mlx5e: Update max_opened_tc also when channels are closed
    net/mlx5: Fix leak upon failure of rule creation
    net/mlx5: Fix function calculation for page trees
    docs: networking: swap words in icmp_errors_use_inbound_ifaddr doc
    udp: ipv4: manipulate network header of NATed UDP GRO fraglist
    net: ip_tunnel: fix mtu calculation
    vsock: fix the race conditions in multi-transport support
    net: sched: replaced invalid qdisc tree flush helper in qdisc_replace
    ibmvnic: device remove has higher precedence over reset
    ...

    Linus Torvalds
     
  • sup_multicast_addr is passed to ether_addr_equal for address comparison
    which casts the address inputs to u16 leading to an unaligned access.
    Aligning the sup_multicast_addr to u16 boundary fixes the issue.
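
The effect of such an alignment request can be checked with a small C11 sketch. The two structs are hypothetical and much smaller than struct hsr_priv, and the sketch uses the standard _Alignas specifier rather than whatever attribute the kernel patch uses.

```c
#include <stddef.h>
#include <stdint.h>

/* Layout where the address may land on an odd offset. */
struct priv_unaligned {
    uint8_t flag;
    uint8_t sup_multicast_addr[6];
};

/* Same fields, with the address forced onto a u16 boundary. */
struct priv_aligned {
    uint8_t flag;
    _Alignas(uint16_t) uint8_t sup_multicast_addr[6];
};

/* True if the pointer is suitably aligned for 16-bit loads. */
int addr_is_u16_aligned(const void *p)
{
    return (uintptr_t)p % _Alignof(uint16_t) == 0;
}
```

On common ABIs, offsetof() reports offset 1 for the first layout and offset 2 for the second, so only the second is safe to read through 16-bit accesses on architectures that fault on unaligned loads.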

    Signed-off-by: Andreas Oetken
    Link: https://lore.kernel.org/r/20210202090304.2740471-1-ennoerlangen@gmail.com
    Signed-off-by: Jakub Kicinski

    Andreas Oetken
     
  • syzbot found WARNING in rds_rdma_extra_size [1] when RDS_CMSG_RDMA_ARGS
    control message is passed with user-controlled
    0x40001 bytes of args->nr_local, causing order >= MAX_ORDER condition.

    The exact value 0x40001 can be checked with UIO_MAXIOV which is 0x400.
    So for kcalloc() 0x400 iovecs with sizeof(struct rds_iovec) = 0x10
    is the closest limit, with 0x10 leftover.

    Same condition is currently done in rds_cmsg_rdma_args().
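
The arithmetic above can be sketched as a bounded size computation. This is an illustration, not the rds code: DEMO_UIO_MAXIOV repeats the 0x400 value quoted in the text, struct demo_iovec is a hypothetical 0x10-byte element, and returning 0 for a rejected count is a simplification of returning an error code.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_UIO_MAXIOV 0x400u   /* limit quoted in the text above */

/* Two 64-bit fields, 0x10 bytes, matching the element size quoted above. */
struct demo_iovec {
    uint64_t addr;
    uint64_t bytes;
};

/* Return the array size in bytes, or 0 when the count is out of range. */
size_t iovec_array_size(uint64_t nr_local)
{
    if (nr_local == 0 || nr_local > DEMO_UIO_MAXIOV)
        return 0;
    return (size_t)nr_local * sizeof(struct demo_iovec);
}
```

At the limit the array is 0x400 * 0x10 = 0x4000 bytes, while the count 0x40001 from the report is rejected before any multiplication takes place.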

    [1] WARNING: mm/page_alloc.c:5011
    [..]
    Call Trace:
    alloc_pages_current+0x18c/0x2a0 mm/mempolicy.c:2267
    alloc_pages include/linux/gfp.h:547 [inline]
    kmalloc_order+0x2e/0xb0 mm/slab_common.c:837
    kmalloc_order_trace+0x14/0x120 mm/slab_common.c:853
    kmalloc_array include/linux/slab.h:592 [inline]
    kcalloc include/linux/slab.h:621 [inline]
    rds_rdma_extra_size+0xb2/0x3b0 net/rds/rdma.c:568
    rds_rm_size net/rds/send.c:928 [inline]

    Reported-by: syzbot+1bd2b07f93745fa38425@syzkaller.appspotmail.com
    Signed-off-by: Sabyrzhan Tasbolatov
    Acked-by: Santosh Shilimkar
    Link: https://lore.kernel.org/r/20210201203233.1324704-1-snovitoll@gmail.com
    Signed-off-by: Jakub Kicinski

    Sabyrzhan Tasbolatov
     
  • When sending a packet, we will prepend it with an LAPB header.
    This modifies the shared parts of a cloned skb, so we should copy the
    skb rather than just clone it, before we prepend the header.

    In "Documentation/networking/driver.rst" (the 2nd point), it states
    that drivers shouldn't modify the shared parts of a cloned skb when
    transmitting.

    The "dev_queue_xmit_nit" function in "net/core/dev.c", which is called
    when an skb is being sent, clones the skb and sends the clone to
    AF_PACKET sockets. Because the LAPB drivers first remove a 1-byte
    pseudo-header before handing over the skb to us, if we don't copy the
    skb before prepending the LAPB header, the first byte of the packets
    received on AF_PACKET sockets can be corrupted.
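
The difference between a clone and a copy can be modelled in a few lines of userspace C. This is only a sketch: struct pkt and the pkt_* helpers are hypothetical and far simpler than an skb, and the header byte 0x7e is an arbitrary placeholder value.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical packet descriptor. */
struct pkt {
    unsigned char *storage;  /* backing bytes, possibly shared with a clone */
    size_t off;              /* offset of the first valid byte */
    size_t len;              /* number of valid bytes */
};

/* Clone: new descriptor, same backing bytes. */
struct pkt pkt_clone(const struct pkt *p)
{
    return *p;
}

/* Copy: new descriptor with private backing bytes (NULL storage on failure). */
struct pkt pkt_copy(const struct pkt *p)
{
    struct pkt c = *p;

    c.storage = malloc(p->off + p->len);
    if (c.storage)
        memcpy(c.storage, p->storage, p->off + p->len);
    return c;
}

/* Remove one leading byte (the pseudo-header). */
void pkt_pull(struct pkt *p) { p->off++; p->len--; }

/* Prepend a one-byte header, writing into the backing bytes. */
void pkt_push(struct pkt *p, unsigned char hdr)
{
    p->off--;
    p->len++;
    p->storage[p->off] = hdr;
}

/* First byte a tap's clone sees after the sender rewrites the header. */
unsigned char tap_first_byte(int use_copy)
{
    unsigned char bytes[4] = { 0x00, 0xaa, 0xbb, 0xcc };  /* 0x00: pseudo-header */
    struct pkt orig = { bytes, 0, sizeof(bytes) };
    struct pkt tap = pkt_clone(&orig);          /* what a packet tap holds */
    struct pkt tx = use_copy ? pkt_copy(&orig) : orig;
    unsigned char seen;

    if (!tx.storage)
        return tap.storage[tap.off];            /* allocation failed: untouched */
    pkt_pull(&tx);                              /* driver strips the pseudo-header */
    pkt_push(&tx, 0x7e);                        /* placeholder header byte */
    seen = tap.storage[tap.off];
    if (use_copy)
        free(tx.storage);
    return seen;
}
```

When the sender works on the shared bytes, the tap's first byte changes underneath it; when the sender works on a private copy, the tap still sees the original pseudo-header byte.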

    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Xie He
    Acked-by: Martin Schiller
    Link: https://lore.kernel.org/r/20210201055706.415842-1-xie.he.0141@gmail.com
    Signed-off-by: Jakub Kicinski

    Xie He
     
  • …rnel/git/jberg/mac80211

    Johannes Berg says:

    ====================
    Two fixes:
    - station rate tables were not updated correctly
    after association, leading to bad configuration
    - rtl8723bs (staging) was initializing data incorrectly
    after the previous fix and needed to move the init
    later

    * tag 'mac80211-for-net-2021-02-02' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211:
    staging: rtl8723bs: Move wiphy setup to after reading the regulatory settings from the chip
    mac80211: fix station rate table updates on assoc
    ====================

    Link: https://lore.kernel.org/r/20210202143505.37610-1-johannes@sipsolutions.net
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

    Jakub Kicinski
     

02 Feb, 2021

3 commits

  • UDP/IP headers of UDP GROed frag_skbs are not updated even after NAT
    forwarding. Only the header of head_skb from ip_finish_output_gso ->
    skb_gso_segment is updated but following frag_skbs are not updated.

    A call path skb_mac_gso_segment -> inet_gso_segment ->
    udp4_ufo_fragment -> __udp_gso_segment -> __udp_gso_segment_list
    does not try to update UDP/IP header of the segment list but copy
    only the MAC header.

    Update port, addr and check of each skb of the segment list in
    __udp_gso_segment_list. It covers both SNAT and DNAT.

    Fixes: 9fd1ff5d2ac7 (udp: Support UDP fraglist GRO/GSO.)
    Signed-off-by: Dongseok Yi
    Acked-by: Steffen Klassert
    Link: https://lore.kernel.org/r/1611962007-80092-1-git-send-email-dseok.yi@samsung.com
    Signed-off-by: Jakub Kicinski

    Dongseok Yi
     
  • dev->hard_header_len for tunnel interface is set only when header_ops
    are set too and already contains full overhead of any tunnel encapsulation.
    That's why there is no need to use this overhead twice in mtu calc.

    Fixes: fdafed459998 ("ip_gre: set dev->hard_header_len and dev->needed_headroom properly")
    Reported-by: Slava Bacherikov
    Signed-off-by: Vadim Fedorenko
    Link: https://lore.kernel.org/r/1611959267-20536-1-git-send-email-vfedorenko@novek.ru
    Signed-off-by: Jakub Kicinski

    Vadim Fedorenko
     
  • There are multiple similar bugs implicitly introduced by the
    commit c0cfa2d8a788fcf4 ("vsock: add multi-transports support") and
    commit 6a2c0962105ae8ce ("vsock: prevent transport modules unloading").

    The bug pattern:
    [1] vsock_sock.transport pointer is copied to a local variable,
    [2] lock_sock() is called,
    [3] the local variable is used.
    VSOCK multi-transport support introduced the race condition:
    vsock_sock.transport value may change between [1] and [2].

    Let's copy vsock_sock.transport pointer to local variables after
    the lock_sock() call.
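
The corrected ordering can be sketched with a toy lock. This is a userspace model only: struct vsock_like, struct transport and the spinning demo_lock() built on C11 atomic_flag are assumptions standing in for vsock_sock and lock_sock().

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical transport with a single field to read. */
struct transport {
    int id;
};

/* Hypothetical socket: the transport pointer may change while unlocked. */
struct vsock_like {
    atomic_flag lock;
    const struct transport *transport;
};

static void demo_lock(struct vsock_like *v)
{
    while (atomic_flag_test_and_set_explicit(&v->lock, memory_order_acquire))
        ;   /* spin until the flag is released */
}

static void demo_unlock(struct vsock_like *v)
{
    atomic_flag_clear_explicit(&v->lock, memory_order_release);
}

/* Read the transport pointer only after taking the lock. */
int transport_id_locked(struct vsock_like *vsk)
{
    const struct transport *t;
    int id;

    demo_lock(vsk);
    t = vsk->transport;      /* copied under the lock, not before it */
    id = t ? t->id : -1;
    demo_unlock(vsk);
    return id;
}
```

Copying vsk->transport into t before demo_lock() would reintroduce the window between steps [1] and [2] described above.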

    Fixes: c0cfa2d8a788fcf4 ("vsock: add multi-transports support")
    Signed-off-by: Alexander Popov
    Reviewed-by: Stefano Garzarella
    Reviewed-by: Jorgen Hansen
    Link: https://lore.kernel.org/r/20210201084719.2257066-1-alex.popov@linux.com
    Signed-off-by: Jakub Kicinski

    Alexander Popov
     

01 Feb, 2021

3 commits

  • Anj Duvnjak reports that the Kodi.tv NFS client is not able to read
    video files from a v5.10.11 Linux NFS server.

    The new sendpage-based TCP sendto logic was not attentive to non-
    zero page_base values. nfsd_splice_read() sets that field when a
    READ payload starts in the middle of a page.

    The Linux NFS client rarely emits an NFS READ that is not page-
    aligned. All of my testing so far has been with Linux clients, so I
    missed this one.

    Reported-by: A. Duvnjak
    BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=211471
    Fixes: 4a85a6a3320b ("SUNRPC: Handle TCP socket sends with kernel_sendpage() again")
    Signed-off-by: Chuck Lever
    Tested-by: A. Duvnjak

    Chuck Lever
     
  • If the driver uses .sta_add, station entries are only uploaded after the sta
    is in assoc state. Fix early station rate table updates by deferring them
    until the sta has been uploaded.

    Cc: stable@vger.kernel.org
    Signed-off-by: Felix Fietkau
    Link: https://lore.kernel.org/r/20210201083324.3134-1-nbd@nbd.name
    [use rcu_access_pointer() instead since we won't dereference here]
    Signed-off-by: Johannes Berg

    Felix Fietkau
     
  • Pull NFS client fixes from Trond Myklebust:

    - SUNRPC: Handle 0 length opaque XDR object data properly

    - Fix a layout segment leak in pnfs_layout_process()

    - pNFS/NFSv4: Update the layout barrier when we schedule a layoutreturn

    - pNFS/NFSv4: Improve rejection of out-of-order layouts

    - pNFS/NFSv4: Try to return invalid layout in pnfs_layout_process()

    * tag 'nfs-for-5.11-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
    SUNRPC: Handle 0 length opaque XDR object data properly
    SUNRPC: Move simple_get_bytes and simple_get_netobj into private header
    pNFS/NFSv4: Improve rejection of out-of-order layouts
    pNFS/NFSv4: Update the layout barrier when we schedule a layoutreturn
    pNFS/NFSv4: Try to return invalid layout in pnfs_layout_process()
    pNFS/NFSv4: Fix a layout segment leak in pnfs_layout_process()

    Linus Torvalds
     

31 Jan, 2021

1 commit

  • The following race condition was detected:
    - neigh_flush_dev() is under execution and calls
    neigh_mark_dead(n) marking the neighbour entry 'n' as dead.

    - Executing: __netif_receive_skb() ->
    __netif_receive_skb_core() -> arp_rcv() -> arp_process(). arp_process()
    calls __neigh_lookup() which takes a reference on neighbour entry 'n'.

    - Moves further along neigh_flush_dev() and calls
    neigh_cleanup_and_release(n), but since reference count increased in t2,
    'n' couldn't be destroyed.

    - Moves further along, arp_process() and calls
    neigh_update() -> __neigh_update() -> neigh_update_gc_list(), which adds
    the neighbour entry back in gc_list (neigh_mark_dead() removed it
    earlier in t0 from gc_list).

    - arp_process() finally calls neigh_release(n), destroying
    the neighbour entry.

    This leads to 'n' still being part of gc_list, but the actual
    neighbour structure has been freed.

    The situation can be prevented from happening if we disallow a dead
    entry to have any possibility of updating gc_list. This is what the
    patch intends to achieve.
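
The guard described above amounts to an early return for dead entries. The sketch below is a toy model: struct entry and its three flags are hypothetical and replace the real neighbour structure, locking and list handling.

```c
#include <stdbool.h>

/* Hypothetical neighbour entry reduced to three flags. */
struct entry {
    bool dead;         /* set once the entry has been marked dead */
    bool on_gc_list;   /* whether the entry is linked into the gc list */
    bool wants_gc;     /* whether its current state makes it gc-eligible */
};

/* Mark the entry dead and take it off the gc list. */
void entry_mark_dead(struct entry *n)
{
    n->dead = true;
    n->on_gc_list = false;
}

/* Re-evaluate gc list membership, but never for a dead entry. */
void entry_update_gc_list(struct entry *n)
{
    if (n->dead)
        return;   /* a dead entry must not be put back on the list */
    n->on_gc_list = n->wants_gc;
}
```

Without the early return, a late update would relink an entry that is about to be freed, which is the dangling gc_list member described above.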

    Fixes: 9c29a2f55ec0 ("neighbor: Fix locking order for gc_list changes")
    Signed-off-by: Chinmay Agarwal
    Reviewed-by: Cong Wang
    Reviewed-by: David Ahern
    Link: https://lore.kernel.org/r/20210127165453.GA20514@chinagar-linux.qualcomm.com
    Signed-off-by: Jakub Kicinski

    Chinmay Agarwal
     

30 Jan, 2021

1 commit

  • AF_RXRPC sockets use UDP ports in encap mode. This causes socket and dst
    from an incoming packet to get stolen and attached to the UDP socket from
    whence it is leaked when that socket is closed.

    When a network namespace is removed, the wait for dst records to be cleaned
    up happens before the cleanup of the rxrpc and UDP socket, meaning that the
    wait never finishes.

    Fix this by moving the rxrpc (and, by dependence, the afs) private
    per-network namespace registrations to the device group rather than subsys
    group. This allows cached rxrpc local endpoints to be cleared and their
    UDP sockets closed before we try waiting for the dst records.

    The symptom is that lines looking like the following:

    unregister_netdevice: waiting for lo to become free

    get emitted at regular intervals after running something like the
    referenced syzbot test.

    Thanks to Vadim for tracking this down and working out the fix.

    Reported-by: syzbot+df400f2f24a1677cd7e0@syzkaller.appspotmail.com
    Reported-by: Vadim Fedorenko
    Fixes: 5271953cad31 ("rxrpc: Use the UDP encap_rcv hook")
    Signed-off-by: David Howells
    Acked-by: Vadim Fedorenko
    Link: https://lore.kernel.org/r/161196443016.3868642.5577440140646403533.stgit@warthog.procyon.org.uk
    Signed-off-by: Jakub Kicinski

    David Howells
     

29 Jan, 2021

2 commits

  • Pull networking fixes from Jakub Kicinski:
    "Networking fixes including fixes from can, xfrm, wireless,
    wireless-drivers and netfilter trees. Nothing scary, Intel
    WiFi-related fixes seemed most notable to the users.

    Current release - regressions:

    - dsa: microchip: ksz8795: fix KSZ8794 port map again to program the
    CPU port correctly

    Current release - new code bugs:

    - iwlwifi: pcie: reschedule in long-running memory reads

    Previous releases - regressions:

    - iwlwifi: dbg: don't try to overwrite read-only FW data

    - iwlwifi: provide gso_type to GSO packets

    - octeontx2: make sure the buffer is 128 byte aligned

    - tcp: make TCP_USER_TIMEOUT accurate for zero window probes

    - xfrm: fix wraparound in xfrm_policy_addr_delta()

    - xfrm: fix oops in xfrm_replay_advance_bmp due to a race between
    CPUs in presence of packet reorder

    - tcp: fix TLP timer not set when CA_STATE changes from DISORDER to
    OPEN

    - wext: fix NULL-ptr-dereference with cfg80211's lack of commit()

    Previous releases - always broken:

    - igc: fix link speed advertising

    - stmmac: configure EHL PSE0 GbE and PSE1 GbE to 32 bits DMA
    addressing

    - team: protect features update by RCU to avoid deadlock

    - xfrm: fix disable_xfrm sysctl when used on xfrm interfaces
    themselves

    - fec: fix temporary RMII clock reset on link up

    - can: dev: prevent potential information leak in can_fill_info()

    Misc:

    - mrp: fix bad packing of MRP test packet structures

    - uapi: fix big endian definition of ipv6_rpl_sr_hdr

    - add David Ahern to IPv4/IPv6 maintainers"

    * tag 'net-5.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (86 commits)
    rxrpc: Fix memory leak in rxrpc_lookup_local
    mlxsw: spectrum_span: Do not overwrite policer configuration
    selftests: forwarding: Specify interface when invoking mausezahn
    stmmac: intel: Configure EHL PSE0 GbE and PSE1 GbE to 32 bits DMA addressing
    net: usb: cdc_ether: added support for Thales Cinterion PLSx3 modem family.
    ibmvnic: Ensure that CRQ entry read are correctly ordered
    MAINTAINERS: add missing header for bonding
    net: decnet: fix netdev refcount leaking on error path
    net: switchdev: don't set port_obj_info->handled true when -EOPNOTSUPP
    can: dev: prevent potential information leak in can_fill_info()
    net: fec: Fix temporary RMII clock reset on link up
    net: lapb: Add locking to the lapb module
    team: protect features update by RCU to avoid deadlock
    MAINTAINERS: add David Ahern to IPv4/IPv6 maintainers
    net/mlx5: CT: Fix incorrect removal of tuple_nat_node from nat rhashtable
    net/mlx5e: Revert parameters on errors when changing MTU and LRO state without reset
    net/mlx5e: Revert parameters on errors when changing trust state without reset
    net/mlx5e: Correctly handle changing the number of queues when the interface is down
    net/mlx5e: Fix CT rule + encap slow path offload and deletion
    net/mlx5e: Disable hw-tc-offload when MLX5_CLS_ACT config is disabled
    ...

    Linus Torvalds
     
  • Commit 9ebeddef58c4 ("rxrpc: rxrpc_peer needs to hold a ref on the rxrpc_local record")
    made rxrpc_alloc_peer() take a ref on the rxrpc_local record; that ref is
    released in __rxrpc_put_peer and rxrpc_put_peer_locked.

    struct rxrpc_peer *rxrpc_alloc_peer(struct rxrpc_local *local, gfp_t gfp)
    - peer->local = local;
    + peer->local = rxrpc_get_local(local);

    rxrpc_discard_prealloc also needs to release the ref when discarding.
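
    The get/put pairing can be illustrated with a minimal user-space sketch.
    The types and helpers below (struct local, struct peer, local_get(),
    local_put(), peer_alloc(), peer_discard()) are hypothetical and only
    mirror the shape of the rxrpc code; they are not its API.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for rxrpc_local and rxrpc_peer. */
struct local { int usage; };
struct peer  { struct local *local; };

/* Take a reference on the local record and return it. */
static struct local *local_get(struct local *l)
{
    l->usage++;
    return l;
}

/* Drop a reference on the local record. */
static void local_put(struct local *l)
{
    l->usage--;
}

/* Allocating a peer pins the local record it points to. */
static struct peer *peer_alloc(struct local *l)
{
    struct peer *p = malloc(sizeof(*p));

    if (!p)
        return NULL;
    p->local = local_get(l);
    return p;
}

/*
 * Every path that frees a peer must drop the ref taken in peer_alloc(),
 * including the path that discards a preallocated peer that was never used.
 */
static void peer_discard(struct peer *p)
{
    if (!p)
        return;
    local_put(p->local);
    free(p);
}

/* Allocates and discards one peer; returns the remaining usage count. */
static int usage_after_discard(void)
{
    struct local l = { .usage = 1 };
    struct peer *p = peer_alloc(&l);

    peer_discard(p);
    return l.usage;
}
```

    If peer_discard() skipped local_put(), the count would stay elevated and
    the local record could never be freed, which matches the leak reported
    below.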

    syzbot report:
    BUG: memory leak
    unreferenced object 0xffff8881080ddc00 (size 256):
    comm "syz-executor339", pid 8462, jiffies 4294942238 (age 12.350s)
    hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 0a 00 00 00 00 c0 00 08 81 88 ff ff ................
    backtrace:
    kmalloc include/linux/slab.h:552 [inline]
    kzalloc include/linux/slab.h:682 [inline]
    rxrpc_alloc_local net/rxrpc/local_object.c:79 [inline]
    rxrpc_lookup_local+0x1c1/0x760 net/rxrpc/local_object.c:244
    rxrpc_bind+0x174/0x240 net/rxrpc/af_rxrpc.c:149
    afs_open_socket+0xdb/0x200 fs/afs/rxrpc.c:64
    afs_net_init+0x2b4/0x340 fs/afs/main.c:126
    ops_init+0x4e/0x190 net/core/net_namespace.c:152
    setup_net+0xde/0x2d0 net/core/net_namespace.c:342
    copy_net_ns+0x19f/0x3e0 net/core/net_namespace.c:483
    create_new_namespaces+0x199/0x4f0 kernel/nsproxy.c:110
    unshare_nsproxy_namespaces+0x9b/0x120 kernel/nsproxy.c:226
    ksys_unshare+0x2fe/0x5c0 kernel/fork.c:2957
    __do_sys_unshare kernel/fork.c:3025 [inline]
    __se_sys_unshare kernel/fork.c:3023 [inline]
    __x64_sys_unshare+0x12/0x20 kernel/fork.c:3023
    do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
    entry_SYSCALL_64_after_hwframe+0x44/0xa9

    Fixes: 9ebeddef58c4 ("rxrpc: rxrpc_peer needs to hold a ref on the rxrpc_local record")
    Signed-off-by: Takeshi Misawa
    Reported-and-tested-by: syzbot+305326672fed51b205f7@syzkaller.appspotmail.com
    Signed-off-by: David Howells
    Link: https://lore.kernel.org/r/161183091692.3506637.3206605651502458810.stgit@warthog.procyon.org.uk
    Signed-off-by: Jakub Kicinski

    Takeshi Misawa
     

28 Jan, 2021

4 commits

  • When CONFIG_ATH9K is built-in but LED support is in a loadable
    module, both ath9k drivers fail to link:

    x86_64-linux-ld: drivers/net/wireless/ath/ath9k/gpio.o: in function `ath_deinit_leds':
    gpio.c:(.text+0x36): undefined reference to `led_classdev_unregister'
    x86_64-linux-ld: drivers/net/wireless/ath/ath9k/gpio.o: in function `ath_init_leds':
    gpio.c:(.text+0x179): undefined reference to `led_classdev_register_ext'

    The problem is that the 'imply' keyword does not enforce any dependency
    but is only a weak hint to Kconfig to enable another symbol from a
    defconfig file.

    Change imply to a 'depends on LEDS_CLASS' that prevents the incorrect
    configuration but still allows building the driver without LED support.

    The 'select MAC80211_LEDS' now ensures that the LED support is
    actually used if it is present, and the added Kconfig dependency
    on MAC80211_LEDS ensures that it cannot be enabled manually when it
    has no effect.

    Fixes: 197f466e93f5 ("ath9k_htc: Do not select MAC80211_LEDS by default")
    Signed-off-by: Arnd Bergmann
    Acked-by: Johannes Berg
    Signed-off-by: Kalle Valo
    Link: https://lore.kernel.org/r/20210125113654.2408057-1-arnd@kernel.org

    Arnd Bergmann
     
  • Pablo Neira Ayuso says:

    ====================
    Netfilter fixes for net

    1) Honor stateful expressions defined in the set from the dynset
    extension. The set definition provides a stateful expression
    that must be used by the dynset expression in case it is specified.

    2) Missing timeout extension in the set element in the dynset
    extension leads to inconsistent ruleset listing, not allowing
    the user to restore timeout and expiration on ruleset reload.

    3) Do not dump the stateful expression from the dynset extension
    if it is coming from the set definition.

    * git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf:
    netfilter: nft_dynset: dump expressions when set definition contains no expressions
    netfilter: nft_dynset: add timeout extension to template
    netfilter: nft_dynset: honor stateful expressions in set definition
    ====================

    Link: https://lore.kernel.org/r/20210127132512.5472-1-pablo@netfilter.org
    Signed-off-by: Jakub Kicinski

    Jakub Kicinski
     
  • On building the route there is an assumption that the destination
    could be local. In this case loopback_dev is used to get the address.
    If the address still cannot be retrieved, dn_route_output_slow
    returns EADDRNOTAVAIL with the loopback_dev reference still taken.

    Cannot find a hash for the Fixes tag because this code was introduced
    a long time ago. I don't think that this bug has ever fired but the
    patch is done just to have a consistent code base.
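
    The pattern being restored is "drop the reference before returning an
    error". A minimal sketch follows, assuming a hypothetical struct dev and
    route_setup() helper; it does not reproduce dn_route_output_slow().

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical device with a plain reference count. */
struct dev { int refcnt; };

static void dev_hold_ref(struct dev *d) { d->refcnt++; }
static void dev_put_ref(struct dev *d)  { d->refcnt--; }

/*
 * Hypothetical route setup: takes a reference on a fallback device and
 * fails with -EADDRNOTAVAIL when no address can be found. The error path
 * releases the reference so that the count stays balanced.
 */
static int route_setup(struct dev *fallback, bool have_addr)
{
    dev_hold_ref(fallback);

    if (!have_addr) {
        dev_put_ref(fallback);   /* release before bailing out */
        return -EADDRNOTAVAIL;
    }

    /* On success the caller owns the reference. */
    return 0;
}
```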

    Signed-off-by: Vadim Fedorenko
    Link: https://lore.kernel.org/r/1611619334-20955-1-git-send-email-vfedorenko@novek.ru
    Signed-off-by: Jakub Kicinski

    Vadim Fedorenko
     
  • It's not true that switchdev_port_obj_notify() only inspects the
    ->handled field of "struct switchdev_notifier_port_obj_info" if
    call_switchdev_blocking_notifiers() returns 0 - there's a WARN_ON()
    triggering for a non-zero return combined with ->handled not being
    true. But the real problem here is that -EOPNOTSUPP is not being
    properly handled.

    The wrapper functions switchdev_handle_port_obj_add() et al change a
    return value of -EOPNOTSUPP to 0, and the treatment of ->handled in
    switchdev_port_obj_notify() seems to be designed to change that back
    to -EOPNOTSUPP in case nobody actually acted on the notifier (i.e.,
    everybody returned -EOPNOTSUPP).

    Currently, as soon as some device down the stack passes the check_cb()
    check, ->handled gets set to true, which means that
    switchdev_port_obj_notify() cannot actually ever return -EOPNOTSUPP.

    This, for example, means that the detection of hardware offload
    support in the MRP code is broken: switchdev_port_obj_add() used by
    br_mrp_switchdev_send_ring_test() always returns 0, so since the MRP
    code thinks the generation of MRP test frames has been offloaded, no
    such frames are actually put on the wire. Similarly,
    br_mrp_switchdev_set_ring_role() also always returns 0, causing
    mrp->ring_role_offloaded to be set to 1.

    To fix this, continue to set ->handled true if any callback returns
    success or any error distinct from -EOPNOTSUPP. But if all the
    callbacks return -EOPNOTSUPP, make sure that ->handled stays false, so
    the logic in switchdev_port_obj_notify() can propagate that
    information.
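
    The intended aggregation rule can be sketched in user space as follows.
    This is a simplified model, not the switchdev code: port_cb, the cb_*()
    callbacks and notify_all() are hypothetical names, and the real notifier
    chain is replaced by a plain array of function pointers.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device callback: 0 on success, negative errno on error. */
typedef int (*port_cb)(void);

static int cb_unsupported(void) { return -EOPNOTSUPP; }
static int cb_ok(void)          { return 0; }
static int cb_fail(void)        { return -EINVAL; }

/*
 * Calls every callback. 'handled' becomes true only when a callback
 * returns success or an error other than -EOPNOTSUPP. If nobody handled
 * the request, -EOPNOTSUPP is reported to the caller.
 */
static int notify_all(const port_cb *cbs, size_t n)
{
    bool handled = false;

    for (size_t i = 0; i < n; i++) {
        int ret = cbs[i]();

        if (ret == -EOPNOTSUPP)
            continue;           /* does not count as handled */
        handled = true;
        if (ret)
            return ret;         /* a real error is propagated as-is */
    }

    return handled ? 0 : -EOPNOTSUPP;
}
```

    A caller such as the MRP code can then tell "offloaded" (0) apart from
    "nobody supports this" (-EOPNOTSUPP) and fall back to software.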

    Fixes: 9a9f26e8f7ea ("bridge: mrp: Connect MRP API with the switchdev API")
    Fixes: f30f0601eb93 ("switchdev: Add helpers to aid traversal through lower devices")
    Reviewed-by: Petr Machata
    Signed-off-by: Rasmus Villemoes
    Link: https://lore.kernel.org/r/20210125124116.102928-1-rasmus.villemoes@prevas.dk
    Signed-off-by: Jakub Kicinski

    Rasmus Villemoes