20 Jan, 2021

10 commits

  • This is the 5.10.9 stable release

    * tag 'v5.10.9': (153 commits)
    Linux 5.10.9
    netfilter: nf_nat: Fix memleak in nf_nat_init
    netfilter: conntrack: fix reading nf_conntrack_buckets
    ...

    Signed-off-by: Jason Liu

    Jason Liu
     
  • This is the 5.10.8 stable release

    * tag 'v5.10.8': (104 commits)
    Linux 5.10.8
    tools headers UAPI: Sync linux/fscrypt.h with the kernel sources
    drm/panfrost: Remove unused variables in panfrost_job_close()
    ...

    Signed-off-by: Jason Liu

    Jason Liu
     
  • This is the 5.10.7 stable release

    * tag 'v5.10.7': (144 commits)
    Linux 5.10.7
    scsi: target: Fix XCOPY NAA identifier lookup
    rtlwifi: rise completion at the last step of firmware callback
    ...

    Signed-off-by: Jason Liu

    Jason Liu
     
  • This is the 5.10.6 stable release

    * tag 'v5.10.6': (21 commits)
    Linux 5.10.6
    mwifiex: Fix possible buffer overflows in mwifiex_cmd_802_11_ad_hoc_start
    exec: Transform exec_update_mutex into a rw_semaphore
    ...

    Signed-off-by: Jason Liu

    Conflicts:
    drivers/rtc/rtc-pcf2127.c

    Jason Liu
     
  • This is the 5.10.5 stable release

    * tag 'v5.10.5': (63 commits)
    Linux 5.10.5
    device-dax: Fix range release
    ext4: avoid s_mb_prefetch to be zero in individual scenarios
    ...

    Signed-off-by: Jason Liu

    Jason Liu
     
  • commit 869f4fdaf4ca7bb6e0d05caf6fa1108dddc346a7 upstream.

    When register_pernet_subsys() fails, nf_nat_bysource
    should be freed just like when nf_ct_extend_register()
    fails.

    Fixes: 1cd472bf036ca ("netfilter: nf_nat: add nat hook register functions to nf_nat")
    Signed-off-by: Dinghao Liu
    Acked-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Greg Kroah-Hartman

    Dinghao Liu
     
  • commit f6351c3f1c27c80535d76cac2299aec44c36291e upstream.

    The old way of changing the conntrack hashsize runtime was through changing
    the module param via file /sys/module/nf_conntrack/parameters/hashsize. This
    was extended to sysctl change in commit 3183ab8997a4 ("netfilter: conntrack:
    allow increasing bucket size via sysctl too").

    The commit introduced a second "user" variable, nf_conntrack_htable_size_user,
    which shadows the actual variable nf_conntrack_htable_size. When hashsize is
    changed via the module param, this "user" variable isn't updated. As a result,
    sysctl net/netfilter/nf_conntrack_buckets shows the wrong value when users
    update via the old way.

    This patch fixes the issue by always updating the "user" variable when reading
    the proc file. This takes care of changes to the actual variable without the
    sysctl needing to be aware of them.
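
    A minimal userspace sketch of the approach described above, not the kernel
    code. The two variable names come from the commit message; the function
    names set_hashsize_via_module_param() and read_buckets_sysctl() are
    hypothetical stand-ins for the two access paths.

```c
/* Userspace sketch only: the real value and its sysctl-facing shadow copy. */
unsigned int nf_conntrack_htable_size = 65536;
unsigned int nf_conntrack_htable_size_user = 65536;

/* Old path (hypothetical name): only the real variable is changed. */
void set_hashsize_via_module_param(unsigned int size)
{
    nf_conntrack_htable_size = size;
}

/* Read path (hypothetical name): resync the shadow copy before reporting it. */
unsigned int read_buckets_sysctl(void)
{
    nf_conntrack_htable_size_user = nf_conntrack_htable_size;
    return nf_conntrack_htable_size_user;
}
```

    Refreshing on read means the old write path does not need to know that
    the shadow copy exists.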

    Fixes: 3183ab8997a4 ("netfilter: conntrack: allow increasing bucket size via sysctl too")
    Reported-by: Yoel Caspersen
    Signed-off-by: Jesper Dangaard Brouer
    Acked-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Greg Kroah-Hartman

    Jesper Dangaard Brouer
     
  • commit 86b53fbf08f48d353a86a06aef537e78e82ba721 upstream.

    A return value of 0 means success. This is documented in lib/kstrtox.c.

    This was found by trying to mount an NFS share from a link-local IPv6
    address with the interface specified by its index:

    mount("[fe80::1%1]:/srv/nfs", "/mnt", "nfs", 0, "nolock,addr=fe80::1%1")

    Before this commit this failed with EINVAL and also caused the following
    message in dmesg:

    [...] NFS: bad IP address specified: addr=fe80::1%1

    The syscall using the same address based on the interface name instead
    of its index succeeds.

    Credits for this patch go to my colleague Christian Speich, who traced
    the origin of this bug to this line of code.
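
    The return-value convention at the heart of this fix can be sketched in
    userspace as follows. parse_u32() is a hypothetical stand-in that follows
    the same "0 means success, negative errno on failure" convention, and
    scope_is_numeric_index() is an illustrative caller, not the function that
    was patched.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical parser: returns 0 on success, a negative errno on failure. */
int parse_u32(const char *s, unsigned int *out)
{
    char *end;
    unsigned long v;

    if (*s < '0' || *s > '9')
        return -EINVAL;          /* empty string, sign, or non-digit */
    errno = 0;
    v = strtoul(s, &end, 10);
    if (errno == ERANGE || v > 0xffffffffUL)
        return -ERANGE;
    if (*end != '\0')
        return -EINVAL;          /* trailing garbage */
    *out = (unsigned int)v;
    return 0;
}

/* Returns 1 and stores the index when the scope string is a plain number. */
int scope_is_numeric_index(const char *scope, unsigned int *ifindex)
{
    return parse_u32(scope, ifindex) == 0;   /* 0 means success */
}
```

    Testing the result with "!= 0" instead would accept interface names and
    reject numeric indexes, which matches the failure described above.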

    Signed-off-by: Johannes Nixdorf
    Fixes: 00cfaa943ec3 ("replace strict_strto calls")
    Signed-off-by: Trond Myklebust
    Signed-off-by: Greg Kroah-Hartman

    j.nixdorf@avm.de
     
  • [ Upstream commit 152a8a6c017bfdeda7f6d052fbc6e151891bd9b6 ]

    Without crc32 support, this fails to link:

    arm-linux-gnueabi-ld: net/wireless/scan.o: in function `cfg80211_scan_6ghz':
    scan.c:(.text+0x928): undefined reference to `crc32_le'

    Fixes: c8cb5b854b40 ("nl80211/cfg80211: support 6 GHz scanning")
    Signed-off-by: Arnd Bergmann
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Arnd Bergmann
     
  • [ Upstream commit 2b33d6ffa9e38f344418976b06057e2fc2aa9e2a ]

    Currently mtype_resize() can cause an oops:

    t = ip_set_alloc(htable_size(htable_bits));
    if (!t) {
            ret = -ENOMEM;
            goto out;
    }
    t->hregion = ip_set_alloc(ahash_sizeof_regions(htable_bits));

    Increased htable_bits can force htable_size() to return 0.
    In turn, ip_set_alloc(0) returns not 0 but ZERO_SIZE_PTR,
    so the following access to t->hregion should trigger an oops.
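
    One way to avoid this class of bug is to check the computed size before
    allocating, rather than relying on the allocator to fail. The sketch
    below is a userspace illustration under that assumption; struct table,
    table_size() and table_alloc() are hypothetical names, not the ipset code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical table layout: a header followed by 2^bits bucket pointers. */
struct table {
    uint8_t bits;
    void *bucket[];
};

/* Returns the allocation size, or 0 when it cannot be represented. */
size_t table_size(uint8_t bits)
{
    size_t n;

    if (bits > 31)
        return 0;
    n = (size_t)1 << bits;
    if ((SIZE_MAX - sizeof(struct table)) / sizeof(void *) < n)
        return 0;
    return sizeof(struct table) + n * sizeof(void *);
}

/* Reject a zero size up front; a zero-byte allocation may not return NULL. */
int table_alloc(uint8_t bits, struct table **out)
{
    size_t sz = table_size(bits);
    struct table *t;

    if (sz == 0)
        return -ENOMEM;
    t = calloc(1, sz);
    if (!t)
        return -ENOMEM;
    t->bits = bits;
    *out = t;
    return 0;
}
```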

    Signed-off-by: Vasily Averin
    Acked-by: Jozsef Kadlecsik
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Sasha Levin

    Vasily Averin
     

17 Jan, 2021

11 commits

  • commit 54970a2fbb673f090b7f02d7f57b10b2e0707155 upstream.

    syzbot reproduces BUG_ON in skb_checksum_help():
    tun creates a (bogus) skb with a huge partial-checksummed area and a
    small ip packet inside. Then ip_rcv trims the skb based on the size
    of the inner ip packet, after which the csum offset points beyond the
    trimmed skb. Then checksum_tg(), called via a netfilter hook,
    triggers BUG_ON:

    offset = skb_checksum_start_offset(skb);
    BUG_ON(offset >= skb_headlen(skb));

    To work around the problem, this patch forces pskb_trim_rcsum_slow()
    to return -EINVAL in the described scenario. This allows its callers to
    drop such packets.
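
    The idea can be sketched in userspace with a heavily simplified packet
    descriptor. struct pkt and pkt_trim() are hypothetical and only model the
    rule "refuse to trim when the pending checksum field would fall outside
    the remaining data"; they are not the sk_buff implementation.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified packet descriptor (not struct sk_buff). */
struct pkt {
    size_t len;          /* bytes currently in the buffer */
    int csum_partial;    /* non-zero: checksum is still to be filled in */
    size_t csum_start;   /* offset where checksumming starts */
    size_t csum_offset;  /* offset of the 16-bit checksum from csum_start */
};

/* Trim to new_len; fail if the pending checksum field would be cut off. */
int pkt_trim(struct pkt *p, size_t new_len)
{
    if (new_len >= p->len)
        return 0;                      /* nothing to trim */
    if (p->csum_partial &&
        p->csum_start + p->csum_offset + sizeof(uint16_t) > new_len)
        return -EINVAL;                /* caller is expected to drop it */
    p->len = new_len;
    return 0;
}
```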

    Link: https://syzkaller.appspot.com/bug?id=b419a5ca95062664fe1a60b764621eb4526e2cd0
    Reported-by: syzbot+7010af67ced6105e5ab6@syzkaller.appspotmail.com
    Signed-off-by: Vasily Averin
    Acked-by: Willem de Bruijn
    Link: https://lore.kernel.org/r/1b2494af-2c56-8ee2-7bc0-923fcad1cdf8@virtuozzo.com
    Signed-off-by: Jakub Kicinski
    Signed-off-by: Greg Kroah-Hartman

    Vasily Averin
     
  • commit b42b3a2744b3e8f427de79896720c72823af91ad upstream.

    Initialize the sockaddr_can structure to prevent a data leak to user space.
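
    As a general illustration of this kind of fix (not the patch itself):
    zeroing the whole object before filling in individual fields ensures that
    padding and unused members carry no stale bytes when the structure is
    copied out. struct addr_rec and fill_addr() are hypothetical.

```c
#include <string.h>

/* Hypothetical address record with padding and an optional member. */
struct addr_rec {
    unsigned short family;
    int ifindex;
    unsigned char extra[8];  /* not filled in by every caller */
};

/* Zero the whole object first, then set only the fields that are known. */
void fill_addr(struct addr_rec *a, unsigned short family, int ifindex)
{
    memset(a, 0, sizeof(*a));
    a->family = family;
    a->ifindex = ifindex;
}
```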

    Suggested-by: Cong Wang
    Reported-by: syzbot+057884e2f453e8afebc8@syzkaller.appspotmail.com
    Fixes: e057dd3fc20f ("can: add ISO 15765-2:2016 transport protocol")
    Signed-off-by: Oliver Hartkopp
    Link: https://lore.kernel.org/r/20210112091643.11789-1-socketcan@hartkopp.net
    Signed-off-by: Marc Kleine-Budde
    Signed-off-by: Greg Kroah-Hartman

    Oliver Hartkopp
     
  • commit b1b95cb5c0a9694d47d5f845ba97e226cfda957d upstream.

    Rollback the reservation in the completion ring when we get a
    NETDEV_TX_BUSY. When this error is received from the driver, we are
    supposed to let the user application retry the transmit again. And in
    order to do this, we need to roll back the failed send so it can be
    retried. Unfortunately, we did not cancel the reservation we had made
    in the completion ring. By not doing this, we actually make the
    completion ring one entry smaller per NETDEV_TX_BUSY error we get, and
    after enough of these errors the completion ring will be of size zero
    and transmit will stop working.

    Fix this by cancelling the reservation when we get a NETDEV_TX_BUSY
    error.
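
    The effect of a missing rollback can be sketched with a toy ring that
    only tracks counters. Everything here (struct ring, RING_SIZE, send_one(),
    always_busy()) is hypothetical and userspace-only; it models the
    reserve/cancel pairing, not the AF_XDP rings.

```c
#include <errno.h>

#define RING_SIZE 4u

/* Hypothetical ring state: free-running unsigned counters. */
struct ring {
    unsigned int reserved;  /* entries reserved by the producer */
    unsigned int consumed;  /* entries released by the consumer */
};

unsigned int ring_free(const struct ring *r)
{
    return RING_SIZE - (r->reserved - r->consumed);
}

int ring_reserve(struct ring *r)
{
    if (ring_free(r) == 0)
        return -ENOSPC;
    r->reserved++;
    return 0;
}

/* Undo the most recent reservation. */
void ring_cancel(struct ring *r)
{
    r->reserved--;
}

/* Stand-in for a driver that cannot accept the packet right now. */
int always_busy(void)
{
    return -EBUSY;
}

/* Reserve a completion slot, try to send, roll back on a busy error. */
int send_one(struct ring *r, int (*try_send)(void))
{
    int err = ring_reserve(r);

    if (err)
        return err;
    err = try_send();
    if (err == -EBUSY)
        ring_cancel(r);  /* without this, each busy error costs one slot */
    return err;
}
```

    With the ring_cancel() call removed, RING_SIZE busy errors in a row would
    leave ring_free() at zero and every later reservation would fail.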

    Fixes: 642e450b6b59 ("xsk: Do not discard packet when NETDEV_TX_BUSY")
    Reported-by: Xuan Zhuo
    Signed-off-by: Magnus Karlsson
    Signed-off-by: Daniel Borkmann
    Acked-by: Björn Töpel
    Link: https://lore.kernel.org/bpf/20201218134525.13119-3-magnus.karlsson@gmail.com
    Signed-off-by: Greg Kroah-Hartman

    Magnus Karlsson
     
  • commit f09ced4053bc0a2094a12b60b646114c966ef4c6 upstream.

    Fix a race when multiple sockets are simultaneously calling sendto()
    when the completion ring is shared in the SKB case. This is the case
    when you share the same netdev and queue id through the
    XDP_SHARED_UMEM bind flag. The problem is that multiple processes can
    be in xsk_generic_xmit() and call the backpressure mechanism in
    xskq_prod_reserve(xs->pool->cq). As this is a shared resource in this
    specific scenario, a race might occur since the rings are
    single-producer single-consumer.

    Fix this by moving the tx_completion_lock from the socket to the pool
    as the pool is shared between the sockets that share the completion
    ring. (The pool is not shared when this is not the case.) And then
    protect the accesses to xskq_prod_reserve() with this lock. The
    tx_completion_lock is renamed cq_lock to better reflect that it
    protects accesses to the potentially shared completion ring.

    Fixes: 35fcde7f8deb ("xsk: support for Tx")
    Reported-by: Xuan Zhuo
    Signed-off-by: Magnus Karlsson
    Signed-off-by: Daniel Borkmann
    Acked-by: Björn Töpel
    Link: https://lore.kernel.org/bpf/20201218134525.13119-2-magnus.karlsson@gmail.com
    Signed-off-by: Greg Kroah-Hartman

    Magnus Karlsson
     
  • [ Upstream commit b19218b27f3477316d296e8bcf4446aaf017aa69 ]

    The function nh_check_attr_group() is called to validate nexthop groups.
    The intention of that code seems to have been to bounce all attributes
    above NHA_GROUP_TYPE except for NHA_FDB. However instead it bounces all
    these attributes except when NHA_FDB attribute is present--then it accepts
    them.

    NHA_FDB validation that takes place before, in rtm_to_nh_config(), already
    bounces NHA_OIF, NHA_BLACKHOLE, NHA_ENCAP and NHA_ENCAP_TYPE. Yet further
    back, NHA_GROUPS and NHA_MASTER are bounced unconditionally.

    But that still leaves NHA_GATEWAY as an attribute that would be accepted in
    FDB nexthop groups (with no meaning), so long as it keeps the address
    family as unspecified:

    # ip nexthop add id 1 fdb via 127.0.0.1
    # ip nexthop add id 10 fdb via default group 1

    The nexthop code is still relatively new and likely not used very broadly,
    and the FDB bits are newer still. Even though there is a reproducer out
    there, it relies on improbable gateway arguments "via default", "via
    all" or "via any". Given all this, I believe it is OK to reformulate the
    condition to do the right thing and bounce NHA_GATEWAY.

    Fixes: 38428d68719c ("nexthop: support for fdb ecmp nexthops")
    Signed-off-by: Petr Machata
    Signed-off-by: Ido Schimmel
    Reviewed-by: David Ahern
    Signed-off-by: Jakub Kicinski
    Signed-off-by: Greg Kroah-Hartman

    Petr Machata
     
  • [ Upstream commit 7b01e53eee6dce7a8a6736e06b99b68cd0cc7a27 ]

    In case of error, remove the nexthop group entry from the list to which
    it was previously added.

    Fixes: 430a049190de ("nexthop: Add support for nexthop groups")
    Signed-off-by: Ido Schimmel
    Reviewed-by: Petr Machata
    Reviewed-by: David Ahern
    Signed-off-by: Jakub Kicinski
    Signed-off-by: Greg Kroah-Hartman

    Ido Schimmel
     
  • [ Upstream commit 07e61a979ca4dddb3661f59328b3cd109f6b0070 ]

    A reference was not taken for the current nexthop entry, so do not try
    to put it in the error path.

    Fixes: 430a049190de ("nexthop: Add support for nexthop groups")
    Signed-off-by: Ido Schimmel
    Reviewed-by: Petr Machata
    Reviewed-by: David Ahern
    Signed-off-by: Jakub Kicinski
    Signed-off-by: Greg Kroah-Hartman

    Ido Schimmel
     
  • [ Upstream commit bb4cc1a18856a73f0ff5137df0c2a31f4c50f6cf ]

    Conntrack reassembly records the largest fragment size seen in IPCB.
    However, when this gets forwarded/transmitted, fragmentation will only
    be forced if one of the fragmented packets had the DF bit set.

    In that case, a flag in IPCB will force fragmentation even if the
    MTU is large enough.

    This should work fine, but this breaks with ip tunnels.
    Consider client that sends a UDP datagram of size X to another host.

    The client fragments the datagram, so two packets, of size y and z, are
    sent. DF bit is not set on any of these packets.

    Middlebox netfilter reassembles those packets back to single size-X
    packet, before routing decision.

    packet-size-vs-mtu checks in ip_forward are irrelevant, because DF bit
    isn't set. At output time, ip refragmentation is skipped as well
    because x is still smaller than the mtu of the output device.

    If the transmit device is an ip tunnel, the packet size increases to
    x+overhead.

    Also, tunnel might be configured to force DF bit on outer header.

    In this case, packet will be dropped (exceeds MTU) and an ICMP error is
    generated back to sender.

    But sender already respects the announced MTU, all the packets that
    it sent did fit the announced mtu.

    Force refragmentation as per original sizes unconditionally so ip tunnel
    will encapsulate the fragments instead.

    The only other solution I see is to place ip refragmentation in
    the ip_tunnel code to handle this case.

    Fixes: d6b915e29f4ad ("ip_fragment: don't forward defragmented DF packet")
    Reported-by: Christian Perle
    Signed-off-by: Florian Westphal
    Acked-by: Pablo Neira Ayuso
    Signed-off-by: Jakub Kicinski
    Signed-off-by: Greg Kroah-Hartman

    Florian Westphal
     
  • [ Upstream commit 50c661670f6a3908c273503dfa206dfc7aa54c07 ]

    For some reason ip_tunnel insists on setting the DF bit anyway when the
    inner header has the DF bit set, EVEN if the tunnel was configured with
    'nopmtudisc'.

    This means that the script added in the previous commit
    cannot be made to work by adding the 'nopmtudisc' flag to the
    ip tunnel configuration. Doing so breaks connectivity even for the
    without-conntrack/netfilter scenario.

    When nopmtudisc is set, the tunnel will skip the mtu check, so no
    icmp error is sent to client. Then, because inner header has DF set,
    the outer header gets added with DF bit set as well.

    IP stack then sends an error to itself because the packet exceeds
    the device MTU.

    Fixes: 23a3647bc4f93 ("ip_tunnels: Use skb-len to PMTU check.")
    Cc: Stefano Brivio
    Signed-off-by: Florian Westphal
    Acked-by: Pablo Neira Ayuso
    Signed-off-by: Jakub Kicinski
    Signed-off-by: Greg Kroah-Hartman

    Florian Westphal
     
  • [ Upstream commit d8f5c29653c3f6995e8979be5623d263e92f6b86 ]

    Route removal is handled by two code paths. The main removal path is via
    fib6_del_route() which will handle purging any PMTU exceptions from the
    cache, removing all per-cpu copies of the DST entry used by the route, and
    releasing the fib6_info struct.

    The second removal location is during fib6_add_rt2node() during a route
    replacement operation. This path also calls fib6_purge_rt() to handle
    cleaning up the per-cpu copies of the DST entries and releasing the
    fib6_info associated with the older route, but it does not flush any PMTU
    exceptions that the older route had. Since the older route is removed from
    the tree during the replacement, we lose any way of accessing it again.

    As these lingering DSTs and the fib6_info struct are holding references to
    the underlying netdevice struct as well, unregistering that device from the
    kernel can never complete.

    Fixes: 2b760fcf5cfb3 ("ipv6: hook up exception table to store dst cache")
    Signed-off-by: Sean Tranchetti
    Reviewed-by: David Ahern
    Link: https://lore.kernel.org/r/1609892546-11389-1-git-send-email-stranche@quicinc.com
    Signed-off-by: Jakub Kicinski
    Signed-off-by: Greg Kroah-Hartman

    Sean Tranchetti
     
  • [ Upstream commit 55b7ab1178cbf41f979ff83236d3321ad35ed2ad ]

    VLAN checks for NETREG_UNINITIALIZED to distinguish between
    registration failure and unregistration in progress.

    Since commit cb626bf566eb ("net-sysfs: Fix reference count leak")
    registration failure may, however, result in NETREG_UNREGISTERED
    as well as NETREG_UNINITIALIZED.

    This fix is similar to cebb69754f37 ("rtnetlink: Fix
    memory(net_device) leak when ->newlink fails")

    Fixes: cb626bf566eb ("net-sysfs: Fix reference count leak")
    Signed-off-by: Jakub Kicinski
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Jakub Kicinski
     

13 Jan, 2021

14 commits

  • commit 8bee683384087a6275c9183a483435225f7bb209 upstream.

    Fix a possible memory leak when a bind of an AF_XDP socket fails. When
    the fill and completion rings are created, they are tied to the
    socket. But when the buffer pool is later created at bind time, the
    ownership of these two rings are transferred to the buffer pool as
    they might be shared between sockets (and the buffer pool cannot be
    created until we know what we are binding to). So, before the buffer
    pool is created, these two rings are cleaned up with the socket, and
    after they have been transferred they are cleaned up together with
    the buffer pool.

    The problem is that ownership was transferred before it was absolutely
    certain that the buffer pool could be created and initialized
    correctly. When one of these errors occurred, the fill and
    completion rings belonged to neither the socket nor the pool and
    were therefore leaked. Solve this by moving the ownership transfer
    to the point where the buffer pool has been completely set up and
    there is no way it can fail.

    Fixes: 7361f9c3d719 ("xsk: Move fill and completion rings to buffer pool")
    Reported-by: syzbot+cfa88ddd0655afa88763@syzkaller.appspotmail.com
    Signed-off-by: Magnus Karlsson
    Signed-off-by: Daniel Borkmann
    Acked-by: Björn Töpel
    Link: https://lore.kernel.org/bpf/20201214085127.3960-1-magnus.karlsson@gmail.com
    Signed-off-by: Greg Kroah-Hartman

    Magnus Karlsson
     
  • commit 95cd4bca7b1f4a25810f3ddfc5e767fb46931789 upstream.

    If userspace requests a feature which is not available in the original
    set definition, then bail out with EOPNOTSUPP. If userspace sends
    unsupported dynset flags (new feature not supported by this kernel),
    then report EOPNOTSUPP to userspace. EINVAL should be only used to
    report malformed netlink messages from userspace.

    Fixes: 22fe54d5fefc ("netfilter: nf_tables: add support for dynamic set updates")
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Greg Kroah-Hartman

    Pablo Neira Ayuso
     
  • commit 6cb56218ad9e580e519dcd23bfb3db08d8692e5a upstream.

    syzbot reports:
    detected buffer overflow in strlen
    [..]
    Call Trace:
    strlen include/linux/string.h:325 [inline]
    strlcpy include/linux/string.h:348 [inline]
    xt_rateest_tg_checkentry+0x2a5/0x6b0 net/netfilter/xt_RATEEST.c:143

    strlcpy assumes src is a C string. Check info->name before it is used.

    Reported-by: syzbot+e86f7c428c8c50db65b4@syzkaller.appspotmail.com
    Fixes: 5859034d7eb8793 ("[NETFILTER]: x_tables: add RATEEST target")
    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Greg Kroah-Hartman

    Florian Westphal
     
  • commit 5c8193f568ae16f3242abad6518dc2ca6c8eef86 upstream.

    htable_bits() can call jhash_size(32) and trigger shift-out-of-bounds

    UBSAN: shift-out-of-bounds in net/netfilter/ipset/ip_set_hash_gen.h:151:6
    shift exponent 32 is too large for 32-bit type 'unsigned int'
    CPU: 0 PID: 8498 Comm: syz-executor519
    Not tainted 5.10.0-rc7-next-20201208-syzkaller #0
    Call Trace:
    __dump_stack lib/dump_stack.c:79 [inline]
    dump_stack+0x107/0x163 lib/dump_stack.c:120
    ubsan_epilogue+0xb/0x5a lib/ubsan.c:148
    __ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:395
    htable_bits net/netfilter/ipset/ip_set_hash_gen.h:151 [inline]
    hash_mac_create.cold+0x58/0x9b net/netfilter/ipset/ip_set_hash_gen.h:1524
    ip_set_create+0x610/0x1380 net/netfilter/ipset/ip_set_core.c:1115
    nfnetlink_rcv_msg+0xecc/0x1180 net/netfilter/nfnetlink.c:252
    netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2494
    nfnetlink_rcv+0x1ac/0x420 net/netfilter/nfnetlink.c:600
    netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline]
    netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1330
    netlink_sendmsg+0x907/0xe40 net/netlink/af_netlink.c:1919
    sock_sendmsg_nosec net/socket.c:652 [inline]
    sock_sendmsg+0xcf/0x120 net/socket.c:672
    ____sys_sendmsg+0x6e8/0x810 net/socket.c:2345
    ___sys_sendmsg+0xf3/0x170 net/socket.c:2399
    __sys_sendmsg+0xe5/0x1b0 net/socket.c:2432
    do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
    entry_SYSCALL_64_after_hwframe+0x44/0xa9

    This patch replaces htable_bits() by simple fls(hashsize - 1) call:
    it alone returns valid nbits both for round and non-round hashsizes.
    It is normal to set any nbits here because it is validated inside
    following htable_size() call which returns 0 for nbits>31.
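
    The arithmetic behind fls(hashsize - 1) can be checked with a small
    userspace sketch. fls32() below is a portable stand-in written for this
    example (1-based index of the highest set bit, 0 for no bits set), and
    bits_for_hashsize() is a hypothetical helper name.

```c
#include <stdint.h>

/* Userspace stand-in for fls(): position of the highest set bit, 1-based. */
int fls32(uint32_t x)
{
    int n = 0;

    while (x) {
        n++;
        x >>= 1;
    }
    return n;
}

/* Smallest bit count such that 2^bits >= hashsize (hashsize >= 1 assumed). */
int bits_for_hashsize(uint32_t hashsize)
{
    return fls32(hashsize - 1);
}
```

    A round size such as 1024 and a non-round size such as 1000 both give 10
    bits, 1025 gives 11, and anything above 2^31 gives 32, which is the value
    the later size check is described as rejecting.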

    Fixes: 1feab10d7e6d ("netfilter: ipset: Unified hash type generation")
    Reported-by: syzbot+d66bfadebca46cf61a2b@syzkaller.appspotmail.com
    Signed-off-by: Vasily Averin
    Acked-by: Jozsef Kadlecsik
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Greg Kroah-Hartman

    Vasily Averin
     
  • commit 443d6e86f821a165fae3fc3fc13086d27ac140b1 upstream.

    This fixes the dereference to fetch the RCU pointer when holding
    the appropriate xtables lock.

    Reported-by: kernel test robot
    Fixes: cc00bcaa5899 ("netfilter: x_tables: Switch synchronization to RCU")
    Signed-off-by: Subash Abhinov Kasiviswanathan
    Reviewed-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Greg Kroah-Hartman

    Subash Abhinov Kasiviswanathan
     
  • [ Upstream commit 085c7c4e1c0e50d90b7d90f61a12e12b317a91e2 ]

    Both version 0 and version 1 use ETH_P_ERSPAN, but version 0 does not
    have an erspan header. So the check in gre_parse_header() is wrong,
    we have to distinguish version 1 from version 0.

    We can just check the gre header length like is_erspan_type1().

    Fixes: cb73ee40b1b3 ("net: ip_gre: use erspan key field for tunnel lookup")
    Reported-by: syzbot+f583ce3d4ddf9836b27a@syzkaller.appspotmail.com
    Cc: William Tu
    Cc: Lorenzo Bianconi
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Cong Wang
     
  • [ Upstream commit bd1248f1ddbc48b0c30565fce897a3b6423313b8 ]

    Check Scell_log shift size in red_check_params() and modify all callers
    of red_check_params() to pass Scell_log.

    This prevents a shift out-of-bounds as detected by UBSAN:
    UBSAN: shift-out-of-bounds in ./include/net/red.h:252:22
    shift exponent 72 is too large for 32-bit type 'int'
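
    A minimal sketch of validating a shift count before using it, assuming
    the limit is the width of a 32-bit type. shift_param_ok() and
    scaled_cell() are hypothetical names; the exact bound used by the patch
    is not shown in this message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Reject a shift count that a 32-bit type cannot take. */
bool shift_param_ok(uint8_t scell_log)
{
    return scell_log < 32;
}

/* Shift only after validation; returns 0 for a rejected parameter. */
uint32_t scaled_cell(uint8_t scell_log)
{
    if (!shift_param_ok(scell_log))
        return 0;
    return (uint32_t)1 << scell_log;
}
```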

    Fixes: 8afa10cbe281 ("net_sched: red: Avoid illegal values")
    Signed-off-by: Randy Dunlap
    Reported-by: syzbot+97c5bd9cc81eca63d36e@syzkaller.appspotmail.com
    Cc: Nogah Frankel
    Cc: Jamal Hadi Salim
    Cc: Cong Wang
    Cc: Jiri Pirko
    Cc: netdev@vger.kernel.org
    Cc: "David S. Miller"
    Cc: Jakub Kicinski
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Randy Dunlap
     
  • [ Upstream commit 21fdca22eb7df2a1e194b8adb812ce370748b733 ]

    RT_TOS() only clears one of the ECN bits. Therefore, when
    fib_compute_spec_dst() resorts to a fib lookup, it can return
    different results depending on the value of the second ECN bit.

    For example, ECT(0) and ECT(1) packets could be treated differently.

    $ ip netns add ns0
    $ ip netns add ns1
    $ ip link add name veth01 netns ns0 type veth peer name veth10 netns ns1
    $ ip -netns ns0 link set dev lo up
    $ ip -netns ns1 link set dev lo up
    $ ip -netns ns0 link set dev veth01 up
    $ ip -netns ns1 link set dev veth10 up

    $ ip -netns ns0 address add 192.0.2.10/24 dev veth01
    $ ip -netns ns1 address add 192.0.2.11/24 dev veth10

    $ ip -netns ns1 address add 192.0.2.21/32 dev lo
    $ ip -netns ns1 route add 192.0.2.10/32 tos 4 dev veth10 src 192.0.2.21
    $ ip netns exec ns1 sysctl -wq net.ipv4.icmp_echo_ignore_broadcasts=0

    With TOS 4 and ECT(1), ns1 replies using source address 192.0.2.21
    (ping uses -Q to set all TOS and ECN bits):

    $ ip netns exec ns0 ping -c 1 -b -Q 5 192.0.2.255
    [...]
    64 bytes from 192.0.2.21: icmp_seq=1 ttl=64 time=0.544 ms

    But with TOS 4 and ECT(0), ns1 replies using source address 192.0.2.11
    because the "tos 4" route isn't matched:

    $ ip netns exec ns0 ping -c 1 -b -Q 6 192.0.2.255
    [...]
    64 bytes from 192.0.2.11: icmp_seq=1 ttl=64 time=0.597 ms

    After this patch the ECN bits don't affect the result anymore:

    $ ip netns exec ns0 ping -c 1 -b -Q 6 192.0.2.255
    [...]
    64 bytes from 192.0.2.21: icmp_seq=1 ttl=64 time=0.591 ms
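
    The ping results above follow from simple bit masking. The sketch below
    uses two locally defined masks, chosen here to mirror "clears one ECN
    bit" (0x1E) and "clears both ECN bits" (0x1C); the macro and function
    names are made up for this example.

```c
#include <stdint.h>

/* Local masks for illustration only. ECN occupies bits 0 and 1 of the TOS. */
#define TOS_MASK_ONE_ECN_BIT_CLEARED  0x1Eu  /* clears bit 0, keeps bit 1 */
#define TOS_MASK_BOTH_ECN_BITS_CLEARED 0x1Cu /* clears bits 0 and 1 */

/* TOS value used for the route lookup before the change. */
uint8_t lookup_tos_before(uint8_t tos)
{
    return tos & TOS_MASK_ONE_ECN_BIT_CLEARED;
}

/* TOS value used for the route lookup once both ECN bits are ignored. */
uint8_t lookup_tos_after(uint8_t tos)
{
    return tos & TOS_MASK_BOTH_ECN_BITS_CLEARED;
}
```

    With -Q 5 the first mask yields 4 and the "tos 4" route matches; with
    -Q 6 it yields 6 and the route is missed. The second mask yields 4 in
    both cases.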

    Fixes: 35ebf65e851c ("ipv4: Create and use fib_compute_spec_dst() helper.")
    Signed-off-by: Guillaume Nault
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Guillaume Nault
     
  • [ Upstream commit 4ae2bb81649dc03dfc95875f02126b14b773f7ab ]

    Accesses to dev->xps_rxqs_map (when using dev->num_tc) should be
    protected by the rtnl lock, like we do for netif_set_xps_queue. I didn't
    see an actual bug being triggered, but let's be safe here and take the
    rtnl lock while accessing the map in sysfs.

    Fixes: 8af2c06ff4b1 ("net-sysfs: Add interface for Rx queue(s) map per Tx queue")
    Signed-off-by: Antoine Tenart
    Reviewed-by: Alexander Duyck
    Signed-off-by: Jakub Kicinski
    Signed-off-by: Greg Kroah-Hartman

    Antoine Tenart
     
  • [ Upstream commit 2d57b4f142e0b03e854612b8e28978935414bced ]

    Two race conditions can be triggered when storing xps rxqs, resulting in
    various oops and invalid memory accesses:

    1. Calling netdev_set_num_tc while netif_set_xps_queue is running:

    - netif_set_xps_queue uses dev->num_tc as one of the parameters to
    compute the size of new_dev_maps when allocating it. dev->num_tc is
    also used to access the map, and the compiler may generate code to
    retrieve this field multiple times in the function.

    - netdev_set_num_tc sets dev->num_tc.

    If new_dev_maps is allocated using dev->num_tc and then dev->num_tc
    is set to a higher value through netdev_set_num_tc, later accesses to
    new_dev_maps in netif_set_xps_queue could lead to accessing memory
    outside of new_dev_maps, triggering an oops.

    2. Calling netif_set_xps_queue while netdev_set_num_tc is running:

    2.1. netdev_set_num_tc starts by resetting the xps queues,
    dev->num_tc isn't updated yet.

    2.2. netif_set_xps_queue is called, setting up the map with the
    *old* dev->num_tc.

    2.3. netdev_set_num_tc updates dev->num_tc.

    2.4. Later accesses to the map lead to out of bound accesses and
    oops.

    A similar issue can be found with netdev_reset_tc.

    One way of triggering this is to set an iface up (for which the driver
    uses netdev_set_num_tc in the open path, such as bnx2x) and writing to
    xps_rxqs in a concurrent thread. With the right timing an oops is
    triggered.

    Both issues have the same fix: netif_set_xps_queue, netdev_set_num_tc
    and netdev_reset_tc should be mutually exclusive. We do that by taking
    the rtnl lock in xps_rxqs_store.

    Fixes: 8af2c06ff4b1 ("net-sysfs: Add interface for Rx queue(s) map per Tx queue")
    Signed-off-by: Antoine Tenart
    Reviewed-by: Alexander Duyck
    Signed-off-by: Jakub Kicinski
    Signed-off-by: Greg Kroah-Hartman

    Antoine Tenart
     
  • [ Upstream commit fb25038586d0064123e393cadf1fadd70a9df97a ]

    Accesses to dev->xps_cpus_map (when using dev->num_tc) should be
    protected by the rtnl lock, like we do for netif_set_xps_queue. I didn't
    see an actual bug being triggered, but let's be safe here and take the
    rtnl lock while accessing the map in sysfs.

    Fixes: 184c449f91fe ("net: Add support for XPS with QoS via traffic classes")
    Signed-off-by: Antoine Tenart
    Reviewed-by: Alexander Duyck
    Signed-off-by: Jakub Kicinski
    Signed-off-by: Greg Kroah-Hartman

    Antoine Tenart
     
  • [ Upstream commit 1ad58225dba3f2f598d2c6daed4323f24547168f ]

    Two race conditions can be triggered when storing xps cpus, resulting in
    various oops and invalid memory accesses:

    1. Calling netdev_set_num_tc while netif_set_xps_queue is running:

    - netif_set_xps_queue uses dev->num_tc as one of the parameters to
    compute the size of new_dev_maps when allocating it. dev->num_tc is
    also used to access the map, and the compiler may generate code to
    retrieve this field multiple times in the function.

    - netdev_set_num_tc sets dev->num_tc.

    If new_dev_maps is allocated using dev->num_tc and then dev->num_tc
    is set to a higher value through netdev_set_num_tc, later accesses to
    new_dev_maps in netif_set_xps_queue could lead to accessing memory
    outside of new_dev_maps, triggering an oops.

    2. Calling netif_set_xps_queue while netdev_set_num_tc is running:

    2.1. netdev_set_num_tc starts by resetting the xps queues,
    dev->num_tc isn't updated yet.

    2.2. netif_set_xps_queue is called, setting up the map with the
    *old* dev->num_tc.

    2.3. netdev_set_num_tc updates dev->num_tc.

    2.4. Later accesses to the map lead to out of bound accesses and
    oops.

    A similar issue can be found with netdev_reset_tc.

    One way of triggering this is to set an iface up (for which the driver
    uses netdev_set_num_tc in the open path, such as bnx2x) and writing to
    xps_cpus in a concurrent thread. With the right timing an oops is
    triggered.

    Both issues have the same fix: netif_set_xps_queue, netdev_set_num_tc
    and netdev_reset_tc should be mutually exclusive. We do that by taking
    the rtnl lock in xps_cpus_store.

    Fixes: 184c449f91fe ("net: Add support for XPS with QoS via traffic classes")
    Signed-off-by: Antoine Tenart
    Reviewed-by: Alexander Duyck
    Signed-off-by: Jakub Kicinski
    Signed-off-by: Greg Kroah-Hartman

    Antoine Tenart
     
  • [ Upstream commit 427c940558560bff2583d07fc119a21094675982 ]

    When aggregating ncsi interfaces and dedicated interfaces to bond
    interfaces, the ncsi response handler will use the wrong net device to
    find ncsi_dev, so that the ncsi interface will not work properly.
    Here, we use the original net device to fix it.

    Fixes: 138635cc27c9 ("net/ncsi: NCSI response packet handler")
    Signed-off-by: John Wang
    Link: https://lore.kernel.org/r/20201223055523.2069-1-wangzhiqiang.bj@bytedance.com
    Signed-off-by: Jakub Kicinski
    Signed-off-by: Greg Kroah-Hartman

    John Wang
     
  • [ Upstream commit 698285da79f5b0b099db15a37ac661ac408c80eb ]

    taprio_graft() can insert a NULL element in the array of child qdiscs. As
    a consequence, taprio_reset() might not reset child qdiscs completely, and
    taprio_destroy() might leak resources. Fix it by ensuring that loops that
    iterate over q->qdiscs[] don't end when they find the first NULL item.
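
    The loop shape described above can be sketched in userspace as follows.
    struct child and reset_children() are hypothetical; the point is only
    that an empty slot is skipped rather than treated as the end of the array.

```c
#include <stddef.h>

/* Hypothetical child object with some state to reset. */
struct child {
    int backlog;
};

/* Visit every slot; skip empty ones instead of stopping at the first NULL. */
void reset_children(struct child **slots, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!slots[i])
            continue;
        slots[i]->backlog = 0;
    }
}
```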

    Fixes: 44d4775ca518 ("net/sched: sch_taprio: reset child qdiscs before freeing them")
    Fixes: 5a781ccbd19e ("tc: Add support for configuring the taprio scheduler")
    Suggested-by: Jakub Kicinski
    Signed-off-by: Davide Caratti
    Link: https://lore.kernel.org/r/13edef6778fef03adc751582562fba4a13e06d6a.1608240532.git.dcaratti@redhat.com
    Signed-off-by: Jakub Kicinski
    Signed-off-by: Greg Kroah-Hartman

    Davide Caratti
     

09 Jan, 2021

1 commit

  • commit a31489d2a368d2f9225ed6a6f595c63bc7d10de8 upstream.

    During controller initialization, an LE Set RPA Timeout command is sent
    to the controller if supported. However, the value checked to determine
    if the command is supported is incorrect. Page 1921 of the Bluetooth
    Core Spec v5.2 shows that bit 2 of octet 35 of the Supported_Commands
    field corresponds to the LE Set RPA Timeout command, but currently
    bit 6 of octet 35 is checked. This patch checks the correct value
    instead.
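
    The difference between the two checks can be illustrated with a small
    bitmap helper. The octet and bit numbers are taken from the text above;
    the helper names and the 64-octet length constant are assumptions made
    for this standalone sketch.

```c
#include <stdbool.h>
#include <stdint.h>

#define SUPPORTED_COMMANDS_LEN 64  /* assumed bitmap length in octets */

/* True when the given bit of the given octet is set in the bitmap. */
bool command_supported(const uint8_t cmds[SUPPORTED_COMMANDS_LEN],
                       unsigned int octet, unsigned int bit)
{
    return (cmds[octet] >> bit) & 1u;
}

/* Octet 35, bit 2 (mask 0x04); the old check looked at bit 6 (mask 0x40). */
bool le_set_rpa_timeout_supported(const uint8_t cmds[SUPPORTED_COMMANDS_LEN])
{
    return command_supported(cmds, 35, 2);
}
```

    A controller that sets only bit 6 of octet 35 would have passed the old
    check and then rejected the command, as in the btmon trace that follows.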

    This issue led to the error seen in the following btmon output during
    initialization of an adapter (rtl8761b) and prevented initialization
    from completing.

    < HCI Command: LE Set Resolvable Private Address Timeout (0x08|0x002e) plen 2
    Timeout: 900 seconds
    > HCI Event: Command Complete (0x0e) plen 4
    LE Set Resolvable Private Address Timeout (0x08|0x002e) ncmd 2
    Status: Unsupported Remote Feature / Unsupported LMP Feature (0x1a)
    = Close Index: 00:E0:4C:6B:E5:03

    The error did not appear when running with this patch.

    Signed-off-by: Edward Vear
    Signed-off-by: Marcel Holtmann
    Signed-off-by: Johan Hedberg
    Cc: Sudip Mukherjee
    Signed-off-by: Greg Kroah-Hartman

    Edward Vear
     

06 Jan, 2021

4 commits

  • [ Upstream commit efb796f5571f030743e1d9c662cdebdad724f8c5 ]

    Syzbot reported a shift of a u32 by more than 31 in strset_parse_request()
    which is undefined behavior. This is caused by range check of string set id
    using variable ret (which is always 0 at this point) instead of id (string
    set id from request).
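
    A sketch of the intended check: validate the id taken from the request
    itself before it is used as a shift count. request_string_set(),
    STRING_SET_COUNT and the chosen error code are hypothetical and only
    illustrate the pattern.

```c
#include <errno.h>
#include <stdint.h>

#define STRING_SET_COUNT 16u  /* hypothetical number of known string sets */

/* Range-check the requested id, then record it in the request bitmask. */
int request_string_set(uint32_t *req_mask, uint32_t id)
{
    if (id >= STRING_SET_COUNT)
        return -EOPNOTSUPP;
    *req_mask |= (uint32_t)1 << id;
    return 0;
}
```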

    Fixes: 71921690f974 ("ethtool: provide string sets with STRSET_GET request")
    Reported-by: syzbot+96523fb438937cd01220@syzkaller.appspotmail.com
    Signed-off-by: Michal Kubecek
    Link: https://lore.kernel.org/r/b54ed5c5fd972a59afea3e1badfb36d86df68799.1607952208.git.mkubecek@suse.cz
    Signed-off-by: Jakub Kicinski
    Signed-off-by: Greg Kroah-Hartman

    Michal Kubecek
     
  • [ Upstream commit ef72cd3c5ce168829c6684ecb2cae047d3493690 ]

    Fix two error paths in ethnl_set_channels() to avoid lock-up caused
    by unreleased RTNL.

    Fixes: e19c591eafad ("ethtool: set device channel counts with CHANNELS_SET request")
    Reported-by: LiLiang
    Signed-off-by: Ivan Vecera
    Reviewed-by: Michal Kubecek
    Link: https://lore.kernel.org/r/20201215090810.801777-1-ivecera@redhat.com
    Signed-off-by: Jakub Kicinski
    Signed-off-by: Greg Kroah-Hartman

    Ivan Vecera
     
  • [ Upstream commit 0c14846032f2c0a3b63234e1fc2759f4155b6067 ]

    Currently MPTCP is not propagating the security context
    from the ingress request socket to newly created msk
    at clone time.

    Address the issue by invoking the missing security helper.

    Fixes: cf7da0d66cc1 ("mptcp: Create SUBFLOW socket for incoming connections")
    Signed-off-by: Paolo Abeni
    Reviewed-by: Mat Martineau
    Signed-off-by: Jakub Kicinski
    Signed-off-by: Greg Kroah-Hartman

    Paolo Abeni
     
  • [ Upstream commit 44d4775ca51805b376a8db5b34f650434a08e556 ]

    syzkaller shows that packets can still be dequeued while taprio_destroy()
    is running. Let sch_taprio use the reset() function to cancel the advance
    timer and drop all skbs from the child qdiscs.

    Fixes: 5a781ccbd19e ("tc: Add support for configuring the taprio scheduler")
    Link: https://syzkaller.appspot.com/bug?id=f362872379bf8f0017fb667c1ab158f2d1e764ae
    Reported-by: syzbot+8971da381fb5a31f542d@syzkaller.appspotmail.com
    Signed-off-by: Davide Caratti
    Acked-by: Vinicius Costa Gomes
    Link: https://lore.kernel.org/r/63b6d79b0e830ebb0283e020db4df3cdfdfb2b94.1608142843.git.dcaratti@redhat.com
    Signed-off-by: Jakub Kicinski
    Signed-off-by: Greg Kroah-Hartman

    Davide Caratti