08 Oct, 2020

1 commit

  • * tag 'v5.4.70': (3051 commits)
    Linux 5.4.70
    netfilter: ctnetlink: add a range check for l3/l4 protonum
    ep_create_wakeup_source(): dentry name can change under you...
    ...

    Conflicts:
    arch/arm/mach-imx/pm-imx6.c
    arch/arm64/boot/dts/freescale/imx8mm-evk.dts
    arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts
    drivers/crypto/caam/caamalg.c
    drivers/gpu/drm/imx/dw_hdmi-imx.c
    drivers/gpu/drm/imx/imx-ldb.c
    drivers/gpu/drm/imx/ipuv3/ipuv3-crtc.c
    drivers/mmc/host/sdhci-esdhc-imx.c
    drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
    drivers/net/ethernet/freescale/enetc/enetc.c
    drivers/net/ethernet/freescale/enetc/enetc_pf.c
    drivers/thermal/imx_thermal.c
    drivers/usb/cdns3/ep0.c
    drivers/xen/swiotlb-xen.c
    sound/soc/fsl/fsl_esai.c
    sound/soc/fsl/fsl_sai.c

    Signed-off-by: Jason Liu

    Jason Liu
     

27 Sep, 2020

3 commits

  • [ Upstream commit b5b73b26b3ca34574124ed7ae9c5ba8391a7f176 ]

    It's possible that the user specifies an interval that couldn't allow
    any packet to be transmitted. This also avoids the issue of the
    hrtimer handler starving the other threads because it's running too
    often.

    The solution is to reject interval sizes that according to the current
    link speed wouldn't allow any packet to be transmitted.

    Reported-by: syzbot+8267241609ae8c23b248@syzkaller.appspotmail.com
    Fixes: 5a781ccbd19e ("tc: Add support for configuring the taprio scheduler")
    Signed-off-by: Vinicius Costa Gomes
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Vinicius Costa Gomes
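
The interval check described in this commit can be illustrated with a minimal userspace sketch. It is not the kernel code: the names `tx_time_ns`, `interval_allows_a_packet` and `MIN_FRAME_BYTES`, and the choice of a 60-byte minimum frame, are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the smallest frame we care about is 60 bytes (no FCS). */
#define MIN_FRAME_BYTES 60u

/* Time in nanoseconds needed to transmit `bytes` at `speed_mbps` Mbit/s. */
static uint64_t tx_time_ns(uint64_t bytes, uint64_t speed_mbps)
{
    assert(speed_mbps > 0);
    /* bits / (Mbit/s) yields microseconds; multiply by 1000 for ns. */
    return bytes * 8u * 1000u / speed_mbps;
}

/*
 * Hypothetical validity check: accept an interval only if at least one
 * minimum-size frame fits into it at the current link speed.
 */
static bool interval_allows_a_packet(uint64_t interval_ns, uint64_t speed_mbps)
{
    if (speed_mbps == 0)
        return false; /* unknown link speed: nothing can be verified */
    return interval_ns >= tx_time_ns(MIN_FRAME_BYTES, speed_mbps);
}
```

Under these assumptions, an interval that is long enough on a fast link can still be rejected on a slower one, because the same frame takes longer to transmit there.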
     
  • [ Upstream commit 2fb541c862c987d02dfdf28f1545016deecfa0d5 ]

    Currently a reset and an enqueue operation can run concurrently on
    the same lockless qdisc, because no lock synchronizes the
    q->enqueue() in __dev_xmit_skb() with the qdisc reset operation in
    qdisc_deactivate() called by dev_deactivate_queue(). This may cause
    an out-of-bounds access to priv->ring[] in the hns3 driver: if the
    user has requested a smaller queue num, __dev_xmit_skb() may still
    enqueue an skb with a larger queue_mapping after the corresponding
    qdisc is reset, and call hns3_nic_net_xmit() with that skb later.

    Reuse the existing synchronize_net() in dev_deactivate_many() to
    make sure an skb with a larger queue_mapping enqueued to the old
    qdisc (which is saved in dev_queue->qdisc_sleeping) will always be
    reset when dev_reset_queue() is called.

    Fixes: 6b3ba9146fe6 ("net: sched: allow qdiscs to handle locking")
    Signed-off-by: Yunsheng Lin
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Yunsheng Lin
     
  • [ Upstream commit cc8e58f8325cdf14b9516b61c384cdfd02a4f408 ]

    The following deadlock scenario is triggered by syzbot:

    Thread A:                   Thread B:
    tcf_idr_check_alloc()
    ...
    populate_metalist()
      rtnl_unlock()
                                rtnl_lock()
                                ...
      request_module()          tcf_idr_check_alloc()
      rtnl_lock()

    At this point, thread A is waiting for thread B to release RTNL
    lock, while thread B is waiting for thread A to commit the IDR
    change with tcf_idr_insert() later.

    Break this deadlock by preloading the ife modules earlier, before
    tcf_idr_check_alloc(). This is fine because we only load modules
    that we potentially need.

    Reported-and-tested-by: syzbot+80e32b5d1f9923f8ace6@syzkaller.appspotmail.com
    Fixes: 0190c1d452a9 ("net: sched: atomically check-allocate action")
    Cc: Jamal Hadi Salim
    Cc: Vlad Buslov
    Cc: Jiri Pirko
    Signed-off-by: Cong Wang
    Signed-off-by: Jakub Kicinski
    Signed-off-by: Greg Kroah-Hartman

    Cong Wang
     

12 Sep, 2020

1 commit

  • [ Upstream commit 09e31cf0c528dac3358a081dc4e773d1b3de1bc9 ]

    Since commit 9c66d1564676 ("taprio: Add support for hardware
    offloading") there's a bit of inconsistency when offloading schedules
    to the hardware:

    In software mode, the gate masks are specified in terms of traffic
    classes, so "sched-entry S 03 20000", for example, means that traffic
    classes 0 and 1 are open for 20us; when taprio is offloaded to
    hardware, the gate masks are specified in terms of hardware queues.

    The idea here is to fix hardware offloading, so schedules in hardware
    and software mode have the same behavior. What needs to be done is to
    map traffic classes to queues when applying the offload to the driver.

    Fixes: 9c66d1564676 ("taprio: Add support for hardware offloading")
    Signed-off-by: Vinicius Costa Gomes
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Vinicius Costa Gomes
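
The mapping from a traffic-class gate mask to a hardware-queue gate mask can be sketched as follows. This is a userspace illustration, not the kernel implementation: `struct tc_txq` and `tc_mask_to_queue_mask` are hypothetical names, and the assumption is that each traffic class owns a contiguous range of queues described by an offset and a count.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: each traffic class owns `count` queues starting at `offset`. */
struct tc_txq {
    uint16_t offset;
    uint16_t count;
};

/*
 * Translate a gate mask expressed in traffic classes into a gate mask
 * expressed in hardware queues. Bit N of tc_mask means "traffic class N
 * is open"; bit N of the result means "queue N is open".
 */
static uint32_t tc_mask_to_queue_mask(uint32_t tc_mask,
                                      const struct tc_txq *tc_to_txq,
                                      unsigned int num_tc)
{
    uint32_t queue_mask = 0;

    assert(num_tc <= 32);
    for (unsigned int tc = 0; tc < num_tc; tc++) {
        if (!(tc_mask & (1u << tc)))
            continue; /* this traffic class is closed */

        for (unsigned int q = 0; q < tc_to_txq[tc].count; q++) {
            unsigned int bit = tc_to_txq[tc].offset + q;

            if (bit < 32) /* ignore queues that do not fit in the mask */
                queue_mask |= 1u << bit;
        }
    }
    return queue_mask;
}
```

With a made-up layout in which traffic class 0 owns queues 0 and 1 and traffic class 1 owns queue 2, the gate mask 03 from the "sched-entry S 03 20000" example opens queues 0, 1 and 2.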
     

03 Sep, 2020

1 commit

  • [ Upstream commit eda814b97dfb8d9f4808eb2f65af9bd3705c4cae ]

    tcf_ct_handle_fragments() shouldn't free the skb when ip_defrag() call
    fails. Otherwise, we will cause a double-free bug.
    In such cases, just return the error to the caller.

    Fixes: b57dc7c13ea9 ("net/sched: Introduce action ct")
    Signed-off-by: Alaa Hleihel
    Reviewed-by: Roi Dayan
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Alaa Hleihel
     

22 Jul, 2020

2 commits

  • [ Upstream commit d7bf2ebebc2bd61ab95e2a8e33541ef282f303d4 ]

    There are a couple of places in net/sched/ that check skb->protocol and act
    on the value there. However, in the presence of VLAN tags, the value stored
    in skb->protocol can be inconsistent based on whether VLAN acceleration is
    enabled. The commit quoted in the Fixes tag below fixed the users of
    skb->protocol to use a helper that will always see the VLAN ethertype.

    However, most of the callers don't actually handle the VLAN ethertype, but
    expect to find the IP header type in the protocol field. This means that
    things like changing the ECN field, or parsing diffserv values, stop
    working if there's a VLAN tag, or if there are multiple nested VLAN
    tags (QinQ).

    To fix this, change the helper to take an argument that indicates whether
    the caller wants to skip the VLAN tags or not. When skipping VLAN tags, we
    make sure to skip all of them, so behaviour is consistent even in QinQ
    mode.

    To make the helper usable from the ECN code, move it to if_vlan.h instead
    of pkt_sched.h.

    v3:
    - Remove empty lines
    - Move vlan variable definitions inside loop in skb_protocol()
    - Also use skb_protocol() helper in IP{,6}_ECN_decapsulate() and
      bpf_skb_ecn_set_ce()

    v2:
    - Use eth_type_vlan() helper in skb_protocol()
    - Also fix code that reads skb->protocol directly
    - Change a couple of 'if/else if' statements to switch constructs to avoid
      calling the helper twice

    Reported-by: Ilya Ponetayev
    Fixes: d8b9605d2697 ("net: sched: fix skb->protocol use in case of accelerated vlan path")
    Signed-off-by: Toke Høiland-Jørgensen
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Toke Høiland-Jørgensen
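
The idea of a protocol helper that optionally skips every VLAN tag can be sketched in userspace on a raw Ethernet frame. This is not the kernel's skb_protocol(): the function `frame_protocol`, the `EX_` constants and the convention of returning 0 for a truncated frame are assumptions made for this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EX_ETH_HLEN       14      /* dst MAC + src MAC + ethertype */
#define EX_VLAN_TAG_LEN   4       /* TCI + encapsulated ethertype  */
#define EX_ETH_P_8021Q    0x8100  /* C-VLAN tag                    */
#define EX_ETH_P_8021AD   0x88A8  /* S-VLAN tag (QinQ outer tag)   */

static bool is_vlan_ethertype(uint16_t proto)
{
    return proto == EX_ETH_P_8021Q || proto == EX_ETH_P_8021AD;
}

/* Read a big-endian 16-bit value at `off`. */
static uint16_t read_be16(const uint8_t *p, size_t off)
{
    return (uint16_t)((p[off] << 8) | p[off + 1]);
}

/*
 * Return the ethertype of the frame in host byte order. With skip_vlan
 * set, walk past all nested VLAN tags and return the inner protocol.
 * Returns 0 if the frame is too short (an assumption of this sketch).
 */
static uint16_t frame_protocol(const uint8_t *frame, size_t len, bool skip_vlan)
{
    size_t off = 12; /* ethertype follows the two MAC addresses */
    uint16_t proto;

    if (len < EX_ETH_HLEN)
        return 0;

    proto = read_be16(frame, off);
    if (!skip_vlan)
        return proto;

    while (is_vlan_ethertype(proto)) {
        off += EX_VLAN_TAG_LEN;
        if (off + 2 > len)
            return 0; /* truncated inside the tag stack */
        proto = read_be16(frame, off);
    }
    return proto;
}
```

A caller that wants the IP header type passes `skip_vlan = true` and gets the same answer for untagged, single-tagged and QinQ frames; a caller that cares about the tag itself passes `false`.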
     
  • [ Upstream commit 306381aec7c2b5a658eebca008c8a1b666536cba ]

    When tcf_block_get() fails inside atm_tc_init(),
    atm_tc_put() is called to release the qdisc p->link.q.
    But the flow->ref prevents it from doing so, as the flow->ref
    is still zero.

    Fix this by moving the p->link.ref initialization before
    tcf_block_get().

    Fixes: 6529eaba33f0 ("net: sched: introduce tcf block infractructure")
    Reported-and-tested-by: syzbot+d411cff6ab29cc2c311b@syzkaller.appspotmail.com
    Cc: Jamal Hadi Salim
    Cc: Jiri Pirko
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Cong Wang
     

01 Jul, 2020

4 commits

  • [ Upstream commit 1a3db27ad9a72d033235b9673653962c02e3486e ]

    Since the quiesce/activate rework, __netdev_watchdog_up() is directly
    called in the ucc_geth driver.

    Unfortunately, this function is not available for modules and thus
    ucc_geth cannot be built as a module anymore. Fix it by exporting
    __netdev_watchdog_up().

    Since the commit introducing the regression was backported to stable
    branches, this one should ideally be as well.

    Fixes: 79dde73cf9bc ("net/ethernet/freescale: rework quiesce/activate for ucc_geth")
    Signed-off-by: Valentin Longchamp
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Valentin Longchamp
     
  • [ Upstream commit 3f608f0c41360b11b04c763f348b712f651c8bac ]

    I spotted a few nits when comparing the in-tree version of sch_cake with
    the out-of-tree one: A redundant error variable declaration shadowing an
    outer declaration, and an indentation alignment issue. Fix both of these.

    Fixes: 046f6fd5daef ("sched: Add Common Applications Kept Enhanced (cake) qdisc")
    Signed-off-by: Toke Høiland-Jørgensen
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Toke Høiland-Jørgensen
     
  • [ Upstream commit 8c95eca0bb8c4bd2231a0d581f1ad0d50c90488c ]

    As a further optimisation of the diffserv parsing codepath, we can skip it
    entirely if CAKE is configured to neither use diffserv-based
    classification, nor to zero out the diffserv bits.

    Fixes: c87b4ecdbe8d ("sch_cake: Make sure we can write the IP header before changing DSCP bits")
    Signed-off-by: Toke Høiland-Jørgensen
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Toke Høiland-Jørgensen
     
  • [ Upstream commit 9208d2863ac689a563b92f2161d8d1e7127d0add ]

    cake_handle_diffserv() tries to linearize the mac and network header parts
    of the skb and to make it writable unconditionally. In some cases this
    leads to a full skb reallocation, which reduces throughput and increases
    CPU load. Some measurements of IPv4 forward + NAPT on a MIPS router with a
    580 MHz single-core CPU were conducted. It appears that on kernel 4.9
    skb_try_make_writable() reallocates the skb if it was allocated in the
    ethernet driver via the so-called 'build skb' method from page cache (this
    was first discovered through a strange increase of the kmalloc-2048 slab).

    Obtain DSCP value via read-only skb_header_pointer() call, and leave
    linearization only for DSCP bleaching or ECN CE setting. And, as an
    additional optimisation, skip diffserv parsing entirely if it is not needed
    by the current configuration.

    Fixes: c87b4ecdbe8d ("sch_cake: Make sure we can write the IP header before changing DSCP bits")
    Signed-off-by: Ilya Ponetayev
    [ fix a few style issues, reflow commit message ]
    Signed-off-by: Toke Høiland-Jørgensen
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Ilya Ponetayev
     

19 Jun, 2020

1 commit

  • * tag 'v5.4.47': (2193 commits)
    Linux 5.4.47
    KVM: arm64: Save the host's PtrAuth keys in non-preemptible context
    KVM: arm64: Synchronize sysreg state on injecting an AArch32 exception
    ...

    Conflicts:
    arch/arm/boot/dts/imx6qdl.dtsi
    arch/arm/mach-imx/Kconfig
    arch/arm/mach-imx/common.h
    arch/arm/mach-imx/suspend-imx6.S
    arch/arm64/boot/dts/freescale/imx8qxp-mek.dts
    arch/powerpc/include/asm/cacheflush.h
    drivers/cpufreq/imx6q-cpufreq.c
    drivers/dma/imx-sdma.c
    drivers/edac/synopsys_edac.c
    drivers/firmware/imx/imx-scu.c
    drivers/net/ethernet/freescale/fec.h
    drivers/net/ethernet/freescale/fec_main.c
    drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
    drivers/net/phy/phy_device.c
    drivers/perf/fsl_imx8_ddr_perf.c
    drivers/usb/cdns3/gadget.c
    drivers/usb/dwc3/gadget.c
    include/uapi/linux/dma-buf.h

    Signed-off-by: Jason Liu

    Jason Liu
     

20 May, 2020

1 commit

  • [ Upstream commit a7df4870d79b00742da6cc93ca2f336a71db77f7 ]

    When we tell the kernel to dump filters from root (ffff:ffff),
    those filters on ingress (ffff:0000) are matched, but their
    true parents must be dumped as they are. However, the kernel
    dumps just whatever we tell it, that is either ffff:ffff
    or ffff:0000:

    $ nl-cls-list --dev=dummy0 --parent=root
    cls basic dev dummy0 id none parent root prio 49152 protocol ip match-all
    cls basic dev dummy0 id :1 parent root prio 49152 protocol ip match-all
    $ nl-cls-list --dev=dummy0 --parent=ffff:
    cls basic dev dummy0 id none parent ffff: prio 49152 protocol ip match-all
    cls basic dev dummy0 id :1 parent ffff: prio 49152 protocol ip match-all

    This is confusing and misleading. More importantly, this is
    a regression since 4.15, so the old behavior must be restored.

    Also, when tc filters are installed on a tc class, the parent
    should be the classid, rather than the qdisc handle. Commit
    edf6711c9840 ("net: sched: remove classid and q fields from tcf_proto")
    removed the classid we save for filters; we can just restore
    this classid in tcf_block.

    Steps to reproduce this:
    ip li set dev dummy0 up
    tc qd add dev dummy0 ingress
    tc filter add dev dummy0 parent ffff: protocol arp basic action pass
    tc filter show dev dummy0 root

    Before this patch:
    filter protocol arp pref 49152 basic
    filter protocol arp pref 49152 basic handle 0x1
    action order 1: gact action pass
    random type none pass val 0
    index 1 ref 1 bind 1

    After this patch:
    filter parent ffff: protocol arp pref 49152 basic
    filter parent ffff: protocol arp pref 49152 basic handle 0x1
    action order 1: gact action pass
    random type none pass val 0
    index 1 ref 1 bind 1

    Fixes: a10fa20101ae ("net: sched: propagate q and parent from caller down to tcf_fill_node")
    Fixes: edf6711c9840 ("net: sched: remove classid and q fields from tcf_proto")
    Cc: Jamal Hadi Salim
    Cc: Jiri Pirko
    Signed-off-by: Cong Wang
    Acked-by: Jamal Hadi Salim
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Cong Wang
     

14 May, 2020

4 commits

  • [ Upstream commit df4953e4e997e273501339f607b77953772e3559 ]

    syzbot managed to set up sfq so that q->scaled_quantum was zero,
    triggering an infinite loop in sfq_dequeue().

    More generally, we must only accept quantum between 1 and 2^18 - 7,
    meaning scaled_quantum must be in [1, 0x7FFF] range.

    Otherwise, we also could have a loop in sfq_dequeue()
    if scaled_quantum happens to be 0x8000, since slot->allot
    could indefinitely switch between 0 and 0x8000.

    Fixes: eeaeb068f139 ("sch_sfq: allow big packets and be fair")
    Signed-off-by: Eric Dumazet
    Reported-by: syzbot+0251e883fe39e7a0cb0a@syzkaller.appspotmail.com
    Cc: Jason A. Donenfeld
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
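
The range check on the scaled quantum can be sketched as below. This is an illustration only: the helper names are hypothetical, and the scaling (round-up division of the quantum by 8) is an assumption of this sketch rather than something stated in the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_ALLOT_SHIFT        3       /* assumption: quantum is scaled to 8-byte units */
#define EX_SCALED_QUANTUM_MAX 0x7FFF  /* upper bound named in the commit message */

/* Scale a quantum in bytes, rounding up (64-bit math avoids wrap-around). */
static uint64_t scale_quantum(uint32_t quantum)
{
    return ((uint64_t)quantum + (1u << EX_ALLOT_SHIFT) - 1) >> EX_ALLOT_SHIFT;
}

/* Accept only quanta whose scaled value lies in [1, 0x7FFF]. */
static bool quantum_is_valid(uint32_t quantum)
{
    uint64_t scaled = scale_quantum(quantum);

    return scaled >= 1 && scaled <= EX_SCALED_QUANTUM_MAX;
}
```

Rejecting a scaled value of zero rules out the infinite loop reported by syzbot, and rejecting values above 0x7FFF rules out the 0x8000 case in which the allotment could keep switching between 0 and 0x8000.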
     
  • [ Upstream commit 8738c85c72b3108c9b9a369a39868ba5f8e10ae0 ]

    If choke_init() could not allocate q->tab, we would crash later
    in choke_reset().

    BUG: KASAN: null-ptr-deref in memset include/linux/string.h:366 [inline]
    BUG: KASAN: null-ptr-deref in choke_reset+0x208/0x340 net/sched/sch_choke.c:326
    Write of size 8 at addr 0000000000000000 by task syz-executor822/7022

    CPU: 1 PID: 7022 Comm: syz-executor822 Not tainted 5.7.0-rc1-syzkaller #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:
    __dump_stack lib/dump_stack.c:77 [inline]
    dump_stack+0x188/0x20d lib/dump_stack.c:118
    __kasan_report.cold+0x5/0x4d mm/kasan/report.c:515
    kasan_report+0x33/0x50 mm/kasan/common.c:625
    check_memory_region_inline mm/kasan/generic.c:187 [inline]
    check_memory_region+0x141/0x190 mm/kasan/generic.c:193
    memset+0x20/0x40 mm/kasan/common.c:85
    memset include/linux/string.h:366 [inline]
    choke_reset+0x208/0x340 net/sched/sch_choke.c:326
    qdisc_reset+0x6b/0x520 net/sched/sch_generic.c:910
    dev_deactivate_queue.constprop.0+0x13c/0x240 net/sched/sch_generic.c:1138
    netdev_for_each_tx_queue include/linux/netdevice.h:2197 [inline]
    dev_deactivate_many+0xe2/0xba0 net/sched/sch_generic.c:1195
    dev_deactivate+0xf8/0x1c0 net/sched/sch_generic.c:1233
    qdisc_graft+0xd25/0x1120 net/sched/sch_api.c:1051
    tc_modify_qdisc+0xbab/0x1a00 net/sched/sch_api.c:1670
    rtnetlink_rcv_msg+0x44e/0xad0 net/core/rtnetlink.c:5454
    netlink_rcv_skb+0x15a/0x410 net/netlink/af_netlink.c:2469
    netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline]
    netlink_unicast+0x537/0x740 net/netlink/af_netlink.c:1329
    netlink_sendmsg+0x882/0xe10 net/netlink/af_netlink.c:1918
    sock_sendmsg_nosec net/socket.c:652 [inline]
    sock_sendmsg+0xcf/0x120 net/socket.c:672
    ____sys_sendmsg+0x6bf/0x7e0 net/socket.c:2362
    ___sys_sendmsg+0x100/0x170 net/socket.c:2416
    __sys_sendmsg+0xec/0x1b0 net/socket.c:2449
    do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295

    Fixes: 77e62da6e60c ("sch_choke: drop all packets in queue during reset")
    Signed-off-by: Eric Dumazet
    Reported-by: syzbot
    Cc: Cong Wang
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
     
  • [ Upstream commit 2761121af87de45951989a0adada917837d8fa82 ]

    Do not assume the attribute has the right size.

    Fixes: aea5f654e6b7 ("net/sched: add skbprio scheduler")
    Signed-off-by: Eric Dumazet
    Reported-by: syzbot
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
     
  • [ Upstream commit 14695212d4cd8b0c997f6121b6df8520038ce076 ]

    My intent was to not let users set a zero drop_batch_size,
    it seems I once again messed with min()/max().

    Fixes: 9d18562a2278 ("fq_codel: add batch ability to fq_codel_drop()")
    Signed-off-by: Eric Dumazet
    Acked-by: Toke Høiland-Jørgensen
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
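
The kind of min()/max() mix-up described here can be shown with two tiny helpers. Both function names are hypothetical and the code is a sketch of the general pitfall, not a copy of the kernel change.

```c
#include <stdint.h>

/* Intended behavior: enforce a lower bound of 1, i.e. max(1, requested). */
static uint32_t drop_batch_floor(uint32_t requested)
{
    return requested > 1u ? requested : 1u;
}

/* The mix-up: min(1, requested) caps the value at 1 and still lets 0 through. */
static uint32_t drop_batch_capped(uint32_t requested)
{
    return requested < 1u ? requested : 1u;
}
```

A lower bound needs max(), because max() returns the larger of the bound and the request; min() does the opposite and turns the bound into a ceiling.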
     

29 Apr, 2020

1 commit

  • [ Upstream commit a1211bf9a7774706722ba3b18c6157d980319f79 ]

    skb->sk does not always point to a full-blown socket;
    we need to use sk_fullsock() before accessing fields which
    only make sense on a full socket.

    BUG: KASAN: use-after-free in report_sock_error+0x286/0x300 net/sched/sch_etf.c:141
    Read of size 1 at addr ffff88805eb9b245 by task syz-executor.5/9630

    CPU: 1 PID: 9630 Comm: syz-executor.5 Not tainted 5.7.0-rc2-syzkaller #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:

    __dump_stack lib/dump_stack.c:77 [inline]
    dump_stack+0x188/0x20d lib/dump_stack.c:118
    print_address_description.constprop.0.cold+0xd3/0x315 mm/kasan/report.c:382
    __kasan_report.cold+0x35/0x4d mm/kasan/report.c:511
    kasan_report+0x33/0x50 mm/kasan/common.c:625
    report_sock_error+0x286/0x300 net/sched/sch_etf.c:141
    etf_enqueue_timesortedlist+0x389/0x740 net/sched/sch_etf.c:170
    __dev_xmit_skb net/core/dev.c:3710 [inline]
    __dev_queue_xmit+0x154a/0x30a0 net/core/dev.c:4021
    neigh_hh_output include/net/neighbour.h:499 [inline]
    neigh_output include/net/neighbour.h:508 [inline]
    ip6_finish_output2+0xfb5/0x25b0 net/ipv6/ip6_output.c:117
    __ip6_finish_output+0x442/0xab0 net/ipv6/ip6_output.c:143
    ip6_finish_output+0x34/0x1f0 net/ipv6/ip6_output.c:153
    NF_HOOK_COND include/linux/netfilter.h:296 [inline]
    ip6_output+0x239/0x810 net/ipv6/ip6_output.c:176
    dst_output include/net/dst.h:435 [inline]
    NF_HOOK include/linux/netfilter.h:307 [inline]
    NF_HOOK include/linux/netfilter.h:301 [inline]
    ip6_xmit+0xe1a/0x2090 net/ipv6/ip6_output.c:280
    tcp_v6_send_synack+0x4e7/0x960 net/ipv6/tcp_ipv6.c:521
    tcp_rtx_synack+0x10d/0x1a0 net/ipv4/tcp_output.c:3916
    inet_rtx_syn_ack net/ipv4/inet_connection_sock.c:669 [inline]
    reqsk_timer_handler+0x4c2/0xb40 net/ipv4/inet_connection_sock.c:763
    call_timer_fn+0x1ac/0x780 kernel/time/timer.c:1405
    expire_timers kernel/time/timer.c:1450 [inline]
    __run_timers kernel/time/timer.c:1774 [inline]
    __run_timers kernel/time/timer.c:1741 [inline]
    run_timer_softirq+0x623/0x1600 kernel/time/timer.c:1787
    __do_softirq+0x26c/0x9f7 kernel/softirq.c:292
    invoke_softirq kernel/softirq.c:373 [inline]
    irq_exit+0x192/0x1d0 kernel/softirq.c:413
    exiting_irq arch/x86/include/asm/apic.h:546 [inline]
    smp_apic_timer_interrupt+0x19e/0x600 arch/x86/kernel/apic/apic.c:1140
    apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:829

    RIP: 0010:des_encrypt+0x157/0x9c0 lib/crypto/des.c:792
    Code: 85 22 06 00 00 41 31 dc 41 8b 4d 04 44 89 e2 41 83 e4 3f 4a 8d 3c a5 60 72 72 88 81 e2 3f 3f 3f 3f 48 89 f8 48 c1 e8 03 31 d9 b6 34 28 48 89 f8 c1 c9 04 83 e0 07 83 c0 03 40 38 f0 7c 09 40
    RSP: 0018:ffffc90003b5f6c0 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff13
    RAX: 1ffffffff10e4e55 RBX: 00000000d2f846d0 RCX: 00000000d2f846d0
    RDX: 0000000012380612 RSI: ffffffff839863ca RDI: ffffffff887272a8
    RBP: dffffc0000000000 R08: ffff888091d0a380 R09: 0000000000800081
    R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000012
    R13: ffff8880a8ae8078 R14: 00000000c545c93e R15: 0000000000000006
    cipher_crypt_one crypto/cipher.c:75 [inline]
    crypto_cipher_encrypt_one+0x124/0x210 crypto/cipher.c:82
    crypto_cbcmac_digest_update+0x1b5/0x250 crypto/ccm.c:830
    crypto_shash_update+0xc4/0x120 crypto/shash.c:119
    shash_ahash_update+0xa3/0x110 crypto/shash.c:246
    crypto_ahash_update include/crypto/hash.h:547 [inline]
    hash_sendmsg+0x518/0xad0 crypto/algif_hash.c:102
    sock_sendmsg_nosec net/socket.c:652 [inline]
    sock_sendmsg+0xcf/0x120 net/socket.c:672
    ____sys_sendmsg+0x308/0x7e0 net/socket.c:2362
    ___sys_sendmsg+0x100/0x170 net/socket.c:2416
    __sys_sendmmsg+0x195/0x480 net/socket.c:2506
    __do_sys_sendmmsg net/socket.c:2535 [inline]
    __se_sys_sendmmsg net/socket.c:2532 [inline]
    __x64_sys_sendmmsg+0x99/0x100 net/socket.c:2532
    do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295
    entry_SYSCALL_64_after_hwframe+0x49/0xb3
    RIP: 0033:0x45c829
    Code: 0d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 3d 01 f0 ff ff 0f 83 db b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00
    RSP: 002b:00007f6d9528ec78 EFLAGS: 00000246 ORIG_RAX: 0000000000000133
    RAX: ffffffffffffffda RBX: 00000000004fc080 RCX: 000000000045c829
    RDX: 0000000000000001 RSI: 0000000020002640 RDI: 0000000000000004
    RBP: 000000000078bf00 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
    R13: 00000000000008d7 R14: 00000000004cb7aa R15: 00007f6d9528f6d4

    Fixes: 4b15c7075352 ("net/sched: Make etf report drops on error_queue")
    Fixes: 25db26a91364 ("net/sched: Introduce the ETF Qdisc")
    Signed-off-by: Eric Dumazet
    Reported-by: syzbot
    Cc: Vinicius Costa Gomes
    Reviewed-by: Vinicius Costa Gomes
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
     

13 Apr, 2020

2 commits

  • [ Upstream commit a8eab6d35e22f4f21471f16147be79529cd6aaf7 ]

    The initial refcnt of struct tcindex_data should be 1,
    it is clear that I forgot to set it to 1 in tcindex_init().
    This leads to a dec-after-zero warning.

    Reported-by: syzbot+8325e509a1bf83ec741d@syzkaller.appspotmail.com
    Fixes: 304e024216a8 ("net_sched: add a temporary refcnt for struct tcindex_data")
    Cc: Jamal Hadi Salim
    Cc: Jiri Pirko
    Cc: Paul E. McKenney
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Cong Wang
     
  • [ Upstream commit 304e024216a802a7dc8ba75d36de82fa136bbf3e ]

    Although we intentionally use an ordered workqueue for all tc
    filter works, the ordering is not guaranteed by RCU work,
    given that tcf_queue_work() is essentially a call_rcu().

    This problem is demonstrated by Thomas:

    CPU 0:
    tcf_queue_work()
    tcf_queue_work(&r->rwork, tcindex_destroy_rexts_work);

    -> Migration to CPU 1

    CPU 1:
    tcf_queue_work(&p->rwork, tcindex_destroy_work);

    so the 2nd work could be queued before the 1st one, which leads
    to a free-after-free.

    Enforcing this order in RCU work is hard, as it requires changing
    the RCU code too. Fortunately we can work around this problem in the
    tcindex filter by taking a temporary refcnt: we only refcnt it right
    before we begin to destroy it. This simplifies the code a lot, as a
    full refcnt requires many more changes in tcindex_set_parms().

    Reported-by: syzbot+46f513c3033d592409d2@syzkaller.appspotmail.com
    Fixes: 3d210534cc93 ("net_sched: fix a race condition in tcindex_destroy()")
    Cc: Thomas Gleixner
    Cc: Paul E. McKenney
    Cc: Jamal Hadi Salim
    Cc: Jiri Pirko
    Signed-off-by: Cong Wang
    Reviewed-by: Paul E. McKenney
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Cong Wang
     

01 Apr, 2020

6 commits

  • commit 2c64605b590edadb3fb46d1ec6badb49e940b479 upstream.

    net/netfilter/nft_fwd_netdev.c: In function ‘nft_fwd_netdev_eval’:
    net/netfilter/nft_fwd_netdev.c:32:10: error: ‘struct sk_buff’ has no member named ‘tc_redirected’
    pkt->skb->tc_redirected = 1;
    ^~
    net/netfilter/nft_fwd_netdev.c:33:10: error: ‘struct sk_buff’ has no member named ‘tc_from_ingress’
    pkt->skb->tc_from_ingress = 1;
    ^~

    To avoid a direct dependency with tc actions from netfilter, wrap the
    redirect bits around CONFIG_NET_REDIRECT and move helpers to
    include/linux/skbuff.h. Turn on this toggle from the ifb driver, the
    only existing client of these bits in the tree.

    This patch adds skb_set_redirected(), which sets the redirected bit
    on the skbuff, specifies whether the packet was redirected from
    ingress, and resets the timestamp (the timestamp reset was originally
    missing in the netfilter bugfix).

    Fixes: bcfabee1afd99484 ("netfilter: nft_fwd_netdev: allow to redirect to ifb via ingress")
    Reported-by: noreply@ellerman.id.au
    Reported-by: Geert Uytterhoeven
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Pablo Neira Ayuso
     
  • [ Upstream commit 0d1c3530e1bd38382edef72591b78e877e0edcd3 ]

    In commit 599be01ee567 ("net_sched: fix an OOB access in cls_tcindex")
    I moved cp->hash calculation before the first
    tcindex_alloc_perfect_hash(), but cp->alloc_hash is left untouched.
    This difference could lead to another out of bound access.

    cp->alloc_hash should always be the size allocated, we should
    update it after this tcindex_alloc_perfect_hash().

    Reported-and-tested-by: syzbot+dcc34d54d68ef7d2d53d@syzkaller.appspotmail.com
    Reported-and-tested-by: syzbot+c72da7b9ed57cde6fca2@syzkaller.appspotmail.com
    Fixes: 599be01ee567 ("net_sched: fix an OOB access in cls_tcindex")
    Cc: Jamal Hadi Salim
    Cc: Jiri Pirko
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Cong Wang
     
  • [ Upstream commit b1be2e8cd290f620777bfdb8aa00890cd2fa02b5 ]

    syzbot reported a use-after-free in tcindex_dump(). This is due to
    the lack of RTNL in the deferred rcu work. We queue this work with
    RTNL in tcindex_change(), later, tcindex_dump() is called:

    fh = tp->ops->get(tp, t->tcm_handle);
    ...
    err = tp->ops->change(..., &fh, ...);
    tfilter_notify(..., fh, ...);

    but there is nothing to serialize the pending
    tcindex_partial_destroy_work() with tcindex_dump().

    Fix this by simply holding RTNL in tcindex_partial_destroy_work(),
    so that it won't be called until RTNL is released after
    tc_new_tfilter() is completed.

    Reported-and-tested-by: syzbot+653090db2562495901dc@syzkaller.appspotmail.com
    Fixes: 3d210534cc93 ("net_sched: fix a race condition in tcindex_destroy()")
    Cc: Jamal Hadi Salim
    Cc: Jiri Pirko
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Cong Wang
     
  • [ Upstream commit ef299cc3fa1a9e1288665a9fdc8bff55629fd359 ]

    route4_change() allocates a new filter and copies values from
    the old one. After the new filter is inserted into the hash
    table, the old filter should be removed and freed, as the final
    step of the update.

    However, the current code mistakenly removes the new one. This
    looks apparently wrong to me, and it causes double "free" and
    use-after-free too, as reported by syzbot.

    Reported-and-tested-by: syzbot+f9b32aaacd60305d9687@syzkaller.appspotmail.com
    Reported-and-tested-by: syzbot+2f8c233f131943d6056d@syzkaller.appspotmail.com
    Reported-and-tested-by: syzbot+9c2df9fd5e9445b74e01@syzkaller.appspotmail.com
    Fixes: 1109c00547fc ("net: sched: RCU cls_route")
    Cc: Jamal Hadi Salim
    Cc: Jiri Pirko
    Cc: John Fastabend
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Cong Wang
     
  • [ Upstream commit dd2af10402684cb5840a127caec9e7cdcff6d167 ]

    Currently, on replace, the previous action instance params
    is swapped with a newly allocated params. The old params is
    only freed (via kfree_rcu), without releasing the allocated
    ct zone template related to it.

    Call tcf_ct_params_free (via call_rcu) for the old params,
    so it will release it.

    Fixes: b57dc7c13ea9 ("net/sched: Introduce action ct")
    Signed-off-by: Paul Blakey
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Paul Blakey
     
  • [ Upstream commit 961d0e5b32946703125964f9f5b6321d60f4d706 ]

    Currently the software CBS does not consider the packet sending time
    when depleting the credits. This causes the throughput to be
    Idleslope[kbps] * (Port transmit rate[kbps] / |Sendslope[kbps]|), where
    Idleslope * (Port transmit rate / (Idleslope + |Sendslope|)) = Idleslope
    is expected. To fix this, the patch takes the time at which the packet
    transmission completes into account: it moves the anchor time variable
    "last" ahead to the send completion time upon transmission, and adds a
    wait when the next dequeue request comes before the send completion
    time of the previous packet.

    changelog:
    V2->V3:
    - remove unnecessary whitespace cleanup
    - add the checks if port_rate is 0 before division

    V1->V2:
    - combine variable "send_completed" into "last"
    - add the comment for estimate of the packet sending

    Fixes: 585d763af09c ("net/sched: Introduce Credit Based Shaper (CBS) qdisc")
    Signed-off-by: Zh-yuan Ye
    Reviewed-by: Vinicius Costa Gomes
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Zh-yuan Ye
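
The two throughput expressions from this commit message can be compared with a small worked example. The helpers below only evaluate the formulas as written; the function names and the sample slopes used afterwards are made up for illustration.

```c
#include <stdint.h>

static int64_t abs64(int64_t v)
{
    return v < 0 ? -v : v;
}

/* Throughput (kbps) the commit message reports for the unfixed shaper. */
static int64_t cbs_observed_kbps(int64_t idleslope, int64_t sendslope,
                                 int64_t port_rate)
{
    int64_t s = abs64(sendslope);

    if (s == 0)
        return 0; /* avoid dividing by zero in this sketch */
    return idleslope * port_rate / s;
}

/* Throughput (kbps) the commit message says is expected. */
static int64_t cbs_expected_kbps(int64_t idleslope, int64_t sendslope,
                                 int64_t port_rate)
{
    int64_t denom = idleslope + abs64(sendslope);

    if (denom == 0)
        return 0; /* avoid dividing by zero in this sketch */
    return idleslope * port_rate / denom;
}
```

For a hypothetical 100000 kbps port with an idleslope of 20000 kbps and a sendslope of -80000 kbps, the first expression gives 25000 kbps while the second gives the configured 20000 kbps, which is the overshoot the patch addresses.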
     

18 Mar, 2020

3 commits

  • [ Upstream commit e13aaa0643da10006ec35715954e7f92a62899a5 ]

    Add missing attribute validation for TCA_TAPRIO_ATTR_TXTIME_DELAY
    to the netlink policy.

    Fixes: 4cfd5779bd6e ("taprio: Add support for txtime-assist mode")
    Signed-off-by: Jakub Kicinski
    Reviewed-by: Vinicius Costa Gomes
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Jakub Kicinski
     
  • [ Upstream commit 7e6dc03eeb023e18427a373522f1d247b916a641 ]

    Add missing attribute validation for TCA_FQ_ORPHAN_MASK
    to the netlink policy.

    Fixes: 06eb395fa985 ("pkt_sched: fq: better control of DDOS traffic")
    Signed-off-by: Jakub Kicinski
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Jakub Kicinski
     
  • [ Upstream commit b09fe70ef520e011ba4a64f4b93f948a8f14717b ]

    There was a bug that was causing packets to be sent to the driver
    without first calling dequeue() on the "child" qdisc. And the KASAN
    report below shows that sending a packet without calling dequeue()
    leads to bad results.

    The problem is that when checking the last qdisc "child" we do not set
    the returned skb to NULL, which can cause it to be sent to the driver.
    After the skb is sent, it may be freed, and in some situations a
    reference to it may still be in the child qdisc, because it was never
    dequeued.
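
    The pattern can be sketched in userspace as follows. This is a
    simplified model, not the taprio code: the child structure, its
    single-packet queue and the gate_open field are assumptions made for
    the example.

```c
#include <stddef.h>

struct pkt {
    int id;
};

/* Hypothetical child queue holding at most one packet. */
struct child {
    struct pkt *head; /* next packet, or NULL if the queue is empty */
    int gate_open;    /* nonzero if this child may transmit right now */
};

static struct pkt *child_peek(struct child *c)
{
    return c->head;
}

static struct pkt *child_dequeue(struct child *c)
{
    struct pkt *p = c->head;

    c->head = NULL;
    return p;
}

/* Walk the children and return the first packet that may be sent. */
static struct pkt *dequeue_soft(struct child *children, size_t n)
{
    struct pkt *skb = NULL;
    size_t i;

    for (i = 0; i < n; i++) {
        skb = child_peek(&children[i]);
        if (!skb)
            continue;

        if (!children[i].gate_open) {
            /* Without this reset, a packet that was only peeked at on
             * the last child would be returned while the child still
             * holds a reference to it. */
            skb = NULL;
            continue;
        }

        skb = child_dequeue(&children[i]);
        if (skb)
            return skb;
    }

    return skb;
}
```

    When the last child has a packet but its gate is closed, dequeue_soft()
    returns NULL and the packet stays owned by the child.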

    The crash log looks like this:

    [ 19.937538] ==================================================================
    [ 19.938300] BUG: KASAN: use-after-free in taprio_dequeue_soft+0x620/0x780
    [ 19.938968] Read of size 4 at addr ffff8881128628cc by task swapper/1/0
    [ 19.939612]
    [ 19.939772] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.6.0-rc3+ #97
    [ 19.940397] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qe4
    [ 19.941523] Call Trace:
    [ 19.941774]
    [ 19.941985] dump_stack+0x97/0xe0
    [ 19.942323] print_address_description.constprop.0+0x3b/0x60
    [ 19.942884] ? taprio_dequeue_soft+0x620/0x780
    [ 19.943325] ? taprio_dequeue_soft+0x620/0x780
    [ 19.943767] __kasan_report.cold+0x1a/0x32
    [ 19.944173] ? taprio_dequeue_soft+0x620/0x780
    [ 19.944612] kasan_report+0xe/0x20
    [ 19.944954] taprio_dequeue_soft+0x620/0x780
    [ 19.945380] __qdisc_run+0x164/0x18d0
    [ 19.945749] net_tx_action+0x2c4/0x730
    [ 19.946124] __do_softirq+0x268/0x7bc
    [ 19.946491] irq_exit+0x17d/0x1b0
    [ 19.946824] smp_apic_timer_interrupt+0xeb/0x380
    [ 19.947280] apic_timer_interrupt+0xf/0x20
    [ 19.947687]
    [ 19.947912] RIP: 0010:default_idle+0x2d/0x2d0
    [ 19.948345] Code: 00 00 41 56 41 55 65 44 8b 2d 3f 8d 7c 7c 41 54 55 53 0f 1f 44 00 00 e8 b1 b2 c5 fd e9 07 00 3
    [ 19.950166] RSP: 0018:ffff88811a3efda0 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff13
    [ 19.950909] RAX: 0000000080000000 RBX: ffff88811a3a9600 RCX: ffffffff8385327e
    [ 19.951608] RDX: 1ffff110234752c0 RSI: 0000000000000000 RDI: ffffffff8385262f
    [ 19.952309] RBP: ffffed10234752c0 R08: 0000000000000001 R09: ffffed10234752c1
    [ 19.953009] R10: ffffed10234752c0 R11: ffff88811a3a9607 R12: 0000000000000001
    [ 19.953709] R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000
    [ 19.954408] ? default_idle_call+0x2e/0x70
    [ 19.954816] ? default_idle+0x1f/0x2d0
    [ 19.955192] default_idle_call+0x5e/0x70
    [ 19.955584] do_idle+0x3d4/0x500
    [ 19.955909] ? arch_cpu_idle_exit+0x40/0x40
    [ 19.956325] ? _raw_spin_unlock_irqrestore+0x23/0x30
    [ 19.956829] ? trace_hardirqs_on+0x30/0x160
    [ 19.957242] cpu_startup_entry+0x19/0x20
    [ 19.957633] start_secondary+0x2a6/0x380
    [ 19.958026] ? set_cpu_sibling_map+0x18b0/0x18b0
    [ 19.958486] secondary_startup_64+0xa4/0xb0
    [ 19.958921]
    [ 19.959078] Allocated by task 33:
    [ 19.959412] save_stack+0x1b/0x80
    [ 19.959747] __kasan_kmalloc.constprop.0+0xc2/0xd0
    [ 19.960222] kmem_cache_alloc+0xe4/0x230
    [ 19.960617] __alloc_skb+0x91/0x510
    [ 19.960967] ndisc_alloc_skb+0x133/0x330
    [ 19.961358] ndisc_send_ns+0x134/0x810
    [ 19.961735] addrconf_dad_work+0xad5/0xf80
    [ 19.962144] process_one_work+0x78e/0x13a0
    [ 19.962551] worker_thread+0x8f/0xfa0
    [ 19.962919] kthread+0x2ba/0x3b0
    [ 19.963242] ret_from_fork+0x3a/0x50
    [ 19.963596]
    [ 19.963753] Freed by task 33:
    [ 19.964055] save_stack+0x1b/0x80
    [ 19.964386] __kasan_slab_free+0x12f/0x180
    [ 19.964830] kmem_cache_free+0x80/0x290
    [ 19.965231] ip6_mc_input+0x38a/0x4d0
    [ 19.965617] ipv6_rcv+0x1a4/0x1d0
    [ 19.965948] __netif_receive_skb_one_core+0xf2/0x180
    [ 19.966437] netif_receive_skb+0x8c/0x3c0
    [ 19.966846] br_handle_frame_finish+0x779/0x1310
    [ 19.967302] br_handle_frame+0x42a/0x830
    [ 19.967694] __netif_receive_skb_core+0xf0e/0x2a90
    [ 19.968167] __netif_receive_skb_one_core+0x96/0x180
    [ 19.968658] process_backlog+0x198/0x650
    [ 19.969047] net_rx_action+0x2fa/0xaa0
    [ 19.969420] __do_softirq+0x268/0x7bc
    [ 19.969785]
    [ 19.969940] The buggy address belongs to the object at ffff888112862840
    [ 19.969940] which belongs to the cache skbuff_head_cache of size 224
    [ 19.971202] The buggy address is located 140 bytes inside of
    [ 19.971202] 224-byte region [ffff888112862840, ffff888112862920)
    [ 19.972344] The buggy address belongs to the page:
    [ 19.972820] page:ffffea00044a1800 refcount:1 mapcount:0 mapping:ffff88811a2bd1c0 index:0xffff8881128625c0 compo0
    [ 19.973930] flags: 0x8000000000010200(slab|head)
    [ 19.974388] raw: 8000000000010200 ffff88811a2ed650 ffff88811a2ed650 ffff88811a2bd1c0
    [ 19.975151] raw: ffff8881128625c0 0000000000190013 00000001ffffffff 0000000000000000
    [ 19.975915] page dumped because: kasan: bad access detected
    [ 19.976461] page_owner tracks the page as allocated
    [ 19.976946] page last allocated via order 2, migratetype Unmovable, gfp_mask 0xd20c0(__GFP_IO|__GFP_FS|__GFP_NO)
    [ 19.978332] prep_new_page+0x24b/0x330
    [ 19.978707] get_page_from_freelist+0x2057/0x2c90
    [ 19.979170] __alloc_pages_nodemask+0x218/0x590
    [ 19.979619] new_slab+0x9d/0x300
    [ 19.979948] ___slab_alloc.constprop.0+0x2f9/0x6f0
    [ 19.980421] __slab_alloc.constprop.0+0x30/0x60
    [ 19.980870] kmem_cache_alloc+0x201/0x230
    [ 19.981269] __alloc_skb+0x91/0x510
    [ 19.981620] alloc_skb_with_frags+0x78/0x4a0
    [ 19.982043] sock_alloc_send_pskb+0x5eb/0x750
    [ 19.982476] unix_stream_sendmsg+0x399/0x7f0
    [ 19.982904] sock_sendmsg+0xe2/0x110
    [ 19.983262] ____sys_sendmsg+0x4de/0x6d0
    [ 19.983660] ___sys_sendmsg+0xe4/0x160
    [ 19.984032] __sys_sendmsg+0xab/0x130
    [ 19.984396] do_syscall_64+0xe7/0xae0
    [ 19.984761] page last free stack trace:
    [ 19.985142] __free_pages_ok+0x432/0xbc0
    [ 19.985533] qlist_free_all+0x56/0xc0
    [ 19.985907] quarantine_reduce+0x149/0x170
    [ 19.986315] __kasan_kmalloc.constprop.0+0x9e/0xd0
    [ 19.986791] kmem_cache_alloc+0xe4/0x230
    [ 19.987182] prepare_creds+0x24/0x440
    [ 19.987548] do_faccessat+0x80/0x590
    [ 19.987906] do_syscall_64+0xe7/0xae0
    [ 19.988276] entry_SYSCALL_64_after_hwframe+0x49/0xbe
    [ 19.988775]
    [ 19.988930] Memory state around the buggy address:
    [ 19.989402] ffff888112862780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    [ 19.990111] ffff888112862800: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
    [ 19.990822] >ffff888112862880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    [ 19.991529] ^
    [ 19.992081] ffff888112862900: fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc
    [ 19.992796] ffff888112862980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc

    Fixes: 5a781ccbd19e ("tc: Add support for configuring the taprio scheduler")
    Reported-by: Michael Schmidt
    Signed-off-by: Vinicius Costa Gomes
    Acked-by: Andre Guedes
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Vinicius Costa Gomes
     

08 Mar, 2020

1 commit

  • Merge Linux stable release v5.4.24 into imx_5.4.y

    * tag 'v5.4.24': (3306 commits)
    Linux 5.4.24
    blktrace: Protect q->blk_trace with RCU
    kvm: nVMX: VMWRITE checks unsupported field before read-only field
    ...

    Signed-off-by: Jason Liu

    Conflicts:
    arch/arm/boot/dts/imx6sll-evk.dts
    arch/arm/boot/dts/imx7ulp.dtsi
    arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi
    drivers/clk/imx/clk-composite-8m.c
    drivers/gpio/gpio-mxc.c
    drivers/irqchip/Kconfig
    drivers/mmc/host/sdhci-of-esdhc.c
    drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
    drivers/net/can/flexcan.c
    drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
    drivers/net/ethernet/mscc/ocelot.c
    drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
    drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
    drivers/net/phy/realtek.c
    drivers/pci/controller/mobiveil/pcie-mobiveil-host.c
    drivers/perf/fsl_imx8_ddr_perf.c
    drivers/tee/optee/shm_pool.c
    drivers/usb/cdns3/gadget.c
    kernel/sched/cpufreq.c
    net/core/xdp.c
    sound/soc/fsl/fsl_esai.c
    sound/soc/fsl/fsl_sai.c
    sound/soc/sof/core.c
    sound/soc/sof/imx/Kconfig
    sound/soc/sof/loader.c

    Jason Liu
     

05 Mar, 2020

1 commit

  • [ Upstream commit 8a9093c79863b58cc2f9874d7ae788f0d622a596 ]

    tc flower rules that are based on src or dst port blocking are sometimes
    ineffective due to uninitialized stack data. __skb_flow_dissect() extracts
    ports from the skb for tc flower to match against. However, the port
    dissection is not done when the FLOW_DIS_IS_FRAGMENT bit is set in
    key_control->flags. All callers of __skb_flow_dissect() zero out the
    key_control field except for fl_classify() as used by the flower
    classifier. Thus, the FLOW_DIS_IS_FRAGMENT bit may be set on entry to
    __skb_flow_dissect(), since key_control is allocated on the stack
    and may not be initialized.

    Since key_basic and key_control are present for all flow keys, let's
    make sure they are initialized.
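
    A minimal userspace sketch of the failure mode and of the fix,
    assuming simplified stand-ins for the key structures. The names
    IS_FRAGMENT, flow_key, init_keys() and dissect_ports() are
    hypothetical and only mirror the behavior described above.

```c
#include <stdint.h>
#include <string.h>

#define IS_FRAGMENT 0x1u /* hypothetical stand-in for FLOW_DIS_IS_FRAGMENT */

struct key_control {
    uint16_t thoff;
    uint16_t addr_type;
    uint32_t flags;
};

struct key_basic {
    uint16_t n_proto;
    uint8_t ip_proto;
    uint8_t padding;
};

struct flow_key {
    struct key_control control;
    struct key_basic basic;
    uint16_t src_port;
    uint16_t dst_port;
};

/* Zero the two keys that are present for every flow key. */
static void init_keys(struct key_control *control, struct key_basic *basic)
{
    memset(control, 0, sizeof(*control));
    memset(basic, 0, sizeof(*basic));
}

/* Simplified dissector: port extraction is skipped when the fragment bit
 * is already set on entry, as described in the commit message. */
static void dissect_ports(struct flow_key *key, uint16_t src, uint16_t dst)
{
    if (key->control.flags & IS_FRAGMENT)
        return;

    key->src_port = src;
    key->dst_port = dst;
}
```

    If the caller leaves control.flags holding stale stack contents, the
    ports may never be filled in; calling init_keys() first makes the
    result deterministic.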

    Fixes: 62230715fd24 ("flow_dissector: do not dissect l4 ports for fragments")
    Co-developed-by: Eric Dumazet
    Signed-off-by: Eric Dumazet
    Acked-by: Cong Wang
    Signed-off-by: Jason Baron
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Jason Baron
     

24 Feb, 2020

2 commits

  • [ Upstream commit e2debf0852c4d66ba1a8bde12869b196094c70a7 ]

    Unlike other classifiers that can be offloaded (i.e. users can set flags
    like 'skip_hw' and 'skip_sw'), 'cls_flower' doesn't validate the size of
    netlink attribute 'TCA_FLOWER_FLAGS' provided by user: add a proper entry
    to fl_policy.

    Fixes: 5b33f48842fa ("net/flower: Introduce hardware offload support")
    Signed-off-by: Davide Caratti
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Davide Caratti
     
  • [ Upstream commit 1afa3cc90f8fb745c777884d79eaa1001d6927a6 ]

    Unlike other classifiers that can be offloaded (i.e. users can set flags
    like 'skip_hw' and 'skip_sw'), 'cls_matchall' doesn't validate the size
    of netlink attribute 'TCA_MATCHALL_FLAGS' provided by user: add a proper
    entry to mall_policy.

    Fixes: b87f7936a932 ("net/sched: Add match-all classifier hw offloading.")
    Signed-off-by: Davide Caratti
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Davide Caratti
     

11 Feb, 2020

6 commits

  • [ Upstream commit bfabd41da34180d05382312533a3adc2e012dee0 ]

    When using taprio offloading together with ETF offloading, configured
    like this, for example:

    $ tc qdisc replace dev $IFACE parent root handle 100 taprio \
    num_tc 4 \
    map 2 2 1 0 3 2 2 2 2 2 2 2 2 2 2 2 \
    queues 1@0 1@1 1@2 1@3 \
    base-time $BASE_TIME \
    sched-entry S 01 1000000 \
    sched-entry S 0e 1000000 \
    flags 0x2

    $ tc qdisc replace dev $IFACE parent 100:1 etf \
    offload delta 300000 clockid CLOCK_TAI

    During enqueue, it turns out that the verification added for the
    "txtime" assisted mode also runs when using taprio + ETF offloading;
    the only thing missing is initializing the 'next_txtime' of all the
    cycle entries. (If we don't set 'next_txtime', all packets from
    SO_TXTIME sockets are dropped.)

    Fixes: 4cfd5779bd6e ("taprio: Add support for txtime-assist mode")
    Signed-off-by: Vinicius Costa Gomes
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Vinicius Costa Gomes
     
  • [ Upstream commit 7c16680a08ee1e444a67d232c679ccf5b30fad16 ]

    When destroying the current taprio instance, which can happen when the
    creation of one fails, we should reset the traffic class configuration
    back to the default state.

    netdev_reset_tc() is a better way because in addition to setting the
    number of traffic classes to zero, it also resets the priority to
    traffic classes mapping to the default value.
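
    The difference between the two ways of resetting can be sketched as
    below. The dev_tc structure and both helpers are hypothetical
    userspace stand-ins; only the idea that the priority map is cleared
    along with the class count comes from the text above.

```c
#include <stdint.h>
#include <string.h>

#define MAX_PRIO 16

/* Hypothetical subset of the per-device traffic class state. */
struct dev_tc {
    int num_tc;
    uint8_t prio_tc_map[MAX_PRIO];
};

/* Only clears the number of traffic classes; the map stays stale. */
static void reset_num_tc_only(struct dev_tc *dev)
{
    dev->num_tc = 0;
}

/* Clears the number of traffic classes and the priority mapping. */
static void reset_tc(struct dev_tc *dev)
{
    dev->num_tc = 0;
    memset(dev->prio_tc_map, 0, sizeof(dev->prio_tc_map));
}
```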

    Fixes: 5a781ccbd19e ("tc: Add support for configuring the taprio scheduler")
    Signed-off-by: Vinicius Costa Gomes
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Vinicius Costa Gomes
     
  • [ Upstream commit 49c684d79cfdc3032344bf6f3deeea81c4efedbf ]

    Netlink policy validation for the 'flags' argument was missing.

    Fixes: 4cfd5779bd6e ("taprio: Add support for txtime-assist mode")
    Signed-off-by: Vinicius Costa Gomes
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Vinicius Costa Gomes
     
  • [ Upstream commit a9d6227436f32142209f4428f2dc616761485112 ]

    Because 'q->flags' starts as zero, and zero is a valid value, we
    aren't able to detect the transition from zero to something else
    during "runtime".

    The solution is to initialize 'q->flags' with an invalid value, so we
    can detect if 'q->flags' was set by the user or not.

    To better solidify the behavior, 'flags' handling is moved to a
    separate function. The behavior is:
    - 'flags', if unspecified by the user, is assumed to be zero;
    - 'flags' cannot change during "runtime" (i.e. a change() request
    cannot modify it).
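
    The two rules above can be sketched in userspace as follows. The
    sentinel FLAGS_INVALID, the mask FLAGS_KNOWN_MASK and the helper
    new_flags() are hypothetical names chosen for the example; the
    validity check is simplified to a bit mask.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define FLAGS_INVALID UINT32_MAX /* sentinel: flags were never set */
#define FLAGS_KNOWN_MASK 0x3u    /* hypothetical set of valid flag bits */

/* Returns the flags value to store (>= 0) or a negative errno.
 * attr is NULL when the user did not specify flags. */
static int64_t new_flags(const uint32_t *attr, uint32_t old)
{
    uint32_t requested = 0; /* unspecified flags are assumed to be zero */

    if (attr)
        requested = *attr;

    /* Once set, flags cannot be changed by a later request. */
    if (old != FLAGS_INVALID && old != requested)
        return -EOPNOTSUPP;

    if (requested & ~FLAGS_KNOWN_MASK)
        return -EINVAL;

    return requested;
}
```

    Starting from FLAGS_INVALID lets the first request store zero as a
    real value, so a later request that asks for something else can be
    told apart from the initial configuration.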

    With this new function we can remove taprio_flags, which should reduce
    the risk of future accidents.

    Allowing flags to be changed was causing the following RCU stall:

    [ 1730.558249] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
    [ 1730.558258] rcu: 6-...0: (190 ticks this GP) idle=922/0/0x1 softirq=25580/25582 fqs=16250
    [ 1730.558264] (detected by 2, t=65002 jiffies, g=33017, q=81)
    [ 1730.558269] Sending NMI from CPU 2 to CPUs 6:
    [ 1730.559277] NMI backtrace for cpu 6
    [ 1730.559277] CPU: 6 PID: 0 Comm: swapper/6 Tainted: G E 5.5.0-rc6+ #35
    [ 1730.559278] Hardware name: Gigabyte Technology Co., Ltd. Z390 AORUS ULTRA/Z390 AORUS ULTRA-CF, BIOS F7 03/14/2019
    [ 1730.559278] RIP: 0010:__hrtimer_run_queues+0xe2/0x440
    [ 1730.559278] Code: 48 8b 43 28 4c 89 ff 48 8b 75 c0 48 89 45 c8 e8 f4 bb 7c 00 0f 1f 44 00 00 65 8b 05 40 31 f0 68 89 c0 48 0f a3 05 3e 5c 25 01 82 fc 01 00 00 48 8b 45 c8 48 89 df ff d0 89 45 c8 0f 1f 44 00
    [ 1730.559279] RSP: 0018:ffff9970802d8f10 EFLAGS: 00000083
    [ 1730.559279] RAX: 0000000000000006 RBX: ffff8b31645bff38 RCX: 0000000000000000
    [ 1730.559280] RDX: 0000000000000000 RSI: ffffffff9710f2ec RDI: ffffffff978daf0e
    [ 1730.559280] RBP: ffff9970802d8f68 R08: 0000000000000000 R09: 0000000000000000
    [ 1730.559280] R10: 0000018336d7944e R11: 0000000000000001 R12: ffff8b316e39f9c0
    [ 1730.559281] R13: ffff8b316e39f940 R14: ffff8b316e39f998 R15: ffff8b316e39f7c0
    [ 1730.559281] FS: 0000000000000000(0000) GS:ffff8b316e380000(0000) knlGS:0000000000000000
    [ 1730.559281] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 1730.559281] CR2: 00007f1105303760 CR3: 0000000227210005 CR4: 00000000003606e0
    [ 1730.559282] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [ 1730.559282] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [ 1730.559282] Call Trace:
    [ 1730.559282]
    [ 1730.559283] ? taprio_dequeue_soft+0x2d0/0x2d0 [sch_taprio]
    [ 1730.559283] hrtimer_interrupt+0x104/0x220
    [ 1730.559283] ? irqtime_account_irq+0x34/0xa0
    [ 1730.559283] smp_apic_timer_interrupt+0x6d/0x230
    [ 1730.559284] apic_timer_interrupt+0xf/0x20
    [ 1730.559284]
    [ 1730.559284] RIP: 0010:cpu_idle_poll+0x35/0x1a0
    [ 1730.559285] Code: 88 82 ff 65 44 8b 25 12 7d 73 68 0f 1f 44 00 00 e8 90 c3 89 ff fb 65 48 8b 1c 25 c0 7e 01 00 48 8b 03 a8 08 74 0b eb 1c f3 90 8b 03 a8 08 75 13 8b 05 be a8 a8 00 85 c0 75 ed e8 75 48 84 ff
    [ 1730.559285] RSP: 0018:ffff997080137ea8 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13
    [ 1730.559285] RAX: 0000000000000001 RBX: ffff8b316bc3c580 RCX: 0000000000000000
    [ 1730.559286] RDX: 0000000000000001 RSI: 000000002819aad9 RDI: ffffffff978da730
    [ 1730.559286] RBP: ffff997080137ec0 R08: 0000018324a6d387 R09: 0000000000000000
    [ 1730.559286] R10: 0000000000000400 R11: 0000000000000001 R12: 0000000000000006
    [ 1730.559286] R13: ffff8b316bc3c580 R14: 0000000000000000 R15: 0000000000000000
    [ 1730.559287] ? cpu_idle_poll+0x20/0x1a0
    [ 1730.559287] ? cpu_idle_poll+0x20/0x1a0
    [ 1730.559287] do_idle+0x4d/0x1f0
    [ 1730.559287] ? complete+0x44/0x50
    [ 1730.559288] cpu_startup_entry+0x1b/0x20
    [ 1730.559288] start_secondary+0x142/0x180
    [ 1730.559288] secondary_startup_64+0xb6/0xc0
    [ 1776.686313] nvme nvme0: I/O 96 QID 1 timeout, completion polled

    Fixes: 4cfd5779bd6e ("taprio: Add support for txtime-assist mode")
    Signed-off-by: Vinicius Costa Gomes
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Vinicius Costa Gomes
     
  • [ Upstream commit 5652e63df3303c2a702bac25fbf710b9cb64dfba ]

    If the driver implementing taprio offloading depends on the value of
    the network device number of traffic classes (dev->num_tc) for
    whatever reason, it was going to receive the value zero. The value was
    only set after the offloading function was called.

    So, setting the number of traffic classes before the offloading
    function is called fixes this issue. This is safe because
    this only happens when taprio is instantiated (we don't allow this
    configuration to be changed without first removing taprio).

    Fixes: 9c66d1564676 ("taprio: Add support for hardware offloading")
    Reported-by: Po Liu
    Signed-off-by: Vinicius Costa Gomes
    Acked-by: Vladimir Oltean
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Vinicius Costa Gomes
     
  • [ Upstream commit 52b5ae501c045010aeeb1d5ac0373ff161a88291 ]

    Jakub noticed there is a potential resource leak in
    tcindex_set_parms(): when tcindex_filter_result_init() fails,
    it jumps to 'errout1', which doesn't release the memory
    and resources allocated by tcindex_alloc_perfect_hash().

    We should just jump to 'errout_alloc' which calls
    tcindex_free_perfect_hash().
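
    The error-path ordering can be illustrated with a small userspace
    sketch. The function set_parms(), the allocation counter and the
    init_fails parameter are hypothetical; only the idea of jumping to
    the label that frees the allocation is taken from the text above.

```c
#include <stddef.h>
#include <stdlib.h>

static int live_allocs; /* outstanding allocations, to make a leak visible */

static int *alloc_table(size_t n)
{
    int *table = calloc(n, sizeof(*table));

    if (table)
        live_allocs++;
    return table;
}

static void free_table(int *table)
{
    if (table) {
        free(table);
        live_allocs--;
    }
}

/* Returns 0 on success, -1 on failure. init_fails simulates a failing
 * initialization step that runs after the table was allocated. */
static int set_parms(size_t n, int init_fails)
{
    int err = -1;
    int *table = alloc_table(n);

    if (!table)
        goto errout;

    if (init_fails)
        goto errout_alloc; /* jumping to errout here would leak table */

    /* Real code would publish the table; the sketch just releases it. */
    free_table(table);
    return 0;

errout_alloc:
    free_table(table);
errout:
    return err;
}
```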

    Fixes: b9a24bb76bf6 ("net_sched: properly handle failure case of tcf_exts_init()")
    Reported-by: Jakub Kicinski
    Cc: Jamal Hadi Salim
    Cc: Jiri Pirko
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Cong Wang