11 Nov, 2017

2 commits

  • The current code does not return after successfully preparing the VLAN
    addition on every port that is a member of it. Fix this.

    Fixes: 1ca4aa9cd4cc ("net: dsa: check VLAN capability of every switch")
    Signed-off-by: Vivien Didelot
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Vivien Didelot
     
  • The current code does not return after successfully preparing the MDB
    addition on every port that is a member of a multicast group. Fix this.

    Fixes: a1a6b7ea7f2d ("net: dsa: add cross-chip multicast support")
    Reported-by: Egil Hjelmeland
    Signed-off-by: Vivien Didelot
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Vivien Didelot
     

10 Nov, 2017

6 commits

  • This patch fixes the cause of a WARNING indicating that TCP has a pending
    retransmission in Open state in tcp_fastretrans_alert().

    The root cause is a bad interaction between path mtu probing,
    if enabled, and the RACK loss detection. Upon receiving a SACK
    above the sequence of the MTU probing packet, RACK could mark the
    probe packet lost in tcp_fastretrans_alert(), prior to calling
    tcp_simple_retransmit().

    tcp_simple_retransmit() only enters Loss state if it newly marks
    the probe packet lost. If the probe packet is already identified as
    lost by RACK, the sender remains in Open state with some packets
    marked lost and retransmitted. Then the next SACK would trigger
    the warning. The likely scenario is that the probe packet was
    lost due to its size or network congestion. The actual impact of
    this warning is small: the sender potentially enters fast recovery
    one ACK later.

    The simple fix is always entering recovery (Loss) state if some
    packet is marked lost during path MTU probing.

    Fixes: a0370b3f3f2c ("tcp: enable RACK loss detection to trigger recovery")
    Reported-by: Oleksandr Natalenko
    Reported-by: Alexei Starovoitov
    Reported-by: Roman Gushchin
    Signed-off-by: Yuchung Cheng
    Reviewed-by: Eric Dumazet
    Acked-by: Neal Cardwell
    Signed-off-by: David S. Miller

    Yuchung Cheng
     
  • When a GSO skb of truesize O is segmented into 2 new skbs of truesize N1
    and N2, we want to transfer socket ownership to the new fresh skbs.

    In order to avoid expensive atomic operations on a cache line subject to
    cache bouncing, we replace the sequence :

    refcount_add(N1, &sk->sk_wmem_alloc);
    refcount_add(N2, &sk->sk_wmem_alloc); // repeated by number of segments

    refcount_sub(O, &sk->sk_wmem_alloc);

    by a single

    refcount_add(sum_of(N) - O, &sk->sk_wmem_alloc);

    Problem is :

    In some pathological cases, sum(N) - O might be a negative number, and
    syzkaller bot was apparently able to trigger this trace [1]

    atomic_t was ok with this construct, but we need to take care of the
    negative delta with refcount_t
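
    A minimal user-space sketch of applying the signed delta with a single
    update. The type counter_t and its helpers are hypothetical stand-ins
    for refcount_t, written only to illustrate the idea; they are not the
    kernel API and not the code of the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for refcount_t: an unsigned counter whose add
 * helper saturates instead of wrapping around. */
typedef struct { unsigned int refs; } counter_t;

static void counter_add(unsigned int n, counter_t *c)
{
	/* Saturate at UINT_MAX on overflow (the "leaking memory" case). */
	if (c->refs > UINT_MAX - n)
		c->refs = UINT_MAX;
	else
		c->refs += n;
}

/* Subtract n and report whether the counter reached zero. */
static bool counter_sub_and_test(unsigned int n, counter_t *c)
{
	assert(n <= c->refs);	/* the sketch does not model underflow */
	c->refs -= n;
	return c->refs == 0;
}

/* Apply delta = sum_of(N) - O in one update, picking add or sub by sign. */
static void counter_apply_delta(int delta, counter_t *c)
{
	if (delta >= 0)
		counter_add((unsigned int)delta, c);
	else
		(void)counter_sub_and_test(0u - (unsigned int)delta, c);
}
```

    Passing a negative delta straight to counter_add() converts it to a
    huge unsigned value and saturates the counter, which mirrors the
    warning in the trace; counter_apply_delta() avoids that conversion.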

    [1]
    refcount_t: saturated; leaking memory.
    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 8404 at lib/refcount.c:77 refcount_add_not_zero+0x198/0x200 lib/refcount.c:77
    Kernel panic - not syncing: panic_on_warn set ...

    CPU: 0 PID: 8404 Comm: syz-executor2 Not tainted 4.14.0-rc5-mm1+ #20
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:
    __dump_stack lib/dump_stack.c:16 [inline]
    dump_stack+0x194/0x257 lib/dump_stack.c:52
    panic+0x1e4/0x41c kernel/panic.c:183
    __warn+0x1c4/0x1e0 kernel/panic.c:546
    report_bug+0x211/0x2d0 lib/bug.c:183
    fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:177
    do_trap_no_signal arch/x86/kernel/traps.c:211 [inline]
    do_trap+0x260/0x390 arch/x86/kernel/traps.c:260
    do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:297
    do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:310
    invalid_op+0x18/0x20 arch/x86/entry/entry_64.S:905
    RIP: 0010:refcount_add_not_zero+0x198/0x200 lib/refcount.c:77
    RSP: 0018:ffff8801c606e3a0 EFLAGS: 00010282
    RAX: 0000000000000026 RBX: 0000000000001401 RCX: 0000000000000000
    RDX: 0000000000000026 RSI: ffffc900036fc000 RDI: ffffed0038c0dc68
    RBP: ffff8801c606e430 R08: 0000000000000001 R09: 0000000000000000
    R10: ffff8801d97f5eba R11: 0000000000000000 R12: ffff8801d5acf73c
    R13: 1ffff10038c0dc75 R14: 00000000ffffffff R15: 00000000fffff72f
    refcount_add+0x1b/0x60 lib/refcount.c:101
    tcp_gso_segment+0x10d0/0x16b0 net/ipv4/tcp_offload.c:155
    tcp4_gso_segment+0xd4/0x310 net/ipv4/tcp_offload.c:51
    inet_gso_segment+0x60c/0x11c0 net/ipv4/af_inet.c:1271
    skb_mac_gso_segment+0x33f/0x660 net/core/dev.c:2749
    __skb_gso_segment+0x35f/0x7f0 net/core/dev.c:2821
    skb_gso_segment include/linux/netdevice.h:3971 [inline]
    validate_xmit_skb+0x4ba/0xb20 net/core/dev.c:3074
    __dev_queue_xmit+0xe49/0x2070 net/core/dev.c:3497
    dev_queue_xmit+0x17/0x20 net/core/dev.c:3538
    neigh_hh_output include/net/neighbour.h:471 [inline]
    neigh_output include/net/neighbour.h:479 [inline]
    ip_finish_output2+0xece/0x1460 net/ipv4/ip_output.c:229
    ip_finish_output+0x85e/0xd10 net/ipv4/ip_output.c:317
    NF_HOOK_COND include/linux/netfilter.h:238 [inline]
    ip_output+0x1cc/0x860 net/ipv4/ip_output.c:405
    dst_output include/net/dst.h:459 [inline]
    ip_local_out+0x95/0x160 net/ipv4/ip_output.c:124
    ip_queue_xmit+0x8c6/0x18e0 net/ipv4/ip_output.c:504
    tcp_transmit_skb+0x1ab7/0x3840 net/ipv4/tcp_output.c:1137
    tcp_write_xmit+0x663/0x4de0 net/ipv4/tcp_output.c:2341
    __tcp_push_pending_frames+0xa0/0x250 net/ipv4/tcp_output.c:2513
    tcp_push_pending_frames include/net/tcp.h:1722 [inline]
    tcp_data_snd_check net/ipv4/tcp_input.c:5050 [inline]
    tcp_rcv_established+0x8c7/0x18a0 net/ipv4/tcp_input.c:5497
    tcp_v4_do_rcv+0x2ab/0x7d0 net/ipv4/tcp_ipv4.c:1460
    sk_backlog_rcv include/net/sock.h:909 [inline]
    __release_sock+0x124/0x360 net/core/sock.c:2264
    release_sock+0xa4/0x2a0 net/core/sock.c:2776
    tcp_sendmsg+0x3a/0x50 net/ipv4/tcp.c:1462
    inet_sendmsg+0x11f/0x5e0 net/ipv4/af_inet.c:763
    sock_sendmsg_nosec net/socket.c:632 [inline]
    sock_sendmsg+0xca/0x110 net/socket.c:642
    ___sys_sendmsg+0x31c/0x890 net/socket.c:2048
    __sys_sendmmsg+0x1e6/0x5f0 net/socket.c:2138

    Fixes: 14afee4b6092 ("net: convert sock.sk_wmem_alloc from atomic_t to refcount_t")
    Signed-off-by: Eric Dumazet
    Reported-by: syzbot
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • rds_ib_recv_refill() is a function that refills an IB receive
    queue. It can be called from both the CQE handler (tasklet) and a
    worker thread.

    Just after the call to ib_post_recv(), a debug message is printed with
    rdsdebug():

    ret = ib_post_recv(ic->i_cm_id->qp, &recv->r_wr, &failed_wr);
    rdsdebug("recv %p ibinc %p page %p addr %lu ret %d\n", recv,
    recv->r_ibinc, sg_page(&recv->r_frag->f_sg),
    (long) ib_sg_dma_address(
    ic->i_cm_id->device,
    &recv->r_frag->f_sg),
    ret);

    Now consider an invocation of rds_ib_recv_refill() from the worker
    thread, which is preemptible. Further, assume that the worker thread
    is preempted between the ib_post_recv() and rdsdebug() statements.

    Then, if the preemption is due to a receive CQE event, the
    rds_ib_recv_cqe_handler() will be invoked. This function processes
    receive completions, including freeing up data structures, such as the
    recv->r_frag.

    In this scenario, rds_ib_recv_cqe_handler() will process the receive
    WR posted above. That implies that the recv->r_frag has been freed
    before the above rdsdebug() statement has been executed. When it is
    later executed, we will have a NULL pointer dereference:

    [ 4088.068008] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
    [ 4088.076754] IP: rds_ib_recv_refill+0x87/0x620 [rds_rdma]
    [ 4088.082686] PGD 0 P4D 0
    [ 4088.085515] Oops: 0000 [#1] SMP
    [ 4088.089015] Modules linked in: rds_rdma(OE) rds(OE) rpcsec_gss_krb5(E) nfsv4(E) dns_resolver(E) nfs(E) fscache(E) mlx4_ib(E) ib_ipoib(E) rdma_ucm(E) ib_ucm(E) ib_uverbs(E) ib_umad(E) rdma_cm(E) ib_cm(E) iw_cm(E) ib_core(E) binfmt_misc(E) sb_edac(E) intel_powerclamp(E) coretemp(E) kvm_intel(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) pcbc(E) aesni_intel(E) crypto_simd(E) iTCO_wdt(E) glue_helper(E) iTCO_vendor_support(E) sg(E) cryptd(E) pcspkr(E) ipmi_si(E) ipmi_devintf(E) ipmi_msghandler(E) shpchp(E) ioatdma(E) i2c_i801(E) wmi(E) lpc_ich(E) mei_me(E) mei(E) mfd_core(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) sunrpc(E) ip_tables(E) ext4(E) mbcache(E) jbd2(E) fscrypto(E) mgag200(E) i2c_algo_bit(E) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) sysimgblt(E)
    [ 4088.168486] fb_sys_fops(E) ahci(E) ixgbe(E) libahci(E) ttm(E) mdio(E) ptp(E) pps_core(E) drm(E) sd_mod(E) libata(E) crc32c_intel(E) mlx4_core(E) i2c_core(E) dca(E) megaraid_sas(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E) [last unloaded: rds]
    [ 4088.193442] CPU: 20 PID: 1244 Comm: kworker/20:2 Tainted: G OE 4.14.0-rc7.master.20171105.ol7.x86_64 #1
    [ 4088.205097] Hardware name: Oracle Corporation ORACLE SERVER X5-2L/ASM,MOBO TRAY,2U, BIOS 31110000 03/03/2017
    [ 4088.216074] Workqueue: ib_cm cm_work_handler [ib_cm]
    [ 4088.221614] task: ffff885fa11d0000 task.stack: ffffc9000e598000
    [ 4088.228224] RIP: 0010:rds_ib_recv_refill+0x87/0x620 [rds_rdma]
    [ 4088.234736] RSP: 0018:ffffc9000e59bb68 EFLAGS: 00010286
    [ 4088.240568] RAX: 0000000000000000 RBX: ffffc9002115d050 RCX: ffffc9002115d050
    [ 4088.248535] RDX: ffffffffa0521380 RSI: ffffffffa0522158 RDI: ffffffffa0525580
    [ 4088.256498] RBP: ffffc9000e59bbf8 R08: 0000000000000005 R09: 0000000000000000
    [ 4088.264465] R10: 0000000000000339 R11: 0000000000000001 R12: 0000000000000000
    [ 4088.272433] R13: ffff885f8c9d8000 R14: ffffffff81a0a060 R15: ffff884676268000
    [ 4088.280397] FS: 0000000000000000(0000) GS:ffff885fbec80000(0000) knlGS:0000000000000000
    [ 4088.289434] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 4088.295846] CR2: 0000000000000020 CR3: 0000000001e09005 CR4: 00000000001606e0
    [ 4088.303816] Call Trace:
    [ 4088.306557] rds_ib_cm_connect_complete+0xe0/0x220 [rds_rdma]
    [ 4088.312982] ? __dynamic_pr_debug+0x8c/0xb0
    [ 4088.317664] ? __queue_work+0x142/0x3c0
    [ 4088.321944] rds_rdma_cm_event_handler+0x19e/0x250 [rds_rdma]
    [ 4088.328370] cma_ib_handler+0xcd/0x280 [rdma_cm]
    [ 4088.333522] cm_process_work+0x25/0x120 [ib_cm]
    [ 4088.338580] cm_work_handler+0xd6b/0x17aa [ib_cm]
    [ 4088.343832] process_one_work+0x149/0x360
    [ 4088.348307] worker_thread+0x4d/0x3e0
    [ 4088.352397] kthread+0x109/0x140
    [ 4088.355996] ? rescuer_thread+0x380/0x380
    [ 4088.360467] ? kthread_park+0x60/0x60
    [ 4088.364563] ret_from_fork+0x25/0x30
    [ 4088.368548] Code: 48 89 45 90 48 89 45 98 eb 4d 0f 1f 44 00 00 48 8b 43 08 48 89 d9 48 c7 c2 80 13 52 a0 48 c7 c6 58 21 52 a0 48 c7 c7 80 55 52 a0 8b 48 20 44 89 64 24 08 48 8b 40 30 49 83 e1 fc 48 89 04 24
    [ 4088.389612] RIP: rds_ib_recv_refill+0x87/0x620 [rds_rdma] RSP: ffffc9000e59bb68
    [ 4088.397772] CR2: 0000000000000020
    [ 4088.401505] ---[ end trace fe922e6ccf004431 ]---

    This bug was provoked by compiling rds out-of-tree with
    EXTRA_CFLAGS="-DRDS_DEBUG -DDEBUG" and inserting an artificial delay
    between the ib_post_recv() and rdsdebug() statements:

    /* XXX when can this fail? */
    ret = ib_post_recv(ic->i_cm_id->qp, &recv->r_wr, &failed_wr);
    + if (can_wait)
    + usleep_range(1000, 5000);
    rdsdebug("recv %p ibinc %p page %p addr %lu ret %d\n", recv,
    recv->r_ibinc, sg_page(&recv->r_frag->f_sg),
    (long) ib_sg_dma_address(

    The fix is simply to move the rdsdebug() statement up before the
    ib_post_recv() and remove the printing of ret, which is taken care of
    anyway by the non-debug code.
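
    A single-threaded sketch of the ordering rule behind this fix: read
    whatever the debug output needs before posting, because the completion
    handler may free the fragment at any time afterwards. All names here
    (recv_slot, post_recv, complete_recv, refill_one) are hypothetical and
    only simulate the worst-case timing; this is not the RDS code.

```c
#include <assert.h>
#include <stdlib.h>

struct frag { unsigned long addr; };
struct recv_slot { struct frag *frag; };

/* Simulated completion handler: releases the fragment of a finished
 * receive, as a CQE handler would. */
static void complete_recv(struct recv_slot *r)
{
	free(r->frag);
	r->frag = NULL;
}

/* Simulated post: models the worst case, where the completion runs
 * before the caller executes its next statement. */
static int post_recv(struct recv_slot *r)
{
	complete_recv(r);
	return 0;
}

static int refill_one(struct recv_slot *r, unsigned long *logged_addr)
{
	/* Capture the debug data while this code still owns the slot. */
	*logged_addr = r->frag->addr;

	/* From here on r->frag may already be gone; do not touch it. */
	return post_recv(r);
}
```

    Swapping the two statements in refill_one() would dereference a NULL
    r->frag in this simulation, which is the failure shown in the oops.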

    Signed-off-by: Håkon Bugge
    Reviewed-by: Knut Omang
    Reviewed-by: Wei Lin Guay
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Håkon Bugge
     
  • Pull final power management fixes from Rafael Wysocki:
    "These fix a regression in the schedutil cpufreq governor introduced by
    a recent change and blacklist Dell XPS13 9360 from using the Low Power
    S0 Idle _DSM interface which triggers serious problems on one of these
    machines.

    Specifics:

    - Prevent the schedutil cpufreq governor from using the utilization
    of a wrong CPU in some cases which started to happen after one of
    the recent changes in it (Chris Redpath).

    - Blacklist Dell XPS13 9360 from using the Low Power S0 Idle _DSM
    interface as that causes serious issue (related to NVMe) to appear
    on one of these machines, even though the other Dells XPS13 9360 in
    somewhat different HW configurations behave correctly (Rafael
    Wysocki)"

    * tag 'pm-final-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
    ACPI / PM: Blacklist Low Power S0 Idle _DSM for Dell XPS13 9360
    cpufreq: schedutil: Examine the correct CPU when we update util

    Linus Torvalds
     
  • Pull sound fixes from Takashi Iwai:
    "The amount of the changes isn't as quite small as wished, nevertheless
    they are straight fixes that deserve merging to 4.14 final.

    Most of fixes are about ALSA core bugs spotted by fuzzer: a follow-up
    fix for the previous nested rwsem patch, a fix to avoid the resource
    hogs due to too many concurrent ALSA timer invocations, and a fix for
    a crash with SYSEX MIDI transfer over OSS sequencer emulation that is
    used by none but fuzzer.

    The rest are usual HD-audio and USB-audio device-specific quirks,
    which are safe to apply"

    * tag 'sound-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
    ALSA: hda - fix headset mic problem for Dell machines with alc274
    ALSA: seq: Fix OSS sysex delivery in OSS emulation
    ALSA: seq: Avoid invalid lockdep class warning
    ALSA: timer: Limit max instances per timer
    ALSA: usb-audio: support new Amanero Combo384 firmware version

    Linus Torvalds
     
  • Pull networking fixes from David Miller:

    1) Fix use-after-free in IPSEC input parsing, destination address
    pointer was loaded before pskb_may_pull() which can change the SKB
    data pointers. From Florian Westphal.

    2) Stack out-of-bounds read in xfrm_state_find(), from Steffen
    Klassert.

    3) IPVS state of SKB is not properly reset when moving between
    namespaces, from Ye Yin.

    4) Fix crash in asix driver suspend and resume, from Andrey Konovalov.

    5) Don't deliver ipv6 l2tp tunnel packets to ipv4 l2tp tunnels, and
    vice versa, from Guillaume Nault.

    6) Fix DSACK undo on non-dup ACKs, from Priyaranjan Jha.

    7) Fix regression in bond_xmit_hash()'s behavior after the TCP port
    selection changes back in 4.2, from Hangbin Liu.

    8) Two divide by zero bugs in USB networking drivers when parsing
    descriptors, from Bjorn Mork.

    9) Fix bonding slaves being stuck in BOND_LINK_FAIL state, from Jay
    Vosburgh.

    10) Missing skb_reset_mac_header() in qmi_wwan, from Kristian Evensen.

    11) Fix the destruction of tc action object races properly, from Cong
    Wang.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (31 commits)
    cls_u32: use tcf_exts_get_net() before call_rcu()
    cls_tcindex: use tcf_exts_get_net() before call_rcu()
    cls_rsvp: use tcf_exts_get_net() before call_rcu()
    cls_route: use tcf_exts_get_net() before call_rcu()
    cls_matchall: use tcf_exts_get_net() before call_rcu()
    cls_fw: use tcf_exts_get_net() before call_rcu()
    cls_flower: use tcf_exts_get_net() before call_rcu()
    cls_flow: use tcf_exts_get_net() before call_rcu()
    cls_cgroup: use tcf_exts_get_net() before call_rcu()
    cls_bpf: use tcf_exts_get_net() before call_rcu()
    cls_basic: use tcf_exts_get_net() before call_rcu()
    net_sched: introduce tcf_exts_get_net() and tcf_exts_put_net()
    Revert "net_sched: hold netns refcnt for each action"
    net: usb: asix: fill null-ptr-deref in asix_suspend
    Revert "net: usb: asix: fill null-ptr-deref in asix_suspend"
    qmi_wwan: Add missing skb_reset_mac_header-call
    bonding: fix slave stuck in BOND_LINK_FAIL state
    qrtr: Move to postcore_initcall
    net: qmi_wwan: fix divide by 0 on bad descriptors
    net: cdc_ether: fix divide by 0 on bad descriptors
    ...

    Linus Torvalds
     

09 Nov, 2017

22 commits

  • Kailang of Realtek confirmed that pin 0x19 is for the Headset Mic and
    pin 0x1a is for the Headphone Mic, and suggested applying
    ALC269_FIXUP_DELL1_MIC_NO_PRESENCE to fix this problem. We verified
    that applying this fixup fixes the problem.

    Cc:
    Cc: Kailang Yang
    Signed-off-by: Hui Wang
    Signed-off-by: Takashi Iwai

    Hui Wang
     
  • Steffen Klassert says:

    ====================
    pull request (net): ipsec 2017-11-09

    1) Fix a use after free due to a reallocated skb head.
    From Florian Westphal.

    2) Fix sporadic lookup failures on labeled IPSEC.
    From Florian Westphal.

    3) Fix a stack out of bounds when a socket policy is applied
    to an IPv6 socket that sends IPv4 packets.

    Please pull or let me know if there are problems.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Cong Wang says:

    ====================
    net_sched: close the race between call_rcu() and cleanup_net()

    This patchset tries to fix the race between call_rcu() and
    cleanup_net() again. Without holding the netns refcnt the
    tc_action_net_exit() in netns workqueue could be called before
    filter destroy works in tc filter workqueue. This patchset
    moves the netns refcnt from tc actions to tcf_exts, without
    breaking per-netns tc actions.

    Patch 1 reverts the previous fix, patch 2 introduces two new
    APIs to help address the bug, and the remaining patches switch
    to the new APIs. Please see each patch for details.

    I was previously not able to reproduce this bug, but after adding
    some delay in the filter destroy work I managed to trigger the
    crash. With this patchset applied, the crash is no longer
    reproducible and the debugging printk's show that the order is as
    expected too.
    ====================

    Fixes: ddf97ccdd7cb ("net_sched: add network namespace support for tc actions")
    Reported-by: Lucas Bates
    Cc: Lucas Bates
    Cc: Jamal Hadi Salim
    Cc: Jiri Pirko
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    David S. Miller
     
  • Hold netns refcnt before call_rcu() and release it after
    the tcf_exts_destroy() is done.

    Note, on ->destroy() path we have to respect the return value
    of tcf_exts_get_net(), on other paths it should always return
    true, so we don't need to care.

    Cc: Lucas Bates
    Cc: Jamal Hadi Salim
    Cc: Jiri Pirko
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Cong Wang
     
  • Hold netns refcnt before call_rcu() and release it after
    the tcf_exts_destroy() is done.

    Note, on ->destroy() path we have to respect the return value
    of tcf_exts_get_net(), on other paths it should always return
    true, so we don't need to care.

    Cc: Lucas Bates
    Cc: Jamal Hadi Salim
    Cc: Jiri Pirko
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Cong Wang
     
  • Hold netns refcnt before call_rcu() and release it after
    the tcf_exts_destroy() is done.

    Note, on ->destroy() path we have to respect the return value
    of tcf_exts_get_net(), on other paths it should always return
    true, so we don't need to care.

    Cc: Lucas Bates
    Cc: Jamal Hadi Salim
    Cc: Jiri Pirko
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Cong Wang
     
  • Hold netns refcnt before call_rcu() and release it after
    the tcf_exts_destroy() is done.

    Note, on ->destroy() path we have to respect the return value
    of tcf_exts_get_net(), on other paths it should always return
    true, so we don't need to care.

    Cc: Lucas Bates
    Cc: Jamal Hadi Salim
    Cc: Jiri Pirko
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Cong Wang
     
  • Hold netns refcnt before call_rcu() and release it after
    the tcf_exts_destroy() is done.

    Cc: Lucas Bates
    Cc: Jamal Hadi Salim
    Cc: Jiri Pirko
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Cong Wang
     
  • Hold netns refcnt before call_rcu() and release it after
    the tcf_exts_destroy() is done.

    Note, on ->destroy() path we have to respect the return value
    of tcf_exts_get_net(), on other paths it should always return
    true, so we don't need to care.

    Cc: Lucas Bates
    Cc: Jamal Hadi Salim
    Cc: Jiri Pirko
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Cong Wang
     
  • Hold netns refcnt before call_rcu() and release it after
    the tcf_exts_destroy() is done.

    Note, on ->destroy() path we have to respect the return value
    of tcf_exts_get_net(), on other paths it should always return
    true, so we don't need to care.

    Cc: Lucas Bates
    Cc: Jamal Hadi Salim
    Cc: Jiri Pirko
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Cong Wang
     
  • Hold netns refcnt before call_rcu() and release it after
    the tcf_exts_destroy() is done.

    Note, on ->destroy() path we have to respect the return value
    of tcf_exts_get_net(), on other paths it should always return
    true, so we don't need to care.

    Cc: Lucas Bates
    Cc: Jamal Hadi Salim
    Cc: Jiri Pirko
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Cong Wang
     
  • Hold netns refcnt before call_rcu() and release it after
    the tcf_exts_destroy() is done.

    Note, on ->destroy() path we have to respect the return value
    of tcf_exts_get_net(), on other paths it should always return
    true, so we don't need to care.

    Cc: Lucas Bates
    Cc: Jamal Hadi Salim
    Cc: Jiri Pirko
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Cong Wang
     
  • Hold netns refcnt before call_rcu() and release it after
    the tcf_exts_destroy() is done.

    Note, on ->destroy() path we have to respect the return value
    of tcf_exts_get_net(), on other paths it should always return
    true, so we don't need to care.

    Cc: Lucas Bates
    Cc: Jamal Hadi Salim
    Cc: Jiri Pirko
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Cong Wang
     
  • Hold netns refcnt before call_rcu() and release it after
    the tcf_exts_destroy() is done.

    Note, on ->destroy() path we have to respect the return value
    of tcf_exts_get_net(), on other paths it should always return
    true, so we don't need to care.

    Cc: Lucas Bates
    Cc: Jamal Hadi Salim
    Cc: Jiri Pirko
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Cong Wang
     
  • Instead of holding netns refcnt in tc actions, we can minimize
    the holding time by saving it in struct tcf_exts instead. This
    means we can just hold netns refcnt right before call_rcu() and
    release it after tcf_exts_destroy() is done.

    However, because on netns cleanup path we call tcf_proto_destroy()
    too, obviously we can not hold netns for a zero refcnt, in this
    case we have to do cleanup synchronously. It is fine for RCU too,
    the caller cleanup_net() already waits for a grace period.

    For other cases, refcnt is non-zero and we can safely grab it as
    normal and release it after we are done.

    This patch provides two new APIs for each filter to use:
    tcf_exts_get_net() and tcf_exts_put_net(). And all filters now can
    use the following pattern:

    void __destroy_filter() {
    tcf_exts_destroy();
    tcf_exts_put_net(); //
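
    A minimal user-space sketch of the hold-before-deferral pattern
    described above. The types and helpers (struct ns, struct filter,
    filter_get_net, filter_put_net, delete_filter, run_deferred) are
    hypothetical stand-ins, and a single pending pointer stands in for
    call_rcu(); this is an illustration under those assumptions, not the
    kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct ns { int count; };		/* stand-in for the netns refcnt */

struct filter {
	struct ns *ns;
	bool holds_ns;			/* did we take a reference? */
	bool destroyed;
};

/* Take a reference unless the count is already zero (netns going away). */
static bool filter_get_net(struct filter *f)
{
	if (f->ns->count == 0)
		return false;
	f->ns->count++;
	f->holds_ns = true;
	return true;
}

/* Drop the reference taken by filter_get_net(), if any. */
static void filter_put_net(struct filter *f)
{
	if (f->holds_ns) {
		f->ns->count--;
		f->holds_ns = false;
	}
}

static void destroy_filter(struct filter *f)
{
	f->destroyed = true;		/* stand-in for the real teardown */
	filter_put_net(f);		/* release the refcnt afterwards */
}

static struct filter *pending;		/* stand-in for a deferred callback */

static void delete_filter(struct filter *f)
{
	if (filter_get_net(f))
		pending = f;		/* defer: the namespace is pinned */
	else
		destroy_filter(f);	/* refcnt is zero: clean up now */
}

static void run_deferred(void)
{
	if (pending) {
		destroy_filter(pending);
		pending = NULL;
	}
}
```

    The synchronous branch covers the cleanup path, where the count is
    already zero and a reference can no longer be taken.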
    Cc: Jamal Hadi Salim
    Cc: Jiri Pirko
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Cong Wang
     
  • This reverts commit ceffcc5e254b450e6159f173e4538215cebf1b59.
    If we hold that refcnt, the netns can never be destroyed until
    all actions are destroyed by the user. This breaks our netns design,
    in which we expect all actions to be destroyed when we destroy the
    whole netns.

    Cc: Lucas Bates
    Cc: Jamal Hadi Salim
    Cc: Jiri Pirko
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Cong Wang
     
  • When asix_suspend() is called dev->driver_priv might not have been
    assigned a value, so we need to check that it's not NULL.

    Similar issue is present in asix_resume(), this patch fixes it as well.
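
    A minimal sketch of the guard, assuming hypothetical types sketch_dev
    and sketch_priv in place of the real usbnet and asix structures: the
    private data pointer is checked before its callback is used.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_dev;

/* Hypothetical per-driver private data with an optional suspend hook. */
struct sketch_priv {
	void (*suspend)(struct sketch_dev *dev);
};

struct sketch_dev {
	struct sketch_priv *driver_priv;	/* may still be NULL */
	int suspended;
};

static void mark_suspended(struct sketch_dev *dev)
{
	dev->suspended = 1;
}

static int sketch_suspend(struct sketch_dev *dev)
{
	struct sketch_priv *priv = dev->driver_priv;

	/* Only call the hook when the private data has been assigned. */
	if (priv && priv->suspend)
		priv->suspend(dev);

	return 0;
}
```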

    Found by syzkaller.

    kasan: CONFIG_KASAN_INLINE enabled
    kasan: GPF could be caused by NULL-ptr deref or user memory access
    general protection fault: 0000 [#1] PREEMPT SMP KASAN
    Modules linked in:
    CPU: 0 PID: 24 Comm: kworker/0:1 Not tainted 4.14.0-rc4-43422-geccacdd69a8c #400
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
    Workqueue: usb_hub_wq hub_event
    task: ffff88006bb36300 task.stack: ffff88006bba8000
    RIP: 0010:asix_suspend+0x76/0xc0 drivers/net/usb/asix_devices.c:629
    RSP: 0018:ffff88006bbae718 EFLAGS: 00010202
    RAX: dffffc0000000000 RBX: ffff880061ba3b80 RCX: 1ffff1000c34d644
    RDX: 0000000000000001 RSI: 0000000000000402 RDI: 0000000000000008
    RBP: ffff88006bbae738 R08: 1ffff1000d775cad R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800630a8b40
    R13: 0000000000000000 R14: 0000000000000402 R15: ffff880061ba3b80
    FS: 0000000000000000(0000) GS:ffff88006c600000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007ff33cf89000 CR3: 0000000061c0a000 CR4: 00000000000006f0
    Call Trace:
    usb_suspend_interface drivers/usb/core/driver.c:1209
    usb_suspend_both+0x27f/0x7e0 drivers/usb/core/driver.c:1314
    usb_runtime_suspend+0x41/0x120 drivers/usb/core/driver.c:1852
    __rpm_callback+0x339/0xb60 drivers/base/power/runtime.c:334
    rpm_callback+0x106/0x220 drivers/base/power/runtime.c:461
    rpm_suspend+0x465/0x1980 drivers/base/power/runtime.c:596
    __pm_runtime_suspend+0x11e/0x230 drivers/base/power/runtime.c:1009
    pm_runtime_put_sync_autosuspend ./include/linux/pm_runtime.h:251
    usb_new_device+0xa37/0x1020 drivers/usb/core/hub.c:2487
    hub_port_connect drivers/usb/core/hub.c:4903
    hub_port_connect_change drivers/usb/core/hub.c:5009
    port_event drivers/usb/core/hub.c:5115
    hub_event+0x194d/0x3740 drivers/usb/core/hub.c:5195
    process_one_work+0xc7f/0x1db0 kernel/workqueue.c:2119
    worker_thread+0x221/0x1850 kernel/workqueue.c:2253
    kthread+0x3a1/0x470 kernel/kthread.c:231
    ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:431
    Code: 8d 7c 24 20 48 89 fa 48 c1 ea 03 80 3c 02 00 75 5b 48 b8 00 00
    00 00 00 fc ff df 4d 8b 6c 24 20 49 8d 7d 08 48 89 fa 48 c1 ea 03
    3c 02 00 75 34 4d 8b 6d 08 4d 85 ed 74 0b e8 26 2b 51 fd 4c
    RIP: asix_suspend+0x76/0xc0 RSP: ffff88006bbae718
    ---[ end trace dfc4f5649284342c ]---

    Signed-off-by: Andrey Konovalov
    Signed-off-by: David S. Miller

    Andrey Konovalov
     
  • This reverts commit baedf68a068ca29624f241426843635920f16e1d.

    There is an updated version of this fix which covers
    the problem more thoroughly.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • * pm-cpufreq-sched:
    cpufreq: schedutil: Examine the correct CPU when we update util

    Rafael J. Wysocki
     
  • Commit 7744ccdbc16f0 ("x86/mm: Add Secure Memory Encryption (SME)
    support") as a side-effect made PAGE_KERNEL all of a sudden unavailable
    to modules which can't make use of EXPORT_SYMBOL_GPL() symbols.

    This is because once SME is enabled, sme_me_mask (which is introduced as
    EXPORT_SYMBOL_GPL) makes its way to PAGE_KERNEL through _PAGE_ENC,
    causing imminent build failure for all the modules which make use of all
    the EXPORT-SYMBOL()-exported API (such as vmap(), __vmalloc(),
    remap_pfn_range(), ...).

    Exporting (as EXPORT_SYMBOL()) interfaces (and having done so for ages)
    that take pgprot_t argument, while making it impossible to -- all of a
    sudden -- pass PAGE_KERNEL to it, feels rather inconsistent.

    Restore the original behavior and make it possible to pass PAGE_KERNEL
    to all its EXPORT_SYMBOL() consumers.

    [ This is all so not wonderful. We shouldn't need that "sme_me_mask"
    access at all in all those places that really don't care about that
    level of detail, and just want _PAGE_KERNEL or whatever.

    We have some similar issues with _PAGE_CACHE_WP and _PAGE_NOCACHE,
    both of which hide a "cachemode2protval()" call, and which also ends
    up using another EXPORT_SYMBOL(), but at least that only triggers for
    the much more rare cases.

    Maybe we could move these dynamic page table bits to be generated much
    deeper down in the VM layer, instead of hiding them in the macros that
    everybody uses.

    So this all would merit some cleanup. But not today. - Linus ]

    Cc: Tom Lendacky
    Signed-off-by: Jiri Kosina
    Despised-by: Thomas Gleixner
    Signed-off-by: Linus Torvalds

    Jiri Kosina
     
  • …jmorris/linux-security

    Pull key handling fix from James Morris:
    "Fix by Eric Biggers for the keys subsystem"

    * 'fixes-v4.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
    KEYS: fix NULL pointer dereference during ASN.1 parsing [ver #2]

    Linus Torvalds
     
  • This came in yesterday, and I have verified our regression tests
    were missing this and it can cause an oops. Please apply.

    There is an off-by-one comparison of sig against MAXMAPPED_SIG
    that can lead to a read outside the sig_map array if sig
    is MAXMAPPED_SIG. Fix this.

    Verified that the check is an out of bounds case that can cause an oops.
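
    A small sketch of this class of bug, using a hypothetical table
    sig_table and bound SKETCH_MAXMAPPED rather than the AppArmor sig_map
    itself: with SKETCH_MAXMAPPED entries, the last valid index is
    SKETCH_MAXMAPPED - 1, so the bound check has to be a strict '<'.

```c
#include <assert.h>

#define SKETCH_MAXMAPPED 4	/* hypothetical number of table entries */
#define SKETCH_UNKNOWN   (-1)

static const int sig_table[SKETCH_MAXMAPPED] = { 10, 11, 12, 13 };

/* Using '<=' here would read sig_table[SKETCH_MAXMAPPED], one element
 * past the end of the array. */
static int sketch_map_signal(int sig)
{
	if (sig >= 0 && sig < SKETCH_MAXMAPPED)
		return sig_table[sig];
	return SKETCH_UNKNOWN;
}
```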

    Revised: add comparison fix to second case
    Fixes: cd1dbf76b23d ("apparmor: add the ability to mediate signals")
    Signed-off-by: Colin Ian King
    Signed-off-by: John Johansen
    Signed-off-by: Linus Torvalds

    John Johansen
     

08 Nov, 2017

10 commits

  • syzkaller reported a NULL pointer dereference in asn1_ber_decoder(). It
    can be reproduced by the following command, assuming
    CONFIG_PKCS7_TEST_KEY=y:

    keyctl add pkcs7_test desc '' @s

    The bug is that if the data buffer is empty, an integer underflow occurs
    in the following check:

    if (unlikely(dp >= datalen - 1))
    goto data_overrun_error;

    This results in the NULL data pointer being dereferenced.

    Fix it by checking for 'datalen - dp < 2' instead.

    Also fix the similar check for 'dp >= datalen - n' later in the same
    function. That one possibly could result in a buffer overread.
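
    The underflow can be seen in isolation with two small predicates. The
    function names overrun_buggy and overrun_fixed are made up for this
    sketch; the comparisons are the ones quoted above, and dp and datalen
    are assumed to be of type size_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Original form: when datalen is 0, datalen - 1 wraps to SIZE_MAX, so
 * the comparison is false and the overrun goes undetected. */
static bool overrun_buggy(size_t dp, size_t datalen)
{
	return dp >= datalen - 1;
}

/* Fixed form: true when fewer than two bytes remain after dp.
 * Assumes dp <= datalen, so the subtraction cannot wrap. */
static bool overrun_fixed(size_t dp, size_t datalen)
{
	return datalen - dp < 2;
}
```

    For an empty buffer (dp and datalen both 0) the first predicate
    reports no overrun while the second one does; for non-empty buffers
    with dp <= datalen the two agree.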

    The NULL pointer dereference was reproducible using the "pkcs7_test" key
    type but not the "asymmetric" key type because the "asymmetric" key type
    checks for a 0-length payload before calling into the ASN.1 decoder but
    the "pkcs7_test" key type does not.

    The bug report was:

    BUG: unable to handle kernel NULL pointer dereference at (null)
    IP: asn1_ber_decoder+0x17f/0xe60 lib/asn1_decoder.c:233
    PGD 7b708067 P4D 7b708067 PUD 7b6ee067 PMD 0
    Oops: 0000 [#1] SMP
    Modules linked in:
    CPU: 0 PID: 522 Comm: syz-executor1 Not tainted 4.14.0-rc8 #7
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.3-20171021_125229-anatol 04/01/2014
    task: ffff9b6b3798c040 task.stack: ffff9b6b37970000
    RIP: 0010:asn1_ber_decoder+0x17f/0xe60 lib/asn1_decoder.c:233
    RSP: 0018:ffff9b6b37973c78 EFLAGS: 00010216
    RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000000021c
    RDX: ffffffff814a04ed RSI: ffffb1524066e000 RDI: ffffffff910759e0
    RBP: ffff9b6b37973d60 R08: 0000000000000001 R09: ffff9b6b3caa4180
    R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002
    R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
    FS: 00007f10ed1f2700(0000) GS:ffff9b6b3ea00000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000000 CR3: 000000007b6f3000 CR4: 00000000000006f0
    Call Trace:
    pkcs7_parse_message+0xee/0x240 crypto/asymmetric_keys/pkcs7_parser.c:139
    verify_pkcs7_signature+0x33/0x180 certs/system_keyring.c:216
    pkcs7_preparse+0x41/0x70 crypto/asymmetric_keys/pkcs7_key_type.c:63
    key_create_or_update+0x180/0x530 security/keys/key.c:855
    SYSC_add_key security/keys/keyctl.c:122 [inline]
    SyS_add_key+0xbf/0x250 security/keys/keyctl.c:62
    entry_SYSCALL_64_fastpath+0x1f/0xbe
    RIP: 0033:0x4585c9
    RSP: 002b:00007f10ed1f1bd8 EFLAGS: 00000216 ORIG_RAX: 00000000000000f8
    RAX: ffffffffffffffda RBX: 00007f10ed1f2700 RCX: 00000000004585c9
    RDX: 0000000020000000 RSI: 0000000020008ffb RDI: 0000000020008000
    RBP: 0000000000000000 R08: ffffffffffffffff R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000216 R12: 00007fff1b2260ae
    R13: 00007fff1b2260af R14: 00007f10ed1f2700 R15: 0000000000000000
    Code: dd ca ff 48 8b 45 88 48 83 e8 01 4c 39 f0 0f 86 a8 07 00 00 e8 53 dd ca ff 49 8d 46 01 48 89 85 58 ff ff ff 48 8b 85 60 ff ff ff 0f b6 0c 30 89 c8 88 8d 75 ff ff ff 83 e0 1f 89 8d 28 ff ff
    RIP: asn1_ber_decoder+0x17f/0xe60 lib/asn1_decoder.c:233 RSP: ffff9b6b37973c78
    CR2: 0000000000000000

    Fixes: 42d5ec27f873 ("X.509: Add an ASN.1 decoder")
    Reported-by: syzbot
    Cc: # v3.7+
    Signed-off-by: Eric Biggers
    Signed-off-by: David Howells
    Signed-off-by: James Morris

    Eric Biggers
     
  • When we receive a packet on a QMI device in raw IP mode, we should call
    skb_reset_mac_header() to ensure that skb->mac_header contains a valid
    offset in the packet. While it shouldn't really matter, since the
    packets have no MAC header and the interface is configured as such, it
    seems certain parts of the network stack expect a "good" value in
    skb->mac_header.

    Without the skb_reset_mac_header() call added in this patch, for example
    shaping traffic (using tc) triggers the following oops on the first
    received packet:

    [ 303.642957] skbuff: skb_under_panic: text:8f137918 len:177 put:67 head:8e4b0f00 data:8e4b0eff tail:0x8e4b0fb0 end:0x8e4b1520 dev:wwan0
    [ 303.655045] Kernel bug detected[#1]:
    [ 303.658622] CPU: 1 PID: 1002 Comm: logd Not tainted 4.9.58 #0
    [ 303.664339] task: 8fdf05e0 task.stack: 8f15c000
    [ 303.668844] $ 0 : 00000000 00000001 0000007a 00000000
    [ 303.674062] $ 4 : 8149a2fc 8149a2fc 8149ce20 00000000
    [ 303.679284] $ 8 : 00000030 3878303a 31623465 20303235
    [ 303.684510] $12 : ded731e3 2626a277 00000000 03bd0000
    [ 303.689747] $16 : 8ef62b40 00000043 8f137918 804db5fc
    [ 303.694978] $20 : 00000001 00000004 8fc13800 00000003
    [ 303.700215] $24 : 00000001 8024ab10
    [ 303.705442] $28 : 8f15c000 8fc19cf0 00000043 802cc920
    [ 303.710664] Hi : 00000000
    [ 303.713533] Lo : 74e58000
    [ 303.716436] epc : 802cc920 skb_panic+0x58/0x5c
    [ 303.721046] ra : 802cc920 skb_panic+0x58/0x5c
    [ 303.725639] Status: 11007c03 KERNEL EXL IE
    [ 303.729823] Cause : 50800024 (ExcCode 09)
    [ 303.733817] PrId : 0001992f (MIPS 1004Kc)
    [ 303.737892] Modules linked in: rt2800pci rt2800mmio rt2800lib qcserial ppp_async option usb_wwan rt2x00pci rt2x00mmio rt2x00lib rndis_host qmi_wwan ppp_generic nf_nat_pptp nf_conntrack_pptp nf_conntrack_ipv6 mt76x2i
    Process logd (pid: 1002, threadinfo=8f15c000, task=8fdf05e0, tls=77b3eee4)
    [ 303.962509] Stack : 00000000 80408990 8f137918 000000b1 00000043 8e4b0f00 8e4b0eff 8e4b0fb0
    [ 303.970871] 8e4b1520 8fec1800 00000043 802cd2a4 6e000045 00000043 00000000 8ef62000
    [ 303.979219] 8eef5d00 8ef62b40 8fea7300 8f137918 00000000 00000000 0002bb01 793e5664
    [ 303.987568] 8ef08884 00000001 8fea7300 00000002 8fc19e80 8eef5d00 00000006 00000003
    [ 303.995934] 00000000 8030ba90 00000003 77ab3fd0 8149dc80 8004d1bc 8f15c000 8f383700
    [ 304.004324] ...
    [ 304.006767] Call Trace:
    [ 304.009241] [] skb_panic+0x58/0x5c
    [ 304.013504] [] skb_push+0x78/0x90
    [ 304.017783] [] 0x8f137918
    [ 304.021269] Code: 00602825 0c02a3b4 24842888 8c870060 8c8200a0 0007382b 00070336 8c88005c
    [ 304.031034]
    [ 304.032805] ---[ end trace b778c482b3f0bda9 ]---
    [ 304.041384] Kernel panic - not syncing: Fatal exception in interrupt
    [ 304.051975] Rebooting in 3 seconds..

    While the oops is from a 4.9 kernel, I was able to trigger the same oops
    with net-next as of yesterday.

    Fixes: 32f7adf633b9 ("net: qmi_wwan: support "raw IP" mode")
    Signed-off-by: Kristian Evensen
    Acked-by: Bjørn Mork
    Signed-off-by: David S. Miller

    Kristian Evensen
     
  • The bonding miimon logic has a flaw, in that a failure of the
    rtnl_trylock can cause a slave to become permanently stuck in
    BOND_LINK_FAIL state.

    The sequence of events to cause this is as follows:

    1) bond_miimon_inspect finds that a slave's link is down, and so
    calls bond_propose_link_state, setting slave->new_link_state to
    BOND_LINK_FAIL, then sets slave->new_link to BOND_LINK_DOWN and returns
    non-zero.

    2) In bond_mii_monitor, the rtnl_trylock fails, and the timer is
    rescheduled. No change is committed.

    3) bond_miimon_inspect is called again, but this time the slave
    from step 1 has recovered. slave->new_link is reset to NOCHANGE, and, as
    slave->link was never changed, the switch enters the BOND_LINK_UP case,
    and does nothing. The pending BOND_LINK_FAIL state from step 1 remains
    pending, as new_link_state is not reset.

    4) The state from step 3 persists until another slave changes link
    state and causes bond_miimon_inspect to return non-zero. At this point,
    the BOND_LINK_FAIL state change on the slave from steps 1-3 is committed,
    and the slave will remain stuck in BOND_LINK_FAIL state even though it
    is actually link up.

    The remedy for this is to initialize new_link_state on each entry
    to bond_miimon_inspect, as is already done with new_link.
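
    The four steps above can be replayed in a toy model. This is only a
    sketch under simplifying assumptions: struct slave, inspect(),
    commit_pending() and replay() are hypothetical stand-ins that keep
    just the fields and transitions named in this message, not the
    bonding driver's real code:

```c
enum link_state { LINK_UP, LINK_FAIL, LINK_DOWN };
#define NOCHANGE (-1)

/* Hypothetical, reduced view of a slave: committed state plus proposals. */
struct slave {
	enum link_state link;	/* committed link state */
	int new_link;		/* proposed final state, or NOCHANGE */
	int new_link_state;	/* proposed intermediate state, or NOCHANGE */
};

/* Toy inspect pass: returns nonzero when a change is proposed.
 * reset_state selects whether new_link_state is re-initialized on entry,
 * which is the remedy described above. */
static int inspect(struct slave *s, int carrier_ok, int reset_state)
{
	s->new_link = NOCHANGE;
	if (reset_state)
		s->new_link_state = NOCHANGE;
	if (s->link == LINK_UP && !carrier_ok) {
		s->new_link_state = LINK_FAIL;
		s->new_link = LINK_DOWN;
		return 1;
	}
	return 0;
}

/* Toy commit pass: applies whatever proposals are still pending. */
static void commit_pending(struct slave *s)
{
	if (s->new_link_state != NOCHANGE)
		s->link = (enum link_state)s->new_link_state;
	if (s->new_link != NOCHANGE)
		s->link = (enum link_state)s->new_link;
}

/* Replays steps 1-4 and returns the slave's final committed state. */
static enum link_state replay(int reset_state)
{
	struct slave s = { LINK_UP, NOCHANGE, NOCHANGE };

	inspect(&s, 0, reset_state);	/* step 1: link down, change proposed */
					/* step 2: trylock fails, no commit */
	inspect(&s, 1, reset_state);	/* step 3: link has recovered */
	commit_pending(&s);		/* step 4: another slave forces a commit */
	return s.link;
}
```

    In this model, replay(0) ends in LINK_FAIL because the stale proposal
    from step 1 is committed, while replay(1) leaves the slave in LINK_UP.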

    Fixes: fb9eb899a6dc ("bonding: handle link transition from FAIL to UP correctly")
    Reported-by: Alex Sidorenko
    Reviewed-by: Jarod Wilson
    Signed-off-by: Jay Vosburgh
    Acked-by: Mahesh Bandewar
    Signed-off-by: David S. Miller

    Jay Vosburgh
     
  • Registering qrtr with module_init() makes the ability of typical
    platform code to create an AF_QIPCRTR socket during probe a matter of
    link-order luck. Moving qrtr to postcore_initcall() avoids this.

    Signed-off-by: Bjorn Andersson
    Signed-off-by: David S. Miller

    Bjorn Andersson
     
  • A CDC Ethernet functional descriptor with wMaxSegmentSize = 0 will
    cause a divide error in usbnet_probe:

    divide error: 0000 [#1] PREEMPT SMP KASAN
    Modules linked in:
    CPU: 0 PID: 24 Comm: kworker/0:1 Not tainted 4.14.0-rc8-44453-g1fdc1a82c34f #56
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
    Workqueue: usb_hub_wq hub_event
    task: ffff88006bef5c00 task.stack: ffff88006bf60000
    RIP: 0010:usbnet_update_max_qlen+0x24d/0x390 drivers/net/usb/usbnet.c:355
    RSP: 0018:ffff88006bf67508 EFLAGS: 00010246
    RAX: 00000000000163c8 RBX: ffff8800621fce40 RCX: ffff8800621fcf34
    RDX: 0000000000000000 RSI: ffffffff837ecb7a RDI: ffff8800621fcf34
    RBP: ffff88006bf67520 R08: ffff88006bef5c00 R09: ffffed000c43f881
    R10: ffffed000c43f880 R11: ffff8800621fc406 R12: 0000000000000003
    R13: ffffffff85c71de0 R14: 0000000000000000 R15: 0000000000000000
    FS: 0000000000000000(0000) GS:ffff88006ca00000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007ffe9c0d6dac CR3: 00000000614f4000 CR4: 00000000000006f0
    Call Trace:
    usbnet_probe+0x18b5/0x2790 drivers/net/usb/usbnet.c:1783
    qmi_wwan_probe+0x133/0x220 drivers/net/usb/qmi_wwan.c:1338
    usb_probe_interface+0x324/0x940 drivers/usb/core/driver.c:361
    really_probe drivers/base/dd.c:413
    driver_probe_device+0x522/0x740 drivers/base/dd.c:557

    Fix by simply ignoring the bogus descriptor, as it is optional
    for QMI devices anyway.

    Fixes: 423ce8caab7e ("net: usb: qmi_wwan: New driver for Huawei QMI based WWAN devices")
    Reported-by: Andrey Konovalov
    Signed-off-by: Bjørn Mork
    Signed-off-by: David S. Miller

    Bjørn Mork
     
  • Setting dev->hard_mtu to 0 will cause a divide error in
    usbnet_probe. Protect against devices with bogus CDC Ethernet
    functional descriptors by ignoring a zero wMaxSegmentSize.
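
    A minimal sketch of the guard, assuming the descriptor value is only
    used to override an already valid MTU. pick_hard_mtu() and
    queue_len() are hypothetical helpers written for illustration; they
    are not usbnet functions:

```c
#include <stdint.h>

/* Hypothetical helper: choose the hard MTU from the driver's current value
 * and the wMaxSegmentSize reported by the device. A zero wMaxSegmentSize is
 * treated as bogus and ignored. */
static uint32_t pick_hard_mtu(uint32_t current_mtu, uint16_t w_max_segment_size)
{
	if (w_max_segment_size == 0)
		return current_mtu;
	return w_max_segment_size;
}

/* Hypothetical stand-in for any later computation that divides by the
 * hard MTU; it would fault if hard_mtu were 0. */
static uint32_t queue_len(uint32_t budget_bytes, uint32_t hard_mtu)
{
	return budget_bytes / hard_mtu;
}
```

    With the guard in place the divisor stays non-zero as long as the
    driver's initial MTU was non-zero.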

    Signed-off-by: Bjørn Mork
    Acked-by: Oliver Neukum
    Signed-off-by: David S. Miller

    Bjørn Mork
     
  • After commit 07f4c90062f8 ("tcp/dccp: try to not exhaust ip_local_port_range
    in connect()"), we will try to use even ports for connect(). Then if an
    application (seen clearly with iperf) opens multiple streams to the same
    destination IP and port, each stream will be given an even source port.

    So the bonding driver's simple xmit_hash_policy based on layer3+4 addressing
    will always hash all these streams to the same interface, and the total
    throughput will be limited to that of a single slave.

    Changing the tcp code would affect tcp behavior as a whole just for the
    sake of bonding. Paolo Abeni suggested fixing this by changing only the
    bonding code, which is more reasonable and has less impact.

    Fix this by discarding the lowest hash bit because it contains little entropy.
    After the fix we can re-balance between slaves.
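
    The effect can be illustrated with a toy hash. This is a sketch, not
    the bonding driver's hash: toy_hash(), slave_old(), slave_new() and
    slaves_used() are hypothetical, and the XOR of the two ports merely
    stands in for a layer3+4 hash whose lowest bit follows the source
    port's parity:

```c
#include <stdint.h>

/* Hypothetical toy hash: XOR of source and destination port. */
static uint32_t toy_hash(uint16_t sport, uint16_t dport)
{
	return (uint32_t)sport ^ dport;
}

/* Slave selection using the full hash. */
static unsigned int slave_old(uint32_t hash, unsigned int n_slaves)
{
	return hash % n_slaves;
}

/* Slave selection after discarding the lowest hash bit. */
static unsigned int slave_new(uint32_t hash, unsigned int n_slaves)
{
	return (hash >> 1) % n_slaves;
}

/* Counts how many of two slaves are used by four streams that share one
 * destination port and have consecutive even source ports. */
static unsigned int slaves_used(int discard_low_bit)
{
	unsigned int used[2] = { 0, 0 };
	uint16_t sport;

	for (sport = 40000; sport < 40008; sport += 2) {
		uint32_t h = toy_hash(sport, 5001);

		used[discard_low_bit ? slave_new(h, 2) : slave_old(h, 2)] = 1;
	}
	return used[0] + used[1];
}
```

    With the full hash every stream lands on the same slave in this
    model; once the lowest bit is dropped, both slaves are used.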

    Signed-off-by: Paolo Abeni
    Signed-off-by: Hangbin Liu
    Signed-off-by: David S. Miller

    Hangbin Liu
     
  • hn is being kfree'd in mlx5e_del_l2_from_hash and then dereferenced
    by accessing hn->ai.addr.

    Fix this by copying the MAC address into a local variable for its safe use
    in all possible execution paths within function mlx5e_execute_l2_action.
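
    The pattern is sketched below in userspace C under stated
    assumptions: struct hash_node, del_from_hash() and execute_delete()
    are hypothetical simplifications, with free() standing in for kfree().
    The point is only the ordering: copy first, free second, then use the
    copy:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ETH_ALEN 6

/* Hypothetical, reduced hash node holding only a MAC address. */
struct hash_node {
	uint8_t addr[ETH_ALEN];
};

/* Hypothetical stand-in for the delete helper that frees the node. */
static void del_from_hash(struct hash_node *hn)
{
	free(hn);
}

/* Copies the MAC address into a local buffer before the node is freed,
 * so that nothing dereferences hn afterwards. */
static void execute_delete(struct hash_node *hn, uint8_t out_mac[ETH_ALEN])
{
	uint8_t mac[ETH_ALEN];

	memcpy(mac, hn->addr, ETH_ALEN);	/* hn is still valid here */
	del_from_hash(hn);			/* hn must not be used after this */
	memcpy(out_mac, mac, ETH_ALEN);		/* later users see the local copy */
}
```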

    Addresses-Coverity-ID: 1417789
    Fixes: eeb66cdb6826 ("net/mlx5: Separate between E-Switch and MPFS")
    Signed-off-by: Gustavo A. R. Silva
    Acked-by: Saeed Mahameed
    Signed-off-by: David S. Miller

    Gustavo A. R. Silva
     
  • The mvpp2 driver can't cope at all with the TX affinities being
    changed from userspace, and spits out an endless stream of

    [ 91.779920] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing
    [ 91.779930] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing
    [ 91.780402] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing
    [ 91.780406] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing
    [ 91.780415] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing
    [ 91.780418] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing

    rendering the box completely useless (I've measured around 600k
    interrupts/s on an 8040 box) once irqbalance kicks in and starts
    doing its job.

    Obviously, the driver was never designed with this in mind. So let's
    work around the problem by preventing userspace from interacting
    with these interrupts altogether.

    Signed-off-by: Marc Zyngier
    Signed-off-by: David S. Miller

    Marc Zyngier
     
  • The 0day bot reports the below failure which happens occasionally, with
    their randconfig testing (once every ~100 boots). The Code points at
    the private pointer ->driver_data being NULL, which hints at a race of
    sorts where the private driver_data descriptor has disappeared by the
    time we get to run the workqueue.

    So let's check that pointer before we continue with issuing the command
    to the drive.
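
    A minimal sketch of such a check, with hypothetical types: struct
    drive, struct cd_info and check_status() are illustrative stand-ins
    for the IDE structures, and the -EIO return value is an assumption
    rather than the value used by the patch:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical private data attached to a drive. */
struct cd_info {
	int status;
};

/* Hypothetical drive with an optional private-data pointer. */
struct drive {
	void *driver_data;
};

/* Fails the command instead of dereferencing a NULL private pointer. */
static int check_status(const struct drive *d)
{
	const struct cd_info *info = d->driver_data;

	if (!info)
		return -EIO;	/* assumed error code: private data is gone */
	return info->status;
}
```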

    This fix is of the brown paper bag nature but considering that IDE is
    long deprecated, let's do that so that random testing which happens to
    enable CONFIG_IDE during randconfig builds, doesn't fail because of
    this.

    Besides, failing the TEST_UNIT_READY command because the drive private
    data is gone is something which we could simply do anyway, to denote
    that there was a problem communicating with the device.

    BUG: unable to handle kernel NULL pointer dereference at 000001c0
    IP: cdrom_check_status
    *pde = 00000000
    Oops: 0000 [#1] SMP
    CPU: 1 PID: 155 Comm: kworker/1:2 Not tainted 4.14.0-rc8 #127
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
    Workqueue: events_freezable_power_ disk_events_workfn
    task: 4fe90980 task.stack: 507ac000
    EIP: cdrom_check_status+0x2c/0x90
    EFLAGS: 00210246 CPU: 1
    EAX: 00000000 EBX: 4fefec00 ECX: 00000000 EDX: 00000000
    ESI: 00000003 EDI: ffffffff EBP: 467a9340 ESP: 507aded0
    DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
    CR0: 80050033 CR2: 000001c0 CR3: 06e0f000 CR4: 00000690
    Call Trace:
    ? ide_cdrom_check_events_real
    ? cdrom_check_events
    ? disk_check_events
    ? process_one_work
    ? process_one_work
    ? worker_thread
    ? kthread
    ? process_one_work
    ? __kthread_create_on_node
    ? ret_from_fork
    Code: 53 83 ec 14 89 c3 89 d1 be 03 00 00 00 65 a1 14 00 00 00 89 44 24 10 31 c0 8b 43 18 c7 44 24 04 00 00 00 00 c7 04 24 00 00 00 00 80 c0 01 00 00 c7 44 24 08 00 00 00 00 83 e0 03 c7 44 24 0c
    EIP: cdrom_check_status+0x2c/0x90 SS:ESP: 0068:507aded0
    CR2: 00000000000001c0
    ---[ end trace 2410e586dd8f88b2 ]---

    Reported-and-tested-by: Fengguang Wu
    Signed-off-by: Borislav Petkov
    Cc: "David S. Miller"
    Cc: Jens Axboe
    Cc: Bart Van Assche
    Signed-off-by: Linus Torvalds

    Borislav Petkov