18 Feb, 2017

8 commits

  • [ Upstream commit 5fa8bbda38c668e56b0c6cdecced2eac2fe36dec ]

    Dmitry reported a warning [1] showing that we were calling
    net_disable_timestamp() -> static_key_slow_dec() from a non
    process context.

    Grabbing a mutex while holding a spinlock or rcu_read_lock()
    is not allowed.

    As Cong suggested, we now use a work queue.

    It is possible netstamp_clear() exits while netstamp_needed_deferred
    is not zero, but it is probably not worth trying to do better than that.

    netstamp_needed_deferred atomic tracks the exact number of deferred
    decrements.

    [1]
    [ INFO: suspicious RCU usage. ]
    4.10.0-rc5+ #192 Not tainted
    -------------------------------
    ./include/linux/rcupdate.h:561 Illegal context switch in RCU read-side
    critical section!

    other info that might help us debug this:

    rcu_scheduler_active = 2, debug_locks = 0
    2 locks held by syz-executor14/23111:
    #0: (sk_lock-AF_INET6){+.+.+.}, at: [] lock_sock
    include/net/sock.h:1454 [inline]
    #0: (sk_lock-AF_INET6){+.+.+.}, at: []
    rawv6_sendmsg+0x1e65/0x3ec0 net/ipv6/raw.c:919
    #1: (rcu_read_lock){......}, at: [] nf_hook
    include/linux/netfilter.h:201 [inline]
    #1: (rcu_read_lock){......}, at: []
    __ip6_local_out+0x258/0x840 net/ipv6/output_core.c:160

    stack backtrace:
    CPU: 2 PID: 23111 Comm: syz-executor14 Not tainted 4.10.0-rc5+ #192
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
    01/01/2011
    Call Trace:
    __dump_stack lib/dump_stack.c:15 [inline]
    dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
    lockdep_rcu_suspicious+0x139/0x180 kernel/locking/lockdep.c:4452
    rcu_preempt_sleep_check include/linux/rcupdate.h:560 [inline]
    ___might_sleep+0x560/0x650 kernel/sched/core.c:7748
    __might_sleep+0x95/0x1a0 kernel/sched/core.c:7739
    mutex_lock_nested+0x24f/0x1730 kernel/locking/mutex.c:752
    atomic_dec_and_mutex_lock+0x119/0x160 kernel/locking/mutex.c:1060
    __static_key_slow_dec+0x7a/0x1e0 kernel/jump_label.c:149
    static_key_slow_dec+0x51/0x90 kernel/jump_label.c:174
    net_disable_timestamp+0x3b/0x50 net/core/dev.c:1728
    sock_disable_timestamp+0x98/0xc0 net/core/sock.c:403
    __sk_destruct+0x27d/0x6b0 net/core/sock.c:1441
    sk_destruct+0x47/0x80 net/core/sock.c:1460
    __sk_free+0x57/0x230 net/core/sock.c:1468
    sock_wfree+0xae/0x120 net/core/sock.c:1645
    skb_release_head_state+0xfc/0x200 net/core/skbuff.c:655
    skb_release_all+0x15/0x60 net/core/skbuff.c:668
    __kfree_skb+0x15/0x20 net/core/skbuff.c:684
    kfree_skb+0x16e/0x4c0 net/core/skbuff.c:705
    inet_frag_destroy+0x121/0x290 net/ipv4/inet_fragment.c:304
    inet_frag_put include/net/inet_frag.h:133 [inline]
    nf_ct_frag6_gather+0x1106/0x3840
    net/ipv6/netfilter/nf_conntrack_reasm.c:617
    ipv6_defrag+0x1be/0x2b0 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:68
    nf_hook_entry_hookfn include/linux/netfilter.h:102 [inline]
    nf_hook_slow+0xc3/0x290 net/netfilter/core.c:310
    nf_hook include/linux/netfilter.h:212 [inline]
    __ip6_local_out+0x489/0x840 net/ipv6/output_core.c:160
    ip6_local_out+0x2d/0x170 net/ipv6/output_core.c:170
    ip6_send_skb+0xa1/0x340 net/ipv6/ip6_output.c:1722
    ip6_push_pending_frames+0xb3/0xe0 net/ipv6/ip6_output.c:1742
    rawv6_push_pending_frames net/ipv6/raw.c:613 [inline]
    rawv6_sendmsg+0x2d1a/0x3ec0 net/ipv6/raw.c:927
    inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744
    sock_sendmsg_nosec net/socket.c:635 [inline]
    sock_sendmsg+0xca/0x110 net/socket.c:645
    sock_write_iter+0x326/0x600 net/socket.c:848
    do_iter_readv_writev+0x2e3/0x5b0 fs/read_write.c:695
    do_readv_writev+0x42c/0x9b0 fs/read_write.c:872
    vfs_writev+0x87/0xc0 fs/read_write.c:911
    do_writev+0x110/0x2c0 fs/read_write.c:944
    SYSC_writev fs/read_write.c:1017 [inline]
    SyS_writev+0x27/0x30 fs/read_write.c:1014
    entry_SYSCALL_64_fastpath+0x1f/0xc2
    RIP: 0033:0x445559
    RSP: 002b:00007f6f46fceb58 EFLAGS: 00000292 ORIG_RAX: 0000000000000014
    RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 0000000000445559
    RDX: 0000000000000001 RSI: 0000000020f1eff0 RDI: 0000000000000005
    RBP: 00000000006e19c0 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000292 R12: 0000000000700000
    R13: 0000000020f59000 R14: 0000000000000015 R15: 0000000000020400
    BUG: sleeping function called from invalid context at
    kernel/locking/mutex.c:752
    in_atomic(): 1, irqs_disabled(): 0, pid: 23111, name: syz-executor14
    INFO: lockdep is turned off.
    CPU: 2 PID: 23111 Comm: syz-executor14 Not tainted 4.10.0-rc5+ #192
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
    01/01/2011
    Call Trace:
    __dump_stack lib/dump_stack.c:15 [inline]
    dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
    ___might_sleep+0x47e/0x650 kernel/sched/core.c:7780
    __might_sleep+0x95/0x1a0 kernel/sched/core.c:7739
    mutex_lock_nested+0x24f/0x1730 kernel/locking/mutex.c:752
    atomic_dec_and_mutex_lock+0x119/0x160 kernel/locking/mutex.c:1060
    __static_key_slow_dec+0x7a/0x1e0 kernel/jump_label.c:149
    static_key_slow_dec+0x51/0x90 kernel/jump_label.c:174
    net_disable_timestamp+0x3b/0x50 net/core/dev.c:1728
    sock_disable_timestamp+0x98/0xc0 net/core/sock.c:403
    __sk_destruct+0x27d/0x6b0 net/core/sock.c:1441
    sk_destruct+0x47/0x80 net/core/sock.c:1460
    __sk_free+0x57/0x230 net/core/sock.c:1468
    sock_wfree+0xae/0x120 net/core/sock.c:1645
    skb_release_head_state+0xfc/0x200 net/core/skbuff.c:655
    skb_release_all+0x15/0x60 net/core/skbuff.c:668
    __kfree_skb+0x15/0x20 net/core/skbuff.c:684
    kfree_skb+0x16e/0x4c0 net/core/skbuff.c:705
    inet_frag_destroy+0x121/0x290 net/ipv4/inet_fragment.c:304
    inet_frag_put include/net/inet_frag.h:133 [inline]
    nf_ct_frag6_gather+0x1106/0x3840
    net/ipv6/netfilter/nf_conntrack_reasm.c:617
    ipv6_defrag+0x1be/0x2b0 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:68
    nf_hook_entry_hookfn include/linux/netfilter.h:102 [inline]
    nf_hook_slow+0xc3/0x290 net/netfilter/core.c:310
    nf_hook include/linux/netfilter.h:212 [inline]
    __ip6_local_out+0x489/0x840 net/ipv6/output_core.c:160
    ip6_local_out+0x2d/0x170 net/ipv6/output_core.c:170
    ip6_send_skb+0xa1/0x340 net/ipv6/ip6_output.c:1722
    ip6_push_pending_frames+0xb3/0xe0 net/ipv6/ip6_output.c:1742
    rawv6_push_pending_frames net/ipv6/raw.c:613 [inline]
    rawv6_sendmsg+0x2d1a/0x3ec0 net/ipv6/raw.c:927
    inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744
    sock_sendmsg_nosec net/socket.c:635 [inline]
    sock_sendmsg+0xca/0x110 net/socket.c:645
    sock_write_iter+0x326/0x600 net/socket.c:848
    do_iter_readv_writev+0x2e3/0x5b0 fs/read_write.c:695
    do_readv_writev+0x42c/0x9b0 fs/read_write.c:872
    vfs_writev+0x87/0xc0 fs/read_write.c:911
    do_writev+0x110/0x2c0 fs/read_write.c:944
    SYSC_writev fs/read_write.c:1017 [inline]
    SyS_writev+0x27/0x30 fs/read_write.c:1014
    entry_SYSCALL_64_fastpath+0x1f/0xc2
    RIP: 0033:0x445559

    Fixes: b90e5794c5bd ("net: dont call jump_label_dec from irq context")
    Suggested-by: Cong Wang
    Reported-by: Dmitry Vyukov
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
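
The deferral described in the commit above can be sketched in userspace as follows. This is a minimal illustration, assuming C11 atomics in place of the kernel's atomic_t and a direct function call in place of the work queue; the names key_refcount, disable_from_atomic_context and deferred_worker are hypothetical and are not kernel symbols.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the static key's reference count. */
static int key_refcount = 0;

/* Number of decrements requested from contexts that must not sleep. */
static atomic_int deferred_decrements;

/* Safe from atomic context: only records the request, never blocks. */
static void disable_from_atomic_context(void)
{
    atomic_fetch_add(&deferred_decrements, 1);
    /* A real implementation would schedule the worker here. */
}

/* Runs in a context that may sleep: applies every recorded decrement. */
static void deferred_worker(void)
{
    int pending = atomic_exchange(&deferred_decrements, 0);

    while (pending-- > 0)
        key_refcount--; /* stands in for the sleeping slow-path decrement */
}
```

Because the counter is read and cleared in one atomic exchange, a request recorded while the worker runs is not lost; it stays in the counter for the next run, which matches the commit's remark that the worker may exit while the counter is not zero.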
     
  • [ Upstream commit 0a764db103376cf69d04449b10688f3516cc0b88 ]

    DW GMAC databook says the following about bits in "Register 15 (Interrupt
    Mask Register)":
    --------------------------->8-------------------------
    When set, this bit __disables_the_assertion_of_the_interrupt_signal__
    because of the setting of XXX bit in Register 14 (Interrupt
    Status Register).
    --------------------------->8-------------------------

    In fact, masking a bit in the mask register does not prevent the
    corresponding bit from appearing in the status register; it only disables
    interrupt generation for the corresponding event.

    But currently we expect a slightly different behavior: status bits to be
    in sync with their masks, i.e. if the mask for bit A is set in the mask
    register then bit A won't appear in the interrupt status register.

    This was proven to be an incorrect assumption, see discussion here [1].
    That misunderstanding causes unexpected behaviour of the GMAC, for
    example we were happy enough to just see bogus messages about link
    state changes.

    So from now on we'll be only checking bits that really may trigger an
    interrupt.

    [1] https://lkml.org/lkml/2016/11/3/413

    Signed-off-by: Alexey Brodkin
    Cc: Giuseppe Cavallaro
    Cc: Fabrice Gasnier
    Cc: Joachim Eastwood
    Cc: Phil Reid
    Cc: David Miller
    Cc: Alexandre Torgue
    Cc: Vineet Gupta
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Alexey Brodkin
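
The check described in the commit above amounts to discarding masked bits from the status word before acting on it. A minimal sketch, assuming both registers have already been read into plain integers; the function name unmasked_events is hypothetical.

```c
#include <stdint.h>

/*
 * Per the databook text quoted above, a set bit in the mask register only
 * suppresses the interrupt line; the matching status bit can still be set.
 * Dropping the masked bits leaves only events that could have raised the
 * interrupt.
 */
static uint32_t unmasked_events(uint32_t intr_status, uint32_t intr_mask)
{
    return intr_status & ~intr_mask;
}
```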
     
  • [ Upstream commit 06425c308b92eaf60767bc71d359f4cbc7a561f8 ]

    The syzkaller fuzzer was able to trigger a divide by zero when
    TCP window scaling is not enabled.

    SO_RCVBUF can be used not only to increase sk_rcvbuf, but also
    to decrease it below the current receive buffer utilization.

    If mss is negative or 0, just return a zero TCP window.

    Signed-off-by: Eric Dumazet
    Reported-by: Dmitry Vyukov
    Acked-by: Neal Cardwell
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
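
The guard described in the commit above can be illustrated with a small helper that rounds free receive space down to a multiple of the MSS. This is a sketch under assumptions, not the kernel's window selection code; window_for_space and its parameters are hypothetical.

```c
/* Hypothetical helper: round the free receive space down to a multiple of mss. */
static int window_for_space(int free_space, int mss)
{
    if (mss <= 0)
        return 0;            /* avoids dividing by zero or by a negative mss */
    if (free_space < mss)
        return 0;            /* not even one full segment fits */
    return (free_space / mss) * mss;
}
```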
     
  • [ Upstream commit 63117f09c768be05a0bf465911297dc76394f686 ]

    Casting is a high precedence operation but "off" and "i" are in terms of
    bytes so we need to have some parentheses here.

    Fixes: fbfa743a9d2a ("ipv6: fix ip6_tnl_parse_tlv_enc_lim()")
    Signed-off-by: Dan Carpenter
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Dan Carpenter
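
The precedence issue in the commit above can be shown in isolation. In this sketch the struct opt_hdr_sketch and both helper names are hypothetical; the point is only that a cast applied before the addition scales the offset by the size of the target type, while parenthesizing the addition keeps it in bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 8-byte option header, used only to make the scaling visible. */
struct opt_hdr_sketch {
    uint8_t type;
    uint8_t len;
    uint8_t pad[6];
};

/* Cast binds first: "off" is scaled by sizeof(struct opt_hdr_sketch). */
static size_t advance_cast_first(const uint8_t *raw, size_t off)
{
    const struct opt_hdr_sketch *p = (const struct opt_hdr_sketch *)raw + off;

    return (size_t)((const uint8_t *)p - raw);
}

/* Parenthesized: "off" is applied in bytes, then the result is cast. */
static size_t advance_bytes_first(const uint8_t *raw, size_t off)
{
    const struct opt_hdr_sketch *p = (const struct opt_hdr_sketch *)(raw + off);

    return (size_t)((const uint8_t *)p - raw);
}
```

The caller is expected to pass a buffer large enough for the scaled offset, since the first helper moves much further than the byte count suggests.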
     
  • [ Upstream commit fbfa743a9d2a0ffa24251764f10afc13eb21e739 ]

    This function suffers from multiple issues.

    First one is that pskb_may_pull() may reallocate skb->head,
    so the 'raw' pointer needs either to be reloaded or not used at all.

    Second issue is that NEXTHDR_DEST handling does not validate
    that the options are present in skb->data, so we might read
    garbage or access non existent memory.

    With help from Willem de Bruijn.

    Signed-off-by: Eric Dumazet
    Reported-by: Dmitry Vyukov
    Cc: Willem de Bruijn
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
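
Both issues named in the commit above have a simple userspace analogue: a buffer that may move when it grows, and a read that must first make sure the bytes exist. The sketch below assumes realloc() as a stand-in for the reallocation that pskb_may_pull() can perform; struct pkt_sketch, pkt_ensure and pkt_read_byte are hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical packet buffer; "head" may move whenever the buffer grows. */
struct pkt_sketch {
    unsigned char *head;
    size_t len;
};

/* Analogue of a pull that may reallocate: returns 0 on success, -1 on failure. */
static int pkt_ensure(struct pkt_sketch *pkt, size_t need)
{
    unsigned char *bigger;

    if (need <= pkt->len)
        return 0;
    bigger = realloc(pkt->head, need);
    if (!bigger)
        return -1;
    memset(bigger + pkt->len, 0, need - pkt->len);
    pkt->head = bigger;
    pkt->len = need;
    return 0;
}

/* Keeps an offset, not a pointer, across the call that may move the buffer. */
static int pkt_read_byte(struct pkt_sketch *pkt, size_t off, unsigned char *out)
{
    if (pkt_ensure(pkt, off + 1) < 0)
        return -1;
    *out = pkt->head[off];   /* pointer derived only after the possible move */
    return 0;
}
```

Carrying an offset instead of a cached pointer corresponds to the "not used at all" option mentioned in the commit message.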
     
  • [ Upstream commit fd62d9f5c575f0792f150109f1fd24a0d4b3f854 ]

    In the current version, the matchall internal state is split into two
    structs: cls_matchall_head and cls_matchall_filter. This makes little
    sense, as matchall instance supports only one filter, and there is no
    situation where one exists and the other does not. In addition, that led
    to some races when filter was deleted while packet was processed.

    Unify those two structs into one, thus simplifying the process of matchall
    creation and deletion. As a result, the new, delete and get callbacks have
    a dummy implementation where all the work is done in destroy and change
    callbacks, as was done in cls_cgroup.

    Fixes: bf3994d2ed31 ("net/sched: introduce Match-all classifier")
    Reported-by: Daniel Borkmann
    Signed-off-by: Yotam Gigi
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Yotam Gigi
     
  • [ Upstream commit a100ff3eef193d2d79daf98dcd97a54776ffeb78 ]

    Modifying TIR hash should change selected fields bitmask in addition to
    the function and key.

    Formerly, only in the ethtool path (mlx5e_set_rxfh, "ethtool -X") we would
    not set this field, resulting in zeroing of its value, which means no
    packet fields are used for RX RSS hash calculation, thus causing all
    traffic to arrive in RQ[0].

    On driver load out of the box we don't have this issue, since the TIR
    hash is fully created from scratch.

    Tested:
    ethtool -X ethX hkey
    ethtool -X ethX hfunc
    ethtool -X ethX equal

    All cases are verified with TCP Multi-Stream traffic over IPv4 & IPv6.

    Fixes: bdfc028de1b3 ("net/mlx5e: Fix ethtool RX hash func configuration change")
    Signed-off-by: Gal Pressman
    Signed-off-by: Saeed Mahameed
    Signed-off-by: Greg Kroah-Hartman

    Gal Pressman
     
  • [ Upstream commit f1712c73714088a7252d276a57126d56c7d37e64 ]

    Zhang Yanmin reported crashes [1] and provided a patch adding a
    synchronize_rcu() call in can_rx_unregister().

    The main problem seems that the sockets themselves are not RCU
    protected.

    If CAN uses RCU for delivery, then sockets should be freed only after
    one RCU grace period.

    Recent kernels could use sock_set_flag(sk, SOCK_RCU_FREE), but let's
    ease stable backports with the following fix instead.

    [1]
    BUG: unable to handle kernel NULL pointer dereference at (null)
    IP: [] selinux_socket_sock_rcv_skb+0x65/0x2a0

    Call Trace:

    [] security_sock_rcv_skb+0x4c/0x60
    [] sk_filter+0x41/0x210
    [] sock_queue_rcv_skb+0x53/0x3a0
    [] raw_rcv+0x2a3/0x3c0
    [] can_rcv_filter+0x12b/0x370
    [] can_receive+0xd9/0x120
    [] can_rcv+0xab/0x100
    [] __netif_receive_skb_core+0xd8c/0x11f0
    [] __netif_receive_skb+0x24/0xb0
    [] process_backlog+0x127/0x280
    [] net_rx_action+0x33b/0x4f0
    [] __do_softirq+0x184/0x440
    [] do_softirq_own_stack+0x1c/0x30

    [] do_softirq.part.18+0x3b/0x40
    [] do_softirq+0x1d/0x20
    [] netif_rx_ni+0xe5/0x110
    [] slcan_receive_buf+0x507/0x520
    [] flush_to_ldisc+0x21c/0x230
    [] process_one_work+0x24f/0x670
    [] worker_thread+0x9d/0x6f0
    [] ? rescuer_thread+0x480/0x480
    [] kthread+0x12c/0x150
    [] ret_from_fork+0x3f/0x70

    Reported-by: Zhang Yanmin
    Signed-off-by: Eric Dumazet
    Acked-by: Oliver Hartkopp
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
     

15 Feb, 2017

32 commits

  • Greg Kroah-Hartman
     
  • commit 451d24d1e5f40bad000fa9abe36ddb16fc9928cb upstream.

    Alexei had his box explode because doing read() on a package
    (rapl/uncore) event that isn't currently scheduled in ends up doing an
    out-of-bounds load.

    Rework the code to more explicitly deal with event->oncpu being -1.

    Reported-by: Alexei Starovoitov
    Tested-by: Alexei Starovoitov
    Tested-by: David Carrillo-Cisneros
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: eranian@google.com
    Fixes: d6a2f9035bfc ("perf/core: Introduce PMU_EV_CAP_READ_ACTIVE_PKG")
    Link: http://lkml.kernel.org/r/20170131102710.GL6515@twins.programming.kicks-ass.net
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Peter Zijlstra
     
  • commit 8381cdd0e32dd748bd34ca3ace476949948bd793 upstream.

    The -o/--order option is to select column number to sort a diff result.

    It does the job by adding a hpp field at the beginning of the sort list.
    But it should not be added to the output field list as it has no
    callbacks required by an output field.

    During the setup_sorting(), the perf_hpp__setup_output_field() appends
    the given sort keys to the output field if it's not there already.

    Originally it was checked by fmt->list being non-empty. But commit
    3f931f2c4274 ("perf hists: Make hpp setup function generic") changed it
    to check the ->equal callback.

    Anyways, we don't need to add the pseudo hpp field to the output field
    list since it won't be used for output. So just skip fields if they
    have no ->color or ->entry callbacks.

    Signed-off-by: Namhyung Kim
    Acked-by: Jiri Olsa
    Cc: Peter Zijlstra
    Fixes: 3f931f2c4274 ("perf hists: Make hpp setup function generic")
    Link: http://lkml.kernel.org/r/20170118051457.30946-1-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Greg Kroah-Hartman

    Namhyung Kim
     
  • commit a1c9f97f0b64e6337d9cfcc08c134450934fdd90 upstream.

    Commit 21e6d8428664 ("perf diff: Use perf_hpp__register_sort_field
    interface") changed list_add() to perf_hpp__register_sort_field().

    This resulted in a behavior change since the field was added to the tail
    instead of the head. So the -o option is mostly ignored due to its
    order in the list.

    This patch fixes it by adding perf_hpp__prepend_sort_field().

    Signed-off-by: Namhyung Kim
    Acked-by: Jiri Olsa
    Cc: Peter Zijlstra
    Fixes: 21e6d8428664 ("perf diff: Use perf_hpp__register_sort_field interface")
    Link: http://lkml.kernel.org/r/20170118051457.30946-2-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Greg Kroah-Hartman

    Namhyung Kim
     
  • commit bfeda41d06d85ad9d52f2413cfc2b77be5022f75 upstream.

    Since KERN_CONT became meaningful again, lockdep stack traces have had
    annoying extra newlines, like this:

    [ 5.561122] -> #1 (B){+.+...}:
    [ 5.561528]
    [ 5.561532] [] lock_acquire+0xc3/0x210
    [ 5.562178]
    [ 5.562181] [] mutex_lock_nested+0x74/0x6d0
    [ 5.562861]
    [ 5.562880] [] init_btrfs_fs+0x21/0x196 [btrfs]
    [ 5.563717]
    [ 5.563721] [] do_one_initcall+0x52/0x1b0
    [ 5.564554]
    [ 5.564559] [] do_init_module+0x5f/0x209
    [ 5.565357]
    [ 5.565361] [] load_module+0x218d/0x2b80
    [ 5.566020]
    [ 5.566021] [] SyS_finit_module+0xeb/0x120
    [ 5.566694]
    [ 5.566696] [] entry_SYSCALL_64_fastpath+0x1f/0xc2

    That's happening because each printk() call now gets printed on its own
    line, and we do a separate call to print the spaces before the symbol.
    Fix it by doing the printk() directly instead of using the
    print_ip_sym() helper.

    Additionally, the symbol address isn't very helpful, so let's get rid of
    that, too. The final result looks like this:

    [ 5.194518] -> #1 (B){+.+...}:
    [ 5.195002] lock_acquire+0xc3/0x210
    [ 5.195439] mutex_lock_nested+0x74/0x6d0
    [ 5.196491] do_one_initcall+0x52/0x1b0
    [ 5.196939] do_init_module+0x5f/0x209
    [ 5.197355] load_module+0x218d/0x2b80
    [ 5.197792] SyS_finit_module+0xeb/0x120
    [ 5.198251] entry_SYSCALL_64_fastpath+0x1f/0xc2

    Suggested-by: Linus Torvalds
    Signed-off-by: Omar Sandoval
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: kernel-team@fb.com
    Fixes: 4bcc595ccd80 ("printk: reinstate KERN_CONT for printing continuation lines")
    Link: http://lkml.kernel.org/r/43b4e114724b2bdb0308fa86cb33aa07d3d67fad.1486510315.git.osandov@fb.com
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Omar Sandoval
     
  • commit 647bf3d8a8e5777319da92af672289b2a6c4dc66 upstream.

    Update the range check to avoid an integer overflow in an edge case.
    Resolves CVE-2016-8636.

    Signed-off-by: Eyal Itkin
    Signed-off-by: Dan Carpenter
    Reviewed-by: Leon Romanovsky
    Signed-off-by: Doug Ledford
    Signed-off-by: Greg Kroah-Hartman

    Eyal Itkin
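
The class of bug addressed by the commit above can be illustrated with a generic range check. This sketch is not the driver's code; it assumes a region described by a base and a size (with base + size itself not wrapping), and the names range_ok_naive and range_ok are hypothetical. The naive form adds the caller-controlled length before comparing, so a very large length wraps around and passes; the second form compares only quantities that cannot overflow.

```c
#include <stdbool.h>
#include <stdint.h>

/* Naive check: addr + len can wrap past UINT64_MAX and appear in range. */
static bool range_ok_naive(uint64_t base, uint64_t size,
                           uint64_t addr, uint64_t len)
{
    return addr >= base && addr + len <= base + size;
}

/* Overflow-safe check: every subtraction is guarded by an earlier test. */
static bool range_ok(uint64_t base, uint64_t size,
                     uint64_t addr, uint64_t len)
{
    return addr >= base && len <= size && addr - base <= size - len;
}
```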
     
  • commit 628f07d33c1f2e7bf31e0a4a988bb07914bd5e73 upstream.

    Update the response's resid field when larger than MTU, instead of only
    updating the local resid variable.

    Fixes: 8700e3e7c485 ("Soft RoCE driver")
    Signed-off-by: Eyal Itkin
    Signed-off-by: Dan Carpenter
    Reviewed-by: Leon Romanovsky
    Signed-off-by: Doug Ledford
    Signed-off-by: Greg Kroah-Hartman

    Eyal Itkin
     
  • commit 08b259631b5a1d912af4832847b5642f377d9101 upstream.

    After:

    a33d331761bc ("x86/CPU/AMD: Fix Bulldozer topology")

    our SMT scheduling topology for Fam17h systems is broken, because
    the ThreadId is included in the ApicId when SMT is enabled.

    So, without further decoding cpu_core_id is unique for each thread
    rather than the same for threads on the same core. This didn't affect
    systems with SMT disabled. Make cpu_core_id be what it is defined to be.

    Signed-off-by: Yazen Ghannam
    Signed-off-by: Borislav Petkov
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20170205105022.8705-2-bp@alien8.de
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Yazen Ghannam
     
  • commit 79a8b9aa388b0620cc1d525d7c0f0d9a8a85e08e upstream.

    Commit:

    a33d331761bc ("x86/CPU/AMD: Fix Bulldozer topology")

    restored the initial approach we had with the Fam15h topology of
    enumerating CU (Compute Unit) threads as cores. And this is still
    correct - they're beefier than HT threads but still have some
    shared functionality.

    Our current approach has a problem with the Mad Max Steam game, for
    example. Yves Dionne reported a certain "choppiness" while playing on
    v4.9.5.

    That problem stems most likely from the fact that the CU threads share
    resources within one CU and when we schedule to a thread of a different
    compute unit, this incurs latency due to migrating the working set to a
    different CU through the caches.

    When the thread siblings mask mirrors that aspect of the CUs and
    threads, the scheduler pays attention to it and tries to schedule within
    one CU first. Which takes care of the latency, of course.

    Reported-by: Yves Dionne
    Signed-off-by: Borislav Petkov
    Cc: Brice Goglin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Yazen Ghannam
    Link: http://lkml.kernel.org/r/20170205105022.8705-1-bp@alien8.de
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Borislav Petkov
     
  • commit 146fbb766934dc003fcbf755b519acef683576bf upstream.

    CONFIG_KASAN=y needs a lot of virtual memory mapped for its shadow.
    In that case ptdump_walk_pgd_level_core() takes a lot of time to
    walk across all page tables and doing this without
    a rescheduling causes soft lockups:

    NMI watchdog: BUG: soft lockup - CPU#3 stuck for 23s! [swapper/0:1]
    ...
    Call Trace:
    ptdump_walk_pgd_level_core+0x40c/0x550
    ptdump_walk_pgd_level_checkwx+0x17/0x20
    mark_rodata_ro+0x13b/0x150
    kernel_init+0x2f/0x120
    ret_from_fork+0x2c/0x40

    I guess that this issue might arise even without KASAN on huge machines
    with several terabytes of RAM.

    Stick cond_resched() in pgd loop to fix this.

    Reported-by: Tobias Regnery
    Signed-off-by: Andrey Ryabinin
    Cc: kasan-dev@googlegroups.com
    Cc: Alexander Potapenko
    Cc: "Paul E . McKenney"
    Cc: Dmitry Vyukov
    Link: http://lkml.kernel.org/r/20170210095405.31802-1-aryabinin@virtuozzo.com
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Andrey Ryabinin
     
  • commit f3d83317a69e7d658e7c83e24f8b31ac533c39e3 upstream.

    This reverts commit f6a0dd107ad0c8b59d1c9735eea4b8cb9f460949.

    The commit caused a regression on LINE6 Transport that has no control
    caps. Although reverting the commit may result back in a spurious
    error message for some device again, it's the simplest regression fix,
    hence it's taken as is at first. The further code fix will follow
    later.

    Fixes: f6a0dd107ad0 ("ALSA: line6: Only determine control port properties if needed")
    Reported-by: Igor Zinovev
    Signed-off-by: Takashi Iwai
    Signed-off-by: Greg Kroah-Hartman

    Takashi Iwai
     
  • commit 37a7ea4a9b81f6a864c10a7cb0b96458df5310a3 upstream.

    snd_seq_pool_done() syncs with closing of all opened threads, but it
    aborts the wait loop with a timeout, and proceeds to the release
    resource even if not all threads have been closed. The timeout was 5
    seconds, and under a heavy enough load it can easily be exceeded, which
    may result in an access to an invalid memory address -- this is what
    syzkaller detected in a bug report.

    As a fix, let the code graduate from naiveness, simply remove the loop
    timeout.

    BugLink: http://lkml.kernel.org/r/CACT4Y+YdhDV2H5LLzDTJDVF-qiYHUHhtRaW4rbb4gUhTCQB81w@mail.gmail.com
    Reported-by: Dmitry Vyukov
    Signed-off-by: Takashi Iwai
    Signed-off-by: Greg Kroah-Hartman

    Takashi Iwai
     
  • commit 4842e98f26dd80be3623c4714a244ba52ea096a8 upstream.

    When a sequencer queue is created in snd_seq_queue_alloc(), it adds the
    new queue element to the public list before referencing it. Thus the
    queue might be deleted before the call of snd_seq_queue_use(), and it
    results in the use-after-free error, as spotted by syzkaller.

    The fix is to reference the queue object at the right time.

    Reported-by: Dmitry Vyukov
    Signed-off-by: Takashi Iwai
    Signed-off-by: Greg Kroah-Hartman

    Takashi Iwai
     
  • commit af677166cf63c179dc2485053166e02c4aea01eb upstream.

    Without this change, the HDMI/DP codec will be recognised as a
    generic codec, and there is no sound when playing through this codec.

    As suggested by NVidia side, after adding the new ID in the driver,
    the sound playing works well.

    Signed-off-by: Hui Wang
    Signed-off-by: Takashi Iwai
    Signed-off-by: Greg Kroah-Hartman

    Hui Wang
     
  • commit 74470954857c264168d2b5a113904cf0cfd27d18 upstream.

    rx_refill_timer should be deleted as soon as we disconnect from the
    backend since otherwise it is possible for the timer to go off before
    we get to xennet_destroy_queues(). If this happens we may dereference
    queue->rx.sring which is set to NULL in xennet_disconnect_backend().

    Signed-off-by: Boris Ostrovsky
    Reviewed-by: Juergen Gross
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Boris Ostrovsky
     
  • commit 9b256714979fad61ae11d90b53cf67dd5e6484eb upstream.

    The IPIs come in as HVI not EE, so we need to test the appropriate
    SRR1 bits. The encoding is such that it won't have false positives
    on P7 and P8 so we can just test it like that. We also need to handle
    the icp-opal variant of the flush.

    Fixes: d74361881f0d ("powerpc/xics: Add ICP OPAL backend")
    Signed-off-by: Benjamin Herrenschmidt
    Signed-off-by: Michael Ellerman
    Signed-off-by: Greg Kroah-Hartman

    Benjamin Herrenschmidt
     
  • commit 90c1e3c2fafec57fcb55b5d69bcf293b1a5fc8b3 upstream.

    Three tiny changes to the ERAT flushing logic: First don't make
    it depend on DD1. It hasn't been decided yet but we might run
    DD2 in a mode that also requires explicit flushes for performance
    reasons so make it unconditional. We also add a missing isync, and
    finally remove the flush from _tlbiel_va as it is only necessary
    for congruence-class invalidations (PID, LPID and full TLB), not
    targeted invalidations.

    Fixes: 96ed1fe511a8 ("powerpc/mm/radix: Invalidate ERAT on tlbiel for POWER9 DD1")
    Signed-off-by: Benjamin Herrenschmidt
    Signed-off-by: Michael Ellerman
    Signed-off-by: Greg Kroah-Hartman

    Benjamin Herrenschmidt
     
  • commit 2a362249187a8d0f6d942d6e1d763d150a296f47 upstream.

    Commit 4c63c2454ef incorrectly assumed that returning -ENOIOCTLCMD would
    cause the native ioctl to be called. The ->compat_ioctl callback is
    expected to handle all ioctls, not just compat variants. As a result,
    when using 32-bit userspace on 64-bit kernels, everything except those
    three ioctls would return -ENOTTY.

    Fixes: 4c63c2454ef ("btrfs: bugfix: handle FS_IOC32_{GETFLAGS,SETFLAGS,GETVERSION} in btrfs_ioctl")
    Signed-off-by: Jeff Mahoney
    Reviewed-by: David Sterba
    Signed-off-by: David Sterba
    Signed-off-by: Greg Kroah-Hartman

    Jeff Mahoney
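
The dispatch shape implied by the commit above is sketched below: the compat handler translates the few 32-bit-only command numbers and then forwards every command to the native handler, rather than returning early for anything it does not recognize. The command numbers and both function names are hypothetical and do not match the real FS_IOC* values.

```c
#include <errno.h>

/* Hypothetical command numbers; the real FS_IOC* values differ. */
enum {
    CMD_GETFLAGS = 1,
    CMD_SETFLAGS = 2,
    CMD_OTHER = 3,
    CMD32_GETFLAGS = 101,
    CMD32_SETFLAGS = 102,
};

/* Stand-in for the native handler: knows only the native numbers. */
static long native_ioctl_sketch(unsigned int cmd)
{
    switch (cmd) {
    case CMD_GETFLAGS:
    case CMD_SETFLAGS:
    case CMD_OTHER:
        return 0;
    default:
        return -ENOTTY;
    }
}

/* Translate the 32-bit-only numbers, then hand everything to the native handler. */
static long compat_ioctl_sketch(unsigned int cmd)
{
    switch (cmd) {
    case CMD32_GETFLAGS:
        cmd = CMD_GETFLAGS;
        break;
    case CMD32_SETFLAGS:
        cmd = CMD_SETFLAGS;
        break;
    default:
        break; /* layout-compatible commands pass through unchanged */
    }
    return native_ioctl_sketch(cmd);
}
```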
     
  • commit 2780f3c8f0233de90b6b47a23fc422b7780c5436 upstream.

    Avoid that issuing a LIP as follows:

    find /sys -name 'issue_lip'|while read f; do echo 1 > $f; done

    triggers the following:

    BUG: unable to handle kernel NULL pointer dereference at (null)
    Call Trace:
    qla2x00_abort_all_cmds+0xed/0x140 [qla2xxx]
    qla2x00_abort_isp_cleanup+0x1e3/0x280 [qla2xxx]
    qla2x00_abort_isp+0xef/0x690 [qla2xxx]
    qla2x00_do_dpc+0x36c/0x880 [qla2xxx]
    kthread+0x10c/0x140

    [mkp: consolidated Mauricio's and Bart's fixes]

    Signed-off-by: Mauricio Faria de Oliveira
    Reported-by: Bart Van Assche
    Fixes: 1535aa75a3d8 ("qla2xxx: fix invalid DMA access after command aborts in PCI device remove")
    Cc: Himanshu Madhani
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    Mauricio Faria de Oliveira
     
  • commit ffdadd68af5a397b8a52289ab39d62e1acb39e63 upstream.

    MPI2 controllers sometimes got lost (i.e. disappeared from
    /sys/bus/pci/devices) if ASPM is enabled.

    Signed-off-by: Slava Kardakov
    Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=60644
    Acked-by: Sreekanth Reddy
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    ojab
     
  • commit 8af8e1c22f9994bb1849c01d66c24fe23f9bc9a0 upstream.

    commit 78cbccd3bd68 ("aacraid: Fix for KDUMP driver hang")

    caused a problem on older controllers which do not support MSI-x (namely
    ASR3405, ASR3805). This patch conditionalizes the previous patch to
    controllers which support MSI-x.

    Fixes: 78cbccd3bd68 ("aacraid: Fix for KDUMP driver hang")
    Reported-by: Arkadiusz Miskiewicz
    Signed-off-by: Dave Carroll
    Reviewed-by: Raghava Aditya Renukunta
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    Dave Carroll
     
  • commit 2dfa6688aafdc3f74efeb1cf05fb871465d67f79 upstream.

    Dan Carpenter kindly reported:

    The patch d27a7cb91960: "zfcp: trace on request for open and close of
    WKA port" from Aug 10, 2016, leads to the following static checker
    warning:

    drivers/s390/scsi/zfcp_fsf.c:1615 zfcp_fsf_open_wka_port()
    warn: 'req' was already freed.

    drivers/s390/scsi/zfcp_fsf.c
    1609 zfcp_fsf_start_timer(req, ZFCP_FSF_REQUEST_TIMEOUT);
    1610 retval = zfcp_fsf_req_send(req);
    1611 if (retval)
    1612 zfcp_fsf_req_free(req);
    ^^^
    Freed.

    1613 out:
    1614 spin_unlock_irq(&qdio->req_q_lock);
    1615 if (req && !IS_ERR(req))
    1616 zfcp_dbf_rec_run_wka("fsowp_1", wka_port, req->req_id);
    ^^^^^^^^^^^
    Use after free.

    1617 return retval;
    1618 }

    Same thing for zfcp_fsf_close_wka_port() as well.

    Rather than relying on req being NULL (or ERR_PTR) for all cases where
    we don't want to trace or should not trace,
    simply check retval which is unconditionally initialized with -EIO != 0
    and it can only become 0 on successful retval = zfcp_fsf_req_send(req).
    With that we can also remove the then again unnecessary unconditional
    initialization of req which was introduced with that earlier commit.

    Reported-by: Dan Carpenter
    Suggested-by: Benjamin Block
    Signed-off-by: Steffen Maier
    Fixes: d27a7cb91960 ("zfcp: trace on request for open and close of WKA port")
    Reviewed-by: Benjamin Block
    Reviewed-by: Jens Remus
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    Steffen Maier
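
The fix described in the commit above, gating the trace on retval instead of on the request pointer, can be sketched in userspace. Everything here is hypothetical: struct req_sketch, open_port_sketch, the send_fails flag that simulates a failed send, and last_traced_id as a stand-in for the trace call. On success the request is handed back to the caller, standing in for a later completion path that owns it.

```c
#include <errno.h>
#include <stdlib.h>

struct req_sketch {
    unsigned long req_id;
};

/* Hypothetical trace sink: records the last traced request id. */
static unsigned long last_traced_id;

static int open_port_sketch(int send_fails, struct req_sketch **reqp)
{
    struct req_sketch *req;
    int retval = -EIO;       /* stays non-zero unless the send succeeds */

    *reqp = NULL;
    req = malloc(sizeof(*req));
    if (!req)
        goto out;
    req->req_id = 42;

    retval = send_fails ? -EIO : 0;
    if (retval)
        free(req);           /* req is dangling from here on */
    else
        *reqp = req;
out:
    if (!retval)             /* retval == 0 implies req is still valid */
        last_traced_id = req->req_id;
    return retval;
}
```

A check of the form "req is not NULL" would still pass after the free(), which is the use-after-free the static checker reported.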
     
  • commit 433e19cf33d34bb6751c874a9c00980552fe508c upstream.

    Commit a389fcfd2cb5 ("Drivers: hv: vmbus: Fix signaling logic in
    hv_need_to_signal_on_read()")
    added the proper mb(), but removed the test "prev_write_sz < pending_sz"
    when making the signal decision.

    As a result, the guest can signal the host unnecessarily,
    and then the host can throttle the guest because the host
    thinks the guest is buggy or malicious; finally the user
    running stress test can perceive intermittent freeze of
    the guest.

    This patch brings back the test, and properly handles the
    in-place consumption APIs used by NetVSC (see get_next_pkt_raw(),
    put_pkt_raw() and commit_rd_index()).

    Fixes: a389fcfd2cb5 ("Drivers: hv: vmbus: Fix signaling logic in
    hv_need_to_signal_on_read()")

    Signed-off-by: Dexuan Cui
    Reported-by: Rolf Neugebauer
    Tested-by: Rolf Neugebauer
    Cc: "K. Y. Srinivasan"
    Cc: Haiyang Zhang
    Cc: Stephen Hemminger
    Signed-off-by: K. Y. Srinivasan
    Cc: Rolf Neugebauer
    Signed-off-by: Greg Kroah-Hartman

    Dexuan Cui
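
    The signal decision that the commit above restores can be modelled as
    a pure function. This is a simplified sketch under assumptions: only
    the "prev_write_sz < pending_sz" test comes from the commit message;
    the other two checks, the parameter cur_write_sz and the function name
    are illustrative, and the memory barrier the commit mentions is left
    out.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * prev_write_sz: bytes the host could write before the guest's read
 * cur_write_sz:  bytes the host can write after the guest's read
 * pending_sz:    bytes the host is waiting to write (0 if it is not blocked)
 *
 * All names except prev_write_sz and pending_sz are assumptions.
 */
bool need_to_signal_on_read(uint32_t prev_write_sz, uint32_t cur_write_sz,
			    uint32_t pending_sz)
{
	if (pending_sz == 0)
		return false;        /* host is not blocked on a write */

	if (cur_write_sz < pending_sz)
		return false;        /* still not enough room for the host */

	/* Signal only if this read is what made enough room available. */
	return prev_write_sz < pending_sz;
}
```

    Without the last test, every read made while pending_sz is non-zero
    and already satisfied would produce a signal, which is the unnecessary
    signaling the commit describes.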
     
  • commit 3372592a140db69fd63837e81f048ab4abf8111e upstream.

    Signal the host when we determine the host is to be signaled -
    on the read path. The current code determines the need to signal in the
    ringbuffer code and actually issues the signal elsewhere. This can result
    in the host viewing this interrupt as spurious since the host may also
    poll the channel. Make the necessary adjustments.

    Signed-off-by: K. Y. Srinivasan
    Cc: Rolf Neugebauer
    Signed-off-by: Greg Kroah-Hartman

    K. Y. Srinivasan
     
  • commit 1f6ee4e7d83586c8b10bd4f2f4346353d04ce884 upstream.

    Signal the host when we determine the host is to be signaled.
    The current code determines the need to signal in the ringbuffer
    code and actually issues the signal elsewhere. This can result
    in the host viewing this interrupt as spurious since the host may also
    poll the channel. Make the necessary adjustments.

    Signed-off-by: K. Y. Srinivasan
    Cc: Rolf Neugebauer
    Signed-off-by: Greg Kroah-Hartman

    K. Y. Srinivasan
     
  • commit 74198eb4a42c4a3c4fbef08fa01a291a282f7c2e upstream.

    One of the factors that can result in the host concluding that a given
    guest is mounting a DoS attack is if the guest generates interrupts
    to the host when the host is not expecting it. If these "spurious"
    interrupts reach a certain rate, the host can throttle the guest to
    minimize the impact. The host computation of the "expected number
    of interrupts" is strictly based on the ring transitions. Until
    the host logic is fixed, base the guest logic to interrupt solely
    on the ring state.

    Signed-off-by: K. Y. Srinivasan
    Cc: Rolf Neugebauer
    Signed-off-by: Greg Kroah-Hartman

    K. Y. Srinivasan
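
    One way to read "solely on the ring state" on the write path is to
    signal only when a write moves the ring from empty to non-empty. The
    sketch below assumes that interpretation; struct ring and
    ring_write_needs_signal() are hypothetical, and capacity checks and
    memory barriers are omitted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical ring descriptor; not the Hyper-V ring buffer layout. */
struct ring {
	uint32_t read_index;   /* advanced by the consumer (host) */
	uint32_t write_index;  /* advanced by the producer (guest) */
	uint32_t size;         /* ring size in bytes */
};

/*
 * Advance the write index by len bytes and report whether the consumer
 * should be interrupted. The decision depends only on the ring state:
 * signal when the ring was empty before this write.
 */
bool ring_write_needs_signal(struct ring *r, uint32_t len)
{
	uint32_t old_write = r->write_index;

	r->write_index = (old_write + len) % r->size;

	return old_write == r->read_index;
}
```

    A second write before the consumer catches up returns false, so the
    number of interrupts follows the empty to non-empty transitions.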
     
  • commit 1d3398facd08a7fd4202f269317a95668eb880b9 upstream.

    We don't need to modify our TIRs unless the user requested a change in
    the hash function/key, for example when changing indirection only.

    Tested:
    # Modify TIRs hash is needed
    ethtool -X ethX hkey
    ethtool -X ethX hfunc

    # Modify TIRs hash is not needed
    ethtool -X ethX equal

    All cases are verified with TCP Multi-Stream traffic over IPv4 & IPv6.

    Fixes: bdfc028de1b3 ("net/mlx5e: Fix ethtool RX hash func configuration change")
    Signed-off-by: Gal Pressman
    Signed-off-by: Saeed Mahameed
    Signed-off-by: Greg Kroah-Hartman

    Gal Pressman
     
  • commit da7061c82e4a1bc6a5e134ef362c86261906c860 upstream.

    The function ieee80211_ie_split_vendor doesn't return 0 on errors. Instead
    it returns any offset < ielen when WLAN_EID_VENDOR_SPECIFIC is found. The
    return value in mesh_add_vendor_ies must therefore be checked against
    ifmsh->ie_len and not 0. Otherwise all ifmsh->ie starting with
    WLAN_EID_VENDOR_SPECIFIC will be rejected.

    Fixes: 082ebb0c258d ("mac80211: fix mesh beacon format")
    Signed-off-by: Thorsten Horstmann
    Signed-off-by: Mathias Kretschmer
    Signed-off-by: Simon Wunderlich
    [sven@narfation.org: Add commit message]
    Signed-off-by: Sven Eckelmann
    Signed-off-by: Johannes Berg
    Signed-off-by: Greg Kroah-Hartman

    Thorsten Horstmann
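
    The return-value convention described in the commit above can be
    sketched as follows. first_vendor_ie_offset() is a simplified,
    hypothetical stand-in for ieee80211_ie_split_vendor, assuming plain
    id/length/value elements; it is not the mac80211 implementation.

```c
#include <stddef.h>
#include <stdint.h>

#define WLAN_EID_VENDOR_SPECIFIC 221

/*
 * Return the offset of the first vendor-specific element in ies, or
 * ielen if there is none. Offset 0 is therefore a valid "found" result.
 */
size_t first_vendor_ie_offset(const uint8_t *ies, size_t ielen)
{
	size_t pos = 0;

	while (pos + 2 <= ielen) {
		if (ies[pos] == WLAN_EID_VENDOR_SPECIFIC)
			return pos;
		pos += 2 + ies[pos + 1];   /* skip id, length and payload */
	}
	return ielen;
}

/* Correct check: compare against ielen, not against 0. */
int has_vendor_ie(const uint8_t *ies, size_t ielen)
{
	return first_vendor_ie_offset(ies, ielen) < ielen;
}
```

    A buffer that begins with a vendor-specific element yields offset 0,
    which a comparison against 0 would wrongly treat as a failure.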
     
  • commit fd551bac4795854adaa87bad7e5136083719802b upstream.

    A previous change to fix checks for NL80211_MESHCONF_HT_OPMODE
    missed setting the flag when replacing FILL_IN_MESH_PARAM_IF_SET
    with checking codes. This results in dropping the received HT
    operation value when called by nl80211_update_mesh_config(). Fix
    this by setting the flag properly.

    Fixes: 9757235f451c ("nl80211: correct checks for NL80211_MESHCONF_HT_OPMODE value")
    Signed-off-by: Masashi Honma
    [rewrite commit message to use Fixes: line]
    Signed-off-by: Johannes Berg
    Signed-off-by: Greg Kroah-Hartman

    Masashi Honma
     
  • commit 6e7eb1783be7f19eb071c96ddda0bbf22279ff46 upstream.

    We're using non-canonical addresses in drm_mm, and we're making sure that
    userspace is using canonical addressing - both in case of softpin
    (verifying incoming offset) and when relocating (converting to canonical
    when updating offset returned to userspace).
    Unfortunately when considering the need for relocations, we're comparing
    offset from userspace (in canonical form) with drm_mm node (in
    non-canonical form), and as a result, we end up always relocating if our
    offsets are in the "problematic" range.
    Let's always convert the offsets to avoid the performance impact of
    relocations.

    Fixes: a5f0edf63bdf ("drm/i915: Avoid writing relocs with addresses in non-canonical form")
    Cc: Chris Wilson
    Cc: Michel Thierry
    Reported-by: Michał Pyrzowski
    Signed-off-by: Michał Winiarski
    Link: http://patchwork.freedesktop.org/patch/msgid/20170207195559.18798-1-michal.winiarski@intel.com
    Reviewed-by: Chris Wilson
    Signed-off-by: Chris Wilson
    (cherry picked from commit 038c95a313e4ca954ee5ab8a0c7559a646b0f462)
    Signed-off-by: Jani Nikula
    Signed-off-by: Greg Kroah-Hartman

    Michał Winiarski
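
    The mismatch described in the commit above can be reproduced in a few
    lines. This sketch assumes a 48-bit address space, where the canonical
    form copies bit 47 into bits 48 to 63; ADDR_BITS and all function
    names are illustrative and not the i915 helpers.

```c
#include <stdbool.h>
#include <stdint.h>

#define ADDR_BITS 48   /* assumption: 48-bit virtual addresses */

/* Sign-extend bit (ADDR_BITS - 1) into the upper bits. */
uint64_t canonical_addr(uint64_t addr)
{
	const uint64_t low_mask = (UINT64_C(1) << ADDR_BITS) - 1;
	const uint64_t sign_bit = UINT64_C(1) << (ADDR_BITS - 1);

	addr &= low_mask;
	return (addr & sign_bit) ? (addr | ~low_mask) : addr;
}

/* Drop the upper bits again. */
uint64_t noncanonical_addr(uint64_t addr)
{
	return addr & ((UINT64_C(1) << ADDR_BITS) - 1);
}

/*
 * Compare the offset supplied by userspace with the allocator's offset
 * only after bringing both into the same (canonical) form.
 */
bool needs_relocation(uint64_t user_offset, uint64_t node_start)
{
	return canonical_addr(user_offset) != canonical_addr(node_start);
}
```

    For node_start 0x0000800000001000 and a userspace offset of
    0xFFFF800000001000, a raw comparison reports a mismatch, while
    needs_relocation() returns false.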
     
  • commit 97a98ae5b8acf08d07d972c087b2def060bc9b73 upstream.

    Asynchronous external abort is coded differently in DFSR with LPAE enabled.

    Fixes: 9254970c ("ARM: 8447/1: catch pending imprecise abort on unmask")
    Signed-off-by: Alexander Sverdlin
    Cc: Russell King
    Cc: Andrew Morton
    Cc: linux-arm-kernel@lists.infradead.org
    Signed-off-by: Russell King
    Signed-off-by: Greg Kroah-Hartman

    Alexander Sverdlin
     
  • commit 7f59b319111bbc3a5f32730c8a43b201e9522f52 upstream.

    GPIO4_11 is on pin 152 (MX6DL_PAD_KEY_ROW2) and not on pin
    151 (MX6DL_PAD_KEY_ROW1).

    I found the error while booting a mainline kernel on an APF6S SoM and
    noticed the following message:

    [ 2.609337] imx6dl-pinctrl 20e0000.iomuxc: pin MX6DL_PAD_KEY_ROW1
    already requested by 20a8000.gpio:105; cannot claim for 20a8000.gpio:107
    [ 2.621884] imx6dl-pinctrl 20e0000.iomuxc: pin-151 (20a8000.gpio:107)
    status -22
    [ 2.629303] spi_imx 2008000.ecspi: Can't get CS GPIO 107

    With this patch, the message is gone and the spi_imx driver probes correctly.

    Fixes: bb728d662bed ("ARM: dts: add gpio-ranges property to iMX GPIO controllers")
    Signed-off-by: Sébastien Szymanski
    Signed-off-by: Shawn Guo
    Signed-off-by: Greg Kroah-Hartman

    Sébastien Szymanski