19 Aug, 2017

1 commit

  • Due to commit e6afc8ace6dd5cef5e812f26c72579da8806f5ac ("udp: remove
    headers from UDP packets before queueing"), when udp packets are being
    peeked the requested extra offset is always 0 as there is no need to skip
    the udp header. However, when the offset is 0 and the next skb is
    of length 0, it is only returned once. The behaviour can be seen with
    the following python script:

    from socket import *;
    f=socket(AF_INET6, SOCK_DGRAM | SOCK_NONBLOCK, 0);
    g=socket(AF_INET6, SOCK_DGRAM | SOCK_NONBLOCK, 0);
    f.bind(('::', 0));
    addr=('::1', f.getsockname()[1]);
    g.sendto(b'', addr)
    g.sendto(b'b', addr)
    print(f.recvfrom(10, MSG_PEEK));
    print(f.recvfrom(10, MSG_PEEK));

    Where the expected output should be the empty string twice.

    Instead, make sk_peek_offset return negative values, and pass those values
    to __skb_try_recv_datagram/__skb_try_recv_from_queue. If the passed offset
    to __skb_try_recv_from_queue is negative, the checked skb is never skipped.
    __skb_try_recv_from_queue will then ensure the offset is reset back to 0
    if a peek is requested without an offset, unless no packets are found.

    Also simplify the if condition in __skb_try_recv_from_queue. If _off is
    greater than 0, and off is greater than or equal to skb->len, then
    (_off || skb->len) must always be true assuming skb->len >= 0 is always
    true.
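
    The peek rules described in this message can be modelled in a few lines of
    Python. This is only a sketch: `Skb`, `peek_old` and `peek_new` are
    hypothetical names, and the old skip condition is reconstructed from the
    description here, not copied from the kernel source.

```python
from dataclasses import dataclass


@dataclass
class Skb:
    data: bytes
    peeked: bool = False  # set once the skb has been returned by a peek


def peek_old(queue, off):
    """Model of the old walk: off is always >= 0 (0 for UDP)."""
    for skb in queue:
        n = len(skb.data)
        # Assumed old condition: skip when the offset covers the whole skb
        # and the skb is non-empty, the offset is non-zero, or it was peeked.
        if off >= n and (n or off or skb.peeked):
            off -= n
            continue
        skb.peeked = True
        return skb.data
    return None


def peek_new(queue, off):
    """Model of the new walk: a negative off means 'no peek offset'."""
    peek_at_off = off >= 0
    for skb in queue:
        n = len(skb.data)
        # With a negative offset the checked skb is never skipped.
        if peek_at_off and off >= n and (off or skb.peeked):
            off -= n
            continue
        skb.peeked = True
        return skb.data
    return None


old_q = [Skb(b""), Skb(b"b")]
new_q = [Skb(b""), Skb(b"b")]
old_results = [peek_old(old_q, 0), peek_old(old_q, 0)]
new_results = [peek_new(new_q, -1), peek_new(new_q, -1)]
print(old_results, new_results)
```

    In this model the old walk returns the empty datagram only on the first
    peek, while the new walk returns it on every peek, which is the expected
    behaviour stated for the script.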

    Also remove a redundant check around a call to sk_peek_offset in af_unix.c,
    as it double checked if MSG_PEEK was set in the flags.

    V2:
    - Moved the negative fixup into __skb_try_recv_from_queue, and remove now
    redundant checks
    - Fix peeking in udp{,v6}_recvmsg to report the right value when the
    offset is 0

    V3:
    - Marked new branch in __skb_try_recv_from_queue as unlikely.

    Signed-off-by: Matthew Dawson
    Acked-by: Willem de Bruijn
    Signed-off-by: David S. Miller

    Matthew Dawson
     

17 Aug, 2017

2 commits

  • While working on yet another syzkaller report, I found
    that our IP_MAX_MTU enforcements were not properly done.

    gcc seems to reload dev->mtu for min(dev->mtu, IP_MAX_MTU), and
    final result can be bigger than IP_MAX_MTU :/

    This is a problem because device mtu can be changed on other cpus or
    threads.
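
    The reload problem can be imitated in Python, with the caveat that this is
    a simulation and not compiler behaviour: the hypothetical `Device` class
    hands out a different value on each read, standing in for another CPU
    changing the MTU. The value of `IP_MAX_MTU` and the single-read
    (READ_ONCE()-style) fix are assumptions, since the message does not spell
    them out.

```python
IP_MAX_MTU = 0xFFFF  # assumed value of the kernel constant


class Device:
    """Each read of .mtu may observe a value written by another CPU."""

    def __init__(self, observed):
        self._observed = iter(observed)

    @property
    def mtu(self):
        return next(self._observed)


def clamp_with_reload(dev):
    # Two reads: one for the comparison, one for the result.
    return dev.mtu if dev.mtu < IP_MAX_MTU else IP_MAX_MTU


def clamp_with_snapshot(dev):
    mtu = dev.mtu  # single read of the shared value
    return min(mtu, IP_MAX_MTU)


racy = clamp_with_reload(Device([1500, 70000]))
safe = clamp_with_snapshot(Device([1500, 70000]))
print(racy, safe)
```

    The reloading variant passes the comparison with one value and returns
    another, so its result can exceed the limit; the snapshot variant cannot.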

    While this patch does not fix the issue I am working on, it is
    probably worth addressing it.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • As found by syzkaller, malicious users can set whatever tx_queue_len
    on a tun device and eventually crash the kernel.

    Let's remove the ALIGN(XXX, SMP_CACHE_BYTES) thing since a small
    ring buffer is not fast anyway.

    Fixes: 2e0ab8ca83c1 ("ptr_ring: array based FIFO for pointers")
    Signed-off-by: Eric Dumazet
    Reported-by: Dmitry Vyukov
    Cc: Michael S. Tsirkin
    Cc: Jason Wang
    Signed-off-by: David S. Miller

    Eric Dumazet
     

16 Aug, 2017

4 commits

  • Pull networking fixes from David Miller:

    1) Fix TCP checksum offload handling in iwlwifi driver, from Emmanuel
    Grumbach.

    2) In ksz DSA tagging code, free SKB if skb_put_padto() fails. From
    Vivien Didelot.

    3) Fix two regressions with bonding on wireless, from Andreas Born.

    4) Fix build when busypoll is disabled, from Daniel Borkmann.

    5) Fix copy_linear_skb() wrt. SO_PEEK_OFF, from Eric Dumazet.

    6) Set SKB cached route properly in inet_rtm_getroute(), from Florian
    Westphal.

    7) Fix PCI-E relaxed ordering handling in cxgb4 driver, from Ding
    Tianhong.

    8) Fix module refcnt leak in ULP code, from Sabrina Dubroca.

    9) Fix use of GFP_KERNEL in atomic contexts in AF_KEY code, from Eric
    Dumazet.

    10) Need to purge socket write queue in dccp_destroy_sock(), also from
    Eric Dumazet.

    11) Make bpf_trace_printk() work properly on 32-bit architectures, from
    Daniel Borkmann.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (47 commits)
    bpf: fix bpf_trace_printk on 32 bit archs
    PCI: fix oops when try to find Root Port for a PCI device
    sfc: don't try and read ef10 data on non-ef10 NIC
    net_sched: remove warning from qdisc_hash_add
    net_sched/sfq: update hierarchical backlog when drop packet
    net_sched: reset pointers to tcf blocks in classful qdiscs' destructors
    ipv4: fix NULL dereference in free_fib_info_rcu()
    net: Fix a typo in comment about sock flags.
    ipv6: fix NULL dereference in ip6_route_dev_notify()
    tcp: fix possible deadlock in TCP stack vs BPF filter
    dccp: purge write queue in dccp_destroy_sock()
    udp: fix linear skb reception with PEEK_OFF
    ipv6: release rt6->rt6i_idev properly during ifdown
    af_key: do not use GFP_KERNEL in atomic contexts
    tcp: ulp: avoid module refcnt leak in tcp_set_ulp
    net/cxgb4vf: Use new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag
    net/cxgb4: Use new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag
    PCI: Disable Relaxed Ordering Attributes for AMD A1100
    PCI: Disable Relaxed Ordering for some Intel processors
    PCI: Disable PCIe Relaxed Ordering if unsupported
    ...

    Linus Torvalds
     
  • Signed-off-by: Tonghao Zhang
    Signed-off-by: David S. Miller

    Tonghao Zhang
     
  • Based on a syzkaller report [1], I found that a per cpu allocation
    failure in snmp6_alloc_dev() would then lead to NULL dereference in
    ip6_route_dev_notify().

    It seems this is a very old bug, thus no Fixes tag in this submission.

    Let's add in6_dev_put_clear() helper, as we will probably use
    it elsewhere (once available/present in net-next)

    [1]
    kasan: CONFIG_KASAN_INLINE enabled
    kasan: GPF could be caused by NULL-ptr deref or user memory access
    general protection fault: 0000 [#1] SMP KASAN
    Dumping ftrace buffer:
    (ftrace buffer empty)
    Modules linked in:
    CPU: 1 PID: 17294 Comm: syz-executor6 Not tainted 4.13.0-rc2+ #10
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    task: ffff88019f456680 task.stack: ffff8801c6e58000
    RIP: 0010:__read_once_size include/linux/compiler.h:250 [inline]
    RIP: 0010:atomic_read arch/x86/include/asm/atomic.h:26 [inline]
    RIP: 0010:refcount_sub_and_test+0x7d/0x1b0 lib/refcount.c:178
    RSP: 0018:ffff8801c6e5f1b0 EFLAGS: 00010202
    RAX: 0000000000000037 RBX: dffffc0000000000 RCX: ffffc90005d25000
    RDX: ffff8801c6e5f218 RSI: ffffffff82342bbf RDI: 0000000000000001
    RBP: ffff8801c6e5f240 R08: 0000000000000001 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff10038dcbe37
    R13: 0000000000000006 R14: 0000000000000001 R15: 00000000000001b8
    FS: 00007f21e0429700(0000) GS:ffff8801dc100000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000001ddbc22000 CR3: 00000001d632b000 CR4: 00000000001426e0
    DR0: 0000000020000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
    Call Trace:
    refcount_dec_and_test+0x1a/0x20 lib/refcount.c:211
    in6_dev_put include/net/addrconf.h:335 [inline]
    ip6_route_dev_notify+0x1c9/0x4a0 net/ipv6/route.c:3732
    notifier_call_chain+0x136/0x2c0 kernel/notifier.c:93
    __raw_notifier_call_chain kernel/notifier.c:394 [inline]
    raw_notifier_call_chain+0x2d/0x40 kernel/notifier.c:401
    call_netdevice_notifiers_info+0x51/0x90 net/core/dev.c:1678
    call_netdevice_notifiers net/core/dev.c:1694 [inline]
    rollback_registered_many+0x91c/0xe80 net/core/dev.c:7107
    rollback_registered+0x1be/0x3c0 net/core/dev.c:7149
    register_netdevice+0xbcd/0xee0 net/core/dev.c:7587
    register_netdev+0x1a/0x30 net/core/dev.c:7669
    loopback_net_init+0x76/0x160 drivers/net/loopback.c:214
    ops_init+0x10a/0x570 net/core/net_namespace.c:118
    setup_net+0x313/0x710 net/core/net_namespace.c:294
    copy_net_ns+0x27c/0x580 net/core/net_namespace.c:418
    create_new_namespaces+0x425/0x880 kernel/nsproxy.c:107
    unshare_nsproxy_namespaces+0xae/0x1e0 kernel/nsproxy.c:206
    SYSC_unshare kernel/fork.c:2347 [inline]
    SyS_unshare+0x653/0xfa0 kernel/fork.c:2297
    entry_SYSCALL_64_fastpath+0x1f/0xbe
    RIP: 0033:0x4512c9
    RSP: 002b:00007f21e0428c08 EFLAGS: 00000216 ORIG_RAX: 0000000000000110
    RAX: ffffffffffffffda RBX: 0000000000718150 RCX: 00000000004512c9
    RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000062020200
    RBP: 0000000000000086 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000216 R12: 00000000004b973d
    R13: 00000000ffffffff R14: 000000002001d000 R15: 00000000000002dd
    Code: 50 2b 34 82 c7 00 f1 f1 f1 f1 c7 40 04 04 f2 f2 f2 c7 40 08 f3 f3
    f3 f3 e8 a1 43 39 ff 4c 89 f8 48 8b 95 70 ff ff ff 48 c1 e8 03 b6
    0c 18 4c 89 f8 83 e0 07 83 c0 03 38 c8 7c 08 84 c9 0f 85
    RIP: __read_once_size include/linux/compiler.h:250 [inline] RSP:
    ffff8801c6e5f1b0
    RIP: atomic_read arch/x86/include/asm/atomic.h:26 [inline] RSP:
    ffff8801c6e5f1b0
    RIP: refcount_sub_and_test+0x7d/0x1b0 lib/refcount.c:178 RSP:
    ffff8801c6e5f1b0
    ---[ end trace e441d046c6410d31 ]---

    Signed-off-by: Eric Dumazet
    Reported-by: Dmitry Vyukov
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • …m/linux/kernel/git/kvalo/wireless-drivers

    Kalle Valo says:

    ====================
    wireless-drivers fixes for 4.13

    This time quite a few fixes for iwlwifi and one major regression fix
    for brcmfmac. For the iwlwifi aggregation bug a small change was
    needed for mac80211, but as Johannes is still away the mac80211 patch
    is taken via wireless-drivers tree.

    brcmfmac

    * fix firmware crash (a recent regression in bcm4343{0,1,8})

    iwlwifi

    * Some simple PCI HW ID fix-ups and additions for family 9000

    * Remove a bogus warning message with new FWs (bug #196915)

    * Don't allow illegal channel options to be used (bug #195299)

    * A fix for checksum offload in family 9000

    * A fix for serious throughput degradation in 11ac with multiple streams

    * An old bug in SMPS where the firmware was not aware of SMPS changes

    * Fix a memory leak in the SAR code

    * Fix a stuck queue case in AP mode

    * Convert a WARN to a simple debug in a legitimate race case (from
    which we can recover)

    * Fix a severe throughput degradation on 9000-family devices due to
    aggregation issues, which needed a small change in mac80211
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>

    David S. Miller
     

15 Aug, 2017

2 commits

  • copy_linear_skb() is broken; both of its callers actually
    expect 'len' to be the amount we are trying to copy,
    not the offset of the end.
    Fix it keeping the meanings of arguments in sync with what the
    callers (both of them) expect.
    Also restore a saner behavior on EFAULT (i.e. preserving
    the iov_iter position in case of failure):

    The commit fd851ba9caa9 ("udp: harden copy_linear_skb()")
    avoids the more destructive effect of the buggy
    copy_linear_skb(), e.g. no more invalid memory access, but
    said function still behaves incorrectly: when peeking with
    offset it can fail with EINVAL instead of copying the
    appropriate amount of memory.
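
    The change in the meaning of 'len' can be sketched with byte slices. Both
    functions below are hypothetical models, not the kernel helpers: the
    "hardened" check is an assumption based on the EINVAL behaviour described
    here, and the EFAULT/iov_iter handling is left out.

```python
import errno


def copy_linear_hardened(data, length, off):
    """Assumed model: 'length' treated as the end offset, with sanity checks."""
    if off < 0 or length < off:
        return -errno.EINVAL, b""
    return 0, data[off:length]


def copy_linear_fixed(data, length, off):
    """'length' is the amount to copy, starting at 'off'."""
    return 0, data[off:off + length]


payload = b"0123456789"
# Peek 3 bytes at offset 5.
hardened = copy_linear_hardened(payload, 3, 5)
fixed = copy_linear_fixed(payload, 3, 5)
print(hardened, fixed)
```

    With a peek of 3 bytes at offset 5, the hardened model rejects the request,
    while the fixed model copies the 3 bytes the caller asked for.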

    Reported-by: Sasha Levin
    Fixes: b65ac44674dd ("udp: try to avoid 2 cache miss on dequeue")
    Fixes: fd851ba9caa9 ("udp: harden copy_linear_skb()")
    Signed-off-by: Al Viro
    Acked-by: Paolo Abeni
    Tested-by: Sasha Levin
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Al Viro
     
    Bit 4 of the PCIe Device Control register indicates whether the device
    is permitted to use relaxed ordering. On some platforms, using relaxed
    ordering can cause performance issues or, due to errata, data
    corruption. In such cases devices must avoid using relaxed ordering.

    The patch adds a new flag PCI_DEV_FLAGS_NO_RELAXED_ORDERING to indicate that
    Relaxed Ordering (RO) attribute should not be used for Transaction Layer
    Packets (TLP) targeted towards these affected root complexes.

    This patch checks if there is any node in the hierarchy that indicates that
    using relaxed ordering is not safe. In such cases the patch turns off the
    relaxed ordering by clearing the capability for this device.
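
    The hierarchy check can be sketched as a walk up a parent chain. The
    `PciNode` class and both functions are hypothetical; the sketch assumes
    that only upstream nodes carry the flag and that bit 4 (0x0010) is the
    relaxed ordering enable bit.

```python
from dataclasses import dataclass
from typing import Optional

PCI_EXP_DEVCTL_RELAX_EN = 0x0010  # bit 4 of the Device Control register


@dataclass
class PciNode:
    name: str
    devctl: int = 0
    no_relaxed_ordering: bool = False  # models PCI_DEV_FLAGS_NO_RELAXED_ORDERING
    parent: Optional["PciNode"] = None


def relaxed_ordering_unsafe(dev):
    """True if any upstream node says relaxed ordering must not be used."""
    node = dev.parent
    while node is not None:
        if node.no_relaxed_ordering:
            return True
        node = node.parent
    return False


def configure_relaxed_ordering(dev):
    """Clear the enable bit when the hierarchy is flagged; return devctl."""
    if relaxed_ordering_unsafe(dev):
        dev.devctl &= ~PCI_EXP_DEVCTL_RELAX_EN
    return dev.devctl


root = PciNode("root-port", no_relaxed_ordering=True)
bridge = PciNode("bridge", parent=root)
nic = PciNode("nic", devctl=0x2810, parent=bridge)
other_root = PciNode("other-root")
other_nic = PciNode("other-nic", devctl=0x2810, parent=other_root)
nic_ctl = configure_relaxed_ordering(nic)
other_ctl = configure_relaxed_ordering(other_nic)
print(hex(nic_ctl), hex(other_ctl))
```

    Only the device below the flagged root port loses the bit; the device in
    the unflagged hierarchy keeps its register value unchanged.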

    Signed-off-by: Casey Leedom
    Signed-off-by: Ding Tianhong
    Acked-by: Ashok Raj
    Acked-by: Alexander Duyck
    Acked-by: Casey Leedom
    Signed-off-by: David S. Miller

    dingtianhong
     

14 Aug, 2017

2 commits

  • Pull tty/serial fixes from Greg KH:
    "Here are two tty serial driver fixes for 4.13-rc5. One is a revert of
    a -rc1 patch that turned out to not be a good idea, and the other is a
    fix for the pl011 serial driver.

    Both have been in linux-next with no reported issues"

    * tag 'tty-4.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
    Revert "serial: Delete dead code for CIR serial ports"
    tty: pl011: fix initialization order of QDF2400 E44

    Linus Torvalds
     
  • Pull staging/iio fixes from Greg KH:
    "Here are some Staging and IIO driver fixes for 4.13-rc5.

    Nothing major, just a number of small fixes for reported issues. All
    of these have been in linux-next for a while now with no reported
    issues. Full details are in the shortlog"

    * tag 'staging-4.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
    staging: comedi: comedi_fops: do not call blocking ops when !TASK_RUNNING
    iio: aspeed-adc: wait for initial sequence.
    iio: accel: bmc150: Always restore device to normal mode after suspend-resume
    staging:iio:resolver:ad2s1210 fix negative IIO_ANGL_VEL read
    iio: adc: axp288: Fix the GPADC pin reading often wrongly returning 0
    iio: adc: vf610_adc: Fix VALT selection value for REFSEL bits
    iio: accel: st_accel: add SPI-3wire support
    iio: adc: Revert "axp288: Drop bogus AXP288_ADC_TS_PIN_CTRL register modifications"
    iio: adc: sun4i-gpadc-iio: fix unbalanced irq enable/disable
    iio: pressure: st_pressure_core: disable multiread by default for LPS22HB
    iio: light: tsl2563: use correct event code

    Linus Torvalds
     

13 Aug, 2017

1 commit

  • Pull SCSI target fixes from Nicholas Bellinger:
    "The highlights include:

    - Fix iscsi-target payload memory leak during
    ISCSI_FLAG_TEXT_CONTINUE (Varun Prakash)

    - Fix tcm_qla2xxx incorrect use of tcm_qla2xxx_free_cmd during ABORT
    (Pascal de Bruijn + Himanshu Madhani + nab)

    - Fix iscsi-target long-standing issue with parallel delete of a
    single network portal across multiple target instances (Gary Guo +
    nab)

    - Fix target dynamic se_node GPF during uncached shutdown regression
    (Justin Maggard + nab)"

    * git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
    target: Fix node_acl demo-mode + uncached dynamic shutdown regression
    iscsi-target: Fix iscsi_np reset hung task during parallel delete
    qla2xxx: Fix incorrect tcm_qla2xxx_free_cmd use during TMR ABORT (v2)
    cxgbit: fix sg_nents calculation
    iscsi-target: fix invalid flags in text response
    iscsi-target: fix memory leak in iscsit_setup_text_cmd()
    cxgbit: add missing __kfree_skb()
    tcmu: free old string on reconfig
    tcmu: Fix possible to/from address overflow when doing the memcpy

    Linus Torvalds
     

12 Aug, 2017

4 commits

  • syzkaller got crashes with CONFIG_HARDENED_USERCOPY=y configs.

    The issue here is that recvfrom() can be used with a user buffer of Z
    bytes and SO_PEEK_OFF of X bytes, on an skb of Y bytes, under the
    following condition:

    Z < X < Y

    kernel BUG at mm/usercopy.c:72!
    invalid opcode: 0000 [#1] SMP KASAN
    Dumping ftrace buffer:
    (ftrace buffer empty)
    Modules linked in:
    CPU: 0 PID: 2917 Comm: syzkaller842281 Not tainted 4.13.0-rc3+ #16
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    task: ffff8801d2fa40c0 task.stack: ffff8801d1fe8000
    RIP: 0010:report_usercopy mm/usercopy.c:64 [inline]
    RIP: 0010:__check_object_size+0x3ad/0x500 mm/usercopy.c:264
    RSP: 0018:ffff8801d1fef8a8 EFLAGS: 00010286
    RAX: 0000000000000078 RBX: ffffffff847102c0 RCX: 0000000000000000
    RDX: 0000000000000078 RSI: 1ffff1003a3fded5 RDI: ffffed003a3fdf09
    RBP: ffff8801d1fef998 R08: 0000000000000001 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801d1ea480e
    R13: fffffffffffffffa R14: ffffffff84710280 R15: dffffc0000000000
    FS: 0000000001360880(0000) GS:ffff8801dc000000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00000000202ecfe4 CR3: 00000001d1ff8000 CR4: 00000000001406f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
    check_object_size include/linux/thread_info.h:108 [inline]
    check_copy_size include/linux/thread_info.h:139 [inline]
    copy_to_iter include/linux/uio.h:105 [inline]
    copy_linear_skb include/net/udp.h:371 [inline]
    udpv6_recvmsg+0x1040/0x1af0 net/ipv6/udp.c:395
    inet_recvmsg+0x14c/0x5f0 net/ipv4/af_inet.c:793
    sock_recvmsg_nosec net/socket.c:792 [inline]
    sock_recvmsg+0xc9/0x110 net/socket.c:799
    SYSC_recvfrom+0x2d6/0x570 net/socket.c:1788
    SyS_recvfrom+0x40/0x50 net/socket.c:1760
    entry_SYSCALL_64_fastpath+0x1f/0xbe
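
    The arithmetic behind the Z < X < Y condition can be worked through with
    small numbers. This is a sketch under assumptions: the function names are
    hypothetical, and the way the caller computes the requested amount is
    inferred, not quoted from the kernel.

```python
def requested_copy(buf_len, skb_len, peek_off):
    """Bytes the caller wants: limited by the buffer and by what is left."""
    return min(buf_len, skb_len - peek_off)


def buggy_size(buf_len, skb_len, peek_off):
    # Assumed bug: the helper treated the requested amount as an end offset
    # and subtracted the peek offset from it.
    return requested_copy(buf_len, skb_len, peek_off) - peek_off


Z, X, Y = 4, 10, 20  # user buffer, SO_PEEK_OFF, skb length: Z < X < Y
size = buggy_size(Z, Y, X)
as_size_t = size % 2**64  # how a negative int looks as a 64-bit size
print(size, hex(as_size_t))
```

    Whenever Z < X, the subtraction goes negative, and a negative length seen
    as an unsigned size is enormous, which is the kind of request a usercopy
    check would reject.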

    Fixes: b65ac44674dd ("udp: try to avoid 2 cache miss on dequeue")
    Signed-off-by: Eric Dumazet
    Cc: Paolo Abeni
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • MIN_NAPI_ID is used in various places outside of
    CONFIG_NET_RX_BUSY_POLL wrapping, so when it's not set
    we run into build errors such as:

    net/core/dev.c: In function 'dev_get_by_napi_id':
    net/core/dev.c:886:16: error: ‘MIN_NAPI_ID’ undeclared (first use in this function)
    if (napi_id < MIN_NAPI_ID)
    ^~~~~~~~~~~

    Thus, have MIN_NAPI_ID always defined to fix these errors.

    Signed-off-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Daniel Borkmann
     
  • The patch c4adfc822bf5 ("bonding: make speed, duplex setting consistent
    with link state") puts the link state to down if
    bond_update_speed_duplex() cannot retrieve speed and duplex settings.
    Presumably the patch was written with 802.3ad mode in mind, which relies
    on link speed/duplex settings. For other modes like active-backup these
    settings are not required. Thus, only for these other modes, this patch
    reintroduces support for slaves that do not support reporting speed or
    duplex such as wireless devices. This fixes the regression reported in
    bug 196547 (https://bugzilla.kernel.org/show_bug.cgi?id=196547).
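
    The resulting decision can be sketched as a small function. Everything
    here is a hypothetical model of the rule described, assuming the slave's
    carrier is otherwise up; it is not the bonding driver's code.

```python
def link_state_before_fix(mode, speed_duplex_known):
    """After c4adfc822bf5: unknown speed/duplex forces the link down."""
    return "up" if speed_duplex_known else "down"


def link_state_after_fix(mode, speed_duplex_known):
    """Unknown speed/duplex only matters for 802.3ad."""
    if not speed_duplex_known and mode == "802.3ad":
        return "down"
    return "up"


wireless_before = link_state_before_fix("active-backup", False)
wireless_after = link_state_after_fix("active-backup", False)
lacp_after = link_state_after_fix("802.3ad", False)
print(wireless_before, wireless_after, lacp_after)
```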

    Fixes: c4adfc822bf5 ("bonding: make speed, duplex setting consistent
    with link state")
    Signed-off-by: Andreas Born
    Acked-by: Mahesh Bandewar
    Signed-off-by: David S. Miller

    Andreas Born
     
  • Pull block fixes from Jens Axboe:
    "A set of fixes that should go into this series. This contains:

    - Fix from Bart for blk-mq requeue queue running, preventing a
    continued loop of run/restart.

    - Fix for a bio/blk-integrity issue, in two parts. One from
    Christoph, fixing where verification happens, and one from Milan,
    for a NULL profile.

    - NVMe pull request, most of the changes being for nvme-fc, but also
    a few trivial core/pci fixes"

    * 'for-linus' of git://git.kernel.dk/linux-block:
    nvme: fix directive command numd calculation
    nvme: fix nvme reset command timeout handling
    nvme-pci: fix CMB sysfs file removal in reset path
    lpfc: support nvmet_fc defer_rcv callback
    nvmet_fc: add defer_req callback for deferment of cmd buffer return
    nvme: strip trailing 0-bytes in wwid_show
    block: Make blk_mq_delay_kick_requeue_list() rerun the queue at a quiet time
    bio-integrity: only verify integrity on the lowest stacked driver
    bio-integrity: Fix regression if profile verify_fn is NULL

    Linus Torvalds
     

11 Aug, 2017

9 commits

  • Pull NVMe fixes from Christoph:

    "A few more small fixes - the fc/lpfc update is the biggest by far."

    Jens Axboe
     
  • Pull drm fixes from Dave Airlie:
    "Nothing too earth shattering here, it just seems like lots of little
    things all over the place.

    msm has probably the larger amount of changes, but they all seem fine,
    otherwise, some rockchip, i915, etnaviv and exynos fixes, along with
    one nouveau regression fix for some older GPUs"

    * tag 'drm-fixes-for-v4.13-rc5' of git://people.freedesktop.org/~airlied/linux: (35 commits)
    drm/nouveau/disp/nv04: avoid creation of output paths
    drm: make DRM_STM default n
    drm/exynos: forbid creating framebuffers from too small GEM buffers
    drm/etnaviv: Fix off-by-one error in reloc checking
    drm/i915: fix backlight invert for non-zero minimum brightness
    drm/i915/shrinker: Wrap need_resched() inside preempt-disable
    drm/i915/perf: fix flex eu registers programming
    drm/i915: Fix out-of-bounds array access in bdw_load_gamma_lut
    drm/i915/gvt: Change the max length of mmio_reg_rw from 4 to 8
    drm/i915/gvt: Initialize MMIO Block with HW state
    drm/rockchip: vop: report error when check resource error
    drm/rockchip: vop: round_up pitches to word align
    drm/rockchip: vop: fix NV12 video display error
    drm/rockchip: vop: fix iommu page fault when resume
    drm/i915/gvt: clean workload queue if error happened
    drm/i915/gvt: change resetting to resetting_eng
    drm/msm: gpu: don't abuse dma_alloc for non-DMA allocations
    drm/msm: gpu: call qcom_mdt interfaces only for ARCH_QCOM
    drm/msm/adreno: Prevent unclocked access when retrieving timestamps
    drm/msm: Remove __user from __u64 data types
    ...

    Linus Torvalds
     
  • Merge misc fixes from Andrew Morton:
    "21 fixes"

    * emailed patches from Andrew Morton: (21 commits)
    userfaultfd: replace ENOSPC with ESRCH in case mm has gone during copy/zeropage
    zram: rework copy of compressor name in comp_algorithm_store()
    rmap: do not call mmu_notifier_invalidate_page() under ptl
    mm: fix list corruptions on shmem shrinklist
    mm/balloon_compaction.c: don't zero ballooned pages
    MAINTAINERS: copy virtio on balloon_compaction.c
    mm: fix KSM data corruption
    mm: fix MADV_[FREE|DONTNEED] TLB flush miss problem
    mm: make tlb_flush_pending global
    mm: refactor TLB gathering API
    Revert "mm: numa: defer TLB flush for THP migration as long as possible"
    mm: migrate: fix barriers around tlb_flush_pending
    mm: migrate: prevent racy access to tlb_flush_pending
    fault-inject: fix wrong should_fail() decision in task context
    test_kmod: fix small memory leak on filesystem tests
    test_kmod: fix the lock in register_test_dev_kmod()
    test_kmod: fix bug which allows negative values on two config options
    test_kmod: fix spelling mistake: "EMTPY" -> "EMPTY"
    userfaultfd: hugetlbfs: remove superfluous page unlock in VM_SHARED case
    mm: ratelimit PFNs busy info message
    ...

    Linus Torvalds
     
  • Nadav reported that parallel MADV_DONTNEED calls on the same range have a
    stale TLB problem; Mel fixed it [1] and found the same problem with
    MADV_FREE [2].

    Quote from Mel Gorman:
    "The race in question is CPU 0 running madv_free and updating some PTEs
    while CPU 1 is also running madv_free and looking at the same PTEs.
    CPU 1 may have writable TLB entries for a page but fail the pte_dirty
    check (because CPU 0 has updated it already) and potentially fail to
    flush.

    Hence, when madv_free on CPU 1 returns, there are still potentially
    writable TLB entries and the underlying PTE is still present so that a
    subsequent write does not necessarily propagate the dirty bit to the
    underlying PTE any more. Reclaim at some unknown time at the future
    may then see that the PTE is still clean and discard the page even
    though a write has happened in the meantime. I think this is possible
    but I could have missed some protection in madv_free that prevents it
    happening."

    This patch aims to solve both problems at once and also prepares for
    the other problem involving KSM, MADV_FREE and soft-dirty [3].

    The TLB batch API (tlb_[gather|finish]_mmu) uses [inc|dec]_tlb_flush_pending
    and mmu_tlb_flush_pending so that when tlb_finish_mmu is called, we can
    detect that parallel threads are in progress. In that case, forcefully
    flush the TLB to prevent the user from accessing memory via a stale TLB
    entry, even though this thread failed to gather any page table entry.
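
    The forced flush can be sketched with a toy model. The classes and
    functions below are hypothetical stand-ins that only borrow the kernel
    names, and the "more than one gather in flight" test is an assumption
    about how parallel threads are detected.

```python
class MM:
    def __init__(self):
        self.tlb_flush_pending = 0  # number of gathers in flight
        self.flushes = 0            # TLB flushes actually issued


class Gather:
    def __init__(self, mm):
        self.mm = mm
        self.need_flush = False  # set when this thread cleared a PTE


def tlb_gather_mmu(mm):
    mm.tlb_flush_pending += 1  # models inc_tlb_flush_pending
    return Gather(mm)


def tlb_finish_mmu(tlb):
    mm = tlb.mm
    # Another gather is still in flight: flush even if nothing was gathered.
    force = mm.tlb_flush_pending > 1
    if tlb.need_flush or force:
        mm.flushes += 1
    mm.tlb_flush_pending -= 1  # models dec_tlb_flush_pending


mm = MM()
t0 = tlb_gather_mmu(mm)
t1 = tlb_gather_mmu(mm)
t0.need_flush = True   # thread 0 cleared the PTEs
# Thread 1 saw the PTEs already cleared, so it gathered nothing.
tlb_finish_mmu(t1)
flushes_after_t1 = mm.flushes
tlb_finish_mmu(t0)
print(flushes_after_t1, mm.flushes, mm.tlb_flush_pending)
```

    In the model, thread 1 still flushes before returning although it gathered
    no entries, so it cannot leave with a stale TLB entry behind.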

    I confirmed this patch works with [4] test program Nadav gave so this
    patch supersedes "mm: Always flush VMA ranges affected by zap_page_range
    v2" in current mmotm.

    NOTE:

    This patch modifies the arch-specific TLB gathering interface (x86, ia64,
    s390, sh, um). Most architectures seem straightforward, but s390 needs
    care because tlb_flush_mmu works only if mm->context.flush_mm is set to
    non-zero, which happens only when a pte entry really is cleared by
    ptep_get_and_clear and friends. However, this problem case never changes
    the pte entries but still needs a flush to prevent memory access through
    a stale TLB entry.

    [1] http://lkml.kernel.org/r/20170725101230.5v7gvnjmcnkzzql3@techsingularity.net
    [2] http://lkml.kernel.org/r/20170725100722.2dxnmgypmwnrfawp@suse.de
    [3] http://lkml.kernel.org/r/BD3A0EBE-ECF4-41D4-87FA-C755EA9AB6BD@gmail.com
    [4] https://patchwork.kernel.org/patch/9861621/

    [minchan@kernel.org: decrease tlb flush pending count in tlb_finish_mmu]
    Link: http://lkml.kernel.org/r/20170808080821.GA31730@bbox
    Link: http://lkml.kernel.org/r/20170802000818.4760-7-namit@vmware.com
    Signed-off-by: Minchan Kim
    Signed-off-by: Nadav Amit
    Reported-by: Nadav Amit
    Reported-by: Mel Gorman
    Acked-by: Mel Gorman
    Cc: Ingo Molnar
    Cc: Russell King
    Cc: Tony Luck
    Cc: Martin Schwidefsky
    Cc: "David S. Miller"
    Cc: Heiko Carstens
    Cc: Yoshinori Sato
    Cc: Jeff Dike
    Cc: Andrea Arcangeli
    Cc: Andy Lutomirski
    Cc: Hugh Dickins
    Cc: Mel Gorman
    Cc: Nadav Amit
    Cc: Rik van Riel
    Cc: Sergey Senozhatsky
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Minchan Kim
     
  • Currently, tlb_flush_pending is used only for CONFIG_[NUMA_BALANCING|
    COMPACTION], but upcoming patches to solve a subtle TLB flush batching
    problem will use it regardless of compaction/NUMA, so this patch
    removes the dependency.

    [akpm@linux-foundation.org: remove more ifdefs from world's ugliest printk statement]
    Link: http://lkml.kernel.org/r/20170802000818.4760-6-namit@vmware.com
    Signed-off-by: Minchan Kim
    Signed-off-by: Nadav Amit
    Acked-by: Mel Gorman
    Cc: "David S. Miller"
    Cc: Andrea Arcangeli
    Cc: Andy Lutomirski
    Cc: Heiko Carstens
    Cc: Hugh Dickins
    Cc: Ingo Molnar
    Cc: Jeff Dike
    Cc: Martin Schwidefsky
    Cc: Mel Gorman
    Cc: Nadav Amit
    Cc: Rik van Riel
    Cc: Russell King
    Cc: Sergey Senozhatsky
    Cc: Tony Luck
    Cc: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Minchan Kim
     
  • This patch is a preparatory patch for solving race problems caused by
    TLB batching. For that, we will increase/decrease the TLB flush pending
    count of mm_struct whenever tlb_[gather|finish]_mmu is called.

    To keep that change simple, this patch first separates out the
    architecture-specific part, renames it to arch_tlb_[gather|finish]_mmu,
    and has the generic part simply call it.

    It shouldn't change any behavior.

    Link: http://lkml.kernel.org/r/20170802000818.4760-5-namit@vmware.com
    Signed-off-by: Minchan Kim
    Signed-off-by: Nadav Amit
    Acked-by: Mel Gorman
    Cc: Ingo Molnar
    Cc: Russell King
    Cc: Tony Luck
    Cc: Martin Schwidefsky
    Cc: "David S. Miller"
    Cc: Heiko Carstens
    Cc: Yoshinori Sato
    Cc: Jeff Dike
    Cc: Andrea Arcangeli
    Cc: Andy Lutomirski
    Cc: Hugh Dickins
    Cc: Mel Gorman
    Cc: Nadav Amit
    Cc: Rik van Riel
    Cc: Sergey Senozhatsky
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Minchan Kim
     
  • Reading tlb_flush_pending while the page-table lock is taken does not
    require a barrier, since the lock/unlock already acts as a barrier.
    Remove the barrier in mm_tlb_flush_pending() to address this issue.

    However, migrate_misplaced_transhuge_page() calls mm_tlb_flush_pending()
    while the page-table lock is already released, which may present a
    problem on architectures with a weak memory model (PPC). To deal with
    this case, a new parameter is added to mm_tlb_flush_pending() to
    indicate whether it is read without the page-table lock taken, and
    smp_mb__after_unlock_lock() is called in this case.

    Link: http://lkml.kernel.org/r/20170802000818.4760-3-namit@vmware.com
    Signed-off-by: Nadav Amit
    Acked-by: Rik van Riel
    Cc: Minchan Kim
    Cc: Sergey Senozhatsky
    Cc: Andy Lutomirski
    Cc: Mel Gorman
    Cc: "David S. Miller"
    Cc: Andrea Arcangeli
    Cc: Heiko Carstens
    Cc: Hugh Dickins
    Cc: Ingo Molnar
    Cc: Jeff Dike
    Cc: Martin Schwidefsky
    Cc: Mel Gorman
    Cc: Nadav Amit
    Cc: Russell King
    Cc: Tony Luck
    Cc: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nadav Amit
     
  • Patch series "fixes of TLB batching races", v6.

    It turns out that the Linux TLB batching mechanism suffers from various
    races. Races caused by batching during reclamation were
    recently handled by Mel and this patch-set deals with others. The more
    fundamental issue is that concurrent updates of the page-tables allow
    for TLB flushes to be batched on one core, while another core changes
    the page-tables. This other core may assume a PTE change does not
    require a flush based on the updated PTE value, while it is unaware that
    TLB flushes are still pending.

    This behavior affects KSM (which may result in memory corruption) and
    MADV_FREE and MADV_DONTNEED (which may result in incorrect behavior). A
    proof-of-concept can easily produce the wrong behavior of MADV_DONTNEED.
    Memory corruption in KSM is harder to produce in practice, but was
    observed by hacking the kernel and adding a delay before flushing and
    replacing the KSM page.

    Finally, there is also one memory barrier missing, which may affect
    architectures with a weak memory model.

    This patch (of 7):

    Setting and clearing mm->tlb_flush_pending can be performed by multiple
    threads, since mmap_sem may only be acquired for read in
    task_numa_work(). If this happens, tlb_flush_pending might be cleared
    while one of the threads still changes PTEs and batches TLB flushes.

    This can lead to the same race between migration and
    change_protection_range() that led to the introduction of
    tlb_flush_pending. The result of this race was data corruption, which
    means that this patch also addresses a theoretically possible data
    corruption.

    An actual data corruption was not observed, yet the race was
    confirmed by adding an assertion to check that tlb_flush_pending is not
    set by two threads, adding artificial latency in change_protection_range()
    and using sysctl to reduce kernel.numa_balancing_scan_delay_ms.
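
    The race on the flag can be replayed deterministically in a toy model.
    The two classes are hypothetical; the counter variant is an assumption
    about the fix, suggested by the inc/dec helper names used elsewhere in
    this series.

```python
class FlagMM:
    """tlb_flush_pending as a plain boolean."""

    def __init__(self):
        self.pending = False

    def begin(self):
        self.pending = True

    def end(self):
        self.pending = False

    def flush_pending(self):
        return self.pending


class CounterMM:
    """tlb_flush_pending as a count of threads still batching."""

    def __init__(self):
        self.pending = 0

    def begin(self):
        self.pending += 1

    def end(self):
        self.pending -= 1

    def flush_pending(self):
        return self.pending > 0


def pending_while_b_still_batching(mm):
    mm.begin()  # thread A starts changing PTEs
    mm.begin()  # thread B starts changing PTEs
    mm.end()    # thread A finishes and clears its mark
    return mm.flush_pending()  # what a third party sees while B is mid-batch


flag_view = pending_while_b_still_batching(FlagMM())
counter_view = pending_while_b_still_batching(CounterMM())
print(flag_view, counter_view)
```

    With the boolean, thread A's clear hides thread B's pending flushes; with
    the counter, the pending state stays visible until both have finished.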

    Link: http://lkml.kernel.org/r/20170802000818.4760-2-namit@vmware.com
    Fixes: 20841405940e ("mm: fix TLB flush race between migration, and
    change_protection_range")
    Signed-off-by: Nadav Amit
    Acked-by: Mel Gorman
    Acked-by: Rik van Riel
    Acked-by: Minchan Kim
    Cc: Andy Lutomirski
    Cc: Hugh Dickins
    Cc: "David S. Miller"
    Cc: Andrea Arcangeli
    Cc: Heiko Carstens
    Cc: Ingo Molnar
    Cc: Jeff Dike
    Cc: Martin Schwidefsky
    Cc: Mel Gorman
    Cc: Russell King
    Cc: Sergey Senozhatsky
    Cc: Tony Luck
    Cc: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nadav Amit
     
  • Pull PCI fix from Bjorn Helgaas:
    "Work around Renesas uPD72020x 32-bit DMA issue"

    * tag 'pci-v4.13-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
    xhci: Reset Renesas uPD72020x USB controller for 32-bit DMA issue
    PCI: Add pci_reset_function_locked()

    Linus Torvalds
     

10 Aug, 2017

6 commits

  • At queue creation, the transport allocates a local job struct
    (struct nvmet_fc_fcp_iod) for each possible element of the queue.
    When a new CMD is received from the wire, a job struct is allocated
    from the queue and then used for the duration of the command.
    The job struct contains buffer space for the wire command iu. Thus,
    upon allocation of the job struct, the cmd iu buffer is copied to
    the job struct and the LLDD may immediately free/reuse the CMD IU
    buffer passed in the call.

    However, in some circumstances, due to the packetized nature of FC
    and the API of the FC LLDD, the LLDD may issue a hw command to send
    the wire response but not get the hw completion for that command,
    and upcall the nvmet_fc layer, before a new command is
    asynchronously received on the wire. In other words, it's possible
    for the initiator to get the response from the wire, thus believe a
    command slot is free, and send a new command iu. The new command iu
    may be received by the LLDD and passed to the transport before the
    LLDD has serviced the hw completion and made the teardown calls for
    the original job struct. As such, there is no job struct
    available for the new io. I.e. it appears as if the host sent more
    queue elements than the queue size. It didn't, based on its
    understanding.

    Rather than treat this as a hard connection failure, queue the new
    request until a job struct does free up. As the buffer isn't
    copied, because there's no job struct, a special return value must be
    returned to the LLDD to tell it to hold off on recycling the cmd
    iu buffer. Later, when a job struct is allocated and the
    buffer copied, a new LLDD callback is introduced to notify the
    LLDD and allow it to recycle its command iu buffer.
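
    The deferral scheme can be sketched as a small model. Everything here is
    hypothetical (the ACCEPTED/DEFERRED values, the class, and the callback
    name are illustrative and are not the nvmet_fc API); it only shows the
    ordering of copy, deferral, and buffer release described above.

```python
from collections import deque

ACCEPTED = "accepted"  # buffer was copied; the LLDD may reuse it immediately
DEFERRED = "deferred"  # hypothetical special value: the LLDD must keep the buffer


class TargetQueue:
    def __init__(self, size, release_buffer):
        self.free_jobs = size                 # job structs allocated at queue creation
        self.deferred = deque()               # commands waiting for a job struct
        self.release_buffer = release_buffer  # hypothetical LLDD callback
        self.started = []                     # copies of the command IUs being processed

    def receive_cmd(self, cmd_buf):
        if self.free_jobs > 0:
            self.free_jobs -= 1
            self.started.append(bytes(cmd_buf))  # copy into the job struct
            return ACCEPTED
        # No job struct available: keep a reference and ask the LLDD to hold on.
        self.deferred.append(cmd_buf)
        return DEFERRED

    def job_done(self):
        if self.deferred:
            # Hand the freed job struct straight to the oldest deferred command.
            cmd_buf = self.deferred.popleft()
            self.started.append(bytes(cmd_buf))
            self.release_buffer(cmd_buf)  # now the LLDD may recycle its buffer
        else:
            self.free_jobs += 1


released = []
queue = TargetQueue(size=1, release_buffer=released.append)
first = queue.receive_cmd(bytearray(b"cmd1"))
second_buf = bytearray(b"cmd2")
second = queue.receive_cmd(second_buf)
released_before = list(released)
queue.job_done()  # teardown of the first command frees its job struct
```

    The key property is that a deferred buffer is released to the LLDD only
    after its contents have been copied into a job struct.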

    Signed-off-by: James Smart
    Reviewed-by: Johannes Thumshirn
    Signed-off-by: Christoph Hellwig

    James Smart
     
  • Core Changes:
    - dma-buf: Allow multiple sync_files to wrap a single dma-fence (Chris)

    Driver Changes:
    - rockchip: misc fixes to vop driver from the downstream rockchip tree (Mark)
    - Error path cleanups to tc358767 & host1x (Lucas & Paul, respectively)

    * tag 'drm-misc-fixes-2017-08-08' of git://anongit.freedesktop.org/git/drm-misc:
    drm/rockchip: vop: report error when check resource error
    drm/rockchip: vop: round_up pitches to word align
    drm/rockchip: vop: fix NV12 video display error
    drm/rockchip: vop: fix iommu page fault when resume
    dma-buf/sync_file: Allow multiple sync_files to wrap a single dma-fence
    drm/bridge: tc358767: fix probe without attached output node

    Dave Airlie
     
  • Bunch of msm fixes for 4.13

    * 'msm-fixes-4.13-rc3' of git://people.freedesktop.org/~robclark/linux:
    drm/msm: gpu: don't abuse dma_alloc for non-DMA allocations
    drm/msm: gpu: call qcom_mdt interfaces only for ARCH_QCOM
    drm/msm/adreno: Prevent unclocked access when retrieving timestamps
    drm/msm: Remove __user from __u64 data types
    drm/msm: args->fence should be args->flags
    drm/msm: Turn off hardware clock gating before reading A5XX registers
    drm/msm: Allow hardware clock gating to be toggled
    drm/msm: Remove some potentially blocked register ranges
    drm/msm/mdp5: Drop clock names with "_clk" suffix
    drm/msm/mdp5: Fix typo in encoder_enable path
    drm/msm: NULL pointer dereference in drivers/gpu/drm/msm/msm_gem_vma.c
    drm/msm: fix WARN_ON in add_vma() with no iommu
    drm/msm/dsi: Calculate link clock rates with updated dsi->lanes
    drm/msm/mdp5: fix unclocked register access in _cursor_set()
    drm/msm: unlock on error in msm_gem_get_iova()
    drm/msm: fix an integer overflow test
    drm/msm/mdp5: Fix compilation warnings

    Dave Airlie
     
  • Pull pin control fixes from Linus Walleij:
    "These are the pin control fixes I have gathered since the return from
    my vacation. They boiled in -next a while so let's get them in.

    Apart from the documentation build it is purely driver fixes. Which is
    nice. The Intel fixes seem kind of important.

    - Fix the documentation build as the docs were moved

    - Correct the UART pin list on the Intel Merrifield

    - Fix pin assignment and number of pins on the Marvell Armada 37xx
    pin controller

    - Cover the Setzer models in the Chromebook DMI quirk in the Intel
    cherryview driver so they start working

    - Add the missing "sim" function to the sunxi driver

    - Fix USB pin definitions on Uniphier Pro4

    - Smatch fix for invalid reference in the zx pin control driver"

    * tag 'pinctrl-v4.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
    pinctrl: generic: update references to Documentation/pinctrl.txt
    pinctrl: intel: merrifield: Correct UART pin lists
    pinctrl: armada-37xx: Fix number of pin in south bridge
    pinctrl: armada-37xx: Fix the pin 23 on south bridge
    pinctrl: cherryview: Add Setzer models to the Chromebook DMI quirk
    pinctrl: sunxi: add a missing function of A10/A20 pinctrl driver
    pinctrl: uniphier: fix USB3 pin assignment for Pro4
    pinctrl: zte: fix dereference of 'data' in zx_set_mux()

    Linus Torvalds
     
  • Pull i2c fixes from Wolfram Sang:
    "The main thing is to allow empty id_tables for ACPI to make some
    drivers get probed again. It looks a bit bigger than usual because it
    needs some internal renaming, too.

    Other than that, there is a fix for broken DSDTs, a super simple
    enablement for ARM MPS, and two documentation fixes which I'd like to
    see in v4.13 already"

    * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
    i2c: rephrase explanation of I2C_CLASS_DEPRECATED
    i2c: allow i2c-versatile for ARM MPS platforms
    i2c: designware: Some broken DSTDs use 1MiHz instead of 1MHz
    i2c: designware: Print clock freq on invalid clock freq error
    i2c: core: Allow empty id_table in ACPI case as well
    i2c: mux: pinctrl: mention correct module name in Kconfig help text

    Linus Torvalds
     
  • Pull networking fixes from David Miller:
    "The pull requests are getting smaller, that's progress I suppose :-)

    1) Fix infinite loop in CIPSO option parsing, from Yujuan Qi.

    2) Fix remote checksum handling in VXLAN and GUE tunneling drivers,
    from Koichiro Den.

    3) Missing u64_stats_init() calls in several drivers, from Florian
    Fainelli.

    4) TCP can set the congestion window to an invalid ssthresh value
    after congestion window reductions, from Yuchung Cheng.

    5) Fix BPF jit branch generation on s390, from Daniel Borkmann.

    6) Correct MIPS ebpf JIT merge, from David Daney.

    7) Correct byte order test in BPF test_verifier.c, from Daniel
    Borkmann.

    8) Fix various crashes and leaks in ASIX driver, from Dean Jenkins.

    9) Handle SCTP checksums properly in mlx4 driver, from Davide
    Caratti.

    10) We can potentially enter tcp_connect() with a cached route
    already, due to fastopen, so we have to explicitly invalidate it.

    11) skb_warn_bad_offload() can bark in legitimate situations, fix from
    Willem de Bruijn"

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (52 commits)
    net: avoid skb_warn_bad_offload false positives on UFO
    qmi_wwan: fix NULL deref on disconnect
    ppp: fix xmit recursion detection on ppp channels
    rds: Reintroduce statistics counting
    tcp: fastopen: tcp_connect() must refresh the route
    net: sched: set xt_tgchk_param par.net properly in ipt_init_target
    net: dsa: mediatek: add adjust link support for user ports
    net/mlx4_en: don't set CHECKSUM_COMPLETE on SCTP packets
    qed: Fix a memory allocation failure test in 'qed_mcp_cmd_init()'
    hysdn: fix to a race condition in put_log_buffer
    s390/qeth: fix L3 next-hop in xmit qeth hdr
    asix: Fix small memory leak in ax88772_unbind()
    asix: Ensure asix_rx_fixup_info members are all reset
    asix: Add rx->ax_skb = NULL after usbnet_skb_return()
    bpf: fix selftest/bpf/test_pkt_md_access on s390x
    netvsc: fix race on sub channel creation
    bpf: fix byte order test in test_verifier
    xgene: Always get clk source, but ignore if it's missing for SGMII ports
    MIPS: Add missing file for eBPF JIT.
    bpf, s390: fix build for libbpf and selftest suite
    ...

    Linus Torvalds
     

09 Aug, 2017

3 commits

  • Some drivers handle rx buffer reordering internally (and by extension
    also handle the rx ba session timer internally), but do not offload the
    addba/delba negotiation.
    Add an api for these drivers to properly tear-down the ba session,
    including sending a delba.

    Signed-off-by: Naftali Goldstein
    Signed-off-by: Luca Coelho

    Naftali Goldstein
     
  • Pull rdma fixes from Doug Ledford:
    "Third set of -rc fixes for 4.13 cycle

    - small set of miscellaneous fixes

    - a reasonably sizable set of IPoIB fixes that deal with multiple
    long standing issues"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
    IB/hns: checking for IS_ERR() instead of NULL
    RDMA/mlx5: Fix existence check for extended address vector
    IB/uverbs: Fix device cleanup
    RDMA/uverbs: Prevent leak of reserved field
    IB/core: Fix race condition in resolving IP to MAC
    IB/ipoib: Notify on modify QP failure only when relevant
    Revert "IB/core: Allow QP state transition from reset to error"
    IB/ipoib: Remove double pointer assigning
    IB/ipoib: Clean error paths in add port
    IB/ipoib: Add get statistics support to SRIOV VF
    IB/ipoib: Add multicast packets statistics
    IB/ipoib: Set IPOIB_NEIGH_TBL_FLUSH after flushed completion initialization
    IB/ipoib: Prevent setting negative values to max_nonsrq_conn_qp
    IB/ipoib: Make sure no in-flight joins while leaving that mcast
    IB/ipoib: Use cancel_delayed_work_sync when needed
    IB/ipoib: Fix race between light events and interface restart

    Linus Torvalds
     
  • Pull SCSI fixes from James Bottomley:
    "Two small fixes, one re-fix of a previous fix and five patches sorting
    out hotplug in the bnx2X class of drivers. The latter is rather
    involved, but necessary because these drivers have started dropping
    lockdep recursion warnings on the hotplug lock because of its
    conversion to a percpu rwsem"

    * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
    scsi: sg: only check for dxfer_len greater than 256M
    scsi: aacraid: reading out of bounds
    scsi: qedf: Limit number of CQs
    scsi: bnx2i: Simplify cpu hotplug code
    scsi: bnx2fc: Simplify CPU hotplug code
    scsi: bnx2i: Prevent recursive cpuhotplug locking
    scsi: bnx2fc: Prevent recursive cpuhotplug locking
    scsi: bnx2fc: Plug CPU hotplug race

    Linus Torvalds
     

08 Aug, 2017

1 commit

  • Pull MTD fixes from Brian Norris:
    "I missed getting these out for rc4, but here are some MTD fixes.

    Just NAND fixes (in both the core handling, and a few drivers). Notes
    stolen from Boris:

    Core fixes:

    - fix data interface setup for ONFI NANDs that do not support the SET
    FEATURES command

    - fix a kernel doc header

    - fix potential integer overflow when retrieving timing information
    from the parameter page

    - fix wrong OOB layout for small page NANDs

    Driver fixes:

    - fix potential division-by-zero bug

    - fix backward compat with old atmel-nand DT bindings

    - fix ->setup_data_interface() in the atmel NAND driver"

    * tag 'for-linus-20170807' of git://git.infradead.org/linux-mtd:
    mtd: nand: atmel: Fix EDO mode check
    mtd: nand: Declare tBERS, tR and tPROG as u64 to avoid integer overflow
    mtd: nand: Fix timing setup for NANDs that do not support SET FEATURES
    mtd: nand: Fix a docs build warning
    mtd: nand: sunxi: fix potential divide-by-zero error
    nand: fix wrong default oob layout for small pages using soft ecc
    mtd: nand: atmel: Fix DT backward compatibility in pmecc.c

    Linus Torvalds
     

07 Aug, 2017

3 commits

  • Update deprecated references to Documentation/pinctrl.txt since it has been
    moved to Documentation/driver-api/pinctl.rst.

    Signed-off-by: Ludovic Desroches
    Fixes: 5a9b73832e9e ("pinctrl.txt: move it to the driver-api book")
    Signed-off-by: Linus Walleij

    Ludovic Desroches
     
  • This patch fixes a bug associated with iscsit_reset_np_thread()
    that can occur during parallel configfs rmdir of a single iscsi_np
    used across multiple iscsi-target instances, that would result in
    hung task(s) similar to below where configfs rmdir process context
    was blocked indefinitely waiting for iscsi_np->np_restart_comp
    to finish:

    [ 6726.112076] INFO: task dcp_proxy_node_:15550 blocked for more than 120 seconds.
    [ 6726.119440] Tainted: G W O 4.1.26-3321 #2
    [ 6726.125045] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    [ 6726.132927] dcp_proxy_node_ D ffff8803f202bc88 0 15550 1 0x00000000
    [ 6726.140058] ffff8803f202bc88 ffff88085c64d960 ffff88083b3b1ad0 ffff88087fffeb08
    [ 6726.147593] ffff8803f202c000 7fffffffffffffff ffff88083f459c28 ffff88083b3b1ad0
    [ 6726.155132] ffff88035373c100 ffff8803f202bca8 ffffffff8168ced2 ffff8803f202bcb8
    [ 6726.162667] Call Trace:
    [ 6726.165150] [] schedule+0x32/0x80
    [ 6726.170156] [] schedule_timeout+0x214/0x290
    [ 6726.176030] [] ? __send_signal+0x52/0x4a0
    [ 6726.181728] [] wait_for_completion+0x96/0x100
    [ 6726.187774] [] ? wake_up_state+0x10/0x10
    [ 6726.193395] [] iscsit_reset_np_thread+0x62/0xe0 [iscsi_target_mod]
    [ 6726.201278] [] iscsit_tpg_disable_portal_group+0x96/0x190 [iscsi_target_mod]
    [ 6726.210033] [] lio_target_tpg_store_enable+0x4f/0xc0 [iscsi_target_mod]
    [ 6726.218351] [] configfs_write_file+0xaa/0x110
    [ 6726.224392] [] vfs_write+0xa4/0x1b0
    [ 6726.229576] [] SyS_write+0x41/0xb0
    [ 6726.234659] [] system_call_fastpath+0x12/0x71

    It would happen because each iscsit_reset_np_thread() sets state
    to ISCSI_NP_THREAD_RESET, sends SIGINT, and then blocks waiting
    for completion on iscsi_np->np_restart_comp.

    However, if iscsi_np was active processing a login request and
    more than a single iscsit_reset_np_thread() caller to the same
    iscsi_np was blocked on iscsi_np->np_restart_comp, iscsi_np
    kthread process context in __iscsi_target_login_thread() would
    flush pending signals and only perform a single completion of
    np->np_restart_comp before going back to sleep within transport
    specific iscsit_transport->iscsi_accept_np code.

    To address this bug, add an iscsi_np->np_reset_count and update
    __iscsi_target_login_thread() to keep completing np->np_restart_comp
    until ->np_reset_count has reached zero.
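
    The difference between a single completion and a counted one can be
    shown with a tiny model. This is a sketch under stated assumptions: the
    function names are hypothetical and the blocked callers are represented
    by plain integers rather than real completions.

```python
def single_completion(waiters):
    """Old behaviour: signals are flushed and one completion is issued."""
    woken = 1 if waiters else 0
    return waiters - woken  # callers still blocked on np_restart_comp


def counted_completion(reset_count):
    """New behaviour: keep completing until the reset count reaches zero."""
    while reset_count > 0:
        reset_count -= 1  # one completion per recorded reset request
    return reset_count    # callers still blocked


# Three parallel rmdir callers reset the same iscsi_np.
still_blocked_old = single_completion(3)
still_blocked_new = counted_completion(3)
```

    With three concurrent callers the single-completion model leaves two of
    them blocked, which corresponds to the hung tasks in the trace above.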

    Reported-by: Gary Guo
    Tested-by: Gary Guo
    Cc: Mike Christie
    Cc: Hannes Reinecke
    Cc: stable@vger.kernel.org # 3.10+
    Signed-off-by: Nicholas Bellinger

    Nicholas Bellinger
     
  • Pull ext4 fixes from Ted Ts'o:
    "A large number of ext4 bug fixes and cleanups for v4.13"

    * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
    ext4: fix copy paste error in ext4_swap_extents()
    ext4: fix overflow caused by missing cast in ext4_resize_fs()
    ext4, project: expand inode extra size if possible
    ext4: cleanup ext4_expand_extra_isize_ea()
    ext4: restructure ext4_expand_extra_isize
    ext4: fix forgetten xattr lock protection in ext4_expand_extra_isize
    ext4: make xattr inode reads faster
    ext4: inplace xattr block update fails to deduplicate blocks
    ext4: remove unused mode parameter
    ext4: fix warning about stack corruption
    ext4: fix dir_nlink behaviour
    ext4: silence array overflow warning
    ext4: fix SEEK_HOLE/SEEK_DATA for blocksize < pagesize
    ext4: release discard bio after sending discard commands
    ext4: convert swap_inode_data() over to use swap() on most of the fields
    ext4: error should be cleared if ea_inode isn't added to the cache
    ext4: Don't clear SGID when inheriting ACLs
    ext4: preserve i_mode if __ext4_set_acl() fails
    ext4: remove unused metadata accounting variables
    ext4: correct comment references to ext4_ext_direct_IO()

    Linus Torvalds
     

06 Aug, 2017

1 commit

  • Pull media fixes from Mauro Carvalho Chehab:
    "This series is larger than I would like to submit for -rc4. My
    original intent was to send it for either -rc2 or -rc3. Unfortunately,
    due to my vacations, I got a lot of pending stuff after my return, and
    had to do some biz trips, which prevented me from sending this earlier.

    Several fixes:

    - some fixes at atomisp staging driver

    - several gcc 7 warning fixes

    - cleanup media SVG files, in order to fix PDF build on some distros

    - fix random Kconfig build of venus driver

    - some fixes for the venus driver

    - some changes from semaphore to mutex in ngene's driver

    - some locking fixes at dib0700 driver

    - several fixes on ngene's driver and frontends to make it properly
    support some new boards added on Kernel 4.13

    - some fixes to CEC drivers

    - omap_vout: vrfb: convert to dmaengine

    - docs-rst: document EBUSY for VIDIOC_S_FMT

    Please notice that the big diffstat changes here are at the SVG files.

    Visually, the images look the same, but the file size is now a lot
    smaller than before, and they don't use some XML tags that would cause
    them to be badly parsed by some ImageMagick versions, or to require a
    lot of memory by TeTex, which would break PDF output on some
    distributions"

    * tag 'media/v4.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (68 commits)
    media: atomisp2: array underflow in imx_enum_frame_size()
    media: atomisp2: array underflow in ap1302_enum_frame_size()
    media: atomisp2: Array underflow in atomisp_enum_input()
    media: platform: davinci: drop VPFE_CMD_S_CCDC_RAW_PARAMS
    media: platform: davinci: return -EINVAL for VPFE_CMD_S_CCDC_RAW_PARAMS ioctl
    media: venus: don't abuse dma_alloc for non-DMA allocations
    media: venus: hfi: fix error handling in hfi_sys_init_done()
    media: venus: fix compile-test build on non-qcom ARM platform
    media: venus: mark PM functions as __maybe_unused
    media: cec-notifier: small improvements
    media: pulse8-cec: persistent_config should be off by default
    media: cec: cec_transmit_attempt_done: ignore CEC_TX_STATUS_MAX_RETRIES
    media: staging: atomisp: array underflow in ioctl
    media: lirc: LIRC_GET_REC_RESOLUTION should return microseconds
    media: svg: avoid too long lines
    media: svg files: simplify files
    media: selection.svg: simplify the SVG file
    media: vimc: set id_table for platform drivers
    media: staging: atomisp: disable warnings with cc-disable-warning
    media: davinci: variable 'common' set but not used
    ...

    Linus Torvalds
     

05 Aug, 2017

1 commit

  • Pull KVM fixes from Radim Krčmář:
    "ARM:

    - Yet another race with VM destruction plugged

    - A set of small vgic fixes

    x86:

    - Preserve pending INIT

    - RCU fixes in paravirtual async pf, VM teardown, and VMXOFF
    emulation

    - nVMX interrupt injection and dirty tracking fixes

    - initialize to make UBSAN happy"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
    KVM: arm/arm64: vgic: Use READ_ONCE fo cmpxchg
    KVM: nVMX: Fix interrupt window request with "Acknowledge interrupt on exit"
    KVM: nVMX: mark vmcs12 pages dirty on L2 exit
    kvm: nVMX: don't flush VMCS12 during VMXOFF or VCPU teardown
    KVM: nVMX: do not pin the VMCS12
    KVM: avoid using rcu_dereference_protected
    KVM: X86: init irq->level in kvm_pv_kick_cpu_op
    KVM: X86: Fix loss of pending INIT due to race
    KVM: async_pf: make rcu irq exit if not triggered from idle task
    KVM: nVMX: fixes to nested virt interrupt injection
    KVM: nVMX: do not fill vm_exit_intr_error_code in prepare_vmcs12
    KVM: arm/arm64: Handle hva aging while destroying the vm
    KVM: arm/arm64: PMU: Fix overflow interrupt injection
    KVM: arm/arm64: Fix bug in advertising KVM_CAP_MSI_DEVID capability

    Linus Torvalds