11 May, 2016

1 commit

  • commit 72676bb53f33fd0ef3a1484fc1ecfd306dc6ff40 upstream.

    Recently added commit 564b026fbd0d ("string_helpers: fix precision loss
    for some inputs") fixed precision issues for string_get_size() and broke
    tests.

    Fix and improve them: test both STRING_UNITS_2 and STRING_UNITS_10 at the
    same time, report failures better, and test small and huge values.

    Fixes: 564b026fbd0d28e9 ("string_helpers: fix precision loss for some inputs")
    Signed-off-by: Vitaly Kuznetsov
    Cc: Andy Shevchenko
    Cc: Rasmus Villemoes
    Cc: James Bottomley
    Cc: "James E.J. Bottomley"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Vitaly Kuznetsov
     

05 May, 2016

3 commits

  • commit 3ee0cb5fb5eea2110db1b5cb7f67029b7be8a376 upstream.

    The limbs are integers in the host endianness, so we can't simply
    iterate over the individual bytes. The current code happens to work on
    little-endian, because the order of the limbs in the MPI array is the
    same as the order of the bytes in each limb, but it breaks on
    big-endian.

    Fixes: 0f74fbf77d45 ("MPI: Fix mpi_read_buffer")
    Signed-off-by: Michal Marek
    Signed-off-by: Herbert Xu
    Signed-off-by: Greg Kroah-Hartman

    Michal Marek
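
The endianness point above can be illustrated with a small userspace sketch. It is not the kernel's mpi_read_buffer(); the limb type and the name limbs_to_be_bytes() are assumptions made for illustration, and the limb array is assumed to be stored least significant limb first. Each byte is extracted with a shift instead of by walking the limb's bytes in memory, so the output is the same on little-endian and big-endian hosts.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limb type for this sketch; the kernel uses its own mpi_limb_t. */
typedef uint64_t limb_t;

/*
 * Write nlimbs limbs (assumed least significant limb first) to out as a
 * big-endian byte string. Bytes are taken from each limb with shifts, so
 * the result does not depend on how the host lays out a limb in memory.
 * out must have room for nlimbs * sizeof(limb_t) bytes.
 */
static void limbs_to_be_bytes(const limb_t *limbs, size_t nlimbs,
                              unsigned char *out)
{
    size_t pos = 0;

    /* Most significant limb first. */
    for (size_t i = nlimbs; i-- > 0; ) {
        /* Most significant byte of the limb first. */
        for (int shift = (int)(sizeof(limb_t) * 8) - 8; shift >= 0; shift -= 8)
            out[pos++] = (unsigned char)(limbs[i] >> shift);
    }
}
```

Casting the limb array to a byte pointer and iterating would give this order only on a little-endian host, and only when read backwards, which is the accident the commit message describes.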
     
  • commit 3e26a691fe3fe1e02a76e5bab0c143ace4b137b4 upstream.

    Based on Sergey's test patch [1], this fixes zram with lz4 compression
    on big endian cpus.

    Note that the 64-bit preprocessor test is not a cleanup, it's part of
    the fix, since those identifiers are bogus (for example, __ppc64__
    isn't defined anywhere else in the kernel, which means we'd fall into
    the 32-bit definitions on ppc64).

    Tested on ppc64 with no regression on x86_64.

    [1] http://marc.info/?l=linux-kernel&m=145994470805853&w=4

    Suggested-by: Sergey Senozhatsky
    Signed-off-by: Rui Salvaterra
    Reviewed-by: Sergey Senozhatsky
    Signed-off-by: Greg Kroah-Hartman

    Rui Salvaterra
     
  • commit 8d4a2ec1e0b41b0cf9a0c5cd4511da7f8e4f3de2 upstream.

    Changes since V1: fixed the description and added KASan warning.

    In assoc_array_insert_into_terminal_node(), we call the
    compare_object() method on all non-empty slots, even when they're
    not leaves, passing a pointer to an unexpected structure to
    compare_object(). Currently it causes an out-of-bounds read access
    in keyring_compare_object() detected by KASan (see below). The issue
    is easily reproduced with the keyutils testsuite.
    Only call compare_object() when the slot is a leaf.

    KASan warning:
    ==================================================================
    BUG: KASAN: slab-out-of-bounds in keyring_compare_object+0x213/0x240 at addr ffff880060a6f838
    Read of size 8 by task keyctl/1655
    =============================================================================
    BUG kmalloc-192 (Not tainted): kasan: bad access detected
    -----------------------------------------------------------------------------

    Disabling lock debugging due to kernel taint
    INFO: Allocated in assoc_array_insert+0xfd0/0x3a60 age=69 cpu=1 pid=1647
    ___slab_alloc+0x563/0x5c0
    __slab_alloc+0x51/0x90
    kmem_cache_alloc_trace+0x263/0x300
    assoc_array_insert+0xfd0/0x3a60
    __key_link_begin+0xfc/0x270
    key_create_or_update+0x459/0xaf0
    SyS_add_key+0x1ba/0x350
    entry_SYSCALL_64_fastpath+0x12/0x76
    INFO: Slab 0xffffea0001829b80 objects=16 used=8 fp=0xffff880060a6f550 flags=0x3fff8000004080
    INFO: Object 0xffff880060a6f740 @offset=5952 fp=0xffff880060a6e5d1

    Bytes b4 ffff880060a6f730: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    Object ffff880060a6f740: d1 e5 a6 60 00 88 ff ff 0e 00 00 00 00 00 00 00 ...`............
    Object ffff880060a6f750: 02 cf 8e 60 00 88 ff ff 02 c0 8e 60 00 88 ff ff ...`.......`....
    Object ffff880060a6f760: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    Object ffff880060a6f770: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    Object ffff880060a6f780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    Object ffff880060a6f790: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    Object ffff880060a6f7a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    Object ffff880060a6f7b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    Object ffff880060a6f7c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    Object ffff880060a6f7d0: 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    Object ffff880060a6f7e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    Object ffff880060a6f7f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    CPU: 0 PID: 1655 Comm: keyctl Tainted: G B 4.5.0-rc4-kasan+ #291
    Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
    0000000000000000 000000001b2800b4 ffff880060a179e0 ffffffff81b60491
    ffff88006c802900 ffff880060a6f740 ffff880060a17a10 ffffffff815e2969
    ffff88006c802900 ffffea0001829b80 ffff880060a6f740 ffff880060a6e650
    Call Trace:
    [] dump_stack+0x85/0xc4
    [] print_trailer+0xf9/0x150
    [] object_err+0x34/0x40
    [] kasan_report_error+0x230/0x550
    [] ? keyring_get_key_chunk+0x13e/0x210
    [] __asan_report_load_n_noabort+0x5d/0x70
    [] ? keyring_compare_object+0x213/0x240
    [] keyring_compare_object+0x213/0x240
    [] assoc_array_insert+0x86c/0x3a60
    [] ? assoc_array_cancel_edit+0x70/0x70
    [] ? __key_link_begin+0x20d/0x270
    [] __key_link_begin+0xfc/0x270
    [] key_create_or_update+0x459/0xaf0
    [] ? trace_hardirqs_on+0xd/0x10
    [] ? key_type_lookup+0xc0/0xc0
    [] ? lookup_user_key+0x13d/0xcd0
    [] ? memdup_user+0x53/0x80
    [] SyS_add_key+0x1ba/0x350
    [] ? key_get_type_from_user.constprop.6+0xa0/0xa0
    [] ? retint_user+0x18/0x23
    [] ? trace_hardirqs_on_caller+0x3fe/0x580
    [] ? trace_hardirqs_on_thunk+0x17/0x19
    [] entry_SYSCALL_64_fastpath+0x12/0x76
    Memory state around the buggy address:
    ffff880060a6f700: fc fc fc fc fc fc fc fc 00 00 00 00 00 00 00 00
    ffff880060a6f780: 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc fc
    >ffff880060a6f800: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    ^
    ffff880060a6f880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    ffff880060a6f900: fc fc fc fc fc fc 00 00 00 00 00 00 00 00 00 00
    ==================================================================

    Signed-off-by: Jerome Marchand
    Signed-off-by: David Howells
    Signed-off-by: Greg Kroah-Hartman

    Jerome Marchand
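
A minimal sketch of the "only compare leaves" rule, in userspace C. The tagging scheme (a set low bit marks a non-leaf slot), the macro SLOT_META_TAG, and the helpers slot_is_leaf(), count_matching_leaves() and int_leaf_equals() are all assumptions for illustration; this is not the assoc_array code itself.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tag: a set low bit marks a slot that points at metadata, not a leaf. */
#define SLOT_META_TAG ((uintptr_t)1)

typedef bool (*compare_fn)(const void *object, const void *key);

/* A slot is a leaf only if it is non-empty and carries no metadata tag. */
static bool slot_is_leaf(const void *slot)
{
    return slot != NULL && ((uintptr_t)slot & SLOT_META_TAG) == 0;
}

/*
 * Count the leaves in slots[] that match key. compare_object() is never
 * handed an empty slot or a tagged (non-leaf) pointer.
 */
static size_t count_matching_leaves(void *const *slots, size_t nslots,
                                    compare_fn compare_object, const void *key)
{
    size_t matches = 0;

    for (size_t i = 0; i < nslots; i++) {
        if (!slot_is_leaf(slots[i]))
            continue;
        if (compare_object(slots[i], key))
            matches++;
    }
    return matches;
}

/* Example comparator for the sketch: leaves are plain ints. */
static bool int_leaf_equals(const void *object, const void *key)
{
    return *(const int *)object == *(const int *)key;
}
```

Without the slot_is_leaf() check, the comparator would interpret a pointer to some other structure as a leaf object, which is the kind of out-of-bounds read the KASan report shows.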
     

04 Mar, 2016

3 commits

  • commit 5b57167749274961baf15ed1f05a4996b3ab0487 upstream.

    The sw842 library code was merged in linux-4.1 and causes a very rare randconfig
    failure when CONFIG_CRC32 is not set:

    lib/built-in.o: In function `sw842_compress':
    oid_registry.c:(.text+0x12ddc): undefined reference to `crc32_be'
    lib/built-in.o: In function `sw842_decompress':
    oid_registry.c:(.text+0x137e4): undefined reference to `crc32_be'

    This adds an explicit 'select CRC32' statement, similar to what the other users
    of the crc32 code have. In practice, CRC32 is always enabled anyway because
    over 100 other symbols select it.

    Signed-off-by: Arnd Bergmann
    Fixes: 2da572c959dd ("lib: add software 842 compression/decompression")
    Acked-by: Dan Streetman
    Signed-off-by: Herbert Xu
    Signed-off-by: Greg Kroah-Hartman

    Arnd Bergmann
     
  • commit a68075908a37850918ad96b056acc9ac4ce1bd90 upstream.

    The comparisons should be >= since 0x800 and 0x80 require an additional bit
    to store.

    For the 3 byte case, the existing shift would drop off 2 more bits than
    intended.

    For the 2 byte case, there should be 5 bits in byte 1, and 6 bits in
    byte 2.

    Signed-off-by: Jason Andryuk
    Reviewed-by: Laszlo Ersek
    Cc: Peter Jones
    Cc: Matthew Garrett
    Cc: "Lee, Chun-Yi"
    Signed-off-by: Matt Fleming
    Signed-off-by: Greg Kroah-Hartman

    Jason Andryuk
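
The bit layout described above can be sketched as a standalone encoder for a single UCS-2 code unit. The function name ucs2_char_to_utf8() is hypothetical and this is not the kernel's ucs2_as_utf8(); it only shows the >= thresholds and the 4+6+6 and 5+6 bit splits under standard UTF-8 rules.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Encode one UCS-2 code unit as UTF-8 into out, which must have room for
 * 3 bytes. Returns the number of bytes written.
 */
static size_t ucs2_char_to_utf8(uint16_t c, unsigned char *out)
{
    if (c >= 0x800) {
        /* 3 bytes: 4 payload bits, then 6, then 6. */
        out[0] = (unsigned char)(0xe0 | (c >> 12));
        out[1] = (unsigned char)(0x80 | ((c >> 6) & 0x3f));
        out[2] = (unsigned char)(0x80 | (c & 0x3f));
        return 3;
    }
    if (c >= 0x80) {
        /* 2 bytes: 5 payload bits in byte 1, 6 in byte 2. */
        out[0] = (unsigned char)(0xc0 | (c >> 6));
        out[1] = (unsigned char)(0x80 | (c & 0x3f));
        return 2;
    }
    /* 1 byte: plain ASCII. */
    out[0] = (unsigned char)c;
    return 1;
}
```

The boundary values are the interesting ones: 0x80 and 0x800 are the first code points that no longer fit in 7 and 11 bits, so a strict > comparison would encode them one byte too short.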
     
  • commit 73500267c930baadadb0d02284909731baf151f7 upstream.

    This adds ucs2_utf8size(), which tells us how big our ucs2 string is in
    bytes, and ucs2_as_utf8(), which translates from ucs2 to utf8.

    Signed-off-by: Peter Jones
    Tested-by: Lee, Chun-Yi
    Acked-by: Matthew Garrett
    Signed-off-by: Matt Fleming
    Signed-off-by: Greg Kroah-Hartman

    Peter Jones
     

26 Feb, 2016

5 commits

  • commit d7ce36924344ace0dbdc855b1206cacc46b36d45 upstream.

    Some servers experienced fatal deadlocks because of a combination of
    bugs, leading to multiple cpus calling dump_stack().

    The checksumming bug was fixed in commit 34ae6a1aa054 ("ipv6: update
    skb->csum when CE mark is propagated").

    The second problem is faulty locking in dump_stack():

    CPU1 runs in process context and calls dump_stack(), grabs dump_lock.

    CPU2 receives a TCP packet under softirq, grabs the socket spinlock, and
    calls dump_stack() from netdev_rx_csum_fault().

    On CPU2, dump_stack() spins on atomic_cmpxchg(&dump_lock, -1, 2), since
    dump_lock is owned by CPU1.

    While dumping its stack, CPU1 is interrupted by a softirq, and happens
    to process a packet for the TCP socket locked by CPU2.

    CPU1 spins forever in spin_lock(): deadlock.

    Stack trace on CPU1 looked like :

    NMI backtrace for cpu 1
    RIP: _raw_spin_lock+0x25/0x30
    ...
    Call Trace:

    tcp_v6_rcv+0x243/0x620
    ip6_input_finish+0x11f/0x330
    ip6_input+0x38/0x40
    ip6_rcv_finish+0x3c/0x90
    ipv6_rcv+0x2a9/0x500
    process_backlog+0x461/0xaa0
    net_rx_action+0x147/0x430
    __do_softirq+0x167/0x2d0
    call_softirq+0x1c/0x30
    do_softirq+0x3f/0x80
    irq_exit+0x6e/0xc0
    smp_call_function_single_interrupt+0x35/0x40
    call_function_single_interrupt+0x6a/0x70

    printk+0x4d/0x4f
    printk_address+0x31/0x33
    print_trace_address+0x33/0x3c
    print_context_stack+0x7f/0x119
    dump_trace+0x26b/0x28e
    show_trace_log_lvl+0x4f/0x5c
    show_stack_log_lvl+0x104/0x113
    show_stack+0x42/0x44
    dump_stack+0x46/0x58
    netdev_rx_csum_fault+0x38/0x3c
    __skb_checksum_complete_head+0x6e/0x80
    __skb_checksum_complete+0x11/0x20
    tcp_rcv_established+0x2bd5/0x2fd0
    tcp_v6_do_rcv+0x13c/0x620
    sk_backlog_rcv+0x15/0x30
    release_sock+0xd2/0x150
    tcp_recvmsg+0x1c1/0xfc0
    inet_recvmsg+0x7d/0x90
    sock_recvmsg+0xaf/0xe0
    ___sys_recvmsg+0x111/0x3b0
    SyS_recvmsg+0x5c/0xb0
    system_call_fastpath+0x16/0x1b

    Fixes: b58d977432c8 ("dump_stack: serialize the output from dump_stack()")
    Signed-off-by: Eric Dumazet
    Cc: Alex Thorlton
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
     
  • commit 46437f9a554fbe3e110580ca08ab703b59f2f95a upstream.

    If the indirect_ptr bit is set on a slot, that indicates we need to redo
    the lookup. Introduce a new function radix_tree_iter_retry() which
    forces the loop to retry the lookup by setting 'slot' to NULL and
    turning the iterator back to point at the problematic entry.

    This is a pretty rare problem to hit at the moment; the lookup has to
    race with a grow of the radix tree from a height of 0. The consequences
    of hitting this race are that gang lookup could return a pointer to a
    radix_tree_node instead of a pointer to whatever the user had inserted
    in the tree.

    Fixes: cebbd29e1c2f ("radix-tree: rewrite gang lookup using iterator")
    Signed-off-by: Matthew Wilcox
    Cc: Hugh Dickins
    Cc: Ohad Ben-Cohen
    Cc: Konstantin Khlebnikov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Matthew Wilcox
     
  • commit ea535e418c01837d07b6c94e817540f50bfdadb0 upstream.

    In include/asm-generic/sections.h:

    /*
    * Usage guidelines:
    * _text, _data: architecture specific, don't use them in
    * arch-independent code
    * [_stext, _etext]: contains .text.* sections, may also contain
    * .rodata.*
    * and/or .init.* sections

    _text is not guaranteed across architectures. Architectures such as ARM
    may reuse parts which are not actually text and erroneously trigger a bug.
    Switch to using _stext which is guaranteed to contain text sections.

    Came out of https://lkml.kernel.org/g/

    Signed-off-by: Laura Abbott
    Reviewed-by: Kees Cook
    Cc: Russell King
    Cc: Arnd Bergmann
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Laura Abbott
     
  • commit 564b026fbd0d28e9f70fb3831293d2922bb7855b upstream.

    It was noticed that we lose precision in the final calculation for some
    inputs. The most egregious example is size=3000 blk_size=1900 in units
    of 10 should yield 5.70 MB but in fact yields 3.00 MB (oops).

    This is because the current algorithm doesn't correctly account for
    all the remainders in the logarithms. Fix this by doing a correct
    calculation in the remainders based on Napier's algorithm.

    Additionally, now we have the correct result, we have to account for
    arithmetic rounding because we're printing 3 digits of precision. This
    means that if the fourth digit is five or greater, we have to round up,
    so add a section to ensure correct rounding. Finally account for all
    possible inputs correctly, including zero for block size.

    Fixes: b9f28d863594c429e1df35a0474d2663ca28b307
    Signed-off-by: James Bottomley
    Reported-by: Vitaly Kuznetsov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    James Bottomley
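
The expected result and the rounding rule can be checked with a simplified userspace sketch. This is not the kernel's string_get_size() and not the remainder-based algorithm the commit describes: it simply forms the full product, which assumes size * blk_size fits in 64 bits, and then rounds half up to three significant digits. The name format_size_dec() is hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Format size * blk_size in decimal (power-of-1000) units with three
 * significant digits. Assumes the product does not overflow 64 bits.
 */
static void format_size_dec(uint64_t size, uint64_t blk_size,
                            char *buf, size_t len)
{
    static const char *const units[] = {
        "B", "kB", "MB", "GB", "TB", "PB", "EB"
    };
    uint64_t bytes = size * blk_size;
    uint64_t rem = 0, milli, step;
    int i = 0;

    /* Scale down by 1000 per unit, keeping the last remainder. */
    while (bytes >= 1000 && i < 6) {
        rem = bytes % 1000;
        bytes /= 1000;
        i++;
    }
    if (i == 0) {
        snprintf(buf, len, "%llu %s", (unsigned long long)bytes, units[0]);
        return;
    }

    /* Value in thousandths of the chosen unit, rounded half up to 3 digits. */
    milli = bytes * 1000 + rem;
    step = milli < 10000 ? 10 : milli < 100000 ? 100 : 1000;
    milli = (milli + step / 2) / step * step;
    if (milli >= 1000000 && i < 6) {
        /* Rounding carried into the next unit, e.g. 999.5 kB becomes 1.00 MB. */
        milli /= 1000;
        i++;
    }

    if (milli < 10000)
        snprintf(buf, len, "%llu.%02llu %s", (unsigned long long)(milli / 1000),
                 (unsigned long long)(milli % 1000 / 10), units[i]);
    else if (milli < 100000)
        snprintf(buf, len, "%llu.%llu %s", (unsigned long long)(milli / 1000),
                 (unsigned long long)(milli % 1000 / 100), units[i]);
    else
        snprintf(buf, len, "%llu %s", (unsigned long long)(milli / 1000), units[i]);
}
```

With size=3000 and blk_size=1900 the product is 5,700,000 bytes, so the sketch produces "5.70 MB", the value the commit message says the fixed code should yield.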
     
  • commit 00cd29b799e3449f0c68b1cc77cd4a5f95b42d17 upstream.

    The starting node for a klist iteration is often passed in from
    somewhere way above the klist infrastructure, meaning there's no
    guarantee the node is still on the list. We've seen this in SCSI where
    we use bus_find_device() to iterate through a list of devices. In the
    face of heavy hotplug activity, the last device returned by
    bus_find_device() can be removed before the next call. This leads to

    Dec 3 13:22:02 localhost kernel: WARNING: CPU: 2 PID: 28073 at include/linux/kref.h:47 klist_iter_init_node+0x3d/0x50()
    Dec 3 13:22:02 localhost kernel: Modules linked in: scsi_debug x86_pkg_temp_thermal kvm_intel kvm irqbypass crc32c_intel joydev iTCO_wdt dcdbas ipmi_devintf acpi_power_meter iTCO_vendor_support ipmi_si imsghandler pcspkr wmi acpi_cpufreq tpm_tis tpm shpchp lpc_ich mfd_core nfsd nfs_acl lockd grace sunrpc tg3 ptp pps_core
    Dec 3 13:22:02 localhost kernel: CPU: 2 PID: 28073 Comm: cat Not tainted 4.4.0-rc1+ #2
    Dec 3 13:22:02 localhost kernel: Hardware name: Dell Inc. PowerEdge R320/08VT7V, BIOS 2.0.22 11/19/2013
    Dec 3 13:22:02 localhost kernel: ffffffff81a20e77 ffff880613acfd18 ffffffff81321eef 0000000000000000
    Dec 3 13:22:02 localhost kernel: ffff880613acfd50 ffffffff8107ca52 ffff88061176b198 0000000000000000
    Dec 3 13:22:02 localhost kernel: ffffffff814542b0 ffff880610cfb100 ffff88061176b198 ffff880613acfd60
    Dec 3 13:22:02 localhost kernel: Call Trace:
    Dec 3 13:22:02 localhost kernel: [] dump_stack+0x44/0x55
    Dec 3 13:22:02 localhost kernel: [] warn_slowpath_common+0x82/0xc0
    Dec 3 13:22:02 localhost kernel: [] ? proc_scsi_show+0x20/0x20
    Dec 3 13:22:02 localhost kernel: [] warn_slowpath_null+0x1a/0x20
    Dec 3 13:22:02 localhost kernel: [] klist_iter_init_node+0x3d/0x50
    Dec 3 13:22:02 localhost kernel: [] bus_find_device+0x51/0xb0
    Dec 3 13:22:02 localhost kernel: [] scsi_seq_next+0x2d/0x40
    [...]

    And an eventual crash. It can actually occur in any hotplug system
    which has a device finder and a starting device.

    We can fix this globally by making sure the starting node for
    klist_iter_init_node() is actually a member of the list before using it
    (and by starting from the beginning if it isn't).

    Reported-by: Ewan D. Milne
    Tested-by: Ewan D. Milne
    Signed-off-by: James Bottomley
    Signed-off-by: Greg Kroah-Hartman

    James Bottomley
     

18 Feb, 2016

1 commit

  • commit fd7f6727102a1ccf6b4c1dfcc631f9b546526b26 upstream.

    I don't think it makes sense for a module to have a soft dependency
    on itself. This seems quite cyclic by nature and I can't see what
    purpose it could serve.

    OTOH libcrc32c calls crypto_alloc_shash("crc32c", 0, 0) so it pretty
    much assumes that some incarnation of the "crc32c" hash algorithm has
    been loaded. Therefore it makes sense to have the soft dependency
    there (as crc-t10dif does.)

    Cc: Tim Chen
    Cc: "David S. Miller"
    Signed-off-by: Jean Delvare
    Signed-off-by: Herbert Xu
    Signed-off-by: Greg Kroah-Hartman

    Jean Delvare
     

19 Dec, 2015

1 commit

  • The commit c6ff5268293ef98e48a99597e765ffc417e39fa5 ("rhashtable:
    Fix walker list corruption") causes a suspicious RCU usage warning
    because we no longer hold ht->mutex when we dereference ht->tbl.

    However, this is a false positive because we now hold ht->lock
    which also guarantees that ht->tbl won't disappear from under us.

    This patch kills the warning by using rcu_dereference_protected.

    Reported-by: kernel test robot
    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     

18 Dec, 2015

2 commits

  • Pull networking fixes from David Miller:

    1) Fix uninitialized variable warnings in nfnetlink_queue, a lot of
    people reported this... From Arnd Bergmann.

    2) Don't init mutex twice in i40e driver, from Jesse Brandeburg.

    3) Fix spurious EBUSY in rhashtable, from Herbert Xu.

    4) Missing DMA unmaps in mvpp2 driver, from Marcin Wojtas.

    5) Fix race with work structure access in pppoe driver causing
    corruptions, from Guillaume Nault.

    6) Fix OOPS due to sh_eth_rx() not checking whether netdev_alloc_skb()
    actually succeeded or not, from Sergei Shtylyov.

    7) Don't lose flags when setting IFA_F_OPTIMISTIC in ipv6 code, from
    Bjørn Mork.

    8) VXLAN_HD_RCO defined incorrectly, fix from Jiri Benc.

    9) Fix clock source used for cookies in SCTP, from Marcelo Ricardo
    Leitner.

    10) aurora driver needs HAS_DMA dependency, from Geert Uytterhoeven.

    11) ndo_fill_metadata_dst op of vxlan has to handle ipv6 tunneling
    properly as well, from Jiri Benc.

    12) Handle request sockets properly in xfrm layer, from Eric Dumazet.

    13) Double stats update in ipv6 geneve transmit path, fix from Pravin B
    Shelar.

    14) sk->sk_policy[] needs RCU protection, and as a result
    xfrm_policy_destroy() needs to free policies using an RCU grace
    period, from Eric Dumazet.

    15) SCTP needs to clone ipv6 tx options in order to avoid use after
    free, from Eric Dumazet.

    16) Missing kbuild export of ila.h, from Stephen Hemminger.

    17) Missing mdiobus_alloc() return value checking in mdio-mux.c, from
    Tobias Klauser.

    18) Validate protocol value range in ->create() methods, from Hannes
    Frederic Sowa.

    19) Fix early socket demux races that result in illegal dst reuse, from
    Eric Dumazet.

    20) Validate socket address length in pptp code, from WANG Cong.

    21) skb_reorder_vlan_header() uses incorrect offset and can corrupt
    packets, from Vlad Yasevich.

    22) Fix memory leaks in nl80211 registry code, from Ola Olsson.

    23) Timeout loop count handling fixes in mISDN, xgbe, qlge, sfc, and
    qlcnic. From Dan Carpenter.

    24) msg.msg_iocb needs to be cleared in recvfrom() otherwise, for
    example, AF_ALG will interpret it as an async call. From Tadeusz
    Struk.

    25) inetpeer_set_addr_v4 forgets to initialize the 'vif' field, from
    Eric Dumazet.

    26) rhashtable enforces the minimum table size not early enough,
    breaking how we calculate the per-cpu lock allocations. From
    Herbert Xu.

    27) Fix FCC port lockup in 82xx driver, from Martin Roth.

    28) FOU sockets need to be freed using RCU, from Hannes Frederic Sowa.

    29) Fix out-of-bounds access in __skb_complete_tx_timestamp() and
    sock_setsockopt() wrt. timestamp handling. From WANG Cong.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (117 commits)
    net: check both type and procotol for tcp sockets
    drivers: net: xgene: fix Tx flow control
    tcp: restore fastopen with no data in SYN packet
    af_unix: Revert 'lock_interruptible' in stream receive code
    fou: clean up socket with kfree_rcu
    82xx: FCC: Fixing a bug causing to FCC port lock-up
    gianfar: Don't enable RX Filer if not supported
    net: fix warnings in 'make htmldocs' by moving macro definition out of field declaration
    rhashtable: Fix walker list corruption
    rhashtable: Enforce minimum size on initial hash table
    inet: tcp: fix inetpeer_set_addr_v4()
    ipv6: automatically enable stable privacy mode if stable_secret set
    net: fix uninitialized variable issue
    bluetooth: Validate socket address length in sco_sock_bind().
    net_sched: make qdisc_tree_decrease_qlen() work for non mq
    ser_gigaset: remove unnecessary kfree() calls from release method
    ser_gigaset: fix deallocation of platform device structure
    ser_gigaset: turn nonsense checks into WARN_ON
    ser_gigaset: fix up NULL checks
    qlcnic: fix a timeout loop
    ...

    Linus Torvalds
     
  • Pull libnvdimm fixes from Dan Williams:

    - Two bug fixes for misuse of PAGE_MASK in scatterlist and dma-debug.
    These are tagged for -stable. The scatterlist impact is potentially
    corrupted dma addresses on HIGHMEM enabled platforms.

    - A minor locking fix for the NFIT hot-add implementation that is new
    in 4.4-rc. This would only trigger in the case a hot-add raced
    driver removal.

    * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
    dma-debug: Fix dma_debug_entry offset calculation
    Revert "scatterlist: use sg_phys()"
    nfit: acpi_nfit_notify(): Do not leave device locked

    Linus Torvalds
     

17 Dec, 2015

2 commits

  • dma-debug uses struct dma_debug_entry to keep track of dma coherent
    memory allocation requests. The virtual address is converted into a pfn
    and an offset. Previously, the offset was calculated using an incorrect
    bit mask. As a result, we saw incorrect error messages from dma-debug
    like the following:

    "DMA-API: exceeded 7 overlapping mappings of cacheline 0x03e00000"

    Cacheline 0x03e00000 does not exist on our platform.

    Fixes: 0abdd7a81b7e ("dma-debug: introduce debug_dma_assert_idle()")
    Signed-off-by: Daniel Mentz
    Signed-off-by: Dan Williams

    Daniel Mentz
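
The pfn/offset split can be sketched in userspace C. The SKETCH_* macros and helper names are hypothetical, and a 4 KiB page size is assumed; the point is only that a page mask keeps the high bits, so the in-page offset needs the complement of that mask.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's PAGE_SHIFT/PAGE_SIZE/PAGE_MASK (4 KiB pages assumed). */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((uintptr_t)1 << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGE_MASK  (~(SKETCH_PAGE_SIZE - 1))

/* Page frame number: the address with the in-page bits shifted away. */
static uintptr_t page_frame(uintptr_t addr)
{
    return addr >> SKETCH_PAGE_SHIFT;
}

/* Wrong: the page mask selects the page base, not the offset within it. */
static uintptr_t offset_wrong(uintptr_t addr)
{
    return addr & SKETCH_PAGE_MASK;
}

/* Right: the complement of the page mask selects the low, in-page bits. */
static uintptr_t offset_right(uintptr_t addr)
{
    return addr & ~SKETCH_PAGE_MASK;
}
```

For the address 0x12345678, page_frame() gives 0x12345 and offset_right() gives 0x678, while offset_wrong() gives 0x12345000. An offset that large, combined with the pfn, would point far outside the page, which is consistent with the report of a cacheline that does not exist on the platform.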
     
  • The commit ba7c95ea3870fe7b847466d39a049ab6f156aa2c ("rhashtable:
    Fix sleeping inside RCU critical section in walk_stop") introduced
    a new spinlock for the walker list. However, it did not convert
    all existing users of the list over to the new spin lock. Some
    continued to use the old mutex for this purpose. This obviously
    led to corruption of the list.

    The fix is to use the spin lock everywhere where we touch the list.

    This also allows us to do rcu_read_lock before we take the lock in
    rhashtable_walk_start. With the old mutex this would've deadlocked
    but it's safe with the new spin lock.

    Fixes: ba7c95ea3870 ("rhashtable: Fix sleeping inside RCU...")
    Reported-by: Colin Ian King
    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     

16 Dec, 2015

1 commit

  • William Hua wrote:
    >
    > I wasn't aware there was an enforced minimum size. I simply set the
    > nelem_hint in the rhashtable_params struct to 1, expecting it to grow as
    > needed. This caused a segfault afterwards when trying to insert an
    > element.

    OK we're doing the size computation before we enforce the limit
    on min_size.

    ---8<---
    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     

06 Dec, 2015

1 commit


05 Dec, 2015

2 commits

  • When an rhashtable user pounds rhashtable hard with back-to-back
    insertions we may end up growing the table in GFP_ATOMIC context.
    Unfortunately when the table reaches a certain size this often
    fails because we don't have enough physically contiguous pages
    to hold the new table.

    Eric Dumazet suggested (and in fact wrote this patch) using
    __vmalloc instead which can be used in GFP_ATOMIC context.

    Reported-by: Phil Sutter
    Suggested-by: Eric Dumazet
    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • Thomas and Phil observed that under stress rhashtable insertion
    sometimes failed with EBUSY, even though this error should only
    ever be seen when we're under attack and our hash chain length
    has grown to an unacceptable level, even after a rehash.

    It turns out that the logic for detecting whether there is an
    existing rehash is faulty. In particular, when two threads both
    try to grow the same table at the same time, one of them may see
    the newly grown table and thus erroneously conclude that it had
    been rehashed. This is what leads to the EBUSY error.

    This patch fixes this by remembering the current last table we
    used during insertion so that rhashtable_insert_rehash can detect
    when another thread has also done a resize/rehash. When this is
    detected we will give up our resize/rehash and simply retry the
    insertion with the new table.

    Reported-by: Thomas Graf
    Reported-by: Phil Sutter
    Signed-off-by: Herbert Xu
    Tested-by: Phil Sutter
    Signed-off-by: David S. Miller

    Herbert Xu
     

23 Nov, 2015

1 commit

  • There were still a number of references to my old Red Hat email
    address in the kernel source. Remove these while keeping the
    Red Hat copyright notices intact.

    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Linus Torvalds
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Thomas Gleixner
    Cc: Vince Weaver
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

11 Nov, 2015

3 commits

  • Merge final patch-bomb from Andrew Morton:
    "Various leftovers, mainly Christoph's pci_dma_supported() removals"

    * emailed patches from Andrew Morton:
    pci: remove pci_dma_supported
    usbnet: remove ifdefed out call to dma_supported
    kaweth: remove ifdefed out call to dma_supported
    sfc: don't call dma_supported
    nouveau: don't call pci_dma_supported
    netup_unidvb: use pci_set_dma_mask insted of pci_dma_supported
    cx23885: use pci_set_dma_mask insted of pci_dma_supported
    cx25821: use pci_set_dma_mask insted of pci_dma_supported
    cx88: use pci_set_dma_mask insted of pci_dma_supported
    saa7134: use pci_set_dma_mask insted of pci_dma_supported
    saa7164: use pci_set_dma_mask insted of pci_dma_supported
    tw68-core: use pci_set_dma_mask insted of pci_dma_supported
    pcnet32: use pci_set_dma_mask insted of pci_dma_supported
    lib/string.c: add ULL suffix to the constant definition
    hugetlb: trivial comment fix
    selftests/mlock2: add ULL suffix to 64-bit constants
    selftests/mlock2: add missing #define _GNU_SOURCE

    Linus Torvalds
     
  • Pull networking fixes from David Miller:

    1) Fix null deref in xt_TEE netfilter module, from Eric Dumazet.

    2) Several spots need to get to the original listener for SYN-ACK
    packets; most spots got this right but some did not. Whilst covering
    the remaining cases, create a helper to do this. From Eric Dumazet.

    3) Missing check of return value from alloc_netdev() in CAIF SPI code,
    from Rasmus Villemoes.

    4) Don't sleep while != TASK_RUNNING in macvtap, from Vlad Yasevich.

    5) Use after free in mvneta driver, from Justin Maggard.

    6) Fix race on dst->flags access in dst_release(), from Eric Dumazet.

    7) Add missing ZLIB_INFLATE dependency for new qed driver. From Arnd
    Bergmann.

    8) Fix multicast getsockopt deadlock, from WANG Cong.

    9) Fix deadlock in btusb, from Kuba Pawlak.

    10) Some ipv6_add_dev() failure paths were not cleaning up the SNMP6
    counter state. From Sabrina Dubroca.

    11) Fix packet_bind() race, which can cause lost notifications, from
    Francesco Ruggeri.

    12) Fix MAC restoration in qlcnic driver during bonding mode changes,
    from Jarod Wilson.

    13) Revert bridging forward delay change which broke libvirt and other
    userspace things, from Vlad Yasevich.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (65 commits)
    Revert "bridge: Allow forward delay to be cfgd when STP enabled"
    bpf_trace: Make dependent on PERF_EVENTS
    qed: select ZLIB_INFLATE
    net: fix a race in dst_release()
    net: mvneta: Fix memory use after free.
    net: Documentation: Fix default value tcp_limit_output_bytes
    macvtap: Resolve possible __might_sleep warning in macvtap_do_read()
    mvneta: add FIXED_PHY dependency
    net: caif: check return value of alloc_netdev
    net: hisilicon: NET_VENDOR_HISILICON should depend on HAS_DMA
    drivers: net: xgene: fix RGMII 10/100Mb mode
    netfilter: nft_meta: use skb_to_full_sk() helper
    net_sched: em_meta: use skb_to_full_sk() helper
    sched: cls_flow: use skb_to_full_sk() helper
    netfilter: xt_owner: use skb_to_full_sk() helper
    smack: use skb_to_full_sk() helper
    net: add skb_to_full_sk() helper and use it in selinux_netlbl_skbuff_setsid()
    bpf: doc: correct arch list for supported eBPF JIT
    dwc_eth_qos: Delete an unnecessary check before the function call "of_node_put"
    bonding: fix panic on non-ARPHRD_ETHER enslave failure
    ...

    Linus Torvalds
     
  • An 8-byte constant is too big for long, and the compiler complains about it:

    lib/string.c:907:20: warning: constant 0x0101010101010101 is so big it is long

    Append ULL suffix to explicitly show its type.

    Signed-off-by: Andy Shevchenko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andy Shevchenko
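
A constant of this shape is commonly used to replicate one byte across a 64-bit word. The sketch below shows that use with the ULL suffix; the helper name repeat_byte64() is hypothetical, and treating byte replication as the purpose of the constant at lib/string.c:907 is an assumption, since the commit message only quotes the warning.

```c
#include <stdint.h>

/*
 * Replicate one byte into all eight bytes of a 64-bit word. The ULL
 * suffix makes the constant's 64-bit unsigned type explicit instead of
 * leaving the compiler to pick a type wide enough to hold it.
 */
static uint64_t repeat_byte64(uint8_t value)
{
    return (uint64_t)value * 0x0101010101010101ULL;
}
```

For example, repeat_byte64(0xab) evaluates to 0xabababababababab, because each 0x01 byte in the constant contributes one copy of the input byte and no carries occur between byte positions.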
     

10 Nov, 2015

3 commits

  • Merge third patch-bomb from Andrew Morton:
    "We're pretty much done over here - I'm still waiting for a nouveau
    merge so I can cleanly finish up Christoph's dma-mapping rework.

    - bunch of small misc stuff

    - fold abs64() into abs(), remove abs64()

    - new_valid_dev() cleanups

    - binfmt_elf_fdpic feature work"

    * emailed patches from Andrew Morton: (24 commits)
    fs/binfmt_elf_fdpic.c: provide NOMMU loader for regular ELF binaries
    fs/stat.c: remove unnecessary new_valid_dev() check
    fs/reiserfs/namei.c: remove unnecessary new_valid_dev() check
    fs/nilfs2/namei.c: remove unnecessary new_valid_dev() check
    fs/ncpfs/dir.c: remove unnecessary new_valid_dev() check
    fs/jfs: remove unnecessary new_valid_dev() checks
    fs/hpfs/namei.c: remove unnecessary new_valid_dev() check
    fs/f2fs/namei.c: remove unnecessary new_valid_dev() check
    fs/ext2/namei.c: remove unnecessary new_valid_dev() check
    fs/exofs/namei.c: remove unnecessary new_valid_dev() check
    fs/btrfs/inode.c: remove unnecessary new_valid_dev() check
    fs/9p: remove unnecessary new_valid_dev() checks
    include/linux/kdev_t.h: old/new_valid_dev() can return bool
    include/linux/kdev_t.h: remove unused huge_valid_dev()
    kmap_atomic_to_page() has no users, remove it
    drivers/scsi/cxgbi: fix build with EXTRA_CFLAGS
    dma: remove external references to dma_supported
    Documentation/sysctl/vm.txt: fix misleading code reference of overcommit_memory
    remove abs64()
    kernel.h: make abs() work with 64-bit types
    ...

    Linus Torvalds
     
  • Pull module updates from Rusty Russell:
    "Nothing exciting, minor tweaks and cleanups"

    * tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
    scripts: [modpost] add new sections to white list
    modpost: Add flag -E for making section mismatches fatal
    params: don't ignore the rest of cmdline if parse_one() fails
    modpost: abort if a module symbol is too long

    Linus Torvalds
     
  • Switch everything to the new and more capable implementation of abs().
    Mainly to give the new abs() a bit of a workout.

    Cc: Michal Nazarewicz
    Cc: John Stultz
    Cc: Ingo Molnar
    Cc: Steven Rostedt
    Cc: Peter Zijlstra
    Cc: Masami Hiramatsu
    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
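
As a rough userspace analogue of a single abs() that also copes with 64-bit types, the sketch below dispatches on the argument type with C11 `_Generic`. This is an assumption-laden illustration: the macro name `abs_any` is hypothetical, and the kernel's own abs() macro is implemented differently.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Illustrative only: pick the libc function that matches the argument
 * type, so one spelling covers int, long and long long. Unsigned
 * types are not handled specially and fall through to abs().
 */
#define abs_any(x) _Generic((x),	\
	long long: llabs,		\
	long: labs,			\
	default: abs)(x)

/* Small usage example: the 64-bit case no longer needs its own name. */
void abs_any_example(void)
{
	assert(abs_any(-5) == 5);
	assert(abs_any(-5000000000LL) == 5000000000LL);
}
```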
     

08 Nov, 2015

1 commit

  • Merge second patch-bomb from Andrew Morton:

    - most of the rest of MM

    - procfs

    - lib/ updates

    - printk updates

    - bitops infrastructure tweaks

    - checkpatch updates

    - nilfs2 update

    - signals

    - various other misc bits: coredump, seqfile, kexec, pidns, zlib, ipc,
    dma-debug, dma-mapping, ...

    * emailed patches from Andrew Morton: (102 commits)
    ipc,msg: drop dst nil validation in copy_msg
    include/linux/zutil.h: fix usage example of zlib_adler32()
    panic: release stale console lock to always get the logbuf printed out
    dma-debug: check nents in dma_sync_sg*
    dma-mapping: tidy up dma_parms default handling
    pidns: fix set/getpriority and ioprio_set/get in PRIO_USER mode
    kexec: use file name as the output message prefix
    fs, seqfile: always allow oom killer
    seq_file: reuse string_escape_str()
    fs/seq_file: use seq_* helpers in seq_hex_dump()
    coredump: change zap_threads() and zap_process() to use for_each_thread()
    coredump: ensure all coredumping tasks have SIGNAL_GROUP_COREDUMP
    signal: remove jffs2_garbage_collect_thread()->allow_signal(SIGCONT)
    signal: introduce kernel_signal_stop() to fix jffs2_garbage_collect_thread()
    signal: turn dequeue_signal_lock() into kernel_dequeue_signal()
    signals: kill block_all_signals() and unblock_all_signals()
    nilfs2: fix gcc uninitialized-variable warnings in powerpc build
    nilfs2: fix gcc unused-but-set-variable warnings
    MAINTAINERS: nilfs2: add header file for tracing
    nilfs2: add tracepoints for analyzing reading and writing metadata files
    ...

    Linus Torvalds
     

07 Nov, 2015

10 commits

  • Like dma_unmap_sg, dma_sync_sg* should be called with the original number
    of entries passed to dma_map_sg, so do the same check in the sync path as
    we do in the unmap path.

    Signed-off-by: Robin Murphy
    Cc: Arnd Bergmann
    Cc: Marek Szyprowski
    Cc: Sumit Semwal
    Cc: Sakari Ailus
    Cc: Russell King
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Robin Murphy
     
  • There is a classic off-by-one error when we try to place, for
    example, 1+1 bytes as hex in a buffer of size 6. The expected result is
    truncated output, but in reality we get 6 bytes filled, followed by a
    terminating NUL.

    Change the logic of how we fill the output when dumping bytes into
    limited space. This follows the snprintf() behaviour by truncating
    output even on half bytes.

    Fixes: 114fc1afb2de (hexdump: make it return number of bytes placed in buffer)
    Signed-off-by: Andy Shevchenko
    Reported-by: Aaro Koskinen
    Tested-by: Aaro Koskinen
    Cc: Al Viro
    Cc: Catalin Marinas
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andy Shevchenko
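
The snprintf()-style truncation described in this commit can be sketched in userspace as follows. This is not the kernel's hex_dump_to_buffer(); the names `hex_to_buf` and `put_ch` are hypothetical, and the sketch only shows the idea of counting every character while storing only those that fit, so output may be cut in the middle of a byte.

```c
#include <assert.h>
#include <stddef.h>

/* Store c only if a slot remains before the reserved NUL; always count it. */
static void put_ch(char *buf, size_t size, size_t *pos, char c)
{
	if (*pos + 1 < size)
		buf[*pos] = c;
	(*pos)++;
}

/*
 * Hypothetical helper: write src as "xx xx xx" into buf of the given
 * size. Like snprintf(), the result is always NUL-terminated when
 * size > 0, and the return value is the length the full output would
 * have had, so a return value >= size signals truncation.
 */
size_t hex_to_buf(const unsigned char *src, size_t len, char *buf, size_t size)
{
	static const char digits[] = "0123456789abcdef";
	size_t pos = 0;
	size_t i;

	assert(buf != NULL || size == 0);

	for (i = 0; i < len; i++) {
		if (i)
			put_ch(buf, size, &pos, ' ');
		put_ch(buf, size, &pos, digits[src[i] >> 4]);
		put_ch(buf, size, &pos, digits[src[i] & 0x0f]);
	}
	if (size)
		buf[pos < size ? pos : size - 1] = '\0';
	return pos;
}
```

With the three bytes de ad be and a 5-byte buffer, this stores "de a" (cut on a half byte) and returns 8.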
     
  • Change current_is_single_threaded() to use for_each_thread() rather than
    deprecated while_each_thread().

    Signed-off-by: Oleg Nesterov
    Cc: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • Sometimes kobject_set_name_vargs is called with a format string containing
    no %, or a format string of precisely "%s", where the single vararg
    happens to point to .rodata. kvasprintf_const detects these cases for us
    and returns a copy of that pointer instead of duplicating the string, thus
    saving some run-time memory. Otherwise, it falls back to kvasprintf. We
    just need to always deallocate ->name using kfree_const.

    Unfortunately, the dance we need to do to perform the '/' -> '!'
    sanitization makes the resulting code rather ugly.

    I instrumented kstrdup_const to provide some statistics on the memory
    saved, and for me this gave an additional ~14KB after boot (306KB was
    already saved; this patch bumped that to 320KB). I have
    KMALLOC_SHIFT_LOW==3, and since 80% of the kvasprintf_const hits were
    satisfied by an 8-byte allocation, the 14K would roughly be quadrupled
    when KMALLOC_SHIFT_LOW==5. Whether these numbers are sufficient to
    justify the ugliness I'll leave to others to decide.

    Signed-off-by: Rasmus Villemoes
    Cc: Greg KH
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rasmus Villemoes
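
The awkward part mentioned above is that a name which may live in read-only memory cannot be sanitized in place; it has to be duplicated first. A minimal userspace sketch of that step, assuming a hypothetical helper `sanitize_name` and using malloc() in place of the kernel allocators:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: if name contains '/', store a heap copy with
 * every '/' replaced by '!' in *out and return 1. If no change is
 * needed, set *out to NULL and return 0 so the caller can keep the
 * original (possibly read-only) string. Returns -1 if allocation fails.
 */
int sanitize_name(const char *name, char **out)
{
	size_t len;
	char *copy, *p;

	assert(name != NULL && out != NULL);

	*out = NULL;
	if (!strchr(name, '/'))
		return 0;

	len = strlen(name) + 1;
	copy = malloc(len);
	if (!copy)
		return -1;
	memcpy(copy, name, len);

	for (p = copy; *p; p++)
		if (*p == '/')
			*p = '!';

	*out = copy;
	return 1;
}
```

The caller owns the returned copy and must free() it; in the common case without a '/', nothing is allocated.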
     
  • This adds kvasprintf_const which tries to use kstrdup_const if possible:
    If the format string contains no % characters, or if the format string is
    exactly "%s", we delegate to kstrdup_const. Otherwise, we fall back to
    kvasprintf.

    Just as for kstrdup_const, the main motivation is to save memory by
    reusing .rodata when possible.

    The return value should be freed by kfree_const, just like for
    kstrdup_const.

    There is deliberately no kasprintf_const: In the vast majority of cases,
    the format string argument is a literal, so one can determine statically
    whether one could instead use kstrdup_const directly (which would also
    require one to change all corresponding kfree calls to kfree_const).

    Signed-off-by: Rasmus Villemoes
    Cc: Greg KH
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rasmus Villemoes
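
The decision described in this commit message can be sketched as a small classifier. The enum and function names below are hypothetical; the sketch only shows the two cases in which formatting can be skipped, and leaves out the actual calls to kstrdup_const and kvasprintf.

```c
#include <assert.h>
#include <string.h>

enum fmt_kind {
	FMT_LITERAL,		/* no '%' at all: the format string is the result */
	FMT_SINGLE_STRING,	/* exactly "%s": the single vararg is the result */
	FMT_NEEDS_FORMATTING,	/* anything else: fall back to real formatting */
};

/* Hypothetical classifier mirroring the cases listed in the commit message. */
enum fmt_kind classify_fmt(const char *fmt)
{
	assert(fmt != NULL);

	if (!strchr(fmt, '%'))
		return FMT_LITERAL;
	if (!strcmp(fmt, "%s"))
		return FMT_SINGLE_STRING;
	return FMT_NEEDS_FORMATTING;
}
```

In the first two cases the resulting string already exists in memory, which is what makes reusing a .rodata pointer possible.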
     
  • llist_del_first reads entry->next, but it does not acquire visibility
    over the entry node. As a result, it can get a stale value of
    entry->next (e.g. NULL or whatever garbage was there before the
    appending thread wrote the correct value) and then commit that value as
    the llist head with cmpxchg. That will corrupt the llist.

    Note there is a control-dependency between read of head->first and read of
    entry->next, but it does not make the code correct. Kernel memory model
    unambiguously says: "A load-load control dependency requires a full read
    memory barrier".

    Use smp_load_acquire to acquire visibility over the entry node.

    The data race was found with KernelThreadSanitizer (KTSAN).

    Here is an example of KTSAN report:

    ThreadSanitizer: data-race in llist_del_first

    Read of size 1 by thread T389 (K2630, CPU0):
    [] llist_del_first+0x39/0x70 lib/llist.c:74
    [< inlined >] tty_buffer_alloc drivers/tty/tty_buffer.c:181
    [] __tty_buffer_request_room+0xb4/0x250 drivers/tty/tty_buffer.c:292
    [] tty_insert_flip_string_fixed_flag+0x6c/0x150 drivers/tty/tty_buffer.c:337
    [< inlined >] tty_insert_flip_string include/linux/tty_flip.h:35
    [] pty_write+0x72/0xc0 drivers/tty/pty.c:110
    [< inlined >] process_output_block drivers/tty/n_tty.c:611
    [] n_tty_write+0x346/0x7f0 drivers/tty/n_tty.c:2401
    [< inlined >] do_tty_write drivers/tty/tty_io.c:1159
    [] tty_write+0x21f/0x3f0 drivers/tty/tty_io.c:1245
    [] __vfs_write+0x5f/0x1f0 fs/read_write.c:489
    [] vfs_write+0xef/0x280 fs/read_write.c:538
    [< inlined >] SYSC_write fs/read_write.c:585
    [] SyS_write+0x70/0xe0 fs/read_write.c:577
    [] entry_SYSCALL_64_fastpath+0x12/0x71 arch/x86/entry/entry_64.S:186

    Previous write of size 8 by thread T226 (K761, CPU0):
    [] llist_add_batch+0x32/0x70 lib/llist.c:44 (discriminator 16)
    [< inlined >] llist_add include/linux/llist.h:180
    [] tty_buffer_free+0x6c/0xb0 drivers/tty/tty_buffer.c:221
    [] flush_to_ldisc+0x107/0x300 drivers/tty/tty_buffer.c:514
    [] process_one_work+0x47e/0x930 kernel/workqueue.c:2036
    [] worker_thread+0xb0/0x900 kernel/workqueue.c:2170
    [] kthread+0x150/0x170 kernel/kthread.c:209
    [] ret_from_fork+0x3f/0x70 arch/x86/entry/entry_64.S:526

    Signed-off-by: Dmitry Vyukov
    Reviewed-by: Paul E. McKenney
    Cc: Rasmus Villemoes
    Cc: Huang Ying
    Cc: Konstantin Serebryany
    Cc: Andrey Konovalov
    Cc: Alexander Potapenko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dmitry Vyukov
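
A userspace analogue of the fix, written with C11 atomics rather than the kernel primitives, might look like the sketch below. The type and function names are hypothetical. As with llist_del_first, it assumes a single consumer at a time; concurrent poppers would expose it to the ABA problem.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct node {
	struct node *next;
};

struct stack {
	_Atomic(struct node *) first;
};

void stack_init(struct stack *s)
{
	atomic_init(&s->first, NULL);
}

/* Producer: publish n->next with a release CAS on the head. */
void stack_push(struct stack *s, struct node *n)
{
	struct node *old;

	assert(n != NULL);
	old = atomic_load_explicit(&s->first, memory_order_relaxed);
	do {
		n->next = old;
	} while (!atomic_compare_exchange_weak_explicit(&s->first, &old, n,
			memory_order_release, memory_order_relaxed));
}

/*
 * Consumer: the acquire load (and the acquire on CAS failure, which
 * reloads the head) pairs with the producer's release, so the read of
 * entry->next observes the value written before the node was published.
 */
struct node *stack_pop(struct stack *s)
{
	struct node *entry;

	entry = atomic_load_explicit(&s->first, memory_order_acquire);
	while (entry) {
		struct node *next = entry->next;

		if (atomic_compare_exchange_weak_explicit(&s->first, &entry,
				next, memory_order_acquire,
				memory_order_acquire))
			return entry;
	}
	return NULL;
}
```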
     
  • Add a couple of simple tests for string_get_size(). The last one will
    hang the kernel without the 'lib/string_helpers.c: fix infinite loop in
    string_get_size()' fix.

    Signed-off-by: Vitaly Kuznetsov
    Cc: James Bottomley
    Cc: Andy Shevchenko
    Cc: Rasmus Villemoes
    Cc: "K. Y. Srinivasan"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vitaly Kuznetsov
     
  • A rol32() inline function is already provided; let's use it instead of
    an open-coded rotate expression.

    Signed-off-by: Alexander Kuleshov
    Cc: Herbert Xu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexander Kuleshov
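
For readers unfamiliar with the helper, a 32-bit rotate-left can be sketched in userspace as below. The name `my_rol32` is hypothetical and this is not a copy of the kernel's definition; masking both shift counts keeps the expression well defined for a shift of 0.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit word left by shift bits (shift is taken modulo 32). */
static inline uint32_t my_rol32(uint32_t word, unsigned int shift)
{
	return (word << (shift & 31)) | (word >> ((-shift) & 31));
}

/* Usage example: the top bit wraps around into bit 0. */
void my_rol32_example(void)
{
	assert(my_rol32(0x80000001u, 1) == 0x00000003u);
}
```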
     
  • %n is no longer just ignored; it results in early return from vsnprintf.
    Also add a request to add test cases for future %p extensions.

    Signed-off-by: Rasmus Villemoes
    Reviewed-by: Martin Kletzander
    Reviewed-by: Andy Shevchenko
    Cc: Jonathan Corbet
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rasmus Villemoes
     
  • This adds a simple module for testing the kernel's printf facilities.
    Previously, some %p extensions caused a wrong return value when the
    entire output didn't fit, and/or were unusable in kasprintf(). This
    should help catch such issues. Also, it should help ensure that changes
    to the formatting algorithms don't break anything.

    I'm not sure if we have a struct dentry or struct file lying around at
    boot time or if we can fake one, but most %p extensions should be
    testable, as should the ordinary number and string formatting.

    The nature of vararg functions means we can't use a more conventional
    table-driven approach.

    For now, this is mostly a skeleton; contributions are very
    welcome. Some tests are/will be slightly annoying to write, since the
    expected output depends on stuff like CONFIG_*, sizeof(long), runtime
    values etc.

    Signed-off-by: Rasmus Villemoes
    Reviewed-by: Kees Cook
    Cc: Andy Shevchenko
    Cc: Martin Kletzander
    Cc: Rasmus Villemoes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rasmus Villemoes
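
Since varargs rule out a plain table of test cases, one way to structure such checks is a variadic helper that formats twice: once into a large buffer and once into a deliberately short one. The sketch below uses userspace vsnprintf() and a hypothetical helper name `check_fmt`; it is not the code of the test module itself.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical check: returns the number of failed sub-checks (0 means
 * pass), or -1 if expect is too long for the scratch buffers.
 *  1. Full-size buffer: return value and contents must match expect.
 *  2. Truncated buffer: return value must still be the full length,
 *     and the stored text must be a NUL-terminated prefix of expect.
 */
int check_fmt(const char *expect, const char *fmt, ...)
{
	char full[128], cut[128];
	size_t elen, cut_size;
	va_list ap;
	int ret, failures = 0;

	assert(expect != NULL && fmt != NULL);

	elen = strlen(expect);
	if (elen >= sizeof(full))
		return -1;
	cut_size = elen / 2 + 1;	/* room for about half, plus NUL */

	va_start(ap, fmt);
	ret = vsnprintf(full, sizeof(full), fmt, ap);
	va_end(ap);
	if (ret != (int)elen || strcmp(full, expect) != 0)
		failures++;

	va_start(ap, fmt);
	ret = vsnprintf(cut, cut_size, fmt, ap);
	va_end(ap);
	if (ret != (int)elen || strlen(cut) != cut_size - 1 ||
	    strncmp(cut, expect, cut_size - 1) != 0)
		failures++;

	return failures;
}
```

A call such as `check_fmt("42|  7", "%d|%3d", 42, 7)` exercises both the complete and the truncated path with one line per test case.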