11 Jan, 2018

1 commit

  • Daniel Borkmann says:

    ====================
    pull-request: bpf 2018-01-09

    The following pull-request contains BPF updates for your *net* tree.

    The main changes are:

    1) Prevent out-of-bounds speculation in BPF maps by masking the
    index after bounds checks in order to fix spectre v1, and
    add an option BPF_JIT_ALWAYS_ON into Kconfig that allows for
    removing the BPF interpreter from the kernel in favor of
    JIT-only mode to make spectre v2 harder, from Alexei.

    2) Remove false sharing of map refcount with max_entries which
    was used in spectre v1, from Daniel.

    3) Add a missing NULL psock check in sockmap in order to fix
    a race, from John.

    4) Fix test_align BPF selftest case since a recent change in
    verifier rejects the bit-wise arithmetic on pointers
    earlier but test_align update was missing, from Alexei.
    ====================
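
    The index masking described in item 1 can be illustrated with a small
    userspace sketch. The struct and function names below are hypothetical
    and are not the kernel's; the sketch assumes the backing storage is
    sized to a power of two, so that a masked index always stays inside it
    even if the bounds check is mispredicted during speculation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an array map (not the kernel struct). */
struct toy_array {
	uint32_t max_entries;  /* number of valid slots */
	uint32_t index_mask;   /* capacity rounded up to a power of two, minus 1 */
	uint64_t values[8];    /* storage sized to the rounded-up capacity */
};

/* Round v up to the next power of two (v must be at least 1). */
static uint32_t toy_roundup_pow_of_two(uint32_t v)
{
	uint32_t p = 1;

	while (p < v)
		p <<= 1;
	return p;
}

static void toy_array_init(struct toy_array *a, uint32_t max_entries)
{
	/* The toy storage above only has room for 8 slots. */
	assert(max_entries >= 1 && max_entries <= 8);
	a->max_entries = max_entries;
	a->index_mask = toy_roundup_pow_of_two(max_entries) - 1;
}

static uint64_t *toy_array_lookup(struct toy_array *a, uint32_t index)
{
	/* Architectural bounds check. */
	if (index >= a->max_entries)
		return NULL;
	/*
	 * Mask after the check: for in-range indexes the mask is a no-op,
	 * and for any other value the result still lands inside values[].
	 */
	return &a->values[index & a->index_mask];
}
```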

    Signed-off-by: David S. Miller

    David S. Miller
     

10 Jan, 2018

1 commit

  • The BPF interpreter has been used as part of the spectre 2 attack CVE-2017-5715.

    A quote from the Google Project Zero blog:
    "At this point, it would normally be necessary to locate gadgets in
    the host kernel code that can be used to actually leak data by reading
    from an attacker-controlled location, shifting and masking the result
    appropriately and then using the result of that as offset to an
    attacker-controlled address for a load. But piecing gadgets together
    and figuring out which ones work in a speculation context seems annoying.
    So instead, we decided to use the eBPF interpreter, which is built into
    the host kernel - while there is no legitimate way to invoke it from inside
    a VM, the presence of the code in the host kernel's text section is sufficient
    to make it usable for the attack, just like with ordinary ROP gadgets."

    To make the attacker's job harder, introduce the BPF_JIT_ALWAYS_ON config
    option, which removes the interpreter from the kernel in favor of JIT-only mode.
    So far eBPF JIT is supported by:
    x64, arm64, arm32, sparc64, s390, powerpc64, mips64

    The start of a JITed program is randomized and its code page is marked read-only.
    In addition, "constant blinding" can be turned on with net.core.bpf_jit_harden.

    v2->v3:
    - move __bpf_prog_ret0 under ifdef (Daniel)

    v1->v2:
    - fix init order, test_bpf and cBPF (Daniel's feedback)
    - fix offloaded bpf (Jakub's feedback)
    - add 'return 0' dummy in case something can invoke prog->bpf_func
    - retarget bpf tree. For bpf-next the patch would need one extra hunk.
    It will be sent when the trees are merged back to net-next

    Considered doing:
    int bpf_jit_enable __read_mostly = BPF_EBPF_JIT_DEFAULT;
    but it seems better to land the patch as-is and in bpf-next remove
    bpf_jit_enable global variable from all JITs, consolidate in one place
    and remove this jit_init() function.

    Signed-off-by: Alexei Starovoitov
    Signed-off-by: Daniel Borkmann

    Alexei Starovoitov
     

06 Jan, 2018

1 commit

  • Pull crypto fixes from Herbert Xu:
    "This fixes the following issues:

    - racy use of ctx->rcvused in af_alg

    - algif_aead crash in chacha20poly1305

    - freeing bogus pointer in pcrypt

    - build error on MIPS in mpi

    - memory leak in inside-secure

    - memory overwrite in inside-secure

    - NULL pointer dereference in inside-secure

    - state corruption in inside-secure

    - build error without CRYPTO_GF128MUL in chelsio

    - use after free in n2"

    * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
    crypto: inside-secure - do not use areq->result for partial results
    crypto: inside-secure - fix request allocations in invalidation path
    crypto: inside-secure - free requests even if their handling failed
    crypto: inside-secure - per request invalidation
    lib/mpi: Fix umul_ppmm() for MIPS64r6
    crypto: pcrypt - fix freeing pcrypt instances
    crypto: n2 - cure use after free
    crypto: af_alg - Fix race around ctx->rcvused by making it atomic_t
    crypto: chacha20poly1305 - validate the digest size
    crypto: chelsio - select CRYPTO_GF128MUL

    Linus Torvalds
     

01 Jan, 2018

2 commits

  • Pull timer fixes from Thomas Gleixner:
    "A pile of fixes for long standing issues with the timer wheel and the
    NOHZ code:

    - Prevent timer base confusion across the nohz switch, which can
    cause unlocked access and data corruption

    - Reinitialize the stale base clock on cpu hotplug to prevent subtle
    side effects including rollovers on 32bit

    - Prevent an interrupt storm when the timer softirq is already
    pending caused by tick_nohz_stop_sched_tick()

    - Move the timer start tracepoint to a place where it actually makes
    sense

    - Add documentation to timerqueue functions as they caused confusion
    several times now"

    * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    timerqueue: Document return values of timerqueue_add/del()
    timers: Invoke timer_start_debug() where it makes sense
    nohz: Prevent a timer interrupt storm in tick_nohz_stop_sched_tick()
    timers: Reinitialize per cpu bases on hotplug
    timers: Use deferrable base independent of base::nohz_active

    Linus Torvalds
     
  • Pull driver core fixes from Greg KH:
    "Here are two driver core fixes for 4.15-rc6, resolving some reported
    issues.

    The first is a cacheinfo fix for DT based systems to resolve a
    reported issue that has been around for a while, and the other is to
    resolve a regression in the kobject uevent code that showed up in
    4.15-rc1.

    Both have been in linux-next for a while with no reported issues"

    * tag 'driver-core-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
    kobject: fix suppressing modalias in uevents delivered over netlink
    drivers: base: cacheinfo: fix cache type for non-architected system cache

    Linus Torvalds
     

30 Dec, 2017

1 commit

  • The return values of timerqueue_add/del() are not documented in the kernel doc
    comment. Add proper documentation.

    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Frederic Weisbecker
    Cc: Sebastian Siewior
    Cc: rt@linutronix.de
    Cc: Paul McKenney
    Cc: Anna-Maria Gleixner
    Link: https://lkml.kernel.org/r/20171222145337.872681338@linutronix.de

    Thomas Gleixner
     

22 Dec, 2017

1 commit

  • Current MIPS64r6 toolchains aren't able to generate efficient
    DMULU/DMUHU based code for the C implementation of umul_ppmm(), which
    performs an unsigned 64 x 64 bit multiply and returns the upper and
    lower 64-bit halves of the 128-bit result. Instead it widens the 64-bit
    inputs to 128-bits and emits a __multi3 intrinsic call to perform a 128
    x 128 multiply. This is both inefficient, and it results in a link error
    since we don't include __multi3 in MIPS linux.

    For example commit 90a53e4432b1 ("cfg80211: implement regdb signature
    checking") merged in v4.15-rc1 recently broke the 64r6_defconfig and
    64r6el_defconfig builds by indirectly selecting MPILIB. The same build
    errors can be reproduced on older kernels by enabling e.g. CRYPTO_RSA:

    lib/mpi/generic_mpih-mul1.o: In function `mpihelp_mul_1':
    lib/mpi/generic_mpih-mul1.c:50: undefined reference to `__multi3'
    lib/mpi/generic_mpih-mul2.o: In function `mpihelp_addmul_1':
    lib/mpi/generic_mpih-mul2.c:49: undefined reference to `__multi3'
    lib/mpi/generic_mpih-mul3.o: In function `mpihelp_submul_1':
    lib/mpi/generic_mpih-mul3.c:49: undefined reference to `__multi3'
    lib/mpi/mpih-div.o In function `mpihelp_divrem':
    lib/mpi/mpih-div.c:205: undefined reference to `__multi3'
    lib/mpi/mpih-div.c:142: undefined reference to `__multi3'

    Therefore add an efficient MIPS64r6 implementation of umul_ppmm() using
    inline assembly and the DMULU/DMUHU instructions, to prevent __multi3
    calls being emitted.
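
    For reference, the operation umul_ppmm() performs can be written in
    portable C using 32-bit halves, which avoids any 128-bit type. This is
    only an illustrative sketch with a hypothetical name; the patch itself
    uses inline assembly with DMULU/DMUHU rather than code like this.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical portable 64 x 64 -> 128 bit unsigned multiply.
 * Stores the upper 64 bits in *hi and the lower 64 bits in *lo.
 */
static void toy_umul_ppmm(uint64_t *hi, uint64_t *lo, uint64_t u, uint64_t v)
{
	uint64_t ul = u & 0xffffffffu, uh = u >> 32;
	uint64_t vl = v & 0xffffffffu, vh = v >> 32;

	/* Four 32 x 32 -> 64 bit partial products. */
	uint64_t ll = ul * vl;
	uint64_t lh = ul * vh;
	uint64_t hl = uh * vl;
	uint64_t hh = uh * vh;

	/* Middle column plus the carry out of ll; this sum fits in 64 bits. */
	uint64_t mid = (ll >> 32) + (lh & 0xffffffffu) + (hl & 0xffffffffu);

	assert(hi && lo);
	*lo = (mid << 32) | (ll & 0xffffffffu);
	*hi = hh + (lh >> 32) + (hl >> 32) + (mid >> 32);
}
```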

    Fixes: 7fd08ca58ae6 ("MIPS: Add build support for the MIPS R6 ISA")
    Signed-off-by: James Hogan
    Cc: Ralf Baechle
    Cc: Herbert Xu
    Cc: "David S. Miller"
    Cc: linux-mips@linux-mips.org
    Cc: linux-crypto@vger.kernel.org
    Signed-off-by: Herbert Xu

    James Hogan
     

21 Dec, 2017

1 commit

  • The commit 4a336a23d619 ("kobject: copy env blob in one go") optimized
    constructing uevent data for delivery over netlink by using the raw
    environment buffer, instead of reconstructing it from individual
    environment pointers. Unfortunately in doing so it broke suppressing
    MODALIAS attribute for KOBJ_UNBIND events, as the code that suppressed this
    attribute only adjusted the environment pointers, but left the buffer
    itself alone. Let's fix it by making sure the offending attribute is
    obliterated from the buffer as well.
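
    A minimal userspace sketch of the idea: the environment is a single
    buffer of NUL-terminated "KEY=value" strings, and removing an entry
    means closing the gap in the buffer itself. The helper name is
    hypothetical, and the sketch assumes a well-formed buffer in which
    every entry is NUL-terminated within the given length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: remove the first entry that starts with `prefix`
 * from a buffer of back-to-back NUL-terminated strings.
 * Returns the new length of the buffer.
 */
static size_t toy_env_remove(char *buf, size_t len, const char *prefix)
{
	size_t plen, off = 0;

	assert(buf && prefix);
	plen = strlen(prefix);

	while (off < len) {
		/* Length of this entry including its terminating NUL. */
		size_t entry = strlen(buf + off) + 1;

		if (strncmp(buf + off, prefix, plen) == 0) {
			/* Slide everything after the entry down over it. */
			memmove(buf + off, buf + off + entry, len - off - entry);
			return len - entry;
		}
		off += entry;
	}
	return len; /* nothing matched, buffer unchanged */
}
```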

    Reported-by: Tariq Toukan
    Reported-by: Casey Leedom
    Fixes: 4a336a23d619 ("kobject: copy env blob in one go")
    Signed-off-by: Dmitry Torokhov
    Signed-off-by: Greg Kroah-Hartman

    Dmitry Torokhov
     

18 Dec, 2017

1 commit

  • Daniel Borkmann says:

    ====================
    pull-request: bpf 2017-12-17

    The following pull-request contains BPF updates for your *net* tree.

    The main changes are:

    1) Fix a corner case in generic XDP where non-linear skbs with
    enough tailroom in the skb were not being linearized,
    from Song.

    2) Fix BPF JIT bugs in s390x and ppc64 to not recache skb data when
    BPF context is not skb, from Daniel.

    3) Fix a BPF JIT bug in sparc64 where recaching skb data after helper
    call would use the wrong register for the skb, from Daniel.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

16 Dec, 2017

2 commits

  • Pull locking fixes from Ingo Molnar:
    "Misc fixes:

    - Fix a S390 boot hang that was caused by the lock-break logic.
    Remove lock-break to begin with, as review suggested it was
    unreasonably fragile and our confidence in its continued good
    health is lower than our confidence in its removal.

    - Remove the lockdep cross-release checking code for now, because of
    unresolved false positive warnings. This should make lockdep work
    well everywhere again.

    - Get rid of the final (and single) ACCESS_ONCE() straggler and
    remove the API from v4.15.

    - Fix a liblockdep build warning"

    * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    tools/lib/lockdep: Add missing declaration of 'pr_cont()'
    checkpatch: Remove ACCESS_ONCE() warning
    compiler.h: Remove ACCESS_ONCE()
    tools/include: Remove ACCESS_ONCE()
    tools/perf: Convert ACCESS_ONCE() to READ_ONCE()
    locking/lockdep: Remove the cross-release locking checks
    locking/core: Remove break_lock field when CONFIG_GENERIC_LOCKBREAK=y
    locking/core: Fix deadlock during boot on systems with GENERIC_LOCKBREAK

    Linus Torvalds
     
  • Add a test that i) uses LD_ABS, ii) zeroes R6 before the call, iii) calls
    a helper that triggers reload of cached skb data, iv) uses LD_ABS again.
    It's added for test_bpf in order to do runtime testing after JITing as
    well as test_verifier to test that the sequence is allowed.

    Signed-off-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Signed-off-by: Alexei Starovoitov

    Daniel Borkmann
     

15 Dec, 2017

1 commit

  • Add a variant of rbtree_replace_node() that maintains the leftmost cache
    of struct rbtree_root_cached when replacing nodes within the rbtree.

    As drm_mm is the only user of rb_replace_node() on an interval tree,
    the mistake looks fairly self-contained. Furthermore the only user of
    drm_mm_replace_node() is its testsuite...

    Testcase: igt/drm_mm/replace

    Link: http://lkml.kernel.org/r/20171122100729.3742-1-chris@chris-wilson.co.uk
    Link: https://patchwork.freedesktop.org/patch/msgid/20171109212435.9265-1-chris@chris-wilson.co.uk
    Fixes: f808c13fd373 ("lib/interval_tree: fast overlap detection")
    Signed-off-by: Chris Wilson
    Reviewed-by: Joonas Lahtinen
    Acked-by: Davidlohr Bueso
    Cc: Jérôme Glisse
    Cc: Joonas Lahtinen
    Cc: Daniel Vetter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chris Wilson
     

12 Dec, 2017

1 commit

  • This code (CONFIG_LOCKDEP_CROSSRELEASE=y and CONFIG_LOCKDEP_COMPLETIONS=y),
    while it found a number of old bugs initially, was also causing too many
    false positives that caused people to disable lockdep - which is arguably
    a worse overall outcome.

    If we disable cross-release by default but keep the code upstream then
    in practice the most likely outcome is that we'll allow the situation
    to degrade gradually, by allowing entropy to introduce more and more
    false positives, until it overwhelms maintenance capacity.

    Another bad side effect was that people were trying to work around
    the false positives by uglifying/complicating unrelated code. There's
    a marked difference between annotating locking operations and
    uglifying good code just due to bad lock debugging code ...

    This gradual decrease in quality happened to a number of debugging
    facilities in the kernel, and lockdep is pretty complex already,
    so we cannot risk this outcome.

    Either cross-release checking can be done right with no false positives,
    or it should not be included in the upstream kernel.

    ( Note that it might make sense to maintain it out of tree and go through
    the false positives every now and then and see whether new bugs were
    introduced. )

    Cc: Byungchul Park
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

09 Dec, 2017

1 commit


08 Dec, 2017

5 commits

  • Callers of sprint_oid() do not check its return value before printing
    the result. In the case where the OID is zero-length, -EBADMSG was
    being returned without anything being written to the buffer, resulting
    in uninitialized stack memory being printed. Fix this by writing
    "(bad)" to the buffer in the cases where -EBADMSG is returned.

    Fixes: 4f73175d0375 ("X.509: Add utility functions to render OIDs as strings")
    Signed-off-by: Eric Biggers
    Signed-off-by: David Howells

    Eric Biggers
     
  • In sprint_oid(), if the input buffer were to be more than 1 byte too
    small for the first snprintf(), 'bufsize' would underflow, causing a
    buffer overflow when printing the remainder of the OID.

    Fortunately this cannot actually happen currently, because no users pass
    in a buffer that can be too small for the first snprintf().

    Regardless, fix it by checking the snprintf() return value correctly.

    For consistency also tweak the second snprintf() check to look the same.
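
    The pattern behind this fix and the "(bad)" fix above can be sketched
    in userspace as follows. The function is a hypothetical simplification
    and not the kernel's sprint_oid(): it prints already-decoded components,
    treats a snprintf() return value at or above the remaining space as
    truncation, and always leaves something printable in the buffer on the
    bad-input path.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical helper: render `n` numeric components as "a.b.c" into buf.
 * Returns the number of characters written, -EBADMSG for an empty input
 * (after writing "(bad)"), or -ENOBUFS if the buffer is too small.
 */
static int toy_sprint_dotted(const unsigned *parts, size_t n,
			     char *buf, size_t bufsize)
{
	size_t used = 0;

	assert(buf != NULL);

	if (n == 0) {
		/* Callers may print buf without checking the return value. */
		snprintf(buf, bufsize, "(bad)");
		return -EBADMSG;
	}

	for (size_t i = 0; i < n; i++) {
		size_t space = bufsize - used; /* used < bufsize is maintained */
		int count = snprintf(buf + used, space,
				     i ? ".%u" : "%u", parts[i]);

		/* count >= space means the output was truncated. */
		if (count < 0 || (size_t)count >= space)
			return -ENOBUFS;
		used += (size_t)count;
	}
	return (int)used;
}
```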

    Fixes: 4f73175d0375 ("X.509: Add utility functions to render OIDs as strings")
    Cc: Takashi Iwai
    Signed-off-by: Eric Biggers
    Signed-off-by: David Howells
    Reviewed-by: James Morris

    Eric Biggers
     
  • asn1_ber_decoder() was ignoring errors from actions associated with the
    opcodes ASN1_OP_END_SEQ_ACT, ASN1_OP_END_SET_ACT,
    ASN1_OP_END_SEQ_OF_ACT, and ASN1_OP_END_SET_OF_ACT. In practice, this
    meant the pkcs7_note_signed_info() action (since that was the only user
    of those opcodes). Fix it by checking for the error, just like the
    decoder does for actions associated with the other opcodes.

    This bug allowed users to leak slab memory by repeatedly trying to add a
    specially crafted "pkcs7_test" key (requires CONFIG_PKCS7_TEST_KEY).

    In theory, this bug could also be used to bypass module signature
    verification, by providing a PKCS#7 message that is misparsed such that
    a signature's ->authattrs do not contain its ->msgdigest. But it
    doesn't seem practical in normal cases, due to restrictions on the
    format of the ->authattrs.

    Fixes: 42d5ec27f873 ("X.509: Add an ASN.1 decoder")
    Cc: # v3.7+
    Signed-off-by: Eric Biggers
    Signed-off-by: David Howells
    Reviewed-by: James Morris

    Eric Biggers
     
  • In asn1_ber_decoder(), indefinitely-sized ASN.1 items were being passed
    to the action functions before their lengths had been computed, using
    the bogus length of 0x80 (ASN1_INDEFINITE_LENGTH). This resulted in
    reading data past the end of the input buffer, when given a specially
    crafted message.

    Fix it by rearranging the code so that the indefinite length is resolved
    before the action is called.

    This bug was originally found by fuzzing the X.509 parser in userspace
    using libFuzzer from the LLVM project.

    KASAN report (cleaned up slightly):

    BUG: KASAN: slab-out-of-bounds in memcpy ./include/linux/string.h:341 [inline]
    BUG: KASAN: slab-out-of-bounds in x509_fabricate_name.constprop.1+0x1a4/0x940 crypto/asymmetric_keys/x509_cert_parser.c:366
    Read of size 128 at addr ffff880035dd9eaf by task keyctl/195

    CPU: 1 PID: 195 Comm: keyctl Not tainted 4.14.0-09238-g1d3b78bbc6e9 #26
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014
    Call Trace:
    __dump_stack lib/dump_stack.c:17 [inline]
    dump_stack+0xd1/0x175 lib/dump_stack.c:53
    print_address_description+0x78/0x260 mm/kasan/report.c:252
    kasan_report_error mm/kasan/report.c:351 [inline]
    kasan_report+0x23f/0x350 mm/kasan/report.c:409
    memcpy+0x1f/0x50 mm/kasan/kasan.c:302
    memcpy ./include/linux/string.h:341 [inline]
    x509_fabricate_name.constprop.1+0x1a4/0x940 crypto/asymmetric_keys/x509_cert_parser.c:366
    asn1_ber_decoder+0xb4a/0x1fd0 lib/asn1_decoder.c:447
    x509_cert_parse+0x1c7/0x620 crypto/asymmetric_keys/x509_cert_parser.c:89
    x509_key_preparse+0x61/0x750 crypto/asymmetric_keys/x509_public_key.c:174
    asymmetric_key_preparse+0xa4/0x150 crypto/asymmetric_keys/asymmetric_type.c:388
    key_create_or_update+0x4d4/0x10a0 security/keys/key.c:850
    SYSC_add_key security/keys/keyctl.c:122 [inline]
    SyS_add_key+0xe8/0x290 security/keys/keyctl.c:62
    entry_SYSCALL_64_fastpath+0x1f/0x96

    Allocated by task 195:
    __do_kmalloc_node mm/slab.c:3675 [inline]
    __kmalloc_node+0x47/0x60 mm/slab.c:3682
    kvmalloc ./include/linux/mm.h:540 [inline]
    SYSC_add_key security/keys/keyctl.c:104 [inline]
    SyS_add_key+0x19e/0x290 security/keys/keyctl.c:62
    entry_SYSCALL_64_fastpath+0x1f/0x96

    Fixes: 42d5ec27f873 ("X.509: Add an ASN.1 decoder")
    Reported-by: Alexander Potapenko
    Cc: # v3.7+
    Signed-off-by: Eric Biggers
    Signed-off-by: David Howells

    Eric Biggers
     
  • Commit 28033ae4e0f5 ("net: netlink: Update attr validation to require
    exact length for some types") requires attributes using types NLA_U* and
    NLA_S* to have an exact length. This change is exposing bugs in various
    userspace commands that are sending attributes with an invalid length
    (e.g., attribute has type NLA_U8 and userspace sends NLA_U32). While
    the commands are clearly broken and need to be fixed, users are arguing
    that the sudden change in enforcement is breaking older commands on
    newer kernels for use cases that otherwise "worked".

    Relax the validation to print a warning message similar to what is done
    for messages containing extra bytes after parsing.
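
    The relaxed behaviour can be pictured with the sketch below. All names
    are hypothetical and the table is a toy, not the kernel's policy code.
    It assumes that a length mismatch only produces a warning, while
    attributes shorter than the type requires are still rejected.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical attribute types and their exact payload lengths in bytes. */
enum toy_type { TOY_U8, TOY_U16, TOY_U32, TOY_U64, TOY_TYPE_MAX };

static const int toy_exact_len[TOY_TYPE_MAX] = { 1, 2, 4, 8 };

/* Counts warnings so the effect is observable without parsing stderr. */
static int toy_warnings;

/* Returns 0 if the attribute is accepted, -1 if it is rejected. */
static int toy_validate(enum toy_type type, int attrlen)
{
	assert(type < TOY_TYPE_MAX);

	/* Relaxed check: warn about a wrong length instead of failing. */
	if (attrlen != toy_exact_len[type]) {
		fprintf(stderr, "attribute type %d has an invalid length\n",
			(int)type);
		toy_warnings++;
	}

	/* Assumption: payloads that are too short are still an error. */
	if (attrlen < toy_exact_len[type])
		return -1;
	return 0;
}
```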

    Fixes: 28033ae4e0f5 ("net: netlink: Update attr validation to require exact length for some types")
    Signed-off-by: David Ahern
    Reviewed-by: Johannes Berg
    Signed-off-by: David S. Miller

    David Ahern
     

02 Dec, 2017

2 commits

  • …nux/kernel/git/palmer/linux

    Pull RISC-V cleanups and ABI fixes from Palmer Dabbelt:
    "This contains a handful of small cleanups that are a result of
    feedback that didn't make it into our original patch set, either
    because the feedback hadn't been given yet, I missed the original
    emails, or we weren't ready to submit the changes yet.

    I've been maintaining the various cleanup patch sets I have as their
    own branches, which I then merged together and signed. Each merge
    commit has a short summary of the changes, and each branch is based on
    your latest tag (4.15-rc1, in this case). If this isn't the right way
    to do this then feel free to suggest something else, but it seems sane
    to me.

    Here's a short summary of the changes, roughly in order of how
    interesting they are.

    - libgcc.h has been moved from include/lib, where it's the only
    member, to include/linux. This is meant to avoid tab completion
    conflicts.

    - VDSO entries for clock_get/gettimeofday/getcpu have been added.
    These are simple syscalls now, but we want to let glibc use them
    from the start so we can make them faster later.

    - A VDSO entry for instruction cache flushing has been added so
    userspace can flush the instruction cache.

    - The VDSO symbol versions for __vdso_cmpxchg{32,64} have been
    removed, as those VDSO entries don't actually exist.

    - __io_writes has been corrected to respect the given type.

    - A new READ_ONCE in arch_spin_is_locked().

    - __test_and_op_bit_ord() is now actually ordered.

    - Various small fixes throughout the tree to enable allmodconfig to
    build cleanly.

    - Removal of some dead code in our atomic support headers.

    - Improvements to various comments in our atomic support headers"

    * tag 'riscv-for-linus-4.15-rc2_cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux: (23 commits)
    RISC-V: __io_writes should respect the length argument
    move libgcc.h to include/linux
    RISC-V: Clean up an unused include
    RISC-V: Allow userspace to flush the instruction cache
    RISC-V: Flush I$ when making a dirty page executable
    RISC-V: Add missing include
    RISC-V: Use define for get_cycles like other architectures
    RISC-V: Provide stub of setup_profiling_timer()
    RISC-V: Export some expected symbols for modules
    RISC-V: move empty_zero_page definition to C and export it
    RISC-V: io.h: type fixes for warnings
    RISC-V: use RISCV_{INT,SHORT} instead of {INT,SHORT} for asm macros
    RISC-V: use generic serial.h
    RISC-V: remove spin_unlock_wait()
    RISC-V: `sfence.vma` orderes the instruction cache
    RISC-V: Add READ_ONCE in arch_spin_is_locked()
    RISC-V: __test_and_op_bit_ord should be strongly ordered
    RISC-V: Remove smb_mb__{before,after}_spinlock()
    RISC-V: Remove __smp_bp__{before,after}_atomic
    RISC-V: Comment on why {,cmp}xchg is ordered how it is
    ...

    Linus Torvalds
     
  • Introducing a new include/lib directory just for this file totally
    messes up tab completion for include/linux, which is highly annoying.

    Move it to include/linux where we have headers for all kinds of other
    lib/ code as well.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Palmer Dabbelt

    Christoph Hellwig
     

30 Nov, 2017

1 commit

  • Instead, just fall back on the new '%p' behavior which hashes the
    pointer.

    Otherwise, '%pK' - that was intended to mark a pointer as restricted -
    just ends up leaking pointers that a normal '%p' wouldn't leak, which
    just makes the whole thing pointless.

    I suspect we should actually get rid of '%pK' entirely, and make it just
    work as '%p' regardless, but this is the minimal obvious fix. People
    who actually use 'kptr_restrict' should weigh in on which behavior they
    want.

    Cc: Tobin Harding
    Cc: Kees Cook
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

29 Nov, 2017

3 commits

  • printk specifier %p now hashes all addresses before printing. Sometimes
    we need to see the actual unmodified address. This can be achieved using
    %lx but then we face the risk that if in future we want to change the
    way the Kernel handles printing of pointers we will have to grep through
    the already existent 50 000 %lx call sites. Let's add specifier %px as a
    clear, opt-in, way to print a pointer and maintain some level of
    isolation from all the other hex integer output within the Kernel.

    Add printk specifier %px to print the actual unmodified address.

    Signed-off-by: Tobin C. Harding

    Tobin C. Harding
     
  • Currently there exist approximately 14 000 places in the kernel where
    addresses are being printed using an unadorned %p. This potentially
    leaks sensitive information regarding the Kernel layout in memory. Many
    of these calls are stale; instead of fixing every call, let's hash the
    address by default before printing. This will of course break some
    users, forcing code printing needed addresses to be updated.

    Code that _really_ needs the address will soon be able to use the new
    printk specifier %px to print the address.

    For what it's worth, usage of unadorned %p can be broken down as
    follows (thanks to Joe Perches).

    $ git grep -E '%p[^A-Za-z0-9]' | cut -f1 -d"/" | sort | uniq -c
    1084 arch
    20 block
    10 crypto
    32 Documentation
    8121 drivers
    1221 fs
    143 include
    101 kernel
    69 lib
    100 mm
    1510 net
    40 samples
    7 scripts
    11 security
    166 sound
    152 tools
    2 virt

    Add function ptr_to_id() to map an address to a 32 bit unique
    identifier. Hash any unadorned usage of specifier %p and any malformed
    specifiers.
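
    As a rough illustration of mapping an address to a 32 bit identifier,
    the sketch below mixes the pointer with a secret key and truncates the
    result. The names are hypothetical, the key would have to be random
    and secret in practice, and the mixing function is a simple stand-in
    chosen for brevity, not the hash used by the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in secret; a real implementation would pick this randomly. */
static uint64_t toy_ptr_key = 0x9e3779b97f4a7c15ull;

/* Hypothetical helper: map an address to a 32 bit identifier. */
static uint32_t toy_ptr_to_id(const void *ptr)
{
	uint64_t x;

	assert(sizeof(uintptr_t) <= sizeof(uint64_t));
	x = (uint64_t)(uintptr_t)ptr ^ toy_ptr_key;

	/* Simple 64-bit mixing steps (not cryptographically strong). */
	x ^= x >> 30;
	x *= 0xbf58476d1ce4e5b9ull;
	x ^= x >> 27;
	x *= 0x94d049bb133111ebull;
	x ^= x >> 31;

	/* Keep only 32 bits so the raw address is not printed. */
	return (uint32_t)x;
}
```

    The same address always maps to the same identifier for a given key,
    which keeps log lines comparable without exposing the address itself.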

    Signed-off-by: Tobin C. Harding

    Tobin C. Harding
     
  • Currently code to handle %pK is all within the switch statement in
    pointer(). This is the wrong level of abstraction. Each of the other switch
    clauses calls a helper function; %pK should do the same.

    Refactor code out of pointer() to new function restricted_pointer().

    Signed-off-by: Tobin C. Harding

    Tobin C. Harding
     

26 Nov, 2017

1 commit

  • Pull timer updates from Thomas Gleixner:

    - The final conversion of timer wheel timers to timer_setup().

    A few manual conversions and a large coccinelle assisted sweep and
    the removal of the old initialization mechanisms and the related
    code.

    - Remove the now unused VSYSCALL update code

    - Fix permissions of /proc/timer_list. I still need to get rid of that
    file completely

    - Rename a misnamed clocksource function and remove a stale declaration

    * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
    m68k/macboing: Fix missed timer callback assignment
    treewide: Remove TIMER_FUNC_TYPE and TIMER_DATA_TYPE casts
    timer: Remove redundant __setup_timer*() macros
    timer: Pass function down to initialization routines
    timer: Remove unused data arguments from macros
    timer: Switch callback prototype to take struct timer_list * argument
    timer: Pass timer_list pointer to callbacks unconditionally
    Coccinelle: Remove setup_timer.cocci
    timer: Remove setup_*timer() interface
    timer: Remove init_timer() interface
    treewide: setup_timer() -> timer_setup() (2 field)
    treewide: setup_timer() -> timer_setup()
    treewide: init_timer() -> setup_timer()
    treewide: Switch DEFINE_TIMER callbacks to struct timer_list *
    s390: cmm: Convert timers to use timer_setup()
    lightnvm: Convert timers to use timer_setup()
    drivers/net: cris: Convert timers to use timer_setup()
    drm/vc4: Convert timers to use timer_setup()
    block/laptop_mode: Convert timers to use timer_setup()
    net/atm/mpc: Avoid open-coded assignment of timer callback function
    ...

    Linus Torvalds
     

23 Nov, 2017

1 commit

  • Pull MTD updates from Richard Weinberger:
    "General changes:
    - Unconfuse get_unmapped_area and point/unpoint driver methods
    - New partition parser: sharpslpart
    - Kill GENERIC_IO
    - Various fixes

    NAND changes:
    - Add a flag to mark NANDs that require 3 address cycles to encode a
    page address
    - Set a default ECC/free layout when NAND_ECC_NONE is requested
    - Fix a bug in panic_nand_write()
    - Another batch of cleanups for the denali driver
    - Fix PM support in the atmel driver
    - Remove support for platform data in the omap driver
    - Fix subpage write in the omap driver
    - Fix irq handling in the mtk driver
    - Change link order of mtk_ecc and mtk_nand drivers to speed up boot
    time
    - Change log level of ECC error messages in the mxc driver
    - Patch the pxa3xx driver to support Armada 8k platforms
    - Add BAM DMA support to the qcom driver
    - Convert gpio-nand to the GPIO desc API
    - Fix ECC handling in the mt29f driver

    SPI-NOR changes:
    - Introduce system power management support
    - New mechanism to select the proper .quad_enable() hook by JEDEC
    ID, when needed, instead of only by manufacturer ID
    - Add support to new memory parts from Gigadevice, Winbond, Macronix
    and Everspin
    - Maintenance for Cadence, Intel, Mediatek and STM32 drivers"

    * tag 'for-linus-20171120' of git://git.infradead.org/linux-mtd: (85 commits)
    mtd: Avoid probe failures when mtd->dbg.dfs_dir is invalid
    mtd: sharpslpart: Add sharpslpart partition parser
    mtd: Add sanity checks in mtd_write/read_oob()
    mtd: remove the get_unmapped_area method
    mtd: implement mtd_get_unmapped_area() using the point method
    mtd: chips/map_rom.c: implement point and unpoint methods
    mtd: chips/map_ram.c: implement point and unpoint methods
    mtd: mtdram: properly handle the phys argument in the point method
    mtd: mtdswap: fix spelling mistake: 'TRESHOLD' -> 'THRESHOLD'
    mtd: slram: use memremap() instead of ioremap()
    kconfig: kill off GENERIC_IO option
    mtd: Fix C++ comment in include/linux/mtd/mtd.h
    mtd: constify mtd_partition
    mtd: plat-ram: Replace manual resource management by devm
    mtd: nand: Fix writing mtdoops to nand flash.
    mtd: intel-spi: Add Intel Lewisburg PCH SPI super SKU PCI ID
    mtd: nand: mtk: fix infinite ECC decode IRQ issue
    mtd: spi-nor: Add support for mr25h128
    mtd: nand: mtk: change the compile sequence of mtk_nand.o and mtk_ecc.o
    mtd: spi-nor: enable 4B opcodes for mx66l51235l
    ...

    Linus Torvalds
     

22 Nov, 2017

1 commit

  • This changes all DEFINE_TIMER() callbacks to use a struct timer_list
    pointer instead of unsigned long. Since the data argument has already been
    removed, none of these callbacks are using their argument currently, so
    this renames the argument to "unused".

    Done using the following semantic patch:

    @match_define_timer@
    declarer name DEFINE_TIMER;
    identifier _timer, _callback;
    @@

    DEFINE_TIMER(_timer, _callback);

    @change_callback depends on match_define_timer@
    identifier match_define_timer._callback;
    type _origtype;
    identifier _origarg;
    @@

    void
    -_callback(_origtype _origarg)
    +_callback(struct timer_list *unused)
    { ... }
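
    To show why a callback that receives the timer pointer is useful once
    it is no longer "unused", here is a userspace sketch with hypothetical
    toy_* names standing in for the kernel types. The callback recovers
    its enclosing object from the embedded timer with a container_of()
    style macro instead of an unsigned long data argument.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a timer whose callback takes the timer itself. */
struct toy_timer {
	void (*function)(struct toy_timer *);
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define toy_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct toy_device {
	int ticks;
	struct toy_timer timer; /* embedded, so no separate data pointer */
};

static void toy_timer_setup(struct toy_timer *t,
			    void (*fn)(struct toy_timer *))
{
	t->function = fn;
}

/* The callback derives its context from the timer pointer. */
static void toy_device_tick(struct toy_timer *t)
{
	struct toy_device *dev = toy_container_of(t, struct toy_device, timer);

	dev->ticks++;
}

/* Simulates the timer expiring. */
static void toy_timer_fire(struct toy_timer *t)
{
	assert(t->function != NULL);
	t->function(t);
}
```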

    Signed-off-by: Kees Cook

    Kees Cook
     

18 Nov, 2017

12 commits

  • Merge more updates from Andrew Morton:

    - a bit more MM

    - procfs updates

    - dynamic-debug fixes

    - lib/ updates

    - checkpatch

    - epoll

    - nilfs2

    - signals

    - rapidio

    - PID management cleanup and optimization

    - kcov updates

    - sysvipc updates

    - quite a few misc things all over the place

    * emailed patches from Andrew Morton : (94 commits)
    EXPERT Kconfig menu: fix broken EXPERT menu
    include/asm-generic/topology.h: remove unused parent_node() macro
    arch/tile/include/asm/topology.h: remove unused parent_node() macro
    arch/sparc/include/asm/topology_64.h: remove unused parent_node() macro
    arch/sh/include/asm/topology.h: remove unused parent_node() macro
    arch/ia64/include/asm/topology.h: remove unused parent_node() macro
    drivers/pcmcia/sa1111_badge4.c: avoid unused function warning
    mm: add infrastructure for get_user_pages_fast() benchmarking
    sysvipc: make get_maxid O(1) again
    sysvipc: properly name ipc_addid() limit parameter
    sysvipc: duplicate lock comments wrt ipc_addid()
    sysvipc: unteach ids->next_id for !CHECKPOINT_RESTORE
    initramfs: use time64_t timestamps
    drivers/watchdog: make use of devm_register_reboot_notifier()
    kernel/reboot.c: add devm_register_reboot_notifier()
    kcov: update documentation
    Makefile: support flag -fsanitizer-coverage=trace-cmp
    kcov: support comparison operands collection
    kcov: remove pointless current != NULL check
    kernel/panic.c: add TAINT_AUX
    ...

    Linus Torvalds
     
  • The flag enables Clang instrumentation of comparison operations
    (currently not supported by GCC). This instrumentation is needed by the
    new KCOV device to collect comparison operands.

    Link: http://lkml.kernel.org/r/20171011095459.70721-2-glider@google.com
    Signed-off-by: Victor Chibotaru
    Signed-off-by: Alexander Potapenko
    Cc: Dmitry Vyukov
    Cc: Andrey Konovalov
    Cc: Mark Rutland
    Cc: Alexander Popov
    Cc: Andrey Ryabinin
    Cc: Kees Cook
    Cc: Vegard Nossum
    Cc: Quentin Casasnovas
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Victor Chibotaru
     
  • find_bit functions are widely used in the kernel, including hot paths.
    This module tests performance of those functions in 2 typical scenarios:
    randomly filled bitmap with relatively equal distribution of set and
    cleared bits, and sparse bitmap which has 1 set bit for 500 cleared
    bits.

    On ThunderX machine:

    Start testing find_bit() with random-filled bitmap
    find_next_bit: 240043 cycles, 164062 iterations
    find_next_zero_bit: 312848 cycles, 163619 iterations
    find_last_bit: 193748 cycles, 164062 iterations
    find_first_bit: 177720874 cycles, 164062 iterations

    Start testing find_bit() with sparse bitmap
    find_next_bit: 3633 cycles, 656 iterations
    find_next_zero_bit: 620399 cycles, 327025 iterations
    find_last_bit: 3038 cycles, 656 iterations
    find_first_bit: 691407 cycles, 656 iterations

    [arnd@arndb.de: use correct format string for find-bit tests]
    Link: http://lkml.kernel.org/r/20171113135605.3166307-1-arnd@arndb.de
    Link: http://lkml.kernel.org/r/20171109140714.13168-1-ynorov@caviumnetworks.com
    Signed-off-by: Yury Norov
    Signed-off-by: Arnd Bergmann
    Reviewed-by: Clement Courbet
    Cc: Alexey Dobriyan
    Cc: Matthew Wilcox
    Cc: Rasmus Villemoes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yury Norov
     
  • Fengguang reported soft lockups while running the rbtree and interval
    tree test modules. The logic for these tests all runs in the init phase,
    and we currently are pounding with the default values for number of
    nodes and number of iterations of each test. Reduce the latter by two
    orders of magnitude. This does not influence the value of the tests in
    that one thousand times by default is enough to get the picture.

    Link: http://lkml.kernel.org/r/20171109161715.xai2dtwqw2frhkcm@linux-n805
    Signed-off-by: Davidlohr Bueso
    Reported-by: Fengguang Wu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Davidlohr Bueso
     
  • Don't leak idle function address in NMI backtrace.

    Link: http://lkml.kernel.org/r/20171106165648.GA95243@sofia
    Signed-off-by: Liu Changcheng
    Reviewed-by: Petr Mladek
    Reviewed-by: Josh Poimboeuf
    Reviewed-by: Sergey Senozhatsky
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Liu, Changcheng
     
  • If the amount of resources allocated to a gen_pool exceeds 2^32 then the
    avail atomic overflows and this causes problems when clients try and
    borrow resources from the pool. This is only expected to be an issue on
    64 bit systems.

    Add the header to pull in atomic_long* operations. So
    that 32 bit systems continue to use atomic32_t but 64 bit systems can
    use atomic64_t.
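
    The overflow described above can be illustrated in userspace. This is
    only a sketch: it uses C11 atomics with fixed-width unsigned types as
    stand-ins for the kernel's atomic types, and the helper names
    add_avail_32() and add_avail_64() are hypothetical, not part of the
    gen_pool API.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper: account for `size` bytes in a 32-bit wide counter.
 * Any bits of `size` above bit 31 are silently dropped. */
static uint32_t add_avail_32(_Atomic uint32_t *avail, uint64_t size)
{
    atomic_fetch_add(avail, (uint32_t)size);
    return atomic_load(avail);
}

/* Hypothetical helper: the same accounting with a 64-bit wide counter,
 * which keeps the full value. */
static uint64_t add_avail_64(_Atomic uint64_t *avail, uint64_t size)
{
    atomic_fetch_add(avail, size);
    return atomic_load(avail);
}
```

    With a region of 2^32 + 4096 bytes, the 32-bit counter ends up at 4096
    while the 64-bit counter holds the full size.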

    Link: http://lkml.kernel.org/r/1509033843-25667-1-git-send-email-sbates@raithlin.com
    Signed-off-by: Stephen Bates
    Reviewed-by: Logan Gunthorpe
    Reviewed-by: Mathieu Desnoyers
    Reviewed-by: Daniel Mentz
    Cc: Jonathan Corbet
    Cc: Andrew Morton
    Cc: Will Deacon
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Stephen Bates
     
  • Our current int_sqrt() is neither rough nor an approximation; it
    calculates the exact value of floor(sqrt(x)). Document this.

    Link: http://lkml.kernel.org/r/20171020164645.001652117@infradead.org
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Linus Torvalds
    Cc: Anshul Garg
    Cc: Davidlohr Bueso
    Cc: David Miller
    Cc: Ingo Molnar
    Cc: Joe Perches
    Cc: Kees Cook
    Cc: Matthew Wilcox
    Cc: Michael Davidson
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     
  • The initial value (@m) is computed as:

    m = 1UL << (BITS_PER_LONG - 2);
    while (m > x)
            m >>= 2;

    This is a linear search for the highest even bit smaller than or equal
    to @x. We can implement this as a binary search using __fls() (or
    better, when it is implemented in hardware).

    m = 1UL << (__fls(x) & ~1UL);

    Especially for small values of @x, which are the more common arguments
    when doing a CDF on idle times, the linear search is close to its worst
    case, while the binary search of __fls() takes a constant 6 (or 5 on
    32-bit) branches.
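
    The equivalence of the two forms can be checked in userspace. This is
    a minimal sketch, not the kernel code: fls_index() is a hypothetical
    stand-in for __fls() built on the GCC/Clang builtin __builtin_clzl(),
    and BITS_PER_LONG is derived locally from sizeof(unsigned long).

```c
#include <limits.h>

#define BITS_PER_LONG ((int)(sizeof(unsigned long) * CHAR_BIT))

/* Stand-in for the kernel's __fls(): index of the most significant set
 * bit. The result is undefined for x == 0, as with the kernel helper. */
static unsigned long fls_index(unsigned long x)
{
    return (unsigned long)(BITS_PER_LONG - 1 - __builtin_clzl(x));
}

/* Linear search: step down from the top even bit until m <= x. */
static unsigned long initial_m_linear(unsigned long x)
{
    unsigned long m = 1UL << (BITS_PER_LONG - 2);

    while (m > x)
        m >>= 2;
    return m;
}

/* Direct form: round the index of the top set bit down to an even
 * position and shift once. Only valid for x != 0. */
static unsigned long initial_m_fls(unsigned long x)
{
    return 1UL << (fls_index(x) & ~1UL);
}
```

    For any non-zero @x both functions return the same power of four, for
    example 4 for x = 15 and 16 for x = 16.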

              cycles:                  branches:               branch-misses:

    PRE:

    hot:   43.633557 +- 0.034373   45.333132 +- 0.002277   0.023529 +- 0.000681
    cold: 207.438411 +- 0.125840   45.333132 +- 0.002277   6.976486 +- 0.004219

    SOFTWARE FLS:

    hot:   29.576176 +- 0.028850   26.666730 +- 0.004511   0.019463 +- 0.000663
    cold: 165.947136 +- 0.188406   26.666746 +- 0.004511   6.133897 +- 0.004386

    HARDWARE FLS:

    hot:   24.720922 +- 0.025161   20.666784 +- 0.004509   0.020836 +- 0.000677
    cold: 132.777197 +- 0.127471   20.666776 +- 0.004509   5.080285 +- 0.003874

    Averages computed over all values
    Suggested-by: Joe Perches
    Acked-by: Will Deacon
    Acked-by: Linus Torvalds
    Cc: Anshul Garg
    Cc: Davidlohr Bueso
    Cc: David Miller
    Cc: Ingo Molnar
    Cc: Kees Cook
    Cc: Matthew Wilcox
    Cc: Michael Davidson
    Cc: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     
  • The current int_sqrt() computation is sub-optimal for the case of small
    @x, which is the interesting case when we're going to do cumulative
    distribution functions on idle times, which we assume to be a random
    variable, where the target residency of the deepest idle state gives an
    upper bound on the variable (5e6ns on recent Intel chips).

    In the case of small @x, the compute loop:

    while (m != 0) {
            b = y + m;
            y >>= 1;

            if (x >= b) {
                    x -= b;
                    y += m;
            }
            m >>= 2;
    }

    can be reduced to:

    while (m > x)
            m >>= 2;

    Because y==0 initially, b==m, and y remains 0 until x>=m.

    And while this is computationally equivalent, it runs much faster
    because there's less code, in particular fewer branches.
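
    Putting the two loops together gives the following userspace sketch.
    The name int_sqrt_sketch() is hypothetical, and the code is an
    illustration of the description above rather than a copy of the
    kernel source.

```c
#include <limits.h>

/* Exact floor(sqrt(x)), computed one result bit at a time. */
static unsigned long int_sqrt_sketch(unsigned long x)
{
    unsigned long b, m, y = 0;

    if (x <= 1)
        return x;

    /* Start at the top even bit, then skip every iteration that could
     * not set a result bit because m is still larger than x. */
    m = 1UL << (sizeof(unsigned long) * CHAR_BIT - 2);
    while (m > x)
        m >>= 2;

    /* Main compute loop, unchanged from the description above. */
    while (m != 0) {
        b = y + m;
        y >>= 1;

        if (x >= b) {
            x -= b;
            y += m;
        }
        m >>= 2;
    }
    return y;
}
```

    For example, int_sqrt_sketch(15) returns 3 and int_sqrt_sketch(16)
    returns 4.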

              cycles:                  branches:               branch-misses:

    OLD:

    hot:   45.109444 +- 0.044117   44.333392 +- 0.002254   0.018723 +- 0.000593
    cold: 187.737379 +- 0.156678   44.333407 +- 0.002254   6.272844 +- 0.004305

    PRE:

    hot:   67.937492 +- 0.064124   66.999535 +- 0.000488   0.066720 +- 0.001113
    cold: 232.004379 +- 0.332811   66.999527 +- 0.000488   6.914634 +- 0.006568

    POST:

    hot:   43.633557 +- 0.034373   45.333132 +- 0.002277   0.023529 +- 0.000681
    cold: 207.438411 +- 0.125840   45.333132 +- 0.002277   6.976486 +- 0.004219

    Averages computed over all values
    Suggested-by: Anshul Garg
    Acked-by: Linus Torvalds
    Cc: Davidlohr Bueso
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: Will Deacon
    Cc: Joe Perches
    Cc: David Miller
    Cc: Matthew Wilcox
    Cc: Kees Cook
    Cc: Michael Davidson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     
  • Omit extra messages for a memory allocation failure in these functions.

    This issue was detected by using the Coccinelle software.

    Link: http://lkml.kernel.org/r/410a4c5a-4ee0-6fcc-969c-103d8e496b78@users.sourceforge.net
    Signed-off-by: Markus Elfring
    Acked-by: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Markus Elfring
     
  • Extract the string test code into its own source file, to allow
    compiling it either to a loadable module, or built into the kernel.

    Fixes: 03270c13c5ffaa6a ("lib/string.c: add testcases for memset16/32/64")
    Link: http://lkml.kernel.org/r/1505397744-3387-1-git-send-email-geert@linux-m68k.org
    Signed-off-by: Geert Uytterhoeven
    Cc: Matthew Wilcox
    Cc: Shuah Khan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Geert Uytterhoeven
     
  • line-range is supposed to treat "1-" as "1-endoffile", so
    handle the special case by setting last_lineno to UINT_MAX.

    Fixes this error:

    dynamic_debug:ddebug_parse_query: last-line:0 < 1st-line:1
    dynamic_debug:ddebug_exec_query: query parse failed
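
    The open-ended range handling can be sketched in userspace as follows.
    parse_line_range() is a hypothetical helper written for illustration;
    it is not the dynamic_debug parser and its error handling is an
    assumption.

```c
#include <ctype.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parse "N", "N-M" or "N-" into [*first, *last].
 * Returns 0 on success and -1 on malformed or inverted ranges. */
static int parse_line_range(const char *s, unsigned int *first,
                            unsigned int *last)
{
    char *end;
    unsigned long v;

    if (!isdigit((unsigned char)*s))
        return -1;
    v = strtoul(s, &end, 10);
    if (v > UINT_MAX)
        return -1;
    *first = (unsigned int)v;

    if (*end == '\0') {         /* "N": a single line */
        *last = *first;
        return 0;
    }
    if (*end != '-')
        return -1;

    s = end + 1;
    if (*s == '\0') {           /* "N-": open-ended, up to end of file */
        *last = UINT_MAX;
        return 0;
    }
    if (!isdigit((unsigned char)*s))
        return -1;
    v = strtoul(s, &end, 10);
    if (*end != '\0' || v > UINT_MAX)
        return -1;
    *last = (unsigned int)v;

    return (*last < *first) ? -1 : 0;
}
```

    With this sketch, "1-" yields first = 1 and last = UINT_MAX instead of
    failing the last >= first check.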

    Link: http://lkml.kernel.org/r/10a6a101-e2be-209f-1f41-54637824788e@infradead.org
    Signed-off-by: Randy Dunlap
    Acked-by: Jason Baron
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap