23 Mar, 2016

1 commit

  • kcov provides code coverage collection for coverage-guided fuzzing
    (randomized testing). Coverage-guided fuzzing is a testing technique
    that uses coverage feedback to determine new interesting inputs to a
    system. A notable user-space example is AFL
    (http://lcamtuf.coredump.cx/afl/). However, this technique is not
    widely used for kernel testing due to missing compiler and kernel
    support.

    kcov does not aim to collect as much coverage as possible. It aims to
    collect more or less stable coverage that is a function of syscall
    inputs. To achieve this goal it does not collect coverage in soft/hard
    interrupts, and instrumentation of some inherently non-deterministic
    or uninteresting parts of the kernel (e.g. scheduler, locking) is
    disabled.

    Currently there is a single coverage collection mode (tracing), but the
    API anticipates additional collection modes. Initially I also
    implemented a second mode which exposes coverage in a fixed-size hash
    table of counters (what Quentin used in his original patch). I've
    dropped the second mode for simplicity.

    This patch adds the necessary support on the kernel side. The
    complementary compiler support was added in gcc revision 231296.

    We've used this support to build the syzkaller system call fuzzer,
    which has found 90 kernel bugs in just 2 months:

    https://github.com/google/syzkaller/wiki/Found-Bugs

    We've also found 30+ bugs in our internal systems with syzkaller.
    Another (yet unexplored) direction where kcov coverage would greatly
    help is more traditional "blob mutation". For example, mounting a
    random blob as a filesystem, or receiving a random blob over wire.

    Why not gcov? A typical fuzzing loop looks as follows: (1) reset
    coverage, (2) execute a bit of code, (3) collect coverage, repeat. A
    typical coverage trace can be just a dozen basic blocks (e.g. for an
    invalid input). In such a context gcov becomes prohibitively
    expensive, as the reset/collect steps depend on the total number of
    basic blocks/edges in the program (about 2M in the case of the
    kernel), whereas the cost of kcov depends only on the number of basic
    blocks/edges actually executed. On top of that, the kernel requires
    per-thread coverage, because there are always background threads and
    unrelated processes that also produce coverage. With inlined gcov
    instrumentation, per-thread coverage is not possible.

    kcov exposes kernel PCs and control flow to user-space, which is
    insecure. But debugfs should not be mapped as user accessible.
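
    The user-space API is ioctl/mmap based. A minimal usage sketch, close
    to the example in the Documentation/kcov.txt file this patch adds
    (error handling trimmed to exit(1) for brevity):

        #include <fcntl.h>
        #include <stdio.h>
        #include <stdlib.h>
        #include <sys/ioctl.h>
        #include <sys/mman.h>
        #include <unistd.h>

        #define KCOV_INIT_TRACE _IOR('c', 1, unsigned long)
        #define KCOV_ENABLE     _IO('c', 100)
        #define KCOV_DISABLE    _IO('c', 101)
        #define COVER_SIZE      (64 << 10)   /* trace buffer size, in PCs */

        int main(void)
        {
            /* One fd collects coverage for one thread. */
            int fd = open("/sys/kernel/debug/kcov", O_RDWR);
            if (fd == -1)
                exit(1);
            if (ioctl(fd, KCOV_INIT_TRACE, COVER_SIZE))
                exit(1);
            /* The buffer is shared between kernel- and user-space. */
            unsigned long *cover =
                mmap(NULL, COVER_SIZE * sizeof(unsigned long),
                     PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
            if (cover == MAP_FAILED || ioctl(fd, KCOV_ENABLE, 0))
                exit(1);
            cover[0] = 0;               /* (1) reset coverage */
            read(-1, NULL, 0);          /* (2) execute the syscall under test */
            unsigned long n = cover[0]; /* (3) collect: cover[0] is the PC count */
            for (unsigned long i = 0; i < n; i++)
                printf("0x%lx\n", cover[i + 1]);
            ioctl(fd, KCOV_DISABLE, 0);
            munmap(cover, COVER_SIZE * sizeof(unsigned long));
            close(fd);
            return 0;
        }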

    Based on a patch by Quentin Casasnovas.

    [akpm@linux-foundation.org: make task_struct.kcov_mode have type `enum kcov_mode']
    [akpm@linux-foundation.org: unbreak allmodconfig]
    [akpm@linux-foundation.org: follow x86 Makefile layout standards]
    Signed-off-by: Dmitry Vyukov
    Reviewed-by: Kees Cook
    Cc: syzkaller
    Cc: Vegard Nossum
    Cc: Catalin Marinas
    Cc: Tavis Ormandy
    Cc: Will Deacon
    Cc: Quentin Casasnovas
    Cc: Kostya Serebryany
    Cc: Eric Dumazet
    Cc: Alexander Potapenko
    Cc: Kees Cook
    Cc: Bjorn Helgaas
    Cc: Sasha Levin
    Cc: David Drysdale
    Cc: Ard Biesheuvel
    Cc: Andrey Ryabinin
    Cc: Kirill A. Shutemov
    Cc: Jiri Slaby
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dmitry Vyukov
     

06 Sep, 2015

1 commit

  • Pull vfs updates from Al Viro:
    "In this one:

    - d_move fixes (Eric Biederman)

    - UFS fixes (me; locking is mostly sane now, a bunch of bugs in error
    handling ought to be fixed)

    - switch of sb_writers to percpu rwsem (Oleg Nesterov)

    - superblock scalability (Josef Bacik and Dave Chinner)

    - swapon(2) race fix (Hugh Dickins)"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (65 commits)
    vfs: Test for and handle paths that are unreachable from their mnt_root
    dcache: Reduce the scope of i_lock in d_splice_alias
    dcache: Handle escaped paths in prepend_path
    mm: fix potential data race in SyS_swapon
    inode: don't softlockup when evicting inodes
    inode: rename i_wb_list to i_io_list
    sync: serialise per-superblock sync operations
    inode: convert inode_sb_list_lock to per-sb
    inode: add hlist_fake to avoid the inode hash lock in evict
    writeback: plug writeback at a high level
    change sb_writers to use percpu_rw_semaphore
    shift percpu_counter_destroy() into destroy_super_work()
    percpu-rwsem: kill CONFIG_PERCPU_RWSEM
    percpu-rwsem: introduce percpu_rwsem_release() and percpu_rwsem_acquire()
    percpu-rwsem: introduce percpu_down_read_trylock()
    document rwsem_release() in sb_wait_write()
    fix the broken lockdep logic in __sb_start_write()
    introduce __sb_writers_{acquired,release}() helpers
    ufs_inode_get{frag,block}(): get rid of 'phys' argument
    ufs_getfrag_block(): tidy up a bit
    ...

    Linus Torvalds
     

20 Jul, 2015

1 commit

  • No one uses this anymore, and this is not the first time the idea of
    replacing it with a (now possible) userspace equivalent has come up.
    The lock-stealing logic was removed long ago, when the lock started
    being granted to the highest-priority waiter.
    Signed-off-by: Davidlohr Bueso
    Cc: Darren Hart
    Cc: Steven Rostedt
    Cc: Mike Galbraith
    Cc: Paul E. McKenney
    Cc: Sebastian Andrzej Siewior
    Cc: Davidlohr Bueso
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1435782588-4177-2-git-send-email-dave@stgolabs.net
    Signed-off-by: Thomas Gleixner

    Davidlohr Bueso
     

12 May, 2015

1 commit

  • To be consistent with the queued spinlocks, which use the
    CONFIG_QUEUED_SPINLOCKS config parameter, the one for the queued
    rwlocks is now renamed to CONFIG_QUEUED_RWLOCKS.

    Signed-off-by: Waiman Long
    Cc: Borislav Petkov
    Cc: Douglas Hatch
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Scott J Norton
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1431367031-36697-1-git-send-email-Waiman.Long@hp.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     

11 May, 2015

1 commit

  • Valentin Rothberg reported that we use CONFIG_QUEUED_SPINLOCKS
    in arch/x86/kernel/paravirt_patch_32.c, while the symbol is
    called CONFIG_QUEUED_SPINLOCK. (Note the extra 'S')

    But the typo was natural: the proper English term for such a generic
    object would be 'queued spinlocks' - so rename this and the related
    symbols to the plural form.

    Reported-by: Valentin Rothberg
    Cc: Douglas Hatch
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Scott J Norton
    Cc: Thomas Gleixner
    Cc: Waiman Long
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

08 May, 2015

1 commit

  • This patch introduces a new generic queued spinlock implementation
    that can serve as an alternative to the default ticket spinlock.
    Compared with the ticket spinlock, the queued spinlock should be
    almost as fair. It has about the same speed in the single-threaded
    case and can be much faster in high-contention situations, especially
    when the spinlock is embedded within the data structure to be
    protected.

    Only under light to moderate contention, where the average queue
    depth is around 1-3, will this queued spinlock potentially be a bit
    slower, due to the higher slowpath overhead.

    This queued spinlock is especially suited to NUMA machines with a
    large number of cores, as the chance of spinlock contention is much
    higher on those machines. The cost of contention is also higher
    because of slower inter-node memory traffic.

    Because spinlocks are acquired with preemption disabled, a process
    will not be migrated to another CPU while it is trying to get a
    spinlock. Ignoring interrupt handling, a CPU can only be contending
    on one spinlock at any one time. Counting soft IRQ, hard IRQ and NMI
    contexts, a CPU can have at most 4 concurrent lock-waiting
    activities. By allocating a set of per-cpu queue nodes and using them
    to form a waiting queue, we can encode the queue node address into a
    much smaller 24-bit value (including CPU number and queue node
    index), leaving one byte for the lock.

    Please note that the queue node is only needed when waiting for the
    lock. Once the lock is acquired, the queue node can be released to
    be used later.
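
    A simplified sketch of the tail encoding described above (constants
    and names are illustrative; the real code lives in
    kernel/locking/qspinlock.c):

        #define NR_CPUS   1024   /* illustrative CPU count */
        #define MAX_NODES 4      /* task, softirq, hardirq and NMI contexts */

        struct mcs_node {
            struct mcs_node *next;
            int locked;
        };

        /* A small per-CPU pool of queue nodes; a node is only needed
         * while waiting for the lock, so one per context suffices. */
        static struct mcs_node nodes[NR_CPUS][MAX_NODES];

        /* tail == 0 means "no waiters", hence cpu + 1.  Two bits of node
         * index plus the CPU number fit in 24 bits, leaving one byte of
         * the 32-bit lock word for the lock itself. */
        static inline unsigned int encode_tail(int cpu, int idx)
        {
            return ((cpu + 1) << 2) | idx;
        }

        static inline struct mcs_node *decode_tail(unsigned int tail)
        {
            return &nodes[(tail >> 2) - 1][tail & 3];
        }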

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Boris Ostrovsky
    Cc: Borislav Petkov
    Cc: Daniel J Blueman
    Cc: David Vrabel
    Cc: Douglas Hatch
    Cc: H. Peter Anvin
    Cc: Konrad Rzeszutek Wilk
    Cc: Linus Torvalds
    Cc: Oleg Nesterov
    Cc: Paolo Bonzini
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Raghavendra K T
    Cc: Rik van Riel
    Cc: Scott J Norton
    Cc: Thomas Gleixner
    Cc: virtualization@lists.linux-foundation.org
    Cc: xen-devel@lists.xenproject.org
    Link: http://lkml.kernel.org/r/1429901803-29771-2-git-send-email-Waiman.Long@hp.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     

12 Feb, 2015

1 commit

  • Pull s390 updates from Martin Schwidefsky:

    - The remaining patches for the z13 machine support: kernel build
    option for z13, the cache synonym avoidance, SMT support,
    compare-and-delay for spinloops, and the CEX5S crypto adapter.

    - The ftrace support for function tracing with the gcc hotpatch
    option. This touches common code Makefiles; Steven is ok with the
    changes.

    - The hypfs file system gets an extension to access diagnose 0x0c data
    in user space for performance analysis for Linux running under z/VM.

    - The iucv hvc console gets wildcard support for the user id filtering.

    - The cacheinfo code is converted to use the generic infrastructure.

    - Cleanup and bug fixes.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (42 commits)
    s390/process: free vx save area when releasing tasks
    s390/hypfs: Eliminate hypfs interval
    s390/hypfs: Add diagnose 0c support
    s390/cacheinfo: don't use smp_processor_id() in preemptible context
    s390/zcrypt: fixed domain scanning problem (again)
    s390/smp: increase maximum value of NR_CPUS to 512
    s390/jump label: use different nop instruction
    s390/jump label: add sanity checks
    s390/mm: correct missing space when reporting user process faults
    s390/dasd: cleanup profiling
    s390/dasd: add locking for global_profile access
    s390/ftrace: hotpatch support for function tracing
    ftrace: let notrace function attribute disable hotpatching if necessary
    ftrace: allow architectures to specify ftrace compile options
    s390: reintroduce diag 44 calls for cpu_relax()
    s390/zcrypt: Add support for new crypto express (CEX5S) adapter.
    s390/zcrypt: Number of supported ap domains is not retrievable.
    s390/spinlock: add compare-and-delay to lock wait loops
    s390/tape: remove redundant if statement
    s390/hvc_iucv: add simple wildcard matches to the iucv allow filter
    ...

    Linus Torvalds
     

29 Jan, 2015

1 commit

  • If the kernel is compiled with function tracer support, the -pg
    compile option is passed to gcc to generate extra code in the
    prologue of each function.

    This patch replaces the "open-coded" -pg compile flag with a CC_FLAGS_FTRACE
    makefile variable which architectures can override if a different option
    should be used for code generation.

    Acked-by: Steven Rostedt
    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     

14 Jan, 2015

1 commit

  • We have two flavors of the MCS spinlock: standard and cancelable (OSQ).
    While each one is independent of the other, we currently mix and match
    them. This patch:

    - Moves the OSQ code out of mcs_spinlock.h (which only deals with the
    traditional version) into include/linux/osq_lock.h. No unnecessary
    code is added to the more global header file; any locks that make use
    of OSQ must include it anyway.

    - Renames mcs_spinlock.c to osq_lock.c. This file only contains osq code.

    - Introduces a CONFIG_LOCK_SPIN_ON_OWNER in order to only build osq_lock
    if there is support for it.

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Thomas Gleixner
    Cc: "Paul E. McKenney"
    Cc: Jason Low
    Cc: Linus Torvalds
    Cc: Mikulas Patocka
    Cc: Waiman Long
    Link: http://lkml.kernel.org/r/1420573509-24774-5-git-send-email-dave@stgolabs.net
    Signed-off-by: Ingo Molnar

    Davidlohr Bueso
     

06 Jun, 2014

1 commit

  • This rwlock uses the arch_spin_lock_t as a waitqueue, and assuming
    the arch_spin_lock_t is a fair lock (ticket, MCS, etc.), the
    resulting rwlock is a fair lock.

    It fits in the same 8 bytes as the regular rwlock_t by folding the
    reader and writer count into a single integer, using the remaining 4
    bytes for the arch_spinlock_t.

    Architectures that can single-copy address bytes can optimize
    queue_write_unlock() with a 0 write to the LSB (the write count).
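
    A user-space sketch of the word layout and byte-store unlock described
    above (names are illustrative and little-endian is assumed; the
    kernel's definitions live in include/asm-generic/qrwlock.h):

        #include <stdatomic.h>
        #include <stdint.h>

        typedef struct {
            union {
                atomic_uint cnts;        /* readers in the upper 3 bytes... */
                _Atomic uint8_t wlocked; /* ...writer byte in the LSB */
            };
            uint32_t wait_lock;          /* stand-in for the arch_spinlock_t
                                          * used as the waitqueue */
        } qrwlock_sketch;

        #define QR_BIAS (1u << 8)        /* one reader */

        /* The optimization mentioned above: with single-copy atomic byte
         * stores, releasing the writer is a plain 0 write to the LSB
         * instead of an atomic subtraction on the whole word. */
        static inline void write_unlock_sketch(qrwlock_sketch *l)
        {
            atomic_store_explicit(&l->wlocked, 0, memory_order_release);
        }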

    Performance as measured by Davidlohr Bueso (rwlock_t -> qrwlock_t):

    +--------------+--------------+----------------+
    | Workload     | #users       | delta          |
    +--------------+--------------+----------------+
    | alltests     | > 1400       | -4.83%         |
    | custom       | 0-100, > 100 | +1.43%, -1.57% |
    | high_systime | > 1000       | -2.61%         |
    | shared       | all          | +0.32%         |
    +--------------+--------------+----------------+

    http://www.stgolabs.net/qrwlock-stuff/aim7-results-vs-rwsem_optsin/

    Signed-off-by: Waiman Long
    [peterz: near complete rewrite]
    Signed-off-by: Peter Zijlstra
    Cc: Arnd Bergmann
    Cc: Linus Torvalds
    Cc: "Paul E.McKenney"
    Cc: linux-arch@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Link: http://lkml.kernel.org/n/tip-gac1nnl3wvs2ij87zv2xkdzq@git.kernel.org
    Signed-off-by: Ingo Molnar

    Waiman Long
     

08 Apr, 2014

1 commit

  • When the system has only one CPU, lglock is effectively a spinlock; map
    it directly to spinlock to eliminate the indirection and duplicate code.

    In addition to removing overhead, this drops 1.6k of code with a
    defconfig modified to have !CONFIG_SMP, and 1.1k with a minimal config.
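
    A sketch of the mapping described above (kernel-style pseudo-header;
    the actual change is in include/linux/lglock.h):

        /* With one CPU there is no local/global distinction, so every
         * lglock operation degenerates to the same spinlock. */
        #ifndef CONFIG_SMP
        struct lglock {
            spinlock_t lock;
        };

        #define lg_local_lock(lg)    spin_lock(&(lg)->lock)
        #define lg_local_unlock(lg)  spin_unlock(&(lg)->lock)
        #define lg_global_lock(lg)   spin_lock(&(lg)->lock)
        #define lg_global_unlock(lg) spin_unlock(&(lg)->lock)
        #endif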

    Signed-off-by: Josh Triplett
    Cc: Rusty Russell
    Cc: Michal Marek
    Cc: Thomas Gleixner
    Cc: David Howells
    Cc: "H. Peter Anvin"
    Cc: Nick Piggin
    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Josh Triplett
     

01 Apr, 2014

1 commit

  • Pull RCU updates from Ingo Molnar:
    "Main changes:

    - Torture-test changes, including refactoring of rcutorture and
    introduction of a vestigial locktorture.

    - Real-time latency fixes.

    - Documentation updates.

    - Miscellaneous fixes"

    * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (77 commits)
    rcu: Provide grace-period piggybacking API
    rcu: Ensure kernel/rcu/rcu.h can be sourced/used stand-alone
    rcu: Fix sparse warning for rcu_expedited from kernel/ksysfs.c
    notifier: Substitute rcu_access_pointer() for rcu_dereference_raw()
    Documentation/memory-barriers.txt: Clarify release/acquire ordering
    rcutorture: Save kvm.sh output to log
    rcutorture: Add a lock_busted to test the test
    rcutorture: Place kvm-test-1-run.sh output into res directory
    rcutorture: Rename TREE_RCU-Kconfig.txt
    locktorture: Add kvm-recheck.sh plug-in for locktorture
    rcutorture: Gracefully handle NULL cleanup hooks
    locktorture: Add vestigial locktorture configuration
    rcutorture: Introduce "rcu" directory level underneath configs
    rcutorture: Rename kvm-test-1-rcu.sh
    rcutorture: Remove RCU dependencies from ver_functions.sh API
    rcutorture: Create CFcommon file for common Kconfig parameters
    rcutorture: Create config files for scripted test-the-test testing
    rcutorture: Add an rcu_busted to test the test
    locktorture: Add a lock-torture kernel module
    rcutorture: Abstract kvm-recheck.sh
    ...

    Linus Torvalds
     

11 Mar, 2014

1 commit

  • Since we want a task waiting for a mutex_lock() to go to sleep and
    reschedule on need_resched(), we must be able to abort the
    mcs_spin_lock() around the adaptive spin.

    Therefore implement a cancelable mcs lock.
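
    A hypothetical caller sketch in kernel style (owner_running() and
    try_to_acquire() are illustrative stand-ins, not the kernel's names,
    and the field layout is simplified):

        /* Optimistic spin on the mutex owner: the cancelable MCS lock
         * lets the spinner bail out (e.g. on need_resched()) instead of
         * spinning unconditionally like plain MCS would. */
        static bool mutex_optimistic_spin_sketch(struct mutex *lock)
        {
            if (!osq_lock(&lock->osq))
                return false;          /* aborted: take the slowpath and sleep */

            while (owner_running(lock)) {
                if (try_to_acquire(lock)) {
                    osq_unlock(&lock->osq);
                    return true;       /* acquired while spinning */
                }
                cpu_relax();
            }

            osq_unlock(&lock->osq);
            return false;              /* owner blocked: stop spinning */
        }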

    Signed-off-by: Peter Zijlstra
    Cc: chegu_vinod@hp.com
    Cc: paulmck@linux.vnet.ibm.com
    Cc: Waiman.Long@hp.com
    Cc: torvalds@linux-foundation.org
    Cc: tglx@linutronix.de
    Cc: riel@redhat.com
    Cc: akpm@linux-foundation.org
    Cc: davidlohr@hp.com
    Cc: hpa@zytor.com
    Cc: andi@firstfloor.org
    Cc: aswin@hp.com
    Cc: scott.norton@hp.com
    Cc: Jason Low
    Link: http://lkml.kernel.org/n/tip-62hcl5wxydmjzd182zhvk89m@git.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

06 Nov, 2013

8 commits