21 Mar, 2020

1 commit

  • Extend lockdep to validate lock wait-type context.

    The current wait-types are:

    LD_WAIT_FREE, /* wait free, rcu etc.. */
    LD_WAIT_SPIN, /* spin loops, raw_spinlock_t etc.. */
    LD_WAIT_CONFIG, /* CONFIG_PREEMPT_LOCK, spinlock_t etc.. */
    LD_WAIT_SLEEP, /* sleeping locks, mutex_t etc.. */
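
    For reference, a sketch of the full enum as it roughly appears in the
    lockdep header (the exact #ifdef shape is illustrative, not a verbatim
    copy):

    enum lockdep_wait_type {
            LD_WAIT_INV = 0,        /* not checked, catch all */
            LD_WAIT_FREE,           /* wait free, rcu etc.. */
            LD_WAIT_SPIN,           /* spin loops, raw_spinlock_t etc.. */
    #ifdef CONFIG_PROVE_RAW_LOCK_NESTING
            LD_WAIT_CONFIG,         /* spinlock_t nesting is checked */
    #else
            LD_WAIT_CONFIG = LD_WAIT_SPIN,  /* nesting check disabled */
    #endif
            LD_WAIT_SLEEP,          /* sleeping locks, mutex_t etc.. */
            LD_WAIT_MAX,            /* must be last */
    };

    LD_WAIT_INV = 0 is the "not checked" catch-all referenced below, and
    CONFIG_PROVE_RAW_LOCK_NESTING is the config option discussed at the
    end: without it, CONFIG collapses onto SPIN and the spinlock_t
    nesting check is effectively disabled.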

    Lockdep validates that the current lock (the one being acquired)
    fits in the current wait-context (as generated by the held stack).

    This ensures that there is no attempt to acquire mutexes while holding
    spinlocks, to acquire spinlocks while holding raw_spinlocks, and so on.
    In other words, it is a fancier might_sleep().

    Obviously RCU made the entire ordeal more complex than a simple
    single-value test, because RCU can be acquired in (pretty much) any
    context and, while it presents a context to nested locks, that context
    is not the same as the one it was acquired in.

    Therefore it is necessary to split the wait_type into two values: one
    representing the acquire context (outer) and one representing the
    nested context (inner). For most 'normal' locks these two are the same.

    [ To make static initialization easier, the rule is that .outer == INV
    implies .outer == .inner; this works because INV == 0. ]

    It further means that it is necessary to find the minimal .inner of the
    held stack and compare it against the .outer of the new lock, because
    while 'normal' RCU presents a CONFIG type to nested locks, taking it
    while already holding a SPIN type obviously doesn't relax the rules.
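
    A condensed sketch of the resulting check, loosely modelled on
    lockdep's check_wait_context() (simplified: the real code also limits
    the scan to locks taken in the current irq context and seeds
    curr_inner from it; treat names and details as illustrative):

    static int check_wait_context(struct task_struct *curr,
                                  struct held_lock *next)
    {
            u8 next_inner = hlock_class(next)->wait_type_inner;
            u8 next_outer = hlock_class(next)->wait_type_outer;
            u8 curr_inner = LD_WAIT_MAX;    /* no restriction yet */
            int depth;

            if (!next_inner || next->trylock)
                    return 0;               /* not annotated, or trylock */

            if (!next_outer)                /* .outer == INV => use .inner */
                    next_outer = next_inner;

            /* The context is only as relaxed as the strictest held .inner. */
            for (depth = 0; depth < curr->lockdep_depth; depth++) {
                    struct held_lock *prev = curr->held_locks + depth;
                    u8 prev_inner = hlock_class(prev)->wait_type_inner;

                    if (prev_inner)
                            curr_inner = min(curr_inner, prev_inner);
            }

            /* New lock needs a more relaxed context than we have? Bug. */
            if (next_outer > curr_inner)
                    return print_lock_invalid_wait_context(curr, next);

            return 0;
    }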

    Below is an example output generated by the trivial test code:

    raw_spin_lock(&foo);
    spin_lock(&bar);
    spin_unlock(&bar);
    raw_spin_unlock(&foo);

    [ BUG: Invalid wait context ]
    -----------------------------
    swapper/0/1 is trying to lock:
    ffffc90000013f20 (&bar){....}-{3:3}, at: kernel_init+0xdb/0x187
    other info that might help us debug this:
    1 lock held by swapper/0/1:
    #0: ffffc90000013ee0 (&foo){+.+.}-{2:2}, at: kernel_init+0xd1/0x187

    The way to read it is to look at the new -{n:m} part in the lock
    description; -{3:3} for the attempted lock, and try to match that up to
    the held locks, which in this case is the one: -{2:2}.

    This tells us that the lock being acquired requires a more relaxed
    environment than the one presented by the lock stack.
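
    Decoded against the wait-type values sketched above (assuming the enum
    order INV=0, FREE=1, SPIN=2, CONFIG=3, SLEEP=4, and that the
    annotation prints as -{outer:inner}):

      held:      &foo  -{2:2}   raw_spinlock_t -> LD_WAIT_SPIN   (2)
      acquiring: &bar  -{3:3}   spinlock_t     -> LD_WAIT_CONFIG (3)

      .outer of the new lock (3) > minimal .inner of the held stack (2),
      hence "Invalid wait context".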

    Currently only the normal locks and RCU are converted; the rest of the
    lockdep users default to .inner = INV, which is ignored. More
    conversions can be done when desired.

    The check for spinlock_t nesting is not enabled by default. It's a
    separate config option for now, as there are known problems which are
    currently being addressed. The config option makes it possible to
    identify these problems and to verify that the solutions found indeed
    solve them.

    The config switch will be removed and the checks will be permanently
    enabled once the vast majority of issues have been addressed.

    [ bigeasy: Move LD_WAIT_FREE,… out of CONFIG_LOCKDEP to avoid compile
    failure with CONFIG_DEBUG_SPINLOCK + !CONFIG_LOCKDEP ]
    [ tglx: Add the config option ]

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Sebastian Andrzej Siewior
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/20200321113242.427089655@linutronix.de

    Peter Zijlstra
     

29 Nov, 2019

1 commit

  • This fixes various data races in spinlock_debug. Testing with KCSAN
    shows the console being spammed with data race reports, suggesting
    that these races are extremely frequent.

    Example data race report:

    read to 0xffff8ab24f403c48 of 4 bytes by task 221 on cpu 2:
    debug_spin_lock_before kernel/locking/spinlock_debug.c:85 [inline]
    do_raw_spin_lock+0x9b/0x210 kernel/locking/spinlock_debug.c:112
    __raw_spin_lock include/linux/spinlock_api_smp.h:143 [inline]
    _raw_spin_lock+0x39/0x40 kernel/locking/spinlock.c:151
    spin_lock include/linux/spinlock.h:338 [inline]
    get_partial_node.isra.0.part.0+0x32/0x2f0 mm/slub.c:1873
    get_partial_node mm/slub.c:1870 [inline]

    write to 0xffff8ab24f403c48 of 4 bytes by task 167 on cpu 3:
    debug_spin_unlock kernel/locking/spinlock_debug.c:103 [inline]
    do_raw_spin_unlock+0xc9/0x1a0 kernel/locking/spinlock_debug.c:138
    __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:159 [inline]
    _raw_spin_unlock_irqrestore+0x2d/0x50 kernel/locking/spinlock.c:191
    spin_unlock_irqrestore include/linux/spinlock.h:393 [inline]
    free_debug_processing+0x1b3/0x210 mm/slub.c:1214
    __slab_free+0x292/0x400 mm/slub.c:2864

    As a side effect, with KCSAN this eventually locks up the console,
    most likely due to a deadlock, e.g.: .. -> printk lock ->
    spinlock_debug -> KCSAN detects data race -> kcsan_print_report() ->
    printk lock -> deadlock.

    This fix 1) avoids the data races and 2) allows lock debugging to be
    used together with KCSAN.
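
    The races are on the lockless accesses to the debug fields (magic,
    owner, owner_cpu) in kernel/locking/spinlock_debug.c; a representative
    sketch of the kind of change is to mark them with
    READ_ONCE()/WRITE_ONCE() so the accesses are well-defined and KCSAN
    treats them as intentional:

    static inline void debug_spin_lock_before(raw_spinlock_t *lock)
    {
            /* Reads race with the owner's writes; mark them. */
            SPIN_BUG_ON(READ_ONCE(lock->magic) != SPINLOCK_MAGIC,
                        lock, "bad magic");
            SPIN_BUG_ON(READ_ONCE(lock->owner) == current,
                        lock, "recursion");
            SPIN_BUG_ON(READ_ONCE(lock->owner_cpu) == raw_smp_processor_id(),
                        lock, "cpu recursion");
    }

    static inline void debug_spin_unlock(raw_spinlock_t *lock)
    {
            SPIN_BUG_ON(!raw_spin_is_locked(lock), lock, "already unlocked");
            SPIN_BUG_ON(lock->owner != current, lock, "wrong owner");
            SPIN_BUG_ON(lock->owner_cpu != raw_smp_processor_id(),
                        lock, "wrong CPU");
            /* Writes race with other CPUs' debug reads; mark them. */
            WRITE_ONCE(lock->owner, SPINLOCK_OWNER_INIT);
            WRITE_ONCE(lock->owner_cpu, -1);
    }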

    Reported-by: Qian Cai
    Signed-off-by: Marco Elver
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Link: https://lkml.kernel.org/r/20191120155715.28089-1-elver@google.com
    Signed-off-by: Ingo Molnar

    Marco Elver
     

10 Feb, 2017

1 commit

  • The current spinlock lockup detection code can sometimes produce false
    positives because of the unfairness of the locking algorithm itself.

    So the lockup detection code is now removed. Instead, we rely on the
    NMI watchdog to detect potential lockups; we won't have lockup
    detection if the watchdog isn't running.

    The commented-out read-write lock lockup detection code is also
    removed.

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Sasha Levin
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1486583208-11038-1-git-send-email-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     

06 Nov, 2013

1 commit