09 Oct, 2020

1 commit

  • Basically print_lock_class_header()'s for loop is out of sync with
    the size of ->usage_traces[].

    Also clean things up a bit while at it, to avoid such mishaps in the future.
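
    For context, the usual way to keep such a loop in step with the array it
    walks is to derive the bound from the array itself; a minimal illustrative
    sketch of that pattern (not the actual patch):

    #include <stdio.h>

    #define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

    struct lock_trace;                      /* opaque, for illustration only */

    struct class_sketch {
            /* this array may grow when new usage states are added */
            const struct lock_trace *usage_traces[10];
    };

    static void print_usage_traces(const struct class_sketch *class)
    {
            /* the bound follows the array, so it cannot fall out of sync */
            for (size_t i = 0; i < ARRAY_SIZE(class->usage_traces); i++)
                    printf("slot %zu: %p\n", i,
                           (const void *)class->usage_traces[i]);
    }

    int main(void)
    {
            struct class_sketch c = { { 0 } };

            print_usage_traces(&c);
            return 0;
    }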

    Fixes: 23870f122768 ("locking/lockdep: Fix "USED" <- "IN-NMI" inversions")
    Debugged-by: Boqun Feng
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Ingo Molnar
    Tested-by: Qian Cai
    Link: https://lkml.kernel.org/r/20200930094937.GE2651@hirez.programming.kicks-ass.net

    Peter Zijlstra
     

03 Sep, 2020

1 commit

  • During the LPC RCU BoF Paul asked how come the "USED" <- "IN-NMI"
    detector doesn't trip over rcu_read_lock()'s lockdep annotation.

    Looking into this turned up a typo in that check:

    - if (!(class->usage_mask & LOCK_USED))
    + if (!(class->usage_mask & LOCKF_USED))

    fixing that will indeed cause rcu_read_lock() to insta-splat :/

    The above typo means that instead of testing for: 0x100 (1 <<
    LOCK_USED), we test for 8 (LOCK_USED), which corresponds to (1 <<
    LOCK_ENABLED_HARDIRQ).

    So instead of testing for _any_ used lock, it will only match any lock
    used with interrupts enabled.

    The rcu_read_lock() annotation uses .check=0, which means it will not
    set any of the interrupt bits and will thus never match.

    In order to properly fix the situation and allow rcu_read_lock() to
    work correctly, split LOCK_USED into LOCK_USED and LOCK_USED_READ;
    by having .read users set USED_READ and test USED, pure read-recursive
    locks are permitted.
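
    The root of the typo is the difference between a bit index (LOCK_USED) and
    the corresponding bit mask (LOCKF_USED == 1 << LOCK_USED). A small
    self-contained illustration, using the numeric values quoted above rather
    than the kernel headers:

    #include <assert.h>
    #include <stdio.h>

    /* simplified numbering that matches the values quoted in the message */
    enum { LOCK_ENABLED_HARDIRQ = 3, LOCK_USED = 8 };

    #define LOCKF_ENABLED_HARDIRQ (1UL << LOCK_ENABLED_HARDIRQ)     /* 0x008 */
    #define LOCKF_USED            (1UL << LOCK_USED)                /* 0x100 */

    int main(void)
    {
            /* a lock that was used, but never with interrupts enabled */
            unsigned long usage_mask = LOCKF_USED;

            /* buggy test: masks with the index 8, which is really bit 3 */
            assert(!(usage_mask & LOCK_USED));      /* 0x100 & 0x008 == 0 */

            /* fixed test: masks with the actual LOCK_USED bit */
            assert(usage_mask & LOCKF_USED);        /* 0x100 & 0x100 != 0 */

            printf("index test misses USED-only locks; mask test catches them\n");
            return 0;
    }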

    Fixes: f6f48e180404 ("lockdep: Teach lockdep about "USED" <- "IN-NMI" inversions")
    Signed-off-by: Ingo Molnar
    Tested-by: Masami Hiramatsu
    Acked-by: Paul E. McKenney
    Link: https://lore.kernel.org/r/20200902160323.GK1362448@hirez.programming.kicks-ass.net

    peterz@infradead.org
     

11 Feb, 2020

5 commits

  • Once a lock class is zapped, all the lock chains that include the zapped
    class are essentially useless. The lock_chain structure itself can be
    reused, but not the corresponding chain_hlocks[] entries. Over time,
    we will run out of chain_hlocks entries while there are still plenty
    of other lockdep array entries available.

    To fix this imbalance, we have to make chain_hlocks entries reusable
    just like the others. As the freed chain_hlocks entries come in blocks of
    various lengths, a simple bitmap like the one used in the other reusable
    lockdep arrays isn't applicable. Instead, the chain_hlocks entries are
    put into bucketed lists (MAX_CHAIN_BUCKETS) of chain blocks. Bucket 0
    is the variable size bucket which houses chain blocks of size larger than
    MAX_CHAIN_BUCKETS sorted in decreasing size order. Initially, the whole
    array is in one chain block (the primordial chain block) in bucket 0.

    The minimum size of a chain block is 2 chain_hlocks entries, which is
    also the minimum allocation size. In other words, a request for a single
    chain_hlocks entry returns a 2-entry block, so one entry is wasted.

    Allocation requests for chain_hlocks entries are fulfilled first by
    looking for a chain block of matching size. If none is found, the first
    chain block in bucket[0] (the largest one) is split. That can fragment
    the hlock entries and reduce allocation efficiency if a chain block of
    size > MAX_CHAIN_BUCKETS is ever zapped and put back into bucket 0 behind
    the primordial chain block. MAX_CHAIN_BUCKETS must therefore be large
    enough that this seldom happens.
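
    A rough sketch of the bucket selection and minimum-allocation rules
    described above (helper names here are illustrative, not necessarily the
    kernel's):

    #define MAX_CHAIN_BUCKETS       16
    #define CHAIN_BLK_MIN_SIZE      2       /* smallest block / allocation */

    /*
     * Blocks of size 2..MAX_CHAIN_BUCKETS live in fixed-size buckets
     * (bucket[size - 1]); anything larger goes into the variable-size
     * bucket 0, which is kept sorted in decreasing size order.
     */
    int size_to_bucket(int size)
    {
            return (size > MAX_CHAIN_BUCKETS) ? 0 : size - 1;
    }

    /* a request for a single entry still consumes a 2-entry block */
    int chain_block_alloc_size(int request)
    {
            return (request < CHAIN_BLK_MIN_SIZE) ? CHAIN_BLK_MIN_SIZE : request;
    }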

    By reusing the chain_hlocks entries, we are able to handle workloads
    that add and zap a lot of lock classes without the risk of running out
    of chain_hlocks entries as long as the total number of outstanding lock
    classes at any time remains within a reasonable limit.

    Two new tracking counters, nr_free_chain_hlocks & nr_large_chain_blocks,
    are added to track the total number of chain_hlocks entries in the
    free bucketed lists and the number of large chain blocks in buckets[0]
    respectively. The nr_free_chain_hlocks replaces nr_chain_hlocks.

    The nr_large_chain_blocks counter lets us see whether the number of
    buckets (MAX_CHAIN_BUCKETS) should be increased to avoid the
    fragmentation problem in bucket[0].

    An internal nfsd test that ran for more than an hour and kept on
    loading and unloading kernel modules could cause the following message
    to be displayed.

    [ 4318.443670] BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!

    The patched kernel was able to complete the test with a lot of free
    chain_hlocks entries to spare:

    # cat /proc/lockdep_stats
    :
    dependency chains: 18867 [max: 65536]
    dependency chain hlocks: 74926 [max: 327680]
    dependency chain hlocks lost: 0
    :
    zapped classes: 1541
    zapped lock chains: 56765
    large chain blocks: 1

    After changing MAX_CHAIN_BUCKETS to 3 and adding a counter for the size
    of the largest chain block, the system still worked and produced the
    following lockdep_stats data:

    dependency chains: 18601 [max: 65536]
    dependency chain hlocks used: 73133 [max: 327680]
    dependency chain hlocks lost: 0
    :
    zapped classes: 1541
    zapped lock chains: 56702
    large chain blocks: 45165
    large chain block size: 20165

    By running the test again, I was indeed able to cause chain_hlocks
    entries to get lost:

    dependency chain hlocks used: 74806 [max: 327680]
    dependency chain hlocks lost: 575
    :
    large chain blocks: 48737
    large chain block size: 7

    Due to the fragmentation, it is possible that the
    "MAX_LOCKDEP_CHAIN_HLOCKS too low!" error can happen even if a lot of
    chain_hlocks entries appear to be free.

    Fortunately, a MAX_CHAIN_BUCKETS value of 16 should be big enough that
    few variable sized chain blocks, other than the initial one, should
    ever be present in bucket 0.

    Suggested-by: Peter Zijlstra
    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Ingo Molnar
    Link: https://lkml.kernel.org/r/20200206152408.24165-7-longman@redhat.com

    Waiman Long
     
  • Add a new counter, nr_zapped_lock_chains, to track the number of lock
    chains that have been removed.

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Ingo Molnar
    Link: https://lkml.kernel.org/r/20200206152408.24165-6-longman@redhat.com

    Waiman Long
     
  • If a lock chain contains a class that is zapped, the whole lock chain is
    likely to be invalid. If the zapped class is at the end of the chain,
    the partial chain without the zapped class should have been stored
    already as the current code will store all its predecessor chains. If
    the zapped class is somewhere in the middle, there is no guarantee that
    the partial chain will actually happen. It may just clutter up the hash
    and make searching slower. I would rather store the chain only when it
    actually happens.

    So just dump the corresponding chain_hlocks entries for now. A later
    patch will try to reuse the freed chain_hlocks entries.

    This patch also changes the type of nr_chain_hlocks to unsigned integer
    to be consistent with the other counters.

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Ingo Molnar
    Link: https://lkml.kernel.org/r/20200206152408.24165-5-longman@redhat.com

    Waiman Long
     
  • The whole point of the lockdep dynamic key patch is to allow unused
    locks to be removed from the lockdep data buffers so that existing
    buffer space can be reused. However, there is no way to find out how
    many unused locks are zapped and so we don't know if the zapping process
    is working properly.

    Add a new nr_zapped_classes counter to track that and show it in
    /proc/lockdep_stats.

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Ingo Molnar
    Link: https://lkml.kernel.org/r/20200206152408.24165-4-longman@redhat.com

    Waiman Long
     
  • There are currently three counters to track the IRQ context of a lock
    chain - nr_hardirq_chains, nr_softirq_chains and nr_process_chains.
    They are incremented when a new lock chain is added, but they are
    not decremented when a lock chain is removed. That causes some of the
    statistic counts reported by /proc/lockdep_stats to be incorrect.
    Fix that by decrementing the right counter when a lock chain is removed.

    Since inc_chains() no longer accesses hardirq_context and softirq_context
    directly, it is moved out from the CONFIG_TRACE_IRQFLAGS conditional
    compilation block.
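
    Conceptually, the fix pairs every increment with a matching decrement
    keyed on the chain's recorded IRQ context; a hedged sketch of that
    symmetry (the counter names follow the message, the helper shapes are
    illustrative):

    /* the three per-context chain counters named above */
    static unsigned long nr_hardirq_chains, nr_softirq_chains, nr_process_chains;

    enum chain_context { CHAIN_PROCESS, CHAIN_HARDIRQ, CHAIN_SOFTIRQ };

    static void inc_chains(enum chain_context ctx)
    {
            if (ctx == CHAIN_HARDIRQ)
                    nr_hardirq_chains++;
            else if (ctx == CHAIN_SOFTIRQ)
                    nr_softirq_chains++;
            else
                    nr_process_chains++;
    }

    /* the previously missing half, called when a lock chain is zapped */
    static void dec_chains(enum chain_context ctx)
    {
            if (ctx == CHAIN_HARDIRQ)
                    nr_hardirq_chains--;
            else if (ctx == CHAIN_SOFTIRQ)
                    nr_softirq_chains--;
            else
                    nr_process_chains--;
    }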

    Fixes: a0b0fd53e1e6 ("locking/lockdep: Free lock classes that are no longer in use")
    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Ingo Molnar
    Link: https://lkml.kernel.org/r/20200206152408.24165-2-longman@redhat.com

    Waiman Long
     

25 Jul, 2019

3 commits

  • Report the number of stack traces and the number of stack trace hash
    chains. These two numbers are useful because they allow one to estimate
    the number of stack trace hash collisions.
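
    As a back-of-the-envelope example of why the two numbers are useful
    together: every non-empty hash chain holds at least one trace, so any
    surplus of stored traces over non-empty chains is a lower bound on the
    number of collisions. A tiny illustrative sketch (not lockdep code):

    #include <stdio.h>

    /* lower bound on hash collisions among the stored stack traces */
    static unsigned long min_collisions(unsigned long nr_traces,
                                        unsigned long nr_nonempty_chains)
    {
            return nr_traces > nr_nonempty_chains ?
                   nr_traces - nr_nonempty_chains : 0;
    }

    int main(void)
    {
            /* e.g. 10000 stored traces spread over 9750 occupied chains */
            printf("at least %lu collisions\n", min_collisions(10000, 9750));
            return 0;
    }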

    Signed-off-by: Bart Van Assche
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Waiman Long
    Cc: Will Deacon
    Link: https://lkml.kernel.org/r/20190722182443.216015-5-bvanassche@acm.org
    Signed-off-by: Ingo Molnar

    Bart Van Assche
     
  • Although commit 669de8bda87b ("kernel/workqueue: Use dynamic lockdep keys
    for workqueues") unregisters dynamic lockdep keys when a workqueue is
    destroyed, a side effect of that commit is that all stack traces
    associated with the lockdep key are leaked when a workqueue is destroyed.
    Fix this by storing each unique stack trace once. Other changes in this
    patch are:

    - Use NULL instead of { .nr_entries = 0 } to represent 'no trace'.
    - Store a pointer to a stack trace in struct lock_class and struct
    lock_list instead of storing 'nr_entries' and 'offset'.

    This patch prevents the following program from triggering the
    "BUG: MAX_STACK_TRACE_ENTRIES too low!" complaint:

    #include <fcntl.h>
    #include <unistd.h>

    int main()
    {
            for (;;) {
                    int fd = open("/dev/infiniband/rdma_cm", O_RDWR);
                    close(fd);
            }
    }
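
    A minimal userspace sketch of the underlying idea of storing each unique
    stack trace once (hash it, reuse an existing copy if one matches,
    otherwise store it); the structures and sizes here are made up for
    illustration and are not lockdep's:

    #include <stdint.h>
    #include <string.h>

    #define TRACE_HASH_BUCKETS      128
    #define MAX_TRACE_DEPTH         16
    #define TRACE_POOL_SIZE         1024

    struct trace_entry {
            struct trace_entry *next;       /* hash chain */
            unsigned int nr_entries;
            uintptr_t entries[MAX_TRACE_DEPTH];
    };

    static struct trace_entry *trace_hash[TRACE_HASH_BUCKETS];
    static struct trace_entry pool[TRACE_POOL_SIZE];
    static unsigned int pool_used;

    static uint32_t hash_trace(const uintptr_t *e, unsigned int n)
    {
            uint32_t h = 2166136261u;       /* FNV-1a over the saved PCs */

            for (unsigned int i = 0; i < n; i++)
                    h = (h ^ (uint32_t)e[i]) * 16777619u;
            return h;
    }

    /* return a previously stored identical trace, or store this one */
    static struct trace_entry *save_trace(const uintptr_t *e, unsigned int n)
    {
            uint32_t b = hash_trace(e, n) % TRACE_HASH_BUCKETS;
            struct trace_entry *t;

            for (t = trace_hash[b]; t; t = t->next)
                    if (t->nr_entries == n &&
                        !memcmp(t->entries, e, n * sizeof(*e)))
                            return t;       /* reuse: no new space consumed */

            if (pool_used == TRACE_POOL_SIZE || n > MAX_TRACE_DEPTH)
                    return NULL;            /* out of space: caller copes */

            t = &pool[pool_used++];
            t->nr_entries = n;
            memcpy(t->entries, e, n * sizeof(*e));
            t->next = trace_hash[b];
            trace_hash[b] = t;
            return t;
    }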

    Suggested-by: Peter Zijlstra
    Reported-by: Eric Biggers
    Signed-off-by: Bart Van Assche
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Thomas Gleixner
    Cc: Waiman Long
    Cc: Will Deacon
    Cc: Yuyang Du
    Link: https://lkml.kernel.org/r/20190722182443.216015-4-bvanassche@acm.org
    Signed-off-by: Ingo Molnar

    Bart Van Assche
     
  • This patch does not change the behavior of the lockdep code.

    Signed-off-by: Bart Van Assche
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Waiman Long
    Cc: Will Deacon
    Link: https://lkml.kernel.org/r/20190722182443.216015-2-bvanassche@acm.org
    Signed-off-by: Ingo Molnar

    Bart Van Assche
     

25 Jun, 2019

1 commit

  • When a system has been running for a long time, signed integer
    counters are not enough for some lockdep statistics; using
    unsigned long counters satisfies the requirement. Besides,
    most lockdep statistics are unsigned, so it is better to use
    unsigned int instead of int.

    Remove unused variables.
    - max_recursion_depth
    - nr_cyclic_check_recursions
    - nr_find_usage_forwards_recursions
    - nr_find_usage_backwards_recursions

    Signed-off-by: Kobe Wu
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Eason Lin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Link: https://lkml.kernel.org/r/1561365348-16050-1-git-send-email-kobe-cp.wu@mediatek.com
    Signed-off-by: Ingo Molnar

    Kobe Wu
     

29 Apr, 2019

1 commit

  • check_prev_add_irq() tests all incompatible scenarios one after the
    other while adding a lock (@next) to a tree dependency (@prev):

    LOCK_USED_IN_HARDIRQ vs LOCK_ENABLED_HARDIRQ
    LOCK_USED_IN_HARDIRQ_READ vs LOCK_ENABLED_HARDIRQ
    LOCK_USED_IN_SOFTIRQ vs LOCK_ENABLED_SOFTIRQ
    LOCK_USED_IN_SOFTIRQ_READ vs LOCK_ENABLED_SOFTIRQ

    Also, for each of these four scenarios, we must at least iterate the @prev
    backward dependencies. Then, if one matches the relevant LOCK_USED_* bit,
    we must also iterate the @next forward dependencies.

    Therefore in the best case we iterate 4 times, in the worst case 8 times.

    A different approach can let us divide the number of branch iterations
    by 4:

    1) Iterate through @prev backward dependencies and accumulate all the IRQ
    uses in a single mask. In the best case where the current lock hasn't
    been used in IRQ, we stop here.

    2) Iterate through @next forward dependencies and try to find a lock
    whose usage is exclusive to the accumulated usages gathered in the
    previous step. If we find one (call it @lockA), we have found an
    incompatible use; otherwise we stop here. Only bad locking scenarios
    go further, so a sane verification stops here.

    3) Iterate again through @prev backward dependencies and find the lock
    whose usage matches @lockA in terms of incompatibility. Call that
    lock @lockB.

    4) Report the incompatible usages of @lockA and @lockB.

    If no incompatible use is found, the verification never goes beyond
    step 2 which means at most two iterations.
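
    As a rough sketch of steps 1-2: the mask accumulated over the backward
    walk is turned into the set of conflicting bits and intersected with each
    forward lock's usage (the bit names and helpers below are illustrative,
    not the kernel's):

    #define LOCKF_USED_IN_HARDIRQ   0x01
    #define LOCKF_USED_IN_SOFTIRQ   0x02
    #define LOCKF_ENABLED_HARDIRQ   0x04
    #define LOCKF_ENABLED_SOFTIRQ   0x08

    /* map each usage bit to the bit it is incompatible with */
    unsigned long exclusive_mask(unsigned long mask)
    {
            unsigned long excl = 0;

            if (mask & LOCKF_USED_IN_HARDIRQ)
                    excl |= LOCKF_ENABLED_HARDIRQ;
            if (mask & LOCKF_USED_IN_SOFTIRQ)
                    excl |= LOCKF_ENABLED_SOFTIRQ;
            if (mask & LOCKF_ENABLED_HARDIRQ)
                    excl |= LOCKF_USED_IN_HARDIRQ;
            if (mask & LOCKF_ENABLED_SOFTIRQ)
                    excl |= LOCKF_USED_IN_SOFTIRQ;
            return excl;
    }

    /* step 2 test: does a forward lock's usage hit the conflict set? */
    int usage_conflict(unsigned long accumulated_backward_mask,
                       unsigned long forward_lock_mask)
    {
            return !!(exclusive_mask(accumulated_backward_mask) &
                      forward_lock_mask);
    }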

    The following compares the execution measurements of the function
    check_prev_add_irq():

                    Number of calls | Avg (ns) | Stdev (ns) | Total time (ns)
    --------------------------------------------------------------------------
    Mainline                   8452 |     2652 |      11962 |        22415143
    This patch                 8452 |     1518 |       7090 |        12835602

    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Link: https://lkml.kernel.org/r/20190402160244.32434-5-frederic@kernel.org
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     

18 Apr, 2019

1 commit

  • Instead of open-coding the bitmasks, generate them using the
    lockdep_states.h header.

    This prepares for additional states, which would make the manual masks
    tedious and error prone.
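
    The generation trick is the classic x-macro pattern: keep one list of
    states and expand it into both the enum bits and every derived mask. A
    minimal sketch of the idea (the macro names below are illustrative, not
    the kernel's lockdep_states.h interface):

    /* one central list of IRQ states */
    #define LOCK_STATES(X)  \
            X(HARDIRQ)      \
            X(SOFTIRQ)

    /* expand the list into enum bits ... */
    #define DEFINE_USAGE_BITS(s)    LOCK_USED_IN_##s, LOCK_ENABLED_##s,
    enum usage_bits_sketch {
            LOCK_STATES(DEFINE_USAGE_BITS)
            NR_USAGE_BITS_SKETCH
    };

    /* ... and into the OR-ed masks, instead of hand-written hex constants */
    #define USED_IN_BIT(s)  (1UL << LOCK_USED_IN_##s) |
    #define ENABLED_BIT(s)  (1UL << LOCK_ENABLED_##s) |

    #define LOCKF_USED_IN_IRQ_SKETCH   (LOCK_STATES(USED_IN_BIT) 0UL)
    #define LOCKF_ENABLED_IRQ_SKETCH   (LOCK_STATES(ENABLED_BIT) 0UL)

    /* adding a new state to LOCK_STATES() updates every mask automatically */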

    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

28 Feb, 2019

1 commit

  • This patch does not change any functionality but makes the next patch in
    this series easier to read.

    Signed-off-by: Bart Van Assche
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Johannes Berg
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Waiman Long
    Cc: Will Deacon
    Cc: johannes.berg@intel.com
    Cc: tj@kernel.org
    Link: https://lkml.kernel.org/r/20190214230058.196511-14-bvanassche@acm.org
    Signed-off-by: Ingo Molnar

    Bart Van Assche
     

21 Jan, 2019

1 commit

  • It makes the code more self-explanatory and makes clear throughout the
    code what each magic number refers to:

    - state (Hardirq/Softirq)
    - direction (used in or enabled above state)
    - read or write

    We can even remove some comments that were compensating for the lack of
    those constant names.
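
    For context, each usage bit number packs together the three pieces of
    information listed above; a small sketch of that encoding (the names and
    exact bit positions here are illustrative, based on the description):

    /* bit 0: read or write; bit 1: used-in vs enabled; higher bits: state */
    #define USAGE_READ_MASK         1
    #define USAGE_DIR_MASK          2
    #define USAGE_STATE_MASK        (~(USAGE_READ_MASK | USAGE_DIR_MASK))

    int usage_is_read(int bit)
    {
            return bit & USAGE_READ_MASK;
    }

    int usage_direction(int bit)
    {
            return bit & USAGE_DIR_MASK;    /* used in vs enabled above */
    }

    int usage_state(int bit)
    {
            return bit & USAGE_STATE_MASK;  /* Hardirq vs Softirq group */
    }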

    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Link: https://lkml.kernel.org/r/1545973321-24422-3-git-send-email-frederic@kernel.org
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     

09 Oct, 2018

1 commit

  • A sizable portion of the CPU cycles spent in __lock_acquire() is used
    up by the atomic increment of the class->ops stat counter. By taking it out
    from the lock_class structure and changing it to a per-cpu per-lock-class
    counter, we can reduce the amount of cacheline contention on the class
    structure when multiple CPUs are trying to acquire locks of the same
    class simultaneously.

    To limit the increase in memory consumption because of the percpu nature
    of that counter, it is now put back under the CONFIG_DEBUG_LOCKDEP
    config option. So the memory consumption increase will only occur if
    CONFIG_DEBUG_LOCKDEP is defined. The lock_class structure, however,
    is reduced in size by 16 bytes on 64-bit archs after ops removal and
    a minor restructuring of the fields.

    This patch also fixes a bug in the increment code as the counter is of
    the 'unsigned long' type, but atomic_inc() was used to increment it.
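
    A simplified model of the change: instead of one shared counter bounced
    between CPUs on every acquisition, each CPU increments its own slot and
    the slots are only summed when the value is reported. The sketch below
    uses a plain per-CPU array as a stand-in for the kernel's real percpu
    machinery:

    #define NR_CPUS_SKETCH  64

    struct class_ops_sketch {
            /* one counter slot per CPU; no cross-CPU cacheline bouncing */
            unsigned long ops[NR_CPUS_SKETCH];
    };

    void class_ops_inc(struct class_ops_sketch *s, int cpu)
    {
            s->ops[cpu]++;          /* plain increment on the local slot */
    }

    unsigned long class_ops_read(const struct class_ops_sketch *s)
    {
            unsigned long sum = 0;

            for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
                    sum += s->ops[cpu];
            return sum;             /* aggregated only when reported */
    }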

    Signed-off-by: Waiman Long
    Acked-by: Peter Zijlstra
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Link: http://lkml.kernel.org/r/d66681f3-8781-9793-1dcf-2436a284550b@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     

02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boilerplate text.

    This patch is based on work done by Thomas Gleixner, Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it,
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information.

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier should be applied
    to a file was done in a spreadsheet of side-by-side results from the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files created by Philippe Ombredanne. Philippe prepared the
    base worksheet, and did an initial spot review of a few 1000 files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    should be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging were:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

10 Aug, 2017

1 commit

  • Two boots + a make defconfig, the first didn't have the redundant bit
    in, the second did:

    lock-classes: 1168 1169 [max: 8191]
    direct dependencies: 7688 5812 [max: 32768]
    indirect dependencies: 25492 25937
    all direct dependencies: 220113 217512
    dependency chains: 9005 9008 [max: 65536]
    dependency chain hlocks: 34450 34366 [max: 327680]
    in-hardirq chains: 55 51
    in-softirq chains: 371 378
    in-process chains: 8579 8579
    stack-trace entries: 108073 88474 [max: 524288]
    combined max dependencies: 178738560 169094640

    max locking depth: 15 15
    max bfs queue depth: 320 329

    cyclic checks: 9123 9190

    redundant checks: 5046
    redundant links: 1828

    find-mask forwards checks: 2564 2599
    find-mask backwards checks: 39521 39789

    So it saves nearly 2k links and a fair chunk of stack-trace entries, but
    as expected, makes no real difference on the indirect dependencies.

    At the same time, you see the max BFS depth increase, which is also
    expected, although it could easily be boot variance -- these numbers are
    not entirely stable between boots.

    The down side is that the cycles in the graph become larger and thus
    the reports harder to read.

    XXX: do we want this as a CONFIG variable, implied by LOCKDEP_SMALL?

    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Byungchul Park
    Cc: Linus Torvalds
    Cc: Mel Gorman
    Cc: Michal Hocko
    Cc: Nikolay Borisov
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: akpm@linux-foundation.org
    Cc: boqun.feng@gmail.com
    Cc: iamjoonsoo.kim@lge.com
    Cc: kernel-team@lge.com
    Cc: kirill@shutemov.name
    Cc: npiggin@gmail.com
    Cc: walken@google.com
    Link: http://lkml.kernel.org/r/20170303091338.GH6536@twins.programming.kicks-ass.net
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

19 Apr, 2017

1 commit

  • CONFIG_PROVE_LOCKING_SMALL shrinks the memory usage of lockdep so the
    kernel text, data, and bss fit in the required 32MB limit, but this
    option is not set for every config that enables lockdep.

    A 4.10 kernel fails to boot with the console output

    Kernel: Using 8 locked TLB entries for main kernel image.
    hypervisor_tlb_lock[2000000:0:8000000071c007c3:1]: errors with f
    Program terminated

    with these config options

    CONFIG_LOCKDEP=y
    CONFIG_LOCK_STAT=y
    CONFIG_PROVE_LOCKING=n

    To fix, rename CONFIG_PROVE_LOCKING_SMALL to CONFIG_LOCKDEP_SMALL, and
    enable this option with CONFIG_LOCKDEP=y so we get the reduced memory
    usage every time lockdep is turned on.

    Tested that CONFIG_LOCKDEP_SMALL is set to 'y' if and only if
    CONFIG_LOCKDEP is set to 'y'. When other lockdep-related config options
    that select CONFIG_LOCKDEP are enabled (e.g. CONFIG_LOCK_STAT or
    CONFIG_PROVE_LOCKING), verified that CONFIG_LOCKDEP_SMALL is also
    enabled.

    Fixes: e6b5f1be7afe ("config: Adding the new config parameter CONFIG_PROVE_LOCKING_SMALL for sparc")
    Signed-off-by: Daniel Jordan
    Reviewed-by: Babu Moger
    Signed-off-by: David S. Miller

    Daniel Jordan
     

19 Nov, 2016

1 commit


18 Apr, 2014

1 commit

  • Fuzzing a recent kernel with a large configuration hits the static
    allocation limits and disables lockdep.

    This patch doubles the limits.

    Signed-off-by: Sasha Levin
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1389208906-24338-1-git-send-email-sasha.levin@oracle.com
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Sasha Levin
     

06 Nov, 2013

1 commit