13 Jan, 2021

1 commit

  • Changes in 5.10.6
    Revert "drm/amd/display: Fix memory leaks in S3 resume"
    Revert "mtd: spinand: Fix OOB read"
    rtc: pcf2127: move watchdog initialisation to a separate function
    rtc: pcf2127: only use watchdog when explicitly available
    dt-bindings: rtc: add reset-source property
    kdev_t: always inline major/minor helper functions
    Bluetooth: Fix attempting to set RPA timeout when unsupported
    ALSA: hda/realtek - Modify Dell platform name
    ALSA: hda/hdmi: Fix incorrect mutex unlock in silent_stream_disable()
    drm/i915/tgl: Fix Combo PHY DPLL fractional divider for 38.4MHz ref clock
    scsi: ufs: Allow an error return value from ->device_reset()
    scsi: ufs: Re-enable WriteBooster after device reset
    RDMA/core: remove use of dma_virt_ops
    RDMA/siw,rxe: Make emulated devices virtual in the device tree
    fuse: fix bad inode
    perf: Break deadlock involving exec_update_mutex
    rwsem: Implement down_read_killable_nested
    rwsem: Implement down_read_interruptible
    exec: Transform exec_update_mutex into a rw_semaphore
    mwifiex: Fix possible buffer overflows in mwifiex_cmd_802_11_ad_hoc_start
    Linux 5.10.6

    Signed-off-by: Greg Kroah-Hartman
    Change-Id: Id4c57a151a1e8f2162163d2337b6055f04edbe9b

    Greg Kroah-Hartman
     

09 Jan, 2021

2 commits

  • [ Upstream commit 31784cff7ee073b34d6eddabb95e3be2880a425c ]

    In preparation for converting exec_update_mutex to a rwsem so that
    multiple readers can execute in parallel and not deadlock, add
    down_read_interruptible. This is needed for perf_event_open to be
    converted (with no semantic changes) from working on a mutex to
    working on a rwsem.
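
    [ Illustration, not from the patch itself: a minimal sketch of how the
      new primitive is used. The rwsem below is a stand-in; the call returns
      0 once the read lock is held and -EINTR if a signal interrupts the
      wait. ]

        static DECLARE_RWSEM(example_sem);      /* illustrative, file scope */

        static int example_read_side(void)
        {
                int err;

                err = down_read_interruptible(&example_sem);
                if (err)                /* -EINTR: interrupted by a signal */
                        return err;

                /* read-side section; other readers may run in parallel */

                up_read(&example_sem);
                return 0;
        }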

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/87k0tybqfy.fsf@x220.int.ebiederm.org
    Signed-off-by: Sasha Levin

    Eric W. Biederman
     
  • [ Upstream commit 0f9368b5bf6db0c04afc5454b1be79022a681615 ]

    In preparation for converting exec_update_mutex to a rwsem so that
    multiple readers can execute in parallel and not deadlock, add
    down_read_killable_nested. This is needed so that kcmp_lock
    can be converted from working on mutexes to working on rw_semaphores.
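
    [ Illustration, not the upstream diff: roughly how a kcmp_lock()-style
      helper looks once it takes two rw_semaphores read-side, using the new
      nested variant for the second lock of the same class. ]

        static int kcmp_lock(struct rw_semaphore *l1, struct rw_semaphore *l2)
        {
                int err;

                if (l2 > l1)
                        swap(l1, l2);   /* stable ordering avoids ABBA deadlock */

                err = down_read_killable(l1);
                if (!err && likely(l1 != l2)) {
                        /* second rwsem of the same class: annotate nesting */
                        err = down_read_killable_nested(l2, SINGLE_DEPTH_NESTING);
                        if (err)
                                up_read(l1);
                }
                return err;
        }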

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/87o8jabqh3.fsf@x220.int.ebiederm.org
    Signed-off-by: Sasha Levin

    Eric W. Biederman
     

05 Jan, 2021

1 commit

  • The rwsem_waiter struct is needed in the vendor hook alter_rwsem_list_add.
    The hook has a parameter sem, which is a struct rw_semaphore (already
    exported in rwsem.h); inside that structure there is a wait_list linking
    "struct rwsem_waiter" items. The task information in each item of the
    wait_list needs to be referenced in vendor loadable modules.
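
    [ Sketch of what a vendor module can now do with the exported structure;
      the hook signature below is an assumption for illustration, only the
      sem->wait_list / rwsem_waiter fields come from this change. ]

        #include <linux/list.h>
        #include <linux/rwsem.h>
        #include <linux/sched.h>

        static void vh_alter_rwsem_list_add(void *data, struct rwsem_waiter *waiter,
                                            struct rw_semaphore *sem,
                                            bool *already_on_list)
        {
                struct rwsem_waiter *w;

                /* walk the waiters queued on @sem and look at their tasks */
                list_for_each_entry(w, &sem->wait_list, list) {
                        struct task_struct *t = w->task;

                        if (t && t->prio < waiter->task->prio) {
                                /* e.g. requeue @waiter relative to higher-prio waiters */
                        }
                }
        }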

    Bug: 174902706
    Signed-off-by: Huang Yiwei
    Change-Id: Ic7d21ffdd795eaa203989751d26f8b1f32134d8b

    Huang Yiwei
     

23 Nov, 2020

1 commit


17 Nov, 2020

1 commit

  • A warning was hit when running xfstests/generic/068 in a Hyper-V guest:

    [...] ------------[ cut here ]------------
    [...] DEBUG_LOCKS_WARN_ON(lockdep_hardirqs_enabled())
    [...] WARNING: CPU: 2 PID: 1350 at kernel/locking/lockdep.c:5280 check_flags.part.0+0x165/0x170
    [...] ...
    [...] Workqueue: events pwq_unbound_release_workfn
    [...] RIP: 0010:check_flags.part.0+0x165/0x170
    [...] ...
    [...] Call Trace:
    [...] lock_is_held_type+0x72/0x150
    [...] ? lock_acquire+0x16e/0x4a0
    [...] rcu_read_lock_sched_held+0x3f/0x80
    [...] __send_ipi_one+0x14d/0x1b0
    [...] hv_send_ipi+0x12/0x30
    [...] __pv_queued_spin_unlock_slowpath+0xd1/0x110
    [...] __raw_callee_save___pv_queued_spin_unlock_slowpath+0x11/0x20
    [...] .slowpath+0x9/0xe
    [...] lockdep_unregister_key+0x128/0x180
    [...] pwq_unbound_release_workfn+0xbb/0xf0
    [...] process_one_work+0x227/0x5c0
    [...] worker_thread+0x55/0x3c0
    [...] ? process_one_work+0x5c0/0x5c0
    [...] kthread+0x153/0x170
    [...] ? __kthread_bind_mask+0x60/0x60
    [...] ret_from_fork+0x1f/0x30

    The cause of the problem is the call chain lockdep_unregister_key() ->
    lockdep_unlock() -> arch_spin_unlock() ->
    __pv_queued_spin_unlock_slowpath() -> pv_kick() -> __send_ipi_one() ->
    trace_hyperv_send_ipi_one().

    Although this particular warning is triggered because Hyper-V has a
    tracepoint in IPI sending, in general arch_spin_unlock() may call
    another function that has a tracepoint in it, so put arch_spin_lock()
    and arch_spin_unlock() under lock_recursion protection to fix this
    problem and avoid similar problems.
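
    [ Roughly the resulting shape of the graph-lock helpers (a sketch, not the
      verbatim patch): the per-cpu recursion counter is bumped before taking
      the arch spinlock and dropped only after releasing it, so any tracepoint
      reached from arch_spin_unlock() sees lockdep_recursion set and bails
      out. ]

        static inline void lockdep_lock(void)
        {
                DEBUG_LOCKS_WARN_ON(!irqs_disabled());

                /* enter "lockdep recursion" before touching the graph lock */
                __this_cpu_inc(lockdep_recursion);
                arch_spin_lock(&__lock);
                __owner = current;
        }

        static inline void lockdep_unlock(void)
        {
                if (debug_locks && DEBUG_LOCKS_WARN_ON(__owner != current))
                        return;

                __owner = NULL;
                arch_spin_unlock(&__lock);
                /* leave only after the unlock (and any tracepoint it hits) */
                __this_cpu_dec(lockdep_recursion);
        }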

    Signed-off-by: Boqun Feng
    Signed-off-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/20201113110512.1056501-1-boqun.feng@gmail.com

    Boqun Feng
     

16 Nov, 2020

1 commit


11 Nov, 2020

1 commit

  • Chris Wilson reported a problem spotted by check_chain_key(): a chain
    key got changed in validate_chain() because we modify the ->read in
    validate_chain() to skip checks for dependency adding, and ->read is
    taken into calculation for chain key since commit f611e8cf98ec
    ("lockdep: Take read/write status in consideration when generate
    chainkey").

    Fix this by not modifying ->read in validate_chain(), based on two
    facts: a) since we now support recursive read lock detection, there is
    no need to skip the checks for dependency adding for recursive readers;
    b) given a), there is only one case left (nest_lock) where we want to
    skip checks in validate_chain(), so we simply remove the modification of
    ->read and rely on the return value of check_deadlock() to skip the
    dependency adding.

    Reported-by: Chris Wilson
    Signed-off-by: Boqun Feng
    Signed-off-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/20201102053743.450459-1-boqun.feng@gmail.com

    Boqun Feng
     

02 Nov, 2020

2 commits

  • Linux 5.10-rc2

    Signed-off-by: Greg Kroah-Hartman
    Change-Id: Ib7738b2fe5c513b7eb2dc7b475f4dc848df931d2

    Greg Kroah-Hartman
     
  • Pull locking fixes from Thomas Gleixner:
    "A couple of locking fixes:

    - Fix incorrect failure injection handling in the futex code

    - Prevent a preemption warning in lockdep when tracking
    local_irq_enable() and interrupts are already enabled

    - Remove more raw_cpu_read() usage from lockdep which causes state
    corruption on !X86 architectures.

    - Make the nr_unused_locks accounting in lockdep correct again"

    * tag 'locking-urgent-2020-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    lockdep: Fix nr_unused_locks accounting
    locking/lockdep: Remove more raw_cpu_read() usage
    futex: Fix incorrect should_fail_futex() handling
    lockdep: Fix preemption WARN for spurious IRQ-enable

    Linus Torvalds
     

31 Oct, 2020

2 commits

  • Chris reported that commit 24d5a3bffef1 ("lockdep: Fix
    usage_traceoverflow") breaks the nr_unused_locks validation code
    triggered by /proc/lockdep_stats.

    By fully splitting LOCK_USED and LOCK_USED_READ it becomes a bad
    indicator for accounting nr_unused_locks; simplify by using any first
    bit.
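
    [ In mark_lock() terms the idea is roughly the following (simplified
      sketch, not the literal diff): a class stops counting as unused the
      first time any usage bit is set on it, instead of keying on the
      LOCK_USED bit specifically. ]

        old_mask = hlock_class(this)->usage_mask;
        hlock_class(this)->usage_mask |= new_mask;

        /* first usage bit of any kind: the class is no longer unused */
        if (!old_mask)
                debug_atomic_dec(nr_unused_locks);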

    Fixes: 24d5a3bffef1 ("lockdep: Fix usage_traceoverflow")
    Reported-by: Chris Wilson
    Signed-off-by: Peter Zijlstra (Intel)
    Tested-by: Chris Wilson
    Link: https://lkml.kernel.org/r/20201027124834.GL2628@hirez.programming.kicks-ass.net

    Peter Zijlstra
     
  • I initially thought raw_cpu_read() was OK, since if it is !0 we have
    IRQs disabled and can't get migrated, so if we get migrated both CPUs
    must have 0 and it doesn't matter which 0 we read.

    And while that is true, it isn't the whole story: on pretty much all
    architectures (except x86) this can result in computing the address for
    one CPU, getting migrated, the old CPU continuing execution with another
    task (possibly setting recursion) and then the new CPU reading the value
    of the old CPU, which is no longer 0.

    Similar to:

    baffd723e44d ("lockdep: Revert "lockdep: Use raw_cpu_*() for per-cpu variables"")

    Signed-off-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/20201026152256.GB2651@hirez.programming.kicks-ass.net

    Peter Zijlstra
     

26 Oct, 2020

1 commit


22 Oct, 2020

1 commit

  • It is valid (albeit uncommon) to call local_irq_enable() without first
    having called local_irq_disable(). In this case we enter
    lockdep_hardirqs_on*() with IRQs enabled and trip a preemption warning
    for using __this_cpu_read().

    Use this_cpu_read() instead to avoid the warning.
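
    [ The pattern of the fix, sketched; treat the exact call sites as an
      assumption. The variable is lockdep's per-cpu recursion counter. ]

        /* lockdep_hardirqs_on*() can now run preemptible, so the unchecked
         * accessor would trip the smp_processor_id() preemption warning: */
        if (unlikely(this_cpu_read(lockdep_recursion)))  /* was __this_cpu_read() */
                return;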

    Fixes: 4d004099a6 ("lockdep: Fix lockdep recursion")
    Reported-by: syzbot+53f8ce8bbc07924b6417@syzkaller.appspotmail.com
    Reported-by: kernel test robot
    Signed-off-by: Peter Zijlstra (Intel)

    Peter Zijlstra
     

21 Oct, 2020

1 commit


19 Oct, 2020

1 commit

  • Pull RCU changes from Ingo Molnar:

    - Debugging for smp_call_function()

    - RT raw/non-raw lock ordering fixes

    - Strict grace periods for KASAN

    - New smp_call_function() torture test

    - Torture-test updates

    - Documentation updates

    - Miscellaneous fixes

    [ This doesn't actually pull the tag - I've dropped the last merge from
    the RCU branch due to questions about the series. - Linus ]

    * tag 'core-rcu-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (77 commits)
    smp: Make symbol 'csd_bug_count' static
    kernel/smp: Provide CSD lock timeout diagnostics
    smp: Add source and destination CPUs to __call_single_data
    rcu: Shrink each possible cpu krcp
    rcu/segcblist: Prevent useless GP start if no CBs to accelerate
    torture: Add gdb support
    rcutorture: Allow pointer leaks to test diagnostic code
    rcutorture: Hoist OOM registry up one level
    refperf: Avoid null pointer dereference when buf fails to allocate
    rcutorture: Properly synchronize with OOM notifier
    rcutorture: Properly set rcu_fwds for OOM handling
    torture: Add kvm.sh --help and update help message
    rcutorture: Add CONFIG_PROVE_RCU_LIST to TREE05
    torture: Update initrd documentation
    rcutorture: Replace HTTP links with HTTPS ones
    locktorture: Make function torture_percpu_rwsem_init() static
    torture: document --allcpus argument added to the kvm.sh script
    rcutorture: Output number of elapsed grace periods
    rcutorture: Remove KCSAN stubs
    rcu: Remove unused "cpu" parameter from rcu_report_qs_rdp()
    ...

    Linus Torvalds
     

09 Oct, 2020

4 commits

  • Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Steve reported that lockdep_assert*irq*(), when nested inside lockdep
    itself, will trigger a false-positive.

    One example is the stack-trace code, as called from inside lockdep,
    triggering tracing, which in turn calls RCU, which then uses
    lockdep_assert_irqs_disabled().

    Fixes: a21ee6055c30 ("lockdep: Change hardirq{s_enabled,_context} to per-cpu variables")
    Reported-by: Steven Rostedt
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Basically print_lock_class_header()'s for loop is out of sync with the
    size of ->usage_traces[].

    Also clean things up a bit while at it, to avoid such mishaps in the future.

    Fixes: 23870f122768 ("locking/lockdep: Fix "USED" <- "IN-NMI" inversions")
    Debugged-by: Boqun Feng
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Ingo Molnar
    Tested-by: Qian Cai
    Link: https://lkml.kernel.org/r/20200930094937.GE2651@hirez.programming.kicks-ass.net

    Peter Zijlstra
     
  • …k/linux-rcu into core/rcu

    Pull v5.10 RCU changes from Paul E. McKenney:

    - Debugging for smp_call_function().

    - Strict grace periods for KASAN. The point of this series is to find
    RCU-usage bugs, so the corresponding new RCU_STRICT_GRACE_PERIOD
    Kconfig option depends on both DEBUG_KERNEL and RCU_EXPERT, and is
    further disabled by default. Finally, the help text includes
    a goodly list of scary caveats.

    - New smp_call_function() torture test.

    - Torture-test updates.

    - Documentation updates.

    - Miscellaneous fixes.

    Signed-off-by: Ingo Molnar <mingo@kernel.org>

    Ingo Molnar
     

29 Sep, 2020

1 commit

  • Qian Cai reported a BFS_EQUEUEFULL warning [1] after the read-recursive
    deadlock detection was merged into the tip tree recently. Unlike the
    previous lockdep graph searching, which iterates every lock class (every
    node in the graph) exactly once, the graph searching for read-recursive
    deadlock detection needs to iterate every lock dependency (every edge in
    the graph) once. As a result, the maximum memory cost of the circular
    queue changes from O(V), where V is the number of lock classes (nodes or
    vertices) in the graph, to O(E), where E is the number of lock
    dependencies (edges), because every lock class or dependency gets
    enqueued once in the BFS. Therefore we hit the BFS_EQUEUEFULL case.

    However, we don't actually need to enqueue all dependencies for the BFS,
    because every time we enqueue a dependency, we almost enqueue all
    other dependencies in the same dependency list ("almost" because
    we currently check before enqueueing, so if a dependency doesn't pass the
    check stage we won't enqueue it; however, we can always do the check in
    reverse order). Based on this, we can enqueue only the first dependency
    from a dependency list, and every time we want to fetch a new dependency
    to work on, we can either:

    1) fetch the dependency next to the current dependency in the
    dependency list
    or

    2) if the dependency in 1) doesn't exist, fetch the dependency from
    the queue.

    With this approach, the "max bfs queue depth" for an x86_64_defconfig +
    lockdep and selftest config kernel decreases from:

    max bfs queue depth: 201

    to (after applying this patch):

    max bfs queue depth: 61

    While I'm at it, clean up the code logic a little (e.g. return directly
    rather than setting a "ret" value and jumping to the "exit" label).

    [1]: https://lore.kernel.org/lkml/17343f6f7f2438fc376125384133c5ba70c2a681.camel@redhat.com/

    Reported-by: Qian Cai
    Reported-by: syzbot+62ebe501c1ce9a91f68c@syzkaller.appspotmail.com
    Signed-off-by: Boqun Feng
    Signed-off-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/20200917080210.108095-1-boqun.feng@gmail.com

    Boqun Feng
     

21 Sep, 2020

1 commit


16 Sep, 2020

1 commit

  • The __this_cpu*() accessors are (in general) IRQ-unsafe which, given
    that percpu-rwsem is a blocking primitive, should be just fine.

    However, file_end_write() is used from IRQ context and will cause
    load-store issues on architectures where the per-cpu accessors are not
    natively irq-safe.

    Fix it by using the IRQ-safe this_cpu_*() for operations on
    read_count. This will generate more expensive code on a number of
    platforms, which might cause a performance regression for some of the
    other percpu-rwsem users.

    If any such is reported, we can consider alternative solutions.
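
    [ Sketched against the reader fast path (simplified; might_sleep() and the
      lockdep annotation are omitted): only the read_count accessors change,
      from the __this_cpu_*() forms to the IRQ-safe this_cpu_*() forms. ]

        static inline void percpu_down_read(struct percpu_rw_semaphore *sem)
        {
                preempt_disable();
                if (likely(rcu_sync_is_idle(&sem->rss)))
                        this_cpu_inc(*sem->read_count); /* was __this_cpu_inc() */
                else
                        __percpu_down_read(sem, false);
                preempt_enable();
        }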

    Fixes: 70fe2f48152e ("aio: fix freeze protection of aio writes")
    Signed-off-by: Hou Tao
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Will Deacon
    Acked-by: Oleg Nesterov
    Link: https://lkml.kernel.org/r/20200915140750.137881-1-houtao1@huawei.com

    Hou Tao
     

03 Sep, 2020

1 commit

  • During the LPC RCU BoF Paul asked how come the "USED" <- "IN-NMI"
    detector doesn't work for rcu_read_lock(). Looking into it, the culprit
    turned out to be a typo in the usage check:

    -   if (!(class->usage_mask & LOCK_USED))
    +   if (!(class->usage_mask & LOCKF_USED))

    fixing that will indeed cause rcu_read_lock() to insta-splat :/

    The above typo means that instead of testing for: 0x100 (1 <<
    LOCK_USED), we test for 8 (LOCK_USED), which corresponds to (1 <<
    LOCK_ENABLED_HARDIRQ).

    So instead of testing for _any_ used lock, it will only match any lock
    used with interrupts enabled.

    The rcu_read_lock() annotation uses .check=0, which means it will not
    set any of the interrupt bits and will thus never match.

    In order to properly fix the situation and allow rcu_read_lock() to
    work correctly, split LOCK_USED into LOCK_USED and LOCK_USED_READ; by
    having .read users set USED_READ and test USED, pure read-recursive
    locks are permitted.

    Fixes: f6f48e180404 ("lockdep: Teach lockdep about "USED" <- "IN-NMI" inversions")
    Signed-off-by: Ingo Molnar
    Tested-by: Masami Hiramatsu
    Acked-by: Paul E. McKenney
    Link: https://lore.kernel.org/r/20200902160323.GK1362448@hirez.programming.kicks-ass.net

    peterz@infradead.org
     

01 Sep, 2020

1 commit


26 Aug, 2020

14 commits

  • Currently, the chainkey of a lock chain is a hash sum of the class_idx
    of all the held locks; the read/write status is not taken into
    consideration while generating the chainkey. This can result in a
    problem if we have:

    P1()
    {
            read_lock(B);
            lock(A);
    }

    P2()
    {
            lock(A);
            read_lock(B);
    }

    P3()
    {
            lock(A);
            write_lock(B);
    }

    , and P1(), P2(), P3() run one by one. When running P2(), lockdep
    detects that such a lock chain A -> B is not a deadlock, so it is added
    to the chain cache; then when running P3(), even though it is a deadlock,
    we could miss it because of the chain cache hit. This can be confirmed by
    the self testcase "chain cached mixed R-L/L-W".

    To resolve this, we use the concept of "hlock_id" to generate the
    chainkey; an hlock_id is a tuple (hlock->class_idx, hlock->read), which
    fits in a u16 type. With this, the chainkeys are different if the lock
    sequences have the same locks but different read/write status.

    Besides, since we use "hlock_id" to generate chainkeys, the chain_hlocks
    array now stores the "hlock_id"s rather than lock_class indexes.
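
    [ The encoding is small enough to show; a sketch of the helpers implied
      above, treating MAX_LOCKDEP_KEYS_BITS as the bit width of class_idx. ]

        /* pack (class_idx, read) into one u16, used both for chain-key
         * hashing and for the entries of chain_hlocks[] */
        static inline u16 hlock_id(struct held_lock *hlock)
        {
                return (hlock->class_idx | (hlock->read << MAX_LOCKDEP_KEYS_BITS));
        }

        static inline unsigned int chain_hlock_class_idx(u16 hlock_id)
        {
                return hlock_id & (MAX_LOCKDEP_KEYS - 1);
        }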

    Signed-off-by: Boqun Feng
    Signed-off-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/20200807074238.1632519-15-boqun.feng@gmail.com

    Boqun Feng
     
  • Since we have all the fundamentals to handle recursive read locks, we
    now add them into the dependency graph.

    Signed-off-by: Boqun Feng
    Signed-off-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/20200807074238.1632519-13-boqun.feng@gmail.com

    Boqun Feng
     
  • Currently, in safe->unsafe detection, lockdep misses the fact that a
    LOCK_ENABLED_IRQ_*_READ usage and a LOCK_USED_IN_IRQ_*_READ usage may
    cause deadlock too, for example:

    P1                          P2
    <irq disabled>
    write_lock(l1);             <irq enabled>
                                read_lock(l2);
    write_lock(l2);
                                <in irq>
                                read_lock(l1);

    Actually, all of the following cases may cause deadlocks:

    LOCK_USED_IN_IRQ_* -> LOCK_ENABLED_IRQ_*
    LOCK_USED_IN_IRQ_*_READ -> LOCK_ENABLED_IRQ_*
    LOCK_USED_IN_IRQ_* -> LOCK_ENABLED_IRQ_*_READ
    LOCK_USED_IN_IRQ_*_READ -> LOCK_ENABLED_IRQ_*_READ

    To fix this, we need to 1) change the calculation of exclusive_mask() so
    that READ bits are not dropped, and 2) always call usage() in
    mark_lock_irq() to check usage deadlocks, even when the new usage of the
    lock is READ.

    Besides, adjust usage_match() and usage_accumulate() for the recursive
    read lock changes.

    Signed-off-by: Boqun Feng
    Signed-off-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/20200807074238.1632519-12-boqun.feng@gmail.com

    Boqun Feng
     
  • check_redundant() will report redundancy if it finds a path that could
    replace the about-to-add dependency in the BFS search. With the recursive
    read lock changes, we certainly need to change the match function for
    check_redundant(), because the path needs to match not only the lock
    class but also the dependency kinds. For example, if the about-to-add
    dependency @prev -> @next is A -(SN)-> B, and we find a path A -(S*)->
    .. -(*R)-> B in the dependency graph with __bfs() (for simplicity, we can
    also say we find an -(SR)-> path from A to B), we cannot replace the
    dependency with that path in the BFS search, because the -(SN)->
    dependency can make a strong path with a following -(S*)-> dependency,
    whereas an -(SR)-> path cannot.

    Further, we can replace an -(SN)-> dependency with an -(EN)-> path; that
    means if we find a path which is stronger than or equal to the
    about-to-add dependency, we can report the redundancy. By "stronger", we
    mean both the start and the end of the path are not weaker than the
    start and the end of the dependency (E is "stronger" than S and N is
    "stronger" than R), so we can replace the dependency with that
    path.

    To make sure we find a path whose start point is not weaker than the
    about-to-add dependency, we use a trick: the ->only_xr of the root
    (start point) of __bfs() is initialized as @prev->read == 0, therefore if
    @prev is E, __bfs() will pick only -(E*)-> for the first dependency,
    otherwise, __bfs() can pick -(E*)-> or -(S*)-> for the first dependency.

    To make sure we find a path whose end point is not weaker than the
    about-to-add dependency, we replace the match function for __bfs() in
    check_redundant(): we check for the case that either @next is R
    (anything is not weaker than it) or the end point of the path is N
    (which is not weaker than anything).

    Signed-off-by: Boqun Feng
    Signed-off-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/20200807074238.1632519-11-boqun.feng@gmail.com

    Boqun Feng
     
  • Currently, lockdep only has limited support for deadlock detection for
    recursive read locks.

    This patch supports deadlock detection for recursive read locks. The
    basic idea is:

    We are about to add dependency B -> A into the dependency graph. We use
    check_noncircular() to find whether we have a strong dependency path
    A -> .. -> B so that we have a strong dependency circle (a closed strong
    dependency path):

    A -> .. -> B -> A

    , which doesn't have two adjacent dependencies as -(*R)-> L -(S*)->.

    Since A -> .. -> B is already a strong dependency path, if either
    B -> A is -(E*)-> or A -> .. -> B is -(*N)->, the circle A -> .. -> B ->
    A is strong, otherwise it is not. So we introduce a new match function
    hlock_conflict() to replace the class_equal() for the deadlock check in
    check_noncircular().

    Signed-off-by: Boqun Feng
    Signed-off-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/20200807074238.1632519-10-boqun.feng@gmail.com

    Boqun Feng
     
  • The "match" parameter of __bfs() is used for checking whether we hit a
    match in the search, therefore it should return a boolean value rather
    than an integer for better readability.

    This patch then changes the return type of the function parameter and the
    match functions to bool.

    Suggested-by: Peter Zijlstra
    Signed-off-by: Boqun Feng
    Signed-off-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/20200807074238.1632519-9-boqun.feng@gmail.com

    Boqun Feng
     
  • Now we have four types of dependencies in the dependency graph, and not
    all the paths carry real dependencies (the dependencies that may cause
    a deadlock), for example:

    Given lock A and B, if we have:

    CPU1                    CPU2
    =============           ==============
    write_lock(A);          read_lock(B);
    read_lock(B);           write_lock(A);

    (assuming read_lock(B) is a recursive reader)

    then we have dependencies A -(ER)-> B, and B -(SN)-> A, and a
    dependency path A -(ER)-> B -(SN)-> A.

    In lockdep w/o recursive locks, a dependency path from A to A
    means a deadlock. However, the above case is obviously not a
    deadlock, because no one holds B exclusively, therefore no one
    waits for the other to release B, so whichever CPU gets A first
    among CPU1 and CPU2 will run without blocking.

    As a result, dependency path A -(ER)-> B -(SN)-> A is not a
    real/strong dependency that could cause a deadlock.

    From the observation above, we know that for a dependency path to be
    real/strong, no two adjacent dependencies can be as -(*R)-> -(S*)->.

    Now our mission is to make __bfs() traverse only the strong dependency
    paths, which is simple: we record whether we only have -(*R)-> for the
    previous lock_list of the path in lock_list::only_xr, and when we pick a
    dependency during the traversal, we 1) filter out -(S*)-> dependencies if
    the previous lock_list only has -(*R)-> dependencies (i.e. ->only_xr is
    true) and 2) set the next lock_list::only_xr to true if we only have
    -(*R)-> left after we filter out dependencies based on 1); otherwise, set
    it to false.

    With this extension for __bfs(), we now need to initialize the root of
    __bfs() properly (with a correct ->only_xr), to do so, we introduce some
    helper functions, which also cleans up a little bit for the __bfs() root
    initialization code.

    Signed-off-by: Boqun Feng
    Signed-off-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/20200807074238.1632519-8-boqun.feng@gmail.com

    Boqun Feng
     
  • To add recursive read locks into the dependency graph, we need to store
    the types of dependencies for the BFS later. There are four types of
    dependencies:

    * Exclusive -> Non-recursive dependencies: EN
    e.g. write_lock(prev) held and try to acquire write_lock(next)
    or non-recursive read_lock(next), which can be represented as
    "prev -(EN)-> next"

    * Shared -> Non-recursive dependencies: SN
    e.g. read_lock(prev) held and try to acquire write_lock(next) or
    non-recursive read_lock(next), which can be represented as
    "prev -(SN)-> next"

    * Exclusive -> Recursive dependencies: ER
    e.g. write_lock(prev) held and try to acquire recursive
    read_lock(next), which can be represented as "prev -(ER)-> next"

    * Shared -> Recursive dependencies: SR
    e.g. read_lock(prev) held and try to acquire recursive
    read_lock(next), which can be represented as "prev -(SR)-> next"

    So we use 4 bits for the presence of each type in lock_list::dep. Helper
    functions and macros are also introduced to convert a pair of locks into
    lock_list::dep bit and maintain the addition of different types of
    dependencies.
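
    [ A sketch of the encoding this describes; the helper names follow the
      style of the description and are an assumption, not necessarily the
      exact upstream macros. ]

        /* bit number inside lock_list::dep for a prev -> next dependency:
         * bit0: prev is exclusive (E) rather than shared (S),
         * bit1: next is non-recursive (N) rather than recursive (R) */
        static inline unsigned int
        __calc_dep_bit(struct held_lock *prev, struct held_lock *next)
        {
                return (prev->read == 0) + ((next->read != 2) << 1);
        }

        static inline u8 calc_dep(struct held_lock *prev, struct held_lock *next)
        {
                return 1U << __calc_dep_bit(prev, next); /* one of SR/ER/SN/EN */
        }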

    Signed-off-by: Boqun Feng
    Signed-off-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/20200807074238.1632519-7-boqun.feng@gmail.com

    Boqun Feng
     
  • lock_list::distance is always not greater than MAX_LOCK_DEPTH (which
    is 48 right now), so a u16 will fit. This patch reduces the size of
    lock_list::distance to save space, so that we can introduce other fields
    to help detect recursive read lock deadlocks without increasing the size
    of lock_list structure.

    Suggested-by: Peter Zijlstra
    Signed-off-by: Boqun Feng
    Signed-off-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/20200807074238.1632519-6-boqun.feng@gmail.com

    Boqun Feng
     
  • Currently, __bfs() will do a breadth-first search in the dependency
    graph and visit each lock class in the graph exactly once, so for
    example, in the following graph:

    A ---------> B
    |            ^
    |            |
    +----------> C

    a __bfs() call starting at A will visit B through dependency A -> B and
    visit C through dependency A -> C, and that's it; IOW, __bfs() will not
    visit dependency C -> B.

    This is OK for now, as we only have strong dependencies in the
    dependency graph, so whenever there is a traverse path from A to B in
    __bfs(), it means A has strong dependencies to B (IOW, B depends on A
    strongly). So no need to visit all dependencies in the graph.

    However, as we are going to add recursive-read locks into the dependency
    graph, not all the paths mean strong dependencies; in the
    same example above, dependency A -> B may be a weak dependency and
    the traversal A -> C -> B may be a strong dependency path. And with the
    old way of __bfs() (i.e. visiting every lock class exactly once), we will
    miss the strong dependency path, which will result in failing to find
    a deadlock. To cure this for the future, we need to find a way for
    __bfs() to visit each dependency, rather than each class, exactly once
    in the search until we find a match.

    The solution is simple:

    We used to mark lock_class::lockdep_dependency_gen_id to indicate a
    class has been visited in __bfs(), now we change the semantics a little
    bit: we now mark lock_class::lockdep_dependency_gen_id to indicate _all
    the dependencies_ in its lock_{after,before} have been visited in the
    __bfs() (note we only take one direction in a __bfs() search). In this
    way, every dependency is guaranteed to be visited until we find a match.

    Note: the checks in mark_lock_accessed() and lock_accessed() are
    removed, because after this modification, we may call these two
    functions on @source_entry of __bfs(), which may not be the entry in
    "list_entries"

    Signed-off-by: Boqun Feng
    Signed-off-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/20200807074238.1632519-5-boqun.feng@gmail.com

    Boqun Feng
     
  • __bfs() could return four magic numbers:

    1: search succeeds, but none match.
    0: search succeeds, find one match.
    -1: search fails because the cq is full.
    -2: search fails because an invalid node is found.

    This patch cleans things up by using an enum type for the return value
    of __bfs() and its friends; this improves the readability of the
    code, and further, could help if we want to extend the BFS.
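
    [ The resulting enum looks roughly like this; BFS_EQUEUEFULL is the value
      that shows up in the warning discussed in the 29 Sep, 2020 entry above,
      the other names are inferred from the description. ]

        enum bfs_result {
                BFS_EINVALIDNODE = -2,  /* an invalid node was found */
                BFS_EQUEUEFULL = -1,    /* the circular queue is full */
                BFS_RMATCH = 0,         /* search succeeded, found a match */
                BFS_RNOMATCH = 1,       /* search succeeded, no match */
        };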

    Signed-off-by: Boqun Feng
    Signed-off-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/20200807074238.1632519-4-boqun.feng@gmail.com

    Boqun Feng
     
  • On the archs using QUEUED_RWLOCKS, read_lock() is not always a recursive
    read lock; actually it's only recursive if in_interrupt() is true. So
    change the annotation accordingly to catch more deadlocks.

    Note we used to treat read_lock() as pure recursive read locks in
    lib/locking-selftest.c, and this is useful, especially for the lockdep
    development selftest, so we keep this via a variable to force switching
    the lock annotation for read_lock().
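
    [ Conceptually the annotation becomes the following (a sketch of the
      idea; the "variable" mentioned above is the force switch used by the
      selftests). ]

        static __always_inline bool read_lock_is_recursive(void)
        {
                return force_read_lock_recursive ||
                       !IS_ENABLED(CONFIG_QUEUED_RWLOCKS) ||
                       in_interrupt();
        }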

    Signed-off-by: Boqun Feng
    Signed-off-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/20200807074238.1632519-2-boqun.feng@gmail.com

    Boqun Feng
     
  • The lockdep tracepoints are under the lockdep recursion counter; this
    has a bunch of nasty side effects:

    - TRACE_IRQFLAGS doesn't work across the entire tracepoint

    - RCU-lockdep doesn't see the tracepoints either, hiding numerous
    "suspicious RCU usage" warnings.

    Pull the trace_lock_*() tracepoints completely out from under the
    lockdep recursion handling and completely rely on the trace-level
    recursion handling -- also, tracing *SHOULD* not be taking locks in any
    case.

    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Steven Rostedt (VMware)
    Reviewed-by: Thomas Gleixner
    Acked-by: Rafael J. Wysocki
    Tested-by: Marco Elver
    Link: https://lkml.kernel.org/r/20200821085348.782688941@infradead.org

    Peter Zijlstra
     
  • Sven reported that commit a21ee6055c30 ("lockdep: Change
    hardirq{s_enabled,_context} to per-cpu variables") caused trouble on
    s390 because their this_cpu_*() primitives disable preemption which
    then lands back in tracing.

    On the one hand, per-cpu ops should use preempt_*able_notrace() and
    raw_local_irq_*(); on the other hand, we can trivially use raw_cpu_*()
    ops for this.

    Fixes: a21ee6055c30 ("lockdep: Change hardirq{s_enabled,_context} to per-cpu variables")
    Reported-by: Sven Schnelle
    Reviewed-by: Steven Rostedt (VMware)
    Reviewed-by: Thomas Gleixner
    Acked-by: Rafael J. Wysocki
    Tested-by: Marco Elver
    Signed-off-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/20200821085348.192346882@infradead.org

    Peter Zijlstra
     

25 Aug, 2020

1 commit