06 Apr, 2010

1 commit

  • Locking statistics are implemented using global atomic
    variables. This is usually fine unless some path writes to them
    very often.

    That is the case for the function and function-graph tracers,
    which disable irqs for each entry saved (except when the function
    tracer is in preempt-disabled-only mode). Each call to
    local_irq_save/restore() then increments the hardirqs_on_events
    and hardirqs_off_events stats (or the equivalent stats for the
    redundant versions).

    Incrementing these global variables for each traced function
    results in too much cache bouncing when lockstats are enabled.

    To solve this, implement the debug_atomic_*() operations using
    per-cpu variables, as sketched below.

    -v2: Use per_cpu() instead of get_cpu_var() to fetch the desired
    cpu vars on debug_atomic_read()

    -v3: Store the stats in a structure. No need for local_t as we
    are NMI/irq safe.

    -v4: Fix tons of build errors. I thought I had tested it but I
    probably forgot to select the relevant config.
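
    A minimal kernel-style sketch of the per-cpu scheme described
    above (assuming <linux/percpu.h>; the counter fields follow the
    description, the macro bodies are illustrative, not the patch
    verbatim):

        struct lockdep_stats {
                unsigned long hardirqs_on_events;
                unsigned long hardirqs_off_events;
                /* ... one counter per former global atomic ... */
        };

        static DEFINE_PER_CPU(struct lockdep_stats, lockdep_stats);

        /* the write side touches only this cpu's cacheline: */
        #define debug_atomic_inc(field)         \
                this_cpu_inc(lockdep_stats.field)

        /* the read side sums the per-cpu counters: */
        #define debug_atomic_read(field)                                \
        ({                                                              \
                unsigned long __total = 0;                              \
                int __cpu;                                              \
                for_each_possible_cpu(__cpu)                            \
                        __total += per_cpu(lockdep_stats, __cpu).field; \
                __total;                                                \
        })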

    Suggested-by: Steven Rostedt
    Signed-off-by: Frederic Weisbecker
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     

24 Jul, 2009

5 commits

  • Some cleanups of the lockdep code after the BFS series:

    - Remove the last traces of the generation id
    - Fixup comment style
    - Move the bfs routines into lockdep.c
    - Cleanup the bfs routines

    [ tom.leiming@gmail.com: Fix crash ]
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Add BFS statistics to the existing lockdep stats.

    Signed-off-by: Ming Lei
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ming Lei
     
  • Since the shortest lock-dependency path can be obtained by BFS,
    we print that shortest path with print_shortest_lock_dependencies().

    Signed-off-by: Ming Lei
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ming Lei
     
  • 1. Replace % MAX_CIRCULAR_QUE_SIZE with & (MAX_CIRCULAR_QUE_SIZE - 1),
    since MAX_CIRCULAR_QUE_SIZE is defined as a power of 2.

    2. Use a bitmap to mark whether a lock has been accessed during
    BFS, so the marks can be cleared quickly, because we may search
    the graph many times. (Both points are sketched below.)
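
    A rough sketch of both points (kernel-style; overflow checks
    omitted and helper names simplified):

        struct circular_queue {
                unsigned long element[MAX_CIRCULAR_QUE_SIZE];
                unsigned int front, rear;
        };

        static inline void __cq_enqueue(struct circular_queue *cq,
                                        unsigned long elem)
        {
                cq->element[cq->rear] = elem;
                /* power-of-2 size: masking replaces the modulo */
                cq->rear = (cq->rear + 1) & (MAX_CIRCULAR_QUE_SIZE - 1);
        }

        /* one "visited" bit per dependency entry; resetting all
         * marks before a new search is then a single bitmap_zero()
         * over a compact bitmap, not a walk of every lock_class: */
        static DECLARE_BITMAP(bfs_accessed, MAX_LOCKDEP_ENTRIES);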

    Signed-off-by: Ming Lei
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ming Lei
     
  • Currently lockdep prints the first circle detected, if one
    exists, when acquiring a new (next) lock.

    This patch prints the shortest path from the next lock to be
    acquired to the previously held lock if a circle is found.

    The patch still uses the current method to check for a circle;
    once one is found, a breadth-first search algorithm is used to
    compute the shortest path from the next lock to the previous
    lock in the forward lock-dependency graph.

    Printing the shortest path shortens the reported dependency
    chain and makes troubleshooting possible circular locking
    problems easier.
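
    A sketch of the mechanism: BFS visits nodes in increasing hop
    count, so the first time it reaches the target the path is
    minimal; recording a parent link per visited entry lets the
    printer walk that path back (fields and the function body are
    simplified, not the patch verbatim):

        struct lock_list {
                struct lock_class       *class;
                struct lock_list        *parent;  /* set during BFS */
                /* ... */
        };

        /* walk the parent links back from the entry where BFS hit
         * the target, printing the shortest dependency chain: */
        static void print_shortest_lock_dependencies(struct lock_list *leaf)
        {
                struct lock_list *entry;

                for (entry = leaf; entry; entry = entry->parent)
                        print_lock_name(entry->class);
        }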

    Signed-off-by: Ming Lei
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ming Lei
     

13 May, 2009

1 commit

  • Now that lockdep coverage has increased, it has become easier to
    run out of entries:

    [ 21.401387] BUG: MAX_LOCKDEP_ENTRIES too low!
    [ 21.402007] turning off the locking correctness validator.
    [ 21.402007] Pid: 1555, comm: S99local Not tainted 2.6.30-rc5-tip #2
    [ 21.402007] Call Trace:
    [ 21.402007] [] add_lock_to_list+0x53/0xba
    [ 21.402007] [] ? lookup_mnt+0x19/0x53
    [ 21.402007] [] check_prev_add+0x14b/0x1c7
    [ 21.402007] [] validate_chain+0x474/0x52a
    [ 21.402007] [] __lock_acquire+0x342/0x3c7
    [ 21.402007] [] lock_acquire+0xc1/0xe5
    [ 21.402007] [] ? lookup_mnt+0x19/0x53
    [ 21.402007] [] _spin_lock+0x31/0x66

    Double the size - as we've done in the past.

    [ Impact: allow lockdep to cover more locks ]
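
    The fix itself is a one-line constant bump in
    kernel/lockdep_internals.h; schematically (the concrete values
    here are illustrative, not quoted from the patch):

        -#define MAX_LOCKDEP_ENTRIES    8192UL
        +#define MAX_LOCKDEP_ENTRIES    16384UL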

    Acked-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

15 Feb, 2009

4 commits

  • A generic, state-independent get_user_chars().

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Generate the state bit definitions from the lockdep_states.h file.

    Also, move LOCK_USED to last, so that the

    USED_IN
    USED_IN_READ
    ENABLED
    ENABLED_READ

    states are nicely bit-aligned -- we're going to use that property.
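
    Roughly, the generation works x-macro style: lockdep_states.h
    carries one LOCKDEP_STATE(STATE) line per state, and the enum
    expands four usage bits per state (a sketch; details may differ
    from the patch):

        enum lock_usage_bit {
        #define LOCKDEP_STATE(__STATE)                  \
                LOCK_USED_IN_##__STATE,                 \
                LOCK_USED_IN_##__STATE##_READ,          \
                LOCK_ENABLED_##__STATE,                 \
                LOCK_ENABLED_##__STATE##_READ,
        #include "lockdep_states.h"
        #undef LOCKDEP_STATE
                LOCK_USED,  /* last, so each state's bits stay aligned */
                LOCK_USAGE_STATES
        };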

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • For convenience later.

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Here is another version, with the incremental patch rolled up,
    reclaim-context annotation added to kswapd, and allocation
    tracing added to the slab allocators (which may only ever reach
    the page allocator in rare cases, so it is good to put
    annotations here too).

    Haven't tested this version as such, but it should be getting
    closer to merge-worthy ;)

    --
    After noticing some code in mm/filemap.c accidentally perform a __GFP_FS
    allocation when it should not have been, I thought it might be a good idea to
    try to catch this kind of thing with lockdep.

    I coded up a little idea that seems to work. Unfortunately the system has to
    actually be in __GFP_FS page reclaim, then take the lock, before it will mark
    it. But at least that might still be some orders of magnitude more common
    (and more debuggable) than an actual deadlock condition, so we have some
    improvement I hope (the concept is no less complete than discovery of a lock's
    interrupt contexts).

    I guess we could even do the same thing with __GFP_IO (normal reclaim), and
    even GFP_NOIO locks too... but filesystems will have the most locks and fiddly
    code paths, so let's start there and see how it goes.
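
    A sketch of where the annotations sit (simplified; I believe the
    hooks are lockdep_set_current_reclaim_state() and
    lockdep_trace_alloc() in the merged version, but take the names
    as assumptions):

        /* kswapd reclaims on behalf of __GFP_FS allocations, so it
         * marks itself for its whole run; every lock acquired
         * meanwhile gets the in-reclaim-W usage bit: */
        static int kswapd(void *p)
        {
                lockdep_set_current_reclaim_state(GFP_KERNEL);

                for ( ; ; ) {
                        /* ... normal reclaim work ... */
                }
        }

        /* on the allocation side, the page allocator calls
         * lockdep_trace_alloc(gfp_mask), so holding a reclaim-marked
         * lock around a __GFP_FS allocation trips the
         * inconsistent-usage report shown below. */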

    It *seems* to work. I did a quick test.

    =================================
    [ INFO: inconsistent lock state ]
    2.6.28-rc6-00007-ged31348-dirty #26
    ---------------------------------
    inconsistent {in-reclaim-W} -> {ov-reclaim-W} usage.
    modprobe/8526 [HC0[0]:SC0[0]:HE1:SE1] takes:
    (testlock){--..}, at: [] brd_init+0x55/0x216 [brd]
    {in-reclaim-W} state was registered at:
    [] __lock_acquire+0x75b/0x1a60
    [] lock_acquire+0x91/0xc0
    [] mutex_lock_nested+0xb1/0x310
    [] brd_init+0x2b/0x216 [brd]
    [] _stext+0x3b/0x170
    [] sys_init_module+0xaf/0x1e0
    [] system_call_fastpath+0x16/0x1b
    [] 0xffffffffffffffff
    irq event stamp: 3929
    hardirqs last enabled at (3929): [] mutex_lock_nested+0x285/0x310
    hardirqs last disabled at (3928): [] mutex_lock_nested+0x59/0x310
    softirqs last enabled at (3732): [] sk_filter+0x83/0xe0
    softirqs last disabled at (3730): [] sk_filter+0x16/0xe0

    other info that might help us debug this:
    1 lock held by modprobe/8526:
    #0: (testlock){--..}, at: [] brd_init+0x55/0x216 [brd]

    stack backtrace:
    Pid: 8526, comm: modprobe Not tainted 2.6.28-rc6-00007-ged31348-dirty #26
    Call Trace:
    [] print_usage_bug+0x193/0x1d0
    [] mark_lock+0xaf0/0xca0
    [] mark_held_locks+0x55/0xc0
    [] ? brd_init+0x0/0x216 [brd]
    [] trace_reclaim_fs+0x2a/0x60
    [] __alloc_pages_internal+0x475/0x580
    [] ? mutex_lock_nested+0x26e/0x310
    [] ? brd_init+0x0/0x216 [brd]
    [] brd_init+0x6a/0x216 [brd]
    [] ? brd_init+0x0/0x216 [brd]
    [] _stext+0x3b/0x170
    [] ? mutex_unlock+0x9/0x10
    [] ? __mutex_unlock_slowpath+0x10d/0x180
    [] ? trace_hardirqs_on_caller+0x12c/0x190
    [] sys_init_module+0xaf/0x1e0
    [] system_call_fastpath+0x16/0x1b

    Signed-off-by: Nick Piggin
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Nick Piggin
     

13 Aug, 2008

1 commit

  • fix:

    kernel/built-in.o: In function `lockdep_stats_show':
    lockdep_proc.c:(.text+0x3cb2f): undefined reference to `lockdep_count_forward_deps'
    kernel/built-in.o: In function `l_show':
    lockdep_proc.c:(.text+0x3d02b): undefined reference to `lockdep_count_forward_deps'
    lockdep_proc.c:(.text+0x3d047): undefined reference to `lockdep_count_backward_deps'

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

11 Aug, 2008

1 commit

  • struct held_lock {
            u64                        prev_chain_key;   /*     0     8 */
            struct lock_class *        class;            /*     8     8 */
            long unsigned int          acquire_ip;       /*    16     8 */
            struct lockdep_map *       instance;         /*    24     8 */
            int                        irq_context;      /*    32     4 */
            int                        trylock;          /*    36     4 */
            int                        read;             /*    40     4 */
            int                        check;            /*    44     4 */
            int                        hardirqs_off;     /*    48     4 */

            /* size: 56, cachelines: 1 */
            /* padding: 4 */
            /* last cacheline: 56 bytes */
    };

    struct held_lock {
            u64                        prev_chain_key;   /*     0     8 */
            long unsigned int          acquire_ip;       /*     8     8 */
            struct lockdep_map *       instance;         /*    16     8 */
            unsigned int               class_idx:11;     /* 24:21     4 */
            unsigned int               irq_context:2;    /* 24:19     4 */
            unsigned int               trylock:1;        /* 24:18     4 */
            unsigned int               read:2;           /* 24:16     4 */
            unsigned int               check:2;          /* 24:14     4 */
            unsigned int               hardirqs_off:1;   /* 24:13     4 */

            /* size: 32, cachelines: 1 */
            /* padding: 4 */
            /* bit_padding: 13 bits */
            /* last cacheline: 32 bytes */
    };
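
    With the class pointer squeezed into an 11-bit index into the
    static lock_classes[] array, the pointer is recovered through a
    small accessor; a sketch (the +1 bias, keeping 0 free to mean
    "unset", is an assumption of this sketch):

        static inline struct lock_class *hlock_class(struct held_lock *hlock)
        {
                /* class_idx stores (class - lock_classes) + 1 */
                return lock_classes + hlock->class_idx - 1;
        }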

    [mingo@elte.hu: shrunk hlock->class too]
    [peterz@infradead.org: fixup bit sizes]
    Signed-off-by: Dave Jones
    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra

    Dave Jones
     

01 Aug, 2008

1 commit

  • When we traverse the graph, either forwards or backwards, we
    are interested in whether a certain property exists somewhere
    in a node reachable in the graph.

    Therefore it is never necessary to traverse through a node more
    than once to get a correct answer to the given query.

    Take advantage of this property using a global ID counter so that we
    need not clear all the markers in all the lock_class entries before
    doing a traversal. A new ID is chosen when we start to traverse, and
    we continue through a lock_class only if its ID hasn't been marked
    with the new value yet.

    This short-circuiting is essential especially for high CPU count
    systems. The scheduler has a runqueue per cpu, and needs to take
    two runqueue locks at a time, which leads to long chains of
    backwards and forwards subgraphs from these runqueue lock nodes.
    Without the short-circuit implemented here, a graph traversal on
    a runqueue lock can take up to (1 << (N - 1)) checks on a system
    with N cpus.

    For anything more than 16 cpus or so, lockdep will eventually bring
    the machine to a complete standstill.
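
    A sketch of the generation-ID marking (helper names are
    illustrative):

        static unsigned long lockdep_dependency_gen_id;

        /* returns true if @class was already visited in the current
         * traversal; otherwise marks it and reports it as new: */
        static inline int lockdep_dependency_visited(struct lock_class *class)
        {
                if (class->dep_gen_id == lockdep_dependency_gen_id)
                        return 1;
                class->dep_gen_id = lockdep_dependency_gen_id;
                return 0;
        }

        /* starting a new traversal invalidates every old mark at
         * once -- no per-class clearing pass needed: */
        static inline void graph_traversal_start(void)
        {
                lockdep_dependency_gen_id++;
        }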

    Signed-off-by: David S. Miller
    Acked-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    David Miller
     
