27 Aug, 2008

1 commit

  • The problem was found during iwlagn driver testing on a
    v2.6.27-rc4-176-gb8e6c91 kernel, but it turns out to be a lockdep bug.
    In our testing, we frequently load and unload the iwlagn driver
    (>50 times); the MAX_STACK_TRACE_ENTRIES limit is then reached
    (expected behaviour?). The error message with the call trace is below.

    BUG: MAX_STACK_TRACE_ENTRIES too low!
    turning off the locking correctness validator.
    Pid: 4895, comm: iwlagn Not tainted 2.6.27-rc4 #13

    Call Trace:
    [] save_stack_trace+0x22/0x3e
    [] save_trace+0x8b/0x91
    [] mark_lock+0x1b0/0x8fa
    [] __lock_acquire+0x5b9/0x716
    [] ieee80211_sta_work+0x0/0x6ea [mac80211]
    [] lock_acquire+0x52/0x6b
    [] run_workqueue+0x97/0x1ed
    [] run_workqueue+0xe7/0x1ed
    [] run_workqueue+0x97/0x1ed
    [] worker_thread+0xd8/0xe3
    [] autoremove_wake_function+0x0/0x2e
    [] worker_thread+0x0/0xe3
    [] kthread+0x47/0x73
    [] trace_hardirqs_on_thunk+0x3a/0x3f
    [] child_rip+0xa/0x11
    [] restore_args+0x0/0x30
    [] finish_task_switch+0x0/0xcc
    [] kthread+0x0/0x73
    [] child_rip+0x0/0x11

    Although the above is harmless, when the iwlagn module is removed
    later, lockdep will trigger a kernel oops as below.

    BUG: unable to handle kernel NULL pointer dereference at
    0000000000000008
    IP: [] zap_class+0x24/0x82
    PGD 73128067 PUD 7448c067 PMD 0
    Oops: 0002 [1] SMP
    CPU 0
    Modules linked in: rfcomm l2cap bluetooth autofs4 sunrpc
    nf_conntrack_ipv6 xt_state nf_conntrack xt_tcpudp ip6t_ipv6header
    ip6t_REJECT ip6table_filter ip6_tables x_tables ipv6 cpufreq_ondemand
    acpi_cpufreq dm_mirror dm_log dm_multipath dm_mod snd_hda_intel sr_mod
    snd_seq_dummy snd_seq_oss snd_seq_midi_event battery snd_seq
    snd_seq_device cdrom button snd_pcm_oss snd_mixer_oss snd_pcm
    snd_timer snd_page_alloc e1000e snd_hwdep sg iTCO_wdt
    iTCO_vendor_support ac pcspkr i2c_i801 i2c_core snd soundcore video
    output ata_piix ata_generic libata sd_mod scsi_mod ext3 jbd mbcache
    uhci_hcd ohci_hcd ehci_hcd [last unloaded: mac80211]
    Pid: 4941, comm: modprobe Not tainted 2.6.27-rc4 #10
    RIP: 0010:[] []
    zap_class+0x24/0x82
    RSP: 0000:ffff88007bcb3eb0 EFLAGS: 00010046
    RAX: 0000000000068ee8 RBX: ffffffff8192a0a0 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: 0000000000001dfb RDI: ffffffff816e70b0
    RBP: ffffffffa00cd000 R08: ffffffff816818f8 R09: ffff88007c923558
    R10: ffffe20002ad2408 R11: ffffffff811028ec R12: ffffffff8192a0a0
    R13: 000000000002bd90 R14: 0000000000000000 R15: 0000000000000296
    FS: 00007f9d1cee56f0(0000) GS:ffffffff814a58c0(0000)
    knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    CR2: 0000000000000008 CR3: 0000000073047000 CR4: 00000000000006e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Process modprobe (pid: 4941, threadinfo ffff88007bcb2000, task
    ffff8800758d1fc0)
    Stack: ffffffff81057376 0000000000000000 ffffffffa00f7b00
    0000000000000000
    0000000000000080 0000000000618278 00007fff24f16720 0000000000000000
    ffffffff8105d37a ffffffffa00f7b00 ffffffff8105d591 313132303863616d
    Call Trace:
    [] ? lockdep_free_key_range+0x61/0xf5
    [] ? free_module+0xd4/0xe4
    [] ? sys_delete_module+0x1de/0x1f9
    [] ? audit_syscall_entry+0x12d/0x160
    [] ? system_call_fastpath+0x16/0x1b

    Code: b2 00 01 00 00 00 c3 31 f6 49 c7 c0 10 8a 61 81 eb 32 49 39 38
    75 26 48 98 48 6b c0 38 48 8b 90 08 8a 61 81 48 8b 88 00 8a 61 81
    89 51 08 48 89 0a 48 c7 80 08 8a 61 81 00 02 20 00 48 ff c6
    RIP [] zap_class+0x24/0x82
    RSP
    CR2: 0000000000000008
    ---[ end trace a1297e0c4abb0f2e ]---

    The root cause of this oops is in add_lock_to_list(): when
    save_trace() fails because MAX_STACK_TRACE_ENTRIES has been reached,
    entry->class is assigned but entry is never added to any lock list.
    This makes the list_del_rcu() in zap_class() oops later when the
    module is unloaded. This patch fixes the problem by assigning
    entry->class only after save_trace() returns success.
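
    The fix, as a minimal sketch (the function shape follows
    kernel/lockdep.c of that era; treat the details as illustrative):

    static int add_lock_to_list(struct lock_class *class,
                                struct lock_class *this,
                                struct list_head *head, unsigned long ip,
                                int distance)
    {
            struct lock_list *entry;

            entry = alloc_list_entry();
            if (!entry)
                    return 0;

            /*
             * Bail out *before* touching entry->class: if the stack-trace
             * pool is exhausted, this entry must stay fully inert, or
             * zap_class() will later list_del_rcu() a node that was never
             * added to any list.
             */
            if (!save_trace(&entry->trace))
                    return 0;

            entry->class = this;    /* previously assigned before save_trace() */
            entry->distance = distance;
            list_add_tail_rcu(&entry->entry, head);

            return 1;
    }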

    Signed-off-by: Zhu Yi
    Signed-off-by: Ingo Molnar

    Zhu Yi
     

26 Aug, 2008

1 commit


18 Aug, 2008

1 commit


14 Aug, 2008

1 commit


12 Aug, 2008

1 commit

  • When we enable DEBUG_LOCK_ALLOC but do not enable PROVE_LOCKING and/or
    LOCK_STAT, lock_acquire() and lock_release() turn into nops, even
    though we should still be doing hlock checking (check=1).

    This causes a false warning and a lockdep self-disable.

    Rectify this.
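
    A hedged sketch of the config dependency being fixed (the declarations
    live in include/linux/lockdep.h; the exact argument lists are
    era-dependent): the hooks must stay real whenever
    CONFIG_DEBUG_LOCK_ALLOC is set, not only under PROVE_LOCKING/LOCK_STAT.

    #if defined(CONFIG_PROVE_LOCKING) || defined(CONFIG_LOCK_STAT) || \
        defined(CONFIG_DEBUG_LOCK_ALLOC)
    extern void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
                             int trylock, int read, int check,
                             struct lockdep_map *nest_lock, unsigned long ip);
    extern void lock_release(struct lockdep_map *lock, int nested,
                             unsigned long ip);
    #else
    /* only without any of the above may the hooks compile away */
    # define lock_acquire(l, s, t, r, c, n, i)      do { } while (0)
    # define lock_release(l, n, i)                  do { } while (0)
    #endif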

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

11 Aug, 2008

4 commits

  • Solve this by marking the classes as unused and not printing information
    about the unused classes.

    Reported-by: Eric Sesterhenn
    Signed-off-by: Rabin Vincent
    Acked-by: Huang Ying
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Rabin Vincent
     
  • On Fri, 2008-08-01 at 16:26 -0700, Linus Torvalds wrote:

    > On Fri, 1 Aug 2008, David Miller wrote:
    > >
    > > Taking more than a few locks of the same class at once is bad
    > > news and it's better to find an alternative method.
    >
    > It's not always wrong.
    >
    > If you can guarantee that anybody that takes more than one lock of a
    > particular class will always take a single top-level lock _first_, then
    > that's all good. You can obviously screw up and take the same lock _twice_
    > (which will deadlock), but at least you cannot get into ABBA situations.
    >
    > So maybe the right thing to do is to just teach lockdep about "lock
    > protection locks". That would have solved the multi-queue issues for
    > networking too - all the actual network drivers would still have taken
    > just their single queue lock, but the one case that needs to take all of
    > them would have taken a separate top-level lock first.
    >
    > Never mind that the multi-queue locks were always taken in the same order:
    > it's never wrong to just have some top-level serialization, and anybody
    > who needs to take locks might as well do so, because they sure as
    > hell aren't going to be on _any_ fastpaths.
    >
    > So the simplest solution really sounds like just teaching lockdep about
    > that one special case. It's not "nesting" exactly, although it's obviously
    > related to it.

    Do as Linus suggested. The lock protection lock is called nest_lock.

    Note that we still have the MAX_LOCK_DEPTH (48) limit to consider, so
    anything that spills over that is still up shit creek.
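
    A hedged usage sketch of the resulting annotation (the API this became
    is spin_lock_nest_lock(); the structure and names below are
    illustrative, not from the patch):

    struct mq_dev {
            spinlock_t      all_lock;       /* the lock protection lock  */
            spinlock_t      queue_lock[8];  /* many locks of one class   */
    };

    static void freeze_all_queues(struct mq_dev *dev)
    {
            int i;

            spin_lock(&dev->all_lock);      /* serializes multi-lock takers */
            for (i = 0; i < 8; i++)
                    /* tells lockdep all_lock is held, so taking many
                     * locks of the same class cannot ABBA-deadlock */
                    spin_lock_nest_lock(&dev->queue_lock[i], &dev->all_lock);
    }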

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • struct held_lock {
            u64                     prev_chain_key;   /*     0     8 */
            struct lock_class *     class;            /*     8     8 */
            long unsigned int       acquire_ip;       /*    16     8 */
            struct lockdep_map *    instance;         /*    24     8 */
            int                     irq_context;      /*    32     4 */
            int                     trylock;          /*    36     4 */
            int                     read;             /*    40     4 */
            int                     check;            /*    44     4 */
            int                     hardirqs_off;     /*    48     4 */

            /* size: 56, cachelines: 1 */
            /* padding: 4 */
            /* last cacheline: 56 bytes */
    };

    struct held_lock {
            u64                     prev_chain_key;   /*     0     8 */
            long unsigned int       acquire_ip;       /*     8     8 */
            struct lockdep_map *    instance;         /*    16     8 */
            unsigned int            class_idx:11;     /*    24:21  4 */
            unsigned int            irq_context:2;    /*    24:19  4 */
            unsigned int            trylock:1;        /*    24:18  4 */
            unsigned int            read:2;           /*    24:16  4 */
            unsigned int            check:2;          /*    24:14  4 */
            unsigned int            hardirqs_off:1;   /*    24:13  4 */

            /* size: 32, cachelines: 1 */
            /* padding: 4 */
            /* bit_padding: 13 bits */
            /* last cacheline: 32 bytes */
    };

    [mingo@elte.hu: shrunk hlock->class too]
    [peterz@infradead.org: fixup bit sizes]
    Signed-off-by: Dave Jones
    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra

    Dave Jones
     
  • This can be used to reset a held lock's subclass, for arbitrary-depth
    iterated data structures such as trees or lists which have per-node
    locks.
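
    A hedged sketch of the intended use (lock_set_subclass() is the
    interface added here; the list walk itself is illustrative):
    hand-over-hand locking takes each next node at SINGLE_DEPTH_NESTING,
    then resets the held lock's subclass to 0 so the walk can continue to
    arbitrary depth.

    static void walk(struct node *node)
    {
            struct node *next;

            spin_lock(&node->lock);
            while ((next = node->next) != NULL) {
                    spin_lock_nested(&next->lock, SINGLE_DEPTH_NESTING);
                    /* next->lock is about to become the only held lock;
                     * drop its annotation depth back to subclass 0 */
                    lock_set_subclass(&next->lock.dep_map, 0, _RET_IP_);
                    spin_unlock(&node->lock);
                    node = next;
            }
            spin_unlock(&node->lock);
    }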

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

01 Aug, 2008

1 commit

  • When we traverse the graph, either forwards or backwards, we
    are interested in whether a certain property exists somewhere
    in a node reachable in the graph.

    Therefore it is never necessary to traverse through a node more
    than once to get a correct answer to the given query.

    Take advantage of this property using a global ID counter so that we
    need not clear all the markers in all the lock_class entries before
    doing a traversal. A new ID is chosen when we start to traverse, and
    we continue through a lock_class only if its ID hasn't been marked
    with the new value yet.

    This short-circuiting is essential especially for high CPU count
    systems. The scheduler has a runqueue per cpu, and needs to take
    two runqueue locks at a time, which leads to long chains of
    backwards and forwards subgraphs from these runqueue lock nodes.
    Without the short-circuit implemented here, a graph traversal on
    a runqueue lock can take up to (1 << (N - 1)) checks on a system
    with N cpus.

    For anything more than 16 cpus or so, lockdep will eventually bring
    the machine to a complete standstill.
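
    A hedged sketch of the short-circuit (the field and counter names are
    illustrative; the idea is a per-query generation stamp instead of
    clearing markers between queries):

    static unsigned int graph_gen_id;       /* bumped once per traversal */

    static int find_property(struct lock_class *class)
    {
            struct lock_list *entry;

            if (class->gen_id == graph_gen_id)
                    return 0;               /* already visited this query */
            class->gen_id = graph_gen_id;

            if (has_property(class))        /* the query's predicate */
                    return 1;

            list_for_each_entry(entry, &class->locks_after, entry)
                    if (find_property(entry->class))
                            return 1;
            return 0;
    }

    /* every query starts with: graph_gen_id++; find_property(root); */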

    Signed-off-by: David S. Miller
    Acked-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    David Miller
     

15 Jul, 2008

1 commit

  • * 'core/locking' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    lockdep: fix kernel/fork.c warning
    lockdep: fix ftrace irq tracing false positive
    lockdep: remove duplicate definition of STATIC_LOCKDEP_MAP_INIT
    lockdep: add lock_class information to lock_chain and output it
    lockdep: add lock_class information to lock_chain and output it
    lockdep: output lock_class key instead of address for forward dependency output
    __mutex_lock_common: use signal_pending_state()
    mutex-debug: check mutex magic before owner

    Fixed up conflict in kernel/fork.c manually

    Linus Torvalds
     

14 Jul, 2008

1 commit

  • fix this false positive:

    [ 0.020000] ------------[ cut here ]------------
    [ 0.020000] WARNING: at kernel/lockdep.c:2718 check_flags+0x14a/0x170()
    [ 0.020000] Modules linked in:
    [ 0.020000] Pid: 0, comm: swapper Not tainted 2.6.26-tip-00343-gd7e5521-dirty #14486
    [ 0.020000] [] warn_on_slowpath+0x54/0x80
    [ 0.020000] [] ? _spin_unlock_irqrestore+0x61/0x70
    [ 0.020000] [] ? release_console_sem+0x201/0x210
    [ 0.020000] [] ? __kernel_text_address+0x35/0x40
    [ 0.020000] [] ? dump_trace+0x5e/0x140
    [ 0.020000] [] ? __lock_acquire+0x245/0x820
    [ 0.020000] [] check_flags+0x14a/0x170
    [ 0.020000] [] ? lock_acquire+0x48/0xc0
    [ 0.020000] [] lock_acquire+0x51/0xc0
    [ 0.020000] [] ? down+0x2c/0x40
    [ 0.020000] [] ? sched_clock+0x9/0x10
    [ 0.020000] [] _write_lock+0x32/0x60
    [ 0.020000] [] ? request_resource+0x1f/0xb0
    [ 0.020000] [] request_resource+0x1f/0xb0
    [ 0.020000] [] vgacon_startup+0x2bd/0x3e0
    [ 0.020000] [] con_init+0x19/0x22f
    [ 0.020000] [] ? tty_register_ldisc+0x5c/0x70
    [ 0.020000] [] console_init+0x20/0x2e
    [ 0.020000] [] start_kernel+0x20c/0x379
    [ 0.020000] [] ? unknown_bootoption+0x0/0x1f6
    [ 0.020000] [] __init_begin+0x99/0xa1
    [ 0.020000] =======================
    [ 0.020000] ---[ end trace 4eaa2a86a8e2da22 ]---
    [ 0.020000] possible reason: unannotated irqs-on.
    [ 0.020000] irq event stamp: 0

    which occurs if CONFIG_TRACE_IRQFLAGS=y, CONFIG_DEBUG_LOCKDEP=y,
    but CONFIG_PROVE_LOCKING is disabled.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

24 Jun, 2008

1 commit


20 Jun, 2008

1 commit


24 May, 2008

4 commits

  • With the introduction of ftrace, it is possible to recurse into
    the lockdep functions via the mcount call. Updating the
    lockdep_recursion counter on grabbing the internal lockdep_lock
    prevents such possible lockups.
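
    A hedged sketch of the guard (abridged from the kernel/lockdep.c
    shape of the era): lockdep_recursion stays raised for as long as
    lockdep_lock is held, so an mcount hook firing inside lockdep bails
    out early instead of deadlocking on lockdep_lock.

    static int graph_lock(void)
    {
            __raw_spin_lock(&lockdep_lock);
            /* prevent any recursions within lockdep from causing deadlocks */
            current->lockdep_recursion++;
            return 1;
    }

    static int graph_unlock(void)
    {
            current->lockdep_recursion--;
            __raw_spin_unlock(&lockdep_lock);
            return 0;
    }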

    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Steven Rostedt
     
  • This patch removes the "notrace" annotation from lockdep and adds the
    debugging files in the kernel directory to those that should not be
    compiled with "-pg" mcount tracing.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Steven Rostedt
     
  • Add notrace annotations to lockdep to keep ftrace from causing
    recursive problems with lock tracing and debugging.
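
    For reference, the annotation is the compiler's no-instrument
    attribute (as defined in include/linux/compiler.h), which suppresses
    the mcount profiling call in a function; the helper below is
    illustrative.

    #define notrace __attribute__((no_instrument_function))

    static void notrace lockdep_internal_helper(void)
    {
            /* never entered via mcount, so it cannot recurse into ftrace */
    }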

    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Steven Rostedt
     
  • This patch adds latency tracing for critical timings
    (how long interrupts are disabled for).

    "irqsoff" is added to /debugfs/tracing/available_tracers

    Note:

    tracing_max_latency
      also holds the max latency for irqsoff (in usecs)
      (defaults to a large number, so one must explicitly start latency
      tracing)

    tracing_thresh
      threshold (in usecs) to always print out if irqs-off is detected
      to be longer than stated here. If tracing_thresh is non-zero, then
      tracing_max_latency is ignored.

    Here's an example of a trace with ftrace_enabled = 0

    =======
    preemption latency trace v1.1.5 on 2.6.24-rc7
    --------------------------------------------------------------------
    latency: 100 us, #3/3, CPU#1 | (M:rt VP:0, KP:0, SP:0 HP:0 #P:2)
    -----------------
    | task: swapper-0 (uid:0 nice:0 policy:0 rt_prio:0)
    -----------------
    => started at: _spin_lock_irqsave+0x2a/0xb7
    => ended at: _spin_unlock_irqrestore+0x32/0x5f

                      _------=> CPU#
                     / _-----=> irqs-off
                    | / _----=> need-resched
                    || / _---=> hardirq/softirq
                    ||| / _--=> preempt-depth
                    |||| /
                    |||||     delay
           cmd     pid ||||| time  |   caller
              \   /    |||||   \   |   /
    swapper-0 1d.s3 0us+: _spin_lock_irqsave+0x2a/0xb7 (e1000_update_stats+0x47/0x64c [e1000])
    swapper-0 1d.s3 100us : _spin_unlock_irqrestore+0x32/0x5f (e1000_update_stats+0x641/0x64c [e1000])
    swapper-0 1d.s3 100us : trace_hardirqs_on_caller+0x75/0x89 (_spin_unlock_irqrestore+0x32/0x5f)

    vim:ft=help
    =======

    And this is a trace with ftrace_enabled == 1

    =======
    preemption latency trace v1.1.5 on 2.6.24-rc7
    --------------------------------------------------------------------
    latency: 102 us, #12/12, CPU#1 | (M:rt VP:0, KP:0, SP:0 HP:0 #P:2)
    -----------------
    | task: swapper-0 (uid:0 nice:0 policy:0 rt_prio:0)
    -----------------
    => started at: _spin_lock_irqsave+0x2a/0xb7
    => ended at: _spin_unlock_irqrestore+0x32/0x5f

                      _------=> CPU#
                     / _-----=> irqs-off
                    | / _----=> need-resched
                    || / _---=> hardirq/softirq
                    ||| / _--=> preempt-depth
                    |||| /
                    |||||     delay
           cmd     pid ||||| time  |   caller
              \   /    |||||   \   |   /
    swapper-0 1dNs3 0us+: _spin_lock_irqsave+0x2a/0xb7 (e1000_update_stats+0x47/0x64c [e1000])
    swapper-0 1dNs3 46us : e1000_read_phy_reg+0x16/0x225 [e1000] (e1000_update_stats+0x5e2/0x64c [e1000])
    swapper-0 1dNs3 46us : e1000_swfw_sync_acquire+0x10/0x99 [e1000] (e1000_read_phy_reg+0x49/0x225 [e1000])
    swapper-0 1dNs3 46us : e1000_get_hw_eeprom_semaphore+0x12/0xa6 [e1000] (e1000_swfw_sync_acquire+0x36/0x99 [e1000])
    swapper-0 1dNs3 47us : __const_udelay+0x9/0x47 (e1000_read_phy_reg+0x116/0x225 [e1000])
    swapper-0 1dNs3 47us+: __delay+0x9/0x50 (__const_udelay+0x45/0x47)
    swapper-0 1dNs3 97us : preempt_schedule+0xc/0x84 (__delay+0x4e/0x50)
    swapper-0 1dNs3 98us : e1000_swfw_sync_release+0xc/0x55 [e1000] (e1000_read_phy_reg+0x211/0x225 [e1000])
    swapper-0 1dNs3 99us+: e1000_put_hw_eeprom_semaphore+0x9/0x35 [e1000] (e1000_swfw_sync_release+0x50/0x55 [e1000])
    swapper-0 1dNs3 101us : _spin_unlock_irqrestore+0xe/0x5f (e1000_update_stats+0x641/0x64c [e1000])
    swapper-0 1dNs3 102us : _spin_unlock_irqrestore+0x32/0x5f (e1000_update_stats+0x641/0x64c [e1000])
    swapper-0 1dNs3 102us : trace_hardirqs_on_caller+0x75/0x89 (_spin_unlock_irqrestore+0x32/0x5f)

    vim:ft=help
    =======

    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Steven Rostedt
     

26 Feb, 2008

1 commit


26 Jan, 2008

1 commit

  • This patch extends the soft-lockup detector to automatically
    detect hung TASK_UNINTERRUPTIBLE tasks. Such hung tasks are
    printed the following way:

    ------------------>
    INFO: task prctl:3042 blocked for more than 120 seconds.
    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message
    prctl D fd5e3793 0 3042 2997
    f6050f38 00000046 00000001 fd5e3793 00000009 c06d8264 c06dae80 00000286
    f6050f40 f6050f00 f7d34d90 f7d34fc8 c1e1be80 00000001 f6050000 00000000
    f7e92d00 00000286 f6050f18 c0489d1a f6050f40 00006605 00000000 c0133a5b
    Call Trace:
    [] schedule_timeout+0x6d/0x8b
    [] schedule_timeout_uninterruptible+0x15/0x17
    [] msleep+0x10/0x16
    [] sys_prctl+0x30/0x1e2
    [] sysenter_past_esp+0x5f/0xa5
    =======================
    2 locks held by prctl/3042:
    #0: (&sb->s_type->i_mutex_key#5){--..}, at: [] do_fsync+0x38/0x7a
    #1: (jbd_handle){--..}, at: [] journal_start+0xc7/0xe9
    [ …: CPU hotplug fixes. ]
    [ Andrew Morton: build warning fix. ]

    Signed-off-by: Ingo Molnar
    Signed-off-by: Arjan van de Ven

    Ingo Molnar
     

25 Jan, 2008

1 commit

  • Michael Wu noticed in his lkml post at

    http://marc.info/?l=linux-kernel&m=119396182726091&w=2

    that certain wireless drivers ended up having their name in module
    memory, which would then crash the kernel on module unload.

    The patch he proposed was a bit clumsy in that it increased the size of
    a lockdep entry significantly; the patch below tries another approach:
    it checks, on module teardown, whether the name of a class is in module
    space and then zaps the class. This is very similar to what we already
    do with keys that are in module space.
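
    A minimal sketch of that teardown check (after kernel/lockdep.c's
    lockdep_free_key_range(); locking omitted for brevity): a class is
    zapped when either its key or its name string points into the range
    being freed.

    static inline int within(const void *addr, void *start, unsigned long size)
    {
            return addr >= start && addr < start + size;
    }

    void lockdep_free_key_range(void *start, unsigned long size)
    {
            struct lock_class *class, *next;
            struct list_head *head;
            int i;

            for (i = 0; i < CLASSHASH_SIZE; i++) {
                    head = classhash_table + i;
                    list_for_each_entry_safe(class, next, head, hash_entry) {
                            if (within(class->key, start, size) ||
                                within(class->name, start, size))
                                    zap_class(class);
                    }
            }
    }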

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Ingo Molnar
    Acked-by: Peter Zijlstra
    Signed-off-by: Linus Torvalds

    Arjan van de Ven
     

16 Jan, 2008

1 commit


08 Dec, 2007

1 commit


05 Dec, 2007

2 commits

  • Torsten Kaiser wrote:

    | static inline int in_range(const void *start, const void *addr, const void *end)
    | {
    |         return addr >= start && addr <= end;
    | }
    |
    | mem_from is the first byte of the freed range, mem_to is the last
    | byte of the freed range, that fits in_range.
    | lock_from = (void *)hlock->instance;
    | -> first byte of the lock
    | lock_to = (void *)(hlock->instance + 1);
    | -> first byte of the next lock, not last byte of the lock that is being checked!
    |
    | The test is:
    | if (!in_range(mem_from, lock_from, mem_to) &&
    |     !in_range(mem_from, lock_to, mem_to))
    |         continue;
    | So it tests, if the first byte of the lock is in the range that is freed ->OK
    | And if the first byte of the *next* lock is in the range that is freed
    | -> Not OK.

    We can also simplify the in_range checks: we need only two comparisons,
    not four. If the lock is not inside the memory range, it must lie
    either entirely to the left of the range or entirely to the right.
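
    The simplified check, as a hedged sketch (this is the shape the helper
    took in kernel/lockdep.c): the lock and the freed range are disjoint
    iff the lock ends at or before the range's start, or starts at or
    after the range's end.

    static inline int not_in_range(const void *mem_from, unsigned long mem_len,
                                   const void *lock_from, unsigned long lock_len)
    {
            return lock_from + lock_len <= mem_from ||
                   mem_from + mem_len <= lock_from;
    }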

    Signed-off-by: Oleg Nesterov
    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra

    Oleg Nesterov
     
  • fix the oops that can be seen in:

    http://bugzilla.kernel.org/attachment.cgi?id=13828&action=view

    it is not safe to print the locks of running tasks.

    (even with this fix we have a small race - but this is a debug
    function after all.)

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra

    Ingo Molnar
     

29 Oct, 2007

1 commit


20 Oct, 2007

2 commits

  • The task_struct->pid member is going to be deprecated, so start
    using the helpers (task_pid_nr/task_pid_vnr/task_pid_nr_ns) in
    the kernel.

    The first thing to start with is the pid, printed to dmesg - in
    this case we may safely use task_pid_nr(). Besides, printks account
    for more (much more) than half of all the explicit pid usage.
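
    A before/after sketch of the conversion (the printk itself is
    illustrative):

    static void show_self(void)
    {
            /* before: raw field access, wrong under pid namespaces */
            printk(KERN_INFO "pid %d\n", current->pid);

            /* after: namespace-aware helper, safe for dmesg */
            printk(KERN_INFO "pid %d\n", task_pid_nr(current));
    }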

    [akpm@linux-foundation.org: git-drm went and changed lots of stuff]
    Signed-off-by: Pavel Emelyanov
    Cc: Dave Airlie
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     
  • In the following scenario:

    code path 1:
    my_function() -> lock(L1); ...; flush_workqueue(); ...

    code path 2:
    run_workqueue() -> my_work() -> ...; lock(L1); ...

    you can get a deadlock when my_work() is queued or running
    but my_function() has acquired L1 already.

    This patch adds a pseudo-lock to each workqueue to make lockdep
    warn about this scenario.
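
    A hedged sketch of the annotation (shown with the later
    lock_map_acquire()/lock_map_release() helpers; the original used raw
    lock_acquire() calls, whose argument list is era-dependent):

    struct workqueue_struct {
            /* ... */
            struct lockdep_map lockdep_map; /* the pseudo-lock */
    };

    void flush_workqueue(struct workqueue_struct *wq)
    {
            /* flushing "takes" the map, so flushing while holding L1
             * records the dependency L1 -> wq->lockdep_map */
            lock_map_acquire(&wq->lockdep_map);
            lock_map_release(&wq->lockdep_map);
            /* ... wait for all queued work to finish ... */
    }

    run_workqueue() wraps each work item in the same map, so when
    my_work() takes L1 the reverse dependency wq->lockdep_map -> L1 is
    recorded and lockdep reports the inversion.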

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Johannes Berg
    Acked-by: Oleg Nesterov
    Acked-by: Ingo Molnar
    Acked-by: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Berg
     

12 Oct, 2007

2 commits

  • Provide a check to validate that we do not hold any locks when switching
    back to user-space.
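
    A minimal sketch of the check (the hook is lockdep_sys_exit(), called
    on the return-to-userspace path; the printout is abridged):

    void lockdep_sys_exit(void)
    {
            struct task_struct *curr = current;

            if (unlikely(curr->lockdep_depth)) {
                    printk("BUG: lock held when returning to user space!\n");
                    lockdep_print_held_locks(curr);
            }
    }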

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • It is possible for the current->curr_chain_key to become inconsistent with the
    current index if the chain fails to validate. The end result is that future
    lock_acquire() operations may inadvertently fail to find a hit in the cache
    resulting in a new node being added to the graph for every acquire.

    Signed-off-by: Gregory Haskins
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Gregory Haskins
     

20 Jul, 2007

7 commits

  • When I started adding support for lockdep to 64-bit powerpc, I got a
    lockdep_init_error and with this patch was able to pinpoint why and where
    to put lockdep_init(). Let's support this generally for others adding
    lockdep support to their architecture.

    Signed-off-by: Johannes Berg
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Berg
     
  • __acquire
        |
       lock _____
        |        \
        |    __contended
        |         |
        |        wait
        |  _______/
        |/
        |
    __acquired
        |
    __release
        |
      unlock

    We measure acquisition and contention bouncing.

    This is done by recording a cpu stamp in each lock instance.

    Contention bouncing requires the cpu stamp to be set on acquisition. Hence we
    move __acquired into the generic path.

    __acquired is then used to measure acquisition bouncing by comparing the
    current cpu with the old stamp before replacing it.

    __contended is used to measure contention bouncing (only useful for
    preemptable locks).
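
    A hedged sketch of the cpu-stamp comparison (class_stats_of() and the
    counter field are hypothetical names; the stamp lives in the lock
    instance, the counters in the per-class statistics):

    static void record_acquired(struct lockdep_map *lock)
    {
            struct lock_class_stats *stats = class_stats_of(lock);
            int cpu = smp_processor_id();

            /* a different cpu than the previous holder means the lock
             * (and its cacheline) bounced between cpus */
            if (lock->cpu != cpu)
                    stats->acquire_bounces++;

            lock->cpu = cpu;        /* stamp for the next comparison */
    }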

    [akpm@linux-foundation.org: cleanups]
    Signed-off-by: Peter Zijlstra
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     
  • - update the copyright notices
    - use the default hash function
    - fix a thinko in a BUILD_BUG_ON
    - add a WARN_ON to spot inconsistent naming
    - fix a termination issue in /proc/lock_stat

    [akpm@linux-foundation.org: cleanups]
    Signed-off-by: Peter Zijlstra
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     
  • Present all this fancy new lock statistics information:

    *warning, _wide_ output ahead*

    (output edited for purpose of brevity)

    # cat /proc/lock_stat
    lock_stat version 0.1
    -----------------------------------------------------------------------------------------------------------------------------------------------------------------
    class name contentions waittime-min waittime-max waittime-total acquisitions holdtime-min holdtime-max holdtime-total
    -----------------------------------------------------------------------------------------------------------------------------------------------------------------

    &inode->i_mutex: 14458 6.57 398832.75 2469412.23 6768876 0.34 11398383.65 339410830.89
    ---------------
    &inode->i_mutex 4486 [] pipe_wait+0x86/0x8d
    &inode->i_mutex 0 [] pipe_write_fasync+0x29/0x5d
    &inode->i_mutex 0 [] pipe_read+0x74/0x3a5
    &inode->i_mutex 0 [] do_lookup+0x81/0x1ae

    .................................................................................................................................................................

    &inode->i_data.tree_lock-W: 491 0.27 62.47 493.89 2477833 0.39 468.89 1146584.25
    &inode->i_data.tree_lock-R: 65 0.44 4.27 48.78 26288792 0.36 184.62 10197458.24
    --------------------------
    &inode->i_data.tree_lock 46 [] __do_page_cache_readahead+0x69/0x24f
    &inode->i_data.tree_lock 31 [] add_to_page_cache+0x31/0xba
    &inode->i_data.tree_lock 0 [] __do_page_cache_readahead+0xc2/0x24f
    &inode->i_data.tree_lock 0 [] find_get_page+0x1a/0x58

    .................................................................................................................................................................

    proc_inum_idr.lock: 0 0.00 0.00 0.00 36 0.00 65.60 148.26
    proc_subdir_lock: 0 0.00 0.00 0.00 3049859 0.00 106.81 1563212.42
    shrinker_rwsem-W: 0 0.00 0.00 0.00 5 0.00 1.73 3.68
    shrinker_rwsem-R: 0 0.00 0.00 0.00 633 2.57 246.57 10909.76

    'contentions' and 'acquisitions' are the number of such events measured (since
    the last reset). The waittime- and holdtime- (min, max, total) numbers are
    presented in microseconds.

    If there are any contention points, the lock class is presented in the block
    format (as i_mutex and tree_lock above), otherwise a single line of output is
    presented.

    The output is sorted on the absolute number of contentions (read +
    write); this should present the worst offenders first, so that:

    # grep : /proc/lock_stat | head

    will quickly show who's bad.

    The stats can be reset using:

    # echo 0 > /proc/lock_stat

    [bunk@stusta.de: make 2 functions static]
    [akpm@linux-foundation.org: fix printk warning]
    Signed-off-by: Peter Zijlstra
    Acked-by: Ingo Molnar
    Acked-by: Jason Baron
    Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     
  • Introduce the core lock statistics code.

    Lock statistics provides lock wait-time and hold-time (as well as the
    count of corresponding contention and acquisition events). Also, the
    first few call-sites that encounter contention are tracked.

    Lock wait-time is the time spent waiting on the lock. This provides
    insight into the locking scheme: a heavily contended lock is
    indicative of too coarse a locking scheme.

    Lock hold-time is the duration the lock was held; this provides a
    reference for the wait-time numbers, so they can be put into
    perspective.

    1)
      lock
    2)
      ... do stuff ..
      unlock
    3)

    The time between 1 and 2 is the wait-time. The time between 2 and 3 is the
    hold-time.

    The lockdep held-lock tracking code is reused, because it already collects locks
    into meaningful groups (classes), and because it is an existing infrastructure
    for lock instrumentation.

    Currently lockdep tracks lock acquisition with two hooks:

    lock()
      lock_acquire()
      _lock()

    ... code protected by lock ...

    unlock()
      lock_release()
      _unlock()

    We need to extend this with two more hooks, in order to measure contention.

    lock_contended() - used to measure contention events
    lock_acquired() - completion of the contention

    These are then placed the following way:

    lock()
      lock_acquire()
      if (!_try_lock())
        lock_contended()
      _lock()
      lock_acquired()

    ... do locked stuff ...

    unlock()
      lock_release()
      _unlock()

    (Note: the try_lock() 'trick' is used to avoid instrumenting all platform
    dependent lock primitive implementations.)

    It is also possible to toggle the two lockdep features at runtime using:

    /proc/sys/kernel/prove_locking
    /proc/sys/kernel/lock_stat

    (esp. turning off the O(n^2) prove_locking functionality can help)

    [akpm@linux-foundation.org: build fixes]
    [akpm@linux-foundation.org: nuke unneeded ifdefs]
    Signed-off-by: Peter Zijlstra
    Acked-by: Ingo Molnar
    Acked-by: Jason Baron
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     
  • Move code around to get fewer but larger #ifdef sections. Break some
    in-function #ifdefs out into their own functions.

    Signed-off-by: Peter Zijlstra
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     
  • Ensure that all of the lock dependency tracking code is under
    CONFIG_PROVE_LOCKING. This allows us to use the held lock tracking code for
    other purposes.

    Signed-off-by: Peter Zijlstra
    Acked-by: Ingo Molnar
    Acked-by: Jason Baron
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     

18 Jul, 2007

1 commit

  • KSYM_NAME_LEN is peculiar in that it does not include the space for the
    trailing '\0', forcing all users to use KSYM_NAME_LEN + 1 when
    allocating a buffer. This is nonsense and error-prone. Moreover, when
    the caller forgets this, the mistake is very likely to subtly bite back
    by corrupting the stack, because the last position of the buffer is
    always cleared to zero.

    This patch increments KSYM_NAME_LEN by one and updates code
    accordingly (a before/after sketch follows the list below).

    * off-by-one bug in asm-powerpc/kprobes.h::kprobe_lookup_name() macro
    is fixed.

    * Where MODULE_NAME_LEN and KSYM_NAME_LEN were used together,
    MODULE_NAME_LEN was treated as if it didn't include space for the
    trailing '\0'. Fix it.
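
    A before/after sketch of the convention change (lookup_symbol_name()
    is used illustratively):

    static void print_symbol_at(unsigned long addr)
    {
            /*
             * after the patch the natural declaration is also the correct
             * one: KSYM_NAME_LEN includes room for the trailing '\0'
             * (before, callers had to write KSYM_NAME_LEN + 1)
             */
            char name[KSYM_NAME_LEN];

            if (!lookup_symbol_name(addr, name))
                    printk(KERN_INFO "%s\n", name);
    }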

    Signed-off-by: Tejun Heo
    Acked-by: Paulo Marques
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     

09 May, 2007

2 commits