03 Dec, 2009

2 commits

  • Enable a fourth level of rcu_node hierarchy for TREE_RCU and
    TREE_PREEMPT_RCU. This is for stress-testing and experimental
    purposes only, although in theory it would enable 16,777,216
    CPUs on 64-bit systems and 1,048,576 CPUs on 32-bit systems.
    Typical experimental use of this fourth level will set
    CONFIG_RCU_FANOUT=2, requiring a 16-CPU system, though the more
    adventurous (and more fortunate) experimenters may wish to choose
    CONFIG_RCU_FANOUT=3 for 81-CPU systems or even CONFIG_RCU_FANOUT=4
    for 256-CPU systems. (The arithmetic behind these limits is
    sketched after this entry.)

    Signed-off-by: Paul E. McKenney
    Acked-by: Josh Triplett
    Acked-by: Lai Jiangshan
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
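
    A back-of-envelope check of the CPU limits cited above; this is a
    standalone sketch, not kernel code. With four levels of rcu_node
    hierarchy and CONFIG_RCU_FANOUT=F, at most F^4 CPUs are supported:

        #include <stdio.h>

        static unsigned long max_cpus(unsigned long fanout, int levels)
        {
                unsigned long n = 1;

                while (levels-- > 0)
                        n *= fanout;
                return n;
        }

        int main(void)
        {
                printf("%lu\n", max_cpus(64, 4)); /* 16777216: 64-bit default fanout */
                printf("%lu\n", max_cpus(32, 4)); /* 1048576: 32-bit default fanout */
                printf("%lu\n", max_cpus(2, 4));  /* 16 CPUs: CONFIG_RCU_FANOUT=2 */
                printf("%lu\n", max_cpus(3, 4));  /* 81 CPUs: CONFIG_RCU_FANOUT=3 */
                printf("%lu\n", max_cpus(4, 4));  /* 256 CPUs: CONFIG_RCU_FANOUT=4 */
                return 0;
        }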
     
  • The number of "quiet" functions has grown recently, and the
    names are no longer very descriptive. The point of all of these
    functions is to do some portion of the task of reporting a
    quiescent state, so rename them accordingly:

    o cpu_quiet() becomes rcu_report_qs_rdp(), which reports a
    quiescent state to the per-CPU rcu_data structure. If this
    turns out to be a new quiescent state for this grace period,
    then rcu_report_qs_rnp() will be invoked to propagate the
    quiescent state up the rcu_node hierarchy.

    o cpu_quiet_msk() becomes rcu_report_qs_rnp(), which reports
    a quiescent state for a given CPU (or possibly a set of CPUs)
    up the rcu_node hierarchy.

    o cpu_quiet_msk_finish() becomes rcu_report_qs_rsp(), which
    reports a full set of quiescent states to the global rcu_state
    structure.

    o task_quiet() becomes rcu_report_unblock_qs_rnp(), which reports
    a quiescent state due to a task exiting an RCU read-side critical
    section that had previously blocked in that same critical section.
    As indicated by the new name, this type of quiescent state is
    reported up the rcu_node hierarchy (using rcu_report_qs_rnp()
    to do so). (A standalone toy model of this reporting chain
    follows this entry.)

    Signed-off-by: Paul E. McKenney
    Acked-by: Josh Triplett
    Acked-by: Lai Jiangshan
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
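
    A standalone toy model of the reporting chain described above,
    under simplified assumptions (two levels, a single flavor, no
    locking); this is not the kernel code:

        #include <stdio.h>

        struct rnp { unsigned long qsmask; struct rnp *parent; };
        struct rsp { int gp_done; };

        /* was cpu_quiet_msk_finish(): full set of quiescent states. */
        static void rcu_report_qs_rsp(struct rsp *rsp)
        {
                rsp->gp_done = 1;
                printf("grace period complete\n");
        }

        /* was cpu_quiet_msk(): clear bits, propagating upward. */
        static void rcu_report_qs_rnp(unsigned long mask, struct rsp *rsp,
                                      struct rnp *rnp)
        {
                for (; rnp; rnp = rnp->parent) {
                        rnp->qsmask &= ~mask;
                        if (rnp->qsmask)
                                return;   /* still waiting on other CPUs */
                        mask = 1;         /* this node's bit in its parent */
                }
                rcu_report_qs_rsp(rsp);   /* the root went quiet */
        }

        /* was cpu_quiet(): report a per-CPU quiescent state. */
        static void rcu_report_qs_rdp(int cpu, struct rsp *rsp,
                                      struct rnp *leaf)
        {
                rcu_report_qs_rnp(1UL << cpu, rsp, leaf);
        }

        int main(void)
        {
                struct rnp root = { .qsmask = 0x1 };
                struct rnp leaf = { .qsmask = 0x3, .parent = &root };
                struct rsp state = { 0 };

                rcu_report_qs_rdp(0, &state, &leaf); /* root still waits */
                rcu_report_qs_rdp(1, &state, &leaf); /* GP completes */
                return 0;
        }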
     

23 Nov, 2009

3 commits

  • Remove #ifdefs from kernel/rcupdate.c and
    include/linux/rcupdate.h by moving code to
    include/linux/rcutiny.h, include/linux/rcutree.h, and
    kernel/rcutree.c.

    Also remove some definitions that are no longer used.

    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     
  • The rcu_init() function is a wrapper for __rcu_init() that also
    sets up the CPU-hotplug notifier for rcu_barrier_cpu_hotplug().
    But TINY_RCU doesn't need CPU-hotplug notification, and
    rcu_barrier_cpu_hotplug() is a simple wrapper for
    rcu_cpu_notify().

    So push rcu_init() out to kernel/rcutree.c and kernel/rcutiny.c
    and get rid of the wrapper function rcu_barrier_cpu_hotplug().

    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     
  • When the last CPU of a given leaf rcu_node structure goes
    offline, all of the tasks queued on that leaf rcu_node structure
    (due to having blocked in their current RCU read-side critical
    sections) are requeued onto the root rcu_node structure. This
    requeuing is carried out by rcu_preempt_offline_tasks().
    However, it is possible that these queued tasks are the only
    thing preventing the leaf rcu_node structure from reporting a
    quiescent state up the rcu_node hierarchy. Unfortunately, the
    old code would fail to do this reporting, resulting in a
    grace-period stall given the following sequence of events:

    1. Kernel built for more than 32 CPUs on 32-bit systems or for more
    than 64 CPUs on 64-bit systems, so that there is more than one
    rcu_node structure. (Or CONFIG_RCU_FANOUT is artificially set
    to a number smaller than CONFIG_NR_CPUS.)

    2. The kernel is built with CONFIG_TREE_PREEMPT_RCU.

    3. A task running on a CPU associated with a given leaf rcu_node
    structure blocks while in an RCU read-side critical section
    -and- that CPU has not yet passed through a quiescent state
    for the current RCU grace period. This will cause the task
    to be queued on the leaf rcu_node's blocked_tasks[] array, in
    particular, on the element of this array corresponding to the
    current grace period.

    4. Each of the remaining CPUs corresponding to this same leaf rcu_node
    structure passes through a quiescent state. However, the task is
    still in its RCU read-side critical section, so these quiescent
    states cannot be reported further up the rcu_node hierarchy.
    Nevertheless, all bits in the leaf rcu_node structure's ->qsmask
    field are now zero.

    5. Each of the remaining CPUs goes offline. (The events in steps
    #4 and #5 can happen in any order, as long as each CPU passes
    through a quiescent state before going offline.)

    6. When the last CPU goes offline, __rcu_offline_cpu() will invoke
    rcu_preempt_offline_tasks(), which will move the task to the
    root rcu_node structure, but without reporting a quiescent state
    up the rcu_node hierarchy (and this failure to report a quiescent
    state is the bug).

    But because this leaf rcu_node structure's ->qsmask field is
    already zero and its ->blocked_tasks[] entries are all empty,
    force_quiescent_state() will skip this rcu_node structure.

    Therefore, grace periods are now hung.

    This patch abstracts some code out of rcu_read_unlock_special(),
    calling the result task_quiet() by analogy with cpu_quiet(), and
    invokes task_quiet() from both rcu_read_unlock_special() and
    __rcu_offline_cpu(). Invoking task_quiet() from
    __rcu_offline_cpu() reports the quiescent state up the rcu_node
    hierarchy, fixing the bug. This ends up requiring a separate
    lock_class_key per level of the rcu_node hierarchy, which this
    patch also provides. (A toy model of the fix follows this entry.)

    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
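
    A standalone toy model of the bug and fix described above (not the
    kernel code): a leaf whose ->qsmask is already zero can still block
    the root via its bit, so requeuing the blocked task must be paired
    with reporting the leaf's quiescent state upward:

        #include <stdio.h>

        struct rnp {
                unsigned long qsmask;
                int blocked_tasks;
                struct rnp *parent;
        };

        /* The GP can end only when every node has an empty ->qsmask
         * and no queued blocked tasks. */
        static void rcu_report_up(struct rnp *rnp, unsigned long mask)
        {
                for (; rnp; rnp = rnp->parent) {
                        rnp->qsmask &= ~mask;
                        if (rnp->qsmask || rnp->blocked_tasks)
                                return;  /* still waiting on something */
                        mask = 1;        /* this node's bit in its parent */
                }
                printf("grace period can end\n");
        }

        /* task_quiet(): a blocked reader leaves its critical section. */
        static void task_quiet(struct rnp *rnp)
        {
                if (--rnp->blocked_tasks == 0 && rnp->qsmask == 0)
                        rcu_report_up(rnp->parent, 1);
        }

        int main(void)
        {
                struct rnp root = { .qsmask = 0x1 }; /* waits on the leaf */
                struct rnp leaf = { .qsmask = 0, .blocked_tasks = 1,
                                    .parent = &root };

                /* Last CPU of the leaf goes offline: the blocked task
                 * moves to the root... */
                root.blocked_tasks++;
                leaf.blocked_tasks--;
                /* ...and, crucially (the fix), the leaf's quiescent
                 * state is reported upward; omitting this call is the
                 * bug, leaving root.qsmask nonzero forever. */
                rcu_report_up(&root, 0x1);

                /* The task eventually exits its critical section: */
                task_quiet(&root);  /* prints "grace period can end" */
                return 0;
        }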
     

14 Nov, 2009

2 commits

  • Now that there are both ->gpnum and ->completed fields in the
    rcu_node structure, __rcu_pending() should check rdp->gpnum and
    rdp->completed against rnp->gpnum and rnp->completed, respectively,
    instead of the prior comparison against the rcu_state fields
    rsp->gpnum and rsp->completed.

    Given the old comparison, __rcu_pending() could return 1, resulting
    in a needless raise_softirq(RCU_SOFTIRQ). This useless work would
    happen if RCU responded to a scheduling-clock interrupt after the
    rcu_state fields had been updated, but before the rcu_node fields
    had been updated.

    Changing the comparison from the rcu_state fields to the rcu_node
    fields prevents this useless work from happening. (A toy
    illustration follows this entry.)

    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
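
    A toy illustration of the new check (not kernel code): compare the
    per-CPU snapshots against the leaf rcu_node values, so a
    half-propagated update in rcu_state no longer triggers a needless
    softirq:

        #include <stdio.h>

        struct rnp { long gpnum, completed; };
        struct rdp { long gpnum, completed; };

        /* Pending work exists if this CPU has not yet noticed a new
         * grace period, or has not processed the end of the old one. */
        static int rcu_pending_toy(const struct rdp *rdp,
                                   const struct rnp *rnp)
        {
                return rdp->gpnum != rnp->gpnum ||
                       rdp->completed != rnp->completed;
        }

        int main(void)
        {
                struct rnp rnp = { .gpnum = 5, .completed = 4 };
                struct rdp rdp = { .gpnum = 5, .completed = 4 };

                /* rcu_state may already show gpnum == 6, but until that
                 * update reaches the leaf rcu_node, this CPU has
                 * nothing to do: */
                printf("%d\n", rcu_pending_toy(&rdp, &rnp)); /* 0 */
                return 0;
        }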
     
  • Now that a copy of the rsp->completed field is available in all
    rcu_node structures, make full use of it. It is still
    legitimate to access rsp->completed while holding the root
    rcu_node structure's lock, however.

    Also, tighten up force_quiescent_state()'s checks for the end of
    the current grace period.

    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     

13 Nov, 2009

2 commits

  • The force_quiescent_state() function also took a snapshot
    of the ->completed field, which was as obnoxious as it was in
    rcu_sched_qs() and friends. So snapshot ->gpnum-1 instead.

    Also, since the dyntick_record_completed() and
    dyntick_recall_completed() functions are now simple assignments
    that are independent of CONFIG_NO_HZ, and since their names are
    now misleading, get rid of them.

    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     
  • An earlier fix for a race resulted in a situation where the CPUs
    other than the CPU that detected the end of the grace period would
    not process their callbacks until the next grace period started.

    This means that these other CPUs would unnecessarily demand that an
    extra grace period be started.

    This patch eliminates this extra grace period and speeds callback
    processing by propagating rsp->completed to the rcu_node structures
    in the case where the CPU detecting the end of the grace period
    sees no reason to start a new grace period.

    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     

12 Nov, 2009

1 commit

  • Because rcu_bootup_announce() is used only at boot time, mark it
    as __init, presumably so that its memory can be reclaimed after
    boot. (An illustration of __init follows this entry.)

    Suggested-by: Joe Perches
    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
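
    An illustration of the idea: in the kernel, __init places the
    function's text in a section that is freed once boot completes.
    Stubbed here with the underlying GCC section attribute so the
    snippet builds standalone:

        #include <stdio.h>

        /* Stand-in for the kernel's __init marker. */
        #define __init __attribute__((__section__(".init.text")))

        static void __init rcu_bootup_announce_toy(void)
        {
                printf("Hierarchical RCU implementation.\n");
        }

        int main(void)
        {
                rcu_bootup_announce_toy(); /* boot-time-only caller */
                return 0;
        }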
     

11 Nov, 2009

4 commits

  • The rdp->passed_quiesc_completed fields are used to properly
    associate the recorded quiescent state with a grace period. It
    is OK to wrongly associate a given quiescent state with a
    preceding grace period, but it is fatal to associate a given
    quiescent state with a grace period that begins after the
    quiescent state occurred. Grace periods are numbered, and the
    following fields track them:

    o ->gpnum is the number of the grace period currently in
    progress, or the number of the last grace period to
    complete if no grace period is currently in progress.

    o ->completed is the number of the last grace period to
    have completed.

    These two fields are equal if there is no grace period in
    progress; otherwise ->gpnum is one greater than ->completed.
    But the rdp->passed_quiesc_completed field is compared against
    ->completed, and if equal, the quiescent state is presumed to
    count against the current grace period.

    The earlier code copied rdp->completed to
    rdp->passed_quiesc_completed, which has been made to work, but
    is error-prone. In contrast, copying one less than rdp->gpnum
    is guaranteed safe, because rdp->gpnum is not incremented until
    after the start of the corresponding grace period. At the end of
    the grace period, once ->completed has been incremented, any
    quiescent states recorded previously will be discarded. (A toy
    version of this rule follows this entry.)

    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
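
    A toy version of the rule described above (not kernel code):
    record the quiescent state against rdp->gpnum - 1, and count it
    only if that value still equals ->completed when it is applied:

        #include <stdio.h>

        struct rdp { long gpnum, passed_quiesc_completed; };

        static void record_qs(struct rdp *rdp)
        {
                /* Safe: ->gpnum is not incremented until after the
                 * corresponding grace period starts, so gpnum - 1 can
                 * never name a grace period that begins later. */
                rdp->passed_quiesc_completed = rdp->gpnum - 1;
        }

        static int qs_counts(const struct rdp *rdp, long completed)
        {
                return rdp->passed_quiesc_completed == completed;
        }

        int main(void)
        {
                struct rdp rdp = { .gpnum = 7 };

                record_qs(&rdp);
                printf("%d\n", qs_counts(&rdp, 6)); /* 1: counts for GP 7 */
                printf("%d\n", qs_counts(&rdp, 7)); /* 0: GP 7 over, discard */
                return 0;
        }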
     
  • This field is used whether or not CONFIG_NO_HZ is set, so the
    old name of ->dynticks_completed is quite misleading.

    Change it to ->completed_fqs, given that it is the value that
    force_quiescent_state() is trying to drive the ->completed field
    away from.

    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     
  • This patch adds a counter increment to enable tasks to actually
    take the synchronize_sched_expedited() function's fastpath.

    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     
  • Some variants of gcc are reputed to dislike forward references
    to functions declared "inline". Remove the "inline" keyword
    from such functions. (A minimal illustration follows this entry.)

    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
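
    A minimal illustration of the pattern (the names are made up):
    forward-declare without "inline" and define the function after its
    first use, letting the compiler decide about inlining:

        #include <stdio.h>

        static void report_state(int qs); /* forward reference, no "inline" */

        static void do_work(void)
        {
                report_state(1);
        }

        static void report_state(int qs) /* defined after first use */
        {
                printf("qs=%d\n", qs);
        }

        int main(void)
        {
                do_work();
                return 0;
        }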
     

10 Nov, 2009

4 commits

  • Impose a clear locking design on the note_new_gpnum()
    function's use of the ->gpnum counter. This is done by updating
    rdp->gpnum only from the corresponding leaf rcu_node structure's
    rnp->gpnum field, and even then only under the protection of
    that same rcu_node structure's ->lock field. Performance and
    scalability are maintained using a form of double-checked
    locking, and excessive spinning is avoided by use of the
    spin_trylock() function. The use of spin_trylock() is safe
    because CPUs that fail to acquire this lock will simply try
    again later. The hierarchical nature of the rcu_node data
    structure limits contention (which could be limited further if
    need be using the RCU_FANOUT kernel parameter).

    Without this patch, obscure but quite possible races could
    result in a quiescent state that occurred during one grace
    period being counted against the following grace period, causing
    that following grace period to end prematurely. Not good! (A
    userspace sketch of the locking pattern follows this entry.)

    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    Cc: # .32.x
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
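
    A userspace sketch (pthreads; not the kernel code) of the pattern
    described above: an unlocked fast-path check, a trylock so a
    contended caller simply retries on a later pass, and a re-check
    under the lock:

        #include <pthread.h>
        #include <stdio.h>

        struct rnp { pthread_mutex_t lock; long gpnum; };
        struct rdp { long gpnum; };

        static void note_new_gpnum_toy(struct rdp *rdp, struct rnp *rnp)
        {
                if (rdp->gpnum == rnp->gpnum)
                        return;                 /* fast path: nothing new */
                if (pthread_mutex_trylock(&rnp->lock) != 0)
                        return;                 /* contended: retry later */
                if (rdp->gpnum != rnp->gpnum)   /* re-check under lock */
                        rdp->gpnum = rnp->gpnum;
                pthread_mutex_unlock(&rnp->lock);
        }

        int main(void)
        {
                struct rnp rnp = { PTHREAD_MUTEX_INITIALIZER, 3 };
                struct rdp rdp = { 2 };

                note_new_gpnum_toy(&rdp, &rnp);
                printf("%ld\n", rdp.gpnum);     /* 3 */
                return 0;
        }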
     
  • Impose a clear locking design on the rcu_process_gp_end()
    function's use of the ->completed counter. This is done by
    creating a ->completed field in the rcu_node structure, which
    can safely be accessed under the protection of that structure's
    lock. Performance and scalability are maintained by using a
    form of double-checked locking, so that rcu_process_gp_end()
    only acquires the leaf rcu_node structure's ->lock if a grace
    period has recently ended.

    This fix reduces rcutorture failure rate by at least two orders
    of magnitude under heavy stress with force_quiescent_state()
    being invoked artificially often. Without this fix,
    unsynchronized access to the ->completed field can cause
    rcu_process_gp_end() to advance callbacks whose grace period has
    not yet expired. (Bad idea!)

    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    Cc: # .32.x
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     
  • Impose a clear locking design on non-NO_HZ handling of the
    ->completed counter. This increases the distance between the
    RCU and the CPU-hotplug mechanisms.

    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    Cc: # .32.x
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     
  • Merge reason: Pick up RCU fixlet to base further commits on.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

02 Nov, 2009

3 commits

  • Currently, rcu_irq_exit() is invoked only for CONFIG_NO_HZ,
    while rcu_irq_enter() is invoked unconditionally. This patch
    moves rcu_irq_exit() out from under CONFIG_NO_HZ so that the
    calls are balanced.

    This patch has no effect on the behavior of the kernel because
    both rcu_irq_enter() and rcu_irq_exit() are empty for
    !CONFIG_NO_HZ, but the code is easier to understand if the calls
    are obviously balanced in all cases. (A stub sketch follows this
    entry.)

    Signed-off-by: Lai Jiangshan
    Signed-off-by: Paul E. McKenney
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Lai Jiangshan
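
    A stub sketch of the balance described above (not the kernel
    code): both hooks compile to no-ops for !CONFIG_NO_HZ, but the
    call sites are symmetric either way:

        #include <stdio.h>

        #ifdef CONFIG_NO_HZ
        static void rcu_irq_enter(void) { /* track dyntick nesting */ }
        static void rcu_irq_exit(void)  { /* leave dyntick bookkeeping */ }
        #else
        static void rcu_irq_enter(void) { }
        static void rcu_irq_exit(void)  { }
        #endif

        static void irq_enter_toy(void)
        {
                rcu_irq_enter();        /* always called */
                puts("in irq");
        }

        static void irq_exit_toy(void)
        {
                puts("leaving irq");
                rcu_irq_exit();         /* always called, balancing enter */
        }

        int main(void)
        {
                irq_enter_toy();
                irq_exit_toy();
                return 0;
        }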
     
  • Very long RCU read-side critical sections (50 milliseconds or
    so) can cause a race between force_quiescent_state() and
    rcu_start_gp() as follows on kernel builds with multi-level
    rcu_node hierarchies:

    1. CPU 0 calls force_quiescent_state(), sees that there is a
    grace period in progress, and acquires ->fqslock.

    2. CPU 1 detects the end of the grace period, and so
    cpu_quiet_msk_finish() sets rsp->completed to rsp->gpnum.
    This operation is carried out under the root rnp->lock,
    but CPU 0 has not yet acquired that lock. Note that
    rsp->signaled is still RCU_SAVE_DYNTICK from the last
    grace period.

    3. CPU 1 calls rcu_start_gp(), but no one wants a new grace
    period, so it drops the root rnp->lock and returns.

    4. CPU 0 acquires the root rnp->lock and picks up rsp->completed
    and rsp->signaled, then drops rnp->lock. It then enters the
    RCU_SAVE_DYNTICK leg of the switch statement.

    5. CPU 2 invokes call_rcu(), and now needs a new grace period.
    It calls rcu_start_gp(), which acquires the root rnp->lock, sets
    rsp->signaled to RCU_GP_INIT (too bad that CPU 0 is already in
    the RCU_SAVE_DYNTICK leg of the switch statement!) and starts
    initializing the rcu_node hierarchy. If there are multiple
    levels to the hierarchy, it will drop the root rnp->lock and
    initialize the lower levels of the hierarchy.

    6. CPU 0 notes that rsp->completed has not changed, which permits
    both CPU 2 and CPU 0 to try updating it concurrently. If CPU 0's
    update prevails, later calls to force_quiescent_state() can
    count old quiescent states against the new grace period, which
    can in turn result in premature ending of grace periods.

    Not good.

    This patch adds an RCU_GP_IDLE state for rsp->signaled that is
    set initially at boot time and any time a grace period ends.
    This prevents CPU 0 from getting into the workings of
    force_quiescent_state() in step 4. Additional locking and
    checks prevent the concurrent update of rsp->signaled in step 6.
    (A toy sketch of the new state machine follows this entry.)

    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
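
    A toy sketch of the added state (not the kernel code): with no
    grace period in progress, ->signaled sits in RCU_GP_IDLE, so a
    late-arriving force_quiescent_state() bails out instead of
    replaying a stale RCU_SAVE_DYNTICK pass against the next grace
    period:

        #include <stdio.h>

        enum signaled_state {
                RCU_GP_IDLE,      /* no GP in progress: new state */
                RCU_GP_INIT,      /* GP being initialized */
                RCU_SAVE_DYNTICK, /* need to scan dyntick state */
                RCU_FORCE_QS,     /* need to force quiescent states */
        };

        /* Set to RCU_GP_IDLE at boot and whenever a GP ends. */
        static enum signaled_state signaled = RCU_GP_IDLE;

        static void force_quiescent_state_toy(void)
        {
                switch (signaled) {
                case RCU_GP_IDLE:
                case RCU_GP_INIT:
                        return;   /* nothing to force yet */
                case RCU_SAVE_DYNTICK:
                        puts("scanning dyntick-idle CPUs");
                        break;
                case RCU_FORCE_QS:
                        puts("forcing quiescent states");
                        break;
                }
        }

        int main(void)
        {
                force_quiescent_state_toy(); /* idle: no output */
                signaled = RCU_SAVE_DYNTICK;
                force_quiescent_state_toy(); /* now does real work */
                return 0;
        }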
     
  • Ingo triggered the following warning:

    WARNING: at lib/debugobjects.c:255 debug_print_object+0x42/0x50()
    Hardware name: System Product Name
    ODEBUG: init active object type: timer_list
    Modules linked in:
    Pid: 2619, comm: dmesg Tainted: G W 2.6.32-rc5-tip+ #5298
    Call Trace:
    [] warn_slowpath_common+0x6a/0x81
    [] ? debug_print_object+0x42/0x50
    [] warn_slowpath_fmt+0x29/0x2c
    [] debug_print_object+0x42/0x50
    [] __debug_object_init+0x279/0x2d7
    [] debug_object_init+0x13/0x18
    [] init_timer_key+0x17/0x6f
    [] free_uid+0x50/0x6c
    [] put_cred_rcu+0x61/0x72
    [] rcu_do_batch+0x70/0x121

    debugobjects warns about an enqueued timer being initialized. If
    CONFIG_USER_SCHED=y the user management code uses delayed work to
    remove the user from the hash table and tear down the sysfs objects.

    free_uid is called from RCU and initializes/schedules delayed work if
    the usage count of the user_struct is 0. The init/schedule happens
    outside of the uidhash_lock protected region which allows a concurrent
    caller of find_user() to reference the about to be destroyed
    user_struct w/o preventing the work from being scheduled. If the next
    free_uid call happens before the work timer expired then the active
    timer is initialized and the work scheduled again.

    The race was introduced in commit 5cb350ba (sched: group scheduling,
    sysfs tunables) and made more prominent by commit 3959214f (sched:
    delayed cleanup of user_struct).

    Move the init/schedule_delayed_work inside of the uidhash_lock
    protected region to prevent the race.

    Signed-off-by: Thomas Gleixner
    Acked-by: Dhaval Giani
    Cc: Paul E. McKenney
    Cc: Kay Sievers
    Cc: stable@kernel.org

    Thomas Gleixner
     

29 Oct, 2009

1 commit

  • The requeue_pi path doesn't use unqueue_me() (and the racy lock_ptr ==
    NULL test), nor does it use the wake_list of futex_wake(), which were
    the reasons for commit 41890f2 (futex: Handle spurious wake up).

    See the debugging discussion on LKML, Message-ID:

    The changes in this fix to the wait_requeue_pi path were considered
    a likely unnecessary, but harmless, safety net. But it turns out that
    due to the fact that for unknown $@#!*( reasons EWOULDBLOCK is defined
    as EAGAIN we built an endless loop in the code path which returns
    correctly EWOULDBLOCK.

    Spurious wakeups in the wait_requeue_pi code path are unlikely, so we do
    the easy solution and return EWOULDBLOCK^WEAGAIN to user space and let
    it deal with the spurious wakeup.

    Cc: Darren Hart
    Cc: Peter Zijlstra
    Cc: Eric Dumazet
    Cc: John Stultz
    Cc: Dinakar Guniguntala
    LKML-Reference:
    Cc: stable@kernel.org
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

27 Oct, 2009

2 commits

  • Some compilers are happy with "#elif CONFIG_TINY_RCU", while
    others strongly prefer "#elif defined(CONFIG_TINY_RCU)". Change
    to the latter to make more compilers happy. (The portable
    spelling is demonstrated after this entry.)

    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
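
    A minimal demonstration of the portable spelling (the RCU_IMPL
    macro is made up; build with e.g. cc -DCONFIG_TINY_RCU=1):

        #include <stdio.h>

        #if defined(CONFIG_TREE_RCU) || defined(CONFIG_TREE_PREEMPT_RCU)
        #define RCU_IMPL "tree"
        #elif defined(CONFIG_TINY_RCU)   /* not: #elif CONFIG_TINY_RCU */
        #define RCU_IMPL "tiny"
        #else
        #define RCU_IMPL "none"
        #endif

        int main(void)
        {
                printf("%s\n", RCU_IMPL);
                return 0;
        }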
     
  • Use lockdep_set_class() to simplify the code and to avoid any
    additional overhead in the !LOCKDEP case. Also move the
    definition of rcu_root_class into kernel/rcutree.c, as suggested
    by Lai Jiangshan.

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

26 Oct, 2009

6 commits

  • No change in functionality - just straighten out a few small
    stylistic details.

    Cc: Paul E. McKenney
    Cc: David Howells
    Cc: Josh Triplett
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: avi@redhat.com
    Cc: mtosatti@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Make rcutorture list the available torture_type values when it
    doesn't like the one specified.

    Signed-off-by: Paul E. McKenney
    Acked-by: Josh Triplett
    Reviewed-by: Lai Jiangshan
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    Cc: avi@redhat.com
    Cc: mtosatti@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     
  • Signed-off-by: Paul E. McKenney
    Acked-by: Josh Triplett
    Reviewed-by: Lai Jiangshan
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    Cc: avi@redhat.com
    Cc: mtosatti@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     
  • Adds the "srcu_expedited" torture type, and also renames
    sched_ops_sync to sched_sync_ops for consistency while we are in
    this file.

    Signed-off-by: Paul E. McKenney
    Acked-by: Josh Triplett
    Reviewed-by: Lai Jiangshan
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    Cc: avi@redhat.com
    Cc: mtosatti@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     
  • This patch creates a synchronize_srcu_expedited() that uses
    synchronize_sched_expedited() where synchronize_srcu()
    uses synchronize_sched(). The synchronize_srcu() and
    synchronize_srcu_expedited() functions become one-liners that
    pass synchronize_sched() or synchronize_sched_expedited(),
    respectively, to a new __synchronize_srcu() function.

    While in the file, move the EXPORT_SYMBOL_GPL()s to immediately
    follow the corresponding functions. (A standalone sketch of the
    delegation follows this entry.)

    Requested-by: Avi Kivity
    Tested-by: Marcelo Tosatti
    Signed-off-by: Paul E. McKenney
    Acked-by: Josh Triplett
    Reviewed-by: Lai Jiangshan
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    Cc: avi@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
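
    A standalone sketch of the delegation described above, with toy
    stand-ins for the kernel types and grace-period primitives:

        #include <stdio.h>

        struct srcu_struct { int completed; };

        static void synchronize_sched(void)
        {
                puts("normal grace period");
        }

        static void synchronize_sched_expedited(void)
        {
                puts("expedited grace period");
        }

        /* Both public variants delegate here; only sync_func differs. */
        static void __synchronize_srcu(struct srcu_struct *sp,
                                       void (*sync_func)(void))
        {
                sp->completed++;  /* toy stand-in for flip-and-wait */
                sync_func();
        }

        static void synchronize_srcu(struct srcu_struct *sp)
        {
                __synchronize_srcu(sp, synchronize_sched);
        }

        static void synchronize_srcu_expedited(struct srcu_struct *sp)
        {
                __synchronize_srcu(sp, synchronize_sched_expedited);
        }

        int main(void)
        {
                struct srcu_struct s = { 0 };

                synchronize_srcu(&s);
                synchronize_srcu_expedited(&s);
                return 0;
        }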
     
  • This patch provides a version of RCU designed for !SMP systems,
    yielding a small-footprint RCU implementation. In particular, the
    implementation of synchronize_rcu() is extremely lightweight and
    high-performance. It passes rcutorture testing in each of the
    four relevant configurations (combinations of NO_HZ and PREEMPT)
    on x86. This saves about 1K bytes compared to old Classic RCU
    (which is no longer in mainline), and more than three kilobytes
    compared to Hierarchical RCU (updated to 2.6.30):

    CONFIG_TREE_RCU:

    text data bss dec filename
    183 4 0 187 kernel/rcupdate.o
    2783 520 36 3339 kernel/rcutree.o
    3526 Total (vs 4565 for v7)

    CONFIG_TREE_PREEMPT_RCU:

    text data bss dec filename
    263 4 0 267 kernel/rcupdate.o
    4594 776 52 5422 kernel/rcutree.o
    5689 Total (6155 for v7)

    CONFIG_TINY_RCU:

    text data bss dec filename
    96 4 0 100 kernel/rcupdate.o
    734 24 0 758 kernel/rcutiny.o
    858 Total (vs 848 for v7)

    The above is for x86. Your mileage may vary on other platforms.
    Further compression is possible, but is being procrastinated.

    Changes from v7 (http://lkml.org/lkml/2009/10/9/388)

    o Apply Lai Jiangshan's review comments (aside from
    might_sleep() in synchronize_sched(), which is covered by SMP builds).

    o Fix up expedited primitives.

    Changes from v6 (http://lkml.org/lkml/2009/9/23/293).

    o Forward ported to put it into the 2.6.33 stream.

    o Added lockdep support.

    o Make lightweight rcu_barrier.

    Changes from v5 (http://lkml.org/lkml/2009/6/23/12).

    o Ported to latest pre-2.6.32 merge window kernel.

    - Renamed rcu_qsctr_inc() to rcu_sched_qs().
    - Renamed rcu_bh_qsctr_inc() to rcu_bh_qs().
    - Provided trivial rcu_cpu_notify().
    - Provided trivial exit_rcu().
    - Provided trivial rcu_needs_cpu().
    - Fixed up the rcu_*_enter/exit() functions in linux/hardirq.h.

    o Removed the dependence on EMBEDDED, with a view to making
    TINY_RCU default for !SMP at some time in the future.

    o Added (trivial) support for expedited grace periods.

    Changes from v4 (http://lkml.org/lkml/2009/5/2/91) include:

    o Squeeze the size down a bit further by removing the
    ->completed field from struct rcu_ctrlblk.

    o This permits synchronize_rcu() to become the empty function.
    Previous concerns about rcutorture were unfounded, as
    rcutorture correctly handles a constant value from
    rcu_batches_completed() and rcu_batches_completed_bh().

    Changes from v3 (http://lkml.org/lkml/2009/3/29/221) include:

    o Changed rcu_batches_completed(), rcu_batches_completed_bh()
    rcu_enter_nohz(), rcu_exit_nohz(), rcu_nmi_enter(), and
    rcu_nmi_exit(), to be static inlines, as suggested by David
    Howells. Doing this saves about 100 bytes from rcutiny.o.
    (The numbers between v3 and this v4 of the patch are not directly
    comparable, since they are against different versions of Linux.)

    Changes from v2 (http://lkml.org/lkml/2009/2/3/333) include:

    o Fix whitespace issues.

    o Change short-circuit "||" operator to instead be "+" in order
    to fix performance bug noted by "kraai" on LWN.

    (http://lwn.net/Articles/324348/)

    Changes from v1 (http://lkml.org/lkml/2009/1/13/440) include:

    o This version depends on EMBEDDED as well as !SMP, as suggested
    by Ingo.

    o Updated rcu_needs_cpu() to unconditionally return zero,
    permitting the CPU to enter dynticks-idle mode at any time.
    This works because callbacks can be invoked upon entry to
    dynticks-idle mode.

    o Paul is now OK with this being included, based on a poll at
    the Kernel Miniconf at linux.conf.au, where about ten people said
    that they cared about saving 900 bytes on single-CPU systems.

    o Applies to both mainline and tip/core/rcu.

    Signed-off-by: Paul E. McKenney
    Acked-by: David Howells
    Acked-by: Josh Triplett
    Reviewed-by: Lai Jiangshan
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: avi@redhat.com
    Cc: mtosatti@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     

16 Oct, 2009

2 commits

  • When requeuing tasks from one futex to another, the reference held
    by the requeued task to the original futex location needs to be
    dropped eventually.

    Dropping the reference may ultimately lead to a call to
    "iput_final" and subsequently call into filesystem-specific
    code, which may be non-atomic.

    It is therefore safer to defer this drop operation until after the
    futex_hash_bucket spinlock has been dropped.

    Originally-From: Helge Bahmann
    Signed-off-by: Darren Hart
    Cc:
    Cc: Peter Zijlstra
    Cc: Eric Dumazet
    Cc: Dinakar Guniguntala
    Cc: John Stultz
    Cc: Sven-Thorsten Dietrich
    Cc: John Kacur
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Darren Hart
     
  • If the following sequence of events occurs, then
    TREE_PREEMPT_RCU will hang waiting for a grace period to
    complete, eventually OOMing the system:

    o A TREE_PREEMPT_RCU build of the kernel is booted on a system
    with more than 64 physical CPUs present (32 on a 32-bit system).
    Alternatively, a TREE_PREEMPT_RCU build of the kernel is booted
    with RCU_FANOUT set to a sufficiently small value that the
    physical CPUs populate two or more leaf rcu_node structures.

    o A task is preempted in an RCU read-side critical section
    while running on a CPU corresponding to a given leaf rcu_node
    structure.

    o All CPUs corresponding to this same leaf rcu_node structure
    record quiescent states for the current grace period.

    o All of these same CPUs go offline (hence the need for enough
    physical CPUs to populate more than one leaf rcu_node structure).
    This causes the preempted task to be moved to the root rcu_node
    structure.

    At this point, there is nothing left to cause the quiescent
    state to be propagated up the rcu_node tree, so the current
    grace period never completes.

    The simplest fix, especially after considering the deadlock
    possibilities, is to detect this situation when the last CPU is
    offlined, and to set that CPU's ->qsmask bit in its leaf
    rcu_node structure. This will cause the next invocation of
    force_quiescent_state() to end the grace period.

    Without this fix, this hang can be triggered in an hour or so on
    some machines with rcutorture and random CPU onlining/offlining.
    With this fix, these same machines pass a full 10 hours of this
    sort of abuse.

    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     

15 Oct, 2009

6 commits

  • Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    Cc: npiggin@suse.de
    Cc: jens.axboe@oracle.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     
  • o Remove the CONFIG_PREEMPT_RCU documentation since this
    config option has now been removed.

    o Change the now-incorrect references to "rcu" labels to
    instead be "rcu_sched".

    o Add notes stating that CONFIG_TREE_PREEMPT_RCU kernels will
    have additional "rcu_preempt" output.

    o Note the new "oqlen" field in the rcuhier output (for
    RCU callbacks orphaned by an offlined CPU).

    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    Cc: npiggin@suse.de
    Cc: jens.axboe@oracle.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     
  • Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    Cc: npiggin@suse.de
    Cc: jens.axboe@oracle.com
    Cc: Josh Triplett
    LKML-Reference:
    Signed-off-by: Ingo Molnar
    kernel/rcutree_trace.c | 8 ++++++--
    1 file changed, 6 insertions(+), 2 deletions(-)

    Paul E. McKenney
     
  • For the short term, map synchronize_rcu_expedited() to
    synchronize_rcu() for TREE_PREEMPT_RCU and to
    synchronize_sched_expedited() for TREE_RCU.

    Longer term, there needs to be a real expedited grace period for
    TREE_PREEMPT_RCU, but candidate patches to date are considerably
    more complex and intrusive. (A simplified sketch of the mapping
    follows this entry.)

    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    Cc: npiggin@suse.de
    Cc: jens.axboe@oracle.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
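
    A simplified sketch of the short-term mapping (toy stand-ins, not
    the actual header code):

        #include <stdio.h>

        static void synchronize_rcu_toy(void)
        {
                puts("full preemptible-RCU grace period");
        }

        static void synchronize_sched_expedited_toy(void)
        {
                puts("expedited sched grace period");
        }

        #ifdef CONFIG_TREE_PREEMPT_RCU
        static void synchronize_rcu_expedited_toy(void)
        {
                synchronize_rcu_toy(); /* no real expedited GP yet */
        }
        #else /* assume TREE_RCU */
        static void synchronize_rcu_expedited_toy(void)
        {
                synchronize_sched_expedited_toy();
        }
        #endif

        int main(void)
        {
                synchronize_rcu_expedited_toy();
                return 0;
        }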
     
  • As the number of callbacks on a given CPU rises, invoke
    force_quiescent_state() only every blimit number of callbacks
    (defaults to 10,000), and even then only if no other CPU has
    invoked force_quiescent_state() in the meantime.

    This should fix the performance regression reported by Nick. (A
    toy model of the rate limiting follows this entry.)

    Reported-by: Nick Piggin
    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    Cc: jens.axboe@oracle.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
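
    A toy model of the rate limiting described above (the field names
    are borrowed for flavor; this is not the kernel code):

        #include <stdio.h>

        static const long blimit = 10000;
        static long n_force_qs;  /* bumped by every forcing pass */

        struct rdp { long qlen, qlen_last_fqs_check, n_force_qs_snap; };

        static void force_quiescent_state_toy(void)
        {
                n_force_qs++;
                puts("forcing quiescent states");
        }

        static void maybe_force(struct rdp *rdp)
        {
                if (rdp->qlen - rdp->qlen_last_fqs_check < blimit)
                        return;  /* too few new callbacks since last check */
                if (rdp->n_force_qs_snap == n_force_qs)
                        force_quiescent_state_toy(); /* nobody beat us */
                rdp->n_force_qs_snap = n_force_qs;
                rdp->qlen_last_fqs_check = rdp->qlen;
        }

        int main(void)
        {
                struct rdp rdp = { .qlen = 20000 };

                maybe_force(&rdp); /* forces once */
                maybe_force(&rdp); /* no new callbacks: skipped */
                return 0;
        }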
     
  • If userspace tries to perform a requeue_pi on a non-requeue_pi waiter,
    it will find the futex_q->requeue_pi_key to be NULL and OOPS.

    Check for NULL in match_futex() instead of doing explicit NULL pointer
    checks on all call sites. While match_futex(NULL, NULL) returning
    false is a little odd, it's still correct, as we expect valid key
    references. (A sketch of the check follows this entry.)

    Signed-off-by: Darren Hart
    Cc: Peter Zijlstra
    Cc: Ingo Molnar
    Cc: Eric Dumazet
    Cc: Dinakar Guniguntala
    Cc: John Stultz
    Cc: stable@kernel.org
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Darren Hart
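
    A sketch of the check described above, with a simplified stand-in
    for the kernel's union futex_key:

        #include <stdbool.h>
        #include <stdio.h>

        union futex_key {
                struct { unsigned long word; void *ptr; int offset; } both;
        };

        /* Treat a NULL key on either side as "no match" rather than
         * dereferencing it. */
        static bool match_futex(union futex_key *key1, union futex_key *key2)
        {
                return key1 && key2
                        && key1->both.word == key2->both.word
                        && key1->both.ptr == key2->both.ptr
                        && key1->both.offset == key2->both.offset;
        }

        int main(void)
        {
                union futex_key k = { .both = { 1, NULL, 0 } };

                printf("%d\n", match_futex(&k, &k));   /* 1 */
                printf("%d\n", match_futex(&k, NULL)); /* 0: was an OOPS */
                return 0;
        }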
     

14 Oct, 2009

1 commit

  • The futex code does not handle spurious wake up in futex_wait and
    futex_wait_requeue_pi.

    The code assumes that any wake up which was not caused by futex_wake /
    requeue or by a timeout was caused by a signal wake up and returns one
    of the syscall restart error codes.

    In case of a spurious wake up the signal delivery code which deals
    with the restart error codes is not invoked and we return that error
    code to user space. That causes applications which actually check the
    return codes to fail. Blaise reported that on preempt-rt a python test
    program ran into an exception trap. -rt exposed this due to a built-in
    spurious wake up accelerator :)

    Solve this by checking signal_pending(current) in the wake up path and
    handle the spurious wake up case w/o returning to user space.

    Reported-by: Blaise Gassend
    Debugged-by: Darren Hart
    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: stable@kernel.org
    LKML-Reference:

    Thomas Gleixner
     

13 Oct, 2009

1 commit