04 Oct, 2006

2 commits

  • Add MAINTAINERS entry for Read-Copy Update (RCU), listing Dipankar Sarma as
    maintainer, and giving the URL for Paul McKenney's RCU site. Add
    MAINTAINERS entry for rcutorture, listing myself as maintainer. Add
    CREDITS entries for developers of RCU, RCU variants, and rcutorture. Use
    Paul McKenney's preferred email address in include/linux/rcupdate.h.

    Signed-off-by: Josh Triplett
    Cc: Paul McKenney
    Cc: Dipankar Sarma
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Josh Triplett
     
  • Kill the hard-to-calculate 'rsinterval' boot parameter and the per-cpu
    rcu_data.last_rs_qlen. Instead, add a flag, rcu_ctrlblk.signaled, which
    records the fact that one of the CPUs has sent a resched IPI since the
    last rcu_start_batch().

    Roughly speaking, we need two rcu_start_batch()s in order to move
    callbacks from ->nxtlist to ->donelist. This means that when ->qlen
    exceeds qhimark and continues to grow, we should send a resched IPI, and
    then do it again after we have gone through a quiescent state.

    On the other hand, if an IPI has already been sent, we don't need to send
    it again when another CPU detects overflow of the queue (see the sketch
    at the end of this entry).

    Signed-off-by: Oleg Nesterov
    Acked-by: Paul E. McKenney
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
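
    A minimal sketch of the flag logic described above, with simplified
    types and a stand-in send_resched_ipis() helper in place of the real
    kernel code:

        struct rcu_ctrlblk {
                /* ... batch bookkeeping ... */
                int signaled;   /* resched IPI sent since last batch? */
        };

        /* Called when some CPU sees ->qlen exceed qhimark. */
        static void force_quiescent_state(struct rcu_ctrlblk *rcp)
        {
                if (!rcp->signaled) {
                        rcp->signaled = 1;      /* at most one IPI per batch */
                        send_resched_ipis();    /* stand-in: make every CPU
                                                   pass a quiescent state */
                }
        }

        static void rcu_start_batch(struct rcu_ctrlblk *rcp)
        {
                rcp->signaled = 0;      /* allow the next IPI round */
                /* ... start the next batch ... */
        }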
     

01 Jul, 2006

1 commit

  • Add __acquire annotations to rcu_read_lock and rcu_read_lock_bh, and add
    __release annotations to rcu_read_unlock and rcu_read_unlock_bh. This
    allows sparse to detect improperly paired calls to these functions.

    Signed-off-by: Josh Triplett
    Acked-by: Paul E. McKenney
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Josh Triplett
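
    A sketch of what the annotated macros look like, assuming the kernel's
    __acquire()/__release() sparse markers (no-ops outside sparse) and the
    preempt-based rcu_read_lock() of this era:

        #define rcu_read_lock() \
                do { \
                        preempt_disable(); \
                        __acquire(RCU); /* seen only by sparse */ \
                } while (0)

        #define rcu_read_unlock() \
                do { \
                        __release(RCU); /* pairs with __acquire(RCU) */ \
                        preempt_enable(); \
                } while (0)

    With these in place, sparse warns about a function that calls
    rcu_read_lock() without a matching rcu_read_unlock(), or vice versa.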
     

16 May, 2006

1 commit

  • With "Paul E. McKenney"

    Introduce the rcu_needs_cpu() interface. It can be used to tell whether
    a new RCU batch is due on a cpu soon, by looking at the curlist pointer.
    This helps avoid entering a tickless idle state in which the cpu would
    miss that a new batch is ready when rcu_start_batch() is called on a
    different cpu (see the sketch below).

    Signed-off-by: Heiko Carstens
    Cc: "Paul E. McKenney"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Heiko Carstens
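
    A hedged sketch of the intended call site; stop_tick() is a hypothetical
    stand-in for the arch's tickless-idle entry code:

        int cpu = smp_processor_id();

        /* rcu_needs_cpu() looks at the curlist pointer: if callbacks are
           queued for an upcoming batch, the tick must keep running so this
           CPU notices when the batch starts. */
        if (!rcu_needs_cpu(cpu))
                stop_tick(cpu);         /* safe to go tickless */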
     

09 Mar, 2006

1 commit

  • This patch adds new tunables for the RCU queue and finished batches.
    There are two types of controls - the number of completed RCU callbacks
    invoked in a batch (blimit) and watermarks for detecting a high rate of
    incoming callbacks on a cpu (qhimark, qlowmark).

    By default, the per-cpu batch limit is set to a small value. If the
    incoming callback rate exceeds the high watermark, we do two things -
    force a quiescent state on all cpus and set the CPU's batch limit to
    INT_MAX, which forces all finished callbacks to be processed in one
    shot. If we have more than INT_MAX callbacks queued up, we have bigger
    problems anyway. Once the number of queued callbacks falls below the low
    watermark, the batch limit is restored to the default (see the sketch
    below).

    Signed-off-by: Dipankar Sarma
    Cc: "Paul E. McKenney"
    Cc: "David S. Miller"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dipankar Sarma
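
    A simplified sketch of the watermark handling described above (the real
    code splits this between callback queueing and batch processing):

        if (++rdp->qlen > qhimark) {
                rdp->blimit = INT_MAX;          /* drain in one shot */
                force_quiescent_state(rdp);     /* push all cpus through
                                                   a quiescent state */
        }

        /* ... later, as the queue drains ... */
        if (rdp->blimit == INT_MAX && rdp->qlen <= qlowmark)
                rdp->blimit = blimit;           /* back to the default */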
     

04 Feb, 2006

1 commit

  • Fix the broken comment on synchronize_rcu() noted by Keith Owens. Also
    add a sentence noting that synchronize_sched() and synchronize_rcu() are
    not necessarily identical.

    Signed-off-by: Paul E. McKenney
    Cc: Keith Owens
    Cc: Stephen Hemminger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul E. McKenney
     

11 Jan, 2006

1 commit

  • This patch moves rcu_state into the rcu_ctrlblk. I think there is no
    reason why we should have two different variables to control rcu state.
    Every user of rcu_state also has "rcu_ctrlblk *rcp" in its parameter
    list.

    Signed-off-by: Oleg Nesterov
    Acked-by: Paul E. McKenney
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
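
    Schematically, the consolidation looks like this (field names are
    illustrative, not the exact kernel layout):

        struct rcu_ctrlblk {
                long cur;               /* current batch number */
                long completed;         /* last completed batch */

                /* formerly the separate global rcu_state: */
                spinlock_t lock;        /* guards batch fields and cpumask */
                cpumask_t cpumask;      /* CPUs yet to report a QS */
        };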
     

10 Jan, 2006

1 commit

  • __rcu_pending() is rather fat and is called twice from rcu_pending().

    rcu_pending() has multiple callers, and is not that small either.

    This patch uninlines both of them.

    Signed-off-by: Oleg Nesterov
    Acked-by: Paul E. McKenney
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

09 Jan, 2006

1 commit

  • ____cacheline_maxaligned_in_smp is currently used to align critical
    structures and avoid false sharing. It uses the per-arch
    L1_CACHE_SHIFT_MAX, and people find L1_CACHE_SHIFT_MAX useless.

    However, we have been using ____cacheline_maxaligned_in_smp to align
    structures on the internode cacheline size. As per Andi's suggestion,
    the following patch kills ____cacheline_maxaligned_in_smp and introduces
    INTERNODE_CACHE_SHIFT, which defaults to L1_CACHE_SHIFT for all arches.
    Arches needing L3/internode cacheline alignment can define
    INTERNODE_CACHE_SHIFT in the arch asm/cache.h. The patch replaces
    ____cacheline_maxaligned_in_smp with
    ____cacheline_internodealigned_in_smp (sketched below).

    With this patch, L1_CACHE_SHIFT_MAX can be killed.

    Signed-off-by: Ravikiran Thirumalai
    Signed-off-by: Shai Fultheim
    Signed-off-by: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ravikiran G Thirumalai
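
    Roughly, the resulting arrangement in include/linux/cache.h (arches
    override INTERNODE_CACHE_SHIFT in their asm/cache.h when needed):

        #ifndef INTERNODE_CACHE_SHIFT
        #define INTERNODE_CACHE_SHIFT L1_CACHE_SHIFT    /* generic default */
        #endif

        #if defined(CONFIG_SMP)
        #define ____cacheline_internodealigned_in_smp \
                __attribute__((__aligned__(1 << INTERNODE_CACHE_SHIFT)))
        #else
        #define ____cacheline_internodealigned_in_smp
        #endif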
     

13 Dec, 2005

1 commit

  • This introduces a new interface - rcu_barrier() - which waits until all
    RCU callbacks queued before this call have completed.

    Reiser4 needs this because we do more than just free a memory object in
    our RCU callback: we also remove it from the list hanging off the
    super-block. This means that before freeing the reiser4-specific portion
    of the super-block (during umount) we have to wait until all pending RCU
    callbacks have executed.

    The only reiser4-specific change to the original patch is the export of
    rcu_barrier().

    Cc: Hans Reiser
    Cc: Vladimir V. Saveliev
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dipankar Sarma
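
    A hedged usage sketch for teardown paths like the umount case above
    (object and callback names are hypothetical):

        /* Objects are normally freed asynchronously: */
        call_rcu(&obj->rcu_head, free_object_cb);

        /* Before freeing structures those callbacks still touch (such as
           the list hanging off the super-block), wait for all of them: */
        rcu_barrier();          /* returns once every previously queued
                                   RCU callback has run */
        kfree(sb_private);      /* hypothetical: now safe to free */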
     

31 Oct, 2005

1 commit

  • This patch is a rewrite of the one submitted on October 1st, using modules
    (http://marc.theaimsgroup.com/?l=linux-kernel&m=112819093522998&w=2).

    This rewrite adds a tristate CONFIG_RCU_TORTURE_TEST, which enables an
    intense torture test of the RCU infrastructure. This is needed due to the
    continued changes to the RCU infrastructure to accommodate dynamic ticks,
    CPU hotplug, realtime, and so on. Most of the code is in a separate file
    that is compiled only if the CONFIG variable is set. Documentation on how
    to run the test and interpret the output is also included.

    This code has been tested on i386 and ppc64, and an earlier version of the
    code has received extensive testing on a number of architectures as part of
    the PREEMPT_RT patchset.

    Signed-off-by: "Paul E. McKenney"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul E. McKenney
     

10 Sep, 2005

1 commit

  • First of a number of files_lock scalability patches.

    Here are the x86 numbers -

    tiobench on a 4(8)-way (HT) P4 system on ramdisk:

    (lockfree)
    Test        2.6.10-vanilla  Stdev   2.6.10-fd  Stdev
    -----------------------------------------------------
    Seqread     1400.8          11.52   1465.4     34.27
    Randread    1594            8.86    2397.2     29.21
    Seqwrite    242.72          3.47    238.46     6.53
    Randwrite   445.74          9.15    446.4      9.75

    The performance improvement is very significant. We are getting killed
    by the cacheline bouncing of the files_struct lock here. Writes on
    ramdisk (ext2) seem to vary just too much to get any meaningful number.

    Also, with Tridge's thread_perf test on a 4(8)-way (HT) P4 Xeon system:

    2.6.12-rc5-vanilla :

    Running test 'readwrite' with 8 tasks
    Threads 0.34 +/- 0.01 seconds
    Processes 0.16 +/- 0.00 seconds

    2.6.12-rc5-fd :

    Running test 'readwrite' with 8 tasks
    Threads 0.17 +/- 0.02 seconds
    Processes 0.17 +/- 0.02 seconds

    I repeated the measurements on ramfs (as opposed to ext2 on ramdisk in
    the earlier measurement) and I got more consistent results from tiobench:

    4(8)-way Xeon P4
    ----------------
    (lock-free)
    Test        2.6.12-rc5  Stdev   2.6.12-rc5-fd  Stdev
    -----------------------------------------------------
    Seqread     1282        18.59   1343.6         26.37
    Randread    1517        7       2415           34.27
    Seqwrite    702.2       5.27    709.46         5.9
    Randwrite   846.86      15.15   919.68         21.4

    4-way ppc64
    -----------
    (lock-free)
    Test        2.6.12-rc5  Stdev   2.6.12-rc5-fd  Stdev
    -----------------------------------------------------
    Seqread     1549        91.16   1569.6         47.2
    Randread    1473.6      25.11   1585.4         69.99
    Seqwrite    1096.8      20.03   1136           29.61
    Randwrite   1189.6      4.04    1275.2         32.96

    Also running Tridge's thread_perf test on ppc64:

    2.6.12-rc5-vanilla
    --------------------
    Running test 'readwrite' with 4 tasks
    Threads 0.20 +/- 0.02 seconds
    Processes 0.16 +/- 0.01 seconds

    2.6.12-rc5-fd
    --------------------
    Running test 'readwrite' with 4 tasks
    Threads 0.18 +/- 0.04 seconds
    Processes 0.16 +/- 0.01 seconds

    The benefits are huge (up to ~60%) in some cases on x86, primarily due
    to the atomic operations during acquisition of ->file_lock and cacheline
    bouncing in the fast path. The ppc64 benefits are modest due to LL/SC
    based locking, but still statistically significant.

    This patch:

    The RCU head initializer no longer needs the head variable name, since
    we don't use list.h lists anymore (see the sketch below).

    Signed-off-by: Dipankar Sarma
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dipankar Sarma
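
    A sketch of the initializer change referred to by "This patch:" (the
    pre-change form is reconstructed from the description):

        /* Before (reconstructed): the variable name was required while
         * rcu_head still carried list.h pointers:
         *
         *      #define RCU_HEAD_INIT(head) { ... }
         */

        /* After: no back-reference to the head, so no argument. */
        #define RCU_HEAD_INIT { .next = NULL, .func = NULL }
        #define RCU_HEAD(head) struct rcu_head head = RCU_HEAD_INIT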
     

01 May, 2005

1 commit

  • The synchronize_kernel() primitive is used for quite a few different
    purposes: waiting for RCU readers, waiting for NMIs, waiting for
    interrupts, and so on. This makes RCU code harder to read, since
    synchronize_kernel() might or might not have matching rcu_read_lock()s.
    This patch creates a new synchronize_rcu() that is to be used for RCU
    readers and a new synchronize_sched() that is used for the rest (see the
    illustration below). These two new primitives currently have the same
    implementation, but this might well change with additional real-time
    support. Both new primitives are GPL-only; the old primitive is
    deprecated.

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul E. McKenney
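
    A hedged illustration of which primitive matches which wait:

        /* Updaters waiting for RCU readers, i.e. for sections bracketed
           by rcu_read_lock()/rcu_read_unlock() on other CPUs: */
        synchronize_rcu();

        /* Updaters waiting for "the rest" (per the entry above): code
           that runs with preemption or interrupts disabled rather than
           under rcu_read_lock(): */
        synchronize_sched();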
     

17 Apr, 2005

1 commit

  • Initial git repository build. I'm not bothering with the full history,
    even though we have it. We can create a separate "historical" git
    archive of that later if we want to, and in the meantime it's about
    3.2GB when imported into git - space that would just make the early
    git days unnecessarily complicated, when we don't have a lot of good
    infrastructure for it.

    Let it rip!

    Linus Torvalds