31 Oct, 2011

1 commit

  • The changed files were only including linux/module.h for the
    EXPORT_SYMBOL infrastructure, and nothing else. Revector them
    onto the isolated export header for faster compile times.

    Nothing to see here but a whole lot of instances of:

    -#include <linux/module.h>
    +#include <linux/export.h>

    This commit is only changing the kernel dir; next targets
    will probably be mm, fs, the arch dirs, etc.

    Signed-off-by: Paul Gortmaker

    Paul Gortmaker
     

17 Jun, 2011

1 commit

    There is a problem where kdump (the 2nd kernel) sometimes hangs up due
    to a pending IPI from the 1st kernel. A kernel panic occurs because the
    IPI comes before call_single_queue is initialized.

    To fix the crash, rename init_call_single_data() to call_function_init()
    and call it in start_kernel() so that call_single_queue can be
    initialized before enabling interrupts.

    The details of the crash are:

    (1) 2nd kernel boots up

    (2) A pending IPI from 1st kernel comes when irqs are first enabled
    in start_kernel().

    (3) The kernel tries to handle the interrupt, but call_single_queue
    is not initialized yet at this point. As a result, in
    generic_smp_call_function_single_interrupt(), a NULL pointer
    dereference occurs when list_replace_init() tries to access
    &q->list.next.

    Therefore this patch changes the name of init_call_single_data()
    to call_function_init() and calls it before local_irq_enable()
    in start_kernel().
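
    The ordering described above, roughly sketched (illustrative only, not
    the actual diff):

    asmlinkage void __init start_kernel(void)
    {
            /* ... */
            call_function_init();   /* initializes call_single_queue */
            /* ... */
            local_irq_enable();     /* a pending IPI can now be handled safely */
            /* ... */
    }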

    Signed-off-by: Takao Indoh
    Reviewed-by: WANG Cong
    Acked-by: Neil Horman
    Acked-by: Vivek Goyal
    Acked-by: Peter Zijlstra
    Cc: Milton Miller
    Cc: Jens Axboe
    Cc: Paul E. McKenney
    Cc: kexec@lists.infradead.org
    Link: http://lkml.kernel.org/r/D6CBEE2F420741indou.takao@jp.fujitsu.com
    Signed-off-by: Ingo Molnar

    Takao Indoh
     

23 Mar, 2011

1 commit

  • Move setup_nr_cpu_ids(), smp_init() and some other SMP boot parameter
    setup functions from init/main.c to kernel/smp.c; this saves some #ifdef
    CONFIG_SMP.
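
    A minimal sketch of why this saves #ifdefs (hypothetical header excerpt,
    not the actual patch): with the implementations in kernel/smp.c,
    init/main.c can call the helpers unconditionally and the UP stubs live
    in a header:

    #ifdef CONFIG_SMP
    void __init setup_nr_cpu_ids(void);
    void __init smp_init(void);
    #else
    static inline void setup_nr_cpu_ids(void) { }
    static inline void smp_init(void) { }
    #endif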

    Signed-off-by: WANG Cong
    Cc: Rakib Mullick
    Cc: David Howells
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Tejun Heo
    Cc: Arnd Bergmann
    Cc: Akinobu Mita
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Amerigo Wang
     

18 Mar, 2011

4 commits

  • Use the newly added smp_call_func_t in smp_call_function_interrupt for
    the func variable, and make the comment above the WARN more assertive
    and explicit. Also, func is a function pointer and does not need an
    offset, so use %pf not %pS.
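
    For illustration (not part of the patch): %pS prints a symbol plus
    offset/size, which suits a return address, while %pf prints just the
    function name, which is what you want for a plain function pointer:

    smp_call_func_t func = data->csd.func;

    printk("%pS\n", func);  /* symbol with offset/size */
    printk("%pf\n", func);  /* just the function name */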

    Signed-off-by: Milton Miller
    Signed-off-by: Linus Torvalds

    Milton Miller
     
  • Mike Galbraith reported finding a lockup ("perma-spin bug") where the
    cpumask passed to smp_call_function_many was cleared by other cpu(s)
    while a cpu was preparing its call_data block, resulting in no cpu to
    clear the last ref and unlock the block.

    Having cpus clear their bit asynchronously could be useful on a mask of
    cpus that might have a translation context, or cpus that need a push to
    complete an rcu window.

    Instead of adding a BUG_ON and requiring yet another cpumask copy, just
    detect the race and handle it.

    Note: arch_send_call_function_ipi_mask must still handle an empty
    cpumask because the data block is globally visible before that arch
    callback is made. And (obviously) there are no guarantees as to which cpus
    are notified if the mask is changed during the call; only cpus that were
    online and had their mask bit set during the whole call are guaranteed
    to be called.
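
    A rough sketch of "detect the race and handle it" on the sender side,
    assuming the surrounding kernel/smp.c names (simplified, not the literal
    patch):

    /* the caller's mask may have been emptied concurrently by other cpus */
    refs = cpumask_weight(data->cpumask);
    if (unlikely(!refs)) {
            csd_unlock(&data->csd);         /* nothing to send; bail out */
            return;
    }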

    Reported-by: Mike Galbraith
    Reported-by: Jan Beulich
    Acked-by: Jan Beulich
    Cc: stable@kernel.org
    Signed-off-by: Milton Miller
    Signed-off-by: Linus Torvalds

    Milton Miller
     
  • Paul McKenney's review pointed out two problems with the barriers in the
    2.6.38 update to the smp call function many code.

    First, a barrier that would force the func and info members of data to
    be visible before their consumption in the interrupt handler was
    missing. This can be solved by adding a smp_wmb between setting the
    func and info members and setting the cpumask; this will pair
    with the existing and required smp_rmb ordering the cpumask read before
    the read of refs. This placement avoids the need for a second smp_rmb in
    the interrupt handler which would be executed on each of the N cpus
    executing the call request. (I had thought this barrier was present,
    but it was not.)

    Second, the previous write to refs (establishing the zero that the
    interrupt handler was testing from all cpus) was performed by a third
    party cpu. This would invoke transitivity which, as a recent or
    concurrent addition to memory-barriers.txt now explicitly states, would
    require a full smp_mb().

    However, we know the cpumask will only be set by one cpu (the data
    owner) and any previous iteration of the mask would have been cleared by
    the reading cpu. By redundantly writing refs to 0 on the owning cpu
    before the smp_wmb, the write to refs will follow the same path as the
    writes that set the cpumask, which in turn allows us to keep the barrier
    in the interrupt handler a smp_rmb instead of promoting it to a smp_mb
    (which will be executed by N cpus for each of the possible M elements on
    the list).
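
    A condensed sketch of the sender-side ordering described above
    (illustrative, not the patch itself):

    data->csd.func = func;
    data->csd.info = info;
    atomic_set(&data->refs, 0);     /* redundant zero written by the owner */

    smp_wmb();                      /* pairs with the handler's smp_rmb */

    cpumask_and(data->cpumask, mask, cpu_online_mask);
    /* refs is set to the final count later */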

    I moved and expanded the comment about our (ab)use of the rcu list
    primitives for the concurrent walk earlier into this function. I
    considered moving the first two paragraphs to the queue list head and
    lock, but felt it would have been too disconnected from the code.

    Cc: Paul McKenney
    Cc: stable@kernel.org (2.6.32 and later)
    Signed-off-by: Milton Miller
    Signed-off-by: Linus Torvalds

    Milton Miller
     
  • Peter pointed out there was nothing preventing the list_del_rcu in
    smp_call_function_interrupt from running before the list_add_rcu in
    smp_call_function_many.

    Fix this by not setting refs until we have gotten the lock for the list.
    Take advantage of the wmb in list_add_rcu to save an explicit additional
    one.

    I tried to force this race with a udelay before the lock & list_add and
    by mixing all 64 online cpus with just 3 random cpus in the mask, but
    was unsuccessful. Still, inspection shows a valid race, and the fix is
    an extension of the existing protection window in the current code.
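
    A sketch of the resulting order (illustrative; names as used in
    kernel/smp.c of that era):

    raw_spin_lock_irqsave(&call_function.lock, flags);
    list_add_rcu(&data->csd.list, &call_function.queue);
    /*
     * The wmb inside list_add_rcu() orders the earlier writes against this
     * store: refs only becomes non-zero once the entry is on the list.
     */
    atomic_set(&data->refs, refs);
    raw_spin_unlock_irqrestore(&call_function.lock, flags);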

    Cc: stable@kernel.org (v2.6.32 and later)
    Reported-by: Peter Zijlstra
    Signed-off-by: Milton Miller
    Signed-off-by: Linus Torvalds

    Milton Miller
     

21 Jan, 2011

3 commits

  • …/git/tip/linux-2.6-tip

    * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    smp: Allow on_each_cpu() to be called while early_boot_irqs_disabled
    lockdep: Move early boot local IRQ enable/disable status to init/main.c

    Linus Torvalds
     
  • We have to test the cpu mask in the interrupt handler before checking the
    refs, otherwise we can start to follow an entry before it's deleted and
    find it partially initialized for the next trip. Presently we also clear
    the cpumask bit before executing the called function, which implies
    getting write access to the line. After the function is called we then
    decrement refs, and if they go to zero we then unlock the structure.

    However, this implies getting write access to the call function data
    before and after the function is called. If we can assert that no
    smp_call_function execution function is allowed to enable interrupts, then
    we can move both writes to after the function is called, hopefully allowing
    both writes with one cache line bounce.

    On a 256 thread system with a kernel compiled for 1024 threads, the time
    to execute the testcase in the "smp_call_function_many race" changelog was
    reduced by about 30-40ms out of about 545 ms.

    I decided to keep this as WARN because it's now a buggy function, even
    though the stack trace is of no value -- a simple printk would give us the
    information needed.
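
    A simplified sketch of the consolidated handler path (illustrative; the
    WARN wording is an assumption, not quoted from the patch):

    /* read-only check first: is this entry for us at all? */
    if (!cpumask_test_cpu(cpu, data->cpumask))
            continue;

    data->csd.func(data->csd.info);

    /* both writes now happen together, after the call */
    if (!cpumask_test_and_clear_cpu(cpu, data->cpumask)) {
            WARN(1, "%pf enabled interrupts and double executed\n",
                 data->csd.func);
            continue;
    }

    refs = atomic_dec_return(&data->refs);
    WARN_ON(refs < 0);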

    Raw data:

    Without patch:
    ipi_test startup took 1219366ns complete 539819014ns total 541038380ns
    ipi_test startup took 1695754ns complete 543439872ns total 545135626ns
    ipi_test startup took 7513568ns complete 539606362ns total 547119930ns
    ipi_test startup took 13304064ns complete 533898562ns total 547202626ns
    ipi_test startup took 8668192ns complete 544264074ns total 552932266ns
    ipi_test startup took 4977626ns complete 548862684ns total 553840310ns
    ipi_test startup took 2144486ns complete 541292318ns total 543436804ns
    ipi_test startup took 21245824ns complete 530280180ns total 551526004ns

    With patch:
    ipi_test startup took 5961748ns complete 500859628ns total 506821376ns
    ipi_test startup took 8975996ns complete 495098924ns total 504074920ns
    ipi_test startup took 19797750ns complete 492204740ns total 512002490ns
    ipi_test startup took 14824796ns complete 487495878ns total 502320674ns
    ipi_test startup took 11514882ns complete 494439372ns total 505954254ns
    ipi_test startup took 8288084ns complete 502570774ns total 510858858ns
    ipi_test startup took 6789954ns complete 493388112ns total 500178066ns

    #include <linux/module.h>
    #include <linux/init.h>
    #include <linux/workqueue.h>
    #include <linux/smp.h>
    #include <linux/sched.h> /* sched clock */

    #define ITERATIONS 100

    static void do_nothing_ipi(void *dummy)
    {
    }

    static void do_ipis(struct work_struct *dummy)
    {
            int i;

            for (i = 0; i < ITERATIONS; i++)
                    smp_call_function(do_nothing_ipi, NULL, 1);

            printk(KERN_DEBUG "cpu %d finished\n", smp_processor_id());
    }

    static struct work_struct work[NR_CPUS];

    static int __init testcase_init(void)
    {
            int cpu;
            u64 start, started, done;

            start = local_clock();
            for_each_online_cpu(cpu) {
                    INIT_WORK(&work[cpu], do_ipis);
                    schedule_work_on(cpu, &work[cpu]);
            }
            started = local_clock();
            for_each_online_cpu(cpu)
                    flush_work(&work[cpu]);
            done = local_clock();
            pr_info("ipi_test startup took %lldns complete %lldns total %lldns\n",
                    started - start, done - started, done - start);

            return 0;
    }

    static void __exit testcase_exit(void)
    {
    }

    module_init(testcase_init);
    module_exit(testcase_exit);
    MODULE_LICENSE("GPL");
    MODULE_AUTHOR("Anton Blanchard");

    Signed-off-by: Milton Miller
    Cc: Anton Blanchard
    Cc: Ingo Molnar
    Cc: "Paul E. McKenney"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Milton Miller
     
  • I noticed a failure where we hit the following WARN_ON in
    generic_smp_call_function_interrupt:

    if (!cpumask_test_and_clear_cpu(cpu, data->cpumask))
            continue;

    data->csd.func(data->csd.info);

    refs = atomic_dec_return(&data->refs);
    WARN_ON(refs < 0);

    The failing sequence: while the owner is reusing its percpu *data block,
    another cpu's interrupt handler sees and clears its bit in the cpumask,
    might be using the old or the new fn, and decrements refs below 0, all
    before the owner sets data->refs (too late!).

    The important thing to note is that since the interrupt handler walks a
    potentially stale call_function.queue without any locking, another cpu
    can view the percpu *data structure at any time, even while the owner
    is in the process of initialising it.

    The following test case hits the WARN_ON 100% of the time on my PowerPC
    box (having 128 threads does help :)

    #include <linux/module.h>
    #include <linux/init.h>
    #include <linux/workqueue.h>
    #include <linux/smp.h>

    #define ITERATIONS 100

    static void do_nothing_ipi(void *dummy)
    {
    }

    static void do_ipis(struct work_struct *dummy)
    {
            int i;

            for (i = 0; i < ITERATIONS; i++)
                    smp_call_function(do_nothing_ipi, NULL, 1);

            printk(KERN_DEBUG "cpu %d finished\n", smp_processor_id());
    }

    static struct work_struct work[NR_CPUS];

    static int __init testcase_init(void)
    {
            int cpu;

            for_each_online_cpu(cpu) {
                    INIT_WORK(&work[cpu], do_ipis);
                    schedule_work_on(cpu, &work[cpu]);
            }

            return 0;
    }

    static void __exit testcase_exit(void)
    {
    }

    module_init(testcase_init);
    module_exit(testcase_exit);
    MODULE_LICENSE("GPL");
    MODULE_AUTHOR("Anton Blanchard");

    I tried to fix it by ordering the read and the write of ->cpumask and
    ->refs. In doing so I missed a critical case but Paul McKenney was able
    to spot my bug thankfully :) To ensure we aren't viewing previous
    iterations the interrupt handler needs to read ->refs then ->cpumask then
    ->refs _again_.
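
    A heavily simplified sketch of the ordered checks that result in the
    handler (with the extra leading read of ->refs dropped, per the bracketed
    note below):

    /* skip entries that are not (or no longer) addressed to us */
    if (!cpumask_test_cpu(cpu, data->cpumask))
            continue;

    smp_rmb();      /* order the cpumask read before the refs read */

    /* an entry being reused has refs == 0 until the owner republishes it */
    if (atomic_read(&data->refs) == 0)
            continue;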

    Thanks to Milton Miller and Paul McKenney for helping to debug this issue.

    [miltonm@bga.com: add WARN_ON and BUG_ON, remove extra read of refs before initial read of mask that doesn't help (also noted by Peter Zijlstra), adjust comments, hopefully clarify scenario ]
    [miltonm@bga.com: remove excess tests]
    Signed-off-by: Anton Blanchard
    Signed-off-by: Milton Miller
    Cc: Ingo Molnar
    Cc: "Paul E. McKenney"
    Cc: [2.6.32+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Anton Blanchard
     

20 Jan, 2011

1 commit

    percpu may end up calling vfree() during early boot which in
    turn may call on_each_cpu() for TLB flushes. What on_each_cpu()
    does can be done safely while IRQs are disabled during early
    boot, but it assumed that it is always called with local IRQs
    enabled, which ended up enabling local IRQs prematurely during
    boot and triggering a couple of warnings.

    This patch updates on_each_cpu() and smp_call_function_many()
    such that on_each_cpu() can be used safely while
    early_boot_irqs_disabled is set.
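
    A hedged sketch of the on_each_cpu() half of the change: using
    local_irq_save()/local_irq_restore() keeps the local call from
    re-enabling IRQs that were deliberately left off during early boot
    (simplified; the warning check in smp_call_function_many() is adjusted
    as well):

    int on_each_cpu(void (*func)(void *info), void *info, int wait)
    {
            unsigned long flags;
            int ret = 0;

            preempt_disable();
            ret = smp_call_function(func, info, wait);
            local_irq_save(flags);          /* was: local_irq_disable() */
            func(info);
            local_irq_restore(flags);       /* was: local_irq_enable() */
            preempt_enable();

            return ret;
    }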

    Signed-off-by: Tejun Heo
    Acked-by: Peter Zijlstra
    Acked-by: Pekka Enberg
    Cc: Linus Torvalds
    LKML-Reference:
    Signed-off-by: Ingo Molnar
    Reported-by: Ingo Molnar

    Tejun Heo
     

14 Jan, 2011

1 commit

    For an arch which needs USE_GENERIC_SMP_HELPERS, make it select
    USE_GENERIC_SMP_HELPERS rather than leaving the choice to the user,
    since it doesn't provide its own implementation.

    Also, move on_each_cpu() to kernel/smp.c; it is strange to put it in
    kernel/softirq.c.

    For an arch which doesn't use USE_GENERIC_SMP_HELPERS, e.g. blackfin,
    only on_each_cpu() is compiled.

    Signed-off-by: Amerigo Wang
    Cc: David Howells
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: Yinghai Lu
    Cc: Peter Zijlstra
    Cc: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Amerigo Wang
     

28 Oct, 2010

1 commit

  • Typedef the pointer to the function to be called by smp_call_function() and
    friends:

    typedef void (*smp_call_func_t)(void *info);

    as it is used in a fair number of places.

    Signed-off-by: David Howells
    cc: linux-arch@vger.kernel.org

    David Howells
     

10 Sep, 2010

1 commit

    Just got my 6-way machine to a state where cpu 0 is in an
    endless loop within __smp_call_function_single.
    All other cpus are idle.

    The call trace on cpu 0 looks like this:

    __smp_call_function_single
    scheduler_tick
    update_process_times
    tick_sched_timer
    __run_hrtimer
    hrtimer_interrupt
    clock_comparator_work
    do_extint
    ext_int_handler
    ----> timer irq
    cpu_idle

    __smp_call_function_single() got called from nohz_balancer_kick()
    (inlined) with the remote cpu being 1, wait being 0 and the per
    cpu variable remote_sched_softirq_cb (call_single_data) of the
    current cpu (0).

    Then it loops forever when it tries to grab the lock of the
    call_single_data, since it is already locked and enqueued on cpu 0.

    My theory of how this could have happened: for some reason the
    scheduler decided to call __smp_call_function_single() on its own
    cpu, and sends an IPI to itself. The interrupt stays pending
    since IRQs are disabled. If then the hypervisor schedules the
    cpu away it might happen that upon rescheduling both the IPI and
    the timer IRQ are pending. If then interrupts are enabled again
    it depends which one gets scheduled first.
    If the timer interrupt gets delivered first we end up with the
    local deadlock as seen in the calltrace above.

    Let's make __smp_call_function_single() check if the target cpu is
    the current cpu and execute the function immediately just like
    smp_call_function_single does. That should prevent at least the
    scenario described here.
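
    A rough sketch of the proposed check, mirroring what
    smp_call_function_single() already does for the local cpu (helper names
    as in kernel/smp.c; simplified):

    void __smp_call_function_single(int cpu, struct call_single_data *data,
                                    int wait)
    {
            unsigned int this_cpu = get_cpu();
            unsigned long flags;

            if (cpu == this_cpu) {
                    /* run locally instead of sending an IPI to ourselves */
                    local_irq_save(flags);
                    data->func(data->info);
                    local_irq_restore(flags);
            } else {
                    csd_lock(data);
                    generic_exec_single(cpu, data, wait);
            }
            put_cpu();
    }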

    It might also be that the scheduler is not supposed to call
    __smp_call_function_single with the remote cpu being the current
    cpu, but that is a different issue.

    Signed-off-by: Heiko Carstens
    Acked-by: Peter Zijlstra
    Acked-by: Jens Axboe
    Cc: Venkatesh Pallipadi
    Cc: Suresh Siddha
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Heiko Carstens
     

28 May, 2010

1 commit


30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include
    those headers directly instead of assuming availability. As this
    conversion needs to touch a large number of source files, the following
    script is used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. i.e. if only gfp is used,
    gfp.h; if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surroundings. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have a fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    wildly available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

18 Jan, 2010

1 commit


17 Jan, 2010

1 commit

  • The change in acpi_cpufreq to use smp_call_function_any causes a warning
    when it is called since the function erroneously passes the cpu id to
    cpumask_of_node rather than the node that the cpu is on. Fix this.
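
    The fix amounts to passing the node rather than the cpu (sketch of the
    intended call in smp_call_function_any()):

    /* was: cpumask_of_node(cpu) */
    nodemask = cpumask_of_node(cpu_to_node(cpu));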

    cpumask_of_node(3): node > nr_node_ids(1)
    Pid: 1, comm: swapper Not tainted 2.6.33-rc3-00097-g2c1f189 #223
    Call Trace:
    [] cpumask_of_node+0x23/0x58
    [] smp_call_function_any+0x65/0xfa
    [] ? do_drv_read+0x0/0x2f
    [] get_cur_val+0xb0/0x102
    [] get_cur_freq_on_cpu+0x74/0xc5
    [] acpi_cpufreq_cpu_init+0x417/0x515
    [] ? __down_write+0xb/0xd
    [] cpufreq_add_dev+0x278/0x922

    Signed-off-by: David John
    Cc: Suresh Siddha
    Cc: Rusty Russell
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David John
     

16 Dec, 2009

2 commits

  • …el/git/tip/linux-2.6-tip

    * 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (26 commits)
    clockevents: Convert to raw_spinlock
    clockevents: Make tick_device_lock static
    debugobjects: Convert to raw_spinlocks
    perf_event: Convert to raw_spinlock
    hrtimers: Convert to raw_spinlocks
    genirq: Convert irq_desc.lock to raw_spinlock
    smp: Convert smplocks to raw_spinlocks
    rtmutes: Convert rtmutex.lock to raw_spinlock
    sched: Convert pi_lock to raw_spinlock
    sched: Convert cpupri lock to raw_spinlock
    sched: Convert rt_runtime_lock to raw_spinlock
    sched: Convert rq->lock to raw_spinlock
    plist: Make plist debugging raw_spinlock aware
    bkl: Fixup core_lock fallout
    locking: Cleanup the name space completely
    locking: Further name space cleanups
    alpha: Fix fallout from locking changes
    locking: Implement new raw_spinlock
    locking: Convert raw_rwlock functions to arch_rwlock
    locking: Convert raw_rwlock to arch_rwlock
    ...

    Linus Torvalds
     
    Use smp_processor_id() instead of get_cpu() and put_cpu() in
    generic_smp_call_function_interrupt(). There is no need to disable
    preemption, because generic_smp_call_function_interrupt() must be
    called with interrupts disabled.

    Signed-off-by: Xiao Guangrong
    Acked-by: Ingo Molnar
    Cc: Jens Axboe
    Cc: Nick Piggin
    Cc: Peter Zijlstra
    Cc: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xiao Guangrong
     

15 Dec, 2009

1 commit


18 Nov, 2009

1 commit

  • Andrew points out that acpi-cpufreq uses cpumask_any, when it really
    would prefer to use the same CPU if possible (to avoid an IPI). In
    general, this seems a good idea to offer.

    [ tglx: Documented selection preference and Inlined the UP case to
    avoid the copy of smp_call_function_single() and the extra
    EXPORT ]

    Signed-off-by: Rusty Russell
    Cc: Ingo Molnar
    Cc: Venkatesh Pallipadi
    Cc: Len Brown
    Cc: Zhao Yakui
    Cc: Dave Jones
    Cc: Thomas Gleixner
    Cc: Mike Galbraith
    Cc: "Zhang, Yanmin"
    Signed-off-by: Andrew Morton
    Signed-off-by: Thomas Gleixner

    Rusty Russell
     

23 Oct, 2009

1 commit


24 Sep, 2009

1 commit


23 Sep, 2009

1 commit

    This patch can remove the spinlock from struct call_function_data; the
    reasons are below:

    1: Add a new interface for cpumask named cpumask_test_and_clear_cpu(),
    which atomically tests and clears a specific cpu. We can use it instead
    of cpumask_test_cpu() plus cpumask_clear_cpu(), and then data->lock is
    no longer needed to protect them in
    generic_smp_call_function_interrupt().

    2: In smp_call_function_many(), after csd_lock() returns, the current
    cpu's cfd_data has been deleted from the call_function list, so there is
    no race with other cpus. cfd_data is only used in
    smp_call_function_many(), which must be called with preemption disabled
    and not from a hardware interrupt handler or a bottom half handler, so
    only the corresponding cpu can use it and there is no race on the
    current cpu either; cfd_data->lock is not needed to protect it.

    3: After 1 and 2, cfd_data->lock is only used to protect cfd_data->refs
    in generic_smp_call_function_interrupt(), so we can make cfd_data->refs
    an atomic_t and drop cfd_data->lock entirely.
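
    A condensed sketch of the resulting handler step (illustrative):

    /* atomically claim our bit; data->lock is not needed any more */
    if (!cpumask_test_and_clear_cpu(cpu, data->cpumask))
            continue;

    data->csd.func(data->csd.info);

    refs = atomic_dec_return(&data->refs);  /* refs is now an atomic_t */
    if (refs == 0) {
            /* last cpu: the element can be removed from the queue and unlocked */
    }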

    Signed-off-by: Xiao Guangrong
    Cc: Ingo Molnar
    Cc: Jens Axboe
    Cc: Nick Piggin
    Cc: Peter Zijlstra
    Acked-by: Rusty Russell
    [akpm@linux-foundation.org: use atomic_dec_return()]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xiao Guangrong
     

27 Aug, 2009

1 commit


22 Aug, 2009

1 commit


08 Aug, 2009

1 commit

  • Use CONFIG_HOTPLUG_CPU, not CONFIG_CPU_HOTPLUG

    When hot-unplugging a cpu, it will leak memory allocated at cpu hotplug,
    but only if CPUMASK_OFFSTACK=y, which defaults to n.

    The bug was introduced by 8969a5ede0f9e17da4b943712429aef2c9bcd82b
    ("generic-ipi: remove kmalloc()").

    Signed-off-by: Xiao Guangrong
    Cc: Ingo Molnar
    Cc: Jens Axboe
    Cc: Nick Piggin
    Cc: Peter Zijlstra
    Cc: Rusty Russell
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xiao Guangrong
     

09 Jun, 2009

1 commit


13 Mar, 2009

1 commit


25 Feb, 2009

4 commits

  • Andrew pointed out that there's some small amount of
    style rot in kernel/smp.c.

    Clean it up.

    Reported-by: Andrew Morton
    Cc: Nick Piggin
    Cc: Jens Axboe
    Cc: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Oleg noticed that we don't strictly need CSD_FLAG_WAIT, rework
    the code so that we can use CSD_FLAG_LOCK for both purposes.

    Signed-off-by: Peter Zijlstra
    Cc: Oleg Nesterov
    Cc: Linus Torvalds
    Cc: Nick Piggin
    Cc: Jens Axboe
    Cc: "Paul E. McKenney"
    Cc: Rusty Russell
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Remove the use of kmalloc() from the smp_call_function_*()
    calls.

    Steven's generic-ipi patch (d7240b98: generic-ipi: use per cpu
    data for single cpu ipi calls) started the discussion on the use
    of kmalloc() in this code and fixed the
    smp_call_function_single(.wait=0) fallback case.

    In this patch we complete this by also providing means for the
    _many() call, which fully removes the need for kmalloc() in this
    code.

    The problem with the _many() call is that other cpus might still
    be observing our entry when we're done with it. The old code solved this
    by dynamically allocating data elements and RCU-freeing them.

    We solve it by using a single per-cpu entry which provides
    static storage and solves one half of the problem (avoiding
    referencing freed data).

    The other half, ensuring that queue iteration is still possible,
    is done by placing re-used entries at the head of the list. This
    means that if someone was still iterating that entry when it got
    moved, he will now re-visit the entries on the list he had
    already seen, but avoids skipping over entries like would have
    happened had we placed the new entry at the end.

    Furthermore, visiting entries twice is not a problem, since we
    remove our cpu from the entry's cpumask once it's called.
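
    In code terms the head-of-list placement is simply (sketch):

    /* re-add the reused per-cpu entry at the HEAD of the queue so that a
     * concurrent iterator re-visits entries rather than skipping any */
    list_add_rcu(&data->csd.list, &call_function.queue);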

    Many thanks to Oleg for his suggestions and him poking holes in
    my earlier attempts.

    Signed-off-by: Peter Zijlstra
    Cc: Oleg Nesterov
    Cc: Linus Torvalds
    Cc: Nick Piggin
    Cc: Jens Axboe
    Cc: "Paul E. McKenney"
    Cc: Rusty Russell
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Simplify the barriers in generic remote function call interrupt
    code.

    Firstly, just unconditionally take the lock and check the list
    in the generic_call_function_single_interrupt IPI handler. As
    we've just taken an IPI here, the chances are fairly high that
    there will be work on the list for us, so do the locking
    unconditionally. This removes the tricky lockless list_empty
    check and dubious barriers. The change looks bigger than it is
    because it is just removing an outer loop.

    Secondly, clarify architecture specific IPI locking rules.
    Generic code has no tools to impose any sane ordering on IPIs if
    they go outside normal cache coherency, ergo the arch code must
    make them appear to obey cache coherency as a "memory operation"
    to initiate an IPI, and a "memory operation" to receive one.
    This way at least they can be reasoned about in generic code,
    and smp_mb used to provide ordering.

    The combination of these two changes means that explicit barriers
    can be taken out of queue handling for the single case -- shared
    data is explicitly locked, and ipi ordering must conform to
    that, so no barriers are needed. An extra barrier is needed in the
    many handler, so as to ensure we load the list element after the
    IPI is received.

    Does any architecture actually *need* these barriers? For the
    initiator I could see it, but for the handler I would be
    surprised. So the other thing we could do for simplicity is just
    to require that, rather than just matching with cache coherency,
    we just require a full barrier before generating an IPI, and
    after receiving an IPI. In which case, the smp_mb()s can go
    away. But just for now, we'll be on the safe side and use the
    barriers (they're in the slow case anyway).

    Signed-off-by: Nick Piggin
    Acked-by: Peter Zijlstra
    Cc: linux-arch@vger.kernel.org
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Jens Axboe
    Cc: Oleg Nesterov
    Cc: Suresh Siddha
    Signed-off-by: Ingo Molnar

    Nick Piggin
     

31 Jan, 2009

1 commit

  • The smp_call_function can be passed a wait parameter telling it to
    wait for all the functions running on other CPUs to complete before
    returning, or to return without waiting. Unfortunately, this is
    currently just a suggestion and not mandatory. That is, the
    smp_call_function can decide not to return and wait instead.

    The reason for this is because it uses kmalloc to allocate storage
    to send to the called CPU and that CPU will free it when it is done.
    But if we fail to allocate the storage, the stack is used instead.
    This means we must wait for the called CPU to finish before
    continuing.

    Unfortunately, some callers do not abide by this hint and act as if
    the non-wait option is mandatory. The MTRR code for instance will
    deadlock if the smp_call_function is set to wait. This is because
    the smp_call_function will wait for the other CPUs to finish their
    called functions, but those functions are waiting on the caller to
    continue.

    This patch changes the generic smp_call_function code to use per cpu
    variables if the allocation of the data fails for a single CPU call. The
    smp_call_function_many will fall back to the smp_call_function_single
    if it fails its alloc. The smp_call_function_single is modified
    to not force the wait state.

    Since we now are using a single data per cpu we must synchronize the
    callers to prevent a second caller modifying the data before the
    first called IPI functions complete. To do so, I added a flag to
    the call_single_data called CSD_FLAG_LOCK. When the single CPU is
    called (which can be called when a many call fails an alloc), we
    set the LOCK bit on this per cpu data. When the caller finishes
    it clears the LOCK bit.

    The caller must wait till the LOCK bit is cleared before setting
    it. When it is cleared, there is no IPI function using it.
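
    A rough sketch of the per-cpu single-call fallback described above
    (variable names are illustrative where the text doesn't give them):

    data = &per_cpu(csd_data, smp_processor_id());

    /* wait until no in-flight IPI function is still using this slot */
    while (data->flags & CSD_FLAG_LOCK)
            cpu_relax();
    data->flags = CSD_FLAG_LOCK;    /* cleared again once the IPI function is done */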

    Signed-off-by: Steven Rostedt
    Signed-off-by: Peter Zijlstra
    Acked-by: Jens Axboe
    Acked-by: Linus Torvalds
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     

01 Jan, 2009

1 commit


30 Dec, 2008

2 commits

  • Impact: new API to reduce stack usage

    We're weaning the core code off handing cpumask's around on-stack.
    This introduces arch_send_call_function_ipi_mask().

    Signed-off-by: Rusty Russell

    Rusty Russell
     
  • Impact: Implementation change to remove cpumask_t from stack.

    Actually change smp_call_function_mask() to smp_call_function_many().
    We avoid cpumasks on the stack in this version.

    (S390 has its own version, but that's going away apparently).

    We have to do some dancing to figure out if 0 or 1 other cpus are in
    the mask supplied and the online mask without allocating a tmp
    cpumask. It's still fairly cheap.

    We allocate the cpumask at the end of the call_function_data
    structure: if allocation fails we fallback to smp_call_function_single
    rather than using the baroque quiescing code (which needs a cpumask on
    stack).

    (Thanks to Hiroshi Shimamoto for spotting several bugs in previous versions!)

    Signed-off-by: Rusty Russell
    Signed-off-by: Mike Travis
    Cc: Hiroshi Shimamoto
    Cc: npiggin@suse.de
    Cc: axboe@kernel.dk

    Rusty Russell
     

06 Nov, 2008

1 commit

  • smp_mb() is needed (to make the memory operations visible globally) before
    sending the ipi on the sender and the receiver (on Alpha at least) needs
    smp_read_barrier_depends() in the handler before reading the call_single_queue
    list in a lock-free fashion.

    On x86, x2apic mode register accesses for sending IPI's don't have serializing
    semantics. So the need for smp_mb() before sending the IPI becomes more
    critical in x2apic mode.

    Remove the unnecessary smp_mb() in csd_flag_wait(), as the presence of that
    smp_mb() doesn't mean anything on the sender, when the ipi receiver is not
    doing anything special (like a memory fence) after clearing the CSD_FLAG_WAIT.
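
    In code terms the pairing looks roughly like this (sketch):

    /* sender: make the queue updates globally visible before the IPI */
    smp_mb();
    arch_send_call_function_single_ipi(cpu);

    /* receiver (matters on Alpha): order the dependent loads when walking
     * the call_single_queue list lock-free */
    smp_read_barrier_depends();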

    Signed-off-by: Suresh Siddha
    Signed-off-by: Jens Axboe

    Suresh Siddha
     

26 Aug, 2008

1 commit

    Have smp_call_function_single() detect invalid CPU indices and return
    -ENXIO. This function is already executed inside a
    get_cpu()..put_cpu() pair which locks out CPU removal, so rather than
    having the higher layers do another layer of locking to guard
    against unplugged CPUs, do the test here.
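
    A hedged sketch of the shape of the check (simplified; the error value
    is per the description above):

    int smp_call_function_single(int cpu, void (*func)(void *info), void *info,
                                 int wait)
    {
            int this_cpu = get_cpu();       /* also blocks cpu unplug */
            int err = 0;

            if (cpu == this_cpu) {
                    /* run locally with irqs disabled */
            } else if ((unsigned int)cpu < nr_cpu_ids && cpu_online(cpu)) {
                    /* queue the call and send the IPI */
            } else {
                    err = -ENXIO;           /* invalid or offline cpu */
            }

            put_cpu();
            return err;
    }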

    Signed-off-by: H. Peter Anvin

    H. Peter Anvin