17 Jan, 2010

1 commit

  • The change in acpi_cpufreq to use smp_call_function_any causes a warning
    when it is called since the function erroneously passes the cpu id to
    cpumask_of_node rather than the node that the cpu is on. Fix this.

    cpumask_of_node(3): node > nr_node_ids(1)
    Pid: 1, comm: swapper Not tainted 2.6.33-rc3-00097-g2c1f189 #223
    Call Trace:
    [] cpumask_of_node+0x23/0x58
    [] smp_call_function_any+0x65/0xfa
    [] ? do_drv_read+0x0/0x2f
    [] get_cur_val+0xb0/0x102
    [] get_cur_freq_on_cpu+0x74/0xc5
    [] acpi_cpufreq_cpu_init+0x417/0x515
    [] ? __down_write+0xb/0xd
    [] cpufreq_add_dev+0x278/0x922

    Signed-off-by: David John
    Cc: Suresh Siddha
    Cc: Rusty Russell
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David John
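
The bug described above can be illustrated outside the kernel. The following is a minimal userspace sketch, not the actual patch: the `toy_*` names and the 4-CPU, single-node topology are assumptions chosen to match the warning `cpumask_of_node(3): node > nr_node_ids(1)`. It shows why a cpu id must first be translated to its node before a node mask is looked up.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical toy topology: 4 CPUs, all on a single NUMA node (node 0). */
#define TOY_NR_CPUS  4
#define TOY_NR_NODES 1

static const int toy_cpu_node[TOY_NR_CPUS] = { 0, 0, 0, 0 };

/* One bit per CPU, one mask per node. */
static const unsigned long toy_node_mask[TOY_NR_NODES] = { 0xFUL };

/* Stand-in for cpu_to_node(): which node does this CPU live on? */
static int toy_cpu_to_node(int cpu)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	return toy_cpu_node[cpu];
}

/*
 * Stand-in for cpumask_of_node(): complains and returns an empty mask
 * when handed something that is not a valid node id.
 */
static unsigned long toy_cpumask_of_node(int node)
{
	if (node < 0 || node >= TOY_NR_NODES) {
		fprintf(stderr, "toy_cpumask_of_node(%d): node > nr_node_ids(%d)\n",
			node, TOY_NR_NODES);
		return 0;
	}
	return toy_node_mask[node];
}
```

Calling `toy_cpumask_of_node(3)` with cpu id 3 triggers the complaint, while `toy_cpumask_of_node(toy_cpu_to_node(3))` returns the mask of node 0.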
     

16 Dec, 2009

2 commits

  • …el/git/tip/linux-2.6-tip

    * 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (26 commits)
    clockevents: Convert to raw_spinlock
    clockevents: Make tick_device_lock static
    debugobjects: Convert to raw_spinlocks
    perf_event: Convert to raw_spinlock
    hrtimers: Convert to raw_spinlocks
    genirq: Convert irq_desc.lock to raw_spinlock
    smp: Convert smplocks to raw_spinlocks
    rtmutes: Convert rtmutex.lock to raw_spinlock
    sched: Convert pi_lock to raw_spinlock
    sched: Convert cpupri lock to raw_spinlock
    sched: Convert rt_runtime_lock to raw_spinlock
    sched: Convert rq->lock to raw_spinlock
    plist: Make plist debugging raw_spinlock aware
    bkl: Fixup core_lock fallout
    locking: Cleanup the name space completely
    locking: Further name space cleanups
    alpha: Fix fallout from locking changes
    locking: Implement new raw_spinlock
    locking: Convert raw_rwlock functions to arch_rwlock
    locking: Convert raw_rwlock to arch_rwlock
    ...

    Linus Torvalds
     
  • Use smp_processor_id() instead of get_cpu() and put_cpu() in
    generic_smp_call_function_interrupt(). There is no need to disable
    preemption, because generic_smp_call_function_interrupt() must be called
    with interrupts disabled.

    Signed-off-by: Xiao Guangrong
    Acked-by: Ingo Molnar
    Cc: Jens Axboe
    Cc: Nick Piggin
    Cc: Peter Zijlstra
    Cc: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xiao Guangrong
     

15 Dec, 2009

1 commit


18 Nov, 2009

1 commit

  • Andrew points out that acpi-cpufreq uses cpumask_any, when it really
    would prefer to use the same CPU if possible (to avoid an IPI). In
    general, this seems a good idea to offer.

    [ tglx: Documented selection preference and Inlined the UP case to
    avoid the copy of smp_call_function_single() and the extra
    EXPORT ]

    Signed-off-by: Rusty Russell
    Cc: Ingo Molnar
    Cc: Venkatesh Pallipadi
    Cc: Len Brown
    Cc: Zhao Yakui
    Cc: Dave Jones
    Cc: Thomas Gleixner
    Cc: Mike Galbraith
    Cc: "Zhang, Yanmin"
    Signed-off-by: Andrew Morton
    Signed-off-by: Thomas Gleixner

    Rusty Russell
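
A rough userspace sketch of the selection preference follows. It is not the kernel implementation: the `toy_*` names, the 8-CPU two-node topology and the 32-bit masks are assumptions. The order (current CPU first, then a CPU on the current node, then any online CPU in the mask) follows the preference as the kernel source documents it, and the node-local step is not spelled out in the message above.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NR_CPUS 8

/* Hypothetical topology: CPUs 0-3 on node 0, CPUs 4-7 on node 1. */
static const uint32_t toy_node_cpus[2] = { 0x0Fu, 0xF0u };

static int toy_node_of(int cpu)
{
	return cpu < 4 ? 0 : 1;
}

/* Index of the lowest set bit, or -1 if the mask is empty. */
static int toy_first_cpu(uint32_t mask)
{
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if (mask & (UINT32_C(1) << cpu))
			return cpu;
	return -1;
}

/*
 * Pick a CPU from `mask` on which to run a function:
 *   1) the current CPU, if it is in the mask (no IPI needed),
 *   2) otherwise a CPU on the current CPU's node,
 *   3) otherwise any online CPU in the mask.
 * Returns -1 if no online CPU is in the mask.
 */
static int toy_pick_cpu(int this_cpu, uint32_t mask, uint32_t online)
{
	assert(this_cpu >= 0 && this_cpu < TOY_NR_CPUS);

	uint32_t usable = mask & online;

	if (usable & (UINT32_C(1) << this_cpu))
		return this_cpu;

	int cpu = toy_first_cpu(usable & toy_node_cpus[toy_node_of(this_cpu)]);
	if (cpu >= 0)
		return cpu;

	return toy_first_cpu(usable);
}
```

For example, with CPU 2 as the caller and a mask containing CPUs 3 and 4, the sketch picks CPU 3 because it shares node 0 with the caller.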
     

23 Oct, 2009

1 commit


24 Sep, 2009

1 commit


23 Sep, 2009

1 commit

  • This patch removes the spinlock from struct call_function_data, for the
    following reasons:

    1: Add a new cpumask interface, cpumask_test_and_clear_cpu(), which
    atomically tests and clears a specific cpu. It replaces the
    cpumask_test_cpu() and cpumask_clear_cpu() pair, so data->lock is no
    longer needed to protect them in generic_smp_call_function_interrupt().

    2: In smp_call_function_many(), after csd_lock() returns, the current
    cpu's cfd_data has been deleted from the call_function list, so it cannot
    race with other cpus. cfd_data is then only used in
    smp_call_function_many(), which must be called with preemption disabled
    and not from a hardware interrupt handler or a bottom half handler. Only
    the corresponding cpu can use it, so there is no race on the current
    cpu either, and cfd_data->lock is not needed to protect it.

    3: After 1 and 2, cfd_data->lock is only used to protect cfd_data->refs in
    generic_smp_call_function_interrupt(), so cfd_data->refs can be made an
    atomic_t and cfd_data->lock is no longer needed.

    Signed-off-by: Xiao Guangrong
    Cc: Ingo Molnar
    Cc: Jens Axboe
    Cc: Nick Piggin
    Cc: Peter Zijlstra
    Acked-by: Rusty Russell
    [akpm@linux-foundation.org: use atomic_dec_return()]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xiao Guangrong
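
The idea in points 1 and 3 can be sketched in userspace with C11 atomics. This is an illustration under assumptions, not the kernel code: `struct toy_cfd` and the `toy_*` helpers are hypothetical stand-ins, the cpumask is a single word, and `atomic_fetch_sub(...) - 1` plays the role of atomic_dec_return().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct call_function_data, minus the spinlock. */
struct toy_cfd {
	atomic_ulong cpumask;	/* one bit per CPU that still has to run the call */
	atomic_int   refs;	/* number of CPUs that have not finished yet */
};

/* Atomically clear this CPU's bit and report whether it was set before. */
static bool toy_test_and_clear_cpu(struct toy_cfd *d, int cpu)
{
	unsigned long bit = 1UL << cpu;
	unsigned long old = atomic_fetch_and(&d->cpumask, ~bit);

	return (old & bit) != 0;
}

/*
 * What an interrupt handler would do for one queue entry.
 * Returns -1 if the entry is not (or no longer) meant for this CPU,
 *          0 if the call was handled and other CPUs are still pending,
 *          1 if this CPU was the last one to finish.
 */
static int toy_handle_entry(struct toy_cfd *d, int cpu)
{
	assert(cpu >= 0 && cpu < (int)(8 * sizeof(unsigned long)));

	if (!toy_test_and_clear_cpu(d, cpu))
		return -1;

	/* ... the queued function would be called here ... */

	/* Equivalent of atomic_dec_return(): new value after the decrement. */
	int left = atomic_fetch_sub(&d->refs, 1) - 1;

	return left == 0 ? 1 : 0;
}
```

Because the test and the clear happen in one atomic operation, a CPU that visits the same entry twice sees its bit already cleared and skips the entry, and no lock is needed around either the mask or the reference count.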
     

27 Aug, 2009

1 commit


22 Aug, 2009

1 commit


08 Aug, 2009

1 commit

  • Use CONFIG_HOTPLUG_CPU, not CONFIG_CPU_HOTPLUG

    When hot-unplugging a cpu, it will leak memory allocated at cpu hotplug,
    but only if CPUMASK_OFFSTACK=y, which defaults to n.

    The bug was introduced by 8969a5ede0f9e17da4b943712429aef2c9bcd82b
    ("generic-ipi: remove kmalloc()").

    Signed-off-by: Xiao Guangrong
    Cc: Ingo Molnar
    Cc: Jens Axboe
    Cc: Nick Piggin
    Cc: Peter Zijlstra
    Cc: Rusty Russell
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xiao Guangrong
     

09 Jun, 2009

1 commit


13 Mar, 2009

1 commit


25 Feb, 2009

4 commits

  • Andrew pointed out that there's some small amount of
    style rot in kernel/smp.c.

    Clean it up.

    Reported-by: Andrew Morton
    Cc: Nick Piggin
    Cc: Jens Axboe
    Cc: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Oleg noticed that we don't strictly need CSD_FLAG_WAIT, rework
    the code so that we can use CSD_FLAG_LOCK for both purposes.

    Signed-off-by: Peter Zijlstra
    Cc: Oleg Nesterov
    Cc: Linus Torvalds
    Cc: Nick Piggin
    Cc: Jens Axboe
    Cc: "Paul E. McKenney"
    Cc: Rusty Russell
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Remove the use of kmalloc() from the smp_call_function_*()
    calls.

    Steven's generic-ipi patch (d7240b98: generic-ipi: use per cpu
    data for single cpu ipi calls) started the discussion on the use
    of kmalloc() in this code and fixed the
    smp_call_function_single(.wait=0) fallback case.

    In this patch we complete this by also providing means for the
    _many() call, which fully removes the need for kmalloc() in this
    code.

    The problem with the _many() call is that other cpus might still
    be observing our entry when we're done with it. It solved this
    by dynamically allocating data elements and RCU-freeing it.

    We solve it by using a single per-cpu entry which provides
    static storage and solves one half of the problem (avoiding
    referencing freed data).

    The other half, ensuring the queue iteration is still possible,
    is done by placing re-used entries at the head of the list. This
    means that if someone was still iterating that entry when it got
    moved, he will now re-visit the entries on the list he had
    already seen, but avoids skipping over entries like would have
    happened had we placed the new entry at the end.

    Furthermore, visiting entries twice is not a problem, since we
    remove our cpu from the entry's cpumask once it's called.

    Many thanks to Oleg for his suggestions and him poking holes in
    my earlier attempts.

    Signed-off-by: Peter Zijlstra
    Cc: Oleg Nesterov
    Cc: Linus Torvalds
    Cc: Nick Piggin
    Cc: Jens Axboe
    Cc: "Paul E. McKenney"
    Cc: Rusty Russell
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Simplify the barriers in generic remote function call interrupt
    code.

    Firstly, just unconditionally take the lock and check the list
    in the generic_call_function_single_interrupt IPI handler. As
    we've just taken an IPI here, the chances are fairly high that
    there will be work on the list for us, so do the locking
    unconditionally. This removes the tricky lockless list_empty
    check and dubious barriers. The change looks bigger than it is
    because it is just removing an outer loop.

    Secondly, clarify architecture specific IPI locking rules.
    Generic code has no tools to impose any sane ordering on IPIs if
    they go outside normal cache coherency, ergo the arch code must
    make them appear to obey cache coherency as a "memory operation"
    to initiate an IPI, and a "memory operation" to receive one.
    This way at least they can be reasoned about in generic code,
    and smp_mb used to provide ordering.

    The combination of these two changes means that explicit barriers
    can be taken out of queue handling for the single case -- shared
    data is explicitly locked, and ipi ordering must conform to
    that, so no barriers needed. An extra barrier is needed in the
    many handler, so as to ensure we load the list element after the
    IPI is received.

    Does any architecture actually *need* these barriers? For the
    initiator I could see it, but for the handler I would be
    surprised. So the other thing we could do for simplicity is just
    to require that, rather than just matching with cache coherency,
    we just require a full barrier before generating an IPI, and
    after receiving an IPI. In which case, the smp_mb()s can go
    away. But just for now, we'll be on the safe side and use the
    barriers (they're in the slow case anyway).

    Signed-off-by: Nick Piggin
    Acked-by: Peter Zijlstra
    Cc: linux-arch@vger.kernel.org
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Jens Axboe
    Cc: Oleg Nesterov
    Cc: Suresh Siddha
    Signed-off-by: Ingo Molnar

    Nick Piggin
     

31 Jan, 2009

1 commit

  • The smp_call_function can be passed a wait parameter telling it to
    wait for all the functions running on other CPUs to complete before
    returning, or to return without waiting. Unfortunately, this is
    currently just a suggestion and not mandatory. That is, the
    smp_call_function can decide not to return and wait instead.

    The reason for this is because it uses kmalloc to allocate storage
    to send to the called CPU and that CPU will free it when it is done.
    But if we fail to allocate the storage, the stack is used instead.
    This means we must wait for the called CPU to finish before
    continuing.

    Unfortunately, some callers do not abide by this hint and act as if
    the non-wait option is mandatory. The MTRR code for instance will
    deadlock if the smp_call_function is set to wait. This is because
    the smp_call_function will wait for the other CPUs to finish their
    called functions, but those functions are waiting on the caller to
    continue.

    This patch changes the generic smp_call_function code to use per cpu
    variables if the allocation of the data fails for a single CPU call. The
    smp_call_function_many will fall back to the smp_call_function_single
    if it fails its alloc. The smp_call_function_single is modified
    to not force the wait state.

    Since we now are using a single data per cpu we must synchronize the
    callers to prevent a second caller modifying the data before the
    first called IPI functions complete. To do so, I added a flag to
    the call_single_data called CSD_FLAG_LOCK. When the single CPU is
    called (which can be called when a many call fails an alloc), we
    set the LOCK bit on this per cpu data. When the caller finishes
    it clears the LOCK bit.

    The caller must wait till the LOCK bit is cleared before setting
    it. When it is cleared, there is no IPI function using it.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Peter Zijlstra
    Acked-by: Jens Axboe
    Acked-by: Linus Torvalds
    Signed-off-by: Ingo Molnar

    Steven Rostedt
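
The LOCK-bit handshake can be sketched in userspace as follows. This is a simplified model under assumptions, not the patch itself: the `toy_*` names are hypothetical, C11 atomics stand in for the kernel's flag handling, and the sketch takes the reading that the handler on the called CPU clears the bit once the function has run.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define TOY_CSD_FLAG_LOCK 0x01u

/* Hypothetical stand-in for the per-cpu call_single_data. */
struct toy_csd {
	void (*func)(void *info);
	void *info;
	atomic_uint flags;
};

static void toy_csd_init(struct toy_csd *csd)
{
	csd->func = 0;
	csd->info = 0;
	atomic_init(&csd->flags, 0);
}

static bool toy_csd_locked(struct toy_csd *csd)
{
	return atomic_load_explicit(&csd->flags, memory_order_acquire)
		& TOY_CSD_FLAG_LOCK;
}

/* Wait until no IPI function is using the data, then claim it. */
static void toy_csd_lock(struct toy_csd *csd)
{
	while (toy_csd_locked(csd))
		;	/* the kernel would use cpu_relax() here */
	atomic_store_explicit(&csd->flags, TOY_CSD_FLAG_LOCK,
			      memory_order_relaxed);
}

/* What the called CPU does: run the function, then release the data. */
static void toy_csd_handle(struct toy_csd *csd)
{
	assert(toy_csd_locked(csd));
	csd->func(csd->info);
	atomic_store_explicit(&csd->flags, 0, memory_order_release);
}

/* Example callback: bump an int counter passed through `info`. */
static void toy_count(void *info)
{
	++*(int *)info;
}
```

A sender calls `toy_csd_lock()`, fills in `func` and `info`, and "sends the IPI". The same per-cpu entry can only be claimed again after `toy_csd_handle()` has cleared the bit, which is what lets the non-wait case return without touching freed or reused data.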
     

01 Jan, 2009

1 commit


30 Dec, 2008

2 commits

  • Impact: new API to reduce stack usage

    We're weaning the core code off handing cpumasks around on-stack.
    This introduces arch_send_call_function_ipi_mask().

    Signed-off-by: Rusty Russell

    Rusty Russell
     
  • Impact: Implementation change to remove cpumask_t from stack.

    Actually change smp_call_function_mask() to smp_call_function_many().
    We avoid cpumasks on the stack in this version.

    (S390 has its own version, but that's going away apparently).

    We have to do some dancing to figure out if 0 or 1 other cpus are in
    the mask supplied and the online mask without allocating a tmp
    cpumask. It's still fairly cheap.

    We allocate the cpumask at the end of the call_function_data
    structure: if allocation fails we fall back to smp_call_function_single
    rather than using the baroque quiescing code (which needs a cpumask on
    stack).

    (Thanks to Hiroshi Shimamoto for spotting several bugs in previous versions!)

    Signed-off-by: Rusty Russell
    Signed-off-by: Mike Travis
    Cc: Hiroshi Shimamoto
    Cc: npiggin@suse.de
    Cc: axboe@kernel.dk

    Rusty Russell
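
The "dancing" to detect 0 or 1 other cpus without a temporary cpumask can be sketched like this. It is a userspace approximation under assumptions: masks are plain 32-bit words, and `toy_next_and()` is a hypothetical helper standing in for an iterate-over-the-intersection primitive.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NR_CPUS 32

/*
 * Next CPU after `prev` that is set in both masks, or TOY_NR_CPUS if
 * there is none. Pass -1 to start from the beginning.
 */
static int toy_next_and(int prev, uint32_t a, uint32_t b)
{
	for (int cpu = prev + 1; cpu < TOY_NR_CPUS; cpu++)
		if ((a & b) & (UINT32_C(1) << cpu))
			return cpu;
	return TOY_NR_CPUS;
}

/*
 * Classify how many CPUs other than `this_cpu` are in mask & online,
 * without building the intersection as a separate mask object:
 * returns 0, 1, or 2 (meaning "two or more"). When the result is
 * non-zero, *first holds the first such CPU.
 */
static int toy_count_targets(int this_cpu, uint32_t mask, uint32_t online,
			     int *first)
{
	assert(this_cpu >= 0 && this_cpu < TOY_NR_CPUS);

	/* First candidate, skipping ourselves. */
	int cpu = toy_next_and(-1, mask, online);
	if (cpu == this_cpu)
		cpu = toy_next_and(cpu, mask, online);
	if (cpu >= TOY_NR_CPUS)
		return 0;		/* nobody else to call */
	*first = cpu;

	/* Is there a second candidate, again skipping ourselves? */
	int next = toy_next_and(cpu, mask, online);
	if (next == this_cpu)
		next = toy_next_and(next, mask, online);

	return next >= TOY_NR_CPUS ? 1 : 2;
}
```

A result of 0 means there is nothing to do, and a result of 1 means the call can be handed to the single-CPU path for `*first`. Only the "two or more" case needs the full mask-based machinery.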
     

06 Nov, 2008

1 commit

  • smp_mb() is needed (to make the memory operations visible globally) before
    sending the ipi on the sender and the receiver (on Alpha at least) needs
    smp_read_barrier_depends() in the handler before reading the call_single_queue
    list in a lock-free fashion.

    On x86, x2apic mode register accesses for sending IPI's don't have serializing
    semantics. So the need for smp_mb() before sending the IPI becomes more
    critical in x2apic mode.

    Remove the unnecessary smp_mb() in csd_flag_wait(), as the presence of that
    smp_mb() doesn't mean anything on the sender, when the ipi receiver is not
    doing anything special (like memory fence) after clearing the CSD_FLAG_WAIT.

    Signed-off-by: Suresh Siddha
    Signed-off-by: Jens Axboe

    Suresh Siddha
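
The ordering requirement can be modelled in userspace with C11 fences. This is only an analogy under assumptions: `struct toy_mailbox` and the `toy_mbox_*` helpers are hypothetical, an atomic flag stands in for the IPI, a seq_cst fence plays the role of smp_mb(), and an acquire fence is used on the receiving side, which is stronger than smp_read_barrier_depends().

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in: data handed from the sender to the handler. */
struct toy_mailbox {
	int payload;		/* plain data prepared by the sender */
	atomic_int ipi_pending;	/* stands in for the IPI itself */
};

static void toy_mbox_init(struct toy_mailbox *m)
{
	m->payload = 0;
	atomic_init(&m->ipi_pending, 0);
}

/* Sender: make the data globally visible before "sending the IPI". */
static void toy_mbox_send(struct toy_mailbox *m, int value)
{
	assert(m);
	m->payload = value;
	atomic_thread_fence(memory_order_seq_cst);	/* role of smp_mb() */
	atomic_store_explicit(&m->ipi_pending, 1, memory_order_relaxed);
}

/*
 * Receiver: returns 1 and stores the payload in *value if an "IPI"
 * was pending, 0 otherwise.
 */
static int toy_mbox_receive(struct toy_mailbox *m, int *value)
{
	if (!atomic_load_explicit(&m->ipi_pending, memory_order_relaxed))
		return 0;
	/* Order the flag read before the data read. */
	atomic_thread_fence(memory_order_acquire);
	*value = m->payload;
	atomic_store_explicit(&m->ipi_pending, 0, memory_order_relaxed);
	return 1;
}
```

Without the fence in `toy_mbox_send()`, the flag store could in principle become visible before the payload store, which is the situation the message describes for IPI sends that have no serializing semantics.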
     

26 Aug, 2008

1 commit


12 Aug, 2008

1 commit

  • > > Nick Piggin (1):
    > > generic-ipi: fix stack and rcu interaction bug in
    > > smp_call_function_mask()
    >
    > I'm still not 100% sure that I have this patch right... I might have seen
    > a lockup trace implicating the smp call function path... which may have
    > been due to some other problem or a different bug in the new call function
    > code, but if some more people can take a look at it before merging?

    OK indeed it did have a couple of bugs. Firstly, I wasn't freeing the
    data properly in the alloc && wait case. Secondly, I wasn't resetting
    CSD_FLAG_WAIT in the for each cpu loop (so only the first CPU would
    wait).

    After those fixes, the patch boots and runs with the kmalloc commented
    out (so it always executes the slowpath).

    Signed-off-by: Ingo Molnar

    Nick Piggin
     

11 Aug, 2008

1 commit

  • * Venki Pallipadi wrote:

    > Found an OOPS on a big SMP box during an overnight reboot test with
    > upstream git.
    >
    > Suresh and I looked at the oops and looks like the root cause is in
    > generic_smp_call_function_interrupt() and smp_call_function_mask() with
    > wait parameter.
    >
    > The actual oops looked like
    >
    > [ 11.277260] BUG: unable to handle kernel paging request at ffff8802ffffffff
    > [ 11.277815] IP: [] 0xffff8802ffffffff
    > [ 11.278155] PGD 202063 PUD 0
    > [ 11.278576] Oops: 0010 [1] SMP
    > [ 11.279006] CPU 5
    > [ 11.279336] Modules linked in:
    > [ 11.279752] Pid: 0, comm: swapper Not tainted 2.6.27-rc2-00020-g685d87f #290
    > [ 11.280039] RIP: 0010:[] [] 0xffff8802ffffffff
    > [ 11.280692] RSP: 0018:ffff88027f1f7f70 EFLAGS: 00010086
    > [ 11.280976] RAX: 00000000ffffffff RBX: 0000000000000000 RCX: 0000000000000000
    > [ 11.281264] RDX: 0000000000004f4e RSI: 0000000000000001 RDI: 0000000000000000
    > [ 11.281624] RBP: ffff88027f1f7f98 R08: 0000000000000001 R09: ffffffff802509af
    > [ 11.281925] R10: ffff8800280c2780 R11: 0000000000000000 R12: ffff88027f097d48
    > [ 11.282214] R13: ffff88027f097d70 R14: 0000000000000005 R15: ffff88027e571000
    > [ 11.282502] FS: 0000000000000000(0000) GS:ffff88027f1c3340(0000) knlGS:0000000000000000
    > [ 11.283096] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
    > [ 11.283382] CR2: ffff8802ffffffff CR3: 0000000000201000 CR4: 00000000000006e0
    > [ 11.283760] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    > [ 11.284048] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    > [ 11.284337] Process swapper (pid: 0, threadinfo ffff88027f1f2000, task ffff88027f1f0640)
    > [ 11.284936] Stack: ffffffff80250963 0000000000000212 0000000000ee8c78 0000000000ee8a66
    > [ 11.285802] ffff88027e571550 ffff88027f1f7fa8 ffffffff8021adb5 ffff88027f1f3e40
    > [ 11.286599] ffffffff8020bdd6 ffff88027f1f3e40 ffff88027f1f3ef8 0000000000000000
    > [ 11.287120] Call Trace:
    > [ 11.287768] [] ? generic_smp_call_function_interrupt+0x61/0x12c
    > [ 11.288354] [] smp_call_function_interrupt+0x17/0x27
    > [ 11.288744] [] call_function_interrupt+0x66/0x70
    > [ 11.289030] [] ? clockevents_notify+0x19/0x73
    > [ 11.289380] [] ? acpi_idle_enter_simple+0x18b/0x1fa
    > [ 11.289760] [] ? acpi_idle_enter_simple+0x181/0x1fa
    > [ 11.290051] [] ? cpuidle_idle_call+0x70/0xa2
    > [ 11.290338] [] ? cpu_idle+0x5f/0x7d
    > [ 11.290723] [] ? start_secondary+0x14d/0x152
    > [ 11.291010]
    > [ 11.291287]
    > [ 11.291654] Code: Bad RIP value.
    > [ 11.292041] RIP [] 0xffff8802ffffffff
    > [ 11.292380] RSP
    > [ 11.292741] CR2: ffff8802ffffffff
    > [ 11.310951] ---[ end trace 137c54d525305f1c ]---
    >
    > The problem is with the following sequence of events:
    >
    > - CPU A calls smp_call_function_mask() for CPU B with wait parameter
    > - CPU A sets up the call_function_data on the stack and does an rcu add to
    > call_function_queue
    > - CPU A waits until the WAIT flag is cleared
    > - CPU B gets the call function interrupt and starts going through the
    > call_function_queue
    > - CPU C also gets some other call function interrupt and starts going through
    > the call_function_queue
    > - CPU C, which is also going through the call_function_queue, starts referencing
    > CPU A's stack, as that element is still in call_function_queue
    > - CPU B finishes the function call that CPU A set up and as there are no other
    > references to it, rcu deletes the call_function_data (which was from CPU A
    > stack)
    > - CPU B sees the wait flag and just clears the flag (no call_rcu to free)
    > - CPU A which was waiting on the flag continues executing and the stack
    > contents change
    >
    > - CPU C is still in rcu_read section accessing the CPU A's stack sees
    > inconsistent call_function_data and can try to execute
    > function with some random pointer, causing stack corruption for A
    > (by clearing the bits in mask field) and oops.

    Nice debugging work.

    I'd suggest something like the attached (boot tested) patch as the simple
    fix for now.

    I expect the benefits from the less synchronized, multiple-in-flight-data
    global queue will still outweigh the costs of dynamic allocations. But
    if worst comes to worst then we just go back to a globally synchronous
    one-at-a-time implementation, but that would be pretty sad!

    Signed-off-by: Ingo Molnar

    Nick Piggin
     

27 Jul, 2008

1 commit

  • A previous patch added the early_initcall(), to allow a cleaner hooking of
    pre-SMP initcalls. Now we remove the older interface, converting all
    existing users to the new one.

    [akpm@linux-foundation.org: cleanups]
    [akpm@linux-foundation.org: build fix]
    [kosaki.motohiro@jp.fujitsu.com: warning fix]
    [kosaki.motohiro@jp.fujitsu.com: warning fix]
    Signed-off-by: Eduard - Gabriel Munteanu
    Cc: Tom Zanussi
    Signed-off-by: KOSAKI Motohiro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eduard - Gabriel Munteanu
     

16 Jul, 2008

2 commits

  • When a GFP_ATOMIC allocation fails, it falls back to allocating the
    data on the stack and converting it to a waiting call.

    Make sure we actually wait in this case.

    Signed-off-by: Jeremy Fitzhardinge
    Signed-off-by: Linus Torvalds

    Jeremy Fitzhardinge
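
The fallback rule can be sketched in userspace as follows. This is an illustration under assumptions rather than the kernel code: `malloc()` stands in for the GFP_ATOMIC allocation, `simulate_alloc_failure` is a hypothetical knob for forcing the failure path, and the `toy_*` names are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-call data block. */
struct toy_call_data {
	int  payload;
	bool on_heap;
};

/*
 * Choose storage for a cross-CPU call and return the effective wait
 * flag. If the allocation fails, the caller's stack slot is used and
 * waiting is forced, because that slot stops being valid as soon as
 * the caller returns.
 */
static bool toy_prepare_call(struct toy_call_data **out,
			     struct toy_call_data *stack_slot,
			     bool wait, bool simulate_alloc_failure)
{
	assert(out && stack_slot);

	struct toy_call_data *d =
		simulate_alloc_failure ? NULL : malloc(sizeof(*d));

	if (d) {
		d->on_heap = true;	/* freed later, after the call has run */
	} else {
		d = stack_slot;
		d->on_heap = false;
		wait = true;		/* must really wait in this case */
	}

	*out = d;
	return wait;
}
```

The caller then has to honour the returned flag rather than the one it originally asked for, which is the point of the fix: a non-waiting request silently becomes a waiting one whenever the stack fallback is taken.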
     
  • …l/git/tip/linux-2.6-tip

    * 'generic-ipi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (22 commits)
    generic-ipi: more merge fallout
    generic-ipi: merge fix
    x86, visws: use mach-default/entry_arch.h
    x86, visws: fix generic-ipi build
    generic-ipi: fixlet
    generic-ipi: fix s390 build bug
    generic-ipi: fix linux-next tree build failure
    fix: "smp_call_function: get rid of the unused nonatomic/retry argument"
    fix: "smp_call_function: get rid of the unused nonatomic/retry argument"
    fix "smp_call_function: get rid of the unused nonatomic/retry argument"
    on_each_cpu(): kill unused 'retry' parameter
    smp_call_function: get rid of the unused nonatomic/retry argument
    sh: convert to generic helpers for IPI function calls
    parisc: convert to generic helpers for IPI function calls
    mips: convert to generic helpers for IPI function calls
    m32r: convert to generic helpers for IPI function calls
    arm: convert to generic helpers for IPI function calls
    alpha: convert to generic helpers for IPI function calls
    ia64: convert to generic helpers for IPI function calls
    powerpc: convert to generic helpers for IPI function calls
    ...

    Fix trivial conflicts due to rcu updates in kernel/rcupdate.c manually

    Linus Torvalds
     

27 Jun, 2008

1 commit


26 Jun, 2008

2 commits

  • It's never used and the comments refer to nonatomic and retry
    interchangeably. So get rid of it.

    Acked-by: Jeremy Fitzhardinge
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • This adds kernel/smp.c which contains helpers for IPI function calls. In
    addition to supporting the existing smp_call_function() in a more efficient
    manner, it also adds a more scalable variant called smp_call_function_single()
    for calling a given function on a single CPU only.

    The core of this is based on the x86-64 patch from Nick Piggin, lots of
    changes since then. "Alan D. Brunelle" has
    contributed lots of fixes and suggestions as well. Also thanks to
    Paul E. McKenney for reviewing RCU usage
    and getting rid of the data allocation fallback deadlock.

    Acked-by: Ingo Molnar
    Reviewed-by: Paul E. McKenney
    Signed-off-by: Jens Axboe

    Jens Axboe