01 Jan, 2009

1 commit


30 Dec, 2008

2 commits

  • Impact: new API to reduce stack usage

    We're weaning the core code off handing cpumasks around on-stack.
    This introduces arch_send_call_function_ipi_mask().

    Signed-off-by: Rusty Russell

    Rusty Russell
     
  • Impact: Implementation change to remove cpumask_t from stack.

    Actually change smp_call_function_mask() to smp_call_function_many().
    We avoid cpumasks on the stack in this version.

    (S390 has its own version, but that's going away apparently).

    We have to do some dancing to figure out if 0 or 1 other cpus are in
    the mask supplied and the online mask without allocating a tmp
    cpumask. It's still fairly cheap.

    We allocate the cpumask at the end of the call_function_data
    structure: if allocation fails we fall back to smp_call_function_single
    rather than using the baroque quiescing code (which needs a cpumask on
    stack).

    (Thanks to Hiroshi Shimamoto for spotting several bugs in previous versions!)

    Signed-off-by: Rusty Russell
    Signed-off-by: Mike Travis
    Cc: Hiroshi Shimamoto
    Cc: npiggin@suse.de
    Cc: axboe@kernel.dk

    Rusty Russell
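
The "dancing" described in the smp_call_function_many() commit above can be sketched in user space. This is a minimal illustration, not the kernel code: 64-bit integers stand in for cpumasks, and `next_and()`, `classify_targets()` and `enum targets` are hypothetical names. It walks the intersection of the supplied mask and the online mask, skipping the calling CPU, without ever materialising a temporary mask.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 64

/* Next CPU after `prev` that is set in both masks, or NR_CPUS if none.
 * Pass prev = -1 to start from CPU 0. (Hypothetical helper.) */
int next_and(int prev, uint64_t a, uint64_t b)
{
    for (int cpu = prev + 1; cpu < NR_CPUS; cpu++)
        if ((a >> cpu) & (b >> cpu) & 1)
            return cpu;
    return NR_CPUS;
}

enum targets { TARGETS_NONE, TARGETS_ONE, TARGETS_MANY };

/* Decide whether 0, 1 or more CPUs other than `self` are in
 * (mask & online), using only iteration and no temporary mask.
 * When exactly one is found it is stored in *only_cpu. */
enum targets classify_targets(uint64_t mask, uint64_t online,
                              int self, int *only_cpu)
{
    assert(only_cpu != 0);

    /* First candidate, ignoring the calling CPU. */
    int cpu = next_and(-1, mask, online);
    if (cpu == self)
        cpu = next_and(cpu, mask, online);
    if (cpu >= NR_CPUS)
        return TARGETS_NONE;

    /* Is there a second candidate that is not the calling CPU? */
    int next = next_and(cpu, mask, online);
    if (next == self)
        next = next_and(next, mask, online);
    if (next >= NR_CPUS) {
        *only_cpu = cpu;        /* single-CPU fast path applies */
        return TARGETS_ONE;
    }
    return TARGETS_MANY;
}
```

With one target the caller can hand off to the single-CPU path; only the "many" case needs the queued data with its trailing cpumask.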
     

06 Nov, 2008

1 commit

  • smp_mb() is needed (to make the memory operations visible globally) before
    sending the IPI on the sender, and the receiver (on Alpha at least) needs
    smp_read_barrier_depends() in the handler before reading the call_single_queue
    list in a lock-free fashion.

    On x86, x2apic mode register accesses for sending IPIs don't have serializing
    semantics. So the need for smp_mb() before sending the IPI becomes more
    critical in x2apic mode.

    Remove the unnecessary smp_mb() in csd_flag_wait(), as the presence of that
    smp_mb() doesn't mean anything on the sender, when the IPI receiver is not
    doing anything special (like a memory fence) after clearing CSD_FLAG_WAIT.

    Signed-off-by: Suresh Siddha
    Signed-off-by: Jens Axboe

    Suresh Siddha
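
The ordering requirement in the barrier commit above can be approximated with C11 atomics. This is a user-space sketch under stated assumptions, not the kernel implementation: `struct csd`, `sender_enqueue()`, `receiver_handle()` and the `ipi_pending` flag are hypothetical, a single sender is assumed, a full fence stands in for smp_mb(), and an acquire fence is used as a stronger stand-in for smp_read_barrier_depends().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct csd {
    void (*func)(void *);
    void *info;
    struct csd *next;
};

static _Atomic(struct csd *) queue_head;  /* stand-in for call_single_queue */
static atomic_int ipi_pending;            /* stand-in for the IPI itself */

/* Sender side (single sender assumed): publish the entry, then a full
 * fence, and only then raise the "IPI". */
void sender_enqueue(struct csd *d)
{
    assert(d != NULL && d->func != NULL);
    d->next = atomic_load_explicit(&queue_head, memory_order_relaxed);
    atomic_store_explicit(&queue_head, d, memory_order_relaxed);

    /* smp_mb() analogue: queue writes become visible before the IPI. */
    atomic_thread_fence(memory_order_seq_cst);

    atomic_store_explicit(&ipi_pending, 1, memory_order_relaxed);
}

/* Receiver side: returns the number of callbacks that were run. */
int receiver_handle(void)
{
    if (!atomic_exchange_explicit(&ipi_pending, 0, memory_order_relaxed))
        return 0;

    /* Stand-in for smp_read_barrier_depends(): order the reads of the
     * list behind the observation of the IPI. */
    atomic_thread_fence(memory_order_acquire);

    struct csd *d = atomic_exchange_explicit(&queue_head, NULL,
                                             memory_order_relaxed);
    int ran = 0;
    while (d) {
        struct csd *next = d->next;  /* read before func may reuse d */
        d->func(d->info);
        d = next;
        ran++;
    }
    return ran;
}

/* Demo callback: increments the int that `p` points to. */
void bump(void *p)
{
    ++*(int *)p;
}
```

Without the full fence on the sender, the store that raises the IPI could become visible before the queue entry, which is the hazard the commit describes for x2apic mode.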
     

26 Aug, 2008

1 commit

  • Have smp_call_function_single() detect invalid CPU indices and return
    -ENXIO. This function is already executed inside a
    get_cpu()..put_cpu() section, which locks out CPU removal, so rather
    than having the higher layers add another layer of locking to guard
    against unplugged CPUs, do the test here.

    Signed-off-by: H. Peter Anvin

    H. Peter Anvin
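
A minimal user-space sketch of the check described in the commit above, assuming a 64-bit integer as the online mask. `call_function_single()`, `online_mask` and `bump()` are hypothetical stand-ins; the callback is run directly instead of being queued for another CPU.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define NR_CPUS 64

/* Hypothetical online mask: CPUs 0-3 are online. */
static uint64_t online_mask = 0x0f;

/* Run `func` on `cpu`, or return -ENXIO if the CPU index is out of
 * range or the CPU is not online. The direct call stands in for
 * queueing the request and sending the IPI. */
int call_function_single(int cpu, void (*func)(void *), void *info)
{
    assert(func != 0);

    if (cpu < 0 || cpu >= NR_CPUS || !((online_mask >> cpu) & 1))
        return -ENXIO;

    func(info);
    return 0;
}

/* Demo callback: increments the int that `p` points to. */
void bump(void *p)
{
    ++*(int *)p;
}
```

Callers can then test the return value instead of taking their own lock against CPU unplug before each call.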
     

12 Aug, 2008

1 commit

  • > > Nick Piggin (1):
    > > generic-ipi: fix stack and rcu interaction bug in
    > > smp_call_function_mask()
    >
    > I'm still not 100% sure that I have this patch right... I might have seen
    > a lockup trace implicating the smp call function path... which may have
    > been due to some other problem or a different bug in the new call function
    > code, but if some more people can take a look at it before merging?

    OK indeed it did have a couple of bugs. Firstly, I wasn't freeing the
    data properly in the alloc && wait case. Secondly, I wasn't resetting
    CSD_FLAG_WAIT in the for each cpu loop (so only the first CPU would
    wait).

    After those fixes, the patch boots and runs with the kmalloc commented
    out (so it always executes the slowpath).

    Signed-off-by: Ingo Molnar

    Nick Piggin
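
The second bug mentioned in the commit above (CSD_FLAG_WAIT not being reset inside the per-CPU loop) can be simulated in user space. This is a sketch under assumptions, not the kernel slowpath: `run_on_cpu()` and `slowpath_call_each()` are hypothetical, and the "remote" handler runs inline. The handler clears the flag when it finishes, so on-stack data that is reused across iterations has to be re-armed each time.

```c
#include <assert.h>

#define CSD_FLAG_WAIT 0x01

struct call_single_data {
    void (*func)(void *);
    void *info;
    unsigned int flags;
};

/* Stand-in for the remote handler: run the callback, then clear WAIT
 * to release the sender. Returns 1 if the sender had asked to wait. */
static int run_on_cpu(struct call_single_data *d)
{
    int waited = (d->flags & CSD_FLAG_WAIT) != 0;

    d->func(d->info);
    d->flags &= ~CSD_FLAG_WAIT;
    return waited;
}

/* Slowpath: reuse one on-stack request for every CPU. Returns how many
 * of the calls were dispatched as waiting calls. */
int slowpath_call_each(int ncpus, void (*func)(void *), void *info)
{
    struct call_single_data d = { func, info, 0 };
    int waited = 0;

    assert(func != 0);
    for (int cpu = 0; cpu < ncpus; cpu++) {
        /* Re-arm on every iteration; setting this once before the
         * loop would make only the first CPU a waiting call. */
        d.flags = CSD_FLAG_WAIT;
        waited += run_on_cpu(&d);
    }
    return waited;
}

/* Demo callback: increments the int that `p` points to. */
void bump(void *p)
{
    ++*(int *)p;
}
```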
     

11 Aug, 2008

1 commit

  • * Venki Pallipadi wrote:

    > Found an oops on a big SMP box during an overnight reboot test with
    > upstream git.
    >
    > Suresh and I looked at the oops and looks like the root cause is in
    > generic_smp_call_function_interrupt() and smp_call_function_mask() with
    > wait parameter.
    >
    > The actual oops looked like
    >
    > [ 11.277260] BUG: unable to handle kernel paging request at ffff8802ffffffff
    > [ 11.277815] IP: [] 0xffff8802ffffffff
    > [ 11.278155] PGD 202063 PUD 0
    > [ 11.278576] Oops: 0010 [1] SMP
    > [ 11.279006] CPU 5
    > [ 11.279336] Modules linked in:
    > [ 11.279752] Pid: 0, comm: swapper Not tainted 2.6.27-rc2-00020-g685d87f #290
    > [ 11.280039] RIP: 0010:[] [] 0xffff8802ffffffff
    > [ 11.280692] RSP: 0018:ffff88027f1f7f70 EFLAGS: 00010086
    > [ 11.280976] RAX: 00000000ffffffff RBX: 0000000000000000 RCX: 0000000000000000
    > [ 11.281264] RDX: 0000000000004f4e RSI: 0000000000000001 RDI: 0000000000000000
    > [ 11.281624] RBP: ffff88027f1f7f98 R08: 0000000000000001 R09: ffffffff802509af
    > [ 11.281925] R10: ffff8800280c2780 R11: 0000000000000000 R12: ffff88027f097d48
    > [ 11.282214] R13: ffff88027f097d70 R14: 0000000000000005 R15: ffff88027e571000
    > [ 11.282502] FS: 0000000000000000(0000) GS:ffff88027f1c3340(0000) knlGS:0000000000000000
    > [ 11.283096] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
    > [ 11.283382] CR2: ffff8802ffffffff CR3: 0000000000201000 CR4: 00000000000006e0
    > [ 11.283760] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    > [ 11.284048] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    > [ 11.284337] Process swapper (pid: 0, threadinfo ffff88027f1f2000, task ffff88027f1f0640)
    > [ 11.284936] Stack: ffffffff80250963 0000000000000212 0000000000ee8c78 0000000000ee8a66
    > [ 11.285802] ffff88027e571550 ffff88027f1f7fa8 ffffffff8021adb5 ffff88027f1f3e40
    > [ 11.286599] ffffffff8020bdd6 ffff88027f1f3e40 ffff88027f1f3ef8 0000000000000000
    > [ 11.287120] Call Trace:
    > [ 11.287768] [] ? generic_smp_call_function_interrupt+0x61/0x12c
    > [ 11.288354] [] smp_call_function_interrupt+0x17/0x27
    > [ 11.288744] [] call_function_interrupt+0x66/0x70
    > [ 11.289030] [] ? clockevents_notify+0x19/0x73
    > [ 11.289380] [] ? acpi_idle_enter_simple+0x18b/0x1fa
    > [ 11.289760] [] ? acpi_idle_enter_simple+0x181/0x1fa
    > [ 11.290051] [] ? cpuidle_idle_call+0x70/0xa2
    > [ 11.290338] [] ? cpu_idle+0x5f/0x7d
    > [ 11.290723] [] ? start_secondary+0x14d/0x152
    > [ 11.291010]
    > [ 11.291287]
    > [ 11.291654] Code: Bad RIP value.
    > [ 11.292041] RIP [] 0xffff8802ffffffff
    > [ 11.292380] RSP
    > [ 11.292741] CR2: ffff8802ffffffff
    > [ 11.310951] ---[ end trace 137c54d525305f1c ]---
    >
    > The problem is with the following sequence of events:
    >
    > - CPU A calls smp_call_function_mask() for CPU B with wait parameter
    > - CPU A sets up the call_function_data on the stack and does an rcu add to
    > call_function_queue
    > - CPU A waits until the WAIT flag is cleared
    > - CPU B gets the call function interrupt and starts going through the
    > call_function_queue
    > - CPU C also gets some other call function interrupt and starts going through
    > the call_function_queue
    > - CPU C, which is also going through the call_function_queue, starts referencing
    > CPU A's stack, as that element is still in call_function_queue
    > - CPU B finishes the function call that CPU A set up and as there are no other
    > references to it, rcu deletes the call_function_data (which was from CPU A
    > stack)
    > - CPU B sees the wait flag and just clears the flag (no call_rcu to free)
    > - CPU A which was waiting on the flag continues executing and the stack
    > contents change
    >
    > - CPU C, still in its rcu_read section and accessing CPU A's stack, sees
    > inconsistent call_function_data and can try to execute a
    > function with some random pointer, causing stack corruption for A
    > (by clearing the bits in the mask field) and an oops.

    Nice debugging work.

    I'd suggest something like the attached (boot tested) patch as the simple
    fix for now.

    I expect the benefits from the less synchronized, multiple-in-flight-data
    global queue will still outweigh the costs of dynamic allocations. But
    if worst comes to worst then we just go back to a globally synchronous
    one-at-a-time implementation, but that would be pretty sad!

    Signed-off-by: Ingo Molnar

    Nick Piggin
     

27 Jul, 2008

1 commit

  • A previous patch added the early_initcall(), to allow a cleaner hooking of
    pre-SMP initcalls. Now we remove the older interface, converting all
    existing users to the new one.

    [akpm@linux-foundation.org: cleanups]
    [akpm@linux-foundation.org: build fix]
    [kosaki.motohiro@jp.fujitsu.com: warning fix]
    [kosaki.motohiro@jp.fujitsu.com: warning fix]
    Signed-off-by: Eduard - Gabriel Munteanu
    Cc: Tom Zanussi
    Signed-off-by: KOSAKI Motohiro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eduard - Gabriel Munteanu
     

16 Jul, 2008

2 commits

  • When a GFP_ATOMIC allocation fails, the call falls back to allocating the
    data on the stack and is converted to a waiting call.

    Make sure we actually wait in this case.

    Signed-off-by: Jeremy Fitzhardinge
    Signed-off-by: Linus Torvalds

    Jeremy Fitzhardinge
     
  • …l/git/tip/linux-2.6-tip

    * 'generic-ipi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (22 commits)
    generic-ipi: more merge fallout
    generic-ipi: merge fix
    x86, visws: use mach-default/entry_arch.h
    x86, visws: fix generic-ipi build
    generic-ipi: fixlet
    generic-ipi: fix s390 build bug
    generic-ipi: fix linux-next tree build failure
    fix: "smp_call_function: get rid of the unused nonatomic/retry argument"
    fix: "smp_call_function: get rid of the unused nonatomic/retry argument"
    fix "smp_call_function: get rid of the unused nonatomic/retry argument"
    on_each_cpu(): kill unused 'retry' parameter
    smp_call_function: get rid of the unused nonatomic/retry argument
    sh: convert to generic helpers for IPI function calls
    parisc: convert to generic helpers for IPI function calls
    mips: convert to generic helpers for IPI function calls
    m32r: convert to generic helpers for IPI function calls
    arm: convert to generic helpers for IPI function calls
    alpha: convert to generic helpers for IPI function calls
    ia64: convert to generic helpers for IPI function calls
    powerpc: convert to generic helpers for IPI function calls
    ...

    Fix trivial conflicts due to rcu updates in kernel/rcupdate.c manually

    Linus Torvalds
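
The allocation-failure fix in the first 16 Jul commit above can be sketched in user space. This is an illustration under assumptions rather than the kernel code: `call_single()` takes the allocator as a parameter so that failure can be simulated, `failing_alloc()` and `bump()` are hypothetical helpers, and the callback runs inline in place of the IPI. The point is that data living on the caller's stack is only safe if the caller really waits.

```c
#include <assert.h>
#include <stdlib.h>

#define CSD_FLAG_WAIT  0x01
#define CSD_FLAG_ALLOC 0x02

struct call_single_data {
    void (*func)(void *);
    void *info;
    unsigned int flags;
};

/* Issue one call. Returns 1 if the caller ended up waiting for
 * completion, 0 if the call was fire-and-forget. */
int call_single(void (*func)(void *), void *info, int wait,
                void *(*alloc)(size_t))
{
    struct call_single_data stack_data;
    struct call_single_data *data = NULL;

    assert(func != NULL && alloc != NULL);

    if (!wait) {
        data = alloc(sizeof(*data));   /* GFP_ATOMIC analogue */
        if (data)
            data->flags = CSD_FLAG_ALLOC;
    }
    if (!data) {
        /* On-stack fallback: the frame must outlive the remote
         * handler, so the call has to become a waiting call. */
        data = &stack_data;
        data->flags = CSD_FLAG_WAIT;
        wait = 1;
    }
    data->func = func;
    data->info = info;

    unsigned int flags = data->flags;
    data->func(data->info);            /* stand-in for the IPI handler */
    if (flags & CSD_FLAG_ALLOC)
        free(data);
    return wait;
}

/* Simulated allocation failure. */
void *failing_alloc(size_t n)
{
    (void)n;
    return NULL;
}

/* Demo callback: increments the int that `p` points to. */
void bump(void *p)
{
    ++*(int *)p;
}
```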
     

27 Jun, 2008

1 commit


26 Jun, 2008

2 commits

  • The nonatomic/retry argument is never used and the comments refer to
    nonatomic and retry interchangeably. So get rid of it.

    Acked-by: Jeremy Fitzhardinge
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • This adds kernel/smp.c which contains helpers for IPI function calls. In
    addition to supporting the existing smp_call_function() in a more efficient
    manner, it also adds a more scalable variant called smp_call_function_single()
    for calling a given function on a single CPU only.

    The core of this is based on the x86-64 patch from Nick Piggin, with lots
    of changes since then. Alan D. Brunelle has contributed lots of fixes and
    suggestions as well. Also thanks to Paul E. McKenney for reviewing RCU
    usage and getting rid of the data allocation fallback deadlock.

    Acked-by: Ingo Molnar
    Reviewed-by: Paul E. McKenney
    Signed-off-by: Jens Axboe

    Jens Axboe