11 Mar, 2014

1 commit

  • We must use smp_call_function_single(.wait=1) for the
    irq_cpu_stop_queue_work() to ensure the queueing is actually done under
    stop_cpus_lock. Without this we could have dropped the lock by the time
    we do the queueing and get the race we tried to fix.

    Fixes: 7053ea1a34fa ("stop_machine: Fix race between stop_two_cpus() and stop_cpus()")

    Signed-off-by: Peter Zijlstra
    Cc: Prarit Bhargava
    Cc: Rik van Riel
    Cc: Mel Gorman
    Cc: Christoph Hellwig
    Cc: Andrew Morton
    Link: http://lkml.kernel.org/r/20140228123905.GK3104@twins.programming.kicks-ass.net
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
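
    A minimal sketch of the guarantee this fix relies on, with a
    simplified lock standing in for stop_cpus_lock and a stub for
    irq_cpu_stop_queue_work(): with .wait=1, smp_call_function_single()
    only returns after the remote function has finished, so the queueing
    is provably complete before the lock is dropped.

        #include <linux/kernel.h>
        #include <linux/smp.h>
        #include <linux/spinlock.h>

        static DEFINE_SPINLOCK(stop_lock);   /* stand-in for stop_cpus_lock */

        static void queue_stop_works(void *info)
        {
                /* Runs on the target CPU: queue both stopper works here. */
        }

        static void queue_pair(int cpu1, int cpu2, void *info)
        {
                spin_lock(&stop_lock);
                /*
                 * wait=1: does not return until queue_stop_works() has run
                 * on the remote CPU, so the unlock below cannot happen
                 * before the queueing is done.
                 */
                smp_call_function_single(min(cpu1, cpu2), queue_stop_works,
                                         info, 1);
                spin_unlock(&stop_lock);
        }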
     

11 Nov, 2013

1 commit

  • There is a race between stop_two_cpus() and the global stop_cpus().

    It is possible for two CPUs to get their stopper functions queued
    "backwards" from one another, resulting in the stopper threads
    getting stuck, and the system hanging. This can happen because
    queuing up stoppers is not synchronized.

    This patch adds synchronization between stop_cpus (a rare operation),
    and stop_two_cpus.

    Reported-and-Tested-by: Prarit Bhargava
    Signed-off-by: Rik van Riel
    Signed-off-by: Peter Zijlstra
    Acked-by: Mel Gorman
    Link: http://lkml.kernel.org/r/20131101104146.03d1e043@annuminas.surriel.com
    Signed-off-by: Ingo Molnar

    Rik van Riel
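
    The added synchronization can be pictured as a local/global lock
    pair; a sketch using the lglock primitives of that kernel generation
    (simplified, not the exact mainline code):

        #include <linux/lglock.h>

        static DEFINE_STATIC_LGLOCK(stop_cpus_lock);

        static void stop_two_cpus_path(void)
        {
                /* Common, pairwise case: a cheap local lock suffices. */
                lg_local_lock(&stop_cpus_lock);
                /* ... queue the two stopper works in a fixed order ... */
                lg_local_unlock(&stop_cpus_lock);
        }

        static void stop_cpus_path(void)
        {
                /* Rare, global case: excludes every local-lock holder. */
                lg_global_lock(&stop_cpus_lock);
                /* ... queue stopper works on all online cpus ... */
                lg_global_unlock(&stop_cpus_lock);
        }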
     

16 Oct, 2013

1 commit

  • Remove get_online_cpus() usage from the scheduler; there are four
    sites that use it:

    - sched_init_smp(); where it's completely superfluous since we're in
      'early' boot and there simply cannot be any hotplugging.

    - sched_getaffinity(); we already take a raw spinlock to protect the
      task cpus_allowed mask, this disables preemption and therefore
      also stabilizes cpu_online_mask as that's modified using
      stop_machine. However, switch to the active mask for symmetry with
      sched_setaffinity()/set_cpus_allowed_ptr(). We guarantee active
      mask stability by inserting sync_rcu/sched() into _cpu_down.

    - sched_setaffinity(); we don't appear to need get_online_cpus()
      either, there are two sites where hotplug appears relevant:
      * cpuset_cpus_allowed(); for the !cpuset case we use possible_mask,
        for the cpuset case we hold task_lock, which is a spinlock and
        thus for mainline disables preemption (might cause pain on RT).
      * set_cpus_allowed_ptr(); holds all scheduler locks and thus has
        preemption properly disabled; it also already deals with hotplug
        races explicitly where it releases them.

    - migrate_swap(); we can make stop_two_cpus() do the heavy lifting
      for us with a little trickery. By adding a sync_sched/rcu() after
      the CPU_DOWN_PREPARE notifier we can provide preempt/rcu
      guarantees for cpu_active_mask. Use these to validate that both
      our cpus are active when queueing the stop work before we queue
      the stop_machine works for take_cpu_down() (see the sketch after
      this entry).

    Signed-off-by: Peter Zijlstra
    Cc: "Srivatsa S. Bhat"
    Cc: Paul McKenney
    Cc: Mel Gorman
    Cc: Rik van Riel
    Cc: Srikar Dronamraju
    Cc: Andrea Arcangeli
    Cc: Johannes Weiner
    Cc: Linus Torvalds
    Cc: Andrew Morton
    Cc: Steven Rostedt
    Cc: Oleg Nesterov
    Link: http://lkml.kernel.org/r/20131011123820.GV3081@twins.programming.kicks-ass.net
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
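
    A sketch of the migrate_swap() guarantee described above, with a
    hypothetical queue_pair_stop_works() helper: since _cpu_down() does
    a synchronize_sched() after clearing a cpu from cpu_active_mask, a
    preemption-disabled section that still sees the cpu as active knows
    its stopper has not yet been torn down.

        #include <linux/cpumask.h>
        #include <linux/errno.h>
        #include <linux/preempt.h>

        int queue_pair_stop_works(unsigned int cpu1,
                                  unsigned int cpu2);  /* hypothetical */

        int stop_two_cpus_sketch(unsigned int cpu1, unsigned int cpu2)
        {
                int ret = -ENOENT;

                preempt_disable();      /* acts as an RCU-sched read side */
                if (cpu_active(cpu1) && cpu_active(cpu2))
                        ret = queue_pair_stop_works(cpu1, cpu2);
                preempt_enable();
                return ret;
        }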
     

09 Oct, 2013

1 commit

  • Introduce stop_two_cpus() in order to allow controlled swapping of
    two tasks. It repurposes the stop_machine() state machine but only
    stops the two cpus, which we can do with on-stack structures,
    avoiding machine-wide synchronization issues.

    The ordering of CPUs is important to avoid deadlocks. If unordered,
    two cpus calling stop_two_cpus() on each other simultaneously would
    attempt to queue in the opposite order on each CPU, causing an AB-BA
    style deadlock. By always having the lowest-numbered CPU do the
    queueing of works, we guarantee that works are always queued in the
    same order, and deadlocks are avoided.

    Signed-off-by: Peter Zijlstra
    [ Implemented deadlock avoidance. ]
    Signed-off-by: Rik van Riel
    Cc: Andrea Arcangeli
    Cc: Johannes Weiner
    Cc: Srikar Dronamraju
    Signed-off-by: Mel Gorman
    Link: http://lkml.kernel.org/r/1381141781-10992-38-git-send-email-mgorman@suse.de
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
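
    A minimal sketch of the deadlock avoidance, with
    cpu_stop_queue_work() standing in for the internal queueing helper:
    normalizing the order first means concurrent stop_two_cpus(A, B) and
    stop_two_cpus(B, A) calls both queue on the lower-numbered CPU
    first, so the AB-BA inversion cannot arise.

        #include <linux/kernel.h>               /* swap() */
        #include <linux/stop_machine.h>         /* struct cpu_stop_work */

        void cpu_stop_queue_work(unsigned int cpu,
                                 struct cpu_stop_work *work); /* internal */

        static void queue_two_ordered(unsigned int cpu1, unsigned int cpu2,
                                      struct cpu_stop_work *w1,
                                      struct cpu_stop_work *w2)
        {
                /* Impose a global order: lower-numbered CPU queues first. */
                if (cpu1 > cpu2) {
                        swap(cpu1, cpu2);
                        swap(w1, w2);
                }
                cpu_stop_queue_work(cpu1, w1);
                cpu_stop_queue_work(cpu2, w2);
        }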
     

27 Feb, 2013

1 commit

  • commit 14e568e78 ("stop_machine: Use smpboot threads") introduced
    the following regression:

    Before this commit the stopper enabled bit was set in the online
    notifier.

    CPU0                              CPU1
    cpu_up
                                      cpu online
    hotplug_notifier(ONLINE)
      stopper(CPU1)->enabled = true;
    ...
    stop_machine()

    The conversion to smpboot threads moved the enablement to the wakeup
    path of the parked thread. The majority of users seem to have the
    following working order:

    CPU0                              CPU1
    cpu_up
                                      cpu online
    unpark_threads()
    wakeup(stopper[CPU1])
    ....
                                      stopper thread runs
                                      stopper(CPU1)->enabled = true;
    stop_machine()

    But Konrad and Sander have observed:

    CPU0                              CPU1
    cpu_up
                                      cpu online
    unpark_threads()
    wakeup(stopper[CPU1])
    ....
    stop_machine()
                                      stopper thread runs
                                      stopper(CPU1)->enabled = true;

    Now the stop machinery kicks CPU0 into the stop loop, where it gets
    stuck forever because the queue code saw stopper(CPU1)->enabled ==
    false, so CPU0 waits for CPU1 to enter stop_machine, but the CPU1
    stopper work got discarded due to enabled == false.

    Add a pre_unpark function to the smpboot thread descriptor and call it
    before waking the thread.

    This fixes the problem at hand, but the stop_machine code should be
    more robust. The stopper->enabled flag smells fishy at best.

    Thanks to Konrad for going through a loop of debug patches and
    providing the information to decode this issue.

    Reported-and-tested-by: Konrad Rzeszutek Wilk
    Reported-and-tested-by: Sander Eikelenboom
    Cc: Srivatsa S. Bhat
    Cc: Rusty Russell
    Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1302261843240.22263@ionos
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
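
    The shape of the fix, sketched against the smpboot descriptor of
    that kernel generation (handlers stubbed; field set from memory, not
    guaranteed verbatim): the new pre_unpark() hook runs on the waking
    CPU before the parked thread is woken, so stopper->enabled is
    already true by the time anyone can queue work.

        #include <linux/percpu.h>
        #include <linux/sched.h>
        #include <linux/smpboot.h>

        static DEFINE_PER_CPU(struct task_struct *, cpu_stopper_task);

        static int  cpu_stop_should_run(unsigned int cpu) { return 0; } /* stub */
        static void cpu_stopper_thread(unsigned int cpu)  { }           /* stub */
        static void cpu_stop_park(unsigned int cpu)       { }           /* stub */

        static void cpu_stop_unpark(unsigned int cpu)
        {
                /* Real version: set stopper(cpu)->enabled = true under
                 * the stopper lock - before the thread is woken. */
        }

        static struct smp_hotplug_thread cpu_stop_threads = {
                .store             = &cpu_stopper_task,
                .thread_should_run = cpu_stop_should_run,
                .thread_fn         = cpu_stopper_thread,
                .thread_comm       = "migration/%u",
                .park              = cpu_stop_park,
                .pre_unpark        = cpu_stop_unpark,  /* runs before wakeup */
                .selfparking       = true,
        };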
     

14 Feb, 2013

2 commits

  • Use the smpboot thread infrastructure. Mark the stopper thread
    selfparking and park it after it has finished the take_cpu_down()
    work.

    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Rusty Russell
    Cc: Paul McKenney
    Cc: Srivatsa S. Bhat
    Cc: Arjan van de Veen
    Cc: Paul Turner
    Cc: Richard Weinberger
    Cc: Magnus Damm
    Link: http://lkml.kernel.org/r/20130131120741.686315164@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • To allow the stopper thread to be managed by the smpboot thread
    infrastructure, separate out the task storage from the stopper data
    structure.

    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Rusty Russell
    Cc: Paul McKenney
    Cc: Srivatsa S. Bhat
    Cc: Arjan van de Veen
    Cc: Paul Turner
    Cc: Richard Weinberger
    Cc: Magnus Damm
    Link: http://lkml.kernel.org/r/20130131120741.626690384@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
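
    A sketch of the split this describes (fields abbreviated): the
    thread pointer leaves struct cpu_stopper and becomes its own per-cpu
    variable, which is exactly the store shape the smpboot
    infrastructure wants.

        #include <linux/percpu.h>
        #include <linux/sched.h>
        #include <linux/spinlock.h>

        struct cpu_stopper {
                spinlock_t              lock;
                bool                    enabled; /* is this stopper enabled? */
                struct list_head        works;   /* list of pending works */
                /* struct task_struct *thread; -- moved out, see below */
        };

        static DEFINE_PER_CPU(struct cpu_stopper, cpu_stopper);
        static DEFINE_PER_CPU(struct task_struct *, cpu_stopper_task);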
     

07 Nov, 2011

1 commit

  • * 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits)
    Revert "tracing: Include module.h in define_trace.h"
    irq: don't put module.h into irq.h for tracking irqgen modules.
    bluetooth: macroize two small inlines to avoid module.h
    ip_vs.h: fix implicit use of module_get/module_put from module.h
    nf_conntrack.h: fix up fallout from implicit moduleparam.h presence
    include: replace linux/module.h with "struct module" wherever possible
    include: convert various register fcns to macros to avoid include chaining
    crypto.h: remove unused crypto_tfm_alg_modname() inline
    uwb.h: fix implicit use of asm/page.h for PAGE_SIZE
    pm_runtime.h: explicitly requires notifier.h
    linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h
    miscdevice.h: fix up implicit use of lists and types
    stop_machine.h: fix implicit use of smp.h for smp_processor_id
    of: fix implicit use of errno.h in include/linux/of.h
    of_platform.h: delete needless include
    acpi: remove module.h include from platform/aclinux.h
    miscdevice.h: delete unnecessary inclusion of module.h
    device_cgroup.h: delete needless include
    net: sch_generic remove redundant use of <linux/module.h>
    net: inet_timewait_sock doesnt need <linux/module.h>
    ...

    Fix up trivial conflicts (other header files, and removal of the ab3550 mfd driver) in
    - drivers/media/dvb/frontends/dibx000_common.c
    - drivers/media/video/{mt9m111.c,ov6650.c}
    - drivers/mfd/ab3550-core.c
    - include/linux/dmaengine.h

    Linus Torvalds
     

01 Nov, 2011

1 commit

  • Make stop_machine() safe to call early in boot, before SMP has been set
    up, by simply calling the callback function directly if there's only one
    CPU online.

    [ Fixes from AKPM:
    - add comment
    - local_irq_flags, not save_flags
    - also call hard_irq_disable() for systems which need it

    Tejun suggested using an explicit flag rather than just looking at
    the online cpu count. ]

    Cc: Tejun Heo
    Acked-by: Rusty Russell
    Cc: Peter Zijlstra
    Cc: H. Peter Anvin
    Cc: Ingo Molnar
    Cc: Steven Rostedt
    Acked-by: Tejun Heo
    Cc: Konrad Rzeszutek Wilk
    Signed-off-by: Jeremy Fitzhardinge
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeremy Fitzhardinge
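
    A condensed sketch of the resulting fast path (the flag is the one
    this patch introduces; the tail call is a hypothetical name for the
    unchanged SMP path):

        #include <linux/bug.h>
        #include <linux/cpumask.h>
        #include <linux/interrupt.h>            /* hard_irq_disable() */
        #include <linux/stop_machine.h>

        static int stop_machine_smp(int (*fn)(void *), void *data,
                                    const struct cpumask *cpus); /* hypothetical */

        int stop_machine_sketch(int (*fn)(void *), void *data,
                                const struct cpumask *cpus)
        {
                if (!stop_machine_initialized) {
                        /* Early boot: one CPU, no stopper threads yet -
                         * just run the callback with (hard) irqs off. */
                        unsigned long flags;
                        int ret;

                        WARN_ON_ONCE(num_online_cpus() != 1);
                        local_irq_save(flags);
                        hard_irq_disable();
                        ret = (*fn)(data);
                        local_irq_restore(flags);
                        return ret;
                }
                return stop_machine_smp(fn, data, cpus);
        }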
     

31 Oct, 2011

1 commit

  • The changed files were only including linux/module.h for the
    EXPORT_SYMBOL infrastructure, and nothing else. Revector them
    onto the isolated export header for faster compile times.

    Nothing to see here but a whole lot of instances of:

    -#include <linux/module.h>
    +#include <linux/export.h>

    This commit is only changing the kernel dir; next targets
    will probably be mm, fs, the arch dirs, etc.

    Signed-off-by: Paul Gortmaker

    Paul Gortmaker
     

27 Jul, 2011

1 commit

  • This allows us to move duplicated code in <asm/atomic.h>
    (atomic_inc_not_zero() for now) to <linux/atomic.h>

    Signed-off-by: Arun Sharma
    Reviewed-by: Eric Dumazet
    Cc: Ingo Molnar
    Cc: David Miller
    Cc: Eric Dumazet
    Acked-by: Mike Frysinger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arun Sharma
     

28 Jun, 2011

4 commits

  • The MTRR rendezvous sequence was not implemented using
    stop_machine() before, as it gets called both from process context
    as well as from the cpu online paths (where the cpu has not yet come
    online and interrupts are disabled, etc.).

    Now that we have a new stop_machine_from_inactive_cpu() API, use it
    for the rendezvous during mtrr init of a logical processor that is
    coming online.

    For the rest (runtime MTRR modification, system boot, resume paths),
    use stop_machine() to implement the rendezvous sequence. This
    consolidates and cleans up the code.

    Signed-off-by: Suresh Siddha
    Link: http://lkml.kernel.org/r/20110623182057.076997177@sbsiddha-MOBL3.sc.intel.com
    Signed-off-by: H. Peter Anvin

    Suresh Siddha
     
  • Currently, mtrr wants stop_machine functionality while a CPU is
    being brought up. As stop_machine() requires the calling CPU to be
    active, mtrr implements its own stop_machine using stop_one_cpu() on
    each online CPU. This not only unnecessarily duplicates complex
    logic but also introduces a possibility of deadlock when it races
    against the generic stop_machine().

    This patch implements stop_machine_from_inactive_cpu() to serve such
    use cases. Its functionality is basically the same as
    stop_machine(); however, it should be called from a CPU which isn't
    active and doesn't depend on working scheduling on the calling CPU.

    This is achieved by using busy loops for synchronization and
    open-coding stop_cpus queueing and waiting, with direct invocation
    of fn() for the local CPU in between (see the sketch after this
    entry).

    Signed-off-by: Tejun Heo
    Link: http://lkml.kernel.org/r/20110623182056.982526827@sbsiddha-MOBL3.sc.intel.com
    Signed-off-by: Suresh Siddha
    Cc: Ingo Molnar
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Signed-off-by: H. Peter Anvin

    Tejun Heo
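
    Condensed from the description above, a sketch of the control flow
    (internal helper names as in kernel/stop_machine.c of that era;
    state-machine details elided):

        int stop_machine_from_inactive_cpu(int (*fn)(void *), void *data,
                                           const struct cpumask *cpus)
        {
                struct stop_machine_data smdata = {
                        .fn = fn, .data = data, .active_cpus = cpus,
                        .num_threads = num_active_cpus() + 1, /* +1: this cpu */
                };
                struct cpu_stop_done done;
                int ret;

                /* No usable task context: busy-wait for the mutex. */
                while (!mutex_trylock(&stop_cpus_mutex))
                        cpu_relax();

                /* Queue work on the active cpus; run fn() locally. */
                cpu_stop_init_done(&done, num_active_cpus());
                queue_stop_cpus_work(cpu_active_mask, stop_machine_cpu_stop,
                                     &smdata, &done);
                ret = stop_machine_cpu_stop(&smdata);

                /* Busy-wait instead of sleeping for the completion. */
                while (!completion_done(&done.completion))
                        cpu_relax();

                mutex_unlock(&stop_cpus_mutex);
                return ret ?: done.ret;
        }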
     
  • Refactor the queuing part of the stop cpus work from __stop_cpus() into
    queue_stop_cpus_work().

    The reorganization is to help future improvements to stop_machine()
    and doesn't introduce any behavior difference.

    Signed-off-by: Tejun Heo
    Link: http://lkml.kernel.org/r/20110623182056.897818337@sbsiddha-MOBL3.sc.intel.com
    Signed-off-by: Suresh Siddha
    Cc: Ingo Molnar
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Signed-off-by: H. Peter Anvin

    Tejun Heo
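
    The refactor's shape, sketched: __stop_cpus() keeps only the
    init-and-wait part, while the queueing loop moves into the new
    helper (names as in the patch, bodies elided):

        /* New helper: queue fn/arg as stopper work on each cpu in cpumask. */
        static void queue_stop_cpus_work(const struct cpumask *cpumask,
                                         cpu_stop_fn_t fn, void *arg,
                                         struct cpu_stop_done *done);

        static int __stop_cpus(const struct cpumask *cpumask,
                               cpu_stop_fn_t fn, void *arg)
        {
                struct cpu_stop_done done;

                cpu_stop_init_done(&done, cpumask_weight(cpumask));
                queue_stop_cpus_work(cpumask, fn, arg, &done);
                wait_for_completion(&done.completion);
                return done.executed ? done.ret : -ENOENT;
        }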
     
  • The MTRR rendezvous sequence using stop_one_cpu_nowait() can
    potentially happen in parallel with another system-wide rendezvous
    using stop_machine(). This can lead to deadlock (the order in which
    works are queued can be different on different cpus: some cpus will
    be running the first rendezvous handler and others the second one,
    each set waiting for the other set to join the system-wide
    rendezvous, leading to a deadlock).

    The MTRR rendezvous sequence is not implemented using stop_machine()
    as it gets called both from process context as well as from the cpu
    online paths (where the cpu has not yet come online and interrupts
    are disabled, etc.). stop_machine() works only with online cpus.

    For now, take the stop_machine mutex in the MTRR rendezvous sequence
    that gets called from an online cpu (here we are in process context
    and can potentially sleep while taking the mutex). The MTRR
    rendezvous that gets triggered during cpu online doesn't need to
    take this stop_machine lock, as stop_machine() already ensures that
    no cpu hotplug is going on in parallel by doing get_online_cpus().

    TBD: Pursue a cleaner solution of extending the stop_machine()
    infrastructure to handle the case where the calling cpu is still not
    online, and use this for the MTRR rendezvous sequence.

    fixes: https://bugzilla.novell.com/show_bug.cgi?id=672008

    Reported-by: Vadim Kotelnikov
    Signed-off-by: Suresh Siddha
    Link: http://lkml.kernel.org/r/20110623182056.807230326@sbsiddha-MOBL3.sc.intel.com
    Cc: stable@kernel.org # 2.6.35+, backport a week or two after this gets more testing in mainline
    Signed-off-by: H. Peter Anvin

    Suresh Siddha
     

23 Mar, 2011

1 commit

  • ksoftirqd, kworker, migration, and pktgend kthreads can be created with
    kthread_create_on_node(), to get proper NUMA affinities for their stack and
    task_struct.

    Signed-off-by: Eric Dumazet
    Acked-by: David S. Miller
    Reviewed-by: Andi Kleen
    Acked-by: Rusty Russell
    Acked-by: Tejun Heo
    Cc: Tony Luck
    Cc: Fenghua Yu
    Cc: David Howells
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Dumazet
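
    A sketch of the API this uses, hedged as illustration (the thread
    function and binding step are generic boilerplate, not from the
    commit):

        #include <linux/cpumask.h>
        #include <linux/err.h>
        #include <linux/kthread.h>
        #include <linux/sched.h>
        #include <linux/topology.h>

        static int my_threadfn(void *data)
        {
                return 0;       /* a real per-cpu thread would loop here */
        }

        static struct task_struct *spawn_on(unsigned int cpu)
        {
                struct task_struct *p;

                /* Stack and task_struct are allocated on cpu's home node. */
                p = kthread_create_on_node(my_threadfn, NULL,
                                           cpu_to_node(cpu),
                                           "migration/%d", cpu);
                if (!IS_ERR(p)) {
                        kthread_bind(p, cpu);
                        wake_up_process(p);
                }
                return p;
        }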
     

27 Oct, 2010

2 commits

  • Since commit e6bde73b07edeb703d4c89c1daabc09c303de11f ("cpu-hotplug:
    return better errno on cpu hotplug failure"), a cpu notifier can
    return an encapsulated errno value.

    This converts the cpu notifier of stop_machine() to return an
    encapsulated errno value.

    Signed-off-by: Akinobu Mita
    Cc: Rusty Russell
    Cc: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
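
    What the conversion looks like in practice, as a sketch (the failing
    helper is hypothetical):

        #include <linux/cpu.h>
        #include <linux/notifier.h>

        static int prepare_percpu_state(long cpu) { return 0; } /* hypothetical */

        static int example_cpu_callback(struct notifier_block *nfb,
                                        unsigned long action, void *hcpu)
        {
                int err = 0;

                switch (action & ~CPU_TASKS_FROZEN) {
                case CPU_UP_PREPARE:
                        err = prepare_percpu_state((long)hcpu);
                        break;
                }
                /* Encapsulates -errno for the hotplug core (which recovers
                 * it via notifier_to_errno()) instead of a bare NOTIFY_BAD;
                 * notifier_from_errno(0) is NOTIFY_OK. */
                return notifier_from_errno(err);
        }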
     
  • kernel/stop_machine.c: In function `cpu_stopper_thread':
    kernel/stop_machine.c:265: warning: unused variable `ksym_buf'

    ksym_buf[] is unused if WARN_ON() is a no-op.

    Signed-off-by: Rakib Mullick
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rakib Mullick
     

19 Oct, 2010

1 commit

  • In order to separate the stop/migrate work thread from the SCHED_FIFO
    implementation, create a special class for it that is of higher priority than
    SCHED_FIFO itself.

    This currently solves a problem where cpu-hotplug consumes so much cpu-time
    that the SCHED_FIFO class gets throttled, but has the bandwidth replenishment
    timer pending on the now dead cpu.

    It is also required for when we add the planned deadline scheduling
    class above SCHED_FIFO, as the stop/migrate thread still needs to
    transcend those tasks.

    Tested-by: Heiko Carstens
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
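
    The core of such a class is tiny; a sketch condensed from the
    stop-task scheduling class this change introduces (struct rq is the
    scheduler's per-cpu runqueue):

        static struct task_struct *pick_next_task_stop(struct rq *rq)
        {
                struct task_struct *stop = rq->stop;

                /* A runnable stop task always wins, even over
                 * SCHED_FIFO/RR tasks, so it cannot be throttled. */
                if (stop && stop->state == TASK_RUNNING)
                        return stop;
                return NULL;
        }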
     

31 May, 2010

1 commit

  • Problem: In a stress test where some heavy tests were running along with
    regular CPU offlining and onlining, a hang was observed. The system seems
    to be hung at a point where migration_call() tries to kill the
    migration_thread of the dying CPU, which just got moved to the current
    CPU. This migration thread does not get a chance to run (and die) since
    rt_throttled is set to 1 on current, and it doesn't get cleared as the
    hrtimer which is supposed to reset the rt bandwidth
    (sched_rt_period_timer) is tied to the CPU which we just marked dead!

    Solution: This patch pushes the killing of migration thread to
    "CPU_POST_DEAD" event. By then all the timers (including
    sched_rt_period_timer) should have got migrated (along with other
    callbacks).

    Signed-off-by: Amit Arora
    Signed-off-by: Gautham R Shenoy
    Acked-by: Tejun Heo
    Signed-off-by: Peter Zijlstra
    Cc: Thomas Gleixner
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Amit K. Arora
     

08 May, 2010

1 commit

  • When !CONFIG_SMP, the cpu_stop functions weren't defined at all,
    which could lead to build failures if UP code uses the cpu_stop
    facility. Add a dummy cpu_stop implementation for UP. The waiting
    variants execute the work function directly with preemption
    disabled, and stop_one_cpu_nowait() schedules a workqueue work (see
    the sketch after this entry).

    The Makefile and ifdefs around the stop_machine implementation are
    updated to accommodate the CONFIG_SMP && !CONFIG_STOP_MACHINE case.

    Signed-off-by: Tejun Heo
    Reported-by: Ingo Molnar

    Tejun Heo
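
    A sketch of the UP stubs described above, close to the shape the
    patch gives them:

        #include <linux/errno.h>
        #include <linux/preempt.h>
        #include <linux/smp.h>

        typedef int (*cpu_stop_fn_t)(void *arg);

        /* UP: "stopping" a cpu degenerates to running fn() with
         * preemption disabled on the only cpu there is. */
        static inline int stop_one_cpu(unsigned int cpu, cpu_stop_fn_t fn,
                                       void *arg)
        {
                int ret = -ENOENT;

                preempt_disable();
                if (cpu == smp_processor_id())
                        ret = fn(arg);
                preempt_enable();
                return ret;
        }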
     

07 May, 2010

3 commits

  • Currently migration_thread is serving three purposes - migration
    pusher, context to execute active_load_balance() and forced context
    switcher for expedited RCU synchronize_sched. All three roles are
    hardcoded into migration_thread() and determining which job is
    scheduled is slightly messy.

    This patch kills migration_thread and replaces all three uses with
    cpu_stop. The three different roles of migration_thread() are
    split into three separate cpu_stop callbacks -
    migration_cpu_stop(), active_load_balance_cpu_stop() and
    synchronize_sched_expedited_cpu_stop() - and each use case now simply
    asks cpu_stop to execute the callback as necessary.

    synchronize_sched_expedited() was implemented with private
    preallocated resources and custom multi-cpu queueing and waiting
    logic, both of which are provided by cpu_stop.
    synchronize_sched_expedited_count is made atomic and all other shared
    resources along with the mutex are dropped.

    synchronize_sched_expedited() also implemented a check to detect
    cases where not all the callbacks got executed on their assigned
    cpus and fell back to synchronize_sched(). If called with cpu
    hotplug blocked, cpu_stop already guarantees full execution and the
    condition cannot happen; otherwise, stop_machine() would break.
    However, this patch preserves the paranoid check using a cpumask to
    record on which cpus the stopper ran, so that it can serve as a
    bisection point if something actually goes wrong there.

    Because the internal execution state is no longer visible,
    rcu_expedited_torture_stats() is removed.

    This patch also renames cpu_stop threads from "stopper/%d" to
    "migration/%d". The names of these threads ultimately don't matter
    and there's no reason to make unnecessary userland-visible changes.

    With this patch applied, stop_machine() and sched now share the same
    resources. stop_machine() is faster without wasting any resources and
    sched migration users are much cleaner.

    Signed-off-by: Tejun Heo
    Acked-by: Peter Zijlstra
    Cc: Ingo Molnar
    Cc: Dipankar Sarma
    Cc: Josh Triplett
    Cc: Paul E. McKenney
    Cc: Oleg Nesterov
    Cc: Dimitri Sivanich

    Tejun Heo
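
    For the migration-pusher role, the conversion boils down to this
    pattern (condensed; migration_arg and migration_cpu_stop are the
    names the patch introduces in sched.c, the callback body is elided):

        struct migration_arg {
                struct task_struct      *task;
                int                     dest_cpu;
        };

        static int migration_cpu_stop(void *data); /* does the actual move */

        static void migrate_task_sketch(struct rq *rq, struct task_struct *p,
                                        int dest_cpu)
        {
                struct migration_arg arg = { .task = p, .dest_cpu = dest_cpu };

                /* Instead of waking a dedicated migration_thread, hand the
                 * work to the cpu stopper and wait synchronously. */
                stop_one_cpu(cpu_of(rq), migration_cpu_stop, &arg);
        }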
     
  • Reimplement stop_machine using cpu_stop. As cpu stoppers are
    guaranteed to be available for all online cpus,
    stop_machine_create/destroy() are no longer necessary and removed.

    With resource management and synchronization handled by cpu_stop, the
    new implementation is much simpler. Asking the cpu_stop to execute
    the stop_cpu() state machine on all online cpus with cpu hotplug
    disabled is enough.

    stop_machine itself doesn't need to manage any global resources
    anymore, so all per-instance information is rolled into struct
    stop_machine_data and the mutex and all static data variables are
    removed.

    The previous implementation created and destroyed RT workqueues as
    necessary, which made stop_machine() calls highly expensive on very
    large machines. According to Dimitri Sivanich, avoiding the dynamic
    creation/destruction makes booting more than twice as fast on very
    large machines. cpu_stop resources are preallocated for all online
    cpus and should have the same effect.

    Signed-off-by: Tejun Heo
    Acked-by: Rusty Russell
    Acked-by: Peter Zijlstra
    Cc: Oleg Nesterov
    Cc: Dimitri Sivanich

    Tejun Heo
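
    The whole of the new core, essentially as this change shapes it
    (per-instance state lives on the stack; internal helpers and the
    state machine are elided):

        int __stop_machine(int (*fn)(void *), void *data,
                           const struct cpumask *cpus)
        {
                struct stop_machine_data smdata = {
                        .fn = fn, .data = data,
                        .num_threads = num_online_cpus(),
                        .active_cpus = cpus,
                };

                /* Set the initial state and stop all online cpus. */
                set_state(&smdata, STOPMACHINE_PREPARE);
                return stop_cpus(cpu_online_mask, stop_machine_cpu_stop,
                                 &smdata);
        }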
     
  • Implement a simplistic per-cpu maximum-priority cpu monopolization
    mechanism. A non-sleeping callback can be scheduled to run on one or
    multiple cpus with maximum priority, monopolizing those cpus. This
    is primarily to replace and unify RT workqueue usage in stop_machine
    and the scheduler's migration_thread, which currently serves
    multiple purposes.

    Four functions are provided - stop_one_cpu(), stop_one_cpu_nowait(),
    stop_cpus() and try_stop_cpus().

    This is to allow clean sharing of resources among stop_cpu and all the
    migration thread users. One stopper thread per cpu is created which
    is currently named "stopper/CPU". This will eventually replace the
    migration thread and take on its name.

    * This facility was originally named cpuhog and lived in separate
    files but Peter Zijlstra nacked the name and thus got renamed to
    cpu_stop and moved into stop_machine.c.

    * Better reporting of preemption leak as per Peter's suggestion.

    Signed-off-by: Tejun Heo
    Acked-by: Peter Zijlstra
    Cc: Oleg Nesterov
    Cc: Dimitri Sivanich

    Tejun Heo
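
    The four entry points, as introduced (prototypes as in
    include/linux/stop_machine.h of that release):

        typedef int (*cpu_stop_fn_t)(void *arg);

        int  stop_one_cpu(unsigned int cpu, cpu_stop_fn_t fn, void *arg);
        void stop_one_cpu_nowait(unsigned int cpu, cpu_stop_fn_t fn,
                                 void *arg, struct cpu_stop_work *work_buf);
        int  stop_cpus(const struct cpumask *cpumask, cpu_stop_fn_t fn,
                       void *arg);
        int  try_stop_cpus(const struct cpumask *cpumask, cpu_stop_fn_t fn,
                           void *arg);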
     

17 Feb, 2010

1 commit

  • Add __percpu sparse annotations to core subsystems.

    These annotations are to make sparse consider percpu variables to be
    in a different address space and warn if accessed without going
    through percpu accessors. This patch doesn't affect normal builds.

    Signed-off-by: Tejun Heo
    Reviewed-by: Christoph Lameter
    Acked-by: Paul E. McKenney
    Cc: Jens Axboe
    Cc: linux-mm@kvack.org
    Cc: Rusty Russell
    Cc: Dipankar Sarma
    Cc: Peter Zijlstra
    Cc: Andrew Morton
    Cc: Eric Biederman

    Tejun Heo
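
    What the annotation buys, in one small sketch: per-cpu variables sit
    in a distinct, sparse-checked address space, and accessors like
    per_cpu_ptr() are the sanctioned way out of it.

        #include <linux/percpu.h>

        static DEFINE_PER_CPU(unsigned long, hits);

        static unsigned long total_hits(void)
        {
                unsigned long sum = 0;
                int cpu;

                /* Dereferencing &hits directly would make sparse warn;
                 * per_cpu_ptr() translates it to a real pointer. */
                for_each_possible_cpu(cpu)
                        sum += *per_cpu_ptr(&hits, cpu);
                return sum;
        }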
     

20 Feb, 2009

1 commit

  • Impact: cleanup

    There are two allocated per-cpu accessor macros with almost identical
    spelling. The original and far more popular is per_cpu_ptr (44
    files), so change over the other 4 files.

    tj: kill percpu_ptr() and update UP too

    Signed-off-by: Rusty Russell
    Cc: mingo@redhat.com
    Cc: lenb@kernel.org
    Cc: cpufreq@vger.kernel.org
    Signed-off-by: Tejun Heo

    Rusty Russell
     

05 Jan, 2009

1 commit

  • Introduce stop_machine_create/destroy. With this interface subsystems
    that need a non-failing stop_machine environment can create the
    stop_machine machine threads before actually calling stop_machine.
    When the threads aren't needed anymore they can be killed with
    stop_machine_destroy again.

    When stop_machine gets called and the threads aren't present they
    will be created and destroyed automatically. This restores the old
    behaviour of stop_machine.

    This patch also converts cpu hotplug to the new interface since it
    is special: cpu_down calls __stop_machine instead of stop_machine.
    However the kstop threads will only be created when stop_machine
    gets called.

    Changing the code so that the threads would be created automatically
    on __stop_machine is currently not possible: when __stop_machine gets
    called we hold cpu_add_remove_lock, which is the same lock that
    create_rt_workqueue would take. So the workqueue needs to be created
    before the cpu hotplug code locks cpu_add_remove_lock.

    Signed-off-by: Heiko Carstens
    Signed-off-by: Rusty Russell

    Heiko Carstens
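
    Intended usage, sketched (do_critical_update() is a placeholder
    callback, not from the commit):

        #include <linux/stop_machine.h>

        static int do_critical_update(void *data)
        {
                return 0;       /* placeholder */
        }

        static int safe_update(void)
        {
                int err = stop_machine_create(); /* pre-create kstop threads */

                if (err)
                        return err;
                /* From here on stop_machine() cannot fail for lack of
                 * threads - what non-failing users like free_module() need. */
                err = stop_machine(do_critical_update, NULL, NULL);
                stop_machine_destroy();
                return err;
        }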
     

01 Jan, 2009

1 commit

  • Impact: Reduce stack usage, use new cpumask API.

    Mainly changing cpumask_t to 'struct cpumask' and similar simple API
    conversion. Two conversions worth mentioning:

    1) we use cpumask_any_but to avoid a temporary in kernel/softlockup.c,
    2) Use cpumask_var_t in taskstats_user_cmd().

    Signed-off-by: Rusty Russell
    Signed-off-by: Mike Travis
    Cc: Balbir Singh
    Cc: Ingo Molnar

    Rusty Russell
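
    The two conversions called out above, sketched with the new API
    (context simplified):

        #include <linux/cpumask.h>
        #include <linux/errno.h>
        #include <linux/gfp.h>
        #include <linux/smp.h>

        static int cpumask_examples(void)
        {
                cpumask_var_t mask;
                int target;

                /* 1) Avoid a temporary: pick any online cpu but ourselves. */
                target = cpumask_any_but(cpu_online_mask, smp_processor_id());

                /* 2) Keep large masks off the stack with cpumask_var_t. */
                if (!alloc_cpumask_var(&mask, GFP_KERNEL))
                        return -ENOMEM;
                cpumask_copy(mask, cpu_online_mask);
                /* ... use mask, then ... */
                free_cpumask_var(mask);
                return target;
        }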
     

26 Oct, 2008

1 commit

  • This reverts commit a802dd0eb5fc97a50cf1abb1f788a8f6cc5db635 by moving
    the call to init_workqueues() back where it belongs - after SMP has been
    initialized.

    It also moves stop_machine_init() - which needs workqueues - to a later
    phase using a core_initcall() instead of early_initcall(). That should
    satisfy all ordering requirements, and was apparently the reason why
    init_workqueues() was moved to be too early.

    Cc: Heiko Carstens
    Cc: Rusty Russell
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

22 Oct, 2008

2 commits

  • Using |= for updating a value which might be updated on several cpus
    concurrently will not always work, since we need to make sure that
    the update happens atomically.
    To fix this, just use a plain write when the called function returns
    an error code on a cpu. We end up writing the error code of an
    arbitrary cpu if multiple ones fail, but that should be sufficient.

    Signed-off-by: Heiko Carstens
    Signed-off-by: Rusty Russell

    Heiko Carstens
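
    The race and the fix, side by side in a self-contained sketch (the
    result structure is illustrative, not the kernel's own):

        struct stop_result {
                int ret;        /* shared between all participating cpus */
        };

        static void record_result(struct stop_result *res,
                                  int (*fn)(void *), void *data)
        {
                int err = fn(data);

                /* Racy version: res->ret |= err;
                 * the read-modify-write can lose a concurrent update. */

                /* Fixed version: a plain store. If several cpus fail, an
                 * arbitrary cpu's error survives, which is sufficient. */
                if (err)
                        res->ret = err;
        }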
     
  • Convert stop_machine to a workqueue-based approach. Instead of using
    kernel threads for stop_machine, we now use an RT workqueue to
    synchronize all cpus.

    This has the advantage that all needed per-cpu threads are already
    created when stop_machine gets called, and therefore a call to
    stop_machine won't fail anymore. This is needed for s390, which
    needs a mechanism to synchronize all cpus without allocating any
    memory. As Rusty pointed out, free_module() needs a non-failing
    stop_machine interface as well.

    As a side effect, the stop_machine code gets simplified.

    Signed-off-by: Heiko Carstens
    Signed-off-by: Rusty Russell

    Heiko Carstens
     

28 Jul, 2008

3 commits

  • Instead of a "cpu" arg with magic values NR_CPUS (any cpu) and ~0 (all
    cpus), pass a cpumask_t. Allow NULL for the common case (where we
    don't care which CPU the function is run on): temporary cpumask_t's
    are usually considered bad for stack space.

    This deprecates stop_machine_run, to be removed soon when all the
    callers are dead.

    Signed-off-by: Rusty Russell

    Rusty Russell
     
  • stop_machine creates a kthread which creates kernel threads. We can
    create those threads directly and simplify things a little. Some care
    must be taken with CPU hotunplug, which has special needs, but that code
    seems more robust than it was in the past.

    Signed-off-by: Rusty Russell
    Acked-by: Christian Borntraeger

    Rusty Russell
     
  • Allow stop_machine_run() to call a function on all cpus. Calling
    stop_machine_run() with 'ALL_CPUS' invokes this new behavior.
    stop_machine_run() proceeds as normal until the calling cpu has
    invoked 'fn'. Then, we tell all the other cpus to call 'fn'.

    Signed-off-by: Jason Baron
    Signed-off-by: Mathieu Desnoyers
    Signed-off-by: Rusty Russell
    CC: Adrian Bunk
    CC: Andi Kleen
    CC: Alexey Dobriyan
    CC: Christoph Hellwig
    CC: mingo@elte.hu
    CC: akpm@osdl.org

    Jason Baron