09 Jan, 2009

8 commits

  • Impact: cleanups, use new cpumask API

    Final trivial cleanups: mainly s/cpumask_t/struct cpumask/; the
    conversion pattern is sketched at the end of this entry.

    Note there is a FIXME in generate_sched_domains(). A future patch will
    change struct cpumask *doms to struct cpumask *doms[].
    (I suppose Rusty will do this.)

    Signed-off-by: Li Zefan
    Cc: Ingo Molnar
    Cc: Rusty Russell
    Acked-by: Mike Travis
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
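
    A minimal sketch of the conversion pattern (hypothetical function
    names, not quoted from the patch):

        /* Old style: cpumask_t, often passed or copied by value. */
        void constrain_cpus_old(cpumask_t allowed);

        /* New style: struct cpumask accessed through a pointer. */
        void constrain_cpus_new(const struct cpumask *allowed);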
     
  • Impact: use new cpumask API

    This patch mainly does the following things (the allocation pattern
    is sketched at the end of this entry):
    - change cs->cpus_allowed from cpumask_t to cpumask_var_t
    - call alloc_bootmem_cpumask_var() for top_cpuset in cpuset_init_early()
    - call alloc_cpumask_var() for other cpusets
    - replace cpus_xxx() with cpumask_xxx()

    Signed-off-by: Li Zefan
    Cc: Ingo Molnar
    Cc: Rusty Russell
    Acked-by: Mike Travis
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
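
    A rough sketch of the allocation pattern (error handling abbreviated;
    the surrounding code is assumed, not quoted from the patch):

        /* the cpuset member changes from 'cpumask_t cpus_allowed;' to: */
        cpumask_var_t cpus_allowed;

        /* top_cpuset needs its mask before the slab allocator is up: */
        alloc_bootmem_cpumask_var(&top_cpuset.cpus_allowed);

        /* ordinary cpusets allocate at creation time, which may fail: */
        if (!alloc_cpumask_var(&cs->cpus_allowed, GFP_KERNEL)) {
                kfree(cs);
                return ERR_PTR(-ENOMEM);
        }

        /* operations switch from cpus_*() to cpumask_*(), e.g.: */
        cpumask_copy(cs->cpus_allowed, cpu_online_mask);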
     
  • Impact: cleanups, reduce stack usage

    This patch prepares for the next patch. When we convert
    cpuset.cpus_allowed to cpumask_var_t, the assignment (trialcs = *cs)
    no longer works.

    Another benefit is reduced stack usage for trialcs: sizeof(*cs) can
    be as large as 148 bytes on x86_64, so it is really not good to have
    it on the stack. A sketch of the heap-based approach follows this
    entry.

    Signed-off-by: Li Zefan
    Cc: Ingo Molnar
    Cc: Rusty Russell
    Acked-by: Mike Travis
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
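
    Before/after, sketched (the kmemdup()-based form is illustrative;
    the patch itself may structure the helpers differently):

        /* Before: a large struct copied onto the stack. */
        struct cpuset trialcs;
        trialcs = *cs;

        /* After: a heap-allocated duplicate, freed when done. */
        struct cpuset *trialcs = kmemdup(cs, sizeof(*cs), GFP_KERNEL);
        if (!trialcs)
                return -ENOMEM;
        /* ... modify and validate *trialcs ... */
        kfree(trialcs);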
     
  • Impact: reduce stack usage

    Allocate a global cpumask_var_t at boot and use it in cpuset_attach(),
    so that cpuset_attach() can never fail on a memory allocation
    (sketched below).

    Signed-off-by: Li Zefan
    Cc: Ingo Molnar
    Cc: Rusty Russell
    Acked-by: Mike Travis
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
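
    A sketch of the idea, assuming a scratch mask named cpus_attach
    (the name and init location are assumptions):

        /* allocated once at boot, so the attach path never allocates: */
        static cpumask_var_t cpus_attach;

        /* somewhere in cpuset init code: */
        alloc_cpumask_var(&cpus_attach, GFP_KERNEL);

        /* in cpuset_attach(): reuse the preallocated mask. */
        guarantee_online_cpus(cs, cpus_attach);
        set_cpus_allowed_ptr(tsk, cpus_attach);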
     
  • Impact: reduce stack usage

    Just use cs->cpus_allowed directly; there is no need to allocate a
    cpumask_var_t.

    Signed-off-by: Li Zefan
    Cc: Ingo Molnar
    Cc: Rusty Russell
    Acked-by: Mike Travis
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
     
  • This patchset converts cpuset to the new cpumask API, and thus
    removes on-stack cpumask_t usage to reduce stack consumption.

    Before:
    # cat kernel/cpuset.c include/linux/cpuset.h | grep -c cpumask_t
    21
    After:
    # cat kernel/cpuset.c include/linux/cpuset.h | grep -c cpumask_t
    0

    This patch:

    Impact: reduce stack usage

    It is safe to call cpulist_scnprintf() while holding callback_mutex,
    so we can simply drop the on-stack cpumask_t; there is no need to
    allocate a cpumask_var_t either (see the sketch below).

    Signed-off-by: Li Zefan
    Cc: Ingo Molnar
    Cc: Rusty Russell
    Acked-by: Mike Travis
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
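
    The resulting pattern, sketched (buffer names assumed):

        mutex_lock(&callback_mutex);
        ret = cpulist_scnprintf(page, PAGE_SIZE, cs->cpus_allowed);
        mutex_unlock(&callback_mutex);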
     
  • I found a bug on my dual-CPU box. I created a sub cpuset in the top
    cpuset and assigned 1 to its cpus. Then I attached some tasks to
    this sub cpuset. After this, I offlined CPU1, and the tasks in the
    sub cpuset were moved into the top cpuset automatically because no
    cpu was left in the sub cpuset. When I onlined CPU1 again, all the
    tasks that did not originally belong to the top cpuset still ran
    only on CPU0.

    We fix this bug by setting a task's cpus_allowed to cpu_possible_map
    when attaching it to the top cpuset (sketched below). This approach
    does not change the current behavior of cpusets on CPU hotplug, and
    all tasks in the top cpuset use cpu_possible_map to initialize their
    cpus_allowed.

    Signed-off-by: Miao Xie
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miao Xie
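
    A sketch of the fix in the attach path (mask and helper names per
    the cpuset code of that time; placement assumed):

        if (cs == &top_cpuset)
                cpumask_copy(cpus_attach, cpu_possible_mask);
        else
                guarantee_online_cpus(cs, cpus_attach);
        set_cpus_allowed_ptr(tsk, cpus_attach);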
     
  • task_cs() calls task_subsys_state().

    We must use rcu_read_lock() to protect cgroup_subsys_state().

    It is correct that top_cpuset is never freed, but cgroup_subsys_state()
    accesses the task's css_set, and that css_set may be freed while
    task_cs() is running.

    So we use rcu_read_lock() to protect it (see the sketch below).

    Signed-off-by: Lai Jiangshan
    Acked-by: Paul Menage
    Cc: KAMEZAWA Hiroyuki
    Cc: Pavel Emelyanov
    Cc: Balbir Singh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lai Jiangshan
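
    The resulting usage pattern, sketched:

        rcu_read_lock();
        cs = task_cs(tsk);      /* task_subsys_state() -> css_set access */
        /* ... use cs only inside the read-side critical section ... */
        rcu_read_unlock();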
     

07 Jan, 2009

1 commit

  • When cpusets are enabled, it's necessary to print the triggering task's
    set of allowable nodes so the subsequently printed meminfo can be
    interpreted correctly.

    We also print the task's cpuset name for informational purposes.

    [rientjes@google.com: task lock current before dereferencing cpuset]
    Cc: Paul Menage
    Cc: Li Zefan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
     

13 Dec, 2008

1 commit

  • …t_scnprintf to take pointers.

    Impact: change calling convention of existing cpumask APIs

    Most cpumask functions started with cpus_: these have been replaced by
    cpumask_ ones which take struct cpumask pointers as expected.

    These four functions don't have good replacement names; fortunately
    they're rarely used, so we just change them over (before/after is
    sketched below).

    Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    Signed-off-by: Mike Travis <travis@sgi.com>
    Acked-by: Ingo Molnar <mingo@elte.hu>
    Cc: paulus@samba.org
    Cc: mingo@redhat.com
    Cc: tony.luck@intel.com
    Cc: ralf@linux-mips.org
    Cc: Greg Kroah-Hartman <gregkh@suse.de>
    Cc: cl@linux-foundation.org
    Cc: srostedt@redhat.com

    Rusty Russell
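
    Before/after for one of the four, sketched (the old form was
    actually a macro over the mask itself):

        /* old: the mask is passed by value */
        int cpulist_scnprintf(char *buf, int len, cpumask_t mask);

        /* new: the mask is passed as a const pointer */
        int cpulist_scnprintf(char *buf, int len, const struct cpumask *mask);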
     

30 Nov, 2008

1 commit

  • this warning:

    kernel/cpuset.c: In function ‘generate_sched_domains’:
    kernel/cpuset.c:588: warning: ‘ndoms’ may be used uninitialized in this function

    triggers because GCC does not recognize that ndoms stays uninitialized
    only if doms is NULL - but that flow is covered at the end of
    generate_sched_domains().

    Help out GCC by initializing this variable to 0. (That is prudent
    anyway.)

    Also, this function needs a split-up and code-flow simplification:
    at 160 lines it is clearly too long.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

20 Nov, 2008

1 commit

  • After adding a node into the machine, top cpuset's mems isn't updated.

    By reviewing the code, we found that the update function

    cpuset_track_online_nodes()

    was invoked after node_states[N_ONLINE] changed. That is wrong,
    because N_ONLINE only means the node has a pgdat; a node that has
    (or has just gained) memory is tracked via N_HIGH_MEMORY. So we
    should invoke the update function after node_states[N_HIGH_MEMORY]
    changes, just as the original commit intended.

    This patch fixes it, and switches from calling
    cpuset_track_online_nodes() directly to using a memory-hotplug
    notifier (sketched below).

    Signed-off-by: Miao Xie
    Acked-by: Yasunori Goto
    Cc: David Rientjes
    Cc: Paul Menage
    Signed-off-by: Linus Torvalds

    Miao Xie
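
    A sketch of the notifier-based approach (the priority value and the
    exact body are assumptions):

        static int cpuset_track_online_nodes(struct notifier_block *self,
                                             unsigned long action, void *arg)
        {
                switch (action) {
                case MEM_ONLINE:
                case MEM_OFFLINE:
                        /* refresh from N_HIGH_MEMORY, not N_ONLINE */
                        top_cpuset.mems_allowed = node_states[N_HIGH_MEMORY];
                        break;
                }
                return NOTIFY_OK;
        }

        /* registered during init: */
        hotplug_memory_notifier(cpuset_track_online_nodes, 10);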
     

18 Nov, 2008

1 commit

  • Impact: properly rebuild sched-domains on kmalloc() failure

    When cpuset fails to generate sched domains due to a kmalloc()
    failure, the scheduler should fall back to the single partition
    'fallback_doms' and rebuild the sched domains; currently it only
    destroys them without rebuilding (see the sketch below).

    The regression was introduced by:

    | commit dfb512ec4834116124da61d6c1ee10fd0aa32bd6
    | Author: Max Krasnyansky
    | Date: Fri Aug 29 13:11:41 2008 -0700
    |
    | sched: arch_reinit_sched_domains() must destroy domains to force rebuild

    After the above commit, partition_sched_domains(0, NULL, NULL) will
    only destroy sched domains and partition_sched_domains(1, NULL, NULL)
    will create the default sched domain.

    Signed-off-by: Li Zefan
    Cc: Max Krasnyansky
    Cc:
    Signed-off-by: Ingo Molnar

    Li Zefan
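
    In other words (sketch of the calling convention):

        /* destroy only -- what the broken path effectively did: */
        partition_sched_domains(0, NULL, NULL);

        /* destroy, then rebuild the default single domain -- the fix: */
        partition_sched_domains(1, NULL, NULL);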
     

20 Oct, 2008

2 commits

  • 1) seq_file expects that m->count == m->size when its buffer is
    full, so the current code causes bugs when the buffer overflows.

    2) It is not good that cpuset accesses struct seq_file's fields
    directly.

    Signed-off-by: Lai Jiangshan
    Cc: Alexey Dobriyan
    Acked-by: Paul Menage
    Cc: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lai Jiangshan
     
  • Remove the unnecessary int cpus_nonempty variable from the
    update_flag() function.

    Signed-off-by: Md.Rakib H. Mullick
    Acked-by: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rakib Mullick
     

03 Oct, 2008

1 commit

  • This fixes a warning on latest -tip:

    kernel/cpuset.c: In function 'scan_for_empty_cpusets':
    kernel/cpuset.c:1932: warning: passing argument 1 of 'list_add_tail' discards qualifiers from pointer target type

    The struct cpuset *root passed as a parameter to
    scan_for_empty_cpusets() is not supposed to be const, since an entry
    is added to the tail of its list. Just correct the qualifier
    (before/after below).

    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
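
    Before/after, sketched:

        static void scan_for_empty_cpusets(const struct cpuset *root); /* old */
        static void scan_for_empty_cpusets(struct cpuset *root);       /* new */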
     

14 Sep, 2008

1 commit

  • After the patch:

    commit 0b2f630a28d53b5a2082a5275bc3334b10373508
    Author: Miao Xie
    Date: Fri Jul 25 01:47:21 2008 -0700

    cpusets: restructure the function update_cpumask() and update_nodemask()

    It might happen that 'echo 0 > /cpuset/sub/cpus' returns failure
    while 'cpus' has nevertheless been changed, because cpus was updated
    before the call to heap_init(), which may return -ENOMEM.

    This patch restores the original behavior (the corrected ordering is
    sketched below).

    Signed-off-by: Li Zefan
    Acked-by: Paul Menage
    Cc: Paul Jackson
    Cc: Miao Xie
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
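
    A sketch of the corrected ordering (names per the cpuset code of
    that era; surrounding logic abbreviated):

        /* allocate first; a failure here leaves 'cpus' untouched */
        retval = heap_init(&heap, PAGE_SIZE, GFP_KERNEL, NULL);
        if (retval)
                return retval;

        /* only now commit the new mask */
        mutex_lock(&callback_mutex);
        cs->cpus_allowed = trialcs.cpus_allowed;
        mutex_unlock(&callback_mutex);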
     

14 Aug, 2008

1 commit

  • This is an updated version of my previous cpuset patch on top of
    the latest mainline git.
    The patch fixes CPU hotplug handling issues in the current cpusets
    code, namely circular locking in rebuild_sched_domains() and unsafe
    access to the cpu_online_map in the cpuset cpu hotplug handler.

    This version includes changes suggested by Paul Jackson (naming, comments,
    style, etc). I also got rid of the separate workqueue thread because it is
    now safe to call get_online_cpus() from workqueue callbacks.

    Here are some more details:

    rebuild_sched_domains() is the only way to rebuild sched domains
    correctly based on the current cpuset settings. What this means
    is that we need to be able to call it from different contexts,
    like cpu hotplug for example.
    Also latest scheduler code in -tip now calls rebuild_sched_domains()
    directly from functions like arch_reinit_sched_domains().

    In order to support that properly we need to rework cpuset locking
    rules to avoid circular dependencies, which is what this patch does.
    New lock nesting rules are explained in the comments.
    We can now safely call rebuild_sched_domains() from virtually any
    context. The only requirement is that it needs to be called under
    get_online_cpus(). This allows cpu hotplug handlers and the scheduler
    to call rebuild_sched_domains() directly.
    The rest of the cpuset code now offloads sched domain rebuilds to a
    workqueue (async_rebuild_sched_domains(); sketched below).

    This version of the patch addresses comments from the previous
    review. I fixed all mis-formatted comments and trailing spaces.

    I also factored out the code that builds domain masks and split up CPU and
    memory hotplug handling. This was needed to simplify locking, to avoid unsafe
    access to the cpu_online_map from mem hotplug handler, and in general to make
    things cleaner.

    The patch passes moderate testing (building a kernel with -j 16,
    creating & removing domains, and bringing cpus off/online at the
    same time) on a quad-core Core2 based machine.

    It passes lockdep checks, even with preemptible RCU enabled. This
    time I also tested it with the suspend/resume path, and everything
    works as expected.

    Signed-off-by: Max Krasnyansky
    Acked-by: Paul Jackson
    Cc: menage@google.com
    Cc: a.p.zijlstra@chello.nl
    Cc: vegard.nossum@gmail.com
    Signed-off-by: Ingo Molnar

    Max Krasnyansky
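
    A sketch of the workqueue offload (generate_sched_domains() is the
    factored-out mask builder mentioned above; details abbreviated):

        static void do_rebuild_sched_domains(struct work_struct *unused)
        {
                struct sched_domain_attr *attr;
                cpumask_t *doms;
                int ndoms;

                get_online_cpus();
                ndoms = generate_sched_domains(&doms, &attr);
                partition_sched_domains(ndoms, doms, attr);
                put_online_cpus();
        }
        static DECLARE_WORK(rebuild_sched_domains_work, do_rebuild_sched_domains);

        static void async_rebuild_sched_domains(void)
        {
                schedule_work(&rebuild_sched_domains_work);
        }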
     

31 Jul, 2008

4 commits

  • Use cpuset.stack_list rather than a kfifo, so we avoid the memory
    allocation for the kfifo.

    Signed-off-by: Li Zefan
    Signed-off-by: Lai Jiangshan
    Cc: Paul Menage
    Cc: Cedric Le Goater
    Cc: Balbir Singh
    Cc: KAMEZAWA Hiroyuki
    Cc: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
     
  • When multiple cpusets overlap in their 'cpus' and hence form a
    single sched domain, the largest sched_relax_domain_level among them
    should be used. But when top_cpuset's sched_load_balance is set, its
    sched_relax_domain_level is used regardless of the sub-cpusets'.

    This patch fixes it by walking the cpuset hierarchy to find the
    largest sched_relax_domain_level (the merge step is sketched below).

    Signed-off-by: Lai Jiangshan
    Signed-off-by: Li Zefan
    Cc: Paul Menage
    Cc: Cedric Le Goater
    Cc: Balbir Singh
    Cc: KAMEZAWA Hiroyuki
    Reviewed-by: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
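
    The per-cpuset merge step looks roughly like this (the hierarchy
    walk around it is not shown):

        static void update_domain_attr(struct sched_domain_attr *dattr,
                                       struct cpuset *c)
        {
                if (dattr->relax_domain_level < c->relax_domain_level)
                        dattr->relax_domain_level = c->relax_domain_level;
        }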
     
  • All child cpusets contain a subset of the parent's cpus, so we can
    skip them when partitioning sched domains. This decreases 'csa'
    greatly for cpusets with a multi-level hierarchy.

    Signed-off-by: Lai Jiangshan
    Signed-off-by: Li Zefan
    Cc: Paul Menage
    Cc: Cedric Le Goater
    Cc: Balbir Singh
    Cc: KAMEZAWA Hiroyuki
    Reviewed-by: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lai Jiangshan
     
  • clean up hierarchy traversal code

    Signed-off-by: Li Zefan
    Cc: Paul Menage
    Cc: Cedric Le Goater
    Cc: Balbir Singh
    Cc: KAMEZAWA Hiroyuki
    Cc: Paul Jackson
    Cc: Cliff Wickman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
     

26 Jul, 2008

7 commits

  • In cpuset_update_task_memory_state() there is a local variable
    struct task_struct *tsk = current;

    This local variable tsk is used 14 times, and the statement
    task_cs(tsk) appears twice in this function, so using task_cs(tsk)
    instead of task_cs(current) is better for readability.

    Also, "(struct cgroup_scanner *)&scan" is not good for readability
    either. (container_of() is what is used in cpuset_do_move_task(),
    not "(cpuset_hotplug_scanner *)scan".)

    Signed-off-by: Lai Jiangshan
    Acked-by: Paul Menage
    Cc: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lai Jiangshan
     
  • cgroup (cgroup_scan_tasks()) will initialize heap->gt for us, so
    this patch removes started_after() and its helper function.

    Signed-off-by: Lai Jiangshan
    Acked-by: Paul Menage
    Cc: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lai Jiangshan
     
  • I created lots of empty cpusets (with empty cpumasks) and turned off
    "sched_load_balance" in the top cpuset.

    I found that all these empty cpumasks were passed to
    partition_sched_domains() in rebuild_sched_domains(); this is very
    time-consuming for partition_sched_domains() and is not needed (the
    filter is sketched below).

    This also reduces memory consumption and some of the work in
    rebuild_sched_domains().

    Signed-off-by: Lai Jiangshan
    Acked-by: Paul Menage
    Cc: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lai Jiangshan
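
    The filter amounts to an early check in the loop that collects
    candidate cpusets (sketch; variable names assumed):

        /* while building the 'csa' array of candidate cpusets: */
        if (cpus_empty(cp->cpus_allowed))
                continue;       /* nothing to balance, skip this cpuset */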
     
  • When changing 'sched_relax_domain_level', don't rebuild sched domains if
    'cpus' is empty or 'sched_load_balance' is not set.

    Also make the comments of rebuild_sched_domains() more readable.

    Signed-off-by: Li Zefan
    Cc: Hidetoshi Seto
    Cc: Paul Jackson
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
     
  • The bug is that a task may run on a cpu/node that is not in its
    cpuset.cpus/cpuset.mems.

    It can be reproduced by the following commands:
    -----------------------------------
    # mkdir /dev/cpuset
    # mount -t cpuset xxx /dev/cpuset
    # mkdir /dev/cpuset/0
    # echo 0-1 > /dev/cpuset/0/cpus
    # echo 0 > /dev/cpuset/0/mems
    # echo $$ > /dev/cpuset/0/tasks
    # echo 0 > /sys/devices/system/cpu/cpu1/online
    # echo 1 > /sys/devices/system/cpu/cpu1/online
    -----------------------------------

    There is only CPU0 in cpuset.cpus, but the task in this cpuset runs
    on both CPU0 and CPU1.

    This is because the task's cpus_allowed did not get updated after
    the CPU offline/online manipulation. The same goes for mems_allowed.

    This patch fixes this bug except for the root cpuset, because there
    is an open question about the root cpuset: whether or not to update
    all the tasks in the root cpuset after cpu/node offline/online.

    If we update them, some kernel threads that are bound to a specific
    cpu will be unbound.

    If we do not update them, there is a bug in the root cpuset, also
    caused by offline/online manipulation. For example, on a dual-cpu
    machine, we create a sub cpuset in the root cpuset and assign 1 to
    its cpus, and then attach some tasks to this sub cpuset. After this,
    we offline CPU1. Now the tasks in this new cpuset are moved into the
    root cpuset automatically because there is no cpu left in the sub
    cpuset. Then we online CPU1 and find that all the tasks that did not
    originally belong to the root cpuset run only on CPU0.

    Maybe we need to add a flag to the task_struct to mark which tasks
    can't be unbound?

    Signed-off-by: Miao Xie
    Acked-by: Paul Jackson
    Cc: Li Zefan
    Cc: Paul Jackson
    Cc: Paul Menage
    Cc: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miao Xie
     
  • Extract two functions from update_cpumask() and update_nodemask().
    They will be used later for updating tasks' cpus_allowed and
    mems_allowed after CPU/node offline/online.

    [lizf@cn.fujitsu.com: build fix]
    Signed-off-by: Miao Xie
    Acked-by: Paul Jackson
    Cc: David Rientjes
    Cc: Li Zefan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miao Xie
     
  • This patch tweaks the signatures of the update_cpumask() and
    update_nodemask() functions so that they can be called directly as
    handlers for the new cgroups write_string() method.

    This allows cpuset_common_file_write() to be removed.

    Signed-off-by: Paul Menage
    Cc: Paul Jackson
    Cc: Pavel Emelyanov
    Cc: Balbir Singh
    Cc: Serge Hallyn
    Cc: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Menage
     

24 Jul, 2008

1 commit

  • * 'sched/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    sched: hrtick_enabled() should use cpu_active()
    sched, x86: clean up hrtick implementation
    sched: fix build error, provide partition_sched_domains() unconditionally
    sched: fix warning in inc_rt_tasks() to not declare variable 'rq' if it's not needed
    cpu hotplug: Make cpu_active_map synchronization dependency clear
    cpu hotplug, sched: Introduce cpu_active_map and redo sched domain managment (take 2)
    sched: rework of "prioritize non-migratable tasks over migratable ones"
    sched: reduce stack size in isolated_cpu_setup()
    Revert parts of "ftrace: do not trace scheduler functions"

    Fixed up conflicts in include/asm-x86/thread_info.h (due to the
    TIF_SINGLESTEP unification vs TIF_HRTICK_RESCHED removal) and
    kernel/sched_fair.c (due to cpu_active_map vs for_each_cpu_mask_nr()
    introduction).

    Linus Torvalds
     

23 Jul, 2008

1 commit

  • Fix wrong domain attr updates, or we will always update the first sched
    domain attr.

    Signed-off-by: Miao Xie
    Cc: Hidetoshi Seto
    Cc: Paul Jackson
    Cc: Nick Piggin
    Cc: Ingo Molnar
    Cc: [2.6.26.x]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miao Xie
     

18 Jul, 2008

1 commit

  • This is based on Linus' idea of creating a cpu_active_map that
    prevents the scheduler load balancer from migrating tasks to a cpu
    that is going down.

    It allows us to simplify the domain management code and avoid
    unnecessary domain rebuilds during cpu hotplug event handling.

    Please ignore the cpusets part for now. It needs some more work in
    order to avoid crazy lock nesting, although I did simplify and unify
    the domain reinitialization logic. We now simply call
    partition_sched_domains() in all cases. This means that we're using
    exactly the same code paths as in the cpusets case, and hence the
    tests below cover cpusets too. The cpuset changes to make
    rebuild_sched_domains() callable from various contexts are in a
    separate patch (right after this one).

    This not only boots but also easily handles
    while true; do make clean; make -j 8; done
    and
    while true; do on-off-cpu 1; done
    at the same time.
    (on-off-cpu 1 simply does the echo 0/1 > /sys/.../cpu1/online thing).

    Surprisingly the box (dual-core Core2) is quite usable. In fact I'm
    typing this on it right now in gnome-terminal, and things are moving
    along just fine.

    Also, this is running with most of the debug features enabled
    (lockdep, mutex, etc.); no BUG_ONs or lockdep complaints so far.

    I believe I addressed all of Dmitry's comments on Linus' original
    version. I changed both the fair and rt balancers to mask out
    non-active cpus, and replaced cpu_is_offline() with !cpu_active() in
    the main scheduler code where it made sense (to me; the core test is
    sketched below).

    Signed-off-by: Max Krasnyanskiy
    Acked-by: Linus Torvalds
    Acked-by: Peter Zijlstra
    Acked-by: Gregory Haskins
    Cc: dmitry.adamushko@gmail.com
    Cc: pj@sgi.com
    Signed-off-by: Ingo Molnar

    Max Krasnyansky
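
    The core test, sketched (exact call sites vary):

        /* balancer side: never pick a CPU that is on its way down */
        if (!cpu_active(dest_cpu))
                return 0;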
     

14 Jul, 2008

1 commit


13 Jul, 2008

1 commit

  • Commit f18f982ab ("sched: CPU hotplug events must not destroy scheduler
    domains created by the cpusets") introduced a hotplug-related problem as
    described below:

    Upon CPU_DOWN_PREPARE,

    update_sched_domains() -> detach_destroy_domains(&cpu_online_map)

    does the following:

    /*
    * Force a reinitialization of the sched domains hierarchy. The domains
    * and groups cannot be updated in place without racing with the balancing
    * code, so we temporarily attach all running cpus to the NULL domain
    * which will prevent rebalancing while the sched domains are recalculated.
    */

    The sched domains should be rebuilt when a CPU_DOWN operation has
    completed, effectively either upon CPU_DEAD{_FROZEN} (on success) or
    CPU_DOWN_FAILED{_FROZEN} (on failure, to restore things to their
    initial state). That is what update_sched_domains() also does, but
    only for the !CPUSETS case.

    With f18f982ab, sched-domains' reinitialization is delegated to
    CPUSETS code:

    cpuset_handle_cpuhp() -> common_cpu_mem_hotplug_unplug() ->
    rebuild_sched_domains()

    Being called for CPU_UP_PREPARE (and if its callback is called after
    update_sched_domains()), it just negates all the work done by
    update_sched_domains() -- i.e. a soon-to-be-offline cpu is included
    in the sched domains, which makes it visible to the load balancer
    while the CPU_DOWN operation is in progress.

    __migrate_live_tasks() moves the tasks off a 'dead' cpu (it's already
    "offline" when this function is called).

    try_to_wake_up() is called for one of these tasks from another CPU ->
    the load-balancer (wake_idle()) picks up a "dead" CPU and places the
    task on it. Then e.g. BUG_ON(rq->nr_running) detects this a bit later
    -> oops.

    Signed-off-by: Dmitry Adamushko
    Tested-by: Vegard Nossum
    Cc: Paul Menage
    Cc: Max Krasnyansky
    Cc: Paul Jackson
    Cc: Peter Zijlstra
    Cc: miaox@cn.fujitsu.com
    Cc: rostedt@goodmis.org
    Cc: Linus Torvalds
    Signed-off-by: Ingo Molnar

    Dmitry Adamushko
     

23 Jun, 2008

1 commit


21 Jun, 2008

1 commit


19 Jun, 2008

2 commits

  • We allow the inputs to be in the range [-1 ... SD_LV_MAX), and
    return -EINVAL for inputs outside this range (the check is sketched
    below).

    Signed-off-by: Li Zefan
    Acked-by: Paul Menage
    Acked-by: Paul Jackson
    Acked-by: Hidetoshi Seto
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Li Zefan
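
    The check itself, sketched (function shape assumed):

        static int update_relax_domain_level(struct cpuset *cs, s64 val)
        {
                if (val < -1 || val >= SD_LV_MAX)
                        return -EINVAL;

                if (val != cs->relax_domain_level) {
                        cs->relax_domain_level = val;
                        rebuild_sched_domains();
                }
                return 0;
        }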
     
  • The first issue is not related to cpusets: we're simply leaking
    doms_cur. It's allocated in arch_init_sched_domains(), which is
    called for every hotplug event, so we just keep reallocating
    doms_cur without freeing it. I introduced a free_sched_domains()
    function that cleans things up (sketched below).

    The second issue is that sched domains created by the cpusets are
    completely destroyed by CPU hotplug events. For all CPU hotplug
    events the scheduler attaches all CPUs to the NULL domain and then
    puts them all into a single domain, thereby destroying the domains
    created by cpusets (partition_sched_domains).
    The solution is simple: when cpusets are enabled, the scheduler
    should not create the default domain and should instead let cpusets
    do that. Which is exactly what this patch does.

    Signed-off-by: Max Krasnyansky
    Cc: pj@sgi.com
    Cc: menage@google.com
    Cc: rostedt@goodmis.org
    Acked-by: Peter Zijlstra
    Signed-off-by: Thomas Gleixner

    Max Krasnyansky
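
    A sketch of the cleanup helper (assuming the doms_cur/fallback_doms
    globals in kernel/sched.c):

        static void free_sched_domains(void)
        {
                ndoms_cur = 0;
                if (doms_cur != &fallback_doms)
                        kfree(doms_cur);
                doms_cur = &fallback_doms;
        }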
     

16 Jun, 2008

1 commit


10 Jun, 2008

1 commit

  • Kthreads that have called kthread_bind() are bound to specific cpus,
    so other tasks should not be able to change their cpus_allowed out
    from under them. Otherwise, it is possible to move kthreads, such as
    the migration or software watchdog threads, so that they are no
    longer allowed access to the cpu they work on (the guard is sketched
    below).

    Cc: Peter Zijlstra
    Cc: Paul Menage
    Cc: Paul Jackson
    Signed-off-by: David Rientjes
    Signed-off-by: Ingo Molnar

    David Rientjes
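
    A sketch of the guard in the cpumask update path (the flag name
    follows the kernel's PF_THREAD_BOUND; exact placement assumed):

        /* in set_cpus_allowed_ptr() or similar: */
        if ((p->flags & PF_THREAD_BOUND) &&
            !cpus_equal(*new_mask, p->cpus_allowed)) {
                ret = -EINVAL;
                goto out;
        }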