17 Jun, 2009

3 commits

  • Fix page cache/slab objects being allocated on disallowed nodes when
    memory spread is set, by updating tasks' mems_allowed as soon as their
    cpuset's mems is changed.

    To update tasks' mems_allowed in time, we must also modify the memory
    policy code, because memory policy was originally applied only in the
    task's own context. After this patch, one task can directly manipulate
    another's mems_allowed, so we use alloc_lock in the task_struct to
    protect a task's mems_allowed and memory policy.

    In the fast path, however, we take no lock, because adding one could
    cause a performance regression. Without a lock, a task might
    momentarily see an empty nodemask while its cpuset's mems_allowed is
    changed to a non-overlapping set. To avoid this, the mask is updated
    in two steps: first set all newly allowed nodes, then clear the newly
    disallowed ones (see the sketch at the end of this entry).

    [lee.schermerhorn@hp.com:
    The rework of mpol_new() to extract the adjusting of the node mask to
    apply cpuset and mpol flags "context" breaks set_mempolicy() and mbind()
    with MPOL_PREFERRED and a NULL nodemask--i.e., explicit local
    allocation. Fix this by adding the check for MPOL_PREFERRED and empty
    node mask to mpol_new_mempolicy().

    Remove the now unneeded 'nodes = NULL' from mpol_new().

    Note that mpol_new_mempolicy() is always called with a non-NULL
    'nodes' parameter now that it has been removed from mpol_new().
    Therefore, we don't need to test nodes for NULL before testing it for
    'empty'. However, just to be extra paranoid, add a VM_BUG_ON() to
    verify this assumption.]
    [lee.schermerhorn@hp.com:
    I don't think the function name 'mpol_new_mempolicy' is descriptive
    enough to differentiate it from mpol_new().

    This function applies cpuset context, usually constraining nodes to
    those allowed by the cpuset. However, when the MPOL_F_RELATIVE_NODES
    flag is set, it also translates the nodes. So I settled on
    'mpol_set_nodemask()', because the comment block for mpol_new() mentions
    that we need to call this function to "set nodes".

    Some additional minor line length, whitespace and typo cleanup.]
    Signed-off-by: Miao Xie
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Christoph Lameter
    Cc: Paul Menage
    Cc: Nick Piggin
    Cc: Yasunori Goto
    Cc: Pekka Enberg
    Cc: David Rientjes
    Signed-off-by: Lee Schermerhorn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miao Xie
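
    A minimal sketch of the two-step mask update described above; the
    helper name is assumed rather than taken from the patch, and the real
    code also rebinds the task's mempolicy between the two steps:

        #include <linux/nodemask.h>
        #include <linux/sched.h>

        /* Sketch: update another task's mems_allowed without locking the
         * allocator fast path. */
        static void change_task_nodemask_sketch(struct task_struct *tsk,
                                                const nodemask_t *newmems)
        {
                /* Step 1: OR in the new nodes, so a concurrent reader
                 * never sees an empty mask mid-update. */
                nodes_or(tsk->mems_allowed, tsk->mems_allowed, *newmems);
                /* (the real patch rebinds tsk's mempolicy here) */
                /* Step 2: install the final mask, dropping the newly
                 * disallowed nodes. */
                tsk->mems_allowed = *newmems;
        }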
     
  • Fix the kernel failing to spread page cache/slab objects evenly over
    all allowed nodes when the spread flags are set, by updating tasks'
    page/slab spread flags as soon as their cpuset's flags change (see
    the sketch at the end of this entry).

    Signed-off-by: Miao Xie
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Christoph Lameter
    Cc: Paul Menage
    Cc: Nick Piggin
    Cc: Yasunori Goto
    Cc: Pekka Enberg
    Cc: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miao Xie
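
    A hedged sketch of what updating the spread flags "in time" amounts
    to; the helper name is assumed, while is_spread_page() and
    is_spread_slab() are existing cpuset.c predicates:

        #include <linux/sched.h>

        /* Sketch: push the cpuset's spread settings into the task's
         * flags so the page cache and slab allocators see them at once. */
        static void update_spread_flags_sketch(struct cpuset *cs,
                                               struct task_struct *tsk)
        {
                if (is_spread_page(cs))
                        tsk->flags |= PF_SPREAD_PAGE;
                else
                        tsk->flags &= ~PF_SPREAD_PAGE;
                if (is_spread_slab(cs))
                        tsk->flags |= PF_SPREAD_SLAB;
                else
                        tsk->flags &= ~PF_SPREAD_SLAB;
        }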
     
  • The kernel still allocates page caches on the old nodes after its
    cpuset's mems is modified while 'memory_spread_page' is set, or it
    doesn't spread the page cache evenly over all the nodes the faulting
    task is allowed to use after memory_spread_page is set. This is
    caused by the task's stale mems_allowed and flags: the current kernel
    doesn't update them unless some function invokes
    cpuset_update_task_memory_state(), which is sometimes too late. We
    must update tasks' mems_allowed and flags in time.

    Slab has the same problem.

    The following patches fix this bug by updating tasks' mems_allowed
    and spread flags as soon as their cpuset's mems or spread flags
    change.

    This patch:

    Extract a function from cpuset_update_task_memory_state(). It will be
    used later to update tasks' page/slab spread flags after their
    cpuset's flags are changed.

    Signed-off-by: Miao Xie
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Christoph Lameter
    Cc: Paul Menage
    Cc: Nick Piggin
    Cc: Yasunori Goto
    Cc: Pekka Enberg
    Cc: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miao Xie
     

12 Jun, 2009

1 commit


03 Apr, 2009

8 commits

  • Kthreads that have the PF_THREAD_BOUND bit set in their flags are bound to a
    specific cpu. Thus, their set of allowed cpus shall not change.

    This patch prevents such threads from attaching to non-root cpusets
    (the check is sketched below). They do not have mempolicies that
    restrict them to a subset of system nodes and, since their cpumask
    may never change, they cannot use any of the features of cpusets.

    The tasks will forever be a member of the root cpuset and will be returned
    when listing the tasks attached to that cpuset.

    Cc: Paul Menage
    Cc: Peter Zijlstra
    Cc: Dhaval Giani
    Signed-off-by: David Rientjes
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
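
    A minimal sketch of the attach-time check this describes; the
    function name and its second parameter are hypothetical, standing in
    for cpuset's can_attach callback:

        #include <linux/sched.h>
        #include <linux/errno.h>

        /* Sketch: refuse to move a CPU-bound kthread out of the root
         * cpuset. */
        static int can_attach_check_sketch(struct task_struct *tsk,
                                           int target_is_root_cpuset)
        {
                if ((tsk->flags & PF_THREAD_BOUND) && !target_is_root_cpuset)
                        return -EINVAL;
                return 0;
        }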
     
  • Allow cpusets to be configured/built on non-SMP systems

    Currently it's impossible to build cpusets under UML on x86-64, since
    cpusets depends on SMP and x86-64 UML doesn't support SMP.

    There's code in cpusets that doesn't depend on SMP. This patch surrounds
    the minimum amount of cpusets code with #ifdef CONFIG_SMP in order to
    allow cpusets to build/run on UP systems (for testing purposes under UML).

    Reviewed-by: Li Zefan
    Signed-off-by: Paul Menage
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Menage
     
  • The cpuset_zone_allowed() variants are actually only a function of
    the zone's node (see the sketch below).

    Cc: Paul Menage
    Acked-by: Christoph Lameter
    Cc: Randy Dunlap
    Signed-off-by: David Rientjes
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
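
    Since the result depends only on the zone's node, the zone-based
    checks reduce to wrappers around node-based ones. A sketch, with the
    wrapper shape assumed from the description:

        #include <linux/cpuset.h>
        #include <linux/mm.h>

        /* Sketch: the zone check is just the node check applied to the
         * zone's node. */
        static inline int zone_allowed_sketch(struct zone *z, gfp_t gfp_mask)
        {
                return cpuset_node_allowed_softwall(zone_to_nid(z), gfp_mask);
        }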
     
  • Use cgroup_scanner.data, instead of introducing cpuset_hotplug_scanner.

    Signed-off-by: Li Zefan
    Cc: KAMEZAWA Hiroyuki
    Cc: Paul Menage
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
     
  • When writing to cpuset.mems, cpuset has to update its mems_allowed before
    calling update_tasks_nodemask(), but this function might return -ENOMEM.

    To avoid this rare case, we allocate the memory before changing
    mems_allowed and then pass it to update_tasks_nodemask(), similar to
    what update_cpumask() does (see the sketch below).

    Signed-off-by: Li Zefan
    Cc: KAMEZAWA Hiroyuki
    Cc: Paul Menage
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
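
    A sketch of the ordering this relies on; the helper name is
    hypothetical and the update_tasks_nodemask() call is abbreviated:

        #include <linux/mm.h>
        #include <linux/prio_heap.h>

        /* Sketch: perform all fallible allocation before publishing the
         * new mask, so -ENOMEM can no longer leave the cpuset
         * half-updated. */
        static int update_nodemask_sketch(struct cpuset *cs,
                                          nodemask_t newmems)
        {
                struct ptr_heap heap;
                int retval;

                retval = heap_init(&heap, PAGE_SIZE, GFP_KERNEL, NULL);
                if (retval)
                        return retval;   /* nothing has changed yet */

                cs->mems_allowed = newmems;   /* commit; done under
                                               * callback_mutex in cpuset */
                /* update_tasks_nodemask(cs, ..., &heap); -- cannot fail */
                heap_free(&heap);
                return 0;
        }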
     
  • This patch uses cgroup_scan_tasks() to rebind tasks' vmas to the new
    cpuset's mems_allowed.

    This not only simplifies the code considerably, but also avoids
    allocating an array to hold the mm pointers of all the tasks in the
    cpuset. That array can be big (size > PAGE_SIZE) if the cpuset
    contains lots of tasks, so the allocation could fail under memory
    pressure.

    Signed-off-by: Li Zefan
    Cc: KAMEZAWA Hiroyuki
    Cc: Paul Menage
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
     
  • Changes to cpuset->cpus_allowed and cpuset->mems_allowed should be
    protected by callback_mutex, otherwise a reader may see inconsistent
    cpus/mems. This is cpuset's locking rule (see the sketch below).

    Signed-off-by: Li Zefan
    Cc: Paul Menage
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
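
    The rule in sketch form; the setter is hypothetical, and
    callback_mutex is defined locally here only to keep the sketch
    self-contained (it already exists in kernel/cpuset.c):

        #include <linux/cpumask.h>
        #include <linux/mutex.h>

        static DEFINE_MUTEX(callback_mutex);

        /* Sketch: writers hold callback_mutex, so a reader taking the
         * same mutex never sees a half-updated mask. */
        static void set_cpus_sketch(struct cpuset *cs,
                                    const struct cpumask *newcpus)
        {
                mutex_lock(&callback_mutex);
                cpumask_copy(cs->cpus_allowed, newcpus);
                mutex_unlock(&callback_mutex);
        }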
     
  • We have some read-only files and write-only files, but currently they
    are all set to 0644, which is counter-intuitive and causes trouble
    for some cgroup tools such as libcgroup.

    This patch adds 'mode' to struct cftype to allow a cgroup subsystem
    to set its own files' mode; in most cases cft->mode can be left as 0
    and cgroup will figure out a proper mode (see the sketch below).

    Acked-by: Paul Menage
    Reviewed-by: KAMEZAWA Hiroyuki
    Signed-off-by: Li Zefan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
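
    A sketch of how a subsystem would use the new field; the handler name
    is assumed:

        #include <linux/cgroup.h>
        #include <linux/stat.h>

        /* Sketch: mark a read-only file explicitly. Leaving .mode as 0
         * lets cgroup pick a proper default instead. */
        static struct cftype cft_sketch = {
                .name = "memory_pressure",
                .read_u64 = cpuset_read_u64,   /* assumed handler */
                .mode = S_IRUGO,               /* 0444: read-only */
        };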
     

19 Jan, 2009

1 commit

  • Lockdep reported a possible circular locking dependency when we
    tested cpuset on a NUMA/fake-NUMA box.

    =======================================================
    [ INFO: possible circular locking dependency detected ]
    2.6.29-rc1-00224-ga652504 #111
    -------------------------------------------------------
    bash/2968 is trying to acquire lock:
    (events){--..}, at: [] flush_work+0x24/0xd8

    but task is already holding lock:
    (cgroup_mutex){--..}, at: [] cgroup_lock_live_group+0x12/0x29

    which lock already depends on the new lock.
    ......
    -------------------------------------------------------

    Steps to reproduce:
    # mkdir /dev/cpuset
    # mount -t cpuset xxx /dev/cpuset
    # mkdir /dev/cpuset/0
    # echo 0 > /dev/cpuset/0/cpus
    # echo 0 > /dev/cpuset/0/mems
    # echo 1 > /dev/cpuset/0/memory_migrate
    # cat /dev/zero > /dev/null &
    # echo $! > /dev/cpuset/0/tasks

    This is because async_rebuild_sched_domains has the following lock sequence:
    run_workqueue(async_rebuild_sched_domains)
    -> do_rebuild_sched_domains -> cgroup_lock

    But, attaching tasks when memory_migrate is set has following:
    cgroup_lock_live_group(cgroup_tasks_write)
    -> do_migrate_pages -> flush_work

    This patch fixes it by using a separate workqueue thread (sketched
    below).

    Signed-off-by: Miao Xie
    Signed-off-by: Lai Jiangshan
    Signed-off-by: Ingo Molnar

    Miao Xie
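
    A sketch of the fix, using the function names quoted in the lock
    chains above; the init path is abbreviated:

        #include <linux/init.h>
        #include <linux/kernel.h>
        #include <linux/workqueue.h>

        static void do_rebuild_sched_domains(struct work_struct *work);

        static struct workqueue_struct *cpuset_wq;
        static DECLARE_WORK(rebuild_sched_domains_work,
                            do_rebuild_sched_domains);

        /* Sketch: queue the rebuild on a private single-threaded
         * workqueue, so it can never wait behind works queued on the
         * shared "events" workqueue. */
        static void async_rebuild_sched_domains(void)
        {
                queue_work(cpuset_wq, &rebuild_sched_domains_work);
        }

        static int __init cpuset_wq_init(void)
        {
                cpuset_wq = create_singlethread_workqueue("cpuset");
                BUG_ON(!cpuset_wq);
                return 0;
        }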
     

16 Jan, 2009

1 commit

  • Move Documentation/cpusets.txt and Documentation/controllers/* to
    Documentation/cgroups/

    Signed-off-by: Li Zefan
    Acked-by: KAMEZAWA Hiroyuki
    Acked-by: Balbir Singh
    Acked-by: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
     

09 Jan, 2009

8 commits

  • Impact: cleanups, use new cpumask API

    Final trivial cleanups: mainly s/cpumask_t/struct cpumask

    Note there is a FIXME in generate_sched_domains(). A future patch will
    change struct cpumask *doms to struct cpumask *doms[].
    (I suppose Rusty will do this.)

    Signed-off-by: Li Zefan
    Cc: Ingo Molnar
    Cc: Rusty Russell
    Acked-by: Mike Travis
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
     
  • Impact: use new cpumask API

    This patch mainly does the following things (the allocation pattern
    is sketched below):
    - change cs->cpus_allowed from cpumask_t to cpumask_var_t
    - call alloc_bootmem_cpumask_var() for top_cpuset in cpuset_init_early()
    - call alloc_cpumask_var() for other cpusets
    - replace cpus_xxx() with cpumask_xxx()

    Signed-off-by: Li Zefan
    Cc: Ingo Molnar
    Cc: Rusty Russell
    Acked-by: Mike Travis
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
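
    The conversion pattern in miniature; a generic sketch, not lifted
    from the patch:

        #include <linux/cpumask.h>
        #include <linux/errno.h>
        #include <linux/gfp.h>

        /* Sketch: with CONFIG_CPUMASK_OFFSTACK=y a cpumask_var_t is
         * heap-allocated, so it can fail to allocate and must be freed. */
        static int cpumask_var_demo(void)
        {
                cpumask_var_t mask;

                if (!alloc_cpumask_var(&mask, GFP_KERNEL))
                        return -ENOMEM;
                cpumask_copy(mask, cpu_online_mask);  /* cpumask_*() API */
                free_cpumask_var(mask);
                return 0;
        }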
     
  • Impact: cleanups, reduce stack usage

    This patch prepares for the next patch. When we convert
    cpuset.cpus_allowed to cpumask_var_t, (trialcs = *cs) no longer works.

    Another result of this patch is reduced stack usage for trialcs:
    sizeof(*cs) can be as large as 148 bytes on x86_64, which is really
    too big to keep on the stack.

    Signed-off-by: Li Zefan
    Cc: Ingo Molnar
    Cc: Rusty Russell
    Acked-by: Mike Travis
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
     
  • Impact: reduce stack usage

    Allocate a global cpumask_var_t at boot, and use it in cpuset_attach(), so
    we won't fail cpuset_attach().

    Signed-off-by: Li Zefan
    Cc: Ingo Molnar
    Cc: Rusty Russell
    Acked-by: Mike Travis
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
     
  • Impact: reduce stack usage

    Just use cs->cpus_allowed, and no need to allocate a cpumask_var_t.

    Signed-off-by: Li Zefan
    Cc: Ingo Molnar
    Cc: Rusty Russell
    Acked-by: Mike Travis
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
     
  • This patchset converts cpuset to use the new cpumask API, and thus
    removes on-stack cpumask_t to reduce stack usage.

    Before:
    # cat kernel/cpuset.c include/linux/cpuset.h | grep -c cpumask_t
    21
    After:
    # cat kernel/cpuset.c include/linux/cpuset.h | grep -c cpumask_t
    0

    This patch:

    Impact: reduce stack usage

    It's safe to call cpulist_scnprintf() while holding callback_mutex,
    so we can just remove the on-stack cpumask_t; there's no need to
    allocate a cpumask_var_t.

    Signed-off-by: Li Zefan
    Cc: Ingo Molnar
    Cc: Rusty Russell
    Acked-by: Mike Travis
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
     
  • I found a bug on my dual-CPU box. I created a sub cpuset in the top
    cpuset and assigned 1 to its cpus, then attached some tasks to this
    sub cpuset. After this, we offline CPU1, and the tasks in the sub
    cpuset are moved into the top cpuset automatically because the sub
    cpuset is left with no CPU. When we then online CPU1 again, we find
    that all the tasks which didn't originally belong to the top cpuset
    run only on CPU0.

    We fix this bug by setting a task's cpus_allowed to cpu_possible_map
    when attaching it to the top cpuset. This approach doesn't require
    modifying the current behavior of cpusets on CPU hotplug, and all
    tasks in the top cpuset use cpu_possible_map to initialize their
    cpus_allowed.

    Signed-off-by: Miao Xie
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miao Xie
     
  • task_cs() calls task_subsys_state(), so we must use rcu_read_lock()
    to protect that cgroup_subsys_state() access.

    It's true that top_cpuset is never freed, but cgroup_subsys_state()
    accesses a css_set, and this css_set may be freed while task_cs() is
    running. We use rcu_read_lock() to protect it (see the sketch below).

    Signed-off-by: Lai Jiangshan
    Acked-by: Paul Menage
    Cc: KAMEZAWA Hiroyuki
    Cc: Pavel Emelyanov
    Cc: Balbir Singh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lai Jiangshan
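
    The rule in sketch form; the inspect_cpuset() callee is hypothetical:

        #include <linux/rcupdate.h>
        #include <linux/sched.h>

        /* Sketch: the css_set reached through task_cs() is only kept
         * alive by RCU, so bracket the access. */
        static void task_cs_sketch(struct task_struct *tsk)
        {
                struct cpuset *cs;

                rcu_read_lock();
                cs = task_cs(tsk);      /* dereferences tsk->cgroups */
                inspect_cpuset(cs);     /* hypothetical; must not sleep */
                rcu_read_unlock();
        }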
     

07 Jan, 2009

1 commit

  • When cpusets are enabled, it's necessary to print the triggering task's
    set of allowable nodes so the subsequently printed meminfo can be
    interpreted correctly.

    We also print the task's cpuset name for informational purposes.

    [rientjes@google.com: task lock current before dereferencing cpuset]
    Cc: Paul Menage
    Cc: Li Zefan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
     

13 Dec, 2008

1 commit

  • …t_scnprintf to take pointers.

    Impact: change calling convention of existing cpumask APIs

    Most cpumask functions started with cpus_: these have been replaced by
    cpumask_ ones which take struct cpumask pointers as expected.

    These four functions don't have good replacement names; fortunately
    they're rarely used, so we just change them over.

    Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    Signed-off-by: Mike Travis <travis@sgi.com>
    Acked-by: Ingo Molnar <mingo@elte.hu>
    Cc: paulus@samba.org
    Cc: mingo@redhat.com
    Cc: tony.luck@intel.com
    Cc: ralf@linux-mips.org
    Cc: Greg Kroah-Hartman <gregkh@suse.de>
    Cc: cl@linux-foundation.org
    Cc: srostedt@redhat.com

    Rusty Russell
     

30 Nov, 2008

1 commit

  • this warning:

    kernel/cpuset.c: In function ‘generate_sched_domains’:
    kernel/cpuset.c:588: warning: ‘ndoms’ may be used uninitialized in this function

    triggers because GCC does not recognize that ndoms stays uninitialized
    only if doms is NULL - but that flow is covered at the end of
    generate_sched_domains().

    Help out GCC by initializing this variable to 0 (that's prudent
    anyway).

    Also, this function needs to be split up and its code flow
    simplified: at 160 lines it's clearly too long.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

20 Nov, 2008

1 commit

  • After adding a node into the machine, top cpuset's mems isn't updated.

    By reviewing the code, we found that the update function

    cpuset_track_online_nodes()

    was invoked after node_states[N_ONLINE] changed. That is wrong,
    because N_ONLINE only means the node has a pgdat; when a node has (or
    gains) memory, the relevant state is N_HIGH_MEMORY. So we should
    invoke the update function after node_states[N_HIGH_MEMORY] changes,
    just as its commit message says.

    This patch fixes it, and we use a memory hotplug notifier instead of
    calling cpuset_track_online_nodes() directly (see the sketch below).

    Signed-off-by: Miao Xie
    Acked-by: Yasunori Goto
    Cc: David Rientjes
    Cc: Paul Menage
    Signed-off-by: Linus Torvalds

    Miao Xie
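
    A sketch of the notifier-based hookup; the callback body is elided
    and the priority value is illustrative:

        #include <linux/init.h>
        #include <linux/memory.h>
        #include <linux/notifier.h>

        /* Sketch: react to memory hotplug, i.e. to changes of
         * node_states[N_HIGH_MEMORY], rather than to N_ONLINE. */
        static int track_online_nodes_sketch(struct notifier_block *self,
                                             unsigned long action, void *arg)
        {
                /* recompute top_cpuset's mems from
                 * node_states[N_HIGH_MEMORY] here */
                return NOTIFY_OK;
        }

        static int __init register_sketch(void)
        {
                hotplug_memory_notifier(track_online_nodes_sketch, 10);
                return 0;
        }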
     

18 Nov, 2008

1 commit

  • Impact: properly rebuild sched-domains on kmalloc() failure

    When cpuset fails to generate sched domains due to a kmalloc()
    failure, the scheduler should fall back to the single partition
    'fallback_doms' and rebuild the sched domains, but currently it only
    destroys the sched domains without rebuilding them (the two calls
    involved are sketched below).

    The regression was introduced by:

    | commit dfb512ec4834116124da61d6c1ee10fd0aa32bd6
    | Author: Max Krasnyansky
    | Date: Fri Aug 29 13:11:41 2008 -0700
    |
    | sched: arch_reinit_sched_domains() must destroy domains to force rebuild

    After the above commit, partition_sched_domains(0, NULL, NULL) will
    only destroy sched domains and partition_sched_domains(1, NULL, NULL)
    will create the default sched domain.

    Signed-off-by: Li Zefan
    Cc: Max Krasnyansky
    Cc:
    Signed-off-by: Ingo Molnar

    Li Zefan
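
    The two calls the fix distinguishes, in sketch form:

        #include <linux/sched.h>

        static void fallback_demo(void)
        {
                partition_sched_domains(0, NULL, NULL);  /* destroy only */
                partition_sched_domains(1, NULL, NULL);  /* also install the
                                                          * default domain */
        }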
     

20 Oct, 2008

2 commits

  • 1) seq_file expects that m->count == m->size when its buffer is full,
    so the current code causes bugs when the buffer overflows.

    2) It is not good for cpuset to access struct seq_file's fields
    directly.

    Signed-off-by: Lai Jiangshan
    Cc: Alexey Dobriyan
    Acked-by: Paul Menage
    Cc: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lai Jiangshan
     
  • Remove the use of the int variable 'cpus_nonempty' from the
    update_flag() function.

    Signed-off-by: Md.Rakib H. Mullick
    Acked-by: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rakib Mullick
     

03 Oct, 2008

1 commit

  • This fixes a warning on latest -tip:

    kernel/cpuset.c: In function 'scan_for_empty_cpusets':
    kernel/cpuset.c:1932: warning: passing argument 1 of 'list_add_tail' discards qualifiers from pointer target type

    Actually the struct cpuset *root passed as a parameter to
    scan_for_empty_cpusets() is not supposed to be const, since an entry
    is added at the tail of its list. Just correct the qualifier.

    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     

14 Sep, 2008

1 commit

  • After the patch:

    commit 0b2f630a28d53b5a2082a5275bc3334b10373508
    Author: Miao Xie
    Date: Fri Jul 25 01:47:21 2008 -0700

    cpusets: restructure the function update_cpumask() and update_nodemask()

    It might happen that 'echo 0 > /cpuset/sub/cpus' returned failure
    while 'cpus' had nevertheless been changed, because cpus was changed
    before calling heap_init(), which may return -ENOMEM.

    This patch restores the original behavior.

    Signed-off-by: Li Zefan
    Acked-by: Paul Menage
    Cc: Paul Jackson
    Cc: Miao Xie
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
     

14 Aug, 2008

1 commit

  • This is an updated version of my previous cpuset patch on top of
    the latest mainline git.
    The patch fixes CPU hotplug handling issues in the current cpusets
    code, namely circular locking in rebuild_sched_domains() and unsafe
    access to the cpu_online_map in the cpuset cpu hotplug handler.

    This version includes changes suggested by Paul Jackson (naming, comments,
    style, etc). I also got rid of the separate workqueue thread because it is
    now safe to call get_online_cpus() from workqueue callbacks.

    Here are some more details:

    rebuild_sched_domains() is the only way to rebuild sched domains
    correctly based on the current cpuset settings. What this means
    is that we need to be able to call it from different contexts,
    like cpu hotplug for example.
    Also latest scheduler code in -tip now calls rebuild_sched_domains()
    directly from functions like arch_reinit_sched_domains().

    In order to support that properly we need to rework cpuset locking
    rules to avoid circular dependencies, which is what this patch does.
    New lock nesting rules are explained in the comments.
    We can now safely call rebuild_sched_domains() from virtually any
    context. The only requirement is that it must be called under
    get_online_cpus() (see the sketch at the end of this entry). This
    allows cpu hotplug handlers and the scheduler to call
    rebuild_sched_domains() directly.
    The rest of the cpuset code now offloads sched domains rebuilds to
    a workqueue (async_rebuild_sched_domains()).

    This version of the patch addresses comments from the previous
    review. I fixed all misformatted comments and trailing spaces.

    I also factored out the code that builds domain masks and split up CPU and
    memory hotplug handling. This was needed to simplify locking, to avoid unsafe
    access to the cpu_online_map from mem hotplug handler, and in general to make
    things cleaner.

    The patch passes moderate testing (building kernel with -j 16, creating &
    removing domains and bringing cpus off/online at the same time) on the
    quad-core2 based machine.

    It passes lockdep checks, even with preemptable RCU enabled.
    This time I also tested it with the suspend/resume path, and
    everything works as expected.

    Signed-off-by: Max Krasnyansky
    Acked-by: Paul Jackson
    Cc: menage@google.com
    Cc: a.p.zijlstra@chello.nl
    Cc: vegard.nossum@gmail.com
    Signed-off-by: Ingo Molnar

    Max Krasnyansky
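
    The resulting calling convention, in sketch form:

        #include <linux/cpu.h>
        #include <linux/cpuset.h>

        /* Sketch: any context may rebuild the domains synchronously,
         * provided it pins CPU hotplug around the call. */
        static void rebuild_domains_demo(void)
        {
                get_online_cpus();
                rebuild_sched_domains();
                put_online_cpus();
        }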
     

31 Jul, 2008

4 commits

  • Use cpuset.stack_list rather than kfifo, so we avoid memory allocation
    for kfifo.

    Signed-off-by: Li Zefan
    Signed-off-by: Lai Jiangshan
    Cc: Paul Menage
    Cc: Cedric Le Goater
    Cc: Balbir Singh
    Cc: KAMEZAWA Hiroyuki
    Cc: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
     
  • When multiple cpusets overlap in their 'cpus' and hence form a single
    sched domain, the largest sched_relax_domain_level among them should
    be used. But when top_cpuset's sched_load_balance is set, its
    sched_relax_domain_level is used regardless of the sub-cpusets'.

    This patch fixes it by walking the cpuset hierarchy to find the largest
    sched_relax_domain_level.

    Signed-off-by: Lai Jiangshan
    Signed-off-by: Li Zefan
    Cc: Paul Menage
    Cc: Cedric Le Goater
    Cc: Balbir Singh
    Cc: KAMEZAWA Hiroyuki
    Reviewed-by: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
     
  • All child cpusets contain a subset of the parent's cpus, so we can
    skip them when partitioning sched domains. This decreases 'csa'
    greatly for cpusets with a multi-level hierarchy.

    Signed-off-by: Lai Jiangshan
    Signed-off-by: Li Zefan
    Cc: Paul Menage
    Cc: Cedric Le Goater
    Cc: Balbir Singh
    Cc: KAMEZAWA Hiroyuki
    Reviewed-by: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lai Jiangshan
     
  • Clean up the hierarchy traversal code.

    Signed-off-by: Li Zefan
    Cc: Paul Menage
    Cc: Cedric Le Goater
    Cc: Balbir Singh
    Cc: KAMEZAWA Hiroyuki
    Cc: Paul Jackson
    Cc: Cliff Wickman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
     

26 Jul, 2008

4 commits

  • In cpuset_update_task_memory_state(), the local variable tsk is set
    to current (struct task_struct *tsk = current;).

    tsk is used 14 times and task_cs(tsk) appears twice in this function,
    so using task_cs(tsk) instead of task_cs(current) is better for
    readability.

    Likewise, "(struct cgroup_scanner *)&scan" is not good for
    readability; note that cpuset_do_move_task() uses container_of()
    rather than a "(cpuset_hotplug_scanner *)scan" cast.

    Signed-off-by: Lai Jiangshan
    Acked-by: Paul Menage
    Cc: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lai Jiangshan
     
  • cgroup_scan_tasks() will initialize heap->gt for us, so this patch
    removes started_after() and its helper function.

    Signed-off-by: Lai Jiangshan
    Acked-by: Paul Menage
    Cc: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lai Jiangshan
     
  • I created lots of empty cpusets (with empty cpumasks) and turned off
    "sched_load_balance" in the top cpuset.

    I found that all these empty cpumasks were passed to
    partition_sched_domains() in rebuild_sched_domains(); this is very
    time-consuming for partition_sched_domains() and is unnecessary.

    Skipping them also reduces memory consumption and some of the work in
    rebuild_sched_domains().

    Signed-off-by: Lai Jiangshan
    Acked-by: Paul Menage
    Cc: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lai Jiangshan
     
  • When changing 'sched_relax_domain_level', don't rebuild sched domains if
    'cpus' is empty or 'sched_load_balance' is not set.

    Also make the comments of rebuild_sched_domains() more readable.

    Signed-off-by: Li Zefan
    Cc: Hidetoshi Seto
    Cc: Paul Jackson
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan