27 May, 2011

1 commit

  • Add cgroup subsystem callbacks for per-thread attachment in atomic contexts

    Add can_attach_task(), pre_attach(), and attach_task() as new callbacks
    for the cgroup subsystem interface. Unlike can_attach and attach, these
    are for per-thread operations, to be called potentially many times when
    attaching an entire threadgroup.

    Also, the old "bool threadgroup" interface is removed, as it is replaced
    by this. All subsystems are modified for the new interface - of note is
    cpuset, which requires from/to nodemasks for attach to be globally scoped
    (though per-cpuset would work too) to persist from its pre_attach to
    attach_task and attach.
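
    As a rough userspace illustration of the calling pattern described above,
    the sketch below models one possible ordering: can_attach once, then
    can_attach_task per thread, then pre_attach, attach_task per thread, and
    attach. The struct layouts, the attach_threadgroup() helper and the
    counting_ops example subsystem are hypothetical and much simpler than the
    kernel's; the exact ordering is an assumption, not taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel's task and cgroup. */
struct task { int pid; };
struct cgroup { int nr_tasks; };

/* Model of the callback set; every member is optional (may be NULL). */
struct subsys_ops {
    int  (*can_attach)(struct cgroup *cgrp, struct task *leader);
    int  (*can_attach_task)(struct cgroup *cgrp, struct task *tsk);
    void (*pre_attach)(struct cgroup *cgrp);
    void (*attach_task)(struct cgroup *cgrp, struct task *tsk);
    void (*attach)(struct cgroup *cgrp, struct task *leader);
};

/*
 * Attach threads[0..n) to cgrp, threads[0] being the group leader.
 * Returns 0, or the first non-zero value from a can_* callback, in which
 * case none of the attach-phase callbacks has run.
 */
int attach_threadgroup(const struct subsys_ops *ops, struct cgroup *cgrp,
                       struct task *threads, size_t n)
{
    size_t i;
    int ret;

    assert(ops && cgrp && threads && n > 0);

    /* Phase 1: ask for permission, once per group and once per thread. */
    if (ops->can_attach) {
        ret = ops->can_attach(cgrp, &threads[0]);
        if (ret)
            return ret;
    }
    for (i = 0; ops->can_attach_task && i < n; i++) {
        ret = ops->can_attach_task(cgrp, &threads[i]);
        if (ret)
            return ret;
    }

    /* Phase 2: commit; these callbacks return void in this model. */
    if (ops->pre_attach)
        ops->pre_attach(cgrp);
    for (i = 0; ops->attach_task && i < n; i++)
        ops->attach_task(cgrp, &threads[i]);
    if (ops->attach)
        ops->attach(cgrp, &threads[0]);
    return 0;
}

/* Example subsystem that counts tasks and rejects non-positive pids. */
static int count_can_attach_task(struct cgroup *cgrp, struct task *tsk)
{
    (void)cgrp;
    return tsk->pid > 0 ? 0 : -EINVAL;
}

static void count_attach_task(struct cgroup *cgrp, struct task *tsk)
{
    (void)tsk;
    cgrp->nr_tasks++;
}

const struct subsys_ops counting_ops = {
    .can_attach_task = count_can_attach_task,
    .attach_task = count_attach_task,
};
```

    With counting_ops, attaching a three-thread group bumps nr_tasks by three,
    while a group containing a rejected thread leaves the count untouched
    because the permission phase fails before anything is committed.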

    This is a pre-patch for cgroup-procs-writable.patch.

    Signed-off-by: Ben Blum
    Cc: "Eric W. Biederman"
    Cc: Li Zefan
    Cc: Matt Helsley
    Reviewed-by: Paul Menage
    Cc: Oleg Nesterov
    Cc: David Rientjes
    Cc: Miao Xie
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ben Blum
     

28 Oct, 2010

3 commits

  • There are 4 state transitions possible for a freezer. Only the FREEZING ->
    FROZEN transition is done lazily. This patch allows update_freezer_state
    to perform only this transition and renames the function to
    update_if_frozen.

    Moreover, the is_task_frozen_enough function is removed and every
    occurrence of it is replaced with frozen(). Therefore, for a group to
    become FROZEN, every task must be frozen.

    The previous version could trigger the following bug: when a cgroup is in
    the process of freezing (but none of its tasks are frozen yet),
    update_freezer_state() (called from freezer_read or freezer_write) would
    incorrectly report that the group is 'THAWED' (because nfrozen = 0),
    allowing the transition FREEZING -> THAWED without anything being written
    to 'freezer.state'. This is incorrect according to the documentation and
    could result in a 'THAWED' cgroup with frozen tasks inside.
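
    The intended rule can be sketched in userspace as follows. This is only
    a model under stated assumptions: struct model_freezer and its fields are
    hypothetical, and the per-task frozen() test is reduced to a bool array.
    The point is that the helper may only move FREEZING to FROZEN, and only
    when every task is frozen; it never produces THAWED.

```c
#include <stdbool.h>
#include <stddef.h>

enum freezer_state { CGROUP_THAWED, CGROUP_FREEZING, CGROUP_FROZEN };

/* Hypothetical model: one flag per task, true if that task is frozen. */
struct model_freezer {
    enum freezer_state state;
    const bool *task_frozen;
    size_t ntasks;
};

/* Lazily complete FREEZING -> FROZEN; leave every other state alone. */
void update_if_frozen(struct model_freezer *f)
{
    size_t i;

    if (f->state != CGROUP_FREEZING)
        return;
    for (i = 0; i < f->ntasks; i++) {
        if (!f->task_frozen[i])
            return;     /* at least one task still running */
    }
    f->state = CGROUP_FROZEN;
}
```

    In this model a FREEZING group with no frozen tasks stays FREEZING, which
    is exactly the case the old code misreported as THAWED.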

    Code to reproduce this bug is available here:
    http://pentium.hopto.org/~thinred/repos/linux-misc/freezer_bug2.c

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Tomasz Buchert
    Cc: Matt Helsley
    Cc: Paul Menage
    Cc: Li Zefan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tomasz Buchert
     
  • It is possible to move a task from its cgroup even if this group is
    'FREEZING'. This results in a nasty bug - the moved task will become
    frozen OUTSIDE its original cgroup and will remain in a permanent 'D'
    state.

    This patch allows a task to be migrated only between THAWED cgroups.
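
    A minimal sketch of such a check, written as a plain userspace function.
    The name model_can_attach() and the choice of -EBUSY as the error are
    assumptions made for illustration, not a copy of the patched code.

```c
#include <errno.h>

enum freezer_state { CGROUP_THAWED, CGROUP_FREEZING, CGROUP_FROZEN };

/*
 * Allow a move only if both the source and the destination cgroup are
 * THAWED; FREEZING and FROZEN on either side refuse the migration.
 */
int model_can_attach(enum freezer_state src, enum freezer_state dst)
{
    if (src != CGROUP_THAWED || dst != CGROUP_THAWED)
        return -EBUSY;
    return 0;
}
```

    Refusing FREEZING as well as FROZEN is what closes the race described
    above, since a task can no longer leave a group that is about to freeze
    it.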

    This behavior was observed and easily reproduced on a single core laptop.
    Note that reproducibility depends highly on the machine used. A program
    and instructions on how to reproduce the bug can be fetched from:
    http://pentium.hopto.org/~thinred/repos/linux-misc/freezer_bug.c

    Signed-off-by: Tomasz Buchert
    Cc: Matt Helsley
    Cc: Paul Menage
    Cc: Li Zefan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tomasz Buchert
     
  • The root freezer_state is always CGROUP_THAWED so we can remove the
    special case from the code. The test itself can be handy and is extracted
    into a static function.

    Signed-off-by: Tomasz Buchert
    Cc: Matt Helsley
    Cc: Paul Menage
    Cc: Li Zefan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tomasz Buchert
     

11 May, 2010

1 commit


30 Apr, 2010

1 commit

  • Add an RCU read-side critical section to suppress this false
    positive.

    Located-by: Eric Paris
    Signed-off-by: Paul E. McKenney
    Acked-by: Li Zefan
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    Cc: eric.dumazet@gmail.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     

05 Apr, 2010

1 commit


30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include those
    headers directly instead of assuming availability. As this conversion
    needs to touch a large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surrounding. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

27 Mar, 2010

1 commit

  • When the cgroup freezer is used to freeze tasks we do not want to thaw
    those tasks during resume. Currently we test the cgroup freezer
    state of the resuming tasks to see if the cgroup is FROZEN. If so
    then we don't thaw the task. However, the FREEZING state also indicates
    that the task should remain frozen.

    This also avoids a problem pointed out by Oren Ladaan: the freezer state
    transition from FREEZING to FROZEN is updated lazily when userspace reads
    or writes the freezer.state file in the cgroup filesystem. This means that
    resume will thaw tasks in cgroups which should be in the FROZEN state if
    there is no read/write of the freezer.state file to trigger this
    transition before suspend.

    NOTE: Another "simple" solution would be to always update the cgroup
    freezer state during resume. However it's a bad choice for several reasons:
    Updating the cgroup freezer state is somewhat expensive because it requires
    walking all the tasks in the cgroup and checking if they are each frozen.
    Worse, this could easily make resume run in N^2 time where N is the number
    of tasks in the cgroup. Finally, updating the freezer state from this code
    path requires trickier locking because of the way locks must be ordered.

    Instead of updating the freezer state we rely on the fact that lazy
    updates only manage the transition from FREEZING to FROZEN. We know that
    a cgroup with the FREEZING state may actually be FROZEN so test for that
    state too. This makes sense in the resume path even for partially-frozen
    cgroups -- those that really are FREEZING but not FROZEN.
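
    The resume-side decision described above can be sketched as a small
    predicate. This is a userspace model only; resume_should_thaw() is a
    hypothetical name, and the real check operates on a task rather than on
    a bare state value.

```c
#include <stdbool.h>

enum freezer_state { CGROUP_THAWED, CGROUP_FREEZING, CGROUP_FROZEN };

/* True if the cgroup freezer still claims the tasks in this cgroup. */
static bool freezing_or_frozen(enum freezer_state s)
{
    return s == CGROUP_FREEZING || s == CGROUP_FROZEN;
}

/*
 * Resume must not thaw a task whose cgroup is FREEZING or FROZEN.  A
 * FREEZING cgroup may in fact already be fully frozen, because the
 * FREEZING -> FROZEN update only happens lazily.
 */
bool resume_should_thaw(enum freezer_state cgroup_state)
{
    return !freezing_or_frozen(cgroup_state);
}
```

    Testing two states instead of refreshing the state keeps the resume path
    free of the per-cgroup task walk discussed in the NOTE.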

    Reported-by: Oren Ladaan
    Signed-off-by: Matt Helsley
    Cc: stable@kernel.org
    Signed-off-by: Rafael J. Wysocki

    Matt Helsley
     

24 Sep, 2009

1 commit

  • Alter the ss->can_attach and ss->attach functions to be able to deal with
    a whole threadgroup at a time, for use in cgroup_attach_proc. (This is a
    pre-patch to cgroup-procs-writable.patch.)

    Currently, the new mode of the attach function can only tell the
    subsystem about the old cgroup of the threadgroup leader. No subsystem
    currently needs that information for each thread that's being moved, but
    if one were to be added (for example, one that counts tasks within a
    group), this part would need to be reworked to tell the subsystem the
    right information.

    [hidave.darkstar@gmail.com: fix build]
    Signed-off-by: Ben Blum
    Signed-off-by: Paul Menage
    Acked-by: Li Zefan
    Reviewed-by: Matt Helsley
    Cc: "Eric W. Biederman"
    Cc: Oleg Nesterov
    Cc: Peter Zijlstra
    Cc: Ingo Molnar
    Cc: Dave Young
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ben Blum
     

13 Nov, 2008

2 commits

  • With this change, the control file 'freezer.state' doesn't exist in the
    root cgroup, making the root cgroup unfreezable.

    I think it's reasonable to disallow freezing tasks in the root cgroup. And
    then we can avoid fork overhead when the freezer subsystem is compiled in
    but not used.

    Also make writing an invalid value to freezer.state return EINVAL rather
    than EIO. This is more consistent with other cgroup subsystems.
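
    The error-code change can be illustrated with a small userspace parser.
    This is a sketch under assumptions: parse_goal_state() is a hypothetical
    name, the accepted strings "THAWED" and "FROZEN" follow the state names
    used elsewhere in this log, and stripping one trailing newline is a
    convenience added here for echo-style writes.

```c
#include <errno.h>
#include <string.h>

enum freezer_state { CGROUP_THAWED, CGROUP_FREEZING, CGROUP_FROZEN };

/*
 * Translate the text written to freezer.state into a goal state.
 * Returns 0 and sets *goal on success; returns -EINVAL and leaves *goal
 * untouched for anything else, including "FREEZING".
 */
int parse_goal_state(const char *buf, enum freezer_state *goal)
{
    size_t len = strlen(buf);

    if (len > 0 && buf[len - 1] == '\n')
        len--;          /* tolerate the newline added by echo */

    if (len == 6 && strncmp(buf, "THAWED", 6) == 0) {
        *goal = CGROUP_THAWED;
        return 0;
    }
    if (len == 6 && strncmp(buf, "FROZEN", 6) == 0) {
        *goal = CGROUP_FROZEN;
        return 0;
    }
    return -EINVAL;
}
```

    Returning -EINVAL marks the input itself as bad, whereas EIO would
    suggest a failure of the underlying I/O.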

    Signed-off-by: Li Zefan
    Acked-by: Paul Menage
    Cc: Cedric Le Goater
    Cc: Paul Menage
    Cc: Matt Helsley
    Cc: "Serge E. Hallyn"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
     
  • In theory the task can be moved to another cgroup and the freezer will be
    freed right after task_lock is dropped, so the lock results in zero
    protection.

    But in the case of freezer_fork() no lock is needed, since the task is not
    in tasklist yet so it won't be moved to another cgroup, so task->cgroups
    won't be changed or invalidated.

    Signed-off-by: Li Zefan
    Cc: Matt Helsley
    Cc: Cedric Le Goater
    Cc: "Serge E. Hallyn"
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
     

31 Oct, 2008

4 commits


20 Oct, 2008

4 commits

  • check_if_frozen() sounds like it should return something when in fact it's
    just updating the freezer state.

    Signed-off-by: Matt Helsley
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matt Helsley
     
  • Rename cgroup freezer states to be less generic to avoid any name
    collisions while also better describing what each state is.

    Signed-off-by: Matt Helsley
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matt Helsley
     
  • Don't let frozen tasks or cgroups change. This means frozen tasks can't
    leave their current cgroup for another cgroup. It also means that tasks
    cannot be added to or removed from a cgroup in the FROZEN state. We
    enforce these rules by checking for frozen tasks and cgroups in the
    can_attach() function.

    Signed-off-by: Matt Helsley
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matt Helsley
     
  • This patch implements a new freezer subsystem in the control groups
    framework. It provides a way to stop and resume execution of all tasks in
    a cgroup by writing in the cgroup filesystem.

    The freezer subsystem in the container filesystem defines a file named
    freezer.state. Writing "FROZEN" to the state file will freeze all tasks
    in the cgroup. Subsequently writing "RUNNING" will unfreeze the tasks in
    the cgroup. Reading will return the current state.

    * Examples of usage :

    # mkdir /containers
    # mount -t cgroup -ofreezer freezer /containers
    # mkdir /containers/0
    # echo $some_pid > /containers/0/tasks

    to get status of the freezer subsystem :

    # cat /containers/0/freezer.state
    RUNNING

    to freeze all tasks in the container :

    # echo FROZEN > /containers/0/freezer.state
    # cat /containers/0/freezer.state
    FREEZING
    # cat /containers/0/freezer.state
    FROZEN

    to unfreeze all tasks in the container :

    # echo RUNNING > /containers/0/freezer.state
    # cat /containers/0/freezer.state
    RUNNING

    This is the basic mechanism which should do the right thing for user space
    tasks in a simple scenario.

    It's important to note that freezing can be incomplete. In that case we
    return EBUSY. This means that some tasks in the cgroup are busy doing
    something that prevents us from completely freezing the cgroup at this
    time. After EBUSY, the cgroup will remain partially frozen -- reflected
    by freezer.state reporting "FREEZING" when read. The state will remain
    "FREEZING" until one of these things happens:

    1) Userspace cancels the freezing operation by writing "RUNNING" to
    the freezer.state file
    2) Userspace retries the freezing operation by writing "FROZEN" to
    the freezer.state file (writing "FREEZING" is not legal
    and returns EIO)
    3) The tasks that blocked the cgroup from entering the "FROZEN"
    state disappear from the cgroup's set of tasks.
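
    Option 2 above (retrying the freeze) could look roughly like this from
    userspace. It is a sketch, not part of the patch: freeze_with_retry(),
    its parameters and the fixed delay between attempts are all assumptions,
    and only standard POSIX calls are used.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Write "FROZEN" to the freezer.state file at state_path, retrying while
 * the write fails with EBUSY.  Returns 0 on success, -EBUSY if every
 * attempt was busy, or another negative errno value on failure.
 */
int freeze_with_retry(const char *state_path, int max_tries,
                      unsigned int delay_sec)
{
    static const char frozen[] = "FROZEN";
    const size_t len = strlen(frozen);
    int tries;

    if (max_tries < 1)
        return -EINVAL;

    for (tries = 0; tries < max_tries; tries++) {
        ssize_t n;
        int err;
        int fd = open(state_path, O_WRONLY);

        if (fd < 0)
            return -errno;
        n = write(fd, frozen, len);
        err = errno;        /* save before close() can overwrite it */
        close(fd);

        if (n == (ssize_t)len)
            return 0;
        if (n >= 0)
            return -EIO;    /* unexpected short write */
        if (err != EBUSY)
            return -err;
        if (tries + 1 < max_tries)
            sleep(delay_sec);
    }
    return -EBUSY;
}
```

    For the layout used in the examples above, a caller might try
    freeze_with_retry("/containers/0/freezer.state", 5, 1) and, if it still
    returns -EBUSY, write "RUNNING" to cancel the operation (option 1).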

    [akpm@linux-foundation.org: coding-style fixes]
    [akpm@linux-foundation.org: export thaw_process]
    Signed-off-by: Cedric Le Goater
    Signed-off-by: Matt Helsley
    Acked-by: Serge E. Hallyn
    Tested-by: Matt Helsley
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matt Helsley