08 Sep, 2013

1 commit


04 Sep, 2013

1 commit

  • Pull cgroup updates from Tejun Heo:
    "A lot of activities on the cgroup front. Most changes aren't visible
    to userland at all at this point and are laying foundation for the
    planned unified hierarchy.

    - The biggest change is decoupling the lifetime management of css
    (cgroup_subsys_state) from that of cgroup's. Because controllers
    (cpu, memory, block and so on) will need to be dynamically enabled
    and disabled, css which is the association point between a cgroup
    and a controller may come and go dynamically across the lifetime of
    a cgroup. Till now, css's were created when the associated cgroup
    was created and stayed till the cgroup got destroyed.

    Assumptions around this tight coupling permeated through cgroup
    core and controllers. These assumptions are gradually removed,
    which makes up the bulk of the patches, and css destruction path is
    completely decoupled from cgroup destruction path. Note that
    decoupling of creation path is relatively easy on top of these
    changes and the patchset is pending for the next window.

    - cgroup has its own event mechanism cgroup.event_control, which is
    only used by memcg. It is overly complex trying to achieve high
    flexibility whose benefits seem dubious at best. Going forward,
    new events will simply generate a file modified event and the
    existing mechanism is being made specific to memcg. This pull
    request contains preparatory patches for such change.

    - Various fixes and cleanups"

    Fixed up conflict in kernel/cgroup.c as per Tejun.

    * 'for-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (69 commits)
    cgroup: fix cgroup_css() invocation in css_from_id()
    cgroup: make cgroup_write_event_control() use css_from_dir() instead of __d_cgrp()
    cgroup: make cgroup_event hold onto cgroup_subsys_state instead of cgroup
    cgroup: implement CFTYPE_NO_PREFIX
    cgroup: make cgroup_css() take cgroup_subsys * instead and allow NULL subsys
    cgroup: rename cgroup_css_from_dir() to css_from_dir() and update its syntax
    cgroup: fix cgroup_write_event_control()
    cgroup: fix subsystem file accesses on the root cgroup
    cgroup: change cgroup_from_id() to css_from_id()
    cgroup: use css_get() in cgroup_create() to check CSS_ROOT
    cpuset: remove an unncessary forward declaration
    cgroup: RCU protect each cgroup_subsys_state release
    cgroup: move subsys file removal to kill_css()
    cgroup: factor out kill_css()
    cgroup: decouple cgroup_subsys_state destruction from cgroup destruction
    cgroup: replace cgroup->css_kill_cnt with ->nr_css
    cgroup: bounce cgroup_subsys_state ref kill confirmation to a work item
    cgroup: move cgroup->subsys[] assignment to online_css()
    cgroup: reorganize css init / exit paths
    cgroup: add __rcu modifier to cgroup->subsys[]
    ...

    Linus Torvalds
     

29 Aug, 2013

1 commit

  • On 3.11-rc we are seeing cgroup directories left behind when they should
    have been removed. Here's a trivial reproducer:

    cd /sys/fs/cgroup/memory
    mkdir parent parent/child; rmdir parent/child parent
    rmdir: failed to remove `parent': Device or resource busy

    It's because cgroup_destroy_locked() (step 1 of destruction) leaves
    cgroup on parent's children list, letting cgroup_offline_fn() (step 2 of
    destruction) remove it; but step 2 is run by work queue, which may not
    yet have removed the children when parent destruction checks the list.

    Fix that by checking through a non-empty list of children: if every one
    of them has already been marked CGRP_DEAD, then it's safe to proceed:
    those children are invisible to userspace, and should not obstruct rmdir.

    (I didn't see any reason to keep the cgrp->children checks under the
    unrelated css_set_lock, so moved them out.)

    tj: Flattened nested ifs a bit and updated comment so that it's
    correct on both for-3.11-fixes and for-3.12.

    Signed-off-by: Hugh Dickins
    Signed-off-by: Tejun Heo

    Hugh Dickins
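
The check described in the commit above can be sketched as a toy userspace model. This is only an illustration of the rule "children already marked dead do not block rmdir"; the class and function names are hypothetical, and the `dead` field merely stands in for CGRP_DEAD.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cgroup:
    """Toy stand-in for struct cgroup; only the fields the check needs."""
    name: str
    dead: bool = False  # models the CGRP_DEAD flag
    children: List["Cgroup"] = field(default_factory=list)


def has_live_children(cgrp: Cgroup) -> bool:
    """True if at least one child has not been marked dead yet."""
    return any(not child.dead for child in cgrp.children)


def try_rmdir(cgrp: Cgroup) -> str:
    """Return "ok" or "EBUSY", mirroring step 1 of destruction."""
    if has_live_children(cgrp):
        return "EBUSY"
    # Step 1 only marks the cgroup dead. In this model, as in the commit
    # description, unlinking from the parent's list is left to step 2.
    cgrp.dead = True
    return "ok"
```

In this model a dead child stays on the parent's list, yet the parent can still be removed, which is the behaviour the fix aims for.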
     

28 Aug, 2013

1 commit

  • ca8bdcaff0 ("cgroup: make cgroup_css() take cgroup_subsys * instead
    and allow NULL subsys") missed one conversion in css_from_id(), which
    was newly added. As css_from_id() doesn't have any user yet, this
    doesn't break anything other than generating a build warning.

    Convert it.

    Signed-off-by: Tejun Heo
    Reported-by: Stephen Rothwell
    Reported-by: kbuild test robot

    Tejun Heo
     

27 Aug, 2013

5 commits

  • cgroup_event will be moved to its only user - memcg. Replace
    __d_cgrp() usage with css_from_dir(), which is already exported. This
    also simplifies the code a bit.

    Signed-off-by: Tejun Heo
    Acked-by: Li Zefan
    Acked-by: Kirill A. Shutemov

    Tejun Heo
     
  • Currently, each registered cgroup_event holds an extra reference to
    the cgroup. This is a bit weird as events are subsystem specific and
    will also be incorrect in the planned unified hierarchy as css
    (cgroup_subsys_state) may come and go dynamically across the lifetime
    of a cgroup. Holding onto cgroup won't prevent the target css from
    going away.

    Update cgroup_event to hold onto the css the target file belongs to
    instead of cgroup.

    Signed-off-by: Tejun Heo
    Acked-by: Li Zefan
    Acked-by: Kirill A. Shutemov

    Tejun Heo
     
  • When cgroup files are created, cgroup core automatically prepends the
    name of the subsystem as prefix. This patch adds CFTYPE_NO_PREFIX which
    disables the automatic prefix. This is to work around historical
    baggage and shouldn't be used for new files.

    This will be used to move "cgroup.event_control" from cgroup core to
    memcg.

    Signed-off-by: Tejun Heo
    Acked-by: Li Zefan
    Acked-by: Kirill A. Shutemov
    Cc: Glauber Costa

    Tejun Heo
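
A minimal sketch of the naming rule from the commit above, assuming the prefix is joined to the file name with a dot and that core files carry no subsystem. The function name and the flag's bit value are illustrative only and are not taken from this log.

```python
CFTYPE_NO_PREFIX = 1 << 3  # illustrative bit value (assumption)


def cgroup_file_name(subsys_name, cft_name, flags=0):
    """Build the visible file name for a cftype.

    subsys_name is None for cgroup core files (assumption: no subsystem,
    hence no prefix). Otherwise the subsystem name is prepended unless
    CFTYPE_NO_PREFIX is set in flags.
    """
    if subsys_name and not (flags & CFTYPE_NO_PREFIX):
        return f"{subsys_name}.{cft_name}"
    return cft_name
```

With the flag set, a memcg-owned file could keep the historical name "cgroup.event_control" instead of becoming "memory.cgroup.event_control".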
     
  • cgroup_css() is no longer used in hot paths. Make it take struct
    cgroup_subsys * and allow the users to specify NULL subsys to obtain
    the dummy_css. This removes open-coded NULL subsystem testing in a
    couple users and generally simplifies the code.

    After this patch, css_from_dir() also allows NULL @ss and returns the
    matching dummy_css. This behavior change doesn't affect its only user
    - perf.

    Signed-off-by: Tejun Heo
    Acked-by: Li Zefan
    Acked-by: Kirill A. Shutemov

    Tejun Heo
     
  • cgroup_css_from_dir() will grow another user. In preparation, make
    the following changes.

    * All css functions are prefixed with just "css_", rename it to
    css_from_dir().

    * Take dentry * instead of file * as dentry is what ultimately
    identifies a cgroup and file may not always be available. Note that
    the function now checks whether @dentry->d_inode is NULL as the
    caller now may specify a negative dentry.

    * Make it take cgroup_subsys * instead of integer subsys_id. This
    simplifies the function and allows specifying no subsystem for
    cgroup->dummy_css.

    * Make return section a bit less verbose.

    This patch doesn't introduce any behavior changes.

    Signed-off-by: Tejun Heo
    Acked-by: Li Zefan
    Acked-by: Kirill A. Shutemov
    Cc: Steven Rostedt
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar

    Tejun Heo
     

19 Aug, 2013

3 commits

  • 81eeaf0411 ("cgroup: make cftype->[un]register_event() deal with
    cgroup_subsys_state instead of cgroup") updated the cftype event
    methods to take @css (cgroup_subsys_state) instead of @cgroup;
    however, it incorrectly used @css passed to
    cgroup_write_event_control(), which is the dummy_css for the cgroup as
    the file is a cgroup core file. This leads to oops on event
    registration.

    Fix it by using the css matching the event target file. Note that
    cgroup_write_event_control() now disallows cgroup core files from
    being event sources. This is for simplicity and doesn't matter as
    cgroup_event will be moved and made specific to memcg.

    Signed-off-by: Tejun Heo
    Acked-by: Li Zefan

    Tejun Heo
     
  • 105347ba5 ("cgroup: make cgroup_file_open() rcu_read_lock() around
    cgroup_css() and add cfent->css") added cfent->css to cache the
    associated cgroup_subsys_state across file operations.

    A cfent is associated with single css throughout its lifetime and the
    original commit initialized the cache pointer during cgroup_add_file()
    and verified that it matches the actual one in cgroup_file_open().
    While this works fine for !root cgroups, it's broken for root cgroups
    as files in a root cgroup are created before the css's are associated
    with the cgroup and thus cgroup_css() call in cgroup_add_file()
    returns NULL associating all cfents in the root cgroup with NULL css.
    This makes cgroup_file_open() trigger WARN and fail with -ENODEV for
    all !core subsystem files in the root cgroups.

    There's no reason to initialize cfent->css separately from
    cgroup_add_file(). As the association never changes,
    cgroup_file_open() can set it unconditionally every time and
    containing the logic in cgroup_file_open() makes more sense anyway as
    the only reason it's necessary is file->private_data being already
    occupied.

    Fix it by setting cfent->css unconditionally from cgroup_file_open().

    Signed-off-by: Tejun Heo
    Acked-by: Li Zefan

    Tejun Heo
     
  • Now we want cgroup core to always provide the css to use to the
    subsystems, so change this API to css_from_id().

    Uninline css_from_id(), because it's getting bigger and cgroup_css()
    has been unexported.

    While at it, remove the #ifdef, and shuffle the order of the args.

    Signed-off-by: Li Zefan
    Signed-off-by: Tejun Heo

    Li Zefan
     

16 Aug, 2013

1 commit

    It seems that the root css doesn't have a refcnt allocated (not needed?),
    which causes the boot error attached.

    This patch tries to use css_get() to not increase the refcnt if parent
    is root.

    BUG: unable to handle kernel NULL pointer dereference at (null)
    IP: [] cgroup_mkdir+0x37c/0x740
    PGD 0
    Oops: 0002 [#1]
    Modules linked in:
    CPU: 0 PID: 1 Comm: systemd Not tainted 3.11.0-rc5-next-20130815+ #1
    Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
    task: ffff88007f868000 ti: ffff88007f864000 task.ti: ffff88007f864000
    RIP: 0010:[] [] cgroup_mkdir+0x37c/0x740
    RSP: 0018:ffff88007f865df8 EFLAGS: 00010246
    RAX: 0000000000000000 RBX: ffffffff81a46ee0 RCX: 0000000000000001
    RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff81a415c0
    RBP: ffff88007f865ec8 R08: 0000000000000001 R09: 0000000000000000
    R10: ffff88007ce6d060 R11: 0000000000000000 R12: ffff88007ce6d000
    R13: ffff88007ce6d060 R14: ffffffff81a46d80 R15: ffff88007c6e8018
    FS: 00007f13dbf6f840(0000) GS:ffffffff81a23000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000000 CR3: 000000007b7e5000 CR4: 00000000000006b0
    Stack:
    ffffffff810b380d 0000000000000002 ffff88007f865e18 ffffffff81167069
    ffff88007f865ed8 ffffffff8116a3f5 ffff880037454400 ffff88007c6e8018
    ffff88007c6e8028 ffff88007c6e8328 ffff88007c6e8000 ffff88007ce6d000
    Call Trace:
    [] ? cgroup_mkdir+0x3bd/0x740
    [] ? lookup_hash+0x19/0x20
    [] ? kern_path_create+0x95/0x170
    [] vfs_mkdir+0x9e/0xf0
    [] SyS_mkdirat+0x60/0xe0
    [] SyS_mkdir+0x19/0x20
    [] tracesys+0xcf/0xd4
    Code: ad 70 ff ff ff 48 89 9d 60 ff ff ff 4d 89 d5 4c 8b bd 68 ff ff ff 4c 8b 65 88 eb 50 0f 1f 00 48 8b 43 18 a8 03 0f 85 6c 03 00 00 00 e8 1d 0a fb ff 85 c0 74 0d 80 3d f0 45 a1 00 00 0f 84 4c
    RIP [] cgroup_mkdir+0x37c/0x740
    RSP
    CR2: 0000000000000000
    ---[ end trace a4b14b49bc46fd60 ]---

    Signed-off-by: Li Zhong
    Acked-by: Li Zefan
    Signed-off-by: Tejun Heo

    Li Zhong
     

14 Aug, 2013

7 commits

  • With the planned unified hierarchy, individual css's will be created
    and destroyed dynamically across the lifetime of a cgroup. To enable
    such usages, css destruction is being decoupled from cgroup
    destruction. Most of the destruction path has been decoupled but the
    actual free of css still depends on cgroup free path.

    When all css refs are drained, css_release() kicks off
    css_free_work_fn() which puts the cgroup. When the cgroup refcnt
    reaches zero, cgroup_diput() is invoked which in turn schedules RCU
    free of the cgroup. After a grace period, all css's are freed along
    with the cgroup itself.

    This patch moves the RCU grace period and css freeing from cgroup
    release path to css release path. css_release(), instead of kicking
    off css_free_work_fn() directly, schedules RCU callback
    css_free_rcu_fn() which in turn kicks off css_free_work_fn() after an
    RCU grace period. css_free_work_fn() is updated to free the css
    directly.

    The five-way punting - percpu ref kill confirmation, a work item,
    percpu ref release, RCU grace period, and again a work item - is quite
    hairy but the work items are there only to provide process context and
    the actual sequence is kill confirm -> release -> RCU free, which
    isn't simple but not too crazy.

    This removes cgroup_css() usage after offline_css() allowing clearing
    cgroup->subsys[] from offline_css(), which makes it consistent with
    online_css() and brings it closer to proper lifetime management for
    individual css's.

    Signed-off-by: Tejun Heo
    Acked-by: Li Zefan

    Tejun Heo
     
  • With the planned unified hierarchy, individual css's will be created
    and destroyed dynamically across the lifetime of a cgroup. To enable
    such usages, css destruction is being decoupled from cgroup
    destruction. This patch moves subsys file removal from
    cgroup_destroy_locked() to kill_css().

    While this changes the order of destruction operations, the changes
    shouldn't be noticeable to cgroup subsystems or userland.

    Signed-off-by: Tejun Heo
    Acked-by: Li Zefan

    Tejun Heo
     
  • Factor out css ref killing from cgroup_destroy_locked() into
    kill_css(). We're gonna add more to the path and the factored out
    function will eventually be called from other places too.

    While at it, replace open coded percpu_ref_get() with css_get() for
    consistency. This shouldn't cause any functional difference as the
    function is not used for root cgroups.

    Signed-off-by: Tejun Heo
    Acked-by: Li Zefan

    Tejun Heo
     
  • Currently, css (cgroup_subsys_state) lifetime is tied to that of the
    associated cgroup. css's are created when the associated cgroup is
    created and destroyed when it gets destroyed. Also, individual css's
    aren't RCU protected but the whole cgroup is. With the planned
    unified hierarchy, css's will need to be dynamically created and
    destroyed within the lifetime of a cgroup.

    To enable such usages, this patch decouples css destruction from
    cgroup destruction - offline_css() invocation and the final css_put()
    are moved from cgroup_destroy_css_killed() to css_killed_work_fn().
    Now each css is individually offlined and put as its reference count
    is killed instead of waiting for all css's attached to the cgroup to
    finish refcnt killing and then proceeding to offlining and putting
    them together.

    While this changes the order of destruction operations, the changes
    shouldn't be noticeable to cgroup subsystems or userland.

    Signed-off-by: Tejun Heo
    Acked-by: Li Zefan

    Tejun Heo
     
  • Currently, css (cgroup_subsys_state) lifetime is tied to that of the
    associated cgroup. With the planned unified hierarchy, css's will be
    dynamically created and destroyed within the lifetime of a cgroup. To
    enable such usages, css's will be individually RCU protected instead
    of being tied to the cgroup.

    cgroup->css_kill_cnt is used during cgroup destruction to wait for css
    reference count disable; however, this model doesn't work once css's
    lifetimes are managed separately from cgroup's. This patch replaces
    it with cgroup->nr_css which is a cgroup_mutex protected integer
    counting the number of attached css's. The count is incremented from
    online_css() and decremented after refcnt kill is confirmed. If the
    count reaches zero and the cgroup is marked dead, the second stage of
    cgroup destruction is kicked off. If a cgroup doesn't have any css
    attached at the time of rmdir, cgroup_destroy_locked() now invokes the
    second stage directly as no css kill confirmation would happen.

    cgroup_offline_fn() - the second step of cgroup destruction - is
    renamed to cgroup_destroy_css_killed() and now expects to be called
    with cgroup_mutex held.

    While this patch changes how css destruction is punted to work items,
    it shouldn't change any visible behavior.

    Signed-off-by: Tejun Heo
    Acked-by: Li Zefan

    Tejun Heo
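
The counting scheme in the commit above can be modeled as a small state machine. This is a sketch under the rules stated in the message (increment on online, decrement on kill confirmation, second stage once the cgroup is dead and the count is zero). The class and method names are hypothetical, and locking is omitted.

```python
class CgroupModel:
    """Toy model of cgroup->nr_css driving two-stage destruction."""

    def __init__(self):
        self.nr_css = 0                 # number of attached css's
        self.dead = False               # set by the first stage (rmdir)
        self.second_stage_done = False

    def online_css(self):
        self.nr_css += 1

    def css_kill_confirmed(self):
        # Called once per css after its refcnt kill is confirmed.
        self.nr_css -= 1
        self._maybe_finish()

    def rmdir(self):
        self.dead = True
        # With no css attached, no confirmation will ever arrive, so the
        # second stage has to be checked for right here.
        self._maybe_finish()

    def _maybe_finish(self):
        if self.dead and self.nr_css == 0 and not self.second_stage_done:
            self.second_stage_done = True
```

The direct call from `rmdir()` covers the case the message points out: a cgroup with no css attached would otherwise never reach the second stage.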
     
  • css (cgroup_subsys_state) offlining, which requires process context,
    will be moved to ref kill confirmation. In preparation, bounce
    css_killed handling through css->destroy_work.

    css_ref_killed_fn() is renamed to css_killed_ref_fn() so that it's
    consistent with the new css_killed_work_fn().

    This patch adds an additional work item bouncing but doesn't change
    the actual logic.

    Signed-off-by: Tejun Heo
    Acked-by: Li Zefan

    Tejun Heo
     
  • Currently, css (cgroup_subsys_state) lifetime is tied to that of the
    associated cgroup. With the planned unified hierarchy, css's will be
    dynamically created and destroyed within the lifetime of a cgroup. To
    enable such usages, css's will be individually RCU protected instead
    of being tied to the cgroup.

    In preparation, this patch moves cgroup->subsys[] assignment from
    init_css() to online_css(). As this means that a newly initialized
    css should be remembered separately and that cgroup_css() returns NULL
    between init and online, cgroup_create() is updated so that it stores
    newly created css's in a local array css_ar[] and
    cgroup_init/load_subsys() are updated to use local variable @css
    instead of using cgroup_css(). This change also slightly simplifies
    error path of cgroup_create().

    While this patch changes when cgroup->subsys[] is initialized, this
    change isn't visible to subsystems or userland.

    v2: This patch wasn't updated accordingly after the previous "cgroup:
    reorganize css init / exit paths" was updated leading to missing a
    css_ar[] conversion in cgroup_create() and thus boot failure. Fix
    it.

    Signed-off-by: Tejun Heo
    Acked-by: Li Zefan

    Tejun Heo
     

13 Aug, 2013

7 commits

  • css (cgroup_subsys_state) lifetime management is about to be
    restructured. In preparation, make the following mostly trivial
    changes.

    * init_cgroup_css() is renamed to init_css() so that it's consistent
    with other css handling functions.

    * alloc_css_id(), online_css() and offline_css() updated to take @css
    instead of cgroups and subsys IDs.

    This patch doesn't make any functional changes.

    v2: v1 merged two for_each_root_subsys() loops in cgroup_create() but
    Li Zefan pointed out that it breaks error path. Dropped.

    Signed-off-by: Tejun Heo
    Acked-by: Li Zefan

    Tejun Heo
     
  • For the planned unified hierarchy, each css (cgroup_subsys_state) will
    be RCU protected so that it can be created and destroyed individually
    while allowing RCU accesses. Previous changes ensured that all
    cgroup->subsys[] accesses use the cgroup_css() accessor. This patch
    adds __rcu modifier to cgroup->subsys[], adds matching RCU dereference
    in cgroup_css() and converts all assignments to either
    rcu_assign_pointer() or RCU_INIT_POINTER().

    This change prepares for the actual RCUfication of css's and doesn't
    introduce any visible behavior change. The conversion is verified
    with sparse and all accesses are properly RCU annotated.

    Signed-off-by: Tejun Heo
    Acked-by: Li Zefan

    Tejun Heo
     
  • For the planned unified hierarchy, each css (cgroup_subsys_state) will
    be RCU protected so that it can be created and destroyed individually
    while allowing RCU accesses, and cgroup_css() will soon require either
    holding cgroup_mutex or RCU read lock.

    This patch updates cgroup_file_open() such that it acquires the
    associated css under rcu_read_lock(). While cgroup_file_css() usages
    in other file operations are safe due to the reference from open,
    cgroup_css() wouldn't know that and will still trigger warnings. It'd
    be cleanest to store the acquired css in file->private_data for
    further file operations but that's already used by seqfile. This
    patch instead adds cfent->css to cache the associated css. Note that
    while this field is initialized during cfe init, it should only be
    considered valid while the file is open.

    This patch doesn't change visible behavior.

    Signed-off-by: Tejun Heo
    Acked-by: Li Zefan

    Tejun Heo
     
  • cgroup->subsys[] will become RCU protected and thus all cgroup_css()
    usages should either be under RCU read lock or cgroup_mutex. This
    patch updates cgroup_css_from_dir() which returns the matching
    cgroup_subsys_state given a directory file and subsys_id so that it
    requires RCU read lock and updates its sole user
    perf_cgroup_connect().

    Signed-off-by: Tejun Heo
    Acked-by: Li Zefan
    Cc: Steven Rostedt
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar

    Tejun Heo
     
  • With the planned unified hierarchy, css's (cgroup_subsys_state) will
    be RCU protected and allowed to be attached and detached dynamically
    over the course of a cgroup's lifetime. This means that css's will
    stay accessible after being detached from its cgroup - the matching
    pointer in cgroup->subsys[] cleared - for ref draining and RCU grace
    period.

    cgroup core still wants to guarantee that the parent css is never
    destroyed before its children and css_parent() always returns the
    parent regardless of the state of the child css as long as it's
    accessible.

    This patch makes css's hold onto their parents and adds css->parent so
    that the parent css is never destroyed before its children and can be
    determined without consulting the cgroups.

    cgroup->dummy_css is also updated to point to the parent dummy_css;
    however, it doesn't need to worry about object lifetime as the parent
    cgroup is already pinned by the child.

    Signed-off-by: Tejun Heo
    Acked-by: Li Zefan

    Tejun Heo
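
The parent pinning described in the commit above can be illustrated with plain reference counting. This is a toy sketch, not the kernel's percpu_ref machinery; the class, its fields and the `freed` marker are all hypothetical.

```python
class Css:
    """Toy css that holds a reference on its parent while it exists."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent    # models css->parent
        self.refcnt = 1         # base reference held by the creator
        self.freed = False
        if parent is not None:
            parent.get()        # child pins its parent

    def get(self):
        self.refcnt += 1

    def put(self):
        self.refcnt -= 1
        if self.refcnt == 0:
            self.freed = True
            if self.parent is not None:
                self.parent.put()   # drop the pin only when the child is gone
```

Because the pin is dropped only when the child is released, the parent cannot be freed first, and `parent` stays usable for as long as the child is accessible.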
     
  • css (cgroup_subsys_state) will become RCU protected and there will be
    two stages which require punting to work item during release. To
    prepare for using the work item multiple times, rename
    css->dput_work to css->destroy_work and css_dput_fn() to
    css_free_work_fn() and move work item initialization from css init to
    right before the actual usage.

    This reorganization doesn't introduce any behavior change.

    Signed-off-by: Tejun Heo
    Acked-by: Li Zefan

    Tejun Heo
     
  • cgroup_css() is the accessor for cgroup->subsys[] but is not used
    consistently. cgroup->subsys[] will become RCU protected and
    cgroup_css() will grow synchronization sanity checks. In preparation,
    make all cgroup->subsys[] dereferences use cgroup_css() consistently.

    This patch doesn't introduce any functional difference.

    Signed-off-by: Tejun Heo
    Acked-by: Li Zefan

    Tejun Heo
     

09 Aug, 2013

13 commits

  • Previously, all css descendant iterators didn't include the origin
    (root of subtree) css in the iteration. The reasons were maintaining
    consistency with css_for_each_child() and that at the time of
    introduction more use cases needed skipping the origin anyway;
    however, given that css_is_descendant() considers self to be a
    descendant, omitting the origin css has become more confusing and
    looking at the accumulated use cases rather clearly indicates that
    including origin would result in simpler code overall.

    While this is a change which can easily lead to subtle bugs, cgroup
    API including the iterators has recently gone through major
    restructuring and no out-of-tree changes will be applicable without
    adjustments making this a relatively acceptable opportunity for this
    type of change.

    The conversions are mostly straight-forward. If the iteration block
    had explicit origin handling before or after, it's moved inside the
    iteration. If not, if (pos == origin) continue; is added. Some
    conversions add extra reference get/put around origin handling by
    consolidating origin handling and the rest. While the extra ref
    operations aren't strictly necessary, this shouldn't cause any
    noticeable difference.

    Signed-off-by: Tejun Heo
    Acked-by: Li Zefan
    Acked-by: Vivek Goyal
    Acked-by: Aristeu Rozanski
    Acked-by: Michal Hocko
    Cc: Jens Axboe
    Cc: Matt Helsley
    Cc: Johannes Weiner
    Cc: Balbir Singh

    Tejun Heo
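
The conversion pattern from the commit above, sketched on a toy tree. The walk now yields the origin first, and callers that want the old behaviour skip it explicitly. All names here are hypothetical, and the real iterators run under RCU, which this model ignores.

```python
class Css:
    """Toy tree node standing in for a css in a hierarchy."""

    def __init__(self, name, parent=None):
        self.name = name
        self.children = []
        if parent is not None:
            parent.children.append(self)


def descendants_pre(origin):
    """Yield origin first, then every descendant in pre-order."""
    stack = [origin]
    while stack:
        pos = stack.pop()
        yield pos
        # Push children reversed so the first child is visited first.
        stack.extend(reversed(pos.children))


def names_excluding_origin(origin):
    """Caller that keeps the old semantics by skipping the origin."""
    out = []
    for pos in descendants_pre(origin):
        if pos is origin:
            continue
        out.append(pos.name)
    return out
```

A caller that previously handled the origin before or after the loop can instead fold that handling into the loop body.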
     
  • cgroup_css() no longer has any user left outside cgroup.c proper and
    we don't want subsystems to grow new usages of the function. cgroup
    core should always provide the css to use to the subsystems, which
    will make dynamic creation and destruction of css's across the
    lifetime of a cgroup much more manageable than exposing the cgroup
    directly to subsystems and letting them dereference css's from it.

    Make cgroup_css() a static function in cgroup.c.

    Signed-off-by: Tejun Heo
    Acked-by: Li Zefan

    Tejun Heo
     
  • cgroup is in the process of converting to css (cgroup_subsys_state)
    from cgroup as the principal subsystem interface handle. This is
    mostly to prepare for the unified hierarchy support where css's will
    be created and destroyed dynamically but also helps cleaning up
    subsystem implementations as css is usually what they are interested
    in anyway.

    cgroup_taskset which is used by the subsystem attach methods is the
    last cgroup subsystem API which isn't using css as the handle. Update
    cgroup_taskset_cur_cgroup() to cgroup_taskset_cur_css() and
    cgroup_taskset_for_each() to take @skip_css instead of @skip_cgrp.

    The conversions are pretty mechanical. One exception is
    cpuset::cgroup_cs(), which lost its last user and got removed.

    This patch shouldn't introduce any functional changes.

    Signed-off-by: Tejun Heo
    Acked-by: Li Zefan
    Acked-by: Daniel Wagner
    Cc: Ingo Molnar
    Cc: Matt Helsley
    Cc: Steven Rostedt

    Tejun Heo
     
  • cgroup is in the process of converting to css (cgroup_subsys_state)
    from cgroup as the principal subsystem interface handle. This is
    mostly to prepare for the unified hierarchy support where css's will
    be created and destroyed dynamically but also helps cleaning up
    subsystem implementations as css is usually what they are interested
    in anyway.

    cftype->[un]register_event() is among the remaining couple interfaces
    which still use struct cgroup. Convert it to cgroup_subsys_state.
    The conversion is mostly mechanical and removes the last users of
    mem_cgroup_from_cont() and cg_to_vmpressure(), which are removed.

    v2: indentation update as suggested by Li Zefan.

    Signed-off-by: Tejun Heo
    Acked-by: Li Zefan
    Acked-by: Michal Hocko
    Cc: Johannes Weiner
    Cc: Balbir Singh

    Tejun Heo
     
  • cgroup is in the process of converting to css (cgroup_subsys_state)
    from cgroup as the principal subsystem interface handle. This is
    mostly to prepare for the unified hierarchy support where css's will
    be created and destroyed dynamically but also helps cleaning up
    subsystem implementations as css is usually what they are interested
    in anyway.

    This patch converts task iterators to deal with css instead of cgroup.
    Note that under unified hierarchy, different sets of tasks will be
    considered belonging to a given cgroup depending on the subsystem in
    question and making the iterators deal with css instead of cgroup
    provides them with enough information about the iteration.

    While at it, fix several function comment formats in cpuset.c.

    This patch doesn't introduce any behavior differences.

    Signed-off-by: Tejun Heo
    Acked-by: Li Zefan
    Acked-by: Michal Hocko
    Cc: Johannes Weiner
    Cc: Balbir Singh
    Cc: Matt Helsley

    Tejun Heo
     
  • cgroup_scan_tasks() takes a pointer to struct cgroup_scanner as its
    sole argument and the only function of that struct is packing the
    arguments of the function call, which consist of five fields.
    It's not too unusual to pack parameters into a struct when the number
    of arguments gets excessive or the whole set needs to be passed around
    a lot, but neither holds here making it just weird.

    Drop struct cgroup_scanner and pass the params directly to
    cgroup_scan_tasks(). Note that struct cpuset_change_nodemask_arg was
    added to cpuset.c to pass both ->cs and ->newmems pointer to
    cpuset_change_nodemask() using single data pointer.

    This doesn't make any functional differences.

    Signed-off-by: Tejun Heo
    Acked-by: Li Zefan

    Tejun Heo
     
  • Currently all cgroup_task_iter functions require @cgrp to be passed
    in, which is superfluous and increases the chance of usage error. Make
    cgroup_task_iter remember the cgroup being iterated and drop @cgrp
    argument from next and end functions.

    This patch doesn't introduce any behavior differences.

    Signed-off-by: Tejun Heo
    Acked-by: Li Zefan
    Acked-by: Michal Hocko
    Cc: Matt Helsley
    Cc: Johannes Weiner
    Cc: Balbir Singh

    Tejun Heo
     
  • cgroup now has multiple iterators and it's quite confusing to have
    something which walks over tasks of a single cgroup named cgroup_iter.
    Let's rename it to cgroup_task_iter.

    While at it, reformat / update comments and replace the overview
    comment above the interface function decls with proper function
    comments. Such overview can be useful but function comments should be
    more than enough here.

    This is pure rename and doesn't introduce any functional changes.

    Signed-off-by: Tejun Heo
    Acked-by: Li Zefan
    Acked-by: Michal Hocko
    Cc: Matt Helsley
    Cc: Johannes Weiner
    Cc: Balbir Singh

    Tejun Heo
     
    For some reason, cgroup_advance_iter() is standing lonely, far away
    from its iter comrades. Relocate it.

    This is cosmetic.

    Signed-off-by: Tejun Heo
    Acked-by: Li Zefan

    Tejun Heo
     
  • cgroup is currently in the process of transitioning to using css
    (cgroup_subsys_state) as the primary handle instead of cgroup in
    subsystem API. For hierarchy iterators, this is beneficial because

    * In most cases, css is the only thing subsystems care about anyway.

    * On the planned unified hierarchy, iterations for different
    subsystems will need to skip over different subtrees of the
    hierarchy depending on which subsystems are enabled on each cgroup.
    Passing around css makes it unnecessary to explicitly specify the
    subsystem in question as css is intersection between cgroup and
    subsystem

    * For the planned unified hierarchy, css's would need to be created
    and destroyed dynamically independent from cgroup hierarchy. Having
    cgroup core manage css iteration makes enforcing deref rules a lot
    easier.

    Most subsystem conversions are straight-forward. Noteworthy changes
    are

    * blkio: cgroup_to_blkcg() is no longer used. Removed.

    * freezer: cgroup_freezer() is no longer used. Removed.

    * devices: cgroup_to_devcgroup() is no longer used. Removed.

    Signed-off-by: Tejun Heo
    Acked-by: Li Zefan
    Acked-by: Michal Hocko
    Acked-by: Vivek Goyal
    Acked-by: Aristeu Rozanski
    Cc: Johannes Weiner
    Cc: Balbir Singh
    Cc: Matt Helsley
    Cc: Jens Axboe

    Tejun Heo
     
  • There are several places where the children list is accessed directly.
    This patch converts those places to use cgroup_next_child(). This
    will help updating the hierarchy iterators to use @css instead of
    @cgrp.

    While cgroup_next_child() can be heavy in pathological cases - e.g. a
    lot of dead children, this shouldn't cause any noticeable behavior
    differences.

    Signed-off-by: Tejun Heo
    Acked-by: Li Zefan

    Tejun Heo
     
  • cgroup is transitioning to using css (cgroup_subsys_state) as the main
    subsys interface handle instead of cgroup and the iterators will be
    updated to use css too. The iterators need to walk the cgroup
    hierarchy and return the css's matching the origin css, which is a bit
    cumbersome to open code.

    This patch converts cgroup_next_sibling() to cgroup_next_child() so
    that it can handle all steps of direct child iteration. This will be
    used to update iterators to take @css instead of @cgrp. In addition
    to the new iteration init handling, cgroup_next_child() is
    restructured so that the different branches share the end of iteration
    condition check.

    This patch doesn't change any behavior.

    Signed-off-by: Tejun Heo
    Acked-by: Li Zefan

    Tejun Heo
     
  • cgroup is currently in the process of transitioning to using struct
    cgroup_subsys_state * as the primary handle instead of struct cgroup.
    Please see the previous commit which converts the subsystem methods
    for rationale.

    This patch converts all cftype file operations to take @css instead of
    @cgroup. cftypes for the cgroup core files don't have their subsystem
    pointer set. These will automatically use the dummy_css added by the
    previous patch and can be converted the same way.

    Most subsystem conversions are straightforward but there are some
    interesting ones.

    * freezer: update_if_frozen() is also converted to take @css instead
    of @cgroup for consistency. This will make the code look simpler
    too once iterators are converted to use css.

    * memory/vmpressure: mem_cgroup_from_css() needs to be exported to
    vmpressure while mem_cgroup_from_cont() can be made static.
    Updated accordingly.

    * cpu: cgroup_tg() doesn't have any user left. Removed.

    * cpuacct: cgroup_ca() doesn't have any user left. Removed.

    * hugetlb: hugetlb_cgroup_from_cgroup() doesn't have any user left.
    Removed.

    * net_cls: cgrp_cls_state() doesn't have any user left. Removed.

    Signed-off-by: Tejun Heo
    Acked-by: Li Zefan
    Acked-by: Michal Hocko
    Acked-by: Vivek Goyal
    Acked-by: Aristeu Rozanski
    Acked-by: Daniel Wagner
    Cc: Peter Zijlstra
    Cc: Ingo Molnar
    Cc: Johannes Weiner
    Cc: Balbir Singh
    Cc: Matt Helsley
    Cc: Jens Axboe
    Cc: Steven Rostedt

    Tejun Heo