06 Apr, 2019

1 commit

  • [ Upstream commit b4ff1b44bcd384d22fcbac6ebaf9cc0d33debe50 ]

    cgroup_rstat_cpu_pop_updated() is used to traverse the updated cgroups
    on flush. While it was only visiting updated ones in the subtree, it
    was visiting @root unconditionally. We can easily check whether @root
    is updated or not by looking at its ->updated_next just as with the
    cgroups in the subtree.

    * Remove the unnecessary cgroup_parent() test. The system root cgroup
    is never updated and thus its ->updated_next is always NULL. No
    need to test whether cgroup_parent() exists in addition to
    ->updated_next.

    * Terminate the traversal if ->updated_next is NULL. This can only
    happen for the subtree @root, and there's no reason to visit it if
    it's not marked updated.

    This reduces cpu consumption when reading a lot of rstat backed files.
    In a micro benchmark reading stat from ~1600 cgroups, the sys time was
    lowered by >40%.

    Signed-off-by: Tejun Heo
    Signed-off-by: Sasha Levin

    Tejun Heo
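
A minimal userspace sketch of the traversal described above, not the kernel code itself. It assumes a single CPU, and `struct node`, `node_init()`, `mark_updated()`, `pop_updated()` and `flush_count()` are hypothetical stand-ins for the cgroup and per-cpu rstat structures. It shows how a NULL `updated_next` on the subtree root ends the walk without visiting the root.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical single-CPU stand-in for a cgroup plus its per-cpu rstat. */
struct node {
	struct node *parent;
	struct node *updated_children; /* list of updated children, terminated by the owner */
	struct node *updated_next;     /* NULL while not on the parent's updated list */
};

static void node_init(struct node *n, struct node *parent)
{
	n->parent = parent;
	n->updated_children = n; /* empty list points back at the owner */
	n->updated_next = NULL;
}

/* Link n and its not-yet-linked ancestors onto their parents' updated lists. */
static void mark_updated(struct node *n)
{
	while (n->parent && !n->updated_next) {
		struct node *p = n->parent;

		n->updated_next = p->updated_children;
		p->updated_children = n;
		n = p;
	}
}

/* Pop the next updated node under root in child-before-parent order. */
static struct node *pop_updated(struct node *pos, struct node *root)
{
	if (pos == root)
		return NULL;

	pos = pos ? pos->parent : root;

	/* walk down to the first leaf of the updated tree */
	while (pos->updated_children != pos)
		pos = pos->updated_children;

	if (pos->updated_next) {
		struct node *parent = pos->parent;
		struct node **nextp = &parent->updated_children;

		/* find pos on the parent's list and unlink it */
		while (*nextp != pos) {
			assert(*nextp != parent);
			nextp = &(*nextp)->updated_next;
		}
		*nextp = pos->updated_next;
		pos->updated_next = NULL;
		return pos;
	}

	/* only reachable for a root that is not marked updated */
	return NULL;
}

/* Count how many nodes a flush of root would visit. */
static int flush_count(struct node *root)
{
	struct node *pos = NULL;
	int visited = 0;

	while ((pos = pop_updated(pos, root)))
		visited++;
	return visited;
}
```

In this model, flushing a subtree with no pending updates visits nothing, while the old behaviour described in the commit message would still have visited the root once.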
     

27 Apr, 2018

9 commits

  • cgroup_rstat_updated() ensures that the cgroup's rstat is linked to
    the parent. If there's no parent, it never gets linked and the
    function ends up grabbing and releasing the cgroup_rstat_lock each
    time for no reason which can be expensive.

    This hasn't been a problem till now because nobody was calling the
    function for the root cgroup but rstat is gonna be exposed to
    controllers and use cases, so let's get ready. Make
    cgroup_rstat_updated() a no-op for the root cgroup.

    Signed-off-by: Tejun Heo

    Tejun Heo
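
The cost described above can be illustrated with a small userspace model. This is only a sketch under assumptions: `struct node`, `rstat_updated()` and the `lock_round_trips` counter are hypothetical, and the counter merely stands in for taking and releasing the lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a cgroup; linked models "on the parent's list". */
struct node {
	struct node *parent;
	bool linked;
};

/* Counts simulated lock/unlock pairs (an assumption of this sketch). */
static int lock_round_trips;

static void rstat_updated(struct node *n)
{
	assert(n != NULL);

	/* Root has no parent to link to, so bail out before touching the lock. */
	if (!n->parent)
		return;

	/* Fast path: already linked, nothing to do. */
	if (n->linked)
		return;

	lock_round_trips++; /* stands in for lock + unlock */
	for (; n->parent && !n->linked; n = n->parent)
		n->linked = true;
}
```

Without the first check, a root node would never become linked, so every call on it would fall through to the locked slow path.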
     
  • cgroup_rstat_updated() has a small race window where an update
    signal can race with a flush and be lost until the next update.
    This wasn't a problem for the existing usages, but we plan to use
    rstat to track counters which need to be accurate.

    This patch plugs the race window by synchronizing
    cgroup_rstat_updated() and flush path with memory barriers around
    cgroup_rstat_cpu->updated_next pointer.

    Signed-off-by: Tejun Heo

    Tejun Heo
     
  • This patch adds cgroup_subsys->css_rstat_flush(). If a subsystem has
    this callback, its csses are linked on cgrp->css_rstat_list and rstat
    will call the function whenever the associated cgroup is flushed.
    Flush is also performed when such csses are released so that residual
    counts aren't lost.

    Combined with the rstat API previous patches factored out, this allows
    controllers to plug into rstat to manage their statistics in a
    scalable way.

    Signed-off-by: Tejun Heo

    Tejun Heo
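
A rough userspace sketch of the callback arrangement described above. All names here (`struct subsys`, `struct css`, `struct cgrp`, `cgrp_link_css()`, `cgrp_flush()`, `demo_flush()`, `DEMO_NR_CPUS`) are hypothetical simplifications, not the kernel's definitions. Only csses whose subsystem provides a flush callback are linked, and a flush invokes the callback once per CPU.

```c
#include <assert.h>
#include <stddef.h>

enum { DEMO_NR_CPUS = 2 }; /* assumed CPU count for the sketch */

struct css;

/* Hypothetical subsystem descriptor with an optional flush callback. */
struct subsys {
	void (*css_rstat_flush)(struct css *css, int cpu);
};

/* Hypothetical per-cgroup subsystem state with per-cpu pending counts. */
struct css {
	struct subsys *ss;
	struct css *rstat_next;
	long pending[DEMO_NR_CPUS];
	long total;
};

struct cgrp {
	struct css *css_rstat_list;
};

/* Link css only if its subsystem wants to take part in rstat flushing. */
static void cgrp_link_css(struct cgrp *cg, struct css *css)
{
	if (!css->ss->css_rstat_flush)
		return;
	css->rstat_next = cg->css_rstat_list;
	cg->css_rstat_list = css;
}

/* Flush every linked css for every CPU. */
static void cgrp_flush(struct cgrp *cg)
{
	for (int cpu = 0; cpu < DEMO_NR_CPUS; cpu++) {
		for (struct css *c = cg->css_rstat_list; c; c = c->rstat_next) {
			assert(c->ss->css_rstat_flush != NULL);
			c->ss->css_rstat_flush(c, cpu);
		}
	}
}

/* Example callback: fold one CPU's pending count into the total. */
static void demo_flush(struct css *css, int cpu)
{
	css->total += css->pending[cpu];
	css->pending[cpu] = 0;
}
```

A controller in this model would bump its own `pending[cpu]` counter on the hot path and rely on the flush to aggregate, which is the scalability argument the commit message makes.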
     
  • Currently, rstat flush path is protected with a mutex which is fine as
    all the existing users are from interface file show path. However,
    rstat is being generalized for use by controllers and flushing from
    atomic contexts will be necessary.

    This patch replaces cgroup_rstat_mutex with a spinlock and adds an
    irq-safe flush function - cgroup_rstat_flush_irqsafe(). Explicit
    yield handling is added to the flush path so that other flush
    functions can yield to other threads and flushers.

    Signed-off-by: Tejun Heo

    Tejun Heo
     
  • cgroup_rstat is being generalized so that controllers can use it too.
    This patch factors out and exposes the following interface functions.

    * cgroup_rstat_updated(): Renamed from cgroup_rstat_cpu_updated() for
    consistency.

    * cgroup_rstat_flush_hold/release(): Factored out from base stat
    implementation.

    * cgroup_rstat_flush(): Verbatim expose.

    While at it, drop assert on cgroup_rstat_mutex in
    cgroup_base_stat_flush() as it crosses layers and make a minor comment
    update.

    v2: Added EXPORT_SYMBOL_GPL(cgroup_rstat_updated) to fix a build bug.

    Signed-off-by: Tejun Heo

    Tejun Heo
     
  • Currently, rstat.c has rstat and base stat implementations intermixed.
    Collect base stat implementation at the end of the file. Also,
    reorder the prototypes.

    This patch doesn't make any functional changes.

    Signed-off-by: Tejun Heo

    Tejun Heo
     
  • Base resource stat accounts universal (not specific to any
    controller) resource consumption on top of rstat. Currently, its
    implementation is intermixed with rstat implementation making the code
    confusing to follow.

    This patch clarifies the distinction by doing the following.

    * Encapsulate base resource stat counters, currently only cputime, in
    struct cgroup_base_stat.

    * Move prev_cputime into struct cgroup and initialize it with cgroup.

    * Rename the related functions so that they start with cgroup_base_stat.

    * Prefix the related variables and field names with b.

    This patch doesn't make any functional changes.

    Signed-off-by: Tejun Heo

    Tejun Heo
     
  • stat is too generic a name and ends up causing subtle confusion.
    It'll be made generic so that controllers can plug into it, which will
    make the problem worse. Let's rename it to something more specific -
    cgroup_rstat for cgroup recursive stat.

    This patch does the following renames. No other changes.

    * cpu_stat -> rstat_cpu
    * stat -> rstat
    * ?cstat -> ?rstatc

    Note that the renames are selective. The unrenamed are the ones which
    implement basic resource statistics on top of rstat. This will be
    further cleaned up in the following patches.

    Signed-off-by: Tejun Heo

    Tejun Heo
     
  • stat is too generic a name and ends up causing subtle confusion.
    It'll be made generic so that controllers can plug into it, which will
    make the problem worse. Let's rename it to something more specific -
    cgroup_rstat for cgroup recursive stat.

    First, rename kernel/cgroup/stat.c to kernel/cgroup/rstat.c. No
    content changes.

    Signed-off-by: Tejun Heo

    Tejun Heo