02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boiler plate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information it it.
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information,

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to be applied to
    a file was done in a spreadsheet of side by side results from of the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files created by Philippe Ombredanne. Philippe prepared the
    base worksheet, and did an initial spot review of a few 1000 files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging was:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

15 Jan, 2016

1 commit

  • Let the networking stack know when a memcg is under reclaim pressure so
    that it can clamp its transmit windows accordingly.

    Whenever the reclaim efficiency of a cgroup's LRU lists drops low enough
    for a MEDIUM or HIGH vmpressure event to occur, assert a pressure state
    in the socket and tcp memory code that tells it to curb consumption
    growth from sockets associated with said control group.

    Traditionally, vmpressure reports for the entire subtree of a memcg
    under pressure, which drops useful information on the individual groups
    reclaimed. However, it's too late to change the userinterface, so add a
    second reporting mode that reports on the level of reclaim instead of at
    the level of pressure, and use that report for sockets.

    vmpressure events are naturally edge triggered, so for hysteresis assert
    socket pressure for a second to allow for subsequent vmpressure events
    to occur before letting the socket code return to normal.

    This will likely need finetuning for a wider variety of workloads, but
    for now stick to the vmpressure presets and keep hysteresis simple.

    Signed-off-by: Johannes Weiner
    Acked-by: David S. Miller
    Reviewed-by: Vladimir Davydov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     

23 Nov, 2013

3 commits

  • cgroup_event is now memcg specific. Replace cgroup_event->css with
    ->memcg and convert [un]register_event() callbacks to take mem_cgroup
    pointer instead of cgroup_subsys_state one. This simplifies the code
    slightly and makes css_to_vmpressure() unnecessary which is removed.

    Signed-off-by: Tejun Heo
    Acked-by: Li Zefan
    Acked-by: Kirill A. Shutemov
    Acked-by: Michal Hocko

    Tejun Heo
     
  • The only use of cgroup_event->cft is distinguishing "usage_in_bytes"
    and "memsw.usgae_in_bytes" for mem_cgroup_usage_[un]register_event(),
    which can be done by adding an explicit argument to the function and
    implementing two wrappers so that the two cases can be distinguished
    from the function alone.

    Remove cgroup_event->cft and the related code including
    [un]register_events() methods.

    Signed-off-by: Tejun Heo
    Acked-by: Li Zefan
    Acked-by: Kirill A. Shutemov
    Acked-by: Michal Hocko

    Tejun Heo
     
  • cgroup_event is being moved from cgroup core to memcg and the
    implementation is already moved by the previous patch. This patch
    moves the data fields and callbacks.

    * cgroup->event_list[_lock] are moved to mem_cgroup.

    * cftype->[un]register_event() are moved to cgroup_event. This makes
    it impossible for individual cftype definitions to specify their
    event callbacks. This is worked around by simply hard-coding
    filename to event callback mapping in cgroup_write_event_control().
    This is awkward and inflexible, which is actually desirable given
    that we don't want to grow more usages of this feature.

    * eventfd_ctx declaration is removed from cgroup.h, which makes
    vmpressure.h miss eventfd_ctx declaration. Include eventfd.h from
    vmpressure.h.

    v2: Use file name from dentry instead of cftype. This will allow
    removing all cftype handling in the function.

    Signed-off-by: Tejun Heo
    Acked-by: Li Zefan
    Acked-by: Kirill A. Shutemov
    Acked-by: Michal Hocko
    Cc: Johannes Weiner
    Cc: Balbir Singh

    Tejun Heo
     

04 Sep, 2013

1 commit

  • Pull cgroup updates from Tejun Heo:
    "A lot of activities on the cgroup front. Most changes aren't visible
    to userland at all at this point and are laying foundation for the
    planned unified hierarchy.

    - The biggest change is decoupling the lifetime management of css
    (cgroup_subsys_state) from that of cgroup's. Because controllers
    (cpu, memory, block and so on) will need to be dynamically enabled
    and disabled, css which is the association point between a cgroup
    and a controller may come and go dynamically across the lifetime of
    a cgroup. Till now, css's were created when the associated cgroup
    was created and stayed till the cgroup got destroyed.

    Assumptions around this tight coupling permeated through cgroup
    core and controllers. These assumptions are gradually removed,
    which consists bulk of patches, and css destruction path is
    completely decoupled from cgroup destruction path. Note that
    decoupling of creation path is relatively easy on top of these
    changes and the patchset is pending for the next window.

    - cgroup has its own event mechanism cgroup.event_control, which is
    only used by memcg. It is overly complex trying to achieve high
    flexibility whose benefits seem dubious at best. Going forward,
    new events will simply generate file modified event and the
    existing mechanism is being made specific to memcg. This pull
    request contains prepatory patches for such change.

    - Various fixes and cleanups"

    Fixed up conflict in kernel/cgroup.c as per Tejun.

    * 'for-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (69 commits)
    cgroup: fix cgroup_css() invocation in css_from_id()
    cgroup: make cgroup_write_event_control() use css_from_dir() instead of __d_cgrp()
    cgroup: make cgroup_event hold onto cgroup_subsys_state instead of cgroup
    cgroup: implement CFTYPE_NO_PREFIX
    cgroup: make cgroup_css() take cgroup_subsys * instead and allow NULL subsys
    cgroup: rename cgroup_css_from_dir() to css_from_dir() and update its syntax
    cgroup: fix cgroup_write_event_control()
    cgroup: fix subsystem file accesses on the root cgroup
    cgroup: change cgroup_from_id() to css_from_id()
    cgroup: use css_get() in cgroup_create() to check CSS_ROOT
    cpuset: remove an unncessary forward declaration
    cgroup: RCU protect each cgroup_subsys_state release
    cgroup: move subsys file removal to kill_css()
    cgroup: factor out kill_css()
    cgroup: decouple cgroup_subsys_state destruction from cgroup destruction
    cgroup: replace cgroup->css_kill_cnt with ->nr_css
    cgroup: bounce cgroup_subsys_state ref kill confirmation to a work item
    cgroup: move cgroup->subsys[] assignment to online_css()
    cgroup: reorganize css init / exit paths
    cgroup: add __rcu modifier to cgroup->subsys[]
    ...

    Linus Torvalds
     

09 Aug, 2013

1 commit

  • cgroup is in the process of converting to css (cgroup_subsys_state)
    from cgroup as the principal subsystem interface handle. This is
    mostly to prepare for the unified hierarchy support where css's will
    be created and destroyed dynamically but also helps cleaning up
    subsystem implementations as css is usually what they are interested
    in anyway.

    cftype->[un]register_event() is among the remaining couple interfaces
    which still use struct cgroup. Convert it to cgroup_subsys_state.
    The conversion is mostly mechanical and removes the last users of
    mem_cgroup_from_cont() and cg_to_vmpressure(), which are removed.

    v2: indentation update as suggested by Li Zefan.

    Signed-off-by: Tejun Heo
    Acked-by: Li Zefan
    Acked-by: Michal Hocko
    Cc: Johannes Weiner
    Cc: Balbir Singh

    Tejun Heo
     

01 Aug, 2013

2 commits

  • vmpressure is called synchronously from reclaim where the target_memcg
    is guaranteed to be alive but the eventfd is signaled from the work
    queue context. This means that memcg (along with vmpressure structure
    which is embedded into it) might go away while the work item is pending
    which would result in use-after-release bug.

    We have two possible ways how to fix this. Either vmpressure pins memcg
    before it schedules vmpr->work and unpin it in vmpressure_work_fn or
    explicitely flush the work item from the css_offline context (as
    suggested by Tejun).

    This patch implements the later one and it introduces vmpressure_cleanup
    which flushes the vmpressure work queue item item. It hooks into
    mem_cgroup_css_offline after the memcg itself is cleaned up.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Michal Hocko
    Reported-by: Tejun Heo
    Cc: Anton Vorontsov
    Cc: Johannes Weiner
    Cc: KAMEZAWA Hiroyuki
    Cc: KOSAKI Motohiro
    Cc: Li Zefan
    Acked-by: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michal Hocko
     
  • There is nothing that can sleep inside critical sections protected by
    this lock and those sections are really small so there doesn't make much
    sense to use mutex for them. Change the log to a spinlock

    Signed-off-by: Michal Hocko
    Reported-by: Tejun Heo
    Cc: Anton Vorontsov
    Cc: Johannes Weiner
    Cc: KAMEZAWA Hiroyuki
    Cc: KOSAKI Motohiro
    Cc: Li Zefan
    Reviewed-by: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michal Hocko
     

30 Apr, 2013

1 commit

  • With this patch userland applications that want to maintain the
    interactivity/memory allocation cost can use the pressure level
    notifications. The levels are defined like this:

    The "low" level means that the system is reclaiming memory for new
    allocations. Monitoring this reclaiming activity might be useful for
    maintaining cache level. Upon notification, the program (typically
    "Activity Manager") might analyze vmstat and act in advance (i.e.
    prematurely shutdown unimportant services).

    The "medium" level means that the system is experiencing medium memory
    pressure, the system might be making swap, paging out active file
    caches, etc. Upon this event applications may decide to further analyze
    vmstat/zoneinfo/memcg or internal memory usage statistics and free any
    resources that can be easily reconstructed or re-read from a disk.

    The "critical" level means that the system is actively thrashing, it is
    about to out of memory (OOM) or even the in-kernel OOM killer is on its
    way to trigger. Applications should do whatever they can to help the
    system. It might be too late to consult with vmstat or any other
    statistics, so it's advisable to take an immediate action.

    The events are propagated upward until the event is handled, i.e. the
    events are not pass-through. Here is what this means: for example you
    have three cgroups: A->B->C. Now you set up an event listener on
    cgroups A, B and C, and suppose group C experiences some pressure. In
    this situation, only group C will receive the notification, i.e. groups
    A and B will not receive it. This is done to avoid excessive
    "broadcasting" of messages, which disturbs the system and which is
    especially bad if we are low on memory or thrashing. So, organize the
    cgroups wisely, or propagate the events manually (or, ask us to
    implement the pass-through events, explaining why would you need them.)

    Performance wise, the memory pressure notifications feature itself is
    lightweight and does not require much of bookkeeping, in contrast to the
    rest of memcg features. Unfortunately, as of current memcg
    implementation, pages accounting is an inseparable part and cannot be
    turned off. The good news is that there are some efforts[1] to improve
    the situation; plus, implementing the same, fully API-compatible[2]
    interface for CONFIG_MEMCG=n case (e.g. embedded) is also a viable
    option, so it will not require any changes on the userland side.

    [1] http://permalink.gmane.org/gmane.linux.kernel.cgroups/6291
    [2] http://lkml.org/lkml/2013/2/21/454

    [akpm@linux-foundation.org: coding-style fixes]
    [akpm@linux-foundation.org: fix CONFIG_CGROPUPS=n warnings]
    Signed-off-by: Anton Vorontsov
    Acked-by: Kirill A. Shutemov
    Acked-by: KAMEZAWA Hiroyuki
    Cc: Tejun Heo
    Cc: David Rientjes
    Cc: Pekka Enberg
    Cc: Mel Gorman
    Cc: Glauber Costa
    Cc: Michal Hocko
    Cc: Luiz Capitulino
    Cc: Greg Thelen
    Cc: Leonid Moiseichuk
    Cc: KOSAKI Motohiro
    Cc: Minchan Kim
    Cc: Bartlomiej Zolnierkiewicz
    Cc: John Stultz
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Anton Vorontsov