Eric Lee / smarc-fsl-linux-kernel

21 May, 2016

1 commit

297880f4a mm, hugetlb_cgroup: round limit_in_bytes down to hugepage size

The page_counter rounds limits down to page size values. This makes
sense, except in the case of hugetlb_cgroup where it's not possible to
charge partial hugepages. If the hugetlb_cgroup margin is less than the
hugepage size being charged, it will fail as expected.

Round the hugetlb_cgroup limit down to hugepage size, since it is the
effective limit of the cgroup.

For consistency, round down PAGE_COUNTER_MAX as well when a
hugetlb_cgroup is created: this prevents error reports when a user
cannot restore the value to the kernel default.
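
As a rough illustration of the rounding described above, here is a minimal user-space sketch. It is not the kernel code: round_limit_to_hugepage() and SKETCH_COUNTER_MAX are hypothetical names, and the limit is assumed to be expressed in base pages with a power-of-two hugepage size.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for the kernel's PAGE_COUNTER_MAX. */
#define SKETCH_COUNTER_MAX ((unsigned long)(LONG_MAX / 4096))

/*
 * Round a limit, given in base pages, down to a whole number of hugepages.
 * One hugepage is assumed to span (1UL << order) base pages.
 */
static unsigned long round_limit_to_hugepage(unsigned long nr_pages,
                                             unsigned int order)
{
    unsigned long pages_per_hugepage;

    assert(order < sizeof(unsigned long) * CHAR_BIT);
    pages_per_hugepage = 1UL << order;

    /* The size is a power of two, so masking drops the partial hugepage. */
    return nr_pages & ~(pages_per_hugepage - 1);
}
```

With order 9 (512 base pages per hugepage), a limit of 1023 pages becomes 512, which is the largest amount that could actually be charged. Applying the same rounding to SKETCH_COUNTER_MAX gives a default that a user can write back unchanged.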

Signed-off-by: David Rientjes
Cc: Michal Hocko
Cc: Nikolay Borisov
Cc: Johannes Weiner
Cc: "Kirill A. Shutemov"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

David Rientjes
2016-05-21 08:58:30 +0800

07 Nov, 2015

1 commit

1d798ca3f mm: make compound_head() robust

Hugh has pointed out that a compound_head() call can be unsafe in some
contexts. Here is one example:

CPU0                                    CPU1

isolate_migratepages_block()
  page_count()
    compound_head()
      !!PageTail() == true
                                        put_page()
                                          tail->first_page = NULL
      head = tail->first_page
                                        alloc_pages(__GFP_COMP)
                                          prep_compound_page()
                                            tail->first_page = head
                                            __SetPageTail(p);
      !!PageTail() == true

The race is purely theoretical. I don't think it's possible to trigger it in
practice. But who knows.

We can fix the race by changing how we encode PageTail() and compound_head()
within struct page so that both can be updated in one shot.

The patch introduces page->compound_head into the third double-word block,
in front of compound_dtor and compound_order. Bit 0 encodes PageTail(), and
the remaining bits are a pointer to the head page if bit 0 is set.
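
The encoding can be sketched in user space as follows. This is only an illustration under stated assumptions: struct sketch_page and the sketch_* helpers are hypothetical names, the word is read with a plain load (a concurrent implementation would need a single atomic read), and struct alignment is relied on to keep bit 0 of a real pointer clear.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature of struct page: only the tagged word. */
struct sketch_page {
    uintptr_t compound_head; /* bit 0: tail flag, other bits: head pointer */
};

/* Publish "is a tail" and "where the head is" with one store. */
static void sketch_set_compound_head(struct sketch_page *tail,
                                     struct sketch_page *head)
{
    assert(((uintptr_t)head & 1) == 0); /* alignment keeps bit 0 free */
    tail->compound_head = (uintptr_t)head | 1;
}

static void sketch_clear_compound_head(struct sketch_page *page)
{
    page->compound_head = 0;
}

static int sketch_page_tail(const struct sketch_page *page)
{
    return (int)(page->compound_head & 1);
}

/* Read the word once, then decide from that single snapshot. */
static struct sketch_page *sketch_compound_head(struct sketch_page *page)
{
    uintptr_t word = page->compound_head;

    if (word & 1)
        return (struct sketch_page *)(word - 1);
    return page;
}
```

Because the flag and the pointer live in the same word, a reader can never observe the tail flag set together with a stale or NULL head pointer, which is the window the diagram above describes.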

The patch moves page->pmd_huge_pte out of that word, just in case an
architecture defines pgtable_t as something that can have bit 0
set.

hugetlb_cgroup uses page->lru.next in the second tail page to store a
pointer to struct hugetlb_cgroup. The patch switches it to use page->private
in the second tail page instead. The space is free since ->first_page is
removed from the union.

The patch also opens the possibility of removing the HUGETLB_CGROUP_MIN_ORDER
limitation, since there's now space in the first tail page to store the struct
hugetlb_cgroup pointer. But that's out of scope for this patch.

That means page->compound_head shares storage space with:

- page->lru.next;
- page->next;
- page->rcu_head.next;

That list is too long to be absolutely sure, but it looks like nobody uses
bit 0 of the word.

page->rcu_head.next is guaranteed[1] to have bit 0 clear as long as we use
call_rcu(), call_rcu_bh(), call_rcu_sched(), or call_srcu(). But a future
call_rcu_lazy() is not allowed, as it makes use of the bit and we could
get a false positive PageTail().

[1] http://lkml.kernel.org/g/20150827163634.GD4029@linux.vnet.ibm.com

Signed-off-by: Kirill A. Shutemov
Acked-by: Michal Hocko
Reviewed-by: Andrea Arcangeli
Cc: Hugh Dickins
Cc: David Rientjes
Cc: Vlastimil Babka
Acked-by: Paul E. McKenney
Cc: Aneesh Kumar K.V
Cc: Andi Kleen
Cc: Christoph Lameter
Cc: Joonsoo Kim
Cc: Sergey Senozhatsky
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Kirill A. Shutemov
2015-11-07 09:50:42 +0800

06 Nov, 2015

1 commit

6071ca520 mm: page_counter: let page_counter_try_charge() return bool

page_counter_try_charge() currently returns 0 on success and -ENOMEM on
failure, which is surprising behavior given the function name.

Make it follow the expected pattern of try_stuff() functions that return a
boolean true to indicate success, or false for failure.
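
A minimal sketch of the try_stuff() convention, assuming a single-level counter built on C11 atomics. struct sketch_counter and its helpers are hypothetical; the real page counter also walks the parent hierarchy and tracks more state.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical single-level counter; no hierarchy, no watermark. */
struct sketch_counter {
    atomic_long usage;
    long limit;
};

static void sketch_counter_init(struct sketch_counter *c, long limit)
{
    atomic_init(&c->usage, 0);
    c->limit = limit;
}

/* Returns true if the charge fit under the limit, false otherwise. */
static bool sketch_try_charge(struct sketch_counter *c, long nr_pages)
{
    long new_usage;

    assert(nr_pages > 0);

    /* Charge optimistically, then undo if the limit was exceeded. */
    new_usage = atomic_fetch_add(&c->usage, nr_pages) + nr_pages;
    if (new_usage > c->limit) {
        atomic_fetch_sub(&c->usage, nr_pages);
        return false;
    }
    return true;
}
```

Callers can then write `if (!sketch_try_charge(&c, n))` for the failure path, instead of testing for a nonzero error code from a function whose name suggests a yes/no answer.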

Signed-off-by: Johannes Weiner
Acked-by: Michal Hocko
Cc: Vladimir Davydov
Signed-off-by: Linus Torvalds

Johannes Weiner
2015-11-06 11:34:48 +0800

12 Feb, 2015

1 commit

650c5e565 mm: page_counter: pull "-1" handling out of page_counter_memparse()

The unified hierarchy interface for memory cgroups will no longer use "-1"
to mean maximum possible resource value. In preparation for this, make
the string an argument and let the caller supply it.
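
The shape of that interface can be sketched in user space as below. sketch_memparse(), SKETCH_PAGE_SIZE and SKETCH_COUNTER_MAX are hypothetical names, and the parser is deliberately simplified: it accepts plain decimal byte counts only, with no K/M/G suffixes.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE   4096ULL
#define SKETCH_COUNTER_MAX ((unsigned long)(LONG_MAX / 4096))

/*
 * Parse a byte limit into pages. The caller chooses which string means
 * "no limit" ("-1" on a legacy interface, something else on a new one).
 */
static int sketch_memparse(const char *buf, const char *max,
                           unsigned long *nr_pages)
{
    unsigned long long bytes, pages;
    char *end;

    assert(buf && max && nr_pages);

    if (strcmp(buf, max) == 0) {
        *nr_pages = SKETCH_COUNTER_MAX;
        return 0;
    }

    /* Reject signs and whitespace that strtoull() would accept. */
    if (!isdigit((unsigned char)buf[0]))
        return -EINVAL;

    errno = 0;
    bytes = strtoull(buf, &end, 10);
    if (errno == ERANGE || *end != '\0')
        return -EINVAL;

    pages = bytes / SKETCH_PAGE_SIZE;
    *nr_pages = pages > SKETCH_COUNTER_MAX ? SKETCH_COUNTER_MAX
                                           : (unsigned long)pages;
    return 0;
}
```

Passing "-1" as the max string keeps the legacy meaning, while a caller that passes a different string causes "-1" to be rejected as invalid input.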

Signed-off-by: Johannes Weiner
Acked-by: Michal Hocko
Cc: Vladimir Davydov
Cc: Greg Thelen
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Johannes Weiner
2015-02-12 09:06:02 +0800

11 Dec, 2014

1 commit

71f87bee3 mm: hugetlb_cgroup: convert to lockless page counters

Abandon the spinlock-protected byte counters in favor of the unlocked
page counters in the hugetlb controller as well.

Signed-off-by: Johannes Weiner
Reviewed-by: Vladimir Davydov
Acked-by: Michal Hocko
Cc: Tejun Heo
Cc: David Rientjes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Johannes Weiner
2014-12-11 09:41:04 +0800

30 Aug, 2014

1 commit

7ea8574e5 hugetlb_cgroup: use lockdep_assert_held rather than spin_is_locked

spin_lock may be an empty struct for !SMP configurations and so
arch_spin_is_locked may return unconditional 0 and trigger the VM_BUG_ON
even when the lock is held.

Replace spin_is_locked by lockdep_assert_held. We will not BUG anymore
but it is questionable whether crashing makes a lot of sense in the
uncharge path. Uncharge happens after the last page reference was
released so nobody should touch the page and the function doesn't update
any shared state except for res counter which uses synchronization of
its own.

Signed-off-by: Michal Hocko
Reviewed-by: Aneesh Kumar K.V
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Michal Hocko
2014-08-30 07:28:16 +0800

15 Aug, 2014

1 commit

24d7cd207 mm, hugetlb_cgroup: align hugetlb cgroup limit to hugepage size

Memcg aligns memory.limit_in_bytes to PAGE_SIZE as part of the resource
counter since it makes no sense to allow a partial page to be charged.

As a result of the hugetlb cgroup using the resource counter, it is also
aligned to PAGE_SIZE but makes no sense unless aligned to the size of
the hugepage being limited.

Align hugetlb cgroup limit to hugepage size.

Signed-off-by: David Rientjes
Acked-by: Michal Hocko
Cc: "Aneesh Kumar K.V"
Cc: Tejun Heo
Cc: Li Zefan
Cc: Michal Hocko
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

David Rientjes
2014-08-15 00:56:15 +0800

15 Jul, 2014

1 commit

2cf669a58 cgroup: replace cgroup_add_cftypes() with cgroup_add_legacy_cftypes()

Currently, cftypes added by cgroup_add_cftypes() are used for both the
unified default hierarchy and legacy ones and subsystems can mark each
file with either CFTYPE_ONLY_ON_DFL or CFTYPE_INSANE if it has to
appear only on one of them. This is quite hairy and error-prone.
Also, we may end up exposing interface files to the default hierarchy
without thinking it through.

cgroup_subsys will grow two separate cftype addition functions and
apply each only on the hierarchies of the matching type. This will
allow organizing cftypes much more clearly and encourage subsystems
to scrutinize the interface which is being exposed in the new default
hierarchy.

In preparation, this patch adds cgroup_add_legacy_cftypes() which
currently is a simple wrapper around cgroup_add_cftypes() and replaces
all cgroup_add_cftypes() usages with it.

While at it, this patch drops a completely spurious return from
__hugetlb_cgroup_file_init().

This patch doesn't introduce any functional differences.

Signed-off-by: Tejun Heo
Acked-by: Neil Horman
Acked-by: Li Zefan
Cc: Johannes Weiner
Cc: Michal Hocko
Cc: Aneesh Kumar K.V

Tejun Heo
2014-07-15 23:05:09 +0800

17 May, 2014

1 commit

5c9d535b8 cgroup: remove css_parent()

cgroup in general is moving towards using cgroup_subsys_state as the
fundamental structural component and css_parent() was introduced to
convert from using cgroup->parent to css->parent. It was quite some
time ago and we're moving forward with making css more prominent.

This patch drops the trivial wrapper css_parent() and lets the users
dereference css->parent. While at it, explicitly mark fields of css
which are public and immutable.

v2: New usage from device_cgroup.c converted.

Signed-off-by: Tejun Heo
Acked-by: Michal Hocko
Acked-by: Neil Horman
Acked-by: "David S. Miller"
Acked-by: Li Zefan
Cc: Vivek Goyal
Cc: Jens Axboe
Cc: Peter Zijlstra
Cc: Johannes Weiner

Tejun Heo
2014-05-17 01:22:48 +0800

14 May, 2014

3 commits

6770c64e5 cgroup: replace cftype->trigger() with cftype->write()

cftype->trigger() is pointless. It's trivial to ignore the input
buffer from a regular ->write() operation. Convert all ->trigger()
users to ->write() and remove ->trigger().

This patch doesn't introduce any visible behavior changes.

Signed-off-by: Tejun Heo
Acked-by: Li Zefan
Cc: Johannes Weiner
Cc: Michal Hocko

Tejun Heo
2014-05-14 00:16:21 +0800
451af504d cgroup: replace cftype->write_string() with cftype->write()

Convert all cftype->write_string() users to the new cftype->write()
which maps directly to kernfs write operation and has full access to
kernfs and cgroup contexts. The conversions are mostly mechanical.

* @css and @cft are accessed using of_css() and of_cft() accessors
respectively instead of being specified as arguments.

* Should return @nbytes on success instead of 0.

* @buf is not trimmed automatically. Trim if necessary. Note that
blkcg and netprio don't need this as the parsers already handle
whitespaces.
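
The conversion rules in the list above can be illustrated with a small user-space sketch. sketch_strstrip() and sketch_limit_write() are hypothetical names, not the cgroup API: the handler receives an untrimmed, writable buffer, trims it itself, and returns the number of bytes consumed on success.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Trim leading and trailing whitespace in place, return the new start. */
static char *sketch_strstrip(char *s)
{
    size_t len;

    while (isspace((unsigned char)*s))
        s++;
    len = strlen(s);
    while (len > 0 && isspace((unsigned char)s[len - 1]))
        s[--len] = '\0';
    return s;
}

/*
 * Hypothetical ->write() style handler: parses a decimal limit from buf.
 * Returns nbytes on success (not 0) and a negative errno on failure.
 */
static long sketch_limit_write(char *buf, size_t nbytes, unsigned long *limit)
{
    unsigned long value;
    char *end;

    assert(buf && limit);

    buf = sketch_strstrip(buf); /* the caller no longer trims for us */
    if (!isdigit((unsigned char)buf[0]))
        return -EINVAL;

    value = strtoul(buf, &end, 10);
    if (*end != '\0')
        return -EINVAL;

    *limit = value;
    return (long)nbytes;
}
```

Writing " 42\n" through this handler stores 42 and reports all 4 bytes as consumed, while "x\n" is rejected and leaves the stored limit untouched.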

cftype->write_string() has no users left after the conversions and is
removed.

While at it, remove unnecessary local variable @p in
cgroup_subtree_control_write() and stale comment about
CGROUP_LOCAL_BUFFER_SIZE in cgroup_freezer.c.

This patch doesn't introduce any visible behavior changes.

v2: netprio was missing from conversion. Converted.

Signed-off-by: Tejun Heo
Acked-by: Aristeu Rozanski
Acked-by: Vivek Goyal
Acked-by: Li Zefan
Cc: Jens Axboe
Cc: Johannes Weiner
Cc: Michal Hocko
Cc: Neil Horman
Cc: "David S. Miller"

Tejun Heo
2014-05-14 00:16:21 +0800
ec903c0c8 cgroup: rename css_tryget*() to css_tryget_online*()

Unlike the more usual refcnting, what css_tryget() provides is the
distinction between online and offline csses instead of protection
against upping a refcnt which already reached zero. cgroup is
planning to provide actual tryget which fails if the refcnt already
reached zero. Let's rename the existing trygets so that they clearly
indicate that they're about onliness.

I thought about keeping the existing names as-are and introducing new
names for the planned actual tryget; however, given that each
controller participates in the synchronization of the online state, it
seems worthwhile to make it explicit that these functions are about
on/offline state.

Rename css_tryget() to css_tryget_online() and css_tryget_from_dir()
to css_tryget_online_from_dir(). This is pure rename.

v2: cgroup_freezer grew new usages of css_tryget(). Update
accordingly.

Signed-off-by: Tejun Heo
Acked-by: Johannes Weiner
Acked-by: Michal Hocko
Acked-by: Li Zefan
Cc: Vivek Goyal
Cc: Jens Axboe
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo

Tejun Heo
2014-05-14 00:11:01 +0800

19 Mar, 2014

1 commit

4d3bb511b cgroup: drop const from @buffer of cftype->write_string()

cftype->write_string() just passes on the writeable buffer from kernfs
and there's no reason to add const restriction on the buffer. The
only thing const achieves is unnecessarily complicating parsing of the
buffer. Drop const from @buffer.

Signed-off-by: Tejun Heo
Acked-by: Li Zefan
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Daniel Borkmann
Cc: Michal Hocko
Cc: Johannes Weiner
Cc: Balbir Singh
Cc: KAMEZAWA Hiroyuki

Tejun Heo
2014-03-19 22:23:54 +0800

08 Feb, 2014

1 commit

073219e99 cgroup: clean up cgroup_subsys names and initialization

cgroup_subsys is a bit messier than it needs to be.

* The name of a subsys can be different from its internal identifier
defined in cgroup_subsys.h. Most subsystems use the matching name
but three - cpu, memory and perf_event - use different ones.

* cgroup_subsys_id enums are postfixed with _subsys_id and each
cgroup_subsys is postfixed with _subsys. cgroup.h is widely
included throughout various subsystems, it doesn't and shouldn't
have claim on such generic names which don't have any qualifier
indicating that they belong to cgroup.

* cgroup_subsys->subsys_id should always equal the matching
cgroup_subsys_id enum; however, we require each controller to
initialize it and then BUG if they don't match, which is a bit
silly.

This patch cleans up cgroup_subsys names and initialization by doing
the following.

* cgroup_subsys_id enums are now postfixed with _cgrp_id, and each
cgroup_subsys with _cgrp_subsys.

* With the above, renaming subsys identifiers to match the userland
visible names doesn't cause any naming conflicts. All non-matching
identifiers are renamed to match the official names.

cpu_cgroup -> cpu
mem_cgroup -> memory
perf -> perf_event

* controllers no longer need to initialize ->subsys_id and ->name.
They're generated in cgroup core and set automatically during boot.

* Redundant cgroup_subsys declarations removed.

* While updating BUG_ON()s in cgroup_init_early(), convert them to
WARN()s. BUGging that early during boot is stupid - the kernel
can't print anything, even through serial console and the trap
handler doesn't even link stack frame properly for back-tracing.

This patch doesn't introduce any behavior changes.

v2: Rebased on top of fe1217c4f3f7 ("net: net_cls: move cgroupfs
classid handling into core").

Signed-off-by: Tejun Heo
Acked-by: Neil Horman
Acked-by: "David S. Miller"
Acked-by: "Rafael J. Wysocki"
Acked-by: Michal Hocko
Acked-by: Peter Zijlstra
Acked-by: Aristeu Rozanski
Acked-by: Ingo Molnar
Acked-by: Li Zefan
Cc: Johannes Weiner
Cc: Balbir Singh
Cc: KAMEZAWA Hiroyuki
Cc: Serge E. Hallyn
Cc: Vivek Goyal
Cc: Thomas Graf

Tejun Heo
2014-02-08 23:36:58 +0800

24 Jan, 2014

1 commit

309381fea mm: dump page when hitting a VM_BUG_ON using VM_BUG_ON_PAGE

Most of the VM_BUG_ON assertions are performed on a page. Usually, when
one of these assertions fails we'll get a BUG_ON with a call stack and
the registers.

I've recently noticed based on the requests to add a small piece of code
that dumps the page to various VM_BUG_ON sites that the page dump is
quite useful to people debugging issues in mm.

This patch adds a VM_BUG_ON_PAGE(cond, page) which beyond doing what
VM_BUG_ON() does, also dumps the page before executing the actual
BUG_ON.

[akpm@linux-foundation.org: fix up includes]
Signed-off-by: Sasha Levin
Cc: "Kirill A. Shutemov"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Sasha Levin
2014-01-24 08:36:50 +0800

06 Dec, 2013

1 commit

716f479d2 hugetlb_cgroup: convert away from cftype->read()

In preparation of conversion to kernfs, cgroup file handling is being
consolidated so that it can be easily mapped to the seq_file based
interface of kernfs.

All users of cftype->read() can be easily served, usually better, by
seq_file and other methods. Update hugetlb_cgroup_read() to return
u64 instead of printing itself and rename it to
hugetlb_cgroup_read_u64().

This patch doesn't make any visible behavior changes.

Signed-off-by: Tejun Heo
Reviewed-by: Michal Hocko
Acked-by: Li Zefan
Cc: Aneesh Kumar K.V
Cc: Johannes Weiner

Tejun Heo
2013-12-06 01:28:03 +0800

09 Aug, 2013

6 commits

182446d08 cgroup: pass around cgroup_subsys_state instead of cgroup in file methods

cgroup is currently in the process of transitioning to using struct
cgroup_subsys_state * as the primary handle instead of struct cgroup.
Please see the previous commit which converts the subsystem methods
for rationale.

This patch converts all cftype file operations to take @css instead of
@cgroup. cftypes for the cgroup core files don't have their subsystem
pointer set. These will automatically use the dummy_css added by the
previous patch and can be converted the same way.

Most subsystem conversions are straightforward but there are some
interesting ones.

* freezer: update_if_frozen() is also converted to take @css instead
of @cgroup for consistency. This will make the code look simpler
too once iterators are converted to use css.

* memory/vmpressure: mem_cgroup_from_css() needs to be exported to
vmpressure while mem_cgroup_from_cont() can be made static.
Updated accordingly.

* cpu: cgroup_tg() doesn't have any user left. Removed.

* cpuacct: cgroup_ca() doesn't have any user left. Removed.

* hugetlb: hugetlb_cgroup_from_cgroup() doesn't have any user left.
Removed.

* net_cls: cgrp_cls_state() doesn't have any user left. Removed.

Signed-off-by: Tejun Heo
Acked-by: Li Zefan
Acked-by: Michal Hocko
Acked-by: Vivek Goyal
Acked-by: Aristeu Rozanski
Acked-by: Daniel Wagner
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Johannes Weiner
Cc: Balbir Singh
Cc: Matt Helsley
Cc: Jens Axboe
Cc: Steven Rostedt

Tejun Heo
2013-08-09 08:11:24 +0800
eb95419b0 cgroup: pass around cgroup_subsys_state instead of cgroup in subsystem methods

cgroup is currently in the process of transitioning to using struct
cgroup_subsys_state * as the primary handle instead of struct cgroup *
in subsystem implementations for the following reasons.

* With unified hierarchy, subsystems will be dynamically bound and
unbound from cgroups and thus css's (cgroup_subsys_state) may be
created and destroyed dynamically over the lifetime of a cgroup,
which is different from the current state where all css's are
allocated and destroyed together with the associated cgroup. This
in turn means that cgroup_css() should be synchronized and may
return NULL, making it more cumbersome to use.

* Differing levels of per-subsystem granularity in the unified
hierarchy means that the task and descendant iterators should behave
differently depending on the specific subsystem the iteration is
being performed for.

* In the majority of cases, a subsystem only cares about its part in the
cgroup hierarchy - i.e. the hierarchy of css's. Subsystem methods
often obtain the matching css pointer from the cgroup and don't
bother with the cgroup pointer itself. Passing around css fits
much better.

This patch converts all cgroup_subsys methods to take @css instead of
@cgroup. The conversions are mostly straight-forward. A few
noteworthy changes are

* ->css_alloc() now takes css of the parent cgroup rather than the
pointer to the new cgroup as the css for the new cgroup doesn't
exist yet. Knowing the parent css is enough for all the existing
subsystems.

* In kernel/cgroup.c::offline_css(), unnecessary open coded css
dereference is replaced with local variable access.

This patch shouldn't cause any behavior differences.

v2: Unnecessary explicit cgrp->subsys[] deref in css_online() replaced
with local variable @css as suggested by Li Zefan.

Rebased on top of new for-3.12 which includes for-3.11-fixes so
that ->css_free() invocation added by da0a12caff ("cgroup: fix a
leak when percpu_ref_init() fails") is converted too. Suggested
by Li Zefan.

Signed-off-by: Tejun Heo
Acked-by: Li Zefan
Acked-by: Michal Hocko
Acked-by: Vivek Goyal
Acked-by: Aristeu Rozanski
Acked-by: Daniel Wagner
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Johannes Weiner
Cc: Balbir Singh
Cc: Matt Helsley
Cc: Jens Axboe
Cc: Steven Rostedt

Tejun Heo
2013-08-09 08:11:23 +0800
638769869 cgroup: add css_parent()

Currently, controllers have to explicitly follow the cgroup hierarchy
to find the parent of a given css. cgroup is moving towards using
cgroup_subsys_state as the main controller interface construct, so
let's provide a way to climb the hierarchy using just csses.

This patch implements css_parent() which, given a css, returns its
parent. The function is guaranteed to return a valid non-NULL parent css as
long as the target css is not at the top of the hierarchy.

freezer, cpuset, cpu, cpuacct, hugetlb, memory, net_cls and devices
are converted to use css_parent() instead of accessing cgroup->parent
directly.

* __parent_ca() is dropped from cpuacct and its usage is replaced with
parent_ca(). The only difference between the two was NULL test on
cgroup->parent which is now embedded in css_parent() making the
distinction moot. Note that eventually a css->parent field will be
added to css and the NULL check in css_parent() will go away.

This patch shouldn't cause any behavior differences.

Signed-off-by: Tejun Heo
Acked-by: Li Zefan

Tejun Heo
2013-08-09 08:11:23 +0800
a7c6d554a cgroup: add/update accessors which obtain subsys specific data from css

css (cgroup_subsys_state) is usually embedded in a subsys specific
data structure. Subsystems either use container_of() directly to cast
from css to such a data structure or have an accessor function wrapping
such a cast. As cgroup as a whole is moving towards using css as the main
interface handle, add and update such accessors to ease dealing with
css's.

All accessors explicitly handle NULL input and return NULL in those
cases. While this looks like an extra branch in the code, as all
controllers specific data structures have css as the first field, the
casting doesn't involve any offsetting and the compiler can trivially
optimize out the branch.

* blkio, freezer, cpuset, cpu, cpuacct and net_cls didn't have such
accessor. Added.

* memory, hugetlb and devices already had one but didn't explicitly
handle NULL input. Updated.
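
A NULL-tolerant accessor of this kind might look like the following user-space sketch. The sketch_* types and functions are hypothetical stand-ins rather than the kernel definitions; the static assertion records the assumption that css is the first field, which is what makes the NULL check cheap.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for cgroup_subsys_state and hugetlb_cgroup. */
struct sketch_css {
    struct sketch_css *parent;
};

struct sketch_hugetlb_cgroup {
    struct sketch_css css; /* kept as the first field */
    unsigned long limit;
};

_Static_assert(offsetof(struct sketch_hugetlb_cgroup, css) == 0,
               "css is assumed to sit at offset 0");

#define sketch_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Cast from css to the controller structure, passing NULL through. */
static struct sketch_hugetlb_cgroup *sketch_from_css(struct sketch_css *css)
{
    return css ? sketch_container_of(css, struct sketch_hugetlb_cgroup, css)
               : NULL;
}

/* Parent lookup: returns NULL for the top of the hierarchy. */
static struct sketch_hugetlb_cgroup *
sketch_parent(struct sketch_hugetlb_cgroup *h)
{
    assert(h);
    return sketch_from_css(h->css.parent);
}
```

Because the accessor passes NULL through, sketch_parent() needs no separate check for the root of the hierarchy.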

Signed-off-by: Tejun Heo
Acked-by: Li Zefan

Tejun Heo
2013-08-09 08:11:23 +0800
3f7985183 hugetlb_cgroup: pass around @hugetlb_cgroup instead of @cgroup

cgroup controller API will be converted to primarily use struct
cgroup_subsys_state instead of struct cgroup. In preparation, make
hugetlb_cgroup functions pass around struct hugetlb_cgroup instead of
struct cgroup.

This patch shouldn't cause any behavior differences.

Signed-off-by: Tejun Heo
Acked-by: Li Zefan
Reviewed-by: Aneesh Kumar K.V
Reviewed-by: Michal Hocko
Cc: KAMEZAWA Hiroyuki
Cc: Johannes Weiner

Tejun Heo
2013-08-09 08:11:22 +0800
8af01f56a cgroup: s/cgroup_subsys_state/cgroup_css/ s/task_subsys_state/task_css/

The names of the two struct cgroup_subsys_state accessors -
cgroup_subsys_state() and task_subsys_state() - are somewhat awkward.
The former clashes with the type name and the latter doesn't even
indicate it's somehow related to cgroup.

We're about to revamp a large portion of the cgroup API, so let's rename
them so that they're less awkward. Most per-controller usages of the
accessors are localized in accessor wrappers and, given the amount of
scheduled changes, this isn't going to add any noticeable headache.

Rename cgroup_subsys_state() to cgroup_css() and task_subsys_state()
to task_css(). This patch is pure rename.

Signed-off-by: Tejun Heo
Acked-by: Li Zefan

Tejun Heo
2013-08-09 08:11:22 +0800

19 Dec, 2012

1 commit

7179e7bf4 mm/hugetlb: create hugetlb cgroup file in hugetlb_init

When the kernel is built with CONFIG_HUGETLBFS=y, CONFIG_HUGETLB_PAGE=y and
CONFIG_CGROUP_HUGETLB=y and the hugepagesz=xx boot option is specified, the
system fails to boot.

This failure is caused by the following code path:

setup_hugepagesz
  hugetlb_add_hstate
    hugetlb_cgroup_file_init
      cgroup_add_cftypes
        kzalloc

Signed-off-by: Jiang Liu
Reviewed-by: Aneesh Kumar K.V
Acked-by: Michal Hocko
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jianguo Wu
2012-12-19 07:02:15 +0800

20 Nov, 2012

1 commit

92fb97487 cgroup: rename ->create/post_create/pre_destroy/destroy() to ->css_alloc/online/offline/free()

Rename cgroup_subsys css lifetime related callbacks to better describe
what their roles are. Also, update documentation.

Signed-off-by: Tejun Heo
Acked-by: Li Zefan

Tejun Heo
2012-11-20 00:13:38 +0800

06 Nov, 2012

2 commits

bcf6de1b9 cgroup: make ->pre_destroy() return void

All ->pre_destroy() implementations return 0 now, which is the only
allowed return value. Make it return void.

Signed-off-by: Tejun Heo
Reviewed-by: Michal Hocko
Acked-by: KAMEZAWA Hiroyuki
Acked-by: Li Zefan
Cc: Balbir Singh
Cc: Vivek Goyal

Tejun Heo
2012-11-06 01:16:59 +0800
9d093cb10 hugetlb: do not fail in hugetlb_cgroup_pre_destroy

Now that pre_destroy callbacks are called from a context where no task
can attach to the group and no child group can be added, there is no
remaining way for hugetlb_cgroup_pre_destroy to fail.

Signed-off-by: Michal Hocko
Reviewed-by: Tejun Heo
Reviewed-by: Glauber Costa
Acked-by: KAMEZAWA Hiroyuki
Signed-off-by: Tejun Heo

Michal Hocko
2012-11-06 01:16:59 +0800

01 Aug, 2012

7 commits

75754681f hugetlb/cgroup: remove exclude and wakeup rmdir calls from migrate

We already hold the hugetlb_lock. That should prevent a parallel cgroup
rmdir from touching page's hugetlb cgroup. So remove the exclude and
wakeup calls.

Signed-off-by: Aneesh Kumar K.V
Reviewed-by: Michal Hocko
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Aneesh Kumar K.V
2012-08-01 09:42:41 +0800
94ae8ba71 hugetlb/cgroup: assign the page hugetlb cgroup when we move the page to active list.

A page's hugetlb cgroup assignment and movement to the active list should
occur with hugetlb_lock held. Otherwise when we remove the hugetlb cgroup
we will iterate the active list and find pages with NULL hugetlb cgroup
values.

Signed-off-by: Aneesh Kumar K.V
Reviewed-by: Michal Hocko
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Aneesh Kumar K.V
2012-08-01 09:42:41 +0800
8e6ac7fab hugetlb/cgroup: migrate hugetlb cgroup info from oldpage to new page during migration

With HugeTLB pages, hugetlb cgroup is uncharged in compound page
destructor. Since we are holding a hugepage reference, we can be sure
that old page won't get uncharged till the last put_page().

Signed-off-by: Aneesh Kumar K.V
Cc: David Rientjes
Acked-by: KAMEZAWA Hiroyuki
Cc: Hillf Danton
Cc: Michal Hocko
Cc: KOSAKI Motohiro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Aneesh Kumar K.V
2012-08-01 09:42:41 +0800
abb8206cb hugetlb/cgroup: add hugetlb cgroup control files

Add the control files for the hugetlb controller.

[akpm@linux-foundation.org: s/CONFIG_CGROUP_HUGETLB_RES_CTLR/CONFIG_MEMCG_HUGETLB/g]
[akpm@linux-foundation.org: s/CONFIG_MEMCG_HUGETLB/CONFIG_CGROUP_HUGETLB/]
Signed-off-by: Aneesh Kumar K.V
Cc: David Rientjes
Acked-by: KAMEZAWA Hiroyuki
Cc: Hillf Danton
Reviewed-by: Michal Hocko
Cc: KOSAKI Motohiro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Aneesh Kumar K.V
2012-08-01 09:42:41 +0800
da1def559 hugetlb/cgroup: add support for cgroup removal

Add support for cgroup removal. If we don't have a parent cgroup, the
charges are moved to the root cgroup.

Signed-off-by: Aneesh Kumar K.V
Cc: David Rientjes
Acked-by: KAMEZAWA Hiroyuki
Cc: Hillf Danton
Reviewed-by: Michal Hocko
Cc: KOSAKI Motohiro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Aneesh Kumar K.V
2012-08-01 09:42:41 +0800
6d76dcf40 hugetlb/cgroup: add charge/uncharge routines for hugetlb cgroup

Add the charge and uncharge routines for hugetlb cgroup. We do cgroup
charging in page alloc and uncharge in compound page destructor.
Assigning page's hugetlb cgroup is protected by hugetlb_lock.

[liwp@linux.vnet.ibm.com: add huge_page_order check to avoid incorrect uncharge]
Signed-off-by: Aneesh Kumar K.V
Cc: David Rientjes
Acked-by: KAMEZAWA Hiroyuki
Cc: Hillf Danton
Cc: Michal Hocko
Cc: KOSAKI Motohiro
Signed-off-by: Aneesh Kumar K.V
Signed-off-by: Wanpeng Li
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Aneesh Kumar K.V
2012-08-01 09:42:41 +0800
2bc64a204 mm/hugetlb: add new HugeTLB cgroup

Implement a new controller that allows us to control HugeTLB allocations.
The extension makes it possible to limit HugeTLB usage per control group and
enforces the controller limit during page fault. Since HugeTLB doesn't
support page reclaim, enforcing the limit at page fault time implies that
the application will get a SIGBUS signal if it tries to access HugeTLB pages
beyond its limit. This requires the application to know beforehand how
many HugeTLB pages it will need.

The charge/uncharge calls will be added to HugeTLB code in later patch.
Support for cgroup removal will be added in later patches.

[akpm@linux-foundation.org: s/CONFIG_CGROUP_HUGETLB_RES_CTLR/CONFIG_MEMCG_HUGETLB/g]
[akpm@linux-foundation.org: s/CONFIG_MEMCG_HUGETLB/CONFIG_CGROUP_HUGETLB/g]
Reviewed-by: KAMEZAWA Hiroyuki
Signed-off-by: Aneesh Kumar K.V
Cc: David Rientjes
Cc: Hillf Danton
Reviewed-by: Michal Hocko
Cc: KOSAKI Motohiro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Aneesh Kumar K.V
2012-08-01 09:42:40 +0800