30 May, 2012

1 commit

  • We call the destroy function when a cgroup starts to be removed, such as
    by a rmdir event.

    However, because of our reference counters, some objects are still
    in flight. Right now, we are decrementing the static_keys at destroy()
    time, meaning that if we get rid of the last static_key reference, some
    objects will still have charges, but the code to properly uncharge them
    won't be run.

    This becomes a problem especially if the key is ever enabled again,
    because new charges would then be added on top of the stale ones,
    making correct bookkeeping pretty much impossible.

    We just need to be careful with the static branch activation: since there
    is no particular preferred order of their activation, we need to make sure
    that we only start using it after all call sites are active. This is
    achieved by having a per-memcg flag that is only updated after
    static_key_slow_inc() returns. At this time, we are sure all sites are
    active.

    This is made per-memcg, not global, for a reason: it also has the effect
    of making socket accounting more consistent. The first memcg to be
    limited will trigger static_key() activation, therefore, accounting. But
    all the others will then be accounted no matter what. After this patch,
    only limited memcgs will have their sockets accounted.
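    The ordering can be sketched in a small userspace illustration.
    memcg_proto_active() and tcp_update_limit() are named in this
    changelog; everything else here is a simplified stand-in, not the
    actual kernel implementation:

```c
#include <assert.h>

struct static_key { int enabled; };

struct mem_cgroup {
	struct static_key *key;
	int active;                  /* per-memcg flag, set only after inc returns */
};

/* Stand-in for the real (expensive) call-site patching. */
static void static_key_slow_inc(struct static_key *key)
{
	key->enabled++;
}

/* Charge paths test this flag; it stays 0 until every call site is active. */
static int memcg_proto_active(const struct mem_cgroup *memcg)
{
	return memcg->active;
}

static void tcp_update_limit(struct mem_cgroup *memcg)
{
	static_key_slow_inc(memcg->key); /* activate all call sites first... */
	memcg->active = 1;               /* ...only then start accounting */
}
```

    In the real code the flag update and its readers need proper memory
    ordering; the sketch only shows the sequencing idea.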

    [akpm@linux-foundation.org: move enum sock_flag_bits into sock.h,
    document enum sock_flag_bits,
    convert memcg_proto_active() and memcg_proto_activated() to test_bit(),
    redo tcp_update_limit() comment to 80 cols]
    Signed-off-by: Glauber Costa
    Cc: Tejun Heo
    Cc: Li Zefan
    Acked-by: KAMEZAWA Hiroyuki
    Cc: Johannes Weiner
    Cc: Michal Hocko
    Acked-by: David Miller
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Glauber Costa
     

11 Apr, 2012

1 commit

  • The only reason cgroup was used was to be consistent with the populate()
    interface. Now that we're getting rid of it, not only do we no longer need
    it, but we also *can't* call it this way.

    Since we will no longer rely on populate(), this will be called from
    create(). During create(), the association between struct mem_cgroup
    and struct cgroup does not yet exist, since the cgroup internals have
    not yet initialized their bookkeeping. This means we would not be able
    to derive the memcg pointer from the cgroup pointer in these
    functions, which is highly undesirable.

    Signed-off-by: Glauber Costa
    Acked-by: Kamezawa Hiroyuki
    Signed-off-by: Tejun Heo
    CC: Li Zefan
    CC: Johannes Weiner
    CC: Michal Hocko

    Glauber Costa
     

02 Apr, 2012

2 commits

  • Convert memcg to use the new cftype based interface. kmem support
    abuses ->populate() for mem_cgroup_sockets_init() so it can't be
    removed at the moment.

    tcp_memcontrol is updated so that tcp_files[] is registered via an
    __initcall, which also allows the forward declaration of tcp_files[]
    to be removed.

    Signed-off-by: Tejun Heo
    Acked-by: KAMEZAWA Hiroyuki
    Acked-by: Li Zefan
    Cc: Johannes Weiner
    Cc: Michal Hocko
    Cc: Balbir Singh
    Cc: Glauber Costa
    Cc: Hugh Dickins
    Cc: Greg Thelen

    Tejun Heo
     
  • blk-cgroup, netprio_cgroup, cls_cgroup and tcp_memcontrol
    unnecessarily define cftype array and cgroup_subsys structures at the
    top of the file, which is unconventional and necessitates forward
    declaration of methods.

    This patch relocates those below the definitions of the methods and
    removes the forward declarations. Note that forward declaration of
    tcp_files[] is added in tcp_memcontrol.c for tcp_init_cgroup(). This
    will be removed soon by another patch.

    This patch doesn't introduce any functional change.

    Signed-off-by: Tejun Heo
    Acked-by: Li Zefan

    Tejun Heo
     

21 Mar, 2012

1 commit

  • Pull cgroup changes from Tejun Heo:
    "Out of the 8 commits, one fixes a long-standing locking issue around
    tasklist walking and others are cleanups."

    * 'for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
    cgroup: Walk task list under tasklist_lock in cgroup_enable_task_cg_list
    cgroup: Remove wrong comment on cgroup_enable_task_cg_list()
    cgroup: remove cgroup_subsys argument from callbacks
    cgroup: remove extra calls to find_existing_css_set
    cgroup: replace tasklist_lock with rcu_read_lock
    cgroup: simplify double-check locking in cgroup_attach_proc
    cgroup: move struct cgroup_pidlist out from the header file
    cgroup: remove cgroup_attach_task_current_cg()

    Linus Torvalds
     

24 Feb, 2012

1 commit

  • …_key_slow_[inc|dec]()

    So here's a boot tested patch on top of Jason's series that does
    all the cleanups I talked about and turns jump labels into a
    more intuitive to use facility. It should also address the
    various misconceptions and confusions that surround jump labels.

    Typical usage scenarios:

    #include <linux/static_key.h>

    struct static_key key = STATIC_KEY_INIT_TRUE;

    if (static_key_false(&key))
            do unlikely code
    else
            do likely code

    Or:

    if (static_key_true(&key))
            do likely code
    else
            do unlikely code

    The static key is modified via:

    static_key_slow_inc(&key);
    ...
    static_key_slow_dec(&key);

    The 'slow' prefix makes it abundantly clear that this is an
    expensive operation.

    I've updated all in-kernel code to use this everywhere. Note
    that I (intentionally) have not pushed the rename blindly
    through to the lowest levels: the actual jump-label
    patching arch facility should be named like that, so we want to
    decouple jump labels from the static-key facility a bit.

    On non-jump-label enabled architectures static keys default to
    likely()/unlikely() branches.
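    Roughly, the fallback can be pictured like this (a simplified
    userspace sketch of the API surface, not the kernel's actual
    fallback definitions):

```c
#include <assert.h>

/* Fallback: a static key is just a counter, and the branch helpers are
   plain conditionals (the real kernel wraps them in likely()/unlikely()). */
struct static_key { int enabled; };

#define STATIC_KEY_INIT_TRUE  { .enabled = 1 }
#define STATIC_KEY_INIT_FALSE { .enabled = 0 }

static int static_key_false(struct static_key *key)  /* branch is unlikely */
{
	return key->enabled > 0;
}

static int static_key_true(struct static_key *key)   /* branch is likely */
{
	return key->enabled > 0;
}

static void static_key_slow_inc(struct static_key *key) { key->enabled++; }
static void static_key_slow_dec(struct static_key *key) { key->enabled--; }
```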

    Signed-off-by: Ingo Molnar <mingo@elte.hu>
    Acked-by: Jason Baron <jbaron@redhat.com>
    Acked-by: Steven Rostedt <rostedt@goodmis.org>
    Cc: a.p.zijlstra@chello.nl
    Cc: mathieu.desnoyers@efficios.com
    Cc: davem@davemloft.net
    Cc: ddaney.cavm@gmail.com
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Link: http://lkml.kernel.org/r/20120222085809.GA26397@elte.hu
    Signed-off-by: Ingo Molnar <mingo@elte.hu>

    Ingo Molnar
     

03 Feb, 2012

1 commit

  • The argument is not used at all, and it's not necessary, because
    a specific callback handler of course knows which subsys it
    belongs to.

    Now only ->populate() takes this argument, because the handlers of
    this callback always call cgroup_add_file()/cgroup_add_files().

    So we reduce a few lines of code, though the shrinking of object size
    is minimal.

    16 files changed, 113 insertions(+), 162 deletions(-)

    text data bss dec hex filename
    5486240 656987 7039960 13183187 c928d3 vmlinux.o.orig
    5486170 656987 7039960 13183117 c9288d vmlinux.o

    Signed-off-by: Li Zefan
    Signed-off-by: Tejun Heo

    Li Zefan
     

13 Jan, 2012

1 commit

  • The logic of the current code is that whenever we destroy
    a cgroup that had its limit set (where "set" means different from
    the maximum), we should decrement the jump_label counter.
    Otherwise we assume it was never incremented.

    But what the code actually does is test RES_USAGE
    instead of RES_LIMIT. Usage being different from the maximum
    is likely to be true most of the time.

    The effect of this is that the key can become negative,
    and since the jump_label test says:

    !!atomic_read(&key->enabled);

    we'll have the jump labels still on when no one else is
    using this functionality.
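    In sketch form, the fix is to test the limit, not the usage, before
    decrementing (RES_LIMIT/RES_USAGE and the jump label are from the
    changelog; the surrounding code is a simplified userspace stand-in):

```c
#include <assert.h>
#include <limits.h>

#define RESOURCE_MAX ULONG_MAX   /* "no limit set" */

struct res_counter {
	unsigned long usage;
	unsigned long limit;
};

static int jump_label_count;     /* stand-in for the static key refcount */

static void tcp_set_limit(struct res_counter *res, unsigned long limit)
{
	res->limit = limit;
	jump_label_count++;      /* incremented once when a limit is set */
}

static void tcp_destroy_cgroup(const struct res_counter *res)
{
	/* The buggy version tested res->usage, which is almost always
	 * different from RESOURCE_MAX, so the count could go negative.
	 * The fix tests whether a limit was actually set: */
	if (res->limit != RESOURCE_MAX)
		jump_label_count--;
}
```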

    Signed-off-by: Glauber Costa
    CC: David S. Miller
    Signed-off-by: David S. Miller

    Glauber Costa
     

13 Dec, 2011

6 commits

  • This patch introduces kmem.tcp.max_usage_in_bytes file, living in the
    kmem_cgroup filesystem. The root cgroup will display a value equal
    to RESOURCE_MAX. This is to avoid introducing any locking schemes in
    the network paths when cgroups are not being actively used.

    All others will see the maximum memory ever used by this cgroup.
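    The described read-side behavior might look like this (hypothetical
    names; the point is the root special case, which keeps tracking out
    of the network paths):

```c
#include <assert.h>
#include <limits.h>

#define RESOURCE_MAX ULONG_MAX

struct tcp_memcontrol {
	int is_root;
	unsigned long max_usage; /* high-water mark, tracked for non-root only */
};

/* Reading kmem.tcp.max_usage_in_bytes: the root cgroup reports
 * RESOURCE_MAX so the network fast paths never need to update it. */
static unsigned long tcp_read_max_usage(const struct tcp_memcontrol *tcp)
{
	if (tcp->is_root)
		return RESOURCE_MAX;
	return tcp->max_usage;
}
```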

    Signed-off-by: Glauber Costa
    Reviewed-by: Hiroyuki Kamezawa
    CC: David S. Miller
    CC: Eric W. Biederman
    Signed-off-by: David S. Miller

    Glauber Costa
     
  • This patch introduces the kmem.tcp.failcnt file, living in the
    kmem_cgroup filesystem. Following the pattern of the other
    memcg resources, this file keeps a counter of how many times
    allocation failed due to limits being hit in this cgroup.
    The root cgroup will always show a failcnt of 0.

    Signed-off-by: Glauber Costa
    Reviewed-by: Hiroyuki Kamezawa
    CC: David S. Miller
    CC: Eric W. Biederman
    Signed-off-by: David S. Miller

    Glauber Costa
     
  • This patch introduces kmem.tcp.usage_in_bytes file, living in the
    kmem_cgroup filesystem. It is a simple read-only file that displays the
    amount of kernel memory currently consumed by the cgroup.

    Signed-off-by: Glauber Costa
    Reviewed-by: Hiroyuki Kamezawa
    CC: David S. Miller
    CC: Eric W. Biederman
    Signed-off-by: David S. Miller

    Glauber Costa
     
  • This patch uses the "tcp.limit_in_bytes" field of the kmem_cgroup to
    effectively control the amount of kernel memory pinned by a cgroup.

    This value is ignored in the root cgroup; in all others, it
    caps the value specified by the admin in the net namespace's
    view of tcp_sysctl_mem.

    If namespaces are being used, the admin is allowed to set a
    value bigger than the cgroup's maximum, the same way it is allowed
    to set pretty much unlimited values in a real box.
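    The capping rule can be sketched as follows (illustrative name and
    signature; per the description, the cgroup limit caps the
    namespace's view of tcp_sysctl_mem and the root cgroup is uncapped):

```c
#include <assert.h>
#include <limits.h>

#define RESOURCE_MAX ULONG_MAX

/* Effective tcp memory threshold seen by a net namespace: whatever the
 * admin set via sysctl, capped by the cgroup's limit (root is uncapped). */
static unsigned long effective_tcp_mem(unsigned long sysctl_val,
				       unsigned long cgroup_limit,
				       int is_root)
{
	if (is_root || cgroup_limit == RESOURCE_MAX)
		return sysctl_val;
	return sysctl_val < cgroup_limit ? sysctl_val : cgroup_limit;
}
```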

    Signed-off-by: Glauber Costa
    Reviewed-by: Hiroyuki Kamezawa
    CC: David S. Miller
    CC: Eric W. Biederman
    Signed-off-by: David S. Miller

    Glauber Costa
     
  • This patch allows each namespace to independently set up
    its levels for tcp memory pressure thresholds. This patch
    alone does not buy much: we need to make these values
    per group of processes somehow. That is achieved in the
    patches that follow in this patchset.

    Signed-off-by: Glauber Costa
    Reviewed-by: KAMEZAWA Hiroyuki
    CC: David S. Miller
    CC: Eric W. Biederman
    Signed-off-by: David S. Miller

    Glauber Costa
     
  • This patch introduces memory pressure controls for the tcp
    protocol. It uses the generic socket memory pressure code
    introduced in earlier patches, and fills in the
    necessary data in the cg_proto struct.

    Signed-off-by: Glauber Costa
    Reviewed-by: KAMEZAWA Hiroyuki
    CC: Eric W. Biederman
    Signed-off-by: David S. Miller

    Glauber Costa