Eric Lee / smarc-fsl-linux-kernel

22 Sep, 2009

1 commit

2c85f51d2 mm: also use alloc_large_system_hash() for the PID hash table ... Browse Code »

This is being done by allowing boot time allocations to specify that they
may want a sub-page sized amount of memory.

Overall this seems more consistent with the other hash table allocations,
and allows making two supposedly mm-only variables really mm-only
(nr_{kernel,all}_pages).

Signed-off-by: Jan Beulich
Cc: Ingo Molnar
Cc: "Eric W. Biederman"
Cc: Mel Gorman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Beulich
2009-09-22 22:17:38 +0800

10 Jul, 2009

1 commit

264ef8a90 kmemleak: Remove alloc_bootmem annotations introduced in the past ... Browse Code »

kmemleak_alloc() calls were added in some places where alloc_bootmem was
called. Since now kmemleak tracks bootmem allocations, these explicit
calls should be run.

Signed-off-by: Catalin Marinas
Cc: Ingo Molnar
Acked-by: Pekka Enberg

Catalin Marinas
2009-07-10 00:07:02 +0800

30 Jun, 2009

1 commit

12de38b18 kmemleak: Inform kmemleak about pid_hash ... Browse Code »

Kmemleak does not track alloc_bootmem calls but the pid_hash allocated
in pidhash_init() would need to be scanned as it contains pointers to
struct pid objects.

Signed-off-by: Catalin Marinas

Catalin Marinas
2009-06-30 00:14:14 +0800

19 Jun, 2009

1 commit

17f98dcf6 pids: clean up find_task_by_pid variants ... Browse Code »

find_task_by_pid_type_ns is only used to implement find_task_by_vpid and
find_task_by_pid_ns, but both of them pass PIDTYPE_PID as first argument.
So just fold find_task_by_pid_type_ns into find_task_by_pid_ns and use
find_task_by_pid_ns to implement find_task_by_vpid.

While we're at it also remove the exports for find_task_by_pid_ns and
find_task_by_vpid - we don't have any modular callers left as the only
modular caller of he old pre pid namespace find_task_by_pid (gfs2) was
switched to pid_task which operates on a struct pid pointer instead of a
pid_t. Given the confusion about pid_t values vs namespace that's
generally the better option anyway and I think we're better of restricting
modules to do it that way.

Signed-off-by: Christoph Hellwig
Cc: Pavel Emelyanov
Cc: "Eric W. Biederman"
Cc: Ingo Molnar
Cc: Oleg Nesterov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christoph Hellwig
2009-06-19 04:03:55 +0800

03 Apr, 2009

2 commits

52ee2dfdd pids: refactor vnr/nr_ns helpers to make them safe ... Browse Code »

Inho, the safety rules for vnr/nr_ns helpers are horrible and buggy.

task_pid_nr_ns(task) needs rcu/tasklist depending on task == current.

As for "special" pids, vnr/nr_ns helpers always need rcu. However, if
task != current, they are unsafe even under rcu lock, we can't trust
task->group_leader without the special checks.

And almost every helper has a callsite which needs a fix.

Also, it is a bit annoying that the implementations of, say,
task_pgrp_vnr() and task_pgrp_nr_ns() are not "symmetrical".

This patch introduces the new helper, __task_pid_nr_ns(), which is always
safe to use, and turns all other helpers into the trivial wrappers.

After this I'll send another patch which converts task_tgid_xxx() as well,
they're are a bit special.

Signed-off-by: Oleg Nesterov
Cc: Louis Rilling
Cc: "Eric W. Biederman"
Cc: Pavel Emelyanov
Cc: Sukadev Bhattiprolu
Cc: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2009-04-03 10:05:02 +0800
2ae448efc pids: improve get_task_pid() to fix the unsafe sys_wait4()->task_pgrp() ... Browse Code »

sys_wait4() does get_pid(task_pgrp(current)), this is not safe. We can
add rcu lock/unlock around, but we already have get_task_pid() which can
be improved to handle the special pids in more reliable manner.

Signed-off-by: Oleg Nesterov
Cc: Louis Rilling
Cc: "Eric W. Biederman"
Cc: Pavel Emelyanov
Cc: Sukadev Bhattiprolu
Cc: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2009-04-03 10:05:02 +0800

09 Jan, 2009

1 commit

61bce0f13 pid: generalize task_active_pid_ns ... Browse Code »

Currently task_active_pid_ns is not safe to call after a task becomes a
zombie and exit_task_namespaces is called, as nsproxy becomes NULL. By
reading the pid namespace from the pid of the task we can trivially solve
this problem at the cost of one extra memory read in what should be the
same cacheline as we read the namespace from.

When moving things around I have made task_active_pid_ns out of line
because keeping it in pid_namespace.h would require adding includes of
pid.h and sched.h that I don't think we want.

This change does make task_active_pid_ns unsafe to call during
copy_process until we attach a pid on the task_struct which seems to be a
reasonable trade off.

Signed-off-by: Eric W. Biederman
Signed-off-by: Sukadev Bhattiprolu
Cc: Oleg Nesterov
Cc: Roland McGrath
Cc: Bastian Blank
Cc: Pavel Emelyanov
Cc: Nadia Derbey
Acked-by: Serge Hallyn
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Eric W. Biederman
2009-01-09 00:31:12 +0800

06 Jan, 2009

1 commit

025dfdafe trivial: fix then -> than typos in comments and documentation ... Browse Code »

- (better, more, bigger ...) then -> (...) than

Signed-off-by: Frederik Schwarzer
Signed-off-by: Jiri Kosina

Frederik Schwarzer
2009-01-06 18:28:06 +0800

26 Jul, 2008

2 commits

e49859e71 pidns: remove now unused find_pid function. ... Browse Code »

This one had the only users so far - the kill_proc, which is removed, so
drop this (invalid in namespaced world) call too.

And of course - erase all references on it from comments.

Signed-off-by: Pavel Emelyanov
Cc: Oleg Nesterov
Cc: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelyanov
2008-07-26 01:53:45 +0800
339caf2a2 proc: misplaced export of find_get_pid ... Browse Code »

Move EXPORT_SYMBOL right after the func

Signed-off-by: David Sterba
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

David Sterba
2008-07-26 01:53:45 +0800

19 May, 2008

1 commit

82524746c rcu: split list.h and move rcu-protected lists into rculist.h ... Browse Code »

Move rcu-protected lists from list.h into a new header file rculist.h.

This is done because list are a very used primitive structure all over the
kernel and it's currently impossible to include other header files in this
list.h without creating some circular dependencies.

For example, list.h implements rcu-protected list and uses rcu_dereference()
without including rcupdate.h. It actually compiles because users of
rcu_dereference() are macros. Others RCU functions could be used too but
aren't probably because of this.

Therefore this patch creates rculist.h which includes rcupdates without to
many changes/troubles.

Signed-off-by: Franck Bui-Huu
Acked-by: Paul E. McKenney
Acked-by: Josh Triplett
Signed-off-by: Andrew Morton
Signed-off-by: Ingo Molnar

Franck Bui-Huu
2008-05-19 16:01:37 +0800

30 Apr, 2008

4 commits

24336eaee pids: introduce change_pid() helper ... Browse Code »

Based on Eric W. Biederman's idea.

Without tasklist_lock held task_session()/task_pgrp() can return NULL if the
caller races with setprgp()/setsid() which does detach_pid() + attach_pid().
This can happen even if task == current.

Intoduce the new helper, change_pid(), which should be used instead. This way
the caller always sees the special pid != NULL, either old or new.

Also change the prototype of attach_pid(), it always returns 0 and nobody
check the returned value.

Signed-off-by: Oleg Nesterov
Cc: "Eric W. Biederman"
Cc: Pavel Emelyanov
Cc: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2008-04-30 23:29:48 +0800
65450cebc pids: de_thread: don't clear session/pgrp pids for the old leader ... Browse Code »

Based on Eric W. Biederman's idea.

Unless task == current, without tasklist_lock held task_session()/task_pgrp()
can return NULL if the caller races with de_thread() which switches the group
leader.

Change transfer_pid() to not clear old->pids[type].pid for the old leader.
This means that its .pid can point to "nowhere", but this is already true for
sub-threads, and the old leader is not group_leader() any longer. IOW, with
or without this change we can't trust task's special pids unless it is the
group leader.

With this change the following code

rcu_read_lock();
task = find_task_by_xxx();
do_something(task_pgrp(task), task_session(task));
rcu_read_unlock();

can't race with exec and hit the NULL pid.

Signed-off-by: Oleg Nesterov
Cc: "Eric W. Biederman"
Cc: Pavel Emelyanov
Cc: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2008-04-30 23:29:48 +0800
5cd204550 Deprecate find_task_by_pid() ... Browse Code »

There are some places that are known to operate on tasks'
global pids only:

* the rest_init() call (called on boot)
* the kgdb's getthread
* the create_kthread() (since the kthread is run in init ns)

So use the find_task_by_pid_ns(..., &init_pid_ns) there
and schedule the find_task_by_pid for removal.

[sukadev@us.ibm.com: Fix warning in kernel/pid.c]
Signed-off-by: Pavel Emelyanov
Cc: "Eric W. Biederman"
Signed-off-by: Sukadev Bhattiprolu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelyanov
2008-04-30 23:29:48 +0800
b7127aa45 free_pidmap: turn it into free_pidmap(struct upid *) ... Browse Code »

The callers of free_pidmap() pass 2 members of "struct upid", we can just
pass "struct upid *" instead. Shaves off 10 bytes from pid.o.

Also, simplify the alloc_pid's "out_free:" error path a little bit. This
way it looks more clear which subset of pid->numbers[] we are freeing.

Signed-off-by: Oleg Nesterov
Cc: Pavel Emelyanov
Cc: "Eric W. Biederman"
Cc :Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2008-04-30 23:29:48 +0800

09 Feb, 2008

3 commits

7ad5b3a50 kernel: remove fastcall in kernel/* ... Browse Code »

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Harvey Harrison
Acked-by: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Harvey Harrison
2008-02-09 01:22:31 +0800
44c4e1b25 pid: Extend/Fix pid_vnr ... Browse Code »

pid_vnr returns the user space pid with respect to the pid namespace the
struct pid was allocated in. What we want before we return a pid to user
space is the user space pid with respect to the pid namespace of current.

pid_vnr is a very nice optimization but because it isn't quite what we want
it is easy to use pid_vnr at times when we aren't certain the struct pid
was allocated in our pid namespace.

Currently this describes at least tiocgpgrp and tiocgsid in ttyio.c the
parent process reported in the core dumps and the parent process in
get_signal_to_deliver.

So unless the performance impact is huge having an interface that does what
we want instead of always what we want should be much more reliable and
much less error prone.

Signed-off-by: Eric W. Biederman
Cc: Oleg Nesterov
Acked-by: Pavel Emelyanov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Eric W. Biederman
2008-02-09 01:22:27 +0800
74bd59bb3 namespaces: cleanup the code managed with PID_NS option ... Browse Code »

Just like with the user namespaces, move the namespace management code into
the separate .c file and mark the (already existing) PID_NS option as "depend
on NAMESPACES"

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Pavel Emelyanov
Acked-by: Serge Hallyn
Cc: Cedric Le Goater
Cc: "Eric W. Biederman"
Cc: Herbert Poetzl
Cc: Kirill Korotaev
Cc: Sukadev Bhattiprolu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelyanov
2008-02-09 01:22:23 +0800

08 Feb, 2008

1 commit

eccba0689 gfs2: make gfs2_glock.gl_owner_pid be a struct pid * ... Browse Code »

The gl_owner_pid field is used to get the lock owning task by its pid, so make
it in a proper manner, i.e. by using the struct pid pointer and pid_task()
function.

The pid_task() becomes exported for the gfs2 module.

Signed-off-by: Pavel Emelyanov
Cc: "Eric W. Biederman"
Acked-by: Steven Whitehouse
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelyanov
2008-02-08 00:42:06 +0800

15 Nov, 2007

1 commit

57d5f66b8 pidns: Place under CONFIG_EXPERIMENTAL ... Browse Code »

This is my trivial patch to swat innumerable little bugs with a single
blow.

After some intensive review (my apologies for not having gotten to this
sooner) what we have looks like a good base to build on with the current
pid namespace code but it is not complete, and it is still much to simple
to find issues where the kernel does the wrong thing outside of the initial
pid namespace.

Until the dust settles and we are certain we have the ABI and the
implementation is as correct as humanly possible let's keep process ID
namespaces behind CONFIG_EXPERIMENTAL.

Allowing us the option of fixing any ABI or other bugs we find as long as
they are minor.

Allowing users of the kernel to avoid those bugs simply by ensuring their
kernel does not have support for multiple pid namespaces.

[akpm@linux-foundation.org: coding-style cleanups]
Signed-off-by: Eric W. Biederman
Cc: Cedric Le Goater
Cc: Adrian Bunk
Cc: Jeremy Fitzhardinge
Cc: Kir Kolyshkin
Cc: Kirill Korotaev
Cc: Pavel Emelyanov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Eric W. Biederman
2007-11-15 10:45:43 +0800

20 Oct, 2007

15 commits

2f2a3a46f Uninline the task_xid_nr_ns() calls ... Browse Code »

Since these are expanded into call to pid_nr_ns() anyway, it's OK to move
the whole routine out-of-line. This is a cheap way to save ~100 bytes from
vmlinux. Together with the previous two patches, it saves half-a-kilo from
the vmlinux.

Un-inline other (currently inlined) functions must be done with additional
performance testing.

Signed-off-by: Pavel Emelyanov
Cc: Sukadev Bhattiprolu
Cc: Oleg Nesterov
Cc: Paul Menage
Cc: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelyanov
2007-10-20 02:53:41 +0800
8990571eb Uninline find_pid etc set of functions ... Browse Code »

The find_pid/_vpid/_pid_ns functions are used to find the struct pid by its
id, depending on whic id - global or virtual - is used.

The find_vpid() is a macro that pushes the current->nsproxy->pid_ns on the
stack to call another function - find_pid_ns(). It turned out, that this
dereference together with the push itself cause the kernel text size to
grow too much.

Move all these out-of-line. Together with the previous patch this saves a
bit less that 400 bytes from .text section.

Signed-off-by: Pavel Emelyanov
Cc: Sukadev Bhattiprolu
Cc: Oleg Nesterov
Cc: Paul Menage
Cc: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelyanov
2007-10-20 02:53:41 +0800
19b9b9b54 pid namespaces: remove the struct pid unneeded fields ... Browse Code »

Since we've switched from using pid->nr to pid->upids->nr some
fields on struct pid are no longer needed

Signed-off-by: Pavel Emelyanov
Cc: Oleg Nesterov
Cc: Sukadev Bhattiprolu
Cc: Paul Menage
Cc: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelyanov
2007-10-20 02:53:40 +0800
228ebcbe6 Uninline find_task_by_xxx set of functions ... Browse Code »

The find_task_by_something is a set of macros are used to find task by pid
depending on what kind of pid is proposed - global or virtual one. All of
them are wrappers above the most generic one - find_task_by_pid_type_ns() -
and just substitute some args for it.

It turned out, that dereferencing the current->nsproxy->pid_ns construction
and pushing one more argument on the stack inline cause kernel text size to
grow.

This patch moves all this stuff out-of-line into kernel/pid.c. Together
with the next patch it saves a bit less than 400 bytes from the .text
section.

Signed-off-by: Pavel Emelyanov
Cc: Sukadev Bhattiprolu
Cc: Oleg Nesterov
Cc: Paul Menage
Cc: "Eric W. Biederman"
Acked-by: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelyanov
2007-10-20 02:53:40 +0800
3eb07c8c8 pid namespaces: destroy pid namespace on init's death ... Browse Code »

Terminate all processes in a namespace when the reaper of the namespace is
exiting. We do this by walking the pidmap of the namespace and sending
SIGKILL to all processes.

Signed-off-by: Sukadev Bhattiprolu
Acked-by: Pavel Emelyanov
Cc: Oleg Nesterov
Cc: Sukadev Bhattiprolu
Cc: Paul Menage
Cc: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Sukadev Bhattiprolu
2007-10-20 02:53:40 +0800
c9c5d9221 pid namespaces: create a slab-cache for 'struct pid_namespace' ... Browse Code »

This will help fixing memory leaks due to bad reference counting.

Signed-off-by: Sukadev Bhattiprolu
Cc: Oleg Nesterov
Cc: Sukadev Bhattiprolu
Cc: Paul Menage
Cc: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Sukadev Bhattiprolu
2007-10-20 02:53:40 +0800
30e49c263 pid namespaces: allow cloning of new namespace ... Browse Code »

When clone() is invoked with CLONE_NEWPID, create a new pid namespace and then
create a new struct pid for the new process. Allocate pid_t's for the new
process in the new pid namespace and all ancestor pid namespaces. Make the
newly cloned process the session and process group leader.

Since the active pid namespace is special and expected to be the first entry
in pid->upid_list, preserve the order of pid namespaces.

The size of 'struct pid' is dependent on the the number of pid namespaces the
process exists in, so we use multiple pid-caches'. Only one pid cache is
created during system startup and this used by processes that exist only in
init_pid_ns.

When a process clones its pid namespace, we create additional pid caches as
necessary and use the pid cache to allocate 'struct pids' for that depth.

Note, that with this patch the newly created namespace won't work, since the
rest of the kernel still uses global pids, but this is to be fixed soon. Init
pid namespace still works.

[oleg@tv-sign.ru: merge fix]
Signed-off-by: Pavel Emelyanov
Signed-off-by: Sukadev Bhattiprolu
Cc: Paul Menage
Cc: "Eric W. Biederman"
Cc: Oleg Nesterov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelyanov
2007-10-20 02:53:39 +0800
b461cc038 pid namespaces: miscellaneous preparations for pid namespaces ... Browse Code »

* remove pid.h from pid_namespaces.h;
* rework is_(cgroup|global)_init;
* optimize (get|put)_pid_ns for init_pid_ns;
* declare task_child_reaper to return actual reaper.

Signed-off-by: Pavel Emelyanov
Cc: Oleg Nesterov
Cc: Sukadev Bhattiprolu
Cc: Paul Menage
Cc: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelyanov
2007-10-20 02:53:39 +0800
198fe21b0 pid namespaces: helpers to find the task by its numerical ids ... Browse Code »

When searching the task by numerical id on may need to find it using global
pid (as it is done now in kernel) or by its virtual id, e.g. when sending a
signal to a task from one namespace the sender will specify the task's virtual
id and we should find the task by this value.

[akpm@linux-foundation.org: fix gfs2 linkage]
Signed-off-by: Pavel Emelyanov
Cc: Oleg Nesterov
Cc: Sukadev Bhattiprolu
Cc: Paul Menage
Cc: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelyanov
2007-10-20 02:53:39 +0800
7af572947 pid namespaces: helpers to obtain pid numbers ... Browse Code »

When showing pid to user or getting the pid numerical id for in-kernel use the
value of this id may differ depending on the namespace.

This set of helpers is used to get the global pid nr, the virtual (i.e. seen
by task in its namespace) nr and the nr as it is seen from the specified
namespace.

Signed-off-by: Pavel Emelyanov
Cc: Oleg Nesterov
Cc: Sukadev Bhattiprolu
Cc: Paul Menage
Cc: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelyanov
2007-10-20 02:53:39 +0800
8ef047aaa pid namespaces: make alloc_pid(), free_pid() and put_pid() work with struct upid ... Browse Code »

Each struct upid element of struct pid has to be initialized properly, i.e.
its nr mst be allocated from appropriate pidmap and ns set to appropriate
namespace.

When allocating a new pid, we need to know the namespace this pid will live
in, so the additional argument is added to alloc_pid().

On the other hand, the rest of the kernel still uses the pid->nr and
pid->pid_chain fields, so these ones are still initialized, but this will be
removed soon.

Signed-off-by: Pavel Emelyanov
Cc: Oleg Nesterov
Cc: Sukadev Bhattiprolu
Cc: Paul Menage
Cc: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelyanov
2007-10-20 02:53:39 +0800
faacbfd3a pid namespaces: add support for pid namespaces hierarchy ... Browse Code »

Each namespace has a parent and is characterized by its "level". Level is the
number of the namespace generation. E.g. init namespace has level 0, after
cloning new one it will have level 1, the next one - 2 and so on and so forth.
This level is not explicitly limited.

True hierarchy must have some way to find each namespace's children, but it is
not used in the patches, so this ability is not added (yet).

Signed-off-by: Pavel Emelyanov
Cc: Oleg Nesterov
Cc: Sukadev Bhattiprolu
Cc: Paul Menage
Cc: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelyanov
2007-10-20 02:53:38 +0800
b460cbc58 pid namespaces: define is_global_init() and is_container_init() ... Browse Code »

is_init() is an ambiguous name for the pid==1 check. Split it into
is_global_init() and is_container_init().

A cgroup init has it's tsk->pid == 1.

A global init also has it's tsk->pid == 1 and it's active pid namespace
is the init_pid_ns. But rather than check the active pid namespace,
compare the task structure with 'init_pid_ns.child_reaper', which is
initialized during boot to the /sbin/init process and never changes.

Changelog:

2.6.22-rc4-mm2-pidns1:
- Use 'init_pid_ns.child_reaper' to determine if a given task is the
global init (/sbin/init) process. This would improve performance
and remove dependence on the task_pid().

2.6.21-mm2-pidns2:

- [Sukadev Bhattiprolu] Changed is_container_init() calls in {powerpc,
ppc,avr32}/traps.c for the _exception() call to is_global_init().
This way, we kill only the cgroup if the cgroup's init has a
bug rather than force a kernel panic.

[akpm@linux-foundation.org: fix comment]
[sukadev@us.ibm.com: Use is_global_init() in arch/m32r/mm/fault.c]
[bunk@stusta.de: kernel/pid.c: remove unused exports]
[sukadev@us.ibm.com: Fix capability.c to work with threaded init]
Signed-off-by: Serge E. Hallyn
Signed-off-by: Sukadev Bhattiprolu
Acked-by: Pavel Emelianov
Cc: Eric W. Biederman
Cc: Cedric Le Goater
Cc: Dave Hansen
Cc: Herbert Poetzel
Cc: Kirill Korotaev
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Serge E. Hallyn
2007-10-20 02:53:37 +0800
2894d650c pid namespaces: define and use task_active_pid_ns() wrapper ... Browse Code »

With multiple pid namespaces, a process is known by some pid_t in every
ancestor pid namespace. Every time the process forks, the child process also
gets a pid_t in every ancestor pid namespace.

While a process is visible in >=1 pid namespaces, it can see pid_t's in only
one pid namespace. We call this pid namespace it's "active pid namespace",
and it is always the youngest pid namespace in which the process is known.

This patch defines and uses a wrapper to find the active pid namespace of a
process. The implementation of the wrapper will be changed in when support
for multiple pid namespaces are added.

Changelog:
2.6.22-rc4-mm2-pidns1:
- [Pavel Emelianov, Alexey Dobriyan] Back out the change to use
task_active_pid_ns() in child_reaper() since task->nsproxy
can be NULL during task exit (so child_reaper() continues to
use init_pid_ns).

to implement child_reaper() since init_pid_ns.child_reaper to
implement child_reaper() since tsk->nsproxy can be NULL during exit.

2.6.21-rc6-mm1:
- Rename task_pid_ns() to task_active_pid_ns() to reflect that a
process can have multiple pid namespaces.

Signed-off-by: Sukadev Bhattiprolu
Acked-by: Pavel Emelianov
Cc: Eric W. Biederman
Cc: Cedric Le Goater
Cc: Dave Hansen
Cc: Serge Hallyn
Cc: Herbert Poetzel
Cc: Kirill Korotaev
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Sukadev Bhattiprolu
2007-10-20 02:53:37 +0800
baf8f0f82 pid namespaces: dynamic kmem cache allocator for pid namespaces ... Browse Code »

Add kmem_cache to pid_namespace to allocate pids from.

Since both implementations expand the struct pid to carry more numerical
values each namespace should have separate cache to store pids of different
sizes.

Each kmem cache is name "pid_", where is the number of numerical ids
on the pid. Different namespaces with same level of nesting will have same
caches.

This patch has two FIXMEs that are to be fixed after we reach the consensus
about the struct pid itself.

The first one is that the namespace to free the pid from in free_pid() must be
taken from pid. Now the init_pid_ns is used.

The second FIXME is about the cache allocation. When we do know how long the
object will be then we'll have to calculate this size in create_pid_cachep.
Right now the sizeof(struct pid) value is used.

[akpm@linux-foundation.org: coding-style repair]
Signed-off-by: Pavel Emelianov
Acked-by: Cedric Le Goater
Acked-by: Sukadev Bhattiprolu
Cc: Kirill Korotaev
Cc: "Eric W. Biederman"
Cc: Herbert Poetzl
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelianov
2007-10-20 02:53:37 +0800

17 Jul, 2007

1 commit

213dd266d namespace: ensure clone_flags are always stored in an unsigned long ... Browse Code »

While working on unshare support for the network namespace I noticed we
were putting clone flags in an int. Which is weird because the syscall
uses unsigned long and we at least need an unsigned to properly hold all of
the unshare flags.

So to make the code consistent, this patch updates the code to use
unsigned long instead of int for the clone flags in those places
where we get it wrong today.

Signed-off-by: Eric W. Biederman
Acked-by: Cedric Le Goater
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Eric W. Biederman
2007-07-17 00:05:48 +0800

11 May, 2007

2 commits

820e45db2 statically initialize struct pid for swapper ... Browse Code »

Statically initialize a struct pid for the swapper process (pid_t == 0) and
attach it to init_task. This is needed so task_pid(), task_pgrp() and
task_session() interfaces work on the swapper process also.

Signed-off-by: Sukadev Bhattiprolu
Cc: Cedric Le Goater
Cc: Dave Hansen
Cc: Serge Hallyn
Cc: Eric Biederman
Cc: Herbert Poetzl
Cc:
Acked-by: Eric W. Biederman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Sukadev Bhattiprolu
2007-05-11 23:29:35 +0800
e713d0dab attach_pid() with struct pid parameter ... Browse Code »

attach_pid() currently takes a pid_t and then uses find_pid() to find the
corresponding struct pid. Sometimes we already have the struct pid. We can
then skip find_pid() if attach_pid() were to take a struct pid parameter.

Signed-off-by: Sukadev Bhattiprolu
Cc: Cedric Le Goater
Cc: Dave Hansen
Cc: Serge Hallyn
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Sukadev Bhattiprolu
2007-05-11 23:29:35 +0800

09 May, 2007

1 commit

e3222c4ec Merge sys_clone()/sys_unshare() nsproxy and namespace handling ... Browse Code »

sys_clone() and sys_unshare() both makes copies of nsproxy and its associated
namespaces. But they have different code paths.

This patch merges all the nsproxy and its associated namespace copy/clone
handling (as much as possible). Posted on container list earlier for
feedback.

- Create a new nsproxy and its associated namespaces and pass it back to
caller to attach it to right process.

- Changed all copy_*_ns() routines to return a new copy of namespace
instead of attaching it to task->nsproxy.

- Moved the CAP_SYS_ADMIN checks out of copy_*_ns() routines.

- Removed unnessary !ns checks from copy_*_ns() and added BUG_ON()
just incase.

- Get rid of all individual unshare_*_ns() routines and make use of
copy_*_ns() instead.

[akpm@osdl.org: cleanups, warning fix]
[clg@fr.ibm.com: remove dup_namespaces() declaration]
[serue@us.ibm.com: fix CONFIG_IPC_NS=n, clone(CLONE_NEWIPC) retval]
[akpm@linux-foundation.org: fix build with CONFIG_SYSVIPC=n]
Signed-off-by: Badari Pulavarty
Signed-off-by: Serge Hallyn
Cc: Cedric Le Goater
Cc: "Eric W. Biederman"
Cc:
Signed-off-by: Cedric Le Goater
Cc: Oleg Nesterov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Badari Pulavarty
2007-05-09 02:15:00 +0800

08 May, 2007

1 commit

0a31bd5f2 KMEM_CACHE(): simplify slab cache creation ... Browse Code »

This patch provides a new macro

KMEM_CACHE(, )

to simplify slab creation. KMEM_CACHE creates a slab with the name of the
struct, with the size of the struct and with the alignment of the struct.
Additional slab flags may be specified if necessary.

Example

struct test_slab {
int a,b,c;
struct list_head;
} __cacheline_aligned_in_smp;

test_slab_cache = KMEM_CACHE(test_slab, SLAB_PANIC)

will create a new slab named "test_slab" of the size sizeof(struct
test_slab) and aligned to the alignment of test slab. If it fails then we
panic.

Signed-off-by: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christoph Lameter
2007-05-08 03:12:55 +0800