Eric Lee / smarc-fsl-linux-kernel

13 Jan, 2012

1 commit

b8f566b04 sysctl: add the kernel.ns_last_pid control ... Browse Code »

The sysctl works on the current task's pid namespace, getting and setting
its last_pid field.

Writing is allowed for CAP_SYS_ADMIN-capable tasks thus making it possible
to create a task with desired pid value. This ability is required badly
for the checkpoint/restore in userspace.

This approach suits all the parties for now.

Signed-off-by: Pavel Emelyanov
Acked-by: Tejun Heo
Cc: Oleg Nesterov
Cc: Cyrill Gorcunov
Cc: "Eric W. Biederman"
Cc: Serge Hallyn
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelyanov
2012-01-13 12:13:11 +0800

31 Oct, 2011

1 commit

9984de1a5 kernel: Map most files to use export.h instead of module.h ... Browse Code »

The changed files were only including linux/module.h for the
EXPORT_SYMBOL infrastructure, and nothing else. Revector them
onto the isolated export header for faster compile times.

Nothing to see here but a whole lot of instances of:

-#include
+#include

This commit is only changing the kernel dir; next targets
will probably be mm, fs, the arch dirs, etc.

Signed-off-by: Paul Gortmaker

Paul Gortmaker
2011-10-31 21:20:12 +0800

29 Sep, 2011

1 commit

b3fbab057 rcu: Restore checks for blocking in RCU read-side critical sections ... Browse Code »

Long ago, using TREE_RCU with PREEMPT would result in "scheduling
while atomic" diagnostics if you blocked in an RCU read-side critical
section. However, PREEMPT now implies TREE_PREEMPT_RCU, which defeats
this diagnostic. This commit therefore adds a replacement diagnostic
based on PROVE_RCU.

Because rcu_lockdep_assert() and lockdep_rcu_dereference() are now being
used for things that have nothing to do with rcu_dereference(), rename
lockdep_rcu_dereference() to lockdep_rcu_suspicious() and add a third
argument that is a string indicating what is suspicious. This third
argument is passed in from a new third argument to rcu_lockdep_assert().
Update all calls to rcu_lockdep_assert() to add an informative third
argument.

Also, add a pair of rcu_lockdep_assert() calls from within
rcu_note_context_switch(), one complaining if a context switch occurs
in an RCU-bh read-side critical section and another complaining if a
context switch occurs in an RCU-sched read-side critical section.
These are present only if the PROVE_RCU kernel parameter is enabled.

Finally, fix some checkpatch whitespace complaints in lockdep.c.

Again, you must enable PROVE_RCU to see these new diagnostics. But you
are enabling PROVE_RCU to check out new RCU uses in any case, aren't you?

Signed-off-by: Paul E. McKenney

Paul E. McKenney
2011-09-29 12:36:37 +0800

09 Jul, 2011

1 commit

d8bf4ca9c rcu: treewide: Do not use rcu_read_lock_held when calling rcu_dereference_check ... Browse Code »

Since ca5ecddf (rcu: define __rcu address space modifier for sparse)
rcu_dereference_check use rcu_read_lock_held as a part of condition
automatically so callers do not have to do that as well.

Signed-off-by: Michal Hocko
Acked-by: Paul E. McKenney
Signed-off-by: Jiri Kosina

Michal Hocko
2011-07-09 04:21:58 +0800

19 Apr, 2011

1 commit

c78193e9c next_pidmap: fix overflow condition ... Browse Code »

next_pidmap() just quietly accepted whatever 'last' pid that was passed
in, which is not all that safe when one of the users is /proc.

Admittedly the proc code should do some sanity checking on the range
(and that will be the next commit), but that doesn't mean that the
helper functions should just do that pidmap pointer arithmetic without
checking the range of its arguments.

So clamp 'last' to PID_MAX_LIMIT. The fact that we then do "last+1"
doesn't really matter, the for-loop does check against the end of the
pidmap array properly (it's only the actual pointer arithmetic overflow
case we need to worry about, and going one bit beyond isn't going to
overflow).

[ Use PID_MAX_LIMIT rather than pid_max as per Eric Biederman ]

Reported-by: Tavis Ormandy
Analyzed-by: Robert Święcki
Cc: Eric W. Biederman
Cc: Pavel Emelyanov
Signed-off-by: Linus Torvalds

Linus Torvalds
2011-04-19 01:35:30 +0800

18 Mar, 2011

1 commit

77c100c83 export pid symbols needed for kvm_vcpu_on_spin ... Browse Code »

Export the symbols required for a race-free kvm_vcpu_on_spin.

Signed-off-by: Rik van Riel
Signed-off-by: Avi Kivity

Rik van Riel
2011-03-18 00:08:28 +0800

20 Aug, 2010

2 commits

4221a9918 Add RCU check for find_task_by_vpid(). ... Browse Code »

find_task_by_vpid() says "Must be called under rcu_read_lock().". But due to
commit 3120438 "rcu: Disable lockdep checking in RCU list-traversal primitives",
we are currently unable to catch "find_task_by_vpid() with tasklist_lock held
but RCU lock not held" errors due to the RCU-lockdep checks being
suppressed in the RCU variants of the struct list_head traversals.
This commit therefore places an explicit check for being in an RCU
read-side critical section in find_task_by_pid_ns().

===================================================
[ INFO: suspicious rcu_dereference_check() usage. ]
---------------------------------------------------
kernel/pid.c:386 invoked rcu_dereference_check() without protection!

other info that might help us debug this:

rcu_scheduler_active = 1, debug_locks = 1
1 lock held by rc.sysinit/1102:
#0: (tasklist_lock){.+.+..}, at: [] sys_setpgid+0x40/0x160

stack backtrace:
Pid: 1102, comm: rc.sysinit Not tainted 2.6.35-rc3-dirty #1
Call Trace:
[] lockdep_rcu_dereference+0x94/0xb0
[] find_task_by_pid_ns+0x6d/0x70
[] find_task_by_vpid+0x18/0x20
[] sys_setpgid+0x47/0x160
[] sysenter_do_call+0x12/0x36

Commit updated to use a new rcu_lockdep_assert() exported API rather than
the old internal __do_rcu_dereference().

Signed-off-by: Tetsuo Handa
Signed-off-by: Paul E. McKenney
Reviewed-by: Josh Triplett

Tetsuo Handa
2010-08-20 08:18:02 +0800
67bdbffd6 rculist: avoid __rcu annotations ... Browse Code »

This avoids warnings from missing __rcu annotations
in the rculist implementation, making it possible to
use the same lists in both RCU and non-RCU cases.

We can add rculist annotations later, together with
lockdep support for rculist, which is missing as well,
but that may involve changing all the users.

Signed-off-by: Arnd Bergmann
Signed-off-by: Paul E. McKenney
Cc: Pavel Emelyanov
Cc: Sukadev Bhattiprolu
Reviewed-by: Josh Triplett

Arnd Bergmann
2010-08-20 08:18:00 +0800

11 Aug, 2010

2 commits

c52b0b91b pids: alloc_pidmap: remove the unnecessary boundary checks ... Browse Code »

alloc_pidmap() calculates max_scan so that if the initial offset != 0 we
inspect the first map->page twice. This is correct, we want to find the
unused bits < offset in this bitmap block. Add the comment.

But it doesn't make any sense to stop the find_next_offset() loop when we
are looking into this map->page for the second time. We have already
already checked the bits >= offset during the first attempt, it is fine to
do this again, no matter if we succeed this time or not.

Remove this hard-to-understand code. It optimizes the very unlikely case
when we are going to fail, but slows down the more likely case.

Signed-off-by: Oleg Nesterov
Cc: Salman Qazi
Cc: Ingo Molnar
Cc: Sukadev Bhattiprolu
Cc: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2010-08-11 23:59:20 +0800
5fdee8c4a pids: fix a race in pid generation that causes pids to be reused immediately ... Browse Code »

A program that repeatedly forks and waits is susceptible to having the
same pid repeated, especially when it competes with another instance of
the same program. This is really bad for bash implementation.
Furthermore, many shell scripts assume that pid numbers will not be used
for some length of time.

Race Description:

A B

// pid == offset == n // pid == offset == n + 1
test_and_set_bit(offset, map->page)
test_and_set_bit(offset, map->page);
pid_ns->last_pid = pid;
pid_ns->last_pid = pid;
// pid == n + 1 is freed (wait())

// Next fork()...
last = pid_ns->last_pid; // == n
pid = last + 1;

Code to reproduce it (Running multiple instances is more effective):

#include
#include
#include
#include
#include
#include

// The distance mod 32768 between two pids, where the first pid is expected
// to be smaller than the second.
int PidDistance(pid_t first, pid_t second) {
return (second + 32768 - first) % 32768;
}

int main(int argc, char* argv[]) {
int failed = 0;
pid_t last_pid = 0;
int i;
printf("%d\n", sizeof(pid_t));
for (i = 0; i < 10000000; ++i) {
if (i % 32786 == 0)
printf("Iter: %d\n", i/32768);
int child_exit_code = i % 256;
pid_t pid = fork();
if (pid == -1) {
fprintf(stderr, "fork failed, iteration %d, errno=%d", i, errno);
exit(1);
}
if (pid == 0) {
// Child
exit(child_exit_code);
} else {
// Parent
if (i > 0) {
int distance = PidDistance(last_pid, pid);
if (distance == 0 || distance > 30000) {
fprintf(stderr,
"Unexpected pid sequence: previous fork: pid=%d, "
"current fork: pid=%d for iteration=%d.\n",
last_pid, pid, i);
failed = 1;
}
}
last_pid = pid;
int status;
int reaped = wait(&status);
if (reaped != pid) {
fprintf(stderr,
"Wait return value: expected pid=%d, "
"got %d, iteration %d\n",
pid, reaped, i);
failed = 1;
} else if (WEXITSTATUS(status) != child_exit_code) {
fprintf(stderr,
"Unexpected exit status %x, iteration %d\n",
WEXITSTATUS(status), i);
failed = 1;
}
}
}
exit(failed);
}

Thanks to Ted Tso for the key ideas of this implementation.

Signed-off-by: Salman Qazi
Cc: Ingo Molnar
Cc: Theodore Ts'o
Cc: Peter Zijlstra
Cc: Sukadev Bhattiprolu
Cc: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Salman
2010-08-11 23:59:20 +0800

28 May, 2010

1 commit

72680a191 pids: increase pid_max based on num_possible_cpus ... Browse Code »

On a system with a substantial number of processors, the early default
pid_max of 32k will not be enough. A system with 1664 CPU's, there are
25163 processes started before the login prompt. It's estimated that with
2048 CPU's we will pass the 32k limit. With 4096, we'll reach that limit
very early during the boot cycle, and processes would stall waiting for an
available pid.

This patch increases the early maximum number of pids available, and
increases the minimum number of pids that can be set during runtime.

[akpm@linux-foundation.org: fix warnings]
Signed-off-by: Hedi Berriche
Signed-off-by: Mike Travis
Signed-off-by: Robin Holt
Acked-by: Linus Torvalds
Cc: Ingo Molnar
Cc: Pavel Machek
Cc: Alan Cox
Cc: Greg KH
Cc: Rik van Riel
Cc: John Stoffel
Cc: Jack Steiner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Hedi Berriche
2010-05-28 00:12:51 +0800

14 Mar, 2010

1 commit

4e3eaddd1 Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel… ... Browse Code »

…/git/tip/linux-2.6-tip

* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
locking: Make sparse work with inline spinlocks and rwlocks
x86/mce: Fix RCU lockdep splats
rcu: Increase RCU CPU stall timeouts if PROVE_RCU
ftrace: Replace read_barrier_depends() with rcu_dereference_raw()
rcu: Suppress RCU lockdep warnings during early boot
rcu, ftrace: Fix RCU lockdep splat in ftrace_perf_buf_prepare()
rcu: Suppress __mpol_dup() false positive from RCU lockdep
rcu: Make rcu_read_lock_sched_held() handle !PREEMPT
rcu: Add control variables to lockdep_rcu_dereference() diagnostics
rcu, cgroup: Relax the check in task_subsys_state() as early boot is now handled by lockdep-RCU
rcu: Use wrapper function instead of exporting tasklist_lock
sched, rcu: Fix rcu_dereference() for RCU-lockdep
rcu: Make task_subsys_state() RCU-lockdep checks handle boot-time use
rcu: Fix holdoff for accelerated GPs for last non-dynticked CPU
x86/gart: Unexport gart_iommu_aperture

Fix trivial conflicts in kernel/trace/ftrace.c

Linus Torvalds
2010-03-14 06:43:01 +0800

07 Mar, 2010

1 commit

9728e5d6e kernel/pid.c: update comment on find_task_by_pid_ns ... Browse Code »

tasklist_lock does protect the task and its pid, it can't go away. The
problem is that find_pid_ns() itself is unsafe without rcu lock, it can
race with copy_process()->free_pid(any_pid).

Protecting copy_process()->free_pid(any_pid) with tasklist_lock would make
it possible to call find_task_by_pid_ns() under tasklist safely, but we
don't do so because we are trying to get rid of the read_lock sites of
tasklist_lock.

Signed-off-by: Tetsuo Handa
Cc: Oleg Nesterov
Cc: "Paul E. McKenney"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Tetsuo Handa
2010-03-07 03:26:33 +0800

04 Mar, 2010

1 commit

db1466b3e rcu: Use wrapper function instead of exporting tasklist_lock ... Browse Code »

Lockdep-RCU commit d11c563d exported tasklist_lock, which is not
a good thing. This patch instead exports a function that uses
lockdep to check whether tasklist_lock is held.

Suggested-by: Christoph Hellwig
Signed-off-by: Paul E. McKenney
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
Cc: Christoph Hellwig
LKML-Reference:
Signed-off-by: Ingo Molnar

Paul E. McKenney
2010-03-04 18:46:14 +0800

25 Feb, 2010

1 commit

d11c563dd sched: Use lockdep-based checking on rcu_dereference() ... Browse Code »

Update the rcu_dereference() usages to take advantage of the new
lockdep-based checking.

Signed-off-by: Paul E. McKenney
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
LKML-Reference:
[ -v2: fix allmodconfig missing symbol export build failure on x86 ]
Signed-off-by: Ingo Molnar

Paul E. McKenney
2010-02-25 17:34:26 +0800

16 Dec, 2009

2 commits

417e31524 pid: reduce code size by using a pointer to iterate over array ... Browse Code »

It decreases code size by 16 bytes on my gcc 4.4.1 on Core 2:
text data bss dec hex filename
4314 2216 8 6538 198a kernel/pid.o-BEFORE
4298 2216 8 6522 197a kernel/pid.o-AFTER

Signed-off-by: André Goddard Rosa
Cc: Pekka Enberg
Cc: Oleg Nesterov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

André Goddard Rosa
2009-12-16 23:20:12 +0800
7be6d991b pid: tighten pidmap spinlock critical section by removing kfree() ... Browse Code »

Avoid calling kfree() under pidmap spinlock, calling it afterwards.

Normally kfree() is fast, but sometimes it can be slow, so avoid
calling it under the spinlock if we can do it.

Signed-off-by: André Goddard Rosa
Cc: Pekka Enberg
Cc: Oleg Nesterov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

André Goddard Rosa
2009-12-16 23:20:12 +0800

22 Sep, 2009

1 commit

2c85f51d2 mm: also use alloc_large_system_hash() for the PID hash table ... Browse Code »

This is being done by allowing boot time allocations to specify that they
may want a sub-page sized amount of memory.

Overall this seems more consistent with the other hash table allocations,
and allows making two supposedly mm-only variables really mm-only
(nr_{kernel,all}_pages).

Signed-off-by: Jan Beulich
Cc: Ingo Molnar
Cc: "Eric W. Biederman"
Cc: Mel Gorman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Beulich
2009-09-22 22:17:38 +0800

10 Jul, 2009

1 commit

264ef8a90 kmemleak: Remove alloc_bootmem annotations introduced in the past ... Browse Code »

kmemleak_alloc() calls were added in some places where alloc_bootmem was
called. Since now kmemleak tracks bootmem allocations, these explicit
calls should be run.

Signed-off-by: Catalin Marinas
Cc: Ingo Molnar
Acked-by: Pekka Enberg

Catalin Marinas
2009-07-10 00:07:02 +0800

30 Jun, 2009

1 commit

12de38b18 kmemleak: Inform kmemleak about pid_hash ... Browse Code »

Kmemleak does not track alloc_bootmem calls but the pid_hash allocated
in pidhash_init() would need to be scanned as it contains pointers to
struct pid objects.

Signed-off-by: Catalin Marinas

Catalin Marinas
2009-06-30 00:14:14 +0800

19 Jun, 2009

1 commit

17f98dcf6 pids: clean up find_task_by_pid variants ... Browse Code »

find_task_by_pid_type_ns is only used to implement find_task_by_vpid and
find_task_by_pid_ns, but both of them pass PIDTYPE_PID as first argument.
So just fold find_task_by_pid_type_ns into find_task_by_pid_ns and use
find_task_by_pid_ns to implement find_task_by_vpid.

While we're at it also remove the exports for find_task_by_pid_ns and
find_task_by_vpid - we don't have any modular callers left as the only
modular caller of he old pre pid namespace find_task_by_pid (gfs2) was
switched to pid_task which operates on a struct pid pointer instead of a
pid_t. Given the confusion about pid_t values vs namespace that's
generally the better option anyway and I think we're better of restricting
modules to do it that way.

Signed-off-by: Christoph Hellwig
Cc: Pavel Emelyanov
Cc: "Eric W. Biederman"
Cc: Ingo Molnar
Cc: Oleg Nesterov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christoph Hellwig
2009-06-19 04:03:55 +0800

03 Apr, 2009

2 commits

52ee2dfdd pids: refactor vnr/nr_ns helpers to make them safe ... Browse Code »

Inho, the safety rules for vnr/nr_ns helpers are horrible and buggy.

task_pid_nr_ns(task) needs rcu/tasklist depending on task == current.

As for "special" pids, vnr/nr_ns helpers always need rcu. However, if
task != current, they are unsafe even under rcu lock, we can't trust
task->group_leader without the special checks.

And almost every helper has a callsite which needs a fix.

Also, it is a bit annoying that the implementations of, say,
task_pgrp_vnr() and task_pgrp_nr_ns() are not "symmetrical".

This patch introduces the new helper, __task_pid_nr_ns(), which is always
safe to use, and turns all other helpers into the trivial wrappers.

After this I'll send another patch which converts task_tgid_xxx() as well,
they're are a bit special.

Signed-off-by: Oleg Nesterov
Cc: Louis Rilling
Cc: "Eric W. Biederman"
Cc: Pavel Emelyanov
Cc: Sukadev Bhattiprolu
Cc: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2009-04-03 10:05:02 +0800
2ae448efc pids: improve get_task_pid() to fix the unsafe sys_wait4()->task_pgrp() ... Browse Code »

sys_wait4() does get_pid(task_pgrp(current)), this is not safe. We can
add rcu lock/unlock around, but we already have get_task_pid() which can
be improved to handle the special pids in more reliable manner.

Signed-off-by: Oleg Nesterov
Cc: Louis Rilling
Cc: "Eric W. Biederman"
Cc: Pavel Emelyanov
Cc: Sukadev Bhattiprolu
Cc: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2009-04-03 10:05:02 +0800

09 Jan, 2009

1 commit

61bce0f13 pid: generalize task_active_pid_ns ... Browse Code »

Currently task_active_pid_ns is not safe to call after a task becomes a
zombie and exit_task_namespaces is called, as nsproxy becomes NULL. By
reading the pid namespace from the pid of the task we can trivially solve
this problem at the cost of one extra memory read in what should be the
same cacheline as we read the namespace from.

When moving things around I have made task_active_pid_ns out of line
because keeping it in pid_namespace.h would require adding includes of
pid.h and sched.h that I don't think we want.

This change does make task_active_pid_ns unsafe to call during
copy_process until we attach a pid on the task_struct which seems to be a
reasonable trade off.

Signed-off-by: Eric W. Biederman
Signed-off-by: Sukadev Bhattiprolu
Cc: Oleg Nesterov
Cc: Roland McGrath
Cc: Bastian Blank
Cc: Pavel Emelyanov
Cc: Nadia Derbey
Acked-by: Serge Hallyn
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Eric W. Biederman
2009-01-09 00:31:12 +0800

06 Jan, 2009

1 commit

025dfdafe trivial: fix then -> than typos in comments and documentation ... Browse Code »

- (better, more, bigger ...) then -> (...) than

Signed-off-by: Frederik Schwarzer
Signed-off-by: Jiri Kosina

Frederik Schwarzer
2009-01-06 18:28:06 +0800

26 Jul, 2008

2 commits

e49859e71 pidns: remove now unused find_pid function. ... Browse Code »

This one had the only users so far - the kill_proc, which is removed, so
drop this (invalid in namespaced world) call too.

And of course - erase all references on it from comments.

Signed-off-by: Pavel Emelyanov
Cc: Oleg Nesterov
Cc: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelyanov
2008-07-26 01:53:45 +0800
339caf2a2 proc: misplaced export of find_get_pid ... Browse Code »

Move EXPORT_SYMBOL right after the func

Signed-off-by: David Sterba
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

David Sterba
2008-07-26 01:53:45 +0800

19 May, 2008

1 commit

82524746c rcu: split list.h and move rcu-protected lists into rculist.h ... Browse Code »

Move rcu-protected lists from list.h into a new header file rculist.h.

This is done because list are a very used primitive structure all over the
kernel and it's currently impossible to include other header files in this
list.h without creating some circular dependencies.

For example, list.h implements rcu-protected list and uses rcu_dereference()
without including rcupdate.h. It actually compiles because users of
rcu_dereference() are macros. Others RCU functions could be used too but
aren't probably because of this.

Therefore this patch creates rculist.h which includes rcupdates without to
many changes/troubles.

Signed-off-by: Franck Bui-Huu
Acked-by: Paul E. McKenney
Acked-by: Josh Triplett
Signed-off-by: Andrew Morton
Signed-off-by: Ingo Molnar

Franck Bui-Huu
2008-05-19 16:01:37 +0800

30 Apr, 2008

4 commits

24336eaee pids: introduce change_pid() helper ... Browse Code »

Based on Eric W. Biederman's idea.

Without tasklist_lock held task_session()/task_pgrp() can return NULL if the
caller races with setprgp()/setsid() which does detach_pid() + attach_pid().
This can happen even if task == current.

Intoduce the new helper, change_pid(), which should be used instead. This way
the caller always sees the special pid != NULL, either old or new.

Also change the prototype of attach_pid(), it always returns 0 and nobody
check the returned value.

Signed-off-by: Oleg Nesterov
Cc: "Eric W. Biederman"
Cc: Pavel Emelyanov
Cc: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2008-04-30 23:29:48 +0800
65450cebc pids: de_thread: don't clear session/pgrp pids for the old leader ... Browse Code »

Based on Eric W. Biederman's idea.

Unless task == current, without tasklist_lock held task_session()/task_pgrp()
can return NULL if the caller races with de_thread() which switches the group
leader.

Change transfer_pid() to not clear old->pids[type].pid for the old leader.
This means that its .pid can point to "nowhere", but this is already true for
sub-threads, and the old leader is not group_leader() any longer. IOW, with
or without this change we can't trust task's special pids unless it is the
group leader.

With this change the following code

rcu_read_lock();
task = find_task_by_xxx();
do_something(task_pgrp(task), task_session(task));
rcu_read_unlock();

can't race with exec and hit the NULL pid.

Signed-off-by: Oleg Nesterov
Cc: "Eric W. Biederman"
Cc: Pavel Emelyanov
Cc: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2008-04-30 23:29:48 +0800
5cd204550 Deprecate find_task_by_pid() ... Browse Code »

There are some places that are known to operate on tasks'
global pids only:

* the rest_init() call (called on boot)
* the kgdb's getthread
* the create_kthread() (since the kthread is run in init ns)

So use the find_task_by_pid_ns(..., &init_pid_ns) there
and schedule the find_task_by_pid for removal.

[sukadev@us.ibm.com: Fix warning in kernel/pid.c]
Signed-off-by: Pavel Emelyanov
Cc: "Eric W. Biederman"
Signed-off-by: Sukadev Bhattiprolu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelyanov
2008-04-30 23:29:48 +0800
b7127aa45 free_pidmap: turn it into free_pidmap(struct upid *) ... Browse Code »

The callers of free_pidmap() pass 2 members of "struct upid", we can just
pass "struct upid *" instead. Shaves off 10 bytes from pid.o.

Also, simplify the alloc_pid's "out_free:" error path a little bit. This
way it looks more clear which subset of pid->numbers[] we are freeing.

Signed-off-by: Oleg Nesterov
Cc: Pavel Emelyanov
Cc: "Eric W. Biederman"
Cc :Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2008-04-30 23:29:48 +0800

09 Feb, 2008

3 commits

7ad5b3a50 kernel: remove fastcall in kernel/* ... Browse Code »

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Harvey Harrison
Acked-by: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Harvey Harrison
2008-02-09 01:22:31 +0800
44c4e1b25 pid: Extend/Fix pid_vnr ... Browse Code »

pid_vnr returns the user space pid with respect to the pid namespace the
struct pid was allocated in. What we want before we return a pid to user
space is the user space pid with respect to the pid namespace of current.

pid_vnr is a very nice optimization but because it isn't quite what we want
it is easy to use pid_vnr at times when we aren't certain the struct pid
was allocated in our pid namespace.

Currently this describes at least tiocgpgrp and tiocgsid in ttyio.c the
parent process reported in the core dumps and the parent process in
get_signal_to_deliver.

So unless the performance impact is huge having an interface that does what
we want instead of always what we want should be much more reliable and
much less error prone.

Signed-off-by: Eric W. Biederman
Cc: Oleg Nesterov
Acked-by: Pavel Emelyanov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Eric W. Biederman
2008-02-09 01:22:27 +0800
74bd59bb3 namespaces: cleanup the code managed with PID_NS option ... Browse Code »

Just like with the user namespaces, move the namespace management code into
the separate .c file and mark the (already existing) PID_NS option as "depend
on NAMESPACES"

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Pavel Emelyanov
Acked-by: Serge Hallyn
Cc: Cedric Le Goater
Cc: "Eric W. Biederman"
Cc: Herbert Poetzl
Cc: Kirill Korotaev
Cc: Sukadev Bhattiprolu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelyanov
2008-02-09 01:22:23 +0800

08 Feb, 2008

1 commit

eccba0689 gfs2: make gfs2_glock.gl_owner_pid be a struct pid * ... Browse Code »

The gl_owner_pid field is used to get the lock owning task by its pid, so make
it in a proper manner, i.e. by using the struct pid pointer and pid_task()
function.

The pid_task() becomes exported for the gfs2 module.

Signed-off-by: Pavel Emelyanov
Cc: "Eric W. Biederman"
Acked-by: Steven Whitehouse
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelyanov
2008-02-08 00:42:06 +0800

15 Nov, 2007

1 commit

57d5f66b8 pidns: Place under CONFIG_EXPERIMENTAL ... Browse Code »

This is my trivial patch to swat innumerable little bugs with a single
blow.

After some intensive review (my apologies for not having gotten to this
sooner) what we have looks like a good base to build on with the current
pid namespace code but it is not complete, and it is still much to simple
to find issues where the kernel does the wrong thing outside of the initial
pid namespace.

Until the dust settles and we are certain we have the ABI and the
implementation is as correct as humanly possible let's keep process ID
namespaces behind CONFIG_EXPERIMENTAL.

Allowing us the option of fixing any ABI or other bugs we find as long as
they are minor.

Allowing users of the kernel to avoid those bugs simply by ensuring their
kernel does not have support for multiple pid namespaces.

[akpm@linux-foundation.org: coding-style cleanups]
Signed-off-by: Eric W. Biederman
Cc: Cedric Le Goater
Cc: Adrian Bunk
Cc: Jeremy Fitzhardinge
Cc: Kir Kolyshkin
Cc: Kirill Korotaev
Cc: Pavel Emelyanov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Eric W. Biederman
2007-11-15 10:45:43 +0800

20 Oct, 2007

3 commits

2f2a3a46f Uninline the task_xid_nr_ns() calls ... Browse Code »

Since these are expanded into call to pid_nr_ns() anyway, it's OK to move
the whole routine out-of-line. This is a cheap way to save ~100 bytes from
vmlinux. Together with the previous two patches, it saves half-a-kilo from
the vmlinux.

Un-inline other (currently inlined) functions must be done with additional
performance testing.

Signed-off-by: Pavel Emelyanov
Cc: Sukadev Bhattiprolu
Cc: Oleg Nesterov
Cc: Paul Menage
Cc: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelyanov
2007-10-20 02:53:41 +0800
8990571eb Uninline find_pid etc set of functions ... Browse Code »

The find_pid/_vpid/_pid_ns functions are used to find the struct pid by its
id, depending on whic id - global or virtual - is used.

The find_vpid() is a macro that pushes the current->nsproxy->pid_ns on the
stack to call another function - find_pid_ns(). It turned out, that this
dereference together with the push itself cause the kernel text size to
grow too much.

Move all these out-of-line. Together with the previous patch this saves a
bit less that 400 bytes from .text section.

Signed-off-by: Pavel Emelyanov
Cc: Sukadev Bhattiprolu
Cc: Oleg Nesterov
Cc: Paul Menage
Cc: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelyanov
2007-10-20 02:53:41 +0800
19b9b9b54 pid namespaces: remove the struct pid unneeded fields ... Browse Code »

Since we've switched from using pid->nr to pid->upids->nr some
fields on struct pid are no longer needed

Signed-off-by: Pavel Emelyanov
Cc: Oleg Nesterov
Cc: Sukadev Bhattiprolu
Cc: Paul Menage
Cc: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelyanov
2007-10-20 02:53:40 +0800