Eric Lee / smarc-fsl-linux-kernel

11 Jan, 2012

1 commit

0499680a4 procfs: add hidepid= and gid= mount options ... Browse Code »

Add support for mount options to restrict access to /proc/PID/
directories. The default backward-compatible "relaxed" behaviour is left
untouched.

The first mount option is called "hidepid" and its value defines how much
info about processes we want to be available for non-owners:

hidepid=0 (default) means the old behavior - anybody may read all
world-readable /proc/PID/* files.

hidepid=1 means users may not access any /proc// directories, but
their own. Sensitive files like cmdline, sched*, status are now protected
against other users. As permission checking done in proc_pid_permission()
and files' permissions are left untouched, programs expecting specific
files' modes are not confused.

hidepid=2 means hidepid=1 plus all /proc/PID/ will be invisible to other
users. It doesn't mean that it hides whether a process exists (it can be
learned by other means, e.g. by kill -0 $PID), but it hides process' euid
and egid. It compicates intruder's task of gathering info about running
processes, whether some daemon runs with elevated privileges, whether
another user runs some sensitive program, whether other users run any
program at all, etc.

gid=XXX defines a group that will be able to gather all processes' info
(as in hidepid=0 mode). This group should be used instead of putting
nonroot user in sudoers file or something. However, untrusted users (like
daemons, etc.) which are not supposed to monitor the tasks in the whole
system should not be added to the group.

hidepid=1 or higher is designed to restrict access to procfs files, which
might reveal some sensitive private information like precise keystrokes
timings:

http://www.openwall.com/lists/oss-security/2011/11/05/3

hidepid=1/2 doesn't break monitoring userspace tools. ps, top, pgrep, and
conky gracefully handle EPERM/ENOENT and behave as if the current user is
the only user running processes. pstree shows the process subtree which
contains "pstree" process.

Note: the patch doesn't deal with setuid/setgid issues of keeping
preopened descriptors of procfs files (like
https://lkml.org/lkml/2011/2/7/368). We rely on that the leaked
information like the scheduling counters of setuid apps doesn't threaten
anybody's privacy - only the user started the setuid program may read the
counters.

Signed-off-by: Vasiliy Kulikov
Cc: Alexey Dobriyan
Cc: Al Viro
Cc: Randy Dunlap
Cc: "H. Peter Anvin"
Cc: Greg KH
Cc: Theodore Tso
Cc: Alan Cox
Cc: James Morris
Cc: Oleg Nesterov
Cc: Hugh Dickins
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vasiliy Kulikov
2012-01-11 08:30:54 +0800

09 Jan, 2009

1 commit

61bce0f13 pid: generalize task_active_pid_ns ... Browse Code »

Currently task_active_pid_ns is not safe to call after a task becomes a
zombie and exit_task_namespaces is called, as nsproxy becomes NULL. By
reading the pid namespace from the pid of the task we can trivially solve
this problem at the cost of one extra memory read in what should be the
same cacheline as we read the namespace from.

When moving things around I have made task_active_pid_ns out of line
because keeping it in pid_namespace.h would require adding includes of
pid.h and sched.h that I don't think we want.

This change does make task_active_pid_ns unsafe to call during
copy_process until we attach a pid on the task_struct which seems to be a
reasonable trade off.

Signed-off-by: Eric W. Biederman
Signed-off-by: Sukadev Bhattiprolu
Cc: Oleg Nesterov
Cc: Roland McGrath
Cc: Bastian Blank
Cc: Pavel Emelyanov
Cc: Nadia Derbey
Acked-by: Serge Hallyn
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Eric W. Biederman
2009-01-09 00:31:12 +0800

17 Oct, 2008

1 commit

25cbe53ef pid_ns: kill the now unused task_child_reaper() ... Browse Code »

task_child_reaper() has no callers anymore, kill it.

Signed-off-by: Oleg Nesterov
Acked-by: Serge Hallyn
Acked-by: Pavel Emelyanov
Acked-by: Sukadev Bhattiprolu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2008-10-17 02:21:48 +0800

26 Jul, 2008

2 commits

20fad13ac pidns: add the struct bsd_acct_struct pointer on pid_namespace struct ... Browse Code »

All the bsdacct-related info will be stored in the area, pointer by this
one.

It will be NULL automatically for all new namespaces.

Signed-off-by: Pavel Emelyanov
Cc: Balbir Singh
Cc: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelyanov
2008-07-26 01:53:46 +0800
3ae4eed34 proper pid{hash,map}_init() prototypes ... Browse Code »

This patch adds proper prototypes for pid{hash,map}_init() in
include/linux/pid_namespace.h

Signed-off-by: Adrian Bunk
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Adrian Bunk
2008-07-26 01:53:45 +0800

30 Apr, 2008

1 commit

caafa4324 pidns: make pid->level and pid_ns->level unsigned ... Browse Code »

These values represent the nesting level of a namespace and pids living in it,
and it's always non-negative.

Turning this from int to unsigned int saves some space in pid.c (11 bytes on
x86 and 64 on ia64) by letting the compiler optimize the pid_nr_ns a bit.
E.g. on ia64 this removes the sign extension calls, which compiler adds to
optimize access to pid->nubers[ns->level].

Signed-off-by: Pavel Emelyanov
Cc: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelyanov
2008-04-30 23:29:49 +0800

09 Feb, 2008

1 commit

74bd59bb3 namespaces: cleanup the code managed with PID_NS option ... Browse Code »

Just like with the user namespaces, move the namespace management code into
the separate .c file and mark the (already existing) PID_NS option as "depend
on NAMESPACES"

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Pavel Emelyanov
Acked-by: Serge Hallyn
Cc: Cedric Le Goater
Cc: "Eric W. Biederman"
Cc: Herbert Poetzl
Cc: Kirill Korotaev
Cc: Sukadev Bhattiprolu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelyanov
2008-02-09 01:22:23 +0800

15 Nov, 2007

1 commit

57d5f66b8 pidns: Place under CONFIG_EXPERIMENTAL ... Browse Code »

This is my trivial patch to swat innumerable little bugs with a single
blow.

After some intensive review (my apologies for not having gotten to this
sooner) what we have looks like a good base to build on with the current
pid namespace code but it is not complete, and it is still much to simple
to find issues where the kernel does the wrong thing outside of the initial
pid namespace.

Until the dust settles and we are certain we have the ABI and the
implementation is as correct as humanly possible let's keep process ID
namespaces behind CONFIG_EXPERIMENTAL.

Allowing us the option of fixing any ABI or other bugs we find as long as
they are minor.

Allowing users of the kernel to avoid those bugs simply by ensuring their
kernel does not have support for multiple pid namespaces.

[akpm@linux-foundation.org: coding-style cleanups]
Signed-off-by: Eric W. Biederman
Cc: Cedric Le Goater
Cc: Adrian Bunk
Cc: Jeremy Fitzhardinge
Cc: Kir Kolyshkin
Cc: Kirill Korotaev
Cc: Pavel Emelyanov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Eric W. Biederman
2007-11-15 10:45:43 +0800

20 Oct, 2007

7 commits

b461cc038 pid namespaces: miscellaneous preparations for pid namespaces ... Browse Code »

* remove pid.h from pid_namespaces.h;
* rework is_(cgroup|global)_init;
* optimize (get|put)_pid_ns for init_pid_ns;
* declare task_child_reaper to return actual reaper.

Signed-off-by: Pavel Emelyanov
Cc: Oleg Nesterov
Cc: Sukadev Bhattiprolu
Cc: Paul Menage
Cc: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelyanov
2007-10-20 02:53:39 +0800
07543f5c7 pid namespaces: make proc have multiple superblocks - one for each namespace ... Browse Code »

Each pid namespace have to be visible through its own proc mount. Thus we
need to have per-namespace proc trees with their own superblocks.

We cannot easily show different pid namespace via one global proc tree, since
each pid refers to different tasks in different namespaces. E.g. pid 1
refers to the init task in the initial namespace and to some other task when
seeing from another namespace. Moreover - pid, exisintg in one namespace may
not exist in the other.

This approach has one move advantage is that the tasks from the init namespace
can see what tasks live in another namespace by reading entries from another
proc tree.

Signed-off-by: Pavel Emelyanov
Cc: Oleg Nesterov
Cc: Sukadev Bhattiprolu
Cc: Paul Menage
Cc: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelyanov
2007-10-20 02:53:39 +0800
faacbfd3a pid namespaces: add support for pid namespaces hierarchy ... Browse Code »

Each namespace has a parent and is characterized by its "level". Level is the
number of the namespace generation. E.g. init namespace has level 0, after
cloning new one it will have level 1, the next one - 2 and so on and so forth.
This level is not explicitly limited.

True hierarchy must have some way to find each namespace's children, but it is
not used in the patches, so this ability is not added (yet).

Signed-off-by: Pavel Emelyanov
Cc: Oleg Nesterov
Cc: Sukadev Bhattiprolu
Cc: Paul Menage
Cc: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelyanov
2007-10-20 02:53:38 +0800
88f21d818 pid namespaces: rename child_reaper() function ... Browse Code »

Rename the child_reaper() function to task_child_reaper() to be similar to
other task_* functions and to distinguish the function from 'struct
pid_namspace.child_reaper'.

Signed-off-by: Sukadev Bhattiprolu
Cc: Pavel Emelianov
Cc: Eric W. Biederman
Cc: Cedric Le Goater
Cc: Dave Hansen
Cc: Serge Hallyn
Cc: Herbert Poetzel
Cc: Kirill Korotaev
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Sukadev Bhattiprolu
2007-10-20 02:53:37 +0800
2894d650c pid namespaces: define and use task_active_pid_ns() wrapper ... Browse Code »

With multiple pid namespaces, a process is known by some pid_t in every
ancestor pid namespace. Every time the process forks, the child process also
gets a pid_t in every ancestor pid namespace.

While a process is visible in >=1 pid namespaces, it can see pid_t's in only
one pid namespace. We call this pid namespace it's "active pid namespace",
and it is always the youngest pid namespace in which the process is known.

This patch defines and uses a wrapper to find the active pid namespace of a
process. The implementation of the wrapper will be changed in when support
for multiple pid namespaces are added.

Changelog:
2.6.22-rc4-mm2-pidns1:
- [Pavel Emelianov, Alexey Dobriyan] Back out the change to use
task_active_pid_ns() in child_reaper() since task->nsproxy
can be NULL during task exit (so child_reaper() continues to
use init_pid_ns).

to implement child_reaper() since init_pid_ns.child_reaper to
implement child_reaper() since tsk->nsproxy can be NULL during exit.

2.6.21-rc6-mm1:
- Rename task_pid_ns() to task_active_pid_ns() to reflect that a
process can have multiple pid namespaces.

Signed-off-by: Sukadev Bhattiprolu
Acked-by: Pavel Emelianov
Cc: Eric W. Biederman
Cc: Cedric Le Goater
Cc: Dave Hansen
Cc: Serge Hallyn
Cc: Herbert Poetzel
Cc: Kirill Korotaev
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Sukadev Bhattiprolu
2007-10-20 02:53:37 +0800
baf8f0f82 pid namespaces: dynamic kmem cache allocator for pid namespaces ... Browse Code »

Add kmem_cache to pid_namespace to allocate pids from.

Since both implementations expand the struct pid to carry more numerical
values each namespace should have separate cache to store pids of different
sizes.

Each kmem cache is name "pid_", where is the number of numerical ids
on the pid. Different namespaces with same level of nesting will have same
caches.

This patch has two FIXMEs that are to be fixed after we reach the consensus
about the struct pid itself.

The first one is that the namespace to free the pid from in free_pid() must be
taken from pid. Now the init_pid_ns is used.

The second FIXME is about the cache allocation. When we do know how long the
object will be then we'll have to calculate this size in create_pid_cachep.
Right now the sizeof(struct pid) value is used.

[akpm@linux-foundation.org: coding-style repair]
Signed-off-by: Pavel Emelianov
Acked-by: Cedric Le Goater
Acked-by: Sukadev Bhattiprolu
Cc: Kirill Korotaev
Cc: "Eric W. Biederman"
Cc: Herbert Poetzl
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelianov
2007-10-20 02:53:37 +0800
a05f7b15d pid namespaces: make get_pid_ns() return the namespace itself ... Browse Code »

Make get_pid_ns() return the namespace itself to look like the other getters
and make the code using it look nicer.

Signed-off-by: Pavel Emelianov
Acked-by: Cedric Le Goater
Cc: Kirill Korotaev
Cc: "Eric W. Biederman"
Cc: Herbert Poetzl
Cc: Sukadev Bhattiprolu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelianov
2007-10-20 02:53:37 +0800

17 Jul, 2007

1 commit

213dd266d namespace: ensure clone_flags are always stored in an unsigned long ... Browse Code »

While working on unshare support for the network namespace I noticed we
were putting clone flags in an int. Which is weird because the syscall
uses unsigned long and we at least need an unsigned to properly hold all of
the unshare flags.

So to make the code consistent, this patch updates the code to use
unsigned long instead of int for the clone flags in those places
where we get it wrong today.

Signed-off-by: Eric W. Biederman
Acked-by: Cedric Le Goater
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Eric W. Biederman
2007-07-17 00:05:48 +0800

09 May, 2007

1 commit

e3222c4ec Merge sys_clone()/sys_unshare() nsproxy and namespace handling ... Browse Code »

sys_clone() and sys_unshare() both makes copies of nsproxy and its associated
namespaces. But they have different code paths.

This patch merges all the nsproxy and its associated namespace copy/clone
handling (as much as possible). Posted on container list earlier for
feedback.

- Create a new nsproxy and its associated namespaces and pass it back to
caller to attach it to right process.

- Changed all copy_*_ns() routines to return a new copy of namespace
instead of attaching it to task->nsproxy.

- Moved the CAP_SYS_ADMIN checks out of copy_*_ns() routines.

- Removed unnessary !ns checks from copy_*_ns() and added BUG_ON()
just incase.

- Get rid of all individual unshare_*_ns() routines and make use of
copy_*_ns() instead.

[akpm@osdl.org: cleanups, warning fix]
[clg@fr.ibm.com: remove dup_namespaces() declaration]
[serue@us.ibm.com: fix CONFIG_IPC_NS=n, clone(CLONE_NEWIPC) retval]
[akpm@linux-foundation.org: fix build with CONFIG_SYSVIPC=n]
Signed-off-by: Badari Pulavarty
Signed-off-by: Serge Hallyn
Cc: Cedric Le Goater
Cc: "Eric W. Biederman"
Cc:
Signed-off-by: Cedric Le Goater
Cc: Oleg Nesterov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Badari Pulavarty
2007-05-09 02:15:00 +0800

31 Jan, 2007

1 commit

0f2452855 [PATCH] namespaces: fix task exit disaster ... Browse Code »

This is based on a patch by Eric W. Biederman, who pointed out that pid
namespaces are still fake, and we only have one ever active.

So for the time being, we can modify any code which could access
tsk->nsproxy->pid_ns during task exit to just use &init_pid_ns instead,
and move the exit_task_namespaces call in do_exit() back above
exit_notify(), so that an exiting nfs server has a valid tsk->sighand to
work with.

Long term, pulling pid_ns out of nsproxy might be the cleanest solution.

Signed-off-by: Eric W. Biederman

[ Eric's patch fixed to take care of free_pid() too ]

Signed-off-by: Serge E. Hallyn
Signed-off-by: Linus Torvalds

Serge E. Hallyn
2007-01-31 05:40:36 +0800

09 Dec, 2006

3 commits

84d737866 [PATCH] add child reaper to pid_namespace ... Browse Code »

Add a per pid_namespace child-reaper. This is needed so processes are reaped
within the same pid space and do not spill over to the parent pid space. Its
also needed so containers preserve existing semantic that pid == 1 would reap
orphaned children.

This is based on Eric Biederman's patch: http://lkml.org/lkml/2006/2/6/285

Signed-off-by: Sukadev Bhattiprolu
Signed-off-by: Cedric Le Goater
Cc: Kirill Korotaev
Cc: Eric W. Biederman
Cc: Herbert Poetzl
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Sukadev Bhattiprolu
2006-12-09 00:28:52 +0800
9a575a92d [PATCH] to nsproxy ... Browse Code »

Add the pid namespace framework to the nsproxy object. The copy of the pid
namespace only increases the refcount on the global pid namespace,
init_pid_ns, and unshare is not implemented.

There is no configuration option to activate or deactivate this feature
because this not relevant for the moment.

Signed-off-by: Cedric Le Goater
Cc: Kirill Korotaev
Cc: Eric W. Biederman
Cc: Herbert Poetzl
Cc: Sukadev Bhattiprolu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Cedric Le Goater
2006-12-09 00:28:52 +0800
61a58c6c2 [PATCH] rename struct pspace to struct pid_namespace ... Browse Code »

Rename struct pspace to struct pid_namespace for consistency with other
namespaces (uts_namespace and ipc_namespace). Also rename
include/linux/pspace.h to include/linux/pid_namespace.h and variables from
pspace to pid_ns.

Signed-off-by: Sukadev Bhattiprolu
Signed-off-by: Cedric Le Goater
Cc: Kirill Korotaev
Cc: Eric W. Biederman
Cc: Herbert Poetzl
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Sukadev Bhattiprolu
2006-12-09 00:28:52 +0800