Eric Lee / smarc-fsl-linux-kernel

27 May, 2018

1 commit

0168b9e38 procfs: switch instantiate_t to d_splice_alias()

... and get rid of pointless struct inode *dir argument of those,
while we are at it.

Signed-off-by: Al Viro

Al Viro
2018-05-27 02:20:50 +0800

23 May, 2018

1 commit

1bbc55131 procfs: get rid of ancient BS in pid_revalidate() uses

First of all, calling pid_revalidate() in the end of <pid>/* lookups
is *not* about closing any kind of races; that used to be true once
upon a time, but these days those comments are actively misleading.
Especially since pid_revalidate() doesn't even do d_drop() on
failure anymore. It doesn't matter, anyway, since once
pid_revalidate() starts returning false, ->d_delete() of those
dentries starts saying "don't keep"; they won't get stuck in
dcache any longer than they are pinned.

These calls cannot be just removed, though - the side effect of
pid_revalidate() (updating i_uid/i_gid/etc.) is what we are calling
it for here.

Let's separate the "update ownership" into a new helper (pid_update_inode())
and use it, both in lookups and in pid_revalidate() itself.

The comments in pid_revalidate() are also out of date - they refer to
the time when pid_revalidate() used to call d_drop() directly...

Signed-off-by: Al Viro

Al Viro
2018-05-23 02:28:03 +0800

02 Nov, 2017

1 commit

b24413180 License cleanup: add SPDX GPL-2.0 license identifier to files with no license

Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging were:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if
Reviewed-by: Philippe Ombredanne
Reviewed-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman

Greg Kroah-Hartman
2017-11-02 18:10:55 +0800

09 May, 2017

1 commit

eaa0d190b pidns: expose task pid_ns_for_children to userspace

pid_ns_for_children set by a task is known only to the task itself, and
it's impossible to identify it from outside.

It's a big problem for checkpoint/restore software like CRIU, because it
can't correctly handle tasks that do setns(CLONE_NEWPID) in the process of
their work.

This patch solves the problem, and it exposes pid_ns_for_children to ns
directory in standard way with the name "pid_for_children":

~# ls /proc/5531/ns -l | grep pid
lrwxrwxrwx 1 root root 0 Jan 14 16:38 pid -> pid:[4026531836]
lrwxrwxrwx 1 root root 0 Jan 14 16:38 pid_for_children -> pid:[4026532286]

Link: http://lkml.kernel.org/r/149201123914.6007.2187327078064239572.stgit@localhost.localdomain
Signed-off-by: Kirill Tkhai
Cc: Andrei Vagin
Cc: Andreas Gruenbacher
Cc: Kees Cook
Cc: Michael Kerrisk
Cc: Al Viro
Cc: Oleg Nesterov
Cc: Paul Moore
Cc: Eric Biederman
Cc: Andy Lutomirski
Cc: Ingo Molnar
Cc: Serge Hallyn
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Kirill Tkhai
2017-05-09 08:15:12 +0800
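The link targets in the listing above follow the form type:[inode]. A minimal userspace sketch of splitting such a string is given below; it assumes the target has already been obtained (for example with readlink(2)), and parse_ns_link() is a hypothetical helper name, not something the patch provides.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: split a namespace link target such as
 * "pid:[4026532286]" into its type ("pid") and inode number.
 * Returns 0 on success and -1 if the string is malformed or the
 * type does not fit in the caller's buffer.
 */
static int parse_ns_link(const char *target, char *type, size_t type_len,
                         unsigned long long *inum)
{
    const char *colon = strchr(target, ':');
    int consumed = 0;
    size_t n;

    if (!colon || colon == target)
        return -1;
    n = (size_t)(colon - target);
    if (n >= type_len)
        return -1;
    /* %n is only stored if the closing ']' matched as well. */
    if (sscanf(colon, ":[%llu]%n", inum, &consumed) != 1 || consumed == 0)
        return -1;
    if (colon[consumed] != '\0')
        return -1;
    memcpy(type, target, n);
    type[n] = '\0';
    return 0;
}
```

Two tasks whose pid_for_children links parse to the same inode number would be expected to create children in the same pid namespace.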

15 Nov, 2016

1 commit

db978da8f proc: Pass file mode to proc_pid_make_inode

Pass the file mode of the proc inode to be created to
proc_pid_make_inode. In proc_pid_make_inode, initialize inode->i_mode
before calling security_task_to_inode. This allows selinux to set
isec->sclass right away without introducing "half-initialized" inode
security structs.

Signed-off-by: Andreas Gruenbacher
Signed-off-by: Paul Moore

Andreas Gruenbacher
2016-11-15 04:39:48 +0800

03 May, 2016

1 commit

f50752eaa switch all procfs directories ->iterate_shared()

Signed-off-by: Al Viro

Al Viro
2016-05-03 07:49:30 +0800

17 Feb, 2016

1 commit

a79a908fd cgroup: introduce cgroup namespaces

Introduce the ability to create new cgroup namespace. The newly created
cgroup namespace remembers the cgroup of the process at the point
of creation of the cgroup namespace (referred to as cgroupns-root).
The main purpose of cgroup namespace is to virtualize the contents
of /proc/self/cgroup file. Processes inside a cgroup namespace
are only able to see paths relative to their namespace root
(unless they are moved outside of their cgroupns-root, at which point
they will see a relative path from their cgroupns-root).
For a correctly set up container this enables container-tools
(like libcontainer, lxc, lmctfy, etc.) to create completely virtualized
containers without leaking system level cgroup hierarchy to the task.
This patch only implements the 'unshare' part of the cgroupns.

Signed-off-by: Aditya Kali
Signed-off-by: Serge Hallyn
Signed-off-by: Tejun Heo

Aditya Kali
2016-02-17 02:04:58 +0800

21 Jan, 2016

1 commit

caaee6234 ptrace: use fsuid, fsgid, effective creds for fs access checks

By checking the effective credentials instead of the real UID / permitted
capabilities, ensure that the calling process actually intended to use its
credentials.

To ensure that all ptrace checks use the correct caller credentials (e.g.
in case out-of-tree code or newly added code omits the PTRACE_MODE_*CREDS
flag), use two new flags and require one of them to be set.

The problem was that when a privileged task had temporarily dropped its
privileges, e.g. by calling setreuid(0, user_uid), with the intent to
perform following syscalls with the credentials of a user, it still passed
ptrace access checks that the user would not be able to pass.

While an attacker should not be able to convince the privileged task to
perform a ptrace() syscall, this is a problem because the ptrace access
check is reused for things in procfs.

In particular, the following somewhat interesting procfs entries only rely
on ptrace access checks:

/proc/$pid/stat - uses the check for determining whether pointers
should be visible, useful for bypassing ASLR
/proc/$pid/maps - also useful for bypassing ASLR
/proc/$pid/cwd - useful for gaining access to restricted
directories that contain files with lax permissions, e.g. in
this scenario:
lrwxrwxrwx root root /proc/13020/cwd -> /root/foobar
drwx------ root root /root
drwxr-xr-x root root /root/foobar
-rw-r--r-- root root /root/foobar/secret

Therefore, on a system where a root-owned mode 6755 binary changes its
effective credentials as described and then dumps a user-specified file,
this could be used by an attacker to reveal the memory layout of root's
processes or reveal the contents of files he is not allowed to access
(through /proc/$pid/cwd).

[akpm@linux-foundation.org: fix warning]
Signed-off-by: Jann Horn
Acked-by: Kees Cook
Cc: Casey Schaufler
Cc: Oleg Nesterov
Cc: Ingo Molnar
Cc: James Morris
Cc: "Serge E. Hallyn"
Cc: Andy Shevchenko
Cc: Andy Lutomirski
Cc: Al Viro
Cc: "Eric W. Biederman"
Cc: Willy Tarreau
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jann Horn
2016-01-21 09:09:18 +0800
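The "two new flags, require one of them" rule described above can be sketched as a small validity check. This is a userspace illustration only: the flag names and values below are assumptions made for the example, and reading "one of them" as "exactly one" is an interpretation rather than something the message states.

```c
#include <stdbool.h>

/* Illustrative flag values, assumed for this sketch only. */
#define MODE_READ      0x01u
#define MODE_ATTACH    0x02u
#define MODE_FSCREDS   0x08u  /* check against fsuid/fsgid/effective caps */
#define MODE_REALCREDS 0x10u  /* check against real uid/gid/permitted caps */

/*
 * Hypothetical check: a caller must say which credentials it wants
 * compared. Setting neither flag, or both, is treated as a caller bug.
 */
static bool creds_flag_valid(unsigned int mode)
{
    bool fs = (mode & MODE_FSCREDS) != 0;
    bool real = (mode & MODE_REALCREDS) != 0;

    return fs != real;
}
```

Under this scheme a procfs read path would pass something like MODE_READ | MODE_FSCREDS, while code that forgets to pick a credential set fails the check instead of silently defaulting.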

31 Dec, 2015

1 commit

fceef393a switch ->get_link() to delayed_call, kill ->put_link()

Signed-off-by: Al Viro

Al Viro
2015-12-31 02:01:03 +0800

09 Dec, 2015

1 commit

6b2553918 replace ->follow_link() with new method that could stay in RCU mode

new method: ->get_link(); replacement of ->follow_link(). The differences
are:
* inode and dentry are passed separately
* might be called both in RCU and non-RCU mode;
the former is indicated by passing it a NULL dentry.
* when called that way it isn't allowed to block
and should return ERR_PTR(-ECHILD) if it needs to be called
in non-RCU mode.

It's a flagday change - the old method is gone, all in-tree instances
converted. Conversion isn't hard; said that, so far very few instances
do not immediately bail out when called in RCU mode. That'll change
in the next commits.

Signed-off-by: Al Viro

Al Viro
2015-12-09 11:41:54 +0800
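The calling convention listed above (a NULL dentry signals RCU mode, and an instance that cannot proceed without blocking returns ERR_PTR(-ECHILD)) can be mocked in userspace as follows. The struct definitions and example_get_link() are hypothetical stand-ins, and the error-pointer helpers are simplified re-implementations written for this sketch, not the kernel headers.

```c
#include <errno.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
    return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/* Hypothetical, heavily reduced versions of the VFS objects. */
struct dentry { const char *link_body; };
struct inode { int unused; };

/*
 * Mock ->get_link() instance: it needs the dentry to find the link
 * body, so in RCU mode (dentry == NULL) it bails out with -ECHILD and
 * expects to be called again in non-RCU mode.
 */
static const char *example_get_link(struct dentry *dentry, struct inode *inode)
{
    (void)inode;
    if (!dentry)
        return ERR_PTR(-ECHILD);
    return dentry->link_body;
}
```

This is the "immediately bail out when called in RCU mode" pattern that the message says most converted instances used at the time.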

11 May, 2015

2 commits

6e77137b3 don't pass nameidata to ->follow_link()

its only use is getting passed to nd_jump_link(), which can obtain
it from current->nameidata

Signed-off-by: Al Viro

Al Viro
2015-05-11 10:20:15 +0800
680baacbc new ->follow_link() and ->put_link() calling conventions

a) instead of storing the symlink body (via nd_set_link()) and returning
an opaque pointer later passed to ->put_link(), ->follow_link() _stores_
that opaque pointer (into void * passed by address by caller) and returns
the symlink body. Returning ERR_PTR() on error, NULL on jump (procfs magic
symlinks) and pointer to symlink body for normal symlinks. Stored pointer
is ignored in all cases except the last one.

Storing NULL for opaque pointer (or not storing it at all) means no call
of ->put_link().

b) the body used to be passed to ->put_link() implicitly (via nameidata).
Now only the opaque pointer is. In the cases when we used the symlink body
to free stuff, ->follow_link() now should store it as opaque pointer in addition
to returning it.

Signed-off-by: Al Viro

Al Viro
2015-05-11 10:19:45 +0800

16 Apr, 2015

1 commit

2b0143b5c VFS: normal filesystems (and lustre): d_inode() annotations

that's the bulk of filesystem drivers dealing with inodes of their own

Signed-off-by: David Howells
Signed-off-by: Al Viro

David Howells
2015-04-16 03:06:57 +0800

11 Dec, 2014

2 commits

3d3d35b1e kill proc_ns completely

procfs inodes need only the ns_ops part; nsfs inodes don't need it at all

Signed-off-by: Al Viro

Al Viro
2014-12-11 10:30:57 +0800
e149ed2b8 take the targets of /proc/*/ns/* symlinks to separate fs

New pseudo-filesystem: nsfs. Targets of /proc/*/ns/* live there now.
It's not mountable (not even registered, so it's not in /proc/filesystems,
etc.). Files on it *are* bindable - we explicitly permit that in do_loopback().

This stuff lives in fs/nsfs.c now; proc_ns_fget() moved there as well.
get_proc_ns() is a macro now (it's simply returning ->i_private; would
have been an inline, if not for header ordering headache).
proc_ns_inode() is an ex-parrot. The interface used in procfs is
ns_get_path(path, task, ops) and ns_get_name(buf, size, task, ops).

Dentries and inodes are never hashed; a non-counting reference to dentry
is stashed in ns_common (removed by ->d_prune()) and reused by ns_get_path()
if present. See ns_get_path()/ns_prune_dentry/nsfs_evict() for details
of that mechanism.

As the result, proc_ns_follow_link() has stopped poking in nd->path.mnt;
it does nd_jump_link() on a consistent pair it gets
from ns_get_path().

Signed-off-by: Al Viro

Al Viro
2014-12-11 10:30:20 +0800

05 Dec, 2014

2 commits

f77c80142 bury struct proc_ns in fs/proc

a) make get_proc_ns() return a pointer to struct ns_common
b) mirror ns_ops in dentry->d_fsdata of ns dentries, so that
is_mnt_ns_file() could get away with fewer dereferences.

That way struct proc_ns becomes invisible outside of fs/proc/*.c

Signed-off-by: Al Viro

Al Viro
2014-12-05 03:34:54 +0800
64964528b make proc_ns_operations work with struct ns_common * instead of void *

We can do that now. And kill ->inum(), while we are at it - all instances
are identical.

Signed-off-by: Al Viro

Al Viro
2014-12-05 03:34:17 +0800

02 Apr, 2014

1 commit

5d826c847 new helper: readlink_copy()

Signed-off-by: Al Viro

Al Viro
2014-04-02 11:19:15 +0800

16 Nov, 2013

1 commit

b26d4cd38 consolidate simple ->d_delete() instances

Rename simple_delete_dentry() to always_delete_dentry() and export it.
Export simple_dentry_operations, while we are at it, and get rid of
their duplicates

Signed-off-by: Al Viro

Al Viro
2013-11-16 11:04:17 +0800

29 Jun, 2013

2 commits

c52a47ace proc_fill_cache(): just make instantiate_t return int

all instances always return ERR_PTR(-E...) or NULL, anyway

Signed-off-by: Al Viro

Al Viro
2013-06-29 16:57:18 +0800
f0c3b5093 [readdir] convert procfs

Signed-off-by: Al Viro

Al Viro
2013-06-29 16:56:32 +0800

02 May, 2013

1 commit

0bb80f240 proc: Split the namespace stuff out into linux/proc_ns.h

Split the proc namespace stuff out into linux/proc_ns.h.

Signed-off-by: David Howells
cc: netdev@vger.kernel.org
cc: Serge E. Hallyn
cc: Eric W. Biederman
Signed-off-by: Al Viro

David Howells
2013-05-02 05:29:39 +0800

09 Mar, 2013

1 commit

db04dc679 proc: Use nd_jump_link in proc_ns_follow_link

Update proc_ns_follow_link to use nd_jump_link instead of just
manually updating nd.path.dentry.

This fixes the BUG_ON(nd->inode != parent->d_inode) reported by Dave
Jones and reproduced trivially with mkdir /proc/self/ns/uts/a.

Sigh, it looks like the VFS change to require use of nd_jump_link
happened while proc_ns_follow_link was baking and since the common case
of proc_ns_follow_link continued to work without problems the need for
making this change was overlooked.

Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2013-03-09 16:14:45 +0800

20 Nov, 2012

3 commits

98f842e67 proc: Usable inode numbers for the namespace file descriptors.

Assign a unique proc inode to each namespace, and use that
inode number to ensure we only allocate at most one proc
inode for every namespace in proc.

A single proc inode per namespace allows userspace to test
to see if two processes are in the same namespace.

This has been a long requested feature and only blocked because
a naive implementation would put the id in a global space and
would ultimately require having a namespace for the names of
namespaces, making migration and certain virtualization tricks
impossible.

We still don't have per superblock inode numbers for proc, which
appears necessary for application unaware checkpoint/restart and
migrations (if the application is using namespace file descriptors)
but that is now allowed by the design if it becomes important.

I have preallocated the ipc and uts initial proc inode numbers so
their structures can be statically initialized.

Signed-off-by: Eric W. Biederman

Eric W. Biederman
2012-11-20 20:19:49 +0800
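The "same namespace" test that a single inode per namespace makes possible can be sketched from userspace with stat(2). The helper names below are hypothetical, and comparing the device number in addition to the inode number is a cautious choice made for this sketch rather than something the message requires.

```c
#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Two stat results name the same object if device and inode match. */
static bool same_ns_stat(const struct stat *a, const struct stat *b)
{
    return a->st_dev == b->st_dev && a->st_ino == b->st_ino;
}

/*
 * Hypothetical helper: returns 1 if both pids are in the same namespace
 * of the given kind (e.g. "net", "uts", "ipc"), 0 if they are not, and
 * -1 if either namespace file cannot be examined.
 */
static int same_namespace(pid_t pid_a, pid_t pid_b, const char *ns_name)
{
    char path_a[64], path_b[64];
    struct stat st_a, st_b;
    int n;

    n = snprintf(path_a, sizeof(path_a), "/proc/%ld/ns/%s", (long)pid_a, ns_name);
    if (n < 0 || (size_t)n >= sizeof(path_a))
        return -1;
    n = snprintf(path_b, sizeof(path_b), "/proc/%ld/ns/%s", (long)pid_b, ns_name);
    if (n < 0 || (size_t)n >= sizeof(path_b))
        return -1;
    if (stat(path_a, &st_a) != 0 || stat(path_b, &st_b) != 0)
        return -1;
    return same_ns_stat(&st_a, &st_b) ? 1 : 0;
}
```

A caller might use same_namespace(getpid(), other_pid, "net") to decide whether a socket opened locally would be visible to the other task; examining another user's process is subject to the usual permission checks.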
bf056bfa8 proc: Fix the namespace inode permission checks.

Change the proc namespace files into symlinks so that
we won't cache the dentries for the namespace files
which can bypass the ptrace_may_access checks.

To support the symlinks create an additional namespace
inode with its own set of operations distinct from the
proc pid inode and dentry methods as those no longer
make sense.

Signed-off-by: Eric W. Biederman

Eric W. Biederman
2012-11-20 20:19:48 +0800
cde1975bc userns: Implent proc namespace operations

This allows entering a user namespace, and the ability
to store a reference to a user namespace with a bind
mount.

Addition of missing userns_ns_put in userns_install
from Gao feng

Acked-by: Serge Hallyn
Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2012-11-20 20:18:13 +0800

19 Nov, 2012

2 commits

8823c079b vfs: Add setns support for the mount namespace

setns support for the mount namespace is a little tricky as an
arbitrary decision must be made about what to set fs->root and
fs->pwd to, as there is no expectation of a relationship between
the two mount namespaces. Therefore I arbitrarily find the root
mount point, and follow every mount on top of it to find the top
of the mount stack. Then I set fs->root and fs->pwd to that
location. The topmost root of the mount stack seems like a
reasonable place to be.

Bind mount support for the mount namespace inodes has the
possibility of creating circular dependencies between mount
namespaces. Circular dependencies can result in loops that
prevent mount namespaces from ever being freed. I avoid
creating those circular dependencies by adding a sequence number
to the mount namespace and require all bind mounts be of a
younger mount namespace into an older mount namespace.

Add a helper function proc_ns_inode so it is possible to
detect when we are attempting to bind mount a namespace inode.

Acked-by: Serge Hallyn
Signed-off-by: Eric W. Biederman

Eric W. Biederman
2012-11-19 21:59:18 +0800
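The sequence-number rule for bind mounts of mount namespaces can be illustrated with a small model. The struct and function names are hypothetical, and the sketch only captures the ordering argument: if a namespace may only hold references to strictly younger namespaces, every reference points "forward" in creation order, so no chain of references can return to its start.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical reduced model of a mount namespace. */
struct mnt_ns_sketch {
    uint64_t seq;   /* creation order: larger means younger */
};

static uint64_t next_seq = 1;

static void mnt_ns_sketch_init(struct mnt_ns_sketch *ns)
{
    ns->seq = next_seq++;
}

/*
 * A bind mount of `target` inside `holder` is allowed only when the
 * target is strictly younger than the holder. A namespace can therefore
 * never hold itself or anything older than itself.
 */
static bool may_bind_ns(const struct mnt_ns_sketch *holder,
                        const struct mnt_ns_sketch *target)
{
    return target->seq > holder->seq;
}
```

With namespaces a (older) and b (younger), may_bind_ns(&a, &b) succeeds while may_bind_ns(&b, &a) and may_bind_ns(&a, &a) are refused, which rules out both two-element loops and self-references.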
57e8391d3 pidns: Add setns support

- Pid namespaces are designed to be inescapable so verify that the
passed in pid namespace is a child of the currently active
pid namespace or the currently active pid namespace itself.

Allowing the currently active pid namespace is important so
the effects of an earlier setns can be cancelled.

Signed-off-by: Eric W. Biederman

Eric W. Biederman
2012-11-19 21:59:14 +0800

14 Jul, 2012

2 commits

00cd8dd3b stop passing nameidata to ->lookup()

Just the flags; only NFS cares even about that, but there are
legitimate uses for such argument. And getting rid of that
completely would require splitting ->lookup() into a couple
of methods (at least), so let's leave that alone for now...

Signed-off-by: Al Viro

Al Viro
2012-07-14 20:34:32 +0800
0b728e191 stop passing nameidata * to ->d_revalidate()

Just the lookup flags. Die, bastard, die...

Signed-off-by: Al Viro

Al Viro
2012-07-14 20:34:14 +0800

29 Mar, 2012

1 commit

4c619aa0b fs/proc/namespaces.c: prevent crash when ns_entries[] is empty

If CONFIG_NET_NS, CONFIG_UTS_NS and CONFIG_IPC_NS are disabled,
ns_entries[] becomes empty and things like
ns_entries[ARRAY_SIZE(ns_entries) - 1] will explode.

Reported-by: Richard Weinberger
Cc: "Eric W. Biederman"
Cc: Daniel Lezcano
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew Morton
2012-03-29 08:14:37 +0800
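The hazard described above comes from addressing the last element as [count - 1], which is out of bounds when the table is empty. One common way to avoid it, shown here as a sketch with hypothetical names and not necessarily the exact change made by this commit, is to iterate over a half-open range that ends one past the last element.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical reduced table entry. */
struct ns_entry_sketch {
    const char *name;
};

/*
 * Look up an entry by name using a half-open range [entries, end).
 * With count == 0 the loop body never runs, so nothing is dereferenced
 * and no [count - 1] index is ever formed.
 */
static const struct ns_entry_sketch *
find_entry(const struct ns_entry_sketch *entries, size_t count, const char *name)
{
    const struct ns_entry_sketch *entry = entries;
    const struct ns_entry_sketch *end = entries + count; /* one past the last */

    for (; entry < end; entry++) {
        if (strcmp(entry->name, name) == 0)
            return entry;
    }
    return NULL;
}
```

The same function works unchanged whether every namespace option is compiled in or none of them is.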

24 Mar, 2012

1 commit

1b26c9b33 proc-ns: use d_set_d_op() API to set dentry ops in proc_ns_instantiate().

The namespace cleanup path leaks a dentry which holds a reference count
on a network namespace. Keeping that network namespace from being freed
when the last user goes away. Leaving things like vlan devices in the
leaked network namespace.

If you use ip netns add for much real work this problem becomes apparent
pretty quickly. In light testing the problem hides because frequently
you simply don't notice the leak.

Use d_set_d_op() so that DCACHE_OP_* flags are set correctly.

This issue exists back to 3.0.

Acked-by: "Eric W. Biederman"
Reported-by: Justin Pettit
Signed-off-by: Pravin B Shelar
Signed-off-by: Jesse Gross
Cc: David Miller
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pravin B Shelar
2012-03-24 07:58:42 +0800

04 Jan, 2012

1 commit

d10577a8d vfs: trim includes a bit

[folded fix for missing magic.h from Tetsuo Handa]

Signed-off-by: Al Viro

Al Viro
2012-01-04 11:57:13 +0800

16 Jun, 2011

1 commit

793925334 proc: Fix Oops on stat of /proc/<zombie pid>/ns/net

Don't call iput with the inode half set up to be a namespace file descriptor.
Instead rearrange the code so that we don't initialize ei->ns_ops until
after ns_ops->get succeeds, preventing us from invoking ns_ops->put
when ns_ops->get failed.

Reported-by: Ingo Saitz
Signed-off-by: Eric W. Biederman

Eric W. Biederman
2011-06-16 05:35:29 +0800

25 May, 2011

1 commit

62ca24baf ns proc: Return -ENOENT for a nonexistent /proc/self/ns/ entry.

Spotted-by: Nathan Lynch
Signed-off-by: Eric W. Biederman

Eric W. Biederman
2011-05-25 06:30:33 +0800

11 May, 2011

4 commits

a00eaf11a ns proc: Add support for the ipc namespace

Acked-by: Daniel Lezcano
Signed-off-by: Eric W. Biederman

Eric W. Biederman
2011-05-11 05:35:47 +0800
34482e89a ns proc: Add support for the uts namespace

Acked-by: Daniel Lezcano
Signed-off-by: Eric W. Biederman

Eric W. Biederman
2011-05-11 05:35:35 +0800
13b6f5762 ns proc: Add support for the network namespace.

Implementing file descriptors for the network namespace
is simple and straightforward.

Acked-by: David S. Miller
Acked-by: Daniel Lezcano
Signed-off-by: Eric W. Biederman

Eric W. Biederman
2011-05-11 05:34:26 +0800
6b4e306aa ns: proc files for namespace naming policy.

Create files under /proc/<pid>/ns/ to allow controlling the
namespaces of a process.

This addresses three specific problems that can make namespaces hard to
work with.
- Namespaces require a dedicated process to pin them in memory.
- It is not possible to use a namespace unless you are the child
of the original creator.
- Namespaces don't have names that userspace can use to talk about
them.

The namespace files under /proc/<pid>/ns/ can be opened and the
file descriptor can be used to talk about a specific namespace, and
to keep the specified namespace alive.

A namespace can be kept alive by either holding the file descriptor
open or bind mounting the file someplace else. aka:
mount --bind /proc/self/ns/net /some/filesystem/path
mount --bind /proc/self/fd/<N> /some/filesystem/path

This allows namespaces to be named with userspace policy.

It requires additional support to make use of these file descriptors
and that will be coming in the following patches.

Acked-by: Daniel Lezcano
Signed-off-by: Eric W. Biederman

Eric W. Biederman
2011-05-11 05:31:44 +0800
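Keeping a namespace alive by holding a file descriptor open, as described above, can be sketched from userspace like this. The helper names ns_path() and open_ns() are hypothetical; the only facts taken from the message are the /proc/<pid>/ns/ location and that an open descriptor pins the namespace.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>

/*
 * Hypothetical helper: build "/proc/<pid>/ns/<name>" into buf.
 * Returns 0 on success, -1 if the path does not fit.
 */
static int ns_path(char *buf, size_t size, pid_t pid, const char *ns_name)
{
    int n = snprintf(buf, size, "/proc/%ld/ns/%s", (long)pid, ns_name);

    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/*
 * Hypothetical helper: open the namespace file of a task. While the
 * returned descriptor stays open the namespace is kept alive, even if
 * every process inside it exits. Returns -1 on failure.
 */
static int open_ns(pid_t pid, const char *ns_name)
{
    char path[64];

    if (ns_path(path, sizeof(path), pid, ns_name) != 0)
        return -1;
    return open(path, O_RDONLY);
}
```

A caller would typically do something like int fd = open_ns(getpid(), "net"); and close(fd) once the namespace no longer needs to be pinned. Making use of the descriptor beyond pinning depends on the follow-up patches the message mentions.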