14 Nov, 2020
1 commit
-
If this is attempted by a kthread, then return -EOPNOTSUPP as we don't
currently support that. Once we can get task_pid_ptr() doing the right
thing, then this can go away again.Signed-off-by: Jens Axboe
13 Jun, 2020
1 commit
-
Recently syzbot reported that unmounting proc when there is an ongoing
inotify watch on the root directory of proc could result in a use
after free when the watch is removed after the unmount of proc
when the watcher exits.Commit 69879c01a0c3 ("proc: Remove the now unnecessary internal mount
of proc") made it easier to unmount proc and allowed syzbot to see the
problem, but looking at the code it has been around for a long time.Looking at the code the fsnotify watch should have been removed by
fsnotify_sb_delete in generic_shutdown_super. Unfortunately the inode
was allocated with new_inode_pseudo instead of new_inode so the inode
was not on the sb->s_inodes list. Which prevented
fsnotify_unmount_inodes from finding the inode and removing the watch
as well as made it so the "VFS: Busy inodes after unmount" warning
could not find the inodes to warn about them.Make all of the inodes in proc visible to generic_shutdown_super,
and fsnotify_sb_delete by using new_inode instead of new_inode_pseudo.
The only functional difference is that new_inode places the inodes
on the sb->s_inodes list.I wrote a small test program and I can verify that without changes it
can trigger this issue, and by replacing new_inode_pseudo with
new_inode the issues goes away.Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/000000000000d788c905a7dfa3f4@google.com
Reported-by: syzbot+7d2debdcdb3cb93c1e5e@syzkaller.appspotmail.com
Fixes: 0097875bd415 ("proc: Implement /proc/thread-self to point at the directory of the current thread")
Fixes: 021ada7dff22 ("procfs: switch /proc/self away from proc_dir_entry")
Fixes: 51f0885e5415 ("vfs,proc: guarantee unique inodes in /proc")
Signed-off-by: "Eric W. Biederman"
19 May, 2020
1 commit
-
syzbot found that
touch /proc/testfile
causes NULL pointer dereference at tomoyo_get_local_path()
because inode of the dentry is NULL.Before c59f415a7cb6, Tomoyo received pid_ns from proc's s_fs_info
directly. Since proc_pid_ns() can only work with inode, using it in
the tomoyo_get_local_path() was wrong.To avoid creating more functions for getting proc_ns, change the
argument type of the proc_pid_ns() function. Then, Tomoyo can use
the existing super_block to get pid_ns.Link: https://lkml.kernel.org/r/0000000000002f0c7505a5b0e04c@google.com
Link: https://lkml.kernel.org/r/20200518180738.2939611-1-gladkov.alexey@gmail.com
Reported-by: syzbot+c1af344512918c61362c@syzkaller.appspotmail.com
Fixes: c59f415a7cb6 ("Use proc_pid_ns() to get pid_namespace from the proc superblock")
Signed-off-by: Alexey Gladkov
Signed-off-by: Eric W. Biederman
22 Apr, 2020
1 commit
-
This patch allows to have multiple procfs instances inside the
same pid namespace. The aim here is lightweight sandboxes, and to allow
that we have to modernize procfs internals.1) The main aim of this work is to have on embedded systems one
supervisor for apps. Right now we have some lightweight sandbox support,
however if we create pid namespacess we have to manages all the
processes inside too, where our goal is to be able to run a bunch of
apps each one inside its own mount namespace without being able to
notice each other. We only want to use mount namespaces, and we want
procfs to behave more like a real mount point.2) Linux Security Modules have multiple ptrace paths inside some
subsystems, however inside procfs, the implementation does not guarantee
that the ptrace() check which triggers the security_ptrace_check() hook
will always run. We have the 'hidepid' mount option that can be used to
force the ptrace_may_access() check inside has_pid_permissions() to run.
The problem is that 'hidepid' is per pid namespace and not attached to
the mount point, any remount or modification of 'hidepid' will propagate
to all other procfs mounts.This also does not allow to support Yama LSM easily in desktop and user
sessions. Yama ptrace scope which restricts ptrace and some other
syscalls to be allowed only on inferiors, can be updated to have a
per-task context, where the context will be inherited during fork(),
clone() and preserved across execve(). If we support multiple private
procfs instances, then we may force the ptrace_may_access() on
/proc// to always run inside that new procfs instances. This will
allow to specifiy on user sessions if we should populate procfs with
pids that the user can ptrace or not.By using Yama ptrace scope, some restricted users will only be able to see
inferiors inside /proc, they won't even be able to see their other
processes. Some software like Chromium, Firefox's crash handler, Wine
and others are already using Yama to restrict which processes can be
ptracable. With this change this will give the possibility to restrict
/proc// but more importantly this will give desktop users a
generic and usuable way to specifiy which users should see all processes
and which users can not.Side notes:
* This covers the lack of seccomp where it is not able to parse
arguments, it is easy to install a seccomp filter on direct syscalls
that operate on pids, however /proc// is a Linux ABI using
filesystem syscalls. With this change LSMs should be able to analyze
open/read/write/close...In the new patch set version I removed the 'newinstance' option
as suggested by Eric W. Biederman.Selftest has been added to verify new behavior.
Signed-off-by: Alexey Gladkov
Reviewed-by: Alexey Dobriyan
Reviewed-by: Kees Cook
Signed-off-by: Eric W. Biederman
06 Mar, 2019
1 commit
-
Remove unnecessary ERR_PTR()/PTR_ERR() cast in proc_setup_self().
Link: http://lkml.kernel.org/r/20190124030150.8472-1-cgxu519@gmx.com
Signed-off-by: Chengguang Xu
Reviewed-by: Andrew Morton
Cc: Alexey Dobriyan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
16 May, 2018
1 commit
-
Factor out retrieving the per-sb pid namespaces from the sb private data
into an easier to understand helper.Suggested-by: Eric W. Biederman
Signed-off-by: Christoph Hellwig
07 Feb, 2018
2 commits
-
/proc/self inode numbers, value of proc_inode_cache and st_nlink of
/proc/$TGID are fixed constants.Link: http://lkml.kernel.org/r/20180103184707.GA31849@avx2
Signed-off-by: Alexey Dobriyan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
PROC_NUMBUF is 13 which is enough for "negative int + \n + \0".
However PIDs and TGIDs are never negative and newline is not a concern,
so use just 10 per integer.Link: http://lkml.kernel.org/r/20171120203005.GA27743@avx2
Signed-off-by: Alexey Dobriyan
Cc: Alexander Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
02 Nov, 2017
1 commit
-
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.By default all files without license information are under the default
license of the kernel, which is GPL version 2.Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if
Reviewed-by: Philippe Ombredanne
Reviewed-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman
09 Dec, 2016
2 commits
-
If .readlink == NULL implies generic_readlink().
Generated by:
to_del="\.readlink.*=.*generic_readlink"
for i in `git grep -l $to_del`; do sed -i "/$to_del"/d $i; doneSigned-off-by: Miklos Szeredi
-
The /proc/self and /proc/self-thread symlinks have separate but identical
functionality for reading and following. This cleanup utilizes
generic_readlink to remove the duplication.Signed-off-by: Miklos Szeredi
28 Sep, 2016
1 commit
-
CURRENT_TIME macro is not appropriate for filesystems as it
doesn't use the right granularity for filesystem timestamps.
Use current_time() instead.CURRENT_TIME is also not y2038 safe.
This is also in preparation for the patch that transitions
vfs timestamps to use 64 bit time and hence make them
y2038 safe. As part of the effort current_time() will be
extended to do range checks. Hence, it is necessary for all
file system timestamps to use current_time(). Also,
current_time() will be transitioned along with vfs to be
y2038 safe.Note that whenever a single call to current_time() is used
to change timestamps in different inodes, it is because they
share the same time granularity.Signed-off-by: Deepa Dinamani
Reviewed-by: Arnd Bergmann
Acked-by: Felipe Balbi
Acked-by: Steven Whitehouse
Acked-by: Ryusuke Konishi
Acked-by: David Sterba
Signed-off-by: Al Viro
23 Jan, 2016
1 commit
-
parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
inode_foo(inode) being mutex_foo(&inode->i_mutex).Please, use those for access to ->i_mutex; over the coming cycle
->i_mutex will become rwsem, with ->lookup() done with it held
only shared.Signed-off-by: Al Viro
31 Dec, 2015
1 commit
-
Signed-off-by: Al Viro
09 Dec, 2015
2 commits
-
Signed-off-by: Al Viro
-
new method: ->get_link(); replacement of ->follow_link(). The differences
are:
* inode and dentry are passed separately
* might be called both in RCU and non-RCU mode;
the former is indicated by passing it a NULL dentry.
* when called that way it isn't allowed to block
and should return ERR_PTR(-ECHILD) if it needs to be called
in non-RCU mode.It's a flagday change - the old method is gone, all in-tree instances
converted. Conversion isn't hard; said that, so far very few instances
do not immediately bail out when called in RCU mode. That'll change
in the next commits.Signed-off-by: Al Viro
11 May, 2015
2 commits
-
its only use is getting passed to nd_jump_link(), which can obtain
it from current->nameidataSigned-off-by: Al Viro
-
a) instead of storing the symlink body (via nd_set_link()) and returning
an opaque pointer later passed to ->put_link(), ->follow_link() _stores_
that opaque pointer (into void * passed by address by caller) and returns
the symlink body. Returning ERR_PTR() on error, NULL on jump (procfs magic
symlinks) and pointer to symlink body for normal symlinks. Stored pointer
is ignored in all cases except the last one.Storing NULL for opaque pointer (or not storing it at all) means no call
of ->put_link().b) the body used to be passed to ->put_link() implicitly (via nameidata).
Now only the opaque pointer is. In the cases when we used the symlink body
to free stuff, ->follow_link() now should store it as opaque pointer in addition
to returning it.Signed-off-by: Al Viro
16 Apr, 2015
1 commit
-
that's the bulk of filesystem drivers dealing with inodes of their own
Signed-off-by: David Howells
Signed-off-by: Al Viro
02 Apr, 2014
1 commit
-
Signed-off-by: Al Viro
25 Oct, 2013
1 commit
-
duplicated to hell and back...
Signed-off-by: Al Viro
30 Apr, 2013
1 commit
-
Include missing linux/slab.h inclusions where the source file is currently
expecting to get kmalloc() and co. through linux/proc_fs.h.Signed-off-by: David Howells
Acked-by: Greg Kroah-Hartman
cc: linux-s390@vger.kernel.org
cc: sparclinux@vger.kernel.org
cc: linux-efi@vger.kernel.org
cc: linux-mtd@lists.infradead.org
cc: devel@driverdev.osuosl.org
cc: x86@kernel.org
Signed-off-by: Al Viro
10 Apr, 2013
2 commits
-
Just have it pinned in dcache all along and let procfs ->kill_sb()
drop it before kill_anon_super().Signed-off-by: Al Viro
-
Signed-off-by: Al Viro
19 Nov, 2012
1 commit
-
I had visions at one point of splitting proc into two filesystems. If
that had happened proc/self being the the part of proc that actually deals
with pids would have been a nice cleanup. As it is proc/self requires
a lot of unnecessary infrastructure for a single file.The only user visible change is that a mounted /proc for a pid namespace
that is dead now shows a broken proc symlink, instead of being completely
invisible. I don't think anyone will notice or care.Signed-off-by: Eric W. Biederman