14 Nov, 2020

1 commit


13 Jun, 2020

1 commit

  • Recently syzbot reported that unmounting proc when there is an ongoing
    inotify watch on the root directory of proc could result in a
    use-after-free: the watch is only removed when the watcher exits,
    which can happen after proc has been unmounted.

    Commit 69879c01a0c3 ("proc: Remove the now unnecessary internal mount
    of proc") made it easier to unmount proc and allowed syzbot to see the
    problem, but looking at the code it has been around for a long time.

    Looking at the code, the fsnotify watch should have been removed by
    fsnotify_sb_delete in generic_shutdown_super. Unfortunately the inode
    was allocated with new_inode_pseudo instead of new_inode, so the inode
    was not on the sb->s_inodes list. This prevented
    fsnotify_unmount_inodes from finding the inode and removing the watch,
    and it also meant the "VFS: Busy inodes after unmount" warning
    could not find the inodes to warn about them.

    Make all of the inodes in proc visible to generic_shutdown_super,
    and fsnotify_sb_delete by using new_inode instead of new_inode_pseudo.
    The only functional difference is that new_inode places the inodes
    on the sb->s_inodes list.

    I wrote a small test program and I can verify that without changes it
    can trigger this issue, and by replacing new_inode_pseudo with
    new_inode the issue goes away.

    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/000000000000d788c905a7dfa3f4@google.com
    Reported-by: syzbot+7d2debdcdb3cb93c1e5e@syzkaller.appspotmail.com
    Fixes: 0097875bd415 ("proc: Implement /proc/thread-self to point at the directory of the current thread")
    Fixes: 021ada7dff22 ("procfs: switch /proc/self away from proc_dir_entry")
    Fixes: 51f0885e5415 ("vfs,proc: guarantee unique inodes in /proc")
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
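
The test program mentioned in the commit message is not reproduced in this log. As a rough sketch of only the first half of such a scenario (establishing the watch), the following userspace C helper opens an inotify instance and watches a directory. The name `watch_dir` is hypothetical, and the unmount step that exposed the bug is deliberately left out.

```c
#include <stdio.h>
#include <sys/inotify.h>
#include <unistd.h>

/*
 * Hypothetical helper: open an inotify instance and watch `path`.
 * Returns the inotify file descriptor, or -1 on failure.
 */
static int watch_dir(const char *path)
{
    int fd = inotify_init1(IN_CLOEXEC);

    if (fd < 0) {
        perror("inotify_init1");
        return -1;
    }

    /* Adding the watch attaches an fsnotify mark to the directory's inode. */
    if (inotify_add_watch(fd, path, IN_ALL_EVENTS) < 0) {
        perror("inotify_add_watch");
        close(fd);
        return -1;
    }

    /* The watch stays in place until the descriptor is closed. */
    return fd;
}
```

Calling `watch_dir("/proc")` and keeping the descriptor open is the kind of long-lived watch the report describes; the watch goes away only when the descriptor is closed or the process exits.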
     

19 May, 2020

1 commit

  • syzbot found that

    touch /proc/testfile

    causes NULL pointer dereference at tomoyo_get_local_path()
    because inode of the dentry is NULL.

    Before c59f415a7cb6, Tomoyo received pid_ns from proc's s_fs_info
    directly. Since proc_pid_ns() can only work with inode, using it in
    the tomoyo_get_local_path() was wrong.

    To avoid creating more functions for getting proc_ns, change the
    argument type of the proc_pid_ns() function. Then, Tomoyo can use
    the existing super_block to get pid_ns.

    Link: https://lkml.kernel.org/r/0000000000002f0c7505a5b0e04c@google.com
    Link: https://lkml.kernel.org/r/20200518180738.2939611-1-gladkov.alexey@gmail.com
    Reported-by: syzbot+c1af344512918c61362c@syzkaller.appspotmail.com
    Fixes: c59f415a7cb6 ("Use proc_pid_ns() to get pid_namespace from the proc superblock")
    Signed-off-by: Alexey Gladkov
    Signed-off-by: Eric W. Biederman

    Alexey Gladkov
     

22 Apr, 2020

1 commit

  • This patch allows multiple procfs instances inside the
    same pid namespace. The aim here is lightweight sandboxes, and to allow
    that we have to modernize procfs internals.

    1) The main aim of this work is to have one supervisor for apps on
    embedded systems. Right now we have some lightweight sandbox support;
    however, if we create pid namespaces we have to manage all the
    processes inside them too, whereas our goal is to be able to run a bunch
    of apps, each inside its own mount namespace, without being able to
    notice each other. We only want to use mount namespaces, and we want
    procfs to behave more like a real mount point.

    2) Linux Security Modules have multiple ptrace paths inside some
    subsystems. Inside procfs, however, the implementation does not guarantee
    that the ptrace() check which triggers the security_ptrace_check() hook
    will always run. We have the 'hidepid' mount option that can be used to
    force the ptrace_may_access() check inside has_pid_permissions() to run.
    The problem is that 'hidepid' is per pid namespace and not attached to
    the mount point, so any remount or modification of 'hidepid' will
    propagate to all other procfs mounts.

    This also makes it hard to support the Yama LSM in desktop and user
    sessions. The Yama ptrace scope, which restricts ptrace and some other
    syscalls to be allowed only on inferiors, can be updated to have a
    per-task context, where the context will be inherited during fork(),
    clone() and preserved across execve(). If we support multiple private
    procfs instances, then we may force the ptrace_may_access() check on
    /proc/<pid>/ to always run inside those new procfs instances. This will
    make it possible to specify per user session whether procfs should be
    populated with pids that the user can ptrace or not.

    By using the Yama ptrace scope, some restricted users will only be able
    to see inferiors inside /proc; they won't even be able to see their other
    processes. Some software like Chromium, Firefox's crash handler, Wine
    and others already uses Yama to restrict which processes are
    ptraceable. This change makes it possible to restrict
    /proc/<pid>/ but, more importantly, it gives desktop users a
    generic and usable way to specify which users should see all processes
    and which users cannot.

    Side notes:
    * This covers a gap in seccomp, which is not able to parse
    arguments. It is easy to install a seccomp filter on direct syscalls
    that operate on pids; however, /proc/<pid>/ is a Linux ABI using
    filesystem syscalls. With this change LSMs should be able to analyze
    open/read/write/close...

    In the new patch set version I removed the 'newinstance' option
    as suggested by Eric W. Biederman.

    Selftest has been added to verify new behavior.

    Signed-off-by: Alexey Gladkov
    Reviewed-by: Alexey Dobriyan
    Reviewed-by: Kees Cook
    Signed-off-by: Eric W. Biederman

    Alexey Gladkov
     

06 Mar, 2019

1 commit


16 May, 2018

1 commit


07 Feb, 2018

2 commits

  • /proc/self inode numbers, value of proc_inode_cache and st_nlink of
    /proc/$TGID are fixed constants.

    Link: http://lkml.kernel.org/r/20180103184707.GA31849@avx2
    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • PROC_NUMBUF is 13 which is enough for "negative int + \n + \0".

    However PIDs and TGIDs are never negative and newline is not a concern,
    so use just 10 per integer.

    Link: http://lkml.kernel.org/r/20171120203005.GA27743@avx2
    Signed-off-by: Alexey Dobriyan
    Cc: Alexander Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
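
The buffer-size arithmetic in this commit message can be checked from userspace. The sketch below assumes a 32-bit `int`; the helper name `decimal_width` is made up for illustration and is not a kernel function. It uses `snprintf` with a zero-sized buffer to count the characters a decimal rendering needs.

```c
#include <limits.h>
#include <stdio.h>

/*
 * Hypothetical helper: number of characters needed to print `v` in
 * decimal, not counting the terminating NUL.
 */
static int decimal_width(int v)
{
    /* With a NULL buffer and size 0, snprintf only reports the length. */
    return snprintf(NULL, 0, "%d", v);
}
```

With a 32-bit `int`, `decimal_width(INT_MIN)` is 11 (sign plus ten digits), and adding a newline and a NUL gives the 13 bytes of PROC_NUMBUF. `decimal_width(INT_MAX)` is 10, which is the per-integer figure the commit switches to for values that are never negative.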
     

02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boilerplate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it,
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information.

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to be applied to
    a file was done in a spreadsheet of side by side results from the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files created by Philippe Ombredanne. Philippe prepared the
    base worksheet, and did an initial spot review of a few 1000 files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging were:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

09 Dec, 2016

2 commits


28 Sep, 2016

1 commit

  • The CURRENT_TIME macro is not appropriate for filesystems as it
    doesn't use the right granularity for filesystem timestamps.
    Use current_time() instead.

    CURRENT_TIME is also not y2038 safe.

    This is also in preparation for the patch that transitions
    vfs timestamps to use 64 bit time and hence make them
    y2038 safe. As part of the effort current_time() will be
    extended to do range checks. Hence, it is necessary for all
    file system timestamps to use current_time(). Also,
    current_time() will be transitioned along with vfs to be
    y2038 safe.

    Note that whenever a single call to current_time() is used
    to change timestamps in different inodes, it is because they
    share the same time granularity.

    Signed-off-by: Deepa Dinamani
    Reviewed-by: Arnd Bergmann
    Acked-by: Felipe Balbi
    Acked-by: Steven Whitehouse
    Acked-by: Ryusuke Konishi
    Acked-by: David Sterba
    Signed-off-by: Al Viro

    Deepa Dinamani
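
To make the granularity point concrete: a filesystem that stores timestamps in, say, microseconds should not hand out values with nanosecond detail it cannot keep. The userspace sketch below illustrates that rounding-down step only. The helper name `trunc_to_gran` is hypothetical, and this is not the kernel's implementation of current_time().

```c
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/*
 * Hypothetical helper: round `ts` down to a multiple of `gran`
 * nanoseconds. This sketch handles granularities up to one second;
 * anything coarser is treated as one second.
 */
static struct timespec trunc_to_gran(struct timespec ts, long gran)
{
    if (gran >= NSEC_PER_SEC)
        ts.tv_nsec = 0;                  /* whole seconds only */
    else if (gran > 1)
        ts.tv_nsec -= ts.tv_nsec % gran; /* drop the sub-granule part */
    /* gran <= 1: nanosecond granularity, nothing to drop */
    return ts;
}
```

For example, a timestamp with `tv_nsec = 123456789` becomes `123456000` at a 1000 ns granularity and `0` at a one-second granularity.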
     

23 Jan, 2016

1 commit

  • parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
    inode_foo(inode) being mutex_foo(&inode->i_mutex).

    Please, use those for access to ->i_mutex; over the coming cycle
    ->i_mutex will become rwsem, with ->lookup() done with it held
    only shared.

    Signed-off-by: Al Viro

    Al Viro
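
The wrapper pattern described here can be mimicked in userspace. The sketch below is an analogy only: it uses a pthread mutex in place of the kernel mutex, and `struct demo_inode` and the `demo_inode_*` names are invented for illustration. The point is that callers go through the wrappers, so the lock type behind them can change later without touching the callers.

```c
#include <pthread.h>

/* Userspace stand-in for struct inode; only the lock is modelled. */
struct demo_inode {
    pthread_mutex_t i_mutex;
};

static inline void demo_inode_lock(struct demo_inode *inode)
{
    pthread_mutex_lock(&inode->i_mutex);
}

static inline void demo_inode_unlock(struct demo_inode *inode)
{
    pthread_mutex_unlock(&inode->i_mutex);
}

/* Returns 1 if the lock was taken, 0 if it was already held. */
static inline int demo_inode_trylock(struct demo_inode *inode)
{
    return pthread_mutex_trylock(&inode->i_mutex) == 0;
}
```

A caller writes `demo_inode_lock(&ino); ... demo_inode_unlock(&ino);` and never names `i_mutex` directly.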
     

31 Dec, 2015

1 commit


09 Dec, 2015

2 commits


11 May, 2015

2 commits

  • its only use is getting passed to nd_jump_link(), which can obtain
    it from current->nameidata

    Signed-off-by: Al Viro

    Al Viro
     
  • a) instead of storing the symlink body (via nd_set_link()) and returning
    an opaque pointer later passed to ->put_link(), ->follow_link() _stores_
    that opaque pointer (into void * passed by address by caller) and returns
    the symlink body. It returns ERR_PTR() on error, NULL on jump (procfs magic
    symlinks) and a pointer to the symlink body for normal symlinks. The stored
    pointer is ignored in all cases except the last one.

    Storing NULL for opaque pointer (or not storing it at all) means no call
    of ->put_link().

    b) the body used to be passed to ->put_link() implicitly (via nameidata).
    Now only the opaque pointer is. In the cases when we used the symlink body
    to free stuff, ->follow_link() now should store it as opaque pointer in addition
    to returning it.

    Signed-off-by: Al Viro

    Al Viro
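
The calling convention in (a) and (b) can be sketched in userspace. Everything below is illustrative: `ERR_PTR()` and `IS_ERR()` are minimal stand-ins for the kernel helpers of the same name, and `demo_follow_link()` and `demo_put_link()` are hypothetical functions, not the VFS methods. The NULL "jump" return used by procfs magic symlinks is not modelled.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Minimal userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
    return (void *)error;
}

static inline int IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/*
 * Hypothetical ->follow_link() analogue: returns the symlink body and
 * stores an opaque cookie through `cookie` for a later put_link call.
 * Here the body is heap-allocated, so the cookie is the body itself.
 */
static const char *demo_follow_link(const char *target, void **cookie)
{
    size_t len;
    char *body;

    if (!target)
        return ERR_PTR(-ENOENT);

    len = strlen(target) + 1;
    body = malloc(len);
    if (!body)
        return ERR_PTR(-ENOMEM);
    memcpy(body, target, len);

    *cookie = body;  /* needed later to free the body */
    return body;
}

/* Hypothetical ->put_link() analogue: receives only the opaque cookie. */
static void demo_put_link(void *cookie)
{
    free(cookie);
}
```

A caller initialises its cookie to NULL, calls `demo_follow_link()`, checks the result with `IS_ERR()`, and calls `demo_put_link()` only if a non-NULL cookie was stored.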
     

16 Apr, 2015

1 commit


02 Apr, 2014

1 commit


25 Oct, 2013

1 commit


30 Apr, 2013

1 commit

  • Include missing linux/slab.h inclusions where the source file is currently
    expecting to get kmalloc() and co. through linux/proc_fs.h.

    Signed-off-by: David Howells
    Acked-by: Greg Kroah-Hartman
    cc: linux-s390@vger.kernel.org
    cc: sparclinux@vger.kernel.org
    cc: linux-efi@vger.kernel.org
    cc: linux-mtd@lists.infradead.org
    cc: devel@driverdev.osuosl.org
    cc: x86@kernel.org
    Signed-off-by: Al Viro

    David Howells
     

10 Apr, 2013

2 commits


19 Nov, 2012

1 commit

  • I had visions at one point of splitting proc into two filesystems. If
    that had happened, proc/self being part of the proc that actually deals
    with pids would have been a nice cleanup. As it is, proc/self requires
    a lot of unnecessary infrastructure for a single file.

    The only user visible change is that a mounted /proc for a pid namespace
    that is dead now shows a broken proc symlink, instead of being completely
    invisible. I don't think anyone will notice or care.

    Signed-off-by: Eric W. Biederman

    Eric W. Biederman