28 Sep, 2016

1 commit


24 Jun, 2016

1 commit

  • Move the call of get_pid_ns, the call of proc_parse_options, and
    the setting of s_iflags into proc_fill_super so that mount_ns
    can be used.

    Convert proc_mount to call mount_ns and remove the now unnecessary
    code.

    Acked-by: Seth Forshee
    Reviewed-by: Djalal Harouni
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     

01 Jul, 2015

1 commit

  • Add a new function proc_create_mount_point that, when used, creates a
    directory that cannot be added to.

    Add a new function is_empty_pde to test if a proc directory entry is a
    mount point.

    Update the code to use make_empty_dir_inode when reporting
    a permanently empty directory to the vfs.

    Update the code to not allow adding to permanently empty directories.

    Update /proc/openprom and /proc/fs/nfsd to be permanently empty directories.

    Cc: stable@vger.kernel.org
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     

23 Feb, 2015

1 commit


17 Dec, 2014

1 commit

  • Pull vfs pile #2 from Al Viro:
    "Next pile (and there'll be one or two more).

    The large piece in this one is getting rid of /proc/*/ns/* weirdness;
    among other things, it allows to (finally) make nameidata completely
    opaque outside of fs/namei.c, making for easier further cleanups in
    there"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    coda_venus_readdir(): use file_inode()
    fs/namei.c: fold link_path_walk() call into path_init()
    path_init(): don't bother with LOOKUP_PARENT in argument
    fs/namei.c: new helper (path_cleanup())
    path_init(): store the "base" pointer to file in nameidata itself
    make default ->i_fop have ->open() fail with ENXIO
    make nameidata completely opaque outside of fs/namei.c
    kill proc_ns completely
    take the targets of /proc/*/ns/* symlinks to separate fs
    bury struct proc_ns in fs/proc
    copy address of proc_ns_ops into ns_common
    new helpers: ns_alloc_inum/ns_free_inum
    make proc_ns_operations work with struct ns_common * instead of void *
    switch the rest of proc_ns_operations to working with &...->ns
    netns: switch ->get()/->put()/->install()/->inum() to working with &net->ns
    make mntns ->get()/->put()/->install()/->inum() work with &mnt_ns->ns
    common object embedded into various struct ....ns

    Linus Torvalds
     

11 Dec, 2014

2 commits

  • procfs inodes need only the ns_ops part; nsfs inodes don't need it at all

    Signed-off-by: Al Viro

    Al Viro
     
  • When a lot of netdevices are created, one of the bottlenecks is the
    creation of proc entries. This series aims to accelerate this part.

    The current implementation for the directories in /proc uses a singly
    linked list. This is slow when handling directories with large numbers of
    entries (e.g. netdevice-related entries when lots of tunnels are opened).

    This patch replaces this linked list by a red-black tree.

    Here are some numbers:

    dummy30000.batch contains 30 000 times 'link add type dummy'.

    Before the patch:
    $ time ip -b dummy30000.batch
    real 2m31.950s
    user 0m0.440s
    sys 2m21.440s
    $ time rmmod dummy
    real 1m35.764s
    user 0m0.000s
    sys 1m24.088s

    After the patch:
    $ time ip -b dummy30000.batch
    real 2m0.874s
    user 0m0.448s
    sys 1m49.720s
    $ time rmmod dummy
    real 1m13.988s
    user 0m0.000s
    sys 1m1.008s

    The idea of improving this part was suggested by Thierry Herbelot.

    [akpm@linux-foundation.org: initialise proc_root.subdir at compile time]
    Signed-off-by: Nicolas Dichtel
    Acked-by: David S. Miller
    Cc: Thierry Herbelot
    Acked-by: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nicolas Dichtel
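
The change from a list to an ordered tree can be sketched in userspace C. This is only an illustration under stated assumptions: `struct entry`, `entry_cmp`, `tree_insert` and `tree_lookup` are hypothetical names, the tree here is a plain unbalanced binary search tree rather than a red-black tree, and ordering by name length before name bytes is an assumption about how entries might be compared.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a proc directory entry; not the kernel's struct. */
struct entry {
    const char *name;
    size_t namelen;
    struct entry *left, *right;
};

/* Assumed ordering: compare lengths first, then the name bytes. */
int entry_cmp(const char *name, size_t len, const struct entry *e)
{
    if (len < e->namelen)
        return -1;
    if (len > e->namelen)
        return 1;
    return memcmp(name, e->name, len);
}

/* Insert into an unbalanced search tree; returns 0 if the name already exists. */
int tree_insert(struct entry **root, struct entry *e)
{
    while (*root) {
        int c = entry_cmp(e->name, e->namelen, *root);
        if (c == 0)
            return 0;
        root = c < 0 ? &(*root)->left : &(*root)->right;
    }
    e->left = e->right = NULL;
    *root = e;
    return 1;
}

/* Walk down from the root instead of scanning every sibling in a list. */
struct entry *tree_lookup(struct entry *root, const char *name)
{
    size_t len = strlen(name);

    while (root) {
        int c = entry_cmp(name, len, root);
        if (c == 0)
            return root;
        root = c < 0 ? root->left : root->right;
    }
    return NULL;
}
```

A balanced tree such as the red-black tree named in the commit keeps each lookup and duplicate check logarithmic in the number of entries, whereas the list walk is linear.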
     

05 Dec, 2014

1 commit

  • a) make get_proc_ns() return a pointer to struct ns_common
    b) mirror ns_ops in dentry->d_fsdata of ns dentries, so that
    is_mnt_ns_file() could get away with fewer dereferences.

    That way struct proc_ns becomes invisible outside of fs/proc/*.c

    Signed-off-by: Al Viro

    Al Viro
     

10 Oct, 2014

3 commits

  • m_start() can use get_proc_task() instead, and "struct inode *"
    provides more potentially useful info, see the next changes.

    Signed-off-by: Oleg Nesterov
    Cc: Alexander Viro
    Cc: Cyrill Gorcunov
    Cc: "Eric W. Biederman"
    Cc: Greg Ungerer
    Cc: "Kirill A. Shutemov"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • A simple test-case from Kirill Shutemov

    cat /proc/self/maps >/dev/null
    chmod +x /proc/self/net/packet
    exec /proc/self/net/packet

    makes lockdep unhappy, cat/exec take seq_file->lock + cred_guard_mutex in
    the opposite order.

    It's a false positive and probably we should not allow "chmod +x" on proc
    files. Still I think that we should avoid mm_access() and cred_guard_mutex
    in sys_read() paths, security checking should happen at open time. Besides,
    this doesn't even look right if the task changes its ->mm between m_stop()
    and m_start().

    Add the new "mm_struct *mm" member into struct proc_maps_private and change
    proc_maps_open() to initialize it using proc_mem_open(). Change m_start() to
    use priv->mm if atomic_inc_not_zero(mm_users) succeeds or return NULL (eof)
    otherwise.

    The only complication is that proc_maps_open() users should additionally do
    mmdrop() in fop->release(), add the new proc_map_release() helper for that.

    Note: this is the user-visible change, if the task execs after open("maps")
    the new ->mm won't be visible via this file. I hope this is fine, and this
    matches /proc/pid/mem behaviour.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Oleg Nesterov
    Reported-by: "Kirill A. Shutemov"
    Acked-by: Kirill A. Shutemov
    Acked-by: Cyrill Gorcunov
    Cc: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • Extract the mm_access() code from __mem_open() into the new helper,
    proc_mem_open(), the next patch will add another caller.

    Signed-off-by: Oleg Nesterov
    Acked-by: Kirill A. Shutemov
    Acked-by: Cyrill Gorcunov
    Cc: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

10 Aug, 2014

1 commit

  • Pull namespace updates from Eric Biederman:
    "This is a bunch of small changes built against 3.16-rc6. The most
    significant change for users is the first patch which makes setns
    dramatically faster by removing unneeded rcu handling.

    The next chunk of changes are so that "mount -o remount,.." will not
    allow the user namespace root to drop flags on a mount set by the
    system wide root. In other words, this forces read-only mounts to stay
    read-only, no-dev mounts to stay no-dev, no-suid mounts to stay no-suid,
    no-exec mounts to stay no-exec and it prevents unprivileged users from
    messing with a mount's atime settings. I have included my test case as the
    last patch in this series so people performing backports can verify
    this change works correctly.

    The next change fixes a bug in NFS that was discovered while auditing
    nsproxy users for the first optimization. Today you can oops the
    kernel by reading /proc/fs/nfsfs/{servers,volumes} if you are clever
    with pid namespaces. I rebased and fixed the build of the
    !CONFIG_NFS_FS case yesterday when a build bot caught my typo. Given
    that no one to my knowledge bases anything on my tree fixing the typo
    in place seems more responsible than requiring a typo-fix to be
    backported as well.

    The last change is a small semantic cleanup introducing
    /proc/thread-self and pointing /proc/mounts and /proc/net at it. This
    prevents several kinds of problematic corner cases. It is a
    user-visible change so it has a minute chance of causing regressions
    so the change to /proc/mounts and /proc/net are individual one line
    commits that can be trivially reverted. Unfortunately I lost and
    could not find the email of the original reporter so he is not
    credited. From at least one perspective this change to /proc/net is a
    regression fix to allow pthread /proc/net uses that were broken by
    the introduction of the network namespace"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
    proc: Point /proc/mounts at /proc/thread-self/mounts instead of /proc/self/mounts
    proc: Point /proc/net at /proc/thread-self/net instead of /proc/self/net
    proc: Implement /proc/thread-self to point at the directory of the current thread
    proc: Have net show up under /proc/<pid>/task/<tid>
    NFS: Fix /proc/fs/nfsfs/servers and /proc/fs/nfsfs/volumes
    mnt: Add tests for unprivileged remount cases that have found to be faulty
    mnt: Change the default remount atime from relatime to the existing value
    mnt: Correct permission checks in do_remount
    mnt: Move the test for MNT_LOCK_READONLY from change_mount_flags into do_remount
    mnt: Only change user settable mount flags in remount
    namespaces: Use task_lock and not rcu to protect nsproxy

    Linus Torvalds
     

09 Aug, 2014

3 commits

  • If you're applying this patch, all /proc/$PID/* files were converted
    to seq_file interface and this code became unused.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • Signed-off-by: Alexey Dobriyan
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • * remove proc_create(NULL, ...) check, let it oops

    * warn about proc_create("", ...) and proc_create("very very long name", ...)
    proc code keeps length as u8, no 256+ name length possible

    * warn about proc_create("123", ...)
    /proc/$PID and /proc/misc namespaces are separate things,
    but dumb module might create funky a-la $PID entry.

    * remove post mortem strchr('/') check
    Triggering it implies either strchr() is buggy or memory corruption.
    It should be VFS check anyway.

    In reality, none of these checks will ever trigger,
    it is preparation for the next patch.

    Based on patch from Al Viro.

    Signed-off-by: Alexey Dobriyan
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
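
The name checks listed in this commit can be approximated by a small userspace sketch. `check_entry_name` and the `name_status` values are hypothetical, and treating "looks like a PID" as "consists only of decimal digits" is an assumption made for illustration.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum name_status {
    NAME_OK,
    NAME_EMPTY,          /* proc_create("", ...) */
    NAME_TOO_LONG,       /* length does not fit in a u8 */
    NAME_LOOKS_LIKE_PID  /* e.g. "123" */
};

/* Classify a candidate entry name; the caller must pass a non-NULL string. */
enum name_status check_entry_name(const char *name)
{
    size_t len = strlen(name);
    size_t i;

    if (len == 0)
        return NAME_EMPTY;
    if (len > 255)
        return NAME_TOO_LONG;
    for (i = 0; i < len; i++) {
        if (!isdigit((unsigned char)name[i]))
            return NAME_OK;   /* at least one non-digit: cannot be a PID */
    }
    return NAME_LOOKS_LIKE_PID;
}
```

As in the commit, a NULL name is deliberately not checked here; passing one is a caller bug.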
     

05 Aug, 2014

1 commit

  • /proc/thread-self is derived from /proc/self. /proc/thread-self
    points to the directory in proc containing information about the
    current thread.

    This functionality has been missing for a long time, and is tricky to
    implement in userspace as gettid() is not exported by glibc. More
    importantly this allows fixing defects in /proc/mounts and /proc/net
    where in a threaded application today they wind up being empty files
    when only the initial pthread has exited, causing problems for other
    threads.

    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
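
From userspace, the new link can be checked against the calling thread's id. The sketch below assumes the link target has the form `<tgid>/task/<tid>`; `parse_thread_self` and `thread_self_matches_tid` are hypothetical helper names, and the thread id is obtained with `syscall(SYS_gettid)` since the commit notes that glibc did not export gettid().

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Parse a link target of the assumed form "<tgid>/task/<tid>"; 1 on success. */
int parse_thread_self(const char *target, long *tgid, long *tid)
{
    return sscanf(target, "%ld/task/%ld", tgid, tid) == 2;
}

/*
 * Returns 1 if /proc/thread-self names the calling thread, 0 if it does
 * not, and -1 if the link cannot be read (older kernel or no procfs).
 */
int thread_self_matches_tid(void)
{
    char buf[64];
    long tgid, tid;
    ssize_t n = readlink("/proc/thread-self", buf, sizeof(buf) - 1);

    if (n < 0)
        return -1;
    buf[n] = '\0';   /* readlink() does not terminate the string */
    if (!parse_thread_self(buf, &tgid, &tid))
        return 0;
    return tid == (long)syscall(SYS_gettid);
}
```

Called from a secondary pthread, the same function would be expected to report that thread's own id rather than the thread group leader's.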
     

12 Mar, 2014

1 commit

  • The same data is now available in sysfs, so we can remove the code
    that exports it in /proc and replace it with a symlink to the sysfs
    version.

    Tested on versatile qemu model and mpc5200 eval board. More testing
    would be appreciated.

    v5: Fixed up conflicts with mainline changes

    Signed-off-by: Grant Likely
    Cc: Rob Herring
    Cc: Benjamin Herrenschmidt
    Cc: David S. Miller
    Cc: Nathan Fontenot
    Cc: Pantelis Antoniou

    Grant Likely
     

29 Jun, 2013

2 commits


02 May, 2013

6 commits

  • Pull VFS updates from Al Viro,

    Misc cleanups all over the place, mainly wrt /proc interfaces (switch
    create_proc_entry to proc_create(), get rid of the deprecated
    create_proc_read_entry() in favor of using proc_create_data() and
    seq_file etc).

    7kloc removed.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (204 commits)
    don't bother with deferred freeing of fdtables
    proc: Move non-public stuff from linux/proc_fs.h to fs/proc/internal.h
    proc: Make the PROC_I() and PDE() macros internal to procfs
    proc: Supply a function to remove a proc entry by PDE
    take cgroup_open() and cpuset_open() to fs/proc/base.c
    ppc: Clean up scanlog
    ppc: Clean up rtas_flash driver somewhat
    hostap: proc: Use remove_proc_subtree()
    drm: proc: Use remove_proc_subtree()
    drm: proc: Use minor->index to label things, not PDE->name
    drm: Constify drm_proc_list[]
    zoran: Don't print proc_dir_entry data in debug
    reiserfs: Don't access the proc_dir_entry in r_open(), r_start() r_show()
    proc: Supply an accessor for getting the data from a PDE's parent
    airo: Use remove_proc_subtree()
    rtl8192u: Don't need to save device proc dir PDE
    rtl8187se: Use a dir under /proc/net/r8180/
    proc: Add proc_mkdir_data()
    proc: Move some bits from linux/proc_fs.h to linux/{of.h,signal.h,tty.h}
    proc: Move PDE_NET() to fs/proc/proc_net.c
    ...

    Linus Torvalds
     
  • Move non-public declarations and definitions from linux/proc_fs.h to
    fs/proc/internal.h.

    Signed-off-by: David Howells
    Signed-off-by: Al Viro

    David Howells
     
  • Make the PROC_I() and PDE() macros internal to procfs. This means making
    PDE_DATA() out of line. This could be made more optimal by storing
    PDE()->data into inode->i_private.

    Also provide a __PDE_DATA() that is inline and internal to procfs.

    Signed-off-by: David Howells
    Signed-off-by: Al Viro

    David Howells
     
  • Move some bits from linux/proc_fs.h to linux/of.h, signal.h and tty.h.

    Also move proc_tty_init() and proc_device_tree_init() to fs/proc/internal.h as
    they're internal to procfs.

    Signed-off-by: David Howells
    Acked-by: Greg Kroah-Hartman
    Acked-by: Grant Likely
    cc: devicetree-discuss@lists.ozlabs.org
    cc: linux-arch@vger.kernel.org
    cc: Greg Kroah-Hartman
    cc: Jiri Slaby
    Signed-off-by: Al Viro

    David Howells
     
  • Move proc_fd() to fs/proc/fd.h.

    Signed-off-by: David Howells
    Signed-off-by: Al Viro

    David Howells
     
  • Uninline pid_delete_dentry() as it's only used by three function pointers.

    Signed-off-by: David Howells
    Signed-off-by: Al Viro

    David Howells
     

30 Apr, 2013

2 commits

  • Currently get_vmalloc_info() lives in fs/proc/mmu.c. There is no reason
    for this code to be there: its implementation needs vmlist_lock and it
    iterates over vmlist, which may be an internal data structure of vmalloc.

    It is preferable that vmlist_lock and vmlist are only used in vmalloc.c
    for maintainability. So move the code to vmalloc.c.

    Signed-off-by: Joonsoo Kim
    Signed-off-by: Joonsoo Kim
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc: Atsushi Kumagai
    Cc: Chris Metcalf
    Cc: Dave Anderson
    Cc: Eric Biederman
    Cc: Guan Xuetao
    Cc: Ingo Molnar
    Cc: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
     
  • Delete create_proc_read_entry() as it no longer has any users.

    Also delete read_proc_t, write_proc_t, the read_proc member of the
    proc_dir_entry struct and the support functions that use them. This saves a
    pointer for every PDE allocated.

    Signed-off-by: David Howells
    Signed-off-by: Al Viro

    David Howells
     

10 Apr, 2013

4 commits


28 Feb, 2013

1 commit

  • The existing SUID_DUMP_* defines duplicate the newer SUID_DUMPABLE_*
    defines introduced in 54b501992dd2 ("coredump: warn about unsafe
    suid_dumpable / core_pattern combo"). Remove the new ones, and use the
    prior values instead.

    Signed-off-by: Kees Cook
    Reported-by: Chen Gang
    Cc: Alexander Viro
    Cc: Alan Cox
    Cc: "Eric W. Biederman"
    Cc: Doug Ledford
    Cc: Serge Hallyn
    Cc: James Morris
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     

19 Nov, 2012

1 commit

  • I had visions at one point of splitting proc into two filesystems. If
    that had happened, proc/self being the part of proc that actually deals
    with pids would have been a nice cleanup. As it is, proc/self requires
    a lot of unnecessary infrastructure for a single file.

    The only user visible change is that a mounted /proc for a pid namespace
    that is dead now shows a broken proc symlink, instead of being completely
    invisible. I don't think anyone will notice or care.

    Signed-off-by: Eric W. Biederman

    Eric W. Biederman
     

20 Oct, 2012

1 commit

  • /proc/<pid>/numa_maps scans vmas and shows mempolicy under
    mmap_sem. It sometimes accesses task->mempolicy, which can
    be freed without mmap_sem, so numa_maps can show some
    garbage while scanning.

    This patch tries to take reference count of task->mempolicy at reading
    numa_maps before calling get_vma_policy(). By this, task->mempolicy
    will not be freed until numa_maps reaches its end.

    V2->v3
    - updated comments to be more verbose.
    - removed task_lock() in numa_maps code.
    V1->V2
    - access task->mempolicy only once and remember it, because kernel/exit.c
    can overwrite it.

    Signed-off-by: KAMEZAWA Hiroyuki
    Acked-by: David Rientjes
    Acked-by: KOSAKI Motohiro
    Signed-off-by: Linus Torvalds

    KAMEZAWA Hiroyuki
     

06 Oct, 2012

1 commit


27 Sep, 2012

1 commit

  • This patch prepares the ground for further extension of
    /proc/pid/fd[info] handling code by moving fdinfo handling
    code into fs/proc/fd.c.

    I think such move makes both fs/proc/base.c and fs/proc/fd.c
    easier to read.

    Signed-off-by: Cyrill Gorcunov
    Acked-by: Pavel Emelyanov
    CC: Al Viro
    CC: Alexey Dobriyan
    CC: Andrew Morton
    CC: James Bottomley
    CC: "Aneesh Kumar K.V"
    CC: Alexey Dobriyan
    CC: Matthew Helsley
    CC: "J. Bruce Fields"
    CC: "Aneesh Kumar K.V"
    Signed-off-by: Al Viro

    Cyrill Gorcunov
     

14 Jul, 2012

2 commits


01 Jun, 2012

2 commits

  • When we checkpoint a task we need to know the list of children the
    task has, but there is no easy and fast way to generate the reverse
    parent->children chain from an arbitrary <pid> (while a parent pid is
    provided in the "PPid" field of /proc/<pid>/status).

    So instead of walking over all pids in the system (creating one big
    process tree in memory, just to figure out which children a task has) --
    we add an explicit /proc/<pid>/task/<tid>/children entry, because the kernel
    already has this kind of information but it is not yet exported.

    This lists first-level children only, not the whole process tree.

    Signed-off-by: Cyrill Gorcunov
    Reviewed-by: Oleg Nesterov
    Reviewed-by: Kees Cook
    Cc: Pavel Emelyanov
    Cc: Serge Hallyn
    Cc: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Cyrill Gorcunov
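
A checkpoint tool could consume the new file along these lines. This is a sketch assuming the file holds child pids as decimal numbers separated by whitespace; `parse_children` is a hypothetical helper that works on text already read from the children file.

```c
#include <stddef.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Parse whitespace-separated decimal pids from text into out[].
 * Returns the number of pids stored, at most max.
 */
size_t parse_children(const char *text, pid_t *out, size_t max)
{
    size_t n = 0;

    while (n < max) {
        char *end;
        long v = strtol(text, &end, 10);   /* skips leading whitespace */

        if (end == text)
            break;                          /* no further number found */
        out[n++] = (pid_t)v;
        text = end;
    }
    return n;
}
```

Because only first-level children are listed, rebuilding a whole process tree would mean repeating the read for each pid returned.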
     
  • mm_for_maps() is a simple wrapper for mm_access(), and the name is
    misleading, so just remove it and use mm_access() directly.

    Signed-off-by: Cong Wang
    Cc: Oleg Nesterov
    Cc: Alexey Dobriyan
    Acked-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Cong Wang