24 Jun, 2015

1 commit


11 May, 2015

2 commits

  • its only use is getting passed to nd_jump_link(), which can obtain
    it from current->nameidata

    Signed-off-by: Al Viro

    Al Viro
     
  • a) instead of storing the symlink body (via nd_set_link()) and returning
    an opaque pointer later passed to ->put_link(), ->follow_link() _stores_
    that opaque pointer (into void * passed by address by caller) and returns
    the symlink body. Returning ERR_PTR() on error, NULL on jump (procfs magic
    symlinks) and pointer to symlink body for normal symlinks. Stored pointer
    is ignored in all cases except the last one.

    Storing NULL for opaque pointer (or not storing it at all) means no call
    of ->put_link().

    b) the body used to be passed to ->put_link() implicitly (via nameidata).
    Now only the opaque pointer is. In the cases when we used the symlink body
    to free stuff, ->follow_link() now should store it as opaque pointer in addition
    to returning it.

    Signed-off-by: Al Viro

    Al Viro
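
The calling convention described in (a) and (b) can be sketched in userspace roughly as follows. This is a minimal model, not kernel code: every `model_*` name, the `model_kind` enum and the error-pointer helpers are assumptions introduced for illustration, standing in for ->follow_link(), ->put_link() and ERR_PTR()/IS_ERR().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Userspace stand-ins for ERR_PTR()/IS_ERR(); hypothetical helpers. */
#define MODEL_MAX_ERRNO 4095
static void *model_err_ptr(long err) { return (void *)(intptr_t)err; }
static int model_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-MODEL_MAX_ERRNO;
}

/* Hypothetical symlink kinds, for illustration only. */
enum model_kind { MODEL_NORMAL, MODEL_JUMP, MODEL_BROKEN };

struct model_link {
    enum model_kind kind;
    const char *target;
};

/*
 * Models the new ->follow_link(): it returns the symlink body and
 * stores the opaque pointer through *cookie. Here the body itself is
 * what has to be freed, so it is stored as the cookie as well.
 */
static const char *model_follow_link(const struct model_link *l, void **cookie)
{
    char *body;
    size_t len;

    assert(l && cookie);
    switch (l->kind) {
    case MODEL_BROKEN:
        return model_err_ptr(-ENOENT);   /* error: cookie is ignored */
    case MODEL_JUMP:
        return NULL;                     /* jump: cookie is ignored */
    default:
        len = strlen(l->target) + 1;
        body = malloc(len);
        if (!body)
            return model_err_ptr(-ENOMEM);
        memcpy(body, l->target, len);
        *cookie = body;
        return body;
    }
}

static int model_put_calls;

/* Models ->put_link(): it only ever sees the opaque pointer. */
static void model_put_link(void *cookie)
{
    free(cookie);
    model_put_calls++;
}

/* Caller side: returns 1 for a normal link, 0 for a jump, -errno on error. */
static int model_walk(const struct model_link *l, char *out, size_t outlen)
{
    void *cookie = NULL;
    const char *body = model_follow_link(l, &cookie);

    if (model_is_err(body))
        return (int)(intptr_t)body;
    if (!body)
        return 0;
    snprintf(out, outlen, "%s", body);
    if (cookie)                          /* NULL cookie: no put_link call */
        model_put_link(cookie);
    return 1;
}
```

In this model the cookie is only consulted when a body was returned, which mirrors the statement that the stored pointer is ignored in all cases except the last one.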
     

16 Apr, 2015

1 commit


12 Apr, 2015

1 commit


23 Feb, 2015

2 commits

  • X-Coverup: just ask spender
    Cc: stable@vger.kernel.org
    Signed-off-by: Al Viro

    Al Viro
     
  • Convert the following where appropriate:

    (1) S_ISLNK(dentry->d_inode) to d_is_symlink(dentry).

    (2) S_ISREG(dentry->d_inode) to d_is_reg(dentry).

    (3) S_ISDIR(dentry->d_inode) to d_is_dir(dentry). This is actually more
    complicated than it appears as some calls should be converted to
    d_can_lookup() instead. The difference is whether the directory in
    question is a real dir with a ->lookup op or whether it's a fake dir with
    a ->d_automount op.

    In some circumstances, we can subsume checks for dentry->d_inode not being
    NULL into this, provided the code isn't in a filesystem that expects
    d_inode to be NULL if the dirent really *is* negative (ie. if we're going to
    use d_inode() rather than d_backing_inode() to get the inode pointer).

    Note that the dentry type field may be set to something other than
    DCACHE_MISS_TYPE when d_inode is NULL in the case of unionmount, where the VFS
    manages the fall-through from a negative dentry to a lower layer. In such a
    case, the dentry type of the negative union dentry is set to the same as the
    type of the lower dentry.

    However, if you know d_inode is not NULL at the call site, then you can use
    the d_is_xxx() functions even in a filesystem.

    There is one further complication: a 0,0 chardev dentry may be labelled
    DCACHE_WHITEOUT_TYPE rather than DCACHE_SPECIAL_TYPE. Strictly, this was
    intended for special directory entry types that don't have attached inodes.

    The following perl+coccinelle script was used:

    use strict;

    my @callers;
    my $fd;
    open($fd, 'git grep -l \'S_IS[A-Z].*->d_inode\' |') ||
        die "Can't grep for S_ISDIR and co. callers";
    @callers = <$fd>;
    close($fd);
    unless (@callers) {
        print "No matches\n";
        exit(0);
    }

    my @cocci = (
        '@@',
        'expression E;',
        '@@',
        '',
        '- S_ISLNK(E->d_inode->i_mode)',
        '+ d_is_symlink(E)',
        '',
        '@@',
        'expression E;',
        '@@',
        '',
        '- S_ISDIR(E->d_inode->i_mode)',
        '+ d_is_dir(E)',
        '',
        '@@',
        'expression E;',
        '@@',
        '',
        '- S_ISREG(E->d_inode->i_mode)',
        '+ d_is_reg(E)' );

    my $coccifile = "tmp.sp.cocci";
    open($fd, ">$coccifile") || die $coccifile;
    print($fd "$_\n") || die $coccifile foreach (@cocci);
    close($fd);

    foreach my $file (@callers) {
        chomp $file;
        print "Processing ", $file, "\n";
        system("spatch", "--sp-file", $coccifile, $file, "--in-place", "--no-show-diff") == 0 ||
            die "spatch failed";
    }

    [AV: overlayfs parts skipped]

    Signed-off-by: David Howells
    Signed-off-by: Al Viro

    David Howells
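
The distinction drawn in point (3) between d_is_dir() and d_can_lookup() can be illustrated with a small userspace model. This is only a sketch: the `model_type` enum and the `model_*` helpers are assumptions made for the example and do not reproduce the kernel's actual flag encoding.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical dentry types; the kernel encodes these in d_flags. */
enum model_type {
    MODEL_MISS,        /* negative dentry */
    MODEL_DIRECTORY,   /* real directory with a ->lookup op */
    MODEL_AUTODIR,     /* fake directory with a ->d_automount op */
    MODEL_REGULAR,
    MODEL_SYMLINK
};

struct model_dentry {
    enum model_type type;
};

/* True only for a real directory that can be looked up in. */
static bool model_d_can_lookup(const struct model_dentry *d)
{
    assert(d);
    return d->type == MODEL_DIRECTORY;
}

/* True only for an automount point presented as a directory. */
static bool model_d_is_autodir(const struct model_dentry *d)
{
    assert(d);
    return d->type == MODEL_AUTODIR;
}

/* True for either flavour of directory. */
static bool model_d_is_dir(const struct model_dentry *d)
{
    return model_d_can_lookup(d) || model_d_is_autodir(d);
}
```

A call site that goes on to perform a lookup wants the narrower test, which is why some S_ISDIR() users convert to d_can_lookup() rather than d_is_dir().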
     

20 Feb, 2015

1 commit


20 Nov, 2014

1 commit


04 Nov, 2014

1 commit


14 Oct, 2014

5 commits

  • In rcu-walk mode we don't *have* to return -EISDIR for non-mount-traps
    as we will simply drop into REF-walk and handle DCACHE_NEED_AUTOMOUNT
    dentries the slow way. But it is better if we do when possible.

    In 'oz_mode', use the same condition as ref-walk: if not a mountpoint,
    then it must be -EISDIR.

    In regular mode there are more tests needed. Most of them can be
    performed without taking any spinlocks. If we find a directory that
    isn't obviously empty, and isn't mounted on, we need to call
    'simple_empty()' which does take a spinlock. If this turned out to hurt
    performance, some other approach could be found to signal when a
    directory is known to be empty.

    Signed-off-by: NeilBrown
    Reviewed-by: Ian Kent
    Tested-by: Ian Kent
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     
  • ->fs_lock protects AUTOFS_INF_EXPIRING. We need to be sure that once
    the flag is set, no new references beneath the dentry are taken. So
    rcu-walk currently needs to take fs_lock before checking the flag. This
    hurts performance.

    Change the expiry to a two-stage process. First set AUTOFS_INF_NO_RCU
    which forces any path walk into ref-walk mode, then drop the lock and
    call synchronize_rcu(). Once that returns we can be sure no rcu-walk is
    active beneath the dentry and we can check reference counts again.

    Now during an RCU-walk we can test AUTOFS_INF_EXPIRING without taking
    the lock as long as we test AUTOFS_INF_NO_RCU too. If either is set,
    we must abort the RCU-walk. If neither is set, we know that refcounts
    will be tested again after we finish the RCU-walk so we are safe to
    continue.

    ->fs_lock is still taken in d_manage() to check for a non-trap
    directory. That will be resolved in the next patch.

    Signed-off-by: NeilBrown
    Reviewed-by: Ian Kent
    Tested-by: Ian Kent
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
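
The two-stage expiry described above can be modelled in single-threaded userspace C as a rough sketch. All names here (`model_ino`, the `MODEL_INF_*` flags, `model_try_expire`, the `during_sync` hook) are assumptions for illustration; the real code uses ->fs_lock and synchronize_rcu(), which this model replaces with a callback that simulates activity during the grace period.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_INF_EXPIRING (1u << 0)
#define MODEL_INF_NO_RCU   (1u << 1)

struct model_ino {
    unsigned int flags;
    int refcount;              /* references held beneath the dentry */
};

/* rcu-walk side: a lockless test of both flags. */
static int model_rcu_walk_check(const struct model_ino *ino)
{
    assert(ino);
    if (ino->flags & (MODEL_INF_EXPIRING | MODEL_INF_NO_RCU))
        return -ECHILD;        /* abort rcu-walk, retry in ref-walk */
    return 0;
}

/* Example hook: a ref-walker takes a reference during the grace period. */
static void model_take_ref(struct model_ino *ino)
{
    ino->refcount++;
}

/*
 * Expiry side. Stage 1 sets NO_RCU and waits for a grace period (here
 * only a hook call). Stage 2 rechecks the reference count and either
 * commits to expiry or backs out.
 */
static bool model_try_expire(struct model_ino *ino,
                             void (*during_sync)(struct model_ino *))
{
    assert(ino);
    if (ino->refcount > 0)
        return false;

    ino->flags |= MODEL_INF_NO_RCU;          /* stage 1 */
    if (during_sync)
        during_sync(ino);                    /* stands in for synchronize_rcu() */

    if (ino->refcount > 0) {                 /* stage 2: recheck */
        ino->flags &= ~MODEL_INF_NO_RCU;
        return false;
    }
    ino->flags |= MODEL_INF_EXPIRING;
    ino->flags &= ~MODEL_INF_NO_RCU;
    return true;
}
```

At every point where the expiry is in progress at least one of the two flags is set, so the lockless check in the model never lets an rcu-walk continue past a dentry that is being expired.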
     
  • Having a "test" function change the value it is testing can be confusing,
    particularly as a future patch will be calling this function twice.

    So move the update for 'last_used' to avoid repeat expiry to the place
    where the final determination on what to expire is known.

    Signed-off-by: NeilBrown
    Reviewed-by: Ian Kent
    Tested-by: Ian Kent
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     
  • Future patch will potentially call this twice, so make it separate.

    Signed-off-by: NeilBrown
    Reviewed-by: Ian Kent
    Tested-by: Ian Kent
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     
  • This series teaches autofs about RCU-walk so that we don't drop straight
    into REF-walk when we hit an autofs directory, and so that we avoid
    spinlocks as much as possible when performing an RCU-walk.

    This is needed so that the benefits of the recent NFS support for
    RCU-walk are fully available when NFS filesystems are automounted.

    Patches have been carefully reviewed and tested both with test suites
    and in production - thanks a lot to Ian Kent for his support there.

    This patch (of 6):

    Any attempt to look up a pathname that passes through an autofs4 mount is
    currently forced out of RCU-walk into REF-walk.

    This can significantly hurt performance of many-thread work loads on
    many-core systems, especially if the automounted filesystem supports
    RCU-walk but doesn't get to benefit from it.

    So if autofs4_d_manage is called with rcu_walk set, only fail with -ECHILD
    if it is necessary to wait longer than a spinlock.

    Signed-off-by: NeilBrown
    Reviewed-by: Ian Kent
    Tested-by: Ian Kent
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     

09 Oct, 2014

1 commit

  • Biederman's umount-on-rmdir series changes d_invalidate() to summarily remove
    mounts under the passed in dentry regardless of whether they are busy
    or not. So calling this in fs/autofs4/expire.c:autofs4_tree_busy() is
    definitely the wrong thing to do because it will silently umount entries
    instead of just cleaning stale dentries.

    But this call shouldn't be needed and testing shows that automounting
    continues to function without it.

    As Al Viro correctly surmises the original intent of the call was to
    perform what shrink_dcache_parent() does.

    If at some time in the future I see stale dentries accumulating
    following failed mounts I'll revisit the issue and possibly add a
    shrink_dcache_parent() call if needed.

    Signed-off-by: Ian Kent
    Cc: Al Viro
    Cc: Eric W. Biederman
    Signed-off-by: Al Viro

    Ian Kent
     

09 Aug, 2014

5 commits


04 Jul, 2014

1 commit

  • On strict build environments we can see:

    fs/autofs4/inode.c: In function 'autofs4_fill_super':
    fs/autofs4/inode.c:312: error: 'pgrp' may be used uninitialized in this function
    make[2]: *** [fs/autofs4/inode.o] Error 1
    make[1]: *** [fs/autofs4] Error 2
    make: *** [fs] Error 2
    make: *** Waiting for unfinished jobs....

    This is due to pgrp_set being used to indicate that pgrp has been set,
    rather than initializing pgrp itself.

    Signed-off-by: Ian Kent
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ian Kent
     

05 Jun, 2014

1 commit


07 May, 2014

1 commit

  • autofs needs to be able to see private data dentry flags for its dentrys
    that are being created but not yet hashed and for its dentrys that have
    been rmdir()ed but not yet freed. It needs to do this so it can block
    processes in these states until a status has been returned to indicate
    the given operation is complete.

    It does this by keeping two lists, active and expiring, of dentrys in
    this state and uses ->d_release() to keep them stable while it checks
    the reference count to determine if they should be used.

    But with the recent lockref changes dentrys being freed sometimes don't
    transition to a reference count of 0 before being freed so autofs can
    occasionally use a dentry that is invalid which can lead to a panic.

    Signed-off-by: Ian Kent
    Cc: Al Viro
    Cc: Linus Torvalds
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ian Kent
     

09 Apr, 2014

1 commit

  • There wasn't any check of the size passed from userspace before trying
    to allocate the memory required.

    This meant that userspace might request more space than allowed,
    triggering an OOM.

    Signed-off-by: Sasha Levin
    Signed-off-by: Ian Kent
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sasha Levin
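
The kind of check this commit describes can be sketched in userspace as follows. This is a hedged model only: `model_req`, `MODEL_PATH_MAX`, `model_copy_request` and the chosen bounds and error codes are assumptions for illustration, not the actual autofs4 ioctl code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_PATH_MAX 4096   /* assumed upper bound on the payload */

/* Hypothetical request header; size covers the header plus payload. */
struct model_req {
    uint32_t size;
};

/*
 * Copies a caller-supplied request, validating the claimed size
 * before any allocation is attempted.
 */
static int model_copy_request(const void *user, struct model_req **out)
{
    struct model_req hdr;
    struct model_req *copy;

    assert(user && out);
    memcpy(&hdr, user, sizeof(hdr));

    if (hdr.size < sizeof(hdr))
        return -EINVAL;                 /* too small to hold the header */
    if (hdr.size > MODEL_PATH_MAX + sizeof(hdr))
        return -ENAMETOOLONG;           /* refuse oversized requests */

    copy = malloc(hdr.size);
    if (!copy)
        return -ENOMEM;
    memcpy(copy, user, hdr.size);
    *out = copy;
    return 0;
}
```

Rejecting the size before allocating means a bogus value from the caller costs one comparison rather than a large allocation attempt.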
     

24 Jan, 2014

5 commits

  • The autofs4 module doesn't consider symlinks for expire as it did in the
    older autofs v3 module (so it's actually a long standing regression).

    The user space daemon has focused on the use of bind mounts instead of
    symlinks for a long time now and that's why this has not been noticed.
    But with the future addition of amd map parsing to automount(8), not to
    mention amd itself (of am-utils), symlink expiry will be needed.

    The direct and offset mount types can't be symlinks and the tree mounts of
    version 4 were always real mounts so only indirect mounts need expire
    symlinks.

    Since the current users of the autofs4 module haven't reported this as a
    problem to date this patch probably isn't a candidate for backport to
    stable.

    Signed-off-by: Ian Kent
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ian Kent
     
  • Use the helper macro !IS_ROOT to replace parent != dentry->d_parent. Just
    clean up.

    Signed-off-by: Rui Xiang
    Signed-off-by: Ian Kent
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rui Xiang
     
  • When the kzalloc of sbi/ino fails, it should return -ENOMEM.

    And it should return the err value from autofs_prepare_pipe.

    Signed-off-by: Rui Xiang
    Signed-off-by: Ian Kent
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rui Xiang
     
  • The PID and the TGID of the process triggering the mount are sent to the
    daemon. Currently the global pid values are sent (ones valid in the
    initial pid namespace) but this is wrong if the autofs daemon itself is
    not running in the initial pid namespace.

    So send the pid values that are valid in the namespace of the autofs
    daemon.

    The namespace to use is taken from the oz_pgrp pid pointer, which was
    set at mount time to the mounting process' pid namespace.

    If the pid translation fails (the triggering process is in an unrelated
    pid namespace) then the automount fails with ENOENT.

    Signed-off-by: Miklos Szeredi
    Acked-by: Serge Hallyn
    Cc: Eric Biederman
    Acked-by: Ian Kent
    Cc: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • Enable autofs4 to work in a "container". oz_pgrp is converted from
    pid_t to struct pid and this is stored at mount time based on the
    "pgrp=" option or if the option is missing then the current pgrp.

    The "pgrp=" option is interpreted in the PID namespace of the current
    process. This option is flawed in that it doesn't carry the namespace
    information, so it should be deprecated. AFAICS the autofs daemon
    always sends the current pgrp, which is the default anyway.

    The oz_pgrp is also set from the AUTOFS_DEV_IOCTL_SETPIPEFD_CMD ioctl.
    This ioctl sets oz_pgrp to the current pgrp. It is not allowed to
    change the pid namespace.

    oz_pgrp is used mainly to determine whether the process traversing the
    autofs mount tree is the autofs daemon itself or not. This function now
    compares the pid pointers instead of the pid_t values.

    One other use of oz_pgrp is in autofs4_show_options. There it shows the
    virtual pid number (i.e. the one that is valid inside the PID namespace
    of the calling process).

    For debugging printk convert oz_pgrp to the value in the initial pid
    namespace.

    Signed-off-by: Sukadev Bhattiprolu
    Signed-off-by: Miklos Szeredi
    Acked-by: Serge Hallyn
    Cc: Eric Biederman
    Acked-by: Ian Kent
    Cc: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sukadev Bhattiprolu
     

25 Oct, 2013

2 commits


17 Sep, 2013

1 commit

  • Don't drop ->wq_mutex before calling autofs4_notify_daemon() only to regain it
    there. Besides being pointless, that opens a race window where autofs4_wait_release()
    could've come and freed wq->name.name. And do the debugging printk in the "reused an
    existing wq" case before dropping ->wq_mutex - the same reason...

    Signed-off-by: Al Viro
    Acked-by: Ian Kent

    Al Viro
     

09 Sep, 2013

1 commit

  • When reconnecting to automounts at startup an autofs ioctl is used
    to find the device and inode of existing mounts so they can be used
    to open a file descriptor of possibly covered mounts.

    At this time the caller might not yet "own" the mount so it can
    trigger calling ->d_automount(). This causes automount to hang when
    trying to reconnect to direct or offset mount types.

    Consequently kern_path() can't be used but kern_path_mountpoint() can be.

    Signed-off-by: Ian Kent
    Cc: Jeff Layton
    Cc: Al Viro
    Signed-off-by: Al Viro

    Ian Kent
     

05 Jul, 2013

1 commit


29 Jun, 2013

1 commit


07 May, 2013

2 commits

  • When checking if an autofs mount point is busy it isn't sufficient to
    only check if it's a mount point.

    For example, if the mount of an offset mountpoint in a tree is denied
    for this host by its export and the dentry becomes a process working
    directory the check incorrectly returns the mount as not in use at
    expire.

    This can happen since the default when mounting within a tree is
    nostrict, which means ignore mount fails on mounts within the tree and
    continue. The nostrict option is meant to allow mounting in this case.

    Signed-off-by: David Jeffery
    Signed-off-by: Ian Kent
    Cc: stable@vger.kernel.org
    Signed-off-by: Linus Torvalds

    David Jeffery
     
  • Fixed the sparse warning:

    fs/autofs4/root.c:411:5: warning: symbol 'autofs4_d_manage' was not declared. Should it be static?

    [ Clearly it should be static as the function is declared static at the
    top of root.c. - imk ]

    Signed-off-by: Claudiu Ghioc
    Signed-off-by: Ian Kent
    Signed-off-by: Linus Torvalds

    Claudiu Ghioc
     

04 Mar, 2013

1 commit

  • Modify the request_module to prefix the file system type with "fs-"
    and add aliases to all of the filesystems that can be built as modules
    to match.

    A common practice is to build all of the kernel code and leave code
    that is not commonly needed as modules, with the result that many
    users are exposed to any bug anywhere in the kernel.

    Looking for filesystems with a fs- prefix limits the pool of possible
    modules that can be loaded by mount to just filesystems trivially
    making things safer with no real cost.

    Using aliases means user space can control the policy of which
    filesystem modules are auto-loaded by editing /etc/modprobe.d/*.conf
    with blacklist and alias directives. Allowing simple, safe,
    well understood work-arounds to known problematic software.

    This also addresses a rare but unfortunate problem where the filesystem
    name is not the same as its module name and module auto-loading
    would not work. While writing this patch I saw a handful of such
    cases. The most significant being autofs that lives in the module
    autofs4.

    This is relevant to user namespaces because we can reach the request
    module in get_fs_type() without having any special permissions, and
    people get uncomfortable when a user specified string (in this case
    the filesystem type) goes all of the way to request_module.

    After having looked at this issue I don't think there is any
    particular reason to perform any filtering or permission checks beyond
    making it clear in the module request that we want a filesystem
    module. The common pattern in the kernel is to call request_module()
    without regards to the users permissions. In general all a filesystem
    module does once loaded is call register_filesystem() and go to sleep.
    Which means there is not much attack surface exposed by loading a
    filesystem module unless the filesystem is mounted. In a user
    namespace filesystems are not mounted unless .fs_flags = FS_USERNS_MOUNT,
    which most filesystems do not set today.

    Acked-by: Serge Hallyn
    Acked-by: Kees Cook
    Reported-by: Kees Cook
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
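
The effect of the "fs-" prefix and the aliases can be sketched in userspace like this. It is an illustrative model, not the kernel implementation: `model_fs_module_request`, `model_resolve_alias` and the alias table are assumptions, with the autofs entry taken from the commit message and the ext4 entry added purely as an example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Builds the module request string for a filesystem type. */
static int model_fs_module_request(const char *fstype, char *out, size_t outlen)
{
    int n;

    assert(fstype && out);
    n = snprintf(out, outlen, "fs-%s", fstype);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;   /* -1 on truncation */
}

/* Illustrative alias table mapping requests to module names. */
struct model_alias {
    const char *alias;
    const char *module;
};

static const struct model_alias model_aliases[] = {
    { "fs-autofs", "autofs4" },   /* filesystem name differs from module */
    { "fs-ext4",   "ext4"    },
};

/* Returns the module for a request, or NULL when no alias matches. */
static const char *model_resolve_alias(const char *request)
{
    size_t i;

    assert(request);
    for (i = 0; i < sizeof(model_aliases) / sizeof(model_aliases[0]); i++) {
        if (strcmp(model_aliases[i].alias, request) == 0)
            return model_aliases[i].module;
    }
    return NULL;
}
```

Because only names carrying the prefix are ever requested, a user-supplied filesystem type in this model can resolve to a filesystem module or to nothing at all.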
     

02 Mar, 2013

1 commit