15 Nov, 2012

1 commit

  • Use kuid_t and kgid_t in struct autofs_info and struct autofs_wait_queue.

    When creating directories and symlinks default the uid and gid of
    the mount requester to the global root uid and gid. autofs4_wait
    will update these fields when a mount is requested.

    When generating autofsv5 packets report the uid and gid of the mount
    requestor in user namespace of the process that opened the pipe,
    reporting unmapped uids and gids as overflowuid and overflowgid.

    In autofs_dev_ioctl_requester return the uid and gid of the last mount
    requester converted into the calling processes user namespace. When the
    uid or gid don't map return overflowuid and overflowgid as appropriate,
    allowing failure to find a mount requester to be distinguished from
    failure to map a mount requester.

    The uid and gid mount options specifying the user and group of the
    root autofs inode are converted into kuid and kgid as they are parsed
    defaulting to the current uid and current gid of the process that
    mounts autofs.

    Mounting of autofs for the present remains confined to processes in
    the initial user namespace.

    Cc: Ian Kent
    Acked-by: Serge Hallyn
    Signed-off-by: Eric W. Biederman

    Eric W. Biederman
     

29 May, 2012

1 commit

  • Pull writeback tree from Wu Fengguang:
    "Mainly from Jan Kara to avoid iput() in the flusher threads."

    * tag 'writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux:
    writeback: Avoid iput() from flusher thread
    vfs: Rename end_writeback() to clear_inode()
    vfs: Move waiting for inode writeback from end_writeback() to evict_inode()
    writeback: Refactor writeback_single_inode()
    writeback: Remove wb->list_lock from writeback_single_inode()
    writeback: Separate inode requeueing after writeback
    writeback: Move I_DIRTY_PAGES handling
    writeback: Move requeueing when I_SYNC set to writeback_sb_inodes()
    writeback: Move clearing of I_SYNC into inode_sync_complete()
    writeback: initialize global_dirty_limit
    fs: remove 8 bytes of padding from struct writeback_control on 64 bit builds
    mm: page-writeback.c: local functions should not be exposed globally

    Linus Torvalds
     

06 May, 2012

1 commit

  • After we moved inode_sync_wait() from end_writeback() it doesn't make sense
    to call the function end_writeback() anymore. Rename it to clear_inode()
    which well says what the function really does - set I_CLEAR flag.

    Signed-off-by: Jan Kara
    Signed-off-by: Fengguang Wu

    Jan Kara
     

30 Apr, 2012

1 commit

  • The autofs packet size has had a very unfortunate size problem on x86:
    because the alignment of 'u64' differs in 32-bit and 64-bit modes, and
    because the packet data was not 8-byte aligned, the size of the autofsv5
    packet structure differed between 32-bit and 64-bit modes despite
    looking otherwise identical (300 vs 304 bytes respectively).

    We first fixed that up by making the 64-bit compat mode know about this
    problem in commit a32744d4abae ("autofs: work around unhappy compat
    problem on x86-64"), and that made a 32-bit 'systemd' work happily on a
    64-bit kernel because everything then worked the same way as on a 32-bit
    kernel.

    But it turned out that 'automount' had actually known and worked around
    this problem in user space, so fixing the kernel to do the proper 32-bit
    compatibility handling actually *broke* 32-bit automount on a 64-bit
    kernel, because it knew that the packet sizes were wrong and expected
    those incorrect sizes.

    As a result, we ended up reverting that compatibility mode fix, and
    thus breaking systemd again, in commit fcbf94b9dedd.

    With both automount and systemd doing a single read() system call, and
    verifying that they get *exactly* the size they expect but using
    different sizes, it seemed that fixing one of them inevitably seemed to
    break the other. At one point, a patch I seriously considered applying
    from Michael Tokarev did a "strcmp()" to see if it was automount that
    was doing the operation. Ugly, ugly.

    However, a prettier solution exists now thanks to the packetized pipe
    mode. By marking the communication pipe as being packetized (by simply
    setting the O_DIRECT flag), we can always just write the bigger packet
    size, and if user-space does a smaller read, it will just get that
    partial end result and the extra alignment padding will simply be thrown
    away.

    This makes both automount and systemd happy, since they now get the size
    they asked for, and the kernel side of autofs simply no longer needs to
    care - it could pad out the packet arbitrarily.

    Of course, if there is some *other* user of autofs (please, please,
    please tell me it ain't so - and we haven't heard of any) that tries to
    read the packets with multiple writes, that other user will now be
    broken - the whole point of the packetized mode is that one system call
    gets exactly one packet, and you cannot read a packet in pieces.

    Tested-by: Michael Tokarev
    Cc: Alan Cox
    Cc: David Miller
    Cc: Ian Kent
    Cc: Thomas Meyer
    Cc: stable@kernel.org
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

28 Apr, 2012

1 commit

  • This reverts commit a32744d4abae24572eff7269bc17895c41bd0085.

    While that commit was technically the right thing to do, and made the
    x86-64 compat mode work identically to native 32-bit mode (and thus
    fixing the problem with a 32-bit systemd install on a 64-bit kernel), it
    turns out that the automount binaries had workarounds for this compat
    problem.

    Now, the workarounds are disgusting: doing an "uname()" to find out the
    architecture of the kernel, and then comparing it for the 64-bit cases
    and fixing up the size of the read() in automount for those. And they
    were confused: it's not actually a generic 64-bit issue at all, it's
    very much tied to just x86-64, which has different alignment for an
    'u64' in 64-bit mode than in 32-bit mode.

    But the end result is that fixing the compat layer actually breaks the
    case of a 32-bit automount on a x86-64 kernel.

    There are various approaches to fix this (including just doing a
    "strcmp()" on current->comm and comparing it to "automount"), but I
    think that I will do the one that teaches pipes about a special "packet
    mode", which will allow user space to not have to care too deeply about
    the padding at the end of the autofs packet.

    That change will make the compat workaround unnecessary, so let's revert
    it first, and get automount working again in compat mode. The
    packetized pipes will then fix autofs for systemd.

    Reported-and-requested-by: Michael Tokarev
    Cc: Ian Kent
    Cc: stable@kernel.org # for 3.3
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

21 Mar, 2012

1 commit


26 Feb, 2012

1 commit

  • When the autofs protocol version 5 packet type was added in commit
    5c0a32fc2cd0 ("autofs4: add new packet type for v5 communications"), it
    obvously tried quite hard to be word-size agnostic, and uses explicitly
    sized fields that are all correctly aligned.

    However, with the final "char name[NAME_MAX+1]" array at the end, the
    actual size of the structure ends up being not very well defined:
    because the struct isn't marked 'packed', doing a "sizeof()" on it will
    align the size of the struct up to the biggest alignment of the members
    it has.

    And despite all the members being the same, the alignment of them is
    different: a "__u64" has 4-byte alignment on x86-32, but native 8-byte
    alignment on x86-64. And while 'NAME_MAX+1' ends up being a nice round
    number (256), the name[] array starts out a 4-byte aligned.

    End result: the "packed" size of the structure is 300 bytes: 4-byte, but
    not 8-byte aligned.

    As a result, despite all the fields being in the same place on all
    architectures, sizeof() will round up that size to 304 bytes on
    architectures that have 8-byte alignment for u64.

    Note that this is *not* a problem for 32-bit compat mode on POWER, since
    there __u64 is 8-byte aligned even in 32-bit mode. But on x86, 32-bit
    and 64-bit alignment is different for 64-bit entities, and as a result
    the structure that has exactly the same layout has different sizes.

    So on x86-64, but no other architecture, we will just subtract 4 from
    the size of the structure when running in a compat task. That way we
    will write the properly sized packet that user mode expects.

    Not pretty. Sadly, this very subtle, and unnecessary, size difference
    has been encoded in user space that wants to read packets of *exactly*
    the right size, and will refuse to touch anything else.

    Reported-and-tested-by: Thomas Meyer
    Signed-off-by: Ian Kent
    Signed-off-by: Linus Torvalds

    Ian Kent
     

11 Jan, 2012

1 commit

  • Just serialize the actual writing of packets into pipe on
    a new mutex, independent from everything else in the locking
    hierarchy. As soon as something has started feeding a piece
    of packet into the pipe to daemon, we *want* everything else
    about to try the same to wait until we are done.

    Acked-by: Ian Kent
    Signed-off-by: Al Viro

    Al Viro
     

07 Jan, 2012

1 commit


04 Jan, 2012

1 commit


02 Nov, 2011

1 commit


18 Jan, 2011

7 commits


16 Jan, 2011

6 commits

  • Merge the remaining autofs4 dentry ops tables. It doesn't matter if
    d_automount and d_manage are present on something that's not mountable or
    holdable as these ops are only used if the appropriate flags are set in
    dentry->d_flags.

    [AV] switch to ->s_d_op, since now _everything_ on autofs4 is using the
    same dentry_operations.

    Signed-off-by: David Howells
    Signed-off-by: Al Viro

    David Howells
     
  • When this function is called the local reference count does't need to
    be updated since the dentry is going away and dput definitely must
    not be called here.

    Also the autofs info struct field inode isn't used so remove it.

    Signed-off-by: Ian Kent
    Signed-off-by: David Howells
    Signed-off-by: Al Viro

    Ian Kent
     
  • There are now two distinct dentry operations uses. One for dentrys
    that trigger mounts and one for dentrys that do not.

    Rationalize the use of these dentry operations and rename them to
    reflect their function.

    Signed-off-by: Ian Kent
    Signed-off-by: David Howells
    Signed-off-by: Al Viro

    Ian Kent
     
  • Since the use of ->follow_link() has been eliminated there is no
    need to separate the indirect and direct inode operations.

    Signed-off-by: Ian Kent
    Signed-off-by: David Howells
    Signed-off-by: Al Viro

    Ian Kent
     
  • This patch required a previous patch to add the ->d_automount()
    dentry operation.

    Add a function to use the newly defined ->d_manage() dentry operation
    for blocking during mount and expire.

    Whether the VFS calls the dentry operations d_automount() and d_manage()
    is controled by the DMANAGED_AUTOMOUNT and DMANAGED_TRANSIT flags. autofs
    uses the d_automount() operation to callback to user space to request
    mount operations and the d_manage() operation to block walks into mounts
    that are under construction or destruction.

    In order to prevent these functions from being called unnecessarily the
    DMANAGED_* flags are cleared for cases which would cause this. In the
    common case the DMANAGED_AUTOMOUNT and DMANAGED_TRANSIT flags are both
    set for dentrys waiting to be mounted. The DMANAGED_TRANSIT flag is
    cleared upon successful mount request completion and set during expire
    runs, both during the dentry expire check, and if selected for expire,
    is left set until a subsequent successful mount request completes.

    The exception to this is the so-called rootless multi-mount which has
    no actual mount at its base. In this case the DMANAGED_AUTOMOUNT flag
    is cleared upon successful mount request completion as well and set
    again after a successful expire.

    Signed-off-by: Ian Kent
    Signed-off-by: David Howells
    Signed-off-by: Al Viro

    Ian Kent
     
  • Add a function to use the newly defined ->d_automount() dentry operation
    for triggering mounts instead of doing the user space callback in ->lookup()
    and ->d_revalidate().

    Note, to be useful the subsequent patch to add the ->d_manage() dentry
    operation is also needed so the discussion of functionality is deferred to
    that patch.

    Signed-off-by: Ian Kent
    Signed-off-by: David Howells
    Signed-off-by: Al Viro

    Ian Kent
     

07 Jan, 2011

1 commit

  • Reduce some branches and memory accesses in dcache lookup by adding dentry
    flags to indicate common d_ops are set, rather than having to check them.
    This saves a pointer memory access (dentry->d_op) in common path lookup
    situations, and saves another pointer load and branch in cases where we
    have d_op but not the particular operation.

    Patched with:

    git grep -E '[.>]([[:space:]])*d_op([[:space:]])*=' | xargs sed -e 's/\([^\t ]*\)->d_op = \(.*\);/d_set_d_op(\1, \2);/' -e 's/\([^\t ]*\)\.d_op = \(.*\);/d_set_d_op(\&\1, \2);/' -i

    Signed-off-by: Nick Piggin

    Nick Piggin
     

26 Oct, 2010

1 commit

  • Instead of always assigning an increasing inode number in new_inode
    move the call to assign it into those callers that actually need it.
    For now callers that need it is estimated conservatively, that is
    the call is added to all filesystems that do not assign an i_ino
    by themselves. For a few more filesystems we can avoid assigning
    any inode number given that they aren't user visible, and for others
    it could be done lazily when an inode number is actually needed,
    but that's left for later patches.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Dave Chinner
    Signed-off-by: Al Viro

    Christoph Hellwig
     

04 Mar, 2010

2 commits


16 Dec, 2009

2 commits

  • We need to be able to cope with the directory mutex being held during
    ->d_revalidate() in some cases, but not all cases, and not necessarily by
    us. Because we need to release the mutex when we call back to the daemon
    to do perform a mount we must be sure that it is us who holds the mutex so
    we must redirect mount requests to ->lookup() if the mutex is held.

    Signed-off-by: Ian Kent
    Cc: Sage Weil
    Cc: Al Viro
    Cc: Andreas Dilger
    Cc: Christoph Hellwig
    Cc: Yehuda Saheh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ian Kent
     
  • Define some simple helper functions for adding and deleting entries on the
    active (and unhashed) dentry list.

    Signed-off-by: Ian Kent
    Cc: Sage Weil
    Cc: Al Viro
    Cc: Andreas Dilger
    Cc: Christoph Hellwig
    Cc: Yehuda Saheh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ian Kent
     

28 Mar, 2009

1 commit


07 Jan, 2009

1 commit

  • - the type assigned at mount when no type is given is changed
    from 0 to AUTOFS_TYPE_INDIRECT. This was done because 0 and
    AUTOFS_TYPE_INDIRECT were being treated implicitly as the same
    type.

    - previously, an offset mount had it's type set to
    AUTOFS_TYPE_DIRECT|AUTOFS_TYPE_OFFSET but the mount control
    re-implementation needs to be able distinguish all three types.
    So this was changed to make the type setting explicit.

    - a type AUTOFS_TYPE_ANY was added for use by the re-implementation
    when checking if a given path is a mountpoint. It's not really a
    type as we use this to ask if a given path is a mountpoint in the
    autofs_dev_ioctl_ismountpoint() function.

    - functions to set and test the autofs mount types have been added to
    improve readability and make the type usage explicit.

    - the mount type is used from user space for the mount control
    re-implementtion so, for consistency, all the definitions have
    been moved to the user space include file include/linux/auto_fs4.h.

    Signed-off-by: Ian Kent
    Signed-off-by: Jeff Moyer
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ian Kent
     

06 Jan, 2009

1 commit


14 Nov, 2008

1 commit

  • Wrap access to task credentials so that they can be separated more easily from
    the task_struct during the introduction of COW creds.

    Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

    Change some task->e?[ug]id to task_e?[ug]id(). In some places it makes more
    sense to use RCU directly rather than a convenient wrapper; these will be
    addressed by later patches.

    Signed-off-by: David Howells
    Reviewed-by: James Morris
    Acked-by: Serge Hallyn
    Cc: Ian Kent
    Cc: autofs@linux.kernel.org
    Signed-off-by: James Morris

    David Howells
     

17 Oct, 2008

2 commits


14 Oct, 2008

1 commit

  • This is a much better version of a previous patch to make the parser
    tables constant. Rather than changing the typedef, we put the "const" in
    all the various places where its required, allowing the __initconst
    exception for nfsroot which was the cause of the previous trouble.

    This was posted for review some time ago and I believe its been in -mm
    since then.

    Signed-off-by: Steven Whitehouse
    Cc: Alexander Viro
    Signed-off-by: Linus Torvalds

    Steven Whitehouse
     

25 Jul, 2008

3 commits

  • The autofs4_catatonic_mode() function accesses the wait queue without any
    locking but can be called at any time. This could lead to a possible
    double free of the name field of the wait and a double fput of the daemon
    communication pipe or an fput of a NULL file pointer.

    Signed-off-by: Ian Kent
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ian Kent
     
  • A while ago a patch to resolve a deadlock during directory creation was
    merged. This delayed the hashing of lookup dentrys until the ->mkdir()
    (or ->symlink()) operation completed to ensure we always went through
    ->lookup() instead of also having processes go through ->revalidate() so
    our VFS locking remained consistent.

    Now we are seeing a couple of side affects of that change in situations
    with heavy mount activity.

    Two cases have been identified:

    1) When a mount request is triggered, due to the delayed hashing, the
    directory created by user space for the mount point doesn't have the
    DCACHE_AUTOFS_PENDING flag set. In the case of an autofs multi-mount
    where a tree of mount point directories are created this can lead to
    the path walk continuing rather than the dentry being sent to the wait
    queue to wait for request completion. This is because, if the pending
    flag isn't set, the criteria for deciding this is a mount in progress
    fails to hold, namely that the dentry is not a mount point and has no
    subdirectories.

    2) A mount request dentry is initially created negative and unhashed.
    It remains this way until the ->mkdir() callback completes. Since it
    is unhashed a fresh dentry is used when the user space mount request
    creates the mount point directory. This leaves the original dentry
    negative and unhashed. But revalidate has no way to tell the VFS that
    the dentry has changed, other than to force another ->lookup() by
    returning false, which is at best wastefull and at worst not possible.
    This results in an -ENOENT return from the original path walk when in
    fact the mount succeeded.

    To resolve this we need to ensure that the same dentry is used in all
    calls to ->lookup() during the course of a mount request. This patch
    achieves that by adding the initial dentry to a look aside list and
    removes it at ->mkdir() or ->symlink() completion (or when the dentry is
    released), since these are the only create operations autofs4 supports.

    Signed-off-by: Ian Kent
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ian Kent
     
  • Correct the error of making a positive dentry negative after it has been
    instantiated.

    The code that makes this error attempts to re-use the dentry from a
    concurrent expire and mount to resolve a race and the dentry used for the
    lookup must be negative for mounts to trigger in the required cases. The
    fact is that the dentry doesn't need to be re-used because all that is
    needed is to preserve the flag that indicates an expire is still
    incomplete at the time of the mount request.

    This change uses the the dentry to check the flag and wait for the expire
    to complete then discards it instead of attempting to re-use it.

    Signed-off-by: Ian Kent
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ian Kent