29 May, 2012

1 commit

  • Pull writeback tree from Wu Fengguang:
    "Mainly from Jan Kara to avoid iput() in the flusher threads."

    * tag 'writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux:
    writeback: Avoid iput() from flusher thread
    vfs: Rename end_writeback() to clear_inode()
    vfs: Move waiting for inode writeback from end_writeback() to evict_inode()
    writeback: Refactor writeback_single_inode()
    writeback: Remove wb->list_lock from writeback_single_inode()
    writeback: Separate inode requeueing after writeback
    writeback: Move I_DIRTY_PAGES handling
    writeback: Move requeueing when I_SYNC set to writeback_sb_inodes()
    writeback: Move clearing of I_SYNC into inode_sync_complete()
    writeback: initialize global_dirty_limit
    fs: remove 8 bytes of padding from struct writeback_control on 64 bit builds
    mm: page-writeback.c: local functions should not be exposed globally

    Linus Torvalds
     

06 May, 2012

1 commit

  • After we moved inode_sync_wait() from end_writeback() it doesn't make sense
    to call the function end_writeback() anymore. Rename it to clear_inode()
    which well says what the function really does - set I_CLEAR flag.

    Signed-off-by: Jan Kara
    Signed-off-by: Fengguang Wu

    Jan Kara
     

30 Apr, 2012

1 commit

  • The autofs packet size has had a very unfortunate size problem on x86:
    because the alignment of 'u64' differs in 32-bit and 64-bit modes, and
    because the packet data was not 8-byte aligned, the size of the autofsv5
    packet structure differed between 32-bit and 64-bit modes despite
    looking otherwise identical (300 vs 304 bytes respectively).

    We first fixed that up by making the 64-bit compat mode know about this
    problem in commit a32744d4abae ("autofs: work around unhappy compat
    problem on x86-64"), and that made a 32-bit 'systemd' work happily on a
    64-bit kernel because everything then worked the same way as on a 32-bit
    kernel.

    But it turned out that 'automount' had actually known and worked around
    this problem in user space, so fixing the kernel to do the proper 32-bit
    compatibility handling actually *broke* 32-bit automount on a 64-bit
    kernel, because it knew that the packet sizes were wrong and expected
    those incorrect sizes.

    As a result, we ended up reverting that compatibility mode fix, and
    thus breaking systemd again, in commit fcbf94b9dedd.

    With both automount and systemd doing a single read() system call, and
    verifying that they get *exactly* the size they expect but using
    different sizes, it seemed that fixing one of them inevitably seemed to
    break the other. At one point, a patch I seriously considered applying
    from Michael Tokarev did a "strcmp()" to see if it was automount that
    was doing the operation. Ugly, ugly.

    However, a prettier solution exists now thanks to the packetized pipe
    mode. By marking the communication pipe as being packetized (by simply
    setting the O_DIRECT flag), we can always just write the bigger packet
    size, and if user-space does a smaller read, it will just get that
    partial end result and the extra alignment padding will simply be thrown
    away.

    This makes both automount and systemd happy, since they now get the size
    they asked for, and the kernel side of autofs simply no longer needs to
    care - it could pad out the packet arbitrarily.

    Of course, if there is some *other* user of autofs (please, please,
    please tell me it ain't so - and we haven't heard of any) that tries to
    read the packets with multiple writes, that other user will now be
    broken - the whole point of the packetized mode is that one system call
    gets exactly one packet, and you cannot read a packet in pieces.

    Tested-by: Michael Tokarev
    Cc: Alan Cox
    Cc: David Miller
    Cc: Ian Kent
    Cc: Thomas Meyer
    Cc: stable@kernel.org
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

28 Apr, 2012

1 commit

  • This reverts commit a32744d4abae24572eff7269bc17895c41bd0085.

    While that commit was technically the right thing to do, and made the
    x86-64 compat mode work identically to native 32-bit mode (and thus
    fixing the problem with a 32-bit systemd install on a 64-bit kernel), it
    turns out that the automount binaries had workarounds for this compat
    problem.

    Now, the workarounds are disgusting: doing an "uname()" to find out the
    architecture of the kernel, and then comparing it for the 64-bit cases
    and fixing up the size of the read() in automount for those. And they
    were confused: it's not actually a generic 64-bit issue at all, it's
    very much tied to just x86-64, which has different alignment for an
    'u64' in 64-bit mode than in 32-bit mode.

    But the end result is that fixing the compat layer actually breaks the
    case of a 32-bit automount on a x86-64 kernel.

    There are various approaches to fix this (including just doing a
    "strcmp()" on current->comm and comparing it to "automount"), but I
    think that I will do the one that teaches pipes about a special "packet
    mode", which will allow user space to not have to care too deeply about
    the padding at the end of the autofs packet.

    That change will make the compat workaround unnecessary, so let's revert
    it first, and get automount working again in compat mode. The
    packetized pipes will then fix autofs for systemd.

    Reported-and-requested-by: Michael Tokarev
    Cc: Ian Kent
    Cc: stable@kernel.org # for 3.3
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

30 Mar, 2012

1 commit

  • Pull x32 support for x86-64 from Ingo Molnar:
    "This tree introduces the X32 binary format and execution mode for x86:
    32-bit data space binaries using 64-bit instructions and 64-bit kernel
    syscalls.

    This allows applications whose working set fits into a 32 bits address
    space to make use of 64-bit instructions while using a 32-bit address
    space with shorter pointers, more compressed data structures, etc."

    Fix up trivial context conflicts in arch/x86/{Kconfig,vdso/vma.c}

    * 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (71 commits)
    x32: Fix alignment fail in struct compat_siginfo
    x32: Fix stupid ia32/x32 inversion in the siginfo format
    x32: Add ptrace for x32
    x32: Switch to a 64-bit clock_t
    x32: Provide separate is_ia32_task() and is_x32_task() predicates
    x86, mtrr: Use explicit sizing and padding for the 64-bit ioctls
    x86/x32: Fix the binutils auto-detect
    x32: Warn and disable rather than error if binutils too old
    x32: Only clear TIF_X32 flag once
    x32: Make sure TS_COMPAT is cleared for x32 tasks
    fs: Remove missed ->fds_bits from cessation use of fd_set structs internally
    fs: Fix close_on_exec pointer in alloc_fdtable
    x32: Drop non-__vdso weak symbols from the x32 VDSO
    x32: Fix coding style violations in the x32 VDSO code
    x32: Add x32 VDSO support
    x32: Allow x32 to be configured
    x32: If configured, add x32 system calls to system call tables
    x32: Handle process creation
    x32: Signal-related system calls
    x86: Add #ifdef CONFIG_COMPAT to
    ...

    Linus Torvalds
     

21 Mar, 2012

2 commits


26 Feb, 2012

1 commit

  • When the autofs protocol version 5 packet type was added in commit
    5c0a32fc2cd0 ("autofs4: add new packet type for v5 communications"), it
    obvously tried quite hard to be word-size agnostic, and uses explicitly
    sized fields that are all correctly aligned.

    However, with the final "char name[NAME_MAX+1]" array at the end, the
    actual size of the structure ends up being not very well defined:
    because the struct isn't marked 'packed', doing a "sizeof()" on it will
    align the size of the struct up to the biggest alignment of the members
    it has.

    And despite all the members being the same, the alignment of them is
    different: a "__u64" has 4-byte alignment on x86-32, but native 8-byte
    alignment on x86-64. And while 'NAME_MAX+1' ends up being a nice round
    number (256), the name[] array starts out a 4-byte aligned.

    End result: the "packed" size of the structure is 300 bytes: 4-byte, but
    not 8-byte aligned.

    As a result, despite all the fields being in the same place on all
    architectures, sizeof() will round up that size to 304 bytes on
    architectures that have 8-byte alignment for u64.

    Note that this is *not* a problem for 32-bit compat mode on POWER, since
    there __u64 is 8-byte aligned even in 32-bit mode. But on x86, 32-bit
    and 64-bit alignment is different for 64-bit entities, and as a result
    the structure that has exactly the same layout has different sizes.

    So on x86-64, but no other architecture, we will just subtract 4 from
    the size of the structure when running in a compat task. That way we
    will write the properly sized packet that user mode expects.

    Not pretty. Sadly, this very subtle, and unnecessary, size difference
    has been encoded in user space that wants to read packets of *exactly*
    the right size, and will refuse to touch anything else.

    Reported-and-tested-by: Thomas Meyer
    Signed-off-by: Ian Kent
    Signed-off-by: Linus Torvalds

    Ian Kent
     

20 Feb, 2012

1 commit

  • Wrap accesses to the fd_sets in struct fdtable (for recording open files and
    close-on-exec flags) so that we can move away from using fd_sets since we
    abuse the fd_set structs by not allocating the full-sized structure under
    normal circumstances and by non-core code looking at the internals of the
    fd_sets.

    The first abuse means that use of FD_ZERO() on these fd_sets is not permitted,
    since that cannot be told about their abnormal lengths.

    This introduces six wrapper functions for setting, clearing and testing
    close-on-exec flags and fd-is-open flags:

    void __set_close_on_exec(int fd, struct fdtable *fdt);
    void __clear_close_on_exec(int fd, struct fdtable *fdt);
    bool close_on_exec(int fd, const struct fdtable *fdt);
    void __set_open_fd(int fd, struct fdtable *fdt);
    void __clear_open_fd(int fd, struct fdtable *fdt);
    bool fd_is_open(int fd, const struct fdtable *fdt);

    Note that I've prepended '__' to the names of the set/clear functions because
    they require the caller to hold a lock to use them.

    Note also that I haven't added wrappers for looking behind the scenes at the
    the array. Possibly that should exist too.

    Signed-off-by: David Howells
    Link: http://lkml.kernel.org/r/20120216174942.23314.1364.stgit@warthog.procyon.org.uk
    Signed-off-by: H. Peter Anvin
    Cc: Al Viro

    David Howells
     

14 Feb, 2012

1 commit

  • When recursing down the locks when traversing a tree/list in
    get_next_positive_dentry() or get_next_positive_subdir() a lock can
    change from being nested to being a parent which breaks lockdep. This
    patch tells lockdep about what we did.

    Signed-off-by: Steven Rostedt
    Acked-by: Ian Kent
    Signed-off-by: Al Viro

    Steven Rostedt
     

14 Jan, 2012

1 commit

  • I don't know how I missed this obvious mistake when I
    reviewed Als' patches, sorry.

    [ Quoting Al:

    Grr... Note to self: do git status *and* git stash show -p
    before git push. Nothing like "WTF? I'd fixed that braino"
    feeling ;-/

    Al sent the same patch - it got broken in commit d668dc56631d:
    "autofs4: deal with autofs4_write/autofs4_write races". ]

    Reported-and-tested-by: Dave Airlie
    Signed-off-by: Ian Kent
    Signed-off-by: Al Viro
    Signed-off-by: Linus Torvalds

    Ian Kent
     

11 Jan, 2012

3 commits

  • Just serialize the actual writing of packets into pipe on
    a new mutex, independent from everything else in the locking
    hierarchy. As soon as something has started feeding a piece
    of packet into the pipe to daemon, we *want* everything else
    about to try the same to wait until we are done.

    Acked-by: Ian Kent
    Signed-off-by: Al Viro

    Al Viro
     
  • we need to hold ->wq_mutex while we are forming the packet to send,
    lest we have autofs4_catatonic_mode() setting wq->name.name to NULL
    just as autofs4_notify_daemon() decides to memcpy() from it...

    We do have check for catatonic mode immediately after that (under
    ->wq_mutex, as it ought to be) and packet won't be actually sent,
    but it'll be too late for us if we oops on that memcpy() from NULL...

    Fix is obvious - just extend the area covered by ->wq_mutex over
    that switch and check whether it's catatonic *before* doing anything
    else.

    Acked-by: Ian Kent
    Signed-off-by: Al Viro

    Al Viro
     
  • We need to recheck ->catatonic after autofs4_wait() got ->wq_mutex
    for good, or we might end up with wq inserted into queue after
    autofs4_catatonic_mode() had done its thing. It will stick there
    forever, since there won't be anything to clear its ->name.name.

    A bit of a complication: validate_request() drops and regains ->wq_mutex.
    It actually ends up the most convenient place to stick the check into...

    Acked-by: Ian Kent
    Signed-off-by: Al Viro

    Al Viro
     

07 Jan, 2012

2 commits


04 Jan, 2012

2 commits


02 Nov, 2011

1 commit


09 Aug, 2011

2 commits

  • The previous comit made the autofs4 debug printouts check types against
    the printout format, and uncovered this bug:

    fs/autofs4/waitq.c:106:2: warning: format ‘%08lx’ expects type ‘long unsigned int’, but argument 4 has type ‘autofs_wqt_t’

    which is due to the insane type for wait_queue_token. That thing should
    be some fixed well-defined size (preferably just 'unsigned int' or
    'u32') but for unexplained reasons it is randomly either 'unsigned long'
    or 'unsigned int' depending on the architecture.

    For now, cast it to 'unsigned long' for printing, the way we do
    elsewhere. Somebody else can try to explain the typedef mess.

    (There's a reason we don't support excessive use of typedefs in the
    kernel: it's usually just a good way of confusing yourself).

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • Use 'pr_debug()' for DPRINTK, which will do the proper type checking on
    the arguments (without generating code) even when DEBUG isn't #defined.

    Also, use the standard __VA_ARGS__ for the macros, and stop the
    pointless abuse of 'do { xyz } while (0)' when the macro is already a
    perfectly well-formed single statement.

    Reported-by: David Howells
    Suggested-by: Joe Perches
    Cc: Ian Kent
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

30 May, 2011

1 commit


26 May, 2011

1 commit

  • Only a few file systems need this. Start by pushing it down into each
    fs rmdir method (except gfs2 and xfs) so it can be dealt with on a per-fs
    basis.

    This does not change behavior for any in-tree file systems.

    Acked-by: Christoph Hellwig
    Signed-off-by: Sage Weil
    Signed-off-by: Al Viro

    Sage Weil
     

31 Mar, 2011

1 commit


25 Mar, 2011

6 commits

  • …s_dev_ioctl_setpipefd()

    In fs/autofs4/dev-ioctl.c::autofs_dev_ioctl_setpipefd() we call fget(),
    which may return NULL, but we do not explicitly test for that NULL return
    so we may end up dereferencing a NULL pointer - bad.

    When I originally submitted this patch I had chosen EBUSY as the return
    value to use if this happens. Ian Kent was kind enough to explain why that
    would most likely be wrong and why EBADF should most likely be used
    instead. This version of the patch uses EBADF.

    Signed-off-by: Jesper Juhl <jj@chaosbits.net>
    Signed-off-by: Ian Kent <raven@themaw.net>
    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

    Jesper Juhl
     
  • The autofs4_lock introduced by the rcu-walk changes has unnecessarily
    broad scope. The locking is better handled by the per-autofs super
    block lookup_lock.

    Signed-off-by: Ian Kent
    Acked-by: David Howells
    Signed-off-by: Al Viro

    Ian Kent
     
  • The daemon never needs to block and, in the rcu-walk case an error
    return isn't used, so always return zero.

    Signed-off-by: Ian Kent
    Signed-off-by: Al Viro

    Ian Kent
     
  • The vfs-scale changes changed the traversal used in
    autofs4_expire_indirect() from a list to a depth first tree traversal
    which isn't right.

    Signed-off-by: Ian Kent
    Signed-off-by: Al Viro

    Ian Kent
     
  • There is a missing dput() when returning from autofs4_expire_direct()
    when we see that the dentry is already a pending mount.

    Signed-off-by: Ian Kent
    Acked-by: David Howells
    Signed-off-by: Al Viro

    Ian Kent
     
  • When direct (and offset) mounts were introduced the the last used
    timeout could no longer be updated in ->d_revalidate(). This is
    because covered direct mounts would be followed over without calling
    the autofs file system. As a result the definition of the busyness
    check for all entries was changed to be "actually busy" being an open
    file or working directory within the automount. But now we have a call
    back in the follow so the last used update on any access can be
    re-instated. This requires DCACHE_MANAGE_TRANSIT to always be set.

    Signed-off-by: Ian Kent
    Signed-off-by: Al Viro

    Ian Kent
     

18 Mar, 2011

1 commit


18 Jan, 2011

9 commits