05 Feb, 2015

1 commit


31 Jan, 2015

2 commits


30 Jan, 2015

1 commit

  • Pull NFS client bugfixes from Trond Myklebust:
    "Highlights include:

    - Stable fix for a NFSv4.1 Oops on mount
    - Stable fix for an O_DIRECT deadlock condition
    - Fix an issue with submounted volumes and fake duplicate inode
    numbers"

    * tag 'nfs-for-3.19-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
    NFS: Fix use of nfs_attr_use_mounted_on_fileid()
    NFSv4.1: Fix an Oops in nfs41_walk_client_list
    nfs: fix dio deadlock when O_DIRECT flag is flipped

    Linus Torvalds
     

28 Jan, 2015

3 commits

  • Currently ->get_dqblk() and ->set_dqblk() use struct fs_disk_quota, which
    tracks space limits and usage in 512-byte blocks. However, VFS quotas
    track usage in bytes (as some filesystems require that), and we need to
    pass this information somehow. Up to now this wasn't a problem because we
    didn't do any unit conversion (thus VFS quota routines happily stuck the
    number of bytes into the d_bcount field of struct fs_disk_quota). Only if
    you tried to use Q_XGETQUOTA or Q_XSETQLIM for VFS quotas (or Q_GETQUOTA
    / Q_SETQUOTA for XFS quotas) did you get bogus results. Hardly anyone
    tried this, but reportedly some Samba users hit the problem in practice.
    So if we want the interfaces to be compatible, we need to fix this.

    We bite the bullet and define another quota structure used for passing
    information from/to ->get_dqblk()/->set_dqblk(). It's somewhat sad that we
    need more conversion routines in fs/quota/quota.c, and the extra copy of
    the quota structure slows down fetching quota information by about 2%,
    but it seems cleaner than overloading e.g. the units of d_bcount to bytes.

    CC: stable@vger.kernel.org
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Jan Kara

    Jan Kara
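
The unit mismatch described in the quota commit above can be illustrated with a minimal user-space sketch. The helper names `blocks_to_bytes` and `bytes_to_blocks` are hypothetical and are not the conversion routines in fs/quota/quota.c; the only facts taken from the commit message are that struct fs_disk_quota counts in 512-byte blocks while VFS quotas count in bytes. Rounding up in the bytes-to-blocks direction is an assumption made for the example.

```c
#include <stdint.h>

/* Block size used by the XFS-style quota interface: 2^9 = 512 bytes. */
#define QUOTA_BLOCK_SHIFT 9

/* Hypothetical helper: convert a count of 512-byte blocks to bytes. */
static inline uint64_t blocks_to_bytes(uint64_t blocks)
{
    return blocks << QUOTA_BLOCK_SHIFT;
}

/*
 * Hypothetical helper: convert bytes to 512-byte blocks, rounding up so
 * that a partially used block still counts as one block (an assumption
 * of this sketch). Overflow near UINT64_MAX is not handled.
 */
static inline uint64_t bytes_to_blocks(uint64_t bytes)
{
    return (bytes + (1u << QUOTA_BLOCK_SHIFT) - 1) >> QUOTA_BLOCK_SHIFT;
}
```

Passing a byte count where a block count is expected (or the reverse) is off by a factor of 512, which is the kind of bogus result the commit message describes.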
     
  • Commit 6fb1ca92a640 "udf: Fix race between write(2) and close(2)"
    changed the condition when preallocation is released. The idea was that
    we don't want to release the preallocation for an inode on close when
    there are other writeable file descriptors for the inode. However the
    condition was written in the opposite way so we released preallocation
    only if there were other writeable file descriptors. Fix the problem by
    changing the condition properly.

    CC: stable@vger.kernel.org
    Fixes: 6fb1ca92a6409a9d5b0696447cd4997bc9aaf5a2
    Reported-by: Fabian Frederick
    Signed-off-by: Jan Kara

    Jan Kara
     
  • The xfstests btrfs/072 reports uncorrectable read errors in dmesg,
    because scrub forgets to use commit_root for the parity scrub routine
    and attempts to scrub extent items whose contents are not fully on
    disk.

    To fix it, we just add the @search_commit_root flag back.

    Signed-off-by: Gui Hecheng
    Signed-off-by: Qu Wenruo
    Reviewed-by: Miao Xie
    Signed-off-by: Chris Mason

    Gui Hecheng
     

27 Jan, 2015

1 commit

  • If CONFIG_CIFS_WEAK_PW_HASH is not set, CIFSSEC_MUST_LANMAN
    and CIFSSEC_MUST_PLNTXT are defined as 0.

    When setting new SecurityFlags without any MUST flags,
    your flags would be overwritten with CIFSSEC_MUST_LANMAN (0).

    Signed-off-by: Niklas Cassel
    Signed-off-by: Steve French

    Niklas Cassel
     

26 Jan, 2015

1 commit


24 Jan, 2015

1 commit

  • Pull btrfs fixes from Chris Mason:
    "We have a few fixes in my for-linus branch.

    Qu Wenruo's batch fixes a regression between some of our merge window
    pulls and the inode_cache feature. The rest are smaller bugs"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
    btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock.
    btrfs: Fix the bug that fs_info->pending_changes is never cleared.
    btrfs: fix state->private cast on 32 bit machines
    Btrfs: fix race deleting block group from space_info->ro_bgs list
    Btrfs: fix incorrect freeing in scrub_stripe
    btrfs: sync ioctl, handle errors after transaction start

    Linus Torvalds
     

22 Jan, 2015

3 commits

  • This function call was being optimized out during nfs_fhget(), leading
    to situations where we have a valid fileid but still want to use the
    mounted_on_fileid. For example, imagine we have our server configured
    like this:

    server % df
    Filesystem Size Used Avail Use% Mounted on
    /dev/vda1 9.1G 6.5G 1.9G 78% /
    /dev/vdb1 487M 2.3M 456M 1% /exports
    /dev/vdc1 487M 2.3M 456M 1% /exports/vol1
    /dev/vdd1 487M 2.3M 456M 1% /exports/vol2

    If our client mounts /exports and tries to do a "chown -R" across the
    entire mountpoint, we will get a nasty message warning us about a circular
    directory structure. Running chown with strace tells me that each directory
    has the same device and inode number:

    newfstatat(AT_FDCWD, "/nfs/", {st_dev=makedev(0, 38), st_ino=2, ...}) = 0
    newfstatat(4, "vol1", {st_dev=makedev(0, 38), st_ino=2, ...}) = 0
    newfstatat(4, "vol2", {st_dev=makedev(0, 38), st_ino=2, ...}) = 0

    With this patch the mounted_on_fileid values are used for st_ino, so the
    directory loop warning isn't reported.

    Signed-off-by: Anna Schumaker
    Signed-off-by: Trond Myklebust

    Anna Schumaker
     
  • If we start state recovery on a client that failed to initialise correctly,
    then we are very likely to Oops.

    Reported-by: "Mkrtchyan, Tigran"
    Link: http://lkml.kernel.org/r/130621862.279655.1421851650684.JavaMail.zimbra@desy.de
    Cc: stable@vger.kernel.org
    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • We only support swap files calling nfs_direct_IO. However, an application
    might be able to reach nfs_direct_IO if it toggles the O_DIRECT flag
    during IO, and it can deadlock because we grab inode->i_mutex in
    nfs_file_direct_write(). So return 0 in such cases; the generic
    layer will then fall back to buffered IO.

    Signed-off-by: Peng Tao
    Cc: stable@vger.kernel.org
    Signed-off-by: Trond Myklebust

    Peng Tao
     

21 Jan, 2015

2 commits

  • Commit 6b5fe46dfa52 (btrfs: do commit in sync_fs if there are pending
    changes) will call btrfs_start_transaction() in sync_fs(), to handle
    some operations that need to be done in the next transaction.

    However, this can cause a deadlock if the filesystem is frozen, with the
    following sysrq+w output:
    [ 143.255932] Call Trace:
    [ 143.255936] [] schedule+0x29/0x70
    [ 143.255939] [] __sb_start_write+0xb3/0x100
    [ 143.255971] [] start_transaction+0x2e6/0x5a0 [btrfs]
    [ 143.255992] [] btrfs_start_transaction+0x1b/0x20 [btrfs]
    [ 143.256003] [] btrfs_sync_fs+0xca/0xd0 [btrfs]
    [ 143.256007] [] sync_fs_one_sb+0x20/0x30
    [ 143.256011] [] iterate_supers+0xe1/0xf0
    [ 143.256014] [] sys_sync+0x55/0x90
    [ 143.256017] [] system_call_fastpath+0x12/0x17
    [ 143.256111] Call Trace:
    [ 143.256114] [] schedule+0x29/0x70
    [ 143.256119] [] rwsem_down_write_failed+0x1c5/0x2d0
    [ 143.256123] [] call_rwsem_down_write_failed+0x13/0x20
    [ 143.256131] [] thaw_super+0x28/0xc0
    [ 143.256135] [] do_vfs_ioctl+0x3f5/0x540
    [ 143.256187] [] SyS_ioctl+0x91/0xb0
    [ 143.256213] [] system_call_fastpath+0x12/0x17

    The reason is as follows:
    (Holding s_umount)
    VFS sync_fs stuff:
    |- btrfs_sync_fs()
       |- btrfs_start_transaction()
          |- sb_start_intwrite()
             (Waiting for thaw_fs to unfreeze)
                                        VFS thaw_fs stuff:
                                        thaw_fs()
                                        (Waiting for sync_fs to release
                                         s_umount)
    So a deadlock happens.
    This can be easily triggered by fstest/generic/068 with the inode_cache
    mount option.

    The fix is to check whether the fs is frozen; if it is, just return and
    wait for the next transaction.

    Cc: David Sterba
    Reported-by: Gui Hecheng
    Signed-off-by: Qu Wenruo
    [enhanced comment, changed to SB_FREEZE_WRITE]
    Signed-off-by: David Sterba
    Signed-off-by: Chris Mason

    Qu Wenruo
     
  • fs_info->pending_changes is never cleared since the original code uses
    cmpxchg(&fs_info->pending_changes, 0, 0), which will only clear it if
    pending_changes is already 0.

    This causes a lot of problems when the filesystem is mounted with the
    inode_cache mount option: pending_changes will always be 1, even when
    the fs is frozen.

    Signed-off-by: Qu Wenruo
    Reviewed-by: David Sterba
    Signed-off-by: David Sterba
    Signed-off-by: Chris Mason

    Qu Wenruo
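
The cmpxchg pitfall in the commit above can be reproduced in user space with C11 atomics. This is a sketch under stated assumptions: the function names are hypothetical, `atomic_compare_exchange_strong` stands in for the kernel's cmpxchg(), and an unconditional exchange is shown as one way to get fetch-and-clear semantics, without claiming it is the exact change that was merged.

```c
#include <stdatomic.h>

/*
 * Buggy pattern: compare-and-swap with expected == new == 0 only stores
 * when the value is already 0, so a non-zero value is never cleared.
 * Returns the value that was observed.
 */
static unsigned long fetch_and_clear_buggy(atomic_ulong *pending)
{
    unsigned long expected = 0;

    atomic_compare_exchange_strong(pending, &expected, 0);
    /* On failure, 'expected' was overwritten with the current value. */
    return expected;
}

/* Working pattern: unconditionally swap in 0 and return the old value. */
static unsigned long fetch_and_clear(atomic_ulong *pending)
{
    return atomic_exchange(pending, 0);
}
```

With a starting value of 1, the buggy variant reports 1 and leaves 1 in place, while the exchange variant reports 1 and leaves 0.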
     

20 Jan, 2015

6 commits

  • Commit
    c11f1df5003d534fd067f0168bfad7befffb3b5c
    requires writers to wait for any pending oplock break handler to
    complete before proceeding to write. This is done by waiting on bit
    CIFS_INODE_PENDING_OPLOCK_BREAK in cifsFileInfo->flags. This bit is
    cleared by the oplock break handler job queued on the workqueue once it
    has completed handling the oplock break allowing writers to proceed with
    writing to the file.

    While testing, it was noticed that the filehandle could be closed while
    there is a pending oplock break which results in the oplock break
    handler on the cifsiod workqueue being cancelled before it has had a
    chance to execute and clear the CIFS_INODE_PENDING_OPLOCK_BREAK bit.
    Any subsequent attempt to write to this file hangs waiting for the
    CIFS_INODE_PENDING_OPLOCK_BREAK bit to be cleared.

    We fix this by ensuring that we also clear the bit
    CIFS_INODE_PENDING_OPLOCK_BREAK when we remove the oplock break handler
    from the workqueue.

    The bug was found by Red Hat QA while testing using ltp's fsstress
    command.

    Signed-off-by: Sachin Prabhu
    Acked-by: Shirish Pargaonkar
    Signed-off-by: Jeff Layton
    Cc: stable@vger.kernel.org
    Signed-off-by: Steve French

    Sachin Prabhu
     
  • When leaving a function use memzero_explicit instead of memset(0) to
    clear stack allocated buffers. memset(0) may be optimized away.

    This particular buffer is highly likely to contain sensitive data which
    we shouldn't leak (it's named 'passwd' after all).

    Signed-off-by: Giel van Schijndel
    Acked-by: Herbert Xu
    Reported-at: http://www.viva64.com/en/b/0299/
    Reported-by: Andrey Karpov
    Reported-by: Svyatoslav Razmyslov
    Signed-off-by: Steve French

    Giel van Schijndel
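
Why a plain memset() before return can disappear, and how an explicit variant avoids that, can be sketched in user space. `zero_explicit` is a hypothetical stand-in and is not the kernel's memzero_explicit() implementation; it relies on the fact that accesses through a volatile-qualified pointer may not be optimized away.

```c
#include <stddef.h>

/*
 * Hypothetical user-space stand-in for memzero_explicit(): the stores go
 * through a volatile pointer, so the compiler cannot treat them as dead
 * stores to a buffer that is about to go out of scope.
 */
static void zero_explicit(void *buf, size_t len)
{
    volatile unsigned char *p = buf;

    while (len--)
        *p++ = 0;
}
```

A caller would invoke `zero_explicit(passwd, sizeof(passwd))` on a stack buffer just before returning.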
     
  • Suppress the following warning displayed when building a 32-bit (i686) kernel.

    ===============================================================================
    ...
    CC [M] fs/btrfs/extent_io.o
    fs/btrfs/extent_io.c: In function ‘btrfs_free_io_failure_record’:
    fs/btrfs/extent_io.c:2193:13: warning: cast to pointer from integer of
    different size [-Wint-to-pointer-cast]
    failrec = (struct io_failure_record *)state->private;
    ...
    ===============================================================================

    Signed-off-by: Satoru Takeuchi
    Reported-by: Chris Murphy
    Signed-off-by: Chris Mason

    Satoru Takeuchi
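
The warning above arises when an integer wider than a pointer is cast straight to a pointer type. A minimal sketch of the usual remedy follows, assuming the private field is a 64-bit integer; the struct layout and helper names are hypothetical, and the kernel patch itself may use a different intermediate type.

```c
#include <stdint.h>

/* Placeholder layout, hypothetical; only the type name echoes the warning. */
struct io_failure_record {
    int dummy;
};

/* Store a pointer in a 64-bit field via uintptr_t (pointer-sized integer). */
static uint64_t ptr_to_u64(struct io_failure_record *rec)
{
    return (uint64_t)(uintptr_t)rec;
}

/*
 * Recover the pointer: narrowing to uintptr_t first makes the size change
 * explicit, so the integer-to-pointer cast is between equal-sized types
 * on both 32-bit and 64-bit targets.
 */
static struct io_failure_record *u64_to_ptr(uint64_t priv)
{
    return (struct io_failure_record *)(uintptr_t)priv;
}
```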
     
  • When removing a block group we were deleting it from its space_info's
    ro_bgs list without the correct protection - the space info's spinlock.
    Fix this by doing the list delete while holding the spinlock of the
    corresponding space info, which is the correct lock for any operation
    on that list.

    This issue was introduced in the 3.19 kernel by the following change:

    Btrfs: move read only block groups onto their own list V2
    commit 633c0aad4c0243a506a3e8590551085ad78af82d

    I ran into a kernel crash while one task was running statfs, which iterates
    the space_info->ro_bgs list while holding the space info's spinlock,
    and another task was deleting a block group from the same list without
    holding that spinlock, as part of the block group remove operation (while
    running the function btrfs_remove_block_group). This happened often when
    running the stress test xfstests/generic/038 I recently made.

    Signed-off-by: Filipe Manana
    Signed-off-by: Chris Mason

    Filipe Manana
     
  • The address that should be freed is not 'ppath' but 'path'.

    Signed-off-by: Tsutomu Itoh
    Reviewed-by: Miao Xie
    Signed-off-by: Chris Mason

    Tsutomu Itoh
     
  • The version merged to 3.19 did not handle errors from start_transaction
    and could pass an invalid pointer to commit_transaction.

    Fixes: 6b5fe46dfa52441f ("btrfs: do commit in sync_fs if there are pending changes")
    Reported-by: Filipe Manana
    Signed-off-by: David Sterba
    Signed-off-by: Chris Mason

    David Sterba
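
The failure mode in the commit above follows from the kernel convention of encoding errors in pointer return values. The sketch below re-creates that convention in user space; every `toy_` name is hypothetical, and the helpers only imitate the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() macros rather than reproduce them.

```c
#include <errno.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095

/* Encode a negative errno in a pointer, and decode it again. */
static inline void *toy_err_ptr(long err) { return (void *)(intptr_t)err; }
static inline long toy_ptr_err(const void *p) { return (long)(intptr_t)p; }

/* Error pointers occupy the top TOY_MAX_ERRNO addresses. */
static inline int toy_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-TOY_MAX_ERRNO;
}

struct toy_trans { int committed; };
static struct toy_trans the_trans;

static struct toy_trans *toy_start_transaction(int fail)
{
    if (fail)
        return toy_err_ptr(-ENOMEM);
    the_trans.committed = 0;
    return &the_trans;
}

static int toy_commit(struct toy_trans *t)
{
    t->committed = 1;
    return 0;
}

/* Check the result before use; never hand an error pointer to commit. */
static int toy_sync(int fail)
{
    struct toy_trans *t = toy_start_transaction(fail);

    if (toy_is_err(t))
        return (int)toy_ptr_err(t);
    return toy_commit(t);
}
```

Without the `toy_is_err()` check, `toy_commit()` would dereference an address derived from -ENOMEM.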
     

19 Jan, 2015

1 commit

  • It really needs to check that src is non-directory *and* use
    {un,}lock_two_nodirectories(). As it is, it's trivial to cause
    double-lock (ioctl(fd, CIFS_IOC_COPYCHUNK_FILE, fd)) and if the
    last argument is an fd of directory, we are asking for trouble
    by violating the locking order - all directories go before all
    non-directories. If the last argument is an fd of parent
    directory, it has 50% odds of locking child before parent,
    which will cause AB-BA deadlock if we race with unlink().

    Cc: stable@vger.kernel.org @ 3.13+
    Signed-off-by: Al Viro

    Al Viro
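
The locking rules in the commit above (reject directories, avoid double-locking the same inode, take the two locks in a fixed order) can be shown with a toy model. This is a sketch only: `toy_inode` and all function names are hypothetical, the "lock" is a counter rather than a real mutex, and the code is not the kernel's lock_two_nodirectories().

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy inode: lock_depth counts how many times the lock was taken. */
struct toy_inode {
    bool is_dir;
    int lock_depth;
};

static void toy_lock(struct toy_inode *i)   { i->lock_depth++; }
static void toy_unlock(struct toy_inode *i) { i->lock_depth--; }

/* Returns false and locks nothing if either inode is a directory. */
static bool lock_two_files(struct toy_inode *a, struct toy_inode *b)
{
    if (a->is_dir || b->is_dir)
        return false;

    /* Fixed order by address, so two racing callers cannot go AB-BA. */
    if ((uintptr_t)a > (uintptr_t)b) {
        struct toy_inode *tmp = a;
        a = b;
        b = tmp;
    }

    toy_lock(a);
    if (a != b)     /* same inode passed twice: lock it only once */
        toy_lock(b);
    return true;
}

static void unlock_two_files(struct toy_inode *a, struct toy_inode *b)
{
    toy_unlock(a);
    if (a != b)
        toy_unlock(b);
}
```

Passing the same inode for both arguments, as in the ioctl(fd, CIFS_IOC_COPYCHUNK_FILE, fd) case, leaves lock_depth at 1 rather than 2.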
     

17 Jan, 2015

2 commits


16 Jan, 2015

1 commit


13 Jan, 2015

1 commit

  • commit 0efaa7e82f02fe69c05ad28e905f31fc86e6f08e
    locks: generic_delete_lease doesn't need a file_lock at all

    moves the call to fl->fl_lmops->lm_change() to a place in the
    code where fl might be a non-lease lock.
    When that happens, fl_lmops is NULL and an Oops ensues.

    So add an extra test to restore correct functioning.

    Reported-by: Linda Walsh
    Link: https://bugzilla.suse.com/show_bug.cgi?id=912569
    Cc: stable@vger.kernel.org (v3.18)
    Fixes: 0efaa7e82f02fe69c05ad28e905f31fc86e6f08e
    Signed-off-by: NeilBrown
    Signed-off-by: Jeff Layton

    NeilBrown
     

12 Jan, 2015

1 commit

  • Pull scheduler fixes from Ingo Molnar:
    "Misc fixes: group scheduling corner case fix, two deadline scheduler
    fixes, effective_load() overflow fix, nested sleep fix, 6144 CPUs
    system fix"

    * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    sched/fair: Fix RCU stall upon -ENOMEM in sched_create_group()
    sched/deadline: Avoid double-accounting in case of missed deadlines
    sched/deadline: Fix migration of SCHED_DEADLINE tasks
    sched: Fix odd values in effective_load() calculations
    sched, fanotify: Deal with nested sleeps
    sched: Fix KMALLOC_MAX_SIZE overflow during cpumask allocation

    Linus Torvalds
     

10 Jan, 2015

4 commits

  • Pull two nfsd bugfixes from Bruce Fields.

    * 'for-3.19' of git://linux-nfs.org/~bfields/linux:
    rpc: fix xdr_truncate_encode to handle buffer ending on page boundary
    nfsd: fix fi_delegees leak when fi_had_conflict returns true

    Linus Torvalds
     
  • Pull two Ceph fixes from Sage Weil:
    "These are both pretty trivial: a sparse warning fix and size_t printk
    thing"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
    libceph: fix sparse endianness warnings
    ceph: use %zu for len in ceph_fill_inline_data()

    Linus Torvalds
     
  • Pull btrfs fixes from Chris Mason:
    "None of these are huge, but my commit does fix a regression from 3.18
    that could cause lost files during log replay.

    This also adds Dave Sterba to the list of Btrfs maintainers. It
    doesn't mean we're doing things differently, but Dave has really been
    helping with the maintainer workload for years"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
    Btrfs: don't delay inode ref updates during log replay
    Btrfs: correctly get tree level in tree_backref_for_extent
    Btrfs: call inode_dec_link_count() on mkdir error path
    Btrfs: abort transaction if we don't find the block group
    Btrfs, scrub: uninitialized variable in scrub_extent_for_parity()
    Btrfs: add more maintainers

    Linus Torvalds
     
  • Returning a difference from a comparison function is usually wrong
    (see acbbe6fbb240 "kcmp: fix standard comparison bug" for the long
    story). Here there is the additional twist that if the void pointers
    ns and kn->ns happen to differ by a multiple of 2^32,
    kernfs_name_compare returns 0, falsely reporting a match to the
    caller.

    Technically 'hash - kn->hash' is ok since the hashes are restricted to
    31 bits, but it's better to avoid that subtlety.

    Signed-off-by: Rasmus Villemoes
    Acked-by: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Rasmus Villemoes
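
The truncation problem described in the commit above is easy to demonstrate outside the kernel. The sketch below uses 64-bit integers in place of the void pointers; both function names are hypothetical.

```c
#include <stdint.h>

/*
 * Buggy pattern: the 64-bit difference is truncated to 32 bits on return,
 * so inputs that differ by a multiple of 2^32 compare as equal. (For
 * results above INT_MAX the final conversion to int is
 * implementation-defined, another reason to avoid this pattern.)
 */
static int cmp_by_difference(uint64_t a, uint64_t b)
{
    return (int)(uint32_t)(a - b);
}

/* Safer pattern: derive the sign from explicit comparisons. */
static int cmp_by_comparison(uint64_t a, uint64_t b)
{
    if (a < b)
        return -1;
    if (a > b)
        return 1;
    return 0;
}
```

For the inputs 2^32 and 0, the first function returns 0 (a false match) while the second returns 1.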
     

09 Jan, 2015

5 commits

  • As per e23738a7300a ("sched, inotify: Deal with nested sleeps").

    fanotify_read is a wait loop with sleeps in it. Wait loops rely on
    task_struct::state and sleeps do too, since that's the only means of
    actually sleeping. Therefore the nested sleeps destroy the wait loop
    state and the wait loop breaks the sleep functions that assume
    TASK_RUNNING (mutex_lock).

    Fix this by using the new woken_wake_function and wait_woken() stuff,
    which registers wakeups in wait and thereby allows shrinking the
    task_struct::state changes to the actual sleep part.

    Reported-by: Yuanhan Liu
    Reported-by: Sedat Dilek
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Takashi Iwai
    Cc: Al Viro
    Cc: Eric Paris
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/20141216152838.GZ3337@twins.programming.kicks-ass.net
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Fix clashing values for O_PATH and FMODE_NONOTIFY on sparc. The
    clashing O_PATH value was added in commit 5229645bdc35 ("vfs: add
    nonconflicting values for O_PATH") but this can't be changed as it is
    user-visible.

    FMODE_NONOTIFY is only used internally in the kernel, but it is in the
    same numbering space as the other O_* flags, as indicated by the comment
    at the top of include/uapi/asm-generic/fcntl.h (and its use in
    fs/notify/fanotify/fanotify_user.c). So renumber it to avoid the clash.

    All of this has happened before (commit 12ed2e36c98a: "fanotify:
    FMODE_NONOTIFY and __O_SYNC in sparc conflict"), and all of this will
    happen again -- so update the uniqueness check in fcntl_init() to
    include __FMODE_NONOTIFY.

    Signed-off-by: David Drysdale
    Acked-by: David S. Miller
    Acked-by: Jan Kara
    Cc: Heinrich Schuchardt
    Cc: Alexander Viro
    Cc: Arnd Bergmann
    Cc: Stephen Rothwell
    Cc: Eric Paris
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Drysdale
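
The idea behind a flag uniqueness check like the one mentioned in the commit above can be sketched with a population count: if n single-bit flags are OR-ed together and the result has fewer than n bits set, at least two flags clash. The function names and flag values here are hypothetical; this is not the code in fcntl_init().

```c
#include <stdint.h>

/* Count the set bits in a 32-bit word. */
static int popcount32(uint32_t v)
{
    int n = 0;

    for (; v; v &= v - 1)   /* clear the lowest set bit each round */
        n++;
    return n;
}

/*
 * Assuming every entry has exactly one bit set, returns 1 if no two
 * entries share a bit, and 0 if at least two entries clash.
 */
static int flags_are_unique(const uint32_t *flags, int n)
{
    uint32_t all = 0;

    for (int i = 0; i < n; i++)
        all |= flags[i];
    return popcount32(all) == n;
}
```

Adding a newly introduced flag to the checked set means a future clash is caught by the check instead of by users.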
     
  • In ocfs2_link(), the parent directory inode passed to function
    ocfs2_lookup_ino_from_name() is wrong. Parameter dir is the parent of
    new_dentry, not old_dentry. We should get old_dir from old_dentry and
    look up old_dentry in old_dir in case another node removes the old dentry.

    With this change, hard linking works again when paths are relative with
    at least one subdirectory. This is how the problem was reproducible:

    # mkdir a
    # mkdir b
    # touch a/test
    # ln a/test b/test
    ln: failed to create hard link `b/test' => `a/test': No such file or directory

    However when creating links in the same dir, it worked well.

    Now the link gets created.

    Fixes: 0e048316ff57 ("ocfs2: check existence of old dentry in ocfs2_link()")
    Signed-off-by: joyce.xue
    Reported-by: Szabo Aron - UBIT
    Cc: Mark Fasheh
    Cc: Joel Becker
    Tested-by: Aron Szabo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xue jiufei
     
  • In dlm_process_recovery_data, ret is set to -ENOMEM only when
    dlm_new_lock fails, and in that case newlock is definitely NULL. So
    testing newlock is meaningless; remove the test.

    Signed-off-by: Joseph Qi
    Reviewed-by: Alex Chen
    Reviewed-by: Mark Fasheh
    Cc: Joel Becker
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joseph Qi
     
  • len is size_t, should be printed with %zu.

    Signed-off-by: Ilya Dryomov

    Ilya Dryomov
     

08 Jan, 2015

1 commit

  • Currently, nfs4_set_delegation takes a reference to an existing
    delegation and then checks to see if there is a conflict. If there is
    one, then it doesn't release that reference.

    Change the code to take the reference after the check and only if there
    is no conflict.

    Signed-off-by: Jeff Layton
    Cc: stable@vger.kernel.org
    Signed-off-by: J. Bruce Fields

    Jeff Layton
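
The reference leak described in the commit above comes down to the order of two steps. The toy model below contrasts the two orderings; `toy_file` and both function names are hypothetical and do not mirror the nfsd data structures.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_file {
    int refs;
    bool had_conflict;
};

/*
 * Leaky ordering: the reference is taken before the conflict check and
 * is not dropped on the early return.
 */
static struct toy_file *get_unless_conflict_leaky(struct toy_file *f)
{
    f->refs++;
    if (f->had_conflict)
        return NULL;    /* reference leaked here */
    return f;
}

/* Fixed ordering: check first, take the reference only on success. */
static struct toy_file *get_unless_conflict(struct toy_file *f)
{
    if (f->had_conflict)
        return NULL;
    f->refs++;
    return f;
}
```

An alternative would be to keep the original order and drop the reference on the conflict path; checking first avoids touching the counter at all in that case.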
     

07 Jan, 2015

1 commit


06 Jan, 2015

2 commits

  • Theoretically we need to order setting of various fields in fc with
    fc->initialized.

    No known bug reports related to this yet.

    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
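
The ordering concern in the commit above is the classic publish pattern: fill in the fields, then set the "initialized" flag, with barriers so that a reader who sees the flag also sees the fields. The sketch below expresses that with C11 release/acquire atomics in user space. The struct and function names are hypothetical, only the field names `minor` and `initialized` echo the commit messages, and the kernel code uses its own barrier primitives rather than C11 atomics.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a connection object. */
struct conn {
    unsigned minor;
    atomic_bool initialized;
};

/* Writer: fill in the fields first, then publish with release semantics. */
static void conn_publish(struct conn *c, unsigned minor)
{
    c->minor = minor;
    atomic_store_explicit(&c->initialized, true, memory_order_release);
}

/*
 * Reader: the acquire load pairs with the release store above, so a
 * reader that observes initialized == true also observes the final
 * value of minor. Returns false if the object is not yet published.
 */
static bool conn_read_minor(struct conn *c, unsigned *minor)
{
    if (!atomic_load_explicit(&c->initialized, memory_order_acquire))
        return false;
    *minor = c->minor;
    return true;
}
```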
     
  • Analysis from Marc:

    "Commit 7078187a795f ("fuse: introduce fuse_simple_request() helper")
    from the above pull request triggers some EIO errors for me in some tests
    that rely on fuse.

    Looking at the code changes and a bit of debugging info I think there's a
    general problem here that fuse_get_req checks and possibly waits for
    fc->initialized, and this was always called first. But this commit
    changes the ordering and in many places fc->minor is now possibly used
    before fuse_get_req, and we can't be sure that fc has been initialized.
    In my case fuse_lookup_init sets req->out.args[0].size to the wrong size
    because fc->minor at that point is still 0, leading to the EIO error."

    Fix by moving the compat adjustments into fuse_simple_request() to after
    fuse_get_req().

    This is also more readable than the original, since now compatibility is
    handled in a single function instead of cluttering each operation.

    Reported-by: Marc Dionne
    Tested-by: Marc Dionne
    Signed-off-by: Miklos Szeredi
    Fixes: 7078187a795f ("fuse: introduce fuse_simple_request() helper")

    Miklos Szeredi