26 Sep, 2014

5 commits

  • Commit 0227d6abb378 ("fs/cachefiles: replace kerror by pr_err") didn't
    include the newline that featured in the original kerror definition.
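
    For illustration only (a hypothetical call site, not a line from the
    patch): pr_err() does not append a newline the way the old kerror()
    wrapper did, so the "\n" has to come back in each format string.

    pr_err("Failed to register: %d", ret);    /* newline now missing */
    pr_err("Failed to register: %d\n", ret);  /* restored behaviour  */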

    Signed-off-by: Fabian Frederick
    Reported-by: David Howells
    Acked-by: David Howells
    Cc: [3.16.x]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fabian Frederick
     
  • In PTE holes that contain VM_SOFTDIRTY VMAs, unmapped addresses before
    VM_SOFTDIRTY VMAs are reported as softdirty by /proc/pid/pagemap. This
    bug was introduced in commit 68b5a6524856 ("mm: softdirty: respect
    VM_SOFTDIRTY in PTE holes"). That commit made /proc/pid/pagemap look at
    VM_SOFTDIRTY in PTE holes but neglected to observe the start of VMAs
    returned by find_vma.
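
    A minimal user-space sketch of the corrected reporting logic (the
    types and names below are simplified stand-ins for the kernel's
    pagemap walk, not the actual patch): addresses in the hole below
    vma->vm_start must be reported clean even when the VMA itself is
    marked VM_SOFTDIRTY.

    #include <stdbool.h>
    #include <stdint.h>

    #define PAGE_SIZE    4096ULL
    #define VM_SOFTDIRTY 0x1UL

    struct vma_stub { uint64_t vm_start, vm_end; unsigned long vm_flags; };

    /* Walk a PTE hole [start, end): only the part that actually lies
     * inside the VM_SOFTDIRTY VMA may be reported softdirty; the gap
     * before vma->vm_start (what the buggy version got wrong) is clean. */
    void report_hole(uint64_t start, uint64_t end, const struct vma_stub *vma)
    {
            for (uint64_t addr = start; addr < end; addr += PAGE_SIZE) {
                    bool softdirty = vma && addr >= vma->vm_start &&
                                     (vma->vm_flags & VM_SOFTDIRTY);
                    (void)softdirty;  /* emit a pagemap entry for addr */
            }
    }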

    Tested:
    Wrote a selftest that creates a PMD-sized VMA then unmaps the first
    page and asserts that the page is not softdirty. I'm going to send the
    pagemap selftest in a later commit.

    Signed-off-by: Peter Feiner
    Cc: Cyrill Gorcunov
    Cc: Pavel Emelyanov
    Cc: Hugh Dickins
    Cc: Naoya Horiguchi
    Cc: "Kirill A. Shutemov"
    Cc: Jamie Liu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Feiner
     
  • There is a deadlock case which was reported by Guozhonghua:
    https://oss.oracle.com/pipermail/ocfs2-devel/2014-September/010079.html

    This case is caused by taking &res->spinlock and &dlm->master_lock
    in different orders in different threads.

    It was introduced by commit 8d400b81cc83 ("ocfs2/dlm: Clean up refmap
    helpers"). Since the lockres is new, it does not require
    &res->spinlock, so remove it.
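
    A user-space model of the misordering (pthread mutexes standing in
    for the two kernel locks; illustrative only, not the ocfs2 code):

    #include <pthread.h>

    static pthread_mutex_t res_spinlock    = PTHREAD_MUTEX_INITIALIZER;
    static pthread_mutex_t dlm_master_lock = PTHREAD_MUTEX_INITIALIZER;

    /* One path takes res->spinlock, then dlm->master_lock ... */
    void *path_a(void *arg)
    {
            pthread_mutex_lock(&res_spinlock);
            pthread_mutex_lock(&dlm_master_lock);
            /* ... work ... */
            pthread_mutex_unlock(&dlm_master_lock);
            pthread_mutex_unlock(&res_spinlock);
            return arg;
    }

    /* ... while another takes them in the opposite order.  Run
     * concurrently, each thread can hold one lock and wait forever for
     * the other (ABBA deadlock).  Dropping the unneeded res->spinlock
     * acquisition on the brand-new lockres removes one side of the
     * inversion. */
    void *path_b(void *arg)
    {
            pthread_mutex_lock(&dlm_master_lock);
            pthread_mutex_lock(&res_spinlock);
            /* ... work ... */
            pthread_mutex_unlock(&res_spinlock);
            pthread_mutex_unlock(&dlm_master_lock);
            return arg;
    }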

    Fixes: 8d400b81cc83 ("ocfs2/dlm: Clean up refmap helpers")
    Signed-off-by: Joseph Qi
    Reviewed-by: joyce.xue
    Reported-by: Guozhonghua
    Cc: Joel Becker
    Cc: Mark Fasheh
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joseph Qi
     
  • This bug leads to reproducible silent data loss, despite the use of
    msync(), sync() and a clean unmount of the file system. It is easily
    reproducible with the following script:

    ----------------[BEGIN SCRIPT]--------------------
    mkfs.nilfs2 -f /dev/sdb
    mount /dev/sdb /mnt

    dd if=/dev/zero bs=1M count=30 of=/mnt/testfile

    umount /mnt
    mount /dev/sdb /mnt
    CHECKSUM_BEFORE="$(md5sum /mnt/testfile)"

    /root/mmaptest/mmaptest /mnt/testfile 30 10 5

    sync
    CHECKSUM_AFTER="$(md5sum /mnt/testfile)"
    umount /mnt
    mount /dev/sdb /mnt
    CHECKSUM_AFTER_REMOUNT="$(md5sum /mnt/testfile)"
    umount /mnt

    echo "BEFORE MMAP:\t$CHECKSUM_BEFORE"
    echo "AFTER MMAP:\t$CHECKSUM_AFTER"
    echo "AFTER REMOUNT:\t$CHECKSUM_AFTER_REMOUNT"
    ----------------[END SCRIPT]--------------------

    The mmaptest tool looks something like this (very simplified, with
    error checking removed):

    ----------------[BEGIN mmaptest]--------------------
    data = mmap(NULL, file_size - file_offset, PROT_READ | PROT_WRITE,
                MAP_SHARED, fd, file_offset);

    for (i = 0; i < write_count; ++i) {
            memcpy(data + i * 4096, buf, sizeof(buf));
            msync(data, file_size - file_offset, MS_SYNC);
    }
    ----------------[END mmaptest]--------------------

    The output of the script looks something like this:

    BEFORE MMAP: 281ed1d5ae50e8419f9b978aab16de83 /mnt/testfile
    AFTER MMAP: 6604a1c31f10780331a6850371b3a313 /mnt/testfile
    AFTER REMOUNT: 281ed1d5ae50e8419f9b978aab16de83 /mnt/testfile

    So it is clear that the changes done using mmap() do not survive a
    remount. This can be reproduced 100% of the time. The problem was
    introduced in commit 136e8770cd5d ("nilfs2: fix issue of
    nilfs_set_page_dirty() for page at EOF boundary").

    If the page was read with mpage_readpage() or mpage_readpages() for
    example, then it has no buffers attached to it. In that case
    page_has_buffers(page) in nilfs_set_page_dirty() will be false.
    Therefore nilfs_set_file_dirty() is never called and the pages are never
    collected and never written to disk.

    This patch fixes the problem by also calling nilfs_set_file_dirty() if the
    page has no buffers attached to it.
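
    A rough sketch of the resulting logic in nilfs_set_page_dirty()
    (paraphrased from the description above, not the verbatim patch;
    newly_dirtied and nr_dirty are placeholder names):

    if (page_has_buffers(page)) {
            /* existing path: account the individual dirty buffers */
    } else if (newly_dirtied) {
            /* the page was read without buffers attached (e.g. via
             * mpage_readpage()); account it anyway so the blocks get
             * collected and written by the segment constructor */
            nilfs_set_file_dirty(inode, nr_dirty);
    }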

    [akpm@linux-foundation.org: s/PAGE_SHIFT/PAGE_CACHE_SHIFT/]
    Signed-off-by: Andreas Rohner
    Tested-by: Andreas Rohner
    Signed-off-by: Ryusuke Konishi
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andreas Rohner
     
  • osb->vol_label is allocated in ocfs2_initialize_super() but not freed
    if an error occurs or during umount, thus causing a memory leak.

    Signed-off-by: Joseph Qi
    Reviewed-by: joyce.xue
    Cc: Mark Fasheh
    Cc: Joel Becker
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joseph Qi
     

23 Sep, 2014

1 commit

  • …git/dhowells/linux-fs

    Pull fs-cache fixes from David Howells:

    - Put a timeout in releasepage() to deal with a recursive hang between
    the memory allocator, writeback, ext4 and fscache under memory
    pressure.

    - Fix a pair of refcount bugs in the fscache error handling.

    - Remove a couple of unused pagevecs.

    - The cachefiles requirement that the base directory support rename
    should permit rename2 as an alternative - otherwise certain
    filesystems cannot now be used as backing stores (such as ext4).

    * tag 'fscache-fixes-20140917' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
    CacheFiles: Handle rename2
    cachefiles: remove two unused pagevecs.
    FS-Cache: refcount becomes corrupt under vma pressure.
    FS-Cache: Reduce cookie ref count if submit fails.
    FS-Cache: Timeout for releasepage()

    Linus Torvalds
     

22 Sep, 2014

1 commit

  • On 32-bit architectures, the legacy buffer_head functions are not always
    handling the sector number with the proper 64-bit types, and will thus
    fail on 4TB+ disks.

    Any code that uses __getblk() (and thus bread(), breadahead(),
    sb_bread(), sb_breadahead(), sb_getblk()), and calls it using a 64-bit
    block on a 32-bit arch (where "long" is 32-bit) causes an infinite loop
    in __getblk_slow() with an infinite stream of errors logged to dmesg
    like this:

    __find_get_block_slow() failed. block=6740375944, b_blocknr=2445408648
    b_state=0x00000020, b_size=512
    device sda1 blocksize: 512

    Note how, in hex, block is 0x191C1F988 and b_blocknr is 0x91C1F988,
    i.e. the top 32 bits are missing (in this case the 0x1 at the top).

    This is because grow_dev_page() is broken: it has a 32-bit overflow
    due to left-shifting the page index value (a pgoff_t - which is just
    32 bits on 32-bit architectures) to form the block number, and the top
    bits get lost because the pgoff_t is not cast to sector_t (64-bit)
    before the shift.

    This patch fixes this issue by type casting "index" to sector_t before
    doing the left shift.
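
    A self-contained illustration of the overflow and of casting before
    the shift (user-space C; sizebits = 3 matches 512-byte blocks on
    4 KiB pages, and the index value reproduces the numbers from the log
    above):

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
            uint32_t index    = 842546993; /* pgoff_t: 32 bits on 32-bit */
            int      sizebits = 3;         /* 8 blocks of 512 bytes/page */

            /* Buggy: the shift is done in 32-bit arithmetic, so the
             * result is truncated before being widened. */
            uint64_t bad  = (uint32_t)(index << sizebits);
            /* Fixed: widen to 64 bits (sector_t) first, then shift. */
            uint64_t good = (uint64_t)index << sizebits;

            /* prints bad = 2445408648, good = 6740375944 */
            printf("bad = %llu, good = %llu\n",
                   (unsigned long long)bad, (unsigned long long)good);
            return 0;
    }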

    Note this is not a theoretical bug but has been seen in the field on a
    4TiB hard drive with logical sector size 512 bytes.

    This patch has been verified to fix the infinite loop problem on 3.17-rc5
    kernel using a 4TB disk image mounted using "-o loop". Without this
    patch, doing a "find /nt" where /nt is an NTFS volume causes the
    infinite loop 100% reproducibly, whilst with the patch it works fine
    as expected.

    Signed-off-by: Anton Altaparmakov
    Cc: stable@vger.kernel.org
    Signed-off-by: Linus Torvalds

    Anton Altaparmakov
     

20 Sep, 2014

2 commits


19 Sep, 2014

3 commits

  • Pull cifs/smb3 fixes from Steve French:
    "Fixes for problems found during testing and debugging at the SMB3
    storage test event (plugfest) this week"

    * 'for-linus' of git://git.samba.org/sfrench/cifs-2.6:
    Fix mfsymlinks file size check
    Update version number displayed by modinfo for cifs.ko
    cifs: remove dead code
    Revert "cifs: No need to send SIGKILL to demux_thread during umount"
    [SMB3] Fix oops when creating symlinks on smb3
    [CIFS] Fix setting time before epoch (negative time values)

    Linus Torvalds
     
  • James Drews reports another bug whereby the NFS client is now sending
    an OPEN_DOWNGRADE in a situation where it should really have sent a
    CLOSE: the client is opening the file for O_RDWR, but then trying to
    do a downgrade to O_RDONLY, which is not allowed by the NFSv4 spec.

    Reported-by: James Drews
    Link: http://lkml.kernel.org/r/541AD7E5.8020409@engr.wisc.edu
    Fixes: aee7af356e15 (NFSv4: Fix problems with close in the presence...)
    Cc: stable@vger.kernel.org # 2.6.33+
    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • There is a race between nfs4_state_manager() and
    nfs_server_remove_lists() that happens during an NFSv3 mount.

    The v3 mount notices there is already a super block, so
    nfs_server_remove_lists() is called, which uses the nfs_client_lock
    spin lock to synchronize access to the client list.

    At the same time nfs4_state_manager() is running through
    the client list looking for work to do, using the same
    lock. When nfs4_state_manager() wins the race to the
    list, a v3 client pointer is found and is not ignored
    properly, which causes the panic.

    Moving some protocol checks before the state checking
    avoids the panic.

    CC: Stable Tree
    Signed-off-by: Steve Dickson
    Signed-off-by: Trond Myklebust

    Steve Dickson
     

18 Sep, 2014

4 commits

  • This reverts commit b96de000bc8bc9688b3a2abea4332bd57648a49f.

    This commit is triggering failures to mount by subvolume id in some
    configurations. The main problem is how many different ways this
    scanning function is used, both for scanning while mounted and
    unmounted. A proper cleanup is too big for late rcs.

    For now, just revert the commit and we'll put a better fix into a later
    merge window.

    Signed-off-by: Chris Mason

    Chris Mason
     
  • Not all filesystems now provide the rename i_op - ext4 for one - but rather
    provide the rename2 i_op. CacheFiles checks that the filesystem has rename
    and so will reject ext4 now with EPERM:

    CacheFiles: Failed to register: -1

    Fix this by checking for rename2 as an alternative. The call to vfs_rename()
    actually handles selection of the appropriate function, so we needn't worry
    about that.
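
    The shape of the check is roughly this (paraphrased; "root" stands
    for the dentry of the cache's base directory, and the surrounding
    cachefiles setup code and its error handling are omitted):

    /* Before, only ->rename was accepted, so a backing directory on
     * ext4 (which now provides only ->rename2) was rejected. */
    if (!root->d_inode->i_op->rename &&
        !root->d_inode->i_op->rename2)
            return -EPERM;  /* backing filesystem cannot rename */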

    Turning on debugging shows:

    [cachef] ==> cachefiles_get_directory(,,cache)
    [cachef] subdir -> ffff88000b22b778 positive
    [cachef]

    David Howells
     
  • These two have been unused since commit
    c4d6d8dbf335c7fa47341654a37c53a512b519bb ("CacheFiles: Fix the marking
    of cached pages") in 3.8.

    Signed-off-by: NeilBrown
    Signed-off-by: David Howells

    NeilBrown
     
  • In rare cases under heavy VMA pressure, the ref count for an fscache
    cookie becomes corrupt: we decrement the ref count even if we failed
    before incrementing it.

    FS-Cache: Assertion failed bnode-eca5f9c6/syslog
    0 > 0 is false
    ------------[ cut here ]------------
    kernel BUG at fs/fscache/cookie.c:519!
    invalid opcode: 0000 [#1] SMP
    Call Trace:
    [] __fscache_relinquish_cookie+0x50/0x220 [fscache]
    [] ceph_fscache_unregister_inode_cookie+0x3e/0x50 [ceph]
    [] ceph_destroy_inode+0x33/0x200 [ceph]
    [] ? __fsnotify_inode_delete+0xe/0x10
    [] destroy_inode+0x3c/0x70
    [] evict+0x111/0x180
    [] iput+0x103/0x190
    [] __dentry_kill+0x1c8/0x220
    [] shrink_dentry_list+0xf1/0x250
    [] prune_dcache_sb+0x4c/0x60
    [] super_cache_scan+0xff/0x170
    [] shrink_slab_node+0x140/0x2c0
    [] shrink_slab+0x8a/0x130
    [] balance_pgdat+0x3e2/0x5d0
    [] kswapd+0x16a/0x4a0
    [] ? __wake_up_sync+0x20/0x20
    [] ? balance_pgdat+0x5d0/0x5d0
    [] kthread+0xc9/0xe0
    [] ? ftrace_raw_event_xen_mmu_release_ptpage+0x70/0x90
    [] ? flush_kthread_worker+0xb0/0xb0
    [] ret_from_fork+0x7c/0xb0
    [] ? flush_kthread_worker+0xb0/0xb0
    RIP [] __fscache_disable_cookie+0x1db/0x210 [fscache]
    RSP
    ---[ end trace 254d0d7c74a01f25 ]---

    Signed-off-by: Milosz Tanski
    Signed-off-by: David Howells

    Milosz Tanski
     

17 Sep, 2014

1 commit

  • When a ranged fsync finishes, if there are still extent maps in the modified
    list, still set the inode's logged_trans and last_log_commit. This is important
    in case an inode is fsync'ed and unlinked in the same transaction, to ensure its
    inode ref gets deleted from the log and the respective dentries in its parent
    are deleted too from the log (if the parent directory was fsync'ed in the same
    transaction).

    Instead make btrfs_inode_in_log() return false if the list of modified extent
    maps isn't empty.

    This is an incremental change on top of the v4 version of the patch:

    "Btrfs: fix fsync data loss after a ranged fsync"

    which was added to its v5, but didn't make it on time.

    Signed-off-by: Filipe Manana
    Signed-off-by: Chris Mason

    Filipe Manana
     

16 Sep, 2014

7 commits

  • Pull gfs2 fixes from Steven Whitehouse:
    "Here are a number of small fixes for GFS2.

    There is a fix for FIEMAP on large sparse files, a negative dentry
    hashing fix, a fix for flock, and a bug fix relating to d_splice_alias
    usage.

    There are also (patches 1 and 5) a couple of updates which are less
    critical, but small and low risk"

    * tag 'gfs2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixes:
    GFS2: fix d_splice_alias() misuses
    GFS2: Don't use MAXQUOTAS value
    GFS2: Hash the negative dentry during inode lookup
    GFS2: Request demote when a "try" flock fails
    GFS2: Change maxlen variables to size_t
    GFS2: fs/gfs2/super.c: replace seq_printf by seq_puts

    Linus Torvalds
     
  • Commit d6bb3e9075bb ("vfs: simplify and shrink stack frame of
    link_path_walk()") introduced build problems with GCC versions older
    than 4.6 due to the initialisation of a member of an anonymous union in
    struct qstr without enclosing braces.

    This hits GCC bug 10676 [1] (which was fixed in GCC 4.6 by [2]), and
    causes the following build error:

    fs/namei.c: In function 'link_path_walk':
    fs/namei.c:1778: error: unknown field 'hash_len' specified in initializer

    This is worked around by adding explicit braces.

    [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=10676
    [2] https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=159206
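
    A self-contained illustration of the workaround (the struct below
    only approximates the anonymous-union layout of struct qstr; it is
    not the kernel definition):

    #include <stdio.h>

    struct qstr_like {
            union {
                    struct {
                            unsigned int hash;
                            unsigned int len;
                    };
                    unsigned long long hash_len;
            };
            const char *name;
    };

    int main(void)
    {
            /* GCC < 4.6 rejects initializing the anonymous-union member
             * without braces:
             *     struct qstr_like q = { .hash_len = 42, .name = "x" };
             * Wrapping the union initializer in explicit braces builds
             * on both old and new compilers: */
            struct qstr_like q = { { .hash_len = 42 }, .name = "x" };

            printf("%llu %s\n", q.hash_len, q.name);
            return 0;
    }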

    Fixes: d6bb3e9075bb (vfs: simplify and shrink stack frame of link_path_walk())
    Signed-off-by: James Hogan
    Cc: Linus Torvalds
    Cc: Alexander Viro
    Cc: Geert Uytterhoeven
    Cc: linux-fsdevel@vger.kernel.org
    Cc: linux-metag@vger.kernel.org
    Signed-off-by: Linus Torvalds

    James Hogan
     
  • If the mfsymlinks file size has changed (e.g. the file no longer
    represents an emulated symlink) we were not returning an error properly.

    Signed-off-by: Steve French
    Reviewed-by: Stefan Metzmacher

    Steve French
     
  • Update cifs.ko version to 2.05

    Signed-off-by: Steve French

    Steve French
     
  • cifs provides two dummy functions 'sess_auth_lanman' and
    'sess_auth_kerberos' for the case in which the respective
    features are not defined. However, the caller is also under
    an #ifdef, so we just get warnings about unused code:

    fs/cifs/sess.c:1109:1: warning: 'sess_auth_kerberos' defined but not used [-Wunused-function]
    sess_auth_kerberos(struct sess_data *sess_data)

    Removing the dead functions gets rid of the warnings without
    any downsides that I can see.

    (Yalin Wang reported the identical problem and fix so added him)

    Signed-off-by: Arnd Bergmann
    Signed-off-by: Yalin Wang
    Signed-off-by: Steve French

    Arnd Bergmann
     
  • This reverts commit 52a36244443eabb594bdb63622ff2dd7a083f0e2.

    Causes rmmod to fail for at least 7 seconds after unmount which
    makes automated testing a little harder when reloading cifs.ko
    between test runs.

    Signed-off-by: Namjae Jeon
    CC: Jeff Layton
    Signed-off-by: Steve French

    Steve French
     
  • Commit 9226b5b440f2 ("vfs: avoid non-forwarding large load after small
    store in path lookup") made link_path_walk() always access the
    "hash_len" field as a single 64-bit entity, in order to avoid mixed size
    accesses to the members.

    However, what I didn't notice was that that effectively means that the
    whole "struct qstr this" is now basically redundant. We already
    explicitly track the "const char *name", and if we just use "u64
    hash_len" instead of "long len", there is nothing else left of the
    "struct qstr".

    We do end up wanting the "struct qstr" if we have a filesystem with a
    "d_hash()" function, but that's a rare case, and we might as well then
    just squirrel away the name and hash_len at that point.
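
    A sketch of the packing this relies on (helper names here are
    illustrative, not the kernel's exact macros): one u64 carries both
    the 32-bit hash and the length, and the full struct qstr only needs
    to be rebuilt for the rare ->d_hash() case.

    #include <stdint.h>

    static inline uint64_t hashlen_pack(uint32_t hash, uint32_t len)
    {
            return ((uint64_t)len << 32) | hash;
    }

    static inline uint32_t hashlen_hash(uint64_t hash_len)
    {
            return (uint32_t)hash_len;         /* low 32 bits  */
    }

    static inline uint32_t hashlen_len(uint64_t hash_len)
    {
            return (uint32_t)(hash_len >> 32); /* high 32 bits */
    }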

    End result: fewer live variables in the loop, a smaller stack frame, and
    better code generation. And we don't need to pass in pointer variables
    to helper functions any more, because the return value contains all the
    relevant information. So this removes more lines than it adds, and the
    source code is clearer too.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

15 Sep, 2014

6 commits

  • We were not checking for symlink support properly for SMB2/SMB3
    mounts, so we could oops when trying to create a symlink on an
    smb2/smb3 mount that uses mfsymlinks.

    Signed-off-by: Steve French
    Cc: # 3.14+
    CC: Sachin Prabhu

    Steve French
     
  • Pull vfs fixes from Al Viro:
    "double iput() on failure exit in lustre, racy removal of spliced
    dentries from ->s_anon in __d_materialise_dentry() plus a bunch of
    assorted RCU pathwalk fixes"

    The RCU pathwalk fixes end up fixing a couple of cases where we
    incorrectly dropped out of RCU walking, due to incorrect initialization
    and testing of the sequence locks in some corner cases. Since dropping
    out of RCU walk mode forces the slow locked accesses, those corner cases
    slowed down quite dramatically.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    be careful with nd->inode in path_init() and follow_dotdot_rcu()
    don't bugger nd->seq on set_root_rcu() from follow_dotdot_rcu()
    fix bogus read_seqretry() checks introduced in b37199e
    move the call of __d_drop(anon) into __d_materialise_unique(dentry, anon)
    [fix] lustre: d_make_root() does iput() on dentry allocation failure

    Linus Torvalds
     
  • The performance regression that Josef Bacik reported in the pathname
    lookup (see commit 99d263d4c5b2 "vfs: fix bad hashing of dentries") made
    me look at performance stability of the dcache code, just to verify that
    the problem was actually fixed. That turned up a few other problems in
    this area.

    There are a few cases where we exit RCU lookup mode and go to the slow
    serializing case when we shouldn't, Al has fixed those and they'll come
    in with the next VFS pull.

    But my performance verification also shows that link_path_walk() turns
    out to have a very unfortunate 32-bit store of the length and hash of
    the name we look up, followed by a 64-bit read of the combined hash_len
    field. That screws up the processor's store-to-load forwarding, causing
    an unnecessary hiccup in this critical routine.
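
    A compilable sketch of the hazard (illustrative only): two narrow
    stores immediately followed by a wide load of the same bytes cannot
    be store-forwarded on most CPUs, so the load stalls until the stores
    drain.

    #include <stdint.h>

    union hashlen {
            struct { uint32_t hash; uint32_t len; };
            uint64_t hash_len;
    };

    uint64_t slow(union hashlen *q, uint32_t h, uint32_t l)
    {
            q->hash = h;         /* 32-bit store                      */
            q->len  = l;         /* 32-bit store                      */
            return q->hash_len;  /* 64-bit load of the same bytes:
                                    forwarding fails, pipeline stalls */
    }

    uint64_t fast(union hashlen *q, uint32_t h, uint32_t l)
    {
            q->hash_len = ((uint64_t)l << 32) | h;  /* one 64-bit store */
            return q->hash_len;                     /* forwards cleanly */
    }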

    It's caused by the ugly calling convention for the "hash_name()"
    function, and easily fixed by just making hash_name() fill in the whole
    'struct qstr' rather than passing it a pointer to just the hash value.

    With that, the profile for this function looks much smoother.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • xfstest generic/258 sets the time on a file to a negative value
    (before 1970), which fails since do_div cannot handle negative
    numbers. In addition, 'normal' division of 64-bit values does not
    build on 32-bit arches, so we have to work around this by
    special-casing negative values in cifs_NTtimeToUnix.
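
    A self-contained sketch of the special-casing (the helper below
    stands in for the kernel's do_div(), which only divides unsigned
    values; the real cifs_NTtimeToUnix also subtracts the NT-to-Unix
    epoch offset, which is omitted here):

    #include <stdint.h>

    /* 64-by-32 unsigned division: quotient left in *n, remainder
     * returned -- the do_div() calling convention. */
    static uint32_t udiv64_rem(uint64_t *n, uint32_t base)
    {
            uint32_t rem = (uint32_t)(*n % base);
            *n /= base;
            return rem;
    }

    /* Convert a count of 100 ns ticks into seconds + nanoseconds,
     * handling pre-1970 (negative) values by dividing the magnitude
     * and restoring the sign afterwards. */
    void ticks_to_timespec(int64_t t, int64_t *sec, long *nsec)
    {
            if (t < 0) {
                    uint64_t abs_t = -(uint64_t)t;
                    *nsec = -(long)(udiv64_rem(&abs_t, 10000000) * 100);
                    *sec  = -(int64_t)abs_t;
            } else {
                    uint64_t u = (uint64_t)t;
                    *nsec = (long)(udiv64_rem(&u, 10000000) * 100);
                    *sec  = (int64_t)u;
            }
    }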

    The Samba server also has a bug with this (see Samba bugzilla 7771),
    but it works against a Windows server.

    Signed-off-by: Steve French

    Steve French
     
  • in the former we simply check if dentry is still valid after picking
    its ->d_inode; in the latter we fetch ->d_inode in the same places
    where we fetch dentry and its ->d_seq, under the same checks.

    Cc: stable@vger.kernel.org # 2.6.38+
    Signed-off-by: Al Viro

    Al Viro
     
  • return the value instead, and have path_init() do the assignment. Broken by
    "vfs: Fix absolute RCU path walk failures due to uninitialized seq number",
    which was Cc-stable with 2.6.38+ as destination. This one should go where
    it went.

    To avoid dummy value returned in case when root is already set (it would do
    no harm, actually, since the only caller that doesn't ignore the return value
    is guaranteed to have nd->root *not* set, but it's more obvious that way),
    lift the check into callers. And do the same to set_root(), to keep them
    in sync.

    Cc: stable@vger.kernel.org # 2.6.38+
    Signed-off-by: Al Viro

    Al Viro
     

14 Sep, 2014

3 commits

  • read_seqretry() returns true on mismatch, not on match...

    Cc: stable@vger.kernel.org # 3.15+
    Signed-off-by: Al Viro

    Al Viro
     
  • and lock the right list there

    Cc: stable@vger.kernel.org
    Signed-off-by: Al Viro

    Al Viro
     
  • Josef Bacik found a performance regression between 3.2 and 3.10 and
    narrowed it down to commit bfcfaa77bdf0 ("vfs: use 'unsigned long'
    accesses for dcache name comparison and hashing"). He reports:

    "The test case is essentially

    for (i = 0; i < 1000000; i++)
    mkdir("a$i");

    On xfs on a fio card this goes at about 20k dir/sec with 3.2, and 12k
    dir/sec with 3.10. This is because we spend waaaaay more time in
    __d_lookup on 3.10 than in 3.2.

    The new hashing function for strings is suboptimal for <
    sizeof(unsigned long) string names (and hell even > sizeof(unsigned
    long) string names that I've tested). I broke out the old hashing
    function and the new one into a userspace helper to get real numbers
    and this is what I'm getting:

    Old hash table had 1000000 entries, 0 dupes, 0 max dupes
    New hash table had 12628 entries, 987372 dupes, 900 max dupes
    We had 11400 buckets with a p50 of 30 dupes, p90 of 240 dupes, p99 of 567 dupes for the new hash

    My test does the hash, and then does the d_hash into a integer pointer
    array the same size as the dentry hash table on my system, and then
    just increments the value at the address we got to see how many
    entries we overlap with.

    As you can see the old hash function ended up with all 1 million
    entries in their own bucket, whereas the new one they are only
    distributed among ~12.5k buckets, which is why we're using so much
    more CPU in __d_lookup".

    The reason for this hash regression is two-fold:

    - On 64-bit architectures the down-mixing of the original 64-bit
    word-at-a-time hash into the final 32-bit hash value is very
    simplistic and suboptimal, and just adds the two 32-bit parts
    together.

    In particular, because there is no bit shuffling and the mixing
    boundary is also a byte boundary, similar character patterns in the
    low and high word easily end up just canceling each other out.

    - the old byte-at-a-time hash mixed each byte into the final hash as it
    hashed the path component name, resulting in the low bits of the hash
    generally being a good source of hash data. That is not true for the
    word-at-a-time case, and the hash data is distributed among all the
    bits.

    The fix is the same in both cases: do a better job of mixing the bits up
    and using as much of the hash data as possible. We already have the
    "hash_32|64()" functions to do that.

    Reported-by: Josef Bacik
    Cc: Al Viro
    Cc: Christoph Hellwig
    Cc: Chris Mason
    Cc: linux-fsdevel@vger.kernel.org
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

13 Sep, 2014

4 commits

  • Callers of d_splice_alias(inode, dentry) don't need iput(), either on
    success or on failure. Either the reference to inode is stored in a
    previously negative dentry, or it's dropped. In either case the inode
    reference the caller used to hold is consumed.
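
    In code form, the contract for callers looks like this (a sketch;
    only the point about the missing iput() is shown):

    struct dentry *d = d_splice_alias(inode, dentry);

    if (IS_ERR(d))
            return PTR_ERR(d);  /* no iput(inode) here: the reference
                                   was consumed either way */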

    __gfs2_lookup() does iput() in the case where d_splice_alias() has
    failed. Double iput() if we ever hit that. And gfs2_create_inode() ends
    up not only with a double iput(), but with the link count dropped to
    zero - on an inode it has just found in the directory.

    Cc: stable@vger.kernel.org # v3.14+
    Signed-off-by: Al Viro
    Signed-off-by: Steven Whitehouse

    Al Viro
     
  • Pull NFS client fixes from Trond Myklebust:
    "Highlights:
    - fix a kernel warning when removing /proc/net/nfsfs
    - revert commit 49a4bda22e18 due to Oopses
    - fix a typo in the pNFS file layout commit code"

    * tag 'nfs-for-3.17-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
    pnfs: fix filelayout_retry_commit when idx > 0
    nfs: revert "nfs4: queue free_lock_state job submission to nfsiod"
    nfs: fix kernel warning when removing proc entry

    Linus Torvalds
     
  • Pull btrfs fixes from Chris Mason:
    "Filipe is doing a careful pass through fsync problems, and these are
    the fixes so far. I'll have one more for rc6 that we're still
    testing.

    My big commit is fixing up some inode hash races that Al Viro found
    (thanks Al)"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
    Btrfs: use insert_inode_locked4 for inode creation
    Btrfs: fix fsync data loss after a ranged fsync
    Btrfs: kfree()ing ERR_PTRs
    Btrfs: fix crash while doing a ranged fsync
    Btrfs: fix corruption after write/fsync failure + fsync + log recovery
    Btrfs: fix autodefrag with compression

    Linus Torvalds
     
  • Commit 4fa2c54b5198 ("NFS: nfs4_do_open should add negative results
    to the dcache") used "d_drop(); d_add();" to ensure that a dentry was
    hashed as a negative cached entry.

    This is not safe if the dentry has a non-NULL ->d_inode, as it will
    trigger a BUG_ON in d_instantiate(). In that case, d_delete() is
    needed.

    Also, only d_add if the dentry is currently unhashed; it seems
    pointless to remove and re-add it unchanged.
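
    A sketch of the two adjustments described above (paraphrase, not the
    exact patch):

    if (dentry->d_inode)
            d_delete(dentry);    /* drop the inode safely: makes the
                                    dentry negative or unhashes it */
    if (d_unhashed(dentry))
            d_add(dentry, NULL); /* hash it as a negative entry, but
                                    only if needed */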

    Reported-by: Christoph Hellwig
    Fixes: 4fa2c54b5198d09607a534e2fd436581064587ed
    Cc: Jeff Layton
    Link: http://lkml.kernel.org/r/20140908144525.GB19811@infradead.org
    Signed-off-by: NeilBrown
    Acked-by: Jeff Layton
    Signed-off-by: Trond Myklebust

    NeilBrown
     

11 Sep, 2014

3 commits

  • MAXQUOTAS value defines maximum number of quota types VFS supports.
    This isn't necessarily the number of types gfs2 supports, and with the
    addition of project quotas these two numbers stop matching. So make gfs2
    use its private definition.

    CC: cluster-devel@redhat.com
    Signed-off-by: Jan Kara
    Signed-off-by: Steven Whitehouse

    Jan Kara
     
  • Fix a regression introduced by commit 6d4ade986f9c8df31e68 ("GFS2:
    Add atomic_open support"), where an early return misses the
    d_splice_alias() call which had been adding the negative dentry.

    Signed-off-by: Benjamin Coddington
    Signed-off-by: Bob Peterson
    Signed-off-by: Steven Whitehouse

    Benjamin Coddington
     
  • Merge misc fixes from Andrew Morton:
    "10 fixes"

    * emailed patches from Andrew Morton :
    fs/notify: don't show f_handle if exportfs_encode_inode_fh failed
    fsnotify/fdinfo: use named constants instead of hardcoded values
    kcmp: fix standard comparison bug
    mm/mmap.c: use pr_emerg when printing BUG related information
    shm: add memfd.h to UAPI export list
    checkpatch: allow commit descriptions on separate line from commit id
    sh: get_user_pages_fast() must flush cache
    eventpoll: fix uninitialized variable in epoll_ctl
    kernel/printk/printk.c: fix faulty logic in the case of recursive printk
    mem-hotplug: let memblock skip the hotpluggable memory regions in __next_mem_range()

    Linus Torvalds