27 Nov, 2016

1 commit

  • Botched calculation of number of pages. As a result,
    we were dropping pieces when doing splice to pipe from
    e.g. 9p.

    Reported-by: Alexei Starovoitov
    Tested-by: Alexei Starovoitov
    Signed-off-by: Al Viro

    Al Viro
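
    For illustration only: counting the pages a byte range touches is a
    common place for this kind of slip. The sketch below is not the actual
    patch; pages_spanned() and SKETCH_PAGE_SIZE are hypothetical names, and
    a 4096-byte page is assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page size for the sketch; the kernel uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Number of pages touched by `len` bytes that start `offset` bytes into
 * the first page. Rounding up (offset + len), rather than len alone,
 * accounts for a range that straddles a page boundary.
 */
size_t pages_spanned(size_t offset, size_t len)
{
    assert(offset < SKETCH_PAGE_SIZE);
    if (len == 0)
        return 0;
    return (offset + len + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}
```

    Rounding up len alone would report one page for a 4096-byte range that
    starts one byte into a page, although it touches two.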
     

24 Nov, 2016

1 commit

  • Pull NFS client bugfixes from Anna Schumaker:
    "Most of these fix regressions or races, but there is one patch for
    stable that Arnd sent me

    Stable bugfix:
    - Hide array-bounds warning

    Bugfixes:
    - Keep a reference on lock states while checking
    - Handle NFS4ERR_OLD_STATEID in nfs4_reclaim_open_state
    - Don't call close if the open stateid has already been cleared
    - Fix CLOSE races with OPEN
    - Fix a regression in DELEGRETURN"

    * tag 'nfs-for-4.9-4' of git://git.linux-nfs.org/projects/anna/linux-nfs:
    NFSv4.x: hide array-bounds warning
    NFSv4.1: Keep a reference on lock states while checking
    NFSv4.1: Handle NFS4ERR_OLD_STATEID in nfs4_reclaim_open_state
    NFSv4: Don't call close if the open stateid has already been cleared
    NFSv4: Fix CLOSE races with OPEN
    NFSv4.1: Fix a regression in DELEGRETURN

    Linus Torvalds
     

23 Nov, 2016

1 commit

  • A correct bugfix introduced a harmless warning that shows up with gcc-7:

    fs/nfs/callback.c: In function 'nfs_callback_up':
    fs/nfs/callback.c:214:14: error: array subscript is outside array bounds [-Werror=array-bounds]

    What happens here is that the 'minorversion == 0' check tells the
    compiler that we assume minorversion can be something other than 0,
    but when CONFIG_NFS_V4_1 is disabled that would be invalid and
    result in an out-of-bounds access.

    The added check for IS_ENABLED(CONFIG_NFS_V4_1) tells gcc that this
    really can't happen, which makes the code slightly smaller and also
    avoids the warning.

    The bugfix that introduced the warning is marked for stable backports,
    so we want this one backported to the same releases.

    Fixes: 98b0f80c2396 ("NFSv4.x: Fix a refcount leak in nfs_callback_up_net")
    Cc: stable@vger.kernel.org # v3.7+
    Signed-off-by: Arnd Bergmann
    Signed-off-by: Anna Schumaker

    Arnd Bergmann
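
    A userspace sketch of the pattern described above, not the kernel code
    itself: SKETCH_NFS_V4_1 stands in for CONFIG_NFS_V4_1, and the array and
    function names are hypothetical. Testing the compile-time constant first
    lets the compiler discard the branch that would index past the end of a
    one-element array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for CONFIG_NFS_V4_1; 0 means "disabled". */
#define SKETCH_NFS_V4_1 0

/* One slot for minor version 0, plus one more only when v4.1 is built in. */
#define SKETCH_NR_VERSIONS (1 + SKETCH_NFS_V4_1)

const char *const sketch_version_names[SKETCH_NR_VERSIONS] = {
    "v4.0",
#if SKETCH_NFS_V4_1
    "v4.1",
#endif
};

/*
 * Returns the name for a minor version, or NULL if it is not built in.
 * With SKETCH_NFS_V4_1 set to 0 the second test is always true, so the
 * final array access is dead code and index 1 is never used.
 */
const char *sketch_version_name(unsigned int minorversion)
{
    if (minorversion == 0)
        return sketch_version_names[0];
    if (!SKETCH_NFS_V4_1 || minorversion != 1)
        return NULL;
    return sketch_version_names[minorversion];
}
```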
     

22 Nov, 2016

1 commit


20 Nov, 2016

4 commits

  • Pull ext4 fixes from Ted Ts'o:
    "A security fix (so a maliciously corrupted file system image won't
    panic the kernel) and some fixes for CONFIG_VMAP_STACK"

    * tag 'ext4_for_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
    ext4: sanity check the block and cluster size at mount time
    fscrypto: don't use on-stack buffer for key derivation
    fscrypto: don't use on-stack buffer for filename encryption

    Linus Torvalds
     
  • If the block size or cluster size is insane, reject the mount. This
    is important for security reasons (although we shouldn't be just
    depending on this check).

    Ref: http://www.securityfocus.com/archive/1/539661
    Ref: https://bugzilla.redhat.com/show_bug.cgi?id=1332506
    Reported-by: Borislav Petkov
    Reported-by: Nikolay Borisov
    Signed-off-by: Theodore Ts'o
    Cc: stable@vger.kernel.org

    Theodore Ts'o
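
    A minimal sketch of a mount-time sanity check of this kind. The limits,
    the field encoding (log2 sizes stored as an offset from a 1 KiB minimum)
    and every name below are assumptions made for the example, not a copy of
    the ext4 code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed limits for the sketch (log2 of sizes in bytes). */
#define SKETCH_MIN_BLOCK_LOG    10  /* 1 KiB */
#define SKETCH_MAX_BLOCK_LOG    16  /* 64 KiB */
#define SKETCH_MAX_CLUSTER_LOG  30  /* 1 GiB */

/*
 * Both arguments are assumed to be on-disk values stored as an offset
 * from SKETCH_MIN_BLOCK_LOG. Reject values that would produce an
 * oversized shift, and reject clusters smaller than a block.
 */
bool sketch_sizes_are_sane(uint32_t log_block_size, uint32_t log_cluster_size)
{
    if (log_block_size > SKETCH_MAX_BLOCK_LOG - SKETCH_MIN_BLOCK_LOG)
        return false;
    if (log_cluster_size > SKETCH_MAX_CLUSTER_LOG - SKETCH_MIN_BLOCK_LOG)
        return false;
    return log_cluster_size >= log_block_size;
}
```

    A caller would refuse the mount whenever this returns false, before any
    size derived from the two fields is used.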
     
  • With the new (in 4.9) option to use a virtually-mapped stack
    (CONFIG_VMAP_STACK), stack buffers cannot be used as input/output for
    the scatterlist crypto API because they may not be directly mappable to
    struct page. get_crypt_info() was using a stack buffer to hold the
    output from the encryption operation used to derive the per-file key.
    Fix it by using a heap buffer.

    This bug could most easily be observed in a CONFIG_DEBUG_SG kernel
    because this allowed the BUG in sg_set_buf() to be triggered.

    Cc: stable@vger.kernel.org
    Signed-off-by: Eric Biggers
    Signed-off-by: Theodore Ts'o

    Eric Biggers
     
  • With the new (in 4.9) option to use a virtually-mapped stack
    (CONFIG_VMAP_STACK), stack buffers cannot be used as input/output for
    the scatterlist crypto API because they may not be directly mappable to
    struct page. For short filenames, fname_encrypt() was encrypting a
    stack buffer holding the padded filename. Fix it by encrypting the
    filename in-place in the output buffer, thereby making the temporary
    buffer unnecessary.

    This bug could most easily be observed in a CONFIG_DEBUG_SG kernel
    because this allowed the BUG in sg_set_buf() to be triggered.

    Cc: stable@vger.kernel.org
    Signed-off-by: Eric Biggers
    Signed-off-by: Theodore Ts'o

    Eric Biggers
     

19 Nov, 2016

4 commits

  • Now that we're doing TEST_STATEID in nfs4_reclaim_open_state(), we can have
    an NFS4ERR_OLD_STATEID returned from nfs41_open_expired(). Instead of
    marking state recovery as failed, mark the state for recovery again.

    Signed-off-by: Benjamin Coddington
    Signed-off-by: Anna Schumaker

    Benjamin Coddington
     
  • Ensure we test to see if the open stateid is actually set, before we
    send a CLOSE.

    Signed-off-by: Trond Myklebust
    Signed-off-by: Anna Schumaker

    Trond Myklebust
     
  • If the reply to a successful CLOSE call races with an OPEN to the same
    file, we can end up scribbling over the stateid that represents the
    new open state.
    The race looks like:

    Client                          Server
    ======                          ======

    CLOSE stateid A on file "foo"
                                    CLOSE stateid A, return stateid C
    OPEN file "foo"
                                    OPEN "foo", return stateid B
    Receive reply to OPEN
    Reset open state for "foo"
    Associate stateid B to "foo"

    Receive CLOSE for A
    Reset open state for "foo"
    Replace stateid B with C

    The fix is to examine the argument of the CLOSE, and check for a match
    with the current stateid "other" field. If the two do not match, then
    the above race occurred, and we should just ignore the CLOSE.

    Reported-by: Benjamin Coddington
    Signed-off-by: Trond Myklebust
    Signed-off-by: Anna Schumaker

    Trond Myklebust
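
    The check described above can be sketched in userspace as follows. The
    struct and function names are hypothetical; only the idea of comparing
    the "other" field of the CLOSE argument with the current stateid comes
    from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_OTHER_SIZE 12

/* An NFSv4 stateid: a sequence number plus an opaque "other" field. */
struct sketch_stateid {
    uint32_t seqid;
    unsigned char other[SKETCH_OTHER_SIZE];
};

/* True when both stateids name the same state, whatever their seqid. */
bool sketch_match_other(const struct sketch_stateid *a,
                        const struct sketch_stateid *b)
{
    return memcmp(a->other, b->other, SKETCH_OTHER_SIZE) == 0;
}

/*
 * Apply a CLOSE reply only if the stateid sent in the CLOSE still names
 * the current open state. Returns true if the current stateid was
 * replaced, false if the reply was ignored as stale.
 */
bool sketch_apply_close_reply(struct sketch_stateid *cur,
                              const struct sketch_stateid *close_arg,
                              const struct sketch_stateid *reply)
{
    if (!sketch_match_other(cur, close_arg))
        return false;   /* an OPEN won the race; keep its stateid */
    *cur = *reply;
    return true;
}
```

    In the race shown earlier the current stateid is already B when the
    reply to the CLOSE of A arrives, so the comparison fails and B is kept.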
     
  • We don't want to call nfs4_free_revoked_stateid() in the case where
    the delegreturn was successful.

    Reported-by: Benjamin Coddington
    Signed-off-by: Trond Myklebust
    Signed-off-by: Anna Schumaker

    Trond Myklebust
     

18 Nov, 2016

2 commits


17 Nov, 2016

3 commits

  • The IOP_XATTR flag is set on sockfs because sockfs supports getting the
    "system.sockprotoname" xattr. Since commit 6c6ef9f2, this flag is checked for
    setxattr support as well. This is wrong on sockfs because security xattr
    support there is supposed to be provided by security_inode_setsecurity. The
    smack security module relies on socket labels (xattrs).

    Fix this by adding a security xattr handler on sockfs that returns
    -EAGAIN, and by checking for -EAGAIN in setxattr.

    We cannot simply check for -EOPNOTSUPP in setxattr because there are
    filesystems that neither have direct security xattr support nor support
    via security_inode_setsecurity. A more proper fix might be to move the
    call to security_inode_setsecurity into sockfs, but it's not clear to me
    if that is safe: we would end up calling security_inode_post_setxattr after
    that as well.

    Signed-off-by: Andreas Gruenbacher
    Signed-off-by: Al Viro

    Andreas Gruenbacher
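
    A simplified userspace model of the fallback described above. All names
    are hypothetical and the real VFS code carries more state; the point is
    only that an explicit -EAGAIN from the handler, and nothing else,
    redirects the call to the security-module hook.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical handler type: returns 0 or a negative errno. */
typedef int (*sketch_set_fn)(const char *name, const char *value);

/* Handler that exists only to say "ask the security module instead". */
int sketch_sockfs_security_set(const char *name, const char *value)
{
    (void)name;
    (void)value;
    return -EAGAIN;
}

/* Stand-in for the security-module hook; counts how often it ran. */
int sketch_lsm_calls;

int sketch_lsm_set(const char *name, const char *value)
{
    (void)name;
    (void)value;
    sketch_lsm_calls++;
    return 0;
}

/*
 * Try the filesystem handler first. Only -EAGAIN triggers the fallback;
 * -EOPNOTSUPP and other errors are passed through unchanged.
 */
int sketch_setxattr(sketch_set_fn handler, const char *name, const char *value)
{
    int err = handler ? handler(name, value) : -EOPNOTSUPP;

    if (err == -EAGAIN)
        err = sketch_lsm_set(name, value);
    return err;
}
```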
     
  • Pull fuse fixes from Miklos Szeredi:
    "A regression fix and bug fix bound for stable"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
    fuse: fix fuse_write_end() if zero bytes were copied
    fuse: fix root dentry initialization

    Linus Torvalds
     
  • Without ".owner = THIS_MODULE" it is possible to crash the kernel
    by unloading the Orangefs module while someone is reading debugfs
    files.

    Signed-off-by: Mike Marshall

    Mike Marshall
     

15 Nov, 2016

1 commit

  • If pos is at the beginning of a page and copied is zero then page is not
    zeroed but is marked uptodate.

    Fix by skipping everything except unlock/put of page if zero bytes were
    copied.

    Reported-by: Al Viro
    Fixes: 6b12c1b37e55 ("fuse: Implement write_begin/write_end callbacks")
    Cc: # v3.15+
    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
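
    A toy model of the behaviour described above, with hypothetical names
    and an assumed 4096-byte page. It assumes the head of the page was
    already dealt with by an earlier write_begin step; it is not the fuse
    code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical stand-in for the page state that matters here. */
struct sketch_page {
    bool uptodate;
    bool locked;
    unsigned char data[SKETCH_PAGE_SIZE];
};

/* `pos` is the file position of the write, `copied` the bytes that arrived. */
size_t sketch_write_end(struct sketch_page *page, size_t pos, size_t copied)
{
    assert(pos % SKETCH_PAGE_SIZE + copied <= SKETCH_PAGE_SIZE);

    /* Nothing arrived: leave the page state alone, just release it. */
    if (copied == 0)
        goto unlock;

    if (!page->uptodate) {
        /* Zero the unwritten tail so no stale bytes are exposed. */
        size_t endoff = (pos + copied) % SKETCH_PAGE_SIZE;

        if (endoff)
            memset(page->data + endoff, 0, SKETCH_PAGE_SIZE - endoff);
        page->uptodate = true;
    }

unlock:
    page->locked = false;
    return copied;
}
```

    Without the early exit, a call with pos at a page boundary and copied
    equal to zero computes endoff as 0, skips the zeroing, and still marks
    the page uptodate.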
     

12 Nov, 2016

7 commits

  • Merge misc fixes from Andrew Morton:
    "15 fixes"

    * emailed patches from Andrew Morton:
    lib/stackdepot: export save/fetch stack for drivers
    mm: kmemleak: scan .data.ro_after_init
    memcg: prevent memcg caches to be both OFF_SLAB & OBJFREELIST_SLAB
    coredump: fix unfreezable coredumping task
    mm/filemap: don't allow partially uptodate page for pipes
    mm/hugetlb: fix huge page reservation leak in private mapping error paths
    ocfs2: fix not enough credit panic
    Revert "console: don't prefer first registered if DT specifies stdout-path"
    mm: hwpoison: fix thp split handling in memory_failure()
    swapfile: fix memory corruption via malformed swapfile
    mm/cma.c: check the max limit for cma allocation
    scripts/bloat-o-meter: fix SIGPIPE
    shmem: fix pageflags after swapping DMA32 object
    mm, frontswap: make sure allocated frontswap map is assigned
    mm: remove extra newline from allocation stall warning

    Linus Torvalds
     
  • Pull VFS fixes from Al Viro:
    "Christoph's and Jan's aio fixes, fixup for generic_file_splice_read
    (removal of pointless detritus that actually breaks it when used for
    gfs2 ->splice_read()) and fixup for generic_file_read_iter()
    interaction with ITER_PIPE destinations."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    splice: remove detritus from generic_file_splice_read()
    mm/filemap: don't allow partially uptodate page for pipes
    aio: fix freeze protection of aio writes
    fs: remove aio_run_iocb
    fs: remove the never implemented aio_fsync file operation
    aio: hold an extra file reference over AIO read/write operations

    Linus Torvalds
     
  • Pull Ceph fixes from Ilya Dryomov:
    "Ceph's ->read_iter() implementation is incompatible with the new
    generic_file_splice_read() code that went into -rc1. Switch to the
    less efficient default_file_splice_read() for now; the proper fix is
    being held for 4.10.

    We also have a fix for a 4.8 regression and a trivial libceph fixup"

    * tag 'ceph-for-4.9-rc5' of git://github.com/ceph/ceph-client:
    libceph: initialize last_linger_id with a large integer
    libceph: fix legacy layout decode with pool 0
    ceph: use default file splice read callback

    Linus Torvalds
     
  • Pull NFS client bugfixes from Anna Schumaker:
    "Most of these fix regressions in 4.9, and none are going to stable
    this time around.

    Bugfixes:
    - Trim extra slashes in v4 nfs_paths to fix tools that use this
    - Fix -Wmaybe-uninitialized warnings
    - Fix suspicious RCU usages
    - Fix Oops when mounting multiple servers at once
    - Suppress a false-positive pNFS error
    - Fix a DMAR failure in NFS over RDMA"

    * tag 'nfs-for-4.9-3' of git://git.linux-nfs.org/projects/anna/linux-nfs:
    xprtrdma: Fix DMAR failure in frwr_op_map() after reconnect
    fs/nfs: Fix used uninitialized warn in nfs4_slot_seqid_in_use()
    NFS: Don't print a pNFS error if we aren't using pNFS
    NFS: Ignore connections that have cl_rpcclient uninitialized
    SUNRPC: Fix suspicious RCU usage
    NFSv4.1: work around -Wmaybe-uninitialized warning
    NFS: Trim extra slash in v4 nfs_path

    Linus Torvalds
     
  • …rnel/git/dgc/linux-xfs

    Pull xfs fix from Dave Chinner:
    "This is a fix for an unmount hang (regression) when the filesystem is
    shutdown. It was supposed to go to you for -rc3, but I accidentally
    tagged the commit prior to it in that pullreq.

    Summary:

    - fix for aborting deferred transactions on filesystem shutdown"

    * tag 'xfs-fixes-for-linus-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs:
    xfs: defer should abort intent items if the trans roll fails

    Linus Torvalds
     
  • It may be impossible to freeze a coredumping task while it waits for
    the 'core_state->startup' completion, because threads are frozen in
    get_signal() before they get a chance to complete 'core_state->startup'.

    Inability to freeze a task during suspend will cause suspend to fail.
    CRIU also uses the cgroup freezer during its dump operation, so with an
    unfreezable task the CRIU dump will fail because it waits for a
    transition from 'FREEZING' to 'FROZEN' state which will never happen.

    Use freezer_do_not_count() to tell the freezer to ignore the coredumping
    task while it waits for the core_state->startup completion.

    Link: http://lkml.kernel.org/r/1475225434-3753-1-git-send-email-aryabinin@virtuozzo.com
    Signed-off-by: Andrey Ryabinin
    Acked-by: Pavel Machek
    Acked-by: Oleg Nesterov
    Cc: Alexander Viro
    Cc: Tejun Heo
    Cc: "Rafael J. Wysocki"
    Cc: Michal Hocko
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrey Ryabinin
     
  • The following panic was caught when running the ocfs2 disconfig single
    test (block size 512 and cluster size 8192). ocfs2_journal_dirty()
    returned -ENOSPC, which means the credits were used up.

    The total credit should include three times "num_dx_leaves" from
    ocfs2_dx_dir_rebalance(), because two times are consumed in
    ocfs2_dx_dir_transfer_leaf() and one more is consumed in
    ocfs2_dx_dir_new_cluster() -> __ocfs2_dx_dir_new_cluster() ->
    ocfs2_dx_dir_format_cluster(). But only two times are included in
    ocfs2_dx_dir_rebalance_credits(); fix it.

    This can cause a read-only fs (v4.1+) or a panic on mainline Linux,
    depending on the mount option.

    ------------[ cut here ]------------
    kernel BUG at fs/ocfs2/journal.c:775!
    invalid opcode: 0000 [#1] SMP
    Modules linked in: ocfs2 nfsd lockd grace nfs_acl auth_rpcgss sunrpc autofs4 ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs sd_mod sg ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ppdev xen_kbdfront xen_netfront fb_sys_fops sysimgblt sysfillrect syscopyarea parport_pc parport acpi_cpufreq i2c_piix4 i2c_core pcspkr ext4 jbd2 mbcache xen_blkfront floppy pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod
    CPU: 2 PID: 10601 Comm: dd Not tainted 4.1.12-71.el6uek.bug24939243.x86_64 #2
    Hardware name: Xen HVM domU, BIOS 4.4.4OVM 02/11/2016
    task: ffff8800b6de6200 ti: ffff8800a7d48000 task.ti: ffff8800a7d48000
    RIP: ocfs2_journal_dirty+0xa7/0xb0 [ocfs2]
    RSP: 0018:ffff8800a7d4b6d8 EFLAGS: 00010286
    RAX: 00000000ffffffe4 RBX: 00000000814d0a9c RCX: 00000000000004f9
    RDX: ffffffffa008e990 RSI: ffffffffa008f1ee RDI: ffff8800622b6460
    RBP: ffff8800a7d4b6f8 R08: ffffffffa008f288 R09: ffff8800622b6460
    R10: 0000000000000000 R11: 0000000000000282 R12: 0000000002c8421e
    R13: ffff88006d0cad00 R14: ffff880092beef60 R15: 0000000000000070
    FS: 00007f9b83e92700(0000) GS:ffff8800be880000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007fb2c0d1a000 CR3: 0000000008f80000 CR4: 00000000000406e0
    Call Trace:
    ocfs2_dx_dir_transfer_leaf+0x159/0x1a0 [ocfs2]
    ocfs2_dx_dir_rebalance+0xd9b/0xea0 [ocfs2]
    ocfs2_find_dir_space_dx+0xd3/0x300 [ocfs2]
    ocfs2_prepare_dx_dir_for_insert+0x219/0x450 [ocfs2]
    ocfs2_prepare_dir_for_insert+0x1d6/0x580 [ocfs2]
    ocfs2_mknod+0x5a2/0x1400 [ocfs2]
    ocfs2_create+0x73/0x180 [ocfs2]
    vfs_create+0xd8/0x100
    lookup_open+0x185/0x1c0
    do_last+0x36d/0x780
    path_openat+0x92/0x470
    do_filp_open+0x4a/0xa0
    do_sys_open+0x11a/0x230
    SyS_open+0x1e/0x20
    system_call_fastpath+0x12/0x71
    Code: 1d 3f 29 09 00 48 85 db 74 1f 48 8b 03 0f 1f 80 00 00 00 00 48 8b 7b 08 48 83 c3 10 4c 89 e6 ff d0 48 8b 03 48 85 c0 75 eb eb 90 0b eb fe 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54
    RIP ocfs2_journal_dirty+0xa7/0xb0 [ocfs2]
    ---[ end trace 91ac5312a6ee1288 ]---
    Kernel panic - not syncing: Fatal exception
    Kernel Offset: disabled

    Link: http://lkml.kernel.org/r/1478248135-31963-1-git-send-email-junxiao.bi@oracle.com
    Signed-off-by: Junxiao Bi
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Joseph Qi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Junxiao Bi
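
    The shortfall can be worked through with a small sketch. It assumes, for
    illustration, one index leaf per block in a cluster and looks only at
    the leaf-related part of the credit total; the function names are
    hypothetical.

```c
#include <assert.h>

/*
 * Leaves per cluster, assuming (for this sketch) one index leaf per
 * block, so a cluster holds cluster_size / block_size leaves.
 */
unsigned int sketch_num_dx_leaves(unsigned int block_size,
                                  unsigned int cluster_size)
{
    assert(block_size != 0 && cluster_size % block_size == 0);
    return cluster_size / block_size;
}

/* Leaf-related journal credits: `passes` full passes over the leaves. */
unsigned int sketch_leaf_credits(unsigned int num_dx_leaves,
                                 unsigned int passes)
{
    return passes * num_dx_leaves;
}
```

    Under that assumption, the reported block size of 512 and cluster size
    of 8192 give 16 leaves, so two passes reserve 32 credits where three
    passes consume 48.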
     

11 Nov, 2016

2 commits

  • i_size check is a leftover from the horrors that used to play with
    the page cache in that function. With the switch to ->read_iter(),
    it's neither needed nor correct - for gfs2 it ends up being buggy,
    since i_size is not guaranteed to be correct until later (inside
    ->read_iter()).

    Spotted-by: Abhi Das
    Signed-off-by: Al Viro

    Al Viro
     
  • Splice read/write implementation changed recently. When using
    generic_file_splice_read(), iov_iter with type == ITER_PIPE is
    passed to filesystem's read_iter callback. But ceph_sync_read()
    can't serve ITER_PIPE iov_iter correctly (ITER_PIPE iov_iter
    expects pages from page cache).

    Fixing ceph_sync_read() requires a big patch. So use default
    splice read callback for now.

    Signed-off-by: Yan, Zheng
    Signed-off-by: Ilya Dryomov

    Yan, Zheng
     

10 Nov, 2016

1 commit

  • Pull orangefs fix from Mike Marshall:
    "We recently refactored the Orangefs debugfs code. The refactor seemed
    to trigger dan.carpenter@oracle.com's static tester to find a possible
    double-free in the code.

    While designing the fix we saw a condition under which the buffer
    being freed could also be overflowed.

    We also realized how to rebuild the related debugfs file's "contents"
    (a string) without deleting and re-creating the file.

    This fix should eliminate the possible double-free, the potential
    overflow and improve code readability"

    * tag 'for-linus-4.9-rc4-ofs-1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
    orangefs: clean up debugfs

    Linus Torvalds
     

08 Nov, 2016

3 commits

  • Fix the following warning:

    fs/nfs/nfs4session.c: In function ‘nfs4_slot_seqid_in_use’:
    fs/nfs/nfs4session.c:203:54: warning: ‘cur_seq’ may be used uninitialized in this function [-Wmaybe-uninitialized]
    if (nfs4_slot_get_seqid(tbl, slotid, &cur_seq) == 0 &&
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
    cur_seq == seq_nr && test_bit(slotid, tbl->used_slots))
    ~~~~~~~~~~~~~~~~~

    Signed-off-by: Shuah Khan
    Signed-off-by: Anna Schumaker

    Shuah Khan
     
  • We used to check for a valid layout type id before verifying pNFS flags
    as an indicator of whether we are using pNFS. This changed in 3132e49ece
    with the introduction of multiple layout types, since now we are passing
    an array of ids instead of just one. Since then, users have been seeing
    a KERN_ERR printk show up whenever mounting NFS v4 without pNFS. This
    patch restores the original behavior of exiting set_pnfs_layoutdriver()
    early if we aren't using pNFS.

    Fixes: 3132e49ece ("pnfs: track multiple layout types in fsinfo structure")
    Reviewed-by: Jeff Layton
    Signed-off-by: Anna Schumaker

    Anna Schumaker
     
  • cl_rpcclient starts as ERR_PTR(-EINVAL), and connections like that
    are floating freely through the system. Most places check whether
    the pointer is valid before dereferencing it, but newly added code
    in nfs_match_client() does not.

    This causes crashes when more than one NFS mount point is present.

    Signed-off-by: Petr Vandrovec
    Signed-off-by: Anna Schumaker

    Petr Vandrovec
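
    A userspace imitation of the error-pointer convention involved here,
    with hypothetical sketch_* names throughout. It shows a lookup that
    skips entries whose client pointer still holds an encoded error instead
    of dereferencing it; the real nfs_match_client() compares far more
    fields.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Error codes are encoded in the topmost 4095 pointer values. */
#define SKETCH_MAX_ERRNO 4095

void *sketch_err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

bool sketch_is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Hypothetical, heavily reduced client record. */
struct sketch_client {
    void *rpcclient;    /* starts out as an error pointer */
    int id;
};

/* Find a client by id, skipping any whose rpcclient is not set up yet. */
struct sketch_client *sketch_match_client(struct sketch_client *clients,
                                          size_t n, int id)
{
    for (size_t i = 0; i < n; i++) {
        if (sketch_is_err(clients[i].rpcclient))
            continue;   /* not initialized; never dereference */
        if (clients[i].id == id)
            return &clients[i];
    }
    return NULL;
}
```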
     

07 Nov, 2016

1 commit

  • We recently refactored the Orangefs debugfs code.
    The refactor seemed to trigger dan.carpenter@oracle.com's
    static tester to find a possible double-free in the code.

    While designing the fix we saw a condition under which the
    buffer being freed could also be overflowed.

    We also realized how to rebuild the related debugfs file's
    "contents" (a string) without deleting and re-creating the file.

    This fix should eliminate the possible double-free, the
    potential overflow and improve code readability.

    Signed-off-by: Mike Marshall
    Signed-off-by: Martin Brandenburg

    Mike Marshall
     

05 Nov, 2016

3 commits

  • Pull nfsd bugfixes from Bruce Fields:
    "Fixes for some recent regressions including fallout from the vmalloc'd
    stack change (after which we can no longer encrypt stuff on the
    stack)"

    * tag 'nfsd-4.9-1' of git://linux-nfs.org/~bfields/linux:
    nfsd: Fix general protection fault in release_lock_stateid()
    svcrdma: backchannel cannot share a page for send and rcv buffers
    sunrpc: fix some missing rq_rbuffer assignments
    sunrpc: don't pass on-stack memory to sg_set_buf
    nfsd: move blocked lock handling under a dedicated spinlock

    Linus Torvalds
     
  • Pull btrfs fixes from Chris Mason:
    "Some fixes that Dave Sterba collected. We held off on these last week
    because I was focused on the memory corruption testing"

    * 'for-4.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
    btrfs: fix WARNING in btrfs_select_ref_head()
    Btrfs: remove some no-op casts
    btrfs: pass correct args to btrfs_async_run_delayed_refs()
    btrfs: make file clone aware of fatal signals
    btrfs: qgroup: Prevent qgroup->reserved from going subzero
    Btrfs: kill BUG_ON in do_relocation

    Linus Torvalds
     
  • Pull overlayfs fixes from Miklos Szeredi:
    "Fix two more POSIX ACL bugs introduced in 4.8 and add a missing fsync
    during copy up to prevent possible data loss"

    * 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
    ovl: fsync after copy-up
    ovl: fix get_acl() on tmpfs
    ovl: update S_ISGID when setting posix ACLs

    Linus Torvalds
     

02 Nov, 2016

1 commit

  • When I push NFSv4.1 / RDMA hard, (xfstests generic/089, for example),
    I get this crash on the server:

    Oct 28 22:04:30 klimt kernel: general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
    Oct 28 22:04:30 klimt kernel: Modules linked in: cts rpcsec_gss_krb5 iTCO_wdt iTCO_vendor_support sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm btrfs irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd xor pcspkr raid6_pq i2c_i801 i2c_smbus lpc_ich mfd_core sg mei_me mei ioatdma shpchp wmi ipmi_si ipmi_msghandler rpcrdma ib_ipoib rdma_ucm acpi_power_meter acpi_pad ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c mlx4_ib mlx4_en ib_core sr_mod cdrom sd_mod ast drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm crc32c_intel igb ahci libahci ptp mlx4_core pps_core dca libata i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod
    Oct 28 22:04:30 klimt kernel: CPU: 7 PID: 1558 Comm: nfsd Not tainted 4.9.0-rc2-00005-g82cd754 #8
    Oct 28 22:04:30 klimt kernel: Hardware name: Supermicro Super Server/X10SRL-F, BIOS 1.0c 09/09/2015
    Oct 28 22:04:30 klimt kernel: task: ffff880835c3a100 task.stack: ffff8808420d8000
    Oct 28 22:04:30 klimt kernel: RIP: 0010:[] [] release_lock_stateid+0x1f/0x60 [nfsd]
    Oct 28 22:04:30 klimt kernel: RSP: 0018:ffff8808420dbce0 EFLAGS: 00010246
    Oct 28 22:04:30 klimt kernel: RAX: ffff88084e6660f0 RBX: ffff88084e667020 RCX: 0000000000000000
    Oct 28 22:04:30 klimt kernel: RDX: 0000000000000007 RSI: 0000000000000000 RDI: ffff88084e667020
    Oct 28 22:04:30 klimt kernel: RBP: ffff8808420dbcf8 R08: 0000000000000001 R09: 0000000000000000
    Oct 28 22:04:30 klimt kernel: R10: ffff880835c3a100 R11: ffff880835c3aca8 R12: 6b6b6b6b6b6b6b6b
    Oct 28 22:04:30 klimt kernel: R13: ffff88084e6670d8 R14: ffff880835f546f0 R15: ffff880835f1c548
    Oct 28 22:04:30 klimt kernel: FS: 0000000000000000(0000) GS:ffff88087bdc0000(0000) knlGS:0000000000000000
    Oct 28 22:04:30 klimt kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Oct 28 22:04:30 klimt kernel: CR2: 00007ff020389000 CR3: 0000000001c06000 CR4: 00000000001406e0
    Oct 28 22:04:30 klimt kernel: Stack:
    Oct 28 22:04:30 klimt kernel: ffff88084e667020 0000000000000000 ffff88084e6670d8 ffff8808420dbd20
    Oct 28 22:04:30 klimt kernel: ffffffffa05ac80d ffff880835f54548 ffff88084e640008 ffff880835f545b0
    Oct 28 22:04:30 klimt kernel: ffff8808420dbd70 ffffffffa059803d ffff880835f1c768 0000000000000870
    Oct 28 22:04:30 klimt kernel: Call Trace:
    Oct 28 22:04:30 klimt kernel: [] nfsd4_free_stateid+0xfd/0x1b0 [nfsd]
    Oct 28 22:04:30 klimt kernel: [] nfsd4_proc_compound+0x40d/0x690 [nfsd]
    Oct 28 22:04:30 klimt kernel: [] nfsd_dispatch+0xd4/0x1d0 [nfsd]
    Oct 28 22:04:30 klimt kernel: [] svc_process_common+0x3d9/0x700 [sunrpc]
    Oct 28 22:04:30 klimt kernel: [] svc_process+0xf4/0x330 [sunrpc]
    Oct 28 22:04:30 klimt kernel: [] nfsd+0xfa/0x160 [nfsd]
    Oct 28 22:04:30 klimt kernel: [] ? nfsd_destroy+0x170/0x170 [nfsd]
    Oct 28 22:04:30 klimt kernel: [] kthread+0x10b/0x120
    Oct 28 22:04:30 klimt kernel: [] ? kthread_stop+0x280/0x280
    Oct 28 22:04:30 klimt kernel: [] ret_from_fork+0x2a/0x40
    Oct 28 22:04:30 klimt kernel: Code: c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 55 41 54 53 48 8b 87 b0 00 00 00 48 89 fb 4c 8b a0 98 00 00 00 8b 44 24 20 48 8d b8 80 03 00 00 e8 10 66 1a e1 48 89 df e8
    Oct 28 22:04:30 klimt kernel: RIP [] release_lock_stateid+0x1f/0x60 [nfsd]
    Oct 28 22:04:30 klimt kernel: RSP
    Oct 28 22:04:30 klimt kernel: ---[ end trace cf5d0b371973e167 ]---

    Jeff Layton says:
    > Hm...now that I look though, this is a little suspicious:
    >
    > struct nfs4_openowner *oo = openowner(stp->st_openstp->st_stateowner);
    >
    > I wonder if it's possible for the openstateid to have already been
    > destroyed at this point.
    >
    > We might be better off doing something like this to get the client pointer:
    >
    > stp->st_stid.sc_client;
    >
    > ...which should be more direct and less dependent on other stateids
    > staying valid.

    With the suggested change, I am no longer able to reproduce the above oops.

    v2: Fix unhash_lock_stateid() as well

    Fix-suggested-by: Jeff Layton
    Fixes: 42691398be08 ('nfsd: Fix race between FREE_STATEID and LOCK')
    Signed-off-by: Chuck Lever
    Reviewed-by: Jeff Layton
    Cc: stable@vger.kernel.org
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     

31 Oct, 2016

4 commits

  • Make sure the copied up file hits the disk before renaming to the final
    destination. If this is not done then the copy-up may corrupt the data in
    the file in case of a crash.

    Signed-off-by: Miklos Szeredi
    Cc:

    Miklos Szeredi
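
    The same ordering rule applies to ordinary userspace code, which gives
    a convenient way to illustrate it. The sketch below uses only POSIX
    calls; the function name is hypothetical and it is not the overlayfs
    implementation.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Write `len` bytes to `tmp_path`, flush them to stable storage, and only
 * then rename the file to `final_path`. Returns 0 on success, -1 on error.
 */
int sketch_copy_then_rename(const char *tmp_path, const char *final_path,
                            const void *data, size_t len)
{
    const char *p = data;
    int fd = open(tmp_path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;

    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n <= 0)
            goto fail;
        p += n;
        len -= (size_t)n;
    }

    /* The data must be durable before the new name becomes visible. */
    if (fsync(fd) != 0)
        goto fail;
    if (close(fd) != 0) {
        unlink(tmp_path);
        return -1;
    }
    if (rename(tmp_path, final_path) != 0) {
        unlink(tmp_path);
        return -1;
    }
    return 0;

fail:
    close(fd);
    unlink(tmp_path);
    return -1;
}
```

    If the rename came first, a crash could leave the final name pointing
    at a file whose contents never reached the disk.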
     
  • tmpfs doesn't have ->get_acl() because it only uses cached acls.

    This fixes the acl tests in pjdfstest when tmpfs is used as the upper layer
    of the overlay.

    Reported-by: Amir Goldstein
    Signed-off-by: Miklos Szeredi
    Fixes: 39a25b2b3762 ("ovl: define ->get_acl() for overlay inodes")
    Cc: # v4.8

    Miklos Szeredi
     
  • This change fixes xfstest generic/375, which failed to clear the
    setgid bit in the following test case on overlayfs:

    touch $testfile
    chown 100:100 $testfile
    chmod 2755 $testfile
    _runas -u 100 -g 101 -- setfacl -m u::rwx,g::rwx,o::rwx $testfile

    Reported-by: Amir Goldstein
    Signed-off-by: Miklos Szeredi
    Tested-by: Amir Goldstein
    Fixes: d837a49bd57f ("ovl: fix POSIX ACL setting")
    Cc: # v4.8

    Miklos Szeredi
     
  • Currently we drop freeze protection of aio writes just after IO is
    submitted. Thus an aio write could be in flight while the filesystem is
    frozen, and that could result in unexpected situations such as aio
    completion wanting to convert extent type on a frozen filesystem. The
    testcase from Dmitry triggering this is like:

    for ((i=0;i
    Signed-off-by: Jan Kara
    [hch: forward ported on top of various VFS and aio changes]
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Jan Kara