08 Apr, 2016

1 commit

  • Pull ext4 bugfixes from Ted Ts'o:
    "These changes contain a fix for overlayfs interacting with some
    (badly behaved) dentry code in various file systems. These have been
    reviewed by Al and the respective file system maintainers and are going
    through the ext4 tree for convenience.

    This also has a few ext4 encryption bug fixes that were discovered in
    Android testing (yes, we will need to get these sync'ed up with the
    fs/crypto code; I'll take care of that). It also has some bug fixes
    and a change to ignore the legacy quota options to allow for xfstests
    regression testing of ext4's internal quota feature and to be more
    consistent with how xfs handles this case"

    * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
    ext4: ignore quota mount options if the quota feature is enabled
    ext4 crypto: fix some error handling
    ext4: avoid calling dquot_get_next_id() if quota is not enabled
    ext4: retry block allocation for failed DIO and DAX writes
    ext4: add lockdep annotations for i_data_sem
    ext4: allow readdir()'s of large empty directories to be interrupted
    btrfs: fix crash/invalid memory access on fsync when using overlayfs
    ext4 crypto: use dget_parent() in ext4_d_revalidate()
    ext4: use file_dentry()
    ext4: use dget_parent() in ext4_file_open()
    nfs: use file_dentry()
    fs: add file_dentry()
    ext4 crypto: don't let data integrity writebacks fail with ENOMEM
    ext4: check if in-inode xattr is corrupted in ext4_expand_extra_isize_ea()

    Linus Torvalds
     

05 Apr, 2016

1 commit

  • PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a *long* time
    ago with the promise that one day it would be possible to implement the
    page cache with bigger chunks than PAGE_SIZE.

    This promise never materialized, and it is unlikely it ever will.

    We have many places where PAGE_CACHE_SIZE is assumed to be equal to
    PAGE_SIZE. And it's a constant source of confusion on whether
    PAGE_CACHE_* or PAGE_* constants should be used in a particular case,
    especially on the border between fs and mm.

    Globally switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much
    breakage to be doable.

    Let's stop pretending that pages in page cache are special. They are
    not.

    The changes are pretty straight-forward:

    - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

    - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

    - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};

    - page_cache_get() -> get_page();

    - page_cache_release() -> put_page();
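
To illustrate why the first two rewrites in this list are safe, here is a minimal sketch in plain C. It is not kernel code: the PAGE_SHIFT value of 12 is an assumed example (the real value is architecture-specific), and the helper names old_index_to_pgoff(), new_index_to_pgoff() and conversions_agree() are hypothetical. It only shows that when PAGE_CACHE_SHIFT is an alias for PAGE_SHIFT, the shift amount is zero and the expression can be replaced by its operand.

```c
#include <assert.h>

/* Assumed example value; the real PAGE_SHIFT depends on the architecture. */
#define PAGE_SHIFT        12
#define PAGE_SIZE         (1UL << PAGE_SHIFT)

/* Modelled as plain aliases, which is the situation the commit describes. */
#define PAGE_CACHE_SHIFT  PAGE_SHIFT
#define PAGE_CACHE_SIZE   PAGE_SIZE

/* Old style: convert a page-cache index to a page offset (hypothetical helper). */
static unsigned long old_index_to_pgoff(unsigned long index)
{
    return index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
}

/* New style: the shift by zero has been dropped (hypothetical helper). */
static unsigned long new_index_to_pgoff(unsigned long index)
{
    return index;
}

/* Returns 1 when the old and new forms give the same result for index. */
static int conversions_agree(unsigned long index)
{
    assert(PAGE_CACHE_SIZE == PAGE_SIZE);
    return old_index_to_pgoff(index) == new_index_to_pgoff(index);
}
```

Calling conversions_agree() with any index should return 1 under these assumptions, since both helpers reduce to the identity.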

    This patch contains automated changes generated with coccinelle using
    the script below. For some reason, coccinelle doesn't patch header files.
    I've called spatch for them manually.

    The only adjustment after coccinelle is a revert of the changes to the
    PAGE_CACHE_ALIGN definition: we are going to drop it later.

    There are a few places in the code that coccinelle didn't reach. I'll
    fix them manually in a separate patch. Comments and documentation will
    also be addressed in a separate patch.

    virtual patch

    @@
    expression E;
    @@
    - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
    + E

    @@
    expression E;
    @@
    - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
    + E

    @@
    @@
    - PAGE_CACHE_SHIFT
    + PAGE_SHIFT

    @@
    @@
    - PAGE_CACHE_SIZE
    + PAGE_SIZE

    @@
    @@
    - PAGE_CACHE_MASK
    + PAGE_MASK

    @@
    expression E;
    @@
    - PAGE_CACHE_ALIGN(E)
    + PAGE_ALIGN(E)

    @@
    expression E;
    @@
    - page_cache_get(E)
    + get_page(E)

    @@
    expression E;
    @@
    - page_cache_release(E)
    + put_page(E)

    Signed-off-by: Kirill A. Shutemov
    Acked-by: Michal Hocko
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     

27 Mar, 2016

1 commit

  • NFS may be used as the lower layer of overlayfs, and accessing
    f_path.dentry directly can lead to a crash.

    Fix by replacing direct access of file->f_path.dentry with the
    file_dentry() accessor, which will always return a native object.
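
The accessor idea can be sketched with a toy model in plain C. This is not the kernel implementation: struct toy_dentry, struct toy_file, their fields and toy_file_dentry() are all hypothetical names invented for illustration. The sketch assumes only what the message states, namely that the dentry reachable through the file's path may belong to the overlay, and that the accessor should hand back the dentry native to the underlying filesystem.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for kernel structures; all names here are hypothetical. */
struct toy_dentry {
    const char *owner_fs;           /* filesystem that created this dentry */
    struct toy_dentry *real;        /* non-NULL only for an overlay dentry */
};

struct toy_file {
    struct toy_dentry *path_dentry; /* models file->f_path.dentry */
};

/*
 * Accessor: if the path dentry is an overlay dentry, return the
 * underlying native dentry; otherwise return the path dentry itself.
 */
static struct toy_dentry *toy_file_dentry(const struct toy_file *file)
{
    struct toy_dentry *dentry = file->path_dentry;

    assert(dentry != NULL);
    return dentry->real ? dentry->real : dentry;
}
```

In this model, filesystem code that goes through toy_file_dentry() only ever sees its own dentries, whether or not the file was opened through an overlay.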

    Fixes: 4bacc9c9234c ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay")
    Signed-off-by: Miklos Szeredi
    Tested-by: Goldwyn Rodrigues
    Acked-by: Trond Myklebust
    Signed-off-by: Theodore Ts'o
    Cc: # v4.2
    Cc: David Howells
    Cc: Al Viro

    Miklos Szeredi
     

25 Mar, 2016

1 commit

  • Pull more nfsd updates from Bruce Fields:
    "Apologies for the previous request, which omitted the top 8 commits
    from my for-next branch (including the SCSI layout commits). Thanks
    to Trond for spotting my error!"

    This actually includes the new layout types, so here's that part of
    the pull message repeated:

    "Support for a new pnfs layout type from Christoph Hellwig. The new
    layout type is a variant of the block layout which uses SCSI features
    to offer improved fencing and device identification.

    Note this pull request also includes the client side of SCSI layout,
    with Trond's permission"

    * tag 'nfsd-4.6-1' of git://linux-nfs.org/~bfields/linux:
    nfsd: use short read as well as i_size to set eof
    nfsd: better layoutupdate bounds-checking
    nfsd: block and scsi layout drivers need to depend on CONFIG_BLOCK
    nfsd: add SCSI layout support
    nfsd: move some blocklayout code
    nfsd: add a new config option for the block layout driver
    nfs/blocklayout: add SCSI layout support
    nfs4.h: add SCSI layout definitions

    Linus Torvalds
     

23 Mar, 2016

1 commit

  • Pull NFS client updates from Trond Myklebust:
    "Highlights include:

    Features:
    - Add support for multiple NFSv4.1 callbacks in flight
    - Initial patchset for RPC multipath support
    - Adapt RPC/RDMA to use the new completion queue API

    Bugfixes and cleanups:
    - nfs4: nfs4_ff_layout_prepare_ds should return NULL if connection failed
    - Cleanups to remove nfs_inode_dio_wait and nfs4_file_fsync
    - Fix RPC/RDMA credit accounting
    - Properly handle RDMA_ERROR replies
    - xprtrdma: Do not wait if ib_post_send() fails
    - xprtrdma: Segment head and tail XDR buffers on page boundaries
    - xprtrdma cleanups for dprintk, physical_op_map and unused macros"

    * tag 'nfs-for-4.6-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (35 commits)
    nfs/blocklayout: make sure making a aligned read request
    nfs4: nfs4_ff_layout_prepare_ds should return NULL if connection failed
    nfs: remove nfs_inode_dio_wait
    nfs: remove nfs4_file_fsync
    xprtrdma: Use new CQ API for RPC-over-RDMA client send CQs
    xprtrdma: Use an anonymous union in struct rpcrdma_mw
    xprtrdma: Use new CQ API for RPC-over-RDMA client receive CQs
    xprtrdma: Serialize credit accounting again
    xprtrdma: Properly handle RDMA_ERROR replies
    rpcrdma: Add RPCRDMA_HDRLEN_ERR
    xprtrdma: Do not wait if ib_post_send() fails
    xprtrdma: Segment head and tail XDR buffers on page boundaries
    xprtrdma: Clean up dprintk format string containing a newline
    xprtrdma: Clean up physical_op_map()
    xprtrdma: Clean up unused RPCRDMA_INLINE_PAD_THRESH macro
    NFS add callback_ops to nfs4_proc_bind_conn_to_session_callback
    pnfs/NFSv4.1: Add multipath capabilities to pNFS flexfiles servers over NFSv3
    SUNRPC: Allow addition of new transports to a struct rpc_clnt
    NFSv4.1: nfs4_proc_bind_conn_to_session must iterate over all connections
    SUNRPC: Make NFS swap work with multipath
    ...

    Linus Torvalds
     

22 Mar, 2016

1 commit


20 Mar, 2016

1 commit

  • Pull vfs updates from Al Viro:

    - Preparations of parallel lookups (the remaining main obstacle is the
    need to move security_d_instantiate(); once that becomes safe, the
    rest will be a matter of rather short series local to fs/*.c)

    - preadv2/pwritev2 series from Christoph

    - assorted fixes

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (32 commits)
    splice: handle zero nr_pages in splice_to_pipe()
    vfs: show_vfsstat: do not ignore errors from show_devname method
    dcache.c: new helper: __d_add()
    don't bother with __d_instantiate(dentry, NULL)
    untangle fsnotify_d_instantiate() a bit
    uninline d_add()
    replace d_add_unique() with saner primitive
    quota: use lookup_one_len_unlocked()
    cifs_get_root(): use lookup_one_len_unlocked()
    nfs_lookup: don't bother with d_instantiate(dentry, NULL)
    kill dentry_unhash()
    ceph_fill_trace(): don't bother with d_instantiate(dn, NULL)
    autofs4: don't bother with d_instantiate(dentry, NULL) in ->lookup()
    configfs: move d_rehash() into configfs_create() for regular files
    ceph: don't bother with d_rehash() in splice_dentry()
    namei: teach lookup_slow() to skip revalidate
    namei: massage lookup_slow() to be usable by lookup_one_len_unlocked()
    lookup_one_len_unlocked(): use lookup_dcache()
    namei: simplify invalidation logics in lookup_dcache()
    namei: change calling conventions for lookup_{fast,slow} and follow_managed()
    ...

    Linus Torvalds
     

18 Mar, 2016

1 commit

  • This is a trivial extension to the block layout driver to support the
    new SCSI layouts draft. There are three changes:

    - device identification through the SCSI VPD page. This allows us to
    directly use the udev generated persistent device names instead of
    requiring an expensive lookup by crawling every block device node
    in /dev and reading a signature for it.
    - use of SCSI persistent reservations to protect device access and
    allow for robust fencing. On the client side this just means
    registering and unregistering a server supplied key.
    - an optimized LAYOUTCOMMIT payload that doesn't send unnecessary
    fields to the server.

    Signed-off-by: Christoph Hellwig
    Acked-by: Trond Myklebust
    Signed-off-by: J. Bruce Fields

    Christoph Hellwig
     

17 Mar, 2016

3 commits

  • I hit the following oops out of the blue while testing with flexfiles:

    BUG: unable to handle kernel NULL pointer dereference at 00000000000000e8
    IP: [] nfs4_ff_find_or_create_ds_client+0x48/0x50 [nfs_layout_flexfiles]
    PGD 44031067 PUD 5062d067 PMD 0
    Oops: 0000 [#1] SMP
    Modules linked in: nfsv3 nfs_layout_flexfiles tun rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache dcdbas nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw bonding ipmi_devintf ipmi_msghandler snd_hda_codec_generic virtio_balloon ppdev snd_hda_intel snd_hda_controller snd_hda_codec iosf_mbi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_core parport_pc snd_hwdep parport snd_seq snd_seq_device snd_pcm snd_timer acpi_cpufreq
    snd soundcore i2c_piix4 xfs libcrc32c joydev virtio_net virtio_console qxl drm_kms_helper ttm crc32c_intel drm virtio_pci serio_raw ata_generic virtio_ring virtio pata_acpi
    CPU: 0 PID: 19138 Comm: test5 Not tainted 4.1.9-100.pd.90.el7.x86_64 #1
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.2-20150714_191134- 04/01/2014
    task: ffff88007b70cf00 ti: ffff88004cc44000 task.ti: ffff88004cc44000
    RIP: 0010:[] [] nfs4_ff_find_or_create_ds_client+0x48/0x50 [nfs_layout_flexfiles]
    RSP: 0018:ffff88004cc47890 EFLAGS: 00010246
    RAX: 0000000000000003 RBX: ffff880050932300 RCX: ffff88006978f488
    RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88003e0e8540
    RBP: ffff88004cc47908 R08: 0000000000000000 R09: 0000000000000000
    R10: ffff88007ff8c758 R11: 0000000000000005 R12: ffff88003e0e8540
    R13: 0000000000000000 R14: ffff88006978f488 R15: ffff88004431cc80
    FS: 00007fea40c7c740(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00000000000000e8 CR3: 0000000044318000 CR4: 00000000000406f0
    Stack:
    ffffffffa048c934 ffff880050932310 0000000100000001 ffff88006978f510
    ffff88006978f3c8 ffff88003e56cd90 ffff88004cc479d0 00000020a052aff0
    000000000004b000 ffff88004cc47908 ffff880050932300 ffff88004cc479d0
    Call Trace:
    [] ? ff_layout_write_pagelist+0x64/0x220 [nfs_layout_flexfiles]
    [] pnfs_generic_pg_writepages+0xaf/0x1b0 [nfsv4]
    [] nfs_pageio_doio+0x27/0x60 [nfs]
    [] nfs_pageio_complete_mirror+0x54/0xa0 [nfs]
    [] nfs_pageio_complete+0x2d/0x90 [nfs]
    [] nfs_writepage_locked+0x8d/0xe0 [nfs]
    [] ? page_referenced_one+0x1a0/0x1a0
    [] nfs_wb_single_page+0xf7/0x190 [nfs]
    [] nfs_launder_page+0x41/0x90 [nfs]
    [] invalidate_inode_pages2_range+0x340/0x3a0
    [] invalidate_inode_pages2+0x17/0x20
    [] nfs_release+0x9e/0xb0 [nfs]
    [] nfs_file_release+0x3d/0x60 [nfs]
    [] __fput+0xdc/0x1e0
    [] ____fput+0xe/0x10
    [] task_work_run+0xa7/0xe0
    [] get_signal+0x565/0x600
    [] ? __filemap_fdatawrite_range+0x65/0x90
    [] do_signal+0x37/0x730
    [] ? nfs4_file_fsync+0x81/0x150 [nfsv4]
    [] ? vfs_fsync_range+0x3b/0xb0
    [] ? __audit_syscall_exit+0x1e6/0x280
    [] do_notify_resume+0x5f/0xa0
    [] int_signal+0x12/0x17
    Code: 48 8b 40 70 8b 00 83 f8 03 74 20 83 f8 04 75 13 55 48 89 ce 48 89 d7 48 89 e5 e8 14 0f 0e 00 5d c3 66 90 0f 0b 66 0f 1f 44 00 00 8b 82 e8 00 00 00 c3 66 66 66 66 90 55 48 89 e5 41 57 41 56
    RIP [] nfs4_ff_find_or_create_ds_client+0x48/0x50 [nfs_layout_flexfiles]
    RSP
    CR2: 00000000000000e8

    When the DS connection attempt fails, nfs4_ff_layout_prepare_ds marks it
    for the error but then just returns the ds as if it were usable. The
    comments though say:

    /* Upon return, either ds is connected, or ds is NULL */

    Ensure that we set the return pointer to NULL in the event that the
    connection attempt fails.
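
The contract quoted from the comment can be sketched as follows. This is a toy model in plain C, not the flexfiles code: struct toy_ds, toy_connect() and toy_prepare_ds() are hypothetical names, and the should_fail parameter exists only to simulate a failed connection attempt.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a data server (DS) handle. */
struct toy_ds {
    int connected;  /* 1 once a connection attempt has succeeded */
    int error;      /* non-zero if the last attempt failed */
};

/* Simulated connection attempt; fails when should_fail is non-zero. */
static void toy_connect(struct toy_ds *ds, int should_fail)
{
    ds->connected = !should_fail;
    ds->error = should_fail ? -1 : 0;
}

/* Upon return, either the DS is connected, or the result is NULL. */
static struct toy_ds *toy_prepare_ds(struct toy_ds *ds, int should_fail)
{
    assert(ds != NULL);
    if (!ds->connected)
        toy_connect(ds, should_fail);
    if (!ds->connected)
        return NULL;    /* the error stays recorded; the caller gets NULL */
    return ds;
}
```

A caller that checks the return value for NULL never dereferences a DS whose connection failed, which is the situation the oops above appears to show.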

    Signed-off-by: Jeff Layton
    Signed-off-by: Trond Myklebust

    Jeff Layton
     
  • Just call inode_dio_wait directly instead of through a pointless wrapper.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Trond Myklebust

    Christoph Hellwig
     
  • The only difference to nfs_file_fsync is the call to pnfs_sync_inode. But
    pnfs_sync_inode is just an inline that calls a pNFS layout driver method
    if CONFIG_PNFS is defined, and thus can be called just fine from the core
    NFS module.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Trond Myklebust

    Christoph Hellwig
     

14 Mar, 2016

2 commits


23 Feb, 2016

4 commits

  • * multipath:
    NFS add callback_ops to nfs4_proc_bind_conn_to_session_callback
    pnfs/NFSv4.1: Add multipath capabilities to pNFS flexfiles servers over NFSv3
    SUNRPC: Allow addition of new transports to a struct rpc_clnt
    NFSv4.1: nfs4_proc_bind_conn_to_session must iterate over all connections
    SUNRPC: Make NFS swap work with multipath
    SUNRPC: Add a helper to apply a function to all the rpc_clnt's transports
    SUNRPC: Allow caller to specify the transport to use
    SUNRPC: Use the multipath iterator to assign a transport to each task
    SUNRPC: Make rpc_clnt store the multipath iterators
    SUNRPC: Add a structure to track multiple transports
    SUNRPC: Make freeing of struct xprt rcu-safe
    SUNRPC: Uninline xprt_get(); It isn't performance critical.
    SUNRPC: Reorder rpc_task to put waitqueue related info in same cachelines
    SUNRPC: Remove unused function rpc_task_reset_client

    Trond Myklebust
     
  • * nfsv41_cb:
    NFSv4.x: Fix NFS4ERR_RETRY_UNCACHED_REP in nfs4_callback_sequence
    NFSv4.x: Allow multiple callbacks in flight
    NFSv4.x: Fix wraparound issues when validing the callback sequence id
    NFSv4.x: Enforce the ca_maxresponsesize_cached on the back channel
    NFSv4.x: CB_SEQUENCE should return NFS4ERR_DELAY if still executing
    NFSv4.x: Remove hard coded slotids in callback channel

    Trond Myklebust
     
  • Replace another case where the layout 'plh_block_lgets' can trigger
    infinite loops in send_layoutget().

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • If the server reboots while there is a layoutget outstanding, then
    the call to pnfs_choose_layoutget_stateid() will fail with an EAGAIN
    error, which causes an infinite loop in send_layoutget(). The reason
    why we never break out of the loop is that the layout 'plh_block_lgets'
    field is never cleared.

    Fix is to replace plh_block_lgets with NFS_LAYOUT_INVALID_STID, which
    can be reset after a new layoutget.

    Fixes: ab7d763e477c5 ("pNFS: Ensure nfs4_layoutget_prepare returns...")
    Signed-off-by: Trond Myklebust

    Trond Myklebust
     

18 Feb, 2016

3 commits

  • unreferenced object 0xffffc90000abf000 (size 16900):
    comm "fsync02", pid 15765, jiffies 4297431627 (age 423.772s)
    hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 a0 c2 19 00 88 ff ff ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    backtrace:
    [] kmemleak_alloc+0x4e/0xb0
    [] __vmalloc_node_range+0x231/0x280
    [] __vmalloc+0x4a/0x50
    [] ext_tree_prepare_commit+0x231/0x2e0 [blocklayoutdriver]
    [] bl_prepare_layoutcommit+0xe/0x10 [blocklayoutdriver]
    [] pnfs_layoutcommit_inode+0x29c/0x330 [nfsv4]
    [] pnfs_generic_sync+0x13/0x20 [nfsv4]
    [] nfs4_file_fsync+0x58/0x150 [nfsv4]
    [] vfs_fsync_range+0x4b/0xb0
    [] do_fsync+0x3d/0x70
    [] SyS_fsync+0x10/0x20
    [] entry_SYSCALL_64_fastpath+0x12/0x76
    [] 0xffffffffffffffff

    v2, add missing include header

    Signed-off-by: Kinglong Mee
    Signed-off-by: Trond Myklebust

    Kinglong Mee
     
  • The newly added NFS v4.2 operations (ALLOCATE, DEALLOCATE, SEEK and CLONE)
    use a helper called nfs42_set_rw_stateid to select a stateid that is sent
    to the server. But they don't set the inode and state fields in the
    nfs4_exception structure, and thus don't partake in the stateid recovery
    protocol. Because of this they will simply return errors instead of trying
    to recover a stateid when the server returns a BAD_STATEID error.

    Additionally CLONE has the problem that it operates on two files and thus
    two stateids, and thus needs to call the exception handler twice to
    recover stateids.

    While we're at it stop grabbing an additional reference to the open
    context in all these operations - having the file open guarantees that
    the open context won't go away.

    All this can be reproduced with the generic/168 and generic/170 tests in
    xfstests which stress the CLONE stateid handling.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Trond Myklebust

    Christoph Hellwig
     
  • In the case where d_add_unique() finds an appropriate alias to use it will
    have already incremented the reference count. An additional dget() to swap
    the open context's dentry is unnecessary and will leak a reference.
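
The reference-count imbalance can be sketched with a toy model in plain C. None of this is dcache code: struct toy_dentry, toy_dget(), toy_dput(), toy_find_alias(), swap_leaky() and swap_fixed() are hypothetical names. The only assumption taken from the message is that the lookup helper returns an alias whose reference count has already been incremented.

```c
#include <assert.h>
#include <stddef.h>

/* Toy dentry that carries nothing but a reference count. */
struct toy_dentry {
    int refcount;
};

/* Take a reference. */
static struct toy_dentry *toy_dget(struct toy_dentry *d)
{
    d->refcount++;
    return d;
}

/* Drop a reference. */
static void toy_dput(struct toy_dentry *d)
{
    assert(d->refcount > 0);
    d->refcount--;
}

/* Models a helper that returns an alias with a reference already taken. */
static struct toy_dentry *toy_find_alias(struct toy_dentry *alias)
{
    return toy_dget(alias);
}

/* Buggy variant: takes a second reference that nobody will drop. */
static struct toy_dentry *swap_leaky(struct toy_dentry *alias)
{
    return toy_dget(toy_find_alias(alias));
}

/* Fixed variant: reuses the reference the helper already took. */
static struct toy_dentry *swap_fixed(struct toy_dentry *alias)
{
    return toy_find_alias(alias);
}
```

After a single matching toy_dput(), the fixed variant returns the count to its starting value, while the leaky variant leaves it one higher.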

    Signed-off-by: Benjamin Coddington
    Fixes: 275bb307865a3 ("NFSv4: Move dentry instantiation into the NFSv4-...")
    Cc: stable@vger.kernel.org # 3.10+
    Signed-off-by: Trond Myklebust

    Benjamin Coddington
     

16 Feb, 2016

2 commits


06 Feb, 2016

3 commits


02 Feb, 2016

1 commit


28 Jan, 2016

1 commit


27 Jan, 2016

1 commit

  • The layoutreturn code currently relies on pnfs_put_lseg() to initiate the
    RPC call when conditions are right. A problem arises when we want to
    free the layout segment from inside an inode->i_lock section (e.g. in
    pnfs_clear_request_commit()), since we cannot sleep.

    The workaround is to move the actual call to pnfs_send_layoutreturn()
    to pnfs_put_layout_hdr(), which doesn't have this restriction.
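
The general pattern, recording under the lock that work is needed and doing the sleeping part later from a context that holds no lock, can be sketched as below. This is a toy model in plain C and makes no claim about the actual pNFS data structures: struct toy_layout_hdr, its fields and the toy_* functions are hypothetical, and the lock is modelled as a simple flag.

```c
#include <assert.h>

/* Hypothetical layout header; lock_held models inode->i_lock being held. */
struct toy_layout_hdr {
    int lock_held;
    int return_pending;  /* set when a layoutreturn still has to be sent */
    int returns_sent;    /* counts simulated RPC calls */
};

/* Would sleep in real code, so it must never run with the lock held. */
static void toy_send_layoutreturn(struct toy_layout_hdr *hdr)
{
    assert(!hdr->lock_held);
    hdr->returns_sent++;
}

/* Safe under the lock: only records that a return is needed. */
static void toy_put_lseg(struct toy_layout_hdr *hdr)
{
    hdr->return_pending = 1;
}

/* Called without the lock: performs the deferred send, at most once. */
static void toy_put_layout_hdr(struct toy_layout_hdr *hdr)
{
    if (hdr->return_pending) {
        hdr->return_pending = 0;
        toy_send_layoutreturn(hdr);
    }
}
```

In this model, toy_put_lseg() can be called from a locked section without ever reaching the code that may sleep.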

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     

25 Jan, 2016

5 commits


24 Jan, 2016

2 commits

  • Pull final vfs updates from Al Viro:

    - The ->i_mutex wrappers (with small prereq in lustre)

    - a fix for too early freeing of symlink bodies on shmem (they need to
    be RCU-delayed) (-stable fodder)

    - followup to dedupe stuff merged this cycle

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    vfs: abort dedupe loop if fatal signals are pending
    make sure that freeing shmem fast symlinks is RCU-delayed
    wrappers for ->i_mutex access
    lustre: remove unused declaration

    Linus Torvalds
     
  • Pull NFS client bugfixes and cleanups from Trond Myklebust:
    "Bugfixes:
    - pNFS/flexfiles: Fix an XDR encoding bug in layoutreturn
    - pNFS/flexfiles: Improve merging of errors in LAYOUTRETURN

    Cleanups:
    - NFS: Simplify nfs_request_add_commit_list() arguments"

    * tag 'nfs-for-4.5-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
    pNFS/flexfiles: Fix an XDR encoding bug in layoutreturn
    NFS: Simplify nfs_request_add_commit_list() arguments
    pNFS/flexfiles: Improve merging of errors in LAYOUTRETURN

    Linus Torvalds
     

23 Jan, 2016

3 commits


22 Jan, 2016

2 commits