05 Mar, 2020

1 commit

  • [ Upstream commit cf5b4059ba7197d6cef9c0e024979d178ed8c8ec ]

    We want to make sure that we revalidate the dentry if and only if
    we've done an OPEN by filename.
    In order to avoid races with remote changes to the directory on the
    server, we want to save the verifier before calling OPEN. The exception
    is if the server returned a delegation with our OPEN, as we then
    know that the filename can't have changed on the server.

    Signed-off-by: Trond Myklebust
    Reviewed-by: Benjamin Coddington
    Tested-by: Benjamin Coddington
    Signed-off-by: Anna Schumaker
    Signed-off-by: Sasha Levin

    Trond Myklebust
     

24 Feb, 2020

1 commit

  • [ Upstream commit 123c23c6a7b7ecd2a3d6060bea1d94019f71fd66 ]

    In _nfs42_proc_copy(), 'res->commit_res.verf' is allocated through
    kzalloc() if 'args->sync' is true. In the following code, if
    'res->synchronous' is false, handle_async_copy() will be invoked. If an
    error occurs during the invocation, the following code will not be executed
    and the error will be returned. However, the allocated
    'res->commit_res.verf' is not deallocated, leading to a memory leak. This
    is also true if the invocation of process_copy_commit() returns an error.

    To fix the above leaks, redirect the execution to the 'out' label if an
    error is encountered.

    Signed-off-by: Wenwen Wang
    Signed-off-by: Anna Schumaker
    Signed-off-by: Sasha Levin

    Wenwen Wang
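
The leak and its fix can be sketched in user-space C. This is an illustration only, not the kernel code: `struct copy_res`, `start_async()`, `do_copy()` and the `live_allocs` counter are hypothetical stand-ins, and `calloc()`/`free()` stand in for `kzalloc()`/`kfree()`.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the result structure; not the kernel's type. */
struct copy_res {
    void *verf;             /* buffer allocated only for synchronous copies */
};

/* Counts outstanding allocations so a leak would be observable. */
static int live_allocs;

/* Hypothetical asynchronous step that fails on request. */
static int start_async(int should_fail)
{
    return should_fail ? -EIO : 0;
}

/*
 * Every failure after the allocation jumps to 'out', so the buffer is
 * released on all paths rather than only on success.
 */
static int do_copy(struct copy_res *res, int sync, int async_fails)
{
    int status;

    assert(res != NULL);
    res->verf = NULL;
    if (sync) {
        res->verf = calloc(1, 16);
        if (!res->verf)
            return -ENOMEM;
        live_allocs++;
    }

    status = start_async(async_fails);
    if (status)
        goto out;           /* an early 'return status' here would leak */

    status = 0;
out:
    if (res->verf) {
        free(res->verf);
        res->verf = NULL;
        live_allocs--;
    }
    return status;
}
```

With a single exit label, adding another failing step later only needs one more `goto out`.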
     

20 Feb, 2020

1 commit

  • commit cd1b659d8ce7697ee9799b64f887528315b9097b upstream.

    Turning caching off for writes on the server should improve performance.

    Fixes: fba83f34119a ("NFS: Pass "privileged" value to nfs4_init_sequence()")
    Signed-off-by: Olga Kornievskaia
    Reviewed-by: Trond Myklebust
    Signed-off-by: Anna Schumaker
    Signed-off-by: Greg Kroah-Hartman

    Olga Kornievskaia
     

15 Feb, 2020

7 commits

  • commit 7dc2993a9e51dd2eee955944efec65bef90265b7 upstream.

    Currently, each time nfs4_do_fsinfo() is called it will do an implicit
    NFS4 lease renewal, which is not compliant with the NFS4 specification.
    This can result in a lease being expired by an NFS server.

    Commit 83ca7f5ab31f ("NFS: Avoid PUTROOTFH when managing leases")
    introduced implicit client lease renewal in nfs4_do_fsinfo(),
    which can cause the NFSv4.0 lease to expire on the server side,
    with servers returning NFS4ERR_EXPIRED or NFS4ERR_STALE_CLIENTID.

    This can easily be reproduced by frequently unmounting a sub-mount,
    then stat'ing it to get it mounted again, which will delay or even
    completely prevent the client from sending RENEW operations if no other
    NFS operations are issued. Eventually the NFS server will expire the
    client's lease and return an error on file access or on the next RENEW.

    This can also happen when a sub-mount is automatically unmounted
    due to inactivity (after nfs_mountpoint_expiry_timeout) and is then
    mounted again via stat(). This can result in a short window during
    which the client's lease has expired on the server but not on the client.
    This specific case was observed on production systems.

    This patch removes the implicit lease renewal from nfs4_do_fsinfo().

    Fixes: 83ca7f5ab31f ("NFS: Avoid PUTROOTFH when managing leases")
    Signed-off-by: Robert Milkowski
    Signed-off-by: Anna Schumaker
    Signed-off-by: Greg Kroah-Hartman

    Robert Milkowski
     
  • commit 924491f2e476f7234d722b24171a4daff61bbe13 upstream.

    Currently, if an NFS server returns NFS4ERR_EXPIRED to open(),
    we return EIO to applications without even trying to recover.

    Fixes: 272289a3df72 ("NFSv4: nfs4_do_handle_exception() handle revoke/expiry of a single stateid")
    Signed-off-by: Robert Milkowski
    Reviewed-by: Trond Myklebust
    Signed-off-by: Anna Schumaker
    Signed-off-by: Greg Kroah-Hartman

    Robert Milkowski
     
  • commit 387122478775be5d9816c34aa29de53d0b926835 upstream.

    When comparing two 'struct cred' for equality w.r.t. behaviour under
    filesystem access, we need to use cred_fscmp().

    Fixes: a52458b48af1 ("NFS/NFSD/SUNRPC: replace generic creds with 'struct cred'.")
    Signed-off-by: Trond Myklebust
    Signed-off-by: Anna Schumaker
    Signed-off-by: Greg Kroah-Hartman

    Trond Myklebust
     
  • commit 118b6292195cfb86a9f43cb65610fc6d980c65f4 upstream.

    Casting a negative value to an unsigned long is not the same as
    converting it to its absolute value.

    Fixes: 96650e2effa2 ("NFS: Fix show_nfs_errors macros again")
    Signed-off-by: Trond Myklebust
    Signed-off-by: Anna Schumaker
    Signed-off-by: Greg Kroah-Hartman

    Trond Myklebust
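
The difference between the cast and the absolute value can be shown in a few lines of C. This is a standalone sketch; `cast_value()` and `magnitude()` are hypothetical helper names, not the macros touched by the patch.

```c
#include <limits.h>

/*
 * Converting a negative value to unsigned long wraps it modulo
 * ULONG_MAX + 1; the result is a huge number, not the magnitude.
 */
static unsigned long cast_value(long err)
{
    return (unsigned long)err;
}

/*
 * Negating in unsigned arithmetic yields the magnitude for negative
 * inputs, and is well defined even for LONG_MIN.
 */
static unsigned long magnitude(long err)
{
    return err < 0 ? 0UL - (unsigned long)err : (unsigned long)err;
}
```

For an errno-style value such as -5, `cast_value()` returns `ULONG_MAX - 4`, while `magnitude()` returns 5.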
     
  • commit 221203ce6406273cf00e5c6397257d986c003ee6 upstream.

    Instead of making assumptions about the commit verifier contents, change
    the commit code to ensure we always check that the verifier was set
    by the XDR code.

    Fixes: f54bcf2ecee9 ("pnfs: Prepare for flexfiles by pulling out common code")
    Signed-off-by: Trond Myklebust
    Signed-off-by: Anna Schumaker
    Signed-off-by: Greg Kroah-Hartman

    Trond Myklebust
     
  • commit 0df68ced55443243951d02cc497be31fadf28173 upstream.

    If we suffer a fatal error upon writing a file, which causes us to
    need to revalidate the entire mapping, then we should also revalidate
    the file size.

    Fixes: d2ceb7e57086 ("NFS: Don't use page_file_mapping after removing the page")
    Signed-off-by: Trond Myklebust
    Signed-off-by: Anna Schumaker
    Signed-off-by: Greg Kroah-Hartman

    Trond Myklebust
     
  • commit 474c4f306eefbb21b67ebd1de802d005c7d7ecdc upstream.

    If CONFIG_SWAP=n, it does not make much sense to offer the user the
    option to enable support for swapping over NFS, as that will still fail
    at run time:

    # swapon /swap
    swapon: /swap: swapon failed: Function not implemented

    Fix this by adding a dependency on CONFIG_SWAP.

    Fixes: a564b8f0398636ba ("nfs: enable swap on NFS")
    Signed-off-by: Geert Uytterhoeven
    Signed-off-by: Anna Schumaker
    Signed-off-by: Greg Kroah-Hartman

    Geert Uytterhoeven
     

11 Feb, 2020

2 commits

  • commit 114de38225d9b300f027e2aec9afbb6e0def154b upstream.

    When an NFS directory page cache page is removed from the page cache,
    its contents are freed through a call to nfs_readdir_clear_array().
    To prevent the removal of the page cache entry until after we've
    finished reading it, we must take the page lock.

    Fixes: 11de3b11e08c ("NFS: Fix a memory leak in nfs_readdir")
    Cc: stable@vger.kernel.org # v2.6.37+
    Signed-off-by: Trond Myklebust
    Reviewed-by: Benjamin Coddington
    Signed-off-by: Anna Schumaker
    Signed-off-by: Greg Kroah-Hartman

    Trond Myklebust
     
  • commit 4b310319c6a8ce708f1033d57145e2aa027a883c upstream.

    nfs_readdir_xdr_to_array() must not exit without having initialised
    the array, so that the page cache deletion routines can safely
    call nfs_readdir_clear_array().
    Furthermore, we should ensure that if we exit nfs_readdir_filler()
    with an error, we free up any page contents to prevent a leak
    if we try to fill the page again.

    Fixes: 11de3b11e08c ("NFS: Fix a memory leak in nfs_readdir")
    Cc: stable@vger.kernel.org # v2.6.37+
    Signed-off-by: Trond Myklebust
    Reviewed-by: Benjamin Coddington
    Signed-off-by: Anna Schumaker
    Signed-off-by: Greg Kroah-Hartman

    Trond Myklebust
     

18 Jan, 2020

3 commits


01 Nov, 2019

2 commits


11 Oct, 2019

1 commit

  • Our client can issue multiple SETCLIENTID operations to the same
    server in some circumstances. Ensure that calls to
    nfs4_proc_setclientid() after the first one do not overwrite the
    previously allocated cl_acceptor string.

    unreferenced object 0xffff888461031800 (size 32):
    comm "mount.nfs", pid 2227, jiffies 4294822467 (age 1407.749s)
    hex dump (first 32 bytes):
    6e 66 73 40 6b 6c 69 6d 74 2e 69 62 2e 31 30 31 nfs@klimt.ib.101
    35 67 72 61 6e 67 65 72 2e 6e 65 74 00 00 00 00 5granger.net....
    backtrace:
    [] __kmalloc+0x128/0x176
    [] gss_stringify_acceptor+0xbd/0x1a7 [auth_rpcgss]
    [] nfs4_proc_setclientid+0x34e/0x46c [nfsv4]
    [] nfs40_discover_server_trunking+0x7a/0xed [nfsv4]
    [] nfs4_discover_server_trunking+0x81/0x244 [nfsv4]
    [] nfs4_init_client+0x1b0/0x238 [nfsv4]
    [] nfs4_set_client+0xfe/0x14d [nfsv4]
    [] nfs4_create_server+0x107/0x1db [nfsv4]
    [] nfs4_remote_mount+0x2c/0x59 [nfsv4]
    [] legacy_get_tree+0x2d/0x4c
    [] vfs_get_tree+0x20/0xc7
    [] fc_mount+0xe/0x36
    [] vfs_kern_mount+0x74/0x8d
    [] nfs_do_root_mount+0x8a/0xa3 [nfsv4]
    [] nfs4_try_mount+0x58/0xad [nfsv4]
    [] nfs_fs_mount+0x820/0x869 [nfs]

    Fixes: f11b2a1cfbf5 ("nfs4: copy acceptor name from context ... ")
    Signed-off-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Chuck Lever
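
A user-space sketch of how a repeated call can avoid leaking an owned string. It assumes one possible approach (release the old string before storing the new one) and is not a copy of the patch; `struct client`, `set_acceptor()`, `dup_name()` and `live_strings` are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the client structure. */
struct client {
    char *acceptor;         /* owned, heap-allocated string or NULL */
};

/* Counts strings currently allocated so a leak would be observable. */
static int live_strings;

static char *dup_name(const char *name)
{
    size_t len = strlen(name) + 1;
    char *copy = malloc(len);

    if (copy) {
        memcpy(copy, name, len);
        live_strings++;
    }
    return copy;
}

/*
 * May be called more than once for the same client. The previous
 * string is freed before the pointer is replaced, so it is never
 * silently dropped.
 */
static int set_acceptor(struct client *clp, const char *name)
{
    char *copy;

    assert(clp != NULL && name != NULL);
    copy = dup_name(name);
    if (!copy)
        return -1;
    if (clp->acceptor) {
        free(clp->acceptor);
        live_strings--;
    }
    clp->acceptor = copy;
    return 0;
}
```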
     

09 Oct, 2019

2 commits

  • We no longer need the extra mirror length tracking in the O_DIRECT code,
    as we are able to track the maximum contiguous length in dreq->max_count.

    Signed-off-by: Trond Myklebust
    Signed-off-by: Anna Schumaker

    Trond Myklebust
     
  • When a series of O_DIRECT reads or writes are truncated, either due to
    eof or due to an error, then we should return the number of contiguous
    bytes that were received/sent starting at the offset specified by the
    application.

    Currently, we are failing to correctly check contiguity, and so we're
    failing generic/465 in xfstests when the race between the read
    and write RPCs causes the file to get extended while the two reads are
    outstanding. If the first read RPC call wins the race and returns with
    eof set, we should treat the second read RPC as being truncated.

    Reported-by: Su Yanjun
    Fixes: 1ccbad9f9f9bd ("nfs: fix DIO good bytes calculation")
    Cc: stable@vger.kernel.org # 4.1+
    Signed-off-by: Trond Myklebust
    Signed-off-by: Anna Schumaker

    Trond Myklebust
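
The contiguity rule can be sketched as a small standalone function. This is an illustration under simplifying assumptions (results already sorted by offset, one record per RPC); `struct io_result` and `contiguous_bytes()` are hypothetical and do not mirror the kernel's dreq accounting.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-RPC result record. */
struct io_result {
    size_t offset;      /* file offset the RPC started at */
    size_t requested;   /* bytes asked for */
    size_t done;        /* bytes actually transferred */
};

/*
 * Returns the number of bytes transferred contiguously from 'start'.
 * Counting stops at the first gap or short transfer; anything a later
 * RPC returned beyond that point is treated as truncated.
 */
static size_t contiguous_bytes(const struct io_result *r, size_t n,
                               size_t start)
{
    size_t end = start;
    size_t i;

    assert(r != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (r[i].offset != end)
            break;                      /* gap before this RPC */
        end += r[i].done;
        if (r[i].done < r[i].requested)
            break;                      /* short read/write (eof or error) */
    }
    return end - start;
}
```

In the generic/465 scenario, a first read that returns 1000 of 4096 bytes followed by a second read that returns a full 4096 bytes yields 1000, not 5096.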
     

02 Oct, 2019

1 commit

  • Running xfstests triggers the following WARNING:

    WARNING: CPU: 0 PID: 6235 at fs/nfs/inode.c:122 nfs_clear_inode+0x9c/0xd8
    Modules linked in:
    CPU: 0 PID: 6235 Comm: umount.nfs
    Hardware name: linux,dummy-virt (DT)
    pstate: 60000005 (nZCv daif -PAN -UAO)
    pc : nfs_clear_inode+0x9c/0xd8
    lr : nfs_evict_inode+0x60/0x78
    sp : fffffc000f68fc00
    x29: fffffc000f68fc00 x28: fffffe00c53155c0
    x27: fffffe00c5315000 x26: fffffc0009a63748
    x25: fffffc000f68fd18 x24: fffffc000bfaaf40
    x23: fffffc000936d3c0 x22: fffffe00c4ff5e20
    x21: fffffc000bfaaf40 x20: fffffe00c4ff5d10
    x19: fffffc000c056000 x18: 000000000000003c
    x17: 0000000000000000 x16: 0000000000000000
    x15: 0000000000000040 x14: 0000000000000228
    x13: fffffc000c3a2000 x12: 0000000000000045
    x11: 0000000000000000 x10: 0000000000000000
    x9 : 0000000000000000 x8 : 0000000000000000
    x7 : 0000000000000000 x6 : fffffc00084b027c
    x5 : fffffc0009a64000 x4 : fffffe00c0e77400
    x3 : fffffc000c0563a8 x2 : fffffffffffffffb
    x1 : 000000000000764e x0 : 0000000000000001
    Call trace:
    nfs_clear_inode+0x9c/0xd8
    nfs_evict_inode+0x60/0x78
    evict+0x108/0x380
    dispose_list+0x70/0xa0
    evict_inodes+0x194/0x210
    generic_shutdown_super+0xb0/0x220
    nfs_kill_super+0x40/0x88
    deactivate_locked_super+0xb4/0x120
    deactivate_super+0x144/0x160
    cleanup_mnt+0x98/0x148
    __cleanup_mnt+0x38/0x50
    task_work_run+0x114/0x160
    do_notify_resume+0x2f8/0x308
    work_pending+0x8/0x14

    The nrequests counter should be incremented or decremented only if the
    PG_INODE_REF flag is set.

    However, nfs_inode_remove_request() may decrement it even when
    PG_INODE_REF is not set, which can leave nrequests with the wrong value.

    Reported-by: Hulk Robot
    Signed-off-by: ZhangXiaoxu
    Signed-off-by: Anna Schumaker

    ZhangXiaoxu
     

27 Sep, 2019

1 commit

  • Pull NFS client updates from Anna Schumaker:
    "Stable bugfixes:
    - Dequeue the request from the receive queue while we're re-encoding
    # v4.20+
    - Fix buffer handling of GSS MIC without slack # 5.1

    Features:
    - Increase xprtrdma maximum transport header and slot table sizes
    - Add support for nfs4_call_sync() calls using a custom
    rpc_task_struct
    - Optimize the default readahead size
    - Enable pNFS filelayout LAYOUTGET on OPEN

    Other bugfixes and cleanups:
    - Fix possible null-pointer dereferences and memory leaks
    - Various NFS over RDMA cleanups
    - Various NFS over RDMA comment updates
    - Don't receive TCP data into a reset request buffer
    - Don't try to parse incomplete RPC messages
    - Fix congestion window race with disconnect
    - Clean up pNFS return-on-close error handling
    - Fixes for NFS4ERR_OLD_STATEID handling"

    * tag 'nfs-for-5.4-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (53 commits)
    pNFS/filelayout: enable LAYOUTGET on OPEN
    NFS: Optimise the default readahead size
    NFSv4: Handle NFS4ERR_OLD_STATEID in LOCKU
    NFSv4: Handle NFS4ERR_OLD_STATEID in CLOSE/OPEN_DOWNGRADE
    NFSv4: Fix OPEN_DOWNGRADE error handling
    pNFS: Handle NFS4ERR_OLD_STATEID on layoutreturn by bumping the state seqid
    NFSv4: Add a helper to increment stateid seqids
    NFSv4: Handle RPC level errors in LAYOUTRETURN
    NFSv4: Handle NFS4ERR_DELAY correctly in return-on-close
    NFSv4: Clean up pNFS return-on-close error handling
    pNFS: Ensure we do clear the return-on-close layout stateid on fatal errors
    NFS: remove unused check for negative dentry
    NFSv3: use nfs_add_or_obtain() to create and reference inodes
    NFS: Refactor nfs_instantiate() for dentry referencing callers
    SUNRPC: Fix congestion window race with disconnect
    SUNRPC: Don't try to parse incomplete RPC messages
    SUNRPC: Rename xdr_buf_read_netobj to xdr_buf_read_mic
    SUNRPC: Fix buffer handling of GSS MIC without slack
    SUNRPC: RPC level errors should always set task->tk_rpc_status
    SUNRPC: Don't receive TCP data into a request buffer that has been reset
    ...

    Linus Torvalds
     

25 Sep, 2019

2 commits

  • Add the flag to the filelayout driver to add LAYOUTGET to
    the OPEN compound.

    Signed-off-by: Olga Kornievskaia
    Signed-off-by: Anna Schumaker

    Olga Kornievskaia
     
  • In the years since the max readahead size was fixed in NFS, a number of
    things have happened:
    - Users can now set the value directly using /sys/class/bdi
    - NFS max supported block sizes have increased from 64K to 1MB.
    - Disk access latencies are orders of magnitude lower thanks to SSD and
    NVMe.

    In particular, note that if the server advertises 1MB as the optimal
    read size, the readahead size is set to 15MB.
    Let's therefore adjust down, and try to default to VM_READAHEAD_PAGES.
    However let's inform the VM about our preferred block size so that it
    can choose to round up in cases where that makes sense.

    Reported-by: Alkis Georgopoulos
    Signed-off-by: Trond Myklebust
    Signed-off-by: Anna Schumaker

    Trond Myklebust
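
The arithmetic behind the 15MB figure can be written out. The multiplier of 15 is inferred from the numbers in the message above (1MB read size giving 15MB of readahead), and the 128 KiB value used for the VM default is an assumption about VM_READAHEAD_PAGES, not something stated in this log; both helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: old readahead = 15 x server read size (1MB -> 15MB). */
#define OLD_READAHEAD_MULTIPLIER 15u

/* Assumption: the generic VM default corresponds to 128 KiB. */
#define ASSUMED_VM_READAHEAD_BYTES (128u * 1024u)

/* Readahead window, in bytes, under the old scaling rule. */
static uint64_t old_readahead_bytes(uint64_t rsize)
{
    assert(rsize <= UINT64_MAX / OLD_READAHEAD_MULTIPLIER);
    return rsize * OLD_READAHEAD_MULTIPLIER;
}

/* Readahead window, in bytes, when defaulting to the VM value. */
static uint64_t new_default_readahead_bytes(void)
{
    return ASSUMED_VM_READAHEAD_BYTES;
}
```

Under these assumptions a 64K read size gave 960K of readahead, whereas a 1MB read size gives 15MB, which is the growth the commit adjusts down.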
     

21 Sep, 2019

12 commits


20 Sep, 2019

1 commit

  • Pull y2038 vfs updates from Arnd Bergmann:
    "Add inode timestamp clamping.

    This series from Deepa Dinamani adds a per-superblock minimum/maximum
    timestamp limit for a file system, and clamps timestamps as they are
    written, to avoid random behavior from integer overflow as well as
    having different time stamps on disk vs in memory.

    At mount time, a warning is now printed for any file system that can
    represent current timestamps but not future timestamps more than 30
    years into the future, similar to the arbitrary 30 year limit that was
    added to settimeofday().

    This was picked as a compromise to warn users to migrate to other file
    systems (e.g. ext4 instead of ext3) when they need the file system to
    survive beyond 2038 (or similar limits in other file systems), but not
    get in the way of normal usage"

    * tag 'y2038-vfs' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground:
    ext4: Reduce ext4 timestamp warnings
    isofs: Initialize filesystem timestamp ranges
    pstore: fs superblock limits
    fs: omfs: Initialize filesystem timestamp ranges
    fs: hpfs: Initialize filesystem timestamp ranges
    fs: ceph: Initialize filesystem timestamp ranges
    fs: sysv: Initialize filesystem timestamp ranges
    fs: affs: Initialize filesystem timestamp ranges
    fs: fat: Initialize filesystem timestamp ranges
    fs: cifs: Initialize filesystem timestamp ranges
    fs: nfs: Initialize filesystem timestamp ranges
    ext4: Initialize timestamps limits
    9p: Fill min and max timestamps in sb
    fs: Fill in max and min timestamps in superblock
    utimes: Clamp the timestamps before update
    mount: Add mount warning for impending timestamp expiry
    timestamp_truncate: Replace users of timespec64_trunc
    vfs: Add timestamp_truncate() api
    vfs: Add file timestamp range support

    Linus Torvalds
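
Timestamp clamping as described above amounts to bounding a value by per-filesystem limits before it is stored. A minimal sketch, assuming seconds-only timestamps; `clamp_seconds()` is a hypothetical helper, not the kernel's API, and the signed 32-bit range in the usage note is only an example limit.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bound a timestamp (seconds since the epoch) to the range a file
 * system can represent, so the stored value and the in-memory value
 * agree instead of the stored one overflowing.
 */
static int64_t clamp_seconds(int64_t t, int64_t min, int64_t max)
{
    assert(min <= max);
    if (t < min)
        return min;
    if (t > max)
        return max;
    return t;
}
```

For a file system limited to signed 32-bit seconds, `clamp_seconds(4000000000, INT32_MIN, INT32_MAX)` returns `INT32_MAX`, the last representable second in January 2038.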
     

19 Sep, 2019

1 commit

  • Pull vfs namei updates from Al Viro:
    "Pathwalk-related stuff"

    [ Audit-related cleanups, misc simplifications, and easier to follow
    nd->root refcounts - Linus ]

    * 'work.namei' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    devpts_pty_kill(): don't bother with d_delete()
    infiniband: don't bother with d_delete()
    hypfs: don't bother with d_delete()
    fs/namei.c: keep track of nd->root refcount status
    fs/namei.c: new helper - legitimize_root()
    kill the last users of user_{path,lpath,path_dir}()
    namei.h: get the comments on LOOKUP_... in sync with reality
    kill LOOKUP_NO_EVAL, don't bother including namei.h from audit.h
    audit_inode(): switch to passing AUDIT_INODE_...
    filename_mountpoint(): make LOOKUP_NO_EVAL unconditional there
    filename_lookup(): audit_inode() argument is always 0

    Linus Torvalds
     

03 Sep, 2019

1 commit


31 Aug, 2019

1 commit