25 May, 2016

1 commit

  • Pull nfsd updates from Bruce Fields:
    "A very quiet cycle for nfsd, mainly just an RDMA update from Chuck
    Lever"

    * tag 'nfsd-4.7' of git://linux-nfs.org/~bfields/linux:
    sunrpc: fix stripping of padded MIC tokens
    svcrpc: autoload rdma module
    svcrdma: Generalize svc_rdma_xdr_decode_req()
    svcrdma: Eliminate code duplication in svc_rdma_recvfrom()
    svcrdma: Drain QP before freeing svcrdma_xprt
    svcrdma: Post Receives only for forward channel requests
    svcrdma: Remove superfluous line from rdma_read_chunks()
    svcrdma: svc_rdma_put_context() is invoked twice in Send error path
    svcrdma: Do not add XDR padding to xdr_buf page vector
    svcrdma: Support IPv6 with NFS/RDMA
    nfsd: handle seqid wraparound in nfsd4_preprocess_layout_stateid
    Remove unnecessary allocation

    Linus Torvalds
     

14 May, 2016

1 commit

  • An xdr_buf has a head, a vector of pages, and a tail. Each
    RPC request is presented to the NFS server contained in an
    xdr_buf.

    The RDMA transport would like to supply the NFS server with only
    the NFS WRITE payload bytes in the page vector. In some common
    cases, that would allow the NFS server to swap those pages right
    into the target file's page cache.

    Have the transport's RDMA Read logic put XDR pad bytes in the tail
    iovec, and not in the pages that hold the data payload.

    The NFSv3 WRITE XDR decoder is finicky about the lengths involved,
    so make sure it is looking in the correct places when computing
    the total length of the incoming NFS WRITE request.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
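
The placement rule described in this commit can be sketched in userspace C. This is only an illustration under stated assumptions: `struct toy_xdr_buf` and the `toy_*` helpers are hypothetical stand-ins, not the kernel's `struct xdr_buf` or the svcrdma code. It shows the payload landing in the page vector while the XDR pad (rounding up to a 4-byte boundary) is accounted for in the tail.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct xdr_buf. */
struct toy_xdr_buf {
    size_t head_len;   /* RPC and NFS header bytes */
    size_t page_len;   /* payload bytes held in the page vector */
    size_t tail_len;   /* trailing bytes, including any XDR pad */
};

/* XDR rounds every opaque item up to a 4-byte boundary. */
static size_t xdr_pad_size(size_t len)
{
    return (4 - (len & 3)) & 3;
}

/* Place a WRITE payload of len bytes: data in pages, pad in the tail. */
static void toy_place_write_payload(struct toy_xdr_buf *buf, size_t len)
{
    buf->page_len = len;
    buf->tail_len += xdr_pad_size(len);
}

/* Total request length as a length-sensitive decoder would compute it. */
static size_t toy_total_len(const struct toy_xdr_buf *buf)
{
    return buf->head_len + buf->page_len + buf->tail_len;
}
```

With this layout the page vector holds exactly the payload bytes, so a decoder that sums all three parts still sees the padded total.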
     

11 Apr, 2016

1 commit


09 Jan, 2016

1 commit

  • We need information about exports when crossing mountpoints during
    lookup or NFSv4 readdir. If we don't already have that information
    cached, we may have to ask (and wait for) rpc.mountd.

    In both cases we currently hold the i_mutex on the parent of the
    directory we're asking rpc.mountd about. We've seen situations where
    rpc.mountd performs some operation on that directory that tries to take
    the i_mutex again, resulting in deadlock.

    With some care, we may be able to avoid that in rpc.mountd. But it
    seems better just to avoid holding a mutex while waiting on userspace.

    It appears that lookup_one_len is pretty much the only operation that
    needs the i_mutex. So we could just drop the i_mutex elsewhere and do
    something like

    mutex_lock()
    lookup_one_len()
    mutex_unlock()

    In many cases though the lookup would have been cached and not required
    the i_mutex, so it's more efficient to create a lookup_one_len() variant
    that only takes the i_mutex when necessary.

    Signed-off-by: NeilBrown
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Al Viro

    NeilBrown
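
The idea of taking the lock only when the lookup is not already cached can be sketched as follows. This is a userspace toy, not kernel code: `struct toy_dir`, `toy_slow_lookup()` and `toy_lookup_unlocked()` are hypothetical names, the one-entry cache stands in for the dcache, and a counter stands in for taking and releasing the i_mutex.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical directory with a one-entry cache and a lock counter. */
struct toy_dir {
    int lock_count;           /* how many times the "mutex" was taken */
    const char *cached_name;  /* stand-in for a cached lookup result */
    int cached_value;
};

/* Stand-in for the expensive lookup that really needs the lock. */
static int toy_slow_lookup(const char *name)
{
    return (int)strlen(name);
}

/* Take the lock only on a cache miss. */
static int toy_lookup_unlocked(struct toy_dir *dir, const char *name)
{
    int value;

    if (dir->cached_name && strcmp(dir->cached_name, name) == 0)
        return dir->cached_value;   /* fast path: no lock taken */

    dir->lock_count++;              /* stands in for mutex_lock() */
    value = toy_slow_lookup(name);
    dir->cached_name = name;
    dir->cached_value = value;
    /* mutex_unlock() would go here */
    return value;
}
```

Repeated lookups of the same name take the lock once, which is the efficiency argument made in the message above.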
     

13 Oct, 2015

1 commit


07 May, 2015

1 commit

  • The NFSv3 READDIRPLUS gets some of the returned attributes from the
    readdir, and some from an inode returned from a new lookup. The two
    objects could be different thanks to intervening renames.

    The attributes in READDIRPLUS are optional, so let's just skip them if
    we notice this case.

    Signed-off-by: NeilBrown
    Signed-off-by: J. Bruce Fields

    NeilBrown
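
A minimal sketch of the check described above, assuming the mismatch is detected by comparing inode numbers. The struct and function names are hypothetical and are not taken from the nfsd source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical READDIRPLUS entry: ino comes from the readdir pass. */
struct toy_entry {
    uint64_t readdir_ino;
    int attrs_included;
};

/*
 * Include the optional attributes only if the freshly looked-up inode
 * is the same object that readdir reported; a rename in between can
 * make the two differ.
 */
static void toy_encode_attrs(struct toy_entry *e, uint64_t lookup_ino)
{
    e->attrs_included = (e->readdir_ino == lookup_ino);
}
```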
     

16 Apr, 2015

1 commit


23 Jun, 2014

1 commit


23 May, 2014

1 commit

  • Assignments should not happen inside an if conditional, but in the line
    before. This issue was reported by checkpatch.

    The semantic patch that makes this change is as follows
    (http://coccinelle.lip6.fr/):

    // <smpl>

    @@
    identifier i1;
    expression e1;
    statement S;
    @@
    -if(!(i1 = e1)) S
    +i1 = e1;
    +if(!i1)
    +S

    // </smpl>

    It has been tested by compilation.

    Signed-off-by: Benoit Taine
    Signed-off-by: J. Bruce Fields

    Benoit Taine
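
The effect of the semantic patch on ordinary C can be illustrated with a before/after pair. The functions below are hypothetical examples, not code from the patched files.

```c
#include <assert.h>
#include <stdlib.h>

/* Before: the assignment is buried inside the condition. */
static int alloc_before(char **out)
{
    char *p;

    if (!(p = malloc(16)))
        return -1;
    *out = p;
    return 0;
}

/* After: the assignment is on its own line, then tested. */
static int alloc_after(char **out)
{
    char *p;

    p = malloc(16);
    if (!p)
        return -1;
    *out = p;
    return 0;
}
```

Both functions behave identically; only the style changes, which is why compile testing was considered sufficient.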
     

24 Jan, 2014

1 commit


11 Dec, 2013

1 commit

  • Among other things, the Linux NFS server replies to a "Check access
    permission" request with the following:

    NFS: File type = 2 (Directory)
    NFS: Mode = 040755

    A netapp server replies here:
    NFS: File type = 2 (Directory)
    NFS: Mode = 0755

    RFC 1813 reads:
    fattr3

    struct fattr3 {
    ftype3 type;
    mode3 mode;
    uint32 nlink;
    ...
    For the mode bits, only the lowest 12 are defined in the RFC.

    As far as I can tell, knfsd has always done this, so apparently it's harmless.
    Nevertheless, it appears to be wrong.

    Note this is already correct in the NFSv4 case, only v2 and v3 need
    fixing.

    Signed-off-by: J. Bruce Fields

    Albert Fluegel
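
The difference between the two replies comes down to whether the file-type bits are masked out of the mode. A sketch, assuming a mask of 07777 (the permission bits plus setuid, setgid and sticky); the macro and function names are hypothetical and the mask is an assumption, not a quote from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed mask: rwx bits for user/group/other plus setuid/setgid/sticky. */
#define TOY_MODE_MASK 07777u

/* Strip the file-type bits before putting the mode on the wire. */
static uint32_t toy_wire_mode(uint32_t st_mode)
{
    return st_mode & TOY_MODE_MASK;
}
```

Applied to the directory above, 040755 becomes 0755; the file type is still carried separately in the `type` field.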
     

27 Feb, 2013

1 commit

  • Pull vfs pile (part one) from Al Viro:
    "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent
    locking violations, etc.

    The most visible changes here are death of FS_REVAL_DOT (replaced with
    "has ->d_weak_revalidate()") and a new helper getting from struct file
    to inode. Some bits of preparation to xattr method interface changes.

    Misc patches by various people sent this cycle *and* ocfs2 fixes from
    several cycles ago that should've been upstream right then.

    PS: the next vfs pile will be xattr stuff."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
    saner proc_get_inode() calling conventions
    proc: avoid extra pde_put() in proc_fill_super()
    fs: change return values from -EACCES to -EPERM
    fs/exec.c: make bprm_mm_init() static
    ocfs2/dlm: use GFP_ATOMIC inside a spin_lock
    ocfs2: fix possible use-after-free with AIO
    ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path
    get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero
    target: writev() on single-element vector is pointless
    export kernel_write(), convert open-coded instances
    fs: encode_fh: return FILEID_INVALID if invalid fid_type
    kill f_vfsmnt
    vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op
    nfsd: handle vfs_getattr errors in acl protocol
    switch vfs_getattr() to struct path
    default SET_PERSONALITY() in linux/elf.h
    ceph: prepopulate inodes only when request is aborted
    d_hash_and_lookup(): export, switch open-coded instances
    9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate()
    9p: split dropping the acls from v9fs_set_create_acl()
    ...

    Linus Torvalds
     

26 Feb, 2013

1 commit


13 Feb, 2013

2 commits


18 Dec, 2012

1 commit


11 Dec, 2012

1 commit


13 Apr, 2012

1 commit

  • Restore the original logic ("fail on mountpoints, negatives and in
    case of fh_compose() failures"). Since commit 8177e (nfsd: clean up
    readdirplus encoding) that got broken -
        rv = fh_compose(fhp, exp, dchild, &cd->fh);
        if (rv)
                goto out;
        if (!dchild->d_inode)
                goto out;
        rv = 0;
    out:
    is equivalent to
        rv = fh_compose(fhp, exp, dchild, &cd->fh);
    out:
    and the second check has no effect whatsoever...

    Signed-off-by: Al Viro

    Al Viro
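
Why the second check was dead code, and what a restored version might look like, can be shown with a toy model. Everything here is hypothetical: `struct toy_dentry`, `toy_fh_compose()` and the -1 error value are stand-ins, and the fixed ordering is a sketch of the stated intent, not the actual patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical dentry: a NULL inode means a negative dentry. */
struct toy_dentry {
    int is_mountpoint;
    const void *inode;
};

/* Stand-in for fh_compose(); always succeeds in this sketch. */
static int toy_fh_compose(const struct toy_dentry *d)
{
    (void)d;
    return 0;
}

/* Broken shape: rv is already 0 when the inode check runs. */
static int toy_compose_broken(const struct toy_dentry *d)
{
    int rv = toy_fh_compose(d);

    if (rv)
        goto out;
    if (!d->inode)
        goto out;   /* no effect: rv == 0 either way */
    rv = 0;
out:
    return rv;
}

/* Restored intent: fail on mountpoints and negatives before composing. */
static int toy_compose_fixed(const struct toy_dentry *d)
{
    int rv = -1;    /* assumed error value for this sketch */

    if (d->is_mountpoint)
        goto out;
    if (!d->inode)
        goto out;
    rv = toy_fh_compose(d);
out:
    return rv;
}
```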
     

30 May, 2011

1 commit

  • * 'for-2.6.40' of git://linux-nfs.org/~bfields/linux: (22 commits)
    nfsd: make local functions static
    NFSD: Remove unused variable from nfsd4_decode_bind_conn_to_session()
    NFSD: Check status from nfsd4_map_bcts_dir()
    NFSD: Remove setting unused variable in nfsd_vfs_read()
    nfsd41: error out on repeated RECLAIM_COMPLETE
    nfsd41: compare request's opcnt with session's maxops at nfsd4_sequence
    nfsd v4.1 LOCKT clientid field must be ignored
    nfsd41: add flag checking for create_session
    nfsd41: make sure nfs server process OPEN with EXCLUSIVE4_1 correctly
    nfsd4: fix wrongsec handling for PUTFH + op cases
    nfsd4: make fh_verify responsibility of nfsd_lookup_dentry caller
    nfsd4: introduce OPDESC helper
    nfsd4: allow fh_verify caller to skip pseudoflavor checks
    nfsd: distinguish functions of NFSD_MAY_* flags
    svcrpc: complete svsk processing on cb receive failure
    svcrpc: take advantage of tcp autotuning
    SUNRPC: Don't wait for full record to receive tcp data
    svcrpc: copy cb reply instead of pages
    svcrpc: close connection if client sends short packet
    svcrpc: note network-order types in svc_process_calldir
    ...

    Linus Torvalds
     

19 May, 2011

1 commit


31 Mar, 2011

1 commit


09 Dec, 2010

1 commit

  • If vfs_getattr in fill_post_wcc returns an error, we don't
    set fh_post_change.
    For NFSv4, this can result in set_change_info triggering a BUG_ON.
    i.e. fh_post_saved being zero isn't really a bug.

    So:
    - instead of BUGging when fh_post_saved is zero, just clear ->atomic.
    - if vfs_getattr fails in fill_post_wcc, take a copy of i_ctime anyway.
    This will be used in set_change_info, but not overly trusted.
    - While we are there, remove the pointless 'if' statements in set_change_info.
    There is no harm setting all the values.

    Signed-off-by: NeilBrown
    Cc: stable@kernel.org
    Signed-off-by: J. Bruce Fields

    Neil Brown
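
The three points above can be modelled in a few lines. This is a sketch under assumptions: the struct layouts and `toy_*` functions are hypothetical simplifications, and deriving `atomic` directly from `post_saved` is this sketch's reading of "just clear ->atomic".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, reduced file-handle state. */
struct toy_fh {
    int post_saved;
    uint64_t pre_change;
    uint64_t post_change;
};

/* Hypothetical, reduced NFSv4 change_info. */
struct toy_cinfo {
    int atomic;
    uint64_t before;
    uint64_t after;
};

/* On getattr failure, still record a ctime-derived value, but unsaved. */
static void toy_fill_post_wcc(struct toy_fh *fh, int getattr_err,
                              uint64_t change, uint64_t ctime_fallback)
{
    fh->post_saved = !getattr_err;
    fh->post_change = getattr_err ? ctime_fallback : change;
}

/* No BUG_ON: a missing post value only clears the atomic flag. */
static void toy_set_change_info(struct toy_cinfo *c, const struct toy_fh *fh)
{
    c->atomic = fh->post_saved;
    c->before = fh->pre_change;   /* set unconditionally */
    c->after = fh->post_change;
}
```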
     

16 Dec, 2009

1 commit


15 Dec, 2009

2 commits


15 Nov, 2009

1 commit

  • Commit 8177e6d6dfb9cd03d9bdeb647c32161f8f58f686 ("nfsd: clean up
    readdirplus encoding") introduced a single-character typo in the nfs3
    readdir+ implementation. Unfortunately that typo has quite bad side effects:
    random memory corruption, followed (on my box) with immediate
    spontaneous box reboot.

    Using 'p1' instead of 'p' fixes my Linux box rebooting whenever VMware
    ESXi box tries to list contents of my home directory.

    Signed-off-by: Petr Vandrovec
    Cc: "J. Bruce Fields"
    Cc: Neil Brown
    Signed-off-by: Linus Torvalds

    Petr Vandrovec
     

05 Sep, 2009

2 commits


29 Apr, 2009

1 commit

  • ext4 supports a real NFSv4 change attribute, which is bumped whenever
    the ctime would be updated, including times when two updates arrive
    within a jiffy of each other. (Note that although ext4 has space for
    nanosecond-precision ctime, the real resolution is lower: it actually
    uses jiffies as the time-source.) This ensures clients will invalidate
    their caches when they need to.

    There is some fear that keeping the i_version up-to-date could have
    performance drawbacks, so for now it's turned on only by a mount option.
    We hope to do something better eventually.

    Signed-off-by: J. Bruce Fields
    Cc: Theodore Tso

    J. Bruce Fields
     

15 Feb, 2008

1 commit

  • I'm embedding struct path into struct svc_export.

    [akpm@linux-foundation.org: coding-style fixes]
    [ezk@cs.sunysb.edu: NFSD: fix wrong mnt_writer count in rename]
    Signed-off-by: Jan Blunck
    Acked-by: J. Bruce Fields
    Acked-by: Christoph Hellwig
    Cc: Al Viro
    Cc: "J. Bruce Fields"
    Cc: Neil Brown
    Cc: Trond Myklebust
    Signed-off-by: Erez Zadok
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Blunck
     

02 Feb, 2008

4 commits


14 Jan, 2008

1 commit

  • When RPCSEC/GSS and krb5i are used, requests are padded, typically to a multiple
    of 8 bytes. This can make the request look slightly longer than it
    really is.

    As of

    f34b95689d2ce001c "The NFSv2/NFSv3 server does not handle zero
    length WRITE request correctly",

    the xdr decode routines for NFSv2 and NFSv3 reject requests that aren't
    the right length, so krb5i (for example) WRITE requests can get lost.

    This patch relaxes the appropriate test and enhances the related comment.

    Signed-off-by: Neil Brown
    Signed-off-by: J. Bruce Fields
    Cc: Peter Staubach
    Signed-off-by: Linus Torvalds

    NeilBrown
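
The relaxed test can be sketched as a switch from an exact-length comparison to an at-least comparison. `TOY_QUADLEN` mirrors the usual round-up-to-4-byte-words calculation; the function names and the exact form of both checks are assumptions for illustration, not the patch itself.

```c
#include <assert.h>
#include <stddef.h>

/* Number of 4-byte XDR words needed to hold l bytes. */
#define TOY_QUADLEN(l) ((((size_t)(l)) + 3) >> 2)

/* Strict form: the data present must match the padded count exactly. */
static int toy_len_ok_strict(size_t avail, size_t count)
{
    return avail == TOY_QUADLEN(count) * 4;
}

/* Relaxed form: extra trailing bytes (e.g. krb5i padding) are tolerated. */
static int toy_len_ok_relaxed(size_t avail, size_t count)
{
    return avail >= TOY_QUADLEN(count) * 4;
}
```

A 5-byte WRITE needs 8 bytes on the wire; if integrity padding makes 16 bytes available, the strict form rejects the request and the relaxed form accepts it.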
     

10 Oct, 2007

1 commit

  • Modify the NFS server code to support 64 bit ino's, as
    appropriate for the system and the NFS protocol version.

    The gist of the changes is to query the underlying file system
    for attributes and not just to use the cached attributes in the
    inode. For this specific purpose, the inode only contains an
    ino field which is an unsigned long, which is large enough on 64 bit
    platforms, but is not large enough on 32 bit platforms.

    I haven't been able to find any reason why ->getattr can't be called
    while holding i_mutex. The specification indicates that i_mutex is not
    required to be held in order to invoke ->getattr, but it doesn't say
    that i_mutex can't be held while invoking ->getattr.

    I also haven't come to any conclusions regarding the value of
    lease_get_mtime() and whether it should or should not be invoked
    by fill_post_wcc() too. I chose not to change this because I
    thought that it was safer to leave well enough alone. If we
    decide to make a change, it can be done separately.

    Signed-off-by: Peter Staubach
    Signed-off-by: J. Bruce Fields
    Acked-by: Neil Brown

    Peter Staubach
     

10 May, 2007

2 commits

  • 1/ decode_sattr and decode_sattr3 never return NULL, so remove
    several checks for that. ditto for xdr_decode_hyper.

    2/ replace some open coded XDR_QUADLEN calls with calls to
    XDR_QUADLEN

    3/ in decode_writeargs, simplify an 'if' to use a single
    calculation.
    .page_len is the length of that part of the packet that did
    not fit in the first page (the head).
    So the length of the data part is the remainder of the
    head, plus page_len.

    4/ other minor cleanups.

    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     
  • The NFSv2 and NFSv3 servers do not handle WRITE requests for 0 bytes
    correctly. The specifications indicate that the server should accept the
    request, but it should mostly turn into a no-op. Currently, the server
    will return an XDR decode error, which it should not.

    Attached is a patch which addresses this issue. It also adds some boundary
    checking to ensure that the request contains as much data as was requested
    to be written. It also correctly handles an NFSv3 request which requests
    to write more data than the server has stated that it is prepared to
    handle. Previously, there was some support which looked like it should
    work, but wasn't quite right.

    Signed-off-by: Peter Staubach
    Acked-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Staubach
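
The three behaviours described above (accept a zero-byte WRITE, reject a request carrying less data than it claims, and cap an oversized count) can be sketched as one decode check. The function, its parameters and the order of the checks are hypothetical; this is not the nfsd decoder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of 4-byte XDR words needed to hold l bytes. */
#define TOY_QUADLEN(l) ((((size_t)(l)) + 3) >> 2)

/*
 * Returns 1 if the WRITE arguments decode, 0 on an XDR decode error.
 * count:         bytes the client asks to write
 * max_blocksize: largest write the server said it would handle
 * avail:         data bytes actually present in the request
 */
static int toy_decode_write(uint32_t count, uint32_t max_blocksize,
                            size_t avail, uint32_t *len_out)
{
    /* Boundary check: the request must carry the data it claims. */
    if (avail < TOY_QUADLEN(count) * 4)
        return 0;

    /* Oversized request: write only what the server can handle. */
    if (count > max_blocksize)
        count = max_blocksize;

    /* A count of 0 falls through and decodes as a no-op write. */
    *len_out = count;
    return 1;
}
```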
     

28 Mar, 2007

1 commit

  • ->readdir passes loff_t offsets (used as nfs cookies) to
    nfs3svc_encode_entry{,_plus}, but when they pass it on to encode_entry it
    becomes an 'off_t', which isn't good.

    So filesystems that returned 64bit offsets would lose.

    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
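
What "would lose" means here can be shown by narrowing a 64-bit cookie to 32 bits. The sketch assumes a 32-bit off_t, modelled explicitly with `uint32_t` so the result does not depend on the build platform; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model passing a 64-bit directory cookie through a 32-bit type:
 * the upper 32 bits are silently discarded.
 */
static uint32_t toy_narrow_cookie(int64_t cookie)
{
    return (uint32_t)cookie;
}
```

Two different 64-bit cookies that share their low 32 bits become indistinguishable after narrowing, so a client resuming a readdir could land on the wrong entry.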
     

15 Feb, 2007

1 commit

  • Add support for using a filesystem UUID to identify an export point in the
    filehandle.

    For NFSv2, this UUID is xor-ed down to 4 or 8 bytes so that it doesn't take up
    too much room. For NFSv3+, we use the full 16 bytes, and possibly also a
    64bit inode number for exports beneath the root of a filesystem.

    When generating an fsid to return in 'stat' information, use the UUID (hashed
    down to size) if it is available and a small 'fsid' was not specifically
    provided.

    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
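
The "xor-ed down" step can be sketched as folding the 16 UUID bytes onto a shorter buffer. The function name and the byte-wise formulation are assumptions for illustration; the kernel's actual filehandle code is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Fold a 16-byte UUID down to out_len bytes (4 or 8 in the NFSv2 case)
 * by xor-ing each input byte into out[i % out_len].
 */
static void toy_fold_uuid(const uint8_t uuid[16], uint8_t *out, size_t out_len)
{
    size_t i;

    memset(out, 0, out_len);
    for (i = 0; i < 16; i++)
        out[i % out_len] ^= uuid[i];
}
```

Folding loses information, so two different UUIDs can collide after folding; that is the trade-off for fitting into the small NFSv2 filehandle.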