14 Oct, 2016

1 commit

  • Pull nfsd updates from Bruce Fields:
    "Some RDMA work and some good bugfixes, and two new features that could
    benefit from user testing:

    - Anna Schumaker contributed a simple NFSv4.2 COPY implementation.
    COPY is already supported on the client side, so a call to
    copy_file_range() on a recent client should now result in a
    server-side copy that doesn't require all the data to make a round
    trip to the client and back.

    - Jeff Layton implemented callbacks to notify clients when contended
    locks become available, which should reduce latency on workloads
    with contended locks"

    * tag 'nfsd-4.9' of git://linux-nfs.org/~bfields/linux:
    NFSD: Implement the COPY call
    nfsd: handle EUCLEAN
    nfsd: only WARN once on unmapped errors
    exportfs: be careful to only return expected errors.
    nfsd4: setclientid_confirm with unmatched verifier should fail
    nfsd: randomize SETCLIENTID reply to help distinguish servers
    nfsd: set the MAY_NOTIFY_LOCK flag in OPEN replies
    nfs: add a new NFS4_OPEN_RESULT_MAY_NOTIFY_LOCK constant
    nfsd: add a LRU list for blocked locks
    nfsd: have nfsd4_lock use blocking locks for v4.1+ locks
    nfsd: plumb in a CB_NOTIFY_LOCK operation
    NFSD: fix corruption in notifier registration
    svcrdma: support Remote Invalidation
    svcrdma: Server-side support for rpcrdma_connect_private
    rpcrdma: RDMA/CM private message data structure
    svcrdma: Skip put_page() when send_reply() fails
    svcrdma: Tail iovec leaves an orphaned DMA mapping
    nfsd: fix dprintk in nfsd4_encode_getdeviceinfo
    nfsd: eliminate cb_minorversion field
    nfsd: don't set a FL_LAYOUT lease for flexfiles layouts

    Linus Torvalds
     

08 Oct, 2016

2 commits


22 Sep, 2016

1 commit

  • inode_change_ok() will be responsible for clearing capabilities and IMA
    extended attributes and as such will need a dentry. Give it a dentry as
    an argument instead of an inode. Also rename inode_change_ok() to
    setattr_prepare() to better reflect that it makes some modifications in
    addition to performing checks.

    Reviewed-by: Christoph Hellwig
    Signed-off-by: Jan Kara

    Jan Kara
     

05 Aug, 2016

2 commits

  • There's some odd logic in nfsd_create() that allows it to be called with
    the parent directory either locked or unlocked. The only already-locked
    caller is NFSv2's nfsd_proc_create(). It's less confusing to split out
    the unlocked case into a separate function which the NFSv2 code can call
    directly.

    Also fix some comments while we're here.

    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     
  • lookup_one_len already has this check.

    The only effect of this patch is to return access instead of perm in the
    0-length-filename case. I actually prefer nfserr_perm (or _inval?), but
    I doubt anyone cares.

    The isdotent check seems redundant too, but I worry that some client
    might actually care about that strange nfserr_exist error.

    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     

29 May, 2015

1 commit

  • NFSv2 can set the atime and/or mtime of a file to specific timestamps but not
    to the server's current time. To implement the equivalent of utimes("file",
    NULL), it uses a heuristic.

    NFSv3 and later do support setting the atime and/or mtime to the server's
    current time directly. The NFSv2 heuristic is still enabled, and causes
    timestamps to be set wrong sometimes.

    Fix this by moving the heuristic into the NFSv2 specific code. We can leave it
    out of the create code path: the owner can always set timestamps arbitrarily,
    and the workaround would never trigger.

    Signed-off-by: Andreas Gruenbacher
    Reviewed-by: Christoph Hellwig
    Signed-off-by: J. Bruce Fields

    Andreas Gruenbacher
     

16 Apr, 2015

1 commit


30 Jul, 2014

1 commit

  • It's possible for nfsd to fail opening a file that it has just created.
    When that happens, we throw a WARN but it doesn't include any info about
    the error code. Print the status code to give us a bit more info.

    Our QA group hit some of these warnings under some very heavy stress
    testing. My suspicion is that they hit the file-max limit, but it's hard
    to know for sure. Go ahead and add a -ENFILE mapping to
    nfserr_serverfault to make the error more distinct (and correct).

    Signed-off-by: Jeff Layton
    Signed-off-by: J. Bruce Fields

    Jeff Layton
     

10 Jul, 2014

1 commit

  • I saw this pop up with some pynfs testing:

    [ 123.609992] nfsd: non-standard errno: -7

    ...and -7 is -E2BIG. I think what happened is that XFS returned -E2BIG
    due to some xattr operations with the ACL10 pynfs TEST (I guess it has
    limited xattr size?).

    Add a better mapping for that error since it's possible that we'll need
    it. How about we convert it to NFSERR_FBIG? As Bruce points out, they
    both have "BIG" in the name so it must be good.

    Also, turn the printk in this function into a WARN() so that we can get
    a bit more information about situations that don't have proper mappings.

    Signed-off-by: Jeff Layton
    Signed-off-by: J. Bruce Fields

    Jeff Layton
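
The errno-to-NFS-status mapping discussed above can be pictured as a small lookup table with a fallback. The following is a simplified userspace sketch, not the kernel's nfserrno(): the names map_errno() and errno_table are invented here, the status values are the host-order numbers from RFC 1813 rather than the big-endian constants nfsd uses, and a plain fprintf() stands in for the WARN().

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Host-order NFSv3 status values (RFC 1813); only a few are listed. */
enum nfs_status {
    NFS_OK      = 0,
    NFSERR_IO   = 5,
    NFSERR_FBIG = 27,
};

struct errno_map {
    int sys_errno;
    enum nfs_status status;
};

/* Hypothetical table; the real one has many more entries. */
static const struct errno_map errno_table[] = {
    { 0,     NFS_OK },
    { EIO,   NFSERR_IO },
    { EFBIG, NFSERR_FBIG },
    { E2BIG, NFSERR_FBIG },  /* the mapping this commit proposes */
};

/* Takes a negative errno, as kernel functions return it. */
enum nfs_status map_errno(int err)
{
    int e = err < 0 ? -err : err;
    size_t i;

    for (i = 0; i < sizeof(errno_table) / sizeof(errno_table[0]); i++)
        if (errno_table[i].sys_errno == e)
            return errno_table[i].status;

    /* Unmapped: complain loudly and fall back to a generic I/O error. */
    fprintf(stderr, "non-standard errno: %d\n", -e);
    return NFSERR_IO;
}
```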
     

09 Jul, 2014

3 commits

  • Commit db2e747b1499 (vfs: remove mode parameter from vfs_symlink())
    removed the mode parameter from vfs_symlink(), so nfsd_symlink() no
    longer needs iattr. Just remove it.

    Signed-off-by: Kinglong Mee
    Signed-off-by: J. Bruce Fields

    Kinglong Mee
     
  • Currently nfsd_symlink has a weird hack to serve callers who don't
    null-terminate symlink data: it looks ahead at the next byte to see if
    it's zero, and copies it to a new buffer to null-terminate if not.

    That means callers don't have to null-terminate, but they *do* have to
    ensure that the byte following the end of the data is theirs to read.

    That's a bit subtle, and the NFSv4 code actually got this wrong.

    So let's just throw out that code and let callers pass null-terminated
    strings; we've already fixed them to do that.

    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
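
One way a caller can hand over a null-terminated string without ever reading the byte past its data is to copy the data into a buffer one byte larger. This is a userspace sketch of that idea only; dup_terminated() is a made-up name and is not what the NFSv2/v4 callers in nfsd actually do.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Return a freshly allocated, null-terminated copy of the first len
 * bytes of data, or NULL if allocation fails. Only data[0..len-1] is
 * read, so the byte after the data need not belong to the caller.
 */
char *dup_terminated(const char *data, size_t len)
{
    char *copy = malloc(len + 1);

    if (!copy)
        return NULL;
    memcpy(copy, data, len);
    copy[len] = '\0';
    return copy;
}
```

The caller owns the returned buffer and must free() it.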
     
  • It's simple enough for NFSv2 to null-terminate the symlink data.

    A bit weird (it depends on knowing that we've already read the following
    byte, which is either padding or part of the mode), but no worse than
    the conditional kstrdup it otherwise relies on in nfsd_symlink().

    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     

26 Feb, 2013

1 commit


31 Jul, 2012

1 commit

  • When mnt_want_write() starts to handle freezing it will get full lock
    semantics, requiring proper lock ordering. So push the mnt_want_write()
    call consistently outside of i_mutex.

    CC: linux-nfs@vger.kernel.org
    CC: "J. Bruce Fields"
    Signed-off-by: Jan Kara
    Signed-off-by: Al Viro

    Jan Kara
     

05 Jan, 2011

4 commits


31 Jul, 2010

1 commit


30 Jul, 2010

1 commit

  • Fixes at least one real minor bug: the nfs4 recovery dir sysctl
    would not return its status properly.

    Also I finished Al's 1e41568d7378d ("Take ima_path_check() in nfsd
    past dentry_open() in nfsd_open()") commit: it moved the IMA
    code but left the old path initializer in there.

    The rest is just dead code removal, I think, although I was not
    fully sure about the "is_borc" stuff. Some more review
    would still be good.

    Found by gcc 4.6's new warnings.

    Signed-off-by: Andi Kleen
    Cc: Al Viro
    Cc: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: J. Bruce Fields

    Andi Kleen
     

16 Dec, 2009

1 commit


15 Dec, 2009

2 commits


14 Nov, 2009

1 commit


29 Sep, 2009

1 commit

  • We really shouldn't hit this case at all, and forthcoming kernel and
    nfs-utils changes should eliminate this case; if it does happen,
    consider it a bug rather than reporting an error that doesn't really
    make sense for the operation (since there's no reason for a server to be
    accepting v4 traffic yet have no root filehandle).

    Also move some exp_pseudoroot code into a helper function while we're
    here.

    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     

16 Jun, 2009

1 commit

  • Kill off the obscure macro 'PROC' of NFSv2&3 in order to make the code clearer.

    Among other things, this makes it simpler to grep for callers of these
    functions--something which has frequently caused confusion among nfs
    developers.

    Signed-off-by: Yu Zhiguo
    Signed-off-by: J. Bruce Fields

    Yu Zhiguo
     

19 Mar, 2009

1 commit

  • If a filesystem being written to via NFS returns a short write count
    (as opposed to an error) to nfsd, nfsd treats that as a success for
    the entire write, rather than the short count that actually succeeded.

    For example, given an 8192-byte write, if the underlying filesystem
    only writes 4096 bytes, nfsd will ack back to the nfs client that all
    8192 bytes were written. The nfs client does have retry logic for
    short writes, but this is never called as the client is told the
    complete write succeeded.

    There are probably other ways it could happen, but in my case it
    happened with a fuse (filesystem in userspace) filesystem which can
    rather easily have a partial write.

    Here is a patch to properly return the short write count to the
    client.

    Signed-off-by: David Shaw
    Signed-off-by: J. Bruce Fields

    David Shaw
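
To illustrate why the true count matters, here is a sketch of the kind of retry loop a writer can run once it is told how many bytes really went through. All names are hypothetical: write_fully() is the loop, and short_writer() is a stand-in backend that accepts at most 4096 bytes per call, mimicking the partial writes described above.

```c
#include <stddef.h>
#include <sys/types.h>

typedef ssize_t (*write_fn)(void *ctx, const char *buf, size_t len);

/* Stand-in backend: takes at most 4096 bytes and counts calls in *ctx. */
ssize_t short_writer(void *ctx, const char *buf, size_t len)
{
    (void)buf;
    ++*(int *)ctx;
    return (ssize_t)(len > 4096 ? 4096 : len);
}

/*
 * Keep writing until all len bytes are accepted, an error occurs, or
 * the backend makes no progress. Returns the bytes written or a
 * negative error. This only works if do_write reports the real count.
 */
ssize_t write_fully(write_fn do_write, void *ctx, const char *buf, size_t len)
{
    size_t done = 0;

    while (done < len) {
        ssize_t n = do_write(ctx, buf + done, len - done);

        if (n < 0)
            return n;
        if (n == 0)
            break;
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

If the backend claimed the full 8192 bytes after writing only 4096, the loop would exit after one call and the second half of the data would be lost silently, which is the failure the commit describes.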
     

08 Jan, 2009

1 commit


30 Sep, 2008

1 commit

  • RFC 2623 section 2.3.2 permits the server to bypass gss authentication
    checks for certain operations that a client may perform when mounting.
    In the case of a client that doesn't have some form of credentials
    available to it on boot, this allows it to perform the mount unattended.
    (Presumably real file access won't be needed until a user with
    credentials logs in.)

    Being slightly more lenient allows lots of old clients to access
    krb5-only exports, with the only loss being a small amount of
    information leaked about the root directory of the export.

    This affects only v2 and v3; v4 still requires authentication for all
    access.

    Thanks to Peter Staubach for testing against a Solaris client, which
    suggested the addition of v3 getattr to the list, and to Trond for noting
    that doing so exposes no additional information.

    Signed-off-by: J. Bruce Fields
    Cc: Peter Staubach
    Cc: Trond Myklebust

    J. Bruce Fields
     

24 Jun, 2008

2 commits

  • Rename nfsd_permission() specific MAY_* flags to NFSD_MAY_* to make it
    clear that these are not used outside nfsd, and to avoid name and
    number space conflicts with the VFS.

    [comment from hch: rename MAY_READ, MAY_WRITE and MAY_EXEC as well]

    Signed-off-by: Miklos Szeredi
    Signed-off-by: J. Bruce Fields

    Miklos Szeredi
     
  • OCFS2 can return -ERESTARTSYS from write requests (and possibly
    elsewhere) if there is a signal pending.

    If nfsd is shut down (by sending a signal to each thread) while there
    is still an IO load from the client, each thread could handle one last
    request with a signal pending. This can result in -ERESTARTSYS
    which is not understood by nfserrno() and so is reflected back to
    the client as nfserr_io aka -EIO. This is wrong.

    Instead, interpret ERESTARTSYS to mean "try again later" by returning
    nfserr_jukebox. The client will resend and - if the server is
    restarted - the write will (hopefully) be successful and everyone will
    be happy.

    The symptom that I narrowed down to this was:
    copy a large file via NFS to an OCFS2 filesystem, and restart
    the nfs server during the copy.
    The 'cp' might get an -EIO, and the file will be corrupted -
    presumably holes in the middle where writes appeared to fail.

    Signed-off-by: Neil Brown
    Signed-off-by: J. Bruce Fields

    NeilBrown
     

15 Feb, 2008

1 commit

  • I'm embedding struct path into struct svc_export.

    [akpm@linux-foundation.org: coding-style fixes]
    [ezk@cs.sunysb.edu: NFSD: fix wrong mnt_writer count in rename]
    Signed-off-by: Jan Blunck
    Acked-by: J. Bruce Fields
    Acked-by: Christoph Hellwig
    Cc: Al Viro
    Cc: "J. Bruce Fields"
    Cc: Neil Brown
    Cc: Trond Myklebust
    Signed-off-by: Erez Zadok
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Blunck
     

18 Jul, 2007

1 commit

  • Allow readonly access to vary depending on the pseudoflavor, using the flag
    passed with each pseudoflavor in the export downcall. The rest of the flags
    are ignored for now, though some day we might also allow id squashing to vary
    based on the flavor.

    Signed-off-by: "J. Bruce Fields"
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    J. Bruce Fields
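
A rough sketch of the lookup this implies: each export carries a default set of flags plus an optional list of per-pseudoflavor flags, and the read-only decision uses the entry matching the request's flavor if there is one. The struct layout, the EXP_RDONLY bit, and export_rdonly() are assumptions made for illustration and do not mirror the kernel's struct svc_export.

```c
#include <stdbool.h>
#include <stddef.h>

#define EXP_RDONLY 0x1  /* hypothetical flag bit */

struct flavor_info {
    unsigned int pseudoflavor;
    unsigned int flags;
};

struct export_entry {
    unsigned int flags;             /* export-wide default flags */
    size_t nflavors;                /* number of valid entries below */
    struct flavor_info flavors[4];
};

/* True if access through this pseudoflavor should be read-only. */
bool export_rdonly(const struct export_entry *exp, unsigned int pseudoflavor)
{
    size_t i;

    for (i = 0; i < exp->nflavors; i++)
        if (exp->flavors[i].pseudoflavor == pseudoflavor)
            return (exp->flavors[i].flags & EXP_RDONLY) != 0;

    /* No per-flavor entry: fall back to the export-wide setting. */
    return (exp->flags & EXP_RDONLY) != 0;
}
```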
     

10 May, 2007

1 commit

  • When the kernel calls svc_reserve to downsize the expected size of an RPC
    reply, it fails to account for the possibility of a checksum at the end of
    the packet. If a client mounts an NFSv2/3 export with sec=krb5i/p and does
    I/O, then you'll generally see messages similar to this in the server's ring
    buffer:

    RPC request reserved 164 but used 208

    While I was never able to verify it, I suspect that this problem is also
    the root cause of some oopses I've seen under these conditions:

    https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=227726

    This is probably also a problem for other sec= types and for NFSv4. The
    large reserved size for NFSv4 compound packets seems to generally paper
    over the problem, however.

    This patch adds a wrapper for svc_reserve that accounts for the possibility
    of a checksum. It also fixes up the appropriate callers of svc_reserve to
    call the wrapper. For now, it just uses a hardcoded value that I
    determined via testing. That value may need to be revised upward as things
    change, or we may want to eventually add a new auth_op that attempts to
    calculate this somehow.

    Unfortunately, there doesn't seem to be a good way to reliably determine
    the expected checksum length prior to actually calculating it, particularly
    with schemes like spkm3.

    Signed-off-by: Jeff Layton
    Acked-by: Neil Brown
    Cc: Trond Myklebust
    Acked-by: J. Bruce Fields
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Layton
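
The shape of such a wrapper can be sketched in a few lines. Everything here is illustrative: struct rpc_request, reserve(), reserve_auth(), and the AUTH_SLACK_BYTES value are stand-ins chosen for the example, not the kernel's svc_rqst, svc_reserve(), or the constant the patch actually uses.

```c
#include <stddef.h>

/* Hypothetical headroom for a trailing checksum; not the patch's value. */
#define AUTH_SLACK_BYTES 64

struct rpc_request {
    size_t reserved;    /* bytes reserved for the reply */
};

/* Stand-in for the existing reservation call. */
void reserve(struct rpc_request *rq, size_t space)
{
    rq->reserved = space;
}

/* Wrapper: reserve the caller's estimate plus room for a checksum. */
void reserve_auth(struct rpc_request *rq, size_t space)
{
    reserve(rq, space + AUTH_SLACK_BYTES);
}
```

With the numbers from the log message above, a caller asking for 164 bytes would end up with 228 reserved, which covers the 208 bytes actually used.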
     

13 Feb, 2007

1 commit

  • There are loads of places where the RPC server assumes that the rq_addr field
    contains an IPv4 address. Top among these are error and debugging messages
    that display the server's IP address.

    Let's refactor the address printing into a separate function that's smart
    enough to figure out the difference between IPv4 and IPv6 addresses.

    Signed-off-by: Chuck Lever
    Cc: Aurelien Charbon
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chuck Lever
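
In userspace, the same idea (one helper that inspects the address family and formats accordingly) might look like the sketch below. format_addr() is a hypothetical name; the kernel helper introduced by this patch has its own name and signature, which are not shown in the log.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>

/*
 * Format addr as text into buf (len bytes). Returns buf on success,
 * or NULL if inet_ntop() fails, e.g. because buf is too small.
 */
const char *format_addr(const struct sockaddr *addr, char *buf, size_t len)
{
    switch (addr->sa_family) {
    case AF_INET:
        return inet_ntop(AF_INET,
                         &((const struct sockaddr_in *)addr)->sin_addr,
                         buf, (socklen_t)len);
    case AF_INET6:
        return inet_ntop(AF_INET6,
                         &((const struct sockaddr_in6 *)addr)->sin6_addr,
                         buf, (socklen_t)len);
    default:
        /* Neither IPv4 nor IPv6: say so instead of guessing. */
        snprintf(buf, len, "unknown address family %d",
                 (int)addr->sa_family);
        return buf;
    }
}
```

A buffer of INET6_ADDRSTRLEN bytes is large enough for either family.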
     

21 Oct, 2006

3 commits


04 Oct, 2006

1 commit

  • The limit over UDP remains at 32K. Also, make some of the apparently
    arbitrary sizing constants clearer.

    The biggest change here involves replacing NFSSVC_MAXBLKSIZE by a function of
    the rqstp. This allows it to be different for different protocols (udp/tcp)
    and also allows it to depend on the server's declared sv_bufsz.

    Note that we don't actually increase sv_bufsz for nfs yet. That comes next.

    Signed-off-by: Greg Banks
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Greg Banks
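
Replacing a fixed constant with a function of the request can be sketched as follows. The struct, the enum, max_blksize(), and the TCP ceiling are assumptions for illustration; only the 32K UDP limit comes from the message above.

```c
#include <stddef.h>

#define MAXPAYLOAD_UDP (32 * 1024)      /* the 32K UDP limit noted above */
#define MAXPAYLOAD_TCP (1024 * 1024)    /* hypothetical TCP ceiling */

enum transport { TRANSPORT_UDP, TRANSPORT_TCP };

struct blk_request {
    enum transport proto;   /* transport the request arrived on */
    size_t server_bufsz;    /* buffer size the server declared */
};

/* Largest block size usable for this request. */
size_t max_blksize(const struct blk_request *rq)
{
    size_t max = rq->proto == TRANSPORT_UDP ? MAXPAYLOAD_UDP
                                            : MAXPAYLOAD_TCP;

    /* Never exceed what the server's buffers can hold. */
    if (rq->server_bufsz < max)
        max = rq->server_bufsz;
    return max;
}
```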