15 Jan, 2012

1 commit

  • * 'for-3.3' of git://linux-nfs.org/~bfields/linux: (31 commits)
    nfsd4: nfsd4_create_clid_dir return value is unused
    NFSD: Change name of extended attribute containing junction
    svcrpc: don't revert to SVC_POOL_DEFAULT on nfsd shutdown
    svcrpc: fix double-free on shutdown of nfsd after changing pool mode
    nfsd4: be forgiving in the absence of the recovery directory
    nfsd4: fix spurious 4.1 post-reboot failures
    NFSD: forget_delegations should use list_for_each_entry_safe
    NFSD: Only reinitilize the recall_lru list under the recall lock
    nfsd4: initialize special stateid's at compile time
    NFSd: use network-namespace-aware cache registering routines
    SUNRPC: create svc_xprt in proper network namespace
    svcrpc: update outdated BKL comment
    nfsd41: allow non-reclaim open-by-fh's in 4.1
    svcrpc: avoid memory-corruption on pool shutdown
    svcrpc: destroy server sockets all at once
    svcrpc: make svc_delete_xprt static
    nfsd: Fix oops when parsing a 0 length export
    nfsd4: Use kmemdup rather than duplicating its implementation
    nfsd4: add a separate (lockowner, inode) lookup
    nfsd4: fix CONFIG_NFSD_FAULT_INJECTION compile error
    ...

    Linus Torvalds
     

06 Jan, 2012

1 commit

  • As of fedfs-utils-0.8.0, user space stores all NFS junction
    information in a single extended attribute: "trusted.junction.nfs".

    Both FedFS and NFS basic junctions are stored in this one attribute,
    and the intention is that all future forms of NFS junction metadata
    will be stored in this attribute. Other protocols may use a different
    extended attribute.

    Thus NFSD needs to look only for that one extended attribute. The
    "trusted.junction.type" xattr is deprecated. fedfs-utils-0.8.0 will
    continue to attach a "trusted.junction.type" xattr to junctions, but
    future fedfs-utils releases may no longer do that.
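
    As a rough userspace illustration of "look only for that one extended
    attribute", the sketch below probes a path with getxattr(2). It is not
    the NFSD code itself; the helper name has_nfs_junction_xattr is
    hypothetical, and reading "trusted." attributes normally requires
    CAP_SYS_ADMIN.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/xattr.h>

/* Attribute name taken from the commit message above. */
#define JUNCTION_XATTR_NFS "trusted.junction.nfs"

/*
 * Hypothetical helper: returns 1 if the attribute is present on path,
 * 0 if it is absent (or xattrs are unsupported there), and -1 on any
 * other error, with errno left set by getxattr(2).
 */
static int has_nfs_junction_xattr(const char *path)
{
	/* A zero-sized buffer asks getxattr(2) only for the value's length. */
	ssize_t len = getxattr(path, JUNCTION_XATTR_NFS, NULL, 0);

	if (len >= 0)
		return 1;
	if (errno == ENODATA || errno == ENOTSUP)
		return 0;
	return -1;
}
```

    A caller that only needs a yes/no answer can treat both 0 and -1 as
    "not a junction" and skip any lookup of "trusted.junction.type".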

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     

04 Jan, 2012

2 commits


18 Oct, 2011

1 commit


14 Sep, 2011

1 commit


28 Aug, 2011

1 commit

  • A client that wants to execute a file must be able to read it. Read
    opens over nfs are therefore implicitly allowed for executable files
    even when those files are not readable.

    NFSv2/v3 get this right by using a passed-in NFSD_MAY_OWNER_OVERRIDE on
    read requests, but NFSv4 has gotten this wrong ever since
    dc730e173785e29b297aa605786c94adaffe2544 "nfsd4: fix owner-override on
    open", when we realized that the file owner shouldn't override
    permissions on non-reclaim NFSv4 opens.

    So we can't use NFSD_MAY_OWNER_OVERRIDE to tell nfsd_permission to allow
    reads of executable files.

    So, do the same thing we do whenever we encounter another weird NFS
    permission nit: define yet another NFSD_MAY_* flag.

    The industry's future standardization on 128-bit processors will be
    motivated primarily by the need for integers with enough bits for all
    the NFSD_MAY_* flags.
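
    A minimal sketch of the idea, assuming a simplified permission model:
    a denied read is retried as an execute check only when the caller
    passes an extra hint flag. The flag names and values below are
    hypothetical stand-ins, not the kernel's definitions.

```c
#include <stdbool.h>

/* Illustrative flag values; names and numbers here are hypothetical. */
#define MAY_EXEC         0x01u
#define MAY_READ         0x04u
#define MAY_READ_IF_EXEC 0x80u	/* hint: allow reads of executable files */

/* Simplified stand-in for the permissions the caller has on a file. */
struct file_perms {
	bool readable;
	bool executable;
};

/* Plain check: every requested access must be granted. */
static bool perms_allow(const struct file_perms *p, unsigned int acc)
{
	if ((acc & MAY_READ) && !p->readable)
		return false;
	if ((acc & MAY_EXEC) && !p->executable)
		return false;
	return true;
}

/* Returns true if the access is permitted, honouring the hint flag. */
static bool check_permission(const struct file_perms *p, unsigned int acc)
{
	if (perms_allow(p, acc & (MAY_READ | MAY_EXEC)))
		return true;
	/* A denied read is retried as an execute check when the hint is set. */
	if ((acc & MAY_READ) && (acc & MAY_READ_IF_EXEC))
		return perms_allow(p, MAY_EXEC);
	return false;
}
```

    With this shape the owner-override behaviour stays untouched, since the
    fallback is keyed on its own flag rather than on an existing one.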

    Reported-by: Leonardo Borda
    Cc: stable@kernel.org
    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     

27 Aug, 2011

2 commits


20 Jun, 2011

1 commit

  • Thanks to Casey Bodley for pointing out that on a read open we pass 0,
    instead of O_RDONLY, to break_lease, with the result that a read open is
    treated like a write open for the purposes of lease breaking!

    Reported-by: Casey Bodley
    Cc: stable@kernel.org
    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     

07 Jun, 2011

1 commit

  • Fix for commit 4795bb37effb7b8fe77e2d2034545d062d3788a8, "nfsd: break
    lease on unlink, link, and rename".

    If the LINK operation breaks a delegation, it returns NFS4ERR_NOENT
    (which is not a valid error in RFC 5661) instead of NFS4ERR_DELAY.
    The return value of nfsd_break_lease() in nfsd_link() must be
    converted from host_err to err.
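
    The host_err/err distinction can be sketched as a mapping from a
    negative host errno to a protocol status. The table below is a heavily
    reduced, hypothetical illustration (the function name and the choice of
    cases are assumptions); only the numeric status values come from the
    NFSv4 protocol.

```c
#include <errno.h>

/* Status values as defined by the NFSv4 protocol. */
enum nfs_status {
	NFS4_OK       = 0,
	NFS4ERR_NOENT = 2,
	NFS4ERR_IO    = 5,
	NFS4ERR_DELAY = 10008,
};

/* Hypothetical, reduced mapping from a negative host errno. */
static enum nfs_status host_to_nfs_status(int host_err)
{
	switch (host_err) {
	case 0:
		return NFS4_OK;
	case -ENOENT:
		return NFS4ERR_NOENT;
	case -EAGAIN:
		/* e.g. a non-blocking lease break that has not finished yet */
		return NFS4ERR_DELAY;
	default:
		return NFS4ERR_IO;
	}
}
```

    Returning a raw host errno where a protocol status is expected skips
    this step, so the client sees a code that does not describe what
    actually happened.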

    Signed-off-by: Casey Bodley
    Cc: stable@kernel.org
    Signed-off-by: J. Bruce Fields

    Casey Bodley
     

30 May, 2011

1 commit

  • * 'for-2.6.40' of git://linux-nfs.org/~bfields/linux: (22 commits)
    nfsd: make local functions static
    NFSD: Remove unused variable from nfsd4_decode_bind_conn_to_session()
    NFSD: Check status from nfsd4_map_bcts_dir()
    NFSD: Remove setting unused variable in nfsd_vfs_read()
    nfsd41: error out on repeated RECLAIM_COMPLETE
    nfsd41: compare request's opcnt with session's maxops at nfsd4_sequence
    nfsd v4.1 lOCKT clientid field must be ignored
    nfsd41: add flag checking for create_session
    nfsd41: make sure nfs server process OPEN with EXCLUSIVE4_1 correctly
    nfsd4: fix wrongsec handling for PUTFH + op cases
    nfsd4: make fh_verify responsibility of nfsd_lookup_dentry caller
    nfsd4: introduce OPDESC helper
    nfsd4: allow fh_verify caller to skip pseudoflavor checks
    nfsd: distinguish functions of NFSD_MAY_* flags
    svcrpc: complete svsk processing on cb receive failure
    svcrpc: take advantage of tcp autotuning
    SUNRPC: Don't wait for full record to receive tcp data
    svcrpc: copy cb reply instead of pages
    svcrpc: close connection if client sends short packet
    svcrpc: note network-order types in svc_process_calldir
    ...

    Linus Torvalds
     

30 Apr, 2011

2 commits


20 Apr, 2011

1 commit

  • An open on a NFS4 share using the O_CREAT flag on an existing file for
    which we have permissions to open but contained in a directory with no
    write permissions will fail with EACCES.

    A tcpdump shows that the client had set the open mode to UNCHECKED which
    indicates that the file should be created if it doesn't exist and
    encountering an existing file is not an error. Since in this case the
    file exists and can be opened by the user, the NFS server is wrong in
    attempting to check create permissions on the parent directory.

    The patch adds a conditional statement to check for create permissions
    only if the file doesn't exist.
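
    The conditional can be sketched as follows, assuming the two permission
    results are already known as booleans. The function name and its
    parameters are hypothetical and stand in for the real checks.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical model of an UNCHECKED open with create:
 *  - if the file exists, only permission on the file itself matters;
 *  - create permission on the parent directory is consulted only when
 *    the file has to be created.
 * Returns 0 on success or -EACCES.
 */
static int unchecked_open(bool file_exists, bool may_open_file,
			  bool may_create_in_dir)
{
	if (file_exists)
		return may_open_file ? 0 : -EACCES;
	return may_create_in_dir ? 0 : -EACCES;
}
```

    In the reported case (existing file, openable by the user, read-only
    parent directory) this returns 0 rather than -EACCES.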

    Signed-off-by: Sachin S. Prabhu
    Signed-off-by: J. Bruce Fields

    Sachin Prabhu
     

11 Apr, 2011

2 commits


24 Mar, 2011

1 commit

  • * 'for-2.6.39' of git://linux-nfs.org/~bfields/linux:
    SUNRPC: Remove resource leak in svc_rdma_send_error()
    nfsd: wrong index used in inner loop
    nfsd4: fix comment and remove unused nfsd4_file fields
    nfs41: make sure nfs server return right ca_maxresponsesize_cached
    nfsd: fix compile error
    svcrpc: fix bad argument in unix_domain_find
    nfsd4: fix struct file leak
    nfsd4: minor nfs4state.c reshuffling
    svcrpc: fix rare race on unix_domain creation
    nfsd41: modify the members value of nfsd4_op_flags
    nfsd: add proc file listing kernel's gss_krb5 enctypes
    gss:krb5 only include enctype numbers in gm_upcall_enctypes
    NFSD, VFS: Remove dead code in nfsd_rename()
    nfsd: kill unused macro definition
    locks: use assign_type()

    Linus Torvalds
     

18 Mar, 2011

1 commit


08 Mar, 2011

1 commit

  • Currently we have the following code in fs/nfsd/vfs.c::nfsd_rename() :

    ...
    host_err = nfsd_break_lease(odentry->d_inode);
    if (host_err)
        goto out_drop_write;
    if (ndentry->d_inode) {
        host_err = nfsd_break_lease(ndentry->d_inode);
        if (host_err)
            goto out_drop_write;
    }
    if (host_err)
        goto out_drop_write;
    ...

    'host_err' is guaranteed to be 0 by the time we test 'ndentry->d_inode'.
    If 'host_err' becomes != 0 inside the 'if' statement, then we goto
    'out_drop_write'. So, after the 'if' statement there is no way that
    'host_err' can be anything but 0, so the test afterwards is just dead
    code.
    This patch removes the dead code.

    Signed-off-by: Jesper Juhl
    Signed-off-by: J. Bruce Fields

    Jesper Juhl
     

14 Feb, 2011

3 commits

  • 4795bb37effb7b8fe77e2d2034545d062d3788a8 "nfsd: break lease on unlink,
    link, and rename", only broke the lease on the file that was being
    renamed, and didn't handle the case where the target path refers to an
    already-existing file that will be unlinked by a rename--in that case
    the target file should have any leases broken as well.

    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     
  • If nfsd fails to find a file exported via NFS in the readahead cache, it
    should increment the corresponding nfsdstats counter (ra_depth[10]), but due
    to a bug it may instead write to ra_depth[11], corrupting the following field.

    In a kernel with NFSDv4 compiled in the corruption takes the form of an
    increment of a counter of the number of NFSv4 operation 0's received; since
    there is no operation 0, this is harmless.

    In a kernel with NFSDv4 disabled it corrupts whatever happens to be in the
    memory beyond nfsdstats.
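
    One way to picture the overrun is a histogram with slots 0..10 followed
    directly by another field. The sketch below is a hypothetical model:
    the struct layout, the names, and the depth-to-slot formula are
    assumptions for illustration, and the clamp shows one defensive way to
    keep the index in range rather than the actual fix.

```c
#include <stddef.h>

#define RA_DEPTH_SLOTS 11	/* valid indices are 0..10 */

/* Hypothetical stand-in for the statistics block. */
struct stats {
	unsigned int ra_depth[RA_DEPTH_SLOTS];
	unsigned int next_field;	/* whatever follows the array */
};

/* Map a search depth to a histogram slot, never past the last index. */
static size_t ra_depth_slot(unsigned int depth, unsigned int cache_size)
{
	size_t slot;

	if (cache_size == 0)
		return RA_DEPTH_SLOTS - 1;
	slot = (size_t)depth * 10 / cache_size;	/* assumed formula */
	if (slot > RA_DEPTH_SLOTS - 1)
		slot = RA_DEPTH_SLOTS - 1;	/* clamp: a miss lands in slot 10 */
	return slot;
}

static void count_ra_depth(struct stats *s, unsigned int depth,
			   unsigned int cache_size)
{
	s->ra_depth[ra_depth_slot(depth, cache_size)]++;
}
```

    Without the clamp, a depth one step past the cache size would index
    slot 11 and increment next_field instead.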

    Signed-off-by: Konstantin Khorenko
    Cc: stable@kernel.org
    Signed-off-by: J. Bruce Fields

    Konstantin Khorenko
     
  • The exit cleanup isn't quite right here.

    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     

17 Jan, 2011

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (23 commits)
    sanitize vfsmount refcounting changes
    fix old umount_tree() breakage
    autofs4: Merge the remaining dentry ops tables
    Unexport do_add_mount() and add in follow_automount(), not ->d_automount()
    Allow d_manage() to be used in RCU-walk mode
    Remove a further kludge from __do_follow_link()
    autofs4: Bump version
    autofs4: Add v4 pseudo direct mount support
    autofs4: Fix wait validation
    autofs4: Clean up autofs4_free_ino()
    autofs4: Clean up dentry operations
    autofs4: Clean up inode operations
    autofs4: Remove unused code
    autofs4: Add d_manage() dentry operation
    autofs4: Add d_automount() dentry operation
    Remove the automount through follow_link() kludge code from pathwalk
    CIFS: Use d_automount() rather than abusing follow_link()
    NFS: Use d_automount() rather than abusing follow_link()
    AFS: Use d_automount() rather than abusing follow_link()
    Add an AT_NO_AUTOMOUNT flag to suppress terminal automount
    ...

    Linus Torvalds
     

16 Jan, 2011

1 commit

  • Add a dentry op (d_manage) to permit a filesystem to hold a process and make it
    sleep when it tries to transit away from one of that filesystem's directories
    during a pathwalk. The operation is keyed off a new dentry flag
    (DCACHE_MANAGE_TRANSIT).

    The filesystem is allowed to be selective about which processes it holds and
    which it permits to continue on or prohibits from transiting from each flagged
    directory. This will allow autofs to hold up client processes whilst letting
    its userspace daemon through to maintain the directory or the stuff behind it
    or mounted upon it.

    The ->d_manage() dentry operation:

    int (*d_manage)(struct path *path, bool mounting_here);

    takes a pointer to the directory about to be transited away from and a flag
    indicating whether the transit is undertaken by do_add_mount() or
    do_move_mount() skipping through a pile of filesystems mounted on a mountpoint.

    It should return 0 on success, to let the process continue on its way;
    -EISDIR to prohibit the caller from skipping to overmounted filesystems or
    automounting, and to use this directory; or some other error code to return to
    the user.
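
    The three return conventions can be modelled in userspace as below.
    This is only a sketch: struct path here is an empty stand-in for the
    kernel type, and classify_transit and example_manage are hypothetical
    names that do not exist in the kernel.

```c
#include <errno.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's struct path; contents are made up. */
struct path {
	const char *name;
};

/* Same shape as the operation quoted above. */
typedef int (*d_manage_fn)(struct path *path, bool mounting_here);

enum transit_action {
	TRANSIT_CONTINUE,	/* 0: carry on past this directory */
	TRANSIT_STAY,		/* -EISDIR: use this directory, go no further */
	TRANSIT_FAIL,		/* any other error: report it to the user */
};

/* Interpret the callback's return value the way the text describes. */
static enum transit_action classify_transit(d_manage_fn manage,
					    struct path *p,
					    bool mounting_here, int *err)
{
	int ret = manage(p, mounting_here);

	*err = 0;
	if (ret == 0)
		return TRANSIT_CONTINUE;
	if (ret == -EISDIR)
		return TRANSIT_STAY;
	*err = ret;
	return TRANSIT_FAIL;
}

/* Example policy: let mount code through, keep everyone else here. */
static int example_manage(struct path *path, bool mounting_here)
{
	(void)path;
	return mounting_here ? 0 : -EISDIR;
}
```

    A real implementation would typically decide per process, for instance
    letting the managing daemon continue while holding other callers.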

    ->d_manage() is called with namespace_sem writelocked if mounting_here is true
    and no other locks held, so it may sleep. However, if mounting_here is true,
    it may not initiate or wait for a mount or unmount upon the parameter
    directory, even if the act is actually performed by userspace.

    Within fs/namei.c, follow_managed() is extended to check with d_manage() first
    on each managed directory, before transiting away from it or attempting to
    automount upon it.

    follow_down() is renamed follow_down_one() and should only be used where the
    filesystem deliberately intends to avoid management steps (e.g. autofs).

    A new follow_down() is added that incorporates the loop done by all other
    callers of follow_down() (do_add/move_mount(), autofs and NFSD; whilst AFS, NFS
    and CIFS do use it, their use is removed by converting them to use
    d_automount()). The new follow_down() calls d_manage() as appropriate. It
    also takes an extra parameter to indicate if it is being called from mount code
    (with namespace_sem writelocked) which it passes to d_manage(). follow_down()
    ignores automount points so that it can be used to mount on them.

    __follow_mount_rcu() is made to abort rcu-walk mode if it hits a directory with
    DCACHE_MANAGE_TRANSIT set on the basis that we're probably going to have to
    sleep. It would be possible to enter d_manage() in rcu-walk mode too, and have
    that determine whether to abort or not itself. That would allow the autofs
    daemon to continue on in rcu-walk mode.

    Note that DCACHE_MANAGE_TRANSIT on a directory should be cleared when it isn't
    required as every transit from that directory will cause d_manage() to be
    invoked. It can always be set again when necessary.

    ==========================
    WHAT THIS MEANS FOR AUTOFS
    ==========================

    Autofs currently uses the lookup() inode op and the d_revalidate() dentry op to
    trigger the automounting of indirect mounts, and both of these can be called
    with i_mutex held.

    autofs knows that the i_mutex will be held by the caller in lookup(), and so
    can drop it before invoking the daemon - but this isn't so for d_revalidate(),
    since the lock is only held on _some_ of the code paths that call it. This
    means that autofs can't risk dropping i_mutex from its d_revalidate() function
    before it calls the daemon.

    The bug could manifest itself as, for example, a process that's trying to
    validate an automount dentry that gets made to wait because that dentry is
    expired and needs cleaning up:

    mkdir S ffffffff8014e05a 0 32580 24956
    Call Trace:
    [] :autofs4:autofs4_wait+0x674/0x897
    [] avc_has_perm+0x46/0x58
    [] autoremove_wake_function+0x0/0x2e
    [] :autofs4:autofs4_expire_wait+0x41/0x6b
    [] :autofs4:autofs4_revalidate+0x91/0x149
    [] __lookup_hash+0xa0/0x12f
    [] lookup_create+0x46/0x80
    [] sys_mkdirat+0x56/0xe4

    versus the automount daemon which wants to remove that dentry, but can't
    because the normal process is holding the i_mutex lock:

    automount D ffffffff8014e05a 0 32581 1 32561
    Call Trace:
    [] __mutex_lock_slowpath+0x60/0x9b
    [] do_path_lookup+0x2ca/0x2f1
    [] .text.lock.mutex+0xf/0x14
    [] do_rmdir+0x77/0xde
    [] tracesys+0x71/0xe0
    [] tracesys+0xd5/0xe0

    which means that the system is deadlocked.

    This patch allows autofs to hold up normal processes whilst the daemon goes
    ahead and does things to the dentry tree behind the automounter point without
    risking a deadlock as almost no locks are held in d_manage() and none in
    d_automount().

    Signed-off-by: David Howells
    Was-Acked-by: Ian Kent
    Signed-off-by: Al Viro

    David Howells
     

15 Jan, 2011

1 commit

  • * 'for-2.6.38' of git://linux-nfs.org/~bfields/linux: (62 commits)
    nfsd4: fix callback restarting
    nfsd: break lease on unlink, link, and rename
    nfsd4: break lease on nfsd setattr
    nfsd: don't support msnfs export option
    nfsd4: initialize cb_per_client
    nfsd4: allow restarting callbacks
    nfsd4: simplify nfsd4_cb_prepare
    nfsd4: give out delegations more quickly in 4.1 case
    nfsd4: add helper function to run callbacks
    nfsd4: make sure sequence flags are set after destroy_session
    nfsd4: re-probe callback on connection loss
    nfsd4: set sequence flag when backchannel is down
    nfsd4: keep finer-grained callback status
    rpc: allow xprt_class->setup to return a preexisting xprt
    rpc: keep backchannel xprt as long as server connection
    rpc: move sk_bc_xprt to svc_xprt
    nfsd4: allow backchannel recovery
    nfsd4: support BIND_CONN_TO_SESSION
    nfsd4: modify session list under cl_lock
    Documentation: fl_mylease no longer exists
    ...

    Fix up conflicts in fs/nfsd/vfs.c with the vfs-scale work. The
    vfs-scale work touched some msnfs cases, and this merge removes support
    for that entirely, so the conflict was trivial to resolve.

    Linus Torvalds
     

14 Jan, 2011

4 commits

  • Any change to any of the links pointing to an entry should also break
    delegations.

    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     
  • Leases (delegations) should really be broken on any metadata change, not
    just on size change.

    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     
  • We've long had these pointless #ifdef MSNFS's sprinkled throughout the
    code--pointless because MSNFS is always defined (and we give no config
    option to make that easy to change). So we could just remove the
    ifdef's and compile the resulting code unconditionally.

    But as long as we're there: why not just rip out this code entirely?
    The only purpose is to implement the "msnfs" export option which turns
    on Windows-like behavior in some cases, and:

    - the export option isn't documented anywhere;
    - the userland utilities (which would need to be able to parse
      "msnfs" in an export file) don't support it;
    - I don't know how to maintain this, as I don't know what the
      proper behavior is; and
    - google shows no evidence that anyone has ever used this.

    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     
  • * 'for-2.6.38/core' of git://git.kernel.dk/linux-2.6-block: (43 commits)
    block: ensure that completion error gets properly traced
    blktrace: add missing probe argument to block_bio_complete
    block cfq: don't use atomic_t for cfq_group
    block cfq: don't use atomic_t for cfq_queue
    block: trace event block fix unassigned field
    block: add internal hd part table references
    block: fix accounting bug on cross partition merges
    kref: add kref_test_and_get
    bio-integrity: mark kintegrityd_wq highpri and CPU intensive
    block: make kblockd_workqueue smarter
    Revert "sd: implement sd_check_events()"
    block: Clean up exit_io_context() source code.
    Fix compile warnings due to missing removal of a 'ret' variable
    fs/block: type signature of major_to_index(int) to major_to_index(unsigned)
    block: convert !IS_ERR(p) && p to !IS_ERR_NOR_NULL(p)
    cfq-iosched: don't check cfqg in choose_service_tree()
    fs/splice: Pull buf->ops->confirm() from splice_from_pipe actors
    cdrom: export cdrom_check_events()
    sd: implement sd_check_events()
    sr: implement sr_check_events()
    ...

    Linus Torvalds
     

07 Jan, 2011

1 commit

  • Make d_count non-atomic and protect it with d_lock. This allows us to ensure a
    0 refcount dentry remains 0 without dcache_lock. It is also fairly natural when
    we start protecting many other dentry members with d_lock.
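
    The pattern of a plain counter guarded by a per-object lock can be
    sketched in userspace with C11 atomics. Everything below is a
    hypothetical model (struct obj and its helpers are not kernel names);
    the spinlock plays the role of d_lock and the integer the role of
    d_count.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model: a plain counter guarded by a spinlock. */
struct obj {
	atomic_flag lock;	/* plays the role of d_lock */
	unsigned int count;	/* plain integer, touched only with lock held */
};

static void obj_init(struct obj *o, unsigned int count)
{
	atomic_flag_clear(&o->lock);
	o->count = count;
}

static void obj_lock(struct obj *o)
{
	while (atomic_flag_test_and_set_explicit(&o->lock,
						 memory_order_acquire))
		;	/* spin until the holder releases the lock */
}

static void obj_unlock(struct obj *o)
{
	atomic_flag_clear_explicit(&o->lock, memory_order_release);
}

/* Take a reference only if the object is still in use. */
static bool obj_get_unless_zero(struct obj *o)
{
	bool ok;

	obj_lock(o);
	ok = o->count > 0;
	if (ok)
		o->count++;
	obj_unlock(o);
	return ok;
}

/* Drop a reference; returns true when the count reached zero. */
static bool obj_put(struct obj *o)
{
	bool last;

	obj_lock(o);
	last = (--o->count == 0);
	obj_unlock(o);
	return last;
}
```

    Because the check and the increment happen under the same lock, a
    holder of the lock that observes a count of zero knows it stays zero
    until the lock is released.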

    Signed-off-by: Nick Piggin

    Nick Piggin
     

05 Jan, 2011

2 commits


20 Dec, 2010

1 commit

  • Commit a8adbe3 forgot to remove the return variable, kill it.

    drivers/block/loop.c: In function 'lo_splice_actor':
    drivers/block/loop.c:398: warning: unused variable 'ret'
    [...]
    fs/nfsd/vfs.c: In function 'nfsd_splice_actor':
    fs/nfsd/vfs.c:848: warning: unused variable 'ret'

    Reported-by: Stephen Rothwell
    Signed-off-by: Jens Axboe

    Jens Axboe
     

17 Dec, 2010

1 commit

  • This patch pulls calls to buf->ops->confirm() from all actors passed
    (also indirectly) to splice_from_pipe_feed().

    Is avoiding the call to buf->ops->confirm() while splice()ing to
    /dev/null an intentional optimization? No other user does that
    and this will remove this special case.

    Against current linux.git 6313e3c21743cc88bb5bd8aa72948ee1e83937b6.

    Signed-off-by: Michał Mirosław
    Signed-off-by: Jens Axboe

    Michał Mirosław
     

26 Oct, 2010

1 commit

  • Add a new helper to write out the inode using the writeback code,
    that is, including the correct dirty bit and list manipulation. A few
    filesystems already open-code this, and a lot of others should be
    using it instead of using write_inode_now, which also writes out the
    data.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     

27 Aug, 2010

1 commit

  • Commit ebabe9a9001af0af56c0c2780ca1576246e7a74b, "pass a struct path
    to vfs_statfs", introduced the struct path initialization, and this
    seems to trigger an Oops on my machine.

    The fh_dentry field may be NULL until it is set in fh_verify(), so the
    initialization of path must come after fh_verify().
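
    The ordering issue can be shown with a small userspace model, assuming
    stand-in types. All names below (struct fh, struct stat_path, verify,
    do_statfs) are hypothetical and only mirror the shape of the problem.

```c
#include <stddef.h>

struct dentry {
	int id;
};

/* Stand-in for a file handle: fh_dentry is NULL until verified. */
struct fh {
	struct dentry *fh_dentry;
};

/* Stand-in for the path built from the handle. */
struct stat_path {
	struct dentry *dentry;
};

static struct dentry root_dentry = { 1 };

/* Stand-in for verification: fills in fh_dentry, returns 0 on success. */
static int verify(struct fh *fhp)
{
	fhp->fh_dentry = &root_dentry;
	return 0;
}

static int do_statfs(struct fh *fhp, struct stat_path *out)
{
	int err = verify(fhp);

	if (err)
		return err;
	/* Build the path only now; before verify() fh_dentry may be NULL. */
	out->dentry = fhp->fh_dentry;
	return 0;
}
```

    Initializing the path in the declaration, before verify() runs, would
    copy the NULL pointer and any later dereference would fault.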

    Signed-off-by: Takashi Iwai
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Minchan Kim
    Signed-off-by: J. Bruce Fields

    Takashi Iwai
     

11 Aug, 2010

2 commits

  • * 'for-linus' of git://git.infradead.org/users/eparis/notify: (132 commits)
    fanotify: use both marks when possible
    fsnotify: pass both the vfsmount mark and inode mark
    fsnotify: walk the inode and vfsmount lists simultaneously
    fsnotify: rework ignored mark flushing
    fsnotify: remove global fsnotify groups lists
    fsnotify: remove group->mask
    fsnotify: remove the global masks
    fsnotify: cleanup should_send_event
    fanotify: use the mark in handler functions
    audit: use the mark in handler functions
    dnotify: use the mark in handler functions
    inotify: use the mark in handler functions
    fsnotify: send fsnotify_mark to groups in event handling functions
    fsnotify: Exchange list heads instead of moving elements
    fsnotify: srcu to protect read side of inode and vfsmount locks
    fsnotify: use an explicit flag to indicate fsnotify_destroy_mark has been called
    fsnotify: use _rcu functions for mark list traversal
    fsnotify: place marks on object in order of group memory address
    vfs/fsnotify: fsnotify_close can delay the final work in fput
    fsnotify: store struct file not struct path
    ...

    Fix up trivial delete/modify conflict in fs/notify/inotify/inotify.c.

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (96 commits)
    no need for list_for_each_entry_safe()/resetting with superblock list
    Fix sget() race with failing mount
    vfs: don't hold s_umount over close_bdev_exclusive() call
    sysv: do not mark superblock dirty on remount
    sysv: do not mark superblock dirty on mount
    btrfs: remove junk sb_dirt change
    BFS: clean up the superblock usage
    AFFS: wait for sb synchronization when needed
    AFFS: clean up dirty flag usage
    cifs: truncate fallout
    mbcache: fix shrinker function return value
    mbcache: Remove unused features
    add f_flags to struct statfs(64)
    pass a struct path to vfs_statfs
    update VFS documentation for method changes.
    All filesystems that need invalidate_inode_buffers() are doing that explicitly
    convert remaining ->clear_inode() to ->evict_inode()
    Make ->drop_inode() just return whether inode needs to be dropped
    fs/inode.c:clear_inode() is gone
    fs/inode.c:evict() doesn't care about delete vs. non-delete paths now
    ...

    Fix up trivial conflicts in fs/nilfs2/super.c

    Linus Torvalds
     

10 Aug, 2010

1 commit

  • We'll need the path to implement the flags field for statvfs support.
    We do have it available in all callers except:

    - ecryptfs_statfs. This one doesn't actually need vfs_statfs but just
      needs to call the lower filesystem's statfs method.
    - sys_ustat. Add a non-exported statfs_by_dentry helper for it, which
      won't be able to fill out the flags field later on.

    In addition rename the helpers for statfs vs fstatfs to do_*statfs instead
    of the misleading vfs prefix.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig