09 Sep, 2006

1 commit

  • The logic in nfs_direct_read_schedule and nfs_direct_write_schedule can
    allow data->npages to be one larger than rpages. This causes a page
    pointer to be written beyond the end of the pagevec in nfs_read_data (or
    nfs_write_data).

    Fix this by making nfs_(read|write)_alloc() calculate the size of the
    pagevec array, and initialise data->npages.

    Also get rid of the redundant argument to nfs_commit_alloc().

    Signed-off-by: Trond Myklebust
    Cc: Chuck Lever
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Trond Myklebust
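
The commit message does not say how the off-by-one arose. One plausible case (an assumption here) is a buffer that does not start on a page boundary, so that a page-sized transfer touches two pages. A minimal user-space sketch of sizing a page array from the byte range, with a hypothetical helper name and an assumed 4 KiB page size, might look like this:

```c
#include <stddef.h>

/* Assumed page size for this sketch; the kernel uses PAGE_SIZE / PAGE_SHIFT. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((size_t)1 << SKETCH_PAGE_SHIFT)

/*
 * Hypothetical helper: number of pages touched by `len` bytes that start
 * `pgbase` bytes into the first page. Rounding up after adding the
 * in-page offset accounts for a buffer that straddles a page boundary.
 */
static size_t sketch_pagevec_len(size_t pgbase, size_t len)
{
    return (pgbase + len + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
}
```

With these assumptions, 4096 bytes starting at offset 0 need one slot, while the same 4096 bytes starting at offset 1 need two. Sizing the array and setting the page count in one place keeps the two values from drifting apart.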
     

25 Aug, 2006

6 commits

  • This is needed in order to handle any NFS4ERR_DELAY errors that might be
    returned by the server. It also ensures that we map the NFSv4 errors before
    they are returned to userland.

    Signed-off-by: Trond Myklebust
    (cherry picked from 71c12b3f0abc7501f6ed231a6d17bc9c05a238dc commit)

    Trond Myklebust
     
  • Check the bounds of length specifiers more thoroughly in the XDR decoding of
    NFS4 readdir reply data.

    Currently, if the server returns a bitmap or attr length that causes the
    current decode point pointer to wrap, this could go undetected (consider a
    small "negative" length on a 32-bit machine).

    Also add a check into the main XDR decode handler to make sure that the amount
    of data is a multiple of four bytes (as specified by RFC-1014). This makes
    sure that we can do u32* pointer subtraction in the NFS client without risking
    an undefined result (the result is undefined if the pointers are not correctly
    aligned with respect to one another).

    Signed-off-by: David Howells
    Signed-off-by: Trond Myklebust
    (cherry picked from 5861fddd64a7eaf7e8b1a9997455a24e7f688092 commit)

    David Howells
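
The two checks described above can be sketched in user-space C. This is an illustration under stated assumptions, not the kernel's decoder: the function names are hypothetical, and the buffer is modelled as a pair of 32-bit word pointers. The point is to compare the requested length against the space remaining, rather than computing `p + nwords`, which could wrap.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: return 1 if `nwords` 32-bit words can be consumed
 * starting at `p` without passing `end`. The length is compared against
 * the remaining word count, so a huge ("negative") length cannot wrap
 * the decode pointer.
 */
static int sketch_xdr_words_ok(const uint32_t *p, const uint32_t *end,
                               uint32_t nwords)
{
    if (p > end)
        return 0;
    return nwords <= (size_t)(end - p);
}

/* Hypothetical helper: XDR data is a whole number of 4-byte units. */
static int sketch_xdr_len_aligned(size_t nbytes)
{
    return (nbytes & 3) == 0;
}
```

A length such as 0xfffffffe is rejected by the first helper, and a reply of 6 bytes is rejected by the second.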
     
  • The problem is that we may be caching writes that would extend the file and
    create a hole in the region that we are reading. In this case, we need to
    detect the eof from the server, ensure that we zero out the pages that
    are part of the hole and mark them as up to date.

    Signed-off-by: Trond Myklebust
    (cherry picked from 856b603b01b99146918c093969b6cb1b1b0f1c01 commit)

    Trond Myklebust
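
As a rough user-space analogy (not the kernel code, and with a hypothetical function name), handling a short read that hit end-of-file could be sketched as zero-filling the unread tail and then treating the whole range as valid:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: the caller asked for `want` bytes and the server
 * returned `got` bytes. If the server also reported EOF, the remainder is
 * a hole, so it is zero-filled and the whole range counts as valid.
 * Returns the number of bytes in `buf` that may be treated as up to date.
 */
static size_t sketch_fill_hole(unsigned char *buf, size_t want, size_t got,
                               int eof)
{
    if (eof && got < want) {
        memset(buf + got, 0, want - got);
        return want;
    }
    return got;
}
```

Without the EOF indication, the sketch leaves the tail alone and reports only the bytes actually received.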
     
  • rpc_unlink() and rpc_rmdir() will dput the dentry reference for you.

    Signed-off-by: Trond Myklebust
    (cherry picked from a05a57effa71a1f67ccbfc52335c10c8b85f3f6a commit)

    Trond Myklebust
     
  • Signed-off-by: Trond Myklebust
    (cherry picked from 88bf6d811b01a4be7fd507d18bf5f1c527989089 commit)

    Trond Myklebust
     
  • nfs_wb_page() waits on request completion and, as a result, is not safe to
    call from nfs_release_page() when that is invoked by the VM scanner as part
    of a GFP_NOFS allocation. Fix a possible deadlock by checking the gfp mask
    and refusing to release the page if __GFP_FS is not set.

    Signed-off-by: Nikita Danilov
    Signed-off-by: Trond Myklebust
    (cherry picked from 374d969debfb290bafcb41d28918dc6f7e43ce31 commit)

    Nikita Danilov
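
The mask test described above can be sketched as follows. The flag values and the function name are hypothetical stand-ins for this illustration; the real flags are defined by the kernel.

```c
/* Hypothetical flag values for this sketch only. */
#define SKETCH_GFP_IO 0x40u
#define SKETCH_GFP_FS 0x80u

/*
 * Hypothetical helper: return 1 if the caller may go on to flush and
 * release the page, 0 if it must refuse. Refusing when the FS bit is
 * clear avoids waiting on filesystem I/O from an allocation context
 * that is not allowed to re-enter the filesystem.
 */
static int sketch_may_release_page(unsigned int gfp)
{
    if (!(gfp & SKETCH_GFP_FS))
        return 0;
    return 1;
}
```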
     

04 Aug, 2006

2 commits

  • nfs_writedata_free() and nfs_readdata_free() can now become static.

    Signed-off-by: Adrian Bunk
    Cc: Trond Myklebust
    Signed-off-by: Andrew Morton
    Signed-off-by: Trond Myklebust
    (cherry picked from 5e1ce40f0c3c8f67591aff17756930d7a18ceb1a commit)

    Adrian Bunk
     
  • In one of the error paths of nfs_path, it may return with dcache_lock still
    held; fix this by adding and using a new error path Elong_unlock which unlocks
    dcache_lock.

    Signed-off-by: Josh Triplett
    Signed-off-by: Trond Myklebust
    (cherry picked from f4b90b43677fb23297c56802c3056fc304f988d9 commit)

    Josh Triplett
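
The general pattern, an error exit that releases the lock before returning, can be sketched in user-space C. Everything here is a simplified stand-in: the lock is modelled as a flag, the names are hypothetical, and a single unlock label replaces the separate labels of the real function.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

static int sketch_lock_held;                 /* stand-in for dcache_lock */
static void sketch_lock(void)   { sketch_lock_held = 1; }
static void sketch_unlock(void) { sketch_lock_held = 0; }

/*
 * Hypothetical helper: copy `name` into `buf` while holding the lock.
 * The "too long" failure is detected with the lock held, so it jumps to
 * a label that unlocks before returning.
 */
static int sketch_build_path(const char *name, char *buf, size_t buflen)
{
    size_t len = strlen(name);
    int err = 0;

    sketch_lock();
    if (len + 1 > buflen) {
        err = -ENAMETOOLONG;
        goto out_unlock;
    }
    memcpy(buf, name, len + 1);
out_unlock:
    sketch_unlock();
    return err;
}
```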
     

06 Jul, 2006

6 commits


04 Jul, 2006

1 commit


03 Jul, 2006

1 commit


01 Jul, 2006

4 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial:
    Remove obsolete #include
    remove obsolete swsusp_encrypt
    arch/arm26/Kconfig typos
    Documentation/IPMI typos
    Kconfig: Typos in net/sched/Kconfig
    v9fs: do not include linux/version.h
    Documentation/DocBook/mtdnand.tmpl: typo fixes
    typo fixes: specfic -> specific
    typo fixes in Documentation/networking/pktgen.txt
    typo fixes: occuring -> occurring
    typo fixes: infomation -> information
    typo fixes: disadvantadge -> disadvantage
    typo fixes: aquire -> acquire
    typo fixes: mecanism -> mechanism
    typo fixes: bandwith -> bandwidth
    fix a typo in the RTC_CLASS help text
    smb is no longer maintained

    Manually merged trivial conflict in arch/um/kernel/vmlinux.lds.S

    Linus Torvalds
     
  • Conversion of nr_unstable to a per zone counter

    We need to do some special modifications to the nfs code since there are
    multiple cases of disposition and we need to have a page ref for proper
    accounting.

    This converts the last critical page state of the VM and therefore we need to
    remove several functions that were depending on GET_PAGE_STATE_LAST in order
    to make the kernel compile again. We are only left with event type counters
    in page state.

    [akpm@osdl.org: bugfixes]
    Signed-off-by: Christoph Lameter
    Cc: Trond Myklebust
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • This makes nr_dirty a per zone counter. Looping over all processors is
    avoided during writeback state determination.

    The counter aggregation for nr_dirty had to be undone in the NFS layer since
    we summed up the page counts from multiple zones. Someone more familiar with
    NFS should probably review what I have done.

    [akpm@osdl.org: bugfix]
    Signed-off-by: Christoph Lameter
    Cc: Trond Myklebust
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • Signed-off-by: Jörn Engel
    Signed-off-by: Adrian Bunk

    Jörn Engel
     

29 Jun, 2006

3 commits


28 Jun, 2006

1 commit

  • Builds on ARM report link problems with common configurations like
    statically linked NFS (for nfsroot). The symptom is that __init
    section code references __exit section code; that won't work since
    the exit sections are discarded (since they can never be called).

    The best fix for these particular cases would be an "__init_or_exit"
    section annotation.

    Signed-off-by: David Brownell
    Acked-by: Trond Myklebust
    Signed-off-by: Linus Torvalds

    David Brownell
     

26 Jun, 2006

1 commit

  • Trond had apparently merged the same patch twice, causing a duplicate
    include of the "internal.h" file, with resulting obvious confusion.

    Tssk. I'm the only one allowed to send out trees that don't even
    compile! Who does this Trond guy think he is?

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

25 Jun, 2006

11 commits

  • Signed-off-by: Alexey Dobriyan
    Cc: Trond Myklebust
    Signed-off-by: Andrew Morton
    Signed-off-by: Trond Myklebust

    Alexey Dobriyan
     
  • fs/built-in.o:(__param+0x20): undefined reference to `nfs_idmap_cache_timeout'
    fs/built-in.o:(__param+0x48): undefined reference to `nfs_callback_set_tcpport'

    Cc: Alexey Dobriyan
    Cc: Andreas Gruenbacher
    Cc: Andy Adamson
    Cc: Chuck Lever
    Cc: David Howells
    Cc: J. Bruce Fields
    Cc: Manoj Naik
    Cc: Marc Eshel
    Cc: Trond Myklebust
    Signed-off-by: Andrew Morton
    Signed-off-by: Trond Myklebust

    Andrew Morton
     
  • Fix various problems with nfs4 disabled. And various other things.

    In file included from fs/nfs/inode.c:50:
    fs/nfs/internal.h:24: error: static declaration of 'nfs_do_refmount' follows non-static declaration
    include/linux/nfs_fs.h:320: error: previous declaration of 'nfs_do_refmount' was here
    fs/nfs/internal.h:65: warning: 'struct nfs4_fs_locations' declared inside parameter list
    fs/nfs/internal.h:65: warning: its scope is only this definition or declaration, which is probably not what you want
    fs/nfs/internal.h: In function 'nfs4_path':
    fs/nfs/internal.h:97: error: 'struct nfs_server' has no member named 'mnt_path'
    fs/nfs/inode.c: In function 'init_once':
    fs/nfs/inode.c:1116: error: 'struct nfs_inode' has no member named 'open_states'
    fs/nfs/inode.c:1116: error: 'struct nfs_inode' has no member named 'delegation'
    fs/nfs/inode.c:1116: error: 'struct nfs_inode' has no member named 'delegation_state'
    fs/nfs/inode.c:1116: error: 'struct nfs_inode' has no member named 'rwsem'
    distcc[26452] ERROR: compile fs/nfs/inode.c on g5/64 failed
    make[1]: *** [fs/nfs/inode.o] Error 1
    make: *** [fs/nfs/inode.o] Error 2
    make: *** Waiting for unfinished jobs....
    In file included from fs/nfs/nfs3xdr.c:26:
    fs/nfs/internal.h:24: error: static declaration of 'nfs_do_refmount' follows non-static declaration
    include/linux/nfs_fs.h:320: error: previous declaration of 'nfs_do_refmount' was here
    fs/nfs/internal.h:65: warning: 'struct nfs4_fs_locations' declared inside parameter list
    fs/nfs/internal.h:65: warning: its scope is only this definition or declaration, which is probably not what you want
    fs/nfs/internal.h: In function 'nfs4_path':
    fs/nfs/internal.h:97: error: 'struct nfs_server' has no member named 'mnt_path'
    distcc[26486] ERROR: compile fs/nfs/nfs3xdr.c on g5/64 failed
    make[1]: *** [fs/nfs/nfs3xdr.o] Error 1
    make: *** [fs/nfs/nfs3xdr.o] Error 2
    In file included from fs/nfs/nfs3proc.c:24:
    fs/nfs/internal.h:24: error: static declaration of 'nfs_do_refmount' follows non-static declaration
    include/linux/nfs_fs.h:320: error: previous declaration of 'nfs_do_refmount' was here
    fs/nfs/internal.h:65: warning: 'struct nfs4_fs_locations' declared inside parameter list
    fs/nfs/internal.h:65: warning: its scope is only this definition or declaration, which is probably not what you want
    fs/nfs/internal.h: In function 'nfs4_path':
    fs/nfs/internal.h:97: error: 'struct nfs_server' has no member named 'mnt_path'
    distcc[26469] ERROR: compile fs/nfs/nfs3proc.c on bix/32 failed
    make[1]: *** [fs/nfs/nfs3proc.o] Error 1
    make: *** [fs/nfs/nfs3proc.o] Error 2
    **FAILED**

    Cc: Alexey Dobriyan
    Cc: Andreas Gruenbacher
    Cc: Andy Adamson
    Cc: Chuck Lever
    Cc: David Howells
    Cc: J. Bruce Fields
    Cc: Manoj Naik
    Cc: Marc Eshel
    Cc: Trond Myklebust
    Signed-off-by: Andrew Morton
    Signed-off-by: Trond Myklebust

    Andrew Morton
     
  • Trond Myklebust
     
  • Re-arrange the logic in the NFS direct I/O path so that nfs_read/write_data
    structs are allocated just before they are scheduled, rather than
    allocating them all at once before we start scheduling requests.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Neil Brown observed that the kmalloc() in nfs_get_user_pages() is more
    likely to fail if the I/O is large enough to require the allocation of more
    than a single page to keep track of all the pinned pages in the user's
    buffer.

    Instead of tracking one large page array per dreq/iocb, track pages per
    nfs_read/write_data, just like the cached I/O path does. An array for
    pages is already allocated for us by nfs_readdata_alloc() (and the write
    and commit equivalents).

    This is also required for adding support for vectored I/O to the NFS direct
    I/O path.

    The original reason to pin the user buffer and allocate all the NFS data
    structures before trying to schedule I/O was to ensure all needed resources
    are allocated on the client before starting to send requests. This reduces
    the chance that resource exhaustion on the client will cause a short read
    or write.

    On the other hand, for an application making very large I/O requests,
    this means that it will be nearly impossible for the application to make
    forward progress on a resource-limited client.

    Thus, moving the buffer pinning functionality into the I/O scheduling
    loops should be good for scalability. The next patch will do the same for
    NFS data structure allocation.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Clean-up and fix a minor bug: the logic was dirtying page cache pages on
    both read and write operations.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Make the user_addr, user_count, and pos parameters explicit to the
    scheduler routines, and remove the fields from nfs_direct_req. The
    iovec API will be passing in a series of these, not just one set.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • An NFSv3/v4 client must reschedule on-the-wire writes if the writes are
    UNSTABLE, and the server reboots before the client can complete a
    subsequent COMMIT request.

    To support direct asynchronous scatter-gather writes, the write
    rescheduler in fs/nfs/direct.c must not depend on the I/O parameters
    in the controlling nfs_direct_req structure. iovecs can be somewhat
    arbitrarily complex, so there could be an unbounded amount of information
    to save for a rarely encountered requirement.

    Refactor the direct write rescheduler so it uses information from each
    nfs_write_data structure to reschedule writes, instead of caching that
    information in the controlling nfs_direct_req structure.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Factor out the logic that increments and decrements the outstanding I/O
    count. This will be a commonly used bit of code in upcoming patches.
    Also make this an atomic_t again, since it will be very often manipulated
    outside dreq->spin lock.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
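
A get/put pair around an atomic counter can be sketched with C11 atomics. The struct and function names are hypothetical, and C11 `atomic_int` stands in for the kernel's atomic_t:

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the request structure. */
struct sketch_dreq {
    atomic_int io_count;   /* outstanding I/O count */
};

/* Account for one more outstanding I/O. */
static void sketch_get_dreq(struct sketch_dreq *dreq)
{
    atomic_fetch_add(&dreq->io_count, 1);
}

/* Drop one outstanding I/O; returns 1 if this was the last one. */
static int sketch_put_dreq(struct sketch_dreq *dreq)
{
    /* atomic_fetch_sub returns the value held before the decrement. */
    return atomic_fetch_sub(&dreq->io_count, 1) == 1;
}
```

Because the decrement and the "was this the last one" test happen in a single atomic operation, callers need no lock to decide who completes the request.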
     
  • Conflicts:

    fs/nfs/inode.c
    fs/super.c

    Fix conflicts between patch 'NFS: Split fs/nfs/inode.c' and patch
    'VFS: Permit filesystem to override root dentry on mount'

    Trond Myklebust
     

23 Jun, 2006

3 commits

  • Pass the POSIX lock owner ID to the flush operation.

    This is useful for filesystems which don't want to store any locking state
    in inode->i_flock but want to handle locking/unlocking POSIX locks
    internally. FUSE is one such filesystem but I think it possible that some
    network filesystems would need this also.

    Also add a flag to indicate that a POSIX locking request was generated by
    close(), so filesystems using the above feature won't send an extra locking
    request in this case.

    Signed-off-by: Miklos Szeredi
    Cc: Trond Myklebust
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • Give the statfs superblock operation a dentry pointer rather than a superblock
    pointer.

    This complements the get_sb() patch. That reduced the significance of
    sb->s_root, allowing NFS to place a fake root there. However, NFS does
    require a dentry to use as a target for the statfs operation. This permits
    the root in the vfsmount to be used instead.

    linux/mount.h has been added where necessary to make allyesconfig build
    successfully.

    Interest has also been expressed for use with the FUSE and XFS filesystems.

    Signed-off-by: David Howells
    Acked-by: Al Viro
    Cc: Nathan Scott
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
     
  • Extend the get_sb() filesystem operation to take an extra argument that
    permits the VFS to pass in the target vfsmount that defines the mountpoint.

    The filesystem is then required to manually set the superblock and root dentry
    pointers. For most filesystems, this should be done with simple_set_mnt()
    which will set the superblock pointer and then set the root dentry to the
    superblock's s_root (as per the old default behaviour).

    The get_sb() op now returns an integer as there's now no need to return the
    superblock pointer.

    This patch permits a superblock to be implicitly shared amongst several mount
    points, such as can be done with NFS to avoid potential inode aliasing. In
    such a case, simple_set_mnt() would not be called, and instead the mnt_root
    and mnt_sb would be set directly.

    The patch also makes the following changes:

    (*) the get_sb_*() convenience functions in the core kernel now take a vfsmount
    pointer argument and return an integer, so most filesystems have to change
    very little.

    (*) If one of the convenience functions is not used, then get_sb() should
    normally call simple_set_mnt() to instantiate the vfsmount. This will
    always return 0, and so can be tail-called from get_sb().

    (*) generic_shutdown_super() now calls shrink_dcache_sb() to clean up the
    dcache upon superblock destruction rather than shrink_dcache_anon().

    This is required because the superblock may now have multiple trees that
    aren't actually bound to s_root, but that still need to be cleaned up. The
    currently called functions assume that the whole tree is rooted at s_root,
    and that anonymous dentries are not the roots of trees, which results in
    dentries being left unculled.

    However, with the way NFS superblock sharing is currently set to be
    implemented, these assumptions are violated: the root of the filesystem is
    simply a dummy dentry and inode (the real inode for '/' may well be
    inaccessible), and all the vfsmounts are rooted on anonymous[*] dentries
    with child trees.

    [*] Anonymous until discovered from another tree.

    (*) The documentation has been adjusted, including the additional bit of
    changing ext2_* into foo_* in the documentation.

    [akpm@osdl.org: convert ipath_fs, do other stuff]
    Signed-off-by: David Howells
    Acked-by: Al Viro
    Cc: Nathan Scott
    Cc: Roland Dreier
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
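
The calling convention described above can be modelled in user-space C. The types and functions below are hypothetical stand-ins, not the kernel's definitions, and details such as reference counting are omitted. They only show a get_sb() that fills in the mount and tail-calls a helper that always returns 0.

```c
/* Hypothetical stand-ins for the kernel structures. */
struct sketch_dentry      { const char *name; };
struct sketch_super_block { struct sketch_dentry *s_root; };
struct sketch_vfsmount {
    struct sketch_super_block *mnt_sb;
    struct sketch_dentry      *mnt_root;
};

/* Model of the helper: point the mount at the superblock and its s_root. */
static int sketch_simple_set_mnt(struct sketch_vfsmount *mnt,
                                 struct sketch_super_block *sb)
{
    mnt->mnt_sb = sb;
    mnt->mnt_root = sb->s_root;
    return 0;
}

/* Model of a get_sb() op: returns an integer, not a superblock pointer. */
static int sketch_get_sb(struct sketch_super_block *sb,
                         struct sketch_vfsmount *mnt)
{
    return sketch_simple_set_mnt(mnt, sb);
}
```

A filesystem that shares one superblock between several mounts would skip the helper and set the two mount fields directly, each mount pointing at its own root.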