11 Aug, 2010

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (96 commits)
    no need for list_for_each_entry_safe()/resetting with superblock list
    Fix sget() race with failing mount
    vfs: don't hold s_umount over close_bdev_exclusive() call
    sysv: do not mark superblock dirty on remount
    sysv: do not mark superblock dirty on mount
    btrfs: remove junk sb_dirt change
    BFS: clean up the superblock usage
    AFFS: wait for sb synchronization when needed
    AFFS: clean up dirty flag usage
    cifs: truncate fallout
    mbcache: fix shrinker function return value
    mbcache: Remove unused features
    add f_flags to struct statfs(64)
    pass a struct path to vfs_statfs
    update VFS documentation for method changes.
    All filesystems that need invalidate_inode_buffers() are doing that explicitly
    convert remaining ->clear_inode() to ->evict_inode()
    Make ->drop_inode() just return whether inode needs to be dropped
    fs/inode.c:clear_inode() is gone
    fs/inode.c:evict() doesn't care about delete vs. non-delete paths now
    ...

    Fix up trivial conflicts in fs/nilfs2/super.c

    Linus Torvalds
     

10 Aug, 2010

1 commit


04 Aug, 2010

1 commit


31 Jul, 2010

1 commit


15 May, 2010

4 commits


10 Apr, 2010

1 commit


30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include
    those headers directly instead of assuming availability. As this
    conversion needs to touch a large number of source files, the
    following script is used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surroundings. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have a fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

11 Mar, 2010

1 commit


06 Mar, 2010

8 commits


04 Mar, 2010

1 commit


10 Feb, 2010

1 commit

  • For NFSv2 and v3:

    O_DIRECT writes are always synchronous, and aren't cached, so nothing
    should be flushed when closing an NFS O_DIRECT file descriptor. Thus
    there are no write errors to report on close(2).

    In addition, there's no cached data to verify on the next open(2),
    so we don't need clean GETATTR results at close time to compare with.

    Thus, there's no need for the nfs_revalidate_inode() call when closing
    an NFS O_DIRECT file. This reduces the number of synchronous
    on-the-wire requests for a simple open-write-close of an NFS O_DIRECT
    file by roughly 20%.

    For NFSv4:

    Call nfs4_do_close() with wait set to zero when closing an NFS
    O_DIRECT file. The CLOSE will go on the wire, but the application
    won't wait for it to complete.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     

03 Feb, 2010

1 commit


24 Sep, 2009

1 commit

  • Update some fs code to make use of new helper functions introduced
    in the previous patch. Should be no significant change in behaviour
    (except CIFS now calls send_sig under i_lock, via inode_newsize_ok).

    Reviewed-by: Christoph Hellwig
    Acked-by: Miklos Szeredi
    Cc: linux-nfs@vger.kernel.org
    Cc: Trond.Myklebust@netapp.com
    Cc: linux-cifs-client@lists.samba.org
    Cc: sfrench@samba.org
    Signed-off-by: Nick Piggin
    Signed-off-by: Al Viro

    npiggin@suse.de
     

20 Aug, 2009

1 commit

  • The NFSv4 and NFSv4.1 protocols both allow for the redirection of a client
    from one server to another in order to support filesystem migration and
    replication. For full protocol support, we need to add the ability to
    convert a DNS host name into an IP address that we can feed to the RPC
    client.

    We'll reuse the sunrpc cache, now that it has been converted to work with
    rpc_pipefs.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     

10 Aug, 2009

1 commit

  • If the NFSv4 server doesn't support a POSIX attribute, the generic NFS code
    needs to know that, so that it doesn't keep trying to poll for it.

    However, by the same count, if the NFSv4 server does support that
    attribute, then we should ensure that the inode metadata is appropriately
    labelled as being untrusted. For instance, if we don't know the correct
    value of the file's uid, we should certainly not be caching ACLs or ACCESS
    results.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     

13 Jul, 2009

1 commit

  • * Remove smp_lock.h from files which don't need it (including some headers!)
    * Add smp_lock.h to files which do need it
    * Make smp_lock.h include conditional in hardirq.h
    It's needed only for one kernel_locked() usage which is under CONFIG_PREEMPT

    This will make hardirq.h inclusion cheaper for every PREEMPT=n config
    (which includes allmodconfig/allyesconfig, BTW)

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

03 Apr, 2009

2 commits


20 Mar, 2009

1 commit

  • Close-to-open cache consistency rules really only require us to flush out
    writes on calls to close(), and require us to revalidate attributes on the
    very last close of the file.

    Currently we appear to be doing a lot of extra attribute revalidation
    and cache flushes.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     

12 Mar, 2009

5 commits

  • The following patch is a combination of a patch by myself and Peter
    Staubach.

    Trond: If we allow other processes to dirty pages while a process is doing
    a consistency sync to disk, we can end up never making progress.

    Peter: Attached is a patch which addresses a continuing problem with
    the NFS client generating out of order WRITE requests. While
    this is compliant with all of the current protocol
    specifications, there are servers in the market which can not
    handle out of order WRITE requests very well. Also, this may
    lead to sub-optimal block allocations in the underlying file
    system on the server. This may cause the read throughputs to
    be reduced when reading the file from the server.

    Peter: There has been a lot of work recently done to address out of
    order issues on a systemic level. However, the NFS client is
    still susceptible to the problem. Out of order WRITE
    requests can occur when pdflush is in the middle of writing
    out pages while the process dirtying the pages calls
    generic_file_buffered_write which calls
    generic_perform_write which calls
    balance_dirty_pages_rate_limited which ends up calling
    writeback_inodes which ends up calling back into the NFS
    client to write out dirty pages for the same file that
    pdflush happens to be working with.

    Signed-off-by: Peter Staubach
    [modification by Trond to merge the two similar patches]
    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Currently, filling struct nfs_fattr is more or less an all or nothing
    operation, since NFSv2 and NFSv3 have only mandatory attributes.
    In NFSv4, some attributes are optional, and so we may simply not be able to
    fill in those fields. Furthermore, NFSv4 allows you to specify which
    attributes you are interested in retrieving, thus permitting you to
    optimise away retrieval of attributes that you know will not change...
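
A minimal sketch of the idea above, assuming a per-field validity bitmask on the attribute structure. The type, flag, and function names are hypothetical and are not taken from the kernel source; the point is only that fields the server did not return are left untouched rather than overwritten.

```c
#include <stdint.h>

/* Hypothetical per-field validity flags; names are illustrative only. */
enum {
    FATTR_VALID_MODE  = 1u << 0,
    FATTR_VALID_UID   = 1u << 1,
    FATTR_VALID_SIZE  = 1u << 2,
    FATTR_VALID_MTIME = 1u << 3,
};

/* Attributes as decoded from a server reply (hypothetical layout). */
struct fattr_sketch {
    uint32_t valid;   /* which of the fields below were actually filled in */
    uint32_t mode;
    uint32_t uid;
    uint64_t size;
    int64_t  mtime;
};

/* Locally cached copy of the attributes (hypothetical layout). */
struct inode_sketch {
    uint32_t mode;
    uint32_t uid;
    uint64_t size;
    int64_t  mtime;
};

/*
 * Copy only the fields the server returned and leave the rest of the
 * cached inode alone. Returns the mask of fields that were applied.
 */
static uint32_t apply_fattr(struct inode_sketch *inode,
                            const struct fattr_sketch *fattr)
{
    if (fattr->valid & FATTR_VALID_MODE)
        inode->mode = fattr->mode;
    if (fattr->valid & FATTR_VALID_UID)
        inode->uid = fattr->uid;
    if (fattr->valid & FATTR_VALID_SIZE)
        inode->size = fattr->size;
    if (fattr->valid & FATTR_VALID_MTIME)
        inode->mtime = fattr->mtime;
    return fattr->valid;
}
```

With such a mask, a reply that carries only size and mtime can update those two fields without clobbering a cached mode or uid.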

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • If cached directory contents become incorrect, there is no way to
    flush the contents. This contrasts with files where file locking is
    the recommended way to ensure cache consistency between multiple
    applications (a read-lock always flushes the cache).

    Also while changes to files often change the size of the file (thus
    triggering a cache flush), changes to directories often do not change
    the apparent size (as the size is often rounded to a block size).

    So it is particularly important with directories to avoid the
    possibility of an incorrect cache wherever possible.

    When the link count on a directory changes it implies a change in the
    number of child directories, and so a change in the contents of this
    directory. So use that as a trigger to flush cached contents.

    When the ctime changes but the mtime does not, there are two possible
    reasons.
    1/ The owner/mode information has been changed.
    2/ utimes has been used to set the mtime backwards.

    In the first case, a data-cache flush is not required.
    In the second case it is.

    So on the basis that correctness trumps performance, flush the
    directory contents cache in this case also.
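
The two triggers described above can be sketched as a small predicate. This is an illustration under assumptions, not the patch itself: the struct and function names are hypothetical, and an mtime change is assumed to be handled by existing logic elsewhere, so it is deliberately not checked here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of the directory attributes relevant here. */
struct dir_attrs_sketch {
    uint32_t nlink;
    int64_t  ctime;
    int64_t  mtime;
};

/*
 * Decide whether cached directory contents should be flushed, using
 * only the two triggers from the commit message:
 *   - the link count changed (a child directory was added or removed);
 *   - ctime changed while mtime did not (either an owner/mode change,
 *     or utimes() moved mtime backwards; flush to be safe).
 */
static bool dir_cache_needs_flush(const struct dir_attrs_sketch *cached,
                                  const struct dir_attrs_sketch *fresh)
{
    if (fresh->nlink != cached->nlink)
        return true;
    if (fresh->ctime != cached->ctime && fresh->mtime == cached->mtime)
        return true;
    return false;
}
```

The second test flushes more often than strictly necessary, since an owner or mode change alone also matches it; that is the correctness-over-performance trade-off the message argues for.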

    Signed-off-by: NeilBrown
    Signed-off-by: Trond Myklebust

    NeilBrown
     
  • Remove redundant NFS_STALE() check, a leftover due to the commit
    691beb13cdc88358334ef0ba867c080a247a760f

    Signed-off-by: Suresh Jayaraman
    Signed-off-by: Trond Myklebust

    Suresh Jayaraman
     

24 Dec, 2008

2 commits

  • Hi.

    I've been looking at a bugzilla which describes a problem where
    a customer was advised to use either the "noac" or "actimeo=0"
    mount options to solve a consistency problem that they were
    seeing in the file attributes. It turned out that this solution
    did not work reliably for them because sometimes, the local
    attribute cache was believed to be valid and not timed out.
    (With an attribute cache timeout of 0, the cache should always
    appear to be timed out.)

    In looking at this situation, it appears to me that the problem
    is that the attribute cache timeout code has an off-by-one
    error in it. It is assuming that the cache is valid in the
    region, [read_cache_jiffies, read_cache_jiffies + attrtimeo]. The
    cache should be considered valid only in the region,
    [read_cache_jiffies, read_cache_jiffies + attrtimeo). With this
    change, the options, "noac" and "actimeo=0", work as originally
    expected.

    This problem was previously addressed by special casing the
    attrtimeo == 0 case. However, since the problem is only an
    off-by-one error, the cleaner solution is to address the
    off-by-one error and thus not require the special case.
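
The difference between the two intervals can be shown with a small userspace sketch. The helper names below are hypothetical and the tick counter merely stands in for jiffies; the comparisons use the usual signed-difference trick so that they stay correct when the counter wraps.

```c
#include <stdbool.h>

/* Wraparound-safe comparisons for a free-running tick counter. */
static bool tick_after_eq(unsigned long a, unsigned long b)
{
    return (long)(a - b) >= 0;
}

static bool tick_before(unsigned long a, unsigned long b)
{
    return (long)(a - b) < 0;
}

/* Closed interval [read_tick, read_tick + timeo]: the behaviour
 * described as the bug. With timeo == 0 the cache still looks valid
 * during the tick in which it was filled. */
static bool cache_valid_closed(unsigned long now, unsigned long read_tick,
                               unsigned long timeo)
{
    return tick_after_eq(now, read_tick) &&
           tick_after_eq(read_tick + timeo, now);
}

/* Half-open interval [read_tick, read_tick + timeo): with timeo == 0
 * the interval is empty, so the cache always appears timed out. */
static bool cache_valid_half_open(unsigned long now, unsigned long read_tick,
                                  unsigned long timeo)
{
    return tick_after_eq(now, read_tick) &&
           tick_before(now, read_tick + timeo);
}
```

For a cache filled at tick 100 with a timeout of 0, the closed form reports the cache as valid at tick 100 while the half-open form does not, which is why no special case for a zero timeout is needed.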

    Thanx...

    ps

    Signed-off-by: Peter Staubach
    Signed-off-by: Trond Myklebust

    Peter Staubach
     
  • Signed-off-by: Trond Myklebust

    Trond Myklebust
     

29 Oct, 2008

1 commit


27 Oct, 2008

1 commit


15 Oct, 2008

2 commits

  • The cache_change_attribute is used to decide whether or not a directory has
    changed, in which case we may need to look it up again. Again, the use of
    'jiffies' leads to an issue of resolution.

    Once again, the fix is to change nfs_inode->cache_change_attribute, and
    just make it a simple counter.
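
A rough sketch of why a counter avoids the resolution problem, using hypothetical types and names rather than the real nfs_inode fields: each directory change bumps a counter, and a cached lookup result remembers the counter value it was validated against. Two changes within the same timer tick still produce two distinct counter values, whereas two timestamps taken in the same tick would compare equal.

```c
#include <stdbool.h>

/* Hypothetical directory state: a change counter instead of a timestamp. */
struct dir_sketch {
    unsigned long change_attr;
};

/* Hypothetical cached lookup result tied to a directory. */
struct dentry_sketch {
    unsigned long verifier; /* directory counter value when last validated */
};

/* Record that the directory contents changed. */
static void dir_mark_changed(struct dir_sketch *dir)
{
    dir->change_attr++;
}

/* Remember which directory state this cached entry was validated against. */
static void dentry_set_verifier(struct dentry_sketch *d,
                                const struct dir_sketch *dir)
{
    d->verifier = dir->change_attr;
}

/* The entry must be looked up again if the directory changed since. */
static bool dentry_needs_lookup(const struct dentry_sketch *d,
                                const struct dir_sketch *dir)
{
    return d->verifier != dir->change_attr;
}
```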

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • It appears that 'jiffies' timestamps do not have high enough resolution for
    nfs_inode_attrs_need_update(). One problem is that a GETATTR can be
    launched within < 1 jiffy of the last operation that updated the attribute.
    Another problem is that RPC calls can take < 1 jiffy to execute.

    We can fix this by switching the variables to use a simple global counter
    that gets incremented every time we start another GETATTR call.
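
The ordering idea can be sketched in userspace as follows. All names are hypothetical, and the plain global counter is an assumption made for brevity; real concurrent code would need an atomic increment. Each attribute request samples the counter when it starts, and a reply is applied only if its sample is newer than the one the cached attributes came from.

```c
#include <stdbool.h>
#include <stdint.h>

/* Global generation counter (not thread-safe in this sketch). */
static unsigned long attr_generation;

/* Called when a new attribute request is started. */
static unsigned long attr_gencount_next(void)
{
    return ++attr_generation;
}

/* Hypothetical cached attributes. */
struct inode_attr_sketch {
    unsigned long attr_gencount; /* generation of the data we hold */
    uint64_t size;
};

/* Hypothetical decoded reply. */
struct fattr_reply_sketch {
    unsigned long gencount;      /* sampled when the request started */
    uint64_t size;
};

/* A reply is newer if its generation is after the cached one
 * (signed difference keeps this correct across counter wraparound). */
static bool attrs_need_update(const struct inode_attr_sketch *inode,
                              const struct fattr_reply_sketch *reply)
{
    return (long)(reply->gencount - inode->attr_gencount) > 0;
}

/* Apply the reply only if it is newer; returns true if it was applied. */
static bool apply_reply(struct inode_attr_sketch *inode,
                        const struct fattr_reply_sketch *reply)
{
    if (!attrs_need_update(inode, reply))
        return false;
    inode->size = reply->size;
    inode->attr_gencount = reply->gencount;
    return true;
}
```

If request A starts before request B but B's reply arrives first, A's late reply carries the older generation and is ignored, even when both requests started within the same timer tick.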

    Signed-off-by: Trond Myklebust

    Trond Myklebust