27 May, 2009

2 commits

  • If the asynchronous lease renewal fails (usually due to a soft timeout),
    then we _must_ schedule state recovery in order to ensure that we don't
    lose the lease unnecessarily or, if the lease is already lost, that we
    recover the locking state promptly...

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • fix build error with latest kbuild adjustments to initconst.

    The commit a447c0932445f92ce6f4c1bd020f62c5097a7842 ("vfs: Use
    const for kernel parser table") changed:

    static match_table_t __initdata tokens = {
    to
    static match_table_t __initconst tokens = {

    But the missing const causes powerpc to fail with the latest
    updates to __initconst like this:

    fs/nfs/nfsroot.c:400: error: __setup_str_nfs_root_setup causes a section type conflict
    fs/nfs/nfsroot.c:400: error: __setup_str_nfs_root_setup causes a section type conflict

    The bug is only present with kbuild-next.
    The following patch has been build-tested.

    Signed-off-by: Sam Ravnborg
    Cc: Steven Whitehouse
    Cc: Stephen Rothwell
    Acked-by: Jan Beulich
    Signed-off-by: Trond Myklebust

    Sam Ravnborg
     

19 May, 2009

1 commit

  • The problem is that permission checking is skipped if atomic open is
    possible, but when exec opens a file, it just opens it O_RDONLY, which
    means EXEC permission will not be checked at that time.

    This problem is observed by the following sequence (executed as root):

    mount -t nfs4 server:/ /mnt4
    echo "ls" >/mnt4/foo
    chmod 744 /mnt4/foo
    su guest -c "mnt4/foo"
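
    The underlying point, that a read-only open checks read permission but
    not execute permission, can be illustrated on a local file. The sketch
    below is not NFS-specific and does not reproduce the atomic-open path;
    the helper name and the temporary path are assumptions made for the
    example.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if a file with the given mode can be
 * opened O_RDONLY while an explicit execute check (access(..., X_OK)) is
 * refused, 0 if the execute check passes as well, and -1 on setup failure.
 */
static int readable_but_not_executable(mode_t mode)
{
    char path[] = "/tmp/exec_check_XXXXXX";
    int fd = mkstemp(path);
    int result = -1;

    if (fd < 0)
        return -1;
    close(fd);

    if (chmod(path, mode) == 0) {
        /* A read-only open only needs read permission... */
        fd = open(path, O_RDONLY);
        if (fd >= 0) {
            close(fd);
            /* ...so execute permission has to be checked separately. */
            result = (access(path, X_OK) != 0) ? 1 : 0;
        }
    }
    unlink(path);
    return result;
}
```

    Calling readable_but_not_executable(0644) shows the gap: the open
    succeeds even though nobody may execute the file.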

    Signed-off-by: Frank Filz
    Signed-off-by: Trond Myklebust
    Cc: stable@kernel.org
    Tested-by: Eugene Teo
    Signed-off-by: Linus Torvalds

    Frank Filz
     

09 May, 2009

2 commits


03 May, 2009

1 commit

  • Follow up to Nick Piggin's patches to ensure that nfs_vm_page_mkwrite
    returns with the page lock held, and sets the VM_FAULT_LOCKED flag.

    See http://bugzilla.kernel.org/show_bug.cgi?id=12913

    Signed-off-by: Trond Myklebust
    Signed-off-by: Linus Torvalds

    Trond Myklebust
     

21 Apr, 2009

1 commit

  • Commit ae46141ff08f1965b17c531b571953c39ce8b9e2 (NFSv3: Fix posix ACL code)
    introduces a bug in the calculation of the XDR header iovec. In the case
    where we are inlining the acls, we need to adjust the length of the iovec
    req->rq_svec, in addition to adjusting the total buffer length.

    Tested-by: Leonardo Chiquitto
    Tested-by: Suresh Jayaraman
    Signed-off-by: Trond Myklebust
    Cc: stable@kernel.org
    Signed-off-by: Linus Torvalds

    Trond Myklebust
     

08 Apr, 2009

1 commit


07 Apr, 2009

1 commit


03 Apr, 2009

17 commits

  • Add NFS mount options to allow the local caching support to be enabled.

    The attached patch makes it possible for the NFS filesystem to be told to make
    use of the network filesystem local caching service (FS-Cache).

    To be able to use this, a recent nfs-utils package is required.

    There are three variant NFS mount options that can be added to a mount command
    to control caching for a mount. Only the last one specified takes effect:

    (*) Adding "fsc" will request caching.

    (*) Adding "fsc=<uniquifier>" will request caching and also specify a uniquifier.

    (*) Adding "nofsc" will disable caching.

    For example:

    mount warthog:/ /a -o fsc

    The cache of a particular superblock (NFS FSID) will be shared between all
    mounts of that volume, provided they have the same connection parameters and
    are not marked 'nosharecache'.

    Where it is otherwise impossible to distinguish superblocks because all the
    parameters are identical, but the 'nosharecache' option is supplied, a
    uniquifying string must be supplied, else only the first mount will be
    permitted to use the cache.

    If there's a key collision, then the second mount will disable caching and
    emit a warning in the kernel log.
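
    The "only the last one specified takes effect" rule for the three
    caching options can be sketched in userspace as follows. This is an
    illustrative parser, not the kernel's mount-option code; the struct and
    function names are hypothetical, and resetting the uniquifier on a
    plain "fsc" or "nofsc" is an assumption of the sketch.

```c
#include <string.h>

/* Hypothetical result type; not the kernel's own data structure. */
struct fsc_choice {
    int enabled;            /* 1 if caching was requested */
    char uniquifier[64];    /* empty unless fsc=<text> was the last option */
};

/* Walk a comma-separated option string; later caching options override. */
static void parse_fsc_options(const char *opts, struct fsc_choice *out)
{
    const char *p = opts;

    out->enabled = 0;
    out->uniquifier[0] = '\0';

    while (*p) {
        size_t len = strcspn(p, ",");   /* length of the current option */

        if (len == 3 && strncmp(p, "fsc", 3) == 0) {
            out->enabled = 1;
            out->uniquifier[0] = '\0';
        } else if (len >= 4 && strncmp(p, "fsc=", 4) == 0) {
            size_t n = len - 4;

            if (n >= sizeof(out->uniquifier))
                n = sizeof(out->uniquifier) - 1;
            memcpy(out->uniquifier, p + 4, n);
            out->uniquifier[n] = '\0';
            out->enabled = 1;
        } else if (len == 5 && strncmp(p, "nofsc", 5) == 0) {
            out->enabled = 0;
            out->uniquifier[0] = '\0';
        }
        /* Any other option is ignored by this sketch. */

        p += len;
        if (*p == ',')
            p++;
    }
}
```

    With this sketch, "rw,fsc,nofsc" leaves caching disabled, while
    "nofsc,fsc=xyz" enables it with the uniquifier "xyz".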

    Signed-off-by: David Howells
    Acked-by: Steve Dickson
    Acked-by: Trond Myklebust
    Acked-by: Al Viro
    Tested-by: Daire Byrne

    David Howells
     
  • Display the local caching state in /proc/fs/nfsfs/volumes.

    Signed-off-by: David Howells
    Acked-by: Steve Dickson
    Acked-by: Trond Myklebust
    Acked-by: Al Viro
    Tested-by: Daire Byrne

    David Howells
     
  • Store pages from an NFS inode into the cache data storage object associated
    with that inode.

    Signed-off-by: David Howells
    Acked-by: Steve Dickson
    Acked-by: Trond Myklebust
    Acked-by: Al Viro
    Tested-by: Daire Byrne

    David Howells
     
  • Read pages from an FS-Cache data storage object representing an inode into an
    NFS inode.

    Signed-off-by: David Howells
    Acked-by: Steve Dickson
    Acked-by: Trond Myklebust
    Acked-by: Al Viro
    Tested-by: Daire Byrne

    David Howells
     
  • nfs_readpage_async() needs to be non-static so that it can be used as a
    fallback for the local on-disk caching should an EIO crop up when reading the
    cache.

    Signed-off-by: David Howells
    Acked-by: Steve Dickson
    Acked-by: Trond Myklebust
    Acked-by: Al Viro
    Tested-by: Daire Byrne

    David Howells
     
  • Add read context retention so that FS-Cache can call back into NFS when a read
    operation on the cache fails with EIO rather than returning data. This permits
    NFS to then fetch the data from the server instead, using the appropriate
    security context.

    Signed-off-by: David Howells
    Acked-by: Steve Dickson
    Acked-by: Trond Myklebust
    Acked-by: Al Viro
    Tested-by: Daire Byrne

    David Howells
     
  • FS-Cache page management for NFS. This includes hooking the releasing and
    invalidation of pages marked with PG_fscache (aka PG_private_2) and waiting for
    completion of the write-to-cache flag (PG_fscache_write aka PG_owner_priv_2).

    Signed-off-by: David Howells
    Acked-by: Steve Dickson
    Acked-by: Trond Myklebust
    Acked-by: Al Viro
    Tested-by: Daire Byrne

    David Howells
     
  • Add some new NFS I/O counters for FS-Cache doing things for NFS. If caching
    is enabled, a new line beginning with "fsc:" is emitted into
    /proc/pid/mountstats. It carries five counters, in this order:

    (1) The number of pages read successfully from the cache.

    (2) The number of failed page reads against the cache.

    (3) The number of successful page writes to the cache.

    (4) The number of failed page writes to the cache.

    (5) The number of NFS pages that have been disconnected from the cache.
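
    A monitoring tool could pick the five counters described above out of
    the "fsc:" line roughly as follows. The field names are hypothetical
    labels chosen for this sketch, and the exact whitespace layout of the
    line is an assumption; sscanf() tolerates any amount of whitespace
    between the fields.

```c
#include <stdio.h>

/* Hypothetical field names; the order follows the description above. */
struct fsc_counters {
    unsigned long long read_ok;     /* pages read successfully from cache */
    unsigned long long read_fail;   /* failed page reads against cache */
    unsigned long long write_ok;    /* successful page writes to cache */
    unsigned long long write_fail;  /* failed page writes to cache */
    unsigned long long uncached;    /* pages disconnected from cache */
};

/* Returns 1 if `line` is an "fsc:" line carrying five counters, else 0. */
static int parse_fsc_line(const char *line, struct fsc_counters *c)
{
    int n = sscanf(line, " fsc: %llu %llu %llu %llu %llu",
                   &c->read_ok, &c->read_fail,
                   &c->write_ok, &c->write_fail, &c->uncached);

    return n == 5;
}
```

    Lines that do not start with "fsc:" are rejected, so the function can
    be applied to every line of a mountstats dump.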

    Signed-off-by: David Howells
    Acked-by: Steve Dickson
    Acked-by: Trond Myklebust
    Acked-by: Al Viro
    Tested-by: Daire Byrne

    David Howells
     
  • Invalidate the FS-Cache page flags on the pages belonging to an inode when the
    cache backing that NFS inode is removed.

    This allows a live cache to be withdrawn.

    Signed-off-by: David Howells
    Acked-by: Steve Dickson
    Acked-by: Trond Myklebust
    Acked-by: Al Viro
    Tested-by: Daire Byrne

    David Howells
     
  • Bind data storage objects in the local cache to NFS inodes.

    Signed-off-by: David Howells
    Acked-by: Steve Dickson
    Acked-by: Trond Myklebust
    Acked-by: Al Viro
    Tested-by: Daire Byrne

    David Howells
     
  • Define and create inode-level cache data storage objects (as managed by
    nfs_inode structs).

    Each inode-level object is created in a superblock-level index object and is
    itself a data storage object into which pages from the inode are stored.

    The inode object key is the NFS file handle for the inode.

    The inode object is given coherency data to carry in the auxiliary data
    permitted by the cache. This is a sequence made up of:

    (1) i_mtime from the NFS inode.

    (2) i_ctime from the NFS inode.

    (3) i_size from the NFS inode.

    (4) change_attr from the NFSv4 attribute data.

    As the cache is a persistent cache, the auxiliary data is checked when a new
    NFS in-memory inode is set up that matches an already existing data storage
    object in the cache. If the coherency data is the same, the on-disk object is
    retained and used; if not, it is scrapped and a new one created.
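
    The coherency check described above amounts to comparing the four
    stored values with the live ones. A minimal userspace sketch, assuming
    hypothetical type and function names (the kernel's own structures are
    not reproduced here):

```c
#include <time.h>

/* Hypothetical mirror of the auxiliary data sequence listed above. */
struct inode_aux {
    struct timespec mtime;          /* (1) i_mtime */
    struct timespec ctime;          /* (2) i_ctime */
    long long size;                 /* (3) i_size */
    unsigned long long change_attr; /* (4) NFSv4 change attribute */
};

enum aux_verdict {
    AUX_OKAY,       /* keep and use the on-disk object */
    AUX_OBSOLETE    /* scrap it and create a new one */
};

static int timespec_same(const struct timespec *a, const struct timespec *b)
{
    return a->tv_sec == b->tv_sec && a->tv_nsec == b->tv_nsec;
}

/* Compare field by field so struct padding never affects the result. */
static enum aux_verdict check_aux(const struct inode_aux *cached,
                                  const struct inode_aux *live)
{
    if (timespec_same(&cached->mtime, &live->mtime) &&
        timespec_same(&cached->ctime, &live->ctime) &&
        cached->size == live->size &&
        cached->change_attr == live->change_attr)
        return AUX_OKAY;
    return AUX_OBSOLETE;
}
```

    Any single difference, such as a changed file size, is enough to mark
    the cached object obsolete.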

    Signed-off-by: David Howells
    Acked-by: Steve Dickson
    Acked-by: Trond Myklebust
    Acked-by: Al Viro
    Tested-by: Daire Byrne

    David Howells
     
  • Define and create superblock-level cache index objects (as managed by
    nfs_server structs).

    Each superblock object is created in a server level index object and is itself
    an index into which inode-level objects are inserted.

    Ideally there would be one superblock-level object per server, and the former
    would be folded into the latter; however, since the "nosharecache" option
    exists this isn't possible.

    The superblock object key is a sequence consisting of:

    (1) Certain superblock s_flags.

    (2) Various connection parameters that serve to distinguish superblocks for
    sget().

    (3) The volume FSID.

    (4) The security flavour.

    (5) The uniquifier length.

    (6) The uniquifier text. This is normally an empty string, unless the fsc=xyz
    mount option was used to explicitly specify a uniquifier.

    The key blob is of variable length, depending on the length of (6).

    The superblock object is given no coherency data to carry in the auxiliary data
    permitted by the cache. It is assumed that the superblock is always coherent.

    This patch also adds uniquification handling such that two otherwise identical
    superblocks, at least one of which is marked "nosharecache", won't end up
    trying to share the on-disk cache. It will be possible to manually provide a
    uniquifier through a mount option with a later patch to avoid the error
    otherwise produced.
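
    How the uniquifier makes two otherwise identical keys distinct can be
    sketched as below. Items (1) to (4) of the key are treated as one
    opaque "fixed" blob supplied by the caller; the function name, the
    16-bit length field and the buffer handling are assumptions of this
    sketch, not the kernel's layout.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical key builder: copies the fixed part, then appends the
 * uniquifier length (5) and the uniquifier text (6).
 * Returns the total key length, or 0 if the key does not fit.
 */
static size_t build_sb_key(const void *fixed, size_t fixed_len,
                           const char *uniquifier,
                           unsigned char *out, size_t out_size)
{
    size_t ulen = strlen(uniquifier);
    uint16_t ulen16;
    size_t total;

    if (ulen > UINT16_MAX)
        return 0;
    total = fixed_len + sizeof(ulen16) + ulen;
    if (total > out_size)
        return 0;

    ulen16 = (uint16_t)ulen;
    memcpy(out, fixed, fixed_len);
    memcpy(out + fixed_len, &ulen16, sizeof(ulen16));
    memcpy(out + fixed_len + sizeof(ulen16), uniquifier, ulen);
    return total;
}
```

    Two mounts with the same fixed part and an empty uniquifier produce
    byte-identical keys, which is the collision case; giving one of them a
    non-empty uniquifier changes both the length and the contents.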

    Signed-off-by: David Howells
    Acked-by: Steve Dickson
    Acked-by: Trond Myklebust
    Acked-by: Al Viro
    Tested-by: Daire Byrne

    David Howells
     
  • Define and create server-level cache index objects (as managed by nfs_client
    structs).

    Each server object is created in the NFS top-level index object and is itself
    an index into which superblock-level objects are inserted.

    Ideally there would be one superblock-level object per server, and the former
    would be folded into the latter; however, since the "nosharecache" option
    exists this isn't possible.

    The server object key is a sequence consisting of:

    (1) NFS version

    (2) Server address family (eg: AF_INET or AF_INET6)

    (3) Server port.

    (4) Server IP address.

    The key blob is of variable length, depending on the length of (4).

    The server object is given no coherency data to carry in the auxiliary data
    permitted by the cache.
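
    The variable-length server key can be sketched in userspace like this:
    a fixed header (version, address family, port) followed by 4 or 16
    address bytes depending on the family. The function name and the use
    of 16-bit header fields are assumptions of the sketch.

```c
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical key builder following the sequence listed above.
 * Returns the key length, or 0 for an unsupported family or short buffer.
 */
static size_t build_server_key(uint16_t nfs_version,
                               const struct sockaddr *sa,
                               unsigned char *out, size_t out_size)
{
    uint16_t hdr[3];
    const void *addr;
    size_t addr_len;

    switch (sa->sa_family) {
    case AF_INET: {
        const struct sockaddr_in *sin = (const struct sockaddr_in *)sa;

        hdr[2] = sin->sin_port;             /* (3) server port */
        addr = &sin->sin_addr;              /* (4) 4-byte address */
        addr_len = sizeof(sin->sin_addr);
        break;
    }
    case AF_INET6: {
        const struct sockaddr_in6 *sin6 = (const struct sockaddr_in6 *)sa;

        hdr[2] = sin6->sin6_port;           /* (3) server port */
        addr = &sin6->sin6_addr;            /* (4) 16-byte address */
        addr_len = sizeof(sin6->sin6_addr);
        break;
    }
    default:
        return 0;
    }

    hdr[0] = nfs_version;                   /* (1) NFS version */
    hdr[1] = (uint16_t)sa->sa_family;       /* (2) address family */

    if (sizeof(hdr) + addr_len > out_size)
        return 0;
    memcpy(out, hdr, sizeof(hdr));
    memcpy(out + sizeof(hdr), addr, addr_len);
    return sizeof(hdr) + addr_len;
}
```

    In this sketch an AF_INET server yields a 10-byte key and an AF_INET6
    server a 22-byte key, which is the length dependence on (4).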

    Signed-off-by: David Howells
    Acked-by: Steve Dickson
    Acked-by: Trond Myklebust
    Acked-by: Al Viro
    Tested-by: Daire Byrne

    David Howells
     
  • Register NFS for caching and retrieve the top-level cache index object cookie.

    Signed-off-by: David Howells
    Acked-by: Steve Dickson
    Acked-by: Trond Myklebust
    Acked-by: Al Viro
    Tested-by: Daire Byrne

    David Howells
     
  • Permit local filesystem caching to be enabled for NFS in the kernel
    configuration.

    Signed-off-by: David Howells
    Acked-by: Steve Dickson
    Acked-by: Trond Myklebust
    Acked-by: Al Viro
    Tested-by: Daire Byrne

    David Howells
     
  • Add comment banners to some NFS functions so that they can be modified by the
    NFS fscache patches for further information.

    Signed-off-by: David Howells
    Acked-by: Steve Dickson
    Acked-by: Trond Myklebust
    Acked-by: Al Viro
    Tested-by: Daire Byrne

    David Howells
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
    Remove two unneeded exports and make two symbols static in fs/mpage.c
    Cleanup after commit 585d3bc06f4ca57f975a5a1f698f65a45ea66225
    Trim includes of fdtable.h
    Don't crap into descriptor table in binfmt_som
    Trim includes in binfmt_elf
    Don't mess with descriptor table in load_elf_binary()
    Get rid of indirect include of fs_struct.h
    New helper - current_umask()
    check_unsafe_exec() doesn't care about signal handlers sharing
    New locking/refcounting for fs_struct
    Take fs_struct handling to new file (fs/fs_struct.c)
    Get rid of bumping fs_struct refcount in pivot_root(2)
    Kill unsharing fs_struct in __set_personality()

    Linus Torvalds
     

02 Apr, 2009

1 commit


01 Apr, 2009

2 commits

  • Change the page_mkwrite prototype to take a struct vm_fault, and return
    VM_FAULT_xxx flags. There should be no functional change.

    This makes it possible to return much more detailed error information to
    the VM (and also can provide more information eg. virtual_address to the
    driver, which might be important in some special cases).

    This is required for a subsequent fix, and will also make it easier to
    merge page_mkwrite() with fault() in future.
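
    The shape of the change can be sketched in userspace with stand-in
    types. Everything prefixed "sketch_" or "SKETCH_" below is a
    hypothetical placeholder, not the kernel's definition, and the
    errno-to-flag mapping is only one plausible way a converted handler
    might translate an old-style return value.

```c
#include <errno.h>

/* Stand-in flag values for illustration only. */
#define SKETCH_VM_FAULT_OOM    0x0001
#define SKETCH_VM_FAULT_SIGBUS 0x0002

/* Stand-in for the fault descriptor passed to the new-style handler. */
struct sketch_vm_fault {
    unsigned long pgoff;
    void *virtual_address;
    void *page;
};

/* Old-style handler shape: returns 0 or a negative errno. */
typedef int (*sketch_old_mkwrite_fn)(void *vma, void *page);

/* Translate an old-style return value into fault flags. */
static int sketch_errno_to_vm_fault(int err)
{
    if (err == 0)
        return 0;
    if (err == -ENOMEM)
        return SKETCH_VM_FAULT_OOM;
    return SKETCH_VM_FAULT_SIGBUS;
}

/* New-style handler shape: takes the fault descriptor, returns flags. */
static int sketch_new_mkwrite(void *vma, struct sketch_vm_fault *vmf,
                              sketch_old_mkwrite_fn legacy)
{
    return sketch_errno_to_vm_fault(legacy(vma, vmf->page));
}

/* Sample legacy handler used to exercise the wrapper. */
static int sketch_legacy_eio(void *vma, void *page)
{
    (void)vma;
    (void)page;
    return -EIO;
}
```

    Wrapping sketch_legacy_eio() this way turns its -EIO into the SIGBUS
    flag, which is the kind of richer result the new prototype allows.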

    Signed-off-by: Nick Piggin
    Cc: Chris Mason
    Cc: Trond Myklebust
    Cc: Miklos Szeredi
    Cc: Steven Whitehouse
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Artem Bityutskiy
    Cc: Felix Blyakher
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     
  • Reading current->fs->umask is what most fs_struct users do.
    Put that into a helper function.
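
    The idea of the helper can be sketched in userspace with mock
    structures. The "sketch_" types and the global task pointer below are
    hypothetical stand-ins for the kernel's task and fs_struct; only the
    access pattern is the point.

```c
/* Mock stand-ins for the kernel structures involved. */
struct sketch_fs_struct {
    int umask;
};

struct sketch_task {
    struct sketch_fs_struct *fs;
};

static struct sketch_fs_struct sketch_fs = { 022 };
static struct sketch_task sketch_task0 = { &sketch_fs };
static struct sketch_task *sketch_current = &sketch_task0;

/* The helper hides the pointer chain from every caller. */
static int sketch_current_umask(void)
{
    return sketch_current->fs->umask;
}

/* Typical caller: mask off the bits the umask forbids. */
static int sketch_apply_umask(int mode)
{
    return mode & ~sketch_current_umask();
}
```

    Callers then write sketch_apply_umask(0666) instead of spelling out
    the structure walk themselves.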

    Signed-off-by: Al Viro

    Al Viro
     

31 Mar, 2009

1 commit

  • Setting ->owner as done currently (pde->owner = THIS_MODULE) is racy,
    as correctly noted in bug #12454. Someone can look up an entry with a NULL
    ->owner, thus not pinning anything, and release it later, resulting
    in module refcount underflow.
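
    The ordering problem can be replayed sequentially in a small userspace
    model. The "sketch_" types and functions are hypothetical stand-ins
    for the proc entry and module refcounting; a real race involves two
    concurrent contexts, which this model only simulates step by step.

```c
#include <stddef.h>

struct sketch_module {
    int refcount;
};

struct sketch_pde {
    struct sketch_module *owner;
};

/* Pin the owning module, if one is recorded at lookup time. */
static void sketch_lookup(struct sketch_pde *pde)
{
    if (pde->owner)
        pde->owner->refcount++;
}

/* Unpin the owning module, if one is recorded at release time. */
static void sketch_release(struct sketch_pde *pde)
{
    if (pde->owner)
        pde->owner->refcount--;
}

/* Replays the problematic ordering; returns the final refcount. */
static int sketch_replay_owner_race(void)
{
    struct sketch_module mod = { 0 };
    struct sketch_pde pde = { NULL };   /* registered, ->owner not yet set */

    sketch_lookup(&pde);    /* lookup sees NULL ->owner: nothing pinned */
    pde.owner = &mod;       /* the late "pde->owner = THIS_MODULE" */
    sketch_release(&pde);   /* drops a reference that was never taken */
    return mod.refcount;
}
```

    The model ends with a refcount below zero, which is the underflow the
    commit message describes.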

    We can keep ->owner and supply it at registration time like ->proc_fops
    and ->data.

    But this leaves ->owner as an easily manipulated field (just one C assignment),
    and somebody will forget to unpin the previous module and pin the current one
    when switching ->owner. ->proc_fops is declared as "const", which should give
    one pause.

    ->read_proc/->write_proc were just fixed to not require ->owner for
    protection.

    rmmod'ed directories will be empty and return "." and ".." -- no harm.
    And directories with tricky enough readdir and lookup shouldn't be modular.
    We definitely don't want such modular code.

    Removing ->owner will also make PDE smaller.

    So, let's nuke it.

    Kudos to Jeff Layton for reminding about this, let's say, oversight.

    http://bugzilla.kernel.org/show_bug.cgi?id=12454

    Signed-off-by: Alexey Dobriyan

    Alexey Dobriyan
     

29 Mar, 2009

6 commits


28 Mar, 2009

1 commit


20 Mar, 2009

3 commits