29 Oct, 2008

1 commit


27 Oct, 2008

1 commit


23 Oct, 2008

3 commits

  • For execute permission on regular files we need to check whether the file
    has any execute bits at all, regardless of capabilities.

    This check is normally performed by generic_permission() but was also
    added to the case when the filesystem defines its own ->permission()
    method. In the latter case the filesystem should be responsible for
    performing this check.

    Move the check from inode_permission() inside filesystems which are
    not calling generic_permission().

    Create a helper function execute_ok() that returns true if the inode
    is a directory or if any execute bits are present in i_mode.

    Also fix up the following code:

    - coda control file is never executable
    - sysctl files are never executable
    - hfs_permission seems broken on MAY_EXEC, remove
    - hfsplus_permission is equivalent to generic_permission(), remove

    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     
  • Switch all users of d_alloc_anon to d_obtain_alias.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     
  • New flag: LOOKUP_EXCL. Set before doing the final step of pathname
    resolution on the paths that have LOOKUP_CREATE and O_EXCL.

    Signed-off-by: Al Viro

    Al Viro
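
    A rough sketch of the rule stated above. The SKETCH_* flag values are
    made up for illustration and do not match the kernel's definitions;
    only O_EXCL comes from a real header:

```c
#include <fcntl.h>

/* Hypothetical flag values chosen for this sketch only. */
#define SKETCH_LOOKUP_CREATE 0x1u
#define SKETCH_LOOKUP_EXCL   0x2u

/* Compute the lookup flags for the final step of pathname resolution:
 * add the exclusive flag only when creation was requested and the
 * caller passed O_EXCL. */
static unsigned int final_step_flags(unsigned int lookup_flags, int open_flags)
{
    if ((lookup_flags & SKETCH_LOOKUP_CREATE) && (open_flags & O_EXCL))
        lookup_flags |= SKETCH_LOOKUP_EXCL;
    return lookup_flags;
}
```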
     

21 Oct, 2008

1 commit


20 Oct, 2008

1 commit

  • Split the LRU lists in two, one set for pages that are backed by real file
    systems ("file") and one for pages that are backed by memory and swap
    ("anon"). The latter includes tmpfs.

    The advantage of doing this is that the VM will not have to scan over lots
    of anonymous pages (which we generally do not want to swap out), just to
    find the page cache pages that it should evict.

    This patch has the infrastructure and a basic policy to balance how much
    we scan the anon lists and how much we scan the file lists. The big
    policy changes are in separate patches.

    [lee.schermerhorn@hp.com: collect lru meminfo statistics from correct offset]
    [kosaki.motohiro@jp.fujitsu.com: prevent incorrect oom under split_lru]
    [kosaki.motohiro@jp.fujitsu.com: fix pagevec_move_tail() doesn't treat unevictable page]
    [hugh@veritas.com: memcg swapbacked pages active]
    [hugh@veritas.com: splitlru: BDI_CAP_SWAP_BACKED]
    [akpm@linux-foundation.org: fix /proc/vmstat units]
    [nishimura@mxp.nes.nec.co.jp: memcg: fix handling of shmem migration]
    [kosaki.motohiro@jp.fujitsu.com: adjust Quicklists field of /proc/meminfo]
    [kosaki.motohiro@jp.fujitsu.com: fix style issue of get_scan_ratio()]
    Signed-off-by: Rik van Riel
    Signed-off-by: Lee Schermerhorn
    Signed-off-by: KOSAKI Motohiro
    Signed-off-by: Hugh Dickins
    Signed-off-by: Daisuke Nishimura
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rik van Riel
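
    The classification described above can be pictured as follows. This is
    a simplified sketch: struct page_sketch and its swap_backed field are
    stand-ins invented here, not the kernel's struct page:

```c
#include <stdbool.h>

/* The two sets of LRU lists named in the commit message. */
enum lru_set { LRU_SET_ANON, LRU_SET_FILE };

/* Hypothetical page descriptor: swap_backed is true for anonymous
 * memory and for tmpfs pages, false for pages backed by a real file. */
struct page_sketch {
    bool swap_backed;
};

/* Pick the set of lists a page belongs on. */
static enum lru_set page_lru_set(const struct page_sketch *page)
{
    return page->swap_backed ? LRU_SET_ANON : LRU_SET_FILE;
}
```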
     

18 Oct, 2008

3 commits


16 Oct, 2008

1 commit


15 Oct, 2008

4 commits

  • The cache_change_attribute is used to decide whether or not a directory has
    changed, in which case we may need to look it up again. Again, the use of
    'jiffies' leads to an issue of resolution.

    Once again, the fix is to change nfs_inode->cache_change_attribute, and
    just make it a simple counter.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • It appears that 'jiffies' timestamps do not have high enough resolution for
    nfs_inode_attrs_need_update(). One problem is that a GETATTR can be
    launched within < 1 jiffy of the last operation that updated the attribute.
    Another problem is that RPC calls can take < 1 jiffy to execute.

    We can fix this by switching the variables to use a simple global counter
    that gets incremented every time we start another GETATTR call.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
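
    A small sketch of the counter idea, assuming stamps are compared with a
    wraparound-safe test in the style of time_after(). The names
    attr_generation, next_attr_generation() and generation_after() are
    hypothetical:

```c
#include <stdbool.h>

/* Hypothetical global counter; every new GETATTR-style call takes the
 * next value, so two calls never share a stamp. */
static unsigned long attr_generation;

static unsigned long next_attr_generation(void)
{
    return ++attr_generation;
}

/* True if stamp a was taken after stamp b, even across wraparound. */
static bool generation_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}
```

    Unlike a jiffies timestamp, two calls started within the same tick
    still receive distinct, ordered stamps.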
     
  • Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • * 'for-2.6.28' of git://linux-nfs.org/~bfields/linux: (59 commits)
    svcrdma: Fix IRD/ORD polarity
    svcrdma: Update svc_rdma_send_error to use DMA LKEY
    svcrdma: Modify the RPC reply path to use FRMR when available
    svcrdma: Modify the RPC recv path to use FRMR when available
    svcrdma: Add support to svc_rdma_send to handle chained WR
    svcrdma: Modify post recv path to use local dma key
    svcrdma: Add a service to register a Fast Reg MR with the device
    svcrdma: Query device for Fast Reg support during connection setup
    svcrdma: Add FRMR get/put services
    NLM: Remove unused argument from svc_addsock() function
    NLM: Remove "proto" argument from lockd_up()
    NLM: Always start both UDP and TCP listeners
    lockd: Remove unused fields in the nlm_reboot structure
    lockd: Add helper to sanity check incoming NOTIFY requests
    lockd: change nlmclnt_grant() to take a "struct sockaddr *"
    lockd: Adjust nlmsvc_lookup_host() to accomodate AF_INET6 addresses
    lockd: Adjust nlmclnt_lookup_host() signature to accomodate non-AF_INET
    lockd: Support non-AF_INET addresses in nlm_lookup_host()
    NLM: Convert nlm_lookup_host() to use a single argument
    svcrdma: Add Fast Reg MR Data Types
    ...

    Linus Torvalds
     

14 Oct, 2008

1 commit

  • This is a much better version of a previous patch to make the parser
    tables constant. Rather than changing the typedef, we put the "const" in
    all the various places where it's required, allowing the __initconst
    exception for nfsroot which was the cause of the previous trouble.

    This was posted for review some time ago and I believe it's been in -mm
    since then.

    Signed-off-by: Steven Whitehouse
    Cc: Alexander Viro
    Signed-off-by: Linus Torvalds

    Steven Whitehouse
     

11 Oct, 2008

2 commits

  • Bruce observed that nfs_parse_ip_address() will successfully parse an
    IPv6 address that looks like this:

    "::1%"

    A scope delimiter is present, but there is no scope ID following it.
    This is harmless, as it would simply set the scope ID to zero. However,
    in some cases we would like to flag this as an improperly formed
    address.

    We are now also careful to reject addresses where garbage follows the
    address (up to the length of the string), instead of ignoring the
    non-address characters; and where the scope ID is nonsense (not a valid
    device name, but also not numeric). Before, both of these cases would
    result in a harmless zero scope ID.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Trond Myklebust

    Chuck Lever
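
    The stricter rules above could look roughly like this in userspace.
    This is a sketch only: parse_ipv6_scoped() is a made-up name, and it
    relies on inet_pton() and if_nametoindex() rather than on the kernel's
    own parser:

```c
#include <arpa/inet.h>
#include <ctype.h>
#include <limits.h>
#include <net/if.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Parse "addr" or "addr%scope". Rejects an empty scope ID, garbage after
 * the address, and a scope that is neither a device name nor a number. */
static bool parse_ipv6_scoped(const char *str, struct in6_addr *addr,
                              unsigned int *scope_id)
{
    char buf[INET6_ADDRSTRLEN];
    const char *delim = strchr(str, '%');
    size_t addr_len = delim ? (size_t)(delim - str) : strlen(str);
    const char *scope;
    char *end;
    unsigned long value;
    unsigned int ifindex;

    if (addr_len == 0 || addr_len >= sizeof(buf))
        return false;
    memcpy(buf, str, addr_len);
    buf[addr_len] = '\0';

    /* inet_pton() fails if anything but a valid address is present. */
    if (inet_pton(AF_INET6, buf, addr) != 1)
        return false;

    *scope_id = 0;
    if (delim == NULL)
        return true;

    scope = delim + 1;
    if (*scope == '\0')
        return false;               /* "::1%": delimiter, no scope ID */

    ifindex = if_nametoindex(scope);
    if (ifindex != 0) {
        *scope_id = ifindex;        /* scope given as a device name */
        return true;
    }

    if (!isdigit((unsigned char)*scope))
        return false;               /* not a device, not numeric */
    value = strtoul(scope, &end, 10);
    if (*end != '\0' || value > UINT_MAX)
        return false;               /* trailing garbage or overflow */
    *scope_id = (unsigned int)value;
    return true;
}
```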
     
  • Signed-off-by: "J. Bruce Fields"
    Signed-off-by: Trond Myklebust

    J. Bruce Fields
     

10 Oct, 2008

1 commit

  • This fixes a regression seen when running the Connectathon testsuite
    against an ext3 filesystem. The reason was that the inode was constantly
    being marked as 'just updated' by the jiffy wraparound test.
    This again meant that newer GETATTR calls were failing to pass the
    nfs_inode_attrs_need_update() test unless the changes caused a ctime update
    on the server, since they were perceived as having been started before the
    latest inode update.

    Given that nfs_inode_attrs_need_update() already checks for wraparound
    of nfsi->last_updated, we can drop the buggy "protection" in
    nfs_update_inode().

    Also make a slight micro-optimisation of nfs_inode_attrs_need_update(): we
    are more often going to see time_after(fattr->time_start, nfsi->last_updated)
    be true, rather than seeing an update of ctime/size, so put that test
    first to ensure that we optimise away the ctime/size tests.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     

08 Oct, 2008

21 commits

  • It is more efficient to write linearly starting from the beginning of the
    file.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • This patch fixes a regression that was introduced by the string based mounts.

    nfs_mount() statically returns -EACCES for every error returned
    by the remote mountd. This is incorrect because -EACCES is
    a non-fatal error to the mount.nfs command. This error causes
    mount.nfs to retry the mount even in the case when the exported
    directory does not exist.

    This patch maps the errors returned by the remote mountd into
    valid errno values, exactly how it was done pre-string based
    mounts. Returning the correct errno enables mount.nfs
    to do the right thing.

    Signed-off-by: Steve Dickson
    [Trond.Myklebust@netapp.com: nfs_stat_to_errno() now correctly returns
    negative errors, so remove the sign change.]
    Signed-off-by: Trond Myklebust

    Steve Dickson
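
    As an illustration of mapping mountd status codes to errno values, here
    is a table-driven sketch. The status numbers are the MOUNT version 3
    codes from RFC 1813; the table, the function name, the choice of
    -EOPNOTSUPP and -EIO for the last two codes, and the -EIO fallback are
    assumptions of this sketch, not the mapping used by the patch:

```c
#include <errno.h>
#include <stddef.h>

/* MOUNT version 3 status codes as listed in RFC 1813. */
enum {
    MNT3_OK = 0,
    MNT3ERR_PERM = 1,
    MNT3ERR_NOENT = 2,
    MNT3ERR_IO = 5,
    MNT3ERR_ACCES = 13,
    MNT3ERR_NOTDIR = 20,
    MNT3ERR_INVAL = 22,
    MNT3ERR_NAMETOOLONG = 63,
    MNT3ERR_NOTSUPP = 10004,
    MNT3ERR_SERVERFAULT = 10006
};

struct mnt_err_map {
    int status;
    int err;
};

/* Hypothetical mapping table; one negative errno per status code. */
static const struct mnt_err_map mnt_err_table[] = {
    { MNT3_OK, 0 },
    { MNT3ERR_PERM, -EPERM },
    { MNT3ERR_NOENT, -ENOENT },
    { MNT3ERR_IO, -EIO },
    { MNT3ERR_ACCES, -EACCES },
    { MNT3ERR_NOTDIR, -ENOTDIR },
    { MNT3ERR_INVAL, -EINVAL },
    { MNT3ERR_NAMETOOLONG, -ENAMETOOLONG },
    { MNT3ERR_NOTSUPP, -EOPNOTSUPP },
    { MNT3ERR_SERVERFAULT, -EIO },
};

/* Translate a mountd status into a negative errno; unknown codes fall
 * back to -EIO in this sketch. */
static int mnt_status_to_errno(int status)
{
    size_t i;

    for (i = 0; i < sizeof(mnt_err_table) / sizeof(mnt_err_table[0]); i++)
        if (mnt_err_table[i].status == status)
            return mnt_err_table[i].err;
    return -EIO;
}
```

    With a mapping like this, a missing export surfaces as -ENOENT, which
    mount.nfs can treat as fatal instead of retrying.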
     
  • The code incorrectly assumes here that the server name (or ip address)
    is null-terminated. This can cause referrals to fail in some cases.

    Also support ipv6 addresses.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Trond Myklebust

    J. Bruce Fields
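
    The underlying pattern, turning a length-counted name into a
    NUL-terminated string before handing it to string functions, can be
    sketched like this. copy_counted_string() is a hypothetical helper, not
    the function touched by the patch:

```c
#include <stdbool.h>
#include <string.h>

/* Copy src_len bytes of a name that carries no terminator into dst and
 * terminate it. Fails if dst cannot hold the name plus the NUL byte. */
static bool copy_counted_string(char *dst, size_t dst_size,
                                const char *src, size_t src_len)
{
    if (src_len >= dst_size)
        return false;
    memcpy(dst, src, src_len);
    dst[src_len] = '\0';
    return true;
}
```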
     
  • We plan to use this function elsewhere.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Trond Myklebust

    J. Bruce Fields
     
  • Whoever wrote this had a bizarre allergy to for loops.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Trond Myklebust

    J. Bruce Fields
     
  • This function is a little longer and more deeply nested than necessary.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Trond Myklebust

    J. Bruce Fields
     
  • Allow mount to do authenticated mounts below the root of the exported tree.
    The wording in RFC 2623, sec 2.3.2. allows fsinfo with UNIX authentication
    on the root of the export. Mounts are not always done on the root
    of the exported tree. Automounts in particular often mount below the root
    of the exported tree.
    Some server implementations (justly) require full authentication for the
    so-called deep mounts. The old code used AUTH_SYS only. This caused deep
    mounts to fail on systems requiring stronger authentication.
    The client should try both authentication types and use the first one that
    succeeds.
    This method was already partially implemented. This patch completes
    the implementation for NFS2 and NFS3.
    This patch was developed to allow Debian systems to automount home directories
    on Solaris servers with krb5 authentication.

    Tested on kernel 2.6.24-etchnhalf.1

    Signed-off-by: E.G. Keizer
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Trond Myklebust

    EG Keizer
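
    The "try each type and use the first that succeeds" approach can be
    sketched as a simple loop. The flavor numbers are the standard RPC
    values for AUTH_SYS and RPCSEC_GSS; the probe callback, the strict
    server stand-in, and all function names are hypothetical:

```c
#include <stddef.h>

#define FLAVOR_AUTH_SYS   1   /* RPC AUTH_SYS (AUTH_UNIX) */
#define FLAVOR_RPCSEC_GSS 6   /* RPC RPCSEC_GSS, e.g. for krb5 */

/* Callback that attempts the call with one flavor; 0 means success. */
typedef int (*auth_probe_fn)(int flavor, void *ctx);

/* Return the first flavor the probe accepts, or -1 if none works. */
static int pick_auth_flavor(const int *flavors, size_t count,
                            auth_probe_fn probe, void *ctx)
{
    size_t i;

    for (i = 0; i < count; i++)
        if (probe(flavors[i], ctx) == 0)
            return flavors[i];
    return -1;
}

/* Stand-in for a server that accepts exactly one flavor (*ctx). */
static int strict_server_probe(int flavor, void *ctx)
{
    return flavor == *(const int *)ctx ? 0 : -1;
}
```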
     
  • The fattrs used in the NFSv3 getacl/setacl calls are not being properly
    initialized. This occasionally causes nfs_update_inode to fall into
    NFSv4 specific codepaths when handling post-op attrs from these calls.

    Thanks to Cai Qian for noticing the spurious NFSv4 messages in debug
    output from a v3 mount...

    Signed-off-by: Jeff Layton
    Signed-off-by: Trond Myklebust

    Jeff Layton
     
  • We *do* now allow bsd flocks over nfs.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Trond Myklebust

    J. Bruce Fields
     
  • Unfortunately, BUG_ON(IS_ROOT(dentry)) can happen inside
    nfs_follow_mountpoint with NFS running Fedora 8 using a
    specific setup.
    https://bugzilla.redhat.com/show_bug.cgi?id=458622

    So the NFS client should handle this situation gracefully.

    Signed-off-by: Denis V. Lunev
    CC: Trond Myklebust
    CC: J. Bruce Fields
    Signed-off-by: Trond Myklebust

    Denis V. Lunev
     
  • Replace NULL with ERR_PTR(-EINVAL).

    Signed-off-by: Denis V. Lunev
    Signed-off-by: Trond Myklebust

    Denis V. Lunev
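
    For readers unfamiliar with the convention, here is a userspace
    re-creation of the error-pointer idiom. The kernel provides ERR_PTR(),
    PTR_ERR() and IS_ERR() itself; the definitions below are simplified
    approximations written for this sketch:

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *ERR_PTR(long error)
{
    return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static inline long PTR_ERR(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* Error pointers occupy the top MAX_ERRNO addresses; NULL is not one. */
static inline bool IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}
```

    Returning ERR_PTR(-EINVAL) rather than NULL lets the caller tell the
    reason for the failure apart from a plain "nothing found".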
     
  • This patch fixes the following compile error caused by
    commit f9247273cb69ba101877e946d2d83044409cc8c5
    (UFS: add const to parser token tabl):

    ...
    CC fs/nfs/nfsroot.o
    /home/bunk/linux/kernel-2.6/git/linux-2.6/fs/nfs/nfsroot.c:130: error: tokens causes a section type conflict
    make[3]: *** [fs/nfs/nfsroot.o] Error 1

    Signed-off-by: Adrian Bunk
    Signed-off-by: Trond Myklebust

    Adrian Bunk
     
  • Currently, if two processes are both trying to revalidate metadata for the
    same inode, they will find themselves being serialised. There is no good
    justification for this now that we have improved our ability to detect
    stale attribute data, so we should remove that serialisation.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Ensure that it sets the inode metadata under the correct spinlock.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • If we're merely checking the inode attributes because we suspect that the
    'updated' attributes returned by the RPC call are stale, then we shouldn't
    be doing weak cache consistency updates or clearing the cache_validity
    flags.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • In the case where there are parallel RPC calls to the same inode, we may
    receive stale metadata due to the lack of ordering, hence the sanity
    checking of metadata in nfs_refresh_inode().
    Currently, __nfs_revalidate_inode() is calling nfs_update_inode() directly,
    without any further sanity checks, and hence may end up setting the inode
    up with stale metadata.

    The fix is to use nfs_refresh_inode() instead of nfs_update_inode().

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • If we believe that the attributes are old (see nfs_refresh_inode()), then
    we shouldn't force an update.
    Also ensure that we hold the inode->i_lock across attribute checks and the
    call to nfs_refresh_inode_locked() to ensure that we don't race with other
    attribute updates.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Currently nfs_refresh_inode() will only update the inode metadata if it
    sees that the RPC call that returned the nfs_fattr was started
    after the last update of the inode. This means that if we have parallel
    RPC calls to the same inode (when sending WRITE calls, for instance), we
    may often miss updates.

    This patch attempts to recover those missed updates by also accepting
    them if the ctime in the nfs_fattr is more recent than the inode's
    cached ctime.
    It also recovers the case where the file size has increased, but the
    ctime has not been updated due to limited ctime resolution.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
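
    The acceptance rule described above can be sketched as a single
    predicate. The struct layouts and names below are simplified stand-ins
    invented for illustration, not the real nfs_fattr or NFS inode:

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical reduced view of the attributes returned by an RPC call. */
struct fattr_sketch {
    unsigned long time_start;   /* stamp taken when the call started */
    struct timespec ctime;
    long long size;
};

/* Hypothetical reduced view of the cached inode state. */
struct inode_sketch {
    unsigned long last_updated; /* stamp of the last accepted update */
    struct timespec ctime;
    long long size;
};

/* Three-way comparison of two timestamps: <0, 0, or >0. */
static int ts_cmp(const struct timespec *a, const struct timespec *b)
{
    if (a->tv_sec != b->tv_sec)
        return a->tv_sec < b->tv_sec ? -1 : 1;
    if (a->tv_nsec != b->tv_nsec)
        return a->tv_nsec < b->tv_nsec ? -1 : 1;
    return 0;
}

/* Accept the new attributes if the call started after the last update,
 * or the ctime is newer, or the file has grown. */
static bool attrs_need_update(const struct inode_sketch *inode,
                              const struct fattr_sketch *fattr)
{
    return (long)(inode->last_updated - fattr->time_start) < 0 ||
           ts_cmp(&fattr->ctime, &inode->ctime) > 0 ||
           fattr->size > inode->size;
}
```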
     
  • Try to avoid taking and dropping the inode->i_lock more than once. Do so by
    moving the code in nfs_refresh_inode() that needs to be done under the
    spinlock into a function nfs_refresh_inode_locked(), and then having both
    nfs_refresh_inode() and nfs_post_op_update_inode() call it directly.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Add the following NFS-specific mount options to the parser.

    -o lookupcache=all         /* Default: cache positive & negative dentries */
    -o lookupcache=pos[itive]  /* Don't cache negative dentries */
    -o lookupcache=none        /* Strict revalidation of all dentries */

    Signed-off-by: Trond Myklebust

    Trond Myklebust
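
    A sketch of how the option value might be parsed, reading
    "pos[itive]" as accepting both "pos" and "positive". The enum and
    function names are hypothetical and do not come from the patch:

```c
#include <stdbool.h>
#include <string.h>

enum lookupcache_mode {
    LOOKUPCACHE_ALL,        /* cache positive and negative dentries */
    LOOKUPCACHE_POSITIVE,   /* do not cache negative dentries */
    LOOKUPCACHE_NONE        /* strict revalidation of all dentries */
};

/* Parse the value of lookupcache=<value>; false means unrecognised. */
static bool parse_lookupcache(const char *value, enum lookupcache_mode *mode)
{
    if (strcmp(value, "all") == 0)
        *mode = LOOKUPCACHE_ALL;
    else if (strcmp(value, "pos") == 0 || strcmp(value, "positive") == 0)
        *mode = LOOKUPCACHE_POSITIVE;
    else if (strcmp(value, "none") == 0)
        *mode = LOOKUPCACHE_NONE;
    else
        return false;
    return true;
}
```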
     
  • The point of introducing text-based mounts was to allow us to add
    functionality without having to worry about legacy binary mount formats.
    The mask should be there in order to ensure that binary formats don't start
    enabling features that they cannot support. There is no justification for
    applying it to the text mount path.

    Signed-off-by: Trond Myklebust

    Trond Myklebust