09 Jan, 2009

1 commit

  • John Stanley reported EOVERFLOW errors in readdir from his self-built
    glibc. I traced this down to glibc enabling d_off overflow checks
    in one of the about five million different getdents implementations.

    In 2.6.28 Dave Woodhouse moved our readdir double buffering required
    for NFS4 readdirplus into nfsd, and at that point we lost the capping
    of the directory offsets to 32-bit signed values. John's glibc used
    getdents64 even to implement readdir for normal 32-bit offset dirents,
    and failed with EOVERFLOW only if this happened on the first dirent in
    a getdents call. I managed to come up with a testcase that uses
    raw getdents and does the EOVERFLOW check manually. We always hit
    it with our last entry due to the special end of directory marker.

    The patch below is a dumb version of just putting back the masking,
    to make sure we have the same behavior as in 2.6.27 and earlier.

    I will work on a better and cleaner fix for 2.6.30.

    Reported-by: John Stanley
    Tested-by: John Stanley
    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Signed-off-by: Lachlan McIlroy

    Christoph Hellwig
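
A minimal userspace sketch of the two pieces described above: masking a directory offset so that it fits a signed 32-bit value, and the kind of manual overflow check the testcase performs. The helper names cap_dir_offset() and check_d_off_32() are hypothetical, and the 0x7fffffff mask is an assumption about what "capping to 32 bit signed values" means here; this is not the patch itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical helper: keep only the low 31 bits of a directory offset,
 * so the result always fits in a signed 32-bit value. */
static inline int64_t cap_dir_offset(uint64_t off)
{
    return (int64_t)(off & 0x7fffffffULL);
}

/* Hypothetical helper: mimic a d_off overflow check done by a caller
 * that stores offsets in a signed 32-bit field.
 * Returns 0 if d_off fits, EOVERFLOW otherwise. */
static inline int check_d_off_32(int64_t d_off)
{
    if (d_off < INT32_MIN || d_off > INT32_MAX)
        return EOVERFLOW;
    return 0;
}
```

With the mask applied, even a very large end-of-directory cookie passes the check; without it, any offset above INT32_MAX is rejected.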
     

28 Jul, 2008

4 commits

  • This implements the code to store the actual filename found during a
    lookup in the dentry cache and to avoid multiple entries in the dcache
    pointing to the same inode.

    To avoid polluting the dcache, we implement a new set of directory inode
    operations for lookup. xfs_vn_ci_lookup() stores the correct case name in
    the dcache.

    The "actual name" is only allocated and returned for a case-insensitive
    match and not an exact match.

    Another unusual interaction with the dcache is that negative dentries are
    not stored, whereas other filesystems do a d_add(dentry, NULL) when an
    ENOENT is returned. During the VFS lookup, if a dentry returned has no
    inode, dput is called and ENOENT is returned. By not doing a d_add, this
    actually removes it completely from the dcache to be reused. create/rename
    have to be modified to support unhashed dentries being passed in.

    SGI-PV: 981521
    SGI-Modid: xfs-linux-melb:xfs-kern:31208a

    Signed-off-by: Barry Naujok
    Signed-off-by: Christoph Hellwig

    Barry Naujok
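
The "actual name" contract described above can be illustrated in userspace. This is only a sketch under assumptions: ci_lookup() is a hypothetical helper, directory entries are modelled as a plain array of strings, and strcasecmp() stands in for whatever case folding the filesystem uses. The stored name is copied out only when the match is case-insensitive, so a cache keyed on the returned name ends up with one entry per file.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/*
 * Hypothetical helper: look up `name` among `entries`.
 * Returns the index of the match, or -1 if nothing matches
 * (or if allocating the name copy fails).
 * On a case-insensitive-only match, *actual receives a malloc'd copy
 * of the stored name; on an exact match, *actual stays NULL.
 */
static int ci_lookup(const char *const *entries, size_t n,
                     const char *name, char **actual)
{
    int ci = -1;

    *actual = NULL;
    for (size_t i = 0; i < n; i++) {
        if (strcmp(entries[i], name) == 0)
            return (int)i;              /* exact match: no copy needed */
        if (ci < 0 && strcasecmp(entries[i], name) == 0)
            ci = (int)i;                /* remember first CI match */
    }
    if (ci >= 0) {
        size_t len = strlen(entries[ci]) + 1;

        *actual = malloc(len);
        if (!*actual)
            return -1;
        memcpy(*actual, entries[ci], len);
    }
    return ci;
}
```

The caller owns the returned copy and must free() it.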
     
  • The end of the xfs_da_args structure has 4 unsigned char fields for
    true/false information on directory and attr operations using the
    xfs_da_args structure.

    The following converts these 4 into an op_flags field that uses the first
    4 bits for these fields and allows expansion for future operation
    information (e.g. a case-insensitive lookup request).

    SGI-PV: 981520
    SGI-Modid: xfs-linux-melb:xfs-kern:31206a

    Signed-off-by: Barry Naujok
    Signed-off-by: Christoph Hellwig

    Barry Naujok
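
A sketch of the conversion described above, from four boolean bytes to a single flags word. The struct and macro names below are illustrative and are not asserted to match the kernel source; the fifth flag only shows how later expansion could look.

```c
#include <assert.h>
#include <stdbool.h>

/* Before: one unsigned char per true/false property (illustrative names). */
struct da_args_old {
    unsigned char justcheck;
    unsigned char rename;
    unsigned char addname;
    unsigned char oknoent;
};

/* After: one bit per property in a single field (illustrative names). */
#define DA_OP_JUSTCHECK 0x0001
#define DA_OP_RENAME    0x0002
#define DA_OP_ADDNAME   0x0004
#define DA_OP_OKNOENT   0x0008
#define DA_OP_CILOOKUP  0x0010  /* example of a later addition */

struct da_args_new {
    int op_flags;
};

/* Test a single flag; returns true if it is set. */
static inline bool da_op_test(const struct da_args_new *args, int flag)
{
    return (args->op_flags & flag) != 0;
}
```

Setting a property becomes `args.op_flags |= DA_OP_RENAME;` and clearing it becomes `args.op_flags &= ~DA_OP_RENAME;`.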
     
  • Adds two pieces of functionality for the basis of case-insensitive support
    in XFS:

    1. A comparison result enumerated type: xfs_dacmp. It represents an
       exact match, case-insensitive match or no match at all. This patch
       only implements different and exact results.

    2. An xfs_nameops vector for specifying how to perform the hash
       generation of filenames and the comparison methods. In this patch the
       hash vector points to the existing xfs_da_hashname function and the
       comparison method does a length compare, and if the same, does a
       memcmp and returns the xfs_dacmp result.

    All filename functions that use the hash (create, lookup, remove, rename,
    etc.) now use the xfs_nameops.hashname function and all directory lookup
    functions also use the xfs_nameops.compname function.

    The lookup functions also handle case-insensitive results even though the
    default comparison function cannot return that. An important aspect of
    the lookup functions is that an exact match always has precedence over a
    case-insensitive one. So when a case-insensitive match is found, we have
    to keep looking just in case there is an exact match. In the meantime, the
    info for the first case-insensitive match is retained in case no exact
    match is found.

    SGI-PV: 981519
    SGI-Modid: xfs-linux-melb:xfs-kern:31205a

    Signed-off-by: Barry Naujok
    Signed-off-by: Christoph Hellwig

    Barry Naujok
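
The pieces described above can be sketched in userspace as follows: a three-valued comparison result, an operations vector with a compare hook, and a lookup loop in which an exact match wins over an earlier case-insensitive one. All names (dacmp, nameops, dir_lookup and so on) are hypothetical stand-ins, the hash hook is omitted for brevity, and the case-insensitive comparator uses ASCII tolower() purely as an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum dacmp { CMP_DIFFERENT, CMP_EXACT, CMP_CASE };

struct name { const char *str; size_t len; };

struct nameops {
    enum dacmp (*compname)(const struct name *a, const struct name *b);
};

/* Default comparator: length compare, then memcmp; never returns CMP_CASE. */
static enum dacmp compname_default(const struct name *a, const struct name *b)
{
    if (a->len != b->len)
        return CMP_DIFFERENT;
    return memcmp(a->str, b->str, a->len) == 0 ? CMP_EXACT : CMP_DIFFERENT;
}

/* ASCII case-insensitive comparator (an assumption, for illustration). */
static enum dacmp compname_ci(const struct name *a, const struct name *b)
{
    enum dacmp result = CMP_EXACT;

    if (a->len != b->len)
        return CMP_DIFFERENT;
    for (size_t i = 0; i < a->len; i++) {
        if (a->str[i] == b->str[i])
            continue;
        if (tolower((unsigned char)a->str[i]) !=
            tolower((unsigned char)b->str[i]))
            return CMP_DIFFERENT;
        result = CMP_CASE;
    }
    return result;
}

/* Returns the index of the best match or -1; *how reports its kind.
 * An exact match ends the scan; the first CI match is only a fallback. */
static int dir_lookup(const struct nameops *ops, const struct name *ents,
                      size_t n, const struct name *want, enum dacmp *how)
{
    int best = -1;

    *how = CMP_DIFFERENT;
    for (size_t i = 0; i < n; i++) {
        enum dacmp r = ops->compname(&ents[i], want);

        if (r == CMP_EXACT) {
            *how = CMP_EXACT;
            return (int)i;
        }
        if (r == CMP_CASE && best < 0) {
            best = (int)i;
            *how = CMP_CASE;
        }
    }
    return best;
}
```

Swapping the vector from compname_default to compname_ci changes lookup behaviour without touching dir_lookup(), which is the point of routing comparisons through an operations table.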
     
  • The kmem_free() function takes (ptr, size) arguments but doesn't actually
    use the second one.

    This patch removes the size argument from all callsites.

    SGI-PV: 981498
    SGI-Modid: xfs-linux-melb:xfs-kern:31050a

    Signed-off-by: Denys Vlasenko
    Signed-off-by: David Chinner
    Signed-off-by: Lachlan McIlroy

    Denys Vlasenko
     

14 Feb, 2008

1 commit


18 Dec, 2007

1 commit

  • The recent filldir regression fix was not putting the correct d_off in
    each dirent. This was resulting in incorrect cookies being passed to dmapi
    ioctls and the wrong offset appearing in the dirents. readdir was
    unaffected as the filp->f_pos was being updated with the correct offset
    and this was being written into the last dirent in each buffer. Fix the
    XFS code to do the right thing.

    SGI-PV: 973746
    SGI-Modid: xfs-linux-melb:xfs-kern:30240a

    Signed-off-by: David Chinner
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Lachlan McIlroy

    Lachlan McIlroy
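
To illustrate what a "correct d_off in each dirent" means, here is a small sketch under the usual getdents convention that an entry's d_off is the position from which reading resumes after that entry, i.e. the position of the next entry. The types toy_entry and toy_dirent and the function fill_dirents() are hypothetical and are not XFS code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Source entry: where it lives in the directory (pos) and what it names. */
struct toy_entry {
    uint64_t ino;
    int64_t pos;
    const char *name;
};

/* Output record handed back to the caller. */
struct toy_dirent {
    uint64_t d_ino;
    int64_t d_off;      /* resume position after this entry */
    const char *d_name;
};

/*
 * Copy up to `max` entries into out[]. Each d_off is the position of the
 * following entry, or end_pos when there is none. Returns the count written.
 */
static size_t fill_dirents(const struct toy_entry *in, size_t n,
                           int64_t end_pos,
                           struct toy_dirent *out, size_t max)
{
    size_t count = n < max ? n : max;

    for (size_t i = 0; i < count; i++) {
        out[i].d_ino = in[i].ino;
        out[i].d_name = in[i].name;
        out[i].d_off = (i + 1 < n) ? in[i + 1].pos : end_pos;
    }
    return count;
}
```

A consumer that only looks at the final position would not notice wrong intermediate values, while one that uses every d_off as a cookie would.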
     

15 Oct, 2007

2 commits

  • One of the perpetual scaling problems XFS has is indexing its incore
    inodes. We currently use hashes, and the default hash sizes chosen can
    only ever be a tradeoff between memory consumption and the maximum
    realistic size of the cache.

    As a result, anyone who has millions of inodes cached on a filesystem
    needs to tune the size of the cache via the ihashsize mount option to
    allow decent scalability with inode cache operations.

    A further problem is the separate inode cluster hash, whose size is based
    on the ihashsize but is smaller, and so under certain conditions (sparse
    cluster cache population) this can become a limitation long before the
    inode hash is causing issues.

    The following patchset removes the inode hash and cluster hash and
    replaces them with radix trees to avoid the scalability limitations of the
    hashes. It also reduces the size of the inodes by 3 pointers....

    SGI-PV: 969561
    SGI-Modid: xfs-linux-melb:xfs-kern:29481a

    Signed-off-by: David Chinner
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Tim Shimmin

    David Chinner
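
For readers unfamiliar with the data structure, a toy radix tree keyed by a 32-bit number is sketched below. It is a deliberately simplified, fixed-height version with hypothetical names (rt_node, rt_insert, rt_lookup) and is not the kernel's radix tree API; it only shows why lookup cost depends on key width rather than on a table size chosen up front.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define RT_BITS   4                    /* key bits consumed per level */
#define RT_SLOTS  (1u << RT_BITS)      /* children per node */
#define RT_LEVELS 8                    /* 8 levels * 4 bits = 32-bit keys */

struct rt_node {
    void *slot[RT_SLOTS];              /* child nodes, or items at the leaf */
};

/* Slot index for `key` at `level` (level 0 uses the top bits). */
static unsigned rt_index(uint32_t key, int level)
{
    int shift = (RT_LEVELS - 1 - level) * RT_BITS;

    return (key >> shift) & (RT_SLOTS - 1);
}

/* Insert item under key, allocating interior nodes on demand.
 * Returns 0 on success, -1 if an allocation fails. */
static int rt_insert(struct rt_node *root, uint32_t key, void *item)
{
    struct rt_node *n = root;

    for (int level = 0; level < RT_LEVELS - 1; level++) {
        unsigned i = rt_index(key, level);

        if (!n->slot[i]) {
            n->slot[i] = calloc(1, sizeof(struct rt_node));
            if (!n->slot[i])
                return -1;
        }
        n = n->slot[i];
    }
    n->slot[rt_index(key, RT_LEVELS - 1)] = item;
    return 0;
}

/* Return the item stored under key, or NULL if absent. */
static void *rt_lookup(const struct rt_node *root, uint32_t key)
{
    const struct rt_node *n = root;

    for (int level = 0; level < RT_LEVELS - 1 && n; level++)
        n = n->slot[rt_index(key, level)];
    return n ? n->slot[rt_index(key, RT_LEVELS - 1)] : NULL;
}
```

Unlike a fixed-size hash table, the tree grows only along the paths that are actually populated, so there is no size parameter to tune.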
     
  • Currently xfs has a rather complicated internal scheme to allow for
    different directory formats in IRIX. This patch rips all code related to
    this out and pushes usage of the Linux filldir callback into the lowlevel
    directory code. This does not make the code any less portable because
    filldir can be used to create dirents of all possible variations
    (including the IRIX ones as proved by the IRIX binary emulation code under
    arch/mips/).

    This patch gets rid of an unnecessary copy in the readdir path, about 400
    lines of code and one of the last two users of the uio structure.

    This version is updated to deal with dmapi as well, which greatly
    simplifies the get_dirattrs code. The dmapi part has been tested using the
    get_dirattrs tools from the xfstest dmapi suite1 with various small and
    large directories.

    SGI-PV: 968563
    SGI-Modid: xfs-linux-melb:xfs-kern:29478a

    Signed-off-by: Christoph Hellwig
    Signed-off-by: David Chinner
    Signed-off-by: Tim Shimmin

    Christoph Hellwig
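
The callback pattern referred to above can be sketched in userspace: the directory walker knows nothing about the output format and simply hands each entry to a caller-supplied function, which decides how to record it and can stop the walk when its buffer is full. The names fill_fn, walk_dir and collect_names are hypothetical, and the callback signature is a simplification rather than the kernel's filldir_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Caller-supplied sink: return 0 to continue, nonzero to stop the walk. */
typedef int (*fill_fn)(void *ctx, const char *name, size_t namelen,
                       int64_t off, uint64_t ino);

struct dir_entry {
    const char *name;
    int64_t off;
    uint64_t ino;
};

/* Feed entries to `fill` until it refuses one.
 * Returns the number of entries accepted. */
static size_t walk_dir(const struct dir_entry *ents, size_t n,
                       fill_fn fill, void *ctx)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (fill(ctx, ents[i].name, strlen(ents[i].name),
                 ents[i].off, ents[i].ino) != 0)
            break;
    }
    return i;
}

/* One possible sink: remember names until a fixed capacity is reached. */
struct name_list {
    const char *names[8];
    size_t used;
    size_t capacity;    /* must not exceed 8 */
};

static int collect_names(void *ctx, const char *name, size_t namelen,
                         int64_t off, uint64_t ino)
{
    struct name_list *list = ctx;

    (void)namelen; (void)off; (void)ino;
    if (list->used == list->capacity)
        return -1;      /* buffer full: ask the walker to stop */
    list->names[list->used++] = name;
    return 0;
}
```

A different on-disk or ABI dirent layout only needs a different sink function; walk_dir() stays unchanged, and no intermediate copy of the entries is required.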
     

14 Jul, 2007

1 commit


08 May, 2007

1 commit


20 Jun, 2006

1 commit


09 Jun, 2006

1 commit


29 Mar, 2006

1 commit


17 Mar, 2006

11 commits


02 Nov, 2005

3 commits


17 Apr, 2005

1 commit

  • Initial git repository build. I'm not bothering with the full history,
    even though we have it. We can create a separate "historical" git
    archive of that later if we want to, and in the meantime it's about
    3.2GB when imported into git - space that would just make the early
    git days unnecessarily complicated, when we don't have a lot of good
    infrastructure for it.

    Let it rip!

    Linus Torvalds