30 Jul, 2010

1 commit


29 Jul, 2010

1 commit

  • Function gfs2_write_alloc_required always returned zero as its
    return code. Therefore, it doesn't need to return a return code
    at all. Given that, we can use the return value to return whether
    or not the dinode needs block allocations rather than passing
    that value in, which in turn simplifies a bunch of error checking.

    Signed-off-by: Bob Peterson
    Signed-off-by: Steven Whitehouse

    Bob Peterson
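
The interface change described above can be sketched in user-space C. This is only an illustration of the two call shapes: `struct toy_inode`, both function names, and the "write ends past the allocated size" rule are assumptions made for the example, not the kernel's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the dinode; only an allocated size is modelled. */
struct toy_inode {
    uint64_t allocated_bytes;
};

/* Old shape: a status code that is always 0, with the answer in an out-parameter. */
static int toy_alloc_required_old(const struct toy_inode *ip, uint64_t offset,
                                  unsigned int len, int *alloc_required)
{
    assert(alloc_required);
    *alloc_required = (offset + len > ip->allocated_bytes);
    return 0; /* never anything else, so caller-side error checks are dead code */
}

/* New shape: the answer itself is the return value. */
static bool toy_alloc_required_new(const struct toy_inode *ip, uint64_t offset,
                                   unsigned int len)
{
    return offset + len > ip->allocated_bytes;
}
```

With the second shape a caller can write `if (toy_alloc_required_new(ip, pos, len)) { ... }` and drop both the temporary variable and the error branch.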
     

15 Jul, 2010

1 commit


21 May, 2010

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw:
    GFS2: Fix typo
    GFS2: stuck in inode wait, no glocks stuck
    GFS2: Eliminate useless err variable
    GFS2: Fix writing to non-page aligned gfs2_quota structures
    GFS2: Add some useful messages
    GFS2: fix quota state reporting
    GFS2: Various gfs2_logd improvements
    GFS2: glock livelock
    GFS2: Clean up stuffed file copying
    GFS2: docs update
    GFS2: Remove space from slab cache name

    Linus Torvalds
     

30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include those
    headers directly instead of assuming availability. As this conversion
    needs to touch large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surrounding. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

29 Mar, 2010

1 commit

  • If the inode size was corrupt for stuffed files, it was possible
    for the copying of data to overrun the block and/or page. This patch
    checks for that condition so that this is no longer possible.

    This is also preparation for the new truncate sequence patch which
    requires the ability to have stuffed files with larger sizes than
    (disk block size - sizeof(on disk inode)) with the restriction that
    only the initial part of the file may be non-zero.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
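
A minimal user-space sketch of the kind of bound check described above. The function name and all parameters are hypothetical: `hdr_size` stands in for the size of the on-disk inode, and `i_size` is the (possibly corrupt) size read from it. The copy length is clamped to what the block and the page can actually hold, and the rest of the page is zeroed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy stuffed-file data from a disk block into a page-sized buffer.
 * Returns the number of bytes taken from the block.
 */
static size_t toy_copy_stuffed(uint8_t *page, size_t page_size,
                               const uint8_t *block, size_t block_size,
                               size_t hdr_size, uint64_t i_size)
{
    size_t max_data, n;

    assert(hdr_size <= block_size);
    max_data = block_size - hdr_size;      /* bytes available after the inode */

    n = (i_size < max_data) ? (size_t)i_size : max_data;
    if (n > page_size)
        n = page_size;                     /* never overrun the page */

    memcpy(page, block + hdr_size, n);
    memset(page + n, 0, page_size - n);    /* remainder reads back as zeros */
    return n;
}
```

Zero-filling the tail is also what would let a stuffed file report a size larger than the space after the inode while only its initial part is non-zero.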
     

12 Feb, 2010

1 commit

  • This patch solves a corner case during allocation which occurs if both
    metadata (indirect) and data blocks are required but there is an
    obstacle in the filesystem (e.g. a resource group header or another
    allocated block) such that when the allocation is requested only
    enough blocks for the metadata are returned.

    By changing the exit condition of this loop, we ensure that a
    minimum of one data block will always be returned.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

12 Jun, 2009

1 commit

  • This patch adds the ability to trace various aspects of the GFS2
    filesystem. The trace points are divided into three groups,
    glocks, logging and bmap. These points have been chosen because
    they allow inspection of the major internal functions of GFS2
    and they are also generic enough that they are unlikely to need
    any major changes as the filesystem evolves.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

10 Jun, 2009

1 commit


22 May, 2009

1 commit

  • This patch renames the ops_*.c files which have no counterpart
    without the ops_ prefix in order to shorten the name and make
    it more readable. In addition, ops_address.h (which was very
    small) is moved into inode.h and inode.h is cleaned up by
    adding extern where required.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

20 May, 2009

1 commit

  • This patch improves the error handling in the case where we
    discover that the summary information in the resource group
    doesn't match the bitmap information while in the process of
    allocating blocks. Originally this resulted in a kernel bug,
    but this patch changes that so that we return -EIO and print
    some messages explaining what went wrong, and how to fix it.

    We also remember locally not to try and allocate from the
    same rgrp again, so that a subsequent allocation in a
    different rgrp should succeed.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

24 Mar, 2009

1 commit

  • This is the big patch that I've been working on for some time
    now. There are many reasons for wanting to make this change
    such as:
    o Reducing overhead by eliminating duplicated fields between structures
    o Simplification of the code (reduces the code size by a fair bit)
    o The locking interface is now the DLM interface itself as proposed
    some time ago.
    o Fewer lookups of glocks when processing replies from the DLM
    o Fewer memory allocations/deallocations for each glock
    o Scope to do further optimisations in the future (but this patch is
    more than big enough for now!)

    Please note that (a) this patch relates to the lock_dlm module and
    not the DLM itself, that is still a separate module; and (b) that
    we retain the ability to build GFS2 as a standalone single node
    filesystem without requiring the DLM.

    This patch needs a lot of testing, hence my keeping it in my -git tree
    since I restarted it after the last merge window. That way, this has the maximum
    exposure before it's merged. This is (modulo a few minor bug fixes) the
    same patch that I've been posting on and off for the last three months
    and it's passed a number of different tests so far.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

05 Jan, 2009

3 commits


25 Jun, 2008

1 commit

  • This patch fixes bz 450641.

    This patch changes the computation for zero_metapath_length(), which it
    renames to metapath_branch_start(). When you are extending the metadata
    tree, the indirect blocks that point to the new data block must
    diverge from the existing tree either at the inode, or at the first
    indirect block. They can diverge at the first indirect block because the
    inode has room for 483 pointers while the indirect blocks have room for
    509 pointers, so when the tree is grown, there is some free space in the
    first indirect block. What metapath_branch_start() now computes is the
    height where the first indirect block for the new data block is located.
    It can either be 1 (if the indirect block diverges from the inode) or 2
    (if it diverges from the first indirect block).

    Signed-off-by: Benjamin Marzinski
    Signed-off-by: Steven Whitehouse

    Benjamin Marzinski
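
The two pointer counts quoted above can be reproduced with a little arithmetic. The sketch below assumes a 4096-byte block, 8-byte block pointers, a 232-byte on-disk inode and a 24-byte metadata header; those sizes are not stated in the commit message and are used here only because they yield the quoted figures.

```c
#include <assert.h>
#include <stdint.h>

/* Number of 8-byte block pointers that fit after a header of the given size. */
static unsigned int toy_ptrs_per_block(unsigned int block_size,
                                       unsigned int header_size)
{
    assert(header_size <= block_size);
    return (unsigned int)((block_size - header_size) / sizeof(uint64_t));
}
```

Under these assumptions `toy_ptrs_per_block(4096, 232)` gives 483 and `toy_ptrs_per_block(4096, 24)` gives 509, a difference of 26 slots, which would account for the free space in the first indirect block after the tree is grown.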
     

31 Mar, 2008

17 commits

  • This patch streamlines the quota checking in the "no quota" case by
    making the check inline in the calling function, thus reducing the
    number of function calls. Eventually we might be able to remove the
    checks from the gfs2_quota_lock() and gfs2_quota_check() functions, but
    currently we can't as there are still a few places in the code which need
    to call these functions directly.

    Signed-off-by: Steven Whitehouse
    Cc: Abhijith Das

    Steven Whitehouse
     
  • gfs2_alloc_get may fail so we have to check it to prevent
    NULL pointer dereference.

    Signed-off-by: Cyrill Gorcunov
    Signed-off-by: Steven Whitehouse

    Cyrill Gorcunov
     
  • We've supported mapping of extents when no block allocation is required
    for some time. This patch extends that to mapping of extents when an
    allocation has been requested. In that case we try to allocate as many
    blocks as are requested, but we might return fewer in case there is
    something preventing us from returning the complete amount (e.g. an
    already allocated block is in the way).

    Currently the only code path which can actually request multiple data
    blocks in a single bmap call is the page_mkwrite path and even then it
    only happens if there are multiple blocks per page. What this patch does
    do however, is merge the allocation requests for metadata (growing the
    metadata tree in either height or depth) with the allocation of the data
    blocks in the case that both are needed. This results in lower overheads
    even in the single block allocation case.

    The one thing which we can't handle here at the moment is unstuffing. I
    would like to be able to do that, but the problem which arises is that
    in order to unstuff one has to get a locked page from the page cache
    which results in locking problems in the (usual) case that the caller is
    holding the page lock on the page it wishes to map. So that case will
    have to be addressed in future patches.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • In the case that we needed to grow the height of the metadata tree
    we were looking up the inode buffer and then brelse()ing it despite
    the fact that it is needed later in the block map process.

    This patch ensures that we look up the inode's buffer once and only
    once during the block map process.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • The blocks counter is almost a duplicate of the i_blocks
    field in the VFS inode. The only difference is that i_blocks
    can be only 32bits long for 32bit arch without large single file
    support. Since GFS2 doesn't handle the non-large single file
    case (for 32 bit anyway) this adds a new config dependency on
    64BIT || LSF. This has always been the case, however we've never
    explicitly said so before.

    Even if we do add support for the non-LSF case, we will still
    not require this field to be duplicated since we will not be
    able to access oversized files anyway.

    So the net result of all this is that we shave 8 bytes from a gfs2_inode
    and get our config deps correct.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • This adds a function (currently the only use is during mapping
    of already allocated blocks, but watch this space) which iterates
    over a number of pointers in a block and returns the extent length.

    If the initial pointer is 0 (i.e. unallocated) it will return the
    number of unallocated blocks in the extent. If the initial pointer
    is allocated, then it returns the number of contiguously allocated
    blocks in the extent.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
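
A user-space sketch of the behaviour described above, assuming "contiguously allocated" means consecutive block numbers. The function name is hypothetical, and pointers are taken in host byte order here, whereas on-disk pointers would need converting first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the length of the extent starting at ptrs[0].
 * If ptrs[0] is 0 (unallocated), count the run of zero pointers.
 * Otherwise count the run of consecutive block numbers.
 */
static unsigned int toy_extent_length(const uint64_t *ptrs, size_t nptrs)
{
    uint64_t first;
    size_t i;

    assert(nptrs > 0);
    first = ptrs[0];

    for (i = 1; i < nptrs; i++) {
        if (first == 0) {
            if (ptrs[i] != 0)
                break;              /* the hole ends here */
        } else if (ptrs[i] != first + i) {
            break;                  /* next block is not contiguous */
        }
    }
    return (unsigned int)i;
}
```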
     
  • A dereference was forgotten. This adds it back correctly.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • Rather than having to allocate a single block at a time, this patch
    allows the block allocator to allocate an extent. Since there is
    no difference (so far as the block allocator is concerned) between
    data blocks and indirect blocks, it is possible to allocate a single
    extent and for the caller to unrevoke just the blocks required
    for indirect blocks.

    Currently the only bit of GFS2 to make use of this feature is the
    build height function. The intention is that gfs2_block_map will
    be changed to make use of this feature in future patches.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • Thanks to the preceding patches, the only difference between
    these two functions is their name. We can thus merge them
    and call the new function gfs2_alloc_block to reflect the
    fact that it can allocate either kind of block.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • By adding an extra argument to gfs2_trans_add_unrevoke we can now
    specify an extent length of blocks to unrevoke. This means that
    we only need to make one pass through the list for each extent
    rather than each block. Currently the only extent length which
    is used is 1, but that will change in the future.

    Also gfs2_trans_add_unrevoke is removed from gfs2_alloc_meta
    since it's the only difference between this and gfs2_alloc_data
    which is left. This will allow a future patch to merge these
    two functions into one (i.e. one call to allocate both data
    and metadata in a single extent in the future).

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • There were three fields being used to keep track of the location
    of the most recently allocated block for each inode. These have
    been merged into a single field in order to better keep the
    data and metadata for an inode close on disk, and also to reduce
    the space required for storage.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • The reason for doing this is to allow all the block mapping code
    to share the same array. As a result we can remove two arguments
    from lookup_metapath since they are now returned via the array.

    We also add a function to drop all refs to buffer heads when we
    are done with the metapath. The build_height function shares the
    struct metapath, but currently still frees its own buffers, and
    this will change in a future patch.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • This is required to enable future changes to the block
    mapping code.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • This patch contains two small fixups that didn't fit elsewhere.
    They are: (1) get rid of temp variable in find_metapath.
    (2) Remove vestigial "ret" variable from gfs2_writepage_common.

    Signed-off-by: Bob Peterson
    Signed-off-by: Steven Whitehouse

    Bob Peterson
     
  • This patch removed the unnecessary parameter from function
    gfs2_rlist_alloc. The parameter was always passed in as 0.

    Signed-off-by: Bob Peterson
    Signed-off-by: Steven Whitehouse

    Bob Peterson
     
  • This patch improves the calculation of the tree height in order to reduce
    the number of operations which are carried out on each call to gfs2_block_map.
    In the common case, we now make a single comparison, rather than calculating
    the required tree height from scratch each time. Also in the case that the
    tree does need some extra height, we start from the current height rather than from
    zero when we work out what the new height ought to be.

    In addition the di_height field is moved into the inode proper and reduced
    in size to a u8 since the value must be between 0 and GFS2_MAX_META_HEIGHT (10).

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
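
The height calculation described above might look roughly like this in user-space C. The table `heightsize[]` (the largest file size each tree height can address) and the function name are assumptions for the example; the real table would be derived from the block size and pointer counts.

```c
#include <assert.h>
#include <stdint.h>

/*
 * heightsize[h] is the largest size a tree of height h can address; the
 * table must have max_height + 1 entries. Returns the height needed for
 * `size`, starting the search from the current height rather than zero.
 * If `size` exceeds even heightsize[max_height], max_height is returned
 * and the caller has to reject the request.
 */
static unsigned int toy_height_for_size(const uint64_t *heightsize,
                                        unsigned int max_height,
                                        unsigned int cur_height,
                                        uint64_t size)
{
    unsigned int h = cur_height;

    assert(cur_height <= max_height);

    if (size <= heightsize[h])
        return h;                   /* common case: one comparison */

    while (h < max_height && size > heightsize[h])
        h++;                        /* grow from the current height */
    return h;
}
```

A height bounded by 10 fits comfortably in a `uint8_t`, which is consistent with storing it as a u8.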
     
  • This patch removes the call to gfs2_extent_map from gfs2_write_alloc_required,
    instead we call gfs2_block_map directly. This results in fewer overall calls
    to gfs2_block_map in the multi-block case.

    Also, gfs2_extent_map is marked as deprecated so that people know that it's
    going away as soon as all the callers have been converted.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

06 Feb, 2008

1 commit

  • Simplify page cache zeroing of segments of pages through 3 functions

    zero_user_segments(page, start1, end1, start2, end2)

    Zeros two segments of the page. It takes the position where to
    start and end the zeroing which avoids length calculations and
    makes code clearer.

    zero_user_segment(page, start, end)

    Same for a single segment.

    zero_user(page, start, length)

    Length variant for the case where we know the length.

    We remove the zero_user_page macro. Issues:

    1. It's a macro. Inline functions are preferable.

    2. The KM_USER0 macro is only defined for HIGHMEM.

    Having to treat this special case everywhere makes the
    code needlessly complex. The parameter for zeroing is always
    KM_USER0 except in one single case that we open code.

    Avoiding KM_USER0 means a lot of code no longer has to deal
    with the special casing for HIGHMEM. Dealing with
    kmap is only necessary for HIGHMEM configurations. In those
    configurations we use KM_USER0 like we do for a series of other
    functions defined in highmem.h.

    Since KM_USER0 depends on HIGHMEM, the existing zero_user_page
    had to be a macro. The zero_user_* functions introduced
    here can be inline because that constant is not used when these
    functions are called.

    Also extract the flushing of the caches to be outside of the kmap.

    [akpm@linux-foundation.org: fix nfs and ntfs build]
    [akpm@linux-foundation.org: fix ntfs build some more]
    Signed-off-by: Christoph Lameter
    Cc: Steven French
    Cc: Michael Halcrow
    Cc:
    Cc: Steven Whitehouse
    Cc: Trond Myklebust
    Cc: "J. Bruce Fields"
    Cc: Anton Altaparmakov
    Cc: Mark Fasheh
    Cc: David Chinner
    Cc: Michael Halcrow
    Cc: Steven French
    Cc: Steven Whitehouse
    Cc: Trond Myklebust
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
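
The relationship between the three helpers can be modelled in user space. The `toy_` functions below are illustrative stand-ins that operate on a plain byte buffer; the kernel versions take a `struct page *` and handle mapping and cache flushing, none of which is modelled here. `TOY_PAGE_SIZE` is likewise an assumption.

```c
#include <assert.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096u

/* Zero the byte ranges [start1, end1) and [start2, end2) of a page buffer. */
static void toy_zero_segments(unsigned char *page,
                              unsigned int start1, unsigned int end1,
                              unsigned int start2, unsigned int end2)
{
    assert(start1 <= end1 && end1 <= TOY_PAGE_SIZE);
    assert(start2 <= end2 && end2 <= TOY_PAGE_SIZE);

    if (end1 > start1)
        memset(page + start1, 0, end1 - start1);
    if (end2 > start2)
        memset(page + start2, 0, end2 - start2);
}

/* Single-segment variant: start and end positions. */
static void toy_zero_segment(unsigned char *page,
                             unsigned int start, unsigned int end)
{
    toy_zero_segments(page, start, end, 0, 0);
}

/* Length variant: start position and size. */
static void toy_zero(unsigned char *page,
                     unsigned int start, unsigned int size)
{
    toy_zero_segments(page, start, start + size, 0, 0);
}
```

Passing start and end positions means a caller that wants to clear everything outside a written range can pass its own offsets directly instead of computing two lengths.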
     

25 Jan, 2008

6 commits

  • The comparison was being made against the wrong quantity.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • This is a small I/O performance enhancement to gfs2. (Actually, it is a rework of
    an earlier version I got wrong). The idea here is to check if the write extends
    past the last block in the file. If so, the function can save itself a lot of
    time and trouble because it knows an allocation will be required. Benchmarks like
    iozone should see better performance.

    Signed-off-by: Bob Peterson
    Signed-off-by: Steven Whitehouse

    Bob Peterson
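
The shortcut described above can be sketched as a block-number comparison. The function name, the enum and the parameters are hypothetical, and the sketch assumes `offset + len` does not overflow. When the write ends beyond the file's last block, the answer is known at once; otherwise the caller still has to walk the block map, since the file may contain holes.

```c
#include <assert.h>
#include <stdint.h>

enum toy_alloc_hint {
    TOY_ALLOC_REQUIRED,  /* write extends past the last block of the file */
    TOY_ALLOC_UNKNOWN    /* must fall back to checking the block map */
};

static enum toy_alloc_hint toy_write_past_end(uint64_t i_size,
                                              unsigned int block_shift,
                                              uint64_t offset,
                                              unsigned int len)
{
    uint64_t bsize = (uint64_t)1 << block_shift;
    /* Number of blocks covered by the file and by the end of the write. */
    uint64_t file_blocks = (i_size + bsize - 1) >> block_shift;
    uint64_t write_blocks = (offset + len + bsize - 1) >> block_shift;

    assert(block_shift < 32);
    return write_blocks > file_blocks ? TOY_ALLOC_REQUIRED : TOY_ALLOC_UNKNOWN;
}
```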
     
  • It is possible to reduce the size of GFS2 inodes by taking the i_alloc
    structure out of the gfs2_inode. This patch allocates the i_alloc
    structure whenever it's needed, and frees it afterward. This decreases
    the amount of low memory we use at the expense of requiring a memory
    allocation for each page or partial page that we write. A quick test
    with postmark shows that the overhead is not measurable and I also note
    that OCFS2 uses the same approach.

    In the future I'd like to solve the problem by shrinking down the size
    of the members of the i_alloc structure, but for now, this reduces the
    immediate problem of using too much low-memory on x86 and doesn't add
    too much overhead.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • Function gfs2_block_map was often looking up the disk inode twice.
    This optimizes it so that it only does so once.

    Signed-off-by: Bob Peterson
    Signed-off-by: Steven Whitehouse

    Bob Peterson
     
  • This patch is just a cleanup. Function gfs2_get_block() just calls
    function gfs2_block_map reversing the last two parameters. By
    reversing the parameters, gfs2_block_map() may be called directly
    and function gfs2_get_block may be eliminated altogether.
    Since this function is done for every block operation,
    this streamlines the code and makes it a little bit more efficient.

    Signed-off-by: Bob Peterson
    Signed-off-by: Steven Whitehouse

    Bob Peterson
     
  • This adds a function "gfs2_is_writeback()" along the lines of the
    existing "gfs2_is_jdata()" in order to clean up the code and make
    the various tests for the inode mode more obvious. It also fixes
    the PageChecked() logic where we were resetting the flag too early
    in the case of an error path.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse