09 Jul, 2007

1 commit

  • This patch passes all my nasty tests that were causing the code to
    fail under one circumstance or another. Here is a complete summary
    of all changes from today's git tree, in order of appearance:

    1. There are now separate variables for metadata buffer accounting.
    2. Variable sd_log_num_hdrs is no longer needed, since the header
    accounting is taken care of by the reserve/refund sequence.
    3. Fixed a tiny grammatical problem in a comment.
    4. Added a new function "calc_reserved" to calculate the reserved
    log space. This isn't entirely necessary, but it has two benefits:
    First, it simplifies the gfs2_log_refund function greatly.
    Second, it allows for easier debugging because I could sprinkle the
    code with calls to this function to make sure the accounting is
    proper (by adding asserts and printks) at strategic points in the code.
    5. In log_pull_tail there apparently was a kludge to fix up the
    accounting based on a "pull" parameter. The buffer accounting is
    now done properly, so the kludge was removed.
    6. File sync operations were making a call to gfs2_log_flush that
    writes another journal header. Since that header was unplanned
    for (reserved) by the reserve/refund sequence, the free space had
    to be decremented so that when log_pull_tail gets called, the free
    space is adjusted properly. (Did I hear you call that a kludge?
    Well, maybe, but a lot more justifiable than the one I removed).
    7. In the gfs2_log_shutdown code, it optionally syncs the log by
    specifying the PULL parameter to log_write_header. I'm not sure
    this is necessary anymore. It just seems to me there could be
    cases where shutdown is called while there are outstanding log
    buffers.
    8. In the (data)buf_lo_before_commit functions, I changed some offset
    values from being calculated on the fly to being constants. That
    simplified some code and we might as well let the compiler do the
    calculation once rather than redoing those cycles at run time.
    9. This version has my rewritten databuf_lo_add function.
    This version is much more like its predecessor, buf_lo_add, which
    makes it easier to understand. Again, this might not be necessary,
    but it seems as if this one works as well as the previous one,
    maybe even better, so I decided to leave it in.
    10. In databuf_lo_before_commit, a previous data corruption problem
    was caused by going off the end of the buffer. The proper solution
    is to have the proper limit in place, rather than stopping earlier.
    (Thus my previous attempt to fix it is wrong).
    If you don't wrap the buffer, you're stopping too early and that
    causes more log buffer accounting problems.
    11. In lops.h there are two new (previously mentioned) constants for
    figuring out the data offset for the journal buffers.
    12. There are also two new functions, buf_limit and databuf_limit to
    calculate how many entries will fit in the buffer.
    13. In function gfs2_meta_wipe, it needs to distinguish between pinned
    metadata buffers and journaled data buffers for proper journal buffer
    accounting. It can't use the JDATA gfs2_inode flag because it's
    sometimes passed the "real" inode and sometimes the "metadata
    inode" and the inode flags will be random bits in a metadata
    gfs2_inode. It needs to base its decision on which was passed in.

    Signed-off-by: Bob Peterson
    Signed-off-by: Steven Whitehouse

    Robert Peterson
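
Items 8, 11 and 12 of the commit above describe constant data offsets and two helpers, buf_limit and databuf_limit, that calculate how many entries fit in a journal buffer. The arithmetic can be sketched in userspace C roughly as follows. This is a minimal sketch, not the kernel code: the 72-byte header, the entry sizes and every `demo_`/`DEMO_` name are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a log descriptor header; NOT the real GFS2 layout. */
struct demo_log_descriptor {
	uint8_t bytes[72];
};

/* Round x up to the next multiple of a (a must be a power of two). */
#define DEMO_ALIGN(x, a) (((x) + (a) - 1) & ~((size_t)(a) - 1))

/* Compile-time constant: offset of the first entry after the header. */
#define DEMO_BUF_OFFSET \
	DEMO_ALIGN(sizeof(struct demo_log_descriptor), sizeof(uint64_t))

/* Assumption: a metadata entry is one 64-bit block number. */
static inline unsigned int demo_buf_limit(size_t block_size)
{
	assert(block_size > DEMO_BUF_OFFSET);
	return (unsigned int)((block_size - DEMO_BUF_OFFSET) / sizeof(uint64_t));
}

/* Assumption: a journaled-data entry is two 64-bit words. */
static inline unsigned int demo_databuf_limit(size_t block_size)
{
	assert(block_size > DEMO_BUF_OFFSET);
	return (unsigned int)((block_size - DEMO_BUF_OFFSET) /
			      (2 * sizeof(uint64_t)));
}
```

Under these assumed sizes, a 4096-byte block holds 503 metadata entries or 251 journaled-data entries. Because the offset is a constant expression, the compiler evaluates it once rather than at run time, which is the point item 8 makes.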
     

12 Feb, 2007

1 commit

  • Replace appropriate pairs of "kmem_cache_alloc()" + "memset(0)" with the
    corresponding "kmem_cache_zalloc()" call.

    Signed-off-by: Robert P. J. Day
    Cc: "Luck, Tony"
    Cc: Andi Kleen
    Cc: Roland McGrath
    Cc: James Bottomley
    Cc: Greg KH
    Acked-by: Joel Becker
    Cc: Steven Whitehouse
    Cc: Jan Kara
    Cc: Michael Halcrow
    Cc: "David S. Miller"
    Cc: Stephen Smalley
    Cc: James Morris
    Cc: Chris Wright
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Robert P. J. Day
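
The replacement described in the commit above can be illustrated with a userspace analogy. The kernel slab calls cannot run outside the kernel, so this sketch assumes malloc() plus memset() as the stand-in for the old pair and calloc() as the stand-in for the zeroing allocator. The struct and the `demo_` function names are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical object type, used only for this illustration. */
struct demo_obj {
	int id;
	char name[16];
};

/* Before: allocate, then zero the memory in a second step. */
static struct demo_obj *demo_alloc_then_memset(void)
{
	struct demo_obj *obj = malloc(sizeof(*obj));

	if (obj)
		memset(obj, 0, sizeof(*obj));
	return obj;
}

/* After: a single call that returns zeroed memory (or NULL on failure). */
static struct demo_obj *demo_zalloc(void)
{
	return calloc(1, sizeof(struct demo_obj));
}
```

Both helpers hand back an object whose fields are all zero. The single-call form is shorter and removes the chance of forgetting the memset() or passing it the wrong size.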
     

30 Nov, 2006

3 commits

  • Since the superblock and the address_space are determined by the
    glock, we might as well just pass that as the argument since all
    the callers already have that available.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • By moving gfs2_meta_syncfs() into log.c, gfs2_ail1_start()
    can be made static.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • This fixes a bug which resulted in poor performance due to flushing
    the journal too often. The code path in question was via the inode_go_sync()
    function in glops.c. The solution is not to flush the journal immediately
    when inodes are ejected from memory, but batch up the work for glockd to
    deal with later on. This means that glocks may now live on beyond the end of
    the lifetime of their inodes (but not very much longer in the normal case).

    Also fixed in this patch is a bug (which was hidden by the bug mentioned above) in
    calculation of the number of free journal blocks.

    The gfs2_logd process has been altered to be more responsive to the journal
    filling up. We now wake it up when the number of uncommitted journal blocks
    has reached the threshold level rather than trying to flush directly at the
    end of each transaction. This again means doing fewer, but larger, log
    flushes in general.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

03 Oct, 2006

1 commit


02 Oct, 2006

1 commit


28 Sep, 2006

1 commit

  • The following patches reduce the size of the VFS inode structure by 28 bytes
    on a UP x86. (It would be more on an x86_64 system). This is a 10% reduction
    in the inode size on a UP kernel that is configured in a production mode
    (i.e., with no spinlock or other debugging functions enabled; if you want to
    save memory taken up by in-core inodes, the first thing you should do is
    disable the debugging options; they are responsible for a huge amount of bloat
    in the VFS inode structure).

    This patch:

    The filesystem or device-specific pointer in the inode is inside a union,
    which is pretty pointless given that all 30+ users of this field have been
    using the void pointer. Get rid of the union and rename it to i_private, with
    a comment to explain who is allowed to use the void pointer. This is just a
    cleanup, but it allows us to reuse the union 'u' for something where
    the union will actually be used.

    Signed-off-by: "Theodore Ts'o"
    Cc: Steven Whitehouse
    Signed-off-by: Andrew Morton

    Theodore Ts'o
     

22 Sep, 2006

1 commit

  • Fix a bug in the directory reading code, where we might have dereferenced
    a NULL pointer in case of OOM. Updated the directory code to use the new
    & improved version of gfs2_meta_ra() which now returns the first block
    that was being read. Previously it was releasing it, requiring the
    following code to grab the block again at each point it was called.

    Also turned off readahead on directory lookups since we are reading a
    hash table, and therefore reading the entries in order is very
    unlikely. Readahead is still used for all other calls to the
    directory reading function (e.g. when growing the hash table).

    Removed the DIO_START constant. Everywhere this was used, it was
    used to unconditionally start i/o aside from a couple of places, so
    I've removed it and made the couple of exceptions to this rule into
    separate functions.

    Also hunted through the other DIO flags and removed them as arguments
    from functions which were always called with the same combination of
    arguments.

    Updated gfs2_meta_indirect_buffer to be a bit more efficient and
    hopefully also be a bit easier to read.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

21 Sep, 2006

1 commit


19 Sep, 2006

1 commit

  • lm_interface.h has a few out of the tree clients such as GFS1
    and userland tools.

    Right now, these clients keep a copy of the file in their build tree
    that can go out of sync.

    Move lm_interface.h to include/linux, export it to userland and
    clean up fs/gfs2 to use the new location.

    Signed-off-by: Fabio M. Di Nitto
    Signed-off-by: Steven Whitehouse

    Fabio Massimo Di Nitto
     

05 Sep, 2006

3 commits


01 Sep, 2006

1 commit

  • As per comments from Jan Engelhardt this
    updates the copyright message to say "version" in full rather than
    "v.2". Also incore.h has been updated to remove forward structure
    declarations which are not required.

    The gfs2_quota_lvb structure has now had endianness annotations added
    to it. Also quota.c has been updated so that we now store the
    lvb data locally in endian independent format to avoid needing
    a structure in host endianness too. As a result the endianness
    conversions are done as required at various points and thus the
    conversion routines in lvb.[ch] are no longer required. I've
    moved the one remaining constant in lvb.h that's used into lm.h
    and removed the unused lvb.[ch].

    I have not changed the HIF_ constants. That is left to a later patch
    which I hope will unify the gh_flags and gh_iflags fields of the
    struct gfs2_holder.

    Cc: Jan Engelhardt
    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
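
The idea in the commit above, keeping a value in one endian-independent format and converting only where it is used, can be sketched in portable userspace C. This is an illustration under assumptions: the struct and helper names are hypothetical and do not reflect the real gfs2_quota_lvb layout. Kernel code would normally use an annotated __be64 field with cpu_to_be64() and be64_to_cpu() instead of the byte loops shown here.

```c
#include <stdint.h>

/* Hypothetical value block stored as raw big-endian bytes. */
struct demo_lvb {
	uint8_t limit_be[8];
};

/* Store a host-order value as big-endian bytes, on any host. */
static void demo_put_be64(uint8_t out[8], uint64_t v)
{
	for (int i = 0; i < 8; i++)
		out[i] = (uint8_t)(v >> (56 - 8 * i));
}

/* Read big-endian bytes back into a host-order value. */
static uint64_t demo_get_be64(const uint8_t in[8])
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | in[i];
	return v;
}
```

Since only one representation is ever stored, there is no need for a second host-endian copy of the structure or for whole-structure conversion routines.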
     

19 Aug, 2006

1 commit

  • This fixes a memory leak of struct gfs2_bufdata and also some
    problems in the ordered write handling code. It needs a bit
    more testing, but I believe that the reference counting of
    ordered write buffers should now be correct.

    This is aimed at fixing Red Hat bugzilla: #201028 and #201082

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

11 Jul, 2006

2 commits

  • We must not call GFP_KERNEL memory allocations while we
    are holding the log lock (read or write) since that may
    trigger a log flush resulting in a deadlock.

    Eventually we need to fix the locking in log.c; for now
    this solves the problem at the expense of not freeing up memory
    as fast as we would like to. This needs to be revisited
    later on.

    Cc: Kevin Anderson
    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • This adds a generation number for the eventual use of NFS to the
    ondisk inode. It's backward compatible with the current code since
    it doesn't really matter what the generation number is to start with,
    and indeed since it's set to zero, due to it being taken from padding
    in both the inode and rgrp header, it should be fine.

    The eventual plan is to use this rather than no_formal_ino in the
    NFS filehandles. At that point no_formal_ino will be unused.

    At the same time we also add a releasepages call back to the
    "normal" address space for gfs2 inodes. Also I've removed a
    one-liner function that's not required any more.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

04 Jul, 2006

1 commit


15 Jun, 2006

1 commit

  • This patch fixes the way we have been dealing with unlinked,
    but still open files. It removes all limits (other than memory
    for inodes, as per every other filesystem) on numbers of these
    which we can support on GFS2. It also means that (like other
    fs) it's the responsibility of the last process to close the file
    to deallocate the storage, rather than the person who did the
    unlinking. Note that with GFS2, those two events might take place
    on different nodes.

    Also there are a number of other changes:

    o We use the Linux inode subsystem as it was intended to be
    used, wrt allocating GFS2 inodes
    o The Linux inode cache is now the point which we use for
    local enforcement of only holding one copy of the inode in
    core at once (previous to this we used the glock layer).
    o We no longer use the unlinked "special" file. We just ignore it
    completely. This makes unlinking more efficient.
    o We now use the 4th block allocation state. The previously unused
    state is used to track unlinked but still open inodes.
    o gfs2_inoded is no longer needed
    o Several fields are now no longer needed (and removed) from the in
    core struct gfs2_inode
    o Several fields are no longer needed (and removed) from the in core
    superblock

    There are a number of future possible optimisations and clean ups
    which have been made possible by this patch.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
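
The commit above says a fourth, previously unused block allocation state now tracks unlinked but still open inodes. Four states fit in two bits per block, so a bitmap of that shape can be sketched as below. The two-bit packing, the numeric values and all `DEMO_`/`demo_` names are assumptions for illustration, not taken from the GFS2 source.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical state values; four states need two bits per block. */
enum demo_blk_state {
	DEMO_FREE     = 0,
	DEMO_USED     = 1,
	DEMO_UNLINKED = 2,	/* unlinked, but possibly still open */
	DEMO_DINODE   = 3
};

/* Read the two-bit state of one block (four blocks per byte). */
static enum demo_blk_state demo_get_state(const uint8_t *bitmap, size_t block)
{
	unsigned int shift = (unsigned int)(block % 4) * 2;

	return (enum demo_blk_state)((bitmap[block / 4] >> shift) & 0x3u);
}

/* Overwrite the two-bit state of one block, leaving its neighbours alone. */
static void demo_set_state(uint8_t *bitmap, size_t block,
			   enum demo_blk_state st)
{
	unsigned int shift = (unsigned int)(block % 4) * 2;
	uint8_t *byte = &bitmap[block / 4];

	*byte = (uint8_t)((*byte & ~(0x3u << shift)) |
			  ((unsigned int)st << shift));
}
```

In this sketch an unlink would move a block to DEMO_UNLINKED, and the last close would later move it to DEMO_FREE. That matches the division of work the commit describes, where the last process to close the file deallocates the storage.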
     

24 May, 2006

1 commit


19 May, 2006

3 commits


21 Apr, 2006

1 commit


18 Apr, 2006

1 commit

  • When allocating memory to sort directory entries, use vmalloc()
    rather than kmalloc() since for larger directories, the required
    size can easily be greater than the 128k maximum of kmalloc().

    Also added the first steps towards handling the AOP_TRUNCATED_PAGE
    return code in the glock code by flagging all places where we
    request a glock while we are holding a page lock.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

07 Apr, 2006

1 commit

  • This fixes a ref count bug that sometimes showed up at umount time
    (causing it to hang) but is otherwise mostly harmless. At the same
    time there are some clean ups including making the log operations
    structures const and moving a memory allocation so that it's not done
    in the fast path of checking to see if there is an outstanding
    transaction related to a particular glock.

    Removes the sd_log_wrap variable which was updated, but never actually
    used anywhere. Updates the gfs2 ioctl() to run without the kernel lock
    (which it never needed anyway). Removes the "invalidate inodes" loop
    from GFS2's put_super routine. This is done in kill super anyway so
    we don't need to do it here. The loop was also bogus in that if there
    are any inodes "stuck" at this point it's a bug and we need to know
    about it rather than hide it by hanging forever.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

28 Feb, 2006

2 commits

  • As suggested by Pekka Enberg.

    The DIV_RU macro is renamed DIV_ROUND_UP and moved to kernel.h.
    The other macros are gone from gfs2.h, as (although not requested
    by Pekka Enberg) are a number of included header files which are now
    included individually. The inode number comparison function is
    now an inline function.

    The DT2IF and IF2DT may be addressed in a future patch.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
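
The commit above renames DIV_RU to DIV_ROUND_UP without showing the macro body. The conventional form of such a macro is sketched below for positive integer operands. The `demo_blocks_needed` helper is a hypothetical name added only to show a typical use.

```c
/* Integer division that rounds up instead of truncating (d must be > 0). */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical helper: number of whole blocks needed to hold `bytes`. */
static inline unsigned long demo_blocks_needed(unsigned long bytes,
					       unsigned long block_size)
{
	return DIV_ROUND_UP(bytes, block_size);
}
```

For example, 4097 bytes need two 4096-byte blocks, whereas plain integer division would report one. Note that the macro evaluates `d` twice, so arguments with side effects should be avoided.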
     
  • Requested by:
    Prarit Bhargava

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

23 Feb, 2006

1 commit


21 Feb, 2006

1 commit


08 Feb, 2006

1 commit

  • This is a very large patch, with a few still to be resolved issues
    so you might want to check out the previous head of the tree since
    this is known to be unstable. Fixes for the various bugs will be
    forthcoming shortly.

    This patch removes the special data format which has been used
    up till now for journaled data files. Directories still retain the
    old format so that they will remain on disk compatible with earlier
    releases. As a result you can now do the following with journaled
    data files:

    1) mmap them
    2) export them over NFS
    3) convert to/from normal files whenever you want to (the zero length
    restriction is gone)

    In addition the level at which GFS' locking is done has changed for all
    files (since they all now use the page cache) such that the locking is
    done at the page cache level rather than the level of the fs operations.
    This should mean that things like loopback mounts and other things which
    touch the page cache directly should now work.

    Current known issues:

    1. There is a lock mode inversion problem related to the resource
    group hold function which needs to be resolved.
    2. Any significant amount of I/O causes an oops with an offset of hex 320
    (NULL pointer dereference) which appears to be related to a journaled data
    buffer appearing on a list where it shouldn't be.
    3. Direct I/O writes are disabled for the time being (will reappear later)
    4. There is probably a deadlock between the page lock and GFS' locks under
    certain combinations of mmap and fs operation I/O.
    5. Issue relating to ref counting on internally used inodes causes a hang
    on umount (discovered before this patch, and not fixed by it)
    6. One part of the directory metadata is different from GFS1 and will need
    to be resolved before next release.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

18 Jan, 2006

3 commits


17 Jan, 2006

1 commit