14 May, 2014

1 commit

  • GFS2 has a transaction glock, which must be grabbed for every
    transaction, whose purpose is to deal with freezing the filesystem.
    Aside from this involving a large amount of locking, it is very easy to
    make the current fsfreeze code hang on unfreezing.

    This patch rewrites how gfs2 handles freezing the filesystem. The
    transaction glock is removed. In it's place is a freeze glock, which is
    cached (but not held) in a shared state by every node in the cluster
    when the filesystem is mounted. This lock only needs to be grabbed on
    freezing, and actions which need to be safe from freezing, like
    recovery.

    When a node wants to freeze the filesystem, it grabs this glock
    exclusively. When the freeze glock state changes on the nodes (either
    from shared to unlocked, or shared to exclusive), the filesystem does a
    special log flush. gfs2_log_flush() does all the work for flushing out
    the and shutting down the incore log, and then it tries to grab the
    freeze glock in a shared state again. Since the filesystem is stuck in
    gfs2_log_flush, no new transaction can start, and nothing can be written
    to disk. Unfreezing the filesytem simply involes dropping the freeze
    glock, allowing gfs2_log_flush() to grab and then release the shared
    lock, so it is cached for next time.

    However, in order for the unfreezing ioctl to occur, gfs2 needs to get a
    shared lock on the filesystem root directory inode to check permissions.
    If that glock has already been grabbed exclusively, fsfreeze will be
    unable to get the shared lock and unfreeze the filesystem.

    In order to allow the unfreeze, this patch makes gfs2 grab a shared lock
    on the filesystem root directory during the freeze, and hold it until it
    unfreezes the filesystem. The functions which need to grab a shared
    lock in order to allow the unfreeze ioctl to be issued now use the lock
    grabbed by the freeze code instead.

    The freeze and unfreeze code take care to make sure that this shared
    lock will not be dropped while another process is using it.

    Signed-off-by: Benjamin Marzinski
    Signed-off-by: Steven Whitehouse

    Benjamin Marzinski
     

19 Jun, 2013

1 commit

  • This patch looks at all the outstanding blocks in all the transactions
    on the log, and moves the completed ones to the ail2 list. Then it
    issues revokes for these blocks. This will hopefully speed things up
    in situations where there is a lot of contention for glocks, especially
    if they are acquired serially.

    revoke_lo_before_commit will issue at most one log block's full of these
    preemptive revokes. The amount of reserved log space that
    gfs2_log_reserve() ignores has been incremented to allow for this extra
    block.

    This patch also consolidates the common revoke instructions into one
    function, gfs2_add_revoke().

    Signed-off-by: Benjamin Marzinski
    Signed-off-by: Steven Whitehouse

    Benjamin Marzinski
     

29 Jan, 2013

1 commit

  • Instead of using a list of buffers to write ahead of the journal
    flush, this now uses a list of inodes and calls ->writepages
    via filemap_fdatawrite() in order to achieve the same thing. For
    most use cases this results in a shorter ordered write list,
    as well as much larger i/os being issued.

    The ordered write list is sorted by inode number before writing
    in order to retain the disk block ordering between inodes as
    per the previous code.

    The previous ordered write code used to conflict in its assumptions
    about how to write out the disk blocks with mpage_writepages()
    so that with this updated version we can also use mpage_writepages()
    for GFS2's ordered write, writepages implementation. So we will
    also send larger i/os from writeback too.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

24 Apr, 2012

1 commit

  • Prior to this patch, we have two ways of sending i/o to the log.
    One of those is used when we need to allocate both the data
    to be written itself and also a buffer head to submit it. This
    is done via sb_getblk and friends. This is used mostly for writing
    log headers.

    The other method is used when writing blocks which have some
    in-place counterpart. This is the case for all the metadata
    blocks which are journalled, and when journaled data is in use,
    for unescaped journalled data blocks.

    This patch replaces both of those two methods, and about half
    a dozen separate i/o submission points with a single i/o
    submission function. We also go direct to bio rather than
    using buffer heads, since this allows us to build i/o
    requests of the maximum size for the block device in
    question. It also reduces the memory required for flushing
    the log, which can be very useful in low memory situations.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

29 Feb, 2012

1 commit


20 Apr, 2011

1 commit

  • This patch adds writeback_control to writing back the AIL
    list. This means that we can then take advantage of the
    information we get in ->write_inode() in order to set off
    some pre-emptive writeback.

    In addition, the AIL code is cleaned up a bit to make it
    a bit simpler to understand.

    There is still more which can usefully be done in this area,
    but this is a good start at least.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

21 May, 2010

1 commit

  • The previous patch I wrote for reclaiming unlinked dinodes
    had some shortcomings and did not prevent all hangs.
    This version is much cleaner and more logical, and has
    passed very difficult testing. Sorry for the churn.

    Signed-off-by: Bob Peterson
    Signed-off-by: Steven Whitehouse

    Bob Peterson
     

05 May, 2010

1 commit

  • This patch contains various tweaks to how log flushes and active item writeback
    work. gfs2_logd is now managed by a waitqueue, and gfs2_log_reseve now waits
    for gfs2_logd to do the log flushing. Multiple functions were rewritten to
    remove the need to call gfs2_log_lock(). Instead of using one test to see if
    gfs2_logd had work to do, there are now seperate tests to check if there
    are two many buffers in the incore log or if there are two many items on the
    active items list.

    This patch is a port of a patch Steve Whitehouse wrote about a year ago, with
    some minor changes. Since gfs2_ail1_start always submits all the active items,
    it no longer needs to keep track of the first ai submitted, so this has been
    removed. In gfs2_log_reserve(), the order of the calls to
    prepare_to_wait_exclusive() and wake_up() when firing off the logd thread has
    been switched. If it called wake_up first there was a small window for a race,
    where logd could run and return before gfs2_log_reserve was ready to get woken
    up. If gfs2_logd ran, but did not free up enough blocks, gfs2_log_reserve()
    would be left waiting for gfs2_logd to eventualy run because it timed out.
    Finally, gt_logd_secs, which controls how long to wait before gfs2_logd times
    out, and flushes the log, can now be set on mount with ar_commit.

    Signed-off-by: Benjamin Marzinski
    Signed-off-by: Steven Whitehouse

    Benjamin Marzinski
     

27 Jun, 2008

1 commit


25 Jan, 2008

3 commits

  • This means that we can mark gfs2_ail1_empty static and prepares
    the way for further changes.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • The only reason for adding glocks to the journal was to keep track
    of which locks required a log flush prior to release. We add a
    flag to the glock to allow this check to be made in a simpler way.

    This reduces the size of a glock (by 12 bytes on i386, 24 on x86_64)
    and means that we can avoid extra work during the journal flush.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • The i_cache was designed to keep references to the indirect blocks
    used during block mapping so that they didn't have to be looked
    up continually. The idea failed because there are too many places
    where the i_cache needs to be freed, and this has in the past been
    the cause of many bugs.

    In addition there was no performance benefit being gained since the
    disk blocks in question were cached anyway. So this patch removes
    it in order to simplify the code to prepare for other changes which
    would otherwise have had to add further support for this feature.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

10 Oct, 2007

2 commits

  • This patch cleans up the code for writing journaled data into the log.
    It also removes the need to allocate a small "tag" structure for each
    block written into the log. Instead we just keep count of the outstanding
    I/O so that we can be sure that its all been written at the correct time.
    Another result of this patch is that a number of ll_rw_block() calls
    have become submit_bh() calls, closing some races at the same time.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • This collects together the operations required to remove a gfs2_bufdata
    from the ail lists. Its only called from two places to start with, but
    expect to see more of this function in future.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

30 Nov, 2006

1 commit


05 Sep, 2006

1 commit


01 Sep, 2006

1 commit

  • As per comments from Jan Engelhardt this
    updates the copyright message to say "version" in full rather than
    "v.2". Also incore.h has been updated to remove forward structure
    declarations which are not required.

    The gfs2_quota_lvb structure has now had endianess annotations added
    to it. Also quota.c has been updated so that we now store the
    lvb data locally in endian independant format to avoid needing
    a structure in host endianess too. As a result the endianess
    conversions are done as required at various points and thus the
    conversion routines in lvb.[ch] are no longer required. I've
    moved the one remaining constant in lvb.h thats used into lm.h
    and removed the unused lvb.[ch].

    I have not changed the HIF_ constants. That is left to a later patch
    which I hope will unify the gh_flags and gh_iflags fields of the
    struct gfs2_holder.

    Cc: Jan Engelhardt
    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

19 May, 2006

1 commit


07 Apr, 2006

1 commit

  • This fixes a ref count bug that sometimes showed up a umount time
    (causing it to hang) but it otherwise mostly harmless. At the same
    time there are some clean ups including making the log operations
    structures const, moving a memory allocation so that its not done
    in the fast path of checking to see if there is an outstanding
    transaction related to a particular glock.

    Removes the sd_log_wrap varaible which was updated, but never actually
    used anywhere. Updates the gfs2 ioctl() to run without the kernel lock
    (which it never needed anyway). Removes the "invalidate inodes" loop
    from GFS2's put_super routine. This is done in kill super anyway so
    we don't need to do it here. The loop was also bogus in that if there
    are any inodes "stuck" at this point its a bug and we need to know
    about it rather than hide it by hanging forever.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

21 Feb, 2006

1 commit


17 Jan, 2006

1 commit