10 Jul, 2008

1 commit

  • This patch removes the "recent list" which is used during allocation
    and replaces it with the (already existing) mru list used during
    deletion. The "recent list" was not a true mru list, leading to a number
    of inefficiencies including a "next" function which made scanning the
    list an order N^2 operation with respect to the number of list elements.

    This should increase allocation performance with large numbers of rgrps.
    It's also a useful preparation and cleanup before some further changes
    which are planned in this area.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
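
    As an illustration of the idea in the commit above, here is a minimal
    user-space sketch of a true MRU list, assuming a circular doubly linked
    list. All names (rgrp_sketch, mru_find and so on) are hypothetical and
    are not the GFS2 structures or functions. A lookup is a single O(N)
    pass, and the entry that was used moves to the front.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a resource group; not the GFS2 structure. */
struct rgrp_sketch {
    unsigned int free_blocks;
    struct rgrp_sketch *prev, *next;   /* circular doubly linked list links */
};

/* An empty list is a head node that points at itself. */
static void mru_init(struct rgrp_sketch *head)
{
    head->prev = head->next = head;
}

/* Unlink a node from the list it is currently on. */
static void mru_unlink(struct rgrp_sketch *n)
{
    assert(n->prev != NULL && n->next != NULL);
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

/* Insert directly after the head, i.e. as the most recently used entry. */
static void mru_add_front(struct rgrp_sketch *head, struct rgrp_sketch *n)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

/*
 * Scan once from most to least recently used and return the first entry
 * with at least `want` free blocks, moving it to the front. Returns NULL
 * if no entry qualifies.
 */
static struct rgrp_sketch *mru_find(struct rgrp_sketch *head, unsigned int want)
{
    struct rgrp_sketch *n;

    for (n = head->next; n != head; n = n->next) {
        if (n->free_blocks >= want) {
            mru_unlink(n);
            mru_add_front(head, n);
            return n;
        }
    }
    return NULL;
}
```

    Because each step of the scan follows one pointer, walking the whole
    list costs time proportional to its length rather than its square.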
     

07 Jul, 2008

2 commits

  • We already allow local SH locks while we hold a cached EX glock, so here
    we allow DF locks as well. This works only because we rely on the VFS's
    invalidation for locally cached data, and because if we hold an EX lock,
    then we know that no other node can be caching data relating to this
    file.

    It dramatically speeds up initial writes to O_DIRECT files since we fall
    back to buffered I/O for this and would otherwise bounce between DF and
    EX modes on each and every write call. The lessons to be learned from
    that are to ensure that (for the time being anyway) O_DIRECT files are
    preallocated and that they are written to using reasonably large I/O
    sizes. Even so, this change fixes that corner case nicely.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • There is a race in the delayed demote code where it does the wrong thing
    if a demotion to UN has occurred for other reasons before the delay has
    expired. This patch adds an assert to catch that condition as well as
    fixing the root cause by adding an additional check for the UN state.

    Signed-off-by: Steven Whitehouse
    Cc: Bob Peterson

    Steven Whitehouse
     

03 Jul, 2008

1 commit

  • GFS2 calls permission() to verify permissions after locks on the files
    have been taken.

    For this it's sufficient to call gfs2_permission() instead. This
    results in the following changes:

    - IS_RDONLY() check is not performed
    - IS_IMMUTABLE() check is not performed
    - devcgroup_inode_permission() is not called
    - security_inode_permission() is not called

    IS_RDONLY() should be unnecessary anyway, as the per-mount read-only
    flag should provide protection against read-only remounts during
    operations. do_gfs2_set_flags() has been fixed to perform
    mnt_want_write()/mnt_drop_write() to protect against remounting
    read-only.

    IS_IMMUTABLE has been added to gfs2_permission()

    Repeating the security checks seems to be pointless, as they don't
    normally change, and if they do, it's independent of the filesystem
    state.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Steven Whitehouse

    Miklos Szeredi
     

27 Jun, 2008

12 commits

  • Two lines missed from the previous patch.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • This patch adds a file describing the internals of GFS2's glock
    abstraction.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • I discovered that we had a list onto which every lock_dlm
    lock was being put. Its only function was to discover whether
    we'd got any locks left after umount. Since there was already
    a counter for that purpose as well, I removed the list. The
    saving is sizeof(struct list_head) per glock - well worth
    having.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
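
    To put a number on the saving mentioned in the commit above, the sketch
    below assumes the usual two-pointer layout of a list head. The names
    list_head_sketch and saving_bytes are hypothetical and exist only for
    this illustration.

```c
#include <stddef.h>

/* Assumed layout: a list head is a pair of pointers (next and prev). */
struct list_head_sketch {
    struct list_head_sketch *next, *prev;
};

/* Bytes saved when one list head is dropped from each of nr_glocks glocks. */
static size_t saving_bytes(size_t nr_glocks)
{
    return nr_glocks * sizeof(struct list_head_sketch);
}
```

    On a platform with 8-byte pointers this is 16 bytes per glock, so the
    total grows linearly with the number of cached glocks.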
     
  • This is only used by GFS1 so can be removed.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • There are several reasons why this is undesirable:

    1. It never happens during normal operation anyway
    2. If it does happen it causes performance to be very, very poor
    3. It isn't likely to solve the original problem (memory shortage
    on remote DLM node) it was supposed to solve
    4. It uses a bunch of arbitrary constants which are unlikely to be
    correct for any particular situation and for which the tuning seems
    to be a black art.
    5. In an N node cluster, only 1/N of the dropped locks will actually
    contribute to solving the problem on average.

    So all in all we are better off without it. This also makes merging
    the lock_dlm module into GFS2 a bit easier.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • This patch fixes Red Hat bugzilla bug 450156.

    This started with a not-too-improbable mount failure because the
    locking protocol was never set back to its proper "lock_dlm" after the
    system was rebooted in the middle of a gfs2_fsck. That left a
    (purposely) invalid locking protocol in the superblock, which caused an
    error when the file system was mounted the next time.

    When there's an error mounting, the VFS calls DQUOT_OFF, which calls
    vfs_quota_off, which calls gfs2_sync_fs. Next, gfs2_sync_fs calls
    gfs2_log_flush passing s_fs_info. But due to the error, s_fs_info
    had been previously set to NULL, and so we have the kernel oops.

    My solution in this patch is to test for the NULL value before passing
    it. I tested this patch and it fixes the problem.

    Signed-off-by: Bob Peterson
    Signed-off-by: Steven Whitehouse

    Bob Peterson
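
    The guard described in the commit above can be sketched in user space
    as follows. This is not the actual patch: the types and functions
    (super_sketch, sbd_sketch, sync_fs_sketch, log_flush_sketch) are
    hypothetical stand-ins, and only the s_fs_info field name is taken
    from the commit message.

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-filesystem private data. */
struct sbd_sketch {
    int flushes;                     /* counts how often the log was flushed */
};

/* Hypothetical stand-in for a superblock carrying an s_fs_info pointer. */
struct super_sketch {
    struct sbd_sketch *s_fs_info;    /* NULL after a failed mount */
};

/* Stand-in for the log flush; it dereferences its argument. */
static void log_flush_sketch(struct sbd_sketch *sdp)
{
    sdp->flushes++;
}

/* Only flush when the private data exists; otherwise there is nothing to do. */
static int sync_fs_sketch(struct super_sketch *sb)
{
    if (sb->s_fs_info != NULL)
        log_flush_sketch(sb->s_fs_info);
    return 0;
}
```

    Without the NULL test, the failed-mount path would pass a NULL pointer
    to the flush routine, which is the oops the commit describes.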
     
  • The previous attempt to fix the locking in readpage failed due
    to the use of a "try lock" which resulted in occasional high
    cpu usage during testing (due to repeated tries) and also it
    did not resolve all the ordering problems wrt the transaction
    lock (although it did solve all the inode lock ordering problems).

    This patch avoids the problem by unlocking the page and getting the
    locks in the correct order. This means that we have to retest the
    page to ensure that it hasn't changed when we relock the page.

    This now passes the tests which were previously failing.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • The patch to remove lock_nolock managed to get the arguments
    of this list_add backwards. This fixes it.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
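
    For reference, the kernel's list_add() takes the new entry first and
    the list head second. The sketch below is a user-space
    re-implementation that mirrors that argument order; the _sketch names
    are hypothetical, and this is not the code touched by the commit above.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list node, modelled on struct list_head. */
struct list_head_sketch {
    struct list_head_sketch *next, *prev;
};

/* An empty list is a head that points at itself. */
static void init_list_head_sketch(struct list_head_sketch *head)
{
    head->next = head->prev = head;
}

/*
 * Insert `entry` directly after `head`. The first argument is the node
 * being added and the second is the list it joins. Swapping them would
 * dereference the links of a node that was never initialised.
 */
static void list_add_sketch(struct list_head_sketch *entry,
                            struct list_head_sketch *head)
{
    assert(head->next != NULL);
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}
```

    Both arguments have the same type, so the compiler cannot catch a
    swapped call, which makes this kind of mistake easy to miss in review.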
     
  • Annotate the &sdp->sd_log_lock.

    Signed-off-by: Harvey Harrison
    Signed-off-by: Steven Whitehouse

    Harvey Harrison
     
  • This patch merges the lock_nolock module into GFS2 itself. As well as removing
    some of the overhead of the module, it also means that it's now impossible to
    build GFS2 without a lock module (which would be a pointless thing to do
    anyway).

    We also plan to merge lock_dlm into GFS2 in the future, but that is a more
    tricky task, and will therefore be a separate patch.

    Signed-off-by: Steven Whitehouse
    Cc: David Teigland

    Steven Whitehouse
     
  • This looks like a lot of change, but in fact it's not. Mostly it's
    things moving from one file to another. The change is just that
    instead of queuing lock completions and callbacks from the DLM
    we now pass them directly to GFS2.

    This gives us a net loss of two list heads per glock (a fair
    saving in memory) plus a reduction in the latency of delivering
    the messages to GFS2, plus we now have one thread fewer as well.
    There was a bug where callbacks and completions could be delivered
    in the wrong order due to this unnecessary queuing which is fixed
    by this patch.

    Signed-off-by: Steven Whitehouse
    Cc: Bob Peterson

    Steven Whitehouse
     
  • This patch implements a number of cleanups to the core of the
    GFS2 glock code. As a result a lot of code is removed. It looks
    like a really big change, but actually a large part of this patch
    is either removing or moving existing code.

    There are some new bits too though, such as the new run_queue()
    function which is considerably streamlined. Highlights of this
    patch include:

    o Fixes a cluster coherency bug during SH -> EX lock conversions
    o Removes the "glmutex" code in favour of a single bit lock
    o Removes the ->go_xmote_bh() for inodes since it was duplicating
    ->go_lock()
    o We now only use the ->lm_lock() function for both locks and
    unlocks (i.e. unlock is a lock with target mode LM_ST_UNLOCKED)
    o The fast path is considerably shorter, giving performance gains
    especially with lock_nolock
    o The glock_workqueue is now used for all the callbacks from the DLM
    which allows us to simplify the lock_dlm module (see following patch)
    o The way is now open to make further changes such as eliminating the two
    threads (gfs2_glockd and gfs2_scand) in favour of a more efficient
    scheme.

    This patch has undergone extensive testing with various test suites
    so it should be pretty stable by now.

    Signed-off-by: Steven Whitehouse
    Cc: Bob Peterson

    Steven Whitehouse
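
    The highlight about using ->lm_lock() for both locks and unlocks can be
    pictured with the following sketch, in which an unlock is simply a lock
    request whose target state is "unlocked". The enum, struct and function
    names are hypothetical and do not reproduce the GFS2 lock module
    interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lock states; only the idea of an "unlocked" target is from the text. */
enum sk_state { SK_UNLOCKED, SK_SHARED, SK_DEFERRED, SK_EXCLUSIVE };

struct sk_glock {
    enum sk_state state;       /* current state of the lock */
    unsigned int requests;     /* number of requests sent through sk_lm_lock() */
};

/* Single entry point: move the lock to the requested target state. */
static enum sk_state sk_lm_lock(struct sk_glock *gl, enum sk_state target)
{
    assert(gl != NULL);
    gl->requests++;
    gl->state = target;
    return gl->state;
}

/* No separate unlock path: unlocking is a lock request for SK_UNLOCKED. */
static enum sk_state sk_unlock(struct sk_glock *gl)
{
    return sk_lm_lock(gl, SK_UNLOCKED);
}
```

    Routing both operations through one function means that there is only
    one state-change path to maintain and test.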
     

25 Jun, 2008

17 commits


24 Jun, 2008

7 commits