04 Sep, 2015

2 commits

  • None of these statistics can meaningfully be negative, and the
    numerator for do_div() must have the type u64. The generic
    implementation of do_div() used on some 32-bit architectures asserts
    that, resulting in a compiler error in gfs2_rgrp_congested().
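
    As a rough illustration of why the dividend type matters, here is a
    user-space sketch. demo_do_div() and demo_average() are hypothetical
    stand-ins, not the kernel macro: like do_div(), the helper divides a
    64-bit value in place and returns the remainder, and because it takes
    a uint64_t pointer, passing the address of a signed 64-bit variable
    is diagnosed by the compiler.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for do_div(): divides *n in place by base and
 * returns the remainder. The uint64_t pointer parameter is what makes a
 * signed dividend show up as a type mismatch at compile time. */
static inline uint32_t demo_do_div(uint64_t *n, uint32_t base)
{
    assert(base != 0);
    uint32_t rem = (uint32_t)(*n % base);
    *n /= base;
    return rem;
}

/* Hypothetical helper: average of an accumulated, never-negative total. */
static uint64_t demo_average(uint64_t total, uint32_t count)
{
    if (count == 0)
        return 0;
    demo_do_div(&total, count);
    return total;
}
```

    Declaring the accumulated statistics as unsigned 64-bit values, as
    this commit does, lets such a call compile cleanly on every
    architecture.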

    Fixes: 0166b197c2ed ("GFS2: Average in only non-zero round-trip times ...")

    Signed-off-by: Ben Hutchings
    Signed-off-by: Bob Peterson
    Acked-by: Andreas Gruenbacher

    Ben Hutchings
     
  • What uniquely identifies a glock in the glock hash table is not
    gl_name, but gl_name and its superblock pointer. This patch makes
    the gl_name field correspond to a unique glock identifier. That will
    allow us to simplify hashing with a future patch, since the hash
    algorithm can then take the gl_name and hash its components in one
    operation.
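
    The idea can be sketched in user space as follows. All names here
    (demo_lockname, demo_name_hash, demo_name_equal) are hypothetical,
    and FNV-1a is used only as a stand-in hash; the point is that once
    the superblock pointer lives inside the name, both the comparison
    and the hash work on the name alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical glock name that carries its superblock pointer. */
struct demo_lockname {
    const void *sbd;      /* owning superblock (opaque in this sketch) */
    uint64_t number;      /* lock number */
    unsigned int type;    /* lock type */
};

/* FNV-1a over a byte range, continuing from hash value h. */
static uint64_t demo_fnv1a(uint64_t h, const void *data, size_t len)
{
    const unsigned char *p = data;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 1099511628211ULL;
    }
    return h;
}

/* Hash every component of the name; fields are fed in one by one so
 * that struct padding never influences the result. */
static uint64_t demo_name_hash(const struct demo_lockname *n)
{
    uint64_t h = 14695981039346656037ULL;
    assert(n != NULL);
    h = demo_fnv1a(h, &n->sbd, sizeof(n->sbd));
    h = demo_fnv1a(h, &n->number, sizeof(n->number));
    h = demo_fnv1a(h, &n->type, sizeof(n->type));
    return h;
}

/* Two names are the same glock only if all three components match. */
static bool demo_name_equal(const struct demo_lockname *a,
                            const struct demo_lockname *b)
{
    return a->sbd == b->sbd && a->number == b->number &&
           a->type == b->type;
}
```

    With this layout, two glocks that share a number and type but belong
    to different filesystems compare as different names.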

    Signed-off-by: Bob Peterson
    Signed-off-by: Andreas Gruenbacher
    Acked-by: Steven Whitehouse

    Bob Peterson
     

10 Apr, 2013

1 commit

  • This adds the origin indicator to the trace point for glock
    demotion, so that it is possible to see where demote requests
    have come from.

    Note that requests generated from the demote_rq sysfs interface
    will show as remote, since they are intended to replicate
    exactly the effect of a demote request from a remote node. It
    is still possible to tell these apart by looking at the process
    which initiated the demote request.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

16 Nov, 2012

1 commit


24 Sep, 2012

2 commits

  • This patch improves the tracing of block reservations by
    removing some corner cases and also providing more useful
    detail in the traces.

    A new field is added to the reservation structure to contain
    the inode number. This is used since in certain contexts it is
    not possible to access the inode itself to obtain this information.
    As a result we can then display the inode number for all tracepoints
    and also in case we dump the resource group.

    The "del" tracepoint operation has been removed. This could be called
    with the reservation rgrp set to NULL. That resulted in not printing
    the device number, and thus making the information largely useless
    anyway. Also, the conditional on the rgrp being NULL can then be
    removed from the tracepoint. After this change, all the block
    reservation tracepoint calls will be called with the rgrp information.

    The existing ins, clm and tdel calls to the block reservation tracepoint
    are sufficient to track the entire life of the block reservation.

    In gfs2_block_alloc() the error detection is updated to print out
    the inode number of the problematic inode. This can then be compared
    against the information in the glock dump, tracepoints, etc.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • This patch introduces a new structure, gfs2_rbm, which is a
    tuple of a resource group, a bitmap within the resource group
    and an offset within that bitmap. This is designed to make
    manipulating these sets of variables easier. There is also a
    new helper function which converts this representation back
    to a disk block address.
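
    A minimal sketch of such a tuple and its conversion helper is shown
    below. The demo_* names and fields are hypothetical, and the sketch
    assumes that each bitmap byte describes four blocks (two bits per
    block) and that a bitmap records its starting byte offset within the
    resource group's bitmap data.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_BLOCKS_PER_BYTE 4   /* assumption: 2 bits of state per block */

/* Hypothetical resource group: only the first data block is needed. */
struct demo_rgrp {
    uint64_t data0;              /* disk address of the first data block */
};

/* Hypothetical bitmap: where its bytes start within the rgrp's bitmaps. */
struct demo_bitmap {
    uint32_t start;              /* byte offset of this bitmap */
};

/* The (resource group, bitmap, offset) tuple described above. */
struct demo_rbm {
    const struct demo_rgrp *rgd;
    const struct demo_bitmap *bi;
    uint32_t offset;             /* block offset within this bitmap */
};

/* Convert the tuple back to an absolute disk block address. */
static uint64_t demo_rbm_to_block(const struct demo_rbm *rbm)
{
    assert(rbm && rbm->rgd && rbm->bi);
    return rbm->rgd->data0 +
           (uint64_t)rbm->bi->start * DEMO_BLOCKS_PER_BYTE +
           rbm->offset;
}
```

    For example, with data0 = 1000, a bitmap starting at byte 16 and an
    offset of 5, the helper returns 1000 + 16 * 4 + 5 = 1069.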

    In addition, the rbtree nodes which are used for the reservations
    were not being correctly initialised, which is now fixed. Also,
    the tracing was not passing through the inode where it should
    have been. That is mostly fixed aside from one corner case. This
    needs to be revisited since there can also be a NULL rgrp in
    some cases which results in the device being incorrect in the
    trace.

    This is intended to be the first step towards cleaning up some
    of the allocation code, and some further bug fixes.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

19 Jul, 2012

1 commit

  • This patch reduces GFS2 file fragmentation by pre-reserving blocks. The
    resulting improved on-disk layout greatly speeds up operations in cases
    which would have resulted in interlaced allocation of blocks previously.
    A typical example of this is 10 parallel dd processes, each writing to a
    file in a common directory.

    The implementation uses an rbtree of reservations attached to each
    resource group (and each inode).

    Signed-off-by: Bob Peterson
    Signed-off-by: Steven Whitehouse

    Bob Peterson
     

11 May, 2012

1 commit

  • This is a second attempt at a patch that adds rgrp information to the
    block allocation trace point for GFS2. As suggested, the patch was
    modified to list the rgrp information _after_ the fields that exist today.

    Again, the reason for this patch is to allow us to trace and debug
    problems with the block reservations patch, which is still in the works.
    We can debug problems with reservations if we can see what block allocations
    result from the block reservations. It may also be handy in figuring out
    if there are problems in rgrp free space accounting. In other words,
    we can use it to track the rgrp and its free space alongside the allocations
    that are taking place.

    Signed-off-by: Bob Peterson
    Signed-off-by: Steven Whitehouse

    Bob Peterson
     

29 Feb, 2012

1 commit

  • The stats are divided into two sets: those relating to the
    super block and those relating to an individual glock. The
    super block stats are done on a per cpu basis in order to
    try and reduce the overhead of gathering them. They are also
    further divided by glock type.

    In the case of both the super block and glock statistics,
    the same information is gathered in each case. The super
    block statistics are used to provide default values for
    most of the glock statistics, so that newly created glocks
    should have, as far as possible, a sensible starting point.

    The statistics are divided into three pairs of mean and
    variance, plus two counters. The mean/variance pairs are
    smoothed exponential estimates and the algorithm used is
    one which will be very familiar to those used to calculating
    round-trip times in network code.

    The three pairs of mean/variance measure the following
    things:

    1. DLM lock time (non-blocking requests)
    2. DLM lock time (blocking requests)
    3. Inter-request time (again to the DLM)

    A non-blocking request is one which will complete right
    away, whatever the state of the DLM lock in question. That
    currently means any request where (a) the current state of
    the lock is exclusive, (b) the requested state is either null
    or unlocked, or (c) the "try lock" flag is set. A blocking
    request covers all the other lock requests.
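
    The classification above can be written as a small predicate. This
    is a sketch only: the enum and function names are hypothetical and
    simply encode conditions (a) to (c) as stated.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock states for this sketch. */
enum demo_state {
    DEMO_NULL,
    DEMO_UNLOCKED,
    DEMO_SHARED,
    DEMO_DEFERRED,
    DEMO_EXCLUSIVE
};

/* A request is treated as non-blocking when it is expected to complete
 * right away: (a) the lock is currently held exclusive, (b) the request
 * drops to null or unlocked, or (c) it is a "try lock" request. */
static bool demo_is_nonblocking(enum demo_state current,
                                enum demo_state requested,
                                bool try_lock)
{
    return current == DEMO_EXCLUSIVE ||
           requested == DEMO_NULL ||
           requested == DEMO_UNLOCKED ||
           try_lock;
}
```

    Everything for which the predicate returns false would be counted
    in the blocking mean/variance pair.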

    There are two counters. The first is there primarily to show
    how many lock requests have been made, and thus how much data
    has gone into the mean/variance calculations. The other counter
    is counting queueing of holders at the top layer of the glock
    code. Hopefully that number will be a lot larger than the number
    of dlm lock requests issued.

    So why gather these statistics? There are several reasons
    we'd like to get a better idea of these timings:

    1. To be able to better set the glock "min hold time"
    2. To spot performance issues more easily
    3. To improve the algorithm for selecting resource groups for
    allocation (to base it on lock wait time, rather than blindly
    using a "try lock")

    Due to the smoothing action of the updates, a step change in
    some input quantity being sampled will only fully be taken
    into account after 8 samples (or 4 for the variance) and this
    needs to be carefully considered when interpreting the
    results.

    Knowing both the time it takes a lock request to complete and
    the average time between lock requests for a glock means we
    can compute the total percentage of the time for which the
    node is able to use a glock vs. time that the rest of the
    cluster has its share. That will be very useful when setting
    the lock min hold time.

    The other point to remember is that all times are in
    nanoseconds. Great care has been taken to ensure that we
    measure exactly the quantities that we want, as accurately
    as possible. There are always inaccuracies in any
    measuring system, but I hope this is as accurate as we
    can reasonably make it.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

20 Apr, 2011

2 commits

  • Add a tracepoint for monitoring writeback of the AIL.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • This adds support for two new flags. One keeps track of whether
    the glock is on the LRU list or not. The other isn't really a
    flag as such, but an indication of whether the glock has an
    attached object or not. This indication is reported without
    any locking, which is ok since we do not dereference the object
    pointer but merely report whether it is NULL or not.

    Also, this fixes one place where a tracepoint was missing, which
    was at the point we remove deallocated blocks from the journal.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

20 Sep, 2010

1 commit

  • Due to the design of the VFS, it is quite usual for operations on GFS2
    to consist of a lookup (requiring a shared lock) followed by an
    operation requiring an exclusive lock. If a remote node has cached an
    exclusive lock, then it will receive two demote events in rapid succession:
    first to a shared lock and then to unlocked. The existing min hold time
    code was triggering in this case, even if the node was otherwise idle
    since the state change time was being updated by the initial demote.

    This patch introduces logic to skip the min hold timer in the case that
    a "double demote" of this kind has occurred. The min hold timer will
    still be used in all other cases.

    A new glock flag is introduced which is used to keep track of whether
    there have been any newly queued holders since the last glock state
    change. The min hold time is only applied if the flag is set.

    Signed-off-by: Steven Whitehouse
    Tested-by: Abhijith Das

    Steven Whitehouse
     

13 Jul, 2009

1 commit

  • If TRACE_INCLUDE_FILE is defined,
    will be included and compiled, otherwise it will be

    So TRACE_SYSTEM should be defined outside of #if protection,
    just like TRACE_INCLUDE_FILE.

    Imagine this scenario:

    #include
    -> TRACE_SYSTEM == foo
    ...
    #include
    -> TRACE_SYSTEM == bar
    ...
    #define CREATE_TRACE_POINTS
    #include
    -> TRACE_SYSTEM == bar !!!

    and then bar.h will be included and compiled.
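
    The effect of the fix can be imitated in a single file. This is a
    sketch under assumptions: the DEMO_* guards and helper functions are
    hypothetical, and each commented section stands in for one inclusion
    of a trace header that sets TRACE_SYSTEM outside its include guard.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define DEMO_STR2(x) #x
#define DEMO_STR(x) DEMO_STR2(x)

/* First inclusion of the "foo" header. */
#undef TRACE_SYSTEM
#define TRACE_SYSTEM foo
#ifndef DEMO_FOO_H
#define DEMO_FOO_H
/* event definitions for foo would go here */
#endif

/* First inclusion of the "bar" header. */
#undef TRACE_SYSTEM
#define TRACE_SYSTEM bar
#ifndef DEMO_BAR_H
#define DEMO_BAR_H
/* event definitions for bar would go here */
#endif

/* Second inclusion of "foo" (the CREATE_TRACE_POINTS case). The guarded
 * body is skipped, but TRACE_SYSTEM is still reset because it sits
 * outside the guard. */
#undef TRACE_SYSTEM
#define TRACE_SYSTEM foo
#ifndef DEMO_FOO_H
#define DEMO_FOO_H
#endif

/* Name of the trace system that is in effect at this point. */
static const char *demo_current_system(void)
{
    return DEMO_STR(TRACE_SYSTEM);
}

/* Convenience check used to confirm which system is selected. */
static bool demo_system_is(const char *name)
{
    return strcmp(demo_current_system(), name) == 0;
}
```

    Had the definition been inside the guard, the second inclusion of
    foo would have left TRACE_SYSTEM set to bar, which is the failure
    described in the scenario above.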

    Signed-off-by: Li Zefan
    Cc: Steven Rostedt
    Cc: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Li Zefan
     

12 Jun, 2009

1 commit

  • This patch adds the ability to trace various aspects of the GFS2
    filesystem. The trace points are divided into three groups:
    glocks, logging and bmap. These points have been chosen because
    they allow inspection of the major internal functions of GFS2
    and they are also generic enough that they are unlikely to need
    any major changes as the filesystem evolves.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse