12 Mar, 2014

1 commit

  • This patch closes a small timing window in which a request to hold the
    transaction glock can get stuck. The problem is that after the DLM has
    granted the lock, GFS2 can get into a state where it doesn't transition
    the glock to a held state, having failed to requeue the glock state
    machine to finish the transition.
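
    As a sketch of the mechanics (a userspace model; all names are
    hypothetical rather than taken from the patch): the grant callback
    has to requeue the state machine, or nothing ever completes the
    transition.

    #include <stdbool.h>
    #include <stdio.h>

    enum gl_state { GL_UNLOCKED, GL_GRANTED, GL_HELD };

    struct glock {
            enum gl_state state;
            bool work_queued;       /* is the state machine scheduled? */
    };

    /* The state machine only makes progress when its work is queued. */
    static void glock_work(struct glock *gl)
    {
            if (gl->state == GL_GRANTED)
                    gl->state = GL_HELD;    /* finish the transition */
            gl->work_queued = false;
    }

    /* DLM grant callback. The bug was returning without requeuing the
     * work, leaving the glock stuck in GL_GRANTED forever. */
    static void grant_callback(struct glock *gl)
    {
            gl->state = GL_GRANTED;
            gl->work_queued = true;         /* the fix: requeue */
    }

    int main(void)
    {
            struct glock gl = { GL_UNLOCKED, false };

            grant_callback(&gl);
            while (gl.work_queued)
                    glock_work(&gl);
            printf("final state: %s\n", gl.state == GL_HELD ? "held" : "stuck");
            return 0;
    }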

    Signed-off-by: Bob Peterson
    Signed-off-by: Steven Whitehouse

    Bob Peterson
     

07 Mar, 2014

2 commits

  • Add pr_fmt, remove embedded "GFS2: " prefixes.
    This now consistently emits lower case "gfs2: " for each message.

    Other miscellanea around these changes:

    o Add missing newlines
    o Coalesce formats
    o Realign arguments
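
    As a rough userspace model of the pr_fmt convention adopted here
    (fprintf stands in for the kernel's logging; only the macro names
    are taken from the kernel, the rest is illustrative):

    #include <stdio.h>

    /* Defining pr_fmt once gives every pr_*() call a consistent
     * "gfs2: " prefix without embedding it in each format string. */
    #define pr_fmt(fmt) "gfs2: " fmt
    #define pr_warn(fmt, ...) fprintf(stderr, pr_fmt(fmt), ##__VA_ARGS__)

    int main(void)
    {
            pr_warn("fsid=%s: error recovering journal\n", "demo:fs1");
            return 0;
    }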

    Signed-off-by: Joe Perches
    Signed-off-by: Steven Whitehouse

    Joe Perches
     
  • - All printk(KERN_foo ...) calls converted to pr_foo().
    - Messages updated to fit in 80 columns.
    - fs_macros converted as well.
    - fs_printk removed.

    Signed-off-by: Fabian Frederick
    Signed-off-by: Steven Whitehouse

    Fabian Frederick
     

16 Jan, 2014

1 commit

  • Al Viro has tactfully pointed out that we are using the incorrect
    error code in some cases. This patch fixes that, and also removes
    the (unused) return value for glock dumping.

    > * gfs2_iget() - ENOBUFS instead of ENOMEM. ENOBUFS is
    > "No buffer space available (POSIX.1 (XSI STREAMS option))" and since
    > we don't support STREAMS it's probably fair game, but... what the hell?

    Signed-off-by: Steven Whitehouse
    Cc: Al Viro

    Steven Whitehouse
     

21 Nov, 2013

1 commit

  • Commit [e66cf1610: GFS2: Use lockref for glocks] replaced the call:
    atomic_read(&gi->gl->gl_ref) == 0
    with:
    __lockref_is_dead(&gl->gl_lockref)
    thereby changing how gl is accessed, from gi->gl to plain gl.
    However, gl can be a NULL pointer, and so gi->gl needs to be
    used instead (which is guaranteed not to be NULL because of
    the while loop checking that condition).
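
    A compact model of the bug (structures simplified; the real code
    uses struct gfs2_glock and a lockref): the local `gl` can be NULL
    here, while the loop condition guarantees `gi->gl` is not.

    #include <stdio.h>

    struct glock { int dead; };
    struct glock_iter { struct glock *gl; };

    /* Crashes if handed a NULL pointer, as the buggy code could be. */
    static int lockref_is_dead(const struct glock *gl)
    {
            return gl->dead;
    }

    int main(void)
    {
            struct glock g = { 0 };
            struct glock_iter gi = { &g };
            struct glock *gl = NULL;    /* the possibly-NULL local */

            (void)gl;  /* buggy form dereferenced it: lockref_is_dead(gl) */
            printf("dead=%d\n", lockref_is_dead(gi.gl));  /* the fix */
            return 0;
    }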

    Signed-off-by: Michal Nazarewicz
    Signed-off-by: Steven Whitehouse

    Michal Nazarewicz
     

15 Oct, 2013

1 commit

  • Currently glocks have an atomic reference count and also a spinlock
    which covers various internal fields, such as the state. The intent
    of this patch is to replace the spinlock and the atomic reference
    count with a lockref structure. This contains a spinlock which we can
    continue to use as before, and a reference counter which is used in
    conjunction
    with the spinlock to replace the previous atomic counter.

    As a result of this there are some new rules for reference counting on
    glocks. We need to distinguish between reference count changes under
    gl_spin (which are now just increment or decrement of the new counter,
    provided the count cannot hit zero) and those which are outside of
    gl_spin, but which now take gl_spin internally.

    The conversion is relatively straightforward. There is probably some
    further clean up which can be done, but the priority at this stage is to
    make the change in as simple a manner as possible.

    A consequence of this change is that the reference count is being
    decoupled from the lru list processing. This should allow future
    adoption of the lru_list code with glocks in due course.

    The reason for using the "dead" state and not just relying on 0 being
    the "invalid state" is so that in due course 0 ref counts can be
    allowable. The intent is to eventually be able to remove the ref count
    changes which are currently hidden away in state_change().
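
    A userspace sketch of the lockref idea (a pthread mutex standing in
    for gl_spin; the kernel's real lockref is more elaborate): one
    structure carries both the lock and the count, and a "get" can
    refuse to resurrect a count that has already hit zero.

    #include <pthread.h>
    #include <stdbool.h>

    struct lockref {
            pthread_mutex_t lock;   /* plays the part of gl_spin */
            unsigned int count;     /* replaces the old atomic gl_ref */
    };

    /* Take a reference, but never resurrect a zero count. Callers
     * already holding ->lock can simply do count++ directly. */
    static bool lockref_get_not_zero(struct lockref *lr)
    {
            bool ok = false;

            pthread_mutex_lock(&lr->lock);
            if (lr->count != 0) {
                    lr->count++;
                    ok = true;
            }
            pthread_mutex_unlock(&lr->lock);
            return ok;
    }

    int main(void)
    {
            struct lockref lr = { PTHREAD_MUTEX_INITIALIZER, 1 };

            return lockref_get_not_zero(&lr) ? 0 : 1;
    }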

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

11 Sep, 2013

2 commits

  • Convert the filesystem shrinkers to use the new API, and standardise some
    of the behaviours of the shrinkers at the same time. For example,
    nr_to_scan means the number of objects to scan, not the number of objects
    to free.

    I refactored the CIFS idmap shrinker a little - it really needs to be
    broken up into a shrinker per tree and keep an item count with the tree
    root so that we don't need to walk the tree every time the shrinker needs
    to count the number of objects in the tree (i.e. all the time under
    memory pressure).
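
    A minimal model of the new API's shape (userspace; the names echo
    the kernel's struct shrinker, but the types are simplified):
    counting and scanning are separate callbacks, and scan_objects
    returns how many objects were actually freed.

    #include <stdio.h>

    struct shrink_control { unsigned long nr_to_scan; };

    struct shrinker {
            unsigned long (*count_objects)(struct shrink_control *sc);
            unsigned long (*scan_objects)(struct shrink_control *sc);
    };

    static unsigned long cached = 1000;     /* stand-in object pool */

    static unsigned long demo_count(struct shrink_control *sc)
    {
            (void)sc;
            return cached;                  /* cheap: frees nothing */
    }

    static unsigned long demo_scan(struct shrink_control *sc)
    {
            unsigned long freed = sc->nr_to_scan < cached ?
                                  sc->nr_to_scan : cached;
            cached -= freed;
            return freed;                   /* objects freed this call */
    }

    int main(void)
    {
            struct shrinker s = { demo_count, demo_scan };
            struct shrink_control sc = { .nr_to_scan = 128 };
            unsigned long count = s.count_objects(&sc);
            unsigned long freed = s.scan_objects(&sc);

            printf("count=%lu freed=%lu left=%lu\n", count, freed, cached);
            return 0;
    }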

    [glommer@openvz.org: fixes for ext4, ubifs, nfs, cifs and glock. Fixes are needed mainly due to new code merged in the tree]
    [assorted fixes folded in]
    Signed-off-by: Dave Chinner
    Signed-off-by: Glauber Costa
    Acked-by: Mel Gorman
    Acked-by: Artem Bityutskiy
    Acked-by: Jan Kara
    Acked-by: Steven Whitehouse
    Cc: Adrian Hunter
    Cc: "Theodore Ts'o"
    Cc: Al Viro
    Cc: Artem Bityutskiy
    Cc: Arve Hjønnevåg
    Cc: Carlos Maiolino
    Cc: Christoph Hellwig
    Cc: Chuck Lever
    Cc: Daniel Vetter
    Cc: David Rientjes
    Cc: Gleb Natapov
    Cc: Greg Thelen
    Cc: J. Bruce Fields
    Cc: Jan Kara
    Cc: Jerome Glisse
    Cc: John Stultz
    Cc: KAMEZAWA Hiroyuki
    Cc: Kent Overstreet
    Cc: Kirill A. Shutemov
    Cc: Marcelo Tosatti
    Cc: Mel Gorman
    Cc: Steven Whitehouse
    Cc: Thomas Hellstrom
    Cc: Trond Myklebust
    Signed-off-by: Andrew Morton

    Signed-off-by: Al Viro

    Dave Chinner
     
  • The sysctl knob sysctl_vfs_cache_pressure is used to determine which
    percentage of the shrinkable objects in our cache we should actively try
    to shrink.

    It works great in situations in which we have many objects (at least more
    than 100), because the approximation errors will be negligible. But
    if this is not the case, especially when total_objects < 100, we may
    end up
    concluding that we have no objects at all (total / 100 = 0, if total <
    100).

    This is certainly not the biggest killer in the world, but may matter in
    very low kernel memory situations.
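
    The arithmetic at issue, as a short demo (the fix multiplies before
    dividing; the numbers here are illustrative):

    #include <stdio.h>

    int main(void)
    {
            unsigned long total = 42, pressure = 100;

            /* Dividing first truncates small counts to zero. */
            printf("divide first:   %lu\n", total / 100 * pressure);
            /* Multiplying first preserves them. */
            printf("multiply first: %lu\n", total * pressure / 100);
            return 0;
    }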

    Signed-off-by: Glauber Costa
    Reviewed-by: Carlos Maiolino
    Acked-by: KAMEZAWA Hiroyuki
    Acked-by: Mel Gorman
    Cc: Dave Chinner
    Cc: Al Viro
    Cc: "Theodore Ts'o"
    Cc: Adrian Hunter
    Cc: Artem Bityutskiy
    Cc: Arve Hjønnevåg
    Cc: Carlos Maiolino
    Cc: Christoph Hellwig
    Cc: Chuck Lever
    Cc: Daniel Vetter
    Cc: David Rientjes
    Cc: Gleb Natapov
    Cc: Greg Thelen
    Cc: J. Bruce Fields
    Cc: Jan Kara
    Cc: Jerome Glisse
    Cc: John Stultz
    Cc: KAMEZAWA Hiroyuki
    Cc: Kent Overstreet
    Cc: Kirill A. Shutemov
    Cc: Marcelo Tosatti
    Cc: Mel Gorman
    Cc: Steven Whitehouse
    Cc: Thomas Hellstrom
    Cc: Trond Myklebust
    Signed-off-by: Andrew Morton
    Signed-off-by: Al Viro

    Glauber Costa
     

20 Aug, 2013

1 commit

  • We need to check the glock ref counter in a race-free way
    in order to ensure that the gfs2_glock_hold() call will
    succeed. The easiest way to do that is to simply take the
    reference count early in the common code of examine_bucket,
    skipping any glocks with zero ref count.

    That means that the examiner functions all need to put their
    reference on the glock once they've performed their function.
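
    The usual shape of such a race-free "take the reference early"
    helper, in userspace C11 (the kernel code differs in detail):

    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stdio.h>

    struct glock { atomic_int ref; };

    /* Take a reference only while the count is still non-zero; a zero
     * count means the glock is already dying and must be skipped. */
    static bool glock_tryget(struct glock *gl)
    {
            int old = atomic_load(&gl->ref);

            do {
                    if (old == 0)
                            return false;
            } while (!atomic_compare_exchange_weak(&gl->ref, &old, old + 1));
            return true;
    }

    int main(void)
    {
            struct glock live = { 1 }, dying = { 0 };

            printf("live: %d, dying: %d\n",
                   glock_tryget(&live), glock_tryget(&dying));
            return 0;
    }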

    Signed-off-by: Steven Whitehouse
    Reported-by: David Teigland
    Tested-by: David Teigland

    Steven Whitehouse
     

01 May, 2013

1 commit

  • Pull GFS2 updates from Steven Whitehouse:
    "There is not a whole lot of change this time - there are some further
    changes which are in the works, but those will be held over until next
    time.

    Here there are some clean ups to inode creation, the addition of an
    origin (local or remote) indicator to glock demote requests, removal
    of one of the remaining GFP_NOFAIL allocations during log flushes, one
    minor clean up, and a one liner bug fix."

    * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-nmw:
    GFS2: Flush work queue before clearing glock hash tables
    GFS2: Add origin indicator to glock demote tracing
    GFS2: Add origin indicator to glock callbacks
    GFS2: replace gfs2_ail structure with gfs2_trans
    GFS2: Remove vestigial parameter ip from function rs_deltree
    GFS2: Use gfs2_dinode_out() in the inode create path
    GFS2: Remove gfs2_refresh_inode from inode creation path
    GFS2: Clean up inode creation path

    Linus Torvalds
     

26 Apr, 2013

1 commit

  • There was a timing window when a GFS2 file system was unmounted
    that caused GFS2 to call BUG() and panic the kernel. The call
    to BUG() is meant to ensure that the glock reference count,
    gl_ref, never gets down to zero and bounces back up again. What was
    happening during umount is that function gfs2_put_super was dequeuing
    its glocks for well-known files. In particular, we saw it on the
    journal glock, sd_jinode_gh. The dequeue caused delayed work to be
    queued for the glock state machine, to transition the lock to an
    "unlocked" state. While the work was still queued, gfs2_put_super
    called gfs2_gl_hash_clear to clear out the glock hash tables.
    If the timing was just so, the glock work function would drop the
    reference count at the time when it was being checked for zero,
    and that caused BUG() to be called. This patch calls
    flush_workqueue before clearing the glock hash tables, thereby
    ensuring that the delayed work is executed before the hash tables
    are cleared, and therefore the reference count never goes to zero
    until the glock is cleared.
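
    A tiny pthread model of the ordering the fix enforces (pthread_join
    plays the role of flush_workqueue; everything else is invented):

    #include <assert.h>
    #include <pthread.h>

    static int gl_ref = 1;

    /* Stand-in for the queued glock work, which drops its reference. */
    static void *glock_work(void *arg)
    {
            (void)arg;
            gl_ref--;
            return NULL;
    }

    int main(void)
    {
            pthread_t worker;

            pthread_create(&worker, NULL, glock_work, NULL);
            /* The flush_workqueue() analogue: wait for all queued work
             * to finish (join also orders the memory accesses). */
            pthread_join(worker, NULL);
            assert(gl_ref == 0);    /* now safe to clear hash tables */
            return 0;
    }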

    Signed-off-by: Bob Peterson
    Signed-off-by: Steven Whitehouse

    Bob Peterson
     

10 Apr, 2013

2 commits

  • This adds the origin indicator to the trace point for glock
    demotion, so that it is possible to see where demote requests
    have come from.

    Note that requests generated from the demote_rq sysfs interface
    will show as remote, since they are intended to replicate
    exactly the effect of a demote request from a remote node. It
    is still possible to tell these apart by looking at the process
    which initiated the demote request.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • This patch adds a bool indicating whether the demote
    request was originated locally or remotely. This is then
    used by the iopen ->go_callback() to make 100% sure that
    it will only respond to remote callbacks.

    Since ->evict_inode() uses GL_NOCACHE when it attempts to
    get an exclusive lock on the iopen lock, this may result
    in extra scheduling of the workqueue if the exclusive
    promotion request fails. This patch prevents
    that from happening.
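
    In outline (simplified from the patch; the real callback also takes
    the glock itself):

    #include <stdbool.h>
    #include <stdio.h>

    /* Only remote demote requests should make the iopen callback do
     * any work; acting on local ones just schedules needless work. */
    static void iopen_go_callback(bool remote)
    {
            if (!remote)
                    return;
            puts("queueing work to drop the iopen glock");
    }

    int main(void)
    {
            iopen_go_callback(false);   /* local: ignored */
            iopen_go_callback(true);    /* remote: acted upon */
            return 0;
    }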

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

08 Apr, 2013

1 commit

  • The original method for creating inodes used in GFS2 was to fill
    out a buffer, with all the information, and then to read that
    buffer into the in-core inode, using gfs2_refresh_inode().

    The problem with this approach is that all the inode's fields
    needed to be calculated ahead of time and stored in various
    variables, making the code rather complicated.

    The new approach is simply to allocate the in-core inode earlier
    and fill in as many fields as possible ahead of time. These can
    then be used to initialise the on disk representation. The
    code has been working towards the point where it is possible
    to remove gfs2_refresh_inode() because all the fields are
    correctly initialised ahead of time. We've now reached that
    milestone, and have reversed the order of setting up the in
    core and on disk inodes.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

02 Feb, 2013

1 commit

  • The intent here is to split the processing of the glock lru
    list into two parts, so that the selection of glocks and the
    disposal are separate functions. The plan is that further updates
    can then be made to these functions in the future
    to improve the selection of glocks and also the efficiency of
    glock disposal.

    The new feature which this patch brings is sorting the
    glocks to be disposed of into glock number (and thus also
    disk block number) order. Not all glocks will need i/o in
    order to dispose of them, but some will, and at least we'll
    generate mostly disk block order i/o now.
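
    An illustration of the new disposal ordering (qsort stands in for
    the kernel's list_sort; the field name is modelled on the glock's
    ln_number, which equals the disk block number):

    #include <stdio.h>
    #include <stdlib.h>

    struct glock { unsigned long ln_number; };  /* == disk block no. */

    static int glock_cmp(const void *a, const void *b)
    {
            const struct glock *ga = a, *gb = b;

            if (ga->ln_number != gb->ln_number)
                    return ga->ln_number < gb->ln_number ? -1 : 1;
            return 0;
    }

    int main(void)
    {
            struct glock victims[] = { {37}, {5}, {19} };
            size_t i, n = sizeof(victims) / sizeof(victims[0]);

            /* Dispose in block-number order => mostly sequential I/O. */
            qsort(victims, n, sizeof(victims[0]), glock_cmp);
            for (i = 0; i < n; i++)
                    printf("%lu\n", victims[i].ln_number);
            return 0;
    }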

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

16 Dec, 2012

1 commit

  • Pull GFS2 updates from Steven Whitehouse:
    "The main feature this time is the new Orlov allocator and the patches
    leading up to it which allow us to allocate new inodes from their own
    allocation context, rather than borrowing that of their parent
    directory. It is this change which then allows us to choose a
    different location for subdirectories when required. This works
    exactly as per the ext3 implementation from the user's point of view.

    In addition to that, we've got a speed up in gfs2_rbm_from_block()
    from Bob Peterson, three locking related improvements from Dave
    Teigland plus a selection of smaller bug fixes and clean ups."

    * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-nmw:
    GFS2: Set gl_object during inode create
    GFS2: add error check while allocating new inodes
    GFS2: don't reference inode's glock during block allocation trace
    GFS2: remove redundant lvb pointer
    GFS2: only use lvb on glocks that need it
    GFS2: skip dlm_unlock calls in unmount
    GFS2: Fix one RG corner case
    GFS2: Eliminate redundant buffer_head manipulation in gfs2_unlink_inode
    GFS2: Use dirty_inode in gfs2_dir_add
    GFS2: Fix truncation of journaled data files
    GFS2: Add Orlov allocator
    GFS2: Use proper allocation context for new inodes
    GFS2: Add test for resource group congestion status
    GFS2: Rename glops go_xmote_th to go_sync
    GFS2: Speed up gfs2_rbm_from_block
    GFS2: Review bug traps in glops.c

    Linus Torvalds
     

12 Dec, 2012

1 commit

  • Overhaul struct address_space.assoc_mapping, renaming it to
    address_space.private_data and redefining its type as void *. This
    approach consistently names the .private_* elements of struct
    address_space, and allows extended use of address_space association
    with other data structures through ->private_data.

    Also, all users of the old ->assoc_mapping element are converted to
    reflect its new name and type change (->private_data).

    Signed-off-by: Rafael Aquini
    Cc: Rusty Russell
    Cc: "Michael S. Tsirkin"
    Cc: Rik van Riel
    Cc: Mel Gorman
    Cc: Andi Kleen
    Cc: Konrad Rzeszutek Wilk
    Cc: Minchan Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael Aquini
     

14 Nov, 2012

1 commit

  • When unmounting, gfs2 does a full dlm_unlock operation on every
    cached lock. This can create a very large amount of work and can
    take a long time to complete. However, the vast majority of these
    dlm unlock operations are unnecessary because after all the unlocks
    are done, gfs2 leaves the dlm lockspace, which automatically clears
    the locks of the leaving node, without unlocking each one individually.
    So, gfs2 can skip explicit dlm unlocks, and use dlm_release_lockspace to
    remove the locks implicitly. The one exception is when the lock's lvb is
    being used. In this case, dlm_unlock is called because it may update the
    lvb of the resource.
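
    The decision boils down to a per-lock test like this sketch
    (userspace model, names invented):

    #include <stdbool.h>
    #include <stdio.h>

    struct cached_lock { unsigned long num; bool has_lvb; };

    /* Only locks whose value block may need writing back get an
     * explicit dlm_unlock; everything else is dropped in one shot
     * by leaving the lockspace. */
    static bool needs_explicit_unlock(const struct cached_lock *lk)
    {
            return lk->has_lvb;
    }

    int main(void)
    {
            struct cached_lock locks[] = { {1, false}, {2, true} };
            size_t i;

            for (i = 0; i < 2; i++)
                    printf("lock %lu: %s\n", locks[i].num,
                           needs_explicit_unlock(&locks[i]) ?
                           "dlm_unlock" : "skip");
            return 0;
    }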

    Signed-off-by: David Teigland
    Signed-off-by: Steven Whitehouse

    David Teigland
     

07 Nov, 2012

2 commits

  • [Editorial: This is a nit, but has been a minor irritation for a long time:]

    This patch renames the glops structure item go_xmote_th to go_sync.
    The functionality is unchanged; it's just for readability.

    Signed-off-by: Bob Peterson
    Signed-off-by: Steven Whitehouse

    Bob Peterson
     
  • Two of the bug traps here could really be warnings. The others are
    converted from BUG() to GLOCK_BUG_ON() since we'll most likely
    need to know the glock state in order to debug any issues which
    arise. As a result of this, __dump_glock has to be renamed and
    is no longer static.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

08 Jun, 2012

2 commits

  • Instead of reading in the resource groups when gfs2 is checking
    for free space to allocate from, gfs2 can store the necessary
    information in the resource group's lvb. Also, instead of searching
    for unlinked inodes in every resource group that's checked for free
    space, gfs2 can store the number of unlinked, but not yet deleted,
    inodes in the lvb, and only check for unlinked inodes if it will
    find some.

    The first time a resource group is locked, the lvb must be initialized.
    Since this involves counting the unlinked inodes in the resource group,
    this takes a little extra time. But after that, if the resource group
    is locked with GL_SKIP, the buffer head won't be read in unless it's
    actually needed.

    Enabling the resource group lvbs is done via the rgrplvb mount option. If
    this option isn't set, the lvbs will still be set and updated, but they won't
    be verified or used by the filesystem. To safely turn on this option, all of
    the nodes mounting the filesystem must be running code with this patch, and
    the filesystem must have been completely unmounted since they were updated.
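
    The sort of summary that fits in an LVB, as a sketch (field names
    hypothetical; the on-disk layout used by the patch differs):

    #include <stdint.h>
    #include <stdio.h>

    struct rgrp_lvb {
            uint32_t magic;      /* detects an uninitialized LVB */
            uint32_t free;       /* free blocks in the rgrp */
            uint32_t dinodes;    /* allocated inodes */
            uint32_t unlinked;   /* unlinked-but-not-deleted inodes */
    };

    int main(void)
    {
            /* An LVB is small (GFS2 traditionally uses 32 bytes), so
             * only compact summaries like this can be cached in it. */
            printf("lvb payload: %zu bytes\n", sizeof(struct rgrp_lvb));
            return 0;
    }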

    Signed-off-by: Benjamin Marzinski
    Signed-off-by: Steven Whitehouse

    Benjamin Marzinski
     
  • For the glocks and glstats seq_files, which are exposed via
    debugfs, we should cache the most recent hash bucket, along with
    the offset
    into that bucket. This allows us to restart from that point, rather
    than having to begin at the beginning each time.

    This is an idea from Eric Dumazet; however, I've slightly extended it
    so that if the position from which we are due to start is at any
    point beyond the last cached point, we start from the last cached
    point, plus whatever is the appropriate offset. I don't really expect
    people to be lseeking around these files, but if they did so with only
    positive offsets, then we'd still get some of the benefit of using a
    cached offset.

    With my simple test of around 200k entries in the file, I'm seeing
    an approx 10x speed up.
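
    The caching idea in miniature (names hypothetical; the real code
    walks hash buckets of glocks rather than a flat sequence):

    #include <stdio.h>

    /* Toy iterator state: remember where the last read stopped. */
    struct iter_cache {
            long pos;       /* seq position already reached */
            int bucket;     /* hash bucket holding that position */
            long offset;    /* offset within that bucket */
    };

    /* If the requested position is at or past the cached one, resume
     * from the cache instead of rescanning from bucket zero. */
    static void iter_start(const struct iter_cache *c, long want,
                           int *bucket, long *skip)
    {
            if (want >= c->pos) {
                    *bucket = c->bucket;
                    *skip = (want - c->pos) + c->offset;
            } else {
                    *bucket = 0;
                    *skip = want;   /* backwards seek: full rescan */
            }
    }

    int main(void)
    {
            struct iter_cache c = { 150000, 731, 3 };
            int bucket;
            long skip;

            iter_start(&c, 150001, &bucket, &skip);
            printf("resume at bucket %d, skipping %ld\n", bucket, skip);
            return 0;
    }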

    Cc: Eric Dumazet
    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

29 Feb, 2012

1 commit

  • The stats are divided into two sets: those relating to the
    super block and those relating to an individual glock. The
    super block stats are done on a per cpu basis in order to
    try and reduce the overhead of gathering them. They are also
    further divided by glock type.

    The same information is gathered for both the super block
    and glock statistics. The super
    block statistics are used to provide default values for
    most of the glock statistics, so that newly created glocks
    should have, as far as possible, a sensible starting point.

    The statistics are divided into three pairs of mean and
    variance, plus two counters. The mean/variance pairs are
    smoothed exponential estimates and the algorithm used is
    one which will be very familiar to anyone used to the calculation
    of round trip times in network code.

    The three pairs of mean/variance measure the following
    things:

    1. DLM lock time (non-blocking requests)
    2. DLM lock time (blocking requests)
    3. Inter-request time (again to the DLM)

    A non-blocking request is one which will complete right
    away, whatever the state of the DLM lock in question. That
    currently means any requests when (a) the current state of
    the lock is exclusive (b) the requested state is either null
    or unlocked or (c) the "try lock" flag is set. A blocking
    request covers all the other lock requests.

    There are two counters. The first is there primarily to show
    how many lock requests have been made, and thus how much data
    has gone into the mean/variance calculations. The other counter
    counts queueing of holders at the top layer of the glock
    code. Hopefully that number will be a lot larger than the number
    of dlm lock requests issued.

    So why gather these statistics? There are several reasons
    we'd like to get a better idea of these timings:

    1. To be able to better set the glock "min hold time"
    2. To spot performance issues more easily
    3. To improve the algorithm for selecting resource groups for
    allocation (to base it on lock wait time, rather than blindly
    using a "try lock")
    Due to the smoothing action of the updates, a step change in
    some input quantity being sampled will only fully be taken
    into account after 8 samples (or 4 for the variance) and this
    needs to be carefully considered when interpreting the
    results.

    Knowing both the time it takes a lock request to complete and
    the average time between lock requests for a glock means we
    can compute the total percentage of the time for which the
    node is able to use a glock vs. time that the rest of the
    cluster has its share. That will be very useful when setting
    the lock min hold time.

    The other point to remember is that all times are in
    nanoseconds. Great care has been taken to ensure that we
    measure exactly the quantities that we want, as accurately
    as possible. There are always inaccuracies in any
    measuring system, but I hope this is as accurate as we
    can reasonably make it.
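
    The update step described above, in miniature (userspace sketch in
    essentially the SRTT/RTTVAR style the text refers to; the kernel
    code works on integer nanosecond samples):

    #include <stdio.h>
    #include <stdlib.h>

    /* The smoothed mean gains 1/8 of each error and the variance
     * estimate 1/4, matching the "8 samples (4 for the variance)"
     * note above. */
    static void update_stats(long long *mean, long long *var,
                             long long sample)
    {
            long long delta = sample - *mean;

            *mean += delta >> 3;
            *var += (llabs(delta) - *var) >> 2;
    }

    int main(void)
    {
            long long mean = 0, var = 0;
            int i;

            for (i = 0; i < 16; i++)
                    update_stats(&mean, &var, 1000);
            printf("mean=%lld var=%lld\n", mean, var);
            return 0;
    }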

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

11 Jan, 2012

1 commit

  • This new method of managing recovery is an alternative to
    the previous approach of using the userland gfs_controld.

    - use dlm slot numbers to assign journal id's
    - use dlm recovery callbacks to initiate journal recovery
    - use a dlm lock to determine the first node to mount fs
    - use a dlm lock to track journals that need recovery

    Signed-off-by: David Teigland
    Signed-off-by: Steven Whitehouse

    David Teigland
     
