Eric Lee / smarc-fsl-linux-kernel

25 Dec, 2015

1 commit

f39814f60 gfs2: Invalid security labels of inodes when they go invalid

When gfs2 releases the glock of an inode, it must invalidate all
information cached for that inode, including the page cache and acls.
Use the new security_inode_invalidate_secctx hook to also invalidate
security labels in that case. These items will be reread from disk
when needed after reacquiring the glock.

Signed-off-by: Andreas Gruenbacher
Acked-by: Bob Peterson
Acked-by: Steven Whitehouse
Cc: cluster-devel@redhat.com
[PM: fixed spelling errors and description line lengths]
Signed-off-by: Paul Moore

Andreas Gruenbacher
2015-12-25 00:09:40 +0800

30 Oct, 2015

1 commit

f3dd16491 gfs2: Remove gl_spin define

Commit e66cf161 replaced the gl_spin spinlock in struct gfs2_glock with a
gl_lockref lockref and defined gl_spin as gl_lockref.lock (the spinlock in
gl_lockref). Remove that define to make the references to gl_lockref.lock more
obvious.

Signed-off-by: Andreas Gruenbacher
Signed-off-by: Bob Peterson

Andreas Gruenbacher
2015-10-30 01:57:48 +0800

04 Sep, 2015

1 commit

15562c439 GFS2: Move glock superblock pointer to field gl_name

What uniquely identifies a glock in the glock hash table is not
gl_name, but gl_name and its superblock pointer. This patch makes
the gl_name field correspond to a unique glock identifier. That will
allow us to simplify hashing with a future patch, since the hash
algorithm can then take the gl_name and hash its components in one
operation.

Signed-off-by: Bob Peterson
Signed-off-by: Andreas Gruenbacher
Acked-by: Steven Whitehouse

Bob Peterson
2015-09-04 02:33:09 +0800
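
Illustrative sketch (not part of the commit): one way a lock name can carry its superblock pointer so that the whole structure serves as the hash key. The names lock_name_sketch and sb_sketch are hypothetical, and FNV-1a is only a stand-in for whatever hash the kernel uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for per-filesystem superblock data. */
struct sb_sketch { int id; };

/* Hypothetical lock name: number, type and owning superblock together. */
struct lock_name_sketch {
    uint64_t number;
    uint32_t type;
    const struct sb_sketch *sb;
};

/* Zero the struct first so padding bytes hash and compare consistently. */
static void lock_name_init(struct lock_name_sketch *n, uint64_t number,
                           uint32_t type, const struct sb_sketch *sb)
{
    assert(n != NULL);
    memset(n, 0, sizeof(*n));
    n->number = number;
    n->type = type;
    n->sb = sb;
}

/* Hash every byte of the name in one pass (FNV-1a, 32-bit). */
static uint32_t lock_name_hash(const struct lock_name_sketch *n)
{
    const unsigned char *p = (const unsigned char *)n;
    uint32_t h = 2166136261u;

    for (size_t i = 0; i < sizeof(*n); i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* Two names match only if number, type and superblock all match. */
static bool lock_name_equal(const struct lock_name_sketch *a,
                            const struct lock_name_sketch *b)
{
    return memcmp(a, b, sizeof(*a)) == 0;
}
```

With the superblock pointer inside the name, two filesystems that use the same lock number and type no longer compare equal, and no separate pointer comparison is needed at lookup time.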

19 Jun, 2015

2 commits

39b0f1e92 GFS2: Don't brelse rgrp buffer_heads every allocation

This patch allows the block allocation code to retain the buffers
for the resource groups so they don't need to be re-read from buffer
cache with every request. This is a performance improvement that's
especially noticeable when resource groups are very large. For
example, with 2GB resource groups and 4K blocks, there can be 33
blocks for every resource group. This patch allows those 33 buffers
to be kept around and not read in and thrown away with every
operation. The buffers are released when the resource group is
either synced or invalidated.

Signed-off-by: Bob Peterson
Reviewed-by: Steven Whitehouse
Reviewed-by: Benjamin Marzinski

Bob Peterson
2015-06-19 20:40:22 +0800
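
Illustrative sketch (not part of the commit): one way to arrive at the figure of 33 blocks. It assumes 2 bitmap bits per block, a 128-byte header in the first bitmap block and a 24-byte header in each later one; these sizes are assumptions for the arithmetic, and the sketch ignores the space the bitmap blocks themselves occupy.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sizes, in bytes; treat these as illustrative, not authoritative. */
#define RGRP_HEADER_BYTES      128u
#define META_HEADER_BYTES      24u
#define BLOCKS_PER_BITMAP_BYTE 4u   /* assumes 2 bits of state per block */

/* Number of bitmap blocks needed to describe one resource group. */
static uint64_t rgrp_bitmap_blocks(uint64_t rgrp_bytes, uint32_t block_size)
{
    assert(block_size > RGRP_HEADER_BYTES);

    uint64_t nblocks = rgrp_bytes / block_size;
    uint64_t bitmap_bytes =
        (nblocks + BLOCKS_PER_BITMAP_BYTE - 1) / BLOCKS_PER_BITMAP_BYTE;
    uint64_t first = block_size - RGRP_HEADER_BYTES;

    if (bitmap_bytes <= first)
        return 1;

    uint64_t rest = bitmap_bytes - first;
    uint64_t per_block = block_size - META_HEADER_BYTES;

    /* One block with the larger header, then as many as the rest needs. */
    return 1 + (rest + per_block - 1) / per_block;
}
```

Under these assumptions, rgrp_bitmap_blocks(2ULL << 30, 4096) gives 33, matching the message above.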

e7ccaf5fe GFS2: Don't add all glocks to the lru

The glocks used for resource groups often come and go hundreds of
thousands of times per second. Adding them to the lru list just
adds unnecessary contention for the lru_lock spin_lock, especially
considering we're almost certainly going to re-use the glock and
take it back off the lru microseconds later. We never want the
glock shrinker to cull them anyway. This patch adds a new bit in
the glops that determines which glock types get put onto the lru
list and which ones don't.

Signed-off-by: Bob Peterson
Acked-by: Steven Whitehouse

Bob Peterson
2015-06-19 01:17:59 +0800
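
Illustrative sketch (not part of the commit): a per-type flag in the operations table that decides LRU eligibility. The flag name and the two tables are hypothetical; the message only states that resource-group glocks should stay off the LRU, so marking the inode type as eligible is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define GLOPS_SKETCH_LRU 0x1u   /* hypothetical "may go on the LRU" bit */

struct glops_sketch {
    const char *name;
    unsigned int flags;
};

/* Assumed example types: one that opts in to the LRU, one that does not. */
static const struct glops_sketch inode_ops_sketch = { "inode", GLOPS_SKETCH_LRU };
static const struct glops_sketch rgrp_ops_sketch  = { "rgrp", 0 };

/* Checked once when a glock goes idle, instead of always taking lru_lock. */
static bool glock_wants_lru(const struct glops_sketch *ops)
{
    assert(ops != NULL);
    return (ops->flags & GLOPS_SKETCH_LRU) != 0;
}
```

Types that clear the bit never touch the LRU list, so they never contend for its lock.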

17 Nov, 2014

1 commit

2e60d7683 GFS2: update freeze code to use freeze/thaw_super on all nodes

The current gfs2 freezing code is considerably more complicated than it
should be because it doesn't use the vfs freezing code on any node except
the one that begins the freeze. This is because it needs to acquire a
cluster glock before calling the vfs code to prevent a deadlock, and
without the new freeze_super and thaw_super hooks, that was impossible. To
deal with the issue, gfs2 had to do some hacky locking tricks to make sure
that a frozen node couldn't be holding on to a lock it needed to do the
unfreeze ioctl.

This patch makes use of the new hooks to simplify the gfs2 locking code. Now,
all the nodes in the cluster freeze and thaw in exactly the same way. Every
node in the cluster caches the freeze glock in the shared state. The new
freeze_super hook allows the freezing node to grab this freeze glock in
the exclusive state without first calling the vfs freeze_super function.
All the nodes in the cluster see this lock change, and call the vfs
freeze_super function. The vfs locking code guarantees that the nodes can't
get stuck holding the glocks necessary to unfreeze the system. To
unfreeze, the freezing node uses the new thaw_super hook to drop the freeze
glock. Again, all the nodes notice this, reacquire the glock in shared mode
and call the vfs thaw_super function.

Signed-off-by: Benjamin Marzinski
Signed-off-by: Steven Whitehouse

Benjamin Marzinski
2014-11-17 18:36:39 +0800

08 Oct, 2014

1 commit

d29c0afe4 GFS2: use _RET_IP_ instead of (unsigned long)__builtin_return_address(0)

use macro definition

Signed-off-by: Fabian Frederick
Signed-off-by: Steven Whitehouse

Fabian Frederick
2014-10-08 16:57:07 +0800
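
Illustrative sketch (not part of the commit): a userspace equivalent of the macro, using the expansion named in the title. RET_IP_SKETCH and caller_address are hypothetical names; the builtin requires GCC or Clang.

```c
#include <assert.h>

/* Same expansion as in the commit title, under a local name. */
#define RET_IP_SKETCH ((unsigned long)__builtin_return_address(0))

/* noinline keeps a real call frame so there is a return address to read. */
__attribute__((noinline))
static unsigned long caller_address(void)
{
    return RET_IP_SKETCH;
}
```

The macro simply gives the cast and the builtin call a shorter, consistent spelling.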

18 Jul, 2014

1 commit

6b49d1d9c GFS2: memcontrol: Spelling s/invlidate/invalidate/

Signed-off-by: Geert Uytterhoeven
Cc: cluster-devel@redhat.com
Signed-off-by: Steven Whitehouse

Geert Uytterhoeven
2014-07-18 18:14:31 +0800

04 Jun, 2014

1 commit

ba1bdefec Merge tag 'gfs2-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/s…

…teve/gfs2-3.0-nmw into next

Pull gfs2 updates from Steven Whitehouse:
"This must be about the smallest merge window patch set ever for GFS2.
It is probably also the first one without a single patch from me.
That is down to a combination of factors, and I have some things in
the works that are not quite ready yet, that I hope to put in next
time around.

Returning to what is here this time... we have 3 patches which fix
various warnings. Two are bug fixes (for quotas and also a rare
recovery race condition). The final patch, from Ben Marzinski, is an
important change in the freeze code which has been in progress for
some time. This removes the need to take and drop the transaction
lock for every single transaction, when the only time it was used, was
at file system freeze time. Ben's patch integrates the freeze
operation into the journal flush code as an alternative with lower
overheads and also lands up resolving some difficult to fix races at
the same time"

* tag 'gfs2-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-nmw:
GFS2: Prevent recovery before the local journal is set
GFS2: fs/gfs2/file.c: kernel-doc warning fixes
GFS2: fs/gfs2/bmap.c: kernel-doc warning fixes
GFS2: remove transaction glock
GFS2: lops.c: replace 0 by NULL for pointers
GFS2: quotas not being refreshed in gfs2_adjust_quota

Linus Torvalds
2014-06-04 23:30:10 +0800

14 May, 2014

1 commit

24972557b GFS2: remove transaction glock

GFS2 has a transaction glock, which must be grabbed for every
transaction, whose purpose is to deal with freezing the filesystem.
Aside from this involving a large amount of locking, it is very easy to
make the current fsfreeze code hang on unfreezing.

This patch rewrites how gfs2 handles freezing the filesystem. The
transaction glock is removed. In its place is a freeze glock, which is
cached (but not held) in a shared state by every node in the cluster
when the filesystem is mounted. This lock only needs to be grabbed on
freezing, and actions which need to be safe from freezing, like
recovery.

When a node wants to freeze the filesystem, it grabs this glock
exclusively. When the freeze glock state changes on the nodes (either
from shared to unlocked, or shared to exclusive), the filesystem does a
special log flush. gfs2_log_flush() does all the work for flushing out
and shutting down the incore log, and then it tries to grab the
freeze glock in a shared state again. Since the filesystem is stuck in
gfs2_log_flush, no new transaction can start, and nothing can be written
to disk. Unfreezing the filesystem simply involves dropping the freeze
glock, allowing gfs2_log_flush() to grab and then release the shared
lock, so it is cached for next time.

However, in order for the unfreezing ioctl to occur, gfs2 needs to get a
shared lock on the filesystem root directory inode to check permissions.
If that glock has already been grabbed exclusively, fsfreeze will be
unable to get the shared lock and unfreeze the filesystem.

In order to allow the unfreeze, this patch makes gfs2 grab a shared lock
on the filesystem root directory during the freeze, and hold it until it
unfreezes the filesystem. The functions which need to grab a shared
lock in order to allow the unfreeze ioctl to be issued now use the lock
grabbed by the freeze code instead.

The freeze and unfreeze code take care to make sure that this shared
lock will not be dropped while another process is using it.

Signed-off-by: Benjamin Marzinski
Signed-off-by: Steven Whitehouse

Benjamin Marzinski
2014-05-14 17:04:34 +0800

18 Apr, 2014

1 commit

4e857c58e arch: Mass conversion of smp_mb__*()

Mostly scripted conversion of the smp_mb__* barriers.

Signed-off-by: Peter Zijlstra
Acked-by: Paul E. McKenney
Link: http://lkml.kernel.org/n/tip-55dhyhocezdw1dg7u19hmh1u@git.kernel.org
Cc: Linus Torvalds
Cc: linux-arch@vger.kernel.org
Signed-off-by: Ingo Molnar

Peter Zijlstra
2014-04-18 20:20:48 +0800

25 Feb, 2014

1 commit

d69a3c656 GFS2: Move log buffer lists into transaction

Over time, we hope to be able to improve the concurrency available
in the log code. This is one small step towards that, by moving
the buffer lists from the super block, and into the transaction
structure, so that each transaction builds its own buffer lists.

At transaction commit time, the buffer lists are merged into
the currently accumulating transaction. That transaction then
is passed into the before and after commit functions at journal
flush time. Thus there should be no change in overall behaviour
yet.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2014-02-25 00:54:54 +0800

16 Jan, 2014

1 commit

ac3beb6a5 GFS2: Don't use ENOBUFS when ENOMEM is the correct error code

Al Viro has tactfully pointed out that we are using the incorrect
error code in some cases. This patch fixes that, and also removes
the (unused) return value for glock dumping.

> * gfs2_iget() - ENOBUFS instead of ENOMEM. ENOBUFS is
> "No buffer space available (POSIX.1 (XSI STREAMS option))" and since
> we don't support STREAMS it's probably fair game, but... what the hell?

Signed-off-by: Steven Whitehouse
Cc: Al Viro

Steven Whitehouse
2014-01-16 18:31:13 +0800

03 Jan, 2014

2 commits

70d4ee94b GFS2: Use only a single address space for rgrps

Prior to this patch, GFS2 had one address space for each rgrp,
stored in the glock. This patch changes them to use a single
address space in the super block. This therefore saves
(sizeof(struct address_space) * nr_of_rgrps) bytes of memory
and for large filesystems, that can be significant.

It would be nice to be able to do something similar and merge
the inode metadata address space into the same global
address space. However, that is rather more complicated as the
on-disk location doesn't have a 1:1 mapping with the inodes in
general. So while it could be done, it will be a more complicated
operation as it requires changing a lot more code paths.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2014-01-03 18:01:50 +0800

7005c3e4a GFS2: Use range based functions for rgrp sync/invalidation

Each rgrp header is represented as a single extent on disk, so we
can calculate the position within the address space, since we are
using address spaces mapped 1:1 to the disk. This means that it
is possible to use the range based versions of filemap_fdatawrite/wait
and for invalidating the page cache.

Our eventual intent is to then be able to merge the address spaces
used for rgrps into a single address space, rather than to have
one for each glock, saving memory and reducing complexity.

Since during umount, the rgrp structures are disposed of before
the glocks, we need to store the extent information in the glock
so that it is available for a final invalidation. This patch uses
a field which is otherwise unused in rgrp glocks to do that, so
that we do not have to expand the size of a glock.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2014-01-03 18:00:31 +0800
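
Illustrative sketch (not part of the commit): turning an rgrp header extent into an inclusive byte range, as a range-based sync or invalidation call would need when the address space maps 1:1 to the disk. The structure and function names are hypothetical, and any rounding to page boundaries that real code might apply is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Inclusive byte range within a 1:1 disk-mapped address space. */
struct byte_range_sketch {
    uint64_t start;
    uint64_t end;
};

/* addr: first block of the header extent; length: extent size in blocks. */
static struct byte_range_sketch rgrp_header_range(uint64_t addr,
                                                  uint32_t length,
                                                  uint32_t block_size)
{
    struct byte_range_sketch r;

    assert(length > 0 && block_size > 0);
    r.start = addr * block_size;
    r.end = (addr + length) * block_size - 1;   /* last byte of the extent */
    return r;
}
```

Storing just these two values in the glock is enough to run a final invalidation after the rgrp structure itself has been freed.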

20 Dec, 2013

1 commit

582d2f7ae GFS2: Wait for async DIO in glock state changes

We need to wait for any outstanding DIO to complete in a couple
of situations. Firstly, in case we are changing out of deferred
mode (in inode_go_sync) where GLF_DIRTY will not be set. That
call could be prefixed with a test for gl_state == LM_ST_DEFERRED
but it doesn't seem worth it bearing in mind that the test for
outstanding DIO is very quick anyway, in the usual case that there
is none.

The second case is in inode_go_lock which will catch the cases
where we have a cached EX lock, but where we grant deferred locks
against it so that there is no glock state transition. We only
need to wait if the state is not deferred, since DIO is valid
anyway in that state.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2013-12-20 18:42:08 +0800

15 Oct, 2013

1 commit

e66cf1610 GFS2: Use lockref for glocks

Currently glocks have an atomic reference count and also a spinlock
which covers various internal fields, such as the state. The intent of
this patch is to replace the spinlock and the atomic reference count
with a lockref structure. This contains a spinlock which we can continue
to use as before, and a reference counter which is used in conjunction
with the spinlock to replace the previous atomic counter.

As a result of this there are some new rules for reference counting on
glocks. We need to distinguish between reference count changes under
gl_spin (which are now just increment or decrement of the new counter,
provided the count cannot hit zero) and those which are outside of
gl_spin, but which now take gl_spin internally.

The conversion is relatively straightforward. There is probably some
further clean up which can be done, but the priority at this stage is to
make the change in as simple a manner as possible.

A consequence of this change is that the reference count is being
decoupled from the lru list processing. This should allow future
adoption of the lru_list code with glocks in due course.

The reason for using the "dead" state and not just relying on 0 being
the "invalid state" is so that in due course 0 ref counts can be
allowable. The intent is to eventually be able to remove the ref count
changes which are currently hidden away in state_change().

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2013-10-15 22:18:08 +0800
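
Illustrative sketch (not part of the commit): a userspace model of a lock plus a count in one structure, with a distinct "dead" state rather than treating zero as invalid. All names are hypothetical and loosely modeled on the idea described above; a C11 atomic_flag stands in for the kernel spinlock, and marking the object dead on the last put is an assumption of this sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct lockref_sketch {
    atomic_flag lock;   /* stands in for the spinlock */
    int count;          /* protected by lock; negative means "dead" */
};

static void sketch_lock(struct lockref_sketch *lr)
{
    while (atomic_flag_test_and_set_explicit(&lr->lock, memory_order_acquire))
        ;   /* spin until the flag was previously clear */
}

static void sketch_unlock(struct lockref_sketch *lr)
{
    atomic_flag_clear_explicit(&lr->lock, memory_order_release);
}

/* Take a reference unless the object has been marked dead. */
static bool sketch_get_not_dead(struct lockref_sketch *lr)
{
    bool ok = false;

    sketch_lock(lr);
    if (lr->count >= 0) {
        lr->count++;
        ok = true;
    }
    sketch_unlock(lr);
    return ok;
}

/* Drop a reference; on the last one, mark the object dead and report it. */
static bool sketch_put(struct lockref_sketch *lr)
{
    bool last = false;

    sketch_lock(lr);
    assert(lr->count > 0);
    if (--lr->count == 0) {
        lr->count = -128;   /* dead: later lookups must not revive it */
        last = true;
    }
    sketch_unlock(lr);
    return last;
}
```

Because "dead" is a separate state, a count of zero could later be allowed to mean "idle but still valid" without changing how lookups reject dying objects.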

19 Aug, 2013

1 commit

1bc333f4c GFS2: don't overrun reserved revokes

When run during fsync, a gfs2_log_flush could happen between the
time when gfs2_ail_flush checked the number of blocks to revoke,
and when it actually started the transaction to do those revokes.
This occasionally caused it to need more revokes than it reserved,
causing gfs2 to crash.

Instead of just reserving enough revokes to handle the blocks that
currently need them, this patch makes gfs2_ail_flush reserve the
maximum number of revokes it can, without increasing the total number
of reserved log blocks. This patch also passes the number of reserved
revokes to __gfs2_ail_flush() so that it doesn't go over its limit
and cause a crash like we're seeing. Non-fsync calls to __gfs2_ail_flush
will still cause a BUG() if necessary revokes are skipped.

Signed-off-by: Benjamin Marzinski
Signed-off-by: Steven Whitehouse

Benjamin Marzinski
2013-08-19 16:33:16 +0800
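
Illustrative sketch (not part of the commit): capping the work done by a flush at the number of revokes that were reserved, so that blocks added after the reservation cannot push it over the limit. The function and parameter names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Issue revokes for pending blocks, never more than max_revokes.
 * Returns how many were issued; anything left stays pending for later.
 */
static unsigned int flush_revokes_sketch(unsigned int *pending,
                                         unsigned int max_revokes)
{
    unsigned int issued = 0;

    assert(pending != NULL);
    while (*pending > 0 && issued < max_revokes) {
        (*pending)--;
        issued++;
    }
    return issued;
}
```

If ten blocks are pending but only four revokes were reserved, four are issued and six remain, instead of the reservation being overrun.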

19 Jun, 2013

1 commit

5d054964f GFS2: aggressively issue revokes in gfs2_log_flush

This patch looks at all the outstanding blocks in all the transactions
on the log, and moves the completed ones to the ail2 list. Then it
issues revokes for these blocks. This will hopefully speed things up
in situations where there is a lot of contention for glocks, especially
if they are acquired serially.

revoke_lo_before_commit will issue at most one log block full of these
preemptive revokes. The amount of reserved log space that
gfs2_log_reserve() ignores has been incremented to allow for this extra
block.

This patch also consolidates the common revoke instructions into one
function, gfs2_add_revoke().

Signed-off-by: Benjamin Marzinski
Signed-off-by: Steven Whitehouse

Benjamin Marzinski
2013-06-19 16:41:59 +0800

10 Apr, 2013

1 commit

81ffbf654 GFS2: Add origin indicator to glock callbacks

This patch adds a bool indicating whether the demote
request was originated locally or remotely. This is then
used by the iopen ->go_callback() to make 100% sure that
it will only respond to remote callbacks.

Since ->evict_inode() uses GL_NOCACHE when it attempts to
get an exclusive lock on the iopen lock, this may result
in extra scheduling of the workqueue in case that the
exclusive promotion request failed. This patch prevents
that from happening.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2013-04-10 17:26:55 +0800
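
Illustrative sketch (not part of the commit): a callback that receives the origin of the demote request and ignores locally originated ones. The structure and function names are hypothetical, and the queued-work flag is a stand-in for scheduling a workqueue item.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct glock_sketch {
    bool work_queued;   /* stand-in for "workqueue item scheduled" */
};

/* remote is true only when another node asked for the demote. */
static void iopen_callback_sketch(struct glock_sketch *gl, bool remote)
{
    assert(gl != NULL);
    if (!remote)
        return;             /* local request: nothing to schedule */
    gl->work_queued = true;
}
```

A failed local promotion then no longer triggers extra work, since only remote callbacks get past the check.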

13 Feb, 2013

1 commit

d05464264 gfs2: Convert uids and gids between dinodes and vfs inodes.

When reading dinodes from the disk convert uids and gids
into kuids and kgids to store in vfs data structures.

When writing dinodes to the disk convert kuids and kgids
in the in memory structures into plain uids and gids.

For now all on disk data structures are assumed to be
stored in the initial user namespace.

Cc: Steven Whitehouse
Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2013-02-13 22:15:11 +0800

15 Nov, 2012

1 commit

dba2d70c5 GFS2: only use lvb on glocks that need it

Save the effort of allocating, reading and writing
the lvb for most glocks that do not use it.

Signed-off-by: David Teigland
Signed-off-by: Steven Whitehouse

David Teigland
2012-11-15 18:16:59 +0800

07 Nov, 2012

2 commits

06dfc3064 GFS2: Rename glops go_xmote_th to go_sync

[Editorial: This is a nit, but has been a minor irritation for a long time:]

This patch renames glops structure item for go_xmote_th to go_sync.
The functionality is unchanged; it's just for readability.

Signed-off-by: Bob Peterson
Signed-off-by: Steven Whitehouse

Bob Peterson
2012-11-07 21:31:57 +0800

8eae1ca00 GFS2: Review bug traps in glops.c

Two of the bug traps here could really be warnings. The others are
converted from BUG() to GLOCK_BUG_ON() since we'll most likely
need to know the glock state in order to debug any issues which
arise. As a result of this, __dump_glock has to be renamed and
is no longer static.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2012-11-07 21:31:07 +0800

24 Sep, 2012

1 commit

a0b4df294 GFS2: fix s_writers.counter imbalance in gfs2_ail_empty_gl

gfs2_ail_empty_gl() contains an "inline version" of gfs2_trans_begin(),
so it needs an explicit sb_start_intwrite() as well, to balance the
sb_end_intwrite() which will be called by gfs2_trans_end().

With this, xfstest 068 passes on lock_nolock local gfs2.
Without it, we reach a writer count of -1 and get stuck.

Signed-off-by: Eric Sandeen
Signed-off-by: Steven Whitehouse

Eric Sandeen
2012-09-24 17:47:29 +0800
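
Illustrative sketch (not part of the commit): why an open-coded "begin" must take the same writer count that the shared "end" path drops. All names are hypothetical; a plain integer stands in for the superblock's writer counter.

```c
#include <assert.h>

struct sb_writers_sketch {
    int writers;    /* stand-in for the superblock's writer counter */
};

static void start_intwrite_sketch(struct sb_writers_sketch *sb)
{
    sb->writers++;
}

static void end_intwrite_sketch(struct sb_writers_sketch *sb)
{
    sb->writers--;
}

/* Normal path: beginning a transaction takes the writer count. */
static void trans_begin_sketch(struct sb_writers_sketch *sb)
{
    start_intwrite_sketch(sb);
}

/* Shared end path: always drops the writer count. */
static void trans_end_sketch(struct sb_writers_sketch *sb)
{
    end_intwrite_sketch(sb);
}

/*
 * Open-coded begin used by a special caller. It must still take the
 * writer count, or trans_end_sketch() would drive the counter to -1.
 */
static void inline_begin_sketch(struct sb_writers_sketch *sb)
{
    start_intwrite_sketch(sb);
    /* the rest of the open-coded setup would go here */
}
```

Either begin path followed by the shared end path leaves the counter where it started.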

08 May, 2012

1 commit

6de1e2f34 GFS2: Remove redundant metadata block type check

This patch removes a redundant metadata block check. See description below.

Signed-off-by: Bob Peterson
Signed-off-by: Steven Whitehouse

Bob Peterson
2012-05-08 23:18:55 +0800

24 Apr, 2012

1 commit

c50b91c4b GFS2: Remove bd_list_tr

This is another clean up in the logging code. This per-transaction
list was largely unused. Its main function was to ensure that the
number of buffers in a transaction was correct, however that counter
was only used to check the number of buffers in the bd_list_tr, plus
an assert at the end of each transaction. With the assert now changed
to use the calculated buffer counts, we can remove both bd_list_tr and
its associated counter.

This should make the code easier to understand as well as shrinking
a couple of structures.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2012-04-24 23:44:36 +0800

02 Nov, 2011

1 commit

bfe868486 filesystems: add set_nlink()

Replace remaining direct i_nlink updates with a new set_nlink()
updater function.

Signed-off-by: Miklos Szeredi
Tested-by: Toshiyuki Okajima
Signed-off-by: Christoph Hellwig

Miklos Szeredi
2011-11-02 19:53:43 +0800

21 Oct, 2011

5 commits

b5b24d7ae GFS2: Fix AIL flush issue during fsync

Unfortunately, it is not enough to just ignore locked buffers during
the AIL flush from fsync. We need to be able to ignore all buffers
which are locked, dirty or pinned at this stage as they might have
been added subsequent to the log flush earlier in the fsync function.

In addition, this means that we no longer need to rely on i_mutex to
keep out writes during fsync, so we can, as a side-effect, remove
that protection too.

Signed-off-by: Steven Whitehouse
Tested-By: Abhijith Das

Steven Whitehouse
2011-10-21 19:39:41 +0800

8339ee543 GFS2: Make resource groups "append only" during life of fs

Since we have ruled out supporting online filesystem shrink,
it is possible to make the resource group list append only
during the life of a super block. This gives several benefits:

Firstly, we only need to read new rindex elements as they are added
rather than needing to reread the whole rindex file each time one
element is added.

Secondly, the rindex glock can be held for much shorter periods of
time, and is completely removed from the fast path for allocations.
The lock is taken in shared mode only when updating the resource
groups when the first allocation occurs, and after a grow has
taken place.

Thirdly, this results in a reduction in code size, and everything
gets a lot simpler to understand in this area.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2011-10-21 19:39:33 +0800

7c9ca6211 GFS2: Use rbtree for resource groups and clean up bitmap buffer ref count scheme

Here is an update of Bob's original rbtree patch which, in addition, also
resolves the rather strange ref counting that was being done relating to
the bitmap blocks.

Originally we had a dual system for journaling resource groups. The metadata
blocks were journaled and also the rgrp itself was added to a list. The reason
for adding the rgrp to the list in the journal was so that the "repolish
clones" code could be run to update the free space, and potentially send any
discard requests when the log was flushed. This was done by comparing the
"cloned" bitmap with what had been written back on disk during the transaction
commit.

Due to this, there was a requirement to hang on to the rgrps' bitmap buffers
until the journal had been flushed. For that reason, there was a rather
complicated set up in the ->go_lock ->go_unlock functions for rgrps involving
both a mutex and a spinlock (the ->sd_rindex_spin) to maintain a reference
count on the buffers.

However, the journal maintains a reference count on the buffers anyway, since
they are being journaled as metadata buffers. So by moving the code which deals
with the post-journal accounting for bitmap blocks to the metadata journaling
code, we can entirely dispense with the rather strange buffer ref counting
scheme and also the requirement to journal the rgrps.

The net result of all this is that the ->sd_rindex_spin is left to do exactly
one job, and that is to look after the rbtree of rgrps.

This patch is designed to be a stepping stone towards using RCU for the rbtree
of resource groups, however the reduction in the number of uses of the
->sd_rindex_spin is likely to have benefits for multi-threaded workloads,
anyway.

The patch retains ->go_lock and ->go_unlock for rgrps, however these may also
be removed in future in favour of calling the functions directly where required
in the code. That will allow locking of resource groups without needing to
actually read them in - something that could be useful in speeding up statfs.

In the meantime though it is valid to dereference ->bi_bh only when the rgrp
is locked. This is basically the same rule as before, modulo the references not
being valid until the following journal flush.

Signed-off-by: Steven Whitehouse
Signed-off-by: Bob Peterson
Cc: Benjamin Marzinski

Bob Peterson
2011-10-21 19:39:31 +0800

f18185291 GFS2: Fix bug trap and journaled data fsync

Journaled data requires that a complete flush of all dirty data for
the file is done, in order that the ail flush which comes after
will succeed.

Also the recently enhanced bug trap can trigger falsely in case
an ail flush from fsync races with a page read. This updates the
bug trap such that it will ignore buffers which are locked and
only trigger on dirty and/or pinned buffers when the ail flush
is run from fsync. The original bug trap is retained when ail
flush is run from ->go_sync()

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2011-10-21 19:39:25 +0800

75549186e GFS2: Fix bug-trap in ail flush code

The assert was being tested under the wrong lock, a
legacy of the original code. Also, if it does trigger,
the resulting information was not always a lot of help.

This moves the assert under the correct lock and also
prints out more useful information for tracking down the
source of the problem.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2011-10-21 19:39:20 +0800

15 Jul, 2011

3 commits

9964afbb7 GFS2: Add S_NOSEC support

This adds S_NOSEC support to GFS2. We set/reset the flag either when
a user calls setattr or when we have just regained the glock
from another node. The flag is only set if there are no xattrs
on the inode and there is no suid bit set.

Signed-off-by: Steven Whitehouse
Reviewed-by: Andi Kleen
Cc: Al Viro

Steven Whitehouse
2011-07-15 16:32:35 +0800

7cf8dcd3b GFS2: Automatically adjust glock min hold time

This patch is a performance improvement for GFS2 in a clustered
environment. It makes the glock hold time self-adjusting.

Signed-off-by: Bob Peterson
Signed-off-by: Steven Whitehouse

Bob Peterson
2011-07-15 16:32:11 +0800

17d539f04 GFS2: Cache dir hash table in a contiguous buffer

This patch adds a cache for the hash table to the directory code
in order to help simplify the way in which the hash table is
accessed. This is intended to be a first step towards introducing
some performance improvements in the directory code.

There are two follow ups that I'm hoping to see fairly shortly. One
is to simplify the hash table reading code now that we always read the
complete hash table, whether we want one entry or all of them. The
other is to introduce readahead on the heads of the hash chains
which are referred to from the table.

The hash table is a maximum of 128k in size, so it is not worth trying
to read it in small chunks.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2011-07-15 16:31:48 +0800

14 Jul, 2011

1 commit

380f7c65a GFS2: Resolve inode eviction and ail list interaction bug

This patch contains a few misc fixes which resolve a recently
reported issue. This patch has been a real team effort and has
received a lot of testing.

The first issue is that the ail lock needs to be held over a few
more operations. The lock that's added into gfs2_releasepage() may
possibly be a candidate for replacing with RCU at some future
point, but at this stage we've gone for the obvious fix.

The second issue is that gfs2_write_inode() can end up calling
a glock recursively when called from gfs2_evict_inode() via the
syncing code, so it needs a guard added.

The third issue is that we either need to not truncate the metadata
pages of inodes which have zero link count, but which we cannot
deallocate due to them still being in use by other nodes, or we need
to ensure that those pages have all made it through the journal and
ail lists first. This patch takes the former approach, but the
latter has also been tested and there is nothing to choose between
them performance-wise. So again, we could revise that decision
in the future.

Also, the inode eviction process is now better documented.

Signed-off-by: Steven Whitehouse
Tested-by: Bob Peterson
Tested-by: Abhijith Das
Reported-by: Barry J. Marson
Reported-by: David Teigland

Steven Whitehouse
2011-07-14 15:59:44 +0800

12 Jul, 2011

1 commit

1ce533686 GFS2: force a log flush when invalidating the rindex glock

Right now, there is nothing that forces the log to get flushed when a node
drops its rindex glock so that another node can grow the filesystem. If the
log doesn't get flushed, GFS2 can corrupt the sd_log_le_rg list in the
following way.

A node puts an rgd on the list in rg_lo_add(), and then the rindex glock is
dropped so the other node can grow the filesystem. When the node reacquires the
rindex glock, that rgd gets deleted in clear_rgrpdi() before ever being
removed from the list by gfs2_log_flush().

This code simply forces a log flush when the rindex glock is invalidated,
solving the problem.

Signed-off-by: Benjamin Marzinski
Signed-off-by: Steven Whitehouse

Benjamin Marzinski
2011-07-12 16:15:24 +0800

09 May, 2011

1 commit

d4b2cf1b0 GFS2: Move gfs2_refresh_inode() and friends into glops.c

Eventually there will only be a single caller of this code, so let's
move it where it can be made static at some future date.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2011-05-09 23:44:49 +0800

20 Apr, 2011

1 commit

dba898b02 GFS2: Clean up fsync()

This patch is designed to clean up GFS2's fsync
implementation and ensure that it really does get everything on
disk. Since ->write_inode() has been updated, we can call that
via the vfs library function sync_inode_metadata() and the only
remaining thing that has to be done is to ensure that we get
any revoke records in the log after the inode has been written back.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2011-04-20 16:00:41 +0800