Eric Lee / smarc-fsl-linux-kernel

17 Nov, 2014

1 commit

2e60d7683 GFS2: update freeze code to use freeze/thaw_super on all nodes ... Browse Code »

The current gfs2 freezing code is considerably more complicated than it
should be because it doesn't use the vfs freezing code on any node except
the one that begins the freeze. This is because it needs to acquire a
cluster glock before calling the vfs code to prevent a deadlock, and
without the new freeze_super and thaw_super hooks, that was impossible. To
deal with the issue, gfs2 had to do some hacky locking tricks to make sure
that a frozen node couldn't be holding on a lock it needed to do the
unfreeze ioctl.

This patch makes use of the new hooks to simply the gfs2 locking code. Now,
all the nodes in the cluster freeze and thaw in exactly the same way. Every
node in the cluster caches the freeze glock in the shared state. The new
freeze_super hook allows the freezing node to grab this freeze glock in
the exclusive state without first calling the vfs freeze_super function.
All the nodes in the cluster see this lock change, and call the vfs
freeze_super function. The vfs locking code guarantees that the nodes can't
get stuck holding the glocks necessary to unfreeze the system. To
unfreeze, the freezing node uses the new thaw_super hook to drop the freeze
glock. Again, all the nodes notice this, reacquire the glock in shared mode
and call the vfs thaw_super function.

Signed-off-by: Benjamin Marzinski
Signed-off-by: Steven Whitehouse

Benjamin Marzinski
2014-11-17 18:36:39 +0800

04 Nov, 2014

1 commit

0e27c18c3 GFS2: Set of distributed preferences for rgrps ... Browse Code »

This patch tries to use the journal numbers to evenly distribute
which node prefers which resource group for block allocations. This
is to help performance.

Signed-off-by: Bob Peterson
Signed-off-by: Steven Whitehouse

Bob Peterson
2014-11-04 03:24:49 +0800

11 Sep, 2014

1 commit

a937cca27 GFS2: Don't use MAXQUOTAS value ... Browse Code »

MAXQUOTAS value defines maximum number of quota types VFS supports.
This isn't necessarily the number of types gfs2 supports and with
addition of project quotas these two numbers stop matching. So make gfs2
use its private definition.

CC: cluster-devel@redhat.com
Signed-off-by: Jan Kara
Signed-off-by: Steven Whitehouse

Jan Kara
2014-09-11 17:59:56 +0800

03 Jun, 2014

1 commit

0e48e055a GFS2: Prevent recovery before the local journal is set ... Browse Code »

This patch uses a completion to prevent dlm's recovery process from
referencing and trying to recover a journal before a journal has been
opened.

Signed-off-by: Bob Peterson
Signed-off-by: Steven Whitehouse

Bob Peterson
2014-06-03 02:12:06 +0800

14 May, 2014

1 commit

24972557b GFS2: remove transaction glock ... Browse Code »

GFS2 has a transaction glock, which must be grabbed for every
transaction, whose purpose is to deal with freezing the filesystem.
Aside from this involving a large amount of locking, it is very easy to
make the current fsfreeze code hang on unfreezing.

This patch rewrites how gfs2 handles freezing the filesystem. The
transaction glock is removed. In it's place is a freeze glock, which is
cached (but not held) in a shared state by every node in the cluster
when the filesystem is mounted. This lock only needs to be grabbed on
freezing, and actions which need to be safe from freezing, like
recovery.

When a node wants to freeze the filesystem, it grabs this glock
exclusively. When the freeze glock state changes on the nodes (either
from shared to unlocked, or shared to exclusive), the filesystem does a
special log flush. gfs2_log_flush() does all the work for flushing out
the and shutting down the incore log, and then it tries to grab the
freeze glock in a shared state again. Since the filesystem is stuck in
gfs2_log_flush, no new transaction can start, and nothing can be written
to disk. Unfreezing the filesytem simply involes dropping the freeze
glock, allowing gfs2_log_flush() to grab and then release the shared
lock, so it is cached for next time.

However, in order for the unfreezing ioctl to occur, gfs2 needs to get a
shared lock on the filesystem root directory inode to check permissions.
If that glock has already been grabbed exclusively, fsfreeze will be
unable to get the shared lock and unfreeze the filesystem.

In order to allow the unfreeze, this patch makes gfs2 grab a shared lock
on the filesystem root directory during the freeze, and hold it until it
unfreezes the filesystem. The functions which need to grab a shared
lock in order to allow the unfreeze ioctl to be issued now use the lock
grabbed by the freeze code instead.

The freeze and unfreeze code take care to make sure that this shared
lock will not be dropped while another process is using it.

Signed-off-by: Benjamin Marzinski
Signed-off-by: Steven Whitehouse

Benjamin Marzinski
2014-05-14 17:04:34 +0800

31 Mar, 2014

1 commit

059788039 GFS2: Fix uninitialized VFS inode in gfs2_create_inode ... Browse Code »

When gfs2_create_inode() fails due to quota violation, the VFS
inode is not completely uninitialized. This can cause a list
corruption error.

This patch correctly uninitializes the VFS inode when a quota
violation occurs in the gfs2_create_inode codepath.

Resolves: rhbz#1059808
Signed-off-by: Abhi Das
Signed-off-by: Steven Whitehouse

Abhi Das
2014-03-31 23:41:39 +0800

07 Mar, 2014

1 commit

a17d758b6 GFS2: Move recovery variables to journal structure in memory ... Browse Code »

If multiple nodes fail and their recovery work runs simultaneously, they
would use the same unprotected variables in the superblock. For example,
they would stomp on each other's revoked blocks lists, which resulted
in file system metadata corruption. This patch moves the necessary
variables so that each journal has its own separate area for tracking
its journal replay.

Signed-off-by: Bob Peterson
Signed-off-by: Steven Whitehouse

Bob Peterson
2014-03-07 17:14:48 +0800

03 Mar, 2014

1 commit

b50f227bd GFS2: Clean up journal extent mapping ... Browse Code »

This patch fixes a long standing issue in mapping the journal
extents. Most journals will consist of only a single extent,
and although the cache took account of that by merging extents,
it did not actually map large extents, but instead was doing a
block by block mapping. Since the journal was only being mapped
on mount, this was not normally noticeable.

With the updated code, it is now possible to use the same extent
mapping system during journal recovery (which will be added in a
later patch). This will allow checking of the integrity of the
journal before any reply of the journal content is attempted. For
this reason the code is moving to bmap.c, since it will be used
more widely in due course.

An exercise left for the reader is to compare the new function
gfs2_map_journal_extents() with gfs2_write_alloc_required()

Additionally, should there be a failure, the error reporting is
also updated to show more detail about what went wrong.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2014-03-03 21:50:12 +0800

25 Feb, 2014

2 commits

022ef4fee GFS2: Move log buffer accounting to transaction ... Browse Code »

Now we have a master transaction into which other transactions
are merged, the accounting can be done using this master
transaction. We no longer require the superblock fields which
were being used for this function.

In addition, this allows for a clean up in calc_reserved()
making it rather easier understand. Also, by reducing the
number of variables used to track the buffers being added
and removed from the journal, a number of error checks are
now no longer required.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2014-02-25 03:49:12 +0800
d69a3c656 GFS2: Move log buffer lists into transaction ... Browse Code »

Over time, we hope to be able to improve the concurrency available
in the log code. This is one small step towards that, by moving
the buffer lists from the super block, and into the transaction
structure, so that each transaction builds its own buffer lists.

At transaction commit time, the buffer lists are merged into
the currently accumulating transaction. That transaction then
is passed into the before and after commit functions at journal
flush time. Thus there should be no change in overall behaviour
yet.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2014-02-25 00:54:54 +0800

21 Feb, 2014

1 commit

654a6d2f9 GFS2: Reduce struct gfs2_trans in size ... Browse Code »

A couple of "int" fields were being used as boolean values
so we can make them bitfields of one bit, and put them in
what might otherwise be a hole in the structure with 64
bit alignment.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2014-02-21 19:52:00 +0800

16 Jan, 2014

1 commit

ac3beb6a5 GFS2: Don't use ENOBUFS when ENOMEM is the correct error code ... Browse Code »

Al Viro has tactfully pointed out that we are using the incorrect
error code in some cases. This patch fixes that, and also removes
the (unused) return value for glock dumping.

> * gfs2_iget() - ENOBUFS instead of ENOMEM. ENOBUFS is
> "No buffer space available (POSIX.1 (XSI STREAMS option))" and since
> we don't support STREAMS it's probably fair game, but... what the hell?

Signed-off-by: Steven Whitehouse
Cc: Al Viro

Steven Whitehouse
2014-01-16 18:31:13 +0800

15 Jan, 2014

3 commits

2d9e72303 GFS2: Move quota bitmap operations under their own lock ... Browse Code »

Gradually, the global qd_lock is being used for less and less.
After this patch it will only be used for the per super block
list whose purpose is to allow syncing of changes back to the
master quota file from the local quota changes file. Fixing
up that process to make it more efficient will be the subject
of a later patch, however this patch removes another barrier
to doing that.

Signed-off-by: Steven Whitehouse
Cc: Abhijith Das

Steven Whitehouse
2014-01-15 03:29:06 +0800
ee2411a8d GFS2: Clean up quota slot allocation ... Browse Code »

Quota slot allocation has historically used a vector of pages
and a set of homegrown find/test/set/clear bit functions. Since
the size of the bitmap is likely to be based on the default
qc file size, thats a couple of pages at most. So we ought
to be able to allocate that as a single chunk, with a vmalloc
fallback, just in case of memory fragmentation.

We are then able to use the kernel's own find/test/set/clear
bit functions, rather than rolling our own.

Signed-off-by: Steven Whitehouse
Cc: Abhijith Das

Steven Whitehouse
2014-01-15 03:28:49 +0800
c754fbbb1 GFS2: Use RCU/hlist_bl based hash for quotas ... Browse Code »

Prior to this patch, GFS2 kept all the quotas for each
super block in a single linked list. This is rather slow
when there are large numbers of quotas.

This patch introduces a hlist_bl based hash table, similar
to the one used for glocks. The initial look up of the quota
is now lockless in the case where it is already cached,
although we still have to take the per quota spinlock in
order to bump the ref count. Either way though, this is a
big improvement on what was there before.

The qd_lock and the per super block list is preserved, for
the time being. However it is intended that since this is no
longer used for its original role, it should be possible to
shrink the number of items on that list in due course and
remove the requirement to take qd_lock in qd_get.

Signed-off-by: Steven Whitehouse
Cc: Abhijith Das
Cc: Paul E. McKenney

Steven Whitehouse
2014-01-15 03:27:56 +0800

03 Jan, 2014

3 commits

70d4ee94b GFS2: Use only a single address space for rgrps ... Browse Code »

Prior to this patch, GFS2 had one address space for each rgrp,
stored in the glock. This patch changes them to use a single
address space in the super block. This therefore saves
(sizeof(struct address_space) * nr_of_rgrps) bytes of memory
and for large filesystems, that can be significant.

It would be nice to be able to do something similar and merge
the inode metadata address space into the same global
address space. However, that is rather more complicated as the
on-disk location doesn't have a 1:1 mapping with the inodes in
general. So while it could be done, it will be a more complicated
operation as it requires changing a lot more code paths.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2014-01-03 18:01:50 +0800
7005c3e4a GFS2: Use range based functions for rgrp sync/invalidation ... Browse Code »

Each rgrp header is represented as a single extent on disk, so we
can calculate the position within the address space, since we are
using address spaces mapped 1:1 to the disk. This means that it
is possible to use the range based versions of filemap_fdatawrite/wait
and for invalidating the page cache.

Our eventual intent is to then be able to merge the address spaces
used for rgrps into a single address space, rather than to have
one for each glock, saving memory and reducing complexity.

Since during umount, the rgrp structures are disposed of before
the glocks, we need to store the extent information in the glock
so that is is available for a final invalidation. This patch uses
a field which is otherwise unused in rgrp glocks to do that, so
that we do not have to expand the size of a glock.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2014-01-03 18:00:31 +0800
5ea5050ce GFS2: Implement a "rgrp has no extents longer than X" scheme ... Browse Code »

With the preceding patch, we started accepting block reservations
smaller than the ideal size, which requires a lot more parsing of the
bitmaps. To reduce the amount of bitmap searching, this patch
implements a scheme whereby each rgrp keeps track of the point
at this multi-block reservations will fail.

Signed-off-by: Bob Peterson
Signed-off-by: Steven Whitehouse

Bob Peterson
2014-01-03 17:58:08 +0800

04 Nov, 2013

2 commits

2147dbfd0 GFS2: Use generic list_lru for quota ... Browse Code »

By using the generic list_lru code, we can now separate the
per sb quota list locking from the lru locking. The lru
lock is made into the inner-most lock.

As a result of this new lock order, we may occasionally see
items on the per-sb quota list which are "dead" so that the
two places where we traverse that list are updated to take
account of that.

As a result of this patch, the gfs2 quota shrinker is now
NUMA zone aware, and we are also laying the foundations for
further improvments in due course.

Signed-off-by: Steven Whitehouse
Signed-off-by: Abhijith Das
Tested-by: Abhijith Das
Cc: Dave Chinner

Steven Whitehouse
2013-11-04 19:17:49 +0800
9b9f039d5 GFS2: Use reflink for quota data cache ... Browse Code »

This patch adds reflink support to the quota data cache. It
looks a bit strange because we still don't have a sensible
split in the lookup by id and the lru list. That is coming in
later patches though.

The intent here is just to swap the current ref count for
reflinks in all cases with as little as possible other change.

Signed-off-by: Steven Whitehouse
Signed-off-by: Abhijith Das
Tested-by: Abhijith Das

Steven Whitehouse
2013-11-04 19:17:07 +0800

15 Oct, 2013

1 commit

e66cf1610 GFS2: Use lockref for glocks ... Browse Code »

Currently glocks have an atomic reference count and also a spinlock
which covers various internal fields, such as the state. This intent of
this patch is to replace the spinlock and the atomic reference count
with a lockref structure. This contains a spinlock which we can continue
to use as before, and a reference counter which is used in conjuction
with the spinlock to replace the previous atomic counter.

As a result of this there are some new rules for reference counting on
glocks. We need to distinguish between reference count changes under
gl_spin (which are now just increment or decrement of the new counter,
provided the count cannot hit zero) and those which are outside of
gl_spin, but which now take gl_spin internally.

The conversion is relatively straight forward. There is probably some
further clean up which can be done, but the priority at this stage is to
make the change in as simple a manner as possible.

A consequence of this change is that the reference count is being
decoupled from the lru list processing. This should allow future
adoption of the lru_list code with glocks in due course.

The reason for using the "dead" state and not just relying on 0 being
the "invalid state" is so that in due course 0 ref counts can be
allowable. The intent is to eventually be able to remove the ref count
changes which are currently hidden away in state_change().

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2013-10-15 22:18:08 +0800

04 Oct, 2013

2 commits

e46c772db GFS2: Protect quota sync generation ... Browse Code »

Now that gfs2_quota_sync can be potentially called from multiple
threads, we should protect this bit of code, and the sync generation
number in particular in order to ensure that there are no races
when syncing quotas.

Signed-off-by: Steven Whitehouse
Cc: Abhijith Das

Steven Whitehouse
2013-10-04 19:29:34 +0800
bef292a72 GFS2: Remove obsolete quota tunable ... Browse Code »

There is no need for a paramater which relates to the internals
of quota to be exposed to users. The only possible use would be
to turn it up so large that the memory allocation fails. So lets
remove it and set it to a sensible value which ensures that we
don't ask for multipage allocations.

Currently the size of struct gfs2_holder means that the caluclated
value is identical to the previous default value, so there should
be no functional change.

Signed-off-by: Steven Whitehouse
Cc: Abhijith Das

Steven Whitehouse
2013-10-04 16:49:29 +0800

02 Oct, 2013

1 commit

7b9cff467 GFS2: Add allocation parameters structure ... Browse Code »

This patch adds a structure to contain allocation parameters with
the intention of future expansion of this structure. The idea is
that we should be able to add more information about the allocation
in the future in order to allow the allocator to make a better job
of placing the requests on-disk.

There is no functional difference from applying this patch.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2013-10-02 18:13:25 +0800

18 Sep, 2013

1 commit

e579ed4f4 GFS2: Introduce rbm field bii ... Browse Code »

This is a respin of the original patch. As Steve pointed out, the
introduction of field bii makes it easy to eliminate bi itself.
This revised patch does just that, replacing bi with bii.

This patch adds a new field to the rbm structure, called bii,
which is an index into the array of bitmaps for an rgrp.
This replaces *bi which was a pointer to the bitmap.
This is being done for further optimizations.

Signed-off-by: Bob Peterson
Signed-off-by: Steven Whitehouse

Bob Peterson
2013-09-18 17:39:53 +0800

17 Sep, 2013

1 commit

7e230f577 GFS2: introduce bi_blocks for optimization ... Browse Code »

This patch introduces a new field in the bitmap structure called
bi_blocks. Its purpose is to save us from constantly multiplying
bi_len by the constant GFS2_NBBY. It also paves the way for more
optimization in a future patch.

Signed-off-by: Bob Peterson
Signed-off-by: Steven Whitehouse

Bob Peterson
2013-09-17 17:15:13 +0800

10 Apr, 2013

1 commit

81ffbf654 GFS2: Add origin indicator to glock callbacks ... Browse Code »

This patch adds a bool indicating whether the demote
request was originated locally or remotely. This is then
used by the iopen ->go_callback() to make 100% sure that
it will only respond to remote callbacks.

Since ->evict_inode() uses GL_NOCACHE when it attempts to
get an exclusive lock on the iopen lock, this may result
in extra scheduling of the workqueue in case that the
exclusive promotion request failed. This patch prevents
that from happening.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2013-04-10 17:26:55 +0800

08 Apr, 2013

1 commit

16ca9412d GFS2: replace gfs2_ail structure with gfs2_trans ... Browse Code »

In order to allow transactions and log flushes to happen at the same
time, gfs2 needs to move the transaction accounting and active items
list code into the gfs2_trans structure. As a first step toward this,
this patch removes the gfs2_ail structure, and handles the active items
list in the gfs_trans structure. This keeps gfs2 from allocating an ail
structure on log flushes, and gives us a struture that can later be used
to store the transaction accounting outside of the gfs2 superblock
structure.

With this patch, at the end of a transaction, gfs2 will add the
gfs2_trans structure to the superblock if there is not one already.
This structure now has the active items fields that were previously in
gfs2_ail. This is not necessary in the case where the transaction was
simply used to add revokes, since these are never written outside of the
journal, and thus, don't need an active items list.

Also, in order to make sure that the transaction structure is not
removed while it's still in use by gfs2_trans_end, unlocking the
sd_log_flush_lock has to happen slightly later in ending the
transaction.

Signed-off-by: Benjamin Marzinski
Signed-off-by: Steven Whitehouse

Benjamin Marzinski
2013-04-08 15:46:22 +0800

04 Apr, 2013

1 commit

57c7310b8 GFS2: use kmalloc for lvb bitmap ... Browse Code »

The temp lvb bitmap was on the stack, which could
be an alignment problem for __set_bit_le. Use
kmalloc for it instead.

Signed-off-by: David Teigland
Signed-off-by: Steven Whitehouse

David Teigland
2013-04-04 16:52:14 +0800

26 Feb, 2013

1 commit

94f2f1423 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace ... Browse Code »

Pull user namespace and namespace infrastructure changes from Eric W Biederman:
"This set of changes starts with a few small enhnacements to the user
namespace. reboot support, allowing more arbitrary mappings, and
support for mounting devpts, ramfs, tmpfs, and mqueuefs as just the
user namespace root.

I do my best to document that if you care about limiting your
unprivileged users that when you have the user namespace support
enabled you will need to enable memory control groups.

There is a minor bug fix to prevent overflowing the stack if someone
creates way too many user namespaces.

The bulk of the changes are a continuation of the kuid/kgid push down
work through the filesystems. These changes make using uids and gids
typesafe which ensures that these filesystems are safe to use when
multiple user namespaces are in use. The filesystems converted for
3.9 are ceph, 9p, afs, ocfs2, gfs2, ncpfs, nfs, nfsd, and cifs. The
changes for these filesystems were a little more involved so I split
the changes into smaller hopefully obviously correct changes.

XFS is the only filesystem that remains. I was hoping I could get
that in this release so that user namespace support would be enabled
with an allyesconfig or an allmodconfig but it looks like the xfs
changes need another couple of days before it they are ready."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (93 commits)
cifs: Enable building with user namespaces enabled.
cifs: Convert struct cifs_ses to use a kuid_t and a kgid_t
cifs: Convert struct cifs_sb_info to use kuids and kgids
cifs: Modify struct smb_vol to use kuids and kgids
cifs: Convert struct cifsFileInfo to use a kuid
cifs: Convert struct cifs_fattr to use kuid and kgids
cifs: Convert struct tcon_link to use a kuid.
cifs: Modify struct cifs_unix_set_info_args to hold a kuid_t and a kgid_t
cifs: Convert from a kuid before printing current_fsuid
cifs: Use kuids and kgids SID to uid/gid mapping
cifs: Pass GLOBAL_ROOT_UID and GLOBAL_ROOT_GID to keyring_alloc
cifs: Use BUILD_BUG_ON to validate uids and gids are the same size
cifs: Override unmappable incoming uids and gids
nfsd: Enable building with user namespaces enabled.
nfsd: Properly compare and initialize kuids and kgids
nfsd: Store ex_anon_uid and ex_anon_gid as kuids and kgids
nfsd: Modify nfsd4_cb_sec to use kuids and kgids
nfsd: Handle kuids and kgids in the nfs4acl to posix_acl conversion
nfsd: Convert nfsxdr to use kuids and kgids
nfsd: Convert nfs3xdr to use kuids and kgids
...

Linus Torvalds
2013-02-26 08:00:49 +0800

13 Feb, 2013

2 commits

05e0a60d8 gfs2: Store qd_id in struct gfs2_quota_data as a struct kqid ... Browse Code »

- Change qd_id in struct gfs2_qutoa_data to struct kqid.
- Remove the now unnecessary QDF_USER bit field in qd_flags.
- Propopoage this change through the code generally making
things simpler along the way.

Cc: Steven Whitehouse
Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2013-02-13 22:15:07 +0800
fd95e81cb GFS2: Reinstate withdraw ack system ... Browse Code »

This patch reinstates the ack system which withdraw should be using. It
appears to have been accidentally forgotten when the lock module was
merged into GFS2, due to two different sysfs files having the same name.

Reported-by: David Teigland
Signed-off-by: Steven Whitehouse

Steven Whitehouse
2013-02-13 20:21:40 +0800

29 Jan, 2013

3 commits

451389909 GFS2: Use ->writepages for ordered writes ... Browse Code »

Instead of using a list of buffers to write ahead of the journal
flush, this now uses a list of inodes and calls ->writepages
via filemap_fdatawrite() in order to achieve the same thing. For
most use cases this results in a shorter ordered write list,
as well as much larger i/os being issued.

The ordered write list is sorted by inode number before writing
in order to retain the disk block ordering between inodes as
per the previous code.

The previous ordered write code used to conflict in its assumptions
about how to write out the disk blocks with mpage_writepages()
so that with this updated version we can also use mpage_writepages()
for GFS2's ordered write, writepages implementation. So we will
also send larger i/os from writeback too.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2013-01-29 18:29:17 +0800
d564053f0 GFS2: Clean up freeze code ... Browse Code »

The freeze code has not been looked at a lot recently. Upstream has
moved on, and this is an attempt to catch us back up again. There
is a vfs level interface for the freeze code which can be called
from our (obsolete, but kept for backward compatibility purposes)
sysfs freeze interface. This means freezing this way vs. doing it
from the ioctl should now work in identical fashion.

As a result of this, the freeze function is only called once
and we can drop our own special purpose code for counting the
number of freezes.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2013-01-29 18:29:05 +0800
767f433f3 GFS2: Copy gfs2_trans_add_bh into new data/meta functions ... Browse Code »

This patch copies the body of gfs2_trans_add_bh into the two newly
added gfs2_trans_add_data and gfs2_trans_add_meta functions. We can
then move the .lo_add functions from lops.c into trans.c and call
them directly.

As a result of this, we no longer need to use the .lo_add functions
at all, so that is removed from the log operations structure.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2013-01-29 18:28:28 +0800

15 Nov, 2012

2 commits

4e2f8849d GFS2: remove redundant lvb pointer ... Browse Code »

The lksb struct already contains a pointer to the lvb,
so another directly from the glock struct is not needed.

Signed-off-by: David Teigland
Signed-off-by: Steven Whitehouse

David Teigland
2012-11-15 18:17:22 +0800
dba2d70c5 GFS2: only use lvb on glocks that need it ... Browse Code »

Save the effort of allocating, reading and writing
the lvb for most glocks that do not use it.

Signed-off-by: David Teigland
Signed-off-by: Steven Whitehouse

David Teigland
2012-11-15 18:16:59 +0800

14 Nov, 2012

1 commit

fb6791d10 GFS2: skip dlm_unlock calls in unmount ... Browse Code »

When unmounting, gfs2 does a full dlm_unlock operation on every
cached lock. This can create a very large amount of work and can
take a long time to complete. However, the vast majority of these
dlm unlock operations are unnecessary because after all the unlocks
are done, gfs2 leaves the dlm lockspace, which automatically clears
the locks of the leaving node, without unlocking each one individually.
So, gfs2 can skip explicit dlm unlocks, and use dlm_release_lockspace to
remove the locks implicitly. The one exception is when the lock's lvb is
being used. In this case, dlm_unlock is called because it may update the
lvb of the resource.

Signed-off-by: David Teigland
Signed-off-by: Steven Whitehouse

David Teigland
2012-11-14 17:37:04 +0800

07 Nov, 2012

2 commits

06dfc3064 GFS2: Rename glops go_xmote_th to go_sync ... Browse Code »

[Editorial: This is a nit, but has been a minor irritation for a long time:]

This patch renames glops structure item for go_xmote_th to go_sync.
The functionality is unchanged; it's just for readability.

Signed-off-by: Bob Peterson
Signed-off-by: Steven Whitehouse

Bob Peterson
2012-11-07 21:31:57 +0800
a68a0a352 GFS2: Speed up gfs2_rbm_from_block ... Browse Code »

This patch is a rewrite of function gfs2_rbm_from_block. Rather than
looping to find the right bitmap, the code now does a few simple
math calculations.

I compared the performance of both algorithms side by side and the new
algorithm is noticeably faster. Sample instrumentation output from a
"fast" machine:

5 million calls: millisec spent: Orig: 166 New: 113
5 million calls: millisec spent: Orig: 189 New: 114

In addition, I ran postmark (on a somewhat slowr CPU) before the after
the new algorithm was put in place and postmark showed a decent
improvement:

Before the new algorithm:
-------------------------
Time:
645 seconds total
584 seconds of transactions (171 per second)

Files:
150087 created (232 per second)
Creation alone: 100000 files (2083 per second)
Mixed with transactions: 50087 files (85 per second)
49995 read (85 per second)
49991 appended (85 per second)
150087 deleted (232 per second)
Deletion alone: 100174 files (7705 per second)
Mixed with transactions: 49913 files (85 per second)

Data:
273.42 megabytes read (434.08 kilobytes per second)
852.13 megabytes written (1.32 megabytes per second)

With the new algorithm:
-----------------------
Time:
599 seconds total
530 seconds of transactions (188 per second)

Files:
150087 created (250 per second)
Creation alone: 100000 files (1886 per second)
Mixed with transactions: 50087 files (94 per second)
49995 read (94 per second)
49991 appended (94 per second)
150087 deleted (250 per second)
Deletion alone: 100174 files (6260 per second)
Mixed with transactions: 49913 files (94 per second)

Data:
273.42 megabytes read (467.42 kilobytes per second)
852.13 megabytes written (1.42 megabytes per second)

Signed-off-by: Bob Peterson
Signed-off-by: Steven Whitehouse

Bob Peterson
2012-11-07 21:31:36 +0800