Eric Lee / smarc-fsl-linux-kernel

22 Nov, 2011

1 commit

564e12b11 GFS2: decouple quota allocations from block allocations

This patch separates the code pertaining to allocations into two
parts: quota-related information and block reservations.
This patch also moves all the block reservation structure allocations to
function gfs2_inplace_reserve to simplify the code, and moves
the frees to function gfs2_inplace_release.

Signed-off-by: Bob Peterson
Signed-off-by: Steven Whitehouse

Bob Peterson
2011-11-22 18:25:21 +0800

21 Nov, 2011

1 commit

6e87ed0fc GFS2: move toward a generic multi-block allocator

This patch is a revision of the one I previously posted.
I tried to integrate all the suggestions Steve gave.
The purpose of the patch is to change function gfs2_alloc_block
(allocate either a dinode block or an extent of data blocks)
to a more generic gfs2_alloc_blocks function that can
allocate both a dinode _and_ an extent of data blocks in the
same call. This will ultimately help us create a multi-block
reservation scheme to reduce file fragmentation.

This patch moves more toward a generic multi-block allocator that
takes a pointer to the number of data blocks to allocate, plus whether
or not to allocate a dinode. In theory, it could be called to allocate
(1) a single dinode block, (2) a group of one or more data blocks, or
(3) a dinode plus several data blocks.
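The calling convention described above can be sketched in user space as follows. This is only an illustration of the "pointer to a data block count plus a dinode flag" interface, not the kernel's gfs2_alloc_blocks; the names sketch_rgrp and sketch_alloc_blocks, the 64-block group and the one-byte-per-block state array are all assumptions made for the example.

```c
#include <errno.h>
#include <stdint.h>

#define SKETCH_NBLOCKS 64u /* hypothetical, tiny "resource group" */

/* One byte per block: 0 = free, 1 = data, 2 = dinode (assumed encoding). */
struct sketch_rgrp {
    uint8_t state[SKETCH_NBLOCKS];
};

/*
 * Allocate an optional dinode block followed by up to *ndata contiguous
 * data blocks. On success, *first holds the first block allocated and
 * *ndata is updated to the number of data blocks actually obtained,
 * which may be fewer than requested.
 */
int sketch_alloc_blocks(struct sketch_rgrp *rg, uint64_t *first,
                        unsigned int *ndata, int want_dinode)
{
    unsigned int start, i, got = 0;

    if (!want_dinode && *ndata == 0)
        return -EINVAL; /* nothing was asked for */

    /* Find the first free block. */
    for (start = 0; start < SKETCH_NBLOCKS; start++)
        if (rg->state[start] == 0)
            break;
    if (start == SKETCH_NBLOCKS)
        return -ENOSPC;

    i = start;
    if (want_dinode)
        rg->state[i++] = 2;

    /* Extend the extent while the following blocks are free. */
    while (got < *ndata && i < SKETCH_NBLOCKS && rg->state[i] == 0) {
        rg->state[i++] = 1;
        got++;
    }

    *first = start;
    *ndata = got;
    return 0;
}
```

With this shape, the three cases listed in the commit message map to (want_dinode=1, *ndata=0), (want_dinode=0, *ndata=n) and (want_dinode=1, *ndata=n).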

Signed-off-by: Bob Peterson
Signed-off-by: Steven Whitehouse

Bob Peterson
2011-11-21 18:04:09 +0800

15 Nov, 2011

1 commit

3c5d785ac GFS2: combine gfs2_alloc_block and gfs2_alloc_di

GFS2 functions gfs2_alloc_block and gfs2_alloc_di do basically
the same things, with a few exceptions. This patch combines
the two functions into a slightly more generic gfs2_alloc_block.
Having one centralized block allocation function will reduce
code redundancy and make it easier to implement multi-block
reservations to reduce file fragmentation in the future.

Signed-off-by: Bob Peterson
Signed-off-by: Steven Whitehouse

Bob Peterson
2011-11-15 23:25:03 +0800

09 Nov, 2011

1 commit

79c4c379c GFS2: f_ra is always valid in dir readahead function

As a result, we don't need to test it each time.

Signed-off-by: Steven Whitehouse
Cc: Bob Peterson

Steven Whitehouse
2011-11-09 21:46:06 +0800

08 Nov, 2011

1 commit

dfe4d34b3 GFS2: Add readahead to sequential directory traversal

This patch adds read-ahead capability to GFS2's
directory hash table management. It greatly improves
performance for some directory operations. For example:
In one of my file systems that has 1000 directories, each
of which has 1000 files, time to execute a recursive
ls (time ls -fR /mnt/gfs2 > /dev/null) was reduced
from 2m2.814s on a stock kernel to 0m45.938s.

Signed-off-by: Bob Peterson
Signed-off-by: Steven Whitehouse

Bob Peterson
2011-11-08 17:52:12 +0800

21 Oct, 2011

4 commits

70b0c3656 GFS2: Use cached rgrp in gfs2_rlist_add()

Each block which is deallocated requires a call to gfs2_rlist_add()
and each of those calls was calling gfs2_blk2rgrpd() in order to
figure out which rgrp the block belonged in. This can be sped up
by making use of the rgrp cached in the inode. We also reset this
cached rgrp in case the block has changed rgrp. This should provide
a big reduction in gfs2_blk2rgrpd() calls during deallocation.
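A rough user-space sketch of the caching idea: consult the cached resource group first, and only fall back to the full lookup (resetting the cache) when the block lies outside it. The types and the names sketch_blk2rgrp, sketch_rgrp_for_block and sketch_lookups are hypothetical stand-ins, and the linear table scan merely models whatever lookup structure the real gfs2_blk2rgrpd() uses.

```c
#include <stddef.h>
#include <stdint.h>

struct sketch_rgrp {
    uint64_t start; /* first block covered by this group */
    uint64_t len;   /* number of blocks in the group */
};

/* Counts full lookups so the saving is observable. */
unsigned long sketch_lookups;

static int sketch_rgrp_contains(const struct sketch_rgrp *rg, uint64_t blk)
{
    return rg != NULL && blk >= rg->start && blk - rg->start < rg->len;
}

/* Stand-in for the expensive lookup. */
struct sketch_rgrp *sketch_blk2rgrp(struct sketch_rgrp *tbl, size_t n,
                                    uint64_t blk)
{
    sketch_lookups++;
    for (size_t i = 0; i < n; i++)
        if (sketch_rgrp_contains(&tbl[i], blk))
            return &tbl[i];
    return NULL;
}

/* Use the cached group if it still covers blk, otherwise refresh it. */
struct sketch_rgrp *sketch_rgrp_for_block(struct sketch_rgrp **cached,
                                          struct sketch_rgrp *tbl, size_t n,
                                          uint64_t blk)
{
    if (!sketch_rgrp_contains(*cached, blk))
        *cached = sketch_blk2rgrp(tbl, n, blk);
    return *cached;
}
```

Deallocating a run of blocks from the same group then costs one full lookup instead of one per block.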

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2011-10-21 19:39:39 +0800
8339ee543 GFS2: Make resource groups "append only" during life of fs

Since we have ruled out supporting online filesystem shrink,
it is possible to make the resource group list append only
during the life of a super block. This gives several benefits:

Firstly, we only need to read new rindex elements as they are added
rather than needing to reread the whole rindex file each time one
element is added.

Secondly, the rindex glock can be held for much shorter periods of
time, and is completely removed from the fast path for allocations.
The lock is taken in shared mode only when updating the resource
groups when the first allocation occurs, and after a grow has
taken place.

Thirdly, this results in a reduction in code size, and everything
gets a lot simpler to understand in this area.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2011-10-21 19:39:33 +0800
ab9bbda02 GFS2: Use ->dirty_inode()

The aim of this patch is to use the newly enhanced ->dirty_inode()
super block operation to deal with atime updates, rather than
piggy backing that code into ->write_inode() as is currently
done.

The net result is a simplification of the code in various places
and a reduction of the number of gfs2_dinode_out() calls since
this is now implied by ->dirty_inode().

Some of the mark_inode_dirty() calls have been moved under glocks
in order to take advantage of then being able to avoid locking in
->dirty_inode() when we already have suitable locks.

One consequence is that generic_write_end() now correctly deals
with file size updates, so that we do not need a separate check
for that afterwards. This also, indirectly, means that fdatasync
should work correctly on GFS2 - the current code always syncs the
metadata whether it needs to or not.

Has survived testing with postmark (with and without atime) and
also fsx.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2011-10-21 19:39:26 +0800
4c28d3380 GFS2: Clean up dir hash table reading

Since there is now only a single caller to gfs2_dir_read_data()
and it has a number of constant arguments, we can factor
those out. Also some tests relating to the inode size were
being done twice.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2011-10-21 19:39:17 +0800

15 Jul, 2011

1 commit

17d539f04 GFS2: Cache dir hash table in a contiguous buffer

This patch adds a cache for the hash table to the directory code
in order to help simplify the way in which the hash table is
accessed. This is intended to be a first step towards introducing
some performance improvements in the directory code.

There are two follow ups that I'm hoping to see fairly shortly. One
is to simplify the hash table reading code now that we always read the
complete hash table, whether we want one entry or all of them. The
other is to introduce readahead on the heads of the hash chains
which are referred to from the table.

The hash table is a maximum of 128k in size, so it is not worth trying
to read it in small chunks.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2011-07-15 16:31:48 +0800

09 May, 2011

2 commits

3d6ecb7d1 GFS2: When adding a new dir entry, inc link count if it is a subdir

This adds an increment of the link count when we add a new directory
entry, if that entry is itself a directory. This means that we no
longer need separate code to perform this operation.

Now that both adding and removing directory entries automatically
update the parent directory's link count if required, that makes
the code shorter and simpler than before.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2011-05-09 23:43:53 +0800
855d23ce2 GFS2: Make gfs2_dir_del update link count when required

When we remove an entry from a directory, we can save ourselves
some trouble if we know the type of the entry in question, since
if it is itself a directory, we can update the link count of the
parent at the same time as removing the directory entry.

In addition this patch also merges the rmdir and unlink code which
was almost identical anyway. This eliminates the calls to remove
the . and .. directory entries on each rmdir (not needed since the
directory will be deallocated, anyway) which was the only thing preventing
passing the dentry to gfs2_dir_del(). The passing of the dentry
rather than just the name allows us to figure out the type of the entry
which is being removed, and thus adjust the link count when required.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2011-05-09 23:42:37 +0800

20 Apr, 2011

4 commits

556bb1799 GFS2: move function foreach_leaf to gfs2_dir_exhash_dealloc

The previous patches made function gfs2_dir_exhash_dealloc do nothing
but call function foreach_leaf. This patch simplifies the code by
moving the entire function foreach_leaf into gfs2_dir_exhash_dealloc.

Signed-off-by: Bob Peterson
Signed-off-by: Steven Whitehouse

Bob Peterson
2011-04-20 15:54:44 +0800
ec038c826 GFS2: pass leaf_bh into leaf_dealloc

Function foreach_leaf used to look up the leaf block address and get
a buffer_head. Then it would call leaf_dealloc which did the same
lookup. This patch combines the two operations by making foreach_leaf
pass the leaf bh to leaf_dealloc.

Signed-off-by: Bob Peterson
Signed-off-by: Steven Whitehouse

Bob Peterson
2011-04-20 15:54:26 +0800
d24a7a439 GFS2: Combine transaction from gfs2_dir_exhash_dealloc

At the end of function gfs2_dir_exhash_dealloc, it was setting the dinode
type to "file" to prevent directory corruption in case of a crash.
It was doing so in its own journal transaction. This patch makes the
change occur when the last call is made to leaf_dealloc, since it needs
to rewrite the directory dinode at that time anyway.

Signed-off-by: Bob Peterson
Signed-off-by: Steven Whitehouse

Bob Peterson
2011-04-20 15:53:56 +0800
0d95326d9 GFS2: remove *leaf_call_t and simplify leaf_dealloc

Since foreach_leaf is only ever called with leaf_dealloc as its
callback, we can make it call leaf_dealloc directly. This simplifies
the code and eliminates the need for leaf_call_t, the generic call
method. This is a first small step in simplifying the directory leaf
deallocation code.

Signed-off-by: Bob Peterson
Signed-off-by: Steven Whitehouse

Bob Peterson
2011-04-20 15:53:35 +0800

18 Apr, 2011

1 commit

44ad37d69 GFS2: filesystem hang caused by incorrect lock order

This patch fixes a deadlock in GFS2 where two processes are trying
to reclaim an unlinked dinode:
One holds the inode glock and calls gfs2_lookup_by_inum trying to look
up the inode, which it can't, due to I_FREEING. The other has set
I_FREEING from vfs and is at the beginning of gfs2_delete_inode
waiting for the glock, which is held by the first. The solution is to
add a new non_block parameter to the gfs2_iget function that causes it
to return -ENOENT if the inode is being freed.
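The effect of the new parameter can be modelled with a small user-space sketch. Everything here is an assumption made for illustration: sketch_iget is not gfs2_iget, SKETCH_I_FREEING merely stands in for the VFS I_FREEING flag, and -EAGAIN is used to mark the point where the blocking variant would have to sleep.

```c
#include <errno.h>
#include <stddef.h>

#define SKETCH_I_FREEING 0x1u /* stand-in for the VFS I_FREEING state bit */

struct sketch_inode {
    unsigned int state;
};

/*
 * Look up an inode. If it is being freed, a non-blocking caller gets
 * -ENOENT straight away instead of waiting, which is what breaks the
 * circular wait described above. The blocking path is only modelled:
 * -EAGAIN marks where the real code would sleep until freeing finishes.
 */
int sketch_iget(struct sketch_inode *inode, int non_block,
                struct sketch_inode **out)
{
    *out = NULL;
    if (inode->state & SKETCH_I_FREEING) {
        if (non_block)
            return -ENOENT;
        return -EAGAIN;
    }
    *out = inode;
    return 0;
}
```

A caller that already holds the inode glock would pass non_block=1 and treat -ENOENT as "someone else is already reclaiming this inode".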

Signed-off-by: Bob Peterson
Signed-off-by: Steven Whitehouse

Bob Peterson
2011-04-18 22:23:50 +0800

20 Sep, 2010

2 commits

8d1235852 GFS2: Make . and .. qstrs constant

Rather than calculating the qstrs for . and .. each time
we need them, it's better to keep a constant version of
these and just refer to them when required.

Signed-off-by: Steven Whitehouse
Reviewed-by: Christoph Hellwig

Steven Whitehouse
2010-09-20 18:21:09 +0800
a2e0f7993 GFS2: Remove i_disksize

With the update of the truncate code, ip->i_disksize and
inode->i_size are merely copies of each other. This means
we can remove ip->i_disksize and use inode->i_size exclusively
reducing the size of a GFS2 inode by 8 bytes.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2010-09-20 18:18:29 +0800

29 Jul, 2010

2 commits

4244b52e1 GFS2: remove dependency on __GFP_NOFAIL

The k[mc]allocs in dir_split_leaf() and dir_double_exhash() are failable,
so remove __GFP_NOFAIL from their masks.

Cc: Bob Peterson
Signed-off-by: David Rientjes
Signed-off-by: Steven Whitehouse

David Rientjes
2010-07-29 16:37:18 +0800
d2a97a4e9 GFS2: Use kmalloc when possible for ->readdir()

If we don't need a huge amount of memory in ->readdir() then
we can use kmalloc rather than vmalloc to allocate it. This
should cut down on the greater overheads associated with
vmalloc for smaller directories.

We may be able to eliminate vmalloc entirely at some stage,
but this is easy to do right away.

Also using GFP_NOFS to avoid any issues with respect to deleting inodes
while under a glock, and following a suggestion from Linus to factor out
the alloc/dealloc.

I've given this a test with a variety of different sized
directories and it seems to work ok.

Cc: Andrew Morton
Cc: Nick Piggin
Cc: Prarit Bhargava
Signed-off-by: Steven Whitehouse
Signed-off-by: Linus Torvalds

Steven Whitehouse
2010-07-29 02:10:03 +0800

15 Jul, 2010

1 commit

728a756b8 GFS2: rename causes kernel Oops

This patch fixes a kernel Oops in the GFS2 rename code.

The problem was in the way the gfs2 directory code was trying
to re-use sentinel directory entries.

In the failing case, gfs2's rename function was renaming a
file to another name that had the same non-trivial length.
The file being renamed happened to be the first directory
entry on the leaf block.

First, the rename code (gfs2_rename in ops_inode.c) found the
original directory entry and decided it could do its job by
simply replacing the directory entry with another. Therefore
it determined correctly that no block allocations were needed.

Next, the rename code deleted the old directory entry prior to
replacing it with the new name. Therefore, the soon-to-be
replaced directory entry was temporarily made into a directory
entry "sentinel" or a place holder at the start of a leaf block.

Lastly, it went to re-add the replacement directory entry in
that leaf block. However, when gfs2_dirent_find_space was
looking for space in the leaf block, it used the wrong value
for the sentinel. That threw off its calculations, so it later
decided it couldn't really re-use the sentinel and therefore
had to allocate a new leaf block. But because it had previously decided
to re-use the directory entry, it didn't waste the time to
grab a new block allocation for the inode. Therefore, the
inode's i_alloc pointer was still NULL and it crashed trying to
reference it.

In the case of sentinel directory entries, the entire dirent is
reused, not just the "free space" portion of it, and therefore
the function gfs2_dirent_find_space should use the value 0
rather than GFS2_DIRENT_SIZE(0) for the actual dirent size.
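The arithmetic behind the fix can be sketched as below. This is a simplified model, not the kernel function: the struct, the `fixed` switch and the 40-byte header size are assumptions for illustration, and rec_len is assumed to be at least as large as the space counted as in use.

```c
/* Assumed on-disk dirent header size, with 8-byte rounding of entries. */
#define SKETCH_DIRENT_HDR 40u
#define SKETCH_DIRENT_SIZE(name_len) \
    ((SKETCH_DIRENT_HDR + (name_len) + 7u) & ~7u)

struct sketch_dirent {
    unsigned int rec_len;  /* total space owned by this entry */
    unsigned int name_len; /* length of the name currently stored */
    int is_sentinel;       /* entry holds no name and is fully reusable */
};

/*
 * Return 1 if a name of new_name_len bytes fits in the space this
 * entry can give up. `fixed` selects the corrected behaviour (a
 * sentinel uses 0 bytes) or the old one (a sentinel was charged for
 * an empty dirent header).
 */
int sketch_find_space(const struct sketch_dirent *d,
                      unsigned int new_name_len, int fixed)
{
    unsigned int required = SKETCH_DIRENT_SIZE(new_name_len);
    unsigned int actual = SKETCH_DIRENT_SIZE(d->name_len);

    if (d->is_sentinel)
        actual = fixed ? 0u : SKETCH_DIRENT_SIZE(0u);

    return d->rec_len - actual >= required;
}
```

For a sentinel whose rec_len is exactly the size of an entry with an 8-byte name, the old calculation reports no room for another 8-byte name while the corrected one reports that it fits, which matches the rename case described above.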

Fixing this calculation enables the reproducer programs to work
properly.

Signed-off-by: Bob Peterson
Signed-off-by: Steven Whitehouse

Bob Peterson
2010-07-15 16:07:56 +0800

14 Apr, 2010

1 commit

1a0eae884 GFS2: glock livelock

This patch fixes several GFS2 problems with the reclaiming of
unlinked dinodes. First, there were a couple of livelocks where
everything would come to a halt waiting for a glock that was
seemingly held by a process that no longer existed. In fact, the
process did exist, it just had the wrong pid number in the holder
information. Second, there was a lock ordering problem between
inode locking and glock locking. Third, glock/inode contention
could sometimes cause inodes to be improperly marked invalid by
iget_failed.

Signed-off-by: Bob Peterson

Bob Peterson
2010-04-14 23:48:05 +0800

03 Dec, 2009

1 commit

1579343a7 GFS2: Remove dirent_first() function

This function only had one caller left, and that caller only
called it for leaf blocks, hence one branch of the "if" was
never taken. In addition the call to get_leaf had already
verified the metadata type, so the function can be reduced
to a single line of code in its caller.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2009-12-03 19:57:23 +0800

20 May, 2009

1 commit

090109783 GFS2: Improve resource group error handling

This patch improves the error handling in the case where we
discover that the summary information in the resource group
doesn't match the bitmap information while in the process of
allocating blocks. Originally this resulted in a kernel bug,
but this patch changes that so that we return -EIO and print
some messages explaining what went wrong, and how to fix it.

We also remember locally not to try and allocate from the
same rgrp again, so that a subsequent allocation in a
different rgrp should succeed.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2009-05-20 17:48:47 +0800

24 Mar, 2009

1 commit

f057f6cdf GFS2: Merge lock_dlm module into GFS2

This is the big patch that I've been working on for some time
now. There are many reasons for wanting to make this change
such as:
 o Reducing overhead by eliminating duplicated fields between structures
 o Simplification of the code (reduces the code size by a fair bit)
 o The locking interface is now the DLM interface itself as proposed
   some time ago.
 o Fewer lookups of glocks when processing replies from the DLM
 o Fewer memory allocations/deallocations for each glock
 o Scope to do further optimisations in the future (but this patch is
   more than big enough for now!)

Please note that (a) this patch relates to the lock_dlm module and
not the DLM itself, that is still a separate module; and (b) that
we retain the ability to build GFS2 as a standalone single node
filesystem without requiring the DLM.

This patch needs a lot of testing, hence my keeping it back until I restarted
my -git tree after the last merge window. That way, this has the maximum
exposure before it's merged. This is (modulo a few minor bug fixes) the
same patch that I've been posting on and off for the last three months
and it's passed a number of different tests so far.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2009-03-24 19:21:14 +0800

05 Jan, 2009

3 commits

383f01fbf GFS2: Banish struct gfs2_dinode_host

The final field in gfs2_dinode_host was the i_flags field. That's
renamed to i_diskflags in order to avoid confusion with the existing
inode flags, and moved into the inode proper at a suitable location
to avoid creating a "hole".

At that point struct gfs2_dinode_host is no longer needed and as
promised (quite some time ago!) it can now be removed completely.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2009-01-05 15:38:59 +0800
c9e988867 GFS2: Move i_size from gfs2_dinode_host and rename it to i_disksize

This patch moves the i_size field out of gfs2_dinode_host and,
following the ext3 convention, renames it i_disksize.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2009-01-05 15:38:58 +0800
ad6203f2b GFS2: Move "entries" into "proper" inode

This moves the directory entry count into the proper inode.
Potentially we could get this to share the space used by
something else in the future, but this is one more step
on the way to removing the gfs2_dinode_host structure.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2009-01-05 15:38:56 +0800

10 Apr, 2008

1 commit

16c5f06f1 [GFS2] fix GFP_KERNEL misuses

There are several places where GFP_KERNEL allocations happen under a glock,
which will result in hangs if we're under memory pressure and go to re-enter the
fs in order to flush stuff out. This patch changes the culprits to GFP_NOFS to
keep this problem from happening. Thank you,

Signed-off-by: Josef Bacik
Signed-off-by: Steven Whitehouse

Josef Bacik
2008-04-10 16:55:26 +0800

31 Mar, 2008

9 commits

182fe5abd [GFS2] possible null pointer dereference fixup

gfs2_alloc_get may fail so we have to check it to prevent
NULL pointer dereference.

Signed-off-by: Cyrill Gorcunov
Signed-off-by: Steven Whitehouse

Cyrill Gorcunov
2008-03-31 17:41:28 +0800
9b8c81d1d [GFS2] Allow bmap to allocate extents

We've supported mapping of extents when no block allocation is required
for some time. This patch extends that to mapping of extents when an
allocation has been requested. In that case we try to allocate as many
blocks as are requested, but we might return fewer in case there is
something preventing us from returning the complete amount (e.g. an
already allocated block is in the way).

Currently the only code path which can actually request multiple data
blocks in a single bmap call is the page_mkwrite path and even then it
only happens if there are multiple blocks per page. What this patch does
do however, is merge the allocation requests for metadata (growing the
metadata tree in either height or depth) with the allocation of the data
blocks in the case that both are needed. This results in lower overheads
even in the single block allocation case.

The one thing which we can't handle here at the moment is unstuffing. I
would like to be able to do that, but the problem which arises is that
in order to unstuff one has to get a locked page from the page cache
which results in locking problems in the (usual) case that the caller is
holding the page lock on the page it wishes to map. So that case will
have to be addressed in future patches.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2008-03-31 17:41:14 +0800
bb16b342b [GFS2] be*_add_cpu conversion

Replace all:
    big_endian_variable = cpu_to_beX(beX_to_cpu(big_endian_variable) +
                                     expression_in_cpu_byteorder);
with:
    beX_add_cpu(&big_endian_variable, expression_in_cpu_byteorder);
Generated with a semantic patch.
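For readers without the kernel headers to hand, the 32-bit case of this helper can be approximated in user space. The function below is a sketch under the assumption that htonl()/ntohl() play the role of cpu_to_be32()/be32_to_cpu(); sketch_be32_add_cpu is a made-up name, not the kernel's be32_add_cpu.

```c
#include <arpa/inet.h> /* htonl, ntohl */
#include <stdint.h>

/*
 * Add a CPU-byte-order value to a big-endian variable in place:
 * convert to host order, add, convert back.
 */
void sketch_be32_add_cpu(uint32_t *be_var, uint32_t val)
{
    *be_var = htonl(ntohl(*be_var) + val);
}
```

The open-coded form names the variable twice, so the helper removes one opportunity to mix up two different fields in a single statement.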

Signed-off-by: Marcin Slusarz
Signed-off-by: Steven Whitehouse

Marcin Slusarz
2008-03-31 17:41:03 +0800
77658aad2 [GFS2] Eliminate (almost) duplicate field from gfs2_inode

The blocks counter is almost a duplicate of the i_blocks
field in the VFS inode. The only difference is that i_blocks
can be only 32bits long for 32bit arch without large single file
support. Since GFS2 doesn't handle the non-large single file
case (for 32 bit anyway) this adds a new config dependency on
64BIT || LSF. This has always been the case, however we've never
explicitly said so before.

Even if we do add support for the non-LSF case, we will still
not require this field to be duplicated since we will not be
able to access oversized files anyway.

So the net result of all this is that we shave 8 bytes from a gfs2_inode
and get our config deps correct.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2008-03-31 17:40:55 +0800
b45e41d7d [GFS2] Add extent allocation to block allocator

Rather than having to allocate a single block at a time, this patch
allows the block allocator to allocate an extent. Since there is
no difference (so far as the block allocator is concerned) between
data blocks and indirect blocks, it is possible to allocate a single
extent and for the caller to unrevoke just the blocks required
for indirect blocks.

Currently the only bit of GFS2 to make use of this feature is the
build height function. The intention is that gfs2_block_map will
be changed to make use of this feature in future patches.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2008-03-31 17:40:47 +0800
1639431a3 [GFS2] Merge gfs2_alloc_meta and gfs2_alloc_data

Thanks to the preceding patches, the only difference between
these two functions is their name. We can thus merge them
and call the new function gfs2_alloc_block to reflect the
fact that it can allocate either kind of block.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2008-03-31 17:40:45 +0800
5731be53e [GFS2] Update gfs2_trans_add_unrevoke to accept extents

By adding an extra argument to gfs2_trans_add_unrevoke we can now
specify an extent length of blocks to unrevoke. This means that
we only need to make one pass through the list for each extent
rather than each block. Currently the only extent length which
is used is 1, but that will change in the future.

Also gfs2_trans_add_unrevoke is removed from gfs2_alloc_meta
since it's the only difference between this and gfs2_alloc_data
which is left. This will allow a future patch to merge these
two functions into one (i.e. one call to allocate both data
and metadata in a single extent in the future).
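The "one pass per extent rather than per block" point can be illustrated with a user-space sketch. The revoke list is modelled as a plain array of block numbers and sketch_unrevoke is a hypothetical name; the kernel's data structures and gfs2_trans_add_unrevoke signature are not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Remove every block in [blkno, blkno + len) from the revoke list in a
 * single pass, compacting the array in place. Returns the new length.
 * Calling a one-block version len times would scan the list len times.
 */
size_t sketch_unrevoke(uint64_t *revoked, size_t n, uint64_t blkno,
                       unsigned int len)
{
    size_t kept = 0;

    for (size_t i = 0; i < n; i++) {
        /* Skip entries that fall inside the extent being unrevoked. */
        if (revoked[i] >= blkno && revoked[i] - blkno < len)
            continue;
        revoked[kept++] = revoked[i];
    }
    return kept;
}
```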

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2008-03-31 17:40:42 +0800
9a0045088 [GFS2] Shrink & rename di_depth

This patch forms a pair with the previous patch which shrunk
di_height. Like that patch di_depth is renamed i_depth and moved
into struct gfs2_inode directly. Also the field goes from 16 bits
to 8 bits since it is also limited to a max value which is rather
small (17 in this case). In addition we also now validate the field
against this maximum value when it's read in.
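A minimal sketch of the read-time validation, assuming the on-disk value has already been converted to CPU byte order. The names and the choice of -EIO as the error code are assumptions for the example; only the maximum of 17 and the 16-bit to 8-bit narrowing come from the description above.

```c
#include <errno.h>
#include <stdint.h>

#define SKETCH_DIR_MAX_DEPTH 17u /* maximum quoted in the commit message */

/*
 * Narrow the 16-bit on-disk depth to the 8-bit in-core field, refusing
 * values above the maximum so a corrupt dinode cannot be truncated
 * silently into a plausible-looking depth.
 */
int sketch_load_depth(uint16_t disk_depth, uint8_t *i_depth)
{
    if (disk_depth > SKETCH_DIR_MAX_DEPTH)
        return -EIO;
    *i_depth = (uint8_t)disk_depth;
    return 0;
}
```

Without the check, an on-disk value such as 0x0105 would be stored as 5 after narrowing.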

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2008-03-31 17:40:31 +0800
fe6c991c5 [GFS2] Get rid of unneeded parameter in gfs2_rlist_alloc

This patch removes the unnecessary parameter from function
gfs2_rlist_alloc. The parameter was always passed in as 0.

Signed-off-by: Bob Peterson
Signed-off-by: Steven Whitehouse

Bob Peterson
2008-03-31 17:39:49 +0800

08 Feb, 2008

1 commit

e231c2ee6 Convert ERR_PTR(PTR_ERR(p)) instances to ERR_CAST(p)

Convert instances of ERR_PTR(PTR_ERR(p)) to ERR_CAST(p) using:

perl -spi -e 's/ERR_PTR[(]PTR_ERR[(](.*)[)][)]/ERR_CAST(\1)/' `grep -rl 'ERR_PTR[(]*PTR_ERR' fs crypto net security`
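To show why the two forms are interchangeable, here is a user-space approximation of the error-pointer helpers. The sketch_ names are stand-ins written for this example and are not the kernel's definitions; the 4095 limit mirrors the usual convention that small negative errno values are encoded at the top of the address space.

```c
#include <errno.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *sketch_err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

/* Decode the errno value from an error pointer. */
static inline long sketch_ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

static inline int sketch_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Re-type an error pointer without a round trip through long. */
static inline void *sketch_err_cast(const void *p)
{
    return (void *)(uintptr_t)p;
}

struct sketch_dentry;
struct sketch_inode;

/* Propagate an error from one pointer type to another. */
struct sketch_inode *sketch_lookup(struct sketch_dentry *d)
{
    if (sketch_is_err(d))
        return sketch_err_cast(d); /* was: err_ptr(ptr_err(d)) */
    return (struct sketch_inode *)0;
}
```

The cast form states the intent (same error, different pointer type) directly instead of decoding and re-encoding the value.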

Signed-off-by: David Howells
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

David Howells
2008-02-08 00:42:26 +0800