25 Jan, 2008
6 commits
-
Here is a patch for the latest upstream GFS2 code:
The journal extent map needs to be initialized sooner than it
currently is. Otherwise failed mount attempts (e.g. not enough
journals, etc.) may panic trying to access the uninitialized list.
Signed-off-by: Bob Peterson
Signed-off-by: Steven Whitehouse
-
This is a small correction to my previously posted patch1.
It just changes a divide to a shift. It's faster and doesn't
introduce odd dependencies on 32-bit compiles.
Signed-off-by: Bob Peterson
Signed-off-by: Steven Whitehouse
-
This patch eliminates the unneeded sd_statfs_mutex mutex but preserves
the ordering as discussed.
Signed-off-by: Bob Peterson
Signed-off-by: Steven Whitehouse
-
This patch saves a little time when gfs2 writes to the journals by
keeping a mapping between logical and physical blocks on disk.
That's better than constantly looking up indirect pointers in
buffers when the journals are several levels of indirection deep
(which they typically are).
Signed-off-by: Bob Peterson
Signed-off-by: Steven Whitehouse
-
This patch changes the counter which keeps track of the free
blocks in the journal to an atomic_t in preparation for the
following patch which will update the log reservation code.
Signed-off-by: Steven Whitehouse
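As a rough illustration of why an atomic counter suits this job, here is a minimal user-space sketch using C11 atomics in place of the kernel's atomic_t. The names `log_space`, `log_try_reserve` and `log_release` are hypothetical and are not taken from the GFS2 source; the actual reservation code may look quite different.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the journal's free-block counter. */
struct log_space {
    atomic_int blks_free;
};

/* Try to reserve `blks` journal blocks without taking a lock.
 * Returns true on success, false if not enough blocks are free. */
bool log_try_reserve(struct log_space *ls, int blks)
{
    int free_blks = atomic_load(&ls->blks_free);

    while (free_blks >= blks) {
        /* On failure, free_blks is refreshed with the current value
         * and the loop re-checks whether enough space remains. */
        if (atomic_compare_exchange_weak(&ls->blks_free, &free_blks,
                                         free_blks - blks))
            return true;
    }
    return false;
}

/* Return previously reserved blocks to the pool. */
void log_release(struct log_space *ls, int blks)
{
    atomic_fetch_add(&ls->blks_free, blks);
}
```

With a plain integer the check and the subtraction would need a lock around them; the compare-and-exchange loop lets concurrent reservers race safely on the counter alone.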
-
The only reason for adding glocks to the journal was to keep track
of which locks required a log flush prior to release. We add a
flag to the glock to allow this check to be made in a simpler way.
This reduces the size of a glock (by 12 bytes on i386, 24 on x86_64)
and means that we can avoid extra work during the journal flush.
Signed-off-by: Steven Whitehouse
10 Oct, 2007
12 commits
-
There is a possible deadlock between two processes on the same node, where one
process is deleting an inode, and another process is looking for allocated but
unused inodes to delete in order to create more space.
Process A does an iput() on inode X, and its i_count drops to 0. This causes
iput_final() to be called, which puts the inode into state I_FREEING at
generic_delete_inode(). There is no point between when iput_final() is called and
when I_FREEING is set where GFS2 could acquire any glocks. Once I_FREEING is
set, no other process on that node can successfully look up that inode until
the delete finishes.
Process B locks the resource group for the same inode in get_local_rgrp(),
which is called by gfs2_inplace_reserve_i().
Process A tries to lock the resource group for the inode in
gfs2_dinode_dealloc(), but it's already locked by process B.
Process B waits in find_inode for the inode to have the I_FREEING state cleared.
Deadlock.
This patch solves the problem by adding an alternative to gfs2_iget(),
gfs2_iget_skip(), that simply skips any inodes that are in the I_FREEING
state.
The alternate test function is just like the original one, except that
it fails if the inode is being freed, and sets a skipped flag. The alternate
set function is just like the original, except that it fails if the skipped
flag is set. Only try_rgrp_unlink() calls gfs2_iget_skip() instead of
gfs2_iget().
Signed-off-by: Benjamin E. Marzinski
Signed-off-by: Steven Whitehouse
-
This patch cleans up the code for writing journaled data into the log.
It also removes the need to allocate a small "tag" structure for each
block written into the log. Instead we just keep count of the outstanding
I/O so that we can be sure that it's all been written at the correct time.
Another result of this patch is that a number of ll_rw_block() calls
have become submit_bh() calls, closing some races at the same time.
Signed-off-by: Steven Whitehouse
-
The following patch removes the ordered write processing from
databuf_lo_before_commit() and moves it to log.c. This has the effect of
greatly simplifying databuf_lo_before_commit() as well as potentially
making the ordered write code more efficient.
As a side effect of this, it's now possible to remove ordered buffers
from the ordered buffer list at any time, so we now make use of this in
invalidatepage and releasepage to ensure timely release of these
buffers.
Signed-off-by: Steven Whitehouse
-
When you try to mount gfs2 with -o garbage, the mount fails and the gfs2
superblock is deallocated and becomes NULL. The vfs comes around later
on and calls gfs2_kill_sb. At this point the hidden gfs2 superblock
pointer (sb->s_fs_info) is NULL and dereferencing it through
gfs2_meta_syncfs causes the panic. (The other function call, to
gfs2_delete_debugfs_file(), succeeds because this function already checks
for a NULL pointer.)
Signed-off-by: Abhijith Das
Signed-off-by: Steven Whitehouse
-
This patch fixes some bugs relating to journaled data files by cleaning
up the gfs2_invalidatepage() and gfs2_releasepage() functions. We now
never block during gfs2_releasepage(); instead we always either release
or refuse to release depending on the status of the buffers.
This fixes Red Hat bugzillas #248969 and #252392.
Signed-off-by: Steven Whitehouse
Cc: Bob Peterson
-
Signed-off-by: Denis Cheng
Signed-off-by: Steven Whitehouse
-
The original code could work, but I think this code could work better.
Signed-off-by: Denis Cheng
Signed-off-by: Steven Whitehouse
-
sb->s_fs_info is a void pointer, thus the type cast is not needed.
Signed-off-by: Denis Cheng
Signed-off-by: Steven Whitehouse
-
Signed-off-by: Denis Cheng
Signed-off-by: Steven Whitehouse
-
This is for bugzilla bug #248176: GFS2: invalid metadata block
Patches 1 thru 3 were accepted upstream, but there were problems
with 4 and 5. Those issues have been resolved and now the recovery
tests are passing without errors. This code has gone through
41 * 3 successful gfs2 recovery tests before it hit an
unrelated (openais) problem. I'm continuing to test it.
This is a complete rewrite of patch 5 for bug #248176, written by
Steve Whitehouse. This is referred to in the bugzilla record as
"new 6" and "a different solution".
The problem was that the journal inodes, although protected by
a glock, were not synched with the other nodes because they don't
use the inode glock synch operations (i.e. no "glops" were defined).
Therefore, journal recovery on a journal-recovering node was causing
the blocks to get out of sync with the node that was actually trying
to use that journal as it comes back up from a reboot.
There are two possible solutions: (1) To make the journals use the
normal inode glock sync operations, or (2) To make the journal
operations take effect immediately (i.e. no caching). Although
option 1 works, it turns out to be a lot more code. Steve opted
for option 2, which is much simpler and therefore less prone to
regression errors.
Signed-off-by: Bob Peterson
Signed-off-by: Steven Whitehouse
-
We only need a single gfs2_scand process rather than the one
per filesystem which we had previously. As a result the parameter
determining the frequency of gfs2_scand runs becomes a module
parameter rather than a mount parameter as it was before.
Signed-off-by: Steven Whitehouse
-
Signed-off-by: Denis Cheng
Signed-off-by: Steven Whitehouse
09 Jul, 2007
5 commits
-
The GFS2 lookup code doesn't ask for a shared inode glock. This implies that
during in-memory inode creation for an existing file, GFS2 will not read the
inode contents in from disk. This leaves no_formal_ino uninitialized at
lookup time. The uninitialized no_formal_ino is subsequently encoded
into the file handle. Clients will get an ESTALE error whenever they try to
access these files.
Signed-off-by: S. Wendy Cheng
Signed-off-by: Steven Whitehouse
-
This patch fixes bug 243131: Can't mount GFS2 file system on AoE device.
When using AoE devices with lock_nolock, there is no locking table, so
gfs2 (and gfs1) uses the superblock s_id. This turns out to be the device
name in some cases. In the case of AoE, the device name contains a slash
(e.g. "etherd/e1.1p2"), which is an invalid character when we try to
register the table in sysfs. This patch replaces the "/" with an underscore.
Rather than add a new variable to the stack, I'm just reusing a (char *)
variable that's no longer used: table.
This code has been tested on the failing system using a RHEL5 patch.
The upstream code was tested by using gfs2_tool sb to interject a "/"
into the table name of a clustered gfs2 file system.
Signed-off-by: Bob Peterson
Signed-off-by: Steven Whitehouse
-
This adds a nanosecond timestamp feature to the GFS2 filesystem. Due
to the way that the on-disk format works, older filesystems will just
appear to have this field set to zero. When a filesystem is mounted by an
older version of GFS2, that version will simply ignore the extra fields, so
the filesystem will again appear to have whole second resolution. The change
is therefore trivially backward compatible.
Signed-off-by: Steven Whitehouse
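The compatibility argument can be sketched in user-space C. The layout below is purely hypothetical (a 12-byte slot in which older writers always left the last four bytes zeroed) and is not the real GFS2 on-disk dinode format; the function names are likewise invented for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical on-disk timestamp slot. Older writers only fill sec[]
 * and leave the remaining bytes zeroed. */
struct disk_stamp {
    uint8_t sec[8];   /* big-endian whole seconds */
    uint8_t nsec[4];  /* big-endian nanoseconds; zero on old filesystems */
};

/* Store the low `n` bytes of `v` big-endian at `p`. */
static void put_be(uint8_t *p, int n, uint64_t v)
{
    for (int i = n - 1; i >= 0; i--) {
        p[i] = (uint8_t)(v & 0xff);
        v >>= 8;
    }
}

/* Load an `n`-byte big-endian value from `p`. */
static uint64_t get_be(const uint8_t *p, int n)
{
    uint64_t v = 0;

    for (int i = 0; i < n; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Old-format writer: knows nothing about nanoseconds. */
void stamp_write_old(struct disk_stamp *d, uint64_t sec)
{
    memset(d, 0, sizeof(*d));
    put_be(d->sec, 8, sec);
}

/* New-format writer: fills in both fields. */
void stamp_write_new(struct disk_stamp *d, uint64_t sec, uint32_t nsec)
{
    put_be(d->sec, 8, sec);
    put_be(d->nsec, 4, nsec);
}

/* Old-format reader: ignores the extra field entirely. */
uint64_t stamp_read_old(const struct disk_stamp *d)
{
    return get_be(d->sec, 8);
}

/* New-format reader: returns seconds and stores nanoseconds in *nsec. */
uint64_t stamp_read_new(const struct disk_stamp *d, uint32_t *nsec)
{
    *nsec = (uint32_t)get_be(d->nsec, 4);
    return get_be(d->sec, 8);
}
```

A record written by the old writer reads back through the new reader with zero nanoseconds, and a record written by the new writer reads back through the old reader with whole-second resolution, which is the behaviour the commit message describes.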
-
This patch fixes some sign issues which were accidentally introduced
into the quota & statfs code during the endianness annotation process.
Also included is a general clean up which moves all of the _host
structures out of gfs2_ondisk.h (where they should not have been to
start with) and into the places where they are actually used (often only
one place). Also those _host structures which are not required any more
are removed entirely (which is the eventual plan for all of them).
The conversion routines from ondisk.c are also moved into the places
where they are actually used, which for almost every one was just one
single place, so all those are now static functions. This also cleans up
the end of gfs2_ondisk.h which no longer needs the #ifdef __KERNEL__.
The net result is a reduction of about 100 lines of code, many functions
now marked static plus the bug fixes as mentioned above. For good
measure I ran the code through sparse after making these changes to
check that there are no warnings generated.
This fixes Red Hat bz #239686
Signed-off-by: Steven Whitehouse
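The commit message does not spell out what the sign issues were, so the following is only a generic user-space sketch of the kind of care needed when a signed quantity (such as a change in a free-block count) is stored big-endian on disk. All names here are hypothetical and the code is not taken from GFS2.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Decode 8 big-endian bytes as an unsigned value. */
static uint64_t be64_decode(const uint8_t p[8])
{
    uint64_t v = 0;

    for (int i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Decode 8 big-endian bytes as a signed (two's complement) value.
 * memcpy reinterprets the bits without relying on an out-of-range cast. */
int64_t be64_decode_signed(const uint8_t p[8])
{
    uint64_t u = be64_decode(p);
    int64_t s;

    memcpy(&s, &u, sizeof(s));
    return s;
}

/* Apply an on-disk delta, which may be negative, to a running total. */
int64_t apply_change(int64_t total, const uint8_t delta_be[8])
{
    return total + be64_decode_signed(delta_be);
}
```

If the delta were left in an unsigned type, a stored value of -3 would be seen as a very large positive number in any comparison or display, which is why the host-side type matters as much as the byte order.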
-
This patch cleans up the inode number handling code. The main difference
is that instead of looking up the inodes using a struct gfs2_inum_host
we now use just the no_addr member of this structure. The tests relating
to no_formal_ino can then be done by the calling code. This has
advantages in that we want to do different things in different code
paths if the no_formal_ino doesn't match. In the NFS patch we want to
return -ESTALE, but in the ->lookup() path, it's a bug in the fs if the
no_formal_ino doesn't match and thus we can withdraw in this case.
In order to later fix bz #201012, we need to be able to look up an inode
without knowing no_formal_ino, as the only information that is known to
us is the on-disk location of the inode in question.
This patch will also help us to fix bz #236099 at a later date by
cleaning up a lot of the code in that area.
There are no user visible changes as a result of this patch and there
are no changes to the on-disk format either.
Signed-off-by: Steven Whitehouse
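The split between "look up by no_addr" and "let the caller judge no_formal_ino" can be sketched as follows. This is a simplified user-space model with hypothetical names (`inode_rec`, `inode_find`, `nfs_style_get`, `lookup_style_get`); the real code works on kernel inodes, and the `withdraw` flag merely stands in for withdrawing the filesystem.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory inode record. */
struct inode_rec {
    uint64_t no_addr;        /* on-disk location */
    uint64_t no_formal_ino;  /* identifier checked by the caller */
};

/* Look up purely by on-disk location; no_formal_ino is not consulted. */
const struct inode_rec *inode_find(const struct inode_rec *tbl, size_t n,
                                   uint64_t no_addr)
{
    for (size_t i = 0; i < n; i++)
        if (tbl[i].no_addr == no_addr)
            return &tbl[i];
    return NULL;
}

/* NFS-style caller: a mismatch just means the file handle is stale. */
int nfs_style_get(const struct inode_rec *tbl, size_t n,
                  uint64_t no_addr, uint64_t no_formal_ino)
{
    const struct inode_rec *ip = inode_find(tbl, n, no_addr);

    if (!ip)
        return -ENOENT;
    if (ip->no_formal_ino != no_formal_ino)
        return -ESTALE;
    return 0;
}

/* lookup-style caller: a mismatch indicates an inconsistent filesystem,
 * so it is reported through *withdraw instead of being tolerated. */
int lookup_style_get(const struct inode_rec *tbl, size_t n,
                     uint64_t no_addr, uint64_t no_formal_ino,
                     int *withdraw)
{
    const struct inode_rec *ip = inode_find(tbl, n, no_addr);

    if (!ip)
        return -ENOENT;
    if (ip->no_formal_ino != no_formal_ino) {
        *withdraw = 1;
        return -EIO;
    }
    return 0;
}
```

Keeping the comparison out of `inode_find` is what lets the two callers react differently to the same mismatch, and it also allows a lookup when only the disk address is known.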
01 May, 2007
2 commits
-
The patch below consists of the following changes (in code order):
1. I fixed a minor compiler warning regarding the printing of
a kernel symbol address.
2. I implemented a suggestion from Dave Teigland that moves
the debugfs information for gfs2 into a subdirectory so
we can easily expand our use of debugfs in the future.
The current code keeps the glock information in:
/debug/gfs2/
With the patch, the new code keeps the glock information in:
/debug/gfs2//glock
That will allow us to create more debugfs files in the future.
3. This fixes a bug whereby a failed mount attempt causes the
debugfs file to not be deleted. Failed mount attempts should
always clean up after themselves, including deleting the
debugfs file and/or directory.
Signed-off-by: Bob Peterson
Signed-off-by: Steven Whitehouse
-
The attached patch resolves bz 228540. This adds the capability
for gfs2 to dump gfs2 locks through the debugfs file system.
This used to exist in gfs1 as "gfs_tool lockdump" but it's missing from
gfs2 because all the ioctls were stripped out. Please see the bugzilla
for more history about the fix. This patch is also attached to the bugzilla
record.
The patch is against Steve Whitehouse's latest nmw git tree kernel
(2.6.21-rc1) and has been tested on system trin-10.
Signed-off-by: Robert Peterson
Signed-off-by: Steven Whitehouse
08 Mar, 2007
1 commit
-
Patch for the 2.6.20 stable tree that adds a missing newline to one of
the printk messages in fs/gfs2/ops_fstype.c.
Signed-off-by: Richard Fearn
Signed-off-by: Steven Whitehouse
12 Jan, 2007
1 commit
-
Revert bd_mount_mutex back to a semaphore so that xfs_freeze -f /mnt/newtest;
xfs_freeze -u /mnt/newtest works safely and doesn't produce lockdep warnings.
(XFS unlocks the semaphore from a different task, by design. The mutex
code warns about this.)
Signed-off-by: Dave Chinner
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
30 Nov, 2006
1 commit
-
Signed-off-by: Al Viro
Signed-off-by: Steven Whitehouse
20 Oct, 2006
2 commits
-
Don't dereference new->s_root when we do know it's NULL.
Spotted by the Coverity checker.
Signed-off-by: Adrian Bunk
Signed-off-by: Steven Whitehouse
-
The Coverity checker spotted this unused variable.
Signed-off-by: Adrian Bunk
Signed-off-by: Steven Whitehouse
02 Oct, 2006
1 commit
-
For some reason we had two different sets of code for reading in the
superblock. This removes one of them in favour of the other. Also we
don't need the temporary buffer for the sb since we already have one
in the gfs2 sb itself.
Signed-off-by: Steven Whitehouse
25 Sep, 2006
1 commit
-
As per Andrew Morton's request, removed trailing whitespace.
Cc: Andrew Morton
Signed-off-by: Steven Whitehouse
19 Sep, 2006
1 commit
-
lm_interface.h has a few out-of-tree clients such as GFS1
and userland tools.
Right now, these clients keep a copy of the file in their build tree
that can go out of sync.
Move lm_interface.h to include/linux, export it to userland and
clean up fs/gfs2 to use the new location.
Signed-off-by: Fabio M. Di Nitto
Signed-off-by: Steven Whitehouse
08 Sep, 2006
3 commits
-
This was missed in an earlier patch when changing over from vmalloc
to kmalloc for the superblock.
Signed-off-by: Steven Whitehouse
-
Exactly as the subject line says.
Signed-off-by: Steven Whitehouse
-
There are several reasons why we want to do this:
- Firstly it's large and thus we'll scale better with multiple
  GFS2 fs mounted at the same time
- Secondly it's easier to scale its size as required (that's a plan
  for later patches)
- Thirdly, we can use kzalloc rather than vmalloc when allocating
  the superblock (it's now only 4888 bytes)
- Fourth it's all part of my plan to eventually be able to use RCU
  with the glock hash.
Signed-off-by: Steven Whitehouse
05 Sep, 2006
2 commits
-
As per Jan Engelhardt's fifth email. This has most of the changes
recommended, which is the removal of casts which are not required,
some indenting fixes and similar.
Cc: Jan Engelhardt
Signed-off-by: Steven Whitehouse
-
This makes everything consistent.
Signed-off-by: Steven Whitehouse
01 Sep, 2006
1 commit
-
As per comments from Jan Engelhardt this
updates the copyright message to say "version" in full rather than
"v.2". Also incore.h has been updated to remove forward structure
declarations which are not required.
The gfs2_quota_lvb structure has now had endianness annotations added
to it. Also quota.c has been updated so that we now store the
lvb data locally in endian independent format to avoid needing
a structure in host endianness too. As a result the endianness
conversions are done as required at various points and thus the
conversion routines in lvb.[ch] are no longer required. I've
moved the one remaining constant in lvb.h that's used into lm.h
and removed the unused lvb.[ch].
I have not changed the HIF_ constants. That is left to a later patch
which I hope will unify the gh_flags and gh_iflags fields of the
struct gfs2_holder.
Cc: Jan Engelhardt
Signed-off-by: Steven Whitehouse
26 Aug, 2006
1 commit
-
This patch allows the simultaneous mounting of gfs2meta and gfs2
filesystems. A restriction however is that a gfs2meta fs may only be
mounted if its corresponding gfs2 filesystem is also mounted. Also, a
gfs2 filesystem cannot be unmounted before its gfs2meta filesystem.
Signed-off-by: Abhijith Das
Signed-off-by: Steven Whitehouse