25 Jan, 2008

6 commits

  • Here is a patch for the latest upstream GFS2 code:
    The journal extent map needs to be initialized sooner than it
    currently is. Otherwise failed mount attempts (e.g. not enough
    journals, etc.) may panic trying to access the uninitialized list.

    Signed-off-by: Bob Peterson
    Signed-off-by: Steven Whitehouse

    Bob Peterson
     
  • This is a small correction to my previously posted patch1.
    It just changes a divide to a shift. It's faster and doesn't
    introduce odd dependencies on 32-bit compiles.

    Signed-off-by: Bob Peterson
    Signed-off-by: Steven Whitehouse

    Bob Peterson
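
    A minimal sketch of the divide-to-shift change described above, assuming
    the divisor is a power-of-two block size. The helper names and the
    bsize_shift parameter are hypothetical and are not taken from the patch.
    On 32-bit builds a plain 64-bit division is typically compiled into a
    call to a compiler support routine, which is one way a divide can
    introduce an unexpected dependency; the shift avoids it.

```c
#include <stdint.h>

/* Hypothetical helper: whole blocks covered by `bytes` when the block
 * size is (1 << bsize_shift). Uses a shift, so no 64-bit divide. */
static inline uint64_t bytes_to_blocks(uint64_t bytes, unsigned int bsize_shift)
{
    return bytes >> bsize_shift;
}

/* The division form, shown only for comparison. The two agree whenever
 * bsize == (1 << bsize_shift). */
static inline uint64_t bytes_to_blocks_div(uint64_t bytes, uint32_t bsize)
{
    return bytes / bsize;
}
```

    For a 4096-byte block size the shift is 12, so bytes_to_blocks(8192, 12)
    and bytes_to_blocks_div(8192, 4096) both give 2.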
     
  • This patch eliminates the unneeded sd_statfs_mutex mutex but preserves
    the ordering as discussed.

    Signed-off-by: Bob Peterson
    Signed-off-by: Steven Whitehouse

    Bob Peterson
     
  • This patch saves a little time when gfs2 writes to the journals by
    keeping a mapping between logical and physical blocks on disk.
    That's better than constantly looking up indirect pointers in
    buffers, when the journals use several levels of indirection
    (which they typically do).

    Signed-off-by: Bob Peterson
    Signed-off-by: Steven Whitehouse

    Bob Peterson
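
    The logical-to-physical mapping can be pictured as a small table of
    extents. The sketch below is an illustration only: struct jext and
    jext_lookup() are hypothetical names, and a plain array stands in for
    whatever list structure the real code keeps.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical extent record: `blocks` consecutive logical blocks
 * starting at `lblock` live at consecutive disk blocks from `dblock`. */
struct jext {
    uint64_t lblock;
    uint64_t dblock;
    uint64_t blocks;
};

/* Translate a logical journal block to a disk block.
 * Returns 0 and stores the result in *dblock, or -1 if unmapped. */
static int jext_lookup(const struct jext *map, size_t n,
                       uint64_t lblock, uint64_t *dblock)
{
    for (size_t i = 0; i < n; i++) {
        if (lblock >= map[i].lblock &&
            lblock - map[i].lblock < map[i].blocks) {
            *dblock = map[i].dblock + (lblock - map[i].lblock);
            return 0;
        }
    }
    return -1;
}
```

    Once the table is built, each journal write costs one scan of a short
    table instead of a walk through the indirect blocks.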
     
  • This patch changes the counter which keeps track of the free
    blocks in the journal to an atomic_t in preparation for the
    following patch which will update the log reservation code.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
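
    To illustrate why an atomic counter helps here, the sketch below reserves
    and releases journal blocks without a lock. It uses C11 <stdatomic.h> as
    a userspace stand-in for the kernel's atomic_t, and log_reserve() and
    log_release() are hypothetical names, not the functions from this patch
    or the one that follows it.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Try to take `want` blocks from the free count.
 * Returns true on success, false if too few blocks are free. */
static bool log_reserve(atomic_int *free_blocks, int want)
{
    int cur = atomic_load(free_blocks);

    while (cur >= want) {
        /* On failure, cur is reloaded with the current value and
         * the loop re-checks it before retrying. */
        if (atomic_compare_exchange_weak(free_blocks, &cur, cur - want))
            return true;
    }
    return false;
}

/* Give `n` blocks back to the free count. */
static void log_release(atomic_int *free_blocks, int n)
{
    atomic_fetch_add(free_blocks, n);
}
```

    The compare-and-exchange loop ensures the counter never drops below zero
    even when several callers reserve at the same time.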
     
  • The only reason for adding glocks to the journal was to keep track
    of which locks required a log flush prior to release. We add a
    flag to the glock to allow this check to be made in a simpler way.

    This reduces the size of a glock (by 12 bytes on i386, 24 on x86_64)
    and means that we can avoid extra work during the journal flush.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

10 Oct, 2007

12 commits

  • There is a possible deadlock between two processes on the same node, where one
    process is deleting an inode, and another process is looking for allocated but
    unused inodes to delete in order to create more space.

    Process A does an iput() on inode X, and its i_count drops to 0. This causes
    iput_final() to be called, which puts the inode into state I_FREEING at
    generic_delete_inode(). There is no point between when iput_final() is called
    and when I_FREEING is set where GFS2 could acquire any glocks. Once I_FREEING is
    set, no other process on that node can successfully look up that inode until
    the delete finishes.

    Process B locks the resource group for the same inode in get_local_rgrp(),
    which is called by gfs2_inplace_reserve_i().

    Process A tries to lock the resource group for the inode in
    gfs2_dinode_dealloc(), but it's already locked by process B.

    Process B waits in find_inode for the inode to have the I_FREEING state cleared.

    Deadlock.

    This patch solves the problem by adding an alternative to gfs2_iget(),
    gfs2_iget_skip(), that simply skips any inodes that are in the I_FREEING
    state. The alternate test function is just like the original one, except that
    it fails if the inode is being freed, and sets a skipped flag. The alternate
    set function is just like the original, except that it fails if the skipped
    flag is set. Only try_rgrp_unlink() calls gfs2_iget_skip() instead of
    gfs2_iget().

    Signed-off-by: Benjamin E. Marzinski
    Signed-off-by: Steven Whitehouse

    Benjamin Marzinski
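
    The test/set pair described above can be sketched in userspace as
    follows. All types and names here (sk_inode, sk_lookup, SK_I_FREEING,
    iget_skip_test, iget_skip_set) are hypothetical stand-ins; the real
    functions work on struct inode and are passed to the VFS inode lookup.

```c
#include <stdbool.h>
#include <stdint.h>

#define SK_I_FREEING 0x1u          /* stand-in for the VFS I_FREEING bit */

struct sk_inode {
    uint64_t no_addr;              /* on-disk address of the inode */
    unsigned int state;
};

struct sk_lookup {
    uint64_t no_addr;              /* address being looked up */
    bool skipped;                  /* set when a dying inode was found */
};

/* Test callback: 1 if `inode` is a usable match, 0 otherwise.
 * A matching inode that is being freed is rejected and recorded. */
static int iget_skip_test(const struct sk_inode *inode, struct sk_lookup *data)
{
    if (inode->no_addr != data->no_addr)
        return 0;
    if (inode->state & SK_I_FREEING) {
        data->skipped = true;
        return 0;
    }
    return 1;
}

/* Set callback: initialise a fresh inode, or fail (non-zero) if the
 * lookup already skipped a dying inode with the same address. */
static int iget_skip_set(struct sk_inode *inode, const struct sk_lookup *data)
{
    if (data->skipped)
        return 1;
    inode->no_addr = data->no_addr;
    inode->state = 0;
    return 0;
}
```

    Because the set callback fails after a skip, the caller gets no inode
    back instead of waiting for I_FREEING to clear, which is what breaks the
    cycle between process A and process B.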
     
  • This patch cleans up the code for writing journaled data into the log.
    It also removes the need to allocate a small "tag" structure for each
    block written into the log. Instead we just keep count of the outstanding
    I/O so that we can be sure that it's all been written at the correct time.
    Another result of this patch is that a number of ll_rw_block() calls
    have become submit_bh() calls, closing some races at the same time.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • The following patch removes the ordered write processing from
    databuf_lo_before_commit() and moves it to log.c. This has the effect of
    greatly simplifying databuf_lo_before_commit() as well as potentially
    making the ordered write code more efficient.

    As a side effect of this, it's now possible to remove ordered buffers
    from the ordered buffer list at any time, so we now make use of this in
    invalidatepage and releasepage to ensure timely release of these
    buffers.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • When you try to mount gfs2 with -o garbage, the mount fails and the gfs2
    superblock is deallocated and becomes NULL. The vfs comes around later
    on and calls gfs2_kill_sb. At this point the hidden gfs2 superblock
    pointer (sb->s_fs_info) is NULL and dereferencing it through
    gfs2_meta_syncfs causes the panic. (The other function call, to
    gfs2_delete_debugfs_file(), succeeds because that function already checks
    for a NULL pointer.)

    Signed-off-by: Abhijith Das
    Signed-off-by: Steven Whitehouse

    Abhijith Das
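
    The message describes the failure but not the exact change. One
    plausible shape for the fix, shown purely as an assumption, is to guard
    the sync call with a NULL check. The sk_ names below are hypothetical
    userspace stand-ins for the superblock, gfs2_meta_syncfs() and
    gfs2_kill_sb().

```c
#include <stddef.h>

struct sk_sb {
    void *s_fs_info;               /* NULL after a failed mount */
};

static int sync_calls;             /* counts calls, for illustration */

/* Stand-in for the sync routine that dereferences the private data. */
static void sk_meta_syncfs(void *sdp)
{
    (void)sdp;
    sync_calls++;
}

/* Stand-in for the kill_sb hook: only sync when private data exists. */
static void sk_kill_sb(struct sk_sb *sb)
{
    if (sb->s_fs_info != NULL)
        sk_meta_syncfs(sb->s_fs_info);
    /* generic superblock teardown would follow here */
}
```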
     
  • This patch fixes some bugs relating to journaled data files by cleaning
    up the gfs2_invalidatepage() and gfs2_releasepage() functions. We now
    never block during gfs2_releasepage(), instead we always either release
    or refuse to release depending on the status of the buffers.

    This fixes Red Hat bugzillas #248969 and #252392.

    Signed-off-by: Steven Whitehouse
    Cc: Bob Peterson

    Steven Whitehouse
     
  • Signed-off-by: Denis Cheng
    Signed-off-by: Steven Whitehouse

    Denis Cheng
     
  • The original code could work, but I think this code could work better.

    Signed-off-by: Denis Cheng
    Signed-off-by: Steven Whitehouse

    Denis Cheng
     
  • sb->s_fs_info is a void pointer, thus the type cast is not needed.

    Signed-off-by: Denis Cheng
    Signed-off-by: Steven Whitehouse

    Denis Cheng
     
  • Signed-off-by: Denis Cheng
    Signed-off-by: Steven Whitehouse

    Denis Cheng
     
  • This is for bugzilla bug #248176: GFS2: invalid metadata block

    Patches 1 thru 3 were accepted upstream, but there were problems
    with 4 and 5. Those issues have been resolved and now the recovery
    tests are passing without errors. This code has gone through
    41 * 3 successful gfs2 recovery tests before it hit an
    unrelated (openais) problem. I'm continuing to test it.

    This is a complete rewrite of patch 5 for bug #248176, written by
    Steve Whitehouse. This is referred to in the bugzilla record as
    "new 6" and "a different solution".

    The problem was that the journal inodes, although protected by
    a glock, were not synched with the other nodes because they don't
    use the inode glock synch operations (i.e. no "glops" were defined).
    Therefore, journal recovery on a journal-recovering node was causing
    the blocks to get out of sync with the node that was actually trying
    to use that journal as it comes back up from a reboot.

    There are two possible solutions: (1) To make the journals use the
    normal inode glock sync operations, or (2) To make the journal
    operations take effect immediately (i.e. no caching). Although
    option 1 works, it turns out to be a lot more code. Steve opted
    for option 2, which is much simpler and therefore less prone to
    regression errors.

    Signed-off-by: Bob Peterson
    Signed-off-by: Steven Whitehouse

    Bob Peterson
     
  • We only need a single gfs2_scand process rather than the one
    per filesystem which we had previously. As a result the parameter
    determining the frequency of gfs2_scand runs becomes a module
    parameter rather than a mount parameter as it was before.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • Signed-off-by: Denis Cheng
    Signed-off-by: Steven Whitehouse

    Denis Cheng
     

09 Jul, 2007

5 commits

  • GFS2 lookup code doesn't ask for the shared inode glock. This implies
    that during in-memory inode creation for an existing file, GFS2 will not
    read the inode contents from disk. This leaves no_formal_ino
    uninitialized at lookup time. The uninitialized no_formal_ino is
    subsequently encoded into the file handle. Clients will get an ESTALE
    error whenever they try to access these files.

    Signed-off-by: S. Wendy Cheng
    Signed-off-by: Steven Whitehouse

    Wendy Cheng
     
  • This patch fixes bug 243131: Can't mount GFS2 file system on AoE device.
    When using AoE devices with lock_nolock, there is no locking table, so
    gfs2 (and gfs1) uses the superblock s_id. This turns out to be the device
    name in some cases. In the case of AoE, the device name contains a slash
    (e.g. "etherd/e1.1p2"), which is an invalid character when we try to
    register the table in sysfs. This patch replaces the "/" with an underscore.
    Rather than add a new variable to the stack, I'm just reusing a (char *)
    variable that's no longer used: table.

    This code has been tested on the failing system using a RHEL5 patch.
    The upstream code was tested by using gfs2_tool sb to interject a "/"
    into the table name of a clustered gfs2 file system.

    Signed-off-by: Bob Peterson
    Signed-off-by: Steven Whitehouse

    Robert Peterson
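
    The substitution itself is small. A minimal sketch follows, with a
    hypothetical function name; the patch does this inline on the table
    variable rather than in a helper.

```c
#include <string.h>

/* Replace every '/' in `name` with '_' so the string is usable as a
 * single sysfs directory name. Modifies the buffer in place. */
static void sanitize_table_name(char *name)
{
    char *p;

    while ((p = strchr(name, '/')) != NULL)
        *p = '_';
}
```

    Applied to the example above, "etherd/e1.1p2" becomes "etherd_e1.1p2".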
     
  • This adds a nanosecond timestamp feature to the GFS2 filesystem. Due
    to the way that the on-disk format works, older filesystems will just
    appear to have this field set to zero. When mounted by an older version
    of GFS2, the filesystem will simply ignore the extra fields so that
    it will again appear to have whole-second resolution, which makes the
    change trivially backward compatible.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • This patch fixes some sign issues which were accidentally introduced
    into the quota & statfs code during the endianness annotation process.
    Also included is a general clean up which moves all of the _host
    structures out of gfs2_ondisk.h (where they should not have been to
    start with) and into the places where they are actually used (often only
    one place). Also those _host structures which are not required any more
    are removed entirely (which is the eventual plan for all of them).

    The conversion routines from ondisk.c are also moved into the places
    where they are actually used, which for almost every one, was just one
    single place, so all those are now static functions. This also cleans up
    the end of gfs2_ondisk.h which no longer needs the #ifdef __KERNEL__.

    The net result is a reduction of about 100 lines of code, many functions
    now marked static plus the bug fixes as mentioned above. For good
    measure I ran the code through sparse after making these changes to
    check that there are no warnings generated.

    This fixes Red Hat bz #239686

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • This patch cleans up the inode number handling code. The main difference
    is that instead of looking up the inodes using a struct gfs2_inum_host
    we now use just the no_addr member of this structure. The tests relating
    to no_formal_ino can then be done by the calling code. This has
    advantages in that we want to do different things in different code
    paths if the no_formal_ino doesn't match. In the NFS patch we want to
    return -ESTALE, but in the ->lookup() path, it's a bug in the fs if the
    no_formal_ino doesn't match and thus we can withdraw in this case.

    In order to later fix bz #201012, we need to be able to look up an inode
    without knowing no_formal_ino, as the only information that is known to
    us is the on-disk location of the inode in question.

    This patch will also help us to fix bz #236099 at a later date by
    cleaning up a lot of the code in that area.

    There are no user visible changes as a result of this patch and there
    are no changes to the on-disk format either.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

01 May, 2007

2 commits

  • The patch below consists of the following changes (in code order):

    1. I fixed a minor compiler warning regarding the printing of
    a kernel symbol address.
    2. I implemented a suggestion from Dave Teigland that moves
    the debugfs information for gfs2 into a subdirectory so
    we can easily expand our use of debugfs in the future.
    The current code keeps the glock information in:
    /debug/gfs2/
    With the patch, the new code keeps the glock information in:
    /debug/gfs2//glock
    That will allow us to create more debugfs files in the future.
    3. This fixes a bug whereby a failed mount attempt causes the
    debugfs file to not be deleted. Failed mount attempts should
    always clean up after themselves, including deleting the
    debugfs file and/or directory.

    Signed-off-by: Bob Peterson
    Signed-off-by: Steven Whitehouse

    Robert Peterson
     
  • The attached patch resolves bz 228540. This adds the capability
    for gfs2 to dump gfs2 locks through the debugfs file system.
    This used to exist in gfs1 as "gfs_tool lockdump" but it's missing from
    gfs2 because all the ioctls were stripped out. Please see the bugzilla
    for more history about the fix. This patch is also attached to the bugzilla
    record.

    The patch is against Steve Whitehouse's latest nmw git tree kernel
    (2.6.21-rc1) and has been tested on system trin-10.

    Signed-off-by: Robert Peterson
    Signed-off-by: Steven Whitehouse

    Robert Peterson
     

08 Mar, 2007

1 commit


12 Jan, 2007

1 commit

  • Revert bd_mount_mutex back to a semaphore so that xfs_freeze -f /mnt/newtest;
    xfs_freeze -u /mnt/newtest works safely and doesn't produce lockdep warnings.

    (XFS unlocks the semaphore from a different task, by design. The mutex
    code warns about this)

    Signed-off-by: Dave Chinner
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Chinner
     

30 Nov, 2006

1 commit


20 Oct, 2006

2 commits


02 Oct, 2006

1 commit

  • For some reason we had two different sets of code for reading in the
    superblock. This removes one of them in favour of the other. Also we
    don't need the temporary buffer for the sb since we already have one
    in the gfs2 sb itself.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

25 Sep, 2006

1 commit


19 Sep, 2006

1 commit

  • lm_interface.h has a few out of the tree clients such as GFS1
    and userland tools.

    Right now, these clients keep a copy of the file in their build tree
    that can go out of sync.

    Move lm_interface.h to include/linux, export it to userland and
    clean up fs/gfs2 to use the new location.

    Signed-off-by: Fabio M. Di Nitto
    Signed-off-by: Steven Whitehouse

    Fabio Massimo Di Nitto
     

08 Sep, 2006

3 commits


05 Sep, 2006

2 commits


01 Sep, 2006

1 commit

  • As per comments from Jan Engelhardt, this
    updates the copyright message to say "version" in full rather than
    "v.2". Also incore.h has been updated to remove forward structure
    declarations which are not required.

    The gfs2_quota_lvb structure has now had endianness annotations added
    to it. Also quota.c has been updated so that we now store the
    lvb data locally in endian-independent format to avoid needing
    a structure in host endianness too. As a result the endianness
    conversions are done as required at various points and thus the
    conversion routines in lvb.[ch] are no longer required. I've
    moved the one remaining constant in lvb.h that's used into lm.h
    and removed the unused lvb.[ch].

    I have not changed the HIF_ constants. That is left to a later patch
    which I hope will unify the gh_flags and gh_iflags fields of the
    struct gfs2_holder.

    Cc: Jan Engelhardt
    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
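
    As an illustration of keeping data in an endian-independent form and
    converting only at the point of use, the sketch below stores a 64-bit
    value as big-endian bytes. The sk_ names and the single-field structure
    are hypothetical, the big-endian byte order is an assumption of this
    sketch, and kernel code would use its own byte-order helpers instead of
    these loops.

```c
#include <stdint.h>

/* Store v as 8 big-endian bytes, independent of host byte order. */
static void sk_put_be64(uint8_t out[8], uint64_t v)
{
    for (int i = 0; i < 8; i++)
        out[i] = (uint8_t)(v >> (56 - 8 * i));
}

/* Read 8 big-endian bytes back into a host-order value. */
static uint64_t sk_get_be64(const uint8_t in[8])
{
    uint64_t v = 0;

    for (int i = 0; i < 8; i++)
        v = (v << 8) | in[i];
    return v;
}

/* Hypothetical lock value block holding one fixed-order field. */
struct sk_quota_lvb {
    uint8_t qb_value[8];
};

/* Convert at the point of use: read, adjust, write back. */
static void sk_lvb_add(struct sk_quota_lvb *lvb, int64_t change)
{
    uint64_t cur = sk_get_be64(lvb->qb_value);

    sk_put_be64(lvb->qb_value, cur + (uint64_t)change);
}
```

    With this layout there is only one copy of the data, so no separate
    host-order structure has to be kept in step with it.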
     

26 Aug, 2006

1 commit

  • This patch allows the simultaneous mounting of gfs2meta and gfs2
    filesystems. A restriction however is that a gfs2meta fs may only be
    mounted if its corresponding gfs2 filesystem is also mounted. Also, a
    gfs2 filesystem cannot be unmounted before its gfs2meta filesystem.

    Signed-off-by: Abhijith Das
    Signed-off-by: Steven Whitehouse

    Abhijith Das