15 Oct, 2020

1 commit

  • Commit ca399c96e96e changes gfs2_log_flush to not withdraw the
    filesystem while holding the log flush lock, but it fails to check if
    the filesystem needs to be withdrawn once the log flush lock has been
    released. Likewise, commit f05b86db314d depends on gfs2_log_flush to
    trigger delayed withdraws. Add that check and clean up the code flow
    somewhat.

    In gfs2_put_super, add a check for delayed withdraws that have been
    missed to prevent these kinds of bugs in the future.

    Fixes: ca399c96e96e ("gfs2: flesh out delayed withdraw for gfs2_log_flush")
    Fixes: f05b86db314d ("gfs2: Prepare to withdraw as soon as an IO error occurs in log write")
    Cc: stable@vger.kernel.org # v5.7+: 462582b99b607: gfs2: add some much needed cleanup for log flushes that fail
    Signed-off-by: Andreas Gruenbacher

    Andreas Gruenbacher
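
    The flow described in this commit can be sketched in plain userspace C.
    This is only an illustrative model, not the kernel code: struct fs_state,
    check_delayed_withdraw, log_flush and put_super are hypothetical names,
    and the log flush lock is modeled as a simple flag.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-filesystem state (not struct gfs2_sbd). */
struct fs_state {
    bool log_flush_lock_held;  /* models the log flush lock */
    bool withdraw_pending;     /* an error asked for a withdraw */
    bool withdrawn;            /* the withdraw has been carried out */
};

/*
 * Carry out a withdraw that was requested earlier.
 * Returns true only if this call actually performed the withdraw.
 */
bool check_delayed_withdraw(struct fs_state *fs)
{
    if (fs->log_flush_lock_held || !fs->withdraw_pending || fs->withdrawn)
        return false;
    fs->withdrawn = true;
    return true;
}

/* Model of a log flush: errors only set a flag while the lock is held. */
void log_flush(struct fs_state *fs, bool io_error)
{
    fs->log_flush_lock_held = true;
    if (io_error)
        fs->withdraw_pending = true;   /* do not withdraw here */
    fs->log_flush_lock_held = false;

    /* The step the commit adds: check once the lock has been released. */
    check_delayed_withdraw(fs);
}

/* Model of unmount: returns true if a missed delayed withdraw was found. */
bool put_super(struct fs_state *fs)
{
    return check_delayed_withdraw(fs);
}
```

    In this model, put_super returning true would indicate that some earlier
    code path set the pending flag but never followed up on it.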
     

06 Jun, 2020

1 commit

  • This patch adds a new slab for gfs2 transactions. That allows us to
    reduce kernel memory fragmentation and to better organize data
    for analysis of vmcore dumps. A new centralized function is added to
    free the slab objects, and it exposes use-after-free by giving
    warnings if a transaction is freed while it still has bd elements
    attached to its buffers or ail lists. We make sure to initialize
    those transaction ail lists so we can check their integrity when freeing.

    At a later time, we should add a slab initialization function to
    make it more efficient, but for this initial patch I wanted to
    minimize the impact.

    Signed-off-by: Bob Peterson
    Signed-off-by: Andreas Gruenbacher

    Bob Peterson
     

27 Feb, 2020

2 commits

  • Function gfs2_log_flush() had a few places where it tried to withdraw
    from the file system when errors were encountered. The problem is,
    it should delay those withdraws until the log flush lock is no longer
    held.

    This patch creates a new function just for delayed withdraws for
    situations like this. If errors=panic was specified on mount, we
    still want to do it the old fashioned way because it does not
    help to delay the panic in that situation.

    Signed-off-by: Bob Peterson
    Reviewed-by: Andreas Gruenbacher

    Bob Peterson
     
  • Before this patch, function check_journal_clean would give messages
    related to journal recovery. That's fine for mount time, but when a
    node withdraws and forces replay that way, we don't want all those
    distracting and misleading messages. This patch adds a new parameter
    to make those messages optional.

    Signed-off-by: Bob Peterson
    Reviewed-by: Andreas Gruenbacher

    Bob Peterson
     

10 Feb, 2020

6 commits

  • Before this patch function check_journal_clean was in ops_fstype.c.
    This patch moves it to util.c so we can make use of it elsewhere
    in a future patch.

    Signed-off-by: Bob Peterson
    Reviewed-by: Andreas Gruenbacher

    Bob Peterson
     
  • File system withdraws can be delayed when inconsistencies are
    discovered when we cannot withdraw immediately, for example, when
    critical spin_locks are held. But delaying the withdraw can cause
    gfs2 to ignore the error and keep running for a short period of time.
    For example, an rgrp glock may be dequeued and demoted while there
    are still buffers that haven't been properly revoked, due to io
    errors writing to the journal.

    This patch introduces a new concept of a pending withdraw, which
    means an inconsistency has been discovered and we need to withdraw
    at the earliest possible opportunity. In these cases, we aren't
    quite withdrawn yet, but we still need to not dequeue glocks and
    other critical things. If we dequeue the glocks and the withdraw
    results in our journal being replayed, the replay could overwrite
    data that's been modified by a different node that acquired the
    glock in the meantime.

    Signed-off-by: Bob Peterson
    Reviewed-by: Andreas Gruenbacher

    Bob Peterson
     
  • The gfs2_assert functions only print messages when the filesystem hasn't been
    withdrawn yet, and they indicate whether or not they've printed something in
    their return value. However, none of the callers use that information, so
    simply return whether or not the assert has failed.

    (The gfs2_assert functions are still backwards; they return false when an
    assertion is true.)

    Signed-off-by: Andreas Gruenbacher
    Signed-off-by: Bob Peterson

    Andreas Gruenbacher
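
    The return convention described above can be illustrated with a small
    userspace sketch. The names struct fs_state and fs_assert_failed are
    hypothetical and do not correspond to the real gfs2_assert macros; the
    sketch only shows the "backwards" sense of the return value and the rule
    that messages stop once the filesystem is withdrawn.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the per-filesystem state. */
struct fs_state {
    bool withdrawn;
    int messages_printed;
};

/*
 * Returns true when the assertion FAILED and false when it held.
 * A message is printed only while the filesystem is not yet withdrawn,
 * but the return value no longer depends on whether anything was printed.
 */
bool fs_assert_failed(struct fs_state *fs, bool condition, const char *what)
{
    if (condition)
        return false;
    if (!fs->withdrawn) {
        fprintf(stderr, "assertion \"%s\" failed\n", what);
        fs->messages_printed++;
    }
    return true;
}
```

    A caller would then write something like
    "if (fs_assert_failed(fs, ptr != NULL, "ptr != NULL")) return;".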
     
  • Change the various gfs2_consist functions to return void.

    Signed-off-by: Andreas Gruenbacher
    Signed-off-by: Bob Peterson

    Andreas Gruenbacher
     
  • These arguments are always passed as 0, and they are never evaluated.

    Signed-off-by: Andreas Gruenbacher
    Signed-off-by: Bob Peterson

    Andreas Gruenbacher
     
  • Split gfs2_lm_withdraw into a function that prints an error message and a
    function that withdraws the filesystem.

    Signed-off-by: Andreas Gruenbacher
    Signed-off-by: Bob Peterson

    Andreas Gruenbacher
     

15 Nov, 2019

1 commit

  • Add function gfs2_withdrawn and replace all checks for the SDF_WITHDRAWN
    bit to call it. This does not change the logic or function of gfs2, and
    it facilitates later improvements to the withdraw sequence.

    Signed-off-by: Bob Peterson
    Signed-off-by: Andreas Gruenbacher

    Bob Peterson
     

05 Jun, 2019

1 commit

  • Based on 1 normalized pattern(s):

    this copyrighted material is made available to anyone wishing to use
    modify copy or redistribute it subject to the terms and conditions
    of the gnu general public license version 2

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

    has been chosen to replace the boilerplate/reference in 44 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Reviewed-by: Kate Stewart
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190531081038.653000175@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

06 Oct, 2018

1 commit

  • Before this patch, various errors and messages were reported using
    the pr_* functions: pr_err, pr_warn, pr_info, etc., but that does
    not tell you which gfs2 mount had the problem, which is often vital
    to debugging. This patch changes the calls from pr_* to fs_* in
    most of the messages so that the file system id is printed along
    with the message.

    Signed-off-by: Bob Peterson

    Bob Peterson
     

21 Jun, 2018

1 commit

  • In two places, the gfs2_io_error_bh macro is called while holding the
    sd_ail_lock spin lock. This isn't allowed because gfs2_io_error_bh
    withdraws the filesystem, which can sleep because it issues a uevent.
    To fix that, add a gfs2_io_error_bh_wd macro that does withdraw the
    filesystem and change gfs2_io_error_bh to not withdraw the filesystem.
    In those places where the new gfs2_io_error_bh is used, withdraw the
    filesystem after releasing sd_ail_lock.

    Signed-off-by: Andreas Gruenbacher
    Signed-off-by: Bob Peterson
    Reviewed-by: Andrew Price

    Andreas Gruenbacher
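
    A minimal userspace sketch of the split described above, under the
    assumption that the spin lock can be modeled as a flag. All names here
    (struct fs_state, io_error_report, io_error_report_wd, fs_withdraw,
    scan_ail) are hypothetical and are not the gfs2 macros themselves.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the per-filesystem state. */
struct fs_state {
    bool ail_lock_held;  /* models holding the sd_ail_lock spin lock */
    bool withdrawn;
    int io_errors;
};

/* Report only: does not sleep, so it is safe while the lock is held. */
void io_error_report(struct fs_state *fs, unsigned long long blkno)
{
    fs->io_errors++;
    fprintf(stderr, "fatal: I/O error on block %llu\n", blkno);
}

/* Withdrawing may sleep, so this model refuses to run under the lock. */
bool fs_withdraw(struct fs_state *fs)
{
    if (fs->ail_lock_held)
        return false;
    fs->withdrawn = true;
    return true;
}

/* Report and withdraw: only for callers that hold no spin lock. */
void io_error_report_wd(struct fs_state *fs, unsigned long long blkno)
{
    io_error_report(fs, blkno);
    fs_withdraw(fs);
}

/* Scan buffers under the lock, remember errors, withdraw afterwards. */
void scan_ail(struct fs_state *fs, const bool *failed, size_t n)
{
    bool need_withdraw = false;

    fs->ail_lock_held = true;
    for (size_t i = 0; i < n; i++) {
        if (failed[i]) {
            io_error_report(fs, i);
            need_withdraw = true;
        }
    }
    fs->ail_lock_held = false;

    if (need_withdraw)
        fs_withdraw(fs);
}
```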
     

26 Aug, 2017

1 commit

  • This patch cleans up various pieces of GFS2 to avoid sparse errors.
    This doesn't fix them all, but it fixes several. The first error,
    in function glock_hash_walk was a genuine bug where the rhashtable
    could be started and not stopped.

    Signed-off-by: Bob Peterson

    Bob Peterson
     

15 Dec, 2015

1 commit

  • Before this patch, multi-block reservation structures were allocated
    from a special slab. This patch folds the structure into the gfs2_inode
    structure. The disadvantage is that the gfs2_inode needs more memory,
    even when a file is opened read-only. The advantages are: (a) we don't
    need the special slab and the extra time it takes to allocate and
    deallocate from it. (b) we no longer need to worry that the structure
    exists for things like quota management. (c) This also allows us to
    remove the calls to get_write_access and put_write_access since we
    know the structure will exist.

    Signed-off-by: Bob Peterson

    Bob Peterson
     

24 Nov, 2015

1 commit

  • This patch basically reverts the majority of patch 5407e24.
    That patch eliminated the gfs2_qadata structure in favor of just
    using the reservations structure. The problem with doing that is that
    it increases the size of the reservations structure. That is not an
    issue until it comes time to fold the reservations structure into the
    inode in memory so we know it's always there. By separating out the
    quota structure again, we aren't punishing the non-quota users by
    making all the inodes bigger, requiring more slab space. This patch
    creates a new slab area to allocate the quota stuff so it's managed
    a little more sanely.

    Signed-off-by: Bob Peterson

    Bob Peterson
     

07 Mar, 2014

3 commits


02 Oct, 2013

1 commit


06 Jun, 2012

1 commit

  • When we read an invalid block from the journal, we should not call
    withdraw, but simply print a message and return an error. It is
    up to the caller to then handle that error. In the case of mount
    that means a failed mount, rather than a withdraw (requiring a
    reboot). When recovering another node's journal, we return
    an error via the uevent.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

24 Apr, 2012

2 commits

  • Prior to this patch, we have two ways of sending i/o to the log.
    One of those is used when we need to allocate both the data
    to be written itself and also a buffer head to submit it. This
    is done via sb_getblk and friends. This is used mostly for writing
    log headers.

    The other method is used when writing blocks which have some
    in-place counterpart. This is the case for all the metadata
    blocks which are journalled, and when journaled data is in use,
    for unescaped journalled data blocks.

    This patch replaces both of those two methods, and about half
    a dozen separate i/o submission points with a single i/o
    submission function. We also go direct to bio rather than
    using buffer heads, since this allows us to build i/o
    requests of the maximum size for the block device in
    question. It also reduces the memory required for flushing
    the log, which can be very useful in low memory situations.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • This patch changes block reservations so it uses slab storage.

    Signed-off-by: Bob Peterson
    Signed-off-by: Steven Whitehouse

    Bob Peterson
     

08 Mar, 2012

1 commit

  • In order to ensure that we've got enough buffer heads for flushing
    the journal, the original code used __GFP_NOFAIL when performing
    this allocation. Here we dispense with that in favour of using a
    mempool. This should improve efficiency in low memory conditions:
    since flushing the journal is a good way to get memory back, we
    don't want to be spinning, waiting on memory allocations. The
    buffers which are allocated via this mempool are fairly short lived,
    so that we'll recycle them pretty quickly.

    Although there are other memory allocations which occur during the
    journal flush process, this is the one which can potentially require
    the most memory, so the most important one to fix.

    The amount of memory reserved is a fixed amount, and we should not need
    to scale it when there are a greater number of filesystems in use.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
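
    The idea of a fixed reserve can be sketched as a simplified userspace
    model. This is not the kernel mempool API: struct pool, POOL_MIN and the
    pool_* functions are hypothetical, and the simulate_oom parameter exists
    only so the fallback path can be exercised.

```c
#include <stddef.h>
#include <stdlib.h>

#define POOL_MIN 4  /* fixed reserve size, independent of load */

struct pool {
    void *reserve[POOL_MIN];
    size_t nr_free;     /* elements currently held in the reserve */
    size_t elem_size;
};

/* Release every element still held in the reserve. */
void pool_destroy(struct pool *p)
{
    while (p->nr_free > 0)
        free(p->reserve[--p->nr_free]);
}

/* Pre-allocate the reserve up front. Returns 0 on success, -1 on failure. */
int pool_init(struct pool *p, size_t elem_size)
{
    p->elem_size = elem_size;
    p->nr_free = 0;
    while (p->nr_free < POOL_MIN) {
        void *e = malloc(elem_size);
        if (e == NULL) {
            pool_destroy(p);
            return -1;
        }
        p->reserve[p->nr_free++] = e;
    }
    return 0;
}

/* Try the normal allocator first; fall back to the reserve if it fails. */
void *pool_alloc(struct pool *p, int simulate_oom)
{
    void *e = simulate_oom ? NULL : malloc(p->elem_size);

    if (e != NULL)
        return e;
    if (p->nr_free > 0)
        return p->reserve[--p->nr_free];
    return NULL;  /* a kernel mempool would wait here for a free element */
}

/* Refill the reserve before handing memory back to the allocator. */
void pool_free(struct pool *p, void *e)
{
    if (p->nr_free < POOL_MIN)
        p->reserve[p->nr_free++] = e;
    else
        free(e);
}
```

    Because the elements are short lived, a freed element quickly refills the
    reserve, which is why a small fixed reserve can be enough in this model.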
     

01 Mar, 2010

1 commit

  • Since the start of GFS2, an "extra" inode has been used to store
    the metadata belonging to each inode. The only reason for using
    this inode was to have an extra address space, the other fields
    were unused. This means that the memory usage was rather inefficient.

    The reason for keeping each inode's metadata in a separate address
    space is that when glocks are requested on remote nodes, we need to
    be able to efficiently locate the data and metadata relating
    to that glock (inode) in order to sync or sync and invalidate it
    (depending on the remotely requested lock mode).

    This patch adds a new type of glock which, in addition to
    its normal fields, has an address space. This applies to all
    inode and rgrp glocks (but to no other glock types which remain
    as before). As a result, we no longer need to have the second
    inode.

    This results in three major improvements:
    1. A saving of approx 25% of memory used in caching inodes
    2. A removal of the circular dependency between inodes and glocks
    3. No confusion between "normal" and "metadata" inodes in super.c

    Although the first of these is the more immediately apparent, the
    second is just as important as it now enables a number of clean
    ups at umount time. Those will be the subject of future patches.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

05 Jan, 2009

1 commit

  • This patch is a clean up of gfs2_quotad prior to giving it an
    extra job to do in addition to the current portfolio of updating
    the quota and statfs information from time to time.

    As a result it has been moved into quota.c allowing one of the
    functions it calls to be made static. Also the clean up allows
    the two existing functions to have separate timeouts and also
    to coexist with its future role of dealing with the "truncate in
    progress" inode flag.

    The (pointless) setting of gfs2_quotad_secs is removed since we
    arrange to only wake up quotad when one of the two timers expires.

    In addition the struct gfs2_quota_data is moved into a slab cache,
    mainly for easier debugging. It should also be possible to use
    a shrinker in the future, rather than the current scheme of scanning
    the quota data entries from time to time.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

30 Apr, 2008

1 commit


31 Mar, 2008

2 commits

  • The functions in lm.c were just wrappers which were mostly
    only used in one other file. By moving the functions to
    the files where they are being used, they can be marked
    static and also this will usually result in them being inlined
    since they are often only used from one point in the code.

    A couple of really trivial functions have been inlined by hand
    into the function which called them as it makes the code clearer
    to do that.

    We also gain from one fewer function call in the glock lock and
    unlock paths.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • This patch moves the gfs2_rgrpd structure to its own slab
    memory. This makes it easier to control and monitor, and
    yields less memory fragmentation.

    Signed-off-by: Bob Peterson
    Signed-off-by: Steven Whitehouse

    Bob Peterson
     

08 Dec, 2006

2 commits

  • * master.kernel.org:/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: (73 commits)
    [DLM] Clean up lowcomms
    [GFS2] Change gfs2_fsync() to use write_inode_now()
    [GFS2] Fix indent in recovery.c
    [GFS2] Don't flush everything on fdatasync
    [GFS2] Add a comment about reading the super block
    [GFS2] Mount problem with the GFS2 code
    [GFS2] Remove gfs2_check_acl()
    [DLM] fix format warnings in rcom.c and recoverd.c
    [GFS2] lock function parameter
    [DLM] don't accept replies to old recovery messages
    [DLM] fix size of STATUS_REPLY message
    [GFS2] fs/gfs2/log.c:log_bmap() fix printk format warning
    [DLM] fix add_requestqueue checking nodes list
    [GFS2] Fix recursive locking in gfs2_getattr
    [GFS2] Fix recursive locking in gfs2_permission
    [GFS2] Reduce number of arguments to meta_io.c:getbuf()
    [GFS2] Move gfs2_meta_syncfs() into log.c
    [GFS2] Fix journal flush problem
    [GFS2] mark_inode_dirty after write to stuffed file
    [GFS2] Fix glock ordering on inode creation
    ...

    Linus Torvalds
     
  • Replace all uses of kmem_cache_t with struct kmem_cache.

    The patch was generated using the following script:

    #!/bin/sh
    #
    # Replace one string by another in all the kernel sources.
    #

    set -e

    for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do
    quilt add $file
    sed -e "1,\$s/$1/$2/g" $file >/tmp/$$
    mv /tmp/$$ $file
    quilt refresh
    done

    The script was run like this

    sh replace kmem_cache_t "struct kmem_cache"

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

30 Nov, 2006

1 commit


05 Sep, 2006

2 commits


01 Sep, 2006

1 commit

  • As per comments from Jan Engelhardt this
    updates the copyright message to say "version" in full rather than
    "v.2". Also incore.h has been updated to remove forward structure
    declarations which are not required.

    The gfs2_quota_lvb structure has now had endianness annotations added
    to it. Also quota.c has been updated so that we now store the
    lvb data locally in endian independent format to avoid needing
    a structure in host endianness too. As a result the endianness
    conversions are done as required at various points and thus the
    conversion routines in lvb.[ch] are no longer required. I've
    moved the one remaining constant in lvb.h that's used into lm.h
    and removed the unused lvb.[ch].

    I have not changed the HIF_ constants. That is left to a later patch
    which I hope will unify the gh_flags and gh_iflags fields of the
    struct gfs2_holder.

    Cc: Jan Engelhardt
    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

19 May, 2006

1 commit


22 Apr, 2006

2 commits


31 Mar, 2006

1 commit