22 Sep, 2016

1 commit

  • register_shrinker can fail after commit 1d3d4437eae1 ("vmscan: per-node
    deferred work"), so we should detect its failure; otherwise the shrinker
    may fail to register even though the gfs2 module was initialized
    successfully.

    Signed-off-by: Chao Yu
    Signed-off-by: Bob Peterson

    Chao Yu
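
    The failure check described above can be sketched in userspace roughly as
    follows. This is only a model under stated assumptions: model_glock_init,
    model_register_shrinker and the two flags are hypothetical stand-ins, not
    the gfs2 code, and the real init path unwinds more than one earlier step.

```c
#include <errno.h>
#include <stdbool.h>

static bool hash_table_ready;     /* models an earlier init step */
static bool shrinker_registered;  /* models the registered shrinker */

/* Hypothetical stand-in for register_shrinker(); fails on request. */
static int model_register_shrinker(bool fail)
{
    if (fail)
        return -ENOMEM;
    shrinker_registered = true;
    return 0;
}

/* Init path: propagate the failure and undo the earlier step. */
static int model_glock_init(bool shrinker_fails)
{
    int ret;

    hash_table_ready = true;

    ret = model_register_shrinker(shrinker_fails);
    if (ret) {
        hash_table_ready = false;   /* unwind what was set up before */
        return ret;
    }
    return 0;
}
```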
     

03 Aug, 2016

1 commit

  • Replace 1 << value shift by more explicit BIT() macro

    Also fixes two bare unsigned definitions:

    WARNING: Prefer 'unsigned int' to bare use of 'unsigned'
    + unsigned hsize = BIT(ip->i_depth);

    Signed-off-by: Fabian Frederick
    Signed-off-by: Bob Peterson

    Fabian Frederick
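
    As a rough userspace illustration of the replacement, the sketch below
    defines BIT() the way the kernel header of that era does and uses it with
    an explicit 'unsigned int'. The helper hash_size is hypothetical and only
    mirrors the quoted hsize line.

```c
/* Userspace copy of the kernel's BIT() helper. */
#define BIT(nr) (1UL << (nr))

/* Hypothetical helper: table size for a given depth. */
static unsigned int hash_size(unsigned int depth)
{
    unsigned int hsize = BIT(depth);   /* previously written as 1 << depth */

    return hsize;
}
```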
     

27 Jun, 2016

3 commits

  • Make the code more readable by cleaning up the different ways of
    initializing lock holders and checking for initialized lock holders:
    mark lock holders as uninitialized by setting the holder's glock to NULL
    (gfs2_holder_mark_uninitialized) instead of zeroing out the entire
    object or using a separate flag. Recognize initialized holders by their
    non-NULL glock (gfs2_holder_initialized). Don't zero out holder objects
    which are immediately initialized via gfs2_holder_init or
    gfs2_glock_nq_init.

    Signed-off-by: Andreas Gruenbacher
    Signed-off-by: Bob Peterson

    Andreas Gruenbacher
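
    The convention described above (a NULL glock pointer marks an
    uninitialized holder) could look roughly like this in a userspace model.
    The model_ types and functions are hypothetical and only mirror the
    helper names mentioned in the message.

```c
#include <stdbool.h>
#include <stddef.h>

struct model_glock {
    int id;
};

struct model_holder {
    struct model_glock *gh_gl;   /* NULL means "not initialized" */
    unsigned int gh_state;
};

/* Mark the holder as uninitialized without zeroing the whole object. */
static void model_holder_mark_uninitialized(struct model_holder *gh)
{
    gh->gh_gl = NULL;
}

/* A holder counts as initialized exactly when its glock is non-NULL. */
static bool model_holder_initialized(const struct model_holder *gh)
{
    return gh->gh_gl != NULL;
}

/* Initialization sets every field, so no prior memset is needed. */
static void model_holder_init(struct model_glock *gl, unsigned int state,
                              struct model_holder *gh)
{
    gh->gh_gl = gl;
    gh->gh_state = state;
}
```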
     
  • Now that gfs2_lookup_by_inum only takes the inode glock for new inodes
    (and not for cached inodes anymore), there no longer is a need to
    optimize the cached-inode case in gfs2_get_dentry or delete_work_func,
    and gfs2_ilookup can be removed.

    In addition, gfs2_get_dentry wasn't checking the GFS2_DIF_SYSTEM flag in
    i_diskflags in the gfs2_ilookup case (see gfs2_lookup_by_inum); this
    inconsistency goes away as well.

    Signed-off-by: Andreas Gruenbacher
    Signed-off-by: Bob Peterson

    Andreas Gruenbacher
     
  • The current gfs2_lookup_by_inum takes the glock of a presumed inode
    identified by block number, verifies that the block is indeed an inode,
    and then instantiates and reads the new inode via gfs2_inode_lookup.

    However, instantiating a new inode may block on freeing a previous
    instance of that inode (__wait_on_freeing_inode), and freeing an inode
    requires taking the glock that is already held, leading to lock inversion and
    deadlock.

    Fix this by first instantiating the new inode, then verifying that the
    block is an inode (if required), and then reading in the new inode, all
    in gfs2_inode_lookup.

    If the block we are looking for is not an inode, we discard the new
    inode via iget_failed, which marks inodes as bad and unhashes them.
    Other tasks waiting on that inode will get a bad inode back from
    ilookup or iget_locked; in that case, retry the lookup.

    Signed-off-by: Andreas Gruenbacher
    Signed-off-by: Bob Peterson

    Andreas Gruenbacher
     

21 May, 2016

1 commit

  • Pull GFS2 updates from Bob Peterson:
    "We've got nine patches this time:

    - Abhi Das has two patches that fix a GFS2 splice issue (and an
    adjustment).

    - Ben Marzinski has a patch which allows the proper unmount of a GFS2
    file system after hitting a withdraw error.

    - I have a patch to fix a problem where GFS2 would dereference an
    error value, plus three cosmetic / refactoring patches.

    - Daniel DeFreez has a patch to fix two glock reference count
    problems, where GFS2 was not properly "uninitializing" its glock
    holder on error paths.

    - Denys Vlasenko has a patch to change a function to not be inlined,
    thus reducing the memory footprint of the GFS2 module"

    * tag 'gfs2-4.7.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
    GFS2: Refactor gfs2_remove_from_journal
    GFS2: Remove allocation parms from gfs2_rbm_find
    gfs2: use inode_lock/unlock instead of accessing i_mutex directly
    GFS2: Add calls to gfs2_holder_uninit in two error handlers
    GFS2: Don't dereference inode in gfs2_inode_lookup until it's valid
    GFS2: fs/gfs2/glock.c: Deinline do_error, save 1856 bytes
    gfs2: Use gfs2 wrapper to sync inode before calling generic_file_splice_read()
    GFS2: Get rid of dead code in inode_go_demote_ok
    GFS2: ignore unlock failures after withdraw

    Linus Torvalds
     

13 Apr, 2016

1 commit


05 Apr, 2016

1 commit

  • In certain cases, the 802.11 mesh pathtable code wants to
    iterate over all of the entries in the forwarding table from
    the receive path, which is inside an RCU read-side critical
    section. Enable walks inside atomic sections by allowing
    GFP_ATOMIC allocations for the walker state.

    Change all existing callsites to pass in GFP_KERNEL.

    Acked-by: Thomas Graf
    Signed-off-by: Bob Copeland
    [also adjust gfs2/glock.c and rhashtable tests]
    Signed-off-by: Johannes Berg

    Bob Copeland
     

24 Mar, 2016

1 commit

  • After gfs2 has withdrawn the filesystem, it may still have many locks not
    in the unlocked state. If it is using lock_dlm, it will fail trying
    the unlocks since it has already unmounted the lock manager. Instead, it
    should set the SDF_SKIP_DLM_UNLOCK flag on withdraw, to signal that
    it can skip the lock_manager on unlocks, and fall back to lock_nolock
    style unlocking.

    Signed-off-by: Benjamin Marzinski
    Signed-off-by: Bob Peterson

    Benjamin Marzinski
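
    A minimal userspace sketch of the idea: withdraw sets a flag, and the
    unlock path checks it before going to the lock manager. The struct, the
    MODEL_SKIP_DLM_UNLOCK bit and both functions are hypothetical stand-ins
    for the SDF_SKIP_DLM_UNLOCK handling, not the actual gfs2 code.

```c
#include <stdbool.h>

#define MODEL_SKIP_DLM_UNLOCK (1u << 0)   /* models SDF_SKIP_DLM_UNLOCK */

struct model_sbd {
    unsigned int flags;
    int dlm_unlock_calls;   /* how often the lock manager was asked */
};

/* On withdraw, remember that the lock manager is no longer usable. */
static void model_withdraw(struct model_sbd *sdp)
{
    sdp->flags |= MODEL_SKIP_DLM_UNLOCK;
}

/* Returns true if the lock manager was consulted for this unlock. */
static bool model_unlock(struct model_sbd *sdp)
{
    if (sdp->flags & MODEL_SKIP_DLM_UNLOCK)
        return false;       /* local, lock_nolock-style unlock */

    sdp->dlm_unlock_calls++;
    return true;
}
```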
     

15 Mar, 2016

2 commits

  • This patch basically reverts a very old patch from 2008,
    7a9f53b3c1875bef22ad4588e818bc046ef183da, with the title
    "Alternate gfs2_iget to avoid looking up inodes being freed".
    The original patch was designed to avoid a deadlock caused by lock
    ordering with try_rgrp_unlink. The patch forced the function to not
    find inodes that were being removed by VFS. The problem is, that
    made it impossible for nodes to delete their own unlinked dinodes
    after a certain point in time, because the inode needed was not found
    by this filtering process. There is no longer a need for the patch,
    since function try_rgrp_unlink no longer locks the inode: All it does
    is queue the glock onto the delete work_queue, so there should be no
    more deadlock.

    Signed-off-by: Bob Peterson
    Signed-off-by: Steven Whitehouse

    Bob Peterson
     
  • This patch tries to prevent delete work (queued via iopen callback)
    from executing if the glock is currently being used to create
    a new inode.

    Signed-off-by: Bob Peterson
    Acked-by: Steven Whitehouse

    Bob Peterson
     

14 Jan, 2016

1 commit

  • This patch fixes an error condition in which an inode is partially
    created in gfs2_create_inode() but then some error is discovered,
    which causes it to fail and call iput() before the iopen glock is
    created or held. In that case, gfs2_delete_inode would try to
    unlock an iopen glock that doesn't yet exist. Therefore, we test
    its holder (which must exist) for the HIF_HOLDER bit before trying
    to dq it.

    Signed-off-by: Bob Peterson
    Acked-by: Steven Whitehouse

    Bob Peterson
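
    The guard described above amounts to testing a flag on the holder before
    dequeuing. A userspace sketch, with MODEL_HIF_HOLDER and
    model_delete_inode_dq as hypothetical stand-ins for the HIF_HOLDER test
    in the delete path:

```c
#include <stdbool.h>

#define MODEL_HIF_HOLDER (1u << 0)   /* models the HIF_HOLDER bit */

struct model_holder {
    unsigned int gh_iflags;
};

/*
 * Returns true if the holder was dequeued, false if the lock was never
 * held and there was nothing to release.
 */
static bool model_delete_inode_dq(struct model_holder *gh)
{
    if (!(gh->gh_iflags & MODEL_HIF_HOLDER))
        return false;

    gh->gh_iflags &= ~MODEL_HIF_HOLDER;   /* stands in for the dq call */
    return true;
}
```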
     

15 Dec, 2015

2 commits

  • At some point in the past, we used to have a timeout when GFS2 was
    unmounting, trying to clear out its glocks. If the timeout expired,
    it would dump the remaining glocks to the kernel messages so that
    developers could debug the problem. That timeout was eliminated,
    probably by accident. This patch reintroduces it.

    Signed-off-by: Bob Peterson

    Bob Peterson
     
  • This patch makes no functional changes. Its goal is to reduce the
    size of the gfs2 inode in memory by rearranging structures and
    changing the size of some variables within the structure.

    Signed-off-by: Bob Peterson

    Bob Peterson
     

17 Nov, 2015

1 commit

  • This lockdep splat was being triggered on umount:

    [55715.973122] ===============================
    [55715.980169] [ INFO: suspicious RCU usage. ]
    [55715.981021] 4.3.0-11553-g8d3de01-dirty #15 Tainted: G W
    [55715.982353] -------------------------------
    [55715.983301] fs/gfs2/glock.c:1427 suspicious rcu_dereference_protected() usage!

    The code it refers to is the rht_for_each_entry_safe usage in
    glock_hash_walk. The condition that triggers the warning is
    lockdep_rht_bucket_is_held(tbl, hash) which is checked in the
    __rcu_dereference_protected macro.

    The rhashtable buckets are not changed in glock_hash_walk so it's safe
    to rely on the rcu protection. Replace the rht_for_each_entry_safe()
    usage with rht_for_each_entry_rcu(), which doesn't care whether the
    bucket lock is held if the rcu read lock is held.

    Signed-off-by: Andrew Price
    Signed-off-by: Bob Peterson
    Acked-by: Steven Whitehouse

    Andrew Price
     

30 Oct, 2015

1 commit

  • Commit e66cf161 replaced the gl_spin spinlock in struct gfs2_glock with a
    gl_lockref lockref and defined gl_spin as gl_lockref.lock (the spinlock in
    gl_lockref). Remove that define to make the references to gl_lockref.lock more
    obvious.

    Signed-off-by: Andreas Gruenbacher
    Signed-off-by: Bob Peterson

    Andreas Gruenbacher
     

04 Sep, 2015

5 commits


19 Jun, 2015

1 commit

  • The glocks used for resource groups often come and go hundreds of
    thousands of times per second. Adding them to the lru list just
    adds unnecessary contention for the lru_lock spin_lock, especially
    considering we're almost certainly going to re-use the glock and
    take it back off the lru microseconds later. We never want the
    glock shrinker to cull them anyway. This patch adds a new bit in
    the glops that determines which glock types get put onto the lru
    list and which ones don't.

    Signed-off-by: Bob Peterson
    Acked-by: Steven Whitehouse

    Bob Peterson
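
    One way to picture the per-type bit is a flag in the operations table
    that the LRU code consults. The names below (MODEL_GLOF_LRU, the two
    tables, model_should_add_to_lru) are hypothetical, and which glock types
    carry the bit here is an assumption made for illustration only.

```c
#include <stdbool.h>

#define MODEL_GLOF_LRU (1u << 0)   /* hypothetical "goes on the LRU" bit */

struct model_glops {
    unsigned int go_flags;
};

/* Assumed for illustration: inode glocks are cached on the LRU. */
static const struct model_glops model_inode_glops = { MODEL_GLOF_LRU };

/* Resource group glocks are reused constantly, so they stay off it. */
static const struct model_glops model_rgrp_glops = { 0 };

/* The LRU path checks the type's flag instead of taking lru_lock. */
static bool model_should_add_to_lru(const struct model_glops *ops)
{
    return (ops->go_flags & MODEL_GLOF_LRU) != 0;
}
```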
     

30 Mar, 2015

1 commit

  • debugfs_create_dir and debugfs_create_file may return -ENODEV when debugfs
    is not configured, so the return value should be checked against ERROR_VALUE
    as well, otherwise the later dereference of the dentry pointer would crash
    the kernel.

    Signed-off-by: Chengyu Song
    Signed-off-by: Bob Peterson
    Acked-by: Steven Whitehouse

    Chengyu Song
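
    The hazard is that an error-encoded pointer is non-NULL, so a plain NULL
    check lets it through. The userspace model below imitates the kernel's
    error-pointer convention; all model_ names are hypothetical and this is
    not the debugfs API itself.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static void *model_err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

/* True if the pointer lies in the top MODEL_MAX_ERRNO addresses. */
static bool model_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-MODEL_MAX_ERRNO;
}

/* The check the commit asks for: NULL and error values both rejected. */
static bool model_is_err_or_null(const void *p)
{
    return p == NULL || model_is_err(p);
}

/* Stand-in for a create call that returns -ENODEV when unconfigured. */
static void *model_create_dir(bool debugfs_configured)
{
    static int dentry;

    return debugfs_configured ? (void *)&dentry : model_err_ptr(-ENODEV);
}
```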
     

13 Feb, 2015

1 commit

  • Pull backing device changes from Jens Axboe:
    "This contains a cleanup of how the backing device is handled, in
    preparation for a rework of the life time rules. In this part, the
    most important change is to split the unrelated nommu mmap flags from
    it, but also removing a backing_dev_info pointer from the
    address_space (and inode), and a cleanup of other various minor bits.

    Christoph did all the work here, I just fixed an oops with pages that
    have a swap backing. Arnd fixed a missing export, and Oleg killed the
    lustre backing_dev_info from staging. Last patch was from Al,
    unexporting parts that are now no longer needed outside"

    * 'for-3.20/bdi' of git://git.kernel.dk/linux-block:
    Make super_blocks and sb_lock static
    mtd: export new mtd_mmap_capabilities
    fs: make inode_to_bdi() handle NULL inode
    staging/lustre/llite: get rid of backing_dev_info
    fs: remove default_backing_dev_info
    fs: don't reassign dirty inodes to default_backing_dev_info
    nfs: don't call bdi_unregister
    ceph: remove call to bdi_unregister
    fs: remove mapping->backing_dev_info
    fs: export inode_to_bdi and use it in favor of mapping->backing_dev_info
    nilfs2: set up s_bdi like the generic mount_bdev code
    block_dev: get bdev inode bdi directly from the block device
    block_dev: only write bdev inode on close
    fs: introduce f_op->mmap_capabilities for nommu mmap support
    fs: kill BDI_CAP_SWAP_BACKED
    fs: deduplicate noop_backing_dev_info

    Linus Torvalds
     

21 Jan, 2015

1 commit

  • Now that we never use the backing_dev_info pointer in struct address_space
    we can simply remove it and save 4 to 8 bytes in every inode.

    Signed-off-by: Christoph Hellwig
    Acked-by: Ryusuke Konishi
    Reviewed-by: Tejun Heo
    Reviewed-by: Jan Kara
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

09 Jan, 2015

1 commit


18 Nov, 2014

1 commit


08 Oct, 2014

1 commit


28 Jul, 2014

1 commit


18 Jul, 2014

4 commits

  • This patch allows flock glocks to use a non-blocking dequeue rather
    than dq_wait. It also reverts the previous patch I had posted regarding
    dq_wait. The reverted patch isn't necessarily a bad idea, but I decided
    this might avoid unforeseen side effects, and was therefore safer.

    Signed-off-by: Bob Peterson
    Signed-off-by: Steven Whitehouse

    Bob Peterson
     
  • Normally GFP_KERNEL is ok here, but there is now a rarely used code path
    relating to deallocation of unlinked inodes (in certain corner cases)
    which if hit at times of memory shortage can cause recursion while
    trying to free memory.

    One solution would be to try and move the gfs2_glock_get() call so
    that it is no longer called while another glock is held, but that
    doesn't look at all easy, so GFP_NOFS is the best solution for the
    time being.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • We must not leave items on the LRU list with GLF_LOCK set, since
    they can be removed if the glock is brought back into use, which
    may then potentially result in a hang, waiting for GLF_LOCK to
    clear.

    It doesn't happen very often, since it requires a glock that has
    not been used for a long time to be brought back into use at the
    same moment that the shrinker is part way through disposing of
    glocks.

    The fix is to set GLF_LOCK at a later time, when we already know
    that the other locks can be obtained. Also, we now only release
    the lru_lock in case a resched is needed, rather than on every
    iteration.

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     
  • Function gfs2_glock_dq_wait is supposed to dequeue a glock and then
    wait for the lock to be demoted. The problem is, if this is a shared
    lock, its demote will depend on the other holders, which means you
    might end up waiting forever because the other process is blocked.
    This problem is especially apparent when dealing with nested flocks.

    Signed-off-by: Bob Peterson
    Signed-off-by: Steven Whitehouse

    Bob Peterson
     

16 Jul, 2014

1 commit

  • The current "wait_on_bit" interface requires an 'action'
    function to be provided which does the actual waiting.
    There are over 20 such functions, many of them identical.
    Most cases can be satisfied by one of just two functions, one
    which uses io_schedule() and one which just uses schedule().

    So:
    Rename wait_on_bit and wait_on_bit_lock to
    wait_on_bit_action and wait_on_bit_lock_action
    to make it explicit that they need an action function.

    Introduce new wait_on_bit{,_lock} and wait_on_bit{,_lock}_io
    which are *not* given an action function but implicitly use
    a standard one.
    The decision to error-out if a signal is pending is now made
    based on the 'mode' argument rather than being encoded in the action
    function.

    All instances of the old wait_on_bit and wait_on_bit_lock which
    can use the new version have been changed accordingly and their
    action functions have been discarded.
    wait_on_bit{_lock} does not return any specific error code in the
    event of a signal so the caller must check for non-zero and
    interpolate their own error code as appropriate.

    The wait_on_bit() call in __fscache_wait_on_invalidate() was
    ambiguous as it specified TASK_UNINTERRUPTIBLE but used
    fscache_wait_bit_interruptible as an action function.
    David Howells confirms this should be uniformly
    "uninterruptible"

    The main remaining user of wait_on_bit{,_lock}_action is NFS
    which needs to use a freezer-aware schedule() call.

    A comment in fs/gfs2/glock.c notes that having multiple 'action'
    functions is useful as they display differently in the 'wchan'
    field of 'ps' (and in /proc/$PID/wchan).
    As the new bit_wait{,_io} functions are tagged "__sched", they
    will not show up at all, but something higher in the stack. So
    the distinction will still be visible, only with different
    function names (gfs2_glock_wait versus gfs2_glock_dq_wait in the
    gfs2/glock.c case).

    Since the first version of this patch (against 3.15), two new action
    functions have appeared, one in NFS and one in CIFS. CIFS also now
    uses an action function that makes the same freezer-aware
    schedule call as NFS.

    Signed-off-by: NeilBrown
    Acked-by: David Howells (fscache, keys)
    Acked-by: Steven Whitehouse (gfs2)
    Acked-by: Peter Zijlstra
    Cc: Oleg Nesterov
    Cc: Steve French
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/20140707051603.28027.72349.stgit@notabene.brown
    Signed-off-by: Ingo Molnar

    NeilBrown
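
    The caller-side consequence (check for non-zero, then supply an error
    code) can be sketched in userspace as below. Both functions are
    hypothetical models, and the choice of -EINTR is an assumption; a real
    caller picks whatever code suits it.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Stand-in for the new wait_on_bit(): returns 0 once the bit clears, or
 * an unspecified non-zero value if a signal interrupted the wait.
 */
static int model_wait_on_bit(bool signal_pending)
{
    return signal_pending ? 1 : 0;
}

/* The caller maps any non-zero result to an error code of its own. */
static int model_wait_for_lock(bool signal_pending)
{
    if (model_wait_on_bit(signal_pending))
        return -EINTR;   /* assumption: this caller reports -EINTR */

    return 0;
}
```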
     

18 Apr, 2014

1 commit

  • Mostly scripted conversion of the smp_mb__* barriers.

    Signed-off-by: Peter Zijlstra
    Acked-by: Paul E. McKenney
    Link: http://lkml.kernel.org/n/tip-55dhyhocezdw1dg7u19hmh1u@git.kernel.org
    Cc: Linus Torvalds
    Cc: linux-arch@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

12 Mar, 2014

1 commit

  • This patch closes a small timing window whereby a request to hold the
    transaction glock can get stuck. The problem is that after the DLM has
    granted the lock, it can get into a state whereby it doesn't transition
    the glock to a held state, due to not having requeued the glock state
    machine to finish the transition.

    Signed-off-by: Bob Peterson
    Signed-off-by: Steven Whitehouse

    Bob Peterson
     

07 Mar, 2014

2 commits

  • Add pr_fmt, remove embedded "GFS2: " prefixes.
    This now consistently emits lower case "gfs2: " for each message.

    Other miscellanea around these changes:

    o Add missing newlines
    o Coalesce formats
    o Realign arguments

    Signed-off-by: Joe Perches
    Signed-off-by: Steven Whitehouse

    Joe Perches
     
  • -All printk(KERN_foo converted to pr_foo().
    -Messages updated to fit in 80 columns.
    -fs_macros converted as well.
    -fs_printk removed.

    Signed-off-by: Fabian Frederick
    Signed-off-by: Steven Whitehouse

    Fabian Frederick
     

16 Jan, 2014

1 commit

  • Al Viro has tactfully pointed out that we are using the incorrect
    error code in some cases. This patch fixes that, and also removes
    the (unused) return value for glock dumping.

    > * gfs2_iget() - ENOBUFS instead of ENOMEM. ENOBUFS is
    > "No buffer space available (POSIX.1 (XSI STREAMS option))" and since
    > we don't support STREAMS it's probably fair game, but... what the hell?

    Signed-off-by: Steven Whitehouse
    Cc: Al Viro

    Steven Whitehouse
     

02 Jan, 2014

1 commit