20 Nov, 2014

1 commit

  • The fair reader/writer locks mean that btrfs_clear_path_blocking needs
    to strictly follow lock ordering rules even when we already have
    blocking locks on a given path.

    Before we can clear a blocking lock on the path, we need to make sure
    all of the locks have been converted to blocking. This will remove lock
    inversions against anyone spinning in write_lock() against the buffers
    we're trying to get read locks on. These inversions didn't exist before
    the fair reader/writer locks, but now we need to be more careful.

    We papered over this deadlock in the past by changing
    btrfs_try_read_lock() to be a true trylock against both the spinlock and
    the blocking lock. This was slower, and not sufficient to fix all the
    deadlocks. This patch adds btrfs_tree_read_lock_atomic(), which takes
    the spinlock but only trylocks against the blocking lock (the idea is
    sketched after this entry).

    Signed-off-by: Chris Mason
    Signed-off-by: Josef Bacik
    Reported-by: Patrick Schmid
    cc: stable@vger.kernel.org #v3.15+

    Chris Mason
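
    A minimal sketch of the idea, assuming the extent buffer keeps a
    blocking_writers count next to its rw spinlock; the field and function
    names here are illustrative, not the exact fs/btrfs/locking.c source:

      #include <linux/spinlock.h>
      #include <linux/atomic.h>

      /* illustrative stand-in for the extent_buffer locking fields */
      struct eb_lock {
              rwlock_t lock;                  /* the fair rw spinlock */
              atomic_t blocking_writers;      /* writers that went blocking */
              atomic_t spinning_readers;      /* readers holding the spinlock */
      };

      /*
       * "Get the spinlock but trylock on the blocking lock": take the rw
       * spinlock for reading unconditionally, but back off (return 0) if a
       * blocking writer is present instead of sleeping while we may still
       * hold other spinning locks on the path.
       */
      static int tree_read_lock_atomic(struct eb_lock *eb)
      {
              if (atomic_read(&eb->blocking_writers))
                      return 0;

              read_lock(&eb->lock);
              if (atomic_read(&eb->blocking_writers)) {
                      read_unlock(&eb->lock);
                      return 0;
              }
              atomic_inc(&eb->spinning_readers);
              return 1;
      }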
     

15 Mar, 2013

1 commit


22 Mar, 2012

1 commit


28 Jul, 2011

1 commit

  • The btrfs metadata btree is the source of significant
    lock contention, especially in the root node. This
    commit changes our locking to use a reader/writer
    lock.

    The lock is built on top of rw spinlocks, and it
    extends the lock tracking to remember whether we hold a
    read lock or a write lock when we convert to blocking. Atomics
    count the number of blocking readers or writers at any
    given time (the layout is sketched after this entry).

    It removes all of the adaptive spinning from the old code
    and uses only the spinning/blocking hints inside of btrfs
    to decide when it should continue spinning.

    In read-heavy workloads this is dramatically faster. In write-heavy
    workloads we're still faster because of less contention
    on the root node lock.

    We suffer slightly in dbench because we schedule more often
    during write locks, but all other benchmarks so far are improved.

    Signed-off-by: Chris Mason

    Chris Mason
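
    A hedged sketch of the lock layout described above; the field and
    function names are illustrative, not copied from fs/btrfs/locking.c,
    and memory-ordering details are omitted:

      #include <linux/spinlock.h>
      #include <linux/atomic.h>
      #include <linux/wait.h>

      /* per-btree-block lock: an rw spinlock plus blocking counters */
      struct tree_lock {
              rwlock_t lock;              /* short, spinning critical sections */
              atomic_t blocking_readers;  /* readers that converted to blocking */
              atomic_t blocking_writers;  /* writers that converted to blocking */
              wait_queue_head_t write_wq; /* woken when blocking writers drain */
      };

      /* take the lock for reading, waiting out any blocking writer first */
      static void tree_read_lock(struct tree_lock *tl)
      {
              for (;;) {
                      wait_event(tl->write_wq,
                                 atomic_read(&tl->blocking_writers) == 0);
                      read_lock(&tl->lock);
                      if (atomic_read(&tl->blocking_writers) == 0)
                              return;         /* held as a spinning reader */
                      read_unlock(&tl->lock); /* a writer went blocking, retry */
              }
      }

      /* convert a held spinning read lock to blocking so we may schedule */
      static void tree_set_read_blocking(struct tree_lock *tl)
      {
              atomic_inc(&tl->blocking_readers);
              read_unlock(&tl->lock);
      }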
     

06 May, 2011

1 commit

  • Remove static and global declarations and/or definitions. Reduces size
    of btrfs.ko by ~3.4kB.

      text  data  bss     dec    hex  filename
    402081  7464  200  409745  64091  btrfs.ko.base
    398620  7144  200  405964  631cc  btrfs.ko.remove-all

    Signed-off-by: David Sterba

    David Sterba
     

09 Mar, 2009

1 commit

  • btrfs_tree_locked was being used to make sure a given extent_buffer was
    properly locked in a few places. But it wasn't correct on UP-compiled
    kernels.

    This switches it to use assert_spin_locked() instead, and renames it to
    btrfs_assert_tree_locked to better reflect how it was really being used
    (a sketch follows this entry).

    Signed-off-by: Chris Mason

    Chris Mason
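
    The rename boils down to roughly the following; the struct stub is
    illustrative and only the lock field matters:

      #include <linux/spinlock.h>

      /* illustrative stub for the extent buffer */
      struct eb_stub {
              spinlock_t lock;
      };

      /*
       * assert_spin_locked() is also correct on UP builds, where the old
       * check could not tell whether the lock was really held, hence the
       * switch and the rename to say what the helper actually asserts.
       */
      static inline void btrfs_assert_tree_locked(struct eb_stub *eb)
      {
              assert_spin_locked(&eb->lock);
      }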
     

10 Feb, 2009

1 commit

  • Btrfs was using spin_is_contended to see if it should drop locks before
    doing extent allocations during btrfs_search_slot. The idea was to avoid
    expensive searches in the tree unless the lock was actually contended.

    But, spin_is_contended is specific to the ticket spinlocks on x86, so this
    is causing compile errors everywhere else.

    In practice, the contention could easily appear some time after we started
    doing the extent allocation, and it makes more sense to always drop the lock
    instead.

    Signed-off-by: Chris Mason

    Chris Mason
     

04 Feb, 2009

1 commit

  • Most of the btrfs metadata operations can be protected by a spinlock,
    but some operations still need to schedule.

    So far, btrfs has been using a mutex along with a trylock loop;
    most of the time it is able to avoid going for the full mutex, so
    the trylock loop is a big performance gain.

    This commit is step one for getting rid of the blocking locks entirely.
    btrfs_tree_lock takes a spinlock, and the code explicitly switches
    to a blocking lock when it starts an operation that can schedule.

    We'll be able to get rid of the blocking locks in smaller pieces over time.
    Tracing allows us to find the most common cause of blocking, so we
    can start with the hot spots first.

    The basic idea is:

    btrfs_tree_lock() returns with the spin lock held

    btrfs_set_lock_blocking() sets the EXTENT_BUFFER_BLOCKING bit in
    the extent buffer flags, and then drops the spin lock. The buffer is
    still considered locked by all of the btrfs code.

    If btrfs_tree_lock gets the spinlock but finds the blocking bit set, it drops
    the spin lock and waits on a wait queue for the blocking bit to go away.

    Much of the code that needs to set the blocking bit finishes without actually
    blocking a good percentage of the time. So, an adaptive spin is still
    used against the blocking bit to avoid very high context switch rates.

    btrfs_clear_lock_blocking() clears the blocking bit and returns
    with the spinlock held again.

    btrfs_tree_unlock() can be called on either blocking or spinning locks,
    it does the right thing based on the blocking bit.

    ctree.c has a helper function to set/clear all the locked buffers in a
    path as blocking (the protocol above is sketched after this entry).

    Signed-off-by: Chris Mason

    Chris Mason
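
    A hedged sketch of the protocol above, built from a blocking bit, a
    spinlock and a wait queue. The names are illustrative, and the adaptive
    spin and memory-ordering details are omitted; the real code lives in
    fs/btrfs/locking.c and tracks more state:

      #include <linux/spinlock.h>
      #include <linux/wait.h>
      #include <linux/bitops.h>

      #define EB_BLOCKING 0   /* bit: lock is held in blocking mode */

      struct eb_lock {
              spinlock_t lock;
              unsigned long flags;    /* EB_BLOCKING lives here */
              wait_queue_head_t wq;   /* would-be lockers sleep here */
      };

      /* returns with the spinlock held; the buffer counts as locked */
      static void tree_lock(struct eb_lock *eb)
      {
              for (;;) {
                      spin_lock(&eb->lock);
                      if (!test_bit(EB_BLOCKING, &eb->flags))
                              return;
                      /* someone holds the lock blocking; wait for the bit */
                      spin_unlock(&eb->lock);
                      wait_event(eb->wq, !test_bit(EB_BLOCKING, &eb->flags));
              }
      }

      /* switch a held spinning lock to blocking so the holder may schedule */
      static void set_lock_blocking(struct eb_lock *eb)
      {
              set_bit(EB_BLOCKING, &eb->flags);
              spin_unlock(&eb->lock);
              /* the buffer is still considered locked by the rest of btrfs */
      }

      /* re-take the spinlock, clear the bit and wake anyone waiting on it */
      static void clear_lock_blocking(struct eb_lock *eb)
      {
              spin_lock(&eb->lock);
              clear_bit(EB_BLOCKING, &eb->flags);
              wake_up(&eb->wq);
      }

      /* works on either a spinning or a blocking lock, keyed off the bit */
      static void tree_unlock(struct eb_lock *eb)
      {
              if (test_bit(EB_BLOCKING, &eb->flags)) {
                      clear_bit(EB_BLOCKING, &eb->flags);
                      wake_up(&eb->wq);
              } else {
                      spin_unlock(&eb->lock);
              }
      }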
     

25 Sep, 2008

2 commits

    A btree block COW has two parts: the first is to allocate a destination
    block and the second is to copy the old block over.

    The first part needs locks in the extent allocation tree, and may need to
    do IO. This changeset splits that into a separate function that can be
    called without any tree locks held.

    btrfs_search_slot is changed to drop its path and start over if it has
    to COW a contended block. This often means that many writers will
    prealloc a new destination for the same contended block, but they
    cache their prealloc for later use on lower levels in the tree
    (sketched after this entry).

    Signed-off-by: Chris Mason

    Chris Mason
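
    A rough sketch of the two-phase COW and the retry in the search. Every
    name below is a stand-in and the contention check itself is elided:

      #include <linux/types.h>

      /* illustrative: a reserved destination block for a pending COW */
      struct cow_reserve {
              u64 bytenr;     /* preallocated block, 0 if none cached */
      };

      /* phase 1: may take extent-tree locks and do IO;
       * must be called with no FS-tree locks held */
      static int reserve_cow_block(struct cow_reserve *res)
      {
              res->bytenr = 4096;     /* pretend we reserved a block */
              return 0;
      }

      /* phase 2: quick copy into the reserved block, under the tree lock */
      static int do_cow_copy(struct cow_reserve *res)
      {
              /* ... copy the old block into res->bytenr ... */
              return 0;
      }

      static int search_with_cow_retry(struct cow_reserve *res, bool contended)
      {
              int ret;
      again:
              /* ... walk down the tree, locking top-down ... */
              if (contended && !res->bytenr) {
                      /* drop the whole path, do the slow part unlocked */
                      ret = reserve_cow_block(res);
                      if (ret)
                              return ret;
                      goto again;     /* the prealloc is reused lower down */
              }
              return do_cow_copy(res);
      }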
     
  • The allocation trees and the chunk trees are serialized via their own
    dedicated mutexes. This means allocation location is still not very
    fine grained.

    The main FS btree is protected by locks on each block in the btree. Locks
    are taken top-down, and as processing finishes on a given level of the
    tree, the lock is released only after the lower level has been locked
    (see the sketch after this entry).

    The end result of a search is now a path where only the lowest level
    is locked. Releasing or freeing the path drops any locks held.

    Signed-off-by: Chris Mason

    Chris Mason
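
    A hedged sketch of the top-down "lock coupling" described above; the
    structure and the child lookup are illustrative, and lockdep
    annotations are omitted:

      #include <linux/spinlock.h>

      /* illustrative per-block lock plus a child array; key lookup elided */
      struct tree_block {
              spinlock_t lock;
              struct tree_block *child[16];   /* stand-in for a keyed lookup */
              int level;                      /* 0 == leaf */
      };

      /*
       * Walk from the root towards a leaf.  Each child is locked before
       * the parent is released, so writers above us cannot change the
       * block we are descending into.  The caller ends up holding only
       * the leaf lock; releasing or freeing the path would drop it.
       */
      static struct tree_block *search_to_leaf(struct tree_block *root, int slot)
      {
              struct tree_block *cur = root;

              spin_lock(&cur->lock);
              while (cur->level > 0) {
                      struct tree_block *next = cur->child[slot];

                      spin_lock(&next->lock);  /* lock the lower level first */
                      spin_unlock(&cur->lock); /* then drop the finished one */
                      cur = next;
              }
              return cur;     /* returned locked; the caller unlocks it */
      }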