08 May, 2014

1 commit


07 May, 2014

5 commits


09 Aug, 2013

1 commit

  • The reiserfs write lock replaced the BKL and uses similar semantics.

    Frederic's locking code makes a distinction between when the lock is nested
    and when it's being acquired/released, but I don't think that's the right
    distinction to make.

    The right distinction is between the lock being released at end-of-use and
    the lock being released for a schedule. The unlock should return the depth
    and the lock should restore it, rather than the other way around as it is now.

    This patch implements that and adds a number of places where the lock
    should be dropped.

    Signed-off-by: Jeff Mahoney

    Jeff Mahoney
     

21 Mar, 2012

1 commit


30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities include those
    headers directly instead of assuming availability. As this conversion
    needs to touch large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the followings.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and try to put the new include such that its order conforms
    to its surrounding. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    wildly available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build test were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable easily on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

10 Dec, 2009

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (42 commits)
    tree-wide: fix misspelling of "definition" in comments
    reiserfs: fix misspelling of "journaled"
    doc: Fix a typo in slub.txt.
    inotify: remove superfluous return code check
    hdlc: spelling fix in find_pvc() comment
    doc: fix regulator docs cut-and-pasteism
    mtd: Fix comment in Kconfig
    doc: Fix IRQ chip docs
    tree-wide: fix assorted typos all over the place
    drivers/ata/libata-sff.c: comment spelling fixes
    fix typos/grammos in Documentation/edac.txt
    sysctl: add missing comments
    fs/debugfs/inode.c: fix comment typos
    sgivwfb: Make use of ARRAY_SIZE.
    sky2: fix sky2_link_down copy/paste comment error
    tree-wide: fix typos "couter" -> "counter"
    tree-wide: fix typos "offest" -> "offset"
    fix kerneldoc for set_irq_msi()
    spidev: fix double "of of" in comment
    comment typo fix: sybsystem -> subsystem
    ...

    Linus Torvalds
     

05 Dec, 2009

1 commit


14 Sep, 2009

3 commits

  • When do_balance() balances the tree, a trick is performed to
    provide the ability for other tree writers/readers to check whether
    do_balance() is executing concurrently (requires CONFIG_REISERFS_CHECK).

    This is done to protect concurrent accesses to the tree. The trick
    is the following:

    When do_balance is called, a unique global variable called cur_tb
    takes a pointer to the current tree to be rebalanced.
    Once do_balance finishes its work, cur_tb takes the NULL value.

    Then, concurrent tree readers/writers just have to check the value
    of cur_tb to ensure do_balance isn't executing concurrently.
    If it is, then it proves that schedule() occured on do_balance(),
    which then relaxed the bkl that protected the tree.

    Now that the bkl has be turned into a mutex, this check is still
    fine even though do_balance() becomes preemptible: the write lock
    will not be automatically released on schedule(), so the tree is
    still protected.

    But this is only fine if we have a single reiserfs mountpoint.
    Indeed, because the bkl is a global lock, it didn't allowed
    concurrent executions between a tree reader/writer in a mount point
    and a do_balance() on another tree from another mountpoint.

    So assuming all these readers/writers weren't supposed to be
    reentrant, the current check now sometimes detect false positives with
    the current per-superblock mutex which allows this reentrancy.

    This patch keeps the concurrent tree accesses check but moves it
    per superblock, so that only trees from a same mount point are
    checked to be not accessed concurrently.

    [ Impact: fix spurious panic while running several reiserfs mount-points ]

    Cc: Jeff Mahoney
    Cc: Chris Mason
    Cc: Ingo Molnar
    Cc: Alexander Beregalov
    Signed-off-by: Frederic Weisbecker

    Frederic Weisbecker
     
  • get_neighbors() is used to get the left and/or right blocks
    against a given one in order to balance a tree.

    sb_bread() is used to read the buffer of these neighors blocks and
    while it waits for this operation, it might sleep.

    The bkl was released at this point, and then we can also release
    the write lock before calling sb_bread().

    This is safe because if the filesystem is changed after this
    lock release, the function returns REPEAT_SEARCH (aka SCHEDULE_OCCURRED
    in the function header comments) in order to repeat the neighbhor
    research.

    [ Impact: release the reiserfs write lock when it is not needed ]

    Cc: Jeff Mahoney
    Cc: Chris Mason
    Cc: Alexander Beregalov
    Signed-off-by: Frederic Weisbecker

    Frederic Weisbecker
     
  • This patch is an attempt to remove the Bkl based locking scheme from
    reiserfs and is intended.

    It is a bit inspired from an old attempt by Peter Zijlstra:

    http://lkml.indiana.edu/hypermail/linux/kernel/0704.2/2174.html

    The bkl is heavily used in this filesystem to prevent from
    concurrent write accesses on the filesystem.

    Reiserfs makes a deep use of the specific properties of the Bkl:

    - It can be acqquired recursively by a same task
    - It is released on the schedule() calls and reacquired when schedule() returns

    The two properties above are a roadmap for the reiserfs write locking so it's
    very hard to simply replace it with a common mutex.

    - We need a recursive-able locking unless we want to restructure several blocks
    of the code.
    - We need to identify the sites where the bkl was implictly relaxed
    (schedule, wait, sync, etc...) so that we can in turn release and
    reacquire our new lock explicitly.
    Such implicit releases of the lock are often required to let other
    resources producer/consumer do their job or we can suffer unexpected
    starvations or deadlocks.

    So the new lock that replaces the bkl here is a per superblock mutex with a
    specific property: it can be acquired recursively by a same task, like the
    bkl.

    For such purpose, we integrate a lock owner and a lock depth field on the
    superblock information structure.

    The first axis on this patch is to turn reiserfs_write_(un)lock() function
    into a wrapper to manage this mutex. Also some explicit calls to
    lock_kernel() have been converted to reiserfs_write_lock() helpers.

    The second axis is to find the important blocking sites (schedule...(),
    wait_on_buffer(), sync_dirty_buffer(), etc...) and then apply an explicit
    release of the write lock on these locations before blocking. Then we can
    safely wait for those who can give us resources or those who need some.
    Typically this is a fight between the current writer, the reiserfs workqueue
    (aka the async commiter) and the pdflush threads.

    The third axis is a consequence of the second. The write lock is usually
    on top of a lock dependency chain which can include the journal lock, the
    flush lock or the commit lock. So it's dangerous to release and trying to
    reacquire the write lock while we still hold other locks.

    This is fine with the bkl:

    T1 T2

    lock_kernel()
    mutex_lock(A)
    unlock_kernel()
    // do something
    lock_kernel()
    mutex_lock(A) -> already locked by T1
    schedule() (and then unlock_kernel())
    lock_kernel()
    mutex_unlock(A)
    ....

    This is not fine with a mutex:

    T1 T2

    mutex_lock(write)
    mutex_lock(A)
    mutex_unlock(write)
    // do something
    mutex_lock(write)
    mutex_lock(A) -> already locked by T1
    schedule()

    mutex_lock(write) -> already locked by T2
    deadlock

    The solution in this patch is to provide a helper which releases the write
    lock and sleep a bit if we can't lock a mutex that depend on it. It's another
    simulation of the bkl behaviour.

    The last axis is to locate the fs callbacks that are called with the bkl held,
    according to Documentation/filesystem/Locking.

    Those are:

    - reiserfs_remount
    - reiserfs_fill_super
    - reiserfs_put_super

    Reiserfs didn't need to explicitly lock because of the context of these callbacks.
    But now we must take care of that with the new locking.

    After this patch, reiserfs suffers from a slight performance regression (for now).
    On UP, a high volume write with dd reports an average of 27 MB/s instead
    of 30 MB/s without the patch applied.

    Signed-off-by: Frederic Weisbecker
    Reviewed-by: Ingo Molnar
    Cc: Jeff Mahoney
    Cc: Peter Zijlstra
    Cc: Bron Gondwana
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Alexander Viro
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     

31 Mar, 2009

9 commits

  • This patch renames n_, c_, etc variables to something more sane. This
    is the sixth in a series of patches to rip out some of the awful
    variable naming in reiserfs.

    Signed-off-by: Jeff Mahoney
    Signed-off-by: Linus Torvalds

    Jeff Mahoney
     
  • This patch is a simple s/p_._//g to the reiserfs code. This is the
    fifth in a series of patches to rip out some of the awful variable
    naming in reiserfs.

    Signed-off-by: Jeff Mahoney
    Signed-off-by: Linus Torvalds

    Jeff Mahoney
     
  • This patch is a simple s/p_s_tb/tb/g to the reiserfs code. This is the
    fourth in a series of patches to rip out some of the awful variable
    naming in reiserfs.

    Signed-off-by: Jeff Mahoney
    Signed-off-by: Linus Torvalds

    Jeff Mahoney
     
  • This patch is a simple s/p_s_bh/bh/g to the reiserfs code. This is the
    second in a series of patches to rip out some of the awful variable
    naming in reiserfs.

    Signed-off-by: Jeff Mahoney
    Signed-off-by: Linus Torvalds

    Jeff Mahoney
     
  • This patch is a simple s/p_s_sb/sb/g to the reiserfs code. This is the
    first in a series of patches to rip out some of the awful variable
    naming in reiserfs.

    Signed-off-by: Jeff Mahoney
    Signed-off-by: Linus Torvalds

    Jeff Mahoney
     
  • This patch strips trailing whitespace from the reiserfs code.

    Signed-off-by: Jeff Mahoney
    Signed-off-by: Linus Torvalds

    Jeff Mahoney
     
  • This patch cleans up some redundancies in the reiserfs tree path code.

    decrement_bcount() is essentially the same function as brelse(), so we use
    that instead.

    decrement_counters_in_path() is exactly the same function as pathrelse(), so
    we kill that and use pathrelse() instead.

    There's also a bit of cleanup that makes the code a bit more readable.

    Signed-off-by: Jeff Mahoney
    Signed-off-by: Linus Torvalds

    Jeff Mahoney
     
  • ReiserFS panics can be somewhat inconsistent.
    In some cases:
    * a unique identifier may be associated with it
    * the function name may be included
    * the device may be printed separately

    This patch aims to make warnings more consistent. reiserfs_warning() prints
    the device name, so printing it a second time is not required. The function
    name for a warning is always helpful in debugging, so it is now automatically
    inserted into the output. Hans has stated that every warning should have
    a unique identifier. Some cases lack them, others really shouldn't have them.
    reiserfs_warning() now expects an id associated with each message. In the
    rare case where one isn't needed, "" will suffice.

    Signed-off-by: Jeff Mahoney
    Signed-off-by: Linus Torvalds

    Jeff Mahoney
     
  • ReiserFS warnings can be somewhat inconsistent.
    In some cases:
    * a unique identifier may be associated with it
    * the function name may be included
    * the device may be printed separately

    This patch aims to make warnings more consistent. reiserfs_warning() prints
    the device name, so printing it a second time is not required. The function
    name for a warning is always helpful in debugging, so it is now automatically
    inserted into the output. Hans has stated that every warning should have
    a unique identifier. Some cases lack them, others really shouldn't have them.
    reiserfs_warning() now expects an id associated with each message. In the
    rare case where one isn't needed, "" will suffice.

    Signed-off-by: Jeff Mahoney
    Signed-off-by: Linus Torvalds

    Jeff Mahoney
     

31 Mar, 2008

1 commit


09 Dec, 2006

1 commit


01 Jul, 2006

1 commit


26 Mar, 2006

1 commit

  • Clean up several places where gcc issues warnings when -W is specified.
    Thanks to Neil for finding that.

    Signed-off-by: Vladimir V. Saveliev
    Cc: Neil Brown
    Signed-off-by: Hans Reiser
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vladimir V. Saveliev
     

02 Feb, 2006

1 commit


28 Oct, 2005

1 commit

  • - ->releasepage() annotated (s/int/gfp_t), instances updated
    - missing gfp_t in fs/* added
    - fixed misannotation from the original sweep caught by bitwise checks:
    XFS used __nocast both for gfp_t and for flags used by XFS allocator.
    The latter left with unsigned int __nocast; we might want to add a
    different type for those but for now let's leave them alone. That,
    BTW, is a case when __nocast use had been actively confusing - it had
    been used in the same code for two different and similar types, with
    no way to catch misuses. Switch of gfp_t to bitwise had caught that
    immediately...

    One tricky bit is left alone to be dealt with later - mapping->flags is
    a mix of gfp_t and error indications. Left alone for now.

    Signed-off-by: Al Viro
    Signed-off-by: Linus Torvalds

    Al Viro
     

13 Jul, 2005

1 commit

  • This was a pure indentation change, using:

    scripts/Lindent fs/reiserfs/*.c include/linux/reiserfs_*.h

    to make reiserfs match the regular Linux indentation style. As Jeff
    Mahoney writes:

    The ReiserFS code is a mix of a number of different coding styles, sometimes
    different even from line-to-line. Since the code has been relatively stable
    for quite some time and there are few outstanding patches to be applied, it
    is time to reformat the code to conform to the Linux style standard outlined
    in Documentation/CodingStyle.

    This patch contains the result of running scripts/Lindent against
    fs/reiserfs/*.c and include/linux/reiserfs_*.h. There are places where the
    code can be made to look better, but I'd rather keep those patches separate
    so that there isn't a subtle by-hand hand accident in the middle of a huge
    patch. To be clear: This patch is reformatting *only*.

    A number of patches may follow that continue to make the code more consistent
    with the Linux coding style.

    Hans wasn't particularly enthusiastic about these patches, but said he
    wouldn't really oppose them either.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

17 Apr, 2005

1 commit

  • Initial git repository build. I'm not bothering with the full history,
    even though we have it. We can create a separate "historical" git
    archive of that later if we want to, and in the meantime it's about
    3.2GB when imported into git - space that would just make the early
    git days unnecessarily complicated, when we don't have a lot of good
    infrastructure for it.

    Let it rip!

    Linus Torvalds