01 Nov, 2016

1 commit


28 Sep, 2016

1 commit

  • CURRENT_TIME_SEC is not y2038 safe. current_time() will
    be transitioned to use 64 bit time along with vfs in a
    separate patch.
    There is no plan to transition CURRENT_TIME_SEC to use
    y2038 safe time interfaces.

    current_time() will also be extended to use superblock
    range checking parameters when range checking is introduced.

    This works because alloc_super() fills in the s_time_gran
    in the super block to NSEC_PER_SEC.
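
    At a typical call site, the conversion this series drives looks like the
    sketch below (the timestamp update shown is illustrative, not taken from
    this patch):

        /* Before: second-granularity stamp, not y2038 safe */
        inode->i_mtime = inode->i_ctime = CURRENT_TIME_SEC;

        /* After: current_time() truncates to inode->i_sb->s_time_gran,
         * which alloc_super() initializes to NSEC_PER_SEC */
        inode->i_mtime = inode->i_ctime = current_time(inode);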

    Signed-off-by: Deepa Dinamani
    Acked-by: Jan Kara
    Signed-off-by: Al Viro

    Deepa Dinamani
     

21 Jul, 2016

1 commit

  • These two are confusing leftovers of the old world order, combining
    values from the REQ_OP_ and REQ_ namespaces. For callers that don't
    special-case, we mostly just replace bi_rw with bio_data_dir or
    op_is_write, except for the few cases where a switch over the REQ_OP_
    values makes more sense. Any check for READA is replaced with an
    explicit check for REQ_RAHEAD. Also remove the READA alias for
    REQ_RAHEAD.
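
    At a call site the change typically looks like this (the local variables
    are illustrative only):

        /* Before: direction and read-ahead derived from the mixed bi_rw value */
        is_write = bio->bi_rw & REQ_WRITE;
        is_readahead = bio->bi_rw & READA;

        /* After: explicit helpers and flags from the separated namespaces */
        is_write = op_is_write(bio_op(bio));     /* or bio_data_dir(bio) == WRITE */
        is_readahead = bio->bi_rw & REQ_RAHEAD;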

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Johannes Thumshirn
    Reviewed-by: Mike Christie
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

08 Jun, 2016

1 commit


05 Apr, 2016

1 commit

  • The PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a *long*
    time ago with the promise that one day it would be possible to implement
    the page cache with bigger chunks than PAGE_SIZE.

    This promise never materialized, and it is unlikely to.

    We have many places where PAGE_CACHE_SIZE is assumed to be equal to
    PAGE_SIZE, and it's a constant source of confusion about whether the
    PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
    especially on the border between fs and mm.

    Globally switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much
    breakage to be doable.

    Let's stop pretending that pages in page cache are special. They are
    not.

    The changes are pretty straight-forward:

    - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> E;

    - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> E;

    - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};

    - page_cache_get() -> get_page();

    - page_cache_release() -> put_page();

    This patch contains automated changes generated with coccinelle using
    the script below. For some reason, coccinelle doesn't patch header files;
    I've run spatch on them manually.

    The only adjustment after coccinelle is a revert of the changes to the
    PAGE_CACHE_ALIGN definition: we are going to drop it later.

    There are a few places in the code that coccinelle didn't reach. I'll
    fix them manually in a separate patch. Comments and documentation will
    also be addressed in a separate patch.

    virtual patch

    @@
    expression E;
    @@
    - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
    + E

    @@
    expression E;
    @@
    - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
    + E

    @@
    @@
    - PAGE_CACHE_SHIFT
    + PAGE_SHIFT

    @@
    @@
    - PAGE_CACHE_SIZE
    + PAGE_SIZE

    @@
    @@
    - PAGE_CACHE_MASK
    + PAGE_MASK

    @@
    expression E;
    @@
    - PAGE_CACHE_ALIGN(E)
    + PAGE_ALIGN(E)

    @@
    expression E;
    @@
    - page_cache_get(E)
    + get_page(E)

    @@
    expression E;
    @@
    - page_cache_release(E)
    + put_page(E)
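
    Applied to an ordinary filesystem call site, the conversion amounts to
    something like the following (the surrounding code is illustrative only):

        /* Before */
        index  = pos >> PAGE_CACHE_SHIFT;
        offset = pos & (PAGE_CACHE_SIZE - 1);
        page_cache_release(page);

        /* After */
        index  = pos >> PAGE_SHIFT;
        offset = pos & (PAGE_SIZE - 1);
        put_page(page);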

    Signed-off-by: Kirill A. Shutemov
    Acked-by: Michal Hocko
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     

09 Aug, 2014

1 commit


15 May, 2014

1 commit


07 May, 2014

7 commits


09 Aug, 2013

2 commits

  • Previous commits released the write lock across quota operations but
    missed several places. In particular, the free operations can also
    call into the file system code and take the write lock, causing
    deadlocks.

    This patch introduces some more helpers and uses them for quota call
    sites. Without this patch applied, reiserfs + quotas runs into deadlocks
    under anything more than trivial load.
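
    The resulting pattern around a quota operation looks roughly like this
    (helper names as used in this series; the surrounding code is a sketch):

        int depth;

        /* Drop the write lock (remembering its recursion depth) before
         * calling into the quota code, which may re-enter the filesystem
         * and take the write lock itself. */
        depth = reiserfs_write_unlock_nested(inode->i_sb);
        ret = dquot_alloc_space_nodirty(inode, bytes);
        reiserfs_write_lock_nested(inode->i_sb, depth);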

    Signed-off-by: Jeff Mahoney

    Jeff Mahoney
     
  • The reiserfs write lock replaced the BKL and uses similar semantics.

    Frederic's locking code makes a distinction between when the lock is nested
    and when it's being acquired/released, but I don't think that's the right
    distinction to make.

    The right distinction is between the lock being released at end-of-use and
    the lock being released for a schedule. The unlock should return the depth
    and the lock should restore it, rather than the other way around as it is now.

    This patch implements that and adds a number of places where the lock
    should be dropped.
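
    A simplified sketch of the depth-returning unlock described above (error
    handling and the not-locked case are omitted):

        int reiserfs_write_unlock_nested(struct super_block *s)
        {
                struct reiserfs_sb_info *sb_i = REISERFS_SB(s);
                int depth = sb_i->lock_depth;

                sb_i->lock_depth = -1;
                sb_i->lock_owner = NULL;
                mutex_unlock(&sb_i->lock);

                return depth;   /* caller hands this back when relocking */
        }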

    Signed-off-by: Jeff Mahoney

    Jeff Mahoney
     

20 Nov, 2012

1 commit


22 Mar, 2012

1 commit

  • Pull vfs pile 1 from Al Viro:
    "This is _not_ all; in particular, Miklos' and Jan's stuff is not there
    yet."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (64 commits)
    ext4: initialization of ext4_li_mtx needs to be done earlier
    debugfs-related mode_t whack-a-mole
    hfsplus: add an ioctl to bless files
    hfsplus: change finder_info to u32
    hfsplus: initialise userflags
    qnx4: new helper - try_extent()
    qnx4: get rid of qnx4_bread/qnx4_getblk
    take removal of PF_FORKNOEXEC to flush_old_exec()
    trim includes in inode.c
    um: uml_dup_mmap() relies on ->mmap_sem being held, but activate_mm() doesn't hold it
    um: embed ->stub_pages[] into mmu_context
    gadgetfs: list_for_each_safe() misuse
    ocfs2: fix leaks on failure exits in module_init
    ecryptfs: make register_filesystem() the last potential failure exit
    ntfs: forgets to unregister sysctls on register_filesystem() failure
    logfs: missing cleanup on register_filesystem() failure
    jfs: mising cleanup on register_filesystem() failure
    make configfs_pin_fs() return root dentry on success
    configfs: configfs_create_dir() has parent dentry in dentry->d_parent
    configfs: sanitize configfs_create()
    ...

    Linus Torvalds
     

21 Mar, 2012

1 commit


20 Mar, 2012

1 commit


05 Mar, 2010

1 commit

  • Get rid of the alloc_space, free_space, reserve_space, claim_space and
    release_rsv dquot operations - they are always called from the filesystem,
    and if a filesystem really needs its own (which none currently does)
    it can just call into its own routine directly.

    Move shared logic into the common __dquot_alloc_space,
    dquot_claim_space_nodirty and __dquot_free_space low-level methods,
    and rationalize the wrappers around them to move as much code as
    possible into the common block for CONFIG_QUOTA vs. not. Also rename
    all these helpers to be named dquot_* instead of vfs_dq_*.
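
    For a filesystem call site, the rename part boils down to something like
    this (the surrounding code is illustrative):

        /* e.g. in a block allocation path */
        err = dquot_alloc_space_nodirty(inode, bytes);  /* was vfs_dq_alloc_space_nodirty() */
        if (err)
                return err;

        /* and symmetrically, dquot_free_space_nodirty() replaces
         * vfs_dq_free_space_nodirty() on the freeing side */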

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jan Kara

    Christoph Hellwig
     

14 Sep, 2009

5 commits

  • When do_balance() balances the tree, a trick is performed to
    provide the ability for other tree writers/readers to check whether
    do_balance() is executing concurrently (requires CONFIG_REISERFS_CHECK).

    This is done to protect concurrent accesses to the tree. The trick
    is the following:

    When do_balance is called, a unique global variable called cur_tb
    takes a pointer to the current tree to be rebalanced.
    Once do_balance finishes its work, cur_tb takes the NULL value.

    Then, concurrent tree readers/writers just have to check the value
    of cur_tb to ensure do_balance isn't executing concurrently.
    If it is, then it proves that schedule() occurred in do_balance(),
    which then relaxed the bkl that protected the tree.

    Now that the bkl has been turned into a mutex, this check is still
    fine even though do_balance() becomes preemptible: the write lock
    will not be automatically released on schedule(), so the tree is
    still protected.

    But this is only fine if we have a single reiserfs mount point.
    Indeed, because the bkl is a global lock, it didn't allow
    concurrent executions between a tree reader/writer in one mount point
    and a do_balance() on the tree of another mount point.

    So, assuming none of these readers/writers was supposed to be
    reentrant, the check now sometimes detects false positives with
    the current per-superblock mutex, which allows this reentrancy.

    This patch keeps the concurrent tree access check but moves it
    per superblock, so that only trees from the same mount point are
    checked for concurrent access.
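
    The check itself is unchanged; only its scope moves, roughly as follows
    (simplified sketch, CONFIG_REISERFS_CHECK builds only):

        /* Before: one global, shared by every mounted reiserfs */
        cur_tb = tb;

        /* After: one pointer per super block, so only accesses to the
         * same tree are ever flagged as concurrent */
        REISERFS_SB(tb->tb_sb)->cur_tb = tb;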

    [ Impact: fix spurious panic while running several reiserfs mount-points ]

    Cc: Jeff Mahoney
    Cc: Chris Mason
    Cc: Ingo Molnar
    Cc: Alexander Beregalov
    Signed-off-by: Frederic Weisbecker

    Frederic Weisbecker
     
  • search_by_key() is the site which most requires the lock.
    This is mostly because it is a very central function and also
    because it releases/reacquires the write lock at least once each
    time it is called.

    Such a release/reacquire creates a lot of contention here and
    also widens the window in which another thread can change the tree.
    When that happens, the current path search over the tree must be
    retried from the beginning (the root), which is a wasteful and
    time-consuming recovery.

    This patch factorizes two release/reacquire sequences:

    - reading leaf nodes blocks
    - reading current block

    The latter immediately follows the former.

    The whole sequence is safe as a single unlocked section because
    we check just after if the tree has changed during these operations.
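
    The factorized sequence follows the usual reiserfs retry pattern, sketched
    here in simplified form (fs_gen is saved earlier in the search):

        reiserfs_write_unlock(sb);
        bh = sb_bread(sb, block_number);        /* may sleep while unlocked */
        reiserfs_write_lock(sb);

        /* Safe as one unlocked section because we recheck afterwards: */
        if (fs_changed(fs_gen, sb))
                goto research;                  /* tree changed, retry from the root */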

    Cc: Jeff Mahoney
    Cc: Chris Mason
    Cc: Ingo Molnar
    Cc: Alexander Beregalov
    Signed-off-by: Frederic Weisbecker

    Frederic Weisbecker
     
  • search_by_key() is a central function in reiserfs which searches
    the path in the fs tree from the root to a node, given its key.

    It is the function that takes the write lock most often,
    because it sits on a very hot path.

    Also, we forget to release the lock while reading the next tree node,
    so we hold the lock wastefully.

    With this patch, we release the lock while reading the current node and
    its children, all in one go. This should be safe because we hold a
    reference to these blocks, and even if we read a block that is
    concurrently changed, the later fs_changed check will make us retry the
    path from the root.

    [ Impact: release the write lock while unused in a hot path ]

    Cc: Jeff Mahoney
    Cc: Chris Mason
    Cc: Ingo Molnar
    Cc: Alexander Beregalov
    Signed-off-by: Frederic Weisbecker

    Frederic Weisbecker
     
  • prepare_for_delete_or_cut() can process several types of items, including
    indirect items, i.e. items which contain no file data but pointers to
    the unformatted nodes holding the scattered data of a file.

    In this case it has to zero out these pointers to block numbers of
    unformatted nodes and free those blocks in the bitmap.

    This can take some time, so a reschedule is performed between each
    block processed. We can safely release the write lock while
    rescheduling, as the bkl did, because the code checks just after
    whether the item has moved while we slept.

    [ Impact: release the reiserfs write lock when it is not needed ]

    Cc: Jeff Mahoney
    Cc: Chris Mason
    Cc: Alexander Beregalov
    Signed-off-by: Frederic Weisbecker

    Frederic Weisbecker
     
  • This patch is an attempt to remove the Bkl based locking scheme from
    reiserfs.

    It is a bit inspired from an old attempt by Peter Zijlstra:

    http://lkml.indiana.edu/hypermail/linux/kernel/0704.2/2174.html

    The bkl is heavily used in this filesystem to prevent concurrent
    write accesses to the filesystem.

    Reiserfs makes deep use of the specific properties of the Bkl:

    - It can be acquired recursively by the same task
    - It is released on schedule() calls and reacquired when schedule() returns

    The two properties above are a roadmap for the reiserfs write locking, so it's
    very hard to simply replace it with a common mutex.

    - We need a lock that can be taken recursively, unless we want to restructure
    several blocks of the code.
    - We need to identify the sites where the bkl was implicitly relaxed
    (schedule, wait, sync, etc...) so that we can in turn release and
    reacquire our new lock explicitly.
    Such implicit releases of the lock are often required to let other
    resource producers/consumers do their job, or we can suffer unexpected
    starvations or deadlocks.

    So the new lock that replaces the bkl here is a per-superblock mutex with a
    specific property: it can be acquired recursively by the same task, like the
    bkl.

    For that purpose, we add a lock owner and a lock depth field to the
    superblock information structure.
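
    A minimal sketch of the acquisition side under that scheme (simplified;
    the field names just mirror the description above):

        void reiserfs_write_lock(struct super_block *s)
        {
                struct reiserfs_sb_info *sb_i = REISERFS_SB(s);

                if (sb_i->lock_owner != current) {
                        mutex_lock(&sb_i->lock);
                        sb_i->lock_owner = current;
                }

                /* lock_depth counts recursive acquisitions by the owner */
                sb_i->lock_depth++;
        }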

    The first axis of this patch is to turn the reiserfs_write_(un)lock()
    functions into wrappers that manage this mutex. Also, some explicit calls to
    lock_kernel() have been converted to reiserfs_write_lock() helpers.

    The second axis is to find the important blocking sites (schedule...(),
    wait_on_buffer(), sync_dirty_buffer(), etc...) and then apply an explicit
    release of the write lock at these locations before blocking. Then we can
    safely wait for those who can give us resources or those who need some.
    Typically this is a fight between the current writer, the reiserfs workqueue
    (aka the async committer) and the pdflush threads.

    The third axis is a consequence of the second. The write lock is usually
    at the top of a lock dependency chain which can include the journal lock, the
    flush lock or the commit lock. So it's dangerous to release and then try to
    reacquire the write lock while we still hold other locks.

    This is fine with the bkl:

    T1: lock_kernel()
    T1: mutex_lock(A)
    T1: unlock_kernel()
    T1: // do something
    T2: lock_kernel()
    T2: mutex_lock(A) -> already locked by T1
    T2: schedule() (and then unlock_kernel())
    T1: lock_kernel()
    T1: mutex_unlock(A)
    ....

    This is not fine with a mutex:

    T1: mutex_lock(write)
    T1: mutex_lock(A)
    T1: mutex_unlock(write)
    T1: // do something
    T2: mutex_lock(write)
    T2: mutex_lock(A) -> already locked by T1
    T2: schedule()
    T1: mutex_lock(write) -> already locked by T2
        deadlock

    The solution in this patch is to provide a helper which releases the write
    lock and sleeps a bit if we can't lock a mutex that depends on it. It's another
    simulation of the bkl behaviour.
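
    One way to sketch such a helper (the name here is made up for illustration;
    the patch's version, as described, may instead retry with a short sleep):

        static void reiserfs_lock_dependent_mutex(struct mutex *m,
                                                  struct super_block *s)
        {
                /* Drop the write lock so whoever holds m can make progress,
                 * then block on m and retake the write lock afterwards. */
                reiserfs_write_unlock(s);
                mutex_lock(m);
                reiserfs_write_lock(s);
        }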

    The last axis is to locate the fs callbacks that are called with the bkl held,
    according to Documentation/filesystems/Locking.

    Those are:

    - reiserfs_remount
    - reiserfs_fill_super
    - reiserfs_put_super

    Reiserfs didn't need to explicitly lock because of the context of these callbacks.
    But now we must take care of that with the new locking.

    After this patch, reiserfs suffers from a slight performance regression (for now).
    On UP, a high volume write with dd reports an average of 27 MB/s instead
    of 30 MB/s without the patch applied.

    Signed-off-by: Frederic Weisbecker
    Reviewed-by: Ingo Molnar
    Cc: Jeff Mahoney
    Cc: Peter Zijlstra
    Cc: Bron Gondwana
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Alexander Viro
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     

31 Mar, 2009

12 commits

  • * reiserfs-updates: (35 commits)
    reiserfs: rename [cn]_* variables
    reiserfs: rename p_._ variables
    reiserfs: rename p_s_tb to tb
    reiserfs: rename p_s_inode to inode
    reiserfs: rename p_s_bh to bh
    reiserfs: rename p_s_sb to sb
    reiserfs: strip trailing whitespace
    reiserfs: cleanup path functions
    reiserfs: factor out buffer_info initialization
    reiserfs: add atomic addition of selinux attributes during inode creation
    reiserfs: use generic readdir for operations across all xattrs
    reiserfs: journaled xattrs
    reiserfs: use generic xattr handlers
    reiserfs: remove i_has_xattr_dir
    reiserfs: make per-inode xattr locking more fine grained
    reiserfs: eliminate per-super xattr lock
    reiserfs: simplify xattr internal file lookups/opens
    reiserfs: Clean up xattrs when REISERFS_FS_XATTR is unset
    reiserfs: remove IS_PRIVATE helpers
    reiserfs: remove link detection code
    ...

    Fixed up conflicts manually due to:
    - quota name cleanups vs variable naming changes:
    fs/reiserfs/inode.c
    fs/reiserfs/namei.c
    fs/reiserfs/stree.c
    fs/reiserfs/xattr.c
    - exported include header cleanups
    include/linux/reiserfs_fs.h

    Linus Torvalds
     
  • This patch renames n_, c_, etc variables to something more sane. This
    is the sixth in a series of patches to rip out some of the awful
    variable naming in reiserfs.

    Signed-off-by: Jeff Mahoney
    Signed-off-by: Linus Torvalds

    Jeff Mahoney
     
  • This patch is a simple s/p_._//g to the reiserfs code. This is the
    fifth in a series of patches to rip out some of the awful variable
    naming in reiserfs.

    Signed-off-by: Jeff Mahoney
    Signed-off-by: Linus Torvalds

    Jeff Mahoney
     
  • This patch is a simple s/p_s_tb/tb/g to the reiserfs code. This is the
    fourth in a series of patches to rip out some of the awful variable
    naming in reiserfs.

    Signed-off-by: Jeff Mahoney
    Signed-off-by: Linus Torvalds

    Jeff Mahoney
     
  • This patch is a simple s/p_s_inode/inode/g to the reiserfs code. This
    is the third in a series of patches to rip out some of the awful
    variable naming in reiserfs.

    Signed-off-by: Jeff Mahoney
    Signed-off-by: Linus Torvalds

    Jeff Mahoney
     
  • This patch is a simple s/p_s_bh/bh/g to the reiserfs code. This is the
    second in a series of patches to rip out some of the awful variable
    naming in reiserfs.

    Signed-off-by: Jeff Mahoney
    Signed-off-by: Linus Torvalds

    Jeff Mahoney
     
  • This patch is a simple s/p_s_sb/sb/g to the reiserfs code. This is the
    first in a series of patches to rip out some of the awful variable
    naming in reiserfs.

    Signed-off-by: Jeff Mahoney
    Signed-off-by: Linus Torvalds

    Jeff Mahoney
     
  • This patch strips trailing whitespace from the reiserfs code.

    Signed-off-by: Jeff Mahoney
    Signed-off-by: Linus Torvalds

    Jeff Mahoney
     
  • This patch cleans up some redundancies in the reiserfs tree path code.

    decrement_bcount() is essentially the same function as brelse(), so we use
    brelse() instead.

    decrement_counters_in_path() is exactly the same function as pathrelse(), so
    we kill it and use pathrelse() instead.

    There's also some cleanup that makes the code a bit more readable.

    Signed-off-by: Jeff Mahoney
    Signed-off-by: Linus Torvalds

    Jeff Mahoney
     
  • This patch makes many paths that currently just issue warnings actually
    handle the error.

    Signed-off-by: Jeff Mahoney
    Signed-off-by: Linus Torvalds

    Jeff Mahoney
     
  • ReiserFS panics can be somewhat inconsistent.
    In some cases:
    * a unique identifier may be associated with it
    * the function name may be included
    * the device may be printed separately

    This patch aims to make warnings more consistent. reiserfs_warning() prints
    the device name, so printing it a second time is not required. The function
    name for a warning is always helpful in debugging, so it is now automatically
    inserted into the output. Hans has stated that every warning should have
    a unique identifier. Some cases lack them, others really shouldn't have them.
    reiserfs_warning() now expects an id associated with each message. In the
    rare case where one isn't needed, "" will suffice.

    Signed-off-by: Jeff Mahoney
    Signed-off-by: Linus Torvalds

    Jeff Mahoney
     
  • ReiserFS warnings can be somewhat inconsistent.
    In some cases:
    * a unique identifier may be associated with it
    * the function name may be included
    * the device may be printed separately

    This patch aims to make warnings more consistent. reiserfs_warning() prints
    the device name, so printing it a second time is not required. The function
    name for a warning is always helpful in debugging, so it is now automatically
    inserted into the output. Hans has stated that every warning should have
    a unique identifier. Some cases lack them, others really shouldn't have them.
    reiserfs_warning() now expects an id associated with each message. In the
    rare case where one isn't needed, "" will suffice.
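
    After the rework, a call site looks roughly like this (the id and message
    are made up for illustration; the device name and function name are added
    by reiserfs_warning() itself):

        reiserfs_warning(sb, "reiserfs-12345",
                         "unable to read bitmap block %u", bmap_nr);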

    Signed-off-by: Jeff Mahoney
    Signed-off-by: Linus Torvalds

    Jeff Mahoney
     

26 Mar, 2009

1 commit


28 Apr, 2008

1 commit

  • replace all:

        little_endian_variable = cpu_to_leX(leX_to_cpu(little_endian_variable) +
                                            expression_in_cpu_byteorder);

    with:

        leX_add_cpu(&little_endian_variable, expression_in_cpu_byteorder);

    generated with semantic patch
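
    For example, with a little-endian on-disk counter (the field name is
    illustrative only):

        /* Before */
        sb->s_free_blocks = cpu_to_le32(le32_to_cpu(sb->s_free_blocks) - 1);

        /* After */
        le32_add_cpu(&sb->s_free_blocks, -1);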

    Signed-off-by: Marcin Slusarz
    Cc: Jeff Mahoney
    Cc: Chris Mason
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Marcin Slusarz