10 Aug, 2016

1 commit

  • Currently the percpu-rwsem switches to (global) atomic ops while a
    writer is waiting, which could be quite a while, and this slows down
    releasing the readers.

    This patch cures this problem by ordering the reader-state vs
    reader-count (see the comments in __percpu_down_read() and
    percpu_down_write(), and the sketch below). This changes a global
    atomic op into a full memory barrier, which doesn't incur the global
    cacheline contention.

    This also enables using the percpu-rwsem with rcu_sync disabled in order
    to bias the implementation differently, reducing the writer latency by
    adding some cost to readers.

    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Oleg Nesterov
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-kernel@vger.kernel.org
    [ Fixed modular build. ]
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
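
    A minimal sketch of the reader fast path described above, in
    kernel-style C (the function name is illustrative; the read_count
    and readers_block fields follow the patch, but this is an
    approximation, not the exact kernel code):

        static bool reader_fast_path(struct percpu_rw_semaphore *sem)
        {
                this_cpu_inc(*sem->read_count); /* publish the reader-count */
                smp_mb();                       /* order count vs. state */
                if (likely(!smp_load_acquire(&sem->readers_block)))
                        return true;            /* no writer: lock acquired */
                this_cpu_dec(*sem->read_count); /* writer pending: undo, go slow */
                return false;
        }

    The writer does the mirror image: it sets readers_block, issues the
    matching barrier, and then sums the per-CPU counts, so the two sides
    cannot both miss each other.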
     

26 Apr, 2016

1 commit

  • In ext4, there is a race condition between changing an inode's
    journal mode and ext4_writepages(). While ext4_writepages() is
    running on an inode in non-journaled mode, the inode's journal mode
    could be enabled by ioctl(), and then pages dirtied after the switch
    would still be exposed to ext4_writepages() in non-journaled mode.
    To resolve this problem, we use an fs-wide per-cpu rw semaphore, per
    Jan Kara's suggestion, because we don't want to waste
    ext4_inode_info's space for this extra rare case (the locking
    pattern is sketched below).

    Signed-off-by: Daeho Jeong
    Signed-off-by: Theodore Ts'o
    Reviewed-by: Jan Kara

    Daeho Jeong
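
    A sketch of the locking pattern this describes, using the real
    percpu-rwsem API (the sbi field name is approximated as
    s_journal_flag_rwsem; treat the details as illustrative):

        /* in ext4_writepages(): cheap in the common, uncontended case */
        percpu_down_read(&sbi->s_journal_flag_rwsem);
        /* ... write out pages under the inode's current journal mode ... */
        percpu_up_read(&sbi->s_journal_flag_rwsem);

        /* in the journal-mode-changing ioctl path: */
        percpu_down_write(&sbi->s_journal_flag_rwsem); /* drains all readers */
        /* ... switch the inode's journal mode ... */
        percpu_up_write(&sbi->s_journal_flag_rwsem);

    Readers stay per-cpu cheap, and only the rare mode switch pays the
    full drain cost, matching the "extra rare case" trade-off above.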
     

07 Oct, 2015

5 commits

  • Based on Peter Zijlstra's earlier patch.

    Change percpu_down_read() to use __down_read(); this way we can do
    rwsem_acquire_read() unconditionally at the start, making the code
    more symmetric and clean (sketched below).

    Originally-From: Peter Zijlstra
    Signed-off-by: Oleg Nesterov
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Oleg Nesterov
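
    Approximately, the resulting shape (a sketch based on the patch
    description, not verbatim kernel code; update_fast_ctr() is the
    existing fast-path helper of that era's implementation):

        void percpu_down_read(struct percpu_rw_semaphore *brw)
        {
                might_sleep();
                /* unconditional: fast and slow paths look alike to lockdep */
                rwsem_acquire_read(&brw->rw_sem.dep_map, 0, 0, _RET_IP_);

                if (likely(update_fast_ctr(brw, +1)))
                        return;

                /* __down_read() skips lockdep, avoiding a second acquire */
                __down_read(&brw->rw_sem);
                atomic_inc(&brw->slow_read_ctr);
                __up_read(&brw->rw_sem);
        }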
     
  • Update the comments broken by the previous change.

    Signed-off-by: Oleg Nesterov
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Oleg Nesterov
     
  • Currently down_write()/up_write() calls synchronize_sched_expedited()
    twice, which is evil. Change this code to rely on the rcu-sync
    primitives instead. This avoids the _expedited "big hammer", and can
    be faster in the contended case, or even when a single thread does
    down_write()/up_write() in a loop (see the sketch below).

    Of course, a single down_write() will take more time, but on the
    other hand it will be much more friendly to the whole system.

    To simplify review, this patch doesn't update the comments; they are
    fixed by the next change.

    Signed-off-by: Oleg Nesterov
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Oleg Nesterov
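
    A sketch of the writer side built on the rcu-sync primitives (field
    names approximate; rcu_sync_enter()/rcu_sync_exit() are the real
    API):

        void percpu_down_write(struct percpu_rw_semaphore *brw)
        {
                /* at most one grace period; pushes readers to the slow path */
                rcu_sync_enter(&brw->rss);
                /* ... drain the per-CPU fast counters ... */
                down_write(&brw->rw_sem);
        }

        void percpu_up_write(struct percpu_rw_semaphore *brw)
        {
                up_write(&brw->rw_sem);
                /* no second expedited GP: the fast path is re-enabled
                 * lazily, which is what makes a down_write/up_write
                 * loop cheap */
                rcu_sync_exit(&brw->rss);
        }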
     
  • This is a temporary ugly hack that will be reverted later. We only
    need it to ensure that the next patch will not break the "change
    sb_writers to use percpu_rw_semaphore" patches routed via the VFS
    tree.

    The alloc_super()->destroy_super() error path assumes that it is safe
    to call percpu_free_rwsem() after kzalloc() without percpu_init_rwsem(),
    so let's not disappoint it.

    Signed-off-by: Oleg Nesterov
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Oleg Nesterov
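
    The hack amounts to an early return on a never-initialized (zeroed)
    semaphore, roughly (a sketch; fast_read_ctr is the per-CPU counter
    pointer of that era's implementation):

        void percpu_free_rwsem(struct percpu_rw_semaphore *brw)
        {
                /* kzalloc()'ed, but percpu_init_rwsem() never ran */
                if (!brw->fast_read_ctr)
                        return;
                /* ... normal teardown ... */
        }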
     
  • This commit exports percpu_down_read(), percpu_down_write(),
    __percpu_init_rwsem(), percpu_up_read(), and percpu_up_write() to allow
    locktorture to test them when built as a module.

    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Paul E. McKenney
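
    The change itself is just the usual export lines, e.g.:

        EXPORT_SYMBOL_GPL(__percpu_init_rwsem);
        EXPORT_SYMBOL_GPL(percpu_down_read);
        EXPORT_SYMBOL_GPL(percpu_up_read);
        EXPORT_SYMBOL_GPL(percpu_down_write);
        EXPORT_SYMBOL_GPL(percpu_up_write);

    allowing a modular locktorture build to link against them.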
     

15 Aug, 2015

1 commit


06 Nov, 2013

1 commit