10 Oct, 2020

1 commit

  • We can't check ref->data->confirm_switch directly in __percpu_ref_exit(),
    since ref->data may not have been allocated if the refcount was never
    initialized.

    Fixes: 2b0d3d3e4fcf ("percpu_ref: reduce memory footprint of percpu_ref in fast path")
    Reported-by: syzbot+fd15ff734dace9e16437@syzkaller.appspotmail.com
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     

06 Oct, 2020

1 commit

  • 'struct percpu_ref' is often embedded in a user structure, and the
    instance is usually referenced in the fast path; however, only
    'percpu_count_ptr' is actually needed there.

    So move the other fields into a new structure, 'percpu_ref_data', and
    allocate it dynamically via kzalloc(). The memory footprint of
    'percpu_ref' in the fast path is then reduced a lot, making it
    suitable to put into a hot cacheline of the user structure.
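
    A minimal sketch of the resulting split (field names as in mainline
    after this change; layout shown for illustration):

      /* fast path: only the percpu pointer lives in the user structure */
      struct percpu_ref {
              /* percpu counter pointer, with mode flags in the low bits */
              unsigned long           percpu_count_ptr;
              /* cold fields, allocated separately via kzalloc() */
              struct percpu_ref_data  *data;
      };

      /* slow-path state, no longer inflating the embedding structure */
      struct percpu_ref_data {
              atomic_long_t           count;
              percpu_ref_func_t       *release;
              percpu_ref_func_t       *confirm_switch;
              bool                    force_atomic:1;
              bool                    allow_reinit:1;
              struct rcu_head         rcu;
              struct percpu_ref       *ref;   /* back-pointer */
      };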

    Signed-off-by: Ming Lei
    Tested-by: Veronika Kabatova
    Reviewed-by: Christoph Hellwig
    Acked-by: Tejun Heo
    Cc: Sagi Grimberg
    Cc: Tejun Heo
    Cc: Christoph Hellwig
    Cc: Jens Axboe
    Cc: Bart Van Assche
    Signed-off-by: Jens Axboe

    Ming Lei
     

05 Jun, 2020

1 commit

  • Remove the trailing newline from the used-once pr_fmt and add it to the
    single use of pr_<level> in this code to use a more common logging style.

    Miscellanea:

    o Use %lu in the pr_debug format and remove the unnecessary cast

    Signed-off-by: Joe Perches
    Signed-off-by: Andrew Morton
    Cc: Christophe JAILLET
    Link: http://lkml.kernel.org/r/47372467902a047c03b0fd29aab56e0c38d3f848.camel@perches.com
    Signed-off-by: Linus Torvalds

    Joe Perches
     

06 Mar, 2020

1 commit

  • The comment for percpu_ref_init() implies that using
    PERCPU_REF_ALLOW_REINIT will cause the refcount to start at 0. But
    this is not true. PERCPU_REF_ALLOW_REINIT starts the count at 1 as
    if the flags were zero. Add this fact to the kernel doc comment.
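
    A small usage sketch of the documented behavior (the release callback
    name is illustrative):

      static void my_release(struct percpu_ref *ref)
      {
              /* invoked once the last reference is dropped */
      }

      /* starts live with a count of 1, exactly as with flags == 0;
       * ALLOW_REINIT only keeps state needed for a later reinit */
      int ret = percpu_ref_init(&my_ref, my_release,
                                PERCPU_REF_ALLOW_REINIT, GFP_KERNEL);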

    Signed-off-by: Ira Weiny
    [Dennis: reworded]
    Signed-off-by: Dennis Zhou

    Ira Weiny
     

15 Jul, 2019

1 commit

  • Pull percpu updates from Dennis Zhou:
    "This includes changes to let percpu_ref release the backing percpu
    memory earlier after it has been switched to atomic in cases where the
    percpu ref is not revived.

    This will help recycle percpu memory earlier in cases where the
    refcounts are pinned for prolonged periods of time"

    * 'for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu:
    percpu_ref: release percpu memory early without PERCPU_REF_ALLOW_REINIT
    md: initialize percpu refcounters using PERCPU_REF_ALLOW_REINIT
    io_uring: initialize percpu refcounters using PERCPU_REF_ALLOW_REINIT
    percpu_ref: introduce PERCPU_REF_ALLOW_REINIT flag

    Linus Torvalds
     

21 May, 2019

1 commit

  • Add SPDX license identifiers to all files which:

    - Have no license information of any form

    - Have EXPORT_.*_SYMBOL_GPL inside which was used in the
    initial scan/conversion to ignore the file

    These files fall under the project license, GPL v2 only. The resulting SPDX
    license identifier is:

    GPL-2.0-only

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

09 Apr, 2019

1 commit

  • %pF and %pf are functionally equivalent to %pS and %ps conversion
    specifiers. The former are deprecated, therefore switch the current users
    to use the preferred variant.

    The changes have been produced by the following command:

    git grep -l '%p[fF]' | grep -v '^\(tools\|Documentation\)/' | \
    while read i; do perl -i -pe 's/%pf/%ps/g; s/%pF/%pS/g;' $i; done

    And verifying the result.
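
    An illustrative call site before and after the conversion:

      pr_debug("calling %pf\n", fn);   /* before: deprecated */
      pr_debug("calling %ps\n", fn);   /* after: preferred   */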

    Link: http://lkml.kernel.org/r/20190325193229.23390-1-sakari.ailus@linux.intel.com
    Cc: Andy Shevchenko
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: sparclinux@vger.kernel.org
    Cc: linux-um@lists.infradead.org
    Cc: xen-devel@lists.xenproject.org
    Cc: linux-acpi@vger.kernel.org
    Cc: linux-pm@vger.kernel.org
    Cc: drbd-dev@lists.linbit.com
    Cc: linux-block@vger.kernel.org
    Cc: linux-mmc@vger.kernel.org
    Cc: linux-nvdimm@lists.01.org
    Cc: linux-pci@vger.kernel.org
    Cc: linux-scsi@vger.kernel.org
    Cc: linux-btrfs@vger.kernel.org
    Cc: linux-f2fs-devel@lists.sourceforge.net
    Cc: linux-mm@kvack.org
    Cc: ceph-devel@vger.kernel.org
    Cc: netdev@vger.kernel.org
    Signed-off-by: Sakari Ailus
    Acked-by: David Sterba (for btrfs)
    Acked-by: Mike Rapoport (for mm/memblock.c)
    Acked-by: Bjorn Helgaas (for drivers/pci)
    Acked-by: Rafael J. Wysocki
    Signed-off-by: Petr Mladek

    Sakari Ailus
     

28 Nov, 2018

1 commit

  • Now that call_rcu()'s callback is not invoked until after all
    preempt-disable regions of code have completed (in addition to explicitly
    marked RCU read-side critical sections), call_rcu() can be used in place
    of call_rcu_sched(). This commit therefore makes that change.
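
    In percpu-refcount.c the change amounts to (sketch):

      /* before: explicit sched-RCU grace period */
      call_rcu_sched(&ref->rcu, percpu_ref_switch_to_atomic_rcu);

      /* after: call_rcu() now also waits for preempt-disabled regions */
      call_rcu(&ref->rcu, percpu_ref_switch_to_atomic_rcu);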

    Signed-off-by: Paul E. McKenney
    Cc: Ming Lei
    Cc: Bart Van Assche
    Cc: Jens Axboe
    Acked-by: Tejun Heo

    Paul E. McKenney
     

27 Sep, 2018

1 commit

  • This function will be used in a later patch to switch the struct
    request_queue q_usage_counter from killed back to live. In contrast
    to percpu_ref_reinit(), this new function does not require that the
    refcount is zero.
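
    The log omits the subject line; in mainline this landed as
    percpu_ref_resurrect(). A usage sketch, assuming that name:

      /* queue was frozen earlier */
      percpu_ref_kill(&q->q_usage_counter);

      /* unlike percpu_ref_reinit(), this may be called while the
       * refcount is still non-zero: outstanding refs stay valid */
      percpu_ref_resurrect(&q->q_usage_counter);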

    Signed-off-by: Bart Van Assche
    Acked-by: Tejun Heo
    Reviewed-by: Ming Lei
    Cc: Christoph Hellwig
    Cc: Jianchao Wang
    Cc: Hannes Reinecke
    Cc: Johannes Thumshirn
    Signed-off-by: Jens Axboe

    Bart Van Assche
     

20 Mar, 2018

1 commit

  • percpu_ref internally uses sched-RCU to implement the percpu -> atomic
    mode switching and the documentation suggested that this could be
    depended upon. This doesn't seem like a good idea.

    * percpu_ref uses sched-RCU, which has different grace periods from
    regular RCU. Users may combine percpu_ref with regular RCU usage and
    incorrectly believe that regular RCU grace periods are performed by
    percpu_ref. This can lead to, for example, use-after-free due to
    premature freeing.

    * percpu_ref has a grace period when switching from percpu to atomic
    mode. It doesn't have one between the last put and release. This
    distinction is subtle and can lead to surprising bugs.

    * percpu_ref allows starting in and switching to atomic mode manually
    for debugging and other purposes. This means that there may not be
    any grace periods from kill to release.

    This patch makes it clear that the grace periods are percpu_ref's
    internal implementation detail and can't be depended upon by the
    users.
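
    In practice, a user who needs an RCU grace period between the final
    put and freeing must arrange for it explicitly in the release
    callback, e.g. (sketch; the object and helper names are hypothetical):

      static void my_release(struct percpu_ref *ref)
      {
              struct my_obj *obj = container_of(ref, struct my_obj, ref);

              /* don't rely on percpu_ref's internal grace periods */
              call_rcu(&obj->rcu, my_free_rcu);
      }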

    Signed-off-by: Tejun Heo
    Cc: Kent Overstreet
    Cc: Linus Torvalds
    Signed-off-by: Tejun Heo

    Tejun Heo
     

23 Mar, 2017

1 commit

  • percpu_ref_switch_to_atomic_sync() schedules the switch to atomic mode, then
    waits for it to complete.

    Also export percpu_ref_switch_to_* so they can be used from modules.

    This will be used in md/raid to count the number of pending write
    requests to an array.
    We occasionally need to check if the count is zero, but most often
    we don't care.
    We always want updates to the counter to be fast, as in some cases
    we count every 4K page.
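
    The intended usage pattern then looks roughly like this (sketch; the
    md field name is illustrative):

      /* slow path: get an exact answer */
      percpu_ref_switch_to_atomic_sync(&mddev->writes_pending);
      if (percpu_ref_is_zero(&mddev->writes_pending))
              ;       /* no writes in flight, safe to proceed */

      /* restore the fast path once exact counts no longer matter */
      percpu_ref_switch_to_percpu(&mddev->writes_pending);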

    Signed-off-by: NeilBrown
    Acked-by: Tejun Heo
    Signed-off-by: Shaohua Li

    NeilBrown
     

12 Aug, 2016

1 commit

  • This patch targets two things which are related to ->confirm_switch:

    1. Initialize the ->confirm_switch pointer to NULL in percpu_ref_init(),
    or the kernel frightfully complains with WARN_ON_ONCE(ref->confirm_switch)
    at __percpu_ref_switch_to_atomic() if the memory chunk was not properly
    zeroed.

    2. Warn if an RCU callback is still in progress on percpu_ref_exit().
    The race still exists, because percpu_ref_call_confirm_rcu()
    drops ->confirm_switch to NULL early, so this is only a warning
    and the caller remains responsible for ensuring that the ref is no
    longer in active use. Hopefully this can help catch incorrect
    usage of percpu-refcount.

    Signed-off-by: Roman Pen
    Cc: Tejun Heo
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Tejun Heo

    Roman Pen
     

11 Aug, 2016

5 commits

  • percpu_ref initially didn't have explicit mode switching operations.
    It started out in percpu mode and switched to atomic mode on kill and
    then released. Ensuring that the kill operation is initiated only
    after init completes was naturally the caller's responsibility.

    percpu_ref_reinit() was introduced later but it didn't shift the
    synchronization responsibility. Reinit can't be performed until kill
    is confirmed, so there was nothing to worry about
    synchronization-wise. Also, as both reinit and kill manipulate the
    base reference, invocations of the same function couldn't be allowed
    to race each other.

    The latest additions of percpu_ref_switch_to_atomic/percpu() changed
    the situation. These two functions can be called any time as long as
    the percpu_ref is between init and exit, and thus there are valid
    usage scenarios where these new functions race with each other or
    against reinit/kill. Mostly from inertia, f47ad4578461 ("percpu_ref:
    decouple switching to percpu mode and reinit") still left
    synchronization among percpu mode switching operations to its users.

    That the new switch functions can be freely mixed with kill/reinit but
    the operations themselves should be synchronized is too subtle a
    requirement and led to a very subtle race condition in the blk-mq
    freezing path.

    This patch fixes the situation by introducing percpu_ref_switch_lock
    to protect mode switching operations. This ensures that percpu-ref
    users don't have to worry about mode changing operations racing
    against each other, e.g. switch_to_percpu against kill, as long as the
    sequence of operations is valid.
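
    The resulting locking pattern is roughly (sketch):

      static DEFINE_SPINLOCK(percpu_ref_switch_lock);

      void percpu_ref_switch_to_percpu(struct percpu_ref *ref)
      {
              unsigned long flags;

              spin_lock_irqsave(&percpu_ref_switch_lock, flags);
              /* every mode-changing operation is serialized here,
               * so switch/kill/reinit can no longer race */
              ref->force_atomic = false;
              __percpu_ref_switch_mode(ref, NULL);
              spin_unlock_irqrestore(&percpu_ref_switch_lock, flags);
      }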

    Signed-off-by: Tejun Heo
    Reported-by: Akinobu Mita
    Link: http://lkml.kernel.org/g/1443287365-4244-7-git-send-email-akinobu.mita@gmail.com
    Fixes: f47ad4578461 ("percpu_ref: decouple switching to percpu mode and reinit")

    Tejun Heo
     
  • Restructure atomic/percpu mode switching.

    * The users of __percpu_ref_switch_to_atomic/percpu() now call a new
    function __percpu_ref_switch_mode() which calls either of the
    original switching functions depending on the current state of
    ref->force_atomic and the __PERCPU_REF_DEAD flag. The callers no
    longer check whether switching is necessary but always invoke
    __percpu_ref_switch_mode().

    * !ref->confirm_switch waiting is collected into
    __percpu_ref_switch_mode().

    This patch doesn't cause any behavior differences.
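
    A condensed sketch of the new helper:

      static void __percpu_ref_switch_mode(struct percpu_ref *ref,
                                           percpu_ref_func_t *confirm_switch)
      {
              /* collected here: wait out a previous atomic switch */
              wait_event(percpu_ref_switch_waitq, !ref->confirm_switch);

              if (ref->force_atomic ||
                  (ref->percpu_count_ptr & __PERCPU_REF_DEAD))
                      __percpu_ref_switch_to_atomic(ref, confirm_switch);
              else
                      __percpu_ref_switch_to_percpu(ref);
      }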

    Signed-off-by: Tejun Heo

    Tejun Heo
     
  • When an atomic or percpu switching starts before the previous atomic
    switching finishes, the behavior is as follows:

    * If the new atomic switching has confirmation callback, it waits
    for the previous atomic switching to complete.

    * If the new percpu switching is the first percpu switching following
    the previous atomic switching, it waits for the previous atomic
    switching to complete.

    No percpu_ref user depends on these subtleties. The only meaningful
    part is that, if the caller ensures that atomic switching isn't in
    progress, mode switching operations can be issued from any context.

    This patch pulls the wait logic to the top of both switching functions
    so that they always wait for the previous atomic switching to
    complete. This makes the behavior simpler and consistent for both
    directions and will help allowing concurrent invocations of mode
    switching functions.
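
    That is, both switching functions now begin with something like:

      /* wait for a previous atomic switch's confirmation to finish */
      wait_event(percpu_ref_switch_waitq, !ref->confirm_switch);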

    Signed-off-by: Tejun Heo

    Tejun Heo
     
  • Reorganize __percpu_ref_switch_to_atomic() so that it looks
    structurally similar to __percpu_ref_switch_to_percpu() and relocate
    percpu_ref_switch_to_atomic so that the two internal functions are
    co-located.

    This patch doesn't introduce any functional differences.

    Signed-off-by: Tejun Heo

    Tejun Heo
     
  • At the beginning, percpu_ref guaranteed an RCU grace period between a
    call to percpu_ref_kill_and_confirm() and the invocation of the
    confirmation callback. This guarantee exposed internal implementation
    details and got rescinded while switching over to sched RCU; however,
    __percpu_ref_switch_to_atomic() still inserts a full sched RCU grace
    period even when it can simply wait for the previous attempt.

    Remove the unnecessary grace period and perform the confirmation
    synchronously for staggered atomic switching attempts. Update
    comments accordingly.

    Signed-off-by: Tejun Heo

    Tejun Heo
     

25 Sep, 2014

11 commits

  • Currently, a percpu_ref which is initialized with
    PERCPU_REF_INIT_ATOMIC or switched to atomic mode via
    switch_to_atomic() automatically reverts to percpu mode on the first
    percpu_ref_reinit(). This makes the atomic mode difficult to use for
    cases where a percpu_ref is used as a persistent on/off switch which
    may be cycled multiple times.

    This patch makes such atomic state sticky so that it survives through
    kill/reinit cycles. After this patch, atomic state is cleared only by
    an explicit percpu_ref_switch_to_percpu() call.

    Signed-off-by: Tejun Heo
    Reviewed-by: Kent Overstreet
    Cc: Jens Axboe
    Cc: Christoph Hellwig
    Cc: Johannes Weiner

    Tejun Heo
     
  • With the recent addition of percpu_ref_reinit(), percpu_ref now can be
    used as a persistent switch which can be turned on and off repeatedly
    where turning off maps to killing the ref and waiting for it to drain;
    however, there currently isn't a way to initialize a percpu_ref in its
    off (killed and drained) state, which can be inconvenient for certain
    persistent switch use cases.

    Similarly, percpu_ref_switch_to_atomic/percpu() allow dynamic
    selection of operation mode; however, currently a newly initialized
    percpu_ref is always in percpu mode making it impossible to avoid the
    latency overhead of switching to atomic mode.

    This patch adds @flags to percpu_ref_init() and implements the
    following flags.

    * PERCPU_REF_INIT_ATOMIC : start ref in atomic mode
    * PERCPU_REF_INIT_DEAD : start ref killed and drained

    These flags should be able to serve the above two use cases.
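
    Usage sketch of the two flags (release callback illustrative):

      /* start in atomic mode; can switch to percpu mode later */
      percpu_ref_init(&ref, my_release, PERCPU_REF_INIT_ATOMIC,
                      GFP_KERNEL);

      /* start killed and drained; turn on via percpu_ref_reinit() */
      percpu_ref_init(&sw, my_release, PERCPU_REF_INIT_DEAD,
                      GFP_KERNEL);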

    v2: target_core_tpg.c conversion was missing. Fixed.

    Signed-off-by: Tejun Heo
    Reviewed-by: Kent Overstreet
    Cc: Jens Axboe
    Cc: Christoph Hellwig
    Cc: Johannes Weiner

    Tejun Heo
     
  • percpu_ref has treated the dropping of the base reference and
    switching to atomic mode as an integral operation; however, there's
    nothing inherent tying the two together.

    The use cases for percpu_ref have been expanding continuously. While
    the current init/kill/reinit/exit model can cover a lot, the coupling
    of kill/reinit with atomic/percpu mode switching is turning out to be
    too restrictive for use cases where many percpu_refs are created and
    destroyed back-to-back with only some of them reaching extended
    operation. The coupling also makes implementing always-atomic debug
    mode difficult.

    This patch separates out percpu mode switching into
    percpu_ref_switch_to_percpu() and reimplements percpu_ref_reinit() on
    top of it.

    * DEAD still requires ATOMIC. A dead ref can't be switched to percpu
    mode w/o going through reinit.
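
    After the split, reinit roughly becomes (sketch):

      void percpu_ref_reinit(struct percpu_ref *ref)
      {
              WARN_ON_ONCE(!percpu_ref_is_zero(ref));

              /* clear DEAD, retake the base ref, leave atomic mode */
              ref->percpu_count_ptr &= ~__PERCPU_REF_DEAD;
              percpu_ref_get(ref);
              percpu_ref_switch_to_percpu(ref);
      }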

    v2: __percpu_ref_switch_to_percpu() was missing static. Fixed.
    Reported by Fengguang aka kbuild test robot.

    Signed-off-by: Tejun Heo
    Reviewed-by: Kent Overstreet
    Cc: Jens Axboe
    Cc: Christoph Hellwig
    Cc: Johannes Weiner
    Cc: kbuild test robot

    Tejun Heo
     
  • percpu_ref has treated the dropping of the base reference and
    switching to atomic mode as an integral operation; however, there's
    nothing inherent tying the two together.

    The use cases for percpu_ref have been expanding continuously. While
    the current init/kill/reinit/exit model can cover a lot, the coupling
    of kill/reinit with atomic/percpu mode switching is turning out to be
    too restrictive for use cases where many percpu_refs are created and
    destroyed back-to-back with only some of them reaching extended
    operation. The coupling also makes implementing always-atomic debug
    mode difficult.

    This patch separates out atomic mode switching into
    percpu_ref_switch_to_atomic() and reimplements
    percpu_ref_kill_and_confirm() on top of it.

    * The handling of __PERCPU_REF_ATOMIC and __PERCPU_REF_DEAD is now
    differentiated. Among get/put operations, percpu_ref_tryget_live()
    is the only one which cares about DEAD.

    * percpu_ref_switch_to_atomic() can be called multiple times on the
    same ref. This means that multiple @confirm_switch callbacks may get
    queued up, which we can't do reliably without an extra memory area.
    This is handled by making the later invocation synchronously wait for
    the completion of the previous one. This isn't particularly desirable,
    but such synchronous waits shouldn't happen in most cases.
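
    Killing then decomposes into marking DEAD, switching to atomic mode,
    and dropping the base reference (sketch):

      void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
                                       percpu_ref_func_t *confirm_kill)
      {
              WARN_ONCE(ref->percpu_count_ptr & __PERCPU_REF_DEAD,
                        "percpu_ref_kill() called more than once!");

              ref->percpu_count_ptr |= __PERCPU_REF_DEAD;
              percpu_ref_switch_to_atomic(ref, confirm_kill);
              percpu_ref_put(ref);
      }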

    Signed-off-by: Tejun Heo
    Reviewed-by: Kent Overstreet
    Cc: Jens Axboe
    Cc: Christoph Hellwig
    Cc: Johannes Weiner

    Tejun Heo
     
  • percpu_ref will be restructured so that percpu/atomic mode switching
    and reference killing are decoupled. In preparation, add
    PCPU_REF_DEAD and PCPU_REF_ATOMIC_DEAD, which is the OR of ATOMIC and DEAD.
    For now, ATOMIC and DEAD are changed together and all PCPU_REF_ATOMIC
    uses are converted to PCPU_REF_ATOMIC_DEAD without causing any
    behavior changes.

    percpu_ref_init() now specifies an explicit alignment when allocating
    the percpu counters so that the pointer has enough unused low bits to
    accommodate the flags. Note that one flag was fine, as the minimum
    alignment for percpu memory is 2 bytes, but two flags are already too
    many for the natural alignment of unsigned longs on archs like cris
    and m68k.

    v2: The original patch had a BUILD_BUG_ON() which triggers if unsigned
    long's alignment isn't enough to accommodate the flags, and it
    triggered on cris and m68k. percpu_ref_init() updated to specify
    the required alignment explicitly. Reported by Fengguang.
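
    The trick relies on unused low pointer bits: with two flags the percpu
    pointer needs at least 4-byte alignment (sketch, names as in mainline):

      #define __PERCPU_REF_ATOMIC      (1LU << 0)
      #define __PERCPU_REF_DEAD        (1LU << 1)
      #define __PERCPU_REF_ATOMIC_DEAD (__PERCPU_REF_ATOMIC | __PERCPU_REF_DEAD)
      #define __PERCPU_REF_FLAG_BITS   2

      /* allocate with explicit alignment so both flag bits stay free */
      ref->percpu_count_ptr = (unsigned long)
              __alloc_percpu_gfp(sizeof(unsigned long),
                                 max_t(size_t, 1 << __PERCPU_REF_FLAG_BITS,
                                       __alignof__(unsigned long)),
                                 gfp);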

    Signed-off-by: Tejun Heo
    Reviewed-by: Kent Overstreet
    Cc: kbuild test robot

    Tejun Heo
     
  • percpu_ref will be restructured so that percpu/atomic mode switching
    and reference killing are decoupled. In preparation, do the following
    renames.

    * percpu_ref->confirm_kill -> percpu_ref->confirm_switch
    * __PERCPU_REF_DEAD -> __PERCPU_REF_ATOMIC
    * __percpu_ref_alive() -> __ref_is_percpu()

    This patch is pure rename and doesn't introduce any functional
    changes.

    Signed-off-by: Tejun Heo
    Reviewed-by: Kent Overstreet

    Tejun Heo
     
  • percpu_ref uses pcpu_ prefix for internal stuff and percpu_ for
    externally visible ones. This is the same convention used in the
    percpu allocator implementation. It works fine there, but percpu_ref
    doesn't have much internal-only stuff, and the scattered usages of the
    pcpu_ prefix are more confusing than helpful.

    This patch replaces all pcpu_ prefixes with percpu_. This is pure
    rename and there's no functional change. Note that PCPU_REF_DEAD is
    renamed to __PERCPU_REF_DEAD to signify that the flag is internal.

    Signed-off-by: Tejun Heo
    Reviewed-by: Kent Overstreet

    Tejun Heo
     
  • * Some comments became stale. Updated.
    * percpu_ref_tryget() unnecessarily initializes @ret. Removed.
    * A blank line removed from percpu_ref_kill_rcu().
    * Explicit function name in a WARN format string replaced with __func__.
    * WARN_ON() in percpu_ref_reinit() converted to WARN_ON_ONCE().

    Signed-off-by: Tejun Heo
    Reviewed-by: Kent Overstreet

    Tejun Heo
     
  • percpu_ref is going to go through restructuring. Move
    percpu_ref_reinit() after percpu_ref_kill_and_confirm(). This will
    make later changes easier to follow and result in cleaner
    organization.

    Signed-off-by: Tejun Heo
    Reviewed-by: Kent Overstreet

    Tejun Heo
     
  • This reverts commit 0a30288da1aec914e158c2d7a3482a85f632750f, which
    was a temporary fix for SCSI blk-mq stall issue. The following
    patches will fix the issue properly by introducing atomic mode to
    percpu_ref.

    Signed-off-by: Tejun Heo
    Cc: Kent Overstreet
    Cc: Jens Axboe
    Cc: Christoph Hellwig

    Tejun Heo
     
  • …linux-block into for-3.18

    This is to receive 0a30288da1ae ("blk-mq, percpu_ref: implement a
    kludge for SCSI blk-mq stall during probe") which implements
    __percpu_ref_kill_expedited() to work around SCSI blk-mq stall. The
    commit will be reverted and patches to implement the proper fix will
    be added.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Cc: Kent Overstreet <kmo@daterainc.com>
    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: Christoph Hellwig <hch@lst.de>

    Tejun Heo
     

24 Sep, 2014

1 commit

  • blk-mq uses percpu_ref for its usage counter, which tracks the number
    of in-flight commands and is used to synchronously drain the queue on
    freeze. percpu_ref shutdown takes measurable wallclock time as it
    involves a sched RCU grace period. This means that draining a blk-mq
    queue takes measurable wallclock time. One would think that this
    shouldn't matter, as queue shutdown should be a rare event which takes
    place asynchronously w.r.t. userland.

    Unfortunately, SCSI probing involves synchronously setting up and then
    tearing down a lot of request_queues back-to-back for non-existent
    LUNs. This means that SCSI probing may take more than ten seconds
    when scsi-mq is used.

    This will be properly fixed by implementing a mechanism to keep
    q->mq_usage_counter in atomic mode till genhd registration; however,
    that involves rather big updates to percpu_ref which is difficult to
    apply late in the devel cycle (v3.17-rc6 at the moment). As a
    stop-gap measure till the proper fix can be implemented in the next
    cycle, this patch introduces __percpu_ref_kill_expedited() and makes
    blk_mq_freeze_queue() use it. This is heavy-handed but should work
    for testing the experimental SCSI blk-mq implementation.

    Signed-off-by: Tejun Heo
    Reported-by: Christoph Hellwig
    Link: http://lkml.kernel.org/g/20140919113815.GA10791@lst.de
    Fixes: add703fda981 ("blk-mq: use percpu_ref for mq usage count")
    Cc: Kent Overstreet
    Cc: Jens Axboe
    Tested-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Tejun Heo
     

20 Sep, 2014

2 commits

  • percpu_ref is currently based on ints and the number of refs it can
    cover is (1 << 31). This makes it impossible to use a percpu_ref to
    count memory objects or pages on 64bit machines as it may overflow.
    This forces those users to somehow aggregate the references before
    contributing to the percpu_ref, which is often cumbersome, and it can
    be challenging to reach the same level of performance as using the
    percpu_ref directly.

    While using ints for the percpu counters makes them pack tighter on
    64bit machines, the possible gain from using ints instead of longs is
    extremely small compared to the overall gain from per-cpu operation.
    This patch makes percpu_ref based on longs so that it can be used to
    directly count memory objects or pages.

    Signed-off-by: Tejun Heo
    Cc: Kent Overstreet
    Cc: Johannes Weiner

    Tejun Heo
     
  • percpu_ref's WARN messages can be a lot more helpful by indicating
    who the culprit is. Make them report the release function that the
    offending percpu-refcount is associated with. This should make it a
    lot easier to track down the reported invalid refcnting operations.
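
    For example (sketch; this used %pf at the time, later converted
    to %ps, see the 09 Apr, 2019 entry above):

      WARN_ONCE(ref->percpu_count_ptr & __PERCPU_REF_DEAD,
                "percpu_ref_kill() called more than once on %ps!",
                ref->release);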

    Signed-off-by: Tejun Heo
    Cc: Kent Overstreet

    Tejun Heo
     

08 Sep, 2014

1 commit

  • The percpu allocator now supports an allocation mask. Add @gfp to
    percpu_ref_init() so that !GFP_KERNEL allocation masks can be used
    with percpu_refs too.

    This patch doesn't make any functional difference.
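
    At this point the signature becomes (the @flags argument arrived
    later, see the 25 Sep, 2014 entries above):

      int percpu_ref_init(struct percpu_ref *ref,
                          percpu_ref_func_t *release, gfp_t gfp);

      /* e.g., callers in atomic context can now do: */
      ret = percpu_ref_init(&ref, my_release, GFP_NOWAIT);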

    v2: blk-mq conversion was missing. Updated.

    Signed-off-by: Tejun Heo
    Cc: Kent Overstreet
    Cc: Benjamin LaHaise
    Cc: Li Zefan
    Cc: Nicholas A. Bellinger
    Cc: Jens Axboe

    Tejun Heo
     

28 Jun, 2014

5 commits

  • Now that explicit invocation of percpu_ref_exit() is necessary to free
    the percpu counter, we can implement percpu_ref_reinit(), which
    reinitializes a released percpu_ref. This can be used to implement a
    scalable gating switch which can be drained and then re-opened without
    worrying about memory allocation failures.

    percpu_ref_is_zero() is added to be used in a sanity check in
    percpu_ref_exit(). As this function will be useful for other purposes
    too, make it a public interface.

    v2: Use smp_read_barrier_depends() instead of smp_load_acquire(). We
    only need a data dependency barrier, and smp_load_acquire() is stronger
    and heavier on some archs. Spotted by Lai Jiangshan.
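
    Together with percpu_ref_exit(), this enables a gating-switch pattern
    like the following (sketch; names illustrative, and init's @flags/@gfp
    arguments were added later in history):

      static void gate_release(struct percpu_ref *ref)
      {
              complete(&gate_drained);        /* last reference gone */
      }

      percpu_ref_init(&gate, gate_release);   /* open */
      percpu_ref_kill(&gate);                 /* close: begin draining */
      wait_for_completion(&gate_drained);
      WARN_ON(!percpu_ref_is_zero(&gate));    /* newly exported check */
      percpu_ref_reinit(&gate);               /* re-open, no reallocation */
      percpu_ref_exit(&gate);                 /* eventual teardown */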

    Signed-off-by: Tejun Heo
    Cc: Kent Overstreet
    Cc: Christoph Lameter
    Cc: Lai Jiangshan

    Tejun Heo
     
  • Currently, a percpu_ref undoes percpu_ref_init() automatically by
    freeing the allocated percpu area when the percpu_ref is killed.
    While seemingly convenient, this has the following niggles.

    * It's impossible to re-init a released reference counter without
    going through re-allocation.

    * In a similar vein, it's impossible to initialize a percpu_ref
    count with static percpu variables.

    * We need and have an explicit destructor anyway for failure paths -
    percpu_ref_cancel_init().

    This patch removes the automatic percpu counter freeing in
    percpu_ref_kill_rcu() and repurposes percpu_ref_cancel_init() into a
    generic destructor now named percpu_ref_exit(). percpu_ref_destroy()
    is considered but it gets confusing with percpu_ref_kill() while
    "exit" clearly indicates that it's the counterpart of
    percpu_ref_init().

    All percpu_ref_cancel_init() users are updated to invoke
    percpu_ref_exit() instead and explicit percpu_ref_exit() calls are
    added to the destruction path of all percpu_ref users.

    Signed-off-by: Tejun Heo
    Acked-by: Benjamin LaHaise
    Cc: Kent Overstreet
    Cc: Christoph Lameter
    Cc: Benjamin LaHaise
    Cc: Nicholas A. Bellinger
    Cc: Li Zefan

    Tejun Heo
     
  • percpu_ref->pcpu_count is a percpu pointer with a status flag in its
    lowest bit. As such, it always goes through arithmetic operations,
    which is very cumbersome to do on a pointer. It has to be cast to
    unsigned long first and then back.

    Let's just make the field unsigned long so that we can skip the first
    casts. While at it, rename it to pcpu_counter_ptr to clarify that
    it's a pointer value.

    Signed-off-by: Tejun Heo
    Cc: Kent Overstreet
    Cc: Christoph Lameter

    Tejun Heo
     
  • * All four percpu_ref_*() operations implemented in the header file
    perform the same operation to determine whether the percpu_ref is
    alive and extract the percpu pointer. Factor out the common logic
    into __pcpu_ref_alive(). This doesn't change the generated code.

    * There are a couple places in percpu-refcount.c which masks out
    PCPU_REF_DEAD to obtain the percpu pointer. Factor it out into
    pcpu_count_ptr().

    * The above changes make the WARN_ON_ONCE() conditional at the top of
    percpu_ref_kill_and_confirm() the only user of REF_STATUS(). Test
    PCPU_REF_DEAD directly and remove REF_STATUS().

    This patch doesn't introduce any functional change.

    Signed-off-by: Tejun Heo
    Cc: Kent Overstreet
    Cc: Christoph Lameter

    Tejun Heo
     
  • percpu-refcount currently reserves the two lowest bits of its percpu
    pointer to indicate its state; however, only one bit is used for
    PCPU_REF_DEAD.

    Simplify it by removing PCPU_STATUS_BITS/MASK and testing
    PCPU_REF_DEAD directly. This also allows the compiler to choose a
    more efficient instruction depending on the architecture.

    Signed-off-by: Tejun Heo
    Cc: Kent Overstreet
    Cc: Christoph Lameter

    Tejun Heo