08 Jun, 2019

1 commit

  • The lockref cmpxchg loop is unbounded as long as the spinlock is not
    taken. Depending on the hardware implementation of compare-and-swap,
    a high number of loop retries can occur.

    Add an upper bound to the loop to force the fallback to spinlocks
    after some time. A retry value of 100 should not impact any hardware
    that does not have this issue.

    With the retry limit, the performance of an open-close test case
    improved by 60-70% on ThunderX2.
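
    In outline, the bounded fast path looks like the sketch below: plain
    user-space C11 with an illustrative bit layout (bit 0 standing in for
    the spinlock), not the kernel's actual lockref code.

      #include <stdatomic.h>
      #include <stdbool.h>
      #include <stdint.h>

      /* Bit 0 models the spinlock; bits 1..63 hold the reference count. */
      #define LOCK_BIT             1ULL
      #define COUNT_ONE            2ULL
      #define CMPXCHG_LOOP_RETRIES 100  /* upper bound added by this patch */

      static bool lockref_get_fast(_Atomic uint64_t *lock_count)
      {
          uint64_t old = atomic_load_explicit(lock_count,
                                              memory_order_relaxed);
          int retry = CMPXCHG_LOOP_RETRIES;

          while (!(old & LOCK_BIT)) {          /* spinlock not taken */
              /* Try to bump the count; a failed CAS reloads 'old'. */
              if (atomic_compare_exchange_weak(lock_count, &old,
                                               old + COUNT_ONE))
                  return true;                 /* lockless increment won */
              if (--retry == 0)
                  break;                       /* bound hit: stop spinning */
          }
          return false;                        /* caller takes the spinlock */
      }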

    Suggested-by: Linus Torvalds
    Signed-off-by: Jan Glauber
    Signed-off-by: Linus Torvalds

    Jan Glauber
     

13 Apr, 2018

1 commit


02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boilerplate text.
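
    For reference, the identifier is a single comment at the very top of a
    file; the C++-style form is used in .c files and the block-comment form
    in headers, for example:

      // SPDX-License-Identifier: GPL-2.0
      /* SPDX-License-Identifier: GPL-2.0 */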

    This patch is based on work done by Thomas Gleixner, Kate Stewart, and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it,
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information.

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and where references to the
    license had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier should be
    applied to a file was done in a spreadsheet of side-by-side results
    from the output of two independent scanners (ScanCode & Windriver)
    producing SPDX tag:value files, created by Philippe Ombredanne.
    Philippe prepared the base worksheet and did an initial spot review
    of a few thousand files.

    The 4.13 kernel was the starting point of the analysis, with 60,537 files
    assessed. Kate Stewart did a file-by-file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    should be applied to each file. She confirmed any determination that was
    not immediately clear with lawyers working with the Linux Foundation.

    The criteria used to select files for SPDX license identifier tagging
    were:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
      lines of source.
    - The file already had some variant of a license header in it (even if <5
      lines).
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

16 Nov, 2016

1 commit

  • With the s390 special case of a yielding cpu_relax() implementation gone,
    we can now remove all users of cpu_relax_lowlatency() and replace them
    with cpu_relax().

    Signed-off-by: Christian Borntraeger
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Catalin Marinas
    Cc: Heiko Carstens
    Cc: Linus Torvalds
    Cc: Martin Schwidefsky
    Cc: Nicholas Piggin
    Cc: Noam Camus
    Cc: Peter Zijlstra
    Cc: Russell King
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Cc: linuxppc-dev@lists.ozlabs.org
    Cc: virtualization@lists.linux-foundation.org
    Cc: xen-devel@lists.xenproject.org
    Link: http://lkml.kernel.org/r/1477386195-32736-5-git-send-email-borntraeger@de.ibm.com
    Signed-off-by: Ingo Molnar

    Christian Borntraeger
     

12 Aug, 2015

1 commit

  • cmpxchg64_relaxed() is now defined by linux/atomic.h, so we can
    remove our local definition from the lockref code.

    Signed-off-by: Will Deacon
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Waiman.Long@hp.com
    Cc: paulmck@linux.vnet.ibm.com
    Link: http://lkml.kernel.org/r/1438880084-18856-5-git-send-email-will.deacon@arm.com
    Signed-off-by: Ingo Molnar

    Will Deacon
     

24 Feb, 2015

1 commit

  • With the new standardized functions, we can replace all
    ACCESS_ONCE() calls across the relevant locking code; this includes
    lockref and seqlock.

    ACCESS_ONCE() does not work reliably on non-scalar types.
    For example, gcc 4.6 and 4.7 might remove the volatile tag
    for such accesses during the SRA (scalar replacement of
    aggregates) step:

    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145

    Update the calls regardless of whether the type is scalar;
    this is cleaner than having three alternatives.
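
    A sketch of the difference (simplified; the real definitions live in
    include/linux/compiler.h, and the real READ_ONCE() also keeps volatile
    accesses for scalar sizes):

      /* The old macro is a plain volatile cast; for aggregate (struct)
       * arguments, gcc 4.6/4.7's SRA pass could drop the volatile. */
      #define ACCESS_ONCE(x) (*(volatile typeof(x) *)&(x))

      /* Simplified sketch of READ_ONCE(): copying through a union
       * survives SRA for scalar and aggregate types alike. */
      #define READ_ONCE(x)                                          \
      ({                                                            \
              union { typeof(x) val; char c[sizeof(x)]; } u;        \
              __builtin_memcpy(u.c, (const void *)&(x), sizeof(x)); \
              u.val;                                                \
      })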

    Signed-off-by: Davidlohr Bueso
    Cc: Peter Zijlstra
    Cc: Linus Torvalds
    Cc: Andrew Morton
    Cc: Thomas Gleixner
    Cc: Paul E. McKenney
    Link: http://lkml.kernel.org/r/1424662301.6539.18.camel@stgolabs.net
    Signed-off-by: Ingo Molnar

    Davidlohr Bueso
     

26 Jan, 2015

1 commit


17 Jul, 2014

1 commit

  • The arch_mutex_cpu_relax() function, introduced by 34b133f, is
    hacky and ugly. It was added a few years ago to address the fact
    that common cpu_relax() calls include yielding on s390, and thus
    impact the optimistic spinning functionality of mutexes. Nowadays
    we use this function well beyond mutexes: rwsem, qrwlock, mcs and
    lockref. Since the macro that defines the call is in the mutex header,
    any users must include mutex.h, and the naming is misleading as well.

    This patch (i) renames the call to cpu_relax_lowlatency ("relax, but
    only if you can do it with very low latency") and (ii) defines it in
    each arch's asm/processor.h local header, just like for regular cpu_relax
    functions. On all archs except s390, cpu_relax_lowlatency is simply
    cpu_relax, and thus we can take it out of mutex.h. While this can seem
    redundant, I believe it is a good choice, as it allows us to move
    arch-specific logic out of generic locking primitives and enables future
    archs to transparently define it, similarly to System z.
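
    The resulting per-arch definitions are one-liners; a sketch of the two
    cases (illustrative, following the description above):

      /* every arch except s390, in its asm/processor.h: */
      #define cpu_relax_lowlatency() cpu_relax()

      /* s390, whose regular cpu_relax() yields the virtual cpu: */
      #define cpu_relax_lowlatency() barrier()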

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Peter Zijlstra
    Cc: Andrew Morton
    Cc: Anton Blanchard
    Cc: Aurelien Jacquiot
    Cc: Benjamin Herrenschmidt
    Cc: Bharat Bhushan
    Cc: Catalin Marinas
    Cc: Chen Liqin
    Cc: Chris Metcalf
    Cc: Christian Borntraeger
    Cc: Chris Zankel
    Cc: David Howells
    Cc: David S. Miller
    Cc: Deepthi Dharwar
    Cc: Dominik Dingel
    Cc: Fenghua Yu
    Cc: Geert Uytterhoeven
    Cc: Guan Xuetao
    Cc: Haavard Skinnemoen
    Cc: Hans-Christian Egtvedt
    Cc: Heiko Carstens
    Cc: Helge Deller
    Cc: Hirokazu Takata
    Cc: Ivan Kokshaysky
    Cc: James E.J. Bottomley
    Cc: James Hogan
    Cc: Jason Wang
    Cc: Jesper Nilsson
    Cc: Joe Perches
    Cc: Jonas Bonn
    Cc: Joseph Myers
    Cc: Kees Cook
    Cc: Koichi Yasutake
    Cc: Lennox Wu
    Cc: Linus Torvalds
    Cc: Mark Salter
    Cc: Martin Schwidefsky
    Cc: Matt Turner
    Cc: Max Filippov
    Cc: Michael Neuling
    Cc: Michal Simek
    Cc: Mikael Starvik
    Cc: Nicolas Pitre
    Cc: Paolo Bonzini
    Cc: Paul Burton
    Cc: Paul E. McKenney
    Cc: Paul Gortmaker
    Cc: Paul Mackerras
    Cc: Qais Yousef
    Cc: Qiaowei Ren
    Cc: Rafael Wysocki
    Cc: Ralf Baechle
    Cc: Richard Henderson
    Cc: Richard Kuo
    Cc: Russell King
    Cc: Steven Miao
    Cc: Steven Rostedt
    Cc: Stratos Karafotis
    Cc: Tim Chen
    Cc: Tony Luck
    Cc: Vasily Kulikov
    Cc: Vineet Gupta
    Cc: Vineet Gupta
    Cc: Waiman Long
    Cc: Will Deacon
    Cc: Wolfram Sang
    Cc: adi-buildroot-devel@lists.sourceforge.net
    Cc: linux390@de.ibm.com
    Cc: linux-alpha@vger.kernel.org
    Cc: linux-am33-list@redhat.com
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-c6x-dev@linux-c6x.org
    Cc: linux-cris-kernel@axis.com
    Cc: linux-hexagon@vger.kernel.org
    Cc: linux-ia64@vger.kernel.org
    Cc: linux@lists.openrisc.net
    Cc: linux-m32r-ja@ml.linux-m32r.org
    Cc: linux-m32r@ml.linux-m32r.org
    Cc: linux-m68k@lists.linux-m68k.org
    Cc: linux-metag@vger.kernel.org
    Cc: linux-mips@linux-mips.org
    Cc: linux-parisc@vger.kernel.org
    Cc: linuxppc-dev@lists.ozlabs.org
    Cc: linux-s390@vger.kernel.org
    Cc: linux-sh@vger.kernel.org
    Cc: linux-xtensa@linux-xtensa.org
    Cc: sparclinux@vger.kernel.org
    Link: http://lkml.kernel.org/r/1404079773.2619.4.camel@buesod1.americas.hpqcorp.net
    Signed-off-by: Ingo Molnar

    Davidlohr Bueso
     

28 Nov, 2013

1 commit


15 Nov, 2013

1 commit


11 Nov, 2013

1 commit

  • Pull gfs2 updates from Steven Whitehouse:
    "The main feature of interest this time is quota updates. There are
    some clean ups and some patches to use the new generic lru list code.

    There is still plenty of scope for some further changes in due course -
    faster lookups of quota structures is very much on the todo list.
    Also, a start has been made towards the more tricky issue of using the
    generic lru code with glocks, but that will have to be completed in a
    subsequent merge window.

    The other, more minor feature is that there have been a number of
    performance patches which relate to block allocation. In particular,
    they will improve performance when the disk is nearly full"

    * tag 'gfs2-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-nmw:
    GFS2: Use generic list_lru for quota
    GFS2: Rename quota qd_lru_lock qd_lock
    GFS2: Use reflink for quota data cache
    GFS2: Use lockref for glocks
    GFS2: Protect quota sync generation
    GFS2: Inline qd_trylock into gfs2_quota_unlock
    GFS2: Make two similar quota code fragments into a function
    GFS2: Remove obsolete quota tunable
    GFS2: Move gfs2_icbit_munge into quota.c
    GFS2: Speed up starting point selection for block allocation
    GFS2: Add allocation parameters structure
    GFS2: Clean up reservation removal
    GFS2: fix dentry leaks
    GFS2: new function gfs2_rbm_incr
    GFS2: Introduce rbm field bii
    GFS2: Do not reset flags on active reservations
    GFS2: introduce bi_blocks for optimization
    GFS2: optimize rbm_from_block wrt bi_start
    GFS2: d_splice_alias() can't return error

    Linus Torvalds
     

15 Oct, 2013

1 commit

  • Currently glocks have an atomic reference count and also a spinlock
    which covers various internal fields, such as the state. The intent of
    this patch is to replace the spinlock and the atomic reference count
    with a lockref structure. This contains a spinlock which we can continue
    to use as before, and a reference counter which is used in conjunction
    with the spinlock to replace the previous atomic counter.

    As a result of this there are some new rules for reference counting on
    glocks. We need to distinguish between reference count changes under
    gl_spin (which are now just increment or decrement of the new counter,
    provided the count cannot hit zero) and those which are outside of
    gl_spin, but which now take gl_spin internally.

    The conversion is relatively straightforward. There is probably some
    further clean-up which can be done, but the priority at this stage is to
    make the change in as simple a manner as possible.

    A consequence of this change is that the reference count is being
    decoupled from the lru list processing. This should allow future
    adoption of the lru_list code with glocks in due course.

    The reason for using the "dead" state, rather than just relying on 0
    being the "invalid state", is so that 0 ref counts can be allowed in due
    course. The intent is to eventually be able to remove the ref count
    changes which are currently hidden away in state_change().
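
    In outline, the conversion swaps two fields for one embedded lockref;
    the sketch below is simplified, with field names taken from the
    description above rather than the full gfs2 headers, and the two
    snapshots are not meant to compile together:

      /* before: separate lock and atomic reference count */
      struct gfs2_glock {
              spinlock_t gl_spin;         /* covers internal fields */
              atomic_t   gl_ref;          /* reference count */
              /* ... */
      };

      /* after: a lockref provides both; under the spinlock the count may
       * be changed non-atomically, provided it cannot hit zero */
      struct gfs2_glock {
              struct lockref gl_lockref;  /* .lock replaces gl_spin */
              /* ... */
      };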

    Signed-off-by: Steven Whitehouse

    Steven Whitehouse
     

28 Sep, 2013

2 commits

  • Make use of arch_mutex_cpu_relax() so architectures can override the
    default cpu_relax() semantics.
    This is especially useful for s390, where cpu_relax() means that we
    yield() the current (virtual) cpu; that is very expensive and would
    contradict the whole purpose of the lockless cmpxchg loop.
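
    The override mechanism is a simple fallback define; a sketch of how it
    worked at the time (the config symbol is my recollection of the mutex
    header, so treat it as illustrative):

      /* include/linux/mutex.h: architectures that do not provide their
       * own arch_mutex_cpu_relax() get plain cpu_relax() */
      #ifndef CONFIG_HAVE_ARCH_MUTEX_CPU_RELAX
      #define arch_mutex_cpu_relax() cpu_relax()
      #endif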

    Signed-off-by: Heiko Carstens

    Heiko Carstens
     
  • The 64-bit cmpxchg operation on the lockref is ordered by virtue of
    hazarding between the cmpxchg operation and the reference count
    manipulation. On weakly ordered memory architectures (such as ARM), it
    can be of great benefit to omit the barrier instructions where they are
    not needed.

    This patch moves the lockless lockref code over to a cmpxchg64_relaxed
    operation, which doesn't provide barrier semantics. If the operation
    isn't defined, we simply #define it as the usual 64-bit cmpxchg macro.
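
    The fallback is exactly the usual #ifndef dance, per the description
    above (as added to lib/lockref.c):

      #ifndef cmpxchg64_relaxed
      # define cmpxchg64_relaxed cmpxchg64
      #endif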

    Cc: Waiman Long
    Signed-off-by: Will Deacon
    Signed-off-by: Linus Torvalds

    Will Deacon
     

21 Sep, 2013

1 commit

  • The cmpxchg() function tends not to support 64-bit arguments on 32-bit
    architectures. This could be either due to use of unsigned long
    arguments (like on ARM) or lack of instruction support (cmpxchgq on
    x86). However, these architectures may implement a specific cmpxchg64()
    function to provide 64-bit cmpxchg support instead.

    Since the lockref code requires a 64-bit cmpxchg and relies on the
    architecture selecting ARCH_USE_CMPXCHG_LOCKREF, move to using cmpxchg64
    instead of cmpxchg and allow 32-bit architectures to make use of the
    lockless lockref implementation.

    Cc: Waiman Long
    Signed-off-by: Will Deacon
    Signed-off-by: Linus Torvalds

    Will Deacon
     

08 Sep, 2013

2 commits

  • The only actual current lockref user (dcache) uses zero reference counts
    even for perfectly live dentries, because it's a cache: there may not be
    any users, but that doesn't mean that we want to throw away the dentry.

    At the same time, the dentry cache does have a notion of a truly "dead"
    dentry that we must not even increment the reference count of, because
    we have pruned it and it is not valid.

    Currently that distinction is not visible in the lockref itself, and the
    dentry cache validation uses "lockref_get_or_lock()" to either get a new
    reference to a dentry that already had existing references (and thus
    cannot be dead), or get the dentry lock so that we can then verify the
    dentry and increment the reference count under the lock if that
    verification was successful.

    That's all somewhat complicated.

    This adds the concept of being "dead" to the lockref itself, by simply
    using a count that is negative. This allows a usage scenario where we
    can increment the refcount of a dentry without having to validate it,
    and pushing the special "we killed it" case into the lockref code.

    The dentry code itself doesn't actually use this yet, and it's probably
    too late in the merge window to do that code (the dentry_kill() code
    with its "should I decrement the count" logic really is pretty complex
    code), but let's introduce the concept at the lockref level now.
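
    A sketch of the new primitive (simplified from lib/lockref.c; the
    magic value is illustrative of "any negative count means dead"):

      /*
       * lockref_mark_dead - mark the lockref dead, so lockless users
       * refuse to take new references. Must hold the spinlock.
       */
      void lockref_mark_dead(struct lockref *lockref)
      {
              assert_spin_locked(&lockref->lock);
              lockref->count = -128;
      }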

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • The code got rewritten, but the comments got copied as-is from older
    versions, and as a result the argument name in the comment didn't
    actually match the code any more.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

04 Sep, 2013

1 commit

  • While we are likely to succeed and break out of this loop, it isn't
    guaranteed. We should be power- and thread-friendly if we do have to
    go around for a second (or third, or more) attempt.

    Signed-off-by: Tony Luck
    Signed-off-by: Linus Torvalds

    Luck, Tony
     

03 Sep, 2013

2 commits

  • Instead of taking the spinlock, the lockless versions atomically check
    that the lock is not taken, and do the reference count update using a
    cmpxchg() loop. This is semantically identical to doing the reference
    count update protected by the lock, but avoids the "wait for lock"
    contention that you get when accesses to the reference count are
    contended.

    Note that a "lockref" is absolutely _not_ equivalent to an atomic_t.
    Even when the lockref reference counts are updated atomically with
    cmpxchg, the fact that they also verify the state of the spinlock means
    that the lockless updates can never happen while somebody else holds the
    spinlock.

    So while "lockref_put_or_lock()" looks a lot like just another name for
    "atomic_dec_and_lock()", and both optimize to lockless updates, they are
    fundamentally different: the decrement done by atomic_dec_and_lock() is
    truly independent of any lock (as long as it doesn't decrement to zero),
    so a locked region can still see the count change.

    The lockref structure, in contrast, really is a *locked* reference
    count. If you hold the spinlock, the reference count will be stable and
    you can modify the reference count without using atomics, because even
    the lockless updates will see and respect the state of the lock.

    In order to enable the cmpxchg lockless code, the architecture needs to
    do three things:

    (1) Make sure that the "arch_spinlock_t" and an "unsigned int" can fit
    in an aligned u64, and have a "cmpxchg()" implementation that works
    on such a u64 data type.

    (2) define a helper function to test for a spinlock being unlocked
    ("arch_spin_value_unlocked()")

    (3) select the "ARCH_USE_CMPXCHG_LOCKREF" config variable in its
    Kconfig file.

    This enables it for x86-64 (but not 32-bit, we'd need to make sure
    cmpxchg() turns into the proper cmpxchg8b in order to enable it for
    32-bit mode).
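
    Requirement (1) is visible directly in the data structure: the lock and
    the count share a single aligned 64-bit word that the cmpxchg targets
    (simplified from include/linux/lockref.h):

      struct lockref {
              union {
      #ifdef CONFIG_CMPXCHG_LOCKREF
                      aligned_u64 lock_count;  /* single cmpxchg() target */
      #endif
                      struct {
                              spinlock_t lock;
                              unsigned int count;
                      };
              };
      };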

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • They aren't very good to inline, since they already call external
    functions (the spinlock code), and we're going to create rather more
    complicated versions of them that can do the reference count updates
    locklessly.

    Signed-off-by: Linus Torvalds

    Linus Torvalds