08 Dec, 2019

1 commit

  • CONFIG_PREEMPTION is selected by CONFIG_PREEMPT and by CONFIG_PREEMPT_RT.
    Both PREEMPT and PREEMPT_RT require the same functionality, which today
    depends on CONFIG_PREEMPT.

    Switch the Kconfig dependency to use CONFIG_PREEMPTION.
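
    A minimal compile-time sketch of the resulting dependency shape (the
    CONFIG_* macros are hand-defined stand-ins here; in a real kernel
    build they come from Kconfig):

        #include <stdio.h>

        /* Pretend Kconfig selected PREEMPT_RT for this build. */
        #define CONFIG_PREEMPT_RT 1

        /* Models "CONFIG_PREEMPTION is selected by CONFIG_PREEMPT and by
         * CONFIG_PREEMPT_RT". */
        #if defined(CONFIG_PREEMPT) || defined(CONFIG_PREEMPT_RT)
        #define CONFIG_PREEMPTION 1
        #endif

        int main(void)
        {
        #ifdef CONFIG_PREEMPTION
            puts("preemption-dependent code is built in");
        #endif
            return 0;
        }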

    Signed-off-by: Sebastian Andrzej Siewior
    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Will Deacon
    Link: https://lore.kernel.org/r/20191015191821.11479-32-bigeasy@linutronix.de
    Signed-off-by: Ingo Molnar

    Sebastian Andrzej Siewior
     

07 May, 2019

1 commit

  • Pull mmiowb removal from Will Deacon:
    "Remove Mysterious Macro Intended to Obscure Weird Behaviours (mmiowb())

    Remove mmiowb() from the kernel memory barrier API and instead, for
    architectures that need it, hide the barrier inside spin_unlock() when
    MMIO has been performed inside the critical section.

    The only relatively recent changes have been addressing review
    comments on the documentation, which is in much better shape thanks
    to the efforts of Ben and Ingo.

    I was initially planning to split this into two pull requests so that
    you could run the coccinelle script yourself, however it's been plain
    sailing in linux-next so I've just included the whole lot here to keep
    things simple"
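
    As a rough illustration of the driver-visible change, here is a
    user-space model (a pthread mutex stands in for the spinlock, a
    volatile variable for a device register; writel() and mmiowb() here
    are local stubs, not the kernel APIs):

        #include <pthread.h>
        #include <stdint.h>

        static volatile uint32_t fake_reg;   /* stand-in for an MMIO register */
        static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

        static void writel(uint32_t v, volatile uint32_t *reg) { *reg = v; }

        /* Old pattern: each driver had to remember the barrier itself. */
        static void old_style(uint32_t v)
        {
            pthread_mutex_lock(&lock);
            writel(v, &fake_reg);
            /* mmiowb();  <- explicit barrier, now removed */
            pthread_mutex_unlock(&lock);
        }

        /* New pattern: on architectures that need it, the unlock issues
         * the barrier automatically when MMIO happened under the lock. */
        static void new_style(uint32_t v)
        {
            pthread_mutex_lock(&lock);
            writel(v, &fake_reg);
            pthread_mutex_unlock(&lock);
        }

        int main(void) { old_style(1); new_style(2); return 0; }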

    * tag 'arm64-mmiowb' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (23 commits)
    docs/memory-barriers.txt: Update I/O section to be clearer about CPU vs thread
    docs/memory-barriers.txt: Fix style, spacing and grammar in I/O section
    arch: Remove dummy mmiowb() definitions from arch code
    net/ethernet/silan/sc92031: Remove stale comment about mmiowb()
    i40iw: Redefine i40iw_mmiowb() to do nothing
    scsi/qla1280: Remove stale comment about mmiowb()
    drivers: Remove explicit invocations of mmiowb()
    drivers: Remove useless trailing comments from mmiowb() invocations
    Documentation: Kill all references to mmiowb()
    riscv/mmiowb: Hook up mmiowb() implementation to asm-generic code
    powerpc/mmiowb: Hook up mmiowb() implementation to asm-generic code
    ia64/mmiowb: Add unconditional mmiowb() to arch_spin_unlock()
    mips/mmiowb: Add unconditional mmiowb() to arch_spin_unlock()
    sh/mmiowb: Add unconditional mmiowb() to arch_spin_unlock()
    m68k/io: Remove useless definition of mmiowb()
    nds32/io: Remove useless definition of mmiowb()
    x86/io: Remove useless definition of mmiowb()
    arm64/io: Remove useless definition of mmiowb()
    ARM/io: Remove useless definition of mmiowb()
    mmiowb: Hook up mmiowb helpers to spinlocks and generic I/O accessors
    ...

    Linus Torvalds
     

08 Apr, 2019

1 commit

  • In preparation for removing all explicit mmiowb() calls from driver
    code, implement a tracking system in asm-generic based loosely on the
    PowerPC implementation. This allows architectures with a non-empty
    mmiowb() definition to have the barrier automatically inserted in
    spin_unlock() following a critical section containing an I/O write.
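
    A simplified single-threaded model of that tracking scheme (the hook
    names mirror the kernel's, but the bodies are illustrative; the
    kernel keeps this state per CPU):

        #include <stdbool.h>
        #include <stdio.h>

        struct mmiowb_state {
            unsigned int nesting_count;   /* spinlock nesting depth        */
            bool mmiowb_pending;          /* MMIO write seen under a lock  */
        };

        static struct mmiowb_state state; /* per-CPU in the kernel */

        static void mmiowb(void) { puts("mmiowb barrier issued"); }

        /* Called from the I/O write accessors. */
        static void mmiowb_set_pending(void)
        {
            if (state.nesting_count)
                state.mmiowb_pending = true;
        }

        /* Hooked into spin_lock() / spin_unlock(). */
        static void mmiowb_spin_lock(void)
        {
            state.nesting_count++;
        }

        static void mmiowb_spin_unlock(void)
        {
            if (--state.nesting_count == 0 && state.mmiowb_pending) {
                state.mmiowb_pending = false;
                mmiowb();
            }
        }

        int main(void)
        {
            mmiowb_spin_lock();
            mmiowb_set_pending();   /* an MMIO write in the section */
            mmiowb_spin_unlock();   /* barrier fires here           */
            return 0;
        }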

    Acked-by: Linus Torvalds
    Signed-off-by: Will Deacon

    Will Deacon
     

03 Apr, 2019

1 commit

  • Currently, we have two different implementations of rwsem:

    1) CONFIG_RWSEM_GENERIC_SPINLOCK (rwsem-spinlock.c)
    2) CONFIG_RWSEM_XCHGADD_ALGORITHM (rwsem-xadd.c)

    As we are moving to a single generic implementation of rwsem
    (rwsem-xadd.c) with no architecture-specific code needed, there is no
    point in keeping two different implementations. In most cases, the
    performance of rwsem-spinlock.c is worse, and it does not get the
    performance tuning and optimizations that have been implemented in
    rwsem-xadd.c over the years.

    For simplification, we are going to remove rwsem-spinlock.c and make
    all architectures use a single implementation of rwsem: rwsem-xadd.c.

    All references to RWSEM_GENERIC_SPINLOCK and RWSEM_XCHGADD_ALGORITHM
    in the code are removed.

    Suggested-by: Peter Zijlstra
    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Linus Torvalds
    Cc: Andrew Morton
    Cc: Arnd Bergmann
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: H. Peter Anvin
    Cc: Paul E. McKenney
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-c6x-dev@linux-c6x.org
    Cc: linux-m68k@lists.linux-m68k.org
    Cc: linux-riscv@lists.infradead.org
    Cc: linux-um@lists.infradead.org
    Cc: linux-xtensa@linux-xtensa.org
    Cc: linuxppc-dev@lists.ozlabs.org
    Cc: nios2-dev@lists.rocketboards.org
    Cc: openrisc@lists.librecores.org
    Cc: uclinux-h8-devel@lists.sourceforge.jp
    Link: https://lkml.kernel.org/r/20190322143008.21313-3-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     

02 Feb, 2019

1 commit

  • Introduce 'struct bpf_spin_lock' and the bpf_spin_lock/unlock() helpers
    to let a bpf program serialize access to other variables.

    Example:

        struct hash_elem {
            int cnt;
            struct bpf_spin_lock lock;
        };

        struct hash_elem *val = bpf_map_lookup_elem(&hash_map, &key);
        if (val) {
            bpf_spin_lock(&val->lock);
            val->cnt++;
            bpf_spin_unlock(&val->lock);
        }

    Restrictions and safety checks:
    - bpf_spin_lock is only allowed inside HASH and ARRAY maps.
    - BTF description of the map is mandatory for safety analysis.
    - a bpf program can take only one bpf_spin_lock at a time, since two or
    more could cause deadlocks.
    - only one 'struct bpf_spin_lock' is allowed per map element.
    This drastically simplifies the implementation yet allows a bpf program
    to use any number of bpf_spin_locks.
    - while a bpf_spin_lock is held, calls (either bpf2bpf or to helpers) are not allowed.
    - bpf program must bpf_spin_unlock() before return.
    - bpf program can access 'struct bpf_spin_lock' only via
    bpf_spin_lock()/bpf_spin_unlock() helpers.
    - load/store into 'struct bpf_spin_lock lock;' field is not allowed.
    - to use the bpf_spin_lock() helper, the BTF description of the map value
    must be a struct with a 'struct bpf_spin_lock anyname;' field at the top
    level (see the sketch after this list). A lock nested inside another
    struct is not allowed.
    - syscall map_lookup doesn't copy bpf_spin_lock field to user space.
    - syscall map_update and program map_update do not update bpf_spin_lock field.
    - bpf_spin_lock cannot be on the stack or inside networking packet.
    bpf_spin_lock can only be inside HASH or ARRAY map value.
    - bpf_spin_lock is available to root only and to all program types.
    - bpf_spin_lock is not allowed in inner maps of map-in-map.
    - ld_abs is not allowed inside spin_lock-ed region.
    - tracing progs and socket filter progs cannot use bpf_spin_lock due to
    insufficient preemption checks.
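
    For illustration only, a map declaration satisfying these rules could
    look like the sketch below, written in libbpf's later BTF-defined map
    style (the __uint/__type/SEC macros come from bpf/bpf_helpers.h; this
    is a hypothetical example, not part of the commit):

        #include <linux/bpf.h>
        #include <bpf/bpf_helpers.h>

        /* Map value: the lock must be a top-level field of the struct. */
        struct hash_elem {
            int cnt;
            struct bpf_spin_lock lock;
        };

        /* BTF-described hash map, so the verifier can locate the lock. */
        struct {
            __uint(type, BPF_MAP_TYPE_HASH);
            __uint(max_entries, 100);
            __type(key, int);
            __type(value, struct hash_elem);
        } hash_map SEC(".maps");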

    Implementation details:
    - cgroup-bpf class of programs can nest with xdp/tc programs.
    Hence bpf_spin_lock is equivalent to spin_lock_irqsave.
    Other solutions to avoid nested bpf_spin_lock are possible, like making
    sure that all networking progs run with softirq disabled.
    spin_lock_irqsave is the simplest and doesn't add overhead to the
    programs that don't use it.
    - arch_spinlock_t is used when it is implemented as queued_spin_lock
    - archs can force their own arch_spinlock_t
    - on architectures where queued_spin_lock is not available and
    sizeof(arch_spinlock_t) != sizeof(__u32), a trivial lock is used.
    - presence of bpf_spin_lock inside map value could have been indicated via
    extra flag during map_create, but specifying it via BTF is cleaner.
    It provides introspection for map key/value and reduces user mistakes.

    Next steps:
    - allow bpf_spin_lock in other map types (like cgroup local storage)
    - introduce BPF_F_LOCK flag for bpf_map_update() syscall and helper
    to request kernel to grab bpf_spin_lock before rewriting the value.
    That will serialize access to map elements.

    Acked-by: Peter Zijlstra (Intel)
    Signed-off-by: Alexei Starovoitov
    Signed-off-by: Daniel Borkmann

    Alexei Starovoitov
     

12 May, 2015

1 commit

  • To be consistent with the queued spinlocks, which use the
    CONFIG_QUEUED_SPINLOCKS config parameter, the one for the queued
    rwlocks is now renamed to CONFIG_QUEUED_RWLOCKS.

    Signed-off-by: Waiman Long
    Cc: Borislav Petkov
    Cc: Douglas Hatch
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Scott J Norton
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1431367031-36697-1-git-send-email-Waiman.Long@hp.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     

11 May, 2015

1 commit

  • Valentin Rothberg reported that we use CONFIG_QUEUED_SPINLOCKS
    in arch/x86/kernel/paravirt_patch_32.c, while the symbol is
    called CONFIG_QUEUED_SPINLOCK. (Note the extra 'S')

    But the typo was natural: the proper English term for such
    a generic object would be 'queued spinlocks' - so rename
    this and related symbols accordingly to the plural form.

    Reported-by: Valentin Rothberg
    Cc: Douglas Hatch
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Scott J Norton
    Cc: Thomas Gleixner
    Cc: Waiman Long
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

08 May, 2015

3 commits

  • This patch adds the necessary Xen-specific code to allow Xen to
    support the CPU halting and kicking operations needed by the queued
    spinlock PV code.

    Signed-off-by: David Vrabel
    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Boris Ostrovsky
    Cc: Borislav Petkov
    Cc: Daniel J Blueman
    Cc: Douglas Hatch
    Cc: H. Peter Anvin
    Cc: Konrad Rzeszutek Wilk
    Cc: Linus Torvalds
    Cc: Oleg Nesterov
    Cc: Paolo Bonzini
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Raghavendra K T
    Cc: Rik van Riel
    Cc: Scott J Norton
    Cc: Thomas Gleixner
    Cc: virtualization@lists.linux-foundation.org
    Cc: xen-devel@lists.xenproject.org
    Link: http://lkml.kernel.org/r/1429901803-29771-12-git-send-email-Waiman.Long@hp.com
    Signed-off-by: Ingo Molnar

    David Vrabel
     
  • This patch adds the necessary KVM-specific code to allow KVM to
    support the CPU halting and kicking operations needed by the queued
    spinlock PV code.

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Boris Ostrovsky
    Cc: Borislav Petkov
    Cc: Daniel J Blueman
    Cc: David Vrabel
    Cc: Douglas Hatch
    Cc: H. Peter Anvin
    Cc: Konrad Rzeszutek Wilk
    Cc: Linus Torvalds
    Cc: Oleg Nesterov
    Cc: Paolo Bonzini
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Raghavendra K T
    Cc: Rik van Riel
    Cc: Scott J Norton
    Cc: Thomas Gleixner
    Cc: virtualization@lists.linux-foundation.org
    Cc: xen-devel@lists.xenproject.org
    Link: http://lkml.kernel.org/r/1429901803-29771-11-git-send-email-Waiman.Long@hp.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • This patch introduces a new generic queued spinlock implementation that
    can serve as an alternative to the default ticket spinlock. The queued
    spinlock is almost as fair as the ticket spinlock, has about the same
    speed in the single-thread case, and can be much faster under high
    contention, especially when the spinlock is embedded within the data
    structure to be protected.

    Only in light to moderate contention, where the average queue depth
    is around 1-3, will this queued spinlock potentially be a bit slower
    due to the higher slowpath overhead.

    This queued spinlock is especially suited to NUMA machines with a large
    number of cores, as the chance of spinlock contention is much higher
    on those machines. The cost of contention is also higher because of
    slower inter-node memory traffic.

    Since spinlocks are acquired with preemption disabled, a process will
    not be migrated to another CPU while it is trying to get a spinlock.
    Ignoring interrupt handling, a CPU can be contending on only one
    spinlock at any one time. Counting soft IRQ, hard IRQ and NMI contexts,
    a CPU can have at most 4 concurrent lock-waiting activities. By
    allocating a set of per-cpu queue nodes and using them to form a
    waiting queue, we can encode the queue node address into a much smaller
    24-bit value (including CPU number and queue node index), leaving one
    byte for the lock, as sketched below.

    Please note that the queue node is only needed when waiting for the
    lock. Once the lock is acquired, the queue node can be released to
    be used later.
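
    As an illustration of the tail encoding, here is a toy model (the
    exact bit layout is hypothetical, not the kernel's real qspinlock
    layout):

        #include <assert.h>
        #include <stdint.h>
        #include <stdio.h>

        /* Pack (cpu, per-CPU node index) into the bits above the lock
         * byte: 2 bits of index, the rest for the CPU number. */
        static uint32_t encode_tail(unsigned int cpu, unsigned int idx)
        {
            /* +1 so that tail == 0 means "no waiters" */
            return (((cpu + 1) << 2) | idx) << 8;
        }

        static void decode_tail(uint32_t tail,
                                unsigned int *cpu, unsigned int *idx)
        {
            *cpu = (tail >> 10) - 1;
            *idx = (tail >> 8) & 3;
        }

        int main(void)
        {
            unsigned int cpu, idx;
            decode_tail(encode_tail(17, 2), &cpu, &idx);
            assert(cpu == 17 && idx == 2);
            puts("tail encoding round-trips");
            return 0;
        }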

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Boris Ostrovsky
    Cc: Borislav Petkov
    Cc: Daniel J Blueman
    Cc: David Vrabel
    Cc: Douglas Hatch
    Cc: H. Peter Anvin
    Cc: Konrad Rzeszutek Wilk
    Cc: Linus Torvalds
    Cc: Oleg Nesterov
    Cc: Paolo Bonzini
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Raghavendra K T
    Cc: Rik van Riel
    Cc: Scott J Norton
    Cc: Thomas Gleixner
    Cc: virtualization@lists.linux-foundation.org
    Cc: xen-devel@lists.xenproject.org
    Link: http://lkml.kernel.org/r/1429901803-29771-2-git-send-email-Waiman.Long@hp.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     

14 Jan, 2015

1 commit

  • We have two flavors of the MCS spinlock: standard and cancelable (OSQ).
    While each one is independent of the other, we currently mix and match
    them. This patch:

    - Moves the OSQ code out of mcs_spinlock.h (which only deals with the
    traditional version) into include/linux/osq_lock.h. No unnecessary code
    is added to the more global header file; any locks that make use of OSQ
    must include it anyway.

    - Renames mcs_spinlock.c to osq_lock.c. This file only contains osq code.

    - Introduces a CONFIG_LOCK_SPIN_ON_OWNER in order to only build osq_lock
    if there is support for it.

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Thomas Gleixner
    Cc: "Paul E. McKenney"
    Cc: Jason Low
    Cc: Linus Torvalds
    Cc: Mikulas Patocka
    Cc: Waiman Long
    Link: http://lkml.kernel.org/r/1420573509-24774-5-git-send-email-dave@stgolabs.net
    Signed-off-by: Ingo Molnar

    Davidlohr Bueso
     

16 Jul, 2014

2 commits

  • Just like with mutexes (CONFIG_MUTEX_SPIN_ON_OWNER),
    encapsulate the dependencies for rwsem optimistic spinning.
    No logical changes here as it continues to depend on both
    SMP and the XADD algorithm variant.

    Signed-off-by: Davidlohr Bueso
    Acked-by: Jason Low
    [ Also make it depend on ARCH_SUPPORTS_ATOMIC_RMW. ]
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1405112406-13052-2-git-send-email-davidlohr@hp.com
    Cc: aswin@hp.com
    Cc: Chris Mason
    Cc: Davidlohr Bueso
    Cc: Josef Bacik
    Cc: Linus Torvalds
    Cc: Waiman Long
    Signed-off-by: Ingo Molnar

    Signed-off-by: Ingo Molnar

    Davidlohr Bueso
     
  • The optimistic spin code assumes regular stores and cmpxchg() play nice;
    this turns out not to be true for at least: parisc, sparc32, tile32,
    metag-lock1, arc-!llsc and hexagon.
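
    To see why the assumption breaks, here is a user-space model
    (assuming, as on such architectures, that cmpxchg() is emulated with
    a lock that plain stores never take):

        #include <pthread.h>
        #include <stdint.h>

        static pthread_mutex_t emu_lock = PTHREAD_MUTEX_INITIALIZER;

        /* Emulated cmpxchg: atomic only against other emulated RMW ops. */
        static uintptr_t emu_cmpxchg(uintptr_t *p, uintptr_t old, uintptr_t new)
        {
            pthread_mutex_lock(&emu_lock);
            uintptr_t cur = *p;
            if (cur == old)
                *p = new;
            pthread_mutex_unlock(&emu_lock);
            return cur;
        }

        /* A regular store from another CPU bypasses emu_lock entirely, so
         * it can land between the load and the store above and be lost -
         * exactly the store/cmpxchg mixing the optimistic spin code does. */
        static void plain_store(uintptr_t *p, uintptr_t v)
        {
            *p = v;
        }

        int main(void)
        {
            uintptr_t word = 0;
            emu_cmpxchg(&word, 0, 1);
            plain_store(&word, 2);
            return 0;
        }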

    There is further wreckage, but this in particular seemed easy to
    trigger, so blacklist this.

    Opt in for known good archs.

    Signed-off-by: Peter Zijlstra
    Reported-by: Mikulas Patocka
    Cc: David Miller
    Cc: Chris Metcalf
    Cc: James Bottomley
    Cc: Vineet Gupta
    Cc: Jason Low
    Cc: Waiman Long
    Cc: "James E.J. Bottomley"
    Cc: Paul McKenney
    Cc: John David Anglin
    Cc: James Hogan
    Cc: Linus Torvalds
    Cc: Davidlohr Bueso
    Cc: stable@vger.kernel.org
    Cc: Benjamin Herrenschmidt
    Cc: Catalin Marinas
    Cc: Russell King
    Cc: Will Deacon
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-kernel@vger.kernel.org
    Cc: linuxppc-dev@lists.ozlabs.org
    Cc: sparclinux@vger.kernel.org
    Link: http://lkml.kernel.org/r/20140606175316.GV13930@laptop.programming.kicks-ass.net
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

06 Jun, 2014

1 commit

  • This rwlock uses the arch_spinlock_t as a waitqueue, and assuming the
    arch_spinlock_t is a fair lock (ticket, MCS, etc.), the resulting
    rwlock is a fair lock.

    It fits in the same 8 bytes as the regular rwlock_t by folding the
    reader and writer count into a single integer, using the remaining 4
    bytes for the arch_spinlock_t.

    Architectures that can single-copy address bytes can optimize
    queue_write_unlock() with a 0 write to the LSB, i.e. the write count
    (sketched below, after the performance table).

    Performance as measured by Davidlohr Bueso (rwlock_t -> qrwlock_t):

    +--------------+--------------+----------------+
    | Workload     | #users       | delta          |
    +--------------+--------------+----------------+
    | alltests     | > 1400       | -4.83%         |
    | custom       | 0-100, > 100 | +1.43%, -1.57% |
    | high_systime | > 1000       | -2.61%         |
    | shared       | all          | +0.32%         |
    +--------------+--------------+----------------+

    http://www.stgolabs.net/qrwlock-stuff/aim7-results-vs-rwsem_optsin/
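
    A hedged sketch of the layout idea described above (field offsets and
    constants are illustrative rather than the kernel's exact qrwlock
    definition; assumes a little-endian machine so the writer byte is the
    LSB of the count word):

        #include <stdint.h>
        #include <stdio.h>

        struct qrwlock_model {
            union {
                uint32_t cnts;    /* readers counted in the upper bytes */
                uint8_t  wlocked; /* writer byte: the LSB of cnts       */
            };
            uint32_t wait_lock;   /* stands in for the arch_spinlock_t  */
        };                        /* 8 bytes total, like rwlock_t       */

        /* Writer unlock can then be a single byte store of 0 (the real
         * code uses a release store, not a plain one). */
        static void queue_write_unlock(struct qrwlock_model *l)
        {
            l->wlocked = 0;
        }

        int main(void)
        {
            struct qrwlock_model l = { .cnts = 0xff };  /* writer holds it */
            queue_write_unlock(&l);
            printf("cnts after unlock: %u\n", (unsigned)l.cnts);
            return 0;
        }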

    Signed-off-by: Waiman Long
    [peterz: near complete rewrite]
    Signed-off-by: Peter Zijlstra
    Cc: Arnd Bergmann
    Cc: Linus Torvalds
    Cc: "Paul E.McKenney"
    Cc: linux-arch@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Link: http://lkml.kernel.org/n/tip-gac1nnl3wvs2ij87zv2xkdzq@git.kernel.org
    Signed-off-by: Ingo Molnar

    Waiman Long
     

28 May, 2013

1 commit

  • The Kconfig symbols ARCH_INLINE_READ_UNLOCK_IRQ,
    ARCH_INLINE_SPIN_UNLOCK_IRQ, and ARCH_INLINE_WRITE_UNLOCK_IRQ were added
    in v2.6.33, but have never actually been used. Ingo Molnar spotted that
    this is caused by three identical copy/paste errors. E.g., the Kconfig
    entry for

    INLINE_READ_UNLOCK_IRQ

    has an (optional) dependency on:

    ARCH_INLINE_READ_UNLOCK_BH

    where it apparently should depend on:

    ARCH_INLINE_READ_UNLOCK_IRQ

    instead. Likewise for the Kconfig entries for INLINE_SPIN_UNLOCK_IRQ and
    INLINE_WRITE_UNLOCK_IRQ. Fix these three errors.

    This never really caused any real problems, as these symbols are set
    (or unset) as a group - but it's worth fixing nevertheless.

    Reported-by: Ingo Molnar
    Signed-off-by: Paul Bolle
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1368780693.1350.228.camel@x61.thuisdomein
    Signed-off-by: Ingo Molnar

    Paul Bolle
     

13 Sep, 2012

1 commit

  • Break out the DEBUG_SPINLOCK dependency (this requires moving up
    UNINLINE_SPIN_UNLOCK, as it was the only entry in that block not
    depending on that option).

    Avoid putting values not selected into the resulting .config - they
    are not useful for anything, make the output less legible, and just
    consume space: use "depends on" rather than directly setting the
    default from the combined dependency values.

    Signed-off-by: Jan Beulich
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/504DF2AC020000780009A2DF@nat28.tlf.novell.com
    Signed-off-by: Ingo Molnar

    Jan Beulich
     

23 Mar, 2012

1 commit

  • Get rid of INLINE_SPIN_UNLOCK entirely, replacing it with
    UNINLINE_SPIN_UNLOCK, which has the reverse meaning.

    Whoever wants to change the default spinlock inlining behavior and
    uninline the spinlocks for some weird reason, such as spinlock
    debugging, paravirt etc., can now just select UNINLINE_SPIN_UNLOCK.

    Original discussion at: https://lkml.org/lkml/2012/3/21/357

    Suggested-by: Linus Torvalds
    Signed-off-by: Raghavendra K T
    Cc: Linus Torvalds
    Cc: Ralf Baechle
    Cc: Chris Metcalf
    Cc: Chris Zankel
    Cc: linux-mips@linux-mips.org
    Link: http://lkml.kernel.org/r/20120322095502.30866.75756.sendpatchset@codeblue
    [ tidied up the changelog a bit ]
    Signed-off-by: Ingo Molnar

    Raghavendra K T
     

14 Nov, 2009

1 commit

  • commit 892a7c67 (locking: Allow arch-inlined spinlocks) implements the
    selection of which lock functions are inlined based on defines in
    arch/.../spinlock.h: #define __always_inline__LOCK_FUNCTION

    Despite the name __always_inline__*, the lock functions can be built
    out of line depending on config options. Also, if the arch does not set
    some inline defines, the generic code might set them, again depending
    on config options.

    This makes it unnecessarily hard to figure out when and which lock
    functions are inlined. Aside from that, it makes it much harder and
    messier for -rt to manipulate the lock functions.

    Convert the inlining decision to CONFIG switches. Each lock function
    is inlined depending on CONFIG_INLINE_*. The configs implement the
    existing dependencies. The architecture code can select ARCH_INLINE_*
    to signal that it wants the corresponding lock function inlined.
    ARCH_INLINE_* is necessary as Kconfig ignores "depends on"
    restrictions when a config element is selected.

    No functional change.
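
    A compile-time model of the scheme (the macro names mirror the
    ARCH_INLINE_*/CONFIG_INLINE_* pattern, but this is an illustrative
    sketch; the real code spreads this across Kconfig and the spinlock
    headers):

        #include <stdio.h>

        /* An architecture opts in by selecting the ARCH_INLINE_* symbol. */
        #define ARCH_INLINE_SPIN_LOCK 1

        /* The generic config then enables inlining for that function. */
        #ifdef ARCH_INLINE_SPIN_LOCK
        #define CONFIG_INLINE_SPIN_LOCK 1
        #endif

        #ifdef CONFIG_INLINE_SPIN_LOCK
        static inline void spin_lock_demo(void) { puts("inlined lock path"); }
        #else
        static void spin_lock_demo(void) { puts("out-of-line lock path"); }
        #endif

        int main(void)
        {
            spin_lock_demo();
            return 0;
        }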

    Signed-off-by: Thomas Gleixner
    LKML-Reference:
    Acked-by: Heiko Carstens
    Reviewed-by: Ingo Molnar
    Acked-by: Peter Zijlstra

    Thomas Gleixner