13 Jan, 2021

1 commit

  • Changes in 5.10.6
    Revert "drm/amd/display: Fix memory leaks in S3 resume"
    Revert "mtd: spinand: Fix OOB read"
    rtc: pcf2127: move watchdog initialisation to a separate function
    rtc: pcf2127: only use watchdog when explicitly available
    dt-bindings: rtc: add reset-source property
    kdev_t: always inline major/minor helper functions
    Bluetooth: Fix attempting to set RPA timeout when unsupported
    ALSA: hda/realtek - Modify Dell platform name
    ALSA: hda/hdmi: Fix incorrect mutex unlock in silent_stream_disable()
    drm/i915/tgl: Fix Combo PHY DPLL fractional divider for 38.4MHz ref clock
    scsi: ufs: Allow an error return value from ->device_reset()
    scsi: ufs: Re-enable WriteBooster after device reset
    RDMA/core: remove use of dma_virt_ops
    RDMA/siw,rxe: Make emulated devices virtual in the device tree
    fuse: fix bad inode
    perf: Break deadlock involving exec_update_mutex
    rwsem: Implement down_read_killable_nested
    rwsem: Implement down_read_interruptible
    exec: Transform exec_update_mutex into a rw_semaphore
    mwifiex: Fix possible buffer overflows in mwifiex_cmd_802_11_ad_hoc_start
    Linux 5.10.6

    Signed-off-by: Greg Kroah-Hartman
    Change-Id: Id4c57a151a1e8f2162163d2337b6055f04edbe9b

    Greg Kroah-Hartman
     

09 Jan, 2021

2 commits

  • [ Upstream commit 31784cff7ee073b34d6eddabb95e3be2880a425c ]

    In preparation for converting exec_update_mutex to a rwsem so that
    multiple readers can execute in parallel and not deadlock, add
    down_read_interruptible. This is needed for perf_event_open to be
    converted (with no semantic changes) from working on a mutex to
    working on a rwsem.
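
    A minimal usage sketch (illustrative only, not part of the patch; assumes
    <linux/rwsem.h>): the new primitive returns 0 on success and -EINTR when
    the sleep is interrupted by a signal.

    /* Hypothetical caller: take sem for read, but let a signal abort the wait. */
    static int do_read_work_interruptible(struct rw_semaphore *sem)
    {
            int ret;

            ret = down_read_interruptible(sem);
            if (ret)
                    return ret;     /* -EINTR: interrupted before the lock was taken */

            /* ... read-side critical section ... */

            up_read(sem);
            return 0;
    }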

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/87k0tybqfy.fsf@x220.int.ebiederm.org
    Signed-off-by: Sasha Levin

    Eric W. Biederman
     
  • [ Upstream commit 0f9368b5bf6db0c04afc5454b1be79022a681615 ]

    In preparation for converting exec_update_mutex to a rwsem so that
    multiple readers can execute in parallel and not deadlock, add
    down_read_killable_nested. This is needed so that kcmp_lock
    can be converted from working on mutexes to working on rw_semaphores.
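
    An illustrative sketch (hypothetical helper, not the kcmp code itself) of
    how the nested killable variant is meant to be used when two rwsems of the
    same lock class must be held at once; the caller is responsible for a
    consistent ordering of a and b. Assumes <linux/rwsem.h> and <linux/lockdep.h>.

    static int lock_two_rwsems_killable(struct rw_semaphore *a, struct rw_semaphore *b)
    {
            int err;

            err = down_read_killable(a);
            if (err)
                    return err;

            /* Second acquisition uses a lockdep subclass to avoid a false positive. */
            err = down_read_killable_nested(b, SINGLE_DEPTH_NESTING);
            if (err)
                    up_read(a);

            return err;
    }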

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/87o8jabqh3.fsf@x220.int.ebiederm.org
    Signed-off-by: Sasha Levin

    Eric W. Biederman
     

05 Jan, 2021

1 commit

  • The rwsem_waiter struct is needed in the vendor hook alter_rwsem_list_add.
    The hook has a parameter sem, a struct rw_semaphore (already exported in
    rwsem.h), which contains a wait_list linking "struct rwsem_waiter" items.
    The task information in each item of the wait_list needs to be
    referenced from vendor loadable modules.
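
    For illustration, a sketch of how a vendor hook might walk the waiter list
    once struct rwsem_waiter is visible to modules (hypothetical helper; the
    waiter entries carry a 'task' pointer and are linked through 'list'):

    #include <linux/rwsem.h>
    #include <linux/sched.h>

    static void dump_rwsem_waiters(struct rw_semaphore *sem)
    {
            struct rwsem_waiter *waiter;

            lockdep_assert_held(&sem->wait_lock);
            list_for_each_entry(waiter, &sem->wait_list, list)
                    pr_info("rwsem waiter: %s (pid %d)\n",
                            waiter->task->comm, waiter->task->pid);
    }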

    Bug: 174902706
    Signed-off-by: Huang Yiwei
    Change-Id: Ic7d21ffdd795eaa203989751d26f8b1f32134d8b

    Huang Yiwei
     

07 Aug, 2020

1 commit


28 Jul, 2020

1 commit

  • - Add a hook to apply the vendor's performance tuning for the
    owner of the rwsem.

    - Add a hook for the rwsem waiter list to allow vendors to
    perform waiting-queue enhancements.

    - ANDROID_VENDOR_DATA added to rw_semaphore

    Bug: 161400830

    Signed-off-by: JianMin Liu
    Change-Id: I007a5e26f3db2adaeaf4e5ccea414ce7abfa83b8

    JianMin Liu
     

17 Jul, 2020

1 commit

  • A leading comma prevents arbitrary reordering of initialisation clauses.
    The whole point of C99 designated initialisation is to allow any such reordering.
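
    A small illustration (not taken from the patch) of the difference:

    struct point { int x, y, z; };

    /* Trailing commas: every clause stands alone, so clauses can be added,
     * removed or reordered without touching neighbouring lines. */
    struct point a = {
            .z = 3,
            .x = 1,
            .y = 2,
    };

    /* Leading commas tie each clause to the previous line, defeating that. */
    struct point b = {
            .x = 1
            , .y = 2
            , .z = 3
    };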

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/20200711145954.GA1178171@localhost.localdomain

    Alexey Dobriyan
     

21 Mar, 2020

1 commit

  • Extend lockdep to validate lock wait-type context.

    The current wait-types are:

    LD_WAIT_FREE, /* wait free, rcu etc.. */
    LD_WAIT_SPIN, /* spin loops, raw_spinlock_t etc.. */
    LD_WAIT_CONFIG, /* CONFIG_PREEMPT_LOCK, spinlock_t etc.. */
    LD_WAIT_SLEEP, /* sleeping locks, mutex_t etc.. */

    Where lockdep validates that the current lock (the one being acquired)
    fits in the current wait-context (as generated by the held stack).

    This ensures that there is no attempt to acquire mutexes while holding
    spinlocks, to acquire spinlocks while holding raw_spinlocks and so on. In
    other words, it's a fancier might_sleep().

    Obviously RCU made the entire ordeal more complex than a simple single
    value test because RCU can be acquired in (pretty much) any context and,
    while it presents a context to nested locks, it is not the same as the
    context it got acquired in.

    Therefore it's necessary to split the wait_type into two values, one
    representing the acquire (outer) and one representing the nested context
    (inner). For most 'normal' locks these two are the same.

    [ To make static initialization easier we have the rule that:
    .outer == INV means .outer == .inner; because INV == 0. ]

    It further means that it's required to find the minimal .inner of the held
    stack to compare against the outer of the new lock; because while 'normal'
    RCU presents a CONFIG type to nested locks, if it is taken while already
    holding a SPIN type it obviously doesn't relax the rules.

    Below is an example output generated by the trivial test code:

    raw_spin_lock(&foo);
    spin_lock(&bar);
    spin_unlock(&bar);
    raw_spin_unlock(&foo);

    [ BUG: Invalid wait context ]
    -----------------------------
    swapper/0/1 is trying to lock:
    ffffc90000013f20 (&bar){....}-{3:3}, at: kernel_init+0xdb/0x187
    other info that might help us debug this:
    1 lock held by swapper/0/1:
    #0: ffffc90000013ee0 (&foo){+.+.}-{2:2}, at: kernel_init+0xd1/0x187

    The way to read it is to look at the new -{n:m} part in the lock
    description; -{3:3} for the attempted lock, and try and match that up to
    the held locks, which in this case is the one: -{2:2}.

    This tells us that the acquiring lock requires a more relaxed environment than
    presented by the lock stack.

    Currently only the normal locks and RCU are converted; the rest of the
    lockdep users default to .inner = INV, which is ignored. More conversions
    can be done when desired.

    The check for spinlock_t nesting is not enabled by default. It's a separate
    config option for now as there are known problems which are currently being
    addressed. The config option allows these problems to be identified and the
    solutions to be verified.

    The config switch will be removed and the checks will be permanently enabled
    once the vast majority of issues have been addressed.

    [ bigeasy: Move LD_WAIT_FREE,… out of CONFIG_LOCKDEP to avoid compile
    failure with CONFIG_DEBUG_SPINLOCK + !CONFIG_LOCKDEP]
    [ tglx: Add the config option ]

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Sebastian Andrzej Siewior
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/20200321113242.427089655@linutronix.de

    Peter Zijlstra
     

11 Feb, 2020

1 commit

  • Remove the now unused RWSEM_OWNER_UNKNOWN hack. This hack breaks
    PREEMPT_RT and getting rid of it was the entire motivation for
    re-writing the percpu rwsem.

    The biggest problem is that it is fundamentally incompatible with any
    form of Priority Inheritance, any exclusively held lock must have a
    distinct owner.

    Requested-by: Christoph Hellwig
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Ingo Molnar
    Reviewed-by: Davidlohr Bueso
    Acked-by: Will Deacon
    Acked-by: Waiman Long
    Tested-by: Juri Lelli
    Link: https://lkml.kernel.org/r/20200204092228.GP14946@hirez.programming.kicks-ass.net

    Peter Zijlstra
     

06 Aug, 2019

1 commit

  • Currently, the rwsem is the only locking primitive that lacks this
    debug feature. Add it under CONFIG_DEBUG_RWSEMS and do the magic
    checking in the locking fastpath (trylock) operation such that
    we cover all cases. The unlocking part is pretty straightforward.

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Waiman Long
    Cc: mingo@kernel.org
    Cc: Davidlohr Bueso
    Link: https://lkml.kernel.org/r/20190729044735.9632-1-dave@stgolabs.net

    Davidlohr Bueso
     

15 Jul, 2019

1 commit

  • Convert the locking documents to ReST and add them to the
    kernel development book where they belong.

    Most of the stuff here is just to make Sphinx properly
    parse the text files, as they're already in good shape,
    not requiring massive changes in order to be parsed.

    The conversion is actually:
    - add blank lines and indentation in order to identify paragraphs;
    - fix table markups;
    - add some lists markups;
    - mark literal blocks;
    - adjust title markups.

    At its new index.rst, let's add a :orphan: while this is not linked to
    the main index.rst file, in order to avoid build warnings.

    Signed-off-by: Mauro Carvalho Chehab
    Acked-by: Federico Vaga

    Mauro Carvalho Chehab
     

17 Jun, 2019

3 commits

  • The rwsem->owner contains not just the task structure pointer, it also
    holds some flags for storing the current state of the rwsem. Some of
    the flags may have to be atomically updated. To reflect the new reality,
    the owner is now changed to an atomic_long_t type.

    New helper functions are added to properly separate out the task
    structure pointer and the embedded flags.
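
    A sketch of the kind of helper this introduces (names and the flag mask
    here are illustrative; the real ones live in kernel/locking/rwsem.c): the
    low bits of the atomic owner word carry state, the rest is the task pointer.

    #include <linux/atomic.h>
    #include <linux/sched.h>

    #define MY_RWSEM_OWNER_FLAGS_MASK       (3UL)   /* illustrative low-bit flags */

    static inline struct task_struct *owner_task(atomic_long_t *owner)
    {
            return (struct task_struct *)
                    (atomic_long_read(owner) & ~MY_RWSEM_OWNER_FLAGS_MASK);
    }

    static inline unsigned long owner_flags(atomic_long_t *owner)
    {
            return atomic_long_read(owner) & MY_RWSEM_OWNER_FLAGS_MASK;
    }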

    Suggested-by: Peter Zijlstra
    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Cc: huang ying
    Link: https://lkml.kernel.org/r/20190520205918.22251-14-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • Bit 1 of sem->owner (RWSEM_ANONYMOUSLY_OWNED) is used to designate an
    anonymous owner - readers or an anonymous writer. The setting of this
    anonymous bit is used as an indicator that optimistic spinning cannot
    be done on this rwsem.

    With the upcoming reader optimistic spinning patches, a reader-owned
    rwsem can be spinned on for a limit period of time. We still need
    this bit to indicate a rwsem is nonspinnable, but not setting this
    bit loses its meaning that the owner is known. So rename the bit
    to RWSEM_NONSPINNABLE to clarify its meaning.

    This patch also fixes a DEBUG_RWSEMS_WARN_ON() bug in __up_write().

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Cc: huang ying
    Link: https://lkml.kernel.org/r/20190520205918.22251-12-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • The owner field in the rw_semaphore structure is used primarily for
    optimistic spinning. However, identifying the rwsem owner can also be
    helpful in debugging as well as in tracing locking-related issues when
    analyzing crash dumps. The owner field may also store state information
    that can be important to the operation of the rwsem.

    So the owner field is now made a permanent member of the rw_semaphore
    structure irrespective of CONFIG_RWSEM_SPIN_ON_OWNER.

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Cc: huang ying
    Link: https://lkml.kernel.org/r/20190520205918.22251-2-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     

10 Apr, 2019

2 commits

  • For an uncontended rwsem, count and owner are the only fields a task
    needs to touch when acquiring the rwsem. So they are put next to each
    other to increase the chance that they will share the same cacheline.

    On a ThunderX2 99xx (arm64) system with 32K L1 cache and 256K L2
    cache, a rwsem locking microbenchmark with one locking thread was
    run to write-lock and write-unlock an array of rwsems separated 2
    cachelines apart in a 1M byte memory block. The locking rates (kops/s)
    of the microbenchmark when the rwsems are at various "long" (8-byte)
    offsets from beginning of the cacheline before and after the patch were
    as follows:

    Cacheline Offset    Pre-patch    Post-patch
    ----------------    ---------    ----------
           0             17,449       16,588
           1             17,450       16,465
           2             17,450       16,460
           3             17,453       16,462
           4             14,867       16,471
           5             14,867       16,470
           6             14,853       16,464
           7             14,867       13,172

    Before the patch, the count and owner are 4 "long"s apart. After the
    patch, they are only 1 "long" apart.
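
    A rough sketch of the resulting field order (abridged; remaining members
    and config guards as in include/linux/rwsem.h at the time):

    struct rw_semaphore {
            atomic_long_t           count;  /* fast-path atomic op target  */
            struct task_struct      *owner; /* moved up, adjacent to count */
            /* osq, wait_lock, wait_list and debug fields follow */
    };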

    The rwsem data have to be loaded from the L3 cache for each access. It
    can be seen that the locking rates are more consistent after the patch
    than before. Note that for this particular system, the performance
    drop happens whenever the count and owner are an odd multiple of
    "long"s apart. No performance drop was observed when only a single rwsem
    was used (hot cache). So the drop is more likely an idiosyncrasy of the
    cache architecture of this chip than an inherent problem with the patch.

    Suggested-by: Linus Torvalds
    Signed-off-by: Waiman Long
    Acked-by: Peter Zijlstra
    Cc: Andrew Morton
    Cc: Arnd Bergmann
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Link: http://lkml.kernel.org/r/20190404174320.22416-12-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • We don't need to expose rwsem internal functions which are not supposed
    to be called directly from other kernel code.

    Signed-off-by: Waiman Long
    Acked-by: Peter Zijlstra
    Acked-by: Will Deacon
    Acked-by: Davidlohr Bueso
    Cc: Andrew Morton
    Cc: Arnd Bergmann
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Link: http://lkml.kernel.org/r/20190404174320.22416-4-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     

03 Apr, 2019

2 commits

  • Currently, we have two different implementations of rwsem:

    1) CONFIG_RWSEM_GENERIC_SPINLOCK (rwsem-spinlock.c)
    2) CONFIG_RWSEM_XCHGADD_ALGORITHM (rwsem-xadd.c)

    As we are going to use a single generic implementation for rwsem-xadd.c
    and no architecture-specific code will be needed, there is no point
    in keeping two different implementations of rwsem. In most cases, the
    performance of rwsem-spinlock.c will be worse. It also doesn't get all
    the performance tuning and optimizations that had been implemented in
    rwsem-xadd.c over the years.

    For simplification, we are going to remove rwsem-spinlock.c and make all
    architectures use a single implementation of rwsem - rwsem-xadd.c.

    All references to RWSEM_GENERIC_SPINLOCK and RWSEM_XCHGADD_ALGORITHM
    in the code are removed.

    Suggested-by: Peter Zijlstra
    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Linus Torvalds
    Cc: Andrew Morton
    Cc: Arnd Bergmann
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: H. Peter Anvin
    Cc: Paul E. McKenney
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-c6x-dev@linux-c6x.org
    Cc: linux-m68k@lists.linux-m68k.org
    Cc: linux-riscv@lists.infradead.org
    Cc: linux-um@lists.infradead.org
    Cc: linux-xtensa@linux-xtensa.org
    Cc: linuxppc-dev@lists.ozlabs.org
    Cc: nios2-dev@lists.rocketboards.org
    Cc: openrisc@lists.librecores.org
    Cc: uclinux-h8-devel@lists.sourceforge.jp
    Link: https://lkml.kernel.org/r/20190322143008.21313-3-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • As the generic rwsem-xadd code is using the appropriate acquire and
    release versions of the atomic operations, the arch specific rwsem.h
    files will not be that much faster than the generic code as long as the
    atomic functions are properly implemented. So we can remove those arch
    specific rwsem.h and stop building asm/rwsem.h to reduce maintenance
    effort.

    Currently, only x86, alpha and ia64 have implemented architecture
    specific fast paths. I don't have access to alpha and ia64 systems for
    testing, but they are legacy systems that are not likely to be updated
    to the latest kernel anyway.

    By using a rwsem microbenchmark, the total locking rates on a 4-socket
    56-core 112-thread x86-64 system before and after the patch were as
    follows (mixed means equal # of read and write locks):

                      Before Patch               After Patch
    # of Threads   wlock   rlock   mixed    wlock   rlock   mixed
    ------------   -----   -----   -----    -----   -----   -----
          1        29,201  30,143  29,458   28,615  30,172  29,201
          2         6,807  13,299   1,171    7,725  15,025   1,804
          4         6,504  12,755   1,520    7,127  14,286   1,345
          8         6,762  13,412     764    6,826  13,652     726
         16         6,693  15,408     662    6,599  15,938     626
         32         6,145  15,286     496    5,549  15,487     511
         64         5,812  15,495      60    5,858  15,572      60

    There were some run-to-run variations for the multi-thread tests. For
    x86-64, using the generic C code fast path seems to be a little bit
    faster than the assembly version with low lock contention. Looking at
    the assembly version of the fast paths, there are assembly to/from C
    code wrappers that save and restore all the callee-clobbered registers
    (7 registers on x86-64). The assembly generated from the generic C
    code doesn't need to do that. That may explain the slight performance
    gain here.

    The generic asm rwsem.h can also be merged into kernel/locking/rwsem.h
    with no code change as no other code other than those under
    kernel/locking needs to access the internal rwsem macros and functions.

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Linus Torvalds
    Cc: Andrew Morton
    Cc: Arnd Bergmann
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: H. Peter Anvin
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-c6x-dev@linux-c6x.org
    Cc: linux-m68k@lists.linux-m68k.org
    Cc: linux-riscv@lists.infradead.org
    Cc: linux-um@lists.infradead.org
    Cc: linux-xtensa@linux-xtensa.org
    Cc: linuxppc-dev@lists.ozlabs.org
    Cc: nios2-dev@lists.rocketboards.org
    Cc: openrisc@lists.librecores.org
    Cc: uclinux-h8-devel@lists.sourceforge.jp
    Link: https://lkml.kernel.org/r/20190322143008.21313-2-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     

10 Sep, 2018

1 commit

  • Currently, when a reader acquires a lock, it only sets the
    RWSEM_READER_OWNED bit in the owner field. The other bits are simply
    not used. When debugging hanging cases involving rwsems and readers,
    the owner value does not provide much useful information at all.

    This patch modifies the current behavior to always store the task_struct
    pointer of the last rwsem-acquiring reader in a reader-owned rwsem. This
    may be useful in debugging rwsem hanging cases especially if only one
    reader is involved. However, the task in the owner field may not be the
    real owner or one of the real owners at all when the owner value is
    examined, for example, in a crash dump. So it is just an additional
    hint about past history.

    If CONFIG_DEBUG_RWSEMS=y is enabled, the owner field will be checked at
    unlock time too to make sure the task pointer value is valid. That does
    have a slight performance cost and so is only enabled as part of that
    debug option.

    From the performance point of view, it is expected that the changes
    shouldn't have any noticeable performance impact. A rwsem microbenchmark
    (with 48 worker threads and 1:1 reader/writer ratio) was run on a
    2-socket 24-core 48-thread Haswell system. The locking rates on a
    4.19-rc1 based kernel were as follows:

    1) Unpatched kernel: 543.3 kops/s
    2) Patched kernel: 549.2 kops/s
    3) Patched kernel (CONFIG_DEBUG_RWSEMS on): 546.6 kops/s

    There was actually a slight increase in performance (1.1%) in this
    particular case. Maybe it was caused by the elimination of a branch or
    just testing noise. Turning on the CONFIG_DEBUG_RWSEMS option also
    had less than the expected impact on performance.

    The least significant 2 bits of the owner value are now used to designate
    that the rwsem is reader-owned and that the owners are anonymous.
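
    A sketch along the lines of what the patch does (abridged): the task
    pointer and the two low flag bits are combined into the owner word when a
    reader takes the lock.

    #define RWSEM_READER_OWNED              (1UL << 0)
    #define RWSEM_ANONYMOUSLY_OWNED         (1UL << 1)

    static inline void __rwsem_set_reader_owned(struct rw_semaphore *sem,
                                                struct task_struct *owner)
    {
            unsigned long val = (unsigned long)owner | RWSEM_READER_OWNED
                                                     | RWSEM_ANONYMOUSLY_OWNED;

            WRITE_ONCE(sem->owner, (struct task_struct *)val);
    }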

    Signed-off-by: Waiman Long
    Acked-by: Peter Zijlstra
    Cc: Davidlohr Bueso
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Link: http://lkml.kernel.org/r/1536265114-10842-1-git-send-email-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     

16 May, 2018

1 commit

  • The filesystem freezing code needs to transfer ownership of a rwsem
    embedded in a percpu-rwsem from the task that does the freezing to
    another one that does the thawing by calling percpu_rwsem_release()
    after freezing and percpu_rwsem_acquire() before thawing.

    However, the new rwsem debug code runs afoul of this scheme by warning
    that the task that releases the rwsem isn't the one that acquires it,
    as reported by Amir Goldstein:

    DEBUG_LOCKS_WARN_ON(sem->owner != get_current())
    WARNING: CPU: 1 PID: 1401 at /home/amir/build/src/linux/kernel/locking/rwsem.c:133 up_write+0x59/0x79

    Call Trace:
    percpu_up_write+0x1f/0x28
    thaw_super_locked+0xdf/0x120
    do_vfs_ioctl+0x270/0x5f1
    ksys_ioctl+0x52/0x71
    __x64_sys_ioctl+0x16/0x19
    do_syscall_64+0x5d/0x167
    entry_SYSCALL_64_after_hwframe+0x49/0xbe

    To work properly with the rwsem debug code, we need to annotate that the
    rwsem ownership is unknown during the transfer period until a brave soul
    comes forward to acquire the ownership. During that period, optimistic
    spinning will be disabled.
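
    A sketch of the handoff described above (hypothetical freeze/thaw wrappers,
    not the fs/super.c code; assumes <linux/percpu-rwsem.h>):

    static void freeze_done(struct percpu_rw_semaphore *sem)
    {
            /* Lock stays write-held; debug/lockdep ownership is handed away. */
            percpu_rwsem_release(sem, false, _THIS_IP_);
    }

    static void thaw_begin(struct percpu_rw_semaphore *sem)
    {
            /* The thawing task adopts ownership before actually unlocking. */
            percpu_rwsem_acquire(sem, false, _THIS_IP_);
            percpu_up_write(sem);
    }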

    Reported-by: Amir Goldstein
    Tested-by: Amir Goldstein
    Signed-off-by: Waiman Long
    Acked-by: Peter Zijlstra
    Cc: Andrew Morton
    Cc: Davidlohr Bueso
    Cc: Jan Kara
    Cc: Linus Torvalds
    Cc: Matthew Wilcox
    Cc: Oleg Nesterov
    Cc: Paul E. McKenney
    Cc: Theodore Y. Ts'o
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Cc: linux-fsdevel@vger.kernel.org
    Link: http://lkml.kernel.org/r/1526420991-21213-3-git-send-email-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     

07 Nov, 2017

1 commit


02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boiler plate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it.
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information,

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to be applied to
    a file was done in a spreadsheet of side-by-side results from the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files created by Philippe Ombredanne. Philippe prepared the
    base worksheet, and did an initial spot review of a few 1000 files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging were:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if <5 lines).
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

10 Oct, 2017

1 commit

  • Similar to down_read() and down_write_killable(),
    add a killable version of down_read(), based on
    the __down_read_killable() function added in previous
    patches.
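
    Minimal usage sketch (illustrative only): like down_read(), except that a
    fatal signal aborts the wait and the call returns -EINTR.

    static int read_locked_op(struct rw_semaphore *sem)
    {
            if (down_read_killable(sem))
                    return -EINTR;

            /* ... read-side critical section ... */

            up_read(sem);
            return 0;
    }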

    Signed-off-by: Kirill Tkhai
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: arnd@arndb.de
    Cc: avagin@virtuozzo.com
    Cc: davem@davemloft.net
    Cc: fenghua.yu@intel.com
    Cc: gorcunov@virtuozzo.com
    Cc: heiko.carstens@de.ibm.com
    Cc: hpa@zytor.com
    Cc: ink@jurassic.park.msu.ru
    Cc: mattst88@gmail.com
    Cc: rientjes@google.com
    Cc: rth@twiddle.net
    Cc: schwidefsky@de.ibm.com
    Cc: tony.luck@intel.com
    Cc: viro@zeniv.linux.org.uk
    Link: http://lkml.kernel.org/r/150670119884.23930.2585570605960763239.stgit@localhost.localdomain
    Signed-off-by: Ingo Molnar

    Kirill Tkhai
     

10 Aug, 2017

1 commit

  • Rename rwsem_down_read_failed() to __rwsem_down_read_failed_common()
    and teach it to abort waiting when signals are pending and a killable
    state argument is passed.

    Note that we shouldn't wake anybody up in the EINTR path, as:

    We check for (waiter.task) under the spinlock before we go to the
    out_nolock path. The current task wasn't able to be woken up, so there
    is either a writer owning the sem, or a writer that is the first waiter.
    In both cases we shouldn't wake anybody. If there is a writer owning
    the sem and we were the only waiter, remove RWSEM_WAITING_BIAS,
    as there are no waiters anymore.

    Signed-off-by: Kirill Tkhai
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: arnd@arndb.de
    Cc: avagin@virtuozzo.com
    Cc: davem@davemloft.net
    Cc: fenghua.yu@intel.com
    Cc: gorcunov@virtuozzo.com
    Cc: heiko.carstens@de.ibm.com
    Cc: hpa@zytor.com
    Cc: ink@jurassic.park.msu.ru
    Cc: mattst88@gmail.com
    Cc: rth@twiddle.net
    Cc: schwidefsky@de.ibm.com
    Cc: tony.luck@intel.com
    Link: http://lkml.kernel.org/r/149789534632.9059.2901382369609922565.stgit@localhost.localdomain
    Signed-off-by: Ingo Molnar

    Kirill Tkhai
     

08 Jun, 2016

1 commit

  • Convert the rwsem count variable to an atomic_long_t since we use it
    as an atomic variable. This also allows us to remove the
    rwsem_atomic_{add,update}() "abstraction" which would now be an unnecessary
    level of indirection. In follow up patches, we also remove the
    rwsem_atomic_{add,update}() definitions across the various architectures.

    Suggested-by: Peter Zijlstra
    Signed-off-by: Jason Low
    [ Build warning fixes on various architectures. ]
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Davidlohr Bueso
    Cc: Fenghua Yu
    Cc: Heiko Carstens
    Cc: Jason Low
    Cc: Linus Torvalds
    Cc: Martin Schwidefsky
    Cc: Paul E. McKenney
    Cc: Peter Hurley
    Cc: Terry Rudd
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Tony Luck
    Cc: Waiman Long
    Link: http://lkml.kernel.org/r/1465017963-4839-2-git-send-email-jason.low2@hpe.com
    Signed-off-by: Ingo Molnar

    Jason Low
     

26 May, 2016

1 commit


22 Apr, 2016

1 commit

  • Now that all the architectures implement the necessary glue code
    we can introduce down_write_killable(). The only difference wrt. regular
    down_write() is that the slow path waits in TASK_KILLABLE state and the
    interruption by the fatal signal is reported as -EINTR to the caller.
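
    Usage sketch (illustrative only): identical to down_write() on success,
    but a fatal signal makes the slow path give up and return -EINTR.

    static int update_protected_state(struct rw_semaphore *sem)
    {
            if (down_write_killable(sem))
                    return -EINTR;

            /* ... modify data protected by sem ... */

            up_write(sem);
            return 0;
    }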

    Signed-off-by: Michal Hocko
    Cc: Andrew Morton
    Cc: Chris Zankel
    Cc: David S. Miller
    Cc: Linus Torvalds
    Cc: Max Filippov
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Signed-off-by: Davidlohr Bueso
    Cc: Signed-off-by: Jason Low
    Cc: Thomas Gleixner
    Cc: Tony Luck
    Cc: linux-alpha@vger.kernel.org
    Cc: linux-arch@vger.kernel.org
    Cc: linux-ia64@vger.kernel.org
    Cc: linux-s390@vger.kernel.org
    Cc: linux-sh@vger.kernel.org
    Cc: linux-xtensa@linux-xtensa.org
    Cc: sparclinux@vger.kernel.org
    Link: http://lkml.kernel.org/r/1460041951-22347-12-git-send-email-mhocko@kernel.org
    Signed-off-by: Ingo Molnar

    Michal Hocko
     

13 Apr, 2016

1 commit

  • Introduce a generic implementation necessary for down_write_killable().

    This is a trivial extension of the already existing down_write() call
    which can be interrupted by SIGKILL. This patch doesn't provide
    down_write_killable() yet because arches have to provide the necessary
    pieces before.

    rwsem_down_write_failed() which is a generic slow path for the
    write lock is extended to take a task state and renamed to
    __rwsem_down_write_failed_common(). The return value is either a valid
    semaphore pointer or ERR_PTR(-EINTR).

    rwsem_down_write_failed_killable() is exported as a new way to wait for
    the lock and be killable.

    For the rwsem-spinlock implementation, the current __down_write() is updated
    in a similar way to __rwsem_down_write_failed_common(), except that it doesn't
    need new exports, just a visible __down_write_killable().

    Architectures which are not using the generic rwsem implementation are
    supposed to provide their __down_write_killable() implementation and
    use rwsem_down_write_failed_killable() for the slow path.

    Signed-off-by: Michal Hocko
    Cc: Andrew Morton
    Cc: Chris Zankel
    Cc: David S. Miller
    Cc: Linus Torvalds
    Cc: Max Filippov
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Signed-off-by: Davidlohr Bueso
    Cc: Signed-off-by: Jason Low
    Cc: Thomas Gleixner
    Cc: Tony Luck
    Cc: linux-alpha@vger.kernel.org
    Cc: linux-arch@vger.kernel.org
    Cc: linux-ia64@vger.kernel.org
    Cc: linux-s390@vger.kernel.org
    Cc: linux-sh@vger.kernel.org
    Cc: linux-xtensa@linux-xtensa.org
    Cc: sparclinux@vger.kernel.org
    Link: http://lkml.kernel.org/r/1460041951-22347-7-git-send-email-mhocko@kernel.org
    Signed-off-by: Ingo Molnar

    Michal Hocko
     

13 Aug, 2014

1 commit

  • Specifically:
    Documentation/locking/lockdep-design.txt
    Documentation/locking/lockstat.txt
    Documentation/locking/mutex-design.txt
    Documentation/locking/rt-mutex-design.txt
    Documentation/locking/rt-mutex.txt
    Documentation/locking/spinlocks.txt
    Documentation/locking/ww-mutex-design.txt

    Signed-off-by: Davidlohr Bueso
    Acked-by: Randy Dunlap
    Signed-off-by: Peter Zijlstra
    Cc: jason.low2@hp.com
    Cc: aswin@hp.com
    Cc: Alexei Starovoitov
    Cc: Al Viro
    Cc: Andrew Morton
    Cc: Chris Mason
    Cc: Dan Streetman
    Cc: David Airlie
    Cc: Davidlohr Bueso
    Cc: David S. Miller
    Cc: Greg Kroah-Hartman
    Cc: Heiko Carstens
    Cc: Jason Low
    Cc: Josef Bacik
    Cc: Kees Cook
    Cc: Linus Torvalds
    Cc: Lubomir Rintel
    Cc: Masanari Iida
    Cc: Paul E. McKenney
    Cc: Randy Dunlap
    Cc: Tim Chen
    Cc: Vineet Gupta
    Cc: fengguang.wu@intel.com
    Link: http://lkml.kernel.org/r/1406752916-3341-6-git-send-email-davidlohr@hp.com
    Signed-off-by: Ingo Molnar

    Davidlohr Bueso
     

16 Jul, 2014

5 commits

  • Just like with mutexes (CONFIG_MUTEX_SPIN_ON_OWNER),
    encapsulate the dependencies for rwsem optimistic spinning.
    No logical changes here as it continues to depend on both
    SMP and the XADD algorithm variant.

    Signed-off-by: Davidlohr Bueso
    Acked-by: Jason Low
    [ Also make it depend on ARCH_SUPPORTS_ATOMIC_RMW. ]
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1405112406-13052-2-git-send-email-davidlohr@hp.com
    Cc: aswin@hp.com
    Cc: Chris Mason
    Cc: Davidlohr Bueso
    Cc: Josef Bacik
    Cc: Linus Torvalds
    Cc: Waiman Long
    Signed-off-by: Ingo Molnar

    Davidlohr Bueso
     
  • Recent optimistic spinning additions to rwsem provide significant performance
    benefits on many workloads on large machines. The cost of it was increasing
    the size of the rwsem structure by up to 128 bits.

    However, now that the previous patches in this series bring the overhead of
    struct optimistic_spin_queue to 32 bits, this patch reorders some fields in
    struct rw_semaphore such that we can reduce the overhead of the rwsem structure
    by 64 bits (on 64 bit systems).

    The extra overhead required for rwsem optimistic spinning would now be up
    to 8 additional bytes instead of up to 16 bytes. Additionally, the size of
    rwsem would now be more in line with mutexes.

    Signed-off-by: Jason Low
    Signed-off-by: Peter Zijlstra
    Cc: Scott Norton
    Cc: "Paul E. McKenney"
    Cc: Dave Chinner
    Cc: Waiman Long
    Cc: Davidlohr Bueso
    Cc: Rik van Riel
    Cc: Andrew Morton
    Cc: "H. Peter Anvin"
    Cc: Steven Rostedt
    Cc: Tim Chen
    Cc: Konrad Rzeszutek Wilk
    Cc: Aswin Chandramouleeswaran
    Cc: Linus Torvalds
    Cc: Chris Mason
    Cc: Josef Bacik
    Link: http://lkml.kernel.org/r/1405358872-3732-6-git-send-email-jason.low2@hp.com
    Signed-off-by: Ingo Molnar

    Jason Low
     
  • Currently, we initialize the osq lock by directly setting the lock's values. It
    would be preferable to use an init macro to do the initialization like we do
    with other locks.

    This patch introduces and uses a macro and function for initializing the osq lock.
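
    The shape of the initialisers this refers to (abridged sketch along the
    lines of include/linux/osq_lock.h): a static initialiser macro plus a
    runtime init helper, mirroring how other locks are initialised.

    #define OSQ_UNLOCKED_VAL        (0)
    #define OSQ_LOCK_UNLOCKED       { ATOMIC_INIT(OSQ_UNLOCKED_VAL) }

    static inline void osq_lock_init(struct optimistic_spin_queue *lock)
    {
            atomic_set(&lock->tail, OSQ_UNLOCKED_VAL);
    }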

    Signed-off-by: Jason Low
    Signed-off-by: Peter Zijlstra
    Cc: Scott Norton
    Cc: "Paul E. McKenney"
    Cc: Dave Chinner
    Cc: Waiman Long
    Cc: Davidlohr Bueso
    Cc: Rik van Riel
    Cc: Andrew Morton
    Cc: "H. Peter Anvin"
    Cc: Steven Rostedt
    Cc: Tim Chen
    Cc: Konrad Rzeszutek Wilk
    Cc: Aswin Chandramouleeswaran
    Cc: Linus Torvalds
    Cc: Chris Mason
    Cc: Josef Bacik
    Link: http://lkml.kernel.org/r/1405358872-3732-4-git-send-email-jason.low2@hp.com
    Signed-off-by: Ingo Molnar

    Jason Low
     
  • The cancellable MCS spinlock is currently used to queue threads that are
    doing optimistic spinning. It uses per-cpu nodes, where a thread obtaining
    the lock would access and queue the local node corresponding to the CPU that
    it's running on. Currently, the cancellable MCS lock is implemented by using
    pointers to these nodes.

    In this patch, instead of operating on pointers to the per-cpu nodes, we
    store the CPU numbers to which the per-cpu nodes correspond in an atomic_t.
    A similar concept is used with the qspinlock.

    By operating on the CPU # of the nodes using atomic_t instead of pointers
    to those nodes, this can reduce the overhead of the cancellable MCS spinlock
    by 32 bits (on 64 bit systems).
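
    An abridged sketch of the new handle and the encoding it relies on: the
    queue tail is stored as "CPU number + 1" in a 32-bit atomic_t, with 0
    reserved to mean "queue empty", instead of a 64-bit node pointer.

    struct optimistic_spin_queue {
            atomic_t tail;  /* encoded CPU # of the tail node, 0 == empty */
    };

    static inline int encode_cpu(int cpu_nr)
    {
            return cpu_nr + 1;      /* keep 0 free for the "unlocked" value */
    }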

    Signed-off-by: Jason Low
    Signed-off-by: Peter Zijlstra
    Cc: Scott Norton
    Cc: "Paul E. McKenney"
    Cc: Dave Chinner
    Cc: Waiman Long
    Cc: Davidlohr Bueso
    Cc: Rik van Riel
    Cc: Andrew Morton
    Cc: "H. Peter Anvin"
    Cc: Steven Rostedt
    Cc: Tim Chen
    Cc: Konrad Rzeszutek Wilk
    Cc: Aswin Chandramouleeswaran
    Cc: Linus Torvalds
    Cc: Chris Mason
    Cc: Heiko Carstens
    Cc: Josef Bacik
    Link: http://lkml.kernel.org/r/1405358872-3732-3-git-send-email-jason.low2@hp.com
    Signed-off-by: Ingo Molnar

    Jason Low
     
  • Currently, the per-cpu nodes structure for the cancellable MCS spinlock is
    named "optimistic_spin_queue". However, in a follow up patch in the series
    we will be introducing a new structure that serves as the new "handle" for
    the lock. It would make more sense if that structure is named
    "optimistic_spin_queue". Additionally, since the current use of the
    "optimistic_spin_queue" structure are "nodes", it might be better if we
    rename them to "node" anyway.

    This preparatory patch renames all current "optimistic_spin_queue"
    to "optimistic_spin_node".

    Signed-off-by: Jason Low
    Signed-off-by: Peter Zijlstra
    Cc: Scott Norton
    Cc: "Paul E. McKenney"
    Cc: Dave Chinner
    Cc: Waiman Long
    Cc: Davidlohr Bueso
    Cc: Rik van Riel
    Cc: Andrew Morton
    Cc: "H. Peter Anvin"
    Cc: Steven Rostedt
    Cc: Tim Chen
    Cc: Konrad Rzeszutek Wilk
    Cc: Aswin Chandramouleeswaran
    Cc: Linus Torvalds
    Cc: Chris Mason
    Cc: Heiko Carstens
    Cc: Josef Bacik
    Link: http://lkml.kernel.org/r/1405358872-3732-2-git-send-email-jason.low2@hp.com
    Signed-off-by: Ingo Molnar

    Jason Low
     

05 Jun, 2014

2 commits

  • Optimistic spinning is only used by the xadd variant
    of rw-semaphores. Make sure that we use the old version
    of the __RWSEM_INITIALIZER macro for systems that rely
    on the spinlock one, otherwise warnings can be triggered,
    such as the following reported on an arm box:

    ipc/ipcns_notifier.c:22:8: warning: excess elements in struct initializer [enabled by default]
    ipc/ipcns_notifier.c:22:8: warning: (near initialization for 'ipcns_chain.rwsem') [enabled by default]
    ipc/ipcns_notifier.c:22:8: warning: excess elements in struct initializer [enabled by default]
    ipc/ipcns_notifier.c:22:8: warning: (near initialization for 'ipcns_chain.rwsem') [enabled by default]

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Peter Zijlstra
    Cc: Tim Chen
    Cc: Linus Torvalds
    Cc: Paul McKenney
    Cc: Michel Lespinasse
    Cc: Peter Hurley
    Cc: Alex Shi
    Cc: Rik van Riel
    Cc: Andrew Morton
    Cc: Andrea Arcangeli
    Cc: "H. Peter Anvin"
    Cc: Jason Low
    Cc: Andi Kleen
    Cc: Chris Mason
    Cc: Josef Bacik
    Link: http://lkml.kernel.org/r/1400545677.6399.10.camel@buesod1.americas.hpqcorp.net
    Signed-off-by: Ingo Molnar

    Davidlohr Bueso
     
  • We have reached the point where our mutexes are quite fine tuned
    for a number of situations. This includes the use of heuristics
    and optimistic spinning, based on MCS locking techniques.

    Exclusive ownership of read-write semaphores is, conceptually,
    just about the same as with mutexes, making them close cousins. To
    this end we need to make them both perform similarly, and
    right now, rwsems are simply not up to it. This was discovered
    by both reverting commit 4fc3f1d6 (mm/rmap, migration: Make
    rmap_walk_anon() and try_to_unmap_anon() more scalable) and
    similarly, converting some other mutexes (ie: i_mmap_mutex) to
    rwsems. This creates a situation where users have to choose
    between a rwsem and mutex taking into account this important
    performance difference. Specifically, the biggest difference between
    both locks is when we fail to acquire a mutex in the fastpath,
    optimistic spinning comes in to play and we can avoid a large
    amount of unnecessary sleeping and overhead of moving tasks in
    and out of wait queue. Rwsems do not have such logic.

    This patch, based on the work from Tim Chen and I, adds support
    for write-side optimistic spinning when the lock is contended.
    It also includes support for the recently added cancelable MCS
    locking for adaptive spinning. Note that this is only applicable
    to the xadd method, and the spinlock rwsem variant remains intact.

    Allowing optimistic spinning before putting the writer on the wait
    queue reduces wait queue contention and provides a greater chance
    for the rwsem to get acquired. With these changes, rwsem is on par
    with mutex. The performance benefits can be seen on a number of
    workloads. For instance, on a 8 socket, 80 core 64bit Westmere box,
    aim7 shows the following improvements in throughput:

    +--------------+---------------------+-----------------+
    | Workload     | throughput-increase | number of users |
    +--------------+---------------------+-----------------+
    | alltests     | 20%                 | >1000           |
    | custom       | 27%, 60%            | 10-100, >1000   |
    | high_systime | 36%, 30%            | >100, >1000     |
    | shared       | 58%, 29%            | 10-100, >1000   |
    +--------------+---------------------+-----------------+

    There was also improvement on smaller systems, such as a quad-core
    x86-64 laptop running a 30Gb PostgreSQL (pgbench) workload for up
    to +60% in throughput for over 50 clients. Additionally, benefits
    were also noticed in exim (mail server) workloads. Furthermore, no
    performance regressions have been seen at all.

    Based-on-work-from: Tim Chen
    Signed-off-by: Davidlohr Bueso
    [peterz: rej fixup due to comment patches, sched/rt.h header]
    Signed-off-by: Peter Zijlstra
    Cc: Alex Shi
    Cc: Andi Kleen
    Cc: Michel Lespinasse
    Cc: Rik van Riel
    Cc: Peter Hurley
    Cc: "Paul E.McKenney"
    Cc: Jason Low
    Cc: Aswin Chandramouleeswaran
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: "Scott J Norton"
    Cc: Andrea Arcangeli
    Cc: Chris Mason
    Cc: Josef Bacik
    Link: http://lkml.kernel.org/r/1399055055.6275.15.camel@buesod1.americas.hpqcorp.net
    Signed-off-by: Ingo Molnar

    Davidlohr Bueso
     

29 Jan, 2014

1 commit

  • Btrfs needs a simple way to know if it needs to let go of its read lock on a
    rwsem. Introduce rwsem_is_contended to check whether there are any waiters on
    this rwsem currently. This is just a heuristic; it is meant to be light and not
    100% accurate, and to be called by somebody already holding on to the rwsem for
    either read or write. Thanks,
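
    A sketch of the intended usage pattern (hypothetical helper, not the btrfs
    code): a long-running read-side holder voluntarily drops and re-takes the
    lock when others are queued behind it.

    static void yield_rwsem_if_contended(struct rw_semaphore *sem)
    {
            if (rwsem_is_contended(sem)) {
                    up_read(sem);
                    cond_resched();
                    down_read(sem);
            }
    }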

    Signed-off-by: Josef Bacik
    Signed-off-by: Chris Mason
    Acked-by: Ingo Molnar

    Josef Bacik
     

24 Mar, 2013

1 commit

  • This reverts commit 11b80f459adaf91a712f95e7734a17655a36bf30.

    Bcache needs rw semaphores for cache coherency in writeback mode -
    writes have to take a read lock on a per cache device rw sem, and
    release it when the bio completes.

    But since this is for bios it's naturally not in the context of the
    process that originally took the lock.

    Signed-off-by: Kent Overstreet
    CC: Christoph Hellwig
    CC: David Howells

    Kent Overstreet
     

17 Jan, 2013

1 commit

  • Commit 1b963c81b145 ("lockdep, rwsem: provide down_write_nest_lock()")
    contains a bug in a codepath when CONFIG_DEBUG_LOCK_ALLOC is disabled,
    which causes down_read() to be called instead of down_write() by mistake
    on such configurations. Fix that.

    Reported-and-tested-by: Andrew Clayton
    Reported-and-tested-by: Zlatko Calusic
    Signed-off-by: Jiri Kosina
    Reviewed-by: Rik van Riel
    Signed-off-by: Linus Torvalds

    Jiri Kosina
     

12 Jan, 2013

1 commit

  • down_write_nest_lock() provides a means to annotate locking scenario
    where an outer lock is guaranteed to serialize the order nested locks
    are being acquired.

    This is analogous to the already existing mutex_lock_nest_lock() and
    spin_lock_nest_lock().
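
    A hedged sketch with hypothetical locks: 'outer' is always held while the
    per-object 'inner' rwsems are taken, so lockdep can be told that the outer
    lock serialises the nested acquisitions instead of flagging them.

    static void lock_inner(struct rw_semaphore *inner, struct rw_semaphore *outer)
    {
            lockdep_assert_held(outer);
            down_write_nest_lock(inner, outer);
    }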

    Signed-off-by: Jiri Kosina
    Cc: Rik van Riel
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Mel Gorman
    Tested-by: Sedat Dilek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jiri Kosina