24 Mar, 2020

1 commit

  • The warning was intended to spot complete_all() users from hardirq
    context on PREEMPT_RT. The warning as-is will also trigger in interrupt
    handlers, which are threaded on PREEMPT_RT, which was not intended.

    Use lockdep_assert_RT_in_threaded_ctx() which triggers in non-preemptive
    context on PREEMPT_RT.

    Fixes: a5c6234e1028 ("completion: Use simple wait queues")
    Reported-by: kernel test robot
    Suggested-by: Peter Zijlstra
    Signed-off-by: Sebastian Andrzej Siewior
    Signed-off-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/20200323152019.4qjwluldohuh3by5@linutronix.de

    Sebastian Siewior
     

21 Mar, 2020

1 commit

  • A completion uses a wait_queue_head_t to enqueue waiters.

    wait_queue_head_t contains a spinlock_t to protect the list of waiters
    which excludes it from being used in truly atomic context on a PREEMPT_RT
    enabled kernel.

    The spinlock in the wait queue head cannot be replaced by a raw_spinlock
    because:

    - wait queues can have custom wakeup callbacks, which acquire other
    spinlock_t locks and have potentially long execution times

    - wake_up() walks an unbounded number of list entries during the wake up
    and may wake an unbounded number of waiters.

    For simplicity and performance reasons complete() should be usable on
    PREEMPT_RT enabled kernels.

    completions do not use custom wakeup callbacks and usually have a single
    waiter, except for a few corner cases.

    Replace the wait queue in the completion with a simple wait queue (swait),
    which uses a raw_spinlock_t for protecting the waiter list and therefore is
    safe to use inside truly atomic regions on PREEMPT_RT.

    There is no semantic or functional change:

    - completions use the exclusive wait mode which is what swait provides

    - complete() wakes one exclusive waiter

    - complete_all() wakes all waiters while holding the lock which protects
    the wait queue against newly incoming waiters. The conversion to swait
    preserves this behaviour.

    complete_all() might cause unbounded latencies with a large number of waiters
    being woken at once, but most complete_all() usage sites are either in
    testing or initialization code or have only a really small number of
    concurrent waiters which for now does not cause a latency problem. Keep it
    simple for now.
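
    The wake-one vs. wake-all semantics above can be modeled in userspace.
    The following is a hypothetical sketch, not the kernel implementation: a
    pthread mutex stands in for the raw_spinlock_t, a condition variable for
    the swait queue, and all *_model names are invented:

```c
/* Userspace model of completion semantics (illustrative sketch only). */
#include <assert.h>
#include <limits.h>
#include <pthread.h>

struct completion_model {
    unsigned int done;          /* saturates at UINT_MAX */
    pthread_mutex_t lock;       /* models the raw_spinlock_t */
    pthread_cond_t wait;        /* models the swait queue */
};

/* complete(): wake one exclusive waiter */
static void complete_model(struct completion_model *c)
{
    pthread_mutex_lock(&c->lock);
    if (c->done != UINT_MAX)
        c->done++;
    pthread_cond_signal(&c->wait);
    pthread_mutex_unlock(&c->lock);
}

/* complete_all(): wake all waiters while holding the lock */
static void complete_all_model(struct completion_model *c)
{
    pthread_mutex_lock(&c->lock);
    c->done = UINT_MAX;
    pthread_cond_broadcast(&c->wait);
    pthread_mutex_unlock(&c->lock);
}

static void wait_for_completion_model(struct completion_model *c)
{
    pthread_mutex_lock(&c->lock);
    while (c->done == 0)
        pthread_cond_wait(&c->wait, &c->lock);
    if (c->done != UINT_MAX)
        c->done--;              /* consume one completion */
    pthread_mutex_unlock(&c->lock);
}
```

    Holding the lock across the broadcast mirrors how complete_all()
    excludes newly incoming waiters during the wake-up.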

    The fixup of the warning check in the USB gadget driver is just a
    straightforward conversion of the lockless waiter check from one waitqueue
    type to the other.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Greg Kroah-Hartman
    Reviewed-by: Davidlohr Bueso
    Reviewed-by: Joel Fernandes (Google)
    Acked-by: Linus Torvalds
    Link: https://lkml.kernel.org/r/20200321113242.317954042@linutronix.de

    Thomas Gleixner
     

17 Jul, 2018

1 commit

  • Both the implementation and the users' expectations [1] for the various
    wakeup primitives have evolved over time, but the documentation has not
    kept up with these changes: bring it into 2018.

    [1] http://lkml.kernel.org/r/20180424091510.GB4064@hirez.programming.kicks-ass.net

    Also applied feedback from Alan Stern.

    Suggested-by: Peter Zijlstra
    Signed-off-by: Andrea Parri
    Signed-off-by: Paul E. McKenney
    Acked-by: Peter Zijlstra (Intel)
    Cc: Akira Yokosawa
    Cc: Alan Stern
    Cc: Boqun Feng
    Cc: Daniel Lustig
    Cc: David Howells
    Cc: Jade Alglave
    Cc: Jonathan Corbet
    Cc: Linus Torvalds
    Cc: Luc Maranget
    Cc: Nicholas Piggin
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Cc: linux-arch@vger.kernel.org
    Cc: parri.andrea@gmail.com
    Link: http://lkml.kernel.org/r/20180716180605.16115-12-paulmck@linux.vnet.ibm.com
    Signed-off-by: Ingo Molnar

    Andrea Parri
     

09 Mar, 2018

1 commit


04 Mar, 2018

1 commit

  • Do the following cleanups and simplifications:

    - sched/sched.h already includes , so no need to
    include it in sched/core.c again.

    - order the headers alphabetically

    - add all headers to kernel/sched/sched.h

    - remove all unnecessary includes from the .c files that
    are already included in kernel/sched/sched.h.

    Finally, make all scheduler .c files use a single common header:

    #include "sched.h"

    ... which now contains a union of the relied upon headers.

    This makes the various .c files easier to read and easier to handle.

    Cc: Linus Torvalds
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

09 Jan, 2018

1 commit

  • There are two cross-release leftover facilities:

    - the crossrelease_hist_*() irq-tracing callbacks (NOPs currently)
    - the complete_release_commit() callback (NOP as well)

    Remove them.

    Cc: David Sterba
    Cc: Byungchul Park
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boiler plate text.
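
    For a GPL-2.0 C source file, such an identifier is a single comment line
    at the top of the file, e.g. (example form per the SPDX convention):

```c
// SPDX-License-Identifier: GPL-2.0
```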

    This patch is based on work done by Thomas Gleixner, Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it,
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information.

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to apply to a
    file was done in a spreadsheet of side-by-side results from the output of
    two independent scanners (ScanCode & Windriver) producing SPDX tag:value
    files, created by Philippe Ombredanne. Philippe prepared the base
    worksheet and did an initial spot review of a few thousand files.

    The 4.13 kernel was the starting point of the analysis, with 60,537 files
    assessed. Kate Stewart did a file-by-file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to apply to each file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    The criteria used to select files for SPDX license identifier tagging were:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

05 Sep, 2017

2 commits

  • Pull locking updates from Ingo Molnar:

    - Add 'cross-release' support to lockdep, which allows APIs like
    completions, where it's not the 'owner' who releases the lock, to be
    tracked. It's all activated automatically under
    CONFIG_PROVE_LOCKING=y.

    - Clean up (restructure) the x86 atomics op implementation to be more
    readable, in preparation of KASAN annotations. (Dmitry Vyukov)

    - Fix static keys (Paolo Bonzini)

    - Add killable versions of down_read() et al (Kirill Tkhai)

    - Rework and fix jump_label locking (Marc Zyngier, Paolo Bonzini)

    - Rework (and fix) tlb_flush_pending() barriers (Peter Zijlstra)

    - Remove smp_mb__before_spinlock() and convert its usages, introduce
    smp_mb__after_spinlock() (Peter Zijlstra)

    * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (56 commits)
    locking/lockdep/selftests: Fix mixed read-write ABBA tests
    sched/completion: Avoid unnecessary stack allocation for COMPLETION_INITIALIZER_ONSTACK()
    acpi/nfit: Fix COMPLETION_INITIALIZER_ONSTACK() abuse
    locking/pvqspinlock: Relax cmpxchg's to improve performance on some architectures
    smp: Avoid using two cache lines for struct call_single_data
    locking/lockdep: Untangle xhlock history save/restore from task independence
    locking/refcounts, x86/asm: Disable CONFIG_ARCH_HAS_REFCOUNT for the time being
    futex: Remove duplicated code and fix undefined behaviour
    Documentation/locking/atomic: Finish the document...
    locking/lockdep: Fix workqueue crossrelease annotation
    workqueue/lockdep: 'Fix' flush_work() annotation
    locking/lockdep/selftests: Add mixed read-write ABBA tests
    mm, locking/barriers: Clarify tlb_flush_pending() barriers
    locking/lockdep: Make CONFIG_LOCKDEP_CROSSRELEASE and CONFIG_LOCKDEP_COMPLETIONS truly non-interactive
    locking/lockdep: Explicitly initialize wq_barrier::done::map
    locking/lockdep: Rename CONFIG_LOCKDEP_COMPLETE to CONFIG_LOCKDEP_COMPLETIONS
    locking/lockdep: Reword title of LOCKDEP_CROSSRELEASE config
    locking/lockdep: Make CONFIG_LOCKDEP_CROSSRELEASE part of CONFIG_PROVE_LOCKING
    locking/refcounts, x86/asm: Implement fast refcount overflow protection
    locking/lockdep: Fix the rollback and overwrite detection logic in crossrelease
    ...

    Linus Torvalds
     
  • Pull scheduler updates from Ingo Molnar:
    "The main changes in this cycle were:

    - fix affine wakeups (Peter Zijlstra)

    - improve CPU onlining (and general bootup) scalability on systems
    with ridiculous number (thousands) of CPUs (Peter Zijlstra)

    - sched/numa updates (Rik van Riel)

    - sched/deadline updates (Byungchul Park)

    - sched/cpufreq enhancements and related cleanups (Viresh Kumar)

    - sched/debug enhancements (Xie XiuQi)

    - various fixes"

    * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
    sched/debug: Optimize sched_domain sysctl generation
    sched/topology: Avoid pointless rebuild
    sched/topology, cpuset: Avoid spurious/wrong domain rebuilds
    sched/topology: Improve comments
    sched/topology: Fix memory leak in __sdt_alloc()
    sched/completion: Document that reinit_completion() must be called after complete_all()
    sched/autogroup: Fix error reporting printk text in autogroup_create()
    sched/fair: Fix wake_affine() for !NUMA_BALANCING
    sched/debug: Intruduce task_state_to_char() helper function
    sched/debug: Show task state in /proc/sched_debug
    sched/debug: Use task_pid_nr_ns in /proc/$pid/sched
    sched/core: Remove unnecessary initialization init_idle_bootup_task()
    sched/deadline: Change return value of cpudl_find()
    sched/deadline: Make find_later_rq() choose a closer CPU in topology
    sched/numa: Scale scan period with tasks in group and shared/private
    sched/numa: Slow down scan rate if shared faults dominate
    sched/pelt: Fix false running accounting
    sched: Mark pick_next_task_dl() and build_sched_domain() as static
    sched/cpupri: Don't re-initialize 'struct cpupri'
    sched/deadline: Don't re-initialize 'struct cpudl'
    ...

    Linus Torvalds
     

17 Aug, 2017

2 commits

  • There is no agreed-upon definition of spin_unlock_wait()'s semantics,
    and it appears that all callers could do just as well with a lock/unlock
    pair. This commit therefore replaces the spin_unlock_wait() call in
    completion_done() with spin_lock() followed immediately by spin_unlock().
    This should be safe from a performance perspective because the lock is
    held only briefly, and the wakeup happens really quickly.
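
    In a userspace analogue (a sketch under the assumption that a pthread
    mutex models the completion's wait-queue spinlock; this is not the kernel
    source, and the names are invented), the lock/unlock pair looks like:

```c
/* Sketch: read the completion state under a brief lock/unlock pair instead
 * of spin_unlock_wait(), so the read is ordered against a concurrent
 * completer still inside its critical section. */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct cdone_model {
    unsigned int done;
    pthread_mutex_t lock;
};

static bool cdone_completion_done(struct cdone_model *c)
{
    bool ret;

    pthread_mutex_lock(&c->lock);   /* replaces the spin_unlock_wait() */
    ret = c->done != 0;
    pthread_mutex_unlock(&c->lock);
    return ret;
}
```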

    Signed-off-by: Paul E. McKenney
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Will Deacon
    Cc: Alan Stern
    Cc: Andrea Parri
    Cc: Linus Torvalds
    Reviewed-by: Steven Rostedt (VMware)

    Paul E. McKenney
     
  • The complete_all() function sets the completion's "done" variable to
    UINT_MAX, and no other caller (wait_for_completion(), etc.) will ever
    reset it to zero. That means that after any call to complete_all(), a
    reinit_completion() is required before that completion can be used again.

    Document this fact at the complete_all() function.

    Also document that completion_done() will always return true once
    complete_all() has been called.
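
    The documented behaviour can be illustrated with a minimal
    single-threaded model (an illustrative sketch, not kernel source; the
    cmodel_* names are made up):

```c
/* Model of the 'done' counter behaviour around complete_all() and
 * reinit_completion(). */
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

struct cmodel { unsigned int done; };

/* complete_all() pins 'done' at UINT_MAX */
static void cmodel_complete_all(struct cmodel *c) { c->done = UINT_MAX; }

/* completion_done(): always true once complete_all() has run */
static bool cmodel_completion_done(const struct cmodel *c)
{
    return c->done != 0;
}

/* must be called before reusing the completion after complete_all() */
static void cmodel_reinit_completion(struct cmodel *c) { c->done = 0; }
```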

    Signed-off-by: Steven Rostedt (VMware)
    Acked-by: Linus Torvalds
    Cc: Andrew Morton
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20170816131202.195c2f4b@gandalf.local.home
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     

10 Aug, 2017

1 commit

  • Although wait_for_completion() and its family can cause deadlock, the
    lock correctness validator could not be applied to them until now,
    because things like complete() are usually called in a different context
    from the waiting context, which violates lockdep's assumption.

    Thanks to CONFIG_LOCKDEP_CROSSRELEASE, we can now apply the lockdep
    detector to those completion operations. Applied it.

    Signed-off-by: Byungchul Park
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: akpm@linux-foundation.org
    Cc: boqun.feng@gmail.com
    Cc: kernel-team@lge.com
    Cc: kirill@shutemov.name
    Cc: npiggin@gmail.com
    Cc: walken@google.com
    Cc: willy@infradead.org
    Link: http://lkml.kernel.org/r/1502089981-21272-10-git-send-email-byungchul.park@lge.com
    Signed-off-by: Ingo Molnar

    Byungchul Park
     

20 Jun, 2017

1 commit

  • Rename:

    wait_queue_t => wait_queue_entry_t

    'wait_queue_t' was always a slight misnomer: its name implies that it's a "queue",
    but in reality it's a queue *entry*. The 'real' queue is the wait queue head,
    which had to carry the name.

    Start sorting this out by renaming it to 'wait_queue_entry_t'.

    This also allows the real structure name 'struct __wait_queue' to
    lose its double underscore and become 'struct wait_queue_entry',
    which is the more canonical nomenclature for such data types.

    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

02 Mar, 2017

2 commits


14 Jan, 2017

1 commit

  • Documentation/scheduler/completion.txt says this about complete_all():

    "calls complete_all() to signal all current and future waiters."

    That doesn't strictly match the current semantics: complete_all() is
    currently equivalent to UINT_MAX/2 complete() invocations, which is
    distinctly less than 'all current and future waiters' (enumerable vs.
    innumerable), although it has worked in practice.

    However, Dmitry had a weird case where it might matter, so change
    completions to use saturation semantics for complete()/complete_all().
    Once done hits UINT_MAX (and complete_all() sets it there) it will
    never again be decremented.
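
    A sketch of the saturation rule (illustrative only; the sat_* names are
    invented, not the kernel's):

```c
/* Saturation semantics for the completion 'done' counter: once it reaches
 * UINT_MAX it is never incremented past it and never decremented again. */
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

struct sat_completion { unsigned int done; };

static void sat_complete(struct sat_completion *x)
{
    if (x->done != UINT_MAX)
        x->done++;              /* saturate instead of wrapping to 0 */
}

static void sat_complete_all(struct sat_completion *x)
{
    x->done = UINT_MAX;         /* signals all current and future waiters */
}

/* consume one completion; UINT_MAX is sticky and never decremented */
static bool sat_try_wait(struct sat_completion *x)
{
    if (x->done == 0)
        return false;
    if (x->done != UINT_MAX)
        x->done--;
    return true;
}
```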

    Requested-by: Dmitry Torokhov
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: der.herr@hofr.at
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

18 Feb, 2015

1 commit

  • Commit de30ec47302c ("Remove unnecessary ->wait.lock serialization when
    reading completion state") was not correct: without the lock/unlock pair,
    code like this in stop_machine_from_inactive_cpu():

        while (!completion_done())
            cpu_relax();

    can return before complete() finishes its spin_unlock(), which writes to
    this memory. The same applies to spin_unlock_wait().

    While at it, change try_wait_for_completion() to use READ_ONCE().

    Reported-by: Paul E. McKenney
    Reported-by: Davidlohr Bueso
    Tested-by: Paul E. McKenney
    Signed-off-by: Oleg Nesterov
    Signed-off-by: Peter Zijlstra (Intel)
    [ Added a comment with the barrier. ]
    Cc: Linus Torvalds
    Cc: Nicholas Mc Guire
    Cc: raghavendra.kt@linux.vnet.ibm.com
    Cc: waiman.long@hp.com
    Fixes: de30ec47302c ("sched/completion: Remove unnecessary ->wait.lock serialization when reading completion state")
    Link: http://lkml.kernel.org/r/20150212195913.GA30430@redhat.com
    Signed-off-by: Ingo Molnar

    Oleg Nesterov
     

04 Feb, 2015

2 commits


16 Nov, 2014

1 commit

  • As discussed in [1], the IO-accounting completion APIs are meant for
    block IO (blkio) only. Document that so driver authors won't use them for
    device IO.

    [1] http://thread.gmane.org/gmane.linux.drivers.i2c/20470

    Signed-off-by: Wolfram Sang
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: One Thousand Gnomes
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1415098901-2768-1-git-send-email-wsa@the-dreams.de
    Signed-off-by: Ingo Molnar

    Wolfram Sang
     

06 Nov, 2013

1 commit

  • Completions already have their own header file: linux/completion.h
    Move the implementation out of kernel/sched/core.c and into its own
    file: kernel/sched/completion.c.

    Signed-off-by: Peter Zijlstra
    Cc: Linus Torvalds
    Cc: Andrew Morton
    Link: http://lkml.kernel.org/n/tip-x2y49rmxu5dljt66ai2lcfuw@git.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra