19 May, 2020

1 commit

  • Since there are already a number of sites (ARM64, PowerPC) that effectively
    nest nmi_enter(), make the primitive support this before adding even more.

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Thomas Gleixner
    Acked-by: Marc Zyngier
    Acked-by: Will Deacon
    Cc: Michael Ellerman
    Link: https://lkml.kernel.org/r/20200505134100.864179229@linutronix.de

    Peter Zijlstra
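
    A minimal sketch of the nesting idea, assuming simplified nmi_enter()/
    nmi_exit() macros (the real ones also handle arch hooks, lockdep, ftrace,
    printk and RCU bookkeeping):

        /* Simplified, illustrative versions of the now-nestable primitives. */
        #define nmi_enter()                                             \
        do {                                                            \
                BUG_ON(in_nmi() == NMI_MASK);   /* only overflow is a bug */ \
                __preempt_count_add(NMI_OFFSET + HARDIRQ_OFFSET);      \
        } while (0)

        #define nmi_exit()                                              \
        do {                                                            \
                WARN_ON_ONCE(!in_nmi());                                \
                __preempt_count_sub(NMI_OFFSET + HARDIRQ_OFFSET);      \
        } while (0)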
     

21 Feb, 2020

1 commit

  • Code which solely needs to prevent migration of a task uses
    preempt_disable()/enable() pairs. This is the only reliable way to do so
    as setting the task affinity to a single CPU can be undone by a
    setaffinity operation from a different task/process.

    RT provides a separate migrate_disable/enable() mechanism which does not
    disable preemption to achieve the semantic requirements of an (almost)
    fully preemptible kernel.

    As it is unclear from looking at a given code path whether the intention is
    to disable preemption or migration, introduce migrate_disable/enable()
    inline functions which can be used to annotate code which merely needs to
    disable migration. Map them to preempt_disable/enable() for now. The RT
    substitution will be provided later.

    Code which is annotated that way documents that it has no requirement to
    protect against reentrancy of a preempting task. Either this is not
    required at all or the call sites are already serialized by other means.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Juri Lelli
    Cc: Vincent Guittot
    Cc: Dietmar Eggemann
    Cc: Steven Rostedt
    Cc: Ben Segall
    Cc: Mel Gorman
    Cc: Sebastian Andrzej Siewior
    Link: https://lore.kernel.org/r/878slclv1u.fsf@nanos.tec.linutronix.de

    Thomas Gleixner
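
    A sketch of the temporary mapping the message describes (the later RT
    implementation replaces these bodies):

        static inline void migrate_disable(void)
        {
                preempt_disable();
        }

        static inline void migrate_enable(void)
        {
                preempt_enable();
        }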
     

01 Aug, 2019

1 commit

  • CONFIG_PREEMPTION is selected by CONFIG_PREEMPT and by
    CONFIG_PREEMPT_RT. Both PREEMPT and PREEMPT_RT require the same
    functionality which today depends on CONFIG_PREEMPT.

    Switch the preemption code, scheduler and init task over to use
    CONFIG_PREEMPTION.

    That's the first step towards RT in that area. The more complex changes are
    coming separately.

    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Masami Hiramatsu
    Cc: Paolo Bonzini
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    Link: http://lkml.kernel.org/r/20190726212124.117528401@linutronix.de
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
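
    An illustrative example of the switch as it looks in include/linux/preempt.h
    (exact hunks may differ):

        /* was: #ifdef CONFIG_PREEMPT */
        #ifdef CONFIG_PREEMPTION
        #define preempt_enable() \
        do { \
                barrier(); \
                if (unlikely(preempt_count_dec_and_test())) \
                        __preempt_schedule(); \
        } while (0)
        #endif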
     

07 Dec, 2018

1 commit

  • PREEMPT_NEED_RESCHED is never used directly, so move it into the arch
    code where it can potentially be implemented using either a different
    bit in the preempt count or as an entirely separate entity.

    Cc: Robert Love
    Cc: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Martin Schwidefsky
    Acked-by: Peter Zijlstra (Intel)
    Signed-off-by: Will Deacon

    Will Deacon
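
    As an example of what this move permits, arm64 later keeps the flag in a
    separate 32-bit half of a 64-bit per-thread word instead of in a preempt
    count bit (rough sketch of the little-endian layout; treat details as
    illustrative):

        union {
                u64     preempt_count;  /* 0 => preemptible, <0 => bug */
                struct {
                        u32     count;          /* the actual preempt count */
                        u32     need_resched;   /* folded resched flag */
                } preempt;
        };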
     

31 Jul, 2018

1 commit

  • This patch detaches the preemptirq tracepoints from the tracers and
    keeps them separate.

    Advantages:
    * Lockdep and irqsoff event can now run in parallel since they no longer
    have their own calls.

    * This unifies the usecase of adding hooks to an irqsoff and irqson
    event, and a preemptoff and preempton event.
    3 users of the events exist:
    - Lockdep
    - irqsoff and preemptoff tracers
    - irqs and preempt trace events

    The unification cleans up several ifdefs and makes the code in the preempt
    and irqsoff tracers simpler. It gets rid of all the horrific ifdeffery
    around PROVE_LOCKING and makes configuration of the different users of the
    tracepoints easier to understand. It also gets rid of the time_* function
    calls from the lockdep hooks that were used to call into the preemptirq
    tracer and are no longer needed. The negative delta in lines of code in
    this patch is quite large too.

    In the patch we introduce a new CONFIG option PREEMPTIRQ_TRACEPOINTS as a
    single point for registering probes onto the tracepoints. With this, the
    web of config options for preempt/irq toggle tracepoints and their users
    becomes:

     PREEMPT_TRACER   PREEMPTIRQ_EVENTS   IRQSOFF_TRACER   PROVE_LOCKING
           |                  |    \            |               |
           \    (selects)     /     \           \   (selects)   /
            TRACE_PREEMPT_TOGGLE     ----> TRACE_IRQFLAGS
                        \                      /
                         \    (depends on)    /
                          PREEMPTIRQ_TRACEPOINTS

    Other than the performance tests mentioned in the previous patch, I also
    ran the locking API test suite. I verified that all test cases are
    passing.

    I also injected issues by not registering the lockdep probes onto the
    tracepoints, and the resulting failures confirm that the probes are
    indeed working.

    This series + lockdep probes not registered (just to inject errors):
    [ 0.000000] hard-irqs-on + irq-safe-A/21: ok | ok | ok |
    [ 0.000000] soft-irqs-on + irq-safe-A/21: ok | ok | ok |
    [ 0.000000] sirq-safe-A => hirqs-on/12:FAILED|FAILED| ok |
    [ 0.000000] sirq-safe-A => hirqs-on/21:FAILED|FAILED| ok |
    [ 0.000000] hard-safe-A + irqs-on/12:FAILED|FAILED| ok |
    [ 0.000000] soft-safe-A + irqs-on/12:FAILED|FAILED| ok |
    [ 0.000000] hard-safe-A + irqs-on/21:FAILED|FAILED| ok |
    [ 0.000000] soft-safe-A + irqs-on/21:FAILED|FAILED| ok |
    [ 0.000000] hard-safe-A + unsafe-B #1/123: ok | ok | ok |
    [ 0.000000] soft-safe-A + unsafe-B #1/123: ok | ok | ok |

    With this series + lockdep probes registered, all locking tests pass:

    [ 0.000000] hard-irqs-on + irq-safe-A/21: ok | ok | ok |
    [ 0.000000] soft-irqs-on + irq-safe-A/21: ok | ok | ok |
    [ 0.000000] sirq-safe-A => hirqs-on/12: ok | ok | ok |
    [ 0.000000] sirq-safe-A => hirqs-on/21: ok | ok | ok |
    [ 0.000000] hard-safe-A + irqs-on/12: ok | ok | ok |
    [ 0.000000] soft-safe-A + irqs-on/12: ok | ok | ok |
    [ 0.000000] hard-safe-A + irqs-on/21: ok | ok | ok |
    [ 0.000000] soft-safe-A + irqs-on/21: ok | ok | ok |
    [ 0.000000] hard-safe-A + unsafe-B #1/123: ok | ok | ok |
    [ 0.000000] soft-safe-A + unsafe-B #1/123: ok | ok | ok |

    Link: http://lkml.kernel.org/r/20180730222423.196630-4-joel@joelfernandes.org

    Acked-by: Peter Zijlstra (Intel)
    Reviewed-by: Namhyung Kim
    Signed-off-by: Joel Fernandes (Google)
    Signed-off-by: Steven Rostedt (VMware)

    Joel Fernandes (Google)
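
    A sketch of how a user such as lockdep can attach to the new tracepoints;
    the probe prototype follows the irq_disable/irq_enable tracepoints'
    (ip, parent_ip) arguments, and the probe body is a hypothetical
    placeholder:

        #include <trace/events/preemptirq.h>

        static void probe_irq_disable(void *unused, unsigned long ip,
                                      unsigned long parent_ip)
        {
                /* react to interrupts being disabled at 'ip' */
        }

        static int __init my_probes_init(void)
        {
                return register_trace_irq_disable(probe_irq_disable, NULL);
        }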
     

02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boilerplate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it,
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information.

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier should be applied
    to a file was done in a spreadsheet of side-by-side results from the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files, created by Philippe Ombredanne. Philippe prepared the
    base worksheet and did an initial spot review of a few thousand files.

    The 4.13 kernel was the starting point of the analysis, with 60,537 files
    assessed. Kate Stewart did a file-by-file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    should be applied to each file. She confirmed any determination that was
    not immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging were:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
      lines of source.
    - File already had some variant of a license header in it (even if <5
      lines).
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

03 Mar, 2017

1 commit


06 Dec, 2016

1 commit

  • I recently encountered wreckage because access_ok() was used where it
    should not be; add an explicit WARN when access_ok() is used wrongly.

    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andy Lutomirski
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
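
    The preempt.h side of this is an in_task() helper; the x86 access_ok()
    warning then builds on it. A sketch:

        /* include/linux/preempt.h */
        #define in_task()       (!(preempt_count() & \
                                   (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET)))

        /* x86 uaccess: access_ok() now starts with something like this */
        #define WARN_ON_IN_IRQ()        WARN_ON_ONCE(!in_task())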
     

06 Oct, 2015

3 commits

  • It's unused; kill the definition.

    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Thomas Gleixner
    Reviewed-by: Steven Rostedt
    Reviewed-by: Frederic Weisbecker
    Cc: Linus Torvalds
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Since we stopped setting PREEMPT_ACTIVE, there is no need to mask it
    out of preempt_count() tests.

    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Thomas Gleixner
    Reviewed-by: Steven Rostedt
    Reviewed-by: Frederic Weisbecker
    Cc: Linus Torvalds
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
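
    An example of the kind of simplification this allows (illustrative):

        /* before */
        #define in_atomic_preempt_off() \
                ((preempt_count() & ~PREEMPT_ACTIVE) != PREEMPT_DISABLE_OFFSET)

        /* after */
        #define in_atomic_preempt_off() (preempt_count() != PREEMPT_DISABLE_OFFSET)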
     
  • Now that nothing tests for PREEMPT_ACTIVE anymore, stop setting it.

    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Thomas Gleixner
    Reviewed-by: Steven Rostedt
    Reviewed-by: Frederic Weisbecker
    Cc: Linus Torvalds
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

03 Aug, 2015

1 commit

  • These functions check should_resched() before unlocking a spinlock or
    re-enabling BH: preempt_count() is always non-zero at that point, so
    should_resched() always returns false. cond_resched_lock() worked only
    when spin_needbreak() was set.

    This patch adds argument "preempt_offset" to should_resched().

    preempt_count offset constants for that:

    PREEMPT_DISABLE_OFFSET - offset after preempt_disable()
    PREEMPT_LOCK_OFFSET - offset after spin_lock()
    SOFTIRQ_DISABLE_OFFSET - offset after local_bh_disable()
    SOFTIRQ_LOCK_OFFSET - offset after spin_lock_bh()

    Signed-off-by: Konstantin Khlebnikov
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Alexander Graf
    Cc: Boris Ostrovsky
    Cc: David Vrabel
    Cc: Linus Torvalds
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Fixes: bdb438065890 ("sched: Extract the basic add/sub preempt_count modifiers")
    Link: http://lkml.kernel.org/r/20150715095204.12246.98268.stgit@buzz
    Signed-off-by: Ingo Molnar

    Konstantin Khlebnikov
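
    A sketch of the reworked helper (asm-generic flavour); callers such as
    cond_resched_lock() can then ask with PREEMPT_LOCK_OFFSET still held:

        static __always_inline bool should_resched(int preempt_offset)
        {
                return unlikely(preempt_count() == preempt_offset &&
                                tif_need_resched());
        }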
     

04 Jul, 2015

1 commit

  • Commit 1cde2930e154 ("sched/preempt: Add static_key() to preempt_notifiers")
    had two problems. First, with the addition of the static_key the
    preempt-notifier API needs to sleep; however, we need to hold off
    preemption while modifying the preempt notifier list, otherwise a
    preemption could observe an inconsistent list state. KVM correctly
    registers and unregisters preempt notifiers with preemption disabled,
    so the sleep caused dmesg splats.

    Second, KVM registers and unregisters preemption notifiers very often
    (in vcpu_load/vcpu_put). With a single uniprocessor guest the static key
    would move between 0 and 1 continuously, hitting the slow path on every
    userspace exit.

    To fix this, wrap the static_key inc/dec in a new API, and call it from
    KVM.

    Fixes: 1cde2930e154 ("sched/preempt: Add static_key() to preempt_notifiers")
    Reported-by: Pontus Fuchs
    Reported-by: Takashi Iwai
    Tested-by: Takashi Iwai
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Paolo Bonzini

    Peter Zijlstra
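
    A sketch of the wrapper API; KVM bumps the key once instead of on every
    vcpu_load()/vcpu_put():

        static struct static_key preempt_notifier_key = STATIC_KEY_INIT_FALSE;

        void preempt_notifier_inc(void)
        {
                static_key_slow_inc(&preempt_notifier_key);
        }

        void preempt_notifier_dec(void)
        {
                static_key_slow_dec(&preempt_notifier_key);
        }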
     

07 Jun, 2015

2 commits

  • preempt.h has two separate "#ifdef CONFIG_PREEMPT" sections: one to
    define preempt_enable() and another to define preempt_enable_notrace().

    Let's gather both.

    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Fengguang Wu
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1433432349-1021-4-git-send-email-fweisbec@gmail.com
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     
  • preempt_schedule_context() is a tracing-safe preemption point, but it's
    only used when CONFIG_CONTEXT_TRACKING=y. Other configs have tracing
    recursion issues since commit:

    b30f0e3ffedf ("sched/preempt: Optimize preemption operations on __schedule() callers")

    introduced function-based preempt_count_*() ops.

    Let's make it available on all configs and give it a more appropriate
    name for its new position.

    Reported-by: Fengguang Wu
    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1433432349-1021-3-git-send-email-fweisbec@gmail.com
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
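
    In mainline the function ended up named preempt_schedule_notrace(); a
    sketch of the notrace enable path that relies on it:

        #define preempt_enable_notrace() \
        do { \
                barrier(); \
                if (unlikely(__preempt_count_dec_and_test())) \
                        __preempt_schedule_notrace(); \
        } while (0)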
     

19 May, 2015

6 commits

  • Now that PREEMPT_ACTIVE implies PREEMPT_DISABLE_OFFSET, excluding
    PREEMPT_ACTIVE from the in_atomic() check isn't useful anymore.

    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1431441711-29753-7-git-send-email-fweisbec@gmail.com
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     
  • Signed-off-by: Frederic Weisbecker
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1431441711-29753-6-git-send-email-fweisbec@gmail.com
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     
  • __schedule() disables preemption and some of its callers
    (the preempt_schedule*() family) also set PREEMPT_ACTIVE.

    So we have two preempt_count() modifications that could be performed
    at once.

    Let's remove the preemption disablement from __schedule() and pull
    this responsibility to its callers in order to optimize preempt_count()
    operations in a single place.

    Suggested-by: Linus Torvalds
    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1431441711-29753-5-git-send-email-fweisbec@gmail.com
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
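
    A sketch of the combined operation the callers now perform (helper names
    as in mainline; treat the exact form as illustrative):

        static inline void preempt_active_enter(void)
        {
                __preempt_count_add(PREEMPT_ACTIVE + PREEMPT_DISABLE_OFFSET);
                barrier();
        }

        static inline void preempt_active_exit(void)
        {
                barrier();
                __preempt_count_sub(PREEMPT_ACTIVE + PREEMPT_DISABLE_OFFSET);
        }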
     
  • "CHECK" suggests it's only used as a comparison mask. But now it's used
    further as a config-conditional preempt disabler offset. Lets
    disambiguate this name.

    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1431441711-29753-4-git-send-email-fweisbec@gmail.com
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
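
    A rough sketch of the renamed, config-conditional offset (the exact
    definition may differ):

        #ifdef CONFIG_PREEMPT_COUNT
        # define PREEMPT_DISABLE_OFFSET PREEMPT_OFFSET
        #else
        # define PREEMPT_DISABLE_OFFSET 0
        #endif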
     
  • Adjust a few comments, and further integrate a few definitions after
    the dumb headers copy.

    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1431441711-29753-3-git-send-email-fweisbec@gmail.com
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     
  • preempt_mask.h defines all the preempt_count semantics and related
    symbols: preempt, softirq, hardirq, nmi, preempt active, need resched,
    etc...

    preempt.h defines the accessors and mutators of preempt_count.

    But there is a messy dependency game around those two header files:

    * preempt_mask.h includes preempt.h in order to access preempt_count()

    * preempt_mask.h defines all preempt_count semantics and symbols except
    PREEMPT_NEED_RESCHED, which is needed by asm/preempt.h. Thus we need to
    define it from preempt.h, right before including asm/preempt.h, instead
    of defining it in preempt_mask.h with the other preempt_count symbols.
    As a result, the preempt_count semantics end up spread out.

    * We plan to introduce preempt_active_[enter,exit]() to consolidate
    preempt_schedule*() code. But we'll need to access both the preempt_count
    mutators (preempt_count_add()) and the preempt_count symbols
    (PREEMPT_ACTIVE, PREEMPT_OFFSET). The usual place to define preempt
    operations is in preempt.h, but then we'd need symbols in
    preempt_mask.h, which already includes preempt.h. So we end up with
    a circular dependency.

    Let's merge preempt_mask.h into preempt.h to solve these dependency issues.
    This way we gather the semantic symbols and the operation definitions of
    preempt_count in a single file.

    This is a dumb copy-paste merge. Further merge rearrangements are
    performed in a subsequent patch to ease review.

    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1431441711-29753-2-git-send-email-fweisbec@gmail.com
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     

23 Jan, 2014

1 commit

  • The #ifdef CONFIG_PREEMPT is both not needed and wrong.

    It's not required because asm/preempt.h should provide
    {set,clear}_preempt_need_resched() regardless, and it's wrong because
    for voluntary preempt we still rely on PREEMPT_NEED_RESCHED.

    Reported-and-Tested-by: Markus Trippelsdorf
    Fixes: 8cb75e0c4ec9 ("sched/preempt: Fix up missed PREEMPT_NEED_RESCHED folding")
    Signed-off-by: Peter Zijlstra
    Cc: Dipankar Sarma
    Cc: "Paul E. McKenney"
    Link: http://lkml.kernel.org/r/20140122102435.GH31570@twins.programming.kicks-ass.net
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
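
    A sketch of the result: the fold helper is defined unconditionally and
    simply relies on asm/preempt.h providing the flag accessors:

        static __always_inline void preempt_fold_need_resched(void)
        {
                if (tif_need_resched())
                        set_preempt_need_resched();
        }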
     

14 Jan, 2014

1 commit

  • With various drivers wanting to inject idle time, we get people
    calling idle routines outside of the idle loop proper.

    Therefore we need to be extra careful about not missing
    TIF_NEED_RESCHED -> PREEMPT_NEED_RESCHED propagations.

    While looking at this, I also realized there's a small window in the
    existing idle loop where we can miss TIF_NEED_RESCHED; when it hits
    right after the tif_need_resched() test at the end of the loop but
    right before the need_resched() test at the start of the loop.

    So move preempt_fold_need_resched() out of the loop where we're
    guaranteed to have TIF_NEED_RESCHED set.

    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/n/tip-x9jgh45oeayzajz2mjt0y7d6@git.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

13 Jan, 2014

1 commit

  • Discourage drivers/modules from being creative with preemption.

    Sadly it is all implemented in macros and inlines, so if they want to do
    evil they still can, but at least try to discourage some of it.

    Reviewed-by: Thomas Gleixner
    Signed-off-by: Peter Zijlstra
    Cc: Eliezer Tamir
    Cc: rui.zhang@intel.com
    Cc: jacob.jun.pan@linux.intel.com
    Cc: Mike Galbraith
    Cc: hpa@zytor.com
    Cc: Rusty Russell
    Cc: Arjan van de Ven
    Cc: lenb@kernel.org
    Cc: rjw@rjwysocki.net
    Cc: Linus Torvalds
    Cc: Andrew Morton
    Link: http://lkml.kernel.org/n/tip-fn7h6vu8wtgxk0ih402qcijx@git.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
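
    A sketch of what the discouragement looks like in preempt.h (the tempting
    macros are simply #undef'd for modules):

        #ifdef MODULE
        /*
         * Modules have no business playing preemption tricks.
         */
        #undef sched_preempt_enable_no_resched
        #undef preempt_enable_no_resched
        #undef preempt_enable_no_resched_notrace
        #undef preempt_check_resched
        #endif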
     

25 Sep, 2013

5 commits

  • Remove the bloat of the C calling convention from the preempt_enable()
    sites by creating an ASM wrapper which allows us to do an
    asm("call ___preempt_schedule") instead.

    calling.h bits by Andi Kleen

    Suggested-by: Linus Torvalds
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/n/tip-tk7xdi1cvvxewixzke8t8le1@git.kernel.org
    [ Fixed build error. ]
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
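
    A sketch of the x86 side: the preempt_enable() slowpath becomes a bare
    call to an asm thunk which saves and restores the caller-clobbered
    registers itself:

        #ifdef CONFIG_PREEMPT
        extern asmlinkage void ___preempt_schedule(void);
        # define __preempt_schedule() asm ("call ___preempt_schedule")
        #endif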
     
  • Rewrite the preempt_count macros in order to extract the 3 basic
    preempt_count value modifiers:

    __preempt_count_add()
    __preempt_count_sub()

    and the new:

    __preempt_count_dec_and_test()

    And since we're at it anyway, replace the unconventional
    $op_preempt_count names with the more conventional preempt_count_$op.

    Since these basic operators are equivalent to the previous _notrace()
    variants, do away with the _notrace() versions.

    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/n/tip-ewbpdbupy9xpsjhg960zwbv8@git.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
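
    A sketch of the three basic modifiers (asm-generic flavour):

        static __always_inline void __preempt_count_add(int val)
        {
                *preempt_count_ptr() += val;
        }

        static __always_inline void __preempt_count_sub(int val)
        {
                *preempt_count_ptr() -= val;
        }

        static __always_inline bool __preempt_count_dec_and_test(void)
        {
                return !--*preempt_count_ptr() && tif_need_resched();
        }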
     
  • In order to prepare to per-arch implementations of preempt_count move
    the required bits into an asm-generic header and use this for all
    archs.

    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/n/tip-h5j0c1r3e3fk015m30h8f1zx@git.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • In order to combine the preemption and need_resched test we need to
    fold the need_resched information into the preempt_count value.

    Since the NEED_RESCHED flag is set across CPUs, this needs to be an
    atomic operation; however, we very much want to avoid making
    preempt_count atomic. Therefore we keep the existing TIF_NEED_RESCHED
    infrastructure in place but test it at 3 sites and fold its value into
    preempt_count; namely:

    - resched_task() when setting TIF_NEED_RESCHED on the current task
    - scheduler_ipi() when resched_task() sets TIF_NEED_RESCHED on a
    remote task it follows it up with a reschedule IPI
    and we can modify the cpu local preempt_count from
    there.
    - cpu_idle_loop() for when resched_task() found tsk_is_polling().

    We use an inverted bitmask to indicate need_resched so that a 0 means
    both need_resched and !atomic.

    Also remove the barrier() in preempt_enable() between
    preempt_enable_no_resched() and preempt_check_resched() to avoid
    having to reload the preemption value and allow the compiler to use
    the flags of the previous decrement. I couldn't come up with any sane
    reason for this barrier() to be there as preempt_enable_no_resched()
    already has a barrier() before doing the decrement.

    Suggested-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/n/tip-7a7m5qqbn5pmwnd4wko9u6da@git.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
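
    A sketch of the x86 folding: the flag is an inverted bit of the per-CPU
    preempt count, so a single "count == 0" test means both "preemptible" and
    "reschedule needed" (per-cpu accessor names follow later mainline):

        #define PREEMPT_NEED_RESCHED    0x80000000

        static __always_inline void set_preempt_need_resched(void)
        {
                raw_cpu_and_4(__preempt_count, ~PREEMPT_NEED_RESCHED);
        }

        static __always_inline void clear_preempt_need_resched(void)
        {
                raw_cpu_or_4(__preempt_count, PREEMPT_NEED_RESCHED);
        }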
     
  • Replace the single preempt_count() 'function' that's an lvalue with
    two proper functions:

    preempt_count() - returns the preempt_count value as rvalue
    preempt_count_set() - Allows setting the preempt-count value

    Also provide preempt_count_ptr() as a convenience wrapper to implement
    all modifying operations.

    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/n/tip-orxrbycjozopqfhb4dxdkdvb@git.kernel.org
    [ Fixed build failure. ]
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
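
    A sketch of the split (asm-generic flavour, with the count living in
    thread_info):

        static __always_inline int preempt_count(void)
        {
                return current_thread_info()->preempt_count;
        }

        static __always_inline volatile int *preempt_count_ptr(void)
        {
                return &current_thread_info()->preempt_count;
        }

        static __always_inline void preempt_count_set(int pc)
        {
                *preempt_count_ptr() = pc;
        }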
     

19 Jun, 2013

1 commit

  • Dave Jones hit the following bug report:

    ===============================
    [ INFO: suspicious RCU usage. ]
    3.10.0-rc2+ #1 Not tainted
    -------------------------------
    include/linux/rcupdate.h:771 rcu_read_lock() used illegally while idle!
    other info that might help us debug this:
    RCU used illegally from idle CPU! rcu_scheduler_active = 1, debug_locks = 0
    RCU used illegally from extended quiescent state!
    2 locks held by cc1/63645:
    #0: (&rq->lock){-.-.-.}, at: [] __schedule+0xed/0x9b0
    #1: (rcu_read_lock){.+.+..}, at: [] cpuacct_charge+0x5/0x1f0

    CPU: 1 PID: 63645 Comm: cc1 Not tainted 3.10.0-rc2+ #1 [loadavg: 40.57 27.55 13.39 25/277 64369]
    Hardware name: Gigabyte Technology Co., Ltd. GA-MA78GM-S2H/GA-MA78GM-S2H, BIOS F12a 04/23/2010
    0000000000000000 ffff88010f78fcf8 ffffffff816ae383 ffff88010f78fd28
    ffffffff810b698d ffff88011c092548 000000000023d073 ffff88011c092500
    0000000000000001 ffff88010f78fd60 ffffffff8109d7c5 ffffffff8109d645
    Call Trace:
    [] dump_stack+0x19/0x1b
    [] lockdep_rcu_suspicious+0xfd/0x130
    [] cpuacct_charge+0x185/0x1f0
    [] ? cpuacct_charge+0x5/0x1f0
    [] update_curr+0xec/0x240
    [] put_prev_task_fair+0x228/0x480
    [] __schedule+0x161/0x9b0
    [] preempt_schedule+0x51/0x80
    [] ? __cond_resched_softirq+0x60/0x60
    [] ? retint_careful+0x12/0x2e
    [] ftrace_ops_control_func+0x1dc/0x210
    [] ftrace_call+0x5/0x2f
    [] ? retint_careful+0xb/0x2e
    [] ? schedule_user+0x5/0x70
    [] ? schedule_user+0x5/0x70
    [] ? retint_careful+0x12/0x2e
    ------------[ cut here ]------------

    What happened was that the function tracer traced the schedule_user() code
    that tells RCU that the system is coming back from userspace, and to
    add the CPU back to the RCU monitoring.

    Because the function tracer does preempt_disable/enable_notrace() calls,
    preempt_enable_notrace() checks the NEED_RESCHED flag. If it is set,
    preempt_schedule() is called. But this happens before the user_exit()
    function can inform the kernel that the CPU is no longer in user mode and
    needs to be accounted for by RCU.

    The fix is to create a new preempt_schedule_context() that checks if
    the kernel is still in user mode and, if so, switches it to kernel mode
    before calling schedule. It also switches back to user mode when coming
    back from schedule, if need be.

    The only user of this currently is the preempt_enable_notrace(), which is
    only used by the tracing subsystem.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1369423420.6828.226.camel@gandalf.local.home
    Signed-off-by: Ingo Molnar

    Steven Rostedt
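
    A simplified sketch of the new entry point (the real function also guards
    the context-tracking calls themselves against recursive tracing):

        asmlinkage void __sched notrace preempt_schedule_context(void)
        {
                enum ctx_state prev_ctx;

                if (likely(!preemptible()))
                        return;

                prev_ctx = exception_enter();   /* leave user/idle mode for RCU */
                preempt_schedule();
                exception_exit(prev_ctx);       /* restore the previous state */
        }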
     

10 Apr, 2013

1 commit

  • In UP and non-preempt respectively, the spinlocks and preemption
    disable/enable points are stubbed out entirely, because there is no
    regular code that can ever hit the kind of concurrency they are meant to
    protect against.

    However, while there is no regular code that can cause scheduling, we
    _do_ end up having some exceptional (literally!) code that can do so,
    and that we need to make sure does not ever get moved into the critical
    region by the compiler.

    In particular, get_user() and put_user() are generally implemented as
    inline asm statements (even if the inline asm may then make a call
    instruction to call out-of-line), and can obviously cause a page fault
    and IO as a result. If that inline asm has been scheduled into the
    middle of a preemption-safe (or spinlock-protected) code region, we
    obviously lose.

    Now, admittedly this is *very* unlikely to actually ever happen, and
    we've not seen examples of actual bugs related to this. But partly
    exactly because it's so hard to trigger and the resulting bug is so
    subtle, we should be extra careful to get this right.

    So make sure that even when preemption is disabled, and we don't have to
    generate any actual *code* to explicitly tell the system that we are in
    a preemption-disabled region, we need to at least tell the compiler not
    to move things around the critical region.

    This patch grew out of the same discussion that caused commits
    79e5f05edcbf ("ARC: Add implicit compiler barrier to raw_local_irq*
    functions") and 3e2e0d2c222b ("tile: comment assumption about
    __insn_mtspr for <asm/irqflags.h>") to come about.

    Note for stable: use discretion when/if applying this. As mentioned,
    this bug may never have actually bitten anybody, and gcc may never have
    done the required code motion for it to possibly ever trigger in
    practice.

    Cc: stable@vger.kernel.org
    Cc: Steven Rostedt
    Cc: Peter Zijlstra
    Signed-off-by: Linus Torvalds

    Linus Torvalds
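
    A sketch of the change: the "empty" stubs become compiler barriers so the
    compiler cannot move accesses across the critical region:

        /* !CONFIG_PREEMPT_COUNT variants, previously empty do { } while (0) */
        #define preempt_disable()                       barrier()
        #define sched_preempt_enable_no_resched()       barrier()
        #define preempt_enable_no_resched()             barrier()
        #define preempt_enable()                        barrier()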
     

01 Mar, 2012

1 commit

  • Create a distinction between scheduler-related preempt_enable_no_resched()
    calls and the nearly one hundred other places in the kernel that do not
    want to reschedule, for one reason or another.

    This distinction matters for -rt, where the scheduler and the non-scheduler
    preempt models (and checks) are different. For upstream it's purely
    documentational.

    Signed-off-by: Thomas Gleixner
    Link: http://lkml.kernel.org/n/tip-gs88fvx2mdv5psnzxnv575ke@git.kernel.org
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
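
    A sketch of the distinction: the scheduler-internal name keeps the
    behaviour, and the old name becomes an alias that documents intent:

        #define sched_preempt_enable_no_resched() \
        do { \
                barrier(); \
                dec_preempt_count(); \
        } while (0)

        #define preempt_enable_no_resched()     sched_preempt_enable_no_resched()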
     

10 Jun, 2011

1 commit

  • Create a new CONFIG_PREEMPT_COUNT that handles the inc/dec
    of the preempt count offset independently, so that the offset
    can be updated by preempt_disable() and preempt_enable()
    even without CONFIG_PREEMPT being set.

    This prepares for making CONFIG_DEBUG_SPINLOCK_SLEEP work
    with !CONFIG_PREEMPT, where it currently doesn't detect
    code that sleeps inside explicit preemption-disabled
    sections.

    Signed-off-by: Frederic Weisbecker
    Acked-by: Paul E. McKenney
    Cc: Ingo Molnar
    Cc: Peter Zijlstra

    Frederic Weisbecker
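
    A sketch of the guard change: the count-maintaining variants now key off
    CONFIG_PREEMPT_COUNT rather than CONFIG_PREEMPT:

        #ifdef CONFIG_PREEMPT_COUNT     /* was: #ifdef CONFIG_PREEMPT */
        #define preempt_disable() \
        do { \
                inc_preempt_count(); \
                barrier(); \
        } while (0)
        #else
        #define preempt_disable()       do { } while (0)
        #endif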
     

02 Dec, 2009

1 commit

  • 498657a478c60be092208422fefa9c7b248729c2 incorrectly assumed
    that preempt wasn't disabled around context_switch() and thus
    was fixing an imaginary problem. It also broke KVM because it
    depended on ->sched_in() being called with irqs enabled so that
    it could do SMP calls from there.

    Revert the incorrect commit and add a comment describing the different
    contexts under which the two callbacks are invoked.

    Avi: spotted transposed in/out in the added comment.

    Signed-off-by: Tejun Heo
    Acked-by: Avi Kivity
    Cc: peterz@infradead.org
    Cc: efault@gmx.de
    Cc: rusty@rustcorp.com.au
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Tejun Heo
     

24 May, 2008

2 commits

  • Add preempt off timings. A lot of kernel core code is taken from the RT patch
    latency trace that was written by Ingo Molnar.

    This adds "preemptoff" and "preemptirqsoff" to /debugfs/tracing/available_tracers

    Now instead of just tracing irqs off, preemption off can be selected
    to be recorded.

    When this is selected, it shares the same files as the irqs-off timings.
    One can trace preemption off, irqs off, or the time when either of the
    two is disabled.

    By echoing "preemptoff" into /debugfs/tracing/current_tracer, recording
    of preempt off only is performed. "irqsoff" will only record the time
    irqs are disabled, but "preemptirqsoff" will take the total time irqs
    or preemption are disabled. Runtime switching of these options is now
    supported by simply echoing the appropriate tracer name into
    /debugfs/tracing/current_tracer.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Steven Rostedt
     
  • The tracer may need to call preempt_enable and disable functions
    for time keeping and such. The trace gets ugly when we see these
    functions show up for all traces. To make the output cleaner
    this patch adds preempt_enable_notrace and preempt_disable_notrace
    to be used by tracer (and debugging) functions.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Steven Rostedt
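
    A condensed sketch of the untraced variants added here; the _notrace
    count helpers bypass the preempt tracer hooks:

        #define preempt_disable_notrace() \
        do { \
                inc_preempt_count_notrace(); \
                barrier(); \
        } while (0)

        #define preempt_enable_notrace() \
        do { \
                barrier(); \
                dec_preempt_count_notrace(); \
                preempt_check_resched(); \
        } while (0)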
     

09 Feb, 2008

1 commit


26 Jul, 2007

1 commit

  • This adds a general mechanism whereby a task can request the scheduler to
    notify it whenever it is preempted or scheduled back in. This allows the
    task to swap any special-purpose registers like the fpu or Intel's VT
    registers.

    Signed-off-by: Avi Kivity
    [ mingo@elte.hu: fixes, cleanups ]
    Signed-off-by: Ingo Molnar

    Avi Kivity
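
    A sketch of the interface this introduced in include/linux/preempt.h:

        struct preempt_ops {
                void (*sched_in)(struct preempt_notifier *notifier, int cpu);
                void (*sched_out)(struct preempt_notifier *notifier,
                                  struct task_struct *next);
        };

        struct preempt_notifier {
                struct hlist_node link;
                struct preempt_ops *ops;
        };

        void preempt_notifier_register(struct preempt_notifier *notifier);
        void preempt_notifier_unregister(struct preempt_notifier *notifier);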
     

26 Apr, 2006

1 commit


23 Dec, 2005

1 commit

  • Currently a simple

    void foo(void) { preempt_enable(); }

    produces the following code on ARM:

    foo:
    bic r3, sp, #8128
    bic r3, r3, #63
    ldr r2, [r3, #4]
    ldr r1, [r3, #0]
    sub r2, r2, #1
    tst r1, #4
    str r2, [r3, #4]
    blne preempt_schedule
    mov pc, lr

    The problem is that the TIF_NEED_RESCHED flag is loaded _before_ the
    preemption count is stored back, hence any interrupt arriving within that
    3-instruction window and setting TIF_NEED_RESCHED won't be seen, and
    scheduling won't happen as it should.

    Nothing currently prevents gcc from performing that reordering. There
    is already a barrier() before the decrement of the preemption count, but
    another one is needed between this and the TIF_NEED_RESCHED flag test
    for proper code ordering.

    Signed-off-by: Nicolas Pitre
    Acked-by: Nick Piggin
    Signed-off-by: Linus Torvalds

    Nicolas Pitre
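
    A sketch of the fix: a compiler barrier between the count decrement and
    the flag test enforces the required ordering:

        #define preempt_enable() \
        do { \
                preempt_enable_no_resched(); \
                barrier();      /* order the count store vs. the flag load */ \
                preempt_check_resched(); \
        } while (0)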