06 Mar, 2019

2 commits

  • Pull perf updates from Ingo Molnar:
    "Lots of tooling updates - too many to list, here's a few highlights:

    - Various subcommand updates to 'perf trace', 'perf report', 'perf
    record', 'perf annotate', 'perf script', 'perf test', etc.

    - CPU and NUMA topology and affinity handling improvements,

    - HW tracing and HW support updates:
      - Intel PT updates
      - ARM CoreSight updates
      - vendor HW event updates

    - BPF updates

    - Tons of infrastructure updates, both on the build system and the
    library support side

    - Documentation updates.

    - ... and lots of other changes, see the changelog for details.

    Kernel side updates:

    - Tighten up kprobes blacklist handling, reduce the number of places
    where developers can install a kprobe and hang/crash the system.

    - Fix/enhance vma address filter handling.

    - Various PMU driver updates, small fixes and additions.

    - refcount_t conversions

    - BPF updates

    - error code propagation enhancements

    - misc other changes"

    * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (238 commits)
    perf script python: Add Python3 support to syscall-counts-by-pid.py
    perf script python: Add Python3 support to syscall-counts.py
    perf script python: Add Python3 support to stat-cpi.py
    perf script python: Add Python3 support to stackcollapse.py
    perf script python: Add Python3 support to sctop.py
    perf script python: Add Python3 support to powerpc-hcalls.py
    perf script python: Add Python3 support to net_dropmonitor.py
    perf script python: Add Python3 support to mem-phys-addr.py
    perf script python: Add Python3 support to failed-syscalls-by-pid.py
    perf script python: Add Python3 support to netdev-times.py
    perf tools: Add perf_exe() helper to find perf binary
    perf script: Handle missing fields with -F +..
    perf data: Add perf_data__open_dir_data function
    perf data: Add perf_data__(create_dir|close_dir) functions
    perf data: Fail check_backup in case of error
    perf data: Make check_backup work over directories
    perf tools: Add rm_rf_perf_data function
    perf tools: Add pattern name checking to rm_rf
    perf tools: Add depth checking to rm_rf
    perf data: Add global path holder
    ...

    Linus Torvalds
     
  • Pull RCU updates from Ingo Molnar:
    "The main RCU related changes in this cycle were:

    - Additional cleanups after RCU flavor consolidation

    - Grace-period forward-progress cleanups and improvements

    - Documentation updates

    - Miscellaneous fixes

    - spin_is_locked() conversions to lockdep

    - SPDX changes to RCU source and header files

    - SRCU updates

    - Torture-test updates, including nolibc updates and moving nolibc to
    tools/include"

    * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (71 commits)
    locking/locktorture: Convert to SPDX license identifier
    linux/torture: Convert to SPDX license identifier
    torture: Convert to SPDX license identifier
    linux/srcu: Convert to SPDX license identifier
    linux/rcutree: Convert to SPDX license identifier
    linux/rcutiny: Convert to SPDX license identifier
    linux/rcu_sync: Convert to SPDX license identifier
    linux/rcu_segcblist: Convert to SPDX license identifier
    linux/rcupdate: Convert to SPDX license identifier
    linux/rcu_node_tree: Convert to SPDX license identifier
    rcu/update: Convert to SPDX license identifier
    rcu/tree: Convert to SPDX license identifier
    rcu/tiny: Convert to SPDX license identifier
    rcu/sync: Convert to SPDX license identifier
    rcu/srcu: Convert to SPDX license identifier
    rcu/rcutorture: Convert to SPDX license identifier
    rcu/rcu_segcblist: Convert to SPDX license identifier
    rcu/rcuperf: Convert to SPDX license identifier
    rcu/rcu.h: Convert to SPDX license identifier
    RCU/torture.txt: Remove section MODULE PARAMETERS
    ...

    Linus Torvalds
     

13 Feb, 2019

2 commits

  • Since kprobe itself depends on RCU, probing on RCU debug
    routine can cause recursive breakpoint bugs.

    Prohibit probing on RCU debug routines.

    int3
    ->do_int3()
    ->ist_enter()
    ->RCU_LOCKDEP_WARN()
    ->debug_lockdep_rcu_enabled() -> int3

    Signed-off-by: Masami Hiramatsu
    Cc: Alexander Shishkin
    Cc: Andrea Righi
    Cc: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Linus Torvalds
    Cc: Mathieu Desnoyers
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/154998807741.31052.11229157537816341591.stgit@devbox
    Signed-off-by: Ingo Molnar

    Masami Hiramatsu
     
  • Prohibit probing on the functions called before kprobe_int3_handler()
    in do_int3(). More specifically, ftrace_int3_handler(),
    poke_int3_handler(), and ist_enter(). And since rcu_nmi_enter() is
    called by ist_enter(), it also should be marked as NOKPROBE_SYMBOL.

    Since those are handled before kprobe_int3_handler(), probing those
    functions can cause a breakpoint recursion and crash the kernel.

    Signed-off-by: Masami Hiramatsu
    Cc: Alexander Shishkin
    Cc: Andrea Righi
    Cc: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Linus Torvalds
    Cc: Mathieu Desnoyers
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/154998793571.31052.11301258949601150994.stgit@devbox
    Signed-off-by: Ingo Molnar

    Masami Hiramatsu
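
The refusal logic behind the two kprobes commits above can be illustrated with a small userspace sketch. The probe_blacklist[] table and the probe_allowed() helper are hypothetical names introduced here; the kernel marks functions with NOKPROBE_SYMBOL() and works with addresses rather than strings, so this models only the idea of rejecting a probe on a listed function.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kprobes blacklist: functions named in
 * the commits above that run before kprobe_int3_handler() or on the
 * RCU debug path reached from do_int3(). */
static const char *const probe_blacklist[] = {
	"ftrace_int3_handler",
	"poke_int3_handler",
	"ist_enter",
	"rcu_nmi_enter",
	"debug_lockdep_rcu_enabled",
};

/* Return true if a probe may be installed on the named function. */
static bool probe_allowed(const char *symbol)
{
	size_t i;
	size_t n = sizeof(probe_blacklist) / sizeof(probe_blacklist[0]);

	for (i = 0; i < n; i++) {
		if (strcmp(symbol, probe_blacklist[i]) == 0)
			return false;	/* Probing here could recurse into int3. */
	}
	return true;
}
```

With this table, probe_allowed("ist_enter") is rejected, while a name that is not listed is accepted.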
     

10 Feb, 2019

10 commits


26 Jan, 2019

26 commits

  • The ever-evolving IS_ENABLED() macro is intended for CONFIG_* Kconfig
    options, but rcuperf currently uses it for the decidedly non-CONFIG_*
    MODULE macro. In the spirit of not inviting trouble, this commit
    substitutes tried-and-true #ifdef.

    Reported-by: Ingo Molnar
    Signed-off-by: Paul E. McKenney
    Acked-by: Ingo Molnar

    Paul E. McKenney
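
A minimal sketch of the substitution described in the commit above. The BUILT_AS_MODULE macro and the built_as_module() helper are hypothetical names used only for illustration. Because MODULE is an ordinary preprocessor macro rather than a CONFIG_* Kconfig option, a plain #ifdef tests it directly:

```c
/* MODULE is defined when code is compiled as a loadable module.
 * It is not a CONFIG_* symbol, so test it with #ifdef. */
#ifdef MODULE
#define BUILT_AS_MODULE 1	/* hypothetical name: module build */
#else
#define BUILT_AS_MODULE 0	/* hypothetical name: built-in build */
#endif

/* Hypothetical helper exposing the compile-time result at run time. */
static int built_as_module(void)
{
	return BUILT_AS_MODULE;
}
```

Compiled without -DMODULE, built_as_module() returns 0; compiled with it, the function returns 1.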
     
  • Beyond a certain point in the CPU-hotplug offline process, timers get
    stranded on the outgoing CPU, and won't fire until that CPU comes back
    online, which might well be never. This commit therefore adds a hook
    in torture_onoff_init() that is invoked from torture_offline(), which
    rcutorture uses to occasionally wait for a grace period. This should
    result in failures for RCU implementations that rely on stranded timers
    eventually firing in the absence of the CPU coming back online.

    Reported-by: Sebastian Andrzej Siewior
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • This commit records grace periods in rcutorture's n_launders_hist[]
    histogram, thus allowing rcu_torture_fwd_cb_hist() to print out the
    elapsed number of grace periods between buckets. This information
    helps to determine whether a lack of forward progress is due to stalled
    grace periods on the one hand or due to sluggish callback invocation on
    the other.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • srcu_queue_delayed_work_on() disables preemption (and therefore CPU
    hotplug in RCU's case) and then checks based on its own accounting if a
    CPU is online. If the CPU is online, it uses queue_delayed_work_on();
    otherwise it falls back to queue_delayed_work().
    The problem here is that queue_work() on -RT does not work with disabled
    preemption.

    queue_work_on() also works on an offlined CPU. queue_delayed_work_on()
    has the problem that it is possible to program a timer on an offlined
    CPU. This timer will fire once the CPU is online again. But until then,
    the timer remains programmed and nothing will happen.

    Add a local timer which will fire (as requested per delay) on the local
    CPU and then enqueue the work on the specific CPU.

    RCUtorture testing with SRCU-P for 24h showed no problems.

    Signed-off-by: Sebastian Andrzej Siewior
    Signed-off-by: Paul E. McKenney

    Sebastian Andrzej Siewior
     
  • This commit updates the DYNTICK_IRQ_NONIDLE header comment to remove
    the obsolete commentary about unmatched rcu_irq_{enter,exit}().

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • This commit removes the "@irq" argument from the rcu_nmi_exit() docbook
    header, given that this function now has no arguments.

    Reported-by: kbuild test robot
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • It turns out that it is queue_delayed_work_on() rather than
    queue_work_on() that has difficulties when used concurrently with
    CPU-hotplug removal operations. It is therefore unnecessary to protect
    CPU identification and queue_work_on() with preempt_disable().

    This commit therefore removes the preempt_disable() and preempt_enable()
    from sync_rcu_exp_select_cpus(), which has the further benefit of reducing
    the number of changes that must be maintained in the -rt patchset.

    Reported-by: Thomas Gleixner
    Reported-by: Sebastian Siewior
    Suggested-by: Boqun Feng
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Although the name rcu_process_callbacks() still makes sense for Tiny
    RCU, where most of what it does is invoke callbacks, it no longer makes
    much sense for Tree RCU, especially given that the actual callback
    invocation is relegated to rcu_do_batch(), or, for no-CBs CPUs, to the
    rcuo kthreads. Especially in the latter case, rcu_process_callbacks()
    has very little to do with actual callbacks. A better description of
    this function is that it performs RCU's core processing.

    This commit therefore changes the name of Tree RCU's rcu_process_callbacks()
    function to rcu_core(), which also has the virtue of being consistent with
    the existing invoke_rcu_core() function.

    While in the area, the header comment is reworked.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The name rcu_check_callbacks() arguably made sense back in the early
    2000s when RCU was quite a bit simpler than it is today, but it has
    become quite misleading, especially with the advent of dyntick-idle
    and NO_HZ_FULL. The rcu_check_callbacks() function is RCU's hook into
    the scheduling-clock interrupt, and is now but one of many ways that
    callbacks get promoted to invocable state.

    This commit therefore changes the name to rcu_sched_clock_irq(),
    which is the same number of characters and clearly indicates this
    function's relation to the rest of the Linux kernel. In addition, for
    the sake of consistency, rcu_flavor_check_callbacks() is also renamed
    to rcu_flavor_sched_clock_irq().

    While in the area, the header comments for both functions are reworked.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • consolidate.2019.01.26a: RCU flavor consolidation cleanups.
    fwd.2019.01.26a: RCU grace-period forward-progress fixes.

    Paul E. McKenney
     
  • Currently, __note_gp_changes() checks to see if the rcu_node structure's
    ->gp_seq_needed is greater than or equal to that of the rcu_data
    structure, and if so, updates the rcu_data structure's ->gp_seq_needed
    field. This results in a useless store in the case where the two fields
    are equal.

    This commit therefore carries out this store only in the case where the
    rcu_node structure's ->gp_seq_needed is strictly greater than that of
    the rcu_data structure.

    Signed-off-by: "Zhang, Jun"
    Signed-off-by: Paul E. McKenney
    Link: https://lkml.kernel.org/r/88DC34334CA3444C85D647DBFA962C2735AD5F77@SHSMSX104.ccr.corp.intel.com

    Zhang, Jun
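
The change from "greater than or equal" to "strictly greater" in the commit above can be sketched in userspace as follows. The node_model and data_model structures and the note_gp_seq_needed() helper are hypothetical stand-ins, and the SEQ_CMP_LT() macro is written in the style of the kernel's wraparound-safe unsigned comparisons as an assumption, not copied from the patch.

```c
#include <limits.h>
#include <stdbool.h>

/* Wraparound-safe "a < b" for unsigned long sequence numbers:
 * a is behind b if (a - b) lands in the upper half of the range. */
#define SEQ_CMP_LT(a, b)	(ULONG_MAX / 2 < (a) - (b))

/* Hypothetical, pared-down stand-ins for rcu_node and rcu_data. */
struct node_model { unsigned long gp_seq_needed; };
struct data_model { unsigned long gp_seq_needed; };

/* Copy the node's value only when it is strictly ahead of the per-CPU
 * value. Returns true if a store was performed. */
static bool note_gp_seq_needed(const struct node_model *rnp,
			       struct data_model *rdp)
{
	if (SEQ_CMP_LT(rdp->gp_seq_needed, rnp->gp_seq_needed)) {
		rdp->gp_seq_needed = rnp->gp_seq_needed;
		return true;
	}
	return false;	/* Equal or behind: skip the useless store. */
}
```

When the two fields are already equal, the helper returns false without writing, which is the store the commit eliminates.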
     
  • The rcu_gp_kthread_wake() function is invoked when it might be necessary
    to wake the RCU grace-period kthread. Because self-wakeups are normally
    a useless waste of CPU cycles, if rcu_gp_kthread_wake() is invoked from
    this kthread, it naturally refuses to do the wakeup.

    Unfortunately, natural though it might be, this heuristic fails when
    rcu_gp_kthread_wake() is invoked from an interrupt or softirq handler
    that interrupted the grace-period kthread just after the final check of
    the wait-event condition but just before the schedule() call. In this
    case, a wakeup is required, even though the call to rcu_gp_kthread_wake()
    is within the RCU grace-period kthread's context. Failing to provide
    this wakeup can result in grace periods failing to start, which in turn
    results in out-of-memory conditions.

    This race window is quite narrow, but it actually did happen during real
    testing. It would of course need to be fixed even if it was strictly
    theoretical in nature.

    This patch does not Cc stable because it does not apply cleanly to
    earlier kernel versions.

    Fixes: 48a7639ce80c ("rcu: Make callers awaken grace-period kthread")
    Reported-by: "He, Bo"
    Co-developed-by: "Zhang, Jun"
    Co-developed-by: "He, Bo"
    Co-developed-by: "xiao, jin"
    Co-developed-by: Bai, Jie A
    Signed-off: "Zhang, Jun"
    Signed-off: "He, Bo"
    Signed-off: "xiao, jin"
    Signed-off: Bai, Jie A
    Signed-off-by: "Zhang, Jun"
    [ paulmck: Switch from !in_softirq() to "!in_interrupt() &&
    !in_serving_softirq()" to avoid redundant wakeups and to also handle the
    interrupt-handler scenario as well as the softirq-handler scenario that
    actually occurred in testing. ]
    Signed-off-by: Paul E. McKenney
    Link: https://lkml.kernel.org/r/CD6925E8781EFD4D8E11882D20FC406D52A11F61@SHSMSX104.ccr.corp.intel.com

    Zhang, Jun
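
The wakeup decision described in the commit above can be modeled as a small predicate. The wake_ctx structure and gp_wakeup_needed() are hypothetical names; the sketch covers only the self-wakeup test from the bracketed paulmck note and omits any other checks the kernel function performs.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the state the wakeup decision depends on. */
struct wake_ctx {
	bool caller_is_gp_kthread;	/* current is the grace-period kthread */
	bool in_interrupt;		/* models in_interrupt() */
	bool in_serving_softirq;	/* models in_serving_softirq() */
};

/* Skip the wakeup only for a genuine self-wakeup: the caller is the
 * grace-period kthread and is not inside a handler that interrupted it. */
static bool gp_wakeup_needed(const struct wake_ctx *c)
{
	if (c->caller_is_gp_kthread &&
	    !c->in_interrupt && !c->in_serving_softirq)
		return false;
	return true;
}
```

The case fixed by the commit is the one where caller_is_gp_kthread is true but a handler is running: the predicate then reports that a wakeup is still required.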
     
  • Life is hard if RCU manages to get stuck without triggering RCU CPU
    stall warnings or triggering the rcu_check_gp_start_stall() checks
    for failing to start a grace period. This commit therefore adds a
    boot-time-selectable sysrq key (commandeering "y") that allows manually
    dumping Tree RCU state. The new rcutree.sysrq_rcu kernel boot parameter
    must be set for this sysrq to be available.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The rcu_check_gp_kthread_starvation() function can be invoked without
    holding locks, so the access to the rcu_state structure's ->gp_flags
    field must be protected with READ_ONCE(). This commit therefore adds
    this protection.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
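
A simplified illustration of the lockless read described in the commit above. The read_once_ulong() helper is a stand-in for READ_ONCE() restricted to unsigned long, and state_model and gp_flags_set() are hypothetical names; the kernel's macro is more general than this sketch.

```c
/* Simplified stand-in for READ_ONCE() on an unsigned long: the volatile
 * access keeps the compiler from merging or repeating the load. */
static unsigned long read_once_ulong(const unsigned long *p)
{
	return *(const volatile unsigned long *)p;
}

/* Hypothetical stand-in for the rcu_state structure's ->gp_flags field. */
struct state_model { unsigned long gp_flags; };

/* Lockless reader: sample ->gp_flags once and use only the local copy. */
static int gp_flags_set(const struct state_model *s, unsigned long mask)
{
	unsigned long flags = read_once_ulong(&s->gp_flags);

	return (flags & mask) != 0;
}
```

Working from the single sampled copy means that every later test in the function sees the same value, even if another CPU updates the field concurrently.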
     
  • If a grace period fails to start (for example, because you commented
    out the last two lines of rcu_accelerate_cbs_unlocked()), rcu_core()
    will invoke rcu_check_gp_start_stall(), which will notice and complain.
    However, this complaint is lacking crucial debugging information such
    as when the last wakeup executed and what the value of ->gp_seq was at
    that time. This commit therefore removes the current pr_alert() from
    rcu_check_gp_start_stall(), instead invoking show_rcu_gp_kthreads(),
    which has been updated to print the needed information, which is collected
    by rcu_gp_kthread_wake().

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • This commit updates a few obsolete comments in the RCU callback-offload
    code.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The rcu_cpu_kthread_cpu variable used to provide debugfs information,
    but is no longer used. This commit therefore removes it.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Given that RCU has a perfectly good per-CPU rcu_data structure, most
    per-CPU quantities should be stored there.

    This commit therefore moves the rcu_cpu_has_work per-CPU variable to
    the rcu_data structure. This also makes this variable unconditionally
    present, which should be acceptable given the memory reduction due to the
    RCU flavor consolidation and also due to simplifications this will enable.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The rcu_cpu_kthread_loops variable used to provide debugfs information,
    but is no longer used. This commit therefore removes it.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Given that RCU has a perfectly good per-CPU rcu_data structure, most
    per-CPU quantities should be stored there.

    This commit therefore moves the rcu_cpu_kthread_status per-CPU variable
    to the rcu_data structure. This also makes this variable unconditionally
    present, which should be acceptable given the memory reduction due to the
    RCU flavor consolidation and also due to simplifications this will enable.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Given that RCU has a perfectly good per-CPU rcu_data structure, most
    per-CPU quantities should be stored there.

    This commit therefore moves the rcu_cpu_kthread_task per-CPU variable to
    the rcu_data structure. This also makes this variable unconditionally
    present, which should be acceptable given the memory reduction due to the
    RCU flavor consolidation and also due to simplifications this will enable.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • It is perfectly fine to set the rcutree.jiffies_till_first_fqs boot
    parameter to zero; in fact, this can be useful on specialty systems that
    usually have at least one idle CPU and that need fast grace periods.
    This is because this setting causes the RCU grace-period kthread to
    scan for idle threads immediately after grace-period initialization,
    as opposed to waiting several jiffies to do so.

    It is also perfectly fine to set the rcutree.rcu_kick_kthreads kernel
    parameter, which gives the RCU grace-period kthread an extra wakeup
    if it doesn't make progress for a period of three times the setting of
    the rcutree.jiffies_till_first_fqs boot parameter. This is of course
    problematic when the value of this parameter is zero, as it can result
    in unnecessary wakeup IPIs along with unnecessary WARN_ONCE() invocations.

    This commit therefore defers kthread kicking for at least two jiffies,
    regardless of the setting of rcutree.jiffies_till_first_fqs.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
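
The deferral rule from the commit above works out as follows, assuming the two-jiffy floor is applied to the "three times jiffies_till_first_fqs" delay. The kick_delay() helper is a hypothetical name, and the kernel's own expression may be written differently.

```c
/* Jiffies to wait before kicking the grace-period kthread: three times
 * jiffies_till_first_fqs, but never fewer than two jiffies. */
static unsigned long kick_delay(unsigned long jiffies_till_first_fqs)
{
	unsigned long delay = 3 * jiffies_till_first_fqs;

	return delay < 2 ? 2 : delay;
}
```

With jiffies_till_first_fqs set to zero the delay is two jiffies instead of zero, and any nonzero setting keeps the three-times rule unchanged.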
     
  • Back when there were multiple flavors of RCU, it was necessary to
    separately count lazy and non-lazy callbacks for each CPU. These counts
    were used in CONFIG_RCU_FAST_NO_HZ kernels to determine how long a newly
    idle CPU should be allowed to sleep before handling its RCU callbacks.
    But now that there is only one flavor, the callback counts for a given
    CPU's sole rcu_data structure are the counts for that CPU.

    This commit therefore removes the rcu_data structure's ->nonlazy_posted
    and ->nonlazy_posted_snap fields, the rcu_idle_count_callbacks_posted()
    and rcu_cpu_has_callbacks() functions, repurposes the rcu_data structure's
    ->all_lazy field to record the laziness state at the beginning of the
    latest idle sojourn, and modifies CONFIG_RCU_FAST_NO_HZ RCU CPU stall
    warnings accordingly.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Now that _synchronize_rcu_expedited() has only one caller, and given that
    this is a tail call, this commit inlines _synchronize_rcu_expedited()
    into synchronize_rcu_expedited().

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Now that rcu_blocking_is_gp() makes the correct immediate-return
    decision for both PREEMPT and !PREEMPT, a single implementation of
    synchronize_rcu() will work correctly under both configurations.
    This commit therefore eliminates a few lines of code by consolidating
    the two implementations of synchronize_rcu().

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
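
A userspace model of how one synchronize_rcu() body can serve both configurations once rcu_blocking_is_gp() makes the immediate-return decision, as the commit above describes. The gp_env structure and both function names are hypothetical, and the two conditions shown (early boot for PREEMPT, at most one online CPU for !PREEMPT) are assumptions about that decision rather than text taken from the commit.

```c
#include <stdbool.h>

/* Hypothetical inputs standing in for kernel state. */
struct gp_env {
	bool preempt;			/* models CONFIG_PREEMPT=y */
	bool scheduler_inactive;	/* early boot, before the scheduler runs */
	unsigned int online_cpus;	/* models num_online_cpus() */
};

/* Model of the immediate-return decision: true means that being able
 * to block already implies a grace period has elapsed. */
static bool blocking_is_gp(const struct gp_env *e)
{
	if (e->preempt)
		return e->scheduler_inactive;
	return e->online_cpus <= 1;
}

/* Single synchronize path shared by both configurations.
 * Returns true if it would have to wait for a full grace period. */
static bool synchronize_model(const struct gp_env *e)
{
	if (blocking_is_gp(e))
		return false;	/* Nothing to wait for. */
	/* A real implementation would wait for a grace period here. */
	return true;
}
```

Keeping the configuration-specific test inside the helper is what lets the caller be written once.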
     
  • The CONFIG_PREEMPT=n and CONFIG_PREEMPT=y implementations of
    synchronize_rcu_expedited() are quite similar, and with small
    modifications to rcu_blocking_is_gp() can be made identical. This commit
    therefore makes this change in order to save a few lines of code and to
    reduce the amount of duplicate code.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney