16 May, 2019

1 commit

  • Pull tracing updates from Steven Rostedt:
    "The major changes in this tracing update include:

    - Removal of non-DYNAMIC_FTRACE from 32bit x86

    - Removal of mcount support from x86

    - Emulating a call from int3 on x86_64, fixes live kernel patching

    - Consolidated Tracing Error logs file

    Minor updates:

    - Removal of klp_check_compiler_support()

    - kdb ftrace dumping output changes

    - Accessing and creating ftrace instances from inside the kernel

    - Clean up of #define if macro

    - Introduction of TRACE_EVENT_NOP() to disable trace events based on
    config options

    And other minor fixes and clean ups"

    * tag 'trace-v5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (44 commits)
    x86: Hide the int3_emulate_call/jmp functions from UML
    livepatch: Remove klp_check_compiler_support()
    ftrace/x86: Remove mcount support
    ftrace/x86_32: Remove support for non DYNAMIC_FTRACE
    tracing: Simplify "if" macro code
    tracing: Fix documentation about disabling options using trace_options
    tracing: Replace kzalloc with kcalloc
    tracing: Fix partial reading of trace event's id file
    tracing: Allow RCU to run between postponed startup tests
    tracing: Fix white space issues in parse_pred() function
    tracing: Eliminate const char[] auto variables
    ring-buffer: Fix mispelling of Calculate
    tracing: probeevent: Fix to make the type of $comm string
    tracing: probeevent: Do not accumulate on ret variable
    tracing: uprobes: Re-enable $comm support for uprobe events
    ftrace/x86_64: Emulate call function while updating in breakpoint handler
    x86_64: Allow breakpoints to emulate call instructions
    x86_64: Add gap to int3 to allow for call emulation
    tracing: kdb: Allow ftdump to skip all but the last few entries
    tracing: Add trace_total_entries() / trace_total_entries_cpu()
    ...

    Linus Torvalds
     

08 May, 2019

1 commit

  • Pull printk updates from Petr Mladek:

    - Allow state reset of printk_once() calls.

    - Prevent crashes when dereferencing invalid pointers in vsprintf().
    Only the first byte is checked for simplicity.

    - Make vsprintf warnings consistent and inlined.

    - Treewide conversion of obsolete %pf, %pF to %ps, %pS printf
    modifiers.

    - Some clean up of vsprintf and test_printf code.

    * tag 'printk-for-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk:
    lib/vsprintf: Make function pointer_string static
    vsprintf: Limit the length of inlined error messages
    vsprintf: Avoid confusion between invalid address and value
    vsprintf: Prevent crash when dereferencing invalid pointers
    vsprintf: Consolidate handling of unknown pointer specifiers
    vsprintf: Factor out %pO handler as kobject_string()
    vsprintf: Factor out %pV handler as va_format()
    vsprintf: Factor out %p[iI] handler as ip_addr_string()
    vsprintf: Do not check address of well-known strings
    vsprintf: Consistent %pK handling for kptr_restrict == 0
    vsprintf: Shuffle restricted_pointer()
    printk: Tie printk_once / printk_deferred_once into .data.once for reset
    treewide: Switch printk users from %pf and %pF to %ps and %pS, respectively
    lib/test_printf: Switch to bitmap_zalloc()

    Linus Torvalds
     

09 Apr, 2019

2 commits

  • …, 'srcu.2019.03.26b', 'stall.2019.03.26b' and 'torture.2019.03.26b' into HEAD

    consolidate.2019.04.09a: Lingering RCU flavor consolidation cleanups.
    doc.2019.03.26b: Documentation updates.
    fixes.2019.03.26b: Miscellaneous fixes.
    srcu.2019.03.26b: SRCU updates.
    stall.2019.03.26b: RCU CPU stall warning updates.
    torture.2019.03.26b: Torture-test updates.

    Paul E. McKenney
     
  • %pF and %pf are functionally equivalent to %pS and %ps conversion
    specifiers. The former are deprecated, therefore switch the current users
    to use the preferred variant.

    The changes have been produced by the following command:

    git grep -l '%p[fF]' | grep -v '^\(tools\|Documentation\)/' | \
    while read i; do perl -i -pe 's/%pf/%ps/g; s/%pF/%pS/g;' $i; done

    And verifying the result.

    Link: http://lkml.kernel.org/r/20190325193229.23390-1-sakari.ailus@linux.intel.com
    Cc: Andy Shevchenko
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: sparclinux@vger.kernel.org
    Cc: linux-um@lists.infradead.org
    Cc: xen-devel@lists.xenproject.org
    Cc: linux-acpi@vger.kernel.org
    Cc: linux-pm@vger.kernel.org
    Cc: drbd-dev@lists.linbit.com
    Cc: linux-block@vger.kernel.org
    Cc: linux-mmc@vger.kernel.org
    Cc: linux-nvdimm@lists.01.org
    Cc: linux-pci@vger.kernel.org
    Cc: linux-scsi@vger.kernel.org
    Cc: linux-btrfs@vger.kernel.org
    Cc: linux-f2fs-devel@lists.sourceforge.net
    Cc: linux-mm@kvack.org
    Cc: ceph-devel@vger.kernel.org
    Cc: netdev@vger.kernel.org
    Signed-off-by: Sakari Ailus
    Acked-by: David Sterba (for btrfs)
    Acked-by: Mike Rapoport (for mm/memblock.c)
    Acked-by: Bjorn Helgaas (for drivers/pci)
    Acked-by: Rafael J. Wysocki
    Signed-off-by: Petr Mladek

    Sakari Ailus
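
    A minimal Python sketch of the substitution performed by the command in
    the commit above, for trying it out on a single string. The function
    names convert_specifiers() and is_excluded() are invented for this
    illustration. Like the perl one-liner, the replacement is purely textual,
    which is why the result still has to be verified by hand.

```python
def convert_specifiers(text: str) -> str:
    """Apply the two substitutions from the perl one-liner, in the same order."""
    text = text.replace("%pf", "%ps")   # s/%pf/%ps/g
    text = text.replace("%pF", "%pS")   # s/%pF/%pS/g
    return text


def is_excluded(path: str) -> bool:
    """Mirror the grep -v filter: skip paths under tools/ and Documentation/."""
    return path.startswith(("tools/", "Documentation/"))


if __name__ == "__main__":
    line = 'pr_info("%pF called %pf\\n", a, b);'
    print(convert_specifiers(line))
```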
     

08 Apr, 2019

1 commit

  • When CONFIG_RCU_TRACE is not set, all these tracepoints are defined as
    do-nothing macros.
    We'd better make those inline functions that take proper arguments.

    RCU_TRACE() is likewise defined as a do-nothing macro when
    CONFIG_RCU_TRACE is not set, so we can clean it up as well.

    Link: http://lkml.kernel.org/r/1553602391-11926-4-git-send-email-laoar.shao@gmail.com

    Reviewed-by: Paul E. McKenney
    Signed-off-by: Yafang Shao
    Signed-off-by: Steven Rostedt (VMware)

    Yafang Shao
     

27 Mar, 2019

33 commits

  • If the specified rcuperf.perf_type is not in the rcu_perf_init()
    function's perf_ops[] array, rcuperf prints some console messages and
    then invokes rcu_perf_cleanup() to set state so that a future torture
    test can run. However, rcu_perf_cleanup() also attempts to end the
    test that didn't actually start, and in doing so relies on the value
    of cur_ops, a value that is not particularly relevant in this case.
    This can result in confusing output or even follow-on failures due to
    attempts to use facilities that have not been properly initialized.

    This commit therefore sets the value of cur_ops to NULL in this case and
    inserts a check near the beginning of rcu_perf_cleanup(), thus avoiding
    relying on an irrelevant cur_ops value.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • If the specified rcutorture.torture_type is not in the rcu_torture_init()
    function's torture_ops[] array, rcutorture prints some console messages
    and then invokes rcu_torture_cleanup() to set state so that a future
    torture test can run. However, rcu_torture_cleanup() also attempts to
    end the test that didn't actually start, and in doing so relies on the
    value of cur_ops, a value that is not particularly relevant in this case.
    This can result in confusing output or even follow-on failures due to
    attempts to use facilities that have not been properly initialized.

    This commit therefore sets the value of cur_ops to NULL in this case
    and inserts a check near the beginning of rcu_torture_cleanup(),
    thus avoiding relying on an irrelevant cur_ops value.

    Reported-by: kernel test robot
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
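
    The rcuperf and rcutorture fixes described in the two commits above
    follow the same pattern, sketched here as a toy Python model. The class,
    method, and message names are hypothetical and do not reproduce the
    kernel's data structures; only the "set cur_ops to NULL, then check it
    early in cleanup" idea comes from the commit messages.

```python
class TortureModule:
    """Toy model of the init/cleanup handshake for a torture-style test."""

    def __init__(self, known_types):
        self.known_types = dict(known_types)  # type name -> ops object
        self.cur_ops = None
        self.log = []

    def init(self, torture_type):
        ops = self.known_types.get(torture_type)
        if ops is None:
            self.log.append(f"invalid torture type: {torture_type!r}")
            self.cur_ops = None   # the fix: leave no irrelevant value behind
            self.cleanup()        # still reset state for a future test
            return False
        self.cur_ops = ops
        self.log.append(f"started {torture_type}")
        return True

    def cleanup(self):
        if self.cur_ops is None:  # the fix: no test started, nothing to end
            self.log.append("cleanup: no test running")
            return
        self.log.append(f"cleanup: ending {self.cur_ops}")
        self.cur_ops = None
```

    With the early check in place, a bad type name produces only the
    "invalid type" message followed by a harmless cleanup.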
     
  • The rcutorture_oom_notify() function has a misplaced close parenthesis
    that results in increasingly long delays in rcu_fwd_progress_check()'s
    checking for various RCU forward-progress problems. This commit therefore
    puts the parenthesis in the right place.

    Signed-off-by: Neeraj Upadhyay
    Signed-off-by: Paul E. McKenney

    Neeraj Upadhyay
     
  • Back when there was a separate RCU-bh flavor, the ->ext_irq_conflict
    field was used to prevent executing local_bh_enable() while interrupts
    were disabled. However, there is no longer an RCU-bh flavor, so this
    commit removes the no-longer-needed ->ext_irq_conflict field.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The code actually rarely uses more than one type of RCU read-side
    protection, as is actually desired given that we need some reasonable
    probability of preempting RCU read-side critical sections, which cannot
    happen with multiple types of protection. This commit therefore adjusts
    the comment.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The Documentation/RCU/stallwarn.txt file says that stall warnings
    print "D" if dyntick-idle processing is enabled, but the code in
    print_cpu_stall_fast_no_hz() prints "." instead. This commit therefore
    reverses the sense of the test to make the code match the documentation.

    Signed-off-by: Neeraj Upadhyay
    Signed-off-by: Paul E. McKenney

    Neeraj Upadhyay
     
  • This commit further consolidates stall-warning functionality by moving
    forward-progress checkers into kernel/rcu/tree_stall.h, updating a
    comment or two while in the area. More specifically, this commit moves
    show_rcu_gp_kthreads(), rcu_check_gp_start_stall(), rcu_fwd_progress_check(),
    sysrq_rcu, sysrq_show_rcu(), sysrq_rcudump_op, and rcu_sysrq_init() from
    kernel/rcu/tree.c to kernel/rcu/tree_stall.h.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The rcu_iw_handler() function's sole purpose in life is to indicate
    whether a stalled CPU had interrupts disabled, so it belongs in
    kernel/rcu/tree_stall.h. This commit therefore makes that move,
    clarifying its header comment while in the area.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • This commit does only code movement, removal of now-unneeded forward
    declarations, and addition of comments. It organizes the functions
    that implement RCU CPU stall warnings for normal grace periods into
    three categories:

    1. Control of RCU CPU stall warnings, including computing timeouts.

    2. Interaction of stall warnings with grace periods.

    3. Actual printing of the RCU CPU stall-warning messages.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • This commit further consolidates the stall-warning code by moving
    print_cpu_stall_info() and its helper functions along with
    zero_cpu_stall_ticks() to kernel/rcu/tree_stall.h.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The print_cpu_stall_info_begin() and print_cpu_stall_info_end() functions
    print a single character each onto the console, and are a holdover from
    a time when RCU CPU stall warning messages could be abbreviated using a
    long-gone Kconfig option. This commit therefore adds these single
    characters to
    already-printed strings in the calling functions, and then eliminates
    both print_cpu_stall_info_begin() and print_cpu_stall_info_end().

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Because expedited CPU stall warnings are contained within the
    kernel/rcu/tree_exp.h file, rcu_print_task_exp_stall() should live
    there too. This commit carries out the required code motion.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The rcu_print_detail_task_stall(), rcu_print_task_stall_begin(), and
    rcu_print_task_stall_end() functions were defined to allow long-gone
    Kconfig options to provide an abbreviated RCU CPU stall warning printout.
    This commit saves a few lines of code by inlining them into their sole
    callers.

    While in the area, a useless call of rcu_print_detail_task_stall_rnp()
    on the root rcu_node structure was eliminated. If there is only one
    rcu_node structure, its tasks get printed twice, but if there are more,
    the root rcu_node structure is guaranteed to have an empty list of blocked
    tasks, hence the uselessness. (Long ago, root rcu_node structures with
    non-empty ->blkd_tasks lists could happen, but no longer.)

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • This commit completes the process of consolidating the code for RCU CPU
    stall warnings for normal grace periods by moving the remaining such
    code from kernel/rcu/tree.c to kernel/rcu/tree_stall.h.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The RCU CPU stall-warning code for normal grace periods is currently
    scattered across two files, due to earlier Tiny RCU support for RCU
    CPU stall warnings and for old Kconfig options that have long since
    been retired. Given that it is hard for the lead RCU maintainer to
    find relevant stall-warning code, it would be good to consolidate it.
    This commit continues this process by moving stall-warning code from
    kernel/rcu/tree_plugin.h to a new kernel/rcu/tree_stall.h file.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The RCU CPU stall-warning code for normal grace periods is currently
    scattered across three files, due to earlier Tiny RCU support for RCU
    CPU stall warnings and for old Kconfig options that have long since
    been retired. Given that it is hard for the lead RCU maintainer to
    find relevant stall-warning code, it would be good to consolidate it.
    This commit starts this process by moving stall-warning code from
    kernel/rcu/update.c to a new kernel/rcu/tree_stall.h file.

    Note that the definitions of rcu_cpu_stall_suppress and
    rcu_cpu_stall_timeout must remain in kernel/rcu/update.c to provide
    compatibility for kernel boot parameter lists.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The cleanup_srcu_struct_quiesced() function was added because NVME
    used WQ_MEM_RECLAIM workqueues and SRCU did not, which meant that
    NVME workqueues waiting on SRCU workqueues could result in deadlocks
    during low-memory conditions. However, SRCU now also has WQ_MEM_RECLAIM
    workqueues, so there is no longer a potential for deadlock. Furthermore,
    it turns out to be extremely hard to use cleanup_srcu_struct_quiesced()
    correctly due to the fact that SRCU callback invocation accesses the
    srcu_struct structure's per-CPU data area just after callbacks are
    invoked. Therefore, the usual practice of using srcu_barrier() to wait
    for callbacks to be invoked before invoking cleanup_srcu_struct_quiesced()
    fails because SRCU's callback-invocation workqueue handler might be
    delayed, which can result in cleanup_srcu_struct_quiesced() being invoked
    (and thus freeing the per-CPU data) before the SRCU's callback-invocation
    workqueue handler is finished using that per-CPU data. Nor is this a
    theoretical problem: KASAN emitted use-after-free warnings because of
    this problem on actual runs.

    In short, NVME can now safely invoke cleanup_srcu_struct(), which
    avoids the use-after-free scenario. And cleanup_srcu_struct_quiesced()
    is quite difficult to use safely. This commit therefore removes
    cleanup_srcu_struct_quiesced(), switching its sole user back to
    cleanup_srcu_struct(). This effectively reverts the following pair
    of commits:

    f7194ac32ca2 ("srcu: Add cleanup_srcu_struct_quiesced()")
    4317228ad9b8 ("nvme: Avoid flush dependency in delete controller flow")

    Reported-by: Bart Van Assche
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Bart Van Assche
    Tested-by: Bart Van Assche

    Paul E. McKenney
     
  • If someone fails to drain the corresponding SRCU callbacks (for
    example, by failing to invoke srcu_barrier()) before invoking either
    cleanup_srcu_struct() or cleanup_srcu_struct_quiesced(), the resulting
    diagnostic is an ambiguous use-after-free diagnostic, and even then
    only if you are running something like KASAN. This commit therefore
    improves SRCU diagnostics by adding checks for in-flight callbacks at
    _cleanup_srcu_struct() time.

    Note that these diagnostics can still be defeated, for example, by
    invoking call_srcu() concurrently with cleanup_srcu_struct(). Which is
    a really bad idea, but sometimes all too easy to do. But even then,
    these diagnostics have at least some probability of catching the problem.

    Reported-by: Sagi Grimberg
    Reported-by: Bart Van Assche
    Signed-off-by: Paul E. McKenney
    Tested-by: Bart Van Assche

    Paul E. McKenney
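
    As a rough illustration of the check added by the commit above, the
    following toy Python model refuses to tear down a structure that still
    has callbacks queued. SrcuModel and its methods are hypothetical names.
    Returning False without freeing is an assumption of this sketch about
    what a failed check does, not something the commit message states.

```python
class SrcuModel:
    """Toy stand-in for an srcu_struct with a callback queue."""

    def __init__(self):
        self.pending = []    # callbacks queued but not yet invoked
        self.freed = False
        self.warnings = []

    def call_srcu(self, callback):
        self.pending.append(callback)

    def srcu_barrier(self):
        """Invoke every queued callback, leaving none in flight."""
        while self.pending:
            self.pending.pop(0)()

    def cleanup(self):
        """Free the structure only if no callbacks are in flight."""
        if self.pending:
            self.warnings.append(
                f"{len(self.pending)} callback(s) still in flight")
            return False     # assumption: complain instead of freeing
        self.freed = True
        return True
```

    As the commit notes, a check of this kind can still be defeated by
    queueing a callback concurrently with the cleanup; the model is
    single-threaded and does not capture that race.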
     
  • The task_struct structure's ->rcu_read_unlock_special field is only ever
    read or written by the owning task, but it is accessed both at process
    and interrupt levels. It may therefore be accessed using plain reads
    and writes while interrupts are disabled, but must be accessed using
    READ_ONCE() and WRITE_ONCE() or better otherwise. This commit makes a
    few adjustments to align with this discipline.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • This commit changes a rcu_exp_handler() comment from rcu_preempt_defer_qs()
    to rcu_preempt_deferred_qs() in order to better match reality.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Because rcu_wake_cond() checks for a null task_struct pointer, there is
    no need for its callers to do so. This commit eliminates the redundant
    check.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Previously, threads blocked on offlining CPUs were migrated to the
    root rcu_node structure, thus requiring RCU priority boosting on this
    structure. However, since commit d19fb8d1f3f6 ("rcu: Don't migrate
    blocked tasks even if all corresponding CPUs offline"), RCU does not
    migrate blocked tasks. Consequently, RCU no longer does RCU priority
    boosting on the root rcu_node structure as of commit 1be0085b515e ("rcu:
    Don't initiate RCU priority boosting on root rcu_node").

    This commit therefore brings the force_qs_rnp() function's header
    comment into line with this new no-root-boosting reality.

    Signed-off-by: Zhouyi Zhou
    [ paulmck: Also remove obsolete comment on suppressing new grace periods. ]
    Signed-off-by: Paul E. McKenney

    Zhouyi Zhou
     
  • This commit better documents the jiffies_to_sched_qs default-value
    strategy used by adjust_jiffies_till_sched_qs().

    Reported-by: Joel Fernandes
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The current code only calls adjust_jiffies_till_sched_qs() if
    jiffies_till_sched_qs is left at its default value, so when the
    jiffies_till_sched_qs kernel-boot parameter actually is specified,
    jiffies_to_sched_qs will be left with the value zero, which
    will result in useless slowdowns of cond_resched(). This commit
    therefore changes rcu_init_geometry() to unconditionally invoke
    adjust_jiffies_till_sched_qs(), which ensures that jiffies_to_sched_qs
    will be initialized in all cases, thus maintaining good cond_resched()
    performance.

    Signed-off-by: Neeraj Upadhyay
    Signed-off-by: Paul E. McKenney

    Neeraj Upadhyay
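
    The initialization problem fixed by the commit above can be sketched in
    Python as follows. UNSET is a hypothetical sentinel for "boot parameter
    not given", and computed_default is a placeholder for whatever default
    the kernel derives; that derivation is not modeled. That the adjust
    function honors an explicitly given value is an assumption of this
    sketch.

```python
UNSET = None  # hypothetical sentinel: boot parameter was not specified


def adjust_jiffies_till_sched_qs(jiffies_till_sched_qs, computed_default):
    """Return the value that jiffies_to_sched_qs should take."""
    if jiffies_till_sched_qs is not UNSET:
        return jiffies_till_sched_qs      # respect the explicit request
    return computed_default               # placeholder for the real default


def init_geometry_old(jiffies_till_sched_qs, computed_default):
    """Old behavior: adjust only when the parameter was left at its default."""
    jiffies_to_sched_qs = 0               # never updated if the branch is skipped
    if jiffies_till_sched_qs is UNSET:
        jiffies_to_sched_qs = adjust_jiffies_till_sched_qs(
            jiffies_till_sched_qs, computed_default)
    return jiffies_to_sched_qs


def init_geometry_new(jiffies_till_sched_qs, computed_default):
    """New behavior: adjust unconditionally, so the value is always set."""
    return adjust_jiffies_till_sched_qs(jiffies_till_sched_qs, computed_default)
```

    In the old variant, specifying the parameter leaves the derived value at
    zero, which is the case the commit describes.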
     
  • The current rcu_gp_kthread_wake() function uses in_interrupt()
    and thus does a self-wakeup from all interrupt contexts, including
    the pointless case where the GP kthread happens to be running with
    bottom halves disabled, along with the impossible case where the GP
    kthread is running within an NMI handler (you are not supposed to invoke
    rcu_gp_kthread_wake() from within an NMI handler). This commit therefore
    replaces the in_interrupt() with in_irq() || in_serving_softirq(), so
    that the self-wakeups happen only from handlers for hardware interrupts
    and softirqs. This also makes the code match the comment.

    Signed-off-by: Neeraj Upadhyay
    Signed-off-by: Paul E. McKenney
    Acked-by: Steven Rostedt (VMware)

    Neeraj Upadhyay
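
    To see why in_interrupt() is too broad for the self-wakeup test in the
    commit above, the following Python sketch models the context bits of
    preempt_count. The bit widths and shifts are assumptions of this sketch
    and may differ between kernel versions; the two *_self_wakeup_ok()
    helpers are hypothetical names for the old and new tests.

```python
# Assumed layout: 8 preempt bits, 8 softirq bits, 4 hardirq bits, 1 NMI bit.
SOFTIRQ_SHIFT = 8
HARDIRQ_SHIFT = 16
NMI_SHIFT = 20

SOFTIRQ_OFFSET = 1 << SOFTIRQ_SHIFT          # added while serving a softirq
SOFTIRQ_DISABLE_OFFSET = 2 * SOFTIRQ_OFFSET  # added when bottom halves are disabled
HARDIRQ_OFFSET = 1 << HARDIRQ_SHIFT
NMI_OFFSET = 1 << NMI_SHIFT

SOFTIRQ_MASK = 0xFF << SOFTIRQ_SHIFT
HARDIRQ_MASK = 0xF << HARDIRQ_SHIFT
NMI_MASK = 0x1 << NMI_SHIFT


def in_interrupt(pc):
    """True in hardirq, softirq, NMI, or merely BH-disabled context."""
    return bool(pc & (SOFTIRQ_MASK | HARDIRQ_MASK | NMI_MASK))


def in_irq(pc):
    """True only while a hardware-interrupt handler is running."""
    return bool(pc & HARDIRQ_MASK)


def in_serving_softirq(pc):
    """True only while a softirq handler is running, not when BH is just disabled."""
    return bool(pc & SOFTIRQ_OFFSET)


def old_self_wakeup_ok(pc):
    return in_interrupt(pc)


def new_self_wakeup_ok(pc):
    return in_irq(pc) or in_serving_softirq(pc)
```

    A task that has only disabled bottom halves passes the old test but not
    the new one, which is the pointless case called out above.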
     
  • This commit prints a console message when cpulist_parse() reports a
    bad list of CPUs, and sets all CPUs' bits in that case. The reason for
    setting all CPUs' bits is that this is the safe(r) choice for real-time
    workloads, which would normally be the ones using the rcu_nocbs= kernel
    boot parameter. Either way, later RCU console log messages list the
    actual set of CPUs whose RCU callbacks will be offloaded.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Currently, the rcu_nocbs= kernel boot parameter requires that a specific
    list of CPUs be specified, and has no way to say "all of them".
    As noted by user RavFX in a comment to Phoronix topic 1002538, this
    is an inconvenient side effect of the removal of the RCU_NOCB_CPU_ALL
    Kconfig option. This commit therefore enables the rcu_nocbs= kernel boot
    parameter to be given the string "all", as in "rcu_nocbs=all" to specify
    that all CPUs on the system are to have their RCU callbacks offloaded.

    Another approach would be to make cpulist_parse() check for "all", but
    there are uses of cpulist_parse() that do other checking, which could
    conflict with an "all". This commit therefore focuses on the specific
    use of cpulist_parse() in rcu_nocb_setup().

    Just a note to other people who would like changes to Linux-kernel RCU:
    If you send your requests to me directly, they might get fixed somewhat
    faster. RavFX's comment was posted on January 22, 2018 and I first saw
    it on March 5, 2019. And the only reason that I found it -at- -all- was
    that I was looking for projects using RCU, and my search engine showed
    me that Phoronix comment quite by accident. Your choice, though! ;-)

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
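
    The rcu_nocbs= handling described in the two commits above can be
    approximated by the following Python sketch. parse_cpulist() is a
    simplified stand-in for cpulist_parse(), and treating out-of-range CPUs
    as a parse error is an assumption of this sketch. Only the "all" keyword
    and the fall-back to all CPUs on a bad list come from the commit
    messages; the console message text is invented.

```python
def parse_cpulist(spec, nr_cpus):
    """Parse a list such as "0-3,8" into a set of CPU numbers.

    Raises ValueError if the list is malformed or out of range.
    """
    cpus = set()
    for part in spec.split(","):
        lo, sep, hi = part.partition("-")
        first = int(lo)                    # int() raises ValueError on garbage
        last = int(hi) if sep else first
        if first > last or last >= nr_cpus:
            raise ValueError(f"bad CPU range: {part!r}")
        cpus.update(range(first, last + 1))
    return cpus


def rcu_nocb_setup(spec, nr_cpus):
    """Return the set of CPUs whose callbacks are to be offloaded."""
    all_cpus = set(range(nr_cpus))
    if spec == "all":                      # handled here, not in the parser
        return all_cpus
    try:
        return parse_cpulist(spec, nr_cpus)
    except ValueError:
        # Safe(r) choice for real-time workloads: offload everything.
        print(f"rcu_nocbs= bad CPU list {spec!r}, offloading all CPUs")
        return all_cpus
```

    Keeping the "all" check in the caller leaves the generic parser
    untouched, matching the reasoning given for not changing
    cpulist_parse() itself.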
     
  • As a result of the recent addition of "rdp->core_needs_qs = false;" in
    the "if" block, both branches of the if-else now have the same
    assignment.

    Factor it out and reduce line count.

    Signed-off-by: Akira Yokosawa
    Cc: Joel Fernandes
    Signed-off-by: Paul E. McKenney
    Acked-by: Joel Fernandes (Google)

    Akira Yokosawa
     
  • The rcutree.kthread_prio kernel-boot parameter is used to set the
    priority for boost (rcub), per-CPU (rcuc), and grace-period (rcu_preempt
    or rcu_sched) kthreads. It is also used by rcutorture to check whether
    it is possible to meaningfully test RCU priority boosting. However,
    all of these cases will either ignore or be confused by any post-boot
    changes to rcutree.kthread_prio.

    Note that the user really can change the priorities of all of these
    kthreads using chrt, given sufficient privileges. The read-write nature
    of sysfs access to rcutree.kthread_prio is therefore at best an
    attractive nuisance.

    This commit therefore changes sysfs access to rcutree.kthread_prio to
    be read-only.

    Signed-off-by: Liu Song
    Signed-off-by: Paul E. McKenney

    Liu Song
     
  • The purpose of exit_rcu() is to handle cases where buggy code causes a
    task to exit within an RCU read-side critical section. It currently
    does that in the case where said RCU read-side critical section was
    preempted at least once, but fails to handle cases where preemption did
    not occur. This case needs to be handled because otherwise the final
    context switch away from the exiting task will incorrectly behave as if
    task exit were instead a preemption of an RCU read-side critical section,
    and will therefore queue the exiting task. The exiting task will have
    exited, and thus won't ever execute rcu_read_unlock(), which means that
    it will remain queued forever, blocking all subsequent grace periods,
    and eventually resulting in OOM.

    Although this is arguably better than letting grace periods proceed
    and having a later rcu_read_unlock() access the now-freed task
    structure that once belonged to the exiting task, it would obviously
    be better to correctly handle this case. This commit therefore sets
    ->rcu_read_lock_nesting to 1 in that case, so that the subsequent call
    to __rcu_read_unlock() causes the exiting task to exit its dangling RCU
    read-side critical section.

    Note that deferred quiescent states need not be considered. The reason
    is that removing the task from the ->blkd_tasks[] list in the call to
    rcu_preempt_deferred_qs() handles the per-task component of any deferred
    quiescent state, and all other components of any deferred quiescent state
    are associated with the CPU, which isn't going anywhere until some later
    CPU-hotplug operation, which will report any remaining deferred quiescent
    states from within the rcu_report_dead() function.

    Note also that negative values of ->rcu_read_lock_nesting need not be
    considered. First, these won't show up in exit_rcu() unless there is
    a serious bug in RCU, and second, setting ->rcu_read_lock_nesting sets
    the state so that the RCU read-side critical section will be exited
    normally.

    Again, this code has no effect unless there has been some prior bug
    that prevents a task from leaving an RCU read-side critical section
    before exiting. Furthermore, there have been no reports of the bug
    fixed by this commit appearing in production. This commit is therefore
    absolutely -not- recommended for backporting to -stable.

    Reported-by: ABHISHEK DUBEY
    Reported-by: BHARATH Y MOURYA
    Reported-by: Aravinda Prasad
    Signed-off-by: Paul E. McKenney
    Tested-by: ABHISHEK DUBEY

    Paul E. McKenney
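
    The exit_rcu() change described in the commit above can be pictured with
    a toy Python model. The Task fields and the old/new function names are
    simplifications invented for this sketch: a single "queued" flag stands
    in for membership of a blocked-tasks list, and deferred quiescent states
    are not modeled at all.

```python
from dataclasses import dataclass


@dataclass
class Task:
    rcu_read_lock_nesting: int = 0
    queued: bool = False   # True if the task was preempted in a critical section


def rcu_read_unlock(task):
    """Leave one nesting level; leaving the outermost level dequeues the task."""
    task.rcu_read_lock_nesting -= 1
    if task.rcu_read_lock_nesting == 0:
        task.queued = False


def exit_rcu_old(task):
    """Old behavior: only the preempted (queued) case is cleaned up."""
    if not task.queued:
        return             # a never-preempted reader keeps its nesting count
    task.rcu_read_lock_nesting = 1
    rcu_read_unlock(task)


def exit_rcu_new(task):
    """New behavior: also handle a dangling reader that was never preempted."""
    if not task.queued and task.rcu_read_lock_nesting == 0:
        return             # nothing dangling, nothing to do
    task.rcu_read_lock_nesting = 1
    rcu_read_unlock(task)
```

    Forcing the nesting count to 1 before the unlock means a single call is
    enough to leave the critical section regardless of how deeply the buggy
    code had nested.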
     
  • rcu_qs already disables IRQs itself, so there is no need for raise_softirq
    to do the same; using raise_softirq_irqoff directly saves some cycles.

    CC: Paul E. McKenney
    Signed-off-by: Cyrill Gorcunov
    Signed-off-by: Paul E. McKenney

    Cyrill Gorcunov
     
  • When there are no callbacks pending on an idle system, I noticed that
    RCU softirq is continuously firing. During this time, cpu_no_qs is set to
    false, and core_needs_qs is set to true indefinitely. This causes
    rcu_process_callbacks to be repeatedly called, even though the node
    corresponding to the CPU has that CPU's mask bit cleared and the system
    is idle. I believe the race is when such mask clearing is done during
    idle CPU scan of the quiescent state forcing stage in the kthread
    instead of the softirq. Since the rnp mask is cleared, but the flags on
    the CPU's rdp are not cleared, the CPU thinks it still needs to report
    to core RCU.

    Cure this by clearing the core_needs_qs flag when the CPU detects that
    its node is already updated which will avoid the unwanted softirq raises
    to the benefit of real-time systems.

    Test: Ran rcutorture for various tree RCU configs.

    Signed-off-by: Joel Fernandes (Google)
    Signed-off-by: Paul E. McKenney

    Joel Fernandes (Google)
     
  • The rcu_pm_notify() function refuses to switch to/from expedited grace
    periods on systems with more than 256 CPUs due to the serialized
    initialization of expedited grace periods. However, expedited grace
    periods are now initialized in parallel, removing this concern.
    This commit therefore removes the checks from rcu_pm_notify(), so that
    expedited grace periods are used unconditionally during suspend/resume
    and hibernate/wake operations.

    As always, real-time workloads wishing to completely avoid expedited
    grace periods can use the rcupdate.rcu_normal= kernel parameter.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     

06 Mar, 2019

2 commits

  • Pull perf updates from Ingo Molnar:
    "Lots of tooling updates - too many to list, here's a few highlights:

    - Various subcommand updates to 'perf trace', 'perf report', 'perf
    record', 'perf annotate', 'perf script', 'perf test', etc.

    - CPU and NUMA topology and affinity handling improvements,

    - HW tracing and HW support updates:
    - Intel PT updates
    - ARM CoreSight updates
    - vendor HW event updates

    - BPF updates

    - Tons of infrastructure updates, both on the build system and the
    library support side

    - Documentation updates.

    - ... and lots of other changes, see the changelog for details.

    Kernel side updates:

    - Tighten up kprobes blacklist handling, reduce the number of places
    where developers can install a kprobe and hang/crash the system.

    - Fix/enhance vma address filter handling.

    - Various PMU driver updates, small fixes and additions.

    - refcount_t conversions

    - BPF updates

    - error code propagation enhancements

    - misc other changes"

    * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (238 commits)
    perf script python: Add Python3 support to syscall-counts-by-pid.py
    perf script python: Add Python3 support to syscall-counts.py
    perf script python: Add Python3 support to stat-cpi.py
    perf script python: Add Python3 support to stackcollapse.py
    perf script python: Add Python3 support to sctop.py
    perf script python: Add Python3 support to powerpc-hcalls.py
    perf script python: Add Python3 support to net_dropmonitor.py
    perf script python: Add Python3 support to mem-phys-addr.py
    perf script python: Add Python3 support to failed-syscalls-by-pid.py
    perf script python: Add Python3 support to netdev-times.py
    perf tools: Add perf_exe() helper to find perf binary
    perf script: Handle missing fields with -F +..
    perf data: Add perf_data__open_dir_data function
    perf data: Add perf_data__(create_dir|close_dir) functions
    perf data: Fail check_backup in case of error
    perf data: Make check_backup work over directories
    perf tools: Add rm_rf_perf_data function
    perf tools: Add pattern name checking to rm_rf
    perf tools: Add depth checking to rm_rf
    perf data: Add global path holder
    ...

    Linus Torvalds
     
  • Pull RCU updates from Ingo Molnar:
    "The main RCU related changes in this cycle were:

    - Additional cleanups after RCU flavor consolidation

    - Grace-period forward-progress cleanups and improvements

    - Documentation updates

    - Miscellaneous fixes

    - spin_is_locked() conversions to lockdep

    - SPDX changes to RCU source and header files

    - SRCU updates

    - Torture-test updates, including nolibc updates and moving nolibc to
    tools/include"

    * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (71 commits)
    locking/locktorture: Convert to SPDX license identifier
    linux/torture: Convert to SPDX license identifier
    torture: Convert to SPDX license identifier
    linux/srcu: Convert to SPDX license identifier
    linux/rcutree: Convert to SPDX license identifier
    linux/rcutiny: Convert to SPDX license identifier
    linux/rcu_sync: Convert to SPDX license identifier
    linux/rcu_segcblist: Convert to SPDX license identifier
    linux/rcupdate: Convert to SPDX license identifier
    linux/rcu_node_tree: Convert to SPDX license identifier
    rcu/update: Convert to SPDX license identifier
    rcu/tree: Convert to SPDX license identifier
    rcu/tiny: Convert to SPDX license identifier
    rcu/sync: Convert to SPDX license identifier
    rcu/srcu: Convert to SPDX license identifier
    rcu/rcutorture: Convert to SPDX license identifier
    rcu/rcu_segcblist: Convert to SPDX license identifier
    rcu/rcuperf: Convert to SPDX license identifier
    rcu/rcu.h: Convert to SPDX license identifier
    RCU/torture.txt: Remove section MODULE PARAMETERS
    ...

    Linus Torvalds