01 Dec, 2020

1 commit

  • This patch reverts commit 978defee11a5 ("tracing: Do a WARN_ON()
    if start_thread() in hwlat is called when thread exists")

    The .start hook can legally be called several times if the corresponding
    tracer is stopped.

    screen window 1
    [root@localhost ~]# echo 1 > /sys/kernel/tracing/events/kmem/kfree/enable
    [root@localhost ~]# echo 1 > /sys/kernel/tracing/options/pause-on-trace
    [root@localhost ~]# less -F /sys/kernel/tracing/trace

    screen window 2
    [root@localhost ~]# cat /sys/kernel/debug/tracing/tracing_on
    0
    [root@localhost ~]# echo hwlat > /sys/kernel/debug/tracing/current_tracer
    [root@localhost ~]# echo 1 > /sys/kernel/debug/tracing/tracing_on
    [root@localhost ~]# cat /sys/kernel/debug/tracing/tracing_on
    0
    [root@localhost ~]# echo 2 > /sys/kernel/debug/tracing/tracing_on

    This triggers a warning in dmesg:
    WARNING: CPU: 3 PID: 1403 at kernel/trace/trace_hwlat.c:371 hwlat_tracer_start+0xc9/0xd0

    Link: https://lkml.kernel.org/r/bd4d3e70-400d-9c82-7b73-a2d695e86b58@virtuozzo.com

    Cc: Ingo Molnar
    Cc: stable@vger.kernel.org
    Fixes: 978defee11a5 ("tracing: Do a WARN_ON() if start_thread() in hwlat is called when thread exists")
    Signed-off-by: Vasily Averin
    Signed-off-by: Steven Rostedt (VMware)


19 Sep, 2020

1 commit

  • Currently tracing_init_dentry() returns a dentry pointer, which is not
    necessary. This function returns NULL on success or an error on failure,
    which means no valid dentry pointer is ever returned.

    Let's return 0 on success and a negative value on error.

    Link: https://lkml.kernel.org/r/20200712011036.70948-5-richard.weiyang@linux.alibaba.com

    Signed-off-by: Wei Yang
    Signed-off-by: Steven Rostedt (VMware)


31 Jul, 2020

2 commits

  • In the calculation of the cpu mask for the hwlat kernel thread, the wrong
    cpu mask is used instead of the tracing_cpumask, which makes
    tracing/tracing_cpumask useless for the hwlat tracer. Fix it.

    Link: https://lkml.kernel.org/r/20200730082318.42584-2-haokexin@gmail.com

    Cc: Ingo Molnar
    Cc: stable@vger.kernel.org
    Fixes: 0330f7aa8ee6 ("tracing: Have hwlat trace migrate across tracing_cpumask CPUs")
    Signed-off-by: Kevin Hao
    Signed-off-by: Steven Rostedt (VMware)

  • We have set 'current_mask' to '&save_cpumask' in its declaration, so
    there is no need to assign it again.

    Link: https://lkml.kernel.org/r/20200730082318.42584-1-haokexin@gmail.com

    Signed-off-by: Kevin Hao
    Signed-off-by: Steven Rostedt (VMware)


04 Mar, 2020

1 commit

  • The hwlat tracer runs a loop of width time during a given window. It then
    reports the max latency over a given threshold and records a timestamp. But
    this timestamp is taken after the width has finished, not at the time the
    latency actually triggered.

    Record the actual time when the latency was greater than the threshold, as
    well as the number of times it was greater within a given width per window.

    Signed-off-by: Steven Rostedt (VMware)


06 Feb, 2020

1 commit

  • Pull tracing updates from Steven Rostedt:

    - Added new "bootconfig".

    This looks for a file appended to initrd to add boot config options,
    and has been discussed thoroughly at Linux Plumbers.

    Very useful for adding kprobes at bootup.

    Only enabled if "bootconfig" is on the real kernel command line.

    - Created dynamic event creation.

    Merges common code between creating synthetic events and kprobe
    events.

    - Rename perf "ring_buffer" structure to "perf_buffer"

    - Rename ftrace "ring_buffer" structure to "trace_buffer"

    Had to rename existing "trace_buffer" to "array_buffer"

    - Allow trace_printk() to work within (some) tracing code.

    - Sorted tracing configs to be a little better organized

    - Fixed bug where ftrace_graph hash was not being protected properly

    - Various other small fixes and clean ups

    * tag 'trace-v5.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (88 commits)
    bootconfig: Show the number of nodes on boot message
    tools/bootconfig: Show the number of bootconfig nodes
    bootconfig: Add more parse error messages
    bootconfig: Use bootconfig instead of boot config
    ftrace: Protect ftrace_graph_hash with ftrace_sync
    ftrace: Add comment to why rcu_dereference_sched() is open coded
    tracing: Annotate ftrace_graph_notrace_hash pointer with __rcu
    tracing: Annotate ftrace_graph_hash pointer with __rcu
    bootconfig: Only load bootconfig if "bootconfig" is on the kernel cmdline
    tracing: Use seq_buf for building dynevent_cmd string
    tracing: Remove useless code in dynevent_arg_pair_add()
    tracing: Remove check_arg() callbacks from dynevent args
    tracing: Consolidate some synth_event_trace code
    tracing: Fix now invalid var_ref_vals assumption in trace action
    tracing: Change trace_boot to use synth_event interface
    tracing: Move tracing selftests to bottom of menu
    tracing: Move mmio tracer config up with the other tracers
    tracing: Move tracing test module configs together
    tracing: Move all function tracing configs together
    tracing: Documentation for in-kernel synthetic event API
    ...

    Linus Torvalds
     

14 Jan, 2020

2 commits

  • As there are two struct ring_buffers in the kernel, it causes some
    confusion. The other one is the perf ring buffer. It was agreed that, as
    neither of the ring buffers is generic enough to be used globally, they
    should be renamed as:

    perf's ring_buffer -> perf_buffer
    ftrace's ring_buffer -> trace_buffer

    This implements the changes to the ring buffer that ftrace uses.

    Link: https://lore.kernel.org/r/20191213140531.116b3200@gandalf.local.home

    Signed-off-by: Steven Rostedt (VMware)

  • As we are working to remove the generic "ring_buffer" name that is used by
    both tracing and perf, the ring_buffer name for tracing will be renamed to
    trace_buffer, and perf's ring buffer will be renamed to perf_buffer.

    As there already exists a trace_buffer that is used by the trace_arrays, it
    needs to be first renamed to array_buffer.

    Link: https://lore.kernel.org/r/20191213153553.GE20583@krava

    Signed-off-by: Steven Rostedt (VMware)


13 Nov, 2019

1 commit

  • This patch implements a feature whereby the tracing_max_latency file,
    e.g. /sys/kernel/debug/tracing/tracing_max_latency, will receive
    notifications through the fsnotify framework when a new latency is
    available.

    One particularly interesting use of this facility is when enabling
    threshold tracing, through /sys/kernel/debug/tracing/tracing_thresh,
    together with the preempt/irqsoff tracers. This makes it possible to
    implement a user space program that can, with equal probability,
    obtain traces of latencies that occur immediately after each other in
    spite of the fact that the preempt/irqsoff tracers operate in overwrite
    mode.

    This facility works with the hwlat, preempt/irqsoff, and wakeup
    tracers.

    The tracers may call the latency_fsnotify() from places such as
    __schedule() or do_idle(); this makes it impossible to call
    queue_work() directly without risking a deadlock. The same would
    happen with a softirq, kernel thread or tasklet. For this reason we
    use the irq_work mechanism to call queue_work().

    This patch creates a new workqueue. The reason for doing this is that
    I wanted to use the WQ_UNBOUND and WQ_HIGHPRI flags. My thinking was
    that WQ_UNBOUND might help with the latency in some important cases.

    If we use:

    queue_work(system_highpri_wq, &tr->fsnotify_work);

    then the work will (almost) always execute on the same CPU but if we are
    unlucky that CPU could be too busy while there could be another CPU in
    the system that would be able to process the work soon enough.

    queue_work_on() could be used to queue the work on another CPU but it
    seems difficult to select the right CPU.

    Link: http://lkml.kernel.org/r/20191008220824.7911-2-viktor.rosendahl@gmail.com

    Reviewed-by: Joel Fernandes (Google)
    Signed-off-by: Viktor Rosendahl (BMW)
    [ Added max() to have one compare for max latency ]
    Signed-off-by: Steven Rostedt (VMware)


13 Oct, 2019

2 commits

  • max_latency is intended to record the maximum ever observed hardware
    latency, which may occur in either part of the loop (inner/outer). So
    we need to also consider the outer-loop sample when updating
    max_latency.

    Link: http://lkml.kernel.org/r/157073345463.17189.18124025522664682811.stgit@srivatsa-ubuntu

    Fixes: e7c15cd8a113 ("tracing: Added hardware latency tracer")
    Cc: stable@vger.kernel.org
    Signed-off-by: Srivatsa S. Bhat (VMware)
    Signed-off-by: Steven Rostedt (VMware)

  • nmi_total_ts is supposed to record the total time spent in *all* NMIs
    that occur on the given CPU during the (active portion of the)
    sampling window. However, the code seems to be overwriting this
    variable for each NMI, thereby only recording the time spent in the
    most recent NMI. Fix it by accumulating the duration instead.

    Link: http://lkml.kernel.org/r/157073343544.17189.13911783866738671133.stgit@srivatsa-ubuntu

    Fixes: 7b2c86250122 ("tracing: Add NMI tracing in hwlat detector")
    Cc: stable@vger.kernel.org
    Signed-off-by: Srivatsa S. Bhat (VMware)
    Signed-off-by: Steven Rostedt (VMware)


03 Jun, 2019

1 commit

  • In commit:

    4b53a3412d66 ("sched/core: Remove the tsk_nr_cpus_allowed() wrapper")

    the tsk_nr_cpus_allowed() wrapper was removed. There was not much
    difference in !RT, but in RT we used this to implement migrate_disable().
    Within a migrate_disable() section the CPU mask is restricted to a single
    CPU while the "normal" CPU mask remains untouched.

    As an alternative implementation Ingo suggested to use:

        struct task_struct {
                const cpumask_t  *cpus_ptr;
                cpumask_t        cpus_mask;
        };

    with

        t->cpus_ptr = &t->cpus_mask;

    In -RT we then can switch the cpus_ptr to:

        t->cpus_ptr = &cpumask_of(task_cpu(p));

    in a migration disabled region. The rules are simple:

    - Code that 'uses' ->cpus_allowed would use the pointer.
    - Code that 'modifies' ->cpus_allowed would use the direct mask.

    Signed-off-by: Sebastian Andrzej Siewior
    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Thomas Gleixner
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Link: https://lkml.kernel.org/r/20190423142636.14347-1-bigeasy@linutronix.de
    Signed-off-by: Ingo Molnar


02 Aug, 2018

2 commits

  • The start function of the hwlat tracer should never be called when the hwlat
    thread already exists. If it is called, do a WARN_ON().

    Signed-off-by: Steven Rostedt (VMware)

  • The hwlat tracer uses a kernel thread to measure latencies. The function
    that creates this kernel thread, start_kthread(), can be called when the
    tracer is initialized and when the tracer is explicitly enabled.
    start_kthread() does not check if there is an existing hwlat kernel
    thread and will create a new one each time it is called.

    This causes the reference to the previous thread to be lost. Without the
    thread reference, the old kernel thread becomes unstoppable and
    continues to use CPU time even after the hwlat tracer has been disabled.
    This problem can be observed when a system is booted with tracing
    enabled and the hwlat tracer is configured like this:

    echo hwlat > current_tracer; echo 1 > tracing_on

    Add the missing check for an existing kernel thread in start_kthread()
    to prevent this problem. This function and the rest of the hwlat kernel
    thread setup and teardown are already serialized because they are called
    through the tracer core code with trace_type_lock held.

    [
    Note, this only fixes the symptom. The real fix was not to call
    this function when tracing_on was already set. But this still makes
    the code more robust, so we'll add it.
    ]

    Link: http://lkml.kernel.org/r/1533120354-22923-1-git-send-email-erica.bugden@linutronix.de

    Signed-off-by: Erica Bugden
    Signed-off-by: Steven Rostedt (VMware)


09 May, 2017

1 commit

  • struct timespec is not y2038 safe on 32 bit machines and needs to be
    replaced by struct timespec64 in order to represent times beyond year
    2038 on such machines.

    Fix all the timestamp representation in struct trace_hwlat and all the
    corresponding implementations.

    Link: http://lkml.kernel.org/r/1491613030-11599-3-git-send-email-deepa.kernel@gmail.com
    Signed-off-by: Deepa Dinamani
    Acked-by: Steven Rostedt (VMware)
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds


28 Feb, 2017

2 commits

  • Pull another tracing update from Steven Rostedt:
    "Commit 79c6f448c8b79c ("tracing: Fix hwlat kthread migration") fixed a
    bug that was caused by a race condition in initializing the hwlat
    thread. When fixing this code, I realized that it should have been
    done differently. Instead of doing the rewrite and sending that to
    stable, I just sent the above commit to fix the bug that should be
    backported.

    This commit is on top of the quick fix commit to rewrite the code the
    way it should have been written in the first place"

    * tag 'trace-v4.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
    tracing: Clean up the hwlat binding code

    Linus Torvalds
     
  • Pull tracing updates from Steven Rostedt:
    "This release has no new tracing features, just clean ups, minor fixes
    and small optimizations"

    * tag 'trace-v4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (25 commits)
    tracing: Remove outdated ring buffer comment
    tracing/probes: Fix a warning message to show correct maximum length
    tracing: Fix return value check in trace_benchmark_reg()
    tracing: Use modern function declaration
    jump_label: Reduce the size of struct static_key
    tracing/probe: Show subsystem name in messages
    tracing/hwlat: Update old comment about migration
    timers: Make flags output in the timer_start tracepoint useful
    tracing: Have traceprobe_probes_write() not access userspace unnecessarily
    tracing: Have COMM event filter key be treated as a string
    ftrace: Have set_graph_function handle multiple functions in one write
    ftrace: Do not hold references of ftrace_graph_{notrace_}hash out of graph_lock
    tracing: Reset parser->buffer to allow multiple "puts"
    ftrace: Have set_graph_functions handle write with RDWR
    ftrace: Reset fgd->hash in ftrace_graph_write()
    ftrace: Replace (void *)1 with a meaningful macro name FTRACE_GRAPH_EMPTY
    ftrace: Create a slight optimization on searching the ftrace_hash
    tracing: Add ftrace_hash_key() helper function
    ftrace: Convert graph filter to use hash tables
    ftrace: Expose ftrace_hash_empty and ftrace_lookup_ip
    ...

    Linus Torvalds
     

31 Jan, 2017

1 commit

  • The hwlat tracer creates a kernel thread at the start of the tracer. It is
    pinned to a single CPU and will move to the next CPU after each period of
    running. If the user modifies the thread's affinity, it will not change
    after that happens.

    The original code created the thread at the first instance it was called,
    but later was changed to destroy the thread after the tracer was finished,
    and it would not be created again until the next instance of the tracer was
    established. The code that initialized the affinity was only called on the
    initial instantiation of the tracer. After that, it was not reinitialized,
    and the saved affinity did not match that of the newly created thread,
    making it appear that the user had modified the thread's affinity when they
    had not, and so the thread failed to migrate again.

    Cc: stable@vger.kernel.org
    Fixes: 0330f7aa8ee6 ("tracing: Have hwlat trace migrate across tracing_cpumask CPUs")
    Signed-off-by: Steven Rostedt (VMware)


24 Nov, 2016

1 commit

  • The function __buffer_unlock_commit() is called in a few places outside of
    trace.c. But for the most part, it should really be inlined, as it is in the
    hot path of the trace_events. For the callers outside of trace.c, create a
    new function trace_buffer_unlock_commit_nostack(), as the reason it was used
    was to avoid the stack tracing that trace_buffer_unlock_commit() could do.

    Link: http://lkml.kernel.org/r/20161121183700.GW26852@two.firstfloor.org

    Reported-by: Andi Kleen
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     

03 Sep, 2016

3 commits

  • As NMIs can also cause latency when interrupts are disabled, the hwlat
    detector has no way to know whether the latency it detects is from an NMI,
    an SMI, or some other hardware glitch.

    As the ftrace_nmi_enter/exit() functions are no longer used (except for sh,
    which isn't supported anymore), I converted those to
    "arch_ftrace_nmi_enter/exit" and use ftrace_nmi_enter/exit() to check
    whether the hwlat detector is tracing or not, and if so, it calls into the
    hwlat utility.

    Since the hwlat detector only has a single kthread that is spinning with
    interrupts disabled, it marks what CPU it is on, and if the NMI callback
    happens on that CPU, it records the time spent in that NMI. This is added to
    the output that is generated by the hwlat detector as:

    #3 inner/outer(us): 9/9 ts:1470836488.206734548
    #4 inner/outer(us): 0/8 ts:1470836497.140808588
    #5 inner/outer(us): 0/6 ts:1470836499.140825168 nmi-total:5 nmi-count:1
    #6 inner/outer(us): 9/9 ts:1470836501.140841748

    All time is still tracked in microseconds.

    The NMI information is only shown when an NMI occurred during the sample.

    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     
  • Instead of having the hwlat detector thread stay on one CPU, have it migrate
    across all the CPUs specified by tracing_cpumask. If the user modifies the
    thread's CPU affinity, the migration will stop until the next time the
    tracer is instantiated. The migration happens at the end of each window
    (period).

    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     
  • The hardware latency tracer has been in the PREEMPT_RT patch for some time.
    It is used to detect possible SMIs or any other hardware interruptions that
    the kernel is unaware of. Note, NMIs may also be detected, and that is
    good to note as well.

    The logic is pretty simple. It simply creates a thread that spins on a
    single CPU for a specified amount of time (width) within a periodic window
    (window). These numbers may be adjusted by their corresponding names in

    /sys/kernel/tracing/hwlat_detector/

    The defaults are:

        window = 1000000 us (1 second)
        width  =  500000 us (1/2 second)

    The loop consists of:

    t1 = trace_clock_local();
    t2 = trace_clock_local();

    Where trace_clock_local() is a variant of sched_clock().

    The difference t2 - t1 is recorded as the "inner" timestamp and the
    difference t1 - prev_t2 is recorded as the "outer" timestamp. If either of
    these differences is greater than the time denoted in
    /sys/kernel/tracing/tracing_thresh then it records the event.

    When this tracer is started, and tracing_thresh is zero, it changes to the
    default threshold of 10 us.

    The hwlat tracer in the PREEMPT_RT patch was originally written by
    Jon Masters. I have modified it quite a bit and turned it into a
    tracer.

    Based-on-code-by: Jon Masters
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)