19 Sep, 2020

1 commit

  • Currently, tracing_init_dentry() returns a dentry pointer, which is not
    necessary. The function returns NULL on success or an error value on
    failure, which means it never returns a valid dentry pointer.

    Let's return 0 on success and a negative value on error.
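
    A minimal sketch of a caller under the new convention (the caller name
    is hypothetical; only the 0/negative return contract comes from the
    change described above):

      static int my_tracefs_setup(void)        /* hypothetical caller */
      {
              int ret;

              ret = tracing_init_dentry();
              if (ret < 0)                     /* negative value: no tracefs */
                      return ret;

              /* tracefs is ready; create files here */
              return 0;
      }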

    Link: https://lkml.kernel.org/r/20200712011036.70948-5-richard.weiyang@linux.alibaba.com

    Signed-off-by: Wei Yang
    Signed-off-by: Steven Rostedt (VMware)

    Wei Yang
     

20 Mar, 2020

1 commit

  • When the ring buffer was first created, the iterator followed the normal
    producer/consumer operations where it had both a peek() operation, that just
    returned the event at the current location, and a read(), that would return
    the event at the current location and also increment the iterator such that
    the next peek() or read() will return the next event.

    The only current use of ring_buffer_read() is to move the iterator to
    the next location; nothing actually reads the event it returns. Rename
    this function after its actual use case, ring_buffer_iter_advance(),
    which also adds the "iter" part to the name, making it more meaningful.
    As the timestamp returned by ring_buffer_read() was never used, there's
    no reason the new version should bother returning it, so it also
    becomes a void function.
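
    A hedged sketch of the iterator usage before and after the rename,
    assuming the signatures described above (ring_buffer_iter_peek() shown
    as the companion peek operation):

      struct ring_buffer_event *event;
      u64 ts;

      /* Before: read returned the event and timestamp, then advanced,
       * but callers ignored both return values. */
      event = ring_buffer_read(iter, &ts);

      /* After: peek if the event is needed; advancing is separate and void. */
      event = ring_buffer_iter_peek(iter, &ts);
      ring_buffer_iter_advance(iter);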

    Link: http://lkml.kernel.org/r/20200317213416.018928618@goodmis.org

    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     

14 Jan, 2020

2 commits

  • As there are two struct ring_buffers in the kernel, it causes some
    confusion. The other one is the perf ring buffer. It was agreed that, as
    neither of the ring buffers is generic enough to be used globally, they
    should be renamed as:

    perf's ring_buffer -> perf_buffer
    ftrace's ring_buffer -> trace_buffer

    This implements the changes to the ring buffer that ftrace uses.

    Link: https://lore.kernel.org/r/20191213140531.116b3200@gandalf.local.home

    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     
  • As we are working to remove the generic "ring_buffer" name that is used by
    both tracing and perf, the ring_buffer name for tracing will be renamed to
    trace_buffer, and perf's ring buffer will be renamed to perf_buffer.

    As there already exists a trace_buffer that is used by the trace_arrays,
    it first needs to be renamed to array_buffer.

    Link: https://lore.kernel.org/r/20191213153553.GE20583@krava

    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     

31 Jul, 2019

1 commit

  • We have already tested it earlier, so the second, redundant test should
    be removed. With this change, performance should improve slightly.

    Link: http://lkml.kernel.org/r/20190730140850.7927-1-changbin.du@gmail.com

    Cc: stable@vger.kernel.org
    Fixes: 9cd2992f2d6c ("fgraph: Have set_graph_notrace only affect function_graph tracer")
    Signed-off-by: Changbin Du
    Signed-off-by: Steven Rostedt (VMware)

    Changbin Du
     

07 Feb, 2019

2 commits

  • Don't mix context flags with function duration info.

    Instead of this:

    # tracer: wakeup_rt
    #
    # wakeup_rt latency trace v1.1.5 on 5.0.0-rc1-test+
    # --------------------------------------------------------------------
    # latency: 177 us, #545/545, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:8)
    # -----------------
    # | task: migration/0-11 (uid:0 nice:0 policy:1 rt_prio:99)
    # -----------------
    #
    # _-----=> irqs-off
    # / _----=> need-resched
    # | / _---=> hardirq/softirq
    # || / _--=> preempt-depth
    # ||| /
    # REL TIME CPU TASK/PID |||| DURATION FUNCTION CALLS
    # | | | | |||| | | | | | |
    0 us | 0) -0 | dNh5 | /* 0:120:R + [000] 11: 0:R migration/0 */
    2 us | 0) -0 | dNh5 0.000 us | (null)();
    4 us | 0) -0 | dNh4 | _raw_spin_unlock() {
    4 us | 0) -0 | dNh4 0.304 us | preempt_count_sub();
    5 us | 0) -0 | dNh3 1.063 us | }
    5 us | 0) -0 | dNh3 0.266 us | ttwu_stat();
    6 us | 0) -0 | dNh3 | _raw_spin_unlock_irqrestore() {
    6 us | 0) -0 | dNh3 0.273 us | preempt_count_sub();
    6 us | 0) -0 | dNh2 0.818 us | }

    Show this:

    # tracer: wakeup
    #
    # wakeup latency trace v1.1.5 on 4.20.0+
    # --------------------------------------------------------------------
    # latency: 593 us, #674/674, CPU#0 | (M:desktop VP:0, KP:0, SP:0 HP:0 #P:4)
    # -----------------
    # | task: kworker/0:1H-339 (uid:0 nice:-20 policy:0 rt_prio:0)
    # -----------------
    #
    # _-----=> irqs-off
    # / _----=> need-resched
    # | / _---=> hardirq/softirq
    # || / _--=> preempt-depth
    # ||| /
    # REL TIME CPU TASK/PID |||| DURATION FUNCTION CALLS
    # | | | | |||| | | | | | |
    0 us | 0) -0 | dNs. | | /* 0:120:R + [000] 339:100:R kworker/0:1H */
    3 us | 0) -0 | dNs. | 0.000 us | (null)();
    67 us | 0) -0 | dNs. | 0.721 us | ttwu_stat();
    69 us | 0) -0 | dNs. | 0.607 us | _raw_spin_unlock_irqrestore();
    71 us | 0) -0 | .Ns. | 0.598 us | _raw_spin_lock_irq();
    72 us | 0) -0 | .Ns. | 0.584 us | _raw_spin_lock_irq();
    73 us | 0) -0 | dNs. | + 11.118 us | __next_timer_interrupt();
    75 us | 0) -0 | dNs. | | call_timer_fn() {
    76 us | 0) -0 | dNs. | | delayed_work_timer_fn() {
    76 us | 0) -0 | dNs. | | __queue_work() {
    ...

    Link: http://lkml.kernel.org/r/20190101154614.8887-4-changbin.du@gmail.com

    Signed-off-by: Changbin Du
    Signed-off-by: Steven Rostedt (VMware)

    Changbin Du
     
  • When function_graph is used for latency tracers, a relative timestamp is
    more straightforward than the absolute timestamp that the function
    tracer uses. This change adds relative timestamp support to
    function_graph and applies it to the latency tracers (wakeup and
    irqsoff).

    Instead of:

    # tracer: irqsoff
    #
    # irqsoff latency trace v1.1.5 on 5.0.0-rc1-test
    # --------------------------------------------------------------------
    # latency: 521 us, #1125/1125, CPU#2 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:8)
    # -----------------
    # | task: swapper/2-0 (uid:0 nice:0 policy:0 rt_prio:0)
    # -----------------
    # => started at: __schedule
    # => ended at: _raw_spin_unlock_irq
    #
    #
    # _-----=> irqs-off
    # / _----=> need-resched
    # | / _---=> hardirq/softirq
    # || / _--=> preempt-depth
    # ||| /
    # TIME CPU TASK/PID |||| DURATION FUNCTION CALLS
    # | | | | |||| | | | | | |
    124.974306 | 2) systemd-693 | d..1 0.000 us | __schedule();
    124.974307 | 2) systemd-693 | d..1 | rcu_note_context_switch() {
    124.974308 | 2) systemd-693 | d..1 0.487 us | rcu_preempt_deferred_qs();
    124.974309 | 2) systemd-693 | d..1 0.451 us | rcu_qs();
    124.974310 | 2) systemd-693 | d..1 2.301 us | }
    [..]
    124.974826 | 2) -0 | d..2 | finish_task_switch() {
    124.974826 | 2) -0 | d..2 | _raw_spin_unlock_irq() {
    124.974827 | 2) -0 | d..2 0.000 us | _raw_spin_unlock_irq();
    124.974828 | 2) -0 | d..2 0.000 us | tracer_hardirqs_on();
    -0 2d..2 552us :
    => __schedule
    => schedule_idle
    => do_idle
    => cpu_startup_entry
    => start_secondary
    => secondary_startup_64

    Show:

    # tracer: irqsoff
    #
    # irqsoff latency trace v1.1.5 on 5.0.0-rc1-test+
    # --------------------------------------------------------------------
    # latency: 511 us, #1053/1053, CPU#7 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:8)
    # -----------------
    # | task: swapper/7-0 (uid:0 nice:0 policy:0 rt_prio:0)
    # -----------------
    # => started at: __schedule
    # => ended at: _raw_spin_unlock_irq
    #
    #
    # _-----=> irqs-off
    # / _----=> need-resched
    # | / _---=> hardirq/softirq
    # || / _--=> preempt-depth
    # ||| /
    # REL TIME CPU TASK/PID |||| DURATION FUNCTION CALLS
    # | | | | |||| | | | | | |
    0 us | 7) sshd-1704 | d..1 0.000 us | __schedule();
    1 us | 7) sshd-1704 | d..1 | rcu_note_context_switch() {
    1 us | 7) sshd-1704 | d..1 0.611 us | rcu_preempt_deferred_qs();
    2 us | 7) sshd-1704 | d..1 0.484 us | rcu_qs();
    3 us | 7) sshd-1704 | d..1 2.599 us | }
    [..]
    509 us | 7) -0 | d..2 | finish_task_switch() {
    510 us | 7) -0 | d..2 | _raw_spin_unlock_irq() {
    510 us | 7) -0 | d..2 0.000 us | _raw_spin_unlock_irq();
    512 us | 7) -0 | d..2 0.000 us | tracer_hardirqs_on();
    -0 7d..2 543us :
    => __schedule
    => schedule_idle
    => do_idle
    => cpu_startup_entry
    => start_secondary
    => secondary_startup_64

    Link: http://lkml.kernel.org/r/20190101154614.8887-2-changbin.du@gmail.com

    Signed-off-by: Changbin Du
    Signed-off-by: Steven Rostedt (VMware)

    Changbin Du
     

09 Dec, 2018

4 commits

  • Move the function ftrace_graph_ret_addr() to fgraph.c, as the management
    of the curr_ret_stack is going to change, and all the accesses to
    ret_stack need to be done in fgraph.c.

    Reviewed-by: Joel Fernandes (Google)
    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     
  • Currently, registering the function graph tracer means passing in an
    entry and a return function. We need a way to associate those functions
    together so that the entry can determine whether to run the return hook.
    Having a structure that contains both functions will facilitate
    converting the code to be able to do so.

    This is similar to the way function hooks are enabled (it passes in
    ftrace_ops). Instead of passing in the functions to use, a single structure
    is passed in to the registering function.

    The unregister function is now passed in the fgraph_ops handle. When we
    allow more than one callback to the function graph hooks, this will let the
    system know which one to remove.
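
    A sketch of the new registration style, assuming a struct fgraph_ops
    that bundles the two callbacks (treat the exact field and function
    signatures as assumptions):

      static struct fgraph_ops my_gops = {
              .entryfunc = my_entry,   /* decides whether the return hook runs */
              .retfunc   = my_return,  /* paired return callback */
      };

      register_ftrace_graph(&my_gops);    /* was: pass the two functions */
      /* ... */
      unregister_ftrace_graph(&my_gops);  /* handle says which one to remove */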

    Reviewed-by: Joel Fernandes (Google)
    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     
  • When the function profiler is not configured, the "graph_time" option is
    meaningless, as the function profiler is the only thing that makes use of
    it. Do not expose it if the profiler is not configured.

    Link: http://lkml.kernel.org/r/20181123061133.GA195223@google.com

    Reported-by: Joel Fernandes
    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     
  • The curr_ret_stack is no longer set to a negative value when a function is
    not to be traced by the function graph tracer. Remove the usage of
    FTRACE_NOTRACE_DEPTH, as it is no longer needed.

    Reviewed-by: Joel Fernandes (Google)
    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     

30 Nov, 2018

4 commits

  • In order to make the function graph infrastructure more generic, there
    can not be code specific to the function_graph tracer in the generic
    code. This includes the set_graph_notrace logic, which stops all graph
    calls when a function in set_graph_notrace is hit.

    By using the trace_recursion mask, we can use a bit in the current
    task_struct to implement the notrace code, move the logic out of
    fgraph.c and into trace_functions_graph.c, and keep it affecting only
    the tracer and not all call graph callbacks.
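
    A hedged sketch of the tracer-side check, assuming a recursion bit
    (here called TRACE_GRAPH_NOTRACE_BIT) and the existing
    trace_recursion_set()/trace_recursion_test() helpers:

      /* In the function_graph tracer's entry callback: */
      if (ftrace_graph_notrace_addr(trace->func)) {
              /* Suppress output until this function returns. */
              trace_recursion_set(TRACE_GRAPH_NOTRACE_BIT);
              return 1;   /* keep the return hook so the bit gets cleared */
      }

      if (trace_recursion_test(TRACE_GRAPH_NOTRACE_BIT))
              return 0;   /* inside a notrace'd function: record nothing */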

    Acked-by: Namhyung Kim
    Reviewed-by: Joel Fernandes (Google)
    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     
  • As the function graph infrastructure can be used by things other than
    tracing, moving the code out of trace_functions_graph.c and into its own
    file makes more sense.

    The fgraph.c file will only contain the infrastructure required to hook
    into functions and their returns.

    Reviewed-by: Joel Fernandes (Google)
    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     
  • Commit 588ca1786f2dd ("function_graph: Use new curr_ret_depth to manage
    depth instead of curr_ret_stack") removed a parameter from the call to
    ftrace_push_return_trace(), which made the entire call fit under 80
    characters, but it did not remove the line break. There's no reason to
    break that line up, so make it a single line.

    Link: http://lkml.kernel.org/r/20181122100322.GN2131@hirez.programming.kicks-ass.net

    Reported-by: Peter Zijlstra
    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     
  • The tracefs file set_graph_function is used to function-graph trace only
    the functions listed in that file (or all functions if the file is
    empty). The way this is implemented is that the function graph tracer
    looks at every function; if the current depth is zero and the function
    matches something in the file, it will trace that function. When other
    functions are called from it, the depth will be greater than zero
    (because the original function is at depth zero), and all functions are
    traced while the depth is greater than zero.
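
    A sketch of that depth-based selection (function_in_set_graph() is a
    hypothetical stand-in for the set_graph_function lookup):

      static int entry_filter(struct ftrace_graph_ent *trace)
      {
              if (trace->depth == 0)
                      /* top level: trace only if listed in the file */
                      return function_in_set_graph(trace->func);

              /* depth > 0: assumed to be inside a traced call chain */
              return 1;
      }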

    The issue is that when a function is first entered, and the handler that
    checks this logic is called, the depth is set to zero. If an interrupt
    comes in and a function in the interrupt handler is traced, its depth
    will be greater than zero and it will automatically be traced, even if
    the original function was not. Because the logic only looks at the
    depth, interrupts may be traced when they should not be.

    The recent design change of the function graph tracer to fix other bugs
    caused the depth to be zero for a longer time while the function graph
    callback handler is being called, widening the window for this race.
    The bug was actually there for a long time, but because the race window
    was so small it seldom happened. The Fixes tag below is for the commit
    that widened the race window, because that commit belongs to a series
    that will also help fix the original bug.

    Cc: stable@kernel.org
    Fixes: 39eb456dacb5 ("function_graph: Use new curr_ret_depth to manage depth instead of curr_ret_stack")
    Reported-by: Joe Lawrence
    Tested-by: Joe Lawrence
    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     

28 Nov, 2018

4 commits

  • The function graph profiler uses the ret_stack to store the "subtime",
    which is reused by nested functions and on return. But the current logic
    has the profiler callback called before the ret_stack is updated, so it
    modifies a ret_stack entry that will only be allocated later (it's just
    lucky that the "subtime" is not touched when it is allocated).

    This could also cause a crash if we are at the end of the ret_stack when
    this happens.

    By reversing the order, so that the ret_stack entry is allocated before
    the callbacks attached to the traced function are called, the ret_stack
    entry is no longer used before it is allocated.

    Cc: stable@kernel.org
    Fixes: 03274a3ffb449 ("tracing/fgraph: Adjust fgraph depth before calling trace return callback")
    Reviewed-by: Masami Hiramatsu
    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     
  • In the past, curr_ret_stack had two functions. One was to denote the
    depth of the call graph; the other was to keep track of where on the
    ret_stack the data in use is located. Although they may be slightly
    related, there are two cases where they need to be used differently.

    One case is that it keeps the ret_stack data from being corrupted by an
    interrupt coming in and overwriting data still in use. The other is just
    to know the current depth of the stack.

    The function profiler uses the ret_stack to save a "subtime" variable that
    is part of the data on the ret_stack. If curr_ret_stack is modified too
    early, then this variable can be corrupted.

    The "max_depth" option, when set to 1, will record the first functions going
    into the kernel. To see all top functions (when dealing with timings), the
    depth variable needs to be lowered before calling the return hook. But by
    lowering the curr_ret_stack, it makes the data on the ret_stack still being
    used by the return hook susceptible to being overwritten.

    Now that there are two variables to handle both cases (curr_ret_stack
    and the new curr_ret_depth), each can be updated at the location where
    it handles its case correctly.

    Cc: stable@kernel.org
    Fixes: 03274a3ffb449 ("tracing/fgraph: Adjust fgraph depth before calling trace return callback")
    Reviewed-by: Masami Hiramatsu
    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     
  • Currently, the depth of the ret_stack is determined by the
    curr_ret_stack index. The issue is that there's a race between the
    setting of curr_ret_stack and the calling of the callback attached to
    the return of the function.

    Commit 03274a3ffb44 ("tracing/fgraph: Adjust fgraph depth before calling
    trace return callback") moved the calling of the callback to after the
    setting of the curr_ret_stack, even stating that it was safe to do so, when
    in fact, it was the reason there was a barrier() there (yes, I should have
    commented that barrier()).

    Not only does the curr_ret_stack keep track of the current call graph depth,
    it also keeps the ret_stack content from being overwritten by new data.

    The function profiler uses the "subtime" variable of the ret_stack
    structure, and moving curr_ret_stack allows interrupts to use the same
    structure that is still in use, corrupting the data and breaking the
    profiler.

    To fix this, there needs to be two variables to handle the call stack depth
    and the pointer to where the ret_stack is being used, as they need to change
    at two different locations.

    Cc: stable@kernel.org
    Fixes: 03274a3ffb449 ("tracing/fgraph: Adjust fgraph depth before calling trace return callback")
    Reviewed-by: Masami Hiramatsu
    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     
  • As all architectures now call function_graph_enter() to do the entry work,
    no architecture should ever call ftrace_push_return_trace(). Make it static.

    This is needed to prepare for a fix of a design bug on how the curr_ret_stack
    is used.

    Cc: stable@kernel.org
    Fixes: 03274a3ffb449 ("tracing/fgraph: Adjust fgraph depth before calling trace return callback")
    Reviewed-by: Masami Hiramatsu
    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     

27 Nov, 2018

1 commit

  • Currently all the architectures do basically the same thing in preparing the
    function graph tracer on entry to a function. This code can be pulled into a
    generic location and then this will allow the function graph tracer to be
    fixed, as well as extended.

    Create a new function graph helper, function_graph_enter(), that will
    call the hook function (ftrace_graph_entry) and the shadow stack
    operation (ftrace_push_return_trace), and remove the need for the
    architecture code to manage the shadow stack.
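
    A hedged sketch of what such a helper looks like (argument names and
    exact error handling are assumptions; the point is that the entry hook
    and the shadow-stack push now live in one generic function):

      int function_graph_enter(unsigned long ret, unsigned long func,
                               unsigned long frame_pointer, unsigned long *retp)
      {
              struct ftrace_graph_ent trace;

              trace.func = func;
              trace.depth = current->curr_ret_stack + 1;

              /* Only trace if the calling function expects to. */
              if (!ftrace_graph_entry(&trace))
                      return -EBUSY;

              /* Push the return address onto the shadow stack. */
              return ftrace_push_return_trace(ret, func, frame_pointer, retp);
      }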

    This is needed to prepare for a fix of a design bug on how the curr_ret_stack
    is used.

    Cc: stable@kernel.org
    Fixes: 03274a3ffb449 ("tracing/fgraph: Adjust fgraph depth before calling trace return callback")
    Reviewed-by: Masami Hiramatsu
    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     

04 Jul, 2018

1 commit

  • The function_graph tracer does not show the interrupt return marker for
    the leaf entry. On leaf entries, we see an unbalanced interrupt marker
    (the interrupt was entered, but never left).

    Before:
    1) | SyS_write() {
    1) | __fdget_pos() {
    1) 0.061 us | __fget_light();
    1) 0.289 us | }
    1) | vfs_write() {
    1) 0.049 us | rw_verify_area();
    1) + 15.424 us | __vfs_write();
    1) ==========> |
    1) 6.003 us | smp_apic_timer_interrupt();
    1) 0.055 us | __fsnotify_parent();
    1) 0.073 us | fsnotify();
    1) + 23.665 us | }
    1) + 24.501 us | }

    After:
    0) | SyS_write() {
    0) | __fdget_pos() {
    0) 0.052 us | __fget_light();
    0) 0.328 us | }
    0) | vfs_write() {
    0) 0.057 us | rw_verify_area();
    0) | __vfs_write() {
    0) ==========> |
    0) 8.548 us | smp_apic_timer_interrupt();
    0)
    Signed-off-by: Steven Rostedt (VMware)

    Changbin Du
     

02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boiler plate text.

    This patch is based on work done by Thomas Gleixner, Kate Stewart, and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset
    of the use cases:
    - file had no licensing information in it,
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information.

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to apply to a
    file was done in a spreadsheet of side-by-side results of the output of
    two independent scanners (ScanCode & Windriver) producing SPDX tag:value
    files, created by Philippe Ombredanne. Philippe prepared the base
    worksheet, and did an initial spot review of a few thousand files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging were:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained
      >5 lines of source.
    - File already had some variant of a license header in it (even if <5
      lines).
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

09 Sep, 2017

1 commit

  • First, the number of CPUs can't be negative.

    Second, different signedness leads to suboptimal code in the following
    cases:

    1)
    kmalloc(nr_cpu_ids * sizeof(X));

    "int" has to be sign extended to size_t.

    2)
    while (*pos < nr_cpu_ids)  /* where pos is a loff_t pointer */

    MOVSXD is 1 byte longer than the same MOV.

    Other cases exist as well. Basically, the compiler is told that
    nr_cpu_ids can't be negative, which can't be deduced if it is "int".

    Code savings on allyesconfig kernel: -3KB

    add/remove: 0/0 grow/shrink: 25/264 up/down: 261/-3631 (-3370)
    function old new delta
    coretemp_cpu_online 450 512 +62
    rcu_init_one 1234 1272 +38
    pci_device_probe 374 399 +25

    ...

    pgdat_reclaimable_pages 628 556 -72
    select_fallback_rq 446 369 -77
    task_numa_find_cpu 1923 1807 -116

    Link: http://lkml.kernel.org/r/20170819114959.GA30580@avx2
    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

09 Dec, 2016

2 commits

  • Currently both the wakeup and irqsoff tracers do not handle
    set_graph_notrace well. The ftrace infrastructure will ignore the return
    paths of these functions, leaving them hanging without an end:

    # echo '*spin*' > set_graph_notrace
    # cat trace
    [...]
    _raw_spin_lock() {
    preempt_count_add() {
    do_raw_spin_lock() {
    update_rq_clock();

    Where the '*spin*' functions should have looked like this:

    _raw_spin_lock() {
    preempt_count_add();
    do_raw_spin_lock();
    }
    update_rq_clock();

    Instead, have the wakeup and irqsoff tracers ignore the functions that are
    set by the set_graph_notrace like the function_graph tracer does. Move
    the logic in the function_graph tracer into a header to allow wakeup and
    irqsoff tracers to use it as well.

    Cc: Namhyung Kim
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     
  • Both the wakeup and irqsoff tracers can use the function graph tracer when
    the display-graph option is set. The problem is that they ignore the notrace
    file, and record the entry of functions that would be ignored by the
    function_graph tracer. This causes the trace->depth to be recorded into the
    ring buffer. The set_graph_notrace logic uses a trick of adding a large
    negative number to trace->depth when a graph function is to be ignored.

    On trace output, the graph function uses the depth to record a stack of
    functions. But since the depth is negative, it accesses the array with a
    negative number and causes an out of bounds access that can cause a kernel
    oops or corrupt data.

    Have the print functions handle cases where a tracer still records functions
    even when they are in set_graph_notrace.

    Also add warnings if the depth is below zero before accessing the array.

    Note, the function graph logic will still prevent the return of these
    functions from being recorded, which means that they will be left hanging
    without a return. For example:

    # echo '*spin*' > set_graph_notrace
    # echo 1 > options/display-graph
    # echo wakeup > current_tracer
    # cat trace
    [...]
    _raw_spin_lock() {
    preempt_count_add() {
    do_raw_spin_lock() {
    update_rq_clock();

    Where it should look like:

    _raw_spin_lock() {
    preempt_count_add();
    do_raw_spin_lock();
    }
    update_rq_clock();

    Cc: stable@vger.kernel.org
    Cc: Namhyung Kim
    Fixes: 29ad23b00474 ("ftrace: Add set_graph_notrace filter")
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     

24 Nov, 2016

1 commit

  • The function __buffer_unlock_commit() is called in a few places outside
    of trace.c. But for the most part, it should really be inlined, as it is
    in the hot path of the trace_events. For the callers outside of trace.c,
    create a new function, trace_buffer_unlock_commit_nostack(), as the
    reason for using it was to avoid the stack tracing that
    trace_buffer_unlock_commit() could do.
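
    A sketch of the resulting split (signatures assumed to mirror the
    existing trace_buffer_unlock_commit() naming):

      /* Inside trace.c the hot path keeps the inlined commit: */
      __buffer_unlock_commit(buffer, event);

      /* Callers outside trace.c that must skip stack tracing now use: */
      trace_buffer_unlock_commit_nostack(buffer, event);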

    Link: http://lkml.kernel.org/r/20161121183700.GW26852@two.firstfloor.org

    Reported-by: Andi Kleen
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     

07 Oct, 2016

1 commit

  • Pull tracing updates from Steven Rostedt:
    "This release cycle is rather small. Just a few fixes to tracing.

    The big change is the addition of the hwlat tracer. It not only
    detects SMIs, but also other latency that's caused by the hardware. I
    have detected some latency from large boxes having bus contention"

    * tag 'trace-v4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
    tracing: Call traceoff trigger after event is recorded
    ftrace/scripts: Add helper script to bisect function tracing problem functions
    tracing: Have max_latency be defined for HWLAT_TRACER as well
    tracing: Add NMI tracing in hwlat detector
    tracing: Have hwlat trace migrate across tracing_cpumask CPUs
    tracing: Add documentation for hwlat_detector tracer
    tracing: Added hardware latency tracer
    ftrace: Access ret_stack->subtime only in the function profiler
    function_graph: Handle TRACE_BPUTS in print_graph_comment
    tracing/uprobe: Drop isdigit() check in create_trace_uprobe

    Linus Torvalds
     

02 Sep, 2016

1 commit

  • The subtime is used only for function profiler with function graph
    tracer enabled. Move the definition of subtime under
    CONFIG_FUNCTION_PROFILER to reduce the memory usage. Also move the
    initialization of subtime into the graph entry callback.
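
    A sketch of the resulting layout, assuming the field lives in
    struct ftrace_ret_stack:

      struct ftrace_ret_stack {
              unsigned long ret;
              unsigned long func;
              unsigned long long calltime;
      #ifdef CONFIG_FUNCTION_PROFILER
              /* Only the function profiler reads this; don't pay otherwise. */
              unsigned long long subtime;
      #endif
              /* ... */
      };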

    Link: http://lkml.kernel.org/r/20160831025529.24018-1-namhyung@kernel.org

    Cc: Ingo Molnar
    Cc: Josh Poimboeuf
    Signed-off-by: Namhyung Kim
    Signed-off-by: Steven Rostedt

    Namhyung Kim
     

01 Sep, 2016

1 commit

  • The print_graph_comment() code missed handling TRACE_BPUTS, so messages
    recorded by trace_bputs() are shown with symbol info unnecessarily.

    You can see it with the trace_printk sample code:

    # cd /sys/kernel/tracing/
    # echo sys_sync > set_graph_function
    # echo 1 > options/sym-offset
    # echo function_graph > current_tracer

    Note that the sys_sync filter was there to prevent recording other
    functions, and the sym-offset option was needed since the first message
    was emitted from a module init function, so kallsyms doesn't have the
    symbol and it is omitted from the output.

    # cd ~/build/kernel
    # insmod samples/trace_printk/trace-printk.ko

    # cd -
    # head trace

    Before:

    # tracer: function_graph
    #
    # CPU DURATION FUNCTION CALLS
    # | | | | | | |
    1) | /* 0xffffffffa0002000: This is a static string that will use trace_bputs */
    1) | /* This is a dynamic string that will use trace_puts */
    1) | /* trace_printk_irq_work+0x5/0x7b [trace_printk]: (irq) This is a static string that will use trace_bputs */
    1) | /* (irq) This is a dynamic string that will use trace_puts */
    1) | /* (irq) This is a static string that will use trace_bprintk() */
    1) | /* (irq) This is a dynamic string that will use trace_printk */

    After:

    # tracer: function_graph
    #
    # CPU DURATION FUNCTION CALLS
    # | | | | | | |
    1) | /* This is a static string that will use trace_bputs */
    1) | /* This is a dynamic string that will use trace_puts */
    1) | /* (irq) This is a static string that will use trace_bputs */
    1) | /* (irq) This is a dynamic string that will use trace_puts */
    1) | /* (irq) This is a static string that will use trace_bprintk() */
    1) | /* (irq) This is a dynamic string that will use trace_printk */

    Link: http://lkml.kernel.org/r/20160901024354.13720-1-namhyung@kernel.org

    Signed-off-by: Namhyung Kim
    Signed-off-by: Steven Rostedt

    Namhyung Kim
     

24 Aug, 2016

4 commits

  • When function graph tracing is enabled for a function, ftrace modifies
    the stack by replacing the original return address with the address of a
    hook function (return_to_handler).

    Stack unwinders need a way to get the original return address. Add an
    arch-independent helper function for that named ftrace_graph_ret_addr().

    This adds two variations of the function: one depends on
    HAVE_FUNCTION_GRAPH_RET_ADDR_PTR, and the other relies on an index state
    variable.

    The former is recommended because, in some cases, the latter can cause
    problems when the unwinder skips stack frames. It can get out of sync
    with the ret_stack index and wrong addresses can be reported for the
    stack trace.

    Once all arches have been ported to use
    HAVE_FUNCTION_GRAPH_RET_ADDR_PTR, we can get rid of the distinction.
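
    A hedged sketch of an unwinder calling the helper (signature per the
    description above; the index argument is the state used by the
    fallback variant):

      unsigned long real_addr;
      int graph_idx = 0;    /* state for the index-based fallback */

      /* For each return address found while walking the stack: */
      real_addr = ftrace_graph_ret_addr(task, &graph_idx, addr, addr_ptr);

      /* With HAVE_FUNCTION_GRAPH_RET_ADDR_PTR, addr_ptr (where the return
       * address lives on the stack) is matched against the saved pointer,
       * so skipped frames cannot desynchronize the lookup. */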

    Signed-off-by: Josh Poimboeuf
    Acked-by: Steven Rostedt
    Cc: Andy Lutomirski
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Byungchul Park
    Cc: Denys Vlasenko
    Cc: Frederic Weisbecker
    Cc: H. Peter Anvin
    Cc: Kees Cook
    Cc: Linus Torvalds
    Cc: Nilay Vaish
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/36bd90f762fc5e5af3929e3797a68a64906421cf.1471607358.git.jpoimboe@redhat.com
    Signed-off-by: Ingo Molnar

    Josh Poimboeuf
     
  • Storing this value will help prevent unwinders from getting out of sync
    with the function graph tracer ret_stack. Now instead of needing a
    stateful iterator, they can compare the return address pointer to find
    the right ret_stack entry.

    Note that an array of 50 ftrace_ret_stack structs is allocated for every
    task. So when an arch implements this, it will add either 200 or 400
    bytes of memory usage per task (depending on whether it's a 32-bit or
    64-bit platform).

    Signed-off-by: Josh Poimboeuf
    Acked-by: Steven Rostedt
    Cc: Andy Lutomirski
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Byungchul Park
    Cc: Denys Vlasenko
    Cc: Frederic Weisbecker
    Cc: H. Peter Anvin
    Cc: Kees Cook
    Cc: Linus Torvalds
    Cc: Nilay Vaish
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/a95cfcc39e8f26b89a430c56926af0bb217bc0a1.1471607358.git.jpoimboe@redhat.com
    Signed-off-by: Ingo Molnar

    Josh Poimboeuf
     
  • This saves some memory when HAVE_FUNCTION_GRAPH_FP_TEST isn't defined.
    On x86_64 with newer versions of gcc which have -mfentry, it saves 400
    bytes per task.

    Signed-off-by: Josh Poimboeuf
    Acked-by: Steven Rostedt
    Cc: Andy Lutomirski
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Byungchul Park
    Cc: Denys Vlasenko
    Cc: Frederic Weisbecker
    Cc: H. Peter Anvin
    Cc: Kees Cook
    Cc: Linus Torvalds
    Cc: Nilay Vaish
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/5c7747d9ea7b5cb47ef0a8ce8a6cea6bf7aa94bf.1471607358.git.jpoimboe@redhat.com
    Signed-off-by: Ingo Molnar

    Josh Poimboeuf
     
  • Make HAVE_FUNCTION_GRAPH_FP_TEST a normal define, independent of
    kconfig. This removes some config file pollution and simplifies the
    checking for the fp test.

    Suggested-by: Steven Rostedt
    Signed-off-by: Josh Poimboeuf
    Acked-by: Steven Rostedt
    Cc: Andy Lutomirski
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Byungchul Park
    Cc: Denys Vlasenko
    Cc: Frederic Weisbecker
    Cc: H. Peter Anvin
    Cc: Kees Cook
    Cc: Linus Torvalds
    Cc: Nilay Vaish
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/2c4e5f05054d6d367f702fd153af7a0109dd5c81.1471607358.git.jpoimboe@redhat.com
    Signed-off-by: Ingo Molnar

    Josh Poimboeuf
     

28 Jun, 2016

1 commit

  • The function graph tracer currently ignores filters if tracing_thresh is
    set. For example, even if set_ftrace_pid is set, it is ignored when
    tracing_thresh is set, resulting in all processes being traced.

    To fix this, we reuse the same entry function as when tracing_thresh is
    not set, and do everything as in the regular case except for writing the
    function entry to the ring buffer.

    Link: http://lkml.kernel.org/r/1466228694-2677-1-git-send-email-agnel.joel@gmail.com

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Signed-off-by: Joel Fernandes
    Signed-off-by: Steven Rostedt

    Joel Fernandes
     

26 Mar, 2016

1 commit

  • KASAN needs to know whether the allocation happens in an IRQ handler.
    This lets us strip everything below the IRQ entry point to reduce the
    number of unique stack traces needed to be stored.

    Move the definition of __irq_entry to <linux/interrupt.h> so that the
    users don't need to pull in <linux/ftrace.h>. Also introduce the
    __softirq_entry macro which is similar to __irq_entry, but puts the
    corresponding functions into the .softirqentry.text section.
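
    A sketch of how the new macro is meant to be used (the handler name is
    illustrative):

      /* Placed in .softirqentry.text so KASAN can spot the boundary. */
      static void __softirq_entry my_softirq_action(struct softirq_action *h)
      {
              /* ... */
      }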

    Signed-off-by: Alexander Potapenko
    Acked-by: Steven Rostedt
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: David Rientjes
    Cc: Joonsoo Kim
    Cc: Andrey Konovalov
    Cc: Dmitry Vyukov
    Cc: Andrey Ryabinin
    Cc: Konstantin Serebryany
    Cc: Dmitry Chernenkov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexander Potapenko
     

23 Mar, 2016

1 commit

  • Use the more common logging method with the eventual goal of removing
    pr_warning altogether.

    Miscellanea:

    - Realign arguments
    - Coalesce formats
    - Add missing space between a few coalesced formats

    Signed-off-by: Joe Perches
    Acked-by: Rafael J. Wysocki [kernel/power/suspend.c]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     

08 Nov, 2015

1 commit

  • Since the ring buffer is lockless, there is no need to disable ftrace
    per CPU, and nothing does so anymore: after commit 68179686ac67cb
    ("tracing: Remove ftrace_disable/enable_cpu()"), ftrace_cpu_disabled
    stays the same after initialization; nothing changes it.
    ftrace_cpu_disabled shouldn't be used by any external module since it
    disables only function and graph_function tracers but not any other
    tracer.

    Link: http://lkml.kernel.org/r/1446836846-22239-1-git-send-email-0x7f454c46@gmail.com

    Signed-off-by: Dmitry Safonov
    Signed-off-by: Steven Rostedt

    Dmitry Safonov
     

01 Oct, 2015

2 commits

  • In preparation to make trace options per instance, the global trace_flags
    needs to be moved from being a global variable to a field within the trace
    instance trace_array structure.

    There's still more work to do, as there are some functions that use
    trace_flags without passing in a way to get to the current_trace array. For
    those, the global_trace is used directly (from trace.c). This includes
    setting and clearing the trace_flags. This means that when a new instance is
    created, it just gets the trace_flags of the global_trace and will not be
    able to modify them. Depending on the functions that have access to the
    trace_array, the flags of an instance may not affect parts of its trace,
    where the global_trace is used. These will be fixed in future changes.
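
    A sketch of the move, assuming the flags simply become a field of
    struct trace_array:

      struct trace_array {
              /* ... */
              unsigned int trace_flags;   /* was a single global variable */
      };

      /* Checks change from (trace_flags & X) to the per-instance form: */
      if (tr->trace_flags & TRACE_ITER_PRINTK)
              /* ... */;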

    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     
  • The sleep-time and graph-time options are only for the function graph
    tracer and are not used by anything else. As tracer options are now
    visible even when the tracer is not activated, it's better to move the
    function graph specific tracer options into the function graph tracer.

    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)