13 May, 2015

1 commit

  • Critical tracepoint hooks should never call anything that takes a lock,
    so they are unable to call getrawmonotonic() or ktime_get().

    Export the rest of the tracing clock functions so they can be used
    in tracepoint hooks.

    Background: We have a customer who loads their own module and
    registers a tracepoint hook on sched_wakeup. They were using
    ktime_get() as a time source, but it takes a seqlock, which caused
    a deadlock.
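
    A minimal sketch of such a hook (an editor's illustration, not the
    customer's code; the module and per-cpu variable are hypothetical,
    and the probe signature assumes the sched_wakeup prototype of this
    era, which still carried a 'success' argument):

      #include <linux/module.h>
      #include <linux/percpu.h>
      #include <linux/sched.h>
      #include <linux/trace_clock.h>
      #include <trace/events/sched.h>

      static DEFINE_PER_CPU(u64, last_wakeup_ts);

      static void probe_wakeup(void *data, struct task_struct *p, int success)
      {
              /* ktime_get() takes a seqlock and can deadlock here;
               * trace_clock_local() is lock-free and safe in this context. */
              this_cpu_write(last_wakeup_ts, trace_clock_local());
      }

      static int __init wakeup_ts_init(void)
      {
              return register_trace_sched_wakeup(probe_wakeup, NULL);
      }

      static void __exit wakeup_ts_exit(void)
      {
              unregister_trace_sched_wakeup(probe_wakeup, NULL);
              tracepoint_synchronize_unregister();
      }

      module_init(wakeup_ts_init);
      module_exit(wakeup_ts_exit);
      MODULE_LICENSE("GPL");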

    Link: http://lkml.kernel.org/r/1430406624-22609-1-git-send-email-jsnitsel@redhat.com

    Signed-off-by: Jerry Snitselaar
    Signed-off-by: Steven Rostedt

    Jerry Snitselaar
     

21 Jul, 2014

1 commit

  • The "uptime" trace clock added in:

    commit 8aacf017b065a805d27467843490c976835eb4a5
    tracing: Add "uptime" trace clock that uses jiffies

    has wraparound problems when the system has been up more
    than 1 hour 11 minutes and 34 seconds. It converts jiffies
    to nanoseconds using:
    (u64)jiffies_to_usecs(jiffy) * 1000ULL
    but since jiffies_to_usecs() only returns a 32-bit value, it
    truncates at 2^32 microseconds. An additional problem on 32-bit
    systems is that the argument is "unsigned long", so fixing the
    return value only helps until 2^32 jiffies (49.7 days on a HZ=1000
    system).

    Avoid these problems by using jiffies_64 as our basis, and not
    converting to nanoseconds (we do convert to clock_t because the
    user-facing API must not depend on the kernel's internal HZ value).
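
    For clarity, a before/after sketch of the conversion (not verbatim
    from the patch):

      /* before: jiffies_to_usecs() returns only 32 bits, so this wraps
       * after 2^32 us; on 32-bit, the 'unsigned long' argument wraps
       * after 2^32 jiffies as well */
      u64 ns = (u64)jiffies_to_usecs(jiffy) * 1000ULL;

      /* after: 64-bit end to end, reported in clock_t so userspace
       * never depends on the kernel's HZ value */
      u64 uptime = jiffies_64_to_clock_t(jiffies_64 - INITIAL_JIFFIES);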

    Link: http://lkml.kernel.org/p/99d63c5bfe9b320a3b428d773825a37095bf6a51.1405708254.git.tony.luck@intel.com

    Cc: stable@vger.kernel.org # 3.10+
    Fixes: 8aacf017b065 "tracing: Add "uptime" trace clock that uses jiffies"
    Signed-off-by: Tony Luck
    Signed-off-by: Steven Rostedt

    Tony Luck
     

15 Mar, 2013

1 commit

  • Add a simple trace clock called "uptime" for those interested in
    the uptime of the trace. It uses jiffies because that is the safest
    method: other uptime clocks take seqlocks, which could deadlock if
    taken from an event or function tracer.
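
    A sketch of the implementation, reconstructed from the conversion
    expression quoted in the 21 Jul, 2014 entry above (where its
    wraparound was later fixed):

      u64 notrace trace_clock_jiffies(void)
      {
              u64 jiffy = jiffies - INITIAL_JIFFIES;

              /* return nsecs; reading jiffies never takes a lock, so
               * this is safe from any tracer context */
              return (u64)jiffies_to_usecs(jiffy) * 1000ULL;
      }

    As with the other clocks, echo uptime > /debug/tracing/trace_clock
    selects it.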

    Requested-by: Mauro Carvalho Chehab
    Cc: Thomas Gleixner
    Cc: Frederic Weisbecker
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     

20 Feb, 2013

1 commit

  • Pull perf changes from Ingo Molnar:
    "There are lots of improvements, the biggest changes are:

    Main kernel side changes:

    - Improve uprobes performance by adding 'pre-filtering' support, by
    Oleg Nesterov.

    - Make some POWER7 events available in sysfs, equivalent to what was
    done on x86, from Sukadev Bhattiprolu.

    - tracing updates by Steve Rostedt - mostly misc fixes and smaller
    improvements.

    - Use perf/event tracing to report PCI Express advanced errors, by
    Tony Luck.

    - Enable northbridge performance counters on AMD family 15h, by Jacob
    Shin.

    - This tracing commit:

    tracing: Remove the extra 4 bytes of padding in events

    changes the ABI. All involved parties (PowerTop in particular)
    seem to agree that it's safe to do now with the introduction of
    libtraceevent, but the devil is in the details ...

    Main tooling side changes:

    - Add 'event group view', from Namhyung Kim:

    To use it, 'perf record' should group events when recording; 'perf
    report' then parses the saved group relation from the file header
    and prints the events together when the --group option is provided.
    You can use the 'perf evlist' command to see event group
    information:

    $ perf record -e '{ref-cycles,cycles}' noploop 1
    [ perf record: Woken up 2 times to write data ]
    [ perf record: Captured and wrote 0.385 MB perf.data (~16807 samples) ]

    $ perf evlist --group
    {ref-cycles,cycles}

    With this example, default perf report will show you each event
    separately.

    You can use --group option to enable event group view:

    $ perf report --group
    ...
    # group: {ref-cycles,cycles}
    # ========
    # Samples: 7K of event 'anon group { ref-cycles, cycles }'
    # Event count (approx.): 6876107743
    #
    # Overhead          Command  Shared Object      Symbol
    # ................  .......  .................  ..........................
        99.84%  99.76%  noploop  noploop            [.] main
         0.07%   0.00%  noploop  ld-2.15.so         [.] strcmp
         0.03%   0.00%  noploop  [kernel.kallsyms]  [k] timerqueue_del
         0.03%   0.03%  noploop  [kernel.kallsyms]  [k] sched_clock_cpu
         0.02%   0.00%  noploop  [kernel.kallsyms]  [k] account_user_time
         0.01%   0.00%  noploop  [kernel.kallsyms]  [k] __alloc_pages_nodemask
         0.00%   0.00%  noploop  [kernel.kallsyms]  [k] native_write_msr_safe
         0.00%   0.11%  noploop  [kernel.kallsyms]  [k] _raw_spin_lock
         0.00%   0.06%  noploop  [kernel.kallsyms]  [k] find_get_page
         0.00%   0.02%  noploop  [kernel.kallsyms]  [k] rcu_check_callbacks
         0.00%   0.02%  noploop  [kernel.kallsyms]  [k] __current_kernel_time

    As you can see, the Overhead column now contains both ref-cycles
    and cycles, and the header line also shows the group information:
    'anon group { ref-cycles, cycles }'. The output is sorted by the
    period of the group leader first.

    - Initial GTK+ annotate browser, from Namhyung Kim.

    - Add option for runtime switching perf data file in perf report,
    just press 's' and a menu with the valid files found in the current
    directory will be presented, from Feng Tang.

    - Add support to display whole group data for raw columns, from Jiri
    Olsa.

    - Add per processor socket count aggregation in perf stat, from
    Stephane Eranian.

    - Add interval printing in 'perf stat', from Stephane Eranian.

    - 'perf test' improvements

    - Add support for wildcards in tracepoint system name, from Jiri
    Olsa.

    - Add anonymous huge page recognition, from Joshua Zhu.

    - perf build-id cache now can show DSOs present in a perf.data file
    that are not in the cache, to integrate with build-id servers being
    put in place by organizations such as Fedora.

    - perf top now shares more of the evsel config/creation routines with
    'record', paving the way for further integration like 'top'
    snapshots, etc.

    - perf top now supports DWARF callchains.

    - Fix mmap limitations on 32-bit, fix from David Miller.

    - 'perf bench numa mem' NUMA performance measurement suite

    - ... and lots of fixes, performance improvements, cleanups and other
    improvements I failed to list - see the shortlog and git log for
    details."

    * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (270 commits)
    perf/x86/amd: Enable northbridge performance counters on AMD family 15h
    perf/hwbp: Fix cleanup in case of kzalloc failure
    perf tools: Fix build with bison 2.3 and older.
    perf tools: Limit unwind support to x86 archs
    perf annotate: Make it to be able to skip unannotatable symbols
    perf gtk/annotate: Fail early if it can't annotate
    perf gtk/annotate: Show source lines with gray color
    perf gtk/annotate: Support multiple event annotation
    perf ui/gtk: Implement basic GTK2 annotation browser
    perf annotate: Fix warning message on a missing vmlinux
    perf buildid-cache: Add --update option
    uprobes/perf: Avoid uprobe_apply() whenever possible
    uprobes/perf: Teach trace_uprobe/perf code to use UPROBE_HANDLER_REMOVE
    uprobes/perf: Teach trace_uprobe/perf code to pre-filter
    uprobes/perf: Teach trace_uprobe/perf code to track the active perf_event's
    uprobes: Introduce uprobe_apply()
    perf: Introduce hw_perf_event->tp_target and ->tp_list
    uprobes/perf: Always increment trace_uprobe->nhit
    uprobes/tracing: Kill uprobe_trace_consumer, embed uprobe_consumer into trace_uprobe
    uprobes/tracing: Introduce is_trace_uprobe_enabled()
    ...

    Linus Torvalds
     

31 Jan, 2013

1 commit

  • For systems with an unstable sched_clock, all cpu_clock() does is
    enable/disable local irqs around the call to sched_clock_cpu(); on
    systems with a stable sched_clock, the two are the same.

    trace_clock_global() already disables interrupts, so it can call
    sched_clock_cpu() directly.
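
    A simplified sketch of the resulting code in trace_clock_global():

      unsigned long flags;
      int this_cpu;
      u64 now;

      local_irq_save(flags);              /* irqs are off from here on */

      this_cpu = raw_smp_processor_id();
      now = sched_clock_cpu(this_cpu);    /* was: cpu_clock(this_cpu) */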

    Link: http://lkml.kernel.org/r/1356576585-28782-2-git-send-email-namhyung@kernel.org

    Signed-off-by: Namhyung Kim
    Signed-off-by: Steven Rostedt

    Namhyung Kim
     

09 Jan, 2013

1 commit

  • The rcutorture tests need to be able to trace the time of the
    beginning of an RCU read-side critical section, and thus need
    access to trace_clock_local(). This commit therefore adds the
    needed EXPORT_SYMBOL_GPL().
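
    The change itself is a one-line export; a sketch of the modular use
    it enables (the call site shown here is illustrative):

      /* kernel/trace/trace_clock.c */
      EXPORT_SYMBOL_GPL(trace_clock_local);

      /* in a module such as rcutorture (illustrative) */
      u64 t0 = trace_clock_local();   /* lock-free timestamp */
      rcu_read_lock();                /* ... read-side critical section */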

    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Paul E. McKenney
     

19 Sep, 2011

1 commit

  • When debugging tight race conditions, it can be helpful to have a
    synchronized tracing method. Although in most cases the global
    clock provides this functionality, when exact timing is not the
    issue it is more reassuring to know the precise order in which
    events occurred.

    Instead of using a clock, add a "counter" that is simply an incrementing
    atomic 64bit counter that orders the events as they are perceived to
    happen.

    trace_clock_counter() comes from Peter Zijlstra's attempt to convert
    trace_clock_global() into such a counter. I took Peter's counter
    code, made it trace_clock_counter() instead, and added it to the
    choice of clocks. Just echo counter > /debug/tracing/trace_clock to
    activate it.
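
    The resulting clock is tiny; a sketch matching the shape of the
    code in kernel/trace/trace_clock.c:

      static atomic64_t trace_counter;

      /*
       * Not a time source at all: a global, strictly increasing
       * counter, so concurrent events get a total order.
       */
      u64 notrace trace_clock_counter(void)
      {
              return atomic64_add_return(1, &trace_counter);
      }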

    Requested-by: Thomas Gleixner
    Requested-by: Peter Zijlstra
    Reviewed-By: Valdis Kletnieks
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     

07 Aug, 2010

1 commit

  • Merge branch 'sched-core-for-linus' of
    git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

    * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (27 commits)
    sched: Use correct macro to display sched_child_runs_first in /proc/sched_debug
    sched: No need for bootmem special cases
    sched: Revert nohz_ratelimit() for now
    sched: Reduce update_group_power() calls
    sched: Update rq->clock for nohz balanced cpus
    sched: Fix spelling of sibling
    sched, cpuset: Drop __cpuexit from cpu hotplug callbacks
    sched: Fix the racy usage of thread_group_cputimer() in fastpath_timer_check()
    sched: run_posix_cpu_timers: Don't check ->exit_state, use lock_task_sighand()
    sched: thread_group_cputime: Simplify, document the "alive" check
    sched: Remove the obsolete exit_state/signal hacks
    sched: task_tick_rt: Remove the obsolete ->signal != NULL check
    sched: __sched_setscheduler: Read the RLIMIT_RTPRIO value lockless
    sched: Fix comments to make them DocBook happy
    sched: Fix fix_small_capacity
    powerpc: Exclude arch_sd_sibiling_asym_packing() on UP
    powerpc: Enable asymmetric SMT scheduling on POWER7
    sched: Add asymmetric group packing option for sibling domain
    sched: Fix capacity calculations for SMT4
    sched: Change nohz idle load balancing logic to push model
    ...

    Linus Torvalds
     

09 Jun, 2010

1 commit

  • For people who otherwise get to write: cpu_clock(smp_processor_id()),
    there is now: local_clock().

    Also, as per suggestion from Andrew, provide some documentation on
    the various clock interfaces, and minimize the unsigned long long vs
    u64 mess.
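
    In other words (sketch):

      u64 t;

      t = cpu_clock(smp_processor_id());   /* before */
      t = local_clock();                   /* after: same semantics */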

    Signed-off-by: Peter Zijlstra
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Jens Axboe
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

04 Jun, 2010

1 commit

  • The ftrace_preempt_disable/enable functions were added to address a
    recursive race caused by the function tracer. The function tracer
    traces all functions, which makes it easily susceptible to
    recursion. One area was preempt_enable(): this would call the
    scheduler, and the scheduler would call the function tracer and
    loop. (Or so it was thought.)

    ftrace_preempt_disable/enable protected against recursion inside
    the scheduler by storing the NEED_RESCHED flag: if the flag was set
    before ftrace_preempt_disable(), then ftrace_preempt_enable() would
    not call schedule(), on the assumption that the task would already
    have scheduled by then unless it was already inside the scheduler.
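
    A sketch of the (since removed) helpers, simplified from
    kernel/trace/trace.h:

      static inline int ftrace_preempt_disable(void)
      {
              int resched = need_resched();   /* remember NEED_RESCHED */

              preempt_disable_notrace();
              return resched;
      }

      static inline void ftrace_preempt_enable(int resched)
      {
              if (resched)    /* set before the section: skip the check */
                      preempt_enable_no_resched_notrace();
              else
                      preempt_enable_notrace();
      }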

    This worked fine except in the case of SMP, where another task
    could set the NEED_RESCHED flag for a task on another CPU, and then
    kick off an IPI to trigger it. This could cause the NEED_RESCHED
    flag to be saved at ftrace_preempt_disable() but the IPI to arrive
    in the preempt-disabled section. The ftrace_preempt_enable() would
    then not call the scheduler, because the flag was already set
    before entering the section.

    This bug caused a missed preemption check, resulting in higher
    latencies.

    Investigating further, I found that the recursion caused by the
    function tracer was not due to schedule(), but due to
    preempt_schedule(). Now that preempt_schedule() is completely
    annotated with notrace, the recursion is no longer an issue.

    Reported-by: Thomas Gleixner
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     

30 Mar, 2010

1 commit

  • # echo 1 > events/enable
    # echo global > trace_clock

    ------------[ cut here ]------------
    WARNING: at kernel/lockdep.c:3162 check_flags+0xb2/0x190()
    ...
    ---[ end trace 3f86734a89416623 ]---
    possible reason: unannotated irqs-on.
    ...

    There's no reason to use the raw_local_irq_save() in trace_clock_global.
    The local_irq_save() version is fine, and does not cause the bug in lockdep.
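
    The fix is a direct substitution (sketch); the raw_ variants bypass
    lockdep's irq-state tracking, hence the "unannotated irqs-on"
    warning:

      local_irq_save(flags);       /* was: raw_local_irq_save(flags) */
      /* ... clock computation ... */
      local_irq_restore(flags);    /* was: raw_local_irq_restore(flags) */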

    Acked-by: Peter Zijlstra
    Signed-off-by: Li Zefan
    LKML-Reference:
    Signed-off-by: Steven Rostedt

    Li Zefan
     

01 Mar, 2010

1 commit

  • trace_clock.c includes spinlock.h, which ends up including
    asm/system.h, which in turn includes linux/irqflags.h on x86.

    So the definition of raw_local_irq_save happens to be covered
    there, but this is not the case on parisc:

    tip/kernel/trace/trace_clock.c:86: error: implicit declaration of function 'raw_local_irq_save'
    tip/kernel/trace/trace_clock.c:112: error: implicit declaration of function 'raw_local_irq_restore'

    We need to include linux/irqflags.h directly from trace_clock.c
    to avoid such build errors.
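
    That is, at the top of the file (sketch):

      #include <linux/irqflags.h>   /* raw_local_irq_save()/restore() on
                                     * every architecture, not just x86 */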

    Signed-off-by: Frederic Weisbecker
    Cc: Steven Rostedt
    Cc: Robert Richter
    Cc: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     

12 Nov, 2009

1 commit

  • Disabling interrupts in trace_clock_local takes quite a performance
    hit to the recording of traces. Using perf top we see:

    ------------------------------------------------------------------------------
    PerfTop: 244 irqs/sec kernel:100.0% [1000Hz cpu-clock-msecs], (all, 4 CPUs)
    ------------------------------------------------------------------------------

    samples   pcnt  kernel function
    _______  _____  _______________

    2842.00 - 40.4% : trace_clock_local
    1043.00 - 14.8% : rb_reserve_next_event
     784.00 - 11.1% : ring_buffer_lock_reserve
     600.00 -  8.5% : __rb_reserve_next
     579.00 -  8.2% : rb_end_commit
     440.00 -  6.3% : ring_buffer_unlock_commit
     290.00 -  4.1% : ring_buffer_producer_thread [ring_buffer_benchmark]
     155.00 -  2.2% : debug_smp_processor_id
     117.00 -  1.7% : trace_recursive_unlock
     103.00 -  1.5% : ring_buffer_event_data
      28.00 -  0.4% : do_gettimeofday
      22.00 -  0.3% : _spin_unlock_irq
      14.00 -  0.2% : native_read_tsc
      11.00 -  0.2% : getnstimeofday

    Where trace_clock_local is 40% of the tracing, and the time for recording
    a trace according to ring_buffer_benchmark is 210ns. After converting
    the interrupts to preemption disabling we have from perf top:

    ------------------------------------------------------------------------------
    PerfTop: 1084 irqs/sec kernel:99.9% [1000Hz cpu-clock-msecs], (all, 4 CPUs)
    ------------------------------------------------------------------------------

    samples   pcnt  kernel function
    _______  _____  _______________

    1277.00 - 16.8% : native_read_tsc
    1148.00 - 15.1% : rb_reserve_next_event
     896.00 - 11.8% : ring_buffer_lock_reserve
     688.00 -  9.1% : __rb_reserve_next
     664.00 -  8.8% : rb_end_commit
     563.00 -  7.4% : ring_buffer_unlock_commit
     508.00 -  6.7% : _spin_unlock_irq
     365.00 -  4.8% : debug_smp_processor_id
     321.00 -  4.2% : trace_clock_local
     303.00 -  4.0% : ring_buffer_producer_thread [ring_buffer_benchmark]
     273.00 -  3.6% : native_sched_clock
     122.00 -  1.6% : trace_recursive_unlock
     113.00 -  1.5% : sched_clock
     101.00 -  1.3% : ring_buffer_event_data
      53.00 -  0.7% : tick_nohz_stop_sched_tick

    Where trace_clock_local drops from 40% to only 4% of the total
    time. The trace time also goes from 210ns down to 179ns (a 31ns
    saving).

    I talked with Peter Zijlstra about the impact that sched_clock may have
    without having interrupts disabled, and he told me that if a timer interrupt
    comes in, sched_clock may report a wrong time.

    Balancing a rarely incorrect timestamp against a 15% performance
    boost, I'll take the performance boost.
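
    A sketch of the converted function, simplified from
    kernel/trace/trace_clock.c:

      u64 notrace trace_clock_local(void)
      {
              u64 clock;

              /*
               * was: raw_local_irq_save()/restore() around sched_clock();
               * disabling preemption is much cheaper, at the cost of a
               * rare skewed timestamp if a timer interrupt lands here.
               */
              preempt_disable_notrace();
              clock = sched_clock();
              preempt_enable_notrace();

              return clock;
      }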

    Acked-by: Peter Zijlstra
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     

23 Mar, 2009

1 commit

  • Impact: cleanup.

    This patch fixes the following sparse warnings:

    kernel/trace/trace.c:385:9: warning: symbol 'trace_seq_to_buffer' was
    not declared. Should it be static?

    kernel/trace/trace_clock.c:29:13: warning: symbol 'trace_clock_local'
    was not declared. Should it be static?

    kernel/trace/trace_clock.c:54:13: warning: symbol 'trace_clock' was not
    declared. Should it be static?

    kernel/trace/trace_clock.c:74:13: warning: symbol 'trace_clock_global'
    was not declared. Should it be static?
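
    One conventional fix for this class of warning (assumed here, not
    quoted from the patch) is to let each file see its own prototypes,
    or to mark file-local symbols static:

      /* kernel/trace/trace_clock.c: pull in the declarations so sparse
       * sees matching prototypes for the non-static clock functions */
      #include <linux/trace_clock.h>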

    Signed-off-by: Dmitri Vorobiev
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Dmitri Vorobiev
     

11 Mar, 2009

1 commit

  • Using the function_graph tracer in recent kernels generates a spew of
    preemption BUGs. Fix this by not requiring trace_clock_local() users
    to disable preemption themselves.
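
    A sketch of the shape of the fix (the irq disabling here later
    became preemption disabling, per the 12 Nov, 2009 entry above):

      u64 notrace trace_clock_local(void)
      {
              unsigned long flags;
              u64 clock;

              raw_local_irq_save(flags);   /* callers no longer need to
                                            * disable preemption */
              clock = sched_clock();
              raw_local_irq_restore(flags);

              return clock;
      }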

    Signed-off-by: Peter Zijlstra
    Cc: Steven Rostedt
    Cc: Frederic Weisbecker
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

27 Feb, 2009

1 commit

  • Impact: implement new tracing timestamp APIs

    Add three trace clock variants, with differing scalability/precision
    tradeoffs:

    - local: CPU-local trace clock
    - medium: scalable global clock with some jitter
    - global: globally monotonic, serialized clock

    Make the ring-buffer use the local trace clock internally.
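
    The resulting interface, as declared in include/linux/trace_clock.h
    (sketch):

      extern u64 notrace trace_clock_local(void);   /* fastest: CPU-local,
                                                     * may drift across CPUs */
      extern u64 notrace trace_clock(void);         /* "medium": scalable,
                                                     * small jitter */
      extern u64 notrace trace_clock_global(void);  /* monotonic across
                                                     * CPUs, serialized */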

    Acked-by: Peter Zijlstra
    Acked-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Ingo Molnar