03 Mar, 2016
7 commits
-
Move the code that finalizes 'perf.data' into record__finish_output(). It will
be used by following commits to split output into multiple files.
Signed-off-by: He Kuang
Cc: Alexei Starovoitov
Cc: He Kuang
Cc: Jiri Olsa
Cc: Li Zefan
Cc: Masami Hiramatsu
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Zefan Li
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456479154-136027-23-git-send-email-wangnan0@huawei.com
Signed-off-by: Wang Nan
Signed-off-by: Arnaldo Carvalho de Melo
-
Create record__synthesize(). It can be used to create tracking events
for each perf.data file once perf supports splitting output into multiple
files.
Signed-off-by: He Kuang
Cc: Alexei Starovoitov
Cc: He Kuang
Cc: Jiri Olsa
Cc: Li Zefan
Cc: Masami Hiramatsu
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Zefan Li
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456479154-136027-20-git-send-email-wangnan0@huawei.com
Signed-off-by: Wang Nan
Signed-off-by: Arnaldo Carvalho de Melo
-
Commits in a BPF patchkit will extract the kernel and module synthesizing
code into a separate function and call it multiple times. This patch
replaces 'if (err < 0)' with WARN_ONCE, which makes sure the error message
is shown only once.
Signed-off-by: Wang Nan
Cc: Alexei Starovoitov
Cc: He Kuang
Cc: Jiri Olsa
Cc: Li Zefan
Cc: Masami Hiramatsu
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Zefan Li
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456479154-136027-19-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo
-
After babeltrace commit 5cec03e402aa ("ir: copy variants and sequences
when setting a field path"), 'perf data convert' produces incorrect results
if there is BPF output data. For example:

# perf data convert --to-ctf ./out.ctf
# babeltrace ./out.ctf
[10:44:31.186045346] (+?.?????????) evt: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF810E7DD1, perf_tid = 23819, perf_pid = 23819, perf_id = 518, raw_len = 3, raw_data = [ [0] = 0xC028E32F, [1] = 0x815D0100, [2] = 0x1000000 ] }
[10:44:31.286101003] (+0.100055657) evt: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF8105B609, perf_tid = 23819, perf_pid = 23819, perf_id = 518, raw_len = 3, raw_data = [ [0] = 0x35D9F1EB, [1] = 0x15D81, [2] = 0x2 ] }

The expected result of the first sample should be:

raw_data = [ [0] = 0x2FE328C0, [1] = 0x15D81, [2] = 0x1 ] }

however, 'perf data convert' outputs big-endian values to the resulting CTF
file.

The reason is an internal change (or a bug?) in babeltrace.
Before this patch, at the first add_bpf_output_values(), the byte order of
all integer types is uncertain (it is 0, neither 1234 (le) nor 4321 (be)).
It would be fixed by:

perf_evlist__deliver_sample
-> process_sample_event
-> ctf_stream
...
->bt_ctf_trace_add_stream_class
->bt_ctf_field_type_structure_set_byte_order
->bt_ctf_field_type_integer_set_byte_order

while creating the stream.

However, the babeltrace commit mentioned above duplicates the types in a
sequence, to prevent potential conflicts, in the following call stack and
links the newly allocated type into the 'raw_data' sequence:

perf_evlist__deliver_sample
-> process_sample_event
-> ctf_stream
...
-> bt_ctf_trace_add_stream_class
-> bt_ctf_stream_class_resolve_types
...
-> bt_ctf_field_type_sequence_copy
->bt_ctf_field_type_integer_copy

This happens before the byte order is set, so only the newly allocated
type is initialized; the byte order of the original type that perf chose to
create the first raw_data is still uncertain.

Byte order in the CTF output is not related to byte order in perf.data.
Setting it to anything other than BT_CTF_BYTE_ORDER_NATIVE solves this
problem (only BT_CTF_BYTE_ORDER_NATIVE needs to be fixed). To minimize
behavior changes, set the byte order according to compile-time options.
Signed-off-by: Wang Nan
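[ Illustration, not part of the patch: the incorrect first sample quoted
in this message is the expected one with every 32-bit word byte-swapped.
A minimal Python sketch that checks this, using only the raw_data values
quoted here; swap32() is a hypothetical helper name. ]

```python
import struct


def swap32(value):
    """Reinterpret the four bytes of a big-endian u32 as little endian."""
    return struct.unpack("<I", struct.pack(">I", value))[0]


# raw_data words of the first sample as printed by babeltrace (incorrect).
wrong_words = [0xC028E32F, 0x815D0100, 0x1000000]

# Byte-swapping each word should give the expected values from the message.
fixed_words = [swap32(word) for word in wrong_words]
print([hex(word) for word in fixed_words])
```

The swapped words match the expected raw_data line, which is consistent
with a pure byte order mix-up on the first sample rather than data loss.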
Cc: Jeremie Galarneau
Cc: Alexei Starovoitov
Cc: Brendan Gregg
Cc: Jiri Olsa
Cc: Jérémie Galarneau
Cc: Li Zefan
Cc: Masami Hiramatsu
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Zefan Li
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456479154-136027-10-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo
-
bpf_perf_event_output() outputs data through sample->raw_data. This
patch adds support for converting that data into CTF. A Python script can
then be used to process output data from BPF programs.

Test result:
# cat ./test_bpf_output_2.c
/************************ BEGIN **************************/
#include
struct bpf_map_def {
	unsigned int type;
	unsigned int key_size;
	unsigned int value_size;
	unsigned int max_entries;
};
#define SEC(NAME) __attribute__((section(NAME), used))
static u64 (*ktime_get_ns)(void) =
	(void *)BPF_FUNC_ktime_get_ns;
static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
	(void *)BPF_FUNC_trace_printk;
static int (*get_smp_processor_id)(void) =
	(void *)BPF_FUNC_get_smp_processor_id;
static int (*perf_event_output)(void *, struct bpf_map_def *, int, void *, unsigned long) =
	(void *)BPF_FUNC_perf_event_output;

struct bpf_map_def SEC("maps") channel = {
	.type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
	.key_size = sizeof(int),
	.value_size = sizeof(u32),
	.max_entries = __NR_CPUS__,
};

static inline int __attribute__((always_inline))
func(void *ctx, int type)
{
	struct {
		u64 ktime;
		int type;
	} __attribute__((packed)) output_data;
	char error_data[] = "Error: failed to output\n";
	int err;

	output_data.type = type;
	output_data.ktime = ktime_get_ns();
	err = perf_event_output(ctx, &channel, get_smp_processor_id(),
				&output_data, sizeof(output_data));
	if (err)
		trace_printk(error_data, sizeof(error_data));
	return 0;
}
SEC("func_begin=sys_nanosleep")
int func_begin(void *ctx) {return func(ctx, 1);}
SEC("func_end=sys_nanosleep%return")
int func_end(void *ctx) { return func(ctx, 2);}
char _license[] SEC("license") = "GPL";
int _version SEC("version") = LINUX_VERSION_CODE;
/************************* END ***************************/

# ./perf record -e bpf-output/no-inherit,name=evt/ \
-e ./test_bpf_output_2.c/map:channel.event=evt/ \
usleep 100000
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.012 MB perf.data (2 samples) ]

# ./perf script
usleep 14942 92503.198504: evt: ffffffff810e0ba1 sys_nanosleep (/lib/modules/4.3.0....
usleep 14942 92503.298562: evt: ffffffff810585e9 kretprobe_trampoline_holder (/lib....

# ./perf data convert --to-ctf ./out.ctf
[ perf data convert: Converted 'perf.data' into CTF data './out.ctf' ]
[ perf data convert: Converted and wrote 0.000 MB (2 samples) ]

# babeltrace ./out.ctf
[01:41:43.198504134] (+?.?????????) evt: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF810E0BA1, perf_tid = 14942, perf_pid = 14942, perf_id = 1044, raw_len = 3, raw_data = [ [0] = 0x32C0C07B, [1] = 0x5421, [2] = 0x1 ] }
[01:41:43.298562257] (+0.100058123) evt: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF810585E9, perf_tid = 14942, perf_pid = 14942, perf_id = 1044, raw_len = 3, raw_data = [ [0] = 0x38B77FAA, [1] = 0x5421, [2] = 0x2 ] }

# cat ./test_bpf_output_2.py
from babeltrace import TraceCollection
tc = TraceCollection()
tc.add_trace('./out.ctf', 'ctf')
d = {1:[], 2:[]}
for event in tc.events:
    if not event.name.startswith('evt'):
        continue
    raw_data = event['raw_data']
    (time, type) = ((raw_data[0] + (raw_data[1] << 32)), raw_data[2])
    d[type].append(time)
print(list(map(lambda i: d[2][i] - d[1][i], range(len(d[1])))));

# python3 ./test_bpf_output_2.py
[100056879]

Committer note:
Make sure you have python3-devel installed, not python-devel, which may
be for python2 and will lead to some "PyInstance_Type" errors. Also
make sure that you use the right libbabeltrace: it is shipped
in Fedora, for instance, but in an older version.

To build libbabeltrace's python binding one also needs to use:
./configure --enable-python-bindings
And then set PYTHONPATH=/usr/local/lib64/python3.4/site-packages/.
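[ Illustration, not part of the patch: the result can be cross-checked
without babeltrace. This is a minimal sketch that rebuilds the packed
{ u64 ktime; int type; } payload from the three raw_data words printed in
the babeltrace output of this message. It assumes a little-endian host;
decode() is a hypothetical helper name. ]

```python
import struct

# raw_data words copied from the two babeltrace lines of the test run.
begin_words = [0x32C0C07B, 0x5421, 0x1]
end_words = [0x38B77FAA, 0x5421, 0x2]


def decode(words):
    """Rebuild (ktime, type) from three little-endian 32-bit words."""
    payload = struct.pack("<3I", *words)   # the 12 bytes of the packed struct
    return struct.unpack("<Qi", payload)   # u64 ktime followed by int type


begin_ktime, begin_type = decode(begin_words)
end_ktime, end_type = decode(end_words)
delta_ns = end_ktime - begin_ktime

print(begin_type, end_type, delta_ns)
```

The difference is the same 100056879 ns that the script above prints,
roughly the 100000 us that usleep was asked to sleep.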
Signed-off-by: Wang Nan
Tested-by: Arnaldo Carvalho de Melo
Acked-by: Jiri Olsa
Cc: Alexei Starovoitov
Cc: Brendan Gregg
Cc: Li Zefan
Cc: Masami Hiramatsu
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Zefan Li
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456479154-136027-9-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo
-
Only put the frontend/backend stalled cycles into the default perf stat
events when the CPU actually supports them.

This avoids empty columns with --metric-only on newer Intel CPUs.
Committer note:
Before:
$ perf stat ls
Performance counter stats for 'ls':
1.080893 task-clock (msec) # 0.619 CPUs utilized
0 context-switches # 0.000 K/sec
0 cpu-migrations # 0.000 K/sec
97 page-faults # 0.090 M/sec
3,327,741 cycles # 3.079 GHz
stalled-cycles-frontend
stalled-cycles-backend
1,609,544 instructions # 0.48 insn per cycle
319,117 branches # 295.235 M/sec
12,246 branch-misses # 3.84% of all branches

0.001746508 seconds time elapsed
$

After:
$ perf stat ls
Performance counter stats for 'ls':
0.693948 task-clock (msec) # 0.662 CPUs utilized
0 context-switches # 0.000 K/sec
0 cpu-migrations # 0.000 K/sec
95 page-faults # 0.137 M/sec
1,792,509 cycles # 2.583 GHz
1,599,047 instructions # 0.89 insn per cycle
316,328 branches # 455.838 M/sec
12,453 branch-misses # 3.94% of all branches

0.001048987 seconds time elapsed
$
Signed-off-by: Andi Kleen
Acked-by: Jiri Olsa
Tested-by: Arnaldo Carvalho de Melo
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1456532881-26621-2-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo
-
Ingo reported a regression in the display format of big numbers, which are
missing separators (in default perf stat output).

triton:~/tip> perf stat -a sleep 1
...
127008602 cycles # 0.011 GHz
279538533 stalled-cycles-frontend # 220.09% frontend cycles idle
119213269 instructions # 0.94 insn per cycle

This is caused by the recent change:
perf stat: Check existence of frontend/backed stalled cycles
that added a call to pmu_have_event, which subsequently calls
perf_pmu__parse_scale, which has a bug in locale handling.

The lc string returned from setlocale, which we use to store the old locale
value, may be allocated in static storage. Get a dynamic copy to
make it survive another setlocale call.

$ perf stat ls
...
2,360,602 cycles # 3.080 GHz
2,703,090 instructions # 1.15 insn per cycle
546,031 branches # 712.511 M/sec

Committer note:
Since the patch introducing the regression didn't make it to perf/core,
move it to just before where the regression was introduced, so that we
don't break bisection for this feature.
Reported-by: Ingo Molnar
Signed-off-by: Jiri Olsa
Tested-by: Arnaldo Carvalho de Melo
Cc: David Ahern
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20160303095348.GA24511@krava.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo
29 Feb, 2016
33 commits
-
Currently there's a single function that is used to display a record's
data in human readable format. That's pevent_print_event().
Unfortunately, this gives little room for adding other output within the
line without updating that function call.

I've decided to split that function into 3 parts:

pevent_print_event_task(), which prints the task comm, pid and the CPU
pevent_print_event_time(), which outputs the record's timestamp
pevent_print_event_data(), which outputs the rest of the event data.

pevent_print_event() now simply calls these three functions.

To save time from doing the search for event from the record's type, I
created a new helper function called pevent_find_event_by_record(),
which returns the record's event, and this event has to be passed to the
above functions.
Signed-off-by: Steven Rostedt
Cc: Namhyung Kim
Link: http://lkml.kernel.org/r/20160229090128.43a56704@gandalf.local.home
Signed-off-by: Arnaldo Carvalho de Melo
-
Some tracepoints have multiple fields with the same name, "nr": the first
one is a unique syscall ID, the other is a syscall argument:

# cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_io_getevents/format
name: sys_enter_io_getevents
ID: 747
format:
field:unsigned short common_type; offset:0; size:2; signed:0;
field:unsigned char common_flags; offset:2; size:1; signed:0;
field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
field:int common_pid; offset:4; size:4; signed:1;

field:int nr; offset:8; size:4; signed:1;
field:aio_context_t ctx_id; offset:16; size:8; signed:0;
field:long min_nr; offset:24; size:8; signed:0;
field:long nr; offset:32; size:8; signed:0;
field:struct io_event * events; offset:40; size:8; signed:0;
field:struct timespec * timeout; offset:48; size:8; signed:0;

print fmt: "ctx_id: 0x%08lx, min_nr: 0x%08lx, nr: 0x%08lx, events: 0x%08lx, timeout: 0x%08lx", ((unsigned long)(REC->ctx_id)), ((unsigned long)(REC->min_nr)), ((unsigned long)(REC->nr)), ((unsigned long)(REC->events)), ((unsigned long)(REC->timeout))
#

Fix it by renaming the "/format" common tracepoint field "nr" to "__syscall_nr".
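[ Illustration, not part of the patch: a minimal Python sketch of the
name clash, using a few field lines copied from the format listing in
this message. field_names() and duplicated() are hypothetical helpers;
the parsing only handles the simple declarations shown here, not array
fields. ]

```python
import re
from collections import Counter

# A few field lines copied from the sys_enter_io_getevents format file.
FORMAT_TEXT = """\
field:int common_pid; offset:4; size:4; signed:1;

field:int nr; offset:8; size:4; signed:1;
field:aio_context_t ctx_id; offset:16; size:8; signed:0;
field:long min_nr; offset:24; size:8; signed:0;
field:long nr; offset:32; size:8; signed:0;
"""


def field_names(format_text):
    """Return the declared name of every 'field:' line, in order."""
    names = []
    for line in format_text.splitlines():
        match = re.match(r"\s*field:(.*?);", line)
        if match:
            # The name is the last token, e.g. 'long nr' -> 'nr'.
            names.append(match.group(1).split()[-1].lstrip("*"))
    return names


def duplicated(names):
    """Return the sorted list of names that occur more than once."""
    return sorted(n for n, count in Counter(names).items() if count > 1)


before = field_names(FORMAT_TEXT)
# Model the fix: only the first 'nr' (the syscall ID) is renamed.
first_nr = before.index("nr")
after = ["__syscall_nr" if i == first_nr else n for i, n in enumerate(before)]

print(duplicated(before), duplicated(after))
```

With the rename applied, a lookup by field name can no longer return the
syscall ID when the syscall argument "nr" was meant.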
Signed-off-by: Taeung Song
[ Do not rename the struct member, just the '/format' field name ]
Signed-off-by: Steven Rostedt
Acked-by: Peter Zijlstra
Cc: Jiri Olsa
Cc: Lai Jiangshan
Cc: Namhyung Kim
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/20160226132301.3ae065a4@gandalf.local.home
Signed-off-by: Arnaldo Carvalho de Melo
-
The format fields of a syscall start with the variable '__syscall_nr' or
'nr', which holds the syscall number. It isn't relevant here, so drop
it.

'nr' among the syscall fields was renamed to '__syscall_nr'. So add
exception handling to drop '__syscall_nr' and update the comment for
this exception handling.
Reported-by: Arnaldo Carvalho de Melo
Signed-off-by: Taeung Song
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Steven Rostedt
Link: http://lkml.kernel.org/r/1456492465-5946-1-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo
-
The util/python-ext-sources file contains the source files required to build
the python extension, relative to $(srctree)/tools/perf.

Such a file path $(FILE).c is handed over to the python extension build
system, which builds the final object in the
$(PYTHON_EXTBUILD)/tmp/$(FILE).o path.

After the build is done, all files from $(PYTHON_EXTBUILD)lib/ are
carried as the result binaries.

The above system fails when we add source files relative to ../lib, which we
do for:

../lib/bitmap.c
../lib/find_bit.c
../lib/hweight.c
../lib/rbtree.c

All the above objects will be built like:

$(PYTHON_EXTBUILD)/tmp/../lib/bitmap.c
$(PYTHON_EXTBUILD)/tmp/../lib/find_bit.c
$(PYTHON_EXTBUILD)/tmp/../lib/hweight.c
$(PYTHON_EXTBUILD)/tmp/../lib/rbtree.c

which accidentally happens to be the final library path:

$(PYTHON_EXTBUILD)/lib/

Change setup.py to pass full paths of source files to the Extension build
class and thus keep all built objects under the $(PYTHON_EXTBUILD)tmp
directory.
Reported-by: Jeff Bastian
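[ Illustration, not part of the patch: a minimal Python sketch of why a
source path starting with ../lib escapes the tmp directory. It assumes
the object path is formed by appending the source path to tmp/, as
described in this message; "python_ext_build" is a hypothetical stand-in
for $(PYTHON_EXTBUILD), object_path() is a hypothetical helper, and
util/evlist.c is only an illustrative in-tree path. ]

```python
import posixpath

EXT_BUILD = "python_ext_build"  # hypothetical stand-in for $(PYTHON_EXTBUILD)


def object_path(source):
    """Object location if the source path is appended to the tmp directory."""
    stem = posixpath.splitext(source)[0]
    joined = posixpath.join(EXT_BUILD, "tmp", stem + ".o")
    # normpath resolves the '..' component the way the filesystem would.
    return posixpath.normpath(joined)


in_tree = object_path("util/evlist.c")
escaped = object_path("../lib/bitmap.c")

print(in_tree)
print(escaped)
```

The first object stays below tmp/, while the second resolves into lib/,
next to the result binaries that are picked up after the build.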
Signed-off-by: Jiri Olsa
Tested-by: Josh Boyer
Cc: David Ahern
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: stable@vger.kernel.org # v4.2+
Link: http://lkml.kernel.org/r/20160227201350.GB28494@krava.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo
-
Required to use it in modular perf drivers.
Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andi Kleen
Cc: Arnaldo Carvalho de Melo
Cc: Borislav Petkov
Cc: Harish Chegondi
Cc: Jacob Pan
Cc: Jiri Olsa
Cc: Kan Liang
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Vince Weaver
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221012.930735780@linutronix.de
Signed-off-by: Ingo Molnar
-
RAPL is a per package facility and we already have a mechanism for a dedicated
per package reader. So there is no point to have multiple CPUs doing the
same. The current implementation actually starts two timers on two CPUs if one
does:

perf stat -C1,2 -e -e power/energy-pkg ....

which makes the whole concept of 1 reader per package moot.

What's worse is that the above returns the double of the actual energy
consumption, but that's a different problem to address and cannot be solved by
removing the pointless per cpuness of that mechanism.
Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andi Kleen
Cc: Arnaldo Carvalho de Melo
Cc: Borislav Petkov
Cc: Harish Chegondi
Cc: Jacob Pan
Cc: Jiri Olsa
Cc: Kan Liang
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Vince Weaver
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221012.845369524@linutronix.de
Signed-off-by: Ingo Molnar
-
Store the PMU pointer in event->pmu_private and use it instead of the per CPU
data. Preparatory step to get rid of the per CPU allocations. The usage sites
are the perf fast path, so we keep that even after the conversion to per
package storage as a CPU to package lookup involves 3 loads versus 1 with the
pmu_private pointer.
Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andi Kleen
Cc: Arnaldo Carvalho de Melo
Cc: Borislav Petkov
Cc: Harish Chegondi
Cc: Jacob Pan
Cc: Jiri Olsa
Cc: Kan Liang
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Vince Weaver
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221012.748151799@linutronix.de
Signed-off-by: Ingo Molnar
-
This lock is taken in hard interrupt context even on Preempt-RT. Make it raw
so RT does not have to patch it.
Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andi Kleen
Cc: Arnaldo Carvalho de Melo
Cc: Borislav Petkov
Cc: Harish Chegondi
Cc: Jacob Pan
Cc: Jiri Olsa
Cc: Kan Liang
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Vince Weaver
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221012.669411833@linutronix.de
Signed-off-by: Ingo Molnar
-
Split out code from init into separate functions. Tidy up the code and get rid
of pointless comments. I wish there would be comments for code which is not
obvious....
Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andi Kleen
Cc: Arnaldo Carvalho de Melo
Cc: Borislav Petkov
Cc: Harish Chegondi
Cc: Jacob Pan
Cc: Jiri Olsa
Cc: Kan Liang
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Vince Weaver
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221012.588544679@linutronix.de
Signed-off-by: Ingo Molnar
-
The output is inconsistent. Use a proper pr_fmt prefix and split out the
advertisement into a separate function.

Remove the WARN_ON() in the failure case. It's pointless as we already know
where it failed.
Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andi Kleen
Cc: Arnaldo Carvalho de Melo
Cc: Borislav Petkov
Cc: Harish Chegondi
Cc: Jacob Pan
Cc: Jiri Olsa
Cc: Kan Liang
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Vince Weaver
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221012.504551295@linutronix.de
Signed-off-by: Ingo Molnar
-
No point in doing the same calculation over and over. Do it once in
rapl_check_hw_unit().
Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andi Kleen
Cc: Arnaldo Carvalho de Melo
Cc: Borislav Petkov
Cc: Harish Chegondi
Cc: Jacob Pan
Cc: Jiri Olsa
Cc: Kan Liang
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Vince Weaver
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221012.409238136@linutronix.de
Signed-off-by: Ingo Molnar
-
There is no point in having a quirk machinery for a single possible
function. Get rid of it and move the quirk to a place where it actually
makes sense.
Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andi Kleen
Cc: Arnaldo Carvalho de Melo
Cc: Borislav Petkov
Cc: Harish Chegondi
Cc: Jacob Pan
Cc: Jiri Olsa
Cc: Kan Liang
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Vince Weaver
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221012.311639465@linutronix.de
Signed-off-by: Ingo Molnar
-
Like uncore, the rapl driver lacks error handling. It leaks memory and leaves
the hotplug notifier registered.

Add the proper error checks, clean up the memory and register the hotplug
notifier only on success.
Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andi Kleen
Cc: Arnaldo Carvalho de Melo
Cc: Borislav Petkov
Cc: Harish Chegondi
Cc: Jacob Pan
Cc: Jiri Olsa
Cc: Kan Liang
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Vince Weaver
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221012.231222076@linutronix.de
Signed-off-by: Ingo Molnar
-
The Knights Landing support added the events and the detection case, but then
returns 0 without actually initializing the driver.
Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andi Kleen
Cc: Arnaldo Carvalho de Melo
Cc: Borislav Petkov
Cc: Dasaratharaman Chandramouli
Cc: Harish Chegondi
Cc: Jacob Pan
Cc: Jiri Olsa
Cc: Kan Liang
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Vince Weaver
Cc: linux-kernel@vger.kernel.org
Fixes: 3a2a7797326a4 "perf/x86/intel/rapl: Add support for Knights Landing (KNL)"
Link: http://lkml.kernel.org/r/20160222221012.149331888@linutronix.de
Signed-off-by: Ingo Molnar
-
CQM is a strict per package facility. Use the proper cpumasks to lookup the
readers.
Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andi Kleen
Cc: Arnaldo Carvalho de Melo
Cc: Borislav Petkov
Cc: Harish Chegondi
Cc: Jacob Pan
Cc: Jiri Olsa
Cc: Kan Liang
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Vince Weaver
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221012.054916179@linutronix.de
Signed-off-by: Ingo Molnar
-
Almost every cpumask function is exported, just not the one I need to make the
Intel uncore driver modular.
Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andi Kleen
Cc: Arnaldo Carvalho de Melo
Cc: Borislav Petkov
Cc: David S. Miller
Cc: Harish Chegondi
Cc: Jacob Pan
Cc: Jiri Olsa
Cc: Kan Liang
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Rusty Russell
Cc: Stephane Eranian
Cc: Vince Weaver
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221011.878299859@linutronix.de
Signed-off-by: Ingo Molnar
-
Andi wanted to do this before, but the patch fell through the cracks. Implement
it with the proper error handling.
Requested-by: Andi Kleen
Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andi Kleen
Cc: Arnaldo Carvalho de Melo
Cc: Borislav Petkov
Cc: Harish Chegondi
Cc: Jacob Pan
Cc: Jiri Olsa
Cc: Kan Liang
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Vince Weaver
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221011.799159968@linutronix.de
Signed-off-by: Ingo Molnar
-
The only missing bit is to completely clear the hardware state on failure
exit. This is now a pretty simple exercise.

Undo the box->init_box() setup on all packages which have been initialized so
far.
Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andi Kleen
Cc: Arnaldo Carvalho de Melo
Cc: Borislav Petkov
Cc: Harish Chegondi
Cc: Jacob Pan
Cc: Jiri Olsa
Cc: Kan Liang
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Vince Weaver
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221011.702452407@linutronix.de
Signed-off-by: Ingo Molnar
-
Uncore is a per package facility, but the code tries to mimic a per CPU
facility with completely convoluted constructs.

Simplify the whole machinery by tracking per package information. While at it,
avoid the kfree/alloc dance when a CPU goes offline and online again. There is
no point in freeing the box after it was allocated. We just keep proper
refcounting and the first CPU which comes online in a package does the
initialization/activation of the box.
Signed-off-by: Thomas Gleixner
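[ Illustration, not part of the patch: a minimal Python sketch of the
refcounting idea described in this message, not the driver code. Box,
cpu_online() and cpu_offline() are hypothetical names, and deactivating
the box when the last CPU of a package goes offline is an assumption of
this sketch. ]

```python
class Box:
    """Hypothetical stand-in for a per-package uncore box."""

    def __init__(self):
        self.refcnt = 0       # number of online CPUs in the package
        self.active = False
        self.init_calls = 0   # how often the box was initialized/activated


boxes = {}  # package id -> Box, allocated once and never freed


def cpu_online(pkg):
    box = boxes.setdefault(pkg, Box())
    box.refcnt += 1
    if box.refcnt == 1:
        # The first CPU coming online in the package activates the box.
        box.active = True
        box.init_calls += 1


def cpu_offline(pkg):
    box = boxes[pkg]
    box.refcnt -= 1
    if box.refcnt == 0:
        # Assumption: the last CPU deactivates the box but keeps the allocation.
        box.active = False


cpu_online(0)
cpu_online(0)
first_box = boxes[0]
cpu_offline(0)
cpu_offline(0)
cpu_online(0)   # reuses the same Box object, no free/alloc cycle

print(boxes[0] is first_box, boxes[0].refcnt, boxes[0].init_calls)
```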
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andi Kleen
Cc: Arnaldo Carvalho de Melo
Cc: Borislav Petkov
Cc: Harish Chegondi
Cc: Jacob Pan
Cc: Jiri Olsa
Cc: Kan Liang
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Vince Weaver
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221011.622258933@linutronix.de
Signed-off-by: Ingo Molnar
-
For per package oriented services we must be able to rely on the number of CPU
packages to be within bounds. Create a tracking facility, which

- calculates the number of possible packages depending on nr_cpu_ids after boot
- makes sure that the package id is within the number of possible packages. If
the apic id is outside we map it to a logical package id if there is enough
space available.

Provide interfaces for drivers to query the mapping and do translations from
physical to logical ids.
Signed-off-by: Thomas Gleixner
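[ Illustration, not part of the patch: a minimal Python sketch of a
physical to logical package id mapping with a bounded number of slots.
PackageMap and its methods are hypothetical names, and the slot selection
policy (keep an in-range id if its slot is free, otherwise take the first
free slot) is an assumption of this sketch, not the kernel
implementation. ]

```python
class PackageMap:
    """Maps physical package ids into the range [0, max_packages)."""

    def __init__(self, max_packages):
        self.max_packages = max_packages
        self.phys_to_logical = {}

    def update(self, phys_id):
        """Return the logical id for phys_id, or None if no slot is left."""
        if phys_id in self.phys_to_logical:
            return self.phys_to_logical[phys_id]
        used = set(self.phys_to_logical.values())
        if 0 <= phys_id < self.max_packages and phys_id not in used:
            logical = phys_id            # already within bounds, keep it
        else:
            free = [i for i in range(self.max_packages) if i not in used]
            if not free:
                return None              # not enough space available
            logical = free[0]
        self.phys_to_logical[phys_id] = logical
        return logical

    def logical_id(self, phys_id):
        """Query interface: translate a physical id, None if unmapped."""
        return self.phys_to_logical.get(phys_id)


pkg_map = PackageMap(max_packages=2)
results = [pkg_map.update(p) for p in (0, 8, 8, 3)]
print(results, pkg_map.logical_id(8))
```

Physical id 8 is outside the two possible packages and is folded into the
free logical slot 1, while id 3 is rejected because no slot is left.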
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andi Kleen
Cc: Andrew Morton
Cc: Andy Lutomirski
Cc: Arnaldo Carvalho de Melo
Cc: Borislav Petkov
Cc: Brian Gerst
Cc: Denys Vlasenko
Cc: H. Peter Anvin
Cc: Harish Chegondi
Cc: Jacob Pan
Cc: Jiri Olsa
Cc: Kan Liang
Cc: Linus Torvalds
Cc: Luis R. Rodriguez
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Toshi Kani
Cc: Vince Weaver
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221011.541071755@linutronix.de
Signed-off-by: Ingo Molnar
-
Store the PMU pointer in event->pmu_private, so we can get rid of the
per CPU data storage.

We keep it after converting to per package data, because a CPU to
package lookup will be 3 loads versus one and these usage sites are
in the perf fast path.
Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andi Kleen
Cc: Arnaldo Carvalho de Melo
Cc: Borislav Petkov
Cc: Harish Chegondi
Cc: Jacob Pan
Cc: Jiri Olsa
Cc: Kan Liang
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Vince Weaver
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221011.460851335@linutronix.de
Signed-off-by: Ingo Molnar
-
For PMUs which are not per CPU, but e.g. per package/socket, we want to be
able to store a reference to the underlying per package/socket facility in the
event at init time so we can avoid magic storage constructs in the PMU driver.

This allows us to get rid of the per CPU dance in the intel uncore and RAPL
drivers and avoids a lookup of the per package data in the perf hotpath.
Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andi Kleen
Cc: Arnaldo Carvalho de Melo
Cc: Borislav Petkov
Cc: Harish Chegondi
Cc: Jacob Pan
Cc: Jiri Olsa
Cc: Kan Liang
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Vince Weaver
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221011.364140369@linutronix.de
Signed-off-by: Ingo Molnar
-
No users outside of this file.
Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andi Kleen
Cc: Arnaldo Carvalho de Melo
Cc: Borislav Petkov
Cc: Harish Chegondi
Cc: Jacob Pan
Cc: Jiri Olsa
Cc: Kan Liang
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Vince Weaver
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221011.285504825@linutronix.de
Signed-off-by: Ingo Molnar
-
Clean up the code a bit before reworking it completely.
Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andi Kleen
Cc: Arnaldo Carvalho de Melo
Cc: Borislav Petkov
Cc: Harish Chegondi
Cc: Jacob Pan
Cc: Jiri Olsa
Cc: Kan Liang
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Vince Weaver
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221011.204771538@linutronix.de
Signed-off-by: Ingo Molnar
-
When tearing down the boxes nothing undoes the hardware state which was set up
by box->init_box(). Add a box->exit_box() callback and implement it for the
uncores which have an init_box() callback.

This misses the cleanup in the error exit paths, but I cannot be bothered to
implement it before cleaning up the rest of the driver, which makes that task
way simpler.
Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andi Kleen
Cc: Arnaldo Carvalho de Melo
Cc: Borislav Petkov
Cc: Harish Chegondi
Cc: Jacob Pan
Cc: Jiri Olsa
Cc: Kan Liang
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Vince Weaver
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221011.023930023@linutronix.de
Signed-off-by: Ingo Molnar
-
The storage array is size limited, but misses a sanity check.
Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andi Kleen
Cc: Arnaldo Carvalho de Melo
Cc: Borislav Petkov
Cc: Harish Chegondi
Cc: Jacob Pan
Cc: Jiri Olsa
Cc: Kan Liang
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Vince Weaver
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221010.929967806@linutronix.de
Signed-off-by: Ingo Molnar
-
This driver lacks any form of proper error handling. If initialization fails
or hotplug prepare fails, it leaves the facility around with half initialized
stuff.

Fix the state and memory leaks in a first step. As a second step we need to
undo the hardware state which is set via uncore_box_init() on some of the
uncore implementations.
Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andi Kleen
Cc: Arnaldo Carvalho de Melo
Cc: Borislav Petkov
Cc: Harish Chegondi
Cc: Jacob Pan
Cc: Jiri Olsa
Cc: Kan Liang
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Vince Weaver
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221010.848880559@linutronix.de
Signed-off-by: Ingo Molnar
-
No point in doing partial rollbacks. Robustify uncore_exit_type() so it does
not dereference type->pmus unconditionally and remove all the partial rollback
hackery.

Preparatory patch for proper error handling.
Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andi Kleen
Cc: Arnaldo Carvalho de Melo
Cc: Borislav Petkov
Cc: Harish Chegondi
Cc: Jacob Pan
Cc: Jiri Olsa
Cc: Kan Liang
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Vince Weaver
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221010.751077467@linutronix.de
Signed-off-by: Ingo Molnar
-
uncore_cpumask_init() is only ever called from intel_uncore_init() where the
mask is guaranteed to be empty.
Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andi Kleen
Cc: Arnaldo Carvalho de Melo
Cc: Borislav Petkov
Cc: Harish Chegondi
Cc: Jacob Pan
Cc: Jiri Olsa
Cc: Kan Liang
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Vince Weaver
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20160222221010.657326866@linutronix.de
Signed-off-by: Ingo Molnar
-
Alexander volunteered to review perf (kernel) patches.
Signed-off-by: Peter Zijlstra (Intel)
Cc: Alexander Shishkin
Cc: Arnaldo Carvalho de Melo
Cc: Jiri Olsa
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Thomas Gleixner
Cc: Vince Weaver
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar
-
BDX-DE and BDX-EP share the same uncore code path. But there is no SBOX
in BDX-DE. This patch removes SBOX support for BDX-DE.
Signed-off-by: Kan Liang
Signed-off-by: Peter Zijlstra (Intel)
Cc: Arnaldo Carvalho de Melo
Cc: Jiri Olsa
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Thomas Gleixner
Cc: Tony Battersby
Cc: Vince Weaver
Link: http://lkml.kernel.org/r/37D7C6CF3E00A74B8858931C1DB2F0770589D336@SHSMSX103.ccr.corp.intel.com
Signed-off-by: Ingo Molnar
-
Signed-off-by: Ingo Molnar