14 Oct, 2020
2 commits
-
Allow the CC compiler to accept a CFLAGS environment variable. This
doesn't change the code generated but makes it easier to integrate
running the shell script in build systems like bazel.
Signed-off-by: Ian Rogers
Cc: Adrian Hunter
Cc: Alexander Shishkin
Cc: Alexios Zavras
Cc: Andi Kleen
Cc: Greg Kroah-Hartman
Cc: Igor Lubashev
Cc: Jiri Olsa
Cc: Kan Liang
Cc: Mark Rutland
Cc: Mathieu Poirier
Cc: Namhyung Kim
Cc: Nick Desaulniers
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Thomas Gleixner
Cc: Wei Li
Link: http://lore.kernel.org/lkml/20200306071110.130202-4-irogers@google.com
[ split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo
-
The variable 'traceid_list' is defined in the header file cs-etm.h, so
if multiple C files include cs-etm.h the compiler might complain about
multiple definitions of 'traceid_list'.
To fix the multiple definition error, move the definition of 'traceid_list'
into cs-etm.c.
Fixes: cd8bfd8c973e ("perf tools: Add processing of coresight metadata")
Reported-by: Thomas Backlund
Signed-off-by: Leo Yan
Reviewed-by: Mathieu Poirier
Reviewed-by: Mike Leach
Tested-by: Mike Leach
Tested-by: Thomas Backlund
Cc: Alexander Shishkin
Cc: Jiri Olsa
Cc: Mark Rutland
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Suzuki Poulouse
Cc: Tor Jeremiassen
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20200505133642.4756-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo
08 Oct, 2020
1 commit
-
* tag 'v5.4.70': (3051 commits)
Linux 5.4.70
netfilter: ctnetlink: add a range check for l3/l4 protonum
ep_create_wakeup_source(): dentry name can change under you...
...
Conflicts:
arch/arm/mach-imx/pm-imx6.c
arch/arm64/boot/dts/freescale/imx8mm-evk.dts
arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts
drivers/crypto/caam/caamalg.c
drivers/gpu/drm/imx/dw_hdmi-imx.c
drivers/gpu/drm/imx/imx-ldb.c
drivers/gpu/drm/imx/ipuv3/ipuv3-crtc.c
drivers/mmc/host/sdhci-esdhc-imx.c
drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
drivers/net/ethernet/freescale/enetc/enetc.c
drivers/net/ethernet/freescale/enetc/enetc_pf.c
drivers/thermal/imx_thermal.c
drivers/usb/cdns3/ep0.c
drivers/xen/swiotlb-xen.c
sound/soc/fsl/fsl_esai.c
sound/soc/fsl/fsl_sai.c
Signed-off-by: Jason Liu
01 Oct, 2020
16 commits
-
[ Upstream commit 8510895bafdbf7c4dd24c22946d925691135c2b2 ]
A big uncore event group is split into multiple small groups which only
include the uncore events from the same PMU. This has been supported in
the commit 3cdc5c2cb924a ("perf parse-events: Handle uncore event
aliases in small groups properly").
If the event's PMU name starts to repeat, it must be a new event.
That can be used to distinguish the leader from other members.
But now it only compares the pointer of pmu_name
(leader->pmu_name == evsel->pmu_name).
If we use "perf stat -M LLC_MISSES.PCIE_WRITE -a" on cascadelakex,
the event list is:
evsel->name                                evsel->pmu_name
---------------------------------------------------------------
unc_iio_data_req_of_cpu.mem_write.part0    uncore_iio_4 (as leader)
unc_iio_data_req_of_cpu.mem_write.part0    uncore_iio_2
unc_iio_data_req_of_cpu.mem_write.part0    uncore_iio_0
unc_iio_data_req_of_cpu.mem_write.part0    uncore_iio_5
unc_iio_data_req_of_cpu.mem_write.part0    uncore_iio_3
unc_iio_data_req_of_cpu.mem_write.part0    uncore_iio_1
unc_iio_data_req_of_cpu.mem_write.part1    uncore_iio_4
......
For the event "unc_iio_data_req_of_cpu.mem_write.part1" with
"uncore_iio_4", it should be the event from PMU "uncore_iio_4".
It's not a new leader for this PMU.
But if we use "(leader->pmu_name == evsel->pmu_name)", the check
fails and the event is stored to leaders[] as a new
PMU leader.
So this patch uses strcmp to compare the PMU name between events.
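The difference between the two checks can be sketched in a few lines of C. This is only an illustration: the struct and helper names (fake_evsel, same_pmu_by_pointer, same_pmu_by_name) are hypothetical and not taken from the perf sources; only the pmu_name field name comes from the description above.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the part of an evsel that matters here. */
struct fake_evsel {
	const char *pmu_name;
};

/* Pointer check: true only when both events share the same string object. */
bool same_pmu_by_pointer(const struct fake_evsel *a,
			 const struct fake_evsel *b)
{
	return a->pmu_name == b->pmu_name;
}

/* strcmp check: true whenever the two PMU names have equal contents. */
bool same_pmu_by_name(const struct fake_evsel *a,
		      const struct fake_evsel *b)
{
	if (!a->pmu_name || !b->pmu_name)
		return false;
	return strcmp(a->pmu_name, b->pmu_name) == 0;
}
```

Two events whose names were allocated separately but both read "uncore_iio_4" compare unequal with the pointer check and equal with the strcmp check.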
Fixes: d4953f7ef1a2 ("perf parse-events: Fix 3 use after frees found with clang ASAN")
Signed-off-by: Jin Yao
Acked-by: Jiri Olsa
Cc: Alexander Shishkin
Cc: Andi Kleen
Cc: Jin Yao
Cc: Kan Liang
Cc: Peter Zijlstra
Link: http://lore.kernel.org/lkml/20200430003618.17002-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Sasha Levin
-
[ Upstream commit 463538a383a27337cb83ae195e432a839a52d639 ]
Commit 5aa98879efe7 ("s390/cpum_sf: prohibit callchain data collection")
prohibits call graph sampling for hardware events on s390. The
information recorded is out of context and does not match.
On s390 this commit now breaks test case 68 Zstd perf.data
compression/decompression.
Therefore omit call graph sampling on s390 in this test.
Output before:
[root@t35lp46 perf]# ./perf test -Fv 68
68: Zstd perf.data compression/decompression :
--- start ---
Collecting compressed record file:
Error:
cycles: PMU Hardware doesn't support sampling/overflow-interrupts.
Try 'perf stat'
---- end ----
Zstd perf.data compression/decompression: FAILED!
[root@t35lp46 perf]#
Output after:
[root@t35lp46 perf]# ./perf test -Fv 68
68: Zstd perf.data compression/decompression :
--- start ---
Collecting compressed record file:
500+0 records in
500+0 records out
256000 bytes (256 kB, 250 KiB) copied, 0.00615638 s, 41.6 MB/s
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.004 MB /tmp/perf.data.X3M,
compressed (original 0.002 MB, ratio is 3.609) ]
Checking compressed events stats:
# compressed : Zstd, level = 1, ratio = 4
COMPRESSED events: 1
2ELIFREPh---- end ----
Zstd perf.data compression/decompression: Ok
[root@t35lp46 perf]#
Signed-off-by: Thomas Richter
Reviewed-by: Sumanth Korikkar
Cc: Heiko Carstens
Cc: Sven Schnelle
Cc: Vasily Gorbik
Link: http://lore.kernel.org/lkml/20200729135314.91281-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Sasha Levin
-
[ Upstream commit 61f82e3fb697a8e85f22fdec786528af73dc36d1 ]
In the absence of any modules, no "modules" map is created, but there
are other executable pages to map, due to eBPF JIT, kprobe or ftrace.
Map them by recognizing that the first "module" symbol is not
necessarily from a module, and adjust the map accordingly.
Signed-off-by: Adrian Hunter
Cc: Alexander Shishkin
Cc: Borislav Petkov
Cc: H. Peter Anvin
Cc: Jiri Olsa
Cc: Leo Yan
Cc: Mark Rutland
Cc: Masami Hiramatsu
Cc: Mathieu Poirier
Cc: Peter Zijlstra
Cc: Steven Rostedt (VMware)
Cc: x86@kernel.org
Link: http://lore.kernel.org/lkml/20200512121922.8997-10-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Sasha Levin
-
[ Upstream commit a159e2fe89b4d1f9fb54b0ae418b961e239bf617 ]
Avoid a simple memory leak.
Signed-off-by: Ian Rogers
Cc: Alexander Shishkin
Cc: Alexei Starovoitov
Cc: Andi Kleen
Cc: Andrii Nakryiko
Cc: Cong Wang
Cc: Daniel Borkmann
Cc: Jin Yao
Cc: Jiri Olsa
Cc: John Fastabend
Cc: John Garry
Cc: Kajol Jain
Cc: Kan Liang
Cc: Kim Phillips
Cc: Mark Rutland
Cc: Martin KaFai Lau
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Song Liu
Cc: Stephane Eranian
Cc: Vince Weaver
Cc: Yonghong Song
Cc: bpf@vger.kernel.org
Cc: kp singh
Cc: netdev@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200508053629.210324-10-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Sasha Levin
-
[ Upstream commit 07e9a6f538cbeecaf5c55b6f2991416f873cdcbd ]
Free "str" before returning when asprintf() fails, to avoid a memory
leak.
Signed-off-by: Xie XiuQi
Cc: Alexander Shishkin
Cc: Hongbo Yao
Cc: Jiri Olsa
Cc: Li Bin
Cc: Mark Rutland
Cc: Namhyung Kim
Link: http://lore.kernel.org/lkml/20200521133218.30150-4-liwei391@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Sasha Levin
-
[ Upstream commit ea9eb1f456a08c18feb485894185f7a4e31cc8a4 ]
Joakim reported a wrong duration_time value for intervals bigger
than 4000 [1].
The problem is in the interval value we pass to the update_stats
function, which is typed as 'unsigned int' and overflows when
we get over 2^32 (happens between intervals 4000 and 5000).
Retype the passed value to unsigned long long.
[1] https://www.spinics.net/lists/linux-perf-users/msg11777.html
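The overflow can be reproduced with a small sketch. It assumes the interval is given in milliseconds and converted to nanoseconds by multiplying by 1000000; that assumption is consistent with the threshold between 4000 and 5000 mentioned above, but is not spelled out in this message. The function names are hypothetical.

```c
#include <stdint.h>

#define NSEC_PER_MSEC 1000000

/* 32-bit arithmetic: wraps once interval_ms * 1000000 exceeds UINT32_MAX. */
uint32_t interval_ns_32(uint32_t interval_ms)
{
	return interval_ms * (uint32_t)NSEC_PER_MSEC;
}

/* 64-bit arithmetic: widen the operand before multiplying. */
uint64_t interval_ns_64(uint32_t interval_ms)
{
	return (uint64_t)interval_ms * NSEC_PER_MSEC;
}
```

With an interval of 4000 both variants agree; with 5000 the 32-bit product wraps around while the 64-bit one yields 5000000000.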
Fixes: b90f1333ef08 ("perf stat: Update walltime_nsecs_stats in interval mode")
Reported-by: Joakim Zhang
Signed-off-by: Jiri Olsa
Cc: Alexander Shishkin
Cc: Andi Kleen
Cc: Michael Petlan
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lore.kernel.org/lkml/20200518131445.3745083-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Sasha Levin
-
[ Upstream commit 7597ce89b3ed239f7a3408b930d2a6c7a4c938a1 ]
Make the architecture test directory agree with the code comment.
Committer notes:
This was split from a larger patch.
The code was assuming the developer always worked from tools/perf/, so make sure we
do the test -d having $toolsdir/perf/arch/$arch, to match the intent expressed in the comment,
just above that loop.
Signed-off-by: Ian Rogers
Cc: Adrian Hunter
Cc: Alexander Shishkin
Cc: Alexios Zavras
Cc: Andi Kleen
Cc: Greg Kroah-Hartman
Cc: Igor Lubashev
Cc: Jiri Olsa
Cc: Kan Liang
Cc: Mark Rutland
Cc: Mathieu Poirier
Cc: Namhyung Kim
Cc: Nick Desaulniers
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Thomas Gleixner
Cc: Wei Li
Link: http://lore.kernel.org/lkml/20200306071110.130202-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Sasha Levin
-
[ Upstream commit 3efc899d9afb3d03604f191a0be9669eabbfc4aa ]
If allocated, perf_pkg_mask and metric_events need freeing.
Signed-off-by: Ian Rogers
Reviewed-by: Andi Kleen
Cc: Adrian Hunter
Cc: Alexander Shishkin
Cc: Jiri Olsa
Cc: Mark Rutland
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lore.kernel.org/lkml/20200512235918.10732-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Sasha Levin
-
[ Upstream commit 266150c94c69429cf6d18e130237224a047f5061 ]
Realloc of size zero is a free, not an error; avoid this causing a double
free. Caught by clang's address sanitizer:
==2634==ERROR: AddressSanitizer: attempting double-free on 0x6020000015f0 in thread T0:
#0 0x5649659297fd in free llvm/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:123:3
#1 0x5649659e9251 in __zfree tools/lib/zalloc.c:13:2
#2 0x564965c0f92c in mem2node__exit tools/perf/util/mem2node.c:114:2
#3 0x564965a08b4c in perf_c2c__report tools/perf/builtin-c2c.c:2867:2
#4 0x564965a0616a in cmd_c2c tools/perf/builtin-c2c.c:2989:10
#5 0x564965944348 in run_builtin tools/perf/perf.c:312:11
#6 0x564965943235 in handle_internal_command tools/perf/perf.c:364:8
#7 0x5649659440c4 in run_argv tools/perf/perf.c:408:2
#8 0x564965942e41 in main tools/perf/perf.c:538:3
0x6020000015f0 is located 0 bytes inside of 1-byte region [0x6020000015f0,0x6020000015f1)
freed by thread T0 here:
#0 0x564965929da3 in realloc third_party/llvm/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
#1 0x564965c0f55e in mem2node__init tools/perf/util/mem2node.c:97:16
#2 0x564965a08956 in perf_c2c__report tools/perf/builtin-c2c.c:2803:8
#3 0x564965a0616a in cmd_c2c tools/perf/builtin-c2c.c:2989:10
#4 0x564965944348 in run_builtin tools/perf/perf.c:312:11
#5 0x564965943235 in handle_internal_command tools/perf/perf.c:364:8
#6 0x5649659440c4 in run_argv tools/perf/perf.c:408:2
#7 0x564965942e41 in main tools/perf/perf.c:538:3
previously allocated by thread T0 here:
#0 0x564965929c42 in calloc third_party/llvm/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3
#1 0x5649659e9220 in zalloc tools/lib/zalloc.c:8:9
#2 0x564965c0f32d in mem2node__init tools/perf/util/mem2node.c:61:12
#3 0x564965a08956 in perf_c2c__report tools/perf/builtin-c2c.c:2803:8
#4 0x564965a0616a in cmd_c2c tools/perf/builtin-c2c.c:2989:10
#5 0x564965944348 in run_builtin tools/perf/perf.c:312:11
#6 0x564965943235 in handle_internal_command tools/perf/perf.c:364:8
#7 0x5649659440c4 in run_argv tools/perf/perf.c:408:2
#8 0x564965942e41 in main tools/perf/perf.c:538:3
v2: add a WARN_ON_ONCE when the free condition arises.
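One way to avoid the ambiguity of a zero-sized realloc is sketched below. This is not the patch itself (which, per the v2 note, warns when the condition arises); it is a hypothetical helper, shrink_array, that simply never calls realloc with size zero, so the caller always owns exactly one allocation to free later.

```c
#include <stdlib.h>

/*
 * Shrink 'entries' to hold 'used' elements of 'elem_size' bytes each.
 * Returns the array to keep using. The original pointer is returned
 * unchanged when used == 0 or when realloc fails.
 */
void *shrink_array(void *entries, size_t used, size_t elem_size)
{
	void *tmp;

	/* realloc(ptr, 0) may free ptr; skip it so ownership stays clear. */
	if (used == 0)
		return entries;

	tmp = realloc(entries, used * elem_size);
	/* On failure realloc leaves the old block intact, so keep it. */
	return tmp ? tmp : entries;
}
```

Because the helper never hands back NULL for a valid input pointer, a later free() of the returned pointer releases the block exactly once.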
Signed-off-by: Ian Rogers
Acked-by: Jiri Olsa
Cc: Alexander Shishkin
Cc: Mark Rutland
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: clang-built-linux@googlegroups.com
Link: http://lore.kernel.org/lkml/20200320182347.87675-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Sasha Levin
-
[ Upstream commit bec49a9e05db3dbdca696fa07c62c52638fb6371 ]
When it is not possible for a non-privileged perf command to monitor at
the kernel level (:k), the fallback code forces a :u. That works if the
event was previously monitoring both levels. But if the event was
already constrained to kernel only, then it does not make sense to
restrict it to user only.
Given the code works by exclusion, a kernel only event would have:
attr->exclude_user = 1
The fallback code would add:
attr->exclude_kernel = 1
In the end the event would not monitor in either the user level or kernel
level. In other words, it would count nothing.
An event programmed to monitor kernel only cannot be switched to user
only without seriously warning the user.
This patch forces an error in this case to make it clear the request
cannot really be satisfied.
Behavior with paranoid 1:
$ sudo bash -c "echo 1 > /proc/sys/kernel/perf_event_paranoid"
$ perf stat -e cycles:k sleep 1
Performance counter stats for 'sleep 1':
1,520,413 cycles:k
1.002361664 seconds time elapsed
0.002480000 seconds user
0.000000000 seconds sys
Old behavior with paranoid 2:
$ sudo bash -c "echo 2 > /proc/sys/kernel/perf_event_paranoid"
$ perf stat -e cycles:k sleep 1
Performance counter stats for 'sleep 1':
0 cycles:ku
1.002358127 seconds time elapsed
0.002384000 seconds user
0.000000000 seconds sys
New behavior with paranoid 2:
$ sudo bash -c "echo 2 > /proc/sys/kernel/perf_event_paranoid"
$ perf stat -e cycles:k sleep 1
Error:
You may not have permission to collect stats.
Consider tweaking /proc/sys/kernel/perf_event_paranoid,
which controls use of the performance events system by
unprivileged users (without CAP_PERFMON or CAP_SYS_ADMIN).
The current value is 2:
-1: Allow use of (almost) all events by all users
Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK
>= 0: Disallow ftrace function tracepoint by users without CAP_PERFMON or CAP_SYS_ADMIN
Disallow raw tracepoint access by users without CAP_SYS_PERFMON or CAP_SYS_ADMIN
>= 1: Disallow CPU event access by users without CAP_PERFMON or CAP_SYS_ADMIN
>= 2: Disallow kernel profiling by users without CAP_PERFMON or CAP_SYS_ADMIN
To make this setting permanent, edit /etc/sysctl.conf too, e.g.:
kernel.perf_event_paranoid = -1
v2 of this patch addresses the review feedback from jolsa@redhat.com.
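The fallback rule described in this commit can be sketched as follows. The struct and function (fake_attr, fallback_to_user_only) are hypothetical simplifications; only the exclude_user and exclude_kernel names come from the message, and the real fallback code in perf is more involved.

```c
#include <stdbool.h>

/* Hypothetical subset of the exclusion bits mentioned in the message. */
struct fake_attr {
	bool exclude_user;
	bool exclude_kernel;
};

/*
 * Try to fall back to user-only monitoring.
 * Returns 0 on success. Returns -1 and leaves attr untouched when the
 * event is already kernel-only, because excluding the kernel as well
 * would leave nothing to count.
 */
int fallback_to_user_only(struct fake_attr *attr)
{
	if (attr->exclude_user)
		return -1;

	attr->exclude_kernel = true;
	return 0;
}
```

An event that monitored both levels falls back cleanly, while a kernel-only event is rejected instead of silently counting zero.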
Signed-off-by: Stephane Eranian
Reviewed-by: Ian Rogers
Acked-by: Jiri Olsa
Tested-by: Arnaldo Carvalho de Melo
Cc: Alexander Shishkin
Cc: Jiri Olsa
Cc: Mark Rutland
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lore.kernel.org/lkml/20200414161550.225588-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Sasha Levin
-
[ Upstream commit d74b181a028bb5a468f0c609553eff6a8fdf4887 ]
'snprintf' returns the number of characters which would be generated for
the given input.
If the returned value is *greater than* or equal to the buffer size, it
means that the output has been truncated.
Fix the overflow test accordingly.
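A minimal sketch of the corrected truncation test is shown below. The helper name format_cpu_path and the path layout are made up for illustration; the point is only the `>=` comparison against the buffer size.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format "<mnt>/cpu<N>" into buf.
 * Returns 0 on success, -1 on an encoding error or when the output
 * (including the terminating NUL) did not fit into 'size' bytes.
 */
int format_cpu_path(char *buf, size_t size, const char *mnt, int cpu)
{
	int ret = snprintf(buf, size, "%s/cpu%d", mnt, cpu);

	/* ret excludes the NUL, so ret == size already means truncation. */
	if (ret < 0 || (size_t)ret >= size)
		return -1;
	return 0;
}
```

With a 10-byte buffer, "/sys/cpu1" (9 characters) fits, whereas "/sys/cpu12" (10 characters) is reported as truncated; a `>` test would have missed that second case.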
Fixes: 7780c25bae59f ("perf tools: Allow ability to map cpus to nodes easily")
Fixes: 92a7e1278005b ("perf cpumap: Add cpu__max_present_cpu()")
Signed-off-by: Christophe JAILLET
Suggested-by: David Laight
Cc: Alexander Shishkin
Cc: Don Zickus
Cc: He Zhe
Cc: Jan Stancek
Cc: Jiri Olsa
Cc: Kan Liang
Cc: Mark Rutland
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: kernel-janitors@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200324070319.10901-1-christophe.jaillet@wanadoo.fr
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Sasha Levin
-
[ Upstream commit d4953f7ef1a2e87ef732823af35361404d13fea8 ]
Reproducible with a clang asan build and then running perf test, in
particular 'Parse event definition strings'.
Signed-off-by: Ian Rogers
Acked-by: Jiri Olsa
Cc: Adrian Hunter
Cc: Alexander Shishkin
Cc: Andi Kleen
Cc: Leo Yan
Cc: Mark Rutland
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: clang-built-linux@googlegroups.com
Link: http://lore.kernel.org/lkml/20200314170356.62914-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Sasha Levin
-
[ Upstream commit c9f5baa136777b2c982f6f7a90c9da69a88be148 ]
When 'etm->instructions_sample_period' is less than
'tidq->period_instructions', the function cs_etm__sample() cannot handle
this case properly with its logic.
Let's see the flow below as an example:
- If we set itrace option '--itrace=i4', then function cs_etm__sample()
has variables with initialized values:
tidq->period_instructions = 0
etm->instructions_sample_period = 4
- When the first packet is coming:
packet->instr_count = 10; the number of instructions executed in this
packet is 10, thus update period_instructions as below:
tidq->period_instructions = 0 + 10 = 10
instrs_over = 10 - 4 = 6
offset = 10 - 6 - 1 = 3
tidq->period_instructions = instrs_over = 6
- When the second packet is coming:
packet->instr_count = 10; in the second pass, assume 10 instructions
in the trace sample again:
tidq->period_instructions = 6 + 10 = 16
instrs_over = 16 - 4 = 12
offset = 10 - 12 - 1 = -3 -> the negative value
tidq->period_instructions = instrs_over = 12
So after handling these two packets, there are the issues below:
The first issue is that cs_etm__instr_addr() returns the address within
the current trace sample of the instruction related to offset, so the
offset is supposed to always be an unsigned value. But in fact, function
cs_etm__sample() might calculate a negative offset value (in handling
the second packet, the offset is -3) and pass it to cs_etm__instr_addr()
as a u64, where it becomes a big positive integer.
The second issue is that it only synthesizes 2 samples for sample period = 4.
In theory, every packet has 10 instructions so the two packets have
20 instructions in total, and 20 instructions should generate 5 samples
(4 x 5 = 20). This is because cs_etm__sample() calls
cs_etm__synth_instruction_sample() only once per range packet to
generate an instruction sample.
This patch fixes the logic in function cs_etm__sample(); the basic
idea for handling an incoming packet is:
- To synthesize the first instruction sample, it combines the leftover
instructions from the previous packet and the head of the new
packet; then it generates continuous samples with the sample period;
- At the tail of the new packet, if there are remaining instructions,
these instructions are left for the next sample.
Suggested-by: Mike Leach
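The carry-over idea described in this commit message can be sketched as a standalone helper. This is an illustration under stated assumptions, not the perf implementation: fake_sampler and sample_packet are hypothetical names, and the helper only returns instruction offsets within the packet rather than synthesizing samples.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-queue state; 'pending' must stay below 'period'. */
struct fake_sampler {
	uint64_t period;   /* instructions per sample, e.g. 4 for i4 */
	uint64_t pending;  /* instructions left over from earlier packets */
};

/*
 * Account for one packet of 'instr_count' instructions. Stores up to
 * 'max_offsets' sample offsets (instruction index within the packet)
 * and returns how many samples the packet produces.
 */
size_t sample_packet(struct fake_sampler *s, uint64_t instr_count,
		     uint64_t *offsets, size_t max_offsets)
{
	uint64_t total, offset;
	size_t n = 0;

	if (s->period == 0)
		return 0;

	total = s->pending + instr_count;
	/* Index of the instruction that completes the first period. */
	offset = s->period - s->pending - 1;

	while (total >= s->period) {
		if (n < max_offsets)
			offsets[n] = offset;
		n++;
		offset += s->period;
		total -= s->period;
	}

	/* The tail of the packet is carried into the next call. */
	s->pending = total;
	return n;
}
```

For a period of 4 and two packets of 10 instructions, the first call yields 2 samples and carries 2 instructions over, and the second call yields 3 samples, which gives the 5 samples the message expects for 20 instructions.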
Signed-off-by: Leo Yan
Reviewed-by: Mathieu Poirier
Reviewed-by: Mike Leach
Cc: Alexander Shishkin
Cc: Jiri Olsa
Cc: Mark Rutland
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Robert Walker
Cc: Suzuki Poulouse
Cc: coresight ml
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20200219021811.20067-4-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Sasha Levin
-
[ Upstream commit d01751563caf0dec7be36f81de77cc0197b77e59 ]
If the option '--itrace=iNNN' is used with Arm CoreSight trace data, the perf tool
fails to inject instruction samples; the root cause is that the packets are only
swapped for branch samples and last branches but not for instruction
samples, so newly arriving packets cannot be properly handled when only
synthesizing instruction samples.
To fix this issue, this patch refactors the code with a new function
cs_etm__packet_swap() which is used to swap packets and adds the
condition for instruction samples.
Signed-off-by: Leo Yan
Reviewed-by: Mathieu Poirier
Reviewed-by: Mike Leach
Cc: Alexander Shishkin
Cc: Jiri Olsa
Cc: Mark Rutland
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Robert Walker
Cc: Suzuki Poulouse
Cc: coresight ml
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20200219021811.20067-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Sasha Levin
-
[ Upstream commit 3f5777fbaf04c58d940526a22a2e0c813c837936 ]
The memory for the global pointer is never freed during normal program
execution, so let's do that at the exit of the main function as good
programming practice.
A stray blank line is also removed.
Reported-by: Jiri Olsa
Signed-off-by: John Garry
Cc: Alexander Shishkin
Cc: Andi Kleen
Cc: James Clark
Cc: Joakim Zhang
Cc: Mark Rutland
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Will Deacon
Cc: linuxarm@huawei.com
Link: http://lore.kernel.org/lkml/1583406486-154841-2-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Sasha Levin
-
[ Upstream commit 2bbc83537614517730e9f2811195004b712de207 ]
This test places a kprobe to function getname_flags() in the kernel
which has the following prototype:
struct filename *getname_flags(const char __user *filename, int flags, int *empty)
The 'filename' argument points to a filename located in user space memory.
Looking at commit 88903c464321c ("tracing/probe: Add ustring type for
user-space string") the kprobe should indicate that user space memory is
accessed.
Output before:
[root@m35lp76 perf]# ./perf test 66 67
66: Use vfs_getname probe to get syscall args filenames : FAILED!
67: Check open filename arg using perf trace + vfs_getname: FAILED!
[root@m35lp76 perf]#
Output after:
[root@m35lp76 perf]# ./perf test 66 67
66: Use vfs_getname probe to get syscall args filenames : Ok
67: Check open filename arg using perf trace + vfs_getname: Ok
[root@m35lp76 perf]#
Comments from Masami Hiramatsu:
This bug doesn't happen on x86 or other archs on which user address
space and kernel address space is the same. On some arches (ppc64 in
this case?) user address space is partially or completely the same as
kernel address space.
(Yes, they switch the world when running into the kernel) In this case,
we need to use different data access functions for each space.
That is why I introduced the "ustring" type for kprobe events.
As far as I can see, Thomas's patch is sane. Thomas, could you show us
your result on your test environment?
Comments from Thomas Richter:
Test results for s/390 included above.
Signed-off-by: Thomas Richter
Acked-by: Masami Hiramatsu
Tested-by: Arnaldo Carvalho de Melo
Cc: Heiko Carstens
Cc: Sumanth Korikkar
Cc: Vasily Gorbik
Link: http://lore.kernel.org/lkml/20200217102111.61137-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Sasha Levin
23 Sep, 2020
5 commits
-
[ Upstream commit d26383dcb2b4b8629fde05270b4e3633be9e3d4b ]
The following leaks were detected by ASAN:
Indirect leak of 360 byte(s) in 9 object(s) allocated from:
#0 0x7fecc305180e in calloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10780e)
#1 0x560578f6dce5 in perf_pmu__new_format util/pmu.c:1333
#2 0x560578f752fc in perf_pmu_parse util/pmu.y:59
#3 0x560578f6a8b7 in perf_pmu__format_parse util/pmu.c:73
#4 0x560578e07045 in test__pmu tests/pmu.c:155
#5 0x560578de109b in run_test tests/builtin-test.c:410
#6 0x560578de109b in test_and_print tests/builtin-test.c:440
#7 0x560578de401a in __cmd_test tests/builtin-test.c:661
#8 0x560578de401a in cmd_test tests/builtin-test.c:807
#9 0x560578e49354 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
#10 0x560578ce71a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
#11 0x560578ce71a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
#12 0x560578ce71a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
#13 0x7fecc2b7acc9 in __libc_start_main ../csu/libc-start.c:308
Fixes: cff7f956ec4a1 ("perf tests: Move pmu tests into separate object")
Signed-off-by: Namhyung Kim
Acked-by: Jiri Olsa
Cc: Alexander Shishkin
Cc: Andi Kleen
Cc: Ian Rogers
Cc: Mark Rutland
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lore.kernel.org/lkml/20200915031819.386559-12-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Sasha Levin
-
[ Upstream commit b12eea5ad8e77f8a380a141e3db67c07432dde16 ]
The evsel->unit borrows a pointer from a pmu event or alias instead of
owning a string. But the tool event (duration_time) passes the result of
strdup(), which caused a leak.
It was found by ASAN during the metric test:
Direct leak of 210 byte(s) in 70 object(s) allocated from:
#0 0x7fe366fca0b5 in strdup (/lib/x86_64-linux-gnu/libasan.so.5+0x920b5)
#1 0x559fbbcc6ea3 in add_event_tool util/parse-events.c:414
#2 0x559fbbcc6ea3 in parse_events_add_tool util/parse-events.c:1414
#3 0x559fbbd8474d in parse_events_parse util/parse-events.y:439
#4 0x559fbbcc95da in parse_events__scanner util/parse-events.c:2096
#5 0x559fbbcc95da in __parse_events util/parse-events.c:2141
#6 0x559fbbc28555 in check_parse_id tests/pmu-events.c:406
#7 0x559fbbc28555 in check_parse_id tests/pmu-events.c:393
#8 0x559fbbc28555 in check_parse_cpu tests/pmu-events.c:415
#9 0x559fbbc28555 in test_parsing tests/pmu-events.c:498
#10 0x559fbbc0109b in run_test tests/builtin-test.c:410
#11 0x559fbbc0109b in test_and_print tests/builtin-test.c:440
#12 0x559fbbc03e69 in __cmd_test tests/builtin-test.c:695
#13 0x559fbbc03e69 in cmd_test tests/builtin-test.c:807
#14 0x559fbbc691f4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
#15 0x559fbbb071a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
#16 0x559fbbb071a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
#17 0x559fbbb071a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
#18 0x7fe366b68cc9 in __libc_start_main ../csu/libc-start.c:308
Fixes: f0fbb114e3025 ("perf stat: Implement duration_time as a proper event")
Signed-off-by: Namhyung Kim
Acked-by: Jiri Olsa
Cc: Alexander Shishkin
Cc: Andi Kleen
Cc: Ian Rogers
Cc: Mark Rutland
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lore.kernel.org/lkml/20200915031819.386559-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Sasha Levin
-
[ Upstream commit bfd1b83d75e44a9f65de30accb3dd3b5940bd3ac ]
ASAN reported a leak of the cpu and thread maps as they have one more refcount
than released. I found that after setting evlist maps it should release
its refcount.
It seems to be broken from the beginning so I chose the original commit
as the culprit. But not sure how it's applied to stable trees since
there are many changes in the code after that.
Fixes: 7e2ed097538c5 ("perf evlist: Store pointer to the cpu and thread maps")
Fixes: 4112eb1899c0e ("perf evlist: Default to syswide target when no thread/cpu maps set")
Signed-off-by: Namhyung Kim
Acked-by: Jiri Olsa
Cc: Alexander Shishkin
Cc: Andi Kleen
Cc: Ian Rogers
Cc: Mark Rutland
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lore.kernel.org/lkml/20200915031819.386559-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Sasha Levin
-
[ Upstream commit 8a39e8c4d9baf65d88f66d49ac684df381e30055 ]
When compiling with DEBUG=1 on Fedora 32 I'm getting a crash for 'perf
test signal':
Program received signal SIGSEGV, Segmentation fault.
0x0000000000c68548 in __test_function ()
(gdb) bt
#0 0x0000000000c68548 in __test_function ()
#1 0x00000000004d62e9 in test_function () at tests/bp_signal.c:61
#2 0x00000000004d689a in test__bp_signal (test=0xa8e280 DW_AT_producer : (indirect string, offset: 0x254a): GNU C99 10.2.1 20200723 (Red Hat 10.2.1-1) -mtune=generic -march=x86-64 -ggdb3 -std=gnu99 -fno-omit-frame-pointer -funwind-tables -fstack-protector-all
^^^^^
^^^^^
^^^^^
$
Before:
$ perf test signal
20: Breakpoint overflow signal handler : FAILED!
$
After:
$ perf test signal
20: Breakpoint overflow signal handler : Ok
$
Fixes: 8fd34e1cce18 ("perf test: Improve bp_signal")
Signed-off-by: Jiri Olsa
Tested-by: Arnaldo Carvalho de Melo
Cc: Alexander Shishkin
Cc: Michael Petlan
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Wang Nan
Link: http://lore.kernel.org/lkml/20200911130005.1842138-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Sasha Levin
-
The ddr bit width on i.MX8MN evk board is 16 bit, not 32 bit.
Reviewed-by: Fugang Duan
Signed-off-by: Joakim Zhang
10 Sep, 2020
2 commits
-
commit a060c1f12b525ba828f871eff3127dabf8daa1e6 upstream.
The help info of option "--no-bpf-event" is wrongly described as "record
bpf events", correct it.
Committer testing:
$ perf record -h bpf
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
--clang-opt <clang options>
options passed to clang when compiling BPF scriptlets
--clang-path <clang path>
clang binary to use for compiling BPF scriptlets
--no-bpf-event do not record bpf events
$
Fixes: 71184c6ab7e6 ("perf record: Replace option --bpf-event with --no-bpf-event")
Signed-off-by: Wei Li
Acked-by: Song Liu
Tested-by: Arnaldo Carvalho de Melo
Cc: Alexander Shishkin
Cc: Hanjun Guo
Cc: Jiri Olsa
Cc: Li Bin
Cc: Mark Rutland
Cc: Namhyung Kim
Link: http://lore.kernel.org/lkml/20200819031947.12115-1-liwei391@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Greg Kroah-Hartman
-
[ Upstream commit e62458e3940eb3dfb009481850e140fbee183b04 ]
The new string should have enough space for the original string and the
back slashes IMHO.
Fixes: fbc2844e84038ce3 ("perf vendor events: Use more flexible pattern matching for CPU identification for mapfile.csv")
Signed-off-by: Namhyung Kim
Reviewed-by: Ian Rogers
Cc: Alexander Shishkin
Cc: Andi Kleen
Cc: Jiri Olsa
Cc: John Garry
Cc: Kajol Jain
Cc: Mark Rutland
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: William Cohen
Link: http://lore.kernel.org/lkml/20200903152510.489233-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Sasha Levin
05 Sep, 2020
1 commit
-
commit e48a73a312ebf19cc3d72aa74985db25c30757c1 upstream.
Event modifiers are not mentioned in the perf record or perf stat
manpages. Add them to orient new users more effectively by pointing
them to the perf list manpage for details.
Fixes: 2055fdaf8703 ("perf list: Document precise event sampling for AMD IBS")
Signed-off-by: Kim Phillips
Cc: Adrian Hunter
Cc: Alexander Shishkin
Cc: Alexey Budankov
Cc: Ian Rogers
Cc: Jin Yao
Cc: Jiri Olsa
Cc: Mark Rutland
Cc: Namhyung Kim
Cc: Paul Clarke
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Tony Jones
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200901215853.276234-1-kim.phillips@amd.com
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Greg Kroah-Hartman
02 Sep, 2020
5 commits
-
Add bandwidth usage metric for i.MX8QM DDR Perf.
Test Report:
------------------------------------------------------
root@imx8qmmek:~# ./perf list metric
List of pre-defined events (to be used in -e):
Metrics:
imx8qm-ddr0-all-r
[imx8qm: bytes of all masters read from ddr0]
imx8qm-ddr0-all-w
[imx8qm: bytes of all masters write to ddr0]
imx8qm-ddr0-bandwidth-usage
[imx8qm: percentage of bandwidth usage for ddr0]
imx8qm-ddr1-all-r
[imx8qm: bytes of all masters read from ddr1]
imx8qm-ddr1-all-w
[imx8qm: bytes of all masters write to ddr1]
imx8qm-ddr1-bandwidth-usage
[imx8qm: percentage of bandwidth usage for ddr1]
------------------------------------------------------
root@imx8qmmek:~# ./perf stat -a -I 1000 -M imx8qm-ddr0-bandwidth-usage dd if=/dev/zero of=/dev/null bs=1M count=1000000
1.000137560 8403160 imx8_ddr0/read-cycles/ # 11.9 % imx8qm-ddr0-bandwidth-usage
1.000137560 86499449 imx8_ddr0/write-cycles/
1.000137560 1000137560 ns duration_time
2.000542875 8818984 imx8_ddr0/read-cycles/ # 10.5 % imx8qm-ddr0-bandwidth-usage
2.000542875 74883499 imx8_ddr0/write-cycles/
2.000542875 1000405315 ns duration_time
3.000839188 8604400 imx8_ddr0/read-cycles/ # 9.6 % imx8qm-ddr0-bandwidth-usage
3.000839188 68284175 imx8_ddr0/write-cycles/
3.000839188 1000296313 ns duration_time
--------------------------------------------------------
root@imx8qmmek:~# ./perf stat -a -I 1000 -M imx8qm-ddr1-bandwidth-usage dd if=/dev/zero of=/dev/null bs=1M count=1000000
1.000129435 15152856 imx8_ddr1/read-cycles/ # 14.5 % imx8qm-ddr1-bandwidth-usage
1.000129435 100669236 imx8_ddr1/write-cycles/
1.000129435 1000129435 ns duration_time
2.000521875 15463356 imx8_ddr1/read-cycles/ # 13.4 % imx8qm-ddr1-bandwidth-usage
2.000521875 91710077 imx8_ddr1/write-cycles/
2.000521875 1000392440 ns duration_time
3.000794688 15773560 imx8_ddr1/read-cycles/ # 12.7 % imx8qm-ddr1-bandwidth-usage
3.000794688 85948507 imx8_ddr1/write-cycles/
3.000794688 1000272813 ns duration_time
Signed-off-by: Joakim Zhang
-
Add bandwidth usage metric for i.MX8QXP DDR Perf.
Test Report:
----------------------------------------------------
root@imx8qxpmek:~# ./perf list metric
List of pre-defined events (to be used in -e):
Metrics:
imx8qxp-ddr0-all-r
[imx8qxp: bytes of all masters read from ddr0]
imx8qxp-ddr0-all-w
[imx8qxp: bytes of all masters write to ddr0]
imx8qxp-ddr0-bandwidth-usage
[imx8qxp: percentage of bandwidth usage for ddr0]
--------------------------------------------------------
root@imx8qxpmek:~# ./perf stat -a -I 1000 -M imx8qxp-ddr0-bandwidth-usage dd if=/dev/zero of=/dev/null bs=1M count=1000000
1.000170750 681608 imx8_ddr0/read-cycles/ # 13.6 % imx8qxp-ddr0-bandwidth-usage
1.000170750 81013320 imx8_ddr0/write-cycles/
1.000170750 1000170750 ns duration_time
2.000833375 592804 imx8_ddr0/read-cycles/ # 13.7 % imx8qxp-ddr0-bandwidth-usage
2.000833375 81756288 imx8_ddr0/write-cycles/
2.000833375 1000662625 ns duration_time
3.001393875 611804 imx8_ddr0/read-cycles/ # 13.6 % imx8qxp-ddr0-bandwidth-usage
3.001393875 80897346 imx8_ddr0/write-cycles/
3.001393875 1000560500 ns duration_time
4.001917375 600564 imx8_ddr0/read-cycles/ # 13.5 % imx8qxp-ddr0-bandwidth-usage
4.001917375 80269884 imx8_ddr0/write-cycles/
4.001917375 1000523500 ns duration_time

Signed-off-by: Joakim Zhang
-
Add bandwidth usage metric for i.MX8MM DDR Perf.
Test Report:
-------------------------------------------------------
root@imx8mmevk:~# ./perf list metric
List of pre-defined events (to be used in -e):
Metrics:
imx8mm-ddr0-2d-r
[imx8mm: bursts of gpu 2d read from ddr0]
imx8mm-ddr0-2d-w
[imx8mm: bursts of gpu 2d write to ddr0]
imx8mm-ddr0-3d-r
[imx8mm: bursts of gpu 3d read from ddr0]
imx8mm-ddr0-3d-w
[imx8mm: bursts of gpu 3d write to ddr0]
imx8mm-ddr0-a53-r
[imx8mm: bursts of a53 core read from ddr0]
imx8mm-ddr0-a53-w
[imx8mm: bursts of a53 core write to ddr0]
imx8mm-ddr0-all-r
[imx8mm: bytes of all masters read from ddr0]
imx8mm-ddr0-all-w
[imx8mm: bytes of all masters write to ddr0]
---------------------------------------------------------
root@imx8mmevk:~# ./perf stat -a -I 1000 -M imx8mm-ddr0-bandwidth-usage-lpddr4 dd if=/dev/zero of=/dev/null bs=1M count=1000000
1.000127125 324072 imx8_ddr0/read-cycles/ # 33.4 % imx8mm-ddr0-bandwidth-usage-lpddr4
1.000127125 250417562 imx8_ddr0/write-cycles/
1.000127125 1000127125 ns duration_time
2.001282750 293964 imx8_ddr0/read-cycles/ # 33.9 % imx8mm-ddr0-bandwidth-usage-lpddr4
2.001282750 254176749 imx8_ddr0/write-cycles/
2.001282750 1001155625 ns duration_time
3.002299500 234264 imx8_ddr0/read-cycles/ # 33.0 % imx8mm-ddr0-bandwidth-usage-lpddr4
3.002299500 247474957 imx8_ddr0/write-cycles/
3.002299500 1001016750 ns duration_time
4.003355875 202304 imx8_ddr0/read-cycles/ # 32.8 % imx8mm-ddr0-bandwidth-usage-lpddr4
4.003355875 245469156 imx8_ddr0/write-cycles/
4.003355875 1001056375 ns duration_time

Signed-off-by: Joakim Zhang
-
Add JSON file for i.MX8MQ DDR Perf.
Test Report:
-------------------------------------------------------------
root@imx8mqevk:~# ./perf list metric
List of pre-defined events (to be used in -e):
Metrics:
imx8mq-ddr0-all-r
[imx8mq: bytes of all masters read from ddr0]
imx8mq-ddr0-all-w
[imx8mq: bytes of all masters write to ddr0]
imx8mq-ddr0-bandwidth-usage
[imx8mq: percentage of bandwidth usage for ddr0]
------------------------------------------------------------
root@imx8mqevk:~# ./perf stat -a -I 1000 -M imx8mq-ddr0-all-r,imx8mq-ddr0-all-w
1.001143121 34224 imx8_ddr0/read-cycles/ # 547584.0 imx8mq-ddr0-all-r
1.001143121 10805 imx8_ddr0/write-cycles/ # 172880.0 imx8mq-ddr0-all-w
2.003035881 31656 imx8_ddr0/read-cycles/ # 506496.0 imx8mq-ddr0-all-r
2.003035881 7585 imx8_ddr0/write-cycles/ # 121360.0 imx8mq-ddr0-all-w
3.004305241 19864 imx8_ddr0/read-cycles/ # 317824.0 imx8mq-ddr0-all-r
3.004305241 1483 imx8_ddr0/write-cycles/ # 23728.0 imx8mq-ddr0-all-w
------------------------------------------------------------
root@imx8mqevk:~# ./perf stat -a -I 1000 -M imx8mq-ddr0-bandwidth-usage dd if=/dev/zero of=/dev/null bs=1M count=1000000
1.000643080 126560 imx8_ddr0/read-cycles/ # 11.6 % imx8mq-ddr0-bandwidth-usage
1.000643080 92714082 imx8_ddr0/write-cycles/
1.000643080 1000643080 ns duration_time
2.002052721 82056 imx8_ddr0/read-cycles/ # 9.4 % imx8mq-ddr0-bandwidth-usage
2.002052721 75279735 imx8_ddr0/write-cycles/
2.002052721 1001409641 ns duration_time
3.003379081 85448 imx8_ddr0/read-cycles/ # 9.3 % imx8mq-ddr0-bandwidth-usage
3.003379081 74199950 imx8_ddr0/write-cycles/
3.003379081 1001326360 ns duration_time
4.004734241 91084 imx8_ddr0/read-cycles/ # 9.5 % imx8mq-ddr0-bandwidth-usage
4.004734241 75513082 imx8_ddr0/write-cycles/
4.004734241 1001355160 ns duration_time

Signed-off-by: Joakim Zhang
-
Add JSON file for i.MX8MN DDR Perf
Test Report:
---------------------------------------------------------------
root@imx8mnevk:~# ./perf list metric
List of pre-defined events (to be used in -e):
Metrics:
imx8mn-ddr0-all-r
[imx8mn: bytes of all masters read from ddr0]
imx8mn-ddr0-all-w
[imx8mn: bytes of all masters write to ddr0]
imx8mn-ddr0-bandwidth-usage-ddr4
[imx8mn: percentage of bandwidth usage for ddr0]
imx8mn-ddr0-bandwidth-usage-lpddr4
[imx8mn: percentage of bandwidth usage for ddr0]
------------------------------------------------------------------
root@imx8mnevk:~# ./perf stat -a -I 1000 -M imx8mn-ddr0-all-r,imx8mn-ddr0-all-w
1.000469875 108120 imx8_ddr0/read-cycles/ # 1729920.0 imx8mn-ddr0-all-r
1.000469875 28841 imx8_ddr0/write-cycles/ # 461456.0 imx8mn-ddr0-all-w
2.001191750 37396 imx8_ddr0/read-cycles/ # 598336.0 imx8mn-ddr0-all-r
2.001191750 6090 imx8_ddr0/write-cycles/ # 97440.0 imx8mn-ddr0-all-w
------------------------------------------------------------------
root@imx8mnevk:~# ./perf stat -a -I 1000 -M imx8mn-ddr0-bandwidth-usage-lpddr4 dd if=/dev/zero of=/dev/null bs=1M count=1000000
1.000762250 840456 imx8_ddr0/read-cycles/ # 48.9 % imx8mn-ddr0-bandwidth-usage-lpddr4
1.000762250 390024176 imx8_ddr0/write-cycles/
1.000762250 1000762250 ns duration_time
2.001982125 592944 imx8_ddr0/read-cycles/ # 48.5 % imx8mn-ddr0-bandwidth-usage-lpddr4
2.001982125 387366923 imx8_ddr0/write-cycles/
2.001982125 1001219875 ns duration_time
3.003123250 542650 imx8_ddr0/read-cycles/ # 48.4 % imx8mn-ddr0-bandwidth-usage-lpddr4
3.003123250 386631603 imx8_ddr0/write-cycles/
3.003123250 1001141125 ns duration_time
4.004289875 538522 imx8_ddr0/read-cycles/ # 48.4 % imx8mn-ddr0-bandwidth-usage-lpddr4
4.004289875 386577020 imx8_ddr0/write-cycles/
4.004289875 1001166625 ns duration_time
5.005546750 515596 imx8_ddr0/read-cycles/ # 48.4 % imx8mn-ddr0-bandwidth-usage-lpddr4
5.005546750 386800889 imx8_ddr0/write-cycles/
5.005546750 1001256875 ns duration_time

Signed-off-by: Joakim Zhang
26 Aug, 2020
1 commit
-
[ Upstream commit 12d572e785b15bc764e956caaa8a4c846fd15694 ]
Fix the memory leak in debuginfo__find_trace_events() when the probe
point is not found in the debuginfo. If there is no probe point found in
the debuginfo, debuginfo__find_probes() will NOT return -ENOENT, but 0.

Thus the caller of debuginfo__find_probes() must check the tf.ntevs and
release the allocated memory for the array of struct probe_trace_event.

The current code releases the memory only if the debuginfo__find_probes()
hits an error but does not check tf.ntevs. As a result, the memory allocated
on *tevs is not released if tf.ntevs == 0.

This fixes the memory leak by checking tf.ntevs == 0 in addition to
ret < 0.

Fixes: ff741783506c ("perf probe: Introduce debuginfo to encapsulate dwarf information")
Signed-off-by: Masami Hiramatsu
Reviewed-by: Srikar Dronamraju
Cc: Andi Kleen
Cc: Oleg Nesterov
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/159438668346.62703.10887420400718492503.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Sasha Levin
21 Aug, 2020
3 commits
-
[ Upstream commit 1beaef29c34154ccdcb3f1ae557f6883eda18840 ]
For memcpy, the source pages are memset to zero only when --cycles is
used. This leads to wildly different results with or without --cycles,
since all source pages are likely to be mapped to the same zero page
without explicit writes.

Before this fix:
$ export cmd="./perf stat -e LLC-loads -- ./perf bench \
mem memcpy -s 1024MB -l 100 -f default"
$ $cmd
2,935,826 LLC-loads
3.821677452 seconds time elapsed
$ $cmd --cycles
217,533,436 LLC-loads
8.616725985 seconds time elapsed

After this fix:
$ $cmd
214,459,686 LLC-loads
8.674301124 seconds time elapsed
$ $cmd --cycles
214,758,651 LLC-loads
8.644480006 seconds time elapsed

Fixes: 47b5757bac03c338 ("perf bench mem: Move boilerplate memory allocation to the infrastructure")
Signed-off-by: Vincent Whitchurch
Cc: Alexander Shishkin
Cc: Jiri Olsa
Cc: Mark Rutland
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: kernel@axis.com
Link: http://lore.kernel.org/lkml/20200810133404.30829-1-vincent.whitchurch@axis.com
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Sasha Levin
-
commit a58a057ce65b52125dd355b7d8b0d540ea267a5f upstream.
CBR events can result in a duplicate branch event, because the state
type defaults to a branch. Fix by clearing the state type.

Example: trace 'sleep' and hope for a frequency change
Before:
$ perf record -e intel_pt//u sleep 0.1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.034 MB perf.data ]
$ perf script --itrace=bpe > before.txt

After:
$ perf script --itrace=bpe > after.txt
$ diff -u before.txt after.txt
# --- before.txt 2020-07-07 14:42:18.191508098 +0300
# +++ after.txt 2020-07-07 14:42:36.587891753 +0300
@@ -29673,7 +29673,6 @@
sleep 93431 [007] 15411.619905: 1 branches:u: 0 [unknown] ([unknown]) => 7f0818abb2e0 clock_nanosleep@@GLIBC_2.17+0x0 (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
sleep 93431 [007] 15411.619905: 1 branches:u: 7f0818abb30c clock_nanosleep@@GLIBC_2.17+0x2c (/usr/lib/x86_64-linux-gnu/libc-2.31.so) => 0 [unknown] ([unknown])
sleep 93431 [007] 15411.720069: cbr: cbr: 15 freq: 1507 MHz ( 56%) 7f0818abb30c clock_nanosleep@@GLIBC_2.17+0x2c (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
- sleep 93431 [007] 15411.720069: 1 branches:u: 7f0818abb30c clock_nanosleep@@GLIBC_2.17+0x2c (/usr/lib/x86_64-linux-gnu/libc-2.31.so) => 0 [unknown] ([unknown])
sleep 93431 [007] 15411.720076: 1 branches:u: 0 [unknown] ([unknown]) => 7f0818abb30e clock_nanosleep@@GLIBC_2.17+0x2e (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
sleep 93431 [007] 15411.720077: 1 branches:u: 7f0818abb323 clock_nanosleep@@GLIBC_2.17+0x43 (/usr/lib/x86_64-linux-gnu/libc-2.31.so) => 7f0818ac0eb7 __nanosleep+0x17 (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
sleep 93431 [007] 15411.720077: 1 branches:u: 7f0818ac0ebf __nanosleep+0x1f (/usr/lib/x86_64-linux-gnu/libc-2.31.so) => 55cb7e4c2827 rpl_nanosleep+0x97 (/usr/bin/sleep)

Fixes: 91de8684f1cff ("perf intel-pt: Cater for CBR change in PSB+")
Fixes: abe5a1d3e4bee ("perf intel-pt: Decoder to output CBR changes immediately")
Signed-off-by: Adrian Hunter
Reviewed-by: Andi Kleen
Tested-by: Arnaldo Carvalho de Melo
Cc: Jiri Olsa
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200710151104.15137-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Greg Kroah-Hartman
-
commit 401136bb084fd021acd9f8c51b52fe0a25e326b2 upstream.
While walking code towards a FUP ip, the packet state is
INTEL_PT_STATE_FUP or INTEL_PT_STATE_FUP_NO_TIP. That was mishandled
resulting in the state becoming INTEL_PT_STATE_IN_SYNC prematurely. The
result was an occasional lost EXSTOP event.

Signed-off-by: Adrian Hunter
Reviewed-by: Andi Kleen
Cc: Jiri Olsa
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200710151104.15137-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Greg Kroah-Hartman
05 Aug, 2020
4 commits
-
commit e4d9b04b973b2dbce7b42af95ea70d07da1c936d upstream.
Noticed with gcc 10 (fedora rawhide) that those variables were not being
declared as static, so end up with:

ld: /tmp/build/perf/bench/epoll-wait.o:/git/perf/tools/perf/bench/epoll-wait.c:93: multiple definition of `end'; /tmp/build/perf/bench/futex-hash.o:/git/perf/tools/perf/bench/futex-hash.c:40: first defined here
ld: /tmp/build/perf/bench/epoll-wait.o:/git/perf/tools/perf/bench/epoll-wait.c:93: multiple definition of `start'; /tmp/build/perf/bench/futex-hash.o:/git/perf/tools/perf/bench/futex-hash.c:40: first defined here
ld: /tmp/build/perf/bench/epoll-wait.o:/git/perf/tools/perf/bench/epoll-wait.c:93: multiple definition of `runtime'; /tmp/build/perf/bench/futex-hash.o:/git/perf/tools/perf/bench/futex-hash.c:40: first defined here
ld: /tmp/build/perf/bench/epoll-ctl.o:/git/perf/tools/perf/bench/epoll-ctl.c:38: multiple definition of `end'; /tmp/build/perf/bench/futex-hash.o:/git/perf/tools/perf/bench/futex-hash.c:40: first defined here
ld: /tmp/build/perf/bench/epoll-ctl.o:/git/perf/tools/perf/bench/epoll-ctl.c:38: multiple definition of `start'; /tmp/build/perf/bench/futex-hash.o:/git/perf/tools/perf/bench/futex-hash.c:40: first defined here
ld: /tmp/build/perf/bench/epoll-ctl.o:/git/perf/tools/perf/bench/epoll-ctl.c:38: multiple definition of `runtime'; /tmp/build/perf/bench/futex-hash.o:/git/perf/tools/perf/bench/futex-hash.c:40: first defined here
make[4]: *** [/git/perf/tools/build/Makefile.build:145: /tmp/build/perf/bench/perf-in.o] Error 1

Prefix those with bench__ and add them to bench/bench.h, so that we can
share those on the tools needing to access those variables from signal
handlers.

Acked-by: Thomas Gleixner
Cc: Adrian Hunter
Cc: Davidlohr Bueso
Cc: Jiri Olsa
Cc: Namhyung Kim
Link: http://lore.kernel.org/lkml/20200303155811.GD13702@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
Cc: Ben Hutchings
Signed-off-by: Greg Kroah-Hartman
-
commit ebcb9464a2ae3a547e97de476575c82ece0e93e2 upstream.
It is possible to return a pointer to a local variable when looking up
the architecture name for the running system and no normalization is
done on that value, i.e. we may end up returning the uts.machine local
variable.

While this doesn't happen on most arches, as normalization takes place,
let's fix this by making that a static variable and optimize it a bit by
not always running uname(), only the first time.

Noticed in fedora rawhide running with:
[perfbuilder@a5ff49d6e6e4 ~]$ gcc --version
gcc (GCC) 10.0.1 20200216 (Red Hat 10.0.1-0.8)

Reported-by: Jiri Olsa
Cc: Adrian Hunter
Cc: Namhyung Kim
Signed-off-by: Arnaldo Carvalho de Melo
Cc: Ben Hutchings
Signed-off-by: Greg Kroah-Hartman
-
commit cff20b3151ccab690715cb6cf0f5da5cccb32adf upstream.
To fix the build with newer gccs, that without this patch exit with:
LD /tmp/build/perf/tests/perf-in.o
ld: /tmp/build/perf/tests/bp_account.o:/git/perf/tools/perf/tests/bp_account.c:22: multiple definition of `the_var'; /tmp/build/perf/tests/bp_signal.o:/git/perf/tools/perf/tests/bp_signal.c:38: first defined here
make[4]: *** [/git/perf/tools/build/Makefile.build:145: /tmp/build/perf/tests/perf-in.o] Error 1

First noticed in fedora:rawhide/32 with:
[perfbuilder@a5ff49d6e6e4 ~]$ gcc --version
gcc (GCC) 10.0.1 20200216 (Red Hat 10.0.1-0.8)

Reported-by: Jiri Olsa
Cc: Adrian Hunter
Cc: Namhyung Kim
Signed-off-by: Arnaldo Carvalho de Melo
Cc: Ben Hutchings
Signed-off-by: Greg Kroah-Hartman
-
[ Upstream commit bd3c628f8fafa6cbd6a1ca440034b841f0080160 ]
When recording with cache-misses and arm_spe_x event, I found that it
will just fail without showing any error info if I put cache-misses
after the 'arm_spe_x' event.

[root@localhost 0620]# perf record -e cache-misses \
-e arm_spe_0/ts_enable=1,pct_enable=1,pa_enable=1,load_filter=1,jitter=1,store_filter=1,min_latency=0/ sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.067 MB perf.data ]
[root@localhost 0620]#
[root@localhost 0620]# perf record -e arm_spe_0/ts_enable=1,pct_enable=1,pa_enable=1,load_filter=1,jitter=1,store_filter=1,min_latency=0/ \
-e cache-misses sleep 1
[root@localhost 0620]#

The current code can only work if the only event to be traced is an
'arm_spe_x', or if it is the last event to be specified. Otherwise the
last event type will be checked against all the arm_spe_pmus[i]->types,
none will match and an out of bound 'i' index will be used in
arm_spe_recording_init().

We don't support concurrent multiple arm_spe_x events currently, that
is checked in arm_spe_recording_options(), and it will show the relevant
info. So add the check and record of the first found 'arm_spe_pmu' to
fix this issue here.

Fixes: ffd3d18c20b8 ("perf tools: Add ARM Statistical Profiling Extensions (SPE) support")
Signed-off-by: Wei Li
Reviewed-by: Mathieu Poirier
Tested-by: Leo Yan
Cc: Alexander Shishkin
Cc: Hanjun Guo
Cc: Jiri Olsa
Cc: Kim Phillips
Cc: Mark Rutland
Cc: Mike Leach
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Suzuki Poulouse
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20200724071111.35593-2-liwei391@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Sasha Levin