18 May, 2020
4 commits
-
Add ddr bandwidth usage metric support for i.MX8DXL.
Metric:
imx8dxl_lpddr4.bandwidth_usage
imx8dxl_ddr3l.bandwidth_usage

Example:
root@imx8dxlevk:~# ./perf stat -a -I 1000 -M imx8dxl_lpddr4.bandwidth_usage dd if=/dev/zero of=/dev/null bs=100M count=1000000
1.000242625 1444320 imx8_ddr0/axid-read,axi_mask=0xffff,axi_id=0x0000,axi_channel=0x0/ # 15.1 % imx8dxl_lpddr4.bandwidth_usage
1.000242625 88964544 imx8_ddr0/axid-write,axi_mask=0xffff,axi_id=0x0000,axi_channel=0x0/
1.000242625 1000242625 ns duration_time
2.001170500 297392 imx8_ddr0/axid-read,axi_mask=0xffff,axi_id=0x0000,axi_channel=0x0/ # 16.0 % imx8dxl_lpddr4.bandwidth_usage
2.001170500 95684315 imx8_ddr0/axid-write,axi_mask=0xffff,axi_id=0x0000,axi_channel=0x0/
2.001170500 1000927875 ns duration_time
3.001840125 320798 imx8_ddr0/axid-read,axi_mask=0xffff,axi_id=0x0000,axi_channel=0x0/ # 16.0 % imx8dxl_lpddr4.bandwidth_usage
3.001840125 95655155 imx8_ddr0/axid-write,axi_mask=0xffff,axi_id=0x0000,axi_channel=0x0/
3.001840125 1000669625 ns duration_time

Reviewed-by: Fugang Duan
Signed-off-by: Joakim Zhang -
Add ddr bandwidth usage metric support for i.MX8MP.
Metric:
imx8mp-lpddr4-bandwidth-usage

Example:
root@imx8mpevk:~# ./perf stat -a -I 1000 -M imx8mp-lpddr4-bandwidth-usage dd if=/dev/zero of=/dev/null bs=100M count=1000000
1.000770875 18081664 imx8_ddr0/axid-read,axi_mask=0xffff,axi_id=0x0000/ # 37.0 % imx8mp-lpddr4-bandwidth-usage
1.000770875 5895351484 imx8_ddr0/axid-write,axi_mask=0xffff,axi_id=0x0000/
1.000770875 1000770875 ns duration_time
2.001780250 11137456 imx8_ddr0/axid-read,axi_mask=0xffff,axi_id=0x0000/ # 39.0 % imx8mp-lpddr4-bandwidth-usage
2.001780250 6232776052 imx8_ddr0/axid-write,axi_mask=0xffff,axi_id=0x0000/
2.001780250 1001009375 ns duration_time
3.002748125 10643520 imx8_ddr0/axid-read,axi_mask=0xffff,axi_id=0x0000/ # 39.0 % imx8mp-lpddr4-bandwidth-usage
3.002748125 6229700768 imx8_ddr0/axid-write,axi_mask=0xffff,axi_id=0x0000/
3.002748125 1000967875 ns duration_time

Reviewed-by: Fugang Duan
Signed-off-by: Joakim Zhang -
In interval mode, if the metric expression contains the duration_time
event, a command run with -I 5000 can trigger this cast issue.

Reviewed-by: Fugang Duan
Signed-off-by: Joakim Zhang -
For interval mode, the metric is printed after the '#' character if it
exists. But it's not calculated by the counts generated in this
interval.

See the following examples:
root@kbl-ppc:~# perf stat -M CPI -I1000 --interval-count 2
# time counts unit events
1.000422803 764,809 inst_retired.any # 2.9 CPI
1.000422803 2,234,932 cycles
2.001464585 1,960,061 inst_retired.any # 1.6 CPI
2.001464585 4,022,591 cycles

The second CPI should not be 1.6 (4,022,591/1,960,061 is 2.1).
root@kbl-ppc:~# perf stat -e cycles,instructions -I1000 --interval-count 2
# time counts unit events
1.000429493 2,869,311 cycles
1.000429493 816,875 instructions # 0.28 insn per cycle
2.001516426 9,260,973 cycles
2.001516426 5,250,634 instructions # 0.87 insn per cycle

The second 'insn per cycle' should not be 0.87 (5,250,634/9,260,973 is
0.57).

The current code uses a global variable 'rt_stat' for tracking and
updating the std dev of runtime stat. Unlike the counts, 'rt_stat' is
not reset for each interval, whereas the counts are:

perf_stat_process_counter()
{
        if (config->interval)
                init_stats(ps->res_stats);
}

So for interval mode, the 'rt_stat' variable should be reset too.

This patch resets 'rt_stat' before read_counters(), so the runtime stat
is only calculated by the counts generated in this interval.

With this patch:
root@kbl-ppc:~# perf stat -M CPI -I1000 --interval-count 2
# time counts unit events
1.000420924 2,408,818 inst_retired.any # 2.1 CPI
1.000420924 5,010,111 cycles
2.001448579 2,798,407 inst_retired.any # 1.6 CPI
2.001448579 4,599,861 cycles

root@kbl-ppc:~# perf stat -e cycles,instructions -I1000 --interval-count 2
# time counts unit events
1.000428555 2,769,714 cycles
1.000428555 774,462 instructions # 0.28 insn per cycle
2.001471562 3,595,904 cycles
2.001471562 1,243,703 instructions # 0.35 insn per cycle

Now the second 'insn per cycle' and CPI are calculated by the counts
generated in this interval.

Signed-off-by: Jin Yao
Acked-by: Jiri Olsa
Tested-By: Kajol Jain
Cc: Alexander Shishkin
Cc: Andi Kleen
Cc: Jin Yao
Cc: Kan Liang
Cc: Peter Zijlstra
Link: http://lore.kernel.org/lkml/20200420145417.6864-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo

Joakim: cherry-picked from perf/core commit 197ba86fdc888dc0d3d6b89b402c9c6851d4c6fb
Reviewed-by: Fugang Duan
Signed-off-by: Joakim Zhang
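
The effect of not resetting 'rt_stat' in interval mode (the Jin Yao patch above) can be illustrated with a small sketch. This is not perf's code: struct run_stats and its helpers are hypothetical, simplified stand-ins for a running-mean tracker, and the sketch only shows why an accumulator that is never cleared mixes earlier intervals into the current one.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a running-statistics tracker. */
struct run_stats {
	double n;    /* number of samples folded in so far */
	double mean; /* running mean of those samples */
};

void run_stats_init(struct run_stats *s)
{
	s->n = 0.0;
	s->mean = 0.0;
}

void run_stats_update(struct run_stats *s, double val)
{
	s->n += 1.0;
	s->mean += (val - s->mean) / s->n;
}

/*
 * Feed one count per interval and return the mean seen after the last
 * interval. With reset_each_interval set, the stats are cleared before
 * every update, so only the current interval contributes.
 */
double mean_after_last_interval(const double *counts, size_t nr,
				int reset_each_interval)
{
	struct run_stats s;
	size_t i;

	run_stats_init(&s);
	for (i = 0; i < nr; i++) {
		if (reset_each_interval)
			run_stats_init(&s);
		run_stats_update(&s, counts[i]);
	}
	return s.mean;
}
```

Fed the two inst_retired.any counts from the first example (764,809 then 1,960,061), the unreset tracker reports 1,362,435 after the second interval, while the reset tracker reports 1,960,061, the count that interval actually produced.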
01 Apr, 2020
2 commits
-
Add JSON metrics for imx8dxl DDR Perf.
Acked-by: Fugang Duan
Signed-off-by: Joakim Zhang -
Fix metricgroup adding metric events multiple times.
Before:
root@imx8dxlevk:~# ./perf stat -a -I 1000 -M imx8dxl_ddr0_read.all
1.000135500 40047 imx8_ddr0/axid-read,axi_mask=0xffff,axi_id=0x0000,axi_channel=0x0/ # 320376.0 imx8dxl_ddr0_read.all
# 320376.0 imx8dxl_ddr0_read.all
# 320376.0 imx8dxl_ddr0_read.all

After:
root@imx8dxlevk:~# ./perf stat -a -I 1000 -M imx8dxl_ddr0_read.all
1.000135500 40047 imx8_ddr0/axid-read,axi_mask=0xffff,axi_id=0x0000,axi_channel=0x0/ # 320376.0 imx8dxl_ddr0_read.all

Fixes: commit b8552bd2eafe ("tools: perf: metricgroup: add metricgroup for each PMU")
Acked-by: Fugang Duan
Signed-off-by: Joakim Zhang
08 Mar, 2020
1 commit
-
Merge Linux stable release v5.4.24 into imx_5.4.y
* tag 'v5.4.24': (3306 commits)
Linux 5.4.24
blktrace: Protect q->blk_trace with RCU
kvm: nVMX: VMWRITE checks unsupported field before read-only field
...

Signed-off-by: Jason Liu
Conflicts:
arch/arm/boot/dts/imx6sll-evk.dts
arch/arm/boot/dts/imx7ulp.dtsi
arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi
drivers/clk/imx/clk-composite-8m.c
drivers/gpio/gpio-mxc.c
drivers/irqchip/Kconfig
drivers/mmc/host/sdhci-of-esdhc.c
drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
drivers/net/can/flexcan.c
drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
drivers/net/ethernet/mscc/ocelot.c
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
drivers/net/phy/realtek.c
drivers/pci/controller/mobiveil/pcie-mobiveil-host.c
drivers/perf/fsl_imx8_ddr_perf.c
drivers/tee/optee/shm_pool.c
drivers/usb/cdns3/gadget.c
kernel/sched/cpufreq.c
net/core/xdp.c
sound/soc/fsl/fsl_esai.c
sound/soc/fsl/fsl_sai.c
sound/soc/sof/core.c
sound/soc/sof/imx/Kconfig
sound/soc/sof/loader.c
07 Mar, 2020
1 commit
-
libbfd has changed the bfd_section_* macros to inline functions
bfd_section_<field> since 2019-09-18. See the two commits below:
o http://www.sourceware.org/ml/gdb-cvs/2019-09/msg00064.html
o https://www.sourceware.org/ml/gdb-cvs/2019-09/msg00072.html

This fix makes perf able to build with both old and new libbfd.
Signed-off-by: Changbin Du
Acked-by: Jiri Olsa
Cc: Peter Zijlstra
Link: http://lore.kernel.org/lkml/20200128152938.31413-1-changbin.du@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo
05 Mar, 2020
4 commits
-
commit 604e2139a1026793b8c2172bd92c7e9d039a5cf0 upstream.
When we moved zalloc.o to the library we missed the gtk library, which
needs it compiled in; otherwise the missing __zfree symbol will cause
the library to fail to load.

Add the zalloc object to the gtk library build.
Fixes: 7f7c536f23e6 ("tools lib: Adopt zalloc()/zfree() from tools/perf")
Signed-off-by: Jiri Olsa
Cc: Alexander Shishkin
Cc: Jelle van der Waa
Cc: Michael Petlan
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lore.kernel.org/lkml/20200113104358.123511-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Greg Kroah-Hartman -
commit 3f7774033e6820d25beee5cf7aefa11d4968b951 upstream.
We need to set actions->ms.map since 599a2f38a989 ("perf hists browser:
Check sort keys before hot key actions"), as in that patch we bail out
if map is NULL.

Reviewed-by: Jiri Olsa
Cc: Adrian Hunter
Cc: Namhyung Kim
Fixes: 599a2f38a989 ("perf hists browser: Check sort keys before hot key actions")
Link: https://lkml.kernel.org/n/tip-wp1ssoewy6zihwwexqpohv0j@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Greg Kroah-Hartman -
commit b9167c8078c3527de6da241c8a1a75a9224ed90a upstream.
Commit 852c8cbf34d3 ("selftests/kselftest/runner.sh: Add 45 second
timeout per test") added a 45 second timeout for tests, and also added
a way for tests to customise the timeout via a settings file.

For example the ftrace tests take multiple minutes to run, so they
were given longer in commit b43e78f65b1d ("tracing/selftests: Turn off
timeout setting").

This works when the tests are run from the source tree. However if the
tests are installed with "make -C tools/testing/selftests install",
the settings files are not copied into the install directory. When the
tests are then run from the install directory the longer timeouts are
not applied and the tests time out incorrectly.

So add the settings files to TEST_FILES of the appropriate Makefiles
to cause the settings files to be installed using the existing install
logic.

Fixes: 852c8cbf34d3 ("selftests/kselftest/runner.sh: Add 45 second timeout per test")
Signed-off-by: Michael Ellerman
Signed-off-by: Shuah Khan
Signed-off-by: Greg Kroah-Hartman -
[ Upstream commit e404b8c7cfb31654c9024d497cec58a501501692 ]
After commit 27596472473a ("ipv6: fix ECMP route replacement") it is no
longer possible to replace an ECMP-able route by a non ECMP-able route.
For example,
ip route add 2001:db8::1/128 via fe80::1 dev dummy0
ip route replace 2001:db8::1/128 dev dummy0
does not work as expected.

Tweak the replacement logic so that point 3 in the log of the above commit
becomes:
3. If the new route is not ECMP-able, and no matching non-ECMP-able route
exists, replace matching ECMP-able route (if any) or add the new route.

We can now summarize the entire replace semantics to:
When doing a replace, prefer replacing a matching route of the same
"ECMP-able-ness" as the replace argument. If there is no such candidate,
fall back to the first route found.

Fixes: 27596472473a ("ipv6: fix ECMP route replacement")
Signed-off-by: Benjamin Poirier
Reviewed-by: Michal Kubecek
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman
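
The replace semantics summarized in the ECMP commit above can be sketched as a small selection function. This is an illustration only, not the kernel's fib6 code: struct route and pick_replace_candidate() are hypothetical names, and the sketch assumes the caller has already collected the routes that match the destination being replaced.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, minimal view of a route that matched the replace request. */
struct route {
	int id;
	bool ecmp_able;
};

/*
 * Return the index of the route to replace, or -1 if nothing matched
 * (in which case the caller would add the new route instead).
 * A match with the same "ECMP-able-ness" as the new route is preferred;
 * otherwise the first matching route is used.
 */
int pick_replace_candidate(const struct route *matches, size_t n,
			   bool new_is_ecmp_able)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (matches[i].ecmp_able == new_is_ecmp_able)
			return (int)i;
	}
	return n > 0 ? 0 : -1;
}
```

With only an ECMP-able match present, a non-ECMP-able replacement falls back to index 0, which corresponds to the "ip route replace ... dev dummy0" case in the commit message.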
29 Feb, 2020
1 commit
-
commit f2e97dc126b712c0d21219ed0c42710006c1cf52 upstream.
Fix the following build error. We could push a tcp.h header into one of
the include paths, but I think it's easy enough to simply pull in the
three defines we need here. If we end up using more of tcp.h at some
point we can pull it in later.

/home/john/git/bpf/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c: In function ‘connected_socket_v4’:
/home/john/git/bpf/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c:20:11: error: ‘TCP_REPAIR_ON’ undeclared (first use in this function)
repair = TCP_REPAIR_ON;
^
/home/john/git/bpf/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c:20:11: note: each undeclared identifier is reported only once for each function it appears in
/home/john/git/bpf/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c:29:11: error: ‘TCP_REPAIR_OFF_NO_WP’ undeclared (first use in this function)
repair = TCP_REPAIR_OFF_NO_WP;

Then with fix,
$ ./test_progs -n 44
#44/1 sockmap create_update_free:OK
#44/2 sockhash create_update_free:OK
#44 sockmap_basic:OK

Fixes: 5d3919a953c3c ("selftests/bpf: Test freeing sockmap/sockhash with a socket in it")
Signed-off-by: John Fastabend
Signed-off-by: Alexei Starovoitov
Reviewed-by: Jakub Sitnicki
Link: https://lore.kernel.org/bpf/158131347731.21414.12120493483848386652.stgit@john-Precision-5820-Tower
Signed-off-by: Greg Kroah-Hartman
24 Feb, 2020
10 commits
-
[ Upstream commit 414f50434aa2463202a5b35e844f4125dd1a7101 ]
Some newer cards supported by aacraid can take up to 40s to recover
after an EEH event. This causes spurious failures in the basic EEH
self-test since the current maximum timeout is only 30s.

Fix the immediate issue by bumping the timeout to a default of 60s,
and allow the wait time to be specified via an environment variable
(EEH_MAX_WAIT).

Reported-by: Steve Best
Suggested-by: Douglas Miller
Signed-off-by: Oliver O'Halloran
Signed-off-by: Michael Ellerman
Link: https://lore.kernel.org/r/20200122031125.25991-1-oohall@gmail.com
Signed-off-by: Sasha Levin -
[ Upstream commit 51bad0f05616c43d6d34b0a19bcc9bdab8e8fb39 ]
Currently, there are a lot of false positives if a single reuseport test
fails. This is because expected_results and the result map are not cleared.

Zero both after individual test runs, which fixes the mentioned false
positives.

Fixes: 91134d849a0e ("bpf: Test BPF_PROG_TYPE_SK_REUSEPORT")
Signed-off-by: Lorenz Bauer
Signed-off-by: Daniel Borkmann
Reviewed-by: Jakub Sitnicki
Acked-by: Martin KaFai Lau
Acked-by: John Fastabend
Link: https://lore.kernel.org/bpf/20200124112754.19664-5-lmb@cloudflare.com
Signed-off-by: Sasha Levin -
[ Upstream commit 8b7e20a7ba54836076ff35a28349dabea4cec48f ]
Add TEST opcode to Group3-2 reg=001b, the same as Group3-1 does.

Commit
12a78d43de76 ("x86/decoder: Add new TEST instruction pattern")
added a TEST opcode assignment to f6 XX/001/XXX (Group 3-1), but did
not add f7 XX/001/XXX (Group 3-2).

Actually, this TEST opcode variant (ModRM.reg /1) is not described in
the Intel SDM Vol2 but in AMD64 Architecture Programmer's Manual Vol.3,
Appendix A.2 Table A-6. ModRM.reg Extensions for the Primary Opcode Map.

Without this fix, Randy found a warning by insn_decoder_test related
to this issue as below.

HOSTCC arch/x86/tools/insn_decoder_test
HOSTCC arch/x86/tools/insn_sanity
TEST posttest
arch/x86/tools/insn_decoder_test: warning: Found an x86 instruction decoder bug, please report this.
arch/x86/tools/insn_decoder_test: warning: ffffffff81000bf1: f7 0b 00 01 08 00 testl $0x80100,(%rbx)
arch/x86/tools/insn_decoder_test: warning: objdump says 6 bytes, but insn_get_length() says 2
arch/x86/tools/insn_decoder_test: warning: Decoded and checked 11913894 instructions with 1 failures
TEST posttest
arch/x86/tools/insn_sanity: Success: decoded and checked 1000000 random instructions with 0 errors (seed:0x871ce29c)

To fix this error, add the TEST opcode according to AMD64 APM Vol.3.
[ bp: Massage commit message. ]
Reported-by: Randy Dunlap
Signed-off-by: Masami Hiramatsu
Signed-off-by: Borislav Petkov
Acked-by: Randy Dunlap
Tested-by: Randy Dunlap
Link: https://lkml.kernel.org/r/157966631413.9580.10311036595431878351.stgit@devnote2
Signed-off-by: Sasha Levin -
[ Upstream commit 8580bed7e751e6d4f17881e059daf3cb37ba4717 ]
Building objtool with ARCH=x86_64 fails with:
$make ARCH=x86_64 -C tools/objtool
...
CC arch/x86/decode.o
arch/x86/decode.c:10:22: fatal error: asm/insn.h: No such file or directory
#include <asm/insn.h>
^
compilation terminated.
mv: cannot stat ‘arch/x86/.decode.o.tmp’: No such file or directory
make[2]: *** [arch/x86/decode.o] Error 1
...

The root cause is that the command-line variable 'ARCH' cannot be
overridden. It can be replaced by 'SRCARCH', which is defined in
'tools/scripts/Makefile.arch'.

Signed-off-by: Shile Zhang
Signed-off-by: Josh Poimboeuf
Signed-off-by: Ingo Molnar
Reviewed-by: Kamalesh Babulal
Link: https://lore.kernel.org/r/d5d11370ae116df6c653493acd300ec3d7f5e925.1579543924.git.jpoimboe@redhat.com
Signed-off-by: Sasha Levin -
[ Upstream commit 585c91f40d201bc564d4e76b83c05b3b5363fe7e ]
Fix unsafe unaligned pointer usage in usbip network interfaces. usbip tool
build fails with new gcc -Werror=address-of-packed-member checks.

usbip_network.c: In function ‘usbip_net_pack_usb_device’:
usbip_network.c:79:32: error: taking address of packed member of ‘struct usbip_usb_device’ may result in an unaligned pointer value [-Werror=address-of-packed-member]
79 | usbip_net_pack_uint32_t(pack, &udev->busnum);

Fix with minor changes to pass by value instead of by address.
Signed-off-by: Shuah Khan
Link: https://lore.kernel.org/r/20200109012416.2875-1-skhan@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman
Signed-off-by: Sasha Levin -
[ Upstream commit 6794200fa3c9c3e6759dae099145f23e4310f4f7 ]
GCC9 introduced string hardening mechanisms, which produce the following
error during fs api compilation:

error: '__builtin_strncpy' specified bound 4096 equals destination size
[-Werror=stringop-truncation]

This occurs when the copy length passed to strncpy is equal to the
destination size, which can leave the destination without a terminating
NUL and so potentially lead to a buffer overflow.

Mitigate this potential issue by limiting the copy length to the
destination size minus 1 and explicitly terminating the destination
with NUL.

Signed-off-by: Andrey Zhizhikin
Reviewed-by: Petr Mladek
Acked-by: Jiri Olsa
Cc: Alexei Starovoitov
Cc: Andrii Nakryiko
Cc: Daniel Borkmann
Cc: Kefeng Wang
Cc: Martin KaFai Lau
Cc: Petr Mladek
Cc: Sergey Senozhatsky
Cc: Song Liu
Cc: Yonghong Song
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Link: http://lore.kernel.org/lkml/20191211080109.18765-1-andrey.zhizhikin@leica-geosystems.com
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Sasha Levin -
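
For the strncpy() hardening change above (upstream commit 6794200fa3c9), the pattern it describes can be sketched as follows. This is a minimal illustration, not the patched fs api code: copy_string() is a hypothetical helper, and it assumes the caller passes the full size of the destination buffer.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy src into dst, never writing more than
 * dst_size bytes and always leaving dst NUL-terminated.
 */
void copy_string(char *dst, size_t dst_size, const char *src)
{
	if (dst_size == 0)
		return;

	/* Bound the copy to one byte less than the destination size... */
	strncpy(dst, src, dst_size - 1);
	/* ...and terminate explicitly, since strncpy may not. */
	dst[dst_size - 1] = '\0';
}
```

Copying "abcdef" into a 4-byte buffer this way yields "abc"; a bound equal to the full buffer size would have left the buffer unterminated.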
[ Upstream commit 1162f844030ac1ac7321b5e8f6c9badc7a11428f ]
Currently, when bpftool cgroup show has an error, no error
message is printed. This is confusing because the user may think the
result is empty.

Before the change:
$ bpftool cgroup show /sys/fs/cgroup
ID AttachType AttachFlags Name
$ echo $?
255

After the change:
$ ./bpftool cgroup show /sys/fs/cgroup
Error: can't query bpf programs attached to /sys/fs/cgroup: Operation
not permitted

v2: Rename check_query_cgroup_progs to cgroup_has_attached_progs
Signed-off-by: Hechao Li
Signed-off-by: Daniel Borkmann
Link: https://lore.kernel.org/bpf/20191224011742.3714301-1-hechaol@fb.com
Signed-off-by: Sasha Levin -
[ Upstream commit ea6a547669b37453f2b1a5d85188d75b3613dfaa ]
The SO_TXTIME test depends on accurate timers. In some virtualized
environments the test has been reported to be flaky. This is easily
reproduced by disabling kvm acceleration in Qemu.

Allow greater variance in a run and retry to further reduce flakiness.
Observed errors are one of two kinds: either the packet arrives too
early or late at recv(), or it was dropped in the qdisc itself and the
recv() call times out.

In the latter case, the qdisc queues a notification to the error
queue of the send socket. Also explicitly report this cause.

Link: https://lore.kernel.org/netdev/CA+FuTSdYOnJCsGuj43xwV1jxvYsaoa_LzHQF9qMyhrkLrivxKw@mail.gmail.com
Reported-by: Naresh Kamboju
Signed-off-by: Willem de Bruijn
Signed-off-by: Jakub Kicinski
Signed-off-by: Sasha Levin -
[ Upstream commit ac87813d4372f4c005264acbe3b7f00c1dee37c4 ]
Commit 852c8cbf34d3 ("selftests/kselftest/runner.sh: Add 45 second
timeout per test") adds support for a new per-test-directory "settings"
file. But this only works for tests not in sub-subdirectories, e.g.

- tools/testing/selftests/rtc (rtc) is OK,
- tools/testing/selftests/net/mptcp (net/mptcp) is not.

We have to increase the timeout for net/mptcp tests which are not
upstreamed yet but this fix is valid for other tests if they need to add
a "settings" file, see the full list with:

tools/testing/selftests/*/*/**/Makefile

Note that this patch changes the text header message printed at the end
of the execution but this text is modified only for the tests that are
in sub-subdirectories, e.g.

ok 1 selftests: net/mptcp: mptcp_connect.sh

Before we had:
ok 1 selftests: mptcp: mptcp_connect.sh
But showing the full target name is probably better, just in case a
subsubdir has the same name as another one in another subdirectory.

Fixes: 852c8cbf34d3 (selftests/kselftest/runner.sh: Add 45 second timeout per test)
Signed-off-by: Matthieu Baerts
Reviewed-by: Kees Cook
Signed-off-by: Shuah Khan
Signed-off-by: Sasha Levin -
[ Upstream commit 6b64a650f0b2ae3940698f401732988699eecf7a ]
It was observed[1] on arm64 that __builtin_strlen led to an infinite
loop in the get_size selftest. This is because __builtin_strlen (and
other builtins) may sometimes result in a call to the C library
function. The C library implementation of strlen uses an IFUNC
resolver to load the most efficient strlen implementation for the
underlying machine and hence has a PLT indirection even for static
binaries. Because this binary avoids the C library startup routines,
the PLT initialization never happens and hence the program gets stuck
in an infinite loop.

On x86_64 the __builtin_strlen just happens to expand inline and avoid
the call but that is not always guaranteed.

Further, while testing on x86_64 (Fedora 31), it was observed that the
test also failed with a segfault inside write() because the generated
code for the write function in glibc seems to access TLS before the
syscall (probably due to the cancellation point check) and fails
because TLS is not initialised.

To mitigate these problems, this patch reduces the interface with the
C library to just the syscall function. The syscall function still
sets errno on failure, which is undesirable but for now it only
affects cases where syscalls fail.

[1] https://bugs.linaro.org/show_bug.cgi?id=5479
Signed-off-by: Siddhesh Poyarekar
Reported-by: Masami Hiramatsu
Tested-by: Masami Hiramatsu
Reviewed-by: Tim Bird
Signed-off-by: Shuah Khan
Signed-off-by: Sasha Levin
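
The approach in the get_size commit above, talking to the C library only through syscall(), can be sketched like this. It is an illustration under assumptions rather than the selftest's code: local_strlen() and write_str() are hypothetical names, the sketch is Linux-specific, and unlike the selftest it is meant to be built as an ordinary program with the normal C startup code.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical replacement for strlen(): a plain loop, so the compiler
 * has no reason to route the call through an IFUNC-resolved libc symbol.
 */
size_t local_strlen(const char *s)
{
	size_t n = 0;

	while (s[n] != '\0')
		n++;
	return n;
}

/*
 * Hypothetical replacement for write(): go through syscall() directly
 * instead of the libc write() wrapper.
 */
long write_str(int fd, const char *s)
{
	return syscall(SYS_write, fd, s, local_strlen(s));
}
```

As the commit message notes, syscall() still sets errno on failure, so this narrows the dependency on the C library rather than removing it.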
20 Feb, 2020
1 commit
-
commit 80cc7bb6c104d733bff60ddda09f19139c61507c upstream.
For data collected on machines with front end stalled cycles supported,
such as found on modern AMD CPU families, commit 146540fb545b ("perf
stat: Always separate stalled cycles per insn") introduces a new line in
CSV output with a leading comma that upsets some automated scripts.
Scripts have to use "-e ex_ret_instr" to work around this issue, after
upgrading to a version of perf with that commit.

We could add "if (have_frontend_stalled && !config->csv_sep)" to the not
(total && avg) else clause, to emphasize that CSV users are usually
scripts, and are written to do only what is needed, i.e., they wouldn't
typically invoke "perf stat" without specifying an explicit event list.

But - let alone CSV output - why should users now tolerate a constant
0-reporting extra line in regular terminal output?:

BEFORE:
$ sudo perf stat --all-cpus -einstructions,cycles -- sleep 1
Performance counter stats for 'system wide':
181,110,981 instructions # 0.58 insn per cycle
# 0.00 stalled cycles per insn
309,876,469 cycles

1.002202582 seconds time elapsed

The user would not like to see the now permanent:
"0.00 stalled cycles per insn"
line fixture, as it gives no useful information.
So this patch removes the printing of the zeroed stalled cycles line
altogether, almost reverting the very original commit fb4605ba47e7
("perf stat: Check for frontend stalled for metrics"), which seems like
it was written to normalize --metric-only column output of common Intel
machines at the time: modern Intel machines have ceased to support the
genericised frontend stalled metrics AFAICT.

AFTER:
$ sudo perf stat --all-cpus -einstructions,cycles -- sleep 1
Performance counter stats for 'system wide':
244,071,432 instructions # 0.69 insn per cycle
355,353,490 cycles

1.001862516 seconds time elapsed

Output behaviour when stalled cycles is indeed measured is not affected
(BEFORE == AFTER):

$ sudo perf stat --all-cpus -einstructions,cycles,stalled-cycles-frontend -- sleep 1
Performance counter stats for 'system wide':
247,227,799 instructions # 0.63 insn per cycle
# 0.26 stalled cycles per insn
394,745,636 cycles
63,194,485 stalled-cycles-frontend # 16.01% frontend cycles idle

1.002079770 seconds time elapsed

Fixes: 146540fb545b ("perf stat: Always separate stalled cycles per insn")
Signed-off-by: Kim Phillips
Acked-by: Andi Kleen
Acked-by: Jiri Olsa
Acked-by: Song Liu
Cc: Alexander Shishkin
Cc: Cong Wang
Cc: Davidlohr Bueso
Cc: Jin Yao
Cc: Kan Liang
Cc: Mark Rutland
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lore.kernel.org/lkml/20200207230613.26709-1-kim.phillips@amd.com
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Greg Kroah-Hartman
15 Feb, 2020
3 commits
-
commit 1985f8c7f9a42a651a9750d6fcadc74336d182df upstream.
If we compile tools/acpi target in the top source directory, we'd get a
compilation error as shown below:

# make tools/acpi
DESCEND power/acpi
DESCEND tools/acpidbg
CC tools/acpidbg/acpidbg.o
Assembler messages:
Fatal error: can't create /home/lzy/kernel-upstream/power/acpi/\
tools/acpidbg/acpidbg.o: No such file or directory
../../Makefile.rules:26: recipe for target '/home/lzy/kernel-upstream/\
power/acpi/tools/acpidbg/acpidbg.o' failed
make[3]: *** [/home/lzy/kernel-upstream//power/acpi/tools/acpidbg/\
acpidbg.o] Error 1
Makefile:19: recipe for target 'acpidbg' failed
make[2]: *** [acpidbg] Error 2
Makefile:54: recipe for target 'acpi' failed
make[1]: *** [acpi] Error 2
Makefile:1607: recipe for target 'tools/acpi' failed
make: *** [tools/acpi] Error 2

Fixes: d5a4b1a540b8 ("tools/power/acpi: Remove direct kernel source include reference")
Signed-off-by: Zhengyuan Liu
Signed-off-by: Rafael J. Wysocki
Signed-off-by: Greg Kroah-Hartman -
commit 5d3919a953c3c96c02fc7a337f8376cde43ae31f upstream.
Commit 7e81a3530206 ("bpf: Sockmap, ensure sock lock held during tear
down") introduced sleeping issues inside RCU critical sections and while
holding a spinlock on sockmap/sockhash tear-down. There has to be at least
one socket in the map for the problem to surface.

This adds a test that triggers the warnings for broken locking rules. Not a
fix per se, but rather tooling to verify the accompanying fixes. Run on a
VM with 1 vCPU to reproduce the warnings.

Fixes: 7e81a3530206 ("bpf: Sockmap, ensure sock lock held during tear down")
Signed-off-by: Jakub Sitnicki
Signed-off-by: Daniel Borkmann
Acked-by: John Fastabend
Link: https://lore.kernel.org/bpf/20200206111652.694507-4-jakub@cloudflare.com
Signed-off-by: Greg Kroah-Hartman -
commit d95f1e8b462c4372ac409886070bb8719d8a4d3a upstream.
Turns out the xlated program instructions can also be missing if
kptr_restrict sysctl is set. This means that the previous fix to check the
jited_prog_insns pointer was insufficient; add another check of the
xlated_prog_insns pointer as well.

Fixes: 5b79bcdf0362 ("bpftool: Don't crash on missing jited insns or ksyms")
Fixes: cae73f233923 ("bpftool: use bpf_program__get_prog_info_linear() in prog.c:do_dump()")
Signed-off-by: Toke Høiland-Jørgensen
Signed-off-by: Daniel Borkmann
Reviewed-by: Quentin Monnet
Link: https://lore.kernel.org/bpf/20200206102906.112551-1-toke@redhat.com
Signed-off-by: Greg Kroah-Hartman
12 Feb, 2020
6 commits
-
Add SocName in DDR JSON file, so that metric/metricgroup can filter by this
property.

Reviewed-by: Fugang Duan
Signed-off-by: Joakim Zhang -
All metrics under one CPUID are loaded by the perf tool when the CPUID
of the SoC matches. So users could see other platforms' metrics on one
platform, which is very confusing. We can match metric/metricgroup with
SOCNAME if needed.

Reviewed-by: Fugang Duan
Signed-off-by: Joakim Zhang -
Add support for checking socname for ARCH arm64.
Reviewed-by: Fugang Duan
Signed-off-by: Joakim Zhang -
Add socname to struct pmu_event so that we can distinguish different SoCs
by this property.

Reviewed-by: Fugang Duan
Signed-off-by: Joakim Zhang -
[ Upstream commit eb573e746b9d4f0921dcb2449be3df41dae3caea ]
Commit f01642e4912b ("perf metricgroup: Support multiple events for
metricgroup") introduced support for multiple events in a metric group.
But with the current upstream, metric event names are not printed
properly.

In power9 platform:

command:# ./perf stat --metric-only -M translation -C 0 -I 1000 sleep 2
1.000208486
2.000368863
2.001400558

Similarly in skylake platform:

command:./perf stat --metric-only -M Power -I 1000
1.000579994
2.002189493

With the current upstream version, the issue is with the event name
comparison logic in find_evsel_group(). The current logic compares the
events belonging to a metric group to the events in perf_evlist. Since
the break statement is missing in the loop used for comparison between
metric group and perf_evlist events, the loop continues to execute even
after getting a pattern match, and ends up discarding the matches.

When a single metric event belongs to the metric group, it works fine,
because in the single-event case, once it has compared all events, it
reaches the end of perf_evlist.

Example for single metric event in power9 platform:

command:# ./perf stat --metric-only -M branches_per_inst -I 1000 sleep 1
1.000094653 0.2
1.001337059 0.0

This patch fixes the issue by making sure that once all events belonging
to that metric event are matched in find_evsel_group(), we successfully
break from that loop by adding the corresponding condition.

With this patch:

In power9 platform:

command:# ./perf stat --metric-only -M translation -C 0 -I 1000 sleep 2
result:#
time derat_4k_miss_rate_percent derat_4k_miss_ratio derat_miss_ratio derat_64k_miss_rate_percent derat_64k_miss_ratio dslb_miss_rate_percent islb_miss_rate_percent
1.000135672 0.0 0.3 1.0 0.0 0.2 0.0 0.0
2.000380617 0.0 0.0 0.0 0.0 0.0 0.0 0.0

Similarly in skylake platform:

command:# ./perf stat --metric-only -M Power -I 1000
result:#
time Turbo_Utilization C3_Core_Residency C6_Core_Residency C7_Core_Residency C2_Pkg_Residency C3_Pkg_Residency C6_Pkg_Residency C7_Pkg_Residency
1.000563580 0.3 0.0 2.6 44.2 21.9 0.0 0.0 0.0
2.002235027 0.4 0.0 2.7 43.0 20.7 0.0 0.0 0.0

Committer testing:
Before:
[root@seventh ~]# perf stat --metric-only -M Power -I 1000
# time
1.000383223
2.001168182
3.001968545
4.002741200
5.003442022
^C 5.777687244
[root@seventh ~]#
After the patch:
[root@seventh ~]# perf stat --metric-only -M Power -I 1000
# time Turbo_Utilization C3_Core_Residency C6_Core_Residency C7_Core_Residency C2_Pkg_Residency C3_Pkg_Residency C6_Pkg_Residency C7_Pkg_Residency
1.000406577 0.4 0.1 1.4 97.0 0.0 0.0 0.0 0.0
2.001481572 0.3 0.0 0.6 97.9 0.0 0.0 0.0 0.0
3.002332585 0.2 0.0 1.0 97.5 0.0 0.0 0.0 0.0
4.003196624 0.2 0.0 0.3 98.6 0.0 0.0 0.0 0.0
5.004063851 0.3 0.0 0.7 97.7 0.0 0.0 0.0 0.0
^C 5.471260276 0.2 0.0 0.5 49.3 0.0 0.0 0.0 0.0
[root@seventh ~]#
[root@seventh ~]# dmesg | grep -i skylake
[ 0.187807] Performance Events: PEBS fmt3+, Skylake events, 32-deep LBR, full-width counters, Intel PMU driver.
[root@seventh ~]#

Fixes: f01642e4912b ("perf metricgroup: Support multiple events for metricgroup")
Signed-off-by: Kajol Jain
Reviewed-by: Ravi Bangoria
Tested-by: Arnaldo Carvalho de Melo
Cc: Alexander Shishkin
Cc: Andi Kleen
Cc: Anju T Sudhakar
Cc: Jin Yao
Cc: Jiri Olsa
Cc: Kan Liang
Cc: Madhavan Srinivasan
Cc: Peter Zijlstra
Link: http://lore.kernel.org/lkml/20191120084059.24458-1-kjain@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Sasha Levin

Joakim: cherry-picked from upstream:
3635b27cc058 perf metricgroup: Fix printing event names of metric group with multiple events

Signed-off-by: Joakim Zhang
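
The missing-break problem described in the metricgroup commit above can be sketched with a simplified matcher. This is not find_evsel_group() itself: find_group() and its stop_on_match flag are hypothetical, events are reduced to plain names, and the sketch assumes a group must appear as a consecutive run in the event list.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: look for ids[0..idnum) as a consecutive run in
 * evlist[0..nr). Returns the index where the run starts, or -1.
 * stop_on_match = 1 models the fixed loop (break on a complete match);
 * stop_on_match = 0 models the loop that keeps scanning.
 */
long find_group(const char *const *evlist, size_t nr,
		const char *const *ids, size_t idnum, int stop_on_match)
{
	size_t matched = 0; /* how many ids have matched so far */
	long start = -1;
	size_t n;

	if (idnum == 0)
		return -1;

	for (n = 0; n < nr; n++) {
		if (matched == idnum) {
			/* Only reached without the break: the match is dropped. */
			matched = 0;
			start = -1;
		}
		if (strcmp(evlist[n], ids[matched]) == 0) {
			if (matched == 0)
				start = (long)n;
			matched++;
			if (matched == idnum && stop_on_match)
				break;
		} else {
			/* Discard the partial match and retry from this event. */
			matched = 0;
			start = -1;
			if (strcmp(evlist[n], ids[0]) == 0) {
				start = (long)n;
				matched = 1;
			}
		}
	}
	return matched == idnum ? start : -1;
}
```

In this model a two-event group in the middle of the list is found only when the loop breaks on a full match, while a single-event group at the end of the list is found either way, in line with the behaviour the commit message reports.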
-
Currently when cross compiling perf tool for ARM64 on my x86 machine I
get this error:

arch/arm64/util/sym-handling.c:9:10: fatal error: gelf.h: No such file or directory
#include <gelf.h>

For the build, libelf is reported off:
Auto-detecting system features:
...
... libelf: [ OFF ]

Indeed, test-libelf is not built successfully:
more ./build/feature/test-libelf.make.output
test-libelf.c:2:10: fatal error: libelf.h: No such file or directory
#include <libelf.h>
^~~~~~~~~~
compilation terminated.

I have no such problems natively compiling on ARM64, and I did not
previously have this issue for cross compiling. Fix by relocating the
gelf.h include.

Signed-off-by: John Garry
Cc: Alexander Shishkin
Cc: Jiri Olsa
Cc: Mark Rutland
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Will Deacon
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/1573045254-39833-1-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo

Joakim: cherry-picked from upstream:
1302caaef52a perf tools: Fix cross compile for ARM64

Signed-off-by: Joakim Zhang
11 Feb, 2020
7 commits
-
commit 5fcf3a55a62afb0760ccb6f391d62f20bce4a42f upstream.
The filter name is fixed to "exit_reason" for some kvm_exit events, no
matter what architecture we have. Actually, the filter name
("exit_reason") is only applicable to x86, meaning it's broken on other
architectures including aarch64.

This fixes the issue by providing various kvm_exit filter names, depending
on the architecture we're on. Afterwards, the variable filter name is
picked and applied through ioctl(fd, SET_FILTER).

Reported-by: Andrew Jones
Signed-off-by: Gavin Shan
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini
Signed-off-by: Greg Kroah-Hartman -
commit 8bec4f665e0baecb5f1b683379fc10b3745eb612 upstream.
The reuseport tests currently suffer from a race condition: FIN
packets count towards DROP_ERR_SKB_DATA, since they don't contain
a valid struct cmd. Tests will spuriously fail depending on whether
check_results is called before or after the FIN is processed.

Exit the BPF program early if FIN is set.
Fixes: 91134d849a0e ("bpf: Test BPF_PROG_TYPE_SK_REUSEPORT")
Signed-off-by: Lorenz Bauer
Signed-off-by: Daniel Borkmann
Reviewed-by: Jakub Sitnicki
Acked-by: Martin KaFai Lau
Acked-by: John Fastabend
Link: https://lore.kernel.org/bpf/20200124112754.19664-3-lmb@cloudflare.com
Signed-off-by: Greg Kroah-Hartman -
commit c31dbb1e41d1857b403f9bf58c87f5898519a0bc upstream.
Use a proper temporary file for sendpage tests. This means that running
the tests doesn't clutter the working directory, and allows running the
test on read-only filesystems.

Fixes: 16962b2404ac ("bpf: sockmap, add selftests")
Signed-off-by: Lorenz Bauer
Signed-off-by: Daniel Borkmann
Reviewed-by: Jakub Sitnicki
Acked-by: Martin KaFai Lau
Acked-by: John Fastabend
Link: https://lore.kernel.org/bpf/20200124112754.19664-2-lmb@cloudflare.com
Signed-off-by: Greg Kroah-Hartman -
commit f1c3656c6d9c147d07d16614455aceb34932bdeb upstream.
As with commit 4e59afbbed96 ("selftests/bpf: skip nmi test when perf
hw events are disabled"), it would make more sense to skip the
test_stacktrace_build_id_nmi test if the setup (e.g. virtual machines) has
disabled hardware perf events.

Fixes: 13790d1cc72c ("bpf: add selftest for stackmap with build_id in NMI context")
Signed-off-by: Hangbin Liu
Signed-off-by: Daniel Borkmann
Acked-by: John Fastabend
Link: https://lore.kernel.org/bpf/20200117100656.10359-1-liuhangbin@gmail.com
Signed-off-by: Greg Kroah-Hartman -
commit 580205dd4fe800b1e95be8b6df9e2991f975a8ad upstream.
Fix two issues in test_attach_probe:
1. it was not able to parse /proc/self/maps beyond the first line,
since %s means parse string until white space.
2. offset has to be accounted for, otherwise the uprobed address is incorrect.

Fixes: 1e8611bbdfc9 ("selftests/bpf: add kprobe/uprobe selftests")
Signed-off-by: Alexei Starovoitov
Signed-off-by: Daniel Borkmann
Acked-by: Yonghong Song
Acked-by: Andrii Nakryiko
Link: https://lore.kernel.org/bpf/20191219020442.1922617-1-ast@kernel.org
Signed-off-by: Greg Kroah-Hartman -
commit 7145fcfffef1fad4266aaf5ca96727696916edb7 upstream.
When the following command is run on a fresh clone of the kernel tree,
[root@f31 tc-testing]# ./tdc.py -c bpf
test cases that need to build the eBPF sample program fail systematically,
because 'buildebpfPlugin' is unable to install the kernel headers (i.e, the
'khdr' target fails). Pass the correct environment to 'make', in place of
ENVIR, to allow running these tests.

Fixes: 4c2d39bd40c1 ("tc-testing: use a plugin to build eBPF program")
Signed-off-by: Davide Caratti
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman -
commit 35b9211c0a2427e8f39e534f442f43804fc8d5ca upstream.
Fix a bug requesting an invalid size for the reallocated array when
constructing the CO-RE relocation candidate list. This can cause problems
if there are many potential candidates and very fine-grained memory
allocator bucket sizes are used.

Fixes: ddc7c3042614 ("libbpf: implement BPF CO-RE offset relocation algorithm")
Reported-by: William Smith
Signed-off-by: Andrii Nakryiko
Signed-off-by: Daniel Borkmann
Acked-by: Yonghong Song
Link: https://lore.kernel.org/bpf/20200124201847.212528-1-andriin@fb.com
Signed-off-by: Greg Kroah-Hartman
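
The libbpf commit above only says the reallocated array was requested with an invalid size. One common way that happens is passing an element count to realloc() where a byte count is expected; the sketch below assumes that failure mode and shows the correctly sized call. struct id_list and id_list_push() are hypothetical names, not libbpf's.

```c
#include <stdlib.h>

/* Hypothetical growable list of candidate ids. */
struct id_list {
	unsigned int *data;
	size_t len;
};

/*
 * Append one id, growing the array by one element.
 * Returns 0 on success, -1 if the allocation fails (the list is left
 * unchanged in that case).
 */
int id_list_push(struct id_list *list, unsigned int id)
{
	/*
	 * The size is given in bytes: element count times element size.
	 * Passing just (list->len + 1) would under-allocate.
	 */
	unsigned int *tmp = realloc(list->data,
				    (list->len + 1) * sizeof(*list->data));

	if (!tmp)
		return -1;

	list->data = tmp;
	list->data[list->len++] = id;
	return 0;
}
```

An under-sized request may go unnoticed while the allocator rounds sizes up generously, which fits the commit's remark that problems appear with many candidates and fine-grained allocator buckets.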