15 Oct, 2020
8 commits
-
The metrics "LLC Ld Miss" and "Load Dram" overlap with each other for
accouting items:"LLC Ld Miss" = "lcl_dram" + "rmt_dram" + "rmt_hit" + "rmt_hitm"
"Load Dram" = "lcl_dram" + "rmt_dram"Furthermore, the metrics "LLC Ld Miss" is not directive to show
statistics due to it contains summary value and cannot give out
breakdown details.For this reason, add a new metrics "RMT Load Hit" which is used to
present the remote cache hit; it contains two items:"RMT Load Hit" = remote hit ("rmt_hit") + remote hitm ("rmt_hitm")
As result, the metrics "LLC Ld Miss" is perfectly divided into two
metrics "RMT Load Hit" and "Load Dram". It's not necessary to keep
metrics "LLC Ld Miss", so remove it.Before:
# ----------- Cacheline ---------- Tot ------- Load Hitm ------- Total Total Total ---- Stores ---- ----- Core Load Hit ----- - LLC Load Hit -- LLC --- Load Dram ----
# Index Address Node PA cnt Hitm Total LclHitm RmtHitm records Loads Stores L1Hit L1Miss FB L1 L2 LclHit LclHitm Ld Miss Lcl Rmt
# ..... .................. .... ...... ....... ....... ....... ....... ....... ....... ....... ....... ....... ....... ....... ....... ........ ....... ....... ........ ........
#
0 0x55f07d580100 0 1499 85.89% 481 481 0 7243 3879 3364 2599 765 548 2615 66 169 481 0 0 0
1 0x55f07d580080 0 1 13.93% 78 78 0 664 664 0 0 0 187 361 27 11 78 0 0 0
2 0x55f07d5800c0 0 1 0.18% 1 1 0 405 405 0 0 0 131 0 10 263 1 0 0 0After:
# ----------- Cacheline ---------- Tot ------- Load Hitm ------- Total Total Total ---- Stores ---- ----- Core Load Hit ----- - LLC Load Hit -- - RMT Load Hit -- --- Load Dram ----
# Index Address Node PA cnt Hitm Total LclHitm RmtHitm records Loads Stores L1Hit L1Miss FB L1 L2 LclHit LclHitm RmtHit RmtHitm Lcl Rmt
# ..... .................. .... ...... ....... ....... ....... ....... ....... ....... ....... ....... ....... ....... ....... ....... ........ ....... ........ ....... ........ ........
#
0 0x55f07d580100 0 1499 85.89% 481 481 0 7243 3879 3364 2599 765 548 2615 66 169 481 0 0 0 0
1 0x55f07d580080 0 1 13.93% 78 78 0 664 664 0 0 0 187 361 27 11 78 0 0 0 0
2 0x55f07d5800c0 0 1 0.18% 1 1 0 405 405 0 0 0 131 0 10 263 1 0 0 0 0Signed-off-by: Leo Yan
Tested-by: Joe Mario
Acked-by: Jiri Olsa
Signed-off-by: Arnaldo Carvalho de Melo
Link: https://lore.kernel.org/r/20201014050921.5591-9-leo.yan@linaro.org -
"rmt_hit" is accounted into two metrics: one is accounted into the
metrics "LLC Ld Miss" (see the function llc_miss() for calculation
"llcmiss"); and it's accounted into metrics "LLC Load Hit". Thus,
for the literal meaning, it is contradictory that "rmt_hit" is
accounted for both "LLC Ld Miss" (LLC miss) and "LLC Load Hit"
(LLC hit).Thus this is easily to introduce confusion: "LLC Load Hit" gives
impression that all items belong to it are LLC hit; in fact "rmt_hit"
is LLC miss and remote cache hit.To give out clear semantics for metric "LLC Load Hit", "rmt_hit" is
moved out from it and changes "LLC Load Hit" to contain two items:LLC Load Hit = LLC's hit ("ld_llchit") + LLC's hitm ("lcl_hitm")
For output alignment, adjusts the header for "LLC Load Hit".
Signed-off-by: Leo Yan
Tested-by: Joe Mario
Acked-by: Jiri Olsa
Link: https://lore.kernel.org/r/20201014050921.5591-8-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo -
Replace the header string "Lcl" with "LclHit", which is more explicit
to express the event type is LLC local hit.Signed-off-by: Leo Yan
Tested-by: Joe Mario
Acked-by: Jiri Olsa
Link: https://lore.kernel.org/r/20201014050921.5591-7-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo -
Local and remote HITM use the headers 'Lcl' and 'Rmt' respectively,
suppose if we want to extend the tool to display these two dimensions
under any one metrics, users cannot understand the semantics if only
based on the header string 'Lcl' or 'Rmt'.To explicit express the meaning for HITM items, this patch changes the
headers string as "LclHitm" and "RmtHitm", the strings are more readable
and this allows to extend metrics for using HITM items.Signed-off-by: Leo Yan
Tested-by: Joe Mario
Acked-by: Jiri Olsa
Link: https://lore.kernel.org/r/20201014050921.5591-6-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo -
The metrics "LLC Load Hitm" contains two items: one is "local Hitm" and
another is "remote Hitm"."local Hitm" means: L3 HIT and was serviced by another processor core
with a cross core snoop where modified copies were found; it's no doubt
that "local Hitm" belongs to LLC access.But for "remote Hitm", based on the code in util/mem-events, it's the
event for remote cache HIT and was serviced by another processor core
with modified copies. Thus the remote Hitm is a remote cache's hit and
actually it's LLC load miss.Now the display format gives users the impression that "local Hitm" and
"remote Hitm" both belong to the LLC load, but this is not the fact as
described.This patch changes the header from "LLC Load Hitm" to "Load Hitm", this
can avoid the give the wrong impression that all Hitm belong to LLC.Signed-off-by: Leo Yan
Tested-by: Joe Mario
Acked-by: Jiri Olsa
Link: https://lore.kernel.org/r/20201014050921.5591-5-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo -
The metrics are not organized based on memory hierarchy, e.g. the tool
doesn't organize the metrics order based on memory nodes from the close
node (e.g. L1/L2 cache) to far node (e.g. L3 cache and DRAM).To output metrics with more friendly form, this patch refines the
metrics order based on memory hierarchy:"Core Load Hit" => "LLC Load Hit" => "LLC Ld Miss" => "Load Dram"
Signed-off-by: Leo Yan
Tested-by: Joe Mario
Acked-by: Jiri Olsa
Link: https://lore.kernel.org/r/20201014050921.5591-4-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo -
The total stores is displayed under the metrics "Store Reference", to
output the same format with total records and all loads, extract the
total stores number as a standalone metrics "Total Stores".After this patch, the tool shows the summary numbers ("Total records",
"Total loads", "Total Stores") in the unified form.Before:
# ----------- Cacheline ---------- Tot ----- LLC Load Hitm ----- Total Total ---- Store Reference ---- --- Load Dram ---- LLC ----- Core Load Hit ----- -- LLC Load Hit --
# Index Address Node PA cnt Hitm Total Lcl Rmt records Loads Total L1Hit L1Miss Lcl Rmt Ld Miss FB L1 L2 Llc Rmt
# ..... .................. .... ...... ....... ....... ....... ....... ....... ....... ....... ....... ....... ........ ........ ....... ....... ....... ....... ........ ........
#
0 0x55f07d580100 0 1499 85.89% 481 481 0 7243 3879 3364 2599 765 0 0 0 548 2615 66 169 0
1 0x55f07d580080 0 1 13.93% 78 78 0 664 664 0 0 0 0 0 0 187 361 27 11 0
2 0x55f07d5800c0 0 1 0.18% 1 1 0 405 405 0 0 0 0 0 0 131 0 10 263 0After:
# ----------- Cacheline ---------- Tot ----- LLC Load Hitm ----- Total Total Total ---- Stores ---- --- Load Dram ---- LLC ----- Core Load Hit ----- -- LLC Load Hit --
# Index Address Node PA cnt Hitm Total Lcl Rmt records Loads Stores L1Hit L1Miss Lcl Rmt Ld Miss FB L1 L2 Llc Rmt
# ..... .................. .... ...... ....... ....... ....... ....... ....... ....... ....... ....... ....... ........ ........ ....... ....... ....... ....... ........ ........
#
0 0x55f07d580100 0 1499 85.89% 481 481 0 7243 3879 3364 2599 765 0 0 0 548 2615 66 169 0
1 0x55f07d580080 0 1 13.93% 78 78 0 664 664 0 0 0 0 0 0 187 361 27 11 0
2 0x55f07d5800c0 0 1 0.18% 1 1 0 405 405 0 0 0 0 0 0 131 0 10 263 0Signed-off-by: Leo Yan
Tested-by: Joe Mario
Acked-by: Jiri Olsa
Link: https://lore.kernel.org/r/20201014050921.5591-3-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo -
To view the statistics with "breakdown" mode, it's good to show the
summary numbers for the total records, all stores and all loads, then
the sequential conlumns can be used to break into more detailed items.To achieve this purpose, this patch displays the summary numbers for
records/stores/loads continuously and places them before breakdown
items, this can allow uses to easily read the summarized statistics.Signed-off-by: Leo Yan
Tested-by: Joe Mario
Acked-by: Jiri Olsa
Link: https://lore.kernel.org/r/20201014050921.5591-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo
14 Oct, 2020
1 commit
-
Since commit b027cc6fdf1b ("perf c2c: Fix 'perf c2c record -e list' to
show the default events used"), "perf c2c" tool can show the memory
events properly, it's no reason to still suggest user to use the
command "perf mem record -e list" for showing events.This patch updates the usage for showing memory events with command
"perf c2c record -e list".Signed-off-by: Leo Yan
Acked-by: Jiri Olsa
Acked-by: Ian Rogers
Signed-off-by: Arnaldo Carvalho de Melo
Link: https://lore.kernel.org/r/20201011121022.22409-1-leo.yan@linaro.org
23 Jun, 2020
1 commit
-
To differentiate from libperf's 'struct perf_evlist' methods.
Cc: Adrian Hunter
Cc: Jiri Olsa
Cc: Namhyung Kim
Signed-off-by: Arnaldo Carvalho de Melo
28 May, 2020
1 commit
-
When the event is passed as list, the default events should be listed as
per 'perf mem record -e list'. Previous behavior is:$ perf c2c record -e list
failed: event 'list' not found, use '-e list' to get list of available eventsUsage: perf c2c record [] []
or: perf c2c record [] -- []-e, --event event selector. Use 'perf mem record -e list' to list available events
$New behavior:
$ perf c2c record -e list
ldlat-loads : available
ldlat-stores : availablev3: is a rebase.
v2: addresses review comments by Jiri Olsa.https://lore.kernel.org/lkml/20191127081844.GH32367@krava/
Signed-off-by: Ian Rogers
Tested-by: Arnaldo Carvalho de Melo
Acked-by: Jiri Olsa
Cc: Alexander Shishkin
Cc: Andi Kleen
Cc: Kan Liang
Cc: Mark Rutland
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lore.kernel.org/lkml/20200507220604.3391-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo
06 May, 2020
1 commit
-
As they are 'struct evsel' methods or related routines, not part of
tools/lib/perf/, aka libperf, to whom the perf_ prefix belongs.Cc: Adrian Hunter
Cc: Jiri Olsa
Cc: Namhyung Kim
Signed-off-by: Arnaldo Carvalho de Melo
30 Apr, 2020
1 commit
-
Fixes coccicheck warnings:
tools/perf/builtin-c2c.c:1712:2-3: Unneeded semicolon
tools/perf/builtin-c2c.c:1928:2-3: Unneeded semicolon
tools/perf/builtin-c2c.c:2962:2-3: Unneeded semicolonReported-by: Hulk Robot
Signed-off-by: Zou Wei
Cc: Alexander Shishkin
Cc: Jiri Olsa
Cc: Mark Rutland
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lore.kernel.org/lkml/1588064336-70456-1-git-send-email-zou_wei@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo
18 Apr, 2020
1 commit
-
With the LBR stitching approach, the reconstructed LBR call stack can
break the HW limitation. However, it may reconstruct invalid call stacks
in some cases, e.g. exception handing such as setjmp/longjmp. Also, it
may impact the processing time especially when the number of samples
with stitched LBRs are huge.Add an option to enable the approach.
Signed-off-by: Kan Liang
Reviewed-by: Andi Kleen
Acked-by: Jiri Olsa
Cc: Adrian Hunter
Cc: Alexey Budankov
Cc: Mathieu Poirier
Cc: Michael Ellerman
Cc: Namhyung Kim
Cc: Pavel Gerasimov
Cc: Peter Zijlstra
Cc: Ravi Bangoria
Cc: Stephane Eranian
Cc: Vitaly Slobodskoy
Link: http://lore.kernel.org/lkml/20200319202517.23423-17-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo
15 Jan, 2020
1 commit
-
Commit 722ddfde366f ("perf tools: Fix time sorting") changed - correctly
so - hist_entry__sort to return int64. Unfortunately several of the
builtin-c2c.c comparison routines only happened to work due the cast
caused by the wrong return type.This causes meaningless ordering of both the cacheline list, and the
cacheline details page. E.g a simple:perf c2c record -a sleep 3
perf c2c reportwill result in cacheline table like
=================================================
Shared Data Cache Line Table
=================================================
#
# ------- Cacheline ---------- Total Tot - LLC Load Hitm - - Store Reference - - Load Dram - LLC Total - Core Load Hit - - LLC Load Hit -
# Index Address Node PA cnt records Hitm Total Lcl Rmt Total L1Hit L1Miss Lcl Rmt Ld Miss Loads FB L1 L2 Llc Rmt
# ..... .............. .... ...... ....... ...... ..... ..... ... .... ..... ...... ...... .... ...... ..... ..... ..... ... .... .......0 0x7f0d27ffba00 N/A 0 52 0.12% 13 6 7 12 12 0 0 7 14 40 4 16 0 0 0
1 0x7f0d27ff61c0 N/A 0 6353 14.04% 1475 801 674 779 779 0 0 718 1392 5574 1299 1967 0 115 0
2 0x7f0d26d3ec80 N/A 0 71 0.15% 16 4 12 13 13 0 0 12 24 58 1 20 0 9 0
3 0x7f0d26d3ec00 N/A 0 98 0.22% 23 17 6 19 19 0 0 6 12 79 0 40 0 10 0i.e. with the list not being ordered by Total Hitm.
Fixes: 722ddfde366f ("perf tools: Fix time sorting")
Signed-off-by: Andres Freund
Tested-by: Michael Petlan
Acked-by: Jiri Olsa
Cc: Alexander Shishkin
Cc: Andi Kleen
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: stable@vger.kernel.org # v3.16+
Link: http://lore.kernel.org/lkml/20200109043030.233746-1-andres@anarazel.de
Signed-off-by: Arnaldo Carvalho de Melo
06 Jan, 2020
1 commit
-
Sometimes we're in an outer code, like the main hists browser popup menu
and the user follows a suggestion about using some hotkey, and that
hotkey is really handled by hists_browser__run(), so allow for calling
it with that hotkey, making it handle it instead of waiting for the user
to press one.Reviewed-by: Jiri Olsa
Cc: Adrian Hunter
Cc: Andi Kleen
Cc: Jin Yao
Cc: Kan Liang
Cc: Linus Torvalds
Cc: Namhyung Kim
Link: https://lkml.kernel.org/n/tip-xv2l7i6o4urn37nv1h40ryfs@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
15 Oct, 2019
1 commit
-
There is a memory leak problem in the failure paths of
build_cl_output(), so fix it.Signed-off-by: Yunfeng Ye
Acked-by: Jiri Olsa
Cc: Alexander Shishkin
Cc: Feilong Lin
Cc: Hu Shiyuan
Cc: Mark Rutland
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lore.kernel.org/lkml/4d3c0178-5482-c313-98e1-f82090d2d456@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo
21 Sep, 2019
1 commit
-
This patch is to return error code of perf_new_session function on
failure instead of NULL.Test Results:
Before Fix:
$ perf c2c report -input
failed to open nput: No such file or directory$ echo $?
0
$After Fix:
$ perf c2c report -input
failed to open nput: No such file or directory$ echo $?
254
$Committer notes:
Fix 'perf tests topology' case, where we use that TEST_ASSERT_VAL(...,
session), i.e. we need to pass zero in case of failure, which was the
case before when NULL was returned by perf_session__new() for failure,
but now we need to negate the result of IS_ERR(session) to respect that
TEST_ASSERT_VAL) expectation of zero meaning failure.Reported-by: Nageswara R Sastry
Signed-off-by: Mamatha Inamdar
Tested-by: Arnaldo Carvalho de Melo
Tested-by: Nageswara R Sastry
Acked-by: Ravi Bangoria
Reviewed-by: Jiri Olsa
Reviewed-by: Mukesh Ojha
Cc: Adrian Hunter
Cc: Alexander Shishkin
Cc: Alexey Budankov
Cc: Greg Kroah-Hartman
Cc: Jeremie Galarneau
Cc: Kate Stewart
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Shawn Landden
Cc: Song Liu
Cc: Thomas Gleixner
Cc: Tzvetomir Stoyanov
Link: http://lore.kernel.org/lkml/20190822071223.17892.45782.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo
20 Sep, 2019
1 commit
-
Only a 'struct perf_cmp_map' forward allocation is necessary, fix the
places that need the header but were getting it indirectly, by luck,
from env.h.Cc: Adrian Hunter
Cc: Jiri Olsa
Cc: Namhyung Kim
Link: https://lkml.kernel.org/n/tip-3sj3n534zghxhk7ygzeaqlx9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
01 Sep, 2019
3 commits
-
The mem_info struct goes to mem-events.h and branch_info goes to
branch.h, where they belong, this way we can remove several headers from
symbols.h and trim the include dependency tree more.Cc: Adrian Hunter
Cc: Jiri Olsa
Cc: Namhyung Kim
Link: https://lkml.kernel.org/n/tip-aupw71xnravcsu2xoabfmhpc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
We only need a forward declaration, add it and fixup all the files that
need ui_progress definitions but were wrongly getting it from hist.h.Cc: Adrian Hunter
Cc: Jiri Olsa
Cc: Namhyung Kim
Link: https://lkml.kernel.org/n/tip-84a90o9jdxybffxo9jmouokw@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
All we need there is a forward declaration for 'union perf_event', so
remove it from there and add missing header directives in places using
things from this indirect include.Cc: Adrian Hunter
Cc: Jiri Olsa
Cc: Namhyung Kim
Link: https://lkml.kernel.org/n/tip-7ftk0ztstqub1tirjj8o8xbl@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
30 Aug, 2019
2 commits
-
Its not needed there, add it to the places that need it and were getting
it via those headers.Cc: Adrian Hunter
Cc: Jiri Olsa
Cc: Namhyung Kim
Link: https://lkml.kernel.org/n/tip-5yulx1u16vyd0zmrbg1tjhju@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
There's wrong bitmap considered when checking for cpu count of specific
node.We do the needed computation for 'set' variable, but at the end we use
the 'c2c_he->cpuset' weight, which shows misleading numbers.Fixes: 1e181b92a2da ("perf c2c report: Add 'node' sort key")
Reported-by: Joe Mario
Signed-off-by: Jiri Olsa
Cc: Alexander Shishkin
Cc: Michael Petlan
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lore.kernel.org/lkml/20190820140219.28338-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
26 Aug, 2019
1 commit
-
To disentangle util/sort.h a bit more.
Cc: Adrian Hunter
Cc: Jiri Olsa
Cc: Namhyung Kim
Link: https://lkml.kernel.org/n/tip-6kbf2cauas06rbqp15pyter5@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
23 Aug, 2019
1 commit
-
If c2c is recorded on a machine where any cpus are offline, 'perf c2c
report' throws an error "node/cpu topology bugFailed setup nodes".It fails because while preparing node-cpu mapping we don't consider
offline cpus.Reported-by: Nageswara R Sastry
Signed-off-by: Ravi Bangoria
Acked-by: Jiri Olsa
Fixes: 1e181b92a2da ("perf c2c report: Add 'node' sort key")
Link: http://lkml.kernel.org/r/20190822085045.25108-1-ravi.bangoria@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo
22 Aug, 2019
1 commit
-
So it's part of the libperf library as one of basic functions operating
on the perf_cpu_map class.Signed-off-by: Jiri Olsa
Cc: Alexander Shishkin
Cc: Michael Petlan
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20190822111141.25823-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
30 Jul, 2019
3 commits
-
Rename struct perf_evlist to struct evlist, so we don't have a name
clash when we add struct perf_evlist in libperf.Committer notes:
Added fixes to build on arm64, from Jiri and from me
(tools/perf/util/cs-etm.c)Signed-off-by: Jiri Olsa
Cc: Alexander Shishkin
Cc: Alexey Budankov
Cc: Andi Kleen
Cc: Michael Petlan
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20190721112506.12306-6-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
Rename struct perf_evsel to struct evsel, so we don't have a name clash
when we add struct perf_evsel in libperf.Committer notes:
Added fixes for arm64, provided by Jiri.
Signed-off-by: Jiri Olsa
Cc: Alexander Shishkin
Cc: Alexey Budankov
Cc: Andi Kleen
Cc: Michael Petlan
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20190721112506.12306-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
Rename struct cpu_map to struct perf_cpu_map, so it could be part of
libperf.Committer notes:
Added fixes for arm64, provided by Jiri.
Signed-off-by: Jiri Olsa
Cc: Alexander Shishkin
Cc: Alexey Budankov
Cc: Andi Kleen
Cc: Michael Petlan
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20190721112506.12306-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
09 Jul, 2019
1 commit
-
Eroding a bit more the tools/perf/util/util.h hodpodge header.
Cc: Adrian Hunter
Cc: Jiri Olsa
Cc: Namhyung Kim
Link: https://lkml.kernel.org/n/tip-natazosyn9rwjka25tvcnyi0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
07 Mar, 2019
1 commit
-
Ravi Bangoria reported that we fail with an empty NUMA node with the
following message:$ lscpu
NUMA node0 CPU(s):
NUMA node1 CPU(s): 0-4$ sudo ./perf c2c report
node/cpu topology bugFailed setup nodesFix this by detecting the empty node and keeping its CPU set empty.
Reported-by: Nageswara R Sastry
Signed-off-by: Jiri Olsa
Tested-by: Ravi Bangoria
Cc: Alexander Shishkin
Cc: Andi Kleen
Cc: Jonas Rabenstein
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20190305152536.21035-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
23 Feb, 2019
1 commit
-
Add a 'path' member to 'struct perf_data'. It will keep the configured
path for the data (const char *). The path in struct perf_data_file is
now dynamically allocated (duped) from it.This scheme is useful/used in following patches where struct
perf_data::path holds the 'configure' directory path and struct
perf_data_file::path holds the allocated path for specific files.Also it actually makes the code little simpler.
Signed-off-by: Jiri Olsa
Cc: Adrian Hunter
Cc: Alexander Shishkin
Cc: Alexey Budankov
Cc: Andi Kleen
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/20190221094145.9151-3-jolsa@kernel.org
[ Fixup data-convert-bt.c missing conversion ]
Signed-off-by: Arnaldo Carvalho de Melo
06 Feb, 2019
2 commits
-
Add argument to hists__resort_cb_t so that we can pass data from upper
layers to the callback function. It will be used in the following
patches.Signed-off-by: Jiri Olsa
Cc: Alexander Shishkin
Cc: Jin Yao
Cc: Kan Liang
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20190204141808.23031-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
Several places were using definitions found in symbols.h but not
including it, getting it by sheer luck from some other headers that now
are in the process of removing that include because they don't need it
or because simply having struct forward declarations is enough, fix it.Cc: Adrian Hunter
Cc: Jiri Olsa
Cc: Namhyung Kim
Link: https://lkml.kernel.org/n/tip-xbcvvx296d70kpg9wb0qmeq9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
25 Jan, 2019
1 commit
-
At the cost of an extra pointer, we can avoid the O(logN) cost of
finding the first element in the tree (smallest node), which is
something heavily required for histograms. Specifically, the following
are converted to rb_root_cached, and users accordingly:hist::entries_in_array
hist::entries_in
hist::entries
hist::entries_collapsed
hist_entry::hroot_in
hist_entry::hroot_outSigned-off-by: Davidlohr Bueso
Tested-by: Arnaldo Carvalho de Melo
Cc: Jiri Olsa
Cc: Namhyung Kim
Link: http://lkml.kernel.org/r/20181206191819.30182-7-dave@stgolabs.net
[ Added some missing conversions to rb_first_cached() ]
Signed-off-by: Arnaldo Carvalho de Melo
22 Jan, 2019
1 commit
-
An automatic const char[] variable gets initialized at runtime, just
like any other automatic variable. For long strings, that uses a lot of
stack and wastes time building the string; e.g. for the "No %s
allocation events..." case one has:444516: 48 b8 4e 6f 20 25 73 20 61 6c movabs $0x6c61207325206f4e,%rax # "No %s al"
...
444674: 48 89 45 80 mov %rax,-0x80(%rbp)
444678: 48 b8 6c 6f 63 61 74 69 6f 6e movabs $0x6e6f697461636f6c,%rax # "location"
444682: 48 89 45 88 mov %rax,-0x78(%rbp)
444686: 48 b8 20 65 76 65 6e 74 73 20 movabs $0x2073746e65766520,%rax # " events "
444690: 66 44 89 55 c4 mov %r10w,-0x3c(%rbp)
444695: 48 89 45 90 mov %rax,-0x70(%rbp)
444699: 48 b8 66 6f 75 6e 64 2e 20 20 movabs $0x20202e646e756f66,%raxMake them all static so that the compiler just references objects in .rodata.
Committer testing:
Ok, using dwarves's codiff tool:
$ codiff --functions /tmp/perf.before ~/bin/perf
builtin-sched.c:
cmd_sched | -48
1 function changed, 48 bytes removed, diff: -48builtin-report.c:
cmd_report | -32
1 function changed, 32 bytes removed, diff: -32builtin-kmem.c:
cmd_kmem | -64
build_alloc_func_list | -50
2 functions changed, 114 bytes removed, diff: -114builtin-c2c.c:
perf_c2c__report | -390
1 function changed, 390 bytes removed, diff: -390ui/browsers/header.c:
tui__header_window | -104
1 function changed, 104 bytes removed, diff: -104/home/acme/bin/perf:
9 functions changed, 688 bytes removed, diff: -688Signed-off-by: Rasmus Villemoes
Acked-by: Jiri Olsa
Tested-by: Arnaldo Carvalho de Melo
Cc: Alexander Shishkin
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20181102230624.20064-1-linux@rasmusvillemoes.dk
Signed-off-by: Arnaldo Carvalho de Melo
29 Dec, 2018
2 commits
-
The cachelines being reported are the ones with percentages all the way
down to 0.05%. That makes for very long output files. Raising that to
0.1%. The user can always specify --show-all if they want all the
cachelines with hits.Suggested-by: Joe Mario
Signed-off-by: Jiri Olsa
Cc: Alexander Shishkin
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20181228101820.28010-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
Joe suggested to have the coalesce default set just to 'iaddr', because
it's easier to read on the default 'perf c2c report' output.By removing the "pid" field from the default -c/--coalesce option, the
'perf c2c' report will group all the relevant PIDs under the instruction
address ('iaddr') bucket. User can always run "-c pid,iaddr" for a more
fine grained output on particular PIDs.Suggested-by: Joe Mario
Signed-off-by: Jiri Olsa
Cc: Alexander Shishkin
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20181228101820.28010-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
31 Jul, 2018
1 commit
-
'perf c2c' scans read/write accesses and tries to find false sharing
cases, so when the events it wants were not asked for or ended up not
taking place, we get no histograms.So do not try to display entry details if there's not any. Currently
this ends up in crash:$ perf c2c report # then press 'd'
perf: Segmentation fault
$Committer testing:
Before:
Record a perf.data file without events of interest to 'perf c2c report',
then call it and press 'd':# perf record sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.001 MB perf.data (6 samples) ]
# perf c2c report
perf: Segmentation fault
-------- backtrace --------
perf[0x5b1d2a]
/lib64/libc.so.6(+0x346df)[0x7fcb566e36df]
perf[0x46fcae]
perf[0x4a9f1e]
perf[0x4aa220]
perf(main+0x301)[0x42c561]
/lib64/libc.so.6(__libc_start_main+0xe9)[0x7fcb566cff29]
perf(_start+0x29)[0x42c999]
#After the patch the segfault doesn't take place, a follow up patch to
tell the user why nothing changes when 'd' is pressed would be good.Reported-by: rodia@autistici.org
Signed-off-by: Jiri Olsa
Tested-by: Arnaldo Carvalho de Melo
Cc: Alexander Shishkin
Cc: David Ahern
Cc: Don Zickus
Cc: Joe Mario
Cc: Namhyung Kim
Cc: Peter Zijlstra
Fixes: f1c5fd4d0bb9 ("perf c2c report: Add TUI cacheline browser")
Link: http://lkml.kernel.org/r/20180724062008.26126-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo