13 Jun, 2019
1 commit
-
Add DDR performance monitor support for iMX8QXP. The PMU consists of 3
programmable event counters and a single dedicated cycle counter.Example usage:
$ perf stat -a -e \
imx8_ddr0/read-cycles/,imx8_ddr0/write-cycles/,imx8_ddr0/precharge/ ls- or -
$ perf stat -a -e \
imx8_ddr0/cycles/,imx8_ddr0/read-access/,imx8_ddr0/write-access/ lsOther events are supported, and advertised via perf list.
Reviewed-by: Andrey Smirnov
Signed-off-by: Frank Li
[will: rewrote commit message/kconfig and used #defines for dev/cpuhp names]
Signed-off-by: Will Deacon
04 Apr, 2019
1 commit
-
Adds a new driver to support the SMMUv3 PMU and add it into the
perf events framework.Each SMMU node may have multiple PMUs associated with it, each of
which may support different events.SMMUv3 PMCG devices are named as smmuv3_pmcg_ where
is the physical page address of the SMMU PMCG
wrapped to 4K boundary. For example, the PMCG at 0xff88840000 is
named smmuv3_pmcg_ff88840Filtering by stream id is done by specifying filtering parameters
with the event. options are:
filter_enable - 0 = no filtering, 1 = filtering enabled
filter_span - 0 = exact match, 1 = pattern match
filter_stream_id - pattern to filter againstExample: perf stat -e smmuv3_pmcg_ff88840/transaction,filter_enable=1,
filter_span=1,filter_stream_id=0x42/ -a netperfApplies filter pattern 0x42 to transaction events, which means events
matching stream ids 0x42 & 0x43 are counted as only upper StreamID
bits are required to match the given filter. Further filtering
information is available in the SMMU documentation.SMMU events are not attributable to a CPU, so task mode and sampling
are not supported.Signed-off-by: Neil Leeder
Signed-off-by: Shameer Kolothum
Reviewed-by: Robin Murphy
[will: fold in review feedback from Robin]
[will: rewrite Kconfig text and allow building as a module]
Signed-off-by: Will Deacon
06 Dec, 2018
1 commit
-
This patch adds a perf driver for the PMU UNCORE devices DDR4 Memory
Controller(DMC) and Level 3 Cache(L3C). Each PMU supports up to 4
counters. All counters lack overflow interrupt and are
sampled periodically.Reviewed-by: Suzuki K Poulose
Signed-off-by: Ganapatrao Kulkarni
[will: consistent enum cpuhp_state naming]
Signed-off-by: Will Deacon
07 Mar, 2018
2 commits
-
The arm-cci driver is really two entirely separate drivers; one for MCPM
port control and the other for the performance monitors. Since they are
already relatively self-contained, let's take the plunge and move the
PMU parts out to drivers/perf where they belong these days. For non-MCPM
systems this leaves a small dependency on the remaining "bus" stub for
initial probing and discovery, but we end up with something that still
fits the general pattern of its fellow system PMU drivers to ease future
maintenance.Moving code to a new file also offers a perfect excuse to modernise the
license/copyright headers and clean up some funky linewraps on the way.Cc: Lorenzo Pieralisi
Reviewed-by: Suzuki Poulose
Acked-by: Punit Agrawal
Signed-off-by: Robin Murphy
Signed-off-by: Arnd Bergmann -
The arm-ccn driver is purely a perf driver for the CCN PMU, not a bus
driver in the sense of the other residents of drivers/bus/, so let's
move it to the appropriate place for SoC PMU drivers. Not to mention
moving the documentation accordingly as well.Acked-by: Pawel Moll
Acked-by: Will Deacon
Signed-off-by: Robin Murphy
Signed-off-by: Arnd Bergmann
03 Jan, 2018
1 commit
-
Add support for the Cluster PMU part of the ARM DynamIQ Shared Unit (DSU).
The DSU integrates one or more cores with an L3 memory system, control
logic, and external interfaces to form a multicore cluster. The PMU
allows counting the various events related to L3, SCU etc, along with
providing a cycle counter.The PMU can be accessed via system registers, which are common
to the cores in the same cluster. The PMU registers follow the
semantics of the ARMv8 PMU, mostly, with the exception that
the counters record the cluster wide events.This driver is mostly based on the ARMv8 and CCI PMU drivers.
The driver only supports ARM64 at the moment. It can be extended
to support ARM32 by providing register accessors like we do in
arch/arm64/include/arm_dsu_pmu.h.Cc: Mark Rutland
Cc: Will Deacon
Reviewed-by: Jonathan Cameron
Reviewed-by: Mark Rutland
Signed-off-by: Suzuki K Poulose
Signed-off-by: Will Deacon
16 Nov, 2017
1 commit
-
Pull arm64 updates from Will Deacon:
"The big highlight is support for the Scalable Vector Extension (SVE)
which required extensive ABI work to ensure we don't break existing
applications by blowing away their signal stack with the rather large
new vector context ( of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (97 commits)
arm64: Make ARMV8_DEPRECATED depend on SYSCTL
arm64: Implement __lshrti3 library function
arm64: support __int128 on gcc 5+
arm64/sve: Add documentation
arm64/sve: Detect SVE and activate runtime support
arm64/sve: KVM: Hide SVE from CPU features exposed to guests
arm64/sve: KVM: Treat guest SVE use as undefined instruction execution
arm64/sve: KVM: Prevent guests from using SVE
arm64/sve: Add sysctl to set the default vector length for new processes
arm64/sve: Add prctl controls for userspace vector length management
arm64/sve: ptrace and ELF coredump support
arm64/sve: Preserve SVE registers around EFI runtime service calls
arm64/sve: Preserve SVE registers around kernel-mode NEON use
arm64/sve: Probe SVE capabilities and usable vector lengths
arm64: cpufeature: Move sys_caps_initialised declarations
arm64/sve: Backend logic for setting the vector length
arm64/sve: Signal handling support
arm64/sve: Support vector length resetting for new processes
arm64/sve: Core task context handling
arm64/sve: Low-level CPU setup
...
02 Nov, 2017
1 commit
-
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.By default all files without license information are under the default
license of the kernel, which is GPL version 2.Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if
Reviewed-by: Philippe Ombredanne
Reviewed-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman
20 Oct, 2017
1 commit
-
This patch adds support HiSilicon SoC uncore PMU driver framework and
interfaces.Acked-by: Mark Rutland
Reviewed-by: Jonathan Cameron
Signed-off-by: Shaokun Zhang
Signed-off-by: Anurup M
[will: Fix leader accounting in uncore group validation]
Signed-off-by: Will Deacon
18 Oct, 2017
1 commit
-
The ARMv8.2 architecture introduces the optional Statistical Profiling
Extension (SPE).SPE can be used to profile a population of operations in the CPU pipeline
after instruction decode. These are either architected instructions (i.e.
a dynamic instruction trace) or CPU-specific uops and the choice is fixed
statically in the hardware and advertised to userspace via caps/. Sampling
is controlled using a sampling interval, similar to a regular PMU counter,
but also with an optional random perturbation to avoid falling into patterns
where you continuously profile the same instruction in a hot loop.After each operation is decoded, the interval counter is decremented. When
it hits zero, an operation is chosen for profiling and tracked within the
pipeline until it retires. Along the way, information such as TLB lookups,
cache misses, time spent to issue etc is captured in the form of a sample.
The sample is then filtered according to certain criteria (e.g. load
latency) that can be specified in the event config (described under
format/) and, if the sample satisfies the filter, it is written out to
memory as a record, otherwise it is discarded. Only one operation can
be sampled at a time.The in-memory buffer is linear and virtually addressed, raising an
interrupt when it fills up. The PMU driver handles these interrupts to
give the appearance of a ring buffer, as expected by the AUX code.The in-memory trace-like format is self-describing (though not parseable
in reverse) and written as a series of records, with each record
corresponding to a sample and consisting of a sequence of packets. These
packets are defined by the architecture, although some have CPU-specific
fields for recording information specific to the microarchitecture.As a simple example, a record generated for a branch instruction may
consist of the following packets:0 (Address) : Virtual PC of the branch instruction
1 (Type) : Conditional direct branch
2 (Counter) : Number of cycles taken from Dispatch to Issue
3 (Address) : Virtual branch target + condition flags
4 (Counter) : Number of cycles taken from Dispatch to Complete
5 (Events) : Mispredicted as not-taken
6 (END) : End of recordIt is also possible to toggle properties such as timestamp packets in
each record.This patch adds support for SPE in the form of a new perf driver.
Cc: Alexander Shishkin
Reviewed-by: Mark Rutland
Signed-off-by: Will Deacon
11 Apr, 2017
2 commits
-
This patch adds framework code to handle parsing PMU data out of the
MADT, sanity checking this, and managing the association of CPUs (and
their interrupts) with appropriate logical PMUs.For the time being, we expect that only one PMU driver (PMUv3) will make
use of this, and we simply pass in a single probe function.This is based on an earlier patch from Jeremy Linton.
Signed-off-by: Mark Rutland
Tested-by: Jeremy Linton
Cc: Will Deacon
Signed-off-by: Will Deacon -
Now that we've split the pdev and DT probing logic from the runtime
management, let's move the former into its own file. We gain a few lines
due to the copyright header and includes, but this should keep the logic
clearly separated, and paves the way for adding ACPI support in a
similar fashion.Signed-off-by: Mark Rutland
Tested-by: Jeremy Linton
[will: rename nr_irqs to avoid conflict with global variable]
Signed-off-by: Will Deacon
04 Apr, 2017
1 commit
-
This adds a new dynamic PMU to the Perf Events framework to program
and control the L3 cache PMUs in some Qualcomm Technologies SOCs.The driver supports a distributed cache architecture where the overall
cache for a socket is comprised of multiple slices each with its own PMU.
Access to each individual PMU is provided even though all CPUs share all
the slices. User space needs to aggregate to individual counts to provide
a global picture.The driver exports formatting and event information to sysfs so it can
be used by the perf user space tools with the syntaxes:
perf stat -a -e l3cache_0_0/read-miss/
perf stat -a -e l3cache_0_0/event=0x21/Acked-by: Mark Rutland
Signed-off-by: Agustin Vega-Frias
[will: fixed sparse issues]
Signed-off-by: Will Deacon
09 Feb, 2017
1 commit
-
Adds perf events support for L2 cache PMU.
The L2 cache PMU driver is named 'l2cache_0' and can be used
with perf events to profile L2 events such as cache hits
and misses on Qualcomm Technologies processors.Reviewed-by: Mark Rutland
Signed-off-by: Neil Leeder
[will: minimise nesting in l2_cache_associate_cpu_with_cluster]
[will: use kstrtoul for unsigned long, remove redunant .owner setting]
Signed-off-by: Will Deacon
16 Sep, 2016
1 commit
-
This patch adds a driver for the SoC-wide (AKA uncore) PMU hardware
found in APM X-Gene SoCs.Signed-off-by: Tai Nguyen
Reviewed-by: Mark Rutland
31 Jul, 2015
1 commit
-
To enable sharing of the arm_pmu code with arm64, this patch factors it
out to drivers/perf/. A new drivers/perf directory is added for
performance monitor drivers to live under.MAINTAINERS is updated accordingly. Files added previously without a
corresponsing MAINTAINERS update (perf_regs.c, perf_callchain.c, and
perf_event.h) are also added.Cc: Arnaldo Carvalho de Melo
Cc: Greg Kroah-Hartman
Cc: Ingo Molnar
Cc: Linus Walleij
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Russell King
Cc: Will Deacon
Signed-off-by: Mark Rutland
[will: augmented Kconfig help slightly]
Signed-off-by: Will Deacon