11 Jul, 2018

4 commits

  • Add support for 64bit events by using chained event counters
    and 64bit cycle counters.

    PMUv3 allows chaining a pair of adjacent 32-bit counters, effectively
    forming a 64-bit counter. The low/even counter is programmed to count
    the event of interest, and the high/odd counter is programmed to count
    the CHAIN event, taken when the low/even counter overflows.

    For CPU cycles, when 64bit mode is requested, the cycle counter
    is used in 64bit mode. If the cycle counter is not available,
    we fall back to chaining a pair of event counters.

    Cc: Will Deacon
    Acked-by: Mark Rutland
    Signed-off-by: Suzuki K Poulose
    Signed-off-by: Will Deacon

    Suzuki K Poulose
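    As a sketch of the scheme described above (helper names here are
    illustrative, not the driver's real accessors): the even counter is
    programmed with the event of interest, the odd counter with the PMUv3
    CHAIN event (0x1E), and the pair is read back as one 64bit value.

        /* Illustrative sketch; program_counter()/read_counter() stand in
         * for the driver's banked PMU register accessors. */
        #define ARMV8_PMUV3_PERFCTR_CHAIN 0x1E /* overflow of the even counter */

        static void chain_program(int idx, u32 event)
        {
            /* idx must be even: idx counts the event, idx + 1 the CHAINs */
            program_counter(idx, event);
            program_counter(idx + 1, ARMV8_PMUV3_PERFCTR_CHAIN);
        }

        static u64 chain_read(int idx)
        {
            u64 hi, lo;

            /* hi-lo-hi read to tolerate an overflow between the reads;
             * the real driver instead reads with the PMU stopped. */
            do {
                hi = read_counter(idx + 1);
                lo = read_counter(idx);
            } while (read_counter(idx + 1) != hi);

            return (hi << 32) | lo;
        }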
     
  • The armpmu uses the get_event_idx callback to allocate an event
    counter for a given event, which marks the selected counter
    as "used". Now, when we delete the counter, the arm_pmu goes
    ahead, clears the "used" bit and then invokes the "clear_event_idx"
    callback, which splits the job between the core code
    and the backend. To keep things tidy, mandate the implementation
    of clear_event_idx() and add it for existing backends.
    This will be useful for adding the chained event support, where
    we leave the event idx maintenance to the backend.

    Also, when an event is removed from the PMU, reset the hw.idx
    to indicate that a counter is not allocated for this event,
    to help the backends do better checks. This will be also used
    for the chain counter support.

    Cc: Will Deacon
    Cc: Mark Rutland
    Reviewed-by: Julien Thierry
    Signed-off-by: Suzuki K Poulose
    Signed-off-by: Will Deacon

    Suzuki K Poulose
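    The delete path after this change looks roughly as follows (condensed
    from the arm_pmu core; unrelated bookkeeping elided):

        static void armpmu_del(struct perf_event *event, int flags)
        {
            struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
            struct pmu_hw_events *hw_events = this_cpu_ptr(armpmu->hw_events);
            struct hw_perf_event *hwc = &event->hw;
            int idx = hwc->idx;

            armpmu_stop(event, PERF_EF_UPDATE);
            hw_events->events[idx] = NULL;
            armpmu->clear_event_idx(hw_events, event); /* now mandatory */
            perf_event_update_userpg(event);

            /* No counter is allocated for this event any more. */
            hwc->idx = -1;
        }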
     
  • Each PMU has a set of 32bit event counters. But in some
    special cases, the events could be counted using counters
    which are effectively 64bit wide.

    e.g., Arm v8 PMUv3 has a 64bit cycle counter which can count
    only the CPU cycles. Also, the PMU can chain a pair of event
    counters to effectively count as a 64bit counter.

    Add support for tracking the events that use 64bit counters.
    This only affects the periods set for each counter in the core
    driver.

    Cc: Will Deacon
    Reviewed-by: Julien Thierry
    Acked-by: Mark Rutland
    Signed-off-by: Suzuki K Poulose
    Signed-off-by: Will Deacon

    Suzuki K Poulose
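    The tracking boils down to a per-event flag plus a helper along these
    lines (ARMPMU_EVT_64BIT is the flag this series adds to the arm_pmu
    header; treat the exact shape as a sketch):

        #define ARMPMU_EVT_64BIT 1 /* event->hw.flags: 64bit-wide counter */

        static inline u64 arm_pmu_event_max_period(struct perf_event *event)
        {
            if (event->hw.flags & ARMPMU_EVT_64BIT)
                return GENMASK_ULL(63, 0);

            return GENMASK_ULL(31, 0);
        }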
     
  • Each PMU defines its max_period as the maximum value that its
    counters can count. Since all the PMU backends support
    32bit counters by default, let us remove the redundant field.

    No functional changes.

    Cc: Will Deacon
    Acked-by: Mark Rutland
    Reviewed-by: Julien Thierry
    Signed-off-by: Suzuki K Poulose
    Signed-off-by: Will Deacon

    Suzuki K Poulose
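    With the field gone, the period logic derives the limit per event
    instead. A condensed sketch of the reworked armpmu_event_set_period(),
    written in terms of the arm_pmu_event_max_period() helper from this
    series (some corner-case handling elided):

        int armpmu_event_set_period(struct perf_event *event)
        {
            struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
            struct hw_perf_event *hwc = &event->hw;
            u64 max_period = arm_pmu_event_max_period(event);
            s64 left = local64_read(&hwc->period_left);
            s64 period = hwc->sample_period;
            int ret = 0;

            if (unlikely(left <= 0)) {
                left += period;
                local64_set(&hwc->period_left, left);
                hwc->last_period = period;
                ret = 1;
            }

            /* Keep the armed value inside what the counter can hold. */
            if (left > (max_period >> 1))
                left = max_period >> 1;

            local64_set(&hwc->prev_count, (u64)-left);
            armpmu->write_counter(event, (u64)(-left) & max_period);
            perf_event_update_userpg(event);

            return ret;
        }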
     

22 May, 2018

1 commit

  • The arm_pmu::handle_irq() callback has the same prototype as a generic
    IRQ handler, taking the IRQ number and a void pointer argument which it
    must convert to an arm_pmu pointer.

    This means that all arm_pmu::handle_irq() implementations take an IRQ
    number they never use, and all must explicitly cast the void pointer
    to an arm_pmu pointer.

    Instead, let's change arm_pmu::handle_irq to take an arm_pmu pointer,
    allowing these casts to be removed. The redundant IRQ number parameter
    is also removed.

    Suggested-by: Hoeun Ryu
    Signed-off-by: Mark Rutland
    Cc: Will Deacon
    Signed-off-by: Will Deacon

    Mark Rutland
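    The resulting shape (sketched) confines the one remaining cast to the
    common dispatcher, which is what the interrupt core actually invokes:

        /* Before: generic shape, every implementation casts dev itself. */
        irqreturn_t (*handle_irq)(int irq_num, void *dev);

        /* After: typed callback. */
        irqreturn_t (*handle_irq)(struct arm_pmu *pmu);

        static irqreturn_t armpmu_dispatch_irq(int irq, void *dev)
        {
            struct arm_pmu *armpmu = *(struct arm_pmu **)dev;
            u64 start_clock, finish_clock;
            irqreturn_t ret;

            start_clock = sched_clock();
            ret = armpmu->handle_irq(armpmu);
            finish_clock = sched_clock();

            perf_sample_event_took(finish_clock - start_clock);
            return ret;
        }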
     

17 Mar, 2018

1 commit

  • Mark noticed that the change to sibling_list changed some iteration
    semantics; because previously we used group_list as list entry,
    sibling events would always have an empty sibling_list.

    But because we now use sibling_list for both list head and list entry,
    siblings will report as having siblings.

    Fix this with a custom for_each_sibling_event() iterator.

    Fixes: 8343aae66167 ("perf/core: Remove perf_event::group_entry")
    Reported-by: Mark Rutland
    Suggested-by: Mark Rutland
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Thomas Gleixner
    Cc: vincent.weaver@maine.edu
    Cc: alexander.shishkin@linux.intel.com
    Cc: torvalds@linux-foundation.org
    Cc: alexey.budankov@linux.intel.com
    Cc: valery.cherepennikov@intel.com
    Cc: eranian@google.com
    Cc: acme@redhat.com
    Cc: linux-tip-commits@vger.kernel.org
    Cc: davidcc@google.com
    Cc: kan.liang@intel.com
    Cc: Dmitry.Prohorov@intel.com
    Cc: jolsa@redhat.com
    Link: https://lkml.kernel.org/r/20180315170129.GX4043@hirez.programming.kicks-ass.net

    Peter Zijlstra
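    The iterator is roughly the following: invoked on a non-leader it
    yields nothing, since a sibling's sibling_list now serves as its list
    entry rather than as a list head:

        #define for_each_sibling_event(sibling, event) \
            if ((event)->group_leader == (event)) \
                list_for_each_entry((sibling), &(event)->sibling_list, sibling_list)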
     

12 Mar, 2018

1 commit

  • Now that all the grouping is done with RB trees, we no longer need
    group_entry and can replace the whole thing with sibling_list.

    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Mark Rutland
    Cc: Alexander Shishkin
    Cc: Alexey Budankov
    Cc: Arnaldo Carvalho de Melo
    Cc: David Carrillo-Cisneros
    Cc: Dmitri Prokhorov
    Cc: Jiri Olsa
    Cc: Kan Liang
    Cc: Linus Torvalds
    Cc: Stephane Eranian
    Cc: Thomas Gleixner
    Cc: Valery Cherepennikov
    Cc: Vince Weaver
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

28 Feb, 2018

1 commit

  • Commit 6de3f79112cc ("arm_pmu: explicitly enable/disable SPIs at hotplug")
    moved all of the arm_pmu IRQ enable/disable calls to the CPU hotplug hooks,
    regardless of whether they are implemented as PPIs or SPIs. This can
    lead to us sleeping from atomic context due to disable_irq blocking:

    | BUG: sleeping function called from invalid context at kernel/irq/manage.c:112
    | in_atomic(): 1, irqs_disabled(): 128, pid: 15, name: migration/1
    | no locks held by migration/1/15.
    | irq event stamp: 192
    | hardirqs last enabled at (191): [] _raw_spin_unlock_irq+0x2c/0x4c
    | hardirqs last disabled at (192): [] multi_cpu_stop+0x9c/0x140
    | softirqs last enabled at (0): [] copy_process.isra.77.part.78+0x43c/0x1504
    | softirqs last disabled at (0): [< (null)>] (null)
    | CPU: 1 PID: 15 Comm: migration/1 Not tainted 4.16.0-rc3-salvator-x #1651
    | Hardware name: Renesas Salvator-X board based on r8a7796 (DT)
    | Call trace:
    | dump_backtrace+0x0/0x140
    | show_stack+0x14/0x1c
    | dump_stack+0xb4/0xf0
    | ___might_sleep+0x1fc/0x218
    | __might_sleep+0x70/0x80
    | synchronize_irq+0x40/0xa8
    | disable_irq+0x20/0x2c
    | arm_perf_teardown_cpu+0x80/0xac

    Since the interrupt is always CPU-affine and this code is running with
    interrupts disabled, we can just use disable_irq_nosync as we know there
    isn't a concurrent invocation of the handler to worry about.

    Fixes: 6de3f79112cc ("arm_pmu: explicitly enable/disable SPIs at hotplug")
    Reported-by: Geert Uytterhoeven
    Tested-by: Geert Uytterhoeven
    Acked-by: Mark Rutland
    Signed-off-by: Will Deacon
    Signed-off-by: Catalin Marinas

    Will Deacon
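    The fix itself is essentially a one-line substitution in the teardown
    hook; a sketch, with the surrounding logic abbreviated:

        static int arm_perf_teardown_cpu(unsigned int cpu, struct hlist_node *node)
        {
            struct arm_pmu *pmu = hlist_entry_safe(node, struct arm_pmu, node);
            int irq = armpmu_get_cpu_irq(pmu, cpu);

            if (irq) {
                if (irq_is_percpu_devid(irq))
                    disable_percpu_irq(irq);
                else
                    disable_irq_nosync(irq); /* was disable_irq(), which may sleep */
            }

            return 0;
        }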
     

20 Feb, 2018

7 commits

  • We can't request IRQs in atomic context, so for ACPI systems we'll have
    to request them up-front, and later associate them with CPUs.

    This patch reorganises the arm_pmu code to do so. As we no longer have
    the arm_pmu structure at probe time, a number of prototypes need to be
    adjusted, requiring changes to the common arm_pmu code and arm_pmu
    platform code.

    Signed-off-by: Mark Rutland
    Cc: Will Deacon
    Signed-off-by: Will Deacon

    Mark Rutland
     
  • To support ACPI systems, we need to request IRQs before we know the
    associated PMU, and thus we need some percpu variable that the IRQ
    handler can find the PMU from.

    As we're going to request IRQs without the PMU, we can't rely on the
    arm_pmu::active_irqs mask, and similarly need to track requested IRQs
    with a percpu variable.

    Signed-off-by: Mark Rutland
    [will: made armpmu_count_irq_users static]
    Signed-off-by: Will Deacon

    Mark Rutland
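    A sketch of the percpu bookkeeping: each CPU records its requested
    IRQ, and a walk over those slots stands in for a refcount when a
    shared PPI is freed (cpu_armpmu is the companion percpu PMU pointer
    from this series):

        static DEFINE_PER_CPU(struct arm_pmu *, cpu_armpmu);
        static DEFINE_PER_CPU(int, cpu_irq);

        static int armpmu_count_irq_users(const int irq)
        {
            int cpu, count = 0;

            for_each_possible_cpu(cpu) {
                if (per_cpu(cpu_irq, cpu) == irq)
                    count++;
            }

            return count;
        }

        void armpmu_free_irq(int irq, int cpu)
        {
            if (per_cpu(cpu_irq, cpu) == 0)
                return;

            if (!irq_is_percpu_devid(irq))
                free_irq(irq, per_cpu_ptr(&cpu_armpmu, cpu));
            else if (armpmu_count_irq_users(irq) == 1)
                free_percpu_irq(irq, &cpu_armpmu);

            per_cpu(cpu_irq, cpu) = 0;
        }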
     
  • To support ACPI systems, we need to request IRQs before CPUs are
    hotplugged, and thus we need to request IRQs before we know their
    associated PMU.

    This is problematic if a PMU IRQ is pending out of reset, as it may be
    taken before we know the PMU, and thus the IRQ handler won't be able to
    handle it, leaving it screaming.

    To avoid such problems, let's request all IRQs in a disabled state, and
    explicitly enable/disable them at hotplug time, when we're sure the PMU
    has been probed.

    Signed-off-by: Mark Rutland
    Cc: Will Deacon
    Signed-off-by: Will Deacon

    Mark Rutland
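    The request-disabled pattern rests on IRQ_NOAUTOEN, which keeps
    request_irq() from enabling the line; the hotplug callbacks then flip
    it once the PMU is known. A sketch:

        /* Probe time: request, but leave the line disabled. */
        irq_set_status_flags(irq, IRQ_NOAUTOEN);
        err = request_irq(irq, armpmu_dispatch_irq,
                          IRQF_NOBALANCING | IRQF_NO_THREAD, "arm-pmu",
                          per_cpu_ptr(&cpu_armpmu, cpu));

        /* Hotplug online callback, once the PMU is probed: */
        if (irq_is_percpu_devid(irq))
            enable_percpu_irq(irq, IRQ_TYPE_NONE);
        else
            enable_irq(irq);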
     
  • The arm_pmu platform code explicitly checks for mismatched PPIs at probe
    time, while the ACPI code leaves this to the core code. Future
    refactoring will make this difficult for the core code to check, so
    let's have the ACPI code check this explicitly.

    As before, upon a failure we'll continue on without an interrupt. Ho
    hum.

    Signed-off-by: Mark Rutland
    Cc: Will Deacon
    Signed-off-by: Will Deacon

    Mark Rutland
     
  • In ACPI systems, we don't know the makeup of CPUs until we hotplug them
    on, and thus have to allocate the PMU data structures at hotplug time.
    Thus, we must use GFP_ATOMIC allocations.

    Let's add an armpmu_alloc_atomic() that we can use in this case.

    Signed-off-by: Mark Rutland
    Cc: Will Deacon
    Signed-off-by: Will Deacon

    Mark Rutland
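    The split is little more than a gfp_t threaded through the common
    allocation path; condensed:

        static struct arm_pmu *__armpmu_alloc(gfp_t flags)
        {
            struct arm_pmu *pmu;

            pmu = kzalloc(sizeof(*pmu), flags);
            if (!pmu)
                return NULL;

            pmu->hw_events = alloc_percpu_gfp(struct pmu_hw_events, flags);
            if (!pmu->hw_events) {
                kfree(pmu);
                return NULL;
            }

            /* ... common field initialisation elided ... */
            return pmu;
        }

        struct arm_pmu *armpmu_alloc(void)        { return __armpmu_alloc(GFP_KERNEL); }
        struct arm_pmu *armpmu_alloc_atomic(void) { return __armpmu_alloc(GFP_ATOMIC); }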
     
  • The armpmu_{request,free}_irqs() helpers are only used by
    arm_pmu_platform.c, so let's fold them in and make them static.

    Signed-off-by: Mark Rutland
    Cc: Will Deacon
    Signed-off-by: Will Deacon

    Mark Rutland
     
  • Now that we have no platforms passing platform data to the arm_pmu code,
    we can get rid of the platdata and associated hooks, paving the way for
    rework of our IRQ handling.

    Signed-off-by: Mark Rutland
    Cc: Will Deacon
    Signed-off-by: Will Deacon

    Mark Rutland
     

24 Oct, 2017

1 commit

  • arm_pmu interrupts are marked as PERCPU even when they are not local
    physical interrupts to a single CPU. When using non-local interrupts,
    interrupts marked as PERCPU will neither get freed nor disabled
    properly by the PMU driver.

    Check if interrupts are local to a single CPU with PERCPU_DEVID since
    this is what the PMU driver really needs to know.

    Acked-by: Mark Rutland
    Signed-off-by: Julien Thierry
    Signed-off-by: Will Deacon

    Julien Thierry
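    The check in question distinguishes a genuinely banked per-CPU line
    from an SPI that merely got marked PERCPU; the request and free paths
    then branch on it. A sketch (hw_events is the driver's percpu
    pmu_hw_events pointer):

        if (irq_is_percpu_devid(irq)) {
            /* One line, banked per CPU. */
            err = request_percpu_irq(irq, armpmu_dispatch_irq, "arm-pmu",
                                     &hw_events->percpu_pmu);
        } else {
            /* Ordinary SPI routed to a single CPU. */
            err = request_irq(irq, armpmu_dispatch_irq,
                              IRQF_NOBALANCING | IRQF_NO_THREAD, "arm-pmu",
                              per_cpu_ptr(&hw_events->percpu_pmu, cpu));
        }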
     

27 Jul, 2017

1 commit

  • Since the PMU register interface is banked per CPU, CPU PMU interrupts
    cannot be handled by a CPU other than the one with the PMU asserting the
    interrupt. This means that migrating PMU SPIs, as we do during a CPU
    hotplug operation, doesn't make any sense and can lead to the IRQ being
    disabled entirely if we route a spurious IRQ to the new affinity target.

    This has been observed in practice on AMD Seattle, where CPUs on the
    non-boot cluster appear to take a spurious PMU IRQ when coming online,
    which is routed to CPU0 where it cannot be handled.

    This patch passes IRQF_PERCPU for PMU SPIs and forcefully sets their
    affinity prior to requesting them, ensuring that they cannot
    be migrated during hotplug events. This interacts badly with the DB8500
    erratum workaround that ping-pongs the interrupt affinity from the handler,
    so we avoid passing IRQF_PERCPU in that case by allowing the IRQ flags
    to be overridden in the platdata.

    Fixes: 3cf7ee98b848 ("drivers/perf: arm_pmu: move irq request/free into probe")
    Cc: Mark Rutland
    Cc: Linus Walleij
    Signed-off-by: Will Deacon

    Will Deacon
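    Sketched, the request path pins the affinity before requesting and
    lets platdata override the flags (which is how DB8500 opts out of
    IRQF_PERCPU):

        struct arm_pmu_platdata *platdata = armpmu_get_platdata(armpmu);
        unsigned long irq_flags;

        if (platdata && platdata->irq_flags)
            irq_flags = platdata->irq_flags;
        else
            irq_flags = IRQF_PERCPU | IRQF_NOBALANCING | IRQF_NO_THREAD;

        irq_force_affinity(irq, cpumask_of(cpu)); /* pin before requesting */
        err = request_irq(irq, armpmu_dispatch_irq, irq_flags, "arm-pmu",
                          per_cpu_ptr(&hw_events->percpu_pmu, cpu));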
     

11 Apr, 2017

11 commits

  • This patch adds framework code to handle parsing PMU data out of the
    MADT, sanity checking this, and managing the association of CPUs (and
    their interrupts) with appropriate logical PMUs.

    For the time being, we expect that only one PMU driver (PMUv3) will make
    use of this, and we simply pass in a single probe function.

    This is based on an earlier patch from Jeremy Linton.

    Signed-off-by: Mark Rutland
    Tested-by: Jeremy Linton
    Cc: Will Deacon
    Signed-off-by: Will Deacon

    Mark Rutland
     
  • Now that we've split the pdev and DT probing logic from the runtime
    management, let's move the former into its own file. We gain a few lines
    due to the copyright header and includes, but this should keep the logic
    clearly separated, and paves the way for adding ACPI support in a
    similar fashion.

    Signed-off-by: Mark Rutland
    Tested-by: Jeremy Linton
    [will: rename nr_irqs to avoid conflict with global variable]
    Signed-off-by: Will Deacon

    Mark Rutland
     
  • Currently we request (and potentially free) all IRQs for a given PMU in
    cpu_pmu_init(). This works for platform/DT probing today, but it doesn't
    fit ACPI well as we don't have all our affinity data up-front.

    In preparation for ACPI support, fold the IRQ request/free into
    arm_pmu_device_probe(), which will remain specific to platform/DT
    probing.

    Signed-off-by: Mark Rutland
    Tested-by: Jeremy Linton
    Cc: Will Deacon
    Signed-off-by: Will Deacon

    Mark Rutland
     
  • Currently we have functions to request/free all IRQs for a given PMU.
    While this works today, this won't work for ACPI, where we don't know
    the full set of IRQs up front, and need to request them separately.

    To enable supporting ACPI, this patch splits out the cpu-local
    request/free into new functions, allowing us to request/free individual
    IRQs.

    As this makes it possible/necessary to request a PPI once per cpu, an
    additional check is added to detect mismatched PPIs. This shouldn't
    matter for the DT / platform case, as we check this when parsing.

    Signed-off-by: Mark Rutland
    Tested-by: Jeremy Linton
    Cc: Will Deacon
    Signed-off-by: Will Deacon

    Mark Rutland
     
  • For historical reasons, portions of the arm_pmu code use a cpu_pmu_
    prefix rather than an armpmu_ prefix. While a minor annoyance, this
    hasn't been a problem thus far.

    However, to enable ACPI support, we'll need to expose a few things in
    header files, and we should aim to keep those consistently namespaced.
    In preparation for exporting our IRQ request/free functions, rename
    these to have an armpmu_ prefix. For consistency, the 'cpu_pmu'
    parameter is also renamed to 'armpmu'.

    Signed-off-by: Mark Rutland
    Tested-by: Jeremy Linton
    Cc: Will Deacon
    Signed-off-by: Will Deacon

    Mark Rutland
     
  • In armpmu_dispatch_irq() we look at arm_pmu::plat_device to acquire
    platdata, so that we can defer to platform-specific IRQ handling,
    required on some 32-bit parts. With the advent of ACPI we won't always
    have a platform_device, and so we must avoid trying to dereference
    fields from it.

    This patch fixes up armpmu_dispatch_irq() to avoid doing so, introducing
    a new armpmu_get_platdata() helper.

    Signed-off-by: Mark Rutland
    Tested-by: Jeremy Linton
    Cc: Will Deacon
    Signed-off-by: Will Deacon

    Mark Rutland
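    The helper is a small NULL-safe accessor, roughly:

        static inline struct arm_pmu_platdata *armpmu_get_platdata(struct arm_pmu *armpmu)
        {
            struct platform_device *pdev = armpmu->plat_device;

            return pdev ? dev_get_platdata(&pdev->dev) : NULL;
        }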
     
  • The ARM PMU framework code always uses armpmu_dispatch_irq as its common
    IRQ handler. Passing this down from cpu_pmu_init() is somewhat
    pointless, and gets in the way of refactoring.

    This patch makes cpu_pmu_request_irqs() always use armpmu_dispatch_irq
    as the handler when requesting IRQs, and removes the handler parameter
    from its prototype.

    Signed-off-by: Mark Rutland
    Tested-by: Jeremy Linton
    Cc: Will Deacon
    Signed-off-by: Will Deacon

    Mark Rutland
     
  • Currently arm_pmu_device_probe contains probing logic specific to the
    platform_device infrastructure, and some logic required to safely
    register the PMU with various systems.

    This patch factors out the logic relating to the registration of the
    PMU. This makes arm_pmu_device_probe a little easier to read, and will
    make it easier to reuse the logic for an ACPI-specific probing
    mechanism.

    Signed-off-by: Mark Rutland
    Tested-by: Jeremy Linton
    Cc: Will Deacon
    Signed-off-by: Will Deacon

    Mark Rutland
     
  • Given we always want to initialise common fields on an allocated PMU,
    this patch folds this common initialisation into armpmu_alloc(). This
    will make it simpler to reuse this code for an ACPI-specific probe path.

    Signed-off-by: Mark Rutland
    Tested-by: Jeremy Linton
    Cc: Will Deacon
    Signed-off-by: Will Deacon

    Mark Rutland
     
  • We expect an ARM PMU's init function to have a particular prototype,
    which we open-code in a few places. This is less than ideal, considering
    that we cast a void pointer to this type in one location, and a mismatch
    could easily be missed.

    Add a typedef so that we can ensure this is consistent.

    Signed-off-by: Mark Rutland
    Tested-by: Jeremy Linton
    Cc: Will Deacon
    Signed-off-by: Will Deacon

    Mark Rutland
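    The typedef, and the probe-table shape written in terms of it
    (sketched from the description above):

        typedef int (*armpmu_init_fn)(struct arm_pmu *);

        struct pmu_probe_info {
            unsigned int cpuid;
            unsigned int mask;
            armpmu_init_fn init;
        };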
     
  • We currently disable the PMU temporarily in armpmu_add(). We may have
    required this historically, but the perf core always disables an event's
    PMU when calling event::pmu::add(), so this is not necessary.

    We don't do similarly in armpmu_del(), or elsewhere, so this is
    unnecessary and inconsistent, and only serves to confuse the reader.

    Remove the pointless disable, simplifying armpmu_add() in the process.

    Signed-off-by: Mark Rutland
    Tested-by: Jeremy Linton
    Cc: Will Deacon
    Signed-off-by: Will Deacon

    Mark Rutland
     

01 Apr, 2017

3 commits

  • For historical reasons, we lazily request and free interrupts in the
    arm pmu driver. This requires us to refcount use of the pmu (by way of
    counting the active events) in order to request/free interrupts at the
    correct times, which complicates the driver somewhat.

    The existing logic is flawed, as it only considers currently online CPUs
    when requesting, freeing, or managing the affinity of interrupts.
    Intervening hotplug events can result in erroneous IRQ affinity, online
    CPUs for which interrupts have not been requested, or offline CPUs whose
    interrupts are still requested.

    To fix this, this patch splits the requesting of interrupts from any
    per-cpu management (i.e. per-cpu enable/disable, and configuration of
    cpu affinity). We now request all interrupts up-front at probe time (and
    never free them, since we never unregister PMUs).

    The management of affinity, and per-cpu enable/disable now happens in
    our cpu hotplug callback, ensuring it occurs consistently. This means
    that we must now invoke the CPU hotplug callback at boot time in order
    to configure IRQs, and since the callback also resets the PMU hardware,
    we can remove the duplicate reset in the probe path.

    This rework renders our event refcounting unnecessary, so this is
    removed.

    Signed-off-by: Mark Rutland
    [will: make armpmu_get_cpu_irq static]
    Signed-off-by: Will Deacon

    Mark Rutland
     
  • When requesting or freeing interrupts, we use platform_get_irq() to find
    relevant irqs, backing this up with additional information in an
    optional irq_affinity table.

    This means that our irq request and free paths are tied to a
    platform_device, and our request path must jump through a number of
    hoops in order to determine the required affinity of each interrupt.

    Given that the affinity must be static, we can compute the affinity once
    up-front at probe time, simplifying the irq request and free paths. By
    recording interrupts in a per-cpu data structure, we simplify a few
    paths, and permit a subsequent rework of the request and free paths.

    Signed-off-by: Mark Rutland
    [will: rename local nr_irqs variable to avoid conflict with global]
    Signed-off-by: Will Deacon

    Mark Rutland
     
  • For historical reasons, we allocate per-cpu data associated with a PMU
    rather late, in cpu_pmu_init, after we've parsed whatever hardware
    information we were provided with.

    In order to allow us to store some per-cpu data early in the probe
    path, we need to allocate (and initialise) the per-cpu data earlier.
    This patch reworks the way we allocate the pmu and associated per-cpu
    data in order to make that possible.

    Signed-off-by: Mark Rutland
    [will: make armpmu_{alloc,free} static]
    Signed-off-by: Will Deacon

    Mark Rutland
     

25 Dec, 2016

1 commit

  • When the state names got added, a script was used to add the extra
    argument to the calls. The script basically converted the state constant
    to a string, but the cleanup to convert these strings into meaningful
    ones did not happen.

    Replace all the useless strings with 'subsys/xxx/yyy:state' strings which
    are used in all the other places already.

    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Sebastian Siewior
    Link: http://lkml.kernel.org/r/20161221192112.085444152@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
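    For the arm_pmu hotplug state, that convention gives a registration
    along these lines (the exact string is sketched from the
    'subsys/xxx/yyy:state' pattern above):

        ret = cpuhp_setup_state_multi(CPUHP_AP_PERF_ARM_STARTING,
                                      "perf/arm/pmu:starting",
                                      arm_perf_starting_cpu, NULL);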
     

04 Oct, 2016

1 commit

  • Pull CPU hotplug updates from Thomas Gleixner:
    "Yet another batch of cpu hotplug core updates and conversions:

    - Provide core infrastructure for multi instance drivers so the
    drivers do not have to keep custom lists.

    - Convert custom lists to the new infrastructure. The block-mq custom
    list conversion comes through the block tree and makes the diffstat
    tip over to more lines removed than added.

    - Handle unbalanced hotplug enable/disable calls more gracefully.

    - Remove the obsolete CPU_STARTING/DYING notifier support.

    - Convert another batch of notifier users.

    The relayfs changes which conflicted with the conversion have been
    shipped to me by Andrew.

    The remaining lot is targeted for 4.10 so that we finally can remove
    the rest of the notifiers"

    * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits)
    cpufreq: Fix up conversion to hotplug state machine
    blk/mq: Reserve hotplug states for block multiqueue
    x86/apic/uv: Convert to hotplug state machine
    s390/mm/pfault: Convert to hotplug state machine
    mips/loongson/smp: Convert to hotplug state machine
    mips/octeon/smp: Convert to hotplug state machine
    fault-injection/cpu: Convert to hotplug state machine
    padata: Convert to hotplug state machine
    cpufreq: Convert to hotplug state machine
    ACPI/processor: Convert to hotplug state machine
    virtio scsi: Convert to hotplug state machine
    oprofile/timer: Convert to hotplug state machine
    block/softirq: Convert to hotplug state machine
    lib/irq_poll: Convert to hotplug state machine
    x86/microcode: Convert to hotplug state machine
    sh/SH-X3 SMP: Convert to hotplug state machine
    ia64/mca: Convert to hotplug state machine
    ARM/OMAP/wakeupgen: Convert to hotplug state machine
    ARM/shmobile: Convert to hotplug state machine
    arm64/FP/SIMD: Convert to hotplug state machine
    ...

    Linus Torvalds
     

03 Oct, 2016

1 commit

  • Pull arm64 updates from Will Deacon:
    "It's a bit all over the place this time with no "killer feature" to
    speak of. Support for mismatched cache line sizes should help people
    seeing whacky JIT failures on some SoCs, and the big.LITTLE perf
    updates have been a long time coming, but a lot of the changes here
    are cleanups.

    We stray outside arch/arm64 in a few areas: the arch/arm/ arch_timer
    workaround is acked by Russell, the DT/OF bits are acked by Rob, the
    arch_timer clocksource changes acked by Marc, CPU hotplug by tglx and
    jump_label by Peter (all CC'd).

    Summary:

    - Support for execute-only page permissions
    - Support for hibernate and DEBUG_PAGEALLOC
    - Support for heterogeneous systems with mismatched cache line sizes
    - Errata workarounds (A53 843419 update and QorIQ A-008585 timer bug)
    - arm64 PMU perf updates, including cpumasks for heterogeneous systems
    - Set UTS_MACHINE for building rpm packages
    - Yet another head.S tidy-up
    - Some cleanups and refactoring, particularly in the NUMA code
    - Lots of random, non-critical fixes across the board"

    * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (100 commits)
    arm64: tlbflush.h: add __tlbi() macro
    arm64: Kconfig: remove SMP dependence for NUMA
    arm64: Kconfig: select OF/ACPI_NUMA under NUMA config
    arm64: fix dump_backtrace/unwind_frame with NULL tsk
    arm/arm64: arch_timer: Use archdata to indicate vdso suitability
    arm64: arch_timer: Work around QorIQ Erratum A-008585
    arm64: arch_timer: Add device tree binding for A-008585 erratum
    arm64: Correctly bounds check virt_addr_valid
    arm64: migrate exception table users off module.h and onto extable.h
    arm64: pmu: Hoist pmu platform device name
    arm64: pmu: Probe default hw/cache counters
    arm64: pmu: add fallback probe table
    MAINTAINERS: Update ARM PMU PROFILING AND DEBUGGING entry
    arm64: Improve kprobes test for atomic sequence
    arm64/kvm: use alternative auto-nop
    arm64: use alternative auto-nop
    arm64: alternative: add auto-nop infrastructure
    arm64: lse: convert lse alternatives NOP padding to use __nops
    arm64: barriers: introduce nops and __nops macros for NOP sequences
    arm64: sysreg: replace open-coded mrs_s/msr_s with {read,write}_sysreg_s
    ...

    Linus Torvalds
     

17 Sep, 2016

1 commit

  • In preparation for ACPI support, add a pmu_probe_info table to
    the arm_pmu_device_probe() call. This table gets used when
    probing in the absence of a devicetree node for the PMU.

    Signed-off-by: Mark Salter
    Signed-off-by: Jeremy Linton
    Signed-off-by: Will Deacon

    Mark Salter
     

09 Sep, 2016

1 commit

  • In systems with heterogeneous CPUs, there are multiple logical CPU PMUs,
    each of which covers a subset of CPUs in the system. In some cases
    userspace needs to know which CPUs a given logical PMU covers, so we'd
    like to expose a cpumask under sysfs, similar to what is done for uncore
    PMUs.

    Unfortunately, prior to commit 00e727bb389359c8 ("perf stat: Balance
    opening and reading events"), perf stat only correctly handled a cpumask
    holding a single CPU, and only when profiling in system-wide mode. In
    other cases, the presence of a cpumask file could cause perf stat to
    behave erratically.

    Thus, exposing a cpumask file would break older perf binaries in cases
    where they would otherwise work.

    To avoid this issue while still providing userspace with the information
    it needs, this patch exposes a differently-named file (cpus) under
    sysfs. New tools can look for this and operate correctly, while older
    tools will not be adversely affected by its presence.

    Signed-off-by: Mark Rutland
    Cc: Will Deacon
    Signed-off-by: Will Deacon

    Mark Rutland
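    A sketch of exposing such a file through a device attribute
    (attribute-group wiring elided; to_arm_pmu() and supported_cpus are
    from the arm_pmu code):

        static ssize_t cpus_show(struct device *dev,
                                 struct device_attribute *attr, char *buf)
        {
            struct arm_pmu *armpmu = to_arm_pmu(dev_get_drvdata(dev));

            return cpumap_print_to_pagebuf(true, buf, &armpmu->supported_cpus);
        }
        static DEVICE_ATTR_RO(cpus); /* read as .../<pmu>/cpus */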