04 Mar, 2013

1 commit

  • Pull new ImgTec Meta architecture from James Hogan:
    "This adds core architecture support for Imagination's Meta processor
    cores, followed by some later miscellaneous arch/metag cleanups and
    fixes which I kept separate to ease review:

    - Support for basic Meta 1 (ATP) and Meta 2 (HTP) core architecture
    - A few fixes all over, particularly for symbol prefixes
    - A few privilege protection fixes
    - Several cleanups (setup.c includes, split out a lot of
    metag_ksyms.c)
    - Fix some missing exports
    - Convert hugetlb to use vm_unmapped_area()
    - Copy device tree to non-init memory
    - Provide dma_get_sgtable()"

    * tag 'metag-v3.9-rc1-v4' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag: (61 commits)
    metag: Provide dma_get_sgtable()
    metag: prom.h: remove declaration of metag_dt_memblock_reserve()
    metag: copy devicetree to non-init memory
    metag: cleanup metag_ksyms.c includes
    metag: move mm/init.c exports out of metag_ksyms.c
    metag: move usercopy.c exports out of metag_ksyms.c
    metag: move setup.c exports out of metag_ksyms.c
    metag: move kick.c exports out of metag_ksyms.c
    metag: move traps.c exports out of metag_ksyms.c
    metag: move irq enable out of irqflags.h on SMP
    genksyms: fix metag symbol prefix on crc symbols
    metag: hugetlb: convert to vm_unmapped_area()
    metag: export clear_page and copy_page
    metag: export metag_code_cache_flush_all
    metag: protect more non-MMU memory regions
    metag: make TXPRIVEXT bits explicit
    metag: kernel/setup.c: sort includes
    perf: Enable building perf tools for Meta
    metag: add boot time LNKGET/LNKSET check
    metag: add __init to metag_cache_probe()
    ...

    Linus Torvalds
     

03 Mar, 2013

1 commit


02 Mar, 2013

1 commit

  • Pull new ARC architecture from Vineet Gupta:
    "Initial ARC Linux port with some fixes on top for 3.9-rc1:

    I would like to introduce the Linux port to ARC Processors (from
    Synopsys) for 3.9-rc1. The patch-set has been discussed on the public
    lists since Nov and has received a fair bit of review, specially from
    Arnd, tglx, Al and other subsystem maintainers for DeviceTree, kgdb...

    The arch bits are in arch/arc, some asm-generic changes (acked by
    Arnd), a minor change to PARISC (acked by Helge).

    The series is a touch bigger for a new port for 2 main reasons:

    1. It enables a basic kernel in first sub-series and adds
    ptrace/kgdb/.. later

    2. Some of the fallout of review (DeviceTree support, multi-platform-
    image support) were added on top of orig series, primarily to
    record the revision history.

    This updated pull request additionally contains

    - fixes due to our GNU tools catching up with the new syscall/ptrace
    ABI

    - some (minor) cross-arch Kconfig updates."

    * tag 'arc-v3.9-rc1-late' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: (82 commits)
    ARC: split elf.h into uapi and export it for userspace
    ARC: Fixup the current ABI version
    ARC: gdbserver using regset interface possibly broken
    ARC: Kconfig cleanup tracking cross-arch Kconfig pruning in merge window
    ARC: make a copy of flat DT
    ARC: [plat-arcfpga] DT arc-uart bindings change: "baud" => "current-speed"
    ARC: Ensure CONFIG_VIRT_TO_BUS is not enabled
    ARC: Fix pt_orig_r8 access
    ARC: [3.9] Fallout of hlist iterator update
    ARC: 64bit RTSC timestamp hardware issue
    ARC: Don't fiddle with non-existent caches
    ARC: Add self to MAINTAINERS
    ARC: Provide a default serial.h for uart drivers needing BASE_BAUD
    ARC: [plat-arcfpga] defconfig for fully loaded ARC Linux
    ARC: [Review] Multi-platform image #8: platform registers SMP callbacks
    ARC: [Review] Multi-platform image #7: SMP common code to use callbacks
    ARC: [Review] Multi-platform image #6: cpu-to-dma-addr optional
    ARC: [Review] Multi-platform image #5: NR_IRQS defined by ARC core
    ARC: [Review] Multi-platform image #4: Isolate platform headers
    ARC: [Review] Multi-platform image #3: switch to board callback
    ...

    Linus Torvalds
     

16 Feb, 2013

1 commit


07 Feb, 2013

1 commit

  • I am getting segfaults *after* the time sorting of perf samples where
    the event type is off the charts:

    (gdb) bt
    \#0 0x0807b1b2 in hists__inc_nr_events (hists=0x80a99c4, type=1163281902) at util/hist.c:1225
    \#1 0x08070795 in perf_session_deliver_event (session=0x80a9b90, event=0xf7a6aff8, sample=0xffffc318, tool=0xffffc520,
    file_offset=0) at util/session.c:884
    \#2 0x0806f9b9 in flush_sample_queue (s=0x80a9b90, tool=0xffffc520) at util/session.c:555
    \#3 0x0806fc53 in process_finished_round (tool=0xffffc520, event=0x0, session=0x80a9b90) at util/session.c:645

    This is bizarre because the event has already been processed once --
    before it was added to the samples queue -- and the event was found to
    be sane at that time.

    There seem to be 2 causes:

    1. perf_evlist__mmap_read updates the read location even though there
    are outstanding references to events sitting in the mmap buffers via the
    ordered samples queue.

    2. There is a single evlist->event_copy for all evlist entries.
    event_copy is used to handle an event wrapping at the mmap buffer
    boundary.

    This patch addresses the second problem - making event_copy local to
    each perf_mmap. With this change my highly repeatable use case no longer
    fails.

    The first problem is much more complicated and will be the subject of a
    future patch.

    Signed-off-by: David Ahern
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1360098762-61827-1-git-send-email-dsahern@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    David Ahern
     

25 Jan, 2013

2 commits


08 Dec, 2012

1 commit


20 Nov, 2012

2 commits

  • Make perf build for x86 once the UAPI disintegration patches for that arch
    have been applied by adding the appropriate -I flags - in the right order -
    and then converting some #includes that use ../.. notation to find main kernel
    headerfiles to use and instead.

    Note that -Iarch/foo/include/uapi is present _before_ -Iarch/foo/include.
    This makes sure we get the userspace version of the pt_regs struct. Ideally,
    we wouldn't have the latter -I flag at all, but unfortunately we want
    asm/svm.h and asm/vmx.h in builtin-kvm.c and these aren't part of the UAPI -
    at least not for x86. I wonder if the bits outside of the __KERNEL__ guards
    *should* be transferred there.

    I note also that perf seems to do its dependency handling manually by listing
    all the header files it might want to use in LIB_H in the Makefile. Can this
    be changed to use -MD?

    Note that to do make this work, we need to export and UAPI disintegrate
    linux/hw_breakpoint.h, which I think should've been exported previously so that
    perf can access the bits. We have to do this in the same patch to maintain
    bisectability.

    Signed-off-by: David Howells

    David Howells
     
  • Use the 'unistd.h' from arch/powerpc/include/uapi to build the perf tool.

    Signed-off-by: Sukadev Bhattiprolu
    Acked-by: David Howells
    Cc: Anton Blanchard
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Robert Richter
    Cc: linuxppc-dev@ozlabs.org
    Link: http://lkml.kernel.org/r/20121107191818.GA16211@us.ibm.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Sukadev Bhattiprolu
     

15 Nov, 2012

2 commits

  • Use the 'unistd.h' from arch/powerpc/include/uapi to build the perf tool.

    Signed-off-by: Sukadev Bhattiprolu
    Acked-by: David Howells
    Cc: Anton Blanchard
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Robert Richter
    Cc: linuxppc-dev@ozlabs.org
    Link: http://lkml.kernel.org/r/20121107191818.GA16211@us.ibm.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Sukadev Bhattiprolu
     
  • Final function renames to match test__* style and include cleanup.

    Signed-off-by: Jiri Olsa
    Cc: Corey Ashford
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1352508412-16914-12-git-send-email-jolsa@redhat.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     

02 Nov, 2012

1 commit

  • The test attr suite is run only if it's run under perf source directory,
    or tests are found in installed path.

    Otherwise tests are omitted (notification is displayed) and finished as
    successful.

    Signed-off-by: Jiri Olsa
    Cc: Corey Ashford
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1351634526-1516-25-git-send-email-jolsa@redhat.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     

01 Nov, 2012

1 commit

  • The idea is run perf session with kidnapping sys_perf_event_open
    function. For each sys_perf_event_open call we store the perf_event_attr
    data to the file to be checked later against what we expect.

    You can run this by:
    $ python ./tests/attr.py -d ./tests/attr/ -p ./perf -v

    v2 changes:
    - preserve errno value in the hook

    Signed-off-by: Jiri Olsa
    Cc: Arnaldo Carvalho de Melo
    Cc: Corey Ashford
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20121031145247.GB1027@krava.brq.redhat.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     

29 Oct, 2012

1 commit

  • Currently many perf commands annotate/evlist/report/script/lock etc all
    support "-i" option to chose a specific perf data, and all of them
    create a local "input_name" to save the file name for that perf data.

    Since most of these commands need it, we can add a global variable for
    it, also it can some other benefits:

    1. When calling script browser inside hists/annotation browser, it needs
    to know the perf data file name to run that script.

    2. For further feature like runtime switching to another perf data file,
    this variable can also help.

    Signed-off-by: Feng Tang
    Cc: Andi Kleen
    Cc: Ingo Molnar
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1351569369-26732-2-git-send-email-feng.tang@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Feng Tang
     

20 Oct, 2012

1 commit

  • …it/acme/linux into perf/urgent

    Pull perf/urgent fixes from Arnaldo Carvalho de Melo:

    * The python binding needs to link with libtraceevent and to initialize
    the 'page_size' variable so that mmaping works again.

    * The callchain folding character that appears on the TUI just before
    the overhead had disappeared due to recent changes, add it back.

    * Intel PEBS in VT-x context uses the DS address as a guest linear address,
    even though its programmed by the host as a host linear address. This either
    results in guest memory corruption and or the hardware faulting and 'crashing'
    the virtual machine. Therefore we have to disable PEBS on VT-x enter and
    re-enable on VT-x exit, enforcing a strict exclude_guest.

    Kernel side enforcement fix by Peter Zijlstra, tooling side fix by David Ahern.

    * Fix build on sparc due to UAPI, fix from David Miller.

    * Fixes for the srclike sort key for unresolved symbols and when processing
    samples in JITted code, where we don't have an ELF file, just an special
    symbol table, fixes from Namhyung Kim.

    * Fix some leaks in libtraceevent, from Steven Rostedt.

    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>

    Ingo Molnar
     

17 Oct, 2012

1 commit

  • More UAPI stuff.

    Signed-off-by: David S. Miller
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20121017.010656.383828471689899431.davem@davemloft.net
    Signed-off-by: Arnaldo Carvalho de Melo

    David Miller
     

15 Oct, 2012

1 commit

  • The UAPI commits forgot to test tooling builds such as tools/perf/,
    and this fixes the fallout.

    Manual conversion.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     

02 Oct, 2012

1 commit

  • Pull arm64 support from Catalin Marinas:
    "Linux support for the 64-bit ARM architecture (AArch64)

    Features currently supported:
    - 39-bit address space for user and kernel (each)
    - 4KB and 64KB page configurations
    - Compat (32-bit) user applications (ARMv7, EABI only)
    - Flattened Device Tree (mandated for all AArch64 platforms)
    - ARM generic timers"

    * tag 'arm64-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64: (35 commits)
    arm64: ptrace: remove obsolete ptrace request numbers from user headers
    arm64: Do not set the SMP/nAMP processor bit
    arm64: MAINTAINERS update
    arm64: Build infrastructure
    arm64: Miscellaneous header files
    arm64: Generic timers support
    arm64: Loadable modules
    arm64: Miscellaneous library functions
    arm64: Performance counters support
    arm64: Add support for /proc/sys/debug/exception-trace
    arm64: Debugging support
    arm64: Floating point and SIMD
    arm64: 32-bit (compat) applications support
    arm64: User access library functions
    arm64: Signal handling support
    arm64: VDSO support
    arm64: System calls handling
    arm64: ELF definitions
    arm64: SMP support
    arm64: DMA mapping API
    ...

    Linus Torvalds
     

17 Sep, 2012

1 commit

  • This patch adds support for the AArch64 performance counters.

    Signed-off-by: Will Deacon
    Signed-off-by: Catalin Marinas
    Acked-by: Tony Lindgren
    Acked-by: Nicolas Pitre
    Acked-by: Olof Johansson
    Acked-by: Santosh Shilimkar

    Will Deacon
     

12 Aug, 2012

1 commit

  • This patch enables perf to use the DWARF unwind code.

    It extends the perf record '-g' option with following arguments:
    'fp' - provides framepointer based user
    stack backtrace
    'dwarf[,size]' - provides DWARF (libunwind) based user stack
    backtrace. The size specifies the size of the
    user stack dump. If omitted it is 8192 by default.

    If libunwind is found during the perf build, then the 'dwarf' argument
    becomes available for record command. The 'fp' stays as default option
    in any case.

    Examples: (perf compiled with libunwind)

    perf record -g dwarf ls
    - provides dwarf unwind with 8192 as stack dump size

    perf record -g dwarf,4096 ls
    - provides dwarf unwind with 4096 as stack dump size

    perf record -g -- ls
    perf record -g fp ls
    - provides frame pointer unwind

    Signed-off-by: Jiri Olsa
    Original-patch-by: Frederic Weisbecker
    Cc: "Frank Ch. Eigler"
    Cc: Arun Sharma
    Cc: Benjamin Redelings
    Cc: Corey Ashford
    Cc: Cyrill Gorcunov
    Cc: Frank Ch. Eigler
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Masami Hiramatsu
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Robert Richter
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    Cc: Ulrich Drepper
    Link: http://lkml.kernel.org/r/1344345647-11536-13-git-send-email-jolsa@redhat.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     

26 May, 2012

1 commit

  • The attr.branch_sample_type field is defined as u64 by the API. As
    such, we need to ensure the variable holding the value of the branch
    stack filters is also u64 otherwise we may lose bits in the future.

    Note also that the bogus definition of the field in perf_record_opts
    caused problems on big-endian PPC systems. Thanks to Anshuman Khandual
    for tracking the problem on PPC.

    Reported-by: Anshuman Khandual
    Signed-off-by: Stephane Eranian
    Cc: Anshuman Khandual
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20120525211344.GA7729@quad
    Signed-off-by: Arnaldo Carvalho de Melo

    Stephane Eranian
     

03 May, 2012

2 commits

  • For further work on perf_target, it'd be better off splitting the code
    into a separate file.

    Signed-off-by: Namhyung Kim
    Reviewed-by: David Ahern
    Cc: David Ahern
    Cc: Ingo Molnar
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1335417327-11796-9-git-send-email-namhyung.kim@lge.com
    [ committer note: Fixed perl build by using stdbool and types.h in target.h ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
  • The perf_target struct will be used for taking care of cpu/thread maps
    based on user's input. Since it is used on various subcommands it'd
    better factoring it out.

    Thanks to Arnaldo for suggesting the better name.

    Signed-off-by: Namhyung Kim
    Reviewed-by: David Ahern
    Cc: David Ahern
    Cc: Ingo Molnar
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1335417327-11796-2-git-send-email-namhyung.kim@lge.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     

21 Mar, 2012

1 commit

  • Pull perf events changes for v3.4 from Ingo Molnar:

    - New "hardware based branch profiling" feature both on the kernel and
    the tooling side, on CPUs that support it. (modern x86 Intel CPUs
    with the 'LBR' hardware feature currently.)

    This new feature is basically a sophisticated 'magnifying glass' for
    branch execution - something that is pretty difficult to extract from
    regular, function histogram centric profiles.

    The simplest mode is activated via 'perf record -b', and the result
    looks like this in perf report:

    $ perf record -b any_call,u -e cycles:u branchy

    $ perf report -b --sort=symbol
    52.34% [.] main [.] f1
    24.04% [.] f1 [.] f3
    23.60% [.] f1 [.] f2
    0.01% [k] _IO_new_file_xsputn [k] _IO_file_overflow
    0.01% [k] _IO_vfprintf_internal [k] _IO_new_file_xsputn
    0.01% [k] _IO_vfprintf_internal [k] strchrnul
    0.01% [k] __printf [k] _IO_vfprintf_internal
    0.01% [k] main [k] __printf

    This output shows from/to branch columns and shows the highest
    percentage (from,to) jump combinations - i.e. the most likely taken
    branches in the system. "branches" can also include function calls
    and any other synchronous and asynchronous transitions of the
    instruction pointer that are not 'next instruction' - such as system
    calls, traps, interrupts, etc.

    This feature comes with (hopefully intuitive) flat ascii and TUI
    support in perf report.

    - Various 'perf annotate' visual improvements for us assembly junkies.
    It will now recognize function calls in the TUI and by hitting enter
    you can follow the call (recursively) and back, amongst other
    improvements.

    - Multiple threads/processes recording support in perf record, perf
    stat, perf top - which is activated via a comma-list of PIDs:

    perf top -p 21483,21485
    perf stat -p 21483,21485 -ddd
    perf record -p 21483,21485

    - Support for per UID views, via the --uid paramter to perf top, perf
    report, etc. For example 'perf top --uid mingo' will only show the
    tasks that I am running, excluding other users, root, etc.

    - Jump label restructurings and improvements - this includes the
    factoring out of the (hopefully much clearer) include/linux/static_key.h
    generic facility:

    struct static_key key = STATIC_KEY_INIT_FALSE;

    ...

    if (static_key_false(&key))
    do unlikely code
    else
    do likely code

    ...
    static_key_slow_inc();
    ...
    static_key_slow_inc();
    ...

    The static_key_false() branch will be generated into the code with as
    little impact to the likely code path as possible. the
    static_key_slow_*() APIs flip the branch via live kernel code patching.

    This facility can now be used more widely within the kernel to
    micro-optimize hot branches whose likelihood matches the static-key
    usage and fast/slow cost patterns.

    - SW function tracer improvements: perf support and filtering support.

    - Various hardenings of the perf.data ABI, to make older perf.data's
    smoother on newer tool versions, to make new features integrate more
    smoothly, to support cross-endian recording/analyzing workflows
    better, etc.

    - Restructuring of the kprobes code, the splitting out of 'optprobes',
    and a corner case bugfix.

    - Allow the tracing of kernel console output (printk).

    - Improvements/fixes to user-space RDPMC support, allowing user-space
    self-profiling code to extract PMU counts without performing any
    system calls, while playing nice with the kernel side.

    - 'perf bench' improvements

    - ... and lots of internal restructurings, cleanups and fixes that made
    these features possible. And, as usual this list is incomplete as
    there were also lots of other improvements

    * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (120 commits)
    perf report: Fix annotate double quit issue in branch view mode
    perf report: Remove duplicate annotate choice in branch view mode
    perf/x86: Prettify pmu config literals
    perf report: Enable TUI in branch view mode
    perf report: Auto-detect branch stack sampling mode
    perf record: Add HEADER_BRANCH_STACK tag
    perf record: Provide default branch stack sampling mode option
    perf tools: Make perf able to read files from older ABIs
    perf tools: Fix ABI compatibility bug in print_event_desc()
    perf tools: Enable reading of perf.data files from different ABI rev
    perf: Add ABI reference sizes
    perf report: Add support for taken branch sampling
    perf record: Add support for sampling taken branch
    perf tools: Add code to support PERF_SAMPLE_BRANCH_STACK
    x86/kprobes: Split out optprobe related code to kprobes-opt.c
    x86/kprobes: Fix a bug which can modify kernel code permanently
    x86/kprobes: Fix instruction recovery on optimized path
    perf: Add callback to flush branch_stack on context switch
    perf: Disable PERF_SAMPLE_BRANCH_* when not supported
    perf/x86: Add LBR software filter support for Intel CPUs
    ...

    Linus Torvalds
     

14 Mar, 2012

1 commit

  • On ancient systems I get this build failure:

    util/../../../arch/x86/include/asm/unistd.h:67:29: error: asm/unistd_64.h: No such file or directory
    In file included from util/cache.h:7,
    from builtin-test.c:8:
    util/../perf.h: In function ‘sys_perf_event_open’:In file included from util/../perf.h:16
    perf.h:170: error: ‘__NR_perf_event_open’ undeclared (first use in this function)

    The reason is that this old system does not have the split
    unistd.h headers yet, from which to pick up the syscall
    definitions.

    Add the syscall numbers to the already existing i386 and x86_64
    blocks in perf.h, and also provide empty include file stubs.

    With this patch perf builds and works fine on 5 years old
    user-space as well.

    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Link: http://lkml.kernel.org/n/tip-jctwg64le1w47tuaoeyftsg9@git.kernel.org
    Signed-off-by: Ingo Molnar
    Signed-off-by: Arnaldo Carvalho de Melo

    Ingo Molnar
     

09 Mar, 2012

2 commits

  • This patch adds a new option to enable taken branch stack
    sampling, i.e., leverage the PERF_SAMPLE_BRANCH_STACK feature
    of perf_events.

    There is a new option to active this mode: -b.
    It is possible to pass a set of filters to select the type of
    branches to sample.

    The following filters are available:

    - any : any type of branches
    - any_call : any function call or system call
    - any_ret : any function return or system call return
    - any_ind : any indirect branch
    - u: only when the branch target is at the user level
    - k: only when the branch target is in the kernel
    - hv: only when the branch target is in the hypervisor

    Filters can be combined by passing a comma separated list
    to the option:

    $ perf record -b any_call,u -e cycles:u branchy

    Signed-off-by: Roberto Agostino Vitillo
    Signed-off-by: Stephane Eranian
    Cc: peterz@infradead.org
    Cc: acme@redhat.com
    Cc: robert.richter@amd.com
    Cc: ming.m.lin@intel.com
    Cc: andi@firstfloor.org
    Cc: asharma@fb.com
    Cc: vweaver1@eecs.utk.edu
    Cc: khandual@linux.vnet.ibm.com
    Cc: dsahern@gmail.com
    Link: http://lkml.kernel.org/r/1328826068-11713-13-git-send-email-eranian@google.com
    Signed-off-by: Ingo Molnar

    Roberto Agostino Vitillo
     
  • This patch adds:

    - ability to parse samples with PERF_SAMPLE_BRANCH_STACK
    - sort on branches (dso_from, symbol_from, dso_to, symbol_to, mispredict)
    - build histograms on branches

    Signed-off-by: Roberto Agostino Vitillo
    Signed-off-by: Stephane Eranian
    Cc: peterz@infradead.org
    Cc: acme@redhat.com
    Cc: robert.richter@amd.com
    Cc: ming.m.lin@intel.com
    Cc: andi@firstfloor.org
    Cc: asharma@fb.com
    Cc: vweaver1@eecs.utk.edu
    Cc: khandual@linux.vnet.ibm.com
    Cc: dsahern@gmail.com
    Link: http://lkml.kernel.org/r/1328826068-11713-12-git-send-email-eranian@google.com
    Signed-off-by: Ingo Molnar

    Roberto Agostino Vitillo
     

03 Mar, 2012

1 commit

  • Just fall back to resetting those fields, if set, warning the user that
    that feature is not available.

    If guest samples appear they will just be discarded because no struct
    machine will be found and thus the event will be accounted as not
    handled and dropped, see 0c09571.

    Reported-by: Namhyung Kim
    Tested-by: Joerg Roedel
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Joerg Roedel
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-vuwxig36mzprl5n7nzvnxxsh@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

15 Feb, 2012

2 commits

  • Instead of requiring that users of perf_record_opts set
    .sample_id_all_avail to true, just invert the logic, using
    .sample_id_all_missing, that doesn't need to be explicitely initialized
    since gcc will zero members ommitted in a struct initialization.

    Just like the newly introduced .exclude_{guest,host} feature test.

    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-ab772uzk78cwybihf0vt7kxw@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Just fall back to resetting those fields, if set, warning the user that
    that feature is not available.

    If guest samples appear they will just be discarded because no struct
    machine will be found and thus the event will be accounted as not
    handled and dropped, see 0c09571.

    Reported-by: Namhyung Kim
    Tested-by: Joerg Roedel
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Joerg Roedel
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-vuwxig36mzprl5n7nzvnxxsh@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

14 Feb, 2012

2 commits

  • The perf_event_attr size needs to be initialized in all cases because it
    captures the ABI version.

    This patch moves the initialization of the field from the
    perf_event_open() syscall stub to its proper location in the
    event_attr_init().

    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20120209151238.GA10272@quad
    Signed-off-by: Stephane Eranian
    Signed-off-by: Arnaldo Carvalho de Melo

    Stephane Eranian
     
  • Allow a user to collect events for multiple threads or processes
    using a comma separated list.

    e.g., collect data on a VM and its vhost thread:
    perf top -p 21483,21485
    perf stat -p 21483,21485 -ddd
    perf record -p 21483,21485

    or monitoring vcpu threads
    perf top -t 21488,21489
    perf stat -t 21488,21489 -ddd
    perf record -t 21488,21489

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1328718772-16688-1-git-send-email-dsahern@gmail.com
    Signed-off-by: David Ahern
    Signed-off-by: Arnaldo Carvalho de Melo

    David Ahern
     

25 Jan, 2012

1 commit

  • The new --uid command line option will show only the tasks for a given
    user, using the proc interface to figure out the existing tasks.

    Kernel work is needed to close races at startup, but this should already
    be useful in many use cases.

    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-bdnspm000gw2l984a2t53o8z@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

20 Dec, 2011

1 commit

  • The problem is that when SAMPLE_PERIOD is not set, the kernel generates
    a number of samples in proportion to an event's period. Number of these
    samples may be too big and the kernel throttles all samples above a
    defined limit.

    E.g.: I want to trace when a process sleeps. I created a process which
    sleeps for 1ms and for 4ms. perf got 100 events in both cases.

    swapper 0 [000] 1141.371830: sched_stat_sleep: comm=foo pid=1801 delay=1386750 [ns]
    swapper 0 [000] 1141.369444: sched_stat_sleep: comm=foo pid=1801 delay=4499585 [ns]

    In the first case a kernel want to send 4499585 events and in the second
    case it wants to send 1386750 events. perf-reports shows that process
    sleeps in both places equal time.

    Instead of this we can get only one sample with an attribute period. As
    result we have less data transferring between kernel and user-space and
    we avoid throttling of samples.

    The patch "events: Don't divide events if it has field period" added a
    kernel part of this functionality.

    Acked-by: Arun Sharma
    Cc: Arun Sharma
    Cc: David Ahern
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: devel@openvz.org
    Link: http://lkml.kernel.org/r/1324391565-1369947-1-git-send-email-avagin@openvz.org
    Signed-off-by: Andrew Vagin
    Signed-off-by: Arnaldo Carvalho de Melo

    Andrew Vagin
     

28 Nov, 2011

4 commits


13 Oct, 2011

1 commit

  • To do that we needed to stop using newtForm, as we don't want libnewt to
    catch the xterm resize signal.

    Remove some more newt calls and instead use the underlying libslang
    directly. In time tools/perf will use just libslang.

    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-h1824yjiru5n2ivz4bseizwj@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo