15 Jul, 2011

1 commit

  • While attempting to create a timechart of boot up I found perf didn't
    tolerate modules being loaded/unloaded. This patch fixes this by
    reading the file once and then writing the size read at the correct
    point in the file. It also simplifies the code somewhat.

    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Ingo Molnar
    Cc: Arnaldo Carvalho de Melo
    Signed-off-by: Sonny Rao
    Signed-off-by: Michael Neuling
    Link: http://lkml.kernel.org/r/10011.1310614483@neuling.org
    Signed-off-by: Steven Rostedt

    Sonny Rao
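
A minimal Python sketch of the technique described in the commit above: copy the source once, then seek back and write the number of bytes actually read into a size field reserved ahead of the data. The 8-byte little-endian size field, the name copy_with_size and the sample data are assumptions made for illustration; this is not perf's actual code or file layout.

```python
import io
import struct


def copy_with_size(src, dst, chunk=4096):
    """Copy src to dst in a single pass, prefixed by an 8-byte size field."""
    size_pos = dst.tell()
    dst.write(struct.pack("<Q", 0))  # placeholder, patched after the copy
    total = 0
    while True:
        buf = src.read(chunk)
        if not buf:
            break
        dst.write(buf)
        total += len(buf)
    end_pos = dst.tell()
    dst.seek(size_pos)
    dst.write(struct.pack("<Q", total))  # the size that was really read
    dst.seek(end_pos)
    return total


# Hypothetical input standing in for a file whose size can change between reads.
src = io.BytesIO(b"module_a 4096\nmodule_b 8192\n")
dst = io.BytesIO()
written = copy_with_size(src, dst)
```

Because the size is recorded after the copy, it always matches the data that follows it, even if the source would have grown or shrunk between two separate reads.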
     

05 Jul, 2011

1 commit

  • Add an option to perf report/annotate/script to specify which
    CPUs to operate on. This enables us to take a single system wide
    profile and analyse each CPU (or group of CPUs) in isolation.

    This was useful when profiling a multiprocess workload where the
    bottleneck was on one CPU but this was hidden in the overall
    profile. Per process and per thread breakdowns didn't help
    because multiple processes were running on each CPU and no
    single process consumed an entire CPU.

    The patch converts the list of CPUs returned by cpu_map__new
    into a bitmap for fast lookup. I wanted to use -C to be
    consistent with perf top/record/stat, but unfortunately perf
    report already uses -C.

    v2: Incorporate suggestions from David Ahern:
    - Added -c to perf script
    - Check that SAMPLE_CPU is set when -c is used
    - Update documentation

    v3: Create perf_session__cpu_bitmap()

    Signed-off-by: Anton Blanchard
    Acked-by: David Ahern
    Cc: Arnaldo Carvalho de Melo
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Link: http://lkml.kernel.org/r/20110704215750.11647eb9@kryten
    Signed-off-by: Ingo Molnar

    Anton Blanchard
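
The list-to-bitmap conversion mentioned above can be sketched in Python as follows. The "0-3,7" list syntax, the helper names and the (cpu, symbol) sample tuples are assumptions for illustration; the real code works on the result of cpu_map__new in C.

```python
def parse_cpu_list(spec):
    """Turn a list such as "0-3,7" into an integer bitmap (bit n = CPU n)."""
    bitmap = 0
    for part in spec.split(","):
        lo, _, hi = part.partition("-")
        first = int(lo)
        last = int(hi) if hi else first
        if last < first:
            raise ValueError(f"bad CPU range: {part!r}")
        for cpu in range(first, last + 1):
            bitmap |= 1 << cpu
    return bitmap


def cpu_selected(bitmap, cpu):
    """Constant-time membership test against the bitmap."""
    return bool((bitmap >> cpu) & 1)


def filter_samples(samples, bitmap):
    """Keep only the (cpu, symbol) samples taken on a selected CPU."""
    return [s for s in samples if cpu_selected(bitmap, s[0])]


bitmap = parse_cpu_list("0-3,7")
samples = [(0, "schedule"), (4, "memcpy"), (7, "ioread8")]
kept = filter_samples(samples, bitmap)
```

The bitmap makes the per-sample check a shift and a mask instead of a walk over the CPU list.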
     

01 Jul, 2011

3 commits

  • Previously, when perf stat was asked to output its statistics in
    CSV mode, no information about the noise was printed.

    For example right now we output this --repeat information:

    ./perf stat -r3 -x, sleep 1
    1.164789,task-clock
    8,context-switches
    0,CPU-migrations
    219,page-faults
    3337800,cycles

    With this patch, each output line gains an additional field
    holding the noise value:

    ./perf stat -r3 -x, sleep 1
    1.164789,task-clock,3.75%
    8,context-switches,75.00%
    0,CPU-migrations,100.00%
    219,page-faults,0.00%
    3337800,cycles,3.36%

    Signed-off-by: Zhengyu He
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    Cc: Stephane Eranian
    Cc: Venkatesh Pallipadi
    Cc: Peter Zijlstra
    Cc: Frederic Weisbecker
    Link: http://lkml.kernel.org/r/1308861942-4945-1-git-send-email-zhengyuh@google.com
    Signed-off-by: Ingo Molnar

    Zhengyu He
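
A short Python sketch of how a script might consume the CSV output shown above, treating the third field as optional so that both the old and the new format parse. The function name and the returned dictionary shape are assumptions; the sample text is the output quoted in the commit message.

```python
import csv


def parse_stat_csv(text):
    """Map event name -> (value, noise percent or None)."""
    results = {}
    for row in csv.reader(text.splitlines()):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        value, event = float(row[0]), row[1]
        noise = float(row[2].rstrip("%")) if len(row) > 2 else None
        results[event] = (value, noise)
    return results


new_format = """1.164789,task-clock,3.75%
8,context-switches,75.00%
0,CPU-migrations,100.00%
219,page-faults,0.00%
3337800,cycles,3.36%"""

old_format = "1.164789,task-clock\n8,context-switches"

with_noise = parse_stat_csv(new_format)
without_noise = parse_stat_csv(old_format)
```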
     
  • …ic/random-tracing into perf/core

    Ingo Molnar
     
  • Merge reason: Pick up the latest fixes.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

30 Jun, 2011

6 commits

  • We don't need to display the parent field if the parent
    sorting machinery is only used for parent filtering
    (as in "-p foo").

    However if parent filtering is used in combination with
    explicit parent sorting ( -s parent), we want to
    display it.

    Result with:

    perf report -p kernel_thread -s parent

    Before:

    # Overhead Parent symbol
    # ........ .............
    #
    0.07%
    |
    --- ioread8
    ata_sff_check_status
    ata_sff_tf_load
    ata_sff_qc_issue
    ata_bmdma_qc_issue
    ata_qc_issue
    ata_scsi_translate
    ata_scsi_queuecmd
    scsi_dispatch_cmd
    scsi_request_fn
    __blk_run_queue
    __make_request
    generic_make_request
    submit_bio
    submit_bh
    journal_submit_commit_record
    jbd2_journal_commit_transaction
    kjournald2
    kthread
    kernel_thread_helper

    After:

    # Overhead Parent symbol
    # ........ .............
    #
    0.07% kernel_thread_helper
    |
    --- ioread8
    ata_sff_check_status
    ata_sff_tf_load
    ata_sff_qc_issue
    ata_bmdma_qc_issue
    ata_qc_issue
    ata_scsi_translate
    ata_scsi_queuecmd
    scsi_dispatch_cmd
    scsi_request_fn
    __blk_run_queue
    __make_request
    generic_make_request
    submit_bio
    submit_bh
    journal_submit_commit_record
    jbd2_journal_commit_transaction
    kjournald2
    kthread
    kernel_thread_helper

    Signed-off-by: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Stephane Eranian
    Cc: David Ahern
    Cc: Sam Liao

    Frederic Weisbecker
     
  • So that the parent sort dimension can be registered twice: once
    if we add it as an explicit sort dimension (-s parent) and once more
    if we request a parent filter (-p foo).

    We'll have only one parent sort dimension in the end, but this
    allows overriding the default parent filter with the one we gave in
    the "-p" option. The goal is to prepare for using "-s parent" and
    "-p foo" at the same time, i.e. sorting by filtered parent.

    Signed-off-by: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Stephane Eranian
    Cc: David Ahern
    Cc: Sam Liao

    Frederic Weisbecker
     
  • As with the newt UI, don't display entries that have been marked
    as ignored.

    The practical effect of this is to make parent filtering
    actually work. Before, entries that were ignored were given
    a null parent but were still displayed. This resulted in
    some weird effects:

    # Overhead Command Shared Object Symbol
    # ........ ........... ................. ............
    #
    ^A
    |
    --- __lock_acquire
    |
    |--95.97%-- lock_acquire
    | |
    | |--30.75%-- _raw_spin_lock

    Discard these from the stdio display.

    Signed-off-by: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Stephane Eranian
    Cc: David Ahern
    Cc: Sam Liao

    Frederic Weisbecker
     
  • These are probably some old leftovers.

    Signed-off-by: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Stephane Eranian
    Cc: David Ahern
    Cc: Sam Liao

    Frederic Weisbecker
     
  • These don't need to be globally visible.

    Signed-off-by: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Stephane Eranian
    Cc: David Ahern
    Cc: Sam Liao

    Frederic Weisbecker
     
  • Add a "caller/callee" option to support an inverted butterfly report.
    In the inverted report (with the caller option), the call graph starts
    from the callee's ancestor. Users can use this view to find the
    system's performance bottlenecks from a sysprof-like view. Using this
    option with a specified sort order such as pid gives a high-level view
    of call graph statistics.

    Also add "-G" alias for inverted call graph.

    Signed-off-by: Sam Liao
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Stephane Eranian
    Cc: David Ahern
    Signed-off-by: Frederic Weisbecker

    Sam Liao
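
The two orderings can be illustrated with a small Python sketch that merges call chains into a tree either from the sampled function upwards ("callee") or from the outermost ancestor downwards ("caller"). The chains and weights are made up, reusing symbol names from the outputs above, and the tree layout is an assumption rather than perf's internal representation.

```python
def build_call_tree(chains, order="callee"):
    """Merge (weight, [leaf, ..., root]) chains into a nested tree."""
    if order not in ("callee", "caller"):
        raise ValueError(f"unknown order: {order!r}")
    tree = {}
    for weight, chain in chains:
        # "caller" order walks each chain from its outermost ancestor.
        path = chain if order == "callee" else list(reversed(chain))
        level = tree
        for sym in path:
            node = level.setdefault(sym, {"weight": 0, "children": {}})
            node["weight"] += weight
            level = node["children"]
    return tree


# Hypothetical chains, listed leaf first.
chains = [
    (3, ["ioread8", "ata_sff_check_status", "kthread"]),
    (1, ["submit_bio", "submit_bh", "kthread"]),
]

callee_view = build_call_tree(chains, "callee")
caller_view = build_call_tree(chains, "caller")
```

In the caller view both chains share the single root kthread, which is what makes the common ancestor's total cost visible at a glance.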
     

20 Jun, 2011

1 commit

  • …-for-linus' and 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    tools/perf: Fix static build of perf tool
    tracing: Fix regression in printk_formats file

    * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    generic-ipi: Fix kexec boot crash by initializing call_single_queue before enabling interrupts

    * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    clocksource: Make watchdog robust vs. interruption
    timerfd: Fix wakeup of processes when timer is cancelled on clock change

    * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86, MAINTAINERS: Add x86 MCE people
    x86, efi: Do not reserve boot services regions within reserved areas

    Linus Torvalds
     

19 Jun, 2011

1 commit


17 Jun, 2011

1 commit


16 Jun, 2011

3 commits

  • Merge reason: add the latest fixes.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • To build a statically linked version of the perf tool all needed
    libraries must be added in the correct order to get the symbols
    resolved. Currently this is broken when, e.g. python or newt
    support is enabled -- libpython needs libpthread which is an
    unconditional link dependency of the perf tool; libslang needs
    libm, another unconditional dependency. To solve the problem in
    the long run without the need to keep track of transitive
    library dependencies, simply make the linker look at the EXTLIBS
    multiple times until it has all symbols resolved.

    Signed-off-by: Mathias Krause
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    Link: http://lkml.kernel.org/r/1308171818-20370-1-git-send-email-minipli@googlemail.com
    Signed-off-by: Ingo Molnar

    Mathias Krause
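
Why the order of static libraries matters, and why rescanning helps, can be shown with a deliberately simplified Python model of a linker. Each library maps a symbol it defines to the symbols that definition needs; the symbol dependencies below are invented for illustration, and real linkers work on archive members rather than single symbols. GNU ld offers --start-group/--end-group for this kind of repeated search; whether the patch uses exactly that is not stated in the message above.

```python
def unresolved_after_link(needed, libs, order, rescan=False):
    """Return the symbols still undefined after scanning libs in order."""
    undefined = set(needed)
    resolved = set()
    while True:
        progress = False
        for name in order:
            members = libs[name]
            # Pull in every definition this library has for a currently
            # undefined symbol, until the library offers nothing new.
            while True:
                hits = undefined & set(members)
                if not hits:
                    break
                for sym in hits:
                    resolved.add(sym)
                    undefined |= members[sym] - resolved
                undefined -= resolved
                progress = True
        # A plain link scans the list once; a group rescans until stable.
        if not (rescan and progress):
            return undefined


# Hypothetical dependencies: libpython needs a libpthread symbol that the
# main program itself never references.
libs = {
    "libpthread": {"pthread_create": set(), "pthread_mutex_lock": set()},
    "libpython": {"Py_Initialize": {"pthread_mutex_lock"}},
}
needed = {"pthread_create", "Py_Initialize"}
order = ["libpthread", "libpython"]

single_pass = unresolved_after_link(needed, libs, order)
grouped = unresolved_after_link(needed, libs, order, rescan=True)
```

In the single pass libpthread has already been scanned by the time libpython asks for pthread_mutex_lock, so the symbol stays undefined; rescanning the list resolves it without anyone having to track the transitive dependency.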
     
  • When generating the perf version from the kernel version using 'make
    kernelver' it is necessary to clear out any MAKEFLAGS, otherwise they may
    trigger additional output which pollutes the contents.

    Signed-off-by: Andy Whitcroft
    Signed-off-by: Michal Marek

    Andy Whitcroft
     

15 Jun, 2011

1 commit

  • Commit a26ac2455ffcf3 (rcu: move TREE_RCU from softirq to kthread)
    introduced a performance regression. In an AIM7 test, this commit
    degraded performance by about 40%.

    The commit runs RCU callbacks in a kthread instead of softirq. We observed
    a high rate of context switches caused by this. Our test system has
    64 CPUs and HZ is 1000, so we saw more than 64k context switches per
    second, caused by RCU's per-CPU kthread. A trace showed that most of
    the time the RCU per-CPU kthread doesn't actually handle any callbacks,
    but instead just does a very small amount of work handling grace periods.
    This means that RCU's per-CPU kthreads are making the scheduler do quite
    a bit of work in order to allow a very small amount of RCU-related
    processing to be done.

    Alex Shi's analysis determined that this slowdown is due to lock
    contention within the scheduler. Unfortunately, as Peter Zijlstra points
    out, the scheduler's real-time semantics require global action, which
    means that this contention is inherent in real-time scheduling. (Yes,
    perhaps someone will come up with a workaround -- otherwise, -rt is not
    going to do well on large SMP systems -- but this patch will work around
    this issue in the meantime. And "the meantime" might well be forever.)

    This patch therefore re-introduces softirq processing to RCU, but only
    for core RCU work. RCU callbacks are still executed in kthread context,
    so that only a small amount of RCU work runs in softirq context in the
    common case. This should minimize ksoftirqd execution, allowing us to
    skip boosting of ksoftirqd for CONFIG_RCU_BOOST=y kernels.

    Signed-off-by: Shaohua Li
    Tested-by: "Alex,Shi"
    Signed-off-by: Paul E. McKenney

    Shaohua Li
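
The 64k figure follows from simple arithmetic, sketched here in Python under the assumption that each per-CPU kthread is woken roughly once per timer tick and that each wakeup costs at least one context switch. The variable names are illustrative.

```python
cpus = 64    # CPUs in the test system described above
hz = 1000    # timer ticks per second per CPU

# Assumed lower bound: one kthread wakeup (one switch in) per CPU per tick.
min_switches_per_second = cpus * hz
```

This gives 64,000 switches per second as a floor, consistent with the "more than 64k" observed.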
     

10 Jun, 2011

2 commits


08 Jun, 2011

1 commit

  • …l/git/tip/linux-2.6-tip

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    perf: Fix comments in include/linux/perf_event.h
    perf: Comment /proc/sys/kernel/perf_event_paranoid to be part of user ABI
    perf python: Fix argument name list of read_on_cpu()
    perf evlist: Don't die if sample_{id_all|type} is invalid
    perf python: Use exception to propagate errors
    perf evlist: Remove dependency on debug routines
    perf, cgroups: Fix up for new API

    Linus Torvalds
     

04 Jun, 2011

2 commits


03 Jun, 2011

10 commits

  • Mandatory arguments need to be present in the argument name list, as
    well as optional arguments, otherwise python barfs:

    # ./python/twatch.py
    Traceback (most recent call last):
    File "./python/twatch.py", line 41, in <module>
    main()
    File "./python/twatch.py", line 32, in main
    event = evlist.read_on_cpu(cpu)
    RuntimeError: more argument specifiers than keyword list entries

    Hence, add cpu to the name list.

    Cc: David Ahern
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    Link: http://lkml.kernel.org/r/1301588863-20210-1-git-send-email-fweisbec@gmail.com
    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Arnaldo Carvalho de Melo

    Frederic Weisbecker
     
  • Fixes two more cases where the python binding would not load:

    . Not finding die(), which it shouldn't anyway, not good to just stop the
    world because some particular perf.data file is invalid, just propagate
    the error to the caller.

    . Not finding perf_sample_size: fix it by moving it from event.c to evsel,
    where it belongs, as most cases are moving to operate on an evsel object.

    One of the fixed problems:

    [root@emilia ~]# python
    >>> import perf
    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: perf_sample_size
    >>>
    [root@emilia ~]#

    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-1hkj7b2cvgbfnoizsekjb6c9@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • We were using pr_debug to tell the user about not being able to parse a sample
    where we should really use the python way of reporting errors: exceptions.

    Fixes this problem:

    [root@emilia ~]# python
    >>> import perf
    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: eprintf
    >>>
    [root@emilia ~]#

    As we want to keep the objects linked in the python binding (and in the future
    in a shared library) minimal.

    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-m9dba9kaluas0kq8r58z191c@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • So far we avoided having to link debug.o in the python binding, keep it
    that way by not using ui__warning() in evlist.c.

    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-4wtew8hd3g7ejnlehtspys2t@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Resolve to a function or variable if possible and if the sym option is
    enabled.

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1306782503-22002-1-git-send-email-dsahern@gmail.com
    Signed-off-by: David Ahern
    Signed-off-by: Arnaldo Carvalho de Melo

    David Ahern
     
  • The 'sym' option displays both the function name and the DSO it comes
    from. Split the display of the dso into a separate option. This allows
    display of the ip address and symbol without the dso, thus shortening
    line lengths - and decluttering the output a bit.

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1306528124-25861-3-git-send-email-dsahern@gmail.com
    Signed-off-by: David Ahern
    Signed-off-by: Arnaldo Carvalho de Melo

    David Ahern
     
  • Currently the "sym" output field is used to dump instruction pointers
    and callchain stack. Sample addresses can also be converted to symbols,
    so the meaning of "sym" needs to be fixed. This patch adds an "ip"
    option and if it is selected the user can also opt to dump symbols for
    them. If the user opts to dump IP without syms only the address is
    shown.

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1306528124-25861-2-git-send-email-dsahern@gmail.com
    Signed-off-by: David Ahern
    Signed-off-by: Arnaldo Carvalho de Melo

    David Ahern
     
  • perf stat continues running even if the event list contains counters
    that are not supported. The resulting output then contains <not counted>
    for those events, which makes it confusing to tell which events are
    supported but not counted and which are not supported.

    Before:

    perf stat -ddd -- sleep 1

    Performance counter stats for 'sleep 1':

    0.571283 task-clock # 0.001 CPUs utilized
    1 context-switches # 0.002 M/sec
    0 CPU-migrations # 0.000 M/sec
    157 page-faults # 0.275 M/sec
    1,037,707 cycles # 1.816 GHz
    stalled-cycles-frontend
    stalled-cycles-backend
    654,499 instructions # 0.63 insns per cycle
    136,129 branches # 238.286 M/sec
    branch-misses
    L1-dcache-loads
    L1-dcache-load-misses
    LLC-loads
    LLC-load-misses
    L1-icache-loads
    L1-icache-load-misses
    dTLB-loads
    dTLB-load-misses
    iTLB-loads
    iTLB-load-misses
    L1-dcache-prefetches
    L1-dcache-prefetch-misses

    1.001004836 seconds time elapsed

    After:

    perf stat -ddd -- sleep 1

    Performance counter stats for 'sleep 1':

    1.350326 task-clock # 0.001 CPUs utilized
    2 context-switches # 0.001 M/sec
    0 CPU-migrations # 0.000 M/sec
    157 page-faults # 0.116 M/sec
    11,986 cycles # 0.009 GHz
    stalled-cycles-frontend
    stalled-cycles-backend
    496,986 instructions # 41.46 insns per cycle
    138,065 branches # 102.246 M/sec
    7,245 branch-misses # 5.25% of all branches
    L1-dcache-loads
    L1-dcache-load-misses
    LLC-loads
    LLC-load-misses
    L1-icache-loads
    L1-icache-load-misses
    dTLB-loads
    dTLB-load-misses
    iTLB-loads
    iTLB-load-misses
    L1-dcache-prefetches
    L1-dcache-prefetch-misses

    1.002397333 seconds time elapsed

    v1->v2:
    changed supported type from int to bool

    v2->v3
    fixed vertical alignment of new struct element

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1306767359-13221-1-git-send-email-dsahern@gmail.com
    Signed-off-by: David Ahern
    Signed-off-by: Arnaldo Carvalho de Melo

    David Ahern
     
  • The list of a method's argument names only needs to be NULL-terminated
    once. Remove the second terminators.

    Cc: David Ahern
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    Link: http://lkml.kernel.org/r/1301588863-20210-2-git-send-email-fweisbec@gmail.com
    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Arnaldo Carvalho de Melo

    Frederic Weisbecker
     
  • Mandatory arguments need to be present in the argument name list, as
    well as optional arguments, otherwise python barfs:

    # ./python/twatch.py
    Traceback (most recent call last):
    File "./python/twatch.py", line 41, in <module>
    main()
    File "./python/twatch.py", line 32, in main
    event = evlist.read_on_cpu(cpu)
    RuntimeError: more argument specifiers than keyword list entries

    Hence, add cpu to the name list.

    Cc: David Ahern
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    Link: http://lkml.kernel.org/r/1301588863-20210-1-git-send-email-fweisbec@gmail.com
    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Arnaldo Carvalho de Melo

    Frederic Weisbecker
     

02 Jun, 2011

6 commits

  • By ignoring the unset values of the minconfig in deciding
    what to test in the config_bisect can cause the problem
    config from being tested too.

    Just do not test the configs that are set in the minconfig.

    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • The command that is called to reboot the kernel may fail,
    but the return code is not passed back to the ktest.pl script.
    This is because a ';' is used between the two commands, and
    if the first command fails, only the second command's return
    code is returned. Using a '&&' between the two commands fixes
    this.

    Signed-off-by: Steven Rostedt

    Steven Rostedt
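
The difference between the two separators can be checked with a few lines of Python driving the shell. With ';' the exit status is that of the last command run, so an earlier failure is masked; with '&&' the second command is skipped and the failure is reported. The commands false and true stand in for the real commands, which are not shown in the message above.

```python
import subprocess


def exit_status(script):
    """Run a script through sh and return its exit status."""
    return subprocess.run(["sh", "-c", script]).returncode


# First command fails, second succeeds.
status_semicolon = exit_status("false; true")   # failure is hidden
status_and = exit_status("false && true")       # failure is reported
```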
     
  • Because in Perl $#arr returns the last index and not the actual
    size of the array, we end the config bisect early, thinking there
    is only one config left when there are in fact two. Thus the
    result has only a 50% chance of picking the correct config that
    caused the problem.

    Signed-off-by: Steven Rostedt

    Steven Rostedt
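
The off-by-one can be restated in Python, where len(items) - 1 plays the role of Perl's $#arr. The stop condition and the config names below are assumptions for illustration; the actual check in ktest.pl is not shown in the message above.

```python
def last_index(items):
    """Equivalent of Perl's $#arr: index of the last element."""
    return len(items) - 1


def should_stop(count):
    """Hypothetical bisect check: stop once a single config remains."""
    return count <= 1


remaining = ["CONFIG_A", "CONFIG_B"]  # two candidates still left

stop_buggy = should_stop(last_index(remaining))  # uses the last index
stop_fixed = should_stop(len(remaining))         # uses the real size
```

With two configs left, the last index is 1, so the buggy check stops the bisect one step early and has to pick between the two candidates.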
     
  • Fixes two more cases where the python binding would not load:

    . Not finding die(), which it shouldn't anyway, not good to just stop the
    world because some particular perf.data file is invalid, just propagate
    the error to the caller.

    . Not finding perf_sample_size: fix it by moving it from event.c to evsel,
    where it belongs, as most cases are moving to operate on an evsel object.

    One of the fixed problems:

    [root@emilia ~]# python
    >>> import perf
    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: perf_sample_size
    >>>
    [root@emilia ~]#

    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-1hkj7b2cvgbfnoizsekjb6c9@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • We were using pr_debug to tell the user about not being able to parse a sample
    where we should really use the python way of reporting errors: exceptions.

    Fixes this problem:

    [root@emilia ~]# python
    >>> import perf
    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: eprintf
    >>>
    [root@emilia ~]#

    As we want to keep the objects linked in the python binding (and in the future
    in a shared library) minimal.

    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-m9dba9kaluas0kq8r58z191c@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • So far we avoided having to link debug.o in the python binding, keep it
    that way by not using ui__warning() in evlist.c.

    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-4wtew8hd3g7ejnlehtspys2t@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

30 May, 2011

1 commit