23 Jun, 2012

1 commit


12 Jun, 2012

1 commit

  • We need to use the per-event info snapshotted at record time to
    synthesize the event names, so do it just after reading the perf.data
    headers, when we have already processed the /sys events data. Otherwise
    we'll end up using the local /sys, which only by sheer luck will have the
    same tracepoint ID -> real event association.

    Example:

    # uname -a
    Linux felicio.ghostprotocols.net 3.4.0-rc5+ #1 SMP Sat May 19 15:27:11 BRT 2012 x86_64 x86_64 x86_64 GNU/Linux
    # perf record -e sched:sched_switch usleep 1
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.015 MB perf.data (~648 samples) ]
    # cat /t/events/sched/sched_switch/id
    279
    # perf evlist -v
    sched:sched_switch: sample_freq=1, type: 2, config: 279, size: 80, sample_type: 1159, read_format: 7, disabled: 1, inherit: 1, mmap: 1, comm: 1, enable_on_exec: 1, sample_id_all: 1, exclude_guest: 1
    #

    So on the above machine sched:sched_switch has tracepoint id 279, but on
    the machine where we'll analyse it, it has a different id:

    $ cat /t/events/sched/sched_switch/id
    56
    $ perf evlist -i /tmp/perf.data
    kmem:mm_balancedirty_writeout
    $ cat /t/events/kmem/mm_balancedirty_writeout/id
    279

    With this fix:

    $ perf evlist -i /tmp/perf.data
    sched:sched_switch

    Reported-by: Dmitry Antipov
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-auwks8fpuhmrdpiefs55o5oz@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo
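
The idea behind this fix can be sketched in a few lines of C. This is only an illustration, not the perf code: `struct recorded_event` and `recorded_event_name()` are hypothetical names, and the table is assumed to have been filled from the perf.data headers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one event as snapshotted into perf.data. */
struct recorded_event {
	uint64_t id;        /* tracepoint ID on the recording machine */
	const char *name;   /* "subsystem:event" saved at record time */
};

/*
 * Resolve a tracepoint ID using only the table read from the
 * perf.data headers; the local /sys is never consulted.
 * Returns NULL when the ID was not recorded.
 */
static const char *recorded_event_name(const struct recorded_event *tbl,
				       size_t n, uint64_t id)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].id == id)
			return tbl[i].name;
	return NULL;
}
```

With a table holding the pair (279, "sched:sched_switch"), ID 279 resolves to the recorded name no matter which event owns ID 279 on the analysing machine.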

    Arnaldo Carvalho de Melo
     

11 Jun, 2012

2 commits

  • The following commit:

    commit 56f3bae70638b33477a6015fd362ccfe354fd3ee
    Author: Jim Cromie
    Date: Wed Sep 7 17:14:00 2011 -0600

    perf stat: Add --log-fd option to redirect stderr elsewhere

    introduced a bug in the way perf stat outputs the results by default,
    i.e., without the --log-fd or --output option. It would default to
    writing to file descriptor 0, i.e., stdin. Writing to stdin is allowed
    and is equivalent to writing to stdout. However, there is a major
    difference for any script that was already capturing the output of perf
    stat via redirection:

    perf stat >/tmp/log .... or perf stat 2>/tmp/log ....

    They would not capture anything anymore. They would have to do:
    perf stat 0>/tmp/log ...

    This breaks compatibility with existing scripts and does not look very
    natural.

    This patch fixes the problem by looking at output_fd only when it was
    modified by the user (> 0). It also checks that the value is positive.
    Passing --log-fd 0 is ignored.

    I would also argue that defaulting to stderr for the results is not the
    right thing to do, though this patch does not address this specific
    issue.

    Signed-off-by: Stephane Eranian
    Cc: David Ahern
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Jim Cromie
    Link: http://lkml.kernel.org/r/20120515111111.GA9870@quad
    Signed-off-by: Arnaldo Carvalho de Melo
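
A minimal sketch of the rule described above, assuming a hypothetical helper `pick_output()` that receives the value of the --log-fd option (0 when the user did not set it). It is not the actual perf stat code.

```c
#include <stdio.h>

/*
 * Pick the stream for the results. output_fd is only honoured when it
 * is positive; 0 (unset, or an explicit "--log-fd 0") and negative
 * values fall through to the default.
 */
static FILE *pick_output(int output_fd)
{
	if (output_fd > 0) {
		FILE *f = fdopen(output_fd, "a");

		if (f)
			return f;
	}
	return stderr;   /* default, and fallback if fdopen() fails */
}
```

With this rule, `2>/tmp/log` captures the results again when no --log-fd is given.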

    Stephane Eranian
     
  • Based on Jiri's latest attempt:
    https://lkml.org/lkml/2012/5/16/61

    Basically, adds_features should be byte swapped assuming unsigned
    longs are either 8-bytes (u64) or 4-bytes (u32).

    Fixes 32-bit ppc dumping 64-bit x86 feature data:
    ========
    captured on: Sun May 20 19:23:23 2012
    hostname : nxos-vdc-dev3
    os release : 3.4.0-rc7+
    perf version : 3.4.rc4.137.g978da3
    arch : x86_64
    nrcpus online : 16
    nrcpus avail : 16
    cpudesc : Intel(R) Xeon(R) CPU E5540 @ 2.53GHz
    cpuid : GenuineIntel,6,26,5
    total memory : 24680324 kB
    ...

    Verified 64-bit x86 can still dump feature data for 32-bit ppc.

    Signed-off-by: David Ahern
    Reviewed-by: Jiri Olsa
    Cc: Corey Ashford
    Cc: Frederic Weisbecker
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/4FBBB539.5010805@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo
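
The two cases named above (8-byte or 4-byte unsigned longs) can be illustrated as follows. This is a sketch under the assumption that the caller already knows the word size of the machine that wrote the file; the function names are hypothetical and the real code decides the word size differently.

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t swap32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

static uint64_t swap64(uint64_t v)
{
	return ((uint64_t)swap32((uint32_t)v) << 32) |
	       swap32((uint32_t)(v >> 32));
}

/*
 * Byte swap a feature bitmap in place. word_size is the size of
 * "unsigned long" on the machine that wrote the file: 8 or 4.
 * nbytes must be a multiple of word_size.
 */
static void swap_feature_bitmap(void *bits, size_t nbytes, size_t word_size)
{
	if (word_size == 8) {
		uint64_t *w = bits;

		for (size_t i = 0; i < nbytes / 8; i++)
			w[i] = swap64(w[i]);
	} else {
		uint32_t *w = bits;

		for (size_t i = 0; i < nbytes / 4; i++)
			w[i] = swap32(w[i]);
	}
}
```

Swapping with the wrong word size leaves every bit in the wrong position, which is why a 32-bit reader could not interpret a bitmap written with 64-bit words.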

    David Ahern
     

09 Jun, 2012

2 commits

  • The SuSE security team suggested using recvfrom instead of recv to be
    certain that the connector message originated from the kernel.

    CVE-2012-2669

    Signed-off-by: Olaf Hering
    Signed-off-by: Marcus Meissner
    Signed-off-by: Sebastian Krahmer
    Signed-off-by: K. Y. Srinivasan
    Cc: stable
    Signed-off-by: Greg Kroah-Hartman
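
A sketch of the technique, not the daemon's actual code: recvfrom() reports the sender's netlink address, and messages sent by the kernel carry port ID (nl_pid) 0. The helper names `sender_is_kernel()` and `recv_from_kernel()` are made up for this illustration.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <string.h>

/* Netlink messages sent by the kernel carry port ID (nl_pid) 0. */
static int sender_is_kernel(const struct sockaddr_nl *addr)
{
	return addr->nl_family == AF_NETLINK && addr->nl_pid == 0;
}

/*
 * Receive one netlink message and reject it unless the sender address
 * reported by recvfrom() is the kernel. Returns the byte count, or -1.
 */
static ssize_t recv_from_kernel(int fd, void *buf, size_t len)
{
	struct sockaddr_nl addr;
	socklen_t addr_len = sizeof(addr);
	ssize_t n;

	memset(&addr, 0, sizeof(addr));
	n = recvfrom(fd, buf, len, 0, (struct sockaddr *)&addr, &addr_len);
	if (n < 0 || !sender_is_kernel(&addr))
		return -1;
	return n;
}
```

Plain recv() gives no sender address, so a message injected by another local process could not be told apart from one sent by the kernel.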

    Olaf Hering
     
  • Pull perf fixes from Ingo Molnar:
    "A bit larger than what I'd wish for - half of it is due to hw driver
    updates to Intel Ivy-Bridge which info got recently released,
    cycles:pp should work there now too, amongst other things. (but we
    are generally making exceptions for hardware enablement of this type.)

    There are also callchain fixes in it - responding to mostly
    theoretical (but valid) concerns. The tooling side sports perf.data
    endianness/portability fixes which did not make it for the merge
    window - and various other fixes as well."

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (26 commits)
    perf/x86: Check user address explicitly in copy_from_user_nmi()
    perf/x86: Check if user fp is valid
    perf: Limit callchains to 127
    perf/x86: Allow multiple stacks
    perf/x86: Update SNB PEBS constraints
    perf/x86: Enable/Add IvyBridge hardware support
    perf/x86: Implement cycles:p for SNB/IVB
    perf/x86: Fix Intel shared extra MSR allocation
    x86/decoder: Fix bsr/bsf/jmpe decoding with operand-size prefix
    perf: Remove duplicate invocation on perf_event_for_each
    perf uprobes: Remove unnecessary check before strlist__delete
    perf symbols: Check for valid dso before creating map
    perf evsel: Fix 32 bit values endianity swap for sample_id_all header
    perf session: Handle endianity swap on sample_id_all header data
    perf symbols: Handle different endians properly during symbol load
    perf evlist: Pass third argument to ioctl explicitly
    perf tools: Update ioctl documentation for PERF_IOC_FLAG_GROUP
    perf tools: Make --version show kernel version instead of pull req tag
    perf tools: Check if callchain is corrupted
    perf callchain: Make callchain cursors TLS
    ...

    Linus Torvalds
     

06 Jun, 2012

1 commit

  • …it/acme/linux into perf/urgent

    Pull perf fixes from Arnaldo Carvalho de Melo:

    * Endianness fixes from Jiri Olsa

    * Fixes for make perf tarball

    * Fix for DSO name in perf script callchains, from David Ahern

    * Segfault fixes for perf top --callchain, from Namhyung Kim

    * Minor function result fixes from Srikar Dronamraju

    * Add missing 3rd ioctl parameter, from Namhyung Kim

    * Fix pager usage in minimal embedded systems, from Avik Sil

    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>

    Ingo Molnar
     

04 Jun, 2012

2 commits

  • Initial IVB support went into turbostat in Linux-3.1:
    553575f1ae048aa44682b46b3c51929a0b3ad337
    (tools turbostat: recognize and run properly on IVB)

    However, when running on IVB, turbostat would fail
    to report the new counters added with SNB: c7, pc2 and pc7.
    So in scenarios where these counters are non-zero on IVB,
    turbostat would report erroneous residency results.

    In particular c7 time would be added to c1 time,
    since c1 time is calculated as "that which is left over".

    Also, turbostat reports MHz capabilities when passed
    the "-v" option, and it would incorrectly report 133MHz
    bclk instead of 100MHz bclk for IVB, which would inflate
    GHz reported with that option.

    This patch is a backport of a fix already included in turbostat v2.

    Signed-off-by: Len Brown
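
Why missing c7 inflates c1 follows from c1 being a residual. The sketch below uses a hypothetical `struct core_delta` and is not turbostat's code; all values are assumed to be counter deltas over one interval, in TSC ticks.

```c
#include <stdint.h>

/* Hypothetical per-interval counter deltas, all in TSC ticks. */
struct core_delta {
	uint64_t tsc;          /* total ticks in the interval */
	uint64_t c0;           /* ticks spent running */
	uint64_t c3, c6, c7;   /* ticks in the deeper idle states */
};

/*
 * c1 is not measured directly: it is what remains after subtracting
 * every state that is counted. have_c7 says whether the c7 counter
 * is being read on this CPU model.
 */
static uint64_t c1_residual(const struct core_delta *d, int have_c7)
{
	uint64_t counted = d->c0 + d->c3 + d->c6 + (have_c7 ? d->c7 : 0);

	return d->tsc > counted ? d->tsc - counted : 0;
}
```

When the c7 counter is not read, its ticks stay in the remainder, so the reported c1 grows by exactly the c7 residency.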

    Len Brown
     
  • Linux 3.4 included a modification to turbostat to
    lower cross-call overhead by using scheduler affinity:

    15aaa34654831e98dd76f7738b6c7f5d05a66430
    (tools turbostat: reduce measurement overhead due to IPIs)

    In the use-case where turbostat forks a child program,
    that change had the un-intended side-effect of binding
    the child to the last cpu in the system.

    This change removed the binding before forking the child.

    This is a back-port of a fix already included in turbostat v2.

    Signed-off-by: Len Brown

    Len Brown
     

01 Jun, 2012

4 commits

  • Merge misc patches from Andrew Morton:

    - the "misc" tree - stuff from all over the map

    - checkpatch updates

    - fatfs

    - kmod changes

    - procfs

    - cpumask

    - UML

    - kexec

    - mqueue

    - rapidio

    - pidns

    - some checkpoint-restore feature work. Reluctantly. Most of it
    delayed a release. I'm still rather worried that we don't have a
    clear roadmap to completion for this work.

    * emailed from Andrew Morton: (78 patches)
    kconfig: update compression algorithm info
    c/r: prctl: add ability to set new mm_struct::exe_file
    c/r: prctl: extend PR_SET_MM to set up more mm_struct entries
    c/r: procfs: add arg_start/end, env_start/end and exit_code members to /proc/$pid/stat
    syscalls, x86: add __NR_kcmp syscall
    fs, proc: introduce /proc/<pid>/task/<tid>/children entry
    sysctl: make kernel.ns_last_pid control dependent on CHECKPOINT_RESTORE
    aio/vfs: cleanup of rw_copy_check_uvector() and compat_rw_copy_check_uvector()
    eventfd: change int to __u64 in eventfd_signal()
    fs/nls: add Apple NLS
    pidns: make killed children autoreap
    pidns: use task_active_pid_ns in do_notify_parent
    rapidio/tsi721: add DMA engine support
    rapidio: add DMA engine support for RIO data transfers
    ipc/mqueue: add rbtree node caching support
    tools/selftests: add mq_perf_tests
    ipc/mqueue: strengthen checks on mqueue creation
    ipc/mqueue: correct mq_attr_ok test
    ipc/mqueue: improve performance of send/recv
    selftests: add mq_open_tests
    ...

    Linus Torvalds
     
  • While doing checkpoint-restore in user space, one needs to determine
    whether various kernel objects (like mm_struct-s or file_struct-s) are
    shared between tasks, and to restore this state.

    The 2nd step can be solved by using appropriate CLONE_ flags and the
    unshare syscall, while there is currently no way of solving the 1st one.

    One way of checking whether two tasks share e.g. an mm_struct is to
    expose some mm_struct ID of a task in its proc file, but showing such
    info is considered bad for security reasons.

    Thus, after some debate, we concluded that a so-called 'comparison'
    syscall might be the best candidate. So here it is --
    __NR_kcmp.

    It takes up to 5 arguments - the pids of the two tasks (whose
    characteristics should be compared), the comparison type and (in case of
    comparison of files) two file descriptors.

    Lookups for pids are done in the caller's PID namespace only.

    At the moment only x86 is supported and tested.

    [akpm@linux-foundation.org: fix up selftests, warnings]
    [akpm@linux-foundation.org: include errno.h]
    [akpm@linux-foundation.org: tweak comment text]
    Signed-off-by: Cyrill Gorcunov
    Acked-by: "Eric W. Biederman"
    Cc: Pavel Emelyanov
    Cc: Andrey Vagin
    Cc: KOSAKI Motohiro
    Cc: Ingo Molnar
    Cc: H. Peter Anvin
    Cc: Thomas Gleixner
    Cc: Glauber Costa
    Cc: Andi Kleen
    Cc: Tejun Heo
    Cc: Matt Helsley
    Cc: Pekka Enberg
    Cc: Eric Dumazet
    Cc: Vasiliy Kulikov
    Cc: Alexey Dobriyan
    Cc: Valdis.Kletnieks@vt.edu
    Cc: Michal Marek
    Cc: Frederic Weisbecker
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
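
A minimal sketch of calling the syscall from user space for the file-comparison case. It assumes a libc whose headers define SYS_kcmp and a kernel header providing KCMP_FILE; the wrapper name `same_open_file()` is hypothetical.

```c
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/kcmp.h>

/*
 * Ask the kernel whether fd1 in pid1 and fd2 in pid2 refer to the same
 * open file. Returns 1 if they do, 0 if not, -1 on error (for example
 * when the running kernel does not provide kcmp or forbids it).
 */
static int same_open_file(pid_t pid1, pid_t pid2, int fd1, int fd2)
{
	long r = syscall(SYS_kcmp, pid1, pid2, KCMP_FILE,
			 (unsigned long)fd1, (unsigned long)fd2);

	if (r < 0)
		return -1;
	return r == 0;   /* 0 means "equal"; other values mean "different" */
}
```

A descriptor and its dup() share one open file and should compare as equal, while the two ends of a pipe should not.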

    Cyrill Gorcunov
     
  • Add the mq_perf_tests tool I used when creating my mq performance patch.
    Also add a local .gitignore to keep the binaries from showing up in git
    status output.

    [akpm@linux-foundation.org: checkpatch fixes]
    Signed-off-by: Doug Ledford
    Cc: Stephen Rothwell
    Cc: Manfred Spraul
    Cc: Frederic Weisbecker
    Acked-by: KOSAKI Motohiro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Doug Ledford
     
  • Add a directory to house POSIX message queue subsystem specific tests.
    Add first test which checks the operation of mq_open() under various
    corner conditions.

    Signed-off-by: Doug Ledford
    Cc: KOSAKI Motohiro
    Cc: Doug Ledford
    Cc: Joe Korty
    Cc: Amerigo Wang
    Cc: Serge E. Hallyn
    Cc: Jiri Slaby
    Cc: Manfred Spraul
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Doug Ledford
     

31 May, 2012

15 commits

  • Since strlist__delete() itself checks, the additional check before
    calling strlist__delete() is redundant.

    No Functional change.

    Signed-off-by: Srikar Dronamraju
    Suggested-by: Arnaldo Carvalho de Melo
    Cc: Ananth N Mavinakayanahalli
    Cc: Anton Arapov
    Cc: Linus Torvalds
    Cc: Masami Hiramatsu
    Cc: Oleg Nesterov
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    Link: http://lkml.kernel.org/r/20120531114643.23691.38666.sendpatchset@srdronam.in.ibm.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Srikar Dronamraju
     
  • dso__new() can return NULL. Hence verify dso before creating a new map.

    Signed-off-by: Srikar Dronamraju
    Suggested-by: Arnaldo Carvalho de Melo
    Cc: Ananth N Mavinakayanahalli
    Cc: Anton Arapov
    Cc: Linus Torvalds
    Cc: Masami Hiramatsu
    Cc: Oleg Nesterov
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    Link: http://lkml.kernel.org/r/20120531114656.23691.54223.sendpatchset@srdronam.in.ibm.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Srikar Dronamraju
     
  • We swap the sample_id_all header by u64 pointers. Some members of the
    header happen to be 32-bit values. We need to handle them separately.

    Together with other endianity patches, this change fixes perf report
    discrepancies on origin and target systems as described in test 1 below,
    e.g. following perf report diff:

    ...
    0.12% ps [kernel.kallsyms] [k] clear_page
    - 0.12% awk bash [.] alloc_word_desc
    + 0.12% awk bash [.] yyparse
    0.11% beah-rhts-task libpython2.6.so.1.0 [.] 0x5560e
    0.10% perf libc-2.12.so [.] __ctype_toupper_loc
    - 0.09% rhts-test-runne bash [.] maybe_make_export_env
    + 0.09% rhts-test-runne bash [.] 0x385a0
    0.09% ps [kernel.kallsyms] [k] page_fault
    ...

    Note: run the following to test perf endianity handling:
    test 1)
    - origin system:
    # perf record -a -- sleep 10 (any perf record will do)
    # perf report > report.origin
    # perf archive perf.data

    - copy the perf.data, report.origin and perf.data.tar.bz2
    to a target system and run:
    # tar xjvf perf.data.tar.bz2 -C ~/.debug
    # perf report > report.target
    # diff -u report.origin report.target

    - the diff should produce no output
    (besides some white space stuff and possibly different
    date/TZ output)

    test 2)
    - origin system:
    # perf record -ag -fo /tmp/perf.data -- sleep 1
    - mount origin system root to the target system on /mnt/origin
    - target system:
    # perf script --symfs /mnt/origin -I -i /mnt/origin/tmp/perf.data \
    --kallsyms /mnt/origin/proc/kallsyms
    - complete perf.data header is displayed

    Signed-off-by: Jiri Olsa
    Reviewed-by: David Ahern
    Tested-by: David Ahern
    Cc: Corey Ashford
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1338380624-7443-4-git-send-email-jolsa@redhat.com
    Signed-off-by: Arnaldo Carvalho de Melo
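
The problem with swapping two packed 32-bit members as one u64 can be shown with a small sketch. The union and function names are hypothetical, and the approach (undo the u64 swap, then swap each half) is one way to do it, not necessarily identical to the patch.

```c
#include <stdint.h>

static uint32_t swab_u32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

static uint64_t swab_u64(uint64_t v)
{
	return ((uint64_t)swab_u32((uint32_t)v) << 32) |
	       swab_u32((uint32_t)(v >> 32));
}

/* One 64-bit slot that really holds two 32-bit fields, e.g. pid and tid. */
union u64_slot {
	uint64_t val64;
	uint32_t val32[2];
};

/*
 * The slot has already been swapped as a whole u64 by a generic pass.
 * Undo that, then swap each 32-bit member on its own so that both the
 * byte order and the member order come out right.
 */
static void fixup_u32_pair(union u64_slot *s)
{
	s->val64 = swab_u64(s->val64);
	s->val32[0] = swab_u32(s->val32[0]);
	s->val32[1] = swab_u32(s->val32[1]);
}
```

A whole-u64 swap fixes the bytes of each member but also exchanges the two members, so a pid/tid pair would come out reversed without the fixup.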

    Jiri Olsa
     
  • Add endianity swapping for the event header attached via sample_id_all.

    Currently we don't do that, and it causes wrong data to be read when
    running report on an architecture with different endianity than the one
    used for record.

    perf is currently able to process 32-bit PPC samples on 32-bit
    and 64-bit x86.

    Together with other endianity patches, this change fixes perf report
    discrepancies on origin and target systems as described in test 1
    below, e.g. following perf report diff:

    ...
    0.12% ps [kernel.kallsyms] [k] clear_page
    - 0.12% awk bash [.] alloc_word_desc
    + 0.12% awk bash [.] yyparse
    0.11% beah-rhts-task libpython2.6.so.1.0 [.] 0x5560e
    0.10% perf libc-2.12.so [.] __ctype_toupper_loc
    - 0.09% rhts-test-runne bash [.] maybe_make_export_env
    + 0.09% rhts-test-runne bash [.] 0x385a0
    0.09% ps [kernel.kallsyms] [k] page_fault
    ...

    Note: run the following to test perf endianity handling:
    test 1)
    - origin system:
    # perf record -a -- sleep 10 (any perf record will do)
    # perf report > report.origin
    # perf archive perf.data

    - copy the perf.data, report.origin and perf.data.tar.bz2
    to a target system and run:
    # tar xjvf perf.data.tar.bz2 -C ~/.debug
    # perf report > report.target
    # diff -u report.origin report.target

    - the diff should produce no output
    (besides some white space stuff and possibly different
    date/TZ output)

    test 2)
    - origin system:
    # perf record -ag -fo /tmp/perf.data -- sleep 1
    - mount origin system root to the target system on /mnt/origin
    - target system:
    # perf script --symfs /mnt/origin -I -i /mnt/origin/tmp/perf.data \
    --kallsyms /mnt/origin/proc/kallsyms
    - complete perf.data header is displayed

    Signed-off-by: Jiri Olsa
    Reviewed-by: David Ahern
    Tested-by: David Ahern
    Cc: Corey Ashford
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1338380624-7443-3-git-send-email-jolsa@redhat.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Currently we don't care about the file object's endianness. It's possible
    to read a buildid file object from a different architecture than the one
    we are currently running on, so we need to read such an object's data
    with care and handle the different endianness properly.

    Adding:
    needs_swap DSO field
    dso__swap_init function to initialize DSO's needs_swap
    DSO__SWAP to read the data with proper swaps

    Together with other endianity patches, this change fixes perf report
    discrepancies on origin and target systems as described in test 1 below,
    e.g. following perf report diff:

    ...
    0.12% ps [kernel.kallsyms] [k] clear_page
    - 0.12% awk bash [.] alloc_word_desc
    + 0.12% awk bash [.] yyparse
    0.11% beah-rhts-task libpython2.6.so.1.0 [.] 0x5560e
    0.10% perf libc-2.12.so [.] __ctype_toupper_loc
    - 0.09% rhts-test-runne bash [.] maybe_make_export_env
    + 0.09% rhts-test-runne bash [.] 0x385a0
    0.09% ps [kernel.kallsyms] [k] page_fault
    ...

    Note: run the following to test perf endianity handling:
    test 1)
    - origin system:
    # perf record -a -- sleep 10 (any perf record will do)
    # perf report > report.origin
    # perf archive perf.data

    - copy the perf.data, report.origin and perf.data.tar.bz2
    to a target system and run:
    # tar xjvf perf.data.tar.bz2 -C ~/.debug
    # perf report > report.target
    # diff -u report.origin report.target

    - the diff should produce no output
    (besides some white space stuff and possibly different
    date/TZ output)

    test 2)
    - origin system:
    # perf record -ag -fo /tmp/perf.data -- sleep 1
    - mount origin system root to the target system on /mnt/origin
    - target system:
    # perf script --symfs /mnt/origin -I -i /mnt/origin/tmp/perf.data \
    --kallsyms /mnt/origin/proc/kallsyms
    - complete perf.data header is displayed

    Signed-off-by: Jiri Olsa
    Cc: Corey Ashford
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1338380624-7443-2-git-send-email-jolsa@redhat.com
    Signed-off-by: Arnaldo Carvalho de Melo
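
The "decide once, swap on every read" pattern described above might look like the sketch below. It is not the perf implementation: `object_needs_swap()` and `read_u32()` are hypothetical stand-ins for the needs_swap field, dso__swap_init and DSO__SWAP, and `__builtin_bswap32` is a GCC/Clang builtin.

```c
#include <elf.h>
#include <stdint.h>

/* Byte order of the machine running this code, as an EI_DATA value. */
static int host_elf_data(void)
{
	const uint16_t probe = 1;

	return *(const uint8_t *)&probe ? ELFDATA2LSB : ELFDATA2MSB;
}

/*
 * Decide once per object whether its fields need swapping, from the
 * EI_DATA byte of the ELF identification. Returns 1 or 0, or -1 if
 * the data encoding is unknown.
 */
static int object_needs_swap(const unsigned char *e_ident)
{
	int data = e_ident[EI_DATA];

	if (data != ELFDATA2LSB && data != ELFDATA2MSB)
		return -1;
	return data != host_elf_data();
}

/* Read a 32-bit field from the object, swapping only when required. */
static uint32_t read_u32(uint32_t raw, int needs_swap)
{
	return needs_swap ? __builtin_bswap32(raw) : raw;
}
```

Caching the decision per object keeps the per-field reads cheap and makes same-endian objects a no-op.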

    Jiri Olsa
     
  • The ioctl on perf event fd wants 3 arguments but we only passed 2. As
    the only user of the functions is perf record and it calls them for
    every event (regardless of group setting), just pass 0 for now.

    Signed-off-by: Namhyung Kim
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1338443506-25009-3-git-send-email-namhyung.kim@lge.com
    Signed-off-by: Arnaldo Carvalho de Melo
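
For illustration, the three-argument form of the call could be wrapped as below. The wrapper names are hypothetical; PERF_EVENT_IOC_ENABLE, PERF_EVENT_IOC_DISABLE and PERF_IOC_FLAG_GROUP come from <linux/perf_event.h>.

```c
#include <sys/ioctl.h>
#include <linux/perf_event.h>

/*
 * Enable or disable a single perf event. The third ioctl argument is
 * passed explicitly: 0 targets only this event, whereas
 * PERF_IOC_FLAG_GROUP would apply the request to its whole group.
 */
static int event_enable(int fd)
{
	return ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
}

static int event_disable(int fd)
{
	return ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
}
```

Omitting the third argument compiles because ioctl() is variadic, but the kernel then reads whatever value happens to be in that argument slot.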

    Namhyung Kim
     
  • The ioctl interface of perf event fd receives 3 arguments to control
    event group behavior but it lacked documentation.

    Signed-off-by: Namhyung Kim
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1338443506-25009-2-git-send-email-namhyung.kim@lge.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
  • Before:

    $ perf --version
    perf version perf.urgent.for.mingo.5.g37da28

    After:

    $ perf --version
    perf version 3.4.8941.g37da28.dirty

    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-vc9b4e6023iegz9kabr3yvyv@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • We hit a segmentation fault in perf top -G at a very high sampling rate
    due to a corrupted callchain. While the root cause was not revealed (I
    failed to figure it out), this patch tries to protect us from the
    segfault in such cases.

    Reported-by: Arnaldo Carvalho de Melo
    Signed-off-by: Namhyung Kim
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Sunjin Yang
    Link: http://lkml.kernel.org/r/1338443007-24857-2-git-send-email-namhyung.kim@lge.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
  • perf top -G has a race on the callchain cursor between the main thread
    and the display thread. Since the callchain cursors are used locally,
    making them thread-local data solves the problem.

    Signed-off-by: Namhyung Kim
    Reported-by: Sunjin Yang
    Suggested-by: Arnaldo Carvalho de Melo
    Cc: Ingo Molnar
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Sunjin Yang
    Link: http://lkml.kernel.org/r/1338443007-24857-1-git-send-email-namhyung.kim@lge.com
    Signed-off-by: Arnaldo Carvalho de Melo
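
Thread-local storage for such a cursor can be sketched as follows. The struct is a hypothetical, much reduced stand-in for the real cursor, and C11 `_Thread_local` is an assumption made for this example rather than the mechanism the patch necessarily uses.

```c
#include <stddef.h>

/* Hypothetical, much reduced callchain cursor. */
struct cursor {
	size_t nr;    /* entries appended so far */
	size_t pos;   /* read position */
};

/*
 * One instance per thread: the sampling thread and the display thread
 * can no longer overwrite each other's state.
 */
static _Thread_local struct cursor thread_cursor;

/* Returns the calling thread's own cursor. */
static struct cursor *get_cursor(void)
{
	return &thread_cursor;
}

static void cursor_reset(struct cursor *c)
{
	c->nr = 0;
	c->pos = 0;
}
```

Because each thread only ever touches its own instance, no locking is needed around cursor updates.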

    Namhyung Kim
     
  • Pull perf updates from Ingo Molnar.

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
    perf ui browser: Stop using 'self'
    perf annotate browser: Read perf config file for settings
    perf config: Allow '_' in config file variable names
    perf annotate browser: Make feature toggles global
    perf annotate browser: The idx_asm field should be used in asm only view
    perf tools: Convert critical messages to ui__error()
    perf ui: Make --stdio default when TUI is not supported
    tools lib traceevent: Silence compiler warning on 32bit build
    perf record: Fix branch_stack type in perf_record_opts
    perf tools: Reconstruct event with modifiers from perf_event_attr
    perf top: Fix counter name fixup when fallbacking to cpu-clock
    perf tools: fix thread_map__new_by_pid_str() memory leak in error path
    perf tools: Do not use _FORTIFY_SOURCE when DEBUG=1 is specified
    tools lib traceevent: Fix signature of create_arg_item()
    tools lib traceevent: Use proper function parameter type
    tools lib traceevent: Fix freeing arg on process_dynamic_array()
    tools lib traceevent: Fix a possibly wrong memory dereference
    tools lib traceevent: Fix a possible memory leak
    tools lib traceevent: Allow expressions in __print_symbolic() fields
    perf evlist: Explicititely initialize input_name
    ...

    Linus Torvalds
     
  • Some distributions may not include the "less" package by default,
    e.g., Linaro nano rootfs. In those cases use the portable "pager"
    command instead of "less".

    Signed-off-by: Avik Sil
    Acked-by: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1338287725-26382-1-git-send-email-avik.sil@linaro.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Avik Sil
     
  • The patch series that introduced the top level tools/ makefile and
    libtraceevent broke this feature: files needed to build from a
    detached tarball were not listed in the MANIFEST file and thus were
    not included in the tarball.

    Fix it by adding the relevant files to the MANIFEST.

    Cc: Borislav Petkov
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Steven Rostedt
    Link: http://lkml.kernel.org/n/tip-z3mjj74927xvqwhlmu18kj80@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • $ perf script -i /tmp/perf.data
    ...
    gcc 13623 544315.062858: context-switches:
    ffffffff815f65c9 __schedule ([kernel.kallsyms])
    ffffffff81087cea __cond_resched ([kernel.kallsyms])
    ffffffff815f6b92 _cond_resched ([kernel.kallsyms])
    ffffffff815fb87a do_page_fault ([kernel.kallsyms])
    ffffffff815f8465 page_fault ([kernel.kallsyms])
    2b7a71ea0303 _dl_lookup_symbol_x ([kernel.kallsyms])
    2b7a71ea1eb5 _dl_relocate_object ([kernel.kallsyms])
    2b7a71e99b2e dl_main ([kernel.kallsyms])
    2b7a71eab7f4 _dl_sysdep_start ([kernel.kallsyms])

    All DSO's in a callchain are printed as [kernel.kallsyms].

    git bisect chased it to:

    547a92e0aedb88129e7fbd804697a11949de2e5a is the first bad commit
    commit 547a92e0aedb88129e7fbd804697a11949de2e5a
    Author: Akihiro Nagai
    Date: Mon Jan 30 13:42:57 2012 +0900

    perf script: Unify the expressions indicating "unknown"

    The perf script command uses various expressions to indicate "unknown".

    It is unfriendly for user scripts to parse it. So, this patch unifies
    the expressions to "[unknown]".

    Looks like a copy-paste error: the other references use al.map, but this
    one should be node->map.

    With this patch you get:

    $ perf script -i /tmp/perf.data
    ...
    gcc 13623 544315.062858: context-switches:
    ffffffff815f65c9 __schedule ([kernel.kallsyms])
    ffffffff81087cea __cond_resched ([kernel.kallsyms])
    ffffffff815f6b92 _cond_resched ([kernel.kallsyms])
    ffffffff815fb87a do_page_fault ([kernel.kallsyms])
    ffffffff815f8465 page_fault ([kernel.kallsyms])
    2b7a71ea0303 _dl_lookup_symbol_x (/lib64/ld-2.14.90.so)
    2b7a71ea1eb5 _dl_relocate_object (/lib64/ld-2.14.90.so)
    2b7a71e99b2e dl_main (/lib64/ld-2.14.90.so)
    2b7a71eab7f4 _dl_sysdep_start (/lib64/ld-2.14.90.so)

    Signed-off-by: David Ahern
    Cc: Akihiro Nagai
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1338353906-60706-1-git-send-email-dsahern@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    David Ahern
     
  • When no event is specified the tools use perf_evlist__add_default(), that will
    call event_attr_init to initialize the KVM exclusion bits.

    When the change was made to the tools so that by default guest samples would be
    excluded, the changes were made just to the parsing routines and to
    perf_evlist__add_default(), not to perf_evlist__add_attrs, that is used so far
    just by perf stat to add multiple events, according to the level of detail
    specified.

    Recently the tools were changed to reconstruct the event name from all the
    details in perf_event_attr, not just from .type and .config, but taking into
    account all the feature bits (.exclude_{guest,host,user,kernel,etc},
    .precise_ip, etc).

    That is when we noticed that the default for perf stat wasn't the one for the
    rest of the tools, i.e. the .exclude_guest bit wasn't being set.

    I.e. the default, that doesn't call event_attr_init was showing the :HG
    modifier:

    $ perf stat usleep 1

    Performance counter stats for 'usleep 1':

    0.942119 task-clock # 0.454 CPUs utilized
    1 context-switches # 0.001 M/sec
    0 CPU-migrations # 0.000 K/sec
    126 page-faults # 0.134 M/sec
    693,193 cycles:HG # 0.736 GHz [40.11%]
    407,461 stalled-cycles-frontend:HG # 58.78% frontend cycles idle [72.29%]
    365,403 stalled-cycles-backend:HG # 52.71% backend cycles idle
    465,982 instructions:HG # 0.67 insns per cycle
    # 0.87 stalled cycles per insn
    89,760 branches:HG # 95.275 M/sec
    6,178 branch-misses:HG # 6.88% of all branches

    0.002077228 seconds time elapsed

    While if one explicitly specifies the same events, the parsing code is
    called and thus event_attr_init is called:

    $ perf stat -e task-clock,context-switches,migrations,page-faults,cycles,stalled-cycles-frontend,stalled-cycles-backend,instructions,branches,branch-misses usleep 1

    Performance counter stats for 'usleep 1':

    1.040349 task-clock # 0.500 CPUs utilized
    2 context-switches # 0.002 M/sec
    0 CPU-migrations # 0.000 K/sec
    127 page-faults # 0.122 M/sec
    587,966 cycles # 0.565 GHz [13.18%]
    459,167 stalled-cycles-frontend # 78.09% frontend cycles idle
    390,249 stalled-cycles-backend # 66.37% backend cycles idle
    504,006 instructions # 0.86 insns per cycle
    # 0.91 stalled cycles per insn
    96,455 branches # 92.714 M/sec
    6,522 branch-misses # 6.76% of all branches [96.12%]

    0.002078681 seconds time elapsed

    Fix it by introducing a perf_evlist__add_default_attrs method that will call
    event_attr_init in all the perf_event_attr entries before adding the events.

    Reported-by: Ingo Molnar
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-4eysr236r0pgiyum9epwxw7s@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo
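
The shape of the fix, reduced to its core, is "initialise every attr the same way before adding it". The sketch below is a simplification with hypothetical function names; the per-attr step here only sets exclude_guest, which is the bit discussed above.

```c
#include <stddef.h>
#include <linux/perf_event.h>

/*
 * Simplified stand-in for the per-attr initialisation: exclude guest
 * samples by default.
 */
static void attr_init_defaults(struct perf_event_attr *attr)
{
	attr->exclude_guest = 1;
}

/*
 * Run the same initialisation over every attr before the events are
 * added, so default and explicitly requested events end up identical.
 */
static void init_default_attrs(struct perf_event_attr *attrs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		attr_init_defaults(&attrs[i]);
}
```

Once both paths go through the same initialisation, the reconstructed event names no longer differ by a modifier between the default and the explicit invocation.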

    Arnaldo Carvalho de Melo
     

30 May, 2012

10 commits

  • It's 'H', not 'h'. The latter is for getting to the help window.

    Reported-by: Ingo Molnar
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-7zvwphhm815y2zczoxgstzuf@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • In non symbolic views, i.e. --sort without "symbol", as in:

    perf report --sort comm

    We're segfaulting in the --tui because we're testing the symbol resolved
    and then trying to use the symbol on the histogram entry where we're
    coalescing all hits for a COMM, and the first hist_entry for a comm may
    have a NULL symbol, i.e. the RIP didn't resolve to any symbol.

    In this case we're segfaulting, fix it by testing against the symbol in
    the histogram entry.

    Reported-by: William Cohen
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-8ylwubbcmu27ucc9ffrku3yv@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Merge back Linus's latest branch so that we pick up the uprobes changes.

    ( I tested this branch locally and while it's one from the middle of the
    merge window it's a good one to base further work off. )

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Stop using this python/OOP convention, it doesn't really help. Will do
    more from time to time until we get it cleaned up in all of /perf.

    Suggested-by: Thomas Gleixner
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-5dyxyb8o0gf4yndk27kafbd1@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • The defaults are:

    [annotate]

    hide_src_code = false
    use_offset = true
    jump_arrows = true
    show_nr_jumps = false

    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-q4egci70rjgxh7bogbbfpcyf@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
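
    As an illustration of the defaults listed above, here is a minimal
    sketch of how such toggles could be held in a struct and overridden by
    key name. The struct, the function and the restriction to the literal
    values "true" and "false" are assumptions made for this example; this
    is not perf's config parser.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical container for the four [annotate] toggles. */
struct sketch_annotate_opts {
	bool hide_src_code;
	bool use_offset;
	bool jump_arrows;
	bool show_nr_jumps;
};

/* Defaults as listed in the commit message. */
static const struct sketch_annotate_opts sketch_annotate_defaults = {
	.hide_src_code = false,
	.use_offset    = true,
	.jump_arrows   = true,
	.show_nr_jumps = false,
};

/* Set one option by name. Returns 0 on success, -1 on unknown key or value. */
static int sketch_annotate_set(struct sketch_annotate_opts *o,
			       const char *key, const char *value)
{
	bool v;

	if (strcmp(value, "true") == 0)
		v = true;
	else if (strcmp(value, "false") == 0)
		v = false;
	else
		return -1;

	if (strcmp(key, "hide_src_code") == 0)
		o->hide_src_code = v;
	else if (strcmp(key, "use_offset") == 0)
		o->use_offset = v;
	else if (strcmp(key, "jump_arrows") == 0)
		o->jump_arrows = v;
	else if (strcmp(key, "show_nr_jumps") == 0)
		o->show_nr_jumps = v;
	else
		return -1;
	return 0;
}
```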
     
  • For annotate I want to be able to have variables that are the same as
    the ones representing feature toggles.

    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-7rhhf6m0a72p2wja4tgv1itg@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • So that when navigating to another function from a call site or when
    going to another annotation browser through the main report/top browser
    the options (hide source code, jump arrows, jumpy lines, etc) remain
    the last ones selected.

    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-0h0tah1zj59p01581snjufne@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • When hide_src_view is true we can't use browser_disasm_line->idx, which
    also takes non-asm lines into account; we must use
    browser_disasm_line->idx_asm instead, otherwise we may end up with an
    index past the number of entries. Fix it.

    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-o1szpyjh3z87yi0n6x0cr8uu@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
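
    The index selection this commit describes can be sketched as below. The
    field names idx and idx_asm come from the commit message; the struct
    layout, the types and the helper name are assumptions for illustration
    only.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced version of a browser line with its two indexes. */
struct sketch_disasm_line {
	size_t idx;     /* position counting both source and asm lines */
	size_t idx_asm; /* position counting asm lines only */
};

/*
 * Pick the index that matches what is on screen: when source lines are
 * hidden only asm lines are listed, so the asm-only index must be used.
 */
static size_t sketch_line_index(const struct sketch_disasm_line *bdl,
				bool hide_src_code)
{
	return hide_src_code ? bdl->idx_asm : bdl->idx;
}
```

    With source hidden, a line that is entry 5 of the mixed listing may be
    entry 2 of the asm-only listing; using 5 there could point past the end
    of the visible entries.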
     
  • Compiling page-type.c with a recent compiler produces many warnings,
    mostly related to signed/unsigned comparisons. This patch cleans up most
    of them.

    One remaining warning is about an unused parameter. The file
    doesn't define a __unused macro (or the like) yet. This can be addressed
    later.

    Signed-off-by: Ulrich Drepper
    Acked-by: KOSAKI Motohiro
    Acked-by: Fengguang Wu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ulrich Drepper
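
    A typical way to silence a signed/unsigned comparison warning is to give
    the loop index the same unsigned type as the bound. The function below
    is a generic sketch of that cleanup under that assumption; it is not
    taken from page-type.c and its name is hypothetical.

```c
#include <stddef.h>

/*
 * Count entries equal to `wanted`. The index `i` is a size_t, the same type
 * as the bound `n`, so `i < n` compares two unsigned values and does not
 * trigger a sign-compare warning the way `int i` would.
 */
static size_t sketch_count_matches(const unsigned long *flags, size_t n,
				   unsigned long wanted)
{
	size_t i, hits = 0;

	for (i = 0; i < n; i++)
		if (flags[i] == wanted)
			hits++;
	return hits;
}
```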
     
  • Programs using /proc/kpageflags need to know about the various flags.
    The header file provides them, and the comments in the file indicate
    that it is supposed to be used by user-level code. But the file is not
    installed.

    Install the headers and mark the unstable flags as out-of-bounds. The
    page-type tool is also adjusted to not duplicate the definitions.
    Signed-off-by: Ulrich Drepper
    Acked-by: KOSAKI Motohiro
    Acked-by: Fengguang Wu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ulrich Drepper
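
    For context, /proc/kpageflags holds one 64-bit flag word per page frame,
    indexed by page frame number (PFN). The helpers below are a minimal
    sketch of how a user-level program might locate and test a flag. The
    helper names are hypothetical, and the bit number is defined locally
    here (bit 4, the dirty flag per the kernel's pagemap documentation)
    rather than taken from the installed header.

```c
#include <stdint.h>
#include <sys/types.h>

/* Assumption: bit 4 is the "dirty" flag, defined locally for this sketch. */
#define SKETCH_KPF_DIRTY_BIT 4

/* Byte offset of the flag word for a given PFN: one uint64_t per page. */
static off_t sketch_kpageflags_offset(uint64_t pfn)
{
	return (off_t)(pfn * sizeof(uint64_t));
}

/* Return 1 if the given bit is set in a flag word, 0 otherwise. */
static int sketch_flag_is_set(uint64_t flags, unsigned int bit)
{
	return (int)((flags >> bit) & 1);
}
```

    A program would read 8 bytes at the computed offset (reading the file
    normally requires root) and then test the bits it cares about.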
     

29 May, 2012

2 commits

  • There were places where we used ui__warning (or even fprintf) to show
    critical messages. This patch converts them to ui__error so that the
    front-end code can implement appropriate behavior.

    Signed-off-by: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Pekka Enberg
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1338265382-6872-3-git-send-email-namhyung@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
  • The commit dc41b9b8f02db ("perf ui: Change fallback policy of
    setup_browser") changed the default behavior of the function but
    accidentally missed setting the use_browser variable to 0. So perf
    report ends up doing nothing in such cases. Fix it.

    Signed-off-by: Namhyung Kim
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1338216802-5675-1-git-send-email-namhyung@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
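
    The fallback this commit restores can be sketched as below. This is a
    simplified model, not the real setup_browser: the variable and function
    names are hypothetical, and the value convention (-1 undecided, 0 plain
    stdio, 1 TUI) is an assumption made for the example.

```c
#include <stdbool.h>

/* Assumed convention: -1 undecided, 0 plain stdio output, 1 TUI. */
static int sketch_use_browser = -1;

/*
 * Choose the front end. If the TUI is wanted (or nothing was chosen) but
 * cannot be initialized, fall back to stdio by explicitly setting the
 * variable to 0; leaving it at a non-zero value would mean no front end
 * ever produces output.
 */
static void sketch_setup_browser(bool tui_available)
{
	if (sketch_use_browser < 0)
		sketch_use_browser = 1; /* prefer the TUI by default */

	if (sketch_use_browser == 1 && tui_available)
		return;

	sketch_use_browser = 0; /* the assignment the commit says was missed */
}
```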