01 Jun, 2012

4 commits

  • Merge misc patches from Andrew Morton:

    - the "misc" tree - stuff from all over the map

    - checkpatch updates

    - fatfs

    - kmod changes

    - procfs

    - cpumask

    - UML

    - kexec

    - mqueue

    - rapidio

    - pidns

    - some checkpoint-restore feature work. Reluctantly. Most of it
    delayed a release. I'm still rather worried that we don't have a
    clear roadmap to completion for this work.

    * emailed from Andrew Morton : (78 patches)
    kconfig: update compression algorithm info
    c/r: prctl: add ability to set new mm_struct::exe_file
    c/r: prctl: extend PR_SET_MM to set up more mm_struct entries
    c/r: procfs: add arg_start/end, env_start/end and exit_code members to /proc/$pid/stat
    syscalls, x86: add __NR_kcmp syscall
    fs, proc: introduce /proc/<pid>/task/<tid>/children entry
    sysctl: make kernel.ns_last_pid control dependent on CHECKPOINT_RESTORE
    aio/vfs: cleanup of rw_copy_check_uvector() and compat_rw_copy_check_uvector()
    eventfd: change int to __u64 in eventfd_signal()
    fs/nls: add Apple NLS
    pidns: make killed children autoreap
    pidns: use task_active_pid_ns in do_notify_parent
    rapidio/tsi721: add DMA engine support
    rapidio: add DMA engine support for RIO data transfers
    ipc/mqueue: add rbtree node caching support
    tools/selftests: add mq_perf_tests
    ipc/mqueue: strengthen checks on mqueue creation
    ipc/mqueue: correct mq_attr_ok test
    ipc/mqueue: improve performance of send/recv
    selftests: add mq_open_tests
    ...

    Linus Torvalds
     
  • While doing checkpoint-restore in user space, one needs to determine
    whether various kernel objects (like mm_struct-s or file_struct-s) are
    shared between tasks, and then restore this state.

    The 2nd step can be solved by using appropriate CLONE_ flags and the
    unshare syscall, while there is currently no way to solve the 1st one.

    One way to check whether two tasks share e.g. an mm_struct is to
    expose some mm_struct ID of a task in its proc file, but showing such
    info is considered not that good for security reasons.

    Thus, after some debate, we came to the conclusion that a so-called
    'comparison' syscall might be the best candidate. So here it is --
    __NR_kcmp.

    It takes up to 5 arguments: the pids of the two tasks (whose
    characteristics should be compared), the comparison type and (when
    comparing files) two file descriptors.

    Lookups for pids are done in the caller's PID namespace only.

    At the moment only x86 is supported and tested.
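
    As an illustration of the five-argument call described above, here is a
    minimal user-space sketch that asks whether two file descriptors of the
    calling process refer to the same open file. It is not part of the patch.
    The syscall number 312 (x86-64) and the KCMP_FILE value 0 are assumptions
    stated in the code, and same_open_file() is a hypothetical helper name.
    It returns None whenever the call is unavailable or refused.

```python
import ctypes
import os
import platform
import sys

SYS_KCMP_X86_64 = 312  # assumed value of __NR_kcmp on x86-64 only
KCMP_FILE = 0          # assumed comparison type: compare two file descriptors


def same_open_file(fd1, fd2):
    """Return True/False if kcmp compared the descriptors, else None."""
    is_x86_64 = (
        sys.platform.startswith("linux")
        and platform.machine() == "x86_64"
        and ctypes.sizeof(ctypes.c_void_p) == 8
    )
    if not is_x86_64:
        return None  # syscall number above would be wrong elsewhere
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    pid = os.getpid()  # both "tasks" are the caller in this sketch
    ret = libc.syscall(
        ctypes.c_long(SYS_KCMP_X86_64),
        ctypes.c_long(pid), ctypes.c_long(pid),
        ctypes.c_long(KCMP_FILE),
        ctypes.c_ulong(fd1), ctypes.c_ulong(fd2),
    )
    if ret < 0:
        return None  # not built in, filtered, or not permitted
    return ret == 0  # 0 means the two descriptors share one open file


if __name__ == "__main__":
    fd = os.open(os.devnull, os.O_RDONLY)
    print(same_open_file(fd, os.dup(fd)))
```

    A dup()-ed descriptor shares the open file with its original, while a
    second open() of the same path does not, which is the kind of sharing
    information a checkpoint tool needs to recover.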

    [akpm@linux-foundation.org: fix up selftests, warnings]
    [akpm@linux-foundation.org: include errno.h]
    [akpm@linux-foundation.org: tweak comment text]
    Signed-off-by: Cyrill Gorcunov
    Acked-by: "Eric W. Biederman"
    Cc: Pavel Emelyanov
    Cc: Andrey Vagin
    Cc: KOSAKI Motohiro
    Cc: Ingo Molnar
    Cc: H. Peter Anvin
    Cc: Thomas Gleixner
    Cc: Glauber Costa
    Cc: Andi Kleen
    Cc: Tejun Heo
    Cc: Matt Helsley
    Cc: Pekka Enberg
    Cc: Eric Dumazet
    Cc: Vasiliy Kulikov
    Cc: Alexey Dobriyan
    Cc: Valdis.Kletnieks@vt.edu
    Cc: Michal Marek
    Cc: Frederic Weisbecker
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Cyrill Gorcunov
     
  • Add the mq_perf_tests tool I used when creating my mq performance patch.
    Also add a local .gitignore to keep the binaries from showing up in git
    status output.

    [akpm@linux-foundation.org: checkpatch fixes]
    Signed-off-by: Doug Ledford
    Cc: Stephen Rothwell
    Cc: Manfred Spraul
    Cc: Frederic Weisbecker
    Acked-by: KOSAKI Motohiro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Doug Ledford
     
  • Add a directory to house POSIX message queue subsystem specific tests.
    Add first test which checks the operation of mq_open() under various
    corner conditions.

    Signed-off-by: Doug Ledford
    Cc: KOSAKI Motohiro
    Cc: Doug Ledford
    Cc: Joe Korty
    Cc: Amerigo Wang
    Cc: Serge E. Hallyn
    Cc: Jiri Slaby
    Cc: Manfred Spraul
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Doug Ledford
     

31 May, 2012

1 commit

  • Pull perf updates from Ingo Molnar.

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
    perf ui browser: Stop using 'self'
    perf annotate browser: Read perf config file for settings
    perf config: Allow '_' in config file variable names
    perf annotate browser: Make feature toggles global
    perf annotate browser: The idx_asm field should be used in asm only view
    perf tools: Convert critical messages to ui__error()
    perf ui: Make --stdio default when TUI is not supported
    tools lib traceevent: Silence compiler warning on 32bit build
    perf record: Fix branch_stack type in perf_record_opts
    perf tools: Reconstruct event with modifiers from perf_event_attr
    perf top: Fix counter name fixup when fallbacking to cpu-clock
    perf tools: fix thread_map__new_by_pid_str() memory leak in error path
    perf tools: Do not use _FORTIFY_SOURCE when DEBUG=1 is specified
    tools lib traceevent: Fix signature of create_arg_item()
    tools lib traceevent: Use proper function parameter type
    tools lib traceevent: Fix freeing arg on process_dynamic_array()
    tools lib traceevent: Fix a possibly wrong memory dereference
    tools lib traceevent: Fix a possible memory leak
    tools lib traceevent: Allow expressions in __print_symbolic() fields
    perf evlist: Explicititely initialize input_name
    ...

    Linus Torvalds
     

30 May, 2012

7 commits

  • Stop using this python/OOP convention, it doesn't really help. Will do
    more from time to time till we get it cleaned up in all of /perf.

    Suggested-by: Thomas Gleixner
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-5dyxyb8o0gf4yndk27kafbd1@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • The defaults are:

    [annotate]

    hide_src_code = false
    use_offset = true
    jump_arrows = true
    show_nr_jumps = false
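
    To make the defaults above concrete, the following sketch overlays
    values from an [annotate] section onto them. It uses Python's
    configparser as a stand-in for perf's own config reader, so the parsing
    rules are an assumption, and load_annotate_settings() is a hypothetical
    name.

```python
import configparser

# Defaults as listed in the commit message.
DEFAULTS = {
    "hide_src_code": False,
    "use_offset": True,
    "jump_arrows": True,
    "show_nr_jumps": False,
}


def load_annotate_settings(text):
    """Return the annotate settings: defaults overridden by the config text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    settings = dict(DEFAULTS)
    if parser.has_section("annotate"):
        for key in DEFAULTS:
            if parser.has_option("annotate", key):
                settings[key] = parser.getboolean("annotate", key)
    return settings


if __name__ == "__main__":
    print(load_annotate_settings("[annotate]\nhide_src_code = true\n"))
```

    Keys that are absent from the file keep their default value.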

    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-q4egci70rjgxh7bogbbfpcyf@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • For annotate I want to be able to have variables that are the same as
    the ones representing feature toggles.

    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-7rhhf6m0a72p2wja4tgv1itg@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • So that when navigating to another function from a call site or when
    going to another annotation browser thru the main report/top browser the
    options (hide source code, jump arrows, jumpy lines, etc) remain the
    last ones selected.

    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-0h0tah1zj59p01581snjufne@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • When hide_src_view is true we can't use browser_disasm_line->idx, which
    also counts non-asm lines; we must use browser_disasm_line->idx_asm
    instead, otherwise we may end up with an index past the number of
    entries. Oops, fix it.
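
    The off-the-end index can be reproduced with a small model of the two
    counters. This is only a sketch: the dictionaries stand in for
    browser_disasm_line, and using -1 as idx_asm for source lines is an
    assumption.

```python
def index_lines(lines):
    """lines is a list of (text, is_asm); number them both ways."""
    entries = []
    asm_count = 0
    for idx, (text, is_asm) in enumerate(lines):
        # idx counts every line, idx_asm counts assembly lines only.
        entries.append({
            "text": text,
            "idx": idx,
            "idx_asm": asm_count if is_asm else -1,
        })
        if is_asm:
            asm_count += 1
    return entries


def visible_position(entry, hide_src_code):
    """Pick the index that matches the list currently on screen."""
    return entry["idx_asm"] if hide_src_code else entry["idx"]


if __name__ == "__main__":
    mixed = [("x = 1;", False), ("mov", True), ("x++;", False), ("add", True)]
    entries = index_lines(mixed)
    asm_only = [e for e in entries if e["idx_asm"] >= 0]
    last = entries[-1]
    print(last["idx"], len(asm_only), visible_position(last, True))
```

    With source lines hidden only two entries are shown, so the last line's
    idx of 3 points past the end while its idx_asm of 1 is valid.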

    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-o1szpyjh3z87yi0n6x0cr8uu@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Compiling page-type.c with a recent compiler produces many warnings,
    mostly related to signed/unsigned comparisons. This patch cleans up most
    of them.

    One remaining warning is about an unused parameter. The file
    doesn't define a __unused macro (or the like) yet. This can be addressed
    later.

    Signed-off-by: Ulrich Drepper
    Acked-by: KOSAKI Motohiro
    Acked-by: Fengguang Wu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ulrich Drepper
     
  • Programs using /proc/kpageflags need to know about the various flags. A
    kernel header provides them, and the comments in that header
    indicate that it is supposed to be used by user-level code. But the file
    is not installed.

    Install the headers and mark the unstable flags as out-of-bounds. The
    page-type tool is also adjusted to not duplicate the definitions.
    Signed-off-by: Ulrich Drepper
    Acked-by: KOSAKI Motohiro
    Acked-by: Fengguang Wu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ulrich Drepper
     

29 May, 2012

2 commits

  • There were places that used ui__warning (or even fprintf) to show
    critical messages. This patch converts them to ui__error so that the
    front-end code can implement appropriate behavior.

    Signed-off-by: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Pekka Enberg
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1338265382-6872-3-git-send-email-namhyung@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
  • The commit dc41b9b8f02db ("perf ui: Change fallback policy of
    setup_browser") changed the default behavior of the function but
    accidentally missed setting the use_browser variable to 0. So perf report
    ends up doing nothing in such cases. Fix it.

    Signed-off-by: Namhyung Kim
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1338216802-5675-1-git-send-email-namhyung@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     

27 May, 2012

1 commit

  • gcc complains about casting a pointer to unsigned long long directly:

    SUBDIR ../lib/traceevent/
    CC FPIC event-parse.o
    CC FPIC trace-seq.o
    CC FPIC parse-filter.o
    /home/namhyung/project/linux/tools/lib/traceevent/parse-filter.c: In function ‘get_value’:
    /home/namhyung/project/linux/tools/lib/traceevent/parse-filter.c:1588: warning: cast from pointer to integer of different size
    CC FPIC parse-utils.o
    BUILD STATIC LIB libtraceevent.a

    Signed-off-by: Namhyung Kim
    Cc: Frederic Weisbecker
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    Link: http://lkml.kernel.org/r/1338003691-3141-1-git-send-email-namhyung@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     

26 May, 2012

3 commits

  • The attr.branch_sample_type field is defined as u64 by the API. As
    such, we need to ensure the variable holding the value of the branch
    stack filters is also u64, otherwise we may lose bits in the future.

    Note also that the bogus definition of the field in perf_record_opts
    caused problems on big-endian PPC systems. Thanks to Anshuman Khandual
    for tracking the problem on PPC.
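
    One way a too-narrow field can misbehave on big-endian machines only is
    sketched below. The commit does not spell out the exact failure, so the
    mechanism shown (a 64-bit store later read back through a 32-bit field
    at the same address) is an assumption, and store_u64_read_u32() is a
    hypothetical helper.

```python
import struct


def store_u64_read_u32(value, byte_order):
    """Store value as a u64, then read only the first 4 bytes as a u32.

    byte_order is "<" for little-endian or ">" for big-endian.
    """
    raw = struct.pack(byte_order + "Q", value)
    (narrow,) = struct.unpack(byte_order + "I", raw[:4])
    return narrow


if __name__ == "__main__":
    flags = 0x10  # a small filter mask that fits easily in 32 bits
    print(hex(store_u64_read_u32(flags, "<")))  # low half comes first
    print(hex(store_u64_read_u32(flags, ">")))  # high half comes first
```

    On little-endian the low 32 bits sit at the start of the 64-bit value,
    so small masks appear to work; on big-endian the same read returns the
    (zero) high half.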

    Reported-by: Anshuman Khandual
    Signed-off-by: Stephane Eranian
    Cc: Anshuman Khandual
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20120525211344.GA7729@quad
    Signed-off-by: Arnaldo Carvalho de Melo

    Stephane Eranian
     
  • The modifiers:

    k kernel space
    u user space
    h hypervisor
    G guest
    H host
    p, pp, ppp precision level (PEBS)

    that can be suffixed to an event were lost when tools used event_name()
    to reconstruct them from the perf_event_attr entries in a perf.data
    file.

    Fix it by following the defaults used for these modifiers in the current
    codebase, so:

    $ perf record -e instructions:u usleep 1 2> /dev/null
    $ perf evlist
    instructions:u
    $ perf record -e cycles:k usleep 1 2> /dev/null
    $ perf evlist
    cycles:k
    $ perf record -e cycles:kh usleep 1 2> /dev/null
    $ perf evlist
    cycles:kh
    $ perf record -e cache-misses:G usleep 1 2> /dev/null
    $ perf evlist
    cache-misses:G
    $ perf record -e cycles:ppk usleep 1 2> /dev/null
    $ perf evlist
    cycles:kpp
    $

    Also works with 'top', 'report', etc.

    More work needed to cover tracepoints and software events while not
    dragging lots of baggage to the python binding, this is a minimal fix
    for v3.5.
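
    The reconstruction described above can be sketched as follows. The
    parameter names mirror perf_event_attr fields, but the rules for when a
    letter is printed and the ordering of the letters are assumptions chosen
    to match the sample output in this message, not the code from the patch.

```python
def modifier_suffix(exclude_user=0, exclude_kernel=0, exclude_hv=0,
                    exclude_host=0, exclude_guest=0, precise_ip=0):
    """Rebuild a modifier string such as 'kpp' from attr-style flags."""
    mods = ""
    # Privilege letters are only printed when something was excluded.
    if exclude_user or exclude_kernel or exclude_hv:
        if not exclude_kernel:
            mods += "k"
        if not exclude_user:
            mods += "u"
        if not exclude_hv:
            mods += "h"
    # Same idea for guest/host.
    if exclude_host or exclude_guest:
        if not exclude_guest:
            mods += "G"
        if not exclude_host:
            mods += "H"
    mods += "p" * precise_ip  # one 'p' per precision level
    return mods


def event_name(base, **attr):
    """Append the reconstructed modifiers to the base event name."""
    mods = modifier_suffix(**attr)
    return f"{base}:{mods}" if mods else base


if __name__ == "__main__":
    print(event_name("cycles", exclude_user=1, exclude_hv=1, precise_ip=2))
```

    The sketch assumes that a modifier like ':k' is recorded as "exclude
    user and hypervisor", which is why the flags are inverted when printing.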

    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-4hl5glle0hxlklw4usva1mkt@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • In 40491eaa "perf top: Update event name when falling back to cpu-clock"
    we freed counter->name but didn't reset it to NULL, then when setting it
    to the result of event_name(), event_name() would use the cached value,
    which by now was overwritten and thus we got garbage or a zero-length
    string.

    Fix it by just freeing and setting counter->name to NULL, this way
    event_name() when called afterwards, will find the right counter name
    and cache it again.

    Found while trying 'cycles:pp' on a machine where :pp couldn't be
    honoured. Probably the best fallback here is to tell the user that that
    level of precision is not available on the PMU and then remove 'p'
    levels of precision one at a time till we get to plain 'cycles', and if
    even that fails, _then_ fall back to 'cpu-clock'.

    But that is a matter for another patch; this one just needs to fix the
    caching issue, which in the end will show 'cpu-clock' when tools ask for
    the event name being used. That clarifies things for the user, who
    will see that 'cycles:pp' or whatever unsupported event is not being
    used and that some sort of fallback happened.

    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-w1neie2dqli89we1bzwkf4id@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

25 May, 2012

3 commits

  • The namelist array (including its content) was not freed if we fail to
    realloc a new 'threads' structure.

    Signed-off-by: Franck Bui-Huu
    Cc: David Ahern
    Link: http://lkml.kernel.org/r/1337952109-31995-1-git-send-email-fbuihuu@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Franck Bui-Huu
     
  • Pull user-space probe instrumentation from Ingo Molnar:
    "The uprobes code originates from SystemTap and has been used for years
    in Fedora and RHEL kernels. This version is much rewritten, reviews
    from PeterZ, Oleg and myself shaped the end result.

    This tree includes uprobes support in 'perf probe' - but SystemTap
    (and other tools) can take advantage of user probe points as well.

    Sample usage of uprobes via perf, for example to profile malloc()
    calls without modifying user-space binaries.

    First boot a new kernel with CONFIG_UPROBE_EVENT=y enabled.

    If you don't know which function you want to probe you can pick one
    from 'perf top' or can get a list of all functions that can be probed
    within libc (binaries can be specified as well):

    $ perf probe -F -x /lib/libc.so.6

    To probe libc's malloc():

    $ perf probe -x /lib64/libc.so.6 malloc
    Added new event:
    probe_libc:malloc (on 0x7eac0)

    You can now use it in all perf tools, such as:

    perf record -e probe_libc:malloc -aR sleep 1

    Make use of it to create a call graph (as the flat profile is going to
    look very boring):

    $ perf record -e probe_libc:malloc -gR make
    [ perf record: Woken up 173 times to write data ]
    [ perf record: Captured and wrote 44.190 MB perf.data (~1930712

    $ perf report | less

    32.03% git libc-2.15.so [.] malloc
    |
    --- malloc

    29.49% cc1 libc-2.15.so [.] malloc
    |
    --- malloc
    |
    |--0.95%-- 0x208eb1000000000
    |
    |--0.63%-- htab_traverse_noresize

    11.04% as libc-2.15.so [.] malloc
    |
    --- malloc
    |

    7.15% ld libc-2.15.so [.] malloc
    |
    --- malloc
    |

    5.07% sh libc-2.15.so [.] malloc
    |
    --- malloc
    |
    4.99% python-config libc-2.15.so [.] malloc
    |
    --- malloc
    |
    4.54% make libc-2.15.so [.] malloc
    |
    --- malloc
    |
    |--7.34%-- glob
    | |
    | |--93.18%-- 0x41588f
    | |
    | --6.82%-- glob
    | 0x41588f

    ...

    Or:

    $ perf report -g flat | less

    # Overhead Command Shared Object Symbol
    # ........ ............. ............. ..........
    #
    32.03% git libc-2.15.so [.] malloc
    27.19%
    malloc

    29.49% cc1 libc-2.15.so [.] malloc
    24.77%
    malloc

    11.04% as libc-2.15.so [.] malloc
    11.02%
    malloc

    7.15% ld libc-2.15.so [.] malloc
    6.57%
    malloc

    ...

    The core uprobes design is fairly straightforward: uprobes probe
    points register themselves at (inode:offset) addresses of
    libraries/binaries, after which all existing (or new) vmas that map
    that address will have a software breakpoint injected at that address.
    vmas are COW-ed to preserve original content. The probe points are
    kept in an rbtree.

    If user-space executes the probed inode:offset instruction address
    then an event is generated which can be recovered from the regular
    perf event channels and mmap-ed ring-buffer.

    Multiple probes at the same address are supported, they create a
    dynamic callback list of event consumers.

    The basic model is further complicated by the XOL speedup: the
    original instruction that is probed is copied (in an architecture
    specific fashion) and executed out of line when the probe triggers.
    The XOL area is a single vma per process, with a fixed number of
    entries (which limits probe execution parallelism).

    The API: uprobes are installed/removed via
    /sys/kernel/debug/tracing/uprobe_events. The API is designed to
    align with the kprobes interface as much as possible, but is separate
    from it.

    Injecting a probe point is a privileged operation, which can be relaxed
    by setting perf_paranoid to -1.

    You can use multiple probes as well and mix them with kprobes and
    regular PMU events or tracepoints, when instrumenting a task."
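
    For readers who want to see what a probe definition written to
    uprobe_events might look like, here is a small sketch that only builds
    the text line and does not touch the tracing files. The
    "p:GROUP/EVENT PATH:OFFSET" layout is an assumption modeled on the
    kprobes-style interface mentioned in the pull message, and
    uprobe_definition() is a hypothetical helper.

```python
def uprobe_definition(group, event, binary_path, offset):
    """Format one probe definition line (assumed layout, see lead-in)."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    # 'p' marks a probe; the offset is relative to the start of the file.
    return f"p:{group}/{event} {binary_path}:{offset:#x}"


if __name__ == "__main__":
    # Values taken from the 'perf probe' output quoted in the pull message.
    print(uprobe_definition("probe_libc", "malloc", "/lib64/libc.so.6", 0x7EAC0))
```

    The group, event and offset used here correspond to the
    "probe_libc:malloc (on 0x7eac0)" event reported by perf probe in the
    quoted example.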

    Fix up trivial conflicts in mm/memory.c due to previous cleanup of
    unmap_single_vma().

    * 'perf-uprobes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
    perf probe: Detect probe target when m/x options are absent
    perf probe: Provide perf interface for uprobes
    tracing: Fix kconfig warning due to a typo
    tracing: Provide trace events interface for uprobes
    tracing: Extract out common code for kprobes/uprobes trace events
    tracing: Modify is_delete, is_return from int to bool
    uprobes/core: Decrement uprobe count before the pages are unmapped
    uprobes/core: Make background page replacement logic account for rss_stat counters
    uprobes/core: Optimize probe hits with the help of a counter
    uprobes/core: Allocate XOL slots for uprobes use
    uprobes/core: Handle breakpoint and singlestep exceptions
    uprobes/core: Rename bkpt to swbp
    uprobes/core: Make order of function parameters consistent across functions
    uprobes/core: Make macro names consistent
    uprobes: Update copyright notices
    uprobes/core: Move insn to arch specific structure
    uprobes/core: Remove uprobe_opcode_sz
    uprobes/core: Make instruction tables volatile
    uprobes: Move to kernel/events/
    uprobes/core: Clean up, refactor and improve the code
    ...

    Linus Torvalds
     
  • As:

    make DEBUG=1 -C tools/perf

    disables optimizations and _FORTIFY_SOURCE in recent distros requires
    optimizations to be enabled, seen on a Fedora 17 system:

    [acme@Fedora17 linux]$ make DEBUG=1 O=/home/acme/git/build/perf/ -C
    tools/perf install
    In file included from /usr/include/sys/types.h:26:0,
    from /usr/include/libelf.h:53,
    from /usr/include/gelf.h:53,
    from /usr/include/elfutils/libdw.h:53,
    from :2:
    /usr/include/features.h:314:4: error: #warning _FORTIFY_SOURCE requires
    compiling with optimization (-O) [-Werror=cpp

    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-4ccyiebqju4uatm31ky7725b@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

24 May, 2012

10 commits

  • The @type should be a type of enum event_type not enum filter_arg_type.

    This fixes the following warning:

    $ make
    COMPILE FPIC parse-events.o
    COMPILE FPIC parse-filter.o
    /home/namhyung/project/trace-cmd/parse-filter.c: In function ‘create_arg_item’:
    /home/namhyung/project/trace-cmd/parse-filter.c:343:9: warning: comparison between ‘enum filter_arg_type’ and ‘enum event_type’ [-Wenum-compare]
    /home/namhyung/project/trace-cmd/parse-filter.c:339:2: warning: case value ‘8’ not in enumerated type ‘enum filter_arg_type’ [-Wswitch]
    BUILD STATIC LIB libparsevent.a
    BUILD STATIC LIB libtracecmd.a
    BUILD trace-cmd
    /usr/bin/make -C /home/namhyung/project/trace-cmd/Documentation all
    make[1]: Nothing to be done for `all'.
    Note: to build the gui, type "make gui"

    Cc: Borislav Petkov
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1337740619-27925-20-git-send-email-namhyung.kim@lge.com
    Signed-off-by: Namhyung Kim
    Signed-off-by: Steven Rostedt
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
  • The param needs to be updated when setting args up so that
    the loop in process_defined_func() can see the correct
    param->type for the farg.

    Cc: Borislav Petkov
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1337740619-27925-15-git-send-email-namhyung.kim@lge.com
    Signed-off-by: Namhyung Kim
    Signed-off-by: Steven Rostedt
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
  • The @arg parameter should not be freed inside of process_XXX(),
    because it'd be freed from the caller of process_arg(). We can
    free it only after it was reused for local usage.

    Cc: Borislav Petkov
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1337740619-27925-14-git-send-email-namhyung.kim@lge.com
    Signed-off-by: Namhyung Kim
    Signed-off-by: Steven Rostedt
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
  • If set_op_prio() failed, the token will be freed at out_free,
    then arg->op.op would turn out to be a dangling pointer. After
    returning EVENT_ERROR from process_op(), free_arg() will be
    called and then it will finally see the dangling pointer.

    Cc: Borislav Petkov
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1337740619-27925-13-git-send-email-namhyung.kim@lge.com
    Signed-off-by: Namhyung Kim
    Signed-off-by: Steven Rostedt
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
  • If event_read_fields failed in the middle, each member of
    struct format_field should be freed also.

    Cc: Borislav Petkov
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1337740619-27925-11-git-send-email-namhyung.kim@lge.com
    Signed-off-by: Namhyung Kim
    Signed-off-by: Steven Rostedt
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
  • The __print_symbolic() function takes a sequence of key-value pairs for
    pretty-printing a constant. The new kvm:kvm_exit print fmt uses the
    expression:

    __print_symbolic(..., { 0x040 + 1, "DB excp" }, ...)

    Currently only atoms are supported and this print fmt fails to parse.
    This patch adds support for expressions instead of just atoms so that
    0x040 + 1 is parsed successfully.
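
    The difference between accepting only atoms and accepting expressions
    can be sketched with a tiny constant evaluator. This is not the
    libtraceevent parser: it borrows Python's ast module for tokenizing, so
    it only handles literals that Python and C write the same way, and
    eval_const() and symbolic_name() are hypothetical names.

```python
import ast
import operator

# Binary operators accepted inside a key expression.
_BINOPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.BitOr: operator.or_,
    ast.BitAnd: operator.and_,
    ast.LShift: operator.lshift,
    ast.RShift: operator.rshift,
}


def eval_const(expr):
    """Evaluate an integer constant expression such as '0x040 + 1'."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and type(node.value) is int:
            return node.value  # a plain atom
        if isinstance(node, ast.BinOp) and type(node.op) in _BINOPS:
            return _BINOPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError(f"unsupported expression: {expr!r}")
    return walk(ast.parse(expr.strip(), mode="eval"))


def symbolic_name(value, pairs):
    """Look value up in (key_expression, name) pairs, as a print fmt would."""
    for key_expr, name in pairs:
        if eval_const(key_expr) == value:
            return name
    return hex(value)


if __name__ == "__main__":
    print(symbolic_name(0x41, [("0x040 + 1", "DB excp")]))
```

    A parser that accepted only atoms would stop at the '+' token; walking
    the expression tree instead lets the key evaluate to 0x41.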

    Cc: Avi Kivity
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1337740619-27925-6-git-send-email-namhyung.kim@lge.com
    Signed-off-by: Namhyung Kim
    Signed-off-by: Stefan Hajnoczi
    Signed-off-by: Steven Rostedt
    Signed-off-by: Arnaldo Carvalho de Melo

    Stefan Hajnoczi
     
  • …inux into perf/urgent

    Pull a 'perf evlist' fix from Arnaldo Carvalho de Melo.

    Signed-off-by: Ingo Molnar <mingo@kernel.org>

    Ingo Molnar
     
  • It was a global variable, so it was initialized, implicitly, to zero by
    being placed in the bss.

    Now it is just a local variable that is then passed to the __cmd_evlist
    routine, so it must be explicitly set to NULL.

    The problem manifested on a Fedora 17 system, using:

    gcc version 4.7.0 20120507 (Red Hat 4.7.0-5) (GCC)

    But not on several other systems, by luck.

    Reported-by: Ingo Molnar
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-5e8wolcjs3rgd5i6yi995gfh@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Pull trivial ktest spelling fix from Steven Rostedt:
    "I promised Jesper that I would push this for 3.5, but forgot to add it
    to my queue. It's just a spelling fix, but it should go in regardless
    to hide my inability to get words in the English language correct."

    Becuse gud spealing is impurtunt.

    * tag 'ktest-v3.5-spelling' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest:
    ktest: Change singular "paranthesis" to plural "parentheses"

    Linus Torvalds
     
  • …rnel.org/pub/scm/linux/kernel/git/tip/tip

    Pull perf fixes from Ingo Molnar:

    - Leftover AMD PMU driver fix from the end of the v3.4
    stabilization cycle.

    - Late tools/perf/ changes that missed the first round:
    * endianness fixes
    * event parsing improvements
    * libtraceevent fixes factored out from trace-cmd
    * perl scripting engine fixes related to libtraceevent
    * testcase improvements
    * perf inject / pipe mode fixes
    * plus a kernel side fix

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    perf/x86: Update event scheduling constraints for AMD family 15h models

    * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    Revert "sched, perf: Use a single callback into the scheduler"
    perf evlist: Show event attribute details
    perf tools: Bump default sample freq to 4 kHz
    perf buildid-list: Work better with pipe mode
    perf tools: Fix piped mode read code
    perf inject: Fix broken perf inject -b
    perf tools: rename HEADER_TRACE_INFO to HEADER_TRACING_DATA
    perf tools: Add union u64_swap type for swapping u64 data
    perf tools: Carry perf_event_attr bitfield throught different endians
    perf record: Fix documentation for branch stack sampling
    perf target: Add cpu flag to sample_type if target has cpu
    perf tools: Always try to build libtraceevent
    perf tools: Rename libparsevent to libtraceevent in Makefile
    perf script: Rename struct event to struct event_format in perl engine
    perf script: Explicitly handle known default print arg type
    perf tools: Add hardcoded name term for pmu events
    perf tools: Separate 'mem:' event scanner bits
    perf tools: Use allocated list for each parsed event
    perf tools: Add support for displaying event parser debug info
    perf test: Move parse event automated tests to separated object

    Linus Torvalds
     

23 May, 2012

7 commits

  • Acked-by: Randy Dunlap
    Signed-off-by: Jesper Juhl
    Signed-off-by: Steven Rostedt

    Jesper Juhl
     
  • Pull scheduler changes from Ingo Molnar:
    "The biggest change is the cleanup/simplification of the load-balancer:
    instead of the current practice of architectures twiddling scheduler
    internal data structures and providing the scheduler domains in
    colorfully inconsistent ways, we now have generic scheduler code in
    kernel/sched/core.c:sched_init_numa() that looks at the architecture's
    node_distance() parameters and (while not fully trusting it) deduces a
    NUMA topology from it.

    This inevitably changes balancing behavior - hopefully for the better.

    There are various smaller optimizations, cleanups and fixlets as well"
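
    The idea of deducing a topology from node distances can be sketched as
    below. This is not the kernel's sched_init_numa(): node_distance() is
    modeled as a plain matrix, the sanity checks hinted at by "not fully
    trusting it" are omitted, and numa_levels() is a hypothetical name.

```python
def numa_levels(distance):
    """Group nodes into nested levels from a node distance matrix.

    Returns the sorted unique distances and, for each of them, the set of
    nodes every node can reach within that distance.
    """
    n = len(distance)
    levels = sorted({distance[i][j] for i in range(n) for j in range(n)})
    masks = {}
    for level in levels:
        masks[level] = [
            frozenset(j for j in range(n) if distance[i][j] <= level)
            for i in range(n)
        ]
    return levels, masks


if __name__ == "__main__":
    # Two pairs of nearby nodes; the pairs are farther from each other.
    table = [
        [10, 20, 30, 30],
        [20, 10, 30, 30],
        [30, 30, 10, 20],
        [30, 30, 20, 10],
    ]
    levels, masks = numa_levels(table)
    for level in levels:
        print(level, [sorted(m) for m in masks[level]])
```

    Each distinct distance yields one level of grouping, so this table gives
    three levels: the node itself, its nearby pair, and the whole machine.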

    * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    sched: Taint kernel with TAINT_WARN after sleep-in-atomic bug
    sched: Remove stale power aware scheduling remnants and dysfunctional knobs
    sched/debug: Fix printing large integers on 32-bit platforms
    sched/fair: Improve the ->group_imb logic
    sched/nohz: Fix rq->cpu_load[] calculations
    sched/numa: Don't scale the imbalance
    sched/fair: Revert sched-domain iteration breakage
    sched/x86: Rewrite set_cpu_sibling_map()
    sched/numa: Fix the new NUMA topology bits
    sched/numa: Rewrite the CONFIG_NUMA sched domain support
    sched/fair: Propagate 'struct lb_env' usage into find_busiest_group
    sched/fair: Add some serialization to the sched_domain load-balance walk
    sched/fair: Let minimally loaded cpu balance the group
    sched: Change rq->nr_running to unsigned int
    x86/numa: Check for nonsensical topologies on real hw as well
    x86/numa: Hard partition cpu topology masks on node boundaries
    x86/numa: Allow specifying node_distance() for numa=fake
    x86/sched: Make mwait_usable() heed to "idle=" kernel parameters properly
    sched: Update documentation and comments
    sched_rt: Avoid unnecessary dequeue and enqueue of pushable tasks in set_cpus_allowed_rt()

    Linus Torvalds
     
  • Pull perf changes from Ingo Molnar:
    "Lots of changes:

    - (much) improved assembly annotation support in perf report, with
    jump visualization, searching, navigation, visual output
    improvements and more.

    - kernel support for AMD IBS PMU hardware features. Notably 'perf
    record -e cycles:p' and 'perf top -e cycles:p' should work without
    skid now, like PEBS does on the Intel side, because it takes
    advantage of IBS transparently.

    - the libtraceevent library: it is the first step towards unifying
    tracing tooling and perf, and it also gives a tracing library for
    external tools like powertop to rely on.

    - infrastructure: various improvements and refactoring of the UI
    modules and related code

    - infrastructure: cleanup and simplification of the profiling
    targets code (--uid, --pid, --tid, --cpu, --all-cpus, etc.)

    - tons of robustness fixes all around

    - various ftrace updates: speedups, cleanups, robustness
    improvements.

    - typing 'make' in tools/ will now give you a menu of projects to
    build and a short help text to explain what each does.

    - ... and lots of other changes I forgot to list.

    The perf record make bzImage + perf report regression you reported
    should be fixed."

    * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (166 commits)
    tracing: Remove kernel_lock annotations
    tracing: Fix initial buffer_size_kb state
    ring-buffer: Merge separate resize loops
    perf evsel: Create events initially disabled -- again
    perf tools: Split term type into value type and term type
    perf hists: Fix callchain ip printf format
    perf target: Add uses_mmap field
    ftrace: Remove selecting FRAME_POINTER with FUNCTION_TRACER
    ftrace/x86: Have x86 ftrace use the ftrace_modify_all_code()
    ftrace: Make ftrace_modify_all_code() global for archs to use
    ftrace: Return record ip addr for ftrace_location()
    ftrace: Consolidate ftrace_location() and ftrace_text_reserved()
    ftrace: Speed up search by skipping pages by address
    ftrace: Remove extra helper functions
    ftrace: Sort all function addresses, not just per page
    tracing: change CPU ring buffer state from tracing_cpumask
    tracing: Check return value of tracing_dentry_percpu()
    ring-buffer: Reset head page before running self test
    ring-buffer: Add integrity check at end of iter read
    ring-buffer: Make addition of pages in ring buffer atomic
    ...

    Linus Torvalds
     
  • Pull USB 3.5-rc1 changes from Greg Kroah-Hartman:
    "Here is the big USB 3.5-rc1 pull request for the 3.5-rc1 merge window.

    It touches a lot of different parts of the kernel, all USB drivers,
    due to some API cleanups (getting rid of the ancient err() macro) and
    some changes that are needed for USB 3.0 power management updates.

    There are also lots of new drivers, primarily gadget, but others as
    well. We deleted a staging driver, which was nice, and finally
    dropped the obsolete usbfs code, which will make Al happy to never
    have to touch that again.

    There were some build errors in the tree that linux-next found a few
    days ago, but those were fixed by the most recent changes (all were
    due to us not building with CONFIG_PM disabled.)

    Signed-off-by: Greg Kroah-Hartman"

    * tag 'usb-3.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (477 commits)
    xhci: Fix DIV_ROUND_UP compile error.
    xhci: Fix compile with CONFIG_USB_SUSPEND=n
    USB: Fix core compile with CONFIG_USB_SUSPEND=n
    brcm80211: Fix compile error for .disable_hub_initiated_lpm.
    Revert "USB: EHCI: work around bug in the Philips ISP1562 controller"
    MAINTAINERS: Add myself as maintainer to the USB PHY Layer
    USB: EHCI: fix command register configuration lost problem
    USB: Remove races in devio.c
    USB: ehci-platform: remove update_device
    USB: Disable hub-initiated LPM for comms devices.
    xhci: Add Intel U1/U2 timeout policy.
    xhci: Add infrastructure for host-specific LPM policies.
    USB: Add macros for interrupt endpoint types.
    xhci: Reserve one command for USB3 LPM disable.
    xhci: Some Evaluate Context commands must succeed.
    USB: Disable USB 3.0 LPM in critical sections.
    USB: Add support to enable/disable USB3 link states.
    USB: Allow drivers to disable hub-initiated LPM.
    USB: Calculate USB 3.0 exit latencies for LPM.
    USB: Refactor code to set LPM support flag.
    ...

    Conflicts:
    arch/arm/mach-exynos/mach-nuri.c
    arch/arm/mach-exynos/mach-universal_c210.c
    drivers/net/wireless/ath/ath6kl/usb.c

    Linus Torvalds
     
  • There was no easy way to see the frequency used, and with the change of
    default, we had better provide one.

    [root@sandy linux]# perf evlist -F
    cycles: sample_freq=4000
    [root@sandy linux]# perf evlist -v
    cycles: sample_freq=4000, size: 80, sample_type: 391, read_format: 7, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, sample_id_all: 1, exclude_guest: 1
    [root@sandy linux]#
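
    A minimal sketch of how a line like the -F output above could be
    produced. The struct attr_view type and the format_freq() helper are
    hypothetical stand-ins for illustration, not perf's own code; only the
    "cycles: sample_freq=4000" shape comes from the output shown above, and
    the sample_period branch is an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified view of the attribute fields of interest. */
struct attr_view {
    unsigned freq;          /* non-zero: sampling is frequency based */
    uint64_t sample_freq;   /* samples per second when freq is set */
    uint64_t sample_period; /* events per sample otherwise */
};

/*
 * Format "<name>: sample_freq=<n>" for frequency-based events.
 * The period-based wording is an assumption made for this sketch.
 * Returns the snprintf() result.
 */
static int format_freq(char *out, size_t len, const char *name,
                       const struct attr_view *a)
{
    assert(out != NULL && name != NULL && a != NULL);

    if (a->freq)
        return snprintf(out, len, "%s: sample_freq=%llu", name,
                        (unsigned long long)a->sample_freq);

    return snprintf(out, len, "%s: sample_period=%llu", name,
                    (unsigned long long)a->sample_period);
}
```

    With freq set and sample_freq at 4000, format_freq() yields the same
    text as the first line of the session above.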

    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-e1p9poez3nwrgycbmwqmhlsu@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Quoting Ingo:

    "While at it I'd also suggest increasing the default sampling frequency,
    from 1000 Hz per CPU to at least 4Khz auto-freq or so - this should work
    well all across the board I think. CPUs are getting faster and command/app
    run times are getting shorter, 1Khz is a bit low IMO."

    Requested-by: Ingo Molnar
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-2jafa6mkrufyekny9ei59lpu@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • In order for perf buildid-list to work with pipe-mode files, it needs to
    process buildids and event attr structs.

    $ perf record -o - noploop 2 | ./perf inject -b | perf buildid-list -i - -H
    noploop for 2 seconds
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.084 MB - (~3678 samples) ]
    0000000000000000000000000000000000000000 [kernel.kallsyms]
    3a0d0629efe74a8da3eeba372cdbd74ad9b8f5d5 /usr/local/bin/noploop

    The reason [kernel.kallsyms] shows a 0 build-id comes from the
    way buildids are injected in the stream.

    The buildid for the kernel is provided by a BUILD_ID record. The
    [kernel.kallsyms] is provided by a MMAP record. There is no clean and
    obvious way to link the two, unfortunately.

    In regular mode, the kernel buildid is generated from reading the ELF
    image or kallsyms and perf knows to associate [kernel.kallsyms] to it.
    Later on, when perf processes the [kernel.kallsyms] MMAP record, it will
    already have a dso for it.

    So for now, make sure perf buildid-list shows the buildids for
    everything but the kernel image.

    Signed-off-by: Stephane Eranian
    Cc: David Ahern
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1337081295-10303-6-git-send-email-eranian@google.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Stephane Eranian
     

22 May, 2012

2 commits

  • In __perf_session__process_pipe_events(), there was a risk we would read
    more than what a union perf_event struct can hold. This could happen when
    perf reads a file containing new record types that it does not know about
    and that are larger than any record it knows about.

    In general, perf is supposed to skip records it does not understand, but
    in pipe mode, those have to be read and ignored. The fixed-size header
    contains the size of the record, but that size may be larger than union
    perf_event, yet a union perf_event was used as the backing store for the
    read in:

        union perf_event event;
        void *p;

        size = event.header.size;

        p = &event;
        p += sizeof(struct perf_event_header);
        if (size - sizeof(struct perf_event_header)) {
            err = readn(self->fd, p, size - sizeof(struct perf_event_header));

    We fix this by allocating a buffer based on the size reported in the
    header. We reuse the buffer as much as we can. We realloc in case it
    becomes too small. In the common case, the performance impact is
    negligible.
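
    The allocate-once, grow-on-demand approach described above can be
    sketched roughly as follows. This is an illustration under stated
    assumptions, not the actual patch: struct rec_header is a simplified
    stand-in with the same field layout as the fixed-size header, and
    struct rec_buf, rec_buf_reserve() and rec_read() are hypothetical
    names. The sketch reads from an in-memory stream instead of a pipe.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for the fixed-size record header. */
struct rec_header {
    uint32_t type;
    uint16_t misc;
    uint16_t size;  /* total record size, header included */
};

/* Hypothetical reusable buffer that survives across records. */
struct rec_buf {
    void  *data;
    size_t cap;
};

/* Ensure the buffer can hold `need` bytes; grow only when too small. */
static int rec_buf_reserve(struct rec_buf *buf, size_t need)
{
    void *p;

    assert(buf != NULL);
    if (need <= buf->cap)
        return 0;           /* common case: reuse as-is */

    p = realloc(buf->data, need);
    if (p == NULL)
        return -1;          /* old buffer is still valid */

    buf->data = p;
    buf->cap = need;
    return 0;
}

/*
 * Copy one record from `src` (`len` bytes available) into `buf`,
 * sizing the buffer from the header rather than from a fixed union.
 * Returns the record size, or -1 on a malformed or truncated record.
 */
static long rec_read(struct rec_buf *buf, const unsigned char *src,
                     size_t len)
{
    struct rec_header hdr;

    if (len < sizeof(hdr))
        return -1;
    memcpy(&hdr, src, sizeof(hdr));

    if (hdr.size < sizeof(hdr) || hdr.size > len)
        return -1;
    if (rec_buf_reserve(buf, hdr.size) < 0)
        return -1;

    memcpy(buf->data, src, hdr.size);
    return hdr.size;
}
```

    Because the capacity only ever grows, a stream of similarly sized
    records triggers at most a handful of reallocations, which is
    consistent with the negligible cost claimed above.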

    Signed-off-by: Stephane Eranian
    Cc: David Ahern
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1337081295-10303-3-git-send-email-eranian@google.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Stephane Eranian
     
  • perf inject -b was broken. It would not inject any build_id into the
    stream. Furthermore, it would strip samples from the stream.

    The reason was a missing initialization of the event attribute
    structure. The perf_tool.tool.attr() callback was pointing to a simple
    repipe. But there was no initialization of the internal data structures
    to keep track of events and event ids. That later caused event id
    lookups to fail, and samples would get removed.

    The patch simply adds back the call to perf_event__process_attr() to
    initialize the evlist structure and now build_ids are again injected.
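
    A toy model of the failure mode, for illustration only. None of the
    names below (struct id_table, struct injector, attr_broken(),
    attr_fixed(), sample()) are perf's own; they are hypothetical stand-ins
    showing why an attr handler that only repipes leaves later id lookups
    with nothing to find.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_IDS 16

/* Hypothetical table of event ids learned from attr records. */
struct id_table {
    uint64_t ids[MAX_IDS];
    size_t   nr;
};

/* Hypothetical injector state: known ids plus a forwarded-record count. */
struct injector {
    struct id_table table;
    unsigned        repiped;
};

static int id_table_add(struct id_table *t, uint64_t id)
{
    if (t->nr == MAX_IDS)
        return -1;
    t->ids[t->nr++] = id;
    return 0;
}

static int id_table_has(const struct id_table *t, uint64_t id)
{
    size_t i;

    for (i = 0; i < t->nr; i++)
        if (t->ids[i] == id)
            return 1;
    return 0;
}

/* Broken variant: forwards the attr record but records no ids. */
static int attr_broken(struct injector *inj, const uint64_t *ids, size_t nr)
{
    (void)ids;
    (void)nr;
    inj->repiped++;
    return 0;
}

/* Fixed variant: remember the ids first, then forward the record. */
static int attr_fixed(struct injector *inj, const uint64_t *ids, size_t nr)
{
    size_t i;

    assert(inj != NULL);
    for (i = 0; i < nr; i++)
        if (id_table_add(&inj->table, ids[i]) < 0)
            return -1;
    inj->repiped++;
    return 0;
}

/* A sample is forwarded only if its id is known; returns 1 if kept. */
static int sample(struct injector *inj, uint64_t id)
{
    if (!id_table_has(&inj->table, id))
        return 0;   /* lookup failed: sample is dropped */
    inj->repiped++;
    return 1;
}
```

    In this model, a sample that follows attr_broken() is always dropped,
    while the same sample after attr_fixed() is kept.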

    Signed-off-by: Stephane Eranian
    Cc: David Ahern
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1337081295-10303-2-git-send-email-eranian@google.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Stephane Eranian