08 Aug, 2010

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6: (82 commits)
    firewire: core: add forgotten dummy driver methods, remove unused ones
    firewire: add isochronous multichannel reception
    firewire: core: small clarifications in core-cdev
    firewire: core: remove unused code
    firewire: ohci: release channel in error path
    firewire: ohci: use memory barriers to order descriptor updates
    tools/firewire: nosy-dump: increment program version
    tools/firewire: nosy-dump: remove unused code
    tools/firewire: nosy-dump: use linux/firewire-constants.h
    tools/firewire: nosy-dump: break up a deeply nested function
    tools/firewire: nosy-dump: make some symbols static or const
    tools/firewire: nosy-dump: change to kernel coding style
    tools/firewire: nosy-dump: work around segfault in decode_fcp
    tools/firewire: nosy-dump: fix it on x86-64
    tools/firewire: add userspace front-end of nosy
    firewire: nosy: note ioctls in ioctl-number.txt
    firewire: nosy: use generic printk macros
    firewire: nosy: endianess fixes and annotations
    firewire: nosy: annotate __user pointers and __iomem pointers
    firewire: nosy: fix device shutdown with active client
    ...

    Linus Torvalds
     

07 Aug, 2010

1 commit

  • …git/tip/linux-2.6-tip

    * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (162 commits)
    tracing/kprobes: unregister_trace_probe needs to be called under mutex
    perf: expose event__process function
    perf events: Fix mmap offset determination
    perf, powerpc: fsl_emb: Restore setting perf_sample_data.period
    perf, powerpc: Convert the FSL driver to use local64_t
    perf tools: Don't keep unreferenced maps when unmaps are detected
    perf session: Invalidate last_match when removing threads from rb_tree
    perf session: Free the ref_reloc_sym memory at the right place
    x86,mmiotrace: Add support for tracing STOS instruction
    perf, sched migration: Librarize task states and event headers helpers
    perf, sched migration: Librarize the GUI class
    perf, sched migration: Make the GUI class client agnostic
    perf, sched migration: Make it vertically scrollable
    perf, sched migration: Parameterize cpu height and spacing
    perf, sched migration: Fix key bindings
    perf, sched migration: Ignore unhandled task states
    perf, sched migration: Handle ignored migrate out events
    perf: New migration tool overview
    tracing: Drop cpparg() macro
    perf: Use tracepoint_synchronize_unregister() to flush any pending tracepoint call
    ...

    Fix up trivial conflicts in Makefile and drivers/cpufreq/cpufreq.c

    Linus Torvalds
     

05 Aug, 2010

1 commit

  • * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
    [CPUFREQ] Remove pointless printk from p4-clockmod.
    [CPUFREQ] Fix section mismatch for powernow_cpu_init in powernow-k7.c
    [CPUFREQ] Fix section mismatch for longhaul_cpu_init.
    [CPUFREQ] Fix section mismatch for longrun_cpu_init.
    [CPUFREQ] powernow-k8: Fix misleading variable naming
    [CPUFREQ] Convert pci_table entries to PCI_VDEVICE (if PCI_ANY_ID is used)
    [CPUFREQ] arch/x86/kernel/cpu/cpufreq: use for_each_pci_dev()
    [CPUFREQ] fix brace coding style issue.
    [CPUFREQ] x86 cpufreq: Make trace_power_frequency cpufreq driver independent
    [CPUFREQ] acpi-cpufreq: Fix CPU_ANY CPUFREQ_{PRE,POST}CHANGE notification
    [CPUFREQ] ondemand: don't synchronize sample rate unless multiple cpus present
    [CPUFREQ] unexport (un)lock_policy_rwsem* functions
    [CPUFREQ] ondemand: Refactor frequency increase code
    [CPUFREQ] powernow-k8: On load failure, remind the user to enable support in BIOS setup
    [CPUFREQ] powernow-k8: Limit Pstate transition latency check
    [CPUFREQ] Fix PCC driver error path
    [CPUFREQ] fix double freeing in error path of pcc-cpufreq
    [CPUFREQ] pcc driver should check for pcch method before calling _OSC
    [CPUFREQ] fix memory leak in cpufreq_add_dev
    [CPUFREQ] revert "[CPUFREQ] remove rwsem lock from CPUFREQ_GOV_STOP call (second call site)"

    Manually fix up non-data merge conflict introduced by new calling
    conventions for trace_power_start() in commit 6f4f2723d085 ("x86
    cpufreq: Make trace_power_frequency cpufreq driver independent"), which
    didn't update the intel_idle native hardware cpuidle driver.

    Linus Torvalds
     

04 Aug, 2010

3 commits

  • The event__process function is useful in processing /proc/<pid>/maps. All of
    the functions that are called from event__process are defined in util/event.c.
    Though it's defined in builtin-top.c, it could be reused for perf probe for
    uprobes. Hence moving it to util/event.c and exporting the function.

    LKML-Reference:
    Signed-off-by: Srikar Dronamraju
    Signed-off-by: Arnaldo Carvalho de Melo

    Srikar Dronamraju
     
  • Fix buggy-looking code which unnecessarily adjusts the file offset
    fields read from /proc/*/maps.

    This may have gone unnoticed since the offset is usually 0 (and the
    logic in util/symbol.c may work incorrectly for other offset values).

    Committer note:

    This fixes a bug introduced in 4af8b35. There is no need to shift pgoff
    twice: the show_map_vma routine in fs/proc/task_mmu.c already converts
    it from a number of pages to a size in bytes, and that is what
    appears in /proc/PID/maps.
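
To illustrate the committer note: the offset column of a /proc/PID/maps line is already a byte offset printed in hexadecimal, so a parser can use it as-is. The following is a minimal sketch, not the perf code. The helper name `parse_maps_line`, the sample line, and the 4 KiB page size are assumptions made for illustration.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages, used only to show the wrong result


def parse_maps_line(line):
    """Parse one /proc/PID/maps line into (start, end, offset_bytes, path)."""
    fields = line.split(None, 5)
    start, end = (int(x, 16) for x in fields[0].split("-"))
    offset_bytes = int(fields[2], 16)  # already in bytes, no shift needed
    path = fields[5].strip() if len(fields) > 5 else ""
    return start, end, offset_bytes, path


# Hypothetical sample line in the documented maps layout:
# address perms offset dev inode pathname
line = "7f1c2a400000-7f1c2a5b5000 r-xp 00022000 08:01 1311 /lib/libc.so.6"
start, end, offset, path = parse_maps_line(line)

# Shifting again treats a byte offset as a page count, which is the
# double conversion described above.
wrong_offset = offset << PAGE_SHIFT
```

With an offset of 0 both values agree, which matches the remark that the problem may have gone unnoticed.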

    Cc: Nicolas Pitre
    Cc: Will Deacon
    LKML-Reference:
    Signed-off-by: Dave Martin
    Signed-off-by: Arnaldo Carvalho de Melo

    Dave Martin
     
  • and fix the broken case if a core's frequency depends on others.

    trace_power_frequency was implemented in a non-generic way, only in
    the acpi-cpufreq driver's target() function.
    -> Move the call to trace_power_frequency to
    cpufreq.c:cpufreq_notify_transition(), where the CPUFREQ_POSTCHANGE
    notifier is triggered.
    This supports power frequency tracing by all cpufreq drivers.

    trace_power_frequency did not trace frequency changes correctly when
    the userspace governor was used or when CPU cores' frequencies depend
    on each other.
    -> Moving this into the CPUFREQ_POSTCHANGE notifier and passing the cpu
    which gets switched automatically fixes this.

    Robert Schoene provided some important fixes on top of my initial
    quick shot version which are integrated in this patch:
    - Forgot some changes in power_end trace (TP_printk/variable names)
    - Variable dummy in power_end must now be cpu_id
    - Use static 64 bit variable instead of unsigned int for cpu_id

    Signed-off-by: Thomas Renninger
    CC: davej@redhat.com
    CC: arjan@infradead.org
    CC: linux-kernel@vger.kernel.org
    CC: robert.schoene@tu-dresden.de
    Tested-by: robert.schoene@tu-dresden.de
    Signed-off-by: Dave Jones

    Thomas Renninger
     

03 Aug, 2010

3 commits

  • For a file with:

    [root@emilia linux-2.6-tip]# perf report -D -fi allmodconfig-j32.perf.data | grep events:
    TOTAL events: 36933
    MMAP events: 9056
    LOST events: 0
    COMM events: 1702
    EXIT events: 1887
    THROTTLE events: 8
    UNTHROTTLE events: 8
    FORK events: 1894
    READ events: 0
    SAMPLE events: 22378
    ATTR events: 0
    EVENT_TYPE events: 0
    TRACING_DATA events: 0
    BUILD_ID events: 0
    [root@emilia linux-2.6-tip]#

    Testing with valgrind and making perf_session__delete() a nop, so that
    we can notice how many maps were actually deleted due to not having any
    samples on them:

    ==== HEAP SUMMARY:

    Before:

    ==10339== in use at exit: 8,909,997 bytes in 68,690 blocks
    ==10339== total heap usage: 78,696 allocs, 10,007 frees, 11,925,853 bytes allocated

    After:

    ==10506== in use at exit: 8,902,605 bytes in 68,606 blocks
    ==10506== total heap usage: 78,696 allocs, 10,091 frees, 11,925,853 bytes allocated

    I.e. just 84 detected unmaps with no hits out of 9056 for this workload,
    not much, but in some other long running workload this may save more
    bytes.
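
The figure of 84 follows directly from the two valgrind summaries. A small worked calculation, using only the numbers quoted above; the per-map average is derived here and assumes each extra free corresponds to one deleted map.

```python
# Values copied from the valgrind HEAP SUMMARY lines quoted above.
before = {"in_use_bytes": 8_909_997, "in_use_blocks": 68_690, "frees": 10_007}
after = {"in_use_bytes": 8_902_605, "in_use_blocks": 68_606, "frees": 10_091}

extra_frees = after["frees"] - before["frees"]
blocks_saved = before["in_use_blocks"] - after["in_use_blocks"]
bytes_saved = before["in_use_bytes"] - after["in_use_bytes"]

# Assumption: one freed block per deleted map.
bytes_per_map = bytes_saved / extra_frees

print(extra_frees, blocks_saved, bytes_saved, bytes_per_map)
```

The extra frees and the drop in live blocks agree, and the total saving for this workload is a few kilobytes.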

    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • If we receive two PERF_RECORD_EXIT for the same thread, we can end up
    reusing session->last_match and trying to remove the thread twice from
    the rb_tree, causing a segfault, so invalidate last_match in
    perf_session__remove_thread.

    Receiving two PERF_RECORD_EXIT for the same thread is a bug, but it's a
    harmless one if we make the tool more robust, like this patch does.
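
The pattern can be sketched as a one-entry lookup cache that is cleared whenever an entry is removed. This is a simplified illustration and not the perf implementation: the `Session` class, its method names, and the dict standing in for the rb_tree are all hypothetical.

```python
class Session:
    def __init__(self):
        self.threads = {}       # stand-in for the rb_tree, keyed by pid
        self.last_match = None  # one-entry lookup cache

    def find_thread(self, pid):
        # Fast path: reuse the cached entry when it matches.
        if self.last_match is not None and self.last_match["pid"] == pid:
            return self.last_match
        thread = self.threads.get(pid)
        if thread is not None:
            self.last_match = thread
        return thread

    def remove_thread(self, thread):
        # Clear the cache so a removed thread is never handed out again.
        self.last_match = None
        self.threads.pop(thread["pid"], None)

    def process_exit(self, pid):
        thread = self.find_thread(pid)
        if thread is None:
            return False  # duplicate EXIT: nothing left to remove
        self.remove_thread(thread)
        return True


session = Session()
session.threads[42] = {"pid": 42}
first = session.process_exit(42)
second = session.process_exit(42)  # second EXIT for the same thread
```

Without the reset in `remove_thread`, the second lookup would return the stale cached entry and attempt a second removal.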

    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Which is at perf_session__destroy_kernel_maps, counterpart to the
    perf_session__create_kernel_maps where the kmap structure is located, just
    after the vmlinux_maps.

    Make it also check if the kernel maps were actually created, which may not
    be the case if, for instance, perf_session__new can't complete due to
    permission problems in a 'perf report' run. A segfault then takes
    place, which is how this was noticed.

    The problem was introduced in d65a458, thus post .35.

    This also adds code to release guest machines, as they are also created
    in perf_session__create_kernel_maps and so should be deleted in this newly
    introduced counterpart, perf_session__destroy_kernel_maps.

    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

02 Aug, 2010

12 commits

  • Conflicts:
    drivers/firewire/core-card.c
    drivers/firewire/core-cdev.c

    and forgotten #include in drivers/firewire/ohci.c

    Signed-off-by: Stefan Richter

    Stefan Richter
     
  • Conflicts:
    tools/perf/Makefile
    tools/perf/util/hist.c

    Merge reason: Resolve the conflicts and update to latest upstream.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • …ic/random-tracing into perf/core

    Ingo Molnar
     
  • Librarize the task state and event headers helpers as they can
    be generally useful.

    Signed-off-by: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Paul Mackerras
    Cc: Nikhil Rao
    Cc: Tom Zanussi

    Frederic Weisbecker
     
  • Export the GUI facility in the common library path. It is
    going to be useful for other scheduler views.

    Signed-off-by: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Paul Mackerras
    Cc: Nikhil Rao
    Cc: Tom Zanussi

    Frederic Weisbecker
     
  • Make the perf migration GUI generic so that it can be reused for
    other kinds of trace painting. No more notion of CPUs or runqueue
    from the GUI class, it's now used as a library by the trace parser.

    Signed-off-by: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Paul Mackerras
    Cc: Nikhil Rao
    Cc: Tom Zanussi

    Frederic Weisbecker
     
  • With scheduler traces covering more than two cpus, rectangles
    of CPUs 3 and above are not visible.

    This makes the vertical navigation scrollable so that all of the
    CPU rectangles are available.

    We also want to be able to zoom vertically, so that we can best fit
    the CPU rectangles to the screen, but that's for later.

    Signed-off-by: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Paul Mackerras
    Cc: Nikhil Rao
    Cc: Tom Zanussi

    Frederic Weisbecker
     
  • Without vertical zoom, it is not possible to see all CPUs in a trace
    taken on a larger machine. This patch parameterizes the height and
    spacing of CPUs so that you can fit more cpus into the screen.

    Ideally we should dynamically size/space the CPU rectangles with some
    minimum threshold. Until then, this patch is a stop-gap.

    Signed-off-by: Nikhil Rao
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Paul Mackerras
    Cc: Tom Zanussi
    Signed-off-by: Frederic Weisbecker

    Nikhil Rao
     
  • EVT_KEY_DOWN and EVT_LEFT_DOWN events are not bound to the RootFrame
    event handler. As a result, zoom/scroll via keyboard events do not
    work. This patch adds the missing bindings.

    Signed-off-by: Nikhil Rao
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Paul Mackerras
    Cc: Tom Zanussi
    Signed-off-by: Frederic Weisbecker

    Nikhil Rao
     
  • Stop printing an error message when we don't have the letter
    for a given task state. All we need to know is if the task is
    in the TASK_RUNNING state.

    Signed-off-by: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Paul Mackerras
    Cc: Nikhil Rao
    Cc: Tom Zanussi

    Frederic Weisbecker
     
  • Migrate out events may happen on tasks that are not in the
    runqueue, for example this is the case for tasks that are
    sleeping. In this case, we don't want to log the migrate out
    event in the source runqueue because the task is not in the
    runqueue and we have already logged its sleep event.

    This fixes timeslices that spuriously propagate a sleep event
    from the previous timeslice.

    Signed-off-by: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Paul Mackerras
    Cc: Nikhil Rao
    Cc: Tom Zanussi

    Frederic Weisbecker
     
  • This brings a GUI tool that displays an overview of how the task
    load is distributed across CPUs.

    Each CPU's forward progress is cut into timeslices. A new timeslice
    is created for every runqueue event: a task gets pushed out of or
    pulled into the runqueue.

    For each timeslice, every CPU's rectangle is colored with a red
    intensity that describes the local load against the total load.
    The redder the rectangle, the higher the given CPU's load.
    This load is the number of tasks running on the CPU, without
    any distinction regarding the scheduler policy of the tasks, for
    now.
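
The load-to-color idea can be sketched as follows. The ratio of local to total load comes from the description above; the white-to-red RGB mapping, the function names, and the sample runqueues are assumptions for illustration and may differ from what the tool actually draws.

```python
def load_ratio(cpu_load, total_load):
    """Local load against total load; 0.0 when nothing is running."""
    if total_load == 0:
        return 0.0
    return cpu_load / total_load


def red_shade(ratio):
    """Map a ratio in [0, 1] to an (r, g, b) tuple from white to pure red."""
    fade = int(round(255 * (1.0 - ratio)))
    return (255, fade, fade)


# Hypothetical runqueue snapshot for one timeslice: cpu -> running tasks.
runqueues = {0: ["bash", "make"], 1: ["cc1"], 2: [], 3: ["ld"]}
total = sum(len(tasks) for tasks in runqueues.values())
colors = {
    cpu: red_shade(load_ratio(len(tasks), total))
    for cpu, tasks in runqueues.items()
}
```

An idle CPU stays white, and a CPU holding a larger share of the running tasks gets a deeper red.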

    Also for each timeslice, the event origin is depicted on the
    CPUs that triggered it using a thin colored line on top of the
    rectangle timeslice.

    These events are:

    * sleep: a task went to sleep and has then been pulled out of the
    runqueue. The origin color in the thin line is dark blue.

    * wake up: a task woke up and has then been pushed into the
    runqueue. The origin color is yellow.

    * wake up new: a new task woke up and has then been pushed into the
    runqueue. The origin color is green.

    * migrate in: a task migrated into the runqueue due to a load
    balancing operation. The origin color is violet.

    * migrate out: reverse of the previous one. Migrate in events
    usually have paired migrate out events in another runqueue.
    The origin color is light blue.

    Clicking on a timeslice provides the runqueue event details
    and the runqueue state.

    The CPU rectangles can be navigated using the usual arrow
    controls. Horizontal zooming in/out is possible with the
    "+" and "-" buttons.

    Signed-off-by: Frederic Weisbecker
    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Li Zefan
    Cc: Steven Rostedt
    Cc: Tom Zanussi
    Cc: Mike Galbraith
    Cc: Venkatesh Pallipadi
    Cc: Pierre Tardy
    Cc: Nikhil Rao

    Frederic Weisbecker
     

31 Jul, 2010

3 commits


30 Jul, 2010

7 commits

  • As a precursor for perf to support uprobes, rename fields/functions
    that had kprobe in their name but can be shared across perf-kprobes
    and perf-uprobes to probe.

    Cc: Ananth N Mavinakayanahalli
    Cc: Andrew Morton
    Cc: Christoph Hellwig
    Cc: "Frank Ch. Eigler"
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Jim Keniston
    Cc: Linus Torvalds
    Cc: Mark Wielaard
    Cc: Mathieu Desnoyers
    Cc: Naren A Devaiah
    Cc: Oleg Nesterov
    Cc: "Paul E. McKenney"
    Cc: Peter Zijlstra
    Cc: Randy Dunlap
    Cc: Steven Rostedt
    LKML-Reference:
    Signed-off-by: Srikar Dronamraju
    Signed-off-by: Arnaldo Carvalho de Melo

    Srikar Dronamraju
     
  • Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Changes:
    * Simplification of the main search loop on dso__load()
    * Replace the search with a 2-pass search:
    * First, try to find an image with a proper symtab.
    * Second, repeat the search, accepting dynsym.

    A second scan should only ever happen when needed debug images are
    missing from the buildid cache or stale, i.e., when the cache is out of
    sync.

    Currently, the second scan also happens when using separated debug
    images, since the caching logic doesn't currently know how to cache
    those. Improvements to the cache behaviour ought to solve that.
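
The two-pass search can be sketched like this. It is a simplified illustration of the strategy described above, not the dso__load() code: the function name, the dict-based image records, and the sample paths are hypothetical.

```python
def find_symbol_image(candidates):
    """Return (path, table) for the best image, or None if none is usable."""
    # Pass 1: only accept an image that carries a proper symtab.
    for image in candidates:
        if image["symtab"]:
            return image["path"], "symtab"
    # Pass 2: repeat the search, this time accepting dynsym.
    for image in candidates:
        if image["dynsym"]:
            return image["path"], "dynsym"
    return None


# Hypothetical candidate list, in search order.
stripped_first = [
    {"path": "/usr/lib/libfoo.so", "symtab": False, "dynsym": True},
    {"path": "/usr/lib/debug/libfoo.so", "symtab": True, "dynsym": False},
]
only_stripped = stripped_first[:1]

best = find_symbol_image(stripped_first)
fallback = find_symbol_image(only_stripped)
```

In the first case the debug image wins even though a stripped image comes earlier in the search order; the second pass only runs when no candidate has a symtab.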

    Signed-off-by: Dave Martin
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Dave Martin
     
  • Signed-off-by: Dave Martin
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Dave Martin
     
  • If we have a buildid, then we never want to load an image which has no buildid,
    or which has a different buildid, so it makes sense for the check to be built
    into dso__load and not done separately. This is fine for old distros which
    don't use buildid at all since we do no check in that case.

    This refactoring also alleviates some subtle race condition issues by not
    opening ELF images twice to check the buildid and then load the symbols, which
    could lead to weirdness if an image is replaced under our feet.

    Signed-off-by: Dave Martin
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Dave Martin
     
  • Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • So that we can reduce the noise on valgrind when looking for memory
    leaks.

    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

28 Jul, 2010

1 commit


27 Jul, 2010

8 commits

  • Tidy-up patch to remove some code and struct perf_session data members
    which are no longer needed due to the previous patch: "perf tools: Don't
    abbreviate file paths relative to the cwd".

    LKML-Reference:
    Signed-off-by: Dave Martin
    Signed-off-by: Arnaldo Carvalho de Melo

    Dave Martin
     
  • This avoids some problems where the full path of executables and DSOs is
    needed for finding debug symbols on platforms with separated debug symbol files
    such as Ubuntu. This is simpler than tracking an extra name for each image.

    The only impact should be that paths in verbose output from the perf tools
    become absolute, instead of relative to the cwd.

    LKML-Reference:
    Signed-off-by: Dave Martin
    Signed-off-by: Arnaldo Carvalho de Melo

    Dave Martin
     
  • The stock newt checkbox tree widget we were using was not really
    suitable for hist entry + callchain browsing.

    The problems with it were manifold:

    - We needed to traverse the whole hist_entry rb_tree to add each entry +
    callchains beforehand.

    - No control over the colors used for each row

    So a new tree widget, based mostly on slang, was written.

    It extends the ui_browser class already used for annotate to allow the
    user to fold/unfold branches in the callchains tree. It uses extra fields
    in the symbol_map class, which is embedded in hist_entry and
    callchain_node instances, to store the folding state. When this state
    changes, it calculates the number of rows that are produced when showing
    a particular hist_entry instance.

    This greatly speeds up browsing as we don't have to upfront touch all
    the entries and only calculate callchain related operations when some
    callchain branch is actually unfolded.

    The memory footprint is also reduced as the data structure is not
    duplicated; just some extra fields for controlling callchain state and to
    simplify the process of seeking through entries (nr_rows, row_offset) were
    added.
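
The lazy row counting can be sketched as a tree whose nodes carry their own folding state. This is only an illustration of the idea described above, not the slang-based widget: the `Node` class, `count_rows`, and the sample names are hypothetical.

```python
class Node:
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
        self.unfolded = False  # folding state kept on the node itself


def count_rows(node):
    """Rows needed to display node: itself plus any visible descendants."""
    rows = 1
    if node.unfolded:
        # Children are only visited once their parent is unfolded.
        rows += sum(count_rows(child) for child in node.children)
    return rows


# Hypothetical entry with a small callchain below it.
branch = Node("do_syscall", [Node("sys_read"), Node("vfs_read")])
entry = Node("main", [branch])

folded = count_rows(entry)
entry.unfolded = True
one_level = count_rows(entry)
branch.unfolded = True
all_rows = count_rows(entry)
```

A folded entry costs one row and no traversal of its callchain, which is why nothing has to be touched up front.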

    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • So that we gain two columns and look more like classical (at least in
    TUIs) scroll bars.

    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • When we call ui_browser__show we may have called
    ui_browser__refresh_dimensions to check if the maximum length for the
    contained entries changed, such as when zooming in and out DSOs or
    threads in the hist browser.

    For that to happen we must delete the old form, which will take care of
    deleting the vertical scrollbar, etc., and then recreate them with the
    new dimensions.

    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Will be used to figure out the window width needed in the new tree
    widget.

    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Since version 0.3 from Kristian's repository, there should actually be
    no change in functionality except for the x86-64 fix. Nevertheless,
    make it distinct from the original nosy-dump --- just in case and also
    because of potential future changes.

    Signed-off-by: Stefan Richter

    Stefan Richter