22 Feb, 2012

1 commit

  • Consolidate the uprobes code under kernel/events/, where the various
    core kernel event handling routines live.

    Acked-by: Peter Zijlstra
    Cc: Srikar Dronamraju
    Cc: Jim Keniston
    Cc: Oleg Nesterov
    Cc: Masami Hiramatsu
    Cc: Arnaldo Carvalho de Melo
    Cc: Anton Arapov
    Cc: Ananth N Mavinakayanahalli
    Link: http://lkml.kernel.org/n/tip-biuyhhwohxgbp2vzbap5yr8o@git.kernel.org
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

17 Feb, 2012

3 commits

  • Make the uprobes code readable to me:

    - improve the Kconfig text so that a mere mortal gets some idea
    what CONFIG_UPROBES=y is really about

    - do trivial renames to standardize around the uprobes_*() namespace

    - clean up and simplify various code flow details

    - separate basic blocks of functionality

    - line break artifact and white space related removal

    - use standard local variable definition blocks

    - use vertical spacing to make things more readable

    - remove unnecessary volatile

    - restructure comment blocks to make them more uniform and
    more readable in general

    Cc: Srikar Dronamraju
    Cc: Jim Keniston
    Cc: Peter Zijlstra
    Cc: Oleg Nesterov
    Cc: Masami Hiramatsu
    Cc: Arnaldo Carvalho de Melo
    Cc: Anton Arapov
    Cc: Ananth N Mavinakayanahalli
    Link: http://lkml.kernel.org/n/tip-ewbwhb8o6navvllsauu7k07p@git.kernel.org
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Add uprobes support to the core kernel, with x86 support.

    This commit adds the kernel facilities; the actual uprobes
    user-space ABI and perf probe support come in later commits.

    General design:

    Uprobes are maintained in an rb-tree indexed by inode and offset
    (the offset here is from the start of the mapping). For a unique
    (inode, offset) tuple, there can be at most one uprobe in the
    rb-tree.
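
    A minimal sketch of the ordering such a tree could use, assuming a
    hypothetical key type (struct uprobe_key and uprobe_key_cmp() are
    illustrative names, not taken from the patch):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical key type for this sketch; the kernel code keys on a
 * struct inode pointer and a file offset instead. */
struct uprobe_key {
    uintptr_t inode;   /* stands in for the inode pointer */
    int64_t offset;    /* offset of the probed instruction */
};

/* Three-way comparison: order by inode first, then by offset.
 * Two keys compare equal only if both fields match, which is what
 * limits the tree to one node per (inode, offset) tuple. */
int uprobe_key_cmp(const struct uprobe_key *l, const struct uprobe_key *r)
{
    assert(l && r);

    if (l->inode != r->inode)
        return l->inode < r->inode ? -1 : 1;
    if (l->offset != r->offset)
        return l->offset < r->offset ? -1 : 1;
    return 0;
}
```

    An insert that finds an existing node comparing equal would reuse
    that node rather than add a second one.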

    Since the (inode, offset) tuple identifies a unique uprobe, more
    than one user may be interested in the same uprobe. This provides
    the ability to connect multiple 'consumers' to the same uprobe.

    Each consumer defines a handler and a filter (optional). The
    'handler' is run every time the uprobe is hit, if it matches the
    'filter' criteria.

    The first consumer of a uprobe causes the breakpoint to be
    inserted at the specified address. Each subsequent consumer is
    appended to the uprobe's existing list of consumers. The
    breakpoint is removed when the last consumer unregisters; any
    other unregistration only removes that consumer from the list.
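
    The consumer lifecycle can be sketched in user space as follows.
    This is an illustration only: struct probe, struct consumer and both
    functions are hypothetical names, a boolean stands in for the real
    breakpoint, and the sketch pushes new consumers at the head of the
    list for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct consumer {
    struct consumer *next;
};

struct probe {
    struct consumer *consumers;   /* singly linked list of consumers */
    bool breakpoint_inserted;     /* stands in for the real breakpoint */
};

/* Add a consumer; only the first one triggers breakpoint insertion. */
void probe_register(struct probe *p, struct consumer *c)
{
    bool first;

    assert(p && c);
    first = (p->consumers == NULL);
    c->next = p->consumers;
    p->consumers = c;
    if (first)
        p->breakpoint_inserted = true;
}

/* Remove a consumer; only the last one triggers breakpoint removal.
 * Returns false if the consumer was not on the list. */
bool probe_unregister(struct probe *p, struct consumer *c)
{
    struct consumer **link;

    assert(p && c);
    for (link = &p->consumers; *link; link = &(*link)->next) {
        if (*link == c) {
            *link = c->next;
            c->next = NULL;
            if (p->consumers == NULL)
                p->breakpoint_inserted = false;
            return true;
        }
    }
    return false;
}
```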

    Given an inode, we get a list of the mms that have mapped the
    inode. We do the actual registration if the mm maps the page where
    a probe needs to be inserted/removed.

    We use a temporary list to walk through the vmas that map the
    inode.

    - The number of mappings of the inode is not known before we
    walk the rmap, and it keeps changing.
    - Extending vm_area_struct wasn't recommended; it's a
    size-critical data structure.
    - There can be more than one mapping of the inode in the same mm.

    We add callbacks to the mmap methods to keep an eye on text vmas
    that are of interest to uprobes. When a vma of interest is mapped,
    we insert the breakpoint at the right address.

    Uprobe works by replacing the instruction at the address defined
    by (inode, offset) with the arch specific breakpoint
    instruction. We save a copy of the original instruction at the
    uprobed address.

    This is needed for:

    a. executing the instruction out-of-line (xol).
    b. instruction analysis for any subsequent fixups.
    c. restoring the instruction back when the uprobe is unregistered.

    We insert or delete a breakpoint instruction, and this
    breakpoint instruction is assumed to be the smallest instruction
    available on the platform. For fixed size instruction platforms
    this is trivially true, for variable size instruction platforms
    the breakpoint instruction is typically the smallest (often a
    single byte).

    Writing the instruction is done by COWing the page and changing
    the instruction during the copy, even though most platforms
    allow atomic writes of the breakpoint instruction. This also
    mirrors the behaviour of a ptrace() memory write to a PRIVATE
    file map.

    The core worker is derived from KSM's replace_page() logic.

    In essence, similar to KSM:

    a. allocate a new page and copy over contents of the page that
    has the uprobed vaddr
    b. modify the copy and insert the breakpoint at the required
    address
    c. switch the original page with the copy containing the
    breakpoint
    d. flush page tables.
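
    Steps a. and b. can be imitated in user space as below. This is a
    sketch under loud assumptions: patched_copy() and the SKETCH_*
    constants are hypothetical, malloc() stands in for page allocation,
    and steps c. and d. (switching the page and flushing page tables)
    have no user-space equivalent here. 0xcc is the one-byte x86 int3
    opcode.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE  4096
#define SKETCH_BREAKPOINT 0xcc  /* x86 int3, a single byte */

/* Copy a page, save the original byte at 'off' into *saved, and
 * write the breakpoint into the copy. The original page is left
 * untouched. Returns the copy (caller frees) or NULL on failure. */
uint8_t *patched_copy(const uint8_t *page, size_t off, uint8_t *saved)
{
    uint8_t *copy;

    assert(page && saved);
    if (off >= SKETCH_PAGE_SIZE)
        return NULL;

    copy = malloc(SKETCH_PAGE_SIZE);
    if (!copy)
        return NULL;

    memcpy(copy, page, SKETCH_PAGE_SIZE);   /* step a */
    *saved = copy[off];                     /* keep the original opcode */
    copy[off] = SKETCH_BREAKPOINT;          /* step b */
    return copy;
}
```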

    replace_page() is being replicated here because of some minor
    changes in the type of pages and also because Hugh Dickins had
    plans to improve replace_page() for KSM specific work.

    Instruction analysis on x86 is based on the instruction decoder;
    it determines whether an instruction can be probed and which
    fixups are necessary after singlestep. Instruction analysis is done
    at probe insertion time so that we avoid having to repeat the
    same analysis every time a probe is hit.

    A lot of code here is due to the improvement/suggestions/inputs
    from Peter Zijlstra.

    Changelog:

    (v10):
    - Add code to clear REX.B prefix as suggested by Denys Vlasenko
    and Masami Hiramatsu.

    (v9):
    - Use insn_offset_modrm as suggested by Masami Hiramatsu.

    (v7):

    Handle comments from Peter Zijlstra:

    - Don't take reference to inode. (expect inode to uprobe_register to be sane).
    - Use PTR_ERR to set the return value.
    - No need to take reference to inode.
    - use PTR_ERR to return error value.
    - register and uprobe_unregister share code.

    (v5):

    - Modified del_consumer as per comments from Peter.
    - Drop reference to inode before dropping reference to uprobe.
    - Use i_size_read(inode) instead of inode->i_size.
    - Ensure uprobe->consumers is NULL, before __uprobe_unregister() is called.
    - Includes errno.h as recommended by Stephen Rothwell to fix a build issue
    on sparc defconfig
    - Remove restrictions while unregistering.
    - Earlier code leaked inode references under some conditions while
    registering/unregistering.
    - Continue the vma-rmap walk even if the intermediate vma doesn't
    meet the requirements.
    - Validate the vma found by find_vma before inserting/removing the
    breakpoint
    - Call del_consumer under mutex_lock.
    - Use hash locks.
    - Handle mremap.
    - Introduce find_least_offset_node() instead of close match logic in
    find_uprobe
    - Uprobes no longer depends on MM_OWNER; No reference to task_structs
    while inserting/removing a probe.
    - Uses read_mapping_page instead of grab_cache_page so that the pages
    have valid content.
    - pass NULL to get_user_pages for the task parameter.
    - call SetPageUptodate on the new page allocated in write_opcode.
    - fix leaking a reference to the new page under certain conditions.
    - Include Instruction Decoder if Uprobes gets defined.
    - Remove const attributes for instruction prefix arrays.
    - Uses mm_context to know if the application is 32 bit.

    Signed-off-by: Srikar Dronamraju
    Also-written-by: Jim Keniston
    Reviewed-by: Peter Zijlstra
    Cc: Oleg Nesterov
    Cc: Andi Kleen
    Cc: Christoph Hellwig
    Cc: Steven Rostedt
    Cc: Roland McGrath
    Cc: Masami Hiramatsu
    Cc: Arnaldo Carvalho de Melo
    Cc: Anton Arapov
    Cc: Ananth N Mavinakayanahalli
    Cc: Stephen Rothwell
    Cc: Denys Vlasenko
    Cc: Peter Zijlstra
    Cc: Linus Torvalds
    Cc: Andrew Morton
    Cc: Linux-mm
    Link: http://lkml.kernel.org/r/20120209092642.GE16600@linux.vnet.ibm.com
    [ Made various small edits to the commit log ]
    Signed-off-by: Ingo Molnar

    Srikar Dronamraju
     
  • …/acme/linux into perf/core

    Includes smaller fixes and improvements plus the exclude_{host,guest} feature
    test and fallback to handle older kernels.

    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

    Ingo Molnar
     

15 Feb, 2012

2 commits

  • Instead of requiring that users of perf_record_opts set
    .sample_id_all_avail to true, just invert the logic, using
    .sample_id_all_missing, which doesn't need to be explicitly initialized
    since gcc will zero members omitted in a struct initialization.

    Just like the newly introduced .exclude_{guest,host} feature test.
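
    The effect of inverting the flag can be sketched like this. The
    struct below is a hypothetical stand-in for perf_record_opts (only
    the .sample_id_all_missing name comes from the text above); the
    point is that members left out of an initializer start out as zero,
    so the "missing" flag defaults to false without being mentioned.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for perf_record_opts. */
struct record_opts_sketch {
    bool sample_id_all_missing;  /* inverted: false means "assume available" */
    unsigned int freq;
};

/* The initializer never mentions the flag, so it starts out false. */
struct record_opts_sketch default_opts(void)
{
    struct record_opts_sketch opts = { .freq = 1000 };

    return opts;
}

bool sample_id_all_usable(const struct record_opts_sketch *opts)
{
    assert(opts);
    return !opts->sample_id_all_missing;
}
```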

    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-ab772uzk78cwybihf0vt7kxw@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Just fall back to resetting those fields, if set, warning the user that
    the feature is not available.

    If guest samples appear they will just be discarded because no struct
    machine will be found and thus the event will be accounted as not
    handled and dropped, see 0c09571.

    Reported-by: Namhyung Kim
    Tested-by: Joerg Roedel
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Joerg Roedel
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-vuwxig36mzprl5n7nzvnxxsh@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

14 Feb, 2012

14 commits

  • The perf_event_attr size needs to be initialized in all cases because it
    captures the ABI version.

    This patch moves the initialization of the field from the
    perf_event_open() syscall stub to its proper location in the
    event_attr_init().

    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20120209151238.GA10272@quad
    Signed-off-by: Stephane Eranian
    Signed-off-by: Arnaldo Carvalho de Melo

    Stephane Eranian
     
  • There is individual code for each feature to process header sections.

    Add a function pointer .process to struct feature_ops to keep the
    implementations in separate functions. The code that processes header
    sections is now a generic function.
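
    A sketch of this kind of table-driven dispatch, with hypothetical
    types and feature ids (only the .process member name comes from the
    text above; the real struct feature_ops has a different layout):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical section descriptor and feature ids. */
struct section_sketch {
    int feat;
    int size;
};

enum { FEAT_A, FEAT_B, FEAT_MAX };

struct feature_ops_sketch {
    int (*process)(const struct section_sketch *sec, int *total);
};

/* Per-feature handler: here it just accumulates the section size. */
int process_feat_a(const struct section_sketch *sec, int *total)
{
    *total += sec->size;
    return 0;
}

/* One table entry per feature; FEAT_B deliberately has no handler. */
static const struct feature_ops_sketch feat_ops[FEAT_MAX] = {
    [FEAT_A] = { .process = process_feat_a },
};

/* Generic entry point: look the feature up and call its handler.
 * Unknown features are an error; features without a handler are skipped. */
int process_section(const struct section_sketch *sec, int *total)
{
    assert(sec && total);
    if (sec->feat < 0 || sec->feat >= FEAT_MAX)
        return -1;
    if (!feat_ops[sec->feat].process)
        return 0;
    return feat_ops[sec->feat].process(sec, total);
}
```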

    Cc: Ingo Molnar
    Link: http://lkml.kernel.org/r/1328884916-5901-2-git-send-email-robert.richter@amd.com
    Signed-off-by: Robert Richter
    Signed-off-by: Arnaldo Carvalho de Melo

    Robert Richter
     
  • Needed for later changes. No modified functionality.

    Cc: Ingo Molnar
    Link: http://lkml.kernel.org/r/1328884916-5901-1-git-send-email-robert.richter@amd.com
    Signed-off-by: Robert Richter
    Signed-off-by: Arnaldo Carvalho de Melo

    Robert Richter
     
  • Adding an implementation of the bitmap_or function to the bitmap object. It is
    stolen from the kernel lib/bitmap.o object.

    It is used in upcoming patches.
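
    For illustration, a word-by-word OR over two bitmaps can look like
    the sketch below. The _SKETCH names are hypothetical, and this is
    not a copy of the kernel or perf code.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG_SKETCH   (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS_SKETCH(n) \
    (((n) + BITS_PER_LONG_SKETCH - 1) / BITS_PER_LONG_SKETCH)

/* dst = a | b over the first 'nbits' bits, one unsigned long at a time.
 * All three arrays must hold at least BITS_TO_LONGS_SKETCH(nbits) words. */
void bitmap_or_sketch(unsigned long *dst, const unsigned long *a,
                      const unsigned long *b, unsigned int nbits)
{
    size_t i, nwords = BITS_TO_LONGS_SKETCH(nbits);

    assert(dst && a && b);
    for (i = 0; i < nwords; i++)
        dst[i] = a[i] | b[i];
}
```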

    Cc: Corey Ashford
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1327674868-10486-5-git-send-email-jolsa@redhat.com
    Signed-off-by: Jiri Olsa
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Adding a sysfs object to provide sysfs mount information in the same way
    as the debugfs object does.

    The object provides the following function:
    sysfs_find_mountpoint

    which returns the sysfs mount point.

    Cc: Corey Ashford
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1327674868-10486-4-git-send-email-jolsa@redhat.com
    Signed-off-by: Jiri Olsa
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • The following debugfs object functions are not referenced
    within the code:

    int debugfs_valid_entry(const char *path);
    int debugfs_umount(void);
    int debugfs_write(const char *entry, const char *value);
    int debugfs_read(const char *entry, char *buffer, size_t size);
    void debugfs_force_cleanup(void);
    int debugfs_make_path(const char *element, char *buffer, int size);

    Removing them.

    Cc: Corey Ashford
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1327674868-10486-3-git-send-email-jolsa@redhat.com
    Signed-off-by: Jiri Olsa
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • The ctype.h in symbol.c was needed because of isupper(). However, now that
    we have it in util.h, symbol.c can be changed to use our implementation.

    Cc: Ingo Molnar
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1328836217-9118-3-git-send-email-namhyung.kim@lge.com
    Signed-off-by: Namhyung Kim
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
  • The implementation of sane ctype macros only depends on symbols in
    util.h not cache.h.

    Cc: Ingo Molnar
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1328836217-9118-2-git-send-email-namhyung.kim@lge.com
    Signed-off-by: Namhyung Kim
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
  • The util.h header provides various ctype macros but lacks those two.

    Add them.

    Cc: Ingo Molnar
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1328836217-9118-1-git-send-email-namhyung.kim@lge.com
    Signed-off-by: Namhyung Kim
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
  • Setting perf_guest to true by default makes no sense because the perf
    subcommands cannot set up guest symbol information and thus cannot process
    any guest samples. The only exception is perf-kvm which changes the
    perf_guest value on its own. So change the default for perf_guest back
    to false.

    Cc: David Ahern
    Cc: Ingo Molnar
    Cc: Jason Wang
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1328893505-4115-3-git-send-email-joerg.roedel@amd.com
    Signed-off-by: Joerg Roedel
    Signed-off-by: Arnaldo Carvalho de Melo

    Joerg Roedel
     
  • The perf sample processing code relies on a valid machine object. Make
    sure that this path is only entered when such an object exists.

    A counter for samples where no machine object exists is also introduced
    to give the user a message about these samples.

    Reported-by: David Ahern
    Reported-by: Jason Wang
    Cc: David Ahern
    Cc: Ingo Molnar
    Cc: Jason Wang
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1328893505-4115-2-git-send-email-joerg.roedel@amd.com
    Signed-off-by: Joerg Roedel
    Signed-off-by: Arnaldo Carvalho de Melo

    Joerg Roedel
     
  • Allow a user to collect events for multiple threads or processes
    using a comma separated list.

    e.g., collect data on a VM and its vhost thread:
    perf top -p 21483,21485
    perf stat -p 21483,21485 -ddd
    perf record -p 21483,21485

    or monitoring vcpu threads
    perf top -t 21488,21489
    perf stat -t 21488,21489 -ddd
    perf record -t 21488,21489
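
    Parsing such a list can be sketched as below. parse_id_list() is a
    hypothetical helper written for this illustration, not the function
    the patch adds.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parse a comma separated list such as "21483,21485" into out[].
 * Returns the number of ids stored, or -1 on malformed input
 * (empty element, trailing comma, non-digits) or when more than
 * 'max' ids are given. An empty string yields 0. */
int parse_id_list(const char *str, int *out, int max)
{
    const char *p = str;
    int n = 0;

    assert(str && out);
    while (*p) {
        char *end;
        long v;

        errno = 0;
        v = strtol(p, &end, 10);
        if (end == p || errno || v < 0 || v > INT_MAX || n >= max)
            return -1;
        out[n++] = (int)v;

        if (*end == ',') {
            p = end + 1;
            if (*p == '\0')     /* trailing comma */
                return -1;
        } else if (*end == '\0') {
            p = end;
        } else {
            return -1;          /* unexpected character */
        }
    }
    return n;
}
```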

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1328718772-16688-1-git-send-email-dsahern@gmail.com
    Signed-off-by: David Ahern
    Signed-off-by: Arnaldo Carvalho de Melo

    David Ahern
     
  • Compiles are failing on the latest tip/perf/core tree:

    GEN common-cmds.h
    make: *** No rule to make target `../../arch/x86/lib/memset_64.S', needed by `builtin-annotate.o'. Stop.

    Resolve by adding memset.* to the tar file.

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1329145057-26302-1-git-send-email-dsahern@gmail.com
    Signed-off-by: David Ahern
    Signed-off-by: Arnaldo Carvalho de Melo

    David Ahern
     
  • The perf python extension (perf.so) file lacks its dependencies in the
    Makefile, so it cannot be refreshed when one of the source files it depends
    on is changed. Fix it by putting them in a separate file and processing it in
    both the Makefile and setup.py.

    Reported-by: Arnaldo Carvalho de Melo
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1329043524-12470-1-git-send-email-namhyung@gmail.com
    Signed-off-by: Namhyung Kim
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     

11 Feb, 2012

4 commits

  • Fix decoding of grouped AVX with VEX pp bits, which should be
    handled the same as last-prefixes. This fixes the warnings below
    in posttest with CONFIG_CRYPTO_SHA1_SSSE3=y.

    Warning: arch/x86/tools/test_get_len found difference at :ffffffff810d5fc0
    Warning: ffffffff810d6069: c5 f9 73 de 04 vpsrldq $0x4,%xmm6,%xmm0
    Warning: objdump says 5 bytes, but insn_get_length() says 4
    ...

    With this change, test_get_len can decode it correctly.

    $ arch/x86/tools/test_get_len -v -y
    ffffffff810d6069: c5 f9 73 de 04 vpsrldq $0x4,%xmm6,%xmm0
    Succeed: decoded and checked 1 instructions

    Reported-by: Ingo Molnar
    Signed-off-by: Masami Hiramatsu
    Cc: yrl.pp-manager.tt@hitachi.com
    Link: http://lkml.kernel.org/r/20120210053340.30429.73410.stgit@localhost.localdomain
    Signed-off-by: Ingo Molnar

    Masami Hiramatsu
     
  • Reflect the change in the soft and hard lockup thresholds and
    their relation to the frequency of the hrtimer and NMI events in
    the code comments. While at it, remove references to files that
    do not exist anymore.

    Signed-off-by: Fernando Luis Vazquez Cao
    Signed-off-by: Don Zickus
    Link: http://lkml.kernel.org/r/1328827342-6253-3-git-send-email-dzickus@redhat.com
    Signed-off-by: Ingo Molnar

    Fernando Luis Vázquez Cao
     
  • The soft and hard lockup thresholds have changed so the
    corresponding Kconfig entries need to be updated accordingly.
    Add a reference to watchdog_thresh while at it.

    Signed-off-by: Fernando Luis Vazquez Cao
    Signed-off-by: Don Zickus
    Link: http://lkml.kernel.org/r/1328827342-6253-2-git-send-email-dzickus@redhat.com
    Signed-off-by: Ingo Molnar

    Fernando Luis Vázquez Cao
     
  • The soft and hard lockup detectors are now built on top of the
    hrtimer and perf subsystems. Update the documentation
    accordingly.

    Signed-off-by: Fernando Luis Vazquez Cao
    Acked-by: Randy Dunlap
    Signed-off-by: Don Zickus
    Link: http://lkml.kernel.org/r/1328827342-6253-1-git-send-email-dzickus@redhat.com
    Signed-off-by: Ingo Molnar

    Fernando Luis Vázquez Cao
     

09 Feb, 2012

2 commits

  • A recent refactoring of perf-record introduced the following:

    perf record -a -B
    Couldn't generating buildids. Use --no-buildid to profile anyway.
    sleep: Terminated

    I believe the triple negative was meant to be only a double negative.
    :-) While I'm there, fixed the grammar on the error message.

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Robert Richter
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1328567272-13190-1-git-send-email-dsahern@gmail.com
    Signed-off-by: David Ahern
    Signed-off-by: Arnaldo Carvalho de Melo

    David Ahern
     
  • The current version of perf detects whether or not the perf.data file is
    written in a different endianness using the attr_size field in the
    header of the file. This field represents sizeof(struct perf_event_attr)
    as known to perf record. If the sizes do not match, then perf tries the
    byte-swapped version. If the byte-swapped size matches, then the tool
    assumes a different endianness.

    The issue with the approach is that it assumes the size of
    perf_event_attr always has to match between perf record and perf report.
    However, the kernel perf_event ABI is extensible. New fields can be
    added to struct perf_event_attr. Consequently, it is not possible to use
    attr_size to detect endianness.

    This patch takes another approach by using the magic number written at
    the beginning of the perf.data file to detect endianness. The magic
    number is an eight-byte signature. Its primary purpose is to identify
    a perf.data file, but it can also be used to encode the endianness.

    The patch introduces a new value for this signature. The key difference
    is that the signature is written differently in the file depending on
    the endianness. Thus, by comparing the signature from the file with the
    tool's own signature it is possible to detect endianness. The new
    signature is "PERFILE2".

    Backward compatibility with existing perf.data files is ensured.
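
    The comparison can be sketched as follows. The constant is simply
    the ASCII bytes of "PERFILE2" read as a little-endian 64-bit value,
    which is an assumption of this sketch, as are all the names; the
    idea is that a writer stores the 64-bit value in its native byte
    order, so a reader of the other endianness sees it byte-swapped.

```c
#include <assert.h>
#include <stdint.h>

/* "PERFILE2" as a little-endian 64-bit value (sketch assumption). */
#define MAGIC2_SKETCH 0x32454c4946524550ULL

/* Portable 64-bit byte swap: the lowest byte becomes the highest. */
uint64_t bswap64_sketch(uint64_t v)
{
    uint64_t r = 0;
    int i;

    for (i = 0; i < 8; i++) {
        r = (r << 8) | (v & 0xff);
        v >>= 8;
    }
    return r;
}

/* Returns 0 if the file has the reader's endianness, 1 if the rest
 * of the file needs byte swapping, -1 if the signature is unknown. */
int check_magic_sketch(uint64_t file_magic)
{
    if (file_magic == MAGIC2_SKETCH)
        return 0;
    if (file_magic == bswap64_sketch(MAGIC2_SKETCH))
        return 1;
    return -1;
}
```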

    Tested-by: David Ahern
    Acked-by: David Ahern
    Cc: Andi Kleen
    Cc: Anshuman Khandual
    Cc: Arun Sharma
    Cc: David Ahern
    Cc: Ingo Molnar
    Cc: Lin Ming
    Cc: Peter Zijlstra
    Cc: Roberto Agostino Vitillo
    Cc: Robert Richter
    Cc: Vince Weaver
    Link: http://lkml.kernel.org/r/1328187288-24395-15-git-send-email-eranian@google.com
    Signed-off-by: Stephane Eranian
    Signed-off-by: Arnaldo Carvalho de Melo

    Stephane Eranian
     

07 Feb, 2012

11 commits

  • Stephane Eranian reported that doing scheduler latency
    measurements with perf on AMD doesn't work out as expected
    because the sched_clock() granularity is too coarse: it is
    done in jiffies since sched_clock_stable is not set. If it
    were set, we would get to use the TSC as the sample source,
    which would give us much higher precision.

    However, there's no reason not to set sched_clock_stable on AMD
    because all families from F10h and upwards do have an invariant
    TSC and have the CPUID flag to prove it (CPUID_8000_0007_EDX[8]).

    Make it so, #1.
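
    Testing the flag mentioned above amounts to checking bit 8 of the
    EDX value returned by CPUID leaf 0x80000007. A sketch, assuming the
    caller has already obtained that EDX value (has_invariant_tsc() is
    a hypothetical helper name):

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 8 of CPUID 0x80000007 EDX signals an invariant TSC. */
bool has_invariant_tsc(uint32_t cpuid_8000_0007_edx)
{
    return ((cpuid_8000_0007_edx >> 8) & 1u) != 0;
}
```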

    Signed-off-by: Borislav Petkov
    Cc: Borislav Petkov
    Cc: Venki Pallipadi
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Arnaldo Carvalho de Melo
    Cc: Robert Richter
    Cc: Eric Dumazet
    Cc: Andreas Herrmann
    Link: http://lkml.kernel.org/r/20120206132546.GA30854@quad
    [ Should any non-standard system break the TSC, we should
    mark them so explicitly, in their platform init handler, or
    in a DMI quirk. ]
    Signed-off-by: Ingo Molnar

    Borislav Petkov
     
  • …/acme/linux into perf/core

    perf/core fixes and improvements.

    Signed-off-by: Ingo Molnar <mingo@elte.hu>

    Ingo Molnar
     
  • Linux 3.3-rc2

    Pick up the latest fixes.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • The output of cpu-clock event is controlled in nsec_printout(),
    but its alignment was broken:

    Performance counter stats for 'sleep 1':

    6,038,774 instructions # 0.00 insns per cycle
    180 faults # 0.007 K/sec [99.95%]
    1,282,201 branches # 0.053 M/sec [99.84%]
    24126.221811 cpu-clock [99.62%]
    24121.689540 task-clock # 24.098 CPUs utilized [99.52%]

    1.001001017 seconds time elapsed

    This patch fixes this:

    Performance counter stats for 'sleep 1':

    13,540,843 instructions # 0.00 insns per cycle
    180 faults # 0.007 K/sec [99.94%]
    2,875,386 branches # 0.119 M/sec [99.82%]
    24144.221137 cpu-clock [99.61%]
    24133.515366 task-clock # 24.109 CPUs utilized [99.52%]

    1.001020946 seconds time elapsed

    Cc: Ingo Molnar
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1328514285-26232-2-git-send-email-namhyung.kim@lge.com
    Signed-off-by: Namhyung Kim
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
  • The default 'M/sec' unit is not useful if the result is small enough.

    Adjust it dynamically according to the value.
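
    A sketch of this kind of unit selection, assuming the rate is
    computed from an event count and a runtime in nanoseconds.
    format_rate() and the 0.001 cut-off are assumptions of this
    illustration, not quoted from the patch.

```c
#include <assert.h>
#include <stdio.h>

/* Format an event rate as "<value> M/sec", switching to K/sec when
 * the M/sec value would be smaller than 0.001. Returns what
 * snprintf() returns. */
int format_rate(char *buf, size_t len, double events, double nsecs)
{
    double ratio = 0.0;
    char unit = 'M';

    assert(buf);
    if (nsecs > 0.0)
        ratio = 1000.0 * events / nsecs;  /* events per usec == M/sec */

    if (ratio < 0.001) {
        ratio *= 1000.0;                  /* rescale to K/sec */
        unit = 'K';
    }
    return snprintf(buf, len, "%.3f %c/sec", ratio, unit);
}
```

    With 180 events over 24121.689540 msec this gives "0.007 K/sec",
    and with 1,282,201 events over the same time "0.053 M/sec".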

    Cc: Ingo Molnar
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1328514285-26232-1-git-send-email-namhyung.kim@lge.com
    Signed-off-by: Namhyung Kim
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
  • Currently we can put the object files in a different directory by using
    the 'O=' command line argument.

    However, the generated documentation files don't honor this directive.

    This patch fixes that. It's been tested for the man target, but the others
    seem to be currently broken, so no tests have been done on them so far.

    Link: http://lkml.kernel.org/r/1328541443-18003-1-git-send-email-fbuihuu@gmail.com
    Signed-off-by: Franck Bui-Huu
    Signed-off-by: Arnaldo Carvalho de Melo

    Franck Bui-Huu
     
  • By adding the following objects:
    bench/mem-memset-x86-64-asm.o
    bench/mem-memcpy-x86-64-asm.o
    the x86_64 perf binary ended up with an executable stack.

    The reason was that the above objects are assembler sourced and are missing the
    GNU-stack note section. In such a case the linker assumes that the final binary
    should not be restricted at all and marks the stack as RWX.

    Adding a section ".note.GNU-stack" definition to the mentioned objects, with all
    flags disabled, thus omitting those objects from the linker stack flags decision.

    Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=783570
    Reported-by: Clark Williams
    Acked-by: Eric Dumazet
    Cc: Corey Ashford
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1328100848-5630-1-git-send-email-jolsa@redhat.com
    Signed-off-by: Jiri Olsa
    [ committer note: Remaining bits after what was already added to perf/urgent ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • So that we can get the perf bench exec stack fixes and then apply the
    remaining fix for the files added after what is in perf/urgent.

    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • This patch fixes an issue where perf report shows nan% for certain
    perf.data files. The below is from a report for a do_fork probe:

    -nan% sshd [kernel.kallsyms] [k] do_fork
    -nan% packagekitd [kernel.kallsyms] [k] do_fork
    -nan% dbus-daemon [kernel.kallsyms] [k] do_fork
    -nan% bash [kernel.kallsyms] [k] do_fork

    A git bisect shows commit f3bda2c as the cause. However, looking back
    through the git history, I saw commit 640c03c which seems to have
    removed the required initialization for perf_sample->period. The problem
    only started showing after commit f3bda2c. The below patch re-introduces
    the initialization and it fixes the problem for me.

    With the below patch, for the same perf.data:

    73.08% bash [kernel.kallsyms] [k] do_fork
    8.97% 11-dhclient [kernel.kallsyms] [k] do_fork
    6.41% sshd [kernel.kallsyms] [k] do_fork
    3.85% 20-chrony [kernel.kallsyms] [k] do_fork
    2.56% sendmail [kernel.kallsyms] [k] do_fork

    This patch applies over current linux-tip commit 9949284.

    Problem introduced in:

    $ git describe 640c03c
    v2.6.37-rc3-83-g640c03c

    Cc: Ananth N Mavinakayanahalli
    Cc: Ingo Molnar
    Cc: Robert Richter
    Cc: Srikar Dronamraju
    Cc: stable@kernel.org
    Link: http://lkml.kernel.org/r/20120203170113.5190.25558.stgit@localhost6.localdomain6
    Signed-off-by: Naveen N. Rao
    Signed-off-by: Arnaldo Carvalho de Melo

    Naveen N. Rao
     
  • In some ancient perf versions we used '[kernel.kallsyms._text]' as the
    name for the kernel map.

    This got changed with commit:
    perf: 'perf kvm' tool for monitoring guest performance from host
    commit a1645ce12adb6c9cc9e19d7695466204e3f017fe
    Author: Zhang, Yanmin

    and we started to use the following name: '[kernel.kallsyms]_text'.

    This name change is important for the report code dealing with ancient
    perf data. When processing the kernel map event, we need to recognize
    the old naming (don't match the last ']') and initialize the kernel map
    correctly.

    The subsequent call to maps__set_kallsyms_ref_reloc_sym deals with the
    superfluous ']' to get correct symbol name.
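
    The matching rule can be sketched as a prefix comparison that stops
    one character short of the final ']'. KERNEL_MMAP_NAME_SKETCH and
    is_kernel_mmap_sketch() are hypothetical names for this
    illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define KERNEL_MMAP_NAME_SKETCH "[kernel.kallsyms]"

/* Compare everything but the closing ']' so that both the old
 * '[kernel.kallsyms._text]' and the new '[kernel.kallsyms]_text'
 * spellings are recognized as the kernel map. */
bool is_kernel_mmap_sketch(const char *filename)
{
    assert(filename);
    return strncmp(filename, KERNEL_MMAP_NAME_SKETCH,
                   strlen(KERNEL_MMAP_NAME_SKETCH) - 1) == 0;
}
```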

    Cc: Corey Ashford
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1328461865-6127-1-git-send-email-jolsa@redhat.com
    Signed-off-by: Jiri Olsa
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • By adding the following object:
    bench/mem-memcpy-x86-64-asm.o
    the x86_64 perf binary ended up with an executable stack.

    The reason was that the above object is assembler sourced and is missing the
    GNU-stack note section. In such a case the linker assumes that the final binary
    should not be restricted at all and marks the stack as RWX.

    Adding a section ".note.GNU-stack" definition to the mentioned object, with all
    flags disabled, thus omitting this object from the linker stack flags decision.

    Problem introduced in:

    $ git describe ea7872b
    v2.6.37-rc2-19-gea7872b

    Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=783570
    Reported-by: Clark Williams
    Acked-by: Eric Dumazet
    Cc: Corey Ashford
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: stable@kernel.org
    Link: http://lkml.kernel.org/r/1328100848-5630-1-git-send-email-jolsa@redhat.com
    Signed-off-by: Jiri Olsa
    [ committer note: Backported fix to perf/urgent (3.3-rc2+) ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     

03 Feb, 2012

3 commits

  • With the new throttling/unthrottling code introduced with
    commit:

    e050e3f0a71b ("perf: Fix broken interrupt rate throttling")

    we occasionally hit two WARN_ON_ONCE() checks in:

    - intel_pmu_pebs_enable()
    - intel_pmu_lbr_enable()
    - x86_pmu_start()

    The assertions are no longer problematic. There is a valid
    path where they can trigger but it is harmless.

    The assertion can be triggered with:

    $ perf record -e instructions:pp ....

    Leading to paths:

    intel_pmu_pebs_enable
    intel_pmu_enable_event
    x86_perf_event_set_period
    x86_pmu_start
    perf_adjust_freq_unthr_context
    perf_event_task_tick
    scheduler_tick

    And:

    intel_pmu_lbr_enable
    intel_pmu_enable_event
    x86_perf_event_set_period
    x86_pmu_start
    perf_adjust_freq_unthr_context
    perf_event_task_tick
    scheduler_tick

    cpuc->enabled is always on because when we get to
    perf_adjust_freq_unthr_context() the PMU is not totally
    disabled. Furthermore when we need to adjust a period,
    we only stop the event we need to change and not the
    entire PMU. Thus, when we re-enable, cpuc->enabled is
    already set. Note that when we stop the event, both
    pebs and lbr are stopped if necessary (and possible).

    Signed-off-by: Stephane Eranian
    Cc: peterz@infradead.org
    Link: http://lkml.kernel.org/r/20120202110401.GA30911@quad
    Signed-off-by: Ingo Molnar

    Stephane Eranian
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
    rbd: fix safety of rbd_put_client()
    rbd: fix a memory leak in rbd_get_client()
    ceph: create a new session lock to avoid lock inversion
    ceph: fix length validation in parse_reply_info()
    ceph: initialize client debugfs outside of monc->mutex
    ceph: change "ceph.layout" xattr to be "ceph.file.layout"

    Linus Torvalds
     
  • Signed-off-by: Josh Triplett
    Signed-off-by: Linus Torvalds

    Josh Triplett