22 Feb, 2012
1 commit
-
Consolidate the uprobes code under kernel/events/, where the various
core kernel event handling routines live.Acked-by: Peter Zijlstra
Cc: Srikar Dronamraju
Cc: Jim Keniston
Cc: Oleg Nesterov
Cc: Masami Hiramatsu
Cc: Arnaldo Carvalho de Melo
Cc: Anton Arapov
Cc: Ananth N Mavinakayanahalli
Link: http://lkml.kernel.org/n/tip-biuyhhwohxgbp2vzbap5yr8o@git.kernel.org
Signed-off-by: Ingo Molnar
17 Feb, 2012
3 commits
-
Make the uprobes code readable to me:
- improve the Kconfig text so that a mere mortal gets some idea
what CONFIG_UPROBES=y is really about- do trivial renames to standardize around the uprobes_*() namespace
- clean up and simplify various code flow details
- separate basic blocks of functionality
- line break artifact and white space related removal
- use standard local varible definition blocks
- use vertical spacing to make things more readable
- remove unnecessary volatile
- restructure comment blocks to make them more uniform and
more readable in generalCc: Srikar Dronamraju
Cc: Jim Keniston
Cc: Peter Zijlstra
Cc: Oleg Nesterov
Cc: Masami Hiramatsu
Cc: Arnaldo Carvalho de Melo
Cc: Anton Arapov
Cc: Ananth N Mavinakayanahalli
Link: http://lkml.kernel.org/n/tip-ewbwhb8o6navvllsauu7k07p@git.kernel.org
Signed-off-by: Ingo Molnar -
Add uprobes support to the core kernel, with x86 support.
This commit adds the kernel facilities, the actual uprobes
user-space ABI and perf probe support comes in later commits.General design:
Uprobes are maintained in an rb-tree indexed by inode and offset
(the offset here is from the start of the mapping). For a unique
(inode, offset) tuple, there can be at most one uprobe in the
rb-tree.Since the (inode, offset) tuple identifies a unique uprobe, more
than one user may be interested in the same uprobe. This provides
the ability to connect multiple 'consumers' to the same uprobe.Each consumer defines a handler and a filter (optional). The
'handler' is run every time the uprobe is hit, if it matches the
'filter' criteria.The first consumer of a uprobe causes the breakpoint to be
inserted at the specified address and subsequent consumers are
appended to this list. On subsequent probes, the consumer gets
appended to the existing list of consumers. The breakpoint is
removed when the last consumer unregisters. For all other
unregisterations, the consumer is removed from the list of
consumers.Given a inode, we get a list of the mms that have mapped the
inode. Do the actual registration if mm maps the page where a
probe needs to be inserted/removed.We use a temporary list to walk through the vmas that map the
inode.- The number of maps that map the inode, is not known before we
walk the rmap and keeps changing.
- extending vm_area_struct wasn't recommended, it's a
size-critical data structure.
- There can be more than one maps of the inode in the same mm.We add callbacks to the mmap methods to keep an eye on text vmas
that are of interest to uprobes. When a vma of interest is mapped,
we insert the breakpoint at the right address.Uprobe works by replacing the instruction at the address defined
by (inode, offset) with the arch specific breakpoint
instruction. We save a copy of the original instruction at the
uprobed address.This is needed for:
a. executing the instruction out-of-line (xol).
b. instruction analysis for any subsequent fixups.
c. restoring the instruction back when the uprobe is unregistered.We insert or delete a breakpoint instruction, and this
breakpoint instruction is assumed to be the smallest instruction
available on the platform. For fixed size instruction platforms
this is trivially true, for variable size instruction platforms
the breakpoint instruction is typically the smallest (often a
single byte).Writing the instruction is done by COWing the page and changing
the instruction during the copy, this even though most platforms
allow atomic writes of the breakpoint instruction. This also
mirrors the behaviour of a ptrace() memory write to a PRIVATE
file map.The core worker is derived from KSM's replace_page() logic.
In essence, similar to KSM:
a. allocate a new page and copy over contents of the page that
has the uprobed vaddr
b. modify the copy and insert the breakpoint at the required
address
c. switch the original page with the copy containing the
breakpoint
d. flush page tables.replace_page() is being replicated here because of some minor
changes in the type of pages and also because Hugh Dickins had
plans to improve replace_page() for KSM specific work.Instruction analysis on x86 is based on instruction decoder and
determines if an instruction can be probed and determines the
necessary fixups after singlestep. Instruction analysis is done
at probe insertion time so that we avoid having to repeat the
same analysis every time a probe is hit.A lot of code here is due to the improvement/suggestions/inputs
from Peter Zijlstra.Changelog:
(v10):
- Add code to clear REX.B prefix as suggested by Denys Vlasenko
and Masami Hiramatsu.(v9):
- Use insn_offset_modrm as suggested by Masami Hiramatsu.(v7):
Handle comments from Peter Zijlstra:
- Dont take reference to inode. (expect inode to uprobe_register to be sane).
- Use PTR_ERR to set the return value.
- No need to take reference to inode.
- use PTR_ERR to return error value.
- register and uprobe_unregister share code.(v5):
- Modified del_consumer as per comments from Peter.
- Drop reference to inode before dropping reference to uprobe.
- Use i_size_read(inode) instead of inode->i_size.
- Ensure uprobe->consumers is NULL, before __uprobe_unregister() is called.
- Includes errno.h as recommended by Stephen Rothwell to fix a build issue
on sparc defconfig
- Remove restrictions while unregistering.
- Earlier code leaked inode references under some conditions while
registering/unregistering.
- Continue the vma-rmap walk even if the intermediate vma doesnt
meet the requirements.
- Validate the vma found by find_vma before inserting/removing the
breakpoint
- Call del_consumer under mutex_lock.
- Use hash locks.
- Handle mremap.
- Introduce find_least_offset_node() instead of close match logic in
find_uprobe
- Uprobes no more depends on MM_OWNER; No reference to task_structs
while inserting/removing a probe.
- Uses read_mapping_page instead of grab_cache_page so that the pages
have valid content.
- pass NULL to get_user_pages for the task parameter.
- call SetPageUptodate on the new page allocated in write_opcode.
- fix leaking a reference to the new page under certain conditions.
- Include Instruction Decoder if Uprobes gets defined.
- Remove const attributes for instruction prefix arrays.
- Uses mm_context to know if the application is 32 bit.Signed-off-by: Srikar Dronamraju
Also-written-by: Jim Keniston
Reviewed-by: Peter Zijlstra
Cc: Oleg Nesterov
Cc: Andi Kleen
Cc: Christoph Hellwig
Cc: Steven Rostedt
Cc: Roland McGrath
Cc: Masami Hiramatsu
Cc: Arnaldo Carvalho de Melo
Cc: Anton Arapov
Cc: Ananth N Mavinakayanahalli
Cc: Stephen Rothwell
Cc: Denys Vlasenko
Cc: Peter Zijlstra
Cc: Linus Torvalds
Cc: Andrew Morton
Cc: Linux-mm
Link: http://lkml.kernel.org/r/20120209092642.GE16600@linux.vnet.ibm.com
[ Made various small edits to the commit log ]
Signed-off-by: Ingo Molnar -
…/acme/linux into perf/core
Includes smaller fixes and improvements plus the exclude_{host,guest} feature
test and fallback to handle older kernels.Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
15 Feb, 2012
2 commits
-
Instead of requiring that users of perf_record_opts set
.sample_id_all_avail to true, just invert the logic, using
.sample_id_all_missing, that doesn't need to be explicitely initialized
since gcc will zero members ommitted in a struct initialization.Just like the newly introduced .exclude_{guest,host} feature test.
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-ab772uzk78cwybihf0vt7kxw@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
Just fall back to resetting those fields, if set, warning the user that
that feature is not available.If guest samples appear they will just be discarded because no struct
machine will be found and thus the event will be accounted as not
handled and dropped, see 0c09571.Reported-by: Namhyung Kim
Tested-by: Joerg Roedel
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Joerg Roedel
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-vuwxig36mzprl5n7nzvnxxsh@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
14 Feb, 2012
14 commits
-
The perf_event_attr size needs to be initialized in all cases because it
captures the ABI version.This patch moves the initialization of the field from the
perf_event_open() syscall stub to its proper location in the
event_attr_init().Cc: Ingo Molnar
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20120209151238.GA10272@quad
Signed-off-by: Stephane Eranian
Signed-off-by: Arnaldo Carvalho de Melo -
There is individual code for each feature to process header sections.
Adding a function pointer .process to struct feature_ops for keeping the
implementation in separate functions. Code to process header sections is
now a generic function.Cc: Ingo Molnar
Link: http://lkml.kernel.org/r/1328884916-5901-2-git-send-email-robert.richter@amd.com
Signed-off-by: Robert Richter
Signed-off-by: Arnaldo Carvalho de Melo -
Needed for later changes. No modified functionality.
Cc: Ingo Molnar
Link: http://lkml.kernel.org/r/1328884916-5901-1-git-send-email-robert.richter@amd.com
Signed-off-by: Robert Richter
Signed-off-by: Arnaldo Carvalho de Melo -
Adding implementation os bitmap_or function to the bitmap object. It is
stolen from the kernel lib/bitmap.o object.It is used in upcomming patches.
Cc: Corey Ashford
Cc: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1327674868-10486-5-git-send-email-jolsa@redhat.com
Signed-off-by: Jiri Olsa
Signed-off-by: Arnaldo Carvalho de Melo -
Adding sysfs object to provide sysfs mount information in the same way
as debugfs object does.The object provides following function:
sysfs_find_mountpointwhich returns the sysfs mount mount.
Cc: Corey Ashford
Cc: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1327674868-10486-4-git-send-email-jolsa@redhat.com
Signed-off-by: Jiri Olsa
Signed-off-by: Arnaldo Carvalho de Melo -
Following debugfs object functions are not referenced
within the code:int debugfs_valid_entry(const char *path);
int debugfs_umount(void);
int debugfs_write(const char *entry, const char *value);
int debugfs_read(const char *entry, char *buffer, size_t size);
void debugfs_force_cleanup(void);
int debugfs_make_path(const char *element, char *buffer, int size);Removing them.
Cc: Corey Ashford
Cc: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1327674868-10486-3-git-send-email-jolsa@redhat.com
Signed-off-by: Jiri Olsa
Signed-off-by: Arnaldo Carvalho de Melo -
The ctype.h in symbol.c was needed because of isupper(). However we now
have it in util.h, it can be changed to use our implementation.Cc: Ingo Molnar
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1328836217-9118-3-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim
Signed-off-by: Arnaldo Carvalho de Melo -
The implementation of sane ctype macros only depends on symbols in
util.h not cache.h.Cc: Ingo Molnar
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1328836217-9118-2-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim
Signed-off-by: Arnaldo Carvalho de Melo -
The util.h header provides various ctype macros but lacks those two.
Add them.
Cc: Ingo Molnar
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1328836217-9118-1-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim
Signed-off-by: Arnaldo Carvalho de Melo -
Setting perf_guest to true by default makes no sense because the perf
subcommands can not setup guest symbol information and thus not process
and guest samples. The only exception is perf-kvm which changes the
perf_guest value on its own. So change the default for perf_guest back
to false.Cc: David Ahern
Cc: Ingo Molnar
Cc: Jason Wang
Cc: Paul Mackerras
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1328893505-4115-3-git-send-email-joerg.roedel@amd.com
Signed-off-by: Joerg Roedel
Signed-off-by: Arnaldo Carvalho de Melo -
The perf sample processing code relies on a valid machine object. Make
sure that this path is only entered when such a object exists.A counter for samples where no machine object exits is also introduced
to give the user a message about these samples.Reported-by: David Ahern
Reported-by: Jason Wang
Cc: David Ahern
Cc: Ingo Molnar
Cc: Jason Wang
Cc: Paul Mackerras
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1328893505-4115-2-git-send-email-joerg.roedel@amd.com
Signed-off-by: Joerg Roedel
Signed-off-by: Arnaldo Carvalho de Melo -
Allow a user to collect events for multiple threads or processes
using a comma separated list.e.g., collect data on a VM and its vhost thread:
perf top -p 21483,21485
perf stat -p 21483,21485 -ddd
perf record -p 21483,21485or monitoring vcpu threads
perf top -t 21488,21489
perf stat -t 21488,21489 -ddd
perf record -t 21488,21489Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/1328718772-16688-1-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern
Signed-off-by: Arnaldo Carvalho de Melo -
For latest tip/perf/core tree Compiles are failing on:
GEN common-cmds.h
make: *** No rule to make target `../../arch/x86/lib/memset_64.S', needed by `builtin-annotate.o'. Stop.Resolve by adding memset.* to the tar file.
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/1329145057-26302-1-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern
Signed-off-by: Arnaldo Carvalho de Melo -
The perf python extention (perf.so) file lacks its dependencies in the
Makefile so that it cannot be refreshed if one of source files it depends
is changed. Fix it by putting them in a separate file and processing it in
both of Makefile and setup.py.Reported-by: Arnaldo Carvalho de Melo
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/1329043524-12470-1-git-send-email-namhyung@gmail.com
Signed-off-by: Namhyung Kim
Signed-off-by: Arnaldo Carvalho de Melo
11 Feb, 2012
4 commits
-
Fix to decode grouped AVX with VEX pp bits which should be
handled as same as last-prefixes. This fixes below warnings
in posttest with CONFIG_CRYPTO_SHA1_SSSE3=y.Warning: arch/x86/tools/test_get_len found difference at :ffffffff810d5fc0
Warning: ffffffff810d6069: c5 f9 73 de 04 vpsrldq $0x4,%xmm6,%xmm0
Warning: objdump says 5 bytes, but insn_get_length() says 4
...With this change, test_get_len can decode it correctly.
$ arch/x86/tools/test_get_len -v -y
ffffffff810d6069: c5 f9 73 de 04 vpsrldq $0x4,%xmm6,%xmm0
Succeed: decoded and checked 1 instructionsReported-by: Ingo Molnar
Signed-off-by: Masami Hiramatsu
Cc: yrl.pp-manager.tt@hitachi.com
Link: http://lkml.kernel.org/r/20120210053340.30429.73410.stgit@localhost.localdomain
Signed-off-by: Ingo Molnar -
Reflect the change in the soft and hard lockup thresholds and
their relation to the frequency of the hrtimer and NMI events in
the code comments. While at it, remove references to files that
do not exist anymore.Signed-off-by: Fernando Luis Vazquez Cao
Signed-off-by: Don Zickus
Link: http://lkml.kernel.org/r/1328827342-6253-3-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar -
The soft and hard lockup thresholds have changed so the
corresponding Kconfig entries need to be updated accordingly.
Add a reference to watchdog_thresh while at it.Signed-off-by: Fernando Luis Vazquez Cao
Signed-off-by: Don Zickus
Link: http://lkml.kernel.org/r/1328827342-6253-2-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar -
The soft and hard lockup detectors are now built on top of the
hrtimer and perf subsystems. Update the documentation
accordingly.Signed-off-by: Fernando Luis Vazquez Cao
Acked-by: Randy Dunlap
Signed-off-by: Don Zickus
Link: http://lkml.kernel.org/r/1328827342-6253-1-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar
09 Feb, 2012
2 commits
-
A recent refactoring of perf-record introduced the following:
perf record -a -B
Couldn't generating buildids. Use --no-buildid to profile anyway.
sleep: TerminatedI believe the triple negative was meant to be only a double negative.
:-) While I'm there, fixed the grammar on the error message.Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Robert Richter
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/1328567272-13190-1-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern
Signed-off-by: Arnaldo Carvalho de Melo -
The current version of perf detects whether or not the perf.data file is
written in a different endianness using the attr_size field in the
header of the file. This field represents sizeof(struct perf_event_attr)
as known to perf record. If the sizes do not match, then perf tries the
byte-swapped version. If they match, then the tool assumes a different
endianness.The issue with the approach is that it assumes the size of
perf_event_attr always has to match between perf record and perf report.
However, the kernel perf_event ABI is extensible. New fields can be
added to struct perf_event_attr. Consequently, it is not possible to use
attr_size to detect endianness.This patch takes another approach by using the magic number written at
the beginning of the perf.data file to detect endianness. The magic
number is an eight-byte signature. It's primary purpose is to identify
(signature) a perf.data file. But it could also be used to encode the
endianness.The patch introduces a new value for this signature. The key difference
is that the signature is written differently in the file depending on
the endianness. Thus, by comparing the signature from the file with the
tool's own signature it is possible to detect endianness. The new
signature is "PERFILE2".Backward compatiblity with existing perf.data file is ensured.
Tested-by: David Ahern
Acked-by: David Ahern
Cc: Andi Kleen
Cc: Anshuman Khandual
Cc: Arun Sharma
Cc: David Ahern
Cc: Ingo Molnar
Cc: Lin Ming
Cc: Peter Zijlstra
Cc: Roberto Agostino Vitillo
Cc: Robert Richter
Cc: Vince Weaver
Link: http://lkml.kernel.org/r/1328187288-24395-15-git-send-email-eranian@google.com
Signed-off-by: Stephane Eranian
Signed-off-by: Arnaldo Carvalho de Melo
07 Feb, 2012
11 commits
-
Stephane Eranian reported that doing a scheduler latency
measurements with perf on AMD doesn't work out as expected due
to the fact that the sched_clock() granularity is too coarse,
i.e. done in jiffies due to the sched_clock_stable not set,
which, if set, would mean that we get to use the TSC as sample
source which would give us much higher precision.However, there's no reason not to set sched_clock_stable on AMD
because all families from F10h and upwards do have an invariant
TSC and have the CPUID flag to prove (CPUID_8000_0007_EDX[8]).Make it so, #1.
Signed-off-by: Borislav Petkov
Cc: Borislav Petkov
Cc: Venki Pallipadi
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Arnaldo Carvalho de Melo
Cc: Robert Richter
Cc: Eric Dumazet
Cc: Andreas Herrmann
Link: http://lkml.kernel.org/r/20120206132546.GA30854@quad
[ Should any non-standard system break the TSC, we should
mark them so explicitly, in their platform init handler, or
in a DMI quirk. ]
Signed-off-by: Ingo Molnar -
…/acme/linux into perf/core
perf/core fixes and improvements.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Linux 3.3-rc2
Pick up the latest fixes.
Signed-off-by: Ingo Molnar
-
The output of cpu-clock event is controlled in nsec_printout(),
but its alignment was broken:Performance counter stats for 'sleep 1':
6,038,774 instructions # 0.00 insns per cycle
180 faults # 0.007 K/sec [99.95%]
1,282,201 branches # 0.053 M/sec [99.84%]
24126.221811 cpu-clock [99.62%]
24121.689540 task-clock # 24.098 CPUs utilized [99.52%]1.001001017 seconds time elapsed
This patch fixes this:
Performance counter stats for 'sleep 1':
13,540,843 instructions # 0.00 insns per cycle
180 faults # 0.007 K/sec [99.94%]
2,875,386 branches # 0.119 M/sec [99.82%]
24144.221137 cpu-clock [99.61%]
24133.515366 task-clock # 24.109 CPUs utilized [99.52%]1.001020946 seconds time elapsed
Cc: Ingo Molnar
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1328514285-26232-2-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim
Signed-off-by: Arnaldo Carvalho de Melo -
The default 'M/sec' unit is not useful if the result is small enough.
Adjust it dynamically according to the value.
Cc: Ingo Molnar
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1328514285-26232-1-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim
Signed-off-by: Arnaldo Carvalho de Melo -
Currently we can put the object files in a different directory by using
'O=' comand line argument.However the generated documentation files don't honor this directive,
This patch fixes that. It's been tested for man target but the others
seems currently broken so no tests have been done on them so far.Link: http://lkml.kernel.org/r/1328541443-18003-1-git-send-email-fbuihuu@gmail.com
Signed-off-by: Franck Bui-Huu
Signed-off-by: Arnaldo Carvalho de Melo -
By adding following objects:
bench/mem-memset-x86-64-asm.o
bench/mem-memcpy-x86-64-asm.o
the x86_64 perf binary ended up with executable stack.The reason was that above objects are assembler sourced and are missing the
GNU-stack note section. In such case the linker assumes that the final binary
should not be restricted at all and mark the stack as RWX.Adding section ".note.GNU-stack" definition to mentioned objects, with all
flags disabled, thus omiting those objects from linker stack flags decision.Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=783570
Reported-by: Clark Williams
Acked-by: Eric Dumazet
Cc: Corey Ashford
Cc: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1328100848-5630-1-git-send-email-jolsa@redhat.com
Signed-off-by: Jiri Olsa
[ committer note: Remaining bits after what was already added to perf/urgent ]
Signed-off-by: Arnaldo Carvalho de Melo -
So that we can get the perf bench exec stack fixes and then apply the
remaining fix for the files added after what is in perf/urgent.Signed-off-by: Arnaldo Carvalho de Melo
-
This patch fixes an issue where perf report shows nan% for certain
perf.data files. The below is from a report for a do_fork probe:-nan% sshd [kernel.kallsyms] [k] do_fork
-nan% packagekitd [kernel.kallsyms] [k] do_fork
-nan% dbus-daemon [kernel.kallsyms] [k] do_fork
-nan% bash [kernel.kallsyms] [k] do_forkA git bisect shows commit f3bda2c as the cause. However, looking back
through the git history, I saw commit 640c03c which seems to have
removed the required initialization for perf_sample->period. The problem
only started showing after commit f3bda2c. The below patch re-introduces
the initialization and it fixes the problem for me.With the below patch, for the same perf.data:
73.08% bash [kernel.kallsyms] [k] do_fork
8.97% 11-dhclient [kernel.kallsyms] [k] do_fork
6.41% sshd [kernel.kallsyms] [k] do_fork
3.85% 20-chrony [kernel.kallsyms] [k] do_fork
2.56% sendmail [kernel.kallsyms] [k] do_forkThis patch applies over current linux-tip commit 9949284.
Problem introduced in:
$ git describe 640c03c
v2.6.37-rc3-83-g640c03cCc: Ananth N Mavinakayanahalli
Cc: Ingo Molnar
Cc: Robert Richter
Cc: Srikar Dronamraju
Cc: stable@kernel.org
Link: http://lkml.kernel.org/r/20120203170113.5190.25558.stgit@localhost6.localdomain6
Signed-off-by: Naveen N. Rao
Signed-off-by: Arnaldo Carvalho de Melo -
In some perf ancient versions we used '[kernel.kallsyms._text]' as the
name for the kernel map.This got changed with commit:
perf: 'perf kvm' tool for monitoring guest performance from host
commit a1645ce12adb6c9cc9e19d7695466204e3f017fe
Author: Zhang, Yanminand we started to use following name '[kernel.kallsyms]_text'.
This name change is important for the report code dealing with ancient
perf data. When processing the kernel map event, we need to recognize
the old naming (dont match the last ']') and initialize the kernel map
correctly.The subsequent call to maps__set_kallsyms_ref_reloc_sym deals with the
superfluous ']' to get correct symbol name.Cc: Corey Ashford
Cc: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1328461865-6127-1-git-send-email-jolsa@redhat.com
Signed-off-by: Jiri Olsa
Signed-off-by: Arnaldo Carvalho de Melo -
By adding following objects:
bench/mem-memcpy-x86-64-asm.o
the x86_64 perf binary ended up with executable stack.The reason was that above object are assembler sourced and is missing the
GNU-stack note section. In such case the linker assumes that the final binary
should not be restricted at all and mark the stack as RWX.Adding section ".note.GNU-stack" definition to mentioned object, with all
flags disabled, thus omiting this object from linker stack flags decision.Problem introduced in:
$ git describe ea7872b
v2.6.37-rc2-19-gea7872bReported-at: https://bugzilla.redhat.com/show_bug.cgi?id=783570
Reported-by: Clark Williams
Acked-by: Eric Dumazet
Cc: Corey Ashford
Cc: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: stable@kernel.org
Link: http://lkml.kernel.org/r/1328100848-5630-1-git-send-email-jolsa@redhat.com
Signed-off-by: Jiri Olsa
[ committer note: Backported fix to perf/urgent (3.3-rc2+) ]
Signed-off-by: Arnaldo Carvalho de Melo
03 Feb, 2012
3 commits
-
With the new throttling/unthrottling code introduced with
commit:e050e3f0a71b ("perf: Fix broken interrupt rate throttling")
we occasionally hit two WARN_ON_ONCE() checks in:
- intel_pmu_pebs_enable()
- intel_pmu_lbr_enable()
- x86_pmu_start()The assertions are no longer problematic. There is a valid
path where they can trigger but it is harmless.The assertion can be triggered with:
$ perf record -e instructions:pp ....
Leading to paths:
intel_pmu_pebs_enable
intel_pmu_enable_event
x86_perf_event_set_period
x86_pmu_start
perf_adjust_freq_unthr_context
perf_event_task_tick
scheduler_tickAnd:
intel_pmu_lbr_enable
intel_pmu_enable_event
x86_perf_event_set_period
x86_pmu_start
perf_adjust_freq_unthr_context.
perf_event_task_tick
scheduler_tickcpuc->enabled is always on because when we get to
perf_adjust_freq_unthr_context() the PMU is not totally
disabled. Furthermore when we need to adjust a period,
we only stop the event we need to change and not the
entire PMU. Thus, when we re-enable, cpuc->enabled is
already set. Note that when we stop the event, both
pebs and lbr are stopped if necessary (and possible).Signed-off-by: Stephane Eranian
Cc: peterz@infradead.org
Link: http://lkml.kernel.org/r/20120202110401.GA30911@quad
Signed-off-by: Ingo Molnar -
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
rbd: fix safety of rbd_put_client()
rbd: fix a memory leak in rbd_get_client()
ceph: create a new session lock to avoid lock inversion
ceph: fix length validation in parse_reply_info()
ceph: initialize client debugfs outside of monc->mutex
ceph: change "ceph.layout" xattr to be "ceph.file.layout" -
Signed-off-by: Josh Triplett
Signed-off-by: Linus Torvalds