01 Aug, 2012
3 commits
-
Merge Andrew's second set of patches:
- MM
- a few random fixes
- a couple of RTC leftovers* emailed patches from Andrew Morton : (120 commits)
rtc/rtc-88pm80x: remove unneed devm_kfree
rtc/rtc-88pm80x: assign ret only when rtc_register_driver fails
mm: hugetlbfs: close race during teardown of hugetlbfs shared page tables
tmpfs: distribute interleave better across nodes
mm: remove redundant initialization
mm: warn if pg_data_t isn't initialized with zero
mips: zero out pg_data_t when it's allocated
memcg: gix memory accounting scalability in shrink_page_list
mm/sparse: remove index_init_lock
mm/sparse: more checks on mem_section number
mm/sparse: optimize sparse_index_alloc
memcg: add mem_cgroup_from_css() helper
memcg: further prevent OOM with too many dirty pages
memcg: prevent OOM with too many dirty pages
mm: mmu_notifier: fix freed page still mapped in secondary MMU
mm: memcg: only check anon swapin page charges for swap cache
mm: memcg: only check swap cache pages for repeated charging
mm: memcg: split swapin charge function into private and public part
mm: memcg: remove needless !mm fixup to init_mm when charging
mm: memcg: remove unneeded shmem charge type
... -
"fault-injection: add tool to run command with failslab or
fail_page_alloc" added tools/testing/fault-injection/failcmd.sh to make it
easier to inject slab/page allocation failures by fault injection.failcmd.sh prints the following warning when running with arguments
for command.# ./failcmd.sh echo aaa
failcmd.sh: line 209: [: echo: binary operator expected
aaaThis warning is caused by an improper check whether at least one
parameter is left after parsing command options.Fix it by testing the length of $1 instead of $@
Signed-off-by: Akinobu Mita
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Pull perf updates from Ingo Molnar:
"The biggest changes are Intel Nehalem-EX PMU uncore support, uprobes
updates/cleanups/fixes from Oleg and diverse tooling updates (mostly
fixes) now that Arnaldo is back from vacation."* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits)
uprobes: __replace_page() needs munlock_vma_page()
uprobes: Rename vma_address() and make it return "unsigned long"
uprobes: Fix register_for_each_vma()->vma_address() check
uprobes: Introduce vaddr_to_offset(vma, vaddr)
uprobes: Teach build_probe_list() to consider the range
uprobes: Remove insert_vm_struct()->uprobe_mmap()
uprobes: Remove copy_vma()->uprobe_mmap()
uprobes: Fix overflow in vma_address()/find_active_uprobe()
uprobes: Suppress uprobe_munmap() from mmput()
uprobes: Uprobe_mmap/munmap needs list_for_each_entry_safe()
uprobes: Clean up and document write_opcode()->lock_page(old_page)
uprobes: Kill write_opcode()->lock_page(new_page)
uprobes: __replace_page() should not use page_address_in_vma()
uprobes: Don't recheck vma/f_mapping in write_opcode()
perf/x86: Fix missing struct before structure name
perf/x86: Fix format definition of SNB-EP uncore QPI box
perf/x86: Make bitfield unsigned
perf/x86: Fix LLC-* and node-* events on Intel SandyBridge
perf/x86: Add Intel Nehalem-EX uncore support
perf/x86: Fix typo in format definition of uncore PCU filter
...
31 Jul, 2012
7 commits
-
Merge Andrew's first set of patches:
"Non-MM patches:- lots of misc bits
- tree-wide have_clk() cleanups
- quite a lot of printk tweaks. I draw your attention to "printk:
convert the format for KERN_ to a 2 byte pattern" which
looks a bit scary. But afaict it's solid.- backlight updates
- lib/ feature work (notably the addition and use of memweight())
- checkpatch updates
- rtc updates
- nilfs updates
- fatfs updates (partial, still waiting for acks)
- kdump, proc, fork, IPC, sysctl, taskstats, pps, etc
- new fault-injection feature work"
* Merge emailed patches from Andrew Morton : (128 commits)
drivers/misc/lkdtm.c: fix missing allocation failure check
lib/scatterlist: do not re-write gfp_flags in __sg_alloc_table()
fault-injection: add tool to run command with failslab or fail_page_alloc
fault-injection: add selftests for cpu and memory hotplug
powerpc: pSeries reconfig notifier error injection module
memory: memory notifier error injection module
PM: PM notifier error injection module
cpu: rewrite cpu-notifier-error-inject module
fault-injection: notifier error injection
c/r: fcntl: add F_GETOWNER_UIDS option
resource: make sure requested range is included in the root range
include/linux/aio.h: cpp->C conversions
fs: cachefiles: add support for large files in filesystem caching
pps: return PTR_ERR on error in device_create
taskstats: check nla_reserve() return
sysctl: suppress kmemleak messages
ipc: use Kconfig options for __ARCH_WANT_[COMPAT_]IPC_PARSE_VERSION
ipc: compat: use signed size_t types for msgsnd and msgrcv
ipc: allow compat IPC version field parsing if !ARCH_WANT_OLD_COMPAT_IPC
ipc: add COMPAT_SHMLBA support
... -
This adds tools/testing/fault-injection/failcmd.sh to run a command while
injecting slab/page allocation failures via fault injection.Example:
Run a command "make -C tools/testing/selftests/ run_tests" with
injecting slab allocation failure.# ./tools/testing/fault-injection/failcmd.sh \
-- make -C tools/testing/selftests/ run_testsSame as above except to specify 100 times failures at most instead of
one time at most by default.# ./tools/testing/fault-injection/failcmd.sh --times=100 \
-- make -C tools/testing/selftests/ run_testsSame as above except to inject page allocation failure instead of slab
allocation failure.# env FAILCMD_TYPE=fail_page_alloc \
./tools/testing/fault-injection/failcmd.sh --times=100 \
-- make -C tools/testing/selftests/ run_testsSigned-off-by: Akinobu Mita
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This adds two selftests
* tools/testing/selftests/cpu-hotplug/on-off-test.sh is testing script
for CPU hotplug1. Online all hot-pluggable CPUs
2. Offline all hot-pluggable CPUs
3. Online all hot-pluggable CPUs again
4. Exit if cpu-notifier-error-inject.ko is not available
5. Offline all hot-pluggable CPUs in preparation for testing
6. Test CPU hot-add error handling by injecting notifier errors
7. Online all hot-pluggable CPUs in preparation for testing
8. Test CPU hot-remove error handling by injecting notifier errors* tools/testing/selftests/memory-hotplug/on-off-test.sh is doing the
similar thing for memory hotplug.1. Online all hot-pluggable memory
2. Offline 10% of hot-pluggable memory
3. Online all hot-pluggable memory again
4. Exit if memory-notifier-error-inject.ko is not available
5. Offline 10% of hot-pluggable memory in preparation for testing
6. Test memory hot-add error handling by injecting notifier errors
7. Online all hot-pluggable memory in preparation for testing
8. Test memory hot-remove error handling by injecting notifier errorsSigned-off-by: Akinobu Mita
Suggested-by: Andrew Morton
Cc: Pavel Machek
Cc: "Rafael J. Wysocki"
Cc: Greg KH
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: Michael Ellerman
Cc: Dave Jones
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Pull ktest changes from Steven Rostedt:
"Set of updates for v3.6 (some fixes too)Seems that you opened the merge window the day I left for the beach.
I just got back (yes us Americans only take a week vacation), and just
got the last of my ktest quilt queue into git."* tag 'ktest-v3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest:
ktest: Allow perl regex expressions in conditional statements
ktest: Ignore errors it tests if IGNORE_ERRORS is set
ktest: Reset saved min (force) configs for each test
ktest: Add check for bug or panic during reboot
ktest: Add MAX_MONITOR_WAIT option
ktest: Fix config bisect with how make oldnoconfig works
ktest: Add CONFIG_BISECT_CHECK option
ktest: Add PRE_INSTALL option
ktest: Add PRE/POST_KTEST and TEST options
ktest: Remove commented exit -
Add '=~' and '!~' to the list of allowed conditionals for DEFAULT and
TEST_START section if statements.ie.
TEST_START IF TEST =~ .*test$
Signed-off-by: Steven Rostedt
-
The option IGNORE_ERRORS is used to allow a test to succeed even if a
warning appears from the kernel. Sometimes kernels will produce warnings
that are not associated with a test, and the user wants to test
something else.The IGNORE_ERRORS works for boot up, but was not preventing test runs to
succeed if the kernel produced a warning.Signed-off-by: Steven Rostedt
-
Pull SLAB changes from Pekka Enberg:
"Most of the changes included are from Christoph Lameter's "common
slab" patch series that unifies common parts of SLUB, SLAB, and SLOB
allocators. The unification is needed for Glauber Costa's "kmem
memcg" work that will hopefully appear for v3.7.The rest of the changes are fixes and speedups by various people."
* 'slab/next' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux: (32 commits)
mm: Fix build warning in kmem_cache_create()
slob: Fix early boot kernel crash
mm, slub: ensure irqs are enabled for kmemcheck
mm, sl[aou]b: Move kmem_cache_create mutex handling to common code
mm, sl[aou]b: Use a common mutex definition
mm, sl[aou]b: Common definition for boot state of the slab allocators
mm, sl[aou]b: Extract common code for kmem_cache_create()
slub: remove invalid reference to list iterator variable
mm: Fix signal SIGFPE in slabinfo.c.
slab: move FULL state transition to an initcall
slab: Fix a typo in commit 8c138b "slab: Get rid of obj_size macro"
mm, slab: Build fix for recent kmem_cache changes
slab: rename gfpflags to allocflags
slub: refactoring unfreeze_partials()
slub: use __cmpxchg_double_slab() at interrupt disabled place
slab/mempolicy: always use local policy from interrupt context
slab: Get rid of obj_size macro
mm, sl[aou]b: Extract common fields from struct kmem_cache
slab: Remove some accessors
slab: Use page struct fields instead of casting
...
27 Jul, 2012
2 commits
-
Pull ACPI & power management update from Len Brown:
"Re-write of the turbostat tool.
lower overhead was necessary for measuring very large system when
they are very idle.IVB support in intel_idle
It's what I run on my IVB, others should be able to also:-)ACPICA core update
We have found some bugs due to divergence between Linux and the
upstream ACPICA base. Most of these patches are to reduce that
divergence to reduce the risk of future bugs.Some cpuidle updates, mostly for non-Intel
More will be coming, as they depend on this part.Some thermal management changes needed by non-ACPI systems.
Some _OST (OS Status Indication) updates for hot ACPI hot-plug."
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: (51 commits)
Thermal: Documentation update
Thermal: Add Hysteresis attributes
Thermal: Make Thermal trip points writeable
ACPI/AC: prevent OOPS on some boxes due to missing check power_supply_register() return value check
tools/power: turbostat: fix large c1% issue
tools/power: turbostat v2 - re-write for efficiency
ACPICA: Update to version 20120711
ACPICA: AcpiSrc: Fix some translation issues for Linux conversion
ACPICA: Update header files copyrights to 2012
ACPICA: Add new ACPI table load/unload external interfaces
ACPICA: Split file: tbxface.c -> tbxfload.c
ACPICA: Add PCC address space to space ID decode function
ACPICA: Fix some comment fields
ACPICA: Table manager: deploy new firmware error/warning interfaces
ACPICA: Add new interfaces for BIOS(firmware) errors and warnings
ACPICA: Split exception code utilities to a new file, utexcep.c
ACPI: acpi_pad: tune round_robin_time
ACPICA: Update to version 20120620
ACPICA: Add support for implicit notify on multiple devices
ACPICA: Update comments; no functional change
... -
Pull USB patches from Greg Kroah-Hartman:
"Here's the big USB patch set for the 3.6-rc1 merge window.Lots of little changes in here, primarily for gadget controllers and
drivers. There's some scsi changes that I think also went in through
the scsi tree, but they merge just fine. All of these patches have
been in the linux-next tree for a while now.Signed-off-by: Greg Kroah-Hartman "
Fix up trivial conflicts in include/scsi/scsi_device.h (same libata
conflict that Jeff had already encountered)* tag 'usb-3.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (207 commits)
usb: Add USB_QUIRK_RESET_RESUME for all Logitech UVC webcams
usb: Add quirk detection based on interface information
usb: s3c-hsotg: Add header file protection macros in s3c-hsotg.h
USB: ehci-s5p: Add vbus setup function to the s5p ehci glue layer
USB: add USB_VENDOR_AND_INTERFACE_INFO() macro
USB: notify phy when root hub port connect change
USB: remove 8 bytes of padding from usb_host_interface on 64 bit builds
USB: option: add ZTE MF821D
USB: sierra: QMI mode MC7710 moved to qcserial
USB: qcserial: adding Sierra Wireless devices
USB: qcserial: support generic Qualcomm serial ports
USB: qcserial: make probe more flexible
USB: qcserial: centralize probe exit path
USB: qcserial: consolidate usb_set_interface calls
USB: ehci-s5p: Add support for device tree
USB: ohci-exynos: Add support for device tree
USB: ehci-omap: fix compile failure(v1)
usb: host: tegra: pass correct pointer in ehci_setup()
USB: ehci-fsl: Update ifdef check to work on 64-bit ppc
USB: serial: keyspan: Removed unrequired parentheses.
...
26 Jul, 2012
2 commits
-
…coupled', 'cpuidle-tweaks', 'intel_idle-ivb', 'ost', 'red-hat-bz-772730', 'thermal', 'thermal-spear' and 'turbostat-v2' into release
-
A large enough symbol size causes an overflow in the size parameter to
the histogram allocation, leading to a segfault in
symbol__inc_addr_samples later on when this histogram is accessed.In the case of being called via perf-report, this returns back and
gracefully ignores the sample, eventually ignoring the chained return
value of perf_session_deliver_event in flush_sample_queue.Signed-off-by: Cody Schafer
Acked-by: Namhyung Kim
Cc: Ingo Molnar
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Sukadev Bhattiprolu
Link: http://lkml.kernel.org/r/1342753525-4521-1-git-send-email-cody@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo
25 Jul, 2012
17 commits
-
The TRACEEVENT-CFLAGS file is used to detect any change on compiler
flags. Just ignore it.Signed-off-by: Namhyung Kim
Cc: David Ahern
Cc: Ingo Molnar
Cc: Jiri Olsa
Cc: Paul Mackerras
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1341559297-25725-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
Cross compiling perf requires setting ARCH and CROSS_COMPILE variables,
but libtraceevent couldn't detect the changes so it ends up believing no
recompiling is required. Thus the linker failed like:LINK perf
../lib/traceevent//libtraceevent.a: member ../lib/traceevent//libtraceevent.a(event-parse.o) in archive is not an object
collect2: ld returned 1 exit status
make: *** [perf] Error 1This patch fixes this by adding TRACEEVENT-CFLAGS file like
PERF-CFLAGS to track those changes.Signed-off-by: Namhyung Kim
Cc: David Ahern
Cc: Ingo Molnar
Cc: Jiri Olsa
Cc: Paul Mackerras
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1341559297-25725-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
Bison 2.6 started to generate parse_events_parse() declaration in header. In
this case we have redundant redeclaration:util/parse-events.c:29:5: error: redundant redeclaration of ‘parse_events_parse’ [-Werror=redundant-decls]
In file included from util/parse-events.c:14:0:
util/parse-events-bison.h:99:5: note: previous declaration of ‘parse_events_parse’ was here
cc1: all warnings being treated as errorsLet's disable -Wredundant-decls for util/parse-events.c since it includes
header we can't control.Signed-off-by: Kirill A. Shutemov
Cc: Ingo Molnar
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20120723210407.GA25186@shutemov.name
Signed-off-by: Arnaldo Carvalho de Melo -
Perf uses GNU-specific version of strerror_r(). The GNU-specific strerror_r()
returns a pointer to a string containing the error message. This may be either
a pointer to a string that the function stores in buf, or a pointer to some
(immutable) static string (in which case buf is unused).In glibc-2.16 GNU version was marked with attribute warn_unused_result. It
triggers few warnings in perf:util/target.c: In function ‘perf_target__strerror’:
util/target.c:114:13: error: ignoring return value of ‘strerror_r’, declared with attribute warn_unused_result [-Werror=unused-result]
ui/browsers/hists.c: In function ‘hist_browser__dump’:
ui/browsers/hists.c:981:13: error: ignoring return value of ‘strerror_r’, declared with attribute warn_unused_result [-Werror=unused-result]They are bugs.
Let's fix strerror_r() usage.
Signed-off-by: Kirill A. Shutemov
Acked-by: Ulrich Drepper
Cc: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Ulrich Drepper
Link: http://lkml.kernel.org/r/20120723210654.GA25248@shutemov.name
[ committer note: s/assert/BUG_ON/g ]
Signed-off-by: Arnaldo Carvalho de Melo -
There have one problem about hw_breakpoint perf event, as watched, the
events reported to userspace is not correctly, sometime one trigger
bp_event report several events, sometime bp_event cannot go through to
user.The root cause is attr->freq is 1 passed to kernel defaultly in bp
events, this make kernel calculate event period not as expect, make
sample period to 1 will change attr->freq to 0, to fix this problem.This patch is similar with commit f92128 about tracepoint events:
perf: Make the trace events sample period default to 1Signed-off-by: Jovi Zhang
Cc: Ingo Molnar
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/CACV3sbLF8taiCq_VYW-sgRJyupeMzg58C7ZXfMe3xZUiH_Mx6w@mail.gmail.com
Signed-off-by: Arnaldo Carvalho de Melo -
Adding automated test for DSO data reading. Testing raw/cached reads
from different file/cache locations.Signed-off-by: Jiri Olsa
Cc: Arun Sharma
Cc: Benjamin Redelings
Cc: Corey Ashford
Cc: Cyrill Gorcunov
Cc: Frank Ch. Eigler
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Masami Hiramatsu
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Robert Richter
Cc: Stephane Eranian
Cc: Tom Zanussi
Cc: Ulrich Drepper
Link: http://lkml.kernel.org/r/1342959280-5361-18-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo -
Adding dso data caching so we don't need to open/read/close, each time
we want dso data.The DSO data caching affects following functions:
dso__data_read_offset
dso__data_read_addrEach DSO read tries to find the data (based on offset) inside the cache.
If it's not present it fills the cache from file, and returns the data.
If it is present, data are returned with no file read.Each data read is cached by reading cache page sized/aligned amount of
DSO data. The cache page size is hardcoded to 4096. The cache is using
RB tree with file offset as a sort key.Signed-off-by: Jiri Olsa
Cc: Arun Sharma
Cc: Benjamin Redelings
Cc: Corey Ashford
Cc: Cyrill Gorcunov
Cc: Frank Ch. Eigler
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Masami Hiramatsu
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Robert Richter
Cc: Stephane Eranian
Cc: Tom Zanussi
Cc: Ulrich Drepper
Link: http://lkml.kernel.org/r/1342959280-5361-17-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo -
Adding following interface for DSO object to allow
reading of DSO image data:dso__data_fd
- opens DSO and returns file descriptor
Binary types are used to locate/open DSO in following order:
DSO_BINARY_TYPE__BUILD_ID_CACHE
DSO_BINARY_TYPE__SYSTEM_PATH_DSO
In other word we first try to open DSO build-id path,
and if that fails we try to open DSO system path.dso__data_read_offset
- reads DSO data from specified offsetdso__data_read_addr
- reads DSO data from specified address/map.Signed-off-by: Jiri Olsa
Cc: Arun Sharma
Cc: Benjamin Redelings
Cc: Corey Ashford
Cc: Cyrill Gorcunov
Cc: Frank Ch. Eigler
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Masami Hiramatsu
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Robert Richter
Cc: Stephane Eranian
Cc: Tom Zanussi
Cc: Ulrich Drepper
Link: http://lkml.kernel.org/r/1342959280-5361-11-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo -
Adding interface to access DSOs so it could be used
from another place.New DSO binary type is added - making current SYMTAB__*
types more general:
DSO_BINARY_TYPE__* = SYMTAB__*Following function is added to return path based on the specified
binary type:
dso__binary_type_fileSigned-off-by: Jiri Olsa
Cc: Arun Sharma
Cc: Benjamin Redelings
Cc: Corey Ashford
Cc: Cyrill Gorcunov
Cc: Frank Ch. Eigler
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Masami Hiramatsu
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Robert Richter
Cc: Stephane Eranian
Cc: Tom Zanussi
Cc: Ulrich Drepper
Link: http://lkml.kernel.org/r/1342959280-5361-10-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo -
Tiny cosmetic fix. The lack of a newline between hists callchains was
looking slightly messy.Before:
0.24% swapper [kernel.kallsyms] [k] _raw_spin_lock_irq
|
--- _raw_spin_lock_irq
run_timer_softirq
__do_softirq
call_softirq
do_softirq
irq_exit
smp_apic_timer_interrupt
apic_timer_interrupt
default_idle
amd_e400_idle
cpu_idle
start_secondary
0.10% perf [kernel.kallsyms] [k] lock_is_held
|
--- lock_is_held
__might_sleep
mutex_lock_nested
perf_event_for_each_child
perf_ioctl
do_vfs_ioctl
sys_ioctl
system_call_fastpath
ioctl
cmd_record
run_builtin
main
__libc_start_mainAfter:
0.24% swapper [kernel.kallsyms] [k] _raw_spin_lock_irq
|
--- _raw_spin_lock_irq
run_timer_softirq
__do_softirq
call_softirq
do_softirq
irq_exit
smp_apic_timer_interrupt
apic_timer_interrupt
default_idle
amd_e400_idle
cpu_idle
start_secondary0.10% perf [kernel.kallsyms] [k] lock_is_held
|
--- lock_is_held
__might_sleep
mutex_lock_nested
perf_event_for_each_child
perf_ioctl
do_vfs_ioctl
sys_ioctl
system_call_fastpath
ioctl
cmd_record
run_builtin
main
__libc_start_mainSigned-off-by: Frederic Weisbecker
Cc: David Ahern
Cc: Ingo Molnar
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1342631456-7233-3-git-send-email-fweisbec@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo -
Trace events have a period (weight) of 1 by default. This can be
overriden on events definition by using the __perf_count() macro.For example, the sched_stat_runtime() is weighted with the runtime of
the task that fired the event.By default, perf handles such weighted event by dividing it into
individual events carrying a weight of 1. For example if
sched_stat_runtime is fired and the task has run 5000000 nsecs, perf
divides it into 5000000 events in the buffer.This behaviour makes weighted events unusable because they quickly
fullfill the buffers and we lose most events.The commit 5d81e5cfb37a174e8ddc0413e2e70cdf05807ace ("events: Don't
divide events if it has field period") solves this problem by sending
only one event when PERF_SAMPLE_PERIOD flag is set. The weight is
carried in the sample itself such that we don't need to demultiplex it
anymore.This patch provides the last missing piece to use this feature by
setting PERF_SAMPLE_PERIOD from perf tools when we deal with trace
events.Before:
$ ./perf record -e sched:* -a sleep 1
[ perf record: Woken up 3 times to write data ]
[ perf record: Captured and wrote 1.619 MB perf.data (~70749 samples) ]
Warning:
Processed 16909 events and lost 1 chunks!Check IO/CPU overload!
$ ./perf script
perf 1894 [003] 824.898327: sched_migrate_task: comm=perf pid=1898 prio=120 orig_cpu=2 dest_cpu=0
perf 1894 [003] 824.898335: sched_stat_sleep: comm=perf pid=1898 delay=113179500 [ns]
perf 1894 [003] 824.898336: sched_stat_sleep: comm=perf pid=1898 delay=113179500 [ns]
perf 1894 [003] 824.898337: sched_stat_sleep: comm=perf pid=1898 delay=113179500 [ns]
perf 1894 [003] 824.898338: sched_stat_sleep: comm=perf pid=1898 delay=113179500 [ns]
perf 1894 [003] 824.898339: sched_stat_sleep: comm=perf pid=1898 delay=113179500 [ns]
perf 1894 [003] 824.898340: sched_stat_sleep: comm=perf pid=1898 delay=113179500 [ns]
perf 1894 [003] 824.898341: sched_stat_sleep: comm=perf pid=1898 delay=113179500 [ns]
[...]After:
$ ./perf record -e sched:* -a sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.074 MB perf.data (~3228 samples) ]$ ./perf script
perf 1461 [000] 554.286957: sched_migrate_task: comm=perf pid=1465 prio=120 orig_cpu=3 dest_cpu=1
perf 1461 [000] 554.286964: sched_stat_sleep: comm=perf pid=1465 delay=133047190 [ns]
perf 1461 [000] 554.286967: sched_wakeup: comm=perf pid=1465 prio=120 success=1 target_cpu=001
swapper 0 [001] 554.286976: sched_stat_wait: comm=perf pid=1465 delay=0 [ns]
swapper 0 [001] 554.286983: sched_switch: prev_comm=swapper/1 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=perf
[...]Signed-off-by: Frederic Weisbecker
Cc: David Ahern
Cc: Ingo Molnar
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1342631456-7233-1-git-send-email-fweisbec@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo -
Include the omitted number of characters printed for the first entry.
Not that it really matters because nobody seem to care about the number
of printed characters for now. But just in case.Signed-off-by: Frederic Weisbecker
Cc: David Ahern
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1342631456-7233-2-git-send-email-fweisbec@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo -
Adds the attributes to the event line in the header dump.
From:
event : name = cycles, type = 0, config = 0x0, config1 = 0x0,
config2 = 0x0, excl_usr = 0, excl_kern = 0, ...
to
event : name = cycles, type = 0, config = 0x0, config1 = 0x0,
config2 = 0x0, excl_usr = 0, excl_kern = 0, excl_host = 0,
excl_guest = 0, precise_ip = 0, ...Signed-off-by: David Ahern
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1342826756-64663-8-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo -
After 7ed97ad use of the guestmount option without a subdir for *each*
VM generates an error message for each sample related to that VM. Once
per VM is enough.Signed-off-by: David Ahern
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1342826756-64663-7-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo -
Guest kernel symbols are not resolved despite passing the information
needed to resolve them. e.g.,perf kvm --guest --guestmount=/tmp/guest-mount record -a -- sleep 1
perf kvm --guest --guestmount=/tmp/guest-mount report --stdio36.55% [guest/11399] [unknown] [g] 0xffffffff81600bc8
33.19% [guest/10474] [unknown] [g] 0x00000000c0116e00
30.26% [guest/11094] [unknown] [g] 0xffffffff8100a288
43.69% [guest/10474] [unknown] [g] 0x00000000c0103d90
37.38% [guest/11399] [unknown] [g] 0xffffffff81600bc8
12.24% [guest/11094] [unknown] [g] 0xffffffff810aa91d
6.69% [guest/11094] [unknown] [u] 0x00007fa784d721c3which is just pathetic.
After a maddening 2 days sifting through perf minutia I found it --
id_hdr_size is not initialized for guest machines. This shows up on the
report side as random garbage for the cpu and timestamp, e.g.,29816 7310572949125804849 0x1ac0 [0x50]: PERF_RECORD_MMAP ...
That messes up the sample sorting such that synthesized guest maps are
processed last.With this patch you get a much more helpful report:
12.11% [guest/11399] [guest.kernel.kallsyms.11399] [g] irqtime_account_process_tick
10.58% [guest/11399] [guest.kernel.kallsyms.11399] [g] run_timer_softirq
6.95% [guest/11094] [guest.kernel.kallsyms.11094] [g] printk_needs_cpu
6.50% [guest/11094] [guest.kernel.kallsyms.11094] [g] do_timer
6.45% [guest/11399] [guest.kernel.kallsyms.11399] [g] idle_balance
4.90% [guest/11094] [guest.kernel.kallsyms.11094] [g] native_read_tsc
...v2:
- changed rbtree walk to use rb_first per Namhyung's suggestionTested-by: Jiri Olsa
Signed-off-by: David Ahern
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1342826756-64663-5-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo -
e.g., perf kvm --host --guest report -i perf.data --stdio -D
shows:1 599127912065356 0x143b8 [0x48]: PERF_RECORD_SAMPLE(IP, 5): 5671/5676: 0x7fdf95a061c0 period: 1 addr: 0
... chain: nr:2
..... 0: ffffffffffffff80
..... 1: fffffffffffffe00
... thread: qemu-kvm:5671
...... dso:(IP, 5) means sample in guest userspace. Those samples should not be lumped
into the VMM's host thread. i.e, the report output:56.86% qemu-kvm [unknown] [u] 0x00007fdf95a061c0
With this patch the output emphasizes it is a guest userspace hit:
56.86% [guest/5671] [unknown] [u] 0x00007fdf95a061c0
Looking at 3 VMs (2 64-bit, 1 32-bit) with each running a CPU bound
process (openssl speed), perf report currently shows:93.84% 117726 qemu-kvm [unknown] [u] 0x00007fd7dcaea8e5
which is wrong. With this patch you get:
31.50% 39258 [guest/18772] [unknown] [u] 0x00007fd7dcaea8e5
31.50% 39236 [guest/11230] [unknown] [u] 0x0000000000a57340
30.84% 39232 [guest/18395] [unknown] [u] 0x00007f66f641e107Tested-by: Jiri Olsa
Signed-off-by: David Ahern
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1342826756-64663-4-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo -
COMM events are not generated in the context of a guest machine, so the
thread name is never set for the VMM process. For example, the qemu-kvm
name applies to the process in the host machine, not the guest machine.
So, samples for guest machines are currently displayed as:99.67% :5671 [unknown] [g] 0xffffffff81366b41
where 5671 is the pid of the VMM. With this patch the samples in the guest
machine are shown as:18.43% [guest/5671] [unknown] [g] 0xffffffff810d68b7
Tested-by: Jiri Olsa
Signed-off-by: David Ahern
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1342826756-64663-3-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo
24 Jul, 2012
1 commit
-
Current debug message is:
Problems creating module maps, continuing anyway...When running multiple VMs it would be nice to know which machine the
message is referring to:$ perf kvm --guest --guestmount=/tmp/guest-mount record -av -- sleep 10
Problems creating module maps for guest 6613, continuing anyway...Signed-off-by: David Ahern
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1342826756-64663-2-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo
21 Jul, 2012
1 commit
-
The min configs are saved in a perl hash called force_configs, and this
hash is used to add configs to the .config file. But it was not being
reset between tests and a min config from a previous test would affect
the min config of the next test causing undesirable results.Reset the force_config hash at the start of each test.
Signed-off-by: Steven Rostedt
20 Jul, 2012
7 commits
-
Under some conditions, c1% was displayed as very large number,
much higher than 100%.c1% is not measured, it is derived as "that, which is left over"
from other counters. However, the other counters are not collected
atomically, and so it is possible for c1% to be calaculagted as
a small negative number -- displayed as very large positive.There was a check for mperf vs tsc for this already,
but it needed to also include the other counters
that are used to calculate c1.Signed-off-by: Len Brown
-
Measuring large profoundly-idle configurations
requires turbostat to be more lightweight.
Otherwise, the operation of turbostat itself
can interfere with the measurements.This re-write makes turbostat topology aware.
Hardware is accessed in "topology order".
Redundant hardware accesses are deleted.
Redundant output is deleted.
Also, output is buffered and
local RDTSC use replaces remote MSR access for TSC.From a feature point of view, the output
looks different since redundant figures are absent.
Also, there are now -c and -p options -- to restrict
output to the 1st thread in each core, and the 1st
thread in each package, respectively. This is helpful
to reduce output on big systems, where more detail
than the "-s" system summary is desired.
Finally, periodic mode output is now on stdout, not stderr.Turbostat v2 is also slightly more robust in
handling run-time CPU online/offline events,
as it now checks the actual map of on-line cpus rather
than just the total number of on-line cpus.Signed-off-by: Len Brown
-
Usually the target is booted into a dependable kernel when a test
starts. The test will install the test kernel and reboot the box. But
there may be a time that the kernel is running an unreliable kernel and
the reboot may crash.Have ktest detect crashes on a reboot and force a power-cycle instead.
This can usually happen if a test kernel was installed to run manual
tests, but the user forgot to reboot to the known good kernel.Signed-off-by: Steven Rostedt
-
If the console is constantly outputting content, this can cause ktest
to get stuck waiting on the monitor to settle down.The option MAX_MONITOR_WAIT is the maximum time (in seconds) for ktest
to wait for the console to flush.Signed-off-by: Steven Rostedt
-
With a name like 'oldnoconfig' one may think that the config generated
would disable all configs that were not defined (selecting "no" for all
options). But this is not the case. It selects the default. If a config
has a 'default y', then it is added if not specified.This broke the config bisect, because options not specified by a config
will just use the default, where it expected to turn off. This caused an
option to be enabled that disabled an option that would break the build.
The end result was that we never found the bad config at the end of the
test.Instead of using 'make oldnoconfig', ktest now builds the options it
expects enabled and disabled. When it turns off an option, it will no
longer remove it, but actually set it to:# CONFIG_FOO is not set.
Signed-off-by: Steven Rostedt
-
The config-bisect can take a bad config and bisect it down to find out
what config actually breaks the config. But as all tests will apply a
minconfig (defined by a user) to apply before booting, it is possible
that the minconfig could actually make the bad config work (minconfigs
can disable configs). The end result is that the config bisect test will
not find a config that breaks. This can be rather frustrating to the
user.The CONFIG_BISECT_CHECK option, when set to 1, will make sure that the
bad config (with the minconfig applied) still fails before trying to
bisect.And yes, I did get burned by this.
Signed-off-by: Steven Rostedt
-
Add the PRE_INSTALL option that will allow a user to specify a shell
command to be executed before the install operation executes.Signed-off-by: Steven Rostedt