Eric Lee / smarc-ti-linux-kernel | Embedian Git Server

06 Apr, 2014

1 commit

2d1eb87ae Merge branch 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm ... Browse Code »

Pull ARM changes from Russell King:

- Perf updates from Will Deacon:
- Support for Qualcomm Krait processors (run perf on your phone!)
- Support for Cortex-A12 (run perf stat on your FPGA!)
- Support for perf_sample_event_took, allowing us to automatically decrease
the sample rate if we can't handle the PMU interrupts quickly enough
(run perf record on your FPGA!).

- Basic uprobes support from David Long:
This patch series adds basic uprobes support to ARM. It is based on
patches developed earlier by Rabin Vincent. That approach of adding
hooks into the kprobes instruction parsing code was not well received.
This approach separates the ARM instruction parsing code in kprobes out
into a separate set of functions which can be used by both kprobes and
uprobes. Both kprobes and uprobes then provide their own semantic action
tables to process the results of the parsing.

- ARMv7M (microcontroller) updates from Uwe Kleine-König

- OMAP DMA updates (recently added Vinod's Ack even though they've been
sitting in linux-next for a few months) to reduce the reliance of
omap-dma on the code in arch/arm.

- SA11x0 changes from Dmitry Eremin-Solenikov and Alexander Shiyan

- Support for Cortex-A12 CPU

- Align support for ARMv6 with ARMv7 so they can cooperate better in a
single zImage.

- Addition of first AT_HWCAP2 feature bits for ARMv8 crypto support.

- Removal of IRQ_DISABLED from various ARM files

- Improved efficiency of virt_to_page() for single zImage

- Patch from Ulf Hansson to permit runtime PM callbacks to be available for
AMBA devices for suspend/resume as well.

- Finally kill asm/system.h on ARM.

* 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: (89 commits)
dmaengine: omap-dma: more consolidation of CCR register setup
dmaengine: omap-dma: move IRQ handling to omap-dma
dmaengine: omap-dma: move register read/writes into omap-dma.c
ARM: omap: dma: get rid of 'p' allocation and clean up
ARM: omap: move dma channel allocation into plat-omap code
ARM: omap: dma: get rid of errata global
ARM: omap: clean up DMA register accesses
ARM: omap: remove almost-const variables
ARM: omap: remove references to disable_irq_lch
dmaengine: omap-dma: cleanup errata 3.3 handling
dmaengine: omap-dma: provide register read/write functions
dmaengine: omap-dma: use cached CCR value when enabling DMA
dmaengine: omap-dma: move barrier to omap_dma_start_desc()
dmaengine: omap-dma: move clnk_ctrl setting to preparation functions
dmaengine: omap-dma: improve efficiency loading C.SA/C.EI/C.FI registers
dmaengine: omap-dma: consolidate clearing channel status register
dmaengine: omap-dma: move CCR buffering disable errata out of the fast path
dmaengine: omap-dma: provide register definitions
dmaengine: omap-dma: consolidate setup of CCR
dmaengine: omap-dma: consolidate setup of CSDP
...

Linus Torvalds
2014-04-06 04:20:43 +0800

04 Apr, 2014

1 commit

32d01dc7b Merge branch 'for-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup ... Browse Code »

Pull cgroup updates from Tejun Heo:
"A lot updates for cgroup:

- The biggest one is cgroup's conversion to kernfs. cgroup took
after the long abandoned vfs-entangled sysfs implementation and
made it even more convoluted over time. cgroup's internal objects
were fused with vfs objects which also brought in vfs locking and
object lifetime rules. Naturally, there are places where vfs rules
don't fit and nasty hacks, such as credential switching or lock
dance interleaving inode mutex and cgroup_mutex with object serial
number comparison thrown in to decide whether the operation is
actually necessary, needed to be employed.

After conversion to kernfs, internal object lifetime and locking
rules are mostly isolated from vfs interactions allowing shedding
of several nasty hacks and overall simplification. This will also
allow implmentation of operations which may affect multiple cgroups
which weren't possible before as it would have required nesting
i_mutexes.

- Various simplifications including dropping of module support,
easier cgroup name/path handling, simplified cgroup file type
handling and task_cg_lists optimization.

- Prepatory changes for the planned unified hierarchy, which is still
a patchset away from being actually operational. The dummy
hierarchy is updated to serve as the default unified hierarchy.
Controllers which aren't claimed by other hierarchies are
associated with it, which BTW was what the dummy hierarchy was for
anyway.

- Various fixes from Li and others. This pull request includes some
patches to add missing slab.h to various subsystems. This was
triggered xattr.h include removal from cgroup.h. cgroup.h
indirectly got included a lot of files which brought in xattr.h
which brought in slab.h.

There are several merge commits - one to pull in kernfs updates
necessary for converting cgroup (already in upstream through
driver-core), others for interfering changes in the fixes branch"

* 'for-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (74 commits)
cgroup: remove useless argument from cgroup_exit()
cgroup: fix spurious lockdep warning in cgroup_exit()
cgroup: Use RCU_INIT_POINTER(x, NULL) in cgroup.c
cgroup: break kernfs active_ref protection in cgroup directory operations
cgroup: fix cgroup_taskset walking order
cgroup: implement CFTYPE_ONLY_ON_DFL
cgroup: make cgrp_dfl_root mountable
cgroup: drop const from @buffer of cftype->write_string()
cgroup: rename cgroup_dummy_root and related names
cgroup: move ->subsys_mask from cgroupfs_root to cgroup
cgroup: treat cgroup_dummy_root as an equivalent hierarchy during rebinding
cgroup: remove NULL checks from [pr_cont_]cgroup_{name|path}()
cgroup: use cgroup_setup_root() to initialize cgroup_dummy_root
cgroup: reorganize cgroup bootstrapping
cgroup: relocate setting of CGRP_DEAD
cpuset: use rcu_read_lock() to protect task_cs()
cgroup_freezer: document freezer_fork() subtleties
cgroup: update cgroup_transfer_tasks() to either succeed or fail
cgroup: drop task_lock() protection around task->cgroups
cgroup: update how a newly forked task gets associated with css_set
...

Linus Torvalds
2014-04-04 04:05:42 +0800

19 Mar, 2014

1 commit

6fe50a28b uprobes: allow ignoring of probe hits ... Browse Code »

Allow arches to decided to ignore a probe hit. ARM will use this to
only call handlers if the conditions to execute a conditionally executed
instruction are satisfied.

Signed-off-by: David A. Long
Acked-by: Oleg Nesterov

David A. Long
2014-03-19 04:39:34 +0800

27 Feb, 2014

5 commits

4a2345937 perf: Optimize group_sched_in() ... Browse Code »

Use the ctx pmu instead of the event pmu.

When a group leader is a software event but the group contains
hardware events, the entire group is on the hardware PMU.

Using the hardware PMU for the transaction makes most sense since
that's the most expensive one to programm (and software PMUs generally
don't have TXN support anyway).

Signed-off-by: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-sctoo9t2f3nn2c9g568928q3@git.kernel.org
Signed-off-by: Ingo Molnar

Peter Zijlstra
2014-02-27 19:43:26 +0800
fdded676c perf: Remove redundant PMU assignment ... Browse Code »

Currently perf_branch_stack_sched_in iterates over the set of pmus,
checks that each pmu has a flush_branch_stack callback, then overwrites
the pmu before calling the callback. This is either redundant or broken.

In systems with a single hw pmu, pmu == cpuctx->ctx.pmu, and thus the
assignment is redundant.

In systems with multiple hw pmus (i.e. multiple pmus with task_ctx_nr ==
perf_hw_context) the pmus share the same perf_cpu_context. Thus the
assignment can cause one of the pmus to flush its branch stack
repeatedly rather than causing each of the pmus to flush their branch
stacks. Worse still, if only some pmus have the callback the assignment
can result in a branch to NULL.

This patch removes the redundant assignment.

Signed-off-by: Mark Rutland
Acked-by: Will Deacon
Signed-off-by: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1392054264-23570-3-git-send-email-mark.rutland@arm.com
Signed-off-by: Ingo Molnar

Mark Rutland
2014-02-27 19:43:24 +0800
9e3170411 perf: Fix prototype of find_pmu_context() ... Browse Code »

For some reason find_pmu_context() is defined as returning void * rather
than a __percpu struct perf_cpu_context *. As all the requisite types are
defined in advance there's no reason to keep it that way.

This patch modifies the prototype of pmu_find_context to return a
__percpu struct perf_cpu_context *.

Signed-off-by: Mark Rutland
Signed-off-by: Peter Zijlstra
Reviewed-by: Dave Martin
Acked-by: Will Deacon
Link: http://lkml.kernel.org/r/1392054264-23570-2-git-send-email-mark.rutland@arm.com
Signed-off-by: Ingo Molnar

Mark Rutland
2014-02-27 19:43:21 +0800
ff5a7088f Merge branch 'perf/urgent' into perf/core ... Browse Code »

Merge the latest fixes before queueing up new changes.

Signed-off-by: Ingo Molnar

Ingo Molnar
2014-02-27 19:41:17 +0800
e3703f8cd perf: Fix hotplug splat ... Browse Code »

Drew Richardson reported that he could make the kernel go *boom* when hotplugging
while having perf events active.

It turned out that when you have a group event, the code in
__perf_event_exit_context() fails to remove the group siblings from
the context.

We then proceed with destroying and freeing the event, and when you
re-plug the CPU and try and add another event to that CPU, things go
*boom* because you've still got dead entries there.

Reported-by: Drew Richardson
Signed-off-by: Peter Zijlstra
Cc: Will Deacon
Cc:
Link: http://lkml.kernel.org/n/tip-k6v5wundvusvcseqj1si0oz0@git.kernel.org
Signed-off-by: Ingo Molnar

Peter Zijlstra
2014-02-27 19:38:03 +0800

22 Feb, 2014

1 commit

cd578abb2 perf/x86: Warn to early_printk() in case irq_work is too slow ... Browse Code »

On Mon, Feb 10, 2014 at 08:45:16AM -0800, Dave Hansen wrote:
> The reason I coded this up was that NMIs were firing off so fast that
> nothing else was getting a chance to run. With this patch, at least the
> printk() would come out and I'd have some idea what was going on.

It will start spewing to early_printk() (which is a lot nicer to use
from NMI context too) when it fails to queue the IRQ-work because its
already enqueued.

It does have the false-positive for when two CPUs trigger the warn
concurrently, but that should be rare and some extra clutter on the
early printk shouldn't be a problem.

Cc: hpa@zytor.com
Cc: tglx@linutronix.de
Cc: dzickus@redhat.com
Cc: Dave Hansen
Cc: mingo@kernel.org
Fixes: 6a02ad66b2c4 ("perf/x86: Push the duration-logging printk() to IRQ context")
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/20140211150116.GO27965@twins.programming.kicks-ass.net
Signed-off-by: Thomas Gleixner

Peter Zijlstra
2014-02-22 04:49:07 +0800

13 Feb, 2014

1 commit

924f0d9a2 cgroup: drop @skip_css from cgroup_taskset_for_each() ... Browse Code »

If !NULL, @skip_css makes cgroup_taskset_for_each() skip the matching
css. The intention of the interface is to make it easy to skip css's
(cgroup_subsys_states) which already match the migration target;
however, this is entirely unnecessary as migration taskset doesn't
include tasks which are already in the target cgroup. Drop @skip_css
from cgroup_taskset_for_each().

Signed-off-by: Tejun Heo
Acked-by: Li Zefan
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Daniel Borkmann

Tejun Heo
2014-02-13 19:58:41 +0800

12 Feb, 2014

1 commit

5a17f543e cgroup: improve css_from_dir() into css_tryget_from_dir() ... Browse Code »
13

css_from_dir() returns the matching css (cgroup_subsys_state) given a
dentry and subsystem. The function doesn't pin the css before
returning and requires the caller to be holding RCU read lock or
cgroup_mutex and handling pinning on the caller side.

Given that users of the function are likely to want to pin the
returned css (both existing users do) and that getting and putting
css's are very cheap, there's no reason for the interface to be tricky
like this.

Rename css_from_dir() to css_tryget_from_dir() and make it try to pin
the found css and return it only if pinning succeeded. The callers
are updated so that they no longer do RCU locking and pinning around
the function and just use the returned css.

This will also ease converting cgroup to kernfs.

Signed-off-by: Tejun Heo
Acked-by: Michal Hocko
Acked-by: Li Zefan
Cc: Steven Rostedt
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Johannes Weiner
Cc: Balbir Singh
Cc: KAMEZAWA Hiroyuki

Tejun Heo
2014-02-12 00:52:47 +0800

09 Feb, 2014

1 commit

6a02ad66b perf/x86: Push the duration-logging printk() to IRQ context ... Browse Code »

Calling printk() from NMI context is bad (TM), so move it to IRQ
context.

This also avoids the problem where the printk() time is measured by
the generic NMI duration goo and triggers a second warning.

Signed-off-by: Peter Zijlstra
Cc: Don Zickus
Cc: Dave Hansen
Link: http://lkml.kernel.org/n/tip-75dv35xf6dhhmeb7nq6fua31@git.kernel.org
Signed-off-by: Ingo Molnar

Peter Zijlstra
2014-02-09 20:17:21 +0800

08 Feb, 2014

1 commit

073219e99 cgroup: clean up cgroup_subsys names and initialization ... Browse Code »

cgroup_subsys is a bit messier than it needs to be.

* The name of a subsys can be different from its internal identifier
defined in cgroup_subsys.h. Most subsystems use the matching name
but three - cpu, memory and perf_event - use different ones.

* cgroup_subsys_id enums are postfixed with _subsys_id and each
cgroup_subsys is postfixed with _subsys. cgroup.h is widely
included throughout various subsystems, it doesn't and shouldn't
have claim on such generic names which don't have any qualifier
indicating that they belong to cgroup.

* cgroup_subsys->subsys_id should always equal the matching
cgroup_subsys_id enum; however, we require each controller to
initialize it and then BUG if they don't match, which is a bit
silly.

This patch cleans up cgroup_subsys names and initialization by doing
the followings.

* cgroup_subsys_id enums are now postfixed with _cgrp_id, and each
cgroup_subsys with _cgrp_subsys.

* With the above, renaming subsys identifiers to match the userland
visible names doesn't cause any naming conflicts. All non-matching
identifiers are renamed to match the official names.

cpu_cgroup -> cpu
mem_cgroup -> memory
perf -> perf_event

* controllers no longer need to initialize ->subsys_id and ->name.
They're generated in cgroup core and set automatically during boot.

* Redundant cgroup_subsys declarations removed.

* While updating BUG_ON()s in cgroup_init_early(), convert them to
WARN()s. BUGging that early during boot is stupid - the kernel
can't print anything, even through serial console and the trap
handler doesn't even link stack frame properly for back-tracing.

This patch doesn't introduce any behavior changes.

v2: Rebased on top of fe1217c4f3f7 ("net: net_cls: move cgroupfs
classid handling into core").

Signed-off-by: Tejun Heo
Acked-by: Neil Horman
Acked-by: "David S. Miller"
Acked-by: "Rafael J. Wysocki"
Acked-by: Michal Hocko
Acked-by: Peter Zijlstra
Acked-by: Aristeu Rozanski
Acked-by: Ingo Molnar
Acked-by: Li Zefan
Cc: Johannes Weiner
Cc: Balbir Singh
Cc: KAMEZAWA Hiroyuki
Cc: Serge E. Hallyn
Cc: Vivek Goyal
Cc: Thomas Graf

Tejun Heo
2014-02-08 23:36:58 +0800

23 Jan, 2014

1 commit

60eaa0190 Merge tag 'trace-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace ... Browse Code »

Pull tracing updates from Steven Rostedt:
"This pull request has a new feature to ftrace, namely the trace event
triggers by Tom Zanussi. A trigger is a way to enable an action when
an event is hit. The actions are:

o trace on/off - enable or disable tracing
o snapshot - save the current trace buffer in the snapshot
o stacktrace - dump the current stack trace to the ringbuffer
o enable/disable events - enable or disable another event

Namhyung Kim added updates to the tracing uprobes code. Having the
uprobes add support for fetch methods.

The rest are various bug fixes with the new code, and minor ones for
the old code"

* tag 'trace-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (38 commits)
tracing: Fix buggered tee(2) on tracing_pipe
tracing: Have trace buffer point back to trace_array
ftrace: Fix synchronization location disabling and freeing ftrace_ops
ftrace: Have function graph only trace based on global_ops filters
ftrace: Synchronize setting function_trace_op with ftrace_trace_function
tracing: Show available event triggers when no trigger is set
tracing: Consolidate event trigger code
tracing: Fix counter for traceon/off event triggers
tracing: Remove double-underscore naming in syscall trigger invocations
tracing/kprobes: Add trace event trigger invocations
tracing/probes: Fix build break on !CONFIG_KPROBE_EVENT
tracing/uprobes: Add @+file_offset fetch method
uprobes: Allocate ->utask before handler_chain() for tracing handlers
tracing/uprobes: Add support for full argument access methods
tracing/uprobes: Fetch args before reserving a ring buffer
tracing/uprobes: Pass 'is_return' to traceprobe_parse_probe_arg()
tracing/probes: Implement 'memory' fetch method for uprobes
tracing/probes: Add fetch{,_size} member into deref fetch method
tracing/probes: Move 'symbol' fetch method to kprobes
tracing/probes: Implement 'stack' fetch method for uprobes
...

Linus Torvalds
2014-01-23 08:35:21 +0800

16 Jan, 2014

1 commit

860fc2f26 Merge branch 'perf/urgent' into perf/core ... Browse Code »

Pick up the latest fixes, refresh the development tree.

Signed-off-by: Ingo Molnar

Ingo Molnar
2014-01-16 16:33:30 +0800

12 Jan, 2014

2 commits

a21b0b354 perf: Introduce a flag to enable close-on-exec in perf_event_open() ... Browse Code »
13

Unlike recent modern userspace API such as:

epoll_create1 (EPOLL_CLOEXEC), eventfd (EFD_CLOEXEC),
fanotify_init (FAN_CLOEXEC), inotify_init1 (IN_CLOEXEC),
signalfd (SFD_CLOEXEC), timerfd_create (TFD_CLOEXEC),
or the venerable general purpose open (O_CLOEXEC),

perf_event_open() syscall lack a flag to atomically set FD_CLOEXEC
(eg. close-on-exec) flag on file descriptor it returns to userspace.

The present patch adds a PERF_FLAG_FD_CLOEXEC flag to allow
perf_event_open() syscall to atomically set close-on-exec.

Having this flag will enable userspace to remove the file descriptor
from the list of file descriptors being inherited across exec,
without the need to call fcntl(fd, F_SETFD, FD_CLOEXEC) and the
associated race condition between the current thread and another
thread calling fork(2) then execve(2).

Links:

- Secure File Descriptor Handling (Ulrich Drepper, 2008)
http://udrepper.livejournal.com/20407.html

- Excuse me son, but your code is leaking !!! (Dan Walsh, March 2012)
http://danwalsh.livejournal.com/53603.html

- Notes in DMA buffer sharing: leak and security hole
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/dma-buf-sharing.txt?id=v3.13-rc3#n428

Signed-off-by: Yann Droneaud
Cc: Arnaldo Carvalho de Melo
Cc: Al Viro
Cc: Andrew Morton
Cc: Paul Mackerras
Cc: Linus Torvalds
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/8c03f54e1598b1727c19706f3af03f98685d9fe6.1388952061.git.ydroneaud@opteya.com
Signed-off-by: Ingo Molnar

Yann Droneaud
2014-01-12 17:16:59 +0800
f3ae75de9 perf/x86: Fix active_entry initialization ... Browse Code »

This patch fixes a problem with the initialization of the
struct perf_event active_entry field. It is defined inside
an anonymous union and was initialized in perf_event_alloc()
using INIT_LIST_HEAD(). However at that time, we do not know
whether the event is going to use active_entry or hlist_entry (SW).
Or at last, we don't want to make that determination there.
The problem is that hlist and list_head are not initialized
the same way. One is okay with NULL (from kzmalloc), the other
needs to pointers to point to self.

This patch resolves this problem by dropping the union.
This will avoid problems later on, if someone starts using
active_entry or hlist_entry without verifying that they
actually overlap. This also solves the initialization
problem.

Signed-off-by: Stephane Eranian
Cc: ak@linux.intel.com
Cc: acme@redhat.com
Cc: jolsa@redhat.com
Cc: zheng.z.yan@intel.com
Cc: bp@alien8.de
Cc: vincent.weaver@maine.edu
Cc: maria.n.dimakopoulou@gmail.com
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/1389176153-3128-2-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar

Stephane Eranian
2014-01-12 17:16:07 +0800

03 Jan, 2014

1 commit

72fd293aa uprobes: Allocate ->utask before handler_chain() for tracing handlers ... Browse Code »

uprobe_trace_print() and uprobe_perf_print() need to pass the additional
info to call_fetch() methods, currently there is no simple way to do this.

current->utask looks like a natural place to hold this info, but we need
to allocate it before handler_chain().

This is a bit unfortunate, perhaps we will find a better solution later,
but this is simple and should work right now.

Signed-off-by: Oleg Nesterov
Acked-by: Masami Hiramatsu
Acked-by: Oleg Nesterov
Cc: Srikar Dronamraju
Signed-off-by: Namhyung Kim

Oleg Nesterov
2014-01-03 09:57:04 +0800

17 Dec, 2013

2 commits

bad7192b8 perf: Fix PERF_EVENT_IOC_PERIOD to force-reset the period ... Browse Code »
13

Vince Weaver reports that, on all architectures apart from ARM,
PERF_EVENT_IOC_PERIOD doesn't actually update the period until the next
event fires. This is counter-intuitive behaviour and is better dealt
with in the core code.

This patch ensures that the period is forcefully reset when dealing with
such a request in the core code. A subsequent patch removes the
equivalent hack from the ARM back-end.

Reported-by: Vince Weaver
Signed-off-by: Peter Zijlstra
Signed-off-by: Will Deacon
Link: http://lkml.kernel.org/r/1385560479-11014-1-git-send-email-will.deacon@arm.com
Signed-off-by: Ingo Molnar

Peter Zijlstra
2013-12-17 22:21:33 +0800
443772776 perf: Disable all pmus on unthrottling and rescheduling ... Browse Code »

Currently, only one PMU in a context gets disabled during unthrottling
and event_sched_{out,in}(), however, events in one context may belong to
different pmus, which results in PMUs being reprogrammed while they are
still enabled.

This means that mixed PMU use [which is rare in itself] resulted in
potentially completely unreliable results: corrupted events, bogus
results, etc.

This patch temporarily disables PMUs that correspond to
each event in the context while these events are being modified.

Signed-off-by: Alexander Shishkin
Reviewed-by: Andi Kleen
Signed-off-by: Peter Zijlstra
Cc: Frederic Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Stephane Eranian
Cc: Alexander Shishkin
Link: http://lkml.kernel.org/r/1387196256-8030-1-git-send-email-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar

Alexander Shishkin
2013-12-17 22:04:00 +0800

11 Dec, 2013

1 commit

c7f2e3cd6 perf: Optimize ring-buffer write by depending on control dependencies ... Browse Code »

Remove a full barrier from the ring-buffer write path by relying on
a control dependency to order a LOAD -> STORE scenario.

Cc: "Paul E. McKenney"
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/n/tip-8alv40z6ikk57jzbaobnxrjl@git.kernel.org
Signed-off-by: Ingo Molnar

Peter Zijlstra
2013-12-11 22:53:22 +0800

27 Nov, 2013

1 commit

71ad88efe perf: Add active_entry list head to struct perf_event ... Browse Code »

This patch adds a new field to the struct perf_event.
It is intended to be used to chain events which are
active (enabled). It helps in the hardware layer
for PMUs which do not have actual counter restrictions, i.e.,
free running read-only counters. Active events are chained
as opposed to being tracked via the counter they use.

To save space we use a union with hlist_entry as both
are mutually exclusive (suggested by Jiri Olsa).

Signed-off-by: Stephane Eranian
Reviewed-by: Andi Kleen
Signed-off-by: Peter Zijlstra
Cc: acme@redhat.com
Cc: jolsa@redhat.com
Cc: zheng.z.yan@intel.com
Cc: bp@alien8.de
Cc: maria.n.dimakopoulou@gmail.com
Link: http://lkml.kernel.org/r/1384275531-10892-2-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar

Stephane Eranian
2013-11-27 18:16:38 +0800

21 Nov, 2013

1 commit

09897d78d Merge branch 'uprobes/core' of git://git.kernel.org/pub/scm/linux/kernel/git/ole… ... Browse Code »

…g/misc into perf/core

Pull uprobes cleanups from Oleg Nesterov.

Signed-off-by: Ingo Molnar <mingo@kernel.org>

Ingo Molnar
2013-11-21 16:59:27 +0800

20 Nov, 2013

4 commits

ad439356a uprobes: Document xol_area and arch_uprobe->insn/ixol ... Browse Code »

Document xol_area and arch_uprobe.

Signed-off-by: Oleg Nesterov
Acked-by: Srikar Dronamraju

Oleg Nesterov
2013-11-20 23:31:07 +0800
c912dae60 uprobes: Cleanup !CONFIG_UPROBES decls, unexport xol_area ... Browse Code »

1. Don't include asm/uprobes.h unconditionally, we only need
it if CONFIG_UPROBES.

2. Move the definition of "struct xol_area" into uprobes.c.

Perhaps we should simply kill struct uprobes_state, it buys
nothing.

3. Kill the dummy definition of uprobe_get_swbp_addr(), nobody
except handle_swbp() needs it.

4. Purely cosmetic, but move the decl of uprobe_get_swbp_addr()
up, close to other __weak helpers.

Signed-off-by: Oleg Nesterov
Acked-by: Srikar Dronamraju

Oleg Nesterov
2013-11-20 23:31:01 +0800
803200e24 uprobes: Don't assume that arch_uprobe->insn/ixol is u8[MAX_UINSN_BYTES] ... Browse Code »

arch_uprobe should be opaque as much as possible to the generic
code, but currently it assumes that insn/ixol must be u8[] of the
known size. Remove this unnecessary dependency, we can use "&" and
and sizeof() with the same effect.

Signed-off-by: Oleg Nesterov
Acked-by: Srikar Dronamraju

Oleg Nesterov
2013-11-20 23:31:00 +0800
324734311 uprobes: Add uprobe_task->dup_xol_work/dup_xol_addr ... Browse Code »

uprobe_task->vaddr is a bit strange. The generic code uses it only
to pass the additional argument to arch_uprobe_pre_xol(), and since
it is always equal to instruction_pointer() this looks even more
strange.

And both utask->vaddr and and utask->autask have the same scope,
they only have the meaning when the task executes the probed insn
out-of-line, so it is safe to reuse both in UTASK_RUNNING state.

This all means that logically ->vaddr belongs to arch_uprobe_task
and we should probably move it there, arch_uprobe_pre_xol() can
record instruction_pointer() itself.

OTOH, it is also used by uprobe_copy_process() and dup_xol_work()
for another purpose, this doesn't look clean and doesn't allow to
move this member into arch_uprobe_task.

This patch adds the union with 2 anonymous structs into uprobe_task.

The first struct is autask + vaddr, this way we "almost" move vaddr
into autask.

The second struct has 2 new members for uprobe_copy_process() paths:
->dup_xol_addr which can be used instead ->vaddr, and ->dup_xol_work
which can be used to avoid kmalloc() and simplify the code.

Note that this union will likely have another member(s), we need
something like "private_data_for_handlers" so that the tracing
handlers could use it to communicate with call_fetch() methods.

Signed-off-by: Oleg Nesterov
Reviewed-by: Masami Hiramatsu
Acked-by: Srikar Dronamraju

Oleg Nesterov
2013-11-20 23:30:57 +0800

19 Nov, 2013

1 commit

06db0b217 perf: Remove fragile swevent hlist optimization ... Browse Code »

Currently we only allocate a single cpu hashtable for per-cpu
swevents; do away with this optimization for it is fragile in the face
of things like perf_pmu_migrate_context().

The easiest thing is to make sure all CPUs are consistent wrt state.

Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/20130913111447.GN31370@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar

Peter Zijlstra
2013-11-19 23:57:42 +0800

13 Nov, 2013

1 commit

008208c6b list: introduce list_next_entry() and list_prev_entry() ... Browse Code »

Add two trivial helpers list_next_entry() and list_prev_entry(), they
can have a lot of users including list.h itself. In fact the 1st one is
already defined in events/core.c and bnx2x_sp.c, so the patch simply
moves the definition to list.h.

Signed-off-by: Oleg Nesterov
Cc: Eilon Greenstein
Cc: Greg Kroah-Hartman
Cc: Peter Zijlstra
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2013-11-13 11:09:23 +0800

12 Nov, 2013

1 commit

ad5d69899 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull perf updates from Ingo Molnar:
"As a first remark I'd like to note that the way to build perf tooling
has been simplified and sped up, in the future it should be enough for
you to build perf via:

cd tools/perf/
make install

(ie without the -j option.) The build system will figure out the
number of CPUs and will do a parallel build+install.

The various build system inefficiencies and breakages Linus reported
against the v3.12 pull request should now be resolved - please
(re-)report any remaining annoyances or bugs.

Main changes on the perf kernel side:

* Performance optimizations:
. perf ring-buffer code optimizations, by Peter Zijlstra
. perf ring-buffer code optimizations, by Oleg Nesterov
. x86 NMI call-stack processing optimizations, by Peter Zijlstra
. perf context-switch optimizations, by Peter Zijlstra
. perf sampling speedups, by Peter Zijlstra
. x86 Intel PEBS processing speedups, by Peter Zijlstra

* Enhanced hardware support:
. for Intel Ivy Bridge-EP uncore PMUs, by Zheng Yan
. for Haswell transactions, by Andi Kleen, Peter Zijlstra

* Core perf events code enhancements and fixes by Oleg Nesterov:
. for uprobes, if fork() is called with pending ret-probes
. for uprobes platform support code

* New ABI details by Andi Kleen:
. Report x86 Haswell TSX transaction abort cost as weight

Main changes on the perf tooling side (some of these tooling changes
utilize the above kernel side changes):

* 'perf report/top' enhancements:

. Convert callchain children list to rbtree, greatly reducing the
time taken for callchain processing, from Namhyung Kim.

. Add new COMM infrastructure, further improving histogram
processing, from Frédéric Weisbecker, one fix from Namhyung Kim.

. Add /proc/kcore based live-annotation improvements, including
build-id cache support, multi map 'call' instruction navigation
fixes, kcore address validation, objdump workarounds. From
Adrian Hunter.

. Show progress on histogram collapsing, that can take a long
time, from Namhyung Kim.

. Add --max-stack option to limit callchain stack scan in 'top'
and 'report', improving callchain processing when reducing the
stack depth is an option, from Waiman Long.

. Add new option --ignore-vmlinux for perf top, from Willy
Tarreau.

* 'perf trace' enhancements:

. 'perf trace' now can can use a 'perf probe' dynamic tracepoints
to hook into the userspace -> kernel pathname copy so that it
can map fds to pathnames without reading /proc/pid/fd/ symlinks.
From Arnaldo Carvalho de Melo.

. Show VFS path associated with fd in live sessions, using a
'vfs_getname' 'perf probe' created dynamic tracepoint or by
looking at /proc/pid/fd, from Arnaldo Carvalho de Melo.

. Add 'trace' beautifiers for lots of syscall arguments, from
Arnaldo Carvalho de Melo.

. Implement more compact 'trace' output by suppressing zeroed
args, from Arnaldo Carvalho de Melo.

. Show thread COMM by default in 'trace', from Arnaldo Carvalho de
Melo.

. Add option to show full timestamp in 'trace', from David Ahern.

. Add 'record' command in 'trace', to record raw_syscalls:*, from
David Ahern.

. Add summary option to dump syscall statistics in 'trace', from
David Ahern.

. Improve error messages in 'trace', providing hints about system
configuration steps needed for using it, from Ramkumar
Ramachandra.

. 'perf trace' now emits hints as to why tracing is not possible,
helping the user to setup the system to allow tracing in the
desired permission granularity, telling if the problem is due to
debugfs not being mounted or with not enough permission for
!root, /proc/sys/kernel/perf_event_paranoit value, etc. From
Arnaldo Carvalho de Melo.

* 'perf record' enhancements:

. Check maximum frequency rate for record/top, emitting better
error messages, from Jiri Olsa.

. 'perf record' code cleanups, from David Ahern.

. Improve write_output error message in 'perf record', from Adrian
Hunter.

. Allow specifying B/K/M/G unit to the --mmap-pages arguments,
from Jiri Olsa.

. Fix command line callchain attribute tests to handle the new
-g/--call-chain semantics, from Arnaldo Carvalho de Melo.

* 'perf kvm' enhancements:

. Disable live kvm command if timerfd is not supported, from David
Ahern.

. Fix detection of non-core features, from David Ahern.

* 'perf list' enhancements:

. Add usage to 'perf list', from David Ahern.

. Show error in 'perf list' if tracepoints not available, from
Pekka Enberg.

* 'perf probe' enhancements:

. Support "$vars" meta argument syntax for local variables,
allowing asking for all possible variables at a given probe
point to be collected when it hits, from Masami Hiramatsu.

* 'perf sched' enhancements:

. Address the root cause of that 'perf sched' stack initialization
build slowdown, by programmatically setting a big array after
moving the global variable back to the stack. Fix from Adrian
Hunter.

* 'perf script' enhancements:

. Set up output options for in-stream attributes, from Adrian
Hunter.

. Print addr by default for BTS in 'perf script', from Adrian
Juntmer

* 'perf stat' enhancements:

. Improved messages when doing profiling in all or a subset of
CPUs using a workload as the session delimitator, as in:

'perf stat --cpu 0,2 sleep 10s'

from Arnaldo Carvalho de Melo.

. Add units to nanosec-based counters in 'perf stat', from David
Ahern.

. Remove bogus info when using 'perf stat' -e cycles/instructions,
from Ramkumar Ramachandra.

* 'perf lock' enhancements:

. 'perf lock' fixes and cleanups, from Davidlohr Bueso.

* 'perf test' enhancements:

. Fixup PERF_SAMPLE_TRANSACTION handling in sample synthesizing
and 'perf test', from Adrian Hunter.

. Clarify the "sample parsing" test entry, from Arnaldo Carvalho
de Melo.

. Consider PERF_SAMPLE_TRANSACTION in the "sample parsing" test,
from Arnaldo Carvalho de Melo.

. Memory leak fixes in 'perf test', from Felipe Pena.

* 'perf bench' enhancements:

. Change the procps visible command-name of invididual benchmark
tests plus cleanups, from Ingo Molnar.

* Generic perf tooling infrastructure/plumbing changes:

. Separating data file properties from session, code
reorganization from Jiri Olsa.

. Fix version when building out of tree, as when using one of
these:

$ make help | grep perf
perf-tar-src-pkg - Build perf-3.12.0.tar source tarball
perf-targz-src-pkg - Build perf-3.12.0.tar.gz source tarball
perf-tarbz2-src-pkg - Build perf-3.12.0.tar.bz2 source tarball
perf-tarxz-src-pkg - Build perf-3.12.0.tar.xz source tarball
$

from David Ahern.

. Enhance option parse error message, showing just the help lines
of the options affected, from Namhyung Kim.

. libtraceevent updates from upstream trace-cmd repo, from Steven
Rostedt.

. Always use perf_evsel__set_sample_bit to set sample_type, from
Adrian Hunter.

. Memory and mmap leak fixes from Chenggang Qin.

. Assorted build fixes for from David Ahern and Jiri Olsa.

. Speed up and prettify the build system, from Ingo Molnar.

. Implement addr2line directly using libbfd, from Roberto Vitillo.

. Separate the GTK support in a separate libperf-gtk.so DSO, that
is only loaded when --gtk is specified, from Namhyung Kim.

. perf bash completion fixes and improvements from Ramkumar
Ramachandra.

. Support for Openembedded/Yocto -dbg packages, from Ricardo
Ribalda Delgado.

And lots and lots of other fixes and code reorganizations that did not
make it into the list, see the shortlog, diffstat and the Git log for
details!"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (300 commits)
uprobes: Fix the memory out of bound overwrite in copy_insn()
uprobes: Fix the wrong usage of current->utask in uprobe_copy_process()
perf tools: Remove unneeded include
perf record: Remove post_processing_offset variable
perf record: Remove advance_output function
perf record: Refactor feature handling into a separate function
perf trace: Don't relookup fields by name in each sample
perf tools: Fix version when building out of tree
perf evsel: Ditch evsel->handler.data field
uprobes: Export write_opcode() as uprobe_write_opcode()
uprobes: Introduce arch_uprobe->ixol
uprobes: Kill module_init() and module_exit()
uprobes: Move function declarations out of arch
perf/x86/intel: Add Ivy Bridge-EP uncore IRP box support
perf/x86/intel/uncore: Add filter support for IvyBridge-EP QPI boxes
perf: Factor out strncpy() in perf_event_mmap_event()
tools/perf: Add required memory barriers
perf: Fix arch_perf_out_copy_user default
perf: Update a stale comment
perf: Optimize perf_output_begin() -- address calculation
...

Linus Torvalds
2013-11-12 09:06:34 +0800

10 Nov, 2013

2 commits

2ded0980a uprobes: Fix the memory out of bound overwrite in copy_insn() ... Browse Code »

1. copy_insn() doesn't look very nice, all calculations are
confusing and it is not immediately clear why do we read
the 2nd page first.

2. The usage of inode->i_size is wrong on 32-bit machines.

3. "Instruction at end of binary" logic is simply wrong, it
doesn't handle the case when uprobe->offset > inode->i_size.

In this case "bytes" overflows, and __copy_insn() writes to
the memory outside of uprobe->arch.insn.

Yes, uprobe_register() checks i_size_read(), but this file
can be truncated after that. All i_size checks are racy, we
do this only to catch the obvious mistakes.

Change copy_insn() to call __copy_insn() in a loop, simplify
and fix the bytes/nbytes calculations.

Note: we do not care if we read extra bytes after inode->i_size
if we got the valid page. This is fine because the task gets the
same page after page-fault, and arch_uprobe_analyze_insn() can't
know how many bytes were actually read anyway.

Signed-off-by: Oleg Nesterov

Oleg Nesterov
2013-11-10 00:05:43 +0800
70d7f9872 uprobes: Fix the wrong usage of current->utask in uprobe_copy_process() ... Browse Code »

Commit aa59c53fd459 "uprobes: Change uprobe_copy_process() to dup
xol_area" has a stupid typo, we need to setup t->utask->vaddr but
the code wrongly uses current->utask.

Even with this bug dup_xol_work() works "in practice", but only
because get_unmapped_area(NULL, TASK_SIZE - PAGE_SIZE) likely
returns the same address every time.

Signed-off-by: Oleg Nesterov

Oleg Nesterov
2013-11-10 00:05:41 +0800

07 Nov, 2013

4 commits

0324e7453 Merge tag 'driver-core-3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core ... Browse Code »

Pull driver core / sysfs patches from Greg KH:
"Here's the big driver core / sysfs update for 3.13-rc1.

There's lots of dev_groups updates for different subsystems, as they
all get slowly migrated over to the safe versions of the attribute
groups (removing userspace races with the creation of the sysfs
files.) Also in here are some kobject updates, devres expansions, and
the first round of Tejun's sysfs reworking to enable it to be used by
other subsystems as a backend for an in-kernel filesystem.

All of these have been in linux-next for a while with no reported
issues"

* tag 'driver-core-3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (83 commits)
sysfs: rename sysfs_assoc_lock and explain what it's about
sysfs: use generic_file_llseek() for sysfs_file_operations
sysfs: return correct error code on unimplemented mmap()
mdio_bus: convert bus code to use dev_groups
device: Make dev_WARN/dev_WARN_ONCE print device as well as driver name
sysfs: separate out dup filename warning into a separate function
sysfs: move sysfs_hash_and_remove() to fs/sysfs/dir.c
sysfs: remove unused sysfs_get_dentry() prototype
sysfs: honor bin_attr.attr.ignore_lockdep
sysfs: merge sysfs_elem_bin_attr into sysfs_elem_attr
devres: restore zeroing behavior of devres_alloc()
sysfs: fix sysfs_write_file for bin file
input: gameport: convert bus code to use dev_groups
input: serio: remove bus usage of dev_attrs
input: serio: use DEVICE_ATTR_RO()
i2o: convert bus code to use dev_groups
memstick: convert bus code to use dev_groups
tifm: convert bus code to use dev_groups
virtio: convert bus code to use dev_groups
ipack: convert bus code to use dev_groups
...

Linus Torvalds
2013-11-07 10:42:15 +0800
f72d41fa9 uprobes: Export write_opcode() as uprobe_write_opcode() ... Browse Code »

set_swbp() and set_orig_insn() are __weak, but this is pointless
because write_opcode() is static.

Export write_opcode() as uprobe_write_opcode() for the upcoming
arm port, this way it can actually override set_swbp() and use
__opcode_to_mem_arm(bpinsn) instead if UPROBE_SWBP_INSN.

Signed-off-by: Oleg Nesterov

Oleg Nesterov
2013-11-07 03:00:09 +0800
8a8de66c4 uprobes: Introduce arch_uprobe->ixol ... Browse Code »

Currently xol_get_insn_slot() assumes that we should simply copy
arch_uprobe->insn[] which is (ignoring arch_uprobe_analyze_insn)
just the copy of the original insn.

This is not true for arm which needs to create another insn to
execute it out-of-line.

So this patch simply adds the new member, ->ixol into the union.
This doesn't make any difference for x86 and powerpc, but arm
can divorce insn/ixol and initialize the correct xol insn in
arch_uprobe_analyze_insn().

Signed-off-by: Oleg Nesterov

Oleg Nesterov
2013-11-07 03:00:05 +0800
736e89d9f uprobes: Kill module_init() and module_exit() ... Browse Code »

Turn module_init() into __initcall() and kill module_exit().

This code can't be compiled as a module so these module_*()
calls only add the confusion, especially if arch-dependant
code needs its own initialization hooks.

Signed-off-by: Oleg Nesterov

Oleg Nesterov
2013-11-07 02:59:50 +0800

06 Nov, 2013

4 commits

c7e548b45 perf: Factor out strncpy() in perf_event_mmap_event() ... Browse Code »

While this is really minor, but strncpy() does the unnecessary
zero-padding till the end of tmp[16] and it is called every time
we are going to use the string literal.

Turn these strncpy()'s into the single strlcpy() under the new
label, saves 72 bytes.

Signed-off-by: Oleg Nesterov
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/20131017182417.GA17753@redhat.com
Signed-off-by: Ingo Molnar

Oleg Nesterov
2013-11-06 19:34:28 +0800
0a196848c perf: Fix arch_perf_out_copy_user default ... Browse Code »
10

The arch_perf_output_copy_user() default of
__copy_from_user_inatomic() returns bytes not copied, while all other
argument functions given DEFINE_OUTPUT_COPY() return bytes copied.

Since copy_from_user_nmi() is the odd duck out by returning bytes
copied where all other *copy_{to,from}* functions return bytes not
copied, change it over and ammend DEFINE_OUTPUT_COPY() to expect bytes
not copied.

Oddly enough DEFINE_OUTPUT_COPY() already returned bytes not copied
while expecting its worker functions to return bytes copied.

Signed-off-by: Peter Zijlstra
Acked-by: will.deacon@arm.com
Cc: Frederic Weisbecker
Link: http://lkml.kernel.org/r/20131030201622.GR16117@laptop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar

Peter Zijlstra
2013-11-06 19:34:25 +0800
394570b79 perf: Update a stale comment ... Browse Code »

Signed-off-by: Peter Zijlstra
Cc: Benjamin Herrenschmidt
Cc: Frederic Weisbecker
Cc: Mathieu Desnoyers
Cc: Michael Ellerman
Cc: Michael Neuling
Cc: "Paul E. McKenney"
Cc: james.hogan@imgtec.com
Cc: Vince Weaver
Cc: Victor Kaplansky
Cc: Oleg Nesterov
Cc: Anton Blanchard
Link: http://lkml.kernel.org/n/tip-9s5mze78gmlz19agt39i8rii@git.kernel.org
Signed-off-by: Ingo Molnar

Peter Zijlstra
2013-11-06 19:34:23 +0800
524feca5e perf: Optimize perf_output_begin() -- address calculation ... Browse Code »

Rewrite the handle address calculation code to be clearer.

Saves 8 bytes on x86_64-defconfig.

Signed-off-by: Peter Zijlstra
Cc: Benjamin Herrenschmidt
Cc: Frederic Weisbecker
Cc: Mathieu Desnoyers
Cc: Michael Ellerman
Cc: Michael Neuling
Cc: "Paul E. McKenney"
Cc: james.hogan@imgtec.com
Cc: Vince Weaver
Cc: Victor Kaplansky
Cc: Oleg Nesterov
Cc: Anton Blanchard
Link: http://lkml.kernel.org/n/tip-3trb2n2henb9m27tncef3ag7@git.kernel.org
Signed-off-by: Ingo Molnar

Peter Zijlstra
2013-11-06 19:34:22 +0800