07 Aug, 2010

1 commit

  • …/git/tip/linux-2.6-tip

    * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (27 commits)
    sched: Use correct macro to display sched_child_runs_first in /proc/sched_debug
    sched: No need for bootmem special cases
    sched: Revert nohz_ratelimit() for now
    sched: Reduce update_group_power() calls
    sched: Update rq->clock for nohz balanced cpus
    sched: Fix spelling of sibling
    sched, cpuset: Drop __cpuexit from cpu hotplug callbacks
    sched: Fix the racy usage of thread_group_cputimer() in fastpath_timer_check()
    sched: run_posix_cpu_timers: Don't check ->exit_state, use lock_task_sighand()
    sched: thread_group_cputime: Simplify, document the "alive" check
    sched: Remove the obsolete exit_state/signal hacks
    sched: task_tick_rt: Remove the obsolete ->signal != NULL check
    sched: __sched_setscheduler: Read the RLIMIT_RTPRIO value lockless
    sched: Fix comments to make them DocBook happy
    sched: Fix fix_small_capacity
    powerpc: Exclude arch_sd_sibiling_asym_packing() on UP
    powerpc: Enable asymmetric SMT scheduling on POWER7
    sched: Add asymmetric group packing option for sibling domain
    sched: Fix capacity calculations for SMT4
    sched: Change nohz idle load balancing logic to push model
    ...

    Linus Torvalds
     

18 Jun, 2010

1 commit


10 Jun, 2010

1 commit


09 Jun, 2010

12 commits

  • Since all modifications to event->count (and ->prev_count
    and ->period_left) are now local to a CPU, change them to local64_t
    so we avoid the LOCK'ed ops (a minimal usage sketch follows this
    entry).

    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
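    The entry above relies on the kernel's local64_t type; as a minimal,
    kernel-style sketch (not a standalone program; the structure and its
    field names are illustrative, only the local64_* calls are the real
    API from <linux/local64.h>):

        #include <linux/local64.h>

        struct counter_sketch {
                local64_t  count;        /* only ever modified on its own CPU */
                atomic64_t child_count;  /* cross-CPU sums still need atomics */
        };

        static void add_local_delta(struct counter_sketch *c, long delta)
        {
                /* cheap: no LOCK-prefixed instruction is needed, because
                 * the value is never modified from another CPU */
                local64_add(delta, &c->count);
        }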
     
  • Only child counters adding back their values into the parent counter
    are responsible for cross-cpu updates to event->count.

    So if we pull that out into a new child_count variable, we get an
    event->count that is only modified locally.

    Signed-off-by: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    Cc: Paul Mackerras
    Cc: Mike Galbraith
    Cc: Steven Rostedt
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Create a helper function for those sites that want to read the event count.

    Signed-off-by: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    Cc: Paul Mackerras
    Cc: Mike Galbraith
    Cc: Steven Rostedt
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
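    A hedged sketch of what such a read helper could look like, tying this
    entry together with the child_count change above (it reuses the
    counter_sketch structure from the earlier sketch; the function name is
    illustrative):

        /* total = locally updated count + values folded back by children */
        static u64 read_event_count(struct counter_sketch *c)
        {
                return local64_read(&c->count) +
                       atomic64_read(&c->child_count);
        }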
     
  • Currently there is perf_buffer_alloc() + perf_buffer_init() + some
    separate bits; fold it all into a single perf_buffer_alloc() and
    leave only the attachment to the event separate.

    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Rename to clarify code.

    s/perf_mmap_data/perf_buffer/g and selective s/data/buffer/g

    Signed-off-by: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    Cc: Paul Mackerras
    Cc: Mike Galbraith
    Cc: Steven Rostedt
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Clarify some of the transactional group scheduling API details
    and change it so that a successful ->commit_txn also closes
    the transaction (a sketch of the calling convention follows this
    entry).

    Signed-off-by: Peter Zijlstra
    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    Cc: Paul Mackerras
    Cc: Mike Galbraith
    Cc: Steven Rostedt
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
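    A sketch of the clarified calling convention, using hypothetical types
    and names (the real struct pmu callbacks and their arguments differ
    between kernel versions):

        struct txn_pmu {
                void (*start_txn)(struct txn_pmu *pmu);
                int  (*commit_txn)(struct txn_pmu *pmu); /* 0 on success */
                void (*cancel_txn)(struct txn_pmu *pmu); /* undo adds    */
        };

        static int group_sched_in_sketch(struct txn_pmu *pmu,
                                         int (*sched_in)(int idx), int n)
        {
                int i;

                pmu->start_txn(pmu);

                for (i = 0; i < n; i++) {
                        if (sched_in(i)) {
                                /* failure inside the group: abort explicitly */
                                pmu->cancel_txn(pmu);
                                return -1;
                        }
                }

                /* a successful ->commit_txn also closes the transaction,
                 * so no cancel_txn() is needed on this path */
                return pmu->commit_txn(pmu);
        }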
     
  • Add the capability to track data mmap()s. This can be used together
    with PERF_SAMPLE_ADDR for data profiling (see the sketch after this
    entry).

    Signed-off-by: Anton Blanchard
    [Updated code for stable perf ABI]
    Signed-off-by: Eric B Munson
    Signed-off-by: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    Cc: Paul Mackerras
    Cc: Mike Galbraith
    Cc: Steven Rostedt
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Eric B Munson
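    A user-space sketch of how this could be requested, assuming the new
    perf_event_attr bit is the mmap_data flag found in later kernels and
    that PERF_SAMPLE_ADDR is also wanted for the data-profiling use case
    mentioned above:

        #include <linux/perf_event.h>
        #include <string.h>

        static void init_attr(struct perf_event_attr *attr)
        {
                memset(attr, 0, sizeof(*attr));
                attr->size        = sizeof(*attr);
                attr->sample_type = PERF_SAMPLE_ADDR; /* record data addresses */
                attr->mmap        = 1;                /* executable mmap()s    */
                attr->mmap_data   = 1;                /* new: data mmap()s too */
        }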
     
  • __DO_TRACE() already calls the callbacks under rcu_read_lock_sched(),
    which is sufficient for our needs, avoid doing it again.

    Signed-off-by: Peter Zijlstra
    Cc: Steven Rostedt
    Cc: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Inline perf_swevent_put_recursion_context into perf_tp_event(), this
    shrinks the per trace template code footprint and saves a function
    call.

    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • For people who would otherwise write cpu_clock(smp_processor_id()),
    there is now local_clock().

    Also, as per suggestion from Andrew, provide some documentation on
    the various clock interfaces, and minimize the unsigned long long vs
    u64 mess.

    Signed-off-by: Peter Zijlstra
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Jens Axboe
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
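    In other words (both interfaces exist; the exact behaviour depends on
    the architecture's sched_clock() stability):

        u64 t_old = cpu_clock(smp_processor_id()); /* the old spelling */
        u64 t_new = local_clock();                 /* the new helper   */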
     
  • Drop this argument now that we always want to rewind only to the
    state of the first caller.
    It means frame pointers are not necessary anymore to reliably get
    the source of an event. But this also means we need this helper
    to be a macro now, as an inline function is not an option since
    we need to know when to provide a default implementation.

    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Paul Mackerras
    Cc: David Miller
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo

    Frederic Weisbecker
     
  • Frederic reported that frequency driven swevents didn't work properly
    and even caused a division-by-zero error.

    It turns out there are two bugs: the division-by-zero comes from a
    failure to deal with that case in perf_calculate_period().

    The other was more interesting and turned out to be a wrong comparison
    in perf_adjust_period(). The comparison was between an s64 and u64 and
    got implicitly converted to an unsigned comparison. The problem is
    that period_left is typically < 0, so it ended up being always true.

    Cure this by making the local period variables s64.

    Reported-by: Frederic Weisbecker
    Tested-by: Frederic Weisbecker
    Signed-off-by: Peter Zijlstra
    Cc:
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
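    A self-contained illustration of the implicit-conversion pitfall
    described above (plain C rather than the kernel code; the variable
    names only echo the description):

        #include <stdio.h>
        #include <stdint.h>

        int main(void)
        {
                int64_t  period_left = -100;  /* typically negative, as noted */
                uint64_t period      = 5000;

                /* period_left is converted to an unsigned type, so a negative
                 * value becomes huge and the test is always true */
                if (period_left > period)
                        printf("unsigned comparison: taken (wrong)\n");

                /* keeping both operands signed gives the intended result */
                if (period_left > (int64_t)period)
                        printf("signed comparison: taken\n");
                else
                        printf("signed comparison: not taken (correct)\n");

                return 0;
        }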
     

03 Jun, 2010

1 commit

  • Frederic reported that because swevents handling doesn't disable IRQs
    anymore, we can get a recursion of perf_adjust_period(), once from
    overflow handling and once from the tick.

    If both call ->disable, we get a double hlist_del_rcu() and trigger
    a LIST_POISON2 dereference.

    Since we don't actually need to stop/start a swevent to re-program
    the hardware (there is no hardware to program), simply nop out these
    callbacks for the swevent pmu.

    Reported-by: Frederic Weisbecker
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

31 May, 2010

4 commits

  • If a sample crosses a page boundary, the copy is made in more than
    one step. However, we forget to advance the source offset for the
    next copy, leading to unexpected double copies that completely mess
    up the traces (a simplified model of the fix follows this entry).

    This fixes various kinds of bad traces that have irrelevant
    data inside, as an example:

    geany-4979 [001] 5758.077775: sched_switch: prev_comm=! prev_pid=121
    prev_prio=0 prev_state=S|D|Z|X|x ==> next_comm= next_pid=7497072
    next_prio=0

    Signed-off-by: Frederic Weisbecker
    Cc: Arnaldo Carvalho de Melo
    Cc: Paul Mackerras
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
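    A simplified, self-contained model of the fix (the real perf output
    code is different; this only shows why the source offset must advance
    together with the destination offset):

        #include <stdio.h>
        #include <string.h>

        #define CHUNK 4  /* stand-in for the page-sized copy step */

        static void copy_chunked(char *dst, const char *src, size_t len)
        {
                size_t done = 0;

                while (done < len) {
                        size_t n = len - done;

                        if (n > CHUNK)
                                n = CHUNK;

                        /* advancing 'src + done' is the fix: without it every
                         * chunk re-copies the start of the sample */
                        memcpy(dst + done, src + done, n);
                        done += n;
                }
        }

        int main(void)
        {
                char out[16] = { 0 };

                copy_chunked(out, "abcdefghijk", 11);
                printf("%s\n", out); /* prints: abcdefghijk */
                return 0;
        }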
     
  • The transactional API patch between the generic and model-specific
    code introduced several important bugs with event scheduling, at
    least on X86. If you had pinned events, e.g., watchdog, and were
    over-committing the PMU, you would get bogus counts. The bug was
    showing up on Intel CPUs because events would move around more
    often than on AMD. But the problem also existed on AMD, though it
    was harder to expose.

    The issues were:

    - group_sched_in() was missing a cancel_txn() in the error path

    - cpuc->n_added was not properly maintained, leading to missing
    actions in hw_perf_enable(), i.e., n_running being 0. You cannot
    update n_added until you know the transaction has succeeded. In the
    case of a failed transaction, n_added was not adjusted back.

    - in case of failed transactions, event_sched_out() was called
    and eventually invoked x86_disable_event() to touch the HW reg.
    But with transactions, on X86, event_sched_in() does not touch
    HW registers, it simply collects events into a list. Thus, you
    could end up calling x86_disable_event() on a counter which
    did not correspond to the current event when idx != -1.

    The patch modifies the generic and X86 code to avoid all those problems.

    First, we keep track of the number of events added last. In case the
    transaction fails, we subtract them from n_added. This approach is
    necessary (as opposed to delaying updates to n_added) because not all
    event updates use the transaction API, e.g., single events.

    Second, we encapsulate the event_sched_in() and event_sched_out() in
    group_sched_in() inside the transaction. That makes the operations
    symmetrical and you can also detect that you are inside a transaction
    and skip the HW reg access by checking cpuc->group_flag.

    With this patch, you can now overcommit the PMU even with pinned
    system-wide events present and still get valid counts.

    Signed-off-by: Stephane Eranian
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Stephane Eranian
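    A sketch of the first part of that accounting, with illustrative names
    (the real counters live in the x86 cpu_hw_events structure and the
    exact bookkeeping is more involved):

        struct cpu_events_sketch {
                int n_added; /* events added since the last hw_perf_enable()  */
                int n_txn;   /* events added by the currently open transaction */
        };

        static void txn_start(struct cpu_events_sketch *c)
        {
                c->n_txn = 0;
        }

        static void txn_add_event(struct cpu_events_sketch *c)
        {
                c->n_added++;
                c->n_txn++;
        }

        static void txn_cancel(struct cpu_events_sketch *c)
        {
                /* the transaction failed: take back what it contributed,
                 * leaving additions made outside any transaction intact */
                c->n_added -= c->n_txn;
                c->n_txn = 0;
        }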
     
  • Group siblings don't pin each other or the parent, so when we destroy
    events we must make sure to clean up all cross referencing pointers.

    In particular, for destruction of a group leader we must be able to
    find all its siblings and remove their reference to it.

    This means that detaching an event from its context must not detach it
    from the group, otherwise we can end up failing to clear all pointers.

    Solve this by clearly separating the attachment to a context and
    attachment to a group, and keep the group composed until we destroy
    the events.

    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • In order to move toward separate buffer objects, rework the whole
    perf_mmap_data construct to be a more self-sufficient entity, one
    with its own lifetime rules.

    This greatly sanitizes the whole output redirection code, which
    was riddled with bugs and races.

    Signed-off-by: Peter Zijlstra
    Cc:
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

28 May, 2010

1 commit


21 May, 2010

8 commits

  • Since we know tracepoints come from kernel context,
    avoid conditionals that try to establish that very
    fact.

    Signed-off-by: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    Cc: Paul Mackerras
    Cc: Mike Galbraith
    Cc: Steven Rostedt
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Sanity checks cost instructions.

    Signed-off-by: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    Cc: Paul Mackerras
    Cc: Mike Galbraith
    Cc: Steven Rostedt
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Reduce code and data by using the knowledge that for
    !PERF_USE_VMALLOC data_order is always 0.

    Signed-off-by: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    Cc: Paul Mackerras
    Cc: Mike Galbraith
    Cc: Steven Rostedt
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Reduce the clutter in perf_output_copy() by keeping
    an iterator in perf_output_handle.

    Signed-off-by: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    Cc: Paul Mackerras
    Cc: Mike Galbraith
    Cc: Steven Rostedt
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • RO mmap()s don't update the tail pointer, so
    comparing against it for determining the written data
    size doesn't really do any good.

    Keep track of when we last did a wakeup, and compare
    against that.

    Signed-off-by: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    Cc: Paul Mackerras
    Cc: Mike Galbraith
    Cc: Steven Rostedt
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Since we want to ensure buffers only have a single
    writer, we must avoid creating one with multiple writers.

    Signed-off-by: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    Cc: Paul Mackerras
    Cc: Mike Galbraith
    Cc: Steven Rostedt
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Avoid the swevent hash-table by using per-tracepoint
    hlists.

    Also, avoid conditionals on the fast path by ordering
    with probe unregister so that we should never get on
    the callback path without the data being there.

    Signed-off-by: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    Cc: Paul Mackerras
    Cc: Mike Galbraith
    Cc: Steven Rostedt
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • A writer that gets a reference to the buffer handle disables
    preemption. When we put that reference, we check whether we are
    the outermost writer and, if not, we simply return and defer
    the head update to the outermost writer. The problem here
    is that preemption is only re-enabled by the outermost writer,
    which produces a preemption count imbalance for every nested
    writer that exits.

    So always re-enable preemption when we put the buffer reference,
    whoever we are (see the sketch after this entry).

    Fixes lots of sleeping-while-atomic warnings, visible when
    recording lock events.

    Signed-off-by: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Paul Mackerras
    Cc: Stephane Eranian
    Cc: Robert Richter

    Frederic Weisbecker
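    A kernel-style sketch of the fix (the structure and function names are
    illustrative; only preempt_disable()/preempt_enable() are the real
    API):

        struct handle_sketch {
                int nest; /* nesting depth of writers on this CPU */
        };

        static void writer_get(struct handle_sketch *h)
        {
                preempt_disable();
                h->nest++;
        }

        static void writer_put(struct handle_sketch *h)
        {
                if (--h->nest == 0) {
                        /* outermost writer: publish the new head here */
                }

                /* the fix: balance the preempt_disable() unconditionally,
                 * for nested writers as well as the outermost one */
                preempt_enable();
        }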
     

20 May, 2010

2 commits

  • …eric/random-tracing into perf/core

    Ingo Molnar
     
  • The software events hlist doesn't fully comply with the new
    RCU checks API.

    We need to consider three different sides that access the hlist:

    - the hlist allocation/release side. This happens when an
    event is created or released; accesses to the hlist are
    serialized under the cpuctx mutex.

    - the events insertion/removal in the hlist. This side is always
    serialized against the one above, and the hlist is always present
    during such operations. It happens when a software event is
    scheduled in or out. The serialization that ensures the software
    event is really attached to the context is done under ctx->lock.

    - events triggering. This is the read side, it can happen
    concurrently with any update side.

    This patch deals with them one by one and anticipates the
    separate RCU memory-space patches in preparation.

    It fixes various annoying RCU warnings.

    Reported-by: Paul E. McKenney
    Signed-off-by: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Paul Mackerras

    Frederic Weisbecker
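    A sketch of the three access patterns, using a hypothetical structure
    in place of the real per-CPU software-event bookkeeping; the RCU
    primitives themselves (rcu_dereference_protected(), rcu_dereference(),
    lockdep_is_held()) are the real kernel API:

        struct swhash_sketch {
                struct mutex       mutex; /* serializes allocation/release */
                raw_spinlock_t     lock;  /* serializes insertion/removal  */
                struct hlist_head *hlist; /* RCU-protected pointer         */
        };

        /* 1) allocation/release side: no RCU read lock is needed, but the
         *    accessor documents which lock makes the dereference safe */
        static struct hlist_head *alloc_side(struct swhash_sketch *sw)
        {
                return rcu_dereference_protected(sw->hlist,
                                lockdep_is_held(&sw->mutex));
        }

        /* 2) insertion/removal happens under sw->lock and publishes with
         *    the _rcu list helpers (hlist_add_head_rcu(), hlist_del_rcu()) */

        /* 3) triggering (read side): runs inside rcu_read_lock() and can
         *    race with either update side */
        static struct hlist_head *read_side(struct swhash_sketch *sw)
        {
                return rcu_dereference(sw->hlist);
        }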
     

19 May, 2010

7 commits


11 May, 2010

2 commits

  • Corey reported that the value scale times of group siblings are not
    updated when the monitored task dies.

    The problem appears to be that we only update the group leader's
    time values; fix it by updating the whole group (see the sketch
    after this entry).

    Reported-by: Corey Ashford
    Signed-off-by: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Mike Galbraith
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    Cc: # .34.x
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
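    Roughly the shape of the fix, assuming the perf_event layout of that
    era (sibling_list/group_entry) and the internal update_event_times()
    helper; treat it as a sketch rather than the exact patch:

        static void update_group_times_sketch(struct perf_event *leader)
        {
                struct perf_event *sibling;

                /* update the leader's times, then walk every sibling
                 * instead of stopping at the leader */
                update_event_times(leader);
                list_for_each_entry(sibling, &leader->sibling_list, group_entry)
                        update_event_times(sibling);
        }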
     
  • Both Stephane and Corey reported that PERF_FORMAT_GROUP didn't
    work as expected if the task the counters were attached to quit
    before the read() call.

    The cause is that we unconditionally destroy the grouping when
    we remove counters from their context. Fix this by splitting the
    group destruction off from the list removal, so that
    perf_event_remove_from_context() no longer destroys the group,
    and change perf_event_release() to do so instead.

    Reported-by: Corey Ashford
    Reported-by: Stephane Eranian
    Signed-off-by: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    Cc: Paul Mackerras
    Cc: # .34.x
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra