14 Apr, 2011
8 commits
-
In preparation for calling select_task_rq() without rq->lock held, drop
the dependency on the rq argument.
Reviewed-by: Frank Rowand
Signed-off-by: Peter Zijlstra
Cc: Mike Galbraith
Cc: Nick Piggin
Cc: Linus Torvalds
Cc: Andrew Morton
Link: http://lkml.kernel.org/r/20110405152729.031077745@chello.nl
Signed-off-by: Ingo Molnar
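Roughly, the interface change looks like this (a before/after sketch of the
signatures, not the full diff):

    /* before: callers had to hold rq->lock for the rq argument */
    static int select_task_rq(struct rq *rq, struct task_struct *p,
                              int sd_flags, int wake_flags);

    /* after: no rq argument, so a later patch can call this lockless */
    static int select_task_rq(struct task_struct *p,
                              int sd_flags, int wake_flags);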
-
Currently p->pi_lock already serializes p->sched_class; also put
p->cpus_allowed and try_to_wake_up() under it. This prepares the way
to do the first part of ttwu() without holding rq->lock.
By having p->sched_class and p->cpus_allowed serialized by p->pi_lock,
we prepare the way to call select_task_rq() without holding rq->lock.
Reviewed-by: Frank Rowand
Signed-off-by: Peter Zijlstra
Cc: Mike Galbraith
Cc: Nick Piggin
Cc: Linus Torvalds
Cc: Andrew Morton
Link: http://lkml.kernel.org/r/20110405152728.990364093@chello.nl
Signed-off-by: Ingo Molnar
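A minimal sketch of the locking rule this establishes (illustrative, not the
literal diff): try_to_wake_up() takes p->pi_lock first, which stabilizes
p->state, p->sched_class and p->cpus_allowed for the select_task_rq() call:

    static int try_to_wake_up(struct task_struct *p, unsigned int state,
                              int wake_flags)
    {
        unsigned long flags;
        int success = 0;

        raw_spin_lock_irqsave(&p->pi_lock, flags);
        if (p->state & state) {
            /*
             * p->sched_class and p->cpus_allowed are stable here,
             * so select_task_rq() no longer needs rq->lock.
             */
            success = 1;    /* the real code performs the wakeup here */
        }
        raw_spin_unlock_irqrestore(&p->pi_lock, flags);

        return success;
    }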
-
Provide a generic p->on_rq because the p->se.on_rq semantics are
unfavourable for lockless wakeups but needed for sched_fair.
In particular, p->on_rq is only cleared when we actually dequeue the
task in schedule() and not on any random dequeue as done by things
like __migrate_task() and __sched_setscheduler().
This also allows us to remove p->se usage from !sched_fair code.
Reviewed-by: Frank Rowand
Cc: Mike Galbraith
Cc: Nick Piggin
Cc: Linus Torvalds
Cc: Andrew Morton
Signed-off-by: Ingo Molnar
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/20110405152728.949545047@chello.nl
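The invariant, sketched (helper placement is illustrative): p->on_rq is set
when a wakeup puts the task on a runqueue, and only the sleep path in
schedule() clears it:

    /* wakeup path */
    activate_task(rq, p, en_flags);
    p->on_rq = 1;

    /* schedule(): the only place the flag is cleared */
    deactivate_task(rq, prev, DEQUEUE_SLEEP);
    prev->on_rq = 0;

    /* "random" dequeues such as __migrate_task() leave p->on_rq alone */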
-
Collect all ttwu() stat code into a single function and ensure it is
always called for an actual wakeup (changing p->state to
TASK_RUNNING).
Reviewed-by: Frank Rowand
Cc: Mike Galbraith
Cc: Nick Piggin
Cc: Linus Torvalds
Cc: Andrew Morton
Signed-off-by: Ingo Molnar
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/20110405152728.908177058@chello.nl
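Sketch of the consolidated helper (the real patch is schedstat-heavy; treat
the body as illustrative):

    static void ttwu_stat(struct rq *rq, struct task_struct *p, int cpu,
                          int wake_flags)
    {
    #ifdef CONFIG_SCHEDSTATS
        schedstat_inc(rq, ttwu_count);
        schedstat_inc(p, se.statistics.nr_wakeups);
        if (cpu != task_cpu(p))
            schedstat_inc(p, se.statistics.nr_wakeups_migrate);
    #endif
    }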
-
try_to_wake_up() would only return success when it actually had to
place a task on a rq; change that to every time we change p->state to
TASK_RUNNING, because that's the real measure of wakeups.
As a result, success is always true for the tracepoints.
Reviewed-by: Frank Rowand
Cc: Mike Galbraith
Cc: Nick Piggin
Cc: Linus Torvalds
Cc: Andrew Morton
Signed-off-by: Ingo Molnar
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/20110405152728.866866929@chello.nl
-
wq_worker_waking_up() needs to match wq_worker_sleeping(), since the
latter is only called on deactivate, move the former near activate.
Signed-off-by: Peter Zijlstra
Cc: Tejun Heo
Link: http://lkml.kernel.org/n/top-t3m7n70n9frmv4pv2n5fwmov@git.kernel.org
Signed-off-by: Ingo Molnar
-
Since we now have p->on_cpu unconditionally available, use it to
re-implement mutex_spin_on_owner.
Requested-by: Thomas Gleixner
Reviewed-by: Frank Rowand
Cc: Mike Galbraith
Cc: Nick Piggin
Cc: Linus Torvalds
Cc: Andrew Morton
Signed-off-by: Ingo Molnar
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/20110405152728.826338173@chello.nl
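Sketch of the reimplementation (close to the upstream shape, but treat
details as illustrative): spin for as long as the owner is still running on
a CPU, which p->on_cpu now answers without taking rq->lock:

    static inline bool owner_running(struct mutex *lock,
                                     struct task_struct *owner)
    {
        if (lock->owner != owner)
            return false;

        barrier();    /* read ->on_cpu only after ->owner matched */
        return owner->on_cpu;
    }

    int mutex_spin_on_owner(struct mutex *lock, struct task_struct *owner)
    {
        rcu_read_lock();    /* keeps *owner valid while we poll it */
        while (owner_running(lock, owner)) {
            if (need_resched())
                break;
            arch_mutex_cpu_relax();
        }
        rcu_read_unlock();

        /* owner changed or need_resched(): report success only if free */
        return lock->owner == NULL;
    }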
-
Always provide p->on_cpu so that we can determine if it's on a cpu
without having to lock the rq.
Reviewed-by: Frank Rowand
Signed-off-by: Peter Zijlstra
Cc: Mike Galbraith
Cc: Nick Piggin
Cc: Linus Torvalds
Cc: Andrew Morton
Link: http://lkml.kernel.org/r/20110405152728.785452014@chello.nl
Signed-off-by: Ingo Molnar
13 Apr, 2011
1 commit
-
We really only want to unplug the pending IO when the process actually
goes to sleep. So move the test for flushing the plug up to the place
where we actually deactivate the task - where we have properly checked
for preemption and for the process really sleeping.
Acked-by: Jens Axboe
Acked-by: Peter Zijlstra
Signed-off-by: Linus Torvalds
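Sketch of where the flush now sits in schedule() (illustrative placement):

    if (prev->state && !(preempt_count() & PREEMPT_ACTIVE)) {
        /* really going to sleep, not merely being preempted */
        if (blk_needs_flush_plug(prev)) {
            raw_spin_unlock(&rq->lock);
            blk_schedule_flush_plug(prev);
            raw_spin_lock(&rq->lock);
        }
        deactivate_task(rq, prev, DEQUEUE_SLEEP);
    }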
08 Apr, 2011
2 commits
-
…-linus', 'irq-fixes-for-linus' and 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86-32, fpu: Fix FPU exception handling on non-SSE systems
x86, hibernate: Initialize mmu_cr4_features during boot
x86-32, NUMA: Fix ACPI NUMA init broken by recent x86-64 change
x86: visws: Fixup irq overhaul fallout
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: Clean up rebalance_domains() load-balance interval calculation
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86/mrst/vrtc: Fix boot crash in mrst_rtc_init()
rtc, x86/mrst/vrtc: Fix boot crash in rtc_read_alarm()
* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
genirq: Fix cpumask leak in __setup_irq()
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf probe: Fix listing incorrect line number with inline function
perf probe: Fix to find recursively inlined function
perf probe: Fix multiple --vars options behavior
perf probe: Fix to remove redundant close
perf probe: Fix to ensure function declared file
-
* 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6:
Fix common misspellings
05 Apr, 2011
1 commit
-
Instead of the possible multiple-evaluation of num_online_cpus()
in rebalance_domains() that Linus reported, avoid it altogether
in the normal case since it's implemented with a Hamming weight
function over a cpu bitmask which can be darn expensive for those
with big iron.
This also makes it cleaner, smaller and documents the code.
Reported-by: Linus Torvalds
Signed-off-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
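One plausible shape of the fix (assumed details; the point is hoisting the
num_online_cpus() evaluation out of the balancing loop):

    static unsigned long max_load_balance_interval = HZ/10;

    /* recomputed from cpu-hotplug notifiers, not per balance pass */
    static void update_max_interval(void)
    {
        max_load_balance_interval = HZ * num_online_cpus() / 10;
    }

    /* rebalance_domains() then only clamps against the cached value */
    interval = clamp(interval, 1UL, max_load_balance_interval);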
31 Mar, 2011
2 commits
-
Fixes generated by 'codespell' and manually reviewed.
Signed-off-by: Lucas De Marchi
-
sched_setscheduler() (in sched.c) is called in order to change the
scheduling policy and/or the real-time priority of a task. Thus,
if we find out that neither of those are actually being modified, it
is possible to return earlier and save the overhead of a full
deactivate+activate cycle of the task in question.
Besides that, if we have more than one SCHED_FIFO task with the same
priority on the same rq (which means they share the same priority queue)
having one of them changing its position in the priority queue because of
a sched_setscheduler (as it happens by means of the deactivate+activate)
that does not actually change the priority violates POSIX which states,
for SCHED_FIFO:"If a thread whose policy or priority has been modified by
pthread_setschedprio() is a running thread or is runnable, the effect on
its position in the thread list depends on the direction of the
modification, as follows: a. [...] b. If the priority is unchanged, the
thread does not change position in the thread list. c. [...]"
http://pubs.opengroup.org/onlinepubs/009695399/functions/xsh_chap02_08.html
(ed: And the POSIX specification here does, briefly and somewhat unexpectedly,
match what common sense tells us as well.)
Signed-off-by: Dario Faggioli
Signed-off-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
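Sketch of the early bail-out in __sched_setscheduler() (locking shown for
flavour; treat the details as illustrative):

    /* If not changing anything there's no need to proceed further */
    if (unlikely(policy == p->policy && (!rt_policy(policy) ||
            param->sched_priority == p->rt_priority))) {
        __task_rq_unlock(rq);
        raw_spin_unlock_irqrestore(&p->pi_lock, flags);
        return 0;    /* no dequeue/enqueue, no queue reordering */
    }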
26 Mar, 2011
1 commit
-
…l/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched, doc: Update sched-design-CFS.txt
sched: Remove unused 'rq' variable and cpu_rq() call from alloc_fair_sched_group()
sched.h: Fix a typo ("its")
sched: Fix yield_to kernel-doc
25 Mar, 2011
1 commit
-
* 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block: (65 commits)
Documentation/iostats.txt: bit-size reference etc.
cfq-iosched: removing unnecessary think time checking
cfq-iosched: Don't clear queue stats when preempt.
blk-throttle: Reset group slice when limits are changed
blk-cgroup: Only give unaccounted_time under debug
cfq-iosched: Don't set active queue in preempt
block: fix non-atomic access to genhd inflight structures
block: attempt to merge with existing requests on plug flush
block: NULL dereference on error path in __blkdev_get()
cfq-iosched: Don't update group weights when on service tree
fs: assign sb->s_bdi to default_backing_dev_info if the bdi is going away
block: Require subsystems to explicitly allocate bio_set integrity mempool
jbd2: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging
jbd: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging
fs: make fsync_buffers_list() plug
mm: make generic_writepages() use plugging
blk-cgroup: Add unaccounted time to timeslice_used.
block: fixup plugging stubs for !CONFIG_BLOCK
block: remove obsolete comments for blkdev_issue_zeroout.
blktrace: Use rq->cmd_flags directly in blk_add_trace_rq.
...
Fix up conflicts in fs/{aio.c,super.c}
24 Mar, 2011
1 commit
-
CAP_IPC_OWNER and CAP_IPC_LOCK can be checked against current_user_ns(),
because the resource comes from current's own ipc namespace.
setuid/setgid are to uids in own namespace, so again checks can be against
current_user_ns().
Changelog:
Jan 11: Use task_ns_capable() in place of sched_capable().
Jan 11: Use nsown_capable() as suggested by Bastian Blank.
Jan 11: Clarify (hopefully) some logic in futex and sched.c
Feb 15: use ns_capable for ipc, not nsown_capable
Feb 23: let copy_ipcs handle setting ipc_ns->user_ns
Feb 23: pass ns down rather than taking it from current
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Serge E. Hallyn
Acked-by: "Eric W. Biederman"
Acked-by: Daniel Lezcano
Acked-by: David Howells
Cc: James Morris
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
23 Mar, 2011
1 commit
-
Signed-off-by: Sergey Senozhatsky
Cc: Steven Rostedt
Cc: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
20 Mar, 2011
1 commit
-
Add missing function parameters for yield_to():
Warning(kernel/sched.c:5470): No description found for parameter 'p'
Warning(kernel/sched.c:5470): No description found for parameter 'preempt'
Signed-off-by: Randy Dunlap
Cc: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
17 Mar, 2011
2 commits
-
* 'config' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl:
BKL: That's all, folks
fs/locks.c: Remove stale FIXME left over from BKL conversion
ipx: remove the BKL
appletalk: remove the BKL
x25: remove the BKL
ufs: remove the BKL
hpfs: remove the BKL
drivers: remove extraneous includes of smp_lock.h
tracing: don't trace the BKL
adfs: remove the big kernel lock
-
Fix kernel-doc warning for runqueue_is_locked():
Warning(kernel/sched.c:664): missing initial short description on line:
Signed-off-by: Randy Dunlap
Cc: Ingo Molnar
Signed-off-by: Linus Torvalds
16 Mar, 2011
4 commits
-
Fix kernel-doc warning for runqueue_is_locked():
Warning(kernel/sched.c:664): missing initial short description
Signed-off-by: Randy Dunlap
Cc: Linus Torvalds
LKML-Reference:
Signed-off-by: Ingo Molnar
-
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (116 commits)
x86: Enable forced interrupt threading support
x86: Mark low level interrupts IRQF_NO_THREAD
x86: Use generic show_interrupts
x86: ioapic: Avoid redundant lookup of irq_cfg
x86: ioapic: Use new move_irq functions
x86: Use the proper accessors in fixup_irqs()
x86: ioapic: Use irq_data->state
x86: ioapic: Simplify irq chip and handler setup
x86: Cleanup the genirq name space
genirq: Add chip flag to force mask on suspend
genirq: Add desc->irq_data accessor
genirq: Add comments to Kconfig switches
genirq: Fixup fasteoi handler for oneshot mode
genirq: Provide forced interrupt threading
sched: Switch wait_task_inactive to schedule_hrtimeout()
genirq: Add IRQF_NO_THREAD
genirq: Allow shared oneshot interrupts
genirq: Prepare the handling of shared oneshot interrupts
genirq: Make warning in handle_percpu_event useful
x86: ioapic: Move trigger defines to io_apic.h
...
Fix up trivial(?) conflicts in arch/x86/pci/xen.c due to genirq name
space changes clashing with the Xen cleanups. The set_irq_msi() had
moved to xen_bind_pirq_msi_to_irq().
-
…/git/tip/linux-2.6-tip
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (26 commits)
sched: Resched proper CPU on yield_to()
sched: Allow users with sufficient RLIMIT_NICE to change from SCHED_IDLE policy
sched: Allow SCHED_BATCH to preempt SCHED_IDLE tasks
sched: Clean up the IRQ_TIME_ACCOUNTING code
sched: Add #ifdef around irq time accounting functions
sched, autogroup: Stop claiming ownership of the root task group
sched, autogroup: Stop going ahead if autogroup is disabled
sched, autogroup, sysctl: Use proc_dointvec_minmax() instead
sched: Fix the group_imb logic
sched: Clean up some f_b_g() comments
sched: Clean up remnants of sd_idle
sched: Wholesale removal of sd_idle logic
sched: Add yield_to(task, preempt) functionality
sched: Use a buddy to implement yield_task_fair()
sched: Limit the scope of clear_buddies
sched: Check the right ->nr_running in yield_task_fair()
sched: Avoid expensive initial update_cfs_load(), on UP too
sched: Fix switch_from_fair()
sched: Simplify the idle scheduling class
softirqs: Account ksoftirqd time as cpustat softirq
...
-
…git/tip/linux-2.6-tip
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (184 commits)
perf probe: Clean up probe_point_lazy_walker() return value
tracing: Fix irqoff selftest expanding max buffer
tracing: Align 4 byte ints together in struct tracer
tracing: Export trace_set_clr_event()
tracing: Explain about unstable clock on resume with ring buffer warning
ftrace/graph: Trace function entry before updating index
ftrace: Add .ref.text as one of the safe areas to trace
tracing: Adjust conditional expression latency formatting.
tracing: Fix event alignment: skb:kfree_skb
tracing: Fix event alignment: mce:mce_record
tracing: Fix event alignment: kvm:kvm_hv_hypercall
tracing: Fix event alignment: module:module_request
tracing: Fix event alignment: ftrace:context_switch and ftrace:wakeup
tracing: Remove lock_depth from event entry
perf header: Stop using 'self'
perf session: Use evlist/evsel for managing perf.data attributes
perf top: Don't let events to eat up whole header line
perf top: Fix events overflow in top command
ring-buffer: Remove unused #include <linux/trace_irq.h>
tracing: Add an 'overwrite' trace_option.
...
11 Mar, 2011
1 commit
-
Although they run as rpciod background tasks, under normal operation
(i.e. no SIGKILL), functions like nfs_sillyrename(), nfs4_proc_unlck()
and nfs4_do_close() want to be fully synchronous. This means that when we
exit, we want all references to the rpc_task to be gone, and we want
any dentry references etc. held by that task to be released.
For this reason these functions call __rpc_wait_for_completion_task(),
followed by rpc_put_task() in the expectation that the latter will be
releasing the last reference to the rpc_task, and thus ensuring that the
callback_ops->rpc_release() has been called synchronously.
This patch fixes a race which exists due to the fact that
rpciod calls rpc_complete_task() (in order to wake up the callers of
__rpc_wait_for_completion_task()) and then subsequently calls
rpc_put_task() without ensuring that these two steps are done atomically.
In order to avoid adding new spin locks, the patch uses the existing
waitqueue spin lock to order the rpc_task reference count releases between
the waiting process and rpciod.
The common case where nobody is waiting for completion is optimised for by
checking if the RPC_TASK_ASYNC flag is cleared and/or if the rpc_task
reference count is 1: in those cases we drop trying to grab the spin lock,
and immediately free up the rpc_task.
Those few processes that need to put the rpc_task from inside an
asynchronous context and that do not care about ordering are given a new
helper: rpc_put_task_async().
Signed-off-by: Trond Myklebust
10 Mar, 2011
1 commit
-
This patch adds support for creating a queuing context outside
of the queue itself. This enables us to batch up pieces of IO
before grabbing the block device queue lock and submitting them to
the IO scheduler.
The context is created on the stack of the process and assigned in
the task structure, so that we can auto-unplug it if we hit a schedule
event.
The current queue plugging happens implicitly if IO is submitted to
an empty device, yet callers have to remember to unplug that IO when
they are going to wait for it. This is an ugly API and has caused bugs
in the past. Additionally, it requires hacks in the vm (->sync_page()
callback) to handle that logic. By switching to an explicit plugging
scheme we make the API a lot nicer and can get rid of the ->sync_page()
hack in the vm.
Signed-off-by: Jens Axboe
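Usage pattern of the new on-stack plug (blk_start_plug()/blk_finish_plug()
are the interface this introduces; the submission loop is illustrative):

    struct blk_plug plug;
    int i;

    blk_start_plug(&plug);              /* hooked into current->plug */
    for (i = 0; i < nr_bios; i++)
        submit_bio(READ, bios[i]);      /* batched in the plug list */
    blk_finish_plug(&plug);             /* hand the batch to the IO scheduler */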
05 Mar, 2011
1 commit
-
This removes the implementation of the big kernel lock,
at last. A lot of people have worked on this in the
past, so the credit for this patch should go to
everyone who participated in the hunt.
The names on the Cc list are the people that were the
most active in this, according to the recorded git
history, in alphabetical order.
Signed-off-by: Arnd Bergmann
Acked-by: Alan Cox
Cc: Alessio Igor Bogani
Cc: Al Viro
Cc: Andrew Hendry
Cc: Andrew Morton
Cc: Christoph Hellwig
Cc: Eric W. Biederman
Cc: Frederic Weisbecker
Cc: Hans Verkuil
Acked-by: Ingo Molnar
Cc: Jan Blunck
Cc: John Kacur
Cc: Jonathan Corbet
Cc: Linus Torvalds
Cc: Matthew Wilcox
Cc: Oliver Neukum
Cc: Paul Menage
Acked-by: Thomas Gleixner
Cc: Trond Myklebust
04 Mar, 2011
2 commits
-
yield_to_task_fair() has code to resched the CPU of the yielding task when
the intention is to resched the CPU of the task that is being yielded to.
The change here fixes the problem and also makes the resched conditional on
rq != p_rq.
Signed-off-by: Venkatesh Pallipadi
Reviewed-by: Rik van Riel
Signed-off-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar -
The current scheduler implementation returns -EPERM when trying to
change from SCHED_IDLE to SCHED_OTHER or SCHED_BATCH. Since SCHED_IDLE
is considered to be a nice 20 on steroids, changing to another policy
should be allowed provided the RLIMIT_NICE is accounted for.
This patch allows the following test-case to pass with RLIMIT_NICE=40,
but still fail with RLIMIT_NICE=10 when the calling process is run
from a typical shell (nice 0, or 20 in rlimit terms).

    int main()
    {
        int ret;
        struct sched_param sp;
        sp.sched_priority = 0;

        /* switch to SCHED_IDLE */
        ret = sched_setscheduler(0, SCHED_IDLE, &sp);
        printf("setscheduler IDLE: %d\n", ret);
        if (ret) return ret;

        /* switch back to SCHED_OTHER */
        ret = sched_setscheduler(0, SCHED_OTHER, &sp);
        printf("setscheduler OTHER: %d\n", ret);
        return ret;
    }

    $ ulimit -e
    40
    $ ./test
    setscheduler IDLE: 0
    setscheduler OTHER: 0

    $ ulimit -e 10
    $ ulimit -e
    10
    $ ./test
    setscheduler IDLE: 0
    setscheduler OTHER: -1

Signed-off-by: Darren Hart
Signed-off-by: Peter Zijlstra
Cc: Richard Purdie
LKML-Reference:
Signed-off-by: Ingo Molnar
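Sketch of the corresponding permission check (assumed shape): leaving
SCHED_IDLE is allowed when RLIMIT_NICE covers the task's current nice level:

    if (p->policy == SCHED_IDLE && policy != SCHED_IDLE) {
        if (!can_nice(p, TASK_NICE(p)))
            return -EPERM;
    }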
26 Feb, 2011
2 commits
-
Fix this warning:
lkml.org/lkml/2011/1/30/124
kernel/sched.c:3719: warning: 'irqtime_account_idle_ticks' defined but not used
kernel/sched.c:3720: warning: 'irqtime_account_process_tick' defined but not used
In a cleaner way than:
7e9498705e81: sched: Add #ifdef around irq time accounting functions
This patch will not have any functional impact.
Signed-off-by: Venkatesh Pallipadi
Cc: heiko.carstens@de.ibm.com
Cc: a.p.zijlstra@chello.nl
LKML-Reference:
Signed-off-by: Ingo Molnar
-
When we force thread hard and soft interrupts the startup of ksoftirqd
would hang in kthread_bind() when wait_task_inactive() calls
schedule_timeout_uninterruptible() because there is no softirq yet
which will wake us up.
Signed-off-by: Thomas Gleixner
Cc: Peter Zijlstra
LKML-Reference:
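Sketch of the fix in wait_task_inactive() (close to the described change;
details illustrative): sleep via an hrtimer rather than the timer-wheel
softirq, which cannot run yet at that point in boot:

    if (unlikely(on_rq)) {
        ktime_t to = ktime_set(0, NSEC_PER_SEC / HZ);

        set_current_state(TASK_UNINTERRUPTIBLE);
        schedule_hrtimeout(&to, HRTIMER_MODE_REL);
    }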
25 Feb, 2011
1 commit
-
Get rid of this:
kernel/sched.c:3731:13: warning: 'irqtime_account_idle_ticks' defined but not used
kernel/sched.c:3732:13: warning: 'irqtime_account_process_tick' defined but not used
Signed-off-by: Heiko Carstens
Cc: Venkatesh Pallipadi
Cc: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
16 Feb, 2011
1 commit
-
Make the ::exit method act like ::attach, it is after all very nearly
the same thing.
The bug had no effect on correctness - fixing it is an optimization for
the scheduler. Also, later perf-cgroups patches rely on it.
Signed-off-by: Peter Zijlstra
Acked-by: Paul Menage
LKML-Reference:
Signed-off-by: Ingo Molnar
14 Feb, 2011
1 commit
-
…rostedt/linux-2.6-trace into perf/core
12 Feb, 2011
1 commit
-
When the function graph tracer starts, it needs to make a special
stack for each task to save the real return values of the tasks.
All running tasks have this stack created, as well as any new
tasks.
On CPU hot plug, the new idle task will allocate a stack as well
when init_idle() is called. The problem is that cpu hotplug does
not create a new idle_task. Instead it uses the idle task that
existed when the cpu went down.
ftrace_graph_init_task() will add a new ret_stack to the task
that is given to it. Because a clone will make the task
have a stack of its parent it does not check if the task's
ret_stack is already NULL or not. When the CPU hotplug code
starts a CPU up again, it will allocate a new stack even
though one already existed for it.
The solution is to treat the idle_task specially. In fact, the
function_graph code already does, just not at init_idle().
Instead of using ftrace_graph_init_task() for the idle task,
which expects the task to be a clone, have a
separate ftrace_graph_init_idle_task(). Also, we will create a
per_cpu ret_stack that is used by the idle task. When we call
ftrace_graph_init_idle_task() it will check if the idle task's
ret_stack is NULL, if it is, then it will assign it the per_cpu
ret_stack.
Reported-by: Benjamin Herrenschmidt
Suggested-by: Peter Zijlstra
Cc: Stable Tree
Signed-off-by: Steven Rostedt
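A sketch of the special-cased init (graph_init_task() is a hypothetical
helper name here; the per-cpu reuse is the point):

    static DEFINE_PER_CPU(struct ftrace_ret_stack *, idle_ret_stack);

    void ftrace_graph_init_idle_task(struct task_struct *t, int cpu)
    {
        if (t->ret_stack)
            return;    /* reused idle task: stack already in place */

        if (!per_cpu(idle_ret_stack, cpu))
            per_cpu(idle_ret_stack, cpu) =
                kmalloc(FTRACE_RETFUNC_DEPTH *
                        sizeof(struct ftrace_ret_stack),
                        GFP_KERNEL);

        graph_init_task(t, per_cpu(idle_ret_stack, cpu));
    }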
03 Feb, 2011
3 commits
-
Currently only implemented for fair class tasks.
Add a yield_to_task method to the fair scheduling class, allowing the
caller of yield_to() to accelerate another thread in its thread group /
task group.
Implemented via a scheduler hint, using cfs_rq->next to encourage the
target being selected. We can rely on pick_next_entity to keep things
fair, so no one can accelerate a thread that has already used its fair
share of CPU time.
This also means callers should only call yield_to when they really
mean it. Calling it too often can result in the scheduler just
ignoring the hint.
Signed-off-by: Rik van Riel
Signed-off-by: Marcelo Tosatti
Signed-off-by: Mike Galbraith
Signed-off-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
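Sketch of the fair-class hook (close to the patch as described; treat
details as illustrative):

    static bool yield_to_task_fair(struct rq *rq, struct task_struct *p,
                                   bool preempt)
    {
        struct sched_entity *se = &p->se;

        if (!se->on_rq)
            return false;

        /*
         * A hint, not a guarantee: pick_next_entity() still enforces
         * fairness, so a task past its fair share won't be picked.
         */
        set_next_buddy(se);
        yield_task_fair(rq);

        return true;
    }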
-
Use the buddy mechanism to implement yield_task_fair. This
allows us to skip onto the next highest priority se at every
level in the CFS tree, unless doing so would introduce gross
unfairness in CPU time distribution.
We order the buddy selection in pick_next_entity to check
yield first, then last, then next. We need next to be able
to override yield, because it is possible for the "next" and
"yield" task to be different processen in the same sub-tree
of the CFS tree. When they are, we need to go into that
sub-tree regardless of the "yield" hint, and pick the correct
entity once we get to the right level.
Signed-off-by: Rik van Riel
Signed-off-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar -
Oleg reported that on architectures with
__ARCH_WANT_INTERRUPTS_ON_CTXSW the IPI from
task_oncpu_function_call() can land before perf_event_task_sched_in()
and cause interesting situations for eg. perf_install_in_context().
This patch reworks the task_oncpu_function_call() interface to give a
more usable primitive as well as rework all its users to hopefully be
more obvious as well as remove the races.
While looking at the code I also found a number of races against
perf_event_task_sched_out() which can flip contexts between tasks so
plug those too.
Reported-and-reviewed-by: Oleg Nesterov
Signed-off-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
27 Jan, 2011
1 commit
-
Fix the build on UP.
Signed-off-by: Peter Zijlstra
Cc: Paul Turner
LKML-Reference:
Signed-off-by: Ingo Molnar