08 May, 2011
1 commit
-
…/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf tools: Makefile: Use gcc to determine ARCH
perf events, x86: Fix Intel Nehalem and Westmere last level cache event definitions
hw_breakpoints, powerpc: Fix CONFIG_HAVE_HW_BREAKPOINT off-case in ptrace_set_debugreg()
sh, hw_breakpoints: Fix racy access to ptrace breakpoints
arm, hw_breakpoints: Fix racy access to ptrace breakpoints
powerpc, hw_breakpoints: Fix racy access to ptrace breakpoints
x86, hw_breakpoints: Fix racy access to ptrace breakpoints
ptrace: Prepare to fix racy accesses on task breakpoints
07 May, 2011
1 commit
-
This partially reverts commit e6e1e2593592a8f6f6380496655d8c6f67431266.
That commit changed the structure layout of the trace structure, which
in turn broke PowerTOP (1.9x generation) quite badly.I appreciate not wanting to expose the variable in question, and
PowerTOP was not using it, so I've replaced the variable with just a
padding field - that way if in the future a new field is needed it can
just use this padding field.Signed-off-by: Arjan van de Ven
Signed-off-by: Linus Torvalds
06 May, 2011
1 commit
-
…ds/linux-2.6 into perf/urgent
05 May, 2011
1 commit
-
…eric/random-tracing into perf/urgent
03 May, 2011
1 commit
-
commit ab7798ffcf98b11a9525cf65bacdae3fd58d357f ("genirq: Expand generic
show_interrupts()") added the Kconfig option GENERIC_IRQ_SHOW_LEVEL to
accomodate PowerPC, but this doesn't actually enable the functionality due
to a typo in the #ifdef check.Signed-off-by: Geert Uytterhoeven
Cc: Linux/PPC Development
Link: http://lkml.kernel.org/r/%3Calpine.DEB.2.00.1104302251370.19068%40ayla.of.borg%3E
Signed-off-by: Thomas Gleixner
01 May, 2011
1 commit
-
* 'fixes-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: fix deadlock in worker_maybe_bind_and_lock()
workqueue: Document debugging tricksFix up trivial spelling conflict in kernel/workqueue.c
30 Apr, 2011
3 commits
-
…/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf, x86, nmi: Move LVT un-masking into irq handlers
perf events, x86: Work around the Nehalem AAJ80 erratum
perf, x86: Fix BTS condition
ftrace: Build without frame pointers on Microblaze -
…l/git/tip/linux-2.6-tip
* 'timer-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
hrtimer: Initialize CLOCK_ID to HRTIMER_BASE table statically
rtc: max8925: Call dev_set_drvdata before rtc_device_register -
If a rescuer and stop_machine() bringing down a CPU race with each
other, they may deadlock on non-preemptive kernel. The CPU won't
accept a new task, so the rescuer can't migrate to the target CPU,
while stop_machine() can't proceed because the rescuer is holding one
of the CPU retrying migration. GCWQ_DISASSOCIATED is never cleared
and worker_maybe_bind_and_lock() retries indefinitely.This problem can be reproduced semi reliably while the system is
entering suspend.http://thread.gmane.org/gmane.linux.kernel/1122051
A lot of kudos to Thilo-Alexander for reporting this tricky issue and
painstaking testing.stable: This affects all kernels with cmwq, so all kernels since and
including v2.6.36 need this fix.Signed-off-by: Tejun Heo
Reported-by: Thilo-Alexander Ginkel
Tested-by: Thilo-Alexander Ginkel
Cc: stable@kernel.org
29 Apr, 2011
2 commits
-
Sedat and Bruno reported RCU stalls which turned out to be caused by
the following;sched_init() calls init_rt_bandwidth() which calls hrtimer_init()
_BEFORE_ hrtimers_init() is called. While not entirely correct this
worked because hrtimer_init() only accessed statically initialized
data (hrtimer_bases.clock_base[CLOCK_MONOTONIC])Commit e06383db9 (hrtimers: extend hrtimer base code to handle more
then 2 clockids) added an indirection to the hrtimer_bases.clock_base
lookup to avoid gap handling in the hot path. The table which is used
for the translataion from CLOCK_ID to HRTIMER_BASE index is
initialized at runtime in hrtimers_init(). So the early call of the
scheduler code translates CLOCK_MONOTONIC to HRTIMER_BASE_REALTIME.Thus the rt_bandwith timer ends up on CLOCK_REALTIME. If the timer is
armed and the wall clock time is set (e.g. ntpdate in the early boot
process - which also gives the problem deterministic behaviour
i.e. magic recovery after N hours), then the timer ends up with an
expiry time far into the future. That breaks the RT throttler
mechanism as rt runtime is accumulated and never cleared, so the rt
throttler detects a false cpu hog condition and blocks all RT tasks
until the timer finally expires. That in turn stalls the RCU thread of
TINYRCU which leads to an huge amount of RCU callbacks piling up.Make the translation table statically initialized, so we are back to
the status of
Reported-by: Bruno Prémont
Cc: John stultz
Cc: Mike Galbraith
Cc: Paul E. McKenney
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/%3Calpine.LFD.2.02.1104282353140.3005%40ionos%3E
Reviewed-by: Ingo Molnar
Signed-off-by: Thomas Gleixner -
In corner cases where softlockup watchdog is not setup successfully, the
relevant nmi perf event for hardlockup watchdog could be disabled, then
the status of the underlying hardware remains unchanged.Also, if the kthread doesn't start then the hrtimer won't run and the
hardlockup detector will falsely fire.Signed-off-by: Hillf Danton
Signed-off-by: Don Zickus
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
27 Apr, 2011
1 commit
-
…rostedt/linux-2.6-trace into perf/urgent
25 Apr, 2011
1 commit
-
When a task is traced and is in a stopped state, the tracer
may execute a ptrace request to examine the tracee state and
get its task struct. Right after, the tracee can be killed
and thus its breakpoints released.
This can happen concurrently when the tracer is in the middle
of reading or modifying these breakpoints, leading to dereferencing
a freed pointer.Hence, to prepare the fix, create a generic breakpoint reference
holding API. When a reference on the breakpoints of a task is
held, the breakpoints won't be released until the last reference
is dropped. After that, no more ptrace request on the task's
breakpoints can be serviced for the tracer.Reported-by: Oleg Nesterov
Signed-off-by: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Will Deacon
Cc: Prasad
Cc: Paul Mundt
Cc: v2.6.33..
Link: http://lkml.kernel.org/r/1302284067-7860-2-git-send-email-fweisbec@gmail.com
24 Apr, 2011
1 commit
-
* 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
PM: Add missing syscore_suspend() and syscore_resume() calls
PM: Fix error code paths executed after failing syscore_suspend()
21 Apr, 2011
1 commit
-
Microblaze doesn't need/support FRAME_POINTERS in order to have a working
function tracer.The patch remove Kconfig warning.
Warning log:
warning: (LOCKDEP && FAULT_INJECTION_STACKTRACE_FILTER && LATENCYTOP &&
FUNCTION_TRACER && KMEMCHECK) selects FRAME_POINTER which has unmet direct
dependencies (DEBUG_KERNEL && (CRIS || M68K || FRV || UML || AVR32 ||
SUPERH || BLACKFIN || MN10300) || ARCH_WANT_FRAME_POINTERS)Signed-off-by: Michal Simek
Link: http://lkml.kernel.org/r/1301908812-8119-2-git-send-email-monstr@monstr.eu
CC: Frederic Weisbecker
CC: Ingo Molnar
Signed-off-by: Steven Rostedt
20 Apr, 2011
2 commits
-
Device suspend/resume infrastructure is used not only by the suspend
and hibernate code in kernel/power, but also by APM, Xen and the
kexec jump feature. However, commit 40dc166cb5dddbd36aa4ad11c03915ea
(PM / Core: Introduce struct syscore_ops for core subsystems PM)
failed to add syscore_suspend() and syscore_resume() calls to that
code, which generally leads to breakage when the features in question
are used.To fix this problem, add the missing syscore_suspend() and
syscore_resume() calls to arch/x86/kernel/apm_32.c, kernel/kexec.c
and drivers/xen/manage.c.Signed-off-by: Rafael J. Wysocki
Acked-by: Greg Kroah-Hartman
Acked-by: Ian Campbell -
…l/git/tip/linux-2.6-tip
* 'timer-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
RTC: rtc-omap: Fix a leak of the IRQ during init failure
posix clocks: Replace mutex with reader/writer semaphore
19 Apr, 2011
2 commits
-
If syscore_suspend() fails in suspend_enter(), create_image() or
resume_target_kernel(), it is necessary to call sysdev_resume(),
because sysdev_suspend() has been called already and succeeded
and we are going to abort the transition.Signed-off-by: Rafael J. Wysocki
Acked-by: Greg Kroah-Hartman -
next_pidmap() just quietly accepted whatever 'last' pid that was passed
in, which is not all that safe when one of the users is /proc.Admittedly the proc code should do some sanity checking on the range
(and that will be the next commit), but that doesn't mean that the
helper functions should just do that pidmap pointer arithmetic without
checking the range of its arguments.So clamp 'last' to PID_MAX_LIMIT. The fact that we then do "last+1"
doesn't really matter, the for-loop does check against the end of the
pidmap array properly (it's only the actual pointer arithmetic overflow
case we need to worry about, and going one bit beyond isn't going to
overflow).[ Use PID_MAX_LIMIT rather than pid_max as per Eric Biederman ]
Reported-by: Tavis Ormandy
Analyzed-by: Robert Święcki
Cc: Eric W. Biederman
Cc: Pavel Emelyanov
Signed-off-by: Linus Torvalds
18 Apr, 2011
1 commit
-
A dynamic posix clock is protected from asynchronous removal by a mutex.
However, using a mutex has the unwanted effect that a long running clock
operation in one process will unnecessarily block other processes.For example, one process might call read() to get an external time stamp
coming in at one pulse per second. A second process calling clock_gettime
would have to wait for almost a whole second.This patch fixes the issue by using a reader/writer semaphore instead of
a mutex.Signed-off-by: Richard Cochran
Cc: John Stultz
Link: http://lkml.kernel.org/r/%3C20110330132421.GA31771%40riccoc20.at.omicron.at%3E
Signed-off-by: Thomas Gleixner
17 Apr, 2011
2 commits
-
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
block: make unplug timer trace event correspond to the schedule() unplug
block: let io_schedule() flush the plug inline -
…linus', 'timer-fixes-for-linus' and 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
futex: Set FLAGS_HAS_TIMEOUT during futex_wait restart setup* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf_event: Fix cgrp event scheduling bug in perf_enable_on_exec()
perf: Fix a build error with some GCC versions* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: Fix erroneous all_pinned logic
sched: Fix sched-domain avg_load calculation* 'timer-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
RTC: rtc-mrst: follow on to the change of rtc_device_register()
RTC: add missing "return 0" in new alarm func for rtc-bfin.c
RTC: Fix s3c compile error due to missing s3c_rtc_setpie
RTC: Fix early irqs caused by calling rtc_set_alarm too early* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, amd: Disable GartTlbWlkErr when BIOS forgets it
x86, NUMA: Fix fakenuma boot failure
x86/mrst: Fix boot crash caused by incorrect pin to irq mapping
x86/ce4100: Add reg property to bridges
16 Apr, 2011
2 commits
-
It's a pretty close match to what we had before - the timer triggering
would mean that nobody unplugged the plug in due time, in the new
scheme this matches very closely what the schedule() unplug now is.
It's essentially the difference between an explicit unplug (IO unplug)
or an implicit unplug (timer unplug, we scheduled with pending IO
queued).Signed-off-by: Jens Axboe
-
Linus correctly observes that the most important dispatch cases
are now done from kblockd, this isn't ideal for latency reasons.
The original reason for switching dispatches out-of-line was to
avoid too deep a stack, so by _only_ letting the "accidental"
flush directly in schedule() be guarded by offload to kblockd,
we should be able to get the best of both worlds.So add a blk_schedule_flush_plug() that offloads to kblockd,
and only use that from the schedule() path.Signed-off-by: Jens Axboe
15 Apr, 2011
2 commits
-
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
block: only force kblockd unplugging from the schedule() path
block: cleanup the block plug helper functions
block, blk-sysfs: Use the variable directly instead of a function call
block: move queue run on unplug to kblockd
block: kill queue_sync_plugs()
block: readd plug trace event
block: add callback function for unplug notification
block: add comment on why we save and disable interrupts in flush_plug_list()
block: fixup block IO unplug trace call
block: remove block_unplug_timer() trace point
block: splice plug list to local context -
The FLAGS_HAS_TIMEOUT flag was not getting set, causing the restart_block to
restart futex_wait() without a timeout after a signal.Commit b41277dc7a18ee332d in 2.6.38 introduced the regression by accidentally
removing the the FLAGS_HAS_TIMEOUT assignment from futex_wait() during the setup
of the restart block. Restore the originaly behavior.Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=32922
Reported-by: Tim Smith
Reported-by: Torsten Hilbrich
Signed-off-by: Darren Hart
Signed-off-by: Eric Dumazet
Cc: Peter Zijlstra
Cc: John Kacur
Cc: stable@kernel.org
Link: http://lkml.kernel.org/r/%3Cdaac0eb3af607f72b9a4d3126b2ba8fb5ed3b883.1302820917.git.dvhart%40linux.intel.com%3E
Signed-off-by: Thomas Gleixner
13 Apr, 2011
1 commit
-
We really only want to unplug the pending IO when the process actually
goes to sleep. So move the test for flushing the plug up to the place
where we actually deactivate the task - where we have properly checked
for preemption and for the process really sleeping.Acked-by: Jens Axboe
Acked-by: Peter Zijlstra
Signed-off-by: Linus Torvalds
12 Apr, 2011
4 commits
-
It was removed with the on-stack plugging, readd it and track the
depth of requests added when flushing the plug.Signed-off-by: Jens Axboe
-
We no longer have an unplug timer running, so no point in keeping
the trace point.Signed-off-by: Jens Axboe
-
Make XEN_SAVE_RESTORE select HIBERNATE_CALLBACKS.
Remove XEN_SAVE_RESTORE dependency from PM_SLEEP.Signed-off-by: Shriram Rajagopalan
Acked-by: Ian Campbell
Signed-off-by: Rafael J. Wysocki -
Xen save/restore is going to use hibernate device callbacks for
quiescing devices and putting them back to normal operations and it
would need to select CONFIG_HIBERNATION for this purpose. However,
that also would cause the hibernate interfaces for user space to be
enabled, which might confuse user space, because the Xen kernels
don't support hibernation. Moreover, it would be wasteful, as it
would make the Xen kernels include a substantial amount of code that
they would never use.To address this issue introduce new power management Kconfig option
CONFIG_HIBERNATE_CALLBACKS, such that it will only select the code
that is necessary for the hibernate device callbacks to work and make
CONFIG_HIBERNATION select it. Then, Xen save/restore will be able to
select CONFIG_HIBERNATE_CALLBACKS without dragging the entire
hibernate code along with it.Signed-off-by: Rafael J. Wysocki
Tested-by: Shriram Rajagopalan
11 Apr, 2011
3 commits
-
The scheduler load balancer has specific code to deal with cases of
unbalanced system due to lots of unmovable tasks (for example because of
hard CPU affinity). In those situation, it excludes the busiest CPU that
has pinned tasks for load balance consideration such that it can perform
second 2nd load balance pass on the rest of the system.This all works as designed if there is only one cgroup in the system.
However, when we have multiple cgroups, this logic has false positives and
triggers multiple load balance passes despite there are actually no pinned
tasks at all.The reason it has false positives is that the all pinned logic is deep in
the lowest function of can_migrate_task() and is too low level:load_balance_fair() iterates each task group and calls balance_tasks() to
migrate target load. Along the way, balance_tasks() will also set a
all_pinned variable. Given that task-groups are iterated, this all_pinned
variable is essentially the status of last group in the scanning process.
Task group can have number of reasons that no load being migrated, none
due to cpu affinity. However, this status bit is being propagated back up
to the higher level load_balance(), which incorrectly think that no tasks
were moved. It kick off the all pinned logic and start multiple passes
attempt to move load onto puller CPU.To fix this, move the all_pinned aggregation up at the iterator level.
This ensures that the status is aggregated over all task-groups, not just
last one in the list.Signed-off-by: Ken Chen
Cc: stable@kernel.org
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/BANLkTi=ernzNawaR5tJZEsV_QVnfxqXmsQ@mail.gmail.com
Signed-off-by: Ingo Molnar -
In function find_busiest_group(), the sched-domain avg_load isn't
calculated at all if there is a group imbalance within the domain. This
will cause erroneous imbalance calculation.The reason is that calculate_imbalance() sees sds->avg_load = 0 and it
will dump entire sds->max_load into imbalance variable, which is used
later on to migrate entire load from busiest CPU to the puller CPU.This has two really bad effect:
1. stampede of task migration, and they won't be able to break out
of the bad state because of positive feedback loop: large load
delta -> heavier load migration -> larger imbalance and the cycle
goes on.2. severe imbalance in CPU queue depth. This causes really long
scheduling latency blip which affects badly on application that
has tight latency requirement.The fix is to have kernel calculate domain avg_load in both cases. This
will ensure that imbalance calculation is always sensible and the target
is usually half way between busiest and puller CPU.Signed-off-by: Ken Chen
Signed-off-by: Peter Zijlstra
Cc:
Link: http://lkml.kernel.org/r/20110408002322.3A0D812217F@elm.corp.google.com
Signed-off-by: Ingo Molnar -
There is a bug in perf_event_enable_on_exec() when cgroup events are
active on a CPU: the cgroup events may be scheduled twice causing event
state corruptions which eventually may lead to kernel panics.The reason is that the function needs to first schedule out the cgroup
events, just like for the per-thread events. The cgroup event are
scheduled back in automatically from the perf_event_context_sched_in()
function.The patch also adds a WARN_ON_ONCE() is perf_cgroup_switch() to catch any
bogus state.Signed-off-by: Stephane Eranian
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/20110406005454.GA1062@quad
Signed-off-by: Ingo Molnar
09 Apr, 2011
1 commit
-
Fix erroneous syscall kernel-doc comments in kernel/signal.c.
Reported-by: Matt Fleming
Signed-off-by: Randy Dunlap
Signed-off-by: Linus Torvalds
08 Apr, 2011
2 commits
-
…-linus', 'irq-fixes-for-linus' and 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86-32, fpu: Fix FPU exception handling on non-SSE systems
x86, hibernate: Initialize mmu_cr4_features during boot
x86-32, NUMA: Fix ACPI NUMA init broken by recent x86-64 change
x86: visws: Fixup irq overhaul fallout* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: Clean up rebalance_domains() load-balance interval calculation* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86/mrst/vrtc: Fix boot crash in mrst_rtc_init()
rtc, x86/mrst/vrtc: Fix boot crash in rtc_read_alarm()* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
genirq: Fix cpumask leak in __setup_irq()* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf probe: Fix listing incorrect line number with inline function
perf probe: Fix to find recursively inlined function
perf probe: Fix multiple --vars options behavior
perf probe: Fix to remove redundant close
perf probe: Fix to ensure function declared file -
* 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6:
Fix common misspellings
05 Apr, 2011
3 commits
-
Instead of the possible multiple-evaluation of num_online_cpus()
in rebalance_domains() that Linus reported, avoid it altogether
in the normal case since it's implemented with a Hamming weight
function over a cpu bitmask which can be darn expensive for those
with big iron.This also makes it cleaner, smaller and documents the code.
Reported-by: Linus Torvalds
Signed-off-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar -
Add kernel-doc to syscalls in signal.c.
Signed-off-by: Randy Dunlap
Signed-off-by: Linus Torvalds -
General coding style and comment fixes; no code changes:
- Use multi-line-comment coding style.
- Put some function signatures completely on one line.
- Hyphenate some words.
- Spell Posix as POSIX.
- Correct typos & spellos in some comments.
- Drop trailing whitespace.
- End sentences with periods.Signed-off-by: Randy Dunlap
Signed-off-by: Linus Torvalds