Eric Lee / smarc-fsl-linux-kernel

19 May, 2011

6 commits

de4d8d534 module: each_symbol_section instead of each_symbol

Instead of having a callback function for each symbol in the kernel,
have a callback for each array of symbols.

This eases the logic when we move to sorted symbols and binary search.

Signed-off-by: Rusty Russell
Signed-off-by: Alessio Igor Bogani

Rusty Russell
2011-05-19 15:25:26 +0800
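
A minimal user-space sketch of the idea in this commit: the walker calls back once per symbol array, so the callback can binary-search a sorted array instead of being invoked for every symbol. The types `struct sym`, `struct symsection`, the helper `each_section()` and the sample tables are hypothetical stand-ins, not the kernel's actual definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel's exported-symbol tables. */
struct sym { const char *name; unsigned long value; };
struct symsection { const struct sym *start; size_t count; };
struct find_arg { const char *name; const struct sym *found; };

static int cmp_name(const void *key, const void *elem)
{
    return strcmp(key, ((const struct sym *)elem)->name);
}

/* Per-section callback: one binary search over a sorted array. */
static int find_in_section(const struct symsection *sec, void *data)
{
    struct find_arg *arg = data;

    arg->found = bsearch(arg->name, sec->start, sec->count,
                         sizeof(*sec->start), cmp_name);
    return arg->found != NULL;          /* non-zero stops the walk */
}

/* Call fn once per array rather than once per symbol. */
static int each_section(const struct symsection *secs, size_t n,
                        int (*fn)(const struct symsection *, void *),
                        void *data)
{
    assert(fn != NULL);
    for (size_t i = 0; i < n; i++)
        if (fn(&secs[i], data))
            return 1;
    return 0;
}

/* Look a name up in two small sorted sample tables; 0 means not found. */
static unsigned long lookup_demo(const char *name)
{
    static const struct sym core[] = { { "alpha", 1 }, { "gamma", 3 } };
    static const struct sym gpl[]  = { { "beta", 2 }, { "delta", 4 } };
    static const struct symsection secs[] = { { core, 2 }, { gpl, 2 } };
    struct find_arg arg = { name, NULL };

    return each_section(secs, 2, find_in_section, &arg) ? arg.found->value : 0;
}
```

With a per-symbol callback the search logic would have to live in the walker; with a per-array callback the walker stays generic and each callback picks its own search strategy.
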
01526ed08 module: split unset_section_ro_nx function.

Split the unprotect function into a function per section to make
the code more readable and add the missing static declaration.

Signed-off-by: Jan Glauber
Signed-off-by: Rusty Russell

Jan Glauber
2011-05-19 15:25:26 +0800
448694a1d module: undo module RONX protection correctly.

While debugging I stumbled over two problems in the code that protects module
pages.

First issue is that disabling the protection before freeing init or unload of
a module is not symmetric with the enablement. For instance, if pages are set
to RO the page range from module_core to module_core + core_ro_size is
protected. If a module is unloaded the page range from module_core to
module_core + core_size is set back to RW.
So pages that were not set to RO are also changed to RW.
This is not critical but IMHO it should be symmetric.

Second issue is that while set_memory_rw & set_memory_ro are used for
RO/RW changes only set_memory_nx is involved for NX/X. One would expect that
the inverse function is called when the NX protection should be removed,
which is not the case here, unless I'm missing something.

Signed-off-by: Jan Glauber
Signed-off-by: Rusty Russell

Jan Glauber
2011-05-19 15:25:26 +0800
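
The first issue described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: pages are modelled as array slots, and `set_range_ro()` and `needless_changes()` are hypothetical helpers, not the kernel's set_memory_ro/set_memory_rw.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 8   /* hypothetical module size in pages */

static bool page_ro[NPAGES];
static int  changes[NPAGES];   /* how often each page's protection was changed */

/* Change the protection of pages [0, npages). */
static void set_range_ro(size_t npages, bool ro)
{
    assert(npages <= NPAGES);
    for (size_t i = 0; i < npages; i++) {
        page_ro[i] = ro;
        changes[i]++;
    }
}

/* Protect ro_pages, undo over undo_pages, and count the pages that were
 * touched by only one of the two calls. */
static int needless_changes(size_t ro_pages, size_t undo_pages)
{
    int n = 0;

    for (size_t i = 0; i < NPAGES; i++) {
        page_ro[i] = false;
        changes[i] = 0;
    }
    set_range_ro(ro_pages, true);     /* like protecting core .. core + ro_size */
    set_range_ro(undo_pages, false);  /* undo over a possibly larger range */
    for (size_t i = 0; i < NPAGES; i++)
        if (changes[i] == 1)
            n++;
    return n;
}
```

Undoing over the whole module (`needless_changes(3, 8)`) touches pages that were never protected, while undoing over the same range (`needless_changes(3, 3)`) touches none.
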
4d10380e7 module: zero mod->init_ro_size after init is freed.

Reset mod->init_ro_size to zero after the init part of a module is unloaded.
Otherwise we need to check if module->init is NULL in the unprotect functions
in the next patch.

Signed-off-by: Jan Glauber
Signed-off-by: Rusty Russell

Jan Glauber
2011-05-19 15:25:26 +0800
5d05c7084 minor ANSI prototype sparse fix

Fix function prototype to be ANSI-C compliant, consistent with other
function prototypes, addressing a sparse warning.

Signed-off-by: Daniel J Blueman
Signed-off-by: Rusty Russell

Daniel J Blueman
2011-05-19 15:25:25 +0800
b4bc84280 module: deal with alignment issues in built-in module versions

On m68k natural alignment is 2-byte boundary but we are trying to
align structures in __modver section on sizeof(void *) boundary.
This causes trouble when we try to access elements in this section
in array-like fashion when creating "version" attributes for built-in
modules.

Moreover, as DaveM said, we can't reliably put structures into
independent objects, put them into a special section, and then expect
array access over them (via the section boundaries) after linking the
objects together to just "work" due to variable alignment choices in
different situations. The only solution that seems to work reliably
is to make an array of plain pointers to the objects in question and
put those pointers in the special section.

Reported-by: Geert Uytterhoeven
Signed-off-by: Dmitry Torokhov
Signed-off-by: Rusty Russell

Dmitry Torokhov
2011-05-19 15:25:24 +0800
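
A minimal sketch of the "array of plain pointers" approach described above. In the kernel the pointers are collected in a special linker section; here an ordinary array stands in for that section, and `struct modver`, `modver_table` and `find_version()` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a per-module version record. */
struct modver { const char *module; const char *version; };

/* The records themselves may live anywhere, with any alignment or padding. */
static const struct modver ver_a = { "mod_a", "1.0" };
static const struct modver ver_b = { "mod_b", "2.3" };

/*
 * The table holds pointers, so every slot is sizeof(void *) wide no matter
 * how the records were laid out. Array-style iteration is therefore safe.
 */
static const struct modver *const modver_table[] = { &ver_a, &ver_b };

static const char *find_version(const char *module)
{
    size_t n = sizeof(modver_table) / sizeof(modver_table[0]);

    assert(module != NULL);
    for (size_t i = 0; i < n; i++)
        if (strcmp(modver_table[i]->module, module) == 0)
            return modver_table[i]->version;
    return NULL;
}
```

Iterating over the records directly would depend on every object file agreeing on the stride between them; iterating over pointers only depends on the size of a pointer.
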

17 May, 2011

2 commits

a085963a2 Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
tick: Clear broadcast active bit when switching to oneshot
rtc: mc13xxx: Don't call rtc_device_register while holding lock
rtc: rp5c01: Initialize drvdata before registering device
rtc: pcap: Initialize drvdata before registering device
rtc: msm6242: Initialize drvdata before registering device
rtc: max8998: Initialize drvdata before registering device
rtc: max8925: Initialize drvdata before registering device
rtc: m41t80: Initialize clientdata before registering device
rtc: ds1286: Initialize drvdata before registering device
rtc: ep93xx: Initialize drvdata before registering device
rtc: davinci: Initialize drvdata before registering device
rtc: mxc: Initialize drvdata before registering device
clocksource: Install completely before selecting

Linus Torvalds
2011-05-17 23:02:04 +0800
07f4beb0b tick: Clear broadcast active bit when switching to oneshot

The first cpu which switches from periodic to oneshot mode switches
also the broadcast device into oneshot mode. The broadcast device
serves as a backup for per cpu timers which stop in deeper
C-states. To avoid starvation of the cpus which might be in idle and
depend on broadcast mode it marks the other cpus as broadcast active
and sets the broadcast expiry value of those cpus to the next tick.

The oneshot mode broadcast bit for the other cpus is sticky and gets
only cleared when those cpus exit idle. If a cpu was not idle while
the bit got set, the bit consequently prevents the broadcast device
from being armed on behalf of that cpu when it enters idle for the
first time after it switched to oneshot mode.

In most cases that goes unnoticed as one of the other cpus has usually
a timer pending which keeps the broadcast device armed with a short
timeout. Now if the only cpu which has a short timer active has the
bit set then the broadcast device will not be armed on behalf of that
cpu and will fire way after the expected timer expiry. In the case of
Christian's bug report it took ~145 seconds which is about half of the
wrap around time of HPET (the limit for that device) due to the fact
that all other cpus had no timers armed which expired before the 145
seconds timeframe.

The solution is simply to clear the broadcast active bit
unconditionally when a cpu switches to oneshot mode after the first
cpu switched the broadcast device over. It's not idle at that point
otherwise it would not be executing that code.

[ I fundamentally hate that broadcast crap. Why the heck thought some
folks that when going into deep idle it's a brilliant concept to
switch off the last device which brings the cpu back from that
state? ]

Thanks to Christian for providing all the valuable debug information!

Reported-and-tested-by: Christian Hoffmann
Cc: John Stultz
Link: http://lkml.kernel.org/r/%3Calpine.LFD.2.02.1105161105170.3078%40ionos%3E
Cc: stable@kernel.org
Signed-off-by: Thomas Gleixner

Thomas Gleixner
2011-05-17 05:35:41 +0800
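
The sticky-bit problem and its fix can be modelled in a few lines. This is a simplified sketch, assuming a plain bitmask and a single "armed" flag; `bc_active`, `first_cpu_switch()`, `later_cpu_switch()` and `cpu_enter_idle()` are hypothetical names and do not reproduce the kernel's tick broadcast code.

```c
#include <assert.h>
#include <stdbool.h>

static unsigned int bc_active;  /* bit n set: cpu n counts as served by broadcast */
static bool bc_armed;           /* broadcast device armed on behalf of some cpu */

/* The first cpu to go oneshot marks every other cpu as broadcast active. */
static void first_cpu_switch(unsigned int self, unsigned int ncpus)
{
    assert(ncpus <= 32 && self < ncpus);
    for (unsigned int cpu = 0; cpu < ncpus; cpu++)
        if (cpu != self)
            bc_active |= 1u << cpu;
}

/* A later cpu switches to oneshot; the fix clears its bit unconditionally. */
static void later_cpu_switch(unsigned int cpu, bool with_fix)
{
    assert(cpu < 32);
    if (with_fix)
        bc_active &= ~(1u << cpu);
}

/* Idle entry arms the broadcast device only if the bit is not set yet. */
static void cpu_enter_idle(unsigned int cpu)
{
    assert(cpu < 32);
    if (bc_active & (1u << cpu))
        return;                 /* stale bit: nothing is armed for this cpu */
    bc_active |= 1u << cpu;
    bc_armed = true;
}

/* Two cpus: cpu 0 switches first, cpu 1 switches later and then idles. */
static bool armed_on_first_idle(bool with_fix)
{
    bc_active = 0;
    bc_armed = false;
    first_cpu_switch(0, 2);
    later_cpu_switch(1, with_fix);
    cpu_enter_idle(1);
    return bc_armed;
}
```

Without the fix, cpu 1 enters idle with its bit still set and the device is not armed for it; with the fix, the first idle entry arms the device as expected.
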

14 May, 2011

1 commit

47a150edc Cache user_ns in struct cred

If !CONFIG_USERNS, have current_user_ns() defined to (&init_user_ns).

Get rid of _current_user_ns. This requires nsown_capable() to be
defined in capability.c rather than as static inline in capability.h,
so do that.

Request_key needs init_user_ns defined at current_user_ns if
!CONFIG_USERNS, so forward-declare that in cred.h if !CONFIG_USERNS
at current_user_ns() define.

Compile-tested with and without CONFIG_USERNS.

Signed-off-by: Serge E. Hallyn
[ This makes a huge performance difference for acl_permission_check(),
up to 30%. And that is one of the hottest kernel functions for loads
that are pathname-lookup heavy. ]
Signed-off-by: Linus Torvalds

Serge E. Hallyn
2011-05-14 02:45:33 +0800

12 May, 2011

3 commits

36cb7035e PM / Hibernate: Fix ioctl SNAPSHOT_S2RAM

The SNAPSHOT_S2RAM ioctl used for implementing the feature allowing
one to suspend to RAM after creating a hibernation image is currently
broken, because it doesn't clear the "ready" flag in the struct
snapshot_data object handled by it. As a result, the
SNAPSHOT_UNFREEZE doesn't work correctly after SNAPSHOT_S2RAM has
returned and the user space hibernate task cannot thaw the other
processes as appropriate. Make SNAPSHOT_S2RAM clear data->ready
to fix this problem.

Tested-by: Alexandre Felipe Muller de Souza
Signed-off-by: Rafael J. Wysocki
Cc: stable@kernel.org

Rafael J. Wysocki
2011-05-12 03:10:58 +0800
9744997a8 PM / Hibernate: Make snapshot_release() restore GFP mask

If the process using the hibernate user space interface closes
/dev/snapshot after creating a hibernation image without thawing
tasks, snapshot_release() should call pm_restore_gfp_mask() to
restore the GFP mask used before the creation of the image. Make
that happen.

Tested-by: Alexandre Felipe Muller de Souza
Signed-off-by: Rafael J. Wysocki
Cc: stable@kernel.org

Rafael J. Wysocki
2011-05-12 03:10:43 +0800
87186475a PM: Fix warning in pm_restrict_gfp_mask() during SNAPSHOT_S2RAM ioctl

A warning is printed by pm_restrict_gfp_mask() while the
SNAPSHOT_S2RAM ioctl is being executed after creating a hibernation
image, because pm_restrict_gfp_mask() has been called once already
before the image creation and suspend_devices_and_enter() calls it
once again. This happens after commit 452aa6999e6703ffbddd7f6ea124d3
(mm/pm: force GFP_NOIO during suspend/hibernation and resume).

To avoid this issue, move pm_restrict_gfp_mask() and
pm_restore_gfp_mask() from suspend_devices_and_enter() to its caller
in kernel/power/suspend.c.

Reported-by: Alexandre Felipe Muller de Souza
Signed-off-by: Rafael J. Wysocki
Cc: stable@kernel.org

Rafael J. Wysocki
2011-05-12 03:10:14 +0800

08 May, 2011

1 commit

8b061610d Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf tools: Makefile: Use gcc to determine ARCH
perf events, x86: Fix Intel Nehalem and Westmere last level cache event definitions
hw_breakpoints, powerpc: Fix CONFIG_HAVE_HW_BREAKPOINT off-case in ptrace_set_debugreg()
sh, hw_breakpoints: Fix racy access to ptrace breakpoints
arm, hw_breakpoints: Fix racy access to ptrace breakpoints
powerpc, hw_breakpoints: Fix racy access to ptrace breakpoints
x86, hw_breakpoints: Fix racy access to ptrace breakpoints
ptrace: Prepare to fix racy accesses on task breakpoints

Linus Torvalds
2011-05-08 04:17:37 +0800

07 May, 2011

1 commit

a3a4a5acd Regression: partial revert "tracing: Remove lock_depth from event entry"

This partially reverts commit e6e1e2593592a8f6f6380496655d8c6f67431266.

That commit changed the structure layout of the trace structure, which
in turn broke PowerTOP (1.9x generation) quite badly.

I appreciate not wanting to expose the variable in question, and
PowerTOP was not using it, so I've replaced the variable with just a
padding field - that way if in the future a new field is needed it can
just use this padding field.

Signed-off-by: Arjan van de Ven
Signed-off-by: Linus Torvalds

Arjan van de Ven
2011-05-07 04:20:59 +0800
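
Why a padding field keeps existing consumers working can be shown with `offsetof`. The structs below are hypothetical illustrations (field names and sizes are assumptions, not the kernel's trace entry layout), and the compile-time check relies on C11 `static_assert`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event headers: original, with the field removed, with padding. */
struct hdr_old    { unsigned short type; unsigned char flags; unsigned char preempt; int pid; int lock_depth; };
struct hdr_shrunk { unsigned short type; unsigned char flags; unsigned char preempt; int pid; };
struct hdr_padded { unsigned short type; unsigned char flags; unsigned char preempt; int pid; int padding; };

/* An event is a header followed by its payload. */
struct ev_old    { struct hdr_old h;    int payload; };
struct ev_shrunk { struct hdr_shrunk h; int payload; };
struct ev_padded { struct hdr_padded h; int payload; };

/* Compile-time check: the padded layout keeps the payload where it was. */
static_assert(offsetof(struct ev_padded, payload) == offsetof(struct ev_old, payload),
              "padding must preserve the payload offset");

static size_t payload_offset_old(void)    { return offsetof(struct ev_old, payload); }
static size_t payload_offset_shrunk(void) { return offsetof(struct ev_shrunk, payload); }
static size_t payload_offset_padded(void) { return offsetof(struct ev_padded, payload); }
```

A tool that reads the payload at the old offset breaks when the header shrinks, but keeps working when the removed field is replaced by padding of the same size.
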

06 May, 2011

1 commit

4d70230bb Merge branch 'master' of ssh://master.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 into perf/urgent

Ingo Molnar
2011-05-06 14:11:28 +0800

05 May, 2011

2 commits

e05b2efb8 clocksource: Install completely before selecting

Christian Hoffmann reported that the command line clocksource override
with acpi_pm timer fails:

Kernel command line: clocksource=acpi_pm
hpet clockevent registered
Switching to clocksource hpet
Override clocksource acpi_pm is not HRT compatible.
Cannot switch while in HRT/NOHZ mode.

The watchdog code is what enables CLOCK_SOURCE_VALID_FOR_HRES, but we
actually end up selecting the clocksource before we enqueue it into
the watchdog list, so that's why we see the warning and fail to switch
to acpi_pm timer as requested. That's particularly bad when we want to
debug timekeeping related problems in early boot.

Put the selection call last.

Reported-by: Christian Hoffmann
Signed-off-by: John Stultz
Cc: stable@kernel.org # 32...
Link: http://lkml.kernel.org/r/%3C1304558210.2943.24.camel%40work-vm%3E
Signed-off-by: Thomas Gleixner

john stultz
2011-05-05 21:23:26 +0800
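
The ordering problem above can be sketched as follows. This is a heavily simplified model, assuming that enqueueing into the watchdog list is what sets the high-resolution flag; `struct src`, `SRC_VALID_FOR_HRES`, `enqueue_watchdog()`, `select_src()` and `register_src()` are hypothetical names, not the kernel's clocksource API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SRC_VALID_FOR_HRES 0x1u  /* hypothetical, modelled on CLOCK_SOURCE_VALID_FOR_HRES */

struct src { const char *name; unsigned int flags; };

static const struct src *current_src;
static const char *override_name;   /* e.g. taken from clocksource=... */

/* Stand-in for the watchdog enqueue, which marks the source as usable. */
static void enqueue_watchdog(struct src *s)
{
    s->flags |= SRC_VALID_FOR_HRES;
}

/* Selection honours the override only if the source is already marked valid. */
static void select_src(struct src *s)
{
    if (override_name && strcmp(s->name, override_name) == 0 &&
        (s->flags & SRC_VALID_FOR_HRES))
        current_src = s;
}

/* Register a source with selection either before or after the enqueue. */
static bool register_src(struct src *s, bool select_last)
{
    assert(s != NULL);
    current_src = NULL;
    if (select_last) {
        enqueue_watchdog(s);
        select_src(s);
    } else {
        select_src(s);
        enqueue_watchdog(s);
    }
    return current_src == s;
}

static bool override_honoured(bool select_last)
{
    struct src acpi_pm = { "acpi_pm", 0 };

    override_name = "acpi_pm";
    return register_src(&acpi_pm, select_last);
}
```

Selecting first sees a source that is not yet flagged and refuses the override; putting the selection call last lets the override succeed.
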
98bb31886 Merge branch 'perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/urgent

Ingo Molnar
2011-05-05 02:33:42 +0800

03 May, 2011

1 commit

94b2c363d genirq: Fix typo CONFIG_GENIRC_IRQ_SHOW_LEVEL

commit ab7798ffcf98b11a9525cf65bacdae3fd58d357f ("genirq: Expand generic
show_interrupts()") added the Kconfig option GENERIC_IRQ_SHOW_LEVEL to
accommodate PowerPC, but this doesn't actually enable the functionality due
to a typo in the #ifdef check.

Signed-off-by: Geert Uytterhoeven
Cc: Linux/PPC Development
Link: http://lkml.kernel.org/r/%3Calpine.DEB.2.00.1104302251370.19068%40ayla.of.borg%3E
Signed-off-by: Thomas Gleixner

Geert Uytterhoeven
2011-05-03 03:16:37 +0800
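
This class of bug is easy to reproduce: a misspelled macro in an `#ifdef` is simply treated as undefined, so the guarded code is dropped without any diagnostic. The sketch below assumes the option is defined by hand; `show_irq_buggy()` and `show_irq_fixed()` are hypothetical functions, not the kernel's show_interrupts().

```c
#include <assert.h>
#include <stdio.h>

#define CONFIG_GENERIC_IRQ_SHOW_LEVEL 1   /* as the build system would define it */

/* Misspelled guard: the macro below is never defined, so nothing is printed. */
static void show_irq_buggy(char *buf, size_t len, int level_triggered)
{
    assert(buf != NULL && len > 0);
    buf[0] = '\0';
#ifdef CONFIG_GENIRC_IRQ_SHOW_LEVEL
    snprintf(buf, len, "%s", level_triggered ? "Level" : "Edge");
#else
    (void)level_triggered;
#endif
}

/* Correct guard: the trigger type is written into buf. */
static void show_irq_fixed(char *buf, size_t len, int level_triggered)
{
    assert(buf != NULL && len > 0);
    buf[0] = '\0';
#ifdef CONFIG_GENERIC_IRQ_SHOW_LEVEL
    snprintf(buf, len, "%s", level_triggered ? "Level" : "Edge");
#else
    (void)level_triggered;
#endif
}
```

Both functions compile cleanly; only the second one produces output, which is why such typos tend to survive until someone notices the missing feature.
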

01 May, 2011

1 commit

3fd9952df Merge branch 'fixes-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq

* 'fixes-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: fix deadlock in worker_maybe_bind_and_lock()
workqueue: Document debugging tricks

Fix up trivial spelling conflict in kernel/workqueue.c

Linus Torvalds
2011-05-01 00:15:40 +0800

30 Apr, 2011

3 commits

40a963502 Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf, x86, nmi: Move LVT un-masking into irq handlers
perf events, x86: Work around the Nehalem AAJ80 erratum
perf, x86: Fix BTS condition
ftrace: Build without frame pointers on Microblaze

Linus Torvalds
2011-04-30 06:08:53 +0800
fcc4dc715 Merge branch 'timer-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'timer-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
hrtimer: Initialize CLOCK_ID to HRTIMER_BASE table statically
rtc: max8925: Call dev_set_drvdata before rtc_device_register

Linus Torvalds
2011-04-30 06:08:31 +0800
5035b20fa workqueue: fix deadlock in worker_maybe_bind_and_lock()

If a rescuer and stop_machine() bringing down a CPU race with each
other, they may deadlock on a non-preemptive kernel. The CPU won't
accept a new task, so the rescuer can't migrate to the target CPU,
while stop_machine() can't proceed because the rescuer is holding one
of the CPUs retrying migration. GCWQ_DISASSOCIATED is never cleared
and worker_maybe_bind_and_lock() retries indefinitely.

This problem can be reproduced semi reliably while the system is
entering suspend.

http://thread.gmane.org/gmane.linux.kernel/1122051

A lot of kudos to Thilo-Alexander for reporting this tricky issue and
painstaking testing.

stable: This affects all kernels with cmwq, so all kernels since and
including v2.6.36 need this fix.

Signed-off-by: Tejun Heo
Reported-by: Thilo-Alexander Ginkel
Tested-by: Thilo-Alexander Ginkel
Cc: stable@kernel.org

Tejun Heo
2011-04-30 00:08:37 +0800

29 Apr, 2011

2 commits

ce31332d3 hrtimer: Initialize CLOCK_ID to HRTIMER_BASE table statically

Sedat and Bruno reported RCU stalls which turned out to be caused by
the following:

sched_init() calls init_rt_bandwidth() which calls hrtimer_init()
_BEFORE_ hrtimers_init() is called. While not entirely correct this
worked because hrtimer_init() only accessed statically initialized
data (hrtimer_bases.clock_base[CLOCK_MONOTONIC])

Commit e06383db9 (hrtimers: extend hrtimer base code to handle more
then 2 clockids) added an indirection to the hrtimer_bases.clock_base
lookup to avoid gap handling in the hot path. The table which is used
for the translation from CLOCK_ID to HRTIMER_BASE index is
initialized at runtime in hrtimers_init(). So the early call of the
scheduler code translates CLOCK_MONOTONIC to HRTIMER_BASE_REALTIME.

Thus the rt_bandwidth timer ends up on CLOCK_REALTIME. If the timer is
armed and the wall clock time is set (e.g. ntpdate in the early boot
process - which also gives the problem deterministic behaviour
i.e. magic recovery after N hours), then the timer ends up with an
expiry time far into the future. That breaks the RT throttler
mechanism as rt runtime is accumulated and never cleared, so the rt
throttler detects a false cpu hog condition and blocks all RT tasks
until the timer finally expires. That in turn stalls the RCU thread of
TINYRCU which leads to a huge amount of RCU callbacks piling up.

Make the translation table statically initialized, so we are back to
the status of
Reported-by: Bruno Prémont
Cc: John stultz
Cc: Mike Galbraith
Cc: Paul E. McKenney
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/%3Calpine.LFD.2.02.1104282353140.3005%40ionos%3E
Reviewed-by: Ingo Molnar
Signed-off-by: Thomas Gleixner

Thomas Gleixner
2011-04-29 16:57:11 +0800
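
The difference between a table filled at runtime and a statically initialized one can be sketched with designated initializers. The enum names and values below are hypothetical shorthand for the clock ids and base indices mentioned in the message; `base_runtime()` and `base_static()` are illustrative helpers.

```c
#include <assert.h>

/* Hypothetical ids, shortened from the names used in the commit message. */
enum { CLK_REALTIME = 0, CLK_MONOTONIC = 1, CLK_BOOTTIME = 7, NR_CLK = 16 };
enum { BASE_REALTIME = 0, BASE_MONOTONIC = 1, BASE_BOOTTIME = 2 };

/* Filled at runtime: all zeros until init_runtime_table() has run, so every
 * clock id translates to BASE_REALTIME (0) for early callers. */
static int runtime_table[NR_CLK];

static void init_runtime_table(void)
{
    runtime_table[CLK_REALTIME]  = BASE_REALTIME;
    runtime_table[CLK_MONOTONIC] = BASE_MONOTONIC;
    runtime_table[CLK_BOOTTIME]  = BASE_BOOTTIME;
}

/* Statically initialized: correct before any init function has been called. */
static const int static_table[NR_CLK] = {
    [CLK_REALTIME]  = BASE_REALTIME,
    [CLK_MONOTONIC] = BASE_MONOTONIC,
    [CLK_BOOTTIME]  = BASE_BOOTTIME,
};

static int base_runtime(int clk)
{
    assert(clk >= 0 && clk < NR_CLK);
    return runtime_table[clk];
}

static int base_static(int clk)
{
    assert(clk >= 0 && clk < NR_CLK);
    return static_table[clk];
}
```

An early caller that looks up the monotonic clock in the runtime table gets the realtime base, which is the mistranslation described above; the static table gives the right answer regardless of init order.
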
1409f141a kernel/watchdog.c: disable nmi perf event in the error path of enabling watchdog

In corner cases where softlockup watchdog is not setup successfully, the
relevant nmi perf event for hardlockup watchdog could be disabled, then
the status of the underlying hardware remains unchanged.

Also, if the kthread doesn't start then the hrtimer won't run and the
hardlockup detector will falsely fire.

Signed-off-by: Hillf Danton
Signed-off-by: Don Zickus
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Hillf Danton
2011-04-29 02:28:21 +0800

27 Apr, 2011

1 commit

6c8a72132 Merge branch 'tip/perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/urgent

Ingo Molnar
2011-04-27 16:31:29 +0800

25 Apr, 2011

1 commit

bf26c0184 ptrace: Prepare to fix racy accesses on task breakpoints

When a task is traced and is in a stopped state, the tracer
may execute a ptrace request to examine the tracee state and
get its task struct. Right after, the tracee can be killed
and thus its breakpoints released.
This can happen concurrently when the tracer is in the middle
of reading or modifying these breakpoints, leading to dereferencing
a freed pointer.

Hence, to prepare the fix, create a generic breakpoint reference
holding API. When a reference on the breakpoints of a task is
held, the breakpoints won't be released until the last reference
is dropped. After that, no more ptrace request on the task's
breakpoints can be serviced for the tracer.

Reported-by: Oleg Nesterov
Signed-off-by: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Will Deacon
Cc: Prasad
Cc: Paul Mundt
Cc: v2.6.33..
Link: http://lkml.kernel.org/r/1302284067-7860-2-git-send-email-fweisbec@gmail.com

Frederic Weisbecker
2011-04-25 23:28:24 +0800
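
A reference-holding API of the kind described above might look like the sketch below. It is single-threaded and uses a plain `int` for brevity; real concurrent code would need an atomic counter. `struct task_bps`, `bps_get()` and `bps_put()` are hypothetical names, not the interface added by this commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct task_bps {
    int  refs;       /* starts at 1: the task's own reference */
    bool released;   /* set once the breakpoints have been freed */
};

static void bps_init(struct task_bps *t)
{
    assert(t != NULL);
    t->refs = 1;
    t->released = false;
}

/* Take a reference; fails once the count has already dropped to zero. */
static bool bps_get(struct task_bps *t)
{
    assert(t != NULL);
    if (t->refs == 0)
        return false;
    t->refs++;
    return true;
}

/* Drop a reference; the last one out releases the breakpoints. */
static void bps_put(struct task_bps *t)
{
    assert(t != NULL && t->refs > 0);
    if (--t->refs == 0)
        t->released = true;
}
```

A tracer would call `bps_get()` before touching the breakpoints and `bps_put()` afterwards; if the tracee exits in between, the release is deferred until the tracer's reference is dropped, and later `bps_get()` calls fail.
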

24 Apr, 2011

1 commit

686c4cbb1 Merge branch 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6

* 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
PM: Add missing syscore_suspend() and syscore_resume() calls
PM: Fix error code paths executed after failing syscore_suspend()

Linus Torvalds
2011-04-24 13:35:16 +0800

21 Apr, 2011

1 commit

d20ac2528 ftrace: Build without frame pointers on Microblaze

Microblaze doesn't need/support FRAME_POINTERS in order to have a working
function tracer.

The patch removes the Kconfig warning.

Warning log:
warning: (LOCKDEP && FAULT_INJECTION_STACKTRACE_FILTER && LATENCYTOP &&
FUNCTION_TRACER && KMEMCHECK) selects FRAME_POINTER which has unmet direct
dependencies (DEBUG_KERNEL && (CRIS || M68K || FRV || UML || AVR32 ||
SUPERH || BLACKFIN || MN10300) || ARCH_WANT_FRAME_POINTERS)

Signed-off-by: Michal Simek
Link: http://lkml.kernel.org/r/1301908812-8119-2-git-send-email-monstr@monstr.eu
CC: Frederic Weisbecker
CC: Ingo Molnar
Signed-off-by: Steven Rostedt

Michal Simek
2011-04-21 21:06:24 +0800

20 Apr, 2011

2 commits

19234c081 PM: Add missing syscore_suspend() and syscore_resume() calls

Device suspend/resume infrastructure is used not only by the suspend
and hibernate code in kernel/power, but also by APM, Xen and the
kexec jump feature. However, commit 40dc166cb5dddbd36aa4ad11c03915ea
(PM / Core: Introduce struct syscore_ops for core subsystems PM)
failed to add syscore_suspend() and syscore_resume() calls to that
code, which generally leads to breakage when the features in question
are used.

To fix this problem, add the missing syscore_suspend() and
syscore_resume() calls to arch/x86/kernel/apm_32.c, kernel/kexec.c
and drivers/xen/manage.c.

Signed-off-by: Rafael J. Wysocki
Acked-by: Greg Kroah-Hartman
Acked-by: Ian Campbell

Rafael J. Wysocki
2011-04-20 06:36:11 +0800
4ae0ff16e Merge branch 'timer-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'timer-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
RTC: rtc-omap: Fix a leak of the IRQ during init failure
posix clocks: Replace mutex with reader/writer semaphore

Linus Torvalds
2011-04-20 01:56:46 +0800

19 Apr, 2011

2 commits

2ca6f62f5 PM: Fix error code paths executed after failing syscore_suspend()

If syscore_suspend() fails in suspend_enter(), create_image() or
resume_target_kernel(), it is necessary to call sysdev_resume(),
because sysdev_suspend() has been called already and succeeded
and we are going to abort the transition.

Signed-off-by: Rafael J. Wysocki
Acked-by: Greg Kroah-Hartman

Rafael J. Wysocki
2011-04-19 05:58:59 +0800
c78193e9c next_pidmap: fix overflow condition

next_pidmap() just quietly accepted whatever 'last' pid that was passed
in, which is not all that safe when one of the users is /proc.

Admittedly the proc code should do some sanity checking on the range
(and that will be the next commit), but that doesn't mean that the
helper functions should just do that pidmap pointer arithmetic without
checking the range of its arguments.

So clamp 'last' to PID_MAX_LIMIT. The fact that we then do "last+1"
doesn't really matter, the for-loop does check against the end of the
pidmap array properly (it's only the actual pointer arithmetic overflow
case we need to worry about, and going one bit beyond isn't going to
overflow).

[ Use PID_MAX_LIMIT rather than pid_max as per Eric Biederman ]

Reported-by: Tavis Ormandy
Analyzed-by: Robert Święcki
Cc: Eric W. Biederman
Cc: Pavel Emelyanov
Signed-off-by: Linus Torvalds

Linus Torvalds
2011-04-19 01:35:30 +0800
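
The range check can be sketched as below. This is an illustration only: it rejects out-of-range values outright before any index arithmetic, which is one way to bound `last`, and `PID_LIMIT`, `BITS_PER_MAP`, `mark_used()` and `next_used_pid()` are hypothetical stand-ins for the kernel's pidmap code.

```c
#include <assert.h>

#define PID_LIMIT    4096u   /* hypothetical stand-in for PID_MAX_LIMIT */
#define BITS_PER_MAP 1024u   /* pids tracked per map page in this model */

static unsigned char in_use[PID_LIMIT / BITS_PER_MAP][BITS_PER_MAP];

static void mark_used(unsigned int pid)
{
    assert(pid < PID_LIMIT);
    in_use[pid / BITS_PER_MAP][pid % BITS_PER_MAP] = 1;
}

/* Next pid in use after 'last', or -1 if there is none. */
static int next_used_pid(unsigned int last)
{
    if (last >= PID_LIMIT)   /* check the range before computing any index */
        return -1;
    /* last + 1 may equal PID_LIMIT; the loop bound handles that case. */
    for (unsigned int pid = last + 1; pid < PID_LIMIT; pid++)
        if (in_use[pid / BITS_PER_MAP][pid % BITS_PER_MAP])
            return (int)pid;
    return -1;
}
```

Without the first check, a caller-supplied `last` far beyond the limit would be turned into an out-of-bounds map index before the loop bound ever came into play.
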

18 Apr, 2011

1 commit

1791f8814 posix clocks: Replace mutex with reader/writer semaphore

A dynamic posix clock is protected from asynchronous removal by a mutex.
However, using a mutex has the unwanted effect that a long running clock
operation in one process will unnecessarily block other processes.

For example, one process might call read() to get an external time stamp
coming in at one pulse per second. A second process calling clock_gettime
would have to wait for almost a whole second.

This patch fixes the issue by using a reader/writer semaphore instead of
a mutex.

Signed-off-by: Richard Cochran
Cc: John Stultz
Link: http://lkml.kernel.org/r/%3C20110330132421.GA31771%40riccoc20.at.omicron.at%3E
Signed-off-by: Thomas Gleixner

Richard Cochran
2011-04-18 16:39:38 +0800

17 Apr, 2011

2 commits

d733ed6c3 Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block

* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
block: make unplug timer trace event correspond to the schedule() unplug
block: let io_schedule() flush the plug inline

Linus Torvalds
2011-04-17 01:33:41 +0800
fdfc552ab Merge branches 'core-fixes-for-linus', 'perf-fixes-for-linus', 'sched-fixes-for-linus', 'timer-fixes-for-linus' and 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
futex: Set FLAGS_HAS_TIMEOUT during futex_wait restart setup

* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf_event: Fix cgrp event scheduling bug in perf_enable_on_exec()
perf: Fix a build error with some GCC versions

* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: Fix erroneous all_pinned logic
sched: Fix sched-domain avg_load calculation

* 'timer-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
RTC: rtc-mrst: follow on to the change of rtc_device_register()
RTC: add missing "return 0" in new alarm func for rtc-bfin.c
RTC: Fix s3c compile error due to missing s3c_rtc_setpie
RTC: Fix early irqs caused by calling rtc_set_alarm too early

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, amd: Disable GartTlbWlkErr when BIOS forgets it
x86, NUMA: Fix fakenuma boot failure
x86/mrst: Fix boot crash caused by incorrect pin to irq mapping
x86/ce4100: Add reg property to bridges

Linus Torvalds
2011-04-17 00:45:08 +0800

16 Apr, 2011

2 commits

49cac01e1 block: make unplug timer trace event correspond to the schedule() unplug

It's a pretty close match to what we had before - the timer triggering
would mean that nobody unplugged the plug in due time, in the new
scheme this matches very closely what the schedule() unplug now is.
It's essentially the difference between an explicit unplug (IO unplug)
or an implicit unplug (timer unplug, we scheduled with pending IO
queued).

Signed-off-by: Jens Axboe

Jens Axboe
2011-04-16 19:51:05 +0800
a237c1c5b block: let io_schedule() flush the plug inline

Linus correctly observes that the most important dispatch cases
are now done from kblockd, which isn't ideal for latency reasons.
The original reason for switching dispatches out-of-line was to
avoid too deep a stack, so by _only_ letting the "accidental"
flush directly in schedule() be guarded by offload to kblockd,
we should be able to get the best of both worlds.

So add a blk_schedule_flush_plug() that offloads to kblockd,
and only use that from the schedule() path.

Signed-off-by: Jens Axboe

Jens Axboe
2011-04-16 19:27:55 +0800

15 Apr, 2011

2 commits

5853b4f06 Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block

* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
block: only force kblockd unplugging from the schedule() path
block: cleanup the block plug helper functions
block, blk-sysfs: Use the variable directly instead of a function call
block: move queue run on unplug to kblockd
block: kill queue_sync_plugs()
block: readd plug trace event
block: add callback function for unplug notification
block: add comment on why we save and disable interrupts in flush_plug_list()
block: fixup block IO unplug trace call
block: remove block_unplug_timer() trace point
block: splice plug list to local context

Linus Torvalds
2011-04-15 23:01:13 +0800
0cd9c6494 futex: Set FLAGS_HAS_TIMEOUT during futex_wait restart setup

The FLAGS_HAS_TIMEOUT flag was not getting set, causing the restart_block to
restart futex_wait() without a timeout after a signal.

Commit b41277dc7a18ee332d in 2.6.38 introduced the regression by accidentally
removing the FLAGS_HAS_TIMEOUT assignment from futex_wait() during the setup
of the restart block. Restore the original behavior.

Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=32922

Reported-by: Tim Smith
Reported-by: Torsten Hilbrich
Signed-off-by: Darren Hart
Signed-off-by: Eric Dumazet
Cc: Peter Zijlstra
Cc: John Kacur
Cc: stable@kernel.org
Link: http://lkml.kernel.org/r/%3Cdaac0eb3af607f72b9a4d3126b2ba8fb5ed3b883.1302820917.git.dvhart%40linux.intel.com%3E
Signed-off-by: Thomas Gleixner

Darren Hart
2011-04-15 22:34:32 +0800
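
The effect of the missing flag can be modelled in a few lines. This is a sketch under stated assumptions: `struct restart`, the flag values, `setup_restart()` and `restart_is_bounded()` are hypothetical and only mirror the behaviour described in the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FLAG_SHARED      0x1u   /* hypothetical flag values */
#define FLAG_HAS_TIMEOUT 0x4u

struct restart { unsigned int flags; long deadline; };

/* Record what is needed to restart the wait after a signal. */
static void setup_restart(struct restart *r, unsigned int flags,
                          long deadline, bool with_fix)
{
    assert(r != NULL);
    r->flags = flags;
    if (with_fix)
        r->flags |= FLAG_HAS_TIMEOUT;   /* the assignment that had been lost */
    r->deadline = deadline;
}

/* On restart the deadline is only honoured when the flag is present. */
static bool restart_is_bounded(const struct restart *r)
{
    return (r->flags & FLAG_HAS_TIMEOUT) != 0;
}
```

Without the flag, the stored deadline is ignored and the restarted wait can block indefinitely even though the original call had a timeout.
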

13 Apr, 2011

1 commit

6631e635c block: don't flush plugged IO on forced preemption scheduling

We really only want to unplug the pending IO when the process actually
goes to sleep. So move the test for flushing the plug up to the place
where we actually deactivate the task - where we have properly checked
for preemption and for the process really sleeping.

Acked-by: Jens Axboe
Acked-by: Peter Zijlstra
Signed-off-by: Linus Torvalds

Linus Torvalds
2011-04-13 23:08:20 +0800
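
The rule "flush only when the task really goes to sleep" can be sketched as follows. `struct task` and `schedule_task()` are hypothetical; the model simply counts plugged and flushed requests and does not reflect the scheduler's actual data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct task {
    bool runnable;   /* false: the task has asked to sleep */
    int  plugged;    /* IO requests held back in the plug */
    int  flushed;    /* IO requests handed to the device */
};

/* Flush the plug only on the path where the task is actually deactivated. */
static void schedule_task(struct task *t, bool preempted)
{
    assert(t != NULL);
    if (!t->runnable && !preempted) {
        /* The task is going to sleep for real: submit its pending IO. */
        t->flushed += t->plugged;
        t->plugged = 0;
    }
    /* A forcibly preempted task keeps its plug and resumes shortly. */
}
```

A preempted task will run again soon and can keep batching, so flushing its plug at that point would only cost throughput.
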