Eric Lee / smarc-fsl-linux-kernel

14 Apr, 2011

1 commit

a4c98f8bb Merge branch 'linus' into sched/locking

Merge reason: Pick up this upstream commit:

6631e635c65d: block: don't flush plugged IO on forced preemption scheduling

As it modifies the scheduler and we'll queue up dependent patches.

Signed-off-by: Ingo Molnar

Ingo Molnar
2011-04-14 14:51:07 +0800

13 Apr, 2011

1 commit

6631e635c block: don't flush plugged IO on forced preemption scheduling

We really only want to unplug the pending IO when the process actually
goes to sleep. So move the test for flushing the plug up to the place
where we actually deactivate the task - where we have properly checked
for preemption and for the process really sleeping.

Acked-by: Jens Axboe
Acked-by: Peter Zijlstra
Signed-off-by: Linus Torvalds

Linus Torvalds
2011-04-13 23:08:20 +0800
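
The change in 6631e635c can be pictured with a small user-space model. This is a minimal sketch, not the kernel's schedule(): `struct plug_task`, `schedule_sketch()` and the `flushes` counter are hypothetical names, and flushing the plug is reduced to a counter increment. It only illustrates the plug being flushed on the path where the task is actually deactivated, and left alone on forced preemption.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a task; not the kernel's task_struct. */
struct plug_task {
    bool runnable;        /* false once the task has asked to sleep */
    bool has_plugged_io;  /* pending plugged I/O exists */
    unsigned int flushes; /* how many times the plug was flushed */
};

/* Simplified model of one scheduling decision. */
static void schedule_sketch(struct plug_task *t, bool preempted)
{
    /* Only a task that is really going to sleep gets deactivated. */
    if (!t->runnable && !preempted) {
        if (t->has_plugged_io) {
            t->flushes++;              /* stands in for flushing the plug */
            t->has_plugged_io = false;
        }
        /* ...the task would be taken off the run queue here... */
    }
    /* On forced preemption the plug is not touched. */
}
```

Calling `schedule_sketch()` with `preempted` set leaves the plugged I/O in place; calling it for a task that is no longer runnable and was not preempted flushes once.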

12 Apr, 2011

2 commits

d419e4c0f fix XEN_SAVE_RESTORE Kconfig dependencies

Make XEN_SAVE_RESTORE select HIBERNATE_CALLBACKS.
Remove XEN_SAVE_RESTORE dependency from PM_SLEEP.

Signed-off-by: Shriram Rajagopalan
Acked-by: Ian Campbell
Signed-off-by: Rafael J. Wysocki

Shriram Rajagopalan
2011-04-12 04:54:48 +0800
1f112cee0 PM / Hibernate: Introduce CONFIG_HIBERNATE_CALLBACKS

Xen save/restore is going to use hibernate device callbacks for
quiescing devices and putting them back to normal operations and it
would need to select CONFIG_HIBERNATION for this purpose. However,
that also would cause the hibernate interfaces for user space to be
enabled, which might confuse user space, because the Xen kernels
don't support hibernation. Moreover, it would be wasteful, as it
would make the Xen kernels include a substantial amount of code that
they would never use.

To address this issue, introduce a new power management Kconfig option
CONFIG_HIBERNATE_CALLBACKS, such that it will only select the code
that is necessary for the hibernate device callbacks to work and make
CONFIG_HIBERNATION select it. Then, Xen save/restore will be able to
select CONFIG_HIBERNATE_CALLBACKS without dragging the entire
hibernate code along with it.

Signed-off-by: Rafael J. Wysocki
Tested-by: Shriram Rajagopalan

Rafael J. Wysocki
2011-04-12 04:54:42 +0800

11 Apr, 2011

3 commits

f4ad9bd20 sched: Eliminate dead code from wakeup_gran()

calc_delta_fair() checks NICE_0_LOAD already, delete duplicate check.

Signed-off-by: Shaohua Li
Signed-off-by: Peter Zijlstra
Cc: Mike Galbraith
Link: http://lkml.kernel.org/r/1302238389.3981.92.camel@sli10-conroe
Signed-off-by: Ingo Molnar

Shaohua Li
2011-04-11 17:08:55 +0800
b30aef17f sched: Fix erroneous all_pinned logic

The scheduler load balancer has specific code to deal with cases of an
unbalanced system due to lots of unmovable tasks (for example because of
hard CPU affinity). In those situations, it excludes the busiest CPU that
has pinned tasks from load balance consideration so that it can perform
a second load balance pass on the rest of the system.

This all works as designed if there is only one cgroup in the system.

However, when we have multiple cgroups, this logic has false positives and
triggers multiple load balance passes even though there are actually no
pinned tasks at all.

The reason it has false positives is that the all pinned logic is deep in
the lowest function of can_migrate_task() and is too low level:

load_balance_fair() iterates over each task group and calls balance_tasks() to
migrate the target load. Along the way, balance_tasks() also sets an
all_pinned variable. Given that task groups are iterated, this all_pinned
variable is essentially the status of the last group in the scanning process.
A task group can have a number of reasons why no load is being migrated, none
of them due to CPU affinity. However, this status bit is propagated back up
to the higher-level load_balance(), which incorrectly thinks that no tasks
were moved. It kicks off the all-pinned logic and starts multiple passes
attempting to move load onto the puller CPU.

To fix this, move the all_pinned aggregation up to the iterator level.
This ensures that the status is aggregated over all task groups, not just
the last one in the list.

Signed-off-by: Ken Chen
Cc: stable@kernel.org
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/BANLkTi=ernzNawaR5tJZEsV_QVnfxqXmsQ@mail.gmail.com
Signed-off-by: Ingo Molnar

Ken Chen
2011-04-11 17:08:54 +0800
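
The difference described in b30aef17f between taking the flag from the last task group and aggregating it over all groups can be sketched as follows. This is a simplified model, not the kernel code: `struct group_result` and both function names are hypothetical, and the per-group flag is assumed to stay set whenever no task in that group passed the affinity check, including when the group had nothing to move for unrelated reasons.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-group outcome of one balancing pass. */
struct group_result {
    unsigned long load_moved; /* load migrated out of this group */
    bool all_pinned;          /* stays true if no task passed the affinity check */
};

/* Problem pattern: the flag ends up describing only the last group scanned. */
static bool all_pinned_last_group(const struct group_result *g, size_t n)
{
    bool all_pinned = true;

    for (size_t i = 0; i < n; i++)
        all_pinned = g[i].all_pinned; /* overwritten on every iteration */
    return all_pinned;
}

/* Aggregated pattern: one movable task anywhere clears the flag for good. */
static bool all_pinned_aggregated(const struct group_result *g, size_t n)
{
    bool all_pinned = true;

    for (size_t i = 0; i < n; i++)
        if (!g[i].all_pinned)
            all_pinned = false;
    return all_pinned;
}
```

With a first group that migrated load and a last group that had nothing to move, the first function reports "all pinned" while the second does not.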
b0432d8f1 sched: Fix sched-domain avg_load calculation

In find_busiest_group(), the sched-domain avg_load isn't
calculated at all if there is a group imbalance within the domain. This
causes an erroneous imbalance calculation.

The reason is that calculate_imbalance() sees sds->avg_load = 0 and
dumps the entire sds->max_load into the imbalance variable, which is used
later on to migrate the entire load from the busiest CPU to the puller CPU.

This has two really bad effects:

1. A stampede of task migrations that cannot break out
   of the bad state because of a positive feedback loop: large load
   delta -> heavier load migration -> larger imbalance, and the cycle
   goes on.

2. Severe imbalance in CPU queue depth. This causes really long
   scheduling latency blips, which badly affect applications that
   have tight latency requirements.

The fix is to have the kernel calculate the domain avg_load in both cases. This
ensures that the imbalance calculation is always sensible and the target
is usually halfway between the busiest and the puller CPU.

Signed-off-by: Ken Chen
Signed-off-by: Peter Zijlstra
Cc:
Link: http://lkml.kernel.org/r/20110408002322.3A0D812217F@elm.corp.google.com
Signed-off-by: Ingo Molnar

Ken Chen
2011-04-11 17:08:54 +0800
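
The arithmetic behind b0432d8f1 can be illustrated with a toy model. This is a sketch under stated assumptions, not the kernel's calculate_imbalance(): it ignores CPU power scaling, all function names are hypothetical, and the use of unsigned subtraction (which wraps when avg_load is 0) is an assumption made to reproduce the behaviour the commit message describes.

```c
/* Smaller of two unsigned values. */
static unsigned long min_ul(unsigned long a, unsigned long b)
{
    return a < b ? a : b;
}

/* Plain average of per-group loads; the real code also weighs CPU power. */
static unsigned long domain_avg_load(const unsigned long *group_load,
                                     unsigned int nr_groups)
{
    unsigned long total = 0;

    for (unsigned int i = 0; i < nr_groups; i++)
        total += group_load[i];
    return total / nr_groups;
}

/* Load to move: limited by how far the busiest group sits above the
 * average and by how far the pulling group sits below it. */
static unsigned long imbalance_sketch(unsigned long max_load,
                                      unsigned long this_load,
                                      unsigned long avg_load)
{
    unsigned long pull = max_load - avg_load;
    unsigned long room = avg_load - this_load; /* wraps if avg_load < this_load */

    return min_ul(pull, room);
}
```

For loads of 1000 and 200 the average is 600 and the sketch moves 400, which lands both sides at the midpoint. If avg_load is left at 0, the same call yields 1000, that is, the entire max_load.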

09 Apr, 2011

1 commit

f9fa0bc1f signal.c: fix erroneous syscall kernel-doc

Fix erroneous syscall kernel-doc comments in kernel/signal.c.

Reported-by: Matt Fleming
Signed-off-by: Randy Dunlap
Signed-off-by: Linus Torvalds

Randy Dunlap
2011-04-09 02:05:24 +0800

08 Apr, 2011

2 commits

8b9686ff4 Merge branches 'x86-fixes-for-linus', 'sched-fixes-for-linus', 'timers-fixes-for-linus', 'irq-fixes-for-linus' and 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86-32, fpu: Fix FPU exception handling on non-SSE systems
x86, hibernate: Initialize mmu_cr4_features during boot
x86-32, NUMA: Fix ACPI NUMA init broken by recent x86-64 change
x86: visws: Fixup irq overhaul fallout

* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: Clean up rebalance_domains() load-balance interval calculation

* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86/mrst/vrtc: Fix boot crash in mrst_rtc_init()
rtc, x86/mrst/vrtc: Fix boot crash in rtc_read_alarm()

* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
genirq: Fix cpumask leak in __setup_irq()

* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf probe: Fix listing incorrect line number with inline function
perf probe: Fix to find recursively inlined function
perf probe: Fix multiple --vars options behavior
perf probe: Fix to remove redundant close
perf probe: Fix to ensure function declared file

Linus Torvalds
2011-04-08 03:12:58 +0800
42933bac1 Merge branch 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6

* 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6:
Fix common misspellings

Linus Torvalds
2011-04-08 02:14:49 +0800

05 Apr, 2011

3 commits

49c022e65 sched: Clean up rebalance_domains() load-balance interval calculation

Instead of the possible multiple-evaluation of num_online_cpus()
in rebalance_domains() that Linus reported, avoid it altogether
in the normal case since it's implemented with a Hamming weight
function over a cpu bitmask which can be darn expensive for those
with big iron.

This also makes it cleaner, smaller and documents the code.

Reported-by: Linus Torvalds
Signed-off-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar

Peter Zijlstra
2011-04-05 16:29:36 +0800
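
To see why repeated evaluation matters in 49c022e65, here is a rough user-space sketch of a Hamming-weight count over a CPU bitmask, together with one way of keeping it off the hot path. The commit message does not say how the kernel avoids the evaluation, so the cached bound below is an assumption for illustration only; `mask_weight()`, `refresh_max_interval()`, `clamp_interval()` and `ticks_per_cpu` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hamming weight of a CPU bitmask; the cost grows with the mask size. */
static unsigned int mask_weight(const uint64_t *mask, size_t words)
{
    unsigned int weight = 0;

    for (size_t i = 0; i < words; i++) {
        uint64_t w = mask[i];

        while (w) {
            w &= w - 1; /* clear the lowest set bit */
            weight++;
        }
    }
    return weight;
}

/* Hypothetical cached upper bound, refreshed only when the mask changes. */
static unsigned long max_interval_cached = 1;

static void refresh_max_interval(const uint64_t *online, size_t words,
                                 unsigned long ticks_per_cpu)
{
    max_interval_cached = ticks_per_cpu * mask_weight(online, words);
}

/* Hot path: clamp the interval without walking the bitmask. */
static unsigned long clamp_interval(unsigned long interval)
{
    if (interval < 1)
        return 1;
    if (interval > max_interval_cached)
        return max_interval_cached;
    return interval;
}
```

The point of the split is that the bitmask walk happens once per mask change rather than once (or several times) per balancing pass.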
41c57892a kernel/signal.c: add kernel-doc notation to syscalls

Add kernel-doc to syscalls in signal.c.

Signed-off-by: Randy Dunlap
Signed-off-by: Linus Torvalds

Randy Dunlap
2011-04-05 08:51:46 +0800
5aba085ed kernel/signal.c: fix typos and coding style

General coding style and comment fixes; no code changes:

- Use multi-line-comment coding style.
- Put some function signatures completely on one line.
- Hyphenate some words.
- Spell Posix as POSIX.
- Correct typos & spellos in some comments.
- Drop trailing whitespace.
- End sentences with periods.

Signed-off-by: Randy Dunlap
Signed-off-by: Linus Torvalds

Randy Dunlap
2011-04-05 08:51:46 +0800

04 Apr, 2011

3 commits

148086bb6 Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: Fix rebalance interval calculation
sched, doc: Beef up load balancing description
sched: Leave sched_setscheduler() earlier if possible, do not disturb SCHED_FIFO tasks

Linus Torvalds
2011-04-04 23:36:58 +0800
4da7e90e6 Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf: Fix task_struct reference leak
perf: Fix task context scheduling
perf: mmap 512 kiB by default
perf: Rebase max unprivileged mlock threshold on top of page size
perf tools: Fix NO_NEWT=1 python build error
perf symbols: Properly align symbol_conf.priv_size
perf tools: Emit clearer message for sys_perf_event_open ENOENT return
perf tools: Fixup exit path when not able to open events
perf symbols: Fix vsyscall symbol lookup
oprofile, x86: Allow setting EDGE/INV/CMASK for counter events

Linus Torvalds
2011-04-04 23:36:40 +0800
4352d9d44 ntp: fix non privileged system time shifting

The ADJ_SETOFFSET bit added in commit 094aa188 ("ntp: Add ADJ_SETOFFSET
mode bit") also introduced a way for any user to change the system time.
Sneaky or buggy calls to adjtimex() could set

ADJ_OFFSET_SS_READ | ADJ_SETOFFSET

which would result in a successful call to timekeeping_inject_offset().
This patch fixes the issue by adding the capability check.

Signed-off-by: Richard Cochran
Signed-off-by: Linus Torvalds

Richard Cochran
2011-04-04 23:31:23 +0800
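
The hole closed by 4352d9d44 can be modelled in a few lines. This is a sketch under assumptions, not the kernel's adjtimex() path: the `SK_` constants are local stand-ins whose bit values are illustrative, the pre-existing permission check is a simplified guess at why a read-only mode was let through without privilege, and `check_modes_sketch()` with its `patched` switch is hypothetical.

```c
#include <errno.h>
#include <stdbool.h>

/* Local stand-ins; the bit values are illustrative assumptions. */
#define SK_ADJ_SETOFFSET       0x0100u
#define SK_ADJ_OFFSET_READONLY 0x2000u
#define SK_ADJ_ADJTIME         0x8000u
#define SK_ADJ_OFFSET_SS_READ  (SK_ADJ_ADJTIME | SK_ADJ_OFFSET_READONLY | 0x0001u)

/* Returns 0 if the request is allowed, -EPERM otherwise. */
static int check_modes_sketch(unsigned int modes, bool has_cap_sys_time,
                              bool patched)
{
    if (modes & SK_ADJ_ADJTIME) {
        /* Assumed existing rule: read-only queries need no privilege. */
        if (!(modes & SK_ADJ_OFFSET_READONLY) && !has_cap_sys_time)
            return -EPERM;
    } else if (modes && !has_cap_sys_time) {
        return -EPERM;
    }

    /* The added check: setting an offset always needs the capability. */
    if (patched && (modes & SK_ADJ_SETOFFSET) && !has_cap_sys_time)
        return -EPERM;

    return 0;
}
```

In this model an unprivileged caller passing the read-only mode together with the set-offset bit is accepted without the added check and rejected with it.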

03 Apr, 2011

1 commit

4f5058c3b genirq: Fix cpumask leak in __setup_irq()

The allocated cpumask should be freed in __setup_irq().

Signed-off-by: Xiaotian Feng
LKML-Reference:
Signed-off-by: Thomas Gleixner

Xiaotian Feng
2011-04-03 03:26:20 +0800

01 Apr, 2011

1 commit

c0bb9e45f kdump: Allow shrinking of kdump region to be overridden

On ppc64 the crashkernel region almost always overlaps an area of firmware.
This works fine except when using the sysfs interface to reduce the kdump
region. If we free the firmware area we are guaranteed to crash.

Rename free_reserved_phys_range to crash_free_reserved_phys_range and make
it a weak function so we can override it.

Signed-off-by: Anton Blanchard
Signed-off-by: Benjamin Herrenschmidt

Anton Blanchard
2011-04-01 13:14:30 +0800

31 Mar, 2011

5 commits

25985edce Fix common misspellings

Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi

Lucas De Marchi
2011-03-31 22:26:23 +0800
fd1edb3aa perf: Fix task_struct reference leak

sys_perf_event_open() had an imbalance in the number of task refs it
took, causing memory leakage.

Cc: Jiri Olsa
Cc: Oleg Nesterov
Cc: stable@kernel.org # .37+
Signed-off-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar

Peter Zijlstra
2011-03-31 19:02:56 +0800
20443384f perf: Rebase max unprivileged mlock threshold on top of page size

Ensure we allow 512 kiB + 1 page for user control without
assuming a 4096-byte page size.

Reported-by: Peter Zijlstra
Signed-off-by: Frederic Weisbecker
Signed-off-by: Peter Zijlstra
Cc: Arnaldo Carvalho de Melo
Cc: Paul Mackerras
Cc: Stephane Eranian
Cc:
LKML-Reference:
Signed-off-by: Ingo Molnar

Frederic Weisbecker
2011-03-31 19:02:54 +0800
3436ae129 sched: Fix rebalance interval calculation

The interval for checking whether scheduling domains are due to be
balanced currently depends on the boot-state NR_CPUS, which may not
accurately reflect the number of online CPUs at the time of the check.

Thus replace NR_CPUS with num_online_cpus().

(ed: Should only affect those who set NR_CPUS really high, such as 4096
or so :-)

Signed-off-by: Sisir Koppaka
Signed-off-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar

Sisir Koppaka
2011-03-31 19:00:37 +0800
a51e91981 sched: Leave sched_setscheduler() earlier if possible, do not disturb SCHED_FIFO tasks

sched_setscheduler() (in sched.c) is called in order to change the
scheduling policy and/or the real-time priority of a task. Thus,
if we find out that neither of those is actually being modified, it
is possible to return earlier and save the overhead of a full
deactivate+activate cycle of the task in question.

Besides that, if we have more than one SCHED_FIFO task with the same
priority on the same rq (which means they share the same priority queue),
having one of them change its position in the priority queue because of
a sched_setscheduler() call (as happens by means of the deactivate+activate)
that does not actually change the priority violates POSIX, which states,
for SCHED_FIFO:

"If a thread whose policy or priority has been modified by
pthread_setschedprio() is a running thread or is runnable, the effect on
its position in the thread list depends on the direction of the
modification, as follows: a. [...] b. If the priority is unchanged, the
thread does not change position in the thread list. c. [...]"

http://pubs.opengroup.org/onlinepubs/009695399/functions/xsh_chap02_08.html

(ed: And the POSIX specification here does, briefly and somewhat unexpectedly,
match what common sense tells us as well. )

Signed-off-by: Dario Faggioli
Signed-off-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar

Dario Faggioli
2011-03-31 19:00:34 +0800
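
The early return described in a51e91981 can be sketched with a toy task structure. This is a minimal illustration, not the kernel's sched_setscheduler(): `struct sched_task_sketch`, `set_scheduler_sketch()` and the `requeues` counter are hypothetical, the counter stands in for the deactivate+activate cycle, and the unchanged-request test is simplified to a plain comparison of both fields.

```c
/* Hypothetical task; 'requeues' counts deactivate+activate cycles. */
struct sched_task_sketch {
    int policy;
    int rt_priority;
    unsigned int requeues;
};

/* Returns 0 on success, as a stand-in for the real call. */
static int set_scheduler_sketch(struct sched_task_sketch *t,
                                int policy, int rt_priority)
{
    /* Nothing would change: leave early, keep the queue position. */
    if (policy == t->policy && rt_priority == t->rt_priority)
        return 0;

    /* Otherwise the task is taken off its queue and put back. */
    t->requeues++;
    t->policy = policy;
    t->rt_priority = rt_priority;
    return 0;
}
```

A call that repeats the current policy and priority leaves `requeues` untouched, so the task keeps its place among equal-priority SCHED_FIFO peers in this model.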

30 Mar, 2011

2 commits

78c898256 genirq: Remove the now obsolete config options and select statements

Signed-off-by: Thomas Gleixner

Thomas Gleixner
2011-03-30 20:13:23 +0800
353c8ed44 genirq: Fix misnamed label in handle_edge_eoi_irq

Reported-by: michael@ellerman.id.au
Signed-off-by: Thomas Gleixner
Cc: linuxppc-dev@lists.ozlabs.org

Thomas Gleixner
2011-03-30 04:24:05 +0800

29 Mar, 2011

9 commits

851d7cf64 genirq: Remove move_*irq leftovers

All users converted to new interface.

Signed-off-by: Thomas Gleixner

Thomas Gleixner
2011-03-29 20:50:32 +0800
0c6f8a8b9 genirq: Remove compat code

Signed-off-by: Thomas Gleixner

Thomas Gleixner
2011-03-29 20:48:19 +0800
a6e120ed4 alpha: Use generic show_interrupts()

The only subtle difference is that alpha uses ACTUAL_NR_IRQS and
prints the IRQF_DISABLED flag.

Change the generic implementation to deal with ACTUAL_NR_IRQS if
defined.

The IRQF_DISABLED printing is pointless, as we nowadays run all
interrupts with irqs disabled.

Signed-off-by: Thomas Gleixner

Thomas Gleixner
2011-03-29 20:47:58 +0800
cd22c0e44 genirq: Fix harmless typo

The late night fixup failed to convert the data type from irq_desc to
irq_data, which results in a harmless but annoying warning.

Signed-off-by: Thomas Gleixner

Thomas Gleixner
2011-03-29 17:36:05 +0800
e5217fb8a Merge branches 'irq-cleanup-for-linus' and 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'irq-cleanup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
vlynq: Convert irq functions

* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
genirq: Fix cleanup fallout
genirq: Fix typo and remove unused variable
genirq: Fix new kernel-doc warnings
genirq: Add setter for AFFINITY_SET in irq_data state
genirq: Provide setter inline for IRQD_IRQ_INPROGRESS
genirq: Remove handle_IRQ_event
arm: Ns9xxx: Remove private irq flow handler
powerpc: cell: Use the core flow handler
genirq: Provide edge_eoi flow handler
genirq: Move INPROGRESS, MASKED and DISABLED state flags to irq_data
genirq: Split irq_set_affinity() so it can be called with lock held.
genirq: Add chip flag for restricting cpu_on/offline calls
genirq: Add chip hooks for taking CPUs on/off line.
genirq: Add irq disabled flag to irq_data state
genirq: Reserve the irq when calling irq_set_chip()

Linus Torvalds
2011-03-29 08:39:54 +0800
0ef5ca1e1 genirq: Fix cleanup fallout

I missed the CONFIG_GENERIC_PENDING_IRQ dependency in the affinity
related functions and the IRQ_LEVEL propagation into irq_data
state. Did not pop up on my main test platforms. :(

Signed-off-by: Thomas Gleixner
Tested-by: David Daney

Thomas Gleixner
2011-03-29 07:41:22 +0800
243b422af Relax si_code check in rt_sigqueueinfo and rt_tgsigqueueinfo

Commit da48524eb206 ("Prevent rt_sigqueueinfo and rt_tgsigqueueinfo
from spoofing the signal code") made the check on si_code too strict.
There are several legitimate places where glibc wants to queue a
negative si_code different from SI_QUEUE:

- This was first noticed with glibc's aio implementation, which wants
to queue a signal with si_code SI_ASYNCIO; the current kernel
causes glibc's tst-aio4 test to fail because rt_sigqueueinfo()
fails with EPERM.

- Further examination of the glibc source shows that getaddrinfo_a()
wants to use SI_ASYNCNL (which the kernel does not even define).
The timer_create() fallback code wants to queue signals with SI_TIMER.

As suggested by Oleg Nesterov, loosen the check to
forbid only the problematic SI_TKILL case.

Reported-by: Klaus Dittrich
Acked-by: Julien Tinnes
Cc:
Signed-off-by: Roland Dreier
Signed-off-by: Linus Torvalds

Roland Dreier
2011-03-29 06:45:44 +0800
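
The two checks discussed in 243b422af can be compared side by side. This is a sketch, not the kernel source: the `SK_` constants are local stand-ins for the Linux si_code values, both function names are hypothetical, and keeping non-negative codes forbidden in the relaxed variant is an assumption based on the commit message speaking only of negative codes.

```c
#include <errno.h>

/* Local stand-ins for the Linux si_code values. */
enum {
    SK_SI_QUEUE   = -1,
    SK_SI_TIMER   = -2,
    SK_SI_ASYNCIO = -4,
    SK_SI_TKILL   = -6
};

/* Strict variant: anything other than SI_QUEUE is refused. */
static int strict_si_code_check(int si_code)
{
    return si_code != SK_SI_QUEUE ? -EPERM : 0;
}

/* Relaxed variant: negative codes pass, except the SI_TKILL case. */
static int relaxed_si_code_check(int si_code)
{
    if (si_code >= 0 || si_code == SK_SI_TKILL)
        return -EPERM;
    return 0;
}
```

Under the strict variant a request with the asynchronous I/O code fails with EPERM, which matches the tst-aio4 failure reported above; the relaxed variant accepts it while still refusing SI_TKILL.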
a6aeddd1c genirq: Fix typo and remove unused variable

Sigh, I'm overworked.

Signed-off-by: Thomas Gleixner

Thomas Gleixner
2011-03-29 02:28:56 +0800
30398bf6c genirq: Fix new kernel-doc warnings

Fix new irq-related kernel-doc warnings in 2.6.38:

Warning(kernel/irq/manage.c:149): No description found for parameter 'mask'
Warning(kernel/irq/manage.c:149): Excess function parameter 'cpumask' description in 'irq_set_affinity'
Warning(include/linux/irq.h:161): No description found for parameter 'state_use_accessors'
Warning(include/linux/irq.h:161): Excess struct/union/enum/typedef member 'state_use_accessor' description in 'irq_data'

Signed-off-by: Randy Dunlap
LKML-Reference:
Signed-off-by: Thomas Gleixner

Randy Dunlap
2011-03-29 02:13:57 +0800

28 Mar, 2011

3 commits

33b054b86 genirq: Remove handle_IRQ_event

Last user gone.

Signed-off-by: Thomas Gleixner

Thomas Gleixner
2011-03-28 22:55:11 +0800
0521c8fbb genirq: Provide edge_eoi flow handler

This is a replacement for the cell flow handler, which is in the way of
cleanups. Must be selected to avoid general bloat.

Signed-off-by: Thomas Gleixner

Thomas Gleixner
2011-03-28 22:55:11 +0800
32f4125eb genirq: Move INPROGRESS, MASKED and DISABLED state flags to irq_data

We really need these flags for some of the interrupt chips. Move them
from internal state to irq_data and provide proper accessors.

Signed-off-by: Thomas Gleixner
Cc: David Daney

Thomas Gleixner
2011-03-28 22:55:10 +0800

27 Mar, 2011

3 commits

c2d0c555c genirq: Split irq_set_affinity() so it can be called with lock held.

The .irq_cpu_online() and .irq_cpu_offline() functions may need to
adjust affinity, but they are called with the descriptor lock held.
Create __irq_set_affinity_locked() which is called with the lock held.
Make irq_set_affinity() just a wrapper that acquires the lock.

[ tglx: Changed the argument to irq_data, added a !desc check and
moved the !irq_set_affinity check where it belongs ]

Signed-off-by: David Daney
Cc: linux-mips@linux-mips.org
Cc: ralf@linux-mips.org
LKML-Reference:
Signed-off-by: Thomas Gleixner

David Daney
2011-03-27 23:45:59 +0800
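
The locked/unlocked split described in c2d0c555c is a common pattern and can be sketched in user space with a pthread mutex. This is an illustration only, not the genirq code: `struct irq_desc_sketch` and both function names are hypothetical, the affinity mask is reduced to a single unsigned long, and rejecting an empty mask is an invented validity check.

```c
#include <pthread.h>

/* Hypothetical descriptor protected by its own lock. */
struct irq_desc_sketch {
    pthread_mutex_t lock;
    unsigned long affinity;
};

/* Core operation: the caller must already hold d->lock. */
static int set_affinity_locked_sketch(struct irq_desc_sketch *d,
                                      unsigned long mask)
{
    if (mask == 0)
        return -1; /* an empty mask is rejected in this sketch */
    d->affinity = mask;
    return 0;
}

/* Public entry point: just a wrapper that takes and drops the lock. */
static int set_affinity_sketch(struct irq_desc_sketch *d, unsigned long mask)
{
    int ret;

    pthread_mutex_lock(&d->lock);
    ret = set_affinity_locked_sketch(d, mask);
    pthread_mutex_unlock(&d->lock);
    return ret;
}
```

Code that already holds the lock, such as a CPU online/offline callback in the scenario above, would call the locked variant directly and avoid taking the lock twice.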
b3d422329 genirq: Add chip flag for restricting cpu_on/offline calls

Add a flag which indicates that the on/offline callback should only be
called on enabled interrupts.

Signed-off-by: Thomas Gleixner

Thomas Gleixner
2011-03-27 23:45:58 +0800
0fdb4b259 genirq: Add chip hooks for taking CPUs on/off line.

[ tglx: Removed the enabled argument as this is now available in
irq_data ]

Signed-off-by: David Daney
Cc: linux-mips@linux-mips.org
Cc: ralf@linux-mips.org
LKML-Reference:
Signed-off-by: Thomas Gleixner

David Daney
2011-03-27 23:45:58 +0800