29 Dec, 2010

1 commit


25 Dec, 2010

1 commit


24 Dec, 2010

1 commit

  • Fix two related problems in the event-copying loop of
    ring_buffer_read_page.

    The loop condition for copying events is off-by-one.
    "len" is the remaining space in the caller-supplied page.
    "size" is the size of the next event (or two events).
    If len == size, then there is just enough space for the next event.

    size was set to rb_event_ts_length, which may include the size of two
    events if the first event is a time-extend, in order to ensure
    time-extends are kept together with the event after it. However,
    rb_advance_reader always advances by one event. This would result in the
    event after any time-extend being duplicated. Instead, get the size of
    a single event for the memcpy, but use rb_event_ts_length for the loop
    condition.
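    The fix can be sketched in userspace C; the event layout and helper
    names below are simplified stand-ins for the kernel code, and the
    memcpy is modeled as size accounting:

    ```c
    #include <assert.h>

    #define TIME_EXTEND 1  /* stand-in marker for a time-extend event */

    struct event { int type; int len; };

    /* size used for the LOOP CONDITION: a time-extend is kept together
     * with the event that follows it (cf. rb_event_ts_length) */
    static int event_ts_length(const struct event *e)
    {
            if (e->type == TIME_EXTEND)
                    return e->len + (e + 1)->len;
            return e->len;
    }

    /* size used for the COPY: one event only, since the reader
     * advances by a single event (cf. rb_advance_reader) */
    static int event_length(const struct event *e)
    {
            return e->len;
    }

    int main(void)
    {
            struct event events[] = { { TIME_EXTEND, 8 }, { 0, 16 }, { 0, 16 } };
            int len = 40;           /* remaining space in the caller page */
            int copied = 0, i = 0;

            /* "<=" rather than "<": len == size means just enough room */
            while (i < 3 && event_ts_length(&events[i]) <= len) {
                    int sz = event_length(&events[i]);  /* copy ONE event */
                    copied += sz;
                    len -= sz;
                    i++;                                /* advance by one */
            }

            assert(copied == 8 + 16 + 16);  /* all events, none duplicated */
            return 0;
    }
    ```

    Using event_ts_length for the copy as well would copy 24 bytes on the
    first iteration and then re-copy the 16-byte event that follows the
    time-extend, which is exactly the duplication the patch removes.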

    Signed-off-by: David Sharp
    LKML-Reference:
    LKML-Reference:
    Signed-off-by: Steven Rostedt

    David Sharp
     

23 Dec, 2010

1 commit

  • The taskstats structure is internally aligned on 8 byte boundaries but the
    layout of the aggregate reply, with two NLA headers and the pid (each 4
    bytes), actually forces the entire structure to be unaligned. This causes
    the kernel to issue unaligned access warnings on some architectures like
    ia64. Unfortunately, some software out there doesn't properly unroll the
    NLA packet and assumes that the start of the taskstats structure will
    always be 20 bytes from the start of the netlink payload. Aligning the
    start of the taskstats structure breaks this software, which we don't
    want. So, for now the alignment only happens on architectures that
    require it and those users will have to update to fixed versions of those
    packages. Space is reserved in the packet only when needed. This ifdef
    should be removed in several years, e.g. in 2012, once we can be confident
    that fixed versions are installed on most systems. We add the padding
    before the aggregate since the aggregate is already a defined type.

    Commit 85893120 ("delayacct: align to 8 byte boundary on 64-bit systems")
    previously addressed the alignment issues by padding out the pid field.
    This was supposed to be a compatible change but the circumstances
    described above mean that it wasn't. This patch backs out that change,
    since it was a hack, and introduces a new NULL attribute type to provide
    the padding. Padding the response with 4 bytes avoids allocating an
    aligned taskstats structure and copying it back. Since the structure
    weighs in at 328 bytes, it's too big to do it on the stack.
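    The arithmetic behind the padding attribute can be checked directly; the
    offsets follow the commit text (taskstats historically starts 20 bytes
    into the netlink payload):

    ```c
    #include <assert.h>

    int main(void)
    {
            int off = 20;                 /* historical start of taskstats */
            int pad = (8 - off % 8) % 8;  /* bytes to the next 8-byte boundary */

            assert(pad == 4);             /* size of the NULL padding attribute */
            assert((off + pad) % 8 == 0); /* aligned start once padded */
            return 0;
    }
    ```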

    Signed-off-by: Jeff Mahoney
    Reported-by: Brian Rogers
    Cc: Jeff Mahoney
    Cc: Guillaume Chazarain
    Cc: Balbir Singh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Mahoney
     

22 Dec, 2010

1 commit

  • The spinlock in kthread_worker and the wait_queue_head in kthread_work
    should both be lockdep-sensible, so change the interface to make it
    suitable for CONFIG_LOCKDEP.

    tj: comment update

    Reported-by: Nicolas
    Signed-off-by: Yong Zhang
    Signed-off-by: Andy Walls
    Tested-by: Andy Walls
    Cc: Tejun Heo
    Cc: Andrew Morton
    Signed-off-by: Tejun Heo

    Yong Zhang
     

21 Dec, 2010

1 commit


20 Dec, 2010

3 commits

  • Linus reported that the new warning introduced by commit f26f9aff6aaf
    "Sched: fix skip_clock_update optimization" triggers. The need_resched
    flag can be set by other CPUs asynchronously so this debug check is
    bogus - remove it.

    Reported-by: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • …nel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

    * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86-32: Make sure we can map all of lowmem if we need to
    x86, vt-d: Handle previous faults after enabling fault handling
    x86: Enable the intr-remap fault handling after local APIC setup
    x86, vt-d: Fix the vt-d fault handling irq migration in the x2apic mode
    x86, vt-d: Quirk for masking vtd spec errors to platform error handling logic
    x86, xsave: Use alloc_bootmem_align() instead of alloc_bootmem()
    bootmem: Add alloc_bootmem_align()
    x86, gcc-4.6: Use gcc -m options when building vdso
    x86: HPET: Chose a paranoid safe value for the ETIME check
    x86: io_apic: Avoid unused variable warning when CONFIG_GENERIC_PENDING_IRQ=n

    * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    perf: Fix off by one in perf_swevent_init()
    perf: Fix duplicate events with multiple-pmu vs software events
    ftrace: Have recordmcount honor endianness in fn_ELF_R_INFO
    scripts/tags.sh: Add magic for trace-events
    tracing: Fix panic when lseek() called on "trace" opened for writing

    Linus Torvalds
     
  • …l/git/tip/linux-2.6-tip

    * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    sched: Fix the irqtime code for 32bit
    sched: Fix the irqtime code to deal with u64 wraps
    nohz: Fix get_next_timer_interrupt() vs cpu hotplug
    Sched: fix skip_clock_update optimization
    sched: Cure more NO_HZ load average woes

    Linus Torvalds
     

19 Dec, 2010

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
    x86: avoid high BIOS area when allocating address space
    x86: avoid E820 regions when allocating address space
    x86: avoid low BIOS area when allocating address space
    resources: add arch hook for preventing allocation in reserved areas
    Revert "resources: support allocating space within a region from the top down"
    Revert "PCI: allocate bus resources from the top down"
    Revert "x86/PCI: allocate space from the end of a region, not the beginning"
    Revert "x86: allocate space within a region top-down"
    Revert "PCI: fix pci_bus_alloc_resource() hang, prefer positive decode"
    PCI: Update MCP55 quirk to not affect non HyperTransport variants

    Linus Torvalds
     

18 Dec, 2010

2 commits


17 Dec, 2010

2 commits

  • Commit 3624eb0 (PM / Hibernate: Modify signature used to mark swap)
    attempted to modify hibernate signature used to mark swap partitions
    containing hibernation images, so that old kernels don't try to
    handle compressed images. However, this change broke resume from
    hibernation on Fedora 14 that apparently doesn't pass the resume=
    argument to the kernel and tries to trigger resume from early user
    space. This doesn't work, because the signature is now different,
    so the old signature has to be restored to avoid the problem.

    Addresses https://bugzilla.kernel.org/show_bug.cgi?id=22732 .

    Reported-by: Dr. David Alan Gilbert
    Reported-by: Zhang Rui
    Reported-by: Pascal Chapperon
    Signed-off-by: Rafael J. Wysocki

    Rafael J. Wysocki
     
  • The user-space hibernation code sends a wrong notification after the image
    restoration because of a thinko in the file flag check. RDONLY
    corresponds to hibernation and WRONLY to restoration, confusingly.
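    The thinko is easy to model; this assumes the usual /dev/snapshot
    convention where a read-only open means "save the image" and a
    write-only open means "restore it" (mode_meaning() is illustrative):

    ```c
    #include <assert.h>
    #include <fcntl.h>

    static const char *mode_meaning(int flags)
    {
            /* O_RDONLY selects hibernation, O_WRONLY selects restoration */
            return (flags & O_ACCMODE) == O_RDONLY ? "hibernate" : "restore";
    }

    int main(void)
    {
            assert(mode_meaning(O_RDONLY)[0] == 'h');
            assert(mode_meaning(O_WRONLY)[0] == 'r');
            return 0;
    }
    ```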

    Signed-off-by: Takashi Iwai
    Signed-off-by: Rafael J. Wysocki
    Cc: stable@kernel.org

    Takashi Iwai
     

16 Dec, 2010

4 commits

  • …rostedt/linux-2.6-trace into perf/urgent

    Ingo Molnar
     
  • Since the irqtime accounting is using non-atomic u64 and can be read
    from remote cpus (writes are strictly cpu local, reads are not) we
    have to deal with observing partial updates.

    When we do observe partial updates the clock movement (in particular,
    ->clock_task movement) will go funny (in either direction), a
    subsequent clock update (observing the full update) will make it go
    funny in the oposite direction.

    Since we rely on these clocks to be strictly monotonic we cannot
    suffer backwards motion. One possible solution would be to simply
    ignore all backwards deltas, but that will lead to accounting
    artefacts, most notably clock_task + irq_time != clock; this
    inaccuracy would end up in user-visible stats.

    Therefore serialize the reads using a seqcount.
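    The seqcount protocol can be sketched in userspace; this shows only the
    shape of the retry loop (no memory barriers or SMP safety), with
    stand-in variables for the irqtime counters:

    ```c
    #include <assert.h>
    #include <stdint.h>

    static unsigned seq;                    /* odd while a write is in flight */
    static uint64_t irq_time_a, irq_time_b; /* the non-atomic u64 pair */

    static void write_irqtime(uint64_t a, uint64_t b)
    {
            seq++;                  /* odd: partial update may be visible */
            irq_time_a = a;
            irq_time_b = b;
            seq++;                  /* even: update complete */
    }

    static uint64_t read_irqtime_sum(void)
    {
            unsigned start;
            uint64_t sum;

            do {                    /* retry on odd or changed sequence */
                    start = seq;
                    sum = irq_time_a + irq_time_b;
            } while ((start & 1) || seq != start);

            return sum;
    }

    int main(void)
    {
            write_irqtime(100, 200);
            assert(read_irqtime_sum() == 300);
            return 0;
    }
    ```

    A reader that raced with write_irqtime() would see an odd or changed
    sequence and retry, never returning a sum built from a partial update.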

    Reviewed-by: Venkatesh Pallipadi
    Reported-by: Mikael Pettersson
    Tested-by: Mikael Pettersson
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Some ARM systems have a short sched_clock() [ which needs to be fixed
    too ], but this exposed a bug in the irq_time code as well: it doesn't
    deal with wraps at all.

    Fix the irq_time code to deal with u64 wraps by re-writing the code to
    only use delta increments, which avoids the whole issue.
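    The delta trick relies on a property of unsigned arithmetic that a short
    program can demonstrate:

    ```c
    #include <assert.h>
    #include <stdint.h>

    int main(void)
    {
            uint64_t old_clock = UINT64_MAX - 5;  /* just before the wrap */
            uint64_t new_clock = 10;              /* just after the wrap */

            /* unsigned subtraction is modulo 2^64, so the delta is still
             * correct even though new_clock < old_clock numerically */
            uint64_t delta = new_clock - old_clock;
            assert(delta == 16);
            return 0;
    }
    ```

    Accumulating such deltas sidesteps any comparison of absolute clock
    values, so a wrap never produces a bogus huge (or negative) interval.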

    Reviewed-by: Venkatesh Pallipadi
    Reported-by: Mikael Pettersson
    Tested-by: Mikael Pettersson
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • The perf_swevent_enabled[] array has PERF_COUNT_SW_MAX elements.
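    In other words, valid indices run from 0 to PERF_COUNT_SW_MAX - 1, so
    the bounds check must reject event_id == PERF_COUNT_SW_MAX. A sketch
    (the constant's value here is illustrative):

    ```c
    #include <assert.h>

    #define PERF_COUNT_SW_MAX 9   /* illustrative */

    static int swevent_init_check(int event_id)
    {
            if (event_id >= PERF_COUNT_SW_MAX)  /* fixed: ">=", not ">" */
                    return -1;
            return 0;
    }

    int main(void)
    {
            assert(swevent_init_check(PERF_COUNT_SW_MAX - 1) == 0);
            /* ">" would have let this index one element past the array */
            assert(swevent_init_check(PERF_COUNT_SW_MAX) == -1);
            return 0;
    }
    ```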

    Signed-off-by: Dan Carpenter
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Dan Carpenter
     

15 Dec, 2010

1 commit


14 Dec, 2010

1 commit

  • Running the annotate branch profiler on three boxes, including my
    main box that runs firefox, evolution, xchat, and is part of the distcc farm,
    showed this with the likelys in the workqueue code:

    correct  incorrect   %  Function             File         Line
    -------  ---------  --  -------------------  -----------  ----
         96     996253  99  wq_worker_sleeping   workqueue.c   703
         96     996247  99  wq_worker_waking_up  workqueue.c   677

    The likely()s in this case were assuming that WORKER_NOT_RUNNING will
    most likely be false. But this is not the case. The reason is
    (and shown by adding trace_printks and testing it) that most of the time
    WORKER_PREP is set.

    In worker_thread() we have:

    worker_clr_flags(worker, WORKER_PREP);

    [ do work stuff ]

    worker_set_flags(worker, WORKER_PREP, false);

    (that 'false' means not to wake up an idle worker)

    The wq_worker_sleeping() is called from schedule when a worker thread
    is putting itself to sleep. Which happens most of the time outside
    of that [ do work stuff ].

    The wq_worker_waking_up is called by the wakeup worker code, which
    is also called outside that [ do work stuff ].

    Thus, the likely and unlikely used by those two functions are actually
    backwards.

    Remove the annotation and let gcc figure it out.

    Acked-by: Tejun Heo
    Signed-off-by: Steven Rostedt
    Signed-off-by: Tejun Heo

    Steven Rostedt
     

09 Dec, 2010

4 commits

  • This fixes a bug as seen on 2.6.32 based kernels where timers got
    enqueued on offline cpus.

    If a cpu goes offline it might still have pending timers. These will
    be migrated during CPU_DEAD handling after the cpu is offline.
    However while the cpu is going offline it will schedule the idle task
    which will then call tick_nohz_stop_sched_tick().

    That function in turn will call get_next_timer_interrupt() to figure
    out if the tick of the cpu can be stopped or not. If it turns out that
    the next tick is just one jiffy off (delta_jiffies == 1)
    tick_nohz_stop_sched_tick() incorrectly assumes that the tick should
    not stop and takes an early exit and thus it won't update the load
    balancer cpu.

    Just afterwards the cpu will be killed and the load balancer cpu could
    be the offline cpu.

    On 2.6.32 based kernel get_nohz_load_balancer() gets called to decide
    on which cpu a timer should be enqueued (see __mod_timer()). Which
    leads to the possibility that timers get enqueued on an offline cpu.
    These will never expire and can cause a system hang.

    This has been observed on 2.6.32 kernels. On current kernels
    __mod_timer() uses get_nohz_timer_target() which doesn't have that
    problem. However there might be other problems because of the too
    early exit from tick_nohz_stop_sched_tick() in case a cpu goes offline.

    The easiest and probably safest fix seems to be to let
    get_next_timer_interrupt() just lie and let it say there isn't any
    pending timer if the current cpu is offline.

    I also thought of moving migrate_[hr]timers() from CPU_DEAD to
    CPU_DYING, but seeing that there already have been fixes at least in
    the hrtimer code in this area I'm afraid that this could add new
    subtle bugs.
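    The "lie" can be sketched with userspace stand-ins for the kernel
    helpers (cpu_offline and the NEXT_TIMER_MAX_DELTA value here are
    illustrative):

    ```c
    #include <assert.h>

    #define NEXT_TIMER_MAX_DELTA ((1UL << 30) - 1)  /* illustrative cap */

    static int cpu_offline;
    static unsigned long next_delta = 1;  /* next timer one jiffy away */

    static unsigned long get_next_timer_interrupt(unsigned long now)
    {
            if (cpu_offline)   /* the fix: claim there is no pending timer */
                    return now + NEXT_TIMER_MAX_DELTA;
            return now + next_delta;
    }

    int main(void)
    {
            unsigned long now = 1000;

            cpu_offline = 0;
            assert(get_next_timer_interrupt(now) == now + 1);

            cpu_offline = 1;   /* going down: delta_jiffies is no longer 1 */
            assert(get_next_timer_interrupt(now) == now + NEXT_TIMER_MAX_DELTA);
            return 0;
    }
    ```

    With the large delta reported, tick_nohz_stop_sched_tick() no longer
    takes the delta_jiffies == 1 early exit on a dying cpu, so the load
    balancer cpu gets updated before the cpu is killed.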

    Signed-off-by: Heiko Carstens
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Cc: stable@kernel.org
    Signed-off-by: Ingo Molnar

    Heiko Carstens
     
  • idle_balance() drops/retakes rq->lock, leaving the previous task
    vulnerable to set_tsk_need_resched(). Clear it after we return
    from balancing instead, and in setup_thread_stack() as well, so
    no successfully descheduled or never scheduled task has it set.

    Need resched confused the skip_clock_update logic, which assumes
    that the next call to update_rq_clock() will come nearly immediately
    after being set. Make the optimization robust against the case of
    waking a sleeper before it successfully deschedules by checking that
    the current task has not been dequeued before setting the flag,
    since it is that useless clock update we're trying to save, and
    clear it unconditionally in schedule() proper instead of conditionally
    in put_prev_task().

    Signed-off-by: Mike Galbraith
    Reported-by: Bjoern B. Brandenburg
    Tested-by: Yong Zhang
    Signed-off-by: Peter Zijlstra
    Cc: stable@kernel.org
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Mike Galbraith
     
  • There's a long-running regression that proved difficult to fix and
    which is hitting certain people and is rather annoying in its effects.

    Damien reported that after 74f5187ac8 (sched: Cure load average vs
    NO_HZ woes) his load average is unnaturally high, he also noted that
    even with that patch reverted the load average numbers are not
    correct.

    The problem is that the previous patch only solved half the NO_HZ
    problem, it addressed the part of going into NO_HZ mode, not of
    coming out of NO_HZ mode. This patch implements that missing half.

    When coming out of NO_HZ mode there are two important things to take
    care of:

    - Folding the pending idle delta into the global active count.
    - Correctly aging the averages for the idle-duration.

    So with this patch the NO_HZ interaction should be complete and
    behaviour between CONFIG_NO_HZ=[yn] should be equivalent.

    Furthermore, this patch slightly changes the load average computation
    by adding a rounding term to the fixed point multiplication.
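    The rounding term amounts to adding half of the fixed-point one before
    the down-shift; a sketch using the kernel's 11-bit fixed point:

    ```c
    #include <assert.h>

    #define FSHIFT  11
    #define FIXED_1 (1UL << FSHIFT)

    /* truncating version (old behaviour) */
    static unsigned long calc_load_trunc(unsigned long load, unsigned long exp,
                                         unsigned long active)
    {
            return (load * exp + active * (FIXED_1 - exp)) >> FSHIFT;
    }

    /* rounding version: add FIXED_1/2 before shifting down */
    static unsigned long calc_load(unsigned long load, unsigned long exp,
                                   unsigned long active)
    {
            unsigned long newload = load * exp + active * (FIXED_1 - exp);
            newload += 1UL << (FSHIFT - 1);  /* the added rounding term */
            return newload >> FSHIFT;
    }

    int main(void)
    {
            /* a tiny residual load decaying with exp just below FIXED_1 */
            assert(calc_load_trunc(1, FIXED_1 - 1, 0) == 0); /* truncated away */
            assert(calc_load(1, FIXED_1 - 1, 0) == 1);       /* kept by rounding */
            return 0;
    }
    ```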

    Reported-by: Damien Wyart
    Reported-by: Tim McGrath
    Tested-by: Damien Wyart
    Tested-by: Orion Poplawski
    Tested-by: Kyle McMartin
    Signed-off-by: Peter Zijlstra
    Cc: stable@kernel.org
    Cc: Chase Douglas
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Because the multi-pmu bits can share contexts between struct pmu
    instances we could get duplicate events by iterating the pmu list.

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Thomas Gleixner
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

08 Dec, 2010

2 commits

  • …r-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

    * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86/pvclock: Zero last_value on resume

    * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    perf record: Fix eternal wait for stillborn child
    perf header: Don't assume there's no attr info if no sample ids is provided
    perf symbols: Figure out start address of kernel map from kallsyms
    perf symbols: Fix kallsyms kernel/module map splitting

    * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    nohz: Fix printk_needs_cpu() return value on offline cpus
    printk: Fix wake_up_klogd() vs cpu hotplug

    Linus Torvalds
     
  • …git/tip/linux-2.6-tip

    * 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    genirq: Fix incorrect proc spurious output

    Linus Torvalds
     

07 Dec, 2010

3 commits

  • * 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
    PM / Hibernate: Fix memory corruption related to swap
    PM / Hibernate: Use async I/O when reading compressed hibernation image

    Linus Torvalds
     
  • There is a problem that swap pages allocated before the creation of
    a hibernation image can be released and used for storing the contents
    of different memory pages while the image is being saved. Since the
    kernel stored in the image doesn't know of that, it causes memory
    corruption to occur after resume from hibernation, especially on
    systems with relatively small RAM that need to swap often.

    This issue can be addressed by keeping the GFP_IOFS bits clear
    in gfp_allowed_mask during the entire hibernation, including the
    saving of the image, until the system is finally turned off or
    the hibernation is aborted. Unfortunately, for this purpose
    it's necessary to rework the way in which the hibernate and
    suspend code manipulates gfp_allowed_mask.

    This change is based on an earlier patch from Hugh Dickins.
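    A sketch of the mask manipulation; the helper names follow the commit,
    and the flag values are the conventional GFP bit assignments:

    ```c
    #include <assert.h>

    typedef unsigned int gfp_t;
    #define __GFP_IO 0x40u
    #define __GFP_FS 0x80u
    #define GFP_IOFS (__GFP_IO | __GFP_FS)

    static gfp_t gfp_allowed_mask = 0xffu;   /* illustrative starting mask */
    static gfp_t saved_gfp_mask;

    static void pm_restrict_gfp_mask(void)
    {
            saved_gfp_mask = gfp_allowed_mask;
            gfp_allowed_mask &= ~GFP_IOFS;   /* no I/O- or FS-backed allocs */
    }

    static void pm_restore_gfp_mask(void)   /* on power-off or abort only */
    {
            gfp_allowed_mask = saved_gfp_mask;
    }

    int main(void)
    {
            pm_restrict_gfp_mask();
            assert((gfp_allowed_mask & GFP_IOFS) == 0);
            pm_restore_gfp_mask();
            assert(gfp_allowed_mask == 0xffu);
            return 0;
    }
    ```

    Keeping the bits clear across the whole of hibernation (not just around
    suspend) is what prevents swap pages backing the image from being
    reallocated while the image is written out.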

    Signed-off-by: Rafael J. Wysocki
    Reported-by: Ondrej Zary
    Acked-by: Hugh Dickins
    Reviewed-by: KAMEZAWA Hiroyuki
    Cc: stable@kernel.org

    Rafael J. Wysocki
     
  • This is a fix for reading LZO compressed image using async I/O.
    Essentially, instead of having just one page into which we keep
    reading blocks from swap, we allocate enough of them to cover the
    largest compressed size and then let block I/O pick them all up. Once
    we have them all (and here we wait), we decompress them, as usual.
    Obviously, the very first block we still pick up synchronously,
    because we need to know the size of the lot before we pick up the
    rest.

    Also fixed the copyright line, which I had forgotten before.

    Signed-off-by: Bojan Smojver
    Signed-off-by: Rafael J. Wysocki

    Bojan Smojver
     

03 Dec, 2010

1 commit

  • If a user manages to trigger an oops with fs set to KERNEL_DS, fs is not
    otherwise reset before do_exit(). do_exit may later (via mm_release in
    fork.c) do a put_user to a user-controlled address, potentially allowing
    a user to leverage an oops into a controlled write into kernel memory.

    This is only triggerable in the presence of another bug, but this
    potentially turns a lot of DoS bugs into privilege escalations, so it's
    worth fixing. I have proof-of-concept code which uses this bug along
    with CVE-2010-3849 to write a zero to an arbitrary kernel address, so
    I've tested that this is not theoretical.

    A more logical place to put this fix might be when we know an oops has
    occurred, before we call do_exit(), but that would involve changing
    every architecture, in multiple places.

    Let's just stick it in do_exit instead.
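    The effect of the fix can be modeled in userspace; fs, USER_DS and
    KERNEL_DS below are stand-ins for the real address-limit machinery:

    ```c
    #include <assert.h>

    enum addr_limit { USER_DS, KERNEL_DS };

    static enum addr_limit fs = KERNEL_DS;  /* left set by the oops path */

    /* stand-in for put_user(): can reach kernel memory iff fs == KERNEL_DS */
    static int put_user_can_hit_kernel(void)
    {
            return fs == KERNEL_DS;
    }

    static void do_exit_sketch(void)
    {
            fs = USER_DS;   /* the fix: reset the limit first thing */
            /* ... mm_release() may put_user() to a user-controlled address ... */
    }

    int main(void)
    {
            assert(put_user_can_hit_kernel());    /* exploitable state */
            do_exit_sketch();
            assert(!put_user_can_hit_kernel());   /* escalation closed */
            return 0;
    }
    ```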

    [akpm@linux-foundation.org: update code comment]
    Signed-off-by: Nelson Elhage
    Cc: KOSAKI Motohiro
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nelson Elhage
     

01 Dec, 2010

2 commits

  • Since commit a1afb637 (switch /proc/irq/*/spurious to seq_file) all
    /proc/irq/XX/spurious files show the information of irq 0.

    Currently irq_spurious_proc_open() passes NULL as the 3rd argument,
    which is used as the IRQ number in irq_spurious_proc_show(), to
    single_open(). Because of this, every /proc/irq/XX/spurious file
    shows IRQ 0 information regardless of the IRQ number.

    To fix the problem, irq_spurious_proc_open() must pass the
    appropriate data (the IRQ number) to single_open().
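    The single_open() contract at issue, sketched with stand-in types: the
    third argument comes back to the show() callback as m->private, so
    passing NULL makes every file report IRQ 0:

    ```c
    #include <assert.h>
    #include <stddef.h>

    struct seq_file { void *private; };

    static long shown_irq;

    static int irq_spurious_proc_show(struct seq_file *m, void *v)
    {
            shown_irq = (long)m->private;  /* IRQ number from m->private */
            return 0;
    }

    /* stand-in for single_open(): stash data as m->private, call show() */
    static void single_open_sketch(void *data)
    {
            struct seq_file m = { .private = data };
            irq_spurious_proc_show(&m, NULL);
    }

    int main(void)
    {
            single_open_sketch(NULL);          /* the bug: always IRQ 0 */
            assert(shown_irq == 0);
            single_open_sketch((void *)42L);   /* the fix: pass the IRQ */
            assert(shown_irq == 42);
            return 0;
    }
    ```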

    Signed-off-by: Kenji Kaneshige
    Reviewed-by: Yong Zhang
    LKML-Reference:
    Cc: stable@kernel.org [2.6.33+]
    Signed-off-by: Thomas Gleixner

    Kenji Kaneshige
     
  • The file_ops struct for the "trace" special file defined llseek as seq_lseek().
    However, if the file was opened for writing only, seq_open() was not called,
    and the seek would dereference a null pointer, file->private_data.

    This patch introduces a new wrapper for seq_lseek() which checks if the file
    descriptor is opened for reading first. If not, it does nothing.
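    The wrapper's logic, sketched with stand-in types (the stub dereferences
    private_data the way seq_lseek() ultimately would):

    ```c
    #include <assert.h>
    #include <stddef.h>

    #define FMODE_READ 0x1u

    struct file { unsigned int f_mode; void *private_data; };

    static long seq_lseek_stub(struct file *f)
    {
            return *(long *)f->private_data;  /* NULL here would crash */
    }

    static long tracing_lseek(struct file *f)
    {
            if (f->f_mode & FMODE_READ)       /* seq_open() ran: safe */
                    return seq_lseek_stub(f);
            return 0;                         /* write-only: do nothing */
    }

    int main(void)
    {
            long pos = 7;
            struct file rd = { FMODE_READ, &pos };
            struct file wr = { 0, NULL };     /* opened for writing only */

            assert(tracing_lseek(&rd) == 7);
            assert(tracing_lseek(&wr) == 0);  /* no NULL dereference */
            return 0;
    }
    ```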

    Cc:
    Signed-off-by: Slava Pestov
    LKML-Reference:
    Signed-off-by: Steven Rostedt

    Slava Pestov
     

29 Nov, 2010

1 commit


27 Nov, 2010

3 commits


26 Nov, 2010

4 commits

  • This patch fixes a hang observed with 2.6.32 kernels where timers got enqueued
    on offline cpus.

    printk_needs_cpu() may return 1 if called on offline cpus. When a cpu gets
    offlined it schedules the idle process which, before killing its own cpu, will
    call tick_nohz_stop_sched_tick(). That function in turn will call
    printk_needs_cpu() in order to check if the local tick can be disabled. On
    offline cpus this function should naturally return 0 since, regardless of
    whether the tick gets disabled, the cpu will be dead shortly after. That is
    besides the fact that __cpu_disable() should already have made sure that no
    interrupts on the offlined cpu will be delivered anyway.

    In this case it prevents tick_nohz_stop_sched_tick() from calling
    select_nohz_load_balancer(). No idea if that really is a problem. However what
    made me debug this is that on 2.6.32 the function get_nohz_load_balancer() is
    used within __mod_timer() to select a cpu on which a timer gets enqueued. If
    printk_needs_cpu() returns 1 then the nohz_load_balancer cpu doesn't get
    updated when a cpu gets offlined. It may contain the cpu number of an offline
    cpu. In turn timers get enqueued on an offline cpu and not very surprisingly
    they never expire and cause system hangs.

    This has been observed on 2.6.32 kernels. On current kernels __mod_timer() uses
    get_nohz_timer_target() which doesn't have that problem. However there might be
    other problems because of the too early exit from tick_nohz_stop_sched_tick()
    in case a cpu goes offline.

    The easiest way to fix this is just to test if the current cpu is offline and call
    printk_tick() directly which clears the condition.

    Alternatively I tried a cpu hotplug notifier which would clear the condition,
    however between calling the notifier function and printk_needs_cpu() something
    could have called printk() again and the problem would be back. This seems to
    be the safest fix.
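    The shape of the fix, with userspace stand-ins for the kernel state:

    ```c
    #include <assert.h>

    static int cpu_offline;
    static int printk_pending;

    static void printk_tick(void)
    {
            /* ... flush / wake klogd ... */
            printk_pending = 0;     /* clears the condition */
    }

    static int printk_needs_cpu(void)
    {
            if (cpu_offline)        /* the fix: never keep a dead cpu ticking */
                    printk_tick();
            return printk_pending;
    }

    int main(void)
    {
            printk_pending = 1;
            cpu_offline = 0;
            assert(printk_needs_cpu() == 1);

            cpu_offline = 1;
            assert(printk_needs_cpu() == 0);  /* offline cpu reports 0 */
            return 0;
    }
    ```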

    Signed-off-by: Heiko Carstens
    Signed-off-by: Peter Zijlstra
    Cc: stable@kernel.org
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Heiko Carstens
     
  • wake_up_klogd() may get called from preemptible context but uses
    __raw_get_cpu_var() to write to a per cpu variable. If it gets preempted
    between getting the address and writing to it, the cpu in question could be
    offline if the process gets scheduled back and hence writes to the per cpu data
    of an offline cpu.

    This buggy behaviour was introduced with fa33507a "printk: robustify
    printk, fix #2" which was supposed to fix a "using smp_processor_id() in
    preemptible" warning.

    Let's use this_cpu_write() instead which disables preemption and makes sure
    that the outlined scenario cannot happen.

    Signed-off-by: Heiko Carstens
    Acked-by: Eric Dumazet
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Heiko Carstens
     
  • Stephane noticed that because the perf_sw_event() call is inside the
    perf_event_task_sched_out() call it won't get called unless we
    have a per-task counter.

    Reported-by: Stephane Eranian
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • It was found that sometimes children of tasks with inherited events had
    one extra event. Eventually it turned out to be due to the list rotation
    not being exclusive with the list iteration in the inheritance code.

    Cure this by temporarily disabling the rotation while we inherit the events.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Cc:
    Signed-off-by: Ingo Molnar

    Thomas Gleixner