06 Oct, 2008
1 commit
05 Oct, 2008
2 commits
-
…el/git/tip/linux-2.6-tip
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
clockevents: check broadcast tick device not the clock events device -
…git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86 setup: correct segfault in generation of 32-bit reloc kernel
04 Oct, 2008
28 commits
-
While working on the new version of the code for SCHED_SPORADIC I
noticed something strange in the present throttling mechanism. More
specifically in the throttling timer handler in sched_rt.c
(do_sched_rt_period_timer()) and in rt_rq_enqueue().

The problem is that, when unthrottling a runqueue, rt_rq_enqueue() only
asks for rescheduling if the runqueue has a sched_entity associated to
it (i.e., rt_rq->rt_se != NULL).
Now, if the runqueue is the root rq (which has a rt_se = NULL)
rescheduling does not take place, and it is delayed to some undefined
instant in the future.

This implies some random bandwidth usage by the RT tasks under throttling.
For instance, setting rt_runtime_us/rt_period_us = 950ms/1000ms an RT
task will get less than 95%. In our tests we got something varying
between 70% and 95%.
Using smaller time values, e.g., 95ms/100ms, things are even worse, and
I can see values also going down to 20-25%!!

The tests we performed are simply running 'yes' as a SCHED_FIFO task,
and checking the CPU usage with top, but we can investigate thoroughly
if you think it is needed.

Things go much better, for us, with the attached patch... Don't know if
it is the best approach, but it solved the issue for us.

Signed-off-by: Dario Faggioli
Signed-off-by: Michael Trimarchi
Acked-by: Peter Zijlstra
Cc:
Signed-off-by: Ingo Molnar -
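The unthrottling behaviour described in the SCHED_SPORADIC report above can be illustrated with a small user-space model. This is only a sketch: the structures and the two function names below are hypothetical stand-ins, not the kernel's definitions, and the second function shows one possible remedy rather than the patch that was actually applied.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the scheduler structures named in the report. */
struct sched_rt_entity { int on_rq; };

struct rt_rq {
    struct sched_rt_entity *rt_se; /* NULL for the root runqueue */
    int rt_nr_running;             /* runnable RT tasks on this runqueue */
    bool resched_requested;        /* set when a reschedule is asked for */
};

/* Behaviour as described in the report: nothing happens when rt_se == NULL,
 * so the root runqueue is never asked to reschedule. */
void unthrottle_reported(struct rt_rq *rt_rq)
{
    if (rt_rq->rt_se && rt_rq->rt_nr_running) {
        rt_rq->rt_se->on_rq = 1;
        rt_rq->resched_requested = true;
    }
}

/* One possible remedy (an assumption, not the applied patch): enqueue the
 * entity only if there is one, but ask for a reschedule whenever the
 * unthrottled runqueue has runnable tasks. */
void unthrottle_sketch(struct rt_rq *rt_rq)
{
    if (!rt_rq->rt_nr_running)
        return;
    if (rt_rq->rt_se)
        rt_rq->rt_se->on_rq = 1;
    rt_rq->resched_requested = true;
}
```

In this model, a root runqueue (rt_se == NULL) with one runnable task is left without a reschedule request by the first function and gets one from the second.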
Impact: jiffies increment too fast.
Hugh Dickins noted that with NOHZ=n and HIGHRES=n jiffies get
incremented too fast. The reason is a wrong check in the broadcast
enter/exit code, which keeps the local apic timer in periodic mode
when the switch happens.

Signed-off-by: Thomas Gleixner
-
…s/security-testing-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
selinux: Fix an uninitialized variable BUG/panic in selinux_secattr_to_sid() -
Make the ACPI /proc/acpi/wakeup interface set the appropriate wake-up bits
of physical devices corresponding to the ACPI devices and make those bits
be set initially for devices that are enabled to wake up by default. This
is needed to restore the 2.6.26 and earlier behavior for the PCI devices
that were previously handled correctly with the help of the
/proc/acpi/wakeup interface.

Signed-off-by: Rafael J. Wysocki
Cc: Len Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Check the return value of led_classdev_register and unregister all
registered devices, if registering one device fails. Also the dynamic
memory handling is totally bogus. You can't allocate multiple chunks via
kzalloc() and expect them to be in order later. I wonder how this ever
worked.

Signed-off-by: Sven Wegener
Acked-by: Nate Case
Tested-by: Nate Case
Acked-by: Richard Purdie
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
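The two points made in the LED registration fix above (unwind on failure, and do not expect separately allocated chunks to be laid out in order) can be sketched in plain C. Everything here is hypothetical: fake_register(), fake_unregister() and struct fake_led stand in for the real driver interfaces, and calloc() stands in for a single kernel allocation.

```c
#include <stdlib.h>

/* Hypothetical device record; not the driver's real structure. */
struct fake_led {
    int id;
    int registered;
};

/* Hypothetical registration hook: fails for the id given in fail_id. */
int fake_register(struct fake_led *led, int fail_id)
{
    if (led->id == fail_id)
        return -1;
    led->registered = 1;
    return 0;
}

void fake_unregister(struct fake_led *led)
{
    led->registered = 0;
}

/* Allocate all records in one contiguous array, so indexing is valid,
 * and unwind every successful registration if a later one fails. */
struct fake_led *register_all(int count, int fail_id)
{
    struct fake_led *leds = calloc((size_t)count, sizeof(*leds));
    int i;

    if (!leds)
        return NULL;

    for (i = 0; i < count; i++) {
        leds[i].id = i;
        if (fake_register(&leds[i], fail_id) != 0) {
            /* Roll back the entries registered so far, newest first. */
            while (--i >= 0)
                fake_unregister(&leds[i]);
            free(leds);
            return NULL;
        }
    }
    return leds;
}
```

Calling register_all(3, 1) returns NULL after unregistering entry 0, while register_all(3, -1) returns an array with all three entries registered.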
On initialization, we first do the ioremap and then register the led devices.
On deinitialization, we do it in reverse order. This prevents someone calling
into the brightness_set functions with an invalid latch_address.

Signed-off-by: Sven Wegener
Acked-by: Rod Whitby
Acked-by: Richard Purdie
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The tasklet checks RAW.BLOCK twice, and does not check RAW.XFER. This is
obviously wrong, and could theoretically cause the driver to hang.

Reported-by: Nicolas Ferre
Signed-off-by: Haavard Skinnemoen
Acked-by: Dan Williams
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The "Documentation" section of this file mentions that when an interface
change is made, I should be CCed with info about the change (so that
man-pages can document it). Additionally request that this info be CCed
to the new linux-api@vger.kernel.org list.

Signed-off-by: Michael Kerrisk
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Mention that patches that change the kernel-userland interface should
be CCed to the new list linux-api@vger.kernel.org.

Signed-off-by: Michael Kerrisk
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Nowadays, man-pages has an associated mailing list. Mention that list
in MAINTAINERS.

Signed-off-by: Michael Kerrisk
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Remove myself from the kernel MAINTAINERS file for cpusets. I am leaving
SGI and probably will not be active in Linux kernel work. I can be
reached at . Contact Derek Fults for future
SGI+cpuset related issues. I'm off to the next chapter of this good life.

Signed-off-by: Paul Jackson
Cc: Paul Menage
Cc: Derek Fults
Cc: John Hesterberg
Cc: Paul Jackson
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
include/linux/stacktrace.h:13: warning:
'struct task_struct' declared inside parameter list

(This might be a hard error on sparc64, which uses this header and has
-Werror)

Reported-by: "Randy.Dunlap"
Acked-by: Ingo Molnar
Cc: Peter Zijlstra
Cc: Arjan van de Ven
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
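The stacktrace.h warning quoted above appears when a struct type is first mentioned inside a function prototype. A common remedy is a forward declaration at file scope; the sketch below assumes that approach and uses a hypothetical function name, so it should not be read as the exact change made to the header.

```c
#include <stddef.h>

/* Forward declaration at file scope: later prototypes now refer to this
 * one type instead of declaring a new one local to the parameter list. */
struct task_struct;

/* Hypothetical prototype standing in for the one the warning pointed at. */
int trace_depth_of(const struct task_struct *tsk);

/* The full definition can come later, or live in another header. */
struct task_struct {
    int depth;
};

int trace_depth_of(const struct task_struct *tsk)
{
    return tsk ? tsk->depth : 0;
}
```

Without the forward declaration, the type named in the prototype would be scoped to that prototype alone and would not match the struct defined afterwards.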
Accept zero (the default!) as a per-transfer clock speed override.
Signed-off-by: Lennert Buytenhek
Signed-off-by: David Brownell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Fix infinite recursive notifier in the fbdev layer. This causes recursive
locking. Dmitry Baryshkov found the problem and confirmed that the patch
fixes the bug.

After doing
# echo 1 > /sys/class/graphics/fb0/blank
I got the following in my kernel log:

=============================================
[ INFO: possible recursive locking detected ]
2.6.27-rc6-00086-gda63874-dirty #97
---------------------------------------------
echo/1564 is trying to acquire lock:
((fb_notifier_list).rwsem){..--}, at: [] __blocking_notifier_call_chain+0x38/0x6c

but task is already holding lock:
((fb_notifier_list).rwsem){..--}, at: [] __blocking_notifier_call_chain+0x38/0x6c

other info that might help us debug this:
2 locks held by echo/1564:
#0: (&buffer->mutex){--..}, at: [] sysfs_write_file+0x30/0x80
#1: ((fb_notifier_list).rwsem){..--}, at: [] __blocking_notifier_call_chain+0x38/0x6c

stack backtrace:
[] (dump_stack+0x0/0x14) from [] (print_deadlock_bug+0xa4/0xd0)
[] (print_deadlock_bug+0x0/0xd0) from [] (check_deadlock+0x148/0x17c)
r6:c397a1e0 r5:c397a530 r4:c04fcf98
[] (check_deadlock+0x0/0x17c) from [] (validate_chain+0x3c4/0x4f0)
[] (validate_chain+0x0/0x4f0) from [] (__lock_acquire+0x5e8/0x6b4)
[] (__lock_acquire+0x0/0x6b4) from [] (lock_acquire+0x64/0x78)
[] (lock_acquire+0x0/0x78) from [] (down_read+0x4c/0x60)
r7:00000009 r6:ffffffff r5:c0427a40 r4:c005a384
[] (down_read+0x0/0x60) from [] (__blocking_notifier_call_chain+0x38/0x6c)
r5:c0427a40 r4:c0427a74
[] (__blocking_notifier_call_chain+0x0/0x6c) from [] (blocking_notifier_call_chain+0x20/0x28)
r8:00000009 r7:c086d640 r6:c3967940 r5:00000000 r4:c38984b8
[] (blocking_notifier_call_chain+0x0/0x28) from [] (fb_notifier_call_chain+0x1c/0x24)
[] (fb_notifier_call_chain+0x0/0x24) from [] (fb_blank+0x64/0x70)
[] (fb_blank+0x0/0x70) from [] (fbcon_blank+0x114/0x1bc)
r5:00000001 r4:c38984b8
[] (fbcon_blank+0x0/0x1bc) from [] (do_blank_screen+0x1e0/0x2a0)
[] (do_blank_screen+0x0/0x2a0) from [] (fbcon_fb_blanked+0x74/0x94)
r5:c3967940 r4:00000001
[] (fbcon_fb_blanked+0x0/0x94) from [] (fbcon_event_notify+0x100/0x12c)
r5:fffffffe r4:c39bc194
[] (fbcon_event_notify+0x0/0x12c) from [] (notifier_call_chain+0x38/0x7c)
[] (notifier_call_chain+0x0/0x7c) from [] (__blocking_notifier_call_chain+0x54/0x6c)
r8:c3b51ea0 r7:00000009 r6:ffffffff r5:c0427a40 r4:c0427a74
[] (__blocking_notifier_call_chain+0x0/0x6c) from [] (blocking_notifier_call_chain+0x20/0x28)
r8:00000001 r7:c3a7e000 r6:00000000 r5:00000000 r4:c38984b8
[] (blocking_notifier_call_chain+0x0/0x28) from [] (fb_notifier_call_chain+0x1c/0x24)
[] (fb_notifier_call_chain+0x0/0x24) from [] (fb_blank+0x64/0x70)
[] (fb_blank+0x0/0x70) from [] (store_blank+0x54/0x7c)
r5:c38984b8 r4:c3b51ec4
[] (store_blank+0x0/0x7c) from [] (dev_attr_store+0x28/0x2c)
r8:00000001 r7:c042bf80 r6:c39eba10 r5:c3967c30 r4:c38e0140
[] (dev_attr_store+0x0/0x2c) from [] (flush_write_buffer+0x54/0x68)
[] (flush_write_buffer+0x0/0x68) from [] (sysfs_write_file+0x58/0x80)
r8:c3b51f78 r7:c3bcb070 r6:c39eba10 r5:00000001 r4:00000001
[] (sysfs_write_file+0x0/0x80) from [] (vfs_write+0xb8/0x148)
[] (vfs_write+0x0/0x148) from [] (sys_write+0x44/0x70)
r7:00000004 r6:c3bcb070 r5:00000000 r4:00000000
[] (sys_write+0x0/0x70) from [] (ret_fast_syscall+0x0/0x2c)
r6:4001b000 r5:00000001 r4:401dc658

Signed-off-by: Krzysztof Helt
Reported-by: Dmitry Baryshkov
Tested-by: Dmitry Baryshkov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
When userspace uses SIGIO notification and forgets to disable it before
closing the file descriptor, rtc->async_queue contains a stale pointer to
struct file. When user space enables SIGIO notification again in a different
process, the kernel dereferences this (poisoned) pointer and crashes.

So disable SIGIO notification on close.

Kernel panic:
(second run of qemu (requires echo 1024 > /sys/class/rtc/rtc0/max_user_freq))

general protection fault: 0000 [1] PREEMPT
CPU 0
Modules linked in: af_packet snd_pcm_oss snd_mixer_oss snd_seq_oss snd_seq_midi_event snd_seq usbhid tuner tea5767 tda8290 tuner_xc2028 xc5000 tda9887 tuner_simple tuner_types mt20xx tea5761 tda9875 uhci_hcd ehci_hcd usbcore bttv snd_via82xx snd_ac97_codec ac97_bus snd_pcm snd_timer ir_common compat_ioctl32 snd_page_alloc videodev v4l1_compat snd_mpu401_uart snd_rawmidi v4l2_common videobuf_dma_sg videobuf_core snd_seq_device snd btcx_risc soundcore tveeprom i2c_viapro
Pid: 5781, comm: qemu-system-x86 Not tainted 2.6.27-rc6 #363
RIP: 0010:[] [] __lock_acquire+0x3db/0x73f
RSP: 0000:ffffffff80674cb8 EFLAGS: 00010002
RAX: ffff8800224c62f0 RBX: 0000000000000046 RCX: 0000000000000002
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8800224c62f0
RBP: ffffffff80674d08 R08: 0000000000000002 R09: 0000000000000001
R10: ffffffff80238941 R11: 0000000000000001 R12: 0000000000000000
R13: 6b6b6b6b6b6b6b6b R14: ffff88003a450080 R15: 0000000000000000
FS: 00007f98b69516f0(0000) GS:ffffffff80623200(0000) knlGS:00000000f7cc86d0
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000a87000 CR3: 0000000022598000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process qemu-system-x86 (pid: 5781, threadinfo ffff880028812000, task ffff88003a450080)
Stack: ffffffff80674cf8 0000000180238440 0000000200000002 0000000000000000
ffff8800224c62f0 0000000000000046 0000000000000000 0000000000000002
0000000000000002 0000000000000000 ffffffff80674d68 ffffffff8024fc7a
Call Trace:
[] lock_acquire+0x85/0xa9
[] ? send_sigio+0x2a/0x184
[] _read_lock+0x3e/0x4a
[] ? send_sigio+0x2a/0x184
[] send_sigio+0x2a/0x184
[] ? __lock_acquire+0x6e1/0x73f
[] ? kill_fasync+0x2c/0x4e
[] __kill_fasync+0x54/0x65
[] kill_fasync+0x3a/0x4e
[] rtc_update_irq+0x9c/0xa5
[] cmos_interrupt+0xae/0xc0
[] handle_IRQ_event+0x25/0x5a
[] handle_edge_irq+0xdd/0x123
[] do_IRQ+0xe4/0x144
[] ret_from_intr+0x0/0xf
[] ? __alloc_pages_internal+0xe7/0x3ad
[] ? clear_page_c+0x7/0x10
[] ? get_page_from_freelist+0x385/0x450
[] ? __alloc_pages_internal+0xe7/0x3ad
[] ? anon_vma_prepare+0x2e/0xf6
[] ? handle_mm_fault+0x227/0x6a5
[] ? do_page_fault+0x494/0x83f
[] ? error_exit+0x0/0xa9

Code: cc 41 39 45 28 74 24 e8 5e 1d 0f 00 85 c0 0f 84 6a 03 00 00 83 3d 8f a9 aa 00 00 be 47 03 00 00 0f 84 6a 02 00 00 e9 53 03 00 00 ff 85 38 01 00 00 45 8b be 90 06 00 00 41 83 ff 2f 76 24 e8
RIP [] __lock_acquire+0x3db/0x73f
RSP
---[ end trace 431877d860448760 ]---
Kernel panic - not syncing: Aiee, killing interrupt handler!

Signed-off-by: Marcin Slusarz
Acked-by: Alessandro Zummo
Acked-by: David Brownell
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
At some point during the 2.6.27 development cycle two new fields were added
to the SELinux context structure, a string pointer and a length field. The
code in selinux_secattr_to_sid() was not modified and as a result these two
fields were left uninitialized which could result in erratic behavior,
including kernel panics, when NetLabel is used. This patch fixes the
problem by fully initializing the context in selinux_secattr_to_sid() before
use and reducing the level of direct context manipulation done to help
prevent future problems.

Please apply this to the 2.6.27-rcX release stream.
Signed-off-by: Paul Moore
Signed-off-by: James Morris -
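The failure mode in the selinux_secattr_to_sid() message above (fields added to a structure later, and left uninitialized by older callers) can be shown with a minimal sketch. The structure and function names here are hypothetical and much simpler than the real SELinux context; the point is only that clearing the whole structure first keeps newly added fields in a defined state.

```c
#include <string.h>
#include <stddef.h>

/* Hypothetical context with the kind of late-added fields the message
 * describes: a string pointer and its length. */
struct demo_context {
    unsigned int user;
    unsigned int role;
    unsigned int type;
    char *str;
    size_t len;
};

/* Clear every field, including ones added after callers were written. */
void demo_context_init(struct demo_context *c)
{
    memset(c, 0, sizeof(*c));
}

/* Fill in only the fields this caller knows about; the rest stay zero. */
void demo_context_from_ids(struct demo_context *c, unsigned int user,
                           unsigned int role, unsigned int type)
{
    demo_context_init(c);
    c->user = user;
    c->role = role;
    c->type = type;
}
```

Routing every caller through one initializer also reduces the amount of direct field manipulation, which is the second half of what the commit message describes.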
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
[MIPS] SMTC: Fix SMTC dyntick support.
[MIPS] SMTC: Close tiny holes in the SMTC IPI replay system.
[MIPS] SMTC: Fix holes in SMTC and FPU affinity support.
[MIPS] SMTC: Build fix: Fix filename in Makefile
[MIPS] Build fix: Fix irq flags type -
* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6:
[S390] qdio: prevent stack clobber
[S390] nohz: Fix __udelay. -
Impact: segfault on build of a 32-bit relocatable kernel
When converting arch/x86/boot/compressed/relocs.c to support unlimited
sections, the computation of sym_strtab in walk_relocs() was done
incorrectly. This causes a segfault for some people when building the
relocatable 32-bit kernel.

Pointed out by Anonymous .
Signed-off-by: H. Peter Anvin
-
.. small detail, but the silly e1000e initcall warning debugging caused
me to look at this code. Rather than gouge my eyes out with a spoon, I
just fixed it.

Signed-off-by: Linus Torvalds
-
Don't print more information than fits into the string on the
stack. Combine the informational output of qdio to fit into
one line.

Signed-off-by: Jan Glauber
Signed-off-by: Martin Schwidefsky -
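The qdio stack-clobber fix above is about not writing more text than a fixed-size buffer can hold. A bounded formatting call is one standard way to get that guarantee; the sketch below uses snprintf() from the C library, with a hypothetical function name and hypothetical fields, and is not the driver's actual code.

```c
#include <stdio.h>
#include <stddef.h>

/* Format a one-line summary into a fixed-size buffer.
 * Returns 1 if the whole line fit, 0 if it was truncated or failed.
 * The field names are hypothetical, chosen only for illustration. */
int format_info(char *buf, size_t size, const char *name, int in_q, int out_q)
{
    int n = snprintf(buf, size, "%s: in=%d out=%d", name, in_q, out_q);

    /* snprintf never writes more than size bytes, and returns the length
     * the full string would have had. */
    return n >= 0 && (size_t)n < size;
}
```

With an 8-byte buffer the output is cut short and the function reports truncation, but nothing beyond the buffer is touched.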
This fixes a regression that came with 934b2857cc576ae53c92a66e63fce7ddcfa74691
("[S390] nohz/sclp: disable timer on synchronous waits.").
If udelay() gets called from a disabled context it sets the clock comparator
to a value where it expects the next interrupt. When the interrupt happens
the clock comparator does not get reset and therefore the interrupt condition
doesn't get cleared. The result is an endless timer interrupt loop.

In addition this patch also fixes the following:
rcutorture reveals that our __udelay implementation is still buggy,
since it might schedule tasklets, but prevents their execution:

NOHZ: local_softirq_pending 42
NOHZ: local_softirq_pending 02
NOHZ: local_softirq_pending 142
NOHZ: local_softirq_pending 02

To fix this we make sure that only the clock comparator interrupt
is enabled when the enabled wait psw is loaded.
Also no code gets called anymore which might schedule tasklets.

Signed-off-by: Heiko Carstens
Signed-off-by: Martin Schwidefsky -
Rework of SMTC support to make it work with the new clock event system,
allowing "tickless" operation, and to make it compatible with the use of
the "wait_irqoff" idle loop. The new clocking scheme means that the
previously optional IPI instant replay mechanism is now required, and has
been made more robust.

Signed-off-by: Kevin D. Kissell
Signed-off-by: Ralf Baechle -
Signed-off-by: Kevin D. Kissell
Signed-off-by: Ralf Baechle -
Signed-off-by: Kevin D. Kissell
Signed-off-by: Ralf Baechle -
Signed-off-by: Ralf Baechle
-
Though from a hardware perspective it would be sensible to use only a
32-bit unsigned int type, Linux defines interrupt flags to be stored in
an unsigned long and nothing else.

Signed-off-by: Ralf Baechle
-
Doing 'WARN_ON(preempt_count())' was horribly horribly wrong, and would
cause tons of warnings at bootup if PREEMPT was enabled because the
initcalls currently run with the kernel lock, which increments the
preempt count.

At the same time, the warning was also insufficient, since it didn't
check that interrupts were enabled.

The proper debug function to use for something that can sleep and wants
a warning if it's called in the wrong context is 'might_sleep()'.

Reported-by: Christian Borntraeger
Signed-off-by: Linus Torvalds
03 Oct, 2008
9 commits
-
This is loosely based on a patch by Jesse Barnes to check the user-space
PCI mappings through the sysfs interfaces. Quoting Jesse's original
explanation:

It's fairly common for applications to map PCI resources through sysfs.
However, with the current implementation, it's possible for an application
to map far more than the range corresponding to the resourceN file it
opened. This patch plugs that hole by checking the range at mmap time,
similar to what is done on platforms like sparc64 in their lower level
PCI remapping routines.

It was initially put together to help debug the e1000e NVRAM corruption
problem, since we initially thought an X driver might be walking past the
end of one of its mappings and clobbering the NVRAM. It now looks like
that's not the case, but doing the check is still important for obvious
reasons.

and this version of the patch differs in that it uses a helper function
to clarify the code, and does all the checks in pages (instead of bytes)
in order to avoid overflows when doing "<< PAGE_SHIFT" etc.

Acked-by: Jesse Barnes
Signed-off-by: Linus Torvalds -
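The page-based range check described in the PCI sysfs mmap message above can be sketched as a small helper. The name demo_mmap_fits(), its parameters, and the 4 KiB page size are assumptions made for this illustration; the kernel's own helper may differ in name and detail.

```c
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12 /* assumed 4 KiB pages for this sketch */

/* Return 1 if a mapping of nr_pages starting at page offset pgoff lies
 * entirely inside a resource of res_len bytes, 0 otherwise.
 * All comparisons are done in pages, so no value is shifted left and
 * no intermediate result can overflow. */
int demo_mmap_fits(uint64_t pgoff, uint64_t nr_pages, uint64_t res_len)
{
    uint64_t res_pages;

    if (res_len == 0)
        return 0;

    /* Resource size in pages, rounded up. */
    res_pages = ((res_len - 1) >> DEMO_PAGE_SHIFT) + 1;

    /* pgoff < res_pages guarantees the subtraction cannot wrap. */
    return pgoff < res_pages && res_pages - pgoff >= nr_pages;
}
```

A byte-based version would have to compute something like pgoff << PAGE_SHIFT, which can wrap for large offsets; comparing page counts avoids that, which is the design choice the message calls out.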
Signed-off-by: Jesse Brandeburg
Signed-off-by: Linus Torvalds -
This patch adds a mutex to the e1000e driver that would help
catch any collisions of two e1000e threads accessing hardware
at the same time.

Description and patch updated by Jesse
Signed-off-by: Thomas Gleixner
Signed-off-by: Jesse Brandeburg
Signed-off-by: Linus Torvalds -
The stats lock is left over from e1000; e1000e no longer
has the adjust tbi stats function that required the addition
of the stats lock to begin with.

Adding a mutex to acquire_swflag helped catch this one too.
Signed-off-by: Jesse Brandeburg
Acked-by: Thomas Gleixner
Signed-off-by: Linus Torvalds -
Thanks to tglx, we're finding some interesting reentrancy issues.
This patch removes the phy read from inside a spinlock, paving
the way for removing the spinlock completely. The phy read was
only feeding a statistic that wasn't used.

Signed-off-by: Jesse Brandeburg
Acked-by: Thomas Gleixner
Signed-off-by: Linus Torvalds -
e1000e was apparently calling two functions that attempted to reserve
the SWFLAG bit for exclusive (to hardware and firmware) access to
the PHY and NVM (aka eeprom). These accesses could possibly call
msleep to wait for the resource which is not allowed from interrupt
context.

Signed-off-by: Jesse Brandeburg
Acked-by: Thomas Gleixner
Tested-by: Thomas Gleixner
Signed-off-by: Linus Torvalds -
In the process of debugging things, noticed that the swflag is not reset
by the driver after reset, and the swflag is probably not reset unless
management firmware clears it after 100ms.

Signed-off-by: Jesse Brandeburg
Signed-off-by: Linus Torvalds -
When we initialise a compound page we initialise the page flags and head
page pointer for all base pages spanned by that page. When we initialise
a gigantic page (a page of order greater than or equal to MAX_ORDER) we
have to initialise more than MAX_ORDER_NR_PAGES pages. Currently we
assume that all elements of the mem_map in this page are contiguous in
memory. However this is only guaranteed out to MAX_ORDER_NR_PAGES pages,
and with SPARSEMEM enabled they will not be contiguous. This leads us to
walk off the end of the first section and scribble on everything which
follows, BAD.

When we reach a MAX_ORDER_NR_PAGES boundary we must locate the next
section of the mem_map. As gigantic pages can only be maximally aligned
we know this will occur at an exact multiple of MAX_ORDER_NR_PAGES pages from
the start of the page.

This is a bug fix for the gigantic page support in hugetlbfs.
Credit to Mel Gorman for spotting the issue.
Signed-off-by: Andy Whitcroft
Cc: Mel Gorman
Cc: Jon Tollefson
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The previous patch db203d53d474aa068984e409d807628f5841da1b ("mm:
tiny-shmem fix lock ordering: mmap_sem vs i_mutex") to fix the lock
ordering in tiny-shmem breaks shared anonymous and IPC memory on NOMMU
architectures because it was using the expanding truncate to signal ramfs
to allocate a physically contiguous RAM backing the inode (otherwise it is
unusable for "memory mapping" it to userspace).

However do_truncate is what caused the lock ordering error, due to it
taking i_mutex. In this case, we can actually just call ramfs directly to
allocate memory for the mapping, rather than go via truncate.

Acked-by: David Howells
Acked-by: Hugh Dickins
Signed-off-by: Nick Piggin
Cc: Matt Mackall
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds