Eric Lee / smarc-fsl-linux-kernel

27 Jun, 2006

15 commits

99f895518 [PATCH] proc: don't lock task_structs indefinitely ... Browse Code »

Every inode in /proc holds a reference to a struct task_struct. If a
directory or file is opened and remains open after the the task exits this
pinning continues. With 8K stacks on a 32bit machine the amount pinned per
file descriptor is about 10K.

Normally I would figure a reasonable per user process limit is about 100
processes. With 80 processes, with a 1000 file descriptors each I can trigger
the 00M killer on a 32bit kernel, because I have pinned about 800MB of useless
data.

This patch replaces the struct task_struct pointer with a pointer to a struct
task_ref which has a struct task_struct pointer. The so the pinning of dead
tasks does not happen.

The code now has to contend with the fact that the task may now exit at any
time. Which is a little but not muh more complicated.

With this change it takes about 1000 processes each opening up 1000 file
descriptors before I can trigger the OOM killer. Much better.

[mlp@google.com: task_mmu small fixes]
Signed-off-by: Eric W. Biederman
Cc: Trond Myklebust
Cc: Paul Jackson
Cc: Oleg Nesterov
Cc: Albert Cahalan
Signed-off-by: Prasanna Meda
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Eric W. Biederman
2006-06-27 00:58:25 +0800
48e6484d4 [PATCH] proc: Rewrite the proc dentry flush on exit optimization ... Browse Code »

To keep the dcache from filling up with dead /proc entries we flush them on
process exit. However over the years that code has gotten hairy with a
dentry_pointer and a lock in task_struct and misdocumented as a correctness
feature.

I have rewritten this code to look and see if we have a corresponding entry in
the dcache and if so flush it on process exit. This removes the extra fields
in the task_struct and allows me to trivially handle the case of a
/proc//task/ entry as well as the current /proc/ entries.

Signed-off-by: Eric W. Biederman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Eric W. Biederman
2006-06-27 00:58:24 +0800
e6f47f978 [PATCH] Notify page fault call chain ... Browse Code »

With this patch Kprobes now registers for page fault notifications only when
their is an active probe registered. Once all the active probes are
unregistered their is no need to be notified of page faults and kprobes
unregisters itself from the page fault notifications. Hence we will have ZERO
side effects when no probes are active.

Signed-off-by: Anil S Keshavamurthy
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Anil S Keshavamurthy
2006-06-27 00:58:22 +0800
3d5631e06 [PATCH] Kprobes registers for notify page fault ... Browse Code »

Kprobes now registers for page fault notifications.

Signed-off-by: Anil S Keshavamurthy
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Anil S Keshavamurthy
2006-06-27 00:58:22 +0800
367216567 [PATCH] Kprobe: multi kprobe posthandler for booster ... Browse Code »

If there are multi kprobes on the same probepoint, there will be one extra
aggr_kprobe on the head of kprobe list. The aggr_kprobe has
aggr_post_handler/aggr_break_handler whether the other kprobe
post_hander/break_handler is NULL or not. This patch modifies this, only
when there is one or more kprobe in the list whose post_handler is not
NULL, post_handler of aggr_kprobe will be set as aggr_post_handler.

[soshima@redhat.com: !CONFIG_PREEMPT fix]
Signed-off-by: bibo, mao
Cc: Masami Hiramatsu
Cc: Ananth N Mavinakayanahalli
Cc: "Keshavamurthy, Anil S"
Cc: Prasanna S Panchamukhi
Cc: Jim Keniston
Cc: Yumiko Sugita
Cc: Hideo Aoki
Signed-off-by: Satoshi Oshima
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

mao, bibo
2006-06-27 00:58:22 +0800
19923c190 [PATCH] fix and optimize clock source update ... Browse Code »

This fixes the clock source updates in update_wall_time() to correctly
track the time coming in via current_tick_length(). Optimize the fast
paths to be as short as possible to keep the overhead low.

Signed-off-by: Roman Zippel
Acked-by: John Stultz
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Roman Zippel
2006-06-27 00:58:21 +0800
a27525497 [PATCH] time: rename clocksource functions ... Browse Code »

As suggested by Roman Zippel, change clocksource functions to use
clocksource_xyz rather then xyz_clocksource to avoid polluting the
namespace.

Signed-off-by: John Stultz
Cc: Roman Zippel
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

john stultz
2006-06-27 00:58:21 +0800
5d0cf410e [PATCH] Time: i386 Clocksource Drivers ... Browse Code »

Implement the time sources for i386 (acpi_pm, cyclone, hpet, pit, and tsc).
With this patch, the conversion of the i386 arch to the generic timekeeping
code should be complete.

The patch should be fairly straight forward, only adding the new clocksources.

[hirofumi@mail.parknet.co.jp: acpi_pm cleanup]
Signed-off-by: John Stultz
Signed-off-by: Adrian Bunk
Signed-off-by: Paul Mundt
Signed-off-by: John Stultz
Signed-off-by: OGAWA Hirofumi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

john stultz
2006-06-27 00:58:21 +0800
cf3c769b4 [PATCH] Time: Introduce arch generic time accessors ... Browse Code »

Introduces clocksource switching code and the arch generic time accessor
functions that use the clocksource infrastructure.

Signed-off-by: John Stultz
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

john stultz
2006-06-27 00:58:20 +0800
5eb6d2053 [PATCH] Time: Use clocksource abstraction for NTP adjustments ... Browse Code »

Instead of incrementing xtime by tick_nsec + ntp adjustments, use the
clocksource abstraction to increment and scale time. Using the clocksource
abstraction allows other clocksources to be used consistently in the face of
late or lost ticks, while preserving the existing behavior via the jiffies
clocksource.

This removes the need to keep time_phase adjustments as we just use the
current_tick_length() function as the NTP interface and accumulate time using
shifted nanoseconds.

The basics of this design was by Roman Zippel, however it is my own
interpretation and implementation, so the credit should go to him and the
blame to me.

Signed-off-by: John Stultz
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

john stultz
2006-06-27 00:58:20 +0800
260a42309 [PATCH] Time: Let user request precision from current_tick_length() ... Browse Code »

Change the current_tick_length() function so it takes an argument which
specifies how much precision to return in shifted nanoseconds. This provides
a simple way to convert between NTPs internal nanoseconds shifted by
(SHIFT_SCALE - 10) to other shifted nanosecond units that are used by the
clocksource abstraction.

Signed-off-by: John Stultz
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

john stultz
2006-06-27 00:58:20 +0800
ad596171e [PATCH] Time: Use clocksource infrastructure for update_wall_time ... Browse Code »

Modify the update_wall_time function so it increments time using the
clocksource abstraction instead of jiffies. Since the only clocksource driver
currently provided is the jiffies clocksource, this should result in no
functional change. Additionally, a timekeeping_init and timekeeping_resume
function has been added to initialize and maintain some of the new timekeping
state.

[hirofumi@mail.parknet.co.jp: fixlet]
Signed-off-by: John Stultz
Signed-off-by: OGAWA Hirofumi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

john stultz
2006-06-27 00:58:20 +0800
734efb467 [PATCH] Time: Clocksource Infrastructure ... Browse Code »

This introduces the clocksource management infrastructure. A clocksource is a
driver-like architecture generic abstraction of a free-running counter. This
code defines the clocksource structure, and provides management code for
registering, selecting, accessing and scaling clocksources.

Additionally, this includes the trivial jiffies clocksource, a lowest common
denominator clocksource, provided mainly for use as an example.

[hirofumi@mail.parknet.co.jp: Don't enable IRQ too early]
Signed-off-by: John Stultz
Signed-off-by: Ingo Molnar
Signed-off-by: Paul Mundt
Signed-off-by: John Stultz
Signed-off-by: OGAWA Hirofumi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

john stultz
2006-06-27 00:58:20 +0800
81615b624 [PATCH] Convert kernel/cpu.c to mutexes ... Browse Code »

Convert kernel/cpu.c from semaphore to mutex.

I've reviewed all lock_cpu_hotplug() critical sections, and they all seem to
fit mutex semantics.

Signed-off-by: Ingo Molnar
Cc: Rusty Russell
Cc: Ashok Raj
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Ingo Molnar
2006-06-27 00:58:16 +0800
1fb00c6cb [PATCH] work around ppc64 bootup bug by making mutex-debugging save/restore irqs ... Browse Code »

It seems ppc64 wants to lock mutexes in early bootup code, with interrupts
disabled, and they expect interrupts to stay disabled, else they crash.

Work around this bug by making mutex debugging variants save/restore irq
flags.

Signed-off-by: Ingo Molnar
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Ingo Molnar
2006-06-27 00:58:16 +0800

26 Jun, 2006

21 commits

3448097fc Revert "swsusp special saveable pages support" commits ... Browse Code »

This reverts commits

3e3318dee0878d42ed62a19c292a2ac284135db3 [PATCH] swsusp: x86_64 mark special saveable/unsaveable pages
b6370d96e09944c6e3ae8d5743ca8a8ab1f79f6c [PATCH] swsusp: i386 mark special saveable/unsaveable pages
ce4ab0012b32c1a4a1d6e934aeb73bf3151c48d9 [PATCH] swsusp: add architecture special saveable pages support

because not only do they apparently cause page faults on x86, the
infrastructure doesn't compile on powerpc.

Signed-off-by: Linus Torvalds

Linus Torvalds
2006-06-26 09:41:00 +0800
72cf2709b Fix PM_TRACE dependency: works only on 32-bit x86 for now ... Browse Code »

Not that x86-64 and other architecture support should be difficult to
add (trivial fixups to the data format and add the proper linker script
entry).

Signed-off-by: Linus Torvalds

Linus Torvalds
2006-06-26 01:04:15 +0800
77787bfb4 [PATCH] pacct: none-delayed process accounting accumulation ... Browse Code »

In current 2.6.17 implementation, signal_struct refered from task_struct is
used for per-process data structure. The pacct facility also uses it as a
per-process data structure to store stime, utime, minflt, majflt. But those
members are saved in __exit_signal(). It's too late.

For example, if some threads exits at same time, pacct facility has a
possibility to drop accountings for a part of those threads. (see, the
following 'The results of original 2.6.17 kernel') I think accounting
information should be completely collected into the per-process data structure
before writing out an accounting record.

This patch fixes this matter. Accumulation of stime, utime, minflt and majflt
are done before generating accounting record.

[mingo@elte.hu: fix acct_collect() siglock bug found by lockdep]
Signed-off-by: KaiGai Kohei
Signed-off-by: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

KaiGai Kohei
2006-06-26 01:01:25 +0800
f6ec29a42 [PATCH] pacct: avoidance to refer the last thread as a representation of the process ... Browse Code »

When pacct facility generate an 'ac_flag' field in accounting record, it
refers a task_struct of the thread which died last in the process. But any
other task_structs are ignored.

Therefore, pacct facility drops ASU flag even if root-privilege operations are
used by any other threads except the last one. In addition, AFORK flag is
always set when the thread of group-leader didn't die last, although this
process has called execve() after fork().

We have a same matter in ac_exitcode. The recorded ac_exitcode is an exit
code of the last thread in the process. There is a possibility this exitcode
is not the group leader's one.

KaiGai Kohei
2006-06-26 01:01:25 +0800
0e4648141 [PATCH] pacct: add pacct_struct to fix some pacct bugs. ... Browse Code »

The pacct facility need an i/o operation when an accounting record is
generated. There is a possibility to wake OOM killer up. If OOM killer is
activated, it kills some processes to make them release process memory
regions.

But acct_process() is called in the killed processes context before calling
exit_mm(), so those processes cannot release own memory. In the results, any
processes stop in this point and it finally cause a system stall.

KaiGai Kohei
2006-06-26 01:01:25 +0800
9e37bd301 [PATCH] kthread: move kernel-doc and put it into DocBook ... Browse Code »

Move kthread API kernel-doc from kthread.h to kthread.c & fix it.
Add kthread API to kernel-api DocBook.

Signed-off-by: Randy Dunlap
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Randy Dunlap
2006-06-26 01:01:24 +0800
fa9799e33 [PATCH] ktime/hrtimer: fix kernel-doc comments ... Browse Code »

Fix kernel-doc formatting in ktime.h and hrtimer.[ch] files.

Signed-off-by: Randy Dunlap
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Randy Dunlap
2006-06-26 01:01:23 +0800
fc75cdfa5 [PATCH] cpu hotplug: fix CPU_UP_CANCEL handling ... Browse Code »

If a cpu hotplug callback fails on CPU_UP_PREPARE, all callbacks will be
called with CPU_UP_CANCELED. A few of these callbacks assume that on
CPU_UP_PREPARE a pointer to task has been stored in a percpu array. This
assumption is not true if CPU_UP_PREPARE fails and the following calls to
kthread_bind() in CPU_UP_CANCELED will cause an addressing exception
because of passing a NULL pointer.

Signed-off-by: Heiko Carstens
Cc: Ashok Raj
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Heiko Carstens
2006-06-26 01:01:22 +0800
8bdd1d125 [PATCH] kthread: convert stop_machine into a kthread ... Browse Code »

- Update stop_machine.c to spawn stop_machine as kthreads rather than the
deprecated kernel_threads.

- Update stop_machine to use the more efficient kthread_bind() before
running task in place of set_cpus_allowed() after.

[akpm@osdl.org: remove now-wrong set_cpus_allowed()]
Signed-off-by: Serge E. Hallyn
Cc: Rusty Russell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Serge E. Hallyn
2006-06-26 01:01:22 +0800
2aa92581f [PATCH] Link error when futexes are disabled on 64bit architectures ... Browse Code »

If futexes are disabled we fail to link on ppc64.

Signed-off-by: Anton Blanchard
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Anton Blanchard
2006-06-26 01:01:17 +0800
838cd153a [PATCH] N32 sigset and __COMPAT_ENDIAN_SWAP__ ... Browse Code »

I'm testing glibc on MIPS64, little-endian, N32, O32 and N64 multilibs.

Among the NPTL test failures seen are some arising from sigsuspend problems
for N32: it blocks the wrong signals, so SIGCANCEL (SIGRTMIN) is blocked
despite glibc's carefully excluding it from sets of signals to block.
Specifically, testing suggests it blocks signal N^32 instead of signal N,
so (in the example tested) blocking SIGUSR1 (17) blocks signal 49 instead.

glibc's sigset_t uses an array of unsigned long, as does the kernel.
In both cases, signal N+1 is represented as
(1UL << (N % (8 * sizeof (unsigned long)))) in word number
(N / (8 * sizeof (unsigned long))).

Thus the N32 glibc uses an array of 32-bit words and the N64 kernel uses an
array of 64-bit words. For little-endian, the layout is the same, with
signals 1-32 in the first 4 bytes, signals 33-64 in the second, etc.; for
big-endian, userspace has that layout while in the kernel each 8 bytes have
the two halves swapped from the userspace layout.

The N32 sigsuspend syscall uses sigset_from_compat to convert the userspace
sigset to kernel format. If __COMPAT_ENDIAN_SWAP__ is *not* set, this uses
logic of the form

set->sig[0] = compat->sig[0] | (((long)compat->sig[1]) << 32 )

to convert the userspace sigset to a kernel one. This looks correct to me
for both big and little endian, given that in userspace compat->sig[1] will
represent signals 33-64, and so will the high 32 bits of set->sig[0] in the
kernel. If however __COMPAT_ENDIAN_SWAP__ *is* set, as it is for
__MIPSEL__, it uses

set->sig[0] = compat->sig[1] | (((long)compat->sig[0]) << 32 );

which seems incorrect for both big and little endian, and would
explain the observed symptoms.

This code is the only use of __COMPAT_ENDIAN_SWAP__, so if incorrect
then that macro serves no purpose, in which case something like the
following patch would seem appropriate to remove it.

Signed-off-by: Joseph Myers
Signed-off-by: Ralf Baechle
Cc: Arnd Bergmann
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

akpm@osdl.org
2006-06-26 01:01:15 +0800
eab03ac7b [PATCH] Get rid of /proc/sys/proc ... Browse Code »

The table is empty, why does it still exist?

Signed-off-by: Stephen Hemminger
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Stephen Hemminger
2006-06-26 01:01:15 +0800
3b9c04106 [PATCH] printk time parameter ... Browse Code »

Currently, enabling/disabling printk timestamps is only possible through
reboot (bootparam) or recompile. I normally do not run with timestamps
(since syslog handles that in a good manner), but for measuring small
kernel delays (e.g. irq probing - see parport thread) I needed subsecond
precision, but then again, just for some minutes rather than all kernel
messages to come. The following patch adds a module_param() with which the
timestamps can be en-/disabled in a live system through
/sys/modules/printk/parameters/printk_time.

Signed-off-by: Jan Engelhardt
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Engelhardt
2006-06-26 01:01:13 +0800
11e64757f [PATCH] Remove unecessary NULL check in kernel/acct.c ... Browse Code »

copy_process() appears to be the only caller of acct_clear_integrals() and
does not pass in NULL task pointers. Remove the unecessary check.

Signed-off-by: Matt Helsley
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Matt Helsley
2006-06-26 01:01:09 +0800
3b364b8d5 [PATCH] constify parts of kernel/power/ ... Browse Code »

Signed-off-by: Andreas Mohr
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andreas Mohr
2006-06-26 01:01:08 +0800
b61367732 [PATCH] schedule_on_each_cpu(): reduce kmalloc() size ... Browse Code »

schedule_on_each_cpu() presently does a large kmalloc - 96 kbytes on 1024 CPU
64-bit.

Rework it so that we do one 8192-byte allocation and then a pile of tiny ones,
via alloc_percpu(). This has a much higher chance of success (100% in the
current VM).

This also has the effect of reducing the memory requirements from NR_CPUS*n to
num_possible_cpus()*n.

Cc: Christoph Lameter
Cc: Andi Kleen
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew Morton
2006-06-26 01:01:07 +0800
83cc5ed3c [PATCH] kernel/sys.c: cleanups ... Browse Code »

- proper prototypes for the following functions:
- ctrl_alt_del() (in include/linux/reboot.h)
- getrusage() (in include/linux/resource.h)
- make the following needlessly global functions static:
- kernel_restart_prepare()
- kernel_kexec()

[akpm@osdl.org: compile fix]
Signed-off-by: Adrian Bunk
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Adrian Bunk
2006-06-26 01:01:06 +0800
76a8ad293 [PATCH] Make printk work for really early debugging ... Browse Code »

Currently printk is no use for early debugging because it refuses to
actually print anything to the console unless
cpu_online(smp_processor_id()) is true.

The stated explanation is that console drivers may require per-cpu
resources, or otherwise barf, because the system is not yet setup
correctly. Fair enough.

However some console drivers might be quite happy running early during
boot, in fact we have one, and so it'd be nice if printk understood that.

So I added a flag (which I would have called CON_BOOT, but that's taken)
called CON_ANYTIME, which indicates that a console is happy to be called
anytime, even if the cpu is not yet online.

Tested on a Power 5 machine, with both a CON_ANYTIME driver and a bogus
console driver that BUG()s if called while offline. No problems AFAICT.
Built for i386 UP & SMP.

Signed-off-by: Michael Ellerman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Michael Ellerman
2006-06-26 01:01:05 +0800
bbb1747d4 [PATCH] Allow raw_notifier callouts to unregister themselves ... Browse Code »

Since raw_notifier chains don't benefit from any centralized locking
protections, they shouldn't suffer from the associated limitations. Under
some circumstances it might make sense for a raw_notifier callout routine
to unregister itself from the notifier chain. This patch (as678) changes
the notifier core to allow for such things.

Signed-off-by: Alan Stern
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alan Stern
2006-06-26 01:01:01 +0800
bfe5d8341 [PATCH] Define __raw_get_cpu_var and use it ... Browse Code »

There are several instances of per_cpu(foo, raw_smp_processor_id()), which
is semantically equivalent to __get_cpu_var(foo) but without the warning
that smp_processor_id() can give if CONFIG_DEBUG_PREEMPT is enabled. For
those architectures with optimized per-cpu implementations, namely ia64,
powerpc, s390, sparc64 and x86_64, per_cpu() turns into more and slower
code than __get_cpu_var(), so it would be preferable to use __get_cpu_var
on those platforms.

This defines a __raw_get_cpu_var(x) macro which turns into per_cpu(x,
raw_smp_processor_id()) on architectures that use the generic per-cpu
implementation, and turns into __get_cpu_var(x) on the architectures that
have an optimized per-cpu implementation.

Signed-off-by: Paul Mackerras
Acked-by: David S. Miller
Acked-by: Ingo Molnar
Acked-by: Martin Schwidefsky
Cc: Rusty Russell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Paul Mackerras
2006-06-26 01:01:01 +0800
f867d2a2e [PATCH] ensure NULL deref can't possibly happen in is_exported() ... Browse Code »

If CONFIG_KALLSYMS is defined and if it should happen that is_exported() is
given a NULL 'mod' and lookup_symbol(name, __start___ksymtab,
__stop___ksymtab) returns 0, then we'll end up dereferencing a NULL
pointer.

Signed-off-by: Jesper Juhl
Cc: Rusty Russell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jesper Juhl
2006-06-26 01:00:59 +0800

25 Jun, 2006

1 commit

eb71c87a4 Add some basic resume trace facilities ... Browse Code »

Considering that there isn't a lot of hw we can depend on during resume,
this is about as good as it gets.

This is x86-only for now, although the basic concept (and most of the
code) will certainly work on almost any platform.

Signed-off-by: Linus Torvalds

Linus Torvalds
2006-06-25 05:44:01 +0800

23 Jun, 2006

3 commits

125e18745 [PATCH] More BUG_ON conversion ... Browse Code »

Signed-off-by: Eric Sesterhenn
Signed-off-by: Alexey Dobriyan
Cc: Bartlomiej Zolnierkiewicz
Cc: Alan Cox
Cc: James Bottomley
Acked-by: "Salyzyn, Mark"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Eric Sesterhenn
2006-06-23 22:43:08 +0800
908dcecda [PATCH] adjust handle_IRR_event() return type ... Browse Code »

Correct the return type of handle_IRQ_event() (inconsistency noticed during
Xen development), and remove redundant declarations. The return type
adjustment required breaking out the definition of irqreturn_t into a
separate header, in order to satisfy current include order dependencies.

Signed-off-by: Jan Beulich

Cc: Richard Henderson
Cc: Ivan Kokshaysky
Cc: Russell King
Cc: Ian Molton
Cc: Mikael Starvik
Cc: Yoshinori Sato
Cc: Hirokazu Takata
Cc: Heiko Carstens
Cc: Martin Schwidefsky
Cc: William Lee Irwin III
Cc: "David S. Miller"
Cc: Miles Bader
Cc: Geert Uytterhoeven
Cc: Roman Zippel
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Beulich
2006-06-23 22:43:08 +0800
3439dd86e [PATCH] When CONFIG_BASE_SMALL=1, cascade() may enter an infinite loop ... Browse Code »

When CONFIG_BASE_SAMLL=1, cascade() in may enter the infinite loop.
Because of CONFIG_BASE_SMALL=1(TVR_BITS=6 and TVN_BITS=4), the list
base->tv5 may cascade into base->tv5. So, the kernel enters the infinite
loop in the function cascade().

I created a test module to verify this bug, and a patch to fix it.

#include
#include
#include
#include
#if 0
#include
#else
#define kdb_printf printk
#endif

#define TVN_BITS (CONFIG_BASE_SMALL ? 4 : 6)
#define TVR_BITS (CONFIG_BASE_SMALL ? 6 : 8)
#define TVN_SIZE (1 << TVN_BITS)
#define TVR_SIZE (1 << TVR_BITS)
#define TVN_MASK (TVN_SIZE - 1)
#define TVR_MASK (TVR_SIZE - 1)

#define TV_SIZE(N) (N*TVN_BITS + TVR_BITS)

struct timer_list timer0;
struct timer_list dummy_timer1;
struct timer_list dummy_timer2;

void dummy_timer_fun(unsigned long data) {
}
unsigned long j=0;
void check_timer_base(unsigned long data)
{
kdb_printf("check_timer_base %08x\n",jiffies);
mod_timer(&timer0,(jiffies & (~0xFFF)) + 0x1FFF);
}

int init_module(void)
{
init_timer(&timer0);
timer0.data = (unsigned long)0;
timer0.function = check_timer_base;
mod_timer(&timer0,jiffies+1);

init_timer(&dummy_timer1);
dummy_timer1.data = (unsigned long)0;
dummy_timer1.function = dummy_timer_fun;

init_timer(&dummy_timer2);
dummy_timer2.data = (unsigned long)0;
dummy_timer2.function = dummy_timer_fun;

j=jiffies;
j&=(~((1<<<
Cc: Matt Mackall
Signed-off-by: Oleg Nesterov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Porpoise
2006-06-23 22:43:08 +0800