08 Oct, 2007

2 commits

  • Async signals should not be reported as sent by current in the audit
    log. As it is, we call audit_signal_info() too early in
    check_kill_permission(). Note that check_kill_permission() already has
    the relevant test - it needs to know whether it should apply
    current-based permission checks. So the solution is to move the
    audit_signal_info() call after that test (a sketch follows this
    entry).

    The bogosity in question is easily reproduced: add a rule watching for
    e.g. kill(2) from a specific process (so that audit_signal_info() does
    not short-circuit to nothing), say load_policy, and watch the bogus
    OBJ_PID entry in the audit logs claiming that a write(2) to a selinuxfs
    file issued by load_policy(8) had somehow managed to send a signal to
    syslogd...

    Signed-off-by: Al Viro
    Acked-by: Steve Grubb
    Acked-by: Eric Paris
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Al Viro
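
    A hedged sketch of the reordering (kernel/signal.c shape of that era;
    the permission-check body is elided, and the helper names are best
    guesses, not the literal patch): audit_signal_info() runs only inside
    the branch that identifies a user-generated signal, so an async signal
    is never attributed to current.

    static int check_kill_permission(int sig, struct siginfo *info,
                                     struct task_struct *t)
    {
            int error = -EINVAL;

            if (!valid_signal(sig))
                    return error;

            error = -EPERM;
            if (info == SEND_SIG_NOINFO ||
                (!is_si_special(info) && SI_FROMUSER(info))) {
                    /* user-generated: only now may audit record
                       current as the sender */
                    error = audit_signal_info(sig, t);
                    if (error)
                            return error;
                    /* ... current-based permission checks ... */
            }
            return security_task_kill(t, info, sig, 0);
    }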
     
  • When using /proc/timer_stats on ppc64 I noticed the events/sec field
    wasn't accurate. Sometimes the integer part was incorrect due to
    rounding (we weren't taking the fractional seconds into consideration).

    The fraction part is also wrong; we need to pad the printf statement
    and take the bottom three digits of 1000 times the value (see the
    sketch after this entry).

    Signed-off-by: Anton Blanchard
    Acked-by: Ingo Molnar
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Anton Blanchard
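
    A minimal, compilable illustration of the rounding idea (a userspace
    stand-in with assumed names, not the kernel code): scale by 1000 so
    the fractional seconds survive the integer division, then print the
    bottom three digits padded with %03llu so a fraction of 7 comes out
    as ".007" rather than ".7".

    #include <stdio.h>

    static void print_events_per_sec(unsigned long long events,
                                     unsigned long long period_ms)
    {
            /* events per second, scaled by 1000 to keep three
               fractional digits */
            unsigned long long eps_x1000 = events * 1000000ULL / period_ms;

            printf("%llu.%03llu events/sec\n",
                   eps_x1000 / 1000, eps_x1000 % 1000);
    }

    int main(void)
    {
            /* 5007 events over 1000 seconds: prints "5.007", where an
               unpadded "%llu.%llu" format would print "5.7" */
            print_events_per_sec(5007, 1000 * 1000);
            return 0;
    }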
     

01 Oct, 2007

2 commits

  • Calling handle_futex_death in exit_robust_list for the different robust
    mutexes of a thread basically frees the mutex. Another thread might
    grab the lock immediately, which updates the next pointer of the mutex.
    fetch_robust_entry over the next pointer might therefore branch into
    the robust mutex list of a different thread. This can cause two
    problems: 1) some mutexes held by the dead thread are not freed and
    2) some mutexes held by a different thread are freed.

    The next pointer needs to be read before calling handle_futex_death
    (see the sketch after this entry).

    Signed-off-by: Martin Schwidefsky
    Acked-by: Ingo Molnar
    Acked-by: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Martin Schwidefsky
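
    A hedged sketch of the resulting loop shape in exit_robust_list()
    (simplified from kernel/futex.c; fetch_robust_entry() copies a user
    pointer and strips the PI bit):

    while (entry != &head->list) {
            /* Read the next pointer BEFORE handle_futex_death(): once
               the futex is released, another thread can take it and
               relink entry->next into its own robust list. */
            if (fetch_robust_entry(&next_entry, &entry->next, &next_pi))
                    break;

            if (entry != pending)
                    handle_futex_death((void __user *)entry + futex_offset,
                                       curr, pi);

            entry = next_entry;
            pi = next_pi;
    }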
     
  • We need to disable all CPUs other than the boot CPU (usually 0) before
    attempting to power off modern SMP machines (a sketch of the fix's
    shape follows this entry). This fixes the hang-on-poweroff issue on my
    MythTV SMP box, and also on Thomas Gleixner's new toybox.

    Signed-off-by: Mark Lord
    Acked-by: Thomas Gleixner
    Cc: "Rafael J. Wysocki"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mark Lord
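
    A hedged sketch of the shape of the fix (kernel/sys.c, surrounding
    details elided): pull every CPU but the boot CPU down before the
    machine-specific power-off code runs.

    void kernel_power_off(void)
    {
            kernel_shutdown_prepare(SYSTEM_POWER_OFF);
            disable_nonboot_cpus();         /* the fix */
            printk(KERN_EMERG "Power down.\n");
            machine_power_off();
    }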
     

23 Sep, 2007

1 commit

  • In a desperate attempt to fix the suspend/resume problem on Andrew's
    VAIO I added a workaround which enforced the broadcast of the oneshot
    timer on resume. This was actually resolving the problem on the VAIO
    but was just a stupid workaround, which was not tackling the root
    cause: the assignment of lower idle C-states in the ACPI
    processor_idle code. The cpuidle patches, which utilize the dynamic
    tick feature and go faster into deeper C-states, exposed the problem
    again. The correct solution is the previous patch, which prevents
    lower C-states across the suspend/resume.

    Remove the enforcement code, including the conditional broadcast timer
    arming, which helped to paper over the real problem for quite a time.
    The oneshot broadcast flag for the CPU which runs the resume code can
    never be set at the time when this code is executed. It only gets set
    when the CPU is entering a lower idle C-state.

    Signed-off-by: Thomas Gleixner
    Tested-by: Andrew Morton
    Cc: Len Brown
    Cc: Venkatesh Pallipadi
    Cc: Rafael J. Wysocki
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     

21 Sep, 2007

1 commit

  • This simplifies the signalfd code by not keeping the signalfd attached
    to the sighand for its whole lifetime.

    This way, the signalfd remains attached to the sighand only during
    poll(2) (and select and epoll) and read(2). It also allows removing
    all the custom "tsk == current" checks in kernel/signal.c, since
    dequeue_signal() will only be called by "current".

    I think this is also what Ben was suggesting some time ago.

    The external effect of this is that a thread can extract only its own
    private signals and the group ones. I think this is acceptable
    behaviour, in that those are the signals the thread would be able to
    fetch without signalfd.

    Signed-off-by: Davide Libenzi
    Signed-off-by: Linus Torvalds

    Davide Libenzi
     

20 Sep, 2007

6 commits

  • When using rt_mutex, a NULL pointer dereference occurs in
    enqueue_task_rt. Here is a scenario:
    1) there are two threads; thread A is fair_sched_class and
    thread B is rt_sched_class.
    2) Thread A is boosted up to rt_sched_class, because thread A
    holds a rt_mutex lock and thread B is waiting for the lock.
    3) At this time, when thread A creates a new thread C, thread
    C gets rt_sched_class.
    4) When doing wake_up_new_task() for thread C, the priority
    of thread C is out of the RT priority range, because the
    normal priority of thread A is not an RT priority. This corrupts
    data by overflowing the rt_prio_array.
    The new thread C should be fair_sched_class.

    The new thread must be in a valid scheduler class before being
    queued. This patch sets the suitable scheduler class (see the sketch
    after this entry).

    Signed-off-by: Hiroshi Shimamoto
    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra

    Hiroshi Shimamoto
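
    A hedged sketch of the idea on the fork path (the placement is
    assumed; the class names are as in the commit text): base the child's
    priority on the parent's normal_prio, not the PI-boosted prio, and
    make the scheduler class match before the child can be enqueued.

    p->prio = current->normal_prio;         /* ignore any PI boost */
    if (!rt_prio(p->prio))
            p->sched_class = &fair_sched_class;
    else
            p->sched_class = &rt_sched_class;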
     
  • add /proc/sys/kernel/sched_compat_yield to make sys_sched_yield()
    more aggressive, by moving the yielding task to the last position
    in the rbtree.

    with sched_compat_yield=0:

    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    2539 mingo 20 0 1576 252 204 R 50 0.0 0:02.03 loop_yield
    2541 mingo 20 0 1576 244 196 R 50 0.0 0:02.05 loop

    with sched_compat_yield=1:

    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    2584 mingo 20 0 1576 248 196 R 99 0.0 0:52.45 loop
    2582 mingo 20 0 1576 256 204 R 0 0.0 0:00.00 loop_yield

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra

    Ingo Molnar
     
  • It turned out that the user namespace is released during do_exit() in
    exit_task_namespaces(), but the struct user_struct is released only
    during put_task_struct(), i.e. MUCH later.

    On debug kernels with poisoned slabs this will cause the oops in
    uid_hash_remove() because the head of the chain, which resides inside the
    struct user_namespace, will be already freed and poisoned.

    Since the uid hash itself is required only when someone can search it, i.e.
    when the namespace is alive, we can safely unhash all the user_struct-s from
    it during the namespace exiting. The subsequent free_uid() will complete the
    user_struct destruction.

    For example, this simple program, run on a kernel with CONFIG_USER_NS
    turned on, will oops the kernel immediately:

    #define _GNU_SOURCE
    #include <sched.h>

    char stack[2 * 1024 * 1024];

    int f(void *foo)
    {
            return 0;
    }

    int main(void)
    {
            /* 0x10000000 is CLONE_NEWUSER */
            clone(f, stack + 1 * 1024 * 1024, 0x10000000, 0);
            return 0;
    }

    This was spotted during OpenVZ kernel testing.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: Alexey Dobriyan
    Acked-by: "Serge E. Hallyn"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     
  • Surprisingly (spotted by Alexey Dobriyan), the uid hash still uses
    list_heads, thus occupying twice as much space as it could. Convert it
    to hlist_heads (see the sketch after this entry).

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: Alexey Dobriyan
    Acked-by: Serge Hallyn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
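
    Why this halves the table: a minimal, compilable sketch of the two
    head types. A list_head bucket carries two pointers; an hlist_head
    bucket carries one, with the extra bookkeeping pushed into the nodes
    that live in the hashed objects themselves.

    #include <stdio.h>

    struct list_head  { struct list_head *next, *prev; };
    struct hlist_node { struct hlist_node *next, **pprev; };
    struct hlist_head { struct hlist_node *first; };

    int main(void)
    {
            printf("list_head bucket:  %zu bytes\n", sizeof(struct list_head));
            printf("hlist_head bucket: %zu bytes\n", sizeof(struct hlist_head));
            return 0;
    }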
     
  • kernel/user.c: Convert list_for_each to list_for_each_entry in
    uid_hash_find() (a before/after sketch follows this entry).

    Signed-off-by: Matthias Kaehlcke
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthias Kaehlcke
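
    A hedged before/after sketch of the conversion's shape (kernel
    context; the field and variable names here are assumptions, not
    copied from kernel/user.c):

    /* before: manual list_entry() on every iteration */
    struct list_head *pos;
    list_for_each(pos, hashent) {
            struct user_struct *user =
                    list_entry(pos, struct user_struct, uidhash_list);
            if (user->uid == uid)
                    return user;
    }

    /* after: the iterator hands us the containing struct directly */
    struct user_struct *user;
    list_for_each_entry(user, hashent, uidhash_list) {
            if (user->uid == uid)
                    return user;
    }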
     
  • struct utsname is copied from the master one without any exclusion.

    Here is sample output from one proggie doing

    sethostname("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa");
    sethostname("bbbbbbbbbbbbbbbbbbbbbbbbbbbbbb");

    and another

    clone(,, CLONE_NEWUTS, ...)
    uname()

    hostname = 'aaaaaaaaaaaaaaaaaaaaaaaaabbbbb'
    hostname = 'bbbaaaaaaaaaaaaaaaaaaaaaaaaaaa'
    hostname = 'aaaaaaaabbbbbbbbbbbbbbbbbbbbbb'
    hostname = 'aaaaaaaaaaaaaaaaaaaaaaaaaabbbb'
    hostname = 'aaaaaaaaaaaaaaaaaaaaaaaaaaaabb'
    hostname = 'aaabbbbbbbbbbbbbbbbbbbbbbbbbbb'
    hostname = 'bbbbbbbbbbbbbbbbaaaaaaaaaaaaaa'

    The hostname is sometimes corrupted.

    Yes, even _the_ simplest namespace activity had a bug in it. :-(
    (A sketch of the locking fix follows this entry.)

    Signed-off-by: Alexey Dobriyan
    Acked-by: Serge Hallyn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
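
    A hedged sketch of the fix's shape (the helper name is an
    assumption): copy the parent's utsname under uts_sem, the same
    semaphore sethostname() takes for writing, so no half-written name
    can be observed.

    static struct uts_namespace *clone_uts_ns(struct uts_namespace *old_ns)
    {
            struct uts_namespace *ns;

            ns = kmalloc(sizeof(*ns), GFP_KERNEL);
            if (!ns)
                    return ERR_PTR(-ENOMEM);

            down_read(&uts_sem);    /* excludes concurrent sethostname() */
            memcpy(&ns->name, &old_ns->name, sizeof(ns->name));
            up_read(&uts_sem);

            kref_init(&ns->kref);
            return ns;
    }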
     

16 Sep, 2007

5 commits

  • Taking a cpu offline removes the cpu from the online mask before the
    CPU_DEAD notification is done. The clock events layer does the cleanup
    of the dead CPU from the CPU_DEAD notifier chain. tick_do_timer_cpu is
    used to avoid xtime lock contention by assigning the task of jiffies
    xtime updates to one CPU. If a CPU is taken offline, then this
    assignment becomes stale. This went unnoticed because most of the time
    the offline CPU went dead before the online CPU reached __cpu_die(),
    where the CPU_DEAD state is checked. In the case that the offline CPU did
    not reach the DEAD state before we reach __cpu_die(), the code in there
    goes to sleep for 100ms. Due to the stale time update assignment, the
    system is stuck forever.

    Take the assignment away when a cpu is no longer in the
    cpu_online_mask. We do this in the last call to
    tick_nohz_stop_sched_tick(), when the offline CPU is on the way to the
    final play_dead() idle entry (see the sketch after this entry).

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
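
    A hedged sketch of the check's shape in tick_nohz_stop_sched_tick()
    (simplified): drop the jiffies-update duty once this CPU has left the
    online mask, so a live CPU can pick it up.

    if (unlikely(!cpu_online(cpu))) {
            if (cpu == tick_do_timer_cpu)
                    tick_do_timer_cpu = -1; /* let another CPU update xtime */
    }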
     
  • When a cpu goes offline it is removed from the broadcast masks. If the
    mask becomes empty, the code shuts down the broadcast device. This is
    wrong, because the broadcast device needs to be ready for the online
    cpu going idle (into a C-state, which stops the local APIC timer).

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • The jinxed VAIO refuses to resume without hitting keys on the keyboard
    when this is not enforced. It is unclear why the cpu ends up in a lower
    C State without notifying the clock events layer, but enforcing the
    oneshot broadcast here is safe.

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • Timekeeping resume adjusts xtime by adding the slept time in seconds
    and resets the reference value of the clock source
    (clock->cycle_last). clock->cycle_last is used to calculate the delta
    between the last xtime update and the readout of the clock source in
    __get_nsec_offset(). xtime plus the offset is the current time. The
    resume code ignores the delta which had already elapsed between the
    last xtime update and the actual time of suspend. If the suspend time
    is short, then we can see time going backwards on resume.

    Suspend:
    offs_s = clock->read() - clock->cycle_last;
    now = xtime + offs_s;
    timekeeping_suspend_time = read_rtc();

    Resume:
    sleep_time = read_rtc() - timekeeping_suspend_time;
    xtime.tv_sec += sleep_time;
    clock->cycle_last = clock->read();
    offs_r = clock->read() - clock->cycle_last;
    now = xtime + offs_r;

    if sleep_time_seconds == 0 and offs_r < offs_s, then time goes
    backwards.

    Fix this by storing the offset from the last xtime update and add it to
    xtime during resume, when we reset clock->cycle_last:

    sleep_time = read_rtc() - timekeeping_suspend_time;
    xtime.tv_sec += sleep_time;
    xtime += offs_s; /* Fixup xtime offset at suspend time */
    clock->cycle_last = clock->read();
    offs_r = clock->read() - clock->cycle_last;
    now = xtime + offs_r;

    Thanks to Marcelo for tracking this down on the OLPC and providing the
    necessary details to analyze the root cause.

    Signed-off-by: Thomas Gleixner
    Cc: John Stultz
    Cc: Marcelo Tosatti

    Thomas Gleixner
     
  • Lockdep complains about the access of rtc in timekeeping_suspend
    inside the interrupt disabled region of the write locked xtime lock.
    Move the access outside.

    Signed-off-by: Thomas Gleixner
    Cc: John Stultz

    Thomas Gleixner
     

12 Sep, 2007

3 commits

  • Seems to me that this timer will only get started on platforms that say
    they don't want it?

    Signed-off-by: Tony Breeds
    Cc: Paul Mackerras
    Cc: Gabriel Paubert
    Cc: Zachary Amsden
    Acked-by: Thomas Gleixner
    Cc: John Stultz
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tony Breeds
     
  • The semantics of call_usermodehelper_pipe() used to be that it would fork
    the helper, and wait for the kernel thread to be started. This was
    implemented by setting sub_info.wait to 0 (implicitly), and doing a
    wait_for_completion().

    As part of the cleanup done in 0ab4dc92278a0f3816e486d6350c6652a72e06c8,
    call_usermodehelper_pipe() was changed to pass 1 as the value for wait to
    call_usermodehelper_exec().

    This is equivalent to setting sub_info.wait to 1, which is a change from
    the previous behaviour. Using 1 instead of 0 causes
    __call_usermodehelper() to start the kernel thread running
    wait_for_helper(), rather than directly calling ____call_usermodehelper().

    The end result is that the calling kernel code blocks until the user mode
    helper finishes. As the helper is expecting input on stdin, and now no one
    is writing anything, everything locks up (observed in do_coredump).

    The fix is to change the 1 to UMH_WAIT_EXEC (aka 0), indicating that
    we want to wait for the kernel thread to be started, but not for the
    helper to finish (see the sketch after this entry).

    Signed-off-by: Michael Ellerman
    Acked-by: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michael Ellerman
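
    A hedged sketch of the one-line shape of the fix inside
    call_usermodehelper_pipe() (simplified): wait only for the exec, not
    for the helper to exit, since the caller still has to feed the
    helper's stdin.

    /* was: call_usermodehelper_exec(sub_info, 1), i.e. wait for the
       helper to finish - which it never does while starved of input */
    return call_usermodehelper_exec(sub_info, UMH_WAIT_EXEC);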
     
  • The futex list traversal on the compat side appears to have a bug.

    Its loop termination condition compares:

    while (compat_ptr(uentry) != &head->list)

    But that can't be right because "uentry" has the special
    "pi" indicator bit still potentially set at bit 0. This
    is cleared by fetch_robust_entry() into the "entry"
    return value.

    What this seems to mean is that the list won't terminate
    when list iteration gets back to the head. And we'll
    also process the list head like a normal entry, which could
    cause all kinds of problems.

    So we should check for equality with "entry". That pointer
    is of the non-compat type, so we have to do a little casting
    to keep the compiler and sparse happy (see the sketch after
    this entry).

    The same problem can in theory occur with the 'pending'
    variable, although that has not been reported from users
    so far.

    Based on the original patch from David Miller.

    Acked-by: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: David Miller
    Signed-off-by: Arnd Bergmann
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arnd Bergmann
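
    A hedged sketch of the corrected termination test in the compat
    walker: compare against "entry" (pi bit already stripped), casting
    the compat list head to the non-compat entry type in the __user
    address space to keep the compiler and sparse quiet.

    while (entry != (struct robust_list __user *) &head->list) {
            /* ... fetch the next uentry/entry pair, call
               handle_futex_death() on the current one, advance ... */
    }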
     

11 Sep, 2007

1 commit

  • When PTRACE_SYSCALL was used and then PTRACE_DETACH is used, the
    TIF_SYSCALL_TRACE flag is left set on the formerly-traced task. This
    means that when a new tracer comes along and does PTRACE_ATTACH, it's
    possible he gets a syscall tracing stop even though he's never used
    PTRACE_SYSCALL. This happens if the task was in the middle of a system
    call when the second PTRACE_ATTACH was done. The symptom is an
    unexpected SIGTRAP when the tracer thinks that only SIGSTOP should have
    been provoked by his ptrace calls so far.

    A few machines already fixed this in ptrace_disable (i386, ia64, m68k).
    But all other machines do not, and still have this bug. On x86_64, this
    constitutes a regression in IA32 compatibility support.

    Since all machines now use TIF_SYSCALL_TRACE for this, I put the
    clearing of TIF_SYSCALL_TRACE in the generic ptrace_detach code rather
    than adding it to every other machine's ptrace_disable (see the sketch
    after this entry).

    Signed-off-by: Roland McGrath
    Signed-off-by: Linus Torvalds

    Roland McGrath
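
    A hedged sketch of the generic-side change (kernel/ptrace.c shape,
    surrounding detach logic elided):

    int ptrace_detach(struct task_struct *child, unsigned int data)
    {
            if (!valid_signal(data))
                    return -EIO;

            /* architecture-specific hardware disable ... */
            ptrace_disable(child);
            /* ... plus the generic flag, so the next tracer does not
               inherit a pending syscall-trace stop */
            clear_tsk_thread_flag(child, TIF_SYSCALL_TRACE);

            /* ... unlink from the tracer and resume the child ... */
            return 0;
    }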
     

05 Sep, 2007

8 commits

  • fix ideal_runtime:

    - do not scale it using niced_granularity();
    it is measured against sum_exec_delta, so it's wall-time, not
    fair-time.

    - move the whole check into __check_preempt_curr_fair()
    so that wakeup preemption can also benefit from the new logic.

    this also results in code size reduction:

    text data bss dec hex filename
    13391 228 1204 14823 39e7 sched.o.before
    13369 228 1204 14801 39d1 sched.o.after

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Second preparatory patch for fix-ideal runtime:

    Mark prev_sum_exec_runtime at the beginning of our run, the same spot
    that adds our wait period to wait_runtime. This seems a more natural
    location to do this, and it also reduces the code a bit:

    text data bss dec hex filename
    13397 228 1204 14829 39ed sched.o.before
    13391 228 1204 14823 39e7 sched.o.after

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Preparatory patch for fix-ideal-runtime:

    simplify __check_preempt_curr_fair(): get rid of the integer return.

    text data bss dec hex filename
    13404 228 1204 14836 39f4 sched.o.before
    13393 228 1204 14825 39e9 sched.o.after

    functionality is unchanged.

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • rename RSR to SRR - 'RSR' is already defined on xtensa.

    found by Adrian Bunk.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • when cleaning sched-stats also clear prev_sum_exec_runtime.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • the cfs_rq->wait_runtime debug/statistics counter was not maintained
    properly - fix this.

    this also removes some code:

    text data bss dec hex filename
    13420 228 1204 14852 3a04 sched.o.before
    13404 228 1204 14836 39f4 sched.o.after

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra

    Ingo Molnar
     
  • fix niced_granularity(): the bug resulted in under-scheduling for
    CPU-bound negative nice level tasks (and this in turn caused higher
    than necessary latencies in nice-0 tasks).

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • First fix the check
    if (*imbalance + SCHED_LOAD_SCALE_FUZZ < busiest_load_per_task)
    with this
    if (*imbalance < busiest_load_per_task)

    As the current check is always false for nice-0 tasks
    (SCHED_LOAD_SCALE_FUZZ is the same as busiest_load_per_task for
    nice-0 tasks).

    With the above change, imbalance was getting reset to 0 in the corner
    case condition, making the FUZZ logic fail. Fix it by not corrupting
    the imbalance, and change the imbalance only when it finds that the
    HT/MC optimization is needed.

    Signed-off-by: Suresh Siddha
    Signed-off-by: Ingo Molnar

    Suresh Siddha
     

31 Aug, 2007

6 commits

  • Spotted by taoyue and Jeremy Katz.

    collect_signal:                         sigqueue_free:

    list_del_init(&first->list);
                                            if (!list_empty(&q->list)) {
                                                    // not taken
                                            }
                                            q->flags &= ~SIGQUEUE_PREALLOC;

    __sigqueue_free(first);                 __sigqueue_free(q);

    Now, __sigqueue_free() is called twice on the same "struct sigqueue" with the
    obviously bad implications.

    In particular, this double free breaks the array_cache->avail logic, so the
    same sigqueue could be "allocated" twice, and the bug can manifest itself via
    the "impossible" BUG_ON(!SIGQUEUE_PREALLOC) in sigqueue_free/send_sigqueue.

    Hopefully this can explain these mysterious bug-reports, see

    http://marc.info/?t=118766926500003
    http://marc.info/?t=118466273000005

    Alexey Dobriyan reports this patch makes the difference for the testcase, but
    nobody has an access to the application which opened the problems originally.

    Also, this patch removes the tasklist lock/unlock; ->siglock is
    enough. (A sketch of the fixed sigqueue_free() follows this entry.)

    Signed-off-by: Oleg Nesterov
    Cc: taoyue
    Cc: Jeremy Katz
    Cc: Sukadev Bhattiprolu
    Cc: Alexey Dobriyan
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Roland McGrath
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
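
    A hedged sketch of the fixed sigqueue_free() shape: the "is it still
    queued" decision is made under ->siglock, and if the entry is still
    queued, the dequeuer (collect_signal) now owns the final free.

    void sigqueue_free(struct sigqueue *q)
    {
            unsigned long flags;
            spinlock_t *lock = &current->sighand->siglock;

            BUG_ON(!(q->flags & SIGQUEUE_PREALLOC));

            spin_lock_irqsave(lock, flags);
            q->flags &= ~SIGQUEUE_PREALLOC;
            if (!list_empty(&q->list))
                    q = NULL;       /* still queued: collect_signal() frees it */
            spin_unlock_irqrestore(lock, flags);

            if (q)
                    __sigqueue_free(q);
    }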
     
  • Signed-off-by: Alexey Dobriyan
    Acked-by: Cedric Le Goater
    Acked-by: Serge Hallyn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • Mariusz Kozlowski reported lockdep's warning:

    > =================================
    > [ INFO: inconsistent lock state ]
    > 2.6.23-rc2-mm1 #7
    > ---------------------------------
    > inconsistent {in-hardirq-W} -> {hardirq-on-W} usage.
    > ifconfig/5492 [HC0[0]:SC0[0]:HE1:SE1] takes:
    > (&tp->lock){+...}, at: [] rtl8139_interrupt+0x27/0x46b [8139too]
    > {in-hardirq-W} state was registered at:
    > [] __lock_acquire+0x949/0x11ac
    > [] lock_acquire+0x99/0xb2
    > [] _spin_lock+0x35/0x42
    > [] rtl8139_interrupt+0x27/0x46b [8139too]
    > [] handle_IRQ_event+0x28/0x59
    > [] handle_level_irq+0xad/0x10b
    > [] do_IRQ+0x93/0xd0
    > [] common_interrupt+0x2e/0x34
    ...
    > other info that might help us debug this:
    > 1 lock held by ifconfig/5492:
    > #0: (rtnl_mutex){--..}, at: [] mutex_lock+0x1c/0x1f
    >
    > stack backtrace:
    ...
    > [] _spin_lock+0x35/0x42
    > [] rtl8139_interrupt+0x27/0x46b [8139too]
    > [] free_irq+0x11b/0x146
    > [] rtl8139_close+0x8a/0x14a [8139too]
    > [] dev_close+0x57/0x74
    ...

    This shows that a driver's irq handler was running both in hard
    interrupt and process contexts with irqs enabled. The latter was done
    during a free_irq() call and was possible only with
    CONFIG_DEBUG_SHIRQ enabled. This was fixed by another patch.

    But a similar problem is possible with request_irq(): any locks taken
    from the irq handler could be vulnerable - especially with soft
    interrupts. This patch fixes it by disabling local interrupts during
    the handler's run (see the sketch after this entry). (It seems
    disabling softirqs should be enough, but that needs more checking for
    possible races or other special cases.)

    Reported-by: Mariusz Kozlowski
    Signed-off-by: Jarek Poplawski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jarek Poplawski
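
    A hedged sketch of the fix's shape in the CONFIG_DEBUG_SHIRQ probe
    inside request_irq() (simplified): run the spurious test invocation
    with local interrupts off, the way a real shared-handler call would
    run.

    #ifdef CONFIG_DEBUG_SHIRQ
            if (irqflags & IRQF_SHARED) {
                    unsigned long flags;

                    local_irq_save(flags);  /* the fix: mimic hardirq context */
                    handler(irq, dev_id);
                    local_irq_restore(flags);
            }
    #endif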
     
  • Dependencies of CONFIG_SUSPEND and CONFIG_HIBERNATION introduced by commit
    296699de6bdc717189a331ab6bbe90e05c94db06 "Introduce CONFIG_SUSPEND for
    suspend-to-Ram and standby" are incorrect, as they don't cover the facts that
    (1) not all architectures support suspend and (2) SMP hibernation is only
    possible on X86 and PPC64 (if CONFIG_PPC64_SWSUSP is set).

    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • Spotted by Marcin Kowalczyk.

    sys_setpgid(child) fails if the child was forked by a sub-thread.

    Fix the "is it our child" check. The previous commit
    ee0acf90d320c29916ba8c5c1b2e908d81f5057d was not complete.

    (This patch asks for the new same_thread_group() helper, but mainline
    doesn't have it yet; a sketch of the corrected check follows this
    entry.)

    Signed-off-by: Oleg Nesterov
    Acked-by: Roland McGrath
    Cc:
    Tested-by: "Marcin 'Qrczak' Kowalczyk"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
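
    A hedged sketch of the corrected test in sys_setpgid() (shape per the
    commit text: compare thread groups - what the future
    same_thread_group() helper would spell - instead of demanding that
    the real parent be the group leader task itself):

    /* was: else if (p->real_parent != group_leader) */
    else if (p->real_parent->tgid != group_leader->tgid)
            goto out;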
     
  • taskstats.ac_exitcode is assigned to task_struct.exit_code in bacct_add_tsk()
    through the following kernel function calls:

    do_exit()
    taskstats_exit()
    fill_pid()
    bacct_add_tsk()

    The problem is that in do_exit(), task_struct.exit_code is set to
    'code' only after taskstats_exit() has been called. So we need to move
    the assignment before taskstats_exit() (see the sketch after this
    entry).

    Signed-off-by: Jonathan Lim
    Cc: Balbir Singh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jonathan Lim
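
    A hedged sketch of the reordering in do_exit() (simplified): store
    the exit code before taskstats_exit() runs, so that fill_pid() ->
    bacct_add_tsk() records the real value.

    tsk->exit_code = code;                  /* moved up, before taskstats */
    taskstats_exit(tsk, group_dead);        /* snapshots tsk->exit_code */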
     

28 Aug, 2007

2 commits

  • cleanup: we have the 'se' and 'curr' entity-pointers already,
    no need to use p->se and current->se.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Mike Galbraith

    Ingo Molnar
     
  • small schedstat fix: the cfs_rq->wait_runtime 'sum of all runtimes'
    statistics counters missed newly forked tasks and thus had a constant
    negative skew. Fix this.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Mike Galbraith

    Ingo Molnar