30 Sep, 2006
40 commits
-
A previous patch to allow an exiting task to OOM kill itself (and thereby
avoid a little deadlock) introduced a problem. We don't want the
PF_EXITING task, even if it is 'current', to access mem reserves if there
is already a TIF_MEMDIE process in the system sucking up reserves.Also make the commenting a little bit clearer, and note that our current
scheme of effectively single threading the OOM killer is not itself
perfect.Signed-off-by: Nick Piggin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
- It is not possible to have task->mm == &init_mm.
- task_lock() buys nothing for 'if (!p->mm)' check.
Signed-off-by: Oleg Nesterov
Cc: Nick Piggin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
No logic changes, but imho easier to read.
Signed-off-by: Oleg Nesterov
Acked-by: Nick Piggin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The only one usage of TASK_DEAD outside of last schedule path,
select_bad_process:
for_each_task(p) {
if (!p->mm)
continue;
...
if (p->state == TASK_DEAD)
continue;
...TASK_DEAD state is set at the end of do_exit(), this means that p->mm
was already set == NULL by exit_mm(), so this task was already rejected
by 'if (!p->mm)' above.Note also that the caller holds tasklist_lock, this means that p can't
pass exit_notify() and then set TASK_DEAD when p->mm != NULL.Also, remove open-coded is_init().
Signed-off-by: Oleg Nesterov
Cc: Ingo Molnar
Cc: Nick Piggin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
I am not sure about this patch, I am asking Ingo to take a decision.
task_struct->state == EXIT_DEAD is a very special case, to avoid a confusion
it makes sense to introduce a new state, TASK_DEAD, while EXIT_DEAD should
live only in ->exit_state as documented in sched.h.Note that this state is not visible to user-space, get_task_state() masks off
unsuitable states.Signed-off-by: Oleg Nesterov
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
After the previous change (->flags & PF_DEAD) (->state == EXIT_DEAD), we
don't need PF_DEAD any longer.Signed-off-by: Oleg Nesterov
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
schedule() checks PF_DEAD on every context switch and sets ->state = EXIT_DEAD
to ensure that the exiting task will be deactivated. Note that this EXIT_DEAD
is in fact a "random" value, we can use any bit except normal TASK_XXX values.It is better to set this state in do_exit() along with PF_DEAD flag and remove
that check in schedule().We are safe wrt concurrent try_to_wake_up() (for example ptrace, tkill), it
can not change task's ->state: the 'state' argument of try_to_wake_up() can't
have EXIT_DEAD bit. And in case when try_to_wake_up() sees a stale value of
->state == TASK_RUNNING it will do nothing.Signed-off-by: Oleg Nesterov
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Introduce the disable_irq_nosync_lockdep_irqsave() and
enable_irq_lockdep_irqrestore() APIs. These are needed for NE2000; basically
NE2000 calls disable_irq and enable_irq as locking against the IRQ handler,
but both in cases where interrupts are on and off. This means that lockdep
needs to track the old state of the virtual irq flags on disable_irq, and
restore these at enable_irq time.Signed-off-by: Arjan van de Ven
Signed-off-by: Ingo Molnar
Cc: Jeff Garzik
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Everyone passes valid pointer there.
Signed-off-by: Alexey Dobriyan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
If register_filesystem() fails mux workqueue must be killed.
Signed-off-by: Alexey Dobriyan
Cc: Eric Van Hensbergen
Cc: Ron Minnich
Cc: Latchesar Ionkov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
It always returns 0, so relying on it is useless. The only caller isn't
checking return value. In general, un-, de-, -free functions should return
void.Signed-off-by: Alexey Dobriyan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
If register_filesystem() fails, vxfs_inode cache must be destroyed.
Signed-off-by: Alexey Dobriyan
Acked-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Two lines -- two bugs. :-(
Signed-off-by: Alexey Dobriyan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Currently, __acquire and __release take a lock expression, but __cond_lock
takes only a condition, not the lock acquired if the expression evaluates
to true. Change __cond_lock to accept a lock expression, and change all
the callers to pass in a lock expression.Signed-off-by: Josh Triplett
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Fix "quiet" parameter doc. No trailing '=' sign, no value after it. And
it disables "most" kernel messages, not all of them.Signed-off-by: Randy Dunlap
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
At the beginning of the routine, "copied" is set to 0, but it is no good
because in lines 805 and 812 it is set to other values. Finally, the
routine returns as if it copied 12 (=ENOMEM) bytes less than it actually
did.Signed-off-by: Frederik Deweerdt
Acked-by: Eric Biederman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
In the case below we are locking the whole disk not a partition. This
change simply brings the code in line with the piece above where when we
are the 'first' opener, and we are a partition.Signed-off-by: Jason Baron
Acked-by: Peter Zijlstra
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
spin_trylock_irq and spin_trylock_irqsave use _spin_trylock, which does not
use the __cond_lock wrapper annotation and thus does not affect the lock
context; change them to use spin_trylock instead, which does use
__cond_lock.Signed-off-by: Josh Triplett
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The lock annotations used on spinlocks and rwlocks currently use
__{acquires,releases}(spinlock_t) and __{acquires,releases}(rwlock_t),
respectively. This loses the information of which lock actually got
acquired or released, and assumes a different type for the parameter of
__acquires and __releases than the rest of the kernel. While the current
implementations of __acquires and __releases throw away their argument,
this will not always remain the case. Change this to use the lock
parameter instead, to preserve this information and increase consistency in
usage of __acquires and __releases.Signed-off-by: Josh Triplett
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Signed-off-by: Alexey Dobriyan
Acked-by: Benjamin Herrenschmidt
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
If your driver implements "break on" and "break off" this ensures you won't
get multiple overlapping requests or requests in parallel. If your driver
has its own break handling then its still your problem as the driver
author.Break is also now serialized against writes from user space properly but no
new guarantees are made driver level about writes from the line discipline
itself (eg flow control or echo)Signed-off-by: Alan Cox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
[akpm@osdl.org: build fix]
[akpm@osdl.org: warning fix]
Signed-off-by: Alan Cox
Acked-by: David S. Miller
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Adds a missing exit, if the file that should be parsed couldn't be opened.
Without it crashes with a segfault, cause the filedescriptor is accessed
even if the file could not be opened.Signed-off-by: Henrik Kretzschmar
Acked-by: Randy Dunlap
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
use rcu locks for find_task_by_pid().
Signed-off-by: Oleg Nesterov
Cc: Ingo Molnar
Cc: Thomas Gleixner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
It is ok to do find_task_by_pid() + get_task_struct() under
rcu_read_lock(), we cand drop tasklist_lock.Note that testing of ->exit_state is racy with or without tasklist anyway.
Signed-off-by: Oleg Nesterov
Acked-by: Ingo Molnar
Cc: Thomas Gleixner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
During testing I've found that the mount pending flag can be left set at
exit from autofs4_lookup after a failed mount request. This shouldn't be
allowed to happen and causes incorrect error returns.Signed-off-by: Ian Kent
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The check for an empty directory in the autofs4_follow_link method fails
occassionally due to old dentrys. We had the same problem
autofs4_revalidate ages ago. I thought we wouldn't need this in
autofs4_follow_link, silly me.Signed-off-by: Ian Kent
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
copy_process:
// holds tasklist_lock + ->siglock
/*
* inherit ioprio
*/
p->ioprio = current->ioprio;Why? ->ioprio was already copied in dup_task_struct(). I guess this is
needed to ensure that the child can't escape
sys_ioprio_set(IOPRIO_WHO_{PGRP,USER}), yes?In that case we don't need ->siglock held, and the comment should be
updated.Signed-off-by: Oleg Nesterov
Cc: Jens Axboe
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Remove open-coded has_rt_policy(), no changes in kernel/exit.o
Signed-off-by: Oleg Nesterov
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Steven Rostedt
Cc: Nick Piggin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
I am not sure this patch is correct: I can't understand what the current
code does, and I don't know what it was supposed to do.The comment says:
* can't change policy, except between SCHED_NORMAL
* and SCHED_BATCH:The code:
if (((policy != SCHED_NORMAL && p->policy != SCHED_BATCH) &&
(policy != SCHED_BATCH && p->policy != SCHED_NORMAL)) &&But this is equivalent to:
if ( (is_rt_policy(policy) && has_rt_policy(p)) &&
which means something different. We can't _decrease_ the current
->rt_priority with such a check (if rlim[RLIMIT_RTPRIO] == 0).Probably, it was supposed to be:
if ( !(policy == SCHED_NORMAL && p->policy == SCHED_BATCH) &&
!(policy == SCHED_BATCH && p->policy == SCHED_NORMAL)this matches the comment, but strange: it doesn't allow to _drop_ the
realtime priority when rlim[RLIMIT_RTPRIO] == 0.I think the right check would be:
/* can't set/change rt policy */
if (is_rt_policy(policy) &&
policy != p->policy &&
!rlim_rtprio)
return -EPERM;Signed-off-by: Oleg Nesterov
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Steven Rostedt
Cc: Nick Piggin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Imho, makes the code a bit easier to read.
Signed-off-by: Oleg Nesterov
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Steven Rostedt
Cc: Nick Piggin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Use rcu locks instead. sched_setscheduler() now takes ->siglock
before reading ->signal->rlim[].Signed-off-by: Oleg Nesterov
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Steven Rostedt
Cc: Nick Piggin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Get rid of an extraneous printk in kernel_restart().
Signed-off-by: Cal Peake
Acked-by: Eric W. Biederman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
If ____call_usermodehelper fails, we're not interested in the child
process' exit value, but the real error, so let's stop wait_for_helper from
overwriting it in that case.Issue discovered by Benedikt Böhm while working on a Linux-VServer usermode
helper.Signed-off-by: Björn Steinbrink
Cc: Rusty Russell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Currently this module just returns 1 if anything on module init fails. Store
the error code of the different function calls and return their error on
problems.Signed-off-by: Rolf Eike Beer
Cc: Greg KH
Signed-off-by: Andrew Morton
[ Fixed to not unregister twice on error ]
Signed-off-by: Linus Torvalds -
It is sure confusing that linux/ptrace.h has:
#define PTRACE_SINGLESTEP 9
#define PTRACE_ATTACH 0x10
#define PTRACE_DETACH 0x11
#define PTRACE_SYSCALL 24
All the low-numbered constants are in decimal, but the last two in hex.
It sure makes it likely that someone will look at this and think that
9, 10, 11 are used, and that 16 and 17 are not used.How about we use the same notation for all the numbers [0,24] in the
same short list?Signed-off-by: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Add some documentation comments for the cdev interface.
Signed-off-by: Jonathan Corbet
Cc: Rolf Eike Beer
Acked-by: "Randy.Dunlap"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
[akpm@osdl.org: fix]
Cc: Alan Cox
Cc: Arjan van de Ven
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
If we are going to BUG() not panic() here then we should cover the case of
the BUG being compiled outSigned-off-by: Alan Cox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Signed-off-by: Alan Cox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds