Doug / smarc-fsl-linux-kernel | Embedian Git Server

09 Feb, 2008

40 commits

d12619b5f fix group stop with exit race ... Browse Code »

do_signal_stop() counts all sub-thread and sets ->group_stop_count
accordingly. Every thread should decrement ->group_stop_count and stop,
the last one should notify the parent.

However a sub-thread can exit before it notices the signal_pending(), or it
may be somewhere in do_exit() already. In that case the group stop never
finishes properly.

Note: this is a minimal fix, we can add some optimizations later. Say we
can return quickly if thread_group_empty(). Also, we can move some signal
related code from exit_notify() to exit_signals().

Signed-off-by: Oleg Nesterov
Acked-by: Davide Libenzi
Cc: Ingo Molnar
Cc: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2008-02-09 01:22:27 +0800
430c62312 start the global /sbin/init with 0,0 special pids ... Browse Code »

As Eric pointed out, there is no problem with init starting with sid == pgid
== 0, and this was historical linux behavior changed in 2.6.18.

Remove kernel_init()->__set_special_pids(), this is unneeded and complicates
the rules for sys_setsid().

This change and the previous change in daemonize() mean that /sbin/init does
not need the special "session != 1" hack in sys_setsid() any longer. We can't
remove this check yet, we should cleanup copy_process(CLONE_NEWPID) first, so
update the comment only.

Signed-off-by: Oleg Nesterov
Acked-by: "Eric W. Biederman"
Cc: Pavel Emelyanov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2008-02-09 01:22:27 +0800
297bd42b1 move daemonized kernel threads into the swapper's session ... Browse Code »

Daemonized kernel threads run in the init's session. This doesn't match the
behaviour of kthread_create()'ed threads, and this is one of the 2 reasons
why we need a special hack in sys_setsid().

Now that set_special_pids() was changed to use struct pid, not pid_t, we can
use init_struct_pid and set 0,0 special pids.

Signed-off-by: Oleg Nesterov
Acked-by: "Eric W. Biederman"
Cc: Pavel Emelyanov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2008-02-09 01:22:27 +0800
8520d7c7f teach set_special_pids() to use struct pid ... Browse Code »

Change set_special_pids() to work with struct pid, not pid_t from global name
space. This again speedups and imho cleanups the code, also a preparation for
the next patch.

Signed-off-by: Oleg Nesterov
Acked-by: "Eric W. Biederman"
Acked-by: Pavel Emelyanov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2008-02-09 01:22:27 +0800
e4cc0a9c8 fix setsid() for sub-namespace /sbin/init ... Browse Code »

sys_setsid() still deals with pid_t's from the global namespace. This means
that the "session > 1" check can't help for sub-namespace init, setsid() can't
succeed because copy_process(CLONE_NEWPID) populates PIDTYPE_PGID/SID links.

Remove the usage of task_struct->pid and convert the code to use "struct pid".
This also simplifies and speedups the code, saves one find_pid().

Signed-off-by: Oleg Nesterov
Cc: "Eric W. Biederman"
Acked-by: Pavel Emelyanov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2008-02-09 01:22:27 +0800
4e021306c sys_setpgid(): simplify pid/ns interaction ... Browse Code »

sys_setpgid() does unneeded conversions from pid_t to "struct pid" and vice
versa. Use "struct pid" more consistently. Saves one find_vpid() and
eliminates the explicit usage of ->nsproxy->pid_ns. Imho, cleanups the
code.

Also use the same_thread_group() helper.

Signed-off-by: Oleg Nesterov
Acked-by: Pavel Emelyanov
Acked-by: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2008-02-09 01:22:27 +0800
c543f1ee0 wait_task_zombie: remove ->exit_state/exit_signal checks for WNOWAIT ... Browse Code »

The first "p->exit_state != EXIT_ZOMBIE" check doesn't make too much sense.
The exit_state was EXIT_ZOMBIE when the function was called, and another
thread can change it to EXIT_DEAD right after the check.

The second condition is not possible, detached non-traced threads were already
filtered out by eligible_child(), we didn't drop tasklist since then.

Signed-off-by: Oleg Nesterov
Cc: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2008-02-09 01:22:27 +0800
3a515e4a6 wait_task_continued/zombie: don't use task_pid_nr_ns() lockless ... Browse Code »

Surprise, the other two wait_task_*() functions also abuse the
task_pid_nr_ns() function, and may cause read-after-free or report nr == 0
in wait_task_continued(). wait_task_zombie() doesn't have this problem,
but it is still better to cache pid_t rather than call task_pid_nr_ns()
three times on the saved pid_namespace.

Signed-off-by: Oleg Nesterov
Cc: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2008-02-09 01:22:26 +0800
f2cc3eb13 do_wait: fix security checks ... Browse Code »

Imho, the current usage of security_task_wait() is not logical.

Suppose we have the single child p, and security_task_wait(p) return
-EANY. In that case waitpid(-1) returns this error. Why? Isn't it
better to return ECHLD? We don't really have reapable children.

Now suppose that child was stolen by gdb. In that case we find this
child on ->ptrace_children and set flag = 1, but we don't check that the
child was denied. So, do_wait(..., WNOHANG) returns 0, this doesn't
match the behaviour above. Without WNOHANG do_wait() blocks only to
return the error later, when the child will be untraced. Inho, really
strange.

I think eligible_child() should return the error only if the child's pid
was requested explicitly, otherwise we should silently ignore the tasks
which were nacked by security_task_wait().

Signed-off-by: Oleg Nesterov
Cc: Roland McGrath
Cc: Chris Wright
Cc: Eric Paris
Cc: James Morris
Cc: Stephen Smalley
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2008-02-09 01:22:26 +0800
96fabbf55 do_wait: cleanup delay_group_leader() usage ... Browse Code »

eligible_child() == 2 means delay_group_leader(). With the previous patch
this only matters for EXIT_ZOMBIE task, we can move that special check to
the only place it is really needed.

Also, with this patch we don't skip security_task_wait() for the group
leaders in a non-empty thread group. I don't really understand the exact
semantics of security_task_wait(), but imho this change is a bugfix.

Also rearrange the code a bit to kill an ugly "check_continued" backdoor.

Signed-off-by: Oleg Nesterov
Cc: Eric Paris
Cc: James Morris
Cc: Roland McGrath
Cc: Stephen Smalley
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2008-02-09 01:22:26 +0800
1bad95c3b wait_task_stopped(): remove unneeded delay_group_leader check ... Browse Code »

wait_task_stopped() doesn't need the "delay_group_leader" parameter. If
the child is not traced it must be a group leader. With or without
subthreads ->group_stop_count == 0 when the whole task is stopped.

Signed-off-by: Oleg Nesterov
Cc: Mika Penttila
Acked-by: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2008-02-09 01:22:26 +0800
20686a309 ptrace_stop: fix racy nonstop_code setting ... Browse Code »

If the tracer is gone and we are not going to stop, ptrace_stop() sets
->exit_code = nostop_code. However, the tracer could actually clear the
exit code before detaching. In that case get_signal_to_deliver() "resends"
the signal which was cancelled by the debugger. For example, it is
possible that a quick PTRACE_ATTACH + PTRACE_DETACH can leave the tracee in
STOPPED state.

Change the behaviour of ptrace_stop(). If the caller is ptrace notify(),
we should always clear ->exit_code. If the caller is
get_signal_to_deliver(), we should not touch it at all. To do so, change
the nonstop_code parameter to "bool clear_code" and change the callers
accordingly.

Signed-off-by: Oleg Nesterov
Acked-by: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2008-02-09 01:22:26 +0800
9cbab8100 do_wait: factor out "retval != 0" checks ... Browse Code »

Every branch if the main "if" statement does the same code at the end. Move
it down. Also, fix the indentation.

Signed-off-by: Oleg Nesterov
Cc: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2008-02-09 01:22:26 +0800
ee7c82da8 wait_task_stopped: simplify and fix races with SIGCONT/SIGKILL/untrace ... Browse Code »

wait_task_stopped() has multiple races with SIGCONT/SIGKILL. tasklist_lock
does not pin the child in TASK_TRACED/TASK_STOPPED stated, almost all info
reported (including exit_code) may be wrong.

In fact, the code under write_lock_irq(tasklist_lock) is not safe. The child
may be PTRACE_DETACH'ed at this time by another subthread, in that case it is
possible we are no longer its ->parent.

Change wait_task_stopped() to take ->siglock before inspecting the task. This
guarantees that the child can't resume and (for example) clear its
->exit_code, so we don't need to use xchg(&p->exit_code) and re-check. The
only exception is ptrace_stop() which changes ->state and ->exit_code without
->siglock held during abort. But this can only happen if both the tracer and
the tracee are dying (coredump is in progress), we don't care.

With this patch wait_task_stopped() doesn't move the child to the end of
the ->parent list on success. This optimization could be restored, but
in that case we have to take write_lock(tasklist) and do some nasty
checks.

Also change the do_wait() since we don't return EAGAIN any longer.

[akpm@linux-foundation.org: fix up after Willy renamed everything]
Signed-off-by: Oleg Nesterov
Cc: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2008-02-09 01:22:26 +0800
6405f7f46 ptrace_stop: fix the race with ptrace detach+attach ... Browse Code »

If the tracer went away (may_ptrace_stop() failed), ptrace_stop() drops
tasklist and then changes the ->state from TASK_TRACED to TASK_RUNNING.

This can fool another tracer which attaches to us in between. Change the
->state under tasklist_lock to ensure that ptrace_check_attach() can't wrongly
succeed. Also, remove the unnecessary mb().

Signed-off-by: Oleg Nesterov
Acked-by: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2008-02-09 01:22:26 +0800
c0c0b649d ptrace_check_attach: remove unneeded ->signal != NULL check ... Browse Code »

It is not possible to see the PT_PTRACED task without ->signal/sighand under
tasklist_lock, release_task() does ptrace_unlink() first. If the task was
already released before, ptrace_attach() can't succeed and set PT_PTRACED.
Remove this check.

Signed-off-by: Oleg Nesterov
Acked-by: Roland McGrath
Cc: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2008-02-09 01:22:26 +0800
34a1738f7 kill my_ptrace_child() ... Browse Code »

Now that my_ptrace_child() is trivial we can use the "p->ptrace & PT_PTRACED"
inline and simplify the corresponding logic in do_wait: we can't find the
child in TASK_TRACED state without PT_PTRACED flag set, ptrace_untrace()
either sets TASK_STOPPED or wakes up the tracee.

Signed-off-by: Oleg Nesterov
Acked-by: Roland McGrath
Cc: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2008-02-09 01:22:26 +0800
6b39c7bfb kill PT_ATTACHED ... Browse Code »

Since the patch

"Fix ptrace_attach()/ptrace_traceme()/de_thread() race"
commit f5b40e363ad6041a96e3da32281d8faa191597b9

we set PT_ATTACHED and change child->parent "atomically" wrt task_list lock.

This means we can remove the checks like "PT_ATTACHED && ->parent != ptracer"
which were needed to catch the "ptrace attach is in progress" case. We can
also remove the flag itself since nobody else uses it.

Signed-off-by: Oleg Nesterov
Acked-by: Roland McGrath
Cc: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2008-02-09 01:22:26 +0800
01b8b07a5 IPC: consolidate sem_exit_ns(), msg_exit_ns() and shm_exit_ns() ... Browse Code »

sem_exit_ns(), msg_exit_ns() and shm_exit_ns() are all called when an
ipc_namespace is released to free all ipcs of each type. But in fact, they
do the same thing: they loop around all ipcs to free them individually by
calling a specific routine.

This patch proposes to consolidate this by introducing a common function,
free_ipcs(), that do the job. The specific routine to call on each
individual ipcs is passed as parameter. For this, these ipc-specific
'free' routines are reworked to take a generic 'struct ipc_perm' as
parameter.

Signed-off-by: Pierre Peiffer
Cc: Cedric Le Goater
Cc: Pavel Emelyanov
Cc: Nadia Derbey
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pierre Peiffer
2008-02-09 01:22:26 +0800
ed2ddbf88 IPC: make struct ipc_ids static in ipc_namespace ... Browse Code »

Each ipc_namespace contains a table of 3 pointers to struct ipc_ids (3 for
msg, sem and shm, structure used to store all ipcs) These 'struct ipc_ids'
are dynamically allocated for each icp_namespace as the ipc_namespace
itself (for the init namespace, they are initialized with pointers to
static variables instead)

It is so for historical reason: in fact, before the use of idr to store the
ipcs, the ipcs were stored in tables of variable length, depending of the
maximum number of ipc allowed. Now, these 'struct ipc_ids' have a fixed
size. As they are allocated in any cases for each new ipc_namespace, there
is no gain of memory in having them allocated separately of the struct
ipc_namespace.

This patch proposes to make this table static in the struct ipc_namespace.
Thus, we can allocate all in once and get rid of all the code needed to
allocate and free these ipc_ids separately.

Signed-off-by: Pierre Peiffer
Acked-by: Cedric Le Goater
Cc: Pavel Emelyanov
Cc: Nadia Derbey
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pierre Peiffer
2008-02-09 01:22:26 +0800
4b9fcb0ec IPC/semaphores: consolidate SEM_STAT and IPC_STAT commands ... Browse Code »

These commands (SEM_STAT and IPC_STAT) are rather doing the same things
(only the meaning of the id given as input and the return value differ).
However, for the semaphores, they are handled in two different places (two
different functions).

This patch consolidates this for clarification by handling these both
commands in the same place in semctl_nolock(). It also removes one unused
parameter for this function.

Signed-off-by: Pierre Peiffer
Cc: Nadia Derbey
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pierre Peiffer
2008-02-09 01:22:26 +0800
b2d75cddc ipc: uninline some code from util.h ... Browse Code »

ipc_lock_check_down(), ipc_lock_check() and ipcget() seem too large to be
inline. Besides, they give no optimization being inline as they perform
calls inside in any case.

Moving them into ipc/util.c saves 500 bytes of vmlinux and shortens IPC
internal API.

$ ./scripts/bloat-o-meter vmlinux-orig vmlinux
add/remove: 3/2 grow/shrink: 0/10 up/down: 490/-989 (-499)
function old new delta
ipcget - 392 +392
ipc_lock_check_down - 49 +49
ipc_lock_check - 49 +49
sys_semget 119 105 -14
sys_shmget 108 86 -22
sys_msgget 100 78 -22
do_msgsnd 665 631 -34
do_msgrcv 680 644 -36
do_shmat 771 733 -38
sys_msgctl 1302 1229 -73
ipcget_new 80 - -80
sys_semtimedop 1534 1452 -82
sys_semctl 2034 1922 -112
sys_shmctl 1919 1765 -154
ipcget_public 322 - -322

The ipcget() growth is the result of gcc inlining of currently static
ipcget_new/_public.

Signed-off-by: Pavel Emelyanov
Cc: Nadia Derbey
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelyanov
2008-02-09 01:22:25 +0800
a46c99942 serial_core: bring mostly into line with coding style ... Browse Code »

Signed-off-by: Alan Cox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alan Cox
2008-02-09 01:22:25 +0800
b4c502a70 8250: enable rate reporting via termios ... Browse Code »

Signed-off-by: Alan Cox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alan Cox
2008-02-09 01:22:25 +0800
6f803cd08 serial8250: coding style ... Browse Code »

Signed-off-by: Alan Cox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alan Cox
2008-02-09 01:22:25 +0800
5756ee999 8250_pci: coding style ... Browse Code »

Signed-off-by: Alan Cox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alan Cox
2008-02-09 01:22:25 +0800
3b0fd36dd 8250_hub6: codding style ... Browse Code »

Signed-off-by: Alan Cox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alan Cox
2008-02-09 01:22:25 +0800
0fe1c1371 8250_hp300: coding style ... Browse Code »

Signed-off-by: Alan Cox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alan Cox
2008-02-09 01:22:25 +0800
8b4a483be 8250_gsc: coding style ... Browse Code »

Signed-off-by: Alan Cox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alan Cox
2008-02-09 01:22:25 +0800
ce2e204f0 8250_early: coding style ... Browse Code »

Signed-off-by: Alan Cox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alan Cox
2008-02-09 01:22:25 +0800
355d95a1c tty_ioctl: drag screaming into compliance with the coding style ... Browse Code »

Signed-off-by: Alan Cox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alan Cox
2008-02-09 01:22:25 +0800
37bdfb074 tty_io: drag screaming into coding style compliance ... Browse Code »

Signed-off-by: Alan Cox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alan Cox
2008-02-09 01:22:25 +0800
66c6ceae3 tty_audit: fix checkpatch complaint ... Browse Code »

Signed-off-by: Alan Cox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alan Cox
2008-02-09 01:22:25 +0800
4129a6454 rocket: don't let random users reset the controller ... Browse Code »

Signed-off-by: Alan Cox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alan Cox
2008-02-09 01:22:25 +0800
6df3526b6 rocket: first pass at termios reporting ... Browse Code »

Also removes a cflag comparison that caused some mode changes to get wrongly
ignored

Signed-off-by: Alan Cox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alan Cox
2008-02-09 01:22:25 +0800
4edf1827e n_tty: clean up old code to follow coding style and (mostly) checkpatch ... Browse Code »

Signed-off-by: Alan Cox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alan Cox
2008-02-09 01:22:25 +0800
db1acaa63 moxa: first pass at termios reporting ... Browse Code »

Signed-off-by: Alan Cox
Cc: Jiri Slaby
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alan Cox
2008-02-09 01:22:24 +0800
92dfc9dc7 fix "modules: make module_address_lookup() safe" ... Browse Code »

Get the constness right, avoid nasty cast.

Cc: Ingo Molnar
Cc: Kyle McMartin
Cc: Rusty Russell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew Morton
2008-02-09 01:22:24 +0800
6d7623943 modules: include sections.h to avoid defining linker variables explicitly ... Browse Code »

module.c should not define linker variables on its own. We have an include
file for that.

Signed-off-by: Christoph Lameter
Cc: Rusty Russell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christoph Lameter
2008-02-09 01:22:24 +0800
88173507e Modules: handle symbols that have a zero value ... Browse Code »

The module subsystem cannot handle symbols that are zero. If symbols are
present that have a zero value then the module resolver prints out a
message that these symbols are unresolved.

[akinobu.mita@gmail.com: fix __find_symbl() error checks]
Cc: Mathieu Desnoyers
Cc: Kay Sievers
Cc: Rusty Russell
Cc: Andi Kleen
Signed-off-by: Akinobu Mita
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christoph Lameter
2008-02-09 01:22:24 +0800