Eric Lee / smarc-ti-linux-kernel | Embedian Git Server

29 Mar, 2007

1 commit

14e9d5730 [PATCH] pid: Properly detect orphaned process groups in exit_notify

In commit 0475ac0845f9295bc5f69af45f58dff2c104c8d1 when converting the
orphaned process group handling to use struct pid I made a small
mistake. I accidentally replaced an == with a !=.

Besides just being a dumb thing to do, apparently this has a bad side
effect. The improper orphaned process group detection causes kwin to
die after a suspend/resume cycle.

I'm amazed this patch has been around as long as it has without anyone
else noticing something funny going on.

And the following people deserve credit for spotting and helping
to reproduce this.

Thanks to: Sid Boyce
Thanks to: "Michael Wu"

Signed-off-by: "Eric W. Biederman"
Signed-off-by: Linus Torvalds

Eric W. Biederman
2007-03-29 23:16:23 +0800
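
Note: as background for the check this fix restores, here is a minimal userspace sketch of the POSIX notion of an orphaned process group: a group is orphaned when no member has a parent that sits in a different process group of the same session. The `toy_task` type and `toy_pgrp_is_orphaned()` are hypothetical names used only for illustration. They are not the kernel's `task_struct` or its `exit_notify()` logic, and the sketch does not claim to show which comparison the commit changed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a task; not the kernel's task_struct. */
struct toy_task {
    int pgrp;                       /* process group id */
    int session;                    /* session id */
    const struct toy_task *parent;  /* NULL for the root of the toy tree */
};

/*
 * POSIX-style test: a process group is orphaned when no member has a
 * parent that is in a different process group of the same session.
 */
static bool toy_pgrp_is_orphaned(const struct toy_task *tasks, size_t n, int pgrp)
{
    for (size_t i = 0; i < n; i++) {
        const struct toy_task *p = &tasks[i];

        /* Skip tasks outside the group and tasks without a parent. */
        if (p->pgrp != pgrp || p->parent == NULL)
            continue;

        /* A parent outside the group but inside the session keeps it alive. */
        if (p->parent->pgrp != pgrp && p->parent->session == p->session)
            return false;
    }
    return true;
}
```

Inverting either comparison in the inner `if` changes which groups count as orphaned, which is the kind of slip the commit message describes.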

13 Feb, 2007

4 commits

3e7cd6c41 [PATCH] pid: replace is_orphaned_pgrp with is_current_pgrp_orphaned

Every call to is_orphaned_pgrp passed in process_group(current) which is racy
with respect to another thread changing our process group. It didn't bite us
because we were dealing with integers and the worst we would get would be a
stale answer.

In switching the checks to use struct pid to be a little more efficient and
prepare the way for pid namespaces this race became apparent.

So I simplified the calls to the more specialized is_current_pgrp_orphaned so
I didn't have to worry about making logic changes to avoid the race.

Signed-off-by: Eric W. Biederman
Cc: Alan Cox
Cc: Oleg Nesterov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Eric W. Biederman
2007-02-13 01:48:32 +0800

0475ac084 [PATCH] pid: use struct pid for talking about process groups in exitc

Modify has_stopped_jobs and will_become_orphan_pgrp to use struct pid based
process groups. This reduces the number of hash table lookups and paves
the way for multiple pid spaces.

Signed-off-by: Eric W. Biederman
Cc: Alan Cox
Cc: Oleg Nesterov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Eric W. Biederman
2007-02-13 01:48:32 +0800

04a2e6a5c [PATCH] pid: make session_of_pgrp use struct pid instead of pid_t

To properly implement a pid namespace I need to deal exclusively in terms of
struct pid, because pid_t values become ambiguous.

To this end session_of_pgrp is transformed to take and return a struct pid
pointer. To avoid the need to worry about reference counting I now require my
caller to hold the appropriate locks, leaving callers responsible for
increasing the reference count if they need access to the result outside of
the locks.

Since session_of_pgrp currently only has one caller and that caller only
tests the result for equality with another process group, the locking
change means I don't actually have to acquire the tasklist_lock at all.

tiocspgrp is also modified to take and release the lock. The logic there is a
little more complicated but nothing I won't need when I convert pgrp of a tty
to a struct pid pointer.

Signed-off-by: Eric W. Biederman
Cc: Alan Cox
Cc: Oleg Nesterov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Eric W. Biederman
2007-02-13 01:48:31 +0800

944be0b22 [PATCH] close_files(): add scheduling point

close_files() can sometimes take long enough to trigger the soft lockup
detector.

Cc: Eric Dumazet
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Ingo Molnar
2007-02-13 01:48:30 +0800

12 Feb, 2007

1 commit

72fd4a35a [PATCH] Numerous fixes to kernel-doc info in source files.

A variety of (mostly) innocuous fixes to the embedded kernel-doc content in
source files, including:

* make multi-line initial descriptions single line
* denote some function names, constants and structs as such
* change erroneous opening '/*' to '/**' in a few places
* reword some text for clarity

Signed-off-by: Robert P. J. Day
Cc: "Randy.Dunlap"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Robert P. J. Day
2007-02-12 02:51:32 +0800

31 Jan, 2007

3 commits

0f2452855 [PATCH] namespaces: fix task exit disaster

This is based on a patch by Eric W. Biederman, who pointed out that pid
namespaces are still fake, and we only have one ever active.

So for the time being, we can modify any code which could access
tsk->nsproxy->pid_ns during task exit to just use &init_pid_ns instead,
and move the exit_task_namespaces call in do_exit() back above
exit_notify(), so that an exiting nfs server has a valid tsk->sighand to
work with.

Long term, pulling pid_ns out of nsproxy might be the cleanest solution.

Signed-off-by: Eric W. Biederman

[ Eric's patch fixed to take care of free_pid() too ]

Signed-off-by: Serge E. Hallyn
Signed-off-by: Linus Torvalds

Serge E. Hallyn
2007-01-31 05:40:36 +0800

444f378b2 Revert "[PATCH] namespaces: fix exit race by splitting exit"

This reverts commit 7a238fcba0629b6f2edbcd37458bae56fcf36be5 in
preparation for a better and simpler fix proposed by Eric Biederman
(and fixed up by Serge Hallyn)

Acked-by: Serge E. Hallyn
Signed-off-by: Linus Torvalds

Linus Torvalds
2007-01-31 05:35:18 +0800

7a238fcba [PATCH] namespaces: fix exit race by splitting exit

Fix exit race by splitting the nsproxy putting into two pieces. First
piece reduces the nsproxy refcount. If we dropped the last reference, then
it puts the mnt_ns, and returns the nsproxy as a hint to the caller. Else
it returns NULL. The second piece of exiting task namespaces sets
tsk->nsproxy to NULL, and drops the references to other namespaces and
frees the nsproxy only if an nsproxy was passed in.

A little awkward and should probably be reworked, but hopefully it fixes
the NFS oops.

Signed-off-by: Serge E. Hallyn
Cc: Herbert Poetzl
Cc: Oleg Nesterov
Cc: "Eric W. Biederman"
Cc: Cedric Le Goater
Cc: Daniel Hokka Zakrisson
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Serge E. Hallyn
2007-01-31 00:26:44 +0800

01 Jan, 2007

1 commit

241ceee0b [PATCH] restore ->pdeath_signal behaviour

Commit b2b2cbc4b2a2f389442549399a993a8306420baf introduced a user-
visible change: ->pdeath_signal is sent only when the entire thread
group exits.

While this change is imho good, it may break things. So restore the
old behaviour for now.

Signed-off-by: Oleg Nesterov
To: Albert Cahalan
Cc: Eric W. Biederman
Cc: Andrew Morton
Cc: Linus Torvalds
Cc: Ingo Molnar
Cc: Qi Yong
Cc: Roland McGrath
Signed-off-by: Linus Torvalds

Oleg Nesterov
2007-01-01 06:41:18 +0800

23 Dec, 2006

2 commits

b2b2cbc4b [PATCH] Fix reparenting to the same thread group. (take 2)

This patch fixes the case when we reparent to a different thread in the
same thread group. This modifies the code so that we do not send
signals and do not change the signal to send to SIGCHLD unless we have
changed the thread group of our parents. It also suppresses sending
pdeath_sig in this case as well since the result of getppid doesn't
change.

Thanks to Oleg for spotting my bug of only fixing this for non-ptraced
tasks.

Signed-off-by: Eric W. Biederman
Cc: Mike Galbraith
Cc: Albert Cahalan
Cc: Andrew Morton
Cc: Roland McGrath
Cc: Ingo Molnar
Cc: Coywolf Qi Hunt
Acked-by: Oleg Nesterov
Signed-off-by: Linus Torvalds

Eric W. Biederman
2006-12-23 01:03:41 +0800

01b2d93ca [PATCH] fdtable: Provide free_fdtable() wrapper

Christoph Hellwig has expressed concerns that the recent fdtable changes
expose the details of the RCU methodology used to release no-longer-used
fdtable structures to the rest of the kernel. The trivial patch below
addresses these concerns by introducing the appropriate free_fdtable()
calls, which simply wrap the release RCU usage. Since free_fdtable() is a
one-liner, it makes sense to promote it to an inline helper.

Signed-off-by: Vadim Lobanov
Cc: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vadim Lobanov
2006-12-23 00:55:50 +0800

11 Dec, 2006

2 commits

4fd45812c [PATCH] fdtable: Remove the free_files field

An fdtable can either be embedded inside a files_struct or standalone (after
being expanded). When an fdtable is being discarded after all RCU references
to it have expired, we must either free it directly, in the standalone case,
or free the files_struct it is contained within, in the embedded case.

Currently the free_files field controls this behavior, but we can get rid of
it entirely, as all the necessary information is already recorded. We can
distinguish embedded and standalone fdtables using max_fds, and if it is
embedded we can divine the relevant files_struct using container_of().

Signed-off-by: Vadim Lobanov
Cc: Christoph Hellwig
Cc: Al Viro
Cc: Dipankar Sarma
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vadim Lobanov
2006-12-11 01:57:22 +0800
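
Note: the idea in the commit above, telling embedded and standalone tables apart by max_fds and recovering the enclosing structure with container_of(), can be sketched in userspace roughly as follows. All `toy_` names and the `TOY_DEFAULT_FDS` threshold are assumptions made for this illustration; the real kernel structures carry many more fields and the exact comparison used there is not shown in this log.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_DEFAULT_FDS 32  /* assumed capacity of the embedded table */

struct toy_fdtable {
    unsigned int max_fds;   /* capacity of this table */
};

struct toy_files_struct {
    int count;                  /* reference count, unused in this sketch */
    struct toy_fdtable fdtab;   /* the embedded table */
};

/* Userspace equivalent of the kernel's container_of() macro. */
#define toy_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Assumption: only an embedded table has the default capacity or less. */
static bool toy_fdtable_is_embedded(const struct toy_fdtable *fdt)
{
    return fdt->max_fds <= TOY_DEFAULT_FDS;
}

/* Return the enclosing files structure, or NULL for a standalone table. */
static struct toy_files_struct *toy_files_of(struct toy_fdtable *fdt)
{
    if (!toy_fdtable_is_embedded(fdt))
        return NULL;
    return toy_container_of(fdt, struct toy_files_struct, fdtab);
}
```

Because the capacity already encodes which case applies, no separate flag field is needed, which is the point the commit message makes about free_files.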

bbea9f696 [PATCH] fdtable: Make fdarray and fdsets equal in size

Currently, each fdtable supports three dynamically-sized arrays of data: the
fdarray and two fdsets. The code allows the number of fds supported by the
fdarray (fdtable->max_fds) to differ from the number of fds supported by each
of the fdsets (fdtable->max_fdset).

In practice, it is wasteful for these two sizes to differ: whenever we hit a
limit on the smaller-capacity structure, we will reallocate the entire fdtable
and all the dynamic arrays within it, so any delta in the memory used by the
larger-capacity structure will never be touched at all.

Rather than hogging this excess, we shouldn't even allocate it in the first
place, and keep the capacities of the fdarray and the fdsets equal. This
patch removes fdtable->max_fdset. As an added bonus, most of the supporting
code becomes simpler.

Signed-off-by: Vadim Lobanov
Cc: Christoph Hellwig
Cc: Al Viro
Cc: Dipankar Sarma
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vadim Lobanov
2006-12-11 01:57:22 +0800

09 Dec, 2006

7 commits

62dfb5541 [PATCH] session_of_pgrp: kill unnecessary do_each_task_pid(PIDTYPE_PGID)

All members of the process group have the same sid and it can't be == 0.

NOTE: this code (and a similar one in sys_setpgid) was needed because it
was possible to have ->session == 0. It's not possible any longer since

[PATCH] pidhash: don't use zero pids
Commit: c7c6464117a02b0d54feb4ebeca4db70fa493678

Signed-off-by: Oleg Nesterov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2006-12-09 00:28:52 +0800

84d737866 [PATCH] add child reaper to pid_namespace

Add a per pid_namespace child-reaper. This is needed so processes are reaped
within the same pid space and do not spill over to the parent pid space. It's
also needed so containers preserve the existing semantic that pid == 1 would reap
orphaned children.

This is based on Eric Biederman's patch: http://lkml.org/lkml/2006/2/6/285

Signed-off-by: Sukadev Bhattiprolu
Signed-off-by: Cedric Le Goater
Cc: Kirill Korotaev
Cc: Eric W. Biederman
Cc: Herbert Poetzl
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Sukadev Bhattiprolu
2006-12-09 00:28:52 +0800

6b3286ed1 [PATCH] rename struct namespace to struct mnt_namespace

Rename 'struct namespace' to 'struct mnt_namespace' to avoid confusion with
other namespaces being developed for the containers: pid, uts, ipc, etc.
'namespace' variables and attributes are also renamed to 'mnt_ns'.

Signed-off-by: Kirill Korotaev
Signed-off-by: Cedric Le Goater
Cc: Eric W. Biederman
Cc: Herbert Poetzl
Cc: Sukadev Bhattiprolu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Kirill Korotaev
2006-12-09 00:28:51 +0800

1ec320afd [PATCH] add process_session() helper routine: deprecate old field

Add an anonymous union and ((deprecated)) to catch direct usage of the
session field.

[akpm@osdl.org: fix various missed conversions]
[jdike@addtoit.com: fix UML bug]
Signed-off-by: Jeff Dike
Cc: Cedric Le Goater
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Cedric Le Goater
2006-12-09 00:28:51 +0800
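
Note: the technique named in the commit above, an anonymous union whose old member name is marked deprecated, can be sketched as below. This is a userspace illustration that assumes GCC or Clang for `__attribute__((deprecated))`. The struct name, the `session_nr` member and both helpers are hypothetical and do not reproduce the kernel's definitions.

```c
/* Hypothetical stand-in for the structure that holds the session id. */
struct toy_signal {
    /*
     * Both members share the same storage. Code that still touches the
     * old name gets a compile-time deprecation warning; code that goes
     * through the helpers below does not.
     */
    union {
        int session __attribute__((deprecated));
        int session_nr;
    };
};

/* Preferred accessor, playing the role of a process_session()-style helper. */
static inline int toy_process_session(const struct toy_signal *sig)
{
    return sig->session_nr;
}

/* Preferred setter; avoids the deprecated member name. */
static inline void toy_set_session(struct toy_signal *sig, int sid)
{
    sig->session_nr = sid;
}
```

The union keeps the structure layout unchanged while letting the compiler point out every remaining direct use of the old field.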

937949d9e [PATCH] add process_session() helper routine

Replace occurrences of task->signal->session by a new process_session() helper
routine.

It will be useful for pid namespaces to abstract the session pid number.

Signed-off-by: Cedric Le Goater
Cc: Kirill Korotaev
Cc: Eric W. Biederman
Cc: Herbert Poetzl
Cc: Sukadev Bhattiprolu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Cedric Le Goater
2006-12-09 00:28:51 +0800

ae424ae4b [PATCH] make set_special_pids() static

Make set_special_pids() static, the only caller is daemonize().

Signed-off-by: Oleg Nesterov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2006-12-09 00:28:38 +0800

24ec839c4 [PATCH] tty: ->signal->tty locking

Fix the locking of signal->tty.

Use ->sighand->siglock to protect ->signal->tty; this lock is already used
by most other members of ->signal/->sighand. And unless we are 'current'
or the tasklist_lock is held we need ->siglock to access ->signal anyway.

(NOTE: sys_unshare() is broken wrt ->sighand locking rules)

Note that tty_mutex is held over tty destruction, so while holding
tty_mutex any tty pointer remains valid. Otherwise the lifetime of ttys
is governed by their open file handles. This leaves some holes for tty
access from signal->tty (or any other non file related tty access).

It solves the tty SLAB scribbles we were seeing.

(NOTE: the change from group_send_sig_info to __group_send_sig_info needs to
be examined by someone familiar with the security framework, I think
it is safe given the SEND_SIG_PRIV from other __group_send_sig_info
invocations)

[schwidefsky@de.ibm.com: 3270 fix]
[akpm@osdl.org: various post-viro fixes]
Signed-off-by: Peter Zijlstra
Acked-by: Alan Cox
Cc: Oleg Nesterov
Cc: Prarit Bhargava
Cc: Chris Wright
Cc: Roland McGrath
Cc: Stephen Smalley
Cc: James Morris
Cc: "David S. Miller"
Cc: Jeff Dike
Cc: Martin Schwidefsky
Cc: Jan Kara
Signed-off-by: Martin Schwidefsky
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Peter Zijlstra
2006-12-09 00:28:38 +0800

08 Dec, 2006

1 commit

115085ea0 [PATCH] taskstats: cleanup do_exit() path

do_exit:
        taskstats_exit_alloc()
        ...
        taskstats_exit_send()
        taskstats_exit_free()

I think this is not good, let it be a single function exported to the core
kernel, taskstats_exit(), which does alloc + send + free itself.

Signed-off-by: Oleg Nesterov
Cc: Balbir Singh
Cc: Shailabh Nagar
Cc: Jay Lan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2006-12-08 00:39:34 +0800
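
Note: the shape of the cleanup described above, one entry point that allocates, sends and frees so the caller no longer sequences three calls, can be sketched in userspace as follows. The `toy_stats` type, the callback signature and `toy_copy_sender()` are assumptions for illustration only and are not the kernel's taskstats API.

```c
#include <stdlib.h>

/* Hypothetical per-task statistics record. */
struct toy_stats {
    int pid;
    long cpu_ticks;
};

/* Hypothetical delivery callback; returns 0 on success. */
typedef int (*toy_send_fn)(const struct toy_stats *st, void *ctx);

/* Example sender that copies the record into caller-provided storage. */
static int toy_copy_sender(const struct toy_stats *st, void *ctx)
{
    *(struct toy_stats *)ctx = *st;
    return 0;
}

/*
 * Single entry point: allocate the record, fill it, hand it to the
 * sender, then free it. Returns -1 on allocation failure, otherwise
 * whatever the sender returned.
 */
static int toy_taskstats_exit(int pid, long cpu_ticks, toy_send_fn send, void *ctx)
{
    struct toy_stats *st = malloc(sizeof(*st));
    int rc;

    if (st == NULL)
        return -1;

    st->pid = pid;
    st->cpu_ticks = cpu_ticks;

    rc = send(st, ctx);
    free(st);
    return rc;
}
```

Keeping allocation and release inside one function means the exit path cannot forget the free or send on a stale buffer.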

29 Oct, 2006

1 commit

093a8e8ae [PATCH] taskstats_tgid_free: fix usage

taskstats_tgid_free() is called on copy_process's error path. This is wrong.

IF (clone_flags & CLONE_THREAD)
        We should not clear ->signal->taskstats, current uses it,
        it probably has a valid accumulated info.
ELSE
        taskstats_tgid_init() set ->signal->taskstats = NULL,
        there is nothing to free.

Move the callsite to __exit_signal(). We don't need any locking, entire
thread group is exiting, nobody should have a reference to soon to be
released ->signal.

Signed-off-by: Oleg Nesterov
Cc: Shailabh Nagar
Cc: Balbir Singh
Cc: Jay Lan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2006-10-29 02:30:54 +0800

02 Oct, 2006

3 commits

fab413a33 [PATCH] namespaces: exit_task_namespaces() invalidates nsproxy

exit_task_namespaces() has replaced the former exit_namespace(). It
invalidates task->nsproxy and associated namespaces. This is an issue for
the (future) pid namespace which is required to be valid in exit_notify().

This patch moves exit_task_namespaces() after exit_notify() to keep nsproxy
valid.

Signed-off-by: Cedric Le Goater
Cc: Serge E. Hallyn
Cc: Kirill Korotaev
Cc: "Eric W. Biederman"
Cc: Herbert Poetzl
Cc: Andrey Savochkin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Cedric Le Goater
2006-10-02 22:57:21 +0800

1651e14e2 [PATCH] namespaces: incorporate fs namespace into nsproxy

This moves the mount namespace into the nsproxy. The mount namespace count
now refers to the number of nsproxies pointing to it, rather than the number of
tasks. As a result, the unshare_namespace() function in kernel/fork.c no
longer checks whether it is being shared.

Signed-off-by: Serge Hallyn
Cc: Kirill Korotaev
Cc: "Eric W. Biederman"
Cc: Herbert Poetzl
Cc: Andrey Savochkin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Serge E. Hallyn
2006-10-02 22:57:20 +0800

ab516013a [PATCH] namespaces: add nsproxy

This patch adds a nsproxy structure to the task struct. Later patches will
move the fs namespace pointer into this structure, and introduce a new utsname
namespace into the nsproxy.

The vserver and openvz functionality, then, would be implemented in large part
by virtualizing/isolating more and more resources into namespaces, each
contained in the nsproxy.

[akpm@osdl.org: build fix]
Signed-off-by: Serge Hallyn
Cc: Kirill Korotaev
Cc: "Eric W. Biederman"
Cc: Herbert Poetzl
Cc: Andrey Savochkin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Serge E. Hallyn
2006-10-02 22:57:20 +0800

01 Oct, 2006

2 commits

8f0ab5147 [PATCH] csa: convert CONFIG tag for extended accounting routines

There were a few accounting data/macros that are used in CSA but are #ifdef'ed
inside CONFIG_BSD_PROCESS_ACCT. This patch is to change those ifdef's from
CONFIG_BSD_PROCESS_ACCT to CONFIG_TASK_XACCT. A few defines are moved from
kernel/acct.c and include/linux/acct.h to kernel/tsacct.c and
include/linux/tsacct_kern.h.

Signed-off-by: Jay Lan
Cc: Shailabh Nagar
Cc: Balbir Singh
Cc: Jes Sorensen
Cc: Chris Sturtivant
Cc: Tony Ernst
Cc: Guillaume Thouvenin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jay Lan
2006-10-01 15:39:29 +0800

0d67a46df [PATCH] BLOCK: Remove duplicate declaration of exit_io_context() [try #6]

Remove the duplicate declaration of exit_io_context() from linux/sched.h.

Signed-Off-By: David Howells
Signed-off-by: Jens Axboe

David Howells
2006-10-01 02:31:20 +0800

30 Sep, 2006

8 commits

c394cc9fb [PATCH] introduce TASK_DEAD state

I am not sure about this patch, I am asking Ingo to take a decision.

task_struct->state == EXIT_DEAD is a very special case, to avoid confusion
it makes sense to introduce a new state, TASK_DEAD, while EXIT_DEAD should
live only in ->exit_state as documented in sched.h.

Note that this state is not visible to user-space, get_task_state() masks off
unsuitable states.

Signed-off-by: Oleg Nesterov
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2006-09-30 00:18:21 +0800

55a101f8f [PATCH] kill PF_DEAD flag

After the previous change (->flags & PF_DEAD) <=> (->state == EXIT_DEAD), we
don't need PF_DEAD any longer.

Signed-off-by: Oleg Nesterov
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2006-09-30 00:18:20 +0800

29b884921 [PATCH] set EXIT_DEAD state in do_exit(), not in schedule()

schedule() checks PF_DEAD on every context switch and sets ->state = EXIT_DEAD
to ensure that the exiting task will be deactivated. Note that this EXIT_DEAD
is in fact a "random" value, we can use any bit except normal TASK_XXX values.

It is better to set this state in do_exit() along with PF_DEAD flag and remove
that check in schedule().

We are safe wrt concurrent try_to_wake_up() (for example ptrace, tkill), it
can not change task's ->state: the 'state' argument of try_to_wake_up() can't
have EXIT_DEAD bit. And in case when try_to_wake_up() sees a stale value of
->state == TASK_RUNNING it will do nothing.

Signed-off-by: Oleg Nesterov
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2006-09-30 00:18:20 +0800

1c573afeb [PATCH] reparent_to_init(): use has_rt_policy()

Remove open-coded has_rt_policy(), no changes in kernel/exit.o

Signed-off-by: Oleg Nesterov
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Steven Rostedt
Cc: Nick Piggin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2006-09-30 00:18:18 +0800

54306cf04 [PATCH] exit: fix crash case

If we are going to BUG() not panic() here then we should cover the case of
the BUG being compiled out

Signed-off-by: Alan Cox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alan Cox
2006-09-30 00:18:16 +0800

b9ecb2bd5 [PATCH] has_stopped_jobs() cleanup

This check has been obsolete since the introduction of TASK_TRACED. Now
TASK_STOPPED always means job control stop.

Signed-off-by: Roland McGrath
Cc: Oleg Nesterov
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Roland McGrath
2006-09-30 00:18:15 +0800

f400e198b [PATCH] pidspace: is_init()

This is an updated version of Eric Biederman's is_init() patch.
(http://lkml.org/lkml/2006/2/6/280). It applies cleanly to 2.6.18-rc3 and
replaces a few more instances of ->pid == 1 with is_init().

Further, is_init() checks pid and thus removes dependency on Eric's other
patches for now.

Eric's original description:

There are a lot of places in the kernel where we test for init
because we give it special properties. Most significantly init
must not die. This results in code all over the kernel testing
->pid == 1.

Introduce is_init to capture this case.

With multiple pid spaces for all of the cases affected we are
looking for only the first process on the system, not some other
process that has pid == 1.

Signed-off-by: Eric W. Biederman
Signed-off-by: Sukadev Bhattiprolu
Cc: Dave Hansen
Cc: Serge Hallyn
Cc: Cedric Le Goater
Acked-by: Paul Mackerras
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Sukadev Bhattiprolu
2006-09-30 00:18:12 +0800
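
Note: the refactoring described above, replacing scattered `->pid == 1` tests with one named predicate, looks roughly like this in a userspace sketch. `toy_proc` and `toy_is_init()` are hypothetical names. As the message says, the helper at this stage still just checks the pid, so the sketch does the same.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a task; only the field the check needs. */
struct toy_proc {
    int pid;
};

/*
 * One place that defines what "is the init process" means. Callers stop
 * open-coding the comparison, so the definition can change later
 * (for example once multiple pid spaces exist) without touching them.
 */
static inline bool toy_is_init(const struct toy_proc *p)
{
    return p->pid == 1;
}
```

A caller then writes `if (toy_is_init(p))` instead of comparing the pid directly, which keeps the intent visible at each call site.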

3b9b8ab65 [PATCH] Fix unserialized task->files changing

Fixed race on put_files_struct on exec with proc. Restoring files on
current on error path may lead to proc having a pointer to already kfree-d
files_struct.

The ->files changes in exit.c and kthread.c are safe because exit_files() does
everything under the lock.

Found during OpenVZ stress testing.

[akpm@osdl.org: add export]
Signed-off-by: Pavel Emelianov
Signed-off-by: Kirill Korotaev
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Kirill Korotaev
2006-09-30 00:18:12 +0800

03 Sep, 2006

1 commit

3b6362b83 [PATCH] eligible_child: remove an obsolete ->tgid check

It is not possible to find a sub-thread in ->children/->ptrace_children
lists, ptrace_attach() does not allow attaching to sub-threads.

Even if it was possible to ptrace the task from the same thread group,
we can't allow releasing ->group_leader while there are other (ptracer)
threads in the same group.

Signed-off-by: Oleg Nesterov
Signed-off-by: Linus Torvalds

Oleg Nesterov
2006-09-03 05:51:27 +0800

02 Sep, 2006

1 commit

35df17c57 [PATCH] task delay accounting fixes

Clean up allocation and freeing of tsk->delays used by delay accounting.
This solves two problems reported for delay accounting:

1. oops in __delayacct_blkio_ticks
http://www.uwsg.indiana.edu/hypermail/linux/kernel/0608.2/1844.html

Currently tsk->delays is getting freed too early in task exit which can
cause a NULL tsk->delays to get accessed via reading of /proc//stats.
The patch fixes this problem by freeing tsk->delays closer to when
task_struct itself is freed up. As a result, it also eliminates the use of
tsk->delays_lock which was only being used (inadequately) to safeguard
access to tsk->delays while a task was exiting.

2. Possible memory leak in kernel/delayacct.c
http://www.uwsg.indiana.edu/hypermail/linux/kernel/0608.2/1389.html

The patch cleans up tsk->delays allocations after a bad fork which was
missing earlier.

The patch has been tested to fix the problems listed above and stress
tested with rapid calls to delay accounting's taskstats command interface
(which is the other path that can access the same data, besides the /proc
interface causing the oops above).

Signed-off-by: Shailabh Nagar
Cc: Balbir Singh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Shailabh Nagar
2006-09-02 02:39:08 +0800

15 Jul, 2006

2 commits

f9fd8914c [PATCH] per-task delay accounting taskstats interface: control exit data through cpumasks

On systems with a large number of cpus, with even a modest rate of tasks
exiting per cpu, the volume of taskstats data sent on thread exit can
overflow a userspace listener's buffers.

One approach to avoiding overflow is to allow listeners to get data for a
limited and specific set of cpus. By scaling the number of listeners
and/or the cpus they monitor, userspace can handle the statistical data
overload more gracefully.

In this patch, each listener registers to listen to a specific set of cpus
by specifying a cpumask. The interest is recorded per-cpu. When a task
exits on a cpu, its taskstats data is unicast to each listener interested
in that cpu.

Thanks to Andrew Morton for pointing out the various scalability and
general concerns of previous attempts and for suggesting this design.

[akpm@osdl.org: build fix]
Signed-off-by: Shailabh Nagar
Signed-off-by: Balbir Singh
Signed-off-by: Chandra Seetharaman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Shailabh Nagar
2006-07-15 12:53:57 +0800
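
Note: the registration scheme described above, where each listener supplies a cpumask and interest is recorded per cpu, can be sketched in userspace as below. The fixed sizes, the integer listener ids and every `toy_` name are assumptions for illustration. The real interface delivers netlink messages, which this sketch does not attempt to model.

```c
#include <stdint.h>

#define TOY_NR_CPUS       8  /* assumed number of cpus */
#define TOY_MAX_LISTENERS 4  /* assumed per-cpu listener limit */

/* Per-cpu lists of listener ids. */
struct toy_listeners {
    int ids[TOY_NR_CPUS][TOY_MAX_LISTENERS];
    int count[TOY_NR_CPUS];
};

/*
 * Record listener `id` on every cpu whose bit is set in `cpumask`.
 * Returns -1 if some cpu's list is full; earlier cpus stay registered
 * in that case, which a real implementation would need to undo.
 */
static int toy_register(struct toy_listeners *l, int id, uint32_t cpumask)
{
    for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
        if (!(cpumask & (UINT32_C(1) << cpu)))
            continue;
        if (l->count[cpu] == TOY_MAX_LISTENERS)
            return -1;
        l->ids[cpu][l->count[cpu]++] = id;
    }
    return 0;
}

/*
 * On a task exit on `cpu`, copy the ids of the interested listeners
 * into `out` (which must hold TOY_MAX_LISTENERS entries) and return
 * how many there are.
 */
static int toy_recipients(const struct toy_listeners *l, int cpu, int *out)
{
    for (int i = 0; i < l->count[cpu]; i++)
        out[i] = l->ids[cpu][i];
    return l->count[cpu];
}
```

Since the lookup on exit touches only the list of the exiting cpu, a listener that narrows its mask receives proportionally less data.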

ad4ecbcba [PATCH] delay accounting taskstats interface send tgid once

Send per-tgid data only once during exit of a thread group instead of once
with each member thread exit.

Currently, when a thread exits, besides its per-tid data, the per-tgid data
of its thread group is also sent out, if its thread group is non-empty.
The per-tgid data sent consists of the sum of per-tid stats for all
*remaining* threads of the thread group.

This patch modifies this sending in two ways:

- the per-tgid data is sent only when the last thread of a thread group
exits. This cuts down heavily on the overhead of sending/receiving
per-tgid data, especially when other exploiters of the taskstats
interface aren't interested in per-tgid stats

- the semantics of the per-tgid data sent are changed. Instead of being
the sum of per-tid data for remaining threads, the value now sent is the
true total accumulated statistics for all threads that are/were part of
the thread group.

The patch also addresses a minor issue where failure of one accounting
subsystem to fill in the taskstats structure was causing the
taskstats not to be sent at all.

The patch has been tested for stability and run cerberus for over 4 hours
on an SMP.

[akpm@osdl.org: bugfixes]
Signed-off-by: Shailabh Nagar
Signed-off-by: Balbir Singh
Cc: Jay Lan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Shailabh Nagar
2006-07-15 12:53:57 +0800
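
Note: the two changes described above, accumulating every exiting thread's statistics into a group total and emitting that total only when the last thread exits, can be sketched in userspace as follows. The `toy_group` type, its fields and `toy_thread_exit()` are hypothetical. Locking, which a real thread group would need, is left out of the sketch.

```c
/* Hypothetical per-thread-group accumulator. */
struct toy_group {
    int live_threads;   /* threads that have not exited yet */
    long total_ticks;   /* running total over all exited threads */
};

/*
 * Account one exiting thread. The thread's own ticks are always folded
 * into the group total. Only the last exiting thread reports: in that
 * case the full total is written to *out and 1 is returned, otherwise
 * *out is left untouched and 0 is returned.
 */
static int toy_thread_exit(struct toy_group *g, long thread_ticks, long *out)
{
    g->total_ticks += thread_ticks;
    g->live_threads--;

    if (g->live_threads > 0)
        return 0;

    *out = g->total_ticks;
    return 1;
}
```

With three threads contributing 5, 7 and 1 ticks, the first two exits report nothing and the third reports 13, the true total rather than a sum over the remaining threads.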