31 Oct, 2005

4 commits

  • de_thread() sends SIGKILL to all sub-threads and waits for them to die in
    'D' state. It is possible that one of the threads has already dequeued the
    coredump signal. When de_thread() unlocks ->sighand->siglock, that thread
    can enter do_coredump()->coredump_wait() and cause a deadlock.

    Signed-off-by: Oleg Nesterov
    Cc: Roland McGrath
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • This patch deletes pointless code from coredump_wait().

    1. It does a useless mm->core_waiters inc/dec under mm->mmap_sem; any
    changes to ->core_waiters have no effect until we drop ->mmap_sem anyway.

    2. It calls yield() for no apparent reason.

    Signed-off-by: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • de_thread() calls del_timer_sync(->real_timer) under ->sighand->siglock.
    This can deadlock: it_real_fn() sends a signal and needs this lock too.

    Also, delete unneeded ->real_timer.data assignment.
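
    A minimal sketch of the deadlock pattern (illustrative only, not the actual
    fs/exec.c code; the oldsighand/sig names are stand-ins):

        spin_lock_irq(&oldsighand->siglock);
        ...
        del_timer_sync(&sig->real_timer);
                /* must wait for a running it_real_fn() to return, but
                 * it_real_fn() is spinning on ->siglock, which we hold:
                 * neither side can make progress. */
        ...
        spin_unlock_irq(&oldsighand->siglock);

    The usual remedy for this pattern is to call del_timer_sync() outside the
    locked region, so the handler can always run to completion.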

    Signed-off-by: Oleg Nesterov
    Cc: Roland McGrath
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • Trivial, saves one 'if' branch in de_thread().

    Signed-off-by: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

30 Oct, 2005

3 commits

  • Second step in pushing down the page_table_lock. Remove the temporary
    bridging hack from __pud_alloc, __pmd_alloc, __pte_alloc: expect callers not
    to hold page_table_lock, whether it's on init_mm or a user mm; take
    page_table_lock internally to check if a racing task already allocated.

    Convert their callers from common code. But avoid coming back to change them
    again later: instead of moving the spin_lock(&mm->page_table_lock) down,
    switch over to new macros pte_alloc_map_lock and pte_unmap_unlock, which
    encapsulate the mapping+locking and unlocking+unmapping together, and in the
    end may use alternatives to the mm page_table_lock itself.

    These callers all hold mmap_sem (some exclusively, some not), so at no level
    can a page table be whipped away from beneath them; and pte_alloc uses the
    "atomic" pmd_present to test whether it needs to allocate. It appears that on
    all arches we can safely descend without page_table_lock.
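
    A sketch of the caller-side pattern these macros enable (hedged; variable
    names and the exact return convention are illustrative):

        spinlock_t *ptl;
        pte_t *pte;

        pte = pte_alloc_map_lock(mm, pmd, address, &ptl);
        if (!pte)
                return VM_FAULT_OOM;    /* allocation failed */
        /* ... examine or set the pte while it is mapped and locked ... */
        pte_unmap_unlock(pte, ptl);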

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • update_mem_hiwater has attracted various criticisms, in particular from those
    concerned with mm scalability. Originally it was called whenever rss or
    total_vm got raised. Then many of those callsites were replaced by a timer
    tick call from account_system_time. Now Frank van Maarseveen reports that
    approach to be inadequate as well. How about this? Works for Frank.

    Replace update_mem_hiwater, a poor combination of two unrelated ops, by macros
    update_hiwater_rss and update_hiwater_vm. Don't attempt to keep
    mm->hiwater_rss up to date at timer tick, nor every time we raise rss (usually
    by 1): those are hot paths. Do the opposite, update only when about to lower
    rss (usually by many), or just before final accounting in do_exit. Handle
    mm->hiwater_vm in the same way, though it's much less of an issue. Demand
    that whoever collects these hiwater statistics do the work of taking the
    maximum with rss or total_vm.
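
    The new helpers are small enough to sketch; something along these lines
    (a hedged sketch, not the literal patch), with collectors expected to take
    the maximum themselves:

        #define update_hiwater_rss(mm) do {                     \
                if ((mm)->hiwater_rss < get_mm_rss(mm))         \
                        (mm)->hiwater_rss = get_mm_rss(mm);     \
        } while (0)

        #define update_hiwater_vm(mm) do {                      \
                if ((mm)->hiwater_vm < (mm)->total_vm)          \
                        (mm)->hiwater_vm = (mm)->total_vm;      \
        } while (0)

        /* a collector (e.g. the /proc reader) does the final max: */
        hiwater_rss = max(mm->hiwater_rss, get_mm_rss(mm));
        hiwater_vm  = max(mm->hiwater_vm,  mm->total_vm);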

    And there has been no collector of these hiwater statistics in the tree. The
    new convention needs an example, so match Frank's usage by adding a VmPeak
    line above VmSize to /proc/<pid>/status, and also a VmHWM line above VmRSS
    (High-Water-Mark or High-Water-Memory).

    There was a particular anomaly during mremap move, that hiwater_vm might be
    captured too high. A fleeting such anomaly remains, but it's quickly
    corrected now, whereas before it would stick.

    What locking? None: if the app is racy then these statistics will be racy,
    it's not worth any overhead to make them exact. But whenever it suits,
    hiwater_vm is updated under exclusive mmap_sem, and hiwater_rss under
    page_table_lock (for now) or with preemption disabled (later on): without
    going to any trouble, minimize the time between reading current values and
    updating, to minimize those occasions when a racing thread bumps a count up
    and back down in between.

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • I was lazy when we added anon_rss, and chose to change as few places as
    possible. So currently each anonymous page has to be counted twice, in rss
    and in anon_rss. Which won't be so good if those are atomic counts in some
    configurations.

    Change that around: keep file_rss and anon_rss separately, and add them
    together (with get_mm_rss macro) when the total is needed - reading two
    atomics is much cheaper than updating two atomics. And update anon_rss
    upfront, typically in memory.c, not tucked away in page_add_anon_rmap.
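
    A sketch of the accounting change (hedged; the counter accessor names
    get_mm_counter/inc_mm_counter are assumed here, not quoted from the patch):

        /* two separate counters, summed only when the total is wanted: */
        #define get_mm_rss(mm)  (get_mm_counter(mm, file_rss) + \
                                 get_mm_counter(mm, anon_rss))

        /* hot paths update only the counter that actually changed,
         * e.g. when mapping an anonymous page: */
        inc_mm_counter(mm, anon_rss);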

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     

15 Sep, 2005

2 commits

  • Pavel Emelianov and Kirill Korotaev observe that fs and arch users of
    security_vm_enough_memory tend to forget to vm_unacct_memory when a
    failure occurs further down (typically in setup_arg_pages variants).

    These are all users of insert_vm_struct, and that reservation will only
    be unaccounted on exit if the vma is marked VM_ACCOUNT: which in some
    cases it is (hidden inside VM_STACK_FLAGS) and in some cases it isn't.

    So x86_64 32-bit and ppc64 vDSO ELFs have been leaking memory into
    Committed_AS each time they're run. But don't add VM_ACCOUNT to them,
    it's inappropriate to reserve against the very unlikely case that gdb
    be used to COW a vDSO page - we ought to do something about that in
    do_wp_page, but there are yet other inconsistencies to be resolved.

    The safe and economical way to fix this is to let insert_vm_struct do
    the security_vm_enough_memory check when it finds VM_ACCOUNT is set.
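
    In outline, the check moves into insert_vm_struct() itself (a hedged
    sketch of the idea, not the literal diff):

        /* inside insert_vm_struct(), before linking the vma: */
        if (vma->vm_flags & VM_ACCOUNT) {
                unsigned long npages =
                        (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
                if (security_vm_enough_memory(npages))
                        return -ENOMEM;
        }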

    And the MIPS irix_brk has been calling security_vm_enough_memory before
    calling do_brk which repeats it, doubly accounting and so also leaking.
    Remove that, and all the fs and arch calls to security_vm_enough_memory:
    give it a less misleading name later on.

    Signed-off-by: Hugh Dickins
    Signed-Off-By: Kirill Korotaev
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • It turns out that the BUG_ON() in fs/exec.c: de_thread() is unreliable
    and can trigger due to the test itself being racy.

    de_thread() does

        while (atomic_read(&sig->count) > count) {
        }
        .....
        .....
        BUG_ON(!thread_group_empty(current));

    but release_task does

        write_lock_irq(&tasklist_lock)
          __exit_signal
            (this is where atomic_dec(&sig->count) is run)
          __exit_sighand
          __unhash_process
            takes write lock on tasklist_lock
            remove itself out of PIDTYPE_TGID list
        write_unlock_irq(&tasklist_lock)

    so there's a clear (although small) window between the
    atomic_dec(&sig->count) and the actual PIDTYPE_TGID unhashing of the
    thread.

    And actually there is no need for all threads to have exited at this
    point, so we simply kill the BUG_ON.

    Big thanks to Marc Lehmann who provided the test-case.

    Fixes Bug 5170 (http://bugme.osdl.org/show_bug.cgi?id=5170)

    Signed-off-by: Alexander Nyberg
    Cc: Roland McGrath
    Cc: Andrew Morton
    Cc: Ingo Molnar
    Acked-by: Andi Kleen
    Signed-off-by: Linus Torvalds

    Alexander Nyberg
     

10 Sep, 2005

1 commit

  • In order for the RCU to work, the file table array, sets and their sizes must
    be updated atomically. Instead of ensuring this through too many memory
    barriers, we put the arrays and their sizes in a separate structure. This
    patch takes the first step of putting the file table elements in a separate
    structure fdtable that is embedded within files_struct. It also changes all
    the users to refer to the file table using files_fdtable() macro. Subsequent
    application of RCU becomes easier after this.
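
    Roughly, the split looks like this (fields abridged and hedged; only the
    fdtable/files_fdtable names are taken from the description above):

        struct fdtable {
                unsigned int max_fds;
                struct file **fd;       /* current fd array */
                fd_set *close_on_exec;
                fd_set *open_fds;
                /* sizes and sets live together so that a later patch can
                 * replace the whole structure atomically under RCU */
        };

        /* users go through the accessor instead of files->fd directly: */
        struct fdtable *fdt = files_fdtable(files);
        if (fd < fdt->max_fds)
                file = fdt->fd[fd];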

    Signed-off-by: Dipankar Sarma
    Signed-Off-By: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dipankar Sarma
     

13 Jul, 2005

1 commit

  • When a noninitial thread does exec, it becomes the new group leader. If
    there is an ITIMER_REAL timer running, it points at the old group leader, and
    when it fires it can follow a stale pointer. The timer data needs to be
    reset to point at the exec'ing thread that is becoming the group leader.
    This has to synchronize with any concurrent firing of the timer to make
    sure that it_real_fn can never run when the data points to a thread that
    might have been reaped already.
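
    In outline (a hedged sketch of the idea, not the literal patch; at the
    time, ->real_timer.data carried the task pointer that it_real_fn()
    dereferences):

        /* after the exec'ing thread has taken over as group leader: */
        del_timer_sync(&sig->real_timer);  /* wait out any in-flight it_real_fn() */
        sig->real_timer.data = (unsigned long) current;
        /* ... re-arm if the interval timer was still running ... */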

    Signed-off-by: Roland McGrath
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Roland McGrath
     

24 Jun, 2005

1 commit

  • Add a new `suid_dumpable' sysctl:

    This value can be used to query and set the core dump mode for setuid
    or otherwise protected/tainted binaries. The modes are

    0 - (default) - traditional behaviour. Any process which has changed
    privilege levels or is execute only will not be dumped

    1 - (debug) - all processes dump core when possible. The core dump is
    owned by the current user and no security is applied. This is intended
    for system debugging situations only. Ptrace is unchecked.

    2 - (suidsafe) - any binary which normally would not be dumped is dumped
    readable by root only. This allows the end user to remove such a dump but
    not access it directly. For security reasons core dumps in this mode will
    not overwrite one another or other files. This mode is appropriate when
    administrators are attempting to debug problems in a normal environment.
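
    For illustration, mode 2 might be applied roughly like this inside
    do_coredump() (a hedged sketch; the O_EXCL/fsuid details are assumptions
    matching the behaviour described above, not quoted code):

        if (current->mm->dumpable == 2) {       /* suidsafe */
                flag = O_EXCL;          /* never overwrite an existing dump */
                current->fsuid = 0;     /* core file owned/readable by root */
        }

    The knob itself would typically be reachable as /proc/sys/fs/suid_dumpable
    (sysctl fs.suid_dumpable).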

    (akpm:

    > > +EXPORT_SYMBOL(suid_dumpable);
    >
    > EXPORT_SYMBOL_GPL?

    No problem to me.

    > > if (current->euid == current->uid && current->egid == current->gid)
    > > current->mm->dumpable = 1;
    >
    > Should this be SUID_DUMP_USER?

    Actually the feedback I had from last time was that the SUID_ defines
    should go because it's clearer to follow the numbers. They can go
    everywhere (and there are lots of places where dumpable is tested/used
    as a bool in untouched code)

    > Maybe this should be renamed to `dump_policy' or something. Doing that
    > would help us catch any code which isn't using the #defines, too.

    Fair comment. The patch was designed to be easy to maintain for Red Hat
    rather than for merging. Changing that field would create a gigantic
    diff because it is used all over the place.

    )

    Signed-off-by: Alan Cox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alan Cox
     

17 Apr, 2005

1 commit

  • Initial git repository build. I'm not bothering with the full history,
    even though we have it. We can create a separate "historical" git
    archive of that later if we want to, and in the meantime it's about
    3.2GB when imported into git - space that would just make the early
    git days unnecessarily complicated, when we don't have a lot of good
    infrastructure for it.

    Let it rip!

    Linus Torvalds