Eric Lee / smarc-fsl-linux-kernel

20 Jul, 2011

1 commit

1b5d783c9 consolidate BINPRM_FLAGS_ENFORCE_NONDUMP handling

new helper: would_dump(bprm, file). Checks if we are allowed to
read the file and, if we are not, sets ENFORCE_NONDUMP. Exported,
used in places that previously open-coded the same logic.
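
As an illustration only, the helper's behaviour can be modelled in user space. The struct names, the `readable_by_caller` field and the flag value below are assumptions made for this sketch, not the kernel's definitions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag value; the kernel defines its own bit for this. */
#define BINPRM_FLAGS_ENFORCE_NONDUMP (1u << 0)

/* Hypothetical stand-ins for struct file and struct linux_binprm. */
struct file_model   { bool readable_by_caller; };
struct binprm_model { unsigned int interp_flags; };

/* If the caller may not read the file, mark the exec as non-dumpable. */
void would_dump(struct binprm_model *bprm, const struct file_model *file)
{
    assert(bprm != NULL && file != NULL);
    if (!file->readable_by_caller)
        bprm->interp_flags |= BINPRM_FLAGS_ENFORCE_NONDUMP;
}
```

The flag is only ever set, never cleared, so one unreadable file anywhere in the exec chain is enough to suppress the dump.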

Signed-off-by: Al Viro

Al Viro
2011-07-20 13:43:10 +0800

09 Apr, 2011

1 commit

0e028465d exec: unify do_execve/compat_do_execve code

Add the appropriate members into struct user_arg_ptr and teach
get_user_arg_ptr() to handle is_compat = T case correctly.

This allows us to remove the compat_do_execve() code from fs/compat.c
and reimplement compat_do_execve() as the trivial wrapper on top of
do_execve_common(is_compat => true).

In fact, this fixes another (minor) bug. "compat_uptr_t str" can
overflow after "str += len" in compat_copy_strings() if a 64bit
application execs via sys32_execve().
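
A rough user-space sketch of the idea: one argument-vector type that carries either native or 32-bit pointer values, with a single accessor that widens compat entries. The `_model` names and the use of plain integers in place of user pointers are assumptions for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: an argv array holding native or 32-bit pointer values. */
struct user_arg_ptr_model {
    bool is_compat;
    union {
        const uint64_t *native;  /* array of native-width pointer values */
        const uint32_t *compat;  /* array of 32-bit (compat) pointer values */
    } ptr;
};

/* Fetch the nr-th pointer value, widening compat entries to 64 bits. */
uint64_t get_user_arg_ptr_model(struct user_arg_ptr_model argv, size_t nr)
{
    if (argv.is_compat) {
        assert(argv.ptr.compat != NULL);
        return (uint64_t)argv.ptr.compat[nr];
    }
    assert(argv.ptr.native != NULL);
    return argv.ptr.native[nr];
}
```

Because every later step works on the widened 64-bit value, arithmetic such as "str += len" can no longer wrap at 32 bits in this model.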

Unexport acct_arg_size() and get_arg_page(), fs/compat.c doesn't
need them any longer.

Signed-off-by: Oleg Nesterov
Reviewed-by: KOSAKI Motohiro
Tested-by: KOSAKI Motohiro

Oleg Nesterov
2011-04-09 21:53:56 +0800

14 Jan, 2011

1 commit

f670d0ecd binfmt_elf: cleanups

This cleans up a few bits in binfmt_elf.c and binfmts.h:

- the hasvdso field in struct linux_binfmt is unused, so remove it and
the only initialization of it

- the elf_map CPP symbol is not defined anywhere in the kernel, so
remove an unnecessary #ifndef elf_map

- reduce excessive indentation in elf_format's initializer

- add missing spaces, remove extraneous spaces

No functional changes, but tested on x86 (32 and 64 bit), powerpc (32 and
64 bit), sparc64, arm, and alpha.

Signed-off-by: Mikael Pettersson
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Mikael Pettersson
2011-01-14 00:03:12 +0800

01 Dec, 2010

2 commits

114279be2 exec: copy-and-paste the fixes into compat_do_execve() paths

Note: this patch targets 2.6.37 and tries to be as simple as possible.
That is why it adds more copy-and-paste horror into fs/compat.c and
uglifies fs/exec.c; this will be cleaned up later.

compat_copy_strings() plays with bprm->vma/mm directly and thus has
two problems: it lacks the RLIMIT_STACK check and argv/envp memory
is not visible to oom killer.

Export acct_arg_size() and get_arg_page(), change compat_copy_strings()
to use get_arg_page(), change compat_do_execve() to do acct_arg_size(0)
as do_execve() does.

Add the fatal_signal_pending/cond_resched checks into compat_count() and
compat_copy_strings(), this matches the code in fs/exec.c and certainly
makes sense.

Signed-off-by: Oleg Nesterov
Cc: KOSAKI Motohiro
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds

Oleg Nesterov
2010-12-01 09:56:38 +0800

3c77f8457 exec: make argv/envp memory visible to oom-killer

Brad Spengler published a local memory-allocation DoS that
evades the OOM-killer (though not the virtual memory RLIMIT):
http://www.grsecurity.net/~spender/64bit_dos.c

execve()->copy_strings() can allocate a lot of memory, but
this is not visible to oom-killer, nobody can see the nascent
bprm->mm and take it into account.

With this patch get_arg_page() increments current's MM_ANONPAGES
counter every time we allocate the new page for argv/envp. When
do_execve() succeeds or fails, we change this counter back.

Technically this is not 100% correct, we can't know if the new
page is swapped out and turn MM_ANONPAGES into MM_SWAPENTS, but
I don't think this really matters and everything becomes correct
once exec changes ->mm or fails.
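
The accounting can be sketched as a delta applied to the current task's counter. This is a simplified user-space model; the struct layouts are assumptions, and `anon_pages` merely stands in for the MM_ANONPAGES counter:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the two counters involved. */
struct mm_model     { long anon_pages; };          /* stands in for MM_ANONPAGES */
struct binprm_model { unsigned long vma_pages; };  /* pages already charged */

/* Charge (or uncharge) the difference between the old and new argv/envp size. */
void acct_arg_size(struct mm_model *cur, struct binprm_model *bprm,
                   unsigned long pages)
{
    long diff;

    assert(cur != NULL && bprm != NULL);
    diff = (long)pages - (long)bprm->vma_pages;
    if (diff == 0)
        return;
    bprm->vma_pages = pages;
    cur->anon_pages += diff;
}
```

Calling it with `pages == 0` at the end of exec returns the counter to its original value, whether the exec succeeded or failed.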

Reported-by: Brad Spengler
Reviewed-and-discussed-by: KOSAKI Motohiro
Signed-off-by: Oleg Nesterov
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds

Oleg Nesterov
2010-12-01 09:56:37 +0800

18 Aug, 2010

1 commit

d7627467b Make do_execve() take a const filename pointer

Make do_execve() take a const filename pointer so that kernel_execve() compiles
correctly on ARM:

arch/arm/kernel/sys_arm.c:88: warning: passing argument 1 of 'do_execve' discards qualifiers from pointer target type

This also requires the argv and envp arguments to be consted twice, once for
the pointer array and once for the strings the array points to. This is
because do_execve() passes a pointer to the filename (now const) to
copy_strings_kernel(). A simpler alternative would be to cast the filename
pointer in do_execve() when it's passed to copy_strings_kernel().

do_execve() may not change any of the strings it is passed as part of the argv
or envp lists as some of them are in .rodata, so marking these strings as
const should be fine.
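
To illustrate what "consted twice" means, here is a small stand-alone function (the name `count_args` is made up for this sketch) that can read a NULL-terminated vector but can modify neither the array slots nor the strings:

```c
#include <assert.h>
#include <stddef.h>

/*
 * const char *const *argv:
 *   - the first const protects the characters of each string,
 *   - the second const protects the pointer slots of the array.
 */
size_t count_args(const char *const *argv)
{
    size_t n = 0;

    assert(argv != NULL);
    while (argv[n] != NULL)
        n++;
    return n;
}
```

With this signature, a vector of string literals placed in read-only data can be passed without a cast.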

Further kernel_execve() and sys_execve() need to be changed to match.

This has been test built on x86_64, frv, arm and mips.

Signed-off-by: David Howells
Tested-by: Ralf Baechle
Acked-by: Russell King
Signed-off-by: Linus Torvalds

David Howells
2010-08-18 09:07:43 +0800

07 Mar, 2010

1 commit

30736a4d4 coredump: pass mm->flags as a coredump parameter for consistency

Pass mm->flags as a coredump parameter for consistency.

---
if (mm->core_state || !get_dumpable(mm)) {			<- (1)
	up_write(&mm->mmap_sem);
	put_cred(cred);
	goto fail;
}

[...]
if (get_dumpable(mm) == 2) {	/* Setuid core dump mode */	<- (2)
	flag = O_EXCL;		/* Stop rewrite attacks */
	cred->fsuid = 0;	/* Dump root private */
}
---

Since the dumpable bits are not protected by a lock, they can change
between (1) and (2).

To solve this issue, this patch copies mm->flags to
coredump_params.mm_flags at the beginning of do_coredump() and uses it
instead of get_dumpable() while dumping core.

This copy is also passed to binfmt->core_dump, since elf*_core_dump() uses
dump_filter bits in mm->flags.
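
The snapshot idea can be sketched as follows. The structs are user-space stand-ins, and treating the low two bits as the dumpable mode is a simplification assumed for this example:

```c
#include <assert.h>
#include <stddef.h>

#define DUMPABLE_MASK 0x3ul  /* illustrative: low two bits hold the dumpable mode */

/* Hypothetical stand-ins for mm_struct and struct coredump_params. */
struct mm_model              { unsigned long flags; };
struct coredump_params_model { unsigned long mm_flags; };

/* Copy the flags once, at the start of the dump. */
void coredump_snapshot(struct coredump_params_model *cprm,
                       const struct mm_model *mm)
{
    assert(cprm != NULL && mm != NULL);
    cprm->mm_flags = mm->flags;
}

/* Every later check reads the snapshot, never the live flags. */
unsigned long snapshot_dumpable(const struct coredump_params_model *cprm)
{
    return cprm->mm_flags & DUMPABLE_MASK;
}
```

All decisions made during one dump then agree with each other, even if another thread changes the live flags in the meantime.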

[akpm@linux-foundation.org: fix merge]
Signed-off-by: Masami Hiramatsu
Acked-by: Roland McGrath
Cc: Hidehiro Kawai
Cc: Oleg Nesterov
Cc: Ingo Molnar
Reviewed-by: KOSAKI Motohiro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Masami Hiramatsu
2010-03-07 03:26:46 +0800

30 Jan, 2010

1 commit

221af7f87 Split 'flush_old_exec' into two functions

'flush_old_exec()' is the point of no return when doing an execve(), and
it is pretty badly misnamed. It doesn't just flush the old executable
environment, it also starts up the new one.

Which is very inconvenient for things like setting up the new
personality, because we want the new personality to affect the starting
of the new environment, but at the same time we do _not_ want the new
personality to take effect if flushing the old one fails.

As a result, the x86-64 '32-bit' personality is actually done using this
insane "I'm going to change the ABI, but I haven't done it yet" bit
(TIF_ABI_PENDING), with SET_PERSONALITY() not actually setting the
personality, but just the "pending" bit, so that "flush_thread()" can do
the actual personality magic.

This patch in no way changes any of that insanity, but it does split the
'flush_old_exec()' function up into a preparatory part that can fail
(still called flush_old_exec()), and a new part that will actually set
up the new exec environment (setup_new_exec()). All callers are changed
to trivially comply with the new world order.
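
A toy model of the resulting call order, assuming made-up `_model` names and an explicit failure switch: the fallible step runs first, the personality is switched only after it succeeds, and the second step then starts the new environment under the new personality.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct task_model {
    int  personality;       /* personality currently in effect */
    int  env_personality;   /* personality the new environment was set up with */
    bool old_image_gone;    /* set once the point of no return is passed */
};

/* Preparatory step: may fail, and leaves the task untouched if it does. */
int flush_old_exec_model(struct task_model *t, bool simulate_failure)
{
    if (simulate_failure)
        return -1;
    t->old_image_gone = true;
    return 0;
}

/* Second step: cannot fail, runs only after the flush succeeded. */
void setup_new_exec_model(struct task_model *t)
{
    assert(t->old_image_gone);
    t->env_personality = t->personality;
}

/* How a binary loader would sequence the two halves. */
int load_binary_model(struct task_model *t, int new_personality,
                      bool simulate_failure)
{
    int ret = flush_old_exec_model(t, simulate_failure);

    if (ret)
        return ret;                    /* old personality still in effect */
    t->personality = new_personality;  /* safe: nothing can fail from here on */
    setup_new_exec_model(t);
    return 0;
}
```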

Signed-off-by: H. Peter Anvin
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds

Linus Torvalds
2010-01-30 00:22:01 +0800

18 Dec, 2009

1 commit

f6151dfea mm: introduce coredump parameter structure

Introduce coredump parameter data structure (struct coredump_params) to
simplify binfmt->core_dump() arguments.

Signed-off-by: Masami Hiramatsu
Suggested-by: Ingo Molnar
Cc: Hidehiro Kawai
Cc: Oleg Nesterov
Cc: Roland McGrath
Cc: KOSAKI Motohiro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Masami Hiramatsu
2009-12-18 07:45:31 +0800

24 Sep, 2009

1 commit

964ee7df9 exec: fix set_binfmt() vs sys_delete_module() race

sys_delete_module() can set MODULE_STATE_GOING after
search_binary_handler() does try_module_get(). In this case
set_binfmt()->try_module_get() fails but since none of the callers
check the returned error, the task will run with the wrong old
->binfmt.

The proper fix should change all ->load_binary() methods, but we can
rely on the fact that the caller must hold a reference to binfmt->module
and use __module_get() which never fails.
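
The difference between the two ways of taking a reference can be modelled like this. It is a simplified sketch with hypothetical `_model` names, using a plain counter where the kernel uses its own module refcounting:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct module_model {
    bool going;     /* models MODULE_STATE_GOING */
    int  refcount;
};

/* Can fail: refuses a new reference once the module is going away. */
bool try_module_get_model(struct module_model *m)
{
    if (m->going)
        return false;
    m->refcount++;
    return true;
}

/* Cannot fail, but is only valid while the caller already holds a reference. */
void module_get_model(struct module_model *m)
{
    assert(m->refcount > 0);
    m->refcount++;
}
```

Since the exec path already pinned the module before calling into the handler, the second form is safe there and leaves no error for callers to ignore.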

Signed-off-by: Oleg Nesterov
Acked-by: Rusty Russell
Cc: Hiroshi Shimamoto
Cc: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2009-09-24 22:21:01 +0800

06 Sep, 2009

1 commit

a2a8474c3 exec: do not sleep in TASK_TRACED under ->cred_guard_mutex

Tom Horsley reports that his debugger hangs when it tries to read
/proc/pid_of_tracee/maps, this happens since

"mm_for_maps: take ->cred_guard_mutex to fix the race with exec"
04b836cbf19e885f8366bccb2e4b0474346c02d

commit in 2.6.31.

But the root of the problem lies in the fact that do_execve() path calls
tracehook_report_exec() which can stop if the tracer sets PT_TRACE_EXEC.

The tracee must not sleep in TASK_TRACED holding this mutex. Even if we
remove ->cred_guard_mutex from mm_for_maps() and proc_pid_attr_write(),
another task doing PTRACE_ATTACH should not hang until it is killed or the
tracee resumes.

With this patch do_execve() does not use ->cred_guard_mutex directly and
we do not hold it throughout, instead:

- introduce prepare_bprm_creds() helper, it locks the mutex
and calls prepare_exec_creds() to initialize bprm->cred.

- install_exec_creds() drops the mutex after commit_creds(),
and thus before tracehook_report_exec()->ptrace_stop().

or, if exec fails,

free_bprm() drops this mutex when bprm->cred != NULL which
indicates install_exec_creds() was not called.
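
The locking protocol described above can be sketched in user space. A bool stands in for the mutex and an int for the credentials; all `_model` names are assumptions for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; a bool models the mutex state. */
struct task_model { bool cred_guard_locked; };

struct binprm_model {
    struct task_model *task;
    int  cred_storage;   /* placeholder for the credentials being built */
    int *cred;           /* non-NULL while new credentials are pending */
};

/* Lock the mutex and attach fresh credentials to bprm. */
void prepare_bprm_creds_model(struct binprm_model *bprm)
{
    assert(!bprm->task->cred_guard_locked);
    bprm->task->cred_guard_locked = true;
    bprm->cred = &bprm->cred_storage;
}

/* Success path: commit, then drop the mutex before any tracer stop. */
void install_exec_creds_model(struct binprm_model *bprm)
{
    bprm->cred = NULL;                      /* credentials are now committed */
    bprm->task->cred_guard_locked = false;
}

/* Cleanup: a pending cred means install was never reached, so unlock here. */
void free_bprm_model(struct binprm_model *bprm)
{
    if (bprm->cred != NULL) {
        bprm->task->cred_guard_locked = false;
        bprm->cred = NULL;
    }
}
```

Using the cred pointer as the "still locked" indicator means the cleanup function needs no extra state to tell the two paths apart.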

Reported-by: Tom Horsley
Signed-off-by: Oleg Nesterov
Acked-by: David Howells
Cc: Roland McGrath
Cc: James Morris
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2009-09-06 02:30:42 +0800

03 May, 2009

1 commit

74641f584 alpha: binfmt_aout fix

This fixes the problem introduced by commit 3bfacef412 (get rid of
special-casing the /sbin/loader on alpha): an osf/1 ecoff binary segfaults
when binfmt_aout is built as a module. That happens because the aout binary
handler gets on the top of the binfmt list due to late registration, and the
kernel attempts to execute the binary without the preparatory work that must
be done by binfmt_loader.

Fixed by changing the registration order of the default binfmt handlers
using list_add_tail() and introducing insert_binfmt() function which
places new handler on the top of the binfmt list. This might be generally
useful for installing arch-specific frontends for default handlers or just
for overriding them.
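
The ordering effect can be reproduced with a minimal circular list. This is a user-space sketch in the style of the kernel's list_head; `list_link` and the `_model` names are made up for the example:

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list in the style of list_head. */
struct list_head { struct list_head *next, *prev; };

void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Link entry between two neighbours that are currently adjacent. */
void list_link(struct list_head *entry, struct list_head *prev,
               struct list_head *next)
{
    assert(prev->next == next && next->prev == prev);
    next->prev = entry;
    entry->next = next;
    entry->prev = prev;
    prev->next = entry;
}

struct binfmt_model {
    struct list_head lh;
    const char *name;
};

/* Default handlers are appended at the tail ... */
void register_binfmt_model(struct list_head *formats, struct binfmt_model *fmt)
{
    list_link(&fmt->lh, formats->prev, formats);
}

/* ... while a front end that must be tried first goes to the head. */
void insert_binfmt_model(struct list_head *formats, struct binfmt_model *fmt)
{
    list_link(&fmt->lh, formats, formats->next);
}
```

With tail registration, a handler that registers late (for example from a module) ends up last rather than first, so it cannot shadow a front end inserted at the head.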

Signed-off-by: Ivan Kokshaysky
Cc: Al Viro
Cc: Richard Henderson
Signed-off-by: Linus Torvalds

Ivan Kokshaysky
2009-05-03 06:36:10 +0800

03 Apr, 2009

1 commit

a50b0aa4b struct linux_binprm: drop unused fields

Signed-off-by: Kirill A. Shutemov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Kirill A. Shutemov
2009-04-03 10:05:01 +0800

07 Jan, 2009

2 commits

8cd3ac3ac fs/exec.c: make do_coredump() void

No one cares about do_coredump()'s return value, and it does not seem
to be necessary either. So make it void.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: WANG Cong
Cc: Alexander Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

WANG Cong
2009-01-07 07:59:29 +0800

5cf0cc4e6 binfmts.h: include list.h

linux_binfmt uses list_head, so list.h is needed.

[akpm@linux-foundation.org: fix `make headerscheck']
Signed-off-by: Hiroshi Shimamoto
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Hiroshi Shimamoto
2009-01-07 07:59:19 +0800

14 Nov, 2008

1 commit

a6f76f23d CRED: Make execve() take advantage of copy-on-write credentials

Make execve() take advantage of copy-on-write credentials, allowing it to set
up the credentials in advance, and then commit the whole lot after the point
of no return.

This patch and the preceding patches have been tested with the LTP SELinux
testsuite.

This patch makes several logical sets of alteration:

(1) execve().

The credential bits from struct linux_binprm are, for the most part,
replaced with a single credentials pointer (bprm->cred). This means that
all the creds can be calculated in advance and then applied at the point
of no return with no possibility of failure.

I would like to replace bprm->cap_effective with:

cap_isclear(bprm->cap_effective)

but this seems impossible due to special behaviour for processes of pid 1
(they always retain their parent's capability masks where normally they'd
be changed - see cap_bprm_set_creds()).

The following sequence of events now happens:

(a) At the start of do_execve, the current task's cred_exec_mutex is
locked to prevent PTRACE_ATTACH from obsoleting the calculation of
creds that we make.

(a) prepare_exec_creds() is then called to make a copy of the current
task's credentials and prepare it. This copy is then assigned to
bprm->cred.

This renders security_bprm_alloc() and security_bprm_free()
unnecessary, and so they've been removed.

(b) The determination of unsafe execution is now performed immediately
after (a) rather than later on in the code. The result is stored in
bprm->unsafe for future reference.

(c) prepare_binprm() is called, possibly multiple times.

(i) This applies the result of set[ug]id binaries to the new creds
attached to bprm->cred. Personality bit clearance is recorded,
but now deferred on the basis that the exec procedure may yet
fail.

(ii) This then calls the new security_bprm_set_creds(). This should
calculate the new LSM and capability credentials into *bprm->cred.

This folds together security_bprm_set() and parts of
security_bprm_apply_creds() (these two have been removed).
Anything that might fail must be done at this point.

(iii) bprm->cred_prepared is set to 1.

bprm->cred_prepared is 0 on the first pass of the security
calculations, and 1 on all subsequent passes. This allows SELinux
in (ii) to base its calculations only on the initial script and
not on the interpreter.

(d) flush_old_exec() is called to commit the task to execution. This
performs the following steps with regard to credentials:

(i) Clear pdeath_signal and set dumpable under certain circumstances that
may not be covered by commit_creds().

(ii) Clear any bits in current->personality that were deferred from
(c.i).

(e) install_exec_creds() [compute_creds() as was] is called to install the
new credentials. This performs the following steps with regard to
credentials:

(i) Calls security_bprm_committing_creds() to apply any security
requirements, such as flushing unauthorised files in SELinux, that
must be done before the credentials are changed.

This is made up of bits of security_bprm_apply_creds() and
security_bprm_post_apply_creds(), both of which have been removed.
This function is not allowed to fail; anything that might fail
must have been done in (c.ii).

(ii) Calls commit_creds() to apply the new credentials in a single
assignment (more or less). Possibly pdeath_signal and dumpable
should be part of struct creds.

(iii) Unlocks the task's cred_replace_mutex, thus allowing
PTRACE_ATTACH to take place.

(iv) Clears the bprm->cred pointer as the credentials it was holding
are now immutable.

(v) Calls security_bprm_committed_creds() to apply any security
alterations that must be done after the creds have been changed.
SELinux uses this to flush signals and signal handlers.

(f) If an error occurs before (d.i), bprm_free() will call abort_creds()
to destroy the proposed new credentials and will then unlock
cred_replace_mutex. No changes to the credentials will have been
made.

(2) LSM interface.

A number of functions have been changed, added or removed:

(*) security_bprm_alloc(), ->bprm_alloc_security()
(*) security_bprm_free(), ->bprm_free_security()

Removed in favour of preparing new credentials and modifying those.

(*) security_bprm_apply_creds(), ->bprm_apply_creds()
(*) security_bprm_post_apply_creds(), ->bprm_post_apply_creds()

Removed; split between security_bprm_set_creds(),
security_bprm_committing_creds() and security_bprm_committed_creds().

(*) security_bprm_set(), ->bprm_set_security()

Removed; folded into security_bprm_set_creds().

(*) security_bprm_set_creds(), ->bprm_set_creds()

New. The new credentials in bprm->cred should be checked and set up
as appropriate. bprm->cred_prepared is 0 on the first call, 1 on the
second and subsequent calls.

(*) security_bprm_committing_creds(), ->bprm_committing_creds()
(*) security_bprm_committed_creds(), ->bprm_committed_creds()

New. Apply the security effects of the new credentials. This
includes closing unauthorised files in SELinux. This function may not
fail. When the former is called, the creds haven't yet been applied
to the process; when the latter is called, they have.

The former may access bprm->cred, the latter may not.

(3) SELinux.

SELinux has a number of changes, in addition to those to support the LSM
interface changes mentioned above:

(a) The bprm_security_struct struct has been removed in favour of using
the credentials-under-construction approach.

(c) flush_unauthorized_files() now takes a cred pointer and passes it on
to inode_has_perm(), file_has_perm() and dentry_open().

Signed-off-by: David Howells
Acked-by: James Morris
Acked-by: Serge Hallyn
Signed-off-by: James Morris

David Howells
2008-11-14 07:39:24 +0800

17 Oct, 2008

2 commits

bf2a9a396 Allow recursion in binfmt_script and binfmt_misc

binfmt_script and binfmt_misc disallow recursion to avoid stack overflow
using sh_bang and misc_bang. This causes problems in some cases:

$ echo '#!/bin/ls' > /tmp/t0
$ echo '#!/tmp/t0' > /tmp/t1
$ echo '#!/tmp/t1' > /tmp/t2
$ chmod +x /tmp/t*
$ /tmp/t2
zsh: exec format error: /tmp/t2

Similar problem with binfmt_misc.

This patch introduces field 'recursion_depth' into struct linux_binprm to
track recursion level in binfmt_misc and binfmt_script. If the recursion
level exceeds BINPRM_MAX_RECURSION, it generates -ENOEXEC.
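
A depth-limited interpreter chain might look like the following sketch. The limit value and the `hops_left` parameter are assumptions chosen for illustration; the real handlers resolve interpreters from file contents rather than from a counter:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define BINPRM_MAX_RECURSION 4   /* illustrative limit for this sketch */

struct binprm_model { unsigned int recursion_depth; };

/*
 * hops_left: interpreter indirections still to follow before a directly
 * executable file is reached (hypothetical stand-in for parsing "#!" lines).
 */
int search_binary_handler_model(struct binprm_model *bprm,
                                unsigned int hops_left)
{
    int ret;

    assert(bprm != NULL);
    if (hops_left == 0)
        return 0;                       /* reached a real binary */
    if (bprm->recursion_depth > BINPRM_MAX_RECURSION)
        return -ENOEXEC;                /* chain is too deep */

    bprm->recursion_depth++;
    ret = search_binary_handler_model(bprm, hops_left - 1);
    bprm->recursion_depth--;            /* restore on the way back */
    return ret;
}
```

A three-level chain such as /tmp/t2 -> /tmp/t1 -> /tmp/t0 -> /bin/ls passes in this model, while an endless loop is cut off with -ENOEXEC.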

[akpm@linux-foundation.org: make linux_binprm.recursion_depth a uint]
Signed-off-by: Kirill A. Shutemov
Cc: Pavel Emelyanov
Cc: Alexander Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Kirill A. Shutemov
2008-10-17 02:21:38 +0800

53112488b alpha: introduce field 'taso' into struct linux_binprm

This change is Alpha-specific. It adds field 'taso' into struct
linux_binprm to remember if the application is TASO. Previously, field
sh_bang was used for this purpose.

Signed-off-by: Kirill A. Shutemov
Cc: Richard Henderson
Cc: Ivan Kokshaysky
Cc: Pavel Emelyanov
Cc: Alexander Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Kirill A. Shutemov
2008-10-17 02:21:38 +0800

25 Jul, 2008

1 commit

5459c164f security: protect legacy applications from executing with insufficient privilege

When cap_bset suppresses some of the forced (fP) capabilities of a file,
it is generally only safe to execute the program if it understands how to
recognize it doesn't have enough privilege to work correctly. For legacy
applications (fE!=0), which have no non-destructive way to determine that
they are missing privilege, we fail to execute (EPERM) any executable that
requires fP capabilities, but would otherwise get pP' < fP. This is a
fail-safe permission check.
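
The check can be sketched with plain bitmasks. This assumes the usual rule pP' = (X & fP) | (pI & fI), where X is the bounding set; the function and type names are made up for the example:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int cap_set_model;   /* one bit per capability */

/*
 * bset: bounding set (X), fP/fI: file permitted/inheritable,
 * pI: process inheritable, legacy: true when the file's effective bit is set.
 * Stores the new permitted set in *new_pP and returns 0, or returns -EPERM
 * when a legacy application would lose part of its forced set.
 */
int bprm_caps_model(cap_set_model bset, cap_set_model fP, cap_set_model fI,
                    cap_set_model pI, bool legacy, cap_set_model *new_pP)
{
    cap_set_model pP = (bset & fP) | (pI & fI);

    assert(new_pP != NULL);
    if (legacy && (fP & ~pP) != 0)
        return -EPERM;        /* pP' < fP: refuse rather than run crippled */
    *new_pP = pP;
    return 0;
}
```

A capability-aware program (legacy == false) is still allowed to start with the reduced set, since it can inspect its own capabilities and react.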

For some discussion of why it is problematic for (legacy) privileged
applications to run with less than the set of capabilities requested for
them, see:

http://userweb.kernel.org/~morgan/sendmail-capabilities-war-story.html

With this iteration of this support, we do not include setuid-0 based
privilege protection from the bounding set. That is, the admin can still
(ab)use the bounding set to suppress the privileges of a setuid-0 program.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: cleanup]
Signed-off-by: Andrew G. Morgan
Acked-by: Serge Hallyn
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew G. Morgan
2008-07-25 01:47:22 +0800

17 May, 2008

1 commit

08a6fac1c [PATCH] get rid of leak in compat_execve()

Even though copy_compat_strings() doesn't cache the pages,
copy_strings_kernel() and stuff indirectly called by e.g.
->load_binary() is doing that, so we need to drop the
cache contents in the end.

[found by WANG Cong]

Signed-off-by: Al Viro

Al Viro
2008-05-17 05:23:05 +0800

29 Apr, 2008

2 commits

3a2e7f47d binfmt_misc.c: avoid potential kernel stack overflow

This can be triggered with root help only, but...

Register the ':text:E::txt::/root/cat.txt:' rule in binfmt_misc (by root) and
try launching the cat.txt file (by anyone) :) The result is - the endless
recursion in the load_misc_binary -> open_exec -> load_misc_binary chain and
stack overflow.

There's a similar problem with binfmt_script, and there's a sh_bang member on
linux_binprm structure to handle this, but simply raising this in binfmt_misc
may break some setups when the interpreter of some misc binaries is a script.

So the proposal is to turn sh_bang into a bit, add a new one (the misc_bang)
and raise it in load_misc_binary. After this, even if we set up the misc ->
script -> misc loop for binfmts one of them will step on its own bang and
exit.
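
The two-bit scheme can be modelled as below. The loader functions are hypothetical stand-ins that only perform the guard check, not any actual loading:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* One guard bit per handler type. */
struct binprm_model {
    unsigned int sh_bang : 1;
    unsigned int misc_bang : 1;
};

/* Script handler: refuses to run twice within the same exec. */
int load_script_model(struct binprm_model *bprm)
{
    assert(bprm != NULL);
    if (bprm->sh_bang)
        return -ENOEXEC;
    bprm->sh_bang = 1;
    return 0;
}

/* Misc handler: same guard, but with its own bit. */
int load_misc_binary_model(struct binprm_model *bprm)
{
    assert(bprm != NULL);
    if (bprm->misc_bang)
        return -ENOEXEC;
    bprm->misc_bang = 1;
    return 0;
}
```

Separate bits allow a misc binary whose interpreter is a script, yet a misc -> script -> misc loop still stops at the second misc step.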

Signed-off-by: Pavel Emelyanov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelyanov
2008-04-29 23:06:04 +0800

175a06ae3 exec: remove argv_len from struct linux_binprm

I noticed that 2.6.24.2 calculates bprm->argv_len at do_execve(). But it
doesn't update bprm->argv_len after "remove_arg_zero() +
copy_strings_kernel()" at load_script() etc.

audit_bprm() is called from search_binary_handler() and
search_binary_handler() is called from load_script() etc. Thus, I think the
condition check

if (bprm->argv_len > (audit_argv_kb << 10))
return -E2BIG;

in audit_bprm() might return wrong result when strlen(removed_arg) !=
strlen(spliced_args). Why not update bprm->argv_len at load_script() etc. ?

By the way, 2.6.25-rc3 seems not to be doing the condition check. Is the field
bprm->argv_len no longer needed?

Signed-off-by: Tetsuo Handa
Cc: Ollie Wild
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Tetsuo Handa
2008-04-29 23:06:03 +0800

17 Oct, 2007

4 commits

b53767719 Implement file posix capabilities

Implement file posix capabilities. This allows programs to be given a
subset of root's powers regardless of who runs them, without having to use
setuid and giving the binary all of root's powers.

This version works with Kaigai Kohei's userspace tools, found at
http://www.kaigai.gr.jp/index.php. For more information on how to use this
patch, Chris Friedhoff has posted a nice page at
http://www.friedhoff.org/fscaps.html.

Changelog:
Nov 27:
Incorporate fixes from Andrew Morton
(security-introduce-file-caps-tweaks and
security-introduce-file-caps-warning-fix)
Fix Kconfig dependency.
Fix change signaling behavior when file caps are not compiled in.

Nov 13:
Integrate comments from Alexey: Remove CONFIG_ ifdef from
capability.h, and use %zd for printing a size_t.

Nov 13:
Fix endianness warnings by sparse as suggested by Alexey
Dobriyan.

Nov 09:
Address warnings of unused variables at cap_bprm_set_security
when file capabilities are disabled, and simultaneously clean
up the code a little, by pulling the new code into a helper
function.

Nov 08:
For pointers to required userspace tools and how to use
them, see http://www.friedhoff.org/fscaps.html.

Nov 07:
Fix the calculation of the highest bit checked in
check_cap_sanity().

Nov 07:
Allow file caps to be enabled without CONFIG_SECURITY, since
capabilities are the default.
Hook cap_task_setscheduler when !CONFIG_SECURITY.
Move capable(TASK_KILL) to end of cap_task_kill to reduce
audit messages.

Nov 05:
Add secondary calls in selinux/hooks.c to task_setioprio and
task_setscheduler so that selinux and capabilities with file
cap support can be stacked.

Sep 05:
As Seth Arnold points out, uid checks are out of place
for capability code.

Sep 01:
Define task_setscheduler, task_setioprio, cap_task_kill, and
task_setnice to make sure a user cannot affect a process in which
they called a program with some fscaps.

One remaining question is the note under task_setscheduler: are we
ok with CAP_SYS_NICE being sufficient to confine a process to a
cpuset?

It is a semantic change, as without fscaps, attach_task doesn't
allow CAP_SYS_NICE to override the uid equivalence check. But since
it uses security_task_setscheduler, which elsewhere is used where
CAP_SYS_NICE can be used to override the uid equivalence check,
fixing it might be tough.

task_setscheduler
note: this also controls cpuset:attach_task. Are we ok with
CAP_SYS_NICE being used to confine to a cpuset?
task_setioprio
task_setnice
sys_setpriority uses this (through set_one_prio) for another
process. Need same checks as setrlimit

Aug 21:
Updated secureexec implementation to reflect the fact that
euid and uid might be the same and nonzero, but the process
might still have elevated caps.

Aug 15:
Handle endianness of xattrs.
Enforce capability version match between kernel and disk.
Enforce that no bits beyond the known max capability are
set, else return -EPERM.
With this extra processing, it may be worth reconsidering
doing all the work at bprm_set_security rather than
d_instantiate.

Aug 10:
Always call getxattr at bprm_set_security, rather than
caching it at d_instantiate.

[morgan@kernel.org: file-caps clean up for linux/capability.h]
[bunk@kernel.org: unexport cap_inode_killpriv]
Signed-off-by: Serge E. Hallyn
Cc: Stephen Smalley
Cc: James Morris
Cc: Chris Wright
Cc: Andrew Morgan
Signed-off-by: Andrew Morgan
Signed-off-by: Adrian Bunk
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Serge E. Hallyn
2007-10-17 23:43:07 +0800

7dc0b22e3 core_pattern: ignore RLIMIT_CORE if core_pattern is a pipe

For some time /proc/sys/kernel/core_pattern has been able to set its output
destination as a pipe, allowing a user space helper to receive and
intelligently process a core. This infrastructure however has some
shortcomings which can be addressed. Specifically:

1) The coredump code in the kernel should ignore RLIMIT_CORE limitation
when core_pattern is a pipe, since file system resources are not being
consumed in this case, unless the user application wishes to save the core,
at which point the app is restricted by usual file system limits and
restrictions.

2) The core_pattern code should be able to parse and pass options to the
user space helper as an argv array. The real core limit of the uid of the
crashing process should also be passable to the user space helper (since it
is overridden to zero when called).

3) Some miscellaneous bugs need to be cleaned up (specifically the
recognition of a recursive core dump, should the user mode helper itself
crash). Also, the core dump code in the kernel should not wait for the user
mode helper to exit, since the same context is responsible for writing to
the pipe, and a read of the pipe by the user mode helper will result in a
deadlock.

This patch:

Remove the check of RLIMIT_CORE if core_pattern is a pipe. In the event that
core_pattern is a pipe, the entire core will be fed to the user mode helper.
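
As a sketch of the decision, assuming that a pattern beginning with '|' denotes a pipe and using a made-up function name and parameters:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Decide whether the dump should be skipped because of RLIMIT_CORE.
 * core_limit: the crashing task's core size limit,
 * min_coredump: the smallest dump the binary format can usefully write.
 */
bool skip_core_dump(const char *core_pattern, unsigned long core_limit,
                    unsigned long min_coredump)
{
    bool ispipe;

    assert(core_pattern != NULL);
    ispipe = (core_pattern[0] == '|');
    if (ispipe)
        return false;                 /* helper gets the entire core */
    return core_limit < min_coredump; /* file case: honour the limit */
}
```

The helper program on the other end of the pipe remains free to apply its own size policy before saving anything to disk.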

Signed-off-by: Neil Horman
Cc: Jeremy Fitzhardinge
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Neil Horman
2007-10-17 23:42:50 +0800

f6b450d48 Make unregister_binfmt() return void

list_del() can hardly fail, so checking for return value is pointless
(and current code always returns 0).

Nobody really cared about that return value anyway.

Signed-off-by: Alexey Dobriyan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alexey Dobriyan
2007-10-17 23:42:46 +0800

e4dc1b14d Use list_head in binfmt handling

Switch single-linked binfmt formats list to usual list_head's. This leads
to one-liners in register_binfmt() and unregister_binfmt(). The downside
is one pointer more in struct linux_binfmt. This is not a problem, since
the set of registered binfmts on typical box is very small -- (ELF +
something distro enabled for you).

Test-booted, played with executable .txt files, modprobe/rmmod binfmt_misc.

Signed-off-by: Alexey Dobriyan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alexey Dobriyan
2007-10-17 23:42:46 +0800

20 Jul, 2007

2 commits

b6a2fea39 mm: variable length argument support

Remove the arg+env limit of MAX_ARG_PAGES by copying the strings directly from
the old mm into the new mm.

We create the new mm before the binfmt code runs, and place the new stack at
the very top of the address space. Once the binfmt code runs and figures out
where the stack should be, we move it downwards.

It is a bit peculiar in that we have one task with two mm's, one of which is
inactive.

[a.p.zijlstra@chello.nl: limit stack size]
Signed-off-by: Ollie Wild
Signed-off-by: Peter Zijlstra
Cc: Hugh Dickins
[bunk@stusta.de: unexport bprm_mm_init]
Signed-off-by: Adrian Bunk
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Ollie Wild
2007-07-20 01:04:45 +0800

bdf4c48af audit: rework execve audit

The purpose of audit_bprm() is to log the argv array to a userspace daemon at
the end of the execve system call. Since user-space hasn't had time to run,
this array is still in pristine state on the process' stack; so no need to
copy it, we can just grab it from there.

In order to minimize the damage to audit_log_*() copy each string into a
temporary kernel buffer first.

Currently the audit code requires that the full argument vector fits in a
single packet. So currently it does clip the argv size to a (sysctl) limit,
but only when execve auditing is enabled.

If the audit protocol gets extended to allow for multiple packets this check
can be removed.

Signed-off-by: Peter Zijlstra
Signed-off-by: Ollie Wild
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Peter Zijlstra
2007-07-20 01:04:45 +0800

17 May, 2007

1 commit

71ce92f3f make sysctl/kernel/core_pattern and fs/exec.c agree on maximum core filename size

Make sysctl/kernel/core_pattern and fs/exec.c agree on maximum core
filename size and change it to 128, so that extensive patterns such as
'/local/cores/%e-%h-%s-%t-%p.core' won't result in truncated filename
generation.

Signed-off-by: Dan Aloni
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Dan Aloni
2007-05-17 20:23:05 +0800

13 Feb, 2007

1 commit

9fbbd4dd1 [PATCH] x86: Don't require the vDSO for handling a.out signals

and in other strange binfmts. vDSO is not necessarily mapped there.

Signed-off-by: Andi Kleen

Andi Kleen
2007-02-13 20:26:26 +0800

24 Jun, 2005

1 commit

d6e711448 [PATCH] setuid core dump

Add a new `suid_dumpable' sysctl:

This value can be used to query and set the core dump mode for setuid
or otherwise protected/tainted binaries. The modes are

0 - (default) - traditional behaviour. Any process which has changed
privilege levels or is execute only will not be dumped

1 - (debug) - all processes dump core when possible. The core dump is
owned by the current user and no security is applied. This is intended
for system debugging situations only. Ptrace is unchecked.

2 - (suidsafe) - any binary which normally would not be dumped is dumped
readable by root only. This allows the end user to remove such a dump but
not access it directly. For security reasons core dumps in this mode will
not overwrite one another or other files. This mode is appropriate when
administrators are attempting to debug problems in a normal environment.
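
The three modes can be summarised as a small decision function. The enum names other than SUID_DUMP_USER, the struct and the `privileged` parameter are assumptions made for this sketch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mode values as described above; names partly assumed for illustration. */
enum suid_dump_mode {
    SUID_DUMP_DISABLE = 0,  /* default: no dump for privileged processes */
    SUID_DUMP_USER    = 1,  /* debug: always dump, owned by the current user */
    SUID_DUMP_ROOT    = 2   /* suidsafe: dump, but readable by root only */
};

struct dump_decision {
    bool dump;       /* write a core at all? */
    bool root_only;  /* restrict the file to root? */
};

/* privileged: the process changed privilege levels or is execute-only. */
struct dump_decision decide_dump(enum suid_dump_mode mode, bool privileged)
{
    struct dump_decision d = { true, false };

    if (!privileged)
        return d;                 /* ordinary process: normal user-owned core */
    switch (mode) {
    case SUID_DUMP_USER:
        break;                    /* dump as the current user, no protection */
    case SUID_DUMP_ROOT:
        d.root_only = true;       /* dump, but keep the contents from the user */
        break;
    case SUID_DUMP_DISABLE:
    default:
        d.dump = false;           /* traditional behaviour */
        break;
    }
    return d;
}
```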

(akpm:

> > +EXPORT_SYMBOL(suid_dumpable);
>
> EXPORT_SYMBOL_GPL?

No problem to me.

> > if (current->euid == current->uid && current->egid == current->gid)
> > current->mm->dumpable = 1;
>
> Should this be SUID_DUMP_USER?

Actually the feedback I had from last time was that the SUID_ defines
should go because it's clearer to follow the numbers. They can go
everywhere (and there are lots of places where dumpable is tested/used
as a bool in untouched code)

> Maybe this should be renamed to `dump_policy' or something. Doing that
> would help us catch any code which isn't using the #defines, too.

Fair comment. The patch was designed to be easy to maintain for Red Hat
rather than for merging. Changing that field would create a gigantic
diff because it is used all over the place.

)

Signed-off-by: Alan Cox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alan Cox
2005-06-24 00:45:26 +0800

06 May, 2005

1 commit

75c96f858 [PATCH] make some things static

This patch makes some needlessly global identifiers static.

Signed-off-by: Adrian Bunk
Acked-by: Arjan van de Ven
Acked-by: Trond Myklebust
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Adrian Bunk
2005-05-06 07:36:47 +0800

17 Apr, 2005

1 commit

1da177e4c Linux-2.6.12-rc2

Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!

Linus Torvalds
2005-04-17 06:20:36 +0800