Eric Lee / smarc-fsl-linux-kernel

01 Nov, 2011

1 commit

fcf634098 Cross Memory Attach ... Browse Code »

The basic idea behind cross memory attach is to allow MPI programs doing
intra-node communication to do a single copy of the message rather than a
double copy of the message via shared memory.

The following patch attempts to achieve this by allowing a destination
process, given an address and size from a source process, to copy memory
directly from the source process into its own address space via a system
call. There is also a symmetrical ability to copy from the current
process's address space into a destination process's address space.

- Use of /proc/pid/mem has been considered, but there are issues with
using it:
- Does not allow for specifying iovecs for both src and dest, assuming
preadv or pwritev was implemented either the area read from or
written to would need to be contiguous.
- Currently mem_read allows only processes who are currently
ptrace'ing the target and are still able to ptrace the target to read
from the target. This check could possibly be moved to the open call,
but its not clear exactly what race this restriction is stopping
(reason appears to have been lost)
- Having to send the fd of /proc/self/mem via SCM_RIGHTS on unix
domain socket is a bit ugly from a userspace point of view,
especially when you may have hundreds if not (eventually) thousands
of processes that all need to do this with each other
- Doesn't allow for some future use of the interface we would like to
consider adding in the future (see below)
- Interestingly reading from /proc/pid/mem currently actually
involves two copies! (But this could be fixed pretty easily)

As mentioned previously use of vmsplice instead was considered, but has
problems. Since you need the reader and writer working co-operatively if
the pipe is not drained then you block. Which requires some wrapping to
do non blocking on the send side or polling on the receive. In all to all
communication it requires ordering otherwise you can deadlock. And in the
example of many MPI tasks writing to one MPI task vmsplice serialises the
copying.

There are some cases of MPI collectives where even a single copy interface
does not get us the performance gain we could. For example in an
MPI_Reduce rather than copy the data from the source we would like to
instead use it directly in a mathops (say the reduce is doing a sum) as
this would save us doing a copy. We don't need to keep a copy of the data
from the source. I haven't implemented this, but I think this interface
could in the future do all this through the use of the flags - eg could
specify the math operation and type and the kernel rather than just
copying the data would apply the specified operation between the source
and destination and store it in the destination.

Although we don't have a "second user" of the interface (though I've had
some nibbles from people who may be interested in using it for intra
process messaging which is not MPI). This interface is something which
hardware vendors are already doing for their custom drivers to implement
fast local communication. And so in addition to this being useful for
OpenMPI it would mean the driver maintainers don't have to fix things up
when the mm changes.

There was some discussion about how much faster a true zero copy would
go. Here's a link back to the email with some testing I did on that:

http://marc.info/?l=linux-mm&m=130105930902915&w=2

There is a basic man page for the proposed interface here:

http://ozlabs.org/~cyeoh/cma/process_vm_readv.txt

This has been implemented for x86 and powerpc, other architecture should
mainly (I think) just need to add syscall numbers for the process_vm_readv
and process_vm_writev. There are 32 bit compatibility versions for
64-bit kernels.

For arch maintainers there are some simple tests to be able to quickly
verify that the syscalls are working correctly here:

http://ozlabs.org/~cyeoh/cma/cma-test-20110718.tgz

Signed-off-by: Chris Yeoh
Cc: Ingo Molnar
Cc: "H. Peter Anvin"
Cc: Thomas Gleixner
Cc: Arnd Bergmann
Cc: Paul Mackerras
Cc: Benjamin Herrenschmidt
Cc: David Howells
Cc: James Morris
Cc:
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christopher Yeoh
2011-11-01 08:30:44 +0800

17 Mar, 2011

1 commit

4aab1e896 KEYS: Make request_key() and co. return an error for a negative key ... Browse Code »

Make request_key() and co. return an error for a negative or rejected key. If
the key was simply negated, then return ENOKEY, otherwise return the error
with which it was rejected.

Without this patch, the following command returns a key number (with the latest
keyutils):

[root@andromeda ~]# keyctl request2 user debug:foo rejected @s
586569904

Trying to print the key merely gets you a permission denied error:

[root@andromeda ~]# keyctl print 586569904
keyctl_read_alloc: Permission denied

Doing another request_key() call does get you the error, as long as it hasn't
expired yet:

[root@andromeda ~]# keyctl request user debug:foo
request_key: Key was rejected by service

Signed-off-by: David Howells
Signed-off-by: James Morris

David Howells
2011-03-17 08:59:49 +0800

08 Mar, 2011

2 commits

ee009e4a0 KEYS: Add an iovec version of KEYCTL_INSTANTIATE ... Browse Code »

Add a keyctl op (KEYCTL_INSTANTIATE_IOV) that is like KEYCTL_INSTANTIATE, but
takes an iovec array and concatenates the data in-kernel into one buffer.
Since the KEYCTL_INSTANTIATE copies the data anyway, this isn't too much of a
problem.

Signed-off-by: David Howells
Signed-off-by: James Morris

David Howells
2011-03-08 08:17:22 +0800
fdd1b9458 KEYS: Add a new keyctl op to reject a key with a specified error code ... Browse Code »

Add a new keyctl op to reject a key with a specified error code. This works
much the same as negating a key, and so keyctl_negate_key() is made a special
case of keyctl_reject_key(). The difference is that keyctl_negate_key()
selects ENOKEY as the error to be reported.

Typically the key would be rejected with EKEYEXPIRED, EKEYREVOKED or
EKEYREJECTED, but this is not mandatory.

Signed-off-by: David Howells
Signed-off-by: James Morris

David Howells
2011-03-08 08:17:18 +0800

22 Jan, 2011

2 commits

973c9f4f4 KEYS: Fix up comments in key management code ... Browse Code »

Fix up comments in the key management code. No functional changes.

Signed-off-by: David Howells
Signed-off-by: Linus Torvalds

David Howells
2011-01-22 06:59:30 +0800
a8b17ed01 KEYS: Do some style cleanup in the key management code. ... Browse Code »

Do a bit of a style clean up in the key management code. No functional
changes.

Done using:

perl -p -i -e 's!^/[*]*/\n!!' security/keys/*.c
perl -p -i -e 's!} /[*] end [a-z0-9_]*[(][)] [*]/\n!}\n!' security/keys/*.c
sed -i -s -e ": next" -e N -e 's/^\n[}]$/}/' -e t -e P -e 's/^.*\n//' -e "b next" security/keys/*.c

To remove /*****/ lines, remove comments on the closing brace of a
function to name the function and remove blank lines before the closing
brace of a function.

Signed-off-by: David Howells
Signed-off-by: Linus Torvalds

David Howells
2011-01-22 06:59:29 +0800

10 Sep, 2010

2 commits

3d96406c7 KEYS: Fix bug in keyctl_session_to_parent() if parent has no session keyring ... Browse Code »

Fix a bug in keyctl_session_to_parent() whereby it tries to check the ownership
of the parent process's session keyring whether or not the parent has a session
keyring [CVE-2010-2960].

This results in the following oops:

BUG: unable to handle kernel NULL pointer dereference at 00000000000000a0
IP: [] keyctl_session_to_parent+0x251/0x443
...
Call Trace:
[] ? keyctl_session_to_parent+0x67/0x443
[] ? __do_fault+0x24b/0x3d0
[] sys_keyctl+0xb4/0xb8
[] system_call_fastpath+0x16/0x1b

if the parent process has no session keyring.

If the system is using pam_keyinit then it mostly protected against this as all
processes derived from a login will have inherited the session keyring created
by pam_keyinit during the log in procedure.

To test this, pam_keyinit calls need to be commented out in /etc/pam.d/.

Reported-by: Tavis Ormandy
Signed-off-by: David Howells
Acked-by: Tavis Ormandy
Signed-off-by: Linus Torvalds

David Howells
2010-09-10 22:30:00 +0800
9d1ac65a9 KEYS: Fix RCU no-lock warning in keyctl_session_to_parent() ... Browse Code »

There's an protected access to the parent process's credentials in the middle
of keyctl_session_to_parent(). This results in the following RCU warning:

===================================================
[ INFO: suspicious rcu_dereference_check() usage. ]
---------------------------------------------------
security/keys/keyctl.c:1291 invoked rcu_dereference_check() without protection!

other info that might help us debug this:

rcu_scheduler_active = 1, debug_locks = 0
1 lock held by keyctl-session-/2137:
#0: (tasklist_lock){.+.+..}, at: [] keyctl_session_to_parent+0x60/0x236

stack backtrace:
Pid: 2137, comm: keyctl-session- Not tainted 2.6.36-rc2-cachefs+ #1
Call Trace:
[] lockdep_rcu_dereference+0xaa/0xb3
[] keyctl_session_to_parent+0xed/0x236
[] sys_keyctl+0xb4/0xb6
[] system_call_fastpath+0x16/0x1b

The code should take the RCU read lock to make sure the parents credentials
don't go away, even though it's holding a spinlock and has IRQ disabled.

Signed-off-by: David Howells
Signed-off-by: Linus Torvalds

David Howells
2010-09-10 22:30:00 +0800

02 Aug, 2010

2 commits

94fd8405e KEYS: Use the variable 'key' in keyctl_describe_key() ... Browse Code »

keyctl_describe_key() turns the key reference it gets into a usable key pointer
and assigns that to a variable called 'key', which it then ignores in favour of
recomputing the key pointer each time it needs it. Make it use the precomputed
pointer instead.

Without this patch, gcc 4.6 reports that the variable key is set but not used:

building with gcc 4.6 I'm getting a warning message:
CC security/keys/keyctl.o
security/keys/keyctl.c: In function 'keyctl_describe_key':
security/keys/keyctl.c:472:14: warning: variable 'key' set but not used

Reported-by: Justin P. Mattock
Signed-off-by: David Howells
Signed-off-by: James Morris

David Howells
2010-08-02 13:34:56 +0800
9156235b3 KEYS: Authorise keyctl_set_timeout() on a key if we have its authorisation key ... Browse Code »

Authorise a process to perform keyctl_set_timeout() on an uninstantiated key if
that process has the authorisation key for it.

This allows the instantiator to set the timeout on a key it is instantiating -
provided it does it before instantiating the key.

For instance, the test upcall script provided with the keyutils package could
be modified to set the expiry to an hour hence before instantiating the key:

[/usr/share/keyutils/request-key-debug.sh]
if [ "$3" != "neg" ]
then
+ keyctl timeout $1 3600
keyctl instantiate $1 "Debug $3" $4 || exit 1
else

Signed-off-by: David Howells
Signed-off-by: James Morris

David Howells
2010-08-02 13:34:27 +0800

27 Jun, 2010

1 commit

4303ef19c KEYS: Propagate error code instead of returning -EINVAL ... Browse Code »

This is from a Smatch check I'm writing.

strncpy_from_user() returns -EFAULT on error so the first change just
silences a warning but doesn't change how the code works.

The other change is a bug fix because install_thread_keyring_to_cred()
can return a variety of errors such as -EINVAL, -EEXIST, -ENOMEM or
-EKEYREVOKED.

Signed-off-by: Dan Carpenter
Signed-off-by: David Howells
Signed-off-by: Linus Torvalds

Dan Carpenter
2010-06-27 22:02:34 +0800

28 May, 2010

1 commit

dd98acf74 keyctl_session_to_parent(): use thread_group_empty() to check singlethreadness ... Browse Code »

No functional changes.

keyctl_session_to_parent() is the only user of signal->count which needs
the correct value. Change it to use thread_group_empty() instead, this
must be strictly equivalent under tasklist, and imho looks better.

Signed-off-by: Oleg Nesterov
Acked-by: David Howells
Cc: Peter Zijlstra
Acked-by: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2010-05-28 00:12:47 +0800

23 Apr, 2010

1 commit

c5b60b5e6 security: whitespace coding style fixes ... Browse Code »

Whitespace coding style fixes.

Signed-off-by: Justin P. Mattock
Signed-off-by: James Morris

Justin P. Mattock
2010-04-23 08:10:23 +0800

12 Apr, 2010

1 commit

3011a344c security: remove dead hook key_session_to_parent ... Browse Code »

Unused hook. Remove.

Signed-off-by: Eric Paris
Signed-off-by: James Morris

Eric Paris
2010-04-12 10:19:18 +0800

17 Dec, 2009

2 commits

a00ae4d21 Keys: KEYCTL_SESSION_TO_PARENT needs TIF_NOTIFY_RESUME architecture support ... Browse Code »

As of commit ee18d64c1f632043a02e6f5ba5e045bb26a5465f ("KEYS: Add a keyctl to
install a process's session keyring on its parent [try #6]"), CONFIG_KEYS=y
fails to build on architectures that haven't implemented TIF_NOTIFY_RESUME yet:

security/keys/keyctl.c: In function 'keyctl_session_to_parent':
security/keys/keyctl.c:1312: error: 'TIF_NOTIFY_RESUME' undeclared (first use in this function)
security/keys/keyctl.c:1312: error: (Each undeclared identifier is reported only once
security/keys/keyctl.c:1312: error: for each function it appears in.)

Make KEYCTL_SESSION_TO_PARENT depend on TIF_NOTIFY_RESUME until
m68k, and xtensa have implemented it.

Signed-off-by: Geert Uytterhoeven
Signed-off-by: James Morris
Acked-by: Mike Frysinger

Geert Uytterhoeven
2009-12-17 06:27:59 +0800
fa1cc7b5a keys: PTR_ERR return of wrong pointer in keyctl_get_security() ... Browse Code »

Return the PTR_ERR of the correct pointer.

Signed-off-by: Roel Kluin
Signed-off-by: Andrew Morton
Acked-by: David Howells
Signed-off-by: James Morris

Roel Kluin
2009-12-17 06:23:48 +0800

16 Oct, 2009

1 commit

21279cfa1 KEYS: get_instantiation_keyring() should inc the keyring refcount in all cases ... Browse Code »

The destination keyring specified to request_key() and co. is made available to
the process that instantiates the key (the slave process started by
/sbin/request-key typically). This is passed in the request_key_auth struct as
the dest_keyring member.

keyctl_instantiate_key and keyctl_negate_key() call get_instantiation_keyring()
to get the keyring to attach the newly constructed key to at the end of
instantiation. This may be given a specific keyring into which a link will be
made later, or it may be asked to find the keyring passed to request_key(). In
the former case, it returns a keyring with the refcount incremented by
lookup_user_key(); in the latter case, it returns the keyring from the
request_key_auth struct - and does _not_ increment the refcount.

The latter case will eventually result in an oops when the keyring prematurely
runs out of references and gets destroyed. The effect may take some time to
show up as the key is destroyed lazily.

To fix this, the keyring returned by get_instantiation_keyring() must always
have its refcount incremented, no matter where it comes from.

This can be tested by setting /etc/request-key.conf to:

#OP TYPE DESCRIPTION CALLOUT INFO PROGRAM ARG1 ARG2 ARG3 ...
#====== ======= =============== =============== ===============================
create * test:* * |/bin/false %u %g %d %{user:_display}
negate * * * /bin/keyctl negate %k 10 @u

and then doing:

keyctl add user _display aaaaaaaa @u
while keyctl request2 user test:x test:x @u &&
keyctl list @u;
do
keyctl request2 user test:x test:x @u;
sleep 31;
keyctl list @u;
done

which will oops eventually. Changing the negate line to have @u rather than
%S at the end is important as that forces the latter case by passing a special
keyring ID rather than an actual keyring ID.

Reported-by: Alexander Zangerl
Signed-off-by: David Howells
Tested-by: Alexander Zangerl
Signed-off-by: Linus Torvalds

David Howells
2009-10-16 06:19:58 +0800

15 Sep, 2009

2 commits

c08ef808e KEYS: Fix garbage collector ... Browse Code »

Fix a number of problems with the new key garbage collector:

(1) A rogue semicolon in keyring_gc() was causing the initial count of dead
keys to be miscalculated.

(2) A missing return in keyring_gc() meant that under certain circumstances,
the keyring semaphore would be unlocked twice.

(3) The key serial tree iterator (key_garbage_collector()) part of the garbage
collector has been modified to:

(a) Complete each scan of the keyrings before setting the new timer.

(b) Only set the new timer for keys that have yet to expire. This means
that the new timer is now calculated correctly, and the gc doesn't
get into a loop continually scanning for keys that have expired, and
preventing other things from happening, like RCU cleaning up the old
keyring contents.

(c) Perform an extra scan if any keys were garbage collected in this one
as a key might become garbage during a scan, and (b) could mean we
don't set the timer again.

(4) Made key_schedule_gc() take the time at which to do a collection run,
rather than the time at which the key expires. This means the collection
of dead keys (key type unregistered) can happen immediately.

Signed-off-by: David Howells
Signed-off-by: James Morris

David Howells
2009-09-15 07:11:02 +0800
5c84342a3 KEYS: Unlock tasklist when exiting early from keyctl_session_to_parent ... Browse Code »

When we exit early from keyctl_session_to_parent because of permissions or
because the session keyring is the same as the parent, we need to unlock the
tasklist.

The missing unlock causes the system to hang completely when using
keyctl(KEYCTL_SESSION_TO_PARENT) with a keyring shared with the parent.

Signed-off-by: Marc Dionne
Signed-off-by: David Howells
Signed-off-by: James Morris

Marc Dionne
2009-09-15 07:10:59 +0800

02 Sep, 2009

4 commits

ee18d64c1 KEYS: Add a keyctl to install a process's session keyring on its parent [try #6] ... Browse Code »

Add a keyctl to install a process's session keyring onto its parent. This
replaces the parent's session keyring. Because the COW credential code does
not permit one process to change another process's credentials directly, the
change is deferred until userspace next starts executing again. Normally this
will be after a wait*() syscall.

To support this, three new security hooks have been provided:
cred_alloc_blank() to allocate unset security creds, cred_transfer() to fill in
the blank security creds and key_session_to_parent() - which asks the LSM if
the process may replace its parent's session keyring.

The replacement may only happen if the process has the same ownership details
as its parent, and the process has LINK permission on the session keyring, and
the session keyring is owned by the process, and the LSM permits it.

Note that this requires alteration to each architecture's notify_resume path.
This has been done for all arches barring blackfin, m68k* and xtensa, all of
which need assembly alteration to support TIF_NOTIFY_RESUME. This allows the
replacement to be performed at the point the parent process resumes userspace
execution.

This allows the userspace AFS pioctl emulation to fully emulate newpag() and
the VIOCSETTOK and VIOCSETTOK2 pioctls, all of which require the ability to
alter the parent process's PAG membership. However, since kAFS doesn't use
PAGs per se, but rather dumps the keys into the session keyring, the session
keyring of the parent must be replaced if, for example, VIOCSETTOK is passed
the newpag flag.

This can be tested with the following program:

#include
#include
#include

#define KEYCTL_SESSION_TO_PARENT 18

#define OSERROR(X, S) do { if ((long)(X) == -1) { perror(S); exit(1); } } while(0)

int main(int argc, char **argv)
{
key_serial_t keyring, key;
long ret;

keyring = keyctl_join_session_keyring(argv[1]);
OSERROR(keyring, "keyctl_join_session_keyring");

key = add_key("user", "a", "b", 1, keyring);
OSERROR(key, "add_key");

ret = keyctl(KEYCTL_SESSION_TO_PARENT);
OSERROR(ret, "KEYCTL_SESSION_TO_PARENT");

return 0;
}

Compiled and linked with -lkeyutils, you should see something like:

[dhowells@andromeda ~]$ keyctl show
Session Keyring
-3 --alswrv 4043 4043 keyring: _ses
355907932 --alswrv 4043 -1 \_ keyring: _uid.4043
[dhowells@andromeda ~]$ /tmp/newpag
[dhowells@andromeda ~]$ keyctl show
Session Keyring
-3 --alswrv 4043 4043 keyring: _ses
1055658746 --alswrv 4043 4043 \_ user: a
[dhowells@andromeda ~]$ /tmp/newpag hello
[dhowells@andromeda ~]$ keyctl show
Session Keyring
-3 --alswrv 4043 4043 keyring: hello
340417692 --alswrv 4043 4043 \_ user: a

Where the test program creates a new session keyring, sticks a user key named
'a' into it and then installs it on its parent.

Signed-off-by: David Howells
Signed-off-by: James Morris

David Howells
2009-09-02 19:29:22 +0800
5d135440f KEYS: Add garbage collection for dead, revoked and expired keys. [try #6] ... Browse Code »

Add garbage collection for dead, revoked and expired keys. This involved
erasing all links to such keys from keyrings that point to them. At that
point, the key will be deleted in the normal manner.

Keyrings from which garbage collection occurs are shrunk and their quota
consumption reduced as appropriate.

Dead keys (for which the key type has been removed) will be garbage collected
immediately.

Revoked and expired keys will hang around for a number of seconds, as set in
/proc/sys/kernel/keys/gc_delay before being automatically removed. The default
is 5 minutes.

Signed-off-by: David Howells
Signed-off-by: James Morris

David Howells
2009-09-02 19:29:11 +0800
0c2c9a3fc KEYS: Allow keyctl_revoke() on keys that have SETATTR but not WRITE perm [try #6] ... Browse Code »

Allow keyctl_revoke() to operate on keys that have SETATTR but not WRITE
permission, rather than only on keys that have WRITE permission.

Signed-off-by: David Howells
Acked-by: Serge Hallyn
Signed-off-by: James Morris

David Howells
2009-09-02 19:29:06 +0800
5593122ee KEYS: Deal with dead-type keys appropriately [try #6] ... Browse Code »

Allow keys for which the key type has been removed to be unlinked. Currently
dead-type keys can only be disposed of by completely clearing the keyrings
that point to them.

Signed-off-by: David Howells
Acked-by: Serge Hallyn
Signed-off-by: James Morris

David Howells
2009-09-02 19:29:04 +0800

27 Feb, 2009

1 commit

1d1e97562 keys: distinguish per-uid keys in different namespaces ... Browse Code »

per-uid keys were looked by uid only. Use the user namespace
to distinguish the same uid in different namespaces.

This does not address key_permission. So a task can for instance
try to join a keyring owned by the same uid in another namespace.
That will be handled by a separate patch.

Signed-off-by: Serge E. Hallyn
Acked-by: David Howells
Signed-off-by: James Morris

Serge E. Hallyn
2009-02-27 09:35:06 +0800

18 Jan, 2009

1 commit

0d54ee1c7 security: introduce missing kfree ... Browse Code »

Plug this leak.

Acked-by: David Howells
Cc: James Morris
Cc:
Signed-off-by: Vegard Nossum
Signed-off-by: Linus Torvalds

Vegard Nossum
2009-01-18 06:24:46 +0800

14 Jan, 2009

2 commits

938bb9f5e [CVE-2009-0029] System call wrappers part 28 ... Browse Code »

Signed-off-by: Heiko Carstens

Heiko Carstens
2009-01-14 21:15:30 +0800
1e7bfb213 [CVE-2009-0029] System call wrappers part 27 ... Browse Code »

Signed-off-by: Heiko Carstens

Heiko Carstens
2009-01-14 21:15:29 +0800

01 Jan, 2009

1 commit

90bd49ab6 keys: fix sparse warning by adding __user annotation to cast ... Browse Code »

Fix the following sparse warning:

CC security/keys/key.o
security/keys/keyctl.c:1297:10: warning: incorrect type in argument 2 (different address spaces)
security/keys/keyctl.c:1297:10: expected char [noderef] *buffer
security/keys/keyctl.c:1297:10: got char *

which appears to be caused by lack of __user annotation to the cast of
a syscall argument.

Signed-off-by: James Morris
Acked-by: David Howells

James Morris
2009-01-01 07:32:44 +0800

29 Dec, 2008

1 commit

eca1bf5b4 KEYS: Fix variable uninitialisation warnings ... Browse Code »

Fix variable uninitialisation warnings introduced in:

commit 8bbf4976b59fc9fc2861e79cab7beb3f6d647640
Author: David Howells
Date: Fri Nov 14 10:39:14 2008 +1100

KEYS: Alter use of key instantiation link-to-keyring argument

As:

security/keys/keyctl.c: In function 'keyctl_negate_key':
security/keys/keyctl.c:976: warning: 'dest_keyring' may be used uninitialized in this function
security/keys/keyctl.c: In function 'keyctl_instantiate_key':
security/keys/keyctl.c:898: warning: 'dest_keyring' may be used uninitialized in this function

Some versions of gcc notice that get_instantiation_key() doesn't always set
*_dest_keyring, but fail to observe that if this happens then *_dest_keyring
will not be read by the caller.

Reported-by: Linus Torvalds
Signed-off-by: David Howells
Signed-off-by: James Morris

David Howells
2008-12-29 11:24:43 +0800

14 Nov, 2008

4 commits

d84f4f992 CRED: Inaugurate COW credentials ... Browse Code »

Inaugurate copy-on-write credentials management. This uses RCU to manage the
credentials pointer in the task_struct with respect to accesses by other tasks.
A process may only modify its own credentials, and so does not need locking to
access or modify its own credentials.

A mutex (cred_replace_mutex) is added to the task_struct to control the effect
of PTRACE_ATTACHED on credential calculations, particularly with respect to
execve().

With this patch, the contents of an active credentials struct may not be
changed directly; rather a new set of credentials must be prepared, modified
and committed using something like the following sequence of events:

struct cred *new = prepare_creds();
int ret = blah(new);
if (ret < 0) {
abort_creds(new);
return ret;
}
return commit_creds(new);

There are some exceptions to this rule: the keyrings pointed to by the active
credentials may be instantiated - keyrings violate the COW rule as managing
COW keyrings is tricky, given that it is possible for a task to directly alter
the keys in a keyring in use by another task.

To help enforce this, various pointers to sets of credentials, such as those in
the task_struct, are declared const. The purpose of this is compile-time
discouragement of altering credentials through those pointers. Once a set of
credentials has been made public through one of these pointers, it may not be
modified, except under special circumstances:

(1) Its reference count may incremented and decremented.

(2) The keyrings to which it points may be modified, but not replaced.

The only safe way to modify anything else is to create a replacement and commit
using the functions described in Documentation/credentials.txt (which will be
added by a later patch).

This patch and the preceding patches have been tested with the LTP SELinux
testsuite.

This patch makes several logical sets of alteration:

(1) execve().

This now prepares and commits credentials in various places in the
security code rather than altering the current creds directly.

(2) Temporary credential overrides.

do_coredump() and sys_faccessat() now prepare their own credentials and
temporarily override the ones currently on the acting thread, whilst
preventing interference from other threads by holding cred_replace_mutex
on the thread being dumped.

This will be replaced in a future patch by something that hands down the
credentials directly to the functions being called, rather than altering
the task's objective credentials.

(3) LSM interface.

A number of functions have been changed, added or removed:

(*) security_capset_check(), ->capset_check()
(*) security_capset_set(), ->capset_set()

Removed in favour of security_capset().

(*) security_capset(), ->capset()

New. This is passed a pointer to the new creds, a pointer to the old
creds and the proposed capability sets. It should fill in the new
creds or return an error. All pointers, barring the pointer to the
new creds, are now const.

(*) security_bprm_apply_creds(), ->bprm_apply_creds()

Changed; now returns a value, which will cause the process to be
killed if it's an error.

(*) security_task_alloc(), ->task_alloc_security()

Removed in favour of security_prepare_creds().

(*) security_cred_free(), ->cred_free()

New. Free security data attached to cred->security.

(*) security_prepare_creds(), ->cred_prepare()

New. Duplicate any security data attached to cred->security.

(*) security_commit_creds(), ->cred_commit()

New. Apply any security effects for the upcoming installation of new
security by commit_creds().

(*) security_task_post_setuid(), ->task_post_setuid()

Removed in favour of security_task_fix_setuid().

(*) security_task_fix_setuid(), ->task_fix_setuid()

Fix up the proposed new credentials for setuid(). This is used by
cap_set_fix_setuid() to implicitly adjust capabilities in line with
setuid() changes. Changes are made to the new credentials, rather
than the task itself as in security_task_post_setuid().

(*) security_task_reparent_to_init(), ->task_reparent_to_init()

Removed. Instead the task being reparented to init is referred
directly to init's credentials.

NOTE! This results in the loss of some state: SELinux's osid no
longer records the sid of the thread that forked it.

(*) security_key_alloc(), ->key_alloc()
(*) security_key_permission(), ->key_permission()

Changed. These now take cred pointers rather than task pointers to
refer to the security context.

(4) sys_capset().

This has been simplified and uses less locking. The LSM functions it
calls have been merged.

(5) reparent_to_kthreadd().

This gives the current thread the same credentials as init by simply using
commit_thread() to point that way.

(6) __sigqueue_alloc() and switch_uid()

__sigqueue_alloc() can't stop the target task from changing its creds
beneath it, so this function gets a reference to the currently applicable
user_struct which it then passes into the sigqueue struct it returns if
successful.

switch_uid() is now called from commit_creds(), and possibly should be
folded into that. commit_creds() should take care of protecting
__sigqueue_alloc().

(7) [sg]et[ug]id() and co and [sg]et_current_groups.

The set functions now all use prepare_creds(), commit_creds() and
abort_creds() to build and check a new set of credentials before applying
it.

security_task_set[ug]id() is called inside the prepared section. This
guarantees that nothing else will affect the creds until we've finished.

The calling of set_dumpable() has been moved into commit_creds().

Much of the functionality of set_user() has been moved into
commit_creds().

The get functions all simply access the data directly.

(8) security_task_prctl() and cap_task_prctl().

security_task_prctl() has been modified to return -ENOSYS if it doesn't
want to handle a function, or otherwise return the return value directly
rather than through an argument.

Additionally, cap_task_prctl() now prepares a new set of credentials, even
if it doesn't end up using it.

(9) Keyrings.

A number of changes have been made to the keyrings code:

(a) switch_uid_keyring(), copy_keys(), exit_keys() and suid_keys() have
all been dropped and built in to the credentials functions directly.
They may want separating out again later.

(b) key_alloc() and search_process_keyrings() now take a cred pointer
rather than a task pointer to specify the security context.

(c) copy_creds() gives a new thread within the same thread group a new
thread keyring if its parent had one, otherwise it discards the thread
keyring.

(d) The authorisation key now points directly to the credentials to extend
the search into rather pointing to the task that carries them.

(e) Installing thread, process or session keyrings causes a new set of
credentials to be created, even though it's not strictly necessary for
process or session keyrings (they're shared).

(10) Usermode helper.

The usermode helper code now carries a cred struct pointer in its
subprocess_info struct instead of a new session keyring pointer. This set
of credentials is derived from init_cred and installed on the new process
after it has been cloned.

call_usermodehelper_setup() allocates the new credentials and
call_usermodehelper_freeinfo() discards them if they haven't been used. A
special cred function (prepare_usermodeinfo_creds()) is provided
specifically for call_usermodehelper_setup() to call.

call_usermodehelper_setkeys() adjusts the credentials to sport the
supplied keyring as the new session keyring.

(11) SELinux.

SELinux has a number of changes, in addition to those to support the LSM
interface changes mentioned above:

(a) selinux_setprocattr() no longer does its check for whether the
current ptracer can access processes with the new SID inside the lock
that covers getting the ptracer's SID. Whilst this lock ensures that
the check is done with the ptracer pinned, the result is only valid
until the lock is released, so there's no point doing it inside the
lock.

(12) is_single_threaded().

This function has been extracted from selinux_setprocattr() and put into
a file of its own in the lib/ directory as join_session_keyring() now
wants to use it too.

The code in SELinux just checked to see whether a task shared mm_structs
with other tasks (CLONE_VM), but that isn't good enough. We really want
to know if they're part of the same thread group (CLONE_THREAD).

(13) nfsd.

The NFS server daemon now has to use the COW credentials to set the
credentials it is going to use. It really needs to pass the credentials
down to the functions it calls, but it can't do that until other patches
in this series have been applied.

Signed-off-by: David Howells
Acked-by: James Morris
Signed-off-by: James Morris

David Howells
2008-11-14 07:39:23 +0800
b6dff3ec5 CRED: Separate task security context from task_struct ... Browse Code »

Separate the task security context from task_struct. At this point, the
security data is temporarily embedded in the task_struct with two pointers
pointing to it.

Note that the Alpha arch is altered as it refers to (E)UID and (E)GID in
entry.S via asm-offsets.

With comment fixes Signed-off-by: Marc Dionne

Signed-off-by: David Howells
Acked-by: James Morris
Acked-by: Serge Hallyn
Signed-off-by: James Morris

David Howells
2008-11-14 07:39:16 +0800
8bbf4976b KEYS: Alter use of key instantiation link-to-keyring argument ... Browse Code »

Alter the use of the key instantiation and negation functions' link-to-keyring
arguments. Currently this specifies a keyring in the target process to link
the key into, creating the keyring if it doesn't exist. This, however, can be
a problem for copy-on-write credentials as it means that the instantiating
process can alter the credentials of the requesting process.

This patch alters the behaviour such that:

(1) If keyctl_instantiate_key() or keyctl_negate_key() are given a specific
keyring by ID (ringid >= 0), then that keyring will be used.

(2) If keyctl_instantiate_key() or keyctl_negate_key() are given one of the
special constants that refer to the requesting process's keyrings
(KEY_SPEC_*_KEYRING, all | Instantiator |------->| Instantiator |
| | | | | |
+-----------+ +--------------+ +--------------+
request_key() request_key()

This might be useful, for example, in Kerberos, where the requestor requests a
ticket, and then the ticket instantiator requests the TGT, which someone else
then has to go and fetch. The TGT, however, should be retained in the
keyrings of the requestor, not the first instantiator. To make this explict
an extra special keyring constant is also added.

Signed-off-by: David Howells
Reviewed-by: James Morris
Signed-off-by: James Morris

David Howells
2008-11-14 07:39:14 +0800
47d804bfa CRED: Wrap task credential accesses in the key management code ... Browse Code »

Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id(). In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells
Reviewed-by: James Morris
Acked-by: Serge Hallyn
Signed-off-by: James Morris

David Howells
2008-11-14 07:39:11 +0800

29 Apr, 2008

5 commits

0b77f5bfb keys: make the keyring quotas controllable through /proc/sys ... Browse Code »

Make the keyring quotas controllable through /proc/sys files:

(*) /proc/sys/kernel/keys/root_maxkeys
/proc/sys/kernel/keys/root_maxbytes

Maximum number of keys that root may have and the maximum total number of
bytes of data that root may have stored in those keys.

(*) /proc/sys/kernel/keys/maxkeys
/proc/sys/kernel/keys/maxbytes

Maximum number of keys that each non-root user may have and the maximum
total number of bytes of data that each of those users may have stored in
their keys.

Also increase the quotas as a number of people have been complaining that it's
not big enough. I'm not sure that it's big enough now either, but on the
other hand, it can now be set in /etc/sysctl.conf.

Signed-off-by: David Howells
Cc:
Cc:
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

David Howells
2008-04-29 23:06:17 +0800
6b79ccb51 keys: allow clients to set key perms in key_create_or_update() ... Browse Code »

The key_create_or_update() function provided by the keyring code has a default
set of permissions that are always applied to the key when created. This
might not be desirable to all clients.

Here's a patch that adds a "perm" parameter to the function to address this,
which can be set to KEY_PERM_UNDEF to revert to the current behaviour.

Signed-off-by: Arun Raghavan
Signed-off-by: David Howells
Cc: Satyam Sharma
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Arun Raghavan
2008-04-29 23:06:16 +0800
70a5bb72b keys: add keyctl function to get a security label ... Browse Code »

Add a keyctl() function to get the security label of a key.

The following is added to Documentation/keys.txt:

(*) Get the LSM security context attached to a key.

long keyctl(KEYCTL_GET_SECURITY, key_serial_t key, char *buffer,
size_t buflen)

This function returns a string that represents the LSM security context
attached to a key in the buffer provided.

Unless there's an error, it always returns the amount of data it could
produce, even if that's too big for the buffer, but it won't copy more
than requested to userspace. If the buffer pointer is NULL then no copy
will take place.

A NUL character is included at the end of the string if the buffer is
sufficiently big. This is included in the returned count. If no LSM is
in force then an empty string will be returned.

A process must have view permission on the key for this function to be
successful.

[akpm@linux-foundation.org: declare keyctl_get_security()]
Signed-off-by: David Howells
Acked-by: Stephen Smalley
Cc: Paul Moore
Cc: Chris Wright
Cc: James Morris
Cc: Kevin Coffman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

David Howells
2008-04-29 23:06:16 +0800
4a38e122e keys: allow the callout data to be passed as a blob rather than a string ... Browse Code »

Allow the callout data to be passed as a blob rather than a string for
internal kernel services that call any request_key_*() interface other than
request_key(). request_key() itself still takes a NUL-terminated string.

The functions that change are:

request_key_with_auxdata()
request_key_async()
request_key_async_with_auxdata()

Signed-off-by: David Howells
Cc: Paul Moore
Cc: Chris Wright
Cc: Stephen Smalley
Cc: James Morris
Cc: Kevin Coffman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

David Howells
2008-04-29 23:06:16 +0800
38bbca6b6 keys: increase the payload size when instantiating a key ... Browse Code »

Increase the size of a payload that can be used to instantiate a key in
add_key() and keyctl_instantiate_key(). This permits huge CIFS SPNEGO blobs
to be passed around. The limit is raised to 1MB. If kmalloc() can't allocate
a buffer of sufficient size, vmalloc() will be tried instead.

Signed-off-by: David Howells
Cc: Paul Moore
Cc: Chris Wright
Cc: Stephen Smalley
Cc: James Morris
Cc: Kevin Coffman
Cc: Steven French
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

David Howells
2008-04-29 23:06:16 +0800

30 Jun, 2006

1 commit

4e54f0854 [PATCH] Keys: Allow in-kernel key requestor to pass auxiliary data to upcaller ... Browse Code »

The proposed NFS key type uses its own method of passing key requests to
userspace (upcalling) rather than invoking /sbin/request-key. This is
because the responsible userspace daemon should already be running and will
be contacted through rpc_pipefs.

This patch permits the NFS filesystem to pass auxiliary data to the upcall
operation (struct key_type::request_key) so that the upcaller can use a
pre-existing communications channel more easily.

Signed-off-by: David Howells
Acked-By: Kevin Coffman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

David Howells
2006-06-30 01:26:20 +0800

27 Jun, 2006

1 commit

5801649d8 [PATCH] keys: let keyctl_chown() change a key's owner ... Browse Code »

Let keyctl_chown() change a key's owner, including attempting to transfer the
quota burden to the new user.

Signed-off-by: David Howells
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Fredrik Tolf
2006-06-27 00:58:18 +0800