Eric Lee / smarc-fsl-linux-kernel

15 Jan, 2012

1 commit

c49c41a41 Merge branch 'for-linus' of git://selinuxproject.org/~jmorris/linux-security ... Browse Code »

* 'for-linus' of git://selinuxproject.org/~jmorris/linux-security:
capabilities: remove __cap_full_set definition
security: remove the security_netlink_recv hook as it is equivalent to capable()
ptrace: do not audit capability check when outputing /proc/pid/stat
capabilities: remove task_ns_* functions
capabitlies: ns_capable can use the cap helpers rather than lsm call
capabilities: style only - move capable below ns_capable
capabilites: introduce new has_ns_capabilities_noaudit
capabilities: call has_ns_capability from has_capability
capabilities: remove all _real_ interfaces
capabilities: introduce security_capable_noaudit
capabilities: reverse arguments to security_capable
capabilities: remove the task from capable LSM hook entirely
selinux: sparse fix: fix several warnings in the security server cod
selinux: sparse fix: fix warnings in netlink code
selinux: sparse fix: eliminate warnings for selinuxfs
selinux: sparse fix: declare selinux_disable() in security.h
selinux: sparse fix: move selinux_complete_init
selinux: sparse fix: make selinux_secmark_refcount static
SELinux: Fix RCU deref check warning in sel_netport_insert()

Manually fix up a semantic mis-merge wrt security_netlink_recv():

- the interface was removed in commit fd7784615248 ("security: remove
the security_netlink_recv hook as it is equivalent to capable()")

- a new user of it appeared in commit a38f7907b926 ("crypto: Add
userspace configuration API")

causing no automatic merge conflict, but Eric Paris pointed out the
issue.

Linus Torvalds
2012-01-15 10:36:33 +0800

06 Jan, 2012

7 commits

f1c84dae0 capabilities: remove task_ns_* functions ... Browse Code »

task_ in the front of a function, in the security subsystem anyway, means
to me at least, that we are operating with that task as the subject of the
security decision. In this case what it means is that we are using current as
the subject but we use the task to get the right namespace. Who in the world
would ever realize that's what task_ns_capability means just by the name? This
patch eliminates the task_ns functions entirely and uses the has_ns_capability
function instead. This means we explicitly open code the ns in question in
the caller. I think it makes the caller a LOT more clear what is going on.

Signed-off-by: Eric Paris
Acked-by: Serge E. Hallyn

Eric Paris
2012-01-06 07:52:59 +0800
d2a7009f0 capabitlies: ns_capable can use the cap helpers rather than lsm call ... Browse Code »
43

Just to reduce the number of places to change if we every change the LSM
hook, use the capability helpers internally when possible.

Signed-off-by: Eric Paris
Acked-by: Serge E. Hallyn

Eric Paris
2012-01-06 07:52:58 +0800
105ddf49c capabilities: style only - move capable below ns_capable ... Browse Code »

Although the current code is fine for consistency this moves the capable
code below the function it calls in the c file. It doesn't actually change
code.

Signed-off-by: Eric Paris
Acked-by: Serge E. Hallyn

Eric Paris
2012-01-06 07:52:57 +0800
7b61d6484 capabilites: introduce new has_ns_capabilities_noaudit ... Browse Code »

For consistency in interfaces, introduce a new interface called
has_ns_capabilities_noaudit. It checks if the given task has the given
capability in the given namespace. Use this new function by
has_capabilities_noaudit.

Signed-off-by: Eric Paris
Acked-by: Serge E. Hallyn

Eric Paris
2012-01-06 07:52:57 +0800
25e757034 capabilities: call has_ns_capability from has_capability ... Browse Code »

Declare the more specific has_ns_capability first in the code and then call it
from has_capability. The declaration reversal isn't stricty necessary since
they are both declared in header files, but it just makes sense to put more
specific functions first in the code.

Signed-off-by: Eric Paris
Acked-by: Serge E. Hallyn

Eric Paris
2012-01-06 07:52:56 +0800
2920a8409 capabilities: remove all _real_ interfaces ... Browse Code »

The name security_real_capable and security_real_capable_noaudit just don't
make much sense to me. Convert them to use security_capable and
security_capable_noaudit.

Signed-off-by: Eric Paris
Acked-by: Serge E. Hallyn

Eric Paris
2012-01-06 07:52:55 +0800
b7e724d30 capabilities: reverse arguments to security_capable ... Browse Code »

security_capable takes ns, cred, cap. But the LSM capable() hook takes
cred, ns, cap. The capability helper functions also take cred, ns, cap.
Rather than flip argument order just to flip it back, leave them alone.
Heck, this should be a little faster since argument will be in the right
place!

Signed-off-by: Eric Paris

Eric Paris
2012-01-06 07:52:53 +0800

31 Oct, 2011

1 commit

9984de1a5 kernel: Map most files to use export.h instead of module.h ... Browse Code »

The changed files were only including linux/module.h for the
EXPORT_SYMBOL infrastructure, and nothing else. Revector them
onto the isolated export header for faster compile times.

Nothing to see here but a whole lot of instances of:

-#include
+#include

This commit is only changing the kernel dir; next targets
will probably be mm, fs, the arch dirs, etc.

Signed-off-by: Paul Gortmaker

Paul Gortmaker
2011-10-31 21:20:12 +0800

19 May, 2011

1 commit

12a5a2621 Merge branch 'master' into next ... Browse Code »

Conflicts:
include/linux/capability.h

Manually resolve merge conflict w/ thanks to Stephen Rothwell.

Signed-off-by: James Morris

James Morris
2011-05-19 16:51:57 +0800

14 May, 2011

1 commit

47a150edc Cache user_ns in struct cred ... Browse Code »

If !CONFIG_USERNS, have current_user_ns() defined to (&init_user_ns).

Get rid of _current_user_ns. This requires nsown_capable() to be
defined in capability.c rather than as static inline in capability.h,
so do that.

Request_key needs init_user_ns defined at current_user_ns if
!CONFIG_USERNS, so forward-declare that in cred.h if !CONFIG_USERNS
at current_user_ns() define.

Compile-tested with and without CONFIG_USERNS.

Signed-off-by: Serge E. Hallyn
[ This makes a huge performance difference for acl_permission_check(),
up to 30%. And that is one of the hottest kernel functions for loads
that are pathname-lookup heavy. ]
Signed-off-by: Linus Torvalds

Serge E. Hallyn
2011-05-14 02:45:33 +0800

04 Apr, 2011

2 commits

5163b583a capabilities: delete unused cap_set_full ... Browse Code »

unused code. Clean it up.

Signed-off-by: Eric Paris
Acked-by: David Howells
Acked-by: Andrew G. Morgan
Signed-off-by: James Morris

Eric Paris
2011-04-04 08:31:12 +0800
ffa8e59df capabilities: do not drop CAP_SETPCAP from the initial task ... Browse Code »

In olden' days of yore CAP_SETPCAP had special meaning for the init task.
We actually have code to make sure that CAP_SETPCAP wasn't in pE of things
using the init_cred. But CAP_SETPCAP isn't so special any more and we
don't have a reason to special case dropping it for init or kthreads....

Signed-off-by: Eric Paris
Acked-by: Andrew G. Morgan
Signed-off-by: James Morris

Eric Paris
2011-04-04 08:31:09 +0800

24 Mar, 2011

2 commits

3263245de userns: make has_capability* into real functions ... Browse Code »

So we can let type safety keep things sane, and as a bonus we can remove
the declaration of init_user_ns in capability.h.

Signed-off-by: Serge E. Hallyn
Cc: "Eric W. Biederman"
Cc: Daniel Lezcano
Cc: David Howells
Cc: James Morris
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Serge E. Hallyn
2011-03-24 10:47:06 +0800
3486740a4 userns: security: make capabilities relative to the user namespace ... Browse Code »

- Introduce ns_capable to test for a capability in a non-default
user namespace.
- Teach cap_capable to handle capabilities in a non-default
user namespace.

The motivation is to get to the unprivileged creation of new
namespaces. It looks like this gets us 90% of the way there, with
only potential uid confusion issues left.

I still need to handle getting all caps after creation but otherwise I
think I have a good starter patch that achieves all of your goals.

Changelog:
11/05/2010: [serge] add apparmor
12/14/2010: [serge] fix capabilities to created user namespaces
Without this, if user serge creates a user_ns, he won't have
capabilities to the user_ns he created. THis is because we
were first checking whether his effective caps had the caps
he needed and returning -EPERM if not, and THEN checking whether
he was the creator. Reverse those checks.
12/16/2010: [serge] security_real_capable needs ns argument in !security case
01/11/2011: [serge] add task_ns_capable helper
01/11/2011: [serge] add nsown_capable() helper per Bastian Blank suggestion
02/16/2011: [serge] fix a logic bug: the root user is always creator of
init_user_ns, but should not always have capabilities to
it! Fix the check in cap_capable().
02/21/2011: Add the required user_ns parameter to security_capable,
fixing a compile failure.
02/23/2011: Convert some macros to functions as per akpm comments. Some
couldn't be converted because we can't easily forward-declare
them (they are inline if !SECURITY, extern if SECURITY). Add
a current_user_ns function so we can use it in capability.h
without #including cred.h. Move all forward declarations
together to the top of the #ifdef __KERNEL__ section, and use
kernel-doc format.
02/23/2011: Per dhowells, clean up comment in cap_capable().
02/23/2011: Per akpm, remove unreachable 'return -EPERM' in cap_capable.

(Original written and signed off by Eric; latest, modified version
acked by him)

[akpm@linux-foundation.org: fix build]
[akpm@linux-foundation.org: export current_user_ns() for ecryptfs]
[serge.hallyn@canonical.com: remove unneeded extra argument in selinux's task_has_capability]
Signed-off-by: Eric W. Biederman
Signed-off-by: Serge E. Hallyn
Acked-by: "Eric W. Biederman"
Acked-by: Daniel Lezcano
Acked-by: David Howells
Cc: James Morris
Signed-off-by: Serge E. Hallyn
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Serge E. Hallyn
2011-03-24 10:47:02 +0800

11 Feb, 2011

1 commit

6037b715d security: add cred argument to security_capable() ... Browse Code »

Expand security_capable() to include cred, so that it can be usable in a
wider range of call sites.

Signed-off-by: Chris Wright
Acked-by: Serge Hallyn
Signed-off-by: James Morris

Chris Wright
2011-02-11 14:41:58 +0800

03 Apr, 2010

1 commit

32bd7eb5a sched: Remove remaining USER_SCHED code ... Browse Code »

This is left over from commit 7c9414385e ("sched: Remove USER_SCHED"")

Signed-off-by: Li Zefan
Acked-by: Dhaval Giani
Signed-off-by: Peter Zijlstra
Cc: David Howells
LKML-Reference:
Signed-off-by: Ingo Molnar

Li Zefan
2010-04-03 02:12:00 +0800

10 Dec, 2009

1 commit

86fc80f16 capabilities: Use RCU to protect task lookup in sys_capget ... Browse Code »

cap_get_target_pid() protects the task lookup with tasklist_lock.
security_capget() is called under tasklist_lock as well but
tasklist_lock does not protect anything there. The capabilities are
protected by RCU already.

So tasklist_lock only protects the lookup and prevents the task going
away, which can be done with rcu_read_lock() as well.

Signed-off-by: Thomas Gleixner
Signed-off-by: James Morris

Thomas Gleixner
2009-12-10 06:42:48 +0800

24 Nov, 2009

2 commits

b3a222e52 remove CONFIG_SECURITY_FILE_CAPABILITIES compile option ... Browse Code »

As far as I know, all distros currently ship kernels with default
CONFIG_SECURITY_FILE_CAPABILITIES=y. Since having the option on
leaves a 'no_file_caps' option to boot without file capabilities,
the main reason to keep the option is that turning it off saves
you (on my s390x partition) 5k. In particular, vmlinux sizes
came to:

without patch fscaps=n: 53598392
without patch fscaps=y: 53603406
with this patch applied: 53603342

with the security-next tree.

Against this we must weigh the fact that there is no simple way for
userspace to figure out whether file capabilities are supported,
while things like per-process securebits, capability bounding
sets, and adding bits to pI if CAP_SETPCAP is in pE are not supported
with SECURITY_FILE_CAPABILITIES=n, leaving a bit of a problem for
applications wanting to know whether they can use them and/or why
something failed.

It also adds another subtly different set of semantics which we must
maintain at the risk of severe security regressions.

So this patch removes the SECURITY_FILE_CAPABILITIES compile
option. It drops the kernel size by about 50k over the stock
SECURITY_FILE_CAPABILITIES=y kernel, by removing the
cap_limit_ptraced_target() function.

Changelog:
Nov 20: remove cap_limit_ptraced_target() as it's logic
was ifndef'ed.

Signed-off-by: Serge E. Hallyn
Acked-by: Andrew G. Morgan"
Signed-off-by: James Morris

Serge E. Hallyn
2009-11-24 12:06:47 +0800
c4a5af54c Silence the existing API for capability version compatibility check. ... Browse Code »

When libcap, or other libraries attempt to confirm/determine the supported
capability version magic, they generally supply a NULL dataptr to capget().

In this case, while returning the supported/preferred magic (via a
modified header content), the return code of this system call may be 0,
-EINVAL, or -EFAULT.

No libcap code depends on the previous -EINVAL etc. return code, and
all of the above three return codes can accompany a valid (successful)
attempt to determine the requested magic value.

This patch cleans up the system call to return 0, if the call is
successfully being used to determine the supported/preferred capability
magic value.

Signed-off-by: Andrew G. Morgan
Acked-by: Steve Grubb
Acked-by: Serge Hallyn
Signed-off-by: James Morris

Andrew G. Morgan
2009-11-24 05:53:29 +0800

14 Oct, 2009

1 commit

825332e4f capabilities: simplify bound checks for copy_from_user() ... Browse Code »

The capabilities syscall has a copy_from_user() call where gcc currently
cannot prove to itself that the copy is always within bounds.

This patch adds a very explicity bound check to prove to gcc that this
copy_from_user cannot overflow its destination buffer.

Signed-off-by: Arjan van de Ven
Acked-by: James Morris
Signed-off-by: Andrew Morton
Signed-off-by: James Morris

Arjan van de Ven
2009-10-14 05:17:36 +0800

14 Jan, 2009

1 commit

b290ebe2c [CVE-2009-0029] System call wrappers part 04 ... Browse Code »

Signed-off-by: Heiko Carstens

Heiko Carstens
2009-01-14 21:15:19 +0800

07 Jan, 2009

3 commits

ac8cc0fa5 Merge branch 'next' into for-linus Browse Code »

James Morris
2009-01-07 06:58:22 +0800
3699c53c4 CRED: Fix regression in cap_capable() as shown up by sys_faccessat() [ver #3] ... Browse Code »

Fix a regression in cap_capable() due to:

commit 3b11a1decef07c19443d24ae926982bc8ec9f4c0
Author: David Howells
Date: Fri Nov 14 10:39:26 2008 +1100

CRED: Differentiate objective and effective subjective credentials on a task

The problem is that the above patch allows a process to have two sets of
credentials, and for the most part uses the subjective credentials when
accessing current's creds.

There is, however, one exception: cap_capable(), and thus capable(), uses the
real/objective credentials of the target task, whether or not it is the current
task.

Ordinarily this doesn't matter, since usually the two cred pointers in current
point to the same set of creds. However, sys_faccessat() makes use of this
facility to override the credentials of the calling process to make its test,
without affecting the creds as seen from other processes.

One of the things sys_faccessat() does is to make an adjustment to the
effective capabilities mask, which cap_capable(), as it stands, then ignores.

The affected capability check is in generic_permission():

if (!(mask & MAY_EXEC) || execute_ok(inode))
if (capable(CAP_DAC_OVERRIDE))
return 0;

This change passes the set of credentials to be tested down into the commoncap
and SELinux code. The security functions called by capable() and
has_capability() select the appropriate set of credentials from the process
being checked.

This can be tested by compiling the following program from the XFS testsuite:

/*
* t_access_root.c - trivial test program to show permission bug.
*
* Written by Michael Kerrisk - copyright ownership not pursued.
* Sourced from: http://linux.derkeiler.com/Mailing-Lists/Kernel/2003-10/6030.html
*/
#include
#include
#include
#include
#include
#include

#define UID 500
#define GID 100
#define PERM 0
#define TESTPATH "/tmp/t_access"

static void
errExit(char *msg)
{
perror(msg);
exit(EXIT_FAILURE);
} /* errExit */

static void
accessTest(char *file, int mask, char *mstr)
{
printf("access(%s, %s) returns %d\n", file, mstr, access(file, mask));
} /* accessTest */

int
main(int argc, char *argv[])
{
int fd, perm, uid, gid;
char *testpath;
char cmd[PATH_MAX + 20];

testpath = (argc > 1) ? argv[1] : TESTPATH;
perm = (argc > 2) ? strtoul(argv[2], NULL, 8) : PERM;
uid = (argc > 3) ? atoi(argv[3]) : UID;
gid = (argc > 4) ? atoi(argv[4]) : GID;

unlink(testpath);

fd = open(testpath, O_RDWR | O_CREAT, 0);
if (fd == -1) errExit("open");

if (fchown(fd, uid, gid) == -1) errExit("fchown");
if (fchmod(fd, perm) == -1) errExit("fchmod");
close(fd);

snprintf(cmd, sizeof(cmd), "ls -l %s", testpath);
system(cmd);

if (seteuid(uid) == -1) errExit("seteuid");

accessTest(testpath, 0, "0");
accessTest(testpath, R_OK, "R_OK");
accessTest(testpath, W_OK, "W_OK");
accessTest(testpath, X_OK, "X_OK");
accessTest(testpath, R_OK | W_OK, "R_OK | W_OK");
accessTest(testpath, R_OK | X_OK, "R_OK | X_OK");
accessTest(testpath, W_OK | X_OK, "W_OK | X_OK");
accessTest(testpath, R_OK | W_OK | X_OK, "R_OK | W_OK | X_OK");

exit(EXIT_SUCCESS);
} /* main */

This can be run against an Ext3 filesystem as well as against an XFS
filesystem. If successful, it will show:

[root@andromeda src]# ./t_access_root /tmp/xxx 0 4043 4043
---------- 1 dhowells dhowells 0 2008-12-31 03:00 /tmp/xxx
access(/tmp/xxx, 0) returns 0
access(/tmp/xxx, R_OK) returns 0
access(/tmp/xxx, W_OK) returns 0
access(/tmp/xxx, X_OK) returns -1
access(/tmp/xxx, R_OK | W_OK) returns 0
access(/tmp/xxx, R_OK | X_OK) returns -1
access(/tmp/xxx, W_OK | X_OK) returns -1
access(/tmp/xxx, R_OK | W_OK | X_OK) returns -1

If unsuccessful, it will show:

[root@andromeda src]# ./t_access_root /tmp/xxx 0 4043 4043
---------- 1 dhowells dhowells 0 2008-12-31 02:56 /tmp/xxx
access(/tmp/xxx, 0) returns 0
access(/tmp/xxx, R_OK) returns -1
access(/tmp/xxx, W_OK) returns -1
access(/tmp/xxx, X_OK) returns -1
access(/tmp/xxx, R_OK | W_OK) returns -1
access(/tmp/xxx, R_OK | X_OK) returns -1
access(/tmp/xxx, W_OK | X_OK) returns -1
access(/tmp/xxx, R_OK | W_OK | X_OK) returns -1

I've also tested the fix with the SELinux and syscalls LTP testsuites.

Signed-off-by: David Howells
Tested-by: J. Bruce Fields
Acked-by: Serge Hallyn
Signed-off-by: James Morris

David Howells
2009-01-07 06:38:48 +0800
29881c450 Revert "CRED: Fix regression in cap_capable() as shown up by sys_faccessat() [ver #2]" ... Browse Code »

This reverts commit 14eaddc967b16017d4a1a24d2be6c28ecbe06ed8.

David has a better version to come.

James Morris
2009-01-07 06:21:54 +0800

05 Jan, 2009

2 commits

14eaddc96 CRED: Fix regression in cap_capable() as shown up by sys_faccessat() [ver #2] ... Browse Code »

Fix a regression in cap_capable() due to:

commit 5ff7711e635b32f0a1e558227d030c7e45b4a465
Author: David Howells
Date: Wed Dec 31 02:52:28 2008 +0000

CRED: Differentiate objective and effective subjective credentials on a task

The problem is that the above patch allows a process to have two sets of
credentials, and for the most part uses the subjective credentials when
accessing current's creds.

There is, however, one exception: cap_capable(), and thus capable(), uses the
real/objective credentials of the target task, whether or not it is the current
task.

Ordinarily this doesn't matter, since usually the two cred pointers in current
point to the same set of creds. However, sys_faccessat() makes use of this
facility to override the credentials of the calling process to make its test,
without affecting the creds as seen from other processes.

One of the things sys_faccessat() does is to make an adjustment to the
effective capabilities mask, which cap_capable(), as it stands, then ignores.

The affected capability check is in generic_permission():

if (!(mask & MAY_EXEC) || execute_ok(inode))
if (capable(CAP_DAC_OVERRIDE))
return 0;

This change splits capable() from has_capability() down into the commoncap and
SELinux code. The capable() security op now only deals with the current
process, and uses the current process's subjective creds. A new security op -
task_capable() - is introduced that can check any task's objective creds.

strictly the capable() security op is superfluous with the presence of the
task_capable() op, however it should be faster to call the capable() op since
two fewer arguments need be passed down through the various layers.

This can be tested by compiling the following program from the XFS testsuite:

/*
* t_access_root.c - trivial test program to show permission bug.
*
* Written by Michael Kerrisk - copyright ownership not pursued.
* Sourced from: http://linux.derkeiler.com/Mailing-Lists/Kernel/2003-10/6030.html
*/
#include
#include
#include
#include
#include
#include

#define UID 500
#define GID 100
#define PERM 0
#define TESTPATH "/tmp/t_access"

static void
errExit(char *msg)
{
perror(msg);
exit(EXIT_FAILURE);
} /* errExit */

static void
accessTest(char *file, int mask, char *mstr)
{
printf("access(%s, %s) returns %d\n", file, mstr, access(file, mask));
} /* accessTest */

int
main(int argc, char *argv[])
{
int fd, perm, uid, gid;
char *testpath;
char cmd[PATH_MAX + 20];

testpath = (argc > 1) ? argv[1] : TESTPATH;
perm = (argc > 2) ? strtoul(argv[2], NULL, 8) : PERM;
uid = (argc > 3) ? atoi(argv[3]) : UID;
gid = (argc > 4) ? atoi(argv[4]) : GID;

unlink(testpath);

fd = open(testpath, O_RDWR | O_CREAT, 0);
if (fd == -1) errExit("open");

if (fchown(fd, uid, gid) == -1) errExit("fchown");
if (fchmod(fd, perm) == -1) errExit("fchmod");
close(fd);

snprintf(cmd, sizeof(cmd), "ls -l %s", testpath);
system(cmd);

if (seteuid(uid) == -1) errExit("seteuid");

accessTest(testpath, 0, "0");
accessTest(testpath, R_OK, "R_OK");
accessTest(testpath, W_OK, "W_OK");
accessTest(testpath, X_OK, "X_OK");
accessTest(testpath, R_OK | W_OK, "R_OK | W_OK");
accessTest(testpath, R_OK | X_OK, "R_OK | X_OK");
accessTest(testpath, W_OK | X_OK, "W_OK | X_OK");
accessTest(testpath, R_OK | W_OK | X_OK, "R_OK | W_OK | X_OK");

exit(EXIT_SUCCESS);
} /* main */

This can be run against an Ext3 filesystem as well as against an XFS
filesystem. If successful, it will show:

[root@andromeda src]# ./t_access_root /tmp/xxx 0 4043 4043
---------- 1 dhowells dhowells 0 2008-12-31 03:00 /tmp/xxx
access(/tmp/xxx, 0) returns 0
access(/tmp/xxx, R_OK) returns 0
access(/tmp/xxx, W_OK) returns 0
access(/tmp/xxx, X_OK) returns -1
access(/tmp/xxx, R_OK | W_OK) returns 0
access(/tmp/xxx, R_OK | X_OK) returns -1
access(/tmp/xxx, W_OK | X_OK) returns -1
access(/tmp/xxx, R_OK | W_OK | X_OK) returns -1

If unsuccessful, it will show:

[root@andromeda src]# ./t_access_root /tmp/xxx 0 4043 4043
---------- 1 dhowells dhowells 0 2008-12-31 02:56 /tmp/xxx
access(/tmp/xxx, 0) returns 0
access(/tmp/xxx, R_OK) returns -1
access(/tmp/xxx, W_OK) returns -1
access(/tmp/xxx, X_OK) returns -1
access(/tmp/xxx, R_OK | W_OK) returns -1
access(/tmp/xxx, R_OK | X_OK) returns -1
access(/tmp/xxx, W_OK | X_OK) returns -1
access(/tmp/xxx, R_OK | W_OK | X_OK) returns -1

I've also tested the fix with the SELinux and syscalls LTP testsuites.

Signed-off-by: David Howells
Signed-off-by: James Morris

David Howells
2009-01-05 08:17:04 +0800
57f71a0af sanitize audit_log_capset() ... Browse Code »

* no allocations
* return void
* don't duplicate checked for dummy context

Signed-off-by: Al Viro

Al Viro
2009-01-05 04:14:41 +0800

14 Nov, 2008

3 commits

d84f4f992 CRED: Inaugurate COW credentials ... Browse Code »

Inaugurate copy-on-write credentials management. This uses RCU to manage the
credentials pointer in the task_struct with respect to accesses by other tasks.
A process may only modify its own credentials, and so does not need locking to
access or modify its own credentials.

A mutex (cred_replace_mutex) is added to the task_struct to control the effect
of PTRACE_ATTACHED on credential calculations, particularly with respect to
execve().

With this patch, the contents of an active credentials struct may not be
changed directly; rather a new set of credentials must be prepared, modified
and committed using something like the following sequence of events:

struct cred *new = prepare_creds();
int ret = blah(new);
if (ret < 0) {
abort_creds(new);
return ret;
}
return commit_creds(new);

There are some exceptions to this rule: the keyrings pointed to by the active
credentials may be instantiated - keyrings violate the COW rule as managing
COW keyrings is tricky, given that it is possible for a task to directly alter
the keys in a keyring in use by another task.

To help enforce this, various pointers to sets of credentials, such as those in
the task_struct, are declared const. The purpose of this is compile-time
discouragement of altering credentials through those pointers. Once a set of
credentials has been made public through one of these pointers, it may not be
modified, except under special circumstances:

(1) Its reference count may incremented and decremented.

(2) The keyrings to which it points may be modified, but not replaced.

The only safe way to modify anything else is to create a replacement and commit
using the functions described in Documentation/credentials.txt (which will be
added by a later patch).

This patch and the preceding patches have been tested with the LTP SELinux
testsuite.

This patch makes several logical sets of alteration:

(1) execve().

This now prepares and commits credentials in various places in the
security code rather than altering the current creds directly.

(2) Temporary credential overrides.

do_coredump() and sys_faccessat() now prepare their own credentials and
temporarily override the ones currently on the acting thread, whilst
preventing interference from other threads by holding cred_replace_mutex
on the thread being dumped.

This will be replaced in a future patch by something that hands down the
credentials directly to the functions being called, rather than altering
the task's objective credentials.

(3) LSM interface.

A number of functions have been changed, added or removed:

(*) security_capset_check(), ->capset_check()
(*) security_capset_set(), ->capset_set()

Removed in favour of security_capset().

(*) security_capset(), ->capset()

New. This is passed a pointer to the new creds, a pointer to the old
creds and the proposed capability sets. It should fill in the new
creds or return an error. All pointers, barring the pointer to the
new creds, are now const.

(*) security_bprm_apply_creds(), ->bprm_apply_creds()

Changed; now returns a value, which will cause the process to be
killed if it's an error.

(*) security_task_alloc(), ->task_alloc_security()

Removed in favour of security_prepare_creds().

(*) security_cred_free(), ->cred_free()

New. Free security data attached to cred->security.

(*) security_prepare_creds(), ->cred_prepare()

New. Duplicate any security data attached to cred->security.

(*) security_commit_creds(), ->cred_commit()

New. Apply any security effects for the upcoming installation of new
security by commit_creds().

(*) security_task_post_setuid(), ->task_post_setuid()

Removed in favour of security_task_fix_setuid().

(*) security_task_fix_setuid(), ->task_fix_setuid()

Fix up the proposed new credentials for setuid(). This is used by
cap_set_fix_setuid() to implicitly adjust capabilities in line with
setuid() changes. Changes are made to the new credentials, rather
than the task itself as in security_task_post_setuid().

(*) security_task_reparent_to_init(), ->task_reparent_to_init()

Removed. Instead the task being reparented to init is referred
directly to init's credentials.

NOTE! This results in the loss of some state: SELinux's osid no
longer records the sid of the thread that forked it.

(*) security_key_alloc(), ->key_alloc()
(*) security_key_permission(), ->key_permission()

Changed. These now take cred pointers rather than task pointers to
refer to the security context.

(4) sys_capset().

This has been simplified and uses less locking. The LSM functions it
calls have been merged.

(5) reparent_to_kthreadd().

This gives the current thread the same credentials as init by simply using
commit_thread() to point that way.

(6) __sigqueue_alloc() and switch_uid()

__sigqueue_alloc() can't stop the target task from changing its creds
beneath it, so this function gets a reference to the currently applicable
user_struct which it then passes into the sigqueue struct it returns if
successful.

switch_uid() is now called from commit_creds(), and possibly should be
folded into that. commit_creds() should take care of protecting
__sigqueue_alloc().

(7) [sg]et[ug]id() and co and [sg]et_current_groups.

The set functions now all use prepare_creds(), commit_creds() and
abort_creds() to build and check a new set of credentials before applying
it.

security_task_set[ug]id() is called inside the prepared section. This
guarantees that nothing else will affect the creds until we've finished.

The calling of set_dumpable() has been moved into commit_creds().

Much of the functionality of set_user() has been moved into
commit_creds().

The get functions all simply access the data directly.

(8) security_task_prctl() and cap_task_prctl().

security_task_prctl() has been modified to return -ENOSYS if it doesn't
want to handle a function, or otherwise return the return value directly
rather than through an argument.

Additionally, cap_task_prctl() now prepares a new set of credentials, even
if it doesn't end up using it.

(9) Keyrings.

A number of changes have been made to the keyrings code:

(a) switch_uid_keyring(), copy_keys(), exit_keys() and suid_keys() have
all been dropped and built in to the credentials functions directly.
They may want separating out again later.

(b) key_alloc() and search_process_keyrings() now take a cred pointer
rather than a task pointer to specify the security context.

(c) copy_creds() gives a new thread within the same thread group a new
thread keyring if its parent had one, otherwise it discards the thread
keyring.

(d) The authorisation key now points directly to the credentials to extend
the search into rather pointing to the task that carries them.

(e) Installing thread, process or session keyrings causes a new set of
credentials to be created, even though it's not strictly necessary for
process or session keyrings (they're shared).

(10) Usermode helper.

The usermode helper code now carries a cred struct pointer in its
subprocess_info struct instead of a new session keyring pointer. This set
of credentials is derived from init_cred and installed on the new process
after it has been cloned.

call_usermodehelper_setup() allocates the new credentials and
call_usermodehelper_freeinfo() discards them if they haven't been used. A
special cred function (prepare_usermodeinfo_creds()) is provided
specifically for call_usermodehelper_setup() to call.

call_usermodehelper_setkeys() adjusts the credentials to sport the
supplied keyring as the new session keyring.

(11) SELinux.

SELinux has a number of changes, in addition to those to support the LSM
interface changes mentioned above:

(a) selinux_setprocattr() no longer does its check for whether the
current ptracer can access processes with the new SID inside the lock
that covers getting the ptracer's SID. Whilst this lock ensures that
the check is done with the ptracer pinned, the result is only valid
until the lock is released, so there's no point doing it inside the
lock.

(12) is_single_threaded().

This function has been extracted from selinux_setprocattr() and put into
a file of its own in the lib/ directory as join_session_keyring() now
wants to use it too.

The code in SELinux just checked to see whether a task shared mm_structs
with other tasks (CLONE_VM), but that isn't good enough. We really want
to know if they're part of the same thread group (CLONE_THREAD).

(13) nfsd.

The NFS server daemon now has to use the COW credentials to set the
credentials it is going to use. It really needs to pass the credentials
down to the functions it calls, but it can't do that until other patches
in this series have been applied.

Signed-off-by: David Howells
Acked-by: James Morris
Signed-off-by: James Morris

David Howells
2008-11-14 07:39:23 +0800
b6dff3ec5 CRED: Separate task security context from task_struct ... Browse Code »

Separate the task security context from task_struct. At this point, the
security data is temporarily embedded in the task_struct with two pointers
pointing to it.

Note that the Alpha arch is altered as it refers to (E)UID and (E)GID in
entry.S via asm-offsets.

With comment fixes Signed-off-by: Marc Dionne

Signed-off-by: David Howells
Acked-by: James Morris
Acked-by: Serge Hallyn
Signed-off-by: James Morris

David Howells
2008-11-14 07:39:16 +0800
1cdcbec1a CRED: Neuter sys_capset() ... Browse Code »

Take away the ability for sys_capset() to affect processes other than current.

This means that current will not need to lock its own credentials when reading
them against interference by other processes.

This has effectively been the case for a while anyway, since:

(1) Without LSM enabled, sys_capset() is disallowed.

(2) With file-based capabilities, sys_capset() is neutered.

Signed-off-by: David Howells
Acked-by: Serge Hallyn
Acked-by: Andrew G. Morgan
Acked-by: James Morris
Signed-off-by: James Morris

David Howells
2008-11-14 07:39:14 +0800

11 Nov, 2008

2 commits

637d32dc7 Capabilities: BUG when an invalid capability is requested ... Browse Code »

If an invalid (large) capability is requested the capabilities system
may panic as it is dereferencing an array of fixed (short) length. Its
possible (and actually often happens) that the capability system
accidentally stumbled into a valid memory region but it also regularly
happens that it hits invalid memory and BUGs. If such an operation does
get past cap_capable then the selinux system is sure to have problems as
it already does a (simple) validity check and BUG. This is known to
happen by the broken and buggy firegl driver.

This patch cleanly checks all capable calls and BUG if a call is for an
invalid capability. This will likely break the firegl driver for some
situations, but it is the right thing to do. Garbage into a security
system gets you killed/bugged

Signed-off-by: Eric Paris
Acked-by: Arjan van de Ven
Acked-by: Serge Hallyn
Acked-by: Andrew G. Morgan
Signed-off-by: James Morris

Eric Paris
2008-11-11 19:01:24 +0800
e68b75a02 When the capset syscall is used it is not possible for audit to record the ... Browse Code »

actual capbilities being added/removed. This patch adds a new record type
which emits the target pid and the eff, inh, and perm cap sets.

example output if you audit capset syscalls would be:

type=SYSCALL msg=audit(1225743140.465:76): arch=c000003e syscall=126 success=yes exit=0 a0=17f2014 a1=17f201c a2=80000000 a3=7fff2ab7f060 items=0 ppid=2160 pid=2223 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=1 comm="setcap" exe="/usr/sbin/setcap" subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 key=(null)
type=UNKNOWN[1322] msg=audit(1225743140.465:76): pid=0 cap_pi=ffffffffffffffff cap_pp=ffffffffffffffff cap_pe=ffffffffffffffff

Signed-off-by: Eric Paris
Acked-by: Serge Hallyn
Signed-off-by: James Morris

Eric Paris
2008-11-11 18:48:22 +0800

06 Nov, 2008

1 commit

1f29fae29 file capabilities: add no_file_caps switch (v4) ... Browse Code »

Add a no_file_caps boot option when file capabilities are
compiled into the kernel (CONFIG_SECURITY_FILE_CAPABILITIES=y).

This allows distributions to ship a kernel with file capabilities
compiled in, without forcing users to use (and understand and
trust) them.

When no_file_caps is specified at boot, then when a process executes
a file, any file capabilities stored with that file will not be
used in the calculation of the process' new capability sets.

This means that booting with the no_file_caps boot option will
not be the same as booting a kernel with file capabilities
compiled out - in particular a task with CAP_SETPCAP will not
have any chance of passing capabilities to another task (which
isn't "really" possible anyway, and which may soon by killed
altogether by David Howells in any case), and it will instead
be able to put new capabilities in its pI. However since fI
will always be empty and pI is masked with fI, it gains the
task nothing.

We also support the extra prctl options, setting securebits and
dropping capabilities from the per-process bounding set.

The other remaining difference is that killpriv, task_setscheduler,
setioprio, and setnice will continue to be hooked. That will
be noticable in the case where a root task changed its uid
while keeping some caps, and another task owned by the new uid
tries to change settings for the more privileged task.

Changelog:
Nov 05 2008: (v4) trivial port on top of always-start-\
with-clear-caps patch
Sep 23 2008: nixed file_caps_enabled when file caps are
not compiled in as it isn't used.
Document no_file_caps in kernel-parameters.txt.

Signed-off-by: Serge Hallyn
Acked-by: Andrew G. Morgan
Signed-off-by: James Morris

Serge E. Hallyn
2008-11-06 07:14:51 +0800

14 Aug, 2008

1 commit

5cd9c58fb security: Fix setting of PF_SUPERPRIV by __capable() ... Browse Code »

Fix the setting of PF_SUPERPRIV by __capable() as it could corrupt the flags
the target process if that is not the current process and it is trying to
change its own flags in a different way at the same time.

__capable() is using neither atomic ops nor locking to protect t->flags. This
patch removes __capable() and introduces has_capability() that doesn't set
PF_SUPERPRIV on the process being queried.

This patch further splits security_ptrace() in two:

(1) security_ptrace_may_access(). This passes judgement on whether one
process may access another only (PTRACE_MODE_ATTACH for ptrace() and
PTRACE_MODE_READ for /proc), and takes a pointer to the child process.
current is the parent.

(2) security_ptrace_traceme(). This passes judgement on PTRACE_TRACEME only,
and takes only a pointer to the parent process. current is the child.

In Smack and commoncap, this uses has_capability() to determine whether
the parent will be permitted to use PTRACE_ATTACH if normal checks fail.
This does not set PF_SUPERPRIV.

Two of the instances of __capable() actually only act on current, and so have
been changed to calls to capable().

Of the places that were using __capable():

(1) The OOM killer calls __capable() thrice when weighing the killability of a
process. All of these now use has_capability().

(2) cap_ptrace() and smack_ptrace() were using __capable() to check to see
whether the parent was allowed to trace any process. As mentioned above,
these have been split. For PTRACE_ATTACH and /proc, capable() is now
used, and for PTRACE_TRACEME, has_capability() is used.

(3) cap_safe_nice() only ever saw current, so now uses capable().

(4) smack_setprocattr() rejected accesses to tasks other than current just
after calling __capable(), so the order of these two tests have been
switched and capable() is used instead.

(5) In smack_file_send_sigiotask(), we need to allow privileged processes to
receive SIGIO on files they're manipulating.

(6) In smack_task_wait(), we let a process wait for a privileged process,
whether or not the process doing the waiting is privileged.

I've tested this with the LTP SELinux and syscalls testscripts.

Signed-off-by: David Howells
Acked-by: Serge Hallyn
Acked-by: Casey Schaufler
Acked-by: Andrew G. Morgan
Acked-by: Al Viro
Signed-off-by: James Morris

David Howells
2008-08-14 20:59:43 +0800

25 Jul, 2008

1 commit

ab763c711 security: filesystem capabilities refactor kernel code ... Browse Code »

To date, we've tried hard to confine filesystem support for capabilities
to the security modules. This has left a lot of the code in
kernel/capability.c in a state where it looks like it supports something
that filesystem support for capabilities actually suppresses when the LSM
security/commmoncap.c code runs. What is left is a lot of code that uses
sub-optimal locking in the main kernel

With this change we refactor the main kernel code and make it explicit
which locks are needed and that the only remaining kernel races in this
area are associated with non-filesystem capability code.

Signed-off-by: Andrew G. Morgan
Acked-by: Serge Hallyn
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew G. Morgan
2008-07-25 01:47:22 +0800

05 Jul, 2008

1 commit

086f7316f security: filesystem capabilities: fix fragile setuid fixup code ... Browse Code »

This commit includes a bugfix for the fragile setuid fixup code in the
case that filesystem capabilities are supported (in access()). The effect
of this fix is gated on filesystem capability support because changing
securebits is only supported when filesystem capabilities support is
configured.)

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Andrew G. Morgan
Acked-by: Serge Hallyn
Acked-by: David Howells
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew G. Morgan
2008-07-05 01:40:08 +0800

01 Jun, 2008

1 commit

ca05a99a5 capabilities: remain source compatible with 32-bit raw legacy capability support. ... Browse Code »

Source code out there hard-codes a notion of what the
_LINUX_CAPABILITY_VERSION #define means in terms of the semantics of the
raw capability system calls capget() and capset(). Its unfortunate, but
true.

Since the confusing header file has been in a released kernel, there is
software that is erroneously using 64-bit capabilities with the semantics
of 32-bit compatibilities. These recently compiled programs may suffer
corruption of their memory when sys_getcap() overwrites more memory than
they are coded to expect, and the raising of added capabilities when using
sys_capset().

As such, this patch does a number of things to clean up the situation
for all. It

1. forces the _LINUX_CAPABILITY_VERSION define to always retain its
legacy value.

2. adopts a new #define strategy for the kernel's internal
implementation of the preferred magic.

3. deprecates v2 capability magic in favor of a new (v3) magic
number. The functionality of v3 is entirely equivalent to v2,
the only difference being that the v2 magic causes the kernel
to log a "deprecated" warning so the admin can find applications
that may be using v2 inappropriately.

[User space code continues to be encouraged to use the libcap API which
protects the application from details like this. libcap-2.10 is the first
to support v3 capabilities.]

Fixes issue reported in https://bugzilla.redhat.com/show_bug.cgi?id=447518.
Thanks to Bojan Smojver for the report.

[akpm@linux-foundation.org: s/depreciate/deprecate/g]
[akpm@linux-foundation.org: be robust about put_user size]
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Andrew G. Morgan
Cc: Serge E. Hallyn
Cc: Bojan Smojver
Cc: stable@kernel.org
Signed-off-by: Andrew Morton
Signed-off-by: Chris Wright

Andrew G. Morgan
2008-06-01 07:36:16 +0800

06 Feb, 2008

1 commit

e338d263a Add 64-bit capability support to the kernel ... Browse Code »

The patch supports legacy (32-bit) capability userspace, and where possible
translates 32-bit capabilities to/from userspace and the VFS to 64-bit
kernel space capabilities. If a capability set cannot be compressed into
32-bits for consumption by user space, the system call fails, with -ERANGE.

FWIW libcap-2.00 supports this change (and earlier capability formats)

http://www.kernel.org/pub/linux/libs/security/linux-privs/kernel-2.6/

[akpm@linux-foundation.org: coding-syle fixes]
[akpm@linux-foundation.org: use get_task_comm()]
[ezk@cs.sunysb.edu: build fix]
[akpm@linux-foundation.org: do not initialise statics to 0 or NULL]
[akpm@linux-foundation.org: unused var]
[serue@us.ibm.com: export __cap_ symbols]
Signed-off-by: Andrew G. Morgan
Cc: Stephen Smalley
Acked-by: Serge Hallyn
Cc: Chris Wright
Cc: James Morris
Cc: Casey Schaufler
Signed-off-by: Erez Zadok
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew Morgan
2008-02-06 01:44:20 +0800

20 Oct, 2007

2 commits

8990571eb Uninline find_pid etc set of functions ... Browse Code »

The find_pid/_vpid/_pid_ns functions are used to find the struct pid by its
id, depending on whic id - global or virtual - is used.

The find_vpid() is a macro that pushes the current->nsproxy->pid_ns on the
stack to call another function - find_pid_ns(). It turned out, that this
dereference together with the push itself cause the kernel text size to
grow too much.

Move all these out-of-line. Together with the previous patch this saves a
bit less that 400 bytes from .text section.

Signed-off-by: Pavel Emelyanov
Cc: Sukadev Bhattiprolu
Cc: Oleg Nesterov
Cc: Paul Menage
Cc: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelyanov
2007-10-20 02:53:41 +0800
228ebcbe6 Uninline find_task_by_xxx set of functions ... Browse Code »

The find_task_by_something is a set of macros are used to find task by pid
depending on what kind of pid is proposed - global or virtual one. All of
them are wrappers above the most generic one - find_task_by_pid_type_ns() -
and just substitute some args for it.

It turned out, that dereferencing the current->nsproxy->pid_ns construction
and pushing one more argument on the stack inline cause kernel text size to
grow.

This patch moves all this stuff out-of-line into kernel/pid.c. Together
with the next patch it saves a bit less than 400 bytes from the .text
section.

Signed-off-by: Pavel Emelyanov
Cc: Sukadev Bhattiprolu
Cc: Oleg Nesterov
Cc: Paul Menage
Cc: "Eric W. Biederman"
Acked-by: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelyanov
2007-10-20 02:53:40 +0800