Eric Lee / smarc-fsl-linux-kernel

21 May, 2020

1 commit

87b047d2b exec: Teach prepare_exec_creds how exec treats uids & gids ... Browse Code »

It is almost possible to use the result of prepare_exec_creds with no
modifications during exec. Update prepare_exec_creds to initialize
the suid and the fsuid to the euid, and the sgid and the fsgid to the
egid. This is all that is needed to handle the common case of exec
when nothing special like a setuid exec is happening.

That this preserves the existing behavior of exec can be verified
by examing bprm_fill_uid and cap_bprm_set_creds.

This change makes it clear that the later parts of exec that
update bprm->cred are just need to handle special cases such
as setuid exec and change of domains.

Link: https://lkml.kernel.org/r/871rng22dm.fsf_-_@x220.int.ebiederm.org
Acked-by: Linus Torvalds
Reviewed-by: Kees Cook
Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2020-05-21 03:44:21 +0800

25 Mar, 2020

1 commit

aa884c113 kernel: doc: remove outdated comment cred.c ... Browse Code »

This removes an outdated comment in prepare_kernel_cred.

There is no "cred_replace_mutex" any more, so the comment must
go away.

Signed-off-by: Bernd Edlinger
Reviewed-by: Kees Cook
Signed-off-by: Eric W. Biederman

Bernd Edlinger
2020-03-25 23:04:01 +0800

15 Jan, 2020

2 commits

e033e7d4a Merge branch 'dhowells' (patches from DavidH) ... Browse Code »

Merge misc fixes from David Howells.

Two afs fixes and a key refcounting fix.

* dhowells:
afs: Fix afs_lookup() to not clobber the version on a new dentry
afs: Fix use-after-loss-of-ref
keys: Fix request_key() cache

Linus Torvalds
2020-01-15 01:56:31 +0800
8379bb84b keys: Fix request_key() cache ... Browse Code »

When the key cached by request_key() and co. is cleaned up on exit(),
the code looks in the wrong task_struct, and so clears the wrong cache.
This leads to anomalies in key refcounting when doing, say, a kernel
build on an afs volume, that then trigger kasan to report a
use-after-free when the key is viewed in /proc/keys.

Fix this by making exit_creds() look in the passed-in task_struct rather
than in current (the task_struct cleanup code is deferred by RCU and
potentially run in another task).

Fixes: 7743c48e54ee ("keys: Cache result of request_key*() temporarily in task_struct")
Signed-off-by: David Howells
Signed-off-by: Linus Torvalds

David Howells
2020-01-15 01:40:06 +0800

05 Jan, 2020

1 commit

84029fd04 memcg: account security cred as well to kmemcg ... Browse Code »

The cred_jar kmem_cache is already memcg accounted in the current kernel
but cred->security is not. Account cred->security to kmemcg.

Recently we saw high root slab usage on our production and on further
inspection, we found a buggy application leaking processes. Though that
buggy application was contained within its memcg but we observe much
more system memory overhead, couple of GiBs, during that period. This
overhead can adversely impact the isolation on the system.

One source of high overhead we found was cred->security objects, which
have a lifetime of at least the life of the process which allocated
them.

Link: http://lkml.kernel.org/r/20191205223721.40034-1-shakeelb@google.com
Signed-off-by: Shakeel Butt
Acked-by: Chris Down
Reviewed-by: Roman Gushchin
Acked-by: Michal Hocko
Cc: Johannes Weiner
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Shakeel Butt
2020-01-05 05:55:09 +0800

25 Jul, 2019

2 commits

a29a0a467 Merge branch 'access-creds' ... Browse Code »

The access() (and faccessat()) credentials change can cause an
unnecessary load on the RCU machinery because every access() call ends
up freeing the temporary access credential using RCU.

This isn't really noticeable on small machines, but if you have hundreds
of cores you can cause huge slowdowns due to RCU storms.

It's easy to avoid: the temporary access crededntials aren't actually
normally accessed using RCU at all, so we can avoid the whole issue by
just marking them as such.

* access-creds:
access: avoid the RCU grace period for the temporary subjective credentials

Linus Torvalds
2019-07-25 23:36:29 +0800
d7852fbd0 access: avoid the RCU grace period for the temporary subjective credentials ... Browse Code »

It turns out that 'access()' (and 'faccessat()') can cause a lot of RCU
work because it installs a temporary credential that gets allocated and
freed for each system call.

The allocation and freeing overhead is mostly benign, but because
credentials can be accessed under the RCU read lock, the freeing
involves a RCU grace period.

Which is not a huge deal normally, but if you have a lot of access()
calls, this causes a fair amount of seconday damage: instead of having a
nice alloc/free patterns that hits in hot per-CPU slab caches, you have
all those delayed free's, and on big machines with hundreds of cores,
the RCU overhead can end up being enormous.

But it turns out that all of this is entirely unnecessary. Exactly
because access() only installs the credential as the thread-local
subjective credential, the temporary cred pointer doesn't actually need
to be RCU free'd at all. Once we're done using it, we can just free it
synchronously and avoid all the RCU overhead.

So add a 'non_rcu' flag to 'struct cred', which can be set by users that
know they only use it in non-RCU context (there are other potential
users for this). We can make it a union with the rcu freeing list head
that we need for the RCU case, so this doesn't need any extra storage.

Note that this also makes 'get_current_cred()' clear the new non_rcu
flag, in case we have filesystems that take a long-term reference to the
cred and then expect the RCU delayed freeing afterwards. It's not
entirely clear that this is required, but it makes for clear semantics:
the subjective cred remains non-RCU as long as you only access it
synchronously using the thread-local accessors, but you _can_ use it as
a generic cred if you want to.

It is possible that we should just remove the whole RCU markings for
->cred entirely. Only ->real_cred is really supposed to be accessed
through RCU, and the long-term cred copies that nfs uses might want to
explicitly re-enable RCU freeing if required, rather than have
get_current_cred() do it implicitly.

But this is a "minimal semantic changes" change for the immediate
problem.

Acked-by: Peter Zijlstra (Intel)
Acked-by: Eric Dumazet
Acked-by: Paul E. McKenney
Cc: Oleg Nesterov
Cc: Jan Glauber
Cc: Jiri Kosina
Cc: Jayachandran Chandrasekharan Nair
Cc: Greg KH
Cc: Kees Cook
Cc: David Howells
Cc: Miklos Szeredi
Cc: Al Viro
Signed-off-by: Linus Torvalds

Linus Torvalds
2019-07-25 01:12:09 +0800

09 Jul, 2019

2 commits

c236b6dd4 Merge tag 'keys-request-20190626' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs ... Browse Code »

Pull request_key improvements from David Howells:
"These are all request_key()-related, including a fix and some improvements:

- Fix the lack of a Link permission check on a key found by
request_key(), thereby enabling request_key() to link keys that
don't grant this permission to the target keyring (which must still
grant Write permission).

Note that the key must be in the caller's keyrings already to be
found.

- Invalidate used request_key authentication keys rather than
revoking them, so that they get cleaned up immediately rather than
hanging around till the expiry time is passed.

- Move the RCU locks outwards from the keyring search functions so
that a request_key_rcu() can be provided. This can be called in RCU
mode, so it can't sleep and can't upcall - but it can be called
from LOOKUP_RCU pathwalk mode.

- Cache the latest positive result of request_key*() temporarily in
task_struct so that filesystems that make a lot of request_key()
calls during pathwalk can take advantage of it to avoid having to
redo the searching. This requires CONFIG_KEYS_REQUEST_CACHE=y.

It is assumed that the key just found is likely to be used multiple
times in each step in an RCU pathwalk, and is likely to be reused
for the next step too.

Note that the cleanup of the cache is done on TIF_NOTIFY_RESUME,
just before userspace resumes, and on exit"

* tag 'keys-request-20190626' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
keys: Kill off request_key_async{,_with_auxdata}
keys: Cache result of request_key*() temporarily in task_struct
keys: Provide request_key_rcu()
keys: Move the RCU locks outwards from the keyring search functions
keys: Invalidate used request_key authentication keys
keys: Fix request_key() lack of Link perm check on found key

Linus Torvalds
2019-07-09 10:19:37 +0800
d44a62742 Merge tag 'keys-misc-20190619' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs ... Browse Code »

Pull misc keyring updates from David Howells:
"These are some miscellaneous keyrings fixes and improvements:

- Fix a bunch of warnings from sparse, including missing RCU bits and
kdoc-function argument mismatches

- Implement a keyctl to allow a key to be moved from one keyring to
another, with the option of prohibiting key replacement in the
destination keyring.

- Grant Link permission to possessors of request_key_auth tokens so
that upcall servicing daemons can more easily arrange things such
that only the necessary auth key is passed to the actual service
program, and not all the auth keys a daemon might possesss.

- Improvement in lookup_user_key().

- Implement a keyctl to allow keyrings subsystem capabilities to be
queried.

The keyutils next branch has commits to make available, document and
test the move-key and capabilities code:

https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/keyutils.git/log

They're currently on the 'next' branch"

* tag 'keys-misc-20190619' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
keys: Add capability-checking keyctl function
keys: Reuse keyring_index_key::desc_len in lookup_user_key()
keys: Grant Link permission to possessers of request_key auth keys
keys: Add a keyctl to move a key between keyrings
keys: Hoist locking out of __key_link_begin()
keys: Break bits out of key_unlink()
keys: Change keyring_serialise_link_sem to a mutex
keys: sparse: Fix kdoc mismatches
keys: sparse: Fix incorrect RCU accesses
keys: sparse: Fix key_fs[ug]id_changed()

Linus Torvalds
2019-07-09 10:02:11 +0800

19 Jun, 2019

1 commit

7743c48e5 keys: Cache result of request_key*() temporarily in task_struct ... Browse Code »

If a filesystem uses keys to hold authentication tokens, then it needs a
token for each VFS operation that might perform an authentication check -
either by passing it to the server, or using to perform a check based on
authentication data cached locally.

For open files this isn't a problem, since the key should be cached in the
file struct since it represents the subject performing operations on that
file descriptor.

During pathwalk, however, there isn't anywhere to cache the key, except
perhaps in the nameidata struct - but that isn't exposed to the
filesystems. Further, a pathwalk can incur a lot of operations, calling
one or more of the following, for instance:

->lookup()
->permission()
->d_revalidate()
->d_automount()
->get_acl()
->getxattr()

on each dentry/inode it encounters - and each one may need to call
request_key(). And then, at the end of pathwalk, it will call the actual
operation:

->mkdir()
->mknod()
->getattr()
->open()
...

which may need to go and get the token again.

However, it is very likely that all of the operations on a single
dentry/inode - and quite possibly a sequence of them - will all want to use
the same authentication token, which suggests that caching it would be a
good idea.

To this end:

(1) Make it so that a positive result of request_key() and co. that didn't
require upcalling to userspace is cached temporarily in task_struct.

(2) The cache is 1 deep, so a new result displaces the old one.

(3) The key is released by exit and by notify-resume.

(4) The cache is cleared in a newly forked process.

Signed-off-by: David Howells

David Howells
2019-06-19 23:10:15 +0800

12 Jun, 2019

2 commits

aa7235483 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace ... Browse Code »

Pull ptrace fixes from Eric Biederman:
"This is just two very minor fixes:

- prevent ptrace from reading unitialized kernel memory found twice
by syzkaller

- restore a missing smp_rmb in ptrace_may_access and add comment tp
it so it is not removed by accident again.

Apologies for being a little slow about getting this to you, I am
still figuring out how to develop with a little baby in the house"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
ptrace: restore smp_rmb() in __ptrace_may_access()
signal/ptrace: Don't leak unitialized kernel memory with PTRACE_PEEK_SIGINFO

Linus Torvalds
2019-06-12 09:44:45 +0800
f6581f5b5 ptrace: restore smp_rmb() in __ptrace_may_access() ... Browse Code »

Restore the read memory barrier in __ptrace_may_access() that was deleted
a couple years ago. Also add comments on this barrier and the one it pairs
with to explain why they're there (as far as I understand).

Fixes: bfedb589252c ("mm: Add a user_ns owner to mm_struct and fix ptrace permission checks")
Cc: stable@vger.kernel.org
Acked-by: Kees Cook
Acked-by: Oleg Nesterov
Signed-off-by: Jann Horn
Signed-off-by: Eric W. Biederman

Jann Horn
2019-06-12 04:08:28 +0800

24 May, 2019

1 commit

b4d0d230c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 36 ... Browse Code »

Based on 1 normalized pattern(s):

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public licence as published by
the free software foundation either version 2 of the licence or at
your option any later version

extracted by the scancode license scanner the SPDX license identifier

GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 114 file(s).

Signed-off-by: Thomas Gleixner
Reviewed-by: Allison Randal
Reviewed-by: Kate Stewart
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190520170857.552531963@linutronix.de
Signed-off-by: Greg Kroah-Hartman

Thomas Gleixner
2019-05-24 23:27:11 +0800

22 May, 2019

1 commit

2e21865fa keys: sparse: Fix key_fs[ug]id_changed() ... Browse Code »

Sparse warnings are incurred by key_fs[ug]id_changed() due to unprotected
accesses of tsk->cred, which is marked __rcu.

Fix this by passing the new cred struct to these functions from
commit_creds() rather than the task pointer.

Signed-off-by: David Howells
Reviewed-by: James Morris

David Howells
2019-05-22 21:06:51 +0800

09 Jan, 2019

1 commit

98c886513 SELinux: Remove cred security blob poisoning ... Browse Code »

The SELinux specific credential poisioning only makes sense
if SELinux is managing the credentials. As the intent of this
patch set is to move the blob management out of the modules
and into the infrastructure, the SELinux specific code has
to go. The poisioning could be introduced into the infrastructure
at some later date.

Signed-off-by: Casey Schaufler
Reviewed-by: Kees Cook
Signed-off-by: Kees Cook

Casey Schaufler
2019-01-09 05:18:44 +0800

20 Dec, 2018

3 commits

a6d8e7637 cred: export get_task_cred(). ... Browse Code »

There is no reason that modules should not be able
to use this, and NFS will need it when converted to
use 'struct cred'.

Signed-off-by: NeilBrown
Signed-off-by: Anna Schumaker

NeilBrown
2018-12-20 02:52:44 +0800
97d0fb239 cred: add get_cred_rcu() ... Browse Code »

Sometimes we want to opportunistically get a
ref to a cred in an rcu_read_lock protected section.
get_task_cred() does this, and NFS does as similar thing
with its own credential structures.
To prepare for NFS converting to use 'struct cred' more
uniformly, define get_cred_rcu(), and use it in
get_task_cred().

Signed-off-by: NeilBrown
Signed-off-by: Anna Schumaker

NeilBrown
2018-12-20 02:52:44 +0800
d89b22d46 cred: add cred_fscmp() for comparing creds. ... Browse Code »

NFS needs to compare to credentials, to see if they can
be treated the same w.r.t. filesystem access. Sometimes
an ordering is needed when credentials are used as a key
to an rbtree.
NFS currently has its own private credential management from
before 'struct cred' existed. To move it over to more consistent
use of 'struct cred' we need a comparison function.
This patch adds that function.

Signed-off-by: NeilBrown
Signed-off-by: Anna Schumaker

NeilBrown
2018-12-20 02:52:44 +0800

19 May, 2017

1 commit

af777cd1b doc: ReSTify credentials.txt ... Browse Code »

This updates the credentials API documentation to ReST markup and moves
it under the security subsection of kernel API documentation.

Cc: David Howells
Signed-off-by: Kees Cook
Signed-off-by: Jonathan Corbet

Kees Cook
2017-05-19 00:30:19 +0800

02 Mar, 2017

1 commit

f7ccbae45 sched/headers: Prepare for new header dependencies before moving code to <linux/sched/coredump.h> ... Browse Code »

We are going to split out of , which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder file that just
maps to to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar

Ingo Molnar
2017-03-02 15:42:28 +0800

01 Jul, 2016

1 commit

5f65e5ca2 cred: Reject inodes with invalid ids in set_create_file_as() ... Browse Code »

Using INVALID_[UG]ID for the LSM file creation context doesn't
make sense, so return an error if the inode passed to
set_create_file_as() has an invalid id.

Signed-off-by: Seth Forshee
Acked-by: Serge Hallyn
Signed-off-by: Eric W. Biederman

Seth Forshee
2016-07-01 07:05:09 +0800

15 Jan, 2016

1 commit

5d097056c kmemcg: account certain kmem allocations to memcg ... Browse Code »

Mark those kmem allocations that are known to be easily triggered from
userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to
memcg. For the list, see below:

- threadinfo
- task_struct
- task_delay_info
- pid
- cred
- mm_struct
- vm_area_struct and vm_region (nommu)
- anon_vma and anon_vma_chain
- signal_struct
- sighand_struct
- fs_struct
- files_struct
- fdtable and fdtable->full_fds_bits
- dentry and external_name
- inode for all filesystems. This is the most tedious part, because
most filesystems overwrite the alloc_inode method.

The list is far from complete, so feel free to add more objects.
Nevertheless, it should be close to "account everything" approach and
keep most workloads within bounds. Malevolent users will be able to
breach the limit, but this was possible even with the former "account
everything" approach (simply because it did not account everything in
fact).

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Vladimir Davydov
Acked-by: Johannes Weiner
Acked-by: Michal Hocko
Cc: Tejun Heo
Cc: Greg Thelen
Cc: Christoph Lameter
Cc: Pekka Enberg
Cc: David Rientjes
Cc: Joonsoo Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vladimir Davydov
2016-01-15 08:00:49 +0800

11 Sep, 2015

1 commit

52aa8536f kernel/cred.c: remove unnecessary kdebug atomic reads ... Browse Code »

Commit e0e817392b9a ("CRED: Add some configurable debugging [try #6]")
added the kdebug mechanism to this file back in 2009.

The kdebug macro calls no_printk which always evaluates arguments.

Most of the kdebug uses have an unnecessary call of
atomic_read(&cred->usage)

Make the kdebug macro do nothing by defining it with
do { if (0) no_printk(...); } while (0)
when not enabled.

$ size kernel/cred.o* (defconfig x86-64)
text data bss dec hex filename
2748 336 8 3092 c14 kernel/cred.o.new
2788 336 8 3132 c3c kernel/cred.o.old

Miscellanea:
o Neaten the #define kdebug macros while there

Signed-off-by: Joe Perches
Cc: David Howells
Cc: James Morris
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joe Perches
2015-09-11 04:29:01 +0800

16 Apr, 2015

1 commit

2813893f8 kernel: conditionally support non-root users, groups and capabilities ... Browse Code »

There are a lot of embedded systems that run most or all of their
functionality in init, running as root:root. For these systems,
supporting multiple users is not necessary.

This patch adds a new symbol, CONFIG_MULTIUSER, that makes support for
non-root users, non-root groups, and capabilities optional. It is enabled
under CONFIG_EXPERT menu.

When this symbol is not defined, UID and GID are zero in any possible case
and processes always have all capabilities.

The following syscalls are compiled out: setuid, setregid, setgid,
setreuid, setresuid, getresuid, setresgid, getresgid, setgroups,
getgroups, setfsuid, setfsgid, capget, capset.

Also, groups.c is compiled out completely.

In kernel/capability.c, capable function was moved in order to avoid
adding two ifdef blocks.

This change saves about 25 KB on a defconfig build. The most minimal
kernels have total text sizes in the high hundreds of kB rather than
low MB. (The 25k goes down a bit with allnoconfig, but not that much.

The kernel was booted in Qemu. All the common functionalities work.
Adding users/groups is not possible, failing with -ENOSYS.

Bloat-o-meter output:
add/remove: 7/87 grow/shrink: 19/397 up/down: 1675/-26325 (-24650)

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Iulia Manda
Reviewed-by: Josh Triplett
Acked-by: Geert Uytterhoeven
Tested-by: Paul E. McKenney
Reviewed-by: Paul E. McKenney
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Iulia Manda
2015-04-16 07:35:22 +0800

19 Dec, 2012

1 commit

a2faf2fc5 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace ... Browse Code »

Pull (again) user namespace infrastructure changes from Eric Biederman:
"Those bugs, those darn embarrasing bugs just want don't want to get
fixed.

Linus I just updated my mirror of your kernel.org tree and it appears
you successfully pulled everything except the last 4 commits that fix
those embarrasing bugs.

When you get a chance can you please repull my branch"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
userns: Fix typo in description of the limitation of userns_install
userns: Add a more complete capability subset test to commit_creds
userns: Require CAP_SYS_ADMIN for most uses of setns.
Fix cap_capable to only allow owners in the parent user namespace to have caps.

Linus Torvalds
2012-12-19 02:55:28 +0800

17 Dec, 2012

1 commit

2a74dbb9a Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security ... Browse Code »

Pull security subsystem updates from James Morris:
"A quiet cycle for the security subsystem with just a few maintenance
updates."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
Smack: create a sysfs mount point for smackfs
Smack: use select not depends in Kconfig
Yama: remove locking from delete path
Yama: add RCU to drop read locking
drivers/char/tpm: remove tasklet and cleanup
KEYS: Use keyring_alloc() to create special keyrings
KEYS: Reduce initial permissions on keys
KEYS: Make the session and process keyrings per-thread
seccomp: Make syscall skipping and nr changes more consistent
key: Fix resource leak
keys: Fix unreachable code
KEYS: Add payload preparsing opportunity prior to key instantiate or update

Linus Torvalds
2012-12-17 07:40:50 +0800

15 Dec, 2012

1 commit

aa6d054e5 userns: Add a more complete capability subset test to commit_creds ... Browse Code »

When unsharing a user namespace we reduce our credentials to just what
can be done in that user namespace. This is a subset of the credentials
we previously had. Teach commit_creds to recognize this is a subset
of the credentials we have had before and don't clear the dumpability flag.

This allows an unprivileged program to do:
unshare(CLONE_NEWUSER);
fd = open("/proc/self/uid_map", O_RDWR);

Where previously opening the uid_map writable would fail because
the the task had been made non-dumpable.

Acked-by: Serge Hallyn
Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2012-12-15 10:36:26 +0800

03 Oct, 2012

1 commit

3a50597de KEYS: Make the session and process keyrings per-thread ... Browse Code »

Make the session keyring per-thread rather than per-process, but still
inherited from the parent thread to solve a problem with PAM and gdm.

The problem is that join_session_keyring() will reject attempts to change the
session keyring of a multithreaded program but gdm is now multithreaded before
it gets to the point of starting PAM and running pam_keyinit to create the
session keyring. See:

https://bugs.freedesktop.org/show_bug.cgi?id=49211

The reason that join_session_keyring() will only change the session keyring
under a single-threaded environment is that it's hard to alter the other
thread's credentials to effect the change in a multi-threaded program. The
problems are such as:

(1) How to prevent two threads both running join_session_keyring() from
racing.

(2) Another thread's credentials may not be modified directly by this process.

(3) The number of threads is uncertain whilst we're not holding the
appropriate spinlock, making preallocation slightly tricky.

(4) We could use TIF_NOTIFY_RESUME and key_replace_session_keyring() to get
another thread to replace its keyring, but that means preallocating for
each thread.

A reasonable way around this is to make the session keyring per-thread rather
than per-process and just document that if you want a common session keyring,
you must get it before you spawn any threads - which is the current situation
anyway.

Whilst we're at it, we can the process keyring behave in the same way. This
means we can clean up some of the ickyness in the creds code.

Basically, after this patch, the session, process and thread keyrings are about
inheritance rules only and not about sharing changes of keyring.

Reported-by: Mantas M.
Signed-off-by: David Howells
Tested-by: Ray Strode

David Howells
2012-10-03 02:24:29 +0800

24 Aug, 2012

1 commit

c9235f487 userns: Make credential debugging user namespace safe. ... Browse Code »

Cc: David Howells
Acked-by: Serge Hallyn
Signed-off-by: Eric W. Biederman

Eric W. Biederman
2012-08-24 13:54:18 +0800

24 May, 2012

2 commits

f23ca3354 keys: kill task_struct->replacement_session_keyring ... Browse Code »

Kill the no longer used task_struct->replacement_session_keyring, update
copy_creds() and exit_creds().

Signed-off-by: Oleg Nesterov
Acked-by: David Howells
Cc: Thomas Gleixner
Cc: Richard Kuo
Cc: Linus Torvalds
Cc: Alexander Gordeev
Cc: Chris Zankel
Cc: David Smith
Cc: "Frank Ch. Eigler"
Cc: Geert Uytterhoeven
Cc: Larry Woodman
Cc: Peter Zijlstra
Cc: Tejun Heo
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Al Viro

Oleg Nesterov
2012-05-24 10:11:41 +0800
644473e9c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace ... Browse Code »

Pull user namespace enhancements from Eric Biederman:
"This is a course correction for the user namespace, so that we can
reach an inexpensive, maintainable, and reasonably complete
implementation.

Highlights:
- Config guards make it impossible to enable the user namespace and
code that has not been converted to be user namespace safe.

- Use of the new kuid_t type ensures the if you somehow get past the
config guards the kernel will encounter type errors if you enable
user namespaces and attempt to compile in code whose permission
checks have not been updated to be user namespace safe.

- All uids from child user namespaces are mapped into the initial
user namespace before they are processed. Removing the need to add
an additional check to see if the user namespace of the compared
uids remains the same.

- With the user namespaces compiled out the performance is as good or
better than it is today.

- For most operations absolutely nothing changes performance or
operationally with the user namespace enabled.

- The worst case performance I could come up with was timing 1
billion cache cold stat operations with the user namespace code
enabled. This went from 156s to 164s on my laptop (or 156ns to
164ns per stat operation).

- (uid_t)-1 and (gid_t)-1 are reserved as an internal error value.
Most uid/gid setting system calls treat these value specially
anyway so attempting to use -1 as a uid would likely cause
entertaining failures in userspace.

- If setuid is called with a uid that can not be mapped setuid fails.
I have looked at sendmail, login, ssh and every other program I
could think of that would call setuid and they all check for and
handle the case where setuid fails.

- If stat or a similar system call is called from a context in which
we can not map a uid we lie and return overflowuid. The LFS
experience suggests not lying and returning an error code might be
better, but the historical precedent with uids is different and I
can not think of anything that would break by lying about a uid we
can't map.

- Capabilities are localized to the current user namespace making it
safe to give the initial user in a user namespace all capabilities.

My git tree covers all of the modifications needed to convert the core
kernel and enough changes to make a system bootable to runlevel 1."

Fix up trivial conflicts due to nearby independent changes in fs/stat.c

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (46 commits)
userns: Silence silly gcc warning.
cred: use correct cred accessor with regards to rcu read lock
userns: Convert the move_pages, and migrate_pages permission checks to use uid_eq
userns: Convert cgroup permission checks to use uid_eq
userns: Convert tmpfs to use kuid and kgid where appropriate
userns: Convert sysfs to use kgid/kuid where appropriate
userns: Convert sysctl permission checks to use kuid and kgids.
userns: Convert proc to use kuid/kgid where appropriate
userns: Convert ext4 to user kuid/kgid where appropriate
userns: Convert ext3 to use kuid/kgid where appropriate
userns: Convert ext2 to use kuid/kgid where appropriate.
userns: Convert devpts to use kuid/kgid where appropriate
userns: Convert binary formats to use kuid/kgid where appropriate
userns: Add negative depends on entries to avoid building code that is userns unsafe
userns: signal remove unnecessary map_cred_ns
userns: Teach inode_capable to understand inodes whose uids map to other namespaces.
userns: Fail exec for suid and sgid binaries with ids outside our user namespace.
userns: Convert stat to return values mapped from kuids and kgids
userns: Convert user specfied uids and gids in chown into kuids and kgid
userns: Use uid_eq gid_eq helpers when comparing kuids and kgids in the vfs
...

Linus Torvalds
2012-05-24 08:42:39 +0800

03 May, 2012

1 commit

078de5f70 userns: Store uid and gid values in struct cred with kuid_t and kgid_t types ... Browse Code »

cred.h and a few trivial users of struct cred are changed. The rest of the users
of struct cred are left for other patches as there are too many changes to make
in one go and leave the change reviewable. If the user namespace is disabled and
CONFIG_UIDGID_STRICT_TYPE_CHECKS are disabled the code will contiue to compile
and behave correctly.

Acked-by: Serge Hallyn
Signed-off-by: Eric W. Biederman

Eric W. Biederman
2012-05-03 18:28:38 +0800

11 Apr, 2012

1 commit

79549c6df cred: copy_process() should clear child->replacement_session_keyring ... Browse Code »
1

keyctl_session_to_parent(task) sets ->replacement_session_keyring,
it should be processed and cleared by key_replace_session_keyring().

However, this task can fork before it notices TIF_NOTIFY_RESUME and
the new child gets the bogus ->replacement_session_keyring copied by
dup_task_struct(). This is obviously wrong and, if nothing else, this
leads to put_cred(already_freed_cred).

change copy_creds() to clear this member. If copy_process() fails
before this point the wrong ->replacement_session_keyring doesn't
matter, exit_creds() won't be called.

Cc:
Signed-off-by: Oleg Nesterov
Acked-by: David Howells
Signed-off-by: Linus Torvalds

Oleg Nesterov
2012-04-11 23:20:11 +0800

08 Apr, 2012

1 commit

0093ccb68 cred: Refcount the user_ns pointed to by the cred. ... Browse Code »

struct user_struct will shortly loose it's user_ns reference
so make the cred user_ns reference a proper reference complete
with reference counting.

Acked-by: Serge Hallyn
Signed-off-by: Eric W. Biederman

Eric W. Biederman
2012-04-08 07:55:52 +0800

14 Feb, 2012

1 commit

404015308 security: trim security.h ... Browse Code »

Trim security.h

Signed-off-by: Al Viro
Signed-off-by: James Morris

Al Viro
2012-02-14 07:45:42 +0800

31 Oct, 2011

1 commit

9984de1a5 kernel: Map most files to use export.h instead of module.h ... Browse Code »

The changed files were only including linux/module.h for the
EXPORT_SYMBOL infrastructure, and nothing else. Revector them
onto the isolated export header for faster compile times.

Nothing to see here but a whole lot of instances of:

-#include
+#include

This commit is only changing the kernel dir; next targets
will probably be mm, fs, the arch dirs, etc.

Signed-off-by: Paul Gortmaker

Paul Gortmaker
2011-10-31 21:20:12 +0800

25 Oct, 2011

1 commit

36b8d186e Merge branch 'next' of git://selinuxproject.org/~jmorris/linux-security ... Browse Code »

* 'next' of git://selinuxproject.org/~jmorris/linux-security: (95 commits)
TOMOYO: Fix incomplete read after seek.
Smack: allow to access /smack/access as normal user
TOMOYO: Fix unused kernel config option.
Smack: fix: invalid length set for the result of /smack/access
Smack: compilation fix
Smack: fix for /smack/access output, use string instead of byte
Smack: domain transition protections (v3)
Smack: Provide information for UDS getsockopt(SO_PEERCRED)
Smack: Clean up comments
Smack: Repair processing of fcntl
Smack: Rule list lookup performance
Smack: check permissions from user space (v2)
TOMOYO: Fix quota and garbage collector.
TOMOYO: Remove redundant tasklist_lock.
TOMOYO: Fix domain transition failure warning.
TOMOYO: Remove tomoyo_policy_memory_lock spinlock.
TOMOYO: Simplify garbage collector.
TOMOYO: Fix make namespacecheck warnings.
target: check hex2bin result
encrypted-keys: check hex2bin result
...

Linus Torvalds
2011-10-25 15:45:31 +0800

23 Aug, 2011

2 commits

8ad346c62 CRED: fix build error due to 'tgcred' undeclared ... Browse Code »

This patch adds CONFIG_KEYS guard for tgcred to fix below build error
if CONFIG_KEYS is not configured.

CC kernel/cred.o
kernel/cred.c: In function 'prepare_kernel_cred':
kernel/cred.c:657: error: 'tgcred' undeclared (first use in this function)
kernel/cred.c:657: error: (Each undeclared identifier is reported only once
kernel/cred.c:657: error: for each function it appears in.)
make[1]: *** [kernel/cred.o] Error 1
make: *** [kernel] Error 2

Signed-off-by: Axel Lin
Acked-by: David Howells
Signed-off-by: James Morris

Axel Lin
2011-08-23 16:22:28 +0800
012146d07 CRED: Fix prepare_kernel_cred() to provide a new thread_group_cred struct ... Browse Code »

Fix prepare_kernel_cred() to provide a new, separate thread_group_cred struct
otherwise when using request_key() ____call_usermodehelper() calls
umh_keys_init() with the new creds pointing to init_tgcred, which
umh_keys_init() then blithely alters.

The problem can be demonstrated by:

# keyctl request2 user a debug:a @s
249681132
# grep req /proc/keys
079906a5 I--Q-- 1 perm 1f3f0000 0 0 keyring _req.249681132: 1/4
38ef1626 IR---- 1 expd 0b010000 0 0 .request_ key:ee1d4ec pid:4371 ci:1

The keyring _req.XXXX should have gone away, but something (init_tgcred) is
pinning it.

That key actually requested can then be removed and a new one created:

# keyctl unlink 249681132
1 links removed
[root@andromeda ~]# grep req /proc/keys
116cecac IR---- 1 expd 0b010000 0 0 .request_ key:eeb4911 pid:4379 ci:1
36d1cbf8 I--Q-- 1 perm 1f3f0000 0 0 keyring _req.250300689: 1/4

which causes the old _req keyring to go away and a new one to take its place.

This is a consequence of the changes in:

commit 879669961b11e7f40b518784863a259f735a72bf
Author: David Howells
Date: Fri Jun 17 11:25:59 2011 +0100
KEYS/DNS: Fix ____call_usermodehelper() to not lose the session keyring

and:

commit 17f60a7da150fdd0cfb9756f86a262daa72c835f
Author: Eric Paris
Date: Fri Apr 1 17:07:50 2011 -0400
capabilites: allow the application of capability limits to usermode helpers

After this patch is applied, the _req keyring and the .request_key key are
cleaned up.

Signed-off-by: David Howells
cc: Eric Paris
Signed-off-by: James Morris

David Howells
2011-08-23 07:57:35 +0800

12 Aug, 2011

1 commit

72fa59970 move RLIMIT_NPROC check from set_user() to do_execve_common() ... Browse Code »

The patch http://lkml.org/lkml/2003/7/13/226 introduced an RLIMIT_NPROC
check in set_user() to check for NPROC exceeding via setuid() and
similar functions.

Before the check there was a possibility to greatly exceed the allowed
number of processes by an unprivileged user if the program relied on
rlimit only. But the check created new security threat: many poorly
written programs simply don't check setuid() return code and believe it
cannot fail if executed with root privileges. So, the check is removed
in this patch because of too often privilege escalations related to
buggy programs.

The NPROC can still be enforced in the common code flow of daemons
spawning user processes. Most of daemons do fork()+setuid()+execve().
The check introduced in execve() (1) enforces the same limit as in
setuid() and (2) doesn't create similar security issues.

Neil Brown suggested to track what specific process has exceeded the
limit by setting PF_NPROC_EXCEEDED process flag. With the change only
this process would fail on execve(), and other processes' execve()
behaviour is not changed.

Solar Designer suggested to re-check whether NPROC limit is still
exceeded at the moment of execve(). If the process was sleeping for
days between set*uid() and execve(), and the NPROC counter step down
under the limit, the defered execve() failure because NPROC limit was
exceeded days ago would be unexpected. If the limit is not exceeded
anymore, we clear the flag on successful calls to execve() and fork().

The flag is also cleared on successful calls to set_user() as the limit
was exceeded for the previous user, not the current one.

Similar check was introduced in -ow patches (without the process flag).

v3 - clear PF_NPROC_EXCEEDED on successful calls to set_user().

Reviewed-by: James Morris
Signed-off-by: Vasiliy Kulikov
Acked-by: NeilBrown
Signed-off-by: Linus Torvalds

Vasiliy Kulikov
2011-08-12 02:24:42 +0800