Eric Lee / smarc-ti-linux-kernel | Embedian Git Server

24 Sep, 2013

1 commit

f36f8c75a KEYS: Add per-user_namespace registers for persistent per-UID kerberos caches

Add support for per-user_namespace registers of persistent per-UID kerberos
caches held within the kernel.

This allows the kerberos cache to be retained beyond the life of all a user's
processes so that the user's cron jobs can work.

The kerberos cache is envisioned as a keyring/key tree looking something like:

	struct user_namespace
	  \___ .krb_cache keyring		- The register
		\___ _krb.0 keyring		- Root's Kerberos cache
		\___ _krb.5000 keyring		- User 5000's Kerberos cache
		\___ _krb.5001 keyring		- User 5001's Kerberos cache
			\___ tkt785 big_key	- A ccache blob
			\___ tkt12345 big_key	- Another ccache blob

Or possibly:

	struct user_namespace
	  \___ .krb_cache keyring		- The register
		\___ _krb.0 keyring		- Root's Kerberos cache
		\___ _krb.5000 keyring		- User 5000's Kerberos cache
		\___ _krb.5001 keyring		- User 5001's Kerberos cache
			\___ tkt785 keyring	- A ccache
				\___ krbtgt/REDHAT.COM@REDHAT.COM big_key
				\___ http/REDHAT.COM@REDHAT.COM user
				\___ afs/REDHAT.COM@REDHAT.COM user
				\___ nfs/REDHAT.COM@REDHAT.COM user
				\___ krbtgt/KERNEL.ORG@KERNEL.ORG big_key
				\___ http/KERNEL.ORG@KERNEL.ORG big_key

What goes into a particular Kerberos cache is entirely up to userspace. Kernel
support is limited to giving you the Kerberos cache keyring that you want.

The user asks for their Kerberos cache by:

krb_cache = keyctl_get_krbcache(uid, dest_keyring);

The uid is -1 or the user's own UID for the user's own cache or the uid of some
other user's cache (requires CAP_SETUID). This permits rpc.gssd or whatever to
mess with the cache.

The cache returned is a keyring named "_krb.<uid>" that the possessor can read,
search, clear, invalidate, unlink from and add links to. Active LSMs get a
chance to rule on whether the caller is permitted to make a link.

Each uid's cache keyring is created when it is first accessed and is given a
timeout that is extended each time this function is called so that the keyring
goes away after a while. The timeout is configurable by sysctl but defaults to
three days.

Each user_namespace struct gets a lazily-created keyring that serves as the
register. The cache keyrings are added to it. This means that standard key
search and garbage collection facilities are available.

The user_namespace struct's register goes away when it does and anything left
in it is then automatically gc'd.
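
The create-on-first-access and timeout-extension behaviour described above can
be modelled in plain userspace C. This is only a sketch of the idea, not the
kernel code: struct cache_register, struct cache_entry, get_cache() and the
fixed-size slot table are all hypothetical names invented for illustration,
and the expiry sweep merely stands in for the real key garbage collector.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define MAX_CACHES	16
#define CACHE_TIMEOUT	(3 * 24 * 60 * 60)	/* three days, in seconds */

/* Hypothetical stand-in for one "_krb.<uid>" keyring. */
struct cache_entry {
	unsigned int	uid;
	time_t		expiry;
	int		in_use;
};

/* Hypothetical stand-in for the per-user_namespace register. */
struct cache_register {
	struct cache_entry slots[MAX_CACHES];
};

/*
 * Look up the cache for uid, creating it on first access.  Every
 * successful call pushes the expiry time out again, so a cache only
 * goes away once nobody has asked for it for CACHE_TIMEOUT seconds.
 */
static struct cache_entry *get_cache(struct cache_register *reg,
				     unsigned int uid, time_t now)
{
	struct cache_entry *free_slot = NULL;
	size_t i;

	assert(reg != NULL);

	for (i = 0; i < MAX_CACHES; i++) {
		struct cache_entry *e = &reg->slots[i];

		if (e->in_use && e->expiry <= now)
			e->in_use = 0;	/* timed out: models garbage collection */
		if (e->in_use && e->uid == uid) {
			e->expiry = now + CACHE_TIMEOUT;
			return e;
		}
		if (!e->in_use && !free_slot)
			free_slot = e;
	}

	if (!free_slot)
		return NULL;	/* register full */

	free_slot->uid = uid;
	free_slot->expiry = now + CACHE_TIMEOUT;
	free_slot->in_use = 1;
	return free_slot;
}
```

Passing the current time in as a parameter keeps the model deterministic; the
kernel would of course read its own clock.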

Signed-off-by: David Howells
Tested-by: Simo Sorce
cc: Serge E. Hallyn
cc: Eric W. Biederman

David Howells
2013-09-24 17:35:19 +0800

08 May, 2013

1 commit

a27bb332c aio: don't include aio.h in sched.h

Faster kernel compiles by way of fewer unnecessary includes.

[akpm@linux-foundation.org: fix fallout]
[akpm@linux-foundation.org: fix build]
Signed-off-by: Kent Overstreet
Cc: Zach Brown
Cc: Felipe Balbi
Cc: Greg Kroah-Hartman
Cc: Mark Fasheh
Cc: Joel Becker
Cc: Rusty Russell
Cc: Jens Axboe
Cc: Asai Thambi S P
Cc: Selvan Mani
Cc: Sam Bradshaw
Cc: Jeff Moyer
Cc: Al Viro
Cc: Benjamin LaHaise
Reviewed-by: "Theodore Ts'o"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Kent Overstreet
2013-05-08 11:16:25 +0800

17 Dec, 2012

1 commit

2a74dbb9a Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security

Pull security subsystem updates from James Morris:
"A quiet cycle for the security subsystem with just a few maintenance
updates."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
Smack: create a sysfs mount point for smackfs
Smack: use select not depends in Kconfig
Yama: remove locking from delete path
Yama: add RCU to drop read locking
drivers/char/tpm: remove tasklet and cleanup
KEYS: Use keyring_alloc() to create special keyrings
KEYS: Reduce initial permissions on keys
KEYS: Make the session and process keyrings per-thread
seccomp: Make syscall skipping and nr changes more consistent
key: Fix resource leak
keys: Fix unreachable code
KEYS: Add payload preparsing opportunity prior to key instantiate or update

Linus Torvalds
2012-12-17 07:40:50 +0800

15 Oct, 2012

1 commit

d25282d1c Merge branch 'modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull module signing support from Rusty Russell:
"module signing is the highlight, but it's an all-over David Howells frenzy..."

Hmm "Magrathea: Glacier signing key". Somebody has been reading too much HHGTTG.

* 'modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (37 commits)
X.509: Fix indefinite length element skip error handling
X.509: Convert some printk calls to pr_devel
asymmetric keys: fix printk format warning
MODSIGN: Fix 32-bit overflow in X.509 certificate validity date checking
MODSIGN: Make mrproper should remove generated files.
MODSIGN: Use utf8 strings in signer's name in autogenerated X.509 certs
MODSIGN: Use the same digest for the autogen key sig as for the module sig
MODSIGN: Sign modules during the build process
MODSIGN: Provide a script for generating a key ID from an X.509 cert
MODSIGN: Implement module signature checking
MODSIGN: Provide module signing public keys to the kernel
MODSIGN: Automatically generate module signing keys if missing
MODSIGN: Provide Kconfig options
MODSIGN: Provide gitignore and make clean rules for extra files
MODSIGN: Add FIPS policy
module: signature checking hook
X.509: Add a crypto key parser for binary (DER) X.509 certificates
MPILIB: Provide a function to read raw data into an MPI
X.509: Add an ASN.1 decoder
X.509: Add simple ASN.1 grammar compiler
...

Linus Torvalds
2012-10-15 04:39:34 +0800

08 Oct, 2012

1 commit

cf7f601c0 KEYS: Add payload preparsing opportunity prior to key instantiate or update

Give the key type the opportunity to preparse the payload prior to the
instantiation and update routines being called. This is done with the
provision of two new key type operations:

int (*preparse)(struct key_preparsed_payload *prep);
void (*free_preparse)(struct key_preparsed_payload *prep);

If the first operation is present, then it is called before key creation (in
the add/update case) or before the key semaphore is taken (in the update and
instantiate cases). The second operation is called to clean up if the first
was called.

preparse() is given the opportunity to fill in the following structure:

	struct key_preparsed_payload {
		char		*description;
		void		*type_data[2];
		void		*payload;
		const void	*data;
		size_t		datalen;
		size_t		quotalen;
	};

Before the preparser is called, the first three fields will have been cleared,
the payload pointer and size will be stored in data and datalen and the default
quota size from the key_type struct will be stored into quotalen.

The preparser may parse the payload in any way it likes and may store data in
the type_data[] and payload fields for use by the instantiate() and update()
ops.

The preparser may also propose a description for the key by attaching it as a
string to the description field. This can be used by passing a NULL or ""
description to the add_key() system call or the key_create_or_update()
function. This cannot work with request_key() as that requires the description
to tell the upcall about the key to be created.

This, for example, permits keys that store PGP public keys to generate their own
name from the user ID and public key fingerprint in the key.

The instantiate() and update() operations are then modified to look like this:

int (*instantiate)(struct key *key, struct key_preparsed_payload *prep);
int (*update)(struct key *key, struct key_preparsed_payload *prep);

and the new payload data is passed in *prep, whether or not it was preparsed.
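
To make the calling convention concrete, here is a userspace mock of a
preparser. It is a sketch under assumptions, not kernel code: the structure is
copied from the text above, but demo_preparse(), demo_free_preparse(),
demo_run_preparse() and the "name:secret" payload format are hypothetical and
exist only to show a key type proposing its own description.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Userspace copy of the structure quoted in the commit message. */
struct key_preparsed_payload {
	char		*description;
	void		*type_data[2];
	void		*payload;
	const void	*data;
	size_t		datalen;
	size_t		quotalen;
};

/*
 * Hypothetical preparser for payloads of the form "name:secret".
 * It proposes "name" as the key description and stores a copy of
 * "secret" in ->payload.  Returns 0 on success and -1 on failure
 * (the kernel would return a negative errno instead).
 */
static int demo_preparse(struct key_preparsed_payload *prep)
{
	const char *data = prep->data;
	const char *colon;
	size_t namelen, seclen;

	if (!data || prep->datalen == 0)
		return -1;
	colon = memchr(data, ':', prep->datalen);
	if (!colon)
		return -1;

	namelen = (size_t)(colon - data);
	seclen = prep->datalen - namelen - 1;

	prep->description = malloc(namelen + 1);
	prep->payload = malloc(seclen + 1);
	if (!prep->description || !prep->payload)
		return -1;	/* caller cleans up via demo_free_preparse() */

	memcpy(prep->description, data, namelen);
	prep->description[namelen] = '\0';
	memcpy(prep->payload, colon + 1, seclen);
	((char *)prep->payload)[seclen] = '\0';
	prep->quotalen = seclen;
	return 0;
}

/* Called to clean up whenever demo_preparse() was called. */
static void demo_free_preparse(struct key_preparsed_payload *prep)
{
	free(prep->description);
	free(prep->payload);
	prep->description = NULL;
	prep->payload = NULL;
}

/*
 * Mimics the caller's side: clear the first three fields, store the
 * raw payload in data/datalen and the default quota in quotalen,
 * then invoke the preparser.
 */
static int demo_run_preparse(struct key_preparsed_payload *prep,
			     const void *data, size_t datalen,
			     size_t default_quota)
{
	assert(prep != NULL);
	memset(prep, 0, sizeof(*prep));
	prep->data = data;
	prep->datalen = datalen;
	prep->quotalen = default_quota;
	return demo_preparse(prep);
}
```

An instantiate() op written against this mock would then take ownership of
prep->payload rather than re-parsing prep->data.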

Signed-off-by: David Howells
Signed-off-by: Rusty Russell

David Howells
2012-10-08 11:19:48 +0800

03 Oct, 2012

3 commits

4442d7704 Merge branch 'modsign-keys-devel' into security-next-keys

Signed-off-by: David Howells

David Howells
2012-10-03 02:30:19 +0800
3a50597de KEYS: Make the session and process keyrings per-thread

Make the session keyring per-thread rather than per-process, but still
inherited from the parent thread to solve a problem with PAM and gdm.

The problem is that join_session_keyring() will reject attempts to change the
session keyring of a multithreaded program but gdm is now multithreaded before
it gets to the point of starting PAM and running pam_keyinit to create the
session keyring. See:

https://bugs.freedesktop.org/show_bug.cgi?id=49211

The reason that join_session_keyring() will only change the session keyring
under a single-threaded environment is that it's hard to alter the other
thread's credentials to effect the change in a multi-threaded program. The
problems are such as:

(1) How to prevent two threads both running join_session_keyring() from
racing.

(2) Another thread's credentials may not be modified directly by this process.

(3) The number of threads is uncertain whilst we're not holding the
appropriate spinlock, making preallocation slightly tricky.

(4) We could use TIF_NOTIFY_RESUME and key_replace_session_keyring() to get
another thread to replace its keyring, but that means preallocating for
each thread.

A reasonable way around this is to make the session keyring per-thread rather
than per-process and just document that if you want a common session keyring,
you must get it before you spawn any threads - which is the current situation
anyway.

Whilst we're at it, we can make the process keyring behave in the same way. This
means we can clean up some of the ickyness in the creds code.

Basically, after this patch, the session, process and thread keyrings are about
inheritance rules only and not about sharing changes of keyring.

Reported-by: Mantas M.
Signed-off-by: David Howells
Tested-by: Ray Strode

David Howells
2012-10-03 02:24:29 +0800
437589a74 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace

Pull user namespace changes from Eric Biederman:
"This is a mostly modest set of changes to enable basic user namespace
support. This allows the code to compile with user namespaces
enabled and removes the assumption there is only the initial user
namespace. Everything is converted except for the most complex of the
filesystems: autofs4, 9p, afs, ceph, cifs, coda, fuse, gfs2, ncpfs,
nfs, ocfs2 and xfs as those patches need a bit more review.

The strategy is to push kuid_t and kgid_t values as far down into
subsystems and filesystems as reasonable. Leaving the make_kuid and
from_kuid operations to happen at the edge of userspace, as the values
come off the disk, and as the values come in from the network.
Letting type-incompatibility compile errors (present when user
namespaces are enabled) guide me to find the issues.

The most tricky areas have been the places where we had an implicit
union of uid and gid values and were storing them in an unsigned int.
Those places were converted into explicit unions. I made certain to
handle those places with simple trivial patches.

Out of that work I discovered we have generic interfaces for storing
quota by projid. I had never heard of the project identifiers before.
Adding full user namespace support for project identifiers accounts
for most of the code size growth in my git tree.

Ultimately there will be work to relax privilege checks from
"capable(FOO)" to "ns_capable(user_ns, FOO)" where it is safe allowing
root in a user namespace to do those things that today we only forbid to
non-root users because it will confuse suid root applications.

While I was pushing kuid_t and kgid_t changes deep into the audit code
I made a few other cleanups. I capitalized on the fact we process
netlink messages in the context of the message sender. I removed
usage of NETLINK_CRED, and started directly using current->tty.

Some of these patches have also made it into maintainer trees, with no
problems from identical code from different trees showing up in
linux-next.

After reading through all of this code I feel like I might be able to
win a game of kernel trivial pursuit."

Fix up some fairly trivial conflicts in netfilter uid/gid logging code.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (107 commits)
userns: Convert the ufs filesystem to use kuid/kgid where appropriate
userns: Convert the udf filesystem to use kuid/kgid where appropriate
userns: Convert ubifs to use kuid/kgid
userns: Convert squashfs to use kuid/kgid where appropriate
userns: Convert reiserfs to use kuid and kgid where appropriate
userns: Convert jfs to use kuid/kgid where appropriate
userns: Convert jffs2 to use kuid and kgid where appropriate
userns: Convert hpfs to use kuid and kgid where appropriate
userns: Convert btrfs to use kuid/kgid where appropriate
userns: Convert bfs to use kuid/kgid where appropriate
userns: Convert affs to use kuid/kgid wherwe appropriate
userns: On alpha modify linux_to_osf_stat to use convert from kuids and kgids
userns: On ia64 deal with current_uid and current_gid being kuid and kgid
userns: On ppc convert current_uid from a kuid before printing.
userns: Convert s390 getting uid and gid system calls to use kuid and kgid
userns: Convert s390 hypfs to use kuid and kgid where appropriate
userns: Convert binder ipc to use kuids
userns: Teach security_path_chown to take kuids and kgids
userns: Add user namespace support to IMA
userns: Convert EVM to deal with kuids and kgids in it's hmac computation
...

Linus Torvalds
2012-10-03 02:11:09 +0800

28 Sep, 2012

1 commit

a84a92197 key: Fix resource leak

On an error, iov may still have been reallocated and need freeing.

Signed-off-by: Alan Cox
Signed-off-by: David Howells

Alan Cox
2012-09-28 19:20:02 +0800

14 Sep, 2012

1 commit

9a56c2db4 userns: Convert security/keys to the new userns infrastructure

- Replace key_user ->user_ns equality checks with kuid_has_mapping checks.
- Use from_kuid to generate key descriptions
- Use kuid_t and kgid_t and the associated helpers instead of uid_t and gid_t
- Avoid potential problems with file descriptor passing by displaying
keys in the user namespace of the opener of key status proc files.

Cc: linux-security-module@vger.kernel.org
Cc: keyrings@linux-nfs.org
Cc: David Howells
Signed-off-by: Eric W. Biederman

Eric W. Biederman
2012-09-14 09:28:02 +0800

13 Sep, 2012

2 commits

b3f68f16d task_work: Revert "hold task_lock around checks in keyctl"

This reverts commit d35abdb28824cf74f0a106a0f9c6f3ff700a35bf.

task_lock() was added to ensure exit_mm() and thus exit_task_work() is
not possible before task_work_add().

This is wrong, task_lock() must not be nested with write_lock(tasklist).
And this is no longer needed, task_work_add() now fails if it is called
after exit_task_work().

Reported-by: Dave Jones
Signed-off-by: Oleg Nesterov
Signed-off-by: Peter Zijlstra
Cc: Al Viro
Cc: Linus Torvalds
Cc: Andrew Morton
Link: http://lkml.kernel.org/r/20120826191214.GA4231@redhat.com
Signed-off-by: Ingo Molnar

Oleg Nesterov
2012-09-13 22:47:36 +0800
d4f65b5d2 KEYS: Add payload preparsing opportunity prior to key instantiate or update

Give the key type the opportunity to preparse the payload prior to the
instantiation and update routines being called. This is done with the
provision of two new key type operations:

int (*preparse)(struct key_preparsed_payload *prep);
void (*free_preparse)(struct key_preparsed_payload *prep);

If the first operation is present, then it is called before key creation (in
the add/update case) or before the key semaphore is taken (in the update and
instantiate cases). The second operation is called to clean up if the first
was called.

preparse() is given the opportunity to fill in the following structure:

	struct key_preparsed_payload {
		char		*description;
		void		*type_data[2];
		void		*payload;
		const void	*data;
		size_t		datalen;
		size_t		quotalen;
	};

Before the preparser is called, the first three fields will have been cleared,
the payload pointer and size will be stored in data and datalen and the default
quota size from the key_type struct will be stored into quotalen.

The preparser may parse the payload in any way it likes and may store data in
the type_data[] and payload fields for use by the instantiate() and update()
ops.

The preparser may also propose a description for the key by attaching it as a
string to the description field. This can be used by passing a NULL or ""
description to the add_key() system call or the key_create_or_update()
function. This cannot work with request_key() as that requires the description
to tell the upcall about the key to be created.

This, for example, permits keys that store PGP public keys to generate their own
name from the user ID and public key fingerprint in the key.

The instantiate() and update() operations are then modified to look like this:

int (*instantiate)(struct key *key, struct key_preparsed_payload *prep);
int (*update)(struct key *key, struct key_preparsed_payload *prep);

and the new payload data is passed in *prep, whether or not it was preparsed.

Signed-off-by: David Howells

David Howells
2012-09-13 20:06:29 +0800

24 Jul, 2012

1 commit

e05644e17 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security

Pull security subsystem updates from James Morris:
"Nothing groundbreaking for this kernel, just cleanups and fixes, and a
couple of Smack enhancements."

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (21 commits)
Smack: Maintainer Record
Smack: don't show empty rules when /smack/load or /smack/load2 is read
Smack: user access check bounds
Smack: onlycap limits on CAP_MAC_ADMIN
Smack: fix smack_new_inode bogosities
ima: audit is compiled only when enabled
ima: ima_initialized is set only if successful
ima: add policy for pseudo fs
ima: remove unused cleanup functions
ima: free securityfs violations file
ima: use full pathnames in measurement list
security: Fix nommu build.
samples: seccomp: add .gitignore for untracked executables
tpm: check the chip reference before using it
TPM: fix memleak when register hardware fails
TPM: chip disabled state erronously being reported as error
MAINTAINERS: TPM maintainers' contacts update
Merge branches 'next-queue' and 'next' into next
Remove unused code from MPI library
Revert "crypto: GnuPG based MPI lib - additional sources (part 4)"
...

Linus Torvalds
2012-07-24 09:49:06 +0800

23 Jul, 2012

3 commits

d35abdb28 hold task_lock around checks in keyctl

Signed-off-by: Al Viro

Al Viro
2012-07-23 03:58:01 +0800
67d121455 merge task_work and rcu_head, get rid of separate allocation for keyring case

task_work and rcu_head are identical now; merge them (calling the result
struct callback_head, rcu_head #define'd to it), kill separate allocation
in security/keys since we can just use cred->rcu now.

Signed-off-by: Al Viro

Al Viro
2012-07-23 03:57:56 +0800
41f9d29f0 trimming task_work: kill ->data

get rid of the only user of ->data; this is _not_ the final variant - in the
end we'll have task_work and rcu_head identical and just use cred->rcu,
at which point the separate allocation will be gone completely.

Signed-off-by: Al Viro

Al Viro
2012-07-23 03:57:54 +0800

10 Jun, 2012

1 commit

66dd07b88 Merge commit 'v3.5-rc2' into next

James Morris
2012-06-10 20:52:10 +0800

01 Jun, 2012

3 commits

fb21affa4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal

Pull second pile of signal handling patches from Al Viro:
"This one is just task_work_add() series + remaining prereqs for it.

There probably will be another pull request from that tree this
cycle - at least for helpers, to get them out of the way for per-arch
fixes remaining in the tree."

Fix trivial conflict in kernel/irq/manage.c: the merge of Andrew's pile
had brought in commit 97fd75b7b8e0 ("kernel/irq/manage.c: use the
pr_foo() infrastructure to prefix printks") which changed one of the
pr_err() calls that this merge moves around.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
keys: kill task_struct->replacement_session_keyring
keys: kill the dummy key_replace_session_keyring()
keys: change keyctl_session_to_parent() to use task_work_add()
genirq: reimplement exit_irq_thread() hook via task_work_add()
task_work_add: generic process-context callbacks
avr32: missed _TIF_NOTIFY_RESUME on one of do_notify_resume callers
parisc: need to check NOTIFY_RESUME when exiting from syscall
move key_repace_session_keyring() into tracehook_notify_resume()
TIF_NOTIFY_RESUME is defined on all targets now

Linus Torvalds
2012-06-01 09:47:30 +0800
ac34ebb3a aio/vfs: cleanup of rw_copy_check_uvector() and compat_rw_copy_check_uvector()

A cleanup of rw_copy_check_uvector and compat_rw_copy_check_uvector after
changes made to support CMA in an earlier patch.

Rather than having an additional check_access parameter to these
functions, the first parameter type is overloaded to allow the caller to
specify CHECK_IOVEC_ONLY which means check that the contents of the iovec
are valid, but do not check the memory that they point to. This is used
by process_vm_readv/writev where we need to validate that an iovec passed
to the syscall is valid but do not want to check the memory that it points
to at this point because it refers to an address space in another process.

Signed-off-by: Chris Yeoh
Reviewed-by: Oleg Nesterov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christopher Yeoh
2012-06-01 08:49:32 +0800
4f1c28d24 security/keys/keyctl.c: suppress memory allocation failure warning

This allocation may be large. The code is probing to see if it will
succeed and if not, it falls back to vmalloc(). We should suppress any
page-allocation failure messages when the fallback happens.

Reported-by: Dave Jones
Acked-by: David Howells
Cc: James Morris
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew Morton
2012-06-01 08:49:26 +0800

25 May, 2012

1 commit

423b97880 KEYS: Fix some sparse warnings

Fix some sparse warnings in the keyrings code:

(1) compat_keyctl_instantiate_key_iov() should be static.

(2) There were a couple of places where a pointer was being compared against
integer 0 rather than NULL.

(3) keyctl_instantiate_key_common() should not take a __user-labelled iovec
pointer as the caller must have copied the iovec to kernel space.

(4) __key_link_begin() takes and __key_link_end() releases
keyring_serialise_link_sem under some circumstances and so this should be
declared.

Note that adding __acquires() and __releases() for this doesn't help cure
the warning messages; only commenting out both helps.

Signed-off-by: David Howells
Signed-off-by: James Morris

David Howells
2012-05-25 18:51:42 +0800

24 May, 2012

2 commits

413cd3d9a keys: change keyctl_session_to_parent() to use task_work_add()

Change keyctl_session_to_parent() to use task_work_add() and move
key_replace_session_keyring() logic into task_work->func().

Note that we do task_work_cancel() before task_work_add() to ensure that
only one work can be pending at any time. This is important, we must not
allow user-space to abuse the parent's ->task_works list.

The callback, replace_session_keyring(), checks PF_EXITING. I guess this
is not really needed but looks better.

As a side effect, this fixes the (unlikely) race. The callers of
key_replace_session_keyring() and keyctl_session_to_parent() lack the
necessary barriers, the parent can miss the request.

Now we can remove task_struct->replacement_session_keyring and related
code.

Signed-off-by: Oleg Nesterov
Acked-by: David Howells
Cc: Thomas Gleixner
Cc: Richard Kuo
Cc: Linus Torvalds
Cc: Alexander Gordeev
Cc: Chris Zankel
Cc: David Smith
Cc: "Frank Ch. Eigler"
Cc: Geert Uytterhoeven
Cc: Larry Woodman
Cc: Peter Zijlstra
Cc: Tejun Heo
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Al Viro

Oleg Nesterov
2012-05-24 10:11:23 +0800
1227dd773 TIF_NOTIFY_RESUME is defined on all targets now

Signed-off-by: Al Viro

Al Viro
2012-05-24 10:09:19 +0800

11 May, 2012

1 commit

fd75815f7 KEYS: Add invalidation support

Add support for invalidating a key - which renders it immediately invisible to
further searches and causes the garbage collector to immediately wake up,
remove it from keyrings and then destroy it when it's no longer referenced.

It's better not to do this with keyctl_revoke() as that marks the key to start
returning -EKEYREVOKED to searches when what is actually desired is to have the
key refetched.

To invalidate a key the caller must be granted SEARCH permission by the key.
This may be too strict. It may be better to also permit invalidation if the
caller has any of READ, WRITE or SETATTR permission.

The primary use for this is to evict keys that are cached in special keyrings,
such as the DNS resolver or an ID mapper.

Signed-off-by: David Howells

David Howells
2012-05-11 17:56:56 +0800

23 Mar, 2012

1 commit

f63d395d4 Merge tag 'nfs-for-3.4-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client updates for Linux 3.4 from Trond Myklebust:
"New features include:
- Add NFS client support for containers.

This should enable most of the necessary functionality, including
lockd support, and support for rpc.statd, NFSv4 idmapper and
RPCSEC_GSS upcalls into the correct network namespace from which
the mount system call was issued.

- NFSv4 idmapper scalability improvements

Base the idmapper cache on the keyring interface to allow
concurrent access to idmapper entries. Start the process of
migrating users from the single-threaded daemon-based approach to
the multi-threaded request-key based approach.

- NFSv4.1 implementation id.

Allows the NFSv4.1 client and server to mutually identify each
other for logging and debugging purposes.

- Support the 'vers=4.1' mount option for mounting NFSv4.1 instead of
having to use the more counterintuitive 'vers=4,minorversion=1'.

- SUNRPC tracepoints.

Start the process of adding tracepoints in order to improve
debugging of the RPC layer.

- pNFS object layout support for autologin.

Important bugfixes include:

- Fix a bug in rpc_wake_up/rpc_wake_up_status that caused them to
fail to wake up all tasks when applied to priority waitqueues.

- Ensure that we handle read delegations correctly, when we try to
truncate a file.

- A number of fixes for NFSv4 state manager loops (mostly to do with
delegation recovery)."

* tag 'nfs-for-3.4-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (224 commits)
NFS: fix sb->s_id in nfs debug prints
xprtrdma: Remove assumption that each segment is ls_state in release_lockowner
NFS: ncommit count is being double decremented
SUNRPC: We must not use list_for_each_entry_safe() in rpc_wake_up()
Try using machine credentials for RENEW calls
NFSv4.1: Fix a few issues in filelayout_commit_pagelist
NFSv4.1: Clean ups and bugfixes for the pNFS read/writeback/commit code
...

Linus Torvalds
2012-03-23 23:53:47 +0800

02 Mar, 2012

1 commit

59e6b9c11 Created a function for setting timeouts on keys

The keyctl_set_timeout function isn't exported to other parts of the
kernel, but I want to use it for the NFS idmapper. I already have the
key, but I wanted a generic way to set the timeout.

Signed-off-by: Bryan Schumaker
Acked-by: David Howells
Signed-off-by: Trond Myklebust

Bryan Schumaker
2012-03-02 05:50:31 +0800

19 Jan, 2012

1 commit

700920eb5 KEYS: Allow special keyrings to be cleared

The kernel contains some special internal keyrings, for instance the DNS
resolver keyring :

2a93faf1 I----- 1 perm 1f030000 0 0 keyring .dns_resolver: empty

It would occasionally be useful to allow the contents of such keyrings to be
flushed by root (cache invalidation).

Allow a flag to be set on a keyring to mark that someone possessing the
sysadmin capability can clear the keyring, even without normal write access to
the keyring.

Set this flag on the special keyrings created by the DNS resolver, the NFS
identity mapper and the CIFS identity mapper.

Signed-off-by: David Howells
Acked-by: Jeff Layton
Acked-by: Steve Dickson
Signed-off-by: James Morris

David Howells
2012-01-19 11:38:51 +0800

01 Nov, 2011

1 commit

fcf634098 Cross Memory Attach

The basic idea behind cross memory attach is to allow MPI programs doing
intra-node communication to do a single copy of the message rather than a
double copy of the message via shared memory.

The following patch attempts to achieve this by allowing a destination
process, given an address and size from a source process, to copy memory
directly from the source process into its own address space via a system
call. There is also a symmetrical ability to copy from the current
process's address space into a destination process's address space.
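
As a rough illustration of the single-copy read described above, the sketch
below wraps the process_vm_readv system call. It is an assumption-laden
example rather than part of the patch: copy_from_process() is a hypothetical
helper name, the raw syscall() form is used so that the sketch does not depend
on a libc wrapper being present, and the call may fail (returning -1) where
the syscall is unavailable or the caller lacks permission to access the target
process.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Copy len bytes from address src in process pid directly into dst in
 * the calling process.  One local and one remote iovec are used here;
 * the interface accepts arrays of both.  Returns the number of bytes
 * copied, or -1 with errno set.
 */
static ssize_t copy_from_process(pid_t pid, void *dst, const void *src,
				 size_t len)
{
	struct iovec local = { .iov_base = dst, .iov_len = len };
	struct iovec remote = { .iov_base = (void *)src, .iov_len = len };

	assert(dst != NULL && src != NULL);

	/* Arguments: pid, local iov, count, remote iov, count, flags (0). */
	return syscall(SYS_process_vm_readv, pid, &local, 1UL,
		       &remote, 1UL, 0UL);
}
```

In an MPI-style exchange the sender would pass (address, size) to the receiver
over an existing channel, and the receiver would then call something like
copy_from_process() with the sender's pid.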

- Use of /proc/pid/mem has been considered, but there are issues with
using it:
- Does not allow for specifying iovecs for both src and dest, assuming
preadv or pwritev was implemented either the area read from or
written to would need to be contiguous.
- Currently mem_read allows only processes who are currently
ptrace'ing the target and are still able to ptrace the target to read
from the target. This check could possibly be moved to the open call,
but it's not clear exactly what race this restriction is stopping
(reason appears to have been lost)
- Having to send the fd of /proc/self/mem via SCM_RIGHTS on unix
domain socket is a bit ugly from a userspace point of view,
especially when you may have hundreds if not (eventually) thousands
of processes that all need to do this with each other
- Doesn't allow for some future use of the interface we would like to
consider adding in the future (see below)
- Interestingly reading from /proc/pid/mem currently actually
involves two copies! (But this could be fixed pretty easily)

As mentioned previously use of vmsplice instead was considered, but has
problems. Since you need the reader and writer working co-operatively if
the pipe is not drained then you block. Which requires some wrapping to
do non blocking on the send side or polling on the receive. In all to all
communication it requires ordering otherwise you can deadlock. And in the
example of many MPI tasks writing to one MPI task vmsplice serialises the
copying.

There are some cases of MPI collectives where even a single copy interface
does not get us the performance gain we could. For example in an
MPI_Reduce rather than copy the data from the source we would like to
instead use it directly in a mathops (say the reduce is doing a sum) as
this would save us doing a copy. We don't need to keep a copy of the data
from the source. I haven't implemented this, but I think this interface
could in the future do all this through the use of the flags - eg could
specify the math operation and type and the kernel rather than just
copying the data would apply the specified operation between the source
and destination and store it in the destination.

Although we don't have a "second user" of the interface (though I've had
some nibbles from people who may be interested in using it for intra
process messaging which is not MPI), this interface is something which
hardware vendors are already doing for their custom drivers to implement
fast local communication. And so in addition to this being useful for
OpenMPI it would mean the driver maintainers don't have to fix things up
when the mm changes.

There was some discussion about how much faster a true zero copy would
go. Here's a link back to the email with some testing I did on that:

http://marc.info/?l=linux-mm&m=130105930902915&w=2

There is a basic man page for the proposed interface here:

http://ozlabs.org/~cyeoh/cma/process_vm_readv.txt

This has been implemented for x86 and powerpc; other architectures should
mainly (I think) just need to add syscall numbers for process_vm_readv
and process_vm_writev. There are 32-bit compatibility versions for
64-bit kernels.
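
For readers who want to see the shape of a call, here is a minimal sketch using the glibc wrapper for process_vm_readv as it exists today (glibc 2.15 and later); it may differ in detail from the draft man page linked above. The helper name read_remote is hypothetical.

```c
#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Read len bytes starting at remote_addr in process pid into local_buf
 * with a single copy. Returns the number of bytes read, or -1 with
 * errno set (for example EPERM if the caller may not access pid).
 */
ssize_t read_remote(pid_t pid, void *local_buf, void *remote_addr, size_t len)
{
    struct iovec local = { .iov_base = local_buf, .iov_len = len };
    struct iovec remote = { .iov_base = remote_addr, .iov_len = len };

    /* One local and one remote iovec; flags is currently unused and 0. */
    return process_vm_readv(pid, &local, 1, &remote, 1, 0);
}
```

Calling read_remote(getpid(), dst, src, len) on the caller's own address space is a quick way to check that the syscall is wired up; whether the call is permitted at all depends on the usual ptrace-style access checks and on any sandboxing in effect.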

For arch maintainers there are some simple tests to be able to quickly
verify that the syscalls are working correctly here:

http://ozlabs.org/~cyeoh/cma/cma-test-20110718.tgz

Signed-off-by: Chris Yeoh
Cc: Ingo Molnar
Cc: "H. Peter Anvin"
Cc: Thomas Gleixner
Cc: Arnd Bergmann
Cc: Paul Mackerras
Cc: Benjamin Herrenschmidt
Cc: David Howells
Cc: James Morris
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christopher Yeoh
2011-11-01 08:30:44 +0800

17 Mar, 2011

1 commit

4aab1e896 KEYS: Make request_key() and co. return an error for a negative key

Make request_key() and co. return an error for a negative or rejected key. If
the key was simply negated, then return ENOKEY, otherwise return the error
with which it was rejected.

Without this patch, the following command returns a key number (with the latest
keyutils):

[root@andromeda ~]# keyctl request2 user debug:foo rejected @s
586569904

Trying to print the key merely gets you a permission denied error:

[root@andromeda ~]# keyctl print 586569904
keyctl_read_alloc: Permission denied

Doing another request_key() call does get you the error, as long as it hasn't
expired yet:

[root@andromeda ~]# keyctl request user debug:foo
request_key: Key was rejected by service

Signed-off-by: David Howells
Signed-off-by: James Morris

David Howells
2011-03-17 08:59:49 +0800

08 Mar, 2011

2 commits

ee009e4a0 KEYS: Add an iovec version of KEYCTL_INSTANTIATE

Add a keyctl op (KEYCTL_INSTANTIATE_IOV) that is like KEYCTL_INSTANTIATE, but
takes an iovec array and concatenates the data in-kernel into one buffer.
Since KEYCTL_INSTANTIATE copies the data anyway, this isn't too much of a
problem.
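
The concatenation step can be pictured with a small userspace sketch. This is only an illustration of gathering an iovec array into one buffer, not the kernel's code, and flatten_iov is a hypothetical name:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Concatenate iovcnt segments into one freshly allocated buffer.
 * On success returns the buffer (caller frees) and stores the total
 * length in *total. Returns NULL on allocation failure or if the
 * summed length would overflow size_t.
 */
void *flatten_iov(const struct iovec *iov, int iovcnt, size_t *total)
{
    size_t len = 0;

    for (int i = 0; i < iovcnt; i++) {
        if (iov[i].iov_len > SIZE_MAX - len)
            return NULL;
        len += iov[i].iov_len;
    }

    /* Allocate at least one byte so an empty payload still succeeds. */
    char *buf = malloc(len ? len : 1);
    if (!buf)
        return NULL;

    size_t off = 0;
    for (int i = 0; i < iovcnt; i++) {
        if (iov[i].iov_len) {
            memcpy(buf + off, iov[i].iov_base, iov[i].iov_len);
            off += iov[i].iov_len;
        }
    }

    *total = len;
    return buf;
}
```

The payload handed to the key type is then the single contiguous buffer, exactly as if the caller had assembled it before calling KEYCTL_INSTANTIATE.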

Signed-off-by: David Howells
Signed-off-by: James Morris

David Howells
2011-03-08 08:17:22 +0800
fdd1b9458 KEYS: Add a new keyctl op to reject a key with a specified error code

Add a new keyctl op to reject a key with a specified error code. This works
much the same as negating a key, and so keyctl_negate_key() is made a special
case of keyctl_reject_key(). The difference is that keyctl_negate_key()
selects ENOKEY as the error to be reported.

Typically the key would be rejected with EKEYEXPIRED, EKEYREVOKED or
EKEYREJECTED, but this is not mandatory.
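
Seen from userspace, the relationship between the two operations might be sketched as follows. The wrapper names are hypothetical (libkeyutils provides keyctl_reject() and keyctl_negate() for real use), and the sketch assumes the kernel headers define KEYCTL_REJECT:

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdint.h>
#include <linux/keyctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Reject key id with the given error code. The key then reports that
 * error to requesters until timeout seconds have passed. ringid, if
 * non-zero, names a keyring to link the key into.
 * Returns 0 on success, -1 with errno set on failure.
 */
long reject_key(int32_t id, unsigned timeout, unsigned error, int32_t ringid)
{
    return syscall(SYS_keyctl, KEYCTL_REJECT, id, timeout, error, ringid);
}

/* Negation expressed as rejection with ENOKEY as the reported error. */
long negate_key(int32_t id, unsigned timeout, int32_t ringid)
{
    return reject_key(id, timeout, ENOKEY, ringid);
}
```

Both calls only succeed for a process that holds the authorisation key for the key being constructed, typically a request-key upcall; anywhere else they fail with an error such as EPERM.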

Signed-off-by: David Howells
Signed-off-by: James Morris

David Howells
2011-03-08 08:17:18 +0800

22 Jan, 2011

2 commits

973c9f4f4 KEYS: Fix up comments in key management code

Fix up comments in the key management code. No functional changes.

Signed-off-by: David Howells
Signed-off-by: Linus Torvalds

David Howells
2011-01-22 06:59:30 +0800
a8b17ed01 KEYS: Do some style cleanup in the key management code.

Do a bit of a style clean up in the key management code. No functional
changes.

Done using:

perl -p -i -e 's!^/[*]*/\n!!' security/keys/*.c
perl -p -i -e 's!} /[*] end [a-z0-9_]*[(][)] [*]/\n!}\n!' security/keys/*.c
sed -i -s -e ": next" -e N -e 's/^\n[}]$/}/' -e t -e P -e 's/^.*\n//' -e "b next" security/keys/*.c

These remove /*****/ lines, remove the comments on the closing brace of a
function that name the function, and remove blank lines before the closing
brace of a function.

Signed-off-by: David Howells
Signed-off-by: Linus Torvalds

David Howells
2011-01-22 06:59:29 +0800

10 Sep, 2010

2 commits

3d96406c7 KEYS: Fix bug in keyctl_session_to_parent() if parent has no session keyring

Fix a bug in keyctl_session_to_parent() whereby it tries to check the ownership
of the parent process's session keyring whether or not the parent has a session
keyring [CVE-2010-2960].

This results in the following oops:

BUG: unable to handle kernel NULL pointer dereference at 00000000000000a0
IP: [] keyctl_session_to_parent+0x251/0x443
...
Call Trace:
[] ? keyctl_session_to_parent+0x67/0x443
[] ? __do_fault+0x24b/0x3d0
[] sys_keyctl+0xb4/0xb8
[] system_call_fastpath+0x16/0x1b

if the parent process has no session keyring.

If the system is using pam_keyinit then it is mostly protected against this, as
all processes derived from a login will have inherited the session keyring
created by pam_keyinit during the login procedure.

To test this, pam_keyinit calls need to be commented out in /etc/pam.d/.

Reported-by: Tavis Ormandy
Signed-off-by: David Howells
Acked-by: Tavis Ormandy
Signed-off-by: Linus Torvalds

David Howells
2010-09-10 22:30:00 +0800
9d1ac65a9 KEYS: Fix RCU no-lock warning in keyctl_session_to_parent()

There's an unprotected access to the parent process's credentials in the middle
of keyctl_session_to_parent(). This results in the following RCU warning:

===================================================
[ INFO: suspicious rcu_dereference_check() usage. ]
---------------------------------------------------
security/keys/keyctl.c:1291 invoked rcu_dereference_check() without protection!

other info that might help us debug this:

rcu_scheduler_active = 1, debug_locks = 0
1 lock held by keyctl-session-/2137:
#0: (tasklist_lock){.+.+..}, at: [] keyctl_session_to_parent+0x60/0x236

stack backtrace:
Pid: 2137, comm: keyctl-session- Not tainted 2.6.36-rc2-cachefs+ #1
Call Trace:
[] lockdep_rcu_dereference+0xaa/0xb3
[] keyctl_session_to_parent+0xed/0x236
[] sys_keyctl+0xb4/0xb6
[] system_call_fastpath+0x16/0x1b

The code should take the RCU read lock to make sure the parent's credentials
don't go away, even though it's holding a spinlock and has IRQs disabled.

Signed-off-by: David Howells
Signed-off-by: Linus Torvalds

David Howells
2010-09-10 22:30:00 +0800

02 Aug, 2010

2 commits

94fd8405e KEYS: Use the variable 'key' in keyctl_describe_key()

keyctl_describe_key() turns the key reference it gets into a usable key pointer
and assigns that to a variable called 'key', which it then ignores in favour of
recomputing the key pointer each time it needs it. Make it use the precomputed
pointer instead.

Without this patch, gcc 4.6 reports that the variable key is set but not used:

building with gcc 4.6 I'm getting a warning message:
CC security/keys/keyctl.o
security/keys/keyctl.c: In function 'keyctl_describe_key':
security/keys/keyctl.c:472:14: warning: variable 'key' set but not used

Reported-by: Justin P. Mattock
Signed-off-by: David Howells
Signed-off-by: James Morris

David Howells
2010-08-02 13:34:56 +0800
9156235b3 KEYS: Authorise keyctl_set_timeout() on a key if we have its authorisation key

Authorise a process to perform keyctl_set_timeout() on an uninstantiated key if
that process has the authorisation key for it.

This allows the instantiator to set the timeout on a key it is instantiating -
provided it does it before instantiating the key.

For instance, the test upcall script provided with the keyutils package could
be modified to set the expiry to an hour hence before instantiating the key:

[/usr/share/keyutils/request-key-debug.sh]
if [ "$3" != "neg" ]
then
+ keyctl timeout $1 3600
keyctl instantiate $1 "Debug $3" $4 || exit 1
else

Signed-off-by: David Howells
Signed-off-by: James Morris

David Howells
2010-08-02 13:34:27 +0800

27 Jun, 2010

1 commit

4303ef19c KEYS: Propagate error code instead of returning -EINVAL

This is from a Smatch check I'm writing.

strncpy_from_user() returns -EFAULT on error so the first change just
silences a warning but doesn't change how the code works.

The other change is a bug fix because install_thread_keyring_to_cred()
can return a variety of errors such as -EINVAL, -EEXIST, -ENOMEM or
-EKEYREVOKED.
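
The pattern behind the second change can be sketched in isolation. This is not the kernel code: install_keyring is a hypothetical stand-in for a helper that can fail with several different error codes, and both caller names are made up for the illustration.

```c
#include <errno.h>

/* Hypothetical helper: fails with -fail_with, or succeeds if it is 0. */
static int install_keyring(int fail_with)
{
    return fail_with ? -fail_with : 0;
}

/* Before: every failure is flattened to -EINVAL. */
int join_keyring_old(int fail_with)
{
    if (install_keyring(fail_with) < 0)
        return -EINVAL;
    return 0;
}

/* After: the callee's error code is passed up unchanged. */
int join_keyring_new(int fail_with)
{
    int ret = install_keyring(fail_with);

    if (ret < 0)
        return ret;
    return 0;
}
```

With the old form a caller cannot tell an out-of-memory condition from a bad argument; with the new form -ENOMEM stays -ENOMEM.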

Signed-off-by: Dan Carpenter
Signed-off-by: David Howells
Signed-off-by: Linus Torvalds

Dan Carpenter
2010-06-27 22:02:34 +0800

28 May, 2010

1 commit

dd98acf74 keyctl_session_to_parent(): use thread_group_empty() to check singlethreadness

No functional changes.

keyctl_session_to_parent() is the only user of signal->count which needs
the correct value. Change it to use thread_group_empty() instead; this
must be strictly equivalent under tasklist, and imho looks better.

Signed-off-by: Oleg Nesterov
Acked-by: David Howells
Cc: Peter Zijlstra
Acked-by: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2010-05-28 00:12:47 +0800

23 Apr, 2010

1 commit

c5b60b5e6 security: whitespace coding style fixes

Whitespace coding style fixes.

Signed-off-by: Justin P. Mattock
Signed-off-by: James Morris

Justin P. Mattock
2010-04-23 08:10:23 +0800