21 May, 2020

1 commit

  • It is almost possible to use the result of prepare_exec_creds with no
    modifications during exec. Update prepare_exec_creds to initialize
    the suid and the fsuid to the euid, and the sgid and the fsgid to the
    egid. This is all that is needed to handle the common case of exec
    when nothing special like a setuid exec is happening.

    That this preserves the existing behavior of exec can be verified
    by examing bprm_fill_uid and cap_bprm_set_creds.

    This change makes it clear that the later parts of exec that
    update bprm->cred are just need to handle special cases such
    as setuid exec and change of domains.

    Link: https://lkml.kernel.org/r/871rng22dm.fsf_-_@x220.int.ebiederm.org
    Acked-by: Linus Torvalds
    Reviewed-by: Kees Cook
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     

25 Mar, 2020

1 commit


15 Jan, 2020

2 commits

  • Merge misc fixes from David Howells.

    Two afs fixes and a key refcounting fix.

    * dhowells:
    afs: Fix afs_lookup() to not clobber the version on a new dentry
    afs: Fix use-after-loss-of-ref
    keys: Fix request_key() cache

    Linus Torvalds
     
  • When the key cached by request_key() and co. is cleaned up on exit(),
    the code looks in the wrong task_struct, and so clears the wrong cache.
    This leads to anomalies in key refcounting when doing, say, a kernel
    build on an afs volume, that then trigger kasan to report a
    use-after-free when the key is viewed in /proc/keys.

    Fix this by making exit_creds() look in the passed-in task_struct rather
    than in current (the task_struct cleanup code is deferred by RCU and
    potentially run in another task).

    Fixes: 7743c48e54ee ("keys: Cache result of request_key*() temporarily in task_struct")
    Signed-off-by: David Howells
    Signed-off-by: Linus Torvalds

    David Howells
     

05 Jan, 2020

1 commit

  • The cred_jar kmem_cache is already memcg accounted in the current kernel
    but cred->security is not. Account cred->security to kmemcg.

    Recently we saw high root slab usage on our production and on further
    inspection, we found a buggy application leaking processes. Though that
    buggy application was contained within its memcg but we observe much
    more system memory overhead, couple of GiBs, during that period. This
    overhead can adversely impact the isolation on the system.

    One source of high overhead we found was cred->security objects, which
    have a lifetime of at least the life of the process which allocated
    them.

    Link: http://lkml.kernel.org/r/20191205223721.40034-1-shakeelb@google.com
    Signed-off-by: Shakeel Butt
    Acked-by: Chris Down
    Reviewed-by: Roman Gushchin
    Acked-by: Michal Hocko
    Cc: Johannes Weiner
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Shakeel Butt
     

25 Jul, 2019

2 commits

  • The access() (and faccessat()) credentials change can cause an
    unnecessary load on the RCU machinery because every access() call ends
    up freeing the temporary access credential using RCU.

    This isn't really noticeable on small machines, but if you have hundreds
    of cores you can cause huge slowdowns due to RCU storms.

    It's easy to avoid: the temporary access crededntials aren't actually
    normally accessed using RCU at all, so we can avoid the whole issue by
    just marking them as such.

    * access-creds:
    access: avoid the RCU grace period for the temporary subjective credentials

    Linus Torvalds
     
  • It turns out that 'access()' (and 'faccessat()') can cause a lot of RCU
    work because it installs a temporary credential that gets allocated and
    freed for each system call.

    The allocation and freeing overhead is mostly benign, but because
    credentials can be accessed under the RCU read lock, the freeing
    involves a RCU grace period.

    Which is not a huge deal normally, but if you have a lot of access()
    calls, this causes a fair amount of seconday damage: instead of having a
    nice alloc/free patterns that hits in hot per-CPU slab caches, you have
    all those delayed free's, and on big machines with hundreds of cores,
    the RCU overhead can end up being enormous.

    But it turns out that all of this is entirely unnecessary. Exactly
    because access() only installs the credential as the thread-local
    subjective credential, the temporary cred pointer doesn't actually need
    to be RCU free'd at all. Once we're done using it, we can just free it
    synchronously and avoid all the RCU overhead.

    So add a 'non_rcu' flag to 'struct cred', which can be set by users that
    know they only use it in non-RCU context (there are other potential
    users for this). We can make it a union with the rcu freeing list head
    that we need for the RCU case, so this doesn't need any extra storage.

    Note that this also makes 'get_current_cred()' clear the new non_rcu
    flag, in case we have filesystems that take a long-term reference to the
    cred and then expect the RCU delayed freeing afterwards. It's not
    entirely clear that this is required, but it makes for clear semantics:
    the subjective cred remains non-RCU as long as you only access it
    synchronously using the thread-local accessors, but you _can_ use it as
    a generic cred if you want to.

    It is possible that we should just remove the whole RCU markings for
    ->cred entirely. Only ->real_cred is really supposed to be accessed
    through RCU, and the long-term cred copies that nfs uses might want to
    explicitly re-enable RCU freeing if required, rather than have
    get_current_cred() do it implicitly.

    But this is a "minimal semantic changes" change for the immediate
    problem.

    Acked-by: Peter Zijlstra (Intel)
    Acked-by: Eric Dumazet
    Acked-by: Paul E. McKenney
    Cc: Oleg Nesterov
    Cc: Jan Glauber
    Cc: Jiri Kosina
    Cc: Jayachandran Chandrasekharan Nair
    Cc: Greg KH
    Cc: Kees Cook
    Cc: David Howells
    Cc: Miklos Szeredi
    Cc: Al Viro
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

09 Jul, 2019

2 commits

  • Pull request_key improvements from David Howells:
    "These are all request_key()-related, including a fix and some improvements:

    - Fix the lack of a Link permission check on a key found by
    request_key(), thereby enabling request_key() to link keys that
    don't grant this permission to the target keyring (which must still
    grant Write permission).

    Note that the key must be in the caller's keyrings already to be
    found.

    - Invalidate used request_key authentication keys rather than
    revoking them, so that they get cleaned up immediately rather than
    hanging around till the expiry time is passed.

    - Move the RCU locks outwards from the keyring search functions so
    that a request_key_rcu() can be provided. This can be called in RCU
    mode, so it can't sleep and can't upcall - but it can be called
    from LOOKUP_RCU pathwalk mode.

    - Cache the latest positive result of request_key*() temporarily in
    task_struct so that filesystems that make a lot of request_key()
    calls during pathwalk can take advantage of it to avoid having to
    redo the searching. This requires CONFIG_KEYS_REQUEST_CACHE=y.

    It is assumed that the key just found is likely to be used multiple
    times in each step in an RCU pathwalk, and is likely to be reused
    for the next step too.

    Note that the cleanup of the cache is done on TIF_NOTIFY_RESUME,
    just before userspace resumes, and on exit"

    * tag 'keys-request-20190626' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
    keys: Kill off request_key_async{,_with_auxdata}
    keys: Cache result of request_key*() temporarily in task_struct
    keys: Provide request_key_rcu()
    keys: Move the RCU locks outwards from the keyring search functions
    keys: Invalidate used request_key authentication keys
    keys: Fix request_key() lack of Link perm check on found key

    Linus Torvalds
     
  • Pull misc keyring updates from David Howells:
    "These are some miscellaneous keyrings fixes and improvements:

    - Fix a bunch of warnings from sparse, including missing RCU bits and
    kdoc-function argument mismatches

    - Implement a keyctl to allow a key to be moved from one keyring to
    another, with the option of prohibiting key replacement in the
    destination keyring.

    - Grant Link permission to possessors of request_key_auth tokens so
    that upcall servicing daemons can more easily arrange things such
    that only the necessary auth key is passed to the actual service
    program, and not all the auth keys a daemon might possesss.

    - Improvement in lookup_user_key().

    - Implement a keyctl to allow keyrings subsystem capabilities to be
    queried.

    The keyutils next branch has commits to make available, document and
    test the move-key and capabilities code:

    https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/keyutils.git/log

    They're currently on the 'next' branch"

    * tag 'keys-misc-20190619' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
    keys: Add capability-checking keyctl function
    keys: Reuse keyring_index_key::desc_len in lookup_user_key()
    keys: Grant Link permission to possessers of request_key auth keys
    keys: Add a keyctl to move a key between keyrings
    keys: Hoist locking out of __key_link_begin()
    keys: Break bits out of key_unlink()
    keys: Change keyring_serialise_link_sem to a mutex
    keys: sparse: Fix kdoc mismatches
    keys: sparse: Fix incorrect RCU accesses
    keys: sparse: Fix key_fs[ug]id_changed()

    Linus Torvalds
     

19 Jun, 2019

1 commit

  • If a filesystem uses keys to hold authentication tokens, then it needs a
    token for each VFS operation that might perform an authentication check -
    either by passing it to the server, or using to perform a check based on
    authentication data cached locally.

    For open files this isn't a problem, since the key should be cached in the
    file struct since it represents the subject performing operations on that
    file descriptor.

    During pathwalk, however, there isn't anywhere to cache the key, except
    perhaps in the nameidata struct - but that isn't exposed to the
    filesystems. Further, a pathwalk can incur a lot of operations, calling
    one or more of the following, for instance:

    ->lookup()
    ->permission()
    ->d_revalidate()
    ->d_automount()
    ->get_acl()
    ->getxattr()

    on each dentry/inode it encounters - and each one may need to call
    request_key(). And then, at the end of pathwalk, it will call the actual
    operation:

    ->mkdir()
    ->mknod()
    ->getattr()
    ->open()
    ...

    which may need to go and get the token again.

    However, it is very likely that all of the operations on a single
    dentry/inode - and quite possibly a sequence of them - will all want to use
    the same authentication token, which suggests that caching it would be a
    good idea.

    To this end:

    (1) Make it so that a positive result of request_key() and co. that didn't
    require upcalling to userspace is cached temporarily in task_struct.

    (2) The cache is 1 deep, so a new result displaces the old one.

    (3) The key is released by exit and by notify-resume.

    (4) The cache is cleared in a newly forked process.

    Signed-off-by: David Howells

    David Howells
     

12 Jun, 2019

2 commits

  • Pull ptrace fixes from Eric Biederman:
    "This is just two very minor fixes:

    - prevent ptrace from reading unitialized kernel memory found twice
    by syzkaller

    - restore a missing smp_rmb in ptrace_may_access and add comment tp
    it so it is not removed by accident again.

    Apologies for being a little slow about getting this to you, I am
    still figuring out how to develop with a little baby in the house"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
    ptrace: restore smp_rmb() in __ptrace_may_access()
    signal/ptrace: Don't leak unitialized kernel memory with PTRACE_PEEK_SIGINFO

    Linus Torvalds
     
  • Restore the read memory barrier in __ptrace_may_access() that was deleted
    a couple years ago. Also add comments on this barrier and the one it pairs
    with to explain why they're there (as far as I understand).

    Fixes: bfedb589252c ("mm: Add a user_ns owner to mm_struct and fix ptrace permission checks")
    Cc: stable@vger.kernel.org
    Acked-by: Kees Cook
    Acked-by: Oleg Nesterov
    Signed-off-by: Jann Horn
    Signed-off-by: Eric W. Biederman

    Jann Horn
     

24 May, 2019

1 commit

  • Based on 1 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public licence as published by
    the free software foundation either version 2 of the licence or at
    your option any later version

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-or-later

    has been chosen to replace the boilerplate/reference in 114 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Reviewed-by: Kate Stewart
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190520170857.552531963@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

22 May, 2019

1 commit

  • Sparse warnings are incurred by key_fs[ug]id_changed() due to unprotected
    accesses of tsk->cred, which is marked __rcu.

    Fix this by passing the new cred struct to these functions from
    commit_creds() rather than the task pointer.

    Signed-off-by: David Howells
    Reviewed-by: James Morris

    David Howells
     

09 Jan, 2019

1 commit

  • The SELinux specific credential poisioning only makes sense
    if SELinux is managing the credentials. As the intent of this
    patch set is to move the blob management out of the modules
    and into the infrastructure, the SELinux specific code has
    to go. The poisioning could be introduced into the infrastructure
    at some later date.

    Signed-off-by: Casey Schaufler
    Reviewed-by: Kees Cook
    Signed-off-by: Kees Cook

    Casey Schaufler
     

20 Dec, 2018

3 commits

  • There is no reason that modules should not be able
    to use this, and NFS will need it when converted to
    use 'struct cred'.

    Signed-off-by: NeilBrown
    Signed-off-by: Anna Schumaker

    NeilBrown
     
  • Sometimes we want to opportunistically get a
    ref to a cred in an rcu_read_lock protected section.
    get_task_cred() does this, and NFS does as similar thing
    with its own credential structures.
    To prepare for NFS converting to use 'struct cred' more
    uniformly, define get_cred_rcu(), and use it in
    get_task_cred().

    Signed-off-by: NeilBrown
    Signed-off-by: Anna Schumaker

    NeilBrown
     
  • NFS needs to compare to credentials, to see if they can
    be treated the same w.r.t. filesystem access. Sometimes
    an ordering is needed when credentials are used as a key
    to an rbtree.
    NFS currently has its own private credential management from
    before 'struct cred' existed. To move it over to more consistent
    use of 'struct cred' we need a comparison function.
    This patch adds that function.

    Signed-off-by: NeilBrown
    Signed-off-by: Anna Schumaker

    NeilBrown
     

19 May, 2017

1 commit

  • This updates the credentials API documentation to ReST markup and moves
    it under the security subsection of kernel API documentation.

    Cc: David Howells
    Signed-off-by: Kees Cook
    Signed-off-by: Jonathan Corbet

    Kees Cook
     

02 Mar, 2017

1 commit


01 Jul, 2016

1 commit


15 Jan, 2016

1 commit

  • Mark those kmem allocations that are known to be easily triggered from
    userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to
    memcg. For the list, see below:

    - threadinfo
    - task_struct
    - task_delay_info
    - pid
    - cred
    - mm_struct
    - vm_area_struct and vm_region (nommu)
    - anon_vma and anon_vma_chain
    - signal_struct
    - sighand_struct
    - fs_struct
    - files_struct
    - fdtable and fdtable->full_fds_bits
    - dentry and external_name
    - inode for all filesystems. This is the most tedious part, because
    most filesystems overwrite the alloc_inode method.

    The list is far from complete, so feel free to add more objects.
    Nevertheless, it should be close to "account everything" approach and
    keep most workloads within bounds. Malevolent users will be able to
    breach the limit, but this was possible even with the former "account
    everything" approach (simply because it did not account everything in
    fact).

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Vladimir Davydov
    Acked-by: Johannes Weiner
    Acked-by: Michal Hocko
    Cc: Tejun Heo
    Cc: Greg Thelen
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: David Rientjes
    Cc: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vladimir Davydov
     

11 Sep, 2015

1 commit

  • Commit e0e817392b9a ("CRED: Add some configurable debugging [try #6]")
    added the kdebug mechanism to this file back in 2009.

    The kdebug macro calls no_printk which always evaluates arguments.

    Most of the kdebug uses have an unnecessary call of
    atomic_read(&cred->usage)

    Make the kdebug macro do nothing by defining it with
    do { if (0) no_printk(...); } while (0)
    when not enabled.

    $ size kernel/cred.o* (defconfig x86-64)
    text data bss dec hex filename
    2748 336 8 3092 c14 kernel/cred.o.new
    2788 336 8 3132 c3c kernel/cred.o.old

    Miscellanea:
    o Neaten the #define kdebug macros while there

    Signed-off-by: Joe Perches
    Cc: David Howells
    Cc: James Morris
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     

16 Apr, 2015

1 commit

  • There are a lot of embedded systems that run most or all of their
    functionality in init, running as root:root. For these systems,
    supporting multiple users is not necessary.

    This patch adds a new symbol, CONFIG_MULTIUSER, that makes support for
    non-root users, non-root groups, and capabilities optional. It is enabled
    under CONFIG_EXPERT menu.

    When this symbol is not defined, UID and GID are zero in any possible case
    and processes always have all capabilities.

    The following syscalls are compiled out: setuid, setregid, setgid,
    setreuid, setresuid, getresuid, setresgid, getresgid, setgroups,
    getgroups, setfsuid, setfsgid, capget, capset.

    Also, groups.c is compiled out completely.

    In kernel/capability.c, capable function was moved in order to avoid
    adding two ifdef blocks.

    This change saves about 25 KB on a defconfig build. The most minimal
    kernels have total text sizes in the high hundreds of kB rather than
    low MB. (The 25k goes down a bit with allnoconfig, but not that much.

    The kernel was booted in Qemu. All the common functionalities work.
    Adding users/groups is not possible, failing with -ENOSYS.

    Bloat-o-meter output:
    add/remove: 7/87 grow/shrink: 19/397 up/down: 1675/-26325 (-24650)

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Iulia Manda
    Reviewed-by: Josh Triplett
    Acked-by: Geert Uytterhoeven
    Tested-by: Paul E. McKenney
    Reviewed-by: Paul E. McKenney
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Iulia Manda
     

19 Dec, 2012

1 commit

  • Pull (again) user namespace infrastructure changes from Eric Biederman:
    "Those bugs, those darn embarrasing bugs just want don't want to get
    fixed.

    Linus I just updated my mirror of your kernel.org tree and it appears
    you successfully pulled everything except the last 4 commits that fix
    those embarrasing bugs.

    When you get a chance can you please repull my branch"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
    userns: Fix typo in description of the limitation of userns_install
    userns: Add a more complete capability subset test to commit_creds
    userns: Require CAP_SYS_ADMIN for most uses of setns.
    Fix cap_capable to only allow owners in the parent user namespace to have caps.

    Linus Torvalds
     

17 Dec, 2012

1 commit

  • Pull security subsystem updates from James Morris:
    "A quiet cycle for the security subsystem with just a few maintenance
    updates."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
    Smack: create a sysfs mount point for smackfs
    Smack: use select not depends in Kconfig
    Yama: remove locking from delete path
    Yama: add RCU to drop read locking
    drivers/char/tpm: remove tasklet and cleanup
    KEYS: Use keyring_alloc() to create special keyrings
    KEYS: Reduce initial permissions on keys
    KEYS: Make the session and process keyrings per-thread
    seccomp: Make syscall skipping and nr changes more consistent
    key: Fix resource leak
    keys: Fix unreachable code
    KEYS: Add payload preparsing opportunity prior to key instantiate or update

    Linus Torvalds
     

15 Dec, 2012

1 commit

  • When unsharing a user namespace we reduce our credentials to just what
    can be done in that user namespace. This is a subset of the credentials
    we previously had. Teach commit_creds to recognize this is a subset
    of the credentials we have had before and don't clear the dumpability flag.

    This allows an unprivileged program to do:
    unshare(CLONE_NEWUSER);
    fd = open("/proc/self/uid_map", O_RDWR);

    Where previously opening the uid_map writable would fail because
    the the task had been made non-dumpable.

    Acked-by: Serge Hallyn
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     

03 Oct, 2012

1 commit

  • Make the session keyring per-thread rather than per-process, but still
    inherited from the parent thread to solve a problem with PAM and gdm.

    The problem is that join_session_keyring() will reject attempts to change the
    session keyring of a multithreaded program but gdm is now multithreaded before
    it gets to the point of starting PAM and running pam_keyinit to create the
    session keyring. See:

    https://bugs.freedesktop.org/show_bug.cgi?id=49211

    The reason that join_session_keyring() will only change the session keyring
    under a single-threaded environment is that it's hard to alter the other
    thread's credentials to effect the change in a multi-threaded program. The
    problems are such as:

    (1) How to prevent two threads both running join_session_keyring() from
    racing.

    (2) Another thread's credentials may not be modified directly by this process.

    (3) The number of threads is uncertain whilst we're not holding the
    appropriate spinlock, making preallocation slightly tricky.

    (4) We could use TIF_NOTIFY_RESUME and key_replace_session_keyring() to get
    another thread to replace its keyring, but that means preallocating for
    each thread.

    A reasonable way around this is to make the session keyring per-thread rather
    than per-process and just document that if you want a common session keyring,
    you must get it before you spawn any threads - which is the current situation
    anyway.

    Whilst we're at it, we can the process keyring behave in the same way. This
    means we can clean up some of the ickyness in the creds code.

    Basically, after this patch, the session, process and thread keyrings are about
    inheritance rules only and not about sharing changes of keyring.

    Reported-by: Mantas M.
    Signed-off-by: David Howells
    Tested-by: Ray Strode

    David Howells
     

24 Aug, 2012

1 commit


24 May, 2012

2 commits

  • Kill the no longer used task_struct->replacement_session_keyring, update
    copy_creds() and exit_creds().

    Signed-off-by: Oleg Nesterov
    Acked-by: David Howells
    Cc: Thomas Gleixner
    Cc: Richard Kuo
    Cc: Linus Torvalds
    Cc: Alexander Gordeev
    Cc: Chris Zankel
    Cc: David Smith
    Cc: "Frank Ch. Eigler"
    Cc: Geert Uytterhoeven
    Cc: Larry Woodman
    Cc: Peter Zijlstra
    Cc: Tejun Heo
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Al Viro

    Oleg Nesterov
     
  • Pull user namespace enhancements from Eric Biederman:
    "This is a course correction for the user namespace, so that we can
    reach an inexpensive, maintainable, and reasonably complete
    implementation.

    Highlights:
    - Config guards make it impossible to enable the user namespace and
    code that has not been converted to be user namespace safe.

    - Use of the new kuid_t type ensures the if you somehow get past the
    config guards the kernel will encounter type errors if you enable
    user namespaces and attempt to compile in code whose permission
    checks have not been updated to be user namespace safe.

    - All uids from child user namespaces are mapped into the initial
    user namespace before they are processed. Removing the need to add
    an additional check to see if the user namespace of the compared
    uids remains the same.

    - With the user namespaces compiled out the performance is as good or
    better than it is today.

    - For most operations absolutely nothing changes performance or
    operationally with the user namespace enabled.

    - The worst case performance I could come up with was timing 1
    billion cache cold stat operations with the user namespace code
    enabled. This went from 156s to 164s on my laptop (or 156ns to
    164ns per stat operation).

    - (uid_t)-1 and (gid_t)-1 are reserved as an internal error value.
    Most uid/gid setting system calls treat these value specially
    anyway so attempting to use -1 as a uid would likely cause
    entertaining failures in userspace.

    - If setuid is called with a uid that can not be mapped setuid fails.
    I have looked at sendmail, login, ssh and every other program I
    could think of that would call setuid and they all check for and
    handle the case where setuid fails.

    - If stat or a similar system call is called from a context in which
    we can not map a uid we lie and return overflowuid. The LFS
    experience suggests not lying and returning an error code might be
    better, but the historical precedent with uids is different and I
    can not think of anything that would break by lying about a uid we
    can't map.

    - Capabilities are localized to the current user namespace making it
    safe to give the initial user in a user namespace all capabilities.

    My git tree covers all of the modifications needed to convert the core
    kernel and enough changes to make a system bootable to runlevel 1."

    Fix up trivial conflicts due to nearby independent changes in fs/stat.c

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (46 commits)
    userns: Silence silly gcc warning.
    cred: use correct cred accessor with regards to rcu read lock
    userns: Convert the move_pages, and migrate_pages permission checks to use uid_eq
    userns: Convert cgroup permission checks to use uid_eq
    userns: Convert tmpfs to use kuid and kgid where appropriate
    userns: Convert sysfs to use kgid/kuid where appropriate
    userns: Convert sysctl permission checks to use kuid and kgids.
    userns: Convert proc to use kuid/kgid where appropriate
    userns: Convert ext4 to user kuid/kgid where appropriate
    userns: Convert ext3 to use kuid/kgid where appropriate
    userns: Convert ext2 to use kuid/kgid where appropriate.
    userns: Convert devpts to use kuid/kgid where appropriate
    userns: Convert binary formats to use kuid/kgid where appropriate
    userns: Add negative depends on entries to avoid building code that is userns unsafe
    userns: signal remove unnecessary map_cred_ns
    userns: Teach inode_capable to understand inodes whose uids map to other namespaces.
    userns: Fail exec for suid and sgid binaries with ids outside our user namespace.
    userns: Convert stat to return values mapped from kuids and kgids
    userns: Convert user specfied uids and gids in chown into kuids and kgid
    userns: Use uid_eq gid_eq helpers when comparing kuids and kgids in the vfs
    ...

    Linus Torvalds
     

03 May, 2012

1 commit


11 Apr, 2012

1 commit

  • keyctl_session_to_parent(task) sets ->replacement_session_keyring,
    it should be processed and cleared by key_replace_session_keyring().

    However, this task can fork before it notices TIF_NOTIFY_RESUME and
    the new child gets the bogus ->replacement_session_keyring copied by
    dup_task_struct(). This is obviously wrong and, if nothing else, this
    leads to put_cred(already_freed_cred).

    change copy_creds() to clear this member. If copy_process() fails
    before this point the wrong ->replacement_session_keyring doesn't
    matter, exit_creds() won't be called.

    Cc:
    Signed-off-by: Oleg Nesterov
    Acked-by: David Howells
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

08 Apr, 2012

1 commit


14 Feb, 2012

1 commit


31 Oct, 2011

1 commit

  • The changed files were only including linux/module.h for the
    EXPORT_SYMBOL infrastructure, and nothing else. Revector them
    onto the isolated export header for faster compile times.

    Nothing to see here but a whole lot of instances of:

    -#include
    +#include

    This commit is only changing the kernel dir; next targets
    will probably be mm, fs, the arch dirs, etc.

    Signed-off-by: Paul Gortmaker

    Paul Gortmaker
     

25 Oct, 2011

1 commit

  • * 'next' of git://selinuxproject.org/~jmorris/linux-security: (95 commits)
    TOMOYO: Fix incomplete read after seek.
    Smack: allow to access /smack/access as normal user
    TOMOYO: Fix unused kernel config option.
    Smack: fix: invalid length set for the result of /smack/access
    Smack: compilation fix
    Smack: fix for /smack/access output, use string instead of byte
    Smack: domain transition protections (v3)
    Smack: Provide information for UDS getsockopt(SO_PEERCRED)
    Smack: Clean up comments
    Smack: Repair processing of fcntl
    Smack: Rule list lookup performance
    Smack: check permissions from user space (v2)
    TOMOYO: Fix quota and garbage collector.
    TOMOYO: Remove redundant tasklist_lock.
    TOMOYO: Fix domain transition failure warning.
    TOMOYO: Remove tomoyo_policy_memory_lock spinlock.
    TOMOYO: Simplify garbage collector.
    TOMOYO: Fix make namespacecheck warnings.
    target: check hex2bin result
    encrypted-keys: check hex2bin result
    ...

    Linus Torvalds
     

23 Aug, 2011

2 commits

  • This patch adds CONFIG_KEYS guard for tgcred to fix below build error
    if CONFIG_KEYS is not configured.

    CC kernel/cred.o
    kernel/cred.c: In function 'prepare_kernel_cred':
    kernel/cred.c:657: error: 'tgcred' undeclared (first use in this function)
    kernel/cred.c:657: error: (Each undeclared identifier is reported only once
    kernel/cred.c:657: error: for each function it appears in.)
    make[1]: *** [kernel/cred.o] Error 1
    make: *** [kernel] Error 2

    Signed-off-by: Axel Lin
    Acked-by: David Howells
    Signed-off-by: James Morris

    Axel Lin
     
  • Fix prepare_kernel_cred() to provide a new, separate thread_group_cred struct
    otherwise when using request_key() ____call_usermodehelper() calls
    umh_keys_init() with the new creds pointing to init_tgcred, which
    umh_keys_init() then blithely alters.

    The problem can be demonstrated by:

    # keyctl request2 user a debug:a @s
    249681132
    # grep req /proc/keys
    079906a5 I--Q-- 1 perm 1f3f0000 0 0 keyring _req.249681132: 1/4
    38ef1626 IR---- 1 expd 0b010000 0 0 .request_ key:ee1d4ec pid:4371 ci:1

    The keyring _req.XXXX should have gone away, but something (init_tgcred) is
    pinning it.

    That key actually requested can then be removed and a new one created:

    # keyctl unlink 249681132
    1 links removed
    [root@andromeda ~]# grep req /proc/keys
    116cecac IR---- 1 expd 0b010000 0 0 .request_ key:eeb4911 pid:4379 ci:1
    36d1cbf8 I--Q-- 1 perm 1f3f0000 0 0 keyring _req.250300689: 1/4

    which causes the old _req keyring to go away and a new one to take its place.

    This is a consequence of the changes in:

    commit 879669961b11e7f40b518784863a259f735a72bf
    Author: David Howells
    Date: Fri Jun 17 11:25:59 2011 +0100
    KEYS/DNS: Fix ____call_usermodehelper() to not lose the session keyring

    and:

    commit 17f60a7da150fdd0cfb9756f86a262daa72c835f
    Author: Eric Paris
    Date: Fri Apr 1 17:07:50 2011 -0400
    capabilites: allow the application of capability limits to usermode helpers

    After this patch is applied, the _req keyring and the .request_key key are
    cleaned up.

    Signed-off-by: David Howells
    cc: Eric Paris
    Signed-off-by: James Morris

    David Howells
     

12 Aug, 2011

1 commit

  • The patch http://lkml.org/lkml/2003/7/13/226 introduced an RLIMIT_NPROC
    check in set_user() to check for NPROC exceeding via setuid() and
    similar functions.

    Before the check there was a possibility to greatly exceed the allowed
    number of processes by an unprivileged user if the program relied on
    rlimit only. But the check created new security threat: many poorly
    written programs simply don't check setuid() return code and believe it
    cannot fail if executed with root privileges. So, the check is removed
    in this patch because of too often privilege escalations related to
    buggy programs.

    The NPROC can still be enforced in the common code flow of daemons
    spawning user processes. Most of daemons do fork()+setuid()+execve().
    The check introduced in execve() (1) enforces the same limit as in
    setuid() and (2) doesn't create similar security issues.

    Neil Brown suggested to track what specific process has exceeded the
    limit by setting PF_NPROC_EXCEEDED process flag. With the change only
    this process would fail on execve(), and other processes' execve()
    behaviour is not changed.

    Solar Designer suggested to re-check whether NPROC limit is still
    exceeded at the moment of execve(). If the process was sleeping for
    days between set*uid() and execve(), and the NPROC counter step down
    under the limit, the defered execve() failure because NPROC limit was
    exceeded days ago would be unexpected. If the limit is not exceeded
    anymore, we clear the flag on successful calls to execve() and fork().

    The flag is also cleared on successful calls to set_user() as the limit
    was exceeded for the previous user, not the current one.

    Similar check was introduced in -ow patches (without the process flag).

    v3 - clear PF_NPROC_EXCEEDED on successful calls to set_user().

    Reviewed-by: James Morris
    Signed-off-by: Vasiliy Kulikov
    Acked-by: NeilBrown
    Signed-off-by: Linus Torvalds

    Vasiliy Kulikov