01 Jul, 2016

1 commit


15 Jan, 2016

1 commit

  • Mark those kmem allocations that are known to be easily triggered from
    userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to
    memcg. For the list, see below:

    - threadinfo
    - task_struct
    - task_delay_info
    - pid
    - cred
    - mm_struct
    - vm_area_struct and vm_region (nommu)
    - anon_vma and anon_vma_chain
    - signal_struct
    - sighand_struct
    - fs_struct
    - files_struct
    - fdtable and fdtable->full_fds_bits
    - dentry and external_name
    - inode for all filesystems. This is the most tedious part, because
    most filesystems overwrite the alloc_inode method.

    The list is far from complete, so feel free to add more objects.
    Nevertheless, it should be close to the "account everything" approach and
    keep most workloads within bounds. Malevolent users will be able to
    breach the limit, but this was possible even with the former "account
    everything" approach (simply because it did not account everything in
    fact).

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Vladimir Davydov
    Acked-by: Johannes Weiner
    Acked-by: Michal Hocko
    Cc: Tejun Heo
    Cc: Greg Thelen
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: David Rientjes
    Cc: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vladimir Davydov
     

11 Sep, 2015

1 commit

  • Commit e0e817392b9a ("CRED: Add some configurable debugging [try #6]")
    added the kdebug mechanism to this file back in 2009.

    The kdebug macro calls no_printk which always evaluates arguments.

    Most of the kdebug uses have an unnecessary call of
    atomic_read(&cred->usage)

    Make the kdebug macro do nothing by defining it with
    do { if (0) no_printk(...); } while (0)
    when not enabled.
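The effect of the `if (0)` wrapper can be sketched in userspace C. This is only an illustration: `read_usage()`, `kdebug_old` and `kdebug_new` are hypothetical names, `read_usage()` stands in for `atomic_read(&cred->usage)`, and the local `no_printk()` is modelled on the kernel helper rather than copied from it.

```c
#include <assert.h>

/* Hypothetical stand-in for an argument with a cost, such as
 * atomic_read(&cred->usage). It counts how often it is evaluated. */
static int evaluations;

static int read_usage(void)
{
	evaluations++;
	return 42;
}

/* Modelled on no_printk(): the format string is still type-checked,
 * nothing is printed, but the caller's arguments ARE evaluated. */
__attribute__((format(printf, 1, 2)))
static inline int no_printk(const char *fmt, ...)
{
	(void)fmt;
	return 0;
}

/* Disabled macro, old style: arguments are evaluated on every use. */
#define kdebug_old(FMT, ...) \
	no_printk("[demo] " FMT "\n", ##__VA_ARGS__)

/* Disabled macro, new style: the call sits in dead code, so the
 * arguments are type-checked but never evaluated. */
#define kdebug_new(FMT, ...)					\
do {								\
	if (0)							\
		no_printk("[demo] " FMT "\n", ##__VA_ARGS__);	\
} while (0)

/* Each helper returns how many times read_usage() ran for one macro use. */
static int count_old(void)
{
	evaluations = 0;
	kdebug_old("usage=%d", read_usage());
	return evaluations;
}

static int count_new(void)
{
	evaluations = 0;
	kdebug_new("usage=%d", read_usage());
	return evaluations;
}
```

Under these assumptions `count_old()` reports one evaluation and `count_new()` reports none, which is the kind of saving the size figures reflect.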

    $ size kernel/cred.o* (defconfig x86-64)
       text    data     bss     dec     hex filename
       2748     336       8    3092     c14 kernel/cred.o.new
       2788     336       8    3132     c3c kernel/cred.o.old

    Miscellanea:
    o Neaten the #define kdebug macros while there

    Signed-off-by: Joe Perches
    Cc: David Howells
    Cc: James Morris
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     

16 Apr, 2015

1 commit

  • There are a lot of embedded systems that run most or all of their
    functionality in init, running as root:root. For these systems,
    supporting multiple users is not necessary.

    This patch adds a new symbol, CONFIG_MULTIUSER, that makes support for
    non-root users, non-root groups, and capabilities optional. It is enabled
    under the CONFIG_EXPERT menu.

    When this symbol is not defined, UID and GID are zero in any possible case
    and processes always have all capabilities.

    The following syscalls are compiled out: setuid, setregid, setgid,
    setreuid, setresuid, getresuid, setresgid, getresgid, setgroups,
    getgroups, setfsuid, setfsgid, capget, capset.

    Also, groups.c is compiled out completely.

    In kernel/capability.c, the capable() function was moved in order to avoid
    adding two ifdef blocks.

    This change saves about 25 KB on a defconfig build. The most minimal
    kernels have total text sizes in the high hundreds of kB rather than
    low MB. (The 25k goes down a bit with allnoconfig, but not that much.)

    The kernel was booted in Qemu. All the common functionalities work.
    Adding users/groups is not possible, failing with -ENOSYS.
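A userspace probe for this behaviour might look like the sketch below. The helper name `setuid_syscall_available()` is hypothetical; it assumes that setting the UID to its current value is otherwise harmless and that a compiled-out syscall reports ENOSYS, as described above.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Probe whether setuid() is implemented by the running kernel.
 * Returns 1 if the call works, 0 if it fails with ENOSYS (as it would
 * on a kernel built without CONFIG_MULTIUSER), and -1 on any other error.
 */
static int setuid_syscall_available(void)
{
	/* Setting the UID to its current value changes nothing. */
	if (setuid(getuid()) == 0)
		return 1;
	return errno == ENOSYS ? 0 : -1;
}
```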

    Bloat-o-meter output:
    add/remove: 7/87 grow/shrink: 19/397 up/down: 1675/-26325 (-24650)

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Iulia Manda
    Reviewed-by: Josh Triplett
    Acked-by: Geert Uytterhoeven
    Tested-by: Paul E. McKenney
    Reviewed-by: Paul E. McKenney
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Iulia Manda
     

19 Dec, 2012

1 commit

  • Pull (again) user namespace infrastructure changes from Eric Biederman:
    "Those bugs, those darn embarrassing bugs just don't want to get
    fixed.

    Linus I just updated my mirror of your kernel.org tree and it appears
    you successfully pulled everything except the last 4 commits that fix
    those embarrassing bugs.

    When you get a chance can you please repull my branch"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
    userns: Fix typo in description of the limitation of userns_install
    userns: Add a more complete capability subset test to commit_creds
    userns: Require CAP_SYS_ADMIN for most uses of setns.
    Fix cap_capable to only allow owners in the parent user namespace to have caps.

    Linus Torvalds
     

17 Dec, 2012

1 commit

  • Pull security subsystem updates from James Morris:
    "A quiet cycle for the security subsystem with just a few maintenance
    updates."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
    Smack: create a sysfs mount point for smackfs
    Smack: use select not depends in Kconfig
    Yama: remove locking from delete path
    Yama: add RCU to drop read locking
    drivers/char/tpm: remove tasklet and cleanup
    KEYS: Use keyring_alloc() to create special keyrings
    KEYS: Reduce initial permissions on keys
    KEYS: Make the session and process keyrings per-thread
    seccomp: Make syscall skipping and nr changes more consistent
    key: Fix resource leak
    keys: Fix unreachable code
    KEYS: Add payload preparsing opportunity prior to key instantiate or update

    Linus Torvalds
     

15 Dec, 2012

1 commit

  • When unsharing a user namespace we reduce our credentials to just what
    can be done in that user namespace. This is a subset of the credentials
    we previously had. Teach commit_creds to recognize this is a subset
    of the credentials we have had before and don't clear the dumpability flag.

    This allows an unprivileged program to do:
    unshare(CLONE_NEWUSER);
    fd = open("/proc/self/uid_map", O_RDWR);

    Where previously opening the uid_map writable would fail because
    the task had been made non-dumpable.
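A self-contained version of that sequence could look like the following sketch. The wrapper `try_userns_uid_map()` and its return codes are hypothetical; the work is done in a forked child so the caller's own namespaces are left alone, and whether `unshare(CLONE_NEWUSER)` succeeds depends on how the running kernel is configured.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sched.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run unshare(CLONE_NEWUSER) followed by a read-write open of
 * /proc/self/uid_map in a child process.
 * Returns 0 if the open succeeded, 1 if unshare() failed, 2 if open()
 * failed, and -1 if the child could not be created or did not exit
 * normally.
 */
static int try_userns_uid_map(void)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;

	if (pid == 0) {
		int fd;

		if (unshare(CLONE_NEWUSER) != 0)
			_exit(1);
		fd = open("/proc/self/uid_map", O_RDWR);
		if (fd < 0)
			_exit(2);
		close(fd);
		_exit(0);
	}

	if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

On a kernel with this fix and unprivileged user namespaces enabled, a result of 0 would be expected; a result of 2 corresponds to the failure described above.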

    Acked-by: Serge Hallyn
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     

03 Oct, 2012

1 commit

  • Make the session keyring per-thread rather than per-process, but still
    inherited from the parent thread to solve a problem with PAM and gdm.

    The problem is that join_session_keyring() will reject attempts to change the
    session keyring of a multithreaded program but gdm is now multithreaded before
    it gets to the point of starting PAM and running pam_keyinit to create the
    session keyring. See:

    https://bugs.freedesktop.org/show_bug.cgi?id=49211

    The reason that join_session_keyring() will only change the session keyring
    under a single-threaded environment is that it's hard to alter the other
    thread's credentials to effect the change in a multi-threaded program. The
    problems include:

    (1) How to prevent two threads both running join_session_keyring() from
    racing.

    (2) Another thread's credentials may not be modified directly by this process.

    (3) The number of threads is uncertain whilst we're not holding the
    appropriate spinlock, making preallocation slightly tricky.

    (4) We could use TIF_NOTIFY_RESUME and key_replace_session_keyring() to get
    another thread to replace its keyring, but that means preallocating for
    each thread.

    A reasonable way around this is to make the session keyring per-thread rather
    than per-process and just document that if you want a common session keyring,
    you must get it before you spawn any threads - which is the current situation
    anyway.

    Whilst we're at it, we can make the process keyring behave in the same way. This
    means we can clean up some of the ickyness in the creds code.

    Basically, after this patch, the session, process and thread keyrings are about
    inheritance rules only and not about sharing changes of keyring.

    Reported-by: Mantas M.
    Signed-off-by: David Howells
    Tested-by: Ray Strode

    David Howells
     

24 Aug, 2012

1 commit


24 May, 2012

2 commits

  • Kill the no longer used task_struct->replacement_session_keyring, update
    copy_creds() and exit_creds().

    Signed-off-by: Oleg Nesterov
    Acked-by: David Howells
    Cc: Thomas Gleixner
    Cc: Richard Kuo
    Cc: Linus Torvalds
    Cc: Alexander Gordeev
    Cc: Chris Zankel
    Cc: David Smith
    Cc: "Frank Ch. Eigler"
    Cc: Geert Uytterhoeven
    Cc: Larry Woodman
    Cc: Peter Zijlstra
    Cc: Tejun Heo
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Al Viro

    Oleg Nesterov
     
  • Pull user namespace enhancements from Eric Biederman:
    "This is a course correction for the user namespace, so that we can
    reach an inexpensive, maintainable, and reasonably complete
    implementation.

    Highlights:
    - Config guards make it impossible to enable the user namespace and
    code that has not been converted to be user namespace safe.

    - Use of the new kuid_t type ensures that if you somehow get past the
    config guards the kernel will encounter type errors if you enable
    user namespaces and attempt to compile in code whose permission
    checks have not been updated to be user namespace safe.

    - All uids from child user namespaces are mapped into the initial
    user namespace before they are processed. Removing the need to add
    an additional check to see if the user namespace of the compared
    uids remains the same.

    - With the user namespaces compiled out the performance is as good or
    better than it is today.

    - For most operations absolutely nothing changes performance or
    operationally with the user namespace enabled.

    - The worst case performance I could come up with was timing 1
    billion cache cold stat operations with the user namespace code
    enabled. This went from 156s to 164s on my laptop (or 156ns to
    164ns per stat operation).

    - (uid_t)-1 and (gid_t)-1 are reserved as an internal error value.
    Most uid/gid setting system calls treat these values specially
    anyway so attempting to use -1 as a uid would likely cause
    entertaining failures in userspace.

    - If setuid is called with a uid that can not be mapped setuid fails.
    I have looked at sendmail, login, ssh and every other program I
    could think of that would call setuid and they all check for and
    handle the case where setuid fails.

    - If stat or a similar system call is called from a context in which
    we can not map a uid we lie and return overflowuid. The LFS
    experience suggests not lying and returning an error code might be
    better, but the historical precedent with uids is different and I
    can not think of anything that would break by lying about a uid we
    can't map.

    - Capabilities are localized to the current user namespace making it
    safe to give the initial user in a user namespace all capabilities.

    My git tree covers all of the modifications needed to convert the core
    kernel and enough changes to make a system bootable to runlevel 1."
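The type-safety point about kuid_t can be illustrated with a small userspace sketch. The `demo_` names are hypothetical and only modelled on the kernel's kuid_t, uid_eq() and the reserved (uid_t)-1 value; in particular, `demo_make_kuid()` performs no namespace mapping at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>

/*
 * Wrapping the raw id in a struct means that "kuid == uid" or
 * "kuid_a == kuid_b" no longer compiles, so every comparison has to go
 * through a helper that can be audited.
 */
typedef struct {
	uid_t val;
} demo_kuid_t;

/* Identity conversion; a real implementation would map through a namespace. */
static inline demo_kuid_t demo_make_kuid(uid_t uid)
{
	return (demo_kuid_t){ .val = uid };
}

/* The only sanctioned way to compare two wrapped ids. */
static inline bool demo_uid_eq(demo_kuid_t a, demo_kuid_t b)
{
	return a.val == b.val;
}

/* (uid_t)-1 is reserved as the internal error value. */
static inline bool demo_uid_valid(demo_kuid_t k)
{
	return k.val != (uid_t)-1;
}
```

Code that still writes `if (a == b)` on two `demo_kuid_t` values is rejected by the compiler, which is the kind of build failure the highlights above rely on.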

    Fix up trivial conflicts due to nearby independent changes in fs/stat.c

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (46 commits)
    userns: Silence silly gcc warning.
    cred: use correct cred accessor with regards to rcu read lock
    userns: Convert the move_pages, and migrate_pages permission checks to use uid_eq
    userns: Convert cgroup permission checks to use uid_eq
    userns: Convert tmpfs to use kuid and kgid where appropriate
    userns: Convert sysfs to use kgid/kuid where appropriate
    userns: Convert sysctl permission checks to use kuid and kgids.
    userns: Convert proc to use kuid/kgid where appropriate
    userns: Convert ext4 to user kuid/kgid where appropriate
    userns: Convert ext3 to use kuid/kgid where appropriate
    userns: Convert ext2 to use kuid/kgid where appropriate.
    userns: Convert devpts to use kuid/kgid where appropriate
    userns: Convert binary formats to use kuid/kgid where appropriate
    userns: Add negative depends on entries to avoid building code that is userns unsafe
    userns: signal remove unnecessary map_cred_ns
    userns: Teach inode_capable to understand inodes whose uids map to other namespaces.
    userns: Fail exec for suid and sgid binaries with ids outside our user namespace.
    userns: Convert stat to return values mapped from kuids and kgids
    userns: Convert user specfied uids and gids in chown into kuids and kgid
    userns: Use uid_eq gid_eq helpers when comparing kuids and kgids in the vfs
    ...

    Linus Torvalds
     

03 May, 2012

1 commit


11 Apr, 2012

1 commit

  • keyctl_session_to_parent(task) sets ->replacement_session_keyring,
    it should be processed and cleared by key_replace_session_keyring().

    However, this task can fork before it notices TIF_NOTIFY_RESUME and
    the new child gets the bogus ->replacement_session_keyring copied by
    dup_task_struct(). This is obviously wrong and, if nothing else, this
    leads to put_cred(already_freed_cred).

    Change copy_creds() to clear this member. If copy_process() fails
    before this point the wrong ->replacement_session_keyring doesn't
    matter, exit_creds() won't be called.

    Cc:
    Signed-off-by: Oleg Nesterov
    Acked-by: David Howells
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

08 Apr, 2012

1 commit


14 Feb, 2012

1 commit


31 Oct, 2011

1 commit

  • The changed files were only including linux/module.h for the
    EXPORT_SYMBOL infrastructure, and nothing else. Revector them
    onto the isolated export header for faster compile times.

    Nothing to see here but a whole lot of instances of:

    -#include
    +#include

    This commit is only changing the kernel dir; next targets
    will probably be mm, fs, the arch dirs, etc.

    Signed-off-by: Paul Gortmaker

    Paul Gortmaker
     

25 Oct, 2011

1 commit

  • * 'next' of git://selinuxproject.org/~jmorris/linux-security: (95 commits)
    TOMOYO: Fix incomplete read after seek.
    Smack: allow to access /smack/access as normal user
    TOMOYO: Fix unused kernel config option.
    Smack: fix: invalid length set for the result of /smack/access
    Smack: compilation fix
    Smack: fix for /smack/access output, use string instead of byte
    Smack: domain transition protections (v3)
    Smack: Provide information for UDS getsockopt(SO_PEERCRED)
    Smack: Clean up comments
    Smack: Repair processing of fcntl
    Smack: Rule list lookup performance
    Smack: check permissions from user space (v2)
    TOMOYO: Fix quota and garbage collector.
    TOMOYO: Remove redundant tasklist_lock.
    TOMOYO: Fix domain transition failure warning.
    TOMOYO: Remove tomoyo_policy_memory_lock spinlock.
    TOMOYO: Simplify garbage collector.
    TOMOYO: Fix make namespacecheck warnings.
    target: check hex2bin result
    encrypted-keys: check hex2bin result
    ...

    Linus Torvalds
     

23 Aug, 2011

2 commits

  • This patch adds CONFIG_KEYS guard for tgcred to fix below build error
    if CONFIG_KEYS is not configured.

    CC kernel/cred.o
    kernel/cred.c: In function 'prepare_kernel_cred':
    kernel/cred.c:657: error: 'tgcred' undeclared (first use in this function)
    kernel/cred.c:657: error: (Each undeclared identifier is reported only once
    kernel/cred.c:657: error: for each function it appears in.)
    make[1]: *** [kernel/cred.o] Error 1
    make: *** [kernel] Error 2

    Signed-off-by: Axel Lin
    Acked-by: David Howells
    Signed-off-by: James Morris

    Axel Lin
     
  • Fix prepare_kernel_cred() to provide a new, separate thread_group_cred struct
    otherwise when using request_key() ____call_usermodehelper() calls
    umh_keys_init() with the new creds pointing to init_tgcred, which
    umh_keys_init() then blithely alters.

    The problem can be demonstrated by:

    # keyctl request2 user a debug:a @s
    249681132
    # grep req /proc/keys
    079906a5 I--Q-- 1 perm 1f3f0000 0 0 keyring _req.249681132: 1/4
    38ef1626 IR---- 1 expd 0b010000 0 0 .request_ key:ee1d4ec pid:4371 ci:1

    The keyring _req.XXXX should have gone away, but something (init_tgcred) is
    pinning it.

    The key actually requested can then be removed and a new one created:

    # keyctl unlink 249681132
    1 links removed
    [root@andromeda ~]# grep req /proc/keys
    116cecac IR---- 1 expd 0b010000 0 0 .request_ key:eeb4911 pid:4379 ci:1
    36d1cbf8 I--Q-- 1 perm 1f3f0000 0 0 keyring _req.250300689: 1/4

    which causes the old _req keyring to go away and a new one to take its place.

    This is a consequence of the changes in:

    commit 879669961b11e7f40b518784863a259f735a72bf
    Author: David Howells
    Date: Fri Jun 17 11:25:59 2011 +0100
    KEYS/DNS: Fix ____call_usermodehelper() to not lose the session keyring

    and:

    commit 17f60a7da150fdd0cfb9756f86a262daa72c835f
    Author: Eric Paris
    Date: Fri Apr 1 17:07:50 2011 -0400
    capabilites: allow the application of capability limits to usermode helpers

    After this patch is applied, the _req keyring and the .request_key key are
    cleaned up.

    Signed-off-by: David Howells
    cc: Eric Paris
    Signed-off-by: James Morris

    David Howells
     

12 Aug, 2011

1 commit

  • The patch http://lkml.org/lkml/2003/7/13/226 introduced an RLIMIT_NPROC
    check in set_user() to check for NPROC exceeding via setuid() and
    similar functions.

    Before the check, an unprivileged user could greatly exceed the allowed
    number of processes if the program relied on rlimit only. But the check
    created a new security threat: many poorly written programs simply don't
    check the setuid() return code and believe it cannot fail if executed
    with root privileges. So the check is removed in this patch, because
    privilege escalations related to such buggy programs are too frequent.
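For contrast, a program that does check the return code might be structured like this sketch. The helper `drop_privileges()` is hypothetical, and a real daemon would normally also reset its supplementary groups, which is left out here.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Switch to the given gid and uid, reporting failure instead of
 * silently continuing with the old identity.
 * Returns 0 on success and -1 if either call failed.
 */
static int drop_privileges(uid_t uid, gid_t gid)
{
	/* Change the group first: once the uid is no longer root, the
	 * process may not be allowed to change its gid any more. */
	if (setgid(gid) != 0) {
		perror("setgid");
		return -1;
	}
	if (setuid(uid) != 0) {
		perror("setuid");
		return -1;
	}
	return 0;
}
```

A caller would treat a return value of -1 as fatal and exit rather than continue running with root privileges.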

    The NPROC can still be enforced in the common code flow of daemons
    spawning user processes. Most daemons do fork()+setuid()+execve().
    The check introduced in execve() (1) enforces the same limit as in
    setuid() and (2) doesn't create similar security issues.

    Neil Brown suggested tracking which specific process has exceeded the
    limit by setting the PF_NPROC_EXCEEDED process flag. With this change,
    only that process would fail on execve(), and other processes' execve()
    behaviour is not changed.

    Solar Designer suggested re-checking whether the NPROC limit is still
    exceeded at the moment of execve(). If the process was sleeping for
    days between set*uid() and execve(), and the NPROC counter stepped down
    under the limit, a deferred execve() failure because the NPROC limit was
    exceeded days ago would be unexpected. If the limit is not exceeded
    anymore, we clear the flag on successful calls to execve() and fork().

    The flag is also cleared on successful calls to set_user() as the limit
    was exceeded for the previous user, not the current one.

    A similar check was introduced in the -ow patches (without the process flag).

    v3 - clear PF_NPROC_EXCEEDED on successful calls to set_user().

    Reviewed-by: James Morris
    Signed-off-by: Vasiliy Kulikov
    Acked-by: NeilBrown
    Signed-off-by: Linus Torvalds

    Vasiliy Kulikov
     

28 May, 2011

1 commit


20 May, 2011

1 commit


19 May, 2011

1 commit


14 May, 2011

1 commit

  • If !CONFIG_USERNS, have current_user_ns() defined to (&init_user_ns).

    Get rid of _current_user_ns. This requires nsown_capable() to be
    defined in capability.c rather than as static inline in capability.h,
    so do that.

    Request_key needs init_user_ns defined at current_user_ns if
    !CONFIG_USERNS, so forward-declare that in cred.h if !CONFIG_USERNS
    at current_user_ns() define.

    Compile-tested with and without CONFIG_USERNS.

    Signed-off-by: Serge E. Hallyn
    [ This makes a huge performance difference for acl_permission_check(),
    up to 30%. And that is one of the hottest kernel functions for loads
    that are pathname-lookup heavy. ]
    Signed-off-by: Linus Torvalds

    Serge E. Hallyn
     

04 Apr, 2011

1 commit


24 Mar, 2011

1 commit

  • - Introduce ns_capable to test for a capability in a non-default
    user namespace.
    - Teach cap_capable to handle capabilities in a non-default
    user namespace.

    The motivation is to get to the unprivileged creation of new
    namespaces. It looks like this gets us 90% of the way there, with
    only potential uid confusion issues left.

    I still need to handle getting all caps after creation but otherwise I
    think I have a good starter patch that achieves all of your goals.

    Changelog:
    11/05/2010: [serge] add apparmor
    12/14/2010: [serge] fix capabilities to created user namespaces
    Without this, if user serge creates a user_ns, he won't have
    capabilities to the user_ns he created. This is because we
    were first checking whether his effective caps had the caps
    he needed and returning -EPERM if not, and THEN checking whether
    he was the creator. Reverse those checks.
    12/16/2010: [serge] security_real_capable needs ns argument in !security case
    01/11/2011: [serge] add task_ns_capable helper
    01/11/2011: [serge] add nsown_capable() helper per Bastian Blank suggestion
    02/16/2011: [serge] fix a logic bug: the root user is always creator of
    init_user_ns, but should not always have capabilities to
    it! Fix the check in cap_capable().
    02/21/2011: Add the required user_ns parameter to security_capable,
    fixing a compile failure.
    02/23/2011: Convert some macros to functions as per akpm comments. Some
    couldn't be converted because we can't easily forward-declare
    them (they are inline if !SECURITY, extern if SECURITY). Add
    a current_user_ns function so we can use it in capability.h
    without #including cred.h. Move all forward declarations
    together to the top of the #ifdef __KERNEL__ section, and use
    kernel-doc format.
    02/23/2011: Per dhowells, clean up comment in cap_capable().
    02/23/2011: Per akpm, remove unreachable 'return -EPERM' in cap_capable.

    (Original written and signed off by Eric; latest, modified version
    acked by him)

    [akpm@linux-foundation.org: fix build]
    [akpm@linux-foundation.org: export current_user_ns() for ecryptfs]
    [serge.hallyn@canonical.com: remove unneeded extra argument in selinux's task_has_capability]
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Serge E. Hallyn
    Acked-by: "Eric W. Biederman"
    Acked-by: Daniel Lezcano
    Acked-by: David Howells
    Cc: James Morris
    Signed-off-by: Serge E. Hallyn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Serge E. Hallyn
     

16 Feb, 2011

1 commit


08 Feb, 2011

2 commits

  • In prepare_kernel_cred() since 2.6.29, put_cred(new) is called without
    assigning new->usage when security_prepare_creds() returned an error. As a
    result, memory for new and refcount for new->{user,group_info,tgcred} are
    leaked because put_cred(new) won't call __put_cred() unless old->usage == 1.

    Fix these leaks by assigning new->usage (and new->subscribers which was added
    in 2.6.32) before calling security_prepare_creds().
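The leak can be modelled in a few lines of userspace C. This is a simplified sketch with hypothetical `demo_` names; it assumes the new structure starts out as a copy of the old one, so its usage count initially equals the old count, which is what "unless old->usage == 1" suggests.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct demo_cred {
	atomic_int usage;
	bool freed;
};

/* Modelled on put_cred(): release only when the count drops to zero. */
static void demo_put_cred(struct demo_cred *c)
{
	/* atomic_fetch_sub() returns the value before the decrement. */
	if (atomic_fetch_sub(&c->usage, 1) == 1)
		c->freed = true;	/* stands in for __put_cred() */
}

/*
 * Simulate the error path. Returns true if the new structure was
 * released, false if it leaked.
 */
static bool demo_error_path(int old_usage, bool set_usage_first)
{
	struct demo_cred new_cred;

	/* The new structure begins as a copy of the old one. */
	atomic_init(&new_cred.usage, old_usage);
	new_cred.freed = false;

	/* The fix: give the new structure its own count of 1 up front. */
	if (set_usage_first)
		atomic_store(&new_cred.usage, 1);

	/* Pretend the security hook failed at this point. */
	demo_put_cred(&new_cred);
	return new_cred.freed;
}
```

With an old count of, say, 5 and no early assignment, the decrement leaves 4 and nothing is released; assigning 1 first makes the same error path free the structure.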

    Signed-off-by: Tetsuo Handa
    Signed-off-by: David Howells
    Signed-off-by: Linus Torvalds

    Tetsuo Handa
     
  • In cred_alloc_blank() since 2.6.32, abort_creds(new) is called with
    new->security == NULL and new->magic == 0 when security_cred_alloc_blank()
    returns an error. As a result, BUG() will be triggered if SELinux is enabled
    or CONFIG_DEBUG_CREDENTIALS=y.

    If CONFIG_DEBUG_CREDENTIALS=y, BUG() is called from __invalid_creds() because
    cred->magic == 0. Failing that, BUG() is called from selinux_cred_free()
    because selinux_cred_free() is not expecting cred->security == NULL. This does
    not affect smack_cred_free(), tomoyo_cred_free() or apparmor_cred_free().

    Fix these bugs by

    (1) Set new->magic before calling security_cred_alloc_blank().

    (2) Handle null cred->security in creds_are_invalid() and selinux_cred_free().

    Signed-off-by: Tetsuo Handa
    Signed-off-by: David Howells
    Signed-off-by: Linus Torvalds

    Tetsuo Handa
     

27 Jan, 2011

1 commit


28 Oct, 2010

1 commit

  • Oleg Nesterov pointed out we have to prevent multiple-threads-inside-exec
    itself and we can reuse ->cred_guard_mutex for it. Indeed, concurrent
    execve() calls serve no purpose.

    Let's move ->cred_guard_mutex from task_struct to signal_struct. This
    naturally prevents multiple-threads-inside-exec.

    Signed-off-by: KOSAKI Motohiro
    Reviewed-by: Oleg Nesterov
    Acked-by: Roland McGrath
    Acked-by: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KOSAKI Motohiro
     

13 Aug, 2010

1 commit


30 Jul, 2010

1 commit

  • It's possible for get_task_cred() as it currently stands to 'corrupt' a set of
    credentials by incrementing their usage count after their replacement by the
    task being accessed.

    What happens is that get_task_cred() can race with commit_creds():

    TASK_1                          TASK_2                          RCU_CLEANER
    -->get_task_cred(TASK_2)
    rcu_read_lock()
    __cred = __task_cred(TASK_2)
                                    -->commit_creds()
                                    old_cred = TASK_2->real_cred
                                    TASK_2->real_cred = ...
                                    put_cred(old_cred)
                                      call_rcu(old_cred)
            [__cred->usage == 0]
    get_cred(__cred)
            [__cred->usage == 1]
    rcu_read_unlock()
                                                                    -->put_cred_rcu()
                                                                    [__cred->usage == 1]
                                                                    panic()

    However, since a task's credentials are generally not changed very often, we can
    reasonably make use of a loop involving reading the creds pointer and using
    atomic_inc_not_zero() to attempt to increment it if it hasn't already hit zero.

    If successful, we can safely return the credentials in the knowledge that, even
    if the task we're accessing has released them, they haven't gone to the RCU
    cleanup code.
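The retry loop described above can be sketched with C11 atomics. The `demo_` names are hypothetical; the sketch leaves out the RCU read-side locking that keeps the memory valid in the kernel while the loop runs, and the NULL check is an addition for the userspace version.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_cred {
	atomic_int usage;
};

/*
 * Userspace analogue of atomic_inc_not_zero(): take a reference only if
 * the count has not already reached zero.
 */
static bool demo_inc_not_zero(atomic_int *v)
{
	int old = atomic_load(v);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return true;
	}
	return false;
}

/*
 * Re-read the published pointer until a reference has been taken on an
 * object that is still live, in the spirit of the get_task_cred() loop.
 */
static struct demo_cred *demo_get_cred(struct demo_cred *_Atomic *slot)
{
	struct demo_cred *c;

	do {
		c = atomic_load(slot);
	} while (c != NULL && !demo_inc_not_zero(&c->usage));

	return c;
}
```

The point of the design is that a count which has reached zero is never raised again, so an object already handed to the cleanup path cannot be resurrected by a late reader.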

    We then change task_state() in procfs to use get_task_cred() rather than
    calling get_cred() on the result of __task_cred(), as that suffers from the
    same problem.

    Without this change, a BUG_ON in __put_cred() or in put_cred_rcu() can be
    tripped when it is noticed that the usage count is not zero as it ought to be,
    for example:

    kernel BUG at kernel/cred.c:168!
    invalid opcode: 0000 [#1] SMP
    last sysfs file: /sys/kernel/mm/ksm/run
    CPU 0
    Pid: 2436, comm: master Not tainted 2.6.33.3-85.fc13.x86_64 #1 0HR330/OptiPlex 745
    RIP: 0010:[] [] __put_cred+0xc/0x45
    RSP: 0018:ffff88019e7e9eb8 EFLAGS: 00010202
    RAX: 0000000000000001 RBX: ffff880161514480 RCX: 00000000ffffffff
    RDX: 00000000ffffffff RSI: ffff880140c690c0 RDI: ffff880140c690c0
    RBP: ffff88019e7e9eb8 R08: 00000000000000d0 R09: 0000000000000000
    R10: 0000000000000001 R11: 0000000000000040 R12: ffff880140c690c0
    R13: ffff88019e77aea0 R14: 00007fff336b0a5c R15: 0000000000000001
    FS: 00007f12f50d97c0(0000) GS:ffff880007400000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007f8f461bc000 CR3: 00000001b26ce000 CR4: 00000000000006f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Process master (pid: 2436, threadinfo ffff88019e7e8000, task ffff88019e77aea0)
    Stack:
    ffff88019e7e9ec8 ffffffff810698cd ffff88019e7e9ef8 ffffffff81069b45
    ffff880161514180 ffff880161514480 ffff880161514180 0000000000000000
    ffff88019e7e9f28 ffffffff8106aace 0000000000000001 0000000000000246
    Call Trace:
    [] put_cred+0x13/0x15
    [] commit_creds+0x16b/0x175
    [] set_current_groups+0x47/0x4e
    [] sys_setgroups+0xf6/0x105
    [] system_call_fastpath+0x16/0x1b
    Code: 48 8d 71 ff e8 7e 4e 15 00 85 c0 78 0b 8b 75 ec 48 89 df e8 ef 4a 15 00
    48 83 c4 18 5b c9 c3 55 8b 07 8b 07 48 89 e5 85 c0 74 04 0b eb fe 65 48 8b
    04 25 00 cc 00 00 48 3b b8 58 04 00 00 75
    RIP [] __put_cred+0xc/0x45
    RSP
    ---[ end trace df391256a100ebdd ]---

    Signed-off-by: David Howells
    Acked-by: Jiri Olsa
    Signed-off-by: Linus Torvalds

    David Howells
     

28 May, 2010

1 commit

  • Now that nobody ever changes subprocess_info->cred we can kill this member
    and related code. ____call_usermodehelper() always runs in the context of
    freshly forked kernel thread, it has the proper ->cred copied from its
    parent kthread, keventd.

    Signed-off-by: Oleg Nesterov
    Acked-by: Neil Horman
    Acked-by: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

20 May, 2010

1 commit

  • …s/security-testing-2.6

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (61 commits)
    KEYS: Return more accurate error codes
    LSM: Add __init to fixup function.
    TOMOYO: Add pathname grouping support.
    ima: remove ACPI dependency
    TPM: ACPI/PNP dependency removal
    security/selinux/ss: Use kstrdup
    TOMOYO: Use stack memory for pending entry.
    Revert "ima: remove ACPI dependency"
    Revert "TPM: ACPI/PNP dependency removal"
    KEYS: Do preallocation for __key_link()
    TOMOYO: Use mutex_lock_interruptible.
    KEYS: Better handling of errors from construct_alloc_key()
    KEYS: keyring_serialise_link_sem is only needed for keyring->keyring links
    TOMOYO: Use GFP_NOFS rather than GFP_KERNEL.
    ima: remove ACPI dependency
    TPM: ACPI/PNP dependency removal
    selinux: generalize disabling of execmem for plt-in-heap archs
    LSM Audit: rename LSM_AUDIT_NO_AUDIT to LSM_AUDIT_DATA_NONE
    CRED: Holding a spinlock does not imply the holding of RCU read lock
    SMACK: Don't #include Ext2 headers
    ...

    Linus Torvalds
     

07 May, 2010

1 commit


06 May, 2010

1 commit


22 Apr, 2010

1 commit

  • creds_are_invalid() reads both cred->usage and cred->subscribers and then
    compares them to make sure the number of processes subscribed to a cred struct
    never exceeds the refcount of that cred struct.

    The problem is that this can cause a race with both copy_creds() and
    exit_creds() as the two counters, whilst they are of atomic_t type, are only
    atomic with respect to themselves, and not atomic with respect to each other.

    This means that creds_are_invalid() can read the values on one CPU whilst
    they're being modified on another CPU, and so can observe an evolving state in
    which the subscribers count now is greater than the usage count a moment
    before.

    Switching the order in which the counts are read cannot help, so the thing to
    do is to remove that particular check.

    I had considered rechecking the values to see if they're in flux if the test
    fails, but I can't guarantee they won't appear the same, even if they've
    changed several times in the meantime.

    Note that this can only happen if CONFIG_DEBUG_CREDENTIALS is enabled.

    The problem is only likely to occur with multithreaded programs, and can be
    tested by the tst-eintr1 program from glibc's "make check". The symptoms look
    like:

    CRED: Invalid credentials
    CRED: At include/linux/cred.h:240
    CRED: Specified credentials: ffff88003dda5878 [real][eff]
    CRED: ->magic=43736564, put_addr=(null)
    CRED: ->usage=766, subscr=766
    CRED: ->*uid = { 0,0,0,0 }
    CRED: ->*gid = { 0,0,0,0 }
    CRED: ->security is ffff88003d72f538
    CRED: ->security {359, 359}
    ------------[ cut here ]------------
    kernel BUG at kernel/cred.c:850!
    ...
    RIP: 0010:[] [] __invalid_creds+0x4e/0x52
    ...
    Call Trace:
    [] copy_creds+0x6b/0x23f

    Note the ->usage=766 and subscr=766. The values appear the same because
    they've been re-read since the check was made.

    Reported-by: Roland McGrath
    Signed-off-by: David Howells
    Signed-off-by: James Morris

    David Howells
     

21 Apr, 2010

1 commit

  • Patch 570b8fb505896e007fd3bb07573ba6640e51851d:

    Author: Mathieu Desnoyers
    Date: Tue Mar 30 00:04:00 2010 +0100
    Subject: CRED: Fix memory leak in error handling

    attempts to fix a memory leak in the error handling by making the offending
    return statement into a jump down to the bottom of the function where a
    kfree(tgcred) is inserted.

    This is, however, incorrect, as it does a kfree() after doing put_cred() if
    security_prepare_creds() fails. That will result in a double free if 'error'
    is jumped to as put_cred() will also attempt to free the new tgcred record by
    virtue of it being pointed to by the new cred record.

    Signed-off-by: David Howells
    Signed-off-by: James Morris

    David Howells
     

15 Apr, 2010

1 commit