10 Jul, 2019

1 commit


07 Jul, 2019

1 commit


12 Jun, 2019

1 commit


31 May, 2019

1 commit

  • Based on 1 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license as published by
    the free software foundation either version 2 of the license or at
    your option any later version

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-or-later

    has been chosen to replace the boilerplate/reference in 3029 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

08 Mar, 2019

1 commit

  • Pull audit updates from Paul Moore:
    "A lucky 13 audit patches for v5.1.

    Despite the rather large diffstat, most of the changes are from two
    bug fix patches that move code from one Kconfig option to another.

    Beyond that bit of churn, the remaining changes are largely cleanups
    and bug-fixes as we slowly march towards container auditing. It isn't
    all boring though, we do have a couple of new things: file
    capabilities v3 support, and expanded support for filtering on
    filesystems to solve problems with remote filesystems.

    All changes pass the audit-testsuite. Please merge for v5.1"

    * tag 'audit-pr-20190305' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
    audit: mark expected switch fall-through
    audit: hide auditsc_get_stamp and audit_serial prototypes
    audit: join tty records to their syscall
    audit: remove audit_context when CONFIG_ AUDIT and not AUDITSYSCALL
    audit: remove unused actx param from audit_rule_match
    audit: ignore fcaps on umount
    audit: clean up AUDITSYSCALL prototypes and stubs
    audit: more filter PATH records keyed on filesystem magic
    audit: add support for fcaps v3
    audit: move loginuid and sessionid from CONFIG_AUDITSYSCALL to CONFIG_AUDIT
    audit: add syscall information to CONFIG_CHANGE records
    audit: hand taken context to audit_kill_trees for syscall logging
    audit: give a clue what CONFIG_CHANGE op was involved

    Linus Torvalds
     

26 Feb, 2019

1 commit


26 Jan, 2019

1 commit

  • V3 namespaced file capabilities were introduced in
    commit 8db6c34f1dbc ("Introduce v3 namespaced file capabilities")

    Add support for these by adding the "frootid" field to the existing
    fcaps fields in the NAME and BPRM_FCAPS records.

    Please see github issue
    https://github.com/linux-audit/audit-kernel/issues/103

    Signed-off-by: Richard Guy Briggs
    Acked-by: Serge Hallyn
    [PM: comment tweak to fit an 80 char line width]
    Signed-off-by: Paul Moore

    Richard Guy Briggs
     

11 Jan, 2019

1 commit

  • This patch provides a general mechanism for passing flags to the
    security_capable LSM hook. It replaces the specific 'audit' flag that is
    used to tell security_capable whether it should log an audit message for
    the given capability check. The reason for generalizing this flag
    passing is so we can add an additional flag that signifies whether
    security_capable is being called by a setid syscall (which is needed by
    the proposed SafeSetID LSM).

    Signed-off-by: Micah Morton
    Reviewed-by: Kees Cook
    Signed-off-by: James Morris

    Micah Morton
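
    The idea of replacing a single 'audit' boolean with a general flags
    word can be sketched as follows. This is a minimal userspace
    illustration, not the kernel code: the TOY_OPT_* names, their values,
    and toy_capable() are hypothetical and only mirror the description in
    the commit message above.

```c
#include <stdbool.h>

/* Hypothetical option bits; names and values are assumptions for this sketch. */
#define TOY_OPT_NONE     0x0u
#define TOY_OPT_NOAUDIT  0x1u  /* do not log an audit message for this check */
#define TOY_OPT_INSETID  0x2u  /* the check was issued by a setid syscall */

/* Hypothetical summary of how one capability check was requested. */
struct toy_check_result {
    bool audited;     /* would an audit message be logged? */
    bool from_setid;  /* did the caller identify itself as a setid syscall? */
};

/* Toy stand-in for a hook that takes a flags word instead of a bool. */
static struct toy_check_result toy_capable(unsigned int opts)
{
    struct toy_check_result r;

    r.audited = !(opts & TOY_OPT_NOAUDIT);
    r.from_setid = (opts & TOY_OPT_INSETID) != 0;
    return r;
}
```

    With a flags word, a new caller-specific bit can be added later
    without changing the signature of every hook implementation again.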
     

09 Jan, 2019

1 commit


13 Dec, 2018

1 commit

  • Historically a lot of these existed because we did not have
    a distinction between what was modular code and what was providing
    support to modules via EXPORT_SYMBOL and friends. That changed
    when we forked out support for the latter into the export.h file.
    This means we should be able to reduce the usage of module.h
    in code that is obj-y Makefile or bool Kconfig.

    The advantage in removing such instances is that module.h itself
    sources about 15 other headers; adding significantly to what we feed
    cpp, and it can obscure what headers we are effectively using.

    Since module.h might have been the implicit source for init.h
    (for __init) and for export.h (for EXPORT_SYMBOL) we consider each
    instance for the presence of either and replace as needed.

    Cc: James Morris
    Cc: "Serge E. Hallyn"
    Cc: John Johansen
    Cc: Mimi Zohar
    Cc: Dmitry Kasatkin
    Cc: David Howells
    Cc: linux-security-module@vger.kernel.org
    Cc: linux-integrity@vger.kernel.org
    Cc: keyrings@vger.kernel.org
    Signed-off-by: Paul Gortmaker
    Signed-off-by: James Morris

    Paul Gortmaker
     

05 Sep, 2018

1 commit


30 Aug, 2018

1 commit


11 Aug, 2018

1 commit

  • The code in cap_inode_getsecurity(), introduced by commit 8db6c34f1dbc
    ("Introduce v3 namespaced file capabilities"), should use
    d_find_any_alias() instead of d_find_alias() to handle unhashed dentries
    correctly. This is needed, for example, if execveat() is called with an
    open but unlinked overlayfs file, because overlayfs unhashes the dentry
    on unlink.
    This is a regression affecting a real-life application, first reported at
    https://www.spinics.net/lists/linux-unionfs/msg05363.html

    The following reproducer and setup trigger the case.
    const char* exec="echo";
    const char *newargv[] = { "echo", "hello", NULL};
    const char *newenviron[] = { NULL };
    int fd, err;

    fd = open(exec, O_PATH);
    unlink(exec);
    err = syscall(322/*SYS_execveat*/, fd, "", newargv, newenviron,
    AT_EMPTY_PATH);
    if(err
    Acked-by: Amir Goldstein
    Acked-by: Serge E. Hallyn
    Fixes: 8db6c34f1dbc ("Introduce v3 namespaced file capabilities")
    Cc: # v4.14
    Signed-off-by: Eddie Horng
    Signed-off-by: Eric W. Biederman

    Eddie.Horng
     

25 May, 2018

1 commit

  • A privileged user in s_user_ns will generally have the ability to
    manipulate the backing store and insert security.* xattrs into
    the filesystem directly. Therefore the kernel must be prepared to
    handle these xattrs from unprivileged mounts, and it makes little
    sense for commoncap to prevent writing these xattrs to the
    filesystem. The capability and LSM code have already been updated
    to appropriately handle xattrs from unprivileged mounts, so it
    is safe to loosen this restriction on setting xattrs.

    The exception to this logic is that writing xattrs to a mounted
    filesystem may also cause the LSM inode_post_setxattr or
    inode_setsecurity callbacks to be invoked. SELinux will deny the
    xattr update by virtue of applying mountpoint labeling to
    unprivileged userns mounts, and Smack will deny the writes for
    any user without global CAP_MAC_ADMIN, so loosening the
    capability check in commoncap is safe in this respect as well.

    Signed-off-by: Seth Forshee
    Acked-by: Serge Hallyn
    Acked-by: Christian Brauner
    Signed-off-by: Eric W. Biederman

    Eric W. Biederman
     

11 Apr, 2018

1 commit

  • syzbot is reporting NULL pointer dereference at xattr_getsecurity() [1],
    for cap_inode_getsecurity() is returning sizeof(struct vfs_cap_data) when
    memory allocation failed. Return -ENOMEM if memory allocation failed.

    [1] https://syzkaller.appspot.com/bug?id=a55ba438506fe68649a5f50d2d82d56b365e0107

    Signed-off-by: Tetsuo Handa
    Fixes: 8db6c34f1dbc8e06 ("Introduce v3 namespaced file capabilities")
    Reported-by: syzbot
    Cc: stable # 4.14+
    Acked-by: Serge E. Hallyn
    Acked-by: James Morris
    Signed-off-by: Eric W. Biederman

    Tetsuo Handa
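
    The failure mode described above (a positive size returned alongside a
    buffer that was never allocated) and its fix can be sketched in
    userspace. toy_get_value() and toy_failing_alloc() are hypothetical
    helpers written for this illustration; they are not the kernel
    functions named in the commit.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: copy `size` bytes from `src` into a fresh buffer.
 * Returns the size on success, or -ENOMEM when allocation fails, so a
 * caller never sees a positive length paired with a missing buffer.
 * The allocator is passed in so the error path can be exercised.
 */
static long toy_get_value(const void *src, size_t size, void **out,
                          void *(*alloc)(size_t))
{
    void *buf = alloc(size);

    if (!buf)
        return -ENOMEM;
    memcpy(buf, src, size);
    *out = buf;
    return (long)size;
}

/* Allocator that always fails, standing in for memory pressure. */
static void *toy_failing_alloc(size_t size)
{
    (void)size;
    return NULL;
}
```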
     

02 Jan, 2018

1 commit

  • If userspace attempted to set a "security.capability" xattr shorter than
    4 bytes (e.g. 'setfattr -n security.capability -v x file'), then
    cap_convert_nscap() read past the end of the buffer containing the xattr
    value because it accessed the ->magic_etc field without verifying that
    the xattr value is long enough to contain that field.

    Fix it by validating the xattr value size first.

    This bug was found using syzkaller with KASAN. The KASAN report was as
    follows (cleaned up slightly):

    BUG: KASAN: slab-out-of-bounds in cap_convert_nscap+0x514/0x630 security/commoncap.c:498
    Read of size 4 at addr ffff88002d8741c0 by task syz-executor1/2852

    CPU: 0 PID: 2852 Comm: syz-executor1 Not tainted 4.15.0-rc6-00200-gcc0aac99d977 #253
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014
    Call Trace:
    __dump_stack lib/dump_stack.c:17 [inline]
    dump_stack+0xe3/0x195 lib/dump_stack.c:53
    print_address_description+0x73/0x260 mm/kasan/report.c:252
    kasan_report_error mm/kasan/report.c:351 [inline]
    kasan_report+0x235/0x350 mm/kasan/report.c:409
    cap_convert_nscap+0x514/0x630 security/commoncap.c:498
    setxattr+0x2bd/0x350 fs/xattr.c:446
    path_setxattr+0x168/0x1b0 fs/xattr.c:472
    SYSC_setxattr fs/xattr.c:487 [inline]
    SyS_setxattr+0x36/0x50 fs/xattr.c:483
    entry_SYSCALL_64_fastpath+0x18/0x85

    Fixes: 8db6c34f1dbc ("Introduce v3 namespaced file capabilities")
    Cc: # v4.14+
    Signed-off-by: Eric Biggers
    Reviewed-by: Serge Hallyn
    Signed-off-by: James Morris

    Eric Biggers
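
    The fix described above amounts to checking the value length before
    touching the header word. A minimal sketch, assuming a 32-bit header
    at the start of the value; toy_read_magic() and the choice of -EINVAL
    as the error code are assumptions for this illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Read the leading 32-bit header word of an xattr-like value.
 * The length is validated first, so a value shorter than the header
 * is rejected instead of being read past its end.
 * Returns 0 and stores the word in *magic, or -EINVAL when too short.
 */
static int toy_read_magic(const void *value, size_t size, uint32_t *magic)
{
    if (size < sizeof(uint32_t))
        return -EINVAL;
    memcpy(magic, value, sizeof(uint32_t));
    return 0;
}
```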
     

14 Nov, 2017

1 commit

  • Pull general security subsystem updates from James Morris:
    "TPM (from Jarkko):
    - essential clean up for tpm_crb so that ARM64 and x86 versions do
    not distract each other as much as before

    - /dev/tpm0 now rejects too-short writes (a buffer shorter than
    specified in the command header)

    - use DMA-safe buffer in tpm_tis_spi

    - otherwise mostly minor fixes.

    Smack:
    - base support for overlayfs

    Capabilities:
    - BPRM_FCAPS fixes, from Richard Guy Briggs:

    The audit subsystem is adding a BPRM_FCAPS record when auditing
    setuid application execution (SYSCALL execve). This is not expected
    as it was supposed to be limited to when the file system actually
    had capabilities in an extended attribute. It lists all
    capabilities making the event really ugly to parse what is
    happening. The PATH record correctly records the setuid bit and
    owner. Suppress the BPRM_FCAPS record on set*id.

    TOMOYO:
    - Y2038 timestamping fixes"

    * 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (28 commits)
    MAINTAINERS: update the IMA, EVM, trusted-keys, encrypted-keys entries
    Smack: Base support for overlayfs
    MAINTAINERS: remove David Safford as maintainer for encrypted+trusted keys
    tomoyo: fix timestamping for y2038
    capabilities: audit log other surprising conditions
    capabilities: fix logic for effective root or real root
    capabilities: invert logic for clarity
    capabilities: remove a layer of conditional logic
    capabilities: move audit log decision to function
    capabilities: use intuitive names for id changes
    capabilities: use root_priveleged inline to clarify logic
    capabilities: rename has_cap to has_fcap
    capabilities: intuitive names for cap gain status
    capabilities: factor out cap_bprm_set_creds privileged root
    tpm, tpm_tis: use ARRAY_SIZE() to define TPM_HID_USR_IDX
    tpm: fix duplicate inline declaration specifier
    tpm: fix type of a local variables in tpm_tis_spi.c
    tpm: fix type of a local variable in tpm2_map_command()
    tpm: fix type of a local variable in tpm2_get_cc_attrs_tbl()
    tpm-dev-common: Reject too short writes
    ...

    Linus Torvalds
     

20 Oct, 2017

10 commits

  • The existing condition tested for process effective capabilities set by
    file attributes but intended to ignore the change if the result was
    unsurprisingly an effective full set in the case root is special with a
    setuid root executable file and we are root.

    Stated again:
    - When you execute a setuid root application, it is no surprise and
    expected that it got all capabilities, so we do not want capabilities
    recorded.
    if (pE_grew && !(pE_fullset && (eff_root || real_root) && root_priveleged) )

    Now make sure we cover other cases:
    - If something prevented a setuid root app getting all capabilities and
    it wound up with one capability only, then it is a surprise and should
    be logged. When it is a setuid root file, we only want capabilities
    when the process does not get full capabilities.
    root_priveleged && setuid_root && !pE_fullset

    - Similarly if a non-setuid program does pick up capabilities due to
    file system based capabilities, then we want to know what capabilities
    were picked up. When it has file system based capabilities we want
    the capabilities.
    !is_setuid && (has_fcap && pP_gained)

    - If it is a non-setuid file and it gets ambient capabilities, we want
    the capabilities.
    !is_setuid && pA_gained

    - These last two are combined into one due to the common first parameter.

    Related: https://github.com/linux-audit/audit-kernel/issues/16

    Signed-off-by: Richard Guy Briggs
    Reviewed-by: Serge Hallyn
    Acked-by: James Morris
    Acked-by: Kees Cook
    Acked-by: Paul Moore
    Signed-off-by: James Morris

    Richard Guy Briggs
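
    The conditions listed in the message above can be combined into one
    predicate. The sketch below assumes the cases are joined with a
    logical OR and represents each named condition as a plain boolean;
    struct toy_exec_state and toy_should_log_fcaps() are hypothetical
    names, not the kernel implementation.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the conditions named in the commit message. */
struct toy_exec_state {
    bool pE_grew;          /* effective set grew */
    bool pE_fullset;       /* effective set is the full set */
    bool eff_root;         /* effective uid is root */
    bool real_root;        /* real uid is root */
    bool root_privileged;  /* root is treated as special */
    bool setuid_root;      /* file is setuid root */
    bool is_setuid;        /* a uid change happened on exec */
    bool has_fcap;         /* file carries file capabilities */
    bool pP_gained;        /* permitted set gained bits */
    bool pA_gained;        /* ambient set gained bits */
};

/* Returns true when the capability change is "surprising" enough to log. */
static bool toy_should_log_fcaps(const struct toy_exec_state *s)
{
    /* Growth that is not the expected full set for privileged root. */
    bool surprising_growth = s->pE_grew &&
        !(s->pE_fullset && (s->eff_root || s->real_root) &&
          s->root_privileged);
    /* A setuid root file that did not end up with full capabilities. */
    bool setuid_root_not_full = s->root_privileged && s->setuid_root &&
        !s->pE_fullset;
    /* A non-setuid file gaining file-based or ambient capabilities. */
    bool nonsetuid_gain = !s->is_setuid &&
        ((s->has_fcap && s->pP_gained) || s->pA_gained);

    return surprising_growth || setuid_root_not_full || nonsetuid_gain;
}
```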
     
  • Now that the logic is inverted, it is much easier to see that both real
    root and effective root conditions had to be met to avoid printing the
    BPRM_FCAPS record with audit syscalls. This meant that any setuid root
    applications would print a full BPRM_FCAPS record when it wasn't
    necessary, cluttering the event output, since the SYSCALL and PATH
    records indicated the presence of the setuid bit and effective root user
    id.

    Require only one of effective root or real root to avoid printing the
    unnecessary record.

    Ref: commit 3fc689e96c0c ("Add audit_log_bprm_fcaps/AUDIT_BPRM_FCAPS")
    See: https://github.com/linux-audit/audit-kernel/issues/16

    Signed-off-by: Richard Guy Briggs
    Reviewed-by: Serge Hallyn
    Acked-by: James Morris
    Acked-by: Kees Cook
    Acked-by: Paul Moore
    Signed-off-by: James Morris

    Richard Guy Briggs
     
  • The way the logic was presented, it was awkward to read and verify.
    Invert the logic using DeMorgan's Law to be more easily able to read and
    understand.

    Signed-off-by: Richard Guy Briggs
    Reviewed-by: Serge Hallyn
    Acked-by: James Morris
    Acked-by: Kees Cook
    Okay-ished-by: Paul Moore
    Signed-off-by: James Morris

    Richard Guy Briggs
     
  • Remove a layer of conditional logic to make the use of conditions
    easier to read and analyse.

    Signed-off-by: Richard Guy Briggs
    Reviewed-by: Serge Hallyn
    Acked-by: James Morris
    Acked-by: Kees Cook
    Okay-ished-by: Paul Moore
    Signed-off-by: James Morris

    Richard Guy Briggs
     
  • Move the audit log decision logic to its own function to isolate the
    complexity in one place.

    Suggested-by: Serge Hallyn
    Signed-off-by: Richard Guy Briggs
    Reviewed-by: Serge Hallyn
    Acked-by: James Morris
    Acked-by: Kees Cook
    Okay-ished-by: Paul Moore
    Signed-off-by: James Morris

    Richard Guy Briggs
     
  • Introduce a number of inlines to make the use of the negation of
    uid_eq() easier to read and analyse.

    Signed-off-by: Richard Guy Briggs
    Reviewed-by: Serge Hallyn
    Acked-by: James Morris
    Acked-by: Kees Cook
    Okay-ished-by: Paul Moore
    Signed-off-by: James Morris

    Richard Guy Briggs
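
    Small named helpers of the kind described above might look like the
    following sketch. The toy_* names, the plain integer uid type, and the
    two-field credential struct are assumptions for illustration; the
    kernel uses its own uid type and comparison helper.

```c
#include <stdbool.h>

typedef unsigned int toy_uid_t;  /* simplified stand-in for a kernel uid type */

/* Hypothetical credential with only the two ids this sketch needs. */
struct toy_cred {
    toy_uid_t uid;   /* real uid */
    toy_uid_t euid;  /* effective uid */
};

/* True when the real uid equals `uid`. */
static inline bool toy_is_real(toy_uid_t uid, const struct toy_cred *c)
{
    return c->uid == uid;
}

/* True when the effective uid equals `uid`. */
static inline bool toy_is_eff(toy_uid_t uid, const struct toy_cred *c)
{
    return c->euid == uid;
}

/* Setuid-root style: effective id is `root` but the real id is not. */
static inline bool toy_is_suid(toy_uid_t root, const struct toy_cred *c)
{
    return !toy_is_real(root, c) && toy_is_eff(root, c);
}
```

    Naming the negated comparison once keeps call sites free of nested
    "!" operators, which is the readability gain the commit describes.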
     
  • Introduce inline root_privileged() to make use of SECURE_NOROOT
    easier to read.

    Suggested-by: Serge Hallyn
    Signed-off-by: Richard Guy Briggs
    Reviewed-by: Serge Hallyn
    Acked-by: James Morris
    Acked-by: Kees Cook
    Okay-ished-by: Paul Moore
    Signed-off-by: James Morris

    Richard Guy Briggs
     
  • Rename has_cap to has_fcap to clarify it applies to file capabilities
    since the entire source file is about capabilities.

    Signed-off-by: Richard Guy Briggs
    Reviewed-by: Serge Hallyn
    Acked-by: James Morris
    Acked-by: Kees Cook
    Okay-ished-by: Paul Moore
    Signed-off-by: James Morris

    Richard Guy Briggs
     
  • Introduce macros cap_gained, cap_grew, cap_full to make the use of the
    negation of is_subset() easier to read and analyse.

    Signed-off-by: Richard Guy Briggs
    Reviewed-by: Serge Hallyn
    Acked-by: James Morris
    Acked-by: Kees Cook
    Okay-ished-by: Paul Moore
    Signed-off-by: James Morris

    Richard Guy Briggs
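
    How such macros can wrap a subset test is sketched below with
    capability sets modeled as 64-bit masks. The toy_* names, the mask
    representation, and the two-argument macro forms are assumptions for
    this illustration and differ from the kernel's macro signatures.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t toy_capset;           /* one bit per capability (assumed) */
#define TOY_CAP_FULL_SET UINT64_MAX    /* every bit set (assumed full set) */

/* True when every bit of `a` is also present in `b`. */
static inline bool toy_is_subset(toy_capset a, toy_capset b)
{
    return (a & ~b) == 0;
}

/* A set "gained" bits if the new value is not a subset of the old one. */
#define toy_cap_gained(new_set, old_set) (!toy_is_subset((new_set), (old_set)))

/* One set "grew" past another set of the same credential. */
#define toy_cap_grew(target, source) (!toy_is_subset((target), (source)))

/* A set is "full" if the full set is a subset of it. */
#define toy_cap_full(set) toy_is_subset(TOY_CAP_FULL_SET, (set))
```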
     
  • Factor out the case of privileged root from the function
    cap_bprm_set_creds() to make the latter easier to read and analyse.

    Suggested-by: Serge Hallyn
    Signed-off-by: Richard Guy Briggs
    Reviewed-by: Serge Hallyn
    Acked-by: James Morris
    Acked-by: Kees Cook
    Okay-ished-by: Paul Moore
    Signed-off-by: James Morris

    Richard Guy Briggs
     

19 Oct, 2017

1 commit

  • The pointer fs_ns is assigned from inode->i_sb->s_user_ns before
    a null pointer check on inode, hence if inode is actually null we
    will get a null pointer dereference on this assignment. Fix this
    by only dereferencing inode after the null pointer check on
    inode.

    Detected by CoverityScan CID#1455328 ("Dereference before null check")

    Fixes: 8db6c34f1dbc ("Introduce v3 namespaced file capabilities")
    Signed-off-by: Colin Ian King
    Cc: stable@vger.kernel.org
    Acked-by: Serge Hallyn
    Signed-off-by: James Morris

    Colin Ian King
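
    The ordering issue described above can be shown with a small sketch in
    which the dereference happens only after the NULL check. The toy_*
    structures and the -1 sentinel are hypothetical and stand in for the
    kernel types named in the commit.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a superblock and an inode. */
struct toy_sb {
    int user_ns_id;
};

struct toy_inode {
    struct toy_sb *sb;
};

/*
 * Returns the namespace id of the inode's superblock, or -1 when there
 * is no inode. The pointer is dereferenced only after the NULL check;
 * assigning from inode->sb before the check would crash on NULL.
 */
static int toy_fs_ns_id(const struct toy_inode *inode)
{
    if (!inode)
        return -1;
    return inode->sb->user_ns_id;
}
```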
     

25 Sep, 2017

1 commit

  • Pull misc security layer update from James Morris:
    "This is the remaining 'general' change in the security tree for v4.14,
    following the direct merging of SELinux (+ TOMOYO), AppArmor, and
    seccomp.

    That's everything now for the security tree except IMA, which will
    follow shortly (I've been traveling for the past week with patchy
    internet)"

    * 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
    security: fix description of values returned by cap_inode_need_killpriv

    Linus Torvalds
     

24 Sep, 2017

1 commit


12 Sep, 2017

1 commit

  • Pull namespace updates from Eric Biederman:
    "Life has been busy and I have not gotten half as much done this round
    as I would have liked. I delayed it so that a minor conflict
    resolution with the mips tree could spend a little time in linux-next
    before I sent this pull request.

    This includes two long delayed user namespace changes from Kirill
    Tkhai. It also includes a very useful change from Serge Hallyn that
    allows the security capability attribute to be used inside of user
    namespaces. The practical effect of this is people can now untar
    tarballs and install rpms in user namespaces. It had been suggested to
    generalize this and encode some of the namespace information in the
    xattr name. Upon close inspection that makes the things that should
    be hard easy and the things that should be easy more expensive.

    Then there is my bugfix/cleanup for signal injection that removes the
    magic encoding of the siginfo union member from the kernel internal
    si_code. The mips folks reported that the case where I had used
    FPE_FIXME is impossible, so I have removed FPE_FIXME from mips, while
    at the same time including a return statement in that case to keep gcc
    from complaining about uninitialized variables.

    I almost finished the work to make copy_siginfo_to_user a trivial
    copy to user. The code is available at:

    git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace.git neuter-copy_siginfo_to_user-v3

    But I did not have time/energy to get the code posted and reviewed
    before the merge window opened.

    I was able to see that the security excuse for just copying fields
    that we know are initialized doesn't work in practice: there are buggy
    initializations that don't initialize the proper fields in siginfo. So
    we still sometimes copy uninitialized data to userspace"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
    Introduce v3 namespaced file capabilities
    mips/signal: In force_fcr31_sig return in the impossible case
    signal: Remove kernel interal si_code magic
    fcntl: Don't use ambiguous SIG_POLL si_codes
    prctl: Allow local CAP_SYS_ADMIN changing exe_file
    security: Use user_namespace::level to avoid redundant iterations in cap_capable()
    userns,pidns: Verify the userns for new pid namespaces
    signal/testing: Don't look for __SI_FAULT in userspace
    signal/mips: Document a conflict with SI_USER with SIGFPE
    signal/sparc: Document a conflict with SI_USER with SIGFPE
    signal/ia64: Document a conflict with SI_USER with SIGFPE
    signal/alpha: Document a conflict with SI_USER for SIGTRAP

    Linus Torvalds
     

02 Sep, 2017

1 commit

  • Root in a non-initial user ns cannot be trusted to write a traditional
    security.capability xattr. If it were allowed to do so, then any
    unprivileged user on the host could map his own uid to root in a private
    namespace, write the xattr, and execute the file with privilege on the
    host.

    However supporting file capabilities in a user namespace is very
    desirable. Not doing so means that any programs designed to run with
    limited privilege must continue to support other methods of gaining and
    dropping privilege. For instance a program installer must detect
    whether file capabilities can be assigned, and assign them if so but set
    setuid-root otherwise. The program in turn must know how to drop
    partial capabilities, and do so only if setuid-root.

    This patch introduces v3 of the security.capability xattr. It builds a
    vfs_ns_cap_data struct by appending a uid_t rootid to struct
    vfs_cap_data. This is the absolute uid_t (that is, the uid_t in user
    namespace which mounted the filesystem, usually init_user_ns) of the
    root id in whose namespaces the file capabilities may take effect.

    When a task asks to write a v2 security.capability xattr, if it is
    privileged with respect to the userns which mounted the filesystem, then
    nothing should change. Otherwise, the kernel will transparently rewrite
    the xattr as a v3 with the appropriate rootid. This is done during the
    execution of setxattr() to catch user-space-initiated capability writes.
    Subsequently, any task executing the file which has the noted kuid as
    its root uid, or which is in a descendent user_ns of such a user_ns,
    will run the file with capabilities.

    Similarly when asking to read file capabilities, a v3 capability will
    be presented as v2 if it applies to the caller's namespace.

    If a task writes a v3 security.capability, then it can provide a uid for
    the xattr so long as the uid is valid in its own user namespace, and it
    is privileged with CAP_SETFCAP over its namespace. The kernel will
    translate that rootid to an absolute uid, and write that to disk. After
    this, a task in the writer's namespace will not be able to use those
    capabilities (unless rootid was 0), but a task in a namespace where the
    given uid is root will.

    Only a single security.capability xattr may exist at a time for a given
    file. A task may overwrite an existing xattr so long as it is
    privileged over the inode. Note this is a departure from previous
    semantics, which required privilege to remove a security.capability
    xattr. This check can be re-added if deemed useful.

    This allows a simple setxattr to work, allows tar/untar to work, and
    allows us to tar in one namespace and untar in another while preserving
    the capability, without risking leaking privilege into a parent
    namespace.

    Example using tar:

    $ cp /bin/sleep sleepx
    $ mkdir b1 b2
    $ lxc-usernsexec -m b:0:100000:1 -m b:1:$(id -u):1 -- chown 0:0 b1
    $ lxc-usernsexec -m b:0:100001:1 -m b:1:$(id -u):1 -- chown 0:0 b2
    $ lxc-usernsexec -m b:0:100000:1000 -- tar --xattrs-include=security.capability --xattrs -cf b1/sleepx.tar sleepx
    $ lxc-usernsexec -m b:0:100001:1000 -- tar --xattrs-include=security.capability --xattrs -C b2 -xf b1/sleepx.tar
    $ lxc-usernsexec -m b:0:100001:1000 -- getcap b2/sleepx
    b2/sleepx = cap_sys_admin+ep
    # /opt/ltp/testcases/bin/getv3xattr b2/sleepx
    v3 xattr, rootid is 100001

    A patch to linux-test-project adding a new set of tests for this
    functionality is in the nsfscaps branch at github.com/hallyn/ltp

    Changelog:
    Nov 02 2016: fix invalid check at refuse_fcap_overwrite()
    Nov 07 2016: convert rootid from and to fs user_ns
    (From ebiederm: mar 28 2017)
    commoncap.c: fix typos - s/v4/v3
    get_vfs_caps_from_disk: clarify the fs_ns root access check
    nsfscaps: change the code split for cap_inode_setxattr()
    Apr 09 2017:
    don't return v3 cap for caps owned by current root.
    return a v2 cap for a true v2 cap in non-init ns
    Apr 18 2017:
    . Change the flow of fscap writing to support s_user_ns writing.
    . Remove refuse_fcap_overwrite(). The value of the previous
    xattr doesn't matter.
    Apr 24 2017:
    . incorporate Eric's incremental diff
    . move cap_convert_nscap to setxattr and simplify its usage
    May 8, 2017:
    . fix leaking dentry refcount in cap_inode_getsecurity

    Signed-off-by: Serge Hallyn
    Signed-off-by: Eric W. Biederman

    Serge E. Hallyn
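
    The layout change described above (a v3 value is the v2 value with a
    root id appended and the revision bumped) can be sketched as follows.
    The toy_* types, the revision constants, and the use of host byte
    order are assumptions for this illustration; the on-disk format
    defined by the kernel is the authority.

```c
#include <stdint.h>
#include <string.h>

#define TOY_CAP_U32        2            /* 32-bit words per set (assumed) */
#define TOY_REVISION_MASK  0xFF000000u  /* revision bits of magic_etc (assumed) */
#define TOY_REVISION_2     0x02000000u
#define TOY_REVISION_3     0x03000000u

/* v2-style value: header word plus permitted/inheritable pairs. */
struct toy_cap_data {
    uint32_t magic_etc;
    struct {
        uint32_t permitted;
        uint32_t inheritable;
    } data[TOY_CAP_U32];
};

/* v3-style value: the same fields followed by the namespace root id. */
struct toy_ns_cap_data {
    uint32_t magic_etc;
    struct {
        uint32_t permitted;
        uint32_t inheritable;
    } data[TOY_CAP_U32];
    uint32_t rootid;
};

/* Build a v3 value from a v2 value and a root id (host byte order). */
static struct toy_ns_cap_data toy_v2_to_v3(const struct toy_cap_data *v2,
                                           uint32_t rootid)
{
    struct toy_ns_cap_data v3;

    memset(&v3, 0, sizeof(v3));
    /* Keep the flag bits, replace only the revision bits. */
    v3.magic_etc = (v2->magic_etc & ~TOY_REVISION_MASK) | TOY_REVISION_3;
    memcpy(v3.data, v2->data, sizeof(v3.data));
    v3.rootid = rootid;
    return v3;
}
```

    In this sketch the root id is simply stored; the translation of the
    writer's uid to an absolute uid described in the commit message is
    deliberately left out.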
     

02 Aug, 2017

2 commits

  • Instead of a separate function, open-code the cap_elevated test, which
    lets us entirely remove bprm->cap_effective (to use the local "effective"
    variable instead), and more accurately examine euid/egid changes via the
    existing local "is_setid".

    The following LTP tests were run to validate the changes:

    # ./runltp -f syscalls -s cap
    # ./runltp -f securebits
    # ./runltp -f cap_bounds
    # ./runltp -f filecaps

    All kernel selftests for capabilities and exec continue to pass as well.

    Signed-off-by: Kees Cook
    Reviewed-by: James Morris
    Acked-by: Serge Hallyn
    Reviewed-by: Andy Lutomirski

    Kees Cook
     
  • The commoncap implementation of the bprm_secureexec hook is the only LSM
    that depends on the final call to its bprm_set_creds hook (since it may
    be called for multiple files, it ignores bprm->called_set_creds). As a
    result, it cannot safely _clear_ bprm->secureexec since other LSMs may
    have set it. Instead, remove the bprm_secureexec hook by introducing a
    new flag to bprm specific to commoncap: cap_elevated. This is similar to
    cap_effective, but that is used for a specific subset of elevated
    privileges, and exists solely to track state from bprm_set_creds to
    bprm_secureexec. As such, it will be removed in the next patch.

    Here, set the new bprm->cap_elevated flag when setuid/setgid has happened
    from bprm_fill_uid() or fscapabilities have been prepared. This temporarily
    moves the bprm_secureexec hook to a static inline. The helper will be
    removed in the next patch; this makes the step easier to review and bisect,
    since this does not introduce any changes to inputs nor outputs to the
    "elevated privileges" calculation.

    The new flag is merged with the bprm->secureexec flag in setup_new_exec()
    since this marks the end of any further prepare_binprm() calls.

    Cc: Andy Lutomirski
    Signed-off-by: Kees Cook
    Reviewed-by: Andy Lutomirski
    Acked-by: James Morris
    Acked-by: Serge Hallyn

    Kees Cook
     

20 Jul, 2017

1 commit


06 Mar, 2017

1 commit


24 Feb, 2017

1 commit

  • Pull namespace updates from Eric Biederman:
    "There is a lot here. A lot of these changes result in subtle user
    visible differences in kernel behavior. I don't expect anything will
    care but I will revert/fix things immediately if any regressions show
    up.

    From Seth Forshee there is a continuation of the work to make the vfs
    ready for unprivileged mounts. We had thought the previous changes
    prevented the creation of files outside of s_user_ns of a filesystem,
    but it turns out we missed the O_CREAT path. Ooops.

    Pavel Tikhomirov and Oleg Nesterov worked together to fix a long
    standing bug in the implementation of PR_SET_CHILD_SUBREAPER where only
    children that are forked after the prctl are considered and not
    children forked before the prctl. The only known user of this prctl,
    systemd, forks all children after the prctl, so no userspace
    regressions will occur. Holding earlier forked children to the same
    rules as later forked children creates a semantic that is sane enough
    to allow checkpointing of processes that use this feature.

    There is a long delayed change by Nikolay Borisov to limit inotify
    instances inside a user namespace.

    Michael Kerrisk extends the API for files used to manipulate
    namespaces with two new trivial ioctls to allow discovery of the
    hierarchy and properties of namespaces.

    Konstantin Khlebnikov, with the help of Al Viro, adds code that purges
    a network namespace's sysctl entries from the dcache when the
    namespace exits, as in some circumstances these could use a lot of
    memory.

    Vivek Goyal fixed a bug with stacked filesystems where the permissions
    on the wrong inode were being checked.

    I continue previous work on ptracing across exec, allowing a file to
    be setuid across exec while being ptraced if the tracer has enough
    credentials in the user namespace, and if the process has CAP_SETUID
    in its own namespace. Proc files for setuid or otherwise undumpable
    executables are now owned by the root in the user namespace of their
    mm, allowing debugging of setuid applications in containers to work
    better.

    A bug I introduced with permission checking and automount is now
    fixed. The big change is to mark the mounts that the kernel initiates
    as a result of an automount. This allows the permission checks in sget
    to be safely suppressed for this kind of mount, as the permission
    check happened when the original filesystem was mounted.

    Finally a special case in the mount namespace is removed preventing
    unbounded chains in the mount hash table, and making the semantics
    simpler which benefits CRIU.

    The vfs fix along with related work in ima and evm I believe makes us
    ready to finish developing and merge fully unprivileged mounts of the
    fuse filesystem. The cleanups of the mount namespace make it possible
    to discuss how to fix the worst case complexity of umount. The stacked
    filesystem fixes pave the way for adding multiple mappings for the
    filesystem uids so that efficient and safer containers can be
    implemented"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
    proc/sysctl: Don't grab i_lock under sysctl_lock.
    vfs: Use upper filesystem inode in bprm_fill_uid()
    proc/sysctl: prune stale dentries during unregistering
    mnt: Tuck mounts under others instead of creating shadow/side mounts.
    prctl: propagate has_child_subreaper flag to every descendant
    introduce the walk_process_tree() helper
    nsfs: Add an ioctl() to return owner UID of a userns
    fs: Better permission checking for submounts
    exit: fix the setns() && PR_SET_CHILD_SUBREAPER interaction
    vfs: open() with O_CREAT should not create inodes with unknown ids
    nsfs: Add an ioctl() to return the namespace type
    proc: Better ownership of files for non-dumpable tasks in user namespaces
    exec: Remove LSM_UNSAFE_PTRACE_CAP
    exec: Test the ptracer's saved cred to see if the tracee can gain caps
    exec: Don't reset euid and egid when the tracee has CAP_SETUID
    inotify: Convert to using per-namespace limits

    Linus Torvalds
     

24 Jan, 2017

3 commits