Eric Lee / smarc-fsl-linux-kernel

10 Jul, 2019

1 commit

9d22167f3 Merge branch 'next-lsm' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security

Pull capabilities update from James Morris:
"Minor fixes for capabilities:

- Update the commoncap.c code to utilize XATTR_SECURITY_PREFIX_LEN,
from Carmeli Tamir.

- Make the capability hooks static, from Yue Haibing"

* 'next-lsm' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
security/commoncap: Use xattr security prefix len
security: Make capability_hooks static

Linus Torvalds
2019-07-10 03:24:21 +0800

07 Jul, 2019

1 commit

c5eaab1d1 security/commoncap: Use xattr security prefix len

Using the existing defined XATTR_SECURITY_PREFIX_LEN instead of
sizeof(XATTR_SECURITY_PREFIX) - 1. Pretty simple cleanup.

Signed-off-by: Carmeli Tamir
Signed-off-by: James Morris

Carmeli Tamir
2019-07-07 10:55:54 +0800
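
The cleanup in c5eaab1d1 can be illustrated with a small user-space sketch. The `DEMO_`-prefixed macros and the helper `demo_security_suffix()` are hypothetical names chosen for this example; they only mirror the shape of the prefix and length constants and are not the commoncap.c code itself.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the xattr security prefix and its length. */
#define DEMO_XATTR_SECURITY_PREFIX     "security."
#define DEMO_XATTR_SECURITY_PREFIX_LEN (sizeof(DEMO_XATTR_SECURITY_PREFIX) - 1)

/*
 * Return the part of an xattr name that follows "security.", or NULL if
 * the name does not start with that prefix.
 */
const char *demo_security_suffix(const char *name)
{
	/*
	 * Before the cleanup the length would be spelled out at the call
	 * site as sizeof(DEMO_XATTR_SECURITY_PREFIX) - 1. The named constant
	 * expresses the same value once.
	 */
	if (strncmp(name, DEMO_XATTR_SECURITY_PREFIX,
		    DEMO_XATTR_SECURITY_PREFIX_LEN) != 0)
		return NULL;

	return name + DEMO_XATTR_SECURITY_PREFIX_LEN;
}
```

Calling `demo_security_suffix("security.capability")` yields a pointer to `"capability"`, while a name outside the security namespace yields NULL.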

12 Jun, 2019

1 commit

d1c5947ec security: Make capability_hooks static

Fix sparse warning:

security/commoncap.c:1347:27: warning:
symbol 'capability_hooks' was not declared. Should it be static?

Reported-by: Hulk Robot
Signed-off-by: YueHaibing
Signed-off-by: James Morris

YueHaibing
2019-06-12 05:05:16 +0800

31 May, 2019

1 commit

2874c5fd2 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152

Based on 1 normalized pattern(s):

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license as published by
the free software foundation either version 2 of the license or at
your option any later version

extracted by the scancode license scanner the SPDX license identifier

GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 3029 file(s).

Signed-off-by: Thomas Gleixner
Reviewed-by: Allison Randal
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
Signed-off-by: Greg Kroah-Hartman

Thomas Gleixner
2019-05-31 02:26:32 +0800

08 Mar, 2019

1 commit

be37f21a0 Merge tag 'audit-pr-20190305' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit

Pull audit updates from Paul Moore:
"A lucky 13 audit patches for v5.1.

Despite the rather large diffstat, most of the changes are from two
bug fix patches that move code from one Kconfig option to another.

Beyond that bit of churn, the remaining changes are largely cleanups
and bug-fixes as we slowly march towards container auditing. It isn't
all boring though, we do have a couple of new things: file
capabilities v3 support, and expanded support for filtering on
filesystems to solve problems with remote filesystems.

All changes pass the audit-testsuite. Please merge for v5.1"

* tag 'audit-pr-20190305' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
audit: mark expected switch fall-through
audit: hide auditsc_get_stamp and audit_serial prototypes
audit: join tty records to their syscall
audit: remove audit_context when CONFIG_AUDIT and not AUDITSYSCALL
audit: remove unused actx param from audit_rule_match
audit: ignore fcaps on umount
audit: clean up AUDITSYSCALL prototypes and stubs
audit: more filter PATH records keyed on filesystem magic
audit: add support for fcaps v3
audit: move loginuid and sessionid from CONFIG_AUDITSYSCALL to CONFIG_AUDIT
audit: add syscall information to CONFIG_CHANGE records
audit: hand taken context to audit_kill_trees for syscall logging
audit: give a clue what CONFIG_CHANGE op was involved

Linus Torvalds
2019-03-08 04:20:11 +0800

26 Feb, 2019

1 commit

e88ed488a LSM: Update function documentation for cap_capable

This should have gone in with commit
c1a85a00ea66cb6f0bd0f14e47c28c2b0999799f.

Signed-off-by: Micah Morton
Signed-off-by: James Morris

Micah Morton
2019-02-26 07:16:25 +0800

26 Jan, 2019

1 commit

2fec30e24 audit: add support for fcaps v3

V3 namespaced file capabilities were introduced in
commit 8db6c34f1dbc ("Introduce v3 namespaced file capabilities")

Add support for these by adding the "frootid" field to the existing
fcaps fields in the NAME and BPRM_FCAPS records.

Please see github issue
https://github.com/linux-audit/audit-kernel/issues/103

Signed-off-by: Richard Guy Briggs
Acked-by: Serge Hallyn
[PM: comment tweak to fit an 80 char line width]
Signed-off-by: Paul Moore

Richard Guy Briggs
2019-01-26 02:31:23 +0800

11 Jan, 2019

1 commit

c1a85a00e LSM: generalize flag passing to security_capable

This patch provides a general mechanism for passing flags to the
security_capable LSM hook. It replaces the specific 'audit' flag that is
used to tell security_capable whether it should log an audit message for
the given capability check. The reason for generalizing this flag
passing is so we can add an additional flag that signifies whether
security_capable is being called by a setid syscall (which is needed by
the proposed SafeSetID LSM).

Signed-off-by: Micah Morton
Reviewed-by: Kees Cook
Signed-off-by: James Morris

Micah Morton
2019-01-11 06:16:06 +0800
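
The move from a single 'audit' flag to a general flags argument, as described in c1a85a00e, can be sketched in user space as follows. The option names, their values, and `demo_capable()` are hypothetical and exist only for this illustration; they are not the kernel's hook signature.

```c
#include <stdbool.h>

/* Hypothetical option bits; names and values are illustrative only. */
#define DEMO_CAP_OPT_NONE    0x0u
#define DEMO_CAP_OPT_NOAUDIT 0x1u	/* do not log an audit message */
#define DEMO_CAP_OPT_INSETID 0x2u	/* check was made by a setid syscall */

/* Outcome of one simulated capability check. */
struct demo_check_result {
	bool allowed;
	bool audited;
	bool from_setid;
};

/*
 * Model of a hook that takes a bitmask of options instead of one
 * 'audit' boolean. A new option needs a new bit, not a new parameter.
 */
struct demo_check_result demo_capable(bool has_cap, unsigned int opts)
{
	struct demo_check_result r;

	r.allowed = has_cap;
	r.audited = !(opts & DEMO_CAP_OPT_NOAUDIT);
	r.from_setid = (opts & DEMO_CAP_OPT_INSETID) != 0;
	return r;
}
```

The design point is that callers which previously passed "audit" or "no audit" keep working with one bit, while a setid caller can add its own bit without touching the other call sites.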

09 Jan, 2019

1 commit

d117a154e capability: Initialize as LSM_ORDER_FIRST

This converts capabilities to use the new LSM_ORDER_FIRST position.

Signed-off-by: Kees Cook
Reviewed-by: Casey Schaufler

Kees Cook
2019-01-09 05:18:44 +0800

13 Dec, 2018

1 commit

876979c93 security: audit and remove any unnecessary uses of module.h

Historically a lot of these existed because we did not have
a distinction between what was modular code and what was providing
support to modules via EXPORT_SYMBOL and friends. That changed
when we forked out support for the latter into the export.h file.
This means we should be able to reduce the usage of module.h
in code that is obj-y Makefile or bool Kconfig.

The advantage in removing such instances is that module.h itself
sources about 15 other headers; adding significantly to what we feed
cpp, and it can obscure what headers we are effectively using.

Since module.h might have been the implicit source for init.h
(for __init) and for export.h (for EXPORT_SYMBOL) we consider each
instance for the presence of either and replace as needed.

Cc: James Morris
Cc: "Serge E. Hallyn"
Cc: John Johansen
Cc: Mimi Zohar
Cc: Dmitry Kasatkin
Cc: David Howells
Cc: linux-security-module@vger.kernel.org
Cc: linux-integrity@vger.kernel.org
Cc: keyrings@vger.kernel.org
Signed-off-by: Paul Gortmaker
Signed-off-by: James Morris

Paul Gortmaker
2018-12-13 06:58:51 +0800

05 Sep, 2018

1 commit

e42f6f9be Merge tag 'v4.19-rc2' into next-general

Sync to Linux 4.19-rc2 for downstream developers.

James Morris
2018-09-05 02:35:54 +0800

30 Aug, 2018

1 commit

4408e300a security/capabilities: remove check for -EINVAL

bprm_caps_from_vfs_caps() never returned -EINVAL so remove the
rc == -EINVAL check.

Signed-off-by: Christian Brauner
Reviewed-by: Serge Hallyn
Signed-off-by: James Morris

Christian Brauner
2018-08-30 00:05:28 +0800

11 Aug, 2018

1 commit

355139a8d cap_inode_getsecurity: use d_find_any_alias() instead of d_find_alias()

The code in cap_inode_getsecurity(), introduced by commit 8db6c34f1dbc
("Introduce v3 namespaced file capabilities"), should use
d_find_any_alias() instead of d_find_alias() to handle unhashed dentries
correctly. This is needed, for example, if execveat() is called with an
open but unlinked overlayfs file, because overlayfs unhashes the dentry on
unlink.
This is a regression that affects a real-life application, first reported at
https://www.spinics.net/lists/linux-unionfs/msg05363.html

The reproducer and setup below reproduce the case.
  const char *exec = "echo";
  const char *newargv[] = { "echo", "hello", NULL };
  const char *newenviron[] = { NULL };
  int fd, err;

  fd = open(exec, O_PATH);
  unlink(exec);
  err = syscall(322 /* SYS_execveat */, fd, "", newargv, newenviron,
                AT_EMPTY_PATH);
  if(err
Acked-by: Amir Goldstein
Acked-by: Serge E. Hallyn
Fixes: 8db6c34f1dbc ("Introduce v3 namespaced file capabilities")
Cc: stable@vger.kernel.org # v4.14
Signed-off-by: Eddie Horng
Signed-off-by: Eric W. Biederman

Eddie.Horng
2018-08-11 15:05:53 +0800

25 May, 2018

1 commit

b1d749c5c capabilities: Allow privileged user in s_user_ns to set security.* xattrs

A privileged user in s_user_ns will generally have the ability to
manipulate the backing store and insert security.* xattrs into
the filesystem directly. Therefore the kernel must be prepared to
handle these xattrs from unprivileged mounts, and it makes little
sense for commoncap to prevent writing these xattrs to the
filesystem. The capability and LSM code have already been updated
to appropriately handle xattrs from unprivileged mounts, so it
is safe to loosen this restriction on setting xattrs.

The exception to this logic is that writing xattrs to a mounted
filesystem may also cause the LSM inode_post_setxattr or
inode_setsecurity callbacks to be invoked. SELinux will deny the
xattr update by virtue of applying mountpoint labeling to
unprivileged userns mounts, and Smack will deny the writes for
any user without global CAP_MAC_ADMIN, so loosening the
capability check in commoncap is safe in this respect as well.

Signed-off-by: Seth Forshee
Acked-by: Serge Hallyn
Acked-by: Christian Brauner
Signed-off-by: Eric W. Biederman

Eric W. Biederman
2018-05-25 01:03:31 +0800

11 Apr, 2018

1 commit

1f5781725 commoncap: Handle memory allocation failure.

syzbot is reporting NULL pointer dereference at xattr_getsecurity() [1],
for cap_inode_getsecurity() is returning sizeof(struct vfs_cap_data) when
memory allocation failed. Return -ENOMEM if memory allocation failed.

[1] https://syzkaller.appspot.com/bug?id=a55ba438506fe68649a5f50d2d82d56b365e0107

Signed-off-by: Tetsuo Handa
Fixes: 8db6c34f1dbc8e06 ("Introduce v3 namespaced file capabilities")
Reported-by: syzbot
Cc: stable@vger.kernel.org # 4.14+
Acked-by: Serge E. Hallyn
Acked-by: James Morris
Signed-off-by: Eric W. Biederman

Tetsuo Handa
2018-04-11 08:17:41 +0800

02 Jan, 2018

1 commit

dc32b5c3e capabilities: fix buffer overread on very short xattr

If userspace attempted to set a "security.capability" xattr shorter than
4 bytes (e.g. 'setfattr -n security.capability -v x file'), then
cap_convert_nscap() read past the end of the buffer containing the xattr
value because it accessed the ->magic_etc field without verifying that
the xattr value is long enough to contain that field.

Fix it by validating the xattr value size first.

This bug was found using syzkaller with KASAN. The KASAN report was as
follows (cleaned up slightly):

BUG: KASAN: slab-out-of-bounds in cap_convert_nscap+0x514/0x630 security/commoncap.c:498
Read of size 4 at addr ffff88002d8741c0 by task syz-executor1/2852

CPU: 0 PID: 2852 Comm: syz-executor1 Not tainted 4.15.0-rc6-00200-gcc0aac99d977 #253
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0xe3/0x195 lib/dump_stack.c:53
print_address_description+0x73/0x260 mm/kasan/report.c:252
kasan_report_error mm/kasan/report.c:351 [inline]
kasan_report+0x235/0x350 mm/kasan/report.c:409
cap_convert_nscap+0x514/0x630 security/commoncap.c:498
setxattr+0x2bd/0x350 fs/xattr.c:446
path_setxattr+0x168/0x1b0 fs/xattr.c:472
SYSC_setxattr fs/xattr.c:487 [inline]
SyS_setxattr+0x36/0x50 fs/xattr.c:483
entry_SYSCALL_64_fastpath+0x18/0x85

Fixes: 8db6c34f1dbc ("Introduce v3 namespaced file capabilities")
Cc: stable@vger.kernel.org # v4.14+
Signed-off-by: Eric Biggers
Reviewed-by: Serge Hallyn
Signed-off-by: James Morris

Eric Biggers
2018-01-02 17:49:13 +0800
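
The fix described in dc32b5c3e follows a general pattern: check the buffer length before touching a header field. A minimal user-space sketch of that pattern is shown here. The function `demo_read_magic()` is a hypothetical helper, not the kernel's cap_convert_nscap(), and it ignores on-disk endianness.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Read the leading 32-bit header word from an xattr value, but only after
 * checking that the buffer is long enough to contain it.
 * Returns false and leaves *magic_out untouched for short buffers.
 */
bool demo_read_magic(const void *value, size_t size, uint32_t *magic_out)
{
	uint32_t magic;

	if (value == NULL || size < sizeof(magic))
		return false;	/* too short: a read would run past the end */

	memcpy(&magic, value, sizeof(magic));	/* safe: size was checked */
	*magic_out = magic;
	return true;
}
```

A one-byte value, like the one produced by the setfattr example in the commit message, is rejected before any field is read.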

14 Nov, 2017

1 commit

55b3a0cb5 Merge branch 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security

Pull general security subsystem updates from James Morris:
"TPM (from Jarkko):
- essential clean up for tpm_crb so that ARM64 and x86 versions do
not distract each other as much as before

- /dev/tpm0 now rejects too-short writes (a buffer shorter than
specified in the command header)

- use DMA-safe buffer in tpm_tis_spi

- otherwise mostly minor fixes.

Smack:
- base support for overlayfs

Capabilities:
- BPRM_FCAPS fixes, from Richard Guy Briggs:

The audit subsystem is adding a BPRM_FCAPS record when auditing
setuid application execution (SYSCALL execve). This is not expected
as it was supposed to be limited to when the file system actually
had capabilities in an extended attribute. It lists all
capabilities making the event really ugly to parse what is
happening. The PATH record correctly records the setuid bit and
owner. Suppress the BPRM_FCAPS record on set*id.

TOMOYO:
- Y2038 timestamping fixes"

* 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (28 commits)
MAINTAINERS: update the IMA, EVM, trusted-keys, encrypted-keys entries
Smack: Base support for overlayfs
MAINTAINERS: remove David Safford as maintainer for encrypted+trusted keys
tomoyo: fix timestamping for y2038
capabilities: audit log other surprising conditions
capabilities: fix logic for effective root or real root
capabilities: invert logic for clarity
capabilities: remove a layer of conditional logic
capabilities: move audit log decision to function
capabilities: use intuitive names for id changes
capabilities: use root_priveleged inline to clarify logic
capabilities: rename has_cap to has_fcap
capabilities: intuitive names for cap gain status
capabilities: factor out cap_bprm_set_creds privileged root
tpm, tpm_tis: use ARRAY_SIZE() to define TPM_HID_USR_IDX
tpm: fix duplicate inline declaration specifier
tpm: fix type of a local variables in tpm_tis_spi.c
tpm: fix type of a local variable in tpm2_map_command()
tpm: fix type of a local variable in tpm2_get_cc_attrs_tbl()
tpm-dev-common: Reject too short writes
...

Linus Torvalds
2017-11-14 02:30:44 +0800

20 Oct, 2017

10 commits

dbbbe1105 capabilities: audit log other surprising conditions

The existing condition tested for process effective capabilities set by
file attributes but intended to ignore the change if the result was
unsurprisingly an effective full set in the case root is special with a
setuid root executable file and we are root.

Stated again:
- When you execute a setuid root application, it is no surprise and
expected that it got all capabilities, so we do not want capabilities
recorded.
if (pE_grew && !(pE_fullset && (eff_root || real_root) && root_priveleged) )

Now make sure we cover other cases:
- If something prevented a setuid root app getting all capabilities and
it wound up with one capability only, then it is a surprise and should
be logged. When it is a setuid root file, we only want capabilities
when the process does not get full capabilities.
root_priveleged && setuid_root && !pE_fullset

- Similarly if a non-setuid program does pick up capabilities due to
file system based capabilities, then we want to know what capabilities
were picked up. When it has file system based capabilities we want
the capabilities.
!is_setuid && (has_fcap && pP_gained)

- If it is a non-setuid file and it gets ambient capabilities, we want
the capabilities.
!is_setuid && pA_gained

- These last two are combined into one due to the common first parameter.

Related: https://github.com/linux-audit/audit-kernel/issues/16

Signed-off-by: Richard Guy Briggs
Reviewed-by: Serge Hallyn
Acked-by: James Morris
Acked-by: Kees Cook
Acked-by: Paul Moore
Signed-off-by: James Morris

Richard Guy Briggs
2017-10-20 12:22:46 +0800
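
The conditions listed in dbbbe1105 can be combined into one predicate. The sketch below is a user-space model under the assumption that each condition is available as a plain boolean; `struct demo_exec_state` and `demo_should_log_fcaps()` are hypothetical names, and the field names follow the terms used in the commit message.

```c
#include <stdbool.h>

/* Hypothetical flat view of the conditions named in the commit message. */
struct demo_exec_state {
	bool pE_grew;		/* effective set grew */
	bool pE_fullset;	/* effective set is the full set */
	bool eff_root;		/* effective uid is root */
	bool real_root;		/* real uid is root */
	bool root_privileged;	/* root is treated as special */
	bool setuid_root;	/* file is setuid root */
	bool is_setuid;		/* exec changed the uid */
	bool has_fcap;		/* file carries file capabilities */
	bool pP_gained;		/* permitted set gained capabilities */
	bool pA_gained;		/* ambient set gained capabilities */
};

/* Return true when the capability change is surprising enough to log. */
bool demo_should_log_fcaps(const struct demo_exec_state *s)
{
	/* Growth that is not simply "root got everything". */
	bool surprising_growth = s->pE_grew &&
		!(s->pE_fullset && (s->eff_root || s->real_root) &&
		  s->root_privileged);

	/* A setuid root file that did not end up with the full set. */
	bool setuid_root_short = s->root_privileged && s->setuid_root &&
		!s->pE_fullset;

	/* The last two cases share the !is_setuid term and are merged. */
	bool nonsetuid_gain = !s->is_setuid &&
		((s->has_fcap && s->pP_gained) || s->pA_gained);

	return surprising_growth || setuid_root_short || nonsetuid_gain;
}
```

With this model, a setuid root program that receives the full effective set is not logged, while a non-setuid program that picks up permitted capabilities from file capabilities is.
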
588fb2c7e capabilities: fix logic for effective root or real root

Now that the logic is inverted, it is much easier to see that both real
root and effective root conditions had to be met to avoid printing the
BPRM_FCAPS record with audit syscalls. This meant that any setuid root
applications would print a full BPRM_FCAPS record when it wasn't
necessary, cluttering the event output, since the SYSCALL and PATH
records indicated the presence of the setuid bit and effective root user
id.

Require only one of effective root or real root to avoid printing the
unnecessary record.

Ref: commit 3fc689e96c0c ("Add audit_log_bprm_fcaps/AUDIT_BPRM_FCAPS")
See: https://github.com/linux-audit/audit-kernel/issues/16

Signed-off-by: Richard Guy Briggs
Reviewed-by: Serge Hallyn
Acked-by: James Morris
Acked-by: Kees Cook
Acked-by: Paul Moore
Signed-off-by: James Morris

Richard Guy Briggs
2017-10-20 12:22:45 +0800
c0d1adefe capabilities: invert logic for clarity

The way the logic was presented, it was awkward to read and verify.
Invert the logic using DeMorgan's Law to be more easily able to read and
understand.

Signed-off-by: Richard Guy Briggs
Reviewed-by: Serge Hallyn
Acked-by: James Morris
Acked-by: Kees Cook
Okay-ished-by: Paul Moore
Signed-off-by: James Morris

Richard Guy Briggs
2017-10-20 12:22:45 +0800
02ebbaf48 capabilities: remove a layer of conditional logic

Remove a layer of conditional logic to make the use of conditions
easier to read and analyse.

Signed-off-by: Richard Guy Briggs
Reviewed-by: Serge Hallyn
Acked-by: James Morris
Acked-by: Kees Cook
Okay-ished-by: Paul Moore
Signed-off-by: James Morris

Richard Guy Briggs
2017-10-20 12:22:45 +0800
9fbc2c796 capabilities: move audit log decision to function

Move the audit log decision logic to its own function to isolate the
complexity in one place.

Suggested-by: Serge Hallyn
Signed-off-by: Richard Guy Briggs
Reviewed-by: Serge Hallyn
Acked-by: James Morris
Acked-by: Kees Cook
Okay-ished-by: Paul Moore
Signed-off-by: James Morris

Richard Guy Briggs
2017-10-20 12:22:44 +0800
81a6a0129 capabilities: use intuitive names for id changes

Introduce a number of inlines to make the use of the negation of
uid_eq() easier to read and analyse.

Signed-off-by: Richard Guy Briggs
Reviewed-by: Serge Hallyn
Acked-by: James Morris
Acked-by: Kees Cook
Okay-ished-by: Paul Moore
Signed-off-by: James Morris

Richard Guy Briggs
2017-10-20 12:22:44 +0800
9304b46c9 capabilities: use root_priveleged inline to clarify logic

Introduce inline root_privileged() to make use of SECURE_NOROOT
easier to read.

Suggested-by: Serge Hallyn
Signed-off-by: Richard Guy Briggs
Reviewed-by: Serge Hallyn
Acked-by: James Morris
Acked-by: Kees Cook
Okay-ished-by: Paul Moore
Signed-off-by: James Morris

Richard Guy Briggs
2017-10-20 12:22:44 +0800
fc7eadf76 capabilities: rename has_cap to has_fcap

Rename has_cap to has_fcap to clarify it applies to file capabilities
since the entire source file is about capabilities.

Signed-off-by: Richard Guy Briggs
Reviewed-by: Serge Hallyn
Acked-by: James Morris
Acked-by: Kees Cook
Okay-ished-by: Paul Moore
Signed-off-by: James Morris

Richard Guy Briggs
2017-10-20 12:22:44 +0800
4c7e715fc capabilities: intuitive names for cap gain status

Introduce macros cap_gained, cap_grew, cap_full to make the use of the
negation of is_subset() easier to read and analyse.

Signed-off-by: Richard Guy Briggs
Reviewed-by: Serge Hallyn
Acked-by: James Morris
Acked-by: Kees Cook
Okay-ished-by: Paul Moore
Signed-off-by: James Morris

Richard Guy Briggs
2017-10-20 12:22:43 +0800
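
The idea in 4c7e715fc, giving readable names to negated subset tests, can be sketched with capability sets modelled as 64-bit masks. Everything with a `demo_` prefix is hypothetical: the commit introduces macros, whereas this sketch uses functions, and which pair of sets each name compares is an assumption made here for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability set: one bit per capability. */
typedef uint64_t demo_capset;

#define DEMO_CAP_FULL_SET ((demo_capset)~0ULL)

/* True if every bit set in 'a' is also set in 'b'. */
bool demo_is_subset(demo_capset a, demo_capset b)
{
	return (a & ~b) == 0;
}

/* 'now' holds at least one capability that 'before' did not. */
bool demo_cap_gained(demo_capset now, demo_capset before)
{
	return !demo_is_subset(now, before);
}

/* 'target' holds at least one capability that 'source' lacks. */
bool demo_cap_grew(demo_capset target, demo_capset source)
{
	return !demo_is_subset(target, source);
}

/* 'set' contains every capability. */
bool demo_cap_full(demo_capset set)
{
	return demo_is_subset(DEMO_CAP_FULL_SET, set);
}
```

A caller can then write `demo_cap_gained(new_set, old_set)` where it previously had to read `!demo_is_subset(new_set, old_set)` and work out the direction of the test.
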
db1a8922c capabilities: factor out cap_bprm_set_creds privileged root

Factor out the case of privileged root from the function
cap_bprm_set_creds() to make the latter easier to read and analyse.

Suggested-by: Serge Hallyn
Signed-off-by: Richard Guy Briggs
Reviewed-by: Serge Hallyn
Acked-by: James Morris
Acked-by: Kees Cook
Okay-ished-by: Paul Moore
Signed-off-by: James Morris

Richard Guy Briggs
2017-10-20 12:22:43 +0800

19 Oct, 2017

1 commit

76ba89c76 commoncap: move assignment of fs_ns to avoid null pointer dereference

The pointer fs_ns is assigned from inode->i_sb->s_user_ns before
a null pointer check on inode, hence if inode is actually null we
will get a null pointer dereference on this assignment. Fix this
by only dereferencing inode after the null pointer check on
inode.

Detected by CoverityScan CID#1455328 ("Dereference before null check")

Fixes: 8db6c34f1dbc ("Introduce v3 namespaced file capabilities")
Signed-off-by: Colin Ian King
Cc: stable@vger.kernel.org
Acked-by: Serge Hallyn
Signed-off-by: James Morris

Colin Ian King
2017-10-19 10:09:33 +0800

25 Sep, 2017

1 commit

a30282478 Merge branch 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security

Pull misc security layer update from James Morris:
"This is the remaining 'general' change in the security tree for v4.14,
following the direct merging of SELinux (+ TOMOYO), AppArmor, and
seccomp.

That's everything now for the security tree except IMA, which will
follow shortly (I've been traveling for the past week with patchy
internet)"

* 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
security: fix description of values returned by cap_inode_need_killpriv

Linus Torvalds
2017-09-25 02:40:41 +0800

24 Sep, 2017

1 commit

ab5348c9c security: fix description of values returned by cap_inode_need_killpriv

cap_inode_need_killpriv returns 1 if security.capability exists and
has a value and inode_killpriv() is required, 0 otherwise. Fix the
description of the return value to reflect this.

Signed-off-by: Stefan Berger
Reviewed-by: Serge Hallyn
Signed-off-by: James Morris

Stefan Berger
2017-09-24 12:15:41 +0800

12 Sep, 2017

1 commit

dd198ce71 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace

Pull namespace updates from Eric Biederman:
"Life has been busy and I have not gotten half as much done this round
as I would have liked. I delayed it so that a minor conflict
resolution with the mips tree could spend a little time in linux-next
before I sent this pull request.

This includes two long delayed user namespace changes from Kirill
Tkhai. It also includes a very useful change from Serge Hallyn that
allows the security capability attribute to be used inside of user
namespaces. The practical effect of this is people can now untar
tarballs and install rpms in user namespaces. It had been suggested to
generalize this and encode some of the namespace information
in the xattr name. Upon close inspection that makes the
things that should be hard easy and the things that should be easy
more expensive.

Then there is my bugfix/cleanup for signal injection that removes the
magic encoding of the siginfo union member from the kernel internal
si_code. The mips folks reported that the case where I had used
FPE_FIXME is impossible, so I have removed FPE_FIXME from mips, while at
the same time including a return statement in that case to keep gcc from
complaining about uninitialized variables.

I almost finished the work to make copy_siginfo_to_user a trivial
copy to user. The code is available at:

git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace.git neuter-copy_siginfo_to_user-v3

But I did not have time/energy to get the code posted and reviewed
before the merge window opened.

I was able to see that the security excuse for just copying fields
that we know are initialized doesn't work in practice: there are buggy
initializations that don't initialize the proper fields in siginfo. So
we still sometimes copy uninitialized data to userspace"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
Introduce v3 namespaced file capabilities
mips/signal: In force_fcr31_sig return in the impossible case
signal: Remove kernel interal si_code magic
fcntl: Don't use ambiguous SIG_POLL si_codes
prctl: Allow local CAP_SYS_ADMIN changing exe_file
security: Use user_namespace::level to avoid redundant iterations in cap_capable()
userns,pidns: Verify the userns for new pid namespaces
signal/testing: Don't look for __SI_FAULT in userspace
signal/mips: Document a conflict with SI_USER with SIGFPE
signal/sparc: Document a conflict with SI_USER with SIGFPE
signal/ia64: Document a conflict with SI_USER with SIGFPE
signal/alpha: Document a conflict with SI_USER for SIGTRAP

Linus Torvalds
2017-09-12 09:34:47 +0800

02 Sep, 2017

1 commit

8db6c34f1 Introduce v3 namespaced file capabilities

Root in a non-initial user ns cannot be trusted to write a traditional
security.capability xattr. If it were allowed to do so, then any
unprivileged user on the host could map his own uid to root in a private
namespace, write the xattr, and execute the file with privilege on the
host.

However supporting file capabilities in a user namespace is very
desirable. Not doing so means that any programs designed to run with
limited privilege must continue to support other methods of gaining and
dropping privilege. For instance a program installer must detect
whether file capabilities can be assigned, and assign them if so but set
setuid-root otherwise. The program in turn must know how to drop
partial capabilities, and do so only if setuid-root.

This patch introduces v3 of the security.capability xattr. It builds a
vfs_ns_cap_data struct by appending a uid_t rootid to struct
vfs_cap_data. This is the absolute uid_t (that is, the uid_t in user
namespace which mounted the filesystem, usually init_user_ns) of the
root id in whose namespaces the file capabilities may take effect.

When a task asks to write a v2 security.capability xattr, if it is
privileged with respect to the userns which mounted the filesystem, then
nothing should change. Otherwise, the kernel will transparently rewrite
the xattr as a v3 with the appropriate rootid. This is done during the
execution of setxattr() to catch user-space-initiated capability writes.
Subsequently, any task executing the file which has the noted kuid as
its root uid, or which is in a descendent user_ns of such a user_ns,
will run the file with capabilities.

Similarly when asking to read file capabilities, a v3 capability will
be presented as v2 if it applies to the caller's namespace.

If a task writes a v3 security.capability, then it can provide a uid for
the xattr so long as the uid is valid in its own user namespace, and it
is privileged with CAP_SETFCAP over its namespace. The kernel will
translate that rootid to an absolute uid, and write that to disk. After
this, a task in the writer's namespace will not be able to use those
capabilities (unless rootid was 0), but a task in a namespace where the
given uid is root will.

Only a single security.capability xattr may exist at a time for a given
file. A task may overwrite an existing xattr so long as it is
privileged over the inode. Note this is a departure from previous
semantics, which required privilege to remove a security.capability
xattr. This check can be re-added if deemed useful.

This allows a simple setxattr to work, allows tar/untar to work, and
allows us to tar in one namespace and untar in another while preserving
the capability, without risking leaking privilege into a parent
namespace.

Example using tar:

$ cp /bin/sleep sleepx
$ mkdir b1 b2
$ lxc-usernsexec -m b:0:100000:1 -m b:1:$(id -u):1 -- chown 0:0 b1
$ lxc-usernsexec -m b:0:100001:1 -m b:1:$(id -u):1 -- chown 0:0 b2
$ lxc-usernsexec -m b:0:100000:1000 -- tar --xattrs-include=security.capability --xattrs -cf b1/sleepx.tar sleepx
$ lxc-usernsexec -m b:0:100001:1000 -- tar --xattrs-include=security.capability --xattrs -C b2 -xf b1/sleepx.tar
$ lxc-usernsexec -m b:0:100001:1000 -- getcap b2/sleepx
b2/sleepx = cap_sys_admin+ep
# /opt/ltp/testcases/bin/getv3xattr b2/sleepx
v3 xattr, rootid is 100001

A patch to linux-test-project adding a new set of tests for this
functionality is in the nsfscaps branch at github.com/hallyn/ltp

Changelog:
Nov 02 2016: fix invalid check at refuse_fcap_overwrite()
Nov 07 2016: convert rootid from and to fs user_ns
(From ebiederm: mar 28 2017)
commoncap.c: fix typos - s/v4/v3
get_vfs_caps_from_disk: clarify the fs_ns root access check
nsfscaps: change the code split for cap_inode_setxattr()
Apr 09 2017:
don't return v3 cap for caps owned by current root.
return a v2 cap for a true v2 cap in non-init ns
Apr 18 2017:
. Change the flow of fscap writing to support s_user_ns writing.
. Remove refuse_fcap_overwrite(). The value of the previous
xattr doesn't matter.
Apr 24 2017:
. incorporate Eric's incremental diff
. move cap_convert_nscap to setxattr and simplify its usage
May 8, 2017:
. fix leaking dentry refcount in cap_inode_getsecurity

Signed-off-by: Serge Hallyn
Signed-off-by: Eric W. Biederman

Serge E. Hallyn
2017-09-02 03:57:15 +0800
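
The layout change described in 8db6c34f1, appending a root uid to the v2 structure, can be sketched as below. The `demo_` types are hypothetical mirrors written for this example, the two-word data array is an assumption, and the sketch ignores the little-endian on-disk encoding and the version bits carried in magic_etc.

```c
#include <stdint.h>

#define DEMO_VFS_CAP_U32 2	/* assumed number of 32-bit words per set */

/* Hypothetical mirror of the v2 layout. */
struct demo_vfs_cap_data {
	uint32_t magic_etc;
	struct {
		uint32_t permitted;
		uint32_t inheritable;
	} data[DEMO_VFS_CAP_U32];
};

/* v3: the same fields with the root uid appended at the end. */
struct demo_vfs_ns_cap_data {
	uint32_t magic_etc;
	struct {
		uint32_t permitted;
		uint32_t inheritable;
	} data[DEMO_VFS_CAP_U32];
	uint32_t rootid;
};

/* Build a v3 record from a v2 record plus the root uid it applies to. */
struct demo_vfs_ns_cap_data
demo_v2_to_v3(const struct demo_vfs_cap_data *v2, uint32_t rootid)
{
	struct demo_vfs_ns_cap_data v3;
	int i;

	/* A real conversion would also rewrite the version in magic_etc. */
	v3.magic_etc = v2->magic_etc;
	for (i = 0; i < DEMO_VFS_CAP_U32; i++) {
		v3.data[i].permitted = v2->data[i].permitted;
		v3.data[i].inheritable = v2->data[i].inheritable;
	}
	v3.rootid = rootid;
	return v3;
}
```

Because the new field is appended, the v3 record is exactly one 32-bit word longer than the v2 record, and the capability words keep their offsets.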

02 Aug, 2017

2 commits

ee67ae7ef commoncap: Move cap_elevated calculation into bprm_set_creds

Instead of a separate function, open-code the cap_elevated test, which
lets us entirely remove bprm->cap_effective (to use the local "effective"
variable instead), and more accurately examine euid/egid changes via the
existing local "is_setid".

The following LTP tests were run to validate the changes:

# ./runltp -f syscalls -s cap
# ./runltp -f securebits
# ./runltp -f cap_bounds
# ./runltp -f filecaps

All kernel selftests for capabilities and exec continue to pass as well.

Signed-off-by: Kees Cook
Reviewed-by: James Morris
Acked-by: Serge Hallyn
Reviewed-by: Andy Lutomirski

Kees Cook
2017-08-02 03:03:09 +0800
46d98eb4e commoncap: Refactor to remove bprm_secureexec hook

The commoncap implementation of the bprm_secureexec hook is the only LSM
that depends on the final call to its bprm_set_creds hook (since it may
be called for multiple files, it ignores bprm->called_set_creds). As a
result, it cannot safely _clear_ bprm->secureexec since other LSMs may
have set it. Instead, remove the bprm_secureexec hook by introducing a
new flag to bprm specific to commoncap: cap_elevated. This is similar to
cap_effective, but that is used for a specific subset of elevated
privileges, and exists solely to track state from bprm_set_creds to
bprm_secureexec. As such, it will be removed in the next patch.

Here, set the new bprm->cap_elevated flag when setuid/setgid has happened
from bprm_fill_uid() or fscapabilities have been prepared. This temporarily
moves the bprm_secureexec hook to a static inline. The helper will be
removed in the next patch; this makes the step easier to review and bisect,
since this does not introduce any changes to inputs nor outputs to the
"elevated privileges" calculation.

The new flag is merged with the bprm->secureexec flag in setup_new_exec()
since this marks the end of any further prepare_binprm() calls.

Cc: Andy Lutomirski
Signed-off-by: Kees Cook
Reviewed-by: Andy Lutomirski
Acked-by: James Morris
Acked-by: Serge Hallyn

Kees Cook
2017-08-02 03:03:08 +0800

20 Jul, 2017

1 commit

64db4c7f4 security: Use user_namespace::level to avoid redundant iterations in cap_capable()

When ns->level is not larger than cred->user_ns->level,
then ns can't be cred->user_ns's descendant, and
there is no sense in searching its parents.

So, break the cycle earlier and skip needless iterations.

v2: Change comment as suggested by Andy Lutomirski.

Signed-off-by: Kirill Tkhai
Signed-off-by: Eric W. Biederman

Kirill Tkhai
2017-07-20 20:46:06 +0800
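
The early exit described in 64db4c7f4 can be sketched as a walk up a parent chain that stops as soon as the level comparison rules out a match. `struct demo_user_ns` and `demo_ns_is_same_or_descendant()` are hypothetical names for this user-space model; the real cap_capable() loop performs further checks that are left out here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical namespace node: level 0 is the initial namespace. */
struct demo_user_ns {
	const struct demo_user_ns *parent;
	int level;
};

/*
 * Return true if 'ns' is 'target' or one of its descendants.
 * A descendant always has a strictly larger level than its ancestor, so
 * the walk can stop once the current level is not larger than the target's.
 */
bool demo_ns_is_same_or_descendant(const struct demo_user_ns *ns,
				   const struct demo_user_ns *target)
{
	while (ns != NULL) {
		if (ns == target)
			return true;
		if (ns->level <= target->level)
			return false;	/* cannot be below target: stop early */
		ns = ns->parent;
	}
	return false;
}
```

Without the level test, a failed lookup would walk every parent up to the initial namespace before giving up.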

06 Mar, 2017

1 commit

ca97d939d security: mark LSM hooks as __ro_after_init

Mark all of the registration hooks as __ro_after_init (via the
__lsm_ro_after_init macro).

Signed-off-by: James Morris
Acked-by: Stephen Smalley
Acked-by: Kees Cook

James Morris
2017-03-06 08:00:15 +0800

24 Feb, 2017

1 commit

f1ef09fde Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace

Pull namespace updates from Eric Biederman:
"There is a lot here. A lot of these changes result in subtle user
visible differences in kernel behavior. I don't expect anything will
care but I will revert/fix things immediately if any regressions show
up.

From Seth Forshee there is a continuation of the work to make the vfs
ready for unprivileged mounts. We had thought the previous changes
prevented the creation of files outside of s_user_ns of a filesystem,
but it turns out we missed the O_CREAT path. Ooops.

Pavel Tikhomirov and Oleg Nesterov worked together to fix a long
standing bug in the implementation of PR_SET_CHILD_SUBREAPER where only
children that are forked after the prctl are considered and not
children forked before the prctl. The only known user of this prctl,
systemd, forks all children after the prctl. So no userspace
regressions will occur. Holding earlier forked children to the same
rules as later forked children creates a semantic that is sane enough
to allow checkpointing of processes that use this feature.

There is a long delayed change by Nikolay Borisov to limit inotify
instances inside a user namespace.

Michael Kerrisk extends the API for files used to manipulate
namespaces with two new trivial ioctls to allow discovery of the
hierarchy and properties of namespaces.

Konstantin Khlebnikov with the help of Al Viro adds code that, when a
network namespace exits, purges its sysctl entries from the dcache, as
in some circumstances these could use a lot of memory.

Vivek Goyal fixed a bug with stacked filesystems where the permissions
on the wrong inode were being checked.

I continue previous work on ptracing across exec. Allowing a file to
be setuid across exec while being ptraced if the tracer has enough
credentials in the user namespace, and if the process has CAP_SETUID
in its own namespace. Proc files for setuid or otherwise undumpable
executables are now owned by the root in the user namespace of their
mm. Allowing debugging of setuid applications in containers to work
better.

A bug I introduced with permission checking and automount is now
fixed. The big change is to mark the mounts that the kernel initiates
as a result of an automount. This allows the permission checks in sget
to be safely suppressed for this kind of mount. As the permission
check happened when the original filesystem was mounted.

Finally a special case in the mount namespace is removed preventing
unbounded chains in the mount hash table, and making the semantics
simpler which benefits CRIU.

The vfs fix along with related work in ima and evm I believe makes us
ready to finish developing and merge fully unprivileged mounts of the
fuse filesystem. The cleanups of the mount namespace make it possible to
discuss how to fix the worst case complexity of umount. The stacked filesystem
fixes pave the way for adding multiple mappings for the filesystem
uids so that efficient and safer containers can be implemented"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
proc/sysctl: Don't grab i_lock under sysctl_lock.
vfs: Use upper filesystem inode in bprm_fill_uid()
proc/sysctl: prune stale dentries during unregistering
mnt: Tuck mounts under others instead of creating shadow/side mounts.
prctl: propagate has_child_subreaper flag to every descendant
introduce the walk_process_tree() helper
nsfs: Add an ioctl() to return owner UID of a userns
fs: Better permission checking for submounts
exit: fix the setns() && PR_SET_CHILD_SUBREAPER interaction
vfs: open() with O_CREAT should not create inodes with unknown ids
nsfs: Add an ioctl() to return the namespace type
proc: Better ownership of files for non-dumpable tasks in user namespaces
exec: Remove LSM_UNSAFE_PTRACE_CAP
exec: Test the ptracer's saved cred to see if the tracee can gain caps
exec: Don't reset euid and egid when the tracee has CAP_SETUID
inotify: Convert to using per-namespace limits

Linus Torvalds
2017-02-24 12:33:51 +0800
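
For readers unfamiliar with the PR_SET_CHILD_SUBREAPER prctl mentioned in the pull message above, here is a minimal sketch of how a process marks itself as a child subreaper and reads the setting back. The helper name enable_child_subreaper() is hypothetical; the two prctl options are the documented Linux ones. It does not demonstrate the fork-ordering fix itself.

```c
#include <sys/prctl.h>

/*
 * Mark the calling process as a child subreaper, then read the
 * setting back. Returns 1 if the attribute is set, 0 if it is not,
 * and -1 if either prctl() call fails.
 */
int enable_child_subreaper(void)
{
    int is_set = 0;

    if (prctl(PR_SET_CHILD_SUBREAPER, 1, 0, 0, 0) != 0)
        return -1;

    /* PR_GET_CHILD_SUBREAPER stores the setting through an int pointer. */
    if (prctl(PR_GET_CHILD_SUBREAPER, &is_set, 0, 0, 0) != 0)
        return -1;

    return is_set;
}
```

With the fix described in the pull message, descendants forked before this call are treated the same as those forked after it.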

24 Jan, 2017

3 commits

9227dd2a8 exec: Remove LSM_UNSAFE_PTRACE_CAP

With previous changes every location that tests for
LSM_UNSAFE_PTRACE_CAP also tests for LSM_UNSAFE_PTRACE making the
LSM_UNSAFE_PTRACE_CAP redundant, so remove it.

Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2017-01-24 07:03:08 +0800

20523132e exec: Test the ptracer's saved cred to see if the tracee can gain caps

Now that we have user namespaces and non-global capabilities verify
the tracer has capabilities in the relevant user namespace instead
of in the current_user_ns().

As the test for setting LSM_UNSAFE_PTRACE_CAP is currently
ptracer_capable(p, current_user_ns()) and the new task credentials are
in current_user_ns() this change does not have any user visible change
and simply moves the test to where it is used, making the code easier
to read.

Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2017-01-24 07:03:08 +0800

70169420f exec: Don't reset euid and egid when the tracee has CAP_SETUID

Don't reset euid and egid when the tracee has CAP_SETUID in
its user namespace. I punted on relaxing this permission check
long ago but now that I have read this code closely it is clear
it is safe to test against CAP_SETUID in the user namespace.

Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2017-01-24 07:03:07 +0800
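
The rule stated in the last commit message can be modelled roughly as below. This is an illustrative sketch under simplifying assumptions, not the kernel's exec path: struct exec_ids and apply_traced_exec_policy() are hypothetical, and the whole "unsafe exec" decision is collapsed into one boolean.

```c
#include <stdbool.h>

/* Hypothetical container for the ids an exec would install. */
struct exec_ids {
    unsigned int uid, euid;
    unsigned int gid, egid;
};

/*
 * Assumed simplification: unsafe_exec is true when the exec would
 * raise privileges while it is considered unsafe (for example, the
 * task is being ptraced). has_cap_setuid_in_ns is true when the task
 * holds CAP_SETUID in its own user namespace.
 */
void apply_traced_exec_policy(struct exec_ids *ids, bool unsafe_exec,
                              bool has_cap_setuid_in_ns)
{
    if (!unsafe_exec)
        return;

    /* Without CAP_SETUID, fall back to the real ids. */
    if (!has_cap_setuid_in_ns) {
        ids->euid = ids->uid;
        ids->egid = ids->gid;
    }
}
```

In this model a task that holds CAP_SETUID keeps the effective ids from the setuid file, since it could have switched to them on its own anyway.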