15 Feb, 2017

1 commit

  • commit 0c461cb727d146c9ef2d3e86214f498b78b7d125 upstream.

    SELinux tries to support setting/clearing of /proc/pid/attr attributes
    from the shell by ignoring terminating newlines and treating an
    attribute value that begins with a NUL or newline as an attempt to
    clear the attribute. However, the test for clearing attributes has
    always been wrong; it has an off-by-one error, and this could further
    lead to reading past the end of the allocated buffer since commit
    bb646cdb12e75d82258c2f2e7746d5952d3e321a ("proc_pid_attr_write():
    switch to memdup_user()"). Fix the off-by-one error.
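
    The clearing rule described above can be sketched in userspace C. This
    is an illustration only, not the kernel's code: the helper names
    attr_write_is_clear() and attr_value_len() are hypothetical, and the
    comment about the buggy variant assumes the off-by-one test looked one
    byte past the first character.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: decide whether a buffer written to an attribute
 * file is a request to clear the attribute. An empty write, or a value
 * whose FIRST byte is NUL or '\n', is treated as "clear".
 */
static bool attr_write_is_clear(const char *buf, size_t size)
{
    /*
     * Inspect buf[0]. A test that inspects buf[1] instead (assumed
     * shape of the off-by-one) reads past the end of a one-byte
     * buffer such as "\n".
     */
    return size == 0 || buf[0] == '\0' || buf[0] == '\n';
}

/* Length of the value once a single terminating newline is ignored. */
static size_t attr_value_len(const char *buf, size_t size)
{
    if (size > 0 && buf[size - 1] == '\n')
        return size - 1;
    return size;
}
```

    With these helpers, a one-byte write of "\n" counts as a clear request,
    while "value\n" is a set request whose value is five bytes long.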

    Even with this fix, setting and clearing /proc/pid/attr attributes
    from the shell is not straightforward since the interface does not
    support multiple write() calls (so shells that write the value and
    newline separately will set and then immediately clear the attribute,
    requiring use of echo -n to set the attribute), whereas trying to use
    echo -n "" to clear the attribute causes the shell to skip the
    write() call altogether since POSIX says that a zero-length write
    causes no side effects. Thus, one must use echo -n to set and echo
    without -n to clear, as in the following example:
    $ echo -n unconfined_u:object_r:user_home_t:s0 > /proc/$$/attr/fscreate
    $ cat /proc/$$/attr/fscreate
    unconfined_u:object_r:user_home_t:s0
    $ echo "" > /proc/$$/attr/fscreate
    $ cat /proc/$$/attr/fscreate

    Note the use of /proc/$$ rather than /proc/self, as otherwise
    the cat command will read its own attribute value, not that of the shell.

    There are no users of this facility to my knowledge; possibly we
    should just get rid of it.

    UPDATE: Upon further investigation it appears that a local process
    with the process:setfscreate permission can cause a kernel panic as a
    result of this bug. This patch fixes CVE-2017-2618.

    Signed-off-by: Stephen Smalley
    [PM: added the update about CVE-2017-2618 to the commit description]
    Signed-off-by: Paul Moore
    Signed-off-by: Greg Kroah-Hartman

    Signed-off-by: James Morris

    Stephen Smalley
     

20 Oct, 2016

1 commit

  • Asking for a non-current task's stack can't be done without races
    unless the task is frozen in kernel mode. As far as I know,
    vm_is_stack_for_task() never had a safe non-current use case.

    The __unused annotation is because some KSTK_ESP implementations
    ignore their parameter, which IMO is further justification for this
    patch.

    Signed-off-by: Andy Lutomirski
    Acked-by: Thomas Gleixner
    Cc: Al Viro
    Cc: Andrew Morton
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Jann Horn
    Cc: Kees Cook
    Cc: Linus Torvalds
    Cc: Linux API
    Cc: Peter Zijlstra
    Cc: Tycho Andersen
    Link: http://lkml.kernel.org/r/4c3f68f426e6c061ca98b4fc7ef85ffbb0a25b0c.1475257877.git.luto@kernel.org
    Signed-off-by: Ingo Molnar

    Andy Lutomirski
     

11 Oct, 2016

3 commits

  • Pull more vfs updates from Al Viro:
    "->rename2() work from Miklos + current_time() from Deepa"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    fs: Replace current_fs_time() with current_time()
    fs: Replace CURRENT_TIME_SEC with current_time() for inode timestamps
    fs: Replace CURRENT_TIME with current_time() for inode timestamps
    fs: proc: Delete inode time initializations in proc_alloc_inode()
    vfs: Add current_time() api
    vfs: add note about i_op->rename changes to porting
    fs: rename "rename2" i_op to "rename"
    vfs: remove unused i_op->rename
    fs: make remaining filesystems use .rename2
    libfs: support RENAME_NOREPLACE in simple_rename()
    fs: support RENAME_NOREPLACE for local filesystems
    ncpfs: fix unused variable warning

    Linus Torvalds
     
  • Pull vfs xattr updates from Al Viro:
    "xattr stuff from Andreas

    This completes the switch to xattr_handler ->get()/->set() from
    ->getxattr/->setxattr/->removexattr"

    * 'work.xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    vfs: Remove {get,set,remove}xattr inode operations
    xattr: Stop calling {get,set,remove}xattr inode operations
    vfs: Check for the IOP_XATTR flag in listxattr
    xattr: Add __vfs_{get,set,remove}xattr helpers
    libfs: Use IOP_XATTR flag for empty directory handling
    vfs: Use IOP_XATTR flag for bad-inode handling
    vfs: Add IOP_XATTR inode operations flag
    vfs: Move xattr_resolve_name to the front of fs/xattr.c
    ecryptfs: Switch to generic xattr handlers
    sockfs: Get rid of getxattr iop
    sockfs: getxattr: Fail with -EOPNOTSUPP for invalid attribute names
    kernfs: Switch to generic xattr handlers
    hfs: Switch to generic xattr handlers
    jffs2: Remove jffs2_{get,set,remove}xattr macros
    xattr: Remove unnecessary NULL attribute name check

    Linus Torvalds
     
  • Merge my system logging cleanups, triggered by the broken '\n' patches.

    The line continuation handling has been broken basically forever, and
    the code to handle the system log records was both confusing and
    dubious. And it would do entirely the wrong thing unless you always had
    a terminating newline, partly because it couldn't actually see whether a
    message was marked KERN_CONT or not (but partly because the LOG_CONT
    handling in the recording code was rather confusing too).

    This re-introduces a real semantically meaningful KERN_CONT, and fixes
    the few places I noticed where it was missing. There are probably more
    missing cases, since KERN_CONT hasn't actually had any semantic meaning
    for at least four years (other than the checkpatch meaning of "no log
    level necessary, this is a continuation line").

    This also allows the combination of KERN_CONT and a log level. In that
    case the log level will be ignored if the merging with a previous line
    is successful, but if a new record is needed, that new record will now
    get the right log level.

    That also means that you can at least in theory combine KERN_CONT with
    the "pr_info()" style helpers, although any use of pr_fmt() prefixing
    would make that just result in a mess, of course (the prefix would end
    up in the middle of a continuing line).

    * printk-cleanups:
    printk: make reading the kernel log flush pending lines
    printk: re-organize log_output() to be more legible
    printk: split out core logging code into helper function
    printk: reinstate KERN_CONT for printing continuation lines

    Linus Torvalds
     

10 Oct, 2016

1 commit

  • Long long ago the kernel log buffer was a buffered stream of bytes, very
    much like stdio in user space. It supported log levels by scanning the
    stream and noticing the log level markers at the beginning of each line,
    but if you wanted to print a partial line in multiple chunks, you just
    did multiple printk() calls, and it just automatically worked.

    Except when it didn't, and you had very confusing output when different
    lines got all mixed up with each other. Then you got fragment lines
    mixing with each other, or with non-fragment lines, because it was
    traditionally impossible to tell whether a printk() call was a
    continuation or not.

    To at least help clarify the issue of continuation lines, we added a
    KERN_CONT marker back in 2007 to mark continuation lines:

    474925277671 ("printk: add KERN_CONT annotation").

    That continuation marker was initially an empty string, and didn't
    actually make any semantic difference. But it at least made it possible
    to annotate the source code, and have check-patch notice that a printk()
    didn't need or want a log level marker, because it was a continuation of
    a previous line.

    To avoid the ambiguity between a continuation line that had that
    KERN_CONT marker, and a printk with no level information at all, we then
    in 2009 made KERN_CONT be a real log level marker which meant that we
    could now reliably tell the difference between the two cases.

    5fd29d6ccbc9 ("printk: clean up handling of log-levels and newlines")

    and we could take advantage of that to make sure we didn't mix up
    continuation lines with lines that just didn't have any loglevel at all.

    Then, in 2012, the kernel log buffer was changed to be a "record" based
    log, where each line was a record that has a loglevel and a timestamp.

    You can see the beginning of that conversion in commits

    e11fea92e13f ("kmsg: export printk records to the /dev/kmsg interface")
    7ff9554bb578 ("printk: convert byte-buffer to variable-length record buffer")

    with a number of follow-up commits to fix some painful fallout from that
    conversion. Overall, it took a couple of months to sort out most of
    it. But the upside was that you could have concurrent readers (and
    writers) of the kernel log and not have lines with mixed output in them.

    And one particular pain-point for the record-based kernel logging was
    exactly the fragmentary lines that are generated in smaller chunks. In
    order to still log them as one record, the continuation lines need to be
    attached to the previous record properly.

    However, the explicit continuation record marker that is actually useful
    for this exact case was removed at around the same time by commit

    61e99ab8e35a ("printk: remove the now unnecessary "C" annotation for KERN_CONT")

    due to the incorrect belief that KERN_CONT wasn't meaningful. The
    ambiguity between "is this a continuation line" or "is this a plain
    printk with no log level information" was reintroduced, and in fact
    became an even bigger pain point because there was now the whole
    record-level merging of kernel messages going on.

    This patch reinstates the KERN_CONT as a real non-empty string marker,
    so that the ambiguity is fixed once again.

    But it's not a plain revert of that original removal: in the four years
    since we made KERN_CONT an empty string again, not only has the format
    of the log level markers changed, we've also had some usage changes in
    this area.

    For example, some ACPI code seems to use KERN_CONT _together_ with a log
    level, and now uses both the KERN_CONT marker and (for example) a
    KERN_INFO marker to show that it's an informational continuation of a
    line.

    Which is actually not a bad idea - if the continuation line cannot be
    attached to its predecessor, without the log level information we don't
    know what log level to assign to it (and we traditionally just assigned
    it the default loglevel). So having both a log level and the KERN_CONT
    marker is not necessarily a bad idea, but it does mean that we need to
    actually iterate over potentially multiple markers, rather than just a
    single one.
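
    Iterating over several markers can be sketched as below. This is a
    userspace illustration, not the printk implementation: the struct and
    function names are hypothetical, and the two-byte marker encoding (a
    \001 byte followed by '0'..'7' for a level, 'd' for the default level,
    or 'c' for a continuation) is an assumption of the sketch.

```c
#define SOH '\001'          /* byte that opens a marker (assumed encoding) */
#define LEVEL_DEFAULT (-1)  /* hypothetical value for "no level given" */

struct parsed_prefix {
    int level;          /* 0..7, or LEVEL_DEFAULT */
    int is_cont;        /* non-zero if a continuation marker was seen */
    const char *text;   /* message text after all markers */
};

/* Consume every leading marker instead of stopping after the first. */
static struct parsed_prefix parse_markers(const char *msg)
{
    struct parsed_prefix p = { LEVEL_DEFAULT, 0, msg };

    while (p.text[0] == SOH && p.text[1] != '\0') {
        char c = p.text[1];

        if (c >= '0' && c <= '7') {
            /* Keep the first explicit level that appears. */
            if (p.level == LEVEL_DEFAULT)
                p.level = c - '0';
        } else if (c == 'c') {
            p.is_cont = 1;
        } else if (c != 'd') {
            break;      /* not a marker: leave the text untouched */
        }
        p.text += 2;    /* skip the two-byte marker */
    }
    return p;
}
```

    A caller would try to merge the text into the previous record when
    is_cont is set, and fall back to level for a new record if the merge
    is not possible.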

    Also, since KERN_CONT was still conceptually needed, and encouraged, but
    didn't actually _do_ anything, we've also had the reverse problem:
    rather than having too many annotations it has too few, and there is bit
    rot with code that no longer marks the continuation lines with the
    KERN_CONT marker.

    So this patch not only re-instates the non-empty KERN_CONT marker, it
    also fixes up the cases of bit-rot I noticed in my own logs.

    There are probably other cases where KERN_CONT will be needed to be
    added, either because it is new code that never dealt with the need for
    KERN_CONT, or old code that has bitrotted without anybody noticing.

    That said, we should strive to avoid the need for KERN_CONT. It does
    result in real problems for logging, and should generally not be seen as
    a good feature. If we some day can get rid of the feature entirely,
    because nobody does any fragmented printk calls, that would be lovely.

    But until that point, let's at least mark the code that relies on the hacky
    multi-fragment kernel printk's. Not only does it avoid the ambiguity,
    it also annotates code as "maybe this would be good to fix some day".

    (That said, particularly during single-threaded bootup, the downsides of
    KERN_CONT are very limited. Things get much hairier when you have
    multiple threads going on and user level reading and writing logs too).

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

08 Oct, 2016

1 commit

  • Right now, various places in the kernel check for the existence of
    getxattr, setxattr, and removexattr inode operations and directly call
    those operations. Switch to helper functions and test for the IOP_XATTR
    flag instead.

    Signed-off-by: Andreas Gruenbacher
    Acked-by: James Morris
    Signed-off-by: Al Viro

    Andreas Gruenbacher
     

28 Sep, 2016

1 commit

  • CURRENT_TIME macro is not appropriate for filesystems as it
    doesn't use the right granularity for filesystem timestamps.
    Use current_time() instead.

    CURRENT_TIME is also not y2038 safe.

    This is also in preparation for the patch that transitions
    vfs timestamps to use 64 bit time and hence make them
    y2038 safe. As part of the effort current_time() will be
    extended to do range checks. Hence, it is necessary for all
    file system timestamps to use current_time(). Also,
    current_time() will be transitioned along with vfs to be
    y2038 safe.

    Note that whenever a single call to current_time() is used
    to change timestamps in different inodes, it is because they
    share the same time granularity.
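
    The granularity point can be illustrated with a small userspace sketch.
    It only shows what truncating a timestamp to a filesystem's granularity
    means; trunc_to_granularity() is a hypothetical name and the sketch
    makes no claim about how current_time() is implemented.

```c
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/*
 * Hypothetical helper: round a timestamp down to a granularity given in
 * nanoseconds (1 = full nanosecond resolution, 1000000000 = whole seconds).
 */
static struct timespec trunc_to_granularity(struct timespec t, long gran_ns)
{
    if (gran_ns >= NSEC_PER_SEC)
        t.tv_nsec = 0;                      /* whole-second timestamps */
    else if (gran_ns > 1)
        t.tv_nsec -= t.tv_nsec % gran_ns;   /* drop the finer digits */
    /* gran_ns <= 1: keep the timestamp as it is */
    return t;
}
```

    A filesystem that stores microseconds would use a granularity of 1000,
    so a timestamp ending in 123456789 ns is recorded as 123456000 ns.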

    Signed-off-by: Deepa Dinamani
    Reviewed-by: Arnd Bergmann
    Acked-by: Felipe Balbi
    Acked-by: Steven Whitehouse
    Acked-by: Ryusuke Konishi
    Acked-by: David Sterba
    Signed-off-by: Al Viro

    Deepa Dinamani
     

20 Sep, 2016

1 commit

  • Right now LSM_AUDIT_DATA_PATH type contains "struct path" in union "u"
    of common_audit_data. This information is used to print the path of the
    file and also to get to the dentry and inode. The inode information is
    in turn used to get to the superblock and device and to print device
    information.

    This does not work well for layered filesystems like overlay where dentry
    contained in path is overlay dentry and not the real dentry of underlying
    file system. That means inode retrieved from dentry is also overlay
    inode and not the real inode.

    SELinux helpers like file_path_has_perm() are doing checks on inode
    retrieved from file_inode(). This returns the real inode and not the
    overlay inode. That means we are doing check on real inode but for audit
    purposes we are printing details of overlay inode and that can be
    confusing while debugging.

    Hence, introduce a new type LSM_AUDIT_DATA_FILE which carries file
    information, and retrieve the real inode using file_inode(). That
    way the correct AVC denial information is given to the user.

    For example, the following is an AVC denial before the patch.

    type=AVC msg=audit(1473360868.399:214): avc: denied { read open } for
    pid=1765 comm="cat"
    path="/root/.../overlay/container1/merged/readfile"
    dev="overlay" ino=21443
    scontext=unconfined_u:unconfined_r:test_overlay_client_t:s0:c10,c20
    tcontext=unconfined_u:object_r:test_overlay_files_ro_t:s0
    tclass=file permissive=0

    It looks as follows after the patch.

    type=AVC msg=audit(1473360017.388:282): avc: denied { read open } for
    pid=2530 comm="cat"
    path="/root/.../overlay/container1/merged/readfile"
    dev="dm-0" ino=2377915
    scontext=unconfined_u:unconfined_r:test_overlay_client_t:s0:c10,c20
    tcontext=unconfined_u:object_r:test_overlay_files_ro_t:s0
    tclass=file permissive=0

    Notice that now dev information points to "dm-0" device instead of
    "overlay" device. This makes it clear that check failed on underlying
    inode and not on the overlay inode.

    Signed-off-by: Vivek Goyal
    [PM: slight tweaks to the description to make checkpatch.pl happy]
    Signed-off-by: Paul Moore

    Vivek Goyal
     

14 Sep, 2016

1 commit

  • Fix to return error code -EINVAL from the error handling case instead
    of 0 (rc is overwritten to 0 when policyvers >=
    POLICYDB_VERSION_ROLETRANS), as done elsewhere in this function.
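
    The bug pattern is a common one and can be shown in isolation. The
    functions below are hypothetical stand-ins, not the SELinux code: an
    earlier successful step overwrites rc with 0, so a later failure path
    that relies on rc still holding an error returns success.

```c
#include <errno.h>

/* Hypothetical stand-in for an earlier step that succeeds. */
static int earlier_step(void)
{
    return 0;
}

/* Buggy shape: the error path returns whatever rc currently holds. */
static int parse_buggy(int value_is_valid)
{
    int rc = -EINVAL;

    rc = earlier_step();        /* rc is now 0 */
    if (rc)
        goto out;
    if (!value_is_valid)
        goto out;               /* returns 0: the error is lost */
    rc = 0;
out:
    return rc;
}

/* Fixed shape: set the error code again right before the check. */
static int parse_fixed(int value_is_valid)
{
    int rc;

    rc = earlier_step();
    if (rc)
        goto out;
    rc = -EINVAL;
    if (!value_is_valid)
        goto out;               /* now returns -EINVAL */
    rc = 0;
out:
    return rc;
}
```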

    Signed-off-by: Wei Yongjun
    [PM: normalize "selinux" in patch subject, description line wrap]
    Signed-off-by: Paul Moore

    Wei Yongjun
     

31 Aug, 2016

1 commit


30 Aug, 2016

2 commits

  • libsepol pointed out an issue where it's possible to have
    an uninitialized jump and invalid dereference; fix this.
    While we're here, zero allocate all the *_val_to_struct
    structures.

    Signed-off-by: William Roberts
    Signed-off-by: Paul Moore

    William Roberts
     
  • When count is 0 and the highbit is not zero, the ebitmap is not
    valid and the internal node is not allocated. This causes issues
    when routines, like mls_context_isvalid() attempt to use the
    ebitmap_for_each_bit() and ebitmap_node_get_bit() as they assume
    a highbit > 0 will have a node allocated.
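
    The consistency check implied here can be sketched as follows. The
    struct and function are hypothetical simplifications for illustration;
    they are not the kernel's ebitmap types or its reader.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical header fields read from a serialized bitmap. */
struct bitmap_header {
    uint32_t highbit;   /* non-zero when the bitmap claims to hold bits */
    uint32_t count;     /* number of nodes that follow the header */
};

/*
 * A non-zero highbit promises at least one allocated node. Reject a
 * header that makes that promise but announces no nodes, so later
 * iteration code never dereferences a missing node.
 */
static bool bitmap_header_is_valid(const struct bitmap_header *h)
{
    if (h->highbit != 0 && h->count == 0)
        return false;
    return true;
}
```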

    Signed-off-by: William Roberts
    Signed-off-by: Paul Moore

    William Roberts
     

19 Aug, 2016

1 commit

  • Remove the SECURITY_SELINUX_POLICYDB_VERSION_MAX Kconfig option

    Per: https://github.com/SELinuxProject/selinux/wiki/Kernel-Todo

    This was only needed on Fedora 3 and 4 and just causes issues now,
    so drop it.

    The MAX and MIN should just be whatever the kernel can support.

    Signed-off-by: William Roberts
    Signed-off-by: Paul Moore

    William Roberts
     

10 Aug, 2016

1 commit

  • Calculate what would be the label of newly created file and set that
    secid in the passed creds.

    Context of the task which is actually creating file is retrieved from
    set of creds passed in. (old->security).

    Signed-off-by: Vivek Goyal
    Acked-by: Stephen Smalley
    Signed-off-by: Paul Moore

    Vivek Goyal
     

09 Aug, 2016

4 commits

  • Right now selinux_determine_inode_label() works on security pointer of
    current task. Soon I need this to work on a security pointer retrieved
    from a set of creds. So start passing in a pointer and caller can
    decide where to fetch security pointer from.

    Signed-off-by: Vivek Goyal
    Acked-by: Stephen Smalley
    Signed-off-by: Paul Moore

    Vivek Goyal
     
  • When a file is copied up in overlay, we have already created file on
    upper/ with right label and there is no need to copy up selinux
    label/xattr from lower file to upper file. In fact in case of context
    mount, we don't want to copy up label as newly created file got its label
    from context= option.

    Signed-off-by: Vivek Goyal
    Acked-by: Stephen Smalley
    Signed-off-by: Paul Moore

    Vivek Goyal
     
  • A file is being copied up for overlay file system. Prepare a new set of
    creds and set create_sid appropriately so that new file is created with
    appropriate label.

    Overlay inode has right label for both context and non-context mount
    cases. In case of non-context mount, overlay inode will have the label
    of lower file and in case of context mount, overlay inode will have
    the label from context= mount option.

    Signed-off-by: Vivek Goyal
    Acked-by: Stephen Smalley
    Signed-off-by: Paul Moore

    Vivek Goyal
     
  • The IS_ENABLED() macro checks if a Kconfig symbol has been enabled
    either built-in or as a module; use that macro instead of open-coding
    the same check.
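
    The difference between the two styles can be tried out in userspace.
    The macro bodies below are an illustrative reconstruction of the idea,
    not copied from the kernel headers, and the CONFIG_EXAMPLE_* symbols
    are hypothetical. The sketch assumes the convention that a built-in
    option defines CONFIG_FOO as 1 and a modular one defines
    CONFIG_FOO_MODULE as 1.

```c
/* Hypothetical configuration: one built-in option, one modular option. */
#define CONFIG_EXAMPLE_BUILTIN 1
#define CONFIG_EXAMPLE_MODULAR_MODULE 1

/* Preprocessor trick: yields 1 if x is defined as 1, otherwise 0. */
#define PLACEHOLDER_1 0,
#define TAKE_SECOND(ignored, val, ...) val
#define IS_DEFINED_3(arg1_or_junk) TAKE_SECOND(arg1_or_junk 1, 0, 0)
#define IS_DEFINED_2(val) IS_DEFINED_3(PLACEHOLDER_##val)
#define IS_DEFINED(x) IS_DEFINED_2(x)

/* True if the option is built-in or built as a module. */
#define IS_ENABLED(option) \
    (IS_DEFINED(option) || IS_DEFINED(option##_MODULE))

/* The open-coded equivalent that the macro replaces. */
#if defined(CONFIG_EXAMPLE_MODULAR) || defined(CONFIG_EXAMPLE_MODULAR_MODULE)
#define OPEN_CODED_MODULAR 1
#else
#define OPEN_CODED_MODULAR 0
#endif
```

    Unlike the #if defined() form, the macro form can also be used inside
    an ordinary C expression such as if (IS_ENABLED(CONFIG_EXAMPLE_BUILTIN)).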

    Signed-off-by: Javier Martinez Canillas
    Acked-by: Casey Schaufler
    Signed-off-by: Paul Moore

    Javier Martinez Canillas
     

06 Aug, 2016

1 commit

  • Pull qstr constification updates from Al Viro:
    "Fairly self-contained bunch - surprising lot of places passes struct
    qstr * as an argument when const struct qstr * would suffice; it
    complicates analysis for no good reason.

    I'd prefer to feed that separately from the assorted fixes (those are
    in #for-linus and with somewhat trickier topology)"

    * 'work.const-qstr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    qstr: constify instances in adfs
    qstr: constify instances in lustre
    qstr: constify instances in f2fs
    qstr: constify instances in ext2
    qstr: constify instances in vfat
    qstr: constify instances in procfs
    qstr: constify instances in fuse
    qstr constify instances in fs/dcache.c
    qstr: constify instances in nfs
    qstr: constify instances in ocfs2
    qstr: constify instances in autofs4
    qstr: constify instances in hfs
    qstr: constify instances in hfsplus
    qstr: constify instances in logfs
    qstr: constify dentry_init_security

    Linus Torvalds
     

30 Jul, 2016

1 commit

  • Pull security subsystem updates from James Morris:
    "Highlights:

    - TPM core and driver updates/fixes
    - IPv6 security labeling (CALIPSO)
    - Lots of Apparmor fixes
    - Seccomp: remove 2-phase API, close hole where ptrace can change
    syscall #"

    * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (156 commits)
    apparmor: fix SECURITY_APPARMOR_HASH_DEFAULT parameter handling
    tpm: Add TPM 2.0 support to the Nuvoton i2c driver (NPCT6xx family)
    tpm: Factor out common startup code
    tpm: use devm_add_action_or_reset
    tpm2_i2c_nuvoton: add irq validity check
    tpm: read burstcount from TPM_STS in one 32-bit transaction
    tpm: fix byte-order for the value read by tpm2_get_tpm_pt
    tpm_tis_core: convert max timeouts from msec to jiffies
    apparmor: fix arg_size computation for when setprocattr is null terminated
    apparmor: fix oops, validate buffer size in apparmor_setprocattr()
    apparmor: do not expose kernel stack
    apparmor: fix module parameters can be changed after policy is locked
    apparmor: fix oops in profile_unpack() when policy_db is not present
    apparmor: don't check for vmalloc_addr if kvzalloc() failed
    apparmor: add missing id bounds check on dfa verification
    apparmor: allow SYS_CAP_RESOURCE to be sufficient to prlimit another task
    apparmor: use list_next_entry instead of list_entry_next
    apparmor: fix refcount race when finding a child profile
    apparmor: fix ref count leak when profile sha1 hash is read
    apparmor: check that xindex is in trans_table bounds
    ...

    Linus Torvalds
     

21 Jul, 2016

1 commit


07 Jul, 2016

1 commit


28 Jun, 2016

6 commits


25 Jun, 2016

1 commit

  • Security labels from unprivileged mounts in user namespaces must
    be ignored. Force superblocks from user namespaces whose labeling
    behavior is to use xattrs to use mountpoint labeling instead.
    For the mountpoint label, default to converting the current task
    context into a form suitable for file objects, but also allow the
    policy writer to specify a different label through policy
    transition rules.

    Pieced together from code snippets provided by Stephen Smalley.

    Signed-off-by: Seth Forshee
    Acked-by: Stephen Smalley
    Acked-by: James Morris
    Signed-off-by: Eric W. Biederman

    Seth Forshee
     

24 Jun, 2016

1 commit

  • If a process gets access to a mount from a different user
    namespace, that process should not be able to take advantage of
    setuid files or selinux entrypoints from that filesystem. Prevent
    this by treating mounts from other mount namespaces and those not
    owned by current_user_ns() or an ancestor as nosuid.

    This will make it safer to allow more complex filesystems to be
    mounted in non-root user namespaces.

    This does not remove the need for MNT_LOCK_NOSUID. The setuid,
    setgid, and file capability bits can no longer be abused if code in
    a user namespace were to clear nosuid on an untrusted filesystem,
    but this patch, by itself, is insufficient to protect the system
    from abuse of files that, when execed, would increase MAC privilege.

    As a more concrete explanation, any task that can manipulate a
    vfsmount associated with a given user namespace already has
    capabilities in that namespace and all of its descendants. If they
    can cause a malicious setuid, setgid, or file-caps executable to
    appear in that mount, then that executable will only allow them to
    elevate privileges in exactly the set of namespaces in which they
    are already privileged.

    On the other hand, if they can cause a malicious executable to
    appear with a dangerous MAC label, running it could change the
    caller's security context in a way that should not have been
    possible, even inside the namespace in which the task is confined.

    As a hardening measure, this would have made CVE-2014-5207 much
    more difficult to exploit.

    Signed-off-by: Andy Lutomirski
    Signed-off-by: Seth Forshee
    Acked-by: James Morris
    Acked-by: Serge Hallyn
    Signed-off-by: Eric W. Biederman

    Andy Lutomirski
     

16 Jun, 2016

1 commit

  • avc_cache_threshold is of type unsigned int. Do not use a signed
    new_value in sscanf(page, "%u", &new_value).
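
    A minimal userspace sketch of the corrected pattern follows. The helper
    name parse_threshold() is hypothetical; the point is only that the
    variable passed for a %u conversion is declared unsigned int.

```c
#include <stdio.h>

/* Hypothetical helper: parse a decimal threshold from a text buffer. */
static int parse_threshold(const char *page, unsigned int *out)
{
    unsigned int new_value;     /* unsigned, matching the %u conversion */

    if (sscanf(page, "%u", &new_value) != 1)
        return -1;              /* no number found */
    *out = new_value;
    return 0;
}
```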

    Signed-off-by: Heinrich Schuchardt
    [PM: subject prefix fix, description cleanup]
    Signed-off-by: Paul Moore

    Heinrich Schuchardt
     

09 Jun, 2016

1 commit


01 Jun, 2016

1 commit

  • The current bounds checking of both source and target types
    requires allowing any domain that has access to the child
    domain to also have the same permissions to the parent, which
    is undesirable. Drop the target bounds checking.

    KaiGai Kohei originally removed all use of target bounds in
    commit 7d52a155e38d ("selinux: remove dead code in
    type_attribute_bounds_av()") but this was reverted in
    commit 2ae3ba39389b ("selinux: libsepol: remove dead code in
    check_avtab_hierarchy_callback()") because it would have
    required explicitly allowing the parent any permissions
    to the child that the child is allowed to itself.

    This change in contrast retains the logic for the case where both
    source and target types are bounded, thereby allowing access
    if the parent of the source is allowed the corresponding
    permissions to the parent of the target. Further, this change
    reworks the logic such that we only perform a single computation
    for each case and there is no ambiguity as to how to resolve
    a bounds violation.

    Under the new logic, if the source type and target types are both
    bounded, then the parent of the source type must be allowed the same
    permissions to the parent of the target type. If only the source
    type is bounded, then the parent of the source type must be allowed
    the same permissions to the target type.

    Examples of the new logic and comparisons with the old logic:
    1. If we have:
    typebounds A B;
    then:
    allow B self:process <permissions>;
    will satisfy the bounds constraint iff:
    allow A self:process <permissions>;
    is also allowed in policy.

    Under the old logic, the allow rule on B satisfies the
    bounds constraint if any of the following three are allowed:
    allow A B:process <permissions>; or
    allow B A:process <permissions>; or
    allow A self:process <permissions>;
    However, either of the first two ultimately require the third to
    satisfy the bounds constraint under the old logic, and therefore
    this degenerates to the same result (but is more efficient - we only
    need to perform one compute_av call).

    2. If we have:
    typebounds A B;
    typebounds A_exec B_exec;
    then:
    allow B B_exec:file <permissions>;
    will satisfy the bounds constraint iff:
    allow A A_exec:file <permissions>;
    is also allowed in policy.

    This is essentially the same as #1; it is merely included as
    an example of dealing with object types related to a bounded domain
    in a manner that satisfies the bounds relationship. Note that
    this approach is preferable to leaving B_exec unbounded and having:
    allow A B_exec:file <permissions>;
    in policy because that would allow B's entrypoints to be used to
    enter A. Similarly for _tmp or other related types.

    3. If we have:
    typebounds A B;
    and an unbounded type T, then:
    allow B T:file <permissions>;
    will satisfy the bounds constraint iff:
    allow A T:file <permissions>;
    is allowed in policy.

    The old logic would have been identical for this example.

    4. If we have:
    typebounds A B;
    and an unbounded domain D, then:
    allow D B:unix_stream_socket <permissions>;
    is not subject to any bounds constraints under the new logic
    because D is not bounded. This is desirable so that we can
    allow a domain to e.g. connectto a child domain without having
    to allow it to do the same to its parent.

    The old logic would have required:
    allow D A:unix_stream_socket <permissions>;
    to also be allowed in policy.
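
    The new logic in the examples above can be modelled with a toy table.
    This is a sketch under simplifying assumptions, not the kernel's
    implementation: it uses a single object class, represents permissions
    as a bitmask, and all names (bounds, allowed, bounded_perms) are
    hypothetical.

```c
#include <stdint.h>

#define NTYPES 4
#define NO_PARENT (-1)

/* Hypothetical type ids matching the letters used in the examples. */
enum { TYPE_A, TYPE_B, TYPE_T, TYPE_D };

/* bounds[t] is the parent type that bounds t, or NO_PARENT. */
static const int bounds[NTYPES] = {
    [TYPE_A] = NO_PARENT, [TYPE_B] = TYPE_A,
    [TYPE_T] = NO_PARENT, [TYPE_D] = NO_PARENT,
};

/* allowed[src][tgt] is the permission bitmask granted by policy. */
static uint32_t allowed[NTYPES][NTYPES];

/*
 * Return the subset of 'requested' that survives the bounds check.
 * Only a bounded source is constrained. If the target is bounded too,
 * compare against parent(source) -> parent(target); otherwise compare
 * against parent(source) -> target.
 */
static uint32_t bounded_perms(int src, int tgt, uint32_t requested)
{
    int check_tgt;

    if (bounds[src] == NO_PARENT)
        return requested;       /* example 4: unbounded source */

    check_tgt = (bounds[tgt] != NO_PARENT) ? bounds[tgt] : tgt;
    return requested & allowed[bounds[src]][check_tgt];
}
```

    In this model, a rule from B to itself is limited by what A is allowed
    to itself (example 1), a rule from B to T is limited by what A is
    allowed to T (example 3), and a rule from D to B is not limited at all
    (example 4).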

    Signed-off-by: Stephen Smalley
    [PM: re-wrapped description to appease checkpatch.pl]
    Signed-off-by: Paul Moore

    Stephen Smalley
     

20 May, 2016

1 commit

  • Pull security subsystem updates from James Morris:
    "Highlights:

    - A new LSM, "LoadPin", from Kees Cook is added, which allows forcing
    of modules and firmware to be loaded from a specific device (this
    is from ChromeOS, where the device as a whole is verified
    cryptographically via dm-verity).

    This is disabled by default but can be configured to be enabled by
    default (don't do this if you don't know what you're doing).

    - Keys: allow authentication data to be stored in an asymmetric key.
    Lots of general fixes and updates.

    - SELinux: add restrictions for loading of kernel modules via
    finit_module(). Distinguish non-init user namespace capability
    checks. Apply execstack check on thread stacks"

    * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (48 commits)
    LSM: LoadPin: provide enablement CONFIG
    Yama: use atomic allocations when reporting
    seccomp: Fix comment typo
    ima: add support for creating files using the mknodat syscall
    ima: fix ima_inode_post_setattr
    vfs: forbid write access when reading a file into memory
    fs: fix over-zealous use of "const"
    selinux: apply execstack check on thread stacks
    selinux: distinguish non-init user namespace capability checks
    LSM: LoadPin for kernel file loading restrictions
    fs: define a string representation of the kernel_read_file_id enumeration
    Yama: consolidate error reporting
    string_helpers: add kstrdup_quotable_file
    string_helpers: add kstrdup_quotable_cmdline
    string_helpers: add kstrdup_quotable
    selinux: check ss_initialized before revalidating an inode label
    selinux: delay inode label lookup as long as possible
    selinux: don't revalidate an inode's label when explicitly setting it
    selinux: Change bool variable name to index.
    KEYS: Add KEYCTL_DH_COMPUTE command
    ...

    Linus Torvalds
     

18 May, 2016

2 commits

  • Pull networking updates from David Miller:
    "Highlights:

    1) Support SPI based w5100 devices, from Akinobu Mita.

    2) Partial Segmentation Offload, from Alexander Duyck.

    3) Add GMAC4 support to stmmac driver, from Alexandre TORGUE.

    4) Allow cls_flower stats offload, from Amir Vadai.

    5) Implement bpf blinding, from Daniel Borkmann.

    6) Optimize _ASYNC_ bit twiddling on sockets, unless the socket is
    actually using FASYNC these atomics are superfluous. From Eric
    Dumazet.

    7) Run TCP more preemptibly, also from Eric Dumazet.

    8) Support LED blinking, EEPROM dumps, and rxvlan offloading in mlx5e
    driver, from Gal Pressman.

    9) Allow creating ppp devices via rtnetlink, from Guillaume Nault.

    10) Improve BPF usage documentation, from Jesper Dangaard Brouer.

    11) Support tunneling offloads in qed, from Manish Chopra.

    12) aRFS offloading in mlx5e, from Maor Gottlieb.

    13) Add RFS and RPS support to SCTP protocol, from Marcelo Ricardo
    Leitner.

    14) Add MSG_EOR support to TCP, this allows controlling packet
    coalescing on application record boundaries for more accurate
    socket timestamp sampling. From Martin KaFai Lau.

    15) Fix alignment of 64-bit netlink attributes across the board, from
    Nicolas Dichtel.

    16) Per-vlan stats in bridging, from Nikolay Aleksandrov.

    17) Several conversions of drivers to ethtool ksettings, from Philippe
    Reynes.

    18) Checksum neutral ILA in ipv6, from Tom Herbert.

    19) Factorize all of the various marvell dsa drivers into one, from
    Vivien Didelot.

    20) Add VF support to qed driver, from Yuval Mintz"

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1649 commits)
    Revert "phy dp83867: Fix compilation with CONFIG_OF_MDIO=m"
    Revert "phy dp83867: Make rgmii parameters optional"
    r8169: default to 64-bit DMA on recent PCIe chips
    phy dp83867: Make rgmii parameters optional
    phy dp83867: Fix compilation with CONFIG_OF_MDIO=m
    bpf: arm64: remove callee-save registers use for tmp registers
    asix: Fix offset calculation in asix_rx_fixup() causing slow transmissions
    switchdev: pass pointer to fib_info instead of copy
    net_sched: close another race condition in tcf_mirred_release()
    tipc: fix nametable publication field in nl compat
    drivers: net: Don't print unpopulated net_device name
    qed: add support for dcbx.
    ravb: Add missing free_irq() calls to ravb_close()
    qed: Remove a stray tab
    net: ethernet: fec-mpc52xx: use phy_ethtool_{get|set}_link_ksettings
    net: ethernet: fec-mpc52xx: use phydev from struct net_device
    bpf, doc: fix typo on bpf_asm descriptions
    stmmac: hardware TX COE doesn't work when force_thresh_dma_mode is set
    net: ethernet: fs-enet: use phy_ethtool_{get|set}_link_ksettings
    net: ethernet: fs-enet: use phydev from struct net_device
    ...

    Linus Torvalds
     
  • Pull 'struct path' constification update from Al Viro:
    "'struct path' is passed by reference to a bunch of Linux security
    methods; in theory, there's nothing to stop them from modifying the
    damn thing and LSM community being what it is, sooner or later some
    enterprising soul is going to decide that it's a good idea.

    Let's remove the temptation and constify all of those..."

    * 'work.const-path' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    constify ima_d_path()
    constify security_sb_pivotroot()
    constify security_path_chroot()
    constify security_path_{link,rename}
    apparmor: remove useless checks for NULL ->mnt
    constify security_path_{mkdir,mknod,symlink}
    constify security_path_{unlink,rmdir}
    apparmor: constify common_perm_...()
    apparmor: constify aa_path_link()
    apparmor: new helper - common_path_perm()
    constify chmod_common/security_path_chmod
    constify security_sb_mount()
    constify chown_common/security_path_chown
    tomoyo: constify assorted struct path *
    apparmor_path_truncate(): path->mnt is never NULL
    constify vfs_truncate()
    constify security_path_truncate()
    [apparmor] constify struct path * in a bunch of helpers

    Linus Torvalds
     

27 Apr, 2016

2 commits

  • The execstack check was only being applied on the main
    process stack. Thread stacks allocated via mmap were
    only subject to the execmem permission check. Augment
    the check to apply to the current thread stack as well.
    Note that this does NOT prevent making a different thread's
    stack executable.
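
    As a rough userspace illustration of the operation this check covers,
    the sketch below asks for PROT_EXEC on the stack page of the calling
    thread. The helper name try_exec_current_stack() is hypothetical and
    not part of the patch. The assumption is that, under an SELinux policy
    that does not grant execstack, the mprotect() call fails (typically
    with EACCES), while on systems without such a policy it may simply
    succeed.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): try to make the page that
 * holds a local variable of the calling thread executable.
 * Returns 0 if mprotect() succeeded, otherwise the errno value it set.
 */
static int try_exec_current_stack(void)
{
    char probe = 0;                     /* lives on the current thread's stack */
    long page = sysconf(_SC_PAGESIZE);
    uintptr_t base;

    if (page <= 0)
        return EINVAL;

    /* Round the address of the local variable down to its page boundary. */
    base = (uintptr_t)&probe & ~((uintptr_t)page - 1);

    if (mprotect((void *)base, (size_t)page,
                 PROT_READ | PROT_WRITE | PROT_EXEC) != 0)
        return errno;                   /* e.g. EACCES when policy denies it */

    /* Put the page back to read/write so the probe leaves no side effect. */
    (void)mprotect((void *)base, (size_t)page, PROT_READ | PROT_WRITE);
    return 0;
}
```

    Calling the helper from a thread created with pthread_create() would
    probe that thread's own stack, which is the case the commit adds.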

    Suggested-by: Nick Kralevich
    Acked-by: Nick Kralevich
    Signed-off-by: Stephen Smalley
    Signed-off-by: Paul Moore

    Stephen Smalley
     
  • Distinguish capability checks against a target associated
    with the init user namespace versus capability checks against
    a target associated with a non-init user namespace by defining
    and using separate security classes for the latter.

    This is needed to support e.g. Chrome usage of user namespaces
    for the Chrome sandbox without needing to allow Chrome to also
    exercise capabilities on targets in the init user namespace.

    Suggested-by: Dan Walsh
    Signed-off-by: Stephen Smalley
    Signed-off-by: Paul Moore

    Stephen Smalley
     

21 Apr, 2016

1 commit

  • This patch adds a new RTM_GETSTATS message to query link stats via netlink
    from the kernel. RTM_NEWLINK also dumps stats today, but RTM_NEWLINK
    returns a lot more than just stats and is expensive in some cases when
    frequent polling for stats from userspace is a common operation.

    RTM_GETSTATS is an attempt to provide a lightweight netlink message
    to explicitly query only link stats from the kernel on an interface.
    The idea is to also keep it extensible so that new kinds of stats can be
    added to it in the future.

    This patch adds the following attribute for NETDEV stats:
    struct nla_policy ifla_stats_policy[IFLA_STATS_MAX + 1] = {
    [IFLA_STATS_LINK_64] = { .len = sizeof(struct rtnl_link_stats64) },
    };

    Like any other rtnetlink message, RTM_GETSTATS can be used to get stats of
    a single interface or all interfaces with NLM_F_DUMP.

    Future possible new types of stat attributes:
    link af stats:
    - IFLA_STATS_LINK_IPV6 (nested. for ipv6 stats)
    - IFLA_STATS_LINK_MPLS (nested. for mpls/mdev stats)
    extended stats:
    - IFLA_STATS_LINK_EXTENDED (nested. extended software netdev stats like bridge,
    vlan, vxlan etc)
    - IFLA_STATS_LINK_HW_EXTENDED (nested. extended hardware stats which are
    available via ethtool today)

    This patch also declares a filter mask for all stat attributes.
    The user has to provide a mask of the stats attributes to query. The filter
    mask is specified in the new header 'struct if_stats_msg' for stats messages.
    The other important field in the header is the ifindex.

    This API can also include attributes for global stats (e.g. TCP) in the future.
    When global stats are included in a stats msg, the ifindex in the header
    must be zero. A single stats message cannot contain both global and
    netdev-specific stats. To easily distinguish them, the names of
    netdev-specific stat attributes are prefixed with IFLA_STATS_LINK_.

    Without any attributes in the filter_mask, no stats will be returned.
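
    The request described above can be sketched from userspace as follows.
    This is a minimal sketch, assuming kernel headers new enough to define
    RTM_GETSTATS, struct if_stats_msg and IFLA_STATS_FILTER_BIT(). The
    struct stats_req wrapper and the build_stats_req() helper are
    hypothetical names used only for illustration. The code fills in a
    request buffer and does not send it.

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <linux/if_link.h>

/* Hypothetical wrapper: netlink header followed by the stats header. */
struct stats_req {
    struct nlmsghdr nlh;
    struct if_stats_msg ifsm;
};

/*
 * Hypothetical helper: fill in an RTM_GETSTATS request.
 * A non-zero ifindex asks for one interface; ifindex 0 turns the
 * request into a dump over all interfaces via NLM_F_DUMP.
 */
static void build_stats_req(struct stats_req *req, uint32_t ifindex,
                            uint32_t seq)
{
    memset(req, 0, sizeof(*req));

    req->nlh.nlmsg_len = NLMSG_LENGTH(sizeof(struct if_stats_msg));
    req->nlh.nlmsg_type = RTM_GETSTATS;
    req->nlh.nlmsg_flags = NLM_F_REQUEST | (ifindex ? 0 : NLM_F_DUMP);
    req->nlh.nlmsg_seq = seq;

    req->ifsm.family = AF_UNSPEC;
    req->ifsm.ifindex = ifindex;
    /* An empty mask returns no stats, so request the 64-bit link stats. */
    req->ifsm.filter_mask = IFLA_STATS_FILTER_BIT(IFLA_STATS_LINK_64);
}
```

    A caller would write the filled buffer to a NETLINK_ROUTE socket and
    parse the IFLA_STATS_LINK_64 attribute of each reply as a
    struct rtnl_link_stats64.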

    This patch has been tested with a modified iproute2 ifstat.

    Suggested-by: Jamal Hadi Salim
    Signed-off-by: Roopa Prabhu
    Signed-off-by: David S. Miller

    Roopa Prabhu