09 Jul, 2019

9 commits

  • …iederm/user-namespace

    Pull force_sig() argument change from Eric Biederman:
    "A source of error over the years has been that force_sig has taken a
    task parameter when it is only safe to use force_sig with the current
    task.

    The force_sig function is built for delivering synchronous signals
    such as SIGSEGV where the userspace application caused a synchronous
    fault (such as a page fault) and the kernel responded with a signal.

    Because the name force_sig does not make this clear, and because the
    force_sig takes a task parameter the function force_sig has been
    abused for sending other kinds of signals over the years. Slowly those
    have been fixed when the oopses have been tracked down.

    This set of changes fixes the remaining abusers of force_sig and
    carefully rips out the task parameter from force_sig and friends
    making this kind of error almost impossible in the future"

    * 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (27 commits)
    signal/x86: Move tsk inside of CONFIG_MEMORY_FAILURE in do_sigbus
    signal: Remove the signal number and task parameters from force_sig_info
    signal: Factor force_sig_info_to_task out of force_sig_info
    signal: Generate the siginfo in force_sig
    signal: Move the computation of force into send_signal and correct it.
    signal: Properly set TRACE_SIGNAL_LOSE_INFO in __send_signal
    signal: Remove the task parameter from force_sig_fault
    signal: Use force_sig_fault_to_task for the two calls that don't deliver to current
    signal: Explicitly call force_sig_fault on current
    signal/unicore32: Remove tsk parameter from __do_user_fault
    signal/arm: Remove tsk parameter from __do_user_fault
    signal/arm: Remove tsk parameter from ptrace_break
    signal/nds32: Remove tsk parameter from send_sigtrap
    signal/riscv: Remove tsk parameter from do_trap
    signal/sh: Remove tsk parameter from force_sig_info_fault
    signal/um: Remove task parameter from send_sigtrap
    signal/x86: Remove task parameter from send_sigtrap
    signal: Remove task parameter from force_sig_mceerr
    signal: Remove task parameter from force_sig
    signal: Remove task parameter from force_sigsegv
    ...

    Linus Torvalds
     
  • Pull cgroup updates from Tejun Heo:
    "Documentation updates and the addition of cgroup_parse_float() which
    will be used by new controllers including blk-iocost"

    * 'for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
    docs: cgroup-v1: convert docs to ReST and rename to *.rst
    cgroup: Move cgroup_parse_float() implementation out of CONFIG_SYSFS
    cgroup: add cgroup_parse_float()

    Linus Torvalds
     
  • Pull integrity updates from Mimi Zohar:
    "Bug fixes, code clean up, and new features:

    - IMA policy rules can be defined in terms of LSM labels, making the
    IMA policy dependent on LSM policy label changes, in particular LSM
    label deletions. The new environment, in which IMA-appraisal is
    being used, frequently updates the LSM policy and permits LSM label
    deletions.

    - Prevent an mmap'ed shared file opened for write from also being
    mmap'ed execute. In the long term, making this and other similar
    changes at the VFS layer would be preferable.

    - The IMA per policy rule template format support is needed for a
    couple of new/proposed features (eg. kexec boot command line
    measurement, appended signatures, and VFS provided file hashes).

    - Other than the "boot-aggregate" record in the IMA measuremeent
    list, all other measurements are of file data. Measuring and
    storing the kexec boot command line in the IMA measurement list is
    the first buffer based measurement included in the measurement
    list"

    * 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
    integrity: Introduce struct evm_xattr
    ima: Update MAX_TEMPLATE_NAME_LEN to fit largest reasonable definition
    KEXEC: Call ima_kexec_cmdline to measure the boot command line args
    IMA: Define a new template field buf
    IMA: Define a new hook to measure the kexec boot command line arguments
    IMA: support for per policy rule template formats
    integrity: Fix __integrity_init_keyring() section mismatch
    ima: Use designated initializers for struct ima_event_data
    ima: use the lsm policy update notifier
    LSM: switch to blocking policy update notifiers
    x86/ima: fix the Kconfig dependency for IMA_ARCH_POLICY
    ima: Make arch_policy_entry static
    ima: prevent a file already mmap'ed write to be mmap'ed execute
    x86/ima: check EFI SetupMode too

    Linus Torvalds
     
  • Pull keyring ACL support from David Howells:
    "This changes the permissions model used by keys and keyrings to be
    based on an internal ACL by the following means:

    - Replace the permissions mask internally with an ACL that contains a
    list of ACEs, each with a specific subject with a permissions mask.
    Potted default ACLs are available for new keys and keyrings.

    ACE subjects can be macroised to indicate the UID and GID specified
    on the key (which remain). Future commits will be able to add
    additional subject types, such as specific UIDs or domain
    tags/namespaces.

    Also split a number of permissions to give finer control. Examples
    include splitting the revocation permit from the change-attributes
    permit, thereby allowing someone to be granted permission to revoke
    a key without allowing them to change the owner; also the ability
    to join a keyring is split from the ability to link to it, thereby
    stopping a process accessing a keyring by joining it and thus
    acquiring use of possessor permits.

    - Provide a keyctl to allow the granting or denial of one or more
    permits to a specific subject. Direct access to the ACL is not
    granted, and the ACL cannot be viewed"

    * tag 'keys-acl-20190703' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
    keys: Provide KEYCTL_GRANT_PERMISSION
    keys: Replace uid/gid/perm permissions checking with an ACL

    Linus Torvalds
     
  • …/git/dhowells/linux-fs

    Pull keyring namespacing from David Howells:
    "These patches help make keys and keyrings more namespace aware.

    Firstly some miscellaneous patches to make the process easier:

    - Simplify key index_key handling so that the word-sized chunks
    assoc_array requires don't have to be shifted about, making it
    easier to add more bits into the key.

    - Cache the hash value in the key so that we don't have to calculate
    on every key we examine during a search (it involves a bunch of
    multiplications).

    - Allow keying_search() to search non-recursively.

    Then the main patches:

    - Make it so that keyring names are per-user_namespace from the point
    of view of KEYCTL_JOIN_SESSION_KEYRING so that they're not
    accessible cross-user_namespace.

    keyctl_capabilities() shows KEYCTL_CAPS1_NS_KEYRING_NAME for this.

    - Move the user and user-session keyrings to the user_namespace
    rather than the user_struct. This prevents them propagating
    directly across user_namespaces boundaries (ie. the KEY_SPEC_*
    flags will only pick from the current user_namespace).

    - Make it possible to include the target namespace in which the key
    shall operate in the index_key. This will allow the possibility of
    multiple keys with the same description, but different target
    domains to be held in the same keyring.

    keyctl_capabilities() shows KEYCTL_CAPS1_NS_KEY_TAG for this.

    - Make it so that keys are implicitly invalidated by removal of a
    domain tag, causing them to be garbage collected.

    - Institute a network namespace domain tag that allows keys to be
    differentiated by the network namespace in which they operate. New
    keys that are of a type marked 'KEY_TYPE_NET_DOMAIN' are assigned
    the network domain in force when they are created.

    - Make it so that the desired network namespace can be handed down
    into the request_key() mechanism. This allows AFS, NFS, etc. to
    request keys specific to the network namespace of the superblock.

    This also means that the keys in the DNS record cache are
    thenceforth namespaced, provided network filesystems pass the
    appropriate network namespace down into dns_query().

    For DNS, AFS and NFS are good, whilst CIFS and Ceph are not. Other
    cache keyrings, such as idmapper keyrings, also need to set the
    domain tag - for which they need access to the network namespace of
    the superblock"

    * tag 'keys-namespace-20190627' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
    keys: Pass the network namespace into request_key mechanism
    keys: Network namespace domain tag
    keys: Garbage collect keys for which the domain has been removed
    keys: Include target namespace in match criteria
    keys: Move the user and user-session keyrings to the user_namespace
    keys: Namespace keyring names
    keys: Add a 'recurse' flag for keyring searches
    keys: Cache the hash value to avoid lots of recalculation
    keys: Simplify key description management

    Linus Torvalds
     
  • Pull request_key improvements from David Howells:
    "These are all request_key()-related, including a fix and some improvements:

    - Fix the lack of a Link permission check on a key found by
    request_key(), thereby enabling request_key() to link keys that
    don't grant this permission to the target keyring (which must still
    grant Write permission).

    Note that the key must be in the caller's keyrings already to be
    found.

    - Invalidate used request_key authentication keys rather than
    revoking them, so that they get cleaned up immediately rather than
    hanging around till the expiry time is passed.

    - Move the RCU locks outwards from the keyring search functions so
    that a request_key_rcu() can be provided. This can be called in RCU
    mode, so it can't sleep and can't upcall - but it can be called
    from LOOKUP_RCU pathwalk mode.

    - Cache the latest positive result of request_key*() temporarily in
    task_struct so that filesystems that make a lot of request_key()
    calls during pathwalk can take advantage of it to avoid having to
    redo the searching. This requires CONFIG_KEYS_REQUEST_CACHE=y.

    It is assumed that the key just found is likely to be used multiple
    times in each step in an RCU pathwalk, and is likely to be reused
    for the next step too.

    Note that the cleanup of the cache is done on TIF_NOTIFY_RESUME,
    just before userspace resumes, and on exit"

    * tag 'keys-request-20190626' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
    keys: Kill off request_key_async{,_with_auxdata}
    keys: Cache result of request_key*() temporarily in task_struct
    keys: Provide request_key_rcu()
    keys: Move the RCU locks outwards from the keyring search functions
    keys: Invalidate used request_key authentication keys
    keys: Fix request_key() lack of Link perm check on found key

    Linus Torvalds
     
  • Pull misc keyring updates from David Howells:
    "These are some miscellaneous keyrings fixes and improvements:

    - Fix a bunch of warnings from sparse, including missing RCU bits and
    kdoc-function argument mismatches

    - Implement a keyctl to allow a key to be moved from one keyring to
    another, with the option of prohibiting key replacement in the
    destination keyring.

    - Grant Link permission to possessors of request_key_auth tokens so
    that upcall servicing daemons can more easily arrange things such
    that only the necessary auth key is passed to the actual service
    program, and not all the auth keys a daemon might possesss.

    - Improvement in lookup_user_key().

    - Implement a keyctl to allow keyrings subsystem capabilities to be
    queried.

    The keyutils next branch has commits to make available, document and
    test the move-key and capabilities code:

    https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/keyutils.git/log

    They're currently on the 'next' branch"

    * tag 'keys-misc-20190619' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
    keys: Add capability-checking keyctl function
    keys: Reuse keyring_index_key::desc_len in lookup_user_key()
    keys: Grant Link permission to possessers of request_key auth keys
    keys: Add a keyctl to move a key between keyrings
    keys: Hoist locking out of __key_link_begin()
    keys: Break bits out of key_unlink()
    keys: Change keyring_serialise_link_sem to a mutex
    keys: sparse: Fix kdoc mismatches
    keys: sparse: Fix incorrect RCU accesses
    keys: sparse: Fix key_fs[ug]id_changed()

    Linus Torvalds
     
  • Pull selinux updates from Paul Moore:
    "Like the audit pull request this is a little early due to some
    upcoming vacation plans and uncertain network access while I'm away.
    Also like the audit PR, the list of patches here is pretty minor, the
    highlights include:

    - Explicitly use __le variables to make sure "sparse" can verify
    proper byte endian handling.

    - Remove some BUG_ON()s that are no longer needed.

    - Allow zero-byte writes to the "keycreate" procfs attribute without
    requiring key:create to make it easier for userspace to reset the
    keycreate label.

    - Consistently log the "invalid_context" field as an untrusted string
    in the AUDIT_SELINUX_ERR audit records"

    * tag 'selinux-pr-20190702' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
    selinux: format all invalid context as untrusted
    selinux: fix empty write to keycreate file
    selinux: remove some no-op BUG_ONs
    selinux: provide __le variables explicitly

    Linus Torvalds
     
  • Pull locking updates from Ingo Molnar:
    "The main changes in this cycle are:

    - rwsem scalability improvements, phase #2, by Waiman Long, which are
    rather impressive:

    "On a 2-socket 40-core 80-thread Skylake system with 40 reader
    and writer locking threads, the min/mean/max locking operations
    done in a 5-second testing window before the patchset were:

    40 readers, Iterations Min/Mean/Max = 1,807/1,808/1,810
    40 writers, Iterations Min/Mean/Max = 1,807/50,344/151,255

    After the patchset, they became:

    40 readers, Iterations Min/Mean/Max = 30,057/31,359/32,741
    40 writers, Iterations Min/Mean/Max = 94,466/95,845/97,098"

    There's a lot of changes to the locking implementation that makes
    it similar to qrwlock, including owner handoff for more fair
    locking.

    Another microbenchmark shows how across the spectrum the
    improvements are:

    "With a locking microbenchmark running on 5.1 based kernel, the
    total locking rates (in kops/s) on a 2-socket Skylake system
    with equal numbers of readers and writers (mixed) before and
    after this patchset were:

    # of Threads Before Patch After Patch
    ------------ ------------ -----------
    2 2,618 4,193
    4 1,202 3,726
    8 802 3,622
    16 729 3,359
    32 319 2,826
    64 102 2,744"

    The changes are extensive and the patch-set has been through
    several iterations addressing various locking workloads. There
    might be more regressions, but unless they are pathological I
    believe we want to use this new implementation as the baseline
    going forward.

    - jump-label optimizations by Daniel Bristot de Oliveira: the primary
    motivation was to remove IPI disturbance of isolated RT-workload
    CPUs, which resulted in the implementation of batched jump-label
    updates. Beyond the improvement of the real-time characteristics
    kernel, in one test this patchset improved static key update
    overhead from 57 msecs to just 1.4 msecs - which is a nice speedup
    as well.

    - atomic64_t cross-arch type cleanups by Mark Rutland: over the last
    ~10 years of atomic64_t existence the various types used by the
    APIs only had to be self-consistent within each architecture -
    which means they became wildly inconsistent across architectures.
    Mark puts and end to this by reworking all the atomic64
    implementations to use 's64' as the base type for atomic64_t, and
    to ensure that this type is consistently used for parameters and
    return values in the API, avoiding further problems in this area.

    - A large set of small improvements to lockdep by Yuyang Du: type
    cleanups, output cleanups, function return type and othr cleanups
    all around the place.

    - A set of percpu ops cleanups and fixes by Peter Zijlstra.

    - Misc other changes - please see the Git log for more details"

    * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (82 commits)
    locking/lockdep: increase size of counters for lockdep statistics
    locking/atomics: Use sed(1) instead of non-standard head(1) option
    locking/lockdep: Move mark_lock() inside CONFIG_TRACE_IRQFLAGS && CONFIG_PROVE_LOCKING
    x86/jump_label: Make tp_vec_nr static
    x86/percpu: Optimize raw_cpu_xchg()
    x86/percpu, sched/fair: Avoid local_clock()
    x86/percpu, x86/irq: Relax {set,get}_irq_regs()
    x86/percpu: Relax smp_processor_id()
    x86/percpu: Differentiate this_cpu_{}() and __this_cpu_{}()
    locking/rwsem: Guard against making count negative
    locking/rwsem: Adaptive disabling of reader optimistic spinning
    locking/rwsem: Enable time-based spinning on reader-owned rwsem
    locking/rwsem: Make rwsem->owner an atomic_long_t
    locking/rwsem: Enable readers spinning on writer
    locking/rwsem: Clarify usage of owner's nonspinaable bit
    locking/rwsem: Wake up almost all readers in wait queue
    locking/rwsem: More optimal RT task handling of null owner
    locking/rwsem: Always release wait_lock before waking up tasks
    locking/rwsem: Implement lock handoff to prevent lock starvation
    locking/rwsem: Make rwsem_spin_on_owner() return owner state
    ...

    Linus Torvalds
     

03 Jul, 2019

1 commit

  • Provide a keyctl() operation to grant/remove permissions. The grant
    operation, wrapped by libkeyutils, looks like:

    int ret = keyctl_grant_permission(key_serial_t key,
    enum key_ace_subject_type type,
    unsigned int subject,
    unsigned int perm);

    Where key is the key to be modified, type and subject represent the subject
    to which permission is to be granted (or removed) and perm is the set of
    permissions to be granted. 0 is returned on success. SET_SECURITY
    permission is required for this.

    The subject type currently must be KEY_ACE_SUBJ_STANDARD for the moment
    (other subject types will come along later).

    For subject type KEY_ACE_SUBJ_STANDARD, the following subject values are
    available:

    KEY_ACE_POSSESSOR The possessor of the key
    KEY_ACE_OWNER The owner of the key
    KEY_ACE_GROUP The key's group
    KEY_ACE_EVERYONE Everyone

    perm lists the permissions to be granted:

    KEY_ACE_VIEW Can view the key metadata
    KEY_ACE_READ Can read the key content
    KEY_ACE_WRITE Can update/modify the key content
    KEY_ACE_SEARCH Can find the key by searching/requesting
    KEY_ACE_LINK Can make a link to the key
    KEY_ACE_SET_SECURITY Can set security
    KEY_ACE_INVAL Can invalidate
    KEY_ACE_REVOKE Can revoke
    KEY_ACE_JOIN Can join this keyring
    KEY_ACE_CLEAR Can clear this keyring

    If an ACE already exists for the subject, then the permissions mask will be
    overwritten; if perm is 0, it will be deleted.

    Currently, the internal ACL is limited to a maximum of 16 entries.

    For example:

    int ret = keyctl_grant_permission(key,
    KEY_ACE_SUBJ_STANDARD,
    KEY_ACE_OWNER,
    KEY_ACE_VIEW | KEY_ACE_READ);

    Signed-off-by: David Howells

    David Howells
     

02 Jul, 2019

1 commit

  • The userspace tools expect all fields of the same name to be logged
    consistently with the same encoding. Since the invalid_context fields
    contain untrusted strings in selinux_inode_setxattr()
    and selinux_setprocattr(), encode all instances of this field the same
    way as though they were untrusted even though
    compute_sid_handle_invalid_context() and security_sid_mls_copy() are
    trusted.

    Please see github issue
    https://github.com/linux-audit/audit-kernel/issues/57

    Signed-off-by: Richard Guy Briggs
    Signed-off-by: Paul Moore

    Richard Guy Briggs
     

01 Jul, 2019

3 commits

  • Even though struct evm_ima_xattr_data includes a fixed-size array to hold a
    SHA1 digest, most of the code ignores the array and uses the struct to mean
    "type indicator followed by data of unspecified size" and tracks the real
    size of what the struct represents in a separate length variable.

    The only exception to that is the EVM code, which correctly uses the
    definition of struct evm_ima_xattr_data.

    So make this explicit in the code by removing the length specification from
    the array in struct evm_ima_xattr_data. Also, change the name of the
    element from digest to data since in most places the array doesn't hold a
    digest.

    A separate struct evm_xattr is introduced, with the original definition of
    evm_ima_xattr_data to be used in the places that actually expect that
    definition, specifically the EVM HMAC code.

    Signed-off-by: Thiago Jung Bauermann
    Signed-off-by: Mimi Zohar

    Thiago Jung Bauermann
     
  • MAX_TEMPLATE_NAME_LEN is used when restoring measurements carried over from
    a kexec. It should be set to the length of a template containing all fields
    except for 'd' and 'n', which don't need to be accounted for since they
    shouldn't be defined in the same template description as 'd-ng' and 'n-ng'.

    That length is greater than the current 15, so update using a sizeof() to
    show where the number comes from and also can be visually shown to be
    correct. The sizeof() is calculated at compile time.

    Signed-off-by: Thiago Jung Bauermann
    Signed-off-by: Mimi Zohar

    Thiago Jung Bauermann
     
  • A buffer(kexec boot command line arguments) measured into IMA
    measuremnt list cannot be appraised, without already being
    aware of the buffer contents. Since hashes are non-reversible,
    raw buffer is needed for validation or regenerating hash for
    appraisal/attestation.

    Add support to store/read the buffer contents in HEX.
    The kexec cmdline hash is stored in the "d-ng" field of the
    template data. It can be verified using
    sudo cat /sys/kernel/security/integrity/ima/ascii_runtime_measurements |
    grep kexec-cmdline | cut -d' ' -f 6 | xxd -r -p | sha256sum

    - Add two new fields to ima_event_data to hold the buf and
    buf_len
    - Add a new template field 'buf' to be used to store/read
    the buffer data.
    - Updated process_buffer_meaurement to add the buffer to
    ima_event_data. process_buffer_measurement added in
    "Define a new IMA hook to measure the boot command line
    arguments"
    - Add a new template policy name ima-buf to represent
    'd-ng|n-ng|buf'

    Signed-off-by: Prakhar Srivastava
    Reviewed-by: Roberto Sassu
    Reviewed-by: James Morris
    Signed-off-by: Mimi Zohar

    Prakhar Srivastava
     

28 Jun, 2019

2 commits

  • Replace the uid/gid/perm permissions checking on a key with an ACL to allow
    the SETATTR and SEARCH permissions to be split. This will also allow a
    greater range of subjects to represented.

    ============
    WHY DO THIS?
    ============

    The problem is that SETATTR and SEARCH cover a slew of actions, not all of
    which should be grouped together.

    For SETATTR, this includes actions that are about controlling access to a
    key:

    (1) Changing a key's ownership.

    (2) Changing a key's security information.

    (3) Setting a keyring's restriction.

    And actions that are about managing a key's lifetime:

    (4) Setting an expiry time.

    (5) Revoking a key.

    and (proposed) managing a key as part of a cache:

    (6) Invalidating a key.

    Managing a key's lifetime doesn't really have anything to do with
    controlling access to that key.

    Expiry time is awkward since it's more about the lifetime of the content
    and so, in some ways goes better with WRITE permission. It can, however,
    be set unconditionally by a process with an appropriate authorisation token
    for instantiating a key, and can also be set by the key type driver when a
    key is instantiated, so lumping it with the access-controlling actions is
    probably okay.

    As for SEARCH permission, that currently covers:

    (1) Finding keys in a keyring tree during a search.

    (2) Permitting keyrings to be joined.

    (3) Invalidation.

    But these don't really belong together either, since these actions really
    need to be controlled separately.

    Finally, there are number of special cases to do with granting the
    administrator special rights to invalidate or clear keys that I would like
    to handle with the ACL rather than key flags and special checks.

    ===============
    WHAT IS CHANGED
    ===============

    The SETATTR permission is split to create two new permissions:

    (1) SET_SECURITY - which allows the key's owner, group and ACL to be
    changed and a restriction to be placed on a keyring.

    (2) REVOKE - which allows a key to be revoked.

    The SEARCH permission is split to create:

    (1) SEARCH - which allows a keyring to be search and a key to be found.

    (2) JOIN - which allows a keyring to be joined as a session keyring.

    (3) INVAL - which allows a key to be invalidated.

    The WRITE permission is also split to create:

    (1) WRITE - which allows a key's content to be altered and links to be
    added, removed and replaced in a keyring.

    (2) CLEAR - which allows a keyring to be cleared completely. This is
    split out to make it possible to give just this to an administrator.

    (3) REVOKE - see above.

    Keys acquire ACLs which consist of a series of ACEs, and all that apply are
    unioned together. An ACE specifies a subject, such as:

    (*) Possessor - permitted to anyone who 'possesses' a key
    (*) Owner - permitted to the key owner
    (*) Group - permitted to the key group
    (*) Everyone - permitted to everyone

    Note that 'Other' has been replaced with 'Everyone' on the assumption that
    you wouldn't grant a permit to 'Other' that you wouldn't also grant to
    everyone else.

    Further subjects may be made available by later patches.

    The ACE also specifies a permissions mask. The set of permissions is now:

    VIEW Can view the key metadata
    READ Can read the key content
    WRITE Can update/modify the key content
    SEARCH Can find the key by searching/requesting
    LINK Can make a link to the key
    SET_SECURITY Can change owner, ACL, expiry
    INVAL Can invalidate
    REVOKE Can revoke
    JOIN Can join this keyring
    CLEAR Can clear this keyring

    The KEYCTL_SETPERM function is then deprecated.

    The KEYCTL_SET_TIMEOUT function then is permitted if SET_SECURITY is set,
    or if the caller has a valid instantiation auth token.

    The KEYCTL_INVALIDATE function then requires INVAL.

    The KEYCTL_REVOKE function then requires REVOKE.

    The KEYCTL_JOIN_SESSION_KEYRING function then requires JOIN to join an
    existing keyring.

    The JOIN permission is enabled by default for session keyrings and manually
    created keyrings only.

    ======================
    BACKWARD COMPATIBILITY
    ======================

    To maintain backward compatibility, KEYCTL_SETPERM will translate the
    permissions mask it is given into a new ACL for a key - unless
    KEYCTL_SET_ACL has been called on that key, in which case an error will be
    returned.

    It will convert possessor, owner, group and other permissions into separate
    ACEs, if each portion of the mask is non-zero.

    SETATTR permission turns on all of INVAL, REVOKE and SET_SECURITY. WRITE
    permission turns on WRITE, REVOKE and, if a keyring, CLEAR. JOIN is turned
    on if a keyring is being altered.

    The KEYCTL_DESCRIBE function translates the ACL back into a permissions
    mask to return depending on possessor, owner, group and everyone ACEs.

    It will make the following mappings:

    (1) INVAL, JOIN -> SEARCH

    (2) SET_SECURITY -> SETATTR

    (3) REVOKE -> WRITE if SETATTR isn't already set

    (4) CLEAR -> WRITE

    Note that the value subsequently returned by KEYCTL_DESCRIBE may not match
    the value set with KEYCTL_SETATTR.

    =======
    TESTING
    =======

    This passes the keyutils testsuite for all but a couple of tests:

    (1) tests/keyctl/dh_compute/badargs: The first wrong-key-type test now
    returns EOPNOTSUPP rather than ENOKEY as READ permission isn't removed
    if the type doesn't have ->read(). You still can't actually read the
    key.

    (2) tests/keyctl/permitting/valid: The view-other-permissions test doesn't
    work as Other has been replaced with Everyone in the ACL.

    Signed-off-by: David Howells

    David Howells
     
  • Create a request_key_net() function and use it to pass the network
    namespace domain tag into DNS revolver keys and rxrpc/AFS keys so that keys
    for different domains can coexist in the same keyring.

    Signed-off-by: David Howells
    cc: netdev@vger.kernel.org
    cc: linux-nfs@vger.kernel.org
    cc: linux-cifs@vger.kernel.org
    cc: linux-afs@lists.infradead.org

    David Howells
     

27 Jun, 2019

9 commits

  • Create key domain tags for network namespaces and make it possible to
    automatically tag keys that are used by networked services (e.g. AF_RXRPC,
    AFS, DNS) with the default network namespace if not set by the caller.

    This allows keys with the same description but in different namespaces to
    coexist within a keyring.

    Signed-off-by: David Howells
    cc: netdev@vger.kernel.org
    cc: linux-nfs@vger.kernel.org
    cc: linux-cifs@vger.kernel.org
    cc: linux-afs@lists.infradead.org

    David Howells
     
  • If a key operation domain (such as a network namespace) has been removed
    then attempt to garbage collect all the keys that use it.

    Signed-off-by: David Howells

    David Howells
     
  • Currently a key has a standard matching criteria of { type, description }
    and this is used to only allow keys with unique criteria in a keyring.
    This means, however, that you cannot have keys with the same type and
    description but a different target namespace in the same keyring.

    This is a potential problem for a containerised environment where, say, a
    container is made up of some parts of its mount space involving netfs
    superblocks from two different network namespaces.

    This is also a problem for shared system management keyrings such as the
    DNS records keyring or the NFS idmapper keyring that might contain keys
    from different network namespaces.

    Fix this by including a namespace component in a key's matching criteria.
    Keyring types are marked to indicate which, if any, namespace is relevant
    to keys of that type, and that namespace is set when the key is created
    from the current task's namespace set.

    The capability bit KEYCTL_CAPS1_NS_KEY_TAG is set if the kernel is
    employing this feature.

    Signed-off-by: David Howells

    David Howells
     
  • Move the user and user-session keyrings to the user_namespace struct rather
    than pinning them from the user_struct struct. This prevents these
    keyrings from propagating across user-namespaces boundaries with regard to
    the KEY_SPEC_* flags, thereby making them more useful in a containerised
    environment.

    The issue is that a single user_struct may be represent UIDs in several
    different namespaces.

    The way the patch does this is by attaching a 'register keyring' in each
    user_namespace and then sticking the user and user-session keyrings into
    that. It can then be searched to retrieve them.

    Signed-off-by: David Howells
    cc: Jann Horn

    David Howells
     
  • Keyring names are held in a single global list that any process can pick
    from by means of keyctl_join_session_keyring (provided the keyring grants
    Search permission). This isn't very container friendly, however.

    Make the following changes:

    (1) Make default session, process and thread keyring names begin with a
    '.' instead of '_'.

    (2) Keyrings whose names begin with a '.' aren't added to the list. Such
    keyrings are system specials.

    (3) Replace the global list with per-user_namespace lists. A keyring adds
    its name to the list for the user_namespace that it is currently in.

    (4) When a user_namespace is deleted, it just removes itself from the
    keyring name list.

    The global keyring_name_lock is retained for accessing the name lists.
    This allows (4) to work.

    This can be tested by:

    # keyctl newring foo @s
    995906392
    # unshare -U
    $ keyctl show
    ...
    995906392 --alswrv 65534 65534 \_ keyring: foo
    ...
    $ keyctl session foo
    Joined session keyring: 935622349

    As can be seen, a new session keyring was created.

    The capability bit KEYCTL_CAPS1_NS_KEYRING_NAME is set if the kernel is
    employing this feature.

    Signed-off-by: David Howells
    cc: Eric W. Biederman

    David Howells
     
  • Add a 'recurse' flag for keyring searches so that the flag can be omitted
    and recursion disabled, thereby allowing just the nominated keyring to be
    searched and none of the children.

    Signed-off-by: David Howells

    David Howells
     
  • Cache the hash of the key's type and description in the index key so that
    we're not recalculating it every time we look at a key during a search.
    The hash function does a bunch of multiplications, so evading those is
    probably worthwhile - especially as this is done for every key examined
    during a search.

    This also allows the methods used by assoc_array to get chunks of index-key
    to be simplified.

    Signed-off-by: David Howells

    David Howells
     
  • Simplify key description management by cramming the word containing the
    length with the first few chars of the description also. This simplifies
    the code that generates the index-key used by assoc_array. It should speed
    up key searching a bit too.

    Signed-off-by: David Howells

    David Howells
     
  • Kill off request_key_async{,_with_auxdata}() as they're not currently used.

    Signed-off-by: David Howells

    David Howells
     

24 Jun, 2019

1 commit

  • Currently during soft reboot(kexec_file_load) boot command line
    arguments are not measured. Define hooks needed to measure kexec
    command line arguments during soft reboot(kexec_file_load).

    - A new ima hook ima_kexec_cmdline is defined to be called by the
    kexec code.
    - A new function process_buffer_measurement is defined to measure
    the buffer hash into the IMA measurement list.
    - A new func policy KEXEC_CMDLINE is defined to control the
    measurement.

    Signed-off-by: Prakhar Srivastava
    Signed-off-by: Mimi Zohar

    Prakhar Srivastava
     

22 Jun, 2019

1 commit

  • Pull still more SPDX updates from Greg KH:
    "Another round of SPDX updates for 5.2-rc6

    Here is what I am guessing is going to be the last "big" SPDX update
    for 5.2. It contains all of the remaining GPLv2 and GPLv2+ updates
    that were "easy" to determine by pattern matching. The ones after this
    are going to be a bit more difficult and the people on the spdx list
    will be discussing them on a case-by-case basis now.

    Another 5000+ files are fixed up, so our overall totals are:
    Files checked: 64545
    Files with SPDX: 45529

    Compared to the 5.1 kernel which was:
    Files checked: 63848
    Files with SPDX: 22576

    This is a huge improvement.

    Also, we deleted another 20000 lines of boilerplate license crud,
    always nice to see in a diffstat"

    * tag 'spdx-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx: (65 commits)
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 507
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 506
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 505
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 504
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 503
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 502
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 501
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 499
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 498
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 497
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 496
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 495
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 491
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 490
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 489
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 488
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 487
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 486
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 485
    ...

    Linus Torvalds
     

20 Jun, 2019

1 commit


19 Jun, 2019

10 commits

  • If a filesystem uses keys to hold authentication tokens, then it needs a
    token for each VFS operation that might perform an authentication check -
    either by passing it to the server, or using to perform a check based on
    authentication data cached locally.

    For open files this isn't a problem, since the key should be cached in the
    file struct since it represents the subject performing operations on that
    file descriptor.

    During pathwalk, however, there isn't anywhere to cache the key, except
    perhaps in the nameidata struct - but that isn't exposed to the
    filesystems. Further, a pathwalk can incur a lot of operations, calling
    one or more of the following, for instance:

    ->lookup()
    ->permission()
    ->d_revalidate()
    ->d_automount()
    ->get_acl()
    ->getxattr()

    on each dentry/inode it encounters - and each one may need to call
    request_key(). And then, at the end of pathwalk, it will call the actual
    operation:

    ->mkdir()
    ->mknod()
    ->getattr()
    ->open()
    ...

    which may need to go and get the token again.

    However, it is very likely that all of the operations on a single
    dentry/inode - and quite possibly a sequence of them - will all want to use
    the same authentication token, which suggests that caching it would be a
    good idea.

    To this end:

    (1) Make it so that a positive result of request_key() and co. that didn't
    require upcalling to userspace is cached temporarily in task_struct.

    (2) The cache is 1 deep, so a new result displaces the old one.

    (3) The key is released by exit and by notify-resume.

    (4) The cache is cleared in a newly forked process.

    Signed-off-by: David Howells

    David Howells
     
  • Provide a request_key_rcu() function that can be used to request a key
    under RCU conditions. It can only search and check permissions; it cannot
    allocate a new key, upcall or wait for an upcall to complete. It may
    return a partially constructed key.

    Signed-off-by: David Howells

    David Howells
     
  • Move the RCU locks outwards from the keyring search functions so that it
    will become possible to provide an RCU-capable partial request_key()
    function in a later commit.

    Signed-off-by: David Howells

    David Howells
     
  • Invalidate used request_key authentication keys rather than revoking them
    so that they get cleaned up immediately rather than potentially hanging
    around. There doesn't seem any need to keep the revoked keys around.

    Signed-off-by: David Howells

    David Howells
     
  • The request_key() syscall allows a process to gain access to the 'possessor'
    permits of any key that grants it Search permission by virtue of request_key()
    not checking whether a key it finds grants Link permission to the caller.

    Signed-off-by: David Howells

    David Howells
     
  • Based on 2 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license version 2 as
    published by the free software foundation

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license version 2 as
    published by the free software foundation #

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

    has been chosen to replace the boilerplate/reference in 4122 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Enrico Weigelt
    Reviewed-by: Kate Stewart
    Reviewed-by: Allison Randal
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     
  • Add a keyctl function that requests a set of capability bits to find out
    what features are supported.

    Signed-off-by: David Howells

    David Howells
     
  • Each function that manipulates the aa_ext struct should reset it's "pos"
    member on failure. This ensures that, on failure, no changes are made to
    the state of the aa_ext struct.

    There are paths were elements are optional and the error path is
    used to indicate the optional element is not present. This means
    instead of just aborting on error the unpack stream can become
    unsynchronized on optional elements, if using one of the affected
    functions.

    Cc: stable@vger.kernel.org
    Fixes: 736ec752d95e ("AppArmor: policy routines for loading and unpacking policy")
    Signed-off-by: Mike Salvatore
    Signed-off-by: John Johansen

    Mike Salvatore
     
  • A packed AppArmor policy contains null-terminated tag strings that are read
    by unpack_nameX(). However, unpack_nameX() uses string functions on them
    without ensuring that they are actually null-terminated, potentially
    leading to out-of-bounds accesses.

    Make sure that the tag string is null-terminated before passing it to
    strcmp().

    Cc: stable@vger.kernel.org
    Fixes: 736ec752d95e ("AppArmor: policy routines for loading and unpacking policy")
    Signed-off-by: Jann Horn
    Signed-off-by: John Johansen

    Jann Horn
     
  • While commit 11c236b89d7c2 ("apparmor: add a default null dfa") ensure
    every profile has a policy.dfa it does not resize the policy.start[]
    to have entries for every possible start value. Which means
    PROFILE_MEDIATES is not safe to use on untrusted input. Unforunately
    commit b9590ad4c4f2 ("apparmor: remove POLICY_MEDIATES_SAFE") did not
    take into account the start value usage.

    The input string in profile_query_cb() is user controlled and is not
    properly checked to be within the limited start[] entries, even worse
    it can't be as userspace policy is allowed to make us of entries types
    the kernel does not know about. This mean usespace can currently cause
    the kernel to access memory up to 240 entries beyond the start array
    bounds.

    Cc: stable@vger.kernel.org
    Fixes: b9590ad4c4f2 ("apparmor: remove POLICY_MEDIATES_SAFE")
    Signed-off-by: John Johansen

    John Johansen
     

18 Jun, 2019

1 commit

  • With gcc-4.6.3:

    WARNING: vmlinux.o(.text.unlikely+0x24c64): Section mismatch in reference from the function __integrity_init_keyring() to the function .init.text:set_platform_trusted_keys()
    The function __integrity_init_keyring() references
    the function __init set_platform_trusted_keys().
    This is often because __integrity_init_keyring lacks a __init
    annotation or the annotation of set_platform_trusted_keys is wrong.

    Indeed, if the compiler decides not to inline __integrity_init_keyring(),
    a warning is issued.

    Fix this by adding the missing __init annotation.

    Fixes: 9dc92c45177ab70e ("integrity: Define a trusted platform keyring")
    Signed-off-by: Geert Uytterhoeven
    Reviewed-by: Nayna Jain
    Reviewed-by: James Morris
    Signed-off-by: Mimi Zohar

    Geert Uytterhoeven
     

17 Jun, 2019

1 commit

  • All callers of lockdep_assert_held_exclusive() use it to verify the
    correct locking state of either a semaphore (ldisc_sem in tty,
    mmap_sem for perf events, i_rwsem of inode for dax) or rwlock by
    apparmor. Thus it makes sense to rename _exclusive to _write since
    that's the semantics callers care. Additionally there is already
    lockdep_assert_held_read(), which this new naming is more consistent with.

    No functional changes.

    Signed-off-by: Nikolay Borisov
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: https://lkml.kernel.org/r/20190531100651.3969-1-nborisov@suse.com
    Signed-off-by: Ingo Molnar

    Nikolay Borisov