06 Dec, 2018

1 commit

  • commit 9137bb27e60e554dab694eafa4cca241fa3a694f upstream

    Add the PR_SPEC_INDIRECT_BRANCH option for the PR_GET_SPECULATION_CTRL and
    PR_SET_SPECULATION_CTRL prctls to allow fine grained per task control of
    indirect branch speculation via STIBP and IBPB.

    Invocations:
    Check indirect branch speculation status with
    - prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, 0, 0, 0);

    Enable indirect branch speculation with
    - prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_ENABLE, 0, 0);

    Disable indirect branch speculation with
    - prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_DISABLE, 0, 0);

    Force disable indirect branch speculation with
    - prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_FORCE_DISABLE, 0, 0);

    See Documentation/userspace-api/spec_ctrl.rst.
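
    As a rough illustration of the status check listed above, the sketch below
    issues the query through ctypes. The numeric constants are assumed to match
    include/uapi/linux/prctl.h, and the helper name get_speculation_ctrl is
    hypothetical. It only reads the state and never changes it.

```python
import ctypes
import errno
import sys

# Assumed to match include/uapi/linux/prctl.h.
PR_GET_SPECULATION_CTRL = 52
PR_SPEC_STORE_BYPASS = 0
PR_SPEC_INDIRECT_BRANCH = 1


def get_speculation_ctrl(which):
    """Query the speculation state for one misfeature (hypothetical helper).

    Returns None on platforms without prctl(2); otherwise a tuple
    (value, errno_name) where exactly one element is None.
    """
    if not sys.platform.startswith("linux"):
        return None
    libc = ctypes.CDLL(None, use_errno=True)
    # prctl(2) takes an int option followed by unsigned long arguments.
    libc.prctl.argtypes = [ctypes.c_int] + [ctypes.c_ulong] * 4
    libc.prctl.restype = ctypes.c_int
    ret = libc.prctl(PR_GET_SPECULATION_CTRL, which, 0, 0, 0)
    if ret < 0:
        err = ctypes.get_errno()
        return None, errno.errorcode.get(err, str(err))
    return ret, None


if __name__ == "__main__":
    print(get_speculation_ctrl(PR_SPEC_INDIRECT_BRANCH))
```

    On kernels without this option the call is expected to fail with an errno
    rather than return a bit mask, so callers should handle both cases.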

    Signed-off-by: Tim Chen
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Andy Lutomirski
    Cc: Linus Torvalds
    Cc: Jiri Kosina
    Cc: Tom Lendacky
    Cc: Josh Poimboeuf
    Cc: Andrea Arcangeli
    Cc: David Woodhouse
    Cc: Andi Kleen
    Cc: Dave Hansen
    Cc: Casey Schaufler
    Cc: Asit Mallick
    Cc: Arjan van de Ven
    Cc: Jon Masters
    Cc: Waiman Long
    Cc: Greg KH
    Cc: Dave Stewart
    Cc: Kees Cook
    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/20181125185005.866780996@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

23 May, 2018

3 commits

  • commit dd0792699c4058e63c0715d9a7c2d40226fcdddc upstream

    Fix some typos, improve formulations, end sentences with a full stop.

    Signed-off-by: Borislav Petkov
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Borislav Petkov
     
  • commit 356e4bfff2c5489e016fdb925adbf12a1e3950ee upstream

    For certain use cases it is desired to enforce mitigations so they cannot
    be undone afterwards. That's important for loader stubs which want to
    prevent a child from disabling the mitigation again. This will also be
    used for seccomp(). The extra state preserving of the prctl state for SSB
    is a preparatory step for eBPF dynamic speculation control.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     
  • commit b617cfc858161140d69cc0b5cc211996b557a1c7 upstream

    Add two new prctls to control aspects of speculation related vulnerabilities
    and their mitigations to provide finer grained control over performance
    impacting mitigations.

    PR_GET_SPECULATION_CTRL returns the state of the speculation misfeature
    which is selected with arg2 of prctl(2). The return value uses bits 0-2 with
    the following meaning:

    Bit  Define           Description
    0    PR_SPEC_PRCTL    Mitigation can be controlled per task by
                          PR_SET_SPECULATION_CTRL
    1    PR_SPEC_ENABLE   The speculation feature is enabled, mitigation is
                          disabled
    2    PR_SPEC_DISABLE  The speculation feature is disabled, mitigation is
                          enabled

    If all bits are 0 the CPU is not affected by the speculation misfeature.

    If PR_SPEC_PRCTL is set, then the per task control of the mitigation is
    available. If not set, prctl(PR_SET_SPECULATION_CTRL) for the speculation
    misfeature will fail.
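
    The bit layout described above can be decoded as in the following sketch.
    The bit positions come from the table in this message; the function name
    describe_spec_ctrl and the wording of the returned strings are
    illustrative assumptions.

```python
# Bit positions as given in the table of this commit message.
PR_SPEC_PRCTL = 1 << 0
PR_SPEC_ENABLE = 1 << 1
PR_SPEC_DISABLE = 1 << 2


def describe_spec_ctrl(value):
    """Turn a PR_GET_SPECULATION_CTRL return value into readable text."""
    if value == 0:
        # All bits clear: the CPU is not affected by the misfeature.
        return "not affected"
    parts = []
    if value & PR_SPEC_ENABLE:
        parts.append("speculation enabled, mitigation disabled")
    if value & PR_SPEC_DISABLE:
        parts.append("speculation disabled, mitigation enabled")
    if value & PR_SPEC_PRCTL:
        parts.append("per-task control available")
    else:
        parts.append("per-task control unavailable")
    return "; ".join(parts)


if __name__ == "__main__":
    print(describe_spec_ctrl(PR_SPEC_PRCTL | PR_SPEC_DISABLE))
```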

    PR_SET_SPECULATION_CTRL controls, per task, the speculation misfeature
    which is selected by arg2 of prctl(2). arg3 is used to hand in the
    control value, i.e. either PR_SPEC_ENABLE or PR_SPEC_DISABLE.

    The common return values are:

    EINVAL  prctl is not implemented by the architecture or the unused prctl()
            arguments are not 0
    ENODEV  arg2 is selecting an unsupported speculation misfeature

    PR_SET_SPECULATION_CTRL has these additional return values:

    ERANGE  arg3 is incorrect, i.e. it's neither PR_SPEC_ENABLE nor PR_SPEC_DISABLE
    ENXIO   prctl control of the selected speculation misfeature is disabled
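
    A caller might translate these error codes as sketched below. Only the
    four codes listed in this message are covered; the helper name
    explain_spec_errno and the fallback string are assumptions made for
    illustration.

```python
import errno

# Errors common to PR_GET_SPECULATION_CTRL and PR_SET_SPECULATION_CTRL.
COMMON_ERRORS = {
    errno.EINVAL: "prctl not implemented by the architecture, or unused arguments are not 0",
    errno.ENODEV: "arg2 selects an unsupported speculation misfeature",
}

# Additional errors returned only by PR_SET_SPECULATION_CTRL.
SET_ONLY_ERRORS = {
    errno.ERANGE: "arg3 is neither PR_SPEC_ENABLE nor PR_SPEC_DISABLE",
    errno.ENXIO: "prctl control of the selected misfeature is disabled",
}


def explain_spec_errno(err, is_set):
    """Map an errno from a speculation prctl to the text of this message."""
    table = dict(COMMON_ERRORS)
    if is_set:
        table.update(SET_ONLY_ERRORS)
    return table.get(err, "not described in this commit message")


if __name__ == "__main__":
    print(explain_spec_errno(errno.ENXIO, is_set=True))
```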

    The first supported controllable speculation misfeature is
    PR_SPEC_STORE_BYPASS. Add the define so this can be shared between
    architectures.

    Based on an initial patch from Tim Chen and mostly rewritten.

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Ingo Molnar
    Reviewed-by: Konrad Rzeszutek Wilk
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

15 Aug, 2017

5 commits

  • Right now, SECCOMP_RET_KILL_THREAD (née SECCOMP_RET_KILL) kills the
    current thread. There have been a few requests for this to kill the entire
    process (the thread group). This cannot simply be changed (discovered when
    adding coredump support since coredumping kills the entire process)
    because there are userspace programs depending on the thread-kill
    behavior.

    Instead, implement SECCOMP_RET_KILL_PROCESS, which is 0x80000000, and can
    be processed as "-1" by the kernel, below the existing RET_KILL that is
    ABI-set to "0". For userspace, SECCOMP_RET_ACTION_FULL is added to expand
    the mask to the signed bit. Old userspace using the SECCOMP_RET_ACTION
    mask will see SECCOMP_RET_KILL_PROCESS as 0 still, but this would only
    be visible when examining the siginfo in a core dump from a RET_KILL_*,
    where it will think it was thread-killed instead of process-killed.

    Attempts to introduce this behavior via other ways (filter flags,
    seccomp struct flags, masked RET_DATA bits) all come with weird
    side-effects and baggage. This change preserves the central behavioral
    expectations of the seccomp filter engine without putting too great
    a burden on changes needed in userspace to use the new action.

    The new action is discoverable by userspace through either the new
    actions_avail sysctl or through the SECCOMP_GET_ACTION_AVAIL seccomp
    operation. If used without checking for availability, old kernels
    will treat RET_KILL_PROCESS as RET_KILL_THREAD (since the old mask
    will produce RET_KILL_THREAD).
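
    The mask behavior described above can be checked with a little
    arithmetic. In this sketch only 0x80000000 is taken from the message; the
    two mask values are assumed to match the seccomp UAPI header, and the
    helper names are hypothetical.

```python
SECCOMP_RET_KILL_PROCESS = 0x80000000  # value stated in the commit message
SECCOMP_RET_KILL_THREAD = 0x00000000   # the existing RET_KILL, ABI-set to 0
SECCOMP_RET_ACTION_FULL = 0xFFFF0000   # assumed: mask including the sign bit
SECCOMP_RET_ACTION = 0x7FFF0000        # assumed: the older mask


def as_s32(value):
    """Reinterpret the low 32 bits of value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def action_old_mask(ret):
    """Action as seen by userspace that still uses SECCOMP_RET_ACTION."""
    return ret & SECCOMP_RET_ACTION


def action_full_mask(ret):
    """Action under the full mask, read as a signed value for ordering."""
    return as_s32(ret & SECCOMP_RET_ACTION_FULL)


if __name__ == "__main__":
    # The old mask hides the top bit, so KILL_PROCESS looks like KILL_THREAD.
    print(action_old_mask(SECCOMP_RET_KILL_PROCESS) == SECCOMP_RET_KILL_THREAD)
    # Read as signed, KILL_PROCESS sorts below KILL_THREAD.
    print(action_full_mask(SECCOMP_RET_KILL_PROCESS) < action_full_mask(SECCOMP_RET_KILL_THREAD))
```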

    Cc: Paul Moore
    Cc: Fabricio Voznika
    Signed-off-by: Kees Cook

    Kees Cook
     
  • In preparation for adding SECCOMP_RET_KILL_PROCESS, rename SECCOMP_RET_KILL
    to the more accurate SECCOMP_RET_KILL_THREAD.

    The existing selftest values are intentionally left as SECCOMP_RET_KILL
    just to be sure we're exercising the alias.

    Signed-off-by: Kees Cook

    Kees Cook
     
  • Add a new action, SECCOMP_RET_LOG, that logs a syscall before allowing
    the syscall. At the implementation level, this action is identical to
    the existing SECCOMP_RET_ALLOW action. However, it can be very useful when
    initially developing a seccomp filter for an application. The developer
    can set the default action to be SECCOMP_RET_LOG, maybe mark any
    obviously needed syscalls with SECCOMP_RET_ALLOW, and then put the
    application through its paces. A list of syscalls that triggered the
    default action (SECCOMP_RET_LOG) can be easily gleaned from the logs and
    that list can be used to build the syscall whitelist. Finally, the
    developer can change the default action to the desired value.

    This provides a more friendly experience than seeing the application get
    killed, then updating the filter and rebuilding the app, seeing the
    application get killed due to a different syscall, then updating the
    filter and rebuilding the app, etc.

    The functionality is similar to what's supported by the various LSMs.
    SELinux has permissive mode, AppArmor has complain mode, SMACK has
    bring-up mode, etc.

    SECCOMP_RET_LOG is given a lower value than SECCOMP_RET_ALLOW as allow
    while logging is slightly more restrictive than quietly allowing.

    Unfortunately, the tests added for SECCOMP_RET_LOG are not capable of
    inspecting the audit log to verify that the syscall was logged.

    With this patch, the logic for deciding if an action will be logged is:

    if action == RET_ALLOW:
      do not log
    else if action == RET_KILL && RET_KILL in actions_logged:
      log
    else if action == RET_LOG && RET_LOG in actions_logged:
      log
    else if filter-requests-logging && action in actions_logged:
      log
    else if audit_enabled && process-is-being-audited:
      log
    else:
      do not log
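
    The decision logic above can be written as a small function for
    experimentation. This is a userspace model only, not the kernel code; the
    action names are plain strings and the parameter names are assumptions.

```python
def should_log(action, actions_logged, filter_requests_logging=False,
               audit_enabled=False, task_audited=False):
    """Model of the logging decision listed in the commit message."""
    if action == "allow":
        return False
    if action == "kill" and "kill" in actions_logged:
        return True
    if action == "log" and "log" in actions_logged:
        return True
    if filter_requests_logging and action in actions_logged:
        return True
    if audit_enabled and task_audited:
        return True
    return False


if __name__ == "__main__":
    logged = {"kill", "log", "errno"}
    # errno is only logged when the filter asks for logging.
    print(should_log("errno", logged))
    print(should_log("errno", logged, filter_requests_logging=True))
```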

    Signed-off-by: Tyler Hicks
    Signed-off-by: Kees Cook

    Tyler Hicks
     
  • Administrators can write to this sysctl to set the seccomp actions that
    are allowed to be logged. Any actions not found in this sysctl will not
    be logged.

    For example, all SECCOMP_RET_KILL, SECCOMP_RET_TRAP, and
    SECCOMP_RET_ERRNO actions would be loggable if "kill trap errno" were
    written to the sysctl. SECCOMP_RET_TRACE actions would not be logged
    since their string representation ("trace") isn't present in the sysctl
    value.

    The path to the sysctl is:

    /proc/sys/kernel/seccomp/actions_logged

    The actions_avail sysctl can be read to discover the valid action names
    that can be written to the actions_logged sysctl with the exception of
    "allow". SECCOMP_RET_ALLOW actions cannot be configured for logging.

    The default setting for the sysctl is to allow all actions to be logged
    except SECCOMP_RET_ALLOW. While only SECCOMP_RET_KILL actions are
    currently logged, an upcoming patch will allow applications to request
    additional actions to be logged.
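
    Before writing a value to actions_logged, a tool could check it against
    the contents of actions_avail along the lines below. The function name and
    the choice of raising ValueError are assumptions; how the kernel itself
    rejects a bad write is not covered by this sketch.

```python
def parse_actions_logged(value, actions_avail):
    """Validate a candidate actions_logged string against actions_avail.

    Returns the set of action names, or raises ValueError if a name is
    unknown or is "allow", which cannot be configured for logging.
    """
    names = value.split()
    valid = set(actions_avail.split()) - {"allow"}
    rejected = [name for name in names if name not in valid]
    if rejected:
        raise ValueError("cannot be logged: " + " ".join(rejected))
    return set(names)


if __name__ == "__main__":
    avail = "kill trap errno trace allow"
    print(sorted(parse_actions_logged("kill trap errno", avail)))
```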

    There's one important exception to this sysctl. If a task is
    specifically being audited, meaning that an audit context has been
    allocated for the task, seccomp will log all actions other than
    SECCOMP_RET_ALLOW despite the value of actions_logged. This exception
    preserves the existing auditing behavior of tasks with an allocated
    audit context.

    With this patch, the logic for deciding if an action will be logged is:

    if action == RET_ALLOW:
      do not log
    else if action == RET_KILL && RET_KILL in actions_logged:
      log
    else if audit_enabled && task-is-being-audited:
      log
    else:
      do not log

    Signed-off-by: Tyler Hicks
    Signed-off-by: Kees Cook

    Tyler Hicks
     
  • This patch creates a read-only sysctl containing an ordered list of
    seccomp actions that the kernel supports. The ordering, from left to
    right, is the lowest action value (kill) to the highest action value
    (allow). Currently, a read of the sysctl file would return "kill trap
    errno trace allow". The contents of this sysctl file can be useful for
    userspace code as well as the system administrator.

    The path to the sysctl is:

    /proc/sys/kernel/seccomp/actions_avail

    libseccomp and other userspace code can easily determine which actions
    the current kernel supports. The set of actions supported by the current
    kernel may be different than the set of action macros found in kernel
    headers that were installed where the userspace code was built.
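
    Userspace code could probe the running kernel roughly as follows. The
    sysctl path is the one given above; the helper names are hypothetical, and
    a missing file is treated here as "unknown" rather than "unsupported".

```python
from pathlib import Path

ACTIONS_AVAIL = Path("/proc/sys/kernel/seccomp/actions_avail")


def parse_actions(text):
    """Split the sysctl contents into an ordered list of action names."""
    return text.split()


def kernel_actions(path=ACTIONS_AVAIL):
    """Return the kernel's action list, or None if the sysctl is unreadable."""
    try:
        return parse_actions(path.read_text())
    except OSError:
        return None


def supports(action, actions):
    """True only if the action list is known and contains the action."""
    return actions is not None and action in actions


if __name__ == "__main__":
    actions = kernel_actions()
    print(actions, supports("trap", actions))
```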

    In addition, this sysctl will allow system administrators to know which
    actions are supported by the kernel and make it easier to configure
    exactly what seccomp logs through the audit subsystem. Support for this
    level of logging configuration will come in a future patch.

    Signed-off-by: Tyler Hicks
    Signed-off-by: Kees Cook

    Tyler Hicks
     

19 May, 2017

3 commits

  • This updates no_new_privs documentation to ReST markup and adds it to
    the user-space API documentation.

    Signed-off-by: Kees Cook
    Signed-off-by: Jonathan Corbet

    Kees Cook
     
  • This updates seccomp_filter.txt for ReST markup, and moves it under the
    user-space API index, since it describes how application authors can use
    seccomp.

    Signed-off-by: Kees Cook
    Signed-off-by: Jonathan Corbet

    Kees Cook
     
  • The asterisk of the pointer is interpreted as a start tag for inline
    emphasis. Asterisks which are not Sphinx markup need to be quoted in
    rst-files. This fixes the Sphinx warning:

    Documentation/userspace-api/unshare.rst:108: WARNING: Inline emphasis start-string without end-string.

    Signed-off-by: Markus Heiser
    Signed-off-by: Jonathan Corbet

    Markus Heiser
     

03 Apr, 2017

2 commits