09 Jun, 2022

9 commits

  • commit 8ba0005ff418ec356e176b26eaa04a6ac755d05b upstream.

    The original behavior was to check if the full set of requested accesses
    was allowed by at least a rule of every relevant layer. This didn't
    take into account requests for multiple accesses and same-layer rules
    allowing the union of these accesses in a complementary way. As a
    result, multiple accesses requested on a file hierarchy matching rules
    that, together, allowed these accesses, but without a unique rule
    allowing all of them, were illegitimately denied. This case should be
    rare in practice and it can only be triggered by the path_rename or
    file_open hook implementations.

    For instance, if, for the same layer, a rule allows execution
    beneath /a/b and another rule allows read beneath /a, requesting access
    to read and execute at the same time for /a/b should be allowed for this
    layer.

    This was an inconsistency because the union of same-layer rule accesses
    was already allowed if requested once at a time anyway.

    This fix changes the way allowed accesses are gathered over a path walk.
    To take into account all these rule accesses, we store in a matrix all
    layers granting the set of requested accesses, according to the handled
    accesses. To avoid heap allocation, we use an array on the stack which
    is 2*13 bytes. A following commit bringing the LANDLOCK_ACCESS_FS_REFER
    access right will increase this size to reach 112 bytes (2*14*4) in case
    of link or rename actions.

    Add a new layout1.layer_rule_unions test to check that accesses from
    different rules pertaining to the same layer are ORed in a file
    hierarchy. Also test that it is not the case for rules from different
    layers.

    Reviewed-by: Paul Moore
    Link: https://lore.kernel.org/r/20220506161102.525323-5-mic@digikod.net
    Cc: stable@vger.kernel.org
    Signed-off-by: Mickaël Salaün
    Signed-off-by: Greg Kroah-Hartman

    Mickaël Salaün
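
The same-layer union described in commit 8ba0005f can be sketched in user-space C. This is a simplified model, not the kernel code: `struct layer_rule`, `walk_allows()` and the flat rule array are hypothetical names standing in for the rules met during a path walk, and `NUM_ACCESS` assumes the 13 filesystem access rights implied by the 2*13-byte array above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_ACCESS 13 /* assumed number of filesystem access rights */

typedef uint16_t access_mask_t; /* one bit per access right */
typedef uint16_t layer_mask_t;  /* one bit per layer, at most 16 layers */

/* Hypothetical flattened rule: what one layer allows at one path element. */
struct layer_rule {
	unsigned int layer;    /* 0-based layer index */
	access_mask_t allowed; /* accesses granted by this layer here */
};

/*
 * Returns true if every requested access is granted by every layer in
 * `layers`. Grants for one layer may come from different rules.
 */
bool walk_allows(access_mask_t request, layer_mask_t layers,
		 const struct layer_rule *rules, size_t num_rules)
{
	layer_mask_t pending[NUM_ACCESS] = { 0 };
	unsigned int bit;
	size_t i;

	/* Each requested access starts out pending for every layer. */
	for (bit = 0; bit < NUM_ACCESS; bit++)
		if (request & (1u << bit))
			pending[bit] = layers;

	/* Each rule clears its own layer from the accesses it grants. */
	for (i = 0; i < num_rules; i++) {
		assert(rules[i].layer < 16);
		for (bit = 0; bit < NUM_ACCESS; bit++)
			if (rules[i].allowed & (1u << bit))
				pending[bit] &= (layer_mask_t)~(1u << rules[i].layer);
	}

	/* The request is allowed only when nothing is left pending. */
	for (bit = 0; bit < NUM_ACCESS; bit++)
		if (pending[bit])
			return false;
	return true;
}
```

With a single layer, one rule granting execute (assumed to be bit 0) and another granting read (assumed to be bit 2) together satisfy a read-and-execute request. If the two rules belong to different layers, the request is denied, because each layer must grant every requested access on its own.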
     
  • commit 2cd7cd6eed88b8383cfddce589afe9c0ae1d19b4 upstream.

    This refactoring will be useful in a following commit.

    Reviewed-by: Paul Moore
    Link: https://lore.kernel.org/r/20220506161102.525323-4-mic@digikod.net
    Cc: stable@vger.kernel.org
    Signed-off-by: Mickaël Salaün
    Signed-off-by: Greg Kroah-Hartman

    Mickaël Salaün
     
  • commit 75c542d6c6cc48720376862d5496d51509160dfd upstream.

    The maximum number of nested Landlock domains is currently 64. Because
    of the following fix and to help reduce the stack size, let's reduce it
    to 16. This seems large enough for a lot of use cases (e.g. sandboxed
    init service, spawning a sandboxed SSH service, in nested sandboxed
    containers). Reducing the number of nested domains may also help to
    discover misuse of Landlock (e.g. creating a domain per rule).

    Add and use a dedicated layer_mask_t typedef to fit with the number of
    layers. This might be useful when changing it and to keep it consistent
    with the maximum number of layers.

    Reviewed-by: Paul Moore
    Link: https://lore.kernel.org/r/20220506161102.525323-3-mic@digikod.net
    Cc: stable@vger.kernel.org
    Signed-off-by: Mickaël Salaün
    Signed-off-by: Greg Kroah-Hartman

    Mickaël Salaün
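
The link between the typedef and the layer limit in commit 75c542d6 can be sketched as follows. The `layer_mask_t` name and the limit of 16 come from the commit message; `LANDLOCK_MAX_NUM_LAYERS` and `layer_bit()` are names assumed here for illustration, and the check uses C11 `static_assert` rather than any kernel helper.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define LANDLOCK_MAX_NUM_LAYERS 16 /* assumed macro name for the new limit */

typedef uint16_t layer_mask_t; /* one bit per nested domain (layer) */

/* Build-time check: changing the limit without the type fails to compile. */
static_assert(sizeof(layer_mask_t) * CHAR_BIT >= LANDLOCK_MAX_NUM_LAYERS,
	      "layer_mask_t is too small for LANDLOCK_MAX_NUM_LAYERS");

/* Returns the mask bit for a 1-based layer level (hypothetical helper). */
layer_mask_t layer_bit(unsigned int level)
{
	assert(level >= 1 && level <= LANDLOCK_MAX_NUM_LAYERS);
	return (layer_mask_t)(1u << (level - 1));
}
```

Keeping the type and the limit next to a build-time check means a future change to either one cannot silently truncate layer bits.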
     
  • commit 5f2ff33e10843ef51275c8611bdb7b49537aba5d upstream.

    Create and use the access_mask_t typedef to enforce a consistent access
    mask size and uniformly use a 16-bit type. This will help the transition
    to a 32-bit value one day.

    Add a build check to make sure all (filesystem) access rights fit in.
    This will be extended with a following commit.

    Reviewed-by: Paul Moore
    Link: https://lore.kernel.org/r/20220506161102.525323-2-mic@digikod.net
    Cc: stable@vger.kernel.org
    Signed-off-by: Mickaël Salaün
    Signed-off-by: Greg Kroah-Hartman

    Mickaël Salaün
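
A build check of the kind described in commit 5f2ff33e might look like this sketch. Only `access_mask_t` and its 16-bit width come from the commit message; `LAST_ACCESS_FS`, `MASK_ACCESS_FS` and `access_is_known()` are assumed names, and bit 12 as the highest access right is an assumption based on the 13 rights mentioned elsewhere in this log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed: the highest filesystem access right at this point is bit 12. */
#define LAST_ACCESS_FS (1ULL << 12)
#define MASK_ACCESS_FS ((LAST_ACCESS_FS << 1) - 1)

typedef uint16_t access_mask_t;

/* Build check: every known access right must survive the narrowing cast. */
static_assert((access_mask_t)MASK_ACCESS_FS == MASK_ACCESS_FS,
	      "access_mask_t cannot hold all filesystem access rights");

/* Returns true if `requested` only contains known access rights. */
bool access_is_known(uint64_t requested)
{
	return (requested | MASK_ACCESS_FS) == MASK_ACCESS_FS;
}
```

If a later access right were added at bit 16 or above, the `static_assert` would fail at compile time, which is the signal to widen the typedef.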
     
  • commit eba39ca4b155c54adf471a69e91799cc1727873f upstream.

    According to the Landlock goal to be a security feature available to
    unprivileged processes, it makes more sense to first check for
    no_new_privs before checking anything else (i.e. syscall arguments).

    Merge inval_fd_enforce and unpriv_enforce_without_no_new_privs tests
    into the new restrict_self_checks_ordering. This is similar to the
    previous commit checking other syscalls.

    Link: https://lore.kernel.org/r/20220506160820.524344-10-mic@digikod.net
    Cc: stable@vger.kernel.org
    Signed-off-by: Mickaël Salaün
    Signed-off-by: Greg Kroah-Hartman

    Mickaël Salaün
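
The check ordering described in commit eba39ca4 can be modeled in plain C. This is an illustrative model only: `restrict_self_checks()` and its parameters are hypothetical, and the specific error codes for the argument checks (`EINVAL`, `EBADF`) are assumptions; the commit message only states that the no_new_privs check comes before the syscall arguments are examined.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical model of the checks performed when enforcing a ruleset.
 * Returns 0 if all checks pass, or a negative errno value.
 */
int restrict_self_checks(bool no_new_privs, bool cap_sys_admin,
			 int ruleset_fd, unsigned int flags)
{
	/* Privilege requirement first: no_new_privs or CAP_SYS_ADMIN. */
	if (!no_new_privs && !cap_sys_admin)
		return -EPERM;

	/* Only then are the syscall arguments examined. */
	if (flags)
		return -EINVAL;
	if (ruleset_fd < 0)
		return -EBADF;
	return 0;
}
```

In this model, a caller without no_new_privs always gets the permission error, even when its arguments are also invalid, which is the ordering the merged test checks.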
     
  • commit 589172e5636c4d16c40b90e87543d43defe2d968 upstream.

    It makes more sense to first check the ruleset FD and then the rule
    attribute. It will be useful to factor out code for other rule types.

    Add inval_add_rule_arguments tests, extension of empty_path_beneath_attr
    tests, to also check error ordering for landlock_add_rule(2).

    Link: https://lore.kernel.org/r/20220506160820.524344-9-mic@digikod.net
    Cc: stable@vger.kernel.org
    Signed-off-by: Mickaël Salaün
    Signed-off-by: Greg Kroah-Hartman

    Mickaël Salaün
     
  • commit a13e248ff90e81e9322406c0e618cf2168702f4e upstream.

    It is not mandatory to pass a file descriptor obtained with the O_PATH
    flag. Also, replace rule's accesses with ruleset's accesses.

    Link: https://lore.kernel.org/r/20220506160820.524344-2-mic@digikod.net
    Cc: stable@vger.kernel.org
    Signed-off-by: Mickaël Salaün
    Signed-off-by: Greg Kroah-Hartman

    Mickaël Salaün
     
  • commit 06a1c40a09a8dded4bf0e7e3ccbda6bddcccd7c8 upstream.

    Let's follow a consistent and documented coding style. Everything may
    not be to our liking but it is better than tacit knowledge. Moreover,
    this will help maintain style consistency between different developers.

    This contains only whitespace changes.

    Automatically formatted with:
    clang-format-14 -i security/landlock/*.[ch] include/uapi/linux/landlock.h

    Link: https://lore.kernel.org/r/20220506160513.523257-3-mic@digikod.net
    Cc: stable@vger.kernel.org
    Signed-off-by: Mickaël Salaün
    Signed-off-by: Greg Kroah-Hartman

    Mickaël Salaün
     
  • commit 6cc2df8e3a3967e7c13a424f87f6efb1d4a62d80 upstream.

    In preparation for a following commit, add clang-format on and
    clang-format off stanzas around constant definitions. This keeps
    values aligned, which is much more readable than packed
    definitions.

    Link: https://lore.kernel.org/r/20220506160513.523257-2-mic@digikod.net
    Cc: stable@vger.kernel.org
    Signed-off-by: Mickaël Salaün
    Signed-off-by: Greg Kroah-Hartman

    Mickaël Salaün
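
For illustration, a stanza of the kind described in commit 6cc2df8e could look like the following. The macro names and values are assumed to mirror the Landlock UAPI header, and only the first four access rights are shown.

```c
/* clang-format off */
#define LANDLOCK_ACCESS_FS_EXECUTE			(1ULL << 0)
#define LANDLOCK_ACCESS_FS_WRITE_FILE			(1ULL << 1)
#define LANDLOCK_ACCESS_FS_READ_FILE			(1ULL << 2)
#define LANDLOCK_ACCESS_FS_READ_DIR			(1ULL << 3)
/* clang-format on */
```

Without the two comments, clang-format would collapse the tab-aligned value column to a single space after each macro name.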
     

08 Apr, 2022

1 commit

  • commit aea0b9f2486da8497f35c7114b764bf55e17c7ea upstream.

    Make the name of the anon inode fd "[landlock-ruleset]" instead of
    "landlock-ruleset". This is minor but most anon inode fds already
    carry square brackets around their name:

    [eventfd]
    [eventpoll]
    [fanotify]
    [fscontext]
    [io_uring]
    [pidfd]
    [signalfd]
    [timerfd]
    [userfaultfd]

    For the sake of consistency let's do the same for the landlock-ruleset anon
    inode fd that comes with landlock. We did the same in
    1cdc415f1083 ("uapi, fsopen: use square brackets around "fscontext" [ver #2]")
    for the new mount api.

    Cc: linux-security-module@vger.kernel.org
    Signed-off-by: Christian Brauner
    Link: https://lore.kernel.org/r/20211011133704.1704369-1-brauner@kernel.org
    Cc: stable@vger.kernel.org
    Signed-off-by: Mickaël Salaün
    Signed-off-by: Greg Kroah-Hartman

    Christian Brauner
     

23 Apr, 2021

7 commits

  • Add a new flag LANDLOCK_CREATE_RULESET_VERSION to
    landlock_create_ruleset(2). This makes it possible to retrieve a Landlock ABI
    version that is useful to efficiently follow a best-effort security
    approach. Indeed, it would be a missed opportunity to abort the whole
    sandbox building, because some features are unavailable, instead of
    protecting users as much as possible with the subset of features
    provided by the running kernel.

    This new flag enables user space to identify the minimum set of Landlock
    features supported by the running kernel without relying on a filesystem
    interface (e.g. /proc/version, which might be inaccessible) nor testing
    multiple syscall argument combinations (i.e. syscall bisection). New
    Landlock features will be documented and tied to a minimum version
    number (greater than 1). The current version will be incremented for
    each new kernel release supporting new Landlock features. User space
    libraries can leverage this information to seamlessly restrict processes
    as much as possible while being compatible with newer APIs.

    This is a much lighter approach than the previous
    landlock_get_features(2): the complexity is pushed to user space
    libraries. This flag meets similar needs as securityfs versions:
    selinux/policyvers, apparmor/features/*/version* and tomoyo/version.

    Supporting this flag now will be convenient for backward compatibility.

    Cc: Arnd Bergmann
    Cc: James Morris
    Cc: Jann Horn
    Cc: Kees Cook
    Cc: Serge E. Hallyn
    Signed-off-by: Mickaël Salaün
    Link: https://lore.kernel.org/r/20210422154123.13086-14-mic@digikod.net
    Signed-off-by: James Morris

    Mickaël Salaün
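
A minimal sketch of how user space might use the version flag for a best-effort policy follows. The fallback definitions are assumptions for builds whose headers predate Landlock (syscall number 444 applies to most architectures but not all), `landlock_abi_version()` and `best_effort_fs_access()` are hypothetical helper names, and tying LANDLOCK_ACCESS_FS_REFER to ABI version 2 reflects the later commit mentioned elsewhere in this log.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fallbacks for older headers; the values are assumptions, see above. */
#ifndef __NR_landlock_create_ruleset
#define __NR_landlock_create_ruleset 444
#endif
#ifndef LANDLOCK_CREATE_RULESET_VERSION
#define LANDLOCK_CREATE_RULESET_VERSION (1U << 0)
#endif

#define ACCESS_FS_REFER (1ULL << 13) /* assumed to require ABI version 2 */

/* Returns the Landlock ABI version (1 or more), or -1 if unavailable. */
long landlock_abi_version(void)
{
	/* With the version flag, attr must be NULL and size must be 0. */
	return syscall(__NR_landlock_create_ruleset, NULL, 0,
		       LANDLOCK_CREATE_RULESET_VERSION);
}

/* Drops the access rights that the given ABI version cannot handle. */
uint64_t best_effort_fs_access(long abi, uint64_t wanted)
{
	if (abi < 1)
		return 0; /* no Landlock support: nothing can be handled */
	if (abi < 2)
		wanted &= ~ACCESS_FS_REFER;
	return wanted;
}
```

A caller would typically pass `best_effort_fs_access(landlock_abi_version(), wanted)` as the handled accesses of a new ruleset, so that an older kernel yields a weaker sandbox instead of a failure.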
     
  • These 3 system calls are designed to be used by unprivileged processes
    to sandbox themselves:
    * landlock_create_ruleset(2): Creates a ruleset and returns its file
    descriptor.
    * landlock_add_rule(2): Adds a rule (e.g. file hierarchy access) to a
    ruleset, identified by the dedicated file descriptor.
    * landlock_restrict_self(2): Enforces a ruleset on the calling thread
    and its future children (similar to seccomp). This syscall has the
    same usage restrictions as seccomp(2): the caller must have the
    no_new_privs attribute set or have CAP_SYS_ADMIN in the current user
    namespace.

    All these syscalls have a "flags" argument (not currently used) to
    enable extensibility.

    Here are the motivations for these new syscalls:
    * A sandboxed process may not have access to file systems, including
    /dev, /sys or /proc, but it should still be able to add more
    restrictions to itself.
    * Neither prctl(2) nor seccomp(2) (which was used in a previous version)
    fit well with the current definition of a Landlock security policy.

    All passed structs (attributes) are checked at build time to ensure that
    they don't contain holes and that they are aligned the same way for each
    architecture.

    See the user and kernel documentation for more details (provided by a
    following commit):
    * Documentation/userspace-api/landlock.rst
    * Documentation/security/landlock.rst

    Cc: Arnd Bergmann
    Cc: James Morris
    Cc: Jann Horn
    Cc: Kees Cook
    Signed-off-by: Mickaël Salaün
    Acked-by: Serge Hallyn
    Link: https://lore.kernel.org/r/20210422154123.13086-9-mic@digikod.net
    Signed-off-by: James Morris

    Mickaël Salaün
     
  • Using Landlock objects and ruleset, it is possible to tag inodes
    according to a process's domain. To enable an unprivileged process to
    express a file hierarchy, it first needs to open a directory (or a file)
    and pass this file descriptor to the kernel through
    landlock_add_rule(2). When checking if a file access request is
    allowed, we walk from the requested dentry to the real root, following
    the different mount layers. The accesses of each "tagged" inode are
    collected according to their rule layer level, and ANDed to create
    access to the requested file hierarchy. This makes it possible to
    identify a lot of files without tagging every inode or modifying the
    filesystem, while still following the view and understanding the user
    has of the filesystem.

    Add a new ARCH_EPHEMERAL_INODES for UML because it currently does not
    keep the same struct inodes for the same inodes while these inodes are
    in use.

    This commit adds a minimal set of supported filesystem access controls,
    which cannot restrict all file-related actions. This is the
    result of multiple discussions to minimize the code of Landlock to ease
    review. Thanks to the Landlock design, extending this access-control
    without breaking user space will not be a problem. Moreover, seccomp
    filters can be used to restrict the use of syscall families which may
    not be currently handled by Landlock.

    Cc: Al Viro
    Cc: Anton Ivanov
    Cc: James Morris
    Cc: Jann Horn
    Cc: Jeff Dike
    Cc: Kees Cook
    Cc: Richard Weinberger
    Cc: Serge E. Hallyn
    Signed-off-by: Mickaël Salaün
    Link: https://lore.kernel.org/r/20210422154123.13086-8-mic@digikod.net
    Signed-off-by: James Morris

    Mickaël Salaün
     
  • Using ptrace(2) and related debug features on a target process can lead
    to a privilege escalation. Indeed, ptrace(2) can be used by an attacker
    to impersonate another task and to remain undetected while performing
    malicious activities. Thanks to ptrace_may_access(), various parts of
    the kernel can check if a tracer is more privileged than a tracee.

    A landlocked process has fewer privileges than a non-landlocked process
    and must then be subject to additional restrictions when manipulating
    processes. To be allowed to use ptrace(2) and related syscalls on a
    target process, a landlocked process must have a subset of the target
    process's rules (i.e. the tracee must be in a sub-domain of the tracer).

    Cc: James Morris
    Signed-off-by: Mickaël Salaün
    Reviewed-by: Jann Horn
    Acked-by: Serge Hallyn
    Reviewed-by: Kees Cook
    Link: https://lore.kernel.org/r/20210422154123.13086-5-mic@digikod.net
    Signed-off-by: James Morris

    Mickaël Salaün
     
  • A process's credentials point to a Landlock domain, which is implemented
    underneath with a ruleset. In the following commits, this domain is
    used to check and enforce the ptrace and filesystem security policies.
    A domain is inherited from a parent to its child the same way a thread
    inherits a seccomp policy.

    Cc: James Morris
    Signed-off-by: Mickaël Salaün
    Reviewed-by: Jann Horn
    Acked-by: Serge Hallyn
    Reviewed-by: Kees Cook
    Link: https://lore.kernel.org/r/20210422154123.13086-4-mic@digikod.net
    Signed-off-by: James Morris

    Mickaël Salaün
     
  • A Landlock ruleset is mainly a red-black tree with Landlock rules as
    nodes. This enables quick update and lookup to match a requested
    access, e.g. to a file. A ruleset is usable through a dedicated file
    descriptor (cf. following commit implementing syscalls) which enables a
    process to create and populate a ruleset with new rules.

    A domain is a ruleset tied to a set of processes. This group of rules
    defines the security policy enforced on these processes and their future
    children. A domain can transition to a new domain which is the
    intersection of all its constraints and those of a ruleset provided by
    the current process. This modification only impacts the current process.
    This means that a process can only gain more constraints (i.e. lose
    accesses) over time.

    Cc: James Morris
    Signed-off-by: Mickaël Salaün
    Acked-by: Serge Hallyn
    Reviewed-by: Kees Cook
    Reviewed-by: Jann Horn
    Link: https://lore.kernel.org/r/20210422154123.13086-3-mic@digikod.net
    Signed-off-by: James Morris

    Mickaël Salaün
     
  • A Landlock object makes it possible to identify a kernel object (e.g. an inode).
    A Landlock rule is a set of access rights allowed on an object. Rules
    are grouped in rulesets that may be tied to a set of processes (i.e.
    subjects) to enforce a scoped access-control (i.e. a domain).

    Because Landlock's goal is to empower any process (especially
    unprivileged ones) to sandbox themselves, we cannot rely on a
    system-wide object identification such as file extended attributes.
    Indeed, we need innocuous, composable and modular access-controls.

    The main challenge with these constraints is to identify kernel objects
    while this identification is useful (i.e. when a security policy makes
    use of this object). But this identification data should be freed once
    no policy is using it. This ephemeral tagging should not and may not be
    written in the filesystem. We then need to manage the lifetime of a
    rule according to the lifetime of its objects. To avoid a global lock,
    this implementation makes use of RCU and counters to safely reference
    objects.

    A following commit uses this generic object management for inodes.

    Cc: James Morris
    Signed-off-by: Mickaël Salaün
    Reviewed-by: Jann Horn
    Acked-by: Serge Hallyn
    Reviewed-by: Kees Cook
    Link: https://lore.kernel.org/r/20210422154123.13086-2-mic@digikod.net
    Signed-off-by: James Morris

    Mickaël Salaün