26 Jul, 2008

3 commits

  • - clean up set_majmin()
    - use simple_strtoul() to parse major/minor

    [akpm@linux-foundation.org: fix simple_strtoul() usage]
    [kosaki.motohiro@jp.fujitsu.com: fix warnings]
    Signed-off-by: Li Zefan
    Acked-by: Serge Hallyn
    Cc: Serge Hallyn
    Cc: Paul Menage
    Cc: Pavel Emelyanov
    Signed-off-by: KOSAKI Motohiro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
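
    For illustration, a minimal userspace sketch of parsing a "major:minor"
    pair as unsigned numbers. It uses the C library's strtoul() as a stand-in
    for the kernel's simple_strtoul(); the helper names and the WILDCARD value
    are hypothetical and are not taken from the patch.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical marker for "*" (any major or minor number). */
#define WILDCARD (~0u)

/* Parse "*" or a decimal number at *sp and advance *sp past it.
 * Returns 0 on success, -1 if neither is present. */
static int parse_number(const char **sp, unsigned int *out)
{
	const char *s = *sp;
	char *end;

	if (*s == '*') {
		*out = WILDCARD;
		*sp = s + 1;
		return 0;
	}
	if (!isdigit((unsigned char)*s))
		return -1;
	/* No overflow handling here; a real parser would check the range. */
	*out = (unsigned int)strtoul(s, &end, 10);
	*sp = end;
	return 0;
}

/* Parse "major:minor"; returns 0 on success, -1 on malformed input. */
int parse_majmin(const char *s, unsigned int *major, unsigned int *minor)
{
	if (parse_number(&s, major) || *s != ':')
		return -1;
	s++;
	if (parse_number(&s, minor) || *s != '\0')
		return -1;
	return 0;
}
```

    Parsing into an unsigned type means a value such as 2147483648 is kept
    as-is rather than wrapping to a negative number.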
     
  • Currently this list is protected with a simple spinlock, even for reading
    from one. This is OK, but can be better.

    Actually I want it to be better very much, since after replacing the
    OpenVZ device permissions engine with the cgroup-based one I noticed that
    we set 12 default device permissions for each newly created container (for
    /dev/null, full, terminals, etc. devices), and people sometimes have up to
    20 perms more, so traversing the ~30-40 elements list under a spinlock
    doesn't seem very good.

    Here's the RCU protection for white-list - dev_whitelist_item-s are added
    and removed under the devcg->lock, but are looked up in permissions
    checking under the rcu_read_lock.

    Signed-off-by: Pavel Emelyanov
    Acked-by: Serge Hallyn
    Cc: Balbir Singh
    Cc: Paul Menage
    Cc: "Paul E. McKenney"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     
  • This patch converts devcgroup_access_write() from a raw file handler
    into a handler for the cgroup write_string() method. This allows some
    boilerplate copying/locking/checking to be removed and simplifies the
    cleanup path, since these functions are performed by the cgroups
    framework before calling the handler.

    Signed-off-by: Paul Menage
    Cc: Paul Jackson
    Cc: Pavel Emelyanov
    Cc: Balbir Singh
    Acked-by: Serge Hallyn
    Cc: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Menage
     

25 Jul, 2008

2 commits

  • Filesystem capabilities have come of age. Remove the experimental tag for
    configuring filesystem capabilities.

    Signed-off-by: Andrew G. Morgan
    Acked-by: Serge Hallyn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew G. Morgan
     
  • When cap_bset suppresses some of the forced (fP) capabilities of a file,
    it is generally only safe to execute the program if it understands how to
    recognize it doesn't have enough privilege to work correctly. For legacy
    applications (fE!=0), which have no non-destructive way to determine that
    they are missing privilege, we fail to execute (EPERM) any executable that
    requires fP capabilities, but would otherwise get pP' < fP. This is a
    fail-safe permission check.

    For some discussion of why it is problematic for (legacy) privileged
    applications to run with less than the set of capabilities requested for
    them, see:

    http://userweb.kernel.org/~morgan/sendmail-capabilities-war-story.html

    With this iteration of this support, we do not include setuid-0 based
    privilege protection from the bounding set. That is, the admin can still
    (ab)use the bounding set to suppress the privileges of a setuid-0 program.

    [akpm@linux-foundation.org: coding-style fixes]
    [akpm@linux-foundation.org: cleanup]
    Signed-off-by: Andrew G. Morgan
    Acked-by: Serge Hallyn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew G. Morgan
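
    The fail-safe check described above can be sketched in userspace as
    follows. This assumes the usual exec-time rule pP' = (bset & fP) |
    (pI & fI); the type, function and parameter names are hypothetical and
    this is not the kernel code.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical capability set: one bit per capability. */
typedef uint64_t cap_mask_t;

/*
 * Compute pP' = (bset & fP) | (pI & fI) and store it in *new_permitted.
 * For a legacy binary (fE != 0), refuse with -EPERM if any capability
 * in fP would be missing from pP'.
 */
int exec_permitted(cap_mask_t bset, cap_mask_t pI,
		   cap_mask_t fP, cap_mask_t fI,
		   int legacy, cap_mask_t *new_permitted)
{
	cap_mask_t pP = (bset & fP) | (pI & fI);

	if (legacy && (fP & ~pP))
		return -EPERM;

	*new_permitted = pP;
	return 0;
}
```

    A capability-aware binary (legacy == 0) still runs with the reduced set,
    since it is expected to notice the missing privilege itself.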
     

15 Jul, 2008

1 commit


14 Jul, 2008

27 commits

  • The register security hook is no longer required, as the capability
    module is always registered. LSMs wishing to stack capability as
    a secondary module should do so explicitly.

    Signed-off-by: James Morris
    Acked-by: Stephen Smalley
    Acked-by: Greg Kroah-Hartman

    James Morris
     
  • Fix small oversight in "security: remove dummy module":
    CONFIG_SECURITY_FILE_CAPABILITIES doesn't depend on CONFIG_SECURITY

    Signed-off-by: Miklos Szeredi
    Signed-off-by: James Morris

    Miklos Szeredi
     
  • Remove the dummy module and make the "capability" module the default.

    Compile and boot tested.

    Signed-off-by: Miklos Szeredi
    Acked-by: Serge Hallyn
    Signed-off-by: James Morris

    Miklos Szeredi
     
  • The sb_get_mnt_opts() hook is unused, and is superseded by the
    sb_show_options() hook.

    Signed-off-by: Miklos Szeredi
    Acked-by: James Morris

    Miklos Szeredi
     
  • This patch causes SELinux mount options to show up in /proc/mounts. As
    with other code in the area, seq_put errors are ignored. Other LSMs
    will not have their mount options displayed until they fill in their own
    security_sb_show_options() function.

    Signed-off-by: Eric Paris
    Signed-off-by: Miklos Szeredi
    Signed-off-by: James Morris

    Eric Paris
     
  • Currently if a FS is mounted for which SELinux policy does not define an
    fs_use_* that FS will either be genfs labeled or not labeled at all.
    This decision is based on the existence of a genfscon rule in policy and
    is irrespective of the capabilities of the filesystem itself. This
    patch allows the kernel to check if the filesystem supports security
    xattrs and if so will use those if there is no fs_use_* rule in policy.
    An fstype with no fs_use_* rule but with a genfs rule will use xattrs
    if available and will follow the genfs rule.

    This can be particularly interesting for things like ecryptfs which
    actually overlays a real underlying FS. If we define ecryptfs in
    policy to use xattrs we will likely get this wrong at times, so with
    this patch we just don't need to define it!

    Overlay ecryptfs on top of NFS with no xattr support:
    SELinux: initialized (dev ecryptfs, type ecryptfs), uses genfs_contexts
    Overlay ecryptfs on top of ext4 with xattr support:
    SELinux: initialized (dev ecryptfs, type ecryptfs), uses xattr

    It is also useful as the kernel adds new FS we don't need to add them in
    policy if they support xattrs and that is how we want to handle them.

    Signed-off-by: Eric Paris
    Acked-by: Stephen Smalley
    Signed-off-by: James Morris

    Eric Paris
     
  • Fix several warnings generated by sparse of the form
    "returning void-valued expression".

    Signed-off-by: James Morris
    Acked-by: Casey Schaufler
    Acked-by: Serge Hallyn

    James Morris
     
  • Use do_each_thread as a proper do/while block. Sparse complained.

    Signed-off-by: James Morris
    Acked-by: Stephen Smalley

    James Morris
     
  • Remove unused and shadowed addrlen variable. Picked up by sparse.

    Signed-off-by: James Morris
    Acked-by: Stephen Smalley
    Acked-by: Paul Moore

    James Morris
     
  • I've gotten complaints and reports about people not understanding the
    meaning of the current unknown class/perm handling the kernel emits on
    every policy load. Hopefully this will make it clear to everyone
    the meaning of the message and won't waste a printk the user won't care
    about anyway on systems where the kernel and the policy agree on
    everything.

    Signed-off-by: Eric Paris
    Signed-off-by: James Morris

    Eric Paris
     
  • On Mon, 2008-06-09 at 01:24 -0700, Andrew Morton wrote:
    > Getting a few of these with FC5:
    >
    > SELinux: context_struct_compute_av: unrecognized class 69
    > SELinux: context_struct_compute_av: unrecognized class 69
    >
    > one came out when I logged in.
    >
    > No other symptoms, yet.

    Change handling of invalid classes by SELinux, reporting class values
    unknown to the kernel as errors (w/ ratelimit applied) and handling
    class values unknown to policy as normal denials.

    Signed-off-by: Stephen Smalley
    Acked-by: Eric Paris
    Signed-off-by: James Morris

    Stephen Smalley
     
  • We used to protect against races of policy load in security_load_policy
    by using the load_mutex. Since then we have added a new mutex,
    sel_mutex, in sel_write_load(), which is always held across all calls to
    security_load_policy, so we are covered and can safely just drop this one.

    Signed-off-by: Eric Paris
    Acked-by: Stephen Smalley
    Signed-off-by: James Morris

    Eric Paris
     
  • The class_to_string array is referenced by tclass. My code mistakenly
    was using tclass - 1. If the preceding class is a userspace class
    rather than kernel class this may cause a denial/EINVAL even if unknown
    handling is set to allow. The bug shouldn't be allowing excess
    privileges since those are given based on the contents of another array
    which should be correctly referenced.

    At this point in time it's pretty unlikely this is going to cause
    problems. The most recently added kernel classes which could be
    affected are association, dccp_socket, and peer. It's pretty unlikely
    any policy with handle_unknown=allow has association and
    dccp_socket undefined (they've been around longer than unknown handling)
    and peer is conditionalized on a policy cap which should only be defined
    if that class exists in policy.

    Signed-off-by: Eric Paris
    Acked-by: Stephen Smalley
    Signed-off-by: James Morris

    Eric Paris
     
  • Open code sidtab lock to make Andrew Morton happy.

    Signed-off-by: James Morris
    Acked-by: Stephen Smalley

    James Morris
     
  • Open code load_mutex as suggested by Andrew Morton.

    Signed-off-by: James Morris

    James Morris
     
  • Open code policy_rwlock, as suggested by Andrew Morton.

    Signed-off-by: James Morris
    Acked-by: Stephen Smalley

    James Morris
     
  • Fix an endianness bug in the handling of network node addresses by
    SELinux. This yields no change on little endian hardware but fixes
    the incorrect handling on big endian hardware. The network node
    addresses are stored in network order in memory by checkpolicy, not in
    cpu/host order, and thus should not have cpu_to_le32/le32_to_cpu
    conversions applied upon policy write/read unlike other data in the
    policy.

    Bug reported by John Weeks of Sun, who noticed that binary policy
    files built from the same policy source on x86 and sparc differed and
    tracked it down to the ipv4 address handling in checkpolicy.

    Signed-off-by: Stephen Smalley
    Signed-off-by: James Morris

    Stephen Smalley
     
  • Simplify and improve the robustness of the SELinux ioctl checking by
    using the "access mode" bits of the ioctl command to determine the
    permission check rather than dealing with individual command values.
    This removes any knowledge of specific ioctl commands from SELinux
    and follows the same guidance we gave to Smack earlier.

    Signed-off-by: Stephen Smalley
    Signed-off-by: James Morris

    Stephen Smalley
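
    As an illustration of checking on the "access mode" bits rather than on
    individual command values, here is a minimal sketch using the _IOC_DIR()
    macro from the Linux UAPI header <linux/ioctl.h>. The PERM_* constants and
    the exact mapping are assumptions made for the example; this log does not
    show the mapping the patch uses.

```c
#include <linux/ioctl.h>

/* Hypothetical permission bits; the real SELinux permissions differ. */
#define PERM_READ  0x1u
#define PERM_WRITE 0x2u
#define PERM_IOCTL 0x4u

/* Derive the permission to check from the direction bits of cmd only. */
unsigned int ioctl_required_perms(unsigned int cmd)
{
	unsigned int perms = 0;

	if (_IOC_DIR(cmd) & _IOC_READ)
		perms |= PERM_READ;
	if (_IOC_DIR(cmd) & _IOC_WRITE)
		perms |= PERM_WRITE;
	if (!perms)
		perms = PERM_IOCTL;	/* no data transfer encoded */

	return perms;
}
```

    Because only the direction bits are inspected, a newly added ioctl command
    needs no change to this function.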
     
  • Enable processes with CAP_MAC_ADMIN + mac_admin permission in policy
    to get undefined contexts on inodes. This extends the support for
    deferred mapping of security contexts in order to permit restorecon
    and similar programs to see the raw file contexts unknown to the
    system policy in order to check them.

    Signed-off-by: Stephen Smalley
    Signed-off-by: James Morris

    Stephen Smalley
     
  • Enable security modules to distinguish reading of process state via
    proc from full ptrace access by renaming ptrace_may_attach to
    ptrace_may_access and adding a mode argument indicating whether only
    read access or full attach access is requested. This allows security
    modules to permit access to reading process state without granting
    full ptrace access. The base DAC/capability checking remains unchanged.

    Read access to /proc/pid/mem continues to apply a full ptrace attach
    check since check_mem_permission() already requires the current task
    to already be ptracing the target. The other ptrace checks within
    proc for elements like environ, maps, and fds are changed to pass the
    read mode instead of attach.

    In the SELinux case, we model such reading of process state as a
    reading of a proc file labeled with the target process' label. This
    enables SELinux policy to permit such reading of process state without
    permitting control or manipulation of the target process, as there are
    a number of cases where programs probe for such information via proc
    but do not need to be able to control the target (e.g. procps,
    lsof, PolicyKit, ConsoleKit). At present we have to choose between
    allowing full ptrace in policy (more permissive than required/desired)
    or breaking functionality (or in some cases just silencing the denials
    via dontaudit rules but this can hide genuine attacks).

    This version of the patch incorporates comments from Casey Schaufler
    (change/replace existing ptrace_may_attach interface, pass access
    mode), and Chris Wright (provide greater consistency in the checking).

    Note that like their predecessors __ptrace_may_attach and
    ptrace_may_attach, the __ptrace_may_access and ptrace_may_access
    interfaces use different return value conventions from each other (0
    or -errno vs. 1 or 0). I retained this difference to avoid any
    changes to the caller logic but made the difference clearer by
    changing the latter interface to return a bool rather than an int and
    by adding a comment about it to ptrace.h for any future callers.

    Signed-off-by: Stephen Smalley
    Acked-by: Chris Wright
    Signed-off-by: James Morris

    Stephen Smalley
     
  • Remove inherit field from inode_security_struct, per Stephen Smalley:
    "Let's just drop inherit altogether - dead field."

    Signed-off-by: James Morris

    James Morris
     
  • reorder inode_security_struct to remove padding on 64 bit builds

    size reduced from 72 to 64 bytes increasing objects per slab to 64.

    Signed-off-by: Richard Kennedy
    Signed-off-by: James Morris

    Richard Kennedy
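
    A small sketch of why field order affects structure size. The two structs
    below use hypothetical fields, not the real inode_security_struct layout,
    and objs_per_page() assumes a slab is a single page with no allocator
    overhead.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout with small fields between pointers. */
struct before {
	uint32_t sid;		/* 4 bytes, then 4 bytes of padding on LP64 */
	void *inode;		/* 8 bytes on LP64 */
	uint16_t sclass;	/* 2 bytes, then 6 bytes of padding on LP64 */
	void *list_next;
};

/* Same fields, pointers first, small fields grouped at the end. */
struct after {
	void *inode;
	void *list_next;
	uint32_t sid;
	uint16_t sclass;	/* followed by trailing padding only */
};

/* Objects that fit in one page, ignoring allocator overhead. */
size_t objs_per_page(size_t obj_size, size_t page_size)
{
	return page_size / obj_size;
}
```

    With a 4096-byte page, objs_per_page(72, 4096) is 56 and
    objs_per_page(64, 4096) is 64, which is consistent with the figure quoted
    in the commit message.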
     
  • Formatting and syntax changes

    whitespace, tabs to spaces, trailing space
    put open { on same line as struct def
    remove unneeded {} after if statements
    change printk("Lu") to printk("llu")
    convert asm/uaccess.h to linux/uaccess.h includes
    remove unnecessary asm/bug.h includes
    convert all users of simple_strtol to strict_strtol

    Signed-off-by: Eric Paris
    Signed-off-by: James Morris

    Eric Paris
     
  • Fix a sleeping function called from invalid context bug by moving allocation
    to the callers prior to taking the policy rdlock.

    Signed-off-by: Stephen Smalley
    Signed-off-by: James Morris

    Stephen Smalley
     
  • Introduce SELinux support for deferred mapping of security contexts in
    the SID table upon policy reload, and use this support for inode
    security contexts when the context is not yet valid under the current
    policy. Only processes with CAP_MAC_ADMIN + mac_admin permission in
    policy can set undefined security contexts on inodes. Inodes with
    such undefined contexts are treated as having the unlabeled context
    until the context becomes valid upon a policy reload that defines the
    context. Context invalidation upon policy reload also uses this
    support to save the context information in the SID table and later
    recover it upon a subsequent policy reload that defines the context
    again.

    This support is to enable package managers and similar programs to set
    down file contexts unknown to the system policy at the time the file
    is created in order to better support placing loadable policy modules
    in packages and to support build systems that need to create images of
    different distro releases with different policies w/o requiring all of
    the contexts to be defined or legal in the build host policy.

    With this patch applied, the following sequence is possible, although
    in practice it is recommended that this permission only be allowed to
    specific program domains such as the package manager.

    # rmdir baz
    # rm bar
    # touch bar
    # chcon -t foo_exec_t bar # foo_exec_t is not yet defined
    chcon: failed to change context of `bar' to `system_u:object_r:foo_exec_t': Invalid argument
    # mkdir -Z system_u:object_r:foo_exec_t baz
    mkdir: failed to set default file creation context to `system_u:object_r:foo_exec_t': Invalid argument
    # cat setundefined.te
    policy_module(setundefined, 1.0)
    require {
    type unconfined_t;
    type unlabeled_t;
    }
    files_type(unlabeled_t)
    allow unconfined_t self:capability2 mac_admin;
    # make -f /usr/share/selinux/devel/Makefile setundefined.pp
    # semodule -i setundefined.pp
    # chcon -t foo_exec_t bar # foo_exec_t is not yet defined
    # mkdir -Z system_u:object_r:foo_exec_t baz
    # ls -Zd bar baz
    -rw-r--r-- root root system_u:object_r:unlabeled_t bar
    drwxr-xr-x root root system_u:object_r:unlabeled_t baz
    # cat foo.te
    policy_module(foo, 1.0)
    type foo_exec_t;
    files_type(foo_exec_t)
    # make -f /usr/share/selinux/devel/Makefile foo.pp
    # semodule -i foo.pp # defines foo_exec_t
    # ls -Zd bar baz
    -rw-r--r-- root root user_u:object_r:foo_exec_t bar
    drwxr-xr-x root root system_u:object_r:foo_exec_t baz
    # semodule -r foo
    # ls -Zd bar baz
    -rw-r--r-- root root system_u:object_r:unlabeled_t bar
    drwxr-xr-x root root system_u:object_r:unlabeled_t baz
    # semodule -i foo.pp
    # ls -Zd bar baz
    -rw-r--r-- root root user_u:object_r:foo_exec_t bar
    drwxr-xr-x root root system_u:object_r:foo_exec_t baz
    # semodule -r setundefined foo
    # chcon -t foo_exec_t bar # no longer defined and not allowed
    chcon: failed to change context of `bar' to `system_u:object_r:foo_exec_t': Invalid argument
    # rmdir baz
    # mkdir -Z system_u:object_r:foo_exec_t baz
    mkdir: failed to set default file creation context to `system_u:object_r:foo_exec_t': Invalid argument

    Signed-off-by: Stephen Smalley
    Signed-off-by: James Morris

    Stephen Smalley
     
  • # cat devices.list
    c 1:3 r
    # echo 'c 1:3 w' > sub/devices.allow
    # cat sub/devices.list
    c 1:3 w

    As illustrated, the parent group has no write permission to /dev/null, so
    its child should not be allowed to add this write permission.

    Signed-off-by: Li Zefan
    Acked-by: Serge Hallyn
    Cc: Serge Hallyn
    Cc: Paul Menage
    Cc: Pavel Emelyanov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
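
    The rule that a child group may not gain access its parent lacks can be
    sketched as a subset check against the parent's whitelist. The structure,
    constants and function name below are hypothetical simplifications, not
    the kernel's data structures.

```c
#include <stddef.h>

#define ACC_MKNOD 1u
#define ACC_READ  2u
#define ACC_WRITE 4u
#define ANY (~0u)	/* wildcard major or minor */

struct wl_item {
	char type;			/* 'a', 'b' or 'c' */
	unsigned int major, minor;	/* ANY matches every number */
	unsigned int access;		/* bitmask of ACC_* */
};

/* Return 1 if some parent entry covers the device and every requested
 * access bit, 0 otherwise. */
int parent_allows(const struct wl_item *parent, size_t n,
		  const struct wl_item *req)
{
	size_t i;

	for (i = 0; i < n; i++) {
		const struct wl_item *p = &parent[i];

		if (p->type != 'a') {
			if (p->type != req->type)
				continue;
			if (p->major != ANY && p->major != req->major)
				continue;
			if (p->minor != ANY && p->minor != req->minor)
				continue;
		}
		if (req->access & ~p->access)
			continue;	/* asks for more than the parent has */
		return 1;
	}
	return 0;
}
```

    With a parent holding only "c 1:3 r", a request for "c 1:3 w" is rejected
    while "c 1:3 r" is accepted.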
     
  • # echo "b $((0x7fffffff)):$((0x80000000)) rwm" > devices.allow
    # cat devices.list
    b 214748364:-21474836 rwm

    Though a major/minor number of 0x80000000 is meaningless, we
    should not cast it to a negative value.

    Signed-off-by: Li Zefan
    Acked-by: Serge Hallyn
    Cc: Serge Hallyn
    Cc: Paul Menage
    Cc: Pavel Emelyanov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
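
    A minimal sketch of the formatting side of this fix: printing the number
    with an unsigned conversion keeps 0x80000000 positive. The function name,
    the ANY marker and the buffer handling are assumptions made for the
    example.

```c
#include <stdio.h>

#define ANY (~0u)	/* hypothetical wildcard marker */

/* Write "*" for the wildcard, otherwise the number in unsigned decimal. */
void format_majmin(char *buf, size_t len, unsigned int m)
{
	if (m == ANY)
		snprintf(buf, len, "*");
	else
		snprintf(buf, len, "%u", m);
}
```

    The caller is expected to pass a buffer large enough for ten digits plus
    the terminating NUL; a shorter buffer would truncate the output.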
     

05 Jul, 2008

2 commits

  • # cat /devcg/devices.list
    a *:* rwm
    # echo a > devices.allow
    # cat /devcg/devices.list
    a *:* rwm
    a 0:0 rwm

    This is odd and maybe confusing. With this patch, writing 'a' to
    devices.allow will add 'a *:* rwm' to the whitelist.

    Also a few fixes and updates to the document.

    Signed-off-by: Li Zefan
    Cc: Pavel Emelyanov
    Cc: Serge E. Hallyn
    Cc: Paul Menage
    Cc: Balbir Singh
    Cc: James Morris
    Cc: Chris Wright
    Cc: Stephen Smalley
    Cc: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
     
  • The filesystem capability support meaning for CAP_SETPCAP is less powerful
    than the non-filesystem capability support. As such, when filesystem
    capabilities are configured, we should not permit CAP_SETPCAP to 'enhance'
    the current process through strace manipulation of a child process.

    Signed-off-by: Andrew G. Morgan
    Acked-by: Serge Hallyn
    Cc: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew G. Morgan
     

13 Jun, 2008

1 commit

  • The dummy module is used by folk that run security conscious code(!?). A
    feature of such code (for example, dhclient) is that it tries to operate
    with minimum privilege (dropping unneeded capabilities). While the dummy
    module doesn't restrict code execution based on capability state, the user
    code expects the kernel to appear to support it. This patch adds back
    faked support for the PR_SET_KEEPCAPS etc., calls - making the kernel
    behave as before 2.6.26.

    For details see: http://bugzilla.kernel.org/show_bug.cgi?id=10748

    Signed-off-by: Andrew G. Morgan
    Acked-by: Serge Hallyn
    Cc: James Morris
    Cc: Stephen Smalley
    Cc: Chris Wright
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew G. Morgan
     

07 Jun, 2008

4 commits

  • This semaphore doesn't appear to be used, so remove it.

    Signed-off-by: Daniel Walker
    Cc: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Daniel Walker
     
  • Consider you added a 'c foo:bar r' permission to some cgroup and then (a
    bit later) 'c foo:bar w' for it. After this you'll see the

    c foo:bar r
    c foo:bar w

    lines in a devices.list file.

    Another example - consider you added 10 'c foo:bar r' permissions to some
    cgroup (e.g. by mistake). After this you'll see 10 c foo:bar r lines in
    a list file.

    This is weird. This situation also has one more annoying consequence.
    Having many items in a white list makes permissions checking slower, since
    it has to walk a longer list.

    The proposal is to merge permissions for items that correspond to the
    same device.

    Signed-off-by: Pavel Emelyanov
    Acked-by: Serge Hallyn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
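
    The merging proposed above can be sketched with a fixed-size array in
    place of the kernel's linked list. All names and the MAX_ITEMS limit are
    hypothetical.

```c
#include <stddef.h>

#define ACC_MKNOD 1u
#define ACC_READ  2u
#define ACC_WRITE 4u
#define MAX_ITEMS 64

struct wl_item {
	char type;
	unsigned int major, minor, access;
};

struct whitelist {
	struct wl_item items[MAX_ITEMS];
	size_t count;
};

/* Add a rule; if an entry for the same device exists, OR the access
 * bits into it instead of appending a duplicate.
 * Returns 0 on success, -1 if the list is full. */
int whitelist_add(struct whitelist *wl, char type,
		  unsigned int major, unsigned int minor, unsigned int access)
{
	size_t i;

	for (i = 0; i < wl->count; i++) {
		struct wl_item *it = &wl->items[i];

		if (it->type == type && it->major == major &&
		    it->minor == minor) {
			it->access |= access;
			return 0;
		}
	}
	if (wl->count == MAX_ITEMS)
		return -1;
	wl->items[wl->count++] = (struct wl_item){ type, major, minor, access };
	return 0;
}
```

    Adding "r" and then "w" for the same device leaves one entry with both
    bits set, and repeating the same rule ten times leaves the list length
    unchanged after the first addition.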
     
  • Two functions, that need to get a device_cgroup from a task (they are
    devcgroup_inode_permission and devcgroup_inode_mknod) make it in a strange
    way:

    They get a css_set from task, then a subsys_state from css_set, then a
    cgroup from the state and then a subsys_state again from the cgroup.
    Besides, the devices_subsys_id is read from memory, whilst there's an
    enum-ed constant for it.

    Optimize this part a bit:
    1. Get the subsys_state from the task and be done - no 2 extra
    dereferences,
    2. Use the devices_subsys_id constant, not the value from memory
    (i.e. one less dereference).

    Found while preparing 2.6.26 OpenVZ port.

    Signed-off-by: Pavel Emelyanov
    Acked-by: Serge Hallyn
    Acked-by: Paul Menage
    Cc: Balbir Singh
    Cc: James Morris
    Cc: Chris Wright
    Cc: Stephen Smalley
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     
  • This is just picking the container_of out of cgroup_to_devcgroup into a
    separate function.

    This new css_to_devcgroup will be used in the 2nd patch.

    Signed-off-by: Pavel Emelyanov
    Acked-by: Serge Hallyn
    Cc: Paul Menage
    Cc: Balbir Singh
    Cc: James Morris
    Cc: Chris Wright
    Cc: Stephen Smalley
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
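
    For readers unfamiliar with the idiom, here is a userspace sketch of a
    container_of()-style conversion from an embedded member back to its
    enclosing structure. The macro is a simplified stand-in for the kernel's
    container_of(), and both structures are hypothetical, trimmed-down
    versions of the real ones.

```c
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, trimmed-down structures. */
struct cgroup_subsys_state {
	int refcnt;
};

struct dev_cgroup {
	int whitelist_len;
	struct cgroup_subsys_state css;	/* embedded member */
};

/* Recover the enclosing dev_cgroup from a pointer to its css member. */
struct dev_cgroup *css_to_devcgroup(struct cgroup_subsys_state *s)
{
	return container_of(s, struct dev_cgroup, css);
}
```

    Keeping the conversion in its own helper lets any caller that already has
    the subsys state skip the detour through the cgroup.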