01 Apr, 2009

2 commits

  • * all changes of current->fs are done under task_lock and write_lock of
    old fs->lock
    * refcount is not atomic anymore (same protection)
    * its decrements are done when removing reference from current; at the
    same time we decide whether to free it.
    * put_fs_struct() is gone
    * new field - ->in_exec. Set by check_unsafe_exec() if we are trying to do
    execve() and only subthreads share fs_struct. Cleared when finishing exec
    (success and failure alike). Makes CLONE_FS fail with -EAGAIN if set.
    * check_unsafe_exec() may fail with -EAGAIN if another execve() from subthread
    is in progress.
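
The interplay between ->in_exec, check_unsafe_exec() and CLONE_FS described above can be sketched as a small userspace model. This is only an illustration: struct fs_model and the model_* functions are hypothetical stand-ins, not kernel code, and the locking (task_lock plus fs->lock) is assumed to be held by the caller rather than modelled.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the fields discussed above. */
struct fs_model {
    int users;      /* plain counter; the real one relies on the caller's locks */
    bool in_exec;   /* true while an execve() that passed the check is running */
};

/*
 * Model of the exec-time check. n_fs is the number of threads in the
 * caller's own thread group (caller included) that share this fs.
 * Sets *unsafe when someone outside the group shares fs; otherwise
 * claims in_exec, or returns -EAGAIN if another exec already holds it.
 */
static int model_check_unsafe_exec(struct fs_model *fs, int n_fs, bool *unsafe)
{
    *unsafe = fs->users > n_fs;
    if (*unsafe)
        return 0;
    if (fs->in_exec)
        return -EAGAIN;
    fs->in_exec = true;
    return 0;
}

/* Model of clone(CLONE_FS): refuse to add a sharer while an exec is in flight. */
static int model_clone_fs(struct fs_model *fs)
{
    if (fs->in_exec)
        return -EAGAIN;
    fs->users++;
    return 0;
}

/* Called when the exec finishes, whether it succeeded or failed. */
static void model_finish_exec(struct fs_model *fs)
{
    fs->in_exec = false;
}
```

Refusing CLONE_FS while the flag is set keeps the sharer count that the check relied on from changing underneath the exec.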

    Signed-off-by: Al Viro

    Al Viro
     
  • Pure code move; two new helper functions for nfsd and daemonize
    (unshare_fs_struct() and daemonize_fs_struct() resp.; for now -
    the same code as used to be in callers). unshare_fs_struct()
    exported (for nfsd, as copy_fs_struct()/exit_fs() used to be),
    copy_fs_struct() and exit_fs() don't need exports anymore.

    Signed-off-by: Al Viro

    Al Viro
     

29 Mar, 2009

1 commit

  • Joe Malicki reports that setuid sometimes doesn't: very rarely,
    a setuid root program does not get root euid; and, by the way,
    they have a health check running lsof every few minutes.

    Right, check_unsafe_exec() notes whether the files_struct is being
    shared by more threads than will get killed by the exec, and if so
    sets LSM_UNSAFE_SHARE to make bprm_set_creds() careful about euid.
    But /proc/<pid>/fd and /proc/<pid>/fdinfo lookups make transient
    use of get_files_struct(), which also raises that sharing count.

    There's a rather simple fix for this: exec's check on files->count
    has been redundant ever since 2.6.1 made it unshare_files() (except
    while compat_do_execve() omitted to do so) - just remove that check.

    [Note to -stable: this patch will not apply before 2.6.29: earlier
    releases should just remove the files->count line from unsafe_exec().]
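
Why a transient reference can defeat a count-based check is easy to see in a toy model. The names below (struct files_model, model_get_files(), model_old_files_check()) are hypothetical and only mimic the shape of the logic described in this message; they are not the kernel's functions.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the reference count on a files_struct. */
struct files_model {
    int count;
};

/* A /proc/<pid>/fd style lookup: take a reference, look, drop it again. */
static void model_get_files(struct files_model *f)
{
    f->count++;
}

static void model_put_files(struct files_model *f)
{
    f->count--;
}

/*
 * The removed check, in model form: exec is treated as "shared" when the
 * count exceeds n_files, the number of references that exec itself will
 * get rid of.
 */
static bool model_old_files_check(const struct files_model *f, int n_files)
{
    return f->count > n_files;
}
```

If the exec-time check happens to run between the get and the put, it reports sharing that does not really exist, which matches the rare, timing-dependent failure in the report.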

    Reported-by: Joe Malicki
    Narrowed-down-by: Michael Itz
    Tested-by: Joe Malicki
    Signed-off-by: Hugh Dickins
    Cc: stable@kernel.org
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     

07 Feb, 2009

1 commit

  • The patch:

    commit a6f76f23d297f70e2a6b3ec607f7aeeea9e37e8d
    CRED: Make execve() take advantage of copy-on-write credentials

    moved the point at which the 'safeness' of a SUID/SGID exec is determined to
    before de_thread() is called. This means that LSM_UNSAFE_SHARE is now
    calculated incorrectly. This flag is set if any of the usage counts for
    fs_struct, files_struct and sighand_struct are greater than 1 at the time the
    determination is made, and all of them are for threads created by the
    pthread library.

    However, since we wish to make the security calculation before irrevocably
    damaging the process so that we can return it an error code in the case where
    we decide we want to reject the exec request on this basis, we have to make the
    determination before calling de_thread().

    So, instead, we count up the number of threads (CLONE_THREAD) that are sharing
    our fs_struct (CLONE_FS), files_struct (CLONE_FILES) and sighand_structs
    (CLONE_SIGHAND/CLONE_THREAD) with us. These will be killed by de_thread() and
    so can be discounted by check_unsafe_exec().

    We do have to be careful because CLONE_THREAD does not imply FS or FILES.

    We _assume_ that there will be no extra references to these structs held by the
    threads we're going to kill.
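
The counting rule can be sketched in userspace as follows. This is a simplified model under stated assumptions: struct task_model and model_unsafe_share() are hypothetical, the sighand_struct count is left out for brevity, and pointer identity stands in for "shares the same structure".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical task record: only what the counting needs. */
struct task_model {
    int tgid;            /* thread group id (CLONE_THREAD siblings share it) */
    const void *fs;      /* identity of the fs_struct in use */
    const void *files;   /* identity of the files_struct in use */
};

/*
 * Count the tasks in self's thread group (self included) that share its
 * fs and files, then compare against the structures' total user counts.
 * Users beyond those counts are outside the group and survive the exec.
 */
static bool model_unsafe_share(const struct task_model *tasks, size_t n,
                               size_t self, int fs_users, int files_users)
{
    int n_fs = 0, n_files = 0;

    for (size_t i = 0; i < n; i++) {
        if (tasks[i].tgid != tasks[self].tgid)
            continue;                       /* not killed by de_thread() */
        if (tasks[i].fs == tasks[self].fs)
            n_fs++;                         /* CLONE_THREAD does not imply CLONE_FS */
        if (tasks[i].files == tasks[self].files)
            n_files++;                      /* ... nor CLONE_FILES */
    }
    return fs_users > n_fs || files_users > n_files;
}
```

Comparing the pointers per task, instead of assuming every sibling shares everything, is what covers the case where CLONE_THREAD was used without CLONE_FS or CLONE_FILES.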

    This can be tested with the attached pair of programs. Build the two programs
    using the Makefile supplied, and run ./test1 as a non-root user. If
    successful, you should see something like:

    [dhowells@andromeda tmp]$ ./test1
    --TEST1--
    uid=4043, euid=4043 suid=4043
    exec ./test2
    --TEST2--
    uid=4043, euid=0 suid=0
    SUCCESS - Correct effective user ID

    and if unsuccessful, something like:

    [dhowells@andromeda tmp]$ ./test1
    --TEST1--
    uid=4043, euid=4043 suid=4043
    exec ./test2
    --TEST2--
    uid=4043, euid=4043 suid=4043
    ERROR - Incorrect effective user ID!

    The non-root user ID you see will depend on the user you run as.

    [test1.c]
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <pthread.h>

    static void *thread_func(void *arg)
    {
        while (1) {}
    }

    int main(int argc, char **argv)
    {
        pthread_t tid;
        uid_t uid, euid, suid;

        printf("--TEST1--\n");
        getresuid(&uid, &euid, &suid);
        printf("uid=%d, euid=%d suid=%d\n", uid, euid, suid);

        /* pthread_create() returns a positive error number on failure */
        if (pthread_create(&tid, NULL, thread_func, NULL) != 0) {
            perror("pthread_create");
            exit(1);
        }

        printf("exec ./test2\n");
        execlp("./test2", "test2", NULL);
        perror("./test2");
        _exit(1);
    }

    [test2.c]
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        uid_t uid, euid, suid;

        getresuid(&uid, &euid, &suid);
        printf("--TEST2--\n");
        printf("uid=%d, euid=%d suid=%d\n", uid, euid, suid);

        if (euid != 0) {
            fprintf(stderr, "ERROR - Incorrect effective user ID!\n");
            exit(1);
        }
        printf("SUCCESS - Correct effective user ID\n");
        exit(0);
    }

    [Makefile]
    CFLAGS = -D_GNU_SOURCE -Wall -Werror -Wunused
    all: test1 test2

    test1: test1.c
    	gcc $(CFLAGS) -o test1 test1.c -lpthread

    test2: test2.c
    	gcc $(CFLAGS) -o test2 test2.c
    	sudo chown root.root test2
    	sudo chmod +s test2

    Reported-by: David Smith
    Signed-off-by: David Howells
    Acked-by: David Smith
    Signed-off-by: James Morris

    David Howells
     

14 Nov, 2008

1 commit

  • Make execve() take advantage of copy-on-write credentials, allowing it to set
    up the credentials in advance, and then commit the whole lot after the point
    of no return.

    This patch and the preceding patches have been tested with the LTP SELinux
    testsuite.

    This patch makes several logical sets of alteration:

    (1) execve().

    The credential bits from struct linux_binprm are, for the most part,
    replaced with a single credentials pointer (bprm->cred). This means that
    all the creds can be calculated in advance and then applied at the point
    of no return with no possibility of failure.

    I would like to replace bprm->cap_effective with:

    cap_isclear(bprm->cap_effective)

    but this seems impossible due to special behaviour for processes of pid 1
    (they always retain their parent's capability masks where normally they'd
    be changed - see cap_bprm_set_creds()).

    The following sequence of events now happens:

    (a) At the start of do_execve, the current task's cred_exec_mutex is
    locked to prevent PTRACE_ATTACH from obsoleting the calculation of
    creds that we make.

    prepare_exec_creds() is then called to make a copy of the current
    task's credentials and prepare it. This copy is then assigned to
    bprm->cred.

    This renders security_bprm_alloc() and security_bprm_free()
    unnecessary, and so they've been removed.

    (b) The determination of unsafe execution is now performed immediately
    after (a) rather than later on in the code. The result is stored in
    bprm->unsafe for future reference.

    (c) prepare_binprm() is called, possibly multiple times.

    (i) This applies the result of set[ug]id binaries to the new creds
    attached to bprm->cred. Personality bit clearance is recorded,
    but now deferred on the basis that the exec procedure may yet
    fail.

    (ii) This then calls the new security_bprm_set_creds(). This should
    calculate the new LSM and capability credentials into *bprm->cred.

    This folds together security_bprm_set() and parts of
    security_bprm_apply_creds() (these two have been removed).
    Anything that might fail must be done at this point.

    (iii) bprm->cred_prepared is set to 1.

    bprm->cred_prepared is 0 on the first pass of the security
    calculations, and 1 on all subsequent passes. This allows SELinux
    in (ii) to base its calculations only on the initial script and
    not on the interpreter.
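
The role of bprm->cred_prepared across repeated prepare_binprm() passes can be sketched like this. It is a deliberately reduced model: struct bprm_model, its label field and the model_* functions are hypothetical, and the "label" merely stands in for whatever a security module derives from the file.

```c
/* Hypothetical, much reduced stand-in for struct linux_binprm. */
struct bprm_model {
    int cred_prepared;   /* 0 on the first pass, 1 afterwards */
    int label;           /* stands in for whatever the LSM computes */
};

/* Model of a security hook that only trusts the first file it sees. */
static int model_set_creds(struct bprm_model *bprm, int file_label)
{
    if (bprm->cred_prepared)
        return 0;                 /* later pass (interpreter): keep first result */
    bprm->label = file_label;     /* first pass (the script itself) */
    return 0;
}

/* Model of one prepare_binprm() pass over a file. */
static int model_prepare_binprm(struct bprm_model *bprm, int file_label)
{
    int ret = model_set_creds(bprm, file_label);

    if (ret)
        return ret;
    bprm->cred_prepared = 1;
    return 0;
}
```

Calling model_prepare_binprm() first for the script and then for its interpreter leaves the label computed from the script in place.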

    (d) flush_old_exec() is called to commit the task to execution. This
    performs the following steps with regard to credentials:

    (i) Clear pdeath_signal and set dumpable in certain circumstances that
    may not be covered by commit_creds().

    (ii) Clear any bits in current->personality that were deferred from
    (c.i).

    (e) install_exec_creds() [compute_creds() as was] is called to install the
    new credentials. This performs the following steps with regard to
    credentials:

    (i) Calls security_bprm_committing_creds() to apply any security
    requirements, such as flushing unauthorised files in SELinux, that
    must be done before the credentials are changed.

    This is made up of bits of security_bprm_apply_creds() and
    security_bprm_post_apply_creds(), both of which have been removed.
    This function is not allowed to fail; anything that might fail
    must have been done in (c.ii).

    (ii) Calls commit_creds() to apply the new credentials in a single
    assignment (more or less). Possibly pdeath_signal and dumpable
    should be part of struct creds.

    (iii) Unlocks the task's cred_exec_mutex, thus allowing
    PTRACE_ATTACH to take place.

    (iv) Clears the bprm->cred pointer as the credentials it was holding
    are now immutable.

    (v) Calls security_bprm_committed_creds() to apply any security
    alterations that must be done after the creds have been changed.
    SELinux uses this to flush signals and signal handlers.

    (f) If an error occurs before (d.i), bprm_free() will call abort_creds()
    to destroy the proposed new credentials and will then unlock
    cred_exec_mutex. No changes to the credentials will have been
    made.
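
The prepare / commit / abort pattern behind the sequence above can be sketched in plain C. This is only a model under stated assumptions: struct cred_model, struct exec_task and the model_* functions are hypothetical, there is no reference counting or RCU, and the mutex that fences off PTRACE_ATTACH is not modelled.

```c
#include <stdlib.h>

/* Hypothetical credential set and task; the kernel's are far richer. */
struct cred_model {
    unsigned int uid;
    unsigned int euid;
};

struct exec_task {
    struct cred_model *cred;   /* currently installed credentials */
};

/* Allocate a credential set; returns NULL on allocation failure. */
static struct cred_model *model_cred_alloc(unsigned int uid, unsigned int euid)
{
    struct cred_model *c = malloc(sizeof(*c));

    if (c) {
        c->uid = uid;
        c->euid = euid;
    }
    return c;
}

/* Prepare: work on a private copy; the task's own creds are untouched. */
static struct cred_model *model_prepare_creds(const struct exec_task *t)
{
    return model_cred_alloc(t->cred->uid, t->cred->euid);
}

/* Commit: one pointer assignment installs the new set; cannot fail. */
static void model_commit_creds(struct exec_task *t, struct cred_model *new_cred)
{
    struct cred_model *old = t->cred;

    t->cred = new_cred;
    free(old);
}

/* Abort: discard the proposal; nothing visible has changed. */
static void model_abort_creds(struct cred_model *new_cred)
{
    free(new_cred);
}
```

Everything that can fail (here, only the allocation) happens in the prepare step, so the commit step after the point of no return has no error path.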

    (2) LSM interface.

    A number of functions have been changed, added or removed:

    (*) security_bprm_alloc(), ->bprm_alloc_security()
    (*) security_bprm_free(), ->bprm_free_security()

    Removed in favour of preparing new credentials and modifying those.

    (*) security_bprm_apply_creds(), ->bprm_apply_creds()
    (*) security_bprm_post_apply_creds(), ->bprm_post_apply_creds()

    Removed; split between security_bprm_set_creds(),
    security_bprm_committing_creds() and security_bprm_committed_creds().

    (*) security_bprm_set(), ->bprm_set_security()

    Removed; folded into security_bprm_set_creds().

    (*) security_bprm_set_creds(), ->bprm_set_creds()

    New. The new credentials in bprm->cred should be checked and set up
    as appropriate. bprm->cred_prepared is 0 on the first call, 1 on the
    second and subsequent calls.

    (*) security_bprm_committing_creds(), ->bprm_committing_creds()
    (*) security_bprm_committed_creds(), ->bprm_committed_creds()

    New. Apply the security effects of the new credentials. This
    includes closing unauthorised files in SELinux. Neither function may
    fail. When the former is called, the creds haven't yet been applied
    to the process; when the latter is called, they have.

    The former may access bprm->cred, the latter may not.
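
The division of labour between the three new hooks can be illustrated with a hook table whose return types encode which step may fail. This is a sketch, not the LSM interface itself: struct lsm_ops_model, struct hook_trace and the sample_* hooks are hypothetical, and the trace letters exist only to make the call order visible.

```c
#include <errno.h>

/* Hypothetical exec state: records which hooks ran, in order. */
struct hook_trace {
    char order[8];
    int n;
};

/* Hypothetical hook table shaped like the three hooks described above. */
struct lsm_ops_model {
    int  (*set_creds)(struct hook_trace *);         /* may fail */
    void (*committing_creds)(struct hook_trace *);  /* must not fail */
    void (*committed_creds)(struct hook_trace *);   /* must not fail */
};

/* Append one letter to the trace, keeping it NUL-terminated. */
static void trace_add(struct hook_trace *t, char c)
{
    if (t->n < (int)sizeof(t->order) - 1) {
        t->order[t->n++] = c;
        t->order[t->n] = '\0';
    }
}

static int  sample_set_ok(struct hook_trace *t)     { trace_add(t, 'S'); return 0; }
static int  sample_set_deny(struct hook_trace *t)   { trace_add(t, 'S'); return -EPERM; }
static void sample_committing(struct hook_trace *t) { trace_add(t, 'B'); }
static void sample_committed(struct hook_trace *t)  { trace_add(t, 'A'); }

/* Drive the hooks in the order the text gives; 'C' marks the commit itself. */
static int model_exec(const struct lsm_ops_model *ops, struct hook_trace *t)
{
    int ret = ops->set_creds(t);

    if (ret)
        return ret;              /* failure is only possible before the commit */
    ops->committing_creds(t);    /* creds not yet applied */
    trace_add(t, 'C');
    ops->committed_creds(t);     /* creds now applied */
    return 0;
}
```

Giving the two commit-time hooks a void return type makes the "may not fail" rule visible in the interface itself.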

    (3) SELinux.

    SELinux has a number of changes, in addition to those to support the LSM
    interface changes mentioned above:

    (a) The bprm_security_struct struct has been removed in favour of using
    the credentials-under-construction approach.

    (b) flush_unauthorized_files() now takes a cred pointer and passes it on
    to inode_has_perm(), file_has_perm() and dentry_open().

    Signed-off-by: David Howells
    Acked-by: James Morris
    Acked-by: Serge Hallyn
    Signed-off-by: James Morris

    David Howells
     

22 Apr, 2008

1 commit


09 May, 2007

1 commit

  • Merge all compat ioctl handling into compat_ioctl.c instead of splitting it
    over compat.c and compat_ioctl.c. This also allows us to get rid of ioctl32.h.

    Signed-off-by: Christoph Hellwig
    Looks-good-to: Andi Kleen
    Acked-by: Arnd Bergmann
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     

01 Oct, 2006

4 commits

  • - forward declare struct super_block
    - use inlines, not macros
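
The commit message does not give its reasons, but one general argument for preferring inline functions over function-like macros is that a macro can evaluate its argument more than once. The sketch below shows that effect; SQUARE_MACRO, square_inline() and next_value() are hypothetical names chosen for illustration and have nothing to do with the code this commit touched.

```c
/* Function-like macro: the argument expression is pasted in twice. */
#define SQUARE_MACRO(x) ((x) * (x))

/* Inline function: the argument is evaluated once and type-checked. */
static inline int square_inline(int x)
{
    return x * x;
}

/* Helper with a visible side effect, to count evaluations. */
static int calls;

static int next_value(void)
{
    return ++calls;
}
```

Passing next_value() to square_inline() advances the counter once, whereas passing it to SQUARE_MACRO() advances it twice.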

    Signed-off-by: Andrew Morton
    Signed-off-by: Jens Axboe

    Andrew Morton
     
  • Make it possible to disable the block layer. Not all embedded devices require
    it, some can make do with just JFFS2, NFS, ramfs, etc - none of which require
    the block layer to be present.

    This patch does the following:

    (*) Introduces CONFIG_BLOCK to disable the block layer, buffering and blockdev
    support.

    (*) Adds dependencies on CONFIG_BLOCK to any configuration item that controls
    an item that uses the block layer. This includes:

        (*) Block I/O tracing.

        (*) Disk partition code.

        (*) All filesystems that are block based, eg: Ext3, ReiserFS, ISOFS.

        (*) The SCSI layer. As far as I can tell, even SCSI chardevs use the
        block layer to do scheduling. Some drivers that use SCSI facilities -
        such as USB storage - end up disabled indirectly from this.

        (*) Various block-based device drivers, such as IDE and the old CDROM
        drivers.

        (*) MTD blockdev handling and FTL.

        (*) JFFS - which uses set_bdev_super(), something it could avoid doing by
        taking a leaf out of JFFS2's book.

    (*) Makes most of the contents of linux/blkdev.h, linux/buffer_head.h and
    linux/elevator.h contingent on CONFIG_BLOCK being set. sector_div() is,
    however, still used in places, and so is still available.

    (*) Also made contingent are the contents of linux/mpage.h, linux/genhd.h and
    parts of linux/fs.h.

    (*) Makes a number of files in fs/ contingent on CONFIG_BLOCK.

    (*) Makes mm/bounce.c (bounce buffering) contingent on CONFIG_BLOCK.

    (*) set_page_dirty() doesn't call __set_page_dirty_buffers() if CONFIG_BLOCK
    is not enabled.

    (*) fs/no-block.c is created to hold out-of-line stubs and things that are
    required when CONFIG_BLOCK is not set:

        (*) Default blockdev file operations (to give error ENODEV on opening).

    (*) Makes some /proc changes:

        (*) /proc/devices does not list any blockdevs.

        (*) /proc/diskstats and /proc/partitions are contingent on CONFIG_BLOCK.

    (*) Makes some compat ioctl handling contingent on CONFIG_BLOCK.

    (*) If CONFIG_BLOCK is not defined, makes sys_quotactl() return -ENODEV if
    given a command other than Q_SYNC or if a special device is specified.

    (*) In init/do_mounts.c, no reference is made to the blockdev routines if
    CONFIG_BLOCK is not defined. This does not prohibit NFS roots or JFFS2.

    (*) The bdflush, ioprio_set and ioprio_get syscalls can now be absent (return
    error ENOSYS by way of cond_syscall if so).

    (*) The seclvl_bd_claim() and seclvl_bd_release() security calls do nothing if
    CONFIG_BLOCK is not set, since they can't then happen.
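
The stub technique used throughout this list can be sketched with an ordinary preprocessor symbol. This is a userspace illustration only: model_blkdev_open() and model_quotactl() are hypothetical, their parameters are simplified, and the CONFIG_BLOCK branch of the open function is a placeholder, not the real open path.

```c
#include <errno.h>
#include <stddef.h>

/*
 * CONFIG_BLOCK is normally set by the kernel build system; here it is an
 * ordinary preprocessor symbol (compile with -DCONFIG_BLOCK to flip it).
 */
#ifdef CONFIG_BLOCK
/* Placeholder for the real open path; details are out of scope here. */
static int model_blkdev_open(int minor)
{
    return minor >= 0 ? 0 : -EINVAL;
}
#else
/* Stub in the spirit of fs/no-block.c: opening a blockdev always fails. */
static int model_blkdev_open(int minor)
{
    (void)minor;
    return -ENODEV;
}
#endif

/*
 * Model of the quotactl rule: without CONFIG_BLOCK only a sync command
 * with no special device is let through. is_sync and special are
 * simplified, hypothetical parameters.
 */
static int model_quotactl(int is_sync, const char *special)
{
#ifndef CONFIG_BLOCK
    if (!is_sync || special != NULL)
        return -ENODEV;
#endif
    (void)is_sync;
    (void)special;
    return 0;
}
```

Keeping the stub's signature identical to the real function lets callers compile unchanged in both configurations.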

    Signed-Off-By: David Howells
    Signed-off-by: Jens Axboe

    David Howells
     
  • Move blockdev_superblock extern declaration from fs/fs-writeback.c to a
    header file and remove the dependence on it by wrapping it in a macro.
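
Wrapping a global in a macro, so that callers never name the variable directly, might look like the sketch below. All names here (struct sb_model, model_blockdev_superblock, model_sb_is_blkdev_sb) are hypothetical and only modelled on the description; the actual identifiers and header are not given in this message.

```c
/* Hypothetical stand-in for a superblock; only identity matters here. */
struct sb_model {
    int id;
};

/* In a real tree this would be an extern declared in one header. */
static struct sb_model *model_blockdev_superblock;

/* Callers use the wrapper and never name the variable themselves. */
#define model_sb_is_blkdev_sb(sb) ((sb) == model_blockdev_superblock)
```

With the test hidden behind one name, a configuration that has no such superblock could redefine the wrapper as a constant 0 without touching any caller.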

    Signed-Off-By: David Howells
    Signed-off-by: Jens Axboe

    David Howells
     
  • Create a new header file, fs/internal.h, for common definitions local to the
    sources in the fs/ directory.

    Move extern definitions that should be in header files from fs/*.c to
    fs/internal.h or other main header files where they span directories.

    Signed-Off-By: David Howells
    Signed-off-by: Jens Axboe

    David Howells