04 Mar, 2013

3 commits

  • ... and switch i386 to HAVE_SYSCALL_WRAPPERS, killing open-coded
    uses of asmlinkage_protect() in a bunch of syscalls.

    Signed-off-by: Al Viro

    Al Viro
     
  • ... and convert a bunch of SYSCALL_DEFINE ones to SYSCALL_DEFINE,
    killing the boilerplate crap around them.

    Signed-off-by: Al Viro

    Al Viro
     
  • Pull more VFS bits from Al Viro:
    "Unfortunately, it looks like xattr series will have to wait until the
    next cycle ;-/

    This pile contains 9p cleanups and fixes (races in v9fs_fid_add()
    etc), fixup for nommu breakage in shmem.c, several cleanups and a bit
    more file_inode() work"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    constify path_get/path_put and fs_struct.c stuff
    fix nommu breakage in shmem.c
    cache the value of file_inode() in struct file
    9p: if v9fs_fid_lookup() gets to asking server, it'd better have hashed dentry
    9p: make sure ->lookup() adds fid to the right dentry
    9p: untangle ->lookup() a bit
    9p: double iput() in ->lookup() if d_materialise_unique() fails
    9p: v9fs_fid_add() can't fail now
    v9fs: get rid of v9fs_dentry
    9p: turn fid->dlist into hlist
    9p: don't bother with private lock in ->d_fsdata; dentry->d_lock will do just fine
    more file_inode() open-coded instances
    selinux: opened file can't have NULL or negative ->f_path.dentry

    (In the meantime, the hlist traversal macros have changed, so this
    required a semantic conflict fixup for the newly hlistified fid->dlist)

    Linus Torvalds
     

03 Mar, 2013

1 commit

  • Pull signal/compat fixes from Al Viro:
    "Fixes for several regressions introduced in the last signal.git pile,
    along with fixing bugs in truncate and ftruncate compat (on just about
    anything biarch at least one of those two had been done wrong)."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
    compat: restore timerfd settime and gettime compat syscalls
    [regression] braino in "sparc: convert to ksignal"
    fix compat truncate/ftruncate
    switch lseek to COMPAT_SYSCALL_DEFINE
    lseek() and truncate() on sparc really need sign extension

    Linus Torvalds
     

02 Mar, 2013

1 commit


26 Feb, 2013

1 commit


25 Feb, 2013

1 commit


23 Feb, 2013

2 commits


21 Dec, 2012

7 commits


19 Nov, 2012

1 commit


13 Oct, 2012

2 commits

  • ...and fix up the callers. For do_file_open_root, just declare a
    struct filename on the stack and fill out the .name field. For
    do_filp_open, make it also take a struct filename pointer, and fix up its
    callers to call it appropriately.

    For filp_open, add a variant that takes a struct filename pointer and turn
    filp_open into a wrapper around it.

    Signed-off-by: Jeff Layton
    Signed-off-by: Al Viro

    Jeff Layton
     
  • getname() is intended to copy pathname strings from userspace into a
    kernel buffer. The result is just a string in kernel space. It would
    however be quite helpful to be able to attach some ancillary info to
    the string.

    For instance, we could attach some audit-related info to reduce the
    amount of audit-related processing needed. When auditing is enabled,
    we could also call getname() on the string more than once and not
    need to recopy it from userspace.

    This patchset converts the getname()/putname() interfaces to return
    a struct instead of a string. For now, the struct just tracks the
    string in kernel space and the original userland pointer for it.

    Later, we'll add other information to the struct as it becomes
    convenient.

    Signed-off-by: Jeff Layton
    Signed-off-by: Al Viro

    Jeff Layton
     

12 Oct, 2012

1 commit

  • Currently, this gets set mostly by happenstance when we call into
    audit_inode_child. While that might be a little more efficient, it seems
    wrong. If the syscall ends up failing before audit_inode_child ever gets
    called, then you'll have an audit_names record that shows the full path
    but has the parent inode info attached.

    Fix this by passing in a parent flag when we call audit_inode that gets
    set to the value of LOOKUP_PARENT. We can then fix up the pathname for
    the audit entry correctly from the get-go.

    While we're at it, clean up the no-op macro for audit_inode in the
    !CONFIG_AUDITSYSCALL case.

    Signed-off-by: Jeff Layton
    Signed-off-by: Al Viro

    Jeff Layton
     

03 Oct, 2012

2 commits

  • Pull vfs update from Al Viro:

    - big one - consolidation of descriptor-related logics; almost all of
    that is moved to fs/file.c

    (BTW, I'm seriously tempted to rename the result to fd.c. As it is,
    we have a situation when file_table.c is about handling of struct
    file and file.c is about handling of descriptor tables; the reasons
    are historical - file_table.c used to be about a static array of
    struct file we used to have way back).

    A lot of stray ends got cleaned up and converted to saner primitives,
    disgusting mess in android/binder.c is still disgusting, but at least
    doesn't poke so much in descriptor table guts anymore. A bunch of
    relatively minor races got fixed in process, plus an ext4 struct file
    leak.

    - related thing - fget_light() partially unuglified; see fdget() in
    there (and yes, it generates the code as good as we used to have).

    - also related - bits of Cyrill's procfs stuff that got entangled into
    that work; _not_ all of it, just the initial move to fs/proc/fd.c and
    switch of fdinfo to seq_file.

    - Alex's fs/coredump.c spiltoff - the same story, had been easier to
    take that commit than mess with conflicts. The rest is a separate
    pile, this was just a mechanical code movement.

    - a few misc patches all over the place. Not all for this cycle,
    there'll be more (and quite a few currently sit in akpm's tree)."

    Fix up trivial conflicts in the android binder driver, and some fairly
    simple conflicts due to two different changes to the sock_alloc_file()
    interface ("take descriptor handling from sock_alloc_file() to callers"
    vs "net: Providing protocol type via system.sockprotoname xattr of
    /proc/PID/fd entries" adding a dentry name to the socket)

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (72 commits)
    MAX_LFS_FILESIZE should be a loff_t
    compat: fs: Generic compat_sys_sendfile implementation
    fs: push rcu_barrier() from deactivate_locked_super() to filesystems
    btrfs: reada_extent doesn't need kref for refcount
    coredump: move core dump functionality into its own file
    coredump: prevent double-free on an error path in core dumper
    usb/gadget: fix misannotations
    fcntl: fix misannotations
    ceph: don't abuse d_delete() on failure exits
    hypfs: ->d_parent is never NULL or negative
    vfs: delete surplus inode NULL check
    switch simple cases of fget_light to fdget
    new helpers: fdget()/fdput()
    switch o2hb_region_dev_write() to fget_light()
    proc_map_files_readdir(): don't bother with grabbing files
    make get_file() return its argument
    vhost_set_vring(): turn pollstart/pollstop into bool
    switch prctl_set_mm_exe_file() to fget_light()
    switch xfs_find_handle() to fget_light()
    switch xfs_swapext() to fget_light()
    ...

    Linus Torvalds
     
  • Pull user namespace changes from Eric Biederman:
    "This is a mostly modest set of changes to enable basic user namespace
    support. This allows the code to code to compile with user namespaces
    enabled and removes the assumption there is only the initial user
    namespace. Everything is converted except for the most complex of the
    filesystems: autofs4, 9p, afs, ceph, cifs, coda, fuse, gfs2, ncpfs,
    nfs, ocfs2 and xfs as those patches need a bit more review.

    The strategy is to push kuid_t and kgid_t values are far down into
    subsystems and filesystems as reasonable. Leaving the make_kuid and
    from_kuid operations to happen at the edge of userspace, as the values
    come off the disk, and as the values come in from the network.
    Letting compile type incompatible compile errors (present when user
    namespaces are enabled) guide me to find the issues.

    The most tricky areas have been the places where we had an implicit
    union of uid and gid values and were storing them in an unsigned int.
    Those places were converted into explicit unions. I made certain to
    handle those places with simple trivial patches.

    Out of that work I discovered we have generic interfaces for storing
    quota by projid. I had never heard of the project identifiers before.
    Adding full user namespace support for project identifiers accounts
    for most of the code size growth in my git tree.

    Ultimately there will be work to relax privlige checks from
    "capable(FOO)" to "ns_capable(user_ns, FOO)" where it is safe allowing
    root in a user names to do those things that today we only forbid to
    non-root users because it will confuse suid root applications.

    While I was pushing kuid_t and kgid_t changes deep into the audit code
    I made a few other cleanups. I capitalized on the fact we process
    netlink messages in the context of the message sender. I removed
    usage of NETLINK_CRED, and started directly using current->tty.

    Some of these patches have also made it into maintainer trees, with no
    problems from identical code from different trees showing up in
    linux-next.

    After reading through all of this code I feel like I might be able to
    win a game of kernel trivial pursuit."

    Fix up some fairly trivial conflicts in netfilter uid/git logging code.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (107 commits)
    userns: Convert the ufs filesystem to use kuid/kgid where appropriate
    userns: Convert the udf filesystem to use kuid/kgid where appropriate
    userns: Convert ubifs to use kuid/kgid
    userns: Convert squashfs to use kuid/kgid where appropriate
    userns: Convert reiserfs to use kuid and kgid where appropriate
    userns: Convert jfs to use kuid/kgid where appropriate
    userns: Convert jffs2 to use kuid and kgid where appropriate
    userns: Convert hpfs to use kuid and kgid where appropriate
    userns: Convert btrfs to use kuid/kgid where appropriate
    userns: Convert bfs to use kuid/kgid where appropriate
    userns: Convert affs to use kuid/kgid wherwe appropriate
    userns: On alpha modify linux_to_osf_stat to use convert from kuids and kgids
    userns: On ia64 deal with current_uid and current_gid being kuid and kgid
    userns: On ppc convert current_uid from a kuid before printing.
    userns: Convert s390 getting uid and gid system calls to use kuid and kgid
    userns: Convert s390 hypfs to use kuid and kgid where appropriate
    userns: Convert binder ipc to use kuids
    userns: Teach security_path_chown to take kuids and kgids
    userns: Add user namespace support to IMA
    userns: Convert EVM to deal with kuids and kgids in it's hmac computation
    ...

    Linus Torvalds
     

27 Sep, 2012

7 commits


21 Sep, 2012

1 commit


15 Aug, 2012

1 commit

  • Userspace can pass weird create mode in open(2) that we canonicalize to
    "(mode & S_IALLUGO) | S_IFREG" in vfs_create().

    The problem is that we use the uncanonicalized mode before calling vfs_create()
    with unforseen consequences.

    So do the canonicalization early in build_open_flags().

    Signed-off-by: Miklos Szeredi
    Tested-by: Richard W.M. Jones
    CC: stable@vger.kernel.org

    Miklos Szeredi
     

04 Aug, 2012

1 commit


31 Jul, 2012

2 commits

  • There are several entry points which dirty pages in a filesystem. mmap
    (handled by block_page_mkwrite()), buffered write (handled by
    __generic_file_aio_write()), splice write (generic_file_splice_write),
    truncate, and fallocate (these can dirty last partial page - handled inside
    each filesystem separately). Protect these places with sb_start_write() and
    sb_end_write().

    ->page_mkwrite() calls are particularly complex since they are called with
    mmap_sem held and thus we cannot use standard sb_start_write() due to lock
    ordering constraints. We solve the problem by using a special freeze protection
    sb_start_pagefault() which ranks below mmap_sem.

    BugLink: https://bugs.launchpad.net/bugs/897421
    Tested-by: Kamal Mostafa
    Tested-by: Peter M. Petrakis
    Tested-by: Dann Frazier
    Tested-by: Massimo Morana
    Signed-off-by: Jan Kara
    Signed-off-by: Al Viro

    Jan Kara
     
  • Most of places where we want freeze protection coincides with the places where
    we also have remount-ro protection. So make mnt_want_write() and
    mnt_drop_write() (and their _file alternative) prevent freezing as well.
    For the few cases that are really interested only in remount-ro protection
    provide new function variants.

    BugLink: https://bugs.launchpad.net/bugs/897421
    Tested-by: Kamal Mostafa
    Tested-by: Peter M. Petrakis
    Tested-by: Dann Frazier
    Tested-by: Massimo Morana
    Signed-off-by: Jan Kara
    Signed-off-by: Al Viro

    Jan Kara
     

30 Jul, 2012

1 commit


23 Jul, 2012

1 commit


14 Jul, 2012

4 commits