10 Oct, 2012

1 commit

  • This is to complete part of the Userspace API (UAPI) disintegration for which
    the preparatory patches were pulled recently. After these patches, userspace
    headers will be segregated into:

    include/uapi/linux/.../foo.h

    for the userspace interface stuff, and:

    include/linux/.../foo.h

    for the strictly kernel internal stuff.

    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     

03 Oct, 2012

2 commits

  • Pull vfs update from Al Viro:

    - big one - consolidation of descriptor-related logics; almost all of
    that is moved to fs/file.c

    (BTW, I'm seriously tempted to rename the result to fd.c. As it is,
    we have a situation when file_table.c is about handling of struct
    file and file.c is about handling of descriptor tables; the reasons
    are historical - file_table.c used to be about a static array of
    struct file we used to have way back).

    A lot of stray ends got cleaned up and converted to saner primitives,
    disgusting mess in android/binder.c is still disgusting, but at least
    doesn't poke so much in descriptor table guts anymore. A bunch of
    relatively minor races got fixed in process, plus an ext4 struct file
    leak.

    - related thing - fget_light() partially unuglified; see fdget() in
    there (and yes, it generates the code as good as we used to have).

    - also related - bits of Cyrill's procfs stuff that got entangled into
    that work; _not_ all of it, just the initial move to fs/proc/fd.c and
    switch of fdinfo to seq_file.

    - Alex's fs/coredump.c spiltoff - the same story, had been easier to
    take that commit than mess with conflicts. The rest is a separate
    pile, this was just a mechanical code movement.

    - a few misc patches all over the place. Not all for this cycle,
    there'll be more (and quite a few currently sit in akpm's tree)."

    Fix up trivial conflicts in the android binder driver, and some fairly
    simple conflicts due to two different changes to the sock_alloc_file()
    interface ("take descriptor handling from sock_alloc_file() to callers"
    vs "net: Providing protocol type via system.sockprotoname xattr of
    /proc/PID/fd entries" adding a dentry name to the socket)

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (72 commits)
    MAX_LFS_FILESIZE should be a loff_t
    compat: fs: Generic compat_sys_sendfile implementation
    fs: push rcu_barrier() from deactivate_locked_super() to filesystems
    btrfs: reada_extent doesn't need kref for refcount
    coredump: move core dump functionality into its own file
    coredump: prevent double-free on an error path in core dumper
    usb/gadget: fix misannotations
    fcntl: fix misannotations
    ceph: don't abuse d_delete() on failure exits
    hypfs: ->d_parent is never NULL or negative
    vfs: delete surplus inode NULL check
    switch simple cases of fget_light to fdget
    new helpers: fdget()/fdput()
    switch o2hb_region_dev_write() to fget_light()
    proc_map_files_readdir(): don't bother with grabbing files
    make get_file() return its argument
    vhost_set_vring(): turn pollstart/pollstop into bool
    switch prctl_set_mm_exe_file() to fget_light()
    switch xfs_find_handle() to fget_light()
    switch xfs_swapext() to fget_light()
    ...

    Linus Torvalds
     
  • Pull user namespace changes from Eric Biederman:
    "This is a mostly modest set of changes to enable basic user namespace
    support. This allows the code to code to compile with user namespaces
    enabled and removes the assumption there is only the initial user
    namespace. Everything is converted except for the most complex of the
    filesystems: autofs4, 9p, afs, ceph, cifs, coda, fuse, gfs2, ncpfs,
    nfs, ocfs2 and xfs as those patches need a bit more review.

    The strategy is to push kuid_t and kgid_t values are far down into
    subsystems and filesystems as reasonable. Leaving the make_kuid and
    from_kuid operations to happen at the edge of userspace, as the values
    come off the disk, and as the values come in from the network.
    Letting compile type incompatible compile errors (present when user
    namespaces are enabled) guide me to find the issues.

    The most tricky areas have been the places where we had an implicit
    union of uid and gid values and were storing them in an unsigned int.
    Those places were converted into explicit unions. I made certain to
    handle those places with simple trivial patches.

    Out of that work I discovered we have generic interfaces for storing
    quota by projid. I had never heard of the project identifiers before.
    Adding full user namespace support for project identifiers accounts
    for most of the code size growth in my git tree.

    Ultimately there will be work to relax privlige checks from
    "capable(FOO)" to "ns_capable(user_ns, FOO)" where it is safe allowing
    root in a user names to do those things that today we only forbid to
    non-root users because it will confuse suid root applications.

    While I was pushing kuid_t and kgid_t changes deep into the audit code
    I made a few other cleanups. I capitalized on the fact we process
    netlink messages in the context of the message sender. I removed
    usage of NETLINK_CRED, and started directly using current->tty.

    Some of these patches have also made it into maintainer trees, with no
    problems from identical code from different trees showing up in
    linux-next.

    After reading through all of this code I feel like I might be able to
    win a game of kernel trivial pursuit."

    Fix up some fairly trivial conflicts in netfilter uid/git logging code.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (107 commits)
    userns: Convert the ufs filesystem to use kuid/kgid where appropriate
    userns: Convert the udf filesystem to use kuid/kgid where appropriate
    userns: Convert ubifs to use kuid/kgid
    userns: Convert squashfs to use kuid/kgid where appropriate
    userns: Convert reiserfs to use kuid and kgid where appropriate
    userns: Convert jfs to use kuid/kgid where appropriate
    userns: Convert jffs2 to use kuid and kgid where appropriate
    userns: Convert hpfs to use kuid and kgid where appropriate
    userns: Convert btrfs to use kuid/kgid where appropriate
    userns: Convert bfs to use kuid/kgid where appropriate
    userns: Convert affs to use kuid/kgid wherwe appropriate
    userns: On alpha modify linux_to_osf_stat to use convert from kuids and kgids
    userns: On ia64 deal with current_uid and current_gid being kuid and kgid
    userns: On ppc convert current_uid from a kuid before printing.
    userns: Convert s390 getting uid and gid system calls to use kuid and kgid
    userns: Convert s390 hypfs to use kuid and kgid where appropriate
    userns: Convert binder ipc to use kuids
    userns: Teach security_path_chown to take kuids and kgids
    userns: Add user namespace support to IMA
    userns: Convert EVM to deal with kuids and kgids in it's hmac computation
    ...

    Linus Torvalds
     

02 Oct, 2012

12 commits


27 Sep, 2012

1 commit


26 Sep, 2012

1 commit


18 Sep, 2012

1 commit

  • - Pass the user namespace the uid and gid values in the xattr are stored
    in into posix_acl_from_xattr.

    - Pass the user namespace kuid and kgid values should be converted into
    when storing uid and gid values in an xattr in posix_acl_to_xattr.

    - Modify all callers of posix_acl_from_xattr and posix_acl_to_xattr to
    pass in &init_user_ns.

    In the short term this change is not strictly needed but it makes the
    code clearer. In the longer term this change is necessary to be able to
    mount filesystems outside of the initial user namespace that natively
    store posix acls in the linux xattr format.

    Cc: Theodore Tso
    Cc: Andrew Morton
    Cc: Andreas Dilger
    Cc: Jan Kara
    Cc: Al Viro
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     

11 Sep, 2012

5 commits


10 Sep, 2012

2 commits

  • You can use nfsd/portlist to give nfsd additional sockets to listen on.
    In theory you can also remove listening sockets this way. But nobody's
    ever done that as far as I can tell.

    Also this was partially broken in 2.6.25, by
    a217813f9067b785241cb7f31956e51d2071703a "knfsd: Support adding
    transports by writing portlist file".

    (Note that we decide whether to take the "delfd" case by checking for a
    digit--but what's actually expected in that case is something made by
    svc_one_sock_name(), which won't begin with a digit.)

    So, let's just rip out this stuff.

    Acked-by: NeilBrown
    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     
  • Processes that open and close multiple files may end up setting this
    oo_last_closed_stid without freeing what was previously pointed to.
    This can result in a major leak, visible for example by watching the
    nfsd4_stateids line of /proc/slabinfo.

    Reported-by: Cyril B.
    Tested-by: Cyril B.
    Cc: stable@vger.kernel.org
    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     

22 Aug, 2012

6 commits


21 Aug, 2012

6 commits


02 Aug, 2012

1 commit

  • Pull second vfs pile from Al Viro:
    "The stuff in there: fsfreeze deadlock fixes by Jan (essentially, the
    deadlock reproduced by xfstests 068), symlink and hardlink restriction
    patches, plus assorted cleanups and fixes.

    Note that another fsfreeze deadlock (emergency thaw one) is *not*
    dealt with - the series by Fernando conflicts a lot with Jan's, breaks
    userland ABI (FIFREEZE semantics gets changed) and trades the deadlock
    for massive vfsmount leak; this is going to be handled next cycle.
    There probably will be another pull request, but that stuff won't be
    in it."

    Fix up trivial conflicts due to unrelated changes next to each other in
    drivers/{staging/gdm72xx/usb_boot.c, usb/gadget/storage_common.c}

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (54 commits)
    delousing target_core_file a bit
    Documentation: Correct s_umount state for freeze_fs/unfreeze_fs
    fs: Remove old freezing mechanism
    ext2: Implement freezing
    btrfs: Convert to new freezing mechanism
    nilfs2: Convert to new freezing mechanism
    ntfs: Convert to new freezing mechanism
    fuse: Convert to new freezing mechanism
    gfs2: Convert to new freezing mechanism
    ocfs2: Convert to new freezing mechanism
    xfs: Convert to new freezing code
    ext4: Convert to new freezing mechanism
    fs: Protect write paths by sb_start_write - sb_end_write
    fs: Skip atime update on frozen filesystem
    fs: Add freezing handling to mnt_want_write() / mnt_drop_write()
    fs: Improve filesystem freezing handling
    switch the protection of percpu_counter list to spinlock
    nfsd: Push mnt_want_write() outside of i_mutex
    btrfs: Push mnt_want_write() outside of i_mutex
    fat: Push mnt_want_write() outside of i_mutex
    ...

    Linus Torvalds
     

01 Aug, 2012

1 commit

  • Pull nfsd changes from J. Bruce Fields:
    "This has been an unusually quiet cycle--mostly bugfixes and cleanup.
    The one large piece is Stanislav's work to containerize the server's
    grace period--but that in itself is just one more step in a
    not-yet-complete project to allow fully containerized nfs service.

    There are a number of outstanding delegation, container, v4 state, and
    gss patches that aren't quite ready yet; 3.7 may be wilder."

    * 'nfsd-next' of git://linux-nfs.org/~bfields/linux: (35 commits)
    NFSd: make boot_time variable per network namespace
    NFSd: make grace end flag per network namespace
    Lockd: move grace period management from lockd() to per-net functions
    LockD: pass actual network namespace to grace period management functions
    LockD: manage grace list per network namespace
    SUNRPC: service request network namespace helper introduced
    NFSd: make nfsd4_manager allocated per network namespace context.
    LockD: make lockd manager allocated per network namespace
    LockD: manage grace period per network namespace
    Lockd: add more debug to host shutdown functions
    Lockd: host complaining function introduced
    LockD: manage used host count per networks namespace
    LockD: manage garbage collection timeout per networks namespace
    LockD: make garbage collector network namespace aware.
    LockD: mark host per network namespace on garbage collect
    nfsd4: fix missing fault_inject.h include
    locks: move lease-specific code out of locks_delete_lock
    locks: prevent side-effects of locks_release_private before file_lock is initialized
    NFSd: set nfsd_serv to NULL after service destruction
    NFSd: introduce nfsd_destroy() helper
    ...

    Linus Torvalds
     

31 Jul, 2012

1 commit

  • When mnt_want_write() starts to handle freezing it will get a full lock
    semantics requiring proper lock ordering. So push mnt_want_write() call
    consistently outside of i_mutex.

    CC: linux-nfs@vger.kernel.org
    CC: "J. Bruce Fields"
    Signed-off-by: Jan Kara
    Signed-off-by: Al Viro

    Jan Kara