06 Jul, 2016

1 commit

  • When a filesystem outside of init_user_ns is mounted it could have
    uids and gids stored in it that do not map to init_user_ns.

    The plan is to allow those filesystems to set i_uid to INVALID_UID and
    i_gid to INVALID_GID for unmapped uids and gids and then to handle
    that strange case in the vfs to ensure there is consistent robust
    handling of the weirdness.

    Upon a careful review of the vfs and filesystems about the only case
    where there is any possibility of confusion or trouble is when the
    inode is written back to disk. In that case filesystems typically
    read the inode->i_uid and inode->i_gid and write them to disk even
    when just an inode timestamp is being updated.

    Which leads to a rule that is very simple to implement and understand
    inodes whose i_uid or i_gid is not valid may not be written.

    In dealing with access times this means treat those inodes as if the
    inode flag S_NOATIME was set. Reads of the inodes appear safe and
    useful, but any write or modification is disallowed. The only inode
    write that is allowed is a chown that sets the uid and gid on the
    inode to valid values. After such a chown the inode is normal and may
    be treated as such.

    Denying all writes to inodes with uids or gids unknown to the vfs also
    prevents several oddball cases where corruption would have occurred
    because the vfs does not have complete information.

    One problem case that is prevented is attempting to use the gid of a
    directory for new inodes where the directories sgid bit is set but the
    directories gid is not mapped.

    Another problem case avoided is attempting to update the evm hash
    after setxattr, removexattr, and setattr. As the evm hash includeds
    the inode->i_uid or inode->i_gid not knowning the uid or gid prevents
    a correct evm hash from being computed. evm hash verification also
    fails when i_uid or i_gid is unknown but that is essentially harmless
    as it does not cause filesystem corruption.

    Acked-by: Seth Forshee
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     

28 May, 2016

2 commits

  • smack ->d_instantiate() uses ->setxattr(), so to be able to call it before
    we'd hashed the new dentry and attached it to inode, we need ->setxattr()
    instances getting the inode as an explicit argument rather than obtaining
    it from dentry.

    Similar change for ->getxattr() had been done in commit ce23e64. Unlike
    ->getxattr() (which is used by both selinux and smack instances of
    ->d_instantiate()) ->setxattr() is used only by smack one and unfortunately
    it got missed back then.

    Reported-by: Seung-Woo Kim
    Tested-by: Casey Schaufler
    Signed-off-by: Al Viro

    Al Viro
     
  • preparation for similar switch in ->setxattr() (see the next commit for
    rationale).

    Signed-off-by: Al Viro

    Al Viro
     

26 May, 2016

2 commits


11 Apr, 2016

2 commits


20 Feb, 2016

1 commit


23 Jan, 2016

1 commit

  • parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
    inode_foo(inode) being mutex_foo(&inode->i_mutex).

    Please, use those for access to ->i_mutex; over the coming cycle
    ->i_mutex will become rwsem, with ->lookup() done with it held
    only shared.

    Signed-off-by: Al Viro

    Al Viro
     

13 Jan, 2016

1 commit

  • Pull misc vfs updates from Al Viro:
    "All kinds of stuff. That probably should've been 5 or 6 separate
    branches, but by the time I'd realized how large and mixed that bag
    had become it had been too close to -final to play with rebasing.

    Some fs/namei.c cleanups there, memdup_user_nul() introduction and
    switching open-coded instances, burying long-dead code, whack-a-mole
    of various kinds, several new helpers for ->llseek(), assorted
    cleanups and fixes from various people, etc.

    One piece probably deserves special mention - Neil's
    lookup_one_len_unlocked(). Similar to lookup_one_len(), but gets
    called without ->i_mutex and tries to avoid ever taking it. That, of
    course, means that it's not useful for any directory modifications,
    but things like getting inode attributes in nfds readdirplus are fine
    with that. I really should've asked for moratorium on lookup-related
    changes this cycle, but since I hadn't done that early enough... I
    *am* asking for that for the coming cycle, though - I'm going to try
    and get conversion of i_mutex to rwsem with ->lookup() done under lock
    taken shared.

    There will be a patch closer to the end of the window, along the lines
    of the one Linus had posted last May - mechanical conversion of
    ->i_mutex accesses to inode_lock()/inode_unlock()/inode_trylock()/
    inode_is_locked()/inode_lock_nested(). To quote Linus back then:

    -----
    | This is an automated patch using
    |
    | sed 's/mutex_lock(&\(.*\)->i_mutex)/inode_lock(\1)/'
    | sed 's/mutex_unlock(&\(.*\)->i_mutex)/inode_unlock(\1)/'
    | sed 's/mutex_lock_nested(&\(.*\)->i_mutex,[ ]*I_MUTEX_\([A-Z0-9_]*\))/inode_lock_nested(\1, I_MUTEX_\2)/'
    | sed 's/mutex_is_locked(&\(.*\)->i_mutex)/inode_is_locked(\1)/'
    | sed 's/mutex_trylock(&\(.*\)->i_mutex)/inode_trylock(\1)/'
    |
    | with a very few manual fixups
    -----

    I'm going to send that once the ->i_mutex-affecting stuff in -next
    gets mostly merged (or when Linus says he's about to stop taking
    merges)"

    * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
    nfsd: don't hold i_mutex over userspace upcalls
    fs:affs:Replace time_t with time64_t
    fs/9p: use fscache mutex rather than spinlock
    proc: add a reschedule point in proc_readfd_common()
    logfs: constify logfs_block_ops structures
    fcntl: allow to set O_DIRECT flag on pipe
    fs: __generic_file_splice_read retry lookup on AOP_TRUNCATED_PAGE
    fs: xattr: Use kvfree()
    [s390] page_to_phys() always returns a multiple of PAGE_SIZE
    nbd: use ->compat_ioctl()
    fs: use block_device name vsprintf helper
    lib/vsprintf: add %*pg format specifier
    fs: use gendisk->disk_name where possible
    poll: plug an unused argument to do_poll
    amdkfd: don't open-code memdup_user()
    cdrom: don't open-code memdup_user()
    rsxx: don't open-code memdup_user()
    mtip32xx: don't open-code memdup_user()
    [um] mconsole: don't open-code memdup_user_nul()
    [um] hostaudio: don't open-code memdup_user()
    ...

    Linus Torvalds
     

09 Jan, 2016

1 commit


14 Dec, 2015

2 commits


07 Dec, 2015

4 commits

  • When a file on tmpfs has an ACL or a Default ACL, listxattr should include the
    corresponding xattr name.

    Signed-off-by: Andreas Gruenbacher
    Reviewed-by: James Morris
    Cc: Hugh Dickins
    Cc: linux-mm@kvack.org
    Signed-off-by: Al Viro

    Andreas Gruenbacher
     
  • Use the VFS xattr handler infrastructure and get rid of similar code in
    the filesystem. For implementing shmem_xattr_handler_set, we need a
    version of simple_xattr_set which removes the attribute when value is
    NULL. Use this to implement kernfs_iop_removexattr as well.

    Signed-off-by: Andreas Gruenbacher
    Reviewed-by: James Morris
    Cc: Hugh Dickins
    Cc: linux-mm@kvack.org
    Signed-off-by: Al Viro

    Andreas Gruenbacher
     
  • Add an additional "name" field to struct xattr_handler. When the name
    is set, the handler matches attributes with exactly that name. When the
    prefix is set instead, the handler matches attributes with the given
    prefix and with a non-empty suffix.

    This patch should avoid bugs like the one fixed in commit c361016a in
    the future.

    Signed-off-by: Andreas Gruenbacher
    Reviewed-by: James Morris
    Signed-off-by: Al Viro

    Andreas Gruenbacher
     
  • This function was only briefly used in security/integrity/evm, between
    commits 66dbc325 and 15647eb3.

    Signed-off-by: Andreas Gruenbacher
    Reviewed-by: James Morris
    Signed-off-by: Al Viro

    Andreas Gruenbacher
     

14 Nov, 2015

2 commits

  • Now that the xattr handler is passed to the xattr handler operations, we
    can use the same get and set operations for the user, trusted, and security
    xattr namespaces. In those namespaces, we can access the full attribute
    name by "reattaching" the name prefix the vfs has skipped for us. Add a
    xattr_full_name helper to make this obvious in the code.

    For the "system.posix_acl_access" and "system.posix_acl_default"
    attributes, handler->prefix is the full attribute name; the suffix is the
    empty string.

    Signed-off-by: Andreas Gruenbacher
    Cc: Eric Van Hensbergen
    Cc: Ron Minnich
    Cc: Latchesar Ionkov
    Cc: v9fs-developer@lists.sourceforge.net
    Signed-off-by: Al Viro

    Andreas Gruenbacher
     
  • The xattr_handler operations are currently all passed a file system
    specific flags value which the operations can use to disambiguate between
    different handlers; some file systems use that to distinguish the xattr
    namespace, for example. In some oprations, it would be useful to also have
    access to the handler prefix. To allow that, pass a pointer to the handler
    to operations instead of the flags value alone.

    Signed-off-by: Andreas Gruenbacher
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Andreas Gruenbacher
     

22 May, 2015

1 commit

  • EVM needs to be atomically updated when removing xattrs.
    Otherwise concurrent EVM verification may fail in between.
    This patch fixes by moving i_mutex unlocking after calling
    EVM hook. fsnotify_xattr() is also now called while locked
    the same way as it is done in __vfs_setxattr_noperm.

    Changelog:
    - remove unused 'inode' variable.

    Signed-off-by: Dmitry Kasatkin
    Signed-off-by: Mimi Zohar

    Dmitry Kasatkin
     

20 Nov, 2014

1 commit

  • ... for situations when we don't have any candidate in pathnames - basically,
    in descriptor-based syscalls.

    [Folded the build fix for !CONFIG_AUDITSYSCALL configs from Chen Gang]

    Signed-off-by: Al Viro

    Al Viro
     

13 Oct, 2014

1 commit

  • The following pairs of system calls dealing with extended attributes only
    differ in their behavior on whether the symbolic link is followed (when
    the named file is a symbolic link):

    - setxattr() and lsetxattr()
    - getxattr() and lgetxattr()
    - listxattr() and llistxattr()
    - removexattr() and lremovexattr()

    Despite this, the implementations all had duplicated code, so this commit
    redirects each of the above pairs of system calls to a corresponding
    function to which different lookup flags (LOOKUP_FOLLOW or 0) are passed.

    For me this reduced the stripped size of xattr.o from 8824 to 8248 bytes.

    Signed-off-by: Eric Biggers
    Signed-off-by: Al Viro

    Eric Biggers
     

24 Jul, 2014

1 commit

  • If a filesystem uses simple_xattr to support user extended attributes,
    LTP setxattr01 and xfstests generic/062 fail with "Cannot allocate
    memory": simple_xattr_alloc()'s wrap-around test mistakenly excludes
    values of zero size. Fix that off-by-one (but apparently no filesystem
    needs them yet).

    Signed-off-by: Hugh Dickins
    Cc: Al Viro
    Cc: Jeff Layton
    Cc: Aristeu Rozanski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     

21 Dec, 2012

8 commits


19 Oct, 2012

1 commit

  • Commit 38f38657444d ("xattr: extract simple_xattr code from tmpfs") moved
    some code from tmpfs but introduced a subtle bug along the way.

    If the name passed to simple_xattr_remove() does not exist in the list of
    xattrs, then it is possible to call kfree(new_xattr) when new_xattr is
    actually initialized to itself on the stack via uninitialized_var().

    This causes a BUG() since the memory was not allocated via the slab
    allocator and was not bypassed through to the page allocator because it
    was too large.

    Initialize the local variable to NULL so the kfree() never takes place.

    Reported-by: Fengguang Wu
    Signed-off-by: David Rientjes
    Acked-by: Hugh Dickins
    Acked-by: Aristeu Rozanski
    Signed-off-by: Linus Torvalds

    David Rientjes
     

12 Oct, 2012

1 commit

  • Currently, this gets set mostly by happenstance when we call into
    audit_inode_child. While that might be a little more efficient, it seems
    wrong. If the syscall ends up failing before audit_inode_child ever gets
    called, then you'll have an audit_names record that shows the full path
    but has the parent inode info attached.

    Fix this by passing in a parent flag when we call audit_inode that gets
    set to the value of LOOKUP_PARENT. We can then fix up the pathname for
    the audit entry correctly from the get-go.

    While we're at it, clean up the no-op macro for audit_inode in the
    !CONFIG_AUDITSYSCALL case.

    Signed-off-by: Jeff Layton
    Signed-off-by: Al Viro

    Jeff Layton
     

03 Oct, 2012

3 commits

  • Pull security subsystem updates from James Morris:
    "Highlights:

    - Integrity: add local fs integrity verification to detect offline
    attacks
    - Integrity: add digital signature verification
    - Simple stacking of Yama with other LSMs (per LSS discussions)
    - IBM vTPM support on ppc64
    - Add new driver for Infineon I2C TIS TPM
    - Smack: add rule revocation for subject labels"

    Fixed conflicts with the user namespace support in kernel/auditsc.c and
    security/integrity/ima/ima_policy.c.

    * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (39 commits)
    Documentation: Update git repository URL for Smack userland tools
    ima: change flags container data type
    Smack: setprocattr memory leak fix
    Smack: implement revoking all rules for a subject label
    Smack: remove task_wait() hook.
    ima: audit log hashes
    ima: generic IMA action flag handling
    ima: rename ima_must_appraise_or_measure
    audit: export audit_log_task_info
    tpm: fix tpm_acpi sparse warning on different address spaces
    samples/seccomp: fix 31 bit build on s390
    ima: digital signature verification support
    ima: add support for different security.ima data types
    ima: add ima_inode_setxattr/removexattr function and calls
    ima: add inode_post_setattr call
    ima: replace iint spinblock with rwlock/read_lock
    ima: allocating iint improvements
    ima: add appraise action keywords and default rules
    ima: integrity appraisal extension
    vfs: move ima_file_free before releasing the file
    ...

    Linus Torvalds
     
  • Pull vfs update from Al Viro:

    - big one - consolidation of descriptor-related logics; almost all of
    that is moved to fs/file.c

    (BTW, I'm seriously tempted to rename the result to fd.c. As it is,
    we have a situation when file_table.c is about handling of struct
    file and file.c is about handling of descriptor tables; the reasons
    are historical - file_table.c used to be about a static array of
    struct file we used to have way back).

    A lot of stray ends got cleaned up and converted to saner primitives,
    disgusting mess in android/binder.c is still disgusting, but at least
    doesn't poke so much in descriptor table guts anymore. A bunch of
    relatively minor races got fixed in process, plus an ext4 struct file
    leak.

    - related thing - fget_light() partially unuglified; see fdget() in
    there (and yes, it generates the code as good as we used to have).

    - also related - bits of Cyrill's procfs stuff that got entangled into
    that work; _not_ all of it, just the initial move to fs/proc/fd.c and
    switch of fdinfo to seq_file.

    - Alex's fs/coredump.c spiltoff - the same story, had been easier to
    take that commit than mess with conflicts. The rest is a separate
    pile, this was just a mechanical code movement.

    - a few misc patches all over the place. Not all for this cycle,
    there'll be more (and quite a few currently sit in akpm's tree)."

    Fix up trivial conflicts in the android binder driver, and some fairly
    simple conflicts due to two different changes to the sock_alloc_file()
    interface ("take descriptor handling from sock_alloc_file() to callers"
    vs "net: Providing protocol type via system.sockprotoname xattr of
    /proc/PID/fd entries" adding a dentry name to the socket)

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (72 commits)
    MAX_LFS_FILESIZE should be a loff_t
    compat: fs: Generic compat_sys_sendfile implementation
    fs: push rcu_barrier() from deactivate_locked_super() to filesystems
    btrfs: reada_extent doesn't need kref for refcount
    coredump: move core dump functionality into its own file
    coredump: prevent double-free on an error path in core dumper
    usb/gadget: fix misannotations
    fcntl: fix misannotations
    ceph: don't abuse d_delete() on failure exits
    hypfs: ->d_parent is never NULL or negative
    vfs: delete surplus inode NULL check
    switch simple cases of fget_light to fdget
    new helpers: fdget()/fdput()
    switch o2hb_region_dev_write() to fget_light()
    proc_map_files_readdir(): don't bother with grabbing files
    make get_file() return its argument
    vhost_set_vring(): turn pollstart/pollstop into bool
    switch prctl_set_mm_exe_file() to fget_light()
    switch xfs_find_handle() to fget_light()
    switch xfs_swapext() to fget_light()
    ...

    Linus Torvalds
     
  • Pull user namespace changes from Eric Biederman:
    "This is a mostly modest set of changes to enable basic user namespace
    support. This allows the code to code to compile with user namespaces
    enabled and removes the assumption there is only the initial user
    namespace. Everything is converted except for the most complex of the
    filesystems: autofs4, 9p, afs, ceph, cifs, coda, fuse, gfs2, ncpfs,
    nfs, ocfs2 and xfs as those patches need a bit more review.

    The strategy is to push kuid_t and kgid_t values are far down into
    subsystems and filesystems as reasonable. Leaving the make_kuid and
    from_kuid operations to happen at the edge of userspace, as the values
    come off the disk, and as the values come in from the network.
    Letting compile type incompatible compile errors (present when user
    namespaces are enabled) guide me to find the issues.

    The most tricky areas have been the places where we had an implicit
    union of uid and gid values and were storing them in an unsigned int.
    Those places were converted into explicit unions. I made certain to
    handle those places with simple trivial patches.

    Out of that work I discovered we have generic interfaces for storing
    quota by projid. I had never heard of the project identifiers before.
    Adding full user namespace support for project identifiers accounts
    for most of the code size growth in my git tree.

    Ultimately there will be work to relax privlige checks from
    "capable(FOO)" to "ns_capable(user_ns, FOO)" where it is safe allowing
    root in a user names to do those things that today we only forbid to
    non-root users because it will confuse suid root applications.

    While I was pushing kuid_t and kgid_t changes deep into the audit code
    I made a few other cleanups. I capitalized on the fact we process
    netlink messages in the context of the message sender. I removed
    usage of NETLINK_CRED, and started directly using current->tty.

    Some of these patches have also made it into maintainer trees, with no
    problems from identical code from different trees showing up in
    linux-next.

    After reading through all of this code I feel like I might be able to
    win a game of kernel trivial pursuit."

    Fix up some fairly trivial conflicts in netfilter uid/git logging code.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (107 commits)
    userns: Convert the ufs filesystem to use kuid/kgid where appropriate
    userns: Convert the udf filesystem to use kuid/kgid where appropriate
    userns: Convert ubifs to use kuid/kgid
    userns: Convert squashfs to use kuid/kgid where appropriate
    userns: Convert reiserfs to use kuid and kgid where appropriate
    userns: Convert jfs to use kuid/kgid where appropriate
    userns: Convert jffs2 to use kuid and kgid where appropriate
    userns: Convert hpfs to use kuid and kgid where appropriate
    userns: Convert btrfs to use kuid/kgid where appropriate
    userns: Convert bfs to use kuid/kgid where appropriate
    userns: Convert affs to use kuid/kgid wherwe appropriate
    userns: On alpha modify linux_to_osf_stat to use convert from kuids and kgids
    userns: On ia64 deal with current_uid and current_gid being kuid and kgid
    userns: On ppc convert current_uid from a kuid before printing.
    userns: Convert s390 getting uid and gid system calls to use kuid and kgid
    userns: Convert s390 hypfs to use kuid and kgid where appropriate
    userns: Convert binder ipc to use kuids
    userns: Teach security_path_chown to take kuids and kgids
    userns: Add user namespace support to IMA
    userns: Convert EVM to deal with kuids and kgids in it's hmac computation
    ...

    Linus Torvalds
     

27 Sep, 2012

1 commit


18 Sep, 2012

1 commit

  • - In setxattr if we are setting a posix acl convert uids and gids from
    the current user namespace into the initial user namespace, before
    the xattrs are passed to the underlying filesystem.

    Untranslatable uids and gids are represented as -1 which
    posix_acl_from_xattr will represent as INVALID_UID or INVALID_GID.
    posix_acl_valid will fail if an acl from userspace has any
    INVALID_UID or INVALID_GID values. In net this guarantees that
    untranslatable posix acls will not be stored by filesystems.

    - In getxattr if we are reading a posix acl convert uids and gids from
    the initial user namespace into the current user namespace.

    Uids and gids that can not be tranlsated into the current user namespace
    will be represented as -1.

    - Replace e_id in struct posix_acl_entry with an anymouns union of
    e_uid and e_gid. For the short term retain the e_id field
    until all of the users are converted.

    - Don't set struct posix_acl.e_id in the cases where the acl type
    does not use e_id. Greatly reducing the use of ACL_UNDEFINED_ID.

    - Rework the ordering checks in posix_acl_valid so that I use kuid_t
    and kgid_t types throughout the code, and so that I don't need
    arithmetic on uid and gid types.

    Cc: Theodore Tso
    Cc: Andrew Morton
    Cc: Andreas Dilger
    Cc: Jan Kara
    Cc: Al Viro
    Signed-off-by: Eric W. Biederman

    Eric W. Biederman
     

14 Sep, 2012

2 commits

  • new_xattr in __simple_xattr_set() is only initialized with a valid
    pointer if value is not NULL, which only happens if this function is
    called directly with the intention to remove an existing extended
    attribute. Even being safe to be this way, smatch warns about possible
    NULL dereference. Dan Carpenter suggested using uninitialized_var()
    which will make both gcc and smatch happy.

    Cc: Fengguang Wu
    Cc: Dan Carpenter
    Signed-off-by: Aristeu Rozanski
    Signed-off-by: Tejun Heo

    Aristeu Rozanski
     
  • v2: add function documentation instead of adding a separate file under
    Documentation/

    tj: Updated comment a bit and rolled in Randy's suggestions.

    Cc: Li Zefan
    Cc: Tejun Heo
    Cc: Hugh Dickins
    Cc: Hillf Danton
    Cc: Lennart Poettering
    Cc: Randy Dunlap
    Signed-off-by: Aristeu Rozanski
    Signed-off-by: Tejun Heo

    Aristeu Rozanski