02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boiler plate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information it it.
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information,

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to be applied to
    a file was done in a spreadsheet of side by side results from of the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files created by Philippe Ombredanne. Philippe prepared the
    base worksheet, and did an initial spot review of a few 1000 files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging was:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

31 Jul, 2017

2 commits

  • When new directory 'DIR1' is created in a directory 'DIR0' with SGID bit
    set, DIR1 is expected to have SGID bit set (and owning group equal to
    the owning group of 'DIR0'). However when 'DIR0' also has some default
    ACLs that 'DIR1' inherits, setting these ACLs will result in SGID bit on
    'DIR1' to get cleared if user is not member of the owning group.

    Fix the problem by moving posix_acl_update_mode() out of
    __ext4_set_acl() into ext4_set_acl(). That way the function will not be
    called when inheriting ACLs which is what we want as it prevents SGID
    bit clearing and the mode has been properly set by posix_acl_create()
    anyway.

    Fixes: 073931017b49d9458aa351605b43a7e34598caef
    CC: stable@vger.kernel.org
    Signed-off-by: Theodore Ts'o
    Signed-off-by: Jan Kara
    Reviewed-by: Andreas Gruenbacher

    Jan Kara
     
  • When changing a file's acl mask, __ext4_set_acl() will first set the group
    bits of i_mode to the value of the mask, and only then set the actual
    extended attribute representing the new acl.

    If the second part fails (due to lack of space, for example) and the file
    had no acl attribute to begin with, the system will from now on assume
    that the mask permission bits are actual group permission bits, potentially
    granting access to the wrong users.

    Prevent this by only changing the inode mode after the acl has been set.

    Signed-off-by: Ernesto A. Fernández
    Signed-off-by: Theodore Ts'o
    Reviewed-by: Jan Kara

    Ernesto A. Fernández
     

06 Jul, 2017

1 commit

  • ea_inode feature allows creating extended attributes that are up to
    64k in size. Update __ext4_new_inode() to pick increased credit limits.

    To avoid overallocating too many journal credits, update
    __ext4_xattr_set_credits() to make a distinction between xattr create
    vs update. This helps __ext4_new_inode() because all attributes are
    known to be new, so we can save credits that are normally needed to
    delete old values.

    Also, have fscrypt specify its maximum context size so that we don't
    end up allocating credits for 64k size.

    Signed-off-by: Tahsin Erdogan
    Signed-off-by: Theodore Ts'o

    Tahsin Erdogan
     

22 Jun, 2017

2 commits

  • Ext4 now supports xattr values that are up to 64k in size (vfs limit).
    Large xattr values are stored in external inodes each one holding a
    single value. Once written the data blocks of these inodes are immutable.

    The real world use cases are expected to have a lot of value duplication
    such as inherited acls etc. To reduce data duplication on disk, this patch
    implements a deduplicator that allows sharing of xattr inodes.

    The deduplication is based on an in-memory hash lookup that is a best
    effort sharing scheme. When a xattr inode is read from disk (i.e.
    getxattr() call), its crc32c hash is added to a hash table. Before
    creating a new xattr inode for a value being set, the hash table is
    checked to see if an existing inode holds an identical value. If such an
    inode is found, the ref count on that inode is incremented. On value
    removal the ref count is decremented and if it reaches zero the inode is
    deleted.

    The quota charging for such inodes is manually managed. Every reference
    holder is charged the full size as if there was no sharing happening.
    This is consistent with how xattr blocks are also charged.

    [ Fixed up journal credits calculation to handle inline data and the
    rare case where an shared xattr block can get freed when two thread
    race on breaking the xattr block sharing. --tytso ]

    Signed-off-by: Tahsin Erdogan
    Signed-off-by: Theodore Ts'o

    Tahsin Erdogan
     
  • Both ext4_set_acl() and ext4_set_context() need to be made aware of
    ea_inode feature when it comes to credits calculation.

    Also add a sufficient credits check in ext4_xattr_set_handle() right
    after xattr write lock is grabbed. Original credits calculation is done
    outside the lock so there is a possiblity that the initially calculated
    credits are not sufficient anymore.

    Signed-off-by: Tahsin Erdogan
    Signed-off-by: Theodore Ts'o

    Tahsin Erdogan
     

25 May, 2017

1 commit

  • ext4_xattr_block_set() calls dquot_alloc_block() to charge for an xattr
    block when new references are made. However if dquot_initialize() hasn't
    been called on an inode, request for charging is effectively ignored
    because ext4_inode_info->i_dquot is not initialized yet.

    Add dquot_initialize() to call paths that lead to ext4_xattr_block_set().

    Signed-off-by: Tahsin Erdogan
    Signed-off-by: Theodore Ts'o
    Reviewed-by: Jan Kara

    Tahsin Erdogan
     

15 Nov, 2016

1 commit

  • CURRENT_TIME_SEC and CURRENT_TIME are not y2038 safe.
    current_time() will be transitioned to be y2038 safe
    along with vfs.

    current_time() returns timestamps according to the
    granularities set in the super_block.
    The granularity check in ext4_current_time() to call
    current_time() or CURRENT_TIME_SEC is not required.
    Use current_time() directly to obtain timestamps
    unconditionally, and remove ext4_current_time().

    Quota files are assumed to be on the same filesystem.
    Hence, use current_time() for these files as well.

    Signed-off-by: Deepa Dinamani
    Signed-off-by: Theodore Ts'o
    Reviewed-by: Arnd Bergmann

    Deepa Dinamani
     

22 Sep, 2016

1 commit

  • When file permissions are modified via chmod(2) and the user is not in
    the owning group or capable of CAP_FSETID, the setgid bit is cleared in
    inode_change_ok(). Setting a POSIX ACL via setxattr(2) sets the file
    permissions as well as the new ACL, but doesn't clear the setgid bit in
    a similar way; this allows to bypass the check in chmod(2). Fix that.

    References: CVE-2016-7097
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Jeff Layton
    Signed-off-by: Jan Kara
    Signed-off-by: Andreas Gruenbacher

    Jan Kara
     

31 Mar, 2016

1 commit

  • When get_acl() is called for an inode whose ACL is not cached yet, the
    get_acl inode operation is called to fetch the ACL from the filesystem.
    The inode operation is responsible for updating the cached acl with
    set_cached_acl(). This is done without locking at the VFS level, so
    another task can call set_cached_acl() or forget_cached_acl() before the
    get_acl inode operation gets to calling set_cached_acl(), and then
    get_acl's call to set_cached_acl() results in caching an outdate ACL.

    Prevent this from happening by setting the cached ACL pointer to a
    task-specific sentinel value before calling the get_acl inode operation.
    Move the responsibility for updating the cached ACL from the get_acl
    inode operations to get_acl(). There, only set the cached ACL if the
    sentinel value hasn't changed.

    The sentinel values are chosen to have odd values. Likewise, the value
    of ACL_NOT_CACHED is odd. In contrast, ACL object pointers always have
    an even value (ACLs are aligned in memory). This allows to distinguish
    uncached ACLs values from ACL objects.

    In addition, switch from guarding inode->i_acl and inode->i_default_acl
    upates by the inode->i_lock spinlock to using xchg() and cmpxchg().

    Filesystems that do not want ACLs returned from their get_acl inode
    operations to be cached must call forget_cached_acl() to prevent the VFS
    from doing so.

    (Patch written by Al Viro and Andreas Gruenbacher.)

    Signed-off-by: Andreas Gruenbacher
    Signed-off-by: Al Viro

    Andreas Gruenbacher
     

03 Apr, 2015

1 commit


26 Jan, 2014

3 commits


10 Feb, 2013

1 commit

  • Operations which modify extended attributes may need extra journal
    credits if inline data is used, since there is a chance that some
    extended attributes may need to get pushed to an external attribute
    block.

    Changes to reflect this was made in xattr.c, but they were missed in
    fs/ext4/acl.c. To fix this, abstract the calculation of the number of
    credits needed for xattr operations to an inline function defined in
    ext4_jbd2.h, and use it in acl.c and xattr.c.

    Also move the function declarations used in inline.c from xattr.h
    (where they are non-obviously hidden, and caused problems since
    ext4_jbd2.h needs to use the function ext4_has_inline_data), and move
    them to ext4.h.

    Signed-off-by: "Theodore Ts'o"
    Reviewed-by: Tao Ma
    Reviewed-by: Jan Kara

    Theodore Ts'o
     

09 Feb, 2013

1 commit

  • So we can better understand what bits of ext4 are responsible for
    long-running jbd2 handles, use jbd2__journal_start() so we can pass
    context information for logging purposes.

    The recommended way for finding the longer-running handles is:

    T=/sys/kernel/debug/tracing
    EVENT=$T/events/jbd2/jbd2_handle_stats
    echo "interval > 5" > $EVENT/filter
    echo 1 > $EVENT/enable

    ./run-my-fs-benchmark

    cat $T/trace > /tmp/problem-handles

    This will list handles that were active for longer than 20ms. Having
    longer-running handles is bad, because a commit started at the wrong
    time could stall for those 20+ milliseconds, which could delay an
    fsync() or an O_SYNC operation. Here is an example line from the
    trace file describing a handle which lived on for 311 jiffies, or over
    1.2 seconds:

    postmark-2917 [000] .... 196.435786: jbd2_handle_stats: dev 254,32
    tid 570 type 2 line_no 2541 interval 311 sync 0 requested_blocks 1
    dirtied_blocks 0

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     

09 Nov, 2012

1 commit


18 Sep, 2012

2 commits

  • Convert ext2, ext3, and ext4 to fully support the posix acl changes,
    using e_uid e_gid instead e_id.

    Enabled building with posix acls enabled, all filesystems supporting
    user namespaces, now also support posix acls when user namespaces are enabled.

    Cc: Theodore Tso
    Cc: Andrew Morton
    Cc: Andreas Dilger
    Cc: Jan Kara
    Cc: Al Viro
    Signed-off-by: Eric W. Biederman

    Eric W. Biederman
     
  • - Pass the user namespace the uid and gid values in the xattr are stored
    in into posix_acl_from_xattr.

    - Pass the user namespace kuid and kgid values should be converted into
    when storing uid and gid values in an xattr in posix_acl_to_xattr.

    - Modify all callers of posix_acl_from_xattr and posix_acl_to_xattr to
    pass in &init_user_ns.

    In the short term this change is not strictly needed but it makes the
    code clearer. In the longer term this change is necessary to be able to
    mount filesystems outside of the initial user namespace that natively
    store posix acls in the linux xattr format.

    Cc: Theodore Tso
    Cc: Andrew Morton
    Cc: Andreas Dilger
    Cc: Jan Kara
    Cc: Al Viro
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     

01 Aug, 2011

2 commits


26 Jul, 2011

4 commits

  • Replace the ->check_acl method with a ->get_acl method that simply reads an
    ACL from disk after having a cache miss. This means we can replace the ACL
    checking boilerplate code with a single implementation in namei.c.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     
  • new helper: posix_acl_create(&acl, gfp, mode_p). Replaces acl with
    modified clone, on failure releases acl and replaces with NULL.
    Returns 0 or -ve on error. All callers of posix_acl_create_masq()
    switched.

    Signed-off-by: Al Viro

    Al Viro
     
  • new helper: posix_acl_chmod(&acl, gfp, mode). Replaces acl with modified
    clone or with NULL if that has failed; returns 0 or -ve on error. All
    callers of posix_acl_chmod_masq() switched to that - they'd been doing
    exactly the same thing.

    Signed-off-by: Al Viro

    Al Viro
     
  • This moves logic for checking the cached ACL values from low-level
    filesystems into generic code. The end result is a streamlined ACL
    check that doesn't need to load the inode->i_op->check_acl pointer at
    all for the common cached case.

    The filesystems also don't need to check for a non-blocking RCU walk
    case in their acl_check() functions, because that is all handled at a
    VFS layer.

    Signed-off-by: Linus Torvalds
    Signed-off-by: Al Viro

    Linus Torvalds
     

20 Jul, 2011

2 commits


24 Mar, 2011

1 commit


07 Jan, 2011

2 commits


16 Jun, 2010

1 commit


22 May, 2010

1 commit


17 Dec, 2009

1 commit

  • Add a flags argument to struct xattr_handler and pass it to all xattr
    handler methods. This allows using the same methods for multiple
    handlers, e.g. for the ACL methods which perform exactly the same action
    for the access and default ACLs, just using a different underlying
    attribute. With a little more groundwork it'll also allow sharing the
    methods for the regular user/trusted/secure handlers in extN, ocfs2 and
    jffs2 like it's already done for xfs in this patch.

    Also change the inode argument to the handlers to a dentry to allow
    using the handlers mechnism for filesystems that require it later,
    e.g. cifs.

    [with GFS2 bits updated by Steven Whitehouse ]

    Signed-off-by: Christoph Hellwig
    Reviewed-by: James Morris
    Acked-by: Joel Becker
    Signed-off-by: Al Viro

    Christoph Hellwig
     

09 Sep, 2009

1 commit


24 Jun, 2009

2 commits


17 Jun, 2009

1 commit

  • If a filesystem supports POSIX ACL's, the VFS layer expects the filesystem
    to do POSIX ACL checks on any files not owned by the caller, and it does
    this for every single pathname component that it looks up.

    That obviously can be pretty expensive if the filesystem isn't careful
    about it, especially with locking. That's doubly sad, since the common
    case tends to be that there are no ACL's associated with the files in
    question.

    ext4 already caches the ACL data so that it doesn't have to look it up
    over and over again, but it does so by taking the inode->i_lock spinlock
    on every lookup. Which is a noticeable overhead even if it's a private
    lock, especially on CPU's where the serialization is expensive (eg Intel
    Netburst aka 'P4').

    For the special case of not actually having any ACL's, all that locking is
    unnecessary. Even if somebody else were to be changing the ACL's on
    another CPU, we simply don't care - if we've seen a NULL ACL, we might as
    well use it.

    So just load the ACL speculatively without any locking, and if it was
    NULL, just use it. If it's non-NULL (either because we had a cached
    entry, or because the cache hasn't been filled in at all), it means that
    we'll need to get the lock and re-load it properly.

    (This commit was ported from a patch originally authored by Linus for
    ext3.)

    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Al Viro

    Theodore Ts'o
     

01 Apr, 2009

1 commit


27 Jul, 2008

2 commits