12 Nov, 2020

1 commit


07 Nov, 2020

1 commit

  • It takes xattr_sem to check inline data again, but does not unlock it
    if there is no inline data. So unlock it before returning.

    Fixes: aef1c8513c1f ("ext4: let ext4_truncate handle inline data correctly")
    Reported-by: Dan Carpenter
    Cc: Tao Ma
    Signed-off-by: Joseph Qi
    Reviewed-by: Andreas Dilger
    Link: https://lore.kernel.org/r/1604370542-124630-1-git-send-email-joseph.qi@linux.alibaba.com
    Signed-off-by: Theodore Ts'o
    Cc: stable@kernel.org

    Joseph Qi
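
    The locking fix described above can be sketched in userspace C. This is
    only an illustration: demo_sem, demo_inode and demo_inline_truncate are
    hypothetical names, and the counting "lock" is a toy stand-in for the
    kernel's xattr_sem. The point is that every exit path, including the
    early one taken when there is no inline data, goes through the unlock.

```c
#include <stdbool.h>

/* Toy lock that only counts holders; a stand-in for a real rw_semaphore. */
struct demo_sem { int held; };

static void demo_down_write(struct demo_sem *s) { s->held++; }
static void demo_up_write(struct demo_sem *s)   { s->held--; }

/* Hypothetical inode; not the kernel's struct inode. */
struct demo_inode {
    struct demo_sem xattr_sem;
    bool has_inline_data;
    bool truncated;
};

/* Returns 1 if inline data was truncated, 0 if there was none. */
int demo_inline_truncate(struct demo_inode *inode)
{
    int ret = 0;

    demo_down_write(&inode->xattr_sem);
    if (!inode->has_inline_data)
        goto out_unlock;        /* the early exit still drops the lock */

    inode->truncated = true;
    ret = 1;
out_unlock:
    demo_up_write(&inode->xattr_sem);
    return ret;
}
```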
     

27 Oct, 2020

1 commit


18 Oct, 2020

1 commit

  • Delete repeated words in fs/ext4/.
    {the, this, of, we, after}

    Also change spelling of "xttr" in inline.c to "xattr" in 2 places.

    Signed-off-by: Randy Dunlap
    Reviewed-by: Jan Kara
    Link: https://lore.kernel.org/r/20200805024850.12129-1-rdunlap@infradead.org
    Signed-off-by: Theodore Ts'o

    Randy Dunlap
     

24 Aug, 2020

1 commit


20 Aug, 2020

1 commit


19 Aug, 2020

1 commit


26 Jun, 2020

1 commit

  • This adds support for encryption with casefolding.

    Since the name on disk is case preserving, and also encrypted, we can no
    longer just recompute the hash on the fly. Additionally, to avoid
    leaking extra information from the hash of the unencrypted name, we use
    siphash via an fscrypt v2 policy.

    The hash is stored at the end of the directory entry for all entries
    inside of an encrypted and casefolded directory apart from those that
    deal with '.' and '..'. This way, the change is backwards compatible
    with existing ext4 filesystems.

    Signed-off-by: Daniel Rosenberg
    Signed-off-by: Paul Lawrence
    Test: Boots, /data/media is case insensitive
    Bug: 138322712
    Change-Id: I07354e3129aa07d309fbe36c002fee1af718f348

    Daniel Rosenberg
     

24 Jun, 2020

1 commit


04 Jun, 2020

1 commit

  • ext4_mark_inode_dirty() can fail for real reasons. Ignoring its return
    value may lead ext4 to ignore real failures that would result in
    corruption / crashes. Harden ext4_mark_inode_dirty error paths to fail
    as soon as possible and return errors to the caller whenever
    appropriate.

    One possible scenario in which this bug could have an effect is that,
    while creating a new inode, its directory entry gets added
    successfully but while writing the inode itself mark_inode_dirty
    returns error which is ignored. This would result in inconsistency
    that the directory entry points to a non-existent inode.

    Ran gce-xfstests smoke tests and verified that there were no
    regressions.

    Signed-off-by: Harshad Shirwadkar
    Link: https://lore.kernel.org/r/20200427013438.219117-1-harshadshirwadkar@gmail.com
    Signed-off-by: Theodore Ts'o

    Harshad Shirwadkar
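
    A minimal sketch of the error-propagation idea, assuming a hypothetical
    demo_mark_inode_dirty() that can fail. None of these names come from
    the patch, and the rollback of the directory entry is this sketch's own
    choice for keeping the toy state consistent, not a claim about what the
    kernel does.

```c
#include <errno.h>

/* Hypothetical directory that just counts its entries. */
struct demo_dir { int nr_entries; };

/* Stand-in for a dirtying step that can fail for real reasons. */
static int demo_mark_inode_dirty(int simulate_failure)
{
    return simulate_failure ? -EIO : 0;
}

/* Returns 0 on success or a negative errno; never ignores the failure. */
int demo_create(struct demo_dir *dir, int simulate_failure)
{
    int err;

    dir->nr_entries++;                  /* add the directory entry */
    err = demo_mark_inode_dirty(simulate_failure);
    if (err) {
        dir->nr_entries--;              /* no entry may point at a missing inode */
        return err;                     /* fail early and tell the caller */
    }
    return 0;
}
```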
     

09 Apr, 2020

1 commit


02 Apr, 2020

1 commit

  • Using a separate function, ext4_set_errno() to set the errno is
    problematic because it doesn't do the right thing once
    s_last_error_errorcode is non-zero. It's also less racy to set all of
    the error information all at once. (Also, as a bonus, it shrinks code
    size slightly.)

    Link: https://lore.kernel.org/r/20200329020404.686965-1-tytso@mit.edu
    Fixes: 878520ac45f9 ("ext4: save the error code which triggered...")
    Signed-off-by: Theodore Ts'o

    Theodore Ts'o
     

15 Mar, 2020

1 commit

  • This patch moves ext4_fiemap to use iomap framework.
    For xattr a new 'ext4_iomap_xattr_ops' is added.

    Reported-by: kbuild test robot
    Reviewed-by: Jan Kara
    Reviewed-by: Darrick J. Wong
    Link: https://lore.kernel.org/r/b9f45c885814fcdd0631747ff0fe08886270828c.1582880246.git.riteshh@linux.ibm.com
    Signed-off-by: Ritesh Harjani
    Signed-off-by: Theodore Ts'o

    Ritesh Harjani
     

07 Feb, 2020

1 commit


25 Jan, 2020

1 commit


27 Dec, 2019

1 commit

  • This allows the cause of an ext4_error() report to be categorized
    based on whether it was triggered due to an I/O error, or a memory
    allocation error, or other possible causes. Most errors are caused by
    a detected file system inconsistency, so the default code stored in
    the superblock will be EXT4_ERR_EFSCORRUPTED.

    Link: https://lore.kernel.org/r/20191204032335.7683-1-tytso@mit.edu
    Signed-off-by: Theodore Ts'o

    Theodore Ts'o
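
    The categorization can be pictured as a small mapping from an errno
    value to a stored code. The sketch below is an assumption-laden
    illustration: the DEMO_ERR_* names and their numeric values are
    hypothetical, and treating "no specific errno" (0) as the corruption
    default is this sketch's reading of the message above.

```c
#include <errno.h>

/* Hypothetical category codes; values are illustrative only. */
enum demo_err_code {
    DEMO_ERR_UNKNOWN = 1,
    DEMO_ERR_EIO,
    DEMO_ERR_ENOMEM,
    DEMO_ERR_EFSCORRUPTED
};

/* Accepts either sign convention (EIO or -EIO). */
enum demo_err_code demo_errno_to_code(int err)
{
    if (err < 0)
        err = -err;

    switch (err) {
    case 0:
        return DEMO_ERR_EFSCORRUPTED;   /* default: detected inconsistency */
    case EIO:
        return DEMO_ERR_EIO;
    case ENOMEM:
        return DEMO_ERR_ENOMEM;
    default:
        return DEMO_ERR_UNKNOWN;
    }
}
```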
     

23 Sep, 2019

1 commit


13 Aug, 2019

1 commit

  • Currently, when the call to ext4_htree_store_dirent fails, the error code is
    assigned to the variable 'count' instead of the error return variable 'ret',
    hence the error code is not being returned. Fix this by assigning the error
    return code to ret.

    Addresses-Coverity: ("Unused value")
    Fixes: 8af0f0822797 ("ext4: fix readdir error in the case of inline_data+dir_index")
    Signed-off-by: Colin Ian King
    Signed-off-by: Theodore Ts'o

    Colin Ian King
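
    The shape of this bug class is easy to show in isolation. In the sketch
    below, demo_store_dirent and demo_read_entries are hypothetical
    stand-ins, not the kernel functions; the only point is that the error
    must land in the variable that is actually returned.

```c
#include <errno.h>

/* Hypothetical store step that fails at one chosen index. */
static int demo_store_dirent(int i, int fail_at)
{
    return (i == fail_at) ? -ENOMEM : 0;
}

/* Returns the number of entries stored, or a negative errno. */
int demo_read_entries(int total, int fail_at)
{
    int count = 0, ret = 0, err;

    for (int i = 0; i < total; i++) {
        err = demo_store_dirent(i, fail_at);
        if (err) {
            ret = err;      /* record the error in ret, not in count */
            goto out;
        }
        count++;
    }
    ret = count;
out:
    return ret;
}
```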
     

24 Jul, 2019

1 commit


22 Jun, 2019

3 commits

  • Clean up namespace pollution by the inline_data code.

    Signed-off-by: Theodore Ts'o

    Theodore Ts'o
     
  • Move the calculation of the location of the dirent tail into
    initialize_dirent_tail(). Also prefix the function with ext4_ to fix
    kernel namespace pollution.

    Signed-off-by: Theodore Ts'o

    Theodore Ts'o
     
  • Functions such as ext4_dirent_csum_verify() and ext4_dirent_csum_set()
    don't actually operate on a directory entry, but a directory block.
    And while they take a struct ext4_dir_entry *dirent as an argument, it
    had better be the first directory entry at the beginning of the directory
    block, or things will go very wrong.

    Rename the following functions so that things make more sense, and
    remove a lot of confusing casts along the way:

    ext4_dirent_csum_verify -> ext4_dirblock_csum_verify
    ext4_dirent_csum_set -> ext4_dirblock_csum_set
    ext4_dirent_csum -> ext4_dirblock_csum
    ext4_handle_dirty_dirent_node -> ext4_handle_dirty_dirblock

    Signed-off-by: Theodore Ts'o

    Theodore Ts'o
     

21 May, 2019

1 commit


04 May, 2019

2 commits

  • Change-Id: I4380c68c3474026a42ffa9f95c525f9a563ba7a3

    Todd Kjos
     
  • Adds tracepoints in ext4/f2fs/mpage to track readpages/buffered
    write()s. This allows us to track files that are being read/written
    to PIDs.

    Bug: 120445624
    Change-Id: I44476230324e9397e292328463f846af4befbd6d
    [joelaf: Needed for storaged fsync accounting ("storaged --uid" and
    "storaged --task".)]
    Signed-off-by: Mohan Srinivasan
    [AmitP: Folded following android-4.9 commit changes into this patch
    a5c4dbb05ab7 ("ANDROID: Replace spaces by '_' for some
    android filesystem tracepoints.")]
    Signed-off-by: Amit Pundir
    [astrachan: Folded 63066f4acf92 ("ANDROID: fs: Refactor FS
    readpage/write tracepoints.") into this patch]
    Signed-off-by: Alistair Strachan

    Mohan Srinivasan
     

26 Apr, 2019

1 commit

  • This patch implements the actual support for case-insensitive file name
    lookups in ext4, based on the feature bit and the encoding stored in the
    superblock.

    A filesystem that has the casefold feature set is able to configure
    directories with the +F (EXT4_CASEFOLD_FL) attribute, enabling lookups
    to succeed in that directory in a case-insensitive fashion, i.e: match
    a directory entry even if the name used by userspace is not a byte per
    byte match with the disk name, but is an equivalent case-insensitive
    version of the Unicode string. This operation is called a
    case-insensitive file name lookup.

    The feature is configured as an inode attribute applied to directories
    and inherited by its children. This attribute can only be enabled on
    empty directories for filesystems that support the encoding feature,
    thus preventing collision of file names that only differ by case.

    * dcache handling:

    For a +F directory, Ext4 only stores the first equivalent name dentry
    used in the dcache. This is done to prevent unintentional duplication of
    dentries in the dcache, while also allowing the VFS code to quickly find
    the right entry in the cache despite which equivalent string was used in
    a previous lookup, without having to resort to ->lookup().

    d_hash() of casefolded directories is implemented as the hash of the
    casefolded string, such that we always have a well-known bucket for all
    the equivalencies of the same string. d_compare() uses the
    utf8_strncasecmp() infrastructure, which handles the comparison of
    equivalent, same case, names as well.

    For now, negative lookups are not inserted in the dcache, since they
    would need to be invalidated anyway, because we can't trust missing file
    dentries. This is bad for performance but requires some leveraging of
    the vfs layer to fix. We can live without that for now, and so does
    everyone else.

    * on-disk data:

    Despite using a specific version of the name as the internal
    representation within the dcache, the name stored and fetched from the
    disk is a byte-per-byte match with what the user requested, making this
    implementation 'name-preserving'. i.e. no actual information is lost
    when writing to storage.

    DX is supported by modifying the hashes used in +F directories to make
    them case/encoding-aware. The new disk hashes are calculated as the
    hash of the full casefolded string, instead of the string directly.
    This allows us to efficiently search for file names in the htree without
    requiring the user to provide an exact name.

    * Dealing with invalid sequences:

    By default, when an invalid UTF-8 sequence is identified, ext4 will treat
    it as an opaque byte sequence, ignoring the encoding and reverting to
    the old behavior for that unique file. This means that case-insensitive
    file name lookup will not work only for that file. An optional bit can
    be set in the superblock telling the filesystem code and userspace tools
    to enforce the encoding. When that optional bit is set, any attempt to
    create a file name using an invalid UTF-8 sequence will fail and return
    an error to userspace.

    * Normalization algorithm:

    The UTF-8 algorithm used to compare strings in ext4 lives in
    fs/unicode, and is based on a previous version developed by
    SGI. It implements the Canonical decomposition (NFD) algorithm
    described by the Unicode specification 12.1, or higher, combined with
    the elimination of ignorable code points (NFDi) and full
    case-folding (CF) as documented in fs/unicode/utf8_norm.c.

    NFD seems to be the best normalization method for EXT4 because:

    - It has a lower cost than NFC/NFKC (which requires
    decomposing to NFD as an intermediary step)
    - It doesn't eliminate important semantic meaning like
    compatibility decompositions.

    Although:

    - This implementation is not completely linguistically accurate, because
    different languages have conflicting rules, which would require the
    specialization of the filesystem to a given locale, which brings all
    sorts of problems for removable media and for users who use more than
    one language.

    Signed-off-by: Gabriel Krisman Bertazi
    Signed-off-by: Theodore Ts'o

    Gabriel Krisman Bertazi
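
    The dcache idea in this message, hashing the casefolded form so that
    all equivalent names share a bucket and then comparing
    case-insensitively, can be sketched as follows. This is a deliberate
    simplification: it folds ASCII only with tolower(), whereas the commit
    relies on the Unicode normalization and casefolding in fs/unicode, and
    the FNV-1a hash is an arbitrary choice for the example. All demo_*
    names are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* ASCII-only fold; real casefolding is Unicode-aware. */
static unsigned char demo_fold(unsigned char c)
{
    return (unsigned char)tolower(c);
}

/* Hash of the folded name, so equivalent names hash identically. */
uint32_t demo_casefold_hash(const char *name, size_t len)
{
    uint32_t h = 2166136261u;           /* FNV-1a offset basis */

    for (size_t i = 0; i < len; i++) {
        h ^= demo_fold((unsigned char)name[i]);
        h *= 16777619u;                 /* FNV-1a prime */
    }
    return h;
}

/* Returns 0 when the two names are equal after folding, 1 otherwise. */
int demo_casefold_cmp(const char *a, size_t alen, const char *b, size_t blen)
{
    if (alen != blen)
        return 1;
    for (size_t i = 0; i < alen; i++) {
        if (demo_fold((unsigned char)a[i]) != demo_fold((unsigned char)b[i]))
            return 1;
    }
    return 0;
}
```

    Note that the names themselves are never modified, which mirrors the
    'name-preserving' property described above: only the hash and the
    comparison see the folded form.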
     

25 Dec, 2018

1 commit

  • The ext4_inline_data_fiemap() function calls fiemap_fill_next_extent()
    while still holding the xattr semaphore. This is not necessary and it
    triggers a circular lockdep warning. This is because
    fiemap_fill_next_extent() could trigger a page fault when it writes
    into a page. If that page is mmaped from
    the inline file in question, this could very well result in a
    deadlock.

    This problem can be reproduced using generic/519 with a file system
    configuration which has the inline_data feature enabled.

    Signed-off-by: Theodore Ts'o
    Cc: stable@kernel.org

    Theodore Ts'o
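
    The general pattern behind this fix is to snapshot what is needed while
    holding the lock, drop the lock, and only then call the function that
    may fault. The sketch below uses hypothetical names throughout and a
    flag in place of the real semaphore; demo_fill_next_extent simply
    refuses to run while the "lock" is held so the ordering is observable.

```c
/* Hypothetical inode with a flag standing in for the xattr semaphore. */
struct demo_inode {
    int xattr_sem_held;
    unsigned long inline_off;
    unsigned long inline_len;
};

/* Stand-in for a callee that may fault; fails if called under the lock. */
static int demo_fill_next_extent(const struct demo_inode *inode,
                                 unsigned long off, unsigned long len)
{
    (void)off;
    (void)len;
    return inode->xattr_sem_held ? -1 : 0;
}

int demo_inline_fiemap(struct demo_inode *inode)
{
    unsigned long off, len;

    inode->xattr_sem_held = 1;          /* take the lock */
    off = inode->inline_off;            /* copy what we need under the lock */
    len = inode->inline_len;
    inode->xattr_sem_held = 0;          /* drop it before the risky call */

    return demo_fill_next_extent(inode, off, len);
}
```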
     

04 Dec, 2018

1 commit


03 Oct, 2018

1 commit


27 Aug, 2018

1 commit

  • A specially crafted file system can trick empty_inline_dir() into
    reading past the last valid entry in an inline directory, and then run
    into the end of xattr marker. This will trigger a divide by zero
    fault. Fix this by using the size of the inline directory instead of
    dir->i_size.

    Also clean up error reporting in __ext4_check_dir_entry so that the
    message is clearer and more understandable --- and avoids the division
    by zero trap if the size passed in is zero. (I'm not sure why we
    coded it that way in the first place; printing offset % size is
    actually more confusing and less useful.)

    https://bugzilla.kernel.org/show_bug.cgi?id=200933

    Signed-off-by: Theodore Ts'o
    Reported-by: Wen Xu
    Cc: stable@vger.kernel.org

    Theodore Ts'o
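
    A sketch of the bounds-checking idea, assuming a simplified record
    layout (a 2-byte native-endian record length followed by payload) that
    is not the ext4 on-disk format. demo_count_entries is a hypothetical
    helper: it walks records using the size of the buffer that actually
    holds them, rejects any record that would run past it, and never
    divides by a length that could be zero.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns the number of records, or -1 if a record is malformed. */
int demo_count_entries(const uint8_t *buf, size_t inline_size)
{
    size_t off = 0;
    int n = 0;

    while (off + sizeof(uint16_t) <= inline_size) {
        uint16_t rec_len;

        memcpy(&rec_len, buf + off, sizeof(rec_len));
        /* A zero or oversized length means a corrupt (or crafted) image. */
        if (rec_len < sizeof(uint16_t) || rec_len > inline_size - off)
            return -1;
        n++;
        off += rec_len;
    }
    return n;
}
```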
     

10 Jul, 2018

1 commit

  • The inline data code was updating the raw inode directly; this is
    problematic since if metadata checksums are enabled,
    ext4_mark_inode_dirty() must be called to update the inode's checksum.
    In addition, the jbd2 layer requires that get_write_access() be called
    before the metadata buffer is modified. Fix both of these problems.

    https://bugzilla.kernel.org/show_bug.cgi?id=200443

    Signed-off-by: Theodore Ts'o
    Cc: stable@vger.kernel.org

    Theodore Ts'o
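
    The ordering this message asks for (obtain write access, modify, then
    mark dirty so the checksum is refreshed) can be illustrated with a toy
    buffer. Everything here is hypothetical: demo_buf is not a kernel
    structure, and the XOR "checksum" only exists to show that it is
    recomputed after the modification.

```c
/* Hypothetical metadata buffer. */
struct demo_buf {
    int write_access;
    int dirty;
    unsigned value;
    unsigned checksum;
};

/* Stand-in for asking the journal for write access. */
static int demo_get_write_access(struct demo_buf *b)
{
    b->write_access = 1;
    return 0;
}

/* Stand-in for marking dirty, which also refreshes the checksum. */
static void demo_mark_dirty(struct demo_buf *b)
{
    b->checksum = b->value ^ 0xA5A5A5A5u;
    b->dirty = 1;
}

int demo_update(struct demo_buf *b, unsigned v)
{
    int err = demo_get_write_access(b);    /* 1: access before modifying */

    if (err)
        return err;
    b->value = v;                           /* 2: modify the metadata */
    demo_mark_dirty(b);                     /* 3: dirty it, updating the checksum */
    return 0;
}
```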
     

09 Jul, 2018

1 commit

  • Pull ext4 bugfixes from Ted Ts'o:
    "Bug fixes for ext4; most of which relate to vulnerabilities where a
    maliciously crafted file system image can result in a kernel OOPS or
    hang.

    At least one fix addresses an inline data bug that could be triggered by
    userspace without the need of a crafted file system (although it does
    require that the inline data feature be enabled)"

    * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
    ext4: check superblock mapped prior to committing
    ext4: add more mount time checks of the superblock
    ext4: add more inode number paranoia checks
    ext4: avoid running out of journal credits when appending to an inline file
    jbd2: don't mark block as modified if the handle is out of credits
    ext4: never move the system.data xattr out of the inode body
    ext4: clear i_data in ext4_inode_info when removing inline data
    ext4: include the illegal physical block in the bad map ext4_error msg
    ext4: verify the depth of extent tree in ext4_find_extent()
    ext4: only look at the bg_flags field if it is valid
    ext4: make sure bitmaps and the inode table don't overlap with bg descriptors
    ext4: always check block group bounds in ext4_init_block_bitmap()
    ext4: always verify the magic number in xattr blocks
    ext4: add corruption check in ext4_xattr_set_entry()
    ext4: add warn_on_error mount option

    Linus Torvalds
     

17 Jun, 2018

1 commit


16 Jun, 2018

1 commit

  • When converting an inode from storing the data in-line to a data
    block, ext4_destroy_inline_data_nolock() was only clearing the on-disk
    copy of the i_blocks[] array. It was not clearing the copy of the
    i_blocks[] in ext4_inode_info, in i_data[], which is the copy actually
    used by ext4_map_blocks().

    This didn't matter much if we are using extents, since the extents
    header would be invalid and thus the extents code would re-initialize
    the extents tree. But if we are using indirect blocks, the previous
    contents of the i_blocks array will be treated as block numbers, with
    potentially catastrophic results to the file system integrity and/or
    user data.

    This gets worse if the file system is using a 1k block size and
    s_first_data is zero, but even without this, the file system can get
    quite badly corrupted.

    This addresses CVE-2018-10881.

    https://bugzilla.kernel.org/show_bug.cgi?id=200015

    Signed-off-by: Theodore Ts'o
    Cc: stable@kernel.org

    Theodore Ts'o
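
    The underlying issue is state that lives in two places, where clearing
    only one copy leaves stale data in the other. A minimal sketch with
    hypothetical structures (demo_raw_inode for the on-disk copy,
    demo_inode_info for the in-memory copy; the array size is
    illustrative):

```c
#include <stdint.h>
#include <string.h>

#define DEMO_N_BLOCKS 15    /* illustrative size */

struct demo_raw_inode  { uint32_t i_block[DEMO_N_BLOCKS]; };  /* on-disk copy */
struct demo_inode_info { uint32_t i_data[DEMO_N_BLOCKS]; };   /* in-memory copy */

/* Clear both copies so stale bytes are never read back as block numbers. */
void demo_destroy_inline_data(struct demo_raw_inode *raw,
                              struct demo_inode_info *ei)
{
    memset(raw->i_block, 0, sizeof(raw->i_block));
    memset(ei->i_data, 0, sizeof(ei->i_data));
}
```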
     

06 Jun, 2018

1 commit

  • Pull xfs updates from Darrick Wong:
    "New features this cycle include the ability to relabel mounted
    filesystems, support for fallocated swapfiles, and using FUA for pure
    data O_DSYNC directio writes. With this cycle we begin to integrate
    online filesystem repair and refactor the growfs code in preparation
    for eventual subvolume support, though the road ahead for both
    features is quite long.

    There are also numerous refactorings of the iomap code to remove
    unnecessary log overhead, to disentangle some of the quota code, and
    to prepare for buffer head removal in a future upstream kernel.

    Metadata validation continues to improve, both in the hot path
    verifiers and the online filesystem check code. I anticipate sending a
    second pull request in a few days with more metadata validation
    improvements.

    This series has been run through a full xfstests run over the weekend
    and through a quick xfstests run against this morning's master, with
    no major failures reported.

    Summary:

    - Strengthen inode number and structure validation when allocating
    inodes.

    - Reduce pointless buffer allocations during cache miss

    - Use FUA for pure data O_DSYNC directio writes

    - Various iomap refactorings

    - Strengthen quota metadata verification to avoid unfixable broken
    quota

    - Make AGFL block freeing a deferred operation to avoid blowing out
    transaction reservations when running complex operations

    - Get rid of the log item descriptors to reduce log overhead

    - Fix various reflink bugs where inodes were double-joined to
    transactions

    - Don't issue discards when trimming unwritten extents

    - Refactor incore dquot initialization and retrieval interfaces

    - Fix some locking problems in the quota scrub code

    - Strengthen btree structure checks in scrub code

    - Rewrite swapfile activation to use iomap and support unwritten
    extents

    - Make scrub exit to userspace sooner when corruptions or
    cross-referencing problems are found

    - Make scrub invoke the data fork scrubber directly on metadata
    inodes

    - Don't do background reclamation of post-eof and cow blocks when the
    fs is suspended

    - Fix secondary superblock buffer lifespan hinting

    - Refactor growfs to use table-dispatched functions instead of long
    stringy functions

    - Move growfs code to libxfs

    - Implement online fs label getting and setting

    - Introduce online filesystem repair (in a very limited capacity)

    - Fix unit conversion problems in the realtime freemap iteration
    functions

    - Various refactorings and cleanups in preparation to remove buffer
    heads in a future release

    - Reimplement the old bmap call with iomap

    - Remove direct buffer head accesses from seek hole/data

    - Various bug fixes"

    * tag 'xfs-4.18-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (121 commits)
    fs: use ->is_partially_uptodate in page_cache_seek_hole_data
    fs: remove the buffer_unwritten check in page_seek_hole_data
    fs: move page_cache_seek_hole_data to iomap.c
    xfs: use iomap_bmap
    iomap: add an iomap-based bmap implementation
    iomap: add a iomap_sector helper
    iomap: use __bio_add_page in iomap_dio_zero
    iomap: move IOMAP_F_BOUNDARY to gfs2
    iomap: fix the comment describing IOMAP_NOWAIT
    iomap: inline data should be an iomap type, not a flag
    mm: split ->readpages calls to avoid non-contiguous pages lists
    mm: return an unsigned int from __do_page_cache_readahead
    mm: give the 'ret' variable a better name __do_page_cache_readahead
    block: add a lower-level bio_add_page interface
    xfs: fix error handling in xfs_refcount_insert()
    xfs: fix xfs_rtalloc_rec units
    xfs: strengthen rtalloc query range checks
    xfs: xfs_rtbuf_get should check the bmapi_read results
    xfs: xfs_rtword_t should be unsigned, not signed
    dax: change bdev_dax_supported() to support boolean returns
    ...

    Linus Torvalds
     

02 Jun, 2018

1 commit


23 May, 2018

1 commit

  • The inline data feature was implemented before we added support for
    external inodes for xattrs. It makes no sense to support that
    combination, but the problem is that there are a number of extended
    attribute checks that are skipped if e_value_inum is non-zero.

    Unfortunately, the inline data code is completely e_value_inum
    unaware, and attempts to interpret the xattr fields as if it were an
    inline xattr --- at which point, Hilarity Ensues.

    This addresses CVE-2018-11412.

    https://bugzilla.kernel.org/show_bug.cgi?id=199803

    Reported-by: Jann Horn
    Reviewed-by: Andreas Dilger
    Signed-off-by: Theodore Ts'o
    Fixes: e50e5129f384 ("ext4: xattr-in-inode support")
    Cc: stable@kernel.org

    Theodore Ts'o
     

08 Feb, 2018

1 commit

  • Pull inode->i_version cleanup from Jeff Layton:
    "Goffredo went ahead and sent a patch to rename this function, and
    reverse its sense, as we discussed last week.

    The patch is very straightforward and I figure it's probably best to
    go ahead and merge this to get the API as settled as possible"

    * tag 'iversion-v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
    iversion: Rename make inode_cmp_iversion{+raw} to inode_eq_iversion{+raw}

    Linus Torvalds
     

04 Feb, 2018

1 commit

  • Pull ext4 updates from Ted Ts'o:
    "Only miscellaneous cleanups and bug fixes for ext4 this cycle"

    * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
    ext4: create ext4_kset dynamically
    ext4: create ext4_feat kobject dynamically
    ext4: release kobject/kset even when init/register fail
    ext4: fix incorrect indentation of if statement
    ext4: correct documentation for grpid mount option
    ext4: use 'sbi' instead of 'EXT4_SB(sb)'
    ext4: save error to disk in __ext4_grp_locked_error()
    jbd2: fix sphinx kernel-doc build warnings
    ext4: fix a race in the ext4 shutdown path
    mbcache: make sure c_entry_count is not decremented past zero
    ext4: no need flush workqueue before destroying it
    ext4: fixed alignment and minor code cleanup in ext4.h
    ext4: fix ENOSPC handling in DAX page fault handler
    dax: pass detailed error code from dax_iomap_fault()
    mbcache: revert "fs/mbcache.c: make count_objects() more robust"
    mbcache: initialize entry->e_referenced in mb_cache_entry_create()
    ext4: fix up remaining files with SPDX cleanups

    Linus Torvalds
     

01 Feb, 2018

1 commit