18 Dec, 2012

1 commit


17 Dec, 2012

1 commit

  • Pull ext4 update from Ted Ts'o:
    "There are two major features for this merge window. The first is
    inline data, which allows small files or directories to be stored in
    the in-inode extended attribute area. (This requires that the file
    system use inodes which are at least 256 bytes or larger; 128 byte
    inodes do not have any room for in-inode xattrs.)

    The second new feature is SEEK_HOLE/SEEK_DATA support. This is
    enabled by the extent status tree patches, and this infrastructure
    will be used to further optimize ext4 in the future.

    Beyond that, we have the usual collection of code cleanups and bug
    fixes."

    * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (63 commits)
    ext4: zero out inline data using memset() instead of empty_zero_page
    ext4: ensure Inode flags consistency are checked at build time
    ext4: Remove CONFIG_EXT4_FS_XATTR
    ext4: remove unused variable from ext4_ext_in_cache()
    ext4: remove redundant initialization in ext4_fill_super()
    ext4: remove redundant code in ext4_alloc_inode()
    ext4: use sync_inode_metadata() when syncing inode metadata
    ext4: enable ext4 inline support
    ext4: let fallocate handle inline data correctly
    ext4: let ext4_truncate handle inline data correctly
    ext4: evict inline data out if we need to store xattr in inode
    ext4: let fiemap work with inline data
    ext4: let ext4_rename handle inline dir
    ext4: let empty_dir handle inline dir
    ext4: let ext4_delete_entry() handle inline data
    ext4: make ext4_delete_entry generic
    ext4: let ext4_find_entry handle inline data
    ext4: create a new function search_dir
    ext4: let ext4_readdir handle inline data
    ext4: let add_dir_entry handle inline data properly
    ...

    Linus Torvalds
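
As a rough illustration of what SEEK_HOLE/SEEK_DATA support offers to userspace, the sketch below walks a file's data regions with lseek() and adds up their lengths. The helper name count_data_bytes is made up for this example, and the fallback values for SEEK_DATA and SEEK_HOLE assume Linux.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Linux values, used only if the libc headers did not expose them. */
#ifndef SEEK_DATA
#define SEEK_DATA 3
#define SEEK_HOLE 4
#endif

/* Sum the lengths of all data regions in fd. Returns -1 on error. */
off_t count_data_bytes(int fd)
{
    off_t total = 0;
    off_t pos = 0;

    for (;;) {
        /* Find the start of the next data region at or after pos. */
        off_t data = lseek(fd, pos, SEEK_DATA);
        if (data < 0) {
            if (errno == ENXIO)
                return total;   /* no data at or after pos */
            return -1;
        }
        /* Find where that data region ends (EOF counts as a hole). */
        off_t hole = lseek(fd, data, SEEK_HOLE);
        if (hole < 0)
            return -1;
        total += hole - data;
        pos = hole;
    }
}
```

On a filesystem without native support, Linux generally reports the whole file as a single data region with a hole at end of file, in which case this function simply returns the file size.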
     

14 Dec, 2012

1 commit

  • Pull trivial branch from Jiri Kosina:
    "Usual stuff -- comment/printk typo fixes, documentation updates, dead
    code elimination."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
    HOWTO: fix double words typo
    x86 mtrr: fix comment typo in mtrr_bp_init
    propagate name change to comments in kernel source
    doc: Update the name of profiling based on sysfs
    treewide: Fix typos in various drivers
    treewide: Fix typos in various Kconfig
    wireless: mwifiex: Fix typo in wireless/mwifiex driver
    messages: i2o: Fix typo in messages/i2o
    scripts/kernel-doc: check that non-void fcts describe their return value
    Kernel-doc: Convention: Use a "Return" section to describe return values
    radeon: Fix typo and copy/paste error in comments
    doc: Remove unnecessary declarations from Documentation/accounting/getdelays.c
    various: Fix spelling of "asynchronous" in comments.
    Fix misspellings of "whether" in comments.
    eisa: Fix spelling of "asynchronous".
    various: Fix spelling of "registered" in comments.
    doc: fix quite a few typos within Documentation
    target: iscsi: fix comment typos in target/iscsi drivers
    treewide: fix typo of "suport" in various comments and Kconfig
    treewide: fix typo of "suppport" in various comments
    ...

    Linus Torvalds
     

11 Dec, 2012

28 commits

  • Not all architectures (in particular, sparc64) have empty_zero_page.
    So instead of copying from empty_zero_page, use memset to clear the
    inline data by signalling to ext4_xattr_set_entry() via a magic
    pointer value, EXT4_ZERO_ATTR_VALUE, which is defined by casting -1 to
    a pointer.

    This fixes a build failure on sparc64, and the memset() should be more
    efficient than using memcpy() anyway.

    Signed-off-by: Tao Ma
    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
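
The magic-pointer idea described above can be sketched in plain userspace C. This is only an illustration of the technique, not the ext4 code: DEMO_ZERO_VALUE and demo_set_value are hypothetical names standing in for the sentinel and for the function that receives it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sentinel: -1 cast to a pointer, never a valid buffer address. */
#define DEMO_ZERO_VALUE ((const void *)-1)

/*
 * Fill dst with size bytes. If src is the sentinel, zero the destination
 * with memset(); otherwise copy from src as usual.
 */
void demo_set_value(void *dst, const void *src, size_t size)
{
    if (src == DEMO_ZERO_VALUE)
        memset(dst, 0, size);       /* no zero-filled source page needed */
    else
        memcpy(dst, src, size);
}
```

The caller no longer needs a zero-filled source buffer at all, which is what removes the dependency on empty_zero_page.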
     
  • Flags used by atomic operations on inode flags (e.g.
    ext4_test_inode_flag()) should be consistent with those actually
    stored in inodes, i.e. EXT4_XXX_FL.

    This patch ensures that this consistency is checked at build time,
    not at run time.

    Currently, flag consistency is checked at run time, but there is no
    real reason not to do a build-time check instead. The code compares
    macro-defined values with enum constants; both are constants, so
    there is no problem in comparing them at build time.

    enum constants are treated as constants by the C compiler, according
    to the C99 specs (see www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf
    sec. 6.2.5, item 16), so there is no real problem in comparing an
    enumeration type at build time.

    Signed-off-by: Carlos Maiolino
    Signed-off-by: "Theodore Ts'o"

    Carlos Maiolino
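
A build-time check of this kind might look like the sketch below. It uses C11 _Static_assert rather than any kernel macro, and all DEMO_ names are invented for illustration; the mask and bit values are examples only.

```c
/* Hypothetical on-disk flag masks (macro-defined values). */
#define DEMO_SYNC_FL    0x00000008u
#define DEMO_APPEND_FL  0x00000020u

/* Hypothetical bit numbers used by the atomic bit operations. */
enum {
    DEMO_INODE_SYNC   = 3,
    DEMO_INODE_APPEND = 5,
};

/* Fails the build if a mask and its bit number ever disagree. */
#define DEMO_CHECK_FLAG(NAME)                                        \
    _Static_assert(DEMO_##NAME##_FL == (1u << DEMO_INODE_##NAME),    \
                   "flag mask does not match its bit number")

DEMO_CHECK_FLAG(SYNC);
DEMO_CHECK_FLAG(APPEND);

/* Non-atomic stand-in for a test-bit helper, for demonstration only. */
int demo_test_flag(unsigned long flags, int bit)
{
    return (flags >> bit) & 1ul;
}
```

Because both sides of the comparison are integer constant expressions, a mismatch is reported by the compiler and costs nothing at run time.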
     
  • Ted has sent out an RFC about removing this feature. Eric and Jan
    confirmed that both RedHat and SUSE enable this feature in all their
    products. David also said that "As far as I know, it's enabled in all
    Android kernels that use ext4." So it seems OK for us.

    What's more, the inline data implementation depends on xattr, and to
    be frank, I haven't run any tests against inline data enabled with
    xattr disabled. So I think we should add inline data and remove this
    config option in the same release.

    [ The savings if you disable CONFIG_EXT4_FS_XATTR is only 27k, which
    isn't much in the grand scheme of things. Since no one seems to be
    testing this configuration except for some automated compile farms, on
    balance we are better off removing this config option, so that it is
    effectively always enabled. -- tytso ]

    Cc: David Brown
    Cc: Eric Sandeen
    Reviewed-by: Jan Kara
    Signed-off-by: Tao Ma
    Signed-off-by: "Theodore Ts'o"

    Tao Ma
     
  • Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Zhi Yong Wu
    Reviewed-by: Zheng Liu

    Zhi Yong Wu
     
  • We use kzalloc() to allocate sbi, so there is no need to zero its fields.

    Signed-off-by: Guo Chao
    Signed-off-by: "Theodore Ts'o"

    Guo Chao
     
  • inode_init_always() will initialize inode->i_data.writeback_index
    anyway, so there is no need to do this in ext4_alloc_inode().

    Signed-off-by: Guo Chao
    Signed-off-by: "Theodore Ts'o"
    Reviewed-by: Lukas Czerner

    Guo Chao
     
  • We have a dedicated interface to sync inode metadata. Use it to
    simplify ext4's code some.

    Signed-off-by: Guo Chao
    Signed-off-by: "Theodore Ts'o"
    Reviewed-by: Lukas Czerner

    Guo Chao
     
  • Signed-off-by: Tao Ma
    Signed-off-by: "Theodore Ts'o"

    Tao Ma
     
  • If we are punching a hole in a file with inline data, we will return
    ENOTSUPP. As for the fallocation of some extents, we will first
    convert the inline data to a normal extent-based file.

    Signed-off-by: Tao Ma
    Signed-off-by: "Theodore Ts'o"

    Tao Ma
     
  • Signed-off-by: Robin Dong
    Signed-off-by: Tao Ma
    Signed-off-by: "Theodore Ts'o"

    Tao Ma
     
  • Now that we store data in the inode, in case we need to store some
    xattrs and the inode doesn't have enough space, Andreas suggested that
    we should keep the xattr (metadata) in and push the data out. So this
    patch does the work.

    Signed-off-by: Tao Ma
    Signed-off-by: "Theodore Ts'o"

    Tao Ma
     
  • fiemap is used to find the disk layout of a file. For inline data,
    let us just pretend it is a file with a single extent.

    Signed-off-by: Tao Ma
    Signed-off-by: "Theodore Ts'o"

    Tao Ma
     
  • In case we rename a directory, ext4_rename has to read the dir block
    and change its dotdot's information. The old ext4_rename encapsulated
    the dir_block read into itself. So this patch adds a new function
    ext4_get_first_dir_block() which gets the dir buffer information so
    that ext4_rename can handle it properly. As it will also change the
    parent inode number, we return the parent_de so that ext4_rename() can
    handle it more easily.

    ext4_find_entry is also changed so that the caller (rename) can tell
    whether the found entry is an inlined one or not and journal the
    corresponding buffer head.

    Signed-off-by: Tao Ma
    Signed-off-by: "Theodore Ts'o"

    Tao Ma
     
  • empty_dir is used when deleting a dir. So it should handle inline dir
    properly.

    Signed-off-by: Tao Ma
    Signed-off-by: "Theodore Ts'o"

    Tao Ma
     
  • Signed-off-by: Tao Ma
    Signed-off-by: "Theodore Ts'o"

    Tao Ma
     
  • Currently ext4_delete_entry() is used only for removing a dir entry
    from a dir block. So let us create a new function,
    ext4_generic_delete_entry, which takes an entry_buf and a buf_size so
    that it can be used for inline data.

    Signed-off-by: Tao Ma
    Signed-off-by: "Theodore Ts'o"

    Tao Ma
     
  • Create a new function ext4_find_inline_entry() to handle the case of
    inline data.

    Signed-off-by: Tao Ma
    Signed-off-by: "Theodore Ts'o"

    Tao Ma
     
  • search_dirblock is used to search a dir block, but the code is almost
    the same for searching an inline dir.

    So create a new function search_dir and let search_dirblock call it.

    Signed-off-by: Tao Ma
    Signed-off-by: "Theodore Ts'o"

    Tao Ma
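
The reason one search routine can serve both cases is that it only needs a buffer and its size, not a block. A minimal sketch of that idea follows, using a simplified, hypothetical record layout (struct demo_dirent) in host byte order rather than the real ext4 on-disk structures; demo_put_dirent exists only to build test data.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified record header (8 bytes); the name follows it. */
struct demo_dirent {
    uint32_t inode;      /* 0 marks an unused record */
    uint16_t rec_len;    /* distance to the next record, in bytes */
    uint8_t  name_len;
    uint8_t  file_type;
};

/* Test helper: write one record with the given name at offset off. */
void demo_put_dirent(unsigned char *buf, size_t off, uint32_t inode,
                     uint16_t rec_len, const char *name)
{
    struct demo_dirent de = { inode, rec_len, (uint8_t)strlen(name), 0 };

    memcpy(buf + off, &de, sizeof de);
    memcpy(buf + off + sizeof de, name, de.name_len);
}

/*
 * Search any buffer of records, whether it is a whole directory block or
 * a small inline area. Returns the offset of the match, or -1.
 */
long demo_search_dir(const unsigned char *buf, size_t buf_size,
                     const char *name, size_t name_len)
{
    size_t off = 0;

    while (off + sizeof(struct demo_dirent) <= buf_size) {
        struct demo_dirent de;

        memcpy(&de, buf + off, sizeof de);
        /* Reject records that are too short or run past the buffer. */
        if (de.rec_len < sizeof de + de.name_len ||
            de.rec_len > buf_size - off)
            return -1;
        if (de.inode != 0 && de.name_len == name_len &&
            memcmp(buf + off + sizeof de, name, name_len) == 0)
            return (long)off;
        off += de.rec_len;
    }
    return -1;
}
```

A block-based wrapper would then just pass the block's data pointer and the block size, while the inline case passes the inline area and its length.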
     
  • For "." and "..", we just call filldir by ourselves
    instead of iterating the real dir entry.

    Signed-off-by: Tao Ma
    Signed-off-by: "Theodore Ts'o"

    Tao Ma
     
  • This patch lets add_dir_entry handle the inline data case. So the
    dir is initialized as an inline dir first and then we can try to add
    some files to it. When the inline space can't hold all the entries,
    a dir block will be created and the dir entries will be moved to it.

    Also for an inlined dir, "." and ".." are removed and we only use
    4 bytes to store the parent inode number. These 2 entries will be
    added when we convert an inline dir to a block-based one.

    [ Folded in patch from Dan Carpenter to remove an unused variable. ]

    Signed-off-by: Tao Ma
    Signed-off-by: Dan Carpenter
    Signed-off-by: "Theodore Ts'o"

    Tao Ma
     
  • The old add_dirent_to_buf handles all the work of adding a dir entry
    to a dir block. Now that we have inline data, create 2 new functions,
    __ext4_find_dest_de and __ext4_insert_dentry, that do the real work
    and let add_dirent_to_buf call them.

    Signed-off-by: Tao Ma
    Signed-off-by: "Theodore Ts'o"

    Tao Ma
     
  • The __ext4_check_dir_entry() function is used to check whether the
    de is over the block boundary. Now with inline data, it could be
    within the block boundary while exceeding the inode size. So change
    this function to check the overflow more precisely.

    Signed-off-by: Tao Ma
    Signed-off-by: "Theodore Ts'o"

    Tao Ma
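
The more precise check amounts to comparing an entry against the size of the buffer it actually lives in, instead of the block size. A minimal sketch, with a hypothetical helper name and plain size_t arguments rather than the kernel's types:

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * True if a record starting at offset and spanning rec_len bytes lies
 * entirely within a buffer of buf_size bytes. Written to avoid overflow
 * in offset + rec_len.
 */
bool demo_entry_fits(size_t offset, size_t rec_len, size_t buf_size)
{
    return offset <= buf_size && rec_len <= buf_size - offset;
}
```

For example, a 12-byte entry at offset 56 fits in a 4096-byte block but not in a 60-byte inline area, which is exactly the case a block-size check would miss.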
     
  • Currently, the initialization of dot and dotdot is encapsulated in
    ext4_mkdir and also bound to dir_block. So create a new function
    named ext4_init_new_dir and move the initialization to
    ext4_init_dot_dotdot. It will now be called either in the normal
    non-inline case (rec_len of ".." will cover the whole block) or when we
    are converting an inline dir to a block (rec_len of ".." will be the
    real length). The start of the next entry is also returned for inline
    dir usage.

    Signed-off-by: Tao Ma
    Signed-off-by: "Theodore Ts'o"

    Tao Ma
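
The two rec_len cases for ".." can be illustrated with a small userspace sketch. The record layout (struct demo_dirent), the function names, and the dotdot_real_len parameter are all assumptions made for this example; fields are written in host byte order and the file type is left unset.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified record header (8 bytes); the name follows it. */
struct demo_dirent {
    uint32_t inode;
    uint16_t rec_len;    /* distance to the next record, in bytes */
    uint8_t  name_len;
    uint8_t  file_type;
};

/* Header plus name, rounded up to a multiple of 4 bytes. */
static size_t demo_rec_len(size_t name_len)
{
    return (sizeof(struct demo_dirent) + name_len + 3) & ~(size_t)3;
}

/*
 * Write "." and ".." at the start of buf and return the offset at which
 * the next entry would start. Assumes buf_size is at least 24 and small
 * enough for rec_len to fit in 16 bits.
 */
size_t demo_init_dot_dotdot(unsigned char *buf, size_t buf_size,
                            uint32_t self_ino, uint32_t parent_ino,
                            int dotdot_real_len)
{
    struct demo_dirent de;
    size_t off = 0;

    de.inode = self_ino;
    de.rec_len = (uint16_t)demo_rec_len(1);
    de.name_len = 1;
    de.file_type = 0;
    memcpy(buf + off, &de, sizeof de);
    memcpy(buf + off + sizeof de, ".", 1);
    off += de.rec_len;

    de.inode = parent_ino;
    de.name_len = 2;
    /* Either the real length, or everything left in the buffer. */
    de.rec_len = (uint16_t)(dotdot_real_len ? demo_rec_len(2)
                                            : buf_size - off);
    memcpy(buf + off, &de, sizeof de);
    memcpy(buf + off + sizeof de, "..", 2);
    return off + de.rec_len;
}
```

With the real length, the returned offset points just past ".." so that converted inline entries can be appended there; when ".." covers the whole buffer, the returned offset equals buf_size.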
     
  • For delayed allocation mode, we write to inline data if the file
    is small enough. In case we write to some offset larger than the
    inline size, the 1st page is dirtied so that ext4_da_writepages can
    handle the conversion. When the 1st page is initialized with blocks,
    the inline part is removed.

    Signed-off-by: Tao Ma
    Signed-off-by: "Theodore Ts'o"

    Tao Ma
     
  • Signed-off-by: Tao Ma
    Signed-off-by: "Theodore Ts'o"

    Tao Ma
     
  • For a normal write case (not journalled write, not delayed
    allocation), we write to the inline data if the file is small and
    convert it to an extent-based file when the write is larger than the
    max inline size.

    Signed-off-by: Tao Ma
    Signed-off-by: "Theodore Ts'o"

    Tao Ma
     
  • Let readpage and readpages handle the case when we want to read an
    inlined file.

    Signed-off-by: Tao Ma
    Signed-off-by: "Theodore Ts'o"

    Tao Ma
     
  • Implement inline data with xattr.

    Now we use the "system.data" xattr to store the data. The xattr will
    be extended if the i_size is increased, but we don't release the
    space during truncate.

    Signed-off-by: Tao Ma
    Signed-off-by: "Theodore Ts'o"

    Tao Ma
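
Because the data is kept in an in-inode xattr, the room available depends on how much of the inode is left after its fixed fields. The arithmetic can be sketched roughly as below, assuming the xattr area begins after the 128-byte base inode plus any extra fields. The function name and the extra_isize parameter are illustrative, and xattr headers reduce the usable space further, so the result is only an upper bound.

```c
#include <stddef.h>

/* Size of the original fixed inode fields. */
#define DEMO_GOOD_OLD_INODE_SIZE 128

/*
 * Upper bound on the bytes left for in-inode xattrs. extra_isize is the
 * number of bytes of extra fixed fields stored after the first 128 bytes.
 */
size_t demo_in_inode_xattr_room(size_t inode_size, size_t extra_isize)
{
    if (inode_size <= DEMO_GOOD_OLD_INODE_SIZE + extra_isize)
        return 0;    /* e.g. 128-byte inodes: no room at all */
    return inode_size - DEMO_GOOD_OLD_INODE_SIZE - extra_isize;
}
```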
     

05 Dec, 2012

1 commit


03 Dec, 2012

1 commit

  • Currently, in ext4_iget we do a simple check to see whether
    there is any information starting from the end
    of i_extra_size. With inline data added, this procedure
    is more complicated. So move it to a new function named
    ext4_iget_extra_inode.

    Signed-off-by: Tao Ma
    Signed-off-by: "Theodore Ts'o"

    Tao Ma
     

30 Nov, 2012

2 commits

  • Commit fa77dcfafeaa introduces block bitmap checksum calculation into
    ext4_new_inode() in the case that block group was uninitialized.
    However we brelse() the bitmap buffer before we attempt to checksum it
    so we have no guarantee that the buffer is still there.

    Fix this by releasing the buffer after the possible checksum
    computation.

    Signed-off-by: Lukas Czerner
    Signed-off-by: "Theodore Ts'o"
    Acked-by: Darrick J. Wong
    Cc: stable@vger.kernel.org

    Theodore Ts'o
     
  • Remove a level of indentation by moving the DIO read and extending
    write case to the beginning of the file. This results in no actual
    programmatic changes to the file, but makes it easier to
    read/understand.

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     

29 Nov, 2012

4 commits

  • Previously, ext4_extents.h was being included at the end of ext4.h,
    which was bad for a number of reasons: (a) it was not being included
    in the expected place, and (b) it caused the header to be included
    multiple times. There were #ifdef's to prevent this from causing any
    problems, but it still was unnecessary.

    By moving the function declarations that were in ext4_extents.h to
    ext4.h, which is where the rest of ext4's function declarations can
    be found, we can stop including ext4_extents.h in ext4.h at all, and
    include ext4_extents.h only where it is needed in ext4's source
    files.

    It should be possible to move a few more things into ext4.h, and
    further reduce the number of source files that need to #include
    ext4_extents.h, but that's a cleanup for another day.

    Reported-by: Sachin Kamat
    Reported-by: Wei Yongjun
    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     
  • The memset operation before the check can cause a BUG if the memory
    allocation failed. Since we are using get_zeroed_page, there is no
    need to use memset anyway.

    Found by the Spruce system in cooperation with the KEDR Framework.

    Signed-off-by: Vahram Martirosyan
    Signed-off-by: "Theodore Ts'o"

    Vahram Martirosyan
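
The same pattern can be shown in userspace, with calloc() standing in for a zeroing allocator. This is an analogy only; demo_alloc_zeroed is a hypothetical helper and not the code that was patched.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Allocate a zero-filled buffer. The NULL check comes before anything
 * touches the memory, and no memset() is needed because the allocator
 * already returns zeroed memory.
 */
char *demo_alloc_zeroed(size_t size)
{
    char *buf = calloc(1, size);

    if (!buf)
        return NULL;    /* a memset() before this check would be a bug */
    return buf;
}
```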
     
  • This commit is a simple cleanup of the fiemap codepath which was not
    included in the previous commit in order to make the changes clearer.
    In this commit we rename the cbex variable to newex in
    ext4_fill_fiemap_extents() because the callback is no longer present.

    Signed-off-by: Lukas Czerner
    Signed-off-by: "Theodore Ts'o"

    Lukas Czerner
     
  • Currently ext4_ext_walk_space() only takes i_data_sem for read when
    searching for the extent at given block with ext4_ext_find_extent().
    Then it drops the lock and the extent tree can be changed at will.
    However later on we're searching for the 'next' extent, but the extent
    tree might already have changed, so the information might not be
    accurate.

    In fact we can hit BUG_ON(end [...]
    [...] ext4_fill_fiemap_extents
    ext4_ext_fiemap_cb -> ext4_find_delayed_extent
    3. move fiemap_fill_next_extent() into ext4_fill_fiemap_extents()
    4. hold the i_data_sem for:
       ext4_ext_find_extent()
       ext4_ext_next_allocated_block()
       ext4_find_delayed_extent()
    5. call fiemap_fill_next_extent after releasing the i_data_sem
    6. move path reinitialization into the critical section.

    Signed-off-by: Lukas Czerner
    Signed-off-by: "Theodore Ts'o"

    Lukas Czerner
     

19 Nov, 2012

1 commit