09 Jan, 2012

1 commit

  • Both ext3 and ext4 put the half-created symlink inode into the orphan list
    for a while (see the comment in ext[34]_symlink() for gory details). Then,
    if everything went fine, they pull it out of the orphan list and bump the
    link count back to 1. The thing is, inc_nlink() is going to complain about
    seeing somebody changing i_nlink from 0 to 1, and with good reason, since
    normally something like that is a bug. Explicit set_nlink(inode, 1) does
    the same thing as inc_nlink() here, but it does *not* complain - exactly
    because it should be usable in strange situations like this one.

    Signed-off-by: Al Viro

    Al Viro
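
The distinction drawn in this message can be sketched in userspace C. This is only a toy model: `toy_inode`, `toy_inc_nlink()`, `toy_set_nlink()` and the `nlink_warnings` counter are hypothetical stand-ins and not the kernel's definitions. It shows an increment helper that complains about a 0 to 1 transition next to a plain setter that does not.

```c
#include <stdio.h>

/* Toy stand-in for the kernel's struct inode; only the link count is modeled. */
struct toy_inode {
    unsigned int i_nlink;
};

/* Hypothetical counter: how many times the "0 -> 1" check fired. */
static int nlink_warnings;

/* Models the idea behind inc_nlink(): resurrecting a zero link count is suspicious. */
static void toy_inc_nlink(struct toy_inode *inode)
{
    if (inode->i_nlink == 0) {
        nlink_warnings++;
        fprintf(stderr, "warning: i_nlink changing from 0 to 1\n");
    }
    inode->i_nlink++;
}

/* Models the idea behind set_nlink(): an explicit assignment that never complains. */
static void toy_set_nlink(struct toy_inode *inode, unsigned int nlink)
{
    inode->i_nlink = nlink;
}
```

Both helpers leave the link count at 1 when starting from 0; only the increment variant records a warning, which is why the explicit setter suits the orphan-list case described above.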
     

04 Jan, 2012

3 commits


03 Nov, 2011

2 commits

  • * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/vfs-queue:
    vfs: add d_prune dentry operation
    vfs: protect i_nlink
    filesystems: add set_nlink()
    filesystems: add missing nlink wrappers
    logfs: remove unnecessary nlink setting
    ocfs2: remove unnecessary nlink setting
    jfs: remove unnecessary nlink setting
    hypfs: remove unnecessary nlink setting
    vfs: ignore error on forced remount
    readlinkat: ensure we return ENOENT for the empty pathname for normal lookups
    vfs: fix dentry leak in simple_fill_super()

    Linus Torvalds
     
  • * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (97 commits)
    jbd2: Unify log messages in jbd2 code
    jbd/jbd2: validate sb->s_first in journal_get_superblock()
    ext4: let ext4_ext_rm_leaf work with EXT_DEBUG defined
    ext4: fix a syntax error in ext4_ext_insert_extent when debugging enabled
    ext4: fix a typo in struct ext4_allocation_context
    ext4: Don't normalize an falloc request if it can fit in 1 extent.
    ext4: remove comments about extent mount option in ext4_new_inode()
    ext4: let ext4_discard_partial_buffers handle unaligned range correctly
    ext4: return ENOMEM if find_or_create_pages fails
    ext4: move vars to local scope in ext4_discard_partial_page_buffers_no_lock()
    ext4: Create helper function for EXT4_IO_END_UNWRITTEN and i_aiodio_unwritten
    ext4: optimize locking for end_io extent conversion
    ext4: remove unnecessary call to waitqueue_active()
    ext4: Use correct locking for ext4_end_io_nolock()
    ext4: fix race in xattr block allocation path
    ext4: trace punch_hole correctly in ext4_ext_map_blocks
    ext4: clean up AGGRESSIVE_TEST code
    ext4: move variables to their scope
    ext4: fix quota accounting during migration
    ext4: migrate cleanup
    ...

    Linus Torvalds
     

02 Nov, 2011

2 commits


29 Oct, 2011

1 commit


26 Oct, 2011

1 commit

  • If a directory has more than EXT4_LINK_MAX subdirectories, the nlink
    count is set to 1. Subsequently, if any subdirectories are deleted,
    ext4_dec_count() decrements the i_nlink count, which may go to 0
    temporarily before being incremented back to 1.

    While this is done under i_mutex, which prevents races for directory
    and inode operations that check i_nlink, the temporary i_nlink == 0
    case is exposed to userspace via stat() and similar calls that do not
    hold i_mutex.

    Instead, change the code to not decrement i_nlink count for any
    directories that do not already have i_nlink larger than 2.

    Reported-by: Cliff White
    Reviewed-by: Johann Lombardi
    Signed-off-by: Andreas Dilger
    Signed-off-by: "Theodore Ts'o"

    Andreas Dilger
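
The rule described above (leave a directory's count alone unless it is already larger than 2) can be sketched as follows. This is a simplified userspace model under stated assumptions: `toy_inode`, its `is_dir` flag and `toy_dec_count()` are hypothetical names, and journaling and locking are omitted entirely.

```c
#include <stdbool.h>

/* Toy inode: just enough state to express the rule. */
struct toy_inode {
    bool is_dir;
    unsigned int i_nlink;
};

/*
 * Decrement the link count, except for directories whose count is not
 * larger than 2. A directory that overflowed the link limit is pinned at
 * nlink == 1, so skipping the decrement means userspace never observes 0.
 */
static void toy_dec_count(struct toy_inode *inode)
{
    if (!inode->is_dir || inode->i_nlink > 2)
        inode->i_nlink--;
}
```

With this guard there is no window in which the count dips to 0 and is then restored, so a concurrent stat() that does not hold the lock cannot see the intermediate value.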
     

01 Sep, 2011

2 commits

  • ext4_dx_add_entry manipulates bh2 and frames[0].bh, which are two buffer_heads
    that point to directory blocks assigned to the directory inode. However, the
    function calls ext4_handle_dirty_metadata with the inode of the file that's
    being added to the directory, not the directory inode itself. Therefore,
    correct the code to dirty the directory buffers with the directory inode, not
    the file inode.

    Signed-off-by: Darrick J. Wong
    Signed-off-by: "Theodore Ts'o"
    Cc: stable@kernel.org

    Theodore Ts'o
     
  • ext4_mkdir calls ext4_handle_dirty_metadata with dir_block and the inode "dir".
    Unfortunately, dir_block belongs to the newly created directory (which is
    "inode"), not the parent directory (which is "dir"). Fix the incorrect
    association.

    Signed-off-by: Darrick J. Wong
    Signed-off-by: "Theodore Ts'o"
    Cc: stable@kernel.org

    Darrick J. Wong
     

31 Aug, 2011

1 commit

  • When ext4_rename performs a directory rename (move), dir_bh is a
    buffer that is modified to update the '..' link in the directory being
    moved (old_inode). However, ext4_handle_dirty_metadata is called with
    the old parent directory inode (old_dir) and dir_bh, which is
    incorrect because dir_bh does not belong to the parent inode. Fix
    this error.

    Signed-off-by: Darrick J. Wong
    Signed-off-by: "Theodore Ts'o"
    Cc: stable@kernel.org

    Darrick J. Wong
     

23 Aug, 2011

2 commits


12 Aug, 2011

1 commit

  • Commit df5e6223407e ("ext4: fix deadlock in ext4_symlink() in ENOSPC
    conditions") recalculated the number of credits needed for a long
    symlink, in the process of splitting it into two transactions. However,
    the first credit calculation under-counted because if selinux is
    enabled, credits are needed to create the selinux xattr as well.

    Overrunning the reservation will result in an OOPS in
    jbd2_journal_dirty_metadata() due to this assert:

    J_ASSERT_JH(jh, handle->h_buffer_credits > 0);

    Fix this by increasing the reservation size.

    Signed-off-by: Eric Sandeen
    Reviewed-by: Jan Kara
    Acked-by: "Theodore Ts'o"
    Cc: stable@kernel.org
    Signed-off-by: Linus Torvalds

    Eric Sandeen
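
To make the under-count concrete, here is a minimal sketch of a credit reservation that is consumed one buffer at a time. Everything in it is illustrative: `toy_handle`, `toy_dirty_metadata()`, `toy_create_symlink()` and the buffer counts are made-up values, not the real ext4 credit formulas, and the failure is reported as a return code where the kernel would hit the assert.

```c
/* Made-up costs: buffers dirtied by the basic operation and by the extra xattr. */
#define TOY_BASE_BUFFERS  3
#define TOY_XATTR_BUFFERS 1

struct toy_handle {
    int h_buffer_credits;
};

/* Consume one credit; return -1 where the kernel would trip the assert. */
static int toy_dirty_metadata(struct toy_handle *handle)
{
    if (handle->h_buffer_credits <= 0)
        return -1;
    handle->h_buffer_credits--;
    return 0;
}

/* Dirty every buffer the operation needs against a fixed reservation. */
static int toy_create_symlink(int reserved_credits, int needs_security_xattr)
{
    struct toy_handle handle = { reserved_credits };
    int buffers = TOY_BASE_BUFFERS +
                  (needs_security_xattr ? TOY_XATTR_BUFFERS : 0);

    for (int i = 0; i < buffers; i++) {
        if (toy_dirty_metadata(&handle) != 0)
            return -1; /* reservation overrun */
    }
    return 0;
}
```

A reservation sized only for the base case succeeds without the xattr and overruns with it; enlarging the reservation by the xattr cost makes both cases pass, which mirrors the shape of the fix.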
     

02 Aug, 2011

1 commit

  • * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (60 commits)
    ext4: prevent memory leaks from ext4_mb_init_backend() on error path
    ext4: use EXT4_BAD_INO for buddy cache to avoid colliding with valid inode #
    ext4: use ext4_msg() instead of printk in mballoc
    ext4: use ext4_kvzalloc()/ext4_kvmalloc() for s_group_desc and s_group_info
    ext4: introduce ext4_kvmalloc(), ext4_kzalloc(), and ext4_kvfree()
    ext4: use the correct error exit path in ext4_init_inode_table()
    ext4: add missing kfree() on error return path in add_new_gdb()
    ext4: change umode_t in tracepoint headers to be an explicit __u16
    ext4: fix races in ext4_sync_parent()
    ext4: Fix overflow caused by missing cast in ext4_fallocate()
    ext4: add action of moving index in ext4_ext_rm_idx for Punch Hole
    ext4: simplify parameters of reserve_backup_gdb()
    ext4: simplify parameters of add_new_gdb()
    ext4: remove lock_buffer in bclean() and setup_new_group_blocks()
    ext4: simplify journal handling in setup_new_group_blocks()
    ext4: let setup_new_group_blocks() set multiple bits at a time
    ext4: fix a typo in ext4_group_extend()
    ext4: let ext4_group_add_blocks() handle 0 blocks quickly
    ext4: let ext4_group_add_blocks() return an error code
    ext4: rename ext4_add_groupblocks() to ext4_group_add_blocks()
    ...

    Fix up conflict in fs/ext4/inode.c: commit aacfc19c626e ("fs: simplify
    the blockdev_direct_IO prototype") had changed the ext4_ind_direct_IO()
    function for the new simplified calling convention, while commit
    dae1e52cb126 ("ext4: move ext4_ind_* functions from inode.c to
    indirect.c") moved the function to another file.

    Linus Torvalds
     

26 Jul, 2011

1 commit

  • Replace the ->check_acl method with a ->get_acl method that simply reads an
    ACL from disk after having a cache miss. This means we can replace the ACL
    checking boilerplate code with a single implementation in namei.c.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
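
The design change can be illustrated with a small sketch in which the filesystem supplies only a "read the ACL" hook and one shared routine does the caching and checking. The names `toy_acl`, `toy_inode`, `toy_check_acl()` and `demo_get_acl()` are hypothetical, as are the permission mask and return codes; the real VFS interfaces differ.

```c
#include <stddef.h>

struct toy_acl {
    unsigned int allowed_mask; /* made-up permission bits */
};

struct toy_inode {
    struct toy_acl *cached_acl;                     /* NULL: nothing cached yet */
    struct toy_acl *(*get_acl)(struct toy_inode *); /* filesystem hook: read from disk */
    int disk_reads;                                 /* demo counter */
};

/* Single generic checker: consult the cache, call the hook only on a miss. */
static int toy_check_acl(struct toy_inode *inode, unsigned int want)
{
    struct toy_acl *acl = inode->cached_acl;

    if (acl == NULL && inode->get_acl != NULL) {
        acl = inode->get_acl(inode);
        inode->cached_acl = acl;
    }
    if (acl == NULL)
        return -1; /* no ACL available: caller falls back to mode bits */
    return (acl->allowed_mask & want) == want ? 0 : -2;
}

/* Example hook a filesystem might provide in this toy model. */
static struct toy_acl demo_acl = { 0x4 };

static struct toy_acl *demo_get_acl(struct toy_inode *inode)
{
    inode->disk_reads++;
    return &demo_acl;
}
```

Because the hook only fetches data, the cache handling and the permission decision live in one place instead of being repeated in every filesystem.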
     

20 Jul, 2011

1 commit


17 Jul, 2011

1 commit


12 Jul, 2011

1 commit

  • The comment from Al Viro about a possible race in ext4_orphan_add() is
    not justified. No race is possible, because we always either hold
    i_mutex or the inode cannot be referenced from outside; hence the
    J_ASSERTs should not be hit for the reason described in the comment.

    This commit replaces it with a note that we are holding i_mutex, so it
    should not be possible for i_nlink to be changed while waiting for
    s_orphan_lock.

    Signed-off-by: Lukas Czerner
    Signed-off-by: "Theodore Ts'o"

    Lukas Czerner
     

15 May, 2011

1 commit


03 May, 2011

2 commits

  • ext4_symlink() cannot call __page_symlink() with a transaction open.
    __page_symlink() calls ext4_write_begin(), which can wait for a
    transaction commit if we are running out of space, thus causing a
    deadlock. Also, error recovery in ext4_truncate_failed_write() does not
    account for the transaction already being started (although I'm not
    aware of any particular deadlock here).

    Fix the problem by stopping the transaction before calling
    __page_symlink() (we have to be careful and put the inode on the orphan
    list so that it gets deleted in case of a crash) and starting another one
    after __page_symlink() returns, to add the symlink to the
    directory.

    Signed-off-by: Jan Kara
    Signed-off-by: "Theodore Ts'o"

    Jan Kara
     
  • When make_indexed_dir() fails (e.g. because of ENOSPC) after it has
    allocated a block for the index tree root, we did not properly mark all
    changed buffers dirty. This led to only some of these buffers being
    written out, effectively corrupting the directory.

    Fix the issue by marking all changed data dirty even in the
    failure case.

    Signed-off-by: Jan Kara
    Signed-off-by: "Theodore Ts'o"

    Jan Kara
     

26 Mar, 2011

1 commit

  • * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (43 commits)
    ext4: fix a BUG in mb_mark_used during trim.
    ext4: unused variables cleanup in fs/ext4/extents.c
    ext4: remove redundant set_buffer_mapped() in ext4_da_get_block_prep()
    ext4: add more tracepoints and use dev_t in the trace buffer
    ext4: don't kfree uninitialized s_group_info members
    ext4: add missing space in printk's in __ext4_grp_locked_error()
    ext4: add FITRIM to compat_ioctl.
    ext4: handle errors in ext4_clear_blocks()
    ext4: unify the ext4_handle_release_buffer() api
    ext4: handle errors in ext4_rename
    jbd2: add COW fields to struct jbd2_journal_handle
    jbd2: add the b_cow_tid field to journal_head struct
    ext4: Initialize fsync transaction ids in ext4_new_inode()
    ext4: Use single thread to perform DIO unwritten convertion
    ext4: optimize ext4_bio_write_page() when no extent conversion is needed
    ext4: skip orphan cleanup if fs has unknown ROCOMPAT features
    ext4: use the nblocks arg to ext4_truncate_restart_trans()
    ext4: fix missing iput of root inode for some mount error paths
    ext4: make FIEMAP and delayed allocation play well together
    ext4: suppress verbose debugging information if malloc-debug is off
    ...

    Fix up conflicts in fs/ext4/super.c due to workqueue changes

    Linus Torvalds
     

22 Mar, 2011

1 commit

  • - Add more ext4 tracepoints.
    - Change ext4 tracepoints to use dev_t field with MAJOR/MINOR macros
    so that we can save 4 bytes in the ring buffer on some platforms.
    - Add sync_mode to ext4_da_writepages, ext4_da_write_pages, and
    ext4_da_writepages_result tracepoints. Also remove for_reclaim
    field from ext4_da_writepages since it is usually not very useful.

    Signed-off-by: Jiaying Zhang
    Signed-off-by: "Theodore Ts'o"

    Jiaying Zhang
     

21 Mar, 2011

1 commit

  • Checking return code from ext4_journal_get_write_access() is important
    with snapshots, because this function invokes COW, so may return new
    errors, such as ENOSPC.

    We move the call to ext4_journal_get_write_access() earlier in the
    function, to simplify error handling in the case that this function
    returns an error.

    Signed-off-by: Amir Goldstein
    Signed-off-by: "Theodore Ts'o"

    Amir Goldstein
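
The benefit of making the fallible call first can be sketched as below. This is a toy model: `toy_get_write_access()`, `toy_update_entry()` and the `simulate_enospc` switch are hypothetical, and the real function takes a journal handle and a buffer head.

```c
#include <errno.h>

/* Stand-in for a write-access request that may fail, e.g. with ENOSPC. */
static int toy_get_write_access(int simulate_enospc)
{
    return simulate_enospc ? -ENOSPC : 0;
}

struct toy_entry {
    int modified;
};

/*
 * Ask for write access before touching anything. If the request fails,
 * no state has been changed yet, so the error path has nothing to undo.
 */
static int toy_update_entry(struct toy_entry *entry, int simulate_enospc)
{
    int err = toy_get_write_access(simulate_enospc);

    if (err)
        return err;

    entry->modified = 1;
    return 0;
}
```

Had the modification come first, the error path would need to roll it back; ordering the call early keeps the failure case trivial.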
     

15 Mar, 2011

1 commit


11 Jan, 2011

3 commits


20 Dec, 2010

2 commits


15 Dec, 2010

1 commit


28 Oct, 2010

3 commits

  • Conflicts:
    fs/ext4/inode.c
    fs/ext4/mballoc.c
    include/trace/events/ext4.h

    Theodore Ts'o
     
  • Use the search_dirblock() in ext4_dx_find_entry(). It makes the code
    easier to read, and it takes advantage of common code. It also saves
    100 bytes or so of text space.

    Signed-off-by: "Theodore Ts'o"
    Cc: Brad Spengler

    Theodore Ts'o
     
  • If the first block of an htree directory is missing '.' or '..' but is
    otherwise a valid directory, and we do a lookup for '.' or '..', it's
    possible to dereference an uninitialized memory pointer in
    ext4_htree_next_block().

    We avoid this by moving the special case from ext4_dx_find_entry() to
    ext4_find_entry(); this also means we can optimize ext4_find_entry()
    slightly when NFS looks up "..".

    Thanks to Brad Spengler for pointing out a Clang warning that led me to
    look more closely at this code. The warning was harmless, but it was
    useful in pointing out code that was too ugly to live. This warning was
    also reported by Roman Borisov.

    Signed-off-by: "Theodore Ts'o"
    Cc: Brad Spengler

    Theodore Ts'o
     

26 Oct, 2010

1 commit


05 Aug, 2010

1 commit

  • commit 3d0518f4, "ext4: New rec_len encoding for very
    large blocksizes" made several changes to this path, but from
    a perf perspective, un-inlining ext4_rec_len_from_disk() seems
    most significant. This function is called from ext4_check_dir_entry(),
    which on a file-creation workload is called extremely often.

    I tested this with bonnie:

    # bonnie++ -u root -s 0 -f -x 200 -d /mnt/test -n 32

    (this does 200 iterations) and got this for the file creations:

    ext4 stock: Average = 21206.8 files/s
    ext4 inlined: Average = 22346.7 files/s (+5%)

    Signed-off-by: Eric Sandeen
    Signed-off-by: "Theodore Ts'o"

    Eric Sandeen
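
For readers unfamiliar with the helper being inlined, the following userspace sketch shows the kind of decoding it performs. It is an approximation under stated assumptions: the name `toy_rec_len_from_disk()`, the `TOY_MAX_REC_LEN` constant and the runtime `large_blocks` flag are hypothetical (the kernel is understood to select the path at compile time from the page size), and the on-disk little-endian conversion is omitted.

```c
#include <stdint.h>

#define TOY_MAX_REC_LEN ((1 << 16) - 1)

/*
 * Decode a 16-bit on-disk record length. For ordinary block sizes the
 * value is used as is. For very large blocks, 0 and the maximum value
 * both mean "the whole block", and the low two bits carry bits 16-17 of
 * the real length.
 */
static inline unsigned int toy_rec_len_from_disk(uint16_t len,
                                                 unsigned int blocksize,
                                                 int large_blocks)
{
    if (!large_blocks)
        return len;
    if (len == TOY_MAX_REC_LEN || len == 0)
        return blocksize;
    return (len & 65532u) | ((unsigned int)(len & 3u) << 16);
}
```

A function this small is a natural candidate for `static inline`: when it is called for every directory entry checked, the call overhead of an out-of-line version becomes measurable, which is consistent with the benchmark figures quoted above.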
     

27 Jul, 2010

1 commit