17 Aug, 2012

4 commits

  • Pull ext4 bug fixes from Ted Ts'o:
    "The following are all bug fixes and regressions. The most notable are
    the ones which cause problems for ext4 on RAID --- a performance
    problem when mounting very large filesystems, and a kernel OOPS when
    doing an rm -rf on large directory hierarchies on fast devices."

    * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
    ext4: fix kernel BUG on large-scale rm -rf commands
    ext4: fix long mount times on very big file systems
    ext4: don't call ext4_error while block group is locked
    ext4: avoid kmemcheck complaint from reading uninitialized memory
    ext4: make sure the journal sb is written in ext4_clear_journal_err()

    Linus Torvalds
     
  • Commit 968dee7722: "ext4: fix hole punch failure when depth is greater
    than 0" introduced a regression in v3.5.1/v3.6-rc1 which caused kernel
    crashes when users ran "rm -rf" on a large directory hierarchy on
    ext4 filesystems on RAID devices:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000028

    Process rm (pid: 18229, threadinfo ffff8801276bc000, task ffff880123631710)
    Call Trace:
    [] ? __ext4_handle_dirty_metadata+0x83/0x110
    [] ext4_ext_truncate+0x193/0x1d0
    [] ? ext4_mark_inode_dirty+0x7f/0x1f0
    [] ext4_truncate+0xf5/0x100
    [] ext4_evict_inode+0x461/0x490
    [] evict+0xa2/0x1a0
    [] iput+0x103/0x1f0
    [] do_unlinkat+0x154/0x1c0
    [] ? sys_newfstatat+0x2a/0x40
    [] sys_unlinkat+0x1b/0x50
    [] system_call_fastpath+0x16/0x1b
    Code: 8b 4d 20 0f b7 41 02 48 8d 04 40 48 8d 04 81 49 89 45 18 0f b7 49 02 48 83 c1 01 49 89 4d 00 e9 ae f8 ff ff 0f 1f 00 49 8b 45 28 8b 40 28 49 89 45 20 e9 85 f8 ff ff 0f 1f 80 00 00 00

    RIP [] ext4_ext_remove_space+0xa34/0xdf0

    This could be reproduced as follows:

    The problem in commit 968dee7722 was that it caused the variable 'i' to
    be left uninitialized if the truncate required more space than was
    available in the journal. This resulted in the function
    ext4_ext_truncate_extend_restart() returning -EAGAIN, which caused
    ext4_ext_remove_space() to restart the truncate operation after
    starting a new jbd2 handle.

    Reported-by: Maciej Żenczykowski
    Reported-by: Marti Raudsepp
    Tested-by: Fengguang Wu
    Signed-off-by: "Theodore Ts'o"
    Cc: stable@vger.kernel.org

    Theodore Ts'o
     
  • Commit 8aeb00ff85a: "ext4: fix overhead calculation used by
    ext4_statfs()" introduced an O(n**2) calculation which makes very large
    file systems take forever to mount. Fix this with an optimization for
    non-bigalloc file systems. (For bigalloc file systems the overhead
    needs to be set in the superblock.)

    Signed-off-by: "Theodore Ts'o"
    Cc: stable@vger.kernel.org

    Theodore Ts'o
     
  • While in ext4_validate_block_bitmap(), if a block allocation bitmap
    is found to be invalid, we call ext4_error() while the block group is
    still locked. This causes ext4_commit_super() to call a function
    which might sleep while in an atomic context.

    There's no need to keep the block group locked at this point, so hoist
    the ext4_error() call up to ext4_validate_block_bitmap() and release
    the block group spinlock before calling ext4_error().
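
A minimal userspace sketch of the pattern described above: decide under the lock, report only after dropping it. The struct group type, the toy lock helpers, and report_error() are hypothetical stand-ins for illustration, not the ext4 code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a block group guarded by a lock. */
struct group {
    bool locked;                  /* toy lock state, single-threaded only */
    unsigned int bitmap_checksum; /* toy validity marker */
    int errors_reported;
};

static void group_lock(struct group *g)
{
    assert(!g->locked);
    g->locked = true;
}

static void group_unlock(struct group *g)
{
    assert(g->locked);
    g->locked = false;
}

/* Stand-in for a reporting routine that may block (logging, I/O). */
static void report_error(struct group *g, const char *msg)
{
    assert(!g->locked); /* must never run with the lock held */
    fprintf(stderr, "error: %s\n", msg);
    g->errors_reported++;
}

/* Check validity under the lock, then report after releasing it. */
static bool validate_group(struct group *g, unsigned int expected)
{
    bool valid;

    group_lock(g);
    valid = (g->bitmap_checksum == expected);
    group_unlock(g);

    if (!valid)
        report_error(g, "invalid block bitmap");
    return valid;
}
```

The assert in report_error() plays the role of the kernel's "sleeping function called from atomic context" check: calling it between group_lock() and group_unlock() would trip it.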

    The reported stack trace can be found at:

    http://article.gmane.org/gmane.comp.file-systems.ext4/33731

    Reported-by: Dave Jones
    Signed-off-by: "Theodore Ts'o"
    Cc: stable@vger.kernel.org

    Theodore Ts'o
     

06 Aug, 2012

2 commits

  • Commit 03179fe923 introduced a kmemcheck complaint in
    ext4_da_get_block_prep() because we save and restore
    ei->i_da_metadata_calc_last_lblock even though it is left
    uninitialized in the case where i_da_metadata_calc_len is zero.

    This doesn't hurt anything, but silencing the kmemcheck complaint
    makes it easier for people to find real bugs.

    Addresses https://bugzilla.kernel.org/show_bug.cgi?id=45631
    (which is marked as a regression).

    Signed-off-by: "Theodore Ts'o"
    Cc: stable@vger.kernel.org

    Theodore Ts'o
     
  • After we set the EXT4_ERROR_FS bit in the file system
    superblock, it's not enough to call jbd2_journal_clear_err() to clear
    the error indication from journal superblock --- we need to call
    jbd2_journal_update_sb_errno() as well. Otherwise, when the root file
    system is mounted read-only, the journal is replayed, and the error
    indicator is transferred to the superblock --- but the s_errno field
    in the jbd2 superblock is left set (since although we cleared it in
    memory, we never flushed it out to disk).

    This can end up confusing e2fsck. We should make e2fsck more robust
    in this case, but the kernel shouldn't be leaving things in this
    confused state, either.

    Signed-off-by: "Theodore Ts'o"
    Cc: stable@kernel.org

    Theodore Ts'o
     

04 Aug, 2012

2 commits


02 Aug, 2012

1 commit

  • Pull second vfs pile from Al Viro:
    "The stuff in there: fsfreeze deadlock fixes by Jan (essentially, the
    deadlock reproduced by xfstests 068), symlink and hardlink restriction
    patches, plus assorted cleanups and fixes.

    Note that another fsfreeze deadlock (emergency thaw one) is *not*
    dealt with - the series by Fernando conflicts a lot with Jan's, breaks
    userland ABI (FIFREEZE semantics gets changed) and trades the deadlock
    for massive vfsmount leak; this is going to be handled next cycle.
    There probably will be another pull request, but that stuff won't be
    in it."

    Fix up trivial conflicts due to unrelated changes next to each other in
    drivers/{staging/gdm72xx/usb_boot.c, usb/gadget/storage_common.c}

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (54 commits)
    delousing target_core_file a bit
    Documentation: Correct s_umount state for freeze_fs/unfreeze_fs
    fs: Remove old freezing mechanism
    ext2: Implement freezing
    btrfs: Convert to new freezing mechanism
    nilfs2: Convert to new freezing mechanism
    ntfs: Convert to new freezing mechanism
    fuse: Convert to new freezing mechanism
    gfs2: Convert to new freezing mechanism
    ocfs2: Convert to new freezing mechanism
    xfs: Convert to new freezing code
    ext4: Convert to new freezing mechanism
    fs: Protect write paths by sb_start_write - sb_end_write
    fs: Skip atime update on frozen filesystem
    fs: Add freezing handling to mnt_want_write() / mnt_drop_write()
    fs: Improve filesystem freezing handling
    switch the protection of percpu_counter list to spinlock
    nfsd: Push mnt_want_write() outside of i_mutex
    btrfs: Push mnt_want_write() outside of i_mutex
    fat: Push mnt_want_write() outside of i_mutex
    ...

    Linus Torvalds
     

31 Jul, 2012

2 commits

  • We remove most of the frozen checks since the upper layer takes care of blocking all
    writes. We have to handle protection in ext4_page_mkwrite() in a special way
    because we cannot use generic block_page_mkwrite(). Also we add a freeze
    protection to ext4_evict_inode() so that iput() of unlinked inode cannot modify
    a frozen filesystem (we cannot easily instrument ext4_journal_start() /
    ext4_journal_stop() with freeze protection because we are missing the
    superblock pointer in ext4_journal_stop() in nojournal mode).

    CC: linux-ext4@vger.kernel.org
    CC: "Theodore Ts'o"
    BugLink: https://bugs.launchpad.net/bugs/897421
    Tested-by: Kamal Mostafa
    Tested-by: Peter M. Petrakis
    Tested-by: Dann Frazier
    Tested-by: Massimo Morana
    Acked-by: "Theodore Ts'o"
    Signed-off-by: Jan Kara
    Signed-off-by: Al Viro

    Jan Kara
     
  • Convert ext4_count_free() to use memweight() instead of the table-lookup
    based implementation for counting clear bits. This change only affects the
    code segments enabled by EXT4FS_DEBUG.

    Note that this memweight() call can't be replaced with a single
    bitmap_weight() call, although the pointer to the memory area is aligned
    to a long-word boundary. Because the size of the memory area may not be a
    multiple of BITS_PER_LONG, bitmap_weight() would return a wrong value on
    big-endian architectures.
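
As a rough userspace illustration of why byte-wise counting is safe for any buffer length, the sketch below counts set bits one byte at a time and derives the number of clear bits from that. The names mem_weight() and count_free() are illustrative only and are not the kernel implementations.

```c
#include <assert.h>
#include <stddef.h>

/* Count set bits in a byte buffer, one byte at a time. Because it never
 * reads past nbytes, the length need not be a multiple of the word size. */
static size_t mem_weight(const unsigned char *buf, size_t nbytes)
{
    size_t bits = 0;

    for (size_t i = 0; i < nbytes; i++) {
        unsigned char b = buf[i];

        while (b) {
            b &= (unsigned char)(b - 1); /* clear the lowest set bit */
            bits++;
        }
    }
    return bits;
}

/* Clear (free) bits = total bits in the bitmap - set bits. */
static size_t count_free(const unsigned char *bitmap, size_t nbytes)
{
    assert(bitmap != NULL);
    return nbytes * 8 - mem_weight(bitmap, nbytes);
}
```

A word-based counter would have to mask the unused part of the final word, and which bytes that mask covers depends on endianness; the byte-wise loop avoids the question entirely.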

    This also includes the following change.

    - Remove unnecessary map == NULL check in ext4_count_free() which
    always takes non-null pointer as the memory area.

    Signed-off-by: Akinobu Mita
    Cc: "Theodore Ts'o"
    Cc: Andreas Dilger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
     

28 Jul, 2012

1 commit

  • Pull ext4 updates from Ted Ts'o:
    "The usual collection of bug fixes and optimizations. Perhaps of
    greatest note is a speed up for parallel, non-allocating DIO writes,
    since we no longer take the i_mutex lock in that case.

    For bug fixes, we fix an incorrect overhead calculation which caused
    slightly incorrect results for df(1) and statfs(2). We also fixed
    bugs in the metadata checksum feature."

    * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (23 commits)
    ext4: undo ext4_calc_metadata_amount if we fail to claim space
    ext4: don't let i_reserved_meta_blocks go negative
    ext4: fix hole punch failure when depth is greater than 0
    ext4: remove unnecessary argument from __ext4_handle_dirty_metadata()
    ext4: weed out ext4_write_super
    ext4: remove unnecessary superblock dirtying
    ext4: convert last user of ext4_mark_super_dirty() to ext4_handle_dirty_super()
    ext4: remove useless marking of superblock dirty
    ext4: fix ext4 mismerge back in January
    ext4: remove dynamic array size in ext4_chksum()
    ext4: remove unused variable in ext4_update_super()
    ext4: make quota as first class supported feature
    ext4: don't take the i_mutex lock when doing DIO overwrites
    ext4: add a new nolock flag in ext4_map_blocks
    ext4: split ext4_file_write into buffered IO and direct IO
    ext4: remove an unused statement in ext4_mb_get_buddy_page_lock()
    ext4: fix out-of-date comments in extents.c
    ext4: use s_csum_seed instead of i_csum_seed for xattr block
    ext4: use proper csum calculation in ext4_rename
    ext4: fix overhead calculation used by ext4_statfs()
    ...

    Linus Torvalds
     

24 Jul, 2012

1 commit

  • Pull the big VFS changes from Al Viro:
    "This one is *big* and changes quite a few things around VFS. What's in there:

    - the first of two really major architecture changes - death to open
    intents.

    The former is finally there; it was very long in making, but with
    Miklos getting through really hard and messy final push in
    fs/namei.c, we finally have it. Unlike his variant, this one
    doesn't introduce struct opendata; what we have instead is
    ->atomic_open() taking preallocated struct file * and passing
    everything via its fields.

    Instead of returning struct file *, it returns -E... on error, 0
    on success and 1 in "deal with it yourself" case (e.g. symlink
    found on server, etc.).

    See comments before fs/namei.c:atomic_open(). That made a lot of
    goodies finally possible and quite a few are in that pile:
    ->lookup(), ->d_revalidate() and ->create() do not get struct
    nameidata * anymore; ->lookup() and ->d_revalidate() get lookup
    flags instead, ->create() gets "do we want it exclusive" flag.

    With the introduction of new helper (kern_path_locked()) we are rid
    of all struct nameidata instances outside of fs/namei.c; it's still
    visible in namei.h, but not for long. Come the next cycle,
    declaration will move either to fs/internal.h or to fs/namei.c
    itself. [me, miklos, hch]

    - The second major change: behaviour of final fput(). Now we have
    __fput() done without any locks held by caller *and* not from deep
    in call stack.

    That obviously lifts a lot of constraints on the locking in there.
    Moreover, it's legal now to call fput() from atomic contexts (which
    has immediately simplified life for aio.c). We also don't need
    anti-recursion logics in __scm_destroy() anymore.

    There is a price, though - the damn thing has become partially
    asynchronous. For fput() from normal process we are guaranteed
    that pending __fput() will be done before the caller returns to
    userland, exits or gets stopped for ptrace.

    For kernel threads and atomic contexts it's done via
    schedule_work(), so theoretically we might need a way to make sure
    it's finished; so far only one such place had been found, but there
    might be more.

    There's flush_delayed_fput() (do all pending __fput()) and there's
    __fput_sync() (fput() analog doing __fput() immediately). I hope
    we won't need them often; see warnings in fs/file_table.c for
    details. [me, based on task_work series from Oleg merged last
    cycle]

    - sync series from Jan

    - large part of "death to sync_supers()" work from Artem; the only
    bits missing here are exofs and ext4 ones. As far as I understand,
    those are going via the exofs and ext4 trees resp.; once they are
    in, we can put ->write_super() to the rest, along with the thread
    calling it.

    - preparatory bits from unionmount series (from dhowells).

    - assorted cleanups and fixes all over the place, as usual.

    This is not the last pile for this cycle; there's at least jlayton's
    ESTALE work and fsfreeze series (the latter - in dire need of fixes,
    so I'm not sure it'll make the cut this cycle). I'll probably throw
    symlink/hardlink restrictions stuff from Kees into the next pile, too.
    Plus there's a lot of misc patches I hadn't thrown into that one -
    it's large enough as it is..."

    * 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (127 commits)
    ext4: switch EXT4_IOC_RESIZE_FS to mnt_want_write_file()
    btrfs: switch btrfs_ioctl_balance() to mnt_want_write_file()
    switch dentry_open() to struct path, make it grab references itself
    spufs: shift dget/mntget towards dentry_open()
    zoran: don't bother with struct file * in zoran_map
    ecryptfs: don't reinvent the wheels, please - use struct completion
    don't expose I_NEW inodes via dentry->d_inode
    tidy up namei.c a bit
    unobfuscate follow_up() a bit
    ext3: pass custom EOF to generic_file_llseek_size()
    ext4: use core vfs llseek code for dir seeks
    vfs: allow custom EOF in generic_file_llseek code
    vfs: Avoid unnecessary WB_SYNC_NONE writeback during sys_sync and reorder sync passes
    vfs: Remove unnecessary flushing of block devices
    vfs: Make sys_sync writeout also block device inodes
    vfs: Create function for iterating over block devices
    vfs: Reorder operations during sys_sync
    quota: Move quota syncing to ->sync_fs method
    quota: Split dquot_quota_sync() to writeback and cache flushing part
    vfs: Move noop_backing_dev_info check from sync into writeback
    ...

    Linus Torvalds
     

23 Jul, 2012

18 commits

  • The function ext4_calc_metadata_amount() has side effects, although
    it's not obvious from its function name. So if we fail to claim
    space, regardless of whether we retry to claim the space again, or
    return an error, we need to undo these side effects.

    Otherwise we can end up incorrectly calculating the number of metadata
    blocks needed for the operation, which was responsible for an xfstests
    failure for test #271 when using an ext2 file system with delalloc
    enabled.
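
A small sketch of the save-and-restore idea, assuming a toy estimator whose cached state changes on every call. The struct calc_state type, its fields, and both functions are hypothetical and only model the shape of the fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cached estimator state; field names are illustrative. */
struct calc_state {
    unsigned int calc_len;
    unsigned long last_lblock;
};

/* Toy estimator with a side effect: every call updates the cached state. */
static unsigned int calc_metadata_amount(struct calc_state *st,
                                         unsigned long lblock)
{
    if (st->calc_len && lblock == st->last_lblock + 1) {
        st->calc_len++;
        st->last_lblock = lblock;
        return 0; /* contiguous with the previous call in this toy model */
    }
    st->calc_len = 1;
    st->last_lblock = lblock;
    return 1;
}

/* Try to claim space; roll the estimator state back if the claim fails. */
static bool reserve_block(struct calc_state *st, unsigned long lblock,
                          unsigned int *free_blocks)
{
    assert(st != NULL && free_blocks != NULL);

    struct calc_state saved = *st;
    unsigned int needed = calc_metadata_amount(st, lblock) + 1; /* + data */

    if (needed > *free_blocks) {
        *st = saved; /* undo the side effect before retrying or failing */
        return false;
    }
    *free_blocks -= needed;
    return true;
}
```

Without the rollback, a failed claim would leave calc_len and last_lblock advanced, so a later call for the same block would be treated as a continuation and underestimate the metadata needed.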

    Reported-by: Brian Foster
    Signed-off-by: "Theodore Ts'o"
    Cc: stable@vger.kernel.org

    Theodore Ts'o
     
  • If we hit a condition where we have allocated metadata blocks that
    were not appropriately reserved, we risk underflow of
    ei->i_reserved_meta_blocks. In turn, this can throw
    sbi->s_dirtyclusters_counter significantly out of whack and undermine
    the nondelalloc fallback logic in ext4_nonda_switch(). Warn if this
    occurs and set i_allocated_meta_blocks to avoid this problem.

    This condition is reproduced by xfstests 270 against ext2 with
    delalloc enabled:

    Mar 28 08:58:02 localhost kernel: [ 171.526344] EXT4-fs (loop1): delayed block allocation failed for inode 14 at logical offset 64486 with max blocks 64 with error -28
    Mar 28 08:58:02 localhost kernel: [ 171.526346] EXT4-fs (loop1): This should not happen!! Data will be lost

    270 ultimately fails with an inconsistent filesystem and requires an
    fsck to repair. The cause of the error is an underflow in
    ext4_da_update_reserve_space() due to an unreserved meta block
    allocation.
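
The guard can be sketched in userspace as a clamp before the unsigned subtraction. The struct reservation type and release_meta() below are hypothetical names used for illustration; they only mirror the "warn and clamp" idea described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical per-inode counters; names are illustrative. */
struct reservation {
    unsigned int reserved_meta;  /* metadata blocks reserved in advance */
    unsigned int allocated_meta; /* metadata blocks actually allocated */
};

/* Release the reservation for allocated metadata blocks. If more blocks
 * were allocated than reserved, warn and clamp so the unsigned counter
 * cannot wrap around. Returns the number of blocks released. */
static unsigned int release_meta(struct reservation *r)
{
    unsigned int released;

    assert(r != NULL);
    if (r->allocated_meta > r->reserved_meta) {
        fprintf(stderr,
                "warning: allocated %u meta blocks with only %u reserved\n",
                r->allocated_meta, r->reserved_meta);
        r->allocated_meta = r->reserved_meta;
    }
    released = r->allocated_meta;
    r->reserved_meta -= released;
    r->allocated_meta = 0;
    return released;
}
```

Without the clamp, subtracting 5 from a reserved count of 2 would wrap to a value near UINT_MAX, which is the kind of corruption that then propagates into any counter derived from it.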

    Signed-off-by: Brian Foster
    Signed-off-by: "Theodore Ts'o"
    Cc: stable@vger.kernel.org

    Brian Foster
     
  • Whether to continue removing extents or not is decided by the return
    value of function ext4_ext_more_to_rm() which checks 2 conditions:
    a) if there are no more indexes to process.
    b) if the number of entries is decreased in the header of "depth - 1".

    In case of hole punch, if the last block to be removed is not part of
    the last extent index then this index will not be deleted, hence the
    number of valid entries in the extent header of "depth - 1" will
    remain as it is and ext4_ext_more_to_rm will return 0 although the
    required blocks are not yet removed.

    This patch fixes the above-mentioned problem: instead of removing
    the extents from the end of the file, it starts removing the blocks from
    the particular extent from which removing blocks is actually required
    and continues backward until done.

    Signed-off-by: Ashish Sangwan
    Signed-off-by: Namjae Jeon
    Reviewed-by: Lukas Czerner
    Cc: stable@vger.kernel.org

    Ashish Sangwan
     
  • The '__ext4_handle_dirty_metadata()' does not need the 'now' argument
    anymore and we can kill it.

    Signed-off-by: Artem Bityutskiy
    Signed-off-by: "Theodore Ts'o"
    Reviewed-by: Jan Kara

    Artem Bityutskiy
     
  • We do not depend on VFS's '->write_super()' anymore and do not need
    the 's_dirt' flag anymore, so weed out 'ext4_write_super()' and
    's_dirt'.

    Signed-off-by: Artem Bityutskiy
    Signed-off-by: "Theodore Ts'o"
    Reviewed-by: Jan Kara

    Artem Bityutskiy
     
  • This patch changes the 'ext4_handle_dirty_super()' function which
    submits the superblock for I/O in the following cases:

    1. When creating the first large file on a file system without
    EXT4_FEATURE_RO_COMPAT_LARGE_FILE feature.
    2. When re-sizing the file-system.
    3. When creating an xattr on a file-system without the
    EXT4_FEATURE_COMPAT_EXT_ATTR feature.

    If the file-system has journal enabled, the superblock is written via
    the journal. We do not modify this path.

    If the file-system has no journal, this function falls back to just
    marking the superblock as dirty using the 's_dirt' superblock
    flag. This means that it delays the actual superblock I/O submission
    by 5 seconds (default setting). Namely, the 'sync_supers()' kernel
    thread will call 'ext4_write_super()' later and will actually submit
    the superblock for I/O.

    And this is the behavior this patch modifies: we stop using 's_dirt'
    and just mark the superblock buffer as dirty right away. Indeed, all 3
    cases above are extremely rare and it does not add any value to delay
    the I/O submission for them.

    Note: 'ext4_handle_dirty_super()' executes
    '__ext4_handle_dirty_super()' with 'now = 0'. This patch basically
    makes the 'now' argument unneeded and it will be deleted in one of the
    next patches.

    This patch also removes 's_dirt' condition on the unmount path because
    we never set it anymore, so we should not test it.

    Tested using xfstests for both journalled and non-journalled ext4.

    Signed-off-by: Artem Bityutskiy
    Signed-off-by: "Theodore Ts'o"
    Reviewed-by: Jan Kara

    Artem Bityutskiy
     
  • The last user of ext4_mark_super_dirty() in ext4_file_open() is so
    rare it can well be modifying the superblock properly by journalling
    the change. Change it and get rid of ext4_mark_super_dirty() as it's
    not needed anymore.

    Artem: small amendments.
    Artem: tested using xfstests for both journalled and non-journalled ext4.

    Signed-off-by: Jan Kara
    Signed-off-by: Artem Bityutskiy
    Signed-off-by: "Theodore Ts'o"
    Tested-by: Artem Bityutskiy

    Jan Kara
     
  • Commit a0375156 properly notes that superblock doesn't need to be marked
    as dirty when only number of free inodes / blocks / number of directories
    changes since that is recomputed on each mount anyway. However that comment
    leaves some unnecessary markings as dirty in place. Remove these.

    Artem: tested using xfstests for both journalled and non-journalled ext4.

    Signed-off-by: Jan Kara
    Signed-off-by: Artem Bityutskiy
    Signed-off-by: "Theodore Ts'o"
    Tested-by: Artem Bityutskiy

    Jan Kara
     
  • Duplicate caused, AFAICS, by mismerge in
    ff9cb1c4eead5e4c292e75cd3170a82d66944101

    Signed-off-by: Al Viro
    Signed-off-by: "Theodore Ts'o"
    Cc: stable@vger.kernel.org

    Al Viro
     
  • The ext4_chksum() inline function was using a dynamic array size,
    which is not legal C. (It is a gcc extension).

    Remove it.

    Cc: "Darrick J. Wong"
    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     
  • Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     
  • This patch adds support for quotas as a first class feature in ext4;
    which is to say, the quota files are stored in hidden inodes as file
    system metadata, instead of as separate files visible in the file system
    directory hierarchy.

    It is based on the proposal at:
    https://ext4.wiki.kernel.org/index.php/Design_For_1st_Class_Quota_in_Ext4

    This patch introduces a new feature - EXT4_FEATURE_RO_COMPAT_QUOTA
    which, when turned on, enables quota accounting at mount time
    itself. Also, the quota inodes are stored in two additional superblock
    fields. Some changes introduced by this patch that should be pointed
    out are:

    1) Two new ext4-superblock fields - s_usr_quota_inum and
    s_grp_quota_inum for storing the quota inodes in use.
    2) Default quota inodes are: inode#3 for tracking user quota and inode#4
    for tracking group quota. The superblock fields can be set to use
    other inodes as well.
    3) If the QUOTA feature and corresponding quota inodes are set in
    superblock, the quota usage tracking is turned on at mount time. On
    'quotaon' ioctl, the quota limits enforcement is turned
    on. 'quotaoff' ioctl turns off only the limits enforcement in this
    case.
    4) When QUOTA feature is in use, the quota mount options 'quota',
    'usrquota', 'grpquota' are ignored by the kernel.
    5) mke2fs or tune2fs can be used to set the QUOTA feature and initialize
    quota inodes. The default reserved inodes will not be visible to user
    as regular files.
    6) The quota-tools will need to be modified to support hidden quota
    files on ext4. E2fsprogs will also include support for creating and
    fixing quota files.
    7) Support is only for the new V2 quota file format.

    Tested-by: Jan Kara
    Reviewed-by: Jan Kara
    Reviewed-by: Johann Lombardi
    Signed-off-by: Aditya Kali
    Signed-off-by: "Theodore Ts'o"

    Aditya Kali
     
  • Aligned and overwrite direct I/O can be parallelized. In
    ext4_file_dio_write, we first check whether these conditions are
    satisfied or not. If so, we take i_data_sem and release i_mutex lock
    directly. Meanwhile iocb->private is set to indicate that this is a
    dio overwrite, and it will be handled in ext4_ext_direct_IO.

    [ Added fix from Dan Carpenter to fix locking bug on the error path. ]

    CC: Tao Ma
    CC: Eric Sandeen
    CC: Robin Dong
    Signed-off-by: Zheng Liu
    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Dan Carpenter

    Zheng Liu
     
  • Signed-off-by: Al Viro

    Al Viro
     
  • d_instantiate(dentry, inode);
    unlock_new_inode(inode);

    is a bad idea; do it the other way round...

    Signed-off-by: Al Viro

    Al Viro
     
  • Use the new functionality in generic_file_llseek_size() to
    accept a custom EOF position, and un-cut-and-paste all the
    vfs llseek code from ext4.

    Also fix up comments on ext4_llseek() to reflect reality.

    Signed-off-by: Eric Sandeen
    Signed-off-by: Al Viro

    Eric Sandeen
     
  • For ext3/4 htree directories, using the vfs llseek function with
    SEEK_END goes to i_size like for any other file, but in reality
    we want the maximum possible hash value. Recent changes
    in ext4 have cut & pasted generic_file_llseek() back into fs/ext4/dir.c,
    but replicating this core code seems like a bad idea, especially
    since the copy has already diverged from the vfs.

    This patch updates generic_file_llseek_size to accept
    both a custom maximum offset, and a custom EOF position. With this
    in place, ext4_dir_llseek can pass in the appropriate maximum hash
    position for both maxsize and eof, and get what it wants.
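
To make the maxsize/eof split concrete, here is a simplified userspace model of a seek helper that takes both values. The function llseek_size() and its signature are assumptions for illustration; it ignores overflow, SEEK_DATA/SEEK_HOLE, and locking, and is not the VFS code.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h> /* SEEK_SET, SEEK_CUR, SEEK_END */

/* Simplified seek: 'eof' is what SEEK_END is relative to, 'maxsize' is
 * the largest position allowed. Returns the new position or -EINVAL. */
static long long llseek_size(long long cur, long long offset, int whence,
                             long long maxsize, long long eof)
{
    long long pos;

    assert(maxsize >= 0);
    switch (whence) {
    case SEEK_SET:
        pos = offset;
        break;
    case SEEK_CUR:
        pos = cur + offset;
        break;
    case SEEK_END:
        pos = eof + offset; /* custom EOF instead of the file size */
        break;
    default:
        return -EINVAL;
    }
    if (pos < 0 || pos > maxsize)
        return -EINVAL;
    return pos;
}
```

In this model, a caller that wants SEEK_END to land on a maximum hash value rather than on i_size would pass that value as both maxsize and eof, which is the usage the message describes for ext4_dir_llseek.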

    As far as I know, this does not fix any bugs - nfs in the kernel
    doesn't use SEEK_END, and I don't know of any user who does. But
    some ext4 folks seem keen on doing the right thing here, and I can't
    really argue.

    (Patch also fixes up some comments slightly)

    Signed-off-by: Eric Sandeen
    Signed-off-by: Al Viro

    Eric Sandeen
     
  • Since writes to quota files now use the block device page cache and
    space for quota structures is reserved at the moment they are first accessed, we
    have no reason to sync quota before inode writeback. In fact this order is now
    only harmful since quota information can easily change during inode writeback
    (either because of conversion of delayed-allocated extents or simply because of
    allocation of new blocks for simple filesystems not using page_mkwrite).

    So move syncing of quota information after writeback of inodes into ->sync_fs
    method. This way we do not have to use ->quota_sync callback which is primarily
    intended for use by quotactl syscall anyway and we get rid of calling
    ->sync_fs() twice unnecessarily. We skip quota syncing for OCFS2 since it does
    proper quota journalling in all cases (unlike ext3, ext4, and reiserfs which
    also support legacy non-journalled quotas) and thus there are no dirty quota
    structures.

    CC: "Theodore Ts'o"
    CC: Joel Becker
    CC: reiserfs-devel@vger.kernel.org
    Acked-by: Steven Whitehouse
    Acked-by: Dave Kleikamp
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Jan Kara
    Signed-off-by: Al Viro

    Jan Kara
     

18 Jul, 2012

1 commit


14 Jul, 2012

4 commits


10 Jul, 2012

4 commits

  • EXT4_GET_BLOCKS_NO_LOCK flag is added to indicate that we don't need
    to acquire i_data_sem lock in ext4_map_blocks. Meanwhile, it changes
    ext4_get_block() to not start a new journal because when we do an
    overwrite dio, there is no metadata that needs to be modified.

    We define a new function called ext4_get_block_write_nolock, which is
    used in dio overwrite nolock. In this function, it doesn't try to
    acquire i_data_sem lock and doesn't start a new journal as it does a
    lookup.

    CC: Tao Ma
    CC: Eric Sandeen
    CC: Robin Dong
    Signed-off-by: Zheng Liu
    Signed-off-by: "Theodore Ts'o"

    Zheng Liu
     
  • ext4_file_dio_write is defined in order to split buffered IO and
    direct IO in ext4. This patch just refactors some stuff in the write path.

    CC: Tao Ma
    CC: Eric Sandeen
    CC: Robin Dong
    Signed-off-by: Zheng Liu
    Signed-off-by: "Theodore Ts'o"

    Zheng Liu
     
  • In this patch, the statement "poff = block % blocks_per_page"
    in ext4_mb_get_buddy_page_lock has no effect.

    It will be optimized out by the compiler, but it's better to remove it.

    Signed-off-by: Haibo Liu
    Signed-off-by: "Theodore Ts'o"

    Haibo Liu
     
  • In this patch, ext4_ext_try_to_merge has been changed to merge
    an extent both left and right. So we need to update the comment
    here.

    Signed-off-by: HaiboLiu
    Signed-off-by: "Theodore Ts'o"

    HaiboLiu