18 Oct, 2008

1 commit


17 Oct, 2008

4 commits


16 Oct, 2008

3 commits


14 Oct, 2008

3 commits


13 Oct, 2008

1 commit

  • fs/ext4/super.c: In function 'ext4_fill_super':
    fs/ext4/super.c:2226: error: 'ext4_ui_proc_fops' undeclared (first use
    in this function)
    fs/ext4/super.c:2226: error: (Each undeclared identifier is reported
    only once
    fs/ext4/super.c:2226: error: for each function it appears in.)

    Signed-off-by: Alexander Beregalov
    Signed-off-by: Theodore Ts'o

    Alexander Beregalov
     

11 Oct, 2008

5 commits

  • We need to make sure we don't reuse the data blocks released
    during the transaction until the transaction commits. We force
    this behaviour only for ordered and journalled modes. Writeback
    mode already doesn't provide data consistency.

    Signed-off-by: Aneesh Kumar K.V
    Signed-off-by: Theodore Ts'o

    Aneesh Kumar K.V
     
  • During filesystem recovery we may be doing a truncate
    which expects some of the mballoc data structures to
    be initialized. So do ext4_mb_init before recovery.

    Signed-off-by: Aneesh Kumar K.V
    Signed-off-by: Theodore Ts'o

    Aneesh Kumar K.V
     
  • If the journal doesn't abort when it gets an IO error in file data
    blocks, the file data corruption will spread silently. Because
    most applications and commands do buffered writes without fsync(),
    they don't notice the IO error. That is scary for mission-critical
    systems. On the other hand, if the journal aborts whenever it gets
    an IO error in file data blocks, the system will easily become
    inoperable. So this patch introduces a filesystem option that
    determines whether ext4 aborts the journal or just calls printk()
    when it gets an IO error in file data.

    If you mount an ext4 fs with the data_err=abort option, it aborts on a
    file data write error. If you mount it with data_err=ignore, it doesn't
    abort and just calls printk(). data_err=ignore is the default.

    Here is the corresponding patch of the ext3 version:
    http://kerneltrap.org/mailarchive/linux-kernel/2008/9/9/3239374

    Signed-off-by: Hidehiro Kawai
    Signed-off-by: Theodore Ts'o

    Hidehiro Kawai
     
  • If the journal has aborted due to a checkpointing failure, we
    have to keep the contents of the journal space. Otherwise, the
    filesystem will lose uncheckpointed metadata completely and
    become inconsistent. To avoid this, we need to keep the
    needs_recovery flag if the checkpoint has failed.

    With this patch, ext4_put_super() detects a checkpointing failure
    from the return value of journal_destroy(), then invokes
    ext4_abort() to make the filesystem read-only and keep the
    needs_recovery flag. Errors from jbd2_journal_flush() are also
    handled by this patch in some places.

    Signed-off-by: Hidehiro Kawai
    Signed-off-by: Theodore Ts'o

    Hidehiro Kawai
     
  • The ext4 filesystem is getting stable enough that it's time to drop
    the "dev" prefix. Also remove the requirement for the TEST_FILESYS
    flag.

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
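
The data_err=abort / data_err=ignore behaviour described in the 11 Oct entry above can be illustrated with a small userspace sketch. It is not the kernel code: the enum, both function names, and the warning text are hypothetical stand-ins, and only the two option strings and the "ignore is the default" rule come from the commit message.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the two policies named in the commit message. */
enum data_err_mode { DATA_ERR_IGNORE, DATA_ERR_ABORT };

/*
 * Parse one mount option string. Returns 0 and stores the mode on
 * success, or -1 if the option is not a recognised data_err setting.
 */
int parse_data_err(const char *opt, enum data_err_mode *mode)
{
    if (strcmp(opt, "data_err=abort") == 0) {
        *mode = DATA_ERR_ABORT;
        return 0;
    }
    if (strcmp(opt, "data_err=ignore") == 0) {
        *mode = DATA_ERR_IGNORE;
        return 0;
    }
    return -1;
}

/*
 * Decide what to do after a file data write error. Returns 1 if the
 * journal should be aborted; otherwise logs a warning (the userspace
 * analogue of printk()) and returns 0.
 */
int handle_data_write_error(enum data_err_mode mode)
{
    if (mode == DATA_ERR_ABORT)
        return 1;
    fprintf(stderr, "warning: IO error in file data, continuing\n");
    return 0;
}
```

A caller would start from DATA_ERR_IGNORE, since that is the stated default, and override it only when parse_data_err() succeeds on a supplied option.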
     

10 Oct, 2008

3 commits

  • This fixes a bug which caused on-line resizing of filesystems with a
    1k blocksize to fail. The root cause was that when an uninitialized
    bitmap block gets read in by userspace (which e2fsprogs does try to
    avoid, but which can happen when the blocksize is less than the
    pagesize and an adjacent block is read into memory),
    ext4_read_block_bitmap() erroneously depended on the buffer
    uptodate flag to decide whether it needed to initialize the bitmap
    block in memory --- i.e., to set the standard set of blocks in use by
    a block group (superblock, bitmaps, inode table, etc.). Essentially,
    ext4_read_block_bitmap() assumed it was the only routine that might
    try to read a block containing a block bitmap, which is simply not
    true.

    To fix this, ext4_read_block_bitmap() and ext4_read_inode_bitmap()
    must always initialize uninitialized bitmap blocks. Once a block or
    inode is allocated out of that bitmap, it will be marked as
    initialized in the block group descriptor, so in general this won't
    result in any unnecessary extra work.

    Signed-off-by: Frederic Bohe
    Signed-off-by: "Theodore Ts'o"

    Frederic Bohe
     
  • Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     
  • With modern hard drives, reading 64k takes roughly the same time as
    reading a 4k block. So request readahead for adjacent inode table
    blocks to reduce the time it takes when iterating over directories
    (especially when doing this in htree sort order) in a cold cache case.
    With this patch, the time it takes to run "git status" on a kernel
    tree after flushing the caches via "echo 3 > /proc/sys/vm/drop_caches"
    is reduced by 21%.

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
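
The bitmap fix in the first 10 Oct entry above comes down to which flag drives the decision to initialize. The following userspace sketch shows that decision being made from the group descriptor rather than from the buffer's uptodate flag. All struct names, the flag value, and the bitmap size are assumptions made for illustration; they are not the ext4 definitions.

```c
#include <string.h>

#define BG_BLOCK_UNINIT 0x1u  /* hypothetical "bitmap not yet initialized" flag */
#define BITMAP_BYTES    128   /* hypothetical bitmap size for this sketch */

struct group_desc_sketch {
    unsigned int flags;
    unsigned int nr_meta_blocks;  /* superblock, bitmaps, inode table, ... */
};

struct bitmap_buf_sketch {
    int uptodate;                 /* set when something has read the block */
    unsigned char bits[BITMAP_BYTES];
};

/*
 * Always initialize a bitmap whose group descriptor says it is
 * uninitialized, even if the buffer is already marked uptodate because
 * another reader pulled the block into memory.
 */
void read_block_bitmap_sketch(const struct group_desc_sketch *gd,
                              struct bitmap_buf_sketch *bh)
{
    unsigned int i;

    if (gd->flags & BG_BLOCK_UNINIT) {
        memset(bh->bits, 0, sizeof(bh->bits));
        /* Mark the standard metadata blocks of the group as in use. */
        for (i = 0; i < gd->nr_meta_blocks && i < BITMAP_BYTES * 8; i++)
            bh->bits[i / 8] |= (unsigned char)(1u << (i % 8));
        bh->uptodate = 1;
        return;
    }

    if (!bh->uptodate) {
        /* Real code would read the block from disk here. */
        bh->uptodate = 1;
    }
}
```

The buggy behaviour corresponds to testing bh->uptodate first and returning early, which leaves whatever the other reader loaded in place.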
     

09 Oct, 2008

3 commits

  • ext4_xattr_set_handle() eventually ends up calling
    ext4_mark_inode_dirty(), which tries to expand the inode by shifting
    the EAs. This leads to the xattr_sem being downed again, resulting
    in a deadlock.

    This patch makes sure that if ext4_xattr_set_handle() is in the
    call-chain, ext4_mark_inode_dirty() will not expand the inode.

    Signed-off-by: Kalpak Shah
    Signed-off-by: "Theodore Ts'o"

    Kalpak Shah
     
  • This patch hooks the ext3 to ext4 migrate interface to the
    EXT4_IOC_SETFLAGS ioctl. The userspace interface is via chattr +e. We
    only allow setting the extent flag. Clearing the extent flag
    (migrating from ext4 to ext3) is not supported.

    Signed-off-by: Aneesh Kumar K.V
    Signed-off-by: "Theodore Ts'o"

    Aneesh Kumar K.V
     
  • Note: some people think this represents a security bug, since it
    might make the system go away while it is printing a large number of
    console messages, especially if a serial console is involved. Hence,
    it has been assigned CVE-2008-3528, but it requires that the attacker
    either has physical access to your machine to insert a USB disk with a
    corrupted filesystem image (at which point why not just hit the power
    button), or is otherwise able to convince the system administrator to
    mount an arbitrary filesystem image (at which point why not just
    include a setuid shell or world-writable hard disk device file or some
    such). Me, I think they're just being silly. --tytso

    Signed-off-by: Eric Sandeen
    Signed-off-by: "Theodore Ts'o"
    Cc: linux-ext4@vger.kernel.org
    Cc: Eugene Teo

    Eric Sandeen
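
The xattr deadlock fix in the first 09 Oct entry above can be modelled as a "do not expand" marker that is set for the duration of the xattr call chain. The sketch below is a single-threaded simulation, not kernel code: the struct, its fields, and the function names are hypothetical, and a counter stands in for the point where a real second acquisition of xattr_sem would hang.

```c
/* Hypothetical model of the inode state relevant to the deadlock. */
struct inode_sketch {
    int xattr_sem_held;   /* 1 while the xattr call chain holds the lock */
    int no_expand;        /* 1 while inode expansion must be skipped */
    int expanded;         /* set when an expansion actually happened */
    int would_deadlock;   /* counts attempts to retake a held lock */
};

/* Stand-in for the dirtying path, which may try to expand the inode. */
void mark_inode_dirty_sketch(struct inode_sketch *inode)
{
    if (inode->no_expand)
        return;                    /* the fix: skip expansion entirely */

    if (inode->xattr_sem_held)
        inode->would_deadlock++;   /* expansion needs the lock again */
    else
        inode->expanded = 1;
}

/* Stand-in for the xattr setter that ends up dirtying the inode. */
void xattr_set_handle_sketch(struct inode_sketch *inode)
{
    int saved = inode->no_expand;

    inode->xattr_sem_held = 1;
    inode->no_expand = 1;          /* forbid expansion in this call chain */
    mark_inode_dirty_sketch(inode);
    inode->no_expand = saved;
    inode->xattr_sem_held = 0;
}
```

Outside the xattr call chain, mark_inode_dirty_sketch() still expands the inode as before, so the marker only affects the path that already holds the lock.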
     

07 Oct, 2008

3 commits


06 Oct, 2008

1 commit


24 Sep, 2008

1 commit


23 Sep, 2008

2 commits


17 Sep, 2008

1 commit

  • Calculate the journal device name once and stash it away in the
    journal_s structure. This avoids needing to call bdevname()
    everywhere and reduces stack usage by not needing to allocate an
    on-stack buffer. In addition, we eliminate the '/' that can appear in
    device names (e.g. "cciss/c0d0p9" --- see kernel bugzilla #11321) that
    can cause problems when creating proc directory names, and include the
    inode number to support ocfs2 which creates multiple journals with
    different inode numbers.

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
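
The device-name handling described in the 17 Sep entry above can be sketched as a one-time string transformation. This is a userspace illustration only: the function name, the '!' replacement character, the '-' separator, and the convention that an inode number of 0 means "no inode" are all assumptions, not details taken from the commit message.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build a name that is safe to use as a single proc directory entry:
 * copy the block device name, replace every '/' and, when the journal
 * lives in an inode, append the inode number so that several journals
 * on one device get distinct names. dstlen must be at least 1.
 */
void make_journal_devname(char *dst, size_t dstlen,
                          const char *bdev, unsigned long ino)
{
    size_t i, used;

    snprintf(dst, dstlen, "%s", bdev);

    for (i = 0; dst[i] != '\0'; i++) {
        if (dst[i] == '/')
            dst[i] = '!';          /* hypothetical replacement character */
    }

    if (ino != 0) {
        used = strlen(dst);
        snprintf(dst + used, dstlen - used, "-%lu", ino);
    }
}
```

Computing the name once and storing it alongside the journal is what lets callers avoid a per-call on-stack buffer.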
     

14 Sep, 2008

6 commits

  • lg_prealloc_list seems to cry out for a per-cpu data structure; on a large
    smp system I think this should be better. I've lightly tested this change
    on a 4-cpu system.

    Signed-off-by: Eric Sandeen
    Acked-by: Aneesh Kumar K.V
    Signed-off-by: "Theodore Ts'o"

    Eric Sandeen
     
  • Pick an ioctl number for EXT4_IOC_MIGRATE that won't conflict with
    other ext4 ioctls. Since there haven't been any major userspace
    users of this ioctl, we can afford to change this now, to avoid
    potential problems later.

    Also, reorder the ioctl numbers in ext4.h to avoid this sort of
    mistake in the future.

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     
  • The migrate ioctl writes to the filesystem, so we need to elevate the
    write count.

    Signed-off-by: Aneesh Kumar K.V
    Signed-off-by: "Theodore Ts'o"

    Aneesh Kumar K.V
     
  • ext4 creates a per-superblock directory in /proc/ext4/. The name used
    as its basis is taken from bdevname, which, surprise, can contain a
    slash.

    However, proc, while allowing the proc_create("a/b", parent) form of
    PDE creation, assumes that parent/a was already created.

    The bdevname in question is 'cciss/c0d0p9'; the directory is not
    created and all this stuff goes directly into /proc (which is a real
    bug).

    The warning comes when the _second_ partition is mounted.

    http://bugzilla.kernel.org/show_bug.cgi?id=11321

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: "Theodore Ts'o"

    Alexey Dobriyan
     
  • With delayed allocation we use i_data_sem to update i_disksize. We need
    to update i_disksize only if the new size specified is greater than the
    current value, and we need to make sure we don't race with other
    i_disksize updates. With delayed allocation we will switch to the
    write_begin function for non-delayed allocation if we are low on free
    blocks. This means the write_begin function for non-delayed allocation
    also needs to use the same locking.

    We also need to check and update i_disksize even if the new size is less
    than inode.i_size because of delayed allocation.

    Signed-off-by: Aneesh Kumar K.V
    Signed-off-by: "Theodore Ts'o"

    Aneesh Kumar K.V
     
  • For blocksize < pagesize we need to remove blocks that got allocated in
    block_write_begin() if we fail with ENOSPC for later blocks.
    block_write_begin() internally does this if it allocated pages locally.
    This makes sure we don't have blocks outside inode.i_size during ENOSPC.

    Signed-off-by: Aneesh Kumar K.V
    Signed-off-by: "Theodore Ts'o"

    Aneesh Kumar K.V
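
The i_disksize rule from the 14 Sep entry above (update only when the new size is larger, and always under the same lock) can be sketched in userspace as follows. A pthread mutex stands in for i_data_sem here, and the struct and function names are hypothetical; the kernel uses a different locking primitive.

```c
#include <pthread.h>

/* Hypothetical model of the two inode fields involved. */
struct disksize_sketch {
    pthread_mutex_t lock;   /* stand-in for i_data_sem */
    long long i_disksize;   /* on-disk size recorded for the inode */
};

/*
 * Raise i_disksize to newsize if newsize is larger. The comparison and
 * the store happen under the lock so that concurrent updaters cannot
 * overwrite a larger value with a smaller one. Returns 1 if the stored
 * value changed, 0 otherwise.
 */
int update_i_disksize_sketch(struct disksize_sketch *inode, long long newsize)
{
    int updated = 0;

    pthread_mutex_lock(&inode->lock);
    if (newsize > inode->i_disksize) {
        inode->i_disksize = newsize;
        updated = 1;
    }
    pthread_mutex_unlock(&inode->lock);

    return updated;
}
```

Both the delayed and the non-delayed write paths would call the same helper, which is the point of the commit: a shared lock only protects the value if every updater takes it.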
     

09 Sep, 2008

1 commit

  • When we truncate files, the meta-data blocks released are not reused
    until we commit the truncate transaction. That means a delayed
    get_block request will return ENOSPC even if we have free blocks left.
    Force a journal commit and retry block allocation if we get ENOSPC
    with free blocks left.

    Signed-off-by: Aneesh Kumar K.V
    Signed-off-by: Mingming Cao
    Signed-off-by: "Theodore Ts'o"

    Aneesh Kumar K.V
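
The commit-and-retry idea in the 09 Sep entry above can be sketched as a small loop. This is a userspace model under stated assumptions: the struct, the helper names, and the retry limit are hypothetical, and "blocks released but not yet committed" is represented by a simple counter that a forced commit moves into the usable pool.

```c
#include <errno.h>

/* Hypothetical model of the allocator state. */
struct fs_sketch {
    long usable_blocks;    /* free and allocatable right now */
    long pending_release;  /* freed by truncate, unusable until commit */
    int commits;           /* number of forced journal commits */
};

/* Try to allocate one block; fails with -ENOSPC if none is usable. */
static int try_alloc_sketch(struct fs_sketch *fs)
{
    if (fs->usable_blocks > 0) {
        fs->usable_blocks--;
        return 0;
    }
    return -ENOSPC;
}

/* Model of a forced commit: released blocks become usable. */
static void force_commit_sketch(struct fs_sketch *fs)
{
    fs->usable_blocks += fs->pending_release;
    fs->pending_release = 0;
    fs->commits++;
}

/*
 * Allocate one block. On ENOSPC, force a commit and retry, but only
 * while there are released blocks waiting and the retry budget lasts.
 */
int alloc_with_retry_sketch(struct fs_sketch *fs, int max_retries)
{
    int retries = 0;

    for (;;) {
        int ret = try_alloc_sketch(fs);

        if (ret != -ENOSPC)
            return ret;
        if (fs->pending_release == 0 || retries >= max_retries)
            return ret;
        retries++;
        force_commit_sketch(fs);
    }
}
```

Checking pending_release before committing keeps a genuinely full filesystem from paying for commits that cannot free anything.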
     

08 Sep, 2008

2 commits