27 Jul, 2008

2 commits

  • * kill nameidata * argument; map the 3 bits in ->flags anybody cares
    about to new MAY_... ones and pass with the mask.
    * kill redundant gfs2_iop_permission()
    * sanitize ecryptfs_permission()
    * fix remaining places where ->permission() instances might barf on new
    MAY_... found in mask.

    The obvious next target in that direction is permission(9)

    folded fix for nfs_permission() breakage from Miklos Szeredi

    Signed-off-by: Al Viro

    Al Viro
     
  • Kmem cache passed to constructor is only needed for constructors that are
    themselves multiplexers. Nobody uses this "feature", nor does anybody use
    the passed kmem cache in a non-trivial way, so pass only a pointer to the
    object.

    Non-trivial places are:
    arch/powerpc/mm/init_64.c
    arch/powerpc/mm/hugetlbpage.c

    This is flag day, yes.

    Signed-off-by: Alexey Dobriyan
    Acked-by: Pekka Enberg
    Acked-by: Christoph Lameter
    Cc: Jon Tollefson
    Cc: Nick Piggin
    Cc: Matt Mackall
    [akpm@linux-foundation.org: fix arch/powerpc/mm/hugetlbpage.c]
    [akpm@linux-foundation.org: fix mm/slab.c]
    [akpm@linux-foundation.org: fix ubifs]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

26 Jul, 2008

10 commits

  • ext3_dx_find_entry uses ext3_next_entry without verifying that the entry
    is valid. If its rec_len == 0 this causes an infinite loop. Refactor the
    loop to check the validity of entries before checking whether they match
    and moving onto the next one.

    There are other uses of ext3_next_entry in this file which also look
    problematic. They should be reviewed and fixed if/when we have a
    test-case that triggers them.

    This patch fixes the first case (image hdb.25.softlockup.gz) reported in
    http://bugzilla.kernel.org/show_bug.cgi?id=10882.
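
    The "validate before matching and advancing" loop described above can be
    sketched in userspace as follows. This is a minimal illustration, not the
    ext3 code: the demo_* names, the host-byte-order fields and the exact
    validity checks are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed header size: inode (4), rec_len (2), name_len (1), type (1). */
#define DEMO_HDR 8u

/* Hypothetical helper: write one entry at offset off (host byte order). */
static void demo_put_entry(unsigned char *blk, size_t off, uint16_t rec_len,
                           const char *name)
{
    uint32_t ino = 1;
    size_t nlen = strlen(name);

    memcpy(blk + off, &ino, sizeof(ino));
    memcpy(blk + off + 4, &rec_len, sizeof(rec_len));
    blk[off + 6] = (unsigned char)nlen;
    blk[off + 7] = 0;
    memcpy(blk + off + DEMO_HDR, name, nlen);
}

/*
 * Returns the offset of the matching entry, -1 if the name is not present,
 * or -2 if a corrupt entry is found. Each entry is validated before it is
 * compared or used to advance, so rec_len == 0 cannot cause an endless loop.
 */
static long demo_find_entry(const unsigned char *blk, size_t blk_size,
                            const char *name, size_t name_len)
{
    size_t off = 0;

    assert(blk != NULL && name != NULL);
    while (off + DEMO_HDR <= blk_size) {
        uint16_t rec_len;
        uint8_t nlen = blk[off + 6];

        memcpy(&rec_len, blk + off + 4, sizeof(rec_len));
        if (rec_len < DEMO_HDR || rec_len % 4 != 0 ||
            rec_len > blk_size - off || (size_t)nlen + DEMO_HDR > rec_len)
            return -2; /* corrupt entry: stop instead of looping */
        if (nlen == name_len &&
            memcmp(blk + off + DEMO_HDR, name, name_len) == 0)
            return (long)off;
        off += rec_len;
    }
    return -1;
}
```

    With the check placed first, an entry whose rec_len is 0 ends the walk
    with an error rather than being revisited forever.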

    Signed-off-by: Duane Griffin
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Duane Griffin
     
  • dx_root_limit() will never return 20, and I can't figure out what 20
    stands for. This function has never changed since htree directory
    indexing was merged.

    Similar for dx_node_limit() and the magic 22.

    Signed-off-by: Li Zefan
    Acked-by: Andreas Dilger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
     
  • While freeing indirect blocks we attach a journal head to the parent
    buffer head, free the blocks, then journal the parent. If the indirect
    block list is corrupted and points to the parent the journal head will be
    detached when the block is cleared, causing an OOPS.

    Check for that explicitly and handle it gracefully.

    This patch fixes the third case (image hdb.20000057.nullderef.gz)
    reported in http://bugzilla.kernel.org/show_bug.cgi?id=10882.

    Immediately above the change, in the ext3_free_data function, we call
    ext3_clear_blocks to clear the indirect blocks in this parent block. If
    one of those blocks happens to actually be the parent block it will clear
    b_private / BH_JBD.

    I did the check at the end rather than earlier as it seemed more elegant.
    I don't think there should be much practical difference, although it is
    possible the FS may not be quite so badly corrupted if we did it the other
    way (and didn't clear the block at all). To be honest, I'm not convinced
    there aren't other similar failure modes lurking in this code, although I
    couldn't find any with a quick review.

    [akpm@linux-foundation.org: fix printk warning]
    Signed-off-by: Duane Griffin
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Duane Griffin
     
  • A transient I/O error can corrupt inode data. Here is the scenario:

    (1) update inode_A at block_B
    (2) pdflush writes out the new inode_A to the filesystem, but the write
        fails with an I/O error; at this point the BH_Uptodate flag of the
        buffer for block_B is cleared and BH_Write_EIO is set
    (3) create a new inode_C which is located at block_B;
        __ext3_get_inode_loc() tries to read the on-disk block_B because the
        buffer is not uptodate
    (4) if it reads the on-disk block_B successfully, inode_A is
        overwritten by old data

    This patch makes __ext3_get_inode_loc() not read the inode block if the
    buffer has the BH_Write_EIO flag set. In this case the buffer should have
    the latest information, so we set the uptodate flag on the buffer (this
    avoids the WARN_ON_ONCE() in mark_buffer_dirty()).

    With this change, error checking needs to test the BH_Write_EIO flag.
    Currently nobody checks write I/O errors on metadata buffers, but that
    will be done in other patches I'm working on.
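
    The decision described in this message can be sketched as a small
    userspace function. The DEMO_* flag names and demo_must_read_block() are
    hypothetical stand-ins for the buffer-head state bits, not kernel
    definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the buffer state bits named in the text. */
enum {
    DEMO_BH_UPTODATE  = 1u << 0,
    DEMO_BH_WRITE_EIO = 1u << 1
};

/*
 * Returns 1 if the block must be read from disk, 0 if the in-memory buffer
 * should be used. When an earlier write failed, the buffer still holds the
 * newest data, so it is marked uptodate instead of being re-read.
 */
static int demo_must_read_block(unsigned int *flags)
{
    assert(flags != NULL);
    if (*flags & DEMO_BH_UPTODATE)
        return 0;
    if (*flags & DEMO_BH_WRITE_EIO) {
        *flags |= DEMO_BH_UPTODATE; /* keep newest in-memory contents */
        return 0;
    }
    return 1;
}
```

    The write-error bit is left set in this sketch so that a later error
    check could still see it.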

    Signed-off-by: Hidehiro Kawai
    Cc: sugita
    Cc: Satoshi OSHIMA
    Cc: Nick Piggin
    Cc: Jan Kara
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hidehiro Kawai
     
  • If the orphan node list includes valid, untruncatable nodes with nlink > 0
    the ext3_orphan_cleanup loop which attempts to delete them will not do so,
    causing it to loop forever. Fix by checking for such nodes in the
    ext3_orphan_get function.

    This patch fixes the second case (image hdb.20000009.softlockup.gz)
    reported in http://bugzilla.kernel.org/show_bug.cgi?id=10882.
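
    A minimal sketch of the kind of check described above, assuming a
    simplified inode with only the two properties the message mentions.
    struct demo_inode and demo_orphan_ok() are hypothetical names used for
    illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified inode: link count and whether it can be truncated. */
struct demo_inode {
    unsigned int nlink;
    int can_truncate;
};

/*
 * Returns 1 if the inode may be processed as an orphan, 0 if it should be
 * rejected. An inode that is still linked and cannot be truncated would
 * never leave the list, so the cleanup loop would not terminate.
 */
static int demo_orphan_ok(const struct demo_inode *inode)
{
    assert(inode != NULL);
    if (inode->nlink > 0 && !inode->can_truncate)
        return 0;
    return 1;
}
```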

    [akpm@linux-foundation.org: coding-style fixes]
    [akpm@linux-foundation.org: printk warning fix]
    Signed-off-by: Duane Griffin
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Duane Griffin
     
  • remove the definitions of macros:
    XATTR_TRUSTED_PREFIX
    XATTR_USER_PREFIX
    since they are defined in linux/xattr.h

    Signed-off-by: Shen Feng
    Signed-off-by: Mingming Cao
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Shen Feng
     
  • - remove unnecessary code in free_rb_tree_fname
    - rename free_rb_tree_fname to ext3_htree_create_dir_info
    since it and ext3_htree_free_dir_info are a pair
    - replace kmalloc with kzalloc in ext3_htree_free_dir_info

    Signed-off-by: Shen Feng
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Shen Feng
     
  • We should not allow the user to change quota mount options when quota is
    just suspended. It would make mount options and internal quota state
    inconsistent. Also, we should not allow the user to change the quota format
    when quota is turned on. On the other hand, we can just silently ignore an
    option that is set to the value it already has (mount does this on remount).

    Cc:
    Signed-off-by: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     
  • Cc:
    Signed-off-by: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     
  • In journal=data mode, it is not enough to do write_inode_now as done in
    vfs_quota_on() to write all data to their final location (which is needed for
    quota_read to work correctly). Calling journal_flush() does its job.

    Reported-by: Nick
    Cc:
    Signed-off-by: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     

05 Jul, 2008

1 commit


07 Jun, 2008

1 commit

  • There is a bug when we are trying to verify that the reserve inode's
    double indirect blocks point back to the primary gdt blocks. The fix is
    obvious: we need to mod the gdb count by the number of addresses per
    block. You can verify this with the following test case:

    dd if=/dev/zero of=disk1 seek=1024 count=1 bs=100M
    losetup /dev/loop1 disk1
    pvcreate /dev/loop1
    vgcreate loopvg1 /dev/loop1
    lvcreate -l 100%VG loopvg1 -n looplv1
    mkfs.ext3 -J size=64 -b 1024 /dev/loopvg1/looplv1
    mount /dev/loopvg1/looplv1 /mnt/loop
    dd if=/dev/zero of=disk2 seek=1024 count=1 bs=50M
    losetup /dev/loop2 disk2
    pvcreate /dev/loop2
    vgextend loopvg1 /dev/loop2
    lvextend -l 100%VG /dev/loopvg1/looplv1
    resize2fs /dev/loopvg1/looplv1

    Without this patch resize2fs fails; with it, resize2fs succeeds.
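
    The modulo mentioned above can be illustrated with a short sketch. It
    assumes 4-byte block addresses; the demo_* helpers and the gdb count of
    300 used in the note below are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Number of 32-bit block addresses that fit in one block. */
static uint32_t demo_addrs_per_block(uint32_t block_size)
{
    return block_size / (uint32_t)sizeof(uint32_t);
}

/*
 * Slot inside the double indirect block for a given gdb count. Without the
 * modulo, a count larger than the addresses per block would index past the
 * end of the block.
 */
static uint32_t demo_dind_slot(uint32_t gdb_count, uint32_t block_size)
{
    uint32_t per_block = demo_addrs_per_block(block_size);

    assert(per_block != 0);
    return gdb_count % per_block;
}
```

    With the 1024-byte blocks used in the test case, one block holds 256
    addresses, so a hypothetical gdb count of 300 maps to slot 44.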

    Signed-off-by: Josef Bacik
    Acked-by: Andreas Dilger
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Josef Bacik
     

15 May, 2008

1 commit

  • This fixes the uninitialized bs when we try to replace an xattr entry in
    the ibody with a new value that requires more than the available free space.

    This situation only happens when ext3/4 is formatted with an inode size
    larger than 128 and xattr entries are stored both in the ibody and in a
    block. The consequence of this bug is that we lose the xattr block pointed
    to by i_file_acl, along with all xattr entries in it. We allocate a new
    xattr block and put the large value entry in it. The old xattr block
    becomes an orphan block.

    Signed-off-by: Tiger Yang
    Cc:
    Cc: Andreas Gruenbacher
    Acked-by: Andreas Dilger
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tiger Yang
     

30 Apr, 2008

1 commit


28 Apr, 2008

14 commits

  • __FUNCTION__ is gcc-specific, use __func__

    Signed-off-by: Harvey Harrison
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Harvey Harrison
     
  • When quota is disabled, we should not print 'journaled quota not supported'
    when the user tries to mount with non-journaled quota. Also fix a typo in
    the message.

    Signed-off-by: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     
  • If the block allocator gets blocks from the system zone, ext3 calls
    ext3_error. But if the file system is mounted with errors=continue, block
    allocation is retried. We need to mark the system zone blocks as in use to
    make sure the retry doesn't pick them again.

    System zone is the block range mapping block bitmap, inode bitmap and inode
    table.

    [akpm@linux-foundation.org: fix typo in comment]
    Signed-off-by: Aneesh Kumar K.V
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Aneesh Kumar K.V
     
  • Call dquot_drop() from ext3_dquot_drop() even if we fail to start a
    transaction. Otherwise we never get to dropping references to quota
    structures from the inode and umount will hang indefinitely. Thanks to
    Payphone LIOU for spotting the problem.

    Signed-off-by: Jan Kara
    Cc: Payphone LIOU
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     
  • Make ext3 update mtime and ctime of the directory into which we move a file
    even if the directory entry already exists.

    Signed-off-by: Jan Kara
    Cc: Al Viro
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     
  • if (...) BUG(); should be replaced with BUG_ON(...) when the test has no
    side-effects to allow a definition of BUG_ON that drops the code completely.

    The semantic patch that makes this change is as follows:
    (http://www.emn.fr/x-info/coccinelle/)

    // <smpl>
    @ disable unlikely @ expression E,f; @@

    (
      if (<... f(...) ...>) { BUG(); }
    |
    - if (unlikely(E)) { BUG(); }
    + BUG_ON(E);
    )

    @@ expression E,f; @@

    (
      if (<... f(...) ...>) { BUG(); }
    |
    - if (E) { BUG(); }
    + BUG_ON(E);
    )
    // </smpl>
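
    The effect of the transformation can be sketched in userspace C. The
    DEMO_BUG() and DEMO_BUG_ON() macros below are simplified stand-ins written
    for this example, not the kernel definitions.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-ins for the kernel macros. */
#define DEMO_BUG() abort()
#define DEMO_BUG_ON(cond) do { if (cond) DEMO_BUG(); } while (0)

/* Before: open-coded test followed by DEMO_BUG(). */
static int demo_div_before(int a, int b)
{
    if (b == 0) {
        DEMO_BUG();
    }
    return a / b;
}

/* After: the same side-effect-free test expressed with DEMO_BUG_ON(). */
static int demo_div_after(int a, int b)
{
    DEMO_BUG_ON(b == 0);
    return a / b;
}
```

    Because the condition has no side effects, a build that defines the
    BUG_ON-style macro to nothing can drop the test entirely without changing
    what the rest of the function does.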

    Signed-off-by: Julia Lawall
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Julia Lawall
     
  • Check ext3_journal_get_write_access() errors.

    Signed-off-by: Akinobu Mita
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
     
  • Use ext3_get_group_desc()

    Signed-off-by: Akinobu Mita
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
     
  • Add missing ext3_journal_stop() in error handling.

    Signed-off-by: Akinobu Mita
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
     
  • Use ext3_group_first_block_no()

    Signed-off-by: Akinobu Mita
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
     
  • Make the needlessly global ext3_xattr_list() static.

    Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     
  • Convert the byte order of the constant instead of the variable, so the
    conversion can be done at compile time (vs. run time).
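
    A userspace sketch of the two forms of comparison. The DEMO_* macro and
    demo_* helpers are assumptions for illustration (they rely on the
    __BYTE_ORDER__ macro provided by GCC and Clang), and DEMO_MAGIC is just
    an example constant.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
/* Big-endian host: swap bytes; the macro folds at compile time. */
#define DEMO_CPU_TO_LE16(x) \
    ((uint16_t)((((x) & 0xffu) << 8) | (((x) >> 8) & 0xffu)))
static uint16_t demo_le16_to_cpu(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}
#else
/* Little-endian host: both conversions are the identity. */
#define DEMO_CPU_TO_LE16(x) ((uint16_t)(x))
static uint16_t demo_le16_to_cpu(uint16_t v)
{
    return v;
}
#endif

#define DEMO_MAGIC 0xEF53u /* example constant */

/* Converts the variable: the conversion happens on every call. */
static int demo_magic_ok_runtime(uint16_t on_disk)
{
    return demo_le16_to_cpu(on_disk) == DEMO_MAGIC;
}

/* Converts the constant: the compiler can fold the conversion. */
static int demo_magic_ok_constant(uint16_t on_disk)
{
    return on_disk == DEMO_CPU_TO_LE16(DEMO_MAGIC);
}
```

    Both functions give the same answer for every input; only the side of
    the comparison that is converted differs.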

    Signed-off-by: Marcin Slusarz
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Marcin Slusarz
     
  • Currently fdatasync is identical to fsync in ext3.

    I think fdatasync should skip the journal flush in data=ordered and
    data=writeback mode when it overwrites already-instantiated blocks on
    HDD. When the I_DIRTY_DATASYNC flag is not set, fdatasync should skip the
    journal writeout because this indicates only atime and/or mtime updates.

    The following patch takes the same approach as ext2's fsync code
    (ext2_sync_file).
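
    The skip condition can be sketched as a small predicate. The DEMO_* bits
    and demo_needs_journal_commit() are hypothetical names for this example;
    only the datasync test described above is modelled.

```c
#include <assert.h>

/* Hypothetical stand-ins for the inode dirty-state bits. */
enum {
    DEMO_I_DIRTY_SYNC     = 1u << 0, /* e.g. only timestamps changed */
    DEMO_I_DIRTY_DATASYNC = 1u << 1  /* changes that fdatasync must persist */
};

/*
 * Returns 1 if the journal has to be written out, 0 if it can be skipped.
 * fsync (datasync == 0) always commits; fdatasync commits only when the
 * DATASYNC bit is set.
 */
static int demo_needs_journal_commit(unsigned int i_state, int datasync)
{
    assert(datasync == 0 || datasync == 1);
    if (datasync && !(i_state & DEMO_I_DIRTY_DATASYNC))
        return 0;
    return 1;
}
```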

    I did a performance test using the sysbench.

    # sysbench --num-threads=128 --max-requests=50000 --test=fileio --file-total-size=128G \
        --file-test-mode=rndwr --file-fsync-mode=fdatasync run

    The result on ext3 was:

    -2.6.24
    Operations performed: 0 Read, 50080 Write, 59600 Other = 109680 Total
    Read 0b Written 782.5Mb Total transferred 782.5Mb (12.116Mb/sec)
    775.45 Requests/sec executed

    Test execution summary:
    total time: 64.5814s
    total number of events: 50080
    total time taken by event execution: 3713.9836
    per-request statistics:
    min: 0.0000s
    avg: 0.0742s
    max: 0.9375s
    approx. 95 percentile: 0.2901s

    Threads fairness:
    events (avg/stddev): 391.2500/23.26
    execution time (avg/stddev): 29.0155/1.99

    -2.6.24-patched
    Operations performed: 0 Read, 50009 Write, 61596 Other = 111605 Total
    Read 0b Written 781.39Mb Total transferred 781.39Mb (16.419Mb/sec)
    1050.83 Requests/sec executed

    Test execution summary:
    total time: 47.5900s
    total number of events: 50009
    total time taken by event execution: 2934.5768
    per-request statistics:
    min: 0.0000s
    avg: 0.0587s
    max: 0.8938s
    approx. 95 percentile: 0.1993s

    Threads fairness:
    events (avg/stddev): 390.6953/22.64
    execution time (avg/stddev): 22.9264/1.17

    Filesystem I/O throughput was improved.

    Signed-off-by: Hisashi Hifumi
    Acked-by: Jan Kara
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hisashi Hifumi
     
  • Update ext3 to handle quotaon on remount RW.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     

22 Apr, 2008

2 commits

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/juhl/trivial: (24 commits)
    DOC: A couple corrections and clarifications in USB doc.
    Generate a slightly more informative error msg for bad HZ
    fix typo "is" -> "if" in Makefile
    ext*: spelling fix prefered -> preferred
    DOCUMENTATION: Use newer DEFINE_SPINLOCK macro in docs.
    KEYS: Fix the comment to match the file name in rxrpc-type.h.
    RAID: remove trailing space from printk line
    DMA engine: typo fixes
    Remove unused MAX_NODES_SHIFT
    MAINTAINERS: Clarify access to OCFS2 development mailing list.
    V4L: Storage class should be before const qualifier (sn9c102)
    V4L: Storage class should be before const qualifier
    sonypi: Storage class should be before const qualifier
    intel_menlow: Storage class should be before const qualifier
    DVB: Storage class should be before const qualifier
    arm: Storage class should be before const qualifier
    ALSA: Storage class should be before const qualifier
    acpi: Storage class should be before const qualifier
    firmware_sample_driver.c: fix coding style
    MAINTAINERS: Add ati_remote2 driver
    ...

    Fixed up trivial conflicts in firmware_sample_driver.c

    Linus Torvalds
     
  • Spelling fix: prefered -> preferred

    Signed-off-by: Benoit Boissinot
    Signed-off-by: Jesper Juhl

    Benoit Boissinot
     

19 Apr, 2008

1 commit


16 Apr, 2008

1 commit

  • mb_cache_entry_alloc() was allocating cache entries with GFP_KERNEL. But
    filesystems are calling this function while holding xattr_sem so possible
    recursion into the fs violates locking ordering of xattr_sem and transaction
    start / i_mutex for ext2-4. Change mb_cache_entry_alloc() so that filesystems
    can specify desired gfp mask and use GFP_NOFS from all of them.

    Signed-off-by: Jan Kara
    Reported-by: Dave Jones
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     

20 Mar, 2008

1 commit

  • There are several places where we make allocations with GFP_KERNEL while under
    a transaction, which could lead to an assertion panic or lockup if under
    memory pressure. This patch switches these problem areas to use GFP_NOFS to
    keep these problems from happening.

    Signed-off-by: Josef Bacik
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Josef Bacik
     

05 Mar, 2008

1 commit

  • The "resize" option won't be noticed as it comes after the NULL option, so if
    you try to mount (or in this case remount) with that option it won't be
    recognized.

    Signed-off-by: Josef Bacik
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Josef Bacik
     

15 Feb, 2008

2 commits

  • * Add path_put() functions for releasing a reference to the dentry and
    vfsmount of a struct path in the right order

    * Switch from path_release(nd) to path_put(&nd->path)

    * Rename dput_path() to path_put_conditional()

    [akpm@linux-foundation.org: fix cifs]
    Signed-off-by: Jan Blunck
    Signed-off-by: Andreas Gruenbacher
    Acked-by: Christoph Hellwig
    Cc:
    Cc: Al Viro
    Cc: Steven French
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Blunck
     
  • This is the central patch of a cleanup series. In most cases there is no good
    reason why someone would want to use a dentry for itself. This series reflects
    that fact and embeds a struct path into nameidata.

    Together with the other patches of this series
    - it enforces the correct order of getting/releasing the reference count on
    <dentry,vfsmount> pairs
    - it prepares the VFS for stacking support since it is essential to have a
    struct path in every place where the stack can be traversed
    - it reduces the overall code size:

    without patch series:
    text data bss dec hex filename
    5321639 858418 715768 6895825 6938d1 vmlinux

    with patch series:
    text data bss dec hex filename
    5320026 858418 715768 6894212 693284 vmlinux

    This patch:

    Switch from nd->{dentry,mnt} to nd->path.{dentry,mnt} everywhere.

    [akpm@linux-foundation.org: coding-style fixes]
    [akpm@linux-foundation.org: fix cifs]
    [akpm@linux-foundation.org: fix smack]
    Signed-off-by: Jan Blunck
    Signed-off-by: Andreas Gruenbacher
    Acked-by: Christoph Hellwig
    Cc: Al Viro
    Cc: Casey Schaufler
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Blunck
     

09 Feb, 2008

1 commit

  • replace all:
    little_endian_variable = cpu_to_leX(leX_to_cpu(little_endian_variable) +
    expression_in_cpu_byteorder);
    with:
    leX_add_cpu(&little_endian_variable, expression_in_cpu_byteorder);
    sparse didn't generate any new warning with this patch
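
    A userspace sketch of the helper pattern for the 32-bit case. The demo_*
    functions are written for this example and convert through explicit
    bytes so they behave the same on any host; they are not the kernel
    implementations.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* A 32-bit value whose bytes are stored in little-endian order. */
typedef uint32_t demo_le32;

static demo_le32 demo_cpu_to_le32(uint32_t x)
{
    unsigned char b[4] = {
        (unsigned char)x, (unsigned char)(x >> 8),
        (unsigned char)(x >> 16), (unsigned char)(x >> 24)
    };
    demo_le32 out;

    memcpy(&out, b, sizeof(out));
    return out;
}

static uint32_t demo_le32_to_cpu(demo_le32 v)
{
    unsigned char b[4];

    memcpy(b, &v, sizeof(b));
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Adds a CPU-order value to a little-endian variable in place. */
static void demo_le32_add_cpu(demo_le32 *var, uint32_t val)
{
    assert(var != NULL);
    *var = demo_cpu_to_le32(demo_le32_to_cpu(*var) + val);
}
```

    The helper keeps the convert, add, convert-back sequence in one place,
    so call sites no longer repeat the variable name twice.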

    Signed-off-by: Marcin Slusarz
    Cc: Mark Fasheh
    Cc: David Chinner
    Cc: Timothy Shimmin
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Marcin Slusarz
     

08 Feb, 2008

1 commit

  • Stop the EXT3 filesystem from using iget() and read_inode(). Replace
    ext3_read_inode() with ext3_iget(), and call that instead of iget().
    ext3_iget() then uses iget_locked() directly and returns a proper error code
    instead of an inode in the event of an error.

    ext3_fill_super() returns any error incurred when getting the root inode
    instead of EINVAL.
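
    The error-propagation idea can be sketched in userspace as follows. This
    assumes an error-pointer convention in which small negative errno values
    are encoded in the returned pointer; all demo_* names are hypothetical
    and the lookup itself is faked.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static void *demo_err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

/* Decode the errno value from an error pointer. */
static long demo_ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

/* True if the pointer lies in the range reserved for error codes. */
static int demo_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)(-(intptr_t)DEMO_MAX_ERRNO);
}

struct demo_inode {
    unsigned long ino;
};

static struct demo_inode demo_root = { 2 };

/* Fake lookup: only inode 2 exists; anything else reports -EIO. */
static struct demo_inode *demo_iget(unsigned long ino)
{
    if (ino == demo_root.ino)
        return &demo_root;
    return demo_err_ptr(-EIO);
}

/* Passes on the specific error from the lookup instead of a fixed one. */
static int demo_fill_super(unsigned long root_ino)
{
    struct demo_inode *root = demo_iget(root_ino);

    if (demo_is_err(root))
        return (int)demo_ptr_err(root);
    return 0;
}
```

    The caller can then tell an I/O failure from other failures instead of
    always seeing the same error value.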

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: David Howells
    Acked-by: "Theodore Ts'o"
    Acked-by: Jan Kara
    Cc:
    Acked-by: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells