14 Dec, 2006

1 commit

  • Run this:

    #!/bin/sh
    for f in $(grep -Erl "\([^\)]*\) *k[cmz]alloc" *) ; do
        echo "De-casting $f..."
        perl -pi -e "s/ ?= ?\([^\)]*\) *(k[cmz]alloc) *\(/ = \1\(/" "$f"
    done

    And then go through and reinstate those cases where code is casting pointers
    to non-pointers.

    And then drop a few hunks which conflicted with outstanding work.

    Cc: Russell King, Ian Molton
    Cc: Mikael Starvik
    Cc: Yoshinori Sato
    Cc: Roman Zippel
    Cc: Geert Uytterhoeven
    Cc: Ralf Baechle
    Cc: Paul Mackerras
    Cc: Kyle McMartin
    Cc: Benjamin Herrenschmidt
    Cc: Martin Schwidefsky
    Cc: "David S. Miller"
    Cc: Jeff Dike
    Cc: Greg KH
    Cc: Jens Axboe
    Cc: Paul Fulghum
    Cc: Alan Cox
    Cc: Karsten Keil
    Cc: Mauro Carvalho Chehab
    Cc: Jeff Garzik
    Cc: James Bottomley
    Cc: Ian Kent
    Cc: Steven French
    Cc: David Woodhouse
    Cc: Neil Brown
    Cc: Jaroslav Kysela
    Cc: Takashi Iwai
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Robert P. J. Day
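
The de-casting relies on a C rule: a `void *` converts implicitly to any object pointer type, so a cast on the allocator's return value adds nothing. A minimal userspace sketch, assuming `malloc()` as a stand-in for `kmalloc()`; the `struct item` type and all function names are hypothetical:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical payload type, used only for illustration. */
struct item {
    int id;
    int count;
};

/* Before: redundant cast on the allocator's void * return value. */
struct item *item_alloc_cast(void)
{
    struct item *p = (struct item *)malloc(sizeof(*p));

    if (p) {
        p->id = 1;
        p->count = 0;
    }
    return p;
}

/* After: identical behavior; the conversion from void * is implicit in C. */
struct item *item_alloc_plain(void)
{
    struct item *p = malloc(sizeof(*p));

    if (p) {
        p->id = 1;
        p->count = 0;
    }
    return p;
}

/*
 * A pointer stored in a non-pointer type still needs its cast. This is
 * the kind of case the commit message says had to be reinstated by hand.
 */
uintptr_t buffer_alloc_as_integer(size_t len)
{
    return (uintptr_t)malloc(len);
}
```

The third function shows why the script's output could not be applied blindly: removing that cast would leave a pointer-to-integer conversion without a cast, which compilers diagnose.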
     

09 Dec, 2006

3 commits


08 Dec, 2006

6 commits

  • We add a save link for O_DIRECT writes to protect the i_size against the
    crashes before we actually finish the I/O. If we hit an -ENOSPC in
    aops->prepare_write(), we would do a truncate() to release the blocks which
    might have got initialized. Now the truncate would add another save link
    for the same inode causing a reiserfs panic for having multiple save links
    for the same inode.

    Signed-off-by: Vladimir V. Saveliev
    Signed-off-by: Amit Arora
    Signed-off-by: Suzuki K P
    Cc: Jeff Mahoney
    Cc: Chris Mason
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vladimir V. Saveliev
     
  • Replace kmalloc+memset with kzalloc

    Signed-off-by: Yan Burman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yan Burman
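
The kmalloc+memset to kzalloc replacement can be illustrated with a userspace analogy, assuming `malloc()`/`memset()` and `calloc()` as stand-ins for the kernel allocators; `struct record` and the function names are hypothetical:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical structure, used only for illustration. */
struct record {
    int id;
    char name[16];
};

/* Two-step pattern: allocate, then clear by hand. */
struct record *record_alloc_two_step(void)
{
    struct record *r = malloc(sizeof(*r));

    if (r)
        memset(r, 0, sizeof(*r));
    return r;
}

/* One-step pattern: the allocator returns zeroed memory. */
struct record *record_alloc_zeroed(void)
{
    return calloc(1, sizeof(struct record));
}
```

Both functions return an all-zero object; the one-step form is shorter and cannot forget the clearing step.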
     
  • It looks like reiserfs_prepare_file_region_for_write() is missing
    several flush_dcache_page() calls.

    Found with help from Dmitriy Monakhov

    [akpm@osdl.org: small speedup]
    Signed-off-by: Alexey Dobriyan
    Cc: Dmitriy Monakhov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • One of our test team members hit a reiserfs_panic while running fsstress
    tests on 2.6.19-rc1. The message looks like:

    REISERFS: panic(device Null superblock):
    reiserfs[5676]: assertion !(p->path_length != 1 ) failed at
    fs/reiserfs/stree.c:397:reiserfs_check_path: path not properly relsed.

    The backtrace looked like:

    kernel BUG in reiserfs_panic at fs/reiserfs/prints.c:361!
    .reiserfs_check_path+0x58/0x74
    .reiserfs_get_block+0x1444/0x1508
    .__block_prepare_write+0x1c8/0x558
    .block_prepare_write+0x34/0x64
    .reiserfs_prepare_write+0x118/0x1d0
    .generic_file_buffered_write+0x314/0x82c
    .__generic_file_aio_write_nolock+0x350/0x3e0
    .__generic_file_write_nolock+0x78/0xb0
    .generic_file_write+0x60/0xf0
    .reiserfs_file_write+0x198/0x2038
    .vfs_write+0xd0/0x1b4
    .sys_write+0x4c/0x8c
    syscall_exit+0x0/0x4

    Upon debugging I found that the restart_transaction was not releasing
    the path if the th->refcount was > 1.

    /*static*/
    int restart_transaction(struct reiserfs_transaction_handle *th,
    struct inode *inode, struct path *path)
    {
    [...]

    /* we cannot restart while nested */
    if (th->t_refcount > 1) { << [...]

    [...]i_sb)->j_next_async_flush = 1;

    -->> retval = restart_transaction(th, inode, &path); << [...]

    [...] refcount is > 1, the path is still valid. And,

    if (retval)
    goto failure;
    repeat =
    _allocate_block(th, block, inode,
    &allocated_block_nr, NULL, create);

    If the above allocate_block fails with NO_DISK_SPACE or QUOTA_EXCEEDED,
    we would have path which is not released.

    if (repeat != NO_DISK_SPACE && repeat != QUOTA_EXCEEDED) {
    goto research;
    }
    if (repeat == QUOTA_EXCEEDED)
    retval = -EDQUOT;
    else
    retval = -ENOSPC;
    goto failure;
    [...]

    failure:
    [...]
    reiserfs_check_path(&path); << Panics here !

    Attached here is a patch which could fix the issue.

    fix reiserfs/inode.c : restart_transaction() to release the path in all
    cases.

    The restart_transaction() doesn't release the path when the journal
    handle has a refcount > 1. This would trigger a reiserfs_panic() if we
    encounter an -ENOSPC / -EDQUOT in reiserfs_get_block().

    Signed-off-by: Suzuki K P
    Cc: "Vladimir V. Saveliev"
    Cc: Jeff Mahoney
    Acked-by: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Suzuki K P
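
The shape of the bug described above, an early return that skips a release, can be sketched in userspace. This is an illustration under stated assumptions, not the kernel code: `struct toy_path`, `struct toy_handle`, and every function name are hypothetical, and the "released" value of 1 is taken from the assertion text `!(p->path_length != 1)` quoted in the panic message.

```c
/* Hypothetical stand-ins for the search path and the journal handle. */
struct toy_path {
    int path_length;
};

struct toy_handle {
    int t_refcount;
};

/* Value the quoted assertion expects for a released path. */
#define TOY_PATH_RELEASED 1

static void toy_path_release(struct toy_path *p)
{
    p->path_length = TOY_PATH_RELEASED;
}

int toy_path_is_released(const struct toy_path *p)
{
    return p->path_length == TOY_PATH_RELEASED;
}

/* Buggy shape: the nested case returns before the path is released. */
int toy_restart_buggy(struct toy_handle *th, struct toy_path *path)
{
    if (th->t_refcount > 1)
        return 0;           /* path is still held here */
    toy_path_release(path);
    return 0;
}

/* Fixed shape: release the path in all cases, then check for nesting. */
int toy_restart_fixed(struct toy_handle *th, struct toy_path *path)
{
    toy_path_release(path);
    if (th->t_refcount > 1)
        return 0;
    return 0;
}
```

With a nested handle (`t_refcount` of 2), the buggy variant leaves the path held, so a later check equivalent to reiserfs_check_path() would fail; the fixed variant passes that check.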
     
  • Replace all uses of kmem_cache_t with struct kmem_cache.

    The patch was generated using the following script:

    #!/bin/sh
    #
    # Replace one string by another in all the kernel sources.
    #

    set -e

    for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do
        quilt add "$file"
        sed -e "1,\$s/$1/$2/g" "$file" >/tmp/$$
        mv /tmp/$$ "$file"
        quilt refresh
    done

    The script was run like this

    sh replace kmem_cache_t "struct kmem_cache"

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • SLAB_KERNEL is an alias of GFP_KERNEL.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

05 Dec, 2006

1 commit


03 Dec, 2006

1 commit


30 Nov, 2006

1 commit


26 Nov, 2006

1 commit


22 Nov, 2006

1 commit


04 Nov, 2006

1 commit

  • Callers after reiserfs_init_bitmap_cache() expect errval to contain -EINVAL
    until much later. If a condition fails before errval is reset later,
    reiserfs_fill_super() will mistakenly return 0, causing an Oops in
    do_add_mount(). This patch resets errval to -EINVAL after the call.

    I view this as a temporary fix and real error codes should be used
    throughout reiserfs_fill_super().

    Signed-off-by: Jeff Mahoney
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Mahoney
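
The errval problem above is a general pattern: a default error code is overwritten by a successful call's return value, so later failure paths return 0. A minimal sketch, assuming a hypothetical helper `toy_init_bitmap_cache()` that returns 0 on success; none of these names are the real reiserfs functions:

```c
#include <errno.h>

/* Hypothetical helper standing in for the cache setup; 0 means success. */
static int toy_init_bitmap_cache(void)
{
    return 0;
}

/* Buggy shape: after the successful call, errval is 0 on later failures. */
int toy_fill_super_buggy(int later_check_ok)
{
    int errval = -EINVAL;

    errval = toy_init_bitmap_cache();
    if (errval)
        goto error;
    if (!later_check_ok)
        goto error;         /* errval is still 0 here */
    return 0;
error:
    return errval;
}

/* Fixed shape: reset errval to -EINVAL right after the call. */
int toy_fill_super_fixed(int later_check_ok)
{
    int errval = -EINVAL;

    errval = toy_init_bitmap_cache();
    if (errval)
        goto error;
    errval = -EINVAL;
    if (!later_check_ok)
        goto error;
    return 0;
error:
    return errval;
}
```

The buggy variant reports success on a failed mount check, which is the situation the commit message ties to the Oops in do_add_mount().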
     

21 Oct, 2006

1 commit

  • Separate out the concept of "queue congestion" from "backing-dev congestion".
    Congestion is a backing-dev concept, not a queue concept.

    The blk_* congestion functions are retained, as wrappers around the core
    backing-dev congestion functions.

    This proper layering is needed so that NFS can cleanly use the congestion
    functions, and so that CONFIG_BLOCK=n actually links.

    Cc: "Thomas Maier"
    Cc: "Jens Axboe"
    Cc: Trond Myklebust
    Cc: David Howells
    Cc: Peter Osterlund
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     

12 Oct, 2006

1 commit

  • Make sure all dentry refs are released before calling kill_block_super(),
    so that generic_shutdown_super() can rely on its assumption that there are
    no external references and completely destroy the dentry tree.

    What was being done in the put_super() superblock op is now done in the
    kill_sb() filesystem op instead, prior to calling kill_block_super().

    Changes made in [try #2]:

    (*) reiserfs_kill_sb() now checks that the superblock FS info pointer is set
    before trying to dereference it.

    Signed-off-by: David Howells
    Cc: "Rafael J. Wysocki"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
     

08 Oct, 2006

1 commit


05 Oct, 2006

1 commit


04 Oct, 2006

3 commits


03 Oct, 2006

1 commit

  • These patches make the kernel pass 64-bit inode numbers internally when
    communicating to userspace, even on a 32-bit system. They are required
    because some filesystems have intrinsic 64-bit inode numbers: NFS3+ and XFS
    for example. The 64-bit inode numbers are then propagated to userspace
    automatically where the arch supports it.

    Problems have been seen with userspace (eg: ld.so) using the 64-bit inode
    number returned by stat64() or getdents64() to differentiate files, and
    failing because the 64-bit inode number space was compressed to 32-bits, and
    so overlaps occur.

    This patch:

    Make filldir_t take a 64-bit inode number and struct kstat carry a 64-bit
    inode number so that 64-bit inode numbers can be passed back to userspace.

    The stat functions then return the full 64-bit inode number where
    available and where possible. If it is not possible to represent the inode
    number supplied by the filesystem in the field provided by userspace, then
    error EOVERFLOW will be issued.

    Similarly, the getdents/readdir functions now pass the full 64-bit inode
    number to userspace where possible, returning EOVERFLOW instead when a
    directory entry is encountered that can't be properly represented.

    Note that this means that some inodes will not be stat'able on a 32-bit
    system with old libraries where they were before - but it does mean that
    there will be no ambiguity over what a 32-bit inode number refers to.

    Note similarly that directory scans may be cut short with an error on a
    32-bit system with old libraries where the scan would work before for the
    same reasons.

    It is judged unlikely that this situation will occur because modern glibc
    uses 64-bit capable versions of stat and getdents class functions
    exclusively, and because older systems are unlikely to encounter
    unrepresentable inode numbers anyway.

    [akpm: alpha build fix]
    Signed-off-by: David Howells
    Cc: Trond Myklebust
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
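
The EOVERFLOW rule described above amounts to a round-trip check: narrow the 64-bit inode number to the userspace field's width and fail if the value changed. A minimal sketch, assuming a 32-bit destination field; `toy_store_ino32()` is a hypothetical name, not a kernel function:

```c
#include <errno.h>
#include <stdint.h>

/*
 * Copy a 64-bit inode number into a 32-bit user-visible field.
 * Returns 0 on success, or -EOVERFLOW if the number cannot be
 * represented, in which case *out is left untouched.
 */
int toy_store_ino32(uint64_t ino, uint32_t *out)
{
    uint32_t narrowed = (uint32_t)ino;

    if ((uint64_t)narrowed != ino)
        return -EOVERFLOW;
    *out = narrowed;
    return 0;
}
```

Failing loudly instead of truncating is what removes the ambiguity the commit message mentions: two distinct 64-bit inode numbers can no longer collapse to the same 32-bit value.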
     

01 Oct, 2006

11 commits

  • Some filesystems, instead of decrementing i_nlink, simply zero it
    during an unlink operation. We need to catch these in addition to the
    decrement operations.

    Signed-off-by: Dave Hansen
    Acked-by: Christoph Hellwig
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Hansen
     
  • This is mostly included for parity with dec_nlink(), where we will have some
    more hooks. This one should stay pretty darn straightforward for now.

    Signed-off-by: Dave Hansen
    Acked-by: Christoph Hellwig
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Hansen
     
  • When a filesystem decrements i_nlink to zero, it means that a write must be
    performed in order to drop the inode from the filesystem.

    We're shortly going to have to keep filesystems from being remounted r/o
    between the time of this i_nlink decrement and the time that write occurs.

    So, add a little helper function to do the decrements. We'll tie into it in a
    bit to note when i_nlink hits zero.

    Signed-off-by: Dave Hansen
    Acked-by: Christoph Hellwig
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Hansen
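
The three i_nlink commits above (zeroing, incrementing, decrementing) share one idea: route every link-count change through a helper so that a hook can later observe the count reaching zero. A rough userspace sketch; `struct toy_inode`, the `pending_final_write` flag, and the function names are assumptions made for illustration, and the hook body is a guess at the kind of bookkeeping the messages anticipate:

```c
/* Hypothetical inode with a flag recording that the link count hit zero. */
struct toy_inode {
    unsigned int i_nlink;
    int pending_final_write;
};

/* Increment helper: kept trivial, present mainly for symmetry. */
void toy_inc_nlink(struct toy_inode *inode)
{
    inode->i_nlink++;
}

/* Decrement helper: the single place to notice the count reaching zero. */
void toy_drop_nlink(struct toy_inode *inode)
{
    inode->i_nlink--;
    if (inode->i_nlink == 0)
        inode->pending_final_write = 1;
}

/* Zeroing helper: covers filesystems that zero the count on unlink. */
void toy_clear_nlink(struct toy_inode *inode)
{
    inode->i_nlink = 0;
    inode->pending_final_write = 1;
}
```

Without the zeroing helper, a filesystem that assigns 0 directly would bypass the decrement hook, which is the gap the first of the three messages describes.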
     
  • This patch vectorizes aio_read() and aio_write() methods to prepare for
    collapsing all aio & vectored operations into one interface - which is
    aio_read()/aio_write().

    Signed-off-by: Badari Pulavarty
    Signed-off-by: Christoph Hellwig
    Cc: Michael Holzheu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Badari Pulavarty
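
The reason a vectored interface can absorb the single-buffer one is that a single buffer is just a vector of length one. A small userspace sketch of that relationship, using the POSIX `struct iovec`; `toy_read_vec()` and `toy_read()` are hypothetical names and do not reflect the kernel's aio_read()/aio_write() signatures:

```c
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Hypothetical vectored read: fill each segment in turn from src.
 * Returns the number of bytes copied.
 */
size_t toy_read_vec(const char *src, size_t src_len,
                    const struct iovec *iov, unsigned long nr_segs)
{
    size_t done = 0;
    unsigned long i;

    for (i = 0; i < nr_segs && done < src_len; i++) {
        size_t n = iov[i].iov_len;

        if (n > src_len - done)
            n = src_len - done;
        memcpy(iov[i].iov_base, src + done, n);
        done += n;
    }
    return done;
}

/* Single-buffer read expressed as a one-segment vectored read. */
size_t toy_read(const char *src, size_t src_len, char *buf, size_t len)
{
    struct iovec iov;

    iov.iov_base = buf;
    iov.iov_len = len;
    return toy_read_vec(src, src_len, &iov, 1);
}
```

Once every caller can be expressed through the vectored entry point, the separate single-buffer method carries no extra capability, which is the direction the commit message describes.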
     
  • When a file system becomes fragmented (using MythTV, for example), the
    bigalloc window searching ends up causing huge performance problems. In a
    file system presented by a user experiencing this bug, the file system was
    90% free, but no 32-block free windows existed on the entire file system.
    This causes the allocator to scan the entire file system for each 128k
    write before backing down to searching for individual blocks.

    In the end, finding a contiguous window for all the blocks in a write is an
    advantageous special case, but one that can be found naturally when such a
    window exists anyway.

    This patch removes the bigalloc window searching, and has been proven to
    fix the test case described above.

    Signed-off-by: Jeff Mahoney
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Mahoney
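
The cost described above comes from searching for a run of consecutive free blocks that does not exist. A simplified sketch of such a window search, assuming one byte per block instead of a packed bitmap; `toy_find_free_window()` is a hypothetical name and not the reiserfs allocator:

```c
#include <stddef.h>

/*
 * Return the index of the first run of `window` consecutive free blocks,
 * or -1 if no such run exists. used[i] != 0 means block i is allocated.
 * A failed search has to visit every block.
 */
long toy_find_free_window(const unsigned char *used, size_t nblocks,
                          size_t window)
{
    size_t run = 0;
    size_t i;

    if (window == 0)
        return -1;
    for (i = 0; i < nblocks; i++) {
        run = used[i] ? 0 : run + 1;
        if (run == window)
            return (long)(i + 1 - window);
    }
    return -1;
}
```

If every tenth block is allocated, the volume is 90% free yet the longest free run is 9 blocks: a search for a 32-block window scans everything and fails, while a search for a single block succeeds almost immediately.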
     
  • The other common disk-based file systems (I checked ext[23], xfs, jfs)
    check to ensure that opens of files > 2 GB fail unless O_LARGEFILE is
    specified. They check via generic_file_open or their own open routine.

    ReiserFS doesn't have an f_op->open defined, and as such, it's possible to
    open files > 2 GB without O_LARGEFILE.

    This patch adds the f_op->open member to conform with the expected
    behavior.

    Signed-off-by: Jeff Mahoney
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Mahoney
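
The large-file check described above can be sketched as a pure function. This is an illustration only: `TOY_O_LARGEFILE` is a made-up flag bit (the real O_LARGEFILE value is platform dependent), `TOY_MAX_NON_LFS` assumes the 2 GB limit of 2^31 - 1 bytes, and the choice of EOVERFLOW as the error is an assumption that may not match what the kernel of that era returned:

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical flag bit standing in for O_LARGEFILE. */
#define TOY_O_LARGEFILE 0x1u

/* Largest size that fits a signed 32-bit file offset. */
#define TOY_MAX_NON_LFS ((INT64_C(1) << 31) - 1)

/*
 * Refuse to open a file larger than the 32-bit offset limit unless the
 * caller asked for large-file support. Returns 0 if the open may proceed.
 */
int toy_open_check(int64_t file_size, unsigned int flags)
{
    if (!(flags & TOY_O_LARGEFILE) && file_size > TOY_MAX_NON_LFS)
        return -EOVERFLOW;
    return 0;
}
```

A filesystem with no open method never runs a check like this, which is how files over 2 GB could be opened without the flag.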
     
  • This is the patch the three previous ones have been leading up to.

    It changes the behavior of ReiserFS from loading and caching all the bitmaps
    as special, to treating the bitmaps like any other bit of metadata and just
    letting the system-wide caches figure out what to hang on to.

    Buffer heads are allocated on the fly, so there is no need to retain pointers
    to all of them. The caching of the metadata occurs when the data is read and
    updated, and is considered invalid and uncached until then.

    I needed to remove the vs-4040 check for performing a duplicate operation on a
    particular bit. The reason is that while the other sites for working with
    bitmaps are allowed to schedule, is_reusable() is called from do_balance(),
    which will panic if a schedule occurs in certain places.

    The benefit of on-demand bitmaps clearly outweighs a sanity check that depends
    on a compile-time option that is discouraged.

    [akpm@osdl.org: warning fix]
    Signed-off-by: Jeff Mahoney
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Mahoney
     
  • This patch moves the bitmap loading code from super.c to bitmap.c

    The code is also restructured somewhat. The only difference between new
    format bitmaps and old format bitmaps is where they are. That's a two liner
    before loading the block to use the correct one. There's no need for an
    entirely separate code path.

    The load path is generally the same, with the pattern being to throw out a
    bunch of requests and then wait for them, then cache the metadata from the
    contents.

    Again, like the previous patches, the purpose is to set up for later ones.

    Update: There was a bug in the previously posted version of this that resulted
    in corruption. The problem was that bitmap 0 on new format file systems must
    be treated specially, and wasn't. A stupid bug with an easy fix.

    This is hopefully the last fix for the disaster that is the reiserfs bitmap
    patch set.

    If a bitmap block was full, first_zero_hint would end up at zero since it
    would never be changed from its zeroed-out value. This just sets it
    beyond the end of the bitmap block. If any bits are freed, it will be
    reset to a valid bit. When info->free_count = 0, then we already know it's
    full.

    Signed-off-by: Jeff Mahoney
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Mahoney
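
The first_zero_hint fix in the last paragraph above can be sketched as follows: start the hint beyond the end of the block and only move it when a zero bit is actually found. The structure and function names are hypothetical, and the bit ordering (least significant bit first within each byte) is an assumption made for the example:

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical cached metadata for one bitmap block. */
struct toy_bitmap_info {
    size_t first_zero_hint;
    size_t free_count;
};

/*
 * Scan a bitmap block and cache its metadata. For a completely full
 * block the hint stays at nbits, which is beyond the last valid bit,
 * instead of defaulting to 0.
 */
void toy_cache_bitmap_metadata(const unsigned char *bits, size_t nbits,
                               struct toy_bitmap_info *info)
{
    size_t i;

    info->first_zero_hint = nbits;
    info->free_count = 0;
    for (i = 0; i < nbits; i++) {
        int set = (bits[i / CHAR_BIT] >> (i % CHAR_BIT)) & 1;

        if (!set) {
            if (info->free_count == 0)
                info->first_zero_hint = i;
            info->free_count++;
        }
    }
}
```

A hint of 0 on a full block would point the allocator at a bit that is not free; a hint equal to nbits cannot be mistaken for a valid position, and a free_count of 0 already says the block is full.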
     
  • Similar to the SB_JOURNAL cleanup that was accepted a while ago, this patch
    uses a temporary variable for buffer head references from the bitmap info
    array.

    This makes the code much more readable in some areas.

    It also uses proper reference counting, doing a get_bh() after using the
    pointer from the array and brelse()'ing it later. This may seem silly, but a
    later patch will replace the simple temporary variables with an actual read,
    so the reference freeing will be used then.

    Signed-off-by: Jeff Mahoney
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Mahoney
     
  • There is a check in is_reusable to determine if a particular block is a bitmap
    block. It verifies this by going through the array of bitmap block buffer
    heads and comparing the block number to each one.

    Bitmap blocks are at defined locations on the disk in both old and current
    formats. Simply checking against the known good values is enough.

    This is a trivial optimization for a non-production codepath, but this is the
    first in a series of patches that will ultimately remove the buffer heads from
    that array.

    Signed-off-by: Jeff Mahoney
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Mahoney
     
  • Move the ReiserFS device ioctl compat stuff from fs/compat_ioctl.c to the
    ReiserFS driver so that the ReiserFS header file doesn't need to be included.

    Signed-Off-By: David Howells
    Signed-off-by: Jens Axboe

    David Howells
     

30 Sep, 2006

5 commits

  • Shrink reiserfs inode more (by 8 bytes) for ACL non-users:

    -reiser_inode_cache 344 11
    +reiser_inode_cache 336 11

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • Shrink reiserfs inode by 12 bytes for xattr non-users (me).

    -reiser_inode_cache 356 11
    +reiser_inode_cache 344 11

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • ReiserFS does periodic cleanup of old transactions in order to limit the
    length of time a journal replay may take after a crash. Sometimes, writing
    metadata from an old (already committed) transaction may require committing
    a newer transaction, which also requires writing all data=ordered buffers.
    This can cause very long stalls on journal_begin.

    This patch makes sure new transactions will not need to be committed before
    trying a periodic reclaim of an old transaction. It is low risk because if
    a bad decision is made, it just means a slightly longer journal replay
    after a crash.

    Signed-off-by: Chris Mason
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chris Mason
     
  • Make sure that reiserfs_fsync only triggers barriers when mounted with -o
    barrier=flush

    Signed-off-by: Chris Mason
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chris Mason
     
  • Only compile with -O1 if the (very old) compiler is broken. We use
    reiserfs a lot since SLES9 on ppc64, and it was never seen with gcc33.
    Assume the broken gcc is gcc-3.4 or older.

    Signed-off-by: Olaf Hering
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Olaf Hering