04 Oct, 2008

1 commit

  • Any block based fs (this patch includes ext3) just has to declare its own
    fiemap() function and then call this generic function with its own
    get_block_t. This works well for block based filesystems that will map
    multiple contiguous blocks at one time, but will work for filesystems that
    only map one block at a time, you will just end up with an "extent" for each
    block. One gotcha is this will not play nicely where there is hole+data
    after the EOF. This function will assume its hit the end of the data as soon
    as it hits a hole after the EOF, so if there is any data past that it will
    not pick that up. AFAIK no block based fs does this anyway, but its in the
    comments of the function anyway just in case.

    Signed-off-by: Josef Bacik
    Signed-off-by: Mark Fasheh
    Signed-off-by: "Theodore Ts'o"
    Cc: linux-fsdevel@vger.kernel.org

    Josef Bacik
     

29 Jul, 2008

1 commit

  • When we read some part of a file through pagecache, if there is a
    pagecache of corresponding index but this page is not uptodate, read IO
    is issued and this page will be uptodate.

    I think this is good for pagesize == blocksize environment but there is
    room for improvement on pagesize != blocksize environment. Because in
    this case a page can have multiple buffers and even if a page is not
    uptodate, some buffers can be uptodate.

    So I suggest that when all buffers which correspond to a part of a file
    that we want to read are uptodate, use this pagecache and copy data from
    this pagecache to user buffer even if a page is not uptodate. This can
    reduce read IO and improve system throughput.

    I wrote a benchmark program and got result number with this program.

    This benchmark do:

    1: mount and open a test file.

    2: create a 512MB file.

    3: close a file and umount.

    4: mount and again open a test file.

    5: pwrite randomly 300000 times on a test file. offset is aligned
    by IO size(1024bytes).

    6: measure time of preading randomly 100000 times on a test file.

    The result was:
    2.6.26
    330 sec

    2.6.26-patched
    226 sec

    Arch:i386
    Filesystem:ext3
    Blocksize:1024 bytes
    Memory: 1GB

    On ext3/4, a file is written through buffer/block. So random read/write
    mixed workloads or random read after random write workloads are optimized
    with this patch under pagesize != blocksize environment. This test result
    showed this.

    The benchmark program is as follows:

    #include
    #include
    #include
    #include
    #include
    #include
    #include
    #include
    #include

    #define LEN 1024
    #define LOOP 1024*512 /* 512MB */

    main(void)
    {
    unsigned long i, offset, filesize;
    int fd;
    char buf[LEN];
    time_t t1, t2;

    if (mount("/dev/sda1", "/root/test1/", "ext3", 0, 0) < 0) {
    perror("cannot mount\n");
    exit(1);
    }
    memset(buf, 0, LEN);
    fd = open("/root/test1/testfile", O_CREAT|O_RDWR|O_TRUNC);
    if (fd < 0) {
    perror("cannot open file\n");
    exit(1);
    }
    for (i = 0; i < LOOP; i++)
    write(fd, buf, LEN);
    close(fd);
    if (umount("/root/test1/") < 0) {
    perror("cannot umount\n");
    exit(1);
    }
    if (mount("/dev/sda1", "/root/test1/", "ext3", 0, 0) < 0) {
    perror("cannot mount\n");
    exit(1);
    }
    fd = open("/root/test1/testfile", O_RDWR);
    if (fd < 0) {
    perror("cannot open file\n");
    exit(1);
    }

    filesize = LEN * LOOP;
    for (i = 0; i < 300000; i++){
    offset = (random() % filesize) & (~(LEN - 1));
    pwrite(fd, buf, LEN, offset);
    }
    printf("start test\n");
    time(&t1);
    for (i = 0; i < 100000; i++){
    offset = (random() % filesize) & (~(LEN - 1));
    pread(fd, buf, LEN, offset);
    }
    time(&t2);
    printf("%ld sec\n", t2-t1);
    close(fd);
    if (umount("/root/test1/") < 0) {
    perror("cannot umount\n");
    exit(1);
    }
    }

    Signed-off-by: Hisashi Hifumi
    Cc: Nick Piggin
    Cc: Christoph Hellwig
    Cc: Jan Kara
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hisashi Hifumi
     

27 Jul, 2008

2 commits

  • * kill nameidata * argument; map the 3 bits in ->flags anybody cares
    about to new MAY_... ones and pass with the mask.
    * kill redundant gfs2_iop_permission()
    * sanitize ecryptfs_permission()
    * fix remaining places where ->permission() instances might barf on new
    MAY_... found in mask.

    The obvious next target in that direction is permission(9)

    folded fix for nfs_permission() breakage from Miklos Szeredi

    Signed-off-by: Al Viro

    Al Viro
     
  • Kmem cache passed to constructor is only needed for constructors that are
    themselves multiplexeres. Nobody uses this "feature", nor does anybody uses
    passed kmem cache in non-trivial way, so pass only pointer to object.

    Non-trivial places are:
    arch/powerpc/mm/init_64.c
    arch/powerpc/mm/hugetlbpage.c

    This is flag day, yes.

    Signed-off-by: Alexey Dobriyan
    Acked-by: Pekka Enberg
    Acked-by: Christoph Lameter
    Cc: Jon Tollefson
    Cc: Nick Piggin
    Cc: Matt Mackall
    [akpm@linux-foundation.org: fix arch/powerpc/mm/hugetlbpage.c]
    [akpm@linux-foundation.org: fix mm/slab.c]
    [akpm@linux-foundation.org: fix ubifs]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

26 Jul, 2008

2 commits

  • Move declarations of some macros, which should be in fact functions to
    quotaops.h. This way they can be later converted to inline functions
    because we can now use declarations from quota.h. Also add necessary
    includes of quotaops.h to a few files.

    [akpm@linux-foundation.org: fix JFS build]
    [akpm@linux-foundation.org: fix UFS build]
    [vegard.nossum@gmail.com: fix QUOTA=n build]
    Signed-off-by: Jan Kara
    Cc: Vegard Nossum
    Cc: Arjen Pool
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     
  • remove the definitions of macros:
    XATTR_TRUSTED_PREFIX
    XATTR_USER_PREFIX
    since they are defined in linux/xattr.h

    Signed-off-by: Shen Feng
    Signed-off-by: Mingming Cao
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Shen Feng
     

28 Apr, 2008

10 commits

  • If the block allocator gets blocks out of system zone ext2 calls ext2_error.
    But if the file system is mounted with errors=continue retry block allocation.
    We need to mark the system zone blocks as in use to make sure retry don't
    pick them again

    System zone is the block range mapping block bitmap, inode bitmap and inode
    table.

    [akpm@linux-foundation.org: fix typo in comment]
    Signed-off-by: Aneesh Kumar K.V
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Aneesh Kumar K.V
     
  • __FUNCTION__ is gcc-specific, use __func__

    Signed-off-by: Harvey Harrison
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Harvey Harrison
     
  • if (...) BUG(); should be replaced with BUG_ON(...) when the test has no
    side-effects to allow a definition of BUG_ON that drops the code completely.

    The semantic patch that makes this change is as follows:
    (http://www.emn.fr/x-info/coccinelle/)

    //
    @ disable unlikely @ expression E,f; @@

    (
    if () { BUG(); }
    |
    - if (unlikely(E)) { BUG(); }
    + BUG_ON(E);
    )

    @@ expression E,f; @@

    (
    if () { BUG(); }
    |
    - if (E) { BUG(); }
    + BUG_ON(E);
    )
    //

    Signed-off-by: Julia Lawall
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Julia Lawall
     
  • Use ext2_fsblk_t type for filesystem-wide blocks number

    Signed-off-by: Akinobu Mita
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
     
  • Use ext2_group_first_block_no() and assign the return values to
    ext2_fsblk_t variables.

    Signed-off-by: Akinobu Mita
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
     
  • Improve ext2_readdir() return value for ext2_get_page() failure by using the
    actual result of ext2_get_page().

    Signed-off-by: Akinobu Mita
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
     
  • Convert byte order of constant instead of variable which can be done at
    compile time (vs run time).

    Signed-off-by: Marcin Slusarz
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Marcin Slusarz
     
  • replace all:
    little_endian_variable = cpu_to_leX(leX_to_cpu(little_endian_variable) +
    expression_in_cpu_byteorder);
    with:
    leX_add_cpu(&little_endian_variable, expression_in_cpu_byteorder);
    generated with semantic patch

    Signed-off-by: Marcin Slusarz
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Marcin Slusarz
     
  • Convert XIP to support non-struct page backed memory, using VM_MIXEDMAP for
    the user mappings.

    This requires the get_xip_page API to be changed to an address based one.
    Improve the API layering a little bit too, while we're here.

    This is required in order to support XIP filesystems on memory that isn't
    backed with struct page (but memory with struct page is still supported too).

    Signed-off-by: Nick Piggin
    Acked-by: Carsten Otte
    Cc: Jared Hulbert
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     
  • Alter the block device ->direct_access() API to work with the new
    get_xip_mem() API (that requires both kaddr and pfn are returned).

    Some architectures will not do the right thing in their virt_to_page() for use
    by XIP (to translate from the kernel virtual address returned by
    direct_access(), to a user mappable pfn in XIP's page fault handler.

    However, we can't switch it to just return the pfn and not the kaddr, because
    we have no good way to get a kva from a pfn, and XIP requires the kva for its
    read(2) and write(2) handlers. So we have to return both.

    Signed-off-by: Jared Hulbert
    Signed-off-by: Nick Piggin
    Cc: Carsten Otte
    Cc: Heiko Carstens
    Cc: linux-mm@kvack.org
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jared Hulbert
     

22 Apr, 2008

2 commits

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/juhl/trivial: (24 commits)
    DOC: A couple corrections and clarifications in USB doc.
    Generate a slightly more informative error msg for bad HZ
    fix typo "is" -> "if" in Makefile
    ext*: spelling fix prefered -> preferred
    DOCUMENTATION: Use newer DEFINE_SPINLOCK macro in docs.
    KEYS: Fix the comment to match the file name in rxrpc-type.h.
    RAID: remove trailing space from printk line
    DMA engine: typo fixes
    Remove unused MAX_NODES_SHIFT
    MAINTAINERS: Clarify access to OCFS2 development mailing list.
    V4L: Storage class should be before const qualifier (sn9c102)
    V4L: Storage class should be before const qualifier
    sonypi: Storage class should be before const qualifier
    intel_menlow: Storage class should be before const qualifier
    DVB: Storage class should be before const qualifier
    arm: Storage class should be before const qualifier
    ALSA: Storage class should be before const qualifier
    acpi: Storage class should be before const qualifier
    firmware_sample_driver.c: fix coding style
    MAINTAINERS: Add ati_remote2 driver
    ...

    Fixed up trivial conflicts in firmware_sample_driver.c

    Linus Torvalds
     
  • Spelling fix: prefered -> preferred

    Signed-off-by: Benoit Boissinot
    Signed-off-by: Jesper Juhl

    Benoit Boissinot
     

19 Apr, 2008

1 commit


16 Apr, 2008

1 commit

  • mb_cache_entry_alloc() was allocating cache entries with GFP_KERNEL. But
    filesystems are calling this function while holding xattr_sem so possible
    recursion into the fs violates locking ordering of xattr_sem and transaction
    start / i_mutex for ext2-4. Change mb_cache_entry_alloc() so that filesystems
    can specify desired gfp mask and use GFP_NOFS from all of them.

    Signed-off-by: Jan Kara
    Reported-by: Dave Jones
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     

09 Feb, 2008

2 commits


08 Feb, 2008

1 commit

  • Stop the EXT2 filesystem from using iget() and read_inode(). Replace
    ext2_read_inode() with ext2_iget(), and call that instead of iget().
    ext2_iget() then uses iget_locked() directly and returns a proper error code
    instead of an inode in the event of an error.

    ext2_fill_super() returns any error incurred when getting the root inode
    instead of EINVAL.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: David Howells
    Acked-by: "Theodore Ts'o"
    Cc:
    Acked-by: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
     

07 Feb, 2008

10 commits


29 Jan, 2008

1 commit


30 Nov, 2007

1 commit

  • In commit a686cd898bd999fd026a51e90fb0a3410d258ddb:

    "Val's cross-port of the ext3 reservations code into ext2."

    include/linux/ext2_fs.h got a new function whose return value is only
    defined if __KERNEL__ is defined. Putting #ifdef __KERNEL__ around the
    function seems to help, patch below.

    Signed-off-by: Eric Sandeen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tobias Poschwatta
     

15 Nov, 2007

1 commit

  • Forbid user from changing file flags on quota files. User has no bussiness
    in playing with these flags when quota is on. Furthermore there is a
    remote possibility of deadlock due to a lock inversion between quota file's
    i_mutex and transaction's start (i_mutex for quota file is locked only when
    trasaction is started in quota operations) in ext3 and ext4.

    Signed-off-by: Jan Kara
    Cc: LIOU Payphone
    Cc:
    Acked-by: Dave Kleikamp
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     

14 Nov, 2007

1 commit

  • This reverts commit 7c9e69faa28027913ee059c285a5ea8382e24b5d, fixing up
    conflicts in fs/ext4/balloc.c manually.

    The cost of doing the bitmap validation on each lookup - even when the
    bitmap is cached - is absolutely prohibitive. We could, and probably
    should, do it only when adding the bitmap to the buffer cache. However,
    right now we are better off just reverting it.

    Peter Zijlstra measured the cost of this extra validation as a 85%
    decrease in cached iozone, and while I had a patch that took it down to
    just 17% by not being _quite_ so stupid in the validation, it was still
    a big slowdown that could have been avoided by just doing it right.

    Cc: Peter Zijlstra
    Cc: Andrew Morton
    Cc: Aneesh Kumar
    Cc: Andreas Dilger
    Cc: Mingming Cao
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

22 Oct, 2007

3 commits

  • Now that nfsd has stopped writing to the find_exported_dentry member we an
    mark the export_operations const

    Signed-off-by: Christoph Hellwig
    Cc: Neil Brown
    Cc: "J. Bruce Fields"
    Cc:
    Cc: Dave Kleikamp
    Cc: Anton Altaparmakov
    Cc: David Chinner
    Cc: Timothy Shimmin
    Cc: OGAWA Hirofumi
    Cc: Hugh Dickins
    Cc: Chris Mason
    Cc: Jeff Mahoney
    Cc: "Vladimir V. Saveliev"
    Cc: Steven Whitehouse
    Cc: Mark Fasheh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     
  • Trivial switch over to the new generic helpers.

    Signed-off-by: Christoph Hellwig
    Cc: Neil Brown
    Cc: "J. Bruce Fields"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     
  • With 64KB blocksize, a directory entry can have size 64KB which does not
    fit into 16 bits we have for entry length. So we store 0xffff instead and
    convert the value when read from / written to disk.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Jan Kara
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara