08 Apr, 2014

1 commit

  • Pull ext3 improvements, cleanups, reiserfs fix from Jan Kara:
    "various cleanups for ext2, ext3, udf, isofs, a documentation update
    for quota, and a fix of a race in reiserfs readdir implementation"

    * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
    reiserfs: fix race in readdir
    ext2: acl: remove unneeded include of linux/capability.h
    ext3: explicitly remove inode from orphan list after failed direct io
    fs/isofs/inode.c add __init to init_inodecache()
    ext3: Speedup WB_SYNC_ALL pass
    fs/quota/Kconfig: Update filesystems
    ext3: Update outdated comment before ext3_ordered_writepage()
    ext3: Update PF_MEMALLOC handling in ext3_write_inode()
    ext2/3: use prandom_u32() instead of get_random_bytes()
    ext3: remove an unneeded check in ext3_new_blocks()
    ext3: remove unneeded check in ext3_ordered_writepage()
    fs: Mark function as static in ext3/xattr_security.c
    fs: Mark function as static in ext3/dir.c
    fs: Mark function as static in ext2/xattr_security.c
    ext3: Add __init macro to init_inodecache
    ext2: Add __init macro to init_inodecache
    udf: Add __init macro to init_inodecache
    fs: udf: parse_options: blocksize check

    Linus Torvalds
     

05 Apr, 2014

1 commit

  • Pull ext4 updates from Ted Ts'o:
    "Major changes for 3.14 include support for the newly added ZERO_RANGE
    and COLLAPSE_RANGE fallocate operations, and scalability improvements
    in the jbd2 layer and in xattr handling when the extended attributes
    spill over into an external block.

    Other than that, the usual clean ups and minor bug fixes"

    * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (42 commits)
    ext4: fix premature freeing of partial clusters split across leaf blocks
    ext4: remove unneeded test of ret variable
    ext4: fix comment typo
    ext4: make ext4_block_zero_page_range static
    ext4: atomically set inode->i_flags in ext4_set_inode_flags()
    ext4: optimize Hurd tests when reading/writing inodes
    ext4: kill i_version support for Hurd-castrated file systems
    ext4: each filesystem creates and uses its own mb_cache
    fs/mbcache.c: doucple the locking of local from global data
    fs/mbcache.c: change block and index hash chain to hlist_bl_node
    ext4: Introduce FALLOC_FL_ZERO_RANGE flag for fallocate
    ext4: refactor ext4_fallocate code
    ext4: Update inode i_size after the preallocation
    ext4: fix partial cluster handling for bigalloc file systems
    ext4: delete path dealloc code in ext4_ext_handle_uninitialized_extents
    ext4: only call sync_filesystm() when remounting read-only
    fs: push sync_filesystem() down to the file system's remount_fs()
    jbd2: improve error messages for inconsistent journal heads
    jbd2: minimize region locked by j_list_lock in jbd2_journal_forget()
    jbd2: minimize region locked by j_list_lock in journal_get_create_access()
    ...

    Linus Torvalds
     

04 Apr, 2014

1 commit

  • Reclaim will be leaving shadow entries in the page cache radix tree upon
    evicting the real page. As those pages are found from the LRU, an
    iput() can lead to the inode being freed concurrently. At this point,
    reclaim must no longer install shadow pages because the inode freeing
    code needs to ensure the page tree is really empty.

    Add an address_space flag, AS_EXITING, that the inode freeing code sets
    under the tree lock before doing the final truncate. Reclaim will check
    for this flag before installing shadow pages.

    Signed-off-by: Johannes Weiner
    Reviewed-by: Rik van Riel
    Reviewed-by: Minchan Kim
    Cc: Andrea Arcangeli
    Cc: Bob Liu
    Cc: Christoph Hellwig
    Cc: Dave Chinner
    Cc: Greg Thelen
    Cc: Hugh Dickins
    Cc: Jan Kara
    Cc: KOSAKI Motohiro
    Cc: Luigi Semenzato
    Cc: Mel Gorman
    Cc: Metin Doslu
    Cc: Michel Lespinasse
    Cc: Ozgun Erdogan
    Cc: Peter Zijlstra
    Cc: Roman Gushchin
    Cc: Ryan Mallon
    Cc: Tejun Heo
    Cc: Vlastimil Babka
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
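
The flag handshake described in this commit can be modelled in userspace. The following is a minimal sketch, not kernel code: `struct mapping_model`, `MODEL_AS_EXITING`, and both functions are hypothetical stand-ins, and the tree lock the commit mentions is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the AS_EXITING address_space flag. */
#define MODEL_AS_EXITING (1u << 0)

/* Hypothetical, heavily simplified model of an address_space. */
struct mapping_model {
    unsigned int flags;
    unsigned long shadow_entries; /* shadow entries left in the tree */
};

/* Inode-freeing side: set the flag before the final truncate. */
void model_mapping_set_exiting(struct mapping_model *m)
{
    assert(m);
    m->flags |= MODEL_AS_EXITING;
}

/*
 * Reclaim side: install a shadow entry unless the mapping is exiting.
 * Returns true if a shadow entry was installed.
 */
bool model_reclaim_evict(struct mapping_model *m)
{
    assert(m);
    if (m->flags & MODEL_AS_EXITING)
        return false; /* leave the slot empty for the final truncate */
    m->shadow_entries++;
    return true;
}
```

Once the flag is set, further evictions in this model leave no shadow entries behind, which is the property the inode freeing code relies on.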
     

18 Mar, 2014

1 commit


13 Mar, 2014

2 commits

  • Previously, the no-op "mount -o remount /dev/xxx" operation when the
    file system is already mounted read-write caused an implied,
    unconditional syncfs(). This seems pretty stupid, and it's certainly
    not documented or guaranteed to do this, nor is it particularly useful,
    except in the case where the file system was mounted rw and is getting
    remounted read-only.

    However, it's possible that there might be some file systems that are
    actually depending on this behavior. In most file systems, it's
    probably fine to only call sync_filesystem() when transitioning from
    read-write to read-only, and there are some file systems where this is
    not needed at all (for example, for a pseudo-filesystem or something
    like romfs).

    Signed-off-by: "Theodore Ts'o"
    Cc: linux-fsdevel@vger.kernel.org
    Cc: Christoph Hellwig
    Cc: Artem Bityutskiy
    Cc: Adrian Hunter
    Cc: Evgeniy Dushistov
    Cc: Jan Kara
    Cc: OGAWA Hirofumi
    Cc: Anders Larsen
    Cc: Phillip Lougher
    Cc: Kees Cook
    Cc: Mikulas Patocka
    Cc: Petr Vandrovec
    Cc: xfs@oss.sgi.com
    Cc: linux-btrfs@vger.kernel.org
    Cc: linux-cifs@vger.kernel.org
    Cc: samba-technical@lists.samba.org
    Cc: codalist@coda.cs.cmu.edu
    Cc: linux-ext4@vger.kernel.org
    Cc: linux-f2fs-devel@lists.sourceforge.net
    Cc: fuse-devel@lists.sourceforge.net
    Cc: cluster-devel@redhat.com
    Cc: linux-mtd@lists.infradead.org
    Cc: jfs-discussion@lists.sourceforge.net
    Cc: linux-nfs@vger.kernel.org
    Cc: linux-nilfs@vger.kernel.org
    Cc: linux-ntfs-dev@lists.sourceforge.net
    Cc: ocfs2-devel@oss.oracle.com
    Cc: reiserfs-devel@vger.kernel.org

    Theodore Ts'o
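
The rule suggested in this commit message (sync only on a read-write to read-only transition) can be written as a small predicate. This is a sketch under assumptions: `MODEL_MS_RDONLY` and `model_remount_needs_sync()` are hypothetical names, and a real remount_fs() implementation may have further reasons to sync.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's MS_RDONLY mount flag. */
#define MODEL_MS_RDONLY 1UL

/*
 * Returns true only when a file system that is currently mounted
 * read-write is being remounted read-only.
 */
bool model_remount_needs_sync(unsigned long old_flags, unsigned long new_flags)
{
    bool was_rw = (old_flags & MODEL_MS_RDONLY) == 0;
    bool will_be_ro = (new_flags & MODEL_MS_RDONLY) != 0;

    return was_rw && will_be_ro;
}
```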
     
  • When doing filesystem wide sync, there's no need to force transaction
    commit separately for each inode because ext3_sync_fs() takes care of
    forcing commit at the end. Most of the time this slowness doesn't
    manifest because previous WB_SYNC_NONE writeback doesn't leave much to
    write, but when there are processes aggressively creating new files and
    several filesystems to sync, the sync slowness can be noticeable. With
    the following test script, sync(1) takes around 6 minutes when there are
    two ext3 filesystems mounted on a standard SATA drive. After this patch,
    sync is about twice as fast in the default data=ordered mode. For
    data=writeback mode the speedup is even bigger.

    function run_writers
    {
        for (( i = 0; i < 10; i++ )); do
            mkdir $1/dir$i
            for (( j = 0; j < 40000; j++ )); do
                dd if=/dev/zero of=$1/dir$i/$j bs=4k count=4 &>/dev/null
            done &
        done
    }

    for dir in "$@"; do
        run_writers $dir
    done

    sleep 40
    time sync

    Signed-off-by: Jan Kara

    Jan Kara
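
The idea of skipping the per-inode commit during a filesystem-wide sync can be sketched as a decision function. This is a userspace model only: the `fs_wide_sync` field is a hypothetical marker for "the caller will force one commit at the end", and the commit message does not say how the kernel actually detects that case.

```c
#include <assert.h>
#include <stdbool.h>

enum model_sync_mode { MODEL_WB_SYNC_NONE, MODEL_WB_SYNC_ALL };

/* Hypothetical, simplified writeback request. */
struct model_wbc {
    enum model_sync_mode sync_mode;
    bool fs_wide_sync; /* assumption: set when the whole fs is being synced */
};

/*
 * Decide whether writing one inode should force a journal commit.
 * WB_SYNC_NONE never forces one; a filesystem-wide sync defers to the
 * single commit forced at the end.
 */
bool model_write_inode_forces_commit(const struct model_wbc *wbc)
{
    assert(wbc);
    if (wbc->sync_mode != MODEL_WB_SYNC_ALL)
        return false;
    return !wbc->fs_wide_sync;
}
```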
     

04 Mar, 2014

3 commits

  • The comment is heavily outdated. The recursion into the filesystem isn't
    possible because we use GFP_NOFS for our allocations, the issue about
    block_write_full_page() dirtying tail page is long resolved as well
    (that function doesn't dirty buffers at all), and finally we don't start
    a transaction if all blocks are already allocated and mapped.

    Signed-off-by: Jan Kara

    Jan Kara
     
  • The special handling of PF_MEMALLOC callers in ext3_write_inode()
    shouldn't be necessary as there shouldn't be any. Warn about it. Also
    update comment before the function as it seems somewhat outdated.

    Signed-off-by: Jan Kara

    Jan Kara
     
  • Many of the uses of get_random_bytes() do not actually need
    cryptographically secure random numbers. Replace those uses with a
    call to prandom_u32(), which is faster and which doesn't consume
    entropy from the /dev/random driver.

    Commit dd1f723bf56bd96efc9d90e9e60dc511c79de48f did that for ext4;
    this patch does the same for ext2/3.

    Signed-off-by: Zhang Zhen
    Signed-off-by: Jan Kara

    ZhangZhen
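
To illustrate what "not cryptographically secure but fast" means, here is a userspace sketch of a tiny deterministic generator. It uses the well-known xorshift32 recurrence, which is a swapped-in technique for illustration and is not the algorithm behind prandom_u32(); `model_prandom_u32()` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/*
 * xorshift32 step: cheap, deterministic, and predictable from its state,
 * so it must not be used where secrecy matters. It draws nothing from an
 * entropy pool. The state must be non-zero.
 */
uint32_t model_prandom_u32(uint32_t *state)
{
    uint32_t x;

    assert(state && *state != 0);
    x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}
```

A generator like this is adequate for spreading allocations around, which is the kind of use the commit describes.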
     

03 Mar, 2014

5 commits


29 Jan, 2014

1 commit

  • Pull vfs updates from Al Viro:
    "Assorted stuff; the biggest pile here is Christoph's ACL series. Plus
    assorted cleanups and fixes all over the place...

    There will be another pile later this week"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (43 commits)
    __dentry_path() fixes
    vfs: Remove second variable named error in __dentry_path
    vfs: Is mounted should be testing mnt_ns for NULL or error.
    Fix race when checking i_size on direct i/o read
    hfsplus: remove can_set_xattr
    nfsd: use get_acl and ->set_acl
    fs: remove generic_acl
    nfs: use generic posix ACL infrastructure for v3 Posix ACLs
    gfs2: use generic posix ACL infrastructure
    jfs: use generic posix ACL infrastructure
    xfs: use generic posix ACL infrastructure
    reiserfs: use generic posix ACL infrastructure
    ocfs2: use generic posix ACL infrastructure
    jffs2: use generic posix ACL infrastructure
    hfsplus: use generic posix ACL infrastructure
    f2fs: use generic posix ACL infrastructure
    ext2/3/4: use generic posix ACL infrastructure
    btrfs: use generic posix ACL infrastructure
    fs: make posix_acl_create more useful
    fs: make posix_acl_chmod more useful
    ...

    Linus Torvalds
     

26 Jan, 2014

3 commits


24 Jan, 2014

2 commits


13 Nov, 2013

1 commit

  • Pull ext[23], udf and quota fixes from Jan Kara:
    "Assorted fixes in quota, ext2, ext3 & udf.

    Probably the most important is a fix of fs corruption issue in ext2
    XIP support (OTOH xip is rarely used)"

    * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
    ext2: Fix fs corruption in ext2_get_xip_mem()
    quota: info leak in quota_getquota()
    jbd: Revert "jbd: remove dependency on __GFP_NOFAIL"
    udf: fix for pathetic mount times in case of invalid file system
    ext3: Count journal as bsddf overhead in ext3_statfs

    Linus Torvalds
     

16 Oct, 2013

2 commits

  • ext4 counts journal space as bsddf overhead, but ext3 does not.

    For some reason when I patched ext4 I thought I should leave
    ext3 alone, but frankly it makes more sense to fix it, I think.

    Otherwise we get inconsistent behavior from ext3 under ext3.ko,
    and ext3 under ext4.ko, which is not at all desirable...

    This is testable by xfstests shared/289, though it will need
    modification because it currently special-cases ext3.

    Signed-off-by: Eric Sandeen
    Signed-off-by: Jan Kara

    Eric Sandeen
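
The effect on the reported size can be sketched with simple arithmetic. This is a model under assumptions: `model_statfs_blocks()` and its parameters are hypothetical, and the real overhead calculation covers more than the two terms shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Blocks reported to statfs() in bsddf mode: total blocks minus
 * overhead. With count_journal set, the journal is part of the overhead.
 */
uint64_t model_statfs_blocks(uint64_t total_blocks,
                             uint64_t metadata_overhead,
                             uint64_t journal_blocks,
                             bool count_journal)
{
    uint64_t overhead = metadata_overhead;

    if (count_journal)
        overhead += journal_blocks;
    assert(overhead <= total_blocks);
    return total_blocks - overhead;
}
```

With the same inputs, the two modes differ by exactly the journal size, which is the inconsistency between ext3.ko and ext4.ko that the commit removes.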
     
  • d_tmpfile() already swallowed the inode ref.

    Signed-off-by: Miklos Szeredi
    Cc: stable@vger.kernel.org
    Signed-off-by: Al Viro

    Miklos Szeredi
     

07 Sep, 2013

1 commit

  • Pull ext3, reiserfs, udf & isofs fixes from Jan Kara:
    "This contains a bunch of ext3 cleanups and minor improvements, major
    reiserfs locking changes which should hopefully fix deadlocks
    introduced by BKL removal, and udf/isofs changes to refuse mounting fs
    rw instead of mounting it ro automatically which makes eject button
    work as expected for all media (see the changelog for why userspace
    should be ok with this change)"

    * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
    jbd: use a single printk for jbd_debug()
    reiserfs: locking, release lock around quota operations
    reiserfs: locking, handle nested locks properly
    reiserfs: locking, push write lock out of xattr code
    jbd: relocate assert after state lock in journal_commit_transaction()
    udf: Refuse RW mount of the filesystem instead of making it RO
    udf: Standardize return values in mount sequence
    isofs: Refuse RW mount of the filesystem instead of making it RO
    ext3: allow specifying external journal by pathname mount option
    jbd: remove unneeded semicolon

    Linus Torvalds
     

29 Aug, 2013

1 commit


01 Aug, 2013

1 commit

  • It's always been a hassle that if an external journal's
    device number changes, the filesystem won't mount.
    And since boot-time enumeration can change, device number
    changes aren't unusual.

    The current mechanism to update the journal location is by
    passing in a mount option w/ a new devnum, but that's a hassle;
    it's a manual approach, fixing things after the fact.

    Adding a mount option, "-o journal_path=/dev/$DEVICE" would
    help, since then we can do e.g.

    # mount -o journal_path=/dev/disk/by-label/$JOURNAL_LABEL ...

    and it'll mount even if the devnum has changed, as shown here:

    # losetup /dev/loop0 journalfile
    # mke2fs -L mylabel-journal -O journal_dev /dev/loop0
    # mkfs.ext3 -L mylabel -J device=/dev/loop0 /dev/sdb1

    Change the journal device number:

    # losetup -d /dev/loop0
    # losetup /dev/loop1 journalfile

    And today it will fail:

    # mount /dev/sdb1 /mnt/test
    mount: wrong fs type, bad option, bad superblock on /dev/sdb1,
    missing codepage or helper program, or other error
    In some cases useful info is found in syslog - try
    dmesg | tail or so

    # dmesg | tail -n 1
    [17343.240702] EXT3-fs (sdb1): error: couldn't read superblock of external journal

    But with this new mount option, we can specify the new path:

    # mount -o journal_path=/dev/loop1 /dev/sdb1 /mnt/test
    #

    (which does update the encoded device number, incidentally):

    # umount /dev/sdb1
    # dumpe2fs -h /dev/sdb1 | grep "Journal device"
    dumpe2fs 1.41.12 (17-May-2010)
    Journal device: 0x0701

    But best of all we can just always mount by journal-path, and
    it'll always work:

    # mount -o journal_path=/dev/disk/by-label/mylabel-journal /dev/sdb1 /mnt/test
    #

    So the journal_path option can be specified in fstab, and as long as
    the disk is available somewhere, and findable by label (or by UUID),
    we can mount.

    Signed-off-by: Eric Sandeen
    Signed-off-by: Jan Kara

    Eric Sandeen
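
The "Journal device: 0x0701" value printed by dumpe2fs above can be split into major and minor numbers. The sketch below assumes the field uses the bit layout of the kernel's new_decode_dev() helper; `struct model_devnum` and `model_decode_dev()` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical holder for a decoded device number. */
struct model_devnum {
    unsigned int maj;
    unsigned int min;
};

/*
 * Decode a 32-bit device number, assuming bits 8..19 hold the major
 * number and the minor number is split across bits 0..7 and 20..31.
 */
struct model_devnum model_decode_dev(uint32_t dev)
{
    struct model_devnum d;

    d.maj = (dev & 0xfff00u) >> 8;
    d.min = (dev & 0xffu) | ((dev >> 12) & 0xfff00u);
    assert(d.maj <= 0xfffu);
    return d;
}
```

Under that assumption 0x0701 decodes to major 7, minor 1, which matches /dev/loop1 in the example.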
     

21 Jul, 2013

1 commit

  • When we try to open a file with O_TMPFILE flag, we will trigger a bug.
    The root cause is that in ext3_orphan_add() we check ->i_nlink == 0 and
    this check always fails because we set ->i_nlink = 1 in
    inode_init_always(). We can use the following program to trigger it:

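    #define _GNU_SOURCE     /* exposes O_TMPFILE in <fcntl.h> */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(int argc, char *argv[])
    {
            int fd;

            if (argc < 2)
                    return 1;
            /* argv[1] is a directory; current kernels also require a write mode */
            fd = open(argv[1], O_TMPFILE | O_RDWR, 0666);
            if (fd < 0) {
                    perror("open ");
                    return -1;
            }
            close(fd);
            return 0;
    }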

    The oops message looks like this:

    kernel: kernel BUG at fs/ext3/namei.c:1992!
    kernel: invalid opcode: 0000 [#1] SMP
    kernel: Modules linked in: ext4 jbd2 crc16 cpufreq_ondemand ipv6 dm_mirror dm_region_hash dm_log dm_mod parport_pc parport serio_raw sg dcdbas pcspkr i2c_i801 ehci_pci ehci_hcd button acpi_cpufreq mperf e1000e ptp pps_core ttm drm_kms_helper drm hwmon i2c_algo_bit i2c_core ext3 jbd sd_mod ahci libahci libata scsi_mod uhci_hcd
    kernel: CPU: 0 PID: 2882 Comm: tst_tmpfile Not tainted 3.11.0-rc1+ #4
    kernel: Hardware name: Dell Inc. OptiPlex 780 /0V4W66, BIOS A05 08/11/2010
    kernel: task: ffff880112d30050 ti: ffff8801124d4000 task.ti: ffff8801124d4000
    kernel: RIP: 0010:[] [] ext3_orphan_add+0x6a/0x1eb [ext3]
    kernel: RSP: 0018:ffff8801124d5cc8 EFLAGS: 00010202
    kernel: RAX: 0000000000000000 RBX: ffff880111510128 RCX: ffff8801114683a0
    kernel: RDX: 0000000000000000 RSI: ffff880111510128 RDI: ffff88010fcf65a8
    kernel: RBP: ffff8801124d5d18 R08: 0080000000000000 R09: ffffffffa00d3b7f
    kernel: R10: ffff8801114683a0 R11: ffff8801032a2558 R12: 0000000000000000
    kernel: R13: ffff88010fcf6800 R14: ffff8801032a2558 R15: ffff8801115100d8
    kernel: FS: 00007f5d172b5700(0000) GS:ffff880117c00000(0000) knlGS:0000000000000000
    kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    kernel: CR2: 00007f5d16df15d0 CR3: 0000000110b1d000 CR4: 00000000000407f0
    kernel: Stack:
    kernel: 000000000000000c ffff8801048a7dc8 ffff8801114685a8 ffffffffa00b80d7
    kernel: ffff8801124d5e38 ffff8801032a2558 ffff88010ce24d68 0000000000000000
    kernel: ffff88011146b300 ffff8801124d5d44 ffff8801124d5d78 ffffffffa00db7e1
    kernel: Call Trace:
    kernel: [] ? journal_start+0x8c/0xbd [jbd]
    kernel: [] ext3_tmpfile+0xb2/0x13b [ext3]
    kernel: [] path_openat+0x11f/0x5e7
    kernel: [] ? list_del+0x11/0x30
    kernel: [] ? __dequeue_entity+0x33/0x38
    kernel: [] do_filp_open+0x3f/0x8d
    kernel: [] ? __alloc_fd+0x50/0x102
    kernel: [] do_sys_open+0x13b/0x1cd
    kernel: [] SyS_open+0x1e/0x20
    kernel: [] system_call_fastpath+0x16/0x1b
    kernel: Code: 39 c7 0f 85 67 01 00 00 0f b7 03 25 00 f0 00 00 3d 00 40 00 00 74 18 3d 00 80 00 00 74 11 3d 00 a0 00 00 74 0a 83 7b 48 00 74 04 0b eb fe 49 8b 85 50 03 00 00 4c 89 f6 48 c7 c7 c0 99 0e a0
    kernel: RIP [] ext3_orphan_add+0x6a/0x1eb [ext3]
    kernel: RSP

    Here we couldn't call clear_nlink() directly because in d_tmpfile() we
    will call inode_dec_link_count() to decrease ->i_nlink. So this commit
    calls d_tmpfile() before ext3_orphan_add() to fix this problem.

    Signed-off-by: Zheng Liu
    Signed-off-by: "Theodore Ts'o"
    Cc: Jan Kara
    Cc: Al Viro

    Zheng Liu
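
The ordering problem described in this commit can be modelled in a few lines. This is a userspace sketch of the behaviour as the commit message describes it, not the kernel functions themselves: `struct model_inode`, `model_d_tmpfile()`, and `model_orphan_add()` are hypothetical, and the model returns -1 where the kernel would hit the BUG.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, minimal inode model. */
struct model_inode {
    unsigned int nlink;
    bool on_orphan_list;
};

/* Models d_tmpfile(): drops the link count the new inode starts with. */
void model_d_tmpfile(struct model_inode *inode)
{
    assert(inode && inode->nlink > 0);
    inode->nlink--;
}

/*
 * Models the orphan-list add: it expects an unlinked inode.
 * Returns 0 on success, -1 where the kernel check would fire.
 */
int model_orphan_add(struct model_inode *inode)
{
    assert(inode);
    if (inode->nlink != 0)
        return -1;
    inode->on_orphan_list = true;
    return 0;
}
```

Calling the orphan add first fails in this model because the link count is still 1; calling `model_d_tmpfile()` first makes it succeed.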
     

10 Jul, 2013

1 commit


05 Jul, 2013

1 commit

  • If the filesystem was aborted, we return success because of the
    (sb->s_flags & MS_RDONLY) check, which is incorrect and results in
    data loss. To handle a filesystem abort correctly, we have to check
    the fs state once we discover that it is in the MS_RDONLY state.

    Test case: http://patchwork.ozlabs.org/patch/244297/
    Changes from V1:
    - fix spelling
    - fix smp_rmb()/debug order

    Signed-off-by: Dmitry Monakhov
    Signed-off-by: Jan Kara

    Dmitry Monakhov
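
The corrected control flow can be sketched as follows. This is a model under assumptions: `struct model_sb` and `model_sync_file()` are hypothetical, and returning -EROFS for the aborted case is an illustrative choice that the commit message does not confirm.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical superblock state. */
struct model_sb {
    bool read_only; /* models sb->s_flags & MS_RDONLY */
    bool aborted;   /* models "filesystem was aborted after an error" */
};

/*
 * A read-only filesystem normally has nothing to sync, but if it became
 * read-only because of an abort, report an error instead of success.
 */
int model_sync_file(const struct model_sb *sb)
{
    assert(sb);
    if (sb->read_only)
        return sb->aborted ? -EROFS : 0;
    /* Read-write case: the real code would flush data and commit here. */
    return 0;
}
```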
     

04 Jul, 2013

2 commits

  • Page reclaim keeps track of dirty and under writeback pages and uses it
    to determine if wait_iff_congested() should stall or if kswapd should
    begin writing back pages. This fails to account for buffer pages that
    can be under writeback but not PageWriteback which is the case for
    filesystems like ext3 ordered mode. Furthermore, PageDirty buffer pages
    can have all the buffers clean and writepage does no IO so it should not
    be accounted as congested.

    This patch adds an address_space operation that filesystems may
    optionally use to check if a page is really dirty or really under
    writeback. An implementation for buffer_heads is added and used for
    block operations and ext3 in ordered mode. By default the
    page flags are obeyed.

    Credit goes to Jan Kara for identifying that the page flags alone are
    not sufficient for ext3 and sanity checking a number of ideas on how the
    problem could be addressed.

    Signed-off-by: Mel Gorman
    Cc: Johannes Weiner
    Cc: Michal Hocko
    Cc: Rik van Riel
    Cc: KAMEZAWA Hiroyuki
    Cc: Jiri Slaby
    Cc: Valdis Kletnieks
    Cc: Zlatko Calusic
    Cc: dormando
    Cc: Trond Myklebust
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
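
The buffer-level check described in this commit can be modelled in userspace. The sketch below is not the kernel implementation: `struct model_buffer` and `model_check_dirty_writeback()` are hypothetical, and treating a locked buffer as "under writeback" is an assumption about how such a check could work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-buffer state for one page. */
struct model_buffer {
    bool dirty;
    bool locked; /* assumption: a locked buffer has IO in flight */
};

/*
 * Derive the "really dirty" and "really under writeback" state of a page
 * from its buffers instead of trusting the page flags alone.
 */
void model_check_dirty_writeback(const struct model_buffer *bufs, size_t nbufs,
                                 bool page_writeback,
                                 bool *dirty, bool *writeback)
{
    assert(dirty && writeback);
    assert(bufs || nbufs == 0);

    *dirty = false;
    *writeback = page_writeback;
    for (size_t i = 0; i < nbufs; i++) {
        if (bufs[i].locked)
            *writeback = true;
        if (bufs[i].dirty)
            *dirty = true;
    }
}
```

A page whose buffers are all clean is reported as not dirty here even if its page flag says otherwise, so reclaim would not count it as congested.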
     
  • Pull second set of VFS changes from Al Viro:
    "Assorted f_pos race fixes, making do_splice_direct() safe to call with
    i_mutex on parent, O_TMPFILE support, Jeff's locks.c series,
    ->d_hash/->d_compare calling conventions changes from Linus, misc
    stuff all over the place."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
    Document ->tmpfile()
    ext4: ->tmpfile() support
    vfs: export lseek_execute() to modules
    lseek_execute() doesn't need an inode passed to it
    block_dev: switch to fixed_size_llseek()
    cpqphp_sysfs: switch to fixed_size_llseek()
    tile-srom: switch to fixed_size_llseek()
    proc_powerpc: switch to fixed_size_llseek()
    ubi/cdev: switch to fixed_size_llseek()
    pci/proc: switch to fixed_size_llseek()
    isapnp: switch to fixed_size_llseek()
    lpfc: switch to fixed_size_llseek()
    locks: give the blocked_hash its own spinlock
    locks: add a new "lm_owner_key" lock operation
    locks: turn the blocked_list into a hashtable
    locks: convert fl_link to a hlist_node
    locks: avoid taking global lock if possible when waking up blocked waiters
    locks: protect most of the file_lock handling with i_lock
    locks: encapsulate the fl_link list handling
    locks: make "added" in __posix_lock_file a bool
    ...

    Linus Torvalds
     

03 Jul, 2013

1 commit

  • Pull ext4 update from Ted Ts'o:
    "Lots of bug fixes, cleanups and optimizations. In the bug fixes
    category, of note is a fix for on-line resizing file systems where the
    block size is smaller than the page size (i.e., file systems with 1k
    blocks on x86, or more interestingly file systems with 4k blocks on
    Power or ia64 systems.)

    In the cleanup category, the ext4's punch hole implementation was
    significantly improved by Lukas Czerner, and now supports bigalloc
    file systems. In addition, Jan Kara significantly cleaned up the
    write submission code path. We also improved error checking and added
    a few sanity checks.

    In the optimizations category, two major optimizations deserve
    mention. The first is that ext4_writepages() is now used for
    nodelalloc and ext3 compatibility mode. This allows writes to be
    submitted much more efficiently as a single bio request, instead of
    being sent as individual 4k writes into the block layer (which then
    relied on the elevator code to coalesce the requests in the block
    queue). Secondly, the extent cache shrink mechanism, which was
    introduced in 3.9, no longer has a scalability bottleneck caused by the
    i_es_lru spinlock. Other optimizations include some changes to reduce
    CPU usage and to avoid issuing empty commits unnecessarily."

    * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (86 commits)
    ext4: optimize starting extent in ext4_ext_rm_leaf()
    jbd2: invalidate handle if jbd2_journal_restart() fails
    ext4: translate flag bits to strings in tracepoints
    ext4: fix up error handling for mpage_map_and_submit_extent()
    jbd2: fix theoretical race in jbd2__journal_restart
    ext4: only zero partial blocks in ext4_zero_partial_blocks()
    ext4: check error return from ext4_write_inline_data_end()
    ext4: delete unnecessary C statements
    ext3,ext4: don't mess with dir_file->f_pos in htree_dirblock_to_tree()
    jbd2: move superblock checksum calculation to jbd2_write_superblock()
    ext4: pass inode pointer instead of file pointer to punch hole
    ext4: improve free space calculation for inline_data
    ext4: reduce object size when !CONFIG_PRINTK
    ext4: improve extent cache shrink mechanism to avoid to burn CPU time
    ext4: implement error handling of ext4_mb_new_preallocation()
    ext4: fix corruption when online resizing a fs with 1K block size
    ext4: delete unused variables
    ext4: return FIEMAP_EXTENT_UNKNOWN for delalloc extents
    jbd2: remove debug dependency on debug_fs and update Kconfig help text
    jbd2: use a single printk for jbd_debug()
    ...

    Linus Torvalds
     

01 Jul, 2013

1 commit

  • In both ext3 and ext4, htree_dirblock_to_tree() just fills the
    in-core rbtree for use by call_filldir(). All updates of ->f_pos are
    done by the latter; bumping it here (on error) is obviously wrong - we
    might very well have it nowhere near the block we'd found an error in.

    Signed-off-by: Al Viro
    Signed-off-by: "Theodore Ts'o"
    Cc: stable@vger.kernel.org

    Al Viro
     

29 Jun, 2013

2 commits

  • In this case we do need a bit more than usual, due to orphan
    list handling.

    Signed-off-by: Al Viro

    Al Viro
     
  • new helper: dir_relax(inode). Call when you are in location that will
    _not_ be invalidated by directory modifications (block boundary, in case
    of ext*). Returns whether the directory has survived (dropping i_mutex
    allows rmdir to kill the sucker; if it returns false to us, ->iterate()
    is obviously done)

    Signed-off-by: Al Viro

    Al Viro
     

22 May, 2013

2 commits

  • ->invalidatepage() aop now accepts range to invalidate so we can make
    use of it in journal_invalidatepage() and all the users in ext3 file
    system. Also update ext3 trace point to print out length argument.

    Signed-off-by: Lukas Czerner
    Reviewed-by: Jan Kara

    Lukas Czerner
     
  • Currently there is no way to truncate a partial page where the end
    truncate point is not at the end of the page. This is because it was not
    needed and the existing functionality was enough for the file system
    truncate operation to work properly. However, more file systems now
    support the punch hole feature, and it can benefit from mm supporting
    truncating a page just up to a certain point.

    Specifically, with this functionality truncate_inode_pages_range() can
    be changed so it supports truncating partial page at the end of the
    range (currently it will BUG_ON() if 'end' is not at the end of the
    page).

    This commit changes the invalidatepage() address space operation
    prototype to accept range to be invalidated and update all the instances
    for it.

    We also change block_invalidatepage() in the same way and actually
    make use of the new length argument, implementing range invalidation.

    Actual file system implementations will follow, except for the file
    systems where the changes are really simple and should not change the
    behaviour in any way. An implementation of truncate_page_range(), which
    will be able to accept page-unaligned ranges, will follow as well.

    Signed-off-by: Lukas Czerner
    Cc: Andrew Morton
    Cc: Hugh Dickins

    Lukas Czerner
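
Range invalidation within a page can be sketched as a walk over the page's buffers that discards only those lying entirely inside the requested range. This is a userspace model: `model_buffers_to_discard()` is a hypothetical name, and it only counts the buffers a range-aware invalidatepage implementation might discard.

```c
#include <assert.h>

/*
 * Count the block-sized buffers of a page that lie completely inside
 * [offset, offset + length). Buffers only partly covered are kept.
 */
unsigned int model_buffers_to_discard(unsigned int page_size,
                                      unsigned int block_size,
                                      unsigned int offset,
                                      unsigned int length)
{
    unsigned int stop = offset + length;
    unsigned int count = 0;

    assert(block_size > 0 && page_size % block_size == 0);
    assert(stop >= offset && stop <= page_size);

    for (unsigned int start = 0; start < page_size; start += block_size) {
        unsigned int end = start + block_size;

        if (end > stop)
            break;          /* this buffer reaches past the range */
        if (start >= offset)
            count++;        /* fully covered: would be discarded */
    }
    return count;
}
```

For a 4096-byte page with 1024-byte blocks, invalidating bytes 1024..3071 covers two whole buffers, while a range of 512..1535 covers none.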
     

08 May, 2013

2 commits

  • Merge more incoming from Andrew Morton:

    - Various fixes which were stalled or which I picked up recently

    - A large rotorooting of the AIO code. Allegedly to improve
    performance but I don't really have good performance numbers (I might
    have lost the email) and I can't raise Kent today. I held this out
    of 3.9 and we could give it another cycle if it's all too late/scary.

    I ended up taking only the first two thirds of the AIO rotorooting. I
    left the percpu parts and the batch completion for later. - Linus

    * emailed patches from Andrew Morton : (33 commits)
    aio: don't include aio.h in sched.h
    aio: kill ki_retry
    aio: kill ki_key
    aio: give shared kioctx fields their own cachelines
    aio: kill struct aio_ring_info
    aio: kill batch allocation
    aio: change reqs_active to include unreaped completions
    aio: use cancellation list lazily
    aio: use flush_dcache_page()
    aio: make aio_read_evt() more efficient, convert to hrtimers
    wait: add wait_event_hrtimeout()
    aio: refcounting cleanup
    aio: make aio_put_req() lockless
    aio: do fget() after aio_get_req()
    aio: dprintk() -> pr_debug()
    aio: move private stuff out of aio.h
    aio: add kiocb_cancel()
    aio: kill return value of aio_complete()
    char: add aio_{read,write} to /dev/{null,zero}
    aio: remove retry-based AIO
    ...

    Linus Torvalds
     
  • Faster kernel compiles by way of fewer unnecessary includes.

    [akpm@linux-foundation.org: fix fallout]
    [akpm@linux-foundation.org: fix build]
    Signed-off-by: Kent Overstreet
    Cc: Zach Brown
    Cc: Felipe Balbi
    Cc: Greg Kroah-Hartman
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Rusty Russell
    Cc: Jens Axboe
    Cc: Asai Thambi S P
    Cc: Selvan Mani
    Cc: Sam Bradshaw
    Cc: Jeff Moyer
    Cc: Al Viro
    Cc: Benjamin LaHaise
    Reviewed-by: "Theodore Ts'o"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kent Overstreet
     

07 May, 2013

1 commit