23 Aug, 2011

2 commits


12 Aug, 2011

1 commit

  • Commit ae54870a1dc9 ("ext3: Fix lock inversion in ext3_symlink()")
    recalculated the number of credits needed for a long symlink, in the
    process of splitting it into two transactions. However, the first
    credit calculation under-counted because if selinux is enabled, credits
    are needed to create the selinux xattr as well.

    Overrunning the reservation will result in an OOPS in
    journal_dirty_metadata() due to this assert:

    J_ASSERT_JH(jh, handle->h_buffer_credits > 0);

    Fix this by increasing the reservation size.

    Signed-off-by: Eric Sandeen
    Reviewed-by: Jan Kara
    Acked-by: "Theodore Ts'o"
    Cc: stable@kernel.org
    Signed-off-by: Linus Torvalds

    Eric Sandeen
     

01 Aug, 2011

2 commits


27 Jul, 2011

1 commit

  • * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6:
    jbd: change the field "b_cow_tid" of struct journal_head from type unsigned to tid_t
    ext3.txt: update the links in the section "useful links" to the latest ones
    ext3: Fix data corruption in inodes with journalled data
    ext2: check xattr name_len before acquiring xattr_sem in ext2_xattr_get
    ext3: Fix compilation with -DDX_DEBUG
    quota: Remove unused declaration
    jbd: Use WRITE_SYNC in journal checkpoint.
    jbd: Fix oops in journal_remove_journal_head()
    ext3: Return -EINVAL when start is beyond the end of fs in ext3_trim_fs()
    ext3/ioctl.c: silence sparse warnings about different address spaces
    ext3/ext4 Documentation: remove bh/nobh since it has been deprecated
    ext3: Improve truncate error handling
    ext3: use proper little-endian bitops
    ext2: include fs.h into ext2_fs.h
    ext3: Fix oops in ext3_try_to_allocate_with_rsv()
    jbd: fix a bug of leaking jh->b_jcount
    jbd: remove dependency on __GFP_NOFAIL
    ext3: Convert ext3 to new truncate calling convention
    jbd: Add fixed tracepoints
    ext3: Add fixed tracepoints

    Resolve conflicts in fs/ext3/fsync.c due to fsync locking push-down and
    new fixed tracepoints.

    Linus Torvalds
     

26 Jul, 2011

4 commits

  • Replace the ->check_acl method with a ->get_acl method that simply reads an
    ACL from disk after having a cache miss. This means we can replace the ACL
    checking boilerplate code with a single implementation in namei.c.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     
  • new helper: posix_acl_create(&acl, gfp, mode_p). Replaces acl with
    modified clone, on failure releases acl and replaces with NULL.
    Returns 0 or -ve on error. All callers of posix_acl_create_masq()
    switched.

    Signed-off-by: Al Viro

    Al Viro
     
  • new helper: posix_acl_chmod(&acl, gfp, mode). Replaces acl with modified
    clone or with NULL if that has failed; returns 0 or -ve on error. All
    callers of posix_acl_chmod_masq() switched to that - they'd been doing
    exactly the same thing.

    Signed-off-by: Al Viro

    Al Viro
     
  • This moves logic for checking the cached ACL values from low-level
    filesystems into generic code. The end result is a streamlined ACL
    check that doesn't need to load the inode->i_op->check_acl pointer at
    all for the common cached case.

    The filesystems also don't need to check for a non-blocking RCU walk
    case in their acl_check() functions, because that is all handled at a
    VFS layer.

    Signed-off-by: Linus Torvalds
    Signed-off-by: Al Viro

    Linus Torvalds
     

23 Jul, 2011

1 commit

  • When journalling data for an inode (either because it is a symlink or
    because the filesystem is mounted in data=journal mode), ext3_evict_inode()
    can discard unwritten data by calling truncate_inode_pages(). This is
    because we don't mark the buffer / page dirty when journalling data but only
    add the buffer to the running transaction and thus mm does not know there
    are still unwritten data.

    Fix the problem by carefully tracking transaction containing inode's data,
    committing this transaction, and writing uncheckpointed buffers when inode
    should be reaped.

    Signed-off-by: Jan Kara

    Jan Kara
     

21 Jul, 2011

5 commits

  • Btrfs needs to be able to control how filemap_write_and_wait_range() is called
    in fsync to make it less of a painful operation, so push down taking i_mutex and
    the calling of filemap_write_and_wait() down into the ->fsync() handlers. Some
    file systems can drop taking the i_mutex altogether it seems, like ext3 and
    ocfs2. For correctness sake I just pushed everything down in all cases to make
    sure that we keep the current behavior the same for everybody, and then each
    individual fs maintainer can make up their mind about what to do from there.
    Thanks,

    Acked-by: Jan Kara
    Signed-off-by: Josef Bacik
    Signed-off-by: Al Viro

    Josef Bacik
     
  • This patch turns on barriers by default for ext3. mount -o barrier=0
    will turn them off. Based on a patch from Chris Mason in the SuSE tree.

    Signed-off-by: Chris Mason
    Signed-off-by: Christoph Hellwig
    Acked-by: Eric Sandeen
    Acked-by: Jan Kara
    Acked-by: Jeff Mahoney
    Acked-by: Ted Ts'o
    Signed-off-by: Al Viro

    Christoph Hellwig
     
  • Simple filesystems always pass inode->i_sb_bdev as the block device
    argument, and never need a end_io handler. Let's simply things for
    them and for my grepping activity by dropping these arguments. The
    only thing not falling into that scheme is ext4, which passes and
    end_io handler without needing special flags (yet), but given how
    messy the direct I/O code there is use of __blockdev_direct_IO
    in one instead of two out of three cases isn't going to make a large
    difference anyway.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     
  • Let filesystems handle waiting for direct I/O requests themselves instead
    of doing it beforehand. This means filesystem-specific locks to prevent
    new dio referenes from appearing can be held. This is important to allow
    generalizing i_dio_count to non-DIO_LOCKING filesystems.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     
  • Compilation of ext3/namei.c brought up an error and warning messages
    when compiled with -DDX_DEBUG.

    Signed-off-by: Bernd Schubert
    Signed-off-by: Jan Kara

    Bernd Schubert
     

20 Jul, 2011

3 commits


25 Jun, 2011

6 commits

  • We should return -EINVAL when the FITRIM parameters are not sane, but
    currently we are exiting silently if start is beyond the end of the
    file system. This commit fixes this so we return -EINVAL as other file
    systems do.

    Signed-off-by: Lukas Czerner
    CC: Jan Kara
    Signed-off-by: Jan Kara

    Lukas Czerner
     
  • The 'from' argument for copy_from_user and the 'to' argument for
    copy_to_user should both be tagged as __user address space.

    Signed-off-by: H Hartley Sweeten
    Cc: Andrew Morton
    Cc: Andreas Dilger
    Signed-off-by: Jan Kara

    H Hartley Sweeten
     
  • New truncate calling convention allows us to handle errors from
    ext3_block_truncate_page(). So reorganize the code so that
    ext3_block_truncate_page() is called before we change inode size.

    This also removes unnecessary block zeroing from error recovery after failed
    buffered writes (zeroing isn't needed because we could have never written
    non-zero data to disk). We have to be careful and keep zeroing in direct IO
    write error recovery because there we might have already overwritten end of the
    last file block.

    Signed-off-by: Jan Kara

    Jan Kara
     
  • Block allocation is called from two places: ext3_get_blocks_handle() and
    ext3_xattr_block_set(). These two callers are not necessarily synchronized
    because xattr code holds only xattr_sem and i_mutex, and
    ext3_get_blocks_handle() may hold only truncate_mutex when called from
    writepage() path. Block reservation code does not expect two concurrent
    allocations to happen to the same inode and thus assertions can be triggered
    or reservation structure corruption can occur.

    Fix the problem by taking truncate_mutex in xattr code to serialize
    allocations.

    CC: Sage Weil
    CC: stable@kernel.org
    Reported-by: Fyodor Ustinov
    Signed-off-by: Jan Kara

    Jan Kara
     
  • Mostly trivial conversion. We fix a bug that IS_IMMUTABLE and IS_APPEND files
    could not be truncated during failed writes as we change the code. In fact the
    test is not needed at all because both IS_IMMUTABLE and IS_APPEND is tested in
    upper layers in do_sys_[f]truncate(), may_write(), etc.

    Signed-off-by: Jan Kara

    Jan Kara
     
  • This commit adds fixed tracepoints to the ext3 code. It is based on ext4
    tracepoints, however due to the differences of both file systems, there
    are some tracepoints missing (those for delaloc and for multi-block
    allocator) and there are some ext3 specific as well (for reservation
    windows).

    Here is a list:

    ext3_free_inode
    ext3_request_inode
    ext3_allocate_inode
    ext3_evict_inode
    ext3_drop_inode
    ext3_mark_inode_dirty
    ext3_write_begin
    ext3_ordered_write_end
    ext3_writeback_write_end
    ext3_journalled_write_end
    ext3_ordered_writepage
    ext3_writeback_writepage
    ext3_journalled_writepage
    ext3_readpage
    ext3_releasepage
    ext3_invalidatepage
    ext3_discard_blocks
    ext3_request_blocks
    ext3_allocate_blocks
    ext3_free_blocks
    ext3_sync_file_enter
    ext3_sync_file_exit
    ext3_sync_fs
    ext3_rsv_window_add
    ext3_discard_reservation
    ext3_alloc_new_reservation
    ext3_reserved
    ext3_forget
    ext3_read_block_bitmap
    ext3_direct_IO_enter
    ext3_direct_IO_exit
    ext3_unlink_enter
    ext3_unlink_exit
    ext3_truncate_enter
    ext3_truncate_exit
    ext3_get_blocks_enter
    ext3_get_blocks_exit
    ext3_load_inode

    Signed-off-by: Lukas Czerner
    Cc: Jan Kara
    Signed-off-by: Jan Kara

    Lukas Czerner
     

27 May, 2011

3 commits

  • Tell the filesystem if we just updated timestamp (I_DIRTY_SYNC) or
    anything else, so that the filesystem can track internally if it
    needs to push out a transaction for fdatasync or not.

    This is just the prototype change with no user for it yet. I plan
    to push large XFS changes for the next merge window, and getting
    this trivial infrastructure in this window would help a lot to avoid
    tree interdependencies.

    Also remove incorrect comments that ->dirty_inode can't block. That
    has been changed a long time ago, and many implementations rely on it.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/djm/tmem:
    xen: cleancache shim to Xen Transcendent Memory
    ocfs2: add cleancache support
    ext4: add cleancache support
    btrfs: add cleancache support
    ext3: add cleancache support
    mm/fs: add hooks to support cleancache
    mm: cleancache core ops functions and config
    fs: add field to superblock to support cleancache
    mm/fs: cleancache documentation

    Fix up trivial conflict in fs/btrfs/extent_io.c due to includes

    Linus Torvalds
     
  • This fifth patch of eight in this cleancache series "opts-in"
    cleancache for ext3. Filesystems must explicitly enable
    cleancache by calling cleancache_init_fs anytime an instance
    of the filesystem is mounted. For ext3, all other cleancache
    hooks are in the VFS layer including the matching cleancache_flush_fs
    hook which must be called on unmount.

    Details and a FAQ can be found in Documentation/vm/cleancache.txt

    [v6-v8: no changes]
    [v5: jeremy@goop.org: simplify init hook and any future fs init changes]
    Signed-off-by: Dan Magenheimer
    Reviewed-by: Jeremy Fitzhardinge
    Reviewed-by: Konrad Rzeszutek Wilk
    Acked-by: Andreas Dilger
    Cc: Ted Ts'o
    Cc: Andrew Morton
    Cc: Al Viro
    Cc: Matthew Wilcox
    Cc: Nick Piggin
    Cc: Mel Gorman
    Cc: Rik Van Riel
    Cc: Jan Beulich
    Cc: Chris Mason
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Nitin Gupta

    Dan Magenheimer
     

17 May, 2011

1 commit

  • When make_indexed_dir() fails (e.g. because of ENOSPC) after it has allocated
    block for index tree root, we did not properly mark all changed buffers dirty.
    This lead to only some of these buffers being written out and thus effectively
    corrupting the directory.

    Fix the issue by marking all changed data dirty even in the error failure case.

    CC: stable@kernel.org
    Signed-off-by: Jan Kara

    Jan Kara
     

30 Apr, 2011

1 commit

  • ext3_symlink() cannot call __page_symlink() with transaction open.
    __page_symlink() calls ext3_write_begin() which gets page lock which ranks
    above transaction start (thus lock ordering is violated) and and also
    ext3_write_begin() waits for a transaction commit when we run out of space
    which never happens if we hold transaction open.

    Fix the problem by stopping a transaction before calling __page_symlink()
    (we have to be careful and put inode to orphan list so that it gets deleted
    in case of crash) and starting another one after __page_symlink() returns
    for addition of symlink into a directory.

    Signed-off-by: Jan Kara

    Jan Kara
     

08 Apr, 2011

1 commit


31 Mar, 2011

1 commit


25 Mar, 2011

1 commit

  • * 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block: (65 commits)
    Documentation/iostats.txt: bit-size reference etc.
    cfq-iosched: removing unnecessary think time checking
    cfq-iosched: Don't clear queue stats when preempt.
    blk-throttle: Reset group slice when limits are changed
    blk-cgroup: Only give unaccounted_time under debug
    cfq-iosched: Don't set active queue in preempt
    block: fix non-atomic access to genhd inflight structures
    block: attempt to merge with existing requests on plug flush
    block: NULL dereference on error path in __blkdev_get()
    cfq-iosched: Don't update group weights when on service tree
    fs: assign sb->s_bdi to default_backing_dev_info if the bdi is going away
    block: Require subsystems to explicitly allocate bio_set integrity mempool
    jbd2: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging
    jbd: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging
    fs: make fsync_buffers_list() plug
    mm: make generic_writepages() use plugging
    blk-cgroup: Add unaccounted time to timeslice_used.
    block: fixup plugging stubs for !CONFIG_BLOCK
    block: remove obsolete comments for blkdev_issue_zeroout.
    blktrace: Use rq->cmd_flags directly in blk_add_trace_rq.
    ...

    Fix up conflicts in fs/{aio.c,super.c}

    Linus Torvalds
     

24 Mar, 2011

2 commits


18 Mar, 2011

1 commit

  • * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6:
    ext3: Always set dx_node's fake_dirent explicitly.
    ext3: Fix an overflow in ext3_trim_fs.
    jbd: Remove one to many n's in a word.
    ext3: skip orphan cleanup on rocompat fs
    ext2: Fix link count corruption under heavy link+rename load
    ext3: speed up group trim with the right free block count.
    ext3: Adjust trim start with first_data_block.
    quota: return -ENOMEM when memory allocation fails

    Linus Torvalds
     

17 Mar, 2011

1 commit

  • …s/security-testing-2.6

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (33 commits)
    AppArmor: kill unused macros in lsm.c
    AppArmor: cleanup generated files correctly
    KEYS: Add an iovec version of KEYCTL_INSTANTIATE
    KEYS: Add a new keyctl op to reject a key with a specified error code
    KEYS: Add a key type op to permit the key description to be vetted
    KEYS: Add an RCU payload dereference macro
    AppArmor: Cleanup make file to remove cruft and make it easier to read
    SELinux: implement the new sb_remount LSM hook
    LSM: Pass -o remount options to the LSM
    SELinux: Compute SID for the newly created socket
    SELinux: Socket retains creator role and MLS attribute
    SELinux: Auto-generate security_is_socket_class
    TOMOYO: Fix memory leak upon file open.
    Revert "selinux: simplify ioctl checking"
    selinux: drop unused packet flow permissions
    selinux: Fix packet forwarding checks on postrouting
    selinux: Fix wrong checks for selinux_policycap_netpeer
    selinux: Fix check for xfrm selinux context algorithm
    ima: remove unnecessary call to ima_must_measure
    IMA: remove IMA imbalance checking
    ...

    Linus Torvalds
     

15 Mar, 2011

2 commits


10 Mar, 2011

1 commit

  • Code has been converted over to the new explicit on-stack plugging,
    and delay users have been converted to use the new API for that.
    So lets kill off the old plugging along with aops->sync_page().

    Signed-off-by: Jens Axboe

    Jens Axboe