10 Jan, 2012

1 commit

  • * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
    ext2/3/4: delete unneeded includes of module.h
    ext{3,4}: Fix potential race when setversion ioctl updates inode
    udf: Mark LVID buffer as uptodate before marking it dirty
    ext3: Don't warn from writepage when readonly inode is spotted after error
    jbd: Remove j_barrier mutex
    reiserfs: Force inode evictions before umount to avoid crash
    reiserfs: Fix quota mount option parsing
    udf: Treat symlink component of type 2 as /
    udf: Fix deadlock when converting file from in-ICB one to normal one
    udf: Cleanup calling convention of inode_getblk()
    ext2: Fix error handling on inode bitmap corruption
    ext3: Fix error handling on inode bitmap corruption
    ext3: replace ll_rw_block with other functions
    ext3: NULL dereference in ext3_evict_inode()
    jbd: clear revoked flag on buffers before a new transaction started
    ext3: call ext3_mark_recovery_complete() when recovery is really needed

    Linus Torvalds
     

09 Jan, 2012

3 commits

  • Delete any instances of include module.h that were not strictly
    required. In the case of ext2, the declarations of MODULE_LICENSE
    etc. were in inode.c but the module_init/exit were in super.c, so
    relocate the MODULE_LICENSE/AUTHOR block to super.c which makes it
    consistent with ext3 and ext4 at the same time.

    Signed-off-by: Paul Gortmaker
    Signed-off-by: Jan Kara

    Paul Gortmaker
     
  • WARN_ON_ONCE(IS_RDONLY(inode)) tends to trip when filesystem hits error and is
    remounted read-only. This unnecessarily scares users (well, they should be
    scared because of filesystem error, but the stack trace distracts them from the
    right source of their fear ;-). We could as well just remove the WARN_ON but
    it's not hard to fix it to not trip on filesystem with errors and not use more
    cycles in the common case so that's what we do.

    CC: stable@kernel.org
    Signed-off-by: Jan Kara

    Jan Kara
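
The condition described in the WARN_ON_ONCE commit above can be sketched in userspace C. This is only a model under stated assumptions: `model_sb`, `model_inode`, `warn_on_once()` and the `has_error` flag are hypothetical stand-ins, not the kernel's types or macros. The cheap read-only test comes first, so the error flag is consulted only in the rare read-only case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for superblock and inode state (not kernel types). */
struct model_sb {
    bool read_only;   /* models the read-only check on the inode */
    bool has_error;   /* models "the filesystem has hit an error" */
};

struct model_inode {
    const struct model_sb *sb;
};

static int warnings_emitted;

/* Minimal one-shot warning, loosely modelled on a warn-once macro. */
static bool warn_on_once(bool cond)
{
    static bool warned;

    if (cond && !warned) {
        warned = true;
        warnings_emitted++;
        fprintf(stderr, "WARNING: writepage on read-only inode\n");
    }
    return cond;
}

/* Warn only for a read-only inode on a filesystem WITHOUT errors. */
static bool writepage_should_warn(const struct model_inode *inode)
{
    return inode->sb->read_only && !inode->sb->has_error;
}

static void model_writepage(const struct model_inode *inode)
{
    warn_on_once(writepage_should_warn(inode));
}

/* Usage: an errored, remounted-read-only filesystem stays quiet. */
void run_example(void)
{
    struct model_sb errored = { .read_only = true, .has_error = true };
    struct model_sb plain_ro = { .read_only = true, .has_error = false };
    struct model_inode a = { .sb = &errored };
    struct model_inode b = { .sb = &plain_ro };

    model_writepage(&a);
    assert(warnings_emitted == 0);
    model_writepage(&b);
    model_writepage(&b);            /* second hit is suppressed */
    assert(warnings_emitted == 1);
}
```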
     
  • ll_rw_block() is deprecated. Thus we replace it with other functions.

    CC: Jan Kara
    Signed-off-by: Zheng Liu
    Signed-off-by: Jan Kara

    Zheng Liu
     

02 Dec, 2011

1 commit


22 Nov, 2011

1 commit

  • This is an fsfuzzer bug. ->s_journal is set at the end of
    ext3_load_journal() but we try to use it in the error handling from
    ext3_get_journal() while it's still NULL.

    [ 337.039041] BUG: unable to handle kernel NULL pointer dereference at 0000000000000024
    [ 337.040380] IP: [] _raw_spin_lock+0x9/0x30
    [ 337.041687] PGD 0
    [ 337.043118] Oops: 0002 [#1] SMP
    [ 337.044483] CPU 3
    [ 337.044495] Modules linked in: ecb md4 cifs fuse kvm_intel kvm brcmsmac brcmutil crc8 cordic r8169 [last unloaded: scsi_wait_scan]
    [ 337.047633]
    [ 337.049259] Pid: 8308, comm: mount Not tainted 3.2.0-rc2-next-20111121+ #24 SAMSUNG ELECTRONICS CO., LTD. RV411/RV511/E3511/S3511 /RV411/RV511/E3511/S3511
    [ 337.051064] RIP: 0010:[] [] _raw_spin_lock+0x9/0x30
    [ 337.052879] RSP: 0018:ffff8800b1d11ae8 EFLAGS: 00010282
    [ 337.054668] RAX: 0000000000000100 RBX: 0000000000000000 RCX: ffff8800b77c2000
    [ 337.056400] RDX: ffff8800a97b5c00 RSI: 0000000000000000 RDI: 0000000000000024
    [ 337.058099] RBP: ffff8800b1d11ae8 R08: 6000000000000000 R09: e018000000000000
    [ 337.059841] R10: ff67366cc2607c03 R11: 00000000110688e6 R12: 0000000000000000
    [ 337.061607] R13: 0000000000000000 R14: 0000000000000000 R15: ffff8800a78f06e8
    [ 337.063385] FS: 00007f9d95652800(0000) GS:ffff8800b7180000(0000) knlGS:0000000000000000
    [ 337.065110] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 337.066801] CR2: 0000000000000024 CR3: 00000000aef2c000 CR4: 00000000000006e0
    [ 337.068581] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [ 337.070321] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    [ 337.072105] Process mount (pid: 8308, threadinfo ffff8800b1d10000, task ffff8800b1d02be0)
    [ 337.073800] Stack:
    [ 337.075487] ffff8800b1d11b08 ffffffff811f48cf ffff88007ac9b158 0000000000000000
    [ 337.077255] ffff8800b1d11b38 ffffffff8119405d ffff88007ac9b158 ffff88007ac9b250
    [ 337.078851] ffffffff8181bda0 ffffffff8181bda0 ffff8800b1d11b68 ffffffff81131e31
    [ 337.080284] Call Trace:
    [ 337.081706] [] log_start_commit+0x1f/0x40
    [ 337.083107] [] ext3_evict_inode+0x1fd/0x2a0
    [ 337.084490] [] evict+0xa1/0x1a0
    [ 337.085857] [] iput+0x101/0x210
    [ 337.087220] [] iget_failed+0x21/0x30
    [ 337.088581] [] ext3_iget+0x15c/0x450
    [ 337.089936] [] ? ext3_rsv_window_add+0x81/0x100
    [ 337.091284] [] ext3_get_journal+0x15/0xde
    [ 337.092641] [] ext3_fill_super+0xf2b/0x1c30
    [ 337.093991] [] ? register_shrinker+0x4d/0x60
    [ 337.095332] [] mount_bdev+0x1a2/0x1e0
    [ 337.096680] [] ? ext3_setup_super+0x210/0x210
    [ 337.098026] [] ext3_mount+0x10/0x20
    [ 337.099362] [] mount_fs+0x3e/0x1b0
    [ 337.100759] [] ? __alloc_percpu+0xb/0x10
    [ 337.102330] [] vfs_kern_mount+0x65/0xc0
    [ 337.103889] [] do_kern_mount+0x4f/0x100
    [ 337.105442] [] do_mount+0x19c/0x890
    [ 337.106989] [] ? memdup_user+0x46/0x90
    [ 337.108572] [] ? strndup_user+0x53/0x70
    [ 337.110114] [] sys_mount+0x8b/0xe0
    [ 337.111617] [] system_call_fastpath+0x16/0x1b
    [ 337.113133] Code: 38 c2 74 0f 66 0f 1f 44 00 00 f3 90 0f b6 03 38 c2 75 f7 48 83 c4 08 5b 5d c3 0f 1f 84 00 00 00 00 00 55 b8 00 01 00 00 48 89 e5 66 0f c1 07 0f b6 d4 38 c2 74 0c 0f 1f 00 f3 90 0f b6 07 38
    [ 337.116588] RIP [] _raw_spin_lock+0x9/0x30
    [ 337.118260] RSP
    [ 337.119998] CR2: 0000000000000024
    [ 337.188701] ---[ end trace c36d790becac1615 ]---

    Signed-off-by: Dan Carpenter
    Signed-off-by: Jan Kara

    Dan Carpenter
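
The failure mode in the fsfuzzer report above is an error path that uses a pointer which is only assigned at the end of setup. A minimal userspace sketch of guarding such a path follows; all names (`model_sb`, `model_journal`, `model_evict_inode`) are hypothetical, and the actual kernel fix may be structured differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel's structures. */
struct model_journal {
    int commits_started;
};

struct model_sb {
    struct model_journal *journal;  /* stays NULL until journal setup finishes */
};

struct model_inode {
    struct model_sb *sb;
    int needs_commit;               /* inode has journalled data to commit */
};

/*
 * Returns 1 if a commit was started, 0 if it was skipped.
 * The NULL check matters when eviction runs from an error path
 * that fires before the journal pointer has been assigned.
 */
static int model_evict_inode(struct model_inode *inode)
{
    struct model_journal *journal = inode->sb->journal;

    if (inode->needs_commit && journal != NULL) {
        journal->commits_started++;
        return 1;
    }
    return 0;
}

/* Usage: eviction before and after the journal is attached. */
void run_example(void)
{
    struct model_journal j = { 0 };
    struct model_sb sb = { .journal = NULL };
    struct model_inode inode = { .sb = &sb, .needs_commit = 1 };

    assert(model_evict_inode(&inode) == 0);   /* no journal yet: skipped */
    sb.journal = &j;
    assert(model_evict_inode(&inode) == 1);
    assert(j.commits_started == 1);
}
```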
     

02 Nov, 2011

1 commit


23 Aug, 2011

2 commits


27 Jul, 2011

1 commit

  • * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6:
    jbd: change the field "b_cow_tid" of struct journal_head from type unsigned to tid_t
    ext3.txt: update the links in the section "useful links" to the latest ones
    ext3: Fix data corruption in inodes with journalled data
    ext2: check xattr name_len before acquiring xattr_sem in ext2_xattr_get
    ext3: Fix compilation with -DDX_DEBUG
    quota: Remove unused declaration
    jbd: Use WRITE_SYNC in journal checkpoint.
    jbd: Fix oops in journal_remove_journal_head()
    ext3: Return -EINVAL when start is beyond the end of fs in ext3_trim_fs()
    ext3/ioctl.c: silence sparse warnings about different address spaces
    ext3/ext4 Documentation: remove bh/nobh since it has been deprecated
    ext3: Improve truncate error handling
    ext3: use proper little-endian bitops
    ext2: include fs.h into ext2_fs.h
    ext3: Fix oops in ext3_try_to_allocate_with_rsv()
    jbd: fix a bug of leaking jh->b_jcount
    jbd: remove dependency on __GFP_NOFAIL
    ext3: Convert ext3 to new truncate calling convention
    jbd: Add fixed tracepoints
    ext3: Add fixed tracepoints

    Resolve conflicts in fs/ext3/fsync.c due to fsync locking push-down and
    new fixed tracepoints.

    Linus Torvalds
     

23 Jul, 2011

1 commit

  • When journalling data for an inode (either because it is a symlink or
    because the filesystem is mounted in data=journal mode), ext3_evict_inode()
    can discard unwritten data by calling truncate_inode_pages(). This is
    because we don't mark the buffer / page dirty when journalling data but only
    add the buffer to the running transaction and thus mm does not know there
    are still unwritten data.

    Fix the problem by carefully tracking transaction containing inode's data,
    committing this transaction, and writing uncheckpointed buffers when inode
    should be reaped.

    Signed-off-by: Jan Kara

    Jan Kara
     

21 Jul, 2011

2 commits

  • Simple filesystems always pass inode->i_sb->s_bdev as the block device
    argument, and never need an end_io handler. Let's simplify things for
    them and for my grepping activity by dropping these arguments. The
    only thing not falling into that scheme is ext4, which passes an
    end_io handler without needing special flags (yet), but given how
    messy the direct I/O code there is, use of __blockdev_direct_IO
    in one instead of two out of three cases isn't going to make a large
    difference anyway.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     
  • Let filesystems handle waiting for direct I/O requests themselves instead
    of doing it beforehand. This means filesystem-specific locks to prevent
    new dio references from appearing can be held. This is important to allow
    generalizing i_dio_count to non-DIO_LOCKING filesystems.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     

25 Jun, 2011

3 commits

  • New truncate calling convention allows us to handle errors from
    ext3_block_truncate_page(). So reorganize the code so that
    ext3_block_truncate_page() is called before we change inode size.

    This also removes unnecessary block zeroing from error recovery after failed
    buffered writes (zeroing isn't needed because we could have never written
    non-zero data to disk). We have to be careful and keep zeroing in direct IO
    write error recovery because there we might have already overwritten end of the
    last file block.

    Signed-off-by: Jan Kara

    Jan Kara
     
  • Mostly trivial conversion. We fix a bug that IS_IMMUTABLE and IS_APPEND files
    could not be truncated during failed writes as we change the code. In fact the
    test is not needed at all because both IS_IMMUTABLE and IS_APPEND are tested in
    upper layers in do_sys_[f]truncate(), may_write(), etc.

    Signed-off-by: Jan Kara

    Jan Kara
     
  • This commit adds fixed tracepoints to the ext3 code. It is based on ext4
    tracepoints, however due to the differences of both file systems, there
    are some tracepoints missing (those for delaloc and for multi-block
    allocator) and there are some ext3 specific as well (for reservation
    windows).

    Here is a list:

    ext3_free_inode
    ext3_request_inode
    ext3_allocate_inode
    ext3_evict_inode
    ext3_drop_inode
    ext3_mark_inode_dirty
    ext3_write_begin
    ext3_ordered_write_end
    ext3_writeback_write_end
    ext3_journalled_write_end
    ext3_ordered_writepage
    ext3_writeback_writepage
    ext3_journalled_writepage
    ext3_readpage
    ext3_releasepage
    ext3_invalidatepage
    ext3_discard_blocks
    ext3_request_blocks
    ext3_allocate_blocks
    ext3_free_blocks
    ext3_sync_file_enter
    ext3_sync_file_exit
    ext3_sync_fs
    ext3_rsv_window_add
    ext3_discard_reservation
    ext3_alloc_new_reservation
    ext3_reserved
    ext3_forget
    ext3_read_block_bitmap
    ext3_direct_IO_enter
    ext3_direct_IO_exit
    ext3_unlink_enter
    ext3_unlink_exit
    ext3_truncate_enter
    ext3_truncate_exit
    ext3_get_blocks_enter
    ext3_get_blocks_exit
    ext3_load_inode

    Signed-off-by: Lukas Czerner
    Cc: Jan Kara
    Signed-off-by: Jan Kara

    Lukas Czerner
     

27 May, 2011

1 commit

  • Tell the filesystem if we just updated timestamp (I_DIRTY_SYNC) or
    anything else, so that the filesystem can track internally if it
    needs to push out a transaction for fdatasync or not.

    This is just the prototype change with no user for it yet. I plan
    to push large XFS changes for the next merge window, and getting
    this trivial infrastructure in this window would help a lot to avoid
    tree interdependencies.

    Also remove incorrect comments that ->dirty_inode can't block. That
    has been changed a long time ago, and many implementations rely on it.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
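
How a filesystem might use the extra argument described in the ->dirty_inode commit above can be sketched as follows. This is a userspace model only: the flag values and the `model_*` names are illustrative assumptions, and real filesystems track this state in their own ways.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag values; only the distinction between them matters. */
enum {
    MODEL_DIRTY_SYNC     = 1 << 0,  /* e.g. a timestamp-only update */
    MODEL_DIRTY_DATASYNC = 1 << 1,  /* metadata needed to read the data back */
};

/* Hypothetical per-inode bookkeeping kept by the filesystem. */
struct model_inode {
    bool fsync_needs_commit;
    bool fdatasync_needs_commit;
};

/* Model of a dirty_inode callback that receives the dirtying flags. */
static void model_dirty_inode(struct model_inode *inode, int flags)
{
    if (flags == 0)
        return;

    /* Any change must reach disk on a full fsync. */
    inode->fsync_needs_commit = true;

    /* Anything beyond a pure timestamp update also matters to fdatasync. */
    if (flags & ~MODEL_DIRTY_SYNC)
        inode->fdatasync_needs_commit = true;
}

/* Usage: a timestamp update alone does not force work in fdatasync. */
void run_example(void)
{
    struct model_inode inode = { false, false };

    model_dirty_inode(&inode, MODEL_DIRTY_SYNC);
    assert(inode.fsync_needs_commit);
    assert(!inode.fdatasync_needs_commit);

    model_dirty_inode(&inode, MODEL_DIRTY_SYNC | MODEL_DIRTY_DATASYNC);
    assert(inode.fdatasync_needs_commit);
}
```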
     

08 Apr, 2011

1 commit


31 Mar, 2011

1 commit


24 Mar, 2011

1 commit


10 Mar, 2011

1 commit

  • Code has been converted over to the new explicit on-stack plugging,
    and delay users have been converted to use the new API for that.
    So let's kill off the old plugging along with aops->sync_page().

    Signed-off-by: Jens Axboe

    Jens Axboe
     

11 Jan, 2011

1 commit


28 Oct, 2010

3 commits

  • * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6: (24 commits)
    quota: Fix possible oops in __dquot_initialize()
    ext3: Update kernel-doc comments
    jbd/2: fixed typos
    ext2: fixed typo.
    ext3: Fix debug messages in ext3_group_extend()
    jbd: Convert atomic_inc() to get_bh()
    ext3: Remove misplaced BUFFER_TRACE() in ext3_truncate()
    jbd: Fix debug message in do_get_write_access()
    jbd: Check return value of __getblk()
    ext3: Use DIV_ROUND_UP() on group desc block counting
    ext3: Return proper error code on ext3_fill_super()
    ext3: Remove unnecessary casts on bh->b_data
    ext3: Cleanup ext3_setup_super()
    quota: Fix issuing of warnings from dquot_transfer
    quota: fix dquot_disable vs dquot_transfer race v2
    jbd: Convert bitops to buffer fns
    ext3/jbd: Avoid WARN() messages when failing to write the superblock
    jbd: Use offset_in_page() instead of manual calculation
    jbd: Remove unnecessary goto statement
    jbd: Use printk_ratelimited() in journal_alloc_journal_head()
    ...

    Linus Torvalds
     
  • Update missing/broken argument descriptions and fix formatting.

    Signed-off-by: Namhyung Kim
    Signed-off-by: Jan Kara

    Namhyung Kim
     
  • Signed-off-by: Namhyung Kim
    Signed-off-by: Jan Kara

    Namhyung Kim
     

26 Oct, 2010

1 commit

  • __block_write_begin and block_prepare_write are identical except for slightly
    different calling conventions. Convert all callers to the __block_write_begin
    calling conventions and drop block_prepare_write.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     

11 Aug, 2010

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (96 commits)
    no need for list_for_each_entry_safe()/resetting with superblock list
    Fix sget() race with failing mount
    vfs: don't hold s_umount over close_bdev_exclusive() call
    sysv: do not mark superblock dirty on remount
    sysv: do not mark superblock dirty on mount
    btrfs: remove junk sb_dirt change
    BFS: clean up the superblock usage
    AFFS: wait for sb synchronization when needed
    AFFS: clean up dirty flag usage
    cifs: truncate fallout
    mbcache: fix shrinker function return value
    mbcache: Remove unused features
    add f_flags to struct statfs(64)
    pass a struct path to vfs_statfs
    update VFS documentation for method changes.
    All filesystems that need invalidate_inode_buffers() are doing that explicitly
    convert remaining ->clear_inode() to ->evict_inode()
    Make ->drop_inode() just return whether inode needs to be dropped
    fs/inode.c:clear_inode() is gone
    fs/inode.c:evict() doesn't care about delete vs. non-delete paths now
    ...

    Fix up trivial conflicts in fs/nilfs2/super.c

    Linus Torvalds
     

10 Aug, 2010

4 commits

  • Signed-off-by: Al Viro

    Al Viro
     
  • Replace inode_setattr with opencoded variants of it in all callers. This
    moves the remaining call to vmtruncate into the filesystem methods where it
    can be replaced with the proper truncate sequence.

    In a few cases it was obvious that we would never end up calling vmtruncate
    so it was left out in the opencoded variant:

    spufs: explicitly checks for ATTR_SIZE earlier
    btrfs,hugetlbfs,logfs,dlmfs: explicitly clears ATTR_SIZE earlier
    ufs: contains an opencoded simple_setattr + truncate that sets the filesize just above

    In addition to that ncpfs called inode_setattr with handcrafted iattrs,
    which allowed trimming down the opencoded variant.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     
  • Split up the block_write_begin implementation - __block_write_begin is a new
    trivial wrapper for block_prepare_write that always takes an already
    allocated page and can be either called from block_write_begin or filesystem
    code that already has a page allocated. Remove the handling of already
    allocated pages from block_write_begin after switching all callers that
    do it to __block_write_begin.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     
  • Move the call to vmtruncate to get rid of excessive blocks to the callers
    in preparation for the new truncate calling sequence. This was only done
    for DIO_LOCKING filesystems, so the __blockdev_direct_IO_newtrunc variant
    was not needed anyway. Get rid of blockdev_direct_IO_no_locking and
    its _newtrunc variant while at it, as just opencoding the two additional
    parameters is shorter than the name suffix.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     

06 Aug, 2010

1 commit

  • In data=journal mode, we still use block_write_begin() to prepare page for
    writing. This function can occasionally mark buffer dirty which violates
    journalling assumptions - when a buffer is part of a transaction, it should
    not be dirty and a buffer can be already part of a forget list of some transaction
    when block_write_begin() gets called. This violation of journalling assumptions
    then results in "JBD: Spotted dirty metadata buffer..." warnings.

    In fact, temporarily dirtying the buffer while the page is still locked does not
    really cause problems to the journalling because we won't write the buffer
    until the page gets unlocked. So we just have to make sure to clear dirty bits
    before unlocking the page.

    Reviewed-by: "Theodore Ts'o"
    Signed-off-by: Jan Kara

    Jan Kara
     

21 Jul, 2010

2 commits

  • It can happen that ext3_free_branches calls ext3_forget() for an indirect block
    in an earlier transaction than a transaction in which we clear pointer to this
    indirect block. Thus if we crash before a transaction clearing the block
    pointer is committed, we will see indirect block pointing to already freed
    blocks and complain during orphan list cleanup.

    The fix is simple: Make sure ext3_forget() is called in the transaction
    doing block pointer clearing.

    This is a backport of an ext4 fix by Amir G.

    Signed-off-by: Jan Kara

    Jan Kara
     
  • The nobh option was only supported for writeback mode, but given that all
    write paths (except mmapped writes) actually create buffer heads, it
    effectively was a no-op already.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jan Kara

    Christoph Hellwig
     

22 May, 2010

1 commit

  • Quota must be initialized if size or uid/gid changes are requested.
    But initialization is performed in two different places:
    in case of i_size the file system is responsible for dquot init,
    but in case of uid/gid init will be called internally in
    dquot_transfer().
    This ambiguity makes the code harder to understand.
    Let's move this logic to one common helper function.

    Signed-off-by: Dmitry Monakhov
    Signed-off-by: Jan Kara

    Dmitry Monakhov
     

30 Mar, 2010

1 commit

  • In commit 9df93939b735 ("ext3: Use bitops to read/modify
    EXT3_I(inode)->i_state") ext3 changed its internal 'i_state' variable to
    use bitops for its state handling. However, unlike the same ext4
    change, it didn't actually change the name of the field when it changed
    the semantics of it.

    As a result, an old use of 'i_state' remained in fs/ext3/ialloc.c that
    initialized the field to EXT3_STATE_NEW. And that does not work
    _at_all_ when we're now working with individually named bits rather than
    values that get masked. So the code tried to mark the state to be new,
    but in actual fact set the field to EXT3_STATE_JDATA. Which makes no
    sense at all, and screws up all the code that checks whether the inode
    was newly allocated.

    In particular, it made the xattr code unhappy, and caused various random
    behavior, like apparently

    https://bugzilla.redhat.com/show_bug.cgi?id=577911

    So fix the initialization, and rename the field to match ext4 so that we
    don't have this happen again.

    Cc: James Morris
    Cc: Stephen Smalley
    Cc: Daniel J Walsh
    Cc: Eric Paris
    Cc: Jan Kara
    Signed-off-by: Linus Torvalds

    Linus Torvalds
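
The i_state bug described above comes from storing a bit number where a bit mask is expected. A small userspace sketch makes the effect concrete; the enum numbering is an assumption chosen to match the commit message (the "new" state read back as "jdata"), and all `model_*` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit NUMBERS, not masks. Numbering assumed for illustration. */
enum model_state_bit {
    MODEL_STATE_JDATA = 0,
    MODEL_STATE_NEW   = 1,
};

struct model_inode {
    unsigned long state_flags;
};

static void model_set_state(struct model_inode *inode, enum model_state_bit bit)
{
    inode->state_flags |= 1UL << bit;
}

static bool model_test_state(const struct model_inode *inode,
                             enum model_state_bit bit)
{
    return (inode->state_flags >> bit) & 1UL;
}

/* Buggy: stores the bit number 1 as a value, which sets bit 0 (JDATA). */
static void model_init_buggy(struct model_inode *inode)
{
    inode->state_flags = MODEL_STATE_NEW;
}

/* Fixed: clear the field, then set the named bit through the helper. */
static void model_init_fixed(struct model_inode *inode)
{
    inode->state_flags = 0;
    model_set_state(inode, MODEL_STATE_NEW);
}

/* Usage: compare what each initializer actually records. */
void run_example(void)
{
    struct model_inode inode;

    model_init_buggy(&inode);
    assert(model_test_state(&inode, MODEL_STATE_JDATA));  /* wrong bit set */
    assert(!model_test_state(&inode, MODEL_STATE_NEW));

    model_init_fixed(&inode);
    assert(model_test_state(&inode, MODEL_STATE_NEW));
    assert(!model_test_state(&inode, MODEL_STATE_JDATA));
}
```

Renaming the field, as the commit does, turns any leftover direct assignment of this kind into a compile error instead of a silent state mix-up.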
     

06 Mar, 2010

2 commits

  • * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6: (33 commits)
    quota: stop using QUOTA_OK / NO_QUOTA
    dquot: cleanup dquot initialize routine
    dquot: move dquot initialization responsibility into the filesystem
    dquot: cleanup dquot drop routine
    dquot: move dquot drop responsibility into the filesystem
    dquot: cleanup dquot transfer routine
    dquot: move dquot transfer responsibility into the filesystem
    dquot: cleanup inode allocation / freeing routines
    dquot: cleanup space allocation / freeing routines
    ext3: add writepage sanity checks
    ext3: Truncate allocated blocks if direct IO write fails to update i_size
    quota: Properly invalidate caches even for filesystems with blocksize < pagesize
    quota: generalize quota transfer interface
    quota: sb_quota state flags cleanup
    jbd: Delay discarding buffers in journal_unmap_buffer
    ext3: quota_write cross block boundary behaviour
    quota: drop permission checks from xfs_fs_set_xstate/xfs_fs_set_xquota
    quota: split out compat_sys_quotactl support from quota.c
    quota: split out netlink notification support from quota.c
    quota: remove invalid optimization from quota_sync_all
    ...

    Fixed trivial conflicts in fs/namei.c and fs/ufs/inode.c

    Linus Torvalds
     
  • This gives the filesystem more information about the writeback that
    is happening. Trond requested this for the NFS unstable write handling,
    and other filesystems might benefit from this too by being able to
    distinguish between the different callers in more detail.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     

05 Mar, 2010

2 commits

  • Get rid of the initialize dquot operation - it is now always called from
    the filesystem and if a filesystem really needs its own (which none
    currently does) it can just call into its own routine directly.

    Rename the now static low-level dquot_initialize helper to __dquot_initialize
    and vfs_dq_init to dquot_initialize to have a consistent namespace.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jan Kara

    Christoph Hellwig
     
  • Currently various places in the VFS call vfs_dq_init directly. This means
    we tie the quota code into the VFS. Get rid of that and make the
    filesystem responsible for the initialization. For most metadata operations
    this is a straightforward move into the methods, but for truncate and
    open it's a bit more complicated.

    For truncate we currently only call vfs_dq_init for the sys_truncate case
    because open already takes care of it for ftruncate and open(O_TRUNC) - the
    new code causes an additional vfs_dq_init for those which is harmless.

    For open the initialization is moved from do_filp_open into the open method,
    which means it happens slightly earlier now, and only for regular files.
    The latter is fine because we don't need to initialize it for operations
    on special files, and we already do it as part of the namespace operations
    for directories.

    Add a dquot_file_open helper that filesystems that support generic quotas
    can use to fill in ->open.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jan Kara

    Christoph Hellwig
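
The idea of an open helper that a filesystem plugs into ->open, as described in the commit above, can be modelled in userspace like this. All names are hypothetical, and restricting initialization to writable opens is an assumption of this sketch rather than something the commit message states.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; not the kernel's structures. */
struct model_inode {
    bool quota_initialized;
};

struct model_file {
    bool writable;
};

struct model_file_ops {
    int (*open)(struct model_inode *inode, struct model_file *file);
};

/* Stand-in for the generic open checks; always succeeds here. */
static int model_generic_open(struct model_inode *inode, struct model_file *file)
{
    (void)inode;
    (void)file;
    return 0;
}

static void model_quota_initialize(struct model_inode *inode)
{
    inode->quota_initialized = true;
}

/* Open helper: run the generic open, then set up quota for the inode. */
static int model_quota_file_open(struct model_inode *inode,
                                 struct model_file *file)
{
    int error = model_generic_open(inode, file);

    /* Assumption: only opens that can change usage need quota set up. */
    if (!error && file->writable)
        model_quota_initialize(inode);
    return error;
}

/* A filesystem with generic quota support fills in ->open with the helper. */
static const struct model_file_ops model_regular_file_ops = {
    .open = model_quota_file_open,
};

/* Usage: quota is initialized from the open method, not by the caller. */
void run_example(void)
{
    struct model_inode inode = { false };
    struct model_file ro = { .writable = false };
    struct model_file rw = { .writable = true };

    assert(model_regular_file_ops.open(&inode, &ro) == 0);
    assert(!inode.quota_initialized);
    assert(model_regular_file_ops.open(&inode, &rw) == 0);
    assert(inode.quota_initialized);
}
```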