05 Apr, 2016

1 commit

  • PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a *long* time
    ago with the promise that one day it would be possible to implement the
    page cache with bigger chunks than PAGE_SIZE.

    This promise never materialized, and it is unlikely that it ever will.

    We have many places where PAGE_CACHE_SIZE is assumed to be equal to
    PAGE_SIZE. And it's a constant source of confusion whether a
    PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
    especially on the border between fs and mm.

    Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much
    breakage to be doable.

    Let's stop pretending that pages in page cache are special. They are
    not.

    The changes are pretty straight-forward:

    - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

    - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

    - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};

    - page_cache_get() -> get_page();

    - page_cache_release() -> put_page();
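
    As a minimal sketch of why these substitutions are safe, the following C
    fragment models the situation in userspace. The PAGE_SHIFT value of 12 and
    the two helper names are assumptions for illustration, not kernel code. The
    point is only that a shift by (PAGE_CACHE_SHIFT - PAGE_SHIFT) is a no-op
    once the two constants are equal.

```c
#include <assert.h>

/* Assumed value for illustration; the real PAGE_SHIFT is per-architecture. */
#define PAGE_SHIFT        12
#define PAGE_SIZE         (1UL << PAGE_SHIFT)
/* Before the patch, the page cache macros were plain aliases. */
#define PAGE_CACHE_SHIFT  PAGE_SHIFT
#define PAGE_CACHE_SIZE   PAGE_SIZE

/* Old style (hypothetical helper): shift by the difference of the two shifts. */
static unsigned long old_style_index(unsigned long index)
{
    return index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
}

/* New style (hypothetical helper): the shift by zero is simply dropped. */
static unsigned long new_style_index(unsigned long index)
{
    return index;
}
```

    Both helpers return the same value for any index, which is why the
    rewrite can be done mechanically.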

    This patch contains automated changes generated with coccinelle using
    the script below. For some reason, coccinelle doesn't patch header files.
    I've called spatch for them manually.

    The only adjustment after coccinelle is a revert of the changes to the
    PAGE_CACHE_ALIGN definition: we are going to drop it later.

    There are a few places in the code that coccinelle didn't reach. I'll
    fix them manually in a separate patch. Comments and documentation will
    also be addressed in a separate patch.

    virtual patch

    @@
    expression E;
    @@
    - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
    + E

    @@
    expression E;
    @@
    - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
    + E

    @@
    @@
    - PAGE_CACHE_SHIFT
    + PAGE_SHIFT

    @@
    @@
    - PAGE_CACHE_SIZE
    + PAGE_SIZE

    @@
    @@
    - PAGE_CACHE_MASK
    + PAGE_MASK

    @@
    expression E;
    @@
    - PAGE_CACHE_ALIGN(E)
    + PAGE_ALIGN(E)

    @@
    expression E;
    @@
    - page_cache_get(E)
    + get_page(E)

    @@
    expression E;
    @@
    - page_cache_release(E)
    + put_page(E)

    Signed-off-by: Kirill A. Shutemov
    Acked-by: Michal Hocko
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     

26 Mar, 2016

2 commits

  • In the current implementation of unaligned aio+dio, the lock order behaves
    as follows:

    in user process context:
    -> call io_submit()
    -> get i_mutex
    <== window1
    -> get ip_unaligned_aio
    -> submit direct io to block device
    -> release i_mutex
    -> io_submit() return

    in dio work queue context (the work queue is created in __blockdev_direct_IO):
    -> release ip_unaligned_aio
    <== window2
    -> get i_mutex
    -> clear unwritten flag & change i_size
    -> release i_mutex

    There is a limit on the number of threads in the dio work queue, 256 by
    default. If all 256 threads are in the 'window2' stage above and a user
    process is in the 'window1' stage, the system deadlocks: the user process
    holds i_mutex while waiting for the ip_unaligned_aio lock, a direct bio
    holds the ip_unaligned_aio mutex while waiting for a dio work queue
    thread to be scheduled, and all the dio work queue threads are waiting
    for i_mutex in 'window2'.

    This case only happened in a test that sends a large number (more than
    256) of aios in one io_submit() call.

    My design is to remove the ip_unaligned_aio lock and turn unaligned aio
    into sync io instead. Just like the ip_unaligned_aio lock, this
    serializes unaligned aio+dio.
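
    The deadlock condition described above can be sketched as a small
    predicate. This is a toy model under stated assumptions, not ocfs2 code:
    the function name and its parameters are hypothetical, and only the 256
    thread default comes from the text.

```c
#include <assert.h>
#include <stdbool.h>

/* Default dio work queue thread limit quoted in the commit message. */
#define DIO_WQ_MAX_THREADS 256

/*
 * Toy model: returns true when the described deadlock can occur.
 *
 * workers_in_window2:   work queue threads that released ip_unaligned_aio
 *                       and now wait for i_mutex.
 * submitter_in_window1: a user process holds i_mutex and waits for
 *                       ip_unaligned_aio.
 * completion_pending:   an in-flight direct bio holds ip_unaligned_aio and
 *                       needs a free work queue thread to release it.
 */
static bool unaligned_aio_deadlocks(int workers_in_window2,
                                    bool submitter_in_window1,
                                    bool completion_pending)
{
    assert(workers_in_window2 >= 0 &&
           workers_in_window2 <= DIO_WQ_MAX_THREADS);
    return submitter_in_window1 && completion_pending &&
           workers_in_window2 == DIO_WQ_MAX_THREADS;
}
```

    With even one free work queue thread the pending completion can run and
    release ip_unaligned_aio, so the model reports no deadlock.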

    [akpm@linux-foundation.org: remove OCFS2_IOCB_UNALIGNED_IO, per Junxiao Bi]
    Signed-off-by: Ryan Ding
    Reviewed-by: Junxiao Bi
    Cc: Joseph Qi
    Cc: Mark Fasheh
    Cc: Joel Becker
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryan Ding
     
  • Clean up ocfs2_file_write_iter & ocfs2_prepare_inode_for_write:
    * remove the append dio check: it will be checked in ocfs2_direct_IO()
    * remove the file hole check: file holes are supported now
    * remove the inline data check: it will be checked in ocfs2_direct_IO()
    * remove the full_coherence check for append dio: we will get the
      inode_lock in ocfs2_dio_get_block, so there is no need to fall back to
      buffered io to ensure the coherence semantics.

    Now the drop dio procedure is gone. :)

    [akpm@linux-foundation.org: remove unused label]
    Signed-off-by: Ryan Ding
    Reviewed-by: Junxiao Bi
    Cc: Joseph Qi
    Cc: Mark Fasheh
    Cc: Joel Becker
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryan Ding
     

23 Jan, 2016

1 commit

  • parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
    inode_foo(inode) being mutex_foo(&inode->i_mutex).

    Please, use those for access to ->i_mutex; over the coming cycle
    ->i_mutex will become rwsem, with ->lookup() done with it held
    only shared.
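
    A userspace sketch of the wrapper pattern, assuming toy stand-ins for the
    kernel types: struct toy_mutex and the toy_mutex_*() functions are
    hypothetical and exist only to make the example self-contained. The
    wrapper names follow the inode_foo(inode) rule stated above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the kernel types; NOT the real definitions. */
struct toy_mutex { bool locked; };
struct inode { struct toy_mutex i_mutex; };

static void toy_mutex_lock(struct toy_mutex *m)
{
    assert(!m->locked);   /* single-threaded toy: locking twice is a bug */
    m->locked = true;
}

static void toy_mutex_unlock(struct toy_mutex *m)
{
    assert(m->locked);
    m->locked = false;
}

static bool toy_mutex_trylock(struct toy_mutex *m)
{
    if (m->locked)
        return false;
    m->locked = true;
    return true;
}

static bool toy_mutex_is_locked(const struct toy_mutex *m)
{
    return m->locked;
}

/* The wrappers: inode_foo(inode) is mutex_foo(&inode->i_mutex). */
static void inode_lock(struct inode *inode)
{
    toy_mutex_lock(&inode->i_mutex);
}

static void inode_unlock(struct inode *inode)
{
    toy_mutex_unlock(&inode->i_mutex);
}

static bool inode_trylock(struct inode *inode)
{
    return toy_mutex_trylock(&inode->i_mutex);
}

static bool inode_is_locked(const struct inode *inode)
{
    return toy_mutex_is_locked(&inode->i_mutex);
}
```

    Callers that only use the wrappers never name ->i_mutex directly, so the
    underlying lock type can change without touching them.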

    Signed-off-by: Al Viro

    Al Viro
     

15 Jan, 2016

1 commit

  • Some versions of tar assume that files with st_blocks == 0 do not
    contain any data and will skip reading them entirely. See also commit
    9206c561554c ("ext4: return non-zero st_blocks for inline data").
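
    A simplified model of the heuristic described above, assuming it boils
    down to a check on st_blocks. This is not tar's actual code, and the
    function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>

/*
 * Simplified model (not tar's real logic): a file that has a non-zero size
 * but reports zero allocated blocks is treated as having nothing to read.
 */
static bool archiver_would_skip_read(const struct stat *st)
{
    return st->st_size > 0 && st->st_blocks == 0;
}
```

    A file whose data lives inline in the inode can have a non-zero st_size
    with no allocated blocks, so reporting a non-zero st_blocks keeps such a
    check from skipping it.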

    Signed-off-by: John Haxby
    Reviewed-by: Mark Fasheh
    Cc: Joel Becker
    Acked-by: Gang He
    Reviewed-by: Junxiao Bi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    John Haxby
     

05 Sep, 2015

5 commits

  • PID: 614 TASK: ffff882a739da580 CPU: 3 COMMAND: "ocfs2dc"
    #0 [ffff882ecc3759b0] machine_kexec at ffffffff8103b35d
    #1 [ffff882ecc375a20] crash_kexec at ffffffff810b95b5
    #2 [ffff882ecc375af0] oops_end at ffffffff815091d8
    #3 [ffff882ecc375b20] die at ffffffff8101868b
    #4 [ffff882ecc375b50] do_trap at ffffffff81508bb0
    #5 [ffff882ecc375ba0] do_invalid_op at ffffffff810165e5
    #6 [ffff882ecc375c40] invalid_op at ffffffff815116fb
    [exception RIP: ocfs2_ci_checkpointed+208]
    RIP: ffffffffa0a7e940 RSP: ffff882ecc375cf0 RFLAGS: 00010002
    RAX: 0000000000000001 RBX: 000000000000654b RCX: ffff8812dc83f1f8
    RDX: 00000000000017d9 RSI: ffff8812dc83f1f8 RDI: ffffffffa0b2c318
    RBP: ffff882ecc375d20 R8: ffff882ef6ecfa60 R9: ffff88301f272200
    R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffffffffff
    R13: ffff8812dc83f4f0 R14: 0000000000000000 R15: ffff8812dc83f1f8
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
    #7 [ffff882ecc375d28] ocfs2_check_meta_downconvert at ffffffffa0a7edbd [ocfs2]
    #8 [ffff882ecc375d38] ocfs2_unblock_lock at ffffffffa0a84af8 [ocfs2]
    #9 [ffff882ecc375dc8] ocfs2_process_blocked_lock at ffffffffa0a85285 [ocfs2]
    #10 [ffff882ecc375e18] ocfs2_downconvert_thread_do_work at ffffffffa0a85445 [ocfs2]
    #11 [ffff882ecc375e68] ocfs2_downconvert_thread at ffffffffa0a854de [ocfs2]
    #12 [ffff882ecc375ee8] kthread at ffffffff81090da7
    #13 [ffff882ecc375f48] kernel_thread_helper at ffffffff81511884
    The assert is tripped because the trans is not checkpointed and the lock
    level is PR.

    Some time earlier, a chmod command had been executed. As a result, the
    following call chain left the inode cluster lock in PR state, later causing
    the assert.
    system_call_fastpath
    -> my_chmod
    -> sys_chmod
    -> sys_fchmodat
    -> notify_change
    -> ocfs2_setattr
    -> posix_acl_chmod
    -> ocfs2_iop_set_acl
    -> ocfs2_set_acl
    -> ocfs2_acl_set_mode
    Here is how.
    1119 int ocfs2_setattr(struct dentry *dentry, struct iattr *attr)
    1120 {
    1247 ocfs2_inode_unlock(inode, 1); <<< WRONG thing to do.
    ..
    1258 if (!status && attr->ia_valid & ATTR_MODE) {
    1259 status = posix_acl_chmod(inode, inode->i_mode);

    519 posix_acl_chmod(struct inode *inode, umode_t mode)
    520 {
    ..
    539 ret = inode->i_op->set_acl(inode, acl, ACL_TYPE_ACCESS);

    287 int ocfs2_iop_set_acl(struct inode *inode, struct posix_acl *acl, ...
    288 {
    289 return ocfs2_set_acl(NULL, inode, NULL, type, acl, NULL, NULL);

    224 int ocfs2_set_acl(handle_t *handle,
    225 struct inode *inode, ...
    231 {
    ..
    252 ret = ocfs2_acl_set_mode(inode, di_bh,
    253 handle, mode);

    168 static int ocfs2_acl_set_mode(struct inode *inode, struct buffer_head ...
    170 {
    183 if (handle == NULL) {
    >>> BUG: inode lock not held in ex at this point <<<
    184 handle = ocfs2_start_trans(OCFS2_SB(inode->i_sb),
    185 OCFS2_INODE_UPDATE_CREDITS);

    At ocfs2_setattr.#1247 we unlock, and at #1259 we call posix_acl_chmod. When
    we reach ocfs2_acl_set_mode.#181 and start the trans, the inode cluster lock
    is not held in EX mode (it should be). How could this have happened?

    We are the lock master, were holding the lock in EX and have released it in
    ocfs2_setattr.#1247. Note that there are no holders of this lock at
    this point. Another node needs the lock in PR, and we downconvert from
    EX to PR. So the inode lock is PR when we do the trans in
    ocfs2_acl_set_mode.#184. The trans stays in core (not flushed to disk).
    Now another node wants the lock in EX, and the downconvert thread gets
    kicked (the one that tripped the assert above). It finds an unflushed
    trans, but the lock is not EX (it is PR). If the lock were at EX, it would
    have flushed the trans (ocfs2_ci_checkpointed -> ocfs2_start_checkpoint)
    before downconverting (to NULL) for the request.

    ocfs2_setattr must not drop the inode lock EX in this code path. If it
    does, and takes it again before the trans, say in ocfs2_set_acl, another
    cluster node can get in between and execute another setattr, overwriting
    the one in progress on this node and resulting in a mode/acl/size combo
    that is a mix of the two.

    Orabug: 20189959
    Signed-off-by: Tariq Saeed
    Reviewed-by: Mark Fasheh
    Cc: Joel Becker
    Cc: Joseph Qi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tariq Saeed
     
  • Since commit 86b9c6f3f891 ("ocfs2: remove filesize checks for sync I/O
    journal commit") removes filesize checks for sync I/O journal commit,
    variables old_size and old_clusters are not actually used any more. So
    clean them up.

    Signed-off-by: Joseph Qi
    Cc: Mark Fasheh
    Cc: Joel Becker
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joseph Qi
     
  • 1) Take rw EX lock in case of append dio.
    2) Explicitly treat the error code -EIOCBQUEUED as normal.
    3) Set di_bh to NULL after brelse if it may be used again later.

    Signed-off-by: Joseph Qi
    Cc: Yiwen Jiang
    Cc: Weiwei Wang
    Cc: Mark Fasheh
    Cc: Joel Becker
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joseph Qi
     
  • During direct io the inode is first added to orphan and then deleted
    from orphan. There is a race window in which the orphan entry can be
    deleted twice, triggering the BUG when validating
    OCFS2_DIO_ORPHANED_FL in ocfs2_del_inode_from_orphan.

    ocfs2_direct_IO_write
    ...
    ocfs2_add_inode_to_orphan
    >>>>>>>> race window.
    1) another node may rm the file and then go down; this node
    takes care of orphan recovery and clears the flag
    OCFS2_DIO_ORPHANED_FL.
    2) since the rw lock is unlocked, it may race with another
    orphan recovery and append dio.
    ocfs2_del_inode_from_orphan

    So take the inode mutex lock when recovering orphans and do the rw unlock
    at the end of aio write in case of append dio.

    Signed-off-by: Joseph Qi
    Reported-by: Yiwen Jiang
    Cc: Weiwei Wang
    Cc: Mark Fasheh
    Cc: Joel Becker
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joseph Qi
     
  • ocfs2_file_write_iter() is using the wrong return value ('written'). This
    will cause ocfs2_rw_unlock() to be called in both write_iter & end_io,
    triggering a BUG_ON.

    This issue was introduced by commit 7da839c47589 ("ocfs2: use
    __generic_file_write_iter()").

    Orabug: 21612107
    Fixes: 7da839c47589 ("ocfs2: use __generic_file_write_iter()")
    Signed-off-by: Ryan Ding
    Reviewed-by: Junxiao Bi
    Cc: Al Viro
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryan Ding
     

24 Jul, 2015

2 commits


26 Jun, 2015

1 commit

  • Pull cgroup writeback support from Jens Axboe:
    "This is the big pull request for adding cgroup writeback support.

    This code has been in development for a long time, and it has been
    simmering in for-next for a good chunk of this cycle too. This is one
    of those problems that has been talked about for at least half a
    decade, finally there's a solution and code to go with it.

    Also see last weeks writeup on LWN:

    http://lwn.net/Articles/648292/"

    * 'for-4.2/writeback' of git://git.kernel.dk/linux-block: (85 commits)
    writeback, blkio: add documentation for cgroup writeback support
    vfs, writeback: replace FS_CGROUP_WRITEBACK with SB_I_CGROUPWB
    writeback: do foreign inode detection iff cgroup writeback is enabled
    v9fs: fix error handling in v9fs_session_init()
    bdi: fix wrong error return value in cgwb_create()
    buffer: remove unusued 'ret' variable
    writeback: disassociate inodes from dying bdi_writebacks
    writeback: implement foreign cgroup inode bdi_writeback switching
    writeback: add lockdep annotation to inode_to_wb()
    writeback: use unlocked_inode_to_wb transaction in inode_congested()
    writeback: implement unlocked_inode_to_wb transaction and use it for stat updates
    writeback: implement [locked_]inode_to_wb_and_lock_list()
    writeback: implement foreign cgroup inode detection
    writeback: make writeback_control track the inode being written back
    writeback: relocate wb[_try]_get(), wb_put(), inode_{attach|detach}_wb()
    mm: vmscan: disable memcg direct reclaim stalling if cgroup writeback support is in use
    writeback: implement memcg writeback domain based throttling
    writeback: reset wb_domain->dirty_limit[_tstmp] when memcg domain size changes
    writeback: implement memcg wb_domain
    writeback: update wb_over_bg_thresh() to use wb_domain aware operations
    ...

    Linus Torvalds
     

25 Jun, 2015

1 commit

  • In ocfs2 direct read/write, the OCFS2_IOCB_SEM lock type was used to
    protect the inode->i_alloc_sem rw semaphore in earlier kernel versions.
    However, in the latest kernel, inode->i_alloc_sem is no longer used at
    all, so the OCFS2_IOCB_SEM lock type needs to be removed.

    Signed-off-by: Weiwei Wang
    Cc: Mark Fasheh
    Cc: Joel Becker
    Reviewed-by: Junxiao Bi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    WeiWei Wang
     

02 Jun, 2015

1 commit

  • With the planned cgroup writeback support, backing-dev related
    declarations will be more widely used across block and cgroup;
    unfortunately, including backing-dev.h from include/linux/blkdev.h
    makes cyclic include dependency quite likely.

    This patch separates out backing-dev-defs.h which only has the
    essential definitions and updates blkdev.h to include it. c files
    which need access to more backing-dev details now include
    backing-dev.h directly. This takes backing-dev.h off the common
    include dependency chain making it a lot easier to use it across block
    and cgroup.

    v2: fs/fat build failure fixed.

    Signed-off-by: Tejun Heo
    Reviewed-by: Jan Kara
    Cc: Jens Axboe
    Signed-off-by: Jens Axboe

    Tejun Heo
     

16 Apr, 2015

1 commit


12 Apr, 2015

8 commits


09 Apr, 2015

3 commits


13 Mar, 2015

1 commit


17 Feb, 2015

4 commits

  • Introduce a bit OCFS2_FEATURE_RO_COMPAT_APPEND_DIO and check it in
    the write flow. If the bit is not set, fall back to the old way.
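
    The check can be sketched as a plain bit test. The bit value 0x0008, the
    TOY_ prefix, the enum, and the function name are all placeholders chosen
    for illustration; they are not taken from the ocfs2 headers.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bit value for illustration; not taken from the ocfs2 headers. */
#define TOY_FEATURE_RO_COMPAT_APPEND_DIO 0x0008u

enum write_path {
    WRITE_PATH_OLD_FALLBACK,   /* bit clear: behave as before */
    WRITE_PATH_APPEND_DIO      /* bit set: append direct io allowed */
};

/* Pick the write path from the volume's ro-compat feature mask. */
static enum write_path choose_write_path(uint32_t ro_compat_features)
{
    if (ro_compat_features & TOY_FEATURE_RO_COMPAT_APPEND_DIO)
        return WRITE_PATH_APPEND_DIO;
    return WRITE_PATH_OLD_FALLBACK;
}
```

    Gating on a feature bit keeps volumes that were formatted without the
    feature on the old behavior.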

    Signed-off-by: Joseph Qi
    Cc: Weiwei Wang
    Cc: Joel Becker
    Cc: Junxiao Bi
    Cc: Mark Fasheh
    Cc: Xuejiufei
    Cc: alex chen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joseph Qi
     
  • Complete the rest of the request through buffered io after the direct
    write has been performed.

    Signed-off-by: Joseph Qi
    Cc: Weiwei Wang
    Cc: Joel Becker
    Cc: Junxiao Bi
    Cc: Mark Fasheh
    Cc: Xuejiufei
    Cc: alex chen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joseph Qi
     
  • Now we can do direct io and no longer fall back to buffered IO in
    case of append O_DIRECT write.

    Signed-off-by: Joseph Qi
    Cc: Weiwei Wang
    Cc: Joel Becker
    Cc: Junxiao Bi
    Cc: Mark Fasheh
    Cc: Xuejiufei
    Cc: alex chen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joseph Qi
     
  • Currently in case of append O_DIRECT write (block not allocated yet),
    ocfs2 will fall back to buffered I/O. This has some disadvantages.
    Firstly, it is not the expected behavior. Secondly, it consumes a huge
    amount of page cache, e.g. in a mass backup scenario. Thirdly, modern
    filesystems such as ext4 support this feature.

    In this patch set, the direct I/O write no longer falls back to buffered
    I/O write, because block allocation is now enabled in direct I/O.

    This patch (of 9):

    Prepare some interfaces which will be used in append O_DIRECT write.

    Signed-off-by: Joseph Qi
    Cc: Weiwei Wang
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Xuejiufei
    Cc: Junxiao Bi
    Cc: alex chen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joseph Qi
     

13 Feb, 2015

1 commit

  • Pull backing device changes from Jens Axboe:
    "This contains a cleanup of how the backing device is handled, in
    preparation for a rework of the life time rules. In this part, the
    most important change is to split the unrelated nommu mmap flags from
    it, but also removing a backing_dev_info pointer from the
    address_space (and inode), and a cleanup of other various minor bits.

    Christoph did all the work here, I just fixed an oops with pages that
    have a swap backing. Arnd fixed a missing export, and Oleg killed the
    lustre backing_dev_info from staging. Last patch was from Al,
    unexporting parts that are now no longer needed outside"

    * 'for-3.20/bdi' of git://git.kernel.dk/linux-block:
    Make super_blocks and sb_lock static
    mtd: export new mtd_mmap_capabilities
    fs: make inode_to_bdi() handle NULL inode
    staging/lustre/llite: get rid of backing_dev_info
    fs: remove default_backing_dev_info
    fs: don't reassign dirty inodes to default_backing_dev_info
    nfs: don't call bdi_unregister
    ceph: remove call to bdi_unregister
    fs: remove mapping->backing_dev_info
    fs: export inode_to_bdi and use it in favor of mapping->backing_dev_info
    nilfs2: set up s_bdi like the generic mount_bdev code
    block_dev: get bdev inode bdi directly from the block device
    block_dev: only write bdev inode on close
    fs: introduce f_op->mmap_capabilities for nommu mmap support
    fs: kill BDI_CAP_SWAP_BACKED
    fs: deduplicate noop_backing_dev_info

    Linus Torvalds
     

11 Feb, 2015

1 commit


21 Jan, 2015

1 commit

  • Now that we got rid of the bdi abuse on character devices we can always use
    sb->s_bdi to get at the backing_dev_info for a file, except for the block
    device special case. Export inode_to_bdi and replace uses of
    mapping->backing_dev_info with it to prepare for the removal of
    mapping->backing_dev_info.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Tejun Heo
    Reviewed-by: Jan Kara
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

19 Dec, 2014

1 commit

  • When running the ocfs2 test suite's multiple nodes reflink stress test on
    a 4 node cluster, every unlink() of a refcounted file takes about 700s.

    The slow unlink is caused by contention on the refcount tree lock, since
    all nodes are unlinking files that use the same refcount tree. When the
    file being unlinked has many extents (over 1600 in our test), most of the
    extents have the refcounted flag set. ocfs2_commit_truncate() executes
    the following call trace for every extent. This means it needs to get
    and release the refcount tree lock about 1600 times. And when several
    nodes do this at the same time, performance is very low.

    ocfs2_remove_btree_range()
    -- ocfs2_lock_refcount_tree()
    ---- ocfs2_refcount_lock()
    ------ __ocfs2_cluster_lock()

    ocfs2_refcount_lock() is costly; moving it to ocfs2_commit_truncate() so
    that lock/unlock is done only once improves performance a lot.
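
    The effect of hoisting the lock can be sketched with a toy counter. The
    struct and function names below are hypothetical stand-ins, not the
    ocfs2 functions; the counter merely records how many times the costly
    lock would be taken.

```c
#include <assert.h>

/* Toy counter standing in for the cost of a cluster lock round trip. */
struct toy_refcount_tree {
    unsigned long lock_calls;
    int held;
};

static void toy_lock(struct toy_refcount_tree *t)
{
    assert(!t->held);
    t->held = 1;
    t->lock_calls++;
}

static void toy_unlock(struct toy_refcount_tree *t)
{
    assert(t->held);
    t->held = 0;
}

/* Before: lock and unlock around every extent. */
static void truncate_lock_per_extent(struct toy_refcount_tree *t,
                                     unsigned int extents)
{
    for (unsigned int i = 0; i < extents; i++) {
        toy_lock(t);
        /* ... remove one refcounted extent ... */
        toy_unlock(t);
    }
}

/* After: take the lock once for the whole truncate. */
static void truncate_lock_once(struct toy_refcount_tree *t,
                               unsigned int extents)
{
    toy_lock(t);
    for (unsigned int i = 0; i < extents; i++) {
        /* ... remove one refcounted extent ... */
    }
    toy_unlock(t);
}
```

    For a file with 1600 extents the first variant takes the lock 1600 times
    and the second takes it once.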

    Signed-off-by: Junxiao Bi
    Cc: Wengang
    Reviewed-by: Mark Fasheh
    Cc: Joel Becker
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Junxiao Bi
     

11 Dec, 2014

1 commit

  • Filesize is not a good indication that the file needs to be synced.
    An example where this breaks is:
    1. Open the file in O_SYNC|O_RDWR
    2. Read a small portion of the file (say 64 bytes)
    3. Lseek to the start of the file
    4. Write 64 bytes

    If the node crashes, the write is not on disk because it was not
    committed in the journal, and the other node which reads the file after
    recovery reads stale data (even though the write on the crashed node was
    successful).
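
    The four steps above can be sketched with standard POSIX calls. The
    function name is hypothetical, and the sketch only reproduces the I/O
    sequence on an existing file of at least 64 bytes; it does not
    demonstrate the crash itself.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Runs the four steps on an existing file of at least 64 bytes.
 * Returns 0 on success, -1 on any failure.
 */
static int sync_overwrite_start(const char *path)
{
    char buf[64];
    int ret = -1;
    int fd = open(path, O_SYNC | O_RDWR);            /* step 1 */

    if (fd < 0)
        return -1;
    if (read(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf))   /* step 2 */
        goto out;
    if (lseek(fd, 0, SEEK_SET) != 0)                 /* step 3 */
        goto out;
    memset(buf, 'x', sizeof(buf));
    /* step 4: overwrite in place, so the file size does not change */
    if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf))
        goto out;
    ret = 0;
out:
    close(fd);
    return ret;
}
```

    Because the write in step 4 overwrites existing bytes, the file size
    stays the same, which is why a size-based check cannot tell that a sync
    is needed.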

    Signed-off-by: Goldwyn Rodrigues
    Reviewed-by: Mark Fasheh
    Cc: Joel Becker
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Goldwyn Rodrigues
     

11 Oct, 2014

1 commit

  • Pull UDF and quota updates from Jan Kara:
    "A few UDF fixes and also a few patches which are preparing filesystems
    for support of project quotas in VFS"

    * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
    udf: Fix loading of special inodes
    ocfs2: Back out change to use OCFS2_MAXQUOTAS in ocfs2_setattr()
    udf: remove redundant sys_tz declaration
    ocfs2: Don't use MAXQUOTAS value
    reiserfs: Don't use MAXQUOTAS value
    ext3: Don't use MAXQUOTAS value
    udf: Fix race between write(2) and close(2)

    Linus Torvalds
     

10 Oct, 2014

1 commit

  • To commit the ocfs2 journal, the ocfs2 journal thread acquires the mutex
    osb->journal->j_trans_barrier and wakes up the jbd2 commit thread, then
    waits until the jbd2 commit thread is done. In ordered journal mode, jbd2
    needs to flush dirty data pages first, and this requires the page lock.
    So osb->journal->j_trans_barrier should be taken before the page lock.

    But ocfs2_write_zero_page() and ocfs2_write_begin_inline() do not obey this
    locking order, and this can cause a deadlock and hang the whole cluster.

    One deadlock that was caught is the following:

    PID: 13449 TASK: ffff8802e2f08180 CPU: 31 COMMAND: "oracle"
    #0 [ffff8802ee3f79b0] __schedule at ffffffff8150a524
    #1 [ffff8802ee3f7a58] schedule at ffffffff8150acbf
    #2 [ffff8802ee3f7a68] rwsem_down_failed_common at ffffffff8150cb85
    #3 [ffff8802ee3f7ad8] rwsem_down_read_failed at ffffffff8150cc55
    #4 [ffff8802ee3f7ae8] call_rwsem_down_read_failed at ffffffff812617a4
    #5 [ffff8802ee3f7b50] ocfs2_start_trans at ffffffffa0498919 [ocfs2]
    #6 [ffff8802ee3f7ba0] ocfs2_zero_start_ordered_transaction at ffffffffa048b2b8 [ocfs2]
    #7 [ffff8802ee3f7bf0] ocfs2_write_zero_page at ffffffffa048e9bd [ocfs2]
    #8 [ffff8802ee3f7c80] ocfs2_zero_extend_range at ffffffffa048ec83 [ocfs2]
    #9 [ffff8802ee3f7ce0] ocfs2_zero_extend at ffffffffa048edfd [ocfs2]
    #10 [ffff8802ee3f7d50] ocfs2_extend_file at ffffffffa049079e [ocfs2]
    #11 [ffff8802ee3f7da0] ocfs2_setattr at ffffffffa04910ed [ocfs2]
    #12 [ffff8802ee3f7e70] notify_change at ffffffff81187d29
    #13 [ffff8802ee3f7ee0] do_truncate at ffffffff8116bbc1
    #14 [ffff8802ee3f7f50] sys_ftruncate at ffffffff8116bcbd
    #15 [ffff8802ee3f7f80] system_call_fastpath at ffffffff81515142
    RIP: 00007f8de750c6f7 RSP: 00007fffe786e478 RFLAGS: 00000206
    RAX: 000000000000004d RBX: ffffffff81515142 RCX: 0000000000000000
    RDX: 0000000000000200 RSI: 0000000000028400 RDI: 000000000000000d
    RBP: 00007fffe786e040 R8: 0000000000000000 R9: 000000000000000d
    R10: 0000000000000000 R11: 0000000000000206 R12: 000000000000000d
    R13: 00007fffe786e710 R14: 00007f8de70f8340 R15: 0000000000028400
    ORIG_RAX: 000000000000004d CS: 0033 SS: 002b

    crash64> bt
    PID: 7610 TASK: ffff88100fd56140 CPU: 1 COMMAND: "ocfs2cmt"
    #0 [ffff88100f4d1c50] __schedule at ffffffff8150a524
    #1 [ffff88100f4d1cf8] schedule at ffffffff8150acbf
    #2 [ffff88100f4d1d08] jbd2_log_wait_commit at ffffffffa01274fd [jbd2]
    #3 [ffff88100f4d1d98] jbd2_journal_flush at ffffffffa01280b4 [jbd2]
    #4 [ffff88100f4d1dd8] ocfs2_commit_cache at ffffffffa0499b14 [ocfs2]
    #5 [ffff88100f4d1e38] ocfs2_commit_thread at ffffffffa0499d38 [ocfs2]
    #6 [ffff88100f4d1ee8] kthread at ffffffff81090db6
    #7 [ffff88100f4d1f48] kernel_thread_helper at ffffffff81516284

    crash64> bt
    PID: 7609 TASK: ffff88100f2d4480 CPU: 0 COMMAND: "jbd2/dm-20-86"
    #0 [ffff88100def3920] __schedule at ffffffff8150a524
    #1 [ffff88100def39c8] schedule at ffffffff8150acbf
    #2 [ffff88100def39d8] io_schedule at ffffffff8150ad6c
    #3 [ffff88100def39f8] sleep_on_page at ffffffff8111069e
    #4 [ffff88100def3a08] __wait_on_bit_lock at ffffffff8150b30a
    #5 [ffff88100def3a58] __lock_page at ffffffff81110687
    #6 [ffff88100def3ab8] write_cache_pages at ffffffff8111b752
    #7 [ffff88100def3be8] generic_writepages at ffffffff8111b901
    #8 [ffff88100def3c48] journal_submit_data_buffers at ffffffffa0120f67 [jbd2]
    #9 [ffff88100def3cf8] jbd2_journal_commit_transaction at ffffffffa0121372 [jbd2]
    #10 [ffff88100def3e68] kjournald2 at ffffffffa0127a86 [jbd2]
    #11 [ffff88100def3ee8] kthread at ffffffff81090db6
    #12 [ffff88100def3f48] kernel_thread_helper at ffffffff81516284

    Signed-off-by: Junxiao Bi
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Alex Chen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Junxiao Bi
     

01 Oct, 2014

1 commit

  • ocfs2_setattr() really needs to use MAXQUOTAS and not
    OCFS2_MAXQUOTAS, since it passes the array over to VFS. Currently
    this isn't a problem since MAXQUOTAS == OCFS2_MAXQUOTAS, but it would
    become one once we introduce project quotas.
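
    A toy sketch of the sizing issue, assuming a future in which the VFS
    constant is larger than the ocfs2 one. All TOY_ names, both values, and
    both functions are hypothetical and only model the array hand-off.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical values for a future with project quotas. */
#define TOY_MAXQUOTAS        3   /* what the VFS-side code iterates over */
#define TOY_OCFS2_MAXQUOTAS  2   /* what the filesystem itself supports */

struct toy_dquot { int id; };

/* Stand-in for a VFS helper that walks all TOY_MAXQUOTAS slots. */
static int toy_vfs_count_transfers(struct toy_dquot *transfer_to[TOY_MAXQUOTAS])
{
    int n = 0;

    for (int i = 0; i < TOY_MAXQUOTAS; i++)
        if (transfer_to[i] != NULL)
            n++;
    return n;
}

/* The caller sizes the array with the VFS constant, not the smaller one. */
static int toy_setattr_transfers(struct toy_dquot *usr, struct toy_dquot *grp)
{
    struct toy_dquot *transfer_to[TOY_MAXQUOTAS] = { NULL };

    transfer_to[0] = usr;
    transfer_to[1] = grp;
    return toy_vfs_count_transfers(transfer_to);
}
```

    Had the array been declared with TOY_OCFS2_MAXQUOTAS entries, the VFS
    stand-in would read one slot past its end.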

    CC: Mark Fasheh
    CC: Joel Becker
    CC: ocfs2-devel@oss.oracle.com
    Signed-off-by: Jan Kara

    Jan Kara