Eric Lee / smarc-fsl-linux-kernel

06 Nov, 2015

2 commits

4e357b932 ocfs2: fill in the unused portion of the block with zeros by dio_zero_block()

A simplified test case is (this case from Ryan):
1) dd if=/dev/zero of=/mnt/hello bs=512 count=1 oflag=direct;
2) truncate /mnt/hello -s 2097152
The file 'hello' does not exist before the test. After these commands,
'hello' should be all zeros, but bytes 512~4096 contain random data.

Set the bh state to new when a new block is obtained, so that
direct_io_worker()->dio_zero_block() fills in the unused portion
of the block with zeros.
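
As a rough user-space illustration of the zeroing described above (this is not the kernel's dio_zero_block(); the helper names zero_block_tail and range_is_zero are hypothetical), the following sketch clears the part of a block that a short write did not cover:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: zero the bytes of a freshly allocated block that a
 * short write did not cover, i.e. the range [written, block_size).
 */
static void zero_block_tail(unsigned char *block, size_t block_size,
                            size_t written)
{
    assert(written <= block_size);
    memset(block + written, 0, block_size - written);
}

/* Return 1 if every byte in [from, to) is zero, 0 otherwise. */
static int range_is_zero(const unsigned char *buf, size_t from, size_t to)
{
    for (size_t i = from; i < to; i++) {
        if (buf[i] != 0)
            return 0;
    }
    return 1;
}
```

With a 4096-byte block and a 512-byte write, as in the test case, zero_block_tail(block, 4096, 512) would clear bytes 512 through 4095.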

Signed-off-by: Yiwen Jiang
Reviewed-by: Joseph Qi
Cc: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

jiangyiwen
2015-11-06 11:34:48 +0800
d162eaad7 ocfs2_direct_IO_write() misses ocfs2_is_overwrite() error code

If ocfs2_is_overwrite fails, ocfs2_direct_IO_write may still return
success to the caller.

Signed-off-by: Norton.Zhu
Cc: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Norton.Zhu
2015-11-06 11:34:48 +0800

05 Sep, 2015

5 commits

7ecef14ab ocfs2: neaten do_error, ocfs2_error and ocfs2_abort

These uses sometimes do and sometimes don't have '\n' terminations. Make
the uses consistently use '\n' terminations and remove the newline from
the functions.

Miscellanea:

o Coalesce formats
o Realign arguments

Signed-off-by: Joe Perches
Reviewed-by: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joe Perches
2015-09-05 07:54:41 +0800
7f27ec978 ocfs2: call ocfs2_journal_access_di() before ocfs2_journal_dirty() in ocfs2_write_end_nolock()

1: After we call ocfs2_journal_access_di() in ocfs2_write_begin(),
jbd2_journal_restart() may also be called. That function decrements
transaction A's t_updates and obtains a new transaction B. If
jbd2_journal_commit_transaction() happens to be committing transaction A,
then once t_updates == 0 it goes on to complete the commit and unfile
the buffer.

So by the time jbd2_journal_dirty_metadata() is called, the handle points
to the new transaction B and the buffer head's journal head has already
been freed (jh->b_transaction == NULL, jh->b_next_transaction == NULL).
It returns EINVAL, which triggers the BUG_ON(status).

thread 1                                   jbd2
ocfs2_write_begin                          jbd2_journal_commit_transaction
  ocfs2_write_begin_nolock
    ocfs2_start_trans
      jbd2__journal_start(t_updates+1,
                          transaction A)
    ocfs2_journal_access_di
    ocfs2_write_cluster_by_desc
      ocfs2_mark_extent_written
        ocfs2_change_extent_flag
          ocfs2_split_extent
            ocfs2_extend_rotate_transaction
              jbd2_journal_restart
              (t_updates-1,transaction B)  t_updates==0
                                           __jbd2_journal_refile_buffer
                                           (jh->b_transaction = NULL)
ocfs2_write_end
  ocfs2_write_end_nolock
    ocfs2_journal_dirty
      jbd2_journal_dirty_metadata(bug)
    ocfs2_commit_trans

2. In ext4, I found that jbd2_journal_get_write_access() is called from
ext4_write_end:

ext4_write_begin
  ext4_journal_start
    __ext4_journal_start_sb
      ext4_journal_check_start
      jbd2__journal_start

ext4_write_end
  ext4_mark_inode_dirty
    ext4_reserve_inode_write
      ext4_journal_get_write_access
        jbd2_journal_get_write_access
    ext4_mark_iloc_dirty
      ext4_do_update_inode
        ext4_handle_dirty_metadata
          jbd2_journal_dirty_metadata

3. So I think we should call ocfs2_journal_access_di before
ocfs2_journal_dirty in ocfs2_write_end. It works well after my
modification.

Signed-off-by: vicky
Reviewed-by: Mark Fasheh
Cc: Joel Becker
Cc: Zhangguanghui
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

yangwenfang
2015-09-05 07:54:41 +0800
6ab855a99 ocfs2: add ip_alloc_sem in direct IO to protect allocation changes

In ocfs2, ip_alloc_sem is used to protect allocation changes on the
node. In direct IO, we take ip_alloc_sem to protect data consistency
in the race between direct IO and ocfs2_truncate_file (buffered IO
already uses ip_alloc_sem). Although the inode->i_mutex lock already
prevents these from running concurrently, I think ip_alloc_sem is still
needed because protecting allocation changes is important.

Other filesystems such as ext4 also use an rw_semaphore to protect data
consistency in the get_block-vs-truncate race by other means, so
ip_alloc_sem is needed in ocfs2 direct IO.

Signed-off-by: Weiwei Wang
Signed-off-by: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

WeiWei Wang
2015-09-05 07:54:41 +0800
faaebf18f ocfs2: fix several issues of append dio

1) Take rw EX lock in case of append dio.
2) Explicitly treat the error code -EIOCBQUEUED as normal.
3) Set di_bh to NULL after brelse if it may be used again later.

Signed-off-by: Joseph Qi
Cc: Yiwen Jiang
Cc: Weiwei Wang
Cc: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joseph Qi
2015-09-05 07:54:41 +0800
512f62acb ocfs2: fix race between dio and recover orphan

During direct IO the inode is first added to the orphan dir and then
deleted from it. There is a race window in which the orphan entry can
be deleted twice, triggering the BUG when validating
OCFS2_DIO_ORPHANED_FL in ocfs2_del_inode_from_orphan.

ocfs2_direct_IO_write
  ...
  ocfs2_add_inode_to_orphan
  >>>>>>>> race window.
           1) another node may rm the file and then go down; this node
              takes care of orphan recovery and clears the flag
              OCFS2_DIO_ORPHANED_FL.
           2) since the rw lock is unlocked, it may race with another
              orphan recovery and append dio.
  ocfs2_del_inode_from_orphan

So take the inode mutex lock when recovering orphans, and do the rw unlock
at the end of aio write in the case of append dio.

Signed-off-by: Joseph Qi
Reported-by: Yiwen Jiang
Cc: Weiwei Wang
Cc: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joseph Qi
2015-09-05 07:54:41 +0800

07 Aug, 2015

1 commit

32e5a2a2b ocfs2: fix shift left overflow

When using a large volume, for example 9T volume with 2T already used,
frequent creation of small files with O_DIRECT when the IO is not
cluster aligned may clear sectors in the wrong place. This will cause
filesystem corruption.

This is because p_cpos is a u32. When calculating the corresponding
sector it should be converted to u64 first, otherwise it may overflow.
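
The overflow can be reproduced in a small user-space sketch. The function names and the geometry used here (1 MiB clusters and 512-byte sectors, so a shift of 11 bits) are assumptions for illustration only and are not taken from the ocfs2 source:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Buggy form: the shift is evaluated in 32 bits, so any bits shifted past
 * bit 31 are lost before the result is widened to 64 bits.
 */
static uint64_t clusters_to_sectors_buggy(uint32_t p_cpos, unsigned int bits)
{
    assert(bits < 32);
    uint32_t truncated = p_cpos << bits;
    return truncated;
}

/* Fixed form: widen to 64 bits first, then shift. */
static uint64_t clusters_to_sectors_fixed(uint32_t p_cpos, unsigned int bits)
{
    assert(bits < 32);
    return (uint64_t)p_cpos << bits;
}
```

Under these assumptions, a cluster offset of 1 << 21 (2 TiB into the volume) maps to sector 0 in the buggy form and to sector 4294967296 in the fixed form, which matches the "wrong place" symptom described above.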

Signed-off-by: Joseph Qi
Cc: Mark Fasheh
Cc: Joel Becker
Cc: [4.0+]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joseph Qi
2015-08-07 09:39:41 +0800

25 Jun, 2015

3 commits

ae1f08146 ocfs2: fix wrong check in ocfs2_direct_IO_get_blocks

contig_blocks returned by ocfs2_extent_map_get_blocks is counted in blocks,
so it cannot be compared directly with clusters_to_alloc. Convert it to
clusters first.
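
A minimal sketch of the unit mismatch follows. The helper names and the geometry (256 blocks per cluster, i.e. a shift of 8 bits) are hypothetical, and the actual comparison in ocfs2_direct_IO_get_blocks may be shaped differently:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical conversion: whole clusters covered by a run of blocks. */
static uint32_t blocks_to_clusters(uint64_t blocks, unsigned int bits)
{
    assert(bits < 64);
    return (uint32_t)(blocks >> bits);
}

/* Buggy check: compares a block count against a cluster count. */
static int contig_is_smaller_buggy(uint64_t contig_blocks,
                                   uint32_t clusters_to_alloc)
{
    return contig_blocks < clusters_to_alloc;
}

/* Fixed check: both sides are expressed in clusters. */
static int contig_is_smaller_fixed(uint64_t contig_blocks, unsigned int bits,
                                   uint32_t clusters_to_alloc)
{
    return blocks_to_clusters(contig_blocks, bits) < clusters_to_alloc;
}
```

With 512 contiguous blocks (2 clusters in this geometry) and 4 clusters to allocate, the buggy check says the contiguous run is not smaller, while the fixed check correctly says it is.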

Signed-off-by: Joseph Qi
Reviewed-by: Weiwei Wang
Cc: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joseph Qi
2015-06-25 08:49:40 +0800
fa5a0eb3b ocfs2: remove OCFS2_IOCB_SEM lock type in direct io

In ocfs2 direct read/write, the OCFS2_IOCB_SEM lock type was used to protect
the inode->i_alloc_sem rw semaphore in earlier kernel versions.
However, in the latest kernel, inode->i_alloc_sem is not used at all,
so the OCFS2_IOCB_SEM lock type needs to be removed.

Signed-off-by: Weiwei Wang
Cc: Mark Fasheh
Cc: Joel Becker
Reviewed-by: Junxiao Bi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

WeiWei Wang
2015-06-25 08:49:39 +0800
cf1776a9e ocfs2: fix a tiny race when truncate dio orphaned entry

If dio crashes, it leaves an entry in the orphan dir, and the orphan scan
takes care of the cleanup. There is a tiny race in which the same
entry is truncated twice, triggering the BUG in
ocfs2_del_inode_from_orphan.

Signed-off-by: Joseph Qi
Cc: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joseph Qi
2015-06-25 08:49:39 +0800

17 Apr, 2015

1 commit

4fc8adcfe Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull third hunk of vfs changes from Al Viro:
"This contains the ->direct_IO() changes from Omar + saner
generic_write_checks() + dealing with fcntl()/{read,write}() races
(mirroring O_APPEND/O_DIRECT into iocb->ki_flags and instead of
repeatedly looking at ->f_flags, which can be changed by fcntl(2),
check ->ki_flags - which cannot) + infrastructure bits for dhowells'
d_inode annotations + Christoph's switch of /dev/loop to
vfs_iter_write()"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (30 commits)
block: loop: switch to VFS ITER_BVEC
configfs: Fix inconsistent use of file_inode() vs file->f_path.dentry->d_inode
VFS: Make pathwalk use d_is_reg() rather than S_ISREG()
VFS: Fix up debugfs to use d_is_dir() in place of S_ISDIR()
VFS: Combine inode checks with d_is_negative() and d_is_positive() in pathwalk
NFS: Don't use d_inode as a variable name
VFS: Impose ordering on accesses of d_inode and d_flags
VFS: Add owner-filesystem positive/negative dentry checks
nfs: generic_write_checks() shouldn't be done on swapout...
ocfs2: use __generic_file_write_iter()
mirror O_APPEND and O_DIRECT into iocb->ki_flags
switch generic_write_checks() to iocb and iter
ocfs2: move generic_write_checks() before the alignment checks
ocfs2_file_write_iter: stop messing with ppos
udf_file_write_iter: reorder and simplify
fuse: ->direct_IO() doesn't need generic_write_checks()
ext4_file_write_iter: move generic_write_checks() up
xfs_file_aio_write_checks: switch to iocb/iov_iter
generic_write_checks(): drop isblk argument
blkdev_write_iter: expand generic_file_checks() call in there
...

Linus Torvalds
2015-04-17 11:27:56 +0800

15 Apr, 2015

5 commits

1dcf58d6e Merge branch 'akpm' (patches from Andrew)

Merge first patchbomb from Andrew Morton:

- arch/sh updates

- ocfs2 updates

- kernel/watchdog feature

- about half of mm/

* emailed patches from Andrew Morton: (122 commits)
Documentation: update arch list in the 'memtest' entry
Kconfig: memtest: update number of test patterns up to 17
arm: add support for memtest
arm64: add support for memtest
memtest: use phys_addr_t for physical addresses
mm: move memtest under mm
mm, hugetlb: abort __get_user_pages if current has been oom killed
mm, mempool: do not allow atomic resizing
memcg: print cgroup information when system panics due to panic_on_oom
mm: numa: remove migrate_ratelimited
mm: fold arch_randomize_brk into ARCH_HAS_ELF_RANDOMIZE
mm: split ET_DYN ASLR from mmap ASLR
s390: redefine randomize_et_dyn for ELF_ET_DYN_BASE
mm: expose arch_mmap_rnd when available
s390: standardize mmap_rnd() usage
powerpc: standardize mmap_rnd() usage
mips: extract logic for mmap_rnd()
arm64: standardize mmap_rnd() usage
x86: standardize mmap_rnd() usage
arm: factor out mmap ASLR into mmap_rnd
...

Linus Torvalds
2015-04-15 07:49:17 +0800
14a5275d8 ocfs2: do not use ocfs2_zero_extend during direct IO

In ocfs2_direct_IO_write, we use ocfs2_zero_extend to zero allocated
clusters when the write is not cluster aligned. But ocfs2_zero_extend uses
the page cache, so it may clear data that blockdev_direct_IO has already
written.

We should use blkdev_issue_zeroout instead of ocfs2_zero_extend during
direct IO.

So fix this issue by introducing ocfs2_direct_IO_zero_extend and
ocfs2_direct_IO_extend_no_holes.

Reported-by: Yiwen Jiang
Signed-off-by: Joseph Qi
Tested-by: Yiwen Jiang
Cc: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joseph Qi
2015-04-15 07:48:57 +0800
37a8d89ae ocfs2: take inode lock when get clusters

We need to take the inode lock when calling ocfs2_get_clusters,
and use GFP_NOFS instead of GFP_KERNEL.

Signed-off-by: Joseph Qi
Cc: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joseph Qi
2015-04-15 07:48:57 +0800
7e9b19551 ocfs2: no need get dinode bh when zeroing extend

Since di_bh won't be used when zeroing extend, set it to NULL.

Signed-off-by: Joseph Qi
Cc: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joseph Qi
2015-04-15 07:48:57 +0800
bdd86215b ocfs2: fix a typing error in ocfs2_direct_IO_write

Only when direct IO succeeds do we need to consider zeroing out in the
cluster-unaligned case.

Signed-off-by: Joseph Qi
Cc: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joseph Qi
2015-04-15 07:48:57 +0800

12 Apr, 2015

3 commits

22c6186ec direct_IO: remove rw from a_ops->direct_IO()

Now that no one is using rw, remove it completely.

Signed-off-by: Omar Sandoval
Signed-off-by: Al Viro

Omar Sandoval
2015-04-12 10:29:45 +0800
6f6737631 direct_IO: use iov_iter_rw() instead of rw everywhere

The rw parameter to direct_IO is redundant with iov_iter->type, and
treated slightly differently just about everywhere it's used: some users
do rw & WRITE, and others do rw == WRITE where they should be doing a
bitwise check. Simplify this with the new iov_iter_rw() helper, which
always returns either READ or WRITE.
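
The difference between the two checks can be sketched in user space as follows. DIR_READ and DIR_WRITE mirror the kernel's READ (0) and WRITE (1) values, while HYPOTHETICAL_SYNC_FLAG and the function names are made up for this illustration and direction_of() is only a stand-in for the idea behind iov_iter_rw(), not its implementation:

```c
#include <assert.h>

#define DIR_READ  0u
#define DIR_WRITE 1u
/* Made-up extra request flag that may be OR-ed into rw. */
#define HYPOTHETICAL_SYNC_FLAG (1u << 4)

/* Equality check: breaks as soon as any extra flag bit is set. */
static int is_write_by_equality(unsigned int rw)
{
    return rw == DIR_WRITE;
}

/* Bitwise check: looks only at the direction bit. */
static int is_write_by_mask(unsigned int rw)
{
    return (rw & DIR_WRITE) != 0;
}

/* Stand-in helper: always yields exactly DIR_READ or DIR_WRITE. */
static unsigned int direction_of(unsigned int rw)
{
    unsigned int dir = rw & DIR_WRITE;
    assert(dir == DIR_READ || dir == DIR_WRITE);
    return dir;
}
```

For rw = DIR_WRITE | HYPOTHETICAL_SYNC_FLAG, the equality check reports "not a write" while the mask check and direction_of() both report a write, which is the inconsistency the helper removes.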

Signed-off-by: Omar Sandoval
Signed-off-by: Al Viro

Omar Sandoval
2015-04-12 10:29:45 +0800
17f8c842d Remove rw from {,__,do_}blockdev_direct_IO()

Most filesystems call through to these at some point, so we'll start
here.

Signed-off-by: Omar Sandoval
Signed-off-by: Al Viro

Omar Sandoval
2015-04-12 10:29:44 +0800

26 Mar, 2015

1 commit

e2e40f2c1 fs: move struct kiocb to fs.h

struct kiocb now is a generic I/O container, so move it to fs.h.
Also do a #include diet for aio.h while we're at it.

Signed-off-by: Christoph Hellwig
Signed-off-by: Al Viro

Christoph Hellwig
2015-03-26 08:28:11 +0800

17 Feb, 2015

2 commits

49255dce6 ocfs2: allocate blocks in ocfs2_direct_IO_get_blocks

Allow block allocation in ocfs2_direct_IO_get_blocks.

Signed-off-by: Joseph Qi
Cc: Weiwei Wang
Cc: Junxiao Bi
Cc: Joel Becker
Cc: Mark Fasheh
Cc: Xuejiufei
Cc: alex chen
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joseph Qi
2015-02-17 09:56:05 +0800
24c40b329 ocfs2: implement ocfs2_direct_IO_write

Implement ocfs2_direct_IO_write. Add the inode to the orphan dir first, and
then delete it once the append O_DIRECT write has finished.

This is to make sure block allocation and inode size are consistent.

[akpm@linux-foundation.org: fix it for "block: Add discard flag to blkdev_issue_zeroout() function"]
Signed-off-by: Joseph Qi
Cc: Weiwei Wang
Cc: Junxiao Bi
Cc: Joel Becker
Cc: Mark Fasheh
Cc: Xuejiufei
Cc: alex chen
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joseph Qi
2015-02-17 09:56:05 +0800

19 Dec, 2014

1 commit

136f49b91 ocfs2: fix journal commit deadlock

For a buffered write, the page lock is taken in write_begin and released
in write_end. In ocfs2_write_end_nolock(), before unlocking the page in
ocfs2_free_write_ctxt(), it calls ocfs2_run_deallocs(), which asks
for the read lock of journal->j_trans_barrier. Holding the page lock while
asking for journal->j_trans_barrier breaks the locking order.

This causes a deadlock with the journal commit threads: ocfs2cmt takes
the write lock of journal->j_trans_barrier first, then wakes up
kjournald2 to do the commit work, and finally waits until it is done. To
commit the journal, kjournald2 needs to flush data first, which requires
the cache page lock.

Since some ocfs2 cluster locks are held by the writing process, this
deadlock may hang the whole cluster.

Unlocking pages before ocfs2_run_deallocs() fixes the locking order. Also
put the unlock before ocfs2_commit_trans() so that the page lock is
released before j_trans_barrier, preserving the unlocking order.

Signed-off-by: Junxiao Bi
Reviewed-by: Wengang Wang
Cc:
Reviewed-by: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Junxiao Bi
2014-12-19 11:08:11 +0800

11 Dec, 2014

1 commit

61fb9ea4b ocfs2: do not set filesystem readonly if link down

Do not set the filesystem readonly if the storage link is down. In this
case, metadata is not corrupted and only -EIO is returned. And if it is
indeed corrupted metadata, it has already called ocfs2_error() in
ocfs2_validate_inode_block().

Signed-off-by: Yiwen Jiang
Cc: Joel Becker
Cc: Mark Fasheh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

jiangyiwen
2014-12-11 09:41:03 +0800

10 Oct, 2014

1 commit

f775da2fc ocfs2: fix deadlock due to wrong locking order

To commit the ocfs2 journal, the ocfs2 journal thread acquires the mutex
osb->journal->j_trans_barrier and wakes up the jbd2 commit thread, then
waits until the jbd2 commit thread is done. In ordered journal mode, jbd2
needs to flush dirty data pages first, and this requires the page lock.
So osb->journal->j_trans_barrier should be taken before the page lock.

But ocfs2_write_zero_page() and ocfs2_write_begin_inline() do not obey this
locking order, which causes a deadlock and hangs the whole cluster.

One deadlock that was caught is the following:

PID: 13449 TASK: ffff8802e2f08180 CPU: 31 COMMAND: "oracle"
#0 [ffff8802ee3f79b0] __schedule at ffffffff8150a524
#1 [ffff8802ee3f7a58] schedule at ffffffff8150acbf
#2 [ffff8802ee3f7a68] rwsem_down_failed_common at ffffffff8150cb85
#3 [ffff8802ee3f7ad8] rwsem_down_read_failed at ffffffff8150cc55
#4 [ffff8802ee3f7ae8] call_rwsem_down_read_failed at ffffffff812617a4
#5 [ffff8802ee3f7b50] ocfs2_start_trans at ffffffffa0498919 [ocfs2]
#6 [ffff8802ee3f7ba0] ocfs2_zero_start_ordered_transaction at ffffffffa048b2b8 [ocfs2]
#7 [ffff8802ee3f7bf0] ocfs2_write_zero_page at ffffffffa048e9bd [ocfs2]
#8 [ffff8802ee3f7c80] ocfs2_zero_extend_range at ffffffffa048ec83 [ocfs2]
#9 [ffff8802ee3f7ce0] ocfs2_zero_extend at ffffffffa048edfd [ocfs2]
#10 [ffff8802ee3f7d50] ocfs2_extend_file at ffffffffa049079e [ocfs2]
#11 [ffff8802ee3f7da0] ocfs2_setattr at ffffffffa04910ed [ocfs2]
#12 [ffff8802ee3f7e70] notify_change at ffffffff81187d29
#13 [ffff8802ee3f7ee0] do_truncate at ffffffff8116bbc1
#14 [ffff8802ee3f7f50] sys_ftruncate at ffffffff8116bcbd
#15 [ffff8802ee3f7f80] system_call_fastpath at ffffffff81515142
RIP: 00007f8de750c6f7 RSP: 00007fffe786e478 RFLAGS: 00000206
RAX: 000000000000004d RBX: ffffffff81515142 RCX: 0000000000000000
RDX: 0000000000000200 RSI: 0000000000028400 RDI: 000000000000000d
RBP: 00007fffe786e040 R8: 0000000000000000 R9: 000000000000000d
R10: 0000000000000000 R11: 0000000000000206 R12: 000000000000000d
R13: 00007fffe786e710 R14: 00007f8de70f8340 R15: 0000000000028400
ORIG_RAX: 000000000000004d CS: 0033 SS: 002b

crash64> bt
PID: 7610 TASK: ffff88100fd56140 CPU: 1 COMMAND: "ocfs2cmt"
#0 [ffff88100f4d1c50] __schedule at ffffffff8150a524
#1 [ffff88100f4d1cf8] schedule at ffffffff8150acbf
#2 [ffff88100f4d1d08] jbd2_log_wait_commit at ffffffffa01274fd [jbd2]
#3 [ffff88100f4d1d98] jbd2_journal_flush at ffffffffa01280b4 [jbd2]
#4 [ffff88100f4d1dd8] ocfs2_commit_cache at ffffffffa0499b14 [ocfs2]
#5 [ffff88100f4d1e38] ocfs2_commit_thread at ffffffffa0499d38 [ocfs2]
#6 [ffff88100f4d1ee8] kthread at ffffffff81090db6
#7 [ffff88100f4d1f48] kernel_thread_helper at ffffffff81516284

crash64> bt
PID: 7609 TASK: ffff88100f2d4480 CPU: 0 COMMAND: "jbd2/dm-20-86"
#0 [ffff88100def3920] __schedule at ffffffff8150a524
#1 [ffff88100def39c8] schedule at ffffffff8150acbf
#2 [ffff88100def39d8] io_schedule at ffffffff8150ad6c
#3 [ffff88100def39f8] sleep_on_page at ffffffff8111069e
#4 [ffff88100def3a08] __wait_on_bit_lock at ffffffff8150b30a
#5 [ffff88100def3a58] __lock_page at ffffffff81110687
#6 [ffff88100def3ab8] write_cache_pages at ffffffff8111b752
#7 [ffff88100def3be8] generic_writepages at ffffffff8111b901
#8 [ffff88100def3c48] journal_submit_data_buffers at ffffffffa0120f67 [jbd2]
#9 [ffff88100def3cf8] jbd2_journal_commit_transaction at ffffffffa0121372 [jbd2]
#10 [ffff88100def3e68] kjournald2 at ffffffffa0127a86 [jbd2]
#11 [ffff88100def3ee8] kthread at ffffffff81090db6
#12 [ffff88100def3f48] kernel_thread_helper at ffffffff81516284

Signed-off-by: Junxiao Bi
Cc: Mark Fasheh
Cc: Joel Becker
Cc: Alex Chen
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Junxiao Bi
2014-10-10 10:25:48 +0800

07 May, 2014

2 commits

31b140398 switch {__,}blockdev_direct_IO() to iov_iter

Signed-off-by: Al Viro

Al Viro
2014-05-07 05:32:46 +0800
d8d3d94b8 pass iov_iter to ->direct_IO()

unmodified, for now

Signed-off-by: Al Viro

Al Viro
2014-05-07 05:32:44 +0800

04 Apr, 2014

2 commits

2931cdcb4 ocfs2: improve fsync efficiency and fix deadlock between aio_write and sync_file

Currently, ocfs2_sync_file grabs i_mutex and forces the current journal
transaction to complete. This isn't terribly efficient, since sync_file
really only needs to wait for the last transaction involving that inode
to complete, and this doesn't require i_mutex.

Therefore, implement the necessary bits to track the newest tid
associated with an inode, and teach sync_file to wait for that instead
of waiting for everything in the journal to commit. Furthermore, only
issue the flush request to the drive if jbd2 hasn't already done so.

This also eliminates the deadlock between ocfs2_file_aio_write() and
ocfs2_sync_file(). aio_write takes i_mutex then calls
ocfs2_aiodio_wait() to wait for unaligned dio writes to finish.
However, if that dio completion involves calling fsync, then we can get
into trouble when some ocfs2_sync_file tries to take i_mutex.

Signed-off-by: Darrick J. Wong
Reviewed-by: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Darrick J. Wong
2014-04-04 07:20:53 +0800
c18ceab01 ocfs2: change ip_unaligned_aio to of type mutex from atomic_t

There is a problem that waitqueue_active() may check stale data and thus
miss a wakeup of threads waiting on ip_unaligned_aio.

The only valid values of ip_unaligned_aio are 0 and 1, so we can change it
to a mutex and thereby avoid the above problem. Another benefit is that
a mutex, which works as a FIFO, is fairer than wake_up_all().

Signed-off-by: Wengang Wang
Cc: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Wengang Wang
2014-04-04 07:20:53 +0800

13 Nov, 2013

4 commits

41ecc3459 ocfs2: simplify ocfs2_invalidatepage() and ocfs2_releasepage()

Ocfs2 doesn't do data journalling. Thus its ->invalidatepage and
->releasepage functions never get called on buffers that have journal
heads attached. So just use standard variants of functions from
buffer.c.

Signed-off-by: Jan Kara
Cc: Joel Becker
Cc: Mark Fasheh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Kara
2013-11-13 11:09:02 +0800
b1214e475 ocfs2: fix possible double free in ocfs2_write_begin_nolock

When ocfs2_write_cluster_by_desc() fails in ocfs2_write_begin_nolock()
because of ENOSPC, it goes to out_quota and frees data_ac (meta_ac). It
then calls ocfs2_try_to_free_truncate_log() to free space. If enough
space is freed, it tries the write again. Unfortunately, if some error
happens before ocfs2_lock_allocators(), it goes to out and frees
data_ac (meta_ac) again.
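
One common way to make such a retry path safe is to clear the pointer when the context is released, so that a second release becomes a no-op. The sketch below shows that general pattern in user space; struct alloc_context and release_alloc_context are hypothetical names, and this is not claimed to be the exact change made by the commit:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an allocation context such as data_ac. */
struct alloc_context {
    int reserved_bits;
};

/*
 * Free the context and clear the caller's pointer. Calling this twice on
 * the same variable is harmless because free(NULL) does nothing.
 */
static void release_alloc_context(struct alloc_context **acp)
{
    assert(acp != NULL);
    free(*acp);
    *acp = NULL;
}
```

A retry loop that releases the context on the ENOSPC path and again on the common exit path then frees the memory only once.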

Signed-off-by: joyce
Reviewed-by: Jie Liu
Acked-by: Joel Becker
Cc: Mark Fasheh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Xue jiufei
2013-11-13 11:09:02 +0800
7391a294b ocfs2: return ENOMEM when sb_getblk() fails

The only reason for sb_getblk() failing is if it can't allocate the
buffer_head. So return ENOMEM instead when it fails.

[joseph.qi@huawei.com: ocfs2_symlink_get_block() and ocfs2_read_blocks_sync() and ocfs2_read_blocks() need the same change]
Signed-off-by: Rui Xiang
Reviewed-by: Jie Liu
Reviewed-by: Mark Fasheh
Cc: Joel Becker
Cc: Joseph Qi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Rui Xiang
2013-11-13 11:09:00 +0800
06f9da6e8 fs/ocfs2: remove unnecessary variable bits_wanted from ocfs2_calc_extend_credits

Code cleanup to remove unnecessary variable passed but never used
to ocfs2_calc_extend_credits.

Signed-off-by: Goldwyn Rodrigues
Cc: Joel Becker
Cc: Mark Fasheh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Goldwyn Rodrigues
2013-11-13 11:09:00 +0800

12 Sep, 2013

1 commit

f17c20dd2 ocfs2: use i_size_read() to access i_size

Though ocfs2 uses inode->i_mutex to protect i_size, there are both
i_size_read/write() and direct accesses. Clean up all direct access to
eliminate confusion.

Signed-off-by: Junxiao Bi
Cc: Jie Liu
Cc: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Junxiao Bi
2013-09-12 06:56:30 +0800

04 Sep, 2013

1 commit

7b7a8665e direct-io: Implement generic deferred AIO completions

Add support to the core direct-io code to defer AIO completions to user
context using a workqueue. This replaces opencoded and less efficient
code in XFS and ext4 (we save a memory allocation for each direct IO)
and will be needed to properly support O_(D)SYNC for AIO.

The communication between the filesystem and the direct I/O code requires
a new buffer head flag, which is a bit ugly but not avoidable until the
direct I/O code stops abusing the buffer_head structure for communicating
with the filesystems.

Currently this creates a per-superblock unbound workqueue for these
completions, which is taken from an earlier patch by Jan Kara. I'm
not really convinced about this use and would prefer a "normal" global
workqueue with a high concurrency limit, but this needs further discussion.

JK: Fixed ext4 part, dynamic allocation of the workqueue.

Signed-off-by: Christoph Hellwig
Signed-off-by: Jan Kara
Signed-off-by: Al Viro

Christoph Hellwig
2013-09-04 21:23:46 +0800

14 Aug, 2013

1 commit

c7dd3392a ocfs2: fix NULL pointer dereference in ocfs2_duplicate_clusters_by_page

Since ocfs2_cow_file_pos invokes ocfs2_refcount_icow with NULL as
the struct file pointer, it eventually results in a NULL pointer
dereference in ocfs2_duplicate_clusters_by_page.

This patch replaces the file pointer with an inode pointer in
cow_duplicate_clusters to fix this issue.

[jeff.liu@oracle.com: rebased patch against linux-next tree]
Signed-off-by: Tiger Yang
Signed-off-by: Jie Liu
Cc: Joel Becker
Cc: Mark Fasheh
Acked-by: Tao Ma
Tested-by: David Weber
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Tiger Yang
2013-08-14 08:57:49 +0800

22 May, 2013

3 commits

e5f8d30d6 ocfs2: use ->invalidatepage() length argument

->invalidatepage() aop now accepts range to invalidate so we can make
use of it in ocfs2_invalidatepage().

Signed-off-by: Lukas Czerner
Reviewed-by: Jan Kara
Acked-by: Joel Becker

Lukas Czerner
2013-05-22 11:58:46 +0800
259709b07 jbd2: change jbd2_journal_invalidatepage to accept length

invalidatepage now accepts a range to invalidate, and two file systems
that use jbd2 also implement the punch hole feature, which can benefit
from this. We need to implement the same thing for the jbd2 layer so that
those file systems can take advantage of this functionality.

This commit adds length argument to the jbd2_journal_invalidatepage()
and updates all instances in ext4 and ocfs2.

Signed-off-by: Lukas Czerner
Reviewed-by: Jan Kara

Lukas Czerner
2013-05-22 11:20:03 +0800
d47992f86 mm: change invalidatepage prototype to accept length

Currently there is no way to truncate a partial page where the end
truncate point is not at the end of the page. This is because it was not
needed, and the existing functionality was enough for the file system
truncate operation to work properly. However, more file systems now
support the punch hole feature, and they can benefit from mm supporting
truncation of a page just up to a certain point.

Specifically, with this functionality truncate_inode_pages_range() can
be changed so it supports truncating partial page at the end of the
range (currently it will BUG_ON() if 'end' is not at the end of the
page).

This commit changes the invalidatepage() address space operation
prototype to accept the range to be invalidated and updates all instances
of it.

We also change block_invalidatepage() in the same way and actually
make use of the new length argument by implementing range invalidation.

Actual file system implementations will follow, except for the file systems
where the changes are really simple and should not change the behaviour
in any way. An implementation of truncate_page_range(), which will be able
to accept page-unaligned ranges, will follow as well.

Signed-off-by: Lukas Czerner
Cc: Andrew Morton
Cc: Hugh Dickins

Lukas Czerner
2013-05-22 11:17:23 +0800