Eric Lee / smarc-fsl-linux-kernel

04 Sep, 2015

1 commit

4c12ab7e5 Merge tag 'for-f2fs-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs ... Browse Code »

Pull f2fs updates from Jaegeuk Kim:
"The major work includes fixing and enhancing the existing extent_cache
feature, which has been well settling down so far and now it becomes a
default mount option accordingly.

Also, this version newly registers a f2fs memory shrinker to reclaim
several objects consumed by a couple of data structures in order to
avoid memory pressures.

Another new feature is to add ioctl(F2FS_GARBAGE_COLLECT) which
triggers a cleaning job explicitly by users.

Most of the other patches are to fix bugs occurred in the corner cases
across the whole code area"

* tag 'for-f2fs-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (85 commits)
f2fs: upset segment_info repair
f2fs: avoid accessing NULL pointer in f2fs_drop_largest_extent
f2fs: update extent tree in batches
f2fs: fix to release inode correctly
f2fs: handle f2fs_truncate error correctly
f2fs: avoid unneeded initializing when converting inline dentry
f2fs: atomically set inode->i_flags
f2fs: fix wrong pointer access during try_to_free_nids
f2fs: use __GFP_NOFAIL to avoid infinite loop
f2fs: lookup neighbor extent nodes for merging later
f2fs: split __insert_extent_tree_ret for readability
f2fs: kill dead code in __insert_extent_tree
f2fs: adjust showing of extent cache stat
f2fs: add largest/cached stat in extent cache
f2fs: fix incorrect mapping for bmap
f2fs: add annotation for space utilization of regular/inline dentry
f2fs: fix to update cached_en of extent tree properly
f2fs: fix typo
f2fs: check the node block address of newly allocated nid
f2fs: go out for insert_inode_locked failure
...

Linus Torvalds
2015-09-04 04:10:22 +0800

03 Sep, 2015

1 commit

1081230b7 Merge branch 'for-4.3/core' of git://git.kernel.dk/linux-block ... Browse Code »

Pull core block updates from Jens Axboe:
"This first core part of the block IO changes contains:

- Cleanup of the bio IO error signaling from Christoph. We used to
rely on the uptodate bit and passing around of an error, now we
store the error in the bio itself.

- Improvement of the above from myself, by shrinking the bio size
down again to fit in two cachelines on x86-64.

- Revert of the max_hw_sectors cap removal from a revision again,
from Jeff Moyer. This caused performance regressions in various
tests. Reinstate the limit, bump it to a more reasonable size
instead.

- Make /sys/block//queue/discard_max_bytes writeable, by me.
Most devices have huge trim limits, which can cause nasty latencies
when deleting files. Enable the admin to configure the size down.
We will look into having a more sane default instead of UINT_MAX
sectors.

- Improvement of the SGP gaps logic from Keith Busch.

- Enable the block core to handle arbitrarily sized bios, which
enables a nice simplification of bio_add_page() (which is an IO hot
path). From Kent.

- Improvements to the partition io stats accounting, making it
faster. From Ming Lei.

- Also from Ming Lei, a basic fixup for overflow of the sysfs pending
file in blk-mq, as well as a fix for a blk-mq timeout race
condition.

- Ming Lin has been carrying Kents above mentioned patches forward
for a while, and testing them. Ming also did a few fixes around
that.

- Sasha Levin found and fixed a use-after-free problem introduced by
the bio->bi_error changes from Christoph.

- Small blk cgroup cleanup from Viresh Kumar"

* 'for-4.3/core' of git://git.kernel.dk/linux-block: (26 commits)
blk: Fix bio_io_vec index when checking bvec gaps
block: Replace SG_GAPS with new queue limits mask
block: bump BLK_DEF_MAX_SECTORS to 2560
Revert "block: remove artifical max_hw_sectors cap"
blk-mq: fix race between timeout and freeing request
blk-mq: fix buffer overflow when reading sysfs file of 'pending'
Documentation: update notes in biovecs about arbitrarily sized bios
block: remove bio_get_nr_vecs()
fs: use helper bio_add_page() instead of open coding on bi_io_vec
block: kill merge_bvec_fn() completely
md/raid5: get rid of bio_fits_rdev()
md/raid5: split bio for chunk_aligned_read
block: remove split code in blkdev_issue_{discard,write_same}
btrfs: remove bio splitting and merge_bvec_fn() calls
bcache: remove driver private bio splitting code
block: simplify bio_add_page()
block: make generic_make_request handle arbitrarily sized bios
blk-cgroup: Drop unlikely before IS_ERR(_OR_NULL)
block: don't access bio->bi_error after bio_put()
block: shrink struct bio down to 2 cache lines again
...

Linus Torvalds
2015-09-03 04:10:25 +0800

22 Aug, 2015

1 commit

e2b4e2bc8 f2fs: fix incorrect mapping for bmap ... Browse Code »

The test step is like below:
1. touch file
2. truncate -s $((1024*1024)) file
3. fallocate -o 0 -l $((1024*1024)) file
4. fibmap.f2fs file

Our result of fibmap.f2fs showed below is not correct:

file_pos start_blk end_blk blks
0 -937166132 -937166132 1
4096 -937166132 -937166132 1
8192 -937166132 -937166132 1
12288 -937166132 -937166132 1
16384 -937166132 -937166132 1
20480 -937166132 -937166132 1
...
1040384 -937166132 -937166132 1
1044480 -937166132 -937166132 1

This is because f2fs_map_blocks will return with no error when meeting
a hole or preallocated block, the caller __get_data_block will map the
uninitialized variable value to bh->b_blocknr.

Unfortunately generic_block_bmap will neither check the return value of
get_data() nor check mapping info of buffer_head, result in returning
the random block address.

After fixing the issue, our result shows correctly:

file_pos start_blk end_blk blks
0 0 0 256

Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Chao Yu
2015-08-22 13:45:14 +0800

21 Aug, 2015

1 commit

740432f83 f2fs: handle failed bio allocation ... Browse Code »

As the below comment of bio_alloc_bioset, f2fs can allocate multiple bios at the
same time. So, we can't guarantee that bio is allocated all the time.

"
* When @bs is not NULL, if %__GFP_WAIT is set then bio_alloc will always be
* able to allocate a bio. This is due to the mempool guarantees. To make this
* work, callers must never allocate more than 1 bio at a time from this pool.
* Callers that need to allocate more than 1 bio must always submit the
* previously allocated bio for IO before attempting to allocate a new one.
* Failure to do so can cause deadlocks under memory pressure.
"

Signed-off-by: Jaegeuk Kim

Jaegeuk Kim
2015-08-21 00:00:09 +0800

14 Aug, 2015

1 commit

b54ffb73c block: remove bio_get_nr_vecs() ... Browse Code »

We can always fill up the bio now, no need to estimate the possible
size based on queue parameters.

Acked-by: Steven Whitehouse
Signed-off-by: Kent Overstreet
[hch: rebased and wrote a changelog]
Signed-off-by: Christoph Hellwig
Signed-off-by: Ming Lin
Signed-off-by: Jens Axboe

Kent Overstreet
2015-08-14 02:32:04 +0800

12 Aug, 2015

2 commits

decd36b6c f2fs: remove inmem radix tree ... Browse Code »

Previously, we use radix tree to index all registered page entries for
atomic file, but now we only use radix tree to see whether current page
is indexed or not, since the other user of radix tree is gone in commit
042b7816aaeb ("f2fs: remove unnecessary call to invalidate inmemory pages").

So in this patch, we try to use one more efficient way:
Introducing a macro ATOMIC_WRITTEN_PAGE, and setting it as page private
value to indicate page indexing status. By using this way, we can save
memory and lookup time.

Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Chao Yu
2015-08-12 02:31:14 +0800
c15e8599f f2fs: report EINVAL for unalignment direct IO ... Browse Code »

We run ltp testcase with f2fs and obtain a TFAIL in diotest4, the result in
detail is as fallow:

dio04

<<>>
tag=dio04 stime=1432278894
cmdline="diotest4"
contacts=""
analysis=exit
<<>>
diotest4 1 TPASS : Negative Offset
diotest4 2 TPASS : removed
diotest4 3 TFAIL : diotest4.c:129: write allows odd count.returns 1: Success
diotest4 4 TFAIL : diotest4.c:183: Odd count of read and write
diotest4 5 TPASS : Read beyond the file size
......

the result of ext4 with same environment:

dio04

<<>>
tag=dio04 stime=1432259643
cmdline="diotest4"
contacts=""
analysis=exit
<<>>
diotest4 1 TPASS : Negative Offset
diotest4 2 TPASS : removed
diotest4 3 TPASS : Odd count of read and write
diotest4 4 TPASS : Read beyond the file size
......

The reason is that when triggering DIO in f2fs, we will return zero value
in ->direct_IO if writer's buffer offset, file offset and transfer size is
not alignment to block size of filesystem, resulting in falling back into
buffered write instead of returning -EINVAL.

This patch fixes that problem by returning correct error number for above
case, and removing the judgement condition in check_direct_IO to make sure
the verification will be enabled for direct reader too.

Besides, Jaegeuk Kim pointed out that there is expectional cases we should
always make direct-io falling back into buffered write, such as dio in
encrypted file.

Signed-off-by: Yunlei He
[Chao Yu make small change and add detail description in commit message]
Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Chao Yu
2015-08-12 02:30:24 +0800

06 Aug, 2015

1 commit

759af1c9c f2fs: use extent cache to optimize f2fs_reserve_block ... Browse Code »

In some cases, we only need the block address when we call
f2fs_reserve_block,
other fields of struct dnode_of_data aren't necessary.
We can try extent cache first for such cases in order to speed up the
process.

Signed-off-by: Fan li
Signed-off-by: Jaegeuk Kim

Fan Li
2015-08-06 13:15:42 +0800

05 Aug, 2015

17 commits

470f00e96 f2fs: fix to release inode page correctly ... Browse Code »

In following call path, we will pass a locked and referenced ipage
pointer to get_new_data_page:
- init_inode_metadata
- make_empty_dir
- get_new_data_page

There are two exit paths in get_new_data_page when error occurs:
1) grab_cache_page fails, ipage will not be released;
2) f2fs_reserve_block fails, ipage will be released in callee.

So, it's not consistent for error handling in get_new_data_page.

For f2fs_reserve_block, it's not very easy to change the rule
of error handling, since it's already complicated.

Here we deside to choose an easy way to fix this issue:
If any error occur in get_new_data_page, we will ensure releasing
ipage in this function.

The same issue is in f2fs_convert_inline_dir, fix that too.

Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Chao Yu
2015-08-05 23:08:23 +0800
5768dcdd7 f2fs: change the timing of f2fs_wait_on_page_writeback ... Browse Code »

some backing devices need pages to be stable during writeback. It doesn't
matter if
the page is completely overwritten or already uptodate, it needs to wait
before write.

Signed-off-by: Fan li
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Fan Li
2015-08-05 23:08:16 +0800
6a2905443 f2fs: skip writing in ->writepages when no dirty pages exist ... Browse Code »

When flushing comes from background, if there is no dirty page in the
mapping of inode, we'd better to skip seeking dirty page from mapping
for writebacking.

Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Chao Yu
2015-08-05 23:08:11 +0800
737f18992 f2fs: optimize f2fs_write_cache_pages ... Browse Code »

The if statement "goto continue_unlock" is exactly the same when
each if condition is true that is depended on the value of both
"step" and "is_cold_data(page)" are 0 or 1. That means when the
value of "step" equals to "is_cold_data(page)", the if condition
is true and the if statement "goto continue_unlock" appears only
once, so it can be optimized to reduce the duplicated code.

Signed-off-by: Tiezhu Yang
Signed-off-by: Jaegeuk Kim

Tiezhu Yang
2015-08-05 23:08:10 +0800
86531d6b8 f2fs: callers take care of the page from bio error ... Browse Code »

This patch changes for a caller to handle the page after its bio gets an error.

Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Jaegeuk Kim
2015-08-05 23:08:07 +0800
8f46dcaea f2fs: expose f2fs_write_cache_pages ... Browse Code »

If there are gced dirty pages and normal dirty pages in the mapping
of one inode, we might writeback them alternately with discontinuous
block address, resulting in low performance.

This patch introduces f2fs_write_cache_pages with codes copied from
write_cache_pages in mm/page-writeback.c.

In this function, we refactor flow with two steps:
1) writeback all cold type pages.
2) writeback all non-cold type pages.

By using this method, f2fs will writeback dirty pages with the same
temperature in bunch mode, it makes writeouted block being with
more continuous address, so they can be merged as much as possible
in f2fs bio cache, and also it will reduce the chance of submiting
small IO from block layer.

Test environment: 8g nokia sd card (very old sd card, but it shows
better effect when testing with this patch, and with a 32g kingston
sd card, I didn't see much more improvement).

Test step:
1. touch testfile;
2. truncate -s 512K testfile;
3. write all pages with odd index;
4. trigger gc by ioctl;
5. write all pages with even index;
6. time fsync testfile.

before:
real 0m0.402s
user 0m0.000s
sys 0m0.000s

after:
real 0m0.143s
user 0m0.004s
sys 0m0.004s

Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Chao Yu
2015-08-05 05:09:59 +0800
a28ef1f5a f2fs: maintain extent cache in separated file ... Browse Code »

This patch moves extent cache related code from data.c into extent_cache.c
since extent cache is independent feature, and its codes are not relate to
others in data.c, it's better for us to maintain them in separated place.

There is no functionality change, but several small coding style fixes
including:
* rename __drop_largest_extent to f2fs_drop_largest_extent for exporting;
* rename misspelled word 'untill' to 'until';
* remove unneeded 'return' in the end of f2fs_destroy_extent_tree().

Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Chao Yu
2015-08-05 05:09:58 +0800
3c7df87da f2fs: don't try to split extents shorter than F2FS_MIN_EXTENT_LEN ... Browse Code »

Since only parts of extents longer than F2FS_MIN_EXTENT_LEN will
be kept in extent cache after split, extents already shorter than
F2FS_MIN_EXTENT_LEN don't need to try split at all.

Signed-off-by: Fan Li
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Fan Li
2015-08-05 05:09:58 +0800
90d4388ac f2fs: fix to update page flag ... Browse Code »

This patch fixes to update page flag (e.g. Uptodate/cold flag) in
->write_begin.

Otherwise, page will be non-uptodate when we try to write entire
page, and cold data flag in page will not be clean when gced page
is being rewritten.

Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Chao Yu
2015-08-05 05:09:57 +0800
7023a1ad1 f2fs: shrink unreferenced extent_caches first ... Browse Code »

If an extent_tree entry has a zero reference count, we can drop it from the
cache in higher priority rather than currently referencing entries.

Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Jaegeuk Kim
2015-08-05 05:09:57 +0800
bb96a8d51 f2fs: enhance multithread performance ... Browse Code »

In ->writepages, we use writepages mutex lock to serialize all block
address allocation and page submitting pairs from different inodes.
This method makes our delayed dirty pages of one inode being written
continously as many as possible.

But there is one problem that we did not submit current cached bio in
protection region of writepages mutex lock, so there is a small chance
that we submit the one of other thread's as below, resulting in
splitting more bios.

thread 1 thread 2
->writepages
lock(writepages)
->write_cache_pages
unlock(writepages)
lock(writepages)
->write_cache_pages
->f2fs_submit_merged_bio
->writepage
unlock(writepages)

fs_mark-6535 [002] .... 2242.270230: f2fs_submit_write_bio: dev = (1,0), WRITE_SYNC, DATA, sector = 5766152, size = 524288
fs_mark-6536 [000] .... 2242.270361: f2fs_submit_write_bio: dev = (1,0), WRITE_SYNC, DATA, sector = 5767176, size = 4096
fs_mark-6536 [000] .... 2242.270370: f2fs_submit_write_bio: dev = (1,0), WRITE_SYNC, NODE, sector = 8138112, size = 4096
fs_mark-6535 [002] .... 2242.270776: f2fs_submit_write_bio: dev = (1,0), WRITE_SYNC, DATA, sector = 5767184, size = 516096

This may really increase time of block layer works, and may cause
larger IO lantency.

This patch moves the submitting operation into region of writepages
mutex lock to avoid bio splits when concurrently writebacking is
intensive.

my test environment: virtual machine,
intel cpu i5 2500, 8GB size memory, 4GB size ramdisk

time fs_mark -t 16 -L 1 -s 524288 -S 1 -d /mnt/f2fs/

before:
real 0m4.244s
user 0m0.088s
sys 0m12.336s

after:
real 0m3.822s
user 0m0.072s
sys 0m10.760s

Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Chao Yu
2015-08-05 05:09:57 +0800
84bc926c0 f2fs: check the largest extent at look-up time ... Browse Code »

Because of the extent shrinker or other -ENOMEM scenarios, it cannot guarantee
that the largest extent would be cached in the tree all the time.

Instead of relying on extent_tree, we can simply check the cached one in extent
tree accordingly.

Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Jaegeuk Kim
2015-08-05 05:09:56 +0800
3e72f7213 f2fs: use extent_cache by default ... Browse Code »

We don't need to handle the duplicate extent information.

The integrated rule is:
- update on-disk extent with largest one tracked by in-memory extent_cache
- destroy extent_tree for the truncation case
- drop per-inode extent_cache by shrinker

Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Jaegeuk Kim
2015-08-05 05:09:56 +0800
554df79e5 f2fs: shrink extent_cache entries ... Browse Code »

This patch registers shrinking extent_caches.

Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Jaegeuk Kim
2015-08-05 05:09:55 +0800
244f4fc1c f2fs: set cached_en after checking finally ... Browse Code »

This patch relocates cached_en not only to be covered by spin_lock, but also
to set once after checking out completely.

Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Jaegeuk Kim
2015-08-05 05:09:54 +0800
cbe91923a f2fs: update on-disk extents even under extent_cache ... Browse Code »

Previously, f2fs_update_extent_cache() updates in-memory extent_cache all the
time, and then finally preserves its up-to-date extent into on-disk one during
f2fs_evict_inode.

But, in the following scenario:

1. mount
2. open & write an extent X
3. f2fs_evict_inode; on-disk extent is X
4. open & update the extent X with Y
5. sync; trigger checkpoint
6. power-cut

after power-on, f2fs should serve extent Y, but we have an on-disk extent X.

This causes a failure on xfstests/311.

Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Jaegeuk Kim
2015-08-05 05:09:54 +0800
7a2cb6786 f2fs: fix wrong block address calculation for a split extent ... Browse Code »

This patch fixes wrong calculation on block address field when an extent is
split.

Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Jaegeuk Kim
2015-08-05 05:09:54 +0800

29 Jul, 2015

1 commit

4246a0b63 block: add a bi_error field to struct bio ... Browse Code »

Currently we have two different ways to signal an I/O error on a BIO:

(1) by clearing the BIO_UPTODATE flag
(2) by returning a Linux errno value to the bi_end_io callback

The first one has the drawback of only communicating a single possible
error (-EIO), and the second one has the drawback of not beeing persistent
when bios are queued up, and are not passed along from child to parent
bio in the ever more popular chaining scenario. Having both mechanisms
available has the additional drawback of utterly confusing driver authors
and introducing bugs where various I/O submitters only deal with one of
them, and the others have to add boilerplate code to deal with both kinds
of error returns.

So add a new bi_error field to store an errno value directly in struct
bio and remove the existing mechanisms to clean all this up.

Signed-off-by: Christoph Hellwig
Reviewed-by: Hannes Reinecke
Reviewed-by: NeilBrown
Signed-off-by: Jens Axboe

Christoph Hellwig
2015-07-29 22:55:15 +0800

25 Jul, 2015

1 commit

6282adbf9 f2fs: call set_page_dirty to attach i_wb for cgroup ... Browse Code »

The cgroup attaches inode->i_wb via mark_inode_dirty and when set_page_writeback
is called, __inc_wb_stat() updates i_wb's stat.

So, we need to explicitly call set_page_dirty->__mark_inode_dirty in prior to
any writebacking pages.

This patch should resolve the following kernel panic reported by Andreas Reis.

https://bugzilla.kernel.org/show_bug.cgi?id=101801

--- Comment #2 from Andreas Reis ---
BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8
IP: [] __percpu_counter_add+0x1a/0x90
PGD 2951ff067 PUD 2df43f067 PMD 0
Oops: 0000 [#1] PREEMPT SMP
Modules linked in:
CPU: 7 PID: 10356 Comm: gcc Tainted: G W 4.2.0-1-cu #1
Hardware name: Gigabyte Technology Co., Ltd. G1.Sniper M5/G1.Sniper M5, BIOS
T01 02/03/2015
task: ffff880295044f80 ti: ffff880295140000 task.ti: ffff880295140000
RIP: 0010:[] []
__percpu_counter_add+0x1a/0x90
RSP: 0018:ffff880295143ac8 EFLAGS: 00010082
RAX: 0000000000000003 RBX: ffffea000a526d40 RCX: 0000000000000001
RDX: 0000000000000020 RSI: 0000000000000001 RDI: 0000000000000088
RBP: ffff880295143ae8 R08: 0000000000000000 R09: ffff88008f69bb30
R10: 00000000fffffffa R11: 0000000000000000 R12: 0000000000000088
R13: 0000000000000001 R14: ffff88041d099000 R15: ffff880084a205d0
FS: 00007f8549374700(0000) GS:ffff88042f3c0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000000a8 CR3: 000000033e1d5000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Stack:
0000000000000000 ffffea000a526d40 ffff880084a20738 ffff880084a20750
ffff880295143b48 ffffffff811cc91e ffff880000000000 0000000000000296
0000000000000000 ffff880417090198 0000000000000000 ffffea000a526d40
Call Trace:
[] __test_set_page_writeback+0xde/0x1d0
[] do_write_data_page+0xe7/0x3a0
[] gc_data_segment+0x5aa/0x640
[] do_garbage_collect+0x138/0x150
[] f2fs_gc+0x1be/0x3e0
[] f2fs_balance_fs+0x81/0x90
[] f2fs_unlink+0x47/0x1d0
[] vfs_unlink+0x109/0x1b0
[] do_unlinkat+0x287/0x2c0
[] SyS_unlink+0x16/0x20
[] entry_SYSCALL_64_fastpath+0x12/0x71
Code: 41 5e 5d c3 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 55 49
89 f5 41 54 49 89 fc 53 48 83 ec 08 65 ff 05 e6 d9 b6 7e 8b 47 20 48 63 ca
65 8b 18 48 63 db 48 01 f3 48 39 cb 7d 0a
RIP [] __percpu_counter_add+0x1a/0x90
RSP
CR2: 00000000000000a8
---[ end trace 5132449a58ed93a3 ]---
note: gcc[10356] exited with preempt_count 2

Signed-off-by: Jaegeuk Kim

Jaegeuk Kim
2015-07-25 23:54:26 +0800

02 Jun, 2015

1 commit

123770247 f2fs: avoid duplicated code by reusing f2fs_read_end_io ... Browse Code »

This patch tries to clean up code because part code of f2fs_read_end_io
and mpage_end_io are the same, so it's better to merge and reuse them.

Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Chao Yu
2015-06-02 07:21:04 +0800

29 May, 2015

8 commits

4375a3366 f2fs crypto: add encryption support in read/write paths ... Browse Code »

This patch adds encryption support in read and write paths.

Note that, in f2fs, we need to consider cleaning operation.
In cleaning procedure, we must avoid encrypting and decrypting written blocks.
So, this patch implements move_encrypted_block().

Signed-off-by: Jaegeuk Kim

Jaegeuk Kim
2015-05-29 06:41:52 +0800
fcc85a4d8 f2fs crypto: activate encryption support for fs APIs ... Browse Code »

This patch activates the following APIs for encryption support.

The rules quoted by ext4 are:
- An unencrypted directory may contain encrypted or unencrypted files
or directories.
- All files or directories in a directory must be protected using the
same key as their containing directory.
- Encrypted inode for regular file should not have inline_data.
- Encrypted symlink and directory may have inline_data and inline_dentry.

This patch activates the following APIs.
1. f2fs_link : validate context
2. f2fs_lookup : ''
3. f2fs_rename : ''
4. f2fs_create/f2fs_mkdir : inherit its dir's context
5. f2fs_direct_IO : do buffered io for regular files
6. f2fs_open : check encryption info
7. f2fs_file_mmap : ''
8. f2fs_setattr : ''
9. f2fs_file_write_iter : '' (Called by sys_io_submit)
10. f2fs_fallocate : do not support fcollapse
11. f2fs_evict_inode : free_encryption_info

Signed-off-by: Michael Halcrow
Signed-off-by: Theodore Ts'o
Signed-off-by: Jaegeuk Kim

Jaegeuk Kim
2015-05-29 06:41:51 +0800
7f63eb77a f2fs: report unwritten area in f2fs_fiemap ... Browse Code »

This patch slightly changes f2fs_fiemap function to report unwritten area.

Signed-off-by: Jaegeuk Kim

Jaegeuk Kim
2015-05-29 06:41:45 +0800
43f3eae1d f2fs: split find_data_page according to specific purposes ... Browse Code »

This patch splits find_data_page as follows.

1. f2fs_gc
- use get_read_data_page() with read only

2. find_in_level
- use find_data_page without locked page

3. truncate_partial_page
- In the case cache_only mode, just drop cached page.
- Ohterwise, use get_lock_data_page() and guarantee to truncate

Signed-off-by: Jaegeuk Kim

Jaegeuk Kim
2015-05-29 06:41:37 +0800
01f28610a f2fs: fix race on allocating and deallocating a dentry block ... Browse Code »

There are two threads:
f2fs_delete_entry() get_new_data_page()
f2fs_reserve_block()
dn.blkaddr = XXX
lock_page(dentry_block)
truncate_hole()
dn.blkaddr = NULL
unlock_page(dentry_block)
lock_page(dentry_block)
fill the block from XXX address
add new dentries
unlock_page(dentry_block)

Later, f2fs_write_data_page() will truncate the dentry_block, since
its block address is NULL.

The reason for this was due to the wrong lock order.
In this case, we should do f2fs_reserve_block() after locking its dentry block.

Signed-off-by: Jaegeuk Kim

Jaegeuk Kim
2015-05-29 06:41:35 +0800
05ca3632e f2fs: add sbi and page pointer in f2fs_io_info ... Browse Code »

This patch adds f2fs_sb_info and page pointers in f2fs_io_info structure.
With this change, we can reduce a lot of parameters for IO functions.

Signed-off-by: Jaegeuk Kim

Jaegeuk Kim
2015-05-29 06:41:32 +0800
f1e886601 f2fs: expose f2fs_mpage_readpages ... Browse Code »

This patch implements f2fs_mpage_readpages for further optimization on
encryption support.

The basic code was taken from fs/mpage.c, and changed to be simple by adjusting
that block_size is equal to page_size in f2fs.

Signed-off-by: Jaegeuk Kim

Jaegeuk Kim
2015-05-29 06:41:30 +0800
003a3e1d6 f2fs: add f2fs_map_blocks ... Browse Code »

This patch introduces f2fs_map_blocks structure likewise ext4_map_blocks.
Now, f2fs uses f2fs_map_blocks when handling get_block.

Signed-off-by: Jaegeuk Kim

Jaegeuk Kim
2015-05-29 06:41:29 +0800

05 May, 2015

1 commit

5463e7c18 Revert "f2fs: enhance multi-threads performance" ... Browse Code »

This reports performance regression by Yuanhan Liu.
The basic idea was to reduce one-point mutex, but it turns out this causes
another contention like context swithes.

https://lkml.org/lkml/2015/4/21/11

Until finishing the analysis on this issue, I'd like to revert this for a while.

This reverts commit 78373b7319abdf15050af5b1632c4c8b8b398f33.

Jaegeuk Kim
2015-05-05 05:15:15 +0800

18 Apr, 2015

1 commit

06a60deca Merge tag 'for-f2fs-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs ... Browse Code »

Pull f2fs updates from Jaegeuk Kim:
"New features:
- in-memory extent_cache
- fs_shutdown to test power-off-recovery
- use inline_data to store symlink path
- show f2fs as a non-misc filesystem

Major fixes:
- avoid CPU stalls on sync_dirty_dir_inodes
- fix some power-off-recovery procedure
- fix handling of broken symlink correctly
- fix missing dot and dotdot made by sudden power cuts
- handle wrong data index during roll-forward recovery
- preallocate data blocks for direct_io

... and a bunch of minor bug fixes and cleanups"

* tag 'for-f2fs-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (71 commits)
f2fs: pass checkpoint reason on roll-forward recovery
f2fs: avoid abnormal behavior on broken symlink
f2fs: flush symlink path to avoid broken symlink after POR
f2fs: change 0 to false for bool type
f2fs: do not recover wrong data index
f2fs: do not increase link count during recovery
f2fs: assign parent's i_mode for empty dir
f2fs: add F2FS_INLINE_DOTS to recover missing dot dentries
f2fs: fix mismatching lock and unlock pages for roll-forward recovery
f2fs: fix sparse warnings
f2fs: limit b_size of mapped bh in f2fs_map_bh
f2fs: persist system.advise into on-disk inode
f2fs: avoid NULL pointer dereference in f2fs_xattr_advise_get
f2fs: preallocate fallocated blocks for direct IO
f2fs: enable inline data by default
f2fs: preserve extent info for extent cache
f2fs: initialize extent tree with on-disk extent info of inode
f2fs: introduce __{find,grab}_extent_tree
f2fs: split set_data_blkaddr from f2fs_update_extent_cache
f2fs: enable fast symlink by utilizing inline data
...

Linus Torvalds
2015-04-18 23:17:20 +0800

12 Apr, 2015

2 commits

22c6186ec direct_IO: remove rw from a_ops->direct_IO() ... Browse Code »

Now that no one is using rw, remove it completely.

Signed-off-by: Omar Sandoval
Signed-off-by: Al Viro

Omar Sandoval
2015-04-12 10:29:45 +0800
6f6737631 direct_IO: use iov_iter_rw() instead of rw everywhere ... Browse Code »

The rw parameter to direct_IO is redundant with iov_iter->type, and
treated slightly differently just about everywhere it's used: some users
do rw & WRITE, and others do rw == WRITE where they should be doing a
bitwise check. Simplify this with the new iov_iter_rw() helper, which
always returns either READ or WRITE.

Signed-off-by: Omar Sandoval
Signed-off-by: Al Viro

Omar Sandoval
2015-04-12 10:29:45 +0800