05 Apr, 2016
2 commits
-
Pull f2fs fixes from Jaegeuk Kim.
* tag 'f2fs-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs:
f2fs: retrieve IO write stat from the right place
f2fs crypto: fix corrupted symlink in encrypted case
f2fs: cover large section in sanity check of super
-
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a *long* time
ago with the promise that one day it would be possible to implement the page
cache with bigger chunks than PAGE_SIZE.

This promise never materialized, and it is unlikely that it ever will.

We have many places where PAGE_CACHE_SIZE is assumed to be equal to
PAGE_SIZE. And it's a constant source of confusion on whether
PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
especially on the border between fs and mm.

Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much
breakage to be doable.

Let's stop pretending that pages in page cache are special. They are
not.

The changes are pretty straight-forward:
- E << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> E;
- E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> E;
- PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
- page_cache_get() -> get_page();
- page_cache_release() -> put_page();

This patch contains automated changes generated with coccinelle using the
script below. For some reason, coccinelle doesn't patch header files.
I've called spatch for them manually.

The only adjustment after coccinelle is a revert of the changes to the
PAGE_CACHE_ALIGN definition: we are going to drop it later.

There are a few places in the code where coccinelle didn't reach. I'll
fix them manually in a separate patch. Comments and documentation will
also be addressed in a separate patch.

virtual patch
@@
expression E;
@@
- E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E

@@
expression E;
@@
- E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E

@@
@@
- PAGE_CACHE_SHIFT
+ PAGE_SHIFT

@@
@@
- PAGE_CACHE_SIZE
+ PAGE_SIZE

@@
@@
- PAGE_CACHE_MASK
+ PAGE_MASK

@@
expression E;
@@
- PAGE_CACHE_ALIGN(E)
+ PAGE_ALIGN(E)

@@
expression E;
@@
- page_cache_get(E)
+ get_page(E)

@@
expression E;
@@
- page_cache_release(E)
+ put_page(E)

Signed-off-by: Kirill A. Shutemov
Acked-by: Michal Hocko
Signed-off-by: Linus Torvalds
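To see why the shift rules in the conversion above are no-ops, here is a minimal userspace sketch. It is not kernel code: the macro values assume 4 KiB pages, and the helper names (old_style_index, new_style_index, old_style_offset, new_style_offset) are hypothetical and exist only for this illustration.

```c
/* Assumed values for illustration only: 4 KiB pages. */
#define PAGE_SHIFT        12
#define PAGE_SIZE         (1UL << PAGE_SHIFT)

/* Before the patch, the page-cache macros were plain aliases. */
#define PAGE_CACHE_SHIFT  PAGE_SHIFT
#define PAGE_CACHE_SIZE   PAGE_SIZE

/* Old style: "convert" a page-cache index into a page index. */
static unsigned long old_style_index(unsigned long index)
{
	/* The shift amount is PAGE_CACHE_SHIFT - PAGE_SHIFT == 0. */
	return index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
}

/* New style: the shift by zero simply disappears. */
static unsigned long new_style_index(unsigned long index)
{
	return index;
}

/* Old and new ways to compute the byte offset of a page in a file. */
static unsigned long long old_style_offset(unsigned long index)
{
	return (unsigned long long)index << PAGE_CACHE_SHIFT;
}

static unsigned long long new_style_offset(unsigned long index)
{
	return (unsigned long long)index << PAGE_SHIFT;
}
```

Under these assumptions both variants of each helper return the same value for every input, which is what makes a mechanical, tree-wide replacement safe.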
31 Mar, 2016
1 commit
-
In the following patch,
f2fs: split journal cache from curseg cache
journal cache is split from curseg cache. So IO write statistics should be
retrieved from the journal cache and not from curseg->sum_blk. Otherwise, it will
get 0, and the stat is lost.

Signed-off-by: Shuoran Liu
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim
29 Mar, 2016
1 commit
-
This patch fixes the bug which does not cover a large section case when checking
the sanity of superblock.
If f2fs detects misalignment, it will fix the superblock during the mount time,
so it doesn't need to trigger fsck.f2fs further.

Reported-by: Matthias Prager
Reported-by: David Gnedt
Cc: stable 4.5+
Signed-off-by: Jaegeuk Kim
18 Mar, 2016
2 commits
-
The crc function is done bit by bit.
Optimize this by using the cryptoapi
crc32 function, which is backed by h/w acceleration.

Signed-off-by: Keith Mok
Signed-off-by: Jaegeuk Kim
-
This patch adds the renamed functions moved from the f2fs crypto files.

1. definitions for per-file encryption used by ext4 and f2fs.

2. crypto.c for encrypt/decrypt functions
 a. IO preparation:
  - fscrypt_get_ctx / fscrypt_release_ctx
 b. before IOs:
  - fscrypt_encrypt_page
  - fscrypt_decrypt_page
  - fscrypt_zeroout_range
 c. after IOs:
  - fscrypt_decrypt_bio_pages
  - fscrypt_pullback_bio_page
  - fscrypt_restore_control_page

3. policy.c supporting context management.
 a. For ioctls:
  - fscrypt_process_policy
  - fscrypt_get_policy
 b. For context permission
  - fscrypt_has_permitted_context
  - fscrypt_inherit_context

4. keyinfo.c to handle permissions
  - fscrypt_get_encryption_info
  - fscrypt_free_encryption_info

5. fname.c to support filename encryption
 a. general wrapper functions
  - fscrypt_fname_disk_to_usr
  - fscrypt_fname_usr_to_disk
  - fscrypt_setup_filename
  - fscrypt_free_filename
 b. specific filename handling functions
  - fscrypt_fname_alloc_buffer
  - fscrypt_fname_free_buffer

6. Makefile and Kconfig

Cc: Al Viro
Signed-off-by: Michael Halcrow
Signed-off-by: Ildar Muslukhov
Signed-off-by: Uday Savagaonkar
Signed-off-by: Theodore Ts'o
Signed-off-by: Arnd Bergmann
Signed-off-by: Jaegeuk Kim
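Returning to the CRC change in the first of the two commits above: the sketch below shows what "bit by bit" means for a CRC-32, which is the kind of loop a table-driven or hardware-backed implementation replaces. It is a generic illustration, not the f2fs routine. The reflected polynomial 0xEDB88320 and the helper names crc32_bitwise and crc32_standard are assumptions, and the log does not say which seed f2fs passes in.

```c
#include <stddef.h>
#include <stdint.h>

/* Reflected CRC-32 polynomial (an assumption for this sketch). */
#define CRC32_POLY_LE 0xEDB88320u

/*
 * Bit-by-bit CRC update: eight shift/xor steps per input byte.
 * No pre- or post-inversion is applied; the caller chooses the seed.
 */
static uint32_t crc32_bitwise(uint32_t crc, const unsigned char *buf, size_t len)
{
	while (len--) {
		crc ^= *buf++;
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ ((crc & 1) ? CRC32_POLY_LE : 0);
	}
	return crc;
}

/* Conventional CRC-32: seed with all ones and invert the result. */
static uint32_t crc32_standard(const void *buf, size_t len)
{
	return ~crc32_bitwise(~0u, buf, len);
}
```

With the conventional seeding, the well-known check value for the ASCII string "123456789" is 0xCBF43926, which is a convenient way to validate any faster replacement against the slow loop.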
27 Feb, 2016
1 commit
-
Add a new helper f2fs_flush_merged_bios to clean up redundant codes.
Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
23 Feb, 2016
8 commits
-
This patch changes to show more info in message log about the recovery
of the corrupted superblock during ->mount, e.g. the index of corrupted
superblock and the result of recovery.

Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
-
From the function name of get_valid_checkpoint, it seems to return
the valid cp or NULL for the caller to check. If no valid one is found,
f2fs_fill_super will print the err log. But if get_valid_checkpoint
gets one that looks valid (the return value indicates that it's valid, although
it actually turns out to be invalid after sanity checking), then another similar err
log is printed. That seems strange. Let's keep sanity checking inside the procedure
of getting the valid cp. Another improvement we gain from this move is
that even when a large volume is supported, we check the cp in advance
and skip the following procedure if the sanity check fails.

Signed-off-by: Shawn Lin
Signed-off-by: Jaegeuk Kim
-
read_raw_super_block was introduced to help find the
first valid superblock. Commit da554e48caab ("f2fs:
recovering broken superblock during mount") changed the
behaviour to read both of them and check whether the
recovery flag is needed or not. So the comment before this
function isn't consistent with what it actually does.
Also, the original code uses two labels to handle the err
cases, which isn't very readable. So this patch amends
the comment and slightly reorganizes the code.

Signed-off-by: Shawn Lin
Signed-off-by: Jaegeuk Kim
-
Introduce a new structure f2fs_journal to wrap journal info in struct
f2fs_summary_block for readability.

struct f2fs_journal {
	union {
		__le16 n_nats;
		__le16 n_sits;
	};
	union {
		struct nat_journal nat_j;
		struct sit_journal sit_j;
		struct f2fs_extra_info info;
	};
} __packed;

struct f2fs_summary_block {
	struct f2fs_summary entries[ENTRIES_IN_SUM];
	struct f2fs_journal journal;
	struct summary_footer footer;
} __packed;

Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
-
Split drop_inmem_pages from commit_inmem_pages for code readability,
and prepare for the following modification.

Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
-
Sometimes, if cp_error is set, pages under writeback remain, resulting in a
kernel hang in put_super.

Signed-off-by: Jaegeuk Kim
-
This patch introduces lifetime IO write statistics exposed to the sysfs interface.
The write IO amount is obtained from block layer, accumulated in the file system and
stored in the hot node summary of checkpoint.

Signed-off-by: Shuoran Liu
Signed-off-by: Pengyang Hou
[Jaegeuk Kim: add sysfs documentation]
Signed-off-by: Jaegeuk Kim
-
This patch exports a new sysfs entry 'dirty_nat_ratio' to control the threshold
of dirty nat entries. If the current ratio exceeds the configured threshold, a
checkpoint will be triggered in f2fs_balance_fs_bg to flush dirty nats.

Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
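The threshold test behind 'dirty_nat_ratio' can be pictured with a small userspace sketch. This is not the f2fs implementation: struct nat_stats, its field names and exceeds_dirty_nat_ratio are hypothetical, and the strict "greater than" comparison is an assumption taken from the wording "exceeds" above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the NAT bookkeeping; names are assumptions. */
struct nat_stats {
	unsigned int nat_cnt;         /* cached NAT entries */
	unsigned int dirty_nat_cnt;   /* how many of them are dirty */
	unsigned int dirty_nat_ratio; /* threshold in percent, as set via sysfs */
};

/*
 * Return true when the share of dirty entries exceeds the threshold.
 * Cross-multiplying avoids a division, and widening to 64 bits keeps
 * the products from overflowing.
 */
static bool exceeds_dirty_nat_ratio(const struct nat_stats *s)
{
	return (unsigned long long)s->dirty_nat_cnt * 100 >
	       (unsigned long long)s->nat_cnt * s->dirty_nat_ratio;
}
```

A background balancing routine would call such a predicate and start a checkpoint when it returns true; the exact call site and boundary behavior in f2fs may differ from this sketch.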
15 Jan, 2016
1 commit
-
Mark those kmem allocations that are known to be easily triggered from
userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to
memcg. For the list, see below:

 - threadinfo
 - task_struct
 - task_delay_info
 - pid
 - cred
 - mm_struct
 - vm_area_struct and vm_region (nommu)
 - anon_vma and anon_vma_chain
 - signal_struct
 - sighand_struct
 - fs_struct
 - files_struct
 - fdtable and fdtable->full_fds_bits
 - dentry and external_name
 - inode for all filesystems. This is the most tedious part, because
   most filesystems overwrite the alloc_inode method.

The list is far from complete, so feel free to add more objects.
Nevertheless, it should be close to "account everything" approach and
keep most workloads within bounds. Malevolent users will be able to
breach the limit, but this was possible even with the former "account
everything" approach (simply because it did not account everything in
fact).

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Vladimir Davydov
Acked-by: Johannes Weiner
Acked-by: Michal Hocko
Cc: Tejun Heo
Cc: Greg Thelen
Cc: Christoph Lameter
Cc: Pekka Enberg
Cc: David Rientjes
Cc: Joonsoo Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
12 Jan, 2016
2 commits
-
This patch records the last time that the user requested filesystem operations.
This information is used later to detect whether the system is idle or not.

Signed-off-by: Jaegeuk Kim
-
This patch adds time and interval arrays to store some timing variables.
Signed-off-by: Jaegeuk Kim
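The two commits above (a per-type timestamp array plus a matching interval array, used later for idle detection) can be sketched in userspace as follows. All names here (demo_time_type, demo_sb_times, demo_update_time, demo_time_over) are hypothetical, and wall-clock seconds stand in for whatever time source the kernel code actually uses.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical slot names for the timing arrays. */
enum demo_time_type { DEMO_CP_TIME, DEMO_REQ_TIME, DEMO_MAX_TIME };

struct demo_sb_times {
	time_t last_time[DEMO_MAX_TIME];           /* when each event last happened */
	unsigned int interval_time[DEMO_MAX_TIME]; /* threshold in seconds */
};

/* Record that an event of the given type happened at time "now". */
static void demo_update_time(struct demo_sb_times *t,
			     enum demo_time_type type, time_t now)
{
	t->last_time[type] = now;
}

/* True once at least interval_time[type] seconds have passed. */
static bool demo_time_over(const struct demo_sb_times *t,
			   enum demo_time_type type, time_t now)
{
	return now - t->last_time[type] >= (time_t)t->interval_time[type];
}
```

In this model, "the system is idle" would simply mean that demo_time_over() is true for the request slot, i.e. no user request has refreshed its timestamp within the configured interval.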
09 Jan, 2016
1 commit
-
Only when a node page is newly dirtied do we need to check whether to run
f2fs_gc.

Signed-off-by: Jaegeuk Kim
04 Jan, 2016
1 commit
-
Introduce max_file_blocks in sbi to store the max block index of a file in f2fs.
It can be used to avoid unneeded calculation of the max block index at
runtime.

Signed-off-by: Chao Yu
[Jaegeuk Kim: fix overflow of sbi->max_file_blocks]
Signed-off-by: Jaegeuk Kim
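The idea of computing the limit once at mount time and reusing the cached value can be sketched as below. This is a userspace illustration under stated assumptions: the block-tree shape (direct slots in the inode, two direct node blocks, two indirect node blocks, one double-indirect node block) and every name (demo_layout, demo_sbi, demo_max_file_blocks, demo_mount_init, demo_block_in_range) are assumptions of this sketch, not taken from the commit.

```c
/* Assumed layout parameters; the real values are not given in this log. */
struct demo_layout {
	unsigned long long addrs_per_inode; /* data slots inside the inode */
	unsigned long long addrs_per_block; /* data slots per direct node */
	unsigned long long nids_per_block;  /* node ids per indirect node */
};

/* Stand-in for the per-filesystem info that caches the result. */
struct demo_sbi {
	unsigned long long max_file_blocks;
};

/* Walk the assumed tree shape once and count addressable blocks. */
static unsigned long long demo_max_file_blocks(const struct demo_layout *l)
{
	unsigned long long result = l->addrs_per_inode;
	unsigned long long leaf = l->addrs_per_block;

	result += leaf * 2;        /* two direct node blocks */
	leaf *= l->nids_per_block;
	result += leaf * 2;        /* two indirect node blocks */
	leaf *= l->nids_per_block;
	result += leaf;            /* one double-indirect node block */
	return result;
}

/* Done once, e.g. while filling in the superblock info at mount. */
static void demo_mount_init(struct demo_sbi *sbi, const struct demo_layout *l)
{
	sbi->max_file_blocks = demo_max_file_blocks(l);
}

/* Runtime users only compare against the cached value. */
static int demo_block_in_range(const struct demo_sbi *sbi,
			       unsigned long long block)
{
	return block < sbi->max_file_blocks;
}
```

The point of the design is that hot paths pay for one comparison instead of repeating the multiplication chain on every call.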
31 Dec, 2015
3 commits
-
This patch adds a max block check for get_data_block_bmap.
The Trinity test program sends a block number as a parameter to
ioctl_fibmap, which is then used in get_node_path(). When the block
number is larger than the f2fs max blocks, it triggers a kernel bug.

Signed-off-by: Yunlei He
Signed-off-by: Xue Liu
[Jaegeuk Kim: fix missing condition, pointed by Chao Yu]
Signed-off-by: Jaegeuk Kim
-
The __f2fs_commit_super is static.
Signed-off-by: Jaegeuk Kim
-
do_checkpoint and write_checkpoint can fail for reasons such as being triggered
on a readonly fs or encountering an IO error on the storage device.

So it's better to report such error info to the user, so that the user is aware of
the checkpoint failure.

Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
17 Dec, 2015
5 commits
-
Add a new option 'data_flush' to enable data flush functionality.
Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
-
Maintain regular/symlink inodes which have dirty pages in a global dirty list
and record their total dirty page count, the same way directory
inodes are handled.

Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
-
Introduce __f2fs_commit_super to hold the duplicated code in
f2fs_commit_super for cleanup.

Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
-
We already have one copy of the valid super block in memory, so do not grab
the buffer head of the super block all the time.

Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
-
f2fs uses fields of the f2fs_super_block struct directly in a grabbed buffer.
Once the buffer happens to be destroyed (e.g. through dd), it may have
unpredictable effects on f2fs.

This patch fixes this by allocating an additional buffer to store the super
block data rather than using the grabbed block buffer directly.

Signed-off-by: Yunlei He
Signed-off-by: Jaegeuk Kim
Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
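The fix described in the last commit above amounts to taking a private copy of the superblock instead of keeping a pointer into a shared buffer. A minimal userspace sketch of that pattern follows; struct demo_super and demo_copy_super are hypothetical, and plain malloc/memcpy stand in for the kernel's allocation and buffer-head handling.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, heavily reduced superblock for illustration. */
struct demo_super {
	uint32_t magic;
	uint32_t block_count;
};

/*
 * Copy the superblock out of a shared block buffer into memory the
 * filesystem owns. Later writes to "buf" (for example by another
 * program overwriting the device) no longer affect the copy.
 * Returns NULL on allocation failure; the caller frees the result.
 */
static struct demo_super *demo_copy_super(const unsigned char *buf, size_t offset)
{
	struct demo_super *copy = malloc(sizeof(*copy));

	if (!copy)
		return NULL;
	memcpy(copy, buf + offset, sizeof(*copy));
	return copy;
}
```

The cost is one extra allocation per mount, in exchange for in-memory state that cannot be changed underneath the filesystem by writes to the block buffer.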
16 Dec, 2015
2 commits
-
Add a new dirty list node member in inode info for linking the inode to the
global dirty list in the superblock, instead of the old implementation which
allocates slab cache memory as an entry for the inode.

It avoids memory pressure due to slab cache allocation, and also makes
the code cleaner.

Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
-
remove_dirty_dir_inode will be renamed to remove_dirty_inode as a generic
function in a following patch for removing directory/regular/symlink inodes
from the global dirty list.

Here, rename the ino management related functions for readability, and also in
order to avoid a name conflict.

Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
15 Dec, 2015
1 commit
-
Do more sanity check for superblock during ->mount.
Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
10 Dec, 2015
1 commit
-
Previously, f2fs_commit_super hacked bh->blocknr to write the broken
alternate superblock.
Instead, we should use the correct logic to retrieve its buffer head
and lock it appropriately.

Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim
05 Dec, 2015
2 commits
-
f2fs_sb_info::s_kobj should be released in the error path of fill_super,
otherwise it will lead to a memory leak.

This bug was found by kmemleak:

dmesg:
kmemleak: 2 new suspected memory leaks (see /sys/kernel/debug/kmemleak)

cat /sys/kernel/debug/kmemleak
unreferenced object 0xffff8800838dc358 (size 8):
comm "mount", pid 4154, jiffies 4297482839 (age 1911.412s)
hex dump (first 8 bytes):
7a 72 61 6d 31 00 ff ff zram1...
backtrace:
[] kmemleak_alloc+0x28/0x50
[] __kmalloc_track_caller+0xef/0x1c0
[] kstrdup+0x45/0x80
[] kstrdup_const+0x28/0x30
[] kvasprintf_const+0x63/0xa0
[] kobject_set_name_vargs+0x3c/0xa0
[] kobject_add_varg+0x25/0x60
[] kobject_init_and_add+0x53/0x70
[] f2fs_fill_super+0x9d9/0xc40 [f2fs]
[] mount_bdev+0x192/0x1d0
[] f2fs_mount+0x15/0x20 [f2fs]
[] mount_fs+0x43/0x170
[] vfs_kern_mount+0x76/0x160
[] do_mount+0x258/0xdc0
[] SyS_mount+0x7b/0xc0
[] entry_SYSCALL_64_fastpath+0x12/0x6f
...

Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
-
f2fs_create_root_stats can fail due to no memory, report it to user.
Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
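Both fixes above concern error handling during mount: a resource acquired early must be released if a later step fails, and the failure must be reported to the caller. A minimal userspace sketch of the usual goto-unwind pattern is shown below. Everything in it is hypothetical (demo_sb, demo_alloc, demo_free, demo_fill_super, demo_put_super, the live_allocs counter); malloc stands in for the kobject and stats setup.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical mount state: two resources acquired in order. */
struct demo_sb {
	char *kobj_name; /* stands in for the sysfs kobject */
	void *stats;     /* stands in for the debug statistics */
};

static int live_allocs; /* outstanding allocations, to make leaks visible */

static void *demo_alloc(size_t n)
{
	void *p = malloc(n);

	if (p)
		live_allocs++;
	return p;
}

static void demo_free(void *p)
{
	if (p) {
		free(p);
		live_allocs--;
	}
}

/* Returns 0 on success, -1 on failure; fail_late simulates a late error. */
static int demo_fill_super(struct demo_sb *sb, int fail_late)
{
	sb->kobj_name = demo_alloc(8);
	if (!sb->kobj_name)
		goto out;
	sb->stats = demo_alloc(64);
	if (!sb->stats)
		goto free_kobj;
	if (fail_late)
		goto free_stats;
	return 0;

free_stats:
	demo_free(sb->stats);
	sb->stats = NULL;
free_kobj:
	/* Without this step the first allocation would leak on failure. */
	demo_free(sb->kobj_name);
	sb->kobj_name = NULL;
out:
	return -1;
}

/* Normal teardown releases the same resources in reverse order. */
static void demo_put_super(struct demo_sb *sb)
{
	demo_free(sb->stats);
	demo_free(sb->kobj_name);
	sb->stats = NULL;
	sb->kobj_name = NULL;
}
```

Each label undoes exactly the steps that succeeded before the failure, so adding a new setup step only requires one new label in the unwind chain.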
13 Oct, 2015
1 commit
-
After finishing building the free nid cache, we will try to readahead
asynchronously 4 more pages for the next reloading; the count of
readahead nid pages is fixed.

In some cases, such as on an SMR drive, reading a small, fixed number of sectors
each time we trigger RA may be inefficient, since we will face
high seeking overhead. So we'd better let the user configure this
parameter from sysfs for specific workloads.

Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
10 Oct, 2015
3 commits
-
This patch introduces a periodic checkpoint feature.
Note that this does not enforce checkpoints strictly in terms
of trigger timing; it is only intended to help the user experience.
The default value is 60 seconds.

Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim
-
This patch introduces background_gc=sync, enabling synchronous cleaning in
the background.

Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim
-
Switching the extent_cache option dynamically on remount may cause a consistency
issue between the extent cache and the dnode page. Fix this in this patch to avoid
that condition.

Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
02 Sep, 2015
1 commit
-
upset segment_info like this:
27600     0|161 0|0 4|70 3|0 3|0 0|0 0|91 4|0 4|232 4|39
27610     4|0 4|0 4|1 4|0 4|0 4|280 4|0 4|42 4|262 4|38
27620     4|179 4|89 4|39 4|24 4|0 4|96 4|3 4|428 4|0 4|118
27630     4|112 4|97 4|0 4|0 4|0 4|68 4|0 4|0 4|86 4|138
27640     4|0 4|0 0|166 5|39 4|101 0|111

Signed-off-by: Yunlei He
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim
20 Aug, 2015
1 commit
-
We should not write node pages when deleting orphan inodes.
In order to do that, we can easily set the POR_DOING flag earlier, before entering
the orphan inode routine.

Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim