04 Jun, 2020
1 commit
-
Fix the following coccicheck warning:
fs/ext4/extents_status.c:1057:5-28: WARNING: Comparison to bool
fs/ext4/inode.c:2314:18-24: WARNING: Comparison to boolSigned-off-by: Jason Yan
Reviewed-by: Ritesh Harjani
Link: https://lore.kernel.org/r/20200420042918.19459-1-yanaijie@huawei.com
Signed-off-by: Theodore Ts'o
28 Aug, 2019
1 commit
-
@es_stats_cache_hits and @es_stats_cache_misses are accessed frequently in
ext4_es_lookup_extent function, it would influence the ext4 read/write
performance in NUMA system. Let's optimize it using percpu_counter,
it is profitable for the performance.The test command is as below:
fio -name=randwrite -numjobs=8 -filename=/mnt/test1 -rw=randwrite
-ioengine=libaio -direct=1 -iodepth=64 -sync=0 -norandommap
-group_reporting -runtime=120 -time_based -bs=4k -size=5GAnd the result is better 10% than the initial implement:
without the patch,IOPS=197k, BW=770MiB/s (808MB/s)(90.3GiB/120002msec)
with the patch, IOPS=218k, BW=852MiB/s (894MB/s)(99.9GiB/120002msec)Cc: "Theodore Ts'o"
Cc: Andreas Dilger
Cc: Eric Biggers
Signed-off-by: Yang Guo
Signed-off-by: Shaokun Zhang
23 Aug, 2019
1 commit
-
The goal of this patch is to remove two references to the buffer delay
bit in ext4_da_page_release_reservation() as part of a larger effort
to remove all such references from ext4. These two references are
principally used to reduce the reserved block/cluster count when pages
are invalidated as a result of truncating, punching holes, or
collapsing a block range in a file. The entire function is removed
and replaced with code in ext4_es_remove_extent() that reduces the
reserved count as a side effect of removing a block range from delayed
and not unwritten extents in the extent status tree as is done when
truncating, punching holes, or collapsing ranges.The code is written to minimize the number of searches descending from
rb tree roots for scalability.Signed-off-by: Eric Whitney
Signed-off-by: Theodore Ts'o
12 Aug, 2019
2 commits
-
For debugging reasons, it's useful to know the contents of the extent
cache. Since the extent cache contains much of what is in the fiemap
ioctl, use an fiemap-style interface to return this information.Signed-off-by: Theodore Ts'o
-
The new ioctl EXT4_IOC_CLEAR_ES_CACHE will force an inode's extent
status cache to be cleared out. This is intended for use for
debugging.Signed-off-by: Theodore Ts'o
20 Jun, 2019
1 commit
-
Pointer 'node' is assigned a value that is never read, node is
later overwritten when it re-assigned a different value inside
the while-loop. The assignment is redundant and can be removed.Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King
Signed-off-by: Theodore Ts'o
Reviewed-by: Jan Kara
08 Apr, 2019
1 commit
-
BUG_ON(1) leads to bogus warnings from clang when
CONFIG_PROFILE_ANNOTATED_BRANCHES is set:fs/ext4/inode.c:544:4: error: variable 'retval' is used uninitialized whenever 'if' condition is false
[-Werror,-Wsometimes-uninitialized]
BUG_ON(1);
^~~~~~~~~
include/asm-generic/bug.h:61:36: note: expanded from macro 'BUG_ON'
^~~~~~~~~~~~~~~~~~~
include/linux/compiler.h:48:23: note: expanded from macro 'unlikely'
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
fs/ext4/inode.c:591:6: note: uninitialized use occurs here
if (retval > 0 && map->m_flags & EXT4_MAP_MAPPED) {
^~~~~~
fs/ext4/inode.c:544:4: note: remove the 'if' if its condition is always true
BUG_ON(1);
^
include/asm-generic/bug.h:61:32: note: expanded from macro 'BUG_ON'
^
fs/ext4/inode.c:502:12: note: initialize the variable 'retval' to silence this warningChange it to BUG() so clang can see that this code path can never
continue.Signed-off-by: Arnd Bergmann
Signed-off-by: Theodore Ts'o
Reviewed-by: Nick Desaulniers
Reviewed-by: Jan Kara
02 Oct, 2018
5 commits
-
Add new code to count canceled pending cluster reservations on bigalloc
file systems and to reduce the cluster reservation count on all file
systems using delayed allocation. This replaces old code in
ext4_da_page_release_reservations that was incorrect.Signed-off-by: Eric Whitney
Signed-off-by: Theodore Ts'o -
Ext4 does not always reduce the reserved cluster count by the number
of clusters allocated when mapping a delayed extent. It sometimes
adds back one or more clusters after allocation if delalloc blocks
adjacent to the range allocated by ext4_ext_map_blocks() share the
clusters newly allocated for that range. However, this overcounts
the number of clusters needed to satisfy future mapping requests
(holding one or more reservations for clusters that have already been
allocated) and premature ENOSPC and quota failures, etc., result.Ext4 also does not reduce the reserved cluster count when allocating
clusters for non-delayed allocated writes that have previously been
reserved for delayed writes. This also results in overcounts.To make it possible to handle reserved cluster accounting for
fallocated regions in the same manner as used for other non-delayed
writes, do the reserved cluster accounting for them at the time of
allocation. In the current code, this is only done later when a
delayed extent sharing the fallocated region is finally mapped.Address comment correcting handling of unsigned long long constant
from Jan Kara's review of RFC version of this patch.Signed-off-by: Eric Whitney
Signed-off-by: Theodore Ts'o -
The code in ext4_da_map_blocks sometimes reserves space for more
delayed allocated clusters than it should, resulting in premature
ENOSPC, exceeded quota, and inaccurate free space reporting.Fix this by checking for written and unwritten blocks shared in the
same cluster with the newly delayed allocated block. A cluster
reservation should not be made for a cluster for which physical space
has already been allocated.Signed-off-by: Eric Whitney
Signed-off-by: Theodore Ts'o -
Add new pending reservation mechanism to help manage reserved cluster
accounting. Its primary function is to avoid the need to read extents
from the disk when invalidating pages as a result of a truncate, punch
hole, or collapse range operation.Signed-off-by: Eric Whitney
Signed-off-by: Theodore Ts'o -
Ext4 contains a few functions that are used to search for delayed
extents or blocks in the extents status tree. Rather than duplicate
code to add new functions to search for extents with different status
values, such as written or a combination of delayed and unwritten,
generalize the existing code to search for caller-specified extents
status values. Also, move this code into extents_status.c where it
is better associated with the data structures it operates upon, and
where it can be more readily used to implement new extents status tree
functions that might want a broader scope for i_es_lock.Three missing static specifiers in RFC version of patch reported and
fixed by Fengguang Wu .Signed-off-by: Eric Whitney
Signed-off-by: Theodore Ts'o
21 May, 2018
1 commit
-
Signed-off-by: Sean Fu
Signed-off-by: Theodore Ts'o
02 Nov, 2017
1 commit
-
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.By default all files without license information are under the default
license of the kernel, which is GPL version 2.Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if
Reviewed-by: Philippe Ombredanne
Reviewed-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman
28 Feb, 2017
1 commit
-
Fix typos and add the following to the scripts/spelling.txt:
comsume||consume
comsumer||consumer
comsuming||consumingI see some variable names with this pattern, but this commit is only
touching comment blocks to avoid unexpected impact.Link: http://lkml.kernel.org/r/1481573103-11329-19-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
27 Apr, 2016
1 commit
-
Messages passed to ext4_warning() or ext4_error() don't need trailing
newlines, because these function add the newlines themselves.Signed-off-by: Jakub Wilk
10 Mar, 2016
1 commit
-
We were setting referenced bit on the extent structure we return from
ext4_es_lookup_extent() which is just a private structure on stack. Thus
setting had no effect. Set the bit in the structure in the status tree
instead.Signed-off-by: Jan Kara
Signed-off-by: Theodore Ts'o
24 Sep, 2015
1 commit
-
This allows us to refactor the procfs code, which saves a bit of
compiled space. More importantly it isolates most of the procfs
support code into a single file, so it's easier to #ifdef it out if
the proc file system has been disabled.Signed-off-by: Theodore Ts'o
03 May, 2015
1 commit
-
Currently it is possible to lose whole file system block worth of data
when we hit the specific interaction with unwritten and delayed extents
in status extent tree.The problem is that when we insert delayed extent into extent status
tree the only way to get rid of it is when we write out delayed buffer.
However there is a limitation in the extent status tree implementation
so that when inserting unwritten extent should there be even a single
delayed block the whole unwritten extent would be marked as delayed.At this point, there is no way to get rid of the delayed extents,
because there are no delayed buffers to write out. So when a we write
into said unwritten extent we will convert it to written, but it still
remains delayed.When we try to write into that block later ext4_da_map_blocks() will set
the buffer new and delayed and map it to invalid block which causes
the rest of the block to be zeroed loosing already written data.For now we can fix this by simply not allowing to set delayed status on
written extent in the extent status tree. Also add WARN_ON() to make
sure that we notice if this happens in the future.This problem can be easily reproduced by running the following xfs_io.
xfs_io -f -c "pwrite -S 0xaa 4096 2048" \
-c "falloc 0 131072" \
-c "pwrite -S 0xbb 65536 2048" \
-c "fsync" /mnt/test/fffecho 3 > /proc/sys/vm/drop_caches
xfs_io -c "pwrite -S 0xdd 67584 2048" /mnt/test/fffThis can be theoretically also reproduced by at random by running fsx,
but it's not very reliable, though on machines with bigger page size
(like ppc) this can be seen more often (especially xfstest generic/127)Signed-off-by: Lukas Czerner
Signed-off-by: Theodore Ts'o
Cc: stable@vger.kernel.org
03 Apr, 2015
1 commit
-
Remove unused header files and header files which are included in
ext4.h.Signed-off-by: Sheng Yong
Signed-off-by: Theodore Ts'o
26 Nov, 2014
5 commits
-
Introduce a simple aging to extent status tree. Each extent has a
REFERENCED bit which gets set when the extent is used. Shrinker then
skips entries with referenced bit set and clears the bit. Thus
frequently used extents have higher chances of staying in memory.Signed-off-by: Jan Kara
Signed-off-by: Theodore Ts'o -
Currently flags for extent status tree are defined twice, once shifted
and once without a being shifted. Consolidate these definitions into one
place and make some computations automatic to make adding flags less
error prone. Compiler should be clever enough to figure out these are
constants and generate the same code.Signed-off-by: Jan Kara
Signed-off-by: Theodore Ts'o -
Currently we scan extent status trees of inodes until we reclaim nr_to_scan
extents. This can however require a lot of scanning when there are lots
of delayed extents (as those cannot be reclaimed).Change shrinker to work as shrinkers are supposed to and *scan* only
nr_to_scan extents regardless of how many extents did we actually
reclaim. We however need to be careful and avoid scanning each status
tree from the beginning - that could lead to a situation where we would
not be able to reclaim anything at all when first nr_to_scan extents in
the tree are always unreclaimable. We remember with each inode offset
where we stopped scanning and continue from there when we next come
across the inode.Note that we also need to update places calling __es_shrink() manually
to pass reasonable nr_to_scan to have a chance of reclaiming anything and
not just 1.Signed-off-by: Jan Kara
Signed-off-by: Theodore Ts'o -
Currently callers adding extents to extent status tree were responsible
for adding the inode to the list of inodes with freeable extents. This
is error prone and puts list handling in unnecessarily many places.Just add inode to the list automatically when the first non-delay extent
is added to the tree and remove inode from the list when the last
non-delay extent is removed.Signed-off-by: Jan Kara
Signed-off-by: Theodore Ts'o -
In this commit we discard the lru algorithm for inodes with extent
status tree because it takes significant effort to maintain a lru list
in extent status tree shrinker and the shrinker can take a long time to
scan this lru list in order to reclaim some objects.We replace the lru ordering with a simple round-robin. After that we
never need to keep a lru list. That means that the list needn't be
sorted if the shrinker can not reclaim any objects in the first round.Cc: Andreas Dilger
Signed-off-by: Zheng Liu
Signed-off-by: Jan Kara
Signed-off-by: Theodore Ts'o
21 Oct, 2014
1 commit
-
Pull ext4 updates from Ted Ts'o:
"A large number of cleanups and bug fixes, with some (minor) journal
optimizations"[ This got sent to me before -rc1, but was stuck in my spam folder. - Linus ]
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (67 commits)
ext4: check s_chksum_driver when looking for bg csum presence
ext4: move error report out of atomic context in ext4_init_block_bitmap()
ext4: Replace open coded mdata csum feature to helper function
ext4: delete useless comments about ext4_move_extents
ext4: fix reservation overflow in ext4_da_write_begin
ext4: add ext4_iget_normal() which is to be used for dir tree lookups
ext4: don't orphan or truncate the boot loader inode
ext4: grab missed write_count for EXT4_IOC_SWAP_BOOT
ext4: optimize block allocation on grow indepth
ext4: get rid of code duplication
ext4: fix over-defensive complaint after journal abort
ext4: fix return value of ext4_do_update_inode
ext4: fix mmap data corruption when blocksize < pagesize
vfs: fix data corruption when blocksize < pagesize for mmaped data
ext4: fold ext4_nojournal_sops into ext4_sops
ext4: support freezing ext2 (nojournal) file systems
ext4: fold ext4_sync_fs_nojournal() into ext4_sync_fs()
ext4: don't check quota format when there are no quota files
jbd2: simplify calling convention around __jbd2_journal_clean_checkpoint_list
jbd2: avoid pointless scanning of checkpoint lists
...
02 Sep, 2014
4 commits
-
This commit adds some statictics in extent status tree shrinker. The
purpose to add these is that we want to collect more details when we
encounter a stall caused by extent status tree shrinker. Here we count
the following statictics:
stats:
the number of all objects on all extent status trees
the number of reclaimable objects on lru list
cache hits/misses
the last sorted interval
the number of inodes on lru list
average:
scan time for shrinking some objects
the number of shrunk objects
maximum:
the inode that has max nr. of objects on lru list
the maximum scan time for shrinking some objectsThe output looks like below:
$ cat /proc/fs/ext4/sda1/es_shrinker_info
stats:
28228 objects
6341 reclaimable objects
5281/631 cache hits/misses
586 ms last sorted interval
250 inodes on lru list
average:
153 us scan time
128 shrunk objects
maximum:
255 inode (255 objects, 198 reclaimable)
125723 us max scan timeIf the lru list has never been sorted, the following line will not be
printed:
586ms last sorted interval
If there is an empty lru list, the following lines also will not be
printed:
250 inodes on lru list
...
maximum:
255 inode (255 objects, 198 reclaimable)
0 us max scan timeMeanwhile in this commit a new trace point is defined to print some
details in __ext4_es_shrink().Cc: Andreas Dilger
Cc: Jan Kara
Reviewed-by: Jan Kara
Signed-off-by: Zheng Liu
Signed-off-by: Theodore Ts'o -
This commit improves the trace point of extents status tree. We rename
trace_ext4_es_shrink_enter in ext4_es_count() because it is also used
in ext4_es_scan() and we can not identify them from the result.Further this commit fixes a variable name in trace point in order to
keep consistency with others.Cc: Andreas Dilger
Cc: Jan Kara
Reviewed-by: Jan Kara
Signed-off-by: Zheng Liu
Signed-off-by: Theodore Ts'o -
Make the function name less redundant.
Signed-off-by: Theodore Ts'o
-
Teach ext4_ext_drop_refs() to accept a NULL argument, much like
kfree(). This allows us to drop a lot of checks to make sure path is
non-NULL before calling ext4_ext_drop_refs().Signed-off-by: Theodore Ts'o
13 Jul, 2014
1 commit
-
This fixes the following lockdep complaint:
[ INFO: possible circular locking dependency detected ]
3.16.0-rc2-mm1+ #7 Tainted: G O
-------------------------------------------------------
kworker/u24:0/4356 is trying to acquire lock:
(&(&sbi->s_es_lru_lock)->rlock){+.+.-.}, at: [] __ext4_es_shrink+0x4f/0x2e0but task is already holding lock:
(&ei->i_es_lock){++++-.}, at: [] ext4_es_insert_extent+0x71/0x180which lock already depends on the new lock.
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&ei->i_es_lock);
lock(&(&sbi->s_es_lru_lock)->rlock);
lock(&ei->i_es_lock);
lock(&(&sbi->s_es_lru_lock)->rlock);*** DEADLOCK ***
6 locks held by kworker/u24:0/4356:
#0: ("writeback"){.+.+.+}, at: [] process_one_work+0x180/0x560
#1: ((&(&wb->dwork)->work)){+.+.+.}, at: [] process_one_work+0x180/0x560
#2: (&type->s_umount_key#22){++++++}, at: [] grab_super_passive+0x44/0x90
#3: (jbd2_handle){+.+...}, at: [] start_this_handle+0x189/0x5f0
#4: (&ei->i_data_sem){++++..}, at: [] ext4_map_blocks+0x132/0x550
#5: (&ei->i_es_lock){++++-.}, at: [] ext4_es_insert_extent+0x71/0x180stack backtrace:
CPU: 0 PID: 4356 Comm: kworker/u24:0 Tainted: G O 3.16.0-rc2-mm1+ #7
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
Workqueue: writeback bdi_writeback_workfn (flush-253:0)
ffffffff8213dce0 ffff880014b07538 ffffffff815df0bb 0000000000000007
ffffffff8213e040 ffff880014b07588 ffffffff815db3dd ffff880014b07568
ffff880014b07610 ffff88003b868930 ffff88003b868908 ffff88003b868930
Call Trace:
[] dump_stack+0x4e/0x68
[] print_circular_bug+0x1fb/0x20c
[] __lock_acquire+0x163e/0x1d00
[] ? retint_restore_args+0xe/0xe
[] ? __slab_alloc+0x4a8/0x4ce
[] ? __ext4_es_shrink+0x4f/0x2e0
[] lock_acquire+0x87/0x120
[] ? __ext4_es_shrink+0x4f/0x2e0
[] ? ext4_es_free_extent+0x5d/0x70
[] _raw_spin_lock+0x39/0x50
[] ? __ext4_es_shrink+0x4f/0x2e0
[] ? kmem_cache_alloc+0x18b/0x1a0
[] __ext4_es_shrink+0x4f/0x2e0
[] ext4_es_insert_extent+0xc8/0x180
[] ext4_map_blocks+0x1c4/0x550
[] ext4_writepages+0x6d4/0xd00
...Reported-by: Minchan Kim
Signed-off-by: Theodore Ts'o
Reported-by: Minchan Kim
Cc: stable@vger.kernel.org
Cc: Zheng Liu
13 May, 2014
1 commit
-
In ext4_es_can_be_merged() when checking whether we can merge two
extents we should use EXT_MAX_BLOCKS instead of defining it manually.
Also if it is really the case we should notify userspace because clearly
there is a bug in extent status tree implementation since this should
never happen.Signed-off-by: Lukas Czerner
Signed-off-by: "Theodore Ts'o"
Reviewed-by: Zheng Liu
21 Apr, 2014
1 commit
-
Currently in ext4 there is quite a mess when it comes to naming
unwritten extents. Sometimes we call it uninitialized and sometimes we
refer to it as unwritten.The right name for the extent which has been allocated but does not
contain any written data is _unwritten_. Other file systems are
using this name consistently, even the buffer head state refers to it as
unwritten. We need to fix this confusion in ext4.This commit changes every reference to an uninitialized extent (meaning
allocated but unwritten) to unwritten extent. This includes comments,
function names and variable names. It even covers abbreviation of the
word uninitialized (such as uninit) and some misspellings.This commit does not change any of the code paths at all. This has been
confirmed by comparing md5sums of the assembly code of each object file
after all the function names were stripped from it.Signed-off-by: Lukas Czerner
Signed-off-by: "Theodore Ts'o"
07 Apr, 2014
1 commit
-
'0x7FDEADBEEF' will be truncated to 32-bit number under unicore32. Need
append 'ULL' for it.The related warning (with allmodconfig under unicore32):
CC [M] fs/ext4/extents_status.o
fs/ext4/extents_status.c: In function "__es_remove_extent":
fs/ext4/extents_status.c:813: warning: integer constant is too large for "long" typeSigned-off-by: Chen Gang
Signed-off-by: "Theodore Ts'o"
21 Feb, 2014
1 commit
-
Adjust the conversion specifications in a few optionally compiled debug
messages to match the return type of ext4_es_status(). Also, make a
couple of minor grammatical message edits while we're at it.Signed-off-by: Eric Whitney
Signed-off-by: "Theodore Ts'o"
20 Feb, 2014
1 commit
-
Avoid false positives by static code analysis tools such as sparse and
coverity caused by the fact that we set the physical block, and then
the status in the extent_status structure. It is also more efficient
to set both of these values at once.Addresses-Coverity-Id: #989077
Addresses-Coverity-Id: #989078
Addresses-Coverity-Id: #1080722Signed-off-by: "Theodore Ts'o"
Reviewed-by: Zheng Liu
11 Sep, 2013
1 commit
-
Convert the filesystem shrinkers to use the new API, and standardise some
of the behaviours of the shrinkers at the same time. For example,
nr_to_scan means the number of objects to scan, not the number of objects
to free.I refactored the CIFS idmap shrinker a little - it really needs to be
broken up into a shrinker per tree and keep an item count with the tree
root so that we don't need to walk the tree every time the shrinker needs
to count the number of objects in the tree (i.e. all the time under
memory pressure).[glommer@openvz.org: fixes for ext4, ubifs, nfs, cifs and glock. Fixes are needed mainly due to new code merged in the tree]
[assorted fixes folded in]
Signed-off-by: Dave Chinner
Signed-off-by: Glauber Costa
Acked-by: Mel Gorman
Acked-by: Artem Bityutskiy
Acked-by: Jan Kara
Acked-by: Steven Whitehouse
Cc: Adrian Hunter
Cc: "Theodore Ts'o"
Cc: Adrian Hunter
Cc: Al Viro
Cc: Artem Bityutskiy
Cc: Arve Hjønnevåg
Cc: Carlos Maiolino
Cc: Christoph Hellwig
Cc: Chuck Lever
Cc: Daniel Vetter
Cc: David Rientjes
Cc: Gleb Natapov
Cc: Greg Thelen
Cc: J. Bruce Fields
Cc: Jan Kara
Cc: Jerome Glisse
Cc: John Stultz
Cc: KAMEZAWA Hiroyuki
Cc: Kent Overstreet
Cc: Kirill A. Shutemov
Cc: Marcelo Tosatti
Cc: Mel Gorman
Cc: Steven Whitehouse
Cc: Thomas Hellstrom
Cc: Trond Myklebust
Signed-off-by: Andrew MortonSigned-off-by: Al Viro
29 Aug, 2013
1 commit
-
After applied the commit (4a092d73), we have reduced the number of
source files that need to #include ext4_extents.h. But we can do
better.This commit defines ext4_zeroout_es() in extents.c and move
EXT_MAX_BLOCKS into ext4.h in order not to include ext4_extents.h in
indirect.c and ioctl.c. Meanwhile we just need to include this file in
extent_status.c when ES_AGGRESSIVE_TEST is defined. Otherwise, this
commit removes a duplicated declaration in trace/events/ext4.h.After applied this patch, we just need to include ext4_extents.h file
in {super,migrate,move_extents,extents}.c, and it is easy for us to
define a new extent disk layout.Signed-off-by: Zheng Liu
Signed-off-by: "Theodore Ts'o"
17 Aug, 2013
2 commits
-
Add a new fiemap flag which forces the all of the extents in an inode
to be cached in the extent_status tree. This is critically important
when using AIO to a preallocated file, since if we need to read in
blocks from the extent tree, the io_submit(2) system call becomes
synchronous, and the AIO is no longer "A", which is bad.In addition, for most files which have an external leaf tree block,
the cost of caching the information in the extent status tree will be
less than caching the entire 4k block in the buffer cache. So it is
generally a win to keep the extent information cached.Signed-off-by: "Theodore Ts'o"
-
When we read in an extent tree leaf block from disk, arrange to have
all of its entries cached. In nearly all cases the in-memory
representation will be more compact than the on-disk representation in
the buffer cache, and it allows us to get the information without
having to traverse the extent tree for successive extents.Signed-off-by: "Theodore Ts'o"
Reviewed-by: Zheng Liu