06 Jan, 2017
2 commits
-
commit 30a9d7afe70ed6bd9191d3000e2ef1a34fb58493 upstream.
The number of 'counters' elements needed in 'struct sg' is
super_block->s_blocksize_bits + 2. Presently we have 16 'counters'
elements in the array. This is insufficient for block sizes >= 32k. In
such cases the memcpy operation performed in ext4_mb_seq_groups_show()
would cause stack memory corruption.Fixes: c9de560ded61f
Signed-off-by: Chandan Rajendra
Signed-off-by: Theodore Ts'o
Reviewed-by: Jan Kara
Signed-off-by: Greg Kroah-Hartman -
commit 69e43e8cc971a79dd1ee5d4343d8e63f82725123 upstream.
'border' variable is set to a value of 2 times the block size of the
underlying filesystem. With 64k block size, the resulting value won't
fit into a 16-bit variable. Hence this commit changes the data type of
'border' to 'unsigned int'.Fixes: c9de560ded61f
Signed-off-by: Chandan Rajendra
Signed-off-by: Theodore Ts'o
Reviewed-by: Andreas Dilger
Signed-off-by: Greg Kroah-Hartman
29 Jul, 2016
1 commit
-
Pull vfs updates from Al Viro:
"Assorted cleanups and fixes.Probably the most interesting part long-term is ->d_init() - that will
have a bunch of followups in (at least) ceph and lustre, but we'll
need to sort the barrier-related rules before it can get used for
really non-trivial stuff.Another fun thing is the merge of ->d_iput() callers (dentry_iput()
and dentry_unlink_inode()) and a bunch of ->d_compare() ones (all
except the one in __d_lookup_lru())"* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (26 commits)
fs/dcache.c: avoid soft-lockup in dput()
vfs: new d_init method
vfs: Update lookup_dcache() comment
bdev: get rid of ->bd_inodes
Remove last traces of ->sync_page
new helper: d_same_name()
dentry_cmp(): use lockless_dereference() instead of smp_read_barrier_depends()
vfs: clean up documentation
vfs: document ->d_real()
vfs: merge .d_select_inode() into .d_real()
unify dentry_iput() and dentry_unlink_inode()
binfmt_misc: ->s_root is not going anywhere
drop redundant ->owner initializations
ufs: get rid of redundant checks
orangefs: constify inode_operations
missed comment updates from ->direct_IO() prototype change
file_inode(f)->i_mapping is f->f_mapping
trim fsnotify hooks a bit
9p: new helper - v9fs_parent_fid()
debugfs: ->d_parent is never NULL or negative
...
15 Jul, 2016
1 commit
-
If we hit this error when mounted with errors=continue or
errors=remount-ro:EXT4-fs error (device loop0): ext4_mb_mark_diskspace_used:2940: comm ext4.exe: Allocating blocks 5090-6081 which overlap fs metadata
then ext4_mb_new_blocks() will call ext4_mb_release_context() and try to
continue. However, ext4_mb_release_context() is the wrong thing to call
here since we are still actually using the allocation context.Instead, just error out. We could retry the allocation, but there is a
possibility of getting stuck in an infinite loop instead, so this seems
safer.[ Fixed up so we don't return EAGAIN to userspace. --tytso ]
Fixes: 8556e8f3b6 ("ext4: Don't allow new groups to be added during block allocation")
Signed-off-by: Vegard Nossum
Signed-off-by: Theodore Ts'o
Cc: Aneesh Kumar K.V
Cc: stable@vger.kernel.org
27 Jun, 2016
1 commit
-
If there are no pending blocks to be released after a commit, forcing
a journal commit has no hope of helping. It's possible that a commit
had just completed, so if there are now free blocks available for
allocation, it's worth retrying the commit.Reported-by: Chao Yu
Signed-off-by: Theodore Ts'o
30 May, 2016
1 commit
-
it's not needed for file_operations of inodes located on fs defined
in the hosting module and for file_operations that go into procfs.Signed-off-by: Al Viro
06 May, 2016
2 commits
-
Currently, in ext4_mb_init(), there's a loop like the following:
do {
...
offset += 1 << (sb->s_blocksize_bits - i);
i++;
} while (i s_blocksize_bits + 1);Note that the updated offset is used in the loop's next iteration only.
However, at the last iteration, that is at i == sb->s_blocksize_bits + 1,
the shift count becomes equal to (unsigned)-1 > 31 (c.f. C99 6.5.7(3))
and UBSAN reportsUBSAN: Undefined behaviour in fs/ext4/mballoc.c:2621:15
shift exponent 4294967295 is too large for 32-bit type 'int'
[...]
Call Trace:
[] dump_stack+0xbc/0x117
[] ? _atomic_dec_and_lock+0x169/0x169
[] ubsan_epilogue+0xd/0x4e
[] __ubsan_handle_shift_out_of_bounds+0x1fb/0x254
[] ? __ubsan_handle_load_invalid_value+0x158/0x158
[] ? kmem_cache_alloc+0x101/0x390
[] ? ext4_mb_init+0x13b/0xfd0
[] ? create_cache+0x57/0x1f0
[] ? create_cache+0x11a/0x1f0
[] ? mutex_lock+0x38/0x60
[] ? mutex_unlock+0x1b/0x50
[] ? put_online_mems+0x5b/0xc0
[] ? kmem_cache_create+0x117/0x2c0
[] ext4_mb_init+0xc49/0xfd0
[...]Observe that the mentioned shift exponent, 4294967295, equals (unsigned)-1.
Unless compilers start to do some fancy transformations (which at least
GCC 6.0.0 doesn't currently do), the issue is of cosmetic nature only: the
such calculated value of offset is never used again.Silence UBSAN by introducing another variable, offset_incr, holding the
next increment to apply to offset and adjust that one by right shifting it
by one position per loop iteration.Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=114701
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=112161Cc: stable@vger.kernel.org
Signed-off-by: Nicolai Stange
Signed-off-by: Theodore Ts'o -
Currently, in mb_find_order_for_block(), there's a loop like the following:
while (order bd_blkbits + 1) {
...
bb += 1 << (e4b->bd_blkbits - order);
}Note that the updated bb is used in the loop's next iteration only.
However, at the last iteration, that is at order == e4b->bd_blkbits + 1,
the shift count becomes negative (c.f. C99 6.5.7(3)) and UBSAN reportsUBSAN: Undefined behaviour in fs/ext4/mballoc.c:1281:11
shift exponent -1 is negative
[...]
Call Trace:
[] dump_stack+0xbc/0x117
[] ? _atomic_dec_and_lock+0x169/0x169
[] ubsan_epilogue+0xd/0x4e
[] __ubsan_handle_shift_out_of_bounds+0x1fb/0x254
[] ? __ubsan_handle_load_invalid_value+0x158/0x158
[] ? ext4_mb_generate_from_pa+0x590/0x590
[] ? ext4_read_block_bitmap_nowait+0x598/0xe80
[] mb_find_order_for_block+0x1ce/0x240
[...]Unless compilers start to do some fancy transformations (which at least
GCC 6.0.0 doesn't currently do), the issue is of cosmetic nature only: the
such calculated value of bb is never used again.Silence UBSAN by introducing another variable, bb_incr, holding the next
increment to apply to bb and adjust that one by right shifting it by one
position per loop iteration.Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=114701
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=112161Cc: stable@vger.kernel.org
Signed-off-by: Nicolai Stange
Signed-off-by: Theodore Ts'o
27 Apr, 2016
1 commit
-
Messages passed to ext4_warning() or ext4_error() don't need trailing
newlines, because these function add the newlines themselves.Signed-off-by: Jakub Wilk
05 Apr, 2016
2 commits
-
Mostly direct substitution with occasional adjustment or removing
outdated comments.Signed-off-by: Kirill A. Shutemov
Acked-by: Michal Hocko
Signed-off-by: Linus Torvalds -
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
ago with promise that one day it will be possible to implement page
cache with bigger chunks than PAGE_SIZE.This promise never materialized. And unlikely will.
We have many places where PAGE_CACHE_SIZE assumed to be equal to
PAGE_SIZE. And it's constant source of confusion on whether
PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
especially on the border between fs and mm.Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
breakage to be doable.Let's stop pretending that pages in page cache are special. They are
not.The changes are pretty straight-forward:
- << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> ;
- >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> ;
- PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
- page_cache_get() -> get_page();
- page_cache_release() -> put_page();
This patch contains automated changes generated with coccinelle using
script below. For some reason, coccinelle doesn't patch header files.
I've called spatch for them manually.The only adjustment after coccinelle is revert of changes to
PAGE_CAHCE_ALIGN definition: we are going to drop it later.There are few places in the code where coccinelle didn't reach. I'll
fix them manually in a separate patch. Comments and documentation also
will be addressed with the separate patch.virtual patch
@@
expression E;
@@
- E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E@@
expression E;
@@
- E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E@@
@@
- PAGE_CACHE_SHIFT
+ PAGE_SHIFT@@
@@
- PAGE_CACHE_SIZE
+ PAGE_SIZE@@
@@
- PAGE_CACHE_MASK
+ PAGE_MASK@@
expression E;
@@
- PAGE_CACHE_ALIGN(E)
+ PAGE_ALIGN(E)@@
expression E;
@@
- page_cache_get(E)
+ get_page(E)@@
expression E;
@@
- page_cache_release(E)
+ put_page(E)Signed-off-by: Kirill A. Shutemov
Acked-by: Michal Hocko
Signed-off-by: Linus Torvalds
14 Mar, 2016
1 commit
-
This might be unexpected but pages allocated for sbi->s_buddy_cache are
charged to current memory cgroup. So, GFP_NOFS allocation could fail if
current task has been killed by OOM or if current memory cgroup has no
free memory left. Block allocator cannot handle such failures here yet.Signed-off-by: Konstantin Khlebnikov
Signed-off-by: Theodore Ts'o
10 Mar, 2016
1 commit
-
Signed-off-by: Adam Buchbinder
Signed-off-by: Theodore Ts'o
22 Feb, 2016
1 commit
-
Now, ext4_free_blocks() doesn't revoke data blocks of per-file data
journalled inode and it can cause file data inconsistency problems.
Even though data blocks of per-file data journalled inode are already
forgotten by jbd2_journal_invalidatepage() in advance of invoking
ext4_free_blocks(), we still need to revoke the data blocks here.
Moreover some of the metadata blocks, which are not found by
sb_find_get_block(), are still needed to be revoked, but this is also
missing here.Signed-off-by: Daeho Jeong
Signed-off-by: Theodore Ts'o
Reviewed-by: Jan Kara
12 Feb, 2016
1 commit
-
This patch adds a line break for proc mb_groups display.
Signed-off-by: Huaitong Han
Signed-off-by: Theodore Ts'o
Reviewed-by: Andreas Dilger
10 Nov, 2015
1 commit
-
Switch everything to the new and more capable implementation of abs().
Mainly to give the new abs() a bit of a workout.Cc: Michal Nazarewicz
Cc: John Stultz
Cc: Ingo Molnar
Cc: Steven Rostedt
Cc: Peter Zijlstra
Cc: Masami Hiramatsu
Cc: Peter Zijlstra
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
19 Oct, 2015
1 commit
-
The ext4_fsblk_t type is a long long, which should not be used
with abs(), as is done in ext4_mb_check_group_pa().This patch modifies ext4_mb_check_group_pa() to use abs64()
instead.Signed-off-by: John Stultz
Signed-off-by: Theodore Ts'o
18 Oct, 2015
2 commits
-
When you repeatly execute xfstest generic/269 with bigalloc_1k option
enabled using the below command:"./kvm-xfstests -c bigalloc_1k -m nodelalloc -C 1000 generic/269"
you can easily see the below bug message.
"JBD2 unexpected failure: jbd2_journal_revoke: !buffer_revoked(bh);"
This means that an already revoked buffer is erroneously revoked again
and it is caused by doing revoke for the buffer at the wrong position
in ext4_free_blocks(). We need to re-position the buffer revoke
procedure for an unspecified buffer after checking the cluster boundary
for bigalloc option. If not, some part of the cluster can be doubly
revoked.Signed-off-by: Daeho Jeong
-
Make the bitmap reaading routines return real error codes (EIO,
EFSCORRUPTED, EFSBADCRC) which can then be reflected back to
userspace for more precise diagnosis work.In particular, this means that mballoc no longer claims that we're out
of memory if the block bitmaps become corrupt.Signed-off-by: Darrick J. Wong
Signed-off-by: Theodore Ts'o
24 Sep, 2015
1 commit
-
This allows us to refactor the procfs code, which saves a bit of
compiled space. More importantly it isolates most of the procfs
support code into a single file, so it's easier to #ifdef it out if
the proc file system has been disabled.Signed-off-by: Theodore Ts'o
06 Jul, 2015
2 commits
-
Pull ext4 bugfixes from Ted Ts'o:
"Bug fixes (all for stable kernels) for ext4:- address corner cases for indirect blocks->extent migration
- fix reserved block accounting invalidate_page when
page_size != block_size (i.e., ppc or 1k block size file systems)- fix deadlocks when a memcg is under heavy memory pressure
- fix fencepost error in lazytime optimization"
* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: replace open coded nofail allocation in ext4_free_blocks()
ext4: correctly migrate a file with a hole at the beginning
ext4: be more strict when migrating to non-extent based file
ext4: fix reservation release on invalidatepage for delalloc fs
ext4: avoid deadlocks in the writeback path by using sb_getblk_gfp
bufferhead: Add _gfp version for sb_getblk()
ext4: fix fencepost error in lazytime optimization -
ext4_free_blocks is looping around the allocation request and mimics
__GFP_NOFAIL behavior without any allocation fallback strategy. Let's
remove the open coded loop and replace it with __GFP_NOFAIL. Without the
flag the allocator has no way to find out never-fail requirement and
cannot help in any way.Signed-off-by: Michal Hocko
Signed-off-by: Theodore Ts'o
Cc: stable@vger.kernel.org
26 Jun, 2015
1 commit
-
Pull cgroup writeback support from Jens Axboe:
"This is the big pull request for adding cgroup writeback support.This code has been in development for a long time, and it has been
simmering in for-next for a good chunk of this cycle too. This is one
of those problems that has been talked about for at least half a
decade, finally there's a solution and code to go with it.Also see last weeks writeup on LWN:
http://lwn.net/Articles/648292/"
* 'for-4.2/writeback' of git://git.kernel.dk/linux-block: (85 commits)
writeback, blkio: add documentation for cgroup writeback support
vfs, writeback: replace FS_CGROUP_WRITEBACK with SB_I_CGROUPWB
writeback: do foreign inode detection iff cgroup writeback is enabled
v9fs: fix error handling in v9fs_session_init()
bdi: fix wrong error return value in cgwb_create()
buffer: remove unusued 'ret' variable
writeback: disassociate inodes from dying bdi_writebacks
writeback: implement foreign cgroup inode bdi_writeback switching
writeback: add lockdep annotation to inode_to_wb()
writeback: use unlocked_inode_to_wb transaction in inode_congested()
writeback: implement unlocked_inode_to_wb transaction and use it for stat updates
writeback: implement [locked_]inode_to_wb_and_lock_list()
writeback: implement foreign cgroup inode detection
writeback: make writeback_control track the inode being written back
writeback: relocate wb[_try]_get(), wb_put(), inode_{attach|detach}_wb()
mm: vmscan: disable memcg direct reclaim stalling if cgroup writeback support is in use
writeback: implement memcg writeback domain based throttling
writeback: reset wb_domain->dirty_limit[_tstmp] when memcg domain size changes
writeback: implement memcg wb_domain
writeback: update wb_over_bg_thresh() to use wb_domain aware operations
...
15 Jun, 2015
1 commit
-
Making a function call with 20 arguments is rather expensive in both
stack and .text. In this case, doing the formatting manually doesn't
make it any less readable, so we might as well save 155 bytes of .text
and 112 bytes of stack.Signed-off-by: Rasmus Villemoes
08 Jun, 2015
2 commits
-
Currently ext4_mb_good_group() only returns 0 or 1 depending on whether
the allocation group is suitable for use or not. However we might get
various errors and fail while initializing new group including -EIO
which would never get propagated up the call chain. This might lead to
an endless loop at writeback when we're trying to find a good group to
allocate from and we fail to initialize new group (read error for
example).Fix this by returning proper error code from ext4_mb_good_group() and
using it in ext4_mb_regular_allocator(). In ext4_mb_regular_allocator()
we will always return only the first occurred error from
ext4_mb_good_group() and we only propagate it back to the caller if we
do not get any other errors and we fail to allocate any blocks.Note that with other modes than errors=continue, we will fail
immediately in ext4_mb_good_group() in case of error, however with
errors=continue we should try to continue using the file system, that's
why we're not going to fail immediately when we see an error from
ext4_mb_good_group(), but rather when we fail to find a suitable block
group to allocate from due to an problem in group initialization.Signed-off-by: Lukas Czerner
Signed-off-by: Theodore Ts'o
Reviewed-by: Darrick J. Wong -
Currently on the machines with page size > block size when initializing
block group buddy cache we initialize it for all the block group bitmaps
in the page. However in the case of read error, checksum error, or if
a single bitmap is in any way corrupted we would fail to initialize all
of the bitmaps. This is problematic because we will not have access to
the other allocation groups even though those might be perfectly fine
and usable.Fix this by reading all the bitmaps instead of error out on the first
problem and simply skip the bitmaps which were either not read properly,
or are not valid.Signed-off-by: Lukas Czerner
Signed-off-by: Theodore Ts'o
02 Jun, 2015
1 commit
-
With the planned cgroup writeback support, backing-dev related
declarations will be more widely used across block and cgroup;
unfortunately, including backing-dev.h from include/linux/blkdev.h
makes cyclic include dependency quite likely.This patch separates out backing-dev-defs.h which only has the
essential definitions and updates blkdev.h to include it. c files
which need access to more backing-dev details now include
backing-dev.h directly. This takes backing-dev.h off the common
include dependency chain making it a lot easier to use it across block
and cgroup.v2: fs/fat build failure fixed.
Signed-off-by: Tejun Heo
Reviewed-by: Jan Kara
Cc: Jens Axboe
Signed-off-by: Jens Axboe
26 Nov, 2014
2 commits
-
The iput() function tests whether its argument is NULL and then
returns immediately. Thus the test around the call is not needed.This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring
Signed-off-by: Theodore Ts'o -
We must use GFP_NOFS instead GFP_KERNEL inside ext4_mb_add_groupinfo
and ext4_calculate_overhead() because they are called from inside a
journal transaction. Call trace:ioctl
->ext4_group_add
->journal_start
->ext4_setup_new_descs
->ext4_mb_add_groupinfo -> GFP_KERNEL
->ext4_flex_group_add
->ext4_update_super
->ext4_calculate_overhead -> GFP_KERNEL
->journal_stopSigned-off-by: Dmitry Monakhov
Signed-off-by: Theodore Ts'o
21 Nov, 2014
1 commit
-
Signed-off-by: Al Viro
Signed-off-by: Theodore Ts'o
21 Oct, 2014
1 commit
-
Pull ext4 updates from Ted Ts'o:
"A large number of cleanups and bug fixes, with some (minor) journal
optimizations"[ This got sent to me before -rc1, but was stuck in my spam folder. - Linus ]
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (67 commits)
ext4: check s_chksum_driver when looking for bg csum presence
ext4: move error report out of atomic context in ext4_init_block_bitmap()
ext4: Replace open coded mdata csum feature to helper function
ext4: delete useless comments about ext4_move_extents
ext4: fix reservation overflow in ext4_da_write_begin
ext4: add ext4_iget_normal() which is to be used for dir tree lookups
ext4: don't orphan or truncate the boot loader inode
ext4: grab missed write_count for EXT4_IOC_SWAP_BOOT
ext4: optimize block allocation on grow indepth
ext4: get rid of code duplication
ext4: fix over-defensive complaint after journal abort
ext4: fix return value of ext4_do_update_inode
ext4: fix mmap data corruption when blocksize < pagesize
vfs: fix data corruption when blocksize < pagesize for mmaped data
ext4: fold ext4_nojournal_sops into ext4_sops
ext4: support freezing ext2 (nojournal) file systems
ext4: fold ext4_sync_fs_nojournal() into ext4_sync_fs()
ext4: don't check quota format when there are no quota files
jbd2: simplify calling convention around __jbd2_journal_clean_checkpoint_list
jbd2: avoid pointless scanning of checkpoint lists
...
15 Oct, 2014
1 commit
-
Pull percpu consistent-ops changes from Tejun Heo:
"Way back, before the current percpu allocator was implemented, static
and dynamic percpu memory areas were allocated and handled separately
and had their own accessors. The distinction has been gone for many
years now; however, the now duplicate two sets of accessors remained
with the pointer based ones - this_cpu_*() - evolving various other
operations over time. During the process, we also accumulated other
inconsistent operations.This pull request contains Christoph's patches to clean up the
duplicate accessor situation. __get_cpu_var() uses are replaced with
with this_cpu_ptr() and __this_cpu_ptr() with raw_cpu_ptr().Unfortunately, the former sometimes is tricky thanks to C being a bit
messy with the distinction between lvalues and pointers, which led to
a rather ugly solution for cpumask_var_t involving the introduction of
this_cpu_cpumask_var_ptr().This converts most of the uses but not all. Christoph will follow up
with the remaining conversions in this merge window and hopefully
remove the obsolete accessors"* 'for-3.18-consistent-ops' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (38 commits)
irqchip: Properly fetch the per cpu offset
percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t -fix
ia64: sn_nodepda cannot be assigned to after this_cpu conversion. Use __this_cpu_write.
percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t
Revert "powerpc: Replace __get_cpu_var uses"
percpu: Remove __this_cpu_ptr
clocksource: Replace __this_cpu_ptr with raw_cpu_ptr
sparc: Replace __get_cpu_var uses
avr32: Replace __get_cpu_var with __this_cpu_write
blackfin: Replace __get_cpu_var uses
tile: Use this_cpu_ptr() for hardware counters
tile: Replace __get_cpu_var uses
powerpc: Replace __get_cpu_var uses
alpha: Replace __get_cpu_var
ia64: Replace __get_cpu_var uses
s390: cio driver &__get_cpu_var replacements
s390: Replace __get_cpu_var uses
mips: Replace __get_cpu_var uses
MIPS: Replace __get_cpu_var uses in FPU emulator.
arm: Replace __this_cpu_ptr with raw_cpu_ptr
...
02 Oct, 2014
1 commit
-
Reviewed-by: Jan Kara
Signed-off-by: Dmitry Monakhov
Signed-off-by: Theodore Ts'o
05 Sep, 2014
2 commits
-
Having done a full regression test, we can now drop the
DELALLOC_RESERVED state flag.Signed-off-by: Theodore Ts'o
Reviewed-by: Jan Kara -
The EXT4_STATE_DELALLOC_RESERVED flag was originally implemented
because it was too hard to make sure the mballoc and get_block flags
could be reliably passed down through all of the codepaths that end up
calling ext4_mb_new_blocks().Since then, we have mb_flags passed down through most of the code
paths, so getting rid of EXT4_STATE_DELALLOC_RESERVED isn't as tricky
as it used to.This commit plumbs in the last of what is required, and then adds a
WARN_ON check to make sure we haven't missed anything. If this passes
a full regression test run, we can then drop
EXT4_STATE_DELALLOC_RESERVED.Signed-off-by: Theodore Ts'o
Reviewed-by: Jan Kara
27 Aug, 2014
1 commit
-
__this_cpu_ptr is being phased out use raw_cpu_ptr instead which was
introduced in 3.15-rc1.Cc: Jens Axboe
Signed-off-by: Christoph Lameter
Signed-off-by: Tejun Heo
24 Aug, 2014
1 commit
-
If we suffer a block allocation failure (for example due to a memory
allocation failure), it's possible that we will call
ext4_discard_allocated_blocks() before we've actually allocated any
blocks. In that case, fe_len and fe_start in ac->ac_f_ex will still
be zero, and this will result in mb_free_blocks(inode, e4b, 0, 0)
triggering the BUG_ON on mb_free_blocks():BUG_ON(last >= (sb->s_blocksize << 3));
Fix this by bailing out of ext4_discard_allocated_blocks() if fs_len
is zero.Also fix a missing ext4_mb_unload_buddy() call in
ext4_discard_allocated_blocks().Google-Bug-Id: 16844242
Fixes: 86f0afd463215fc3e58020493482faa4ac3a4d69
Signed-off-by: Theodore Ts'o
Cc: stable@vger.kernel.org
31 Jul, 2014
1 commit
-
If there is a failure while allocating the preallocation structure, a
number of blocks can end up getting marked in the in-memory buddy
bitmap, and then not getting released. This can result in the
following corruption getting reported by the kernel:EXT4-fs error (device sda3): ext4_mb_generate_buddy:758: group 1126,
12793 clusters in bitmap, 12729 in gdIn that case, we need to release the blocks using mb_free_blocks().
Tested: fs smoke test; also demonstrated that with injected errors,
the file system is no longer getting corruptedGoogle-Bug-Id: 16657874
Signed-off-by: "Theodore Ts'o"
Cc: stable@vger.kernel.org
28 Jul, 2014
1 commit
-
As the member fe_len defined in struct ext4_free_extent is expressed as
number of clusters, the variable "size" computation is wrong, we need to
first translate fe_len to block number, then to bytes.Signed-off-by: Xiaoguang Wang
Signed-off-by: Theodore Ts'o
Reviewed-by: Lukas Czerner
15 Jul, 2014
1 commit
-
Commit 27dd43854227b ("ext4: introduce reserved space") reserves 2% of
the file system space to make sure metadata allocations will always
succeed. Given that, tracking the reservation of metadata blocks is
no longer necessary.Signed-off-by: Theodore Ts'o