15 May, 2009
12 commits
-
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: Fix race in ext4_inode_info.i_cached_extent
ext4: Clear the unwritten buffer_head flag after the extent is initialized
ext4: Use a fake block number for delayed new buffer_head
ext4: Fix sub-block zeroing for writes into preallocated extents -
devpts_get_sb() calls memset(0) to clear mount options and calls
parse_mount_options() if user specified any mount options.

The memset(0) is bogus since the 'mode' and 'ptmxmode' options are
non-zero by default. parse_mount_options() restores options to default
anyway and can properly deal with NULL mount options.

So in devpts_get_sb() remove memset(0) and call parse_mount_options() even
for NULL mount options.

Bug reported by Eric Paris: http://lkml.org/lkml/2009/5/7/448.
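
A minimal userspace sketch of the idea in this message, not the devpts code itself: the parser resets every option to its default before applying anything, so calling it with a NULL option string is enough and no separate memset() is needed. The struct, the function name parse_opts() and the two default values are placeholders chosen for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Placeholder defaults for illustration only; not taken from the kernel. */
#define DEFAULT_MODE      0600u
#define DEFAULT_PTMX_MODE 0600u

/* Hypothetical stand-in for the devpts mount options. */
struct pts_opts {
    unsigned int mode;
    unsigned int ptmxmode;
};

/*
 * Reset to defaults first, then apply "name=octal" options from a
 * comma-separated string. A NULL string simply leaves the defaults.
 * Returns 0 on success, -1 on an unknown option or allocation failure.
 */
static int parse_opts(const char *data, struct pts_opts *opts)
{
    opts->mode = DEFAULT_MODE;
    opts->ptmxmode = DEFAULT_PTMX_MODE;
    if (data == NULL)
        return 0;

    char *copy = malloc(strlen(data) + 1);
    if (copy == NULL)
        return -1;
    strcpy(copy, data);

    for (char *tok = strtok(copy, ","); tok != NULL; tok = strtok(NULL, ",")) {
        if (strncmp(tok, "mode=", 5) == 0) {
            opts->mode = (unsigned int)strtoul(tok + 5, NULL, 8);
        } else if (strncmp(tok, "ptmxmode=", 9) == 0) {
            opts->ptmxmode = (unsigned int)strtoul(tok + 9, NULL, 8);
        } else {
            free(copy);
            return -1;
        }
    }
    free(copy);
    return 0;
}
```

Calling parse_opts(NULL, &opts) on a zeroed struct leaves both fields at their defaults, which is the behaviour the message relies on.
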
Signed-off-by: Sukadev Bhattiprolu
Tested-by: Marc Dionne
Reported-by: Eric Paris
Cc: Christoph Hellwig
Cc: Alan Cox
Acked-by: Serge Hallyn
Cc: Al Viro
Cc: "Rafael J. Wysocki"
Reviewed-by: "H. Peter Anvin"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
If two CPUs simultaneously call ext4_ext_get_blocks(),
there is nothing protecting the i_cached_extent structure from
being used and updated at the same time. This could potentially cause
the wrong location on disk to be read or written to, including
potentially causing the corruption of the block group descriptors
and/or inode table.

This bug has been in the ext4 code since almost the very beginning of
ext4's development. Fortunately once the data is stored in the page
cache, ext4_get_blocks() doesn't need to be called, so trying to
replicate this problem to the point where we could identify its root
cause was *extremely* difficult. Many thanks to Kevin Shanahan for
working over several months to be able to reproduce this easily so we
could finally nail down the cause of the corruption.

Signed-off-by: "Theodore Ts'o"
Reviewed-by: "Aneesh Kumar K.V" -
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
cifs: fix error handling in parse_DFS_referrals -
* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
Btrfs: Spelling fix in btrfs_lookup_first_block_group comments
Btrfs: make show_options result match actual option names
Btrfs: remove outdated comment in btrfs_ioctl_resize()
Btrfs: remove some WARN_ONs in the IO failure path
Btrfs: Don't loop forever on metadata IO failures
Btrfs: init inode ordered_data_close flag properly -
The BH_Unwritten flag indicates that the buffer is allocated on disk
but has not been written; that is, the block is part of a persistent
preallocation area. That flag should only be set when a get_blocks()
function is looking up an inode's logical to physical block mapping.

When ext4_get_blocks_wrap() is called with create=1, the uninitialized
extent is converted into an initialized one, so the BH_Unwritten flag
is no longer appropriate. Hence, we need to make sure the
BH_Unwritten is not left set, since the combination of BH_Mapped and
BH_Unwritten is not allowed; among other things, it will cause ext4's
get_block() to be called over and over again during the write_begin
phase of write(2).

Signed-off-by: Aneesh Kumar K.V
Signed-off-by: "Theodore Ts'o" -
Signed-off-by: Sankar P
Signed-off-by: Chris Mason -
The notreelog and flushoncommit mount options were being printed slightly
differently.

Signed-off-by: Sage Weil
Signed-off-by: Chris Mason -
In Li Zefan's commit dae7b665cf6d6e6e733f1c9c16cf55547dd37e33,
a combination call of kmalloc() and copy_from_user() is replaced by
memdup_user(). So btrfs_ioctl_resize() doesn't use GFP_NOFS any more.

Signed-off-by: Li Hong
Signed-off-by: Chris Mason -
These debugging WARN_ONs make too much console noise during regular
IO failures. An IO failure will still generate a number of messages
as we verify checksums etc, but these two are not needed.

Signed-off-by: Chris Mason
-
When a btrfs metadata read fails, the first thing we try to do is find
a good copy on another mirror of the block. If this fails, read_tree_block()
ends up returning a buffer that isn't up to date.

The btrfs btree reading code was reworked to drop locks and repeat
the search when IO was done, but the changes didn't add a check for failed
reads. The end result was looping forever on buffers that were never
going to become up to date.

Signed-off-by: Chris Mason
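
The loop described above can be sketched in userspace as follows. This is an illustration under stated assumptions, not the btrfs code: struct buffer, read_block() and search_with_retry() are hypothetical names, and a boolean stands in for whether any mirror can deliver a good copy.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a tree block buffer. */
struct buffer {
    bool uptodate;
};

/* Hypothetical read: succeeds only if some mirror holds a good copy. */
static void read_block(struct buffer *b, bool mirror_ok)
{
    if (mirror_ok)
        b->uptodate = true;
}

/*
 * Repeat the search after IO completes. The check after read_block() is
 * the point of the sketch: without it, a buffer that can never become
 * up to date would keep this loop spinning forever.
 */
static int search_with_retry(struct buffer *b, bool mirror_ok, int *reads)
{
    *reads = 0;
    for (;;) {
        if (b->uptodate)
            return 0;            /* search can proceed on valid data */
        ++*reads;
        read_block(b, mirror_ok);
        if (!b->uptodate)
            return -EIO;         /* failed read: give up instead of looping */
    }
}
```

With a working mirror the function returns 0 after one read; with no good copy it returns -EIO after one read rather than retrying indefinitely.
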
-
This flag is used to decide when we need to send a given file through
the ordered code to make sure it is fully written before a transaction
commits. It was not being properly set to zero when the inode was
being set up.

Signed-off-by: Chris Mason
14 May, 2009
5 commits
-
cifs_strndup_from_ucs returns NULL on error, not an ERR_PTR
Signed-off-by: Jeff Layton
Signed-off-by: Steve French -
* git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-linus:
Squashfs: cody tidying, remove commented out line in Makefile
Squashfs: check page size is not larger than the filesystem block size
Squashfs: fix breakage when page size > metadata block size -
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: destroy bdi on error -
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2:
nilfs2: check size of array structured data exchanged via ioctls
nilfs2: fix lock order reversal in nilfs_clean_segments ioctl
nilfs2: fix possible circular locking for get information ioctls
nilfs2: ensure to clear dirty state when deleting metadata file block
nilfs2: fix circular locking dependency of writer mutex
nilfs2: fix possible recovery failure due to block creation without writer -
We need to mark the buffer_head mapping preallocated space as new
during write_begin. Otherwise we don't zero out the page cache content
properly for a partial write. This will cause file corruption with
preallocation.

Now that we mark the buffer_head new we also need to have a valid
buffer_head blocknr so that unmap_underlying_metadata() unmaps the
correct block.

Signed-off-by: Aneesh Kumar K.V
Signed-off-by: "Theodore Ts'o"
13 May, 2009
7 commits
-
The core VM assumes the page size used by the address_space in
inode->i_mapping is PAGE_SIZE but hugetlbfs breaks this assumption by
inserting pages into the page cache at offsets the core VM considers
unexpected.

This would not be a problem except that hugetlbfs also provides a
->readpage implementation. As it exists, the core VM can assume the
base page size is being used, allocate pages on behalf of the
filesystem, insert them into the page cache and call ->readpage to
populate them. These pages are the wrong size and at the wrong offset
for hugetlbfs causing confusion.

This patch deletes the ->readpage implementation for hugetlbfs on the
grounds the core VM should not be allocating and populating pages on
behalf of hugetlbfs. There should be no existing users of the
->readpage implementation so it should not cause a regression.

Signed-off-by: Mel Gorman
Signed-off-by: Linus Torvalds -
Signed-off-by: Phillip Lougher
-
Normally the block size (by default 128K) will be larger than the
page size, unless a non-standard block size has been specified in
Mksquashfs, and the page size is larger than 4K.

Signed-off-by: Phillip Lougher
-
Squashfs is broken on any system where the page size is larger than
the metadata size (8192). This is easily fixed by ensuring cache->pages
is always > 0.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Doug Chapman
Signed-off-by: Andrew Morton
Signed-off-by: Phillip Lougher -
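
The arithmetic behind the cache->pages fix above can be sketched like this. It is a hedged illustration: the function names and the page_shift parameter are assumptions, and the only claim is that a shift-based page count drops to zero once the page is larger than the 8192-byte metadata block.

```c
/*
 * Page count computed by shifting alone: yields 0 when the page size
 * exceeds the block size (for example 8192-byte blocks, 64K pages).
 */
static int cache_pages_unclamped(unsigned int block_size, unsigned int page_shift)
{
    return (int)(block_size >> page_shift);
}

/* Same computation, clamped so that at least one page is always used. */
static int cache_pages(unsigned int block_size, unsigned int page_shift)
{
    int pages = (int)(block_size >> page_shift);

    return pages > 0 ? pages : 1;
}
```

For an 8192-byte metadata block, a page shift of 12 (4K pages) gives two pages either way, while a page shift of 16 (64K pages) gives zero unclamped and one with the clamp.
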
* 'for-2.6.30' of git://linux-nfs.org/~bfields/linux:
nfsd: silence lockdep warning
lockd: fix list corruption on lockd restart
nfsd4: check for negative dentry before use in nfsv4 readdir
nfsd41: slots are freed with session
svcrdma: clean up error paths.
svcrdma: Fix dma map direction for rdma read targets -
Fix a size check WRT the manual pages. This was inadvertently broken by
commit 9fe5ad9c8cef9ad5873d8ee55d1cf00d9b607df0 ("flag parameters
add-on: remove epoll_create size param").

Signed-off-by: Davide Libenzi
Cc: rohit verma
Cc: Ulrich Drepper
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Use a very large unsigned number (~0xffff) as the fake block number
for the delayed new buffer. The VFS should never try to write out this
number, but if it does, this will make it obvious.

Signed-off-by: Aneesh Kumar K.V
Signed-off-by: "Theodore Ts'o"
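
To make the value in this message concrete, here is a small sketch that assumes block numbers are 64 bits wide; blocknr_t and both function names are hypothetical. Inverting 0xffff at that width clears the low 16 bits and sets all the others, giving a number far beyond any plausible on-disk block.

```c
#include <stdint.h>

/* Assumption for this sketch: block numbers are 64-bit unsigned values. */
typedef uint64_t blocknr_t;

/* Sentinel for a delayed, not yet allocated buffer. */
static blocknr_t fake_delayed_blocknr(void)
{
    return ~(blocknr_t)0xffff;
}

/* A write path could use this to spot the sentinel if it ever leaks out. */
static int is_fake_blocknr(blocknr_t nr)
{
    return nr == fake_delayed_blocknr();
}
```
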
12 May, 2009
3 commits
-
Signed-off-by: J. Bruce Fields
-
The return value of dup2 when oldfd == newfd and the fd isn't valid is
not getting properly sign extended. We end up with 4294967287 instead
of -EBADF.

I've reproduced this on SLE11 (2.6.27.21), openSUSE Factory
(2.6.29-rc5), and Ubuntu 9.04 (2.6.28).

This patch uses a signed int for the error value so it is properly
extended.

Commit 6c5d0512a091480c9f981162227fdb1c9d70e555 introduced this
regression.

Reported-by: Jiri Dluhos
Signed-off-by: Jeff Mahoney
Signed-off-by: Linus Torvalds -
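
The sign-extension problem in the dup2 message can be reproduced in a few lines of userspace C. This is only a sketch with hypothetical function names; it assumes a 64-bit long and that EBADF is 9, as it is on Linux.

```c
#include <errno.h>

/*
 * Error held in an unsigned int and then widened to long: the value is
 * zero-extended, so the caller sees a large positive number.
 */
static long result_via_unsigned(void)
{
    unsigned int err = (unsigned int)-EBADF;

    return err;
}

/* Error held in a signed int: widening sign-extends and keeps -EBADF. */
static long result_via_signed(void)
{
    int err = -EBADF;

    return err;
}
```

Under those assumptions the first function returns 4294967287, the value quoted in the message, and the second returns -EBADF.
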
Although some ioctls of nilfs2 exchange data in the form of indirectly
referenced arrays, some of them lack a size check on the array elements.

This inserts the missing checks and rejects requests if data of ioctl
does not have a valid format.

We usually don't have to check size of structures that we associated
with ioctl commands because the size is tested implicitly for
identifying ioctl command; the checks this patch adds are for the
cases where the implicit check is not applied.

Signed-off-by: Ryusuke Konishi
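
A hedged sketch of the kind of check this message describes: when an ioctl argument points at an array and carries the element size as a field, that size is not covered by the ioctl number, so it has to be compared explicitly. The structures and the function below are hypothetical and are not the nilfs2 definitions.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor for an indirectly referenced array. */
struct array_arg {
    const void *base;      /* user-supplied pointer to the elements */
    uint32_t nmembs;       /* number of elements */
    uint16_t elem_size;    /* element size claimed by the caller */
};

/* Hypothetical element type the handler expects. */
struct item {
    uint64_t blocknr;
    uint64_t ino;
};

/* Reject the request unless the claimed element size matches exactly. */
static int check_array_arg(const struct array_arg *arg, size_t expected_size)
{
    if (arg->elem_size != expected_size)
        return -EINVAL;
    return 0;
}
```

A handler would call check_array_arg(&arg, sizeof(struct item)) before touching any element and return the error to the caller on a mismatch.
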
11 May, 2009
3 commits
-
This is a companion patch to ("nilfs2: fix possible circular locking
for get information ioctls").

This corrects lock order reversal between mm->mmap_sem and
nilfs->ns_segctor_sem in nilfs_clean_segments() which was detected by
lockdep check:

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.30-rc3-nilfs-00003-g360bdc1 #7
-------------------------------------------------------
mmap/5294 is trying to acquire lock:
(&nilfs->ns_segctor_sem){++++.+}, at: [] nilfs_transaction_begin+0xb6/0x10c [nilfs2]

but task is already holding lock:
(&mm->mmap_sem){++++++}, at: [] do_page_fault+0x1d8/0x30a

which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&mm->mmap_sem){++++++}:
[] __lock_acquire+0x1066/0x13b0
[] lock_acquire+0xba/0xdd
[] might_fault+0x68/0x88
[] copy_from_user+0x2a/0x111
[] nilfs_ioctl_prepare_clean_segments+0x1d/0xf1 [nilfs2]
[] nilfs_clean_segments+0x6d/0x1b9 [nilfs2]
[] nilfs_ioctl+0x2ad/0x318 [nilfs2]
[] vfs_ioctl+0x22/0x69
[] do_vfs_ioctl+0x460/0x499
[] sys_ioctl+0x40/0x5a
[] sysenter_do_call+0x12/0x38
[] 0xffffffff

-> #0 (&nilfs->ns_segctor_sem){++++.+}:
[] __lock_acquire+0xdcc/0x13b0
[] lock_acquire+0xba/0xdd
[] down_read+0x2a/0x3e
[] nilfs_transaction_begin+0xb6/0x10c [nilfs2]
[] nilfs_page_mkwrite+0xe7/0x154 [nilfs2]
[] __do_fault+0x165/0x376
[] handle_mm_fault+0x287/0x5d1
[] do_page_fault+0x2fb/0x30a
[] error_code+0x72/0x78
[] 0xffffffff

where nilfs_clean_segments() holds:
nilfs->ns_segctor_sem -> copy_from_user()
--> page fault -> mm->mmap_sem

And, page fault path may hold:
page fault -> mm->mmap_sem
--> nilfs_page_mkwrite() -> nilfs->ns_segctor_sem

Even though nilfs_clean_segments() does not perform write access on
given user pages, it may cause deadlock because nilfs->ns_segctor_sem
is shared per device and mm->mmap_sem can be shared with other tasks.

To avoid this problem, this patch moves all calls of copy_from_user()
outside the nilfs->ns_segctor_sem lock in the ioctl.

Signed-off-by: Ryusuke Konishi
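
The reordering described in this message can be sketched in userspace as follows. It is an analogy under stated assumptions, not the nilfs2 code: a pthread mutex stands in for nilfs->ns_segctor_sem, copy_in() stands in for copy_from_user() (which in the kernel may fault and take mm->mmap_sem), and all names are hypothetical.

```c
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the per-device lock. */
static pthread_mutex_t segctor_lock = PTHREAD_MUTEX_INITIALIZER;

/* State that may only be touched while segctor_lock is held. */
static char segment_state[64];

/* Stand-in for copy_from_user(); the real call may take another lock. */
static int copy_in(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);
    return 0;
}

/*
 * Stage the caller's data first, with no lock held, and only then take
 * the lock to work on the private copy. The copy routine is therefore
 * never called while segctor_lock is held.
 */
static int clean_segments(const void *user_buf, size_t len)
{
    char staged[sizeof segment_state];

    if (len > sizeof staged)
        return -1;
    if (copy_in(staged, user_buf, len) != 0)
        return -1;

    pthread_mutex_lock(&segctor_lock);
    memcpy(segment_state, staged, len);
    pthread_mutex_unlock(&segctor_lock);
    return 0;
}
```

The cost of this ordering is one extra staging buffer; the benefit is that the two locks are never acquired in the order that lockdep flagged.
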
-
This is one of two patches which are to correct possible circular
locking between mm->mmap_sem and nilfs->ns_segctor_sem.

The problem was detected by lockdep check as follows:
=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.30-rc3-nilfs-00002-g3552613 #6
-------------------------------------------------------
mmap/5418 is trying to acquire lock:
(&nilfs->ns_segctor_sem){++++.+}, at: [] nilfs_transaction_begin+0xb6/0x10c [nilfs2]

but task is already holding lock:
(&mm->mmap_sem){++++++}, at: [] do_page_fault+0x1d8/0x30a

which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&mm->mmap_sem){++++++}:
[] __lock_acquire+0x1066/0x13b0
[] lock_acquire+0xba/0xdd
[] might_fault+0x68/0x88
[] copy_to_user+0x2c/0xfc
[] nilfs_ioctl_wrap_copy+0x103/0x160 [nilfs2]
[] nilfs_ioctl+0x30a/0x3b0 [nilfs2]
[] vfs_ioctl+0x22/0x69
[] do_vfs_ioctl+0x460/0x499
[] sys_ioctl+0x40/0x5a
[] sysenter_do_call+0x12/0x38
[] 0xffffffff

-> #0 (&nilfs->ns_segctor_sem){++++.+}:
[] __lock_acquire+0xdcc/0x13b0
[] lock_acquire+0xba/0xdd
[] down_read+0x2a/0x3e
[] nilfs_transaction_begin+0xb6/0x10c [nilfs2]
[] nilfs_page_mkwrite+0xe7/0x154 [nilfs2]
[] __do_fault+0x165/0x376
[] handle_mm_fault+0x287/0x5d1
[] do_page_fault+0x2fb/0x30a
[] error_code+0x72/0x78
[] 0xffffffff

other info that might help us debug this:
1 lock held by mmap/5418:
#0: (&mm->mmap_sem){++++++}, at: [] do_page_fault+0x1d8/0x30a

stack backtrace:
Pid: 5418, comm: mmap Not tainted 2.6.30-rc3-nilfs-00002-g3552613 #6
Call Trace:
[] ? printk+0xf/0x12
[] print_circular_bug_tail+0xaa/0xb5
[] __lock_acquire+0xdcc/0x13b0
[] ? nilfs_sufile_get_stat+0x1e/0x105 [nilfs2]
[] ? up_read+0x16/0x2c
[] ? nilfs_sufile_get_stat+0xfa/0x105 [nilfs2]
[] lock_acquire+0xba/0xdd
[] ? nilfs_transaction_begin+0xb6/0x10c [nilfs2]
[] down_read+0x2a/0x3e
[] ? nilfs_transaction_begin+0xb6/0x10c [nilfs2]
[] nilfs_transaction_begin+0xb6/0x10c [nilfs2]
[] nilfs_page_mkwrite+0xe7/0x154 [nilfs2]
[] __do_fault+0x165/0x376
[] handle_mm_fault+0x287/0x5d1
[] ? do_page_fault+0x1d8/0x30a
[] ? down_read_trylock+0x39/0x43
[] do_page_fault+0x2fb/0x30a
[] ? do_page_fault+0x0/0x30a
[] error_code+0x72/0x78
[] ? do_page_fault+0x0/0x30a

This makes the lock granularity of nilfs->ns_segctor_sem finer than
that of the mmap semaphore for ioctl commands except
nilfs_clean_segments().

The successive patch ("nilfs2: fix lock order reversal in
nilfs_clean_segments ioctl") is required to fully resolve the problem.

Signed-off-by: Ryusuke Konishi
-
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (22 commits)
Fix the race between capifs remount and node creation
Fix races around the access to ->s_options
switch ufs directories to ufs_sync_file()
Switch open_exec() and sys_uselib() to do_open_filp()
Make open_exec() and sys_uselib() use may_open(), instead of duplicating its parts
Reduce path_lookup() abuses
Make checkpatch.pl shut up on fs/inode.c
NULL noise in fs/super.c:kill_bdev_super()
romfs: cleanup romfs_fs.h
ROMFS: romfs_dev_read() error ignored
fs: dcache fix LRU ordering
ocfs2: Use nd_set_link().
Fix deadlock in ipathfs ->get_sb()
Fix a leak in failure exit in 9p ->get_sb()
Convert obvious places to deactivate_locked_super()
New helper: deactivate_locked_super()
reiserfs: remove privroot hiding in lookup
reiserfs: dont associate security.* with xattr files
reiserfs: fixup xattr_root caching
Always lookup priv_root on reiserfs mount and keep it
...
10 May, 2009
1 commit
-
This would fix the following failure during GC:
nilfs_cpfile_delete_checkpoints: cannot delete block
NILFS: GC failed during preparation: cannot delete checkpoints: err=-2

The problem was caused by a break in state consistency between page
cache and btree; the above block was removed from the btree but the
page buffering the block remained in the page cache in dirty
state.

This resolves the inconsistency by clearing the dirty state of
the page buffering the deleted block.

Reported-by: David Arendt
Signed-off-by: Ryusuke Konishi
09 May, 2009
9 commits
-
Put generic_show_options read access to s_options under rcu_read_lock,
split save_mount_options() into "we are setting it the first time"
(uses in foo_fill_super()) and "we are replacing and freeing the old one",
synchronize_rcu() before kfree() in the latter.

Signed-off-by: Al Viro
-
Signed-off-by: Al Viro
-
... and make path_lookup_open() static
Signed-off-by: Al Viro
-
Signed-off-by: Al Viro
-
... use kern_path() where possible
[folded a fix from rdd]
Signed-off-by: Al Viro
-
Code Quality According To Mingo(tm) has been vastly improved,
no code has been damaged^Wchanged^Wdamaged.

[commit message rewritten -- AV]
Signed-off-by: Manish Katiyar
Signed-off-by: Al Viro -
Signed-off-by: H Hartley Sweeten
Cc: Subrata Modak
Signed-off-by: Al Viro -
romfs_dev_read() may return -EIO, but ret is unsigned, so the error path
isn't taken.

Signed-off-by: Roel Kluin
Signed-off-by: Al Viro -
Fix ordering of LRU when moving referenced dentries to the head of the list
(they should go to the head of the list in the same order as they were found
from the tail, rather than reverse order).

Signed-off-by: Nick Piggin
Signed-off-by: Al Viro