28 Jun, 2011
4 commits
-
Under heavy memory and filesystem load, users observe the assertion
mapping->nrpages == 0 in end_writeback() trigger. This can be caused by
page reclaim reclaiming the last page from a mapping in the following
race:

CPU0                                    CPU1
...
shrink_page_list()
  __remove_mapping()
    __delete_from_page_cache()
      radix_tree_delete()
                                        evict_inode()
                                          truncate_inode_pages()
                                            truncate_inode_pages_range()
                                              pagevec_lookup() - finds nothing
                                          end_writeback()
                                            mapping->nrpages != 0 -> BUG
      page->mapping = NULL
      mapping->nrpages--

Fix the problem by doing a reliable check of mapping->nrpages under
mapping->tree_lock in end_writeback().

Analyzed by Jay, lost in LKML, and dug out by Miklos Szeredi.

Cc: Jay
Cc: Miklos Szeredi
Signed-off-by: Jan Kara
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
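A minimal userspace sketch of the idea behind the nrpages fix above, assuming a toy spinlock built on C11 atomic_flag in place of the kernel's mapping->tree_lock. All sketch_* names are hypothetical and are not kernel APIs. It shows why a locked read is reliable: the reclaim side removes the page and decrements the counter inside one critical section, so a reader holding the same lock can never see the tree already empty while the counter is still non-zero.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct address_space: a lock plus a page count. */
struct sketch_mapping {
    atomic_flag tree_lock;   /* toy spinlock, stands in for mapping->tree_lock */
    unsigned long nrpages;   /* protected by tree_lock */
};

static void sketch_mapping_init(struct sketch_mapping *m, unsigned long nrpages)
{
    atomic_flag_clear(&m->tree_lock);
    m->nrpages = nrpages;
}

static void sketch_lock(struct sketch_mapping *m)
{
    /* Spin until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&m->tree_lock, memory_order_acquire))
        ;
}

static void sketch_unlock(struct sketch_mapping *m)
{
    atomic_flag_clear_explicit(&m->tree_lock, memory_order_release);
}

/* Reclaim side: tree removal and counter update form one critical section. */
static void sketch_remove_page(struct sketch_mapping *m)
{
    sketch_lock(m);
    assert(m->nrpages > 0);
    /* (the tree deletion and page->mapping = NULL would happen here) */
    m->nrpages--;
    sketch_unlock(m);
}

/* Evict side: read the counter only while holding the lock. */
static bool sketch_mapping_empty(struct sketch_mapping *m)
{
    bool empty;

    sketch_lock(m);
    empty = (m->nrpages == 0);
    sketch_unlock(m);
    return empty;
}
```

In this sketch a caller on the evict side would assert sketch_mapping_empty() instead of reading nrpages directly; taking the lock forces it to wait until any concurrent sketch_remove_page() has finished.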
romfs_get_unmapped_area() checks the argument `len' without considering
PAGE_ALIGN, which causes do_mmap_pgoff() to return -EINVAL after
commit f67d9b1576c ("nommu: add page_align to mmap").

Fix the check by changing it in the same way ramfs_nommu_get_unmapped_area()
was changed in ramfs/file-nommu.c.

Signed-off-by: Bob Liu
Cc: David Howells
Cc: Paul Mundt
Acked-by: Greg Ungerer
Cc: Geert Uytterhoeven
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
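A small userspace sketch of the length check discussed in the romfs commit above. It assumes a 4096-byte page and uses hypothetical helper names (range_ok_bytes, range_ok_pages); the commit text does not show the actual patch, so this only illustrates why a byte-granular comparison rejects a request once `len' has been rounded up to a page multiple, while a page-granular comparison accepts it.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_PAGE_SHIFT 12UL                      /* assumed 4 KiB pages */
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)

/* Round a byte count up to a whole number of pages. */
static unsigned long sketch_pages(unsigned long bytes)
{
    return (bytes + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
}

/* Byte-granular check: compares the (already page-aligned) len with i_size. */
static bool range_ok_bytes(unsigned long isize, unsigned long pgoff,
                           unsigned long len)
{
    unsigned long offset = pgoff << SKETCH_PAGE_SHIFT;

    return offset <= isize && len <= isize && offset <= isize - len;
}

/* Page-granular check: compares page counts, so alignment padding is fine. */
static bool range_ok_pages(unsigned long isize, unsigned long pgoff,
                           unsigned long len)
{
    unsigned long lpages = sketch_pages(len);
    unsigned long maxpages = sketch_pages(isize);

    return pgoff < maxpages && maxpages - pgoff >= lpages;
}
```

For a 5000-byte file mapped from page 0, a caller that has page-aligned the length passes len = 8192: the byte-granular check refuses it because 8192 > 5000, while the page-granular check sees two pages requested out of two available.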
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
  btrfs: fix inconsonant inode information
  Btrfs: make sure to update total_bitmaps when freeing cache V3
  Btrfs: fix type mismatch in find_free_extent()
  Btrfs: make sure to record the transid in new inodes
-
* 'for-linus' of git://oss.sgi.com/xfs/xfs:
  xfs: prevent bogus assert when trying to remove non-existent attribute
  xfs: clear XFS_IDIRTY_RELEASE on truncate down
  xfs: reset inode per-lifetime state when recycling it
27 Jun, 2011
3 commits
-
When iputting the inode, we may leave its delayed node in place if it still
has delayed items that have not been dealt with. So when the inode is read
again, we must look up the corresponding delayed node and use the information
in it to initialize the inode. Otherwise we will get inconsistent inode
information, which may cause the same directory index number to be allocated
again and trigger the following oops:

[ 5447.554187] err add delayed dir index item(name: pglog_0.965_0) into the
insertion tree of the delayed node(root id: 262, inode id: 258, errno: -17)
[ 5447.569766] ------------[ cut here ]------------
[ 5447.575361] kernel BUG at fs/btrfs/delayed-inode.c:1301!
[SNIP]
[ 5447.790721] Call Trace:
[ 5447.793191] [] btrfs_insert_dir_item+0x189/0x1bb [btrfs]
[ 5447.800156] [] btrfs_add_link+0x12b/0x191 [btrfs]
[ 5447.806517] [] btrfs_add_nondir+0x31/0x58 [btrfs]
[ 5447.812876] [] btrfs_create+0xf9/0x197 [btrfs]
[ 5447.818961] [] vfs_create+0x72/0x92
[ 5447.824090] [] do_last+0x22c/0x40b
[ 5447.829133] [] path_openat+0xc0/0x2ef
[ 5447.834438] [] ? __perf_event_task_sched_out+0x24/0x44
[ 5447.841216] [] ? perf_event_task_sched_out+0x59/0x67
[ 5447.847846] [] do_filp_open+0x3d/0x87
[ 5447.853156] [] ? strncpy_from_user+0x43/0x4d
[ 5447.859072] [] ? getname_flags+0x2e/0x80
[ 5447.864636] [] ? do_getname+0x14b/0x173
[ 5447.870112] [] ? audit_getname+0x16/0x26
[ 5447.875682] [] ? spin_lock+0xe/0x10
[ 5447.880882] [] do_sys_open+0x69/0xae
[ 5447.886153] [] sys_open+0x20/0x22
[ 5447.891114] [] system_call_fastpath+0x16/0x1b

Fix it by reusing the old delayed node.
Reported-by: Jim Schutt
Signed-off-by: Miao Xie
Tested-by: Jim Schutt
Signed-off-by: Chris Mason
-
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  cifs: mark CONFIG_CIFS_NFSD_EXPORT as BROKEN
  cifs: free blkcipher in smbhash
-
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  cifs: propagate errors from cifs_get_root() to mount(2)
  cifs: tidy cifs_do_mount() up a bit
  cifs: more breakage on mount failures
  cifs: close sget() races
  cifs: pull freeing mountdata/dropping nls/freeing cifs_sb into cifs_umount()
  cifs: move cifs_umount() call into ->kill_sb()
  cifs: pull cifs_mount() call up
  sanitize cifs_umount() prototype
  cifs: initialize ->tlink_tree in cifs_setup_cifs_sb()
  cifs: allocate mountdata earlier
  cifs: leak on mount if we share superblock
  cifs: don't pass superblock to cifs_mount()
  cifs: don't leak nls on mount failure
  cifs: double free on mount failure
  take bdi setup/destruction into cifs_mount/cifs_umount

Acked-by: Steve French
25 Jun, 2011
20 commits
-
A user reported this bug again where we have more bitmaps than we are supposed
to. This is because we failed to load the free space cache, but don't update
the ctl->total_bitmaps counter when we remove entries from the tree. This patch
fixes this problem and we should be good to go again. Thanks,

Signed-off-by: Josef Bacik
Signed-off-by: Chris Mason
-
data parameter should be u64 because a full-sized chunk flags field is
passed instead of 0/1 for distinguishing data from metadata. All
underlying functions expect u64.

Signed-off-by: Ilya Dryomov
Signed-off-by: Chris Mason -
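To illustrate the kind of type mismatch the find_free_extent() commit above describes, here is a hedged userspace sketch. The flag names and the bit positions are hypothetical (they are not Btrfs constants), and the narrow parameter is modelled as uint32_t so that the truncation is well defined in portable C. It only shows that a 64-bit flags word passed through a narrower parameter silently loses its upper bits, while a u64-style parameter preserves them.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits, chosen so that one lies above bit 31. */
#define SKETCH_FLAG_DATA (UINT64_C(1) << 0)
#define SKETCH_FLAG_HIGH (UINT64_C(1) << 40)

/* Narrow parameter: the caller's 64-bit value is truncated on the way in. */
static uint64_t via_narrow_param(uint32_t data)
{
    return data;
}

/* Wide parameter: the full flags word reaches the callee unchanged. */
static uint64_t via_wide_param(uint64_t data)
{
    return data;
}
```

Passing SKETCH_FLAG_DATA | SKETCH_FLAG_HIGH to via_narrow_param() yields only SKETCH_FLAG_DATA, which is why a parameter that carries a full flags field needs the same width as the functions underneath it.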
... instead of just failing with -EINVAL
Acked-by: Pavel Shilovsky
Reviewed-by: Jeff Layton
Signed-off-by: Al Viro
-
Acked-by: Pavel Shilovsky
Reviewed-by: Jeff Layton
Signed-off-by: Al Viro
-
if cifs_get_root() fails, we end up with ->mount() returning NULL,
which is not what callers expect. Moreover, in case of superblock
reuse we end up leaking a superblock reference...

Acked-by: Pavel Shilovsky
Reviewed-by: Jeff Layton
Signed-off-by: Al Viro
-
have ->s_fs_info set by the set() callback passed to sget()
Acked-by: Pavel Shilovsky
Reviewed-by: Jeff Layton
Signed-off-by: Al Viro
-
all callers of cifs_umount() proceed to do the same thing; pull it into
cifs_umount() itself.

Acked-by: Pavel Shilovsky
Reviewed-by: Jeff Layton
Signed-off-by: Al Viro
-
instead of calling it manually in case cifs_read_super() fails
to set ->s_root, just call it from ->kill_sb(). cifs_put_super()
is gone now *and* we have cifs_sb shutdown and destruction done
after the superblock is gone from ->s_instances.

Acked-by: Pavel Shilovsky
Reviewed-by: Jeff Layton
Signed-off-by: Al Viro
-
... to the point prior to sget(). Now we have cifs_sb set up early
enough.

Acked-by: Pavel Shilovsky
Reviewed-by: Jeff Layton
Signed-off-by: Al Viro
-
a) superblock argument is unused
b) it always returns 0

Acked-by: Pavel Shilovsky
Reviewed-by: Jeff Layton
Signed-off-by: Al Viro
-
no need to wait until cifs_read_super() and we need it done
by the time cifs_mount() will be called.

Acked-by: Pavel Shilovsky
Reviewed-by: Jeff Layton
Signed-off-by: Al Viro
-
pull mountdata allocation up, so that it won't stand in the way when
we lift cifs_mount() to a location before sget().

Acked-by: Pavel Shilovsky
Reviewed-by: Jeff Layton
Signed-off-by: Al Viro
-
cifs_sb and nls end up leaked...
Acked-by: Pavel Shilovsky
Reviewed-by: Jeff Layton
Signed-off-by: Al Viro
-
To close sget() races we'll need to be able to set cifs_sb up before
we get the superblock, so we'll want to be able to do cifs_mount()
earlier. Fortunately, it's easy to do - setting ->s_maxbytes can
be done in cifs_read_super(), ditto for ->s_time_gran and as for
putting MS_POSIXACL into ->s_flags, we can mirror it in ->mnt_cifs_flags
until cifs_read_super() is called. Kill unused 'devname' argument,
while we are at it...

Acked-by: Pavel Shilovsky
Reviewed-by: Jeff Layton
Signed-off-by: Al Viro
-
if cifs_sb allocation fails, we still need to drop nls we'd stashed
into volume_info - the one we would've copied to cifs_sb if we could
allocate the latter.

Acked-by: Pavel Shilovsky
Reviewed-by: Jeff Layton
Signed-off-by: Al Viro
-
if we get to out_super with ->s_root already set (e.g. with
cifs_get_root() failure), we'll end up with cifs_put_super()
called and ->mountdata freed twice. We'll also get cifs_sb
freed twice and cifs_sb->local_nls dropped twice. The problem
is, we can get to out_super both with and without ->s_root,
which makes ->put_super() a bad place for such work.

Switch to ->kill_sb(), have all that work done there after
kill_anon_super(). Unlike ->put_super(), ->kill_sb() is
called by deactivate_locked_super() whether we have ->s_root
or not.

Acked-by: Pavel Shilovsky
Reviewed-by: Jeff Layton
Signed-off-by: Al Viro
-
Acked-by: Pavel Shilovsky
Reviewed-by: Jeff Layton
Signed-off-by: Al Viro
-
This does not work properly with CIFS as current servers do not
enable support for the FILE_OPEN_BY_FILE_ID on SMB NTCreateX
and not all NFS clients handle ESTALE.

For now, it just plain doesn't work. Mark it BROKEN to discourage
distros from enabling it.

Signed-off-by: Jeff Layton
Signed-off-by: Steve French
-
When we create a new inode, we aren't filling in the
field that records the transaction that last changed this
inode.

If we then go to fsync that inode, it will be skipped because the field
isn't filled in.

Signed-off-by: Chris Mason
-
This is currently leaked in the rc == 0 case.
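A hedged sketch of the leak shape described above, in plain userspace C. The helpers (sketch_alloc, sketch_free, sketch_encrypt) and the live_handles counter are hypothetical stand-ins, not the crypto API; the commit text does not show the patch, so this is only one way such a leak can look: the handle is released on the error branch but not when rc == 0, and the repair is to release it on every path.

```c
#include <assert.h>

static int live_handles;   /* outstanding handles in this sketch */

static void sketch_alloc(void)
{
    live_handles++;
}

static void sketch_free(void)
{
    assert(live_handles > 0);
    live_handles--;
}

/* Pretend operation: returns 0 on success or a negative error code. */
static int sketch_encrypt(int fail)
{
    return fail ? -5 : 0;
}

/* Leaky shape: the handle is only released when the operation fails. */
static int hash_leaky(int fail)
{
    int rc;

    sketch_alloc();
    rc = sketch_encrypt(fail);
    if (rc) {
        sketch_free();
        return rc;
    }
    return rc;             /* rc == 0: handle is never released */
}

/* Repaired shape: release the handle regardless of rc. */
static int hash_fixed(int fail)
{
    int rc;

    sketch_alloc();
    rc = sketch_encrypt(fail);
    sketch_free();
    return rc;
}
```

Calling hash_leaky(0) leaves live_handles at 1, while hash_fixed() returns the counter to 0 on both the success and the failure path.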
Reported-by: J. Bruce Fields
Signed-off-by: Jeff Layton
Reviewed-by: Shirish Pargaonkar
Signed-off-by: Steve French
24 Jun, 2011
7 commits
-
* 'for-linus' of git://git.kernel.dk/linux-block:
  block: add REQ_SECURE to REQ_COMMON_MASK
  block: use the passed in @bdev when claiming if partno is zero
  block: Add __attribute__((format(printf...) and fix fallout
  block: make disk_block_events() properly wait for work cancellation
  block: remove non-syncing __disk_block_events() and fold it into disk_block_events()
  block: don't use non-syncing event blocking in disk_check_events()
  cfq-iosched: fix locking around ioc->ioc_data assignment
-
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  cifs: fix wsize negotiation to respect max buffer size and active signing (try #4)
  CIFS: Fix problem with 3.0-rc1 null user mount failure
-
It was pointed out by 'make versioncheck' that some includes of
linux/version.h were not needed in fs/ (fs/btrfs/ctree.h and
fs/omfs/file.c).

This patch removes them.
Signed-off-by: Jesper Juhl
Acked-by: Bob Copeland
Signed-off-by: Linus Torvalds
-
If the attribute fork on an inode is in btree format and has
multiple levels (i.e. node format rather than leaf format), then a
lookup failure will trigger an assert failure in xfs_da_path_shift()
if the flag XFS_DA_OP_OKNOENT is not set. This flag is used to
indicate to the directory btree code that not finding an entry is
not a fatal error. In the case of doing a lookup for a directory
name removal, this is valid as a user cannot insert an arbitrary
name to remove from the directory btree.

However, in the case of the attribute tree, a user has direct
control over the attribute name and can ask for any random name to
be removed without any validation. In this case, fsstress is asking
for a non-existent user.selinux attribute to be removed, and that is
causing xfs_da_path_shift() to fall off the bottom of the tree where
it asserts that a lookup failure is allowed. Because the flag is not
set, we die a horrible death on a debug-enabled kernel.

Prevent this assert from firing on attribute removes by adding the
op_flag XFS_DA_OP_OKNOENT to attribute removal operations.

Discovered when testing on a SELinux enabled system by fsstress in
test 070 by trying to remove a non-existent user.selinux attribute.

Signed-off-by: Dave Chinner
Reviewed-by: Christoph Hellwig
Signed-off-by: Alex Elder
-
When an inode is truncated down, speculative preallocation is
removed from the inode. This should also reset the state bits for
controlling whether preallocation is subsequently removed when the
file is next closed. The flag is not being cleared, so repeated
operations on a file that first involve a truncate (e.g. multiple
repeated dd invocations on a file) give different file layouts for
the second and subsequent invocations.

Fix this by clearing the XFS_IDIRTY_RELEASE state bit when the
XFS_ITRUNCATED bit is detected in xfs_release() and hence ensure
that speculative delalloc is removed on files that have been
truncated down.

Signed-off-by: Dave Chinner
Reviewed-by: Christoph Hellwig
Signed-off-by: Alex Elder
-
XFS inodes have several per-lifetime state fields that determine the
behaviour of the inode. These state fields are not all reset when an
inode is reused from the reclaimable state.

This can lead to unexpected behaviour of the new inode such as
speculative preallocation not being truncated away in the expected
manner for local files until the inode is subsequently truncated,
freed or cycles out of the cache. It can also lead to an inode being
considered to be a filestream inode or having been truncated when
that is not the case.

Rework the reinitialisation of the inode when it is recycled to
ensure that it is pristine before it is reused. While there, also
fix the resetting of state flags in the recycling error paths so the
inode does not become unreclaimable.

Signed-off-by: Dave Chinner
Signed-off-by: Alex Elder
-
Hopefully last version. Base signing check on CAP_UNIX instead of
tcon->unix_ext, also clean up the comments a bit more.

According to Hongwei Sun's blog posting here:
http://blogs.msdn.com/b/openspecification/archive/2009/04/10/smb-maximum-transmit-buffer-size-and-performance-tuning.aspx
CAP_LARGE_WRITEX is ignored when signing is active. Also, the maximum
size for a write without CAP_LARGE_WRITEX should be the maxBuf that
the server sent in the NEGOTIATE response.

Fix the wsize negotiation to take this into account. While we're at it,
alter the other wsize definitions to use sizeof(WRITE_REQ) to allow for
slightly larger amounts of data to potentially be written per request.

Signed-off-by: Jeff Layton
Signed-off-by: Steve French
23 Jun, 2011
2 commits
-
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shaggy/jfs-2.6:
  jfs: agstart field must be 64 bits
  JFS: Don't save agno in the inode
  jfs: Update agstart when resizing volume
  jfs: old_agsize should be 64 bits in jfs_extendfs
-
Figured it out: it was broken by commit b946845a9dc523c759cae2b6a0f6827486c3221a ("cifs: cifs_parse_mount_options: do not tokenize mount options in-place"). So, as a quick fix, I suggest applying this patch.
[PATCH] CIFS: Fix kfree() with constant string in a null user case
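The bug class named in the patch title above can be sketched in userspace C, with free() standing in for kfree(). Everything here is hypothetical (struct sketch_vol and its helpers are not the CIFS structures), and the commit text does not show how the real patch resolves it; this is one common approach, assuming the rule that a pointer which cleanup will free must only ever hold NULL or heap memory, never a string literal.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical mount-options holder; username is owned and freed by cleanup. */
struct sketch_vol {
    char *username;   /* NULL or heap memory, never a string literal */
    int nullauth;     /* set for the "null user" case */
};

/*
 * Null-user case. Storing a literal such as "" here would make the later
 * free() undefined behaviour, so the sketch records the case with a flag
 * and leaves the pointer NULL (free(NULL) is a no-op).
 */
static void sketch_set_null_user(struct sketch_vol *v)
{
    free(v->username);
    v->username = NULL;
    v->nullauth = 1;
}

/* Named-user case: keep a private heap copy of the name. */
static int sketch_set_user(struct sketch_vol *v, const char *name)
{
    size_t len = strlen(name) + 1;
    char *copy = malloc(len);

    if (!copy)
        return -1;
    memcpy(copy, name, len);
    free(v->username);
    v->username = copy;
    v->nullauth = 0;
    return 0;
}

/* Cleanup is safe in both cases because of the ownership rule above. */
static void sketch_vol_cleanup(struct sketch_vol *v)
{
    free(v->username);
    v->username = NULL;
}
```

With this rule, the same cleanup routine can run after either setter without having to know which one was used.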
Signed-off-by: Pavel Shilovsky
Reviewed-by: Jeff Layton
Signed-off-by: Steve French
22 Jun, 2011
2 commits
-
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
  NFS: Fix decode_secinfo_maxsz
  NFSv4.1: Fix an off-by-one error in pnfs_generic_pg_test
  NFSv4.1: Fix some issues with pnfs_generic_pg_test
  NFSv4.1: file layout must consider pg_bsize for coalescing
  pnfs-obj: No longer needed to take an extra ref at add_device
  SUNRPC: Ensure the RPC client only quits on fatal signals
  NFSv4: Fix a readdir regression
  nfs4.1: mark layout as bad on error path in _pnfs_return_layout
  nfs4.1: prevent race that allowed use of freed layout in _pnfs_return_layout
  NFSv4.1: need to put_layout_hdr on _pnfs_return_layout error path
  NFS: (d)printks should use %zd for ssize_t arguments
  NFSv4.1: fix break condition in pnfs_find_lseg
  nfs4.1: fix several problems with _pnfs_return_layout
  NFSv4.1: allow zero fh array in filelayout decode layout
  NFSv4.1: allow nfs_fhget to succeed with mounted on fileid
  NFSv4.1: Fix a refcounting issue in the pNFS device id cache
  NFSv4.1: deprecate headerpadsz in CREATE_SESSION
  NFS41: do not update isize if inode needs layoutcommit
  NLM: Don't hang forever on NLM unlock requests
  NFS: fix umount of pnfs filesystems
-
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  jbd2: Fix oops in jbd2_journal_remove_journal_head()
  jbd2: Remove obsolete parameters in the comments for some jbd2 functions
  ext4: fixed tracepoints cleanup
  ext4: use FIEMAP_EXTENT_LAST flag for last extent in fiemap
  ext4: Fix max file size and logical block counting of extent format file
  ext4: correct comments for ext4_free_blocks()
21 Jun, 2011
2 commits
-
I initially did the calculation in bytes, and not words
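For context on the bytes-versus-words mix-up above: XDR encodes data in 4-byte units, so buffer size estimates for XDR decoding are counted in 32-bit words rather than bytes. A minimal sketch of the conversion, using a hypothetical helper name (sketch_quadlen) rather than any kernel macro; which fields the real fix recalculated is not stated in the commit text.

```c
#include <assert.h>
#include <stddef.h>

/* Number of 4-byte XDR words needed to hold `bytes` bytes, rounded up. */
static size_t sketch_quadlen(size_t bytes)
{
    return (bytes + 3) >> 2;
}
```

A 16-byte field therefore contributes 4 to a size counted in words; adding 16 instead would overstate the estimate by a factor of four.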
Signed-off-by: Bryan Schumaker
Signed-off-by: Trond Myklebust
-
And document what is going on there...
Signed-off-by: Trond Myklebust