14 Oct, 2009
1 commit
-
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
cciss: Add cciss_allow_hpsa module parameter
cciss: Fix multiple calls to pci_release_regions
blk-settings: fix function parameter kernel-doc notation
writeback: kill space in debugfs item name
writeback: account IO throttling wait as iowait
elv_iosched_store(): fix strstrip() misuse
cfq-iosched: avoid probable slice overrun when idling
cfq-iosched: apply bool value where we return 0/1
cfq-iosched: fix think time allowed for seekers
cfq-iosched: fix the slice residual sign
cfq-iosched: abstract out the 'may this cfqq dispatch' logic
block: use proper BLK_RW_ASYNC in blk_queue_start_tag()
block: Seperate read and write statistics of in_flight requests v2
block: get rid of kblock_schedule_delayed_work()
cfq-iosched: fix possible problem with jiffies wraparound
cfq-iosched: fix issue with rq-rq merging and fifo list ordering
13 Oct, 2009
2 commits
-
This avoids updating the superblock write time when we are mounting
the root file system read/only but we need to replay the journal. At
that point, for people who are east of GMT and who make their clock
tick in localtime for Windows bug-for-bug compatibility, the recorded
write time appears to be in the future, and this will
cause e2fsck to complain and force a full file system check.
Signed-off-by: "Theodore Ts'o"
Signed-off-by: Jan Kara
-
struct sockaddr_storage * can safely be used as struct sockaddr *.
Suppress an "incompatible pointer type" warning.
Signed-off-by: Stefan Richter
Signed-off-by: Trond Myklebust
Signed-off-by: Linus Torvalds
12 Oct, 2009
3 commits
-
An interestingly corrupted romfs file system exposed a problem with the
romfs_dev_strnlen function: it's passing the wrong value to its helpers.
Rather than limit the string to the length passed in by the callers, it
uses the size of the device as the limit.
Signed-off-by: Bernd Schmidt
Signed-off-by: Mike Frysinger
Signed-off-by: David Howells
Signed-off-by: Linus Torvalds
-
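The romfs_dev_strnlen description above comes down to which bound wins when scanning for a terminator. A minimal userspace sketch of that idea follows. The function name, its parameters and the flat in-memory "device" are all hypothetical stand-ins, not the romfs code:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for a device-backed string scan: `image` is the
 * whole device, `size` its length in bytes, `pos` where the string starts
 * and `maxlen` the limit requested by the caller.
 */
size_t bounded_dev_strnlen(const char *image, size_t size,
                           size_t pos, size_t maxlen)
{
    size_t limit;
    const char *nul;

    assert(image != NULL);
    if (pos >= size)
        return 0;

    /* Start from what the device can supply... */
    limit = size - pos;
    /* ...but never scan past what the caller asked for. */
    if (limit > maxlen)
        limit = maxlen;

    nul = memchr(image + pos, '\0', limit);
    return nul ? (size_t)(nul - (image + pos)) : limit;
}
```

Dropping the `limit > maxlen` clamp reproduces the kind of problem described: the scan would be bounded only by the device size, so a corrupted image with no terminator nearby could yield a length far larger than the caller expects.
-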
* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
Btrfs: fix file clone ioctl for bookend extents
Btrfs: fix uninit compiler warning in cow_file_range_nocow
Btrfs: constify dentry_operations
Btrfs: optimize back reference update during btrfs_drop_snapshot
Btrfs: remove negative dentry when deleting subvolumne
Btrfs: optimize fsync for the single writer case
Btrfs: async delalloc flushing under space pressure
Btrfs: release delalloc reservations on extent item insertion
Btrfs: delay clearing EXTENT_DELALLOC for compressed extents
Btrfs: cleanup extent_clear_unlock_delalloc flags
Btrfs: fix possible softlockup in the allocator
Btrfs: fix deadlock on async thread startup
-
Now that m68k's task_thread_info() doesn't refer to current,
it's possible to remove sched.h from interrupt.h and not break m68k!
Many thanks to Heiko Carstens for allowing this.
Signed-off-by: Alexey Dobriyan
10 Oct, 2009
2 commits
-
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ecryptfs/ecryptfs-2.6:
ima: ecryptfs fix imbalance message
eCryptfs: Remove Kconfig NET dependency and select MD5
ecryptfs: depends on CRYPTO
-
* 'for-linus' of git://oss.sgi.com/xfs/xfs:
xfs: stop calling filemap_fdatawait inside ->fsync
fix readahead calculations in xfs_dir2_leaf_getdents()
xfs: make sure xfs_sync_fsdata covers the log
xfs: mark inodes dirty before issuing I/O
xfs: cleanup ->sync_fs
xfs: fix xfs_quiesce_data
xfs: implement ->dirty_inode to fix timestamp handling
09 Oct, 2009
22 commits
-
The file clone ioctl was incorrectly taking the offset into the
extent on disk into account when calculating the length of the
cloned extent.
The length never changes based on the offset into the physical extent.
Test case:
fallocate -l 1g image
mke2fs image
bcp image image2
e2fsck -f image2
(errors on image2)
The math bug ends up wrapping the length of the extent, and things
go wrong from there.
Signed-off-by: Chris Mason
-
The extent_type variable could be used uninitialized via a goto. It should be
impossible to trigger because it is protected by a check on another
variable, but this makes sure.
Signed-off-by: Chris Mason
-
Signed-off-by: Chris Mason
-
This patch avoids reading level 0 tree blocks that already use full backrefs.
Signed-off-by: Yan Zheng
Signed-off-by: Chris Mason
-
btrfs_dentry_delete is used to remove dentries from the
dcache when deleting a subvolume. btrfs_dentry_delete ignores
negative dentries. This is incorrect since if we don't remove
the negative dentry, its parent dentry can't be removed.
Signed-off-by: Yan Zheng
Signed-off-by: Chris Mason
-
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
NFSv4: Kill nfs4_renewd_prepare_shutdown()
NFSv4: Fix the referral mount code
nfs: Avoid overrun when copying client IP address string
NFS: Fix port initialisation in nfs_remount()
NFS: Fix port and mountport display in /proc/self/mountinfo
NFS: Fix a default mount regression...
-
This patch optimizes the tree logging stuff so it doesn't always wait 1 jiffy
for new people to join the logging transaction if there is only ever 1 writer.
This helps a little bit with latency where we have something like RPM where it
will fdatasync every file it writes, and so waiting the 1 jiffy for every
fdatasync really starts to add up.
Signed-off-by: Josef Bacik
Signed-off-by: Chris Mason
-
This patch moves the delalloc flushing that occurs when we are under space
pressure off to an async thread pool. This helps since we only free up
metadata space when we actually insert the extent item, which means it takes
quite a while for space to be freed up if we wait on all ordered extents.
However, if space is freed up due to inline extents being inserted, we can
wake people who are waiting up early, and they can finish their work.
Signed-off-by: Josef Bacik
Signed-off-by: Chris Mason
-
This patch fixes an issue with the delalloc metadata space reservation
code. We used to free the reservation as soon as we
allocated the delalloc region. The problem with this is that if we are not
inserting an inline extent, we don't actually insert the extent item until
after the ordered extent is written out. This patch does 3 things:
1) It moves the reservation clearing stuff into the ordered code, so when
we remove the ordered extent we remove the reservation.
2) It adds an EXTENT_DO_ACCOUNTING flag that gets passed when we clear
delalloc bits in the cases where we want to clear the metadata reservation
when we clear the delalloc extent, in the case that we do an inline extent
or we invalidate the page.
3) It adds another waitqueue to the space info so that when we start a fs
wide delalloc flush, anybody else who also hits that area will simply wait
for the flush to finish and then try to make their allocation.
This has been tested thoroughly to make sure we did not regress on
performance.
Signed-off-by: Josef Bacik
Signed-off-by: Chris Mason
-
When compression is on, the cow_file_range code is farmed off to
worker threads. This allows us to do significant CPU work in parallel
on SMP machines.
But it is a delicate balance around when we clear flags and how. In
the past we cleared the delalloc flag immediately, which was safe
because the pages stayed locked.
But this is causing problems with the newest ENOSPC code, and with the
recent extent state cleanups we can now clear the delalloc bit at the
same time the uncompressed code does.
Signed-off-by: Chris Mason
-
extent_clear_unlock_delalloc has a growing set of ugly parameters
that is very difficult to read and maintain.
This switches to a flag field and well-named flag defines.
Signed-off-by: Chris Mason
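The move from positional parameters to a flag field can be illustrated with a small sketch. The flag names, the page_state struct and apply_ops() below are hypothetical, chosen only to show the pattern. They are not the defines or the signature used in the btrfs source:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag names, one bit per requested action. */
enum clear_op {
    OP_UNLOCK_PAGE   = 1 << 0,
    OP_CLEAR_DIRTY   = 1 << 1,
    OP_SET_WRITEBACK = 1 << 2,
    OP_END_WRITEBACK = 1 << 3,
};

/* Toy model of the per-page state the actions operate on. */
struct page_state {
    bool locked;
    bool dirty;
    bool writeback;
};

/*
 * With positional ints a call reads like apply(page, 1, 1, 1, 0), which
 * tells the reader nothing. With flags the call site names what it wants.
 */
struct page_state apply_ops(struct page_state p, unsigned long ops)
{
    /* Asking to both start and end writeback makes no sense. */
    assert(!((ops & OP_SET_WRITEBACK) && (ops & OP_END_WRITEBACK)));

    if (ops & OP_CLEAR_DIRTY)
        p.dirty = false;
    if (ops & OP_SET_WRITEBACK)
        p.writeback = true;
    if (ops & OP_END_WRITEBACK)
        p.writeback = false;
    if (ops & OP_UNLOCK_PAGE)
        p.locked = false;
    return p;
}
```

A call such as `apply_ops(state, OP_UNLOCK_PAGE | OP_CLEAR_DIRTY)` documents itself, and adding a new action later means adding one bit rather than changing every caller's argument list.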
-
Now that the VFS actually waits for the data I/O to complete before
calling into ->fsync we can stop doing it ourselves.
Signed-off-by: Christoph Hellwig
Reviewed-by: Alex Elder
Signed-off-by: Alex Elder
-
This is for bug #850,
http://oss.sgi.com/bugzilla/show_bug.cgi?id=850
XFS file system segfaults , repeatedly and 100% reproducable in 2.6.30 , 2.6.31
The above only showed up on a CONFIG_XFS_DEBUG=y kernel, because
xfs_bmapi() ASSERTs that it has been asked for at least one map,
and it was getting 0.
The root cause is that our guesstimated "bufsize" from xfs_file_readdir
was fairly small, and the
    bufsize -= length;
in the loop was going negative - except bufsize is a size_t, so it
was wrapping to a very large number.
Then when we did
    ra_want = howmany(bufsize + mp->m_dirblksize,
                      mp->m_sb.sb_blocksize) - 1;
with that very large number, the (int) ra_want was coming out
negative, and a subsequent compare:
    if (1 + ra_want > map_blocks ...
was coming out -true- (negative int compare w/ uint) and we went
back to xfs_bmapi() for more, even though we did not need more,
and asked for 0 maps, and hit the ASSERT.
We have kind of a type mess here, but just keeping bufsize from
going negative is probably sufficient to avoid the problem.
Signed-off-by: Eric Sandeen
Reviewed-by: Christoph Hellwig
Reviewed-by: Alex Elder
Signed-off-by: Alex Elder
-
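The two type pitfalls in the xfs_dir2_leaf_getdents() description above (a size_t wrapping below zero, and a negative int compared against an unsigned value) can be reproduced in isolation. The following is a minimal userspace sketch. The helper names are hypothetical, and the clamping shown is only one way of "keeping bufsize from going negative", not necessarily the exact change that was committed:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Buggy pattern: size_t is unsigned, so going "below zero" wraps around. */
size_t consume_unchecked(size_t bufsize, size_t length)
{
    return bufsize - length;
}

/* Guarded pattern: clamp at zero so later arithmetic stays sane. */
size_t consume_clamped(size_t bufsize, size_t length)
{
    return length > bufsize ? 0 : bufsize - length;
}

/*
 * Models "1 + ra_want > map_blocks" with an int on the left and an
 * unsigned int on the right. C converts the int to unsigned before the
 * compare; the cast below only makes that implicit step visible.
 */
int wants_more_unsafe(int ra_want, unsigned int map_blocks)
{
    assert(ra_want < INT_MAX);
    return (unsigned int)(1 + ra_want) > map_blocks;
}

/* Checked variant: a negative request never asks for more blocks. */
int wants_more_checked(int ra_want, unsigned int map_blocks)
{
    assert(ra_want < INT_MAX);
    return ra_want >= 0 && (unsigned int)(1 + ra_want) > map_blocks;
}
```

With a 16-byte budget and a 32-byte entry, `consume_unchecked()` returns a value near SIZE_MAX while `consume_clamped()` returns 0. Likewise `wants_more_unsafe(-5, 4)` is true because -4 becomes a huge unsigned number, which matches the "-true-" compare described in the message.
-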
We want to always cover the log after writing out the superblock, and
in case of a synchronous writeout make sure we actually wait for the
log to be covered. That way a filesystem that has been sync()ed can
be considered clean by log recovery.
Signed-off-by: Dave Chinner
Signed-off-by: Christoph Hellwig
Reviewed-by: Eric Sandeen
Reviewed-by: Alex Elder
Signed-off-by: Alex Elder
-
To make sure they get properly waited on in sync when I/O is in flight and
we later need to update the inode size. Requires a new helper to check if an
ioend structure is beyond the current EOF.
Signed-off-by: Dave Chinner
Signed-off-by: Christoph Hellwig
Reviewed-by: Alex Elder
Signed-off-by: Alex Elder
-
Sort out ->sync_fs to not perform a superblock writeback for the wait = 0 case
as that is just an optional first pass and the superblock will be written back
properly in the next call with wait = 1. Instead perform an opportunistic
quota writeback to have less work later. Also remove the freeze special case
as we do a proper wait = 1 call in the freeze code anyway.
Also rename the function to xfs_fs_sync_fs to match the normal naming
convention, update comments and avoid calling into the laptop_mode logic on
an error.
Signed-off-by: Christoph Hellwig
Reviewed-by: Alex Elder
Signed-off-by: Alex Elder
-
We need to do a synchronous xfs_sync_fsdata to make sure the superblock
actually is on disk when we return.
Also remove SYNC_BDFLUSH flag to xfs_sync_inodes because that particular
flag is never checked.
Move xfs_filestream_flush call later to only release inodes after they
have been written out.
Signed-off-by: Dave Chinner
Signed-off-by: Christoph Hellwig
Reviewed-by: Alex Elder
Signed-off-by: Alex Elder
-
This is picking up on Felix's repost of Dave's patch to implement a
.dirty_inode method. We really need this notification because
the VFS keeps writing directly into the inode structure instead
of going through methods to update this state. In addition to
the long-known atime issue we now also have a caller in VM code
that updates c/mtime that way for shared writeable mmaps. And
I found another one that no one has noticed in practice in the FIFO
code.
So implement ->dirty_inode to set i_update_core whenever the
inode gets externally dirtied, and switch the c/mtime handling to
the same scheme we already use for atime (always picking up
the value from the Linux inode).
Note that this patch also removes the xfs_synchronize_atime call
in xfs_reclaim; it was superfluous as we already synchronize the time
when writing the inode via the log (xfs_inode_item_format) or the
normal buffers (xfs_iflush_int).
In addition also remove the I_CLEAR check before copying the Linux
timestamps - now that we always have the Linux inode available
we can always use the timestamps in it.
Also switch to just using file_update_time for regular reads/writes -
that will get us all optimization done to it for free and make
sure we notice early when it breaks.
Signed-off-by: Christoph Hellwig
Reviewed-by: Felix Blyakher
Reviewed-by: Alex Elder
Signed-off-by: Alex Elder
-
The unencrypted files are being measured. Update the counters to get
rid of the ecryptfs imbalance message. (http://bugzilla.redhat.com/519737)
Reported-by: Sachin Garg
Cc: Eric Paris
Cc: Dustin Kirkland
Cc: James Morris
Cc: David Safford
Cc: stable@kernel.org
Signed-off-by: Mimi Zohar
Signed-off-by: Tyler Hicks
-
eCryptfs no longer uses a netlink interface to communicate with
ecryptfsd, so NET is not a valid dependency anymore.
MD5 is required and must be built for eCryptfs to be of any use.
Signed-off-by: Tyler Hicks
-
ecryptfs uses crypto APIs so it should depend on CRYPTO.
Otherwise many build errors occur. [63 lines not pasted]
Signed-off-by: Randy Dunlap
Cc: Andrew Morton
Cc: ecryptfs-devel@lists.launchpad.net
Signed-off-by: Tyler Hicks
08 Oct, 2009
3 commits
-
The NFSv4 renew daemon is shared between all active super blocks that refer
to a particular NFS server, so it is wrong to be shutting it down in
nfs4_kill_super every time a super block is destroyed.
This patch therefore kills nfs4_renewd_prepare_shutdown altogether, and
leaves it up to nfs4_shutdown_client() to also shut down the renew daemon
by means of the existing call to nfs4_kill_renewd().
Signed-off-by: Trond Myklebust
-
This flag indicates a hardware detected memory corruption on the page.
Any future access of the page data may bring down the machine.
Signed-off-by: Wu Fengguang
Cc: Andi Kleen
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
fix the following 'make includecheck' warning:
fs/proc/kcore.c: linux/mm.h is included more than once.
Signed-off-by: Jaswinder Singh Rajput
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
07 Oct, 2009
6 commits
-
Fix a typo which causes try_location() to use the wrong length argument
when calling nfs_parse_server_name(). This, in turn, causes the initialisation
of the mount's sockaddr structure to fail.
Also ensure that if nfs4_pathname_string() returns an error, then we pass
that error back up the stack instead of ENOENT.
Signed-off-by: Trond Myklebust
-
As seen in , nfs4_init_client() can
overrun the source string when copying the client IP address from
nfs_parsed_mount_data::client_address to nfs_client::cl_ipaddr. Since
these are both treated as null-terminated strings elsewhere, the copy
should be done with strlcpy() not memcpy().
Signed-off-by: Ben Hutchings
Signed-off-by: Trond Myklebust
-
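The difference between the two copy styles in the client IP address fix above is which length drives the read. A fixed-size memcpy() reads as many bytes as the destination holds, regardless of how short the source string is, while an strlcpy()-style copy stops at the source's terminator. A minimal userspace sketch follows. copy_cstring() is a hypothetical stand-in written for illustration, not the kernel's strlcpy() implementation:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * strlcpy()-style copy: reads at most strlen(src) + 1 bytes from src,
 * writes at most dstsize bytes to dst, and always NUL-terminates when
 * dstsize is non-zero. Returns strlen(src) so callers can detect
 * truncation by comparing the result against dstsize.
 */
size_t copy_cstring(char *dst, const char *src, size_t dstsize)
{
    size_t srclen;

    assert(dst != NULL && src != NULL);
    srclen = strlen(src);

    if (dstsize != 0) {
        size_t n = srclen < dstsize - 1 ? srclen : dstsize - 1;

        memcpy(dst, src, n);    /* never reads past the source string */
        dst[n] = '\0';
    }
    return srclen;
}
```

By contrast, `memcpy(dst, src, sizeof(dst_buffer))` with a short source such as "::1" would read past the end of the source allocation, which is the kind of overrun the message describes.
-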
The recent changeset 53a0b9c4c99ab0085a06421f71592722e5b3fd5f (NFS: Replace
nfs_parse_ip_address() with rpc_pton()) broke nfs_remount, since the call
to rpc_pton() will zero out the port number in data->nfs_server.address.
This is actually due to a bug in nfs_remount: it should be looking at the
port number in nfs_server.port instead...
This fixes bug
http://bugzilla.kernel.org/show_bug.cgi?id=14276
Signed-off-by: Trond Myklebust
-
Currently, the port and mount port will both display as 65535 if you do not
specify a port number. That would be wrong...
Signed-off-by: Trond Myklebust
-
With the recent spate of changes, the nfs protocol version will now default
to 2 instead of 3, while the mount protocol version defaults to 3.
The following patch should ensure the defaults are consistent with the
previous defaults of vers=3,proto=tcp,mountvers=3,mountproto=tcp.
This fixes the bug
http://bugzilla.kernel.org/show_bug.cgi?id=14259
Signed-off-by: Trond Myklebust
-
Commit a9327cac440be4d8333bba975cbbf76045096275 added separate read
and write statistics of in_flight requests, and exported the number
of read and write requests in progress separately through sysfs.
But Corrado Zoccolo reported getting strange
output from "iostat -kx 2". Global values for service time and
utilization were garbage. For interval values, utilization was always
100%, and service time was higher than normal.
So this was reverted by commit 0f78ab9899e9d6acb09d5465def618704255963b.
The problem was in part_round_stats_single(), I missed the following:
 	if (now == part->stamp)
 		return;
-	if (part->in_flight) {
+	if (part_in_flight(part)) {
 		__part_stat_add(cpu, part, time_in_queue,
 				part_in_flight(part) * (now - part->stamp));
 		__part_stat_add(cpu, part, io_ticks, (now - part->stamp));
With this chunk included, the reported regression gets fixed.
Signed-off-by: Nikanth Karthikesan
--
Signed-off-by: Jens Axboe
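The missed hunk above is easier to follow with a toy model of the accounting. The sketch below assumes the split counters are stored as a two-element array (reads and writes). The struct layout and the round_stats() name are hypothetical simplifications, not the kernel's hd_struct or part_round_stats_single(). Under that assumption, a bare `if (part->in_flight)` would test the array itself, which is always true, and that would be consistent with the reported 100% utilization:

```c
#include <assert.h>
#include <stddef.h>

/* Toy per-partition statistics; field names mirror the quoted diff. */
struct part_stats {
    unsigned int in_flight[2];      /* [0] reads, [1] writes (assumed layout) */
    unsigned long stamp;            /* time of the last update */
    unsigned long time_in_queue;
    unsigned long io_ticks;
};

/* Total requests in progress, summed over reads and writes. */
unsigned int part_in_flight(const struct part_stats *p)
{
    return p->in_flight[0] + p->in_flight[1];
}

/* Account elapsed time only while at least one request is in flight. */
void round_stats(struct part_stats *p, unsigned long now)
{
    assert(p != NULL);
    if (now == p->stamp)
        return;

    if (part_in_flight(p)) {
        p->time_in_queue += part_in_flight(p) * (now - p->stamp);
        p->io_ticks += now - p->stamp;
    }
    p->stamp = now;
}
```

With no requests in flight, a call to `round_stats()` only advances the stamp. With two writes in flight for five ticks, io_ticks grows by 5 and time_in_queue by 10.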
06 Oct, 2009
1 commit
-
Like the cluster allocating stuff, we can lock up the box with the normal
allocation path. This happens when we
1) Start to cache a block group that is severely fragmented, but has a decent
amount of free space.
2) Start to commit a transaction
3) Have the commit try and empty out some of the delalloc inodes with extents
that are relatively large.
The inodes will not be able to make the allocations because they will ask for
allocations larger than a contiguous area in the free space cache. So we will
wait for more progress to be made on the block group, but since we're in a
commit the caching kthread won't make any more progress and it already has
enough free space that wait_block_group_cache_progress will just return. So,
if we wait and fail to make the allocation the next time around, just loop and
go to the next block group. This keeps us from getting stuck in a softlockup.
Thanks,
Signed-off-by: Josef Bacik
Signed-off-by: Chris Mason