08 Feb, 2013

1 commit

  • Pull btrfs fixes from Chris Mason:
    "We've got corner cases for updating i_size that ceph was hitting,
    error handling for quotas when we run out of space, a very subtle
    snapshot deletion race, a crash while removing devices, and one
    deadlock between subvolume creation and the sb_internal code (thanks
    lockdep)."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
    Btrfs: move d_instantiate outside the transaction during mksubvol
    Btrfs: fix EDQUOT handling in btrfs_delalloc_reserve_metadata
    Btrfs: fix possible stale data exposure
    Btrfs: fix missing i_size update
    Btrfs: fix race between snapshot deletion and getting inode
    Btrfs: fix missing release of the space/qgroup reservation in start_transaction()
    Btrfs: fix wrong sync_writers decrement in btrfs_file_aio_write()
    Btrfs: do not merge logged extents if we've removed them from the tree
    btrfs: don't try to notify udev about missing devices

    Linus Torvalds
     

07 Feb, 2013

1 commit

  • Dave Sterba triggered a lockdep complaint about lock ordering
    between the sb_internal lock and the cleaner semaphore.

    btrfs_lookup_dentry() checks for orphans if we're looking up
    the inode for a subvolume, and subvolume creation is triggering
    the lookup with a transaction running.

    This commit moves the d_instantiate after the transaction closes.

    Signed-off-by: Chris Mason

    Chris Mason
     

06 Feb, 2013

8 commits

  • When btrfs_qgroup_reserve returned a failure, we were missing a counter
    operation for BTRFS_I(inode)->outstanding_extents++, leading to warning
    messages about outstanding extents and space_info->bytes_may_use != 0.
    Additionally, the error handling code didn't take into account that we
    dropped the inode lock which might require more cleanup.

    Luckily, all the cleanup code we need is already there and can be shared
    with reserve_metadata_bytes, which is exactly what this patch does.

    Reported-by: Lev Vainblat
    Signed-off-by: Jan Schmidt
    Signed-off-by: Chris Mason

    Jan Schmidt
     
  • Chris Mason
     
  • We specifically do not update the disk i_size if there are ordered extents
    outstanding for any area between the current disk_i_size and our ordered
    extent so that we do not expose stale data. The problem is the check we
    have only checks if the ordered extent starts at or after the current
    disk_i_size, which doesn't take into account an ordered extent that starts
    before the current disk_i_size and ends past the disk_i_size. Fix this by
    checking if the extent ends past the disk_i_size. Thanks,

    Signed-off-by: Josef Bacik

    Josef Bacik
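
    A minimal sketch of the corrected range check (illustrative types and
    field names modeled on the 3.8-era btrfs_ordered_extent, not the actual
    patch):

    #include <linux/types.h>

    struct ordered_extent_sketch {
            u64 file_offset;    /* logical start of the ordered extent */
            u64 len;            /* length of the ordered extent */
    };

    /*
     * An ordered extent must block the disk_i_size update if any part of it
     * lies beyond disk_i_size, i.e. if it ends past disk_i_size, not only if
     * it starts at or after it.
     */
    static bool blocks_disk_i_size_update(const struct ordered_extent_sketch *oe,
                                          u64 disk_i_size)
    {
            return oe->file_offset + oe->len > disk_i_size;
    }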
     
  • If we have an ordered extent before the ordered extent we are currently
    completing that is after the current disk_i_size we will put our i_size
    update into that ordered extent so that we do not expose stale data. The
    problem is that if our disk i_size is updated past the previous ordered
    extent we won't update the i_size with the pending i_size update. So check
    the pending i_size update, and if it's above the current disk i_size, we need
    to go ahead and try to update. Thanks,

    Signed-off-by: Josef Bacik

    Josef Bacik
     
  • While running a snapshot test script created by Mitch and David,
    the race between autodefrag and snapshot deletion can lead to
    corruption of the dead_roots list, so that we can get a crash in
    btrfs_clean_old_snapshots().

    Besides autodefrag, scrub also does the same thing, i.e. read the
    root first and then get the inode.

    Here is the story (taking autodefrag as an example):
    (1) when we delete a snapshot or subvolume, it will set its root's
    refs to zero and do an iput() on its own inode, and if this inode happens
    to be the only active in-memory one in the root's inode rbtree, it will add
    itself to the global dead_roots list for later cleanup.

    (2) after (1), the autodefrag thread may read another inode for defrag,
    and that inode happens to be in the deleted snapshot/subvolume, but all of
    this is done without checking whether the root is still valid (refs > 0).
    So the end result is adding the deleted snapshot/subvolume's root to the
    global dead_roots list AGAIN.

    Fortunately, we already have an srcu lock to avoid the race, i.e. subvol_srcu.

    So all we need to do is to take the lock to protect 'read root and get inode',
    since we synchronize to wait for the rcu grace period before adding something
    to the global dead_roots list.

    Reported-by: Mitch Harder
    Signed-off-by: Liu Bo
    Signed-off-by: Josef Bacik

    Liu Bo
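
    A minimal sketch of the srcu usage described above; apart from the srcu
    primitives and subvol_srcu, the helper names are illustrative:

    #include <linux/srcu.h>

    /*
     * Hold subvol_srcu across "read root and get inode": snapshot deletion
     * waits for the srcu grace period before adding the root to the global
     * dead_roots list, so it cannot race with this section.
     */
    static struct inode *defrag_lookup_inode_sketch(struct btrfs_fs_info *fs_info,
                                                    u64 root_objectid, u64 ino)
    {
            struct inode *inode = NULL;
            struct btrfs_root *root;
            int idx;

            idx = srcu_read_lock(&fs_info->subvol_srcu);
            root = read_root_checked(fs_info, root_objectid); /* illustrative: verifies refs > 0 */
            if (!IS_ERR(root))
                    inode = lookup_inode_in_root(root, ino);  /* illustrative helper */
            srcu_read_unlock(&fs_info->subvol_srcu, idx);

            return inode;
    }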
     
  • When we fail to start a transaction, we need to release the reserved free space
    and qgroup space. Fix it.

    Signed-off-by: Miao Xie
    Reviewed-by: Jan Schmidt
    Signed-off-by: Josef Bacik

    Miao Xie
     
  • If the checks at the beginning of btrfs_file_aio_write() fail, we needn't
    decrease ->sync_writers, because we have not increased it. Fix it.

    Signed-off-by: Miao Xie
    Signed-off-by: Josef Bacik

    Miao Xie
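
    A minimal sketch of the pairing rule described above (illustrative types
    and helpers, not the actual btrfs code):

    #include <linux/atomic.h>
    #include <linux/types.h>

    struct write_ctx_sketch {
            atomic_t sync_writers;
    };

    static ssize_t file_aio_write_sketch(struct write_ctx_sketch *ctx)
    {
            ssize_t err;

            err = early_write_checks(ctx);      /* illustrative helper */
            if (err)
                    return err;                 /* counter untouched on this path */

            atomic_inc(&ctx->sync_writers);
            err = do_the_write(ctx);            /* illustrative helper */
            atomic_dec(&ctx->sync_writers);     /* matches the increment above */

            return err;
    }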
     
  • You can run into a problem where, if somebody is fsyncing and writing out
    the existing extents, we will have removed the extent map from the em tree,
    but it's still valid for the current fsync, so we go ahead and write it. The
    problem is we unconditionally try to merge it back into the em tree; if
    we've removed it from the em tree, that will cause use-after-free problems.
    Fix this to only merge if we are still a part of the tree. Thanks,

    Signed-off-by: Josef Bacik

    Josef Bacik
     

05 Feb, 2013

3 commits

  • Pull dlm fix from David Teigland:
    "Thanks to Jana who reported the problem and was able to test this fix
    so quickly."

    This fixes an incorrect size check that triggered for CONFIG_COMPAT
    whether the code was actually doing compat or not. The incorrect write
    size check broke userland (clvmd) when maximum resource name lengths are
    used.

    * 'fix-max-write' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
    dlm: check the write size from user

    Linus Torvalds
     
  • There exists a situation in which GC can work alone in the background,
    without any other filesystem activity, for a significant time.

    The nilfs_clean_segments() method calls nilfs_segctor_construct(), which
    updates the superblocks only in the case where the NILFS_SC_SUPER_ROOT and
    THE_NILFS_DISCONTINUED flags are set. But when GC is working alone,
    nilfs_clean_segments() is called with the THE_NILFS_DISCONTINUED flag
    unset. As a result, the superblocks are not updated during all this time,
    and in the case of SPOR the superblocks keep very old values for the last
    super root placement.

    SYMPTOMS:

    Trying to mount a NILFS2 volume after SPOR in such an environment results
    in a very long mount time (it can take several hours in some cases).

    REPRODUCING PATH:

    1. Use an external USB HDD, disable automount, and do not perform any
    additional filesystem activity on the NILFS2 volume.

    2. Generate a temporary file with a size of about 100 - 500 GB (for example,
    dd if=/dev/zero of=<file_name> bs=1073741824 count=200). The size of the
    file defines the duration of the GC run.

    3. Then delete the file.

    4. Start GC manually by means of the command "nilfs-clean -p 0". When you
    start GC this way, the superblocks are updated only once, at the end. So,
    to simulate SPOR, wait some time (15 - 40 minutes) and simply switch off
    the USB HDD manually.

    5. Switch the USB HDD on again and try to mount the NILFS2 volume. As a
    result, mounting the NILFS2 volume will take a very long time.

    REPRODUCIBILITY: 100%

    FIX:

    This patch adds a check of whether the superblocks need to be updated, and
    sets the THE_NILFS_DISCONTINUED flag before the nilfs_clean_segments() call.

    Reported-by: Sergey Alexandrov
    Signed-off-by: Vyacheslav Dubeyko
    Tested-by: Vyacheslav Dubeyko
    Acked-by: Ryusuke Konishi
    Tested-by: Ryusuke Konishi
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vyacheslav Dubeyko
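
    A minimal sketch of the fix described above, as a fragment of the GC
    ioctl path (helper names follow nilfs2 conventions but this is meant as an
    illustration, not the exact patch):

    /*
     * Before the GC's segment construction runs, make sure stale superblocks
     * get THE_NILFS_DISCONTINUED set, so that nilfs_segctor_construct()
     * actually writes the superblocks out.
     */
    if (nilfs_sb_need_update(nilfs))
            set_nilfs_discontinued(nilfs);

    ret = nilfs_clean_segments(inode->i_sb, argv, kbufs);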
     
  • Return EINVAL from write if the size is larger than
    allowed. Do this before allocating kernel memory for
    the bogus size, which could lead to OOM.

    Reported-by: Sasha Levin
    Tested-by: Jana Saout
    Signed-off-by: David Teigland

    David Teigland
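
    A minimal sketch of the check described above; the size limit and the
    handler name are illustrative, not the actual dlm code:

    #include <linux/fs.h>
    #include <linux/slab.h>
    #include <linux/uaccess.h>

    #define WRITE_SIZE_LIMIT_SKETCH 4096    /* illustrative upper bound */

    static ssize_t device_write_sketch(struct file *file, const char __user *buf,
                                       size_t count, loff_t *ppos)
    {
            void *kbuf;

            /* Reject oversized writes before allocating a caller-sized buffer. */
            if (count > WRITE_SIZE_LIMIT_SKETCH)
                    return -EINVAL;

            kbuf = kmalloc(count, GFP_KERNEL);
            if (!kbuf)
                    return -ENOMEM;

            if (copy_from_user(kbuf, buf, count)) {
                    kfree(kbuf);
                    return -EFAULT;
            }

            /* ... process the request ... */

            kfree(kbuf);
            return count;
    }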
     

02 Feb, 2013

1 commit


01 Feb, 2013

1 commit

  • Pull NFS client bugfixes from Trond Myklebust:

    - Error reporting in nfs_xdev_mount incorrectly maps all errors to
    ENOMEM

    - Fix an NFSv4 refcounting issue

    - Fix a mount failure when the server reboots during NFSv4 trunking
    discovery

    - NFSv4.1 mounts may need to run the lease recovery thread.

    - Don't silently fail setattr() requests on mountpoints

    - Fix a SUNRPC socket/transport livelock and priority queue issue

    - We must handle NFS4ERR_DELAY when resetting the NFSv4.1 session.

    * tag 'nfs-for-3.8-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
    NFSv4.1: Handle NFS4ERR_DELAY when resetting the NFSv4.1 session
    SUNRPC: When changing the queue priority, ensure that we change the owner
    NFS: Don't silently fail setattr() requests on mountpoints
    NFSv4.1: Ensure that nfs41_walk_client_list() does start lease recovery
    NFSv4: Fix NFSv4 trunking discovery
    NFSv4: Fix NFSv4 reference counting for trunked sessions
    NFS: Fix error reporting in nfs_xdev_mount

    Linus Torvalds
     

31 Jan, 2013

2 commits

  • NFS4ERR_DELAY is a legal reply when we call DESTROY_SESSION. It
    usually means that the server is busy handling an unfinished RPC
    request. Just sleep for a second and then retry.
    We also need to be able to handle the NFS4ERR_BACK_CHAN_BUSY return
    value. If the NFS server has outstanding callbacks, we just want to
    similarly sleep & retry.

    Signed-off-by: Trond Myklebust
    Cc: stable@vger.kernel.org

    Trond Myklebust
     
  • Ensure that any setattr and getattr requests for junctions and/or
    mountpoints are sent to the server. Ever since commit
    0ec26fd0698 (vfs: automount should ignore LOOKUP_FOLLOW), we have
    silently dropped any setattr requests to a server-side mountpoint.
    For referrals, we have silently dropped both getattr and setattr
    requests.

    This patch restores the original behaviour for setattr on mountpoints,
    and tries to do the same for referrals, provided that we have a
    filehandle...

    Signed-off-by: Trond Myklebust
    Cc: stable@vger.kernel.org

    Trond Myklebust
     

30 Jan, 2013

1 commit

  • Pull xfs bugfixes from Ben Myers:
    "Here are fixes for returning EFSCORRUPTED on probe of a non-xfs
    filesystem, the stack switch in xfs_bmapi_allocate, a crash in
    _xfs_buf_find, speculative preallocation as the filesystem nears
    ENOSPC, an unmount hang, a race with AIO, and a regression with
    xfs_fsr:

    - fix return value when filesystem probe finds no XFS magic, a
    regression introduced in 9802182.

    - fix stack switch in __xfs_bmapi_allocate by moving the check for
    stack switch up into xfs_bmapi_write.

    - fix oops in _xfs_buf_find by validating that the requested block is
    within the filesystem bounds.

    - limit speculative preallocation near ENOSPC.

    - fix an unmount hang in xfs_wait_buftarg by freeing the
    xfs_buf_log_item in xfs_buf_item_unlock.

    - fix a possible use after free with AIO.

    - fix xfs_swap_extents after removal of xfs_flushinval_pages, a
    regression introduced in commit fb59581404a."

    * tag 'for-linus-v3.8-rc6' of git://oss.sgi.com/xfs/xfs:
    xfs: Fix xfs_swap_extents() after removal of xfs_flushinval_pages()
    xfs: Fix possible use-after-free with AIO
    xfs: fix shutdown hang on invalid inode during create
    xfs: limit speculative prealloc near ENOSPC thresholds
    xfs: fix _xfs_buf_find oops on blocks beyond the filesystem end
    xfs: pull up stack_switch check into xfs_bmapi_write
    xfs: Do not return EFSCORRUPTED when filesystem probe finds no XFS magic

    Linus Torvalds
     

29 Jan, 2013

7 commits

  • Commit fb59581404ab7ec5075299065c22cb211a9262a9 removed
    xfs_flushinval_pages() and changed its callers to use
    filemap_write_and_wait() and truncate_pagecache_range() directly.

    But in xfs_swap_extents() this change accidentally switched the argument
    from 'tip' to 'ip'. This patch switches it back to 'tip'.

    Signed-off-by: Torsten Kaiser
    Reviewed-by: Ben Myers
    Signed-off-by: Ben Myers

    Torsten Kaiser
     
  • Running AIO pins the inode in memory via the file reference. Once AIO
    is completed using aio_complete(), the file reference is put and the inode
    can be freed from memory. So we have to be sure that calling aio_complete()
    is the last thing we do with the inode.

    CC: xfs@oss.sgi.com
    CC: Ben Myers
    CC: stable@vger.kernel.org
    Signed-off-by: Jan Kara
    Reviewed-by: Ben Myers
    Signed-off-by: Ben Myers

    Jan Kara
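
    A minimal sketch of the ordering rule described above (illustrative
    helper, not the actual xfs completion code):

    /*
     * aio_complete() drops the file reference that pins the inode, so it
     * must be the last thing in the completion path that touches anything
     * reachable from the inode.
     */
    static void dio_end_io_sketch(struct kiocb *iocb, struct inode *inode,
                                  ssize_t ret)
    {
            finish_inode_io_sketch(inode, ret); /* illustrative: all inode work first */

            aio_complete(iocb, ret, 0);         /* inode may be freed after this call */
    }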
     
  • When the new inode verifier in xfs_iread() fails, the create
    transaction is aborted and a shutdown occurs. The subsequent unmount
    then hangs in xfs_wait_buftarg() on a buffer that has an elevated
    hold count. Debug showed that it was an AGI buffer getting stuck:

    [ 22.576147] XFS (vdb): buffer 0x2/0x1, hold 0x2 stuck
    [ 22.976213] XFS (vdb): buffer 0x2/0x1, hold 0x2 stuck
    [ 23.376206] XFS (vdb): buffer 0x2/0x1, hold 0x2 stuck
    [ 23.776325] XFS (vdb): buffer 0x2/0x1, hold 0x2 stuck

    The trace of this buffer leading up to the shutdown (trimmed for
    brevity) looks like:

    xfs_buf_init: bno 0x2 nblks 0x1 hold 1 caller xfs_buf_get_map
    xfs_buf_get: bno 0x2 len 0x200 hold 1 caller xfs_buf_read_map
    xfs_buf_read: bno 0x2 len 0x200 hold 1 caller xfs_trans_read_buf_map
    xfs_buf_iorequest: bno 0x2 nblks 0x1 hold 1 caller _xfs_buf_read
    xfs_buf_hold: bno 0x2 nblks 0x1 hold 1 caller xfs_buf_iorequest
    xfs_buf_rele: bno 0x2 nblks 0x1 hold 2 caller xfs_buf_iorequest
    xfs_buf_iowait: bno 0x2 nblks 0x1 hold 1 caller _xfs_buf_read
    xfs_buf_ioerror: bno 0x2 len 0x200 hold 1 caller xfs_buf_bio_end_io
    xfs_buf_iodone: bno 0x2 nblks 0x1 hold 1 caller _xfs_buf_ioend
    xfs_buf_iowait_done: bno 0x2 nblks 0x1 hold 1 caller _xfs_buf_read
    xfs_buf_hold: bno 0x2 nblks 0x1 hold 1 caller xfs_buf_item_init
    xfs_trans_read_buf: bno 0x2 len 0x200 hold 2 recur 0 refcount 1
    xfs_trans_brelse: bno 0x2 len 0x200 hold 2 recur 0 refcount 1
    xfs_buf_item_relse: bno 0x2 nblks 0x1 hold 2 caller xfs_trans_brelse
    xfs_buf_rele: bno 0x2 nblks 0x1 hold 2 caller xfs_buf_item_relse
    xfs_buf_unlock: bno 0x2 nblks 0x1 hold 1 caller xfs_trans_brelse
    xfs_buf_rele: bno 0x2 nblks 0x1 hold 1 caller xfs_trans_brelse
    xfs_buf_trylock: bno 0x2 nblks 0x1 hold 2 caller _xfs_buf_find
    xfs_buf_find: bno 0x2 len 0x200 hold 2 caller xfs_buf_get_map
    xfs_buf_get: bno 0x2 len 0x200 hold 2 caller xfs_buf_read_map
    xfs_buf_read: bno 0x2 len 0x200 hold 2 caller xfs_trans_read_buf_map
    xfs_buf_hold: bno 0x2 nblks 0x1 hold 2 caller xfs_buf_item_init
    xfs_trans_read_buf: bno 0x2 len 0x200 hold 3 recur 0 refcount 1
    xfs_trans_log_buf: bno 0x2 len 0x200 hold 3 recur 0 refcount 1
    xfs_buf_item_unlock: bno 0x2 len 0x200 hold 3 flags DIRTY liflags ABORTED
    xfs_buf_unlock: bno 0x2 nblks 0x1 hold 3 caller xfs_buf_item_unlock
    xfs_buf_rele: bno 0x2 nblks 0x1 hold 3 caller xfs_buf_item_unlock

    And that is the AGI buffer from cold cache read into memory to
    transaction abort. You can see at transaction abort the bli is dirty
    and only has a single reference. The item is not pinned, and it's
    not in the AIL. Hence the only reference to it is this transaction.

    The problem is that the xfs_buf_item_unlock() call is dropping the
    last reference to the xfs_buf_log_item attached to the buffer (which
    holds a reference to the buffer), but it is not freeing the
    xfs_buf_log_item. Hence nothing will ever release the buffer, and
    the unmount hangs waiting for this reference to go away.

    The fix is simple - xfs_buf_item_unlock needs to detect the last
    reference going away in this case and free the xfs_buf_log_item to
    release the reference it holds on the buffer.

    Signed-off-by: Dave Chinner
    Reviewed-by: Ben Myers
    Signed-off-by: Ben Myers

    Dave Chinner
     
  • There is a window on small filesystems where speculative
    preallocation can be larger than the ENOSPC throttling thresholds,
    resulting in speculative preallocation trying to reserve more space
    than is available. This causes immediate ENOSPC to be
    triggered, prealloc to be turned off and flushing to occur. On the
    next write (i.e. the next 4k page), we do exactly the same thing, and so
    effectively drive the filesystem into synchronous 4k writes by triggering
    ENOSPC flushing on every page while in the window between the prealloc size
    and the ENOSPC prealloc throttle threshold.

    Fix this by checking to see if the prealloc size would consume all
    free space, and throttle it appropriately to avoid premature
    ENOSPC...

    Signed-off-by: Dave Chinner
    Reviewed-by: Brian Foster
    Signed-off-by: Ben Myers

    Dave Chinner
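
    A minimal sketch of the throttling idea (illustrative names and scaling;
    the real heuristics are more involved):

    #include <linux/types.h>

    /*
     * Never let speculative preallocation consume all of the remaining free
     * space; clamp it so that small filesystems near ENOSPC don't fall
     * straight into ENOSPC flushing on every subsequent write.
     */
    static u64 clamp_prealloc_blocks_sketch(u64 wanted, u64 freesp)
    {
            if (wanted >= freesp)
                    wanted = freesp >> 1;   /* leave headroom below free space */
            return wanted;
    }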
     
  • When _xfs_buf_find is passed an out of range address, it will fail
    to find a relevant struct xfs_perag and oops with a null
    dereference. This can happen when trying to walk a filesystem with a
    metadata inode that has a partially corrupted extent map (i.e. the
    block number returned is corrupt, but is otherwise intact) and we
    try to read from the corrupted block address.

    In this case, just fail the lookup. If it is readahead being issued,
    it will simply not be done, but if it is a real read that fails, we
    will get an error reported. Ideally this case should result
    in an EFSCORRUPTED error being reported, but we cannot return an
    error through xfs_buf_read() or xfs_buf_get() so this lookup failure
    may result in ENOMEM or EIO errors being reported instead.

    Signed-off-by: Dave Chinner
    Reviewed-by: Brian Foster
    Reviewed-by: Ben Myers
    Signed-off-by: Ben Myers

    Dave Chinner
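
    A minimal sketch of the bounds check described above, as a fragment of
    the buffer lookup path (close to the xfs code in spirit, but meant only
    as an illustration):

    /*
     * Reject block addresses beyond the end of the filesystem before the
     * perag lookup; a corrupt block number then fails the lookup instead of
     * causing a NULL perag dereference.
     */
    xfs_daddr_t eofs = XFS_FSB_TO_BB(mp, mp->m_sb.sb_dblocks);

    if (blkno >= eofs) {
            xfs_alert(mp, "block 0x%llx is beyond the end of the filesystem",
                      (unsigned long long)blkno);
            WARN_ON_ONCE(1);
            return NULL;
    }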
     
  • The stack_switch check currently occurs in __xfs_bmapi_allocate,
    which means the stack switch only occurs when xfs_bmapi_allocate()
    is called in a loop. Pull the check up before the loop in
    xfs_bmapi_write() such that the first iteration of the loop has
    consistent behavior.

    Signed-off-by: Brian Foster
    Reviewed-by: Dave Chinner
    Signed-off-by: Ben Myers

    Brian Foster
     
  • Commit 9802182 changed the return value from EWRONGFS (aka EINVAL)
    to EFSCORRUPTED, which doesn't seem to be handled properly by
    the root filesystem probe.

    Signed-off-by: Eric Sandeen
    Tested-by: Sergei Trofimovich
    Reviewed-by: Ben Myers
    Signed-off-by: Ben Myers

    Eric Sandeen
     

28 Jan, 2013

5 commits

  • The recent commit fb6791d100d1bba20b5cdbc4912e1f7086ec60f8
    included the wrong logic. The lvbptr check was incorrectly
    added after the patch was tested.

    Signed-off-by: David Teigland
    Signed-off-by: Steven Whitehouse

    David Teigland
     
  • We do need to start the lease recovery thread prior to waiting for the
    client initialisation to complete in NFSv4.1.

    Signed-off-by: Trond Myklebust
    Cc: Chuck Lever
    Cc: Ben Greear
    Cc: stable@vger.kernel.org [>=3.7]

    Trond Myklebust
     
  • If walking the list in nfs4[01]_walk_client_list fails, then the most
    likely explanation is that the server dropped the clientid before we
    actually managed to confirm it. As long as our nfs_client is the very
    last one in the list to be tested, the caller can be assured that this
    is the case when the final return value is NFS4ERR_STALE_CLIENTID.

    Reported-by: Ben Greear
    Signed-off-by: Trond Myklebust
    Cc: Chuck Lever
    Cc: stable@vger.kernel.org [>=3.7]
    Tested-by: Ben Greear

    Trond Myklebust
     
  • The reference counting in nfs4_init_client wrongly assumes that it
    is safe for nfs4_discover_server_trunking() to return a pointer to an
    nfs_client prior to bumping the reference count.

    Signed-off-by: Trond Myklebust
    Cc: Chuck Lever
    Cc: Ben Greear
    Cc: stable@vger.kernel.org [>=3.7]

    Trond Myklebust
     
  • Currently, nfs_xdev_mount converts all errors from clone_server() to
    ENOMEM, which can then leak to userspace (for instance to 'mount'). Fix that.
    Also ensure that if nfs_fs_mount_common() returns an error, we
    don't dprintk(0)...

    The regression originated in commit 3d176e3fe4f6dc379b252bf43e2e146a8f7caf01
    (NFS: Use nfs_fs_mount_common() for xdev mounts)

    Signed-off-by: Trond Myklebust
    Cc: stable@vger.kernel.org [>= 3.5]

    Trond Myklebust
     

26 Jan, 2013

1 commit

  • Pull btrfs fixes from Chris Mason:
    "It turns out that we had two crc bugs when running fsx-linux in a
    loop. Many thanks to Josef, Miao Xie, and Dave Sterba for nailing it
    all down. Miao also has a new OOM fix in this v2 pull as well.

    Ilya fixed a regression Liu Bo found in the balance ioctls for pausing
    and resuming a running balance across drives.

    Josef's orphan truncate patch fixes an obscure corruption we'd see
    during xfstests.

    Arne's patches address problems with subvolume quotas. If the user
    destroys quota groups incorrectly the FS will refuse to mount.

    The rest are smaller fixes and plugs for memory leaks."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (30 commits)
    Btrfs: fix repeated delalloc work allocation
    Btrfs: fix wrong max device number for single profile
    Btrfs: fix missed transaction->aborted check
    Btrfs: Add ACCESS_ONCE() to transaction->abort accesses
    Btrfs: put csums on the right ordered extent
    Btrfs: use right range to find checksum for compressed extents
    Btrfs: fix panic when recovering tree log
    Btrfs: do not allow logged extents to be merged or removed
    Btrfs: fix a regression in balance usage filter
    Btrfs: prevent qgroup destroy when there are still relations
    Btrfs: ignore orphan qgroup relations
    Btrfs: reorder locks and sanity checks in btrfs_ioctl_defrag
    Btrfs: fix unlock order in btrfs_ioctl_rm_dev
    Btrfs: fix unlock order in btrfs_ioctl_resize
    Btrfs: fix "mutually exclusive op is running" error code
    Btrfs: bring back balance pause/resume logic
    btrfs: update timestamps on truncate()
    btrfs: fix btrfs_cont_expand() freeing IS_ERR em
    Btrfs: fix a bug when llseek for delalloc bytes behind prealloc extents
    Btrfs: fix off-by-one in lseek
    ...

    Linus Torvalds
     

25 Jan, 2013

9 commits

  • Pull cifs fixes from Steve French:
    "Two small cifs fixes"

    * 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
    fs/cifs/cifs_dfs_ref.c: fix potential memory leakage
    cifs: fix srcip_matches() for ipv6

    Linus Torvalds
     
  • btrfs_start_delalloc_inodes() locks the delalloc_inodes list, fetches the
    first inode, unlocks the list, triggers btrfs_alloc_delalloc_work/
    btrfs_queue_worker for this inode, and then locks the list and checks the
    head of the list again. But because we don't delete the first inode it
    dealt with beforehand, it will fetch the same inode again. As a result,
    this function allocates a huge number of btrfs_delalloc_work structures,
    and OOM happens.

    Fix this problem by splicing the delalloc list.

    Reported-by: Alex Lyakas
    Signed-off-by: Miao Xie
    Signed-off-by: Josef Bacik

    Miao Xie
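
    A minimal sketch of the splice pattern described above (illustrative
    types; the real code works on the btrfs_root delalloc list):

    #include <linux/list.h>
    #include <linux/spinlock.h>

    struct delalloc_ctx_sketch {
            spinlock_t lock;
            struct list_head delalloc_inodes;
    };

    struct delalloc_entry_sketch {
            struct list_head list;
            /* ... per-inode data ... */
    };

    static void start_delalloc_inodes_sketch(struct delalloc_ctx_sketch *ctx)
    {
            LIST_HEAD(splice);
            struct delalloc_entry_sketch *entry, *tmp;

            /* Move the whole list onto a private head under the lock... */
            spin_lock(&ctx->lock);
            list_splice_init(&ctx->delalloc_inodes, &splice);
            spin_unlock(&ctx->lock);

            /*
             * ...so each iteration consumes a distinct entry; we never re-fetch
             * the same list head and allocate delalloc work for it again.
             */
            list_for_each_entry_safe(entry, tmp, &splice, list) {
                    list_del_init(&entry->list);
                    /* queue the delalloc work for this entry ... */
            }
    }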
     
  • The max device number of the single profile is 1, not 0 (0 means 'as many as
    possible'). Fix it.

    Cc: Liu Bo
    Signed-off-by: Miao Xie
    Reviewed-by: Liu Bo
    Signed-off-by: Josef Bacik

    Miao Xie
     
  • First, though the current transaction->aborted check can stop the commit early
    and avoid unnecessary operations, it happens too early: some transaction handles
    have not ended yet, and those handles may set transaction->aborted after the check.

    Second, when we commit the transaction, we will wake up some worker threads to
    flush the space cache and inode cache. Those threads also allocate some transaction
    handles and may set transaction->aborted if some serious error happens.

    So we need more checks for ->aborted when committing the transaction. Fix it.

    Signed-off-by: Miao Xie
    Signed-off-by: Josef Bacik

    Miao Xie
     
  • We may access and update transaction->aborted on different CPUs without a
    lock, so we need the ACCESS_ONCE() wrapper to prevent the compiler from
    creating unsolicited accesses and to make sure we get the right value.

    Signed-off-by: Miao Xie
    Signed-off-by: Josef Bacik

    Miao Xie
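
    A minimal sketch of the ACCESS_ONCE() usage described above (illustrative
    struct; ACCESS_ONCE comes from <linux/compiler.h>):

    #include <linux/compiler.h>

    struct transaction_sketch {
            int aborted;    /* may be set by other CPUs without a lock */
    };

    static int transaction_aborted_sketch(struct transaction_sketch *trans)
    {
            /*
             * Force a single real load of ->aborted so the compiler cannot
             * re-read or cache the value while other CPUs update it.
             */
            return ACCESS_ONCE(trans->aborted) != 0;
    }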
     
  • I noticed a WARN_ON going off when adding csums because we were going over
    the amount of csum bytes that should have been allowed for an ordered
    extent. This is a leftover from when we used to hold the csums privately
    for direct io, but now we use the normal ordered sum stuff, so we need to
    make sure to check whether we've moved on to another extent so that the csums
    are added to the right extent. Without this we could end up with csums for
    bytenrs that don't have extents to cover them yet. Thanks,

    Signed-off-by: Josef Bacik

    Josef Bacik
     
  • For compressed extents, the checksum range is covered by the disk length,
    and the disk length differs from the ram length, so we need to use the disk
    length instead to get the right checksum.

    Signed-off-by: Liu Bo
    Signed-off-by: Josef Bacik

    Liu Bo
     
  • A user reported a BUG_ON(ret) that occurred during tree log replay. Ret was
    -EAGAIN, so what I think happened is that we removed an extent that covered
    a bitmap entry and an extent entry. We remove the part from the bitmap and
    return -EAGAIN and then search for the next piece we want to remove, which
    happens to be an entire extent entry, so we just free the sucker and return.
    The problem is ret is still set to -EAGAIN, so we trip the BUG_ON(). The
    user used btrfs-zero-log, so I'm not 100% sure this is what happened, so I've
    added a WARN_ON() to catch the other possibility. Thanks,

    Reported-by: Jan Steffens
    Signed-off-by: Josef Bacik

    Josef Bacik
     
  • We drop the extent map tree lock while we're logging extents, so somebody
    could come in and merge another extent into this one and screw up our
    logging, or they could even remove us from the list, which would keep us from
    logging the extent or freeing our ref on it. So we need to make sure not to
    clear LOGGING until after the extent is logged, and then we can merge it with
    adjacent extents. Thanks,

    Signed-off-by: Josef Bacik

    Josef Bacik