04 Sep, 2016
3 commits
-
Pull btrfs fixes from Chris Mason:
"I'm still prepping a set of fixes for btrfs fsync, just nailing down a
hard to trigger memory corruption. For now, these are tested and ready."* 'for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
btrfs: fix one bug that process may endlessly wait for ticket in wait_reserve_ticket()
Btrfs: fix endless loop in balancing block groups
Btrfs: kill invalid ASSERT() in process_all_refs() -
Pull driver core fixes from Greg KH:
"Here are three small fixes for 4.8-rc5.One for sysfs, one for kernfs, and one documentation fix, all for
reported issues. All of these have been in linux-next for a while"* tag 'driver-core-4.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
sysfs: correctly handle read offset on PREALLOC attrs
documentation: drivers/core/of: fix name of of_node symlink
kernfs: don't depend on d_find_any_alias() when generating notifications -
In commit 8ead9dd54716 ("devpts: more pty driver interface cleanups") I
made devpts_get_priv() just return the dentry->fs_data directly. And
because I thought it wouldn't happen, I added a warning if you ever saw
a pts node that wasn't on devpts.And no, that warning never triggered under any actual real use, but you
can trigger it by creating nonsensical pts nodes by hand.So just revert the warning, and make devpts_get_priv() return NULL for
that case like it used to.Reported-by: Dmitry Vyukov
Cc: stable@vger.kernel.org # 4.6+
Cc: Eric W Biederman"
Signed-off-by: Linus Torvalds
03 Sep, 2016
1 commit
-
Pull overlayfs fixes from Miklos Szeredi:
"Most of this is regression fixes for posix acl behavior introduced in
4.8-rc1 (these were caught by the pjd-fstest suite). The are also
miscellaneous fixes marked as stable material and cleanups.Other than overlayfs code, it touches to add a constant
with which to disable posix acl caching. No changes needed to the
actual caching code, it automatically does the right thing, although
later we may want to optimize this case.I'm now testing overlayfs with the following test suites to catch
regressions:- unionmount-testsuite
- xfstests
- pjd-fstest"* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
ovl: update doc
ovl: listxattr: use strnlen()
ovl: Switch to generic_getxattr
ovl: copyattr after setting POSIX ACL
ovl: Switch to generic_removexattr
ovl: Get rid of ovl_xattr_noacl_handlers array
ovl: Fix OVL_XATTR_PREFIX
ovl: fix spelling mistake: "directries" -> "directories"
ovl: don't cache acl on overlay layer
ovl: use cached acl on underlying layer
ovl: proper cleanup of workdir
ovl: remove posix_acl_default from workdir
ovl: handle umask and posix_acl_default correctly on creation
ovl: don't copy up opaqueness
02 Sep, 2016
2 commits
-
Pull audit fixes from Paul Moore:
"Two small patches to fix some bugs with the audit-by-executable
functionality we introduced back in v4.3 (both patches are marked
for the stable folks)"* 'stable-4.8' of git://git.infradead.org/users/pcmoore/audit:
audit: fix exe_file access in audit_exe_compare
mm: introduce get_task_exe_file -
…rnel/git/dgc/linux-xfs
Pull xfs and iomap fixes from Dave Chinner:
"Most of these changes are small regression fixes that address problems
introduced in the 4.8-rc1 window. The two fixes that aren't (IO
completion fix and superblock inprogress check) are fixes for problems
introduced some time ago and need to be pushed back to stable kernels.Changes in this update:
- iomap FIEMAP_EXTENT_MERGED usage fix
- additional mount-time feature restrictions
- rmap btree query fixes
- freeze/unmount io completion workqueue fix
- memory corruption fix for deferred operations handling"* tag 'xfs-iomap-for-linus-4.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs:
xfs: track log done items directly in the deferred pending work item
iomap: don't set FIEMAP_EXTENT_MERGED for extent based filesystems
xfs: prevent dropping ioend completions during buftarg wait
xfs: fix superblock inprogress check
xfs: simple btree query range should look right if LE lookup fails
xfs: fix some key handling problems in _btree_simple_query_range
xfs: don't log the entire end of the AGF
xfs: disallow mounting of realtime + rmap filesystems
xfs: don't perform lookups on zero-height btrees
01 Sep, 2016
17 commits
-
If can_overcommit() in btrfs_calc_reclaim_metadata_size() returns true,
btrfs_async_reclaim_metadata_space() will not reclaim metadata space, just
return directly and also forget to wake up process which are waiting for
their tickets, so these processes will wait endlessly.Fstests case generic/172 with mount option "-o compress=lzo" have revealed
this bug in my test machine. Here if we have tickets to handle, we must
handle them first.Signed-off-by: Wang Xiaoguang
Reviewed-by: Josef Bacik
Signed-off-by: David Sterba -
Qgroup function may overwrite the saved error 'err' with 0
in case quota is not enabled, and this ends up with a
endless loop in balance because we keep going back to balance
the same block group.It really should use 'ret' instead.
Signed-off-by: Liu Bo
Reviewed-by: Qu Wenruo
Signed-off-by: David Sterba -
Suppose you have the following tree in snap1 on a file system mounted with -o
inode_cache so that inode numbers are recycled└── [ 258] a
└── [ 257] band then you remove b, rename a to c, and then re-create b in c so you have the
following tree└── [ 258] c
└── [ 257] band then you try to do an incremental send you will hit
ASSERT(pending_move == 0);
in process_all_refs(). This is because we assume that any recycling of inodes
will not have a pending change in our path, which isn't the case. This is the
case for the DELETE side, since we want to remove the old file using the old
path, but on the create side we could have a pending move and need to do the
normal pending rename dance. So remove this ASSERT() and put a comment about
why we ignore pending_move. Thanks,Signed-off-by: Josef Bacik
Signed-off-by: David Sterba -
Be defensive about what underlying fs provides us in the returned xattr
list buffer. If it's not properly null terminated, bail out with a warning
insead of BUG.Signed-off-by: Miklos Szeredi
Cc: -
Now that overlayfs has xattr handlers for iop->{set,remove}xattr, use
those same handlers for iop->getxattr as well.Signed-off-by: Andreas Gruenbacher
Signed-off-by: Miklos Szeredi -
Setting POSIX acl may also modify the file mode, so need to copy that up to
the overlay inode.Reported-by: Eryu Guan
Fixes: d837a49bd57f ("ovl: fix POSIX ACL setting")
Signed-off-by: Miklos Szeredi -
Commit d837a49bd57f ("ovl: fix POSIX ACL setting") switches from
iop->setxattr from ovl_setxattr to generic_setxattr, so switch from
ovl_removexattr to generic_removexattr as well. As far as permission
checking goes, the same rules should apply in either case.While doing that, rename ovl_setxattr to ovl_xattr_set to indicate that
this is not an iop->setxattr implementation and remove the unused inode
argument.Move ovl_other_xattr_set above ovl_own_xattr_set so that they match the
order of handlers in ovl_xattr_handlers.Signed-off-by: Andreas Gruenbacher
Fixes: d837a49bd57f ("ovl: fix POSIX ACL setting")
Signed-off-by: Miklos Szeredi -
Use an ordinary #ifdef to conditionally include the POSIX ACL handlers
in ovl_xattr_handlers, like the other filesystems do. Flag the code
that is now only used conditionally with __maybe_unused.Signed-off-by: Andreas Gruenbacher
Signed-off-by: Miklos Szeredi -
Make sure ovl_own_xattr_handler only matches attribute names starting
with "overlay.", not "overlayXXX".Signed-off-by: Andreas Gruenbacher
Fixes: d837a49bd57f ("ovl: fix POSIX ACL setting")
Signed-off-by: Miklos Szeredi -
Trivial fix to spelling mistake in pr_err message.
Signed-off-by: Colin Ian King
Signed-off-by: Miklos Szeredi -
Some operations (setxattr/chmod) can make the cached acl stale. We either
need to clear overlay's acl cache for the affected inode or prevent acl
caching on the overlay altogether. Preventing caching has the following
advantages:- no double caching, less memory used
- overlay cache doesn't go stale when fs clears it's own cache
Possible disadvantage is performance loss. If that becomes a problem
get_acl() can be optimized for overlayfs.This patch disables caching by pre setting i_*acl to a value that
- has bit 0 set, so is_uncached_acl() will return true
- is not equal to ACL_NOT_CACHED, so get_acl() will not overwrite it
The constant -3 was chosen for this purpose.
Fixes: 39a25b2b3762 ("ovl: define ->get_acl() for overlay inodes")
Signed-off-by: Miklos Szeredi -
Instead of calling ->get_acl() directly, use get_acl() to get the cached
value.We will have the acl cached on the underlying inode anyway, because we do
permission checking on the both the overlay and the underlying fs.So, since we already have double caching, this improves performance without
any cost.Signed-off-by: Miklos Szeredi
-
When mounting overlayfs it needs a clean "work" directory under the
supplied workdir.Previously the mount code removed this directory if it already existed and
created a new one. If the removal failed (e.g. directory was not empty)
then it fell back to a read-only mount not using the workdir.While this has never been reported, it is possible to get a non-empty
"work" dir from a previous mount of overlayfs in case of crash in the
middle of an operation using the work directory.In this case the left over state should be discarded and the overlay
filesystem will be consistent, guaranteed by the atomicity of operations on
moving to/from the workdir to the upper layer.This patch implements cleaning out any files left in workdir. It is
implemented using real recursion for simplicity, but the depth is limited
to 2, because the worst case is that of a directory containing whiteouts
under "work".Signed-off-by: Miklos Szeredi
Cc: -
Clear out posix acl xattrs on workdir and also reset the mode after
creation so that an inherited sgid bit is cleared.Signed-off-by: Miklos Szeredi
Cc: -
Setting MS_POSIXACL in sb->s_flags has the side effect of passing mode to
create functions without masking against umask.Another problem when creating over a whiteout is that the default posix acl
is not inherited from the parent dir (because the real parent dir at the
time of creation is the work directory).Fix these problems by:
a) If upper fs does not have MS_POSIXACL, then mask mode with umask.
b) If creating over a whiteout, call posix_acl_create() to get the
inherited acls. After creation (but before moving to the final
destination) set these acls on the created file. posix_acl_create() also
updates the file creation mode as appropriate.Fixes: 39a25b2b3762 ("ovl: define ->get_acl() for overlay inodes")
Signed-off-by: Miklos Szeredi -
For more convenient access if one has a pointer to the task.
As a minor nit take advantage of the fact that only task lock + rcu are
needed to safely grab ->exe_file. This saves mm refcount dance.Use the helper in proc_exe_link.
Signed-off-by: Mateusz Guzik
Acked-by: Konstantin Khlebnikov
Acked-by: Richard Guy Briggs
Cc: # 4.3.x
Signed-off-by: Paul Moore -
We used to delay switching to the new credentials until after we had
mapped the executable (and possible elf interpreter). That was kind of
odd to begin with, since the new executable will actually then _run_
with the new creds, but whatever.The bigger problem was that we also want to make sure that we turn off
prof events and tracing before we start mapping the new executable
state. So while this is a cleanup, it's also a fix for a possible
information leak.Reported-by: Robert Święcki
Tested-by: Peter Zijlstra
Acked-by: David Howells
Acked-by: Oleg Nesterov
Acked-by: Andy Lutomirski
Acked-by: Eric W. Biederman
Cc: Willy Tarreau
Cc: Kees Cook
Cc: Al Viro
Signed-off-by: Linus Torvalds
31 Aug, 2016
3 commits
-
Attributes declared with __ATTR_PREALLOC use sysfs_kf_read() which returns
zero bytes for non-zero offset. This breaks script checkarray in mdadm tool
in debian where /bin/sh is 'dash' because its builtin 'read' reads only one
byte at a time. Script gets 'i' instead of 'idle' when reads current action
from /sys/block/$dev/md/sync_action and as a result does nothing.This patch adds trivial implementation of partial read: generate whole
string and move required part into buffer head.Signed-off-by: Konstantin Khlebnikov
Fixes: 4ef67a8c95f3 ("sysfs/kernfs: make read requests on pre-alloc files use the buffer.")
Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=787950
Cc: Stable # v3.19+
Acked-by: Tejun Heo
Signed-off-by: Greg Kroah-Hartman -
kernfs_notify_workfn() sends out file modified events for the
scheduled kernfs_nodes. Because the modifications aren't from
userland, it doesn't have the matching file struct at hand and can't
use fsnotify_modify(). Instead, it looked up the inode and then used
d_find_any_alias() to find the dentry and used fsnotify_parent() and
fsnotify() directly to generate notifications.The assumption was that the relevant dentries would have been pinned
if there are listeners, which isn't true as inotify doesn't pin
dentries at all and watching the parent doesn't pin the child dentries
even for dnotify. This led to, for example, inotify watchers not
getting notifications if the system is under memory pressure and the
matching dentries got reclaimed. It can also be triggered through
/proc/sys/vm/drop_caches or a remount attempt which involves shrinking
dcache.fsnotify_parent() only uses the dentry to access the parent inode,
which kernfs can do easily. Update kernfs_notify_workfn() so that it
uses fsnotify() directly for both the parent and target inodes without
going through d_find_any_alias(). While at it, supply the target file
name to fsnotify() from kernfs_node->name.Signed-off-by: Tejun Heo
Reported-by: Evgeny Vereshchagin
Fixes: d911d9874801 ("kernfs: make kernfs_notify() trigger inotify events too")
Cc: John McCutchan
Cc: Robert Love
Cc: Eric Paris
Cc: stable@vger.kernel.org # v3.16+
Signed-off-by: Greg Kroah-Hartman -
Pull NFS client bugfixes from Trond Myklebust:
"Highlights include:Stable patches:
- Fix a refcount leak in nfs_callback_up_net
- Fix an Oopsable condition when the flexfile pNFS driver connection
to the DS fails
- Fix an Oopsable condition in NFSv4.1 server callback races
- Ensure pNFS clients stop doing I/O to the DS if their lease has
expired, as required by the NFSv4.1 protocolBugfixes:
- Fix potential looping in the NFSv4.x migration code
- Patch series to close callback races for OPEN, LAYOUTGET and
LAYOUTRETURN
- Silence WARN_ON when NFSv4.1 over RDMA is in use
- Fix a LAYOUTCOMMIT race in the pNFS/blocks client
- Fix pNFS timeout issues when the DS fails"* tag 'nfs-for-4.8-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFSv4.x: Fix a refcount leak in nfs_callback_up_net
NFS4: Avoid migration loops
pNFS/flexfiles: Fix an Oopsable condition when connection to the DS fails
NFSv4.1: Remove obsolete and incorrrect assignment in nfs4_callback_sequence
NFSv4.1: Close callback races for OPEN, LAYOUTGET and LAYOUTRETURN
NFSv4.1: Defer bumping the slot sequence number until we free the slot
NFSv4.1: Delay callback processing when there are referring triples
NFSv4.1: Fix Oopsable condition in server callback races
SUNRPC: Silence WARN_ON when NFSv4.1 over RDMA is in use
pnfs/blocklayout: update last_write_offset atomically with extents
pNFS: The client must not do I/O to the DS if it's lease has expired
pNFS: Handle NFS4ERR_OLD_STATEID correctly in LAYOUTSTAT calls
pNFS/flexfiles: Set reasonable default retrans values for the data channel
NFS: Allow the mount option retrans=0
pNFS/flexfiles: Fix layoutstat periodic reporting
30 Aug, 2016
5 commits
-
On error, the callers expect us to return without bumping
nn->cb_users[].Signed-off-by: Trond Myklebust
Cc: stable@vger.kernel.org # v3.7+ -
If a server returns itself as a location while migrating, the client may
end up getting stuck attempting to migrate twice to the same server. Catch
this by checking if the nfs_client found is the same as the existing
client. For the other two callers to nfs4_set_client, the nfs_client will
always be ERR_PTR(-EINVAL).Signed-off-by: Benjamin Coddington
Signed-off-by: Trond Myklebust -
Christoph reports slab corruption when a deferred refcount update
aborts during _defer_finish(). The cause of this was broken log item
state tracking in xfs_defer_pending -- upon an abort,
_defer_trans_abort() will call abort_intent on all intent items,
including the ones that have already had a done item attached.This is incorrect because each intent item has 2 refcount: the first
is released when the intent item is committed to the log; and the
second is released when the _done_ item is committed to the log, or
by the intent creator if there is no done item. In other words, once
we log the done item, responsibility for releasing the intent item's
second refcount is transferred to the done item and /must not/ be
performed by anything else.The dfp_committed flag should have been tracking whether or not we had
a done item so that _defer_trans_abort could decide if it needs to
abort the intent item, but due to a thinko this was not the case. Rip
it out and track the done item directly so that we do the right thing
w.r.t. intent item freeing.Signed-off-by: Darrick J. Wong
Reported-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Signed-off-by: Dave Chinner -
Pull ext4 fixes from Ted Ts'o:
"Fix bugs that could cause kernel deadlocks or file system corruption
while moving xattrs to expand the extended inode.Also add some sanity checks to the block group descriptors to make
sure we don't end up overwriting the superblock"* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: avoid deadlock when expanding inode size
ext4: properly align shifted xattrs when expanding inodes
ext4: fix xattr shifting when expanding inodes part 2
ext4: fix xattr shifting when expanding inodes
ext4: validate that metadata blocks do not overlap superblock
ext4: reserve xattr index for the Hurd -
If the attempt to connect to a DS fails inside ff_layout_pg_init_read or
ff_layout_pg_init_write, then we currently end up clearing the layout
segment carried by the struct nfs_pageio_descriptor, causing an Oops
when we later call into ff_layout_read_pagelist/ff_layout_write_pagelist.The fix is to ensure we return the layout and then retry.
Fixes: 446ca2195303 ("pNFS/flexfiles: When initing reads or writes, we...")
Cc: stable@vger.kernel.org # v4.7+
Signed-off-by: Trond Myklebust
29 Aug, 2016
6 commits
-
Filesystems like XFS that use extents should not set the
FIEMAP_EXTENT_MERGED flag in the fiemap extent structures. To allow
for both behaviors for the upcoming gfs2 usage split the iomap
type field into type and flags, and only set FIEMAP_EXTENT_MERGED if
the IOMAP_F_MERGED flag is set. The flags field will also come in
handy for future features such as shared extents on reflink-enabled
file systems.Reported-by: Andreas Gruenbacher
Signed-off-by: Christoph Hellwig
Acked-by: Darrick J. Wong
Signed-off-by: Dave Chinner -
Signed-off-by: Trond Myklebust
-
Defer freeing the slot until after we have processed the results from
OPEN and LAYOUTGET. This means that the server can rely on the
mechanism in RFC5661 Section 2.10.6.3 to ensure that replies to an
OPEN or LAYOUTGET/RETURN RPC call don't race with the callbacks that
apply to them.Signed-off-by: Trond Myklebust
-
For operations like OPEN or LAYOUTGET, which return recallable state
(i.e. delegations and layouts) we want to enable the mechanism for
resolving recall races in RFC5661 Section 2.10.6.3.
To do so, we will want to defer bumping the slot's sequence number until
we have finished processing the RPC results.Signed-off-by: Trond Myklebust
-
If CB_SEQUENCE tells us that the processing of this request depends on
the completion of one or more referring triples (see RFC 5661 Section
2.10.6.3), delay the callback processing until after the RPC requests
being referred to have completed.
If we end up delaying for more than 1/2 second, then fall back to
returning NFS4ERR_DELAY in reply to the callback.Signed-off-by: Trond Myklebust
-
The slot table hasn't been an array since v3.7. Ensure that we
use nfs4_lookup_slot() to access the slot correctly.Fixes: 87dda67e7386 ("NFSv4.1: Allow SEQUENCE to resize the slot table...")
Signed-off-by: Trond Myklebust
Cc: stable@vger.kernel.org # v3.8+
27 Aug, 2016
3 commits
-
Merge fixes from Andrew Morton:
"11 fixes"* emailed patches from Andrew Morton :
mm: silently skip readahead for DAX inodes
dax: fix device-dax region base
fs/seq_file: fix out-of-bounds read
mm: memcontrol: avoid unused function warning
mm: clarify COMPACTION Kconfig text
treewide: replace config_enabled() with IS_ENABLED() (2nd round)
printk: fix parsing of "brl=" option
soft_dirty: fix soft_dirty during THP split
sysctl: handle error writing UINT_MAX to u32 fields
get_maintainer: quiet noisy implicit -f vcs_file_exists checking
byteswap: don't use __builtin_bswap*() with sparse -
Pull btrfs fixes from Chris Mason:
"We've queued up a few different fixes in here. These range from
enospc corners to fsync and quota fixes, and a few targeted at error
handling for corrupt metadata/fuzzing"* 'for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: fix lockdep warning on deadlock against an inode's log mutex
Btrfs: detect corruption when non-root leaf has zero item
Btrfs: check btree node's nritems
btrfs: don't create or leak aliased root while cleaning up orphans
Btrfs: fix em leak in find_first_block_group
btrfs: do not background blkdev_put()
Btrfs: clarify do_chunk_alloc()'s return value
btrfs: fix fsfreeze hang caused by delayed iputs deal
btrfs: update btrfs_space_info's bytes_may_use timely
btrfs: divide btrfs_update_reserved_bytes() into two functions
btrfs: use correct offset for reloc_inode in prealloc_file_extent_cluster()
btrfs: qgroup: Fix qgroup incorrectness caused by log replay
btrfs: relocation: Fix leaking qgroups numbers on data extents
btrfs: qgroup: Refactor btrfs_qgroup_insert_dirty_extent()
btrfs: waiting on qgroup rescan should not always be interruptible
btrfs: properly track when rescan worker is running
btrfs: flush_space: treat return value of do_chunk_alloc properly
Btrfs: add ASSERT for block group's memory leak
btrfs: backref: Fix soft lockup in __merge_refs function
Btrfs: fix memory leak of reloc_root -
Pull dlm fix from David Teigland:
"This fixes a bug introduced by recent debugfs cleanup"* tag 'dlm-4.8-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
dlm: fix malfunction of dlm_tool caused by debugfs changes