06 Jan, 2017
1 commit
-
Pull audit fixes from Paul Moore:
"Two small fixes relating to audit's use of fsnotify.

The first patch plugs a leak and the second fixes some lock
shenanigans. The patches are small and I banged on this for an
afternoon with our testsuite and didn't see anything odd"

* 'stable-4.10' of git://git.infradead.org/users/pcmoore/audit:
audit: Fix sleep in atomic
fsnotify: Remove fsnotify_duplicate_mark()
05 Jan, 2017
2 commits
-
Pull xfs fixes from Darrick Wong:
- fixes for crashes and double-cleanup errors
- XFS maintainership handover
- fix to prevent absurdly large block reservations
- fix broken sysfs getter/setters
* tag 'xfs-for-linus-4.10-rc3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: fix max_retries _show and _store functions
xfs: update MAINTAINERS
xfs: fix crash and data corruption due to removal of busy COW extents
xfs: use the actual AG length when reserving blocks
xfs: fix double-cleanup when CUI recovery fails
-
Pull block layer fixes from Jens Axboe:
"A set of fixes for the current series, one fixing a regression with
block size < page cache size in the alias series from Jan. Outside of
that, two small cleanups for wbt from Bart, a nvme pull request from
Christoph, and a few small fixes of documentation updates"

* 'for-linus' of git://git.kernel.dk/linux-block:
block: fix up io_poll documentation
block: Avoid that sparse complains about context imbalance in __wbt_wait()
block: Make wbt_wait() definition consistent with declaration
clean_bdev_aliases: Prevent cleaning blocks that are not in block range
genhd: remove dead and duplicated scsi code
block: add back plugging in __blkdev_direct_IO
nvmet/fcloop: remove some logically dead code performing redundant ret checks
nvmet: fix KATO offset in Set Features
nvme/fc: simplify error handling of nvme_fc_create_hw_io_queues
nvme/fc: correct some printk information
nvme/scsi: Remove START STOP emulation
nvme/pci: Delete misleading queue-wrap comment
nvme/pci: Fix whitespace problem
nvme: simplify stripe quirk
nvme: update maintainers information
04 Jan, 2017
4 commits
-
max_retries _show and _store functions should test against cfg->max_retries,
not cfg->retry_timeout.

Signed-off-by: Carlos Maiolino
Reviewed-by: Eric Sandeen
Signed-off-by: Darrick J. Wong
-
There is a race window between write_cache_pages calling
clear_page_dirty_for_io and XFS calling set_page_writeback, in which
the mapping for an inode is tagged neither as dirty, nor as writeback.

If the COW shrinker hits in exactly that window we'll remove the delayed
COW extents and writepages trying to write it back, which in release
kernels will manifest as corruption of the bmap btree, and in debug
kernels will trip the ASSERT about not calling xfs_bmapi_write with the
COWFORK flag for holes. A complex customer load manages to hit this
window fairly reliably, probably by always having COW writeback in flight
while the cow shrinker runs.

This patch adds another check for having the I_DIRTY_PAGES flag set,
which is still set during this race window. While this fixes the problem
I'm still not overly happy about the way the COW shrinker works as it
still seems a bit fragile.

Signed-off-by: Christoph Hellwig
Signed-off-by: Darrick J. Wong
-
We need to use the actual AG length when making per-AG reservations,
since we could otherwise end up reserving more blocks out of the last
AG than there are actual blocks.

Complained-about-by: Brian Foster
Signed-off-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig
-
Dan Carpenter reported a double-free of rcur if _defer_finish fails
while we're recovering CUI items. Fix the error recovery to prevent
this.

Reported-by: Dan Carpenter
Signed-off-by: Darrick J. Wong
03 Jan, 2017
2 commits
-
Pull fscrypt fixes from Ted Ts'o:
"Two fscrypt bug fixes, one of which was unmasked by an update to the
crypto tree during the merge window"

* tag 'fscrypt-for-stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/fscrypt:
fscrypt: fix renaming and linking special files
fscrypt: fix the test_dummy_encryption mount option
-
The first block to be cleaned may start at a non-zero page offset. In
such a scenario clean_bdev_aliases() will end up cleaning blocks that
do not fall in the range of blocks to be cleaned. This commit fixes the
issue by skipping blocks that do not fall in valid block range.

Signed-off-by: Chandan Rajendra
Reviewed-by: Jan Kara
Reviewed-by: Eryu Guan
Signed-off-by: Jens Axboe
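
The range check described in the clean_bdev_aliases fix can be illustrated with a small userspace sketch. This is not the kernel code: `struct blk` and `clean_page_blocks()` are hypothetical stand-ins for a page's buffer heads and the per-page loop, and the blocks are assumed to be sorted by block number within the page.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a buffer head: one block cached in a page. */
struct blk {
	uint64_t blocknr;	/* on-disk block number */
	int dirty;		/* 1 if the cached alias is still dirty */
};

/*
 * Clean only the blocks of one page that fall inside [block, block + len).
 * Blocks are assumed to be sorted by blocknr, as they would be in a page.
 * Returns the number of blocks cleaned.
 */
size_t clean_page_blocks(struct blk *blks, size_t n,
			 uint64_t block, uint64_t len)
{
	size_t cleaned = 0;

	for (size_t i = 0; i < n; i++) {
		if (blks[i].blocknr < block)
			continue;	/* before the range: skip */
		if (blks[i].blocknr >= block + len)
			break;		/* past the range: stop */
		blks[i].dirty = 0;
		cleaned++;
	}
	return cleaned;
}
```

With a page holding blocks 8 to 11 and a request to clean two blocks starting at block 9, only blocks 9 and 10 are touched; blocks 8 and 11 stay dirty.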
31 Dec, 2016
1 commit
-
Attempting to link a device node, named pipe, or socket file into an
encrypted directory through rename(2) or link(2) always failed with
EPERM. This happened because fscrypt_has_permitted_context() saw that
the file was unencrypted and forbade creating the link. This behavior
was unexpected because such files are never encrypted; only regular
files, directories, and symlinks can be encrypted.

To fix this, make fscrypt_has_permitted_context() always return true on
special files.

This will be covered by a test in my encryption xfstests patchset.
Fixes: 9bd8212f981e ("ext4 crypto: add encryption policy and password salt support")
Signed-off-by: Eric Biggers
Reviewed-by: Richard Weinberger
Cc: stable@vger.kernel.org
Signed-off-by: Theodore Ts'o
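
A minimal sketch of the rule this commit describes, written as standalone C rather than kernel code. The `enum ftype` values and both function names are hypothetical, and the real fscrypt_has_permitted_context() compares encryption policies rather than a single boolean; the sketch only shows the early return for special files.

```c
#include <stdbool.h>

/* Hypothetical file-type tags, standing in for the inode mode bits. */
enum ftype { FT_REG, FT_DIR, FT_LNK, FT_CHR, FT_BLK, FT_FIFO, FT_SOCK };

/* Only regular files, directories, and symlinks can be encrypted. */
bool type_is_encryptable(enum ftype t)
{
	return t == FT_REG || t == FT_DIR || t == FT_LNK;
}

/*
 * Simplified check for linking a child into an encrypted directory:
 * special files are always permitted, everything else must be encrypted.
 */
bool link_permitted(enum ftype child_type, bool child_encrypted)
{
	if (!type_is_encryptable(child_type))
		return true;	/* device node, named pipe, or socket */
	return child_encrypted;
}
```

Under this sketch an unencrypted named pipe is accepted, while an unencrypted regular file is still rejected.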
28 Dec, 2016
1 commit
-
Commit f1c131b45410a: "crypto: xts - Convert to skcipher" now fails
the setkey operation if the AES key is the same as the tweak key.
Previously this check was only done if FIPS mode is enabled. Now this
check is also done if weak key checking was requested. This is
reasonable, but since we were using the dummy key which was a constant
series of 0x42 bytes, it now caused dummy encryption test mode to
fail.

Fix this by using 0x42... and 0x24... for the two keys, so they are
different.

Fixes: f1c131b45410a202eb45cc55980a7a9e4e4b4f40
Cc: stable@vger.kernel.org
Signed-off-by: Theodore Ts'o
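
The weak-key condition and the fix can be sketched in plain C. The 64-byte key size (two 32-byte halves) and both function names are assumptions made for illustration; the commit message itself only says that the two keys are now filled with 0x42 and 0x24.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DUMMY_KEY_SIZE 64	/* assumed: two 32-byte halves */

/* Fill the first half with 0x42 and the second half with 0x24. */
void make_dummy_xts_key(unsigned char key[DUMMY_KEY_SIZE])
{
	memset(key, 0x42, DUMMY_KEY_SIZE / 2);
	memset(key + DUMMY_KEY_SIZE / 2, 0x24, DUMMY_KEY_SIZE / 2);
}

/* An XTS key is rejected when its data half equals its tweak half. */
bool xts_key_is_weak(const unsigned char *key, size_t keylen)
{
	size_t half = keylen / 2;

	return memcmp(key, key + half, half) == 0;
}
```

A key made entirely of 0x42 bytes trips the check, while the key built by make_dummy_xts_key() passes it.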
27 Dec, 2016
6 commits
-
Now that dax_iomap_fault() calls ->iomap_begin() without entry lock, we
can use transaction starting in ext4_iomap_begin() and thus simplify
ext4_dax_fault(). It also provides us proper retries in case of ENOSPC.

Signed-off-by: Jan Kara
Signed-off-by: Dan Williams
-
Currently ->iomap_begin() handler is called with entry lock held. If the
filesystem held any locks between ->iomap_begin() and ->iomap_end()
(such as ext4 which will want to hold transaction open), this would cause
lock inversion with the iomap_apply() from standard IO path which first
calls ->iomap_begin() and only then calls ->actor() callback which grabs
entry locks for DAX (if it faults when copying from/to user provided
buffers).

Fix the problem by nesting grabbing of entry lock inside ->iomap_begin()
- ->iomap_end() pair.

Reviewed-by: Ross Zwisler
Signed-off-by: Jan Kara
Signed-off-by: Dan Williams
-
The only case when we do not finish the page fault completely is when we
are loading hole pages into a radix tree. Avoid this special case and
finish the fault in that case as well inside the DAX fault handler. This
will make iomap handling easier.

Reviewed-by: Ross Zwisler
Signed-off-by: Jan Kara
Signed-off-by: Dan Williams
-
Currently dax_iomap_rw() takes care of invalidating page tables and
evicting hole pages from the radix tree when write(2) to the file
happens. This invalidation is only necessary when there is some block
allocation resulting from write(2). Furthermore in current place the
invalidation is racy wrt page fault instantiating a hole page just after
we have invalidated it.

So perform the page invalidation inside dax_iomap_actor() where we can
do it only when really necessary and after blocks have been allocated so
nobody will be instantiating new hole pages anymore.

Reviewed-by: Christoph Hellwig
Reviewed-by: Ross Zwisler
Signed-off-by: Jan Kara
Signed-off-by: Dan Williams
-
Currently invalidate_inode_pages2_range() and invalidate_mapping_pages()
just delete all exceptional radix tree entries they find. For DAX this
is not desirable as we track cache dirtiness in these entries and when
they are evicted, we may not flush caches although it is necessary. This
can for example manifest when we write to the same block both via mmap
and via write(2) (to different offsets) and fsync(2) then does not
properly flush CPU caches when modification via write(2) was the last
one.

Create appropriate DAX functions to handle invalidation of DAX entries
for invalidate_inode_pages2_range() and invalidate_mapping_pages() and
wire them up into the corresponding mm functions.

Acked-by: Johannes Weiner
Reviewed-by: Ross Zwisler
Signed-off-by: Jan Kara
Signed-off-by: Dan Williams
-
So far we did not return BH_New buffers from ext2_get_blocks() when we
allocated and zeroed-out a block for DAX inode to avoid racy zeroing in
DAX code. This zeroing is gone these days so we can remove the
workaround.

Reviewed-by: Ross Zwisler
Reviewed-by: Christoph Hellwig
Signed-off-by: Jan Kara
Signed-off-by: Dan Williams
26 Dec, 2016
3 commits
-
No point in going through loops and hoops instead of just comparing the
values.

Signed-off-by: Thomas Gleixner
Cc: Peter Zijlstra
-
ktime_set(S,N) was required for the timespec storage type and is still
useful for situations where a Seconds and Nanoseconds part of a time value
needs to be converted. For anything where the Seconds argument is 0, this
is pointless and can be replaced with a simple assignment.

Signed-off-by: Thomas Gleixner
Cc: Peter Zijlstra
-
ktime is a union because the initial implementation stored the time in
scalar nanoseconds on 64-bit machines and in an endianness-optimized timespec
variant for 32-bit machines. The Y2038 cleanup removed the timespec variant
and switched everything to scalar nanoseconds. The union remained, but
became completely pointless.

Get rid of the union and just keep ktime_t as a simple typedef of type s64.
The conversion was done with coccinelle and some manual mopping up.
Signed-off-by: Thomas Gleixner
Cc: Peter Zijlstra
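
The three ktime changes above fit together once ktime_t is a plain 64-bit nanosecond count. The following is a userspace sketch under that assumption, not the kernel header: the typedef uses int64_t in place of s64, and this ktime_set() omits any overflow handling the kernel version may perform.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/* Userspace stand-in: ktime_t as a signed 64-bit nanosecond count. */
typedef int64_t ktime_t;

/* Combine seconds and nanoseconds; only useful when secs is non-zero. */
ktime_t ktime_set(int64_t secs, unsigned long nsecs)
{
	return secs * NSEC_PER_SEC + (int64_t)nsecs;
}
```

With a scalar type, `ktime_t t = 500;` yields the same value as `ktime_set(0, 500)`, and two values can be compared with `==` directly instead of going through a helper.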
25 Dec, 2016
2 commits
-
This was entirely automated, using the script by Al:
PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*'
sed -i -e "s!$PATT!#include !" \
$(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.

Requested-by: Al Viro
Signed-off-by: Linus Torvalds
-
Pull cifs fixes from Steve French:
"This includes various cifs/smb3 bug fixes, mostly for stable as well.

In the next week I expect that Germano will have some reconnection
fixes, and also I expect to have the remaining pieces of the snapshot
enablement and SMB3 ACLs, but wanted to get this set of bug fixes in"

* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
cifs_get_root shouldn't use path with tree name
Fix default behaviour for empty domains and add domainauto option
cifs: use %16phN for formatting md5 sum
cifs: Fix smbencrypt() to stop pointing a scatterlist at the stack
CIFS: Fix a possible double locking of mutex during reconnect
CIFS: Fix a possible memory corruption during reconnect
CIFS: Fix a possible memory corruption in push locks
CIFS: Fix missing nls unload in smb2_reconnect()
CIFS: Decrease verbosity of ioctl call
SMB3: parsing for new snapshot timestamp mount parm
24 Dec, 2016
3 commits
-
There are only two call sites of fsnotify_duplicate_mark(). Those are
in kernel/audit_tree.c and both are bogus. Vfsmount pointer is unused
for audit tree, inode pointer and group gets set in
fsnotify_add_mark_locked() later anyway, mask and free_mark are already
set in alloc_chunk(). In fact, calling fsnotify_duplicate_mark() is
actively harmful because following fsnotify_add_mark_locked() will leak
group reference by overwriting the group pointer. So just remove the two
calls to fsnotify_duplicate_mark() and the function.

Signed-off-by: Jan Kara
[PM: line wrapping to fit in 80 chars]
Signed-off-by: Paul Moore
-
Pull final vfs updates from Al Viro:
"Assorted cleanups and fixes all over the place"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
sg_write()/bsg_write() is not fit to be called under KERNEL_DS
ufs: fix function declaration for ufs_truncate_blocks
fs: exec: apply CLOEXEC before changing dumpable task flags
seq_file: reset iterator to first record for zero offset
vfs: fix isize/pos/len checks for reflink & dedupe
[iov_iter] fix iterate_all_kinds() on empty iterators
move aio compat to fs/aio.c
reorganize do_make_slave()
clone_private_mount() doesn't need to touch namespace_sem
remove a bogus claim about namespace_sem being held by callers of mnt_alloc_id()
-
Pull befs updates from Luis de Bethencourt:
"A series of small fixes and adding NFS export support"

* tag 'befs-v4.10-rc1' of git://github.com/luisbg/linux-befs:
befs: add NFS export support
befs: remove trailing whitespaces
befs: remove signatures from comments
befs: fix style issues in header files
befs: fix style issues in linuxvfs.c
befs: fix typos in linuxvfs.c
befs: fix style issues in io.c
befs: fix style issues in inode.c
befs: fix style issues in debug.c
23 Dec, 2016
7 commits
-
sparse says:
fs/ufs/inode.c:1195:6: warning: symbol 'ufs_truncate_blocks' was not declared. Should it be static?
Note that the forward declaration in the file is already marked static.
Signed-off-by: Jeff Layton
Signed-off-by: Al Viro
-
If you have a process that has set itself to be non-dumpable, and it
then undergoes exec(2), any CLOEXEC file descriptors it has open are
"exposed" during a race window between the dumpable flags of the process
being reset for exec(2) and CLOEXEC being applied to the file
descriptors. This can be exploited by a process by attempting to access
/proc//fd/... during this window, without requiring CAP_SYS_PTRACE.

The race in question is after set_dumpable has been (for get_link,
though the trace is basically the same for readlink):

[vfs]
-> proc_pid_link_inode_operations.get_link
-> proc_pid_get_link
-> proc_fd_access_allowed
-> ptrace_may_access(task, PTRACE_MODE_READ_FSCREDS);

Which will return 0, during the race window and CLOEXEC file descriptors
will still be open during this window because do_close_on_exec has not
been called yet. As a result, the ordering of these calls should be
reversed to avoid this race window.

This is of particular concern to container runtimes, where joining a
PID namespace with file descriptors referring to the host filesystem
can result in security issues (since PRCTL_SET_DUMPABLE doesn't protect
against access of CLOEXEC file descriptors -- file descriptors which may
reference filesystem objects the container shouldn't have access to).

Cc: dev@opencontainers.org
Cc: # v3.2+
Reported-by: Michael Crosby
Signed-off-by: Aleksa Sarai
Signed-off-by: Al Viro
-
If kernfs file is empty on a first read, successive read operations
using the same file descriptor will return no data, even when data is
available. Default kernfs 'seq_next' implementation advances iterator
position even when next object is not there. Kernfs 'seq_start' for
following requests will not return iterator as position is already on
the second object.

This defect prevents monitoring badblocks sysfs files from MD raid.
They are initially empty but if data appears at some stage, userspace is
not able to read it.

Signed-off-by: Tomasz Majchrzak
Signed-off-by: Miklos Szeredi
Signed-off-by: Al Viro
-
Strengthen the checking of pos/len vs. i_size, clarify the return values
for the clone prep function, and remove pointless code.

Reviewed-by: Christoph Hellwig
Signed-off-by: Darrick J. Wong
Signed-off-by: Al Viro
-
... and fix the minor buglet in compat io_submit() - native one
kills ioctx as cleanup when put_user() fails. Get rid of
bogus compat_... in !CONFIG_AIO case, while we are at it - they
should simply fail with ENOSYS, same as for native counterparts.

Signed-off-by: Al Viro
-
This allows sending larger than 1 MB requests to devices that support
large I/O sizes.

Signed-off-by: Christoph Hellwig
Reported-by: Laurence Oberman
Signed-off-by: Jens Axboe
22 Dec, 2016
8 commits
-
Implement mandatory export_operations, so it is possible to export befs via
nfs.

Signed-off-by: Luis de Bethencourt
-
Removing all trailing whitespaces in befs.
I was skeptical about tainting the history with this, but whitespace changes
can be ignored by using 'git blame -w' and 'git log -w'.

Signed-off-by: Luis de Bethencourt
-
No idea why some comments have signatures. These predate git. Removing them
since they add noise and no information.

Signed-off-by: Luis de Bethencourt
-
Fixing checkpatch.pl issues in befs header files:

WARNING: Missing a blank line after declarations
+ befs_inode_addr iaddr;
+ iaddr.allocation_group = blockno >> BEFS_SB(sb)->ag_shift;

WARNING: space prohibited between function name and open parenthesis '('
+ return BEFS_SB(sb)->block_size / sizeof (befs_disk_inode_addr);

ERROR: "foo * bar" should be "foo *bar"
+ const char *key, befs_off_t * value);

ERROR: Macros with complex values should be enclosed in parentheses
+#define PACKED __attribute__ ((__packed__))

Signed-off-by: Luis de Bethencourt
-
Fix the following type of checkpatch.pl issues:

WARNING: line over 80 characters
+static struct dentry *befs_lookup(struct inode *, struct dentry *, unsigned int);

ERROR: code indent should use tabs where possible
+ if (!bi)$

WARNING: please, no spaces at the start of a line
+ if (!bi)$

WARNING: labels should not be indented
+ unacquire_bh:

WARNING: space prohibited between function name and open parenthesis '('
+ sizeof (struct befs_inode_info),

WARNING: braces {} are not necessary for single statement blocks
+ if (!*out) {
+ return -ENOMEM;
+ }

WARNING: Block comments use a trailing */ on a separate line
+ * in special cases */

WARNING: Missing a blank line after declarations
+ int token;
+ if (!*p)

ERROR: do not use assignment in if condition
+ if (!(bh = sb_bread(sb, sb_block))) {

ERROR: space prohibited after that open parenthesis '('
+ if( befs_sb->num_blocks > ~((sector_t)0) ) {

ERROR: space prohibited before that close parenthesis ')'
+ if( befs_sb->num_blocks > ~((sector_t)0) ) {

ERROR: space required before the open parenthesis '('
+ if( befs_sb->num_blocks > ~((sector_t)0) ) {

Signed-off-by: Luis de Bethencourt
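
Several of the warnings listed above can be seen side by side in one small function. This is a hypothetical standalone example, not befs code: `struct item` and `item_alloc()` are invented for illustration, and the flagged form is kept in the comment for comparison.

```c
#include <stdlib.h>

/* Hypothetical type, used only to illustrate the style rules. */
struct item {
	int token;
};

/*
 * Form that checkpatch.pl would flag:
 *
 *	if (!(it = malloc(sizeof (struct item)))) {
 *		return NULL;
 *	}
 */
struct item *item_alloc(int token)
{
	struct item *it;

	/* Assignment moved out of the condition; no space after sizeof. */
	it = malloc(sizeof(struct item));
	if (!it)
		return NULL;	/* single statement, so no braces */

	it->token = token;
	return it;
}
```

The caller owns the returned pointer and releases it with free().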
-
Signed-off-by: Luis de Bethencourt
-
Fixing the two following checkpatch.pl issues:

ERROR: trailing whitespace
+ * Based on portions of file.c and inode.c $

WARNING: labels should not be indented
+ error:

Signed-off-by: Luis de Bethencourt
-
Fixing the following checkpatch.pl errors and warning:

ERROR: trailing whitespace
+ * $

WARNING: Block comments use * on subsequent lines
+/*
+ Validates the correctness of the befs inode

ERROR: "foo * bar" should be "foo *bar"
+befs_check_inode(struct super_block *sb, befs_inode * raw_inode,

Signed-off-by: Luis de Bethencourt