22 Nov, 2018
1 commit
-
It returns EINVAL when the operation is not supported by the
filesystem. Fix it to return EOPNOTSUPP to be consistent with
the man page and clone_file_range().Clean up the inconsistent error return handling while I'm there.
(I know, lipstick on a pig, but every little bit helps...)Signed-off-by: Dave Chinner
Reviewed-by: Christoph Hellwig
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
03 Nov, 2018
1 commit
-
Pull vfs dedup fixes from Dave Chinner:
"This reworks the vfs data cloning infrastructure.We discovered many issues with these interfaces late in the 4.19 cycle
- the worst of them (data corruption, setuid stripping) were fixed for
XFS in 4.19-rc8, but a larger rework of the infrastructure fixing all
the problems was needed. That rework is the contents of this pull
request.Rework the vfs_clone_file_range and vfs_dedupe_file_range
infrastructure to use a common .remap_file_range method and supply
generic bounds and sanity checking functions that are shared with the
data write path. The current VFS infrastructure has problems with
rlimit, LFS file sizes, file time stamps, maximum filesystem file
sizes, stripping setuid bits, etc and so they are addressed in these
commits.We also introduce the ability for the ->remap_file_range methods to
return short clones so that clones for vfs_copy_file_range() don't get
rejected if the entire range can't be cloned. It also allows
filesystems to sliently skip deduplication of partial EOF blocks if
they are not capable of doing so without requiring errors to be thrown
to userspace.Existing filesystems are converted to user the new remap_file_range
method, and both XFS and ocfs2 are modified to make use of the new
generic checking infrastructure"* tag 'xfs-4.20-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (28 commits)
xfs: remove [cm]time update from reflink calls
xfs: remove xfs_reflink_remap_range
xfs: remove redundant remap partial EOF block checks
xfs: support returning partial reflink results
xfs: clean up xfs_reflink_remap_blocks call site
xfs: fix pagecache truncation prior to reflink
ocfs2: remove ocfs2_reflink_remap_range
ocfs2: support partial clone range and dedupe range
ocfs2: fix pagecache truncation prior to reflink
ocfs2: truncate page cache for clone destination file before remapping
vfs: clean up generic_remap_file_range_prep return value
vfs: hide file range comparison function
vfs: enable remap callers that can handle short operations
vfs: plumb remap flags through the vfs dedupe functions
vfs: plumb remap flags through the vfs clone functions
vfs: make remap_file_range functions take and return bytes completed
vfs: remap helper should update destination inode metadata
vfs: pass remap flags to generic_remap_checks
vfs: pass remap flags to generic_remap_file_range_prep
vfs: combine the clone and dedupe into a single remap_file_range
...
02 Nov, 2018
1 commit
-
Pull misc vfs updates from Al Viro:
"No common topic, really - a handful of assorted stuff; the least
trivial bits are Mark's dedupe patches"* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs/exofs: only use true/false for asignment of bool type variable
fs/exofs: fix potential memory leak in mount option parsing
Delete invalid assignment statements in do_sendfile
iomap: remove duplicated include from iomap.c
vfs: dedupe should return EPERM if permission is not granted
vfs: allow dedupe of user owned read-only files
ntfs: don't open-code ERR_CAST
ext4: don't open-code ERR_CAST
30 Oct, 2018
17 commits
-
Since the remap prep function can update the length of the remap
request, we can change this function to return the usual return status
instead of the odd behavior it has now.Signed-off-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig
Signed-off-by: Dave Chinner -
There are no callers of vfs_dedupe_file_range_compare, so we might as
well make it a static helper and remove the export.Signed-off-by: Darrick J. Wong
Reviewed-by: Amir Goldstein
Reviewed-by: Christoph Hellwig
Signed-off-by: Dave Chinner -
Plumb in a remap flag that enables the filesystem remap handler to
shorten remapping requests for callers that can handle it. Now
copy_file_range can report partial success (in case we run up against
alignment problems, resource limits, etc.).We also enable CAN_SHORTEN for fideduperange to maintain existing
userspace-visible behavior where xfs/btrfs shorten the dedupe range to
avoid stale post-eof data exposure.Signed-off-by: Darrick J. Wong
Reviewed-by: Amir Goldstein
Signed-off-by: Dave Chinner -
Plumb a remap_flags argument through the vfs_dedupe_file_range_one
functions so that dedupe can take advantage of it.Signed-off-by: Darrick J. Wong
Reviewed-by: Amir Goldstein
Signed-off-by: Dave Chinner -
Plumb a remap_flags argument through the {do,vfs}_clone_file_range
functions so that clone can take advantage of it.Signed-off-by: Darrick J. Wong
Reviewed-by: Amir Goldstein
Signed-off-by: Dave Chinner -
Change the remap_file_range functions to take a number of bytes to
operate upon and return the number of bytes they operated on. This is a
requirement for allowing fs implementations to return short clone/dedupe
results to the user, which will enable us to obey resource limits in a
graceful manner.A subsequent patch will enable copy_file_range to signal to the
->clone_file_range implementation that it can handle a short length,
which will be returned in the function's return value. For now the
short return is not implemented anywhere so the behavior won't change --
either copy_file_range manages to clone the entire range or it tries an
alternative.Neither clone ioctl can take advantage of this, alas.
Signed-off-by: Darrick J. Wong
Reviewed-by: Amir Goldstein
Signed-off-by: Dave Chinner -
Extend generic_remap_file_range_prep to handle inode metadata updates
when remapping into a file. If the operation can possibly alter the
file contents, we must update the ctime and mtime and remove security
privileges, just like we do for regular file writes.Signed-off-by: Darrick J. Wong
Reviewed-by: Amir Goldstein
Signed-off-by: Dave Chinner -
Pass the same remap flags to generic_remap_checks for consistency.
Signed-off-by: Darrick J. Wong
Reviewed-by: Amir Goldstein
Reviewed-by: Christoph Hellwig
Signed-off-by: Dave Chinner -
Plumb the remap flags through the filesystem from the vfs function
dispatcher all the way to the prep function to prepare for behavior
changes in subsequent patches.Signed-off-by: Darrick J. Wong
Reviewed-by: Amir Goldstein
Reviewed-by: Christoph Hellwig
Signed-off-by: Dave Chinner -
Combine the clone_file_range and dedupe_file_range operations into a
single remap_file_range file operation dispatch since they're
fundamentally the same operation. The differences between the two can
be made in the prep functions.Signed-off-by: Darrick J. Wong
Reviewed-by: Amir Goldstein
Reviewed-by: Christoph Hellwig
Signed-off-by: Dave Chinner -
Since we use clone_verify_area for both clone and dedupe range checks,
rename the function to make it clear that it's for both.Signed-off-by: Darrick J. Wong
Reviewed-by: Amir Goldstein
Signed-off-by: Dave Chinner -
The vfs_clone_file_prep is a generic function to be called by filesystem
implementations only. Rename the prefix to generic_ and make it more
clear that it applies to remap operations, not just clones.Signed-off-by: Darrick J. Wong
Reviewed-by: Amir Goldstein
Signed-off-by: Dave Chinner -
Don't bother calling the filesystem for a zero-length dedupe request;
we can return zero and exit.Signed-off-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig
Reviewed-by: Amir Goldstein
Signed-off-by: Dave Chinner -
A deduplication data corruption is exposed in XFS and btrfs. It is
caused by extending the block match range to include the partial EOF
block, but then allowing unknown data beyond EOF to be considered a
"match" to data in the destination file because the comparison is only
made to the end of the source file. This corrupts the destination file
when the source extent is shared with it.The VFS remapping prep functions only support whole block dedupe, but
we still need to appear to support whole file dedupe correctly. Hence
if the dedupe request includes the last block of the souce file, don't
include it in the actual dedupe operation. If the rest of the range
dedupes successfully, then reject the entire request. A subsequent
patch will enable us to shorten dedupe requests correctly.When reflinking sub-file ranges, a data corruption can occur when the
source file range includes a partial EOF block. This shares the unknown
data beyond EOF into the second file at a position inside EOF, exposing
stale data in the second file.If the reflink request includes the last block of the souce file, only
proceed with the reflink operation if it lands at or past the
destination file's current EOF. If it lands within the destination file
EOF, reject the entire request with -EINVAL and make the caller go the
hard way. A subsequent patch will enable us to shorten reflink requests
correctly.Signed-off-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig
Signed-off-by: Dave Chinner -
If a remap caller asks us to remap to the source file's EOF and the
source file length leaves us with a zero byte request, exit early.Signed-off-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig
Signed-off-by: Dave Chinner -
Move the file range checks from vfs_clone_file_prep into a separate
generic_remap_checks function so that all the checks are collected in a
central location. This forms the basis for adding more checks from
generic_write_checks that will make cloning's input checking more
consistent with write input checking.Signed-off-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig
Reviewed-by: Amir Goldstein
Signed-off-by: Dave Chinner -
vfs_clone_file_prep_inodes cannot return 0 if it is asked to remap from
a zero byte file because that's what btrfs does.Signed-off-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig
Signed-off-by: Dave Chinner
26 Oct, 2018
1 commit
-
Pull timekeeping updates from Thomas Gleixner:
"The timers and timekeeping departement provides:- Another large y2038 update with further preparations for providing
the y2038 safe timespecs closer to the syscalls.- An overhaul of the SHCMT clocksource driver
- SPDX license identifier updates
- Small cleanups and fixes all over the place"
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits)
tick/sched : Remove redundant cpu_online() check
clocksource/drivers/dw_apb: Add reset control
clocksource: Remove obsolete CLOCKSOURCE_OF_DECLARE
clocksource/drivers: Unify the names to timer-* format
clocksource/drivers/sh_cmt: Add R-Car gen3 support
dt-bindings: timer: renesas: cmt: document R-Car gen3 support
clocksource/drivers/sh_cmt: Properly line-wrap sh_cmt_of_table[] initializer
clocksource/drivers/sh_cmt: Fix clocksource width for 32-bit machines
clocksource/drivers/sh_cmt: Fixup for 64-bit machines
clocksource/drivers/sh_tmu: Convert to SPDX identifiers
clocksource/drivers/sh_mtu2: Convert to SPDX identifiers
clocksource/drivers/sh_cmt: Convert to SPDX identifiers
clocksource/drivers/renesas-ostm: Convert to SPDX identifiers
clocksource: Convert to using %pOFn instead of device_node.name
tick/broadcast: Remove redundant check
RISC-V: Request newstat syscalls
y2038: signal: Change rt_sigtimedwait to use __kernel_timespec
y2038: socket: Change recvmmsg to use __kernel_timespec
y2038: sched: Change sched_rr_get_interval to use __kernel_timespec
y2038: utimes: Rework #ifdef guards for compat syscalls
...
18 Oct, 2018
3 commits
-
Assigning value -EINVAL to "retval" here, but that stored value is
overwritten before it can be used.retval = -EINVAL;
....
retval = rw_verify_area(WRITE, out.file, &out_pos, count);value_overwrite: Overwriting previous write to "retval" with value
from rw_verify_areadelete invalid assignment statements
Signed-off-by: n00202754
Signed-off-by: Al Viro -
Right now we return EINVAL if a process does not have permission to dedupe a
file. This was an oversight on my part. EPERM gives a true description of
the nature of our error, and EINVAL is already used for the case that the
filesystem does not support dedupe.Signed-off-by: Mark Fasheh
Reviewed-by: Darrick J. Wong
Acked-by: David Sterba
Signed-off-by: Al Viro -
The permission check in vfs_dedupe_file_range_one() is too coarse - We only
allow dedupe of the destination file if the user is root, or they have the
file open for write.This effectively limits a non-root user from deduping their own read-only
files. In addition, the write file descriptor that the user is forced to
hold open can prevent execution of files. As file data during a dedupe
does not change, the behavior is unexpected and this has caused a number of
issue reports. For an example, see:https://github.com/markfasheh/duperemove/issues/129
So change the check so we allow dedupe on the target if:
- the root or admin is asking for it
- the process has write access
- the owner of the file is asking for the dedupe
- the process could get write accessThat way users can open read-only and still get dedupe.
Signed-off-by: Mark Fasheh
Signed-off-by: Al Viro
24 Sep, 2018
1 commit
-
Commit 031a072a0b8a ("vfs: call vfs_clone_file_range() under freeze
protection") created a wrapper do_clone_file_range() around
vfs_clone_file_range() moving the freeze protection to former, so
overlayfs could call the latter.The more common vfs practice is to call do_xxx helpers from vfs_xxx
helpers, where freeze protecction is taken in the vfs_xxx helper, so
this anomality could be a source of confusion.It seems that commit 8ede205541ff ("ovl: add reflink/copyfile/dedup
support") may have fallen a victim to this confusion -
ovl_clone_file_range() calls the vfs_clone_file_range() helper in the
hope of getting freeze protection on upper fs, but in fact results in
overlayfs allowing to bypass upper fs freeze protection.Swap the names of the two helpers to conform to common vfs practice
and call the correct helpers from overlayfs and nfsd.Signed-off-by: Amir Goldstein
Signed-off-by: Miklos Szeredi
29 Aug, 2018
1 commit
-
The sys_llseek sytem call is needed on all 32-bit architectures and
none of the 64-bit ones, so we can remove the __ARCH_WANT_SYS_LLSEEK guard
and simplify the include/asm-generic/unistd.h header further.Since 32-bit tasks can run either natively or in compat mode on 64-bit
architectures, we have to check for both !CONFIG_64BIT and CONFIG_COMPAT.There are a few 64-bit architectures that also reference sys_llseek
in their 64-bit ABI (e.g. sparc), but I verified that those all
select CONFIG_COMPAT, so the #if check is still correct here. It's
a bit odd to include it in the syscall table though, as it's the
same as sys_lseek() on 64-bit, but with strange calling conventions.Acked-by: Geert Uytterhoeven
Reviewed-by: Christoph Hellwig
Signed-off-by: Arnd Bergmann
18 Jul, 2018
1 commit
-
This is needed by the stacked dedupe implementation in overlayfs.
Signed-off-by: Miklos Szeredi
07 Jul, 2018
4 commits
-
Extract vfs_dedupe_file_range_one() helper to deal with a single dedup
request.Signed-off-by: Miklos Szeredi
Reviewed-by: Christoph Hellwig -
Clean up f_op->dedupe_file_range() interface.
1) Use loff_t for offsets and length instead of u64
2) Order the arguments the same way as {copy|clone}_file_range().Signed-off-by: Miklos Szeredi
Reviewed-by: Darrick J. Wong -
Signed-off-by: Miklos Szeredi
-
Suggested-by: Darrick J. Wong
Signed-off-by: Miklos Szeredi
13 Jun, 2018
1 commit
-
The kmalloc() function has a 2-factor argument form, kmalloc_array(). This
patch replaces cases of:kmalloc(a * b, gfp)
with:
kmalloc_array(a * b, gfp)as well as handling cases of:
kmalloc(a * b * c, gfp)
with:
kmalloc(array3_size(a, b, c), gfp)
as it's slightly less ugly than:
kmalloc_array(array_size(a, b), c, gfp)
This does, however, attempt to ignore constant size factors like:
kmalloc(4 * 1024, gfp)
though any constants defined via macros get caught up in the conversion.
Any factors with a sizeof() of "unsigned char", "char", and "u8" were
dropped, since they're redundant.The tools/ directory was manually excluded, since it has its own
implementation of kmalloc().The Coccinelle script used for this was:
// Fix redundant parens around sizeof().
@@
type TYPE;
expression THING, E;
@@(
kmalloc(
- (sizeof(TYPE)) * E
+ sizeof(TYPE) * E
, ...)
|
kmalloc(
- (sizeof(THING)) * E
+ sizeof(THING) * E
, ...)
)// Drop single-byte sizes and redundant parens.
@@
expression COUNT;
typedef u8;
typedef __u8;
@@(
kmalloc(
- sizeof(u8) * (COUNT)
+ COUNT
, ...)
|
kmalloc(
- sizeof(__u8) * (COUNT)
+ COUNT
, ...)
|
kmalloc(
- sizeof(char) * (COUNT)
+ COUNT
, ...)
|
kmalloc(
- sizeof(unsigned char) * (COUNT)
+ COUNT
, ...)
|
kmalloc(
- sizeof(u8) * COUNT
+ COUNT
, ...)
|
kmalloc(
- sizeof(__u8) * COUNT
+ COUNT
, ...)
|
kmalloc(
- sizeof(char) * COUNT
+ COUNT
, ...)
|
kmalloc(
- sizeof(unsigned char) * COUNT
+ COUNT
, ...)
)// 2-factor product with sizeof(type/expression) and identifier or constant.
@@
type TYPE;
expression THING;
identifier COUNT_ID;
constant COUNT_CONST;
@@(
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * (COUNT_ID)
+ COUNT_ID, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * COUNT_ID
+ COUNT_ID, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * (COUNT_CONST)
+ COUNT_CONST, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * COUNT_CONST
+ COUNT_CONST, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * (COUNT_ID)
+ COUNT_ID, sizeof(THING)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * COUNT_ID
+ COUNT_ID, sizeof(THING)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * (COUNT_CONST)
+ COUNT_CONST, sizeof(THING)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * COUNT_CONST
+ COUNT_CONST, sizeof(THING)
, ...)
)// 2-factor product, only identifiers.
@@
identifier SIZE, COUNT;
@@- kmalloc
+ kmalloc_array
(
- SIZE * COUNT
+ COUNT, SIZE
, ...)// 3-factor product with 1 sizeof(type) or sizeof(expression), with
// redundant parens removed.
@@
expression THING;
identifier STRIDE, COUNT;
type TYPE;
@@(
kmalloc(
- sizeof(TYPE) * (COUNT) * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
kmalloc(
- sizeof(TYPE) * (COUNT) * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
kmalloc(
- sizeof(TYPE) * COUNT * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
kmalloc(
- sizeof(TYPE) * COUNT * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
kmalloc(
- sizeof(THING) * (COUNT) * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
|
kmalloc(
- sizeof(THING) * (COUNT) * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
|
kmalloc(
- sizeof(THING) * COUNT * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
|
kmalloc(
- sizeof(THING) * COUNT * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
)// 3-factor product with 2 sizeof(variable), with redundant parens removed.
@@
expression THING1, THING2;
identifier COUNT;
type TYPE1, TYPE2;
@@(
kmalloc(
- sizeof(TYPE1) * sizeof(TYPE2) * COUNT
+ array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
, ...)
|
kmalloc(
- sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+ array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
, ...)
|
kmalloc(
- sizeof(THING1) * sizeof(THING2) * COUNT
+ array3_size(COUNT, sizeof(THING1), sizeof(THING2))
, ...)
|
kmalloc(
- sizeof(THING1) * sizeof(THING2) * (COUNT)
+ array3_size(COUNT, sizeof(THING1), sizeof(THING2))
, ...)
|
kmalloc(
- sizeof(TYPE1) * sizeof(THING2) * COUNT
+ array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
, ...)
|
kmalloc(
- sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+ array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
, ...)
)// 3-factor product, only identifiers, with redundant parens removed.
@@
identifier STRIDE, SIZE, COUNT;
@@(
kmalloc(
- (COUNT) * STRIDE * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- COUNT * (STRIDE) * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- COUNT * STRIDE * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- (COUNT) * (STRIDE) * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- COUNT * (STRIDE) * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- (COUNT) * STRIDE * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- (COUNT) * (STRIDE) * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- COUNT * STRIDE * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
)// Any remaining multi-factor products, first at least 3-factor products,
// when they're not all constants...
@@
expression E1, E2, E3;
constant C1, C2, C3;
@@(
kmalloc(C1 * C2 * C3, ...)
|
kmalloc(
- (E1) * E2 * E3
+ array3_size(E1, E2, E3)
, ...)
|
kmalloc(
- (E1) * (E2) * E3
+ array3_size(E1, E2, E3)
, ...)
|
kmalloc(
- (E1) * (E2) * (E3)
+ array3_size(E1, E2, E3)
, ...)
|
kmalloc(
- E1 * E2 * E3
+ array3_size(E1, E2, E3)
, ...)
)// And then all remaining 2 factors products when they're not all constants,
// keeping sizeof() as the second factor argument.
@@
expression THING, E1, E2;
type TYPE;
constant C1, C2, C3;
@@(
kmalloc(sizeof(THING) * C2, ...)
|
kmalloc(sizeof(TYPE) * C2, ...)
|
kmalloc(C1 * C2 * C3, ...)
|
kmalloc(C1 * C2, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * (E2)
+ E2, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * E2
+ E2, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * (E2)
+ E2, sizeof(THING)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * E2
+ E2, sizeof(THING)
, ...)
|
- kmalloc
+ kmalloc_array
(
- (E1) * E2
+ E1, E2
, ...)
|
- kmalloc
+ kmalloc_array
(
- (E1) * (E2)
+ E1, E2
, ...)
|
- kmalloc
+ kmalloc_array
(
- E1 * E2
+ E1, E2
, ...)
)Signed-off-by: Kees Cook
16 Apr, 2018
1 commit
-
It's a fairly inconsequential bug, since fdput() won't actually try to
fput() the file due to fd.flags (and thus FDPUT_FPUT) being zero in
the failure case, but most other vfs code takes steps to avoid this.Signed-off-by: Zev Weiss
Signed-off-by: Al Viro
03 Apr, 2018
4 commits
-
Using the ksys_p{read,write}64() wrappers allows us to get rid of
in-kernel calls to the sys_pread64() and sys_pwrite64() syscalls.
The ksys_ prefix denotes that this function is meant as a drop-in
replacement for the syscall. In particular, it uses the same calling
convention as sys_p{read,write}64().This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.netCc: Al Viro
Cc: Andrew Morton
Signed-off-by: Dominik Brodowski -
Using this helper allows us to avoid the in-kernel calls to the
sys_read() syscall. The ksys_ prefix denotes that this function
is meant as a drop-in replacement for the syscall. In particular, it
uses the same calling convention as sys_read().This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.netCc: Alexander Viro
Signed-off-by: Dominik Brodowski -
Using this helper allows us to avoid the in-kernel calls to the
sys_lseek() syscall. The ksys_ prefix denotes that this function
is meant as a drop-in replacement for the syscall. In particular, it
uses the same calling convention as sys_lseek().This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.netCc: Alexander Viro
Signed-off-by: Dominik Brodowski -
Using this helper allows us to avoid the in-kernel calls to the sys_write()
syscall. The ksys_ prefix denotes that this function is meant as a drop-in
replacement for the syscall. In particular, it uses the same calling
convention as sys_write().In the near future, the do_mounts / initramfs callers of ksys_write()
should be converted to use filp_open() and vfs_write() instead.This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.netCc: Alexander Viro
Cc: linux-s390@vger.kernel.org
Signed-off-by: Dominik Brodowski
18 Nov, 2017
1 commit
-
Pull iov_iter updates from Al Viro:
- bio_{map,copy}_user_iov() series; those are cleanups - fixes from the
same pile went into mainline (and stable) in late September.- fs/iomap.c iov_iter-related fixes
- new primitive - iov_iter_for_each_range(), which applies a function
to kernel-mapped segments of an iov_iter.Usable for kvec and bvec ones, the latter does kmap()/kunmap() around
the callback. _Not_ usable for iovec- or pipe-backed iov_iter; the
latter is not hard to fix if the need ever appears, the former is by
design.Another related primitive will have to wait for the next cycle - it
passes page + offset + size instead of pointer + size, and that one
will be usable for everything _except_ kvec. Unfortunately, that one
didn't get exposure in -next yet, so...- a bit more lustre iov_iter work, including a use case for
iov_iter_for_each_range() (checksum calculation)- vhost/scsi leak fix in failure exit
- misc cleanups and detritectomy...
* 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (21 commits)
iomap_dio_actor(): fix iov_iter bugs
switch ksocknal_lib_recv_...() to use of iov_iter_for_each_range()
lustre: switch struct ksock_conn to iov_iter
vhost/scsi: switch to iov_iter_get_pages()
fix a page leak in vhost_scsi_iov_to_sgl() error recovery
new primitive: iov_iter_for_each_range()
lnet_return_rx_credits_locked: don't abuse list_entry
xen: don't open-code iov_iter_kvec()
orangefs: remove detritus from struct orangefs_kiocb_s
kill iov_shorten()
bio_alloc_map_data(): do bmd->iter setup right there
bio_copy_user_iov(): saner bio size calculation
bio_map_user_iov(): get rid of copying iov_iter
bio_copy_from_iter(): get rid of copying iov_iter
move more stuff down into bio_copy_user_iov()
blk_rq_map_user_iov(): move iov_iter_advance() down
bio_map_user_iov(): get rid of the iov_for_each()
bio_map_user_iov(): move alignment check into the main loop
don't rely upon subsequent bio_add_pc_page() calls failing
... and with iov_iter_get_pages_alloc() it becomes even simpler
...
02 Nov, 2017
1 commit
-
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.By default all files without license information are under the default
license of the kernel, which is GPL version 2.Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if
Reviewed-by: Philippe Ombredanne
Reviewed-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman
12 Oct, 2017
1 commit
-
Signed-off-by: Al Viro