Eric Lee / smarc-fsl-linux-kernel

22 Nov, 2018

1 commit

494633fac vfs: vfs_dedupe_file_range() doesn't return EOPNOTSUPP ... Browse Code »

It returns EINVAL when the operation is not supported by the
filesystem. Fix it to return EOPNOTSUPP to be consistent with
the man page and clone_file_range().

Clean up the inconsistent error return handling while I'm there.
(I know, lipstick on a pig, but every little bit helps...)

Signed-off-by: Dave Chinner
Reviewed-by: Christoph Hellwig
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong

Dave Chinner
2018-11-22 02:10:54 +0800

03 Nov, 2018

1 commit

c2aa1a444 Merge tag 'xfs-4.20-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux ... Browse Code »

Pull vfs dedup fixes from Dave Chinner:
"This reworks the vfs data cloning infrastructure.

We discovered many issues with these interfaces late in the 4.19 cycle
- the worst of them (data corruption, setuid stripping) were fixed for
XFS in 4.19-rc8, but a larger rework of the infrastructure fixing all
the problems was needed. That rework is the contents of this pull
request.

Rework the vfs_clone_file_range and vfs_dedupe_file_range
infrastructure to use a common .remap_file_range method and supply
generic bounds and sanity checking functions that are shared with the
data write path. The current VFS infrastructure has problems with
rlimit, LFS file sizes, file time stamps, maximum filesystem file
sizes, stripping setuid bits, etc and so they are addressed in these
commits.

We also introduce the ability for the ->remap_file_range methods to
return short clones so that clones for vfs_copy_file_range() don't get
rejected if the entire range can't be cloned. It also allows
filesystems to sliently skip deduplication of partial EOF blocks if
they are not capable of doing so without requiring errors to be thrown
to userspace.

Existing filesystems are converted to user the new remap_file_range
method, and both XFS and ocfs2 are modified to make use of the new
generic checking infrastructure"

* tag 'xfs-4.20-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (28 commits)
xfs: remove [cm]time update from reflink calls
xfs: remove xfs_reflink_remap_range
xfs: remove redundant remap partial EOF block checks
xfs: support returning partial reflink results
xfs: clean up xfs_reflink_remap_blocks call site
xfs: fix pagecache truncation prior to reflink
ocfs2: remove ocfs2_reflink_remap_range
ocfs2: support partial clone range and dedupe range
ocfs2: fix pagecache truncation prior to reflink
ocfs2: truncate page cache for clone destination file before remapping
vfs: clean up generic_remap_file_range_prep return value
vfs: hide file range comparison function
vfs: enable remap callers that can handle short operations
vfs: plumb remap flags through the vfs dedupe functions
vfs: plumb remap flags through the vfs clone functions
vfs: make remap_file_range functions take and return bytes completed
vfs: remap helper should update destination inode metadata
vfs: pass remap flags to generic_remap_checks
vfs: pass remap flags to generic_remap_file_range_prep
vfs: combine the clone and dedupe into a single remap_file_range
...

Linus Torvalds
2018-11-03 00:33:08 +0800

02 Nov, 2018

1 commit

8adcc5997 Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs ... Browse Code »

Pull misc vfs updates from Al Viro:
"No common topic, really - a handful of assorted stuff; the least
trivial bits are Mark's dedupe patches"

* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs/exofs: only use true/false for asignment of bool type variable
fs/exofs: fix potential memory leak in mount option parsing
Delete invalid assignment statements in do_sendfile
iomap: remove duplicated include from iomap.c
vfs: dedupe should return EPERM if permission is not granted
vfs: allow dedupe of user owned read-only files
ntfs: don't open-code ERR_CAST
ext4: don't open-code ERR_CAST

Linus Torvalds
2018-11-02 11:19:49 +0800

30 Oct, 2018

17 commits

8c5c836bd vfs: clean up generic_remap_file_range_prep return value ... Browse Code »

Since the remap prep function can update the length of the remap
request, we can change this function to return the usual return status
instead of the odd behavior it has now.

Signed-off-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig
Signed-off-by: Dave Chinner

Darrick J. Wong
2018-10-30 07:42:24 +0800
c32e5f399 vfs: hide file range comparison function ... Browse Code »

There are no callers of vfs_dedupe_file_range_compare, so we might as
well make it a static helper and remove the export.

Signed-off-by: Darrick J. Wong
Reviewed-by: Amir Goldstein
Reviewed-by: Christoph Hellwig
Signed-off-by: Dave Chinner

Darrick J. Wong
2018-10-30 07:42:17 +0800
eca3654e3 vfs: enable remap callers that can handle short operations ... Browse Code »

Plumb in a remap flag that enables the filesystem remap handler to
shorten remapping requests for callers that can handle it. Now
copy_file_range can report partial success (in case we run up against
alignment problems, resource limits, etc.).

We also enable CAN_SHORTEN for fideduperange to maintain existing
userspace-visible behavior where xfs/btrfs shorten the dedupe range to
avoid stale post-eof data exposure.

Signed-off-by: Darrick J. Wong
Reviewed-by: Amir Goldstein
Signed-off-by: Dave Chinner

Darrick J. Wong
2018-10-30 07:42:10 +0800
df3658361 vfs: plumb remap flags through the vfs dedupe functions ... Browse Code »

Plumb a remap_flags argument through the vfs_dedupe_file_range_one
functions so that dedupe can take advantage of it.

Signed-off-by: Darrick J. Wong
Reviewed-by: Amir Goldstein
Signed-off-by: Dave Chinner

Darrick J. Wong
2018-10-30 07:42:03 +0800
452ce6595 vfs: plumb remap flags through the vfs clone functions ... Browse Code »

Plumb a remap_flags argument through the {do,vfs}_clone_file_range
functions so that clone can take advantage of it.

Signed-off-by: Darrick J. Wong
Reviewed-by: Amir Goldstein
Signed-off-by: Dave Chinner

Darrick J. Wong
2018-10-30 07:41:56 +0800
42ec3d4c0 vfs: make remap_file_range functions take and return bytes completed ... Browse Code »

Change the remap_file_range functions to take a number of bytes to
operate upon and return the number of bytes they operated on. This is a
requirement for allowing fs implementations to return short clone/dedupe
results to the user, which will enable us to obey resource limits in a
graceful manner.

A subsequent patch will enable copy_file_range to signal to the
->clone_file_range implementation that it can handle a short length,
which will be returned in the function's return value. For now the
short return is not implemented anywhere so the behavior won't change --
either copy_file_range manages to clone the entire range or it tries an
alternative.

Neither clone ioctl can take advantage of this, alas.

Signed-off-by: Darrick J. Wong
Reviewed-by: Amir Goldstein
Signed-off-by: Dave Chinner

Darrick J. Wong
2018-10-30 07:41:49 +0800
8dde90bca vfs: remap helper should update destination inode metadata ... Browse Code »

Extend generic_remap_file_range_prep to handle inode metadata updates
when remapping into a file. If the operation can possibly alter the
file contents, we must update the ctime and mtime and remove security
privileges, just like we do for regular file writes.

Signed-off-by: Darrick J. Wong
Reviewed-by: Amir Goldstein
Signed-off-by: Dave Chinner

Darrick J. Wong
2018-10-30 07:41:41 +0800
3d28193e1 vfs: pass remap flags to generic_remap_checks ... Browse Code »

Pass the same remap flags to generic_remap_checks for consistency.

Signed-off-by: Darrick J. Wong
Reviewed-by: Amir Goldstein
Reviewed-by: Christoph Hellwig
Signed-off-by: Dave Chinner

Darrick J. Wong
2018-10-30 07:41:34 +0800
a91ae49bb vfs: pass remap flags to generic_remap_file_range_prep ... Browse Code »

Plumb the remap flags through the filesystem from the vfs function
dispatcher all the way to the prep function to prepare for behavior
changes in subsequent patches.

Signed-off-by: Darrick J. Wong
Reviewed-by: Amir Goldstein
Reviewed-by: Christoph Hellwig
Signed-off-by: Dave Chinner

Darrick J. Wong
2018-10-30 07:41:28 +0800
2e5dfc99f vfs: combine the clone and dedupe into a single remap_file_range ... Browse Code »

Combine the clone_file_range and dedupe_file_range operations into a
single remap_file_range file operation dispatch since they're
fundamentally the same operation. The differences between the two can
be made in the prep functions.

Signed-off-by: Darrick J. Wong
Reviewed-by: Amir Goldstein
Reviewed-by: Christoph Hellwig
Signed-off-by: Dave Chinner

Darrick J. Wong
2018-10-30 07:41:21 +0800
6095028b4 vfs: rename clone_verify_area to remap_verify_area ... Browse Code »

Since we use clone_verify_area for both clone and dedupe range checks,
rename the function to make it clear that it's for both.

Signed-off-by: Darrick J. Wong
Reviewed-by: Amir Goldstein
Signed-off-by: Dave Chinner

Darrick J. Wong
2018-10-30 07:41:14 +0800
a83ab01a6 vfs: rename vfs_clone_file_prep to be more descriptive ... Browse Code »

The vfs_clone_file_prep is a generic function to be called by filesystem
implementations only. Rename the prefix to generic_ and make it more
clear that it applies to remap operations, not just clones.

Signed-off-by: Darrick J. Wong
Reviewed-by: Amir Goldstein
Signed-off-by: Dave Chinner

Darrick J. Wong
2018-10-30 07:41:08 +0800
9aae20500 vfs: skip zero-length dedupe requests ... Browse Code »

Don't bother calling the filesystem for a zero-length dedupe request;
we can return zero and exit.

Signed-off-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig
Reviewed-by: Amir Goldstein
Signed-off-by: Dave Chinner

Darrick J. Wong
2018-10-30 07:41:01 +0800
07d19dc9f vfs: avoid problematic remapping requests into partial EOF block ... Browse Code »

A deduplication data corruption is exposed in XFS and btrfs. It is
caused by extending the block match range to include the partial EOF
block, but then allowing unknown data beyond EOF to be considered a
"match" to data in the destination file because the comparison is only
made to the end of the source file. This corrupts the destination file
when the source extent is shared with it.

The VFS remapping prep functions only support whole block dedupe, but
we still need to appear to support whole file dedupe correctly. Hence
if the dedupe request includes the last block of the souce file, don't
include it in the actual dedupe operation. If the rest of the range
dedupes successfully, then reject the entire request. A subsequent
patch will enable us to shorten dedupe requests correctly.

When reflinking sub-file ranges, a data corruption can occur when the
source file range includes a partial EOF block. This shares the unknown
data beyond EOF into the second file at a position inside EOF, exposing
stale data in the second file.

If the reflink request includes the last block of the souce file, only
proceed with the reflink operation if it lands at or past the
destination file's current EOF. If it lands within the destination file
EOF, reject the entire request with -EINVAL and make the caller go the
hard way. A subsequent patch will enable us to shorten reflink requests
correctly.

Signed-off-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig
Signed-off-by: Dave Chinner

Darrick J. Wong
2018-10-30 07:40:55 +0800
2c5773f10 vfs: exit early from zero length remap operations ... Browse Code »

If a remap caller asks us to remap to the source file's EOF and the
source file length leaves us with a zero byte request, exit early.

Signed-off-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig
Signed-off-by: Dave Chinner

Darrick J. Wong
2018-10-30 07:40:39 +0800
1383a7ed6 vfs: check file ranges before cloning files ... Browse Code »

Move the file range checks from vfs_clone_file_prep into a separate
generic_remap_checks function so that all the checks are collected in a
central location. This forms the basis for adding more checks from
generic_write_checks that will make cloning's input checking more
consistent with write input checking.

Signed-off-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig
Reviewed-by: Amir Goldstein
Signed-off-by: Dave Chinner

Darrick J. Wong
2018-10-30 07:40:31 +0800
5b49f64db vfs: vfs_clone_file_prep_inodes should return EINVAL for a clone from beyond EOF ... Browse Code »

vfs_clone_file_prep_inodes cannot return 0 if it is asked to remap from
a zero byte file because that's what btrfs does.

Signed-off-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig
Signed-off-by: Dave Chinner

Darrick J. Wong
2018-10-30 07:40:22 +0800

26 Oct, 2018

1 commit

4dcb9239d Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull timekeeping updates from Thomas Gleixner:
"The timers and timekeeping departement provides:

- Another large y2038 update with further preparations for providing
the y2038 safe timespecs closer to the syscalls.

- An overhaul of the SHCMT clocksource driver

- SPDX license identifier updates

- Small cleanups and fixes all over the place"

* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits)
tick/sched : Remove redundant cpu_online() check
clocksource/drivers/dw_apb: Add reset control
clocksource: Remove obsolete CLOCKSOURCE_OF_DECLARE
clocksource/drivers: Unify the names to timer-* format
clocksource/drivers/sh_cmt: Add R-Car gen3 support
dt-bindings: timer: renesas: cmt: document R-Car gen3 support
clocksource/drivers/sh_cmt: Properly line-wrap sh_cmt_of_table[] initializer
clocksource/drivers/sh_cmt: Fix clocksource width for 32-bit machines
clocksource/drivers/sh_cmt: Fixup for 64-bit machines
clocksource/drivers/sh_tmu: Convert to SPDX identifiers
clocksource/drivers/sh_mtu2: Convert to SPDX identifiers
clocksource/drivers/sh_cmt: Convert to SPDX identifiers
clocksource/drivers/renesas-ostm: Convert to SPDX identifiers
clocksource: Convert to using %pOFn instead of device_node.name
tick/broadcast: Remove redundant check
RISC-V: Request newstat syscalls
y2038: signal: Change rt_sigtimedwait to use __kernel_timespec
y2038: socket: Change recvmmsg to use __kernel_timespec
y2038: sched: Change sched_rr_get_interval to use __kernel_timespec
y2038: utimes: Rework #ifdef guards for compat syscalls
...

Linus Torvalds
2018-10-26 02:14:36 +0800

18 Oct, 2018

3 commits

55338ac2a Delete invalid assignment statements in do_sendfile ... Browse Code »

Assigning value -EINVAL to "retval" here, but that stored value is
overwritten before it can be used.

retval = -EINVAL;
....
retval = rw_verify_area(WRITE, out.file, &out_pos, count);

value_overwrite: Overwriting previous write to "retval" with value
from rw_verify_area

delete invalid assignment statements

Signed-off-by: n00202754
Signed-off-by: Al Viro

nixiaoming
2018-10-18 13:42:49 +0800
85c95f208 vfs: dedupe should return EPERM if permission is not granted ... Browse Code »

Right now we return EINVAL if a process does not have permission to dedupe a
file. This was an oversight on my part. EPERM gives a true description of
the nature of our error, and EINVAL is already used for the case that the
filesystem does not support dedupe.

Signed-off-by: Mark Fasheh
Reviewed-by: Darrick J. Wong
Acked-by: David Sterba
Signed-off-by: Al Viro

Mark Fasheh
2018-10-18 09:15:39 +0800
5de4480ae vfs: allow dedupe of user owned read-only files ... Browse Code »

The permission check in vfs_dedupe_file_range_one() is too coarse - We only
allow dedupe of the destination file if the user is root, or they have the
file open for write.

This effectively limits a non-root user from deduping their own read-only
files. In addition, the write file descriptor that the user is forced to
hold open can prevent execution of files. As file data during a dedupe
does not change, the behavior is unexpected and this has caused a number of
issue reports. For an example, see:

https://github.com/markfasheh/duperemove/issues/129

So change the check so we allow dedupe on the target if:

- the root or admin is asking for it
- the process has write access
- the owner of the file is asking for the dedupe
- the process could get write access

That way users can open read-only and still get dedupe.

Signed-off-by: Mark Fasheh
Signed-off-by: Al Viro

Mark Fasheh
2018-10-18 09:15:39 +0800

24 Sep, 2018

1 commit

a725356b6 vfs: swap names of {do,vfs}_clone_file_range() ... Browse Code »

Commit 031a072a0b8a ("vfs: call vfs_clone_file_range() under freeze
protection") created a wrapper do_clone_file_range() around
vfs_clone_file_range() moving the freeze protection to former, so
overlayfs could call the latter.

The more common vfs practice is to call do_xxx helpers from vfs_xxx
helpers, where freeze protecction is taken in the vfs_xxx helper, so
this anomality could be a source of confusion.

It seems that commit 8ede205541ff ("ovl: add reflink/copyfile/dedup
support") may have fallen a victim to this confusion -
ovl_clone_file_range() calls the vfs_clone_file_range() helper in the
hope of getting freeze protection on upper fs, but in fact results in
overlayfs allowing to bypass upper fs freeze protection.

Swap the names of the two helpers to conform to common vfs practice
and call the correct helpers from overlayfs and nfsd.

Signed-off-by: Amir Goldstein
Signed-off-by: Miklos Szeredi

Amir Goldstein
2018-09-24 16:54:01 +0800

29 Aug, 2018

1 commit

caf6f9c8a asm-generic: Remove unneeded __ARCH_WANT_SYS_LLSEEK macro ... Browse Code »

The sys_llseek sytem call is needed on all 32-bit architectures and
none of the 64-bit ones, so we can remove the __ARCH_WANT_SYS_LLSEEK guard
and simplify the include/asm-generic/unistd.h header further.

Since 32-bit tasks can run either natively or in compat mode on 64-bit
architectures, we have to check for both !CONFIG_64BIT and CONFIG_COMPAT.

There are a few 64-bit architectures that also reference sys_llseek
in their 64-bit ABI (e.g. sparc), but I verified that those all
select CONFIG_COMPAT, so the #if check is still correct here. It's
a bit odd to include it in the syscall table though, as it's the
same as sys_lseek() on 64-bit, but with strange calling conventions.

Acked-by: Geert Uytterhoeven
Reviewed-by: Christoph Hellwig
Signed-off-by: Arnd Bergmann

Arnd Bergmann
2018-08-29 21:42:21 +0800

18 Jul, 2018

1 commit

f18253668 vfs: export vfs_dedupe_file_range_one() to modules ... Browse Code »

This is needed by the stacked dedupe implementation in overlayfs.

Signed-off-by: Miklos Szeredi

Miklos Szeredi
2018-07-18 21:44:40 +0800

07 Jul, 2018

4 commits

1b4f42a1e vfs: dedupe: extract helper for a single dedup ... Browse Code »

Extract vfs_dedupe_file_range_one() helper to deal with a single dedup
request.

Signed-off-by: Miklos Szeredi
Reviewed-by: Christoph Hellwig

Miklos Szeredi
2018-07-07 05:57:03 +0800
87eb5eb24 vfs: dedupe: rationalize args ... Browse Code »

Clean up f_op->dedupe_file_range() interface.

1) Use loff_t for offsets and length instead of u64
2) Order the arguments the same way as {copy|clone}_file_range().

Signed-off-by: Miklos Szeredi
Reviewed-by: Darrick J. Wong

Miklos Szeredi
2018-07-07 05:57:03 +0800
5740c99e9 vfs: dedupe: return int ... Browse Code »

Signed-off-by: Miklos Szeredi

Miklos Szeredi
2018-07-07 05:57:03 +0800
92b66d2cd vfs: limit size of dedupe ... Browse Code »

Suggested-by: Darrick J. Wong
Signed-off-by: Miklos Szeredi

Miklos Szeredi
2018-07-07 05:57:02 +0800

13 Jun, 2018

1 commit

6da2ec560 treewide: kmalloc() -> kmalloc_array() ... Browse Code »

The kmalloc() function has a 2-factor argument form, kmalloc_array(). This
patch replaces cases of:

kmalloc(a * b, gfp)

with:
kmalloc_array(a * b, gfp)

as well as handling cases of:

kmalloc(a * b * c, gfp)

with:

kmalloc(array3_size(a, b, c), gfp)

as it's slightly less ugly than:

kmalloc_array(array_size(a, b), c, gfp)

This does, however, attempt to ignore constant size factors like:

kmalloc(4 * 1024, gfp)

though any constants defined via macros get caught up in the conversion.

Any factors with a sizeof() of "unsigned char", "char", and "u8" were
dropped, since they're redundant.

The tools/ directory was manually excluded, since it has its own
implementation of kmalloc().

The Coccinelle script used for this was:

// Fix redundant parens around sizeof().
@@
type TYPE;
expression THING, E;
@@

(
kmalloc(
- (sizeof(TYPE)) * E
+ sizeof(TYPE) * E
, ...)
|
kmalloc(
- (sizeof(THING)) * E
+ sizeof(THING) * E
, ...)
)

// Drop single-byte sizes and redundant parens.
@@
expression COUNT;
typedef u8;
typedef __u8;
@@

(
kmalloc(
- sizeof(u8) * (COUNT)
+ COUNT
, ...)
|
kmalloc(
- sizeof(__u8) * (COUNT)
+ COUNT
, ...)
|
kmalloc(
- sizeof(char) * (COUNT)
+ COUNT
, ...)
|
kmalloc(
- sizeof(unsigned char) * (COUNT)
+ COUNT
, ...)
|
kmalloc(
- sizeof(u8) * COUNT
+ COUNT
, ...)
|
kmalloc(
- sizeof(__u8) * COUNT
+ COUNT
, ...)
|
kmalloc(
- sizeof(char) * COUNT
+ COUNT
, ...)
|
kmalloc(
- sizeof(unsigned char) * COUNT
+ COUNT
, ...)
)

// 2-factor product with sizeof(type/expression) and identifier or constant.
@@
type TYPE;
expression THING;
identifier COUNT_ID;
constant COUNT_CONST;
@@

(
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * (COUNT_ID)
+ COUNT_ID, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * COUNT_ID
+ COUNT_ID, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * (COUNT_CONST)
+ COUNT_CONST, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * COUNT_CONST
+ COUNT_CONST, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * (COUNT_ID)
+ COUNT_ID, sizeof(THING)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * COUNT_ID
+ COUNT_ID, sizeof(THING)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * (COUNT_CONST)
+ COUNT_CONST, sizeof(THING)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * COUNT_CONST
+ COUNT_CONST, sizeof(THING)
, ...)
)

// 2-factor product, only identifiers.
@@
identifier SIZE, COUNT;
@@

- kmalloc
+ kmalloc_array
(
- SIZE * COUNT
+ COUNT, SIZE
, ...)

// 3-factor product with 1 sizeof(type) or sizeof(expression), with
// redundant parens removed.
@@
expression THING;
identifier STRIDE, COUNT;
type TYPE;
@@

(
kmalloc(
- sizeof(TYPE) * (COUNT) * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
kmalloc(
- sizeof(TYPE) * (COUNT) * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
kmalloc(
- sizeof(TYPE) * COUNT * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
kmalloc(
- sizeof(TYPE) * COUNT * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
kmalloc(
- sizeof(THING) * (COUNT) * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
|
kmalloc(
- sizeof(THING) * (COUNT) * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
|
kmalloc(
- sizeof(THING) * COUNT * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
|
kmalloc(
- sizeof(THING) * COUNT * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
)

// 3-factor product with 2 sizeof(variable), with redundant parens removed.
@@
expression THING1, THING2;
identifier COUNT;
type TYPE1, TYPE2;
@@

(
kmalloc(
- sizeof(TYPE1) * sizeof(TYPE2) * COUNT
+ array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
, ...)
|
kmalloc(
- sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+ array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
, ...)
|
kmalloc(
- sizeof(THING1) * sizeof(THING2) * COUNT
+ array3_size(COUNT, sizeof(THING1), sizeof(THING2))
, ...)
|
kmalloc(
- sizeof(THING1) * sizeof(THING2) * (COUNT)
+ array3_size(COUNT, sizeof(THING1), sizeof(THING2))
, ...)
|
kmalloc(
- sizeof(TYPE1) * sizeof(THING2) * COUNT
+ array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
, ...)
|
kmalloc(
- sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+ array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
, ...)
)

// 3-factor product, only identifiers, with redundant parens removed.
@@
identifier STRIDE, SIZE, COUNT;
@@

(
kmalloc(
- (COUNT) * STRIDE * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- COUNT * (STRIDE) * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- COUNT * STRIDE * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- (COUNT) * (STRIDE) * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- COUNT * (STRIDE) * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- (COUNT) * STRIDE * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- (COUNT) * (STRIDE) * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- COUNT * STRIDE * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
)

// Any remaining multi-factor products, first at least 3-factor products,
// when they're not all constants...
@@
expression E1, E2, E3;
constant C1, C2, C3;
@@

(
kmalloc(C1 * C2 * C3, ...)
|
kmalloc(
- (E1) * E2 * E3
+ array3_size(E1, E2, E3)
, ...)
|
kmalloc(
- (E1) * (E2) * E3
+ array3_size(E1, E2, E3)
, ...)
|
kmalloc(
- (E1) * (E2) * (E3)
+ array3_size(E1, E2, E3)
, ...)
|
kmalloc(
- E1 * E2 * E3
+ array3_size(E1, E2, E3)
, ...)
)

// And then all remaining 2 factors products when they're not all constants,
// keeping sizeof() as the second factor argument.
@@
expression THING, E1, E2;
type TYPE;
constant C1, C2, C3;
@@

(
kmalloc(sizeof(THING) * C2, ...)
|
kmalloc(sizeof(TYPE) * C2, ...)
|
kmalloc(C1 * C2 * C3, ...)
|
kmalloc(C1 * C2, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * (E2)
+ E2, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * E2
+ E2, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * (E2)
+ E2, sizeof(THING)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * E2
+ E2, sizeof(THING)
, ...)
|
- kmalloc
+ kmalloc_array
(
- (E1) * E2
+ E1, E2
, ...)
|
- kmalloc
+ kmalloc_array
(
- (E1) * (E2)
+ E1, E2
, ...)
|
- kmalloc
+ kmalloc_array
(
- E1 * E2
+ E1, E2
, ...)
)

Signed-off-by: Kees Cook

Kees Cook
2018-06-13 07:19:22 +0800

16 Apr, 2018

1 commit

227627114 fs: avoid fdput() after failed fdget() in vfs_dedupe_file_range() ... Browse Code »

It's a fairly inconsequential bug, since fdput() won't actually try to
fput() the file due to fd.flags (and thus FDPUT_FPUT) being zero in
the failure case, but most other vfs code takes steps to avoid this.

Signed-off-by: Zev Weiss
Signed-off-by: Al Viro

Zev Weiss
2018-04-16 11:36:26 +0800

03 Apr, 2018

4 commits

36028d5dd fs: add ksys_p{read,write}64() helpers; remove in-kernel calls to syscalls ... Browse Code »

Using the ksys_p{read,write}64() wrappers allows us to get rid of
in-kernel calls to the sys_pread64() and sys_pwrite64() syscalls.
The ksys_ prefix denotes that this function is meant as a drop-in
replacement for the syscall. In particular, it uses the same calling
convention as sys_p{read,write}64().

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Al Viro
Cc: Andrew Morton
Signed-off-by: Dominik Brodowski

Dominik Brodowski
2018-04-03 02:16:09 +0800
3ce4a7bf6 fs: add ksys_read() helper; remove in-kernel calls to sys_read() ... Browse Code »

Using this helper allows us to avoid the in-kernel calls to the
sys_read() syscall. The ksys_ prefix denotes that this function
is meant as a drop-in replacement for the syscall. In particular, it
uses the same calling convention as sys_read().

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Alexander Viro
Signed-off-by: Dominik Brodowski

Dominik Brodowski
2018-04-03 02:16:04 +0800
76847e434 fs: add ksys_lseek() helper; remove in-kernel calls to sys_lseek() ... Browse Code »

Using this helper allows us to avoid the in-kernel calls to the
sys_lseek() syscall. The ksys_ prefix denotes that this function
is meant as a drop-in replacement for the syscall. In particular, it
uses the same calling convention as sys_lseek().

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Alexander Viro
Signed-off-by: Dominik Brodowski

Dominik Brodowski
2018-04-03 02:16:03 +0800
e7a3e8b2e fs: add ksys_write() helper; remove in-kernel calls to sys_write() ... Browse Code »

Using this helper allows us to avoid the in-kernel calls to the sys_write()
syscall. The ksys_ prefix denotes that this function is meant as a drop-in
replacement for the syscall. In particular, it uses the same calling
convention as sys_write().

In the near future, the do_mounts / initramfs callers of ksys_write()
should be converted to use filp_open() and vfs_write() instead.

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Alexander Viro
Cc: linux-s390@vger.kernel.org
Signed-off-by: Dominik Brodowski

Dominik Brodowski
2018-04-03 02:15:51 +0800

18 Nov, 2017

1 commit

16382e17c Merge branch 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs ... Browse Code »

Pull iov_iter updates from Al Viro:

- bio_{map,copy}_user_iov() series; those are cleanups - fixes from the
same pile went into mainline (and stable) in late September.

- fs/iomap.c iov_iter-related fixes

- new primitive - iov_iter_for_each_range(), which applies a function
to kernel-mapped segments of an iov_iter.

Usable for kvec and bvec ones, the latter does kmap()/kunmap() around
the callback. _Not_ usable for iovec- or pipe-backed iov_iter; the
latter is not hard to fix if the need ever appears, the former is by
design.

Another related primitive will have to wait for the next cycle - it
passes page + offset + size instead of pointer + size, and that one
will be usable for everything _except_ kvec. Unfortunately, that one
didn't get exposure in -next yet, so...

- a bit more lustre iov_iter work, including a use case for
iov_iter_for_each_range() (checksum calculation)

- vhost/scsi leak fix in failure exit

- misc cleanups and detritectomy...

* 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (21 commits)
iomap_dio_actor(): fix iov_iter bugs
switch ksocknal_lib_recv_...() to use of iov_iter_for_each_range()
lustre: switch struct ksock_conn to iov_iter
vhost/scsi: switch to iov_iter_get_pages()
fix a page leak in vhost_scsi_iov_to_sgl() error recovery
new primitive: iov_iter_for_each_range()
lnet_return_rx_credits_locked: don't abuse list_entry
xen: don't open-code iov_iter_kvec()
orangefs: remove detritus from struct orangefs_kiocb_s
kill iov_shorten()
bio_alloc_map_data(): do bmd->iter setup right there
bio_copy_user_iov(): saner bio size calculation
bio_map_user_iov(): get rid of copying iov_iter
bio_copy_from_iter(): get rid of copying iov_iter
move more stuff down into bio_copy_user_iov()
blk_rq_map_user_iov(): move iov_iter_advance() down
bio_map_user_iov(): get rid of the iov_for_each()
bio_map_user_iov(): move alignment check into the main loop
don't rely upon subsequent bio_add_pc_page() calls failing
... and with iov_iter_get_pages_alloc() it becomes even simpler
...

Linus Torvalds
2017-11-18 04:08:18 +0800

02 Nov, 2017

1 commit

b24413180 License cleanup: add SPDX GPL-2.0 license identifier to files with no license ... Browse Code »

Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if
Reviewed-by: Philippe Ombredanne
Reviewed-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman

Greg Kroah-Hartman
2017-11-02 18:10:55 +0800

12 Oct, 2017

1 commit

faea13297 kill iov_shorten() ... Browse Code »

Signed-off-by: Al Viro

Al Viro
2017-10-12 05:23:43 +0800