25 Sep, 2019
1 commit
-
Patch series "Make working with compound pages easier", v2.
These three patches add three helpers and convert the appropriate
places to use them.This patch (of 3):
It's unnecessarily hard to find out the size of a potentially huge page.
Replace 'PAGE_SIZE << compound_order(page)' with page_size(page).Link: http://lkml.kernel.org/r/20190721104612.19120-2-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle)
Acked-by: Michal Hocko
Reviewed-by: Andrew Morton
Reviewed-by: Ira Weiny
Acked-by: Kirill A. Shutemov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
01 Jun, 2019
1 commit
-
Currently these functions return < 0 on error, and 0 for success.
Change that so that we return < 0 on error, but number of bytes
for success.Some callers already treat the return value that way, others need a
slight tweak.Signed-off-by: Jens Axboe
21 May, 2019
1 commit
-
Add SPDX license identifiers to all files which:
- Have no license information of any form
- Have EXPORT_.*_SYMBOL_GPL inside which was used in the
initial scan/conversion to ignore the fileThese files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:GPL-2.0-only
Signed-off-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman
15 May, 2019
1 commit
-
To facilitate additional options to get_user_pages_fast() change the
singular write parameter to be gup_flags.This patch does not change any functionality. New functionality will
follow in subsequent patches.Some of the get_user_pages_fast() call sites were unchanged because they
already passed FOLL_WRITE or 0 for the write parameter.NOTE: It was suggested to change the ordering of the get_user_pages_fast()
arguments to ensure that callers were converted. This breaks the current
GUP call site convention of having the returned pages be the final
parameter. So the suggestion was rejected.Link: http://lkml.kernel.org/r/20190328084422.29911-4-ira.weiny@intel.com
Link: http://lkml.kernel.org/r/20190317183438.2057-4-ira.weiny@intel.com
Signed-off-by: Ira Weiny
Reviewed-by: Mike Marshall
Cc: Aneesh Kumar K.V
Cc: Benjamin Herrenschmidt
Cc: Borislav Petkov
Cc: Dan Williams
Cc: "David S. Miller"
Cc: Heiko Carstens
Cc: Ingo Molnar
Cc: James Hogan
Cc: Jason Gunthorpe
Cc: John Hubbard
Cc: "Kirill A. Shutemov"
Cc: Martin Schwidefsky
Cc: Michal Hocko
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Ralf Baechle
Cc: Rich Felker
Cc: Thomas Gleixner
Cc: Yoshinori Sato
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
04 Apr, 2019
1 commit
-
If CONFIG_CRYPTO is not set or set to m,
gcc building warn this:lib/iov_iter.o: In function `hash_and_copy_to_iter':
iov_iter.c:(.text+0x9129): undefined reference to `crypto_stats_get'
iov_iter.c:(.text+0x9152): undefined reference to `crypto_stats_ahash_update'Reported-by: Hulk Robot
Fixes: d05f443554b3 ("iov_iter: introduce hash_and_copy_to_iter helper")
Suggested-by: Al Viro
Signed-off-by: YueHaibing
Signed-off-by: Al Viro
27 Feb, 2019
1 commit
-
Avoid cache line miss dereferencing struct page if we can.
page_copy_sane() mostly deals with order-0 pages.
Extra cache line miss is visible on TCP recvmsg() calls dealing
with GRO packets (typically 45 page frags are attached to one skb).Bringing the 45 struct pages into cpu cache while copying the data
is not free, since the freeing of the skb (and associated
page frags put_page()) can happen after cache lines have been evicted.Signed-off-by: Eric Dumazet
Cc: Al Viro
Signed-off-by: Al Viro
06 Jan, 2019
1 commit
-
Pull trivial vfs updates from Al Viro:
"A few cleanups + Neil's namespace_unlock() optimization"* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
exec: make prepare_bprm_creds static
genheaders: %-s had been there since v6; %-*s - since v7
VFS: use synchronize_rcu_expedited() in namespace_unlock()
iov_iter: reduce code duplication
04 Jan, 2019
1 commit
-
Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument
of the user address range verification function since we got rid of the
old racy i386-only code to walk page tables by hand.It existed because the original 80386 would not honor the write protect
bit when in kernel mode, so you had to do COW by hand before doing any
user access. But we haven't supported that in a long time, and these
days the 'type' argument is a purely historical artifact.A discussion about extending 'user_access_begin()' to do the range
checking resulted this patch, because there is no way we're going to
move the old VERIFY_xyz interface to that model. And it's best done at
the end of the merge window when I've done most of my merges, so let's
just get this done once and for all.This patch was mostly done with a sed-script, with manual fix-ups for
the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form.There were a couple of notable cases:
- csky still had the old "verify_area()" name as an alias.
- the iter_iov code had magical hardcoded knowledge of the actual
values of VERIFY_{READ,WRITE} (not that they mattered, since nothing
really used it)- microblaze used the type argument for a debug printout
but other than those oddities this should be a total no-op patch.
I tried to fix up all architectures, did fairly extensive grepping for
access_ok() uses, and the changes are trivial, but I may have missed
something. Any missed conversion should be trivially fixable, though.Signed-off-by: Linus Torvalds
13 Dec, 2018
2 commits
-
Allow consumers that want to use iov iterator helpers and also update
a predefined hash calculation online when copying data. This is useful
when copying incoming network buffers to a local iterator and calculate
a digest on the incoming stream. nvme-tcp host driver that will be
introduced in following patches is the first consumer via
skb_copy_and_hash_datagram_iter.Acked-by: David S. Miller
Signed-off-by: Sagi Grimberg
Signed-off-by: Christoph Hellwig -
The single caller to csum_and_copy_to_iter is skb_copy_and_csum_datagram
and we are trying to unite its logic with skb_copy_datagram_iter by passing
a callback to the copy function that we want to apply. Thus, we need
to make the checksum pointer private to the function.Acked-by: David S. Miller
Signed-off-by: Sagi Grimberg
Signed-off-by: Christoph Hellwig
28 Nov, 2018
1 commit
-
The same combination of csum_partial_copy_nocheck() with csum_add_block()
is used in a bunch of places. Add a helper doing just that and use it.Signed-off-by: Al Viro
26 Nov, 2018
1 commit
-
Signed-off-by: Al Viro
24 Oct, 2018
3 commits
-
Add a new iterator, ITER_DISCARD, that can only be used in READ mode and
just discards any data copied to it.This is useful in a network filesystem for discarding any unwanted data
sent by a server.Signed-off-by: David Howells
-
In the iov_iter struct, separate the iterator type from the iterator
direction and use accessor functions to access them in most places.Convert a bunch of places to use switch-statements to access them rather
then chains of bitwise-AND statements. This makes it easier to add further
iterator types. Also, this can be more efficient as to implement a switch
of small contiguous integers, the compiler can use ~50% fewer compare
instructions than it has to use bitwise-and instructions.Further, cease passing the iterator type into the iterator setup function.
The iterator function can set that itself. Only the direction is required.Signed-off-by: David Howells
-
Use accessor functions to access an iterator's type and direction. This
allows for the possibility of using some other method of determining the
type of iterator than if-chains with bitwise-AND conditions.Signed-off-by: David Howells
16 Jul, 2018
3 commits
-
By mistake the ITER_PIPE early-exit / warning from copy_from_iter() was
cargo-culted in _copy_to_iter_mcsafe() rather than a machine-check-safe
version of copy_to_iter_pipe().Implement copy_pipe_to_iter_mcsafe() being careful to return the
indication of short copies due to a CPU exception.Without this regression-fix all splice reads to dax-mode files fail.
Reported-by: Ross Zwisler
Tested-by: Ross Zwisler
Signed-off-by: Dan Williams
Acked-by: Al Viro
Cc: Andrew Morton
Cc: Andy Lutomirski
Cc: Borislav Petkov
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Tony Luck
Fixes: 8780356ef630 ("x86/asm/memcpy_mcsafe: Define copy_to_iter_mcsafe()")
Link: http://lkml.kernel.org/r/153108277278.37979.3327916996902264102.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Ingo Molnar -
Add some theory of operation documentation to _copy_to_iter_flushcache().
Reported-by: Al Viro
Signed-off-by: Dan Williams
Cc: Andrew Morton
Cc: Andy Lutomirski
Cc: Borislav Petkov
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Tony Luck
Link: http://lkml.kernel.org/r/153108276767.37979.9462477994086841699.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Ingo Molnar -
Add some theory of operation documentation to _copy_to_iter_mcsafe().
Reported-by: Al Viro
Signed-off-by: Dan Williams
Cc: Andrew Morton
Cc: Andy Lutomirski
Cc: Borislav Petkov
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Tony Luck
Link: http://lkml.kernel.org/r/153108276256.37979.1689794213845539316.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Ingo Molnar
05 Jun, 2018
1 commit
-
Pull x86 dax updates from Ingo Molnar:
"This contains x86 memcpy_mcsafe() fault handling improvements the
nvdimm tree would like to make more use of"* 'x86-dax-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/asm/memcpy_mcsafe: Define copy_to_iter_mcsafe()
x86/asm/memcpy_mcsafe: Add write-protection-fault handling
x86/asm/memcpy_mcsafe: Return bytes remaining
x86/asm/memcpy_mcsafe: Add labels for __memcpy_mcsafe() write fault handling
x86/asm/memcpy_mcsafe: Remove loop unrolling
15 May, 2018
1 commit
-
Use the updated memcpy_mcsafe() implementation to define
copy_user_mcsafe() and copy_to_iter_mcsafe(). The most significant
difference from typical copy_to_iter() is that the ITER_KVEC and
ITER_BVEC iterator types can fail to complete a full transfer.Signed-off-by: Dan Williams
Cc: Al Viro
Cc: Andrew Morton
Cc: Andy Lutomirski
Cc: Borislav Petkov
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Tony Luck
Cc: hch@lst.de
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-nvdimm@lists.01.org
Link: http://lkml.kernel.org/r/152539239150.31796.9189779163576449784.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Ingo Molnar
03 May, 2018
2 commits
-
Make n signed to avoid leaking the pages array if __pipe_get_pages()
fails to allocate any pages.Signed-off-by: Ilya Dryomov
Signed-off-by: Al Viro -
It returns -EFAULT and happens to be a helper for pipe_get_pages()
whose return type is ssize_t.Signed-off-by: Ilya Dryomov
Signed-off-by: Al Viro
12 Oct, 2017
1 commit
-
For kvec and bvec: feeds segments to given callback as long as it
returns 0. For iovec and pipe: fails.Signed-off-by: Al Viro
21 Sep, 2017
1 commit
-
Issue is that if the data crosses a page boundary inside a compound
page, this check will incorrectly trigger a WARN_ON.To fix this, compute the order using the head of the compound page and
adjust the offset to be relative to that head.Fixes: 72e809ed81ed ("iov_iter: sanity checks for copy to/from page
primitives")Signed-off-by: Petar Penkov
CC: Al Viro
CC: Eric Dumazet
Signed-off-by: Al Viro
08 Jul, 2017
1 commit
-
Pull iov_iter hardening from Al Viro:
"This is the iov_iter/uaccess/hardening pile.For one thing, it trims the inline part of copy_to_user/copy_from_user
to the minimum that *does* need to be inlined - object size checks,
basically. For another, it sanitizes the checks for iov_iter
primitives. There are 4 groups of checks: access_ok(), might_fault(),
object size and KASAN.- access_ok() had been verified by whoever had set the iov_iter up.
However, that has happened in a function far away, so proving that
there's no path to actual copying bypassing those checks is hard
and proving that iov_iter has not been buggered in the meanwhile is
also not pleasant. So we want those redone in actual
copyin/copyout.- might_fault() is better off consolidated - we know whether it needs
to be checked as soon as we enter iov_iter primitive and observe
the iov_iter flavour. No need to wait until the copyin/copyout. The
call chains are short enough to make sure we won't miss anything -
in fact, it's more robust that way, since there are cases where we
do e.g. forced fault-in before getting to copyin/copyout. It's not
quite what we need to check (in particular, combination of
iovec-backed and set_fs(KERNEL_DS) is almost certainly a bug, not a
cause to skip checks), but that's for later series. For now let's
keep might_fault().- KASAN checks belong in copyin/copyout - at the same level where
other iov_iter flavours would've hit them in memcpy().- object size checks should apply to *all* iov_iter flavours, not
just iovec-backed ones.There are two groups of primitives - one gets the kernel object
described as pointer + size (copy_to_iter(), etc.) while another gets
it as page + offset + size (copy_page_to_iter(), etc.)For the first group the checks are best done where we actually have a
chance to find the object size. In other words, those belong in inline
wrappers in uio.h, before calling into iov_iter.c. Same kind as we
have for inlined part of copy_to_user().For the second group there is no object to look at - offset in page is
just a number, it bears no type information. So we do them in the
common helper called by iov_iter.c primitives of that kind. All it
currently does is checking that we are not trying to access outside of
the compound page; eventually we might want to add some sanity checks
on the page involved.So the things we need in copyin/copyout part of iov_iter.c do not
quite match anything in uaccess.h (we want no zeroing, we *do* want
access_ok() and KASAN and we want no might_fault() or object size
checks done on that level). OTOH, these needs are simple enough to
provide a couple of helpers (static in iov_iter.c) doing just what we
need..."* 'uaccess-work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
iov_iter: saner checks on copyin/copyout
iov_iter: sanity checks for copy to/from page primitives
iov_iter/hardening: move object size checks to inlined part
copy_{to,from}_user(): consolidate object size checks
copy_{from,to}_user(): move kasan checks and might_fault() out-of-line
07 Jul, 2017
1 commit
-
* might_fault() is better checked in caller (and e.g. fault-in + kmap_atomic
codepath also needs might_fault() coverage)
* we have already done object size checks
* we have *NOT* done access_ok() recently enough; we rely upon the
iovec array having passed sanity checks back when it had been created
and not nothing having buggered it since. However, that's very much
non-local, so we'd better recheck that.So the thing we want does not match anything in uaccess - we need
access_ok + kasan checks + raw copy without any zeroing. Just define
such helpers and use them here.Signed-off-by: Al Viro
30 Jun, 2017
2 commits
-
for now - just that we don't attempt to cross out of compound page
Signed-off-by: Al Viro
-
There we actually have useful information about object sizes.
Note: this patch has them done for all iov_iter flavours.
Right now we do them twice in iovec case, but that'll change
very shortly.Signed-off-by: Al Viro
10 Jun, 2017
1 commit
-
The pmem driver has a need to transfer data with a persistent memory
destination and be able to rely on the fact that the destination writes are not
cached. It is sufficient for the writes to be flushed to a cpu-store-buffer
(non-temporal / "movnt" in x86 terms), as we expect userspace to call fsync()
to ensure data-writes have reached a power-fail-safe zone in the platform. The
fsync() triggers a REQ_FUA or REQ_FLUSH to the pmem driver which will turn
around and fence previous writes with an "sfence".Implement a __copy_from_user_inatomic_flushcache, memcpy_page_flushcache, and
memcpy_flushcache, that guarantee that the destination buffer is not dirty in
the cpu cache on completion. The new copy_from_iter_flushcache and sub-routines
will be used to replace the "pmem api" (include/linux/pmem.h +
arch/x86/include/asm/pmem.h). The availability of copy_from_iter_flushcache()
and memcpy_flushcache() are gated by the CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
config symbol, and fallback to copy_from_iter_nocache() and plain memcpy()
otherwise.This is meant to satisfy the concern from Linus that if a driver wants to do
something beyond the normal nocache semantics it should be something private to
that driver [1], and Al's concern that anything uaccess related belongs with
the rest of the uaccess code [2].The first consumer of this interface is a new 'copy_from_iter' dax operation so
that pmem can inject cache maintenance operations without imposing this
overhead on other dax-capable drivers.[1]: https://lists.01.org/pipermail/linux-nvdimm/2017-January/008364.html
[2]: https://lists.01.org/pipermail/linux-nvdimm/2017-April/009942.htmlCc:
Cc: Jan Kara
Cc: Jeff Moyer
Cc: Ingo Molnar
Cc: Christoph Hellwig
Cc: Toshi Kani
Cc: "H. Peter Anvin"
Cc: Al Viro
Cc: Thomas Gleixner
Cc: Matthew Wilcox
Reviewed-by: Ross Zwisler
Signed-off-by: Dan Williams
10 May, 2017
1 commit
-
Pull vfs fix from Al Viro:
"Braino fix for iov_iter_revert() misuse"* 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fix braino in generic_file_read_iter()
09 May, 2017
2 commits
-
There are many code paths opencoding kvmalloc. Let's use the helper
instead. The main difference to kvmalloc is that those users are
usually not considering all the aspects of the memory allocator. E.g.
allocation requests
Reviewed-by: Boris Ostrovsky # Xen bits
Acked-by: Kees Cook
Acked-by: Vlastimil Babka
Acked-by: Andreas Dilger # Lustre
Acked-by: Christian Borntraeger # KVM/s390
Acked-by: Dan Williams # nvdim
Acked-by: David Sterba # btrfs
Acked-by: Ilya Dryomov # Ceph
Acked-by: Tariq Toukan # mlx4
Acked-by: Leon Romanovsky # mlx5
Cc: Martin Schwidefsky
Cc: Heiko Carstens
Cc: Herbert Xu
Cc: Anton Vorontsov
Cc: Colin Cross
Cc: Tony Luck
Cc: "Rafael J. Wysocki"
Cc: Ben Skeggs
Cc: Kent Overstreet
Cc: Santosh Raspatur
Cc: Hariprasad S
Cc: Yishai Hadas
Cc: Oleg Drokin
Cc: "Yan, Zheng"
Cc: Alexander Viro
Cc: Alexei Starovoitov
Cc: Eric Dumazet
Cc: David Miller
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Wrong sign of iov_iter_revert() argument. Unfortunately, slipped through
the testing, since most of the time we don't do anything to the iterator
afterwards and potential oops on walking the iter->iov too far backwards
is too infrequent to be easily triggered.Add a sanity check in iov_iter_revert() to catch bugs like this one;
fortunately, the same braino hadn't happened in other callers, but we'd
better have a warning if such thing crops up.Signed-off-by: Al Viro
02 May, 2017
1 commit
-
Pull uaccess unification updates from Al Viro:
"This is the uaccess unification pile. It's _not_ the end of uaccess
work, but the next batch of that will go into the next cycle. This one
mostly takes copy_from_user() and friends out of arch/* and gets the
zero-padding behaviour in sync for all architectures.Dealing with the nocache/writethrough mess is for the next cycle;
fortunately, that's x86-only. Same for cleanups in iov_iter.c (I am
sold on access_ok() in there, BTW; just not in this pile), same for
reducing __copy_... callsites, strn*... stuff, etc. - there will be a
pile about as large as this one in the next merge window.This one sat in -next for weeks. -3KLoC"
* 'work.uaccess' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (96 commits)
HAVE_ARCH_HARDENED_USERCOPY is unconditional now
CONFIG_ARCH_HAS_RAW_COPY_USER is unconditional now
m32r: switch to RAW_COPY_USER
hexagon: switch to RAW_COPY_USER
microblaze: switch to RAW_COPY_USER
get rid of padding, switch to RAW_COPY_USER
ia64: get rid of copy_in_user()
ia64: sanitize __access_ok()
ia64: get rid of 'segment' argument of __do_{get,put}_user()
ia64: get rid of 'segment' argument of __{get,put}_user_check()
ia64: add extable.h
powerpc: get rid of zeroing, switch to RAW_COPY_USER
esas2r: don't open-code memdup_user()
alpha: fix stack smashing in old_adjtimex(2)
don't open-code kernel_setsockopt()
mips: switch to RAW_COPY_USER
mips: get rid of tail-zeroing in primitives
mips: make copy_from_user() zero tail explicitly
mips: clean and reorder the forest of macros...
mips: consolidate __invoke_... wrappers
...
30 Apr, 2017
1 commit
-
Fixes: 27c0e3748e41
Tested-by: Dave Jones
Signed-off-by: Al Viro
03 Apr, 2017
1 commit
-
opposite to iov_iter_advance(); the caller is responsible for never
using it to move back past the initial position.Cc: stable@vger.kernel.org
Signed-off-by: Al Viro
29 Mar, 2017
2 commits
-
Signed-off-by: Al Viro
-
Signed-off-by: Al Viro
15 Jan, 2017
1 commit
-
The logics in pipe_advance() used to release all buffers past the new
position failed in cases when the number of buffers to release was equal
to pipe->buffers. If that happened, none of them had been released,
leaving pipe full. Worse, it was trivial to trigger and we end up with
pipe full of uninitialized pages. IOW, it's an infoleak.Cc: stable@vger.kernel.org # v4.9
Reported-by: "Alan J. Wylie"
Tested-by: "Alan J. Wylie"
Signed-off-by: Al Viro
23 Dec, 2016
1 commit
-
Problem similar to ones dealt with in "fold checks into iterate_and_advance()"
and followups, except that in this case we really want to do nothing when
asked for zero-length operation - unlike zero-length iterate_and_advance(),
zero-length iterate_all_kinds() has no side effects, and callers are simpler
that way.That got exposed when copy_from_iter_full() had been used by tipc, which
builds an msghdr with zero payload and (now) feeds it to a primitive
based on iterate_all_kinds() instead of iterate_and_advance().Reported-by: Jon Maloy
Tested-by: Jon Maloy
Signed-off-by: Al Viro
17 Dec, 2016
1 commit
-
Pull vfs updates from Al Viro:
- more ->d_init() stuff (work.dcache)
- pathname resolution cleanups (work.namei)
- a few missing iov_iter primitives - copy_from_iter_full() and
friends. Either copy the full requested amount, advance the iterator
and return true, or fail, return false and do _not_ advance the
iterator. Quite a few open-coded callers converted (and became more
readable and harder to fuck up that way) (work.iov_iter)- several assorted patches, the big one being logfs removal
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
logfs: remove from tree
vfs: fix put_compat_statfs64() does not handle errors
namei: fold should_follow_link() with the step into not-followed link
namei: pass both WALK_GET and WALK_MORE to should_follow_link()
namei: invert WALK_PUT logics
namei: shift interpretation of LOOKUP_FOLLOW inside should_follow_link()
namei: saner calling conventions for mountpoint_last()
namei.c: get rid of user_path_parent()
switch getfrag callbacks to ..._full() primitives
make skb_add_data,{_nocache}() and skb_copy_to_page_nocache() advance only on success
[iov_iter] new primitives - copy_from_iter_full() and friends
don't open-code file_inode()
ceph: switch to use of ->d_init()
ceph: unify dentry_operations instances
lustre: switch to use of ->d_init()