17 Oct, 2020

1 commit

  • To test the fault tolerance of user memory access functions, introduce
    fault injection into the usercopy functions.

    If a failure is injected, return either -EFAULT or the number of
    bytes that were not copied.

    Signed-off-by: Albert van der Linde
    Signed-off-by: Andrew Morton
    Reviewed-by: Akinobu Mita
    Reviewed-by: Alexander Potapenko
    Cc: Al Viro
    Cc: Andrey Konovalov
    Cc: Arnd Bergmann
    Cc: Borislav Petkov
    Cc: Dmitry Vyukov
    Cc: "H. Peter Anvin"
    Cc: Ingo Molnar
    Cc: Jonathan Corbet
    Cc: Marco Elver
    Cc: Peter Zijlstra (Intel)
    Cc: Thomas Gleixner
    Cc: Christoph Hellwig
    Link: http://lkml.kernel.org/r/20200831171733.955393-3-alinde@google.com
    Signed-off-by: Linus Torvalds

    Albert van der Linde
     

13 Oct, 2020

2 commits

  • Pull compat iovec cleanups from Al Viro:
    "Christoph's series around import_iovec() and compat variant thereof"

    * 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    security/keys: remove compat_keyctl_instantiate_key_iov
    mm: remove compat_process_vm_{readv,writev}
    fs: remove compat_sys_vmsplice
    fs: remove the compat readv/writev syscalls
    fs: remove various compat readv/writev helpers
    iov_iter: transparently handle compat iovecs in import_iovec
    iov_iter: refactor rw_copy_check_uvector and import_iovec
    iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c
    compat.h: fix a spelling error in

    Linus Torvalds
     
  • Pull copy_and_csum cleanups from Al Viro:
    "Saner calling conventions for csum_and_copy_..._user() and friends"

    [ Removing 800+ lines of code and cleaning stuff up is good - Linus ]

    * 'work.csum_and_copy' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    ppc: propagate the calling conventions change down to csum_partial_copy_generic()
    amd64: switch csum_partial_copy_generic() to new calling conventions
    sparc64: propagate the calling convention changes down to __csum_partial_copy_...()
    xtensa: propagate the calling conventions change down into csum_partial_copy_generic()
    mips: propagate the calling convention change down into __csum_partial_copy_..._user()
    mips: __csum_partial_copy_kernel() has no users left
    mips: csum_and_copy_{to,from}_user() are never called under KERNEL_DS
    sparc32: propagate the calling conventions change down to __csum_partial_copy_sparc_generic()
    i386: propagate the calling conventions change down to csum_partial_copy_generic()
    sh: propagate the calling conventions change down to csum_partial_copy_generic()
    m68k: get rid of zeroing destination on error in csum_and_copy_from_user()
    arm: propagate the calling convention changes down to csum_partial_copy_from_user()
    alpha: propagate the calling convention changes down to csum_partial_copy.c helpers
    saner calling conventions for csum_and_copy_..._user()
    csum_and_copy_..._user(): pass 0xffffffff instead of 0 as initial sum
    csum_partial_copy_nocheck(): drop the last argument
    unify generic instances of csum_partial_copy_nocheck()
    icmp_push_reply(): reorder adding the checksum up
    skb_copy_and_csum_bits(): don't bother with the last argument

    Linus Torvalds
     

06 Oct, 2020

1 commit

  • In reaction to a proposal to introduce a memcpy_mcsafe_fast()
    implementation, Linus points out that memcpy_mcsafe() is poorly named:
    the name does not communicate the scope of the interface, specifically
    which addresses are valid to pass as source and destination, and which
    faults / exceptions are handled.

    Of particular concern is that even though x86 might be able to handle
    the semantics of copy_mc_to_user() with its common copy_user_generic()
    implementation other archs likely need / want an explicit path for this
    case:

    On Fri, May 1, 2020 at 11:28 AM Linus Torvalds wrote:
    >
    > On Thu, Apr 30, 2020 at 6:21 PM Dan Williams wrote:
    > >
    > > However now I see that copy_user_generic() works for the wrong reason.
    > > It works because the exception on the source address due to poison
    > > looks no different than a write fault on the user address to the
    > > caller, it's still just a short copy. So it makes copy_to_user() work
    > > for the wrong reason relative to the name.
    >
    > Right.
    >
    > And it won't work that way on other architectures. On x86, we have a
    > generic function that can take faults on either side, and we use it
    > for both cases (and for the "in_user" case too), but that's an
    > artifact of the architecture oddity.
    >
    > In fact, it's probably wrong even on x86 - because it can hide bugs -
    > but writing those things is painful enough that everybody prefers
    > having just one function.

    Replace a single top-level memcpy_mcsafe() with either
    copy_mc_to_user(), or copy_mc_to_kernel().

    Introduce an x86 copy_mc_fragile() name as the rename for the
    low-level x86 implementation formerly named memcpy_mcsafe(). It is used
    as the slow / careful backend that is supplanted by a fast
    copy_mc_generic() in a follow-on patch.

    One side-effect of this reorganization is that separating copy_mc_64.S
    to its own file means that perf no longer needs to track dependencies
    for its memcpy_64.S benchmarks.

    [ bp: Massage a bit. ]

    Signed-off-by: Dan Williams
    Signed-off-by: Borislav Petkov
    Reviewed-by: Tony Luck
    Acked-by: Michael Ellerman
    Cc:
    Link: http://lore.kernel.org/r/CAHk-=wjSqtXAqfUJxFtWNwmguFASTgB0dz1dT3V-78Quiezqbg@mail.gmail.com
    Link: https://lkml.kernel.org/r/160195561680.2163339.11574962055305783722.stgit@dwillia2-desk3.amr.corp.intel.com

    Dan Williams
     

03 Oct, 2020

2 commits

  • Use in_compat_syscall() to decide whether to import native or compat
    iovecs, and remove the now superfluous compat_import_iovec().

    This removes the need for special compat logic in most callers; the
    remaining ones can still be simplified by using __import_iovec()
    with a bool compat parameter.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     
  • Split rw_copy_check_uvector into two new helpers with more sensible
    calling conventions:

    - iovec_from_user copies an iovec from userspace, either into the
    provided stack buffer if it fits or into a newly allocated buffer.
    It returns the actually used iovec, verifies that each iov_len fits
    a signed type, and handles compat iovecs if the compat flag is set.
    - __import_iovec consolidates the native and compat versions of
    import_iovec. It calls iovec_from_user, then validates that each
    iovec actually points to user addresses, and ensures the total
    length doesn't overflow.

    This has two major implications:

    - the access_process_vm case loses the total length checking, which
    wasn't required anyway, given that each call receives two iovecs
    for the local and remote side of the operation and verifies
    the total length on the local side already.
    - instead of a single loop there are now two loops over the iovecs.
    Given that the iovecs are cache hot, this doesn't make a major
    difference.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     


21 Aug, 2020

3 commits

  • All callers of these primitives will
    * discard anything we might've copied in case of error
    * ignore the csum value in case of error
    * always pass 0xffffffff as the initial sum, so the
    resulting csum value (in case of success, that is) will never be 0.

    That suggests the following calling conventions:
    * don't pass err_ptr - just return 0 on error.
    * don't bother with zeroing destination, etc. in case of error
    * don't pass the initial sum - just use 0xffffffff.

    This commit does the minimal conversion in the instances of csum_and_copy_...();
    the changes of actual asm code behind them are done later in the series.
    Note that this asm code is often shared with csum_partial_copy_nocheck();
    the difference is that csum_partial_copy_nocheck() passes 0 for the
    initial sum while csum_and_copy_..._user() passes 0xffffffff. Fortunately,
    we are free to pass 0xffffffff in all cases, and subsequent patches will
    use that freedom without any special comments.

    A part that could be split off: parisc and uml/i386 claimed to have
    csum_and_copy_to_user() instances of their own, but those were identical
    to the generic one, so we simply drop them. Not sure if it's worth
    a separate commit...

    Signed-off-by: Al Viro

    Al Viro
     
  • Preparation for the change of calling conventions; right now all
    callers pass 0 as initial sum. Passing 0xffffffff instead yields
    the values comparable mod 0xffff and guarantees that 0 will not
    be returned on success.

    Signed-off-by: Al Viro

    Al Viro
     
  • It's always 0. Note that we theoretically could use ~0U as well -
    result will be the same modulo 0xffff, _if_ the damn thing did the
    right thing for any value of initial sum; later we'll make use of
    that when convenient.

    However, unlike csum_and_copy_..._user(), there are instances that
    did not work for arbitrary initial sums; c6x is one such.

    Signed-off-by: Al Viro

    Al Viro
     

30 Jun, 2020

1 commit

  • The header file linux/uio.h includes crypto/hash.h which pulls in
    most of the Crypto API. Since linux/uio.h is used throughout the
    kernel this means that every tiny bit of change to the Crypto API
    causes the entire kernel to get rebuilt.

    This patch fixes this by moving the crypto/hash.h include into
    lib/iov_iter.c, where it is actually used.

    This patch also fixes the ifdef to use CRYPTO_HASH instead of just
    CRYPTO which does not guarantee the existence of ahash.

    Unfortunately a number of drivers were relying on linux/uio.h to
    provide access to linux/slab.h. This patch adds inclusions of
    linux/slab.h as detected by build failures.

    Also skbuff.h was relying on this to provide a declaration for
    ahash_request. This patch adds a forward declaration instead.

    Signed-off-by: Herbert Xu
    Signed-off-by: Al Viro

    Herbert Xu
     

21 Mar, 2020

1 commit

  • This replaces the kasan instrumentation with generic instrumentation,
    implicitly adding KCSAN instrumentation support.

    For KASAN no functional change is intended.

    Suggested-by: Arnd Bergmann
    Signed-off-by: Marco Elver
    Signed-off-by: Paul E. McKenney
    Signed-off-by: Ingo Molnar

    Marco Elver
     

17 Dec, 2019

1 commit

  • We cannot look at 'i->pipe' unless we know the iter is a pipe. Move the
    ring_size load to a branch in iov_iter_alignment() where we've already
    checked the iter is a pipe to avoid bogus dereference.

    Reported-by: syzbot+bea68382bae9490e7dd6@syzkaller.appspotmail.com
    Fixes: 8cefc107ca54 ("pipe: Use head and tail pointers for the ring, not cursor and length")
    Signed-off-by: Jan Kara
    Signed-off-by: Al Viro

    Jan Kara
     

02 Dec, 2019

1 commit

  • Pull removal of most of fs/compat_ioctl.c from Arnd Bergmann:
    "As part of the cleanup of some remaining y2038 issues, I came to
    fs/compat_ioctl.c, which still has a couple of commands that need
    support for time64_t.

    In completely unrelated work, I spent time on cleaning up parts of
    this file in the past, moving things out into drivers instead.

    After Al Viro reviewed an earlier version of this series and did a lot
    more of that cleanup, I decided to try to completely eliminate the
    rest of it and move it all into drivers.

    This series incorporates some of Al's work and many patches of my own,
    but in the end stops short of actually removing the last part, which
    is the scsi ioctl handlers. I have patches for those as well, but they
    need more testing or possibly a rewrite"

    * tag 'compat-ioctl-5.5' of git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground: (42 commits)
    scsi: sd: enable compat ioctls for sed-opal
    pktcdvd: add compat_ioctl handler
    compat_ioctl: move SG_GET_REQUEST_TABLE handling
    compat_ioctl: ppp: move simple commands into ppp_generic.c
    compat_ioctl: handle PPPIOCGIDLE for 64-bit time_t
    compat_ioctl: move PPPIOCSCOMPRESS to ppp_generic
    compat_ioctl: unify copy-in of ppp filters
    tty: handle compat PPP ioctls
    compat_ioctl: move SIOCOUTQ out of compat_ioctl.c
    compat_ioctl: handle SIOCOUTQNSD
    af_unix: add compat_ioctl support
    compat_ioctl: reimplement SG_IO handling
    compat_ioctl: move WDIOC handling into wdt drivers
    fs: compat_ioctl: move FITRIM emulation into file systems
    gfs2: add compat_ioctl support
    compat_ioctl: remove unused convert_in_user macro
    compat_ioctl: remove last RAID handling code
    compat_ioctl: remove /dev/raw ioctl translation
    compat_ioctl: remove PCI ioctl translation
    compat_ioctl: remove joystick ioctl translation
    ...

    Linus Torvalds
     

16 Nov, 2019

1 commit

  • Split pipe->ring_size into two numbers:

    (1) pipe->ring_size - indicates the hard size of the pipe ring.

    (2) pipe->max_usage - indicates the maximum number of pipe ring slots that
    userspace orchestrated events can fill.

    This allows for a pipe that is both writable by the general kernel
    notification facility and by userspace, allowing plenty of ring space for
    notifications to be added whilst preventing userspace from being able to
    pin too much unswappable kernel space.

    Signed-off-by: David Howells

    David Howells
     

31 Oct, 2019

1 commit

  • Convert pipes to use head and tail pointers for the buffer ring rather than
    pointer and length as the latter requires two atomic ops to update (or a
    combined op) whereas the former only requires one.

    (1) The head pointer is the point at which production occurs and points to
    the slot in which the next buffer will be placed. This is equivalent
    to pipe->curbuf + pipe->nrbufs.

    The head pointer belongs to the write-side.

    (2) The tail pointer is the point at which consumption occurs. It points
    to the next slot to be consumed. This is equivalent to pipe->curbuf.

    The tail pointer belongs to the read-side.

    (3) head and tail are allowed to run to UINT_MAX and wrap naturally. They
    are only masked off when the array is being accessed, e.g.:

    pipe->bufs[head & mask]

    This means that it is not necessary to have a dead slot in the ring as
    head == tail isn't ambiguous.

    (4) The ring is empty if "head == tail".

    A helper, pipe_empty(), is provided for this.

    (5) The occupancy of the ring is "head - tail".

    A helper, pipe_occupancy(), is provided for this.

    (6) The number of free slots in the ring is "pipe->ring_size - occupancy".

    A helper, pipe_space_for_user() is provided to indicate how many slots
    userspace may use.

    (7) The ring is full if "head - tail >= pipe->ring_size".

    A helper, pipe_full(), is provided for this.

    Signed-off-by: David Howells

    David Howells
     

23 Oct, 2019

1 commit

  • There are two code locations that implement the SG_IO ioctl: the old
    sg.c driver, and the generic scsi_ioctl helper that is in turn used by
    multiple drivers.

    To eradicate the old compat_ioctl conversion handler for the SG_IO
    command, I implement a readable pair of put_sg_io_hdr()/get_sg_io_hdr()
    helper functions that can be used for both compat and native mode,
    and then I call this from both drivers.

    For the iovec handling, there is already a compat_import_iovec() function
    that can simply be called in place of import_iovec().

    To avoid having to pass the compat/native state through multiple
    indirections, I mark the SG_IO command itself as compatible in
    fs/compat_ioctl.c and use in_compat_syscall() to figure out where
    we are called from.

    As a side-effect of this, the sg.c driver now also accepts the 32-bit
    sg_io_hdr format in compat mode using the read/write interface, not
    just ioctl. This should improve compatiblity with old 32-bit binaries,
    but it would break if any application intentionally passes the 64-bit
    data structure in compat mode here.

    Steffen Maier helped debug an issue in an earlier version of this patch.

    Cc: Steffen Maier
    Cc: linux-scsi@vger.kernel.org
    Cc: Doug Gilbert
    Cc: "James E.J. Bottomley"
    Cc: "Martin K. Petersen"
    Signed-off-by: Arnd Bergmann

    Arnd Bergmann
     

25 Sep, 2019

1 commit

  • Patch series "Make working with compound pages easier", v2.

    These three patches add three helpers and convert the appropriate
    places to use them.

    This patch (of 3):

    It's unnecessarily hard to find out the size of a potentially huge page.
    Replace 'PAGE_SIZE << compound_order(page)' with page_size(page).

    Link: http://lkml.kernel.org/r/20190721104612.19120-2-willy@infradead.org
    Signed-off-by: Matthew Wilcox (Oracle)
    Acked-by: Michal Hocko
    Reviewed-by: Andrew Morton
    Reviewed-by: Ira Weiny
    Acked-by: Kirill A. Shutemov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox (Oracle)
     


21 May, 2019

1 commit

  • Add SPDX license identifiers to all files which:

    - Have no license information of any form

    - Have EXPORT_.*_SYMBOL_GPL inside which was used in the
    initial scan/conversion to ignore the file

    These files fall under the project license, GPL v2 only. The resulting SPDX
    license identifier is:

    GPL-2.0-only

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

15 May, 2019

1 commit

  • To facilitate additional options to get_user_pages_fast() change the
    singular write parameter to be gup_flags.

    This patch does not change any functionality. New functionality will
    follow in subsequent patches.

    Some of the get_user_pages_fast() call sites were unchanged because they
    already passed FOLL_WRITE or 0 for the write parameter.

    NOTE: It was suggested to change the ordering of the get_user_pages_fast()
    arguments to ensure that callers were converted. This breaks the current
    GUP call site convention of having the returned pages be the final
    parameter. So the suggestion was rejected.

    Link: http://lkml.kernel.org/r/20190328084422.29911-4-ira.weiny@intel.com
    Link: http://lkml.kernel.org/r/20190317183438.2057-4-ira.weiny@intel.com
    Signed-off-by: Ira Weiny
    Reviewed-by: Mike Marshall
    Cc: Aneesh Kumar K.V
    Cc: Benjamin Herrenschmidt
    Cc: Borislav Petkov
    Cc: Dan Williams
    Cc: "David S. Miller"
    Cc: Heiko Carstens
    Cc: Ingo Molnar
    Cc: James Hogan
    Cc: Jason Gunthorpe
    Cc: John Hubbard
    Cc: "Kirill A. Shutemov"
    Cc: Martin Schwidefsky
    Cc: Michal Hocko
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Ralf Baechle
    Cc: Rich Felker
    Cc: Thomas Gleixner
    Cc: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ira Weiny
     

04 Apr, 2019

1 commit

  • If CONFIG_CRYPTO is not set, or is set to m,
    gcc warns when building:

    lib/iov_iter.o: In function `hash_and_copy_to_iter':
    iov_iter.c:(.text+0x9129): undefined reference to `crypto_stats_get'
    iov_iter.c:(.text+0x9152): undefined reference to `crypto_stats_ahash_update'

    Reported-by: Hulk Robot
    Fixes: d05f443554b3 ("iov_iter: introduce hash_and_copy_to_iter helper")
    Suggested-by: Al Viro
    Signed-off-by: YueHaibing
    Signed-off-by: Al Viro

    YueHaibing
     

27 Feb, 2019

1 commit

  • Avoid cache line miss dereferencing struct page if we can.

    page_copy_sane() mostly deals with order-0 pages.

    Extra cache line miss is visible on TCP recvmsg() calls dealing
    with GRO packets (typically 45 page frags are attached to one skb).

    Bringing the 45 struct pages into cpu cache while copying the data
    is not free, since the freeing of the skb (and associated
    page frags put_page()) can happen after cache lines have been evicted.

    Signed-off-by: Eric Dumazet
    Cc: Al Viro
    Signed-off-by: Al Viro

    Eric Dumazet
     


04 Jan, 2019

1 commit

  • Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument
    of the user address range verification function since we got rid of the
    old racy i386-only code to walk page tables by hand.

    It existed because the original 80386 would not honor the write protect
    bit when in kernel mode, so you had to do COW by hand before doing any
    user access. But we haven't supported that in a long time, and these
    days the 'type' argument is a purely historical artifact.

    A discussion about extending 'user_access_begin()' to do the range
    checking resulted in this patch, because there is no way we're going to
    move the old VERIFY_xyz interface to that model. And it's best done at
    the end of the merge window when I've done most of my merges, so let's
    just get this done once and for all.

    This patch was mostly done with a sed-script, with manual fix-ups for
    the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form.

    There were a couple of notable cases:

    - csky still had the old "verify_area()" name as an alias.

    - the iter_iov code had magical hardcoded knowledge of the actual
    values of VERIFY_{READ,WRITE} (not that they mattered, since nothing
    really used it)

    - microblaze used the type argument for a debug printout

    but other than those oddities this should be a total no-op patch.

    I tried to fix up all architectures, did fairly extensive grepping for
    access_ok() uses, and the changes are trivial, but I may have missed
    something. Any missed conversion should be trivially fixable, though.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

13 Dec, 2018

2 commits

  • Allow consumers that want to use iov iterator helpers and also update
    a predefined hash calculation online when copying data. This is useful
    when copying incoming network buffers to a local iterator and calculate
    a digest on the incoming stream. nvme-tcp host driver that will be
    introduced in following patches is the first consumer via
    skb_copy_and_hash_datagram_iter.

    Acked-by: David S. Miller
    Signed-off-by: Sagi Grimberg
    Signed-off-by: Christoph Hellwig

    Sagi Grimberg
     
  • The single caller to csum_and_copy_to_iter is skb_copy_and_csum_datagram
    and we are trying to unite its logic with skb_copy_datagram_iter by passing
    a callback to the copy function that we want to apply. Thus, we need
    to make the checksum pointer private to the function.

    Acked-by: David S. Miller
    Signed-off-by: Sagi Grimberg
    Signed-off-by: Christoph Hellwig

    Sagi Grimberg
     


24 Oct, 2018

3 commits

  • Add a new iterator, ITER_DISCARD, that can only be used in READ mode and
    just discards any data copied to it.

    This is useful in a network filesystem for discarding any unwanted data
    sent by a server.

    Signed-off-by: David Howells

    David Howells
     
  • In the iov_iter struct, separate the iterator type from the iterator
    direction and use accessor functions to access them in most places.

    Convert a bunch of places to use switch-statements to access them rather
    than chains of bitwise-AND statements. This makes it easier to add further
    iterator types. It can also be more efficient: to implement a switch over
    small contiguous integers, the compiler can use ~50% fewer compare
    instructions than it needs for the equivalent bitwise-AND tests.

    Further, cease passing the iterator type into the iterator setup function.
    The iterator function can set that itself. Only the direction is required.

    Signed-off-by: David Howells

    David Howells
     
  • Use accessor functions to access an iterator's type and direction. This
    allows for the possibility of using some other method of determining the
    type of iterator than if-chains with bitwise-AND conditions.

    Signed-off-by: David Howells

    David Howells
     

16 Jul, 2018

1 commit

  • By mistake the ITER_PIPE early-exit / warning from copy_from_iter() was
    cargo-culted in _copy_to_iter_mcsafe() rather than a machine-check-safe
    version of copy_to_iter_pipe().

    Implement copy_pipe_to_iter_mcsafe() being careful to return the
    indication of short copies due to a CPU exception.

    Without this regression-fix all splice reads to dax-mode files fail.

    Reported-by: Ross Zwisler
    Tested-by: Ross Zwisler
    Signed-off-by: Dan Williams
    Acked-by: Al Viro
    Cc: Andrew Morton
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tony Luck
    Fixes: 8780356ef630 ("x86/asm/memcpy_mcsafe: Define copy_to_iter_mcsafe()")
    Link: http://lkml.kernel.org/r/153108277278.37979.3327916996902264102.stgit@dwillia2-desk3.amr.corp.intel.com
    Signed-off-by: Ingo Molnar

    Dan Williams