21 Sep, 2017

1 commit

  • The issue is that if the data crosses a page boundary inside a compound
    page, this check will incorrectly trigger a WARN_ON.

    To fix this, compute the order using the head of the compound page and
    adjust the offset to be relative to that head.
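
    The fix, roughly, in lib/iov_iter.c (a sketch along the lines of the
    actual page_copy_sane() change; details may differ by kernel version):

        static bool page_copy_sane(struct page *page, size_t offset, size_t n)
        {
                /* base the bounds check on the compound head, not a tail page */
                struct page *head = compound_head(page);
                size_t v = n + offset + PAGE_SIZE * (page - head);

                if (likely(n <= v &&
                           v <= (PAGE_SIZE << compound_order(head))))
                        return true;
                WARN_ON(1);
                return false;
        }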

    Fixes: 72e809ed81ed ("iov_iter: sanity checks for copy to/from page
    primitives")

    Signed-off-by: Petar Penkov
    CC: Al Viro
    CC: Eric Dumazet
    Signed-off-by: Al Viro

    Petar Penkov

08 Jul, 2017

1 commit

  • Pull iov_iter hardening from Al Viro:
    "This is the iov_iter/uaccess/hardening pile.

    For one thing, it trims the inline part of copy_to_user/copy_from_user
    to the minimum that *does* need to be inlined - object size checks,
    basically. For another, it sanitizes the checks for iov_iter
    primitives. There are 4 groups of checks: access_ok(), might_fault(),
    object size and KASAN.

    - access_ok() had been verified by whoever had set the iov_iter up.
    However, that has happened in a function far away, so proving that
    there's no path to actual copying bypassing those checks is hard
    and proving that iov_iter has not been buggered in the meanwhile is
    also not pleasant. So we want those redone in actual
    copyin/copyout.

    - might_fault() is better off consolidated - we know whether it needs
    to be checked as soon as we enter iov_iter primitive and observe
    the iov_iter flavour. No need to wait until the copyin/copyout. The
    call chains are short enough to make sure we won't miss anything -
    in fact, it's more robust that way, since there are cases where we
    do e.g. forced fault-in before getting to copyin/copyout. It's not
    quite what we need to check (in particular, combination of
    iovec-backed and set_fs(KERNEL_DS) is almost certainly a bug, not a
    cause to skip checks), but that's for later series. For now let's
    keep might_fault().

    - KASAN checks belong in copyin/copyout - at the same level where
    other iov_iter flavours would've hit them in memcpy().

    - object size checks should apply to *all* iov_iter flavours, not
    just iovec-backed ones.

    There are two groups of primitives - one gets the kernel object
    described as pointer + size (copy_to_iter(), etc.) while another gets
    it as page + offset + size (copy_page_to_iter(), etc.)

    For the first group the checks are best done where we actually have a
    chance to find the object size. In other words, those belong in inline
    wrappers in uio.h, before calling into iov_iter.c. Same kind as we
    have for inlined part of copy_to_user().

    For the second group there is no object to look at - offset in page is
    just a number, it bears no type information. So we do them in the
    common helper called by iov_iter.c primitives of that kind. All it
    currently does is checking that we are not trying to access outside of
    the compound page; eventually we might want to add some sanity checks
    on the page involved.

    So the things we need in copyin/copyout part of iov_iter.c do not
    quite match anything in uaccess.h (we want no zeroing, we *do* want
    access_ok() and KASAN and we want no might_fault() or object size
    checks done on that level). OTOH, these needs are simple enough to
    provide a couple of helpers (static in iov_iter.c) doing just what we
    need..."

    * 'uaccess-work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    iov_iter: saner checks on copyin/copyout
    iov_iter: sanity checks for copy to/from page primitives
    iov_iter/hardening: move object size checks to inlined part
    copy_{to,from}_user(): consolidate object size checks
    copy_{from,to}_user(): move kasan checks and might_fault() out-of-line
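
    For the first group above, the inlined object-size check lands in
    include/linux/uio.h along these lines (a sketch, not the verbatim
    header; check_copy_size() is the helper consolidated by this series):

        static __always_inline __must_check
        size_t copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
        {
                /* object-size check done where the object's type is known */
                if (unlikely(!check_copy_size(addr, bytes, true)))
                        return 0;
                return _copy_to_iter(addr, bytes, i);
        }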

    Linus Torvalds

07 Jul, 2017

1 commit

  • * might_fault() is better checked in caller (and e.g. fault-in + kmap_atomic
    codepath also needs might_fault() coverage)
    * we have already done object size checks
    * we have *NOT* done access_ok() recently enough; we rely upon the
    iovec array having passed sanity checks back when it had been created
    and nothing having buggered it since. However, that's very much
    non-local, so we'd better recheck that.

    So the thing we want does not match anything in uaccess - we need
    access_ok + kasan checks + raw copy without any zeroing. Just define
    such helpers and use them here.
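
    The helpers end up as a static pair in lib/iov_iter.c, roughly (a
    sketch; access_ok() is shown in its pre-4.20 form that still takes a
    VERIFY_* argument):

        static int copyout(void __user *to, const void *from, size_t n)
        {
                if (access_ok(VERIFY_WRITE, to, n)) {
                        kasan_check_read(from, n);
                        n = raw_copy_to_user(to, from, n); /* no zeroing */
                }
                return n;
        }

        static int copyin(void *to, const void __user *from, size_t n)
        {
                if (access_ok(VERIFY_READ, from, n)) {
                        kasan_check_write(to, n);
                        n = raw_copy_from_user(to, from, n);
                }
                return n;
        }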

    Signed-off-by: Al Viro

    Al Viro

10 Jun, 2017

1 commit

  • The pmem driver has a need to transfer data with a persistent memory
    destination and be able to rely on the fact that the destination writes are not
    cached. It is sufficient for the writes to be flushed to a cpu-store-buffer
    (non-temporal / "movnt" in x86 terms), as we expect userspace to call fsync()
    to ensure data-writes have reached a power-fail-safe zone in the platform. The
    fsync() triggers a REQ_FUA or REQ_FLUSH to the pmem driver which will turn
    around and fence previous writes with an "sfence".

    Implement __copy_from_user_inatomic_flushcache(), memcpy_page_flushcache(),
    and memcpy_flushcache(), which guarantee that the destination buffer is not
    dirty in the cpu cache on completion. The new copy_from_iter_flushcache()
    and sub-routines will be used to replace the "pmem api" (include/linux/pmem.h +
    arch/x86/include/asm/pmem.h). The availability of copy_from_iter_flushcache()
    and memcpy_flushcache() is gated by the CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
    config symbol, with fallbacks to copy_from_iter_nocache() and plain memcpy()
    otherwise.
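
    The gating in include/linux/uio.h takes roughly this shape (a sketch
    of the fallback arrangement; exact declarations may differ):

        #ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
        /* arch supplies a cache-bypassing copy (x86: movnt stores + sfence) */
        size_t copy_from_iter_flushcache(void *addr, size_t bytes,
                                         struct iov_iter *i);
        #else
        static inline size_t copy_from_iter_flushcache(void *addr, size_t bytes,
                                                       struct iov_iter *i)
        {
                return copy_from_iter_nocache(addr, bytes, i);
        }
        #endif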

    This is meant to satisfy the concern from Linus that if a driver wants to do
    something beyond the normal nocache semantics it should be something private to
    that driver [1], and Al's concern that anything uaccess related belongs with
    the rest of the uaccess code [2].

    The first consumer of this interface is a new 'copy_from_iter' dax operation so
    that pmem can inject cache maintenance operations without imposing this
    overhead on other dax-capable drivers.

    [1]: https://lists.01.org/pipermail/linux-nvdimm/2017-January/008364.html
    [2]: https://lists.01.org/pipermail/linux-nvdimm/2017-April/009942.html

    Cc: Jan Kara
    Cc: Jeff Moyer
    Cc: Ingo Molnar
    Cc: Christoph Hellwig
    Cc: Toshi Kani
    Cc: "H. Peter Anvin"
    Cc: Al Viro
    Cc: Thomas Gleixner
    Cc: Matthew Wilcox
    Reviewed-by: Ross Zwisler
    Signed-off-by: Dan Williams

    Dan Williams

09 May, 2017

2 commits

  • There are many code paths opencoding kvmalloc. Let's use the helper
    instead. The main difference to kvmalloc is that those users are
    usually not considering all the aspects of the memory allocator. E.g.
    allocation requests …
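
    A typical conversion in the series looks like this (an illustrative
    sketch; buf and size are placeholders):

        /* before: open-coded kmalloc-with-vmalloc-fallback */
        buf = kmalloc(size, GFP_KERNEL | __GFP_NOWARN);
        if (!buf)
                buf = vmalloc(size);

        /* after: one helper; kvfree() frees either kind of allocation */
        buf = kvmalloc(size, GFP_KERNEL);
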
    Reviewed-by: Boris Ostrovsky # Xen bits
    Acked-by: Kees Cook
    Acked-by: Vlastimil Babka
    Acked-by: Andreas Dilger # Lustre
    Acked-by: Christian Borntraeger # KVM/s390
    Acked-by: Dan Williams # nvdim
    Acked-by: David Sterba # btrfs
    Acked-by: Ilya Dryomov # Ceph
    Acked-by: Tariq Toukan # mlx4
    Acked-by: Leon Romanovsky # mlx5
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: Herbert Xu
    Cc: Anton Vorontsov
    Cc: Colin Cross
    Cc: Tony Luck
    Cc: "Rafael J. Wysocki"
    Cc: Ben Skeggs
    Cc: Kent Overstreet
    Cc: Santosh Raspatur
    Cc: Hariprasad S
    Cc: Yishai Hadas
    Cc: Oleg Drokin
    Cc: "Yan, Zheng"
    Cc: Alexander Viro
    Cc: Alexei Starovoitov
    Cc: Eric Dumazet
    Cc: David Miller
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michal Hocko
  • Wrong sign of iov_iter_revert() argument. Unfortunately, it slipped
    through testing, since most of the time we don't do anything to the
    iterator afterwards, and a potential oops on walking iter->iov too
    far backwards is too infrequent to be easily triggered.

    Add a sanity check in iov_iter_revert() to catch bugs like this one;
    fortunately, the same braino hadn't happened in other callers, but we'd
    better have a warning if such thing crops up.
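
    The check amounts to something like this at the top of
    iov_iter_revert() (a sketch; a "negative" amount from a buggy caller
    shows up as a huge size_t, far above MAX_RW_COUNT):

        void iov_iter_revert(struct iov_iter *i, size_t unroll)
        {
                if (!unroll)
                        return;
                /* a sign error in the caller produces an enormous unroll */
                if (WARN_ON(unroll > MAX_RW_COUNT))
                        return;
                /* ... walk the iterator backwards ... */
        }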

    Signed-off-by: Al Viro

    Al Viro

02 May, 2017

1 commit

  • Pull uaccess unification updates from Al Viro:
    "This is the uaccess unification pile. It's _not_ the end of uaccess
    work, but the next batch of that will go into the next cycle. This one
    mostly takes copy_from_user() and friends out of arch/* and gets the
    zero-padding behaviour in sync for all architectures.

    Dealing with the nocache/writethrough mess is for the next cycle;
    fortunately, that's x86-only. Same for cleanups in iov_iter.c (I am
    sold on access_ok() in there, BTW; just not in this pile), same for
    reducing __copy_... callsites, strn*... stuff, etc. - there will be a
    pile about as large as this one in the next merge window.

    This one sat in -next for weeks. -3KLoC"

    * 'work.uaccess' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (96 commits)
    HAVE_ARCH_HARDENED_USERCOPY is unconditional now
    CONFIG_ARCH_HAS_RAW_COPY_USER is unconditional now
    m32r: switch to RAW_COPY_USER
    hexagon: switch to RAW_COPY_USER
    microblaze: switch to RAW_COPY_USER
    get rid of padding, switch to RAW_COPY_USER
    ia64: get rid of copy_in_user()
    ia64: sanitize __access_ok()
    ia64: get rid of 'segment' argument of __do_{get,put}_user()
    ia64: get rid of 'segment' argument of __{get,put}_user_check()
    ia64: add extable.h
    powerpc: get rid of zeroing, switch to RAW_COPY_USER
    esas2r: don't open-code memdup_user()
    alpha: fix stack smashing in old_adjtimex(2)
    don't open-code kernel_setsockopt()
    mips: switch to RAW_COPY_USER
    mips: get rid of tail-zeroing in primitives
    mips: make copy_from_user() zero tail explicitly
    mips: clean and reorder the forest of macros...
    mips: consolidate __invoke_... wrappers
    ...

    Linus Torvalds

15 Jan, 2017

1 commit

  • The logic in pipe_advance() used to release all buffers past the new
    position failed in cases when the number of buffers to release was
    equal to pipe->buffers. If that happened, none of them had been
    released, leaving the pipe full. Worse, it was trivial to trigger, and
    we ended up with a pipe full of uninitialized pages. IOW, it's an
    infoleak.

    Cc: stable@vger.kernel.org # v4.9
    Reported-by: "Alan J. Wylie"
    Tested-by: "Alan J. Wylie"
    Signed-off-by: Al Viro

    Al Viro

23 Dec, 2016

1 commit

  • The problem is similar to the ones dealt with in "fold checks into
    iterate_and_advance()" and its followups, except that in this case we
    really want to do nothing when asked for a zero-length operation -
    unlike zero-length iterate_and_advance(), zero-length
    iterate_all_kinds() has no side effects, and callers are simpler
    that way.

    That got exposed when copy_from_iter_full() had been used by tipc, which
    builds an msghdr with zero payload and (now) feeds it to a primitive
    based on iterate_all_kinds() instead of iterate_and_advance().

    Reported-by: Jon Maloy
    Tested-by: Jon Maloy
    Signed-off-by: Al Viro

    Al Viro

17 Dec, 2016

1 commit

  • Pull vfs updates from Al Viro:

    - more ->d_init() stuff (work.dcache)

    - pathname resolution cleanups (work.namei)

    - a few missing iov_iter primitives - copy_from_iter_full() and
    friends. Either copy the full requested amount, advance the iterator
    and return true, or fail, return false and do _not_ advance the
    iterator. Quite a few open-coded callers converted (and became more
    readable and harder to fuck up that way) (work.iov_iter)

    - several assorted patches, the big one being logfs removal

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    logfs: remove from tree
    vfs: fix put_compat_statfs64() does not handle errors
    namei: fold should_follow_link() with the step into not-followed link
    namei: pass both WALK_GET and WALK_MORE to should_follow_link()
    namei: invert WALK_PUT logics
    namei: shift interpretation of LOOKUP_FOLLOW inside should_follow_link()
    namei: saner calling conventions for mountpoint_last()
    namei.c: get rid of user_path_parent()
    switch getfrag callbacks to ..._full() primitives
    make skb_add_data,{_nocache}() and skb_copy_to_page_nocache() advance only on success
    [iov_iter] new primitives - copy_from_iter_full() and friends
    don't open-code file_inode()
    ceph: switch to use of ->d_init()
    ceph: unify dentry_operations instances
    lustre: switch to use of ->d_init()

    Linus Torvalds

14 Dec, 2016

1 commit

  • Pull block layer updates from Jens Axboe:
    "This is the main block pull request this series. Contrary to previous
    release, I've kept the core and driver changes in the same branch. We
    always ended up having dependencies between the two for obvious
    reasons, so makes more sense to keep them together. That said, I'll
    probably try and keep more topical branches going forward, especially
    for cycles that end up being as busy as this one.

    The major parts of this pull request is:

    - Improved support for O_DIRECT on block devices, with a small
    private implementation instead of using the pig that is
    fs/direct-io.c. From Christoph.

    - Request completion tracking in a scalable fashion. This is utilized
    by two components in this pull, the new hybrid polling and the
    writeback queue throttling code.

    - Improved support for polling with O_DIRECT, adding a hybrid mode
    that combines pure polling with an initial sleep. From me.

    - Support for automatic throttling of writeback queues on the block
    side. This uses feedback from the device completion latencies to
    scale the queue on the block side up or down. From me.

    - Support for SMR drives in the block layer and for SD. From Hannes
    and Shaun.

    - Multi-connection support for nbd. From Josef.

    - Cleanup of request and bio flags, so we have a clear split between
    which are bio (or rq) private, and which ones are shared. From
    Christoph.

    - A set of patches from Bart, that improve how we handle queue
    stopping and starting in blk-mq.

    - Support for WRITE_ZEROES from Chaitanya.

    - Lightnvm updates from Javier/Matias.

    - Support for FC for the nvme-over-fabrics code. From James Smart.

    - A bunch of fixes from a whole slew of people, too many to name
    here"

    * 'for-4.10/block' of git://git.kernel.dk/linux-block: (182 commits)
    blk-stat: fix a few cases of missing batch flushing
    blk-flush: run the queue when inserting blk-mq flush
    elevator: make the rqhash helpers exported
    blk-mq: abstract out blk_mq_dispatch_rq_list() helper
    blk-mq: add blk_mq_start_stopped_hw_queue()
    block: improve handling of the magic discard payload
    blk-wbt: don't throttle discard or write zeroes
    nbd: use dev_err_ratelimited in io path
    nbd: reset the setup task for NBD_CLEAR_SOCK
    nvme-fabrics: Add FC LLDD loopback driver to test FC-NVME
    nvme-fabrics: Add target support for FC transport
    nvme-fabrics: Add host support for FC transport
    nvme-fabrics: Add FC transport LLDD api definitions
    nvme-fabrics: Add FC transport FC-NVME definitions
    nvme-fabrics: Add FC transport error codes to nvme.h
    Add type 0x28 NVME type code to scsi fc headers
    nvme-fabrics: patch target code in prep for FC transport support
    nvme-fabrics: set sqe.command_id in core not transports
    parser: add u64 number parser
    nvme-rdma: align to generic ib_event logging helper
    ...

    Linus Torvalds

06 Dec, 2016

1 commit

  • copy_from_iter_full(), copy_from_iter_full_nocache() and
    csum_and_copy_from_iter_full() - counterparts of copy_from_iter()
    et al., advancing the iterator only in case of a successful full copy
    and returning whether it had been successful or not.

    Convert some obvious users. *NOTE* - do not blindly assume that
    something is a good candidate for those unless you are sure that
    not advancing iov_iter in failure case is the right thing in
    this case. Anything that does short read/short write kind of
    stuff (or is in a loop, etc.) is unlikely to be a good one.
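
    A converted caller goes from comparing lengths to a boolean test,
    roughly (an illustrative sketch; hdr is a placeholder):

        /* before: the iterator may be left advanced partway on failure */
        if (copy_from_iter(&hdr, sizeof(hdr), from) != sizeof(hdr))
                return -EFAULT;

        /* after: the iterator is not advanced at all on failure */
        if (!copy_from_iter_full(&hdr, sizeof(hdr), from))
                return -EFAULT;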

    Signed-off-by: Al Viro

    Al Viro

17 Nov, 2016

1 commit

  • iov_iter_advance() needs to decrement iter->count by the number of
    bytes we'd moved beyond. Normal flavours do that, but ITER_PIPE
    doesn't; as a result, generic_file_read_iter() for O_DIRECT files
    ends up with a bogus fallback to a page cache read, resulting in
    incorrect values for file offset and bytes read.

    Signed-off-by: Abhi Das
    Signed-off-by: Al Viro

    Abhi Das

15 Oct, 2016

1 commit

  • Both import_iovec() and rw_copy_check_uvector() take an array
    (typically small and on-stack) which is used to hold an iovec array copy
    from userspace. This is to avoid an expensive memory allocation in the
    fast path (i.e. few iovec elements).

    The caller may have to check whether these functions actually used
    the provided buffer or allocated a new one -- but this differs between
    the two. Let's just add kernel-doc to clarify what the semantics are
    for each function.
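
    For import_iovec() the resulting convention is simple (a sketch; uvec
    and nr_segs stand for the user-supplied vector):

        struct iovec iovstack[UIO_FASTIOV], *iov = iovstack;
        struct iov_iter iter;
        int err;

        err = import_iovec(READ, uvec, nr_segs, ARRAY_SIZE(iovstack),
                           &iov, &iter);
        if (err < 0)
                return err;
        /* ... use iter ... */
        kfree(iov);     /* NULL when the on-stack array was used - safe */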

    Signed-off-by: Vegard Nossum
    Signed-off-by: Al Viro

    Vegard Nossum

11 Oct, 2016

1 commit

  • Pull misc vfs updates from Al Viro:
    "Assorted misc bits and pieces.

    There are several single-topic branches left after this (rename2
    series from Miklos, current_time series from Deepa Dinamani, xattr
    series from Andreas, uaccess stuff from me) and I'd prefer to
    send those separately"

    * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (39 commits)
    proc: switch auxv to use of __mem_open()
    hpfs: support FIEMAP
    cifs: get rid of unused arguments of CIFSSMBWrite()
    posix_acl: uapi header split
    posix_acl: xattr representation cleanups
    fs/aio.c: eliminate redundant loads in put_aio_ring_file
    fs/internal.h: add const to ns_dentry_operations declaration
    compat: remove compat_printk()
    fs/buffer.c: make __getblk_slow() static
    proc: unsigned file descriptors
    fs/file: more unsigned file descriptors
    fs: compat: remove redundant check of nr_segs
    cachefiles: Fix attempt to read i_blocks after deleting file [ver #2]
    cifs: don't use memcpy() to copy struct iov_iter
    get rid of separate multipage fault-in primitives
    fs: Avoid premature clearing of capabilities
    fs: Give dentry to inode_change_ok() instead of inode
    fuse: Propagate dentry down to inode_change_ok()
    ceph: Propagate dentry down to inode_change_ok()
    xfs: Propagate dentry down to inode_change_ok()
    ...

    Linus Torvalds

06 Oct, 2016

2 commits

  • Signed-off-by: Miklos Szeredi
    Signed-off-by: Al Viro

    Miklos Szeredi
  • iov_iter variant for passing data into pipe. copy_to_iter()
    copies data into page(s) it has allocated and stuffs them into
    the pipe; copy_page_to_iter() stuffs there a reference to the
    page given to it. Both will try to coalesce if possible.
    iov_iter_zero() is similar to copy_to_iter(); iov_iter_get_pages()
    and friends will do as copy_to_iter() would have and return the
    pages where the data would've been copied. iov_iter_advance()
    will truncate everything past the spot it has advanced to.

    New primitive: iov_iter_pipe(), used for initializing those.
    pipe should be locked all along.

    Running out of space acts as fault would for iovec-backed ones;
    in other words, giving it to ->read_iter() may result in short
    read if the pipe overflows, or -EFAULT if it happens with nothing
    copied there.

    In other words, ->read_iter() on those acts pretty much like
    ->splice_read(). Moreover, all generic_file_splice_read() users,
    as well as many other ->splice_read() instances can be switched
    to that scheme - that'll happen in the next commit.
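
    The usage pattern generic_file_splice_read() switches to looks roughly
    like this (a sketch under the 4.9-era signatures, where the direction
    argument is ITER_PIPE | READ; in, pipe, len and ppos come from the
    ->splice_read() arguments):

        struct iov_iter to;
        struct kiocb kiocb;
        ssize_t ret;

        iov_iter_pipe(&to, ITER_PIPE | READ, pipe, len);
        init_sync_kiocb(&kiocb, in);
        kiocb.ki_pos = *ppos;
        ret = in->f_op->read_iter(&kiocb, &to); /* may be short on overflow */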

    Signed-off-by: Al Viro

    Al Viro

28 Sep, 2016

1 commit

  • * the only remaining callers of "short" fault-ins are just as happy with generic
    variants (both in lib/iov_iter.c); switch them to multipage variants, kill the
    "short" ones
    * rename the multipage variants to now available plain ones.
    * get rid of compat macro defining iov_iter_fault_in_multipage_readable by
    expanding it in its only user.

    Signed-off-by: Al Viro

    Al Viro

29 Jul, 2016

1 commit

  • copy_page_to_iter_iovec() and copy_page_from_iter_iovec() copy some data
    to userspace or from userspace. These functions have a fast path where
    they map a page using kmap_atomic and a slow path where they use kmap.

    kmap is slower than kmap_atomic, so the fast path is preferred.

    However, on kernels without highmem support, kmap just calls
    page_address, so there is no need to avoid kmap. On kernels without
    highmem support, the fast path just increases code size (and cache
    footprint) and it doesn't improve copy performance in any way.

    This patch enables the fast path only if CONFIG_HIGHMEM is defined.
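
    The shape of the change, roughly (a sketch of the copy-to-user side;
    the real helpers also recover from partial faults mid-copy):

        size_t left;
        void *kaddr;

        if (IS_ENABLED(CONFIG_HIGHMEM)) {
                /* fast path: atomic mapping, may fail on a pending fault */
                kaddr = kmap_atomic(page);
                left = __copy_to_user_inatomic(buf, kaddr + offset, copy);
                kunmap_atomic(kaddr);
                if (likely(!left))
                        return copy;
        }

        /* slow path: without highmem, kmap() is just page_address() */
        kaddr = kmap(page);
        left = __copy_to_user(buf, kaddr + offset, copy);
        kunmap(page);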

    Code size reduced by this patch (bytes):
    x86 (without highmem)    928
    x86-64                   960
    sparc64                  848
    alpha                   1136
    pa-risc                 1200

    [akpm@linux-foundation.org: use IS_ENABLED(), per Andi]
    Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1607221711410.4818@file01.intranet.prod.int.rdu2.redhat.com
    Signed-off-by: Mikulas Patocka
    Cc: Hugh Dickins
    Cc: Michal Hocko
    Cc: Alexander Viro
    Cc: Mel Gorman
    Cc: Johannes Weiner
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mikulas Patocka

10 Jun, 2016

1 commit

  • bvec has had a native, mature iterator for a long time, so there is
    no need to reinvent the wheel for iterating over bvecs in
    lib/iov_iter.c.

    Two ITER_BVEC test cases were run:
    - xfstest (-g auto) on loop dio/aio: no regression found
    - a swap file works well under extreme stress (stress-ng --all 64 -t
    800 -v); lots of OOMs are triggered, and the whole system still
    survives
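
    The conversion drives the bvec walk with the iterator from
    include/linux/bvec.h, roughly (a sketch; i is the iov_iter whose
    ITER_BVEC payload is an array of struct bio_vec):

        const struct bio_vec *bvec = i->bvec;
        struct bvec_iter start = {
                .bi_size        = i->count,
                .bi_bvec_done   = i->iov_offset,
        };
        struct bvec_iter iter;
        struct bio_vec v;

        for_each_bvec(v, bvec, iter, start) {
                /* v.bv_page, v.bv_offset, v.bv_len describe one segment */
        }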

    Reviewed-by: Christoph Hellwig
    Signed-off-by: Ming Lei
    Tested-by: Hannes Reinecke
    Signed-off-by: Jens Axboe

    Ming Lei

26 May, 2016

2 commits

  • Pull vfs iov_iter regression fix from Al Viro:
    "Fix for braino in 'fold checks into iterate_and_advance()'"

    * 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    do "fold checks into iterate_and_advance()" right

    Linus Torvalds
  • the only case when we should skip the iterate_and_advance() guts
    is when nothing's left in the iterator, _not_ just when the requested
    amount is 0. Said guts will do nothing in the latter case anyway;
    the problem we tried to deal with in the aforementioned commit is
    that when there's nothing left *and* the amount requested is 0,
    we might end up dereferencing one iovec too many; the value we fetch
    from there is discarded in that case, but theoretically it might
    oops if the iovec array ends exactly at the end of a page with the
    next page not mapped.

    Bailing out on zero size requested had an unexpected side effect -
    a zero-length segment at the beginning of the iovec array ended up
    throwing do_loop_readv_writev() into an infinite spin; we do not
    advance past the empty segment at all. The reproducer is trivial:

        echo '#include <sys/uio.h>' >a.c
        echo 'main() {char c; struct iovec v[] = {{&c,0},{&c,1}}; readv(0,v,2);}' >>a.c
        cc a.c && ./a.out

    Al Viro

10 May, 2016

1 commit

  • they are open-coded in all users except iov_iter_advance(), and there
    they wouldn't be a bad idea either - as it is, iov_iter_advance(i, 0)
    can end up dereferencing past the end of the iovec array. It does
    nothing with the value it reads, and is very unlikely to trigger an
    oops on that dereference, but it is not impossible.

    Reported-by: Jiri Slaby
    Reported-by: Takashi Iwai
    Signed-off-by: Al Viro

    Al Viro

30 Mar, 2015

1 commit

  • iovec-backed iov_iter instances are assumed to satisfy several properties:
    * no more than UIO_MAXIOV elements in iovec array
    * total size of all ranges is no more than MAX_RW_COUNT
    * all ranges pass access_ok().

    The problem is, invariants of data structures should be established in the
    primitives creating those data structures, not in the code using those
    primitives. And iov_iter_init() violates that principle. For a while we
    managed to get away with that, but once the use of iov_iter started to
    spread, it didn't take long for shit to hit the fan - missed check in
    sys_sendto() had introduced a roothole.

    We _do_ have primitives for importing and validating iovecs (both native and
    compat ones) and those primitives are almost always followed by shoving the
    resulting iovec into iov_iter. Life would be considerably simpler (and safer)
    if we combined those primitives with initializing iov_iter.

    That gives us two new primitives - import_iovec() and compat_import_iovec().
    Calling conventions:

        iovec = iov_array;
        err = import_iovec(direction, uvec, nr_segs,
                           ARRAY_SIZE(iov_array), &iovec, &iter);

    imports user vector into kernel space (into iov_array if it fits, allocated
    if it doesn't fit or if iovec was NULL), validates it and sets iter up to
    refer to it. On success 0 is returned and allocated kernel copy (or NULL
    if the array had fit into caller-supplied one) is returned via iovec.
    On failure all allocations are undone and -E... is returned. If the total
    size of ranges exceeds MAX_RW_COUNT, the excess is silently truncated.

    compat_import_iovec() expects uvec to be a pointer to user array of compat_iovec;
    otherwise it's identical to import_iovec().

    Finally, import_single_range() sets iov_iter backed by single-element iovec
    covering a user-supplied range -

        err = import_single_range(direction, address, size, iovec, &iter);

    does validation and sets iter up. Again, size in excess of MAX_RW_COUNT gets
    silently truncated.

    Next commits will be switching things over to using those and reducing
    the number of iov_iter_init() instances.

    Signed-off-by: Al Viro

    Al Viro