06 Feb, 2019

1 commit


29 Dec, 2018

5 commits

  • Merge misc updates from Andrew Morton:

    - large KASAN update to use arm's "software tag-based mode"

    - a few misc things

    - sh updates

    - ocfs2 updates

    - just about all of MM

    * emailed patches from Andrew Morton: (167 commits)
    kernel/fork.c: mark 'stack_vm_area' with __maybe_unused
    memcg, oom: notify on oom killer invocation from the charge path
    mm, swap: fix swapoff with KSM pages
    include/linux/gfp.h: fix typo
    mm/hmm: fix memremap.h, move dev_page_fault_t callback to hmm
    hugetlbfs: Use i_mmap_rwsem to fix page fault/truncate race
    hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization
    memory_hotplug: add missing newlines to debugging output
    mm: remove __hugepage_set_anon_rmap()
    include/linux/vmstat.h: remove unused page state adjustment macro
    mm/page_alloc.c: allow error injection
    mm: migrate: drop unused argument of migrate_page_move_mapping()
    blkdev: avoid migration stalls for blkdev pages
    mm: migrate: provide buffer_migrate_page_norefs()
    mm: migrate: move migrate_page_lock_buffers()
    mm: migrate: lock buffers before migrate_page_move_mapping()
    mm: migration: factor out code to compute expected number of page references
    mm, page_alloc: enable pcpu_drain with zone capability
    kmemleak: add config to select auto scan
    mm/page_alloc.c: don't call kasan_free_pages() at deferred mem init
    ...

    Linus Torvalds
     
  • Pull aio updates from Jens Axboe:
    "Flushing out pre-patches for the buffered/polled aio series. Some
    fixes in here, but also optimizations"

    * tag 'for-4.21/aio-20181221' of git://git.kernel.dk/linux-block:
    aio: abstract out io_event filler helper
    aio: split out iocb copy from io_submit_one()
    aio: use iocb_put() instead of open coding it
    aio: only use blk plugs for > 2 depth submissions
    aio: don't zero entire aio_kiocb aio_get_req()
    aio: separate out ring reservation from req allocation
    aio: use assigned completion handler

    Linus Torvalds
     
  • Pull block updates from Jens Axboe:
    "This is the main pull request for block/storage for 4.21.

    Larger than usual, it was a busy round with lots of goodies queued up.
    Most notable is the removal of the old IO stack, which has been a long
    time coming. No new features for a while, everything coming in this
    week has all been fixes for things that were previously merged.

    This contains:

    - Use atomic counters instead of semaphores for mtip32xx (Arnd)

    - Cleanup of the mtip32xx request setup (Christoph)

    - Fix for circular locking dependency in loop (Jan, Tetsuo)

    - bcache (Coly, Guoju, Shenghui)
    * Optimizations for writeback caching
    * Various fixes and improvements

    - nvme (Chaitanya, Christoph, Sagi, Jay, me, Keith)
    * host and target support for NVMe over TCP
    * Error log page support
    * Support for separate read/write/poll queues
    * Much improved polling
    * discard OOM fallback
    * Tracepoint improvements

    - lightnvm (Hans, Hua, Igor, Matias, Javier)
    * Igor added packed metadata to pblk. Now drives without metadata
    per LBA can be used as well.
    * Fix from Geert on uninitialized value on chunk metadata reads.
    * Fixes from Hans and Javier to pblk recovery and write path.
    * Fix from Hua Su to fix a race condition in the pblk recovery
    code.
    * Scan optimization added to pblk recovery from Zhoujie.
    * Small geometry cleanup from me.

    - Conversion of the last few drivers that used the legacy path to
    blk-mq (me)

    - Removal of legacy IO path in SCSI (me, Christoph)

    - Removal of legacy IO stack and schedulers (me)

    - Support for much better polling, now without interrupts at all.
    blk-mq adds support for multiple queue maps, which enables us to
    have a map per type. This in turn enables nvme to have separate
    completion queues for polling, which can then be interrupt-less.
    Also means we're ready for async polled IO, which is hopefully
    coming in the next release.

    - Killing of (now) unused block exports (Christoph)

    - Unification of the blk-rq-qos and blk-wbt wait handling (Josef)

    - Support for zoned testing with null_blk (Masato)

    - sx8 conversion to per-host tag sets (Christoph)

    - IO priority improvements (Damien)

    - mq-deadline zoned fix (Damien)

    - Ref count blkcg series (Dennis)

    - Lots of blk-mq improvements and speedups (me)

    - sbitmap scalability improvements (me)

    - Make core inflight IO accounting per-cpu (Mikulas)

    - Export timeout setting in sysfs (Weiping)

    - Cleanup the direct issue path (Jianchao)

    - Export blk-wbt internals in block debugfs for easier debugging
    (Ming)

    - Lots of other fixes and improvements"

    * tag 'for-4.21/block-20181221' of git://git.kernel.dk/linux-block: (364 commits)
    kyber: use sbitmap add_wait_queue/list_del wait helpers
    sbitmap: add helpers for add/del wait queue handling
    block: save irq state in blkg_lookup_create()
    dm: don't reuse bio for flushes
    nvme-pci: trace SQ status on completions
    nvme-rdma: implement polling queue map
    nvme-fabrics: allow user to pass in nr_poll_queues
    nvme-fabrics: allow nvmf_connect_io_queue to poll
    nvme-core: optionally poll sync commands
    block: make request_to_qc_t public
    nvme-tcp: fix spelling mistake "attepmpt" -> "attempt"
    nvme-tcp: fix endianess annotations
    nvmet-tcp: fix endianess annotations
    nvme-pci: refactor nvme_poll_irqdisable to make sparse happy
    nvme-pci: only set nr_maps to 2 if poll queues are supported
    nvmet: use a macro for default error location
    nvmet: fix comparison of a u16 with -1
    blk-mq: enable IO poll if .nr_queues of type poll > 0
    blk-mq: change blk_mq_queue_busy() to blk_mq_queue_inflight()
    blk-mq: skip zero-queue maps in blk_mq_map_swqueue
    ...

    Linus Torvalds
     
  • Pull y2038 updates from Arnd Bergmann:
    "More syscalls and cleanups

    This concludes the main part of the system call rework for 64-bit
    time_t, which has spread over most of year 2018, the last six system
    calls being

    - ppoll
    - pselect6
    - io_pgetevents
    - recvmmsg
    - futex
    - rt_sigtimedwait

    As before, nothing changes for 64-bit architectures, while 32-bit
    architectures gain another entry point that differs only in the layout
    of the timespec structure. Hopefully in the next release we can wire
    up all 22 of those system calls on all 32-bit architectures, which
    gives us a baseline version for glibc to start using them.

    This does not include the clock_adjtime, getrusage/waitid, and
    getitimer/setitimer system calls. I still plan to have new versions of
    those as well, but they are not required for correct operation of the
    C library since they can be emulated using the old 32-bit time_t based
    system calls.

    Aside from the system calls, there are also a few cleanups here,
    removing old kernel internal interfaces that have become unused after
    all references got removed. The arch/sh cleanups are part of this;
    they were posted several times over the past year without a reaction
    from the maintainers, while the corresponding changes made it into all
    other architectures"

    * tag 'y2038-for-4.21' of ssh://gitolite.kernel.org:/pub/scm/linux/kernel/git/arnd/playground:
    timekeeping: remove obsolete time accessors
    vfs: replace current_kernel_time64 with ktime equivalent
    timekeeping: remove timespec_add/timespec_del
    timekeeping: remove unused {read,update}_persistent_clock
    sh: remove board_time_init() callback
    sh: remove unused rtc_sh_get/set_time infrastructure
    sh: sh03: rtc: push down rtc class ops into driver
    sh: dreamcast: rtc: push down rtc class ops into driver
    y2038: signal: Add compat_sys_rt_sigtimedwait_time64
    y2038: signal: Add sys_rt_sigtimedwait_time32
    y2038: socket: Add compat_sys_recvmmsg_time64
    y2038: futex: Add support for __kernel_timespec
    y2038: futex: Move compat implementation into futex.c
    io_pgetevents: use __kernel_timespec
    pselect6: use __kernel_timespec
    ppoll: use __kernel_timespec
    signal: Add restore_user_sigmask()
    signal: Add set_user_sigmask()

    Linus Torvalds
     
  • All callers of migrate_page_move_mapping() now pass NULL for 'head'
    argument. Drop it.

    Link: http://lkml.kernel.org/r/20181211172143.7358-7-jack@suse.cz
    Signed-off-by: Jan Kara
    Acked-by: Mel Gorman
    Cc: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     

18 Dec, 2018

8 commits

  • Reviewed-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • In preparation of handing in iocbs in a different fashion as well. Also
    make it clear that the iocb being passed in isn't modified, by marking
    it const throughout.

    Reviewed-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • Replace the percpu_ref_put() + kmem_cache_free() with a call to
    iocb_put() instead.

    Reviewed-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • Plugging is meant to optimize submission of a string of IOs. If we don't
    have more than 2 being submitted, don't bother setting up a plug.

    Reviewed-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • It's 192 bytes, fairly substantial. Most items don't need to be cleared,
    especially not upfront. Clear the ones we do need to clear, and leave
    the other ones for setup when the iocb is prepared and submitted.

    Reviewed-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • This is in preparation for certain types of IO not needing a ring
    reservation.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     
  • We know this is a read/write request, but in preparation for
    having different kinds of those, ensure that we call the assigned
    handler instead of assuming it's aio_complete_rq().

    Reviewed-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • * for-4.21/block: (351 commits)
    blk-mq: enable IO poll if .nr_queues of type poll > 0
    blk-mq: change blk_mq_queue_busy() to blk_mq_queue_inflight()
    blk-mq: skip zero-queue maps in blk_mq_map_swqueue
    block: fix blk-iolatency accounting underflow
    blk-mq: fix dispatch from sw queue
    block: mq-deadline: Fix write completion handling
    nvme-pci: don't share queue maps
    blk-mq: only dispatch to non-defauly queue maps if they have queues
    blk-mq: export hctx->type in debugfs instead of sysfs
    blk-mq: fix allocation for queue mapping table
    blk-wbt: export internal state via debugfs
    blk-mq-debugfs: support rq_qos
    block: update sysfs documentation
    block: loop: check error using IS_ERR instead of IS_ERR_OR_NULL in loop_add()
    aoe: add __exit annotation
    block: clear REQ_HIPRI if polling is not supported
    blk-mq: replace and kill blk_mq_request_issue_directly
    blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests
    blk-mq: refactor the code of issue request directly
    block: remove the bio_integrity_advance export
    ...

    Jens Axboe
     

15 Dec, 2018

1 commit

  • Pull block fixes from Jens Axboe:
    "Three small fixes for this week. This contains:

    - spectre indexing fix for aio (Jeff)

    - fix for the previous zeroing bio fix, we don't need it for user
    mapped pages, and in fact it breaks some applications if we do
    (Keith)

    - allocation failure fix for null_blk with zoned (Shin'ichiro)"

    * tag 'for-linus-20181214' of git://git.kernel.dk/linux-block:
    block: Fix null_blk_zoned creation failure with small number of zones
    aio: fix spectre gadget in lookup_ioctx
    block/bio: Do not zero user pages

    Linus Torvalds
     

12 Dec, 2018

1 commit

  • Matthew pointed out that the ioctx_table is susceptible to spectre v1,
    because the index can be controlled by an attacker. The below patch
    should mitigate the attack for all of the aio system calls.

    Cc: stable@vger.kernel.org
    Reported-by: Matthew Wilcox
    Reported-by: Dan Carpenter
    Signed-off-by: Jeff Moyer
    Signed-off-by: Jens Axboe

    Jeff Moyer
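
The message above does not show the patch itself. A common kernel remedy for this class of bug is to clamp the index with a branch-free mask (the `array_index_nospec()` helper) before using it. The sketch below only models that clamping arithmetic, assuming a 64-bit `long`; the function names are made up for illustration, and Python cannot demonstrate speculative execution.

```python
# Arithmetic model of a branch-free index clamp; not kernel code.
BITS = 64                      # assumption: 64-bit unsigned long
ALL_ONES = (1 << BITS) - 1


def index_mask(index: int, size: int) -> int:
    """Return all ones when 0 <= index < size, otherwise 0."""
    # (size - 1 - index) wraps around and sets the top bit when index >= size.
    v = (index | ((size - 1 - index) & ALL_ONES)) & ALL_ONES
    return 0 if (v >> (BITS - 1)) else ALL_ONES


def clamp_index(index: int, size: int) -> int:
    """In-range indices pass through unchanged; out-of-range ones become 0."""
    return index & index_mask(index, size)


if __name__ == "__main__":
    print(clamp_index(3, 8))   # in range
    print(clamp_index(8, 8))   # out of range
```

The point of the mask is that even a mispredicted bounds check cannot turn an attacker-controlled index into an out-of-bounds load, because the index actually used has already been forced into range.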
     

07 Dec, 2018

3 commits

  • struct timespec is not y2038 safe.
    struct __kernel_timespec is the new y2038 safe structure for all
    syscalls that are using struct timespec.
    Update io_pgetevents interfaces to use struct __kernel_timespec.

    sigset_t also has different representations on 32 bit and 64 bit
    architectures. Hence, we need to support the following different
    syscalls:

    New y2038 safe syscalls:
    (Controlled by CONFIG_64BIT_TIME for 32 bit ABIs)

    Native 64 bit (unchanged) and native 32 bit : sys_io_pgetevents
    Compat : compat_sys_io_pgetevents_time64

    Older y2038 unsafe syscalls:
    (Controlled by CONFIG_32BIT_COMPAT_TIME for 32 bit ABIs)

    Native 32 bit : sys_io_pgetevents_time32
    Compat : compat_sys_io_pgetevents

    Note that io_getevents syscalls do not have a y2038 safe solution.

    Signed-off-by: Deepa Dinamani
    Signed-off-by: Arnd Bergmann

    Deepa Dinamani
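
To make the "not y2038 safe" claim concrete, here is a small userspace sketch of the two layouts: a timespec with a 32-bit `tv_sec` and one with 64-bit fields, as `struct __kernel_timespec` has. The little-endian byte order and the helper name `fits_time32` are assumptions made for the example.

```python
import struct
from datetime import datetime, timezone

# Assumed layouts for illustration (little-endian):
OLD_TIMESPEC32 = struct.Struct("<ii")    # 32-bit tv_sec, 32-bit tv_nsec
KERNEL_TIMESPEC = struct.Struct("<qq")   # 64-bit tv_sec, 64-bit tv_nsec

LAST_32BIT_SECOND = 2**31 - 1


def fits_time32(tv_sec: int) -> bool:
    """True if tv_sec can be stored in the 32-bit layout."""
    try:
        OLD_TIMESPEC32.pack(tv_sec, 0)
        return True
    except struct.error:
        return False


if __name__ == "__main__":
    print(OLD_TIMESPEC32.size, KERNEL_TIMESPEC.size)
    print(datetime.fromtimestamp(LAST_32BIT_SECOND, tz=timezone.utc).isoformat())
    print(fits_time32(LAST_32BIT_SECOND), fits_time32(LAST_32BIT_SECOND + 1))
```

One second past the last representable value no longer fits in the 32-bit layout, while the 64-bit layout packs it without trouble. This is why 32-bit ABIs need a separate entry point that differs only in the timespec layout.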
     
  • Refactor the logic to restore the sigmask before the syscall
    returns into an api.
    This is useful for versions of syscalls that pass in the
    sigmask and expect the current->sigmask to be changed during
    the execution and restored after the execution of the syscall.

    With the advent of new y2038 syscalls in the subsequent patches,
    we add two more new versions of the syscalls (for pselect, ppoll
    and io_pgetevents) in addition to the existing native and compat
    versions. Adding such an api reduces the logic that would need to
    be replicated otherwise.

    Signed-off-by: Deepa Dinamani
    Signed-off-by: Arnd Bergmann

    Deepa Dinamani
     
  • Refactor reading sigset from userspace and updating sigmask
    into an api.

    This is useful for versions of syscalls that pass in the
    sigmask and expect the current->sigmask to be changed during,
    and restored after, the execution of the syscall.

    With the advent of new y2038 syscalls in the subsequent patches,
    we add two more new versions of the syscalls (for pselect, ppoll,
    and io_pgetevents) in addition to the existing native and compat
    versions. Adding such an api reduces the logic that would need to
    be replicated otherwise.

    Note that the calls to sigprocmask() ignored the return value
    from the api as the function only returns an error on an invalid
    first argument that is hardcoded at these call sites.
    The updated logic uses set_current_blocked() instead.

    Signed-off-by: Deepa Dinamani
    Signed-off-by: Arnd Bergmann

    Deepa Dinamani
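
The set-then-restore pattern that these two helpers factor out has a close userspace analogue. The sketch below uses Python's `signal.pthread_sigmask()` on a Unix system; the context manager `temporary_sigmask` is a hypothetical name, and the kernel helpers deal with details (copying the set from userspace, pending signals) that this model leaves out.

```python
import signal
from contextlib import contextmanager


@contextmanager
def temporary_sigmask(mask):
    """Install `mask` as the blocked-signal set, then restore the old one."""
    # Analogue of the "set" step: swap in the new mask, remember the old.
    old = signal.pthread_sigmask(signal.SIG_SETMASK, mask)
    try:
        yield old
    finally:
        # Analogue of the "restore" step, run on every exit path.
        signal.pthread_sigmask(signal.SIG_SETMASK, old)


def current_mask():
    """Read the blocked-signal set without changing it."""
    return signal.pthread_sigmask(signal.SIG_BLOCK, [])


if __name__ == "__main__":
    before = current_mask()
    with temporary_sigmask(before | {signal.SIGUSR1}):
        print(signal.SIGUSR1 in current_mask())
    print(current_mask() == before)
```

Putting the restore in one place is the same motivation the commit messages give: each additional syscall variant would otherwise repeat the save and restore logic.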
     

05 Dec, 2018

2 commits

  • No one is going to poll for aio (yet), so we must clear the HIPRI
    flag, as we would otherwise send it down the poll queues, where no
    one will be polling for completions.

    Signed-off-by: Christoph Hellwig

    IOCB_HIPRI, not RWF_HIPRI.

    Reviewed-by: Johannes Thumshirn
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     
  • Pull in v4.20-rc5, solving a conflict we'll otherwise get in aio.c and
    also getting the merge fix that went into mainline that users are
    hitting testing for-4.21/block and/or for-next.

    * tag 'v4.20-rc5': (664 commits)
    Linux 4.20-rc5
    PCI: Fix incorrect value returned from pcie_get_speed_cap()
    MAINTAINERS: Update linux-mips mailing list address
    ocfs2: fix potential use after free
    mm/khugepaged: fix the xas_create_range() error path
    mm/khugepaged: collapse_shmem() do not crash on Compound
    mm/khugepaged: collapse_shmem() without freezing new_page
    mm/khugepaged: minor reorderings in collapse_shmem()
    mm/khugepaged: collapse_shmem() remember to clear holes
    mm/khugepaged: fix crashes due to misaccounted holes
    mm/khugepaged: collapse_shmem() stop if punched or truncated
    mm/huge_memory: fix lockdep complaint on 32-bit i_size_read()
    mm/huge_memory: splitting set mapping+index before unfreeze
    mm/huge_memory: rename freeze_page() to unmap_page()
    initramfs: clean old path before creating a hardlink
    kernel/kcov.c: mark funcs in __sanitizer_cov_trace_pc() as notrace
    psi: make disabling/enabling easier for vendor kernels
    proc: fixup map_files test on arm
    debugobjects: avoid recursive calls with kmemleak
    userfaultfd: shmem: UFFDIO_COPY: set the page dirty if VM_WRITE is not set
    ...

    Jens Axboe
     

20 Nov, 2018

1 commit

  • For cases when the application does not specify aio_reqprio for an aio,
    fall back to using get_current_ioprio() to obtain the task I/O priority
    last set using ioprio_set(), rather than the hardcoded IOPRIO_CLASS_NONE
    value.

    Reviewed-by: Christoph Hellwig
    Reviewed-by: Johannes Thumshirn
    Reviewed-by: Adam Manzanares
    Signed-off-by: Damien Le Moal
    Signed-off-by: Jens Axboe

    Damien Le Moal
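
The fallback described above can be modeled in a few lines. This is a sketch of the selection logic only, not the kernel code: the function `pick_ioprio` is hypothetical, and the constants mirror the uapi definitions (`IOCB_FLAG_IOPRIO`, a 13-bit class shift) as an assumption worth checking against your headers.

```python
# Model of per-request I/O priority selection; not kernel code.
IOPRIO_CLASS_SHIFT = 13        # assumed to match linux/ioprio.h
IOPRIO_CLASS_NONE = 0
IOPRIO_CLASS_RT = 1
IOPRIO_CLASS_BE = 2
IOCB_FLAG_IOPRIO = 1 << 1      # assumed to match linux/aio_abi.h


def ioprio_value(io_class: int, level: int) -> int:
    """Pack a class and a priority level into one ioprio value."""
    return (io_class << IOPRIO_CLASS_SHIFT) | level


def pick_ioprio(aio_flags: int, aio_reqprio: int, task_ioprio: int) -> int:
    """Use the per-iocb priority if flagged, else the task's priority."""
    if aio_flags & IOCB_FLAG_IOPRIO:
        return aio_reqprio
    # Fallback: the priority the task last set, instead of a hardcoded
    # "no class" value.
    return task_ioprio


if __name__ == "__main__":
    task = ioprio_value(IOPRIO_CLASS_BE, 4)
    print(pick_ioprio(0, 0, task) == task)
    print(pick_ioprio(IOCB_FLAG_IOPRIO, ioprio_value(IOPRIO_CLASS_RT, 0), task))
```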
     

17 Nov, 2018

1 commit


27 Aug, 2018

1 commit

  • Christoph Hellwig suggested a slightly different path for handling
    backwards compatibility with the 32-bit time_t based system calls:

    Rather than simply reusing the compat_sys_* entry points on 32-bit
    architectures unchanged, we get rid of those entry points and the
    compat_time types by renaming them to something that makes more sense
    on 32-bit architectures (which don't have a compat mode otherwise),
    and then share the entry points under the new name with the 64-bit
    architectures that use them for implementing the compatibility.

    The following types and interfaces are renamed here, and moved
    from linux/compat_time.h to linux/time32.h:

    old                          new
    ---                          ---
    compat_time_t                old_time32_t
    struct compat_timeval        struct old_timeval32
    struct compat_timespec       struct old_timespec32
    struct compat_itimerspec     struct old_itimerspec32
    ns_to_compat_timeval()       ns_to_old_timeval32()
    get_compat_itimerspec64()    get_old_itimerspec32()
    put_compat_itimerspec64()    put_old_itimerspec32()
    compat_get_timespec64()      get_old_timespec32()
    compat_put_timespec64()      put_old_timespec32()

    As we already have aliases in place, this patch addresses only the
    instances that are relevant to the system call interface in particular,
    not those that occur in device drivers and other modules. Those
    will get handled separately, while providing the 64-bit version
    of the respective interfaces.

    I'm not renaming the timex, rusage and itimerval structures, as we are
    still debating what the new interface will look like, and whether we
    will need a replacement at all.

    This also doesn't change the names of the syscall entry points, which can
    be done more easily when we actually switch over the 32-bit architectures
    to use them; at that point we need to change COMPAT_SYSCALL_DEFINEx to
    SYSCALL_DEFINEx with a new name, e.g. with a _time32 suffix.

    Suggested-by: Christoph Hellwig
    Link: https://lore.kernel.org/lkml/20180705222110.GA5698@infradead.org/
    Signed-off-by: Arnd Bergmann

    Arnd Bergmann
     

14 Aug, 2018

2 commits

  • Pull vfs aio updates from Al Viro:
    "Christoph's aio poll, saner this time around.

    This time it's pretty much local to fs/aio.c. Hopefully race-free..."

    * 'work.aio' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    aio: allow direct aio poll comletions for keyed wakeups
    aio: implement IOCB_CMD_POLL
    aio: add a iocb refcount
    timerfd: add support for keyed wakeups

    Linus Torvalds
     
  • Pull vfs open-related updates from Al Viro:

    - "do we need fput() or put_filp()" rules are gone - it's always fput()
    now. We keep track of that state where it belongs - in ->f_mode.

    - int *opened mess killed - in finish_open(), in ->atomic_open()
    instances and in fs/namei.c code around do_last()/lookup_open()/atomic_open().

    - alloc_file() wrappers with saner calling conventions are introduced
    (alloc_file_clone() and alloc_file_pseudo()); callers converted, with
    much simplification.

    - while we are at it, saner calling conventions for path_init() and
    link_path_walk(), simplifying things inside fs/namei.c (both on
    open-related paths and elsewhere).

    * 'work.open3' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (40 commits)
    few more cleanups of link_path_walk() callers
    allow link_path_walk() to take ERR_PTR()
    make path_init() unconditionally paired with terminate_walk()
    document alloc_file() changes
    make alloc_file() static
    do_shmat(): grab shp->shm_file earlier, switch to alloc_file_clone()
    new helper: alloc_file_clone()
    create_pipe_files(): switch the first allocation to alloc_file_pseudo()
    anon_inode_getfile(): switch to alloc_file_pseudo()
    hugetlb_file_setup(): switch to alloc_file_pseudo()
    ocxlflash_getfile(): switch to alloc_file_pseudo()
    cxl_getfile(): switch to alloc_file_pseudo()
    ... and switch shmem_file_setup() to alloc_file_pseudo()
    __shmem_file_setup(): reorder allocations
    new wrapper: alloc_file_pseudo()
    kill FILE_{CREATED,OPENED}
    switch atomic_open() and lookup_open() to returning 0 in all success cases
    document ->atomic_open() changes
    ->atomic_open(): return 0 in all success cases
    get rid of 'opened' in path_openat() and the helpers downstream
    ...

    Linus Torvalds
     

06 Aug, 2018

3 commits

  • If we get a keyed wakeup for an aio poll waitqueue and the wakeup can
    acquire the ctx_lock without spinning, we can just complete the iocb
    straight from the wakeup callback to avoid a context switch.

    Signed-off-by: Christoph Hellwig
    Tested-by: Avi Kivity

    Christoph Hellwig
     
  • Simple one-shot poll through the io_submit() interface. To poll for
    a file descriptor the application should submit an iocb of type
    IOCB_CMD_POLL. It will poll the fd for the events specified in the
    first 32 bits of the aio_buf field of the iocb.

    Unlike poll or epoll without EPOLLONESHOT, this interface always works
    in one-shot mode: once the iocb is completed, it will have to be
    resubmitted.

    Signed-off-by: Christoph Hellwig
    Tested-by: Avi Kivity

    Christoph Hellwig
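
Python's standard library has no binding for io_submit(), so the sketch below does not use IOCB_CMD_POLL. It illustrates the one-shot behaviour the message compares against, using epoll with EPOLLONESHOT on Linux: one event is delivered, the descriptor is then disarmed, and it has to be re-armed explicitly, much as a completed poll iocb has to be resubmitted. The function name is made up for the example.

```python
import os
import select


def one_shot_demo():
    """Return how many events each of three polls reports."""
    r, w = os.pipe()
    ep = select.epoll()
    try:
        ep.register(r, select.EPOLLIN | select.EPOLLONESHOT)
        os.write(w, b"a")
        first = ep.poll(0)    # the one event; the fd is now disarmed
        os.write(w, b"b")
        second = ep.poll(0)   # still disarmed, nothing reported
        # Re-arm, the analogue of resubmitting the iocb.
        ep.modify(r, select.EPOLLIN | select.EPOLLONESHOT)
        third = ep.poll(0)    # data is still readable, so it fires again
        return len(first), len(second), len(third)
    finally:
        ep.close()
        os.close(r)
        os.close(w)


if __name__ == "__main__":
    print(one_shot_demo())
```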
     
  • This is needed to prevent races caused by the way the ->poll API works.
    To avoid introducing overhead for other users of the iocbs we initialize
    it to zero and only do refcount operations if it is non-zero in the
    completion path.

    Signed-off-by: Christoph Hellwig
    Tested-by: Avi Kivity

    Christoph Hellwig
     

23 Jul, 2018

1 commit

  • Pull vfs fixes from Al Viro:
    "Fix several places that screw up cleanups after failures halfway
    through opening a file (one open-coding filp_clone_open() and getting
    it wrong, two misusing alloc_file()). That part is -stable fodder from
    the 'work.open' branch.

    And Christoph's regression fix for uapi breakage in aio series;
    include/uapi/linux/aio_abi.h shouldn't be pulling in the kernel
    definition of sigset_t, the reason for doing so in the first place had
    been bogus - there's no need to expose struct __aio_sigset in
    aio_abi.h at all"

    * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    aio: don't expose __aio_sigset in uapi
    ocxlflash_getfile(): fix double-iput() on alloc_file() failures
    cxl_getfile(): fix double-iput() on alloc_file() failures
    drm_mode_create_lease_ioctl(): fix open-coded filp_clone_open()

    Linus Torvalds
     

18 Jul, 2018

1 commit

  • glibc uses a different definition of sigset_t than the kernel does,
    and the current version would pull in both. To fix this just do not
    expose the type at all - this somewhat mirrors pselect() where we
    do not even have a type for the magic sigmask argument, but just
    use pointer arithmetic.

    Fixes: 7a074e96 ("aio: implement io_pgetevents")
    Signed-off-by: Christoph Hellwig
    Reported-by: Adrian Reber
    Signed-off-by: Al Viro

    Christoph Hellwig
     

12 Jul, 2018

2 commits


29 Jun, 2018

1 commit

  • The poll() changes were not well thought out, and completely
    unexplained. They also caused a huge performance regression, because
    "->poll()" was no longer a trivial file operation that just called down
    to the underlying file operations, but instead did at least two indirect
    calls.

    Indirect calls are sadly slow now with the Spectre mitigation, but the
    performance problem could at least be largely mitigated by changing the
    "->get_poll_head()" operation to just have a per-file-descriptor pointer
    to the poll head instead. That gets rid of one of the new indirections.

    But that doesn't fix the new complexity that is completely unwarranted
    for the regular case. The (undocumented) reason for the poll() changes
    was some alleged AIO poll race fixing, but we don't make the common case
    slower and more complex for some uncommon special case, so this all
    really needs way more explanations and most likely a fundamental
    redesign.

    [ This revert is a revert of about 30 different commits, not reverted
    individually because that would just be unnecessarily messy - Linus ]

    Cc: Al Viro
    Cc: Christoph Hellwig
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

15 Jun, 2018

1 commit


05 Jun, 2018

1 commit


31 May, 2018

2 commits

  • This is the per-I/O equivalent of the ioprio_set system call.

    When IOCB_FLAG_IOPRIO is set on the iocb aio_flags field, then we set the
    newly added kiocb ki_ioprio field to the value in the iocb aio_reqprio field.

    This patch depends on block: add ioprio_check_cap function.

    Signed-off-by: Adam Manzanares
    Reviewed-by: Jeff Moyer
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Adam Manzanares
     
  • In order to avoid kiocb bloat for per command iopriority support, rw_hint
    is converted from enum to a u16. Added a guard around ki_hint assignment.

    Signed-off-by: Adam Manzanares
    Signed-off-by: Al Viro

    Adam Manzanares
     

30 May, 2018

2 commits

  • As it is, the logic in native io_submit(2) is "if asked for
    more than LONG_MAX/sizeof(pointer) iocbs to submit, don't
    bother with more than LONG_MAX/sizeof(pointer)" (i.e.
    512M requests on 32bit and 1E requests on 64bit), while
    compat io_submit(2) goes with "stop after the first
    PAGE_SIZE/sizeof(pointer) iocbs", i.e. 1K or so. Which is
    * inconsistent
    * *way* too much in the native case
    * possibly too little in the compat one
    and
    * wrong anyway, since the natural point where we
      ought to stop bothering is ctx->nr_events

    Reviewed-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Al Viro
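
The figures quoted above can be checked with a little arithmetic. This sketch models the old limits and the cap the message argues for; the 4 KiB page size and the function names are assumptions for illustration, not kernel code.

```python
# Model of the limits discussed in the commit message; not kernel code.
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def old_native_limit(long_bits: int) -> int:
    """LONG_MAX / sizeof(pointer) where long and pointers are long_bits wide."""
    long_max = (1 << (long_bits - 1)) - 1
    return long_max // (long_bits // 8)


def old_compat_limit() -> int:
    """PAGE_SIZE / sizeof(compat pointer); compat pointers are 4 bytes."""
    return PAGE_SIZE // 4


def clamp_nr(nr: int, nr_events: int) -> int:
    """The suggested cap: never bother with more than ctx->nr_events."""
    return min(nr, nr_events)


if __name__ == "__main__":
    print(old_native_limit(32))    # 2**29 - 1, roughly 512M
    print(old_native_limit(64))    # 2**60 - 1, roughly 1E
    print(old_compat_limit())      # 1024, the "1K or so"
    print(clamp_nr(10_000, 128))
```

The three old limits differ by many orders of magnitude, which is the inconsistency the message complains about; a cap tied to the ring size gives the native and compat paths the same bound.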
     
  • Get rid of insane "copy array of 32bit pointers into an array of
    native ones" glue.

    Reviewed-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Al Viro