04 May, 2017

1 commit

  • Pull MD updates from Shaohua Li:

    - Add the Partial Parity Log (ppl) feature found in Intel IMSM raid
    arrays, by Artur Paszkiewicz. This feature is another way to close
    the RAID5 write hole. The Linux implementation is also available for
    normal RAID5 arrays if a specific superblock bit is set.

    - A number of md-cluster fixes, and enabling of md-cluster array
    resize, from Guoqing Jiang

    - A bunch of patches from Ming Lei and Neil Brown rewriting the MD
    bio handling code. MD no longer directly accesses bio bvecs or
    bi_phys_segments, and it uses the modern bio API for bio splitting.

    - Improve the RAID5 IO pattern for better performance on hard disk
    based RAID5/6 arrays, from me.

    - Several patches from Song Liu to speed up raid5-cache recovery and
    to allow disabling the raid5-cache feature at runtime.

    - Fix a performance regression in raid1 resync from Xiao Ni.

    - Other cleanup and fixes from various people.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md: (84 commits)
    md/raid10: skip spare disk as 'first' disk
    md/raid1: Use a new variable to count flighting sync requests
    md: clear WantReplacement once disk is removed
    md/raid1/10: remove unused queue
    md: handle read-only member devices better.
    md/raid10: wait up frozen array in handle_write_completed
    uapi: fix linux/raid/md_p.h userspace compilation error
    md-cluster: Fix a memleak in an error handling path
    md: support disabling of create-on-open semantics.
    md: allow creation of mdNNN arrays via md_mod/parameters/new_array
    raid5-ppl: use a single mempool for ppl_io_unit and header_page
    md/raid0: fix up bio splitting.
    md/linear: improve bio splitting.
    md/raid5: make chunk_aligned_read() split bios more cleanly.
    md/raid10: simplify handle_read_error()
    md/raid10: simplify the splitting of requests.
    md/raid1: factor out flush_bio_list()
    md/raid1: simplify handle_read_error().
    Revert "block: introduce bio_copy_data_partial"
    md/raid1: simplify alloc_behind_master_bio()
    ...

    Linus Torvalds
     

03 May, 2017

1 commit

  • Pull documentation update from Jonathan Corbet:
    "A reasonably busy cycle for documentation this time around. There is a
    new guide for user-space API documents, rather sparsely populated at
    the moment, but it's a start. Markus improved the infrastructure for
    converting diagrams. Mauro has converted much of the USB documentation
    over to RST. Plus the usual set of fixes, improvements, and tweaks.

    There's a bit more than the usual amount of reaching out of
    Documentation/ to fix comments elsewhere in the tree; I have acks for
    those where I could get them"

    * tag 'docs-4.12' of git://git.lwn.net/linux: (74 commits)
    docs: Fix a couple typos
    docs: Fix a spelling error in vfio-mediated-device.txt
    docs: Fix a spelling error in ioctl-number.txt
    MAINTAINERS: update file entry for HSI subsystem
    Documentation: allow installing man pages to a user defined directory
    Doc/PM: Sync with intel_powerclamp code behavior
    zr364xx.rst: usb/devices is now at /sys/kernel/debug/
    usb.rst: move documentation from proc_usb_info.txt to USB ReST book
    convert philips.txt to ReST and add to media docs
    docs-rst: usb: update old usbfs-related documentation
    arm: Documentation: update a path name
    docs: process/4.Coding.rst: Fix a couple of document refs
    docs-rst: fix usb cross-references
    usb: gadget.h: be consistent at kernel doc macros
    usb: composite.h: fix two warnings when building docs
    usb: get rid of some ReST doc build errors
    usb.rst: get rid of some Sphinx errors
    usb/URB.txt: convert to ReST and update it
    usb/persist.txt: convert to ReST and add to driver-api book
    usb/hotplug.txt: convert to ReST and add to driver-api book
    ...

    Linus Torvalds
     

02 May, 2017

3 commits

  • Pull uaccess unification updates from Al Viro:
    "This is the uaccess unification pile. It's _not_ the end of uaccess
    work, but the next batch of that will go into the next cycle. This one
    mostly takes copy_from_user() and friends out of arch/* and gets the
    zero-padding behaviour in sync for all architectures.

    Dealing with the nocache/writethrough mess is for the next cycle;
    fortunately, that's x86-only. Same for cleanups in iov_iter.c (I am
    sold on access_ok() in there, BTW; just not in this pile), same for
    reducing __copy_... callsites, strn*... stuff, etc. - there will be a
    pile about as large as this one in the next merge window.

    This one sat in -next for weeks. -3KLoC"

    * 'work.uaccess' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (96 commits)
    HAVE_ARCH_HARDENED_USERCOPY is unconditional now
    CONFIG_ARCH_HAS_RAW_COPY_USER is unconditional now
    m32r: switch to RAW_COPY_USER
    hexagon: switch to RAW_COPY_USER
    microblaze: switch to RAW_COPY_USER
    get rid of padding, switch to RAW_COPY_USER
    ia64: get rid of copy_in_user()
    ia64: sanitize __access_ok()
    ia64: get rid of 'segment' argument of __do_{get,put}_user()
    ia64: get rid of 'segment' argument of __{get,put}_user_check()
    ia64: add extable.h
    powerpc: get rid of zeroing, switch to RAW_COPY_USER
    esas2r: don't open-code memdup_user()
    alpha: fix stack smashing in old_adjtimex(2)
    don't open-code kernel_setsockopt()
    mips: switch to RAW_COPY_USER
    mips: get rid of tail-zeroing in primitives
    mips: make copy_from_user() zero tail explicitly
    mips: clean and reorder the forest of macros...
    mips: consolidate __invoke_... wrappers
    ...

    Linus Torvalds
     
  • Shaohua Li
     
  • Pull block layer updates from Jens Axboe:

    - Add BFQ IO scheduler under the new blk-mq scheduling framework. BFQ
    was initially a fork of CFQ, but subsequently changed to implement
    fairness based on B-WF2Q+, a modified variant of WF2Q. BFQ is meant
    to be used on desktop type single drives, providing good fairness.
    From Paolo.

    - Add the Kyber IO scheduler. This is a fully multiqueue-aware
    scheduler, using a scalable token based algorithm that throttles IO
    based on live completion IO stats, similarly to blk-wbt. From Omar.

    - A series from Jan, moving users to separately allocated backing
    devices. This continues the work of separating backing device life
    times, solving various problems with hot removal.

    - A series of updates for lightnvm, mostly from Javier. Includes a
    'pblk' target that exposes an open channel SSD as a physical block
    device.

    - A series of fixes and improvements for nbd from Josef.

    - A series from Omar, removing queue sharing between devices on mostly
    legacy drivers. This helps us clean up other bits, if we know that a
    queue only has a single device backing. This has been overdue for
    more than a decade.

    - Fixes for the blk-stats, and improvements to unify the stats and user
    windows. This both improves blk-wbt, and enables other users to
    register a need to receive IO stats for a device. From Omar.

    - blk-throttle improvements from Shaohua. This provides a scalable
    framework for implementing prioritization - particularly for blk-mq,
    but applicable to any type of block device. The interface is marked
    experimental for now.

    - Bucketized IO stats for IO polling from Stephen Bates. This improves
    efficiency of polled workloads in the presence of mixed block size
    IO.

    - A few fixes for opal, from Scott.

    - A few pulls for NVMe, including a lot of fixes for NVMe-over-fabrics.
    From a variety of folks, mostly Sagi and James Smart.

    - A series from Bart, improving our exposed info and capabilities from
    the blk-mq debugfs support.

    - A series from Christoph, cleaning up how we handle WRITE_ZEROES.

    - A series from Christoph, cleaning up the block layer handling of how
    we track errors in a request. On top of being a nice cleanup, it also
    shrinks the size of struct request a bit.

    - Removal of mg_disk and hd (sorry Linus) by Christoph. The former
    was never used by platforms, and the latter has outlived its
    usefulness.

    - Various little bug fixes and cleanups from a wide variety of folks.

    * 'for-4.12/block' of git://git.kernel.dk/linux-block: (329 commits)
    block: hide badblocks attribute by default
    blk-mq: unify hctx delay_work and run_work
    block: add kblock_mod_delayed_work_on()
    blk-mq: unify hctx delayed_run_work and run_work
    nbd: fix use after free on module unload
    MAINTAINERS: bfq: Add Paolo as maintainer for the BFQ I/O scheduler
    blk-mq-sched: alloate reserved tags out of normal pool
    mtip32xx: use runtime tag to initialize command header
    scsi: Implement blk_mq_ops.show_rq()
    blk-mq: Add blk_mq_ops.show_rq()
    blk-mq: Show operation, cmd_flags and rq_flags names
    blk-mq: Make blk_flags_show() callers append a newline character
    blk-mq: Move the "state" debugfs attribute one level down
    blk-mq: Unregister debugfs attributes earlier
    blk-mq: Only unregister hctxs for which registration succeeded
    blk-mq-debugfs: Rename functions for registering and unregistering the mq directory
    blk-mq: Let blk_mq_debugfs_register() look up the queue name
    blk-mq: Register /queue/mq after having registered /queue
    ide-pm: always pass 0 error to ide_complete_rq in ide_do_devset
    ide-pm: always pass 0 error to __blk_end_request_all
    ...

    Linus Torvalds
     

28 Apr, 2017

4 commits

    Commit 99e6608c9e74 "block: Add badblock management for gendisks"
    allowed drivers like pmem and software raid to advertise a list of
    bad media areas. However, it inadvertently added a 'badblocks'
    attribute to all block devices. Let's clean this up by making the
    'badblocks' attribute not visible when the driver has not populated
    a 'struct badblocks' instance in the gendisk.
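    The shape of the fix can be sketched with a sysfs-style visibility
    hook; the types and the helper name below are illustrative stand-ins,
    not the kernel's symbols:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in types modeling only the relevant fields. */
struct badblocks { int count; };
struct gendisk { struct badblocks *bb; };

typedef unsigned short umode_t;

/* is_visible-style hook: report the attribute's mode only when the
 * driver populated a struct badblocks; returning 0 hides the file. */
static umode_t disk_badblocks_visible(struct gendisk *disk, umode_t mode)
{
	return disk->bb ? mode : 0;
}
```

    Drivers like pmem, which populate gendisk->bb, keep the attribute;
    everyone else no longer exposes a meaningless file.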

    Cc: Jens Axboe
    Cc: Christoph Hellwig
    Cc: Martin K. Petersen
    Reported-by: Vishal Verma
    Signed-off-by: Dan Williams
    Tested-by: Vishal Verma
    Signed-off-by: Jens Axboe

    Dan Williams
     
  • The only difference between ->run_work and ->delay_work, is that
    the latter is used to defer running a queue. This is done by
    marking the queue stopped, and scheduling ->delay_work to run
    sometime in the future. While the queue is stopped, direct runs
    or runs through ->run_work will not run the queue.

    If we combine the handlers, then we need to handle two things:

    1) If a delayed/stopped run is scheduled, then we should not run
    the queue before that has been completed.
    2) If a queue is delayed/stopped, the handler needs to restart
    the queue. Normally a run of a queue with the stopped bit set
    would be a no-op.

    Case 1 is handled by modifying a currently pending queue run
    to the deadline set by the caller of blk_mq_delay_queue().
    Subsequent attempts to queue a queue run will find the work
    item already pending, and direct runs will see a stopped queue
    as before.

    Case 2 is handled by adding a new bit, BLK_MQ_S_START_ON_RUN,
    that tells the work handler that it should clear a stopped
    queue and run the handler.
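    The two cases can be modeled as a tiny state machine. This is a toy
    sketch using the flag names from the text (BLK_MQ_S_STOPPED,
    BLK_MQ_S_START_ON_RUN); the hctx structure and helpers are
    illustrative, not the blk-mq implementation:

```c
#include <assert.h>

enum {
	BLK_MQ_S_STOPPED      = 1 << 0,
	BLK_MQ_S_START_ON_RUN = 1 << 1,
};

struct hctx { unsigned int state; int runs; };

/* Case 1: a direct run is a no-op while the queue is stopped. */
static void direct_run(struct hctx *h)
{
	if (h->state & BLK_MQ_S_STOPPED)
		return;
	h->runs++;
}

/* Case 2: the combined work handler restarts a stopped queue only
 * when a delayed run asked for it via START_ON_RUN. */
static void run_work_fn(struct hctx *h)
{
	if (h->state & BLK_MQ_S_STOPPED) {
		if (!(h->state & BLK_MQ_S_START_ON_RUN))
			return;
		h->state &= ~(BLK_MQ_S_STOPPED | BLK_MQ_S_START_ON_RUN);
	}
	h->runs++;
}

/* blk_mq_delay_queue()-style: stop the queue, arm the deferred restart. */
static void delay_queue(struct hctx *h)
{
	h->state |= BLK_MQ_S_STOPPED | BLK_MQ_S_START_ON_RUN;
}
```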

    Reviewed-by: Bart Van Assche
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • This modifies (or adds, if not currently pending) an existing
    delayed work item.

    Reviewed-by: Christoph Hellwig
    Reviewed-by: Bart Van Assche
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • They serve the exact same purpose. Get rid of the non-delayed
    work variant, and just run it without delay for the normal case.

    Reviewed-by: Christoph Hellwig
    Reviewed-by: Bart Van Assche
    Reviewed-by: Ming Lei
    Signed-off-by: Jens Axboe

    Jens Axboe
     

27 Apr, 2017

10 commits

    At least one driver, mtip32xx, has a hard-coded dependency on the
    value of the reserved tag used for internal commands. While that
    should really be fixed up, for now let's ensure that we just bypass
    the scheduler tags for an allocation marked as reserved. They are
    used for housekeeping or error handling, so we can safely ignore
    them in the scheduler.
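    The bypass reduces to a one-line predicate on the allocation flags.
    A hedged sketch, where the helper is hypothetical and
    BLK_MQ_REQ_RESERVED mirrors the blk-mq allocation flag:

```c
#include <assert.h>
#include <stdbool.h>

enum { BLK_MQ_REQ_RESERVED = 1 << 0 };	/* allocation flag */

/* Reserved allocations always come from the normal (driver) tag pool,
 * even when an IO scheduler is attached. */
static bool use_sched_tags(unsigned int alloc_flags, bool has_scheduler)
{
	if (alloc_flags & BLK_MQ_REQ_RESERVED)
		return false;
	return has_scheduler;
}
```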

    Tested-by: Ming Lei
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • This new callback function will be used in the next patch to show
    more information about SCSI requests.

    Signed-off-by: Bart Van Assche
    Reviewed-by: Omar Sandoval
    Cc: Hannes Reinecke
    Signed-off-by: Jens Axboe

    Bart Van Assche
     
  • Show the operation name, .cmd_flags and .rq_flags as names instead
    of numbers.

    Signed-off-by: Bart Van Assche
    Reviewed-by: Omar Sandoval
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Jens Axboe

    Bart Van Assche
     
  • This patch does not change any functionality but makes it possible
    to produce a single line of output with multiple flag-to-name
    translations.

    Signed-off-by: Bart Van Assche
    Reviewed-by: Omar Sandoval
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Jens Axboe

    Bart Van Assche
     
  • Move the "state" attribute from the top level to the "mq" directory
    as requested by Omar.

    Signed-off-by: Bart Van Assche
    Reviewed-by: Omar Sandoval
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Jens Axboe

    Bart Van Assche
     
  • We currently call blk_mq_free_queue() from blk_cleanup_queue()
    before we unregister the debugfs attributes for that queue in
    blk_release_queue(). This leaves a window open during which
    accessing most of the mq debugfs attributes would cause a
    use-after-free. Additionally, the "state" attribute allows
    running the queue, which we should not do after the queue has
    entered the "dead" state. Fix both cases by unregistering the
    debugfs attributes before freeing queue resources starts.

    Signed-off-by: Bart Van Assche
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Omar Sandoval
    Signed-off-by: Jens Axboe

    Bart Van Assche
     
  • Hctx unregistration involves calling kobject_del(). kobject_del()
    must not be called if kobject_add() has not been called. Hence in
    the error path only unregister hctxs for which registration succeeded.
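    The unwinding rule can be sketched with stand-in stubs: on a failed
    registration, walk back only over the entries whose add succeeded.
    Names and structure are illustrative, not the kobject API:

```c
#include <assert.h>

#define NR 4

static int added[NR];
static int deleted[NR];

/* Simulate kobject_add(): entries at or past fail_from fail. */
static int kobject_add_stub(int i, int fail_from)
{
	if (i >= fail_from)
		return -1;
	added[i] = 1;
	return 0;
}

/* Simulate kobject_del(): only legal for entries that were added. */
static void kobject_del_stub(int i)
{
	deleted[i] = 1;
}

/* Register hctxs; on failure, unwind only those actually added. */
static int register_all(int fail_from)
{
	int i;

	for (i = 0; i < NR; i++) {
		if (kobject_add_stub(i, fail_from) < 0) {
			while (--i >= 0)
				kobject_del_stub(i);
			return -1;
		}
	}
	return 0;
}
```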

    Signed-off-by: Bart Van Assche
    Cc: Omar Sandoval
    Cc: Hannes Reinecke
    Signed-off-by: Jens Axboe

    Bart Van Assche
     
  • Since the blk_mq_debugfs_*register_hctxs() functions register and
    unregister all attributes under the "mq" directory, rename these
    into blk_mq_debugfs_*register_mq().

    Signed-off-by: Bart Van Assche
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Omar Sandoval
    Signed-off-by: Jens Axboe

    Bart Van Assche
     
  • A later patch will move the call of blk_mq_debugfs_register() to
    a function to which the queue name is not passed as an argument.
    To avoid having to add a 'name' argument to multiple callers, let
    blk_mq_debugfs_register() look up the queue name.

    Signed-off-by: Bart Van Assche
    Reviewed-by: Omar Sandoval
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Jens Axboe

    Bart Van Assche
     
  • A later patch in this series will modify blk_mq_debugfs_register()
    such that it uses q->kobj.parent to determine the name of a
    request queue. Hence make sure that that pointer is initialized
    before blk_mq_debugfs_register() is called. To avoid lock inversion,
    protect sysfs / debugfs registration with the queue sysfs_lock
    instead of the global mutex all_q_mutex.

    Signed-off-by: Bart Van Assche
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Omar Sandoval
    Signed-off-by: Jens Axboe

    Bart Van Assche
     

24 Apr, 2017

1 commit

    When registering an integrity profile: if the template's interval_exp
    is not 0, use it; otherwise use the ilog2() of the logical block size
    of the provided gendisk.

    This fixes a long-standing DM linear target bug where it cannot pass
    integrity data to the underlying device if its logical block size
    conflicts with the underlying device's logical block size.
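    The selection rule is a one-liner. A self-contained sketch with a
    local ilog2 stand-in; the helper name pick_interval_exp is
    illustrative:

```c
#include <assert.h>

/* ilog2 for power-of-two sizes; stand-in for the kernel helper. */
static unsigned char my_ilog2(unsigned int v)
{
	unsigned char e = 0;

	while (v >>= 1)
		e++;
	return e;
}

/* Prefer the template's interval_exp when non-zero, else derive it
 * from the disk's logical block size. */
static unsigned char pick_interval_exp(unsigned char tmpl_exp,
				       unsigned int logical_block_size)
{
	return tmpl_exp ? tmpl_exp : my_ilog2(logical_block_size);
}
```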

    Cc: stable@vger.kernel.org
    Reported-by: Mikulas Patocka
    Signed-off-by: Mike Snitzer
    Acked-by: Martin K. Petersen
    Signed-off-by: Jens Axboe

    Mike Snitzer
     

22 Apr, 2017

2 commits

  • Commit 25520d55cdb6 ("block: Inline blk_integrity in struct gendisk")
    introduced blk_integrity_revalidate(), which seems to assume ownership
    of the stable pages flag and unilaterally clears it if no blk_integrity
    profile is registered:

    if (bi->profile)
            disk->queue->backing_dev_info->capabilities |=
                    BDI_CAP_STABLE_WRITES;
    else
            disk->queue->backing_dev_info->capabilities &=
                    ~BDI_CAP_STABLE_WRITES;

    It's called from revalidate_disk() and rescan_partitions(), making it
    impossible to enable stable pages for drivers that support partitions
    and don't use blk_integrity: while the call in revalidate_disk() can be
    trivially worked around (see zram, which doesn't support partitions and
    hence gets away with zram_revalidate_disk()), rescan_partitions() can
    be triggered from userspace at any time. This breaks rbd, where the
    ceph messenger is responsible for generating/verifying CRCs.

    Since blk_integrity_{un,}register() "must" be used for (un)registering
    the integrity profile with the block layer, move BDI_CAP_STABLE_WRITES
    setting there. This way drivers that call blk_integrity_register() and
    use integrity infrastructure won't interfere with drivers that don't
    but still want stable pages.
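    The resulting ownership rule can be modeled minimally: only the
    explicit register/unregister calls touch the flag, and revalidation
    leaves it alone. Toy types, not the kernel API:

```c
#include <assert.h>

enum { BDI_CAP_STABLE_WRITES = 1 << 0 };

struct bdi { unsigned int caps; };

/* After the fix, only the integrity (un)register paths own the flag. */
static void integrity_register(struct bdi *b)
{
	b->caps |= BDI_CAP_STABLE_WRITES;
}

static void integrity_unregister(struct bdi *b)
{
	b->caps &= ~BDI_CAP_STABLE_WRITES;
}

/* revalidate/rescan no longer clobbers the flag for non-integrity users. */
static void revalidate(struct bdi *b)
{
	(void)b;
}
```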

    Fixes: 25520d55cdb6 ("block: Inline blk_integrity in struct gendisk")
    Cc: "Martin K. Petersen"
    Cc: Christoph Hellwig
    Cc: Mike Snitzer
    Cc: stable@vger.kernel.org # 4.4+, needs backporting
    Tested-by: Dan Williams
    Signed-off-by: Ilya Dryomov
    Signed-off-by: Jens Axboe

    Ilya Dryomov
     
  • Avoid that the following kernel bug gets triggered:

    BUG: sleeping function called from invalid context at ./include/linux/buffer_head.h:349
    in_atomic(): 1, irqs_disabled(): 0, pid: 8019, name: find
    CPU: 10 PID: 8019 Comm: find Tainted: G W I 4.11.0-rc4-dbg+ #2
    Call Trace:
    dump_stack+0x68/0x93
    ___might_sleep+0x16e/0x230
    __might_sleep+0x4a/0x80
    __ext4_get_inode_loc+0x1e0/0x4e0
    ext4_iget+0x70/0xbc0
    ext4_iget_normal+0x2f/0x40
    ext4_lookup+0xb6/0x1f0
    lookup_slow+0x104/0x1e0
    walk_component+0x19a/0x330
    path_lookupat+0x4b/0x100
    filename_lookup+0x9a/0x110
    user_path_at_empty+0x36/0x40
    vfs_statx+0x67/0xc0
    SYSC_newfstatat+0x20/0x40
    SyS_newfstatat+0xe/0x10
    entry_SYSCALL_64_fastpath+0x18/0xad

    This happens because the big if/else in blk_mq_make_request() doesn't
    have a final else section that also drops the ctx. Add one.
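    A toy model of the invariant the fix restores: every branch of the
    dispatch if/else, including the final else, must drop the ctx exactly
    once before any path that may sleep. Purely illustrative:

```c
#include <assert.h>

static int ctx_refs;

static void get_ctx(void) { ctx_refs++; }
static void put_ctx(void) { ctx_refs--; }

static void make_request_model(int branch)
{
	get_ctx();
	if (branch == 0) {
		put_ctx();	/* fast/direct-issue path */
	} else if (branch == 1) {
		put_ctx();	/* scheduler-insert path */
	} else {
		put_ctx();	/* the added final else: drop ctx here too */
	}
}
```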

    Fixes: b00c53e8f411 ("blk-mq: fix schedule-while-atomic with scheduler attached")
    Signed-off-by: Bart Van Assche
    Cc: Omar Sandoval

    Added a bit more to the commit log.

    Signed-off-by: Jens Axboe

    Bart Van Assche
     

21 Apr, 2017

13 commits

  • No point in providing and exporting this helper. There's just
    one (real) user of it, just use rq_data_dir().

    Reviewed-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • commit c13660a08c8b ("blk-mq-sched: change ->dispatch_requests()
    to ->dispatch_request()") removed the last user of this function.
    Hence also remove the function itself.

    Signed-off-by: Bart Van Assche
    Cc: Omar Sandoval
    Cc: Hannes Reinecke
    Signed-off-by: Jens Axboe

    Bart Van Assche
     
  • If the caller passes in wait=true, it has to be able to block
    for a driver tag. We just had a bug where flush insertion
    would block on tag allocation, while we had preempt disabled.
    Ensure that we catch cases like that earlier next time.

    Reviewed-by: Bart Van Assche
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • Fixes an issue where the size of the poll_stat array in request_queue
    does not match the size expected by the new size based bucketing for
    IO completion polling.

    Fixes: 720b8ccc4500 ("blk-mq: Add a polling specific stats function")
    Signed-off-by: Stephen Bates
    Signed-off-by: Jens Axboe

    Stephen Bates
     
  • We must have dropped the ctx before we call
    blk_mq_sched_insert_request() with can_block=true, otherwise we risk
    that a flush request can block on insertion if we are currently out of
    tags.

    [ 47.667190] BUG: scheduling while atomic: jbd2/sda2-8/2089/0x00000002
    [ 47.674493] Modules linked in: x86_pkg_temp_thermal btrfs xor zlib_deflate raid6_pq sr_mod cdre
    [ 47.690572] Preemption disabled at:
    [ 47.690584] [] blk_mq_sched_get_request+0x6c/0x280
    [ 47.701764] CPU: 1 PID: 2089 Comm: jbd2/sda2-8 Not tainted 4.11.0-rc7+ #271
    [ 47.709630] Hardware name: Dell Inc. PowerEdge T630/0NT78X, BIOS 2.3.4 11/09/2016
    [ 47.718081] Call Trace:
    [ 47.720903] dump_stack+0x4f/0x73
    [ 47.724694] ? blk_mq_sched_get_request+0x6c/0x280
    [ 47.730137] __schedule_bug+0x6c/0xc0
    [ 47.734314] __schedule+0x559/0x780
    [ 47.738302] schedule+0x3b/0x90
    [ 47.741899] io_schedule+0x11/0x40
    [ 47.745788] blk_mq_get_tag+0x167/0x2a0
    [ 47.750162] ? remove_wait_queue+0x70/0x70
    [ 47.754901] blk_mq_get_driver_tag+0x92/0xf0
    [ 47.759758] blk_mq_sched_insert_request+0x134/0x170
    [ 47.765398] ? blk_account_io_start+0xd0/0x270
    [ 47.770679] blk_mq_make_request+0x1b2/0x850
    [ 47.775766] generic_make_request+0xf7/0x2d0
    [ 47.780860] submit_bio+0x5f/0x120
    [ 47.784979] ? submit_bio+0x5f/0x120
    [ 47.789631] submit_bh_wbc.isra.46+0x10d/0x130
    [ 47.794902] submit_bh+0xb/0x10
    [ 47.798719] journal_submit_commit_record+0x190/0x210
    [ 47.804686] ? _raw_spin_unlock+0x13/0x30
    [ 47.809480] jbd2_journal_commit_transaction+0x180a/0x1d00
    [ 47.815925] kjournald2+0xb6/0x250
    [ 47.820022] ? kjournald2+0xb6/0x250
    [ 47.824328] ? remove_wait_queue+0x70/0x70
    [ 47.829223] kthread+0x10e/0x140
    [ 47.833147] ? commit_timeout+0x10/0x10
    [ 47.837742] ? kthread_create_on_node+0x40/0x40
    [ 47.843122] ret_from_fork+0x29/0x40

    Fixes: a4d907b6a33b ("blk-mq: streamline blk_mq_make_request")
    Reviewed-by: Omar Sandoval
    Signed-off-by: Jens Axboe

    Jens Axboe
     
    Rather than bucketing IO statistics based on direction only, we also
    bucket based on the IO size. This leads to improved polling
    performance. Update the bucket callback function and use it in the
    polling latency estimation.
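    A hedged sketch of direction-plus-size bucketing; the two-bucket
    split and the 4 KiB threshold are illustrative choices, not the
    kernel's actual bucketing:

```c
#include <assert.h>

#define BUCKETS_PER_DIR 2

/* Map a completion sample to a bucket: direction selects the group,
 * size selects the slot within it. */
static int poll_stat_bucket(int is_write, unsigned int bytes)
{
	int size_idx = bytes > 4096 ? 1 : 0;

	return is_write * BUCKETS_PER_DIR + size_idx;
}
```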

    Signed-off-by: Stephen Bates
    Signed-off-by: Jens Axboe

    Stephen Bates
     
    In order to allow filtering of IO based on properties of the request
    other than direction, we allow the bucket function to return an int.

    If the bucket callback returns a negative value, the sample is not
    counted in the stats accumulation.
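    The convention can be sketched as a callback plus accumulator; the
    names and the flush-based filter are illustrative:

```c
#include <assert.h>

static int nr_samples;

/* Bucket callback: a negative return excludes the sample. */
static int bucket_by_dir(int is_write, int is_flush)
{
	if (is_flush)
		return -1;	/* filtered: not counted */
	return is_write;
}

/* Accumulator honoring the negative-means-skip convention. */
static void stat_add(int bucket)
{
	if (bucket < 0)
		return;
	nr_samples++;
}
```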

    Signed-off-by: Stephen Bates

    Fixed up Kyber scheduler stat callback.

    Signed-off-by: Jens Axboe

    Stephen Bates
     
  • If we have a scheduler attached, blk_mq_tag_to_rq() on the
    scheduled tags will return NULL if a request is no longer
    in flight. This is different than using the normal tags,
    where it will always return the fixed request. Check for
    this condition for polling, in case we happen to enter
    polling for a completed request.

    The request address remains valid, so this check and return
    should be perfectly safe.
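    A small model of the check: the tag lookup returns NULL once the
    request has completed, and the poll path returns early instead of
    dereferencing it. Stand-in types, not the blk-mq API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct request { int tag; };

/* With scheduler tags, the lookup yields NULL for a completed request. */
static struct request *tag_to_rq(struct request **tags, int tag)
{
	return tags[tag];
}

static bool poll_done(struct request **tags, int tag)
{
	struct request *rq = tag_to_rq(tags, tag);

	if (!rq)
		return true;	/* already completed: safe early return */
	return false;		/* still in flight: keep polling */
}
```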

    Fixes: bd166ef183c2 ("blk-mq-sched: add framework for MQ capable IO schedulers")
    Tested-by: Stephen Bates
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • Signed-off-by: Christoph Hellwig
    Reviewed-by: Bart Van Assche
    Acked-by: Roger Pau Monné
    Reviewed-by: Konrad Rzeszutek Wilk
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     
  • Merge blk_mq_ipi_complete_request and blk_mq_stat_add into their only
    caller.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     
    Now that all drivers that call blk_mq_complete_request have a
    ->complete callback, we can remove the direct call to
    blk_mq_end_request, as well as the error argument to
    blk_mq_complete_request.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Johannes Thumshirn
    Reviewed-by: Bart Van Assche
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     
  • This passes on the scsi_cmnd result field to users of passthrough
    requests. Currently we abuse req->errors for this purpose, but that
    field will go away in its current form.

    Note that the old IDE code abuses the errors field in very creative
    ways and stores all kinds of different values in it. I didn't dare
    to touch this magic, so the abuses are brought forward 1:1.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Martin K. Petersen
    Reviewed-by: Bart Van Assche
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     
  • The function only returns -EIO if rq->errors is non-zero, which is not
    very useful and lets a large number of callers ignore the return value.

    Just let the callers figure out their error themselves.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Johannes Thumshirn
    Reviewed-by: Bart Van Assche
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

20 Apr, 2017

5 commits

  • We trigger this warning:

    block/blk-throttle.c: In function ‘blk_throtl_bio’:
    block/blk-throttle.c:2042:6: warning: variable ‘ret’ set but not used [-Wunused-but-set-variable]
    int ret;
    ^~~

    since we only assign 'ret' if BLK_DEV_THROTTLING_LOW is off, we never
    check it.

    Reported-by: Bart Van Assche
    Reviewed-by: Bart Van Assche
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • If we don't have CGROUPS enabled, the compile ends in the
    following misery:

    In file included from ../block/bfq-iosched.c:105:0:
    ../block/bfq-iosched.h:819:22: error: array type has incomplete element type
    extern struct cftype bfq_blkcg_legacy_files[];
    ^
    ../block/bfq-iosched.h:820:22: error: array type has incomplete element type
    extern struct cftype bfq_blkg_files[];
    ^

    Move the declarations under the right ifdef.

    Reported-by: Randy Dunlap
    Signed-off-by: Jens Axboe

    Jens Axboe
     
    The call to bfq_check_ioprio_change will dereference bic; however,
    the null check for bic comes after this call. Move the null check on
    bic to before the call to avoid any potential null pointer
    dereference issues.

    Detected by CoverityScan, CID#1430138 ("Dereference before null check")

    Signed-off-by: Colin Ian King
    Signed-off-by: Jens Axboe

    Colin Ian King
     
    Since ioprio_best() translates IOPRIO_CLASS_NONE into IOPRIO_CLASS_BE,
    and since lower numerical priority values represent a higher priority,
    a simple numerical comparison is sufficient.
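    A sketch of the simplified comparison using the uapi ioprio encoding
    (priority class in the top 3 bits of a 16-bit value); illustrative,
    not the exact kernel source:

```c
#include <assert.h>

#define IOPRIO_CLASS_SHIFT 13
#define IOPRIO_CLASS_NONE  0
#define IOPRIO_CLASS_BE    2
#define IOPRIO_NORM        4
#define IOPRIO_PRIO_VALUE(cls, data) (((cls) << IOPRIO_CLASS_SHIFT) | (data))
#define IOPRIO_PRIO_CLASS(p) ((p) >> IOPRIO_CLASS_SHIFT)

/* Map NONE to the BE default, then the lower numerical value wins. */
static unsigned short ioprio_best(unsigned short aprio, unsigned short bprio)
{
	if (IOPRIO_PRIO_CLASS(aprio) == IOPRIO_CLASS_NONE)
		aprio = IOPRIO_PRIO_VALUE(IOPRIO_CLASS_BE, IOPRIO_NORM);
	if (IOPRIO_PRIO_CLASS(bprio) == IOPRIO_CLASS_NONE)
		bprio = IOPRIO_PRIO_VALUE(IOPRIO_CLASS_BE, IOPRIO_NORM);

	return aprio < bprio ? aprio : bprio;
}
```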

    Signed-off-by: Bart Van Assche
    Reviewed-by: Adam Manzanares
    Tested-by: Adam Manzanares
    Reviewed-by: Christoph Hellwig
    Cc: Matias Bjørling
    Signed-off-by: Jens Axboe

    Bart Van Assche
     
    Since only a single caller remains, inline blk_rq_set_prio().
    Initialize req->ioprio even if no I/O priority has been set in
    either the bio or the I/O context.

    Signed-off-by: Bart Van Assche
    Reviewed-by: Adam Manzanares
    Tested-by: Adam Manzanares
    Reviewed-by: Christoph Hellwig
    Cc: Matias Bjørling
    Signed-off-by: Jens Axboe

    Bart Van Assche