25 Apr, 2018

2 commits

  • This reverts commit 37c7c6c76d431dd7ef9c29d95f6052bd425f004c.

    Turns out some drivers (mostly FC drivers) may not use managed
    IRQ affinity and may have their own customized .map_queues at the
    same time, so keep this code to avoid a regression.

    Reported-by: Laurence Oberman
    Tested-by: Laurence Oberman
    Tested-by: Christian Borntraeger
    Tested-by: Stefan Haberland
    Cc: Ewan Milne
    Cc: Christoph Hellwig
    Cc: Sagi Grimberg
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     
  • As it came up in discussion on the mailing list that the semantic
    meaning of 'blk_mq_ctx' and 'blk_mq_hw_ctx' isn't completely
    obvious to everyone, let's add some minimal kerneldoc for a
    starter.

    Signed-off-by: Linus Walleij
    Signed-off-by: Jens Axboe

    Linus Walleij
     

19 Apr, 2018

2 commits

    The initialization of q->root_blkg currently happens outside of the
    queue lock and RCU, so the blkg may be destroyed before the
    initialization completes, which may cause dangling/NULL references.
    On the other hand, blkg destruction is protected by the queue lock
    or RCU. Put the initialization inside the queue lock and RCU to
    make it safer.

    Signed-off-by: Jiang Biao
    Signed-off-by: Wen Yang
    CC: Tejun Heo
    CC: Jens Axboe
    Signed-off-by: Jens Axboe

    Jiang Biao
     
    The comment before blkg_create() in blkcg_init_queue() was moved
    from blkcg_activate_policy() by commit ec13b1d6f0a0457312e615, but
    it no longer suits the new context.

    Signed-off-by: Jiang Biao
    Signed-off-by: Wen Yang
    CC: Tejun Heo
    CC: Jens Axboe
    Signed-off-by: Jens Axboe

    Jiang Biao
     

18 Apr, 2018

2 commits

    As described in the comment of blkcg_activate_policy():
    *Update of each blkg is protected by both queue and blkcg locks so
    that holding either lock and testing blkcg_policy_enabled() is
    always enough for dereferencing policy data.*
    With the queue lock held, there is no need to hold the blkcg lock in
    blkcg_deactivate_policy(). A similar case is blkcg_activate_policy(),
    where holding of the blkcg lock was removed by
    commit 4c55f4f9ad3001ac1fefdd8d8ca7641d18558e23.

    Signed-off-by: Jiang Biao
    Signed-off-by: Wen Yang
    CC: Tejun Heo
    Signed-off-by: Jens Axboe

    Jiang Biao
     
  • Even if we don't have an IO context attached to a request, we still
    need to clear the priv[0..1] pointers, as they could be pointing
    to previously used bic/bfqq structures. If we don't do so, we'll
    either corrupt memory on dispatching a request, or cause an
    imbalance in counters.

    Inspired by a fix from Kees.

    Reported-by: Oleksandr Natalenko
    Reported-by: Kees Cook
    Cc: stable@vger.kernel.org
    Fixes: aee69d78dec0 ("block, bfq: introduce the BFQ-v0 I/O scheduler as an extra scheduler")
    Signed-off-by: Jens Axboe

    Jens Axboe
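    The fix described above can be sketched in userspace C (the struct and function names here are invented for illustration, not the actual BFQ/blk-mq code):

```c
#include <stddef.h>

/* Hypothetical sketch: always clear the per-request elevator private
 * pointers when preparing a request, even when no io context is
 * attached, so stale bic/bfqq pointers left by a previous user of the
 * same tag slot can never be dereferenced. */
struct request_sketch {
    void *elv_priv0;    /* would cache the bic in BFQ */
    void *elv_priv1;    /* would cache the bfqq in BFQ */
};

void prepare_request(struct request_sketch *rq, void *ioc)
{
    if (!ioc) {
        /* no io context: the slots may still hold pointers from the
         * request that previously occupied this tag; clear them */
        rq->elv_priv0 = NULL;
        rq->elv_priv1 = NULL;
        return;
    }
    /* ... with an io context, look up and attach bic/bfqq here ... */
}
```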
     

17 Apr, 2018

1 commit

    rq->gstate and rq->aborted_gstate are both zero before rqs are
    allocated. If we have a small timeout, then when the timer fires
    there could be rqs that have never been allocated, and also rqs
    that have been allocated but not yet initialized and started. At
    that moment rq->gstate and rq->aborted_gstate are both 0, so
    blk_mq_terminate_expired() will identify the rq as timed out and
    invoke .timeout early.

    For SCSI, this causes scsi_times_out() to be invoked before the
    scsi_cmnd is initialized; scsi_cmnd->device is still NULL at that
    point, and we crash.

    Cc: Bart Van Assche
    Cc: Tejun Heo
    Cc: Ming Lei
    Cc: Martin Steigerwald
    Cc: stable@vger.kernel.org
    Signed-off-by: Jianchao Wang
    Signed-off-by: Jens Axboe

    Jianchao Wang
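    A minimal userspace sketch of the failure mode (all names invented; this is an illustration of the zero-initialized generation-counter comparison, not the blk-mq code):

```c
#include <stdbool.h>

/* Comparing two zero-initialized generation counters makes a
 * never-started request look expired, so an expiry check must also
 * require that the request was actually started. */
struct rq_state {
    unsigned int gstate;          /* bumped when the rq is started */
    unsigned int aborted_gstate;  /* snapshot taken by the timeout path */
    bool started;
};

bool rq_expired(const struct rq_state *rq)
{
    /* unallocated/uninitialized rq: gstate == aborted_gstate == 0,
     * which would falsely compare as expired without this guard */
    if (!rq->started)
        return false;
    return rq->gstate == rq->aborted_gstate;
}
```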
     

15 Apr, 2018

1 commit

  • When blk_queue_enter() waits for a queue to unfreeze, or unset the
    PREEMPT_ONLY flag, do not allow it to be interrupted by a signal.

    The PREEMPT_ONLY flag was introduced later in commit 3a0a529971ec
    ("block, scsi: Make SCSI quiesce and resume work reliably"). Note the SCSI
    device is resumed asynchronously, i.e. after un-freezing userspace tasks.

    So that commit exposed the bug as a regression in v4.15. A mysterious
    SIGBUS (or -EIO) sometimes happened during the time the device was being
    resumed. Most frequently, there was no kernel log message, and we saw Xorg
    or Xwayland killed by SIGBUS.[1]

    [1] E.g. https://bugzilla.redhat.com/show_bug.cgi?id=1553979

    Without this fix, I get an IO error in this test:

    # dd if=/dev/sda of=/dev/null iflag=direct & \
    while killall -SIGUSR1 dd; do sleep 0.1; done & \
    echo mem > /sys/power/state ; \
    sleep 5; killall dd # stop after 5 seconds

    The interruptible wait was added to blk_queue_enter in
    commit 3ef28e83ab15 ("block: generic request_queue reference counting").
    Before then, the interruptible wait was only in blk-mq, but I don't think
    it could ever have been correct.

    Reviewed-by: Bart Van Assche
    Cc: stable@vger.kernel.org
    Signed-off-by: Alan Jenkins
    Signed-off-by: Jens Axboe

    Alan Jenkins
     

11 Apr, 2018

2 commits

  • This reverts commit 127276c6ce5a30fcc806b7fe53015f4f89b62956.

    When all CPUs of one hw queue become offline, there may still be IOs
    from this hctx that have not completed. But blk_mq_hw_queue_mapped() is
    checked in blk_mq_queue_tag_busy_iter(), which is used for iterating
    requests in the timeout handler, so timeout events will be missed on the
    inactive hctx and requests may never be completed.

    Also, the reimplementation of blk_mq_hw_queue_mapped() no longer matches
    the helper's name; it should have been named blk_mq_hw_queue_active().

    Other callers would also need further verification against this reimplementation.

    So revert this patch now; the hw queue activate/deactivate events can be
    improved after adequate research and testing.

    Cc: Stefan Haberland
    Cc: Christian Borntraeger
    Cc: Christoph Hellwig
    Reported-by: Jens Axboe
    Fixes: 127276c6ce5a30fcc ("blk-mq: reimplement blk_mq_hw_queue_mapped")
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     
  • Because blkcg_exit_queue() is now called from inside blk_cleanup_queue()
    it is no longer safe to access cgroup information during or after the
    blk_cleanup_queue() call. Hence protect the generic_make_request_checks()
    call with blk_queue_enter() / blk_queue_exit().

    Reported-by: Ming Lei
    Fixes: a063057d7c73 ("block: Fix a race between request queue removal and the block cgroup controller")
    Signed-off-by: Bart Van Assche
    Cc: Ming Lei
    Cc: Joseph Qi
    Signed-off-by: Jens Axboe

    Bart Van Assche
     

10 Apr, 2018

9 commits

    Firstly, from commit 4b855ad37194 ("blk-mq: Create hctx for each present CPU"),
    blk-mq doesn't remap queues any more after the CPU topology is changed.

    Secondly, set->nr_hw_queues can't be bigger than nr_cpu_ids, and now we map
    all possible CPUs to hw queues, so at least one CPU is mapped to each hctx.

    So the queue mapping has become static and fixed, just like a percpu
    variable, and we don't need to handle queue remapping any more.

    Cc: Stefan Haberland
    Tested-by: Christian Borntraeger
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     
    Now 'queue mapped' actually means that at least one online CPU is
    mapped to this hctx, so implement blk_mq_hw_queue_mapped() in this
    way.

    Cc: Stefan Haberland
    Tested-by: Christian Borntraeger
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     
  • There are several reasons for removing the check:

    1) blk_mq_hw_queue_mapped() now always returns true, since each hctx
    is mapped to at least one CPU

    2) when there isn't any online CPU mapped to this hctx, there won't
    be any IO queued to it; blk_mq_run_hw_queue() only runs the queue
    if there is IO queued to this hctx

    3) if __blk_mq_delay_run_hw_queue() is called from blk_mq_delay_run_hw_queue(),
    which is run from blk_mq_dispatch_rq_list() or scsi_mq_get_budget(),
    the hctx to be handled has to be mapped.

    Cc: Stefan Haberland
    Tested-by: Christian Borntraeger
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     
  • No driver uses this interface any more, so remove it.

    Cc: Stefan Haberland
    Tested-by: Christian Borntraeger
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     
    This patch introduces the helper blk_mq_hw_queue_first_cpu() for
    figuring out the hctx's first CPU, so that code duplication can be
    avoided.

    Cc: Stefan Haberland
    Tested-by: Christian Borntraeger
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     
    This patch figures out the final selected CPU first, then writes
    it to hctx->next_cpu once, so that no intermediate next_cpu value
    can be observed from other dispatch paths.

    Cc: Stefan Haberland
    Tested-by: Christian Borntraeger
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     
    From commit 4b855ad37194 ("blk-mq: Create hctx for each present CPU"),
    blk-mq doesn't remap queues after the CPU topology is changed, which
    means that when some of these offline CPUs become online, they are
    still mapped to hctx 0, and hctx 0 may then become the bottleneck of
    IO dispatch and completion.

    This patch sets up the mapping from the beginning, and aligns to
    queue mapping for PCI device (blk_mq_pci_map_queues()).

    Cc: Stefan Haberland
    Cc: Keith Busch
    Cc: stable@vger.kernel.org
    Fixes: 4b855ad37194 ("blk-mq: Create hctx for each present CPU")
    Tested-by: Christian Borntraeger
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     
    From commit 20e4d81393196 ("blk-mq: simplify queue mapping & schedule
    with each possible CPU"), one hctx can be mapped from all offline CPUs,
    and then hctx->next_cpu can be set wrongly.

    This patch fixes the issue by making hctx->next_cpu point to the
    first CPU in hctx->cpumask if all CPUs in hctx->cpumask are offline.

    Cc: Stefan Haberland
    Tested-by: Christian Borntraeger
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Sagi Grimberg
    Fixes: 20e4d81393196 ("blk-mq: simplify queue mapping & schedule with each possible CPU")
    Cc: stable@vger.kernel.org
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
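    The fallback described above can be sketched with a plain bitmask (a userspace illustration with invented names, not the kernel's cpumask API):

```c
#include <stdint.h>

/* Pick the next CPU for a hctx after 'prev'. Only CPUs that are both
 * in the hctx's mask and online are candidates; if every CPU in the
 * mask is offline, fall back to the first CPU in the mask instead of
 * leaving next_cpu pointing somewhere wrong. */
int next_cpu_from_mask(uint32_t cpumask, uint32_t online, int prev)
{
    uint32_t candidates = cpumask & online;

    if (!candidates) {
        /* all CPUs in the mask are offline: first CPU in the mask */
        for (int cpu = 0; cpu < 32; cpu++)
            if (cpumask & (1u << cpu))
                return cpu;
        return -1;  /* empty mask */
    }
    /* round-robin: first candidate strictly after 'prev', wrapping */
    for (int i = 1; i <= 32; i++) {
        int cpu = (prev + i) % 32;
        if (candidates & (1u << cpu))
            return cpu;
    }
    return -1;
}
```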
     
    This patch orders getting the budget and the driver tag by making
    sure the driver tag is acquired after the budget, which helps avoid
    the following race:

    1) before dispatching a request from the scheduler queue, get one budget
    first, then dequeue a request; call it request A.

    2) in another IO path, for dispatching request B, which is from hctx->dispatch,
    the driver tag is acquired, then blk_mq_dispatch_rq_list() tries to get the
    budget; unfortunately the budget is held by request A.

    3) meanwhile blk_mq_dispatch_rq_list() is called to dispatch request
    A and tries to get the driver tag first; unfortunately no driver tag is
    available because it is held by request B.

    4) neither IO path can make progress, and an IO stall results.

    This issue can be observed when running dbench on USB storage.

    This patch fixes this issue by always getting budget before getting
    driver tag.

    Cc: stable@vger.kernel.org
    Fixes: de1482974080ec9e ("blk-mq: introduce .get_budget and .put_budget in blk_mq_ops")
    Cc: Christoph Hellwig
    Cc: Bart Van Assche
    Cc: Omar Sandoval
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
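    The acquisition order can be illustrated with a small userspace analogue (all names hypothetical; real budget/tag accounting lives in blk-mq and the SCSI midlayer). The key point is that the budget is always taken first, and backed out if the tag is unavailable, so no path can sit on a tag while waiting for the budget:

```c
#include <stdbool.h>

/* Single budget and single driver tag, to make contention easy to show. */
int budget = 1;
int tags = 1;

bool get_budget(void)  { if (budget) { budget--; return true; } return false; }
void put_budget(void)  { budget++; }
bool get_tag(void)     { if (tags)   { tags--;   return true; } return false; }
void put_tag(void)     { tags++; }

/* Dispatch path: budget first, then driver tag; release the budget on
 * tag failure so the other IO path can make progress. */
bool dispatch_one(void)
{
    if (!get_budget())
        return false;
    if (!get_tag()) {
        put_budget();   /* back out: avoids the cross-holding stall */
        return false;
    }
    /* ... issue the request to the driver here ... */
    put_tag();
    put_budget();
    return true;
}
```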
     

06 Apr, 2018

1 commit

  • Pull block layer updates from Jens Axboe:
    "It's a pretty quiet round this time, which is nice. This contains:

    - series from Bart, cleaning up the way we set/test/clear atomic
    queue flags.

    - series from Bart, fixing races between gendisk and queue
    registration and removal.

    - set of bcache fixes and improvements from various folks, by way of
    Michael Lyle.

    - set of lightnvm updates from Matias, most of it being the 1.2 to
    2.0 transition.

    - removal of unused DIO flags from Nikolay.

    - blk-mq/sbitmap memory ordering fixes from Omar.

    - divide-by-zero fix for BFQ from Paolo.

    - minor documentation patches from Randy.

    - timeout fix from Tejun.

    - Alpha "can't write a char atomically" fix from Mikulas.

    - set of NVMe fixes by way of Keith.

    - bsg and bsg-lib improvements from Christoph.

    - a few sed-opal fixes from Jonas.

    - cdrom check-disk-change deadlock fix from Maurizio.

    - various little fixes, comment fixes, etc from various folks"

    * tag 'for-4.17/block-20180402' of git://git.kernel.dk/linux-block: (139 commits)
    blk-mq: Directly schedule q->timeout_work when aborting a request
    blktrace: fix comment in blktrace_api.h
    lightnvm: remove function name in strings
    lightnvm: pblk: remove some unnecessary NULL checks
    lightnvm: pblk: don't recover unwritten lines
    lightnvm: pblk: implement 2.0 support
    lightnvm: pblk: implement get log report chunk
    lightnvm: pblk: rename ppaf* to addrf*
    lightnvm: pblk: check for supported version
    lightnvm: implement get log report chunk helpers
    lightnvm: make address conversions depend on generic device
    lightnvm: add support for 2.0 address format
    lightnvm: normalize geometry nomenclature
    lightnvm: complete geo structure with maxoc*
    lightnvm: add shorten OCSSD version in geo
    lightnvm: add minor version to generic geometry
    lightnvm: simplify geometry structure
    lightnvm: pblk: refactor init/exit sequences
    lightnvm: Avoid validation of default op value
    lightnvm: centralize permission check for lightnvm ioctl
    ...

    Linus Torvalds
     

05 Apr, 2018

1 commit

  • Pull char/misc updates from Greg KH:
    "Here is the big set of char/misc driver patches for 4.17-rc1.

    There are a lot of little things in here, nothing huge, but all
    important to the different hardware types involved:

    - thunderbolt driver updates

    - parport updates (people still care...)

    - nvmem driver updates

    - mei updates (as always)

    - hwtracing driver updates

    - hyperv driver updates

    - extcon driver updates

    - ... and a handful of even smaller driver subsystem and individual
    driver updates

    All of these have been in linux-next with no reported issues"

    * tag 'char-misc-4.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (149 commits)
    hwtracing: Add HW tracing support menu
    intel_th: Add ACPI glue layer
    intel_th: Allow forcing host mode through drvdata
    intel_th: Pick up irq number from resources
    intel_th: Don't touch switch routing in host mode
    intel_th: Use correct method of finding hub
    intel_th: Add SPDX GPL-2.0 header to replace GPLv2 boilerplate
    stm class: Make dummy's master/channel ranges configurable
    stm class: Add SPDX GPL-2.0 header to replace GPLv2 boilerplate
    MAINTAINERS: Bestow upon myself the care for drivers/hwtracing
    hv: add SPDX license id to Kconfig
    hv: add SPDX license to trace
    Drivers: hv: vmbus: do not mark HV_PCIE as perf_device
    Drivers: hv: vmbus: respect what we get from hv_get_synint_state()
    /dev/mem: Avoid overwriting "err" in read_mem()
    eeprom: at24: use SPDX identifier instead of GPL boiler-plate
    eeprom: at24: simplify the i2c functionality checking
    eeprom: at24: fix a line break
    eeprom: at24: tweak newlines
    eeprom: at24: refactor at24_probe()
    ...

    Linus Torvalds
     

03 Apr, 2018

1 commit

  • Request abortion is performed by overriding deadline to now and
    scheduling timeout handling immediately. For the latter part, the
    code was using mod_timer(timeout, 0) which can't guarantee that the
    timer runs afterwards. Let's schedule the underlying work item
    directly instead.

    This fixes the hangs during probing reported by Sitsofe but it isn't
    yet clear to me how the failure can happen reliably if it's just the
    above described race condition.

    Signed-off-by: Tejun Heo
    Reported-by: Sitsofe Wheeler
    Reported-by: Meelis Roos
    Fixes: 358f70da49d7 ("blk-mq: make blk_abort_request() trigger timeout path")
    Cc: stable@vger.kernel.org # v4.16
    Link: http://lkml.kernel.org/r/CALjAwxh-PVYFnYFCJpGOja+m5SzZ8Sa4J7ohxdK=r8NyOF-EMA@mail.gmail.com
    Link: http://lkml.kernel.org/r/alpine.LRH.2.21.1802261049140.4893@math.ut.ee
    Signed-off-by: Jens Axboe

    Tejun Heo
     

28 Mar, 2018

1 commit

  • The PCI interrupt vectors intended to be associated with a queue may
    not start at 0; a driver may allocate pre_vectors for special use. This
    patch adds an offset parameter so blk-mq may find the intended affinity
    mask and updates all drivers using this API accordingly.

    Cc: Don Brace
    Signed-off-by: Keith Busch
    Reviewed-by: Ming Lei
    Signed-off-by: Jens Axboe

    Keith Busch
     

27 Mar, 2018

1 commit

  • If a storage device handled by BFQ happens to be slower than 7.5 KB/s
    for a certain amount of time (in the order of a second), then the
    estimated peak rate of the device, maintained in BFQ, becomes equal to
    0. The reason is the limited precision with which the rate is
    represented (details on the range of representable values in the
    comments introduced by this commit). This leads to a division-by-zero
    error where the estimated peak rate is used as divisor. Such a type of
    failure has been reported in [1].

    This commit addresses this issue by:
    1. Lower-bounding the estimated peak rate to 1
    2. Adding and improving comments on the range of rates representable

    [1] https://www.spinics.net/lists/kernel/msg2739205.html

    Signed-off-by: Konstantin Khlebnikov
    Signed-off-by: Paolo Valente
    Signed-off-by: Jens Axboe

    Paolo Valente
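    The lower-bounding in point 1 amounts to something like the following (a simplified sketch; BFQ's actual fixed-point rate representation is more involved):

```c
#include <stdint.h>

/* Lower-bound the estimated peak rate so it can never be used as a
 * zero divisor, even after a long stretch of very slow IO. */
uint64_t clamp_peak_rate(uint64_t estimated_rate)
{
    return estimated_rate ? estimated_rate : 1;
}

/* Example user of the rate as a divisor, as in the reported crash path. */
uint64_t service_time(uint64_t bytes, uint64_t peak_rate)
{
    return bytes / clamp_peak_rate(peak_rate);
}
```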
     

26 Mar, 2018

1 commit


22 Mar, 2018

1 commit


20 Mar, 2018

1 commit

  • scsi_device_quiesce() uses synchronize_rcu() to guarantee that the
    effect of blk_set_preempt_only() will be visible for percpu_ref_tryget()
    calls that occur after the queue unfreeze by using the approach
    explained in https://lwn.net/Articles/573497/. The rcu read lock and
    unlock calls in blk_queue_enter() form a pair with the synchronize_rcu()
    call in scsi_device_quiesce(). Both scsi_device_quiesce() and
    blk_queue_enter() must either use regular RCU or RCU-sched.
    Since neither the RCU-protected code in blk_queue_enter() nor
    blk_queue_usage_counter_release() sleeps, regular RCU protection
    is sufficient. Note: scsi_device_quiesce() does not have to be
    modified since it already uses synchronize_rcu().

    Reported-by: Tejun Heo
    Fixes: 3a0a529971ec ("block, scsi: Make SCSI quiesce and resume work reliably")
    Signed-off-by: Bart Van Assche
    Acked-by: Tejun Heo
    Cc: Tejun Heo
    Cc: Hannes Reinecke
    Cc: Ming Lei
    Cc: Christoph Hellwig
    Cc: Johannes Thumshirn
    Cc: Oleksandr Natalenko
    Cc: Martin Steigerwald
    Cc: stable@vger.kernel.org # v4.15
    Signed-off-by: Jens Axboe

    Bart Van Assche
     

18 Mar, 2018

2 commits

  • bio_check_eod() should check partition size not the whole disk if
    bio->bi_partno is non-zero. Do this by moving the call
    to bio_check_eod() into blk_partition_remap().

    Based on an earlier patch from Jiufei Xue.

    Fixes: 74d46992e0d9 ("block: replace bi_bdev with a gendisk pointer and partitions index")
    Reported-by: Jiufei Xue
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     
  • Since commit 634f9e4631a8 ("blk-mq: remove REQ_ATOM_COMPLETE usages
    from blk-mq") blk_rq_is_complete() only reports whether or not a
    request has completed for legacy queues. Hence modify the
    blk-mq-debugfs code such that it shows the blk-mq request state
    again.

    Fixes: 634f9e4631a8 ("blk-mq: remove REQ_ATOM_COMPLETE usages from blk-mq")
    Signed-off-by: Bart Van Assche
    Cc: Tejun Heo
    Signed-off-by: Jens Axboe

    Bart Van Assche
     

17 Mar, 2018

2 commits

  • We've triggered a WARNING in blk_throtl_bio() when throttling writeback
    io, which complains blkg->refcnt is already 0 when calling blkg_get(),
    and then kernel crashes with invalid page request.
    After investigating this issue, we've found it is caused by a race
    between blkcg_bio_issue_check() and cgroup_rmdir(), which is described
    below:

    writeback kworker                    cgroup_rmdir
                                           cgroup_destroy_locked
                                             kill_css
                                               css_killed_ref_fn
                                                 css_killed_work_fn
                                                   offline_css
                                                     blkcg_css_offline
    blkcg_bio_issue_check
      rcu_read_lock
      blkg_lookup
                                                       spin_trylock(q->queue_lock)
                                                       blkg_destroy
                                                       spin_unlock(q->queue_lock)
      blk_throtl_bio
        spin_lock_irq(q->queue_lock)
        ...
        spin_unlock_irq(q->queue_lock)
      rcu_read_unlock

    Since RCU can only prevent the blkg from being released while it is in
    use, blkg->refcnt can be decreased to 0 during blkg_destroy(), which
    schedules the blkg release.
    Calling blkg_get() in blk_throtl_bio() then triggers the WARNING,
    and the corresponding blkg_put() schedules the blkg release again,
    resulting in a double free.
    This race was introduced by commit ae1188963611 ("blkcg: consolidate blkg
    creation in blkcg_bio_issue_check()"). Before that commit, the code would
    look up first and then try to lookup/create again with the queue_lock
    held. Since reviving that logic would be a bit drastic, fix the race by
    only offlining the pd during blkcg_css_offline(), and move the rest of
    the destruction (especially blkg_put()) into blkcg_css_free(), which
    should be the right approach as discussed.

    Fixes: ae1188963611 ("blkcg: consolidate blkg creation in blkcg_bio_issue_check()")
    Reported-by: Jiufei Xue
    Signed-off-by: Joseph Qi
    Acked-by: Tejun Heo
    Signed-off-by: Jens Axboe

    Joseph Qi
     
  • The length must be given as bytes and not as 4 bit tuples.

    Reviewed-by: Scott Bauer
    Signed-off-by: Jonas Rabenstein
    Signed-off-by: Jens Axboe

    Jonas Rabenstein
     

16 Mar, 2018

1 commit

  • register_blkdev() and __register_chrdev_region() treat the major
    number as an unsigned int. So print it the same way to avoid
    absurd error statements such as:
    "... major requested (-1) is greater than the maximum (511) ..."
    (and also fix off-by-one bugs in the error prints).

    While at it, also update the comment describing register_blkdev().

    Signed-off-by: Srivatsa S. Bhat
    Reviewed-by: Logan Gunthorpe
    Signed-off-by: Greg Kroah-Hartman

    Srivatsa S. Bhat
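    The message fix can be sketched as follows (the function name and the BLKDEV_MAJOR_MAX value here are assumptions for illustration, not the kernel's exact code):

```c
#include <stdio.h>

#define BLKDEV_MAJOR_MAX 512  /* assumed limit, for illustration */

/* Print the requested major as an unsigned int, matching how
 * register_blkdev() treats it, so a huge value is shown instead of a
 * nonsensical negative number; also report the maximum inclusively,
 * fixing the off-by-one in the old message. */
int format_major_error(char *buf, size_t n, unsigned int major)
{
    return snprintf(buf, n,
                    "major requested (%u) is greater than the maximum (%u)",
                    major, (unsigned int)(BLKDEV_MAJOR_MAX - 1));
}
```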
     

14 Mar, 2018

3 commits

  • The current BSG design tries to shoe-horn the transport-specific
    passthrough commands into the overall framework for SCSI passthrough
    requests. This has a couple of problems:

    - each passthrough queue has to set the QUEUE_FLAG_SCSI_PASSTHROUGH flag
    despite not dealing with SCSI commands at all. Because of that these
    queues could also incorrectly accept SCSI commands from in-kernel
    users or through the legacy SCSI_IOCTL_SEND_COMMAND ioctl.
    - the real SCSI bsg queues also incorrectly accept bsg requests of the
    BSG_SUB_PROTOCOL_SCSI_TRANSPORT type
    - the bsg transport code is almost unreadable because it tries to reuse
    different SCSI concepts for its own purpose.

    This patch instead adds a new bsg_ops structure to handle the two cases
    differently, and thus solves all of the above problems. Another side
    effect is that the bsg-lib queues also don't need to embed a
    struct scsi_request anymore.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Johannes Thumshirn
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     
  • Users of the bsg-lib interface should only use the bsg_job data structure
    and not know about implementation details of it.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Benjamin Block
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Johannes Thumshirn
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     
  • The zfcp driver wants to know the timeout for a bsg job, so add a field
    to struct bsg_job for it in preparation of not exposing the request
    to the bsg-lib users.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Benjamin Block
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Johannes Thumshirn
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

09 Mar, 2018

5 commits

  • Avoid that building with W=1 causes the kernel-doc tool to complain
    about undocumented function arguments for the blk-zoned.c source file.

    Signed-off-by: Bart Van Assche
    Cc: Christoph Hellwig
    Cc: Damien Le Moal
    Signed-off-by: Jens Axboe

    Bart Van Assche
     
  • This patch helps to avoid that new code gets introduced in block drivers
    that manipulates queue flags without holding the queue lock when that
    lock should be held.

    Cc: Christoph Hellwig
    Cc: Hannes Reinecke
    Cc: Ming Lei
    Reviewed-by: Johannes Thumshirn
    Reviewed-by: Martin K. Petersen
    Signed-off-by: Bart Van Assche
    Signed-off-by: Jens Axboe

    Bart Van Assche
     
  • This patch has been generated as follows:

    for verb in set_unlocked clear_unlocked set clear; do
        replace-in-files queue_flag_${verb} blk_queue_flag_${verb%_unlocked} \
            $(git grep -lw queue_flag_${verb} drivers block/bsg*)
    done

    Except for protecting all queue flag changes with the queue lock
    this patch does not change any functionality.

    Cc: Mike Snitzer
    Cc: Shaohua Li
    Cc: Christoph Hellwig
    Cc: Hannes Reinecke
    Cc: Ming Lei
    Signed-off-by: Bart Van Assche
    Reviewed-by: Martin K. Petersen
    Reviewed-by: Johannes Thumshirn
    Acked-by: Martin K. Petersen
    Signed-off-by: Jens Axboe

    Bart Van Assche
     
  • Since the queue flags may be changed concurrently from multiple
    contexts after a queue becomes visible in sysfs, make these changes
    safe by protecting these with the queue lock.

    Cc: Christoph Hellwig
    Cc: Hannes Reinecke
    Cc: Ming Lei
    Reviewed-by: Martin K. Petersen
    Reviewed-by: Johannes Thumshirn
    Signed-off-by: Bart Van Assche
    Signed-off-by: Jens Axboe

    Bart Van Assche
     
  • Introduce functions that modify the queue flags and that protect
    these modifications with the request queue lock. Except for moving
    one wake_up_all() call from inside to outside a critical section,
    this patch does not change any functionality.

    Cc: Christoph Hellwig
    Cc: Hannes Reinecke
    Cc: Ming Lei
    Reviewed-by: Johannes Thumshirn
    Reviewed-by: Martin K. Petersen
    Signed-off-by: Bart Van Assche
    Signed-off-by: Jens Axboe

    Bart Van Assche
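    A userspace analogue of the new lock-protected helpers (names invented; the kernel takes the queue spinlock rather than a pthread mutex):

```c
#include <pthread.h>

/* Every queue-flag update goes through the queue lock, so concurrent
 * set/clear calls cannot lose updates once the queue is visible in
 * sysfs. */
struct queue_sketch {
    pthread_mutex_t lock;
    unsigned long flags;
};

void queue_flag_set(struct queue_sketch *q, unsigned int flag)
{
    pthread_mutex_lock(&q->lock);
    q->flags |= 1UL << flag;
    pthread_mutex_unlock(&q->lock);
}

void queue_flag_clear(struct queue_sketch *q, unsigned int flag)
{
    pthread_mutex_lock(&q->lock);
    q->flags &= ~(1UL << flag);
    pthread_mutex_unlock(&q->lock);
}
```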