08 Oct, 2018

5 commits

  • The code comments of closure_return_with_destructor() in closure.h mark
    the function name as closure_return(). This patch fixes this typo with
    the correct name - closure_return_with_destructor.

    Signed-off-by: Coly Li
    Signed-off-by: Jens Axboe

    Coly Li
     
  • When doing an ioctl on a flash device, ioctl_dev() in super.c is called.
    In that case we should not try to get the cached device, since a flash
    only device has no backend device. This patch just moves the
    dc->io_disable check into cached_dev_ioctl() so that ioctl on a flash
    device works correctly.

    Fixes: 0f0709e6bfc3c ("bcache: stop bcache device when backing device is offline")
    Signed-off-by: Tang Junhui
    Cc: stable@vger.kernel.org
    Signed-off-by: Coly Li
    Signed-off-by: Jens Axboe

    Tang Junhui
     
  • In cached_dev_cache_miss() and check_should_bypass(), REQ_META is used
    to check whether a bio is for metadata request. REQ_META is used for
    blktrace, the correct REQ_ flag should be REQ_PRIO. This flag means the
    bio should take priority over other bios, and it is frequently used to
    indicate metadata I/O in file system code.

    This patch replaces REQ_META with correct flag REQ_PRIO.

    CC Adam Manzanares because he explained to me what REQ_PRIO is for.

    Signed-off-by: Coly Li
    Cc: Adam Manzanares
    Signed-off-by: Jens Axboe

    Coly Li
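
The flag swap described above can be pictured with a small userspace model. This is only a sketch: the `sketch_` names and the bit positions are assumptions made for illustration, and the real flag values are defined in include/linux/blk_types.h. It shows that a bio carrying only the priority flag is missed by a REQ_META-style test but caught by a REQ_PRIO-style one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits for illustration only; the real values are
 * defined by the kernel and differ from these. */
#define SKETCH_REQ_META (1u << 0) /* metadata annotation used for blktrace */
#define SKETCH_REQ_PRIO (1u << 1) /* bio should take priority over others */

/* Hypothetical, stripped-down bio carrying only the operation flags. */
struct sketch_bio {
    unsigned int bi_opf;
};

/* Old test: keyed on the blktrace annotation. */
bool sketch_is_metadata_old(const struct sketch_bio *bio)
{
    assert(bio);
    return (bio->bi_opf & SKETCH_REQ_META) != 0;
}

/* New test: keyed on the priority flag. */
bool sketch_is_metadata_new(const struct sketch_bio *bio)
{
    assert(bio);
    return (bio->bi_opf & SKETCH_REQ_PRIO) != 0;
}
```

With a bio whose `bi_opf` is `SKETCH_REQ_PRIO` alone, `sketch_is_metadata_old()` returns false while `sketch_is_metadata_new()` returns true.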
     
  • Missed read I/Os are identified by s->cache_missed, not by
    s->cache_miss, so trace_bcache_read() should use s->cache_missed
    to identify whether the I/O missed or not.

    Signed-off-by: Tang Junhui
    Cc: stable@vger.kernel.org
    Signed-off-by: Coly Li
    Signed-off-by: Jens Axboe

    Tang Junhui
     
  • UUIDs are considered as metadata. __uuid_write should add the number
    of buckets (in sectors) written to disk to ca->meta_sectors_written.
    Currently only 1 bucket is used in uuid write.

    Steps to test:
    1) create a fresh backing device and a fresh cache device separately.
    The backing device is not attached to any cache set.
    2) cd /sys/block//bcache
    cat metadata_written // record the output value
    cat bucket_size
    3) attach the backing device to cache set
    4) cat metadata_written
    Before the change, the output value is almost the same as the value
    in step 2.
    After the change, the value is larger by about 1 bucket size.

    Signed-off-by: Shenghui Wang
    Reviewed-by: Tang Junhui
    Signed-off-by: Coly Li
    Signed-off-by: Jens Axboe

    Shenghui Wang
     

05 Oct, 2018

2 commits

  • Pull NVMe updates from Christoph:

    "A relatively boring merge window:

    - better AEN tracing (Chaitanya)
    - NUMA aware PCIe multipathing (me)
    - RDMA workqueue fixes (Sagi)
    - better bio usage in the target (Sagi)
    - FC rework for target removal (James)
    - better multipath handling of ->queue_rq failures (James)
    - various cleanups (Milan)"

    * 'nvme-4.20' of git://git.infradead.org/nvme:
    nvmet-rdma: use a private workqueue for delete
    nvme: take node locality into account when selecting a path
    nvmet: don't split large I/Os unconditionally
    nvme: call nvme_complete_rq when nvmf_check_ready fails for mpath I/O
    nvme-core: add async event trace helper
    nvme_fc: add 'nvme_discovery' sysfs attribute to fc transport device
    nvmet_fc: support target port removal with nvmet layer
    nvme-fc: fix for a minor typos
    nvmet: remove redundant module prefix
    nvme: fix typo in nvme_identify_ns_descs

    Jens Axboe
     
  • Queue deletion is done asynchronously when the last reference on the queue
    is dropped. Thus, in order to make sure we don't over allocate under a
    connect/disconnect storm, we let queue deletion complete before making
    forward progress.

    However, given that we flush the system_wq from rdma_cm context which
    runs from a workqueue context, we can have a circular locking complaint
    [1]. Fix that by using a private workqueue for queue deletion.

    [1]:
    ======================================================
    WARNING: possible circular locking dependency detected
    4.19.0-rc4-dbg+ #3 Not tainted
    ------------------------------------------------------
    kworker/5:0/39 is trying to acquire lock:
    00000000a10b6db9 (&id_priv->handler_mutex){+.+.}, at: rdma_destroy_id+0x6f/0x440 [rdma_cm]

    but task is already holding lock:
    00000000331b4e2c ((work_completion)(&queue->release_work)){+.+.}, at: process_one_work+0x3ed/0xa20

    which lock already depends on the new lock.

    the existing dependency chain (in reverse order) is:

    -> #3 ((work_completion)(&queue->release_work)){+.+.}:
    process_one_work+0x474/0xa20
    worker_thread+0x63/0x5a0
    kthread+0x1cf/0x1f0
    ret_from_fork+0x24/0x30

    -> #2 ((wq_completion)"events"){+.+.}:
    flush_workqueue+0xf3/0x970
    nvmet_rdma_cm_handler+0x133d/0x1734 [nvmet_rdma]
    cma_ib_req_handler+0x72f/0xf90 [rdma_cm]
    cm_process_work+0x2e/0x110 [ib_cm]
    cm_req_handler+0x135b/0x1c30 [ib_cm]
    cm_work_handler+0x2b7/0x38cd [ib_cm]
    process_one_work+0x4ae/0xa20
    nvmet_rdma:nvmet_rdma_cm_handler: nvmet_rdma: disconnected (10): status 0 id 0000000040357082
    worker_thread+0x63/0x5a0
    kthread+0x1cf/0x1f0
    ret_from_fork+0x24/0x30
    nvme nvme0: Reconnecting in 10 seconds...

    -> #1 (&id_priv->handler_mutex/1){+.+.}:
    __mutex_lock+0xfe/0xbe0
    mutex_lock_nested+0x1b/0x20
    cma_ib_req_handler+0x6aa/0xf90 [rdma_cm]
    cm_process_work+0x2e/0x110 [ib_cm]
    cm_req_handler+0x135b/0x1c30 [ib_cm]
    cm_work_handler+0x2b7/0x38cd [ib_cm]
    process_one_work+0x4ae/0xa20
    worker_thread+0x63/0x5a0
    kthread+0x1cf/0x1f0
    ret_from_fork+0x24/0x30

    -> #0 (&id_priv->handler_mutex){+.+.}:
    lock_acquire+0xc5/0x200
    __mutex_lock+0xfe/0xbe0
    mutex_lock_nested+0x1b/0x20
    rdma_destroy_id+0x6f/0x440 [rdma_cm]
    nvmet_rdma_release_queue_work+0x8e/0x1b0 [nvmet_rdma]
    process_one_work+0x4ae/0xa20
    worker_thread+0x63/0x5a0
    kthread+0x1cf/0x1f0
    ret_from_fork+0x24/0x30

    Fixes: 777dc82395de ("nvmet-rdma: occasionally flush ongoing controller teardown")
    Reported-by: Bart Van Assche
    Signed-off-by: Sagi Grimberg
    Tested-by: Bart Van Assche

    Signed-off-by: Christoph Hellwig

    Sagi Grimberg
     

04 Oct, 2018

2 commits

  • Some time ago REQ_DISCARD was renamed into REQ_OP_DISCARD. Some comments
    and documentation files were not updated however. Update these comments
    and documentation files. See also commit 4e1b2d52a80d ("block, fs,
    drivers: remove REQ_OP compat defs and related code").

    Signed-off-by: Bart Van Assche
    Cc: Mike Christie
    Cc: Martin K. Petersen
    Cc: Philipp Reisner
    Cc: Lars Ellenberg
    Signed-off-by: Jens Axboe

    Bart Van Assche
     
  • There is another cast from unsigned long to int which causes
    a bounds check to fail with specially crafted input. The value is
    then used as an index in the slot array in cdrom_slot_status().

    This issue is similar to CVE-2018-16658 and CVE-2018-10940.

    Signed-off-by: Young_X
    Signed-off-by: Jens Axboe

    Young_X
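
The class of bug described above can be reproduced in a few lines of userspace C. This is a sketch, not the cdrom driver code: the function names and the `nslots` parameter are hypothetical. It assumes the usual compiler behaviour in which converting an out-of-range unsigned long to int wraps (the C standard leaves that conversion implementation-defined).

```c
#include <assert.h>
#include <stdbool.h>

/* Buggy pattern: the user-supplied value is narrowed to int before the
 * bounds check, so a huge unsigned value can turn negative and pass. */
bool sketch_slot_in_range_buggy(unsigned long arg, int nslots)
{
    assert(nslots > 0);
    return (int)arg < nslots;
}

/* Safer pattern: compare in the unsigned domain and narrow only after
 * the value is known to be a valid index. */
bool sketch_slot_in_range_checked(unsigned long arg, int nslots)
{
    assert(nslots > 0);
    return arg < (unsigned long)nslots;
}
```

Passing `~0UL` with eight slots is accepted by the buggy check on typical compilers, because the value becomes -1 after the cast, and is rejected by the checked version.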
     

02 Oct, 2018

10 commits

  • Replace "fallthru" with a proper "fall through" annotation.

    This fix is part of the ongoing efforts to enable
    -Wimplicit-fallthrough

    Signed-off-by: Gustavo A. R. Silva
    Signed-off-by: Jens Axboe

    Gustavo A. R. Silva
     
  • Make current_path an array with an entry for every possible node, and
    cache the best path on a per-node basis. Take the node distance into
    account when selecting it. This is primarily useful for dual-ported PCIe
    devices which are connected to PCIe root ports on different sockets.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Keith Busch
    Reviewed-by: Hannes Reinecke

    Christoph Hellwig
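
The per-node caching described above can be sketched in userspace as follows. All `sketch_` types, the two-node limit and the distance table are assumptions made for illustration; they are not the kernel's data structures. The idea shown is a cached path pointer per node, refilled by picking the usable path with the smallest node distance.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define SKETCH_MAX_NODES 2

/* Hypothetical path descriptor. */
struct sketch_path {
    int node;   /* NUMA node the controller is attached to */
    int usable; /* non-zero if the path can take I/O */
};

/* Hypothetical distance table: 10 for local, 20 for remote. */
static const int sketch_distance[SKETCH_MAX_NODES][SKETCH_MAX_NODES] = {
    { 10, 20 },
    { 20, 10 },
};

struct sketch_head {
    struct sketch_path *paths;
    size_t npaths;
    /* One cached choice per node instead of a single pointer. */
    struct sketch_path *current_path[SKETCH_MAX_NODES];
};

/* Return the cached path for this node if it is still usable; otherwise
 * pick the usable path closest to the node and cache it. */
struct sketch_path *sketch_find_path(struct sketch_head *head, int node)
{
    struct sketch_path *best = NULL;
    int best_distance = INT_MAX;

    assert(head && node >= 0 && node < SKETCH_MAX_NODES);

    if (head->current_path[node] && head->current_path[node]->usable)
        return head->current_path[node];

    for (size_t i = 0; i < head->npaths; i++) {
        struct sketch_path *p = &head->paths[i];
        int d;

        if (!p->usable)
            continue;
        d = sketch_distance[node][p->node];
        if (d < best_distance) {
            best_distance = d;
            best = p;
        }
    }
    head->current_path[node] = best;
    return best;
}
```

With one usable path on each node, a submitter on node 0 gets the node-0 path and a submitter on node 1 gets the node-1 path. If the node-1 path becomes unusable, the next lookup from node 1 falls back to the remote path and caches it.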
     
  • If we know that the I/O size exceeds our inline bio vec, there is no
    point in using it and splitting the rest to begin with. We could
    in theory reuse the inline bio and only allocate the bio_vec,
    but it's really not worth optimizing for.

    Signed-off-by: Sagi Grimberg
    Signed-off-by: Christoph Hellwig

    Sagi Grimberg
     
  • When an io is rejected by nvmf_check_ready() due to validation of the
    controller state, the nvmf_fail_nonready_command() will normally return
    BLK_STS_RESOURCE to requeue and retry. However, if the controller is
    dying or the I/O is marked for NVMe multipath, the I/O is failed so that
    the controller can terminate or so that the io can be issued on a
    different path. Unfortunately, as this reject point is before the
    transport has accepted the command, blk-mq ends up completing the I/O
    and never calls nvme_complete_rq(), which is where multipath may preserve
    or re-route the I/O. The end result is, the device user ends up seeing an
    EIO error.

    Example: single path connectivity, controller is under load, and a reset
    is induced. An I/O is received:

    a) while the reset state has been set but the queues have yet to be
    stopped; or
    b) after queues are started (at end of reset) but before the reconnect
    has completed.

    The I/O finishes with an EIO status.

    This patch makes the following changes:

    - Adds the HOST_PATH_ERROR pathing status from TP4028
    - Modifies the reject point such that it appears to queue successfully,
    but actually completes the io with the new pathing status and calls
    nvme_complete_rq().
    - nvme_complete_rq() recognizes the new status, avoids resetting the
    controller (likely was already done in order to get this new status),
    and calls the multipather to clear the current path that errored.
    This allows the next command (retry or new command) to select a new
    path if there is one.

    Signed-off-by: James Smart
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Christoph Hellwig

    James Smart
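
The changed reject point described above can be modelled in userspace roughly as follows. This is a sketch under stated assumptions: every `sketch_` name, the boolean fields and the two enums are hypothetical stand-ins, not the kernel's types or status values. It shows the two outcomes: a requeue for an ordinary command, and a completion with a pathing status for a dying controller or a multipath I/O.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for block-layer and NVMe status codes. */
enum sketch_blk_status { SKETCH_BLK_STS_OK, SKETCH_BLK_STS_RESOURCE };
enum sketch_nvme_status { SKETCH_SC_NONE, SKETCH_SC_HOST_PATH_ERROR };

struct sketch_request {
    bool ctrl_dying;   /* controller is being torn down */
    bool multipath;    /* I/O was issued through NVMe multipath */
    bool completed;    /* set once the completion handler has run */
    bool path_cleared; /* multipath was told to drop the current path */
    enum sketch_nvme_status status;
};

/* Models the completion handler: on a host path error it clears the
 * current path rather than resetting the controller. */
void sketch_complete_rq(struct sketch_request *rq)
{
    assert(rq);
    rq->completed = true;
    if (rq->multipath && rq->status == SKETCH_SC_HOST_PATH_ERROR)
        rq->path_cleared = true;
}

/* Models the reject point for a command that arrives while the
 * controller is not ready. */
enum sketch_blk_status sketch_fail_nonready(struct sketch_request *rq)
{
    assert(rq);
    if (!rq->ctrl_dying && !rq->multipath)
        return SKETCH_BLK_STS_RESOURCE; /* requeue and retry later */

    /* Appear to queue successfully, but complete the request with the
     * pathing status so the completion handler gets to run. */
    rq->status = SKETCH_SC_HOST_PATH_ERROR;
    sketch_complete_rq(rq);
    return SKETCH_BLK_STS_OK;
}
```

In this model a multipath request returns `SKETCH_BLK_STS_OK` and ends up with `completed` and `path_cleared` set, so a retry can pick another path; a plain request on a live controller is simply requeued.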
     
  • This patch adds a new event for nvme async event notification.
    We print the async event in the decoded format when we recognize
    the event; otherwise we just dump the result.

    Signed-off-by: Chaitanya Kulkarni
    Signed-off-by: Christoph Hellwig

    Chaitanya Kulkarni
     
  • The fc transport device should allow for a rediscovery, as userspace
    might have lost the events. An example is udev events not being handled
    during system startup.

    This patch adds a sysfs entry 'nvme_discovery' on the fc class to
    have it replay all udev discovery events for all local port/remote
    port address pairs.

    Signed-off-by: James Smart
    Signed-off-by: Christoph Hellwig

    James Smart
     
  • Currently, if a targetport has been connected to via the nvmet config
    (in other words, the add_port() transport routine called, and the nvmet
    port pointer stored for using in upcalls on new io), and if the
    targetport is then removed (say the lldd driver decides to unload or
    fully reset its hardware) and then re-added (the lldd driver reloads or
    reinits its hardware), the port pointer has been lost so there's no way
    to continue to post commands up to nvmet via the transport port.

    Correct by allocating a small "port context" structure that will be
    linked to by the targetport. The context will save the targetport WWN's
    and the nvmet port pointer to use for it. Initial allocation will occur
    when the targetport is bound to via add_port. The context will be
    deallocated when remove_port() is called. If a targetport is removed
    while nvmet has the active port context, the targetport will be unlinked
    from the port context before removal. If a new targetport is registered,
    the port contexts without a binding are looked through and if the WWN's
    match (so it's the same as nvmet's port context) the port context is
    linked to the new target port. Thus new io can be received on the new
    targetport and operation resumes with nvmet.

    Additionally, this also resolves nvmet configuration changing out from
    underneath of the nvme-fc target port (for example: a nvmetcli clear).

    Signed-off-by: James Smart
    Signed-off-by: Christoph Hellwig

    James Smart
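
The port context lifecycle described above can be sketched in userspace as follows. The structures, the fixed-size table and the function names are hypothetical and exist only to illustrate the flow: bind on add_port, unlink when the targetport is removed, and re-link a newly registered targetport whose WWNs match an unbound context.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_CTX 4

struct sketch_port_ctx;

/* Hypothetical targetport: identified by its WWNN and WWPN. */
struct sketch_tgtport {
    uint64_t node_name;
    uint64_t port_name;
    struct sketch_port_ctx *pe; /* linked port context, if any */
};

/* Hypothetical port context: survives removal of the targetport. */
struct sketch_port_ctx {
    int in_use;
    uint64_t node_name;
    uint64_t port_name;
    void *nvmet_port;               /* saved nvmet port pointer */
    struct sketch_tgtport *tgtport; /* NULL while the targetport is gone */
};

/* add_port(): allocate a context, save the WWNs and nvmet port, link it. */
struct sketch_port_ctx *sketch_add_port(struct sketch_port_ctx *table,
                                        struct sketch_tgtport *tgt,
                                        void *nvmet_port)
{
    assert(table && tgt);
    for (size_t i = 0; i < SKETCH_MAX_CTX; i++) {
        struct sketch_port_ctx *pe = &table[i];

        if (pe->in_use)
            continue;
        pe->in_use = 1;
        pe->node_name = tgt->node_name;
        pe->port_name = tgt->port_name;
        pe->nvmet_port = nvmet_port;
        pe->tgtport = tgt;
        tgt->pe = pe;
        return pe;
    }
    return NULL; /* table full */
}

/* Targetport removal: unlink it but keep the context for nvmet. */
void sketch_unregister_targetport(struct sketch_tgtport *tgt)
{
    assert(tgt);
    if (tgt->pe) {
        tgt->pe->tgtport = NULL;
        tgt->pe = NULL;
    }
}

/* New targetport: adopt an unbound context whose WWNs match. */
void sketch_register_targetport(struct sketch_port_ctx *table,
                                struct sketch_tgtport *tgt)
{
    assert(table && tgt);
    tgt->pe = NULL;
    for (size_t i = 0; i < SKETCH_MAX_CTX; i++) {
        struct sketch_port_ctx *pe = &table[i];

        if (pe->in_use && !pe->tgtport &&
            pe->node_name == tgt->node_name &&
            pe->port_name == tgt->port_name) {
            pe->tgtport = tgt;
            tgt->pe = pe;
            return;
        }
    }
}
```

A targetport that is removed and later registered again with the same WWNs gets the original context back, including the saved nvmet port pointer. A targetport with different WWNs stays unbound.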
     
  • Signed-off-by: Milan P. Gandhi
    Reviewed-by: James Smart
    Signed-off-by: Christoph Hellwig

    Milan P. Gandhi
     
  • This patch removes the redundant module prefix used in the pr_err() when
    nvmet_get_smart_log_nsid() failed to find the namespace provided as a part
    of smart-log command.

    Signed-off-by: Chaitanya Kulkarni
    Signed-off-by: Christoph Hellwig

    Chaitanya Kulkarni
     
  • Signed-off-by: Milan P. Gandhi
    Reviewed-by: Chaitanya Kulkarni
    Signed-off-by: Christoph Hellwig

    Milan P. Gandhi
     

01 Oct, 2018

1 commit

  • Merge -rc6 in, for two reasons:

    1) Resolve a trivial conflict in the blk-mq-tag.c documentation
    2) A few important regression fixes went into upstream directly, so
    they aren't in the 4.20 branch.

    Signed-off-by: Jens Axboe

    * tag 'v4.19-rc6': (780 commits)
    Linux 4.19-rc6
    MAINTAINERS: fix reference to moved drivers/{misc => auxdisplay}/panel.c
    cpufreq: qcom-kryo: Fix section annotations
    perf/core: Add sanity check to deal with pinned event failure
    xen/blkfront: correct purging of persistent grants
    Revert "xen/blkfront: When purging persistent grants, keep them in the buffer"
    selftests/powerpc: Fix Makefiles for headers_install change
    blk-mq: I/O and timer unplugs are inverted in blktrace
    dax: Fix deadlock in dax_lock_mapping_entry()
    x86/boot: Fix kexec booting failure in the SEV bit detection code
    bcache: add separate workqueue for journal_write to avoid deadlock
    drm/amd/display: Fix Edid emulation for linux
    drm/amd/display: Fix Vega10 lightup on S3 resume
    drm/amdgpu: Fix vce work queue was not cancelled when suspend
    Revert "drm/panel: Add device_link from panel device to DRM device"
    xen/blkfront: When purging persistent grants, keep them in the buffer
    clocksource/drivers/timer-atmel-pit: Properly handle error cases
    block: fix deadline elevator drain for zoned block devices
    ACPI / hotplug / PCI: Don't scan for non-hotplug bridges if slot is not bridge
    drm/syncobj: Don't leak fences when WAIT_FOR_SUBMIT is set
    ...

    Signed-off-by: Jens Axboe

    Jens Axboe
     

30 Sep, 2018

2 commits

  • Jens writes:
    "Block fixes for 4.19-rc6

    A set of fixes that should go into this release. This pull request
    contains:

    - A fix (hopefully) for the persistent grants for xen-blkfront. A
    previous fix from this series wasn't complete, hence reverted, and
    this one should hopefully be it. (Boris Ostrovsky)

    - Fix for an elevator drain warning with SMR devices, which is
    triggered when you switch schedulers (Damien)

    - bcache deadlock fix (Guoju Fang)

    - Fix for the block unplug tracepoint, which has had the
    timer/explicit flag reverted since 4.11 (Ilya)

    - Fix a regression in this series where the blk-mq timeout hook is
    invoked with the RCU read lock held, hence preventing it from
    blocking (Keith)

    - NVMe pull from Christoph, with a single multipath fix (Susobhan Dey)"

    * tag 'for-linus-20180929' of git://git.kernel.dk/linux-block:
    xen/blkfront: correct purging of persistent grants
    Revert "xen/blkfront: When purging persistent grants, keep them in the buffer"
    blk-mq: I/O and timer unplugs are inverted in blktrace
    bcache: add separate workqueue for journal_write to avoid deadlock
    xen/blkfront: When purging persistent grants, keep them in the buffer
    block: fix deadline elevator drain for zoned block devices
    blk-mq: Allow blocking queue tag iter callbacks
    nvme: properly propagate errors in nvme_mpath_init

    Greg Kroah-Hartman
     
  • Thomas writes:
    "Three small fixes for clocksource drivers:
    - Proper error handling in the Atmel PIT driver
    - Add CLOCK_SOURCE_SUSPEND_NONSTOP for TI SoCs so suspend works again
    - Fix the next event function for Facebook Backpack-CMM BMC chips so
    usleep(100) doesn't sleep several milliseconds"

    * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    clocksource/drivers/timer-atmel-pit: Properly handle error cases
    clocksource/drivers/fttmr010: Fix set_next_event handler
    clocksource/drivers/ti-32k: Add CLOCK_SOURCE_SUSPEND_NONSTOP flag for non-am43 SoCs

    Greg Kroah-Hartman
     

29 Sep, 2018

8 commits

  • Rafael writes:
    "Power management fix for 4.19-rc6

    Fix incorrect __init and __exit annotations in the Qualcomm
    Kryo cpufreq driver (Nathan Chancellor)."

    * tag 'pm-4.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
    cpufreq: qcom-kryo: Fix section annotations

    Greg Kroah-Hartman
     
  • There is currently a warning when building the Kryo cpufreq driver into
    the kernel image:

    WARNING: vmlinux.o(.text+0x8aa424): Section mismatch in reference from
    the function qcom_cpufreq_kryo_probe() to the function
    .init.text:qcom_cpufreq_kryo_get_msm_id()
    The function qcom_cpufreq_kryo_probe() references
    the function __init qcom_cpufreq_kryo_get_msm_id().
    This is often because qcom_cpufreq_kryo_probe lacks a __init
    annotation or the annotation of qcom_cpufreq_kryo_get_msm_id is wrong.

    Remove the '__init' annotation from qcom_cpufreq_kryo_get_msm_id
    so that there is no more mismatch warning.

    Additionally, Nick noticed that the remove function was marked as
    '__init' when it should really be marked as '__exit'.

    Fixes: 46e2856b8e18 (cpufreq: Add Kryo CPU scaling driver)
    Fixes: 5ad7346b4ae2 (cpufreq: kryo: Add module remove and exit)
    Reported-by: Nick Desaulniers
    Signed-off-by: Nathan Chancellor
    Acked-by: Viresh Kumar
    Cc: 4.18+ # 4.18+
    Signed-off-by: Rafael J. Wysocki

    Nathan Chancellor
     
  • Dmitry writes:
    "Input updates for v4.19-rc5

    Just a few driver fixes"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
    Input: uinput - allow for max == min during input_absinfo validation
    Input: elantech - enable middle button of touchpad on ThinkPad P72
    Input: atakbd - fix Atari CapsLock behaviour
    Input: atakbd - fix Atari keymap
    Input: egalax_ts - add system wakeup support
    Input: gpio-keys - fix a documentation index issue

    Greg Kroah-Hartman
     
  • Mark writes:
    "spi: Fixes for v4.19

    Quite a few fixes for the Renesas drivers in here, plus a fix for the
    Tegra driver and some documentation fixes for the recently added
    spi-mem code. The Tegra fix is relatively large but fairly
    straightforward and mechanical, it runs on probe so it's been
    reasonably well covered in -next testing."

    * tag 'spi-fix-v4.19-rc5' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
    spi: spi-mem: Move the DMA-able constraint doc to the kerneldoc header
    spi: spi-mem: Add missing description for data.nbytes field
    spi: rspi: Fix interrupted DMA transfers
    spi: rspi: Fix invalid SPI use during system suspend
    spi: sh-msiof: Fix handling of write value for SISTR register
    spi: sh-msiof: Fix invalid SPI use during system suspend
    spi: gpio: Fix copy-and-paste error
    spi: tegra20-slink: explicitly enable/disable clock

    Greg Kroah-Hartman
     
  • Mark writes:
    "regulator: Fixes for 4.19

    A collection of fairly minor bug fixes here, a couple of driver
    specific ones plus two core fixes. There's one fix for the new
    suspend state code which fixes some confusion with constant values
    that are supposed to indicate noop operation and another fixing a
    race condition with the creation of sysfs files on new regulators."

    * tag 'regulator-v4.19-rc5' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
    regulator: fix crash caused by null driver data
    regulator: Fix 'do-nothing' value for regulators without suspend state
    regulator: da9063: fix DT probing with constraints
    regulator: bd71837: Disable voltage monitoring for LDO3/4

    Greg Kroah-Hartman
     
  • Linus writes:
    "Pin control fixes for v4.19:
    - Fixes to x86 hardware:
    - AMD interrupt debounce issues
    - Faulty Intel cannonlake register offset
    - Revert pin translation IRQ locking"

    * tag 'pinctrl-v4.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
    Revert "pinctrl: intel: Do pin translation when lock IRQ"
    pinctrl: cannonlake: Fix HOSTSW_OWN register offset of H variant
    pinctrl/amd: poll InterruptEnable bits in amd_gpio_irq_set_type

    Greg Kroah-Hartman
     
  • Dave writes:
    "drm fixes for 4.19-rc6

    Looks like a pretty normal week for graphics,

    core: syncobj fix, panel link regression revert
    amd: suspend/resume fixes, EDID emulation fix
    mali-dp: NV12 writeback and vblank reset fixes
    etnaviv: DMA setup fix"

    * tag 'drm-fixes-2018-09-28' of git://anongit.freedesktop.org/drm/drm:
    drm/amd/display: Fix Edid emulation for linux
    drm/amd/display: Fix Vega10 lightup on S3 resume
    drm/amdgpu: Fix vce work queue was not cancelled when suspend
    Revert "drm/panel: Add device_link from panel device to DRM device"
    drm/syncobj: Don't leak fences when WAIT_FOR_SUBMIT is set
    drm/malidp: Fix writeback in NV12
    drm: mali-dp: Call drm_crtc_vblank_reset on device init
    drm/etnaviv: add DMA configuration for etnaviv platform device

    Greg Kroah-Hartman
     
  • Bjorn writes:
    "PCI fixes:

    - Fix ACPI hotplug issue that causes black screen crash at boot (Mika
    Westerberg)

    - Fix DesignWare "scheduling while atomic" issues (Jisheng Zhang)

    - Add PPC contacts to MAINTAINERS for PCI core error handling (Bjorn
    Helgaas)

    - Sort Mobiveil MAINTAINERS entry (Lorenzo Pieralisi)"

    * tag 'pci-v4.19-fixes-2' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
    ACPI / hotplug / PCI: Don't scan for non-hotplug bridges if slot is not bridge
    PCI: dwc: Fix scheduling while atomic issues
    MAINTAINERS: Move mobiveil PCI driver entry where it belongs
    MAINTAINERS: Update PPC contacts for PCI core error handling

    Greg Kroah-Hartman
     

28 Sep, 2018

10 commits