18 Aug, 2017

2 commits


17 Aug, 2017

1 commit


16 Aug, 2017

2 commits


11 Aug, 2017

2 commits

  • The numd field of directive receive command takes number of dwords to
    transfer. This fix has the correct calculation for numd.

    Signed-off-by: Kwan (Hingkwan) Huen-SSI
    Reviewed-by: Jens Axboe
    Signed-off-by: Christoph Hellwig

    Kwan (Hingkwan) Huen-SSI
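
The byte-to-dword conversion that the commit above refers to can be sketched as follows. This is a minimal illustration, not the patch itself: the helper name `bytes_to_numd` is hypothetical, and the subtraction of one assumes the field is zero-based, which the commit message does not state.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Convert a transfer length in bytes into a dword count for a "numd" field.
 * Assumption: the field is zero-based (a value of 0 means one dword), so one
 * is subtracted. len_bytes must be a non-zero multiple of 4.
 */
static uint32_t bytes_to_numd(size_t len_bytes)
{
	return (uint32_t)(len_bytes / 4) - 1;
}
```

For a 4096-byte buffer this yields 1023; passing the byte count directly would request far more data than the buffer holds.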
     
  • We need to return an error if a timeout occurs on any NVMe command during
    initialization. Without this, the nvme reset work will be stuck. A timeout
    will have a negative error code, meaning we need to stop initializing
    the controller. All positive returns mean the controller is still usable.

    bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196325

    Signed-off-by: Keith Busch
    Cc: Martin Peres
    [jth consolidated cleanup path ]
    Signed-off-by: Johannes Thumshirn
    Signed-off-by: Christoph Hellwig

    Keith Busch
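
The return-value convention described above (negative means stop, positive means the controller is still usable) can be sketched as below. All names here are hypothetical, and the positive status value and `-ETIMEDOUT` are illustrative only; the actual driver code is not shown in this log.

```c
#include <errno.h>

/* One initialization step: returns 0 on success, a negative errno on a
 * transport-level failure such as a timeout, or a positive
 * controller-reported status. */
typedef int (*init_step_fn)(void);

/* Illustrative steps, for demonstration only. */
static int step_ok(void)      { return 0; }
static int step_status(void)  { return 0x4002; }      /* made-up positive status */
static int step_timeout(void) { return -ETIMEDOUT; }  /* illustrative errno */

/*
 * Run the steps in order. A negative return aborts initialization and is
 * propagated to the caller; zero and positive returns let it continue.
 */
static int run_init_steps(const init_step_fn *steps, int nr_steps)
{
	for (int i = 0; i < nr_steps; i++) {
		int ret = steps[i]();

		if (ret < 0)
			return ret;
	}
	return 0;
}
```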
     

10 Aug, 2017

3 commits

  • Currently we create the sysfs entry even if mapping the CMB fails.
    In that case, the unmap path will not remove the created sysfs
    file. There is no good reason to create a sysfs entry for a
    non-working CMB and show its characteristics.

    Fixes: f63572dff ("nvme: unmap CMB and remove sysfs file in reset path")
    Signed-off-by: Max Gurtovoy
    Reviewed-by: Stephen Bates
    Signed-off-by: Christoph Hellwig

    Max Gurtovoy
     
  • At queue creation, the transport allocates a local job struct
    (struct nvmet_fc_fcp_iod) for each possible element of the queue.
    When a new CMD is received from the wire, a job struct is allocated
    from the queue and then used for the duration of the command.
    The job struct contains buffer space for the wire command iu. Thus,
    upon allocation of the job struct, the cmd iu buffer is copied to
    the job struct and the LLDD may immediately free/reuse the CMD IU
    buffer passed in the call.

    However, in some circumstances, due to the packetized nature of FC
    and the api of the FC LLDD, the LLDD may issue a hw command to send
    the wire response but not yet get the hw completion for that
    command, and upcall the nvmet_fc layer, before a new command is
    asynchronously received on the wire. In other words, it's possible
    for the initiator to get the response from the wire, thus believe a
    command slot is free, and send a new command iu. The new command iu
    may be received by the LLDD and passed to the transport before the
    LLDD has serviced the hw completion and made the teardown calls for
    the original job struct. As such, there is no job struct
    available for the new io. I.e., it appears as if the host sent more
    queue elements than the queue size. It didn't, based on its
    understanding.

    Rather than treat this as a hard connection failure, queue the new
    request until a job struct frees up. As the buffer isn't copied
    (there's no job struct to copy it into), a special return value
    must be returned to the LLDD to tell it to hold off on recycling
    the cmd iu buffer. Later, when a job struct is allocated and the
    buffer copied, a new LLDD callback notifies the LLDD and allows it
    to recycle its command iu buffer.

    Signed-off-by: James Smart
    Reviewed-by: Johannes Thumshirn
    Signed-off-by: Christoph Hellwig

    James Smart
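
The deferral scheme described above can be sketched in plain C as follows. This is a simplified model under stated assumptions, not the nvmet_fc code: the struct layout, the sizes, the `RCV_*` return values and the `defer_done` callback are all hypothetical stand-ins for the job struct, the special return value and the new LLDD callback the commit mentions.

```c
#include <stddef.h>
#include <string.h>

#define QUEUE_SIZE   2    /* job structs per queue (illustrative) */
#define CMD_IU_LEN   64   /* illustrative command IU size */
#define MAX_DEFERRED 4    /* illustrative bound on held buffers */

#define RCV_OK       0    /* buffer copied, caller may recycle it now */
#define RCV_DEFERRED 1    /* hypothetical "hold the buffer" return value */
#define RCV_OVERFLOW (-1) /* no room left even to defer */

struct job {
	int in_use;
	unsigned char cmd_iu[CMD_IU_LEN];
};

struct queue {
	struct job jobs[QUEUE_SIZE];
	const void *deferred[MAX_DEFERRED];  /* caller buffers not yet copied */
	size_t nr_deferred;
	void (*defer_done)(const void *buf); /* hypothetical release callback */
};

/* Illustrative callback that only counts how often a buffer was released. */
static int released_count;
static void count_release(const void *buf) { (void)buf; released_count++; }

static struct job *job_alloc(struct queue *q)
{
	for (size_t i = 0; i < QUEUE_SIZE; i++) {
		if (!q->jobs[i].in_use) {
			q->jobs[i].in_use = 1;
			return &q->jobs[i];
		}
	}
	return NULL;
}

/* New command from the wire: copy it if a job is free, else defer it. */
static int queue_rcv_cmd(struct queue *q, const void *cmd_iu)
{
	struct job *j = job_alloc(q);

	if (j) {
		memcpy(j->cmd_iu, cmd_iu, CMD_IU_LEN);
		return RCV_OK;
	}
	if (q->nr_deferred == MAX_DEFERRED)
		return RCV_OVERFLOW;
	q->deferred[q->nr_deferred++] = cmd_iu;
	return RCV_DEFERRED;
}

/* Teardown of a finished job: hand it to the oldest deferred command. */
static void queue_job_done(struct queue *q, struct job *j)
{
	const void *buf;

	if (q->nr_deferred == 0) {
		j->in_use = 0;
		return;
	}
	buf = q->deferred[0];
	memmove(&q->deferred[0], &q->deferred[1],
		(q->nr_deferred - 1) * sizeof(q->deferred[0]));
	q->nr_deferred--;
	memcpy(j->cmd_iu, buf, CMD_IU_LEN); /* job stays in use for the new cmd */
	if (q->defer_done)
		q->defer_done(buf);         /* buffer may now be recycled */
}
```

The point of the design is that the caller keeps ownership of the buffer between the `RCV_DEFERRED` return and the callback, so no extra copy or allocation is needed while the queue is momentarily full.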
     
  • Some broken controllers (such as earlier Linux targets) pad model or
    serial fields with 0-bytes rather than spaces. The NVMe spec disallows
    0 bytes in "ASCII" fields. Thus strip trailing 0-bytes, too. Also make
    sure that we get no underflow for pathological input.

    Signed-off-by: Martin Wilck
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Keith Busch
    Signed-off-by: Christoph Hellwig

    Martin Wilck
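
The trimming rule described above, including the guard against underflow on a field that consists only of padding, can be sketched like this. The function name `trimmed_len` is hypothetical and the code is an illustration rather than the kernel implementation.

```c
#include <stddef.h>

/*
 * Return the length of a fixed-width identify field once trailing spaces
 * and trailing 0-bytes are stripped. Checking len > 0 before indexing keeps
 * the loop from running past the start of an all-padding field.
 */
static size_t trimmed_len(const char *field, size_t len)
{
	while (len > 0 && (field[len - 1] == ' ' || field[len - 1] == '\0'))
		len--;
	return len;
}
```

A field that is entirely spaces or entirely 0-bytes yields a length of zero instead of wrapping around.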
     

26 Jul, 2017

3 commits

  • With a misbehaving controller it's possible we'll never
    enter the live state and create an admin queue. When we
    fail out of reset work it's possible we failed out early
    enough without setting up the admin queue. We tear down
    queues after a failed reset, but needed to do some more
    sanitization.

    Fixes: 443bd90f2cca ("nvme: host: unquiesce queue in nvme_kill_queues()")

    [ 189.650995] nvme nvme1: pci function 0000:0b:00.0
    [ 317.680055] nvme nvme0: Device not ready; aborting reset
    [ 317.680183] nvme nvme0: Removing after probe failure status: -19
    [ 317.681258] kasan: GPF could be caused by NULL-ptr deref or user memory access
    [ 317.681397] general protection fault: 0000 [#1] SMP KASAN
    [ 317.682984] CPU: 3 PID: 477 Comm: kworker/3:2 Not tainted 4.13.0-rc1+ #5
    [ 317.683112] Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD5/Z170X-UD5-CF, BIOS F5 03/07/2016
    [ 317.683284] Workqueue: events nvme_remove_dead_ctrl_work [nvme]
    [ 317.683398] task: ffff8803b0990000 task.stack: ffff8803c2ef0000
    [ 317.683516] RIP: 0010:blk_mq_unquiesce_queue+0x2b/0xa0
    [ 317.683614] RSP: 0018:ffff8803c2ef7d40 EFLAGS: 00010282
    [ 317.683716] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 1ffff1006fbdcde3
    [ 317.683847] RDX: 0000000000000038 RSI: 1ffff1006f5a9245 RDI: 0000000000000000
    [ 317.683978] RBP: ffff8803c2ef7d58 R08: 1ffff1007bcdc974 R09: 0000000000000000
    [ 317.684108] R10: 1ffff1007bcdc975 R11: 0000000000000000 R12: 00000000000001c0
    [ 317.684239] R13: ffff88037ad49228 R14: ffff88037ad492d0 R15: ffff88037ad492e0
    [ 317.684371] FS: 0000000000000000(0000) GS:ffff8803de6c0000(0000) knlGS:0000000000000000
    [ 317.684519] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 317.684627] CR2: 0000002d1860c000 CR3: 000000045b40d000 CR4: 00000000003406e0
    [ 317.684758] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [ 317.684888] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [ 317.685018] Call Trace:
    [ 317.685084] nvme_kill_queues+0x4d/0x170 [nvme_core]
    [ 317.685185] nvme_remove_dead_ctrl_work+0x3a/0x90 [nvme]
    [ 317.685289] process_one_work+0x771/0x1170
    [ 317.685372] worker_thread+0xde/0x11e0
    [ 317.685452] ? pci_mmcfg_check_reserved+0x110/0x110
    [ 317.685550] kthread+0x2d3/0x3d0
    [ 317.685617] ? process_one_work+0x1170/0x1170
    [ 317.685704] ? kthread_create_on_node+0xc0/0xc0
    [ 317.685785] ret_from_fork+0x25/0x30
    [ 317.685798] Code: 0f 1f 44 00 00 55 48 b8 00 00 00 00 00 fc ff df 48 89 e5 41 54 4c 8d a7 c0 01 00 00 53 48 89 fb 4c 89 e2 48 c1 ea 03 48 83 ec 08 3c 02 00 75 50 48 8b bb c0 01 00 00 e8 33 8a f9 00 0f ba b3
    [ 317.685872] RIP: blk_mq_unquiesce_queue+0x2b/0xa0 RSP: ffff8803c2ef7d40
    [ 317.685908] ---[ end trace a3f8704150b1e8b4 ]---

    Signed-off-by: Scott Bauer
    Signed-off-by: Christoph Hellwig

    Scott Bauer
     
  • It's possible the preferred HMB size may not be a multiple of the
    chunk_size. This patch moves len to function scope and uses that in
    the for loop increment so the last iteration doesn't cause the total
    size to exceed the allocated HMB size.

    Based on an earlier patch from Keith Busch.

    Signed-off-by: Christoph Hellwig
    Reported-by: Dan Carpenter
    Reviewed-by: Keith Busch
    Fixes: 87ad72a59a38 ("nvme-pci: implement host memory buffer support")

    Christoph Hellwig
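
The loop shape described above, where the per-iteration length lives at function scope and drives the increment, can be sketched as follows. The function and parameter names are hypothetical and no memory is actually allocated; the sketch only records the size each chunk would have.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Split `preferred` bytes into chunks of at most `chunk_size` bytes and
 * record each chunk length. `len` is declared at function scope so the loop
 * increment uses the length of the chunk just handled; the last chunk is
 * shortened so the running total never exceeds `preferred`.
 * Returns the number of chunks recorded.
 */
static size_t split_into_chunks(uint64_t preferred, uint64_t chunk_size,
				uint64_t *chunk_lens, size_t max_chunks)
{
	uint64_t size, len;
	size_t n = 0;

	if (chunk_size == 0)
		return 0;

	for (size = 0; size < preferred && n < max_chunks; size += len) {
		len = preferred - size;
		if (len > chunk_size)
			len = chunk_size;
		chunk_lens[n++] = len;
	}
	return n;
}
```

With a preferred size of 10 and a chunk size of 4, the chunks are 4, 4 and 2 bytes, for a total of exactly 10.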
     
  • The FC-NVME spec hasn't locked down on the format string for TRADDR.
    Currently the spec is lobbying for "nn-:pn-"
    where the wwn's are hex values but not prefixed by 0x.

    Most implementations so far expect a string format of
    "nn-0x:pn-0x" to be used. The transport
    uses the match_u64 parser which requires a leading 0x prefix to set
    the base properly. If it's not there, a match will either fail or return
    a base 10 value.

    The resolution in T11 is pushing out. Therefore, to fix things now and
    to cover any eventuality and any implementations already in the field,
    this patch adds support for both formats.

    The change consists of replacing the token matching routine with a
    routine that validates the fixed string format, and then builds
    a local copy of the hex name with a 0x prefix before calling
    the system parser.

    Note: the same parser routine exists in both the initiator and target
    transports. Given this is about the only "shared" item, we chose to
    replicate rather than create an interdependency on some shared code.

    Signed-off-by: James Smart
    Signed-off-by: Christoph Hellwig

    James Smart
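
A user-space sketch of a parser that accepts both TRADDR forms is shown below. It is an illustration under assumptions, not the transport's routine: the function names are hypothetical, each name is assumed to be exactly 16 hex digits (a 64-bit WWN), and the digits are converted directly instead of building a 0x-prefixed copy for a system parser as the commit describes. The sketch also accepts the prefix independently for each name.

```c
#include <ctype.h>
#include <stdint.h>
#include <string.h>

#define WWN_HEX_DIGITS 16 /* assumption: a 64-bit name, written in hex */

/* Parse one name at *p, with or without a leading "0x"; advances *p. */
static int parse_wwn(const char **p, uint64_t *out)
{
	const char *s = *p;
	uint64_t v = 0;

	if (s[0] == '0' && s[1] == 'x')
		s += 2;

	for (int i = 0; i < WWN_HEX_DIGITS; i++, s++) {
		unsigned char c = (unsigned char)*s;

		if (!isxdigit(c))
			return -1;
		v = (v << 4) |
		    (uint64_t)(isdigit(c) ? c - '0' : tolower(c) - 'a' + 10);
	}
	*out = v;
	*p = s;
	return 0;
}

/* Validate the fixed "nn-...:pn-..." layout and extract both names. */
static int parse_traddr(const char *s, uint64_t *nn, uint64_t *pn)
{
	if (strncmp(s, "nn-", 3) != 0)
		return -1;
	s += 3;
	if (parse_wwn(&s, nn))
		return -1;
	if (strncmp(s, ":pn-", 4) != 0)
		return -1;
	s += 4;
	if (parse_wwn(&s, pn))
		return -1;
	return *s == '\0' ? 0 : -1;
}
```

Validating the fixed layout first means a string without the prefix is never silently read as a base-10 value.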
     

25 Jul, 2017

2 commits

  • There are cases where threads are in the process of submitting new
    io when the LLDD calls in to remove the remote port. In some cases,
    the next io actually goes to the LLDD, who knows the remoteport isn't
    present and rejects it. To properly recover/restart these i/o's we
    don't want to hard fail them, we want to treat them as temporary
    resource errors in which a delayed retry will work.

    Add a couple more checks on remoteport connectivity and commonize the
    busy response handling when it's seen.

    Signed-off-by: James Smart
    Signed-off-by: Christoph Hellwig

    James Smart
     
  • The WWID sysfs attribute can provide multiple means of a World Wide ID
    for an NVMe device. It can either be an NGUID, an EUI-64 or a concatenation
    of VID, Serial Number, Model and the Namespace ID in this order of
    preference.

    If the target also sends us a UUID, use the UUID for identification and
    give it the highest priority.

    This eases generation of /dev/disk/by-* symlinks.

    Signed-off-by: Johannes Thumshirn
    Signed-off-by: Christoph Hellwig

    Johannes Thumshirn
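
The order of preference described above, with the UUID placed first, can be sketched as a simple selection function. The struct, enum and function names are hypothetical, and treating an all-zero identifier as "not provided" is an assumption of this sketch rather than something the commit message states.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the identifiers a namespace may report. */
struct ns_ids {
	uint8_t uuid[16];
	uint8_t nguid[16];
	uint8_t eui64[8];
};

enum wwid_source {
	WWID_UUID,      /* highest priority */
	WWID_NGUID,
	WWID_EUI64,
	WWID_COMPOSITE, /* VID + serial number + model + namespace ID */
};

/* Assumption: an identifier made only of zero bytes was not provided. */
static bool id_present(const uint8_t *id, size_t len)
{
	for (size_t i = 0; i < len; i++) {
		if (id[i] != 0)
			return true;
	}
	return false;
}

/* Pick the identifier used for the WWID, in order of preference. */
static enum wwid_source pick_wwid_source(const struct ns_ids *ids)
{
	if (id_present(ids->uuid, sizeof(ids->uuid)))
		return WWID_UUID;
	if (id_present(ids->nguid, sizeof(ids->nguid)))
		return WWID_NGUID;
	if (id_present(ids->eui64, sizeof(ids->eui64)))
		return WWID_EUI64;
	return WWID_COMPOSITE;
}
```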
     

20 Jul, 2017

9 commits


12 Jul, 2017

1 commit

  • Pull more block updates from Jens Axboe:
    "This is a followup for block changes, that didn't make the initial
    pull request. It's a bit of a mixed bag, this contains:

    - A followup pull request from Sagi for NVMe. Outside of fixups for
    NVMe, it also includes a series for ensuring that we properly
    quiesce hardware queues when browsing live tags.

    - Set of integrity fixes from Dmitry (mostly), fixing various issues
    for folks using DIF/DIX.

    - Fix for a bug introduced in cciss, with the req init changes. From
    Christoph.

    - Fix for a bug in BFQ, from Paolo.

    - Two followup fixes for lightnvm/pblk from Javier.

    - Depth fix from Ming for blk-mq-sched.

    - Also from Ming, performance fix for mtip32xx that was introduced
    with the dynamic initialization of commands"

    * 'for-linus' of git://git.kernel.dk/linux-block: (44 commits)
    block: call bio_uninit in bio_endio
    nvmet: avoid unneeded assignment of submit_bio return value
    nvme-pci: add module parameter for io queue depth
    nvme-pci: compile warnings in nvme_alloc_host_mem()
    nvmet_fc: Accept variable pad lengths on Create Association LS
    nvme_fc/nvmet_fc: revise Create Association descriptor length
    lightnvm: pblk: remove unnecessary checks
    lightnvm: pblk: control I/O flow also on tear down
    cciss: initialize struct scsi_req
    null_blk: fix error flow for shared tags during module_init
    block: Fix __blkdev_issue_zeroout loop
    nvme-rdma: unconditionally recycle the request mr
    nvme: split nvme_uninit_ctrl into stop and uninit
    virtio_blk: quiesce/unquiesce live IO when entering PM states
    mtip32xx: quiesce request queues to make sure no submissions are inflight
    nbd: quiesce request queues to make sure no submissions are inflight
    nvme: kick requeue list when requeueing a request instead of when starting the queues
    nvme-pci: quiesce/unquiesce admin_q instead of start/stop its hw queues
    nvme-loop: quiesce/unquiesce admin_q instead of start/stop its hw queues
    nvme-fc: quiesce/unquiesce admin_q instead of start/stop its hw queues
    ...

    Linus Torvalds
     

11 Jul, 2017

1 commit

  • Pull followup NVMe (mostly) changes from Sagi:

    I added the quiesce/unquiesce patches in here as it's
    easy for me to apply changes on top. It has accumulated
    reviews and includes mostly nvme anyway, please tell me if
    you don't want to take them with this.

    This includes:
    - quiesce/unquiesce fixes in nvme and others from me
    - nvme-fc add create association padding spec updates from James
    - some more quirking from MKP
    - nvmet nit cleanup from Max
    - Fix nvme-rdma racy RDMA completion signalling from Marta
    - some centralization patches from me
    - add tagset nr_hw_queues updates on controller resets in
    nvme drivers from me
    - nvme-rdma fix resources recycling when doing error recovery from me
    - minor cleanups in nvme-fc from me

    Jens Axboe
     

10 Jul, 2017

4 commits


09 Jul, 2017

1 commit

  • Pull PCI updates from Bjorn Helgaas:

    - add sysfs max_link_speed/width, current_link_speed/width (Wong Vee
    Khee)

    - make host bridge IRQ mapping much more generic (Matthew Minter,
    Lorenzo Pieralisi)

    - convert most drivers to pci_scan_root_bus_bridge() (Lorenzo
    Pieralisi)

    - mutex sriov_configure() (Jakub Kicinski)

    - mutex pci_error_handlers callbacks (Christoph Hellwig)

    - split ->reset_notify() into ->reset_prepare()/reset_done()
    (Christoph Hellwig)

    - support multiple PCIe portdrv interrupts for MSI as well as MSI-X
    (Gabriele Paoloni)

    - allocate MSI/MSI-X vector for Downstream Port Containment (Gabriele
    Paoloni)

    - fix MSI IRQ affinity pre/post/min_vecs issue (Michael Hernandez)

    - test INTx masking during enumeration, not at run-time (Piotr Gregor)

    - avoid using device_may_wakeup() for runtime PM (Rafael J. Wysocki)

    - restore the status of PCI devices across hibernation (Chen Yu)

    - keep parent resources that start at 0x0 (Ard Biesheuvel)

    - enable ECRC only if device supports it (Bjorn Helgaas)

    - restore PRI and PASID state after Function-Level Reset (CQ Tang)

    - skip DPC event if device is not present (Keith Busch)

    - check domain when matching SMBIOS info (Sujith Pandel)

    - mark Intel XXV710 NIC INTx masking as broken (Alex Williamson)

    - avoid AMD SB7xx EHCI USB wakeup defect (Kai-Heng Feng)

    - work around long-standing Macbook Pro poweroff issue (Bjorn Helgaas)

    - add Switchtec "running" status flag (Logan Gunthorpe)

    - fix dra7xx incorrect RW1C IRQ register usage (Arvind Yadav)

    - modify xilinx-nwl IRQ chip for legacy interrupts (Bharat Kumar
    Gogada)

    - move VMD SRCU cleanup after bus, child device removal (Jon Derrick)

    - add Faraday clock handling (Linus Walleij)

    - configure Rockchip MPS and reorganize (Shawn Lin)

    - limit Qualcomm TLP size to 2K (hardware issue) (Srinivas Kandagatla)

    - support Tegra MSI 64-bit addressing (Thierry Reding)

    - use Rockchip normal (not privileged) register bank (Shawn Lin)

    - add HiSilicon Kirin SoC PCIe controller driver (Xiaowei Song)

    - add Sigma Designs Tango SMP8759 PCIe controller driver (Marc
    Gonzalez)

    - add MediaTek PCIe host controller support (Ryder Lee)

    - add Qualcomm IPQ4019 support (John Crispin)

    - add HyperV vPCI protocol v1.2 support (Jork Loeser)

    - add i.MX6 regulator support (Quentin Schulz)

    * tag 'pci-v4.13-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (113 commits)
    PCI: tango: Add Sigma Designs Tango SMP8759 PCIe host bridge support
    PCI: Add DT binding for Sigma Designs Tango PCIe controller
    PCI: rockchip: Use normal register bank for config accessors
    dt-bindings: PCI: Add documentation for MediaTek PCIe
    PCI: Remove __pci_dev_reset() and pci_dev_reset()
    PCI: Split ->reset_notify() method into ->reset_prepare() and ->reset_done()
    PCI: xilinx: Make of_device_ids const
    PCI: xilinx-nwl: Modify IRQ chip for legacy interrupts
    PCI: vmd: Move SRCU cleanup after bus, child device removal
    PCI: vmd: Correct comment: VMD domains start at 0x10000, not 0x1000
    PCI: versatile: Add local struct device pointers
    PCI: tegra: Do not allocate MSI target memory
    PCI: tegra: Support MSI 64-bit addressing
    PCI: rockchip: Use local struct device pointer consistently
    PCI: rockchip: Check for clk_prepare_enable() errors during resume
    MAINTAINERS: Remove Wenrui Li as Rockchip PCIe driver maintainer
    PCI: rockchip: Configure RC's MPS setting
    PCI: rockchip: Reconfigure configuration space header type
    PCI: rockchip: Split out rockchip_pcie_cfg_configuration_accesses()
    PCI: rockchip: Move configuration accesses into rockchip_pcie_cfg_atu()
    ...

    Linus Torvalds
     

06 Jul, 2017

8 commits


04 Jul, 2017

1 commit