23 Nov, 2013

1 commit

  • Pull SCSI target updates from Nicholas Bellinger:
    "Things have been quiet this round with mostly bugfixes, percpu
    conversions, and other minor iscsi-target conformance testing changes.

    The highlights include:

    - Add demo_mode_discovery attribute for iscsi-target (Thomas)
    - Convert tcm_fc(FCoE) to use percpu-ida pre-allocation
    - Add send completion interrupt coalescing for ib_isert
    - Convert target-core to use percpu-refcounting for se_lun
    - Fix mutex_trylock usage bug in iscsit_increment_maxcmdsn
    - tcm_loop updates (Hannes)
    - target-core ALUA cleanups + prep for v3.14 SCSI Referrals support (Hannes)

    v3.14 is currently shaping to be a busy development cycle in target
    land, with initial support for T10 Referrals and T10 DIF currently on
    the roadmap"

    * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (40 commits)
    iscsi-target: chap auth shouldn't match username with trailing garbage
    iscsi-target: fix extract_param to handle buffer length corner case
    iscsi-target: Expose default_erl as TPG attribute
    target_core_configfs: split up ALUA supported states
    target_core_alua: Make supported states configurable
    target_core_alua: Store supported ALUA states
    target_core_alua: Rename ALUA_ACCESS_STATE_OPTIMIZED
    target_core_alua: spellcheck
    target core: rename (ex,im)plict -> (ex,im)plicit
    percpu-refcount: Add percpu-refcount.o to obj-y
    iscsi-target: Do not reject non-immediate CmdSNs exceeding MaxCmdSN
    iscsi-target: Convert iscsi_session statistics to atomic_long_t
    target: Convert se_device statistics to atomic_long_t
    target: Fix delayed Task Aborted Status (TAS) handling bug
    iscsi-target: Reject unsupported multi PDU text command sequence
    ib_isert: Avoid duplicate iscsit_increment_maxcmdsn call
    iscsi-target: Fix mutex_trylock usage in iscsit_increment_maxcmdsn
    target: Core does not need blkdev.h
    target: Pass through I/O topology for block backstores
    iser-target: Avoid using FRMR for single dma entry requests
    ...

    Linus Torvalds
     

19 Nov, 2013

1 commit

  • Pull infiniband/rdma updates from Roland Dreier:
    - Re-enable flow steering verbs with new improved userspace ABI
    - Fixes for slow connection due to GID lookup scalability
    - IPoIB fixes
    - Many fixes to HW drivers including mlx4, mlx5, ocrdma and qib
    - Further improvements to SRP error handling
    - Add new transport type for Cisco usNIC

    * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (66 commits)
    IB/core: Re-enable create_flow/destroy_flow uverbs
    IB/core: extended command: an improved infrastructure for uverbs commands
    IB/core: Remove ib_uverbs_flow_spec structure from userspace
    IB/core: Use a common header for uverbs flow_specs
    IB/core: Make uverbs flow structure use names like verbs ones
    IB/core: Rename 'flow' structs to match other uverbs structs
    IB/core: clarify overflow/underflow checks on ib_create/destroy_flow
    IB/ucma: Convert use of typedef ctl_table to struct ctl_table
    IB/cm: Convert to using idr_alloc_cyclic()
    IB/mlx5: Fix page shift in create CQ for userspace
    IB/mlx4: Fix device max capabilities check
    IB/mlx5: Fix list_del of empty list
    IB/mlx5: Remove dead code
    IB/core: Encorce MR access rights rules on kernel consumers
    IB/mlx4: Fix endless loop in resize CQ
    RDMA/cma: Remove unused argument and minor dead code
    RDMA/ucma: Discard events for IDs not yet claimed by user space
    IB/core: Add Cisco usNIC rdma node and transport types
    RDMA/nes: Remove self-assignment from nes_query_qp()
    IB/srp: Report receive errors correctly
    ...

    Linus Torvalds
     

18 Nov, 2013

7 commits

  • …s', 'ocrdma', 'qib' and 'srp' into for-next

    Roland Dreier
     
  • This commit reverts commit 7afbddfae993 ("IB/core: Temporarily disable
    create_flow/destroy_flow uverbs"). Since the uverbs extensions
    functionality was experimental for v3.12, this patch re-enables the
    support for them and flow-steering for v3.13.

    Signed-off-by: Matan Barak
    Signed-off-by: Roland Dreier

    Matan Barak
     
  • Commit 400dbc96583f ("IB/core: Infrastructure for extensible uverbs
    commands") added an infrastructure for extensible uverbs commands
    while later commit 436f2ad05a0b ("IB/core: Export ib_create/destroy_flow
    through uverbs") exported ib_create_flow()/ib_destroy_flow() functions
    using this new infrastructure.

    According to the commit 400dbc96583f, the purpose of this
    infrastructure is to support passing around provider (eg. hardware)
    specific buffers when userspace issues commands to the kernel, so that
    it would be possible to extend uverbs (eg. core) buffers independently
    from the provider buffers.

    But the new kernel command function prototypes were not modified to
    take advantage of this extension. This issue was exposed by Roland
    Dreier in a previous review[1].

    So the following patch is an attempt at a revised extensible command
    infrastructure.

    This improved extensible command infrastructure distinguishes
    core (eg. legacy)'s command/response buffers from provider
    (eg. hardware)'s command/response buffers: each extended command
    implementing function is given a struct ib_udata to hold core
    (eg. uverbs) input and output buffers, and another struct ib_udata to
    hold the hw (eg. provider) input and output buffers.

    Having those buffers identified separately makes it easier to increase
    one buffer to support extension without having to add some code to
    guess the exact size of each command/response parts: This should make
    the extended functions more reliable.

    Additionally, instead of relying on the command identifier being greater
    than IB_USER_VERBS_CMD_THRESHOLD, the proposed infrastructure relies on
    unused bits in the command field: of the 32 bits provided by the command
    field, only 6 bits are really needed to encode the identifier of
    commands currently supported by the kernel. (Even using only 6 bits
    leaves room for about 23 new commands).

    So this patch makes use of some high order bits in command field to
    store flags, leaving enough room for more command identifiers than one
    will ever need (eg. 256).

    The new flags are used to specify if the command should be processed
    as an extended one or a legacy one. While designing the new command
    format, care was taken to make usage of flags itself extensible.

    Using high order bits of the command field ensures that a newer
    libibverbs on an older kernel will properly fail when trying to call
    extended commands. On the other hand, an older libibverbs on a newer
    kernel will never be able to issue calls to extended commands.
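
    The flag scheme described above can be sketched as follows. This is
    only an illustration: the constant names, the 8-bit flag and 8-bit
    identifier widths, and the helper functions are assumptions made for
    the example, not definitions taken from the patch.

```c
#include <stdint.h>

/* Assumed layout for this sketch: the top 8 bits of the 32-bit command
 * field carry flags, the low 8 bits carry the command identifier. */
#define DEMO_CMD_FLAGS_SHIFT   24
#define DEMO_CMD_FLAGS_MASK    0xff000000u
#define DEMO_CMD_COMMAND_MASK  0x000000ffu
#define DEMO_CMD_FLAG_EXTENDED 0x80u

/* Build a command word from an identifier and a set of flags. */
static uint32_t demo_cmd_encode(uint32_t id, uint32_t flags)
{
    return ((flags << DEMO_CMD_FLAGS_SHIFT) & DEMO_CMD_FLAGS_MASK) |
           (id & DEMO_CMD_COMMAND_MASK);
}

/* Extract the command identifier from a command word. */
static uint32_t demo_cmd_id(uint32_t word)
{
    return word & DEMO_CMD_COMMAND_MASK;
}

/* Extract the flags from a command word. */
static uint32_t demo_cmd_flags(uint32_t word)
{
    return (word & DEMO_CMD_FLAGS_MASK) >> DEMO_CMD_FLAGS_SHIFT;
}

/* Model of an older dispatcher that treats the whole word as a table
 * index: any word with a high-order flag bit set falls outside the
 * table and is rejected. Returns 1 if accepted, 0 if rejected. */
static int demo_legacy_accepts(uint32_t word, uint32_t table_size)
{
    return word < table_size;
}
```

    With this layout, a word built with the extended flag is far larger
    than any plausible legacy table size, which is why a flag-unaware
    kernel would reject it instead of misinterpreting it.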

    The extended command header includes the optional response pointer so
    that output buffer length and output buffer pointer are located
    together in the command, allowing proper parameters checking. This
    should make implementing functions easier and safer.

    Additionally, the extended header ensures 64-bit alignment, while making
    all sizes multiples of 8 bytes, extending the maximum buffer size:

                              legacy      extended

    Maximum command buffer:   256KBytes   1024KBytes (512KBytes + 512KBytes)
    Maximum response buffer:  256KBytes   1024KBytes (512KBytes + 512KBytes)
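
    The limits in the table can be reproduced with a little arithmetic.
    The sketch below assumes that each size field is a 16-bit count of
    words, 4 bytes per word for legacy commands and 8 bytes per word for
    extended ones; the field width is inferred from the figures above
    and the function names are hypothetical.

```c
#include <stdint.h>

/* Legacy: one 16-bit count of 4-byte words (assumed). */
static uint32_t demo_legacy_max_bytes(void)
{
    return (uint32_t)UINT16_MAX * 4u;
}

/* Extended: one 16-bit count of 8-byte words for each of the two
 * parts, core and provider (assumed). */
static uint32_t demo_extended_part_max_bytes(void)
{
    return (uint32_t)UINT16_MAX * 8u;
}

/* Extended total: core part plus provider part. */
static uint32_t demo_extended_max_bytes(void)
{
    return 2u * demo_extended_part_max_bytes();
}
```

    Under these assumptions the exact maxima are 262140 bytes (just
    under 256KBytes) for legacy buffers and 2 * 524280 bytes (just under
    1024KBytes) for extended ones.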

    For the purpose of doing proper buffer size accounting, the header
    sizes are no longer taken into account in "in_words".

    One of the oddities of the current extensible infrastructure, reading
    the "legacy" command header twice, is fixed by removing the "legacy"
    command header from the extended command header: they are processed as
    two different parts of the command, memory is read once, and
    information is not duplicated. This makes it clear that this is an
    extended command scheme and not a different command scheme.

    The proposed scheme will format input (command) and output (response)
    buffers this way:

    - command:

    legacy header +
    extended header +
    command data (core + hw):

    +----------------------------------------+
    | flags     |   00      00    |  command |
    |      in_words    |   out_words         |
    +----------------------------------------+
    |                response                |
    |                response                |
    | provider_in_words | provider_out_words |
    |                padding                 |
    +----------------------------------------+
    |                                        |
    .                                        .
    .             (in_words * 8)             .
    |                                        |
    +----------------------------------------+
    |                                        |
    .                                        .
    .        (provider_in_words * 8)         .
    |                                        |
    +----------------------------------------+

    - response, if present:

    +----------------------------------------+
    |                                        |
    .                                        .
    .            (out_words * 8)             .
    |                                        |
    +----------------------------------------+
    |                                        |
    .                                        .
    .        (provider_out_words * 8)        .
    |                                        |
    +----------------------------------------+
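
    The command and response layout described above can be approximated
    in C as shown below. The struct and field names are illustrative
    stand-ins chosen for this sketch, not the definitions from the
    patch; the sizes follow the rule that headers are not counted in the
    word fields and that each word is 8 bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Legacy header: command word (flags + identifier) and core sizes. */
struct demo_cmd_hdr {
    uint32_t command;
    uint16_t in_words;
    uint16_t out_words;
};

/* Extended header: response pointer, provider sizes, padding. */
struct demo_ex_cmd_hdr {
    uint64_t response;
    uint16_t provider_in_words;
    uint16_t provider_out_words;
    uint32_t padding;
};

/* Total bytes of a command: both headers, then core and provider data. */
static size_t demo_command_bytes(const struct demo_cmd_hdr *hdr,
                                 const struct demo_ex_cmd_hdr *ex)
{
    return sizeof(*hdr) + sizeof(*ex) +
           (size_t)hdr->in_words * 8 +
           (size_t)ex->provider_in_words * 8;
}

/* Total bytes of a response: core part followed by provider part. */
static size_t demo_response_bytes(const struct demo_cmd_hdr *hdr,
                                  const struct demo_ex_cmd_hdr *ex)
{
    return (size_t)hdr->out_words * 8 +
           (size_t)ex->provider_out_words * 8;
}
```

    Keeping the response pointer next to the output lengths means a
    single header carries everything needed to bounds-check the output
    buffer before it is written.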

    The overall design is to ensure that the extensible infrastructure is
    itself extensible while being more reliable with more input and bound
    checking.

    Note:

    The unused field in the extended header would be a perfect candidate to
    hold the command "comp_mask" (eg. bit field used to handle
    compatibility). This was suggested by Roland Dreier in a previous
    review[2]. But "comp_mask" field is likely to be present in the uverb
    input and/or provider input, likewise for the response, as noted by
    Matan Barak[3], so it doesn't make sense to put "comp_mask" in the
    header.

    [1]:
    http://marc.info/?i=CAL1RGDWxmM17W2o_era24A-TTDeKyoL6u3NRu_=t_dhV_ZA9MA@mail.gmail.com

    [2]:
    http://marc.info/?i=CAL1RGDXJtrc849M6_XNZT5xO1+ybKtLWGq6yg6LhoSsKpsmkYA@mail.gmail.com

    [3]:
    http://marc.info/?i=525C1149.6000701@mellanox.com

    Signed-off-by: Yann Droneaud
    Link: http://marc.info/?i=cover.1383773832.git.ydroneaud@opteya.com

    [ Convert "ret ? ret : 0" to the equivalent "ret". - Roland ]

    Signed-off-by: Roland Dreier

    Yann Droneaud
     
  • The structure holding any types of flow_spec is of no use to
    userspace. It would be wrong for userspace to do:

    struct ib_uverbs_flow_spec flow_spec;

    flow_spec.type = IB_FLOW_SPEC_TCP;
    flow_spec.size = sizeof(flow_spec);

    Instead, userspace should use the dedicated flow_spec structure for
    - Ethernet : struct ib_uverbs_flow_spec_eth,
    - IPv4 : struct ib_uverbs_flow_spec_ipv4,
    - TCP/UDP : struct ib_uverbs_flow_spec_tcp_udp.

    In other words, struct ib_uverbs_flow_spec is a "virtual" data
    structure that can only be used by the kernel as an alias to the others.
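
    As a rough illustration of the point, the sketch below fills a
    dedicated TCP/UDP spec so that its size field matches the concrete
    structure, while the catch-all union is kept as a kernel-side alias
    only. All demo_* types, fields, and values are simplified,
    hypothetical stand-ins and do not reproduce the real uverbs
    definitions.

```c
#include <stdint.h>
#include <string.h>

enum demo_spec_type {
    DEMO_SPEC_ETH  = 1,
    DEMO_SPEC_IPV4 = 2,
    DEMO_SPEC_TCP  = 3
};

/* Common leading fields shared by every spec (assumed for the sketch). */
struct demo_spec_hdr {
    uint32_t type;
    uint16_t size;
    uint16_t reserved;
};

struct demo_spec_ipv4 {
    struct demo_spec_hdr hdr;
    uint32_t src_ip, dst_ip;
    uint32_t src_ip_mask, dst_ip_mask;
};

struct demo_spec_tcp_udp {
    struct demo_spec_hdr hdr;
    uint16_t dst_port, src_port;
    uint16_t dst_port_mask, src_port_mask;
};

/* Catch-all alias: useful for parsing, wrong for building a spec,
 * because its size is that of the largest member. */
union demo_spec_any {
    struct demo_spec_hdr hdr;
    struct demo_spec_ipv4 ipv4;
    struct demo_spec_tcp_udp tcp_udp;
};

/* Fill a TCP spec; size is taken from the dedicated structure. */
static void demo_fill_tcp(struct demo_spec_tcp_udp *spec, uint16_t dst_port)
{
    memset(spec, 0, sizeof(*spec));
    spec->hdr.type = DEMO_SPEC_TCP;
    spec->hdr.size = (uint16_t)sizeof(*spec);
    spec->dst_port = dst_port;
    spec->dst_port_mask = 0xffff;
}
```

    Using sizeof() on the union instead would report the size of the
    largest member, which is the mistake the commit message warns
    against.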

    Signed-off-by: Yann Droneaud
    Link: http://marc.info/?i=cover.1383773832.git.ydroneaud@opteya.com
    Signed-off-by: Roland Dreier

    Yann Droneaud
     
  • This patch adds "flow" prefix to most of data structure added as part
    of commit 436f2ad05a0b ("IB/core: Export ib_create/destroy_flow through
    uverbs") to keep those names in sync with the data structures added in
    commit 319a441d1361 ("IB/core: Add receive flow steering support").

    It's just a matter of translating 'ib_flow' to 'ib_uverbs_flow'.

    Signed-off-by: Yann Droneaud
    Link: http://marc.info/?i=cover.1383773832.git.ydroneaud@opteya.com
    Signed-off-by: Roland Dreier

    Yann Droneaud
     
  • Commit 436f2ad05a0b ("IB/core: Export ib_create/destroy_flow through
    uverbs") added public data structures to support receive flow
    steering. The new structs are not following the 'uverbs' pattern:
    they're lacking the common prefix 'ib_uverbs'.

    This patch replaces ib_kern prefix by ib_uverbs.

    Signed-off-by: Yann Droneaud
    Link: http://marc.info/?i=cover.1383773832.git.ydroneaud@opteya.com
    Signed-off-by: Roland Dreier

    Yann Droneaud
     
  • This patch fixes the following issues:

    1. Unneeded checks were removed

    2. Removed the fixed size out of flow_attr.size, thus simplifying the checks.

    3. Remove a 32bit hole on 64bit systems with strict alignment in
    struct ib_kern_flow_att by adding a reserved field.

    Signed-off-by: Matan Barak
    Signed-off-by: Roland Dreier

    Matan Barak
     

17 Nov, 2013

2 commits


16 Nov, 2013

6 commits

  • Pull trivial tree updates from Jiri Kosina:
    "Usual earth-shaking, news-breaking, rocket science pile from
    trivial.git"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (23 commits)
    doc: usb: Fix typo in Documentation/usb/gadget_configs.txt
    doc: add missing files to timers/00-INDEX
    timekeeping: Fix some trivial typos in comments
    mm: Fix some trivial typos in comments
    irq: Fix some trivial typos in comments
    NUMA: fix typos in Kconfig help text
    mm: update 00-INDEX
    doc: Documentation/DMA-attributes.txt fix typo
    DRM: comment: `halve' -> `half'
    Docs: Kconfig: `devlopers' -> `developers'
    doc: typo on word accounting in kprobes.c in mutliple architectures
    treewide: fix "usefull" typo
    treewide: fix "distingush" typo
    mm/Kconfig: Grammar s/an/a/
    kexec: Typo s/the/then/
    Documentation/kvm: Update cpuid documentation for steal time and pv eoi
    treewide: Fix common typo in "identify"
    __page_to_pfn: Fix typo in comment
    Correct some typos for word frequency
    clk: fixed-factor: Fix a trivial typo
    ...

    Linus Torvalds
     
  • When creating a CQ, we must use mlx5 adapter page shift.

    Signed-off-by: Eli Cohen
    Signed-off-by: Roland Dreier

    Eli Cohen
     
  • Move the check on max supported CQEs after the final number of entries is
    evaluated.

    Signed-off-by: Eli Cohen
    Signed-off-by: Roland Dreier

    Eli Cohen
     
  • The value of the local variable index is never used in reg_mr_callback().

    Signed-off-by: Eli Cohen

    [ Remove now-unused variable delta too. - Roland ]

    Signed-off-by: Roland Dreier

    Eli Cohen
     
  • Enforce the rule that when requesting remote write or atomic permissions, local
    write must be indicated as well. See IB spec 11.2.8.2.
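
    A minimal sketch of such a check is shown below. The flag names and
    values are local to the example and the function name is
    hypothetical; only the rule itself comes from the description above.

```c
#include <errno.h>

/* Access flags, defined locally for this sketch. */
enum {
    DEMO_ACCESS_LOCAL_WRITE   = 1 << 0,
    DEMO_ACCESS_REMOTE_WRITE  = 1 << 1,
    DEMO_ACCESS_REMOTE_READ   = 1 << 2,
    DEMO_ACCESS_REMOTE_ATOMIC = 1 << 3
};

/* Return 0 if the flag combination is allowed, -EINVAL if remote write
 * or remote atomic access is requested without local write. */
static int demo_check_mr_access(int flags)
{
    if ((flags & (DEMO_ACCESS_REMOTE_WRITE | DEMO_ACCESS_REMOTE_ATOMIC)) &&
        !(flags & DEMO_ACCESS_LOCAL_WRITE))
        return -EINVAL;
    return 0;
}
```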

    Spotted by: Hagay Abramovsky
    Signed-off-by: Eli Cohen
    Signed-off-by: Roland Dreier

    Eli Cohen
     
  • When calling get_sw_cqe() we need to pass the consumer_index and not the
    masked value. Failure to do so will cause get_sw_cqe() to return an
    incorrect result, possibly leading to an endless loop.
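
    Why the unmasked index matters can be seen in a simplified model of
    an ownership check. The model assumes a power-of-two ring in which
    the expected owner bit flips on every pass, so the pass parity is
    the bit just above the index mask; the function below is a
    hypothetical illustration, not the driver's code.

```c
#include <stdint.h>

/* Return 1 if software owns the entry at unmasked index n in a ring of
 * `size` entries (size must be a power of two). The expected owner bit
 * is the parity of the pass number, i.e. bit `size` of n. */
static int demo_sw_owns(unsigned owner_bit, uint32_t n, uint32_t size)
{
    unsigned expected = (n & size) ? 1u : 0u;
    return (owner_bit & 1u) == expected;
}
```

    If the caller masks n with (size - 1) first, bit `size` is always
    zero, so on every odd pass the check gives the wrong answer and a
    loop waiting on it may never terminate.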

    This problem was reported and analyzed by Michael Rice from HP.

    Signed-off-by: Eli Cohen
    Signed-off-by: Roland Dreier

    Eli Cohen
     

14 Nov, 2013

1 commit

  • Pull PCI changes from Bjorn Helgaas:
    "Resource management
    - Fix host bridge window coalescing (Alexey Neyman)
    - Pass type, width, and prefetchability for window alignment (Wei Yang)

    PCI device hotplug
    - Convert acpiphp, acpiphp_ibm to dynamic debug (Lan Tianyu)

    Power management
    - Remove pci_pm_complete() (Liu Chuansheng)

    MSI
    - Fail initialization if device is not in PCI_D0 (Yijing Wang)

    MPS (Max Payload Size)
    - Use pcie_get_mps() and pcie_set_mps() to simplify code (Yijing Wang)
    - Use pcie_set_readrq() to simplify code (Yijing Wang)
    - Use cached pci_dev->pcie_mpss to simplify code (Yijing Wang)

    SR-IOV
    - Enable upstream bridges even for VFs on virtual buses (Bjorn Helgaas)
    - Use pci_is_root_bus() to avoid catching virtual buses (Wei Yang)

    Virtualization
    - Add x86 MSI masking ops (Konrad Rzeszutek Wilk)

    Freescale i.MX6
    - Support i.MX6 PCIe controller (Sean Cross)
    - Increase link startup timeout (Marek Vasut)
    - Probe PCIe in fs_initcall() (Marek Vasut)
    - Fix imprecise abort handler (Tim Harvey)
    - Remove redundant of_match_ptr (Sachin Kamat)

    Renesas R-Car
    - Support Gen2 internal PCIe controller (Valentine Barshak)

    Samsung Exynos
    - Add MSI support (Jingoo Han)
    - Turn off power when link fails (Jingoo Han)
    - Add Jingoo Han as maintainer (Jingoo Han)
    - Add clk_disable_unprepare() on error path (Wei Yongjun)
    - Remove redundant of_match_ptr (Sachin Kamat)

    Synopsys DesignWare
    - Add irq_create_mapping() (Pratyush Anand)
    - Add header guards (Seungwon Jeon)

    Miscellaneous
    - Enable native PCIe services by default on non-ACPI (Andrew Murray)
    - Cleanup _OSC usage and messages (Bjorn Helgaas)
    - Remove pcibios_last_bus boot option on non-x86 (Bjorn Helgaas)
    - Convert bus code to use bus_, drv_, and dev_groups (Greg Kroah-Hartman)
    - Remove unused pci_mem_start (Myron Stowe)
    - Make sysfs functions static (Sachin Kamat)
    - Warn on invalid return from driver probe (Stephen M. Cameron)
    - Remove Intel Haswell D3 delays (Todd E Brandt)
    - Call pci_set_master() in core if driver doesn't do it (Yinghai Lu)
    - Use pci_is_pcie() to simplify code (Yijing Wang)
    - Use PCIe capability accessors to simplify code (Yijing Wang)
    - Use cached pci_dev->pcie_cap to simplify code (Yijing Wang)
    - Removed unused "is_pcie" from struct pci_dev (Yijing Wang)
    - Simplify sysfs CPU affinity implementation (Yijing Wang)"

    * tag 'pci-v3.13-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (79 commits)
    PCI: Enable upstream bridges even for VFs on virtual buses
    PCI: Add pci_upstream_bridge()
    PCI: Add x86_msi.msi_mask_irq() and msix_mask_irq()
    PCI: Warn on driver probe return value greater than zero
    PCI: Drop warning about drivers that don't use pci_set_master()
    PCI: Workaround missing pci_set_master in pci drivers
    powerpc/pci: Use pci_is_pcie() to simplify code [fix]
    PCI: Update pcie_ports 'auto' behavior for non-ACPI platforms
    PCI: imx6: Probe the PCIe in fs_initcall()
    PCI: Add R-Car Gen2 internal PCI support
    PCI: imx6: Remove redundant of_match_ptr
    PCI: Report pci_pme_active() kmalloc failure
    mn10300/PCI: Remove useless pcibios_last_bus
    frv/PCI: Remove pcibios_last_bus
    PCI: imx6: Increase link startup timeout
    PCI: exynos: Remove redundant of_match_ptr
    PCI: imx6: Fix imprecise abort handler
    PCI: Fail MSI/MSI-X initialization if device is not in PCI_D0
    PCI: imx6: Remove redundant dev_err() in imx6_pcie_probe()
    x86/PCI: Coalesce multiple overlapping host bridge windows
    ...

    Linus Torvalds
     

13 Nov, 2013

3 commits

  • * lookup_one_len() really wants i_mutex held on directory.
    * leaks galore - just mount ipathfs, then
    cd /sys/bus/pci/drivers/qib_ib; echo *:*:*.* >unbind
    on a box with that card present and try to umount ipathfs...

    Signed-off-by: Al Viro

    Al Viro
     
  • This patch avoids a duplicate iscsit_increment_maxcmdsn() call for
    ISER_IB_RDMA_WRITE within isert_map_rdma() + isert_reg_rdma_frwr(),
    which will already be occurring once during isert_put_datain() ->
    iscsit_build_rsp_pdu() operation.

    It also removes the local conn->stat_sn assignment + increment,
    and changes the third parameter to iscsit_build_rsp_pdu() to
    signal this should be done by iscsi_target_mode code.

    Tested-by: Moussa Ba
    Cc: # v3.10+
    Signed-off-by: Nicholas Bellinger

    Nicholas Bellinger
     
  • This patch changes isert_reg_rdma_frwr() to not use FRMR for single
    dma entry requests from small I/Os, in order to avoid the associated
    memory registration overhead.

    Using DMA MR is sufficient here for the single dma entry requests,
    and addresses a >= v3.12 performance regression.

    Signed-off-by: Vu Pham
    Cc: # v3.12+
    Signed-off-by: Nicholas Bellinger

    Vu Pham
     

12 Nov, 2013

2 commits

  • The dev variable is never assigned after being initialised.

    Signed-off-by: Michal Nazarewicz
    Signed-off-by: Roland Dreier

    Michal Nazarewicz
     
  • Problem reported by Avneesh Pant:

    It looks like we are triggering a bug in RDMA CM/UCM interaction.
    The bug specifically hits when we have an incoming connection
    request and the connecting process dies BEFORE the passive end of
    the connection can process the request i.e. it does not call
    rdma_get_cm_event() to retrieve the initial connection event. We
    were able to triage this further and have some additional
    information now.

    In the example below when P1 dies after issuing a connect request
    as the CM id is being destroyed all outstanding connects (to P2)
    are sent a reject message. We see this reject message being
    received on the passive end and the appropriate CM ID created for
    the initial connection message being retrieved in cm_match_req().
    The problem is in the ucma_event_handler() code when this reject
    message is delivered to it and the initial connect message itself
    HAS NOT been delivered to the client. In fact the client has not
    even called rdma_get_cm_event() at this stage so we haven't
    allocated a new ctx in ucma_get_event() and updated the new
    connection CM_ID to point to the new UCMA context.

    This results in the reject message not being dropped in
    ucma_event_handler() for the new connection request as the
    (if (!ctx->uid)) block is skipped since the ctx it refers to is
    the listen CM id context which does have a valid UID associated
    with it (I believe the new CMID for the connection initially
    uses the listen CMID -> context when it is created in
    cma_new_conn_id). Thus the assumption that new events for a
    connection can get dropped in ucma_event_handler() is incorrect
    IF the initial connect request has not been retrieved in the
    first case. We end up getting a CM Reject event on the listen CM
    ID and our upper layer code asserts (in fact this event does not
    even have the listen_id set as that only gets set up by librdmacm
    for connect requests).

    The solution is to verify that the cm_id being reported in the event
    is the same as the cm_id referenced by the ucma context. A mismatch
    indicates that the ucma context corresponds to the listen. This fix
    was validated by using a modified version of librdmacm that was able
    to verify the problem and see that the reject message was indeed
    dropped after this patch was applied.
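
    The check described in the last paragraph can be sketched as below.
    The demo_* types and the function are hypothetical simplifications;
    the sketch also assumes that connect requests are handled on a
    separate path, since they legitimately arrive on the listen context
    with a new cm_id.

```c
/* Simplified stand-ins for the CM id and the ucma context. */
struct demo_cm_id {
    int unused;
};

struct demo_ucma_ctx {
    unsigned long uid;          /* 0 until user space claims the id */
    struct demo_cm_id *cm_id;   /* cm_id this context was created for */
};

/* Return 1 if the event should be discarded: either user space has not
 * claimed the context yet, or the event's cm_id differs from the one
 * the context refers to (the context then belongs to the listen id). */
static int demo_should_drop(const struct demo_ucma_ctx *ctx,
                            const struct demo_cm_id *event_cm_id,
                            int is_connect_request)
{
    if (is_connect_request)
        return 0;
    return !ctx->uid || ctx->cm_id != event_cm_id;
}
```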

    Signed-off-by: Sean Hefty
    Signed-off-by: Roland Dreier

    Sean Hefty
     

09 Nov, 2013

17 commits

  • This patch adds new rdma node and new rdma transport, and supporting
    code used by Cisco's low latency driver called usNIC. usNIC uses its
    own transport, distinct from IB and iWARP.

    Signed-off-by: Upinder Malhi
    Signed-off-by: Jeff Squyres
    Signed-off-by: Roland Dreier

    Upinder Malhi (umalhi)
     
  • Assigning a value to itself is pointless.

    Spotted with coverity, no hardware to test.

    Signed-off-by: Dave Jones
    Signed-off-by: Roland Dreier

    Dave Jones
     
  • The IB spec does not guarantee that the opcode is available in error
    completions. Hence do not rely on it. See also commit 948d1e889e5b
    ("IB/srp: Introduce srp_handle_qp_err()").

    Signed-off-by: Bart Van Assche
    Cc: # v3.8
    Signed-off-by: Roland Dreier

    Bart Van Assche
     
  • If SCSI commands are submitted with a SCSI request timeout that is
    lower than the IB RC timeout, it can happen that the SCSI error
    handler has already started device recovery before transport layer
    error handling starts. So it can happen that the SCSI error handler
    tries to abort a SCSI command after it has been reset by
    srp_rport_reconnect().

    Tell the SCSI error handler that such commands have finished and that
    it is not necessary to continue its recovery strategy for commands
    that have been reset by srp_rport_reconnect().

    Signed-off-by: Bart Van Assche
    Signed-off-by: Roland Dreier

    Bart Van Assche
     
  • Remove an SRP target from the SRP target list before invoking the last
    scsi_host_put() call. This change is necessary because that last put
    frees the memory that holds the srp_target_port structure.

    This patch prevents the following kernel oops:

    RIP: 0010:[] __lock_acquire+0x500/0x1570
    Call Trace:
    [] lock_acquire+0xa4/0x120
    [] _spin_lock+0x36/0x70
    [] srp_remove_work+0xef/0x180 [ib_srp]
    [] worker_thread+0x21c/0x3d0
    [] kthread+0x96/0xa0
    [] child_rip+0xa/0x20

    Signed-off-by: Vu Pham

    [ bvanassche - Modified path description and CC'ed stable. ]

    Signed-off-by: Bart Van Assche
    Signed-off-by: Roland Dreier

    Vu Pham
     
  • Currently, it's not possible to change the queue depth for a device
    behind an SRP host. Sometimes we need to adjust queue_depth for
    performance reasons (eg. when storage is busy, a lower queue_depth
    avoids running into the SCSI error handler), so this patch adds
    support for this to the SRP driver.

    Signed-off-by: Jack Wang
    Tested-by: Bart Van Assche
    Acked-by: David Dillow
    Signed-off-by: Roland Dreier

    Jack Wang
     
  • Certain storage configurations, e.g. a sufficiently large array of
    hard disks in a RAID configuration, need a queue depth above 64 to
    achieve optimal performance. Hence make the queue depth configurable.

    Signed-off-by: Bart Van Assche
    Acked-by: David Dillow
    Tested-by: Jack Wang
    Signed-off-by: Roland Dreier

    Bart Van Assche
     
  • This patch does not change any functionality.

    Signed-off-by: Bart Van Assche
    Acked-by: David Dillow
    Cc: Roland Dreier
    Cc: Vu Pham
    Cc: Sebastian Riemer
    Signed-off-by: Roland Dreier

    Bart Van Assche
     
  • On an initiator system with multiple IB ports it is not yet possible
    to figure out what the originating port of an SRP connection is. Hence
    make the source GID available in sysfs.

    Signed-off-by: Bart Van Assche
    Acked-by: David Dillow
    Signed-off-by: Roland Dreier

    Bart Van Assche
     
  • After a transport layer error occurred, periodically try to reconnect
    to the target until the dev_loss timer expires. Protect the
    callback functions that can be invoked from inside the SCSI EH
    against concurrent invocation with srp_reconnect_rport() via the
    rport mutex. Change the default dev_loss_tmo from 60s into 600s
    to give the reconnect mechanism a chance to kick in.

    Signed-off-by: Bart Van Assche
    Acked-by: David Dillow
    Signed-off-by: Roland Dreier

    Bart Van Assche
     
  • Add support for periodically reconnecting to an SRP target until
    the dev_loss timer expires. After the tenth reconnection attempt,
    gradually slow down subsequent reconnect attempts.
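
    One way such a slow-down could work is sketched below: the delay
    stays at its base value for the first ten failures and then grows
    linearly with each further failure up to a cap. The linear growth,
    the cap of 100, and the function name are assumptions made for the
    illustration, not details stated in this commit message.

```c
/* Return the delay before the next reconnect attempt, given the base
 * delay in seconds and the number of failed attempts so far. */
static int demo_next_reconnect_delay(int base_delay, int failed_reconnects)
{
    int factor = failed_reconnects - 10;

    if (factor < 1)
        factor = 1;     /* first ten failures: no slow-down */
    if (factor > 100)
        factor = 100;   /* assumed upper bound on the multiplier */
    return base_delay * factor;
}
```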

    Signed-off-by: Bart Van Assche
    Acked-by: David Dillow
    Signed-off-by: Roland Dreier

    Bart Van Assche
     
  • Start the reconnect timer, fast_io_fail timer and dev_loss timers if a
    transport layer error occurs.

    Signed-off-by: Bart Van Assche
    Acked-by: David Dillow
    Signed-off-by: Roland Dreier

    Bart Van Assche
     
  • Enable fast_io_fail_tmo and dev_loss_tmo functionality for the IB SRP
    initiator. Add kernel module parameters for specifying default
    values for these parameters.

    Signed-off-by: Bart Van Assche
    Acked-by: David Dillow
    Signed-off-by: Roland Dreier

    Bart Van Assche
     
  • Keep the rport data structure around after srp_remove_host() has
    finished until cleanup of the IB transport layer has finished
    completely. This is necessary because later patches use the rport
    pointer inside the queuecommand callback. Without this patch
    accessing the rport from inside a queuecommand callback is racy
    because srp_remove_host() must be invoked before scsi_remove_host()
    and because the queuecommand callback could get invoked after
    srp_remove_host() has finished. In other words, without this patch
    the queuecommand callback can get invoked after the rport data
    structure has been freed.

    Signed-off-by: Bart Van Assche
    Acked-by: David Dillow
    Signed-off-by: Roland Dreier

    Bart Van Assche
     
  • Allow the InfiniBand RC retry count to be configured by the user as an
    option in the target login string. Reducing this retry count makes it
    possible to reduce the path failover time.

    Signed-off-by: Vu Pham

    [ bvanassche: Rewrote patch description / changed default retry count ]

    Signed-off-by: Bart Van Assche
    Acked-by: David Dillow
    Signed-off-by: Roland Dreier

    Vu Pham
     
  • Commit 7fac33014f54 ("IB/qib: checkpatch fixes") was overzealous in
    removing a simple_strtoul for a parse routine, setup_txselect(). That
    routine is required to handle a multi-value string.

    Unwind that aspect of the fix.
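
    The reason an end-pointer-based parser suits a multi-value string
    can be shown with the user-space strtoul(), which reports where
    parsing stopped so the caller can continue with the next value. The
    sketch below is a generic illustration with a hypothetical helper
    and assumes a comma separator; it is not the driver's routine.

```c
#include <stddef.h>
#include <stdlib.h>

/* Parse up to `max` comma-separated unsigned values from `str` into
 * `out`. Returns the number of values stored. */
static size_t demo_parse_multi(const char *str, unsigned long *out, size_t max)
{
    const char *p = str;
    size_t n = 0;

    while (n < max) {
        char *end;
        unsigned long val = strtoul(p, &end, 0);

        if (end == p)       /* no digits consumed: stop */
            break;
        out[n++] = val;
        if (*end != ',')    /* no further value follows */
            break;
        p = end + 1;        /* continue after the separator */
    }
    return n;
}
```

    A conversion helper that insists on consuming the whole string would
    reject input such as "1,2,0x10" outright, whereas the end pointer
    lets the loop pick up each value in turn.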

    Signed-off-by: Mike Marciniszyn
    Signed-off-by: Roland Dreier

    Mike Marciniszyn
     
  • Convert __attribute__ ((packed)) to __packed.
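
    For readers unfamiliar with the shorthand, the sketch below shows
    the effect of the attribute on a structure's size. The macro
    definition is supplied here only so the example builds outside the
    kernel with GCC or Clang; the struct names are hypothetical.

```c
#include <stdint.h>

/* Outside the kernel the shorthand may not exist, so define it here. */
#ifndef __packed
#define __packed __attribute__((packed))
#endif

/* Natural layout: the compiler may insert padding after `tag`. */
struct demo_padded {
    uint8_t tag;
    uint32_t value;
};

/* Packed layout: members are placed back to back, without padding. */
struct demo_packed {
    uint8_t tag;
    uint32_t value;
} __packed;
```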

    Signed-off-by: Mike Marciniszyn
    Signed-off-by: Roland Dreier

    Mike Marciniszyn