23 Jun, 2014

2 commits


13 Jun, 2014

1 commit

  • Pull SCSI target updates from Nicholas Bellinger:
    "The highlights this round include:

    - Add support for T10 PI pass-through between vhost-scsi +
    virtio-scsi (MST + Paolo + MKP + nab)
    - Add support for T10 PI in qla2xxx target mode (Quinn + MKP + hch +
    nab, merged through scsi.git)
    - Add support for percpu-ida pre-allocation in qla2xxx target code
    (Quinn + nab)
    - A number of iser-target fixes related to hardening the network
    portal shutdown path (Sagi + Slava)
    - Fix response length residual handling for a number of control CDBs
    (Roland + Christophe V.)
    - Various iscsi RFC conformance fixes in the CHAP authentication path
    (Tejas and Calsoft folks + nab)
    - Return TASK_SET_FULL status for tcm_fc(FCoE) DataIn + Response
    failures (Vasu + Jun + nab)
    - Fix long-standing ABORT_TASK + session reset hang (nab)
    - Convert iser-initiator + iser-target to include T10 bytes into EDTL
    (Sagi + Or + MKP + Mike Christie)
    - Fix NULL pointer dereference regression related to XCOPY introduced
    in v3.15 + CC'ed to v3.12.y (nab)"

    * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (34 commits)
    target: Fix NULL pointer dereference for XCOPY in target_put_sess_cmd
    vhost-scsi: Include prot_bytes into expected data transfer length
    TARGET/sbc,loopback: Adjust command data length in case pi exists on the wire
    libiscsi, iser: Adjust data_length to include protection information
    scsi_cmnd: Introduce scsi_transfer_length helper
    target: Report correct response length for some commands
    target/sbc: Check that the LBA and number of blocks are correct in VERIFY
    target/sbc: Remove sbc_check_valid_sectors()
    Target/iscsi: Fix sendtargets response pdu for iser transport
    Target/iser: Fix a wrong dereference in case discovery session is over iser
    iscsi-target: Fix ABORT_TASK + connection reset iscsi_queue_req memory leak
    target: Use complete_all for se_cmd->t_transport_stop_comp
    target: Set CMD_T_ACTIVE bit for Task Management Requests
    target: cleanup some boolean tests
    target/spc: Simplify INQUIRY EVPD=0x80
    tcm_fc: Generate TASK_SET_FULL status for response failures
    tcm_fc: Generate TASK_SET_FULL status for DataIN failures
    iscsi-target: Reject mutual authentication with reflected CHAP_C
    iscsi-target: Remove no-op from iscsit_tpg_del_portal_group
    iscsi-target: Fix CHAP_A parameter list handling
    ...

    Linus Torvalds
     

12 Jun, 2014

2 commits

  • Pull vhost infrastructure updates from Michael S. Tsirkin:
    "This reworks vhost core dropping unnecessary RCU uses in favor of VQ
    mutexes which are used on fast path anyway. This fixes worst-case
    latency for users who change the memory mappings a lot. Memory
    allocation for vhost-net now supports fallback to vmalloc (same as for
    vhost-scsi); this makes it possible to create the device on systems
    where memory is very fragmented, with slightly lower performance"

    * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
    vhost: move memory pointer to VQs
    vhost: move acked_features to VQs
    vhost: replace rcu with mutex
    vhost-net: extend device allocation to vmalloc

    Linus Torvalds
     
  • This patch updates vhost_scsi_get_tag() to accept the combined
    expected data transfer length + T10 PI bytes as the value passed
    into target_submit_cmd().

    This is required now that target-core logic in commit 14ef9200
    expects to subtract se_cmd->prot_length from se_cmd->data_length.

    Cc: Paolo Bonzini
    Cc: Michael S. Tsirkin
    Cc: Martin K. Petersen
    Cc: Sagi Grimberg
    Signed-off-by: Nicholas Bellinger

    Nicholas Bellinger
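
    The length accounting described in the commit above can be sketched in
    plain userspace C. This is only an illustration under stated
    assumptions: the struct and function names are hypothetical and are not
    the vhost-scsi or target-core API, and it assumes one 8-byte T10 PI
    tuple per logical block.

```c
#include <stdint.h>

/* Assumption: T10 PI adds one 8-byte tuple (guard, app tag, ref tag) per block. */
#define PI_TUPLE_BYTES 8u

/* Hypothetical stand-in for the two lengths tracked per command. */
struct cmd_lengths {
	uint32_t data_length;	/* payload bytes only */
	uint32_t prot_length;	/* protection information bytes */
};

/* Fabric side: the expected transfer length covers payload plus PI bytes. */
static uint32_t expected_transfer_length(uint32_t sectors, uint32_t block_size)
{
	return sectors * block_size + sectors * PI_TUPLE_BYTES;
}

/* Core side: subtract the PI bytes again to recover the payload length. */
static struct cmd_lengths split_lengths(uint32_t transfer_len, uint32_t sectors)
{
	struct cmd_lengths l;

	l.prot_length = sectors * PI_TUPLE_BYTES;
	l.data_length = transfer_len - l.prot_length;
	return l;
}
```

    For eight 512-byte sectors, the combined length is 4096 + 64 bytes, and
    splitting it yields a 4096-byte payload again. If the fabric passed in
    only the payload length, the subtraction would come up short, which is
    why the combined value has to be submitted.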
     

09 Jun, 2014

4 commits

  • commit 2ae76693b8bcabf370b981cd00c36cd41d33fabc
    vhost: replace rcu with mutex
    replaced rcu sync for memory accesses with VQ mutex lock/unlock.
    This is correct since all accesses are under VQ mutex, but incomplete:
    we still do useless rcu lock/unlock operations, someone might copy this
    code into some other context where this won't be right.
    This use of RCU is also non standard and hard to understand.
    Let's copy the pointer to each VQ structure, this way
    the access rules become straight-forward, and there's
    no need for RCU anymore.

    Reported-by: Eric Dumazet
    Signed-off-by: Michael S. Tsirkin

    Michael S. Tsirkin
     
  • Refactor code to make sure features are only accessed
    under VQ mutex. This makes everything simpler, no need
    for RCU here anymore.

    Signed-off-by: Michael S. Tsirkin

    Michael S. Tsirkin
     
  • All memory accesses are done under some VQ mutex.
    So locking and unlocking all VQs is a faster equivalent of synchronize_rcu()
    for memory access changes.
    Some guests cause a lot of these changes, so it's helpful
    to make them faster.

    Reported-by: "Gonglei (Arei)"
    Signed-off-by: Michael S. Tsirkin

    Michael S. Tsirkin
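
    The idea in the commit above, that taking and dropping every VQ mutex
    acts as a barrier for memory-map changes, can be sketched with POSIX
    mutexes. This is a userspace illustration, not the vhost code: the
    types, the fixed VQ count, and the function names are all assumptions.

```c
#include <pthread.h>
#include <stddef.h>

#define NVQS 2	/* assumption: a fixed, small number of virtqueues */

struct mem_map {
	int generation;
};

struct vq {
	pthread_mutex_t mutex;
	struct mem_map *memory;	/* only read or written with mutex held */
};

static struct vq vqs[NVQS] = {
	{ PTHREAD_MUTEX_INITIALIZER, NULL },
	{ PTHREAD_MUTEX_INITIALIZER, NULL },
};

/* Worker path: every access to vq->memory happens under the VQ mutex. */
static int vq_read_generation(struct vq *vq)
{
	int gen;

	pthread_mutex_lock(&vq->mutex);
	gen = vq->memory ? vq->memory->generation : -1;
	pthread_mutex_unlock(&vq->mutex);
	return gen;
}

/*
 * Update path: install the new map in each VQ under that VQ's mutex.
 * Once the loop has finished, no worker can still be using the old map,
 * so the caller may free the returned pointer without an RCU grace period.
 */
static struct mem_map *set_memory(struct mem_map *newmem)
{
	struct mem_map *old = vqs[0].memory;
	size_t i;

	for (i = 0; i < NVQS; i++) {
		pthread_mutex_lock(&vqs[i].mutex);
		vqs[i].memory = newmem;
		pthread_mutex_unlock(&vqs[i].mutex);
	}
	return old;
}
```

    The cost of an update is NVQS uncontended lock/unlock pairs in the
    common case, rather than waiting for a grace period.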
     
  • Michael Mueller provided a patch to reduce the size of
    vhost-net structure as some allocations could fail under
    memory pressure/fragmentation. We are still left with
    high order allocations though.

    This patch is handling the problem at the core level, allowing
    vhost structures to use vmalloc() if kmalloc() failed.

    As vmalloc() adds overhead on a critical network path, add __GFP_REPEAT
    to kzalloc() flags to do this fallback only when really needed.

    People are still looking at cleaner ways to handle the problem
    at the API level, probably passing in multiple iovecs.
    This hack seems consistent with approaches
    taken since then by drivers/vhost/scsi.c and net/core/dev.c

    Based on patch by Romain Francoise.

    Cc: Michael Mueller
    Signed-off-by: Romain Francoise
    Acked-by: Michael S. Tsirkin

    Michael S. Tsirkin
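
    The fallback pattern described in the commit above (try the fast
    allocator first, and use the slower one only on failure) can be sketched
    in userspace C. The allocators here are hypothetical stand-ins passed as
    function pointers; they are not kmalloc() or vmalloc(), and the struct
    is an assumption made for the example.

```c
#include <stdlib.h>

typedef void *(*alloc_fn)(size_t);

/* Records which allocator produced the buffer, so it can be freed correctly. */
struct kv_buf {
	void *ptr;
	int from_fallback;
};

/* Stand-in for a physically contiguous allocation that fails under fragmentation. */
static void *failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}

/* Stand-in for an allocator that returns zeroed memory. */
static void *zeroing_alloc(size_t size)
{
	return calloc(1, size);
}

/* Try the fast allocator; fall back to the slow one only if it fails. */
static struct kv_buf alloc_with_fallback(size_t size, alloc_fn fast, alloc_fn slow)
{
	struct kv_buf b = { fast(size), 0 };

	if (!b.ptr) {
		b.ptr = slow(size);
		b.from_fallback = (b.ptr != NULL);
	}
	return b;
}
```

    Because the fallback is taken only after the fast path fails, the extra
    overhead is paid only on systems where memory really is fragmented.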
     

03 Jun, 2014

4 commits

  • This patch updates vhost_scsi_handle_vq() to check for the existence
    of virtio_scsi_cmd_req_pi by comparing vq->iov[0].iov_len, in order to
    calculate separate data + protection SGLs from data_num.

    Also update tcm_vhost_submission_work() to pass the pre-allocated
    cmd->tvc_prot_sgl[] memory into target_submit_cmd_map_sgls(), and
    update vhost_scsi_get_tag() parameters to accept scsi_tag, lun, and
    task_attr.

    Cc: Michael S. Tsirkin
    Cc: Paolo Bonzini
    Cc: Martin K. Petersen
    Cc: Christoph Hellwig
    Cc: Hannes Reinecke
    Cc: Sagi Grimberg
    Cc: H. Peter Anvin
    Acked-by: Michael S. Tsirkin
    Signed-off-by: Nicholas Bellinger

    Nicholas Bellinger
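
    Telling the two request headers apart by the length of the first iovec,
    as the commit above describes, can be sketched as follows. The structs
    are local models written for this example (field layout is an
    assumption, not copied from the virtio-scsi headers), and the function
    name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed model of a request header without protection information. */
struct req_plain {
	uint8_t lun[8];
	uint64_t tag;
	uint8_t task_attr;
	uint8_t prio;
	uint8_t crn;
	uint8_t cdb[32];
} __attribute__((packed));

/* Assumed model of a request header that also carries PI byte counts. */
struct req_pi {
	uint8_t lun[8];
	uint64_t tag;
	uint8_t task_attr;
	uint8_t prio;
	uint8_t crn;
	uint32_t pi_bytesout;
	uint32_t pi_bytesin;
	uint8_t cdb[32];
} __attribute__((packed));

enum req_kind { REQ_INVALID, REQ_PLAIN, REQ_PI };

/* The header type is inferred purely from the first iovec's length. */
static enum req_kind classify_request(size_t first_iov_len)
{
	if (first_iov_len == sizeof(struct req_pi))
		return REQ_PI;
	if (first_iov_len == sizeof(struct req_plain))
		return REQ_PLAIN;
	return REQ_INVALID;
}
```

    This works only because the two headers have different sizes; any other
    length is rejected rather than guessed at.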
     
  • This patch adds vhost_scsi_map_iov_to_prot() to perform the mapping of
    T10 data integrity memory between virtio iov + struct scatterlist using
    get_user_pages_fast() following existing code.

    As with vhost_scsi_map_iov_to_sgl(), this does sanity checks against the
    total prot_sgl_count vs. pre-allocated SGLs, and loops across protection
    iovs using vhost_scsi_map_to_sgl() to perform the actual memory mapping.

    Also update tcm_vhost_release_cmd() to release associated tvc_prot_sgl[]
    struct page.

    Cc: Michael S. Tsirkin
    Cc: Paolo Bonzini
    Cc: Martin K. Petersen
    Cc: Christoph Hellwig
    Cc: Hannes Reinecke
    Cc: Sagi Grimberg
    Cc: H. Peter Anvin
    Acked-by: Michael S. Tsirkin
    Signed-off-by: Nicholas Bellinger

    Nicholas Bellinger
     
  • This patch updates tcm_vhost_make_nexus() to pre-allocate per descriptor
    tcm_vhost_cmd->tvc_prot_sgl[] used to expose protection SGLs from within
    virtio-scsi guest memory to vhost-scsi.

    Cc: Michael S. Tsirkin
    Cc: Paolo Bonzini
    Cc: Martin K. Petersen
    Cc: Christoph Hellwig
    Cc: Hannes Reinecke
    Acked-by: Michael S. Tsirkin
    Signed-off-by: Nicholas Bellinger

    Nicholas Bellinger
     
  • Move the overflow check for sgl_count > TCM_VHOST_PREALLOC_SGLS into
    vhost_scsi_map_iov_to_sgl() so that it's based on the total number
    of SGLs for all IOVs, instead of single IOVs.

    Also, rename TCM_VHOST_PREALLOC_PAGES -> TCM_VHOST_PREALLOC_UPAGES
    to better describe pointers to user-space pages.

    Cc: Michael S. Tsirkin
    Cc: Paolo Bonzini
    Cc: Martin K. Petersen
    Cc: Christoph Hellwig
    Cc: Hannes Reinecke
    Acked-by: Michael S. Tsirkin
    Signed-off-by: Nicholas Bellinger

    Nicholas Bellinger
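
    The difference between checking each IOV on its own and checking the
    total, as described in the commit above, can be sketched in userspace C.
    The page size, the limit, and the function names are assumptions chosen
    for the example; they are not the TCM_VHOST_* constants.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

#define SKETCH_PAGE_SIZE 4096u		/* assumption: 4 KiB pages */
#define SKETCH_PREALLOC_SGLS 4u		/* hypothetical pre-allocated SGL count */

/* Number of pages touched by one iovec, including partial first/last pages. */
static size_t iov_num_pages(const struct iovec *iov)
{
	uintptr_t start = (uintptr_t)iov->iov_base;
	uintptr_t first = start / SKETCH_PAGE_SIZE;
	uintptr_t last = (start + iov->iov_len + SKETCH_PAGE_SIZE - 1) /
			 SKETCH_PAGE_SIZE;

	return (size_t)(last - first);
}

/* Reject the request when the sum over all iovecs exceeds the limit. */
static int check_total_sgls(const struct iovec *iov, size_t niov)
{
	size_t total = 0;
	size_t i;

	for (i = 0; i < niov; i++)
		total += iov_num_pages(&iov[i]);

	return total > SKETCH_PREALLOC_SGLS ? -1 : 0;
}
```

    Three iovecs of two, two, and one page each pass a per-iovec check
    against a limit of four, but together they need five entries, so only
    the total-based check catches the overflow.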
     

18 Apr, 2014

1 commit

  • Mostly scripted conversion of the smp_mb__* barriers.

    Signed-off-by: Peter Zijlstra
    Acked-by: Paul E. McKenney
    Link: http://lkml.kernel.org/n/tip-55dhyhocezdw1dg7u19hmh1u@git.kernel.org
    Cc: Linus Torvalds
    Cc: linux-arch@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

13 Apr, 2014

1 commit

  • Pull SCSI target updates from Nicholas Bellinger:
    "Here are the target pending updates for v3.15-rc1. Apologies in
    advance for waiting until the second to last day of the merge window
    to send these out.

    The highlights this round include:

    - iser-target support for T10 PI (DIF) offloads (Sagi + Or)
    - Fix Task Aborted Status (TAS) handling in target-core (Alex Leung)
    - Pass in transport supported PI at session initialization (Sagi + MKP + nab)
    - Add WRITE_INSERT + READ_STRIP T10 PI support in target-core (nab + Sagi)
    - Fix iscsi-target ERL=2 ASYNC_EVENT connection pointer bug (nab)
    - Fix tcm_fc use-after-free of ft_tpg (Andy Grover)
    - Use correct ib_sg_dma primitives in ib_isert (Mike Marciniszyn)

    Also, note the virtio-scsi + vhost-scsi changes to expose T10 PI
    metadata into KVM guest have been left-out for now, as there were a
    few comments from MST + Paolo that were not able to be addressed in
    time for v3.15. Please expect this feature for v3.16-rc1"

    * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (43 commits)
    ib_srpt: Use correct ib_sg_dma primitives
    target/tcm_fc: Rename ft_tport_create to ft_tport_get
    target/tcm_fc: Rename ft_{add,del}_lport to {add,del}_wwn
    target/tcm_fc: Rename structs and list members for clarity
    target/tcm_fc: Limit to 1 TPG per wwn
    target/tcm_fc: Don't export ft_lport_list
    target/tcm_fc: Fix use-after-free of ft_tpg
    target: Add check to prevent Abort Task from aborting itself
    target: Enable READ_STRIP emulation in target_complete_ok_work
    target/sbc: Add sbc_dif_read_strip software emulation
    target: Enable WRITE_INSERT emulation in target_execute_cmd
    target/sbc: Add sbc_dif_generate software emulation
    target/sbc: Only expose PI read_cap16 bits when supported by fabric
    target/spc: Only expose PI mode page bits when supported by fabric
    target/spc: Only expose PI inquiry bits when supported by fabric
    target: Pass in transport supported PI at session initialization
    target/iblock: Fix double bioset_integrity_free bug
    Target/sbc: Initialize COMPARE_AND_WRITE write_sg scatterlist
    target/rd: T10-Dif: RAM disk is allocating more space than required.
    iscsi-target: Fix ERL=2 ASYNC_EVENT connection pointer bug
    ...

    Linus Torvalds
     

07 Apr, 2014

2 commits

  • In order to support local WRITE_INSERT + READ_STRIP operations for
    non PI enabled fabrics, the fabric driver needs to be able to signal
    what protection offload operations are supported.

    This is done at session initialization time so the modes can be
    signaled by individual se_wwn + se_portal_group endpoints, as well
    as optionally across different transports on the same endpoint.

    For iser-target, set TARGET_PROT_ALL if the underlying ib_device
    has already signaled PI offload support, and allow this to be
    exposed via a new iscsit_transport->iscsit_get_sup_prot_ops()
    callback.

    For loopback, set TARGET_PROT_ALL to signal SCSI initiator mode
    operation.

    For all other drivers, set TARGET_PROT_NORMAL to disable fabric
    level PI.

    Cc: Martin K. Petersen
    Cc: Sagi Grimberg
    Cc: Or Gerlitz
    Cc: Quinn Tran
    Cc: Giridhar Malavali
    Signed-off-by: Nicholas Bellinger

    Nicholas Bellinger
     
  • Now that TASK_ABORTED status is not generated for all cases by
    TMR ABORT_TASK + LUN_RESET, a new TFO->aborted_task() caller is
    necessary in order to give fabric drivers a chance to unmap
    hardware / software resources before the se_cmd descriptor is
    released via the normal TFO->release_cmd() codepath.

    This patch adds TFO->aborted_task() in core_tmr_abort_task()
    in place of the original transport_send_task_abort(), and
    also updates all fabric drivers to implement this caller.

    The fabric drivers that include changes to perform cleanup
    via ->aborted_task() are:

    - iscsi-target
    - iser-target
    - srpt
    - tcm_qla2xxx

    The fabric drivers that currently set ->aborted_task() to
    NOPs are:

    - loopback
    - tcm_fc
    - usb-gadget
    - sbp-target
    - vhost-scsi

    For the latter five, there appears to be no additional cleanup
    required before invoking TFO->release_cmd() to release the
    se_cmd descriptor.

    v2 changes:
    - Move ->aborted_task() call into transport_cmd_finish_abort (Alex)

    Cc: Alex Leung
    Cc: Mark Rustad
    Cc: Roland Dreier
    Cc: Vu Pham
    Cc: Chris Boot
    Cc: Sebastian Andrzej Siewior
    Cc: Michael S. Tsirkin
    Cc: Giridhar Malavali
    Cc: Saurav Kashyap
    Cc: Quinn Tran
    Cc: Sagi Grimberg
    Cc: Or Gerlitz
    Signed-off-by: Nicholas Bellinger

    Nicholas Bellinger
     

02 Apr, 2014

1 commit


29 Mar, 2014

2 commits

  • vhost fails to validate negative error code
    from vhost_get_vq_desc causing
    a crash: we are using -EFAULT which is 0xfffffff2
    as vector size, which exceeds the allocated size.

    The code in question was introduced in commit
    8dd014adfea6f173c1ef6378f7e5e7924866c923
    vhost-net: mergeable buffers support

    CVE-2014-0055

    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: David S. Miller

    Michael S. Tsirkin
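
    The bug class described in the commit above (a negative errno silently
    reinterpreted as a huge unsigned count) can be shown in a few lines of
    userspace C. The function names and the limit are hypothetical; this is
    not the vhost-net code, only a sketch of the missing check.

```c
#include <errno.h>

#define SKETCH_MAX_HEADS 8	/* hypothetical size of the heads array */

/* Stand-in for a descriptor fetch that may fail with a negative errno. */
static int fetch_heads(int simulate_fault)
{
	return simulate_fault ? -EFAULT : 3;
}

/* Returns the number of heads to process, or a negative errno. */
static int process_rx(int simulate_fault)
{
	int headcount = fetch_heads(simulate_fault);

	/* Validate the sign before the value is ever used as a size. */
	if (headcount < 0)
		return headcount;
	if (headcount > SKETCH_MAX_HEADS)
		return -EINVAL;
	return headcount;
}
```

    On Linux, EFAULT is 14, so converting -EFAULT to a 32-bit unsigned value
    gives 0xfffffff2, which matches the vector size quoted in the commit
    message.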
     
  • When mergeable buffers are disabled, and the
    incoming packet is too large for the rx buffer,
    get_rx_bufs returns success.

    This was intentional in order to make recvmsg
    truncate the packet and then handle_rx would
    detect err != sock_len and drop it.

    Unfortunately we pass the original sock_len to
    recvmsg - which means we use parts of iov not fully
    validated.

    Fix this up by detecting this overrun and doing packet drop
    immediately.

    CVE-2014-0077

    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: David S. Miller

    Michael S. Tsirkin
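
    The overrun detection described in the commit above can be sketched as
    a buffer-gathering loop that reports failure when the packet does not
    fit. This is a simplified userspace model: the function name, the
    sentinel value, and the use of a quota of 1 to represent disabled
    mergeable buffers are assumptions for the example.

```c
#include <stddef.h>

#define SKETCH_OVERRUN (-1)	/* hypothetical "packet does not fit" marker */

/*
 * Gather at most `quota` receive buffers until `datalen` bytes are covered.
 * Returns the number of buffers used, or SKETCH_OVERRUN when the packet is
 * larger than the buffers that may be used, so the caller can drop it
 * immediately instead of handing a partly validated iov to recvmsg.
 */
static int get_rx_bufs_sketch(const size_t *buf_len, int nbufs, int quota,
			      size_t datalen)
{
	int used = 0;

	while (datalen > 0 && used < quota && used < nbufs) {
		size_t take = buf_len[used] < datalen ? buf_len[used] : datalen;

		datalen -= take;
		used++;
	}
	if (datalen > 0)
		return SKETCH_OVERRUN;
	return used;
}
```

    With a quota of one 1500-byte buffer, a 1000-byte packet fits and a
    3000-byte packet is reported as an overrun; with a quota of two, the
    same 3000-byte packet fits.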
     

02 Mar, 2014

1 commit

  • Pull SCSI target fixes from Nicholas Bellinger:
    "The bulk of the series are bugfixes for qla2xxx target NPIV support
    that went in for v3.14-rc1. Also included are a few DIF related
    fixes, a qla2xxx fix (Cc'ed to stable) from Greg W., and vhost/scsi
    protocol version related fix from Venkatesh.

    Also just a heads up that a series to address a number of issues with
    iser-target active I/O reset/shutdown is still being tested, and will
    be included in a separate -rc6 PULL request"

    * git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
    vhost/scsi: Check LUN structure byte 0 is set to 1, per spec
    qla2xxx: Fix kernel panic on selective retransmission request
    Target/sbc: Don't use sg as iterator in sbc_verify_read
    target: Add DIF sense codes in transport_generic_request_failure
    target/sbc: Fix sbc_dif_copy_prot addr offset bug
    tcm_qla2xxx: Fix NAA formatted name for NPIV WWPNs
    tcm_qla2xxx: Perform configfs depend/undepend for base_tpg
    tcm_qla2xxx: Add NPIV specific enable/disable attribute logic
    qla2xxx: Check + fail when npiv_vports_inuse exists in shutdown
    qla2xxx: Fix qlt_lport_register base_vha callback race

    Linus Torvalds
     

25 Feb, 2014

1 commit


14 Feb, 2014

2 commits

  • vhost_zerocopy_callback accesses VQ right after it drops a ubuf
    reference. In theory, this could race with device removal which waits
    on the ubuf kref, and crash on use after free.

    Do all accesses within rcu read side critical section, and synchronize
    on release.

    Since callbacks are always invoked from bh, synchronize_rcu_bh seems
    enough and will help release complete a bit faster.

    Signed-off-by: Michael S. Tsirkin
    Acked-by: Jason Wang
    Signed-off-by: David S. Miller

    Michael S. Tsirkin
     
  • vhost checked the counter within the refcnt before decrementing. It
    really wanted to know that it is the one that has the last reference, as
    a way to batch freeing resources a bit more efficiently.

    Note: we only let refcount go to 0 on device release.

    This works well but we now access the ref counter twice so there's a
    race: all users might see a high count and decide to defer freeing
    resources.
    In the end no one initiates freeing resources until the last reference
    is gone (which is on VM shutdown so might happen after a looooong time).

    Let's do what we probably should have done straight away:
    switch from kref to plain atomic, documenting the
    semantics, return the refcount value atomically after decrement,
    then use that to avoid the deadlock.

    Reported-by: Qin Chuanyu
    Signed-off-by: Michael S. Tsirkin
    Acked-by: Jason Wang
    Signed-off-by: David S. Miller

    Michael S. Tsirkin
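
    The fix described in the commit above hinges on reading the counter and
    decrementing it in a single atomic step. A minimal userspace sketch with
    C11 atomics follows; the struct, the function names, and the "base
    reference of 1" policy are assumptions for the example, not the vhost
    implementation.

```c
#include <stdatomic.h>

/* Hypothetical reference holder: 1 means "no outstanding buffers". */
struct ubuf_ref {
	atomic_int refcount;
};

/*
 * Drop one reference and return the value left after the decrement.
 * The decrement and the read are a single atomic operation, so two
 * concurrent callers can never both observe the same remaining count.
 */
static int ubuf_put(struct ubuf_ref *u)
{
	return atomic_fetch_sub(&u->refcount, 1) - 1;
}

/* Caller policy: start cleanup when only the base reference remains. */
static int should_flush(int remaining)
{
	return remaining <= 1;
}
```

    With a separate "check, then decrement" sequence, every caller could
    read a high count and defer cleanup; returning the post-decrement value
    guarantees that exactly one caller sees the count reach the threshold.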
     

01 Feb, 2014

1 commit

  • Pull SCSI target updates from Nicholas Bellinger:
    "The highlights this round include:

    - add support for SCSI Referrals (Hannes)
    - add support for T10 DIF into target core (nab + mkp)
    - add support for T10 DIF emulation in FILEIO + RAMDISK backends (Sagi + nab)
    - add support for T10 DIF -> bio_integrity passthrough in IBLOCK backend (nab)
    - prep changes to iser-target for >= v3.15 T10 DIF support (Sagi)
    - add support for qla2xxx N_Port ID Virtualization - NPIV (Saurav + Quinn)
    - allow percpu_ida_alloc() to receive task state bitmask (Kent)
    - fix >= v3.12 iscsi-target session reset hung task regression (nab)
    - fix >= v3.13 percpu_ref se_lun->lun_ref_active race (nab)
    - fix a long-standing network portal creation race (Andy)"

    * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (51 commits)
    target: Fix percpu_ref_put race in transport_lun_remove_cmd
    target/iscsi: Fix network portal creation race
    target: Report bad sector in sense data for DIF errors
    iscsi-target: Convert gfp_t parameter to task state bitmask
    iscsi-target: Fix connection reset hang with percpu_ida_alloc
    percpu_ida: Make percpu_ida_alloc + callers accept task state bitmask
    iscsi-target: Pre-allocate more tags to avoid ack starvation
    qla2xxx: Configure NPIV fc_vport via tcm_qla2xxx_npiv_make_lport
    qla2xxx: Enhancements to enable NPIV support for QLOGIC ISPs with TCM/LIO.
    qla2xxx: Fix scsi_host leak on qlt_lport_register callback failure
    IB/isert: pass scatterlist instead of cmd to fast_reg_mr routine
    IB/isert: Move fastreg descriptor creation to a function
    IB/isert: Avoid frwr notation, user fastreg
    IB/isert: seperate connection protection domains and dma MRs
    tcm_loop: Enable DIF/DIX modes in SCSI host LLD
    target/rd: Add DIF protection into rd_execute_rw
    target/rd: Add support for protection SGL setup + release
    target/rd: Refactor rd_build_device_space + rd_release_device_space
    target/file: Add DIF protection support to fd_execute_rw
    target/file: Add DIF protection init/format support
    ...

    Linus Torvalds
     

24 Jan, 2014

1 commit

  • This patch changes percpu_ida_alloc() + callers to accept task state
    bitmask for prepare_to_wait() for code like target/iscsi that needs
    it for interruptible sleep, that is provided in a subsequent patch.

    It now expects TASK_UNINTERRUPTIBLE when the caller is able to sleep
    waiting for a new tag, or TASK_RUNNING when the caller cannot sleep,
    and is forced to return a negative value when no tags are available.

    v2 changes:
    - Include blk-mq + tcm_fc + vhost/scsi + target/iscsi changes
    - Drop signal_pending_state() call
    v3 changes:
    - Only call prepare_to_wait() + finish_wait() when != TASK_RUNNING
    (PeterZ)

    Reported-by: Linus Torvalds
    Cc: Linus Torvalds
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Jens Axboe
    Signed-off-by: Kent Overstreet
    Cc: #3.12+
    Signed-off-by: Nicholas Bellinger

    Kent Overstreet
     

18 Jan, 2014

1 commit

  • This patch adds support to target_submit_cmd_map_sgls() for
    accepting 'sgl_prot' + 'sgl_prot_count' parameters for
    DIF protection information.

    Note the passed parameters are stored at se_cmd->t_prot_sg
    and se_cmd->t_prot_nents respectively.

    Also, update tcm_loop and vhost-scsi fabrics usage of
    target_submit_cmd_map_sgls() to take into account the
    new parameters.

    Cc: Martin K. Petersen
    Cc: Christoph Hellwig
    Cc: Hannes Reinecke
    Cc: Sagi Grimberg
    Cc: Or Gerlitz
    Signed-off-by: Nicholas Bellinger

    Nicholas Bellinger
     

07 Dec, 2013

1 commit


23 Nov, 2013

1 commit

  • Pull SCSI target updates from Nicholas Bellinger:
    "Things have been quiet this round with mostly bugfixes, percpu
    conversions, and other minor iscsi-target conformance testing changes.

    The highlights include:

    - Add demo_mode_discovery attribute for iscsi-target (Thomas)
    - Convert tcm_fc(FCoE) to use percpu-ida pre-allocation
    - Add send completion interrupt coalescing for ib_isert
    - Convert target-core to use percpu-refcounting for se_lun
    - Fix mutex_trylock usage bug in iscsit_increment_maxcmdsn
    - tcm_loop updates (Hannes)
    - target-core ALUA cleanups + prep for v3.14 SCSI Referrals support (Hannes)

    v3.14 is currently shaping to be a busy development cycle in target
    land, with initial support for T10 Referrals and T10 DIF currently on
    the roadmap"

    * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (40 commits)
    iscsi-target: chap auth shouldn't match username with trailing garbage
    iscsi-target: fix extract_param to handle buffer length corner case
    iscsi-target: Expose default_erl as TPG attribute
    target_core_configfs: split up ALUA supported states
    target_core_alua: Make supported states configurable
    target_core_alua: Store supported ALUA states
    target_core_alua: Rename ALUA_ACCESS_STATE_OPTIMIZED
    target_core_alua: spellcheck
    target core: rename (ex,im)plict -> (ex,im)plicit
    percpu-refcount: Add percpu-refcount.o to obj-y
    iscsi-target: Do not reject non-immediate CmdSNs exceeding MaxCmdSN
    iscsi-target: Convert iscsi_session statistics to atomic_long_t
    target: Convert se_device statistics to atomic_long_t
    target: Fix delayed Task Aborted Status (TAS) handling bug
    iscsi-target: Reject unsupported multi PDU text command sequence
    ib_isert: Avoid duplicate iscsit_increment_maxcmdsn call
    iscsi-target: Fix mutex_trylock usage in iscsit_increment_maxcmdsn
    target: Core does not need blkdev.h
    target: Pass through I/O topology for block backstores
    iser-target: Avoid using FRMR for single dma entry requests
    ...

    Linus Torvalds
     

26 Oct, 2013

1 commit

  • This patch addresses a long-standing bug where the get_user_pages_fast()
    write parameter used for setting the underlying page table entry permission
    bits was incorrectly set to write=1 for data_direction=DMA_TO_DEVICE, and
    passed into get_user_pages_fast() via vhost_scsi_map_iov_to_sgl().

    However, this parameter is intended to signal WRITEs to pinned userspace
    PTEs for the virtio-scsi DMA_FROM_DEVICE -> READ payload case, and *not*
    for the virtio-scsi DMA_TO_DEVICE -> WRITE payload case.

    This bug would manifest itself as random process segmentation faults on
    KVM host after repeated vhost starts + stops and/or with lots of vhost
    endpoints + LUNs.

    Cc: Stefan Hajnoczi
    Cc: Michael S. Tsirkin
    Cc: Asias He
    Cc: # 3.6+
    Signed-off-by: Nicholas Bellinger

    Nicholas Bellinger
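
    The direction logic described in the commit above is easy to get
    backwards, so a small sketch may help. The enum and function below are
    local, hypothetical definitions for illustration and are not the kernel
    DMA API.

```c
/* Local model of the two transfer directions discussed above. */
enum sketch_dma_dir {
	SKETCH_DMA_TO_DEVICE,	/* guest WRITE: host only reads guest pages */
	SKETCH_DMA_FROM_DEVICE,	/* guest READ: host writes into guest pages */
};

/*
 * Pages need to be pinned writable only when the host will store data
 * into them, which is the FROM_DEVICE (READ payload) case.
 */
static int pin_for_write(enum sketch_dma_dir dir)
{
	return dir == SKETCH_DMA_FROM_DEVICE;
}
```

    The names are from the device's point of view: data moving "from the
    device" lands in guest memory, so that is the case that needs write
    access to the pinned pages.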
     

17 Oct, 2013

1 commit


02 Oct, 2013

1 commit


18 Sep, 2013

2 commits


17 Sep, 2013

1 commit

  • The wake_up_process() call is made inside spin_lock/unlock in
    vhost_work_queue(), but it could be done outside the spin_lock.
    I have tested it with kernel 3.0.27 and guest suse11-sp2 using iperf;
    the numbers are below.

                |      original       |      modified
    thread_num  | tp(Gbps)  vhost(%)  | tp(Gbps)  vhost(%)
    1           | 9.59      28.82     | 9.59      27.49
    8           | 9.61      32.92     | 9.62      26.77
    64          | 9.58      46.48     | 9.55      38.99
    256         | 9.6       63.7      | 9.6       52.59

    Signed-off-by: Chuanyu Qin
    Signed-off-by: Michael S. Tsirkin

    Qin Chuanyu
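
    The change described in the commit above, moving the wake-up out of the
    locked region, can be sketched with a POSIX mutex and condition
    variable. This is a userspace analogy with hypothetical types and
    names; it is not the vhost_work_queue() code.

```c
#include <pthread.h>

struct work {
	int queued;
};

struct worker {
	pthread_mutex_t lock;
	pthread_cond_t wake;
};

/*
 * Queue a work item. The decision whether a wake-up is needed is made
 * under the lock, but the wake-up itself is issued after the lock has
 * been dropped, so the woken thread does not immediately block on it.
 * Returns 1 if the item was newly queued, 0 if it was already pending.
 */
static int work_queue(struct worker *w, struct work *work)
{
	int need_wake = 0;

	pthread_mutex_lock(&w->lock);
	if (!work->queued) {
		work->queued = 1;
		need_wake = 1;
	}
	pthread_mutex_unlock(&w->lock);

	if (need_wake)
		pthread_cond_signal(&w->wake);
	return need_wake;
}
```

    This stays correct as long as the waiting thread re-checks the queued
    state under the same lock before sleeping. Shortening the locked region
    is the likely source of the lower vhost CPU usage shown in the table
    above.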
     

13 Sep, 2013

1 commit

  • Pull SCSI target updates from Nicholas Bellinger:
    "Lots of activity again this round for I/O performance optimizations
    (per-cpu IDA pre-allocation for vhost + iscsi/target), and the
    addition of new fabric independent features to target-core
    (COMPARE_AND_WRITE + EXTENDED_COPY).

    The main highlights include:

    - Support for iscsi-target login multiplexing across individual
    network portals
    - Generic Per-cpu IDA logic (kent + akpm + clameter)
    - Conversion of vhost to use per-cpu IDA pre-allocation for
    descriptors, SGLs and userspace page pointer list
    - Conversion of iscsi-target + iser-target to use per-cpu IDA
    pre-allocation for descriptors
    - Add support for generic COMPARE_AND_WRITE (AtomicTestandSet)
    emulation for virtual backend drivers
    - Add support for generic EXTENDED_COPY (CopyOffload) emulation for
    virtual backend drivers.
    - Add support for fast memory registration mode to iser-target (Vu)

    The patches to add COMPARE_AND_WRITE and EXTENDED_COPY support are of
    particular significance, which make us the first and only open source
    target to support the full set of VAAI primitives.

    Currently Linux clients are lacking upstream support to actually
    utilize these primitives. However, with server side support now in
    place for folks like MKP + ZAB working on the client, this logic once
    reserved for the highest end of storage arrays, can now be run in VMs
    on their laptops"

    * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (50 commits)
    target/iscsi: Bump versions to v4.1.0
    target: Update copyright ownership/year information to 2013
    iscsi-target: Bump default TCP listen backlog to 256
    target: Fix >= v3.9+ regression in PR APTPL + ALUA metadata write-out
    iscsi-target; Bump default CmdSN Depth to 64
    iscsi-target: Remove unnecessary wait_for_completion in iscsi_get_thread_set
    iscsi-target: Add thread_set->ts_activate_sem + use common deallocate
    iscsi-target: Fix race with thread_pre_handler flush_signals + ISCSI_THREAD_SET_DIE
    target: remove unused including
    iser-target: introduce fast memory registration mode (FRWR)
    iser-target: generalize rdma memory registration and cleanup
    iser-target: move rdma wr processing to a shared function
    target: Enable global EXTENDED_COPY setup/release
    target: Add Third Party Copy (3PC) bit in INQUIRY response
    target: Enable EXTENDED_COPY setup in spc_parse_cdb
    target: Add support for EXTENDED_COPY copy offload emulation
    target: Avoid non-existent tg_pt_gp_mem in target_alua_state_check
    target: Add global device list for EXTENDED_COPY
    target: Make helpers non static for EXTENDED_COPY command setup
    target: Make spc_parse_naa_6h_vendor_specific non static
    ...

    Linus Torvalds
     

11 Sep, 2013

1 commit


10 Sep, 2013

2 commits

  • This patch adds support for pre-allocation of per tv_cmd descriptor
    scatterlist + user-space page pointer memory using se_sess->sess_cmd_map
    within tcm_vhost_make_nexus() code.

    This includes sanity checks within vhost_scsi_map_to_sgl()
    to reject I/O that exceeds these initial hardcoded values, and
    the necessary cleanup in tcm_vhost_make_nexus() failure path +
    tcm_vhost_drop_nexus().

    v3 changes:
    - Rebase to v3.11-rc5 code

    Cc: Michael S. Tsirkin
    Cc: Asias He
    Cc: Kent Overstreet
    Reviewed-by: Asias He
    Acked-by: Michael S. Tsirkin
    Signed-off-by: Nicholas Bellinger

    Nicholas Bellinger
     
  • This patch changes vhost/scsi to use transport_init_session_tags()
    pre-allocation logic for per-cpu session tag pooling with internal
    ida_alloc() + ida_free() calls based upon the saved se_cmd->map_tag id.

    FIXME: Make transport_init_session_tags() number of tags setup
    configurable per vring client setting via configfs

    v5 changes:
    - Convert to percpu_ida.h include

    v3 changes:
    - Update to percpu-ida usage
    - Rebase to v3.11-rc5 code

    Cc: Michael S. Tsirkin
    Cc: Asias He
    Cc: Kent Overstreet
    Acked-by: Michael S. Tsirkin
    Signed-off-by: Nicholas Bellinger

    Nicholas Bellinger
     

04 Sep, 2013

1 commit

  • As Michael pointed out, we used to limit the max pending DMAs to get better
    cache utilization. But it was not done correctly, since the check was only
    done when there were no new buffers submitted from the guest. A guest can
    easily exceed the limit by continuing to send packets.

    So this patch moves the check into the main loop. Tests show about 5%-10%
    improvement on per cpu throughput for guest tx.

    Signed-off-by: Jason Wang
    Signed-off-by: David S. Miller

    Jason Wang