21 Dec, 2014

1 commit

  • Pull SCSI update from James Bottomley:
    "This is a much shorter set of patches that were on the go but didn't
    make it in to the early pull request for the merge window. It's
    really a set of bug fixes plus some final cleanup work on the new tag
    queue API"

    * tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
    storvsc: ring buffer failures may result in I/O freeze
    ipr: set scsi_level correctly for disk arrays
    ipr: add support for async scanning to speed up boot
    scsi_debug: fix missing "break;" in SDEBUG_UA_CAPACITY_CHANGED case
    scsi_debug: take sdebug_host_list_lock when changing capacity
    scsi_debug: improve driver description in Kconfig
    scsi_debug: fix compare and write errors
    qla2xxx: fix race in handling rport deletion during recovery causes panic
    scsi: blacklist RSOC for Microsoft iSCSI target devices
    scsi: fix random memory corruption with scsi-mq + T10 PI
    Revert "[SCSI] mpt3sas: Remove phys on topology change"
    Revert "[SCSI] mpt2sas: Remove phys on topology change."
    esas2r: Correct typos of "validate" in a comment
    fc: FCP_PTA_SIMPLE is 0
    ibmvfc: remove unused tag variable
    scsi: remove MSG_*_TAG defines
    scsi: remove scsi_set_tag_type
    scsi: remove scsi_get_tag_type
    scsi: never drop to untagged mode during queue ramp down
    scsi: remove ->change_queue_type method

    Linus Torvalds
     

20 Dec, 2014

1 commit

  • Pull SCSI target fixes from Nicholas Bellinger:
    "The highlights this merge window include:

    - Allow target fabric drivers to function as built-in. (Roland)
    - Fix tcm_loop multi-TPG endpoint nexus bug. (Hannes)
    - Move per device config_item_type into se_subsystem_api, allowing
    configfs attributes to be defined at module_init time. (Jerome +
    nab)
    - Convert existing IBLOCK/FILEIO/RAMDISK/PSCSI/TCMU drivers to use
    external configfs attributes. (nab)
    - A number of iser-target fixes related to active session + network
    portal shutdown stability during extended stress testing. (Sagi +
    Slava)
    - Dynamic allocation of T10-PI contexts for iser-target, fixing a
    potentially bogus iscsi_np->tpg_np pointer reference in >= v3.14
    code. (Sagi)
    - iser-target performance + scalability improvements. (Sagi)
    - Fixes for SPC-4 Persistent Reservation AllRegistrants spec
    compliance. (Ilias + James + nab)
    - Avoid potential short kern_sendmsg() in iscsi-target for now until
    Al's conversion to use msghdr iteration is merged post -rc1.
    (Viro)

    Also, Sagi has requested a number of iser-target patches (9) that
    address stability issues he's encountered during extended stress
    testing be considered for v3.10.y + v3.14.y code. Given the amount of
    LOC involved, it will certainly require extra backporting effort.

    Apologies in advance to Greg-KH & Co on this. Sagi and I will be
    working post-merge to ensure they each get applied correctly"

    * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (53 commits)
    target: Allow AllRegistrants to re-RESERVE existing reservation
    uapi/linux/target_core_user.h: fix headers_install.sh badness
    iscsi-target: Fail connection on short sendmsg writes
    iscsi-target: nullify session in failed login sequence
    target: Avoid dropping AllRegistrants reservation during unregister
    target: Fix R_HOLDER bit usage for AllRegistrants
    iscsi-target: Drop left-over bogus iscsi_np->tpg_np
    iser-target: Fix wc->wr_id cast warning
    iser-target: Remove code duplication
    iser-target: Adjust log levels and prettify some prints
    iser-target: Use debug_level parameter to control logging level
    iser-target: Fix logout sequence
    iser-target: Don't wait for session commands from completion context
    iser-target: Reduce CQ lock contention by batch polling
    iser-target: Introduce isert_poll_budget
    iser-target: Remove an atomic operation from the IO path
    iser-target: Remove redundant call to isert_conn_terminate
    iser-target: Use single CQ for TX and RX
    iser-target: Centralize completion elements to a context
    iser-target: Cast wr_id with uintptr_t instead of unsinged long
    ...

    Linus Torvalds
     

18 Dec, 2014

1 commit


16 Dec, 2014

26 commits

  • Roland Dreier
     
  • Handle the case where the last argument of the connection string is
    processed as a string (destination GID, for example).

    Signed-off-by: Sagi Grimberg
    Acked-by: Bart Van Assche
    Signed-off-by: Roland Dreier

    Sagi Grimberg
     
  • Signed-off-by: Sagi Grimberg
    Signed-off-by: Or Gerlitz
    Signed-off-by: Roland Dreier

    Or Gerlitz
     
  • Following a few recent block integrity updates, we align the iSER
    data integrity offload settings as follows:

    - Deprecate the pi_guard module param
    - Expose support for DIX type 0
    - Use scsi_transfer_length for the transfer length
    - Get the pi_interval, ref_tag, ref_remap, bg_type and
    check_mask settings from the scsi_cmnd

    Signed-off-by: Sagi Grimberg
    Signed-off-by: Or Gerlitz
    Signed-off-by: Roland Dreier

    Sagi Grimberg
     
  • Use likely() for wc.status == IB_WC_SUCCESS

    Signed-off-by: Sagi Grimberg
    Signed-off-by: Or Gerlitz
    Signed-off-by: Roland Dreier

    Sagi Grimberg
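The likely() change above can be illustrated with a minimal userspace sketch; the macro definitions below are stand-ins for the kernel's compiler.h versions, and IB_WC_SUCCESS is redefined locally for illustration (it matches the first value of the real enum):

```c
#include <assert.h>

/* Stand-ins for the kernel's likely()/unlikely() macros, which expand
 * to __builtin_expect() branch-prediction hints. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

#define IB_WC_SUCCESS 0  /* local stand-in for the ib_wc_status value */

/* Count successful completions, hinting that success is the common case. */
static int count_success(const int *statuses, int n)
{
    int ok = 0;
    for (int i = 0; i < n; i++)
        if (likely(statuses[i] == IB_WC_SUCCESS))
            ok++;
    return ok;
}
```

The hint only affects generated code layout for the hot path; behavior is identical either way.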
     
  • And fix a checkpatch warning.

    Signed-off-by: Sagi Grimberg
    Signed-off-by: Or Gerlitz
    Signed-off-by: Roland Dreier

    Sagi Grimberg
     
  • No reason to settle for four; we can use the minimum of the device's
    max completion vectors and the number of cores.

    Signed-off-by: Sagi Grimberg
    Signed-off-by: Or Gerlitz
    Signed-off-by: Roland Dreier

    Sagi Grimberg
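The sizing rule described above can be sketched as follows; the struct and field names here are illustrative stand-ins, not the driver's actual types:

```c
#include <assert.h>

/* Illustrative device attributes; real code reads the ib_device caps. */
struct dev_attrs { int num_comp_vectors; };

static int min_int(int a, int b) { return a < b ? a : b; }

/* Pick the completion-context count: no fixed four, just the min of
 * what the device exposes and the number of online cores. */
static int choose_comp_vectors(const struct dev_attrs *dev, int online_cpus)
{
    return min_int(dev->num_comp_vectors, online_cpus);
}
```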
     
  • It is enough to check the mem_h pointer assignment; mem_h == NULL
    indicates that the buffer is not registered using an MR.

    Signed-off-by: Sagi Grimberg
    Signed-off-by: Or Gerlitz
    Signed-off-by: Roland Dreier

    Sagi Grimberg
     
  • Eliminates code duplication.

    Signed-off-by: Sagi Grimberg
    Signed-off-by: Or Gerlitz
    Signed-off-by: Roland Dreier

    Sagi Grimberg
     
  • When closing the connection, we should first terminate the connection
    (in case it was not previously terminated) to guarantee that the QP
    is in the error state and we are done servicing IO. Only then do we
    go ahead with task cleanup via iscsi_conn_stop.

    Signed-off-by: Sagi Grimberg
    Signed-off-by: Or Gerlitz
    Signed-off-by: Roland Dreier

    Sagi Grimberg
     
  • In certain scenarios (target kill with live IO), SCSI TMFs may race
    with iser RDMA teardown, which might cause a NULL dereference on the
    iser IB device handle (which might have been freed). In this case we
    take a conditional lock for TMFs and check the connection state
    (avoiding the introduction of lock contention in the IO path). This
    is a best-effort approach, but it is sufficient to survive the
    sudden death of multiple targets while heavy IO is inflight.

    While we are at it, add nice kernel-doc style documentation.

    Reported-by: Ariel Nahum
    Signed-off-by: Sagi Grimberg
    Signed-off-by: Or Gerlitz
    Signed-off-by: Roland Dreier

    Sagi Grimberg
     
  • If an rdma_cm error event comes after ep_poll but before conn_bind,
    we should protect against dereferencing the device (which may have
    been terminated) in the session_create and conn_create (already
    protected) callbacks.

    Signed-off-by: Ariel Nahum
    Signed-off-by: Sagi Grimberg
    Signed-off-by: Or Gerlitz
    Signed-off-by: Roland Dreier

    Ariel Nahum
     
  • Use uintptr_t to handle wr_id casting, which was found by the Kbuild
    test robot and smatch. Also remove an internal definition of a
    variable which potentially shadows an external one (and make sparse
    happy).

    Signed-off-by: Sagi Grimberg
    Signed-off-by: Or Gerlitz
    Signed-off-by: Roland Dreier

    Sagi Grimberg
     
  • Fix a regression that was introduced in commit 6df5a128f0fd
    ("IB/iser: Suppress scsi command send completions").

    The sig_count was wrongly declared static, so it is possible that we
    never reach the (sig_count % ISER_SIGNAL_BATCH) == 0 condition (due
    to races) and the send queue overflows.

    Instead, keep sig_count per connection. It does not need to be
    atomic, as we are safe under the iscsi session frwd_lock taken by
    libiscsi on the queuecommand path.

    Fixes: 6df5a128f0fd ("IB/iser: Suppress scsi command send completions")
    Signed-off-by: Max Gurtovoy
    Signed-off-by: Sagi Grimberg
    Signed-off-by: Or Gerlitz
    Signed-off-by: Roland Dreier

    Max Gurtovoy
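The per-connection counter described above can be sketched as follows; the struct and constants are simplified stand-ins for the driver's iser_conn, ISER_SIGNAL_BATCH and IB_SEND_SIGNALED:

```c
#include <assert.h>

#define ISER_SIGNAL_BATCH 16
#define SEND_SIGNALED     0x1   /* stand-in for IB_SEND_SIGNALED */

/* Simplified connection: the counter lives here, not in a static, so
 * concurrent connections cannot race on it (each is serialized by its
 * own session frwd_lock on the queuecommand path). */
struct conn_sketch { unsigned int sig_count; };

/* Request a signaled completion on every ISER_SIGNAL_BATCH-th post. */
static unsigned int send_flags_for_post(struct conn_sketch *c)
{
    return (++c->sig_count % ISER_SIGNAL_BATCH == 0) ? SEND_SIGNALED : 0;
}
```

With a shared static counter, two connections posting concurrently could skip the modulo-zero point; keeping the counter per connection makes the batch boundary deterministic.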
     
  • When creating a connection QP, we choose the least-used CQ and
    increment the number of active QPs on it. If we fail to create the
    QP, we need to decrement the active QPs counter.

    Reported-by: Roi Dayan
    Signed-off-by: Sagi Grimberg
    Signed-off-by: Or Gerlitz
    Signed-off-by: Roland Dreier

    Sagi Grimberg
     
  • No real need to wait for TIMEWAIT_EXIT before we destroy the RDMA
    resources (TIMEWAIT_EXIT is also not guaranteed to always arrive).
    As for the cma_id, only destroy it if the state is not DOWN; in that
    case conn_release is already running and we don't want to compete
    with it.

    Signed-off-by: Ariel Nahum
    Signed-off-by: Sagi Grimberg
    Signed-off-by: Or Gerlitz
    Signed-off-by: Roland Dreier

    Ariel Nahum
     
  • In case the HCA goes into a catastrophic error flow, the beacon
    post_send is likely to fail, so there will surely be no completion
    for it.

    In this case, use a best-effort approach and don't wait for the
    beacon completion if we failed to post the send.

    Reported-by: Alex Tabachnik
    Signed-off-by: Sagi Grimberg
    Signed-off-by: Or Gerlitz
    Signed-off-by: Roland Dreier

    Sagi Grimberg
     
  • Re-adjust max CQEs per CQ and max send_wr per QP according
    to the resource limits supported by underlying hardware.

    Signed-off-by: Minh Tran
    Signed-off-by: Jayamohan Kallickal
    Acked-by: Sagi Grimberg
    Signed-off-by: Or Gerlitz
    Signed-off-by: Roland Dreier

    Minh Tran
     
  • Various places in the IPoIB code had a deadlock related to flushing
    the ipoib workqueue. Now that we have per device workqueues and a
    specific flush workqueue, there is no longer a deadlock issue with
    flushing the device specific workqueues and we can do so unilaterally.

    Signed-off-by: Doug Ledford
    Signed-off-by: Roland Dreier

    Doug Ledford
     
  • We used to pass a flush variable to mcast_stop_thread to indicate
    whether we should flush the workqueue or not. This was because some
    code tried to flush the workqueue it was currently running on, which
    is a no-no. Now that we have per-device workqueues, and now that
    ipoib_mcast_restart_task takes into account that it is queued on a
    single-thread workqueue with all of the ipoib_mcast_join_task's and
    therefore has no need to stop the join task while it runs, we can do
    away with the flush parameter and always flush unconditionally.

    Signed-off-by: Doug Ledford
    Signed-off-by: Roland Dreier

    Doug Ledford
     
  • During my recent work on the rtnl lock deadlock in the IPoIB driver, I
    saw that even once I fixed the apparent races for a single device, as
    soon as that device had any children, new races popped up. It turns
    out that this is because no matter how well we protect against races
    on a single device, the fact that all devices use the same workqueue,
    and flush_workqueue() flushes *everything* from that workqueue, we can
    have one device in the middle of a down and holding the rtnl lock and
    another totally unrelated device needing to run mcast_restart_task,
    which wants the rtnl lock and will loop trying to take it unless it
    sees its own FLAG_ADMIN_UP flag go away. Because the unrelated
    interface will never see its own ADMIN_UP flag drop, the interface
    going down will deadlock trying to flush the queue. There are several
    possible solutions to this problem:

    Make carrier_on_task and mcast_restart_task try to take the rtnl for
    some set period of time and if they fail, then bail. This runs the
    real risk of dropping work on the floor, which can end up being its
    own separate kind of deadlock.

    Set some global flag in the driver that says some device is in the
    middle of going down, letting all tasks know to bail. Again, this can
    drop work on the floor. I suppose if our own ADMIN_UP flag doesn't go
    away, then maybe after a few tries on the rtnl lock we can queue our
    own task back up as a delayed work and return and avoid dropping work
    on the floor that way. But I'm not 100% convinced that we won't cause
    other problems.

    Or the method this patch attempts to use, which is when we bring an
    interface up, create a workqueue specifically for that interface, so
    that when we take it back down, we are flushing only those tasks
    associated with our interface. In addition, keep the global
    workqueue, but now limit it to only flush tasks. In this way, the
    flush tasks can always flush the device specific work queues without
    having deadlock issues.

    Signed-off-by: Doug Ledford
    Signed-off-by: Roland Dreier

    Doug Ledford
     
  • In preparation for using per device work queues, we need to move the
    start of the neighbor thread task to after ipoib_ib_dev_init and move
    the destruction of the neighbor task to before ipoib_ib_dev_cleanup.
    Otherwise we will end up freeing our workqueue with work possibly
    still on it.

    Signed-off-by: Doug Ledford
    Signed-off-by: Roland Dreier

    Doug Ledford
     
  • Our mcast_dev_flush routine and our mcast_restart_task can race
    against each other. In particular, they both hold the priv->lock
    while manipulating the rbtree and while removing mcast entries from
    the multicast_list and while adding entries to the remove_list, but
    they also both drop their locks prior to doing the actual removes.
    The mcast_dev_flush routine is run entirely under the rtnl lock and so
    has at least some locking. The actual race condition is like this:

    Thread 1                                  Thread 2
    ifconfig ib0 up
      start multicast join for broadcast
      multicast join completes for broadcast
      start to add more multicast joins
      call mcast_restart_task to add new entries
                                              ifconfig ib0 down
                                                mcast_dev_flush
                                                  mcast_leave(mcast A)
      mcast_leave(mcast A)

    As mcast_leave calls ib_sa_multicast_leave, and as member in
    core/multicast.c is ref counted, we run into an unbalanced refcount
    issue. To avoid stomping on each others removes, take the rtnl lock
    specifically when we are deleting the entries from the remove list.

    Signed-off-by: Doug Ledford
    Signed-off-by: Roland Dreier

    Doug Ledford
     
  • Commit a9c8ba5884 ("IPoIB: Fix usage of uninitialized multicast
    objects") added a new flag MCAST_JOIN_STARTED, but was not very strict
    in how it was used. We didn't always initialize the completion struct
    before we set the flag, and we didn't always call complete on the
    completion struct from all paths that complete it. This made it less
    than totally effective, and certainly made its use confusing. And in
    the flush function we would use the presence of this flag to signal
    that we should wait on the completion struct, but we never cleared
    this flag, ever. This is further muddied by the fact that we overload
    the MCAST_FLAG_BUSY flag to mean two different things: we have a join
    in flight, and we have succeeded in getting an ib_sa_join_multicast.

    In order to make things clearer and aid in resolving the rtnl deadlock
    bug I've been chasing, I cleaned this up a bit.

    1) Remove the MCAST_JOIN_STARTED flag entirely
    2) Un-overload MCAST_FLAG_BUSY so it now only means a join is in-flight
    3) Test on mcast->mc directly to see if we have completed
    ib_sa_join_multicast (using IS_ERR_OR_NULL)
    4) Make sure that before setting MCAST_FLAG_BUSY we always initialize
    the mcast->done completion struct
    5) Make sure that before calling complete(&mcast->done), we always clear
    the MCAST_FLAG_BUSY bit
    6) Take the mcast_mutex before we call ib_sa_multicast_join and also
    take the mutex in our join callback. This forces
    ib_sa_multicast_join to return and set mcast->mc before we process
    the callback. This way, our callback can safely clear mcast->mc
    if there is an error on the join and we will do the right thing as
    a result in mcast_dev_flush.
    7) Because we need the mutex to synchronize mcast->mc, we can no
    longer call mcast_sendonly_join directly from mcast_send and
    instead must add sendonly join processing to the mcast_join_task

    A number of different races are resolved with these changes. These
    races existed with the old MCAST_FLAG_BUSY usage, the
    MCAST_JOIN_STARTED flag was an attempt to address them, and while it
    helped, a determined effort could still trip things up.

    One race looks something like this:

    Thread 1                                  Thread 2
    ib_sa_join_multicast (as part of running restart mcast task)
      alloc member
      call callback
                                              ifconfig ib0 down
                                              wait_for_completion
        callback call completes
                                              wait_for_completion in
                                              mcast_dev_flush completes
                                                mcast->mc is PTR_ERR_OR_NULL
                                                so we skip ib_sa_leave_multicast
        return from callback
      return from ib_sa_join_multicast
    set mcast->mc = return from ib_sa_multicast

    We now have a permanently unbalanced join/leave issue that trips up the
    refcounting in core/multicast.c

    Another like this:

    Thread 1                     Thread 2                   Thread 3
    ib_sa_multicast_join
                                                            ifconfig ib0 down
                                                            priv->broadcast = NULL
                                 join_complete
                                                            wait_for_completion
                                 mcast->mc is not yet set,
                                 so don't clear
    return from ib_sa_join_multicast and set mcast->mc
                                 complete
                                 return -EAGAIN (making
                                 mcast->mc invalid)
                                                            call ib_sa_multicast_leave
                                                            on invalid mcast->mc,
                                                            hang forever

    By holding the mutex around ib_sa_multicast_join and taking the mutex
    early in the callback, we force mcast->mc to be valid at the time we
    run the callback. This allows us to clear mcast->mc if there is an
    error and the join is going to fail. We do this before we complete
    the mcast. In this way, mcast_dev_flush always sees consistent state
    in regards to mcast->mc membership at the time that the
    wait_for_completion() returns.

    Signed-off-by: Doug Ledford
    Signed-off-by: Roland Dreier

    Doug Ledford
     
  • We blindly assume that just taking the rtnl lock will prevent races
    with downing this interface. Unfortunately, that's not the case. In
    ipoib_mcast_stop_thread() we call flush_workqueue() in an attempt to
    clear out all remaining instances of ipoib_join_task. But since this
    task is put on the same workqueue as the join task, flush_workqueue()
    waits on this thread too, and this thread is deadlocked on the rtnl
    lock. The better approach is to use trylock and loop on that until we
    either get the lock or see that FLAG_ADMIN_UP has been cleared, in
    which case we don't need to do anything anyway and just return.

    Signed-off-by: Doug Ledford
    Signed-off-by: Roland Dreier

    Doug Ledford
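The trylock-and-recheck pattern above can be sketched in userspace; a toy flag stands in for the rtnl lock and the boolean for FLAG_ADMIN_UP, so none of these identifiers are the driver's real ones:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy non-blocking lock: returns 0 on success, -1 if already held. */
static int fake_trylock(int *lock)
{
    if (*lock)
        return -1;
    *lock = 1;
    return 0;
}

/* Keep trying the lock, but bail out once the ADMIN_UP flag we depend
 * on is cleared; at that point there is nothing left to do anyway. */
static bool lock_or_bail(int *lock, const bool *admin_up)
{
    while (fake_trylock(lock) != 0) {
        if (!*admin_up)
            return false;   /* interface going down: give up */
        /* the kernel version would sleep or reschedule here */
    }
    return true;            /* lock acquired */
}
```

The key property is that the waiter never blocks unconditionally, so the thread holding the lock during an interface down can make progress instead of deadlocking against the workqueue flush.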
     
  • Setting the MTU can safely be moved to the carrier_on_task, which keeps
    us from needing to take the rtnl lock in the join_finish section.

    Signed-off-by: Doug Ledford
    Signed-off-by: Roland Dreier

    Doug Ledford
     

13 Dec, 2014

11 commits

  • CC [M] drivers/infiniband/ulp/isert/ib_isert.o
    drivers/infiniband/ulp/isert/ib_isert.c: In function ‘isert_cq_comp_err’:
    drivers/infiniband/ulp/isert/ib_isert.c:1979:42: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]

    Signed-off-by: Nicholas Bellinger

    Nicholas Bellinger
     
  • - Fall through in the switch statement in do_control_comp instead of
    duplicating code.
    - Move rkey invalidation to a function.

    Signed-off-by: Sagi Grimberg
    Signed-off-by: Nicholas Bellinger

    Sagi Grimberg
     
  • debug_level 1 (warn): Include warning messages.
    debug_level 2 (info): Include relevant info for control plane.
    debug_level 3 (debug): Include relevant info in the IO path.

    Also, added/removed some logging messages.

    Signed-off-by: Sagi Grimberg
    Signed-off-by: Nicholas Bellinger

    Sagi Grimberg
     
  • Personal preference: it is easier to control the log level with a
    single modparam which can be changed dynamically, and it allows
    better separation of the control and IO planes.

    Replaced throughout ib_isert.c:
    s/pr_debug/isert_dbg/g
    s/pr_info/isert_info/g
    s/pr_warn/isert_warn/g
    s/pr_err/isert_err/g

    Plus a nit checkpatch warning fix.

    Signed-off-by: Sagi Grimberg
    Signed-off-by: Nicholas Bellinger

    Sagi Grimberg
     
  • We don't want to wait for conn_logout_comp from the isert_comp_wq
    context, as this blocks further completions from being processed.
    Instead we wait for it conditionally (if a logout response was
    actually posted) in wait_conn. This wait should normally complete
    immediately, as it occurs after we have consumed all the completions
    (including flush errors), at which point conn_logout_comp should
    already have been completed.

    Signed-off-by: Sagi Grimberg
    Signed-off-by: Nicholas Bellinger

    Sagi Grimberg
     
  • Waiting for session commands from the completion context might
    result in a deadlock, where the completion context waits for the
    session commands to be released while that release might itself
    need a final completion.

    Signed-off-by: Sagi Grimberg
    Signed-off-by: Nicholas Bellinger

    Sagi Grimberg
     
  • In order to reduce the contention on CQ locking (present
    in some LLDDs) we poll in batches of 16 work completion items.

    Signed-off-by: Sagi Grimberg
    Signed-off-by: Nicholas Bellinger

    Sagi Grimberg
     
  • In case the CQ is packed with completions, we can't just hog the CPU
    forever. Poll until a budget (currently hard-coded to 64k
    completions) is exhausted; if it is, bail out and give other threads
    a chance to run.

    Signed-off-by: Sagi Grimberg
    Signed-off-by: Nicholas Bellinger

    Sagi Grimberg
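The two polling changes above (batches of 16 entries, bounded by an overall budget) can be sketched together; poll_cq() below is a stub standing in for ib_poll_cq(), and the constants mirror the figures quoted in the log rather than the driver's exact symbols:

```c
#include <assert.h>

#define POLL_BATCH  16
#define POLL_BUDGET (64 * 1024)

struct fake_cq { int pending; };

/* Stub for ib_poll_cq(): reap up to max entries, return how many. */
static int poll_cq(struct fake_cq *cq, int max)
{
    int n = cq->pending < max ? cq->pending : max;
    cq->pending -= n;
    return n;
}

/* Drain in batches of POLL_BATCH (amortizing the CQ lock over 16
 * entries), but stop once POLL_BUDGET completions were processed so a
 * packed CQ cannot hog the CPU; the caller would then reschedule. */
static int drain_cq(struct fake_cq *cq)
{
    int done = 0;
    while (done < POLL_BUDGET) {
        int n = poll_cq(cq, POLL_BATCH);
        if (n == 0)
            break;      /* CQ empty */
        done += n;      /* completions would be handled here */
    }
    return done;
}
```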
     
  • In order to know that we have consumed all the connection
    completions, we maintain an atomic post_send_buf_count for each IO
    post send. But we can learn the same thing by posting a "beacon" (a
    zero-length RECV work request) after we move the QP into the error
    state, once the target no longer serves any new IO. When we consume
    the beacon, we know we have finished all the connection completions
    and can go ahead and destroy stuff.

    In the error completion handler we now just need to wait for
    ISERT_BEACON_WRID to arrive, then wait for the session commands to
    clean up, and complete conn_wait_comp_err.

    We reserve an extra CQ and QP entry to fit the zero-length post recv.

    Signed-off-by: Sagi Grimberg
    Signed-off-by: Nicholas Bellinger

    Sagi Grimberg
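The beacon idea relies on the completion queue's FIFO ordering: a sentinel posted after the QP enters the error state completes after everything posted before it. A minimal userspace sketch of that reasoning (the ring buffer and wr_id values are illustrative, not the driver's):

```c
#include <assert.h>

#define BEACON_WRID ((unsigned long)-1)  /* mirrors ISERT_BEACON_WRID */

/* Toy FIFO standing in for the completion queue. */
struct wr_fifo { unsigned long ids[64]; int head, tail; };

static void post_completion(struct wr_fifo *q, unsigned long wr_id)
{
    q->ids[q->tail++] = wr_id;
}

/* Consume completions until the beacon shows up; because the queue is
 * FIFO, everything posted before the beacon has been drained by then.
 * Returns the number of regular completions, or -1 if no beacon. */
static int drain_until_beacon(struct wr_fifo *q)
{
    int n = 0;
    while (q->head != q->tail) {
        if (q->ids[q->head++] == BEACON_WRID)
            return n;
        n++;
    }
    return -1;
}
```

This is why the sentinel can replace a per-post atomic counter: seeing the beacon is proof that all earlier work requests have already completed.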
     
  • Since we call session reinstatement, wait_conn will start connection
    termination.

    Signed-off-by: Sagi Grimberg
    Signed-off-by: Nicholas Bellinger

    Sagi Grimberg
     
  • Using TX and RX CQs attached to the same vector might create a
    throttling effect arising from the serial processing of a workqueue.
    Use a single CQ instead; it does better in interrupt processing and
    yields simpler code. We also get rid of the redundant isert_rx_wq.

    Next we can remove the atomic post_send_buf_count from the IO path.

    Signed-off-by: Sagi Grimberg
    Signed-off-by: Nicholas Bellinger

    Sagi Grimberg