linux/drivers/infiniband/hw/cxgb4
Hariprasad S 093108cb36 RDMA/iw_cxgb4: Always wake up waiter in c4iw_peer_abort_intr()
Currently c4iw_peer_abort_intr() does not wake up the waiter if the
endpoint state indicates we're using MPAv2 and we're currently trying to
connect. This behavior was introduced by commit 7c0a33d611 ("RDMA/cxgb4:
Don't wakeup threads for MPAv2").

However, this original fix is flawed because it introduces a race that
can cause a deadlock of the iwarp stack.  Here is the race:

->local side sets up an active offload connection.

->local side sends MPA_START request.

->peer sends MPA_START response.

->local side ingress cpl thread begins processing the MPA_START response,
but before it changes the state from MPA_REQ_SENT to FPDU_MODE:

->peer sends a RST which results in an ABORT_REQ_RSS.  This triggers
c4iw_peer_abort_intr() which sees the state in MPA_REQ_SENT and since
mpa_rev is 2, it will avoid waking up the endpoint with -ECONNRESET,
assuming the stack will re-attempt the connection using MPAv1.

->Meanwhile, the cpl thread moves the state to FPDU_MODE and calls
c4iw_modify_rc_qp() which calls rdma_init() which sends a RI_WR/INIT WR
to firmware.  But since HW sent an abort, FW correctly drops the RI_WR/INIT
WR.

->So the cpl thread is stuck waiting for a reply and cannot process the
ABORT_REQ_RSS cpl sitting in its input queue. Thus everything comes to a
halt because no more ingress cpls are processed by the stack...

The correct fix for the issue is to always do the wake up in
c4iw_peer_abort_intr() but reinitialize the wait object in c4iw_reconnect().
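
The race and the fix can be illustrated with a small single-threaded
user-space model. This is only a sketch, not the driver code: the struct
layouts and the names wait_init(), wake_up_waiter(), peer_abort_intr_old(),
peer_abort_intr_fixed(), reconnect() and simulate_race() are hypothetical
stand-ins, and the MPAv1 retry bookkeeping is left out. It assumes that,
once firmware drops the RI_WR/INIT WR, the abort path is the only thing
that can still wake the waiter.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the driver's endpoint state. */
enum ep_state { MPA_REQ_SENT, FPDU_MODE };

struct wait_obj {
	bool done;	/* set once the waiter has been woken */
	int ret;	/* status handed to the waiter */
};

struct endpoint {
	enum ep_state state;
	int mpa_rev;
	struct wait_obj wait;
};

void wait_init(struct wait_obj *w)
{
	w->done = false;
	w->ret = 0;
}

void wake_up_waiter(struct wait_obj *w, int ret)
{
	w->ret = ret;
	w->done = true;
}

/* Behavior described as flawed: no wake-up for MPAv2 while connecting. */
void peer_abort_intr_old(struct endpoint *ep)
{
	if (ep->mpa_rev == 2 && ep->state == MPA_REQ_SENT)
		return;
	wake_up_waiter(&ep->wait, -ECONNRESET);
}

/* Behavior described as the fix: always wake the waiter. */
void peer_abort_intr_fixed(struct endpoint *ep)
{
	wake_up_waiter(&ep->wait, -ECONNRESET);
}

/* The reconnect path starts over with a fresh wait object. */
void reconnect(struct endpoint *ep)
{
	wait_init(&ep->wait);
	ep->state = MPA_REQ_SENT;
}

/*
 * Replays the ordering from the race description. Returns true if the
 * cpl thread's wait for the INIT WR reply would ever complete.
 */
bool simulate_race(void (*abort_intr)(struct endpoint *))
{
	struct endpoint ep = { .state = MPA_REQ_SENT, .mpa_rev = 2 };

	wait_init(&ep.wait);
	abort_intr(&ep);	/* RST arrives while still in MPA_REQ_SENT */
	ep.state = FPDU_MODE;	/* cpl thread finishes the MPA_START response */
	/* Firmware drops the INIT WR, so no reply will wake the waiter. */
	return ep.wait.done;
}
```

In this model simulate_race(peer_abort_intr_old) leaves the waiter
blocked, while simulate_race(peer_abort_intr_fixed) wakes it with
-ECONNRESET. reconnect() then clears that stale status so that a later
wait on the same endpoint does not return immediately.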

Fixes: 7c0a33d611 ("RDMA/cxgb4: Don't wakeup threads for MPAv2")
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13 19:38:10 -04:00
..
cm.c RDMA/iw_cxgb4: Always wake up waiter in c4iw_peer_abort_intr() 2016-05-13 19:38:10 -04:00
cq.c RDMA/iw_cxgb4: Fix bar2 virt addr calculation for T4 chips 2016-04-26 12:47:09 -04:00
device.c Merge branches 'nes', 'cxgb4' and 'iwpm' into k.o/for-4.6 2016-03-16 13:57:43 -04:00
ev.c InfiniBand/RDMA changes for 3.20 merge window: 2015-02-21 12:53:21 -08:00
id_table.c drivers/infiniband/hw: rename random32() to prandom_u32() 2013-05-07 18:38:27 -07:00
iw_cxgb4.h RDMA/iw_cxgb4: stop_ep_timer() after MPA negotiation 2016-05-13 19:38:06 -04:00
Kconfig RDMA/cxgb4: Update Kconfig to include Chelsio T5 adapter 2014-04-28 17:29:41 -07:00
Makefile RDMA/cxgb4: Remove kfifo usage 2012-05-18 13:22:36 -07:00
mem.c RDMA/iw_cxgb4: set the correct FID value in DSGL commands 2016-05-13 19:38:05 -04:00
provider.c iw_cxgb4: initialize ibdev.iwcm->ifname for port mapping 2016-04-26 12:46:54 -04:00
qp.c RDMA/iw_cxgb4: Fix bar2 virt addr calculation for T4 chips 2016-04-26 12:47:09 -04:00
resource.c RDMA/cxgb4: Add missing debug stats 2014-04-11 11:36:09 -07:00
t4.h iw_cxgb4: Pass qid range to user space driver 2015-12-24 00:17:30 -05:00
t4fw_ri_api.h cxgb4, iw_cxgb4: move delayed ack macro definitions 2016-03-22 00:25:05 -07:00
user.h iw_cxgb4: Pass qid range to user space driver 2015-12-24 00:17:30 -05:00