linux/drivers/infiniband/hw
Michael J. Ruhl b4a4957d3d IB/hfi1: Fix destroy_qp hang after a link down
rvt_destroy_qp() cannot complete until all in-process packets have
been released from the underlying hardware.  If a link down event
occurs, an application can hang with a kernel stack similar to:

cat /proc/<app PID>/stack
 quiesce_qp+0x178/0x250 [hfi1]
 rvt_reset_qp+0x23d/0x400 [rdmavt]
 rvt_destroy_qp+0x69/0x210 [rdmavt]
 ib_destroy_qp+0xba/0x1c0 [ib_core]
 nvme_rdma_destroy_queue_ib+0x46/0x80 [nvme_rdma]
 nvme_rdma_free_queue+0x3c/0xd0 [nvme_rdma]
 nvme_rdma_destroy_io_queues+0x88/0xd0 [nvme_rdma]
 nvme_rdma_error_recovery_work+0x52/0xf0 [nvme_rdma]
 process_one_work+0x17a/0x440
 worker_thread+0x126/0x3c0
 kthread+0xcf/0xe0
 ret_from_fork+0x58/0x90
 0xffffffffffffffff

quiesce_qp() waits until all outstanding packets have been freed.
This wait should be momentary.  During a link down event, the cleanup
handling does not ensure that all packets caught by the link down are
flushed properly.

This happens because the freeze path and the link down event are
handled the same way, which is not correct.  The freeze path
waits until the HFI is unfrozen and then restarts PIO.  A link down
is not a freeze event.  The link down path cannot restart the PIO
until the link is restored.  If the PIO path is restarted before the
link comes up, the application (QP) using the PIO path will hang
(until the link is restored).

Fix by separating the link down path from the freeze path and using
the link down path for link down events.
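
The separation described above can be pictured with a small userspace
model.  This is only a sketch under stated assumptions: struct pio_model
and every model_*() function are hypothetical names invented for
illustration and are not the hfi1 driver's code.  It shows the one
behavioral difference the message describes: the freeze path restarts
PIO on unfreeze, while the link down path flushes pending packets and
leaves PIO stopped until the link comes back.

```c
#include <stdbool.h>

/* Hypothetical toy model; none of these names exist in the hfi1 driver. */
struct pio_model {
	bool link_up;      /* current link state */
	bool frozen;       /* device freeze state */
	bool pio_enabled;  /* whether the PIO send path is running */
	int  pending;      /* packets still held by the "hardware" */
};

/* Shared cleanup: stop PIO and flush every packet caught by the event. */
static void model_flush(struct pio_model *m)
{
	m->pio_enabled = false;
	m->pending = 0;
}

/* Freeze path: flush now, restart PIO as soon as the device unfreezes. */
void model_freeze(struct pio_model *m)
{
	m->frozen = true;
	model_flush(m);
}

void model_unfreeze(struct pio_model *m)
{
	m->frozen = false;
	m->pio_enabled = true;
}

/* Link down path: flush now, but do NOT restart PIO here. */
void model_link_down(struct pio_model *m)
{
	m->link_up = false;
	model_flush(m);
}

/* PIO is only restarted once the link has actually been restored. */
void model_link_up(struct pio_model *m)
{
	m->link_up = true;
	m->pio_enabled = true;
}

/* Stand-in for the condition a quiesce-style wait would poll. */
bool model_quiesced(const struct pio_model *m)
{
	return m->pending == 0;
}
```

In this model, a caller that sees a link down with packets outstanding
calls model_link_down() and finds model_quiesced() true right away, so a
destroy-style wait would not block on the link returning.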

Close a race condition in sc_disable() by acquiring both the progress
and release locks.
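
As a rough illustration of why holding both locks closes such a race,
here is a minimal userspace sketch using POSIX mutexes.  It assumes a
send side guarded by one lock and a completion side guarded by another.
struct send_ctx_model, the ctx_*() functions, and the lock ordering are
all assumptions made for this example, not the driver's implementation.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical model of a send context with two independent locks. */
struct send_ctx_model {
	pthread_mutex_t progress_lock; /* guards the send side */
	pthread_mutex_t release_lock;  /* guards the completion side */
	bool enabled;
	unsigned int sent;             /* written under progress_lock */
	unsigned int released;         /* written under release_lock */
};

void ctx_init(struct send_ctx_model *c)
{
	pthread_mutex_init(&c->progress_lock, NULL);
	pthread_mutex_init(&c->release_lock, NULL);
	c->enabled = true;
	c->sent = 0;
	c->released = 0;
}

/* Sender: only queues a packet while the context is enabled. */
bool ctx_send(struct send_ctx_model *c)
{
	bool ok;

	pthread_mutex_lock(&c->progress_lock);
	ok = c->enabled;
	if (ok)
		c->sent++;
	pthread_mutex_unlock(&c->progress_lock);
	return ok;
}

/* Completion: retires one packet while the context is enabled. */
void ctx_release(struct send_ctx_model *c)
{
	pthread_mutex_lock(&c->release_lock);
	if (c->enabled)
		c->released++;
	pthread_mutex_unlock(&c->release_lock);
}

/*
 * Disable: takes BOTH locks (in an assumed fixed order), so neither a
 * sender nor a completion can run while the state is torn down.
 */
void ctx_disable(struct send_ctx_model *c)
{
	pthread_mutex_lock(&c->progress_lock);
	pthread_mutex_lock(&c->release_lock);
	c->enabled = false;
	c->released = c->sent; /* flush: treat everything as retired */
	pthread_mutex_unlock(&c->release_lock);
	pthread_mutex_unlock(&c->progress_lock);
}
```

If ctx_disable() held only one of the two locks, the other side could
still be updating its counter while the flush runs, which is the kind of
window the message describes closing.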

Close a race condition in sc_stop() by moving the setting of the flag
bits under the alloc lock.

Cc: <stable@vger.kernel.org> # 4.9.x+
Fixes: 7724105686 ("IB/hfi1: add driver files")
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-20 19:24:51 -06:00
bnxt_re bnxt_re: Fix couple of memory leaks that could lead to IOMMU call traces 2018-09-05 16:08:41 -06:00
cxgb3 RDMA/providers: Remove pointless functions 2018-07-30 20:31:54 -06:00
cxgb4 iw_cxgb4: only allow 1 flush on user qps 2018-09-04 15:07:56 -06:00
hfi1 IB/hfi1: Fix destroy_qp hang after a link down 2018-09-20 19:24:51 -06:00
hns Merge branch 'linus/master' into rdma.git for-next 2018-08-16 14:21:29 -06:00
i40iw RDMA/providers: Remove pointless functions 2018-07-30 20:31:54 -06:00
mlx4 RDMA/mlx4: Ensure that maximal send/receive SGE less than supported by HW 2018-09-06 13:16:12 -06:00
mlx5 mm, oom: distinguish blockable mode for mmu notifiers 2018-08-22 10:52:44 -07:00
mthca RDMA/providers: Fix return value from create_srq callbacks 2018-07-30 20:29:45 -06:00
nes RDMA/providers: Remove pointless functions 2018-07-30 20:31:54 -06:00
ocrdma RDMA/providers: Remove pointless functions 2018-07-30 20:31:54 -06:00
qedr Linux 4.18 2018-08-16 13:12:00 -06:00
qib RDMA: Constify the argument of the work request conversion functions 2018-07-30 20:00:20 -06:00
usnic RDMA, core and ULPs: Declare ib_post_send() and ib_post_recv() arguments const 2018-07-30 20:09:34 -06:00
vmw_pvrdma RDMA/providers: Remove pointless functions 2018-07-30 20:31:54 -06:00
Makefile License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00