mirror of https://github.com/edk2-porting/linux-next.git synced 2024-12-25 21:54:06 +08:00
linux-next/drivers/infiniband/sw/rxe
Yonatan Cohen 002e062e13 IB/rxe: Fix handling of erroneous WR
To correctly handle an erroneous WR, this fix does the following:
1. Make sure the bad WQE causes a user completion event.
2. Call rxe_completer to handle the erred WQE.

Before the fix, when rxe_requester found a bad WQE, it changed its
status to IB_WC_LOC_PROT_ERR and exited with 0 for non-RC QPs.

If this was the 1st WQE then there would be no ACK to invoke the
completer and this bad WQE would be stuck in the QP's send-q.

On top of that the requester exiting with 0 caused rxe_do_task to
endlessly invoke rxe_requester, resulting in a soft-lockup attached
below.
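
The lockup mechanism described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the driver code: `model_wqe`, `requester_before_fix`, `requester_after_fix`, `run_task` and the `max_iters` cap are all hypothetical names introduced here. The model assumes, as the text above states, that the task loop keeps invoking its handler for as long as the handler returns 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a send work queue element. */
struct model_wqe {
	bool bad;      /* WQE fails validation */
	bool consumed; /* WQE has been retired from the send queue */
};

/* Pre-fix behaviour: the bad WQE stays queued and 0 ("call again") is returned. */
static int requester_before_fix(struct model_wqe *wqe)
{
	if (wqe->consumed)
		return -1; /* nothing left to do: stop the task loop */
	if (wqe->bad)
		return 0;  /* no progress made, yet the loop is told to continue */
	wqe->consumed = true;
	return 0;
}

/* Post-fix behaviour: the bad WQE is handed off and non-zero is returned. */
static int requester_after_fix(struct model_wqe *wqe)
{
	if (wqe->consumed)
		return -1;
	if (wqe->bad) {
		wqe->consumed = true; /* stands in for the completer retiring it */
		return -1;            /* tell the task loop to stop */
	}
	wqe->consumed = true;
	return 0;
}

/*
 * Model of a run-until-non-zero task loop. max_iters exists only so the
 * model terminates; it is an assumption of this sketch, not driver behaviour.
 * Returns the number of handler calls made.
 */
static size_t run_task(int (*fn)(struct model_wqe *), struct model_wqe *wqe,
		       size_t max_iters)
{
	size_t calls = 0;

	assert(fn && wqe);
	while (calls < max_iters) {
		calls++;
		if (fn(wqe) != 0)
			break;
	}
	return calls;
}
```

With a bad first WQE, `run_task(requester_before_fix, &wqe, 1000)` uses all 1000 iterations and leaves the WQE queued, while `run_task(requester_after_fix, &wqe, 1000)` stops after a single call.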

In case the WQE was not the 1st and rxe_completer did get a chance to
handle the bad WQE, it did not cause a completion event since the WQE's
IB_SEND_SIGNALED flag was not set.

Setting the IB_SEND_SIGNALED flag on the erred WQE follows IBA spec
version 1.2.1, section 10.7.3.1 Signaled Completions.
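
The effect of forcing the signaled flag on an erred WQE can be sketched as follows. This is a simplified model, not the driver code: `sig_wqe`, `MODEL_SEND_SIGNALED`, `model_status`, `completer_posts_cqe` and `fail_wqe` are hypothetical names, and the model assumes the completion decision depends only on the per-WQE flag (a QP configured to signal every work request is not modelled).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag and status values; the kernel defines its own. */
#define MODEL_SEND_SIGNALED 0x1u

enum model_status { MODEL_WC_SUCCESS, MODEL_WC_LOC_PROT_ERR };

struct sig_wqe {
	unsigned int send_flags;
	enum model_status status;
};

/* In this model, a completion is reported only for signaled WQEs. */
static bool completer_posts_cqe(const struct sig_wqe *wqe)
{
	return (wqe->send_flags & MODEL_SEND_SIGNALED) != 0;
}

/*
 * Error path: record the error status and force the signaled flag so that
 * the erred WQE produces a completion even if it was posted unsignaled.
 */
static void fail_wqe(struct sig_wqe *wqe)
{
	assert(wqe);
	wqe->status = MODEL_WC_LOC_PROT_ERR;
	wqe->send_flags |= MODEL_SEND_SIGNALED;
}
```

In this model an unsignaled WQE yields no completion until `fail_wqe` has been called on it, which mirrors the case where a bad WQE reached the completer without raising a completion event.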

NMI watchdog: BUG: soft lockup - CPU#7 stuck for 22s!
[<ffffffffa0590145>] ? rxe_pool_get_index+0x35/0xb0 [rdma_rxe]
[<ffffffffa05952ec>] lookup_mem+0x3c/0xc0 [rdma_rxe]
[<ffffffffa0595534>] copy_data+0x1c4/0x230 [rdma_rxe]
[<ffffffffa058c180>] rxe_requester+0x9d0/0x1100 [rdma_rxe]
[<ffffffff8158e98a>] ? kfree_skbmem+0x5a/0x60
[<ffffffffa05962c9>] rxe_do_task+0x89/0xf0 [rdma_rxe]
[<ffffffffa05963e2>] rxe_run_task+0x12/0x30 [rdma_rxe]
[<ffffffffa059110a>] rxe_post_send+0x41a/0x550 [rdma_rxe]
[<ffffffff811ef922>] ? __kmalloc+0x182/0x200
[<ffffffff816ba512>] ? down_read+0x12/0x40
[<ffffffffa054bd32>] ib_uverbs_post_send+0x532/0x540 [ib_uverbs]
[<ffffffff815f8722>] ? tcp_sendmsg+0x402/0xb80
[<ffffffffa05453dc>] ib_uverbs_write+0x18c/0x3f0 [ib_uverbs]
[<ffffffff81623c2e>] ? inet_recvmsg+0x7e/0xb0
[<ffffffff8158764d>] ? sock_recvmsg+0x3d/0x50
[<ffffffff81215b87>] __vfs_write+0x37/0x140
[<ffffffff81216892>] vfs_write+0xb2/0x1b0
[<ffffffff81217ce5>] SyS_write+0x55/0xc0
[<ffffffff816bc672>] entry_SYSCALL_64_fastpath+0x1a/0xa

Fixes: 8700e3e7c4 ("Soft RoCE driver")
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-16 20:03:44 -05:00
Kconfig
Makefile
rxe_av.c IB/rxe: improved debug prints & code cleanup 2016-10-06 13:50:04 -04:00
rxe_comp.c IB/rxe: improved debug prints & code cleanup 2016-10-06 13:50:04 -04:00
rxe_cq.c
rxe_dma.c IB/{rxe,core,rdmavt}: Fix kernel crash for reg MR 2016-10-06 13:50:04 -04:00
rxe_hdr.h
rxe_icrc.c
rxe_loc.h IB/rxe: Properly honor max IRD value for rd/atomic. 2016-10-06 13:50:04 -04:00
rxe_mcast.c
rxe_mmap.c IB/rxe: improved debug prints & code cleanup 2016-10-06 13:50:04 -04:00
rxe_mr.c IB/rxe: improved debug prints & code cleanup 2016-10-06 13:50:04 -04:00
rxe_net.c IB/rxe: Fix kernel panic in UDP tunnel with GRO and RX checksum 2016-11-16 20:03:44 -05:00
rxe_net.h IB/rxe: improved debug prints & code cleanup 2016-10-06 13:50:04 -04:00
rxe_opcode.c
rxe_opcode.h
rxe_param.h
rxe_pool.c
rxe_pool.h
rxe_qp.c IB/rxe: improved debug prints & code cleanup 2016-10-06 13:50:04 -04:00
rxe_queue.c
rxe_queue.h
rxe_recv.c IB/rxe: improved debug prints & code cleanup 2016-10-06 13:50:04 -04:00
rxe_req.c IB/rxe: Fix handling of erroneous WR 2016-11-16 20:03:44 -05:00
rxe_resp.c IB/rxe: improved debug prints & code cleanup 2016-10-06 13:50:04 -04:00
rxe_srq.c
rxe_sysfs.c IB/rxe: improved debug prints & code cleanup 2016-10-06 13:50:04 -04:00
rxe_task.c
rxe_task.h
rxe_verbs.c IB/rxe: improved debug prints & code cleanup 2016-10-06 13:50:04 -04:00
rxe_verbs.h
rxe.c IB/rxe: improved debug prints & code cleanup 2016-10-06 13:50:04 -04:00
rxe.h IB/rxe: improved debug prints & code cleanup 2016-10-06 13:50:04 -04:00