migration/rdma: Fix race on source

Fix a race where the destination might try and send the source a
WRID_READY before the source has done a post-recv for it.

rdma_post_recv has to happen after the qp exists, and we're
OK since we've already called qemu_rdma_source_init that calls
qemu_alloc_qp.

This corresponds to:
https://bugzilla.redhat.com/show_bug.cgi?id=1285044

The race can be triggered by adding a few ms wait before this
post_recv_control (which was originally due to me turning on loads of
debug).

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20170717110936.23314-2-dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
This commit is contained in:
Dr. David Alan Gilbert 2017-07-17 12:09:31 +01:00 committed by Juan Quintela
parent e9277a19a1
commit 9cf2bab2ed

View File

@ -2365,6 +2365,12 @@ static int qemu_rdma_connect(RDMAContext *rdma, Error **errp)
caps_to_network(&cap);
ret = qemu_rdma_post_recv_control(rdma, RDMA_WRID_READY);
if (ret) {
ERROR(errp, "posting second control recv");
goto err_rdma_source_connect;
}
ret = rdma_connect(rdma->cm_id, &conn_param);
if (ret) {
perror("rdma_connect");
@ -2405,12 +2411,6 @@ static int qemu_rdma_connect(RDMAContext *rdma, Error **errp)
rdma_ack_cm_event(cm_event);
ret = qemu_rdma_post_recv_control(rdma, RDMA_WRID_READY);
if (ret) {
ERROR(errp, "posting second control recv!");
goto err_rdma_source_connect;
}
rdma->control_ready_expected = 1;
rdma->nb_sent = 0;
return 0;