linux/net/sunrpc
Chuck Lever 33849792cb xprtrdma: Detect unreachable NFS/RDMA servers more reliably
Current NFS clients rely on connection loss to determine when to
retransmit. In particular, for protocols like NFSv4, clients no
longer rely on RPC timeouts to drive retransmission: NFSv4 servers
are required to terminate a connection when they need a client to
retransmit pending RPCs.

When a server is no longer reachable, either because it has crashed
or because the network path has broken, the server cannot actively
terminate a connection. Thus NFS clients depend on transport-level
keepalive to determine when a connection must be replaced and
pending RPCs retransmitted.

However, RDMA RC connections do not have a native keepalive
mechanism. If an NFS/RDMA server crashes after a client has sent
RPCs successfully (an RC ACK has been received for all OTW RDMA
requests), there is no way for the client to know the connection is
moribund.

In addition, new RDMA requests are subject to the RPC-over-RDMA
credit limit. If the client has consumed all granted credits with
NFS traffic, it is not allowed to send another RDMA request until
the server replies. Thus it has no way to send a true keepalive when
the workload has already consumed all credits with pending RPCs.

To address this, forcibly disconnect a transport when an RPC times
out. This prevents moribund connections from stopping the
detection of failover or other configuration changes on the server.

Note that even if the connection is still good, retransmitting
any RPC will trigger a disconnect thanks to this logic in
xprt_rdma_send_request:

	/* Must suppress retransmit to maintain credits */
	if (req->rl_connect_cookie == xprt->connect_cookie)
		goto drop_connection;
	req->rl_connect_cookie = xprt->connect_cookie;

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-04-25 16:12:19 -04:00
..
auth_gss The nfsd update this round is mainly a lot of miscellaneous cleanups and 2017-02-28 15:39:09 -08:00
xprtrdma xprtrdma: Detect unreachable NFS/RDMA servers more reliably 2017-04-25 16:12:19 -04:00
addr.c replace strict_strto calls 2014-07-12 18:45:49 -04:00
auth_generic.c NFS client updates for Linux 4.9 2016-10-13 21:28:20 -07:00
auth_null.c sunrpc: remove dead codes of cr_magic in rpc_cred 2017-02-08 17:02:46 -05:00
auth_unix.c sunrpc: rename NFS_NGROUPS to UNX_NGROUPS for auth unix 2017-02-08 17:02:45 -05:00
auth.c sched/headers: Prepare to remove <linux/cred.h> inclusion from <linux/sched.h> 2017-03-02 08:42:31 +01:00
backchannel_rqst.c SUNRPC: Refactor rpc_xdr_buf_init() 2016-09-19 13:08:37 -04:00
cache.c NFS client updates for Linux 4.11 2017-03-01 16:10:30 -08:00
clnt.c NFSv4: Set the connection timeout to match the lease period 2017-02-09 14:15:16 -05:00
debugfs.c sunrpc: record rpc client pointer in seq->private directly 2017-02-08 17:02:47 -05:00
Kconfig rpcrdma: Merge svcrdma and xprtrdma modules into one 2015-06-04 16:56:02 -04:00
Makefile SUNRPC: Add a structure to track multiple transports 2016-02-05 18:48:54 -05:00
netns.h netns: make struct pernet_operations::id unsigned int 2016-11-18 10:59:15 -05:00
rpc_pipe.c fs: Replace CURRENT_TIME with current_time() for inode timestamps 2016-09-27 21:06:21 -04:00
rpcb_clnt.c SUNRPC: Use the multipath iterator to assign a transport to each task 2016-02-05 18:48:55 -05:00
sched.c SUNRPC: Separate buffer pointers for RPC Call and Reply messages 2016-09-19 13:08:37 -04:00
socklib.c sunrpc: do not pull udp headers on receive 2016-04-11 15:31:33 -04:00
stats.c SUNRPC: Proper metric accounting when RPC is not transmitted 2016-11-29 16:45:44 -05:00
sunrpc_syms.c SUNRPC: cleanup ida information when removing sunrpc module 2017-01-24 15:29:24 -05:00
sunrpc.h SUNRPC: track whether a request is coming from a loop-back interface. 2014-05-22 15:59:18 -04:00
svc_xprt.c Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-02-20 13:23:30 -08:00
svc.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/signal.h> 2017-03-02 08:42:29 +01:00
svcauth_unix.c sunrpc: rename NFS_NGROUPS to UNX_NGROUPS for auth unix 2017-02-08 17:02:45 -05:00
svcauth.c locking/atomic, kref: Implement kref_put_lock() 2017-01-18 10:03:29 +01:00
svcsock.c SUNRPC/backchanel: set XPT_CONG_CTRL flag for bc xprt 2017-03-09 15:20:46 -05:00
sysctl.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
timer.c net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
xdr.c SUNRPC: Add a helper function xdr_stream_decode_string_dup() 2017-02-21 16:56:16 -05:00
xprt.c sunrpc: Export xprt_force_disconnect() 2017-04-25 16:12:16 -04:00
xprtmultipath.c SUNRPC search xprt switch for sockaddr 2016-09-19 13:08:36 -04:00
xprtsock.c NFS client updates for Linux 4.11 2017-03-01 16:10:30 -08:00