linux/net/sunrpc
Chuck Lever 0bad47cada svcrdma: Preserve CB send buffer across retransmits
During each NFSv4 callback Call, an RDMA Send completion frees the
page that contains the RPC Call message. If the upper layer
determines that a retransmit is necessary, this is too soon.

One possible symptom: after a GARBAGE_ARGS response an NFSv4.1
callback request, the following BUG fires on the NFS server:

kernel: BUG: Bad page state in process kworker/0:2H  pfn:7d3ce2
kernel: page:ffffea001f4f3880 count:-2 mapcount:0 mapping:          (null) index:0x0
kernel: flags: 0x2fffff80000000()
kernel: raw: 002fffff80000000 0000000000000000 0000000000000000 fffffffeffffffff
kernel: raw: dead000000000100 dead000000000200 0000000000000000 0000000000000000
kernel: page dumped because: nonzero _refcount
kernel: Modules linked in: cts rpcsec_gss_krb5 ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm
ocfs2_nodemanager ocfs2_stackglue rpcrdm a ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad
rdma_cm ib_cm iw_cm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel
kvm irqbypass crct10dif_pc lmul crc32_pclmul ghash_clmulni_intel pcbc iTCO_wdt
iTCO_vendor_support aesni_intel crypto_simd glue_helper cryptd pcspkr lpc_ich i2c_i801
mei_me mf d_core mei raid0 sg wmi ioatdma ipmi_si ipmi_devintf ipmi_msghandler shpchp
acpi_power_meter acpi_pad nfsd nfs_acl lockd auth_rpcgss grace sunrpc ip_tables xfs
libcrc32c mlx4_en mlx4_ib mlx5_ib ib_core sd_mod sr_mod cdrom ast drm_kms_helper
syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci crc32c_intel libahci drm
mlx5_core igb libata mlx4_core dca i2c_algo_bit i2c_core nvme
kernel: ptp nvme_core pps_core dm_mirror dm_region_hash dm_log dm_mod dax
kernel: CPU: 0 PID: 11495 Comm: kworker/0:2H Not tainted 4.14.0-rc3-00001-g577ce48 #811
kernel: Hardware name: Supermicro Super Server/X10SRL-F, BIOS 1.0c 09/09/2015
kernel: Workqueue: ib-comp-wq ib_cq_poll_work [ib_core]
kernel: Call Trace:
kernel: dump_stack+0x62/0x80
kernel: bad_page+0xfe/0x11a
kernel: free_pages_check_bad+0x76/0x78
kernel: free_pcppages_bulk+0x364/0x441
kernel: ? ttwu_do_activate.isra.61+0x71/0x78
kernel: free_hot_cold_page+0x1c5/0x202
kernel: __put_page+0x2c/0x36
kernel: svc_rdma_put_context+0xd9/0xe4 [rpcrdma]
kernel: svc_rdma_wc_send+0x50/0x98 [rpcrdma]

This issue exists all the way back to v4.5, but refactoring and code
re-organization prevents this simple patch from applying to kernels
older than v4.12. The fix is the same, however, if someone needs to
backport it.

Reported-by: Ben Coddington <bcodding@redhat.com>
BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=314
Fixes: 5d252f90a8 ('svcrdma: Add class for RDMA backwards ... ')
Cc: stable@vger.kernel.org # v4.12
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-11-07 16:44:00 -05:00
..
auth_gss NFS client updates for Linux 4.13 2017-07-13 14:35:37 -07:00
xprtrdma svcrdma: Preserve CB send buffer across retransmits 2017-11-07 16:44:00 -05:00
addr.c replace strict_strto calls 2014-07-12 18:45:49 -04:00
auth_generic.c NFS client updates for Linux 4.9 2016-10-13 21:28:20 -07:00
auth_null.c sunrpc: remove dead codes of cr_magic in rpc_cred 2017-02-08 17:02:46 -05:00
auth_unix.c sunrpc: rename NFS_NGROUPS to UNX_NGROUPS for auth unix 2017-02-08 17:02:45 -05:00
auth.c sched/headers: Prepare to remove <linux/cred.h> inclusion from <linux/sched.h> 2017-03-02 08:42:31 +01:00
backchannel_rqst.c SUNRPC: Don't hold the transport lock when receiving backchannel data 2017-08-16 15:10:16 -04:00
cache.c NFS client updates for Linux 4.11 2017-03-01 16:10:30 -08:00
clnt.c SUNRPC: remove some dead code. 2017-09-06 12:31:15 -04:00
debugfs.c sunrpc: record rpc client pointer in seq->private directly 2017-02-08 17:02:47 -05:00
Kconfig svcrdma: Introduce local rdma_rw API helpers 2017-04-25 17:25:55 -04:00
Makefile SUNRPC: Add a structure to track multiple transports 2016-02-05 18:48:54 -05:00
netns.h netns: make struct pernet_operations::id unsigned int 2016-11-18 10:59:15 -05:00
rpc_pipe.c fs: Replace CURRENT_TIME with current_time() for inode timestamps 2016-09-27 21:06:21 -04:00
rpcb_clnt.c sunrpc: mark all struct rpc_procinfo instances as const 2017-07-13 15:57:57 -04:00
sched.c sunrpc: don't check for failure from mempool_alloc() 2017-04-20 13:44:57 -04:00
socklib.c sunrpc: do not pull udp headers on receive 2016-04-11 15:31:33 -04:00
stats.c sunrpc: move pc_count out of struct svc_procinfo 2017-07-13 15:58:02 -04:00
sunrpc_syms.c SUNRPC: cleanup ida information when removing sunrpc module 2017-01-24 15:29:24 -05:00
sunrpc.h SUNRPC: track whether a request is coming from a loop-back interface. 2014-05-22 15:59:18 -04:00
svc_xprt.c sunrcp: make function _svc_create_xprt static 2017-11-07 16:43:57 -05:00
svc.c sunrpc: Const-ify struct sv_serv_ops 2017-08-24 22:13:50 -04:00
svcauth_unix.c sunrpc: rename NFS_NGROUPS to UNX_NGROUPS for auth unix 2017-02-08 17:02:45 -05:00
svcauth.c locking/atomic, kref: Implement kref_put_lock() 2017-01-18 10:03:29 +01:00
svcsock.c NFS client updates for Linux 4.14 2017-09-11 22:01:44 -07:00
sysctl.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
timer.c net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
xdr.c sunrpc: Fix xdr_init_decode_pages() documenting comment 2017-04-25 16:12:34 -04:00
xprt.c xprtrdma: Use xprt_pin_rqst in rpcrdma_reply_handler 2017-09-05 18:27:07 -04:00
xprtmultipath.c SUNRPC search xprt switch for sockaddr 2016-09-19 13:08:36 -04:00
xprtsock.c NFS-over-RDMA client updates for Linux 4.14 2017-09-05 15:16:04 -04:00