linux/net
Wen Gu 3511227167 net/smc: Reset connection when trying to use SMCRv2 fails.
We found a crash when using SMCRv2 with 2 Mellanox ConnectX-4. It
can be reproduced by:

- smc_run nginx
- smc_run wrk -t 32 -c 500 -d 30 http://<ip>:<port>

 BUG: kernel NULL pointer dereference, address: 0000000000000014
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 8000000108713067 P4D 8000000108713067 PUD 151127067 PMD 0
 Oops: 0000 [#1] PREEMPT SMP PTI
 CPU: 4 PID: 2441 Comm: kworker/4:249 Kdump: loaded Tainted: G        W   E      6.4.0-rc1+ #42
 Workqueue: smc_hs_wq smc_listen_work [smc]
 RIP: 0010:smc_clc_send_confirm_accept+0x284/0x580 [smc]
 RSP: 0018:ffffb8294b2d7c78 EFLAGS: 00010a06
 RAX: ffff8f1873238880 RBX: ffffb8294b2d7dc8 RCX: 0000000000000000
 RDX: 00000000000000b4 RSI: 0000000000000001 RDI: 0000000000b40c00
 RBP: ffffb8294b2d7db8 R08: ffff8f1815c5860c R09: 0000000000000000
 R10: 0000000000000400 R11: 0000000000000000 R12: ffff8f1846f56180
 R13: ffff8f1815c5860c R14: 0000000000000001 R15: 0000000000000001
 FS:  0000000000000000(0000) GS:ffff8f1aefd00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000014 CR3: 00000001027a0001 CR4: 00000000003706e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  <TASK>
  ? mlx5_ib_map_mr_sg+0xa1/0xd0 [mlx5_ib]
  ? smcr_buf_map_link+0x24b/0x290 [smc]
  ? __smc_buf_create+0x4ee/0x9b0 [smc]
  smc_clc_send_accept+0x4c/0xb0 [smc]
  smc_listen_work+0x346/0x650 [smc]
  ? __schedule+0x279/0x820
  process_one_work+0x1e5/0x3f0
  worker_thread+0x4d/0x2f0
  ? __pfx_worker_thread+0x10/0x10
  kthread+0xe5/0x120
  ? __pfx_kthread+0x10/0x10
  ret_from_fork+0x2c/0x50
  </TASK>

During the CLC handshake, server sequentially tries available SMCRv2
and SMCRv1 devices in smc_listen_work().

If an SMCRv2 device is found. SMCv2 based link group and link will be
assigned to the connection. Then assumed that some buffer assignment
errors happen later in the CLC handshake, such as RMB registration
failure, server will give up SMCRv2 and try SMCRv1 device instead. But
the resources assigned to the connection won't be reset.

When server tries SMCRv1 device, the connection creation process will
be executed again. Since conn->lnk has been assigned when trying SMCRv2,
it will not be set to the correct SMCRv1 link in
smcr_lgr_conn_assign_link(). So in such situation, conn->lgr points to
correct SMCRv1 link group but conn->lnk points to the SMCRv2 link
mistakenly.

Then in smc_clc_send_confirm_accept(), conn->rmb_desc->mr[link->link_idx]
will be accessed. Since the link->link_idx is not correct, the related
MR may not have been initialized, so crash happens.

 | Try SMCRv2 device first
 |     |-> conn->lgr:	assign existed SMCRv2 link group;
 |     |-> conn->link:	assign existed SMCRv2 link (link_idx may be 1 in SMC_LGR_SYMMETRIC);
 |     |-> sndbuf & RMB creation fails, quit;
 |
 | Try SMCRv1 device then
 |     |-> conn->lgr:	create SMCRv1 link group and assign;
 |     |-> conn->link:	keep SMCRv2 link mistakenly;
 |     |-> sndbuf & RMB creation succeed, only RMB->mr[link_idx = 0]
 |         initialized.
 |
 | Then smc_clc_send_confirm_accept() accesses
 | conn->rmb_desc->mr[conn->link->link_idx, which is 1], then crash.
 v

This patch tries to fix this by cleaning conn->lnk before assigning
link. In addition, it is better to reset the connection and clean the
resources assigned if trying SMCRv2 failed in buffer creation or
registration.

Fixes: e49300a6bf ("net/smc: add listen processing for SMC-Rv2")
Link: https://lore.kernel.org/r/20220523055056.2078994-1-liuyacan@corp.netease.com/
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-19 08:54:04 +01:00
..
6lowpan 6lowpan: Remove redundant initialisation. 2023-03-29 08:22:52 +01:00
9p Including fixes from netfilter. 2023-05-05 19:12:01 -07:00
802
8021q vlan: fix a potential uninit-value in vlan_dev_hard_start_xmit() 2023-05-17 12:55:39 +01:00
appletalk
atm atm: hide unused procfs functions 2023-05-17 21:27:30 -07:00
ax25
batman-adv net: vlan: introduce skb_vlan_eth_hdr() 2023-04-23 14:16:44 +01:00
bluetooth Driver core changes for 6.4-rc1 2023-04-27 11:53:57 -07:00
bpf bpf: add test_run support for netfilter program type 2023-04-21 11:34:50 -07:00
bpfilter
bridge bridge: always declare tunnel functions 2023-05-17 21:28:58 -07:00
caif net: caif: Fix use-after-free in cfusbl_device_notify() 2023-03-02 22:22:07 -08:00
can can: j1939: recvmsg(): allow MSG_CMSG_COMPAT flag 2023-05-15 22:24:46 +02:00
ceph Networking changes for 6.3. 2023-02-21 18:24:12 -08:00
core net: datagram: fix data-races in datagram_poll() 2023-05-10 19:06:49 -07:00
dcb net: dcb: add helper functions to retrieve PCP and DSCP rewrite maps 2023-01-20 09:33:22 +00:00
dccp netfilter: keep conntrack reference until IPsecv6 policy checks are done 2023-03-22 21:50:23 +01:00
devlink devlink: Fix crash with CONFIG_NET_NS=n 2023-05-16 19:57:52 -07:00
dns_resolver
dsa net: dsa: tag_ocelot: call only the relevant portion of __skb_vlan_pop() on TX 2023-04-23 14:16:45 +01:00
ethernet
ethtool ethtool: Fix uninitialized number of lanes 2023-05-03 09:13:20 +01:00
handshake net/handshake: Fix section mismatch in handshake_exit 2023-04-21 20:24:57 -07:00
hsr hsr: ratelimit only when errors are printed 2023-03-16 21:11:03 -07:00
ieee802154 net: ieee802154: remove an unnecessary null pointer check 2023-03-17 09:13:53 +01:00
ife
ipv4 tcp: fix possible sk_priority leak in tcp_v4_send_reset() 2023-05-12 10:05:50 +01:00
ipv6 erspan: get the proto with the md version for collect_md 2023-05-13 16:58:58 +01:00
iucv net/iucv: Fix size of interrupt data 2023-03-16 17:34:40 -07:00
kcm net/sock: Introduce trace_sk_data_ready() 2023-01-23 11:26:50 +00:00
key af_key: Reject optional tunnel/BEET mode templates in outbound policies 2023-05-10 07:04:51 +02:00
l2tp l2tp: generate correct module alias strings 2023-03-31 09:25:12 +01:00
l3mdev
lapb
llc net: deal with most data-races in sk_wait_event() 2023-05-10 10:03:32 +01:00
mac80211 wifi: mac80211: recalc chanctx mindef before assigning 2023-05-16 10:26:00 -07:00
mac802154 mac802154: Rename kfree_rcu() to kvfree_rcu_mightsleep() 2023-04-05 13:48:04 +00:00
mctp mctp: remove MODULE_LICENSE in non-modules 2023-03-09 23:06:21 -08:00
mpls net: mpls: fix stale pointer if allocation fails during device rename 2023-02-15 10:26:37 +00:00
mptcp Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2023-04-20 16:29:51 -07:00
ncsi net/ncsi: clear Tx enable mode when handling a Config required AEN 2023-04-28 09:35:33 +01:00
netfilter netfilter: nft_set_rbtree: fix null deref on element insertion 2023-05-17 14:18:28 +02:00
netlabel
netlink netlink: annotate accesses to nlk->cb_running 2023-05-10 09:28:38 +01:00
netrom netrom: Fix use-after-free caused by accept on already connected socket 2023-01-30 07:30:47 +00:00
nfc nfc: change order inside nfc_se_io error path 2023-03-07 13:37:05 -08:00
nsh net: nsh: Use correct mac_offset to unwind gso skb in nsh_gso_segment() 2023-05-15 08:40:27 +01:00
openvswitch net: openvswitch: fix race on port output 2023-04-07 19:42:53 -07:00
packet net: add vlan_get_protocol_and_depth() helper 2023-05-10 10:25:55 +01:00
phonet net/sock: Introduce trace_sk_data_ready() 2023-01-23 11:26:50 +00:00
psample
qrtr net: qrtr: Fix an uninit variable access bug in qrtr_tx_resume() 2023-04-13 09:35:30 +02:00
rds rds: rds_rm_zerocopy_callback() correct order for list_add_tail() 2023-02-13 09:33:39 +00:00
rfkill net: rfkill-gpio: Add explicit include for of.h 2023-04-06 20:36:27 +02:00
rose net/rose: Fix to not accept on connected socket 2023-01-28 00:19:57 -08:00
rxrpc Including fixes from netfilter. 2023-05-05 19:12:01 -07:00
sched net/sched: flower: fix error handler on replace 2023-05-05 10:01:31 +01:00
sctp sctp: delete the nested flexible array hmac 2023-04-21 08:19:30 +01:00
smc net/smc: Reset connection when trying to use SMCRv2 fails. 2023-05-19 08:54:04 +01:00
strparser
sunrpc nfsd-6.4 fixes: 2023-05-17 09:56:01 -07:00
switchdev
tipc tipc: check the bearer min mtu properly when setting it by netlink 2023-05-15 10:21:20 +01:00
tls tls: rx: strp: don't use GFP_KERNEL in softirq context 2023-05-19 08:37:37 +01:00
unix af_unix: Fix data races around sk->sk_shutdown. 2023-05-10 19:06:53 -07:00
vmw_vsock vsock: avoid to close connected socket after the timeout 2023-05-12 10:04:10 +01:00
wireless wifi: cfg80211: Drop entries with invalid BSSIDs in RNR 2023-05-16 10:09:50 -07:00
x25 net/x25: Fix to not accept on connected socket 2023-01-25 09:51:04 +00:00
xdp bpf-next-for-netdev 2023-04-13 16:43:38 -07:00
xfrm ipsec-2023-05-16 2023-05-16 20:52:35 -07:00
compat.c net/compat: Update msg_control_is_user when setting a kernel pointer 2023-04-14 11:09:27 +01:00
devres.c
Kconfig net/handshake: Add Kunit tests for the handshake consumer API 2023-04-19 18:48:48 -07:00
Kconfig.debug
Makefile net/handshake: Create a NETLINK service for handling handshake requests 2023-04-19 18:48:48 -07:00
socket.c net: annotate sk->sk_err write from do_recvmmsg() 2023-05-10 09:58:29 +01:00
sysctl_net.c