linux/net/smc
Wen Gu 20c9398d33 net/smc: Resolve the race between SMC-R link access and clear
We encountered some crashes caused by a race between SMC-R
link access and link clear, which is triggered by abnormal
link group termination such as a port error.

Here is an example of this kind of crash:

 BUG: kernel NULL pointer dereference, address: 0000000000000000
 Workqueue: smc_hs_wq smc_listen_work [smc]
 RIP: 0010:smc_llc_flow_initiate+0x44/0x190 [smc]
 Call Trace:
  <TASK>
  ? __smc_buf_create+0x75a/0x950 [smc]
  smcr_lgr_reg_rmbs+0x2a/0xbf [smc]
  smc_listen_work+0xf72/0x1230 [smc]
  ? process_one_work+0x25c/0x600
  process_one_work+0x25c/0x600
  worker_thread+0x4f/0x3a0
  ? process_one_work+0x600/0x600
  kthread+0x15d/0x1a0
  ? set_kthread_struct+0x40/0x40
  ret_from_fork+0x1f/0x30
  </TASK>

smc_listen_work()                     __smc_lgr_terminate()
---------------------------------------------------------------
                                    | smc_lgr_free()
                                    |  |- smcr_link_clear()
                                    |      |- memset(lnk, 0)
smc_listen_rdma_reg()               |
 |- smcr_lgr_reg_rmbs()             |
     |- smc_llc_flow_initiate()     |
         |- access lnk->lgr (panic) |

These crashes share the same cause: SMC-R link resources are
cleared while some functions are still accessing them.
This patch tries to fix the issue by introducing a reference
count for SMC-R links and ensuring that the sensitive resources
of a link are not cleared until its reference count reaches zero.

The operations on the SMC-R link reference count can be
summarized as follows:

object          [hold or initialized as 1]         [put]
--------------------------------------------------------------------
links           smcr_link_init()                   smcr_link_clear()
connections     smc_conn_create()                  smc_conn_free()

In this way, an SMC-R link is cleared only after all the smc
connections above it have been freed, which avoids unsafe
references to SMC-R links.
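
The hold/put scheme in the table above can be sketched in userspace C as
follows. This is only an illustration under stated assumptions, not the
code from the patch: struct demo_link and every demo_* function are
hypothetical names, C11 atomics stand in for the kernel's reference
counting primitives, and the lgr_id field stands in for sensitive link
state such as lnk->lgr.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an SMC-R link; not the kernel's struct smc_link. */
struct demo_link {
	atomic_int refcnt;	/* starts at 1, owned by the link itself */
	int lgr_id;		/* stands in for sensitive state such as lnk->lgr */
	bool cleared;		/* set once the sensitive state has been wiped */
};

/* Counterpart of "initialized as 1" in the table (smcr_link_init() row). */
static void demo_link_init(struct demo_link *lnk, int lgr_id)
{
	atomic_init(&lnk->refcnt, 1);
	lnk->lgr_id = lgr_id;
	lnk->cleared = false;
}

/* Actual teardown; runs only when the last reference is dropped. */
static void demo_link_release(struct demo_link *lnk)
{
	lnk->lgr_id = 0;
	lnk->cleared = true;
}

/* A user of the link (for example a connection) takes a reference. */
static void demo_link_hold(struct demo_link *lnk)
{
	atomic_fetch_add(&lnk->refcnt, 1);
}

/* Drop a reference; whoever drops the last one performs the teardown. */
static void demo_link_put(struct demo_link *lnk)
{
	/* atomic_fetch_sub() returns the value held before the decrement. */
	if (atomic_fetch_sub(&lnk->refcnt, 1) == 1)
		demo_link_release(lnk);
}

/* Termination path: drops only the initial reference instead of wiping
 * the link immediately, so remaining users keep a valid link.
 */
static void demo_link_clear(struct demo_link *lnk)
{
	demo_link_put(lnk);
}
```

With this shape, calling demo_link_clear() while a connection still holds
a reference leaves lgr_id intact; the state is wiped only when that
connection later calls demo_link_put().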

Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-13 13:14:53 +00:00
af_smc.c net/smc: Introduce a new conn->lgr validity check helper 2022-01-13 13:14:53 +00:00
Kconfig treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
Makefile net/smc: Introduce tracepoint for fallback 2021-11-01 13:39:14 +00:00
smc_cdc.c net/smc: Introduce a new conn->lgr validity check helper 2022-01-13 13:14:53 +00:00
smc_cdc.h net/smc: fix kernel panic caused by race of smc_sock 2021-12-28 12:42:45 +00:00
smc_clc.c net/smc: Introduce a new conn->lgr validity check helper 2022-01-13 13:14:53 +00:00
smc_clc.h net/smc: add v2 format of CLC decline message 2021-10-16 14:58:13 +01:00
smc_close.c net/smc: Keep smc_close_final rc during active close 2021-12-02 12:14:36 +00:00
smc_close.h net/smc: remove close abort worker 2019-10-22 11:23:44 -07:00
smc_core.c net/smc: Resolve the race between SMC-R link access and clear 2022-01-13 13:14:53 +00:00
smc_core.h net/smc: Resolve the race between SMC-R link access and clear 2022-01-13 13:14:53 +00:00
smc_diag.c net/smc: Introduce a new conn->lgr validity check helper 2022-01-13 13:14:53 +00:00
smc_ib.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next 2021-12-31 14:35:40 +00:00
smc_ib.h net/smc: Introduce net namespace support for linkgroup 2022-01-02 12:07:39 +00:00
smc_ism.c net: Don't include filter.h from net/sock.h 2021-12-29 08:48:14 -08:00
smc_ism.h net/smc: keep static copy of system EID 2021-09-14 12:49:10 +01:00
smc_llc.c net/smc: Print net namespace in log 2022-01-02 12:07:39 +00:00
smc_llc.h net/smc: extend LLC layer for SMC-Rv2 2021-10-16 14:58:13 +01:00
smc_netlink.c net/smc: add generic netlink support for system EID 2021-09-14 12:49:10 +01:00
smc_netlink.h net/smc: add support for user defined EIDs 2021-09-14 12:49:10 +01:00
smc_netns.h net/smc: introduce list of pnetids for Ethernet devices 2020-09-28 15:19:03 -07:00
smc_pnet.c net/smc: fix possible NULL deref in smc_pnet_add_eth() 2022-01-12 14:45:29 +00:00
smc_pnet.h net/smc: determine proposed ISM devices 2020-09-28 15:19:03 -07:00
smc_rx.c net/smc: Introduce tracepoints for tx and rx msg 2021-11-01 13:39:14 +00:00
smc_rx.h smc: add support for splice() 2018-05-04 11:45:06 -04:00
smc_stats.c net/smc: Fix ENODATA tests in smc_nl_get_fback_stats() 2021-06-21 12:16:58 -07:00
smc_stats.h net/smc: Make SMC statistics network namespace aware 2021-06-16 12:54:02 -07:00
smc_tracepoint.c net/smc: Introduce tracepoint for smcr link down 2021-11-01 13:39:14 +00:00
smc_tracepoint.h net/smc: Add net namespace for tracepoints 2022-01-02 12:07:39 +00:00
smc_tx.c net/smc: Introduce tracepoints for tx and rx msg 2021-11-01 13:39:14 +00:00
smc_tx.h net/smc: eliminate cursor read and write calls 2018-07-23 10:57:14 -07:00
smc_wr.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-12-30 12:12:12 -08:00
smc_wr.h net/smc: fix kernel panic caused by race of smc_sock 2021-12-28 12:42:45 +00:00
smc.h net/smc: Resolve the race between link group access and termination 2022-01-13 12:55:40 +00:00