linux/net
Jozsef Kadlecsik 28628fa952 netfilter: ipset: fix race condition between swap/destroy and kernel side add/del/test
Linkui Xiao reported that there's a race condition when ipset swap and destroy is
called, which can lead to crash in add/del/test element operations. Swap then
destroy are usual operations to replace a set with another one in a production
system. The issue can in some cases be reproduced with the script:

ipset create hash_ip1 hash:net family inet hashsize 1024 maxelem 1048576
ipset add hash_ip1 172.20.0.0/16
ipset add hash_ip1 192.168.0.0/16
iptables -A INPUT -m set --match-set hash_ip1 src -j ACCEPT
while [ 1 ]
do
	# ... Ongoing traffic...
        ipset create hash_ip2 hash:net family inet hashsize 1024 maxelem 1048576
        ipset add hash_ip2 172.20.0.0/16
        ipset swap hash_ip1 hash_ip2
        ipset destroy hash_ip2
        sleep 0.05
done

In the race case the possible order of the operations are

	CPU0			CPU1
	ip_set_test
				ipset swap hash_ip1 hash_ip2
				ipset destroy hash_ip2
	hash_net_kadt

Swap replaces hash_ip1 with hash_ip2 and then destroy removes hash_ip2 which
is the original hash_ip1. ip_set_test was called on hash_ip1 and because destroy
removed it, hash_net_kadt crashes.

The fix is to force ip_set_swap() to wait for all readers to finish accessing the
old set pointers by calling synchronize_rcu().

The first version of the patch was written by Linkui Xiao <xiaolinkui@kylinos.cn>.

v2: synchronize_rcu() is moved into ip_set_swap() in order not to burden
    ip_set_destroy() unnecessarily when all sets are destroyed.
v3: Florian Westphal pointed out that all netfilter hooks run with rcu_read_lock() held
    and em_ipset.c wraps the entire ip_set_test() in rcu read lock/unlock pair.
    So there's no need to extend the rcu read locked area in ipset itself.

Closes: https://lore.kernel.org/all/69e7963b-e7f8-3ad0-210-7b86eebf7f78@netfilter.org/
Reported by: Linkui Xiao <xiaolinkui@kylinos.cn>
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-11-14 16:16:21 +01:00
..
6lowpan
9p 9p/net: fix possible memory leak in p9_check_errors() 2023-10-27 12:44:13 +09:00
802 net: fill in MODULE_DESCRIPTION()s under net/802* 2023-10-28 11:29:28 +01:00
8021q net: fill in MODULE_DESCRIPTION()s under net/802* 2023-10-28 11:29:28 +01:00
appletalk appletalk: remove special handling code for ipddp 2023-10-13 17:59:32 -07:00
atm net: atm: Remove redundant check. 2023-10-23 08:45:25 +01:00
ax25 net: implement lockless SO_PRIORITY 2023-10-01 19:09:54 +01:00
batman-adv Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2023-08-24 10:51:39 -07:00
bluetooth This update includes the following changes: 2023-11-02 16:15:30 -10:00
bpf bpf: Add __bpf_kfunc_{start,end}_defs macros 2023-11-01 22:33:53 -07:00
bpfilter
bridge netfilter: nf_conntrack_bridge: initialize err to 0 2023-11-14 16:16:21 +01:00
caif
can Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2023-10-12 17:07:34 -07:00
ceph This update includes the following changes: 2023-11-02 16:15:30 -10:00
core net: gso_test: support CONFIG_MAX_SKB_FRAGS up to 45 2023-11-13 11:04:38 +00:00
dcb net: dcb: choose correct policy to parse DCB_ATTR_BCN 2023-08-01 21:07:46 -07:00
dccp dccp/tcp: Call security_inet_conn_request() after setting IPv6 addresses. 2023-11-02 12:56:03 +01:00
devlink netlink: specs: devlink: add forgotten port function caps enum values 2023-11-01 22:13:43 -07:00
dns_resolver
dsa net: dsa: Rename IFLA_DSA_MASTER to IFLA_DSA_CONDUIT 2023-10-24 13:08:14 -07:00
ethernet
ethtool ethtool: untangle the linkmode and ethtool headers 2023-10-20 12:47:33 +01:00
handshake Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2023-10-26 13:46:28 -07:00
hsr hsr: Prevent use after free in prp_create_tagged_frame() 2023-11-01 22:26:04 -07:00
ieee802154 sysctl-6.6-rc1 2023-08-29 17:39:15 -07:00
ife
ipv4 net: set SOCK_RCU_FREE before inserting socket into hashtable 2023-11-10 08:37:09 +00:00
ipv6 Including fixes from netfilter and bpf. 2023-11-09 17:09:35 -08:00
iucv s390: use control register bit defines 2023-09-19 13:26:57 +02:00
kcm net: kcm: fill in MODULE_DESCRIPTION() 2023-11-08 18:17:44 -08:00
key Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2023-08-18 12:44:56 -07:00
l2tp Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2023-10-05 13:16:47 -07:00
l3mdev
lapb
llc llc: verify mac len before reading mac header 2023-11-01 22:21:32 -07:00
mac80211 wireless-next patches for v6.7 2023-10-26 20:27:58 -07:00
mac802154
mctp mctp: perform route lookups under a RCU read-side lock 2023-10-10 19:43:22 -07:00
mpls networking: Update to register_net_sysctl_sz 2023-08-15 15:26:18 -07:00
mptcp This update includes the following changes: 2023-11-02 16:15:30 -10:00
ncsi ncsi: Propagate carrier gain/loss events to the NCSI controller 2023-09-18 07:06:05 +01:00
netfilter netfilter: ipset: fix race condition between swap/destroy and kernel side add/del/test 2023-11-14 16:16:21 +01:00
netlabel netlabel: Remove unused declaration netlbl_cipsov4_doi_free() 2023-08-02 12:28:22 -07:00
netlink netlink: fill in missing MODULE_DESCRIPTION() 2023-11-03 11:42:48 +00:00
netrom net: implement lockless SO_PRIORITY 2023-10-01 19:09:54 +01:00
nfc nfc: nci: fix possible NULL pointer dereference in send_acknowledge() 2023-10-16 17:34:53 -07:00
nsh
openvswitch net/sched: act_ct: Always fill offloading tuple iifidx 2023-11-08 17:47:08 -08:00
packet Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2023-10-12 17:07:34 -07:00
phonet
psample
qrtr
rds net: prevent address rewrite in kernel_bind() 2023-10-01 19:31:29 +01:00
rfkill net: rfkill: reduce data->mtx scope in rfkill_fop_open 2023-10-11 16:55:10 +02:00
rose net: implement lockless SO_PRIORITY 2023-10-01 19:09:54 +01:00
rxrpc rxrpc: Fix two connection reaping bugs 2023-11-01 22:28:55 -07:00
sched net_sched: sch_fq: better validate TCA_FQ_WEIGHTS and TCA_FQ_PRIOMAP 2023-11-08 18:30:21 -08:00
sctp Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2023-10-05 13:16:47 -07:00
smc net/smc: put sk reference if close work was canceled 2023-11-06 10:01:07 +00:00
strparser
sunrpc NFS client updates for Linux 6.7 2023-11-08 13:39:16 -08:00
switchdev
tipc tipc: Fix kernel-infoleak due to uninitialized TLV value 2023-11-13 11:05:47 +00:00
tls tls: don't reset prot->aad_size and prot->tail_size for TLS_HW 2023-10-23 10:15:09 -07:00
unix af_unix: fix use-after-free in unix_stream_read_actor() 2023-11-14 10:51:13 +01:00
vmw_vsock virtio/vsock: Fix uninit-value in virtio_transport_recv_pkt() 2023-11-07 18:56:06 -08:00
wireless wireless-next patches for v6.7 2023-10-26 20:27:58 -07:00
x25 net: implement lockless SO_PRIORITY 2023-10-01 19:09:54 +01:00
xdp xsk: Avoid starving the xsk further down the list 2023-10-24 11:55:36 +02:00
xfrm Including fixes from netfilter and bpf. 2023-11-09 17:09:35 -08:00
compat.c
devres.c
Kconfig net: add skb_segment kunit test 2023-10-11 10:39:01 +01:00
Kconfig.debug
Makefile
socket.c bpf: Add __bpf_hook_{start,end} macros 2023-11-01 22:33:53 -07:00
sysctl_net.c sysctl: Add size to register_net_sysctl function 2023-08-15 15:26:17 -07:00