linux/net
Jozsef Kadlecsik 8bb930c3a1 netfilter: ipset: fix race condition between swap/destroy and kernel side add/del/test
[ Upstream commit 28628fa952 ]

Linkui Xiao reported that there's a race condition when ipset swap and destroy is
called, which can lead to crash in add/del/test element operations. Swap then
destroy are usual operations to replace a set with another one in a production
system. The issue can in some cases be reproduced with the script:

ipset create hash_ip1 hash:net family inet hashsize 1024 maxelem 1048576
ipset add hash_ip1 172.20.0.0/16
ipset add hash_ip1 192.168.0.0/16
iptables -A INPUT -m set --match-set hash_ip1 src -j ACCEPT
while [ 1 ]
do
	# ... Ongoing traffic...
        ipset create hash_ip2 hash:net family inet hashsize 1024 maxelem 1048576
        ipset add hash_ip2 172.20.0.0/16
        ipset swap hash_ip1 hash_ip2
        ipset destroy hash_ip2
        sleep 0.05
done

In the race case the possible order of the operations are

	CPU0			CPU1
	ip_set_test
				ipset swap hash_ip1 hash_ip2
				ipset destroy hash_ip2
	hash_net_kadt

Swap replaces hash_ip1 with hash_ip2 and then destroy removes hash_ip2 which
is the original hash_ip1. ip_set_test was called on hash_ip1 and because destroy
removed it, hash_net_kadt crashes.

The fix is to force ip_set_swap() to wait for all readers to finish accessing the
old set pointers by calling synchronize_rcu().

The first version of the patch was written by Linkui Xiao <xiaolinkui@kylinos.cn>.

v2: synchronize_rcu() is moved into ip_set_swap() in order not to burden
    ip_set_destroy() unnecessarily when all sets are destroyed.
v3: Florian Westphal pointed out that all netfilter hooks run with rcu_read_lock() held
    and em_ipset.c wraps the entire ip_set_test() in rcu read lock/unlock pair.
    So there's no need to extend the rcu read locked area in ipset itself.

Closes: https://lore.kernel.org/all/69e7963b-e7f8-3ad0-210-7b86eebf7f78@netfilter.org/
Reported by: Linkui Xiao <xiaolinkui@kylinos.cn>
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-12-13 18:36:32 +01:00
..
6lowpan
9p 9p: v9fs_listxattr: fix %s null argument warning 2023-11-28 16:56:19 +00:00
802 mrp: introduce active flags to prevent UAF when applicant uninit 2022-12-31 13:14:42 +01:00
8021q vlan: move dev_put into vlan_dev_uninit 2023-12-08 08:48:02 +01:00
appletalk
atm atm: hide unused procfs functions 2023-06-09 10:32:26 +02:00
ax25 net: ax25: Fix deadlock caused by skb_recv_datagram in ax25_recvmsg 2022-06-22 14:22:01 +02:00
batman-adv batman-adv: Hold rtnl lock during MTU update via netlink 2023-08-30 16:18:18 +02:00
bluetooth Bluetooth: Fix double free in hci_conn_cleanup 2023-11-28 16:56:16 +00:00
bpf bpf: Move skb->len == 0 checks into __bpf_redirect 2022-12-31 13:14:11 +01:00
bpfilter
bridge netfilter: nf_tables: add and use BE register load-store helpers 2023-11-28 16:56:24 +00:00
caif net: caif: Fix use-after-free in cfusbl_device_notify() 2023-03-17 08:48:54 +01:00
can can: isotp: isotp_sendmsg(): fix TX state detection and wait behavior 2023-11-08 17:26:49 +01:00
ceph libceph: use kernel_connect() 2023-10-19 23:05:36 +02:00
core net: annotate data-races around sk->sk_dst_pending_confirm 2023-11-28 16:56:16 +00:00
dcb net: dcb: choose correct policy to parse DCB_ATTR_BCN 2023-08-11 15:13:53 +02:00
dccp net: inet: Retire port only listening_hash 2023-11-28 16:56:22 +00:00
dns_resolver
dsa net: dsa: tag_sja1105: fix MAC DA patching from meta frames 2023-07-23 13:47:30 +02:00
ethernet
ethtool ethtool: Fix uninitialized number of lanes 2023-05-17 11:50:18 +02:00
hsr hsr: Prevent use after free in prp_create_tagged_frame() 2023-11-20 11:08:28 +01:00
ieee802154 net: ieee802154: fix error return code in dgram_bind() 2022-11-03 23:59:14 +09:00
ife
ipv4 ipv4: igmp: fix refcnt uaf issue when receiving igmp query packet 2023-12-08 08:48:02 +01:00
ipv6 net: inet: Retire port only listening_hash 2023-11-28 16:56:22 +00:00
iucv net/iucv: Fix size of interrupt data 2023-03-22 13:31:28 +01:00
kcm kcm: Fix error handling for SOCK_DGRAM in kcm_sendmsg(). 2023-09-19 12:23:04 +02:00
key net: af_key: fix sadb_x_filter validation 2023-08-26 14:23:32 +02:00
l2tp ipv4, ipv6: Fix handling of transhdrlen in __ip{,6}_append_data() 2023-10-10 21:59:07 +02:00
l3mdev l3mdev: l3mdev_master_upper_ifindex_by_index_rcu should be using netdev_master_upper_dev_get_rcu 2022-04-27 14:38:53 +02:00
lapb
llc llc: verify mac len before reading mac header 2023-11-20 11:08:28 +01:00
mac80211 wifi: mac80211: don't return unset power in ieee80211_get_tx_power() 2023-11-28 16:56:15 +00:00
mac802154 mac802154: fix missing INIT_LIST_HEAD in ieee802154_if_add() 2022-12-14 11:37:25 +01:00
mctp mctp: perform route lookups under a RCU read-side lock 2023-10-25 11:58:59 +02:00
mpls net: mpls: fix stale pointer if allocation fails during device rename 2023-02-22 12:57:09 +01:00
mptcp net: inet: Retire port only listening_hash 2023-11-28 16:56:22 +00:00
ncsi Revert ncsi: Propagate carrier gain/loss events to the NCSI controller 2023-11-28 16:56:33 +00:00
netfilter netfilter: ipset: fix race condition between swap/destroy and kernel side add/del/test 2023-12-13 18:36:32 +01:00
netlabel netlabel: fix shift wrapping bug in netlbl_catmap_setlong() 2023-09-19 12:22:29 +02:00
netlink netlink: Add __sock_i_ino() for __netlink_diag_dump(). 2023-07-23 13:46:56 +02:00
netrom netrom: Deny concurrent connect(). 2023-09-19 12:22:35 +02:00
nfc nfc: nci: fix possible NULL pointer dereference in send_acknowledge() 2023-10-25 11:58:55 +02:00
nsh net: nsh: Use correct mac_offset to unwind gso skb in nsh_gso_segment() 2023-05-24 17:36:51 +01:00
openvswitch net: openvswitch: fix possible memory leak in ovs_meter_cmd_set() 2023-02-22 12:57:09 +01:00
packet net/packet: annotate data-races around tp->status 2023-08-16 18:22:01 +02:00
phonet phonet: refcount leak in pep_sock_accep 2022-01-11 15:35:16 +01:00
psample
qrtr net: qrtr: Fix an uninit variable access bug in qrtr_tx_resume() 2023-04-20 12:13:53 +02:00
rds net: prevent address rewrite in kernel_bind() 2023-10-19 23:05:33 +02:00
rfkill net: rfkill: gpio: prevent value glitch during probe 2023-10-25 11:58:57 +02:00
rose net/rose: Fix to not accept on connected socket 2023-02-22 12:57:02 +01:00
rxrpc rxrpc: Fix hard call timeout units 2023-05-17 11:50:17 +02:00
sched net: sched: cls_u32: Fix allocation size in u32_init() 2023-11-08 17:26:45 +01:00
sctp sctp: update hb timer immediately after users change hb_interval 2023-10-10 21:59:08 +02:00
smc net/smc: avoid data corruption caused by decline 2023-12-03 07:31:22 +01:00
strparser
sunrpc svcrdma: Drop connection after an RDMA Read error 2023-11-28 16:56:29 +00:00
switchdev
tipc tipc: Fix kernel-infoleak due to uninitialized TLV value 2023-11-28 16:56:23 +00:00
tls net/tls: do not free tls_rec on async operation in bpf_exec_tx_verdict() 2023-09-19 12:23:04 +02:00
unix af_unix: fix use-after-free in unix_stream_read_actor() 2023-11-28 16:56:24 +00:00
vmw_vsock vsock/virtio: initialize the_virtio_vsock before using VQs 2023-11-08 17:26:37 +01:00
wireless wifi: cfg80211: avoid leaking stack data into trace 2023-10-25 11:59:00 +02:00
x25 net/x25: Fix to not accept on connected socket 2023-02-09 11:26:40 +01:00
xdp xsk: Fix xsk_diag use-after-free error during socket cleanup 2023-09-19 12:22:58 +02:00
xfrm xfrm: interface: use DEV_STATS_INC() 2023-10-25 11:58:56 +02:00
compat.c
devres.c
Kconfig Remove DECnet support from kernel 2023-06-21 15:59:15 +02:00
Makefile Remove DECnet support from kernel 2023-06-21 15:59:15 +02:00
socket.c net: prevent address rewrite in kernel_bind() 2023-10-19 23:05:33 +02:00
sysctl_net.c