linux/include/net
Pablo Neira Ayuso 5f68718b34 netfilter: nf_tables: GC transaction API to avoid race with control plane
The set types rhashtable and rbtree use a GC worker to reclaim memory.
From system work queue, in periodic intervals, a scan of the table is
done.

The major caveat here is that the nft transaction mutex is not held.
This causes a race between control plane and GC when they attempt to
delete the same element.

We cannot grab the netlink mutex from the work queue, because the
control plane has to wait for the GC work queue in case the set is to be
removed, so we get following deadlock:

   cpu 1                                cpu2
     GC work                            transaction comes in , lock nft mutex
       `acquire nft mutex // BLOCKS
                                        transaction asks to remove the set
                                        set destruction calls cancel_work_sync()

cancel_work_sync will now block forever, because it is waiting for the
mutex the caller already owns.

This patch adds a new API that deals with garbage collection in two
steps:

1) Lockless GC of expired elements sets on the NFT_SET_ELEM_DEAD_BIT
   so they are not visible via lookup. Annotate current GC sequence in
   the GC transaction. Enqueue GC transaction work as soon as it is
   full. If ruleset is updated, then GC transaction is aborted and
   retried later.

2) GC work grabs the mutex. If GC sequence has changed then this GC
   transaction lost race with control plane, abort it as it contains
   stale references to objects and let GC try again later. If the
   ruleset is intact, then this GC transaction deactivates and removes
   the elements and it uses call_rcu() to destroy elements.

Note that no elements are removed from GC lockless path, the _DEAD bit
is set and pointers are collected. GC catchall does not remove the
elements anymore too. There is a new set->dead flag that is set on to
abort the GC transaction to deal with set->ops->destroy() path which
removes the remaining elements in the set from commit_release, where no
mutex is held.

To deal with GC when mutex is held, which allows safe deactivate and
removal, add sync GC API which releases the set element object via
call_rcu(). This is used by rbtree and pipapo backends which also
perform garbage collection from control plane path.

Since element removal from sets can happen from control plane and
element garbage collection/timeout, it is necessary to keep the set
structure alive until all elements have been deactivated and destroyed.

We cannot do a cancel_work_sync or flush_work in nft_set_destroy because
its called with the transaction mutex held, but the aforementioned async
work queue might be blocked on the very mutex that nft_set_destroy()
callchain is sitting on.

This gives us the choice of ABBA deadlock or UaF.

To avoid both, add set->refs refcount_t member. The GC API can then
increment the set refcount and release it once the elements have been
free'd.

Set backends are adapted to use the GC transaction API in a follow up
patch entitled:

  ("netfilter: nf_tables: use gc transaction API in set backends")

This is joint work with Florian Westphal.

Fixes: cfed7e1b1f ("netfilter: nf_tables: add set garbage collection helpers")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-08-10 08:25:16 +02:00
..
9p 9p: Add additional debug flags and open modes 2023-03-27 02:33:48 +00:00
bluetooth Bluetooth: coredump: fix building with coredump disabled 2023-07-20 11:25:24 -07:00
caif net: remove the caif_hsi driver 2021-07-01 13:19:48 -07:00
iucv net/af_iucv: Use struct_group() to zero struct iucv_sock region 2021-11-19 11:52:25 +00:00
mana Linux 6.4 2023-06-27 14:06:29 -03:00
netfilter netfilter: nf_tables: GC transaction API to avoid race with control plane 2023-08-10 08:25:16 +02:00
netns tcp: enforce receive buffer memory limits by allowing the tcp window to shrink 2023-06-17 09:53:53 +01:00
nfc NFC: add NCI_UNREG flag to eliminate the race 2021-11-17 20:17:05 -08:00
phonet net: ioctl: Use kernel memory on protocol ioctl callbacks 2023-06-15 22:33:26 -07:00
sctp sctp: delete the nested flexible array peer_init 2023-04-21 08:19:30 +01:00
tc_act net/sched: act_connmark: transition to percpu stats and rcu 2023-02-16 10:39:28 +01:00
6lowpan.h
act_api.h net/sched: Rename user cookie and act cookie 2023-02-20 16:46:10 -08:00
addrconf.h ipv6: constify inet6_mc_check() 2023-03-17 08:56:37 +00:00
af_ieee802154.h
af_rxrpc.h rxrpc: Fix timeout of a call that hasn't yet been granted a channel 2023-05-01 07:43:19 +01:00
af_unix.h af_unix: preserve const qualifier in unix_sk() 2023-03-18 12:23:33 +00:00
af_vsock.h vsock: support sockmap 2023-03-29 08:19:38 +01:00
ah.h
amt.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
arp.h neighbour: switch to standard rcu, instead of rcu_bh 2023-03-21 21:32:18 -07:00
atmclip.h
ax25.h x25: preserve const qualifier in [a]x25_sk() 2023-03-18 12:23:34 +00:00
ax88796.h ax88796: Fix some typo in a comment 2022-08-09 22:14:02 -07:00
bareudp.h bareudp: Move definition of struct bareudp_conf to bareudp.c 2021-12-13 12:34:09 +00:00
bond_3ad.h net: bonding: Share lacpdu_mcast_addr definition 2022-09-16 14:34:01 +01:00
bond_alb.h bonding (gcc13): synchronize bond_{a,t}lb_xmit() types 2022-11-02 20:38:13 -07:00
bond_options.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
bonding.h net: bonding: remove kernel-doc comment marker 2023-07-14 20:39:29 -07:00
bpf_sk_storage.h bpf: struct sock is declared twice in bpf_sk_storage header 2021-03-26 17:43:55 +01:00
busy_poll.h net: Fix a data-race around sysctl_net_busy_poll. 2022-08-24 13:46:58 +01:00
calipso.h
cfg80211-wext.h wifi: cfg80211: Avoid clashing function prototypes 2022-11-16 11:31:47 +02:00
cfg80211.h wifi: cfg80211: Retrieve PSD information from RNR AP information 2023-06-21 14:01:29 +02:00
cfg802154.h net: cfg802154: fix kernel-doc notation warnings 2023-07-14 20:39:29 -07:00
checksum.h net: checksum: drop the linux/uaccess.h include 2023-01-27 11:19:46 +00:00
cipso_ipv4.h
cls_cgroup.h
codel_impl.h codel: remove unnecessary sock.h include 2021-12-22 15:03:47 -08:00
codel_qdisc.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
codel.h codel: fix kernel-doc notation warnings 2023-07-14 20:39:29 -07:00
compat.h net: copy from user before calling __get_compat_msghdr 2022-07-24 18:39:17 -06:00
datalink.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
dcbevent.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
dcbnl.h net: dcb: add helper functions to retrieve PCP and DSCP rewrite maps 2023-01-20 09:33:22 +00:00
devlink.h devlink: fix kernel-doc notation warnings 2023-07-14 20:39:29 -07:00
dropreason-core.h net: extend drop reasons for multiple subsystems 2023-04-20 20:20:49 -07:00
dropreason.h mac80211: use the new drop reasons infrastructure 2023-04-20 20:20:49 -07:00
dsa_stubs.h net: dsa: replace NETDEV_PRE_CHANGE_HWTSTAMP notifier with a stub 2023-04-09 15:35:49 +01:00
dsa.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2023-06-27 09:45:22 -07:00
dsfield.h
dst_cache.h wireguard: device: reset peer src endpoint when netns exits 2021-11-29 19:50:45 -08:00
dst_metadata.h xfrm: interface: Add unstable helpers for setting/getting XFRM metadata from TC-BPF 2022-12-05 21:58:27 -08:00
dst_ops.h ipv6: remove max_size check inline with ipv4 2023-01-13 20:59:14 -08:00
dst.h net: dst: Switch to rcuref_t reference counting 2023-03-28 18:52:28 -07:00
erspan.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
esp.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
espintcp.h
ethoc.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
failover.h net: failover: add net device refcount tracker 2021-12-06 16:06:02 -08:00
fib_notifier.h
fib_rules.h fib: expand fib_rule_policy 2021-12-16 07:18:35 -08:00
firewire.h firewire: net: Make use of get_unaligned_be48(), put_unaligned_be48() 2022-07-28 22:21:54 -07:00
flow_dissector.h net: flow_dissector: add support for cfm packets 2023-06-12 17:01:45 -07:00
flow_offload.h net/sched: cls_api: Support hardware miss to tc action 2023-02-20 16:46:10 -08:00
flow.h ipv4: Drop tos parameter from flowi4_update_output() 2023-06-02 10:52:38 +01:00
fou.h bpf,fou: Add bpf_skb_{set,get}_fou_encap kfuncs 2023-04-12 16:40:39 -07:00
fq_impl.h wifi: mac80211: add support for restricting netdev features per vif 2022-12-01 15:09:10 +01:00
fq.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
garp.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
gen_stats.h net: sched: Remove Qdisc::running sequence counter 2021-10-18 12:54:41 +01:00
genetlink.h mptcp: more detailed error reporting on endpoint creation 2022-11-21 13:09:07 +00:00
geneve.h net: geneve: fix array of flexible structures warnings 2022-10-31 10:43:04 +00:00
gre.h ip_gre: add csum offload support for gre header 2021-01-29 20:39:14 -08:00
gro_cells.h
gro.h net: gro: fix misuse of CB in udp socket lookup 2023-07-29 17:10:27 +01:00
gso.h net: move gso declarations and functions to their own files 2023-06-10 00:11:41 -07:00
gtp.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
gue.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
handshake.h net/handshake: Enable the SNI extension to work properly 2023-05-24 22:05:24 -07:00
hwbm.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
icmp.h ipv6: ICMPV6: add response to ICMPV6 RFC 8335 PROBE messages 2021-06-28 14:29:45 -07:00
ieee80211_radiotap.h wifi: iwlwifi: mvm: support U-SIG EHT validate checks 2023-06-14 12:32:19 +02:00
ieee802154_netdev.h mac802154: Handle received BEACON_REQ 2023-03-23 21:51:30 +01:00
if_inet6.h ipv6: fix locking issues with loops over idev->addr_list 2022-04-06 22:09:39 -07:00
ife.h
ila.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
inet6_connection_sock.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
inet6_hashtables.h net: allow unbound socket for packets in VRF when tcp_l3mdev_accept set 2022-07-29 11:58:54 +01:00
inet_common.h sock: Remove ->sendpage*() in favour of sendmsg(MSG_SPLICE_PAGES) 2023-06-24 15:50:13 -07:00
inet_connection_sock.h net: Add a bhash2 table hashed by port and address 2022-08-24 19:30:07 -07:00
inet_dscp.h ipv6: Define dscp_t and stop taking ECN bits into account in fib6-rules 2022-02-07 20:12:45 -08:00
inet_ecn.h net: add skb_get_dsfield() helper 2021-10-15 11:33:08 +01:00
inet_frag.h inet: frags: eliminate kernel-doc warning 2023-07-14 20:39:29 -07:00
inet_hashtables.h tcp: Add TIME_WAIT sockets in bhash2. 2022-12-30 07:25:52 +00:00
inet_sock.h net: annotate data-races around sk->sk_mark 2023-07-29 18:13:41 +01:00
inet_timewait_sock.h tcp: Add TIME_WAIT sockets in bhash2. 2022-12-30 07:25:52 +00:00
inetpeer.h
ioam6.h treewide: Replace zero-length arrays with flexible-array members 2022-02-17 07:00:39 -06:00
ip6_checksum.h net: move gro definitions to include/net/gro.h 2021-11-16 13:16:54 +00:00
ip6_fib.h ipv6: Remove in6addr_any alternatives. 2023-03-29 08:22:52 +01:00
ip6_route.h net: dst: Prevent false sharing vs. dst_entry:: __refcnt 2023-03-28 18:52:22 -07:00
ip6_tunnel.h ip_gre, ip6_gre: Fix race condition on o_seqno in collect_md mode 2022-04-25 11:40:45 +01:00
ip_fib.h ipv4: Use dscp_t in struct fib_entry_notifier_info 2022-04-11 17:37:50 -07:00
ip_tunnels.h bpf-next-for-netdev 2023-04-13 16:43:38 -07:00
ip_vs.h ipvs: Correct spelling in comments 2023-04-22 01:39:41 +02:00
ip.h net: annotate data-races around sk->sk_mark 2023-07-29 18:13:41 +01:00
ipcomp.h xfrm: ipcomp: add extack to ipcomp{4,6}_init_state 2022-09-29 07:18:00 +02:00
ipconfig.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
ipv6_frag.h net: dropreason: add SKB_DROP_REASON_FRAG_REASM_TIMEOUT 2022-10-31 20:14:27 -07:00
ipv6_stubs.h bpf: Change bpf_getsockopt(SOL_IPV6) to reuse do_ipv6_getsockopt() 2022-09-02 20:34:32 -07:00
ipv6.h tcp: Reduce chance of collisions in inet6_hashfn(). 2023-07-24 16:52:37 -07:00
iw_handler.h
kcm.h kcm: Send multiple frags in one sendmsg() 2023-06-12 21:13:23 -07:00
l3mdev.h
lag.h
lapb.h net: lapb: Make "lapb_t1timer_running" able to detect an already running timer 2021-03-23 14:14:50 -07:00
lib80211.h
llc_c_ac.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
llc_c_ev.h
llc_c_st.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
llc_conn.h llc: Check netns in llc_estab_match() and llc_listener_match(). 2023-07-20 10:46:28 +02:00
llc_if.h llc/snap: constify dev_addr passing 2021-10-13 09:40:46 -07:00
llc_pdu.h net: llc: fix kernel-doc notation warnings 2023-07-14 20:39:29 -07:00
llc_s_ac.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
llc_s_ev.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
llc_s_st.h add missing includes and forward declarations to networking includes under linux/ 2022-07-28 11:29:36 +02:00
llc_sap.h
llc.h llc: fix out-of-bound array index in llc_sk_dev_hash() 2021-11-07 19:25:29 +00:00
lwtunnel.h netfilter: add netfilter hooks to SRv6 data plane 2021-08-30 01:51:36 +02:00
mac80211.h wifi: mac80211: fix documentation config reference 2023-06-21 09:16:57 +02:00
mac802154.h mac802154: Drop IEEE802154_HW_RX_DROP_BAD_CKSUM 2022-10-12 12:57:19 +02:00
macsec.h macsec: Use helper macsec_netdev_priv for offload drivers 2023-05-10 11:32:09 +01:00
mctp.h mctp: Reorder fields in 'struct mctp_route' 2023-06-20 20:06:16 -07:00
mctpdevice.h mctp: Pass flow data & flow release events to drivers 2021-10-29 13:23:51 +01:00
mip6.h
mld.h mld: add new workqueues for process mld events 2021-03-26 15:14:56 -07:00
mpls_iptunnel.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
mpls.h
mptcp.h mptcp: remove MPTCP 'ifdef' in TCP SYN cookies 2022-12-12 13:11:24 -08:00
mrp.h mrp: introduce active flags to prevent UAF when applicant uninit 2022-11-18 12:14:55 +00:00
ncsi.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
ndisc.h neighbour: switch to standard rcu, instead of rcu_bh 2023-03-21 21:32:18 -07:00
neighbour.h neighbour: fix unaligned access to pneigh_entry 2023-06-01 21:36:37 -07:00
net_debug.h net: add CONFIG_DEBUG_NET 2022-05-11 12:43:10 +01:00
net_failover.h
net_namespace.h net: add a refcount tracker for kernel sockets 2022-10-24 11:04:43 +01:00
net_ratelimit.h
net_trackers.h net: add networking namespace refcount tracker 2021-12-10 06:38:26 -08:00
netdev_queues.h net: add macro netif_subqueue_completed_wake 2023-04-18 12:59:01 +02:00
netevent.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
netlabel.h
netlink.h net: netlink: recommend policy range validation 2023-01-28 00:33:51 -08:00
netprio_cgroup.h
netrom.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
nexthop.h ipv6: remove nexthop_fib6_nh_bh() 2023-05-11 18:07:05 -07:00
nl802154.h ieee802154: Add support for user beaconing requests 2023-01-28 13:51:22 +01:00
nsh.h net: NSH: fix kernel-doc notation warning 2023-07-14 20:39:29 -07:00
p8022.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
page_pool.h page_pool: fix inconsistency for page_pool_ring_[un]lock() 2023-05-23 20:25:13 -07:00
pie.h pie: fix kernel-doc notation warning 2023-07-14 20:39:30 -07:00
ping.h net/ipv4: ping_group_range: allow GID from 2147483648 to 4294967294 2023-06-02 09:55:22 +01:00
pkt_cls.h sch_htb: Allow HTB priority parameter in offload mode 2023-05-15 09:31:07 +01:00
pkt_sched.h net/sched: make psched_mtu() RTNL-less safe 2023-07-12 15:59:33 -07:00
pptp.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
protocol.h tcp/udp: Make early_demux back namespacified. 2022-07-15 18:50:35 -07:00
psample.h psample: Add a fwd declaration for skbuff 2021-08-09 15:34:21 -07:00
psnap.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
raw.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2023-04-06 12:01:20 -07:00
rawv6.h ipv6: raw: constify raw_v6_match() socket argument 2023-03-17 08:56:37 +00:00
red.h treewide: use get_random_u32() when possible 2022-10-11 17:42:58 -06:00
regulatory.h wifi: cfg80211: fix regulatory disconnect with OCB/NAN 2023-06-19 12:05:29 +02:00
request_sock.h tcp: Use BPF timeout setting for SYN ACK RTO 2022-02-02 14:45:18 +00:00
rose.h net: rose: add netdev ref tracker to 'struct rose_sock' 2022-08-01 11:59:23 -07:00
route.h net: annotate data-races around sk->sk_mark 2023-07-29 18:13:41 +01:00
rpl.h ipv6: rpl: Remove pskb(_may)?_pull() in ipv6_rpl_srh_rcv(). 2023-06-19 11:32:58 -07:00
rsi_91x.h rsi: remove kernel-doc comment marker 2023-07-14 20:39:30 -07:00
rtnetlink.h rtnetlink: Honour NLM_F_ECHO flag in rtnl_delete_link 2022-10-31 18:10:21 -07:00
rtnh.h
sch_generic.h net: sched: Remove unused qdisc_l2t() 2023-06-17 00:17:42 -07:00
scm.h net: scm: introduce and use scm_recv_unix helper 2023-06-27 10:50:22 -07:00
secure_seq.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
seg6_hmac.h
seg6_local.h
seg6.h udp6: Use Segment Routing Header for dest address if present 2022-01-04 12:17:35 +00:00
selftests.h net: selftest: fix build issue if INET is disabled 2021-04-28 14:06:45 -07:00
slhc_vj.h
smc.h net/smc: Introduce explicit check for v2 support 2023-03-15 08:18:35 +00:00
snmp.h
sock_reuseport.h soreuseport: Fix socket selection for SO_INCOMING_CPU. 2022-10-25 11:35:16 +02:00
sock.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2023-06-27 09:45:22 -07:00
Space.h wan: remove sbni/granch driver 2021-08-03 13:05:26 +01:00
stp.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
strparser.h tls: rx: remove the message decrypted tracking 2022-07-18 11:24:10 +01:00
switchdev.h bridge: switchdev: Allow device drivers to install locked FDB entries 2022-11-09 19:06:13 -08:00
tc_wrapper.h net/sched: Retire rsvp classifier 2023-02-16 09:27:07 +01:00
tcp_states.h
tcp.h tcp: annotate data-races around tp->notsent_lowat 2023-07-20 12:34:18 -07:00
timewait_sock.h
tipc.h
tls_toe.h
tls.h net: tls: make the offload check helper take skb not socket 2023-06-15 09:01:05 +01:00
transp_v6.h inet6: Remove inet6_destroy_sock(). 2022-10-24 09:40:39 +01:00
tso.h net: tso: inline tso_count_descs() 2022-12-12 15:04:39 -08:00
tun_proto.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
udp_tunnel.h net: Change the udp encap_err_rcv to allow use of {ip,ipv6}_icmp_error() 2022-11-08 16:42:28 +00:00
udp.h net: ioctl: Use kernel memory on protocol ioctl callbacks 2023-06-15 22:33:26 -07:00
udplite.h tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct(). 2022-10-12 17:50:37 -07:00
vsock_addr.h
vxlan.h vxlan: calculate correct header length for GPE 2023-07-24 09:37:32 +01:00
wext.h
x25.h x25: preserve const qualifier in [a]x25_sk() 2023-03-18 12:23:34 +00:00
x25device.h
xdp_priv.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
xdp_sock_drv.h xsk: Remove unused inline function xsk_buff_discard() 2023-06-19 14:06:22 +02:00
xdp_sock.h bpf, net: xskmap memory usage 2023-03-07 09:33:43 -08:00
xdp.h bpf-next-for-netdev 2023-04-13 16:43:38 -07:00
xfrm.h xfrm: Treat already-verified secpath entries as optional 2023-05-21 09:21:37 +02:00
xsk_buff_pool.h xsk: Use pool->dma_pages to check for DMA 2023-04-27 22:24:51 +02:00