linux/net/ipv4
Stanislav Fomichev 871019b22d net: set SOCK_RCU_FREE before inserting socket into hashtable
We've started to see the following kernel traces:

 WARNING: CPU: 83 PID: 0 at net/core/filter.c:6641 sk_lookup+0x1bd/0x1d0

 Call Trace:
  <IRQ>
  __bpf_skc_lookup+0x10d/0x120
  bpf_sk_lookup+0x48/0xd0
  bpf_sk_lookup_tcp+0x19/0x20
  bpf_prog_<redacted>+0x37c/0x16a3
  cls_bpf_classify+0x205/0x2e0
  tcf_classify+0x92/0x160
  __netif_receive_skb_core+0xe52/0xf10
  __netif_receive_skb_list_core+0x96/0x2b0
  napi_complete_done+0x7b5/0xb70
  <redacted>_poll+0x94/0xb0
  net_rx_action+0x163/0x1d70
  __do_softirq+0xdc/0x32e
  asm_call_irq_on_stack+0x12/0x20
  </IRQ>
  do_softirq_own_stack+0x36/0x50
  do_softirq+0x44/0x70

__inet_hash() can race with lockless (RCU) readers on other CPUs:

  __inet_hash
    __sk_nulls_add_node_rcu
    <- (bpf triggers here)
    sock_set_flag(SOCK_RCU_FREE)

Let's set SOCK_RCU_FREE a bit earlier, before the socket is inserted
into the hashtable. Note that the race is harmless: the bpf callers
handle this situation (a listener socket that doesn't have
SOCK_RCU_FREE set) correctly, so the only annoyance is a WARN_ONCE.
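
The ordering issue can be illustrated with a small userspace sketch. This is not the kernel code: toy_sock, toy_table, TOY_SOCK_RCU_FREE and both hash_* functions are hypothetical stand-ins, and the "concurrent" reader is emulated by a plain function call placed at the point where a real lockless lookup could run. It shows only why setting the flag before publishing closes the window. Real RCU publication also depends on memory ordering, which this single-threaded model ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for kernel objects; every name here is hypothetical. */
enum { TOY_SOCK_RCU_FREE = 1u << 0 };

struct toy_sock {
	unsigned int flags;
};

struct toy_table {
	struct toy_sock *slot;	/* a one-bucket "hash table" */
};

/*
 * What a lockless reader would observe if it ran right now: true when it
 * finds a published socket whose flag is not set yet, i.e. the state that
 * produces the warning described above.
 */
static bool reader_sees_unflagged(const struct toy_table *t)
{
	return t->slot != NULL && !(t->slot->flags & TOY_SOCK_RCU_FREE);
}

/* Old order: publish first, set the flag afterwards. */
static bool hash_publish_then_flag(struct toy_table *t, struct toy_sock *sk)
{
	bool window;

	assert(t != NULL && sk != NULL);
	t->slot = sk;				/* socket becomes visible */
	window = reader_sees_unflagged(t);	/* emulated concurrent lookup */
	sk->flags |= TOY_SOCK_RCU_FREE;
	return window;
}

/* New order: set the flag first, then publish. */
static bool hash_flag_then_publish(struct toy_table *t, struct toy_sock *sk)
{
	assert(t != NULL && sk != NULL);
	sk->flags |= TOY_SOCK_RCU_FREE;
	t->slot = sk;				/* visible only once flagged */
	return reader_sees_unflagged(t);	/* emulated concurrent lookup */
}
```

Each function returns whether the emulated reader caught the socket in the unflagged state, so a caller can compare the two orderings directly.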

More details from Eric regarding the SOCK_RCU_FREE timeline:

Commit 3b24d854cb ("tcp/dccp: do not touch listener sk_refcnt under
synflood") added SOCK_RCU_FREE. At that time, the precise location of
sock_set_flag(sk, SOCK_RCU_FREE) did not matter, because the thread calling
__inet_hash() owns a reference on sk. SOCK_RCU_FREE was only tested
at dismantle time.

Commit 6acc9b432e ("bpf: Add helper to retrieve socket in BPF")
started checking SOCK_RCU_FREE _after_ the lookup to infer whether
the refcount has been taken care of.
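
A rough userspace model of that post-lookup check, under stated assumptions: demo_sock, DEMO_SOCK_RCU_FREE, demo_lookup, demo_checked_lookup and demo_release are hypothetical names, and the logic only mirrors the idea described above (use the flag after the lookup to decide whether a reference is held), not the actual code in net/core/filter.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names exist in the kernel. */
enum { DEMO_SOCK_RCU_FREE = 1u << 0 };

struct demo_sock {
	unsigned int flags;
	int refcnt;
};

/*
 * Emulated lockless lookup: hands back the socket without taking a
 * reference and reports that through *refcounted.
 */
static struct demo_sock *demo_lookup(struct demo_sock *found, bool *refcounted)
{
	*refcounted = false;
	return found;
}

/*
 * Post-lookup check: an unreferenced socket is only safe to use if it is
 * freed via RCU. If the flag is missing, report it through *warned and
 * drop the result instead of handing out an unsafe pointer.
 */
static struct demo_sock *demo_checked_lookup(struct demo_sock *found,
					     bool *warned)
{
	bool refcounted;
	struct demo_sock *sk = demo_lookup(found, &refcounted);

	assert(warned != NULL);
	*warned = false;
	if (sk && !refcounted && !(sk->flags & DEMO_SOCK_RCU_FREE)) {
		*warned = true;	/* models the one-time warning */
		return NULL;
	}
	return sk;
}

/* Release side: drop a reference only if the flag says one was taken. */
static void demo_release(struct demo_sock *sk)
{
	if (sk && !(sk->flags & DEMO_SOCK_RCU_FREE))
		sk->refcnt--;
}
```

In this model a socket observed before its flag is set is rejected rather than used, which matches the description of the race as harmless apart from the warning.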

Fixes: 6acc9b432e ("bpf: Add helper to retrieve socket in BPF")
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-10 08:37:09 +00:00
bpfilter net: Use umd_cleanup_helper() 2023-05-31 13:06:57 +02:00
netfilter Including fixes from netfilter and bpf. 2023-11-09 17:09:35 -08:00
af_inet.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2023-10-19 13:29:01 -07:00
ah4.c net: ipv4: stop checking crypto_ahash_alignmask 2023-10-27 18:04:29 +08:00
arp.c neighbour: annotate lockless accesses to n->nud_state 2023-03-15 00:37:32 -07:00
bpf_tcp_ca.c bpf: Drop useless btf_vmlinux in bpf_tcp_ca 2023-07-18 17:31:10 -07:00
cipso_ipv4.c inet: move inet->is_icsk to inet->inet_flags 2023-08-16 11:09:17 +01:00
datagram.c inet: implement lockless getsockopt(IP_MULTICAST_IF) 2023-10-01 19:39:19 +01:00
devinet.c net: ipv4: fix one memleak in __inet_del_ifa() 2023-09-08 08:02:17 +01:00
esp4_offload.c xfrm: Support GRO for IPv4 ESP in UDP encapsulation 2023-10-06 07:30:40 +02:00
esp4.c net: ipv4: fix typo in comments 2023-10-25 10:38:07 +01:00
fib_frontend.c ipv4: Fix incorrect table ID in IOCTL path 2023-03-16 17:26:31 -07:00
fib_lookup.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-02-17 11:44:20 -08:00
fib_notifier.c
fib_rules.c ipv4: remove unnecessary type castings 2022-04-30 15:12:58 +01:00
fib_semantics.c ipv4: fib: annotate races around nh->nh_saddr_genid and nh->nh_saddr 2023-10-18 18:11:31 -07:00
fib_trie.c ipv4/fib: send notify when delete source address routes 2023-10-03 09:00:40 +02:00
fou_bpf.c bpf: Add __bpf_kfunc_{start,end}_defs macros 2023-11-01 22:33:53 -07:00
fou_core.c bpf,fou: Add bpf_skb_{set,get}_fou_encap kfuncs 2023-04-12 16:40:39 -07:00
fou_nl.c net: ynl: prefix uAPI header include with uapi/ 2023-05-26 10:30:14 +01:00
fou_nl.h net: ynl: prefix uAPI header include with uapi/ 2023-05-26 10:30:14 +01:00
gre_demux.c
gre_offload.c net: move gso declarations and functions to their own files 2023-06-10 00:11:41 -07:00
icmp.c xfrm: pass struct net to xfrm_decode_session wrappers 2023-10-06 08:31:53 +02:00
igmp.c ipv4: igmp: Remove redundant comparison in igmp_mcf_get_next() 2023-09-14 17:20:17 +02:00
inet_connection_sock.c tcp: allow again tcp_disconnect() when threads are waiting 2023-10-13 16:49:32 -07:00
inet_diag.c inet: implement lockless IP_TOS 2023-10-01 19:39:18 +01:00
inet_fragment.c net: dropreason: add SKB_DROP_REASON_FRAG_REASM_TIMEOUT 2022-10-31 20:14:27 -07:00
inet_hashtables.c net: set SOCK_RCU_FREE before inserting socket into hashtable 2023-11-10 08:37:09 +00:00
inet_timewait_sock.c inet: move inet->transparent to inet->inet_flags 2023-08-16 11:09:17 +01:00
inetpeer.c inetpeer: Fix data-races around sysctl. 2022-07-08 12:10:33 +01:00
ip_forward.c net: fix IPSTATS_MIB_OUTFORWDATAGRAMS increment after fragment check 2023-10-13 09:58:45 -07:00
ip_fragment.c networking: Update to register_net_sysctl_sz 2023-08-15 15:26:18 -07:00
ip_gre.c ipv4: ip_gre: fix return value check in erspan_xmit() 2023-07-19 12:27:09 +01:00
ip_input.c ipv4: ignore dst hint for multipath routes 2023-09-01 08:11:51 +01:00
ip_options.c ipv4: drop fragmentation code from ip_options_build() 2022-01-29 17:53:07 +00:00
ip_output.c net: fix IPSTATS_MIB_OUTPKGS increment in OutForwDatagrams. 2023-10-20 12:01:00 +01:00
ip_sockglue.c net: bpf: Use sockopt_lock_sock() in ip_sock_set_tos() 2023-10-27 15:41:28 -07:00
ip_tunnel_core.c tunnels: fix kasan splat when generating ipv4 pmtu error 2023-08-04 18:24:52 -07:00
ip_tunnel.c bpf-next-for-netdev 2023-04-13 16:43:38 -07:00
ip_vti.c xfrm: pass struct net to xfrm_decode_session wrappers 2023-10-06 08:31:53 +02:00
ipcomp.c xfrm: ipcomp: add extack to ipcomp{4,6}_init_state 2022-09-29 07:18:00 +02:00
ipconfig.c net: ipconfig: move ic_nameservers_fallback into #ifdef block 2023-05-22 11:17:55 +01:00
ipip.c ipip,ip_tunnel,sit: Add FOU support for externally controlled ipip devices 2023-04-12 16:40:39 -07:00
ipmr_base.c ipmr: adopt rcu_read_lock() in mr_dump() 2022-06-24 11:34:38 +01:00
ipmr.c net: ipv4, ipv6: fix IPSTATS_MIB_OUTOCTETS increment duplicated 2023-08-30 09:44:09 +01:00
Kconfig net/tcp: Add TCP-AO config and structures 2023-10-27 10:35:44 +01:00
Makefile net/tcp: Introduce TCP_AO setsockopt()s 2023-10-27 10:35:44 +01:00
metrics.c ipv4: prevent potential spectre v1 gadget in ip_metrics_convert() 2023-01-23 21:37:25 -08:00
netfilter.c xfrm: pass struct net to xfrm_decode_session wrappers 2023-10-06 08:31:53 +02:00
netlink.c
nexthop.c nexthop: Do not increment dump sentinel at the end of the dump 2023-08-15 18:54:53 -07:00
ping.c bpf-next-for-netdev 2023-10-16 21:05:33 -07:00
proc.c net/tcp: Ignore specific ICMPs for TCP-AO connections 2023-10-27 10:35:45 +01:00
protocol.c
raw_diag.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2023-04-06 12:01:20 -07:00
raw.c inet: implement lockless getsockopt(IP_MULTICAST_IF) 2023-10-01 19:39:19 +01:00
route.c ipv4: rename and move ip_route_output_tunnel() 2023-10-16 09:57:52 +01:00
syncookies.c tcp: fix fastopen code vs usec TS 2023-11-03 09:16:42 +00:00
sysctl_net_ipv4.c tcp: Set pingpong threshold via sysctl 2023-10-16 14:55:32 -07:00
tcp_ao.c tcp: Fix SYN option room calculation for TCP-AO. 2023-11-06 08:59:54 +00:00
tcp_bbr.c net: implement lockless SO_MAX_PACING_RATE 2023-10-01 19:09:54 +01:00
tcp_bic.c tcp: add accessors to read/set tp->snd_cwnd 2022-04-06 12:05:41 -07:00
tcp_bpf.c tcp_bpf: properly release resources on error paths 2023-10-18 18:09:31 -07:00
tcp_cdg.c Random number generator fixes for Linux 6.1-rc1. 2022-10-16 15:27:07 -07:00
tcp_cong.c net: Update an existing TCP congestion control algorithm. 2023-03-22 22:53:00 -07:00
tcp_cubic.c bpf: Add __bpf_kfunc tag to all kfuncs 2023-02-02 00:25:14 +01:00
tcp_dctcp.c bpf: Add __bpf_kfunc tag to all kfuncs 2023-02-02 00:25:14 +01:00
tcp_dctcp.h
tcp_diag.c tcp: Access &tcp_hashinfo via net. 2022-09-20 10:21:49 -07:00
tcp_fastopen.c inet: move inet->defer_connect to inet->inet_flags 2023-08-16 11:09:18 +01:00
tcp_highspeed.c tcp: add accessors to read/set tp->snd_cwnd 2022-04-06 12:05:41 -07:00
tcp_htcp.c tcp: add accessors to read/set tp->snd_cwnd 2022-04-06 12:05:41 -07:00
tcp_hybla.c tcp: add accessors to read/set tp->snd_cwnd 2022-04-06 12:05:41 -07:00
tcp_illinois.c tcp: add accessors to read/set tp->snd_cwnd 2022-04-06 12:05:41 -07:00
tcp_input.c tcp: fix fastopen code vs usec TS 2023-11-03 09:16:42 +00:00
tcp_ipv4.c net/tcp: Wire up l3index to TCP-AO 2023-10-27 10:35:46 +01:00
tcp_lp.c tcp: rename tcp_time_stamp() to tcp_time_stamp_ts() 2023-10-23 09:35:01 +01:00
tcp_metrics.c tcp_metrics: optimize tcp_metrics_flush_all() 2023-10-03 10:05:22 +02:00
tcp_minisocks.c net/tcp: Add TCP-AO SNE support 2023-10-27 10:35:45 +01:00
tcp_nv.c tcp: add accessors to read/set tp->snd_cwnd 2022-04-06 12:05:41 -07:00
tcp_offload.c net: move gso declarations and functions to their own files 2023-06-10 00:11:41 -07:00
tcp_output.c tcp: Fix -Wc23-extensions in tcp_options_write() 2023-11-07 22:23:56 +00:00
tcp_plb.c prandom: remove prandom_u32_max() 2022-12-20 03:13:45 +01:00
tcp_rate.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-04-28 13:02:01 -07:00
tcp_recovery.c tcp: fix excessive TLP and RACK timeouts from HZ rounding 2023-10-17 17:25:42 -07:00
tcp_scalable.c tcp: add accessors to read/set tp->snd_cwnd 2022-04-06 12:05:41 -07:00
tcp_sigpool.c net/tcp_sigpool: Fix some off by one bugs 2023-11-01 22:28:09 -07:00
tcp_timer.c tcp: add support for usec resolution in TCP TS values 2023-10-23 09:35:01 +01:00
tcp_ulp.c net/ulp: use consistent error code when blocking ULP 2023-01-19 09:26:16 -08:00
tcp_vegas.c tcp: add accessors to read/set tp->snd_cwnd 2022-04-06 12:05:41 -07:00
tcp_vegas.h
tcp_veno.c tcp: add accessors to read/set tp->snd_cwnd 2022-04-06 12:05:41 -07:00
tcp_westwood.c tcp: add accessors to read/set tp->snd_cwnd 2022-04-06 12:05:41 -07:00
tcp_yeah.c tcp: add accessors to read/set tp->snd_cwnd 2022-04-06 12:05:41 -07:00
tcp.c net/tcp: Add TCP_AO_REPAIR 2023-10-27 10:35:46 +01:00
tunnel4.c
udp_bpf.c bpf, sockmap: Fix an infinite loop error when len is 0 in tcp_bpf_recvmsg_parser() 2023-03-03 17:25:15 +01:00
udp_diag.c udp: Access &udp_table via net. 2022-11-16 09:43:35 +00:00
udp_impl.h sock: Remove ->sendpage*() in favour of sendmsg(MSG_SPLICE_PAGES) 2023-06-24 15:50:13 -07:00
udp_offload.c udp: move udp->gro_enabled to udp->udp_flags 2023-09-14 16:16:36 +02:00
udp_tunnel_core.c ipv4: use tunnel flow flags for tunnel route lookups 2023-10-16 09:57:52 +01:00
udp_tunnel_nic.c udp_tunnel: Use flex array to simplify code 2023-10-03 11:39:34 +02:00
udp_tunnel_stub.c
udp.c ipsec-next-2023-10-28 2023-10-30 14:36:57 -07:00
udplite.c udplite: remove UDPLITE_BIT 2023-09-14 16:16:36 +02:00
xfrm4_input.c xfrm Fix use after free in __xfrm6_udp_encap_rcv. 2023-10-23 07:10:39 +02:00
xfrm4_output.c
xfrm4_policy.c sysctl-6.6-rc1 2023-08-29 17:39:15 -07:00
xfrm4_protocol.c net: xfrm: unexport __init-annotated xfrm4_protocol_init() 2022-06-08 10:10:13 -07:00
xfrm4_state.c
xfrm4_tunnel.c xfrm: tunnel: add extack to ipip_init_state, xfrm6_tunnel_init_state 2022-09-29 07:18:00 +02:00