linux/net/ipv6
Martin KaFai Lau 427165421c net: inet: Retire port only listening_hash
[ Upstream commit cae3873c5b ]

The listen sk is currently stored in two hash tables,
listening_hash (hashed by port) and lhash2 (hashed by port and address).

After commit 0ee58dad5b ("net: tcp6: prefer listeners bound to an address")
and commit d9fbc7f643 ("net: tcp: prefer listeners bound to an address"),
the TCP-SYN lookup fast path does not use listening_hash.

The commit 05c0b35709 ("tcp: seq_file: Replace listening_hash with lhash2")
also moved the seq_file (/proc/net/tcp) iteration usage from
listening_hash to lhash2.

There are still a few listening_hash usages left.
One of them is inet_reuseport_add_sock() which uses the listening_hash
to search a listen sk during the listen() system call.  This turns
out to be very slow on use cases that listen on many different
VIPs at a popular port (e.g. 443).  [ On top of the slowness in
adding to the tail in the IPv6 case ].  The latter patch has a
selftest to demonstrate this case.

This patch takes this chance to move all remaining listening_hash
usages to lhash2 and then retire listening_hash.

Since most changes need to be done together, it is hard to cut
the listening_hash to lhash2 switch into small patches.  The
changes in this patch is highlighted here for the review
purpose.

1. Because of the listening_hash removal, lhash2 can use the
   sk->sk_nulls_node instead of the icsk->icsk_listen_portaddr_node.
   This will also keep the sk_unhashed() check to work as is
   after stop adding sk to listening_hash.

   The union is removed from inet_listen_hashbucket because
   only nulls_head is needed.

2. icsk->icsk_listen_portaddr_node and its helpers are removed.

3. The current lhash2 users needs to iterate with sk_nulls_node
   instead of icsk_listen_portaddr_node.

   One case is in the inet[6]_lhash2_lookup().

   Another case is the seq_file iterator in tcp_ipv4.c.
   One thing to note is sk_nulls_next() is needed
   because the old inet_lhash2_for_each_icsk_continue()
   does a "next" first before iterating.

4. Move the remaining listening_hash usage to lhash2

   inet_reuseport_add_sock() which this series is
   trying to improve.

   inet_diag.c and mptcp_diag.c are the final two
   remaining use cases and is moved to lhash2 now also.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: 871019b22d ("net: set SOCK_RCU_FREE before inserting socket into hashtable")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-28 16:56:22 +00:00
..
ila ila: do not generate empty messages in ila_xlat_nl_cmd_get_mapping() 2023-03-17 08:48:54 +01:00
netfilter netfilter: tproxy: fix deadlock due to missing BH disable 2023-03-17 08:48:55 +01:00
addrconf_core.c
addrconf.c net: release reference to inet6_dev pointer 2023-10-19 23:05:35 +02:00
addrlabel.c ipv6: addrlabel: fix infoleak when sending struct ifaddrlblmsg to network 2022-11-16 09:58:18 +01:00
af_inet6.c dccp: Call inet6_destroy_sock() via sk->sk_destruct(). 2023-04-26 13:51:54 +02:00
ah6.c xfrm: remove hdr_offset indirection 2021-06-11 14:48:50 +02:00
anycast.c
calipso.c
datagram.c ipv6: Fix datagram socket connection with DSCP. 2023-02-22 12:57:09 +01:00
esp6_offload.c xfrm: Linearize the skb after offloading if needed. 2023-06-28 10:29:46 +02:00
esp6.c net: ipv6: fix return value check in esp_remove_trailer 2023-10-25 11:58:57 +02:00
exthdrs_core.c ipv6: Fix out-of-bounds access in ipv6_find_tlv() 2023-05-30 13:55:31 +01:00
exthdrs_offload.c
exthdrs.c ipv6: rpl: Fix Route of Death. 2023-06-14 11:13:02 +02:00
fib6_notifier.c
fib6_rules.c ipv6: fix memory leak in fib6_rule_suppress 2021-12-08 09:04:43 +01:00
fou6.c
icmp.c icmp6: Fix null-ptr-deref of ip6_null_entry->rt6i_idev in icmp6_dev(). 2023-07-23 13:47:41 +02:00
inet6_connection_sock.c
inet6_hashtables.c net: inet: Retire port only listening_hash 2023-11-28 16:56:22 +00:00
ioam6_iptunnel.c ipv6: ioam: move the check for undefined bits 2021-10-12 11:49:49 +01:00
ioam6.c ipv6: ioam: move the check for undefined bits 2021-10-12 11:49:49 +01:00
ip6_checksum.c
ip6_fib.c ipv6: annotate accesses to fn->fn_sernum 2022-02-01 17:27:09 +01:00
ip6_flowlabel.c ipv6: per-netns exclusive flowlabel checks 2022-02-23 12:03:10 +01:00
ip6_gre.c net:ipv6: check return value of pskb_trim() 2023-07-27 08:47:01 +02:00
ip6_icmp.c
ip6_input.c tcp/udp: Make early_demux back namespacified. 2022-11-10 18:15:38 +01:00
ip6_offload.c gso: do not skip outer ip header in case of ipip and net_failover 2022-03-02 11:47:56 +01:00
ip6_offload.h
ip6_output.c ipv6: avoid atomic fragment on GSO packets 2023-11-20 11:08:16 +01:00
ip6_tunnel.c net: tunnels: annotate lockless accesses to dev->needed_headroom 2023-03-22 13:31:26 +01:00
ip6_udp_tunnel.c
ip6_vti.c ip6_vti: fix slab-use-after-free in decode_session6 2023-08-26 14:23:32 +02:00
ip6mr.c ip6mr: Fix skb_under_panic in ip6mr_cache_report() 2023-08-11 15:13:53 +02:00
ipcomp6.c xfrm: remove hdr_offset indirection 2021-06-11 14:48:50 +02:00
ipv6_sockglue.c udp: Call inet6_destroy_sock() in setsockopt(IPV6_ADDRFORM). 2023-04-26 13:51:54 +02:00
Kconfig ipv6: ioam: Support for IOAM injection with lwtunnels 2021-07-21 08:14:33 -07:00
Makefile ipv6: ioam: Support for IOAM injection with lwtunnels 2021-07-21 08:14:33 -07:00
mcast_snoop.c net: bridge: mcast: fix broken length + header check for MRDv6 Adv. 2021-04-27 14:02:06 -07:00
mcast.c net: mld: fix reference count leak in mld_{query | report}_work() 2022-08-03 12:03:51 +02:00
mip6.c xfrm: ipv6: move mip6_rthdr_offset into xfrm core 2021-06-11 14:48:50 +02:00
ndisc.c net: change accept_ra_min_rtr_lft to affect all RA lifetimes 2023-10-19 23:05:35 +02:00
netfilter.c netfilter: Update ip6_route_me_harder to consider L3 domain 2022-05-09 09:14:41 +02:00
output_core.c ipv6: use prandom_u32() for ID generation 2021-05-31 22:12:08 -07:00
ping.c ping6: Fix send to link-local addresses with VRF. 2023-06-21 15:59:16 +02:00
proc.c
protocol.c
raw.c ipv{4,6}/raw: fix output xfrm lookup wrt protocol 2023-06-05 09:21:26 +02:00
reassembly.c ipv6: record frag_max_size in atomic fragments in input path 2021-05-21 15:02:25 -07:00
route.c ipv6: Add lwtunnel encap size of all siblings in nexthop calculation 2023-03-11 13:57:28 +01:00
rpl_iptunnel.c
rpl.c net: rpl: fix rpl header size calculation 2023-04-26 13:51:49 +02:00
seg6_hmac.c net: ipv6: unexport __init-annotated seg6_hmac_net_init() 2022-07-07 17:53:26 +02:00
seg6_iptunnel.c seg6: fix skb checksum evaluation in SRH encapsulation/insertion 2022-07-21 21:24:30 +02:00
seg6_local.c seg6: fix skb checksum in SRv6 End.B6 and End.B6.Encaps behaviors 2022-07-21 21:24:30 +02:00
seg6.c ipv6: sr: fix out-of-bounds read when setting HMAC data. 2022-09-15 11:30:06 +02:00
sit.c sit: update dev->needed_headroom in ipip6_tunnel_bind_dev() 2023-05-17 11:50:16 +02:00
syncookies.c dccp/tcp: Call security_inet_conn_request() after setting IPv6 addresses. 2023-11-20 11:08:28 +01:00
sysctl_net_ipv6.c ipv6: ioam: Data plane support for Pre-allocated Trace 2021-07-21 08:14:33 -07:00
tcp_ipv6.c tcp: annotate data-races around tcp_rsk(req)->ts_recent 2023-07-27 08:47:01 +02:00
tcpv6_offload.c
tunnel6.c
udp_impl.h tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct(). 2023-04-26 13:51:54 +02:00
udp_offload.c
udp.c ipv6: Add reasons for skb drops to __udp6_lib_rcv 2023-09-19 12:22:32 +02:00
udplite.c udplite: Fix NULL pointer dereference in __sk_mem_raise_allocated(). 2023-05-30 13:55:31 +01:00
xfrm6_input.c xfrm: fix inbound ipv4/udp/esp packets to UDPv6 dualstack sockets 2023-06-28 10:29:46 +02:00
xfrm6_output.c xfrm: fix tunnel model fragmentation behavior 2022-04-08 14:22:46 +02:00
xfrm6_policy.c xfrm6: fix inet6_dev refcount underflow problem 2023-10-25 11:59:04 +02:00
xfrm6_protocol.c
xfrm6_state.c
xfrm6_tunnel.c xfrm: remove description from xfrm_type struct 2021-06-09 09:38:52 +02:00