linux/include/net
Martin KaFai Lau 427165421c net: inet: Retire port only listening_hash
[ Upstream commit cae3873c5b ]

The listen sk is currently stored in two hash tables,
listening_hash (hashed by port) and lhash2 (hashed by port and address).

After commit 0ee58dad5b ("net: tcp6: prefer listeners bound to an address")
and commit d9fbc7f643 ("net: tcp: prefer listeners bound to an address"),
the TCP-SYN lookup fast path does not use listening_hash.

Commit 05c0b35709 ("tcp: seq_file: Replace listening_hash with lhash2")
also moved the seq_file (/proc/net/tcp) iteration usage from
listening_hash to lhash2.

There are still a few listening_hash usages left.
One of them is inet_reuseport_add_sock() which uses the listening_hash
to search for a listen sk during the listen() system call.  This turns
out to be very slow for use cases that listen on many different
VIPs at a popular port (e.g. 443), on top of the slowness of
adding to the tail in the IPv6 case.  A later patch has a
selftest to demonstrate this case.
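
As a rough userspace illustration of why a port-only hash degrades here
(a toy sketch, not kernel code: port_hash(), portaddr_hash(),
longest_chain() and the table size are hypothetical stand-ins, and the
mixing function is not the one the kernel uses), compare the longest
bucket chain when many VIPs listen on port 443:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HASH_BITS 5u                 /* toy table size: 32 buckets */
#define NBUCKETS  (1u << HASH_BITS)

/* Toy port-only hash, in the spirit of listening_hash. */
static unsigned int port_hash(uint16_t port)
{
	return port & (NBUCKETS - 1);
}

/* Toy port+address hash, in the spirit of lhash2 (assumed mixing). */
static unsigned int portaddr_hash(uint32_t addr, uint16_t port)
{
	uint32_t h = (addr ^ port) * 2654435761u; /* multiplicative mix */

	return h >> (32 - HASH_BITS);             /* keep the top bits */
}

/*
 * Bind n_vips consecutive addresses (10.0.0.0 + i) to port 443 and
 * return the length of the longest bucket chain that results.
 */
static size_t longest_chain(size_t n_vips, int use_addr)
{
	size_t count[NBUCKETS] = { 0 };
	size_t max = 0;

	assert(n_vips <= 0xffffffu); /* stay inside 10.0.0.0/8 */

	for (size_t i = 0; i < n_vips; i++) {
		uint32_t addr = 0x0a000000u + (uint32_t)i;
		unsigned int b = use_addr ? portaddr_hash(addr, 443)
					  : port_hash(443);

		if (++count[b] > max)
			max = count[b];
	}
	return max;
}
```

With the port-only hash every listener lands in the same bucket, so
longest_chain(n, 0) equals n and each listen() has to walk that whole
chain; with the port+address hash the listeners spread across buckets.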

This patch takes this chance to move all remaining listening_hash
usages to lhash2 and then retire listening_hash.

Since most changes need to be done together, it is hard to cut
the listening_hash to lhash2 switch into small patches.  The
changes in this patch are highlighted here for review
purposes.

1. Because of the listening_hash removal, lhash2 can use the
   sk->sk_nulls_node instead of the icsk->icsk_listen_portaddr_node.
   This also keeps the sk_unhashed() check working as is
   once sk is no longer added to listening_hash.

   The union is removed from inet_listen_hashbucket because
   only nulls_head is needed.

2. icsk->icsk_listen_portaddr_node and its helpers are removed.

3. The current lhash2 users need to iterate with sk_nulls_node
   instead of icsk_listen_portaddr_node.

   One case is in the inet[6]_lhash2_lookup().

   Another case is the seq_file iterator in tcp_ipv4.c.
   One thing to note is sk_nulls_next() is needed
   because the old inet_lhash2_for_each_icsk_continue()
   does a "next" first before iterating.

4. Move the remaining listening_hash usages to lhash2.

   One is inet_reuseport_add_sock(), which this series is
   trying to improve.

   inet_diag.c and mptcp_diag.c are the final two
   remaining use cases and are now moved to lhash2 as well.
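
The iteration note in item 3 can be pictured with a toy singly linked
list (a userspace sketch only: struct node, count_continue() and
count_from() are hypothetical names, not the kernel helpers).  A
"continue" style walk steps to the next node before visiting anything,
while a "from" style walk visits the given node first, so the caller
has to take the "next" step itself to get the same result:

```c
#include <stddef.h>

struct node {
	int id;
	struct node *next;
};

/* "continue" style: advance past pos first, then visit the rest. */
static int count_continue(const struct node *pos)
{
	int n = 0;

	for (pos = pos ? pos->next : NULL; pos; pos = pos->next)
		n++;
	return n;
}

/* "from" style: visit pos itself, then the rest. */
static int count_from(const struct node *pos)
{
	int n = 0;

	for (; pos; pos = pos->next)
		n++;
	return n;
}
```

Resuming after node a with the "from" style means passing a.next, which
is the role sk_nulls_next() plays in the seq_file iterator.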

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: 871019b22d ("net: set SOCK_RCU_FREE before inserting socket into hashtable")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-28 16:56:22 +00:00
9p 9p: Add client parameter to p9_req_put() 2022-08-17 14:24:07 +02:00
bluetooth Bluetooth: hci_sock: Correctly bounds check and pad HCI_MON_NEW_INDEX name 2023-10-25 11:59:04 +02:00
caif net: remove the caif_hsi driver 2021-07-01 13:19:48 -07:00
iucv net/af_iucv: don't track individual TX skbs for TRANS_HIPER sockets 2021-01-28 20:36:21 -08:00
netfilter netfilter: nft_redir: use struct nf_nat_range2 throughout and deduplicate eval call-backs 2023-11-20 11:08:29 +01:00
netns xfrm: fix a data-race in xfrm_gen_index() 2023-10-25 11:58:56 +02:00
nfc NFC: add NCI_UNREG flag to eliminate the race 2021-11-25 09:48:40 +01:00
phonet
sctp sctp: add a refcnt in sctp_stream_priorities to avoid a nested loop 2023-03-11 13:57:28 +01:00
tc_act net/sched: transition act_pedit to rcu and percpu stats 2023-03-11 13:57:29 +01:00
6lowpan.h 6lowpan: Replace zero-length array with flexible-array member 2020-02-28 14:51:30 +01:00
act_api.h net_sched: refactor TC action init API 2021-08-02 10:24:38 +01:00
addrconf.h ipv6/addrconf: fix a null-ptr-deref bug for ip6_ptr 2022-08-03 12:03:47 +02:00
af_ieee802154.h
af_rxrpc.h afs: Don't truncate iter during data fetch 2021-04-23 10:17:26 +01:00
af_unix.h af_unix: Add unix_stream_proto for sockmap 2021-08-16 18:43:39 -07:00
af_vsock.h vsock: each transport cycles only on its own sockets 2022-03-23 09:16:41 +01:00
ah.h
arp.h ipv4: Invalidate neighbour for broadcast address upon address addition 2022-04-13 20:59:05 +02:00
atmclip.h
ax25.h ax25: fix reference count leaks of ax25_dev 2022-04-20 09:34:22 +02:00
ax88796.h ax88796: export ax_NS8390_init() hook 2021-08-03 13:05:25 +01:00
bareudp.h bareudp: Reverted support to enable & disable rx metadata collection 2020-07-21 18:30:47 -07:00
bond_3ad.h net: bonding: Share lacpdu_mcast_addr definition 2022-09-28 11:11:48 +02:00
bond_alb.h bonding (gcc13): synchronize bond_{a,t}lb_xmit() types 2023-06-14 11:13:00 +02:00
bond_options.h Bonding: add arp_missed_max option 2023-06-05 09:21:19 +02:00
bonding.h bonding: fix macvlan over alb bond support 2023-08-30 16:18:15 +02:00
bpf_sk_storage.h bpf: struct sock is declared twice in bpf_sk_storage header 2021-03-26 17:43:55 +01:00
busy_poll.h net: Fix a data-race around sysctl_net_busy_poll. 2022-08-31 17:16:43 +02:00
calipso.h
cfg80211-wext.h
cfg80211.h wifi: cfg80211: fix sband iftype data lookup for AP_VLAN 2023-08-16 18:22:01 +02:00
cfg802154.h cfg802154: Replace zero-length array with flexible-array member 2020-02-29 14:39:08 +01:00
checksum.h net: Force inlining of checksum functions in net/checksum.h 2022-03-02 11:47:58 +01:00
cipso_ipv4.h cipso: Remove unused inline functions 2020-07-15 07:45:24 -07:00
cls_cgroup.h bpf: Allow to retrieve cgroup v1 classid from v2 hooks 2020-03-27 19:40:38 -07:00
codel_impl.h
codel_qdisc.h
codel.h
compat.h net/ipv4/ipv6: Replace one-element arraya with flexible-array members 2021-08-05 11:46:42 +01:00
datalink.h
dcbevent.h
dcbnl.h
devlink.h devlink: Use xarray to store devlink instances 2021-08-14 13:59:10 +01:00
dsa.h net: dsa: introduce helpers for iterating through ports using dp 2023-06-05 09:21:17 +02:00
dsfield.h ipv6: Annotate bitwise IPv6 dsfield pointer cast 2019-12-16 16:09:44 -08:00
dst_cache.h wireguard: device: reset peer src endpoint when netns exits 2021-12-08 09:04:46 +01:00
dst_metadata.h net: fix a memleak when uncloning an skb dst and its metadata 2022-02-16 12:56:30 +01:00
dst_ops.h net/dst: use a smaller percpu_counter batch for dst entries accounting 2020-05-08 21:33:33 -07:00
dst.h net: Remove unused inline function dst_hold_and_use() 2023-06-21 15:59:19 +02:00
erspan.h erspan: Add type I version 0 support. 2020-05-05 13:23:29 -07:00
esp.h esp: limit skb_page_frag_refill use to a single page 2022-04-27 14:38:52 +02:00
espintcp.h xfrm: espintcp: save and call old ->sk_destruct 2020-04-20 07:34:16 +02:00
ethoc.h
failover.h
fib_notifier.h ipv6: Remove old route notifications and convert listeners 2019-12-24 22:37:30 -08:00
fib_rules.h ipv6: fix memory leak in fib6_rule_suppress 2021-12-08 09:04:43 +01:00
firewire.h
flow_dissector.h net/sched: flower: fix parsing of ethertype following VLAN header 2022-04-20 09:34:09 +02:00
flow_offload.h netfilter: nf_tables: bail out early if hardware offload is not supported 2022-06-14 18:36:17 +02:00
flow.h inet: shrink struct flowi_common 2023-11-20 11:08:28 +01:00
fou.h
fq_impl.h net/fq_impl: do not maintain a backlog-sorted list of flows 2021-01-21 13:33:45 +01:00
fq.h net/fq_impl: do not maintain a backlog-sorted list of flows 2021-01-21 13:33:45 +01:00
garp.h treewide: Use sizeof_field() macro 2019-12-09 10:36:44 -08:00
gen_stats.h net_sched: extend packet counter to 64bit 2019-11-05 18:20:55 -08:00
genetlink.h mptcp: avoid lock_fast usage in accept path 2021-02-12 16:31:46 -08:00
geneve.h
gre.h ip_gre: add csum offload support for gre header 2021-01-29 20:39:14 -08:00
gro_cells.h
gro.h gro: add combined call_gro_receive() + INDIRECT_CALL_INET() helper 2021-03-18 19:51:12 -07:00
gtp.h
gue.h GUE: Fix a typo 2020-06-22 21:12:44 -07:00
hwbm.h net: hwbm: if CONFIG_NET_HWBM unset, make stub functions static 2019-10-25 16:24:32 -07:00
icmp.h ipv6: ICMPV6: add response to ICMPV6 RFC 8335 PROBE messages 2021-06-28 14:29:45 -07:00
ieee80211_radiotap.h mac80211: Use flex-array for radiotap header bitmap 2021-08-13 09:58:25 +02:00
ieee802154_netdev.h net: ieee802154: return -EINVAL for unknown addr type 2022-10-26 12:35:54 +02:00
if_inet6.h ipv6: fix locking issues with loops over idev->addr_list 2022-06-09 10:22:31 +02:00
ife.h
ila.h
inet6_connection_sock.h
inet6_hashtables.h net: allow unbound socket for packets in VRF when tcp_l3mdev_accept set 2022-08-17 14:23:36 +02:00
inet_common.h bpf: Allow rewriting to ports under ip_unprivileged_port_start 2021-01-27 18:18:15 -08:00
inet_connection_sock.h net: inet: Retire port only listening_hash 2023-11-28 16:56:22 +00:00
inet_ecn.h inet_ecn: Use csum16_add() helper for IP_ECN_set_* helpers 2020-12-14 18:38:58 -08:00
inet_frag.h inet: frags: annotate races around fqdir->dead and fqdir->high_thresh 2022-01-27 11:05:35 +01:00
inet_hashtables.h net: inet: Retire port only listening_hash 2023-11-28 16:56:22 +00:00
inet_sock.h net: allow unbound socket for packets in VRF when tcp_l3mdev_accept set 2022-08-17 14:23:36 +02:00
inet_timewait_sock.h tcp: honor SO_PRIORITY in TIME_WAIT state 2019-09-27 12:05:02 +02:00
inetpeer.h
ioam6.h ipv6: ioam: Support for IOAM injection with lwtunnels 2021-07-21 08:14:33 -07:00
ip6_checksum.h tcp: remove indirect calls for icsk->icsk_af_ops->send_check 2020-06-20 17:47:53 -07:00
ip6_fib.h net: fib: avoid warn splat in flow dissector 2023-09-19 12:22:58 +02:00
ip6_route.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-08-05 15:08:47 -07:00
ip6_tunnel.h ip_gre, ip6_gre: Fix race condition on o_seqno in collect_md mode 2022-05-09 09:14:36 +02:00
ip_fib.h ipv4/fib: send notify when delete source address routes 2023-10-25 11:59:00 +02:00
ip_tunnels.h ip_tunnels: use DEV_STATS_INC() 2023-09-19 12:22:59 +02:00
ip_vs.h ipvs: Update width of source for ip_vs_sync_conn_options 2023-05-24 17:36:46 +01:00
ip.h ipv4: ignore dst hint for multipath routes 2023-09-19 12:22:58 +02:00
ipcomp.h
ipconfig.h
ipv6_frag.h inet: frags: annotate races around fqdir->dead and fqdir->high_thresh 2022-01-27 11:05:35 +01:00
ipv6_stubs.h net: ipv6: add fib6_nh_release_dsts stub 2021-12-01 09:04:49 +01:00
ipv6.h ipv6: fix ip6_sock_set_addr_preferences() typo 2023-09-19 12:23:04 +02:00
iw_handler.h
kcm.h
l3mdev.h l3mdev: add infrastructure for table to VRF mapping 2020-06-20 17:22:22 -07:00
lag.h
lapb.h net: lapb: Make "lapb_t1timer_running" able to detect an already running timer 2021-03-23 14:14:50 -07:00
lib80211.h
llc_c_ac.h
llc_c_ev.h
llc_c_st.h
llc_conn.h llc: fix sk_buff leak in llc_conn_service() 2019-10-08 13:23:05 -07:00
llc_if.h
llc_pdu.h net: llc: fix skb_over_panic 2021-07-27 13:05:56 +01:00
llc_s_ac.h
llc_s_ev.h
llc_s_st.h
llc_sap.h
llc.h llc: fix out-of-bound array index in llc_sk_dev_hash() 2021-11-18 19:17:10 +01:00
lwtunnel.h lwt: Check LWTUNNEL_XMIT_CONTINUE strictly 2023-09-19 12:22:34 +02:00
mac80211.h mac80211: Fix Ptk0 rekey documentation 2021-09-27 12:02:54 +02:00
mac802154.h
macsec.h net: macsec: indicate next pn update when offloading 2023-10-19 23:05:34 +02:00
mctp.h mctp: unify sockaddr_mctp types 2021-10-18 13:47:09 +01:00
mctpdevice.h mctp: Remove the repeated declaration 2021-08-25 11:23:14 +01:00
mip6.h net: mip6: Replace zero-length array with flexible-array member 2020-03-02 11:16:27 -08:00
mld.h mld: add new workqueues for process mld events 2021-03-26 15:14:56 -07:00
mpls_iptunnel.h net: mpls: Replace zero-length array with flexible-array member 2020-02-28 12:08:37 -08:00
mpls.h net: Make mpls_entry_encode() available for generic users 2020-05-29 21:20:20 -07:00
mptcp.h mptcp: remove MPTCP 'ifdef' in TCP SYN cookies 2023-01-12 11:58:52 +01:00
mrp.h mrp: introduce active flags to prevent UAF when applicant uninit 2022-12-31 13:14:42 +01:00
ncsi.h
ndisc.h ipv6: fix skb drops in igmp6_event_query() and igmp6_event_report() 2022-03-08 19:12:33 +01:00
neighbour.h neighbour: delete neigh_lookup_nodev as not used 2023-06-21 15:59:19 +02:00
net_failover.h
net_namespace.h net: initialize init_net earlier 2022-04-13 20:59:03 +02:00
net_ratelimit.h
netevent.h
netlabel.h
netlink.h net: netlink: add the case when nlh is NULL 2021-07-27 11:43:50 +01:00
netprio_cgroup.h netprio: use css ID instead of cgroup ID 2019-11-12 08:18:03 -08:00
netrom.h
nexthop.h net: ipv4: Fix rtnexthop len when RTA_FLOW is present 2021-09-24 14:07:10 +01:00
nl802154.h net: ieee802154: handle iftypes as u32 2021-12-01 09:04:46 +01:00
nsh.h
p8022.h
page_pool.h page_pool: fix inconsistency for page_pool_ring_[un]lock() 2023-06-05 09:21:22 +02:00
pie.h pie: realign comment 2020-03-04 13:25:55 -08:00
ping.h
pkt_cls.h sch_htb: Fix inconsistency when leaf qdisc creation fails 2021-08-30 16:33:59 -07:00
pkt_sched.h net/sched: make psched_mtu() RTNL-less safe 2023-07-23 13:47:45 +02:00
pptp.h
protocol.h tcp/udp: Make early_demux back namespacified. 2022-11-10 18:15:38 +01:00
psample.h psample: Add a fwd declaration for skbuff 2021-08-09 15:34:21 -07:00
psnap.h
raw.h raw: Fix a data-race around sysctl_raw_l3mdev_accept. 2022-07-21 21:24:27 +02:00
rawv6.h
red.h sch_red: fix off-by-one checks in red_check_params() 2021-03-25 17:40:43 -07:00
regulatory.h net/wireless: regulatory.h: drop duplicate word in comment 2020-07-31 09:24:23 +02:00
request_sock.h tcp: bpf: Optionally store mac header in TCP_SAVE_SYN 2020-08-24 14:35:00 -07:00
rose.h
route.h ip: Fix data-races around sysctl_ip_default_ttl. 2022-07-29 17:25:09 +02:00
rpl.h ipv6: rpl: Fix Route of Death. 2023-06-14 11:13:02 +02:00
rsi_91x.h
rtnetlink.h net: validate veth and vxcan peer ifindexes 2023-08-30 16:18:14 +02:00
rtnh.h
sch_generic.h net/sched: sch_taprio: fix possible use-after-free 2023-02-01 08:27:09 +01:00
scm.h scm: fix MSG_CTRUNC setting condition for SO_PASSSEC 2023-05-11 23:00:26 +09:00
secure_seq.h secure_seq: use the 64 bits of the siphash for port offset calculation 2022-05-18 10:26:53 +02:00
seg6_hmac.h
seg6_local.h
seg6.h udp6: Use Segment Routing Header for dest address if present 2022-01-27 11:05:05 +01:00
selftests.h net: selftest: fix build issue if INET is disabled 2021-04-28 14:06:45 -07:00
slhc_vj.h
smc.h net/smc: introduce CHID callback for ISM devices 2020-09-28 15:19:03 -07:00
snmp.h net/tls: add skeleton of MIB statistics 2019-10-05 16:29:00 -07:00
sock_reuseport.h soreuseport: Fix socket selection for SO_INCOMING_CPU. 2022-12-31 13:14:07 +01:00
sock.h net: annotate data-races around sk->sk_dst_pending_confirm 2023-11-28 16:56:16 +00:00
Space.h wan: remove sbni/granch driver 2021-08-03 13:05:26 +01:00
stp.h
strparser.h bpf, sockmap: sk_skb data_end access incorrect when src_reg = dst_reg 2021-11-18 19:17:11 +01:00
switchdev.h net: make switchdev_bridge_port_{,unoffload} loosely coupled with the bridge 2021-08-04 12:35:07 +01:00
tcp_states.h
tcp.h tcp: fix cookie_init_timestamp() overflows 2023-11-20 11:08:16 +01:00
timewait_sock.h
tipc.h
tls_toe.h net/tls: rename tls_hw_* functions tls_toe_* 2019-10-04 14:07:07 -07:00
tls.h net/tls: Multi-threaded calls to TX tls_dev_del 2023-08-26 14:23:22 +02:00
transp_v6.h tcp: move ipv4_specific to tcp include file 2020-06-23 20:10:15 -07:00
tso.h net: tso: cache transport header length 2020-06-18 20:46:23 -07:00
tun_proto.h
udp_tunnel.h rxrpc: Fix ICMP/ICMP6 error handling 2022-09-15 11:30:05 +02:00
udp.h tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct(). 2023-04-26 13:51:54 +02:00
udplite.h tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct(). 2023-04-26 13:51:54 +02:00
vsock_addr.h vsock: remove include/linux/vm_sockets.h file 2019-11-14 18:12:17 -08:00
vxlan.h vxlan: Fix nexthop hash size 2023-08-11 15:13:54 +02:00
wext.h
x25.h net/x25: add new state X25_STATE_5 2019-12-09 10:28:43 -08:00
x25device.h
xdp_priv.h page_pool: do not release pool until inflight == 0. 2019-11-16 12:39:10 -08:00
xdp_sock_drv.h i40e: xsk: Move tmp desc array from driver to pool 2022-06-14 18:36:18 +02:00
xdp_sock.h xdp: Add proper __rcu annotations to redirect map entries 2021-06-24 19:41:15 +02:00
xdp.h xdp: Allow registering memory model without rxq reference 2023-06-05 09:21:21 +02:00
xfrm.h xfrm: Treat already-verified secpath entries as optional 2023-06-28 10:29:45 +02:00
xsk_buff_pool.h xsk: Fix unaligned descriptor validation 2023-05-11 23:00:27 +09:00