linux/include/net
Joanne Koong 28044fc1d4 net: Add a bhash2 table hashed by port and address
The current bind hashtable (bhash) is hashed by port only.
In the socket bind path, we have to check for bind conflicts by
traversing the specified port's inet_bind_bucket while holding the
hashbucket's spinlock (see inet_csk_get_port() and
inet_csk_bind_conflict()). In instances where there are tons of
sockets hashed to the same port at different addresses, the bind
conflict check is time-intensive and can cause softirq cpu lockups,
as well as stops new tcp connections since __inet_inherit_port()
also contests for the spinlock.

This patch adds a second bind table, bhash2, that hashes by
port and sk->sk_rcv_saddr (ipv4) and sk->sk_v6_rcv_saddr (ipv6).
Searching the bhash2 table leads to significantly faster conflict
resolution and less time holding the hashbucket spinlock.

Please note a few things:
* There can be the case where the a socket's address changes after it
has been bound. There are two cases where this happens:

  1) The case where there is a bind() call on INADDR_ANY (ipv4) or
  IPV6_ADDR_ANY (ipv6) and then a connect() call. The kernel will
  assign the socket an address when it handles the connect()

  2) In inet_sk_reselect_saddr(), which is called when rebuilding the
  sk header and a few pre-conditions are met (eg rerouting fails).

In these two cases, we need to update the bhash2 table by removing the
entry for the old address, and add a new entry reflecting the updated
address.

* The bhash2 table must have its own lock, even though concurrent
accesses on the same port are protected by the bhash lock. Bhash2 must
have its own lock to protect against cases where sockets on different
ports hash to different bhash hashbuckets but to the same bhash2
hashbucket.

This brings up a few stipulations:
  1) When acquiring both the bhash and the bhash2 lock, the bhash2 lock
  will always be acquired after the bhash lock and released before the
  bhash lock is released.

  2) There are no nested bhash2 hashbucket locks. A bhash2 lock is always
  acquired+released before another bhash2 lock is acquired+released.

* The bhash table cannot be superseded by the bhash2 table because for
bind requests on INADDR_ANY (ipv4) or IPV6_ADDR_ANY (ipv6), every socket
bound to that port must be checked for a potential conflict. The bhash
table is the only source of port->socket associations.

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-24 19:30:07 -07:00
..
9p 9p: Add client parameter to p9_req_put() 2022-07-09 14:38:35 +09:00
bluetooth Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-07-28 18:21:16 -07:00
caif net: remove the caif_hsi driver 2021-07-01 13:19:48 -07:00
iucv net/af_iucv: Use struct_group() to zero struct iucv_sock region 2021-11-19 11:52:25 +00:00
netfilter Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next 2022-08-17 20:29:36 -07:00
netns Remove DECnet support from kernel 2022-08-22 14:26:30 +01:00
nfc NFC: add NCI_UNREG flag to eliminate the race 2021-11-17 20:17:05 -08:00
phonet net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
sctp net: remove noblock parameter from recvmsg() entities 2022-04-12 15:00:25 +02:00
tc_act Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-05-12 16:15:30 -07:00
6lowpan.h
act_api.h net/sched: act_api: Add extack to offload_act_setup() callback 2022-04-08 13:45:43 +01:00
addrconf.h ipv6/addrconf: fix a null-ptr-deref bug for ip6_ptr 2022-07-28 10:42:44 -07:00
af_ieee802154.h
af_rxrpc.h afs: Don't truncate iter during data fetch 2021-04-23 10:17:26 +01:00
af_unix.h af_unix: Remove unix_table_locks. 2022-06-22 12:59:43 +01:00
af_vsock.h vsock: add API call for data ready 2022-08-23 10:43:11 +02:00
ah.h
amt.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
arp.h ipv4: Invalidate neighbour for broadcast address upon address addition 2022-02-21 11:44:30 +00:00
atmclip.h
ax25.h ax25: fix incorrect dev_tracker usage 2022-07-28 22:06:15 -07:00
ax88796.h ax88796: Fix some typo in a comment 2022-08-09 22:14:02 -07:00
bareudp.h bareudp: Move definition of struct bareudp_conf to bareudp.c 2021-12-13 12:34:09 +00:00
bond_3ad.h bonding: fix data-races around agg_select_timer 2022-02-15 14:35:18 +00:00
bond_alb.h bonding: make tx_rebalance_counter an atomic 2021-12-03 14:16:48 +00:00
bond_options.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
bonding.h net: bonding: replace dev_trans_start() with the jiffies of the last ARP/NS 2022-08-03 19:20:12 -07:00
bpf_sk_storage.h bpf: struct sock is declared twice in bpf_sk_storage header 2021-03-26 17:43:55 +01:00
busy_poll.h tcp: fix another uninit-value (sk_rx_queue_mapping) 2021-12-03 14:15:49 +00:00
calipso.h
cfg80211-wext.h
cfg80211.h wifi: nl80211: add MLO link ID to the NL80211_CMD_FRAME TX API 2022-07-22 14:28:33 +02:00
cfg802154.h net: wrap the wireless pointers in struct net_device in an ifdef 2022-05-22 21:51:54 +01:00
checksum.h powerpc/net: Implement powerpc specific csum_shift() to remove branch 2022-03-11 10:57:22 +00:00
cipso_ipv4.h cipso: Remove unused inline functions 2020-07-15 07:45:24 -07:00
cls_cgroup.h
codel_impl.h codel: remove unnecessary sock.h include 2021-12-22 15:03:47 -08:00
codel_qdisc.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
codel.h codel: remove unnecessary pkt_sched.h include 2021-12-22 15:03:51 -08:00
compat.h net: copy from user before calling __get_compat_msghdr 2022-07-24 18:39:17 -06:00
datalink.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
dcbevent.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
dcbnl.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
devlink.h devlink: introduce framework for selftests 2022-07-28 21:56:53 -07:00
dropreason.h net: dropreason: reformat the comment fo skb drop reasons 2022-06-07 12:51:41 +02:00
dsa.h net: dsa: use dsa_tree_for_each_cpu_port in dsa_tree_{setup,teardown}_master 2022-08-23 11:39:22 +02:00
dsfield.h
dst_cache.h wireguard: device: reset peer src endpoint when netns exits 2021-11-29 19:50:45 -08:00
dst_metadata.h net: fix a memleak when uncloning an skb dst and its metadata 2022-02-09 11:41:47 +00:00
dst_ops.h
dst.h net: dst: add net device refcount tracking to dst_entry 2021-12-06 16:05:10 -08:00
erspan.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
esp.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
espintcp.h
ethoc.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
failover.h net: failover: add net device refcount tracker 2021-12-06 16:06:02 -08:00
fib_notifier.h
fib_rules.h fib: expand fib_rule_policy 2021-12-16 07:18:35 -08:00
firewire.h firewire: net: Make use of get_unaligned_be48(), put_unaligned_be48() 2022-07-28 22:21:54 -07:00
flow_dissector.h flow_dissector: Add PPPoE dissectors 2022-07-26 09:49:12 -07:00
flow_offload.h flow_offload: Introduce flow_match_pppoe 2022-07-26 10:49:27 -07:00
flow.h net: Add l3mdev index to flow struct and avoid oif reset for port devices 2022-03-15 20:20:02 -07:00
fou.h
fq_impl.h net/fq_impl: Use the bitmap API to allocate bitmaps 2022-07-11 19:49:38 -07:00
fq.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
garp.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
gen_stats.h net: sched: Remove Qdisc::running sequence counter 2021-10-18 12:54:41 +01:00
genetlink.h net: add missing kdoc for struct genl_multicast_group::flags 2022-08-11 09:26:04 -07:00
geneve.h
gre.h ip_gre: add csum offload support for gre header 2021-01-29 20:39:14 -08:00
gro_cells.h
gro.h net: gro: Fix a 'directive in macro's argument list' sparse warning 2022-02-18 11:00:25 +00:00
gtp.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
gue.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
hwbm.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
icmp.h ipv6: ICMPV6: add response to ICMPV6 RFC 8335 PROBE messages 2021-06-28 14:29:45 -07:00
ieee80211_radiotap.h ieee80211: radiotap: fix -Wcast-qual warnings 2022-02-04 16:25:21 +01:00
ieee802154_netdev.h
if_inet6.h ipv6: fix locking issues with loops over idev->addr_list 2022-04-06 22:09:39 -07:00
ife.h
ila.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
inet6_connection_sock.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
inet6_hashtables.h net: allow unbound socket for packets in VRF when tcp_l3mdev_accept set 2022-07-29 11:58:54 +01:00
inet_common.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
inet_connection_sock.h net: Add a bhash2 table hashed by port and address 2022-08-24 19:30:07 -07:00
inet_dscp.h ipv6: Define dscp_t and stop taking ECN bits into account in fib6-rules 2022-02-07 20:12:45 -08:00
inet_ecn.h net: add skb_get_dsfield() helper 2021-10-15 11:33:08 +01:00
inet_frag.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
inet_hashtables.h net: Add a bhash2 table hashed by port and address 2022-08-24 19:30:07 -07:00
inet_sock.h net: allow unbound socket for packets in VRF when tcp_l3mdev_accept set 2022-07-29 11:58:54 +01:00
inet_timewait_sock.h Revert "tcp/dccp: get rid of inet_twsk_purge()" 2022-05-13 12:24:12 +01:00
inetpeer.h
ioam6.h treewide: Replace zero-length arrays with flexible-array members 2022-02-17 07:00:39 -06:00
ip6_checksum.h net: move gro definitions to include/net/gro.h 2021-11-16 13:16:54 +00:00
ip6_fib.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-02-17 11:44:20 -08:00
ip6_route.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
ip6_tunnel.h ip_gre, ip6_gre: Fix race condition on o_seqno in collect_md mode 2022-04-25 11:40:45 +01:00
ip_fib.h ipv4: Use dscp_t in struct fib_entry_notifier_info 2022-04-11 17:37:50 -07:00
ip_tunnels.h ip_tunnels: Add new flow flags field to ip_tunnel_key 2022-07-26 12:43:16 +02:00
ip_vs.h ipvs: add sysctl_run_estimation to support disable estimation 2021-10-07 19:52:58 +02:00
ip.h ip: Fix data-races around sysctl_ip_prot_sock. 2022-07-20 10:14:49 +01:00
ipcomp.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
ipconfig.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
ipv6_frag.h net: don't include ndisc.h from ipv6.h 2022-02-04 14:15:11 -08:00
ipv6_stubs.h net: ipv6: add fib6_nh_release_dsts stub 2021-11-22 15:44:49 +00:00
ipv6.h ipv6: Fix signed integer overflow in __ip6_append_data 2022-06-08 10:56:43 -07:00
iw_handler.h
kcm.h
l3mdev.h
lag.h
lapb.h net: lapb: Make "lapb_t1timer_running" able to detect an already running timer 2021-03-23 14:14:50 -07:00
lib80211.h
llc_c_ac.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
llc_c_ev.h
llc_c_st.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
llc_conn.h llc: add net device refcount tracker 2021-12-07 20:44:59 -08:00
llc_if.h llc/snap: constify dev_addr passing 2021-10-13 09:40:46 -07:00
llc_pdu.h net: llc: fix skb_over_panic 2021-07-27 13:05:56 +01:00
llc_s_ac.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
llc_s_ev.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
llc_s_st.h add missing includes and forward declarations to networking includes under linux/ 2022-07-28 11:29:36 +02:00
llc_sap.h
llc.h llc: fix out-of-bound array index in llc_sk_dev_hash() 2021-11-07 19:25:29 +00:00
lwtunnel.h netfilter: add netfilter hooks to SRv6 data plane 2021-08-30 01:51:36 +02:00
mac80211.h wifi: mac80211: add macros to loop over active links 2022-07-22 14:28:47 +02:00
mac802154.h net: mac802154: Create an error helper for asynchronous offloading errors 2022-04-25 20:51:12 +02:00
macsec.h net: macsec: Expose MACSEC_SALT_LEN definition to user space 2022-08-18 20:37:35 -07:00
mctp.h mctp: Use output netdev to allocate skb headroom 2022-04-01 12:04:15 +01:00
mctpdevice.h mctp: Pass flow data & flow release events to drivers 2021-10-29 13:23:51 +01:00
mip6.h
mld.h mld: add new workqueues for process mld events 2021-03-26 15:14:56 -07:00
mpls_iptunnel.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
mpls.h
mptcp.h mptcp, btf: Add struct mptcp_sock definition when CONFIG_MPTCP is disabled 2022-08-08 15:30:45 +02:00
mrp.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
ncsi.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
ndisc.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-03-03 11:55:12 -08:00
neighbour.h neighbour: make proxy_queue.qlen limit per-device 2022-08-15 11:25:09 +01:00
net_debug.h net: add CONFIG_DEBUG_NET 2022-05-11 12:43:10 +01:00
net_failover.h
net_namespace.h netfilter: nf_flow_table: count pending offload workqueue tasks 2022-07-11 16:25:14 +02:00
net_ratelimit.h
net_trackers.h net: add networking namespace refcount tracker 2021-12-10 06:38:26 -08:00
netevent.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
netlabel.h
netlink.h netlink: fix some kernel-doc comments 2022-08-24 18:46:16 -07:00
netprio_cgroup.h
netrom.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
nexthop.h net: ipv4: Fix rtnexthop len when RTA_FLOW is present 2021-09-24 14:07:10 +01:00
nl802154.h net: ieee802154: handle iftypes as u32 2021-11-16 18:02:46 +01:00
nsh.h
p8022.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
page_pool.h net: page_pool: introduce ethtool stats 2022-04-15 10:43:47 +01:00
pie.h
ping.h net: remove noblock parameter from recvmsg() entities 2022-04-12 15:00:25 +02:00
pkt_cls.h net/sched: remove return value of unregister_tcf_proto_ops 2022-07-13 14:46:59 +01:00
pkt_sched.h net: sched: remove the unused return value of unregister_qdisc 2022-08-16 19:37:06 -07:00
pptp.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
protocol.h tcp/udp: Make early_demux back namespacified. 2022-07-15 18:50:35 -07:00
psample.h psample: Add a fwd declaration for skbuff 2021-08-09 15:34:21 -07:00
psnap.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
raw.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-07-14 15:27:35 -07:00
rawv6.h raw: convert raw sockets to RCU 2022-06-19 10:00:02 +01:00
red.h sch_red: fix off-by-one checks in red_check_params() 2021-03-25 17:40:43 -07:00
regulatory.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
request_sock.h tcp: Use BPF timeout setting for SYN ACK RTO 2022-02-02 14:45:18 +00:00
rose.h net: rose: add netdev ref tracker to 'struct rose_sock' 2022-08-01 11:59:23 -07:00
route.h Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next 2022-07-25 13:25:39 +01:00
rpl.h
rsi_91x.h
rtnetlink.h net: rtnetlink: add bulk delete support flag 2022-04-13 12:46:26 +01:00
rtnh.h
sch_generic.h net/sched: remove qdisc_root_lock() helper 2022-07-19 17:14:55 -07:00
scm.h
secure_seq.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
seg6_hmac.h
seg6_local.h
seg6.h udp6: Use Segment Routing Header for dest address if present 2022-01-04 12:17:35 +00:00
selftests.h net: selftest: fix build issue if INET is disabled 2021-04-28 14:06:45 -07:00
slhc_vj.h
smc.h net/smc: Pass on DMBE bit mask in IRQ handler 2022-07-27 13:24:42 +01:00
snmp.h
sock_reuseport.h tcp: Add reuseport_migrate_sock() to select a new listener. 2021-06-15 18:01:05 +02:00
sock.h net: Add a bhash2 table hashed by port and address 2022-08-24 19:30:07 -07:00
Space.h wan: remove sbni/granch driver 2021-08-03 13:05:26 +01:00
stp.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
strparser.h tls: rx: remove the message decrypted tracking 2022-07-18 11:24:10 +01:00
switchdev.h net: switchdev: add reminder near struct switchdev_notifier_fdb_info 2022-06-29 20:37:36 -07:00
tcp_states.h
tcp.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-07-28 18:21:16 -07:00
timewait_sock.h
tipc.h
tls_toe.h
tls.h net/tls: Use RCU API to access tls_ctx->netdev 2022-08-10 22:58:43 -07:00
transp_v6.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
tso.h
tun_proto.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
udp_tunnel.h udp: call udp_encap_enable for v6 sockets when enabling encap 2021-02-04 18:37:14 -08:00
udp.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-07-21 13:03:39 -07:00
udplite.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
vsock_addr.h
vxlan.h drivers: vxlan: vnifilter: per vni stats 2022-03-01 08:38:02 +00:00
wext.h
x25.h
x25device.h
xdp_priv.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
xdp_sock_drv.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-08-03 09:04:55 +02:00
xdp_sock.h net: Don't include filter.h from net/sock.h 2021-12-29 08:48:14 -08:00
xdp.h net: veth: Account total xdp_frame len running ndo_xdp_xmit 2022-03-17 20:33:52 +01:00
xfrm.h Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next 2022-07-25 13:25:39 +01:00
xsk_buff_pool.h xsk: Fix possible crash when multiple sockets are created 2022-04-26 16:19:54 +02:00