linux/include/net
Arjun Roy 7a7f094635 tcp: Use per-vma locking for receive zerocopy
Per-VMA locking allows us to lock a struct vm_area_struct without
taking the process-wide mmap lock in read mode.

Consider a process workload where the mmap lock is taken constantly in
write mode. In this scenario, all zerocopy receives are periodically
blocked during that period of time - though in principle, the memory
ranges being used by TCP are not touched by the operations that need
the mmap write lock. This results in performance degradation.

Now consider another workload where the mmap lock is never taken in
write mode, but there are many TCP connections using receive zerocopy
that are concurrently receiving. These connections all take the mmap
lock in read mode, but this does induce a lot of contention and atomic
ops for this process-wide lock. This results in additional CPU
overhead caused by contending on the cache line for this lock.

However, with per-vma locking, both of these problems can be avoided.

As a test, I ran an RPC-style request/response workload with 4KB
payloads and receive zerocopy enabled, with 100 simultaneous TCP
connections. I measured perf cycles within the
find_tcp_vma/mmap_read_lock/mmap_read_unlock codepath, with and
without per-vma locking enabled.

When using process-wide mmap semaphore read locking, about 1% of
measured perf cycles were within this path. With per-VMA locking, this
value dropped to about 0.45%.

Signed-off-by: Arjun Roy <arjunroy@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-18 11:16:00 +01:00
..
9p 9p: Add additional debug flags and open modes 2023-03-27 02:33:48 +00:00
bluetooth Bluetooth: ISO: use correct CIS order in Set CIG Parameters event 2023-06-05 17:14:07 -07:00
caif net: remove the caif_hsi driver 2021-07-01 13:19:48 -07:00
iucv net/af_iucv: Use struct_group() to zero struct iucv_sock region 2021-11-19 11:52:25 +00:00
mana net: mana: Fix perf regression: remove rx_cqes, tx_cqes counters 2023-05-30 12:05:22 +02:00
netfilter Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2023-06-15 22:19:41 -07:00
netns tcp: enforce receive buffer memory limits by allowing the tcp window to shrink 2023-06-17 09:53:53 +01:00
nfc NFC: add NCI_UNREG flag to eliminate the race 2021-11-17 20:17:05 -08:00
phonet net: ioctl: Use kernel memory on protocol ioctl callbacks 2023-06-15 22:33:26 -07:00
sctp sctp: delete the nested flexible array peer_init 2023-04-21 08:19:30 +01:00
tc_act net/sched: act_connmark: transition to percpu stats and rcu 2023-02-16 10:39:28 +01:00
6lowpan.h
act_api.h net/sched: Rename user cookie and act cookie 2023-02-20 16:46:10 -08:00
addrconf.h ipv6: constify inet6_mc_check() 2023-03-17 08:56:37 +00:00
af_ieee802154.h
af_rxrpc.h rxrpc: Fix timeout of a call that hasn't yet been granted a channel 2023-05-01 07:43:19 +01:00
af_unix.h af_unix: preserve const qualifier in unix_sk() 2023-03-18 12:23:33 +00:00
af_vsock.h vsock: support sockmap 2023-03-29 08:19:38 +01:00
ah.h
amt.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
arp.h neighbour: switch to standard rcu, instead of rcu_bh 2023-03-21 21:32:18 -07:00
atmclip.h
ax25.h x25: preserve const qualifier in [a]x25_sk() 2023-03-18 12:23:34 +00:00
ax88796.h ax88796: Fix some typo in a comment 2022-08-09 22:14:02 -07:00
bareudp.h bareudp: Move definition of struct bareudp_conf to bareudp.c 2021-12-13 12:34:09 +00:00
bond_3ad.h net: bonding: Share lacpdu_mcast_addr definition 2022-09-16 14:34:01 +01:00
bond_alb.h bonding (gcc13): synchronize bond_{a,t}lb_xmit() types 2022-11-02 20:38:13 -07:00
bond_options.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
bonding.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2023-05-25 19:57:39 -07:00
bpf_sk_storage.h bpf: struct sock is declared twice in bpf_sk_storage header 2021-03-26 17:43:55 +01:00
busy_poll.h net: Fix a data-race around sysctl_net_busy_poll. 2022-08-24 13:46:58 +01:00
calipso.h
cfg80211-wext.h wifi: cfg80211: Avoid clashing function prototypes 2022-11-16 11:31:47 +02:00
cfg80211.h wifi: cfg80211: add a work abstraction with special semantics 2023-06-07 19:53:15 +02:00
cfg802154.h ieee802154: Add support for user beaconing requests 2023-01-28 13:51:22 +01:00
checksum.h net: checksum: drop the linux/uaccess.h include 2023-01-27 11:19:46 +00:00
cipso_ipv4.h cipso: Remove unused inline functions 2020-07-15 07:45:24 -07:00
cls_cgroup.h
codel_impl.h codel: remove unnecessary sock.h include 2021-12-22 15:03:47 -08:00
codel_qdisc.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
codel.h codel: remove unnecessary pkt_sched.h include 2021-12-22 15:03:51 -08:00
compat.h net: copy from user before calling __get_compat_msghdr 2022-07-24 18:39:17 -06:00
datalink.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
dcbevent.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
dcbnl.h net: dcb: add helper functions to retrieve PCP and DSCP rewrite maps 2023-01-20 09:33:22 +00:00
devlink.h devlink: bring port new reply back 2023-06-01 21:37:32 -07:00
dropreason-core.h net: extend drop reasons for multiple subsystems 2023-04-20 20:20:49 -07:00
dropreason.h mac80211: use the new drop reasons infrastructure 2023-04-20 20:20:49 -07:00
dsa_stubs.h net: dsa: replace NETDEV_PRE_CHANGE_HWTSTAMP notifier with a stub 2023-04-09 15:35:49 +01:00
dsa.h net: dsa: add support for mac_prepare() and mac_finish() calls 2023-05-26 10:39:40 +01:00
dsfield.h
dst_cache.h wireguard: device: reset peer src endpoint when netns exits 2021-11-29 19:50:45 -08:00
dst_metadata.h xfrm: interface: Add unstable helpers for setting/getting XFRM metadata from TC-BPF 2022-12-05 21:58:27 -08:00
dst_ops.h ipv6: remove max_size check inline with ipv4 2023-01-13 20:59:14 -08:00
dst.h net: dst: Switch to rcuref_t reference counting 2023-03-28 18:52:28 -07:00
erspan.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
esp.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
espintcp.h
ethoc.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
failover.h net: failover: add net device refcount tracker 2021-12-06 16:06:02 -08:00
fib_notifier.h
fib_rules.h fib: expand fib_rule_policy 2021-12-16 07:18:35 -08:00
firewire.h firewire: net: Make use of get_unaligned_be48(), put_unaligned_be48() 2022-07-28 22:21:54 -07:00
flow_dissector.h net: flow_dissector: add support for cfm packets 2023-06-12 17:01:45 -07:00
flow_offload.h net/sched: cls_api: Support hardware miss to tc action 2023-02-20 16:46:10 -08:00
flow.h ipv4: Drop tos parameter from flowi4_update_output() 2023-06-02 10:52:38 +01:00
fou.h bpf,fou: Add bpf_skb_{set,get}_fou_encap kfuncs 2023-04-12 16:40:39 -07:00
fq_impl.h wifi: mac80211: add support for restricting netdev features per vif 2022-12-01 15:09:10 +01:00
fq.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
garp.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
gen_stats.h net: sched: Remove Qdisc::running sequence counter 2021-10-18 12:54:41 +01:00
genetlink.h mptcp: more detailed error reporting on endpoint creation 2022-11-21 13:09:07 +00:00
geneve.h net: geneve: fix array of flexible structures warnings 2022-10-31 10:43:04 +00:00
gre.h ip_gre: add csum offload support for gre header 2021-01-29 20:39:14 -08:00
gro_cells.h
gro.h net: move gso declarations and functions to their own files 2023-06-10 00:11:41 -07:00
gso.h net: move gso declarations and functions to their own files 2023-06-10 00:11:41 -07:00
gtp.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
gue.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
handshake.h net/handshake: Enable the SNI extension to work properly 2023-05-24 22:05:24 -07:00
hwbm.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
icmp.h ipv6: ICMPV6: add response to ICMPV6 RFC 8335 PROBE messages 2021-06-28 14:29:45 -07:00
ieee80211_radiotap.h wifi: radiotap: separate vendor TLV into header/content 2023-03-07 20:11:36 +01:00
ieee802154_netdev.h mac802154: Handle basic beaconing 2023-01-28 13:55:10 +01:00
if_inet6.h ipv6: fix locking issues with loops over idev->addr_list 2022-04-06 22:09:39 -07:00
ife.h
ila.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
inet6_connection_sock.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
inet6_hashtables.h net: allow unbound socket for packets in VRF when tcp_l3mdev_accept set 2022-07-29 11:58:54 +01:00
inet_common.h ipv4, ipv6: Use splice_eof() to flush 2023-06-08 19:40:30 -07:00
inet_connection_sock.h net: Add a bhash2 table hashed by port and address 2022-08-24 19:30:07 -07:00
inet_dscp.h ipv6: Define dscp_t and stop taking ECN bits into account in fib6-rules 2022-02-07 20:12:45 -08:00
inet_ecn.h net: add skb_get_dsfield() helper 2021-10-15 11:33:08 +01:00
inet_frag.h Revert "net: Remove low_thresh in ip defrag" 2023-05-16 20:46:30 -07:00
inet_hashtables.h tcp: Add TIME_WAIT sockets in bhash2. 2022-12-30 07:25:52 +00:00
inet_sock.h inet: preserve const qualifier in inet_sk() 2023-03-17 08:56:37 +00:00
inet_timewait_sock.h tcp: Add TIME_WAIT sockets in bhash2. 2022-12-30 07:25:52 +00:00
inetpeer.h
ioam6.h treewide: Replace zero-length arrays with flexible-array members 2022-02-17 07:00:39 -06:00
ip6_checksum.h net: move gro definitions to include/net/gro.h 2021-11-16 13:16:54 +00:00
ip6_fib.h ipv6: Remove in6addr_any alternatives. 2023-03-29 08:22:52 +01:00
ip6_route.h net: dst: Prevent false sharing vs. dst_entry:: __refcnt 2023-03-28 18:52:22 -07:00
ip6_tunnel.h ip_gre, ip6_gre: Fix race condition on o_seqno in collect_md mode 2022-04-25 11:40:45 +01:00
ip_fib.h ipv4: Use dscp_t in struct fib_entry_notifier_info 2022-04-11 17:37:50 -07:00
ip_tunnels.h bpf-next-for-netdev 2023-04-13 16:43:38 -07:00
ip_vs.h ipvs: Correct spelling in comments 2023-04-22 01:39:41 +02:00
ip.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2023-05-25 19:57:39 -07:00
ipcomp.h xfrm: ipcomp: add extack to ipcomp{4,6}_init_state 2022-09-29 07:18:00 +02:00
ipconfig.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
ipv6_frag.h net: dropreason: add SKB_DROP_REASON_FRAG_REASM_TIMEOUT 2022-10-31 20:14:27 -07:00
ipv6_stubs.h bpf: Change bpf_getsockopt(SOL_IPV6) to reuse do_ipv6_getsockopt() 2022-09-02 20:34:32 -07:00
ipv6.h ipv6: icmp6: add drop reason support to icmpv6_notify() 2023-02-13 19:55:32 -08:00
iw_handler.h
kcm.h kcm: Send multiple frags in one sendmsg() 2023-06-12 21:13:23 -07:00
l3mdev.h l3mdev: add infrastructure for table to VRF mapping 2020-06-20 17:22:22 -07:00
lag.h
lapb.h net: lapb: Make "lapb_t1timer_running" able to detect an already running timer 2021-03-23 14:14:50 -07:00
lib80211.h
llc_c_ac.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
llc_c_ev.h
llc_c_st.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
llc_conn.h llc: add net device refcount tracker 2021-12-07 20:44:59 -08:00
llc_if.h llc/snap: constify dev_addr passing 2021-10-13 09:40:46 -07:00
llc_pdu.h net: llc: fix skb_over_panic 2021-07-27 13:05:56 +01:00
llc_s_ac.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
llc_s_ev.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
llc_s_st.h add missing includes and forward declarations to networking includes under linux/ 2022-07-28 11:29:36 +02:00
llc_sap.h
llc.h llc: fix out-of-bound array index in llc_sk_dev_hash() 2021-11-07 19:25:29 +00:00
lwtunnel.h netfilter: add netfilter hooks to SRv6 data plane 2021-08-30 01:51:36 +02:00
mac80211.h wifi: mac80211: provide a helper to fetch the medium synchronization delay 2023-06-06 14:15:16 +02:00
mac802154.h mac802154: Drop IEEE802154_HW_RX_DROP_BAD_CKSUM 2022-10-12 12:57:19 +02:00
macsec.h macsec: Use helper macsec_netdev_priv for offload drivers 2023-05-10 11:32:09 +01:00
mctp.h mctp: Use output netdev to allocate skb headroom 2022-04-01 12:04:15 +01:00
mctpdevice.h mctp: Pass flow data & flow release events to drivers 2021-10-29 13:23:51 +01:00
mip6.h
mld.h mld: add new workqueues for process mld events 2021-03-26 15:14:56 -07:00
mpls_iptunnel.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
mpls.h
mptcp.h mptcp: remove MPTCP 'ifdef' in TCP SYN cookies 2022-12-12 13:11:24 -08:00
mrp.h mrp: introduce active flags to prevent UAF when applicant uninit 2022-11-18 12:14:55 +00:00
ncsi.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
ndisc.h neighbour: switch to standard rcu, instead of rcu_bh 2023-03-21 21:32:18 -07:00
neighbour.h neighbour: fix unaligned access to pneigh_entry 2023-06-01 21:36:37 -07:00
net_debug.h net: add CONFIG_DEBUG_NET 2022-05-11 12:43:10 +01:00
net_failover.h
net_namespace.h net: add a refcount tracker for kernel sockets 2022-10-24 11:04:43 +01:00
net_ratelimit.h
net_trackers.h net: add networking namespace refcount tracker 2021-12-10 06:38:26 -08:00
netdev_queues.h net: add macro netif_subqueue_completed_wake 2023-04-18 12:59:01 +02:00
netevent.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
netlabel.h
netlink.h net: netlink: recommend policy range validation 2023-01-28 00:33:51 -08:00
netprio_cgroup.h
netrom.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
nexthop.h ipv6: remove nexthop_fib6_nh_bh() 2023-05-11 18:07:05 -07:00
nl802154.h ieee802154: Add support for user beaconing requests 2023-01-28 13:51:22 +01:00
nsh.h
p8022.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
page_pool.h page_pool: fix inconsistency for page_pool_ring_[un]lock() 2023-05-23 20:25:13 -07:00
pie.h
ping.h net/ipv4: ping_group_range: allow GID from 2147483648 to 4294967294 2023-06-02 09:55:22 +01:00
pkt_cls.h sch_htb: Allow HTB priority parameter in offload mode 2023-05-15 09:31:07 +01:00
pkt_sched.h net/sched: taprio: report class offload stats per TXQ, not per TC 2023-06-12 09:43:30 +01:00
pptp.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
protocol.h tcp/udp: Make early_demux back namespacified. 2022-07-15 18:50:35 -07:00
psample.h psample: Add a fwd declaration for skbuff 2021-08-09 15:34:21 -07:00
psnap.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
raw.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2023-04-06 12:01:20 -07:00
rawv6.h ipv6: raw: constify raw_v6_match() socket argument 2023-03-17 08:56:37 +00:00
red.h treewide: use get_random_u32() when possible 2022-10-11 17:42:58 -06:00
regulatory.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
request_sock.h tcp: Use BPF timeout setting for SYN ACK RTO 2022-02-02 14:45:18 +00:00
rose.h net: rose: add netdev ref tracker to 'struct rose_sock' 2022-08-01 11:59:23 -07:00
route.h ipv4: Drop tos parameter from flowi4_update_output() 2023-06-02 10:52:38 +01:00
rpl.h ipv6: rpl: Fix Route of Death. 2023-06-06 20:59:08 -07:00
rsi_91x.h
rtnetlink.h rtnetlink: Honour NLM_F_ECHO flag in rtnl_delete_link 2022-10-31 18:10:21 -07:00
rtnh.h
sch_generic.h net: sched: Remove unused qdisc_l2t() 2023-06-17 00:17:42 -07:00
scm.h scm: add SO_PASSPIDFD and SCM_PIDFD 2023-06-12 10:45:49 +01:00
secure_seq.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
seg6_hmac.h
seg6_local.h
seg6.h udp6: Use Segment Routing Header for dest address if present 2022-01-04 12:17:35 +00:00
selftests.h net: selftest: fix build issue if INET is disabled 2021-04-28 14:06:45 -07:00
slhc_vj.h
smc.h net/smc: Introduce explicit check for v2 support 2023-03-15 08:18:35 +00:00
snmp.h
sock_reuseport.h soreuseport: Fix socket selection for SO_INCOMING_CPU. 2022-10-25 11:35:16 +02:00
sock.h net: ioctl: Use kernel memory on protocol ioctl callbacks 2023-06-15 22:33:26 -07:00
Space.h wan: remove sbni/granch driver 2021-08-03 13:05:26 +01:00
stp.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
strparser.h tls: rx: remove the message decrypted tracking 2022-07-18 11:24:10 +01:00
switchdev.h bridge: switchdev: Allow device drivers to install locked FDB entries 2022-11-09 19:06:13 -08:00
tc_wrapper.h net/sched: Retire rsvp classifier 2023-02-16 09:27:07 +01:00
tcp_states.h
tcp.h tcp: Use per-vma locking for receive zerocopy 2023-06-18 11:16:00 +01:00
timewait_sock.h
tipc.h
tls_toe.h
tls.h net: tls: make the offload check helper take skb not socket 2023-06-15 09:01:05 +01:00
transp_v6.h inet6: Remove inet6_destroy_sock(). 2022-10-24 09:40:39 +01:00
tso.h net: tso: inline tso_count_descs() 2022-12-12 15:04:39 -08:00
tun_proto.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
udp_tunnel.h net: Change the udp encap_err_rcv to allow use of {ip,ipv6}_icmp_error() 2022-11-08 16:42:28 +00:00
udp.h net: ioctl: Use kernel memory on protocol ioctl callbacks 2023-06-15 22:33:26 -07:00
udplite.h tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct(). 2022-10-12 17:50:37 -07:00
vsock_addr.h
vxlan.h net: vxlan: Add nolocalbypass option to vxlan. 2023-05-13 17:02:33 +01:00
wext.h
x25.h x25: preserve const qualifier in [a]x25_sk() 2023-03-18 12:23:34 +00:00
x25device.h
xdp_priv.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
xdp_sock_drv.h xsk: Remove unused xsk_buff_discard 2022-09-30 07:55:46 -07:00
xdp_sock.h bpf, net: xskmap memory usage 2023-03-07 09:33:43 -08:00
xdp.h bpf-next-for-netdev 2023-04-13 16:43:38 -07:00
xfrm.h xfrm: add new device offload acquire flag 2023-03-20 11:29:33 +02:00
xsk_buff_pool.h xsk: Use pool->dma_pages to check for DMA 2023-04-27 22:24:51 +02:00