linux-next

mirror of https://github.com/edk2-porting/linux-next.git synced 2024-12-15 00:34:10 +08:00

Neal Cardwell 4720852ed9 tcp: fix delayed ACKs for MSS boundary condition

This commit fixes poor delayed ACK behavior that can cause poor TCP latency in a particular boundary condition: when an application makes a TCP socket write that is an exact multiple of the MSS size.

The problem is that there is a painful boundary discontinuity in the current delayed ACK behavior. With the current delayed ACK behavior, we have:

(1) If an app reads data when > 1MSS is unacknowledged, then tcp_cleanup_rbuf() ACKs immediately because of:

    tp->rcv_nxt - tp->rcv_wup > icsk->icsk_ack.rcv_mss ||

(2) If an app reads all received data, and the packets were < 1MSS, and either (a) the app is not ping-pong or (b) we received two packets < 1MSS, then tcp_cleanup_rbuf() ACKs immediately because of:

    ((icsk->icsk_ack.pending & ICSK_ACK_PUSHED2) ||
     ((icsk->icsk_ack.pending & ICSK_ACK_PUSHED) &&
      !inet_csk_in_pingpong_mode(sk))) &&

(3) However: if an app reads exactly 1MSS of data, tcp_cleanup_rbuf() does not send an immediate ACK. This is true even if the app is not ping-pong and the 1MSS of data had the PSH bit set, suggesting the sending application completed an application write.

Thus if the app is not ping-pong, we have this painful case where >1MSS gets an immediate ACK, and <1MSS gets an immediate ACK, but a write whose last skb is an exact multiple of 1MSS can get a 40ms delayed ACK. This means that any app that transfers data in one direction and takes care to align write size or packet size with MSS can suffer this problem. With receive zero copy making 4KB MSS values more common, it is becoming more common to have application writes naturally align with MSS, and more applications are likely to encounter this delayed ACK problem.

The fix in this commit is to refine the delayed ACK heuristics with a simple check: immediately ACK a received 1MSS skb with PSH bit set if the app reads all data. Why? If an skb has a len of exactly 1MSS and has the PSH bit set then it is likely the end of an application write. So more data may not be arriving soon, and yet the data sender may be waiting for an ACK if cwnd-bound or using TX zero copy. Thus we set ICSK_ACK_PUSHED in this case so that tcp_cleanup_rbuf() will send an ACK immediately if the app reads all of the data and is not ping-pong. Note that this logic is also executed for the case where len > MSS, but in that case this logic does not matter (and does not hurt) because tcp_cleanup_rbuf() will always ACK immediately if the app reads data and there is more than an MSS of unACKed data.

Fixes: `1da177e4c3` ("Linux-2.6.12-rc2")
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Cc: Xin Guo <guoxin0309@gmail.com>
Link: https://lore.kernel.org/r/20231001151239.1866845-2-ncardwell.sw@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

2023-10-04 15:34:18 -07:00
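The three cases in the commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel code: `struct model_sock`, `model_receive()`, `model_read_all()`, `model_case()`, and the `MODEL_ACK_*` flag values are hypothetical names invented for illustration, and the sub-MSS bookkeeping is simplified to "first small packet sets PUSHED, second sets PUSHED2" as the commit message describes it. The `apply_fix` parameter toggles the new check so the behavior before and after can be compared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag bits; the names echo the commit message, the values are assumed. */
enum {
    MODEL_ACK_PUSHED  = 1u << 0,
    MODEL_ACK_PUSHED2 = 1u << 1,
};

/* Hypothetical, heavily reduced stand-in for the receiver-side socket state. */
struct model_sock {
    uint32_t rcv_mss;       /* estimated peer MSS */
    uint32_t unacked_bytes; /* models tp->rcv_nxt - tp->rcv_wup */
    uint32_t unread_bytes;  /* data queued but not yet read by the app */
    unsigned pending;       /* MODEL_ACK_* flags */
    bool pingpong;          /* app is in request/response mode */
};

/* Model of segment arrival: account the bytes and update the PUSHED hints. */
static void model_receive(struct model_sock *sk, uint32_t len, bool psh,
                          bool apply_fix)
{
    assert(sk->rcv_mss > 0);
    sk->unacked_bytes += len;
    sk->unread_bytes += len;

    if (len >= sk->rcv_mss) {
        /* The fix: a full-sized skb with PSH likely ends an application write. */
        if (apply_fix && psh)
            sk->pending |= MODEL_ACK_PUSHED;
    } else {
        /* Simplified sub-MSS handling: a second small packet sets PUSHED2. */
        if (sk->pending & MODEL_ACK_PUSHED)
            sk->pending |= MODEL_ACK_PUSHED2;
        sk->pending |= MODEL_ACK_PUSHED;
    }
}

/* Model of the app reading everything; returns true if an ACK is sent now. */
static bool model_read_all(struct model_sock *sk)
{
    sk->unread_bytes = 0;

    bool over_mss = sk->unacked_bytes > sk->rcv_mss;            /* case (1) */
    bool pushed = (sk->pending & MODEL_ACK_PUSHED2) ||
                  ((sk->pending & MODEL_ACK_PUSHED) && !sk->pingpong);
    bool ack_now = over_mss || (pushed && sk->unread_bytes == 0); /* case (2) */

    if (ack_now) {
        sk->unacked_bytes = 0;
        sk->pending = 0;
    }
    return ack_now;
}

/* Convenience wrapper: one segment arrives, then the app reads all data. */
static bool model_case(uint32_t mss, uint32_t len, bool psh, bool pingpong,
                       bool apply_fix)
{
    struct model_sock sk = { .rcv_mss = mss, .pingpong = pingpong };

    model_receive(&sk, len, psh, apply_fix);
    return model_read_all(&sk);
}
```

In this model, `model_case(1000, 1000, true, false, false)` reproduces case (3), where no immediate ACK is sent, while the same call with `apply_fix` set to `true` ACKs immediately. A ping-pong app or a segment without PSH still falls back to the delayed ACK.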
..
bpfilter	net: Use umd_cleanup_helper()	2023-05-31 13:06:57 +02:00
netfilter	inet: move inet->nodefrag to inet->inet_flags	2023-08-16 11:09:17 +01:00
af_inet.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	2023-08-24 10:51:39 -07:00
ah4.c	net: ipv4: Remove completion function scaffolding	2023-02-13 18:35:15 +08:00
arp.c	neighbour: annotate lockless accesses to n->nud_state	2023-03-15 00:37:32 -07:00
bpf_tcp_ca.c	bpf: Drop useless btf_vmlinux in bpf_tcp_ca	2023-07-18 17:31:10 -07:00
cipso_ipv4.c	inet: move inet->is_icsk to inet->inet_flags	2023-08-16 11:09:17 +01:00
datagram.c	ipv4: fix data-races around inet->inet_id	2023-08-20 11:40:49 +01:00
devinet.c	net: ipv4: fix one memleak in __inet_del_ifa()	2023-09-08 08:02:17 +01:00
esp4_offload.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	2023-06-22 18:40:38 -07:00
esp4.c	net: ipv4: Use kfree_sensitive instead of kfree	2023-07-19 11:03:03 +01:00
fib_frontend.c	ipv4: Fix incorrect table ID in IOCTL path	2023-03-16 17:26:31 -07:00
fib_lookup.h
fib_notifier.c
fib_rules.c	ipv4: remove unnecessary type castings	2022-04-30 15:12:58 +01:00
fib_semantics.c	ipv4/fib: send notify when delete source address routes	2023-10-03 09:00:40 +02:00
fib_trie.c	ipv4/fib: send notify when delete source address routes	2023-10-03 09:00:40 +02:00
fou_bpf.c	bpf,fou: Add bpf_skb_{set,get}_fou_encap kfuncs	2023-04-12 16:40:39 -07:00
fou_core.c	bpf,fou: Add bpf_skb_{set,get}_fou_encap kfuncs	2023-04-12 16:40:39 -07:00
fou_nl.c	net: ynl: prefix uAPI header include with uapi/	2023-05-26 10:30:14 +01:00
fou_nl.h	net: ynl: prefix uAPI header include with uapi/	2023-05-26 10:30:14 +01:00
gre_demux.c
gre_offload.c	net: move gso declarations and functions to their own files	2023-06-10 00:11:41 -07:00
icmp.c	icmp: guard against too small mtu	2023-03-31 21:37:06 -07:00
igmp.c	igmp: limit igmpv3_newpack() packet size to IP_MAX_MTU	2023-09-05 17:49:40 +01:00
inet_connection_sock.c	tcp: annotate data-races around icsk->icsk_syn_retries	2023-07-20 12:34:18 -07:00
inet_diag.c	inet: move inet->defer_connect to inet->inet_flags	2023-08-16 11:09:18 +01:00
inet_fragment.c	net: dropreason: add SKB_DROP_REASON_FRAG_REASM_TIMEOUT	2022-10-31 20:14:27 -07:00
inet_hashtables.c	tcp: Fix bind() regression for v4-mapped-v6 non-wildcard address.	2023-09-13 07:18:04 +01:00
inet_timewait_sock.c	inet: move inet->transparent to inet->inet_flags	2023-08-16 11:09:17 +01:00
inetpeer.c	inetpeer: Fix data-races around sysctl.	2022-07-08 12:10:33 +01:00
ip_forward.c	net: ipv4, ipv6: fix IPSTATS_MIB_OUTOCTETS increment duplicated	2023-08-30 09:44:09 +01:00
ip_fragment.c	networking: Update to register_net_sysctl_sz	2023-08-15 15:26:18 -07:00
ip_gre.c	ipv4: ip_gre: fix return value check in erspan_xmit()	2023-07-19 12:27:09 +01:00
ip_input.c	ipv4: ignore dst hint for multipath routes	2023-09-01 08:11:51 +01:00
ip_options.c
ip_output.c	net: annotate data-races around sk->sk_tsflags	2023-09-01 07:27:33 +01:00
ip_sockglue.c	net: annotate data-races around sk->sk_tsflags	2023-09-01 07:27:33 +01:00
ip_tunnel_core.c	tunnels: fix kasan splat when generating ipv4 pmtu error	2023-08-04 18:24:52 -07:00
ip_tunnel.c	bpf-next-for-netdev	2023-04-13 16:43:38 -07:00
ip_vti.c	ip_vti: fix potential slab-use-after-free in decode_session6	2023-07-11 11:06:08 +02:00
ipcomp.c	xfrm: ipcomp: add extack to ipcomp{4,6}_init_state	2022-09-29 07:18:00 +02:00
ipconfig.c	net: ipconfig: move ic_nameservers_fallback into #ifdef block	2023-05-22 11:17:55 +01:00
ipip.c	ipip,ip_tunnel,sit: Add FOU support for externally controlled ipip devices	2023-04-12 16:40:39 -07:00
ipmr_base.c	ipmr: adopt rcu_read_lock() in mr_dump()	2022-06-24 11:34:38 +01:00
ipmr.c	net: ipv4, ipv6: fix IPSTATS_MIB_OUTOCTETS increment duplicated	2023-08-30 09:44:09 +01:00
Kconfig	tcp: configurable source port perturb table size	2022-11-16 13:02:04 +00:00
Makefile	bpf,fou: Add bpf_skb_{set,get}_fou_encap kfuncs	2023-04-12 16:40:39 -07:00
metrics.c	ipv4: prevent potential spectre v1 gadget in ip_metrics_convert()	2023-01-23 21:37:25 -08:00
netfilter.c	netfilter: Use l3mdev flow key when re-routing mangled packets	2022-05-16 13:03:29 +02:00
netlink.c
nexthop.c	nexthop: Do not increment dump sentinel at the end of the dump	2023-08-15 18:54:53 -07:00
ping.c	inet: move inet->recverr to inet->inet_flags	2023-08-16 11:09:17 +01:00
proc.c	icmp: Add counters for rate limits	2023-01-26 10:52:18 +01:00
protocol.c
raw_diag.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	2023-04-06 12:01:20 -07:00
raw.c	inet: move inet->hdrincl to inet->inet_flags	2023-08-16 11:09:17 +01:00
route.c	ipv4: Set offload_failed flag in fibmatch results	2023-10-04 11:39:36 -07:00
syncookies.c	tcp: Set route scope properly in cookie_v4_check().	2023-06-06 21:13:03 -07:00
sysctl_net_ipv4.c	networking: Update to register_net_sysctl_sz	2023-08-15 15:26:18 -07:00
tcp_bbr.c	bpf: Add __bpf_kfunc tag to all kfuncs	2023-02-02 00:25:14 +01:00
tcp_bic.c	tcp: add accessors to read/set tp->snd_cwnd	2022-04-06 12:05:41 -07:00
tcp_bpf.c	bpf, sockmap: Do not inc copied_seq when PEEK flag set	2023-09-29 17:05:00 +02:00
tcp_cdg.c	Random number generator fixes for Linux 6.1-rc1.	2022-10-16 15:27:07 -07:00
tcp_cong.c	net: Update an existing TCP congestion control algorithm.	2023-03-22 22:53:00 -07:00
tcp_cubic.c	bpf: Add __bpf_kfunc tag to all kfuncs	2023-02-02 00:25:14 +01:00
tcp_dctcp.c	bpf: Add __bpf_kfunc tag to all kfuncs	2023-02-02 00:25:14 +01:00
tcp_dctcp.h
tcp_diag.c	tcp: Access &tcp_hashinfo via net.	2022-09-20 10:21:49 -07:00
tcp_fastopen.c	inet: move inet->defer_connect to inet->inet_flags	2023-08-16 11:09:18 +01:00
tcp_highspeed.c	tcp: add accessors to read/set tp->snd_cwnd	2022-04-06 12:05:41 -07:00
tcp_htcp.c	tcp: add accessors to read/set tp->snd_cwnd	2022-04-06 12:05:41 -07:00
tcp_hybla.c	tcp: add accessors to read/set tp->snd_cwnd	2022-04-06 12:05:41 -07:00
tcp_illinois.c	tcp: add accessors to read/set tp->snd_cwnd	2022-04-06 12:05:41 -07:00
tcp_input.c	tcp: fix delayed ACKs for MSS boundary condition	2023-10-04 15:34:18 -07:00
tcp_ipv4.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	2023-08-24 10:51:39 -07:00
tcp_lp.c	tcp: add accessors to read/set tp->snd_cwnd	2022-04-06 12:05:41 -07:00
tcp_metrics.c	tcp_metrics: hash table allocation cleanup	2023-08-04 15:33:39 -07:00
tcp_minisocks.c	inet: move inet->transparent to inet->inet_flags	2023-08-16 11:09:17 +01:00
tcp_nv.c	tcp: add accessors to read/set tp->snd_cwnd	2022-04-06 12:05:41 -07:00
tcp_offload.c	net: move gso declarations and functions to their own files	2023-06-10 00:11:41 -07:00
tcp_output.c	tcp: fix quick-ack counting to count actual ACKs of new data	2023-10-04 15:34:18 -07:00
tcp_plb.c	prandom: remove prandom_u32_max()	2022-12-20 03:13:45 +01:00
tcp_rate.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	2022-04-28 13:02:01 -07:00
tcp_recovery.c	tcp: preserve const qualifier in tcp_sk()	2023-03-18 12:23:34 +00:00
tcp_scalable.c	tcp: add accessors to read/set tp->snd_cwnd	2022-04-06 12:05:41 -07:00
tcp_timer.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	2023-08-18 12:44:56 -07:00
tcp_ulp.c	net/ulp: use consistent error code when blocking ULP	2023-01-19 09:26:16 -08:00
tcp_vegas.c	tcp: add accessors to read/set tp->snd_cwnd	2022-04-06 12:05:41 -07:00
tcp_vegas.h
tcp_veno.c	tcp: add accessors to read/set tp->snd_cwnd	2022-04-06 12:05:41 -07:00
tcp_westwood.c	tcp: add accessors to read/set tp->snd_cwnd	2022-04-06 12:05:41 -07:00
tcp_yeah.c	tcp: add accessors to read/set tp->snd_cwnd	2022-04-06 12:05:41 -07:00
tcp.c	bpf: tcp_read_skb needs to pop skb regardless of seq	2023-09-29 17:04:07 +02:00
tunnel4.c
udp_bpf.c	bpf, sockmap: Fix an infinite loop error when len is 0 in tcp_bpf_recvmsg_parser()	2023-03-03 17:25:15 +01:00
udp_diag.c	udp: Access &udp_table via net.	2022-11-16 09:43:35 +00:00
udp_impl.h	sock: Remove ->sendpage*() in favour of sendmsg(MSG_SPLICE_PAGES)	2023-06-24 15:50:13 -07:00
udp_offload.c	net: gro: fix misuse of CB in udp socket lookup	2023-07-29 17:10:27 +01:00
udp_tunnel_core.c	inet: move inet->mc_loop to inet->inet_frags	2023-08-16 11:09:17 +01:00
udp_tunnel_nic.c	udp_tunnel: Add checks for nla_nest_start() in __udp_tunnel_nic_dump_write()	2022-11-29 08:44:24 -08:00
udp_tunnel_stub.c
udp.c	net: annotate data-races around sk->sk_forward_alloc	2023-09-01 07:27:33 +01:00
udplite.c	sock: Remove ->sendpage*() in favour of sendmsg(MSG_SPLICE_PAGES)	2023-06-24 15:50:13 -07:00
xfrm4_input.c	xfrm: fix inbound ipv4/udp/esp packets to UDPv6 dualstack sockets	2023-06-09 08:16:34 +02:00
xfrm4_output.c
xfrm4_policy.c	sysctl-6.6-rc1	2023-08-29 17:39:15 -07:00
xfrm4_protocol.c	net: xfrm: unexport __init-annotated xfrm4_protocol_init()	2022-06-08 10:10:13 -07:00
xfrm4_state.c
xfrm4_tunnel.c	xfrm: tunnel: add extack to ipip_init_state, xfrm6_tunnel_init_state	2022-09-29 07:18:00 +02:00