linux/net/ipv4
Neal Cardwell f4ce91ce12 tcp: fix tcp_cwnd_validate() to not forget is_cwnd_limited
This commit fixes a bug in the tracking of max_packets_out and
is_cwnd_limited. This bug can cause the connection to fail to remember
that is_cwnd_limited is true, causing the connection to fail to grow
cwnd when it should, causing throughput to be lower than it should be.

The following event sequence is an example that triggers the bug:

 (a) The connection is cwnd_limited, but packets_out is not at its
     peak due to TSO deferral deciding not to send another skb yet.
     In such cases the connection can advance max_packets_seq and set
     tp->is_cwnd_limited to true and max_packets_out to a small
     number.

(b) Then later in the round trip the connection is pacing-limited (not
     cwnd-limited), and packets_out is larger. In such cases the
     connection would raise max_packets_out to a bigger number but
     (unexpectedly) flip tp->is_cwnd_limited from true to false.

This commit fixes that bug.

One straightforward fix would be to separately track (a) the next
window after max_packets_out reaches a maximum, and (b) the next
window after tp->is_cwnd_limited is set to true. But this would
require consuming an extra u32 sequence number.

Instead, to save space we track only the most important
information. Specifically, we track the strongest available signal of
the degree to which the cwnd is fully utilized:

(1) If the connection is cwnd-limited then we remember that fact for
the current window.

(2) If the connection not cwnd-limited then we track the maximum
number of outstanding packets in the current window.

In particular, note that the new logic cannot trigger the buggy
(a)/(b) sequence above because with the new logic a condition where
tp->packets_out > tp->max_packets_out can only trigger an update of
tp->is_cwnd_limited if tp->is_cwnd_limited is false.

This first showed up in a testing of a BBRv2 dev branch, but this
buggy behavior highlighted a general issue with the
tcp_cwnd_validate() logic that can cause cwnd to fail to increase at
the proper rate for any TCP congestion control, including Reno or
CUBIC.

Fixes: ca8a226343 ("tcp: make cwnd-limited checks measurement-based, and gentler")
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Kevin(Yudong) Yang <yyd@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-30 12:37:45 +01:00
..
bpfilter net: Revert "net: optimize the sockptr_t for unified kernel/user address spaces" 2020-08-10 12:06:44 -07:00
netfilter Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-07-21 13:03:39 -07:00
af_inet.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-07-21 13:03:39 -07:00
ah4.c net: ipv4: fix clang -Wformat warnings 2022-07-12 12:58:53 +02:00
arp.c net: ipv4: new arp_accept option to accept garp only if in-network 2022-07-15 18:55:49 -07:00
bpf_tcp_ca.c bpf: Switch to new kfunc flags infrastructure 2022-07-21 20:59:42 -07:00
cipso_ipv4.c cipso: Fix data-races around sysctl. 2022-07-08 12:10:33 +01:00
datagram.c ipv4: Avoid using RTO_ONLINK with ip_route_connect(). 2022-04-22 13:06:03 +01:00
devinet.c net: Fix data-races around sysctl_devconf_inherit_init_net. 2022-08-24 13:46:58 +01:00
esp4_offload.c net: Fix esp GSO on inter address family tunnels. 2022-03-07 13:14:04 +01:00
esp4.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-07-21 13:03:39 -07:00
fib_frontend.c ip: fix triggering of 'icmp redirect' 2022-08-31 19:50:36 -07:00
fib_lookup.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-02-17 11:44:20 -08:00
fib_notifier.c net: ipv4: remove superfluous header files from fib_notifier.c 2021-09-28 17:32:56 -07:00
fib_rules.c ipv4: remove unnecessary type castings 2022-04-30 15:12:58 +01:00
fib_semantics.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-07-21 13:03:39 -07:00
fib_trie.c ipv4: Fix data-races around sysctl_fib_notify_on_flag_change. 2022-07-25 12:42:10 +01:00
fou.c fou: Remove XRFM from NET_FOU Kconfig 2022-04-12 14:56:33 -07:00
gre_demux.c net: Remove the member netns_ok 2021-05-17 15:29:35 -07:00
gre_offload.c gro: remove rcu_read_lock/rcu_read_unlock from gro_complete handlers 2021-11-24 17:21:42 -08:00
icmp.c ip: Fix data-races around sysctl_ip_no_pmtu_disc. 2022-07-15 11:49:55 +01:00
igmp.c igmp: Fix data-races around sysctl_igmp_qrv. 2022-07-18 12:21:54 +01:00
inet_connection_sock.c tcp: Fix data-races around sysctl_tcp_syn(ack)?_retries. 2022-07-18 12:21:54 +01:00
inet_diag.c net: inet: Retire port only listening_hash 2022-05-12 16:52:18 -07:00
inet_fragment.c ipv4: remove unnecessary type castings 2022-04-30 15:12:58 +01:00
inet_hashtables.c Revert "net: Add a second bind table hashed by port and address" 2022-06-16 11:07:59 -07:00
inet_timewait_sock.c tcp: Fix a data-race around sysctl_max_tw_buckets. 2022-07-13 12:56:49 +01:00
inetpeer.c inetpeer: Fix data-races around sysctl. 2022-07-08 12:10:33 +01:00
ip_forward.c ip: Fix data-races around sysctl_ip_fwd_update_priority. 2022-07-15 11:49:55 +01:00
ip_fragment.c net: ip: Handle delivery_time in ip defrag 2022-03-03 14:38:48 +00:00
ip_gre.c ip_tunnel: Respect tunnel key's "flow_flags" in IP tunnels 2022-08-18 21:18:28 +02:00
ip_input.c tcp/udp: Make early_demux back namespacified. 2022-07-15 18:50:35 -07:00
ip_options.c ipv4: drop fragmentation code from ip_options_build() 2022-01-29 17:53:07 +00:00
ip_output.c net: Fix data-races around sysctl_[rw]mem_(max|default). 2022-08-24 13:46:57 +01:00
ip_sockglue.c net: Fix data-races around sysctl_optmem_max. 2022-08-24 13:46:57 +01:00
ip_tunnel_core.c tunnels: do not assume mac header is set in skb_tunnel_check_pmtu() 2022-06-27 11:50:30 +01:00
ip_tunnel.c ip_tunnel: Respect tunnel key's "flow_flags" in IP tunnels 2022-08-18 21:18:28 +02:00
ip_vti.c ip: use dev_addr_set() in tunnels 2021-10-13 09:41:37 -07:00
ipcomp.c Networking changes for 5.14. 2021-06-30 15:51:09 -07:00
ipconfig.c Driver core / kernfs changes for 6.0-rc1 2022-08-04 11:31:20 -07:00
ipip.c ip: use dev_addr_set() in tunnels 2021-10-13 09:41:37 -07:00
ipmr_base.c ipmr: adopt rcu_read_lock() in mr_dump() 2022-06-24 11:34:38 +01:00
ipmr.c ipmr: Always call ip{,6}_mr_forward() from RCU read-side critical section 2022-09-20 08:22:15 -07:00
Kconfig fou: Remove XRFM from NET_FOU Kconfig 2022-04-12 14:56:33 -07:00
Makefile bpf: Clean up sockmap related Kconfigs 2021-02-26 12:28:03 -08:00
metrics.c treewide: rename nla_strlcpy to nla_strscpy. 2020-11-16 08:08:54 -08:00
netfilter.c netfilter: Use l3mdev flow key when re-routing mangled packets 2022-05-16 13:03:29 +02:00
netlink.c
nexthop.c nexthop: Fix data-races around nexthop_compat_mode. 2022-07-13 12:56:50 +01:00
ping.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-06-23 12:33:24 -07:00
proc.c ip: Fix data-races around sysctl_ip_default_ttl. 2022-07-15 11:49:55 +01:00
protocol.c net: Remove the member netns_ok 2021-05-17 15:29:35 -07:00
raw_diag.c raw: complete rcu conversion 2022-06-21 11:38:29 +02:00
raw.c raw: fix a typo in raw_icmp_error() 2022-06-24 22:48:33 -07:00
route.c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next 2022-07-25 13:25:39 +01:00
syncookies.c tcp: Fix data-races around sysctl knobs related to SYN option. 2022-07-20 10:14:49 +01:00
sysctl_net_ipv4.c ip: Fix data-races around sysctl_ip_prot_sock. 2022-07-20 10:14:49 +01:00
tcp_bbr.c bpf: Switch to new kfunc flags infrastructure 2022-07-21 20:59:42 -07:00
tcp_bic.c tcp: add accessors to read/set tp->snd_cwnd 2022-04-06 12:05:41 -07:00
tcp_bpf.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-06-23 12:33:24 -07:00
tcp_cdg.c tcp: add accessors to read/set tp->snd_cwnd 2022-04-06 12:05:41 -07:00
tcp_cong.c tcp: Add tracepoint for tcp_set_ca_state 2022-04-07 20:33:15 -07:00
tcp_cubic.c bpf: Switch to new kfunc flags infrastructure 2022-07-21 20:59:42 -07:00
tcp_dctcp.c bpf: Switch to new kfunc flags infrastructure 2022-07-21 20:59:42 -07:00
tcp_dctcp.h
tcp_diag.c
tcp_fastopen.c tcp: Fix data-races around sysctl_tcp_fastopen_blackhole_timeout. 2022-07-18 12:21:54 +01:00
tcp_highspeed.c tcp: add accessors to read/set tp->snd_cwnd 2022-04-06 12:05:41 -07:00
tcp_htcp.c tcp: add accessors to read/set tp->snd_cwnd 2022-04-06 12:05:41 -07:00
tcp_hybla.c tcp: add accessors to read/set tp->snd_cwnd 2022-04-06 12:05:41 -07:00
tcp_illinois.c tcp: add accessors to read/set tp->snd_cwnd 2022-04-06 12:05:41 -07:00
tcp_input.c tcp: fix early ETIMEDOUT after spurious non-SACK RTO 2022-09-06 11:06:31 +02:00
tcp_ipv4.c tcp: make global challenge ack rate limitation per net-ns and default disabled 2022-08-31 19:56:48 -07:00
tcp_lp.c tcp: add accessors to read/set tp->snd_cwnd 2022-04-06 12:05:41 -07:00
tcp_metrics.c tcp: Fix data-races around sysctl_tcp_no_ssthresh_metrics_save. 2022-07-22 12:06:17 +01:00
tcp_minisocks.c tcp: Fix a data-race around sysctl_tcp_abort_on_overflow. 2022-07-20 10:14:50 +01:00
tcp_nv.c tcp: add accessors to read/set tp->snd_cwnd 2022-04-06 12:05:41 -07:00
tcp_offload.c net: move gro definitions to include/net/gro.h 2021-11-16 13:16:54 +00:00
tcp_output.c tcp: fix tcp_cwnd_validate() to not forget is_cwnd_limited 2022-09-30 12:37:45 +01:00
tcp_rate.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-04-28 13:02:01 -07:00
tcp_recovery.c tcp: Fix data-races around sysctl_tcp_recovery. 2022-07-20 10:14:50 +01:00
tcp_scalable.c tcp: add accessors to read/set tp->snd_cwnd 2022-04-06 12:05:41 -07:00
tcp_timer.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-07-21 13:03:39 -07:00
tcp_ulp.c
tcp_vegas.c tcp: add accessors to read/set tp->snd_cwnd 2022-04-06 12:05:41 -07:00
tcp_vegas.h
tcp_veno.c tcp: add accessors to read/set tp->snd_cwnd 2022-04-06 12:05:41 -07:00
tcp_westwood.c tcp: add accessors to read/set tp->snd_cwnd 2022-04-06 12:05:41 -07:00
tcp_yeah.c tcp: add accessors to read/set tp->snd_cwnd 2022-04-06 12:05:41 -07:00
tcp.c tcp: fix tcp_cwnd_validate() to not forget is_cwnd_limited 2022-09-30 12:37:45 +01:00
tunnel4.c net: Remove the member netns_ok 2021-05-17 15:29:35 -07:00
udp_bpf.c net: remove noblock parameter from recvmsg() entities 2022-04-12 15:00:25 +02:00
udp_diag.c net: Use nlmsg_unicast() instead of netlink_unicast() 2021-07-13 09:28:29 -07:00
udp_impl.h net: remove noblock parameter from recvmsg() entities 2022-04-12 15:00:25 +02:00
udp_offload.c gro: remove rcu_read_lock/rcu_read_unlock from gro_complete handlers 2021-11-24 17:21:42 -08:00
udp_tunnel_core.c rxrpc: Fix ICMP/ICMP6 error handling 2022-09-01 11:42:12 +01:00
udp_tunnel_nic.c udp_tunnel: Fix end of loop test in udp_tunnel_nic_unregister() 2022-02-23 12:35:00 +00:00
udp_tunnel_stub.c udp_tunnel: add central NIC RX port offload infrastructure 2020-07-10 13:54:00 -07:00
udp.c udp: Use WARN_ON_ONCE() in udp_read_skb() 2022-09-22 06:42:57 -07:00
udplite.c net: add per_cpu_fw_alloc field to struct proto 2022-06-10 16:21:26 -07:00
xfrm4_input.c
xfrm4_output.c
xfrm4_policy.c net: rename reference+tracking helpers 2022-06-09 21:52:55 -07:00
xfrm4_protocol.c net: xfrm: unexport __init-annotated xfrm4_protocol_init() 2022-06-08 10:10:13 -07:00
xfrm4_state.c
xfrm4_tunnel.c net/ipv4/xfrm4_tunnel.c: remove superfluous header files from xfrm4_tunnel.c 2021-09-23 10:10:00 +02:00