linux/net/ipv4
Yuchung Cheng 659a8ad56f tcp: track the packet timings in RACK
This patch is the first half of the RACK loss recovery.

RACK loss recovery uses the notion of time instead
of packet sequence (FACK) or counts (dupthresh). It's inspired by the
previous FACK heuristic in tcp_mark_lost_retrans(): when a limited
transmit (new data packet) is sacked, then current retransmitted
sequence below the newly sacked sequence must been lost,
since at least one round trip time has elapsed.

But it has several limitations:
1) can't detect tail drops since it depends on limited transmit
2) is disabled upon reordering (assumes no reordering)
3) only enabled in fast recovery ut not timeout recovery

RACK (Recently ACK) addresses these limitations with the notion
of time instead: a packet P1 is lost if a later packet P2 is s/acked,
as at least one round trip has passed.

Since RACK cares about the time sequence instead of the data sequence
of packets, it can detect tail drops when later retransmission is
s/acked while FACK or dupthresh can't. For reordering RACK uses a
dynamically adjusted reordering window ("reo_wnd") to reduce false
positives on ever (small) degree of reordering.

This patch implements tcp_advanced_rack() which tracks the
most recent transmission time among the packets that have been
delivered (ACKed or SACKed) in tp->rack.mstamp. This timestamp
is the key to determine which packet has been lost.

Consider an example that the sender sends six packets:
T1: P1 (lost)
T2: P2
T3: P3
T4: P4
T100: sack of P2. rack.mstamp = T2
T101: retransmit P1
T102: sack of P2,P3,P4. rack.mstamp = T4
T205: ACK of P4 since the hole is repaired. rack.mstamp = T101

We need to be careful about spurious retransmission because it may
falsely advance tp->rack.mstamp by an RTT or an RTO, causing RACK
to falsely mark all packets lost, just like a spurious timeout.

We identify spurious retransmission by the ACK's TS echo value.
If TS option is not applicable but the retransmission is acknowledged
less than min-RTT ago, it is likely to be spurious. We refrain from
using the transmission time of these spurious retransmissions.

The second half is implemented in the next patch that marks packet
lost using RACK timestamp.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-21 07:00:48 -07:00
..
netfilter Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2015-10-17 14:28:03 +02:00
af_inet.c net: Replace vrf_dev_table and friends 2015-09-29 20:40:33 -07:00
ah4.c ah4: Fix error return in ah_input(). 2015-08-25 13:38:50 -07:00
arp.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-10-20 06:08:27 -07:00
cipso_ipv4.c ipv4: coding style: comparison for inequality with NULL 2015-04-03 12:11:15 -04:00
datagram.c net: Set sk_txhash from a random number 2015-07-29 22:44:04 -07:00
devinet.c rtnetlink: RTEXT_FILTER_SKIP_STATS support to avoid dumping inet/inet6 stats 2015-09-15 15:25:02 -07:00
esp4.c esp4: Switch to new AEAD interface 2015-05-28 11:23:20 +08:00
fib_frontend.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-10-02 07:21:25 -07:00
fib_lookup.h ipv4: consider TOS in fib_select_default 2015-07-24 22:46:11 -07:00
fib_rules.c net: ipv6: use common fib_default_rule_pref 2015-09-09 14:19:50 -07:00
fib_semantics.c net: Fix suspicious RCU usage in fib_rebalance 2015-10-16 00:57:55 -07:00
fib_trie.c net: Fix vti use case with oif in dst lookups 2015-09-17 16:36:34 -07:00
fou.c fou: reject IPv6 config 2015-08-29 13:07:54 -07:00
gre_demux.c gre: Remove support for sharing GRE protocol hook. 2015-08-10 14:03:54 -07:00
gre_offload.c ipv4: coding style: comparison for inequality with NULL 2015-04-03 12:11:15 -04:00
icmp.c Revert "ipv4/icmp: redirect messages can use the ingress daddr as source" 2015-10-14 06:01:07 -07:00
igmp.c ipv4, ipv6: Pass net into ip_local_out and ip6_local_out 2015-10-08 04:27:02 -07:00
inet_connection_sock.c tcp/dccp: fix race at listener dismantle phase 2015-10-16 00:52:19 -07:00
inet_diag.c tcp/dccp: install syn_recv requests into ehash table 2015-10-03 04:32:41 -07:00
inet_fragment.c inet: frags: remove INET_FRAG_EVICTED and use list_evictor for the test 2015-07-26 21:00:15 -07:00
inet_hashtables.c tcp/dccp: fix potential NULL deref in __inet_inherit_port() 2015-10-14 19:06:31 -07:00
inet_lro.c lro: remove dead code 2013-12-29 16:34:25 -05:00
inet_timewait_sock.c tcp/dccp: fix timewait races in timer handling 2015-09-21 16:32:29 -07:00
inetpeer.c net: Add helper function to compare inetpeer addresses 2015-08-28 13:32:36 -07:00
ip_forward.c net: Pass net into dst_output and remove dst_output_okfn 2015-10-08 04:26:54 -07:00
ip_fragment.c ipv4: Pass struct net into ip_defrag and ip_check_defrag 2015-10-12 19:44:16 -07:00
ip_gre.c ip_tunnels: record IP version in tunnel info 2015-08-29 13:07:54 -07:00
ip_input.c ipv4: Pass struct net into ip_defrag and ip_check_defrag 2015-10-12 19:44:16 -07:00
ip_options.c ipv4: coding style: comparison for inequality with NULL 2015-04-03 12:11:15 -04:00
ip_output.c tcp: do not set queue_mapping on SYNACK 2015-10-18 22:26:02 -07:00
ip_sockglue.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-06-24 02:58:51 -07:00
ip_tunnel_core.c ipv4, ipv6: Pass net into ip_local_out and ip6_local_out 2015-10-08 04:27:02 -07:00
ip_tunnel.c ip_gre: Add support to collect tunnel metadata. 2015-08-10 14:03:54 -07:00
ip_vti.c net: Pass net into dst_output and remove dst_output_okfn 2015-10-08 04:26:54 -07:00
ipcomp.c ipv4: coding style: comparison for equality with NULL 2015-04-03 12:11:15 -04:00
ipconfig.c ipconfig: send Client-identifier in DHCP requests 2015-10-18 19:23:52 -07:00
ipip.c ip_gre: Add support to collect tunnel metadata. 2015-08-10 14:03:54 -07:00
ipmr.c net: Pass net into dst_output and remove dst_output_okfn 2015-10-08 04:26:54 -07:00
Kconfig geneve: Consolidate Geneve functionality in single module. 2015-08-27 15:42:48 -07:00
Makefile tcp: track the packet timings in RACK 2015-10-21 07:00:48 -07:00
netfilter.c ipv4: Pass struct net into ip_route_me_harder 2015-09-29 20:21:32 +02:00
ping.c ipv6: Nonlocal bind 2015-07-09 21:09:10 -07:00
proc.c net: track success and failure of TCP PMTU probing 2015-07-21 22:36:33 -07:00
protocol.c net: Export inet_offloads and inet6_offloads 2014-09-19 17:15:31 -04:00
raw.c net: Pass net into dst_output and remove dst_output_okfn 2015-10-08 04:26:54 -07:00
route.c net: Do not drop to make_route if oif is l3mdev 2015-10-08 05:18:47 -07:00
syncookies.c net: shrink struct sock and request_sock by 8 bytes 2015-10-12 19:28:22 -07:00
sysctl_net_ipv4.c tcp: track min RTT using windowed min-filter 2015-10-21 07:00:43 -07:00
tcp_bic.c tcp: add tcp_in_slow_start helper 2015-07-09 14:22:52 -07:00
tcp_cdg.c tcp: do not slow start when cwnd equals ssthresh 2015-07-09 14:22:52 -07:00
tcp_cong.c tcp: remove tcp_ecn_make_synack() socket argument 2015-09-25 13:00:38 -07:00
tcp_cubic.c tcp_cubic: do not set epoch_start in the future 2015-09-17 22:35:07 -07:00
tcp_dctcp.c net: tcp: dctcp_update_alpha() fixes. 2015-06-10 23:28:33 -07:00
tcp_diag.c sock_diag: implement a get_info handler for inet 2015-06-15 19:49:22 -07:00
tcp_fastopen.c tcp: fix fastopen races vs lockless listener 2015-10-05 02:45:24 -07:00
tcp_highspeed.c tcp: add tcp_in_slow_start helper 2015-07-09 14:22:52 -07:00
tcp_htcp.c tcp: add tcp_in_slow_start helper 2015-07-09 14:22:52 -07:00
tcp_hybla.c tcp: do not slow start when cwnd equals ssthresh 2015-07-09 14:22:52 -07:00
tcp_illinois.c tcp: add tcp_in_slow_start helper 2015-07-09 14:22:52 -07:00
tcp_input.c tcp: track the packet timings in RACK 2015-10-21 07:00:48 -07:00
tcp_ipv4.c tcp: do not set queue_mapping on SYNACK 2015-10-18 22:26:02 -07:00
tcp_lp.c tcp: remove in_flight parameter from cong_avoid() methods 2014-05-03 19:23:07 -04:00
tcp_memcontrol.c memcg: cleanup static keys decrement 2015-02-12 18:54:10 -08:00
tcp_metrics.c net: Add helper function to compare inetpeer addresses 2015-08-28 13:32:36 -07:00
tcp_minisocks.c tcp: track the packet timings in RACK 2015-10-21 07:00:48 -07:00
tcp_offload.c tcp: reserve tcp_skb_mss() to tcp stack 2015-06-11 16:33:10 -07:00
tcp_output.c tcp: remove tcp_mark_lost_retrans() 2015-10-21 07:00:44 -07:00
tcp_probe.c tcp: whitespace fixes 2014-09-01 18:12:45 -07:00
tcp_recovery.c tcp: track the packet timings in RACK 2015-10-21 07:00:48 -07:00
tcp_scalable.c tcp: add tcp_in_slow_start helper 2015-07-09 14:22:52 -07:00
tcp_timer.c tcp: change type of alive from int to bool 2015-10-12 05:15:03 -07:00
tcp_vegas.c tcp: add tcp_in_slow_start helper 2015-07-09 14:22:52 -07:00
tcp_vegas.h tcp: prepare CC get_info() access from getsockopt() 2015-04-29 17:10:38 -04:00
tcp_veno.c tcp: add tcp_in_slow_start helper 2015-07-09 14:22:52 -07:00
tcp_westwood.c tcp_westwood: fix tcp_westwood_info() 2015-05-05 19:50:09 -04:00
tcp_yeah.c tcp: stretch ACK fixes prep 2015-01-28 22:18:37 -08:00
tcp.c tcp: track min RTT using windowed min-filter 2015-10-21 07:00:43 -07:00
tunnel4.c
udp_diag.c sock_diag: specify info_size per inet protocol 2015-06-15 19:49:22 -07:00
udp_impl.h net: Remove iocb argument from sendmsg and recvmsg 2015-03-02 13:06:31 -05:00
udp_offload.c ipv4: coding style: comparison for inequality with NULL 2015-04-03 12:11:15 -04:00
udp_tunnel.c tunnel: introduce udp_tun_rx_dst() 2015-08-27 15:42:47 -07:00
udp.c net: SO_INCOMING_CPU setsockopt() support 2015-10-12 19:28:20 -07:00
udplite.c net: Eliminate no_check from protosw 2014-05-23 16:28:53 -04:00
xfrm4_input.c netfilter: Pass net into okfn 2015-09-17 17:18:37 -07:00
xfrm4_mode_beet.c ipv4: ERROR: code indent should use tabs where possible 2013-12-26 13:43:21 -05:00
xfrm4_mode_transport.c
xfrm4_mode_tunnel.c ipv4: hash net ptr into fragmentation bucket selection 2015-03-25 14:07:04 -04:00
xfrm4_output.c dst: Pass net into dst->output 2015-10-08 04:27:03 -07:00
xfrm4_policy.c ipv4: Merge __ip_local_out and __ip_local_out_sk 2015-10-08 04:26:57 -07:00
xfrm4_protocol.c xfrm4: Remove duplicate semicolon 2014-06-30 07:49:47 +02:00
xfrm4_state.c inet: make no_pmtu_disc per namespace and kill ipv4_config 2013-12-18 16:58:20 -05:00
xfrm4_tunnel.c