2
0
mirror of https://github.com/edk2-porting/linux-next.git synced 2024-12-17 17:53:56 +08:00
linux-next/net/ipv4
Nandita Dukkipati 6ba8a3b19e tcp: Tail loss probe (TLP)
This patch series implement the Tail loss probe (TLP) algorithm described
in http://tools.ietf.org/html/draft-dukkipati-tcpm-tcp-loss-probe-01. The
first patch implements the basic algorithm.

TLP's goal is to reduce tail latency of short transactions. It achieves
this by converting retransmission timeouts (RTOs) occuring due
to tail losses (losses at end of transactions) into fast recovery.
TLP transmits one packet in two round-trips when a connection is in
Open state and isn't receiving any ACKs. The transmitted packet, aka
loss probe, can be either new or a retransmission. When there is tail
loss, the ACK from a loss probe triggers FACK/early-retransmit based
fast recovery, thus avoiding a costly RTO. In the absence of loss,
there is no change in the connection state.

PTO stands for probe timeout. It is a timer event indicating
that an ACK is overdue and triggers a loss probe packet. The PTO value
is set to max(2*SRTT, 10ms) and is adjusted to account for delayed
ACK timer when there is only one oustanding packet.

TLP Algorithm

On transmission of new data in Open state:
  -> packets_out > 1: schedule PTO in max(2*SRTT, 10ms).
  -> packets_out == 1: schedule PTO in max(2*RTT, 1.5*RTT + 200ms)
  -> PTO = min(PTO, RTO)

Conditions for scheduling PTO:
  -> Connection is in Open state.
  -> Connection is either cwnd limited or no new data to send.
  -> Number of probes per tail loss episode is limited to one.
  -> Connection is SACK enabled.

When PTO fires:
  new_segment_exists:
    -> transmit new segment.
    -> packets_out++. cwnd remains same.

  no_new_packet:
    -> retransmit the last segment.
       Its ACK triggers FACK or early retransmit based recovery.

ACK path:
  -> rearm RTO at start of ACK processing.
  -> reschedule PTO if need be.

In addition, the patch includes a small variation to the Early Retransmit
(ER) algorithm, such that ER and TLP together can in principle recover any
N-degree of tail loss through fast recovery. TLP is controlled by the same
sysctl as ER, tcp_early_retrans sysctl.
tcp_early_retrans==0; disables TLP and ER.
		 ==1; enables RFC5827 ER.
		 ==2; delayed ER.
		 ==3; TLP and delayed ER. [DEFAULT]
		 ==4; TLP only.

The TLP patch series have been extensively tested on Google Web servers.
It is most effective for short Web trasactions, where it reduced RTOs by 15%
and improved HTTP response time (average by 6%, 99th percentile by 10%).
The transmitted probes account for <0.5% of the overall transmissions.

Signed-off-by: Nandita Dukkipati <nanditad@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-12 08:30:34 -04:00
..
netfilter Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-02-26 20:16:07 -08:00
af_inet.c tunneling: Add generic Tunnel segmentation. 2013-03-09 16:09:17 -05:00
ah4.c net: Add skb_unclone() helper function. 2013-02-15 15:10:37 -05:00
arp.c net: proc: change proc_net_remove to remove_proc_entry 2013-02-18 14:53:08 -05:00
cipso_ipv4.c cipso: don't follow a NULL pointer when setsockopt() is called 2012-07-18 09:01:12 -07:00
datagram.c ipv4: Add a socket release callback for datagram sockets 2013-01-21 14:17:05 -05:00
devinet.c netconf: add the handler to dump entries 2013-03-06 15:40:53 -05:00
esp4.c xfrm4: Invalidate all ipv4 routes on IPsec pmtu events 2013-01-21 12:43:54 +01:00
fib_frontend.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
fib_lookup.h ipv4: Fix nexthop caching wrt. scoping. 2011-03-24 18:06:47 -07:00
fib_rules.c sections: fix section conflicts in net 2012-10-06 03:04:45 +09:00
fib_semantics.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
fib_trie.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
gre.c v4 GRE: Add TCP segmentation offload for GRE 2013-02-15 15:17:11 -05:00
icmp.c ipv4: fix error handling in icmp_protocol. 2013-02-22 15:10:18 -05:00
igmp.c net: proc: change proc_net_remove to remove_proc_entry 2013-02-18 14:53:08 -05:00
inet_connection_sock.c Fix: sparse warning in inet_csk_prepare_forced_close 2013-03-07 16:31:29 -05:00
inet_diag.c tcp: Tail loss probe (TLP) 2013-03-12 08:30:34 -04:00
inet_fragment.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
inet_hashtables.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
inet_lro.c net: add skb frag size accessors 2011-10-19 03:10:46 -04:00
inet_timewait_sock.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
inetpeer.c Merge branch 'for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq 2012-10-02 09:54:49 -07:00
ip_forward.c ipv4: introduce rt_uses_gateway 2012-10-08 17:42:36 -04:00
ip_fragment.c net: Add skb_unclone() helper function. 2013-02-15 15:10:37 -05:00
ip_gre.c tunnel: use iptunnel_xmit() again 2013-03-10 03:05:44 -04:00
ip_input.c ipv[4|6]: correct dropwatch false positive in local_deliver_finish 2013-03-01 15:56:29 -05:00
ip_options.c net/ipv4: Ensure that location of timestamp option is stored 2013-03-12 05:35:39 -04:00
ip_output.c net: Fix possible wrong checksum generation. 2013-02-13 13:30:10 -05:00
ip_sockglue.c net: prevent setting ttl=0 via IP_TTL 2013-01-08 17:57:10 -08:00
ip_vti.c net: Allow userns root to control ipv4 2012-11-18 20:32:45 -05:00
ipcomp.c ipcomp: Mark as netns_ok. 2013-02-04 15:46:15 -05:00
ipconfig.c net: proc: change proc_net_fops_create to proc_create 2013-02-18 14:53:08 -05:00
ipip.c tunnel: use iptunnel_xmit() again 2013-03-10 03:05:44 -04:00
ipmr.c net: proc: change proc_net_remove to remove_proc_entry 2013-02-18 14:53:08 -05:00
Kconfig net/ipv4: remove depends on CONFIG_EXPERIMENTAL 2013-01-11 11:40:00 -08:00
Makefile memcg: rename config variables 2012-07-31 18:42:43 -07:00
netfilter.c netfilter: properly annotate ipv4_netfilter_{init,fini}() 2012-09-03 13:56:04 +02:00
ping.c ipv4: fix a bug in ping_err(). 2013-02-21 15:25:00 -05:00
proc.c tcp: Tail loss probe (TLP) 2013-03-12 08:30:34 -04:00
protocol.c ipv4: Disallow non-namespace aware protocols to register. 2013-02-05 14:42:23 -05:00
raw.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
route.c net: ipv4: fix waring -Wunused-variable 2013-02-19 13:18:13 -05:00
syncookies.c tcp: make sysctl_tcp_ecn namespace aware 2013-01-06 21:09:56 -08:00
sysctl_net_ipv4.c tcp: Tail loss probe (TLP) 2013-03-12 08:30:34 -04:00
tcp_bic.c tcp: fix undo after RTO for BIC 2012-01-20 14:17:26 -05:00
tcp_cong.c tcp: remove Appropriate Byte Count support 2013-02-05 14:51:16 -05:00
tcp_cubic.c tcp: fix undo after RTO for CUBIC 2012-01-20 14:17:26 -05:00
tcp_diag.c inet_diag: Rename inet_diag_req into inet_diag_req_v2 2012-01-11 12:56:06 -08:00
tcp_fastopen.c tcp: TCP Fast Open Server - header & support functions 2012-08-31 20:02:18 -04:00
tcp_highspeed.c tcp: mark tcp_congestion_ops read_mostly 2011-03-10 00:40:17 -08:00
tcp_htcp.c tcp: mark tcp_congestion_ops read_mostly 2011-03-10 00:40:17 -08:00
tcp_hybla.c tcp: bool conversions 2012-05-17 14:59:59 -04:00
tcp_illinois.c net: fix divide by zero in tcp algorithm illinois 2012-11-01 11:55:59 -04:00
tcp_input.c tcp: Tail loss probe (TLP) 2013-03-12 08:30:34 -04:00
tcp_ipv4.c tcp: Tail loss probe (TLP) 2013-03-12 08:30:34 -04:00
tcp_lp.c Fix common misspellings 2011-03-31 11:26:23 -03:00
tcp_memcontrol.c memcg: decrement static keys at real destroy time 2012-05-29 16:22:28 -07:00
tcp_metrics.c tcp: handle tcp_net_metrics_init() order-5 memory allocation failures 2012-11-16 13:36:27 -05:00
tcp_minisocks.c tcp: send packets with a socket timestamp 2013-02-13 13:22:16 -05:00
tcp_output.c tcp: Tail loss probe (TLP) 2013-03-12 08:30:34 -04:00
tcp_probe.c net: proc: change proc_net_remove to remove_proc_entry 2013-02-18 14:53:08 -05:00
tcp_scalable.c tcp: mark tcp_congestion_ops read_mostly 2011-03-10 00:40:17 -08:00
tcp_timer.c tcp: Tail loss probe (TLP) 2013-03-12 08:30:34 -04:00
tcp_vegas.c tcp: mark tcp_congestion_ops read_mostly 2011-03-10 00:40:17 -08:00
tcp_vegas.h
tcp_veno.c tcp: mark tcp_congestion_ops read_mostly 2011-03-10 00:40:17 -08:00
tcp_westwood.c tcp: mark tcp_congestion_ops read_mostly 2011-03-10 00:40:17 -08:00
tcp_yeah.c Fix common misspellings 2011-03-31 11:26:23 -03:00
tcp.c tunneling: Add generic Tunnel segmentation. 2013-03-09 16:09:17 -05:00
tunnel4.c net: Convert printks to pr_<level> 2012-03-11 23:42:51 -07:00
udp_diag.c netlink: Rename pid to portid to avoid confusion 2012-09-10 15:30:41 -04:00
udp_impl.h ipv4: fix checkpatch errors 2012-04-15 12:37:19 -04:00
udp.c tunneling: Add generic Tunnel segmentation. 2013-03-09 16:09:17 -05:00
udplite.c net: ipv4: Standardize prefixes for message logging 2012-03-12 17:05:21 -07:00
xfrm4_input.c net: Add skb_unclone() helper function. 2013-02-15 15:10:37 -05:00
xfrm4_mode_beet.c ipsec: be careful of non existing mac headers 2012-02-23 16:50:45 -05:00
xfrm4_mode_transport.c
xfrm4_mode_tunnel.c ip: fix warning in xfrm4_mode_tunnel_input 2013-02-18 12:42:48 -05:00
xfrm4_output.c xfrm4: Don't call icmp_send on local error 2011-07-01 17:33:19 -07:00
xfrm4_policy.c xfrm: make gc_thresh configurable in all namespaces 2013-02-06 11:36:29 +01:00
xfrm4_state.c net: Add export.h for EXPORT_SYMBOL/THIS_MODULE to non-modules 2011-10-31 19:30:30 -04:00
xfrm4_tunnel.c net: ipv4: Standardize prefixes for message logging 2012-03-12 17:05:21 -07:00