linux/net/core
Eric Dumazet edb09eb17e net: sched: do not acquire qdisc spinlock in qdisc/class stats dump
Large tc dumps (tc -s {qdisc|class} sh dev ethX) done by Google BwE host
agent [1] are problematic at scale :

For each qdisc/class found in the dump, we currently lock the root qdisc
spinlock in order to get stats. Sampling stats every 5 seconds from
thousands of HTB classes is a challenge when the root qdisc spinlock is
under high pressure. Not only the dumps take time, they also slow
down the fast path (queue/dequeue packets) by 10 % to 20 % in some cases.

An audit of existing qdiscs showed that sch_fq_codel is the only qdisc
that might need the qdisc lock in fq_codel_dump_stats() and
fq_codel_dump_class_stats()

In v2 of this patch, I now use the Qdisc running seqcount to provide
consistent reads of packets/bytes counters, regardless of 32/64 bit arches.

I also changed rate estimators to use the same infrastructure
so that they no longer need to lock root qdisc lock.

[1]
http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43838.pdf

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Kevin Athey <kda@google.com>
Cc: Xiaotian Pei <xiaotian@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-07 16:37:14 -07:00
..
datagram.c udp: enable MSG_PEEK at non-zero offset 2016-04-05 16:29:37 -04:00
dev_addr_lists.c net: fix spelling for synchronized 2014-11-18 15:26:32 -05:00
dev_ioctl.c dev_ioctl: use sizeof(x) instead of sizeof x 2014-11-18 15:27:32 -05:00
dev.c net_sched: transform qdisc running bit into a seqcount 2016-06-07 16:37:13 -07:00
devlink.c devlink: implement shared buffer occupancy monitoring interface 2016-04-14 16:22:03 -04:00
drop_monitor.c net: Replace get_cpu_var through this_cpu_ptr 2014-08-26 13:45:47 -04:00
dst_cache.c net: dst_cache_per_cpu_dst_set() can be static 2016-03-18 17:45:08 -04:00
dst.c net: add dst_cache to ovs vxlan lwtunnel 2016-02-16 20:21:48 -05:00
ethtool.c sctp: Add GSO support 2016-06-03 19:37:21 -04:00
fib_rules.c libnl: nla_put_be64(): align on a 64-bit area 2016-04-23 20:13:24 -04:00
filter.c bpf: prepare bpf_int_jit_compile/bpf_prog_select_runtime apis 2016-05-16 13:49:32 -04:00
flow_dissector.c net/flow_dissector: Make dissector_uses_key() and skb_flow_dissector_target() public 2016-03-10 16:24:02 -05:00
flow.c flowcache: Avoid OOM condition under preasure 2016-03-17 10:28:42 +01:00
gen_estimator.c net: sched: do not acquire qdisc spinlock in qdisc/class stats dump 2016-06-07 16:37:14 -07:00
gen_stats.c net: sched: do not acquire qdisc spinlock in qdisc/class stats dump 2016-06-07 16:37:14 -07:00
hwbm.c net: hwbm: Fix unbalanced spinlock in error case 2016-05-25 12:35:09 -07:00
link_watch.c dev: introduce dev_get_iflink() 2015-04-02 14:04:59 -04:00
lwtunnel.c lwtunnel: autoload of lwt modules 2016-02-21 22:00:28 -05:00
Makefile net: add a hardware buffer management helper API 2016-03-14 12:19:46 -04:00
neighbour.c neigh: align nlattr properly when needed 2016-04-26 12:00:49 -04:00
net_namespace.c netns: make nsid_lock per net 2015-05-17 23:41:11 -04:00
net-procfs.c net: remove NETDEV_TX_LOCKED support 2016-04-26 15:53:05 -04:00
net-sysfs.c net: core: use __ethtool_get_ksettings 2016-02-25 22:06:47 -05:00
net-sysfs.h net: netdev_kobject_init: annotate with __init 2014-01-05 20:27:54 -05:00
net-traces.c net: IPv6 fib lookup tracepoint 2015-11-22 11:54:10 -05:00
netclassid_cgroup.c core: remove unneded headers for net cgroup controllers. 2016-02-17 15:31:27 -05:00
netevent.c netevent: remove automatic variable in register_netevent_notifier() 2015-05-31 00:03:21 -07:00
netpoll.c Revert "netpoll: Fix extra refcount release in netpoll_cleanup()" 2016-04-05 19:34:44 -04:00
netprio_cgroup.c core: remove unneded headers for net cgroup controllers. 2016-02-17 15:31:27 -05:00
pktgen.c net: pktgen: Call destroy_hrtimer_on_stack() 2016-05-31 11:44:08 -07:00
ptp_classifier.c ptp: Change ptp_class to a proper bitmask 2015-11-03 11:08:22 -05:00
request_sock.c tcp: restore fastopen operations 2015-10-05 03:19:06 -07:00
rtnetlink.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-05-09 15:59:24 -04:00
scm.c unix: correctly track in-flight fds in sending process user_struct 2016-02-08 10:30:42 -05:00
secure_seq.c net: remove a sparse error in secure_dccpv6_sequence_number() 2015-05-25 22:55:37 -04:00
skbuff.c net: Add docbook description for 'mtu' arg to skb_gso_validate_mtu() 2016-06-03 22:56:28 -07:00
sock_diag.c sock_diag: align nlattr properly when needed 2016-04-26 12:00:48 -04:00
sock_reuseport.c soreuseport: fix NULL ptr dereference SO_REUSEPORT after bind 2016-01-19 14:44:23 -05:00
sock.c net: add __sock_wfree() helper 2016-05-03 16:02:36 -04:00
stream.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-12-03 21:09:12 -05:00
sysctl_net_core.c bpf: add generic constant blinding for use in jits 2016-05-16 13:49:32 -04:00
timestamping.c net: skb_defer_rx_timestamp should check for phydev before setting up classify 2015-07-09 14:17:15 -07:00
tso.c net: tso: add support for IPv6 2015-10-26 22:24:22 -07:00
utils.c net: move net_get_random_once to lib 2015-10-08 05:26:35 -07:00