linux/net
Jesper Dangaard Brouer 5772e9a346 qdisc: bulk dequeue support for qdiscs with TCQ_F_ONETXQUEUE
Based on DaveM's recent API work on dev_hard_start_xmit(), that allows
sending/processing an entire skb list.

This patch implements qdisc bulk dequeue, by allowing multiple packets
to be dequeued in dequeue_skb().

The optimization principle for this is two fold, (1) to amortize
locking cost and (2) avoid expensive tailptr update for notifying HW.
 (1) Several packets are dequeued while holding the qdisc root_lock,
amortizing locking cost over several packet.  The dequeued SKB list is
processed under the TXQ lock in dev_hard_start_xmit(), thus also
amortizing the cost of the TXQ lock.
 (2) Further more, dev_hard_start_xmit() will utilize the skb->xmit_more
API to delay HW tailptr update, which also reduces the cost per
packet.

One restriction of the new API is that every SKB must belong to the
same TXQ.  This patch takes the easy way out, by restricting bulk
dequeue to qdisc's with the TCQ_F_ONETXQUEUE flag, that specifies the
qdisc only have attached a single TXQ.

Some detail about the flow; dev_hard_start_xmit() will process the skb
list, and transmit packets individually towards the driver (see
xmit_one()).  In case the driver stops midway in the list, the
remaining skb list is returned by dev_hard_start_xmit().  In
sch_direct_xmit() this returned list is requeued by dev_requeue_skb().

To avoid overshooting the HW limits, which results in requeuing, the
patch limits the amount of bytes dequeued, based on the drivers BQL
limits.  In-effect bulking will only happen for BQL enabled drivers.

Small amounts for extra HoL blocking (2x MTU/0.24ms) were
measured at 100Mbit/s, with bulking 8 packets, but the
oscillating nature of the measurement indicate something, like
sched latency might be causing this effect. More comparisons
show, that this oscillation goes away occationally. Thus, we
disregard this artifact completely and remove any "magic" bulking
limit.

For now, as a conservative approach, stop bulking when seeing TSO and
segmented GSO packets.  They already benefit from bulking on their own.
A followup patch add this, to allow easier bisect-ability for finding
regressions.

Jointed work with Hannes, Daniel and Florian.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-03 12:37:06 -07:00
..
6lowpan 6lowpan: Allow 6LoWPAN to be modular 2014-08-07 11:44:18 -07:00
9p 9P: remove unnecessary break after return 2014-07-15 16:27:00 -07:00
802 net: set name_assign_type in alloc_netdev() 2014-07-15 16:12:48 -07:00
8021q net: Always untag vlan-tagged traffic on input. 2014-08-11 12:16:51 -07:00
appletalk Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-07-16 14:09:34 -07:00
atm atm: Convert pr_warning to pr_warn 2014-09-10 12:40:10 -07:00
ax25
batman-adv batman-adv: Fix parameter order of hlist_add_behind 2014-08-16 19:19:08 -07:00
bluetooth Bluetooth: Fix re-setting RPA as expired when deferring update 2014-09-12 18:34:25 +02:00
bridge net: bridge: add a br_set_state helper function 2014-10-01 22:03:50 -04:00
caif caif: remove unnecessary break after goto 2014-07-15 16:27:01 -07:00
can can: add hash based access to single EFF frame filters 2014-05-19 09:38:24 +02:00
ceph libceph: do not hard code max auth ticket len 2014-09-10 20:08:36 +04:00
core Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-10-02 11:25:43 -07:00
dcb dcbnl : Fix misleading dcb_app->priority explanation 2014-07-30 17:21:05 -07:00
dccp net/dccp/ccid.c: add __init to ccid_activate 2014-10-01 18:33:13 -04:00
decnet af_decnet: Use time_after_eq 2014-08-22 12:23:11 -07:00
dns_resolver Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2014-08-06 08:06:39 -07:00
dsa net: dsa: Fix build warning for !PM_SLEEP 2014-10-01 15:24:00 -04:00
ethernet net: Add function for parsing the header length out of linear ethernet frames 2014-09-05 17:47:02 -07:00
hsr net/hsr: Remove left-over never-true conditional code. 2014-07-11 15:04:40 -07:00
ieee802154 ieee802154: fix __init functions 2014-10-01 02:03:13 -04:00
ipv4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-10-02 11:25:43 -07:00
ipv6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-10-02 11:25:43 -07:00
ipx net: Split sk_no_check into sk_no_check_{rx,tx} 2014-05-23 16:28:53 -04:00
irda irda: add __init to irlan_open 2014-09-30 17:08:06 -04:00
iucv iucv: Convert pr_warning to pr_warn 2014-09-10 12:40:10 -07:00
key af_key: remove unnecessary break after return 2014-07-15 16:27:00 -07:00
l2tp l2tp: Refactor l2tp core driver to make use of the common UDP tunnel functions 2014-09-19 15:57:15 -04:00
lapb
llc
mac80211 Merge tag 'master-2014-09-16' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next 2014-09-26 15:39:24 -04:00
mac802154 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless 2014-09-08 11:14:56 -04:00
mpls net: Remove gso_send_check as an offload callback 2014-09-26 00:22:47 -04:00
netfilter Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-10-02 11:25:43 -07:00
netlabel Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2014-08-06 09:38:14 -07:00
netlink netlink: Annotate RCU locking for seq_file walker 2014-08-14 15:13:40 -07:00
netrom net: set name_assign_type in alloc_netdev() 2014-07-15 16:12:48 -07:00
nfc Merge tag 'master-2014-07-31' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next 2014-08-05 13:18:20 -07:00
openvswitch net/openvswitch: remove dup comment in vport.h 2014-09-26 16:42:33 -04:00
packet net: Pass a "more" indication down into netdev_start_xmit() code paths. 2014-09-01 17:39:55 -07:00
phonet net: set name_assign_type in alloc_netdev() 2014-07-15 16:12:48 -07:00
rds Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2014-06-12 14:27:40 -07:00
rfkill net: rfkill: gpio: Fix clock status 2014-09-22 16:02:15 -04:00
rose rose: use %*ph specifier 2014-09-07 16:07:25 -07:00
rxrpc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-09-23 12:09:27 -04:00
sched qdisc: bulk dequeue support for qdiscs with TCQ_F_ONETXQUEUE 2014-10-03 12:37:06 -07:00
sctp net/ipv4: bind ip_nonlocal_bind to current netns 2014-09-09 11:27:09 -07:00
sunrpc NFS client updates for Linux 3.17 2014-08-13 18:13:19 -06:00
tipc tipc: fix sparse warnings 2014-09-10 14:00:58 -07:00
unix Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2014-06-12 14:27:40 -07:00
vmw_vsock
wimax
wireless Merge tag 'master-2014-09-16' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next 2014-09-26 15:39:24 -04:00
x25
xfrm net: cleanup and document skb fclone layout 2014-10-01 16:34:25 -04:00
compat.c net: sendmsg: fix NULL pointer dereference 2014-07-29 12:20:22 -07:00
Kconfig netfilter: bridge: build br_nf_core only if required 2014-09-30 14:07:51 -04:00
Makefile 6lowpan: introduce new net/6lowpan directory 2014-07-12 01:53:30 +02:00
nonet.c
socket.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-09-23 12:09:27 -04:00
sysctl_net.c