linux/drivers
Venkatesh Srinivas f277ec42f3 virtio_ring: shadow available ring flags & index
Improves cacheline transfer flow of available ring header.

Virtqueues are implemented as a pair of rings, one producer->consumer
avail ring and one consumer->producer used ring; preceding the
avail ring in memory are two contiguous u16 fields -- avail->flags
and avail->idx. A producer posts work by writing to avail->idx and
a consumer reads avail->idx.

The flags and idx fields only need to be written by a producer CPU
and only read by a consumer CPU; when the producer and consumer are
running on different CPUs and the virtio_ring code is structured to
only have source writes/sink reads, we can continuously transfer the
avail header cacheline between 'M' states between cores. This flow
optimizes core -> core bandwidth on certain CPUs.

(see: "Software Optimization Guide for AMD Family 15h Processors",
Section 11.6; similar language appears in the 10h guide and should
apply to CPUs w/ exclusive caches, using LLC as a transfer cache)

Unfortunately the existing virtio_ring code issued reads to the
avail->idx and read-modify-writes to avail->flags on the producer.

This change shadows the flags and index fields in producer memory;
the vring code now reads from the shadows and only ever writes to
avail->flags and avail->idx, allowing the cacheline to transfer
core -> core optimally.

In a concurrent version of vring_bench, the time required for
10,000,000 buffer checkout/returns was reduced by ~2% (average
across many runs) on an AMD Piledriver (15h) CPU:

(w/o shadowing):
 Performance counter stats for './vring_bench':
     5,451,082,016      L1-dcache-loads
     ...
       2.221477739 seconds time elapsed

(w/ shadowing):
 Performance counter stats for './vring_bench':
     5,405,701,361      L1-dcache-loads
     ...
       2.168405376 seconds time elapsed

The further away (in a NUMA sense) virtio producers and consumers are
from each other, the more we expect to benefit. Physical implementations
of virtio devices and implementations of virtio where the consumer polls
vring avail indexes (vhost) should also benefit.

Signed-off-by: Venkatesh Srinivas <venkateshs@google.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-12-07 17:28:11 +02:00
..
accessibility
acpi Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm 2015-12-04 11:30:45 -08:00
amba
android
ata SCSI misc on 20151113 2015-11-13 20:35:54 -08:00
atm
auxdisplay
base Merge branches 'pm-domains' and 'pm-cpufreq' 2015-12-04 14:01:42 +01:00
bcma
block Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client 2015-12-04 12:46:07 -08:00
bluetooth Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-11-10 18:11:41 -08:00
bus Merge branch 'x15-audio-fixes' into omap-for-v4.4/fixes 2015-11-12 09:58:21 -08:00
cdrom
char ipmi watchdog : add panic_wdt_timeout parameter 2015-11-16 06:28:43 -06:00
clk h8300 update for v4.4 2015-11-12 15:26:39 -08:00
clocksource clocksource: Disallow drivers for ARCH_USES_GETTIMEOFFSET 2015-11-16 19:07:08 +01:00
connector mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd 2015-11-06 17:50:42 -08:00
cpufreq Merge branches 'pm-domains' and 'pm-cpufreq' 2015-12-04 14:01:42 +01:00
cpuidle
crypto Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2015-12-05 10:46:44 -08:00
dca
devfreq
dio
dma dmaengine: at_hdmac: use %pad format string for dma_addr_t 2015-11-16 09:21:05 +05:30
dma-buf dma-buf/fence: add fence_wait_any_timeout function v2 2015-10-30 01:16:16 -04:00
edac asm-generic cleanups 2015-11-06 14:22:15 -08:00
eisa
extcon Merge branches 'ib-extcon-mfd-4.4', 'ib-mfd-i2c-v4.4', 'ib-mfd-power-4.4', 'ib-mfd-regmap-4.4' and 'ib-mfd-regulator-4.4' into ibs-for-mfd-merged 2015-10-26 14:48:22 +00:00
firewire IEEE 1394 subsystem patch: 2015-11-11 10:21:34 -08:00
firmware ARM: SoC driver updates for v4.4 2015-11-10 15:00:03 -08:00
fmc
fpga fpga: socfpga: Fix check of return value of devm_request_irq 2015-10-29 15:20:25 -07:00
gpio gpio: omap: drop omap1 mpuio specific irq_mask/unmask callbacks 2015-11-30 13:50:21 +01:00
gpu Merge branch 'drm-fixes-4.4' of git://people.freedesktop.org/~agd5f/linux into drm-next 2015-12-05 16:15:38 +10:00
hid HID: lg: restrict filtering out of first interface to G29 only 2015-12-02 14:51:00 +01:00
hsi hsi: controllers:remove redundant code 2015-10-30 16:10:40 +01:00
hv drivers/hv: share Hyper-V SynIC constants with userspace 2015-11-04 16:24:33 +01:00
hwmon hwmon: (scpi) skip unsupported sensors properly 2015-11-16 09:59:50 -08:00
hwspinlock
hwtracing Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending 2015-11-13 20:04:17 -08:00
i2c i2c: i801: add Intel Lewisburg device IDs 2015-11-20 16:22:21 +01:00
ide mm, page_alloc: rename __GFP_WAIT to __GFP_RECLAIM 2015-11-06 17:50:42 -08:00
idle
iio First set of IIO fixes for the 4.4 cycle. 2015-11-18 13:15:50 -08:00
infiniband SCSI misc on 20151113 2015-11-13 20:35:54 -08:00
input Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input 2015-11-13 21:41:14 -08:00
iommu s390/pci_dma: handle dma table failures 2015-11-09 09:10:49 +01:00
ipack
irqchip irqchip/gic: Add save/restore of the active state 2015-11-17 14:25:59 +01:00
isdn isdn: Partially revert debug format string usage clean up 2015-11-25 11:49:58 -05:00
leds spi: Updates for v4.4 2015-11-05 13:15:12 -08:00
lguest
lightnvm lightnvm: missing nvm_lock acquire 2015-11-29 14:34:58 -07:00
macintosh
mailbox mailbox: mailbox-test: avoid reading iomem twice 2015-11-04 14:03:04 +05:30
mcb mcb: Destroy IDA on module unload 2015-10-29 09:02:16 +09:00
md dm thin: fix regression in advertised discard limits 2015-11-23 14:54:46 -05:00
media various: fix pci_set_dma_mask return value checking 2015-11-20 16:17:32 -08:00
memory ARM: SoC driver updates for v4.4 2015-11-10 15:00:03 -08:00
memstick
message SCSI queue for 4.4. 2015-11-12 07:06:18 -05:00
mfd asm-generic cleanups 2015-11-06 14:22:15 -08:00
misc Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending 2015-11-13 20:04:17 -08:00
mmc mmc: remove bondage between REQ_META and reliable write 2015-11-09 14:04:52 +01:00
mtd mtd: nand: fix shutdown/reboot for multi-chip systems 2015-11-16 10:51:39 -08:00
net virtio-net: Stop doing DMA from the stack 2015-12-07 16:10:53 +02:00
nfc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-11-10 18:11:41 -08:00
ntb NTB: fix 32-bit compiler warning 2015-11-08 16:24:43 -05:00
nubus
nvdimm libnvdimm, pmem: fix size trim in pmem_direct_access() 2015-11-12 09:55:23 -08:00
nvme nvme: temporary fix for Apple controller reset 2015-12-01 13:23:22 -07:00
nvmem
of More power management and ACPI updates for v4.4-rc1 2015-11-12 11:50:33 -08:00
oprofile
parisc pci: remove pci_dma_supported 2015-11-10 16:32:11 -08:00
parport
pci PCI / PM: Tune down retryable runtime suspend error messages 2015-12-02 15:24:21 +01:00
pcmcia
perf arm64 updates for 4.4: 2015-11-04 14:47:13 -08:00
phy phy: qcom-ufs: fix build error when the component is built as a module 2015-11-09 17:44:24 -05:00
pinctrl pinctrl: sh-pfc: sh7734: Add missing cfg macro parameter to fix build 2015-12-01 11:13:04 +01:00
platform platform/chrome: Branch for v4.4 2015-11-13 21:53:18 -08:00
pnp
power - New Device Support 2015-11-06 10:23:50 -08:00
powercap
pps
ps3
ptp
pwm pwm: Changes for v4.4-rc1 2015-11-11 09:16:10 -08:00
rapidio
ras
regulator spi: Updates for v4.4 2015-11-05 13:15:12 -08:00
remoteproc remoteproc: fix memory leak of remoteproc ida cache layers 2015-11-26 17:44:28 +02:00
reset
rpmsg
rtc rtc: ds1307: fix alarm reading at probe time 2015-11-26 18:11:26 +01:00
s390 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux 2015-11-18 08:59:29 -08:00
sbus
scsi SCSI fixes on 20151205 2015-12-06 08:02:25 -08:00
sfi
sh drivers: sh: Get rid of CONFIG_ARCH_SHMOBILE_MULTI 2015-11-17 02:12:46 +09:00
sn
soc Few Keystone fixes for 4.4-rcx 2015-11-25 23:48:12 +01:00
spi Merge remote-tracking branches 'spi/fix/bcm63xx', 'spi/fix/doc', 'spi/fix/mediatek' and 'spi/fix/pl022' into spi-linus 2015-11-30 12:26:47 +00:00
spmi char/misc drivers for 4.4-rc1 2015-11-04 22:15:15 -08:00
ssb ssb: add Kconfig entry for compiling SoC related code 2015-10-28 21:05:21 +02:00
staging staging/lustre: remove IOC_LIBCFS_PING_TEST ioctl 2015-12-06 14:50:59 -08:00
target target/stat: print full t10_wwn.model buffer 2015-11-28 21:23:13 -08:00
tc
thermal imx: thermal: use CPU temperature grade info for thresholds 2015-11-23 16:38:40 -08:00
thunderbolt
tty serial: export fsl8250_handle_irq 2015-11-20 16:19:54 -08:00
uio
usb usblp: do not set TASK_INTERRUPTIBLE before lock 2015-11-19 16:31:42 -08:00
uwb driver core update for 4.4-rc1 2015-11-04 21:50:37 -08:00
vfio VFIO updates for v4.4-rc1 2015-11-13 17:05:32 -08:00
vhost vhost: replace % with & on data path 2015-12-07 17:28:10 +02:00
video fbdev changes for 4.4 2015-11-10 10:00:09 -08:00
virt
virtio virtio_ring: shadow available ring flags & index 2015-12-07 17:28:11 +02:00
vlynq
vme char/misc drivers for 4.4-rc1 2015-11-04 22:15:15 -08:00
w1 power supply and reset changes for the v4.4 series 2015-11-05 12:28:15 -08:00
watchdog watchdog: mtk_wdt: Use MODE_KEY when stopping the watchdog 2015-11-23 09:00:09 +01:00
xen xen: bug fixes for 4.4-rc2 2015-11-26 11:42:25 -08:00
zorro
Kconfig char/misc drivers for 4.4-rc1 2015-11-04 22:15:15 -08:00
Makefile null_blk: register as a LightNVM device 2015-11-16 15:22:28 -07:00