Thomas Huth 831e882253 hw/net/spapr_llan: Fix receive buffer handling for better performance
tl;dr:
This patch introduces an alternate way of handling the receive
buffers of the spapr-vlan device, resulting in much better
receive performance for the guest.

Full story:
One of our testers recently discovered that the performance of the
spapr-vlan device is very poor compared to other NICs, and that
a simple "ping -i 0.2 -s 65507 someip" in the guest can result
in more than 50% lost ping packets (especially with older guest
kernels < 3.17).

After doing some analysis, it was clear that there is a problem
with the way we handle the receive buffers in spapr_llan.c: The
ibmveth driver of the guest Linux kernel tries to add a lot of
buffers into several buffer pools (with 512, 2048 and 65536 byte
sizes by default, but it can be changed via the entries in the
/sys/devices/vio/1000/pool* directories of the guest). However,
the spapr-vlan device of QEMU only tries to squeeze all receive
buffer descriptors into one single page which has been supplied
by the guest during the H_REGISTER_LOGICAL_LAN call, without
taking care of different buffer sizes. This has two bad effects:
First, only a very limited number of buffer descriptors is accepted
at all. Second, we also hand 64k buffers to the guest even if
the 2k buffers would fit better - and this results in dropped packets
in the IP layer of the guest since too much skbuf memory is used.

Though it seems at first glance like PAPR says that we should store
the receive buffer descriptors in the page that is supplied during
the H_REGISTER_LOGICAL_LAN call, chapter 16.4.1.2 in the LoPAPR spec
declares that "the contents of these descriptors are architecturally
opaque, none of these descriptors are manipulated by code above
the architected interfaces". That means we don't have to store
the RX buffer descriptors in this page, but can also manage the
receive buffers at the hypervisor level only. This is now what we
are doing here: Introducing proper RX buffer pools which are also
sorted by size of the buffers, so we can hand out a buffer with
the best fitting size when a packet has been received.
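
As a rough illustration of the best-fit idea described above (not the
actual spapr_llan.c code), the sketch below keeps a small array of pools
sorted by ascending buffer size and hands out a buffer from the smallest
pool that can hold the packet. All names (RxPool, rx_add_buffer,
rx_get_buffer) and the limits MAX_POOLS and POOL_DEPTH are hypothetical
and chosen only for this example.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_POOLS  5   /* assumed limit, for illustration only */
#define POOL_DEPTH 8   /* assumed number of buffers per pool */

typedef struct {
    uint32_t bufsize;           /* size of every buffer in this pool */
    int count;                  /* number of buffers currently queued */
    uint64_t bufs[POOL_DEPTH];  /* guest addresses of the buffers */
} RxPool;

/* Pools are kept sorted by ascending bufsize. */
RxPool rx_pools[MAX_POOLS];
int rx_npools;

/* Queue a guest buffer; create a pool for its size if none exists yet. */
int rx_add_buffer(uint32_t size, uint64_t addr)
{
    int i, j;

    /* Find the pool for this size, or the position where it belongs. */
    for (i = 0; i < rx_npools && rx_pools[i].bufsize < size; i++) {
    }
    if (i == rx_npools || rx_pools[i].bufsize != size) {
        if (rx_npools == MAX_POOLS) {
            return -1;          /* no room for another pool size */
        }
        /* Shift the larger pools up to keep the array sorted. */
        for (j = rx_npools; j > i; j--) {
            rx_pools[j] = rx_pools[j - 1];
        }
        rx_pools[i] = (RxPool){ .bufsize = size };
        rx_npools++;
    }
    assert(rx_npools <= MAX_POOLS);
    if (rx_pools[i].count == POOL_DEPTH) {
        return -1;              /* this pool is full */
    }
    rx_pools[i].bufs[rx_pools[i].count++] = addr;
    return 0;
}

/* Take a buffer from the smallest non-empty pool that fits pkt_len. */
int rx_get_buffer(uint32_t pkt_len, uint64_t *addr, uint32_t *size)
{
    int i;

    for (i = 0; i < rx_npools; i++) {
        if (rx_pools[i].bufsize >= pkt_len && rx_pools[i].count > 0) {
            *addr = rx_pools[i].bufs[--rx_pools[i].count];
            *size = rx_pools[i].bufsize;
            return 0;
        }
    }
    return -1;                  /* no suitable buffer: packet is dropped */
}
```

With 512, 2048 and 65536 byte buffers queued, a 1500 byte packet would
get a 2048 byte buffer in this sketch, and would fall back to a 65536
byte buffer only once the 2048 byte pool is empty.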

To avoid problems with migration from/to older versions of QEMU,
the old behavior is also retained and enabled by default. The new
buffer management has to be enabled via a new "use-rx-buffer-pools"
property.

Now with the new buffer pool management enabled, the problem with
"ping -s 65507" is fixed for me, and the throughput of a simple
test with wget increases from creeping 3MB/s up to 20MB/s!

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-03-24 11:17:34 +11:00
audio all: Clean up includes 2016-02-23 12:43:05 +00:00
backends qapi: Don't special-case simple union wrappers 2016-03-18 10:29:26 +01:00
block qapi: Don't special-case simple union wrappers 2016-03-18 10:29:26 +01:00
bsd-user build: [bsd-user] Rename "syscall.h" to "target_syscall.h" in target directories 2016-02-25 16:41:08 +00:00
contrib contrib/ivshmem-server: Print "not for production" warning 2016-03-21 21:29:03 +01:00
crypto crypto: fix cipher function signature mismatch with nettle & xts 2016-03-21 10:03:45 +00:00
default-configs event_notifier: Make event_notifier_init_fd() #ifdef CONFIG_EVENTFD 2016-03-21 21:28:59 +01:00
disas Remove unneeded include statements for setjmp.h 2016-03-22 19:11:15 +01:00
docs ivshmem: Fixes, cleanups, device model split 2016-03-23 12:57:44 +00:00
dtc@65cc4d2748 dtc: Update dtc / libfdt submodule to version 1.4.0 2015-06-03 23:56:49 +02:00
fpu fpu: Use plain 'int' rather than 'int_fast16_t' for exponents 2016-02-19 16:27:22 +00:00
fsdev module: Rename machine_init() to opts_init() 2016-03-16 15:54:23 -03:00
gdb-xml target-ppc: gdbstub: Add VSX support 2016-01-30 23:37:38 +11:00
hw hw/net/spapr_llan: Fix receive buffer handling for better performance 2016-03-24 11:17:34 +11:00
include ivshmem: Fixes, cleanups, device model split 2016-03-23 12:57:44 +00:00
io osdep: remove use of socket_error() from all code 2016-03-10 17:19:34 +00:00
libdecnumber libdecnumber: Clean up includes 2016-02-16 14:29:27 +00:00
linux-headers linux-headers: update against kvm/next 2016-03-01 12:15:28 +01:00
linux-user osdep: add wrappers for socket functions 2016-03-10 17:19:07 +00:00
migration migration: 2016-03-14 13:51:21 +00:00
nbd all: Clean up includes 2016-02-23 12:43:05 +00:00
net qapi: Don't special-case simple union wrappers 2016-03-18 10:29:26 +01:00
pc-bios pc-bios/s390-ccw: fix old bug in ptr increment 2016-03-10 10:37:16 +01:00
pixman@87eea99e44 pixman: update internal copy to pixman-0.32.6 2014-09-15 08:14:19 +02:00
po Update language files for QEMU 2.5.0 2015-12-10 13:50:45 +00:00
qapi QAPI patches for 2016-03-18 2016-03-18 17:18:41 +00:00
qga qemu-ga: drop unused local err variable 2016-03-20 19:51:18 -05:00
qobject qobject: Document more shortcomings in our number handling 2016-02-08 17:29:54 +01:00
qom cpu: Clean up includes 2016-02-23 12:43:04 +00:00
replay qapi: Don't special-case simple union wrappers 2016-03-18 10:29:26 +01:00
roms seabios: update to 1.9.1 stable release 2016-03-01 09:37:07 +01:00
scripts Remove unneeded include statements for setjmp.h 2016-03-22 19:11:15 +01:00
slirp slirp/slirp.h: Remove now-empty #ifdefs 2016-03-16 12:48:11 +00:00
stubs block: Add bdrv_next_monitor_owned() 2016-03-17 15:47:56 +01:00
target-alpha tcg: Add type for vCPU pointers 2016-03-01 13:27:09 +00:00
target-arm target-arm: Fix translation level on early translation faults 2016-03-16 17:42:18 +00:00
target-cris tcg: Add type for vCPU pointers 2016-03-01 13:27:09 +00:00
target-i386 X86 fixes 2016-03-15 11:05:37 +00:00
target-lm32 tcg: Add type for vCPU pointers 2016-03-01 13:27:09 +00:00
target-m68k tcg: Add type for vCPU pointers 2016-03-01 13:27:09 +00:00
target-microblaze tcg: Add type for vCPU pointers 2016-03-01 13:27:09 +00:00
target-mips tcg: Add type for vCPU pointers 2016-03-01 13:27:09 +00:00
target-moxie tcg: Add type for vCPU pointers 2016-03-01 13:27:09 +00:00
target-openrisc tcg: Add type for vCPU pointers 2016-03-01 13:27:09 +00:00
target-ppc ppc: A couple more dummy POWER8 Book4 regs 2016-03-24 11:17:34 +11:00
target-s390x s390x/cpu: Allow hotplug of CPUs 2016-03-10 10:37:15 +01:00
target-sh4 tcg: Add type for vCPU pointers 2016-03-01 13:27:09 +00:00
target-sparc tcg: Add type for vCPU pointers 2016-03-01 13:27:09 +00:00
target-tilegx tcg: Add type for vCPU pointers 2016-03-01 13:27:09 +00:00
target-tricore tcg: Add type for vCPU pointers 2016-03-01 13:27:09 +00:00
target-unicore32 tcg: Add type for vCPU pointers 2016-03-01 13:27:09 +00:00
target-xtensa tcg: Add type for vCPU pointers 2016-03-01 13:27:09 +00:00
tcg tcg: Move definition of type TCGv 2016-03-01 13:27:09 +00:00
tests ivshmem: Fixes, cleanups, device model split 2016-03-23 12:57:44 +00:00
trace trace: Add 'vcpu' event property to trace guest vCPU 2016-03-01 13:27:10 +00:00
ui qapi: Don't special-case simple union wrappers 2016-03-18 10:29:26 +01:00
util ivshmem: Fixes, cleanups, device model split 2016-03-23 12:57:44 +00:00
.dir-locals.el Add .dir-locals.el file to configure emacs coding style 2015-10-08 19:46:01 +03:00
.exrc qemu: add .exrc 2012-09-07 09:02:44 +03:00
.gitignore maint: Ignore ivshmem binaries 2015-11-06 15:42:38 +03:00
.gitmodules PPC: Add u-boot firmware for e500 2014-06-16 13:24:35 +02:00
.mailmap Update mailmap 2013-09-05 09:40:31 -05:00
.travis.yml .travis.yml: reduce the test matrix a little 2016-02-08 18:50:25 +00:00
accel.c all: Clean up includes 2016-02-04 17:41:30 +00:00
aio-posix.c aio-posix: Change CONFIG_EPOLL to CONFIG_EPOLL_CREATE1 2016-03-17 09:50:14 +00:00
aio-win32.c all: Clean up includes 2016-02-04 17:41:30 +00:00
arch_init.c all: Clean up includes 2016-02-04 17:41:30 +00:00
async.c all: Clean up includes 2016-02-04 17:41:30 +00:00
balloon.c all: Clean up includes 2016-02-04 17:41:30 +00:00
block.c block: Use BdrvChild in BlockBackend 2016-03-17 15:47:57 +01:00
blockdev-nbd.c nbd: enable use of TLS with nbd-server-start command 2016-02-16 17:17:49 +01:00
blockdev.c qapi: Don't special-case simple union wrappers 2016-03-18 10:29:26 +01:00
blockjob.c blockjob: Fix hang in block_job_finish_sync 2016-02-09 13:52:26 +00:00
bootdevice.c qom: Swap 'name' next to visitor in ObjectPropertyAccessor 2016-02-08 17:29:56 +01:00
bt-host.c all: Clean up includes 2016-02-04 17:41:30 +00:00
bt-vhci.c all: Clean up includes 2016-02-04 17:41:30 +00:00
Changelog Use qemu-project.org domain name 2013-10-11 09:34:56 -07:00
CODING_STYLE CODING_STYLE: update mixed declaration rules 2015-09-09 15:34:54 +02:00
configure wxx: Add support for ncurses 2016-03-22 19:17:38 +01:00
COPYING
COPYING.LIB
cpu-exec-common.c exec: Clean up includes 2016-01-29 15:07:22 +00:00
cpu-exec.c log: do not unnecessarily include qom/cpu.h 2016-02-03 09:19:10 +00:00
cpus.c block: Use blk_{commit,flush}_all() consistently 2016-03-17 15:47:56 +01:00
cputlb.c memory: Drop MemoryRegion.ram_addr 2016-03-07 13:26:29 +01:00
device_tree.c device_tree: qemu_fdt_getprop_cell converted to use the error API 2016-02-19 09:42:30 -07:00
device-hotplug.c blockdev: Split monitor reference from BB creation 2016-03-17 15:47:56 +01:00
disas.c all: Clean up includes 2016-02-04 17:41:30 +00:00
dma-helpers.c all: Clean up includes 2016-02-04 17:41:30 +00:00
dump.c dump-guest-memory: add qmp event DUMP_COMPLETED 2016-02-22 18:40:29 +01:00
exec.c exec: fix early return from ram_block_add 2016-03-15 18:23:33 +01:00
gdbstub.c replay: character devices 2016-03-15 18:23:40 +01:00
HACKING HACKING: Add a section on error handling and reporting 2016-02-09 13:19:49 +01:00
hmp-commands-info.hx Dump: add hmp command "info dump" 2016-02-22 18:40:28 +01:00
hmp-commands.hx hmp: 'drive_add -n' for creating a node without BB 2016-03-14 16:46:43 +01:00
hmp.c qapi: Don't special-case simple union wrappers 2016-03-18 10:29:26 +01:00
hmp.h Dump: add hmp command "info dump" 2016-02-22 18:40:28 +01:00
iohandler.c all: Clean up includes 2016-02-04 17:41:30 +00:00
ioport.c all: Clean up includes 2016-02-04 17:41:30 +00:00
iothread.c all: Clean up includes 2016-02-04 17:41:30 +00:00
kvm-all.c kvm/irqchip: use bitmap utility for gsi tracking 2016-03-07 15:18:22 +01:00
kvm-stub.c all: Clean up includes 2016-02-04 17:41:30 +00:00
LICENSE vfio: move hw/misc/vfio.c to hw/vfio/pci.c Move vfio.h into include/hw/vfio 2014-12-19 15:24:06 -07:00
main-loop.c icount: decouple warp calls 2016-03-15 18:23:45 +01:00
MAINTAINERS MAINTAINERS: Fix typo, block/stream.h -> block/stream.c 2016-03-16 13:25:29 -04:00
Makefile osdep: add wrappers for socket functions 2016-03-10 17:19:07 +00:00
Makefile.objs crypto: add cryptographic random byte source 2016-03-17 09:49:01 +00:00
Makefile.target io: add abstract QIOChannel classes 2015-12-18 12:18:05 +00:00
memory_mapping.c dump-guest-memory: add "detach" support 2016-02-22 18:40:28 +01:00
memory.c trace: separate MMIO tracepoints from TB-access tracepoints 2016-03-14 09:34:30 +00:00
module-common.c all: Clean up includes 2016-02-04 17:41:30 +00:00
monitor.c monitor: Use BB list for BB name completion 2016-03-17 15:47:56 +01:00
numa.c qapi: Don't special-case simple union wrappers 2016-03-18 10:29:26 +01:00
os-posix.c log: Redirect stderr to logfile if deamonized 2016-02-22 18:40:29 +01:00
os-win32.c all: Clean up includes 2016-02-04 17:41:30 +00:00
page_cache.c all: Clean up includes 2016-02-04 17:41:30 +00:00
qapi-schema.json qapi: Use anonymous bases in QMP flat unions 2016-03-18 10:29:26 +01:00
qdev-monitor.c qdev-monitor: add missing aliases for virtio device classes 2016-03-16 10:13:10 +01:00
qdict-test-data.txt
qemu-bridge-helper.c all: Clean up includes 2016-02-04 17:41:30 +00:00
qemu-char.c qapi: Don't special-case simple union wrappers 2016-03-18 10:29:26 +01:00
qemu-doc.texi ivshmem: Require master to have ID zero 2016-03-21 21:29:03 +01:00
qemu-ga.texi docs: Style the command and its options in the synopsis 2016-01-26 15:58:11 +01:00
qemu-img-cmds.hx qemu-img: allow specifying image as a set of options args 2016-02-22 09:50:04 +01:00
qemu-img.c blockdev: Split monitor reference from BB creation 2016-03-17 15:47:56 +01:00
qemu-img.texi qemu-img: allow specifying image as a set of options args 2016-02-22 09:50:04 +01:00
qemu-io-cmds.c block: Clean up includes 2016-01-20 13:36:23 +01:00
qemu-io.c blockdev: Split monitor reference from BB creation 2016-03-17 15:47:56 +01:00
qemu-nbd.c qapi: Don't special-case simple union wrappers 2016-03-18 10:29:26 +01:00
qemu-nbd.texi qemu-nbd: allow specifying image as a set of options args 2016-02-22 09:50:04 +01:00
qemu-options-wrapper.h vl.c: In qemu -h output, only print options for the arch we are running as 2011-12-19 10:27:33 -06:00
qemu-options.h vl.c: Move option generation logic into a wrapper file 2011-12-19 10:27:33 -06:00
qemu-options.hx qapi-schema, qemu-options & slirp: Adding Qemu options for IPv6 addresses 2016-03-15 10:35:25 +01:00
qemu-seccomp.c all: Clean up includes 2016-02-04 17:41:30 +00:00
qemu-tech.texi tcg: Rename tcg-target.c to tcg-target.inc.c 2016-02-23 08:30:38 -08:00
qemu-timer.c icount: decouple warp calls 2016-03-15 18:23:45 +01:00
qemu.nsi nsis: Add QEMU version information to Windows registry 2015-09-24 20:52:28 +02:00
qemu.sasl sasl: Avoid 'Could not find keytab file' in syslog 2014-03-15 13:54:18 +04:00
qjson.c all: Clean up includes 2016-02-04 17:41:30 +00:00
qmp-commands.hx postcopy: Remove the x- 2016-03-11 17:53:59 +05:30
qmp.c dump-guest-memory: add dump_in_progress() helper function 2016-02-22 18:40:28 +01:00
qtest.c all: Clean up includes 2016-02-04 17:41:30 +00:00
README README: fill out some useful quickstart information 2015-10-13 18:48:46 +02:00
rules.mak rules: filter out irrelevant files 2016-02-17 16:59:36 +02:00
softmmu_template.h exec.c: Pass MemTxAttrs to iotlb_to_region so it uses the right AS 2016-01-21 14:15:05 +00:00
spice-qemu-char.c qapi: Don't special-case simple union wrappers 2016-03-18 10:29:26 +01:00
tcg-runtime.c all: Clean up includes 2016-02-04 17:41:30 +00:00
tci.c all: Clean up includes 2016-02-04 17:41:30 +00:00
thread-pool.c all: Clean up includes 2016-02-04 17:41:30 +00:00
thunk.c all: Clean up includes 2016-02-04 17:41:30 +00:00
tpm.c qapi: Don't special-case simple union wrappers 2016-03-18 10:29:26 +01:00
trace-events hw/intc: Add (new) ASPEED VIC device model 2016-03-16 17:42:18 +00:00
translate-all.c log: do not unnecessarily include qom/cpu.h 2016-02-03 09:19:10 +00:00
translate-all.h translate-all: remove unnecessary argument to tb_invalidate_phys_range 2015-06-05 17:09:59 +02:00
translate-common.c exec: Clean up includes 2016-01-29 15:07:22 +00:00
user-exec.c all: Clean up includes 2016-02-04 17:41:30 +00:00
VERSION Open 2.6 development tree 2015-12-17 10:17:08 +00:00
version.rc Use qemu-project.org domain name 2013-10-11 09:34:56 -07:00
vl.c module: Rename machine_init() to opts_init() 2016-03-16 15:54:23 -03:00
xen-common-stub.c xen: Clean up includes 2016-01-29 15:07:23 +00:00
xen-common.c xen: drop XenXC and associated interface wrappers 2016-02-10 12:01:24 +00:00
xen-hvm-stub.c fix MSI injection on Xen 2016-02-06 20:44:10 +02:00
xen-hvm.c xen: Drop __XEN_LATEST_INTERFACE_VERSION__ checks from prior to Xen 4.2 2016-02-10 12:01:32 +00:00
xen-mapcache.c xen: Clean up includes 2016-01-29 15:07:23 +00:00

         QEMU README
         ===========

QEMU is a generic and open source machine & userspace emulator and
virtualizer.

QEMU is capable of emulating a complete machine in software without any
need for hardware virtualization support. By using dynamic translation,
it achieves very good performance. QEMU can also integrate with the Xen
and KVM hypervisors to provide emulated hardware while allowing the
hypervisor to manage the CPU. With hypervisor support, QEMU can achieve
near native performance for CPUs. When QEMU emulates CPUs directly it is
capable of running operating systems made for one machine (e.g. an ARMv7
board) on a different machine (e.g. an x86_64 PC board).

QEMU is also capable of providing userspace API virtualization for Linux
and BSD kernel interfaces. This allows binaries compiled against one
architecture ABI (e.g. the Linux PPC64 ABI) to be run on a host using a
different architecture ABI (e.g. the Linux x86_64 ABI). This does not
involve any hardware emulation, simply CPU and syscall emulation.

QEMU aims to fit into a variety of use cases. It can be invoked directly
by users wishing to have full control over its behaviour and settings.
It also aims to facilitate integration into higher level management
layers, by providing a stable command line interface and monitor API.
It is commonly invoked indirectly via the libvirt library when using
open source applications such as oVirt, OpenStack and virt-manager.

QEMU as a whole is released under the GNU General Public License,
version 2. For full licensing details, consult the LICENSE file.


Building
========

QEMU is multi-platform software intended to be buildable on all modern
Linux platforms, OS X, Win32 (via the Mingw64 toolchain) and a variety
of other UNIX targets. The simple steps to build QEMU are:

  mkdir build
  cd build
  ../configure
  make

Complete details of the process for building and configuring QEMU for
all supported host platforms can be found in the qemu-tech.html file.
Additional information can also be found online via the QEMU website:

  http://qemu-project.org/Hosts/Linux
  http://qemu-project.org/Hosts/W32


Submitting patches
==================

The QEMU source code is maintained under the Git version control system.

   git clone git://git.qemu-project.org/qemu.git

When submitting patches, the preferred approach is to use 'git
format-patch' and/or 'git send-email' to format & send the mail to the
qemu-devel@nongnu.org mailing list. All patches submitted must contain
a 'Signed-off-by' line from the author. Patches should follow the
guidelines set out in the HACKING and CODING_STYLE files.

Additional information on submitting patches can be found online via
the QEMU website:

  http://qemu-project.org/Contribute/SubmitAPatch
  http://qemu-project.org/Contribute/TrivialPatches


Bug reporting
=============

The QEMU project uses Launchpad as its primary upstream bug tracker. Bugs
found when running code built from QEMU git or upstream released sources
should be reported via:

  https://bugs.launchpad.net/qemu/

If using QEMU via an operating system vendor pre-built binary package, it
is preferable to report bugs to the vendor's own bug tracker first. If
the bug is also known to affect latest upstream code, it can also be
reported via Launchpad.

For additional information on bug reporting consult:

  http://qemu-project.org/Contribute/ReportABug


Contact
=======

The QEMU community can be contacted in a number of ways, with the two
main methods being email and IRC:

 - qemu-devel@nongnu.org
   http://lists.nongnu.org/mailman/listinfo/qemu-devel
 - #qemu on irc.oftc.net

Information on additional methods of contacting the community can be
found online via the QEMU website:

  http://qemu-project.org/Contribute/StartHere

-- End