For good reason, vhost-user is currently built asynchronously, that
way better performance can be obtained. However, for certain use
cases such as simulation, this is problematic.
Consider an event-based simulation in which both the device and CPU
have scheduled according to a simulation "calendar". Now, consider
the CPU sending I/O to the device, over a vring in the vhost-user
protocol. In this case, the CPU must wait for the vring interrupt
to have been processed by the device, so that the device is able to
put an entry onto the simulation calendar to obtain time to handle
the interrupt. Note that this doesn't mean the I/O is actually done
at this time, it just means that the handling of it is scheduled
before the CPU can continue running.
This cannot be done with the asynchronous eventfd based vring kick
and call design.
Extend the protocol slightly, so that a message can be used for kick
and call instead, if VHOST_USER_PROTOCOL_F_INBAND_NOTIFICATIONS is
negotiated. This in itself doesn't guarantee synchronisation, but both
sides can also negotiate VHOST_USER_PROTOCOL_F_REPLY_ACK and thus get
a reply to this message by setting the need_reply flag, and ensure
synchronisation this way.
To really use it in both directions, VHOST_USER_PROTOCOL_F_SLAVE_REQ
is also needed.
Since it is used for simulation purposes and too many messages on
the socket can lock up the virtual machine, document that this should
only be used together with the mentioned features.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Message-Id: <20200123081708.7817-6-johannes@sipsolutions.net>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Move the following tools documentation files to the new tools manual:
docs/interop/qemu-img.rst
docs/interop/qemu-nbd.rst
docs/interop/virtfs-proxy-helper.rst
docs/interop/qemu-trace-stap.rst
docs/interop/virtiofsd.rst
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20200217155415.30949-4-peter.maydell@linaro.org
The qemu-option-trace.rst.inc file contains a rST documentation
fragment which describes trace options common to qemu-nbd and
qemu-img. We put this file into interop/, but we'd like to move the
qemu-nbd and qemu-img files into the tools/ manual. We could move
the .rst.inc file along with them, but we're eventually going to want
to use it for the main QEMU binary options documentation too, and
that will be in system/. So move qemu-option-trace.rst.inc to the
top-level docs/ directory, where all these files can include it via
.. include:: ../qemu-option-trace.rst.inc
This does have the slight downside that we now need to explicitly
tell Make which manuals use this file rather than relying on
a wildcard for all .rst.inc in the manual.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20200217155415.30949-3-peter.maydell@linaro.org
In many cases the target of a convert operation is a newly provisioned
target that the user knows is blank (reads as zero). In this situation
there is no requirement for qemu-img to wastefully zero out the entire
device.
Add a new option, --target-is-zero, allowing the user to indicate that
an existing target device will return zeros for all reads.
Signed-off-by: David Edmondson <david.edmondson@oracle.com>
Message-Id: <20200205110248.2009589-2-david.edmondson@oracle.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
The patch adds a new additional field to the qcow2 header: compression_type,
which specifies compression type. If field is absent or zero, default
compression type is set: ZLIB, which corresponds to current behavior.
New compression type (ZSTD) is to be added in further commit.
Suggested-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200131142219.3264-3-vsementsov@virtuozzo.com>
[mreitz: s/Bits 3-63: Reserved/Bits 4-63: Reserved/]
Signed-off-by: Max Reitz <mreitz@redhat.com>
Make it more obvious how to add new fields to the version 3 header and
how to interpret them.
The specification is adjusted so that for new defined optional fields:
1. Software may support some of these optional fields and ignore the
others, which means that features may be backported to downstream
Qemu independently.
2. If we want to add incompatible field (or a field, for which some of
its values would be incompatible), it must be accompanied by
incompatible feature bit.
Also the concept of "default is zero" is clarified, as it's strange to
say that the value of the field is assumed to be zero for the software
version which don't know about the field at all and don't know how to
treat it be it zero or not.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200131142219.3264-2-vsementsov@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
[mreitz: s/some its/some of its/]
Signed-off-by: Max Reitz <mreitz@redhat.com>
Document the virtiofsd(1) program and its command-line options. This
man page is a rST conversion of the original texi documentation that I
wrote.
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
The '-i AIO' option was accidentally placed after '-n' and '-t'. Move it
after '--flush-interval'.
Signed-off-by: Julia Suvorova <jusual@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20200205163008.204493-1-jusual@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
The option was deprecated in 4.0.0 (commit 0ae2d546); it's now been
long enough with no complaints to follow through with that process.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200123164650.1741798-3-eblake@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The virtfs-proxy-helper documentation is currently in
fsdev/qemu-trace-stap.texi in Texinfo format, which we
present to the user as:
* a virtfs-proxy-helper manpage
* but not (unusually for QEMU) part of the HTML docs
Convert the documentation to rST format that lives in
the docs/ subdirectory, and present it to the user as:
* a virtfs-proxy-helper manpage
* part of the interop/ Sphinx manual
There are minor formatting changes to suit Sphinx, but no
content changes. In particular I've split the -u and -g
options into each having their own description text.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Greg Kurz <groug@kaod.org>
Message-id: 20200124162606.8787-9-peter.maydell@linaro.org
The qemu-trace-stap documentation is currently in
scripts/qemu-trace-stap.texi in Texinfo format, which we
present to the user as:
* a qemu-trace-stap manpage
* but not (unusually for QEMU) part of the HTML docs
Convert the documentation to rST format that lives in
the docs/ subdirectory, and present it to the user as:
* a qemu-trace-stap manpage
* part of the interop/ Sphinx manual
There are minor formatting changes to suit Sphinx, but no
content changes.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20200124162606.8787-8-peter.maydell@linaro.org
The qemu-img documentation is currently in qemu-nbd.texi in Texinfo
format, which we present to the user as:
* a qemu-img manpage
* a section of the main qemu-doc HTML documentation
Convert the documentation to rST format, and present it to the user as:
* a qemu-img manpage
* part of the interop/ Sphinx manual
The qemu-img rST document uses the new hxtool extension
to handle pulling rST fragments out of qemu-img-cmds.hx.
The documentation of the various options and commands is rather
muddled, with some options being described inside the relevant
command description and some in a more general section near the start
of the manual. All the command synopses are replicated in the .hx
file and then again in the manual. A lot of text is also duplicated
in the qemu-img.c code for the help text. I have not attempted to
deal with any of this, but have simply transposed the existing
structure into rST.
As usual, there are some minor formatting changes but no
textual changes, except that as with one or two other conversions
I have dropped the 'see also' section since it's not very
informative and looks odd in the HTML.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20200124162606.8787-6-peter.maydell@linaro.org
It's been deprecated since QEMU v3.1. The 40p machine should be
used nowadays instead.
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20200114114617.28854-1-thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Import our virtiofsd.
This pulls in the daemon to drive a file system connected to the
existing qemu virtiofsd device.
It's derived from upstream libfuse with lots of changes (and a lot
trimmed out).
The daemon lives in the newly created qemu/tools/virtiofsd
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
v2
drop the docs while we discuss where they should live
and we need to redo the manpage in anything but texi
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEERfXHG0oMt/uXep+pBRYzHrxb/ecFAl4pzZ4ACgkQBRYzHrxb
/eelUg//evho+RwlOK4TjOjLJyGfMqZDQO5TFR2S2NmiCP7makND4BWll2A5Zu26
oqzZw5pcsKZmpYJ81tqe1bnlCa6SCUx6cNGt+5n2cA0MYSjpPDeB2OegjS57NUoE
eGXXIE7GOrGShHx1fW7BuA3Pi0hmFSHGRHCs006WsVktb1rP1w+7/NBohgLFkYob
fqytP/K9ACEySETPGgDUEh6ZmmalrY1WeD+a11RZstOSA+2YhR3WbyN0z8fc6lCE
puFHNEs2L0zVIUicSyJ4ux9+rbxdZIelLD91mGZhxrWy0H0AIox4bYURUJlbajI7
Yl/FInQRMhStsKn3UN25MSYgGS8ZAM3IcG605vrC4HoQh9r8mVC/H19buWFCycvL
1naK6LTqFkL0igAKTeg+DUk3tNP3i+j8JaMnopvKIfEHwV1lpVEVHI7zUynBA85d
2xfOllkJreFtniYg5nfdfhVixKHLAId0x9ZvYw3wefLDF3ugXLHbrtj0hPcJiAny
TINAzZCbxZsCEdZsrPq4Ldf7Pmb8vI8pxJVsoD28gRcHNRQvPWef07mtW370IAdJ
SJXWLlsFh/rPJx51lVIMQf6d4qLePyHfB81VQ25qlrS5CW87XMmTyr5rngGFlJ2e
vJnMb+DgwG1gf+HV4W5Y3l0wehou5GxgbLT478s+r3YzfdV13d4=
=UUVN
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/dgilbert-gitlab/tags/pull-virtiofs-20200123b' into staging
virtiofsd first pull v2
Import our virtiofsd.
This pulls in the daemon to drive a file system connected to the
existing qemu virtiofsd device.
It's derived from upstream libfuse with lots of changes (and a lot
trimmed out).
The daemon lives in the newly created qemu/tools/virtiofsd
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
v2
drop the docs while we discuss where they should live
and we need to redo the manpage in anything but texi
# gpg: Signature made Thu 23 Jan 2020 16:45:18 GMT
# gpg: using RSA key 45F5C71B4A0CB7FB977A9FA90516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>" [full]
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A 9FA9 0516 331E BC5B FDE7
* remotes/dgilbert-gitlab/tags/pull-virtiofs-20200123b: (108 commits)
virtiofsd: add some options to the help message
virtiofsd: stop all queue threads on exit in virtio_loop()
virtiofsd/passthrough_ll: Pass errno to fuse_reply_err()
virtiofsd: Convert lo_destroy to take the lo->mutex lock itself
virtiofsd: add --thread-pool-size=NUM option
virtiofsd: fix lo_destroy() resource leaks
virtiofsd: prevent FUSE_INIT/FUSE_DESTROY races
virtiofsd: process requests in a thread pool
virtiofsd: use fuse_buf_writev to replace fuse_buf_write for better performance
virtiofsd: add definition of fuse_buf_writev()
virtiofsd: passthrough_ll: Use cache_readdir for directory open
virtiofsd: Fix data corruption with O_APPEND write in writeback mode
virtiofsd: Reset O_DIRECT flag during file open
virtiofsd: convert more fprintf and perror to use fuse log infra
virtiofsd: do not always set FUSE_FLOCK_LOCKS
virtiofsd: introduce inode refcount to prevent use-after-free
virtiofsd: passthrough_ll: fix refcounting on remove/rename
libvhost-user: Fix some memtable remap cases
virtiofsd: rename inode->refcount to inode->nlookup
virtiofsd: prevent races with lo_dirp_put()
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add the --print-capabilities option as per vhost-user.rst "Backend
programs conventions". Currently there are no advertised features.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
The qemu-nbd documentation is currently in qemu-nbd.texi in Texinfo
format, which we present to the user as:
* a qemu-nbd manpage
* a section of the main qemu-doc HTML documentation
Convert the documentation to rST format, and present it to the user as:
* a qemu-nbd manpage
* part of the interop/ Sphinx manual
This follows the same pattern as commit 27a296fce9 did for the
qemu-ga manpage.
All the content of the old manpage is retained, except that I have
dropped the "This is free software; see the source for copying
conditions. There is NO warranty..." text that was in the old AUTHOR
section; Sphinx's manpage builder doesn't expect that much text in
the AUTHOR section, and since none of our other manpages have it it
seems easiest to delete it rather than try to figure out where else
in the manpage to put it.
The only other textual change is that I have had to give the
--nocache option its own description ("Equivalent to --cache=none")
because Sphinx doesn't have an equivalent of using item/itemx
to share a description between two options.
Some minor aspects of the formatting have changed, to suit what is
easiest for Sphinx to output. (The most notable is that Sphinx
option section option syntax doesn't support '--option foo=bar'
with bar underlined rather than bold, so we have to switch to
'--option foo=BAR' instead.)
The contents of qemu-option-trace.texi are now duplicated in
docs/interop/qemu-option-trace.rst.inc, until such time as we complete
the conversion of the other files which use it; since it has had only
3 changes in 3 years, this shouldn't be too awkward a burden.
(We use .rst.inc because if this file fragment has a .rst extension
then Sphinx complains about not seeing it in a toctree.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20200116141511.16849-2-peter.maydell@linaro.org
Bugfixes all over the place.
HMAT support.
New flags for vhost-user-blk utility.
Auto-tuning of seg max for virtio storage.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAl4TaMEPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpvzgH/2LyDAzCa9h93ikSJjmyUk5FUaqve38daEb3
S3JYjwKxQx7u1ydooKhvBQnBCZ2i3S+k62gfYyKB+nBv8xvjs0Eg5D1YJ5E8hciy
lf5OFGWWtX2iPDjZwQwT13kiJe0o3JRGxJJ6XqTEG+1EYOp7cky/FEv4PD030b9m
I2wROZ/Am+onB9YJX8c0Vv1CG+AryuJNXnvwQzTXEjj4U7bEYUyJwVZaCRyAdWQ3
uYXIZN9VwjVX6BFvy9ZAJbEsUVJvOM1/aQaDqcrLz+VlzRT7bRkKHi2G3vakrm1I
r5OpgyLo84132awCncbSykKDH5o8WaxLaJBjGmuBfasMz9wPzAg=
=uL1o
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
virtio, pci, pc: fixes, features
Bugfixes all over the place.
HMAT support.
New flags for vhost-user-blk utility.
Auto-tuning of seg max for virtio storage.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Mon 06 Jan 2020 17:05:05 GMT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream: (32 commits)
intel_iommu: add present bit check for pasid table entries
intel_iommu: a fix to vtd_find_as_from_bus_num()
virtio-net: delete also control queue when TX/RX deleted
virtio: reset region cache when on queue deletion
virtio-mmio: update queue size on guest write
tests: add virtio-scsi and virtio-blk seg_max_adjust test
virtio: make seg_max virtqueue size dependent
hw: fix using 4.2 compat in 5.0 machine types for i440fx/q35
vhost-user-scsi: reset the device if supported
vhost-user: add VHOST_USER_RESET_DEVICE to reset devices
hw/pci/pci_host: Let pci_data_[read/write] use unsigned 'size' argument
hw/pci/pci_host: Remove redundant PCI_DPRINTF()
virtio-mmio: Clear v2 transport state on soft reset
ACPI: add expected files for HMAT tests (acpihmat)
tests/bios-tables-test: add test cases for ACPI HMAT
tests/numa: Add case for QMP build HMAT
hmat acpi: Build Memory Side Cache Information Structure(s)
hmat acpi: Build System Locality Latency and Bandwidth Information Structure(s)
hmat acpi: Build Memory Proximity Domain Attributes Structure(s)
numa: Extend CLI to provide memory side cache information
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When instantiated, this object will connect to the given D-Bus bus
"addr". During migration, it will take/restore the data from
org.qemu.VMState1 instances. See documentation for details.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Add a VHOST_USER_RESET_DEVICE message which will reset the vhost user
backend. Disabling all rings, and resetting all internal state, ready
for the backend to be reinitialized.
A backend has to report it supports this features with the
VHOST_USER_PROTOCOL_F_RESET_DEVICE protocol feature bit. If it does
so, the new message is used instead of sending a RESET_OWNER which has
had inconsistent implementations.
Signed-off-by: David Vrabel <david.vrabel@nutanix.com>
Signed-off-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <1572385083-5254-2-git-send-email-raphael.norwitz@nutanix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch is to add standard commands defined in docs/interop/vhost-user.rst
For vhost-user-* program
Signed-off-by: Micky Yun Chan (michiboo) <chanmickyyun@gmail.com>
Message-Id: <20191209015331.5455-1-chanmickyyun@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
'the' has a tendency to double up; squash them back down.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20191104185202.102504-1-dgilbert@redhat.com>
[lv: removed disas/libvixl/vixl/invalset.h change]
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
The qemu-ga documentation is currently in qemu-ga.texi in
Texinfo format, which we present to the user as:
* a qemu-ga manpage
* a section of the main qemu-doc HTML documentation
Convert the documentation to rST format, and present it to
the user as:
* a qemu-ga manpage
* part of the interop/ Sphinx manual
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Tested-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Message-id: 20190905131040.8350-1-peter.maydell@linaro.org
Commit fe0480d6 and friends added BDRV_REQ_NO_FALLBACK as a way to
avoid wasting time on a preliminary write-zero request that will later
be rewritten by actual data, if it is known that the write-zero
request will use a slow fallback; but in doing so, could not optimize
for NBD. The NBD specification is now considering an extension that
will allow passing on those semantics; this patch updates the new
protocol bits and 'qemu-nbd --list' output to recognize the bit, as
well as the new errno value possible when using the new flag; while
upcoming patches will improve the client to use the feature when
present, and the server to advertise support for it.
The NBD spec recommends (but not requires) that ENOTSUP be avoided for
all but failures of a fast zero (the only time it is mandatory to
avoid an ENOTSUP failure is when fast zero is supported but not
requested during write zeroes; the questionable use is for ENOTSUP to
other actions like a normal write request). However, clients that get
an unexpected ENOTSUP will either already be treating it the same as
EINVAL, or may appreciate the extra bit of information. We were
equally loose for returning EOVERFLOW in more situations than
recommended by the spec, so if it turns out to be a problem in
practice, a later patch can tighten handling for both error codes.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20190823143726.27062-3-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
[eblake: tweak commit message, also handle EOPNOTSUPP]
The NBD specification defines NBD_FLAG_CAN_MULTI_CONN, which can be
advertised when the server promises cache consistency between
simultaneous clients (basically, rules that determine what FUA and
flush from one client are able to guarantee for reads from another
client). When we don't permit simultaneous clients (such as qemu-nbd
without -e), the bit makes no sense; and for writable images, we
probably have a lot more work before we can declare that actions from
one client are cache-consistent with actions from another. But for
read-only images, where flush isn't changing any data, we might as
well advertise multi-conn support. What's more, advertisement of the
bit makes it easier for clients to determine if 'qemu-nbd -e' was in
use, where a second connection will succeed rather than hang until the
first client goes away.
This patch affects qemu as server in advertising the bit. We may want
to consider patches to qemu as client to attempt parallel connections
for higher throughput by spreading the load over those connections
when a server advertises multi-conn, but for now sticking to one
connection per nbd:// BDS is okay.
See also: https://bugzilla.redhat.com/1708300
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20190815185024.7010-1-eblake@redhat.com>
[eblake: tweak blockdev-nbd.c to not request shared when writable,
fix iotest 233]
Reviewed-by: John Snow <jsnow@redhat.com>
Move query-target and its return type TargetInfo from misc.json to
machine.json, where they are covered by MAINTAINERS section "Machine
core". Also move its implementation from arch_init.c to
hw/core/machine-qmp-cmds, where it is likewise covered.
All users of SysEmuTarget are now in machine.json. Move it there from
common.json.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190709152053.16670-3-armbru@redhat.com>
The vhost-user specification does not explain when
VHOST_USER_PROTOCOL_F_MQ must be implemented. This may lead
implementors of vhost-user masters to believe that this protocol feature
is required for any device that has multiple virtqueues. That would be
a mistake since existing vhost-user slaves offer multiple virtqueues but
do not advertise VHOST_USER_PROTOCOL_F_MQ.
For example, a vhost-net device with one rx/tx queue pair is not
multiqueue. The slave does not need to advertise
VHOST_USER_PROTOCOL_F_MQ. Therefore the master must assume it has these
virtqueues and cannot rely on askingt the slave how many virtqueues
exist.
Extend the specification to explain the different between true
multiqueue and regular devices with a fixed virtqueue layout.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20190624091304.666-1-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
The annotated style json we use in QMP documentation is not strict json
and depending on the version of Sphinx (2.0+) or Pygments installed,
might cause the build to fail.
Use the new QMP lexer.
Further, some versions of Sphinx can not apply custom lexers to "code"
directives and require the use of "code-block" directives instead, so
make that change at this time as well.
Tested under:
- Sphinx 1.3.6 and Pygments 2.4
- Sphinx 1.7.6 and Pygments 2.2 (Fedora 29 packages)
- Sphinx 2.0.1 and Pygments 2.4
- Sphinx 3.0.0+/f396b3a783 and Pygments 2.4 (From Sphinx git c4f44bdd)
Reported-by: Aarushi Mehta <mehta.aaru20@gmail.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-id: 20190603214653.29369-4-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
Pygments and Sphinx get pickier all the time; Sphinx 2.1+ now catches
these errors.
Signed-off-by: John Snow <jsnow@redhat.com>
Reported-by: Aarushi Mehta <mehta.aaru20@gmail.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20190603214653.29369-2-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
The "Multiple queue support" section makes references to vhost-user-net
"queue pairs". This is confusing for two reasons:
1. This actually applies to all device types, not just vhost-user-net.
2. VHOST_USER_GET_QUEUE_NUM returns the number of virtqueues, not the
number of queue pairs.
Reword the section so that the vhost-user-net specific part is relegated
to the very end: we acknowledge that vhost-user-net historically
automatically enabled the first queue pair.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20190626074815.19994-5-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20190605131221.29432-1-marcandre.lureau@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add a new vhost-user message to give a unix socket to a vhost-user
backend for GPU display updates.
Back when I started that work, I added a new GPU channel because the
vhost-user protocol wasn't bidirectional. Since then, there is a
vhost-user-slave channel for the slave to send requests to the master.
We could extend it with GPU messages. However, the GPU protocol is
quite orthogonal to vhost-user, thus I chose to have a new dedicated
channel.
See vhost-user-gpu.rst for the protocol details.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20190524130946.31736-2-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20190315180735.13096-1-marcandre.lureau@redhat.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This just about rewrites the entirety of the bitmaps.rst document to
make it consistent with the 4.0 release. I have added new features seen
in the 4.0 release, as well as tried to clarify some points that keep
coming up when discussing this feature both in-house and upstream.
It does not yet cover pull backups or migration details, but I intend to
keep extending this document to cover those cases.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20190426221528.30293-3-jsnow@redhat.com
[Adjusted commit message. --js]
Signed-off-by: John Snow <jsnow@redhat.com>
This patch introduces two new messages VHOST_USER_GET_INFLIGHT_FD
and VHOST_USER_SET_INFLIGHT_FD to support transferring a shared
buffer between qemu and backend.
Firstly, qemu uses VHOST_USER_GET_INFLIGHT_FD to get the
shared buffer from backend. Then qemu should send it back
through VHOST_USER_SET_INFLIGHT_FD each time we start vhost-user.
This shared buffer is used to track inflight I/O by backend.
Qemu should retrieve a new one when vm reset.
Signed-off-by: Xie Yongji <xieyongji@baidu.com>
Signed-off-by: Chai Wen <chaiwen@baidu.com>
Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
Message-Id: <20190228085355.9614-2-xieyongji@baidu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
As discussed during "[PATCH v4 00/29] vhost-user for input & GPU"
review, let's define a common set of backend conventions to help with
management layer implementation, and interoperability.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20190308140454.32437-3-marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We already use (we didn't notice it) IN_USE flag for marking bitmap
metadata outdated, such as AUTO flag, which mirrors enabled/disabled
bitmaps. Now we are going to support bitmap resize, so it's good to
write IN_USE meaning with more details.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20190311185147.52309-2-vsementsov@virtuozzo.com
Signed-off-by: John Snow <jsnow@redhat.com>
The previous commit added a way to configure firmware with -blockdev
rather than -drive if=pflash. Document it as the preferred way.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190308131445.17502-13-armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Be more specific about the string representation in header extensions.
Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
By default Sphinx wants to build a single manual at once.
For QEMU, this doesn't suit us, because we want to have
separate manuals for "Developer's Guide", "User Manual",
and so on, and we don't want to ship the Developer's Guide
to end-users. However, we don't want to completely duplicate
conf.py for each manual, and we'd like to continue to
support "build all docs in one run" for third-party sites
like readthedocs.org.
Make the top-level conf.py support two usage forms:
(1) as a common config file which is included by the conf.py
for each of QEMU's manuals: in this case sphinx-build is run
multiple times, once per subdirectory.
(2) as a top level conf file which will result in building all
the manuals into a single document: in this case sphinx-build is
run once, on the top-level docs directory.
Provide per-manual conf.py files and top level pages for
our first two manuals:
* QEMU Developer's Guide (docs/devel)
* QEMU System Emulation Management and Interoperability Guide
(docs/interop)
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Message-id: 20190305172139.32662-9-peter.maydell@linaro.org
Message-id: 20190228145624.24885-9-peter.maydell@linaro.org
It can be useful to figure out which NBD protocol features are
exposed by a server, as well as what features a client will
take advantage of if available, for a given qemu release. It's
not always precise to base features on version numbers (thanks
to downstream backports), but any documentation is better than
making users search through git logs themselves.
This patch originally stemmed from a request to document that
pristine 3.0 has a known bug where NBD_OPT_LIST_META_CONTEXT
with 0 queries forgot to advertise an available
"qemu:dirty-bitmap" context, but documenting bugs like this (or
the fact that 3.0 also botched NBD_CMD_CACHE) gets to be too
much details, especially since buggy releases will be less
likely connection targets over time. Instead, I chose to just
remind users to check stable release branches.
Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20181215135324.152629-3-eblake@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
RFC8259 obsoletes RFC7159. Fix a couple of URLs to point to the
newer version.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20181203175702.128701-1-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
When a QMP client sends in-band commands more quickly that we can
process them, we can either queue them without limit (QUEUE), drop
commands when the queue is full (DROP), or suspend receiving commands
when the queue is full (SUSPEND). None of them is ideal:
* QUEUE lets a misbehaving client make QEMU eat memory without bounds.
Not such a hot idea.
* With DROP, the client has to cope with dropped in-band commands. To
inform the client, we send a COMMAND_DROPPED event then. The event is
flawed by design in two ways: it's ambiguous (see commit d621cfe0a1),
and it brings back the "eat memory without bounds" problem.
* With SUSPEND, the client has to manage the flow of in-band commands to
keep the monitor available for out-of-band commands.
We currently DROP. Switch to SUSPEND.
Managing the flow of in-band commands to keep the monitor available for
out-of-band commands isn't really hard: just count the number of
"outstanding" in-band commands (commands sent minus replies received),
and if it exceeds the limit, hold back additional ones until it drops
below the limit again.
Note that we need to be careful pairing the suspend with a resume, or
else the monitor will hang, possibly forever. And here since we need to
make sure both:
(1) popping request from the req queue, and
(2) reading length of the req queue
will be in the same critical section, we let the pop function take the
corresponding queue lock when there is a request, then we release the
lock from the caller.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20181009062718.1914-2-peterx@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Although off_t permits up to 63 bits (8EB) of file offsets, in
practice, we're going to hit other limits first. Document some
of those limits in the qcow2 spec (some are inherent, others are
implementation choices of qemu), and how choice of cluster size
can influence some of the limits.
While we cannot map any uncompressed virtual cluster to any
address higher than 64 PB (56 bits) (due to the current L1/L2
field encoding stopping at bit 55), qemu's cap of 8M for the
refcount table can still access larger host addresses for some
combinations of large clusters and small refcount_order. For
comparison, ext4 with 4k blocks caps files at 16PB.
Another interesting limit: for compressed clusters, the L2 layout
requires an ever-smaller maximum host offset as cluster size gets
larger, down to a 512 TB maximum with 2M clusters. In particular,
note that with a cluster size of 8k or smaller, the L2 entry for
a compressed cluster could technically point beyond the 64PB mark,
but when you consider that with 8k clusters and refcount_order = 0,
you cannot access beyond 512T without exceeding qemu's limit of an
8M cap on the refcount table, it is unlikely that any image in the
wild has attempted to do so. To be safe, let's document that bits
beyond 55 in a compressed cluster must be 0.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>