Commit Graph

75617 Commits

Author SHA1 Message Date
Stefan Hajnoczi
aa38e19f05 aio-posix: support userspace polling of fd monitoring
Unlike ppoll(2) and epoll(7), Linux io_uring completions can be polled
from userspace.  Previously userspace polling was only allowed when all
AioHandlers had an ->io_poll() callback.  This prevented starvation of
fds by userspace pollable handlers.

Add the FDMonOps->need_wait() callback that enables userspace polling
even when some AioHandlers lack ->io_poll().

For example, it's now possible to do userspace polling when a TCP/IP
socket is monitored thanks to Linux io_uring.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Link: https://lore.kernel.org/r/20200305170806.1313245-7-stefanha@redhat.com
Message-Id: <20200305170806.1313245-7-stefanha@redhat.com>
2020-03-09 16:41:31 +00:00
Stefan Hajnoczi
73fd282e7b aio-posix: add io_uring fd monitoring implementation
The recent Linux io_uring API has several advantages over ppoll(2) and
epoll(7).  Details are given in the source code.

Add an io_uring implementation and make it the default on Linux.
Performance is the same as with epoll(7) but later patches add
optimizations that take advantage of io_uring.

It is necessary to change how aio_set_fd_handler() deals with deleting
AioHandlers since removing monitored file descriptors is asynchronous in
io_uring.  fdmon_io_uring_remove() marks the AioHandler deleted and
aio_set_fd_handler() will let it handle deletion in that case.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Link: https://lore.kernel.org/r/20200305170806.1313245-6-stefanha@redhat.com
Message-Id: <20200305170806.1313245-6-stefanha@redhat.com>
2020-03-09 16:41:31 +00:00
Stefan Hajnoczi
b321051cf4 aio-posix: simplify FDMonOps->update() prototype
The AioHandler *node, bool is_new arguments are more complicated to
think about than simply being given AioHandler *old_node, AioHandler
*new_node.

Furthermore, the new Linux io_uring file descriptor monitoring mechanism
added by the next patch requires access to both the old and the new
nodes.  Make this change now in preparation.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Link: https://lore.kernel.org/r/20200305170806.1313245-5-stefanha@redhat.com
Message-Id: <20200305170806.1313245-5-stefanha@redhat.com>
2020-03-09 16:41:31 +00:00
Stefan Hajnoczi
1f050a4690 aio-posix: extract ppoll(2) and epoll(7) fd monitoring
The ppoll(2) and epoll(7) file descriptor monitoring implementations are
mixed with the core util/aio-posix.c code.  Before adding another
implementation for Linux io_uring, extract out the existing
ones so there is a clear interface and the core code is simpler.

The new interface is AioContext->fdmon_ops, a pointer to a FDMonOps
struct.  See the patch for details.

Semantic changes:
1. ppoll(2) now reflects events from pollfds[] back into AioHandlers
   while we're still on the clock for adaptive polling.  This was
   already happening for epoll(7), so if it's really an issue then we'll
   need to fix both in the future.
2. epoll(7)'s fallback to ppoll(2) while external events are disabled
   was broken when the number of fds exceeded the epoll(7) upgrade
   threshold.  I guess this code path simply wasn't tested and no one
   noticed the bug.  I didn't go out of my way to fix it but the correct
   code is simpler than preserving the bug.

I also took some liberties in removing the unnecessary
AioContext->epoll_available (just check AioContext->epollfd != -1
instead) and AioContext->epoll_enabled (it's implicit if our
AioContext->fdmon_ops callbacks are being invoked) fields.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Link: https://lore.kernel.org/r/20200305170806.1313245-4-stefanha@redhat.com
Message-Id: <20200305170806.1313245-4-stefanha@redhat.com>
2020-03-09 16:41:31 +00:00
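The ops-table interface described in commit 1f050a4690 can be illustrated with a minimal sketch. Everything here is hypothetical (the `MonitorOps` and `Context` types, the stub wait functions, and `context_select_backend()`); the real FDMonOps fields are defined in the patch itself. The sketch only shows the general pattern of selecting a backend through a pointer to a struct of callbacks, using the `epollfd != -1` check mentioned in the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ops table; the real FDMonOps layout is in the patch. */
typedef struct MonitorOps {
    const char *name;
    int (*wait)(int timeout_ms);   /* returns number of ready fds */
} MonitorOps;

/* Stub backends that report no ready fds (illustration only). */
static int poll_wait_stub(int timeout_ms)  { (void)timeout_ms; return 0; }
static int epoll_wait_stub(int timeout_ms) { (void)timeout_ms; return 0; }

static const MonitorOps poll_ops  = { "poll",  poll_wait_stub };
static const MonitorOps epoll_ops = { "epoll", epoll_wait_stub };

typedef struct Context {
    const MonitorOps *ops;   /* currently selected backend */
    int epollfd;             /* -1 when epoll is unavailable */
} Context;

/* Pick the backend from the fd value instead of a separate flag. */
static void context_select_backend(Context *ctx)
{
    assert(ctx != NULL);
    ctx->ops = (ctx->epollfd != -1) ? &epoll_ops : &poll_ops;
}
```

Core code then calls `ctx->ops->wait(...)` without knowing which mechanism is behind it, which is what keeps the core simpler.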
Stefan Hajnoczi
3aa221b382 aio-posix: move RCU_READ_LOCK() into run_poll_handlers()
Now that run_poll_handlers_once() is only called by run_poll_handlers()
we can improve the CPU time profile by moving the expensive
RCU_READ_LOCK() out of the polling loop.

This reduces run_poll_handlers() from 40% CPU to 10% CPU in perf's
sampling profiler output.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Link: https://lore.kernel.org/r/20200305170806.1313245-3-stefanha@redhat.com
Message-Id: <20200305170806.1313245-3-stefanha@redhat.com>
2020-03-09 16:41:31 +00:00
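The idea behind commit 3aa221b382, taking an expensive lock once around a loop rather than once per iteration, can be sketched as follows. The names `read_lock()`, `read_unlock()`, `poll_once()` and `run_poll_loop()` are hypothetical stand-ins and not QEMU's functions; the counter exists only to make the number of lock acquisitions observable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static int lock_count;                  /* counts lock acquisitions */
static void read_lock(void)   { lock_count++; }
static void read_unlock(void) { }

/* Hypothetical polling pass: reports progress when the counter hits 0. */
static bool poll_once(int *remaining)
{
    assert(remaining != NULL);
    return --(*remaining) == 0;
}

/* The lock is taken once, outside the loop, not once per iteration. */
static bool run_poll_loop(int max_iterations)
{
    int remaining = max_iterations;
    bool progress = false;

    read_lock();
    for (int i = 0; i < max_iterations && !progress; i++) {
        progress = poll_once(&remaining);
    }
    read_unlock();
    return progress;
}
```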
Stefan Hajnoczi
e4346192f1 aio-posix: completely stop polling when disabled
One iteration of polling is always performed even when polling is
disabled.  This is done because:
1. Userspace polling is cheaper than making a syscall.  We might get
   lucky.
2. We must poll once more after polling has stopped in case an event
   occurred while stopping polling.

However, there are downsides:
1. Polling becomes a bottleneck when the number of event sources is very
   high.  It's more efficient to monitor fds in that case.
2. A high-frequency polling event source can starve non-polling event
   sources because ppoll(2)/epoll(7) is never invoked.

This patch removes the forced polling iteration so that poll_ns=0 really
means no polling.

IOPS increases from 10k to 60k when the guest has 100
virtio-blk-pci,num-queues=32 devices and 1 virtio-blk-pci,num-queues=1
device because the large number of event sources being polled slows down
the event loop.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Link: https://lore.kernel.org/r/20200305170806.1313245-2-stefanha@redhat.com
Message-Id: <20200305170806.1313245-2-stefanha@redhat.com>
2020-03-09 16:41:31 +00:00
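The behaviour change in commit e4346192f1 amounts to a small decision rule, sketched below. `should_poll()` and its parameters are hypothetical and not taken from the QEMU source; the sketch only captures the statement that poll_ns=0 now really means no polling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: decide whether the event loop should spend time
 * polling in userspace before falling back to blocking fd monitoring.
 */
static bool should_poll(int64_t poll_ns, bool polling_disabled)
{
    assert(poll_ns >= 0);
    /* No forced iteration: a zero budget or disabled polling skips
     * userspace polling entirely. */
    return poll_ns > 0 && !polling_disabled;
}
```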
Stefan Hajnoczi
c39cbedb54 aio-posix: remove confusing QLIST_SAFE_REMOVE()
QLIST_SAFE_REMOVE() is confusing here because the node must be on the
list.  We actually just wanted to clear the linked list pointers when
removing it from the list.  QLIST_REMOVE() now does this, so switch to
it.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Link: https://lore.kernel.org/r/20200224103406.1894923-3-stefanha@redhat.com
Message-Id: <20200224103406.1894923-3-stefanha@redhat.com>
2020-03-09 16:39:20 +00:00
Stefan Hajnoczi
a31ca6801c qemu/queue.h: clear linked list pointers on remove
Do not leave stale linked list pointers around after removal.  It's
safer to set them to NULL so that use-after-removal results in an
immediate segfault.

The RCU queue removal macros are unchanged since nodes may still be
traversed after removal.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Link: https://lore.kernel.org/r/20200224103406.1894923-2-stefanha@redhat.com
Message-Id: <20200224103406.1894923-2-stefanha@redhat.com>
2020-03-09 16:39:20 +00:00
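The pointer clearing described in commit a31ca6801c can be sketched with a simplified doubly linked list. The `struct node` layout and the `list_insert_head()` / `list_remove()` helpers are assumptions made for illustration; QEMU's real QLIST macros in qemu/queue.h are written differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type; not the layout used by qemu/queue.h. */
struct node {
    struct node *next;
    struct node **pprev;   /* address of the pointer that points at us */
    int value;
};

/* Insert n at the head of the list rooted at *head. */
static void list_insert_head(struct node **head, struct node *n)
{
    assert(head != NULL && n != NULL);
    n->next = *head;
    if (*head != NULL) {
        (*head)->pprev = &n->next;
    }
    *head = n;
    n->pprev = head;
}

/* Unlink n, then clear its link fields so stale use faults early. */
static void list_remove(struct node *n)
{
    assert(n != NULL && n->pprev != NULL);   /* must be on a list */
    if (n->next != NULL) {
        n->next->pprev = n->pprev;
    }
    *n->pprev = n->next;
    n->next = NULL;
    n->pprev = NULL;
}
```

With the fields set to NULL, a second removal or a traversal through the removed node dereferences NULL instead of silently following stale pointers.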
Peter Maydell
67f17e23ba Block layer patches:
- Add qemu-storage-daemon (still experimental)
- rbd: Add support for ceph namespaces
- Fix bdrv_reopen() with backing file in different AioContext
- qcow2: Fix read-write reopen with persistent dirty bitmaps
- qcow2: Fix alloc_cluster_abort() for pre-existing clusters

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

# gpg: Signature made Fri 06 Mar 2020 17:12:31 GMT
# gpg:                using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream: (29 commits)
  block: bdrv_reopen() with backing file in different AioContext
  iotests: Refactor blockdev-reopen test for iothreads
  block/rbd: Add support for ceph namespaces
  qemu-storage-daemon: Add --monitor option
  monitor: Add allow_hmp parameter to monitor_init()
  hmp: Fail gracefully if chardev is already in use
  qmp: Fail gracefully if chardev is already in use
  monitor: Create QAPIfied monitor_init()
  qapi: Create 'pragma' module
  stubs: Update monitor stubs for qemu-storage-daemon
  qemu-storage-daemon: Add --chardev option
  qemu-storage-daemon: Add main loop
  qemu-storage-daemon: Add --export option
  blockdev-nbd: Boxed argument type for nbd-server-add
  qemu-storage-daemon: Add --nbd-server option
  qemu-storage-daemon: Add --object option
  qapi: Flatten object-add
  qemu-storage-daemon: Add --blockdev option
  block: Move sysemu QMP commands to QAPI block module
  block: Move common QMP commands to block-core QAPI module
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-03-06 17:15:36 +00:00
Kevin Wolf
1de6b45fb5 block: bdrv_reopen() with backing file in different AioContext
This patch allows bdrv_reopen() (and therefore the x-blockdev-reopen QMP
command) to attach a node as the new backing file even if the node is in
a different AioContext than the parent, provided that either node can be
moved to the AioContext of the other.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Peter Krempa <pkrempa@redhat.com>
Message-Id: <20200306141413.30705-3-kwolf@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-03-06 17:34:09 +01:00
Kevin Wolf
97518e11c3 iotests: Refactor blockdev-reopen test for iothreads
We'll want to test more than one successful case in the future, so
prepare the test for that by a refactoring that runs each scenario in a
separate VM.

test_iothreads_switch_{backing,overlay} currently produce errors, but
these are cases that should actually work, by switching either the
backing file node or the overlay node to the AioContext of the other
node.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Peter Krempa <pkrempa@redhat.com>
Message-Id: <20200306141413.30705-2-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-03-06 17:34:01 +01:00
Florian Florensa
19ae9ae014 block/rbd: Add support for ceph namespaces
Starting from ceph Nautilus, RBD has support for namespaces, allowing
for finer-grained ACLs on images inside a pool, and tenant isolation.

In the rbd cli tool documentation, the new image-spec and snap-spec are:
 - [pool-name/[namespace-name/]]image-name
 - [pool-name/[namespace-name/]]image-name@snap-name

When using a qemu without namespace support, it complains about not
finding the image called namespace-name/image-name. We therefore only need
to parse the image name once more to check whether it contains a '/', and
if it does, use what precedes it as the name of the namespace to later
pass to rados_ioctx_set_namespace.
If rados_ioctx_set_namespace is called with an empty string or a null
pointer as the namespace parameter, it does essentially nothing, as it
then falls back to the default namespace.

The namespace is extracted inside qemu_rbd_parse_filename, stored in the
qdict, and used in qemu_rbd_connect to make it work with both qemu-img,
and qemu itself.

Signed-off-by: Florian Florensa <fflorensa@online.net>
Message-Id: <20200110111513.321728-2-fflorensa@online.net>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-03-06 17:21:28 +01:00
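The parsing step described in commit 19ae9ae014 (split the image name at a '/' and treat the prefix as the namespace) can be sketched as below. `split_image_spec()` is a hypothetical helper written for illustration; it is not the code in qemu_rbd_parse_filename and it ignores the pool and snapshot parts of the spec.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Split "namespace/image" at the first '/'.  Writes the namespace (or an
 * empty string when there is none) to ns and the image name to image.
 * Returns 0 on success, -1 if either buffer is too small.
 */
static int split_image_spec(const char *spec,
                            char *ns, size_t ns_len,
                            char *image, size_t image_len)
{
    assert(spec != NULL && ns != NULL && image != NULL);
    const char *slash = strchr(spec, '/');
    size_t ns_size = slash ? (size_t)(slash - spec) : 0;
    const char *img = slash ? slash + 1 : spec;

    if (ns_size + 1 > ns_len || strlen(img) + 1 > image_len) {
        return -1;
    }
    memcpy(ns, spec, ns_size);
    ns[ns_size] = '\0';
    strcpy(image, img);
    return 0;
}
```

An empty namespace string is a reasonable result for the no-slash case because, as the commit message notes, an empty namespace selects the default one.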
Kevin Wolf
2af282ec51 qemu-storage-daemon: Add --monitor option
This adds and parses the --monitor option, so that a QMP monitor can be
used in the storage daemon. The monitor offers commands defined in the
QAPI schema at storage-daemon/qapi/qapi-schema.json.

The --monitor option currently allows creating multiple monitors with
the same ID. This part of the interface is considered unstable. We will
reject such configurations as soon as we have a design for the monitor
subsystem to perform these checks. (In the system emulator, we depend on
QemuOpts rejecting duplicate IDs.)

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20200224143008.13362-21-kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-03-06 17:21:28 +01:00
Kevin Wolf
a2f411c467 monitor: Add allow_hmp parameter to monitor_init()
Add a new parameter allow_hmp to monitor_init() so that the storage
daemon can disable HMP.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20200224143008.13362-20-kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-03-06 17:21:28 +01:00
Kevin Wolf
8e9119a807 hmp: Fail gracefully if chardev is already in use
Trying to attach a HMP monitor to a chardev that is already in use
results in a crash because monitor_init_hmp() passes &error_abort to
qemu_chr_fe_init():

$ ./x86_64-softmmu/qemu-system-x86_64 --chardev stdio,id=foo --mon foo --mon foo
QEMU 4.2.50 monitor - type 'help' for more information
(qemu) Unexpected error in qemu_chr_fe_init() at chardev/char-fe.c:220:
qemu-system-x86_64: --mon foo: Device 'foo' is in use
Aborted (core dumped)

Fix this by allowing monitor_init_hmp() to return an error and passing
any error in qemu_chr_fe_init() to its caller instead of aborting.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20200224143008.13362-19-kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-03-06 17:21:28 +01:00
Kevin Wolf
f27a9bb3e9 qmp: Fail gracefully if chardev is already in use
Trying to attach a QMP monitor to a chardev that is already in use
results in a crash because monitor_init_qmp() passes &error_abort to
qemu_chr_fe_init():

$ ./x86_64-softmmu/qemu-system-x86_64 --chardev stdio,id=foo --mon foo,mode=control --mon foo,mode=control
Unexpected error in qemu_chr_fe_init() at chardev/char-fe.c:220:
qemu-system-x86_64: --mon foo,mode=control: Device 'foo' is in use
Aborted (core dumped)

Fix this by allowing monitor_init_qmp() to return an error and passing
any error in qemu_chr_fe_init() to its caller instead of aborting.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20200224143008.13362-18-kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-03-06 17:21:28 +01:00
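Commits 8e9119a807 and f27a9bb3e9 both replace an abort with an error that is passed back to the caller. A minimal sketch of that pattern follows; the `Chardev` and `Error` types and the `chardev_claim()` / `monitor_attach()` functions are simplified, hypothetical stand-ins, not QEMU's Error API or monitor code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a chardev and an error out-parameter. */
typedef struct Chardev { bool in_use; } Chardev;
typedef struct Error { const char *msg; } Error;

/* Claim the chardev; on failure fill in *errp and return false. */
static bool chardev_claim(Chardev *chr, Error *errp)
{
    assert(chr != NULL && errp != NULL);
    if (chr->in_use) {
        errp->msg = "Device is in use";
        return false;
    }
    chr->in_use = true;
    return true;
}

/* Pass the failure up to the caller instead of aborting the process. */
static bool monitor_attach(Chardev *chr, Error *errp)
{
    if (!chardev_claim(chr, errp)) {
        return false;
    }
    return true;
}
```

The caller can then print the message and exit cleanly, rather than producing a core dump as in the transcripts above.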
Kevin Wolf
f2098725aa monitor: Create QAPIfied monitor_init()
This adds a new QAPI-based monitor_init() function. The existing
monitor_init_opts() is rewritten to simply put its QemuOpts parameter
into a visitor and pass the resulting QAPI object to monitor_init().

This will cause some change in those error messages for the monitor
options in the system emulator that are now generated by the visitor
rather than explicitly checked in monitor_init_opts().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20200224143008.13362-17-kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-03-06 17:21:28 +01:00
Kevin Wolf
9a9f909951 qapi: Create 'pragma' module
We want to share the whitelists between the system emulator schema and
the storage daemon schema, so move all the pragmas from the main schema
file into a separate file that can be included from both.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20200224143008.13362-16-kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-03-06 17:21:28 +01:00
Kevin Wolf
6ede81d576 stubs: Update monitor stubs for qemu-storage-daemon
Before we can add the monitor to qemu-storage-daemon, we need to add a
stub for monitor_fdsets_cleanup().

We also need to make sure that stubs that are actually implemented in
the monitor core aren't linked to qemu-storage-daemon so that we don't
get linker errors because of duplicate symbols. This is achieved by
moving the stubs in question to a new file stubs/monitor-core.c.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20200224143008.13362-15-kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-03-06 17:21:28 +01:00
Kevin Wolf
5e6911cf11 qemu-storage-daemon: Add --chardev option
This adds a --chardev option to the storage daemon that works the same
as the -chardev option of the system emulator.

The syntax of the --chardev option is still considered unstable. We want
to QAPIfy it and will potentially make changes to its syntax while
converting it. However, we haven't decided yet on a design for the
QAPIfication, so QemuOpts will have to do for now.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20200224143008.13362-14-kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-03-06 17:21:28 +01:00
Kevin Wolf
aa70683ded qemu-storage-daemon: Add main loop
Instead of exiting after processing all command line options, start a
main loop and keep processing events until exit is requested with a
signal (e.g. SIGINT).

Now qemu-storage-daemon can be used as an alternative to qemu-nbd that
provides a few features that were previously only available from QMP,
such as access to options only available with -blockdev and the socket
types 'vsock' and 'fd'.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20200224143008.13362-13-kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-03-06 17:21:28 +01:00
Kevin Wolf
39411120b7 qemu-storage-daemon: Add --export option
Add a --export option to qemu-storage-daemon to export a block node. For
now, only NBD exports are implemented. Apart from the 'type' option
(which is the implied key), it maps the arguments for nbd-server-add to
the command line. Example:

    --export nbd,device=disk,name=test-export,writable=on

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20200224143008.13362-12-kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-03-06 17:21:28 +01:00
Kevin Wolf
c62d24e906 blockdev-nbd: Boxed argument type for nbd-server-add
Move the arguments of nbd-server-add to a new struct BlockExportNbd and
convert the command to 'boxed': true. This makes it easier to share code
with the storage daemon.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20200224143008.13362-11-kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-03-06 17:21:28 +01:00
Kevin Wolf
eed8b69178 qemu-storage-daemon: Add --nbd-server option
Add a --nbd-server option to qemu-storage-daemon to start the built-in
NBD server right away. It maps the arguments for nbd-server-start to the
command line, with the exception that it uses SocketAddress instead of
SocketAddressLegacy: New interfaces shouldn't use legacy types, and the
additional nesting would be nasty on the command line.

Example (only with required options):

    --nbd-server addr.type=inet,addr.host=localhost,addr.port=10809

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20200224143008.13362-10-kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-03-06 17:21:28 +01:00
Kevin Wolf
d6da78b5fd qemu-storage-daemon: Add --object option
Add a command line option to create user-creatable QOM objects.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20200224143008.13362-9-kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-03-06 17:21:28 +01:00
Kevin Wolf
5f07c4d60d qapi: Flatten object-add
Mapping object-add to the command line as is doesn't result in nice
syntax because of the nesting introduced with 'props'. This becomes
nicer and more consistent with device_add and netdev_add when we accept
properties for the object on the top level instead.

'props' is still accepted after this patch, but marked as deprecated.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20200224143008.13362-8-kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-03-06 17:21:27 +01:00
Kevin Wolf
14837c6493 qemu-storage-daemon: Add --blockdev option
This adds a --blockdev option to the storage daemon that works the same
as the -blockdev option of the system emulator.

In order to be able to link with blockdev.o, we also need to change
stream.o from common-obj to block-obj, which is where all other block
jobs already are.

In contrast to the system emulator, qemu-storage-daemon options will be
processed in the order they are given. The user needs to take care to
refer to other objects only after defining them.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20200224143008.13362-7-kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-03-06 17:15:38 +01:00
Kevin Wolf
5a16818b45 block: Move sysemu QMP commands to QAPI block module
QMP commands that are related to the system emulator and don't make
sense in the context of tools such as qemu-storage-daemon should live in
qapi/block.json rather than qapi/block-core.json. Move them there.

The associated data types are actually also used in code shared with the
tools, so they stay in block-core.json.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20200224143008.13362-6-kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-03-06 17:15:38 +01:00
Kevin Wolf
b3cf1ec06a block: Move common QMP commands to block-core QAPI module
block-core is for everything that isn't related to the system emulator.
Internal snapshots, the NBD server and quorum events make sense in the
tools, too, so move them to block-core.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20200224143008.13362-5-kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-03-06 17:15:38 +01:00
Kevin Wolf
12c929bca2 block: Move system emulator QMP commands to block/qapi-sysemu.c
These commands only make sense for system emulators and their
implementations call functions that don't exist in tools (e.g. to
resolve qdev IDs). Move them out so that blockdev.c can be linked to
qemu-storage-daemon.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20200224143008.13362-4-kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-03-06 17:15:38 +01:00
Kevin Wolf
5964ed56d9 stubs: Add arch_type
blockdev.c uses the arch_type constant, so before we can use the file in
tools (i.e. outside of the system emulator), we need to add a stub for
it. A new QEMU_ARCH_NONE is introduced for this case.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20200224143008.13362-3-kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-03-06 17:15:38 +01:00
Kevin Wolf
f353415ffd qemu-storage-daemon: Add barebone tool
This adds a new binary qemu-storage-daemon that doesn't yet do more than
some typical initialisation for tools and parsing the basic command
options --version, --help and --trace.

Even though this doesn't add any options yet that create things (like
--object or --blockdev), already document that we're planning to process
them in the order they are given on the command line rather than trying
(and failing, like vl.c) to resolve dependencies between options
automatically.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20200224143008.13362-2-kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-03-06 17:15:38 +01:00
Peter Krempa
65eb7c85a3 block/qcow2: Move bitmap reopen into bdrv_reopen_commit_post
The bitmap code requires writing the 'file' child when the qcow2 driver
is reopened in read-write mode.

If the 'file' child is being reopened due to a permissions change, the
modification has not yet been committed when qcow2_reopen_commit is called.
This means that any attempt to write the 'file' child will end with EBADFD
as the original fd was already closed.

Moving bitmap reopening to the new callback, which is called after
permission modifications are committed, fixes this as the file descriptor
will have been replaced with the correct one.

The above problem manifests itself when reopening a 'qcow2' format layer
which uses a 'file-posix' file child that was opened with the
'auto-read-only' property set.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Message-Id: <db118dbafe1955afbc0a18d3dd220931074ce349.1582893284.git.pkrempa@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-03-06 17:15:37 +01:00
Peter Krempa
17e1e2be5f block: Introduce 'bdrv_reopen_commit_post' step
Add another step in the reopen process where the driver can execute code
after permission changes are committed.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Message-Id: <adc02cf591c3cb34e98e33518eb1c540a0f27db1.1582893284.git.pkrempa@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-03-06 17:15:37 +01:00
Max Reitz
eeea1faa09 block: Fix leak in bdrv_create_file_fallback()
@options is leaked by the first two return statements in this function.

Note that blk_new_open() takes the reference to @options even on
failure, so all we need to do to fix the leak is to move the QDict
allocation down to where we actually need it.

Reported-by: Coverity (CID 1419884)
Fixes: fd17146cd9
       ("block: Generic file creation fallback")
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20200225155618.133412-1-mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-03-06 17:15:37 +01:00
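The shape of the fix in commit eeea1faa09, moving an allocation below the early return statements so those paths have nothing to leak, can be sketched as follows. All names (`tracked_alloc()`, `tracked_free()`, `create_file()`) are hypothetical, and the sketch frees the buffer itself, whereas in the real code blk_new_open() takes over the reference to @options.

```c
#include <assert.h>
#include <stdlib.h>

static int live_allocs;   /* outstanding allocations, for illustration */

static void *tracked_alloc(size_t n)
{
    void *p = malloc(n);
    if (p != NULL) {
        live_allocs++;
    }
    return p;
}

static void tracked_free(void *p)
{
    if (p != NULL) {
        live_allocs--;
        free(p);
    }
}

/* Returns 0 on success, -1 on validation or allocation failure. */
static int create_file(int size)
{
    if (size < 0) {
        return -1;        /* nothing allocated yet, so nothing leaks */
    }

    /* Allocate only at the point where the options are needed. */
    void *options = tracked_alloc(16);
    if (options == NULL) {
        return -1;
    }
    /* ... the options would be consumed here ... */
    tracked_free(options);
    return 0;
}
```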
Max Reitz
81311255f2 iotests/026: Test EIO on allocation in a data-file
Test what happens when writing data to an external data file, where the
write requires an L2 entry to be allocated, but the data write fails.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20200225143130.111267-4-mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-03-06 17:15:37 +01:00
Max Reitz
31ab00f374 iotests/026: Test EIO on preallocated zero cluster
Test what happens when writing data to a preallocated zero cluster, but
the data write fails.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20200225143130.111267-3-mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-03-06 17:15:37 +01:00
Max Reitz
3ede935fdb qcow2: Fix alloc_cluster_abort() for pre-existing clusters
handle_alloc() reuses preallocated zero clusters.  If anything goes
wrong during the data write, we do not change their L2 entry, so we
must not let qcow2_alloc_cluster_abort() free them.

Fixes: 8b24cd1415
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20200225143130.111267-2-mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-03-06 17:15:37 +01:00
Peter Maydell
c205828579 Pull request
These patches would have gone through Thomas Huth but he is away on leave.

Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging

# gpg: Signature made Fri 06 Mar 2020 14:23:11 GMT
# gpg:                using RSA key 8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [full]
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>" [full]
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/block-pull-request:
  tests: Fix a bug with count variables
  qtest: fix fuzzer-related 80-char limit violations
  fuzz: fix style/typos in linker-script comments

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-03-06 14:24:24 +00:00
Peter Maydell
f4c4357fbf docs:
* Convert qemu-doc from Texinfo to rST

Merge remote-tracking branch 'remotes/pmaydell/tags/pull-docs-20200306' into staging

# gpg: Signature made Fri 06 Mar 2020 11:08:15 GMT
# gpg:                using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg:                issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [ultimate]
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>" [ultimate]
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [ultimate]
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* remotes/pmaydell/tags/pull-docs-20200306: (33 commits)
  *.hx: Remove all the STEXI/ETEXI blocks
  docs: Remove old texinfo sources
  docs: Stop building qemu-doc
  ui/cocoa.m: Update documentation file and pathname
  docs: Generate qemu.1 manpage with Sphinx
  docs: Split out sections for the manpage into .rst.inc files
  qemu-options.hx: Fix up the autogenerated rST
  qemu-options.hx: Add rST documentation fragments
  scripts/hxtool-conv: Archive script used in qemu-options.hx conversion
  docs: Roll -prom-env and -g target-specific info into qemu-options.hx
  docs: Roll semihosting option information into qemu-options.hx
  doc/scripts/hxtool.py: Strip trailing ':' from DEFHEADING/ARCHHEADING
  hmp-commands-info.hx: Add rST documentation fragments
  hmp-commands.hx: Add rST documentation fragments
  docs/system: convert Texinfo documentation to rST
  docs/system: convert the documentation of deprecated features to rST.
  docs/system: convert managed startup to rST.
  docs/system: Convert security.texi to rST format
  docs/system: Convert qemu-cpu-models.texi to rST
  docs: Create defs.rst.inc as a place to define substitutions
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-03-06 11:11:54 +00:00
Peter Maydell
29f9dff790 *.hx: Remove all the STEXI/ETEXI blocks
We no longer generate texinfo from the hxtool input files,
so delete all the STEXI/ETEXI blocks.

This commit was created using the following Perl one-liner:
  perl -i -n -e '$suppress = 1,next if /^STEXI/;$suppress=0,next if /^ETEXI/; print if !$suppress;' *.hx
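
For readers less familiar with Perl, the filter above can be sketched in
Python roughly as follows. This is an illustrative equivalent only, not the
script that produced the commit; the function name strip_texi_blocks and the
sample lines are made up for the example.

```python
def strip_texi_blocks(lines):
    """Return the lines with every STEXI...ETEXI block removed.

    Mirrors the one-liner: a line starting with STEXI turns suppression on,
    a line starting with ETEXI turns it off, and both marker lines are
    dropped along with everything between them.
    """
    kept = []
    suppress = False
    for line in lines:
        if line.startswith("STEXI"):
            suppress = True
            continue
        if line.startswith("ETEXI"):
            suppress = False
            continue
        if not suppress:
            kept.append(line)
    return kept


if __name__ == "__main__":
    # Hypothetical input, loosely shaped like an .hx file; not real QEMU content.
    sample = [
        "DEF one\n",
        "STEXI\n",
        "@item one\n",
        "ETEXI\n",
        "SRST\n",
        "rST text\n",
        "ERST\n",
    ]
    print("".join(strip_texi_blocks(sample)), end="")
```

Unlike `perl -i`, this sketch does not rewrite files in place; a caller would
read each .hx file, pass its lines through the function, and write the result
back.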

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-03-06 11:06:55 +00:00
Peter Maydell
3a8273b1ab docs: Remove old texinfo sources
We can now delete the old .texi files, which we have been keeping in
the tree as a parallel set of documentation to the new rST sources.
The only remaining use of Texinfo is the autogenerated manuals
and HTML documents created from the QAPI JSON doc comments.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Kashyap Chamarthy <kchamart@redhat.com>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20200228153619.9906-33-peter.maydell@linaro.org
2020-03-06 11:06:55 +00:00
Peter Maydell
5b1d0e9249 docs: Stop building qemu-doc
Stop building the old texinfo qemu-doc; all its contents are
now available in the Sphinx-generated manuals and manpages.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20200228153619.9906-32-peter.maydell@linaro.org
2020-03-06 11:06:55 +00:00
Peter Maydell
1879f241e6 ui/cocoa.m: Update documentation file and pathname
We want to stop generating the old qemu-doc.html; first we
must update places that refer to it so they instead go to
our top level index.html documentation landing page.
The Cocoa UI has a menu option to bring up the documentation;
make it point to the new top level index.html instead.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20200228153619.9906-31-peter.maydell@linaro.org
2020-03-06 11:06:55 +00:00
Peter Maydell
d06118bfbd docs: Generate qemu.1 manpage with Sphinx
Generate the qemu.1 manpage using Sphinx; we do this with a new
top-level rst source file which is just the skeleton of the manpage
and which includes .rst.inc fragments where it needs to incorporate
sections from the larger HTML manuals.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20200228153619.9906-30-peter.maydell@linaro.org
2020-03-06 11:06:55 +00:00
Peter Maydell
bf87bef091 docs: Split out sections for the manpage into .rst.inc files
Sphinx doesn't have very good facilities for marking chunks
of documentation as "put this in the manpage only". So instead
we move the parts we want to put into both the HTML manuals
and the manpage into their own .rst.inc files, which we can
include from both the main manual rst files and a new toplevel
rst file that will be the skeleton of the qemu.1 manpage.

In this commit, just split out the parts of the documentation
that go in the manpage.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20200228153619.9906-29-peter.maydell@linaro.org
2020-03-06 11:06:55 +00:00
Peter Maydell
09ce5f2d6b qemu-options.hx: Fix up the autogenerated rST
This commit contains hand-written fixes for some issues with the
autogenerated rST fragments in qemu-options.hx:

 * Sphinx complains about the UTF-8 art table in the documentation of
   the -drive option.  Replace it with a proper rST format table.

 * rST does not like definition list entries with no actual
   definition, but it is possible to work around this by putting a
   single escaped literal space as the definition line.

 * The "-g widthxheight" option documentation suffers particularly
   badly from losing the distinction between italics and fixed-width
   as a result of the auto conversion, so put it back in again.

 * The script missed some places that use the |qemu_system| etc
   macros and need to be marked up as parsed-literal blocks.

 * The script autogenerated an expanded-out version of the
   contents of qemu-option-trace.texi; replace it with a
   qemu-option-trace.rst.inc include.

This is sufficient that we can enable inclusion of the
option documentation from invocation.rst.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20200228153619.9906-28-peter.maydell@linaro.org
2020-03-06 11:06:55 +00:00
Tianjia Zhang
1f40ace7b5 tests: Fix a bug with count variables
The counting code here should use the local variable n_nodes_local.
Otherwise n_nodes is counted incorrectly, which makes the test's
counting logic wrong.

Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Link: https://lore.kernel.org/r/20200207115433.118254-1-tianjia.zhang@linux.alibaba.com
Message-Id: <20200207115433.118254-1-tianjia.zhang@linux.alibaba.com>
2020-03-06 10:35:15 +00:00
Alexander Bulekov
3fc92f8752 qtest: fix fuzzer-related 80-char limit violations
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-id: 20200227031439.31386-3-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-03-06 10:33:26 +00:00
Alexander Bulekov
2f36421c34 fuzz: fix style/typos in linker-script comments
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-id: 20200227031439.31386-2-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-03-06 10:33:26 +00:00