If a qcow2 image specifies a backing file format that doesn't correspond
to any format driver that qemu knows, we shouldn't fall back to probing,
but simply error out.
Not looking up the backing file driver in bdrv_open_backing_file(), but
just filling in the "driver" option if it isn't there moves us closer to
the goal of having everything in QDict options and gets us the error
handling of bdrv_open(), which correctly refuses unknown drivers.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1416935562-7760-4-git-send-email-kwolf@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If the user neglects to specify the image format, QEMU probes the
image to guess it automatically, for convenience.
Relying on format probing is insecure for raw images (CVE-2008-2004).
If the guest writes a suitable header to the device, the next probe
will recognize a format chosen by the guest. A malicious guest can
abuse this to gain access to host files, e.g. by crafting a QCOW2
header with backing file /etc/shadow.
Commit 1e72d3b (April 2008) provided -drive parameter format to let
users disable probing. Commit f965509 (March 2009) extended QCOW2 to
optionally store the backing file format, to let users disable backing
file probing. QED has had a flag to suppress probing since the
beginning (2010), set whenever a raw backing file is assigned.
All of these additions that allow to avoid format probing have to be
specified explicitly. The default still allows the attack.
In order to fix this, commit 79368c8 (July 2010) put probed raw images
in a restricted mode, in which they wouldn't be able to overwrite the
first few bytes of the image so that they would identify as a different
image. If a write to the first sector would write one of the signatures
of another driver, qemu would instead zero out the first four bytes.
This patch was later reverted in commit 8b33d9e (September 2010) because
it didn't get the handling of unaligned qiov members right.
Today's block layer that is based on coroutines and has qiov utility
functions makes it much easier to get this functionality right, so this
patch implements it.
The other differences of this patch to the old one are that it doesn't
silently write something different than the guest requested by zeroing
out some bytes (it fails the request instead) and that it doesn't
maintain a list of signatures in the raw driver (it calls the usual
probe function instead).
Note that this change doesn't introduce new breakage for false positive
cases where the guest legitimately writes data into the first sector
that matches the signatures of an image format (e.g. for nested virt):
These cases were broken before, only the failure mode changes from
corruption after the next restart (when the wrong format is probed) to
failing the problematic write request.
Also note that like in the original patch, the restrictions only apply
if the image format has been guessed by probing. Explicitly specifying a
format allows guests to write anything they like.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1416497234-29880-8-git-send-email-kwolf@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The only image format driver that even potentially accesses anything
after 512 bytes in its bdrv_probe() implementation is VMDK, which reads
a plain-text descriptor file. In practice, the field it's looking for
seems to come first and will be well within the first 512 bytes, too.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1416497234-29880-7-git-send-email-kwolf@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1416497234-29880-6-git-send-email-kwolf@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This returns the node name of a BDS. Remove the TODO comment and expect
the callers to be explicit.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Similar to bdrv_next, this traverses through graph_bdrv_states. Will be
useful to enumerate all the named nodes.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In bdrv_rw_co we report -EINVAL for nb_sectors > INT_MAX /
BDRV_SECTOR_SIZE, so a caller shouldn't exceed it.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1415603264-21497-1-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQEcBAABAgAGBQJUYh9vAAoJEL7lnXSkw9fbgPQH/065L5+SpaJR1Nte9Lz3N2s1
a6tGSI22yu85tKvYCdYjeoVHSkSTyR57FdTfUd2xc2QPj+J4sWXpA81KILBGTJUp
NMpmLpWg4LOh8Ek4ViRgmFFdryzIFa4dT4gc1AcSAIAQ6jsgK1dM7m5kfncC3TN0
TUs248vJ2i/DaE0k8TOeJqxJTqInoFttlJEqG7RD+V5JznokE4zpFNXHDGx9BptE
W2J38GJ/TKRPe9UrHMKZI1r6+ZBdXyE/CaqsNNKLJdqrHgSQuAyK/PS6dQbM4BLg
M1qdP7Tp0wOlvv9qoEZMOEiUsi54XPqLgaLMbW74Yp5X459fqmLW2imy49pHXt8=
=klsW
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mjt/tags/pull-trivial-patches-2014-11-11' into staging
trivial patches for 2014-11-11
# gpg: Signature made Tue 11 Nov 2014 14:38:39 GMT using RSA key ID A4C3D7DB
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>"
# gpg: aka "Michael Tokarev <mjt@corpit.ru>"
# gpg: aka "Michael Tokarev <mjt@debian.org>"
* remotes/mjt/tags/pull-trivial-patches-2014-11-11:
block: Fix comment for bdrv_co_get_block_status
sysbus: Correct SYSTEM_BUS(obj) defines
target-i386: cpu: keeping function parameters alignment on new line
xen-hvm: Remove redundant variable 'xstate'
coroutine-sigaltstack: Change jmp_buf to sigjmp_buf
pc-bios: petalogix-s3adsp1800.dtb: Use 'xlnx, xps-ethernetlite-2.00.a' instead of 'xlnx, xps-ethernetlite-2.00.b'
gdbstub: Add a missing case of signal number translation in gdbstub
numa: make 'info numa' take into account hotplugged memory
slirp/smbd: modify/set several parameters in generated smbd.conf
qemu-doc.texi: fix typos in x509 examples
icc_bus: fix typo ICC_BRIGDE -> ICC_BRIDGE
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
It returns more information than binary, fix the comment.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
If the specified backing file could not be opened, do not generate a new
error message which contains the message which has been generated by
bdrv_open(), but just propagate the latter.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The mirror block job must run in the BlockDriverState AioContext so that
it works with dataplane.
Acquire the AioContext in blockdev.c so starting the block job is safe.
Note that to_replace is treated separately from other BlockDriverStates
in that it does not need to be in the same AioContext. Explicitly
acquire/release to_replace's AioContext when accessing it.
The completion code in block/mirror.c must perform BDS graph
manipulation and bdrv_reopen() from the main loop. Use
block_job_defer_to_main_loop() to achieve that.
The bdrv_drain_all() call is not allowed outside the main loop since it
could lead to lock ordering problems. Use bdrv_drain(bs) instead
because we have acquired the AioContext so nothing else can sneak in
I/O.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1413889440-32577-10-git-send-email-stefanha@redhat.com
Now that op blockers are in use, we can ensure that no other sources are
generating I/O on a BlockDriverState. Therefore it is possible to drain
requests for a single BDS.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1413889440-32577-7-git-send-email-stefanha@redhat.com
Depending on the changed options and the image format,
bdrv_amend_options() may take a significant amount of time. In these
cases, a way to be informed about the operation's status is desirable.
Since the operation is rather complex and may fundamentally change the
image, implementing it as AIO or a coroutine does not seem feasible. On
the other hand, implementing it as a block job would be significantly
more difficult than a simple callback and would not add benefits other
than progress report to the amending operation, because it should not
actually be run as a block job at all.
A callback may not be very pretty, but it's very easy to implement and
perfectly fits its purpose here.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1414404776-4919-2-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The local variable 'ac' in send_qmp_error_event() is declared with the
wrong type, which causes clang to complain when it is initialized
and again when it is used:
block.c:3655:20: warning: implicit conversion from enumeration type 'enum IoOperationType' to different enumeration type 'BlockErrorAction' (aka 'enum BlockErrorAction') [-Wenum-conversion]
ac = is_read ? IO_OPERATION_TYPE_READ : IO_OPERATION_TYPE_WRITE;
~ ^~~~~~~~~~~~~~~~~~~~~~
block.c:3655:45: warning: implicit conversion from enumeration type 'enum IoOperationType' to different enumeration type 'BlockErrorAction' (aka 'enum BlockErrorAction') [-Wenum-conversion]
ac = is_read ? IO_OPERATION_TYPE_READ : IO_OPERATION_TYPE_WRITE;
~ ^~~~~~~~~~~~~~~~~~~~~~~
block.c:3656:62: warning: implicit conversion from enumeration type 'BlockErrorAction' (aka 'enum BlockErrorAction') to different enumeration type 'IoOperationType' (aka 'enum IoOperationType') [-Wenum-conversion]
qapi_event_send_block_io_error(bdrv_get_device_name(bs), ac, action,
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^~
Correct the type to IoOperationType, and rename the variable
to 'optype' to match its correct type.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
Message-id: 1412969583-21045-1-git-send-email-peter.maydell@linaro.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
When falling through to the underlying file in
bdrv_co_get_block_status(), if it returns that the query offset is
beyond the file end (by setting *pnum to 0), return the range to be
zero and do not let the number of sectors for which information could be
obtained be overwritten.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
These functions call their non-0-counterparts and then fill the
allocated buffer with 0 (if the allocation has been successful).
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Move device model attachment / detachment and the BlockDevOps device
model callbacks and their wrappers from BlockDriverState to
BlockBackend.
Wrapper calls in block.c change from
bdrv_dev_FOO_cb(bs, ...)
to
if (bs->blk) {
bdrv_dev_FOO_cb(bs->blk, ...);
}
No change, because both bdrv_dev_change_media_cb() and
bdrv_dev_resize_cb() do nothing when no device model is attached, and
a device model can be attached only when bs->blk.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
I'll use it with block backends shortly, and the name is going to fit
badly there. It's a block layer thing anyway, not just a block driver
thing.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
I'll use BlockDriverAIOCB with block backends shortly, and the name is
going to fit badly there. It's a block layer thing anyway, not just a
block driver thing.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
BlockBackend's name space is separate only to keep the initial patches
simple. Time to merge the two.
Retain bdrv_find() and bdrv_get_device_name() for now, to keep this
series manageable.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
device_name[] can become non-empty only in bdrv_new_root() and
bdrv_move_feature_fields(). The latter is used only to undo damage
done by bdrv_swap(). The former is called only by blk_new_with_bs().
Therefore, when a BlockDriverState's device_name[] is non-empty, then
it's been created with a BlockBackend, and vice versa. Furthermore,
blk_new_with_bs() keeps the two names equal.
Therefore, device_name[] is redundant. Eliminate it.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Make the BlockBackend own the DriveInfo. Change blockdev_init() to
return the BlockBackend instead of the DriveInfo.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Convenience function blk_new_with_bs() creates a BlockBackend with its
BlockDriverState. Callers have to unref both. The commit after next
will relieve them of the need to unref the BlockDriverState.
Complication: due to the silly way drive_del works, we need a way to
hide a BlockBackend, just like bdrv_make_anon(). To emphasize its
"special" status, give the function a suitably off-putting name:
blk_hide_on_behalf_of_do_drive_del(). Unfortunately, hiding turns the
BlockBackend's name into the empty string. Can't avoid that without
breaking the blk->bs->device_name equals blk->name invariant.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Creating an anonymous BDS can't fail. Make that obvious.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When migrated using libvirt with "--copy-storage-all", at the end of
migration there is race between NBD mirroring task trying to do flush
and migration completion, both end up invalidating cache. Since qcow2
driver does not handle this situation very well, random crashes happen.
This disables the BDRV_O_INCOMING flag for the block device being migrated
once the cache has been invalidated.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
--
fixed parens by hand
Signed-off-by: Juan Quintela <quintela@redhat.com>
IDs have long spread beyond QemuOpts: not everything with an ID
necessarily goes through QemuOpts. Commit 9aebf3b is about such a
case: block layer names are meant to be well-formed IDs, but some of
them don't go through QemuOpts, and thus weren't checked. The commit
fixed that the straightforward way: rename the internal QemuOpts
helper id_wellformed() to qemu_opts_id_wellformed() and give it
external linkage.
Instead of using it directly in block.c, the commit adds wrapper
bdrv_is_valid_name(), probably to hide the connection to QemuOpts.
Go one logical step further: emancipate IDs from QemuOpts. Rename the
function back to id_wellformed(), and put it in another file. While
there, clean up its value to bool. Peel off the bdrv_is_valid_name()
wrapper.
[Replaced stray return 0 with return false to match bool returns used
elsewhere in id_wellformed().
--Stefan]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The device_name of a BlockDriverState is currently checked because it is
always used as a QemuOpts ID and qemu_opts_create() checks whether such
IDs are wellformed.
node-name is supposed to share the same namespace, but it isn't checked
currently. This patch adds explicit checks both for device_name and
node-name so that the same rules will still apply even if QemuOpts won't
be used any more at some point.
qemu-img used to use names with spaces in them, which isn't allowed any
more. Replace them with underscores.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Suggested-by: Benoit Canet <benoit.canet@nodalink.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If the BDS's refcnt > 0, drive_del() destroys the DriveInfo, but not
the BDS. This can happen in three places:
* Device model destruction during unplug: blockdev_auto_del()
* Xen IDE unplug: pci_piix3_xen_ide_unplug()
* drive_del command when no device model is attached: do_drive_del()
The other callers of drive_del are on error paths where refcnt == 1.
If the user somehow manages to plug in a device model using a BDS that
has gone through drive_del(), the legacy configuration passed in
DriveInfo doesn't reach the device model, and automatic deletion on
unplug doesn't work. Worse, some device models such as scsi-disk
crash when DriveInfo doesn't exist.
This is theoretical; I didn't research an actual reproducer. The problem
was introduced when we replaced DriveInfo reference counting by BDS
reference counting in commit a94a3fa..fa510eb.
Fix by keeping DriveInfo alive until its BDS dies.
This affects qemu_drive_opts: now you can't reuse the same ID for new
drive options until the BDS dies. Before, you could, but since the
code always attempts to create a BDS with the same ID next, the
enclosing operation "create a new drive" failed anyway. Different
error path, same result.
Unfortunately, the fix involves use of blockdev.c stuff from block.c,
which is a layering violation. Fortunately, my forthcoming
BlockBackend work will get rid of it again.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Now that all the implementations are converted to asynchronous version
and we can emulate synchronous cancellation with it. Let's drop the
unused member.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This is the async version of bdrv_aio_cancel, which doesn't block the
caller. It guarantees that the cb is called either before returning or
some time later.
bdrv_aio_cancel can base on bdrv_aio_cancel_async, later we can convert
all .io_cancel implementations to .io_cancel_async, and the aio_poll is
the common logic. In the end, .io_cancel can be dropped.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This will be useful in synchronous cancel emulation with
bdrv_aio_cancel_async.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
BLOCK_IO_ERROR events are logged by libvirt, which helps with
post mortem analysis of guests. However, one information that
we miss today is a human readable string describing the cause
of the I/O error.
This commit adds that string it to BLOCK_IO_ERROR. Note that
this string is a debugging aid for humans, meaning that it
should not parsed by applications.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This is the next step for decoupling block accounting functions from
BlockDriverState.
In a future commit the BlockAcctStats structure will be moved from
BlockDriverState to the device models structures.
Note that bdrv_get_stats was introduced so device models can retrieve the
BlockAcctStats structure of a BlockDriverState without being aware of it's
layout.
This function should go away when BlockAcctStats will be embedded in the device
models structures.
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Keith Busch <keith.busch@intel.com>
CC: Anthony Liguori <aliguori@amazon.com>
CC: "Michael S. Tsirkin" <mst@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Eric Blake <eblake@redhat.com>
CC: Peter Maydell <peter.maydell@linaro.org>
CC: Michael Tokarev <mjt@tls.msk.ru>
CC: John Snow <jsnow@redhat.com>
CC: Markus Armbruster <armbru@redhat.com>
CC: Alexander Graf <agraf@suse.de>
CC: Max Reitz <mreitz@redhat.com>
Signed-off-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The plan is to add new accounting metrics (latency, invalid requests, failed
requests, queue depth) and block.c is overpopulated so it will be better to work
in a separate module.
Moreover the long term plan is to have statistics in each of the BDS of the graph
for metrology purpose; this means that the device model statistics must move from
the topmost BDS to the device model.
So we need to decouple the statistic code from BlockDriverState.
This is another argument for the extraction of the code in a separate module.
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Max Reitz <mreitz@redhat.com>
CC: Eric Blake <eblake@redhat.com>
CC: Benoit Canet <benoit@irqsave.net>
CC: Fam Zheng <famz@redhat.com>
CC: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Extract the block accounting statistics into a structure so the block device
models can hold them in the future.
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Max Reitz <mreitz@redhat.com>
CC: Eric Blake <eblake@redhat.com>
Signed-off-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Management software, such as RHEV's vdsm, want to be able to allocate
disk space on demand. The basic use case is to start a VM with a small
disk and then the disk is enlarged when QEMU hits a ENOSPC condition.
To this end, the management software has to be notified when QEMU
encounters ENOSPC. The solution implemented by this commit is simple:
it extends the BLOCK_IO_ERROR with a 'nospace' key, which is true
when QEMU is stopped due to ENOSPC.
Note that support for querying this event is already present in
query-block by means of the 'io-status' key. Also, the new 'nospace'
BLOCK_IO_ERROR field shares the same semantics with 'io-status',
which basically means that werror= has to be set to either
'stop' or 'enospc' to enable 'nospace'.
Finally, this commit also updates the 'io-status' key doc in the
schema with a list of supported device models.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
When request A is a strict superset of request B:
AAAAAAAA
BBBB
multiwrite_merge() merges them as follows:
AABBBB
The tail of request A should have been included:
AABBBBAA
This patch fixes data loss but this code path is probably rare. Since
guests cannot assume ordering between in-flight requests, few
applications submit overlapping write requests.
Reported-by: Slava Pestov <sviatoslav.pestov@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
If a long-running operation on a BDS wants to always remain in the same
AIO context, it somehow needs to keep track of the BDS changing its
context. This adds a function for registering callbacks on a BDS which
are called whenever the BDS is attached or detached from an AIO context.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Format names are best consumed in alphabetical order. This makes
human-readable output easy to produce.
bdrv_iterate_format() already has an array of format strings. Sort them
before invoking the iteration callback.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Some block devices may not have a filename in their BDS; and for some,
there may not even be a normal filename at all. To work around this, add
a function which tries to construct a valid filename for the
BDS.filename field.
If a filename exists or a block driver is able to reconstruct a valid
filename (which is placed in BDS.exact_filename), this can directly be
used.
If no filename can be constructed, we can still construct an options
QDict which is then converted to a JSON object and prefixed with the
"json:" pseudo protocol prefix. The QDict is placed in
BDS.full_open_options.
For most block drivers, this process can be done automatically; those
that need special handling may define a .bdrv_refresh_filename() method
to fill BDS.exact_filename and BDS.full_open_options themselves.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
g_new(T, n) is neater than g_malloc(sizeof(T) * n). It's also safer,
for two reasons. One, it catches multiplication overflowing size_t.
Two, it returns T * rather than void *, which lets the compiler catch
more type errors.
Patch created with Coccinelle, with two manual changes on top:
* Add const to bdrv_iterate_format() to keep the types straight
* Convert the allocation in bdrv_drop_intermediate(), which Coccinelle
inexplicably misses
Coccinelle semantic patch:
@@
type T;
@@
-g_malloc(sizeof(T))
+g_new(T, 1)
@@
type T;
@@
-g_try_malloc(sizeof(T))
+g_try_new(T, 1)
@@
type T;
@@
-g_malloc0(sizeof(T))
+g_new0(T, 1)
@@
type T;
@@
-g_try_malloc0(sizeof(T))
+g_try_new0(T, 1)
@@
type T;
expression n;
@@
-g_malloc(sizeof(T) * (n))
+g_new(T, n)
@@
type T;
expression n;
@@
-g_try_malloc(sizeof(T) * (n))
+g_try_new(T, n)
@@
type T;
expression n;
@@
-g_malloc0(sizeof(T) * (n))
+g_new0(T, n)
@@
type T;
expression n;
@@
-g_try_malloc0(sizeof(T) * (n))
+g_try_new0(T, n)
@@
type T;
expression p, n;
@@
-g_realloc(p, sizeof(T) * (n))
+g_renew(T, p, n)
@@
type T;
expression p, n;
@@
-g_try_realloc(p, sizeof(T) * (n))
+g_try_renew(T, p, n)
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qemu-img check calls bdrv_check() twice if the first run repaired some
inconsistencies. If the first run however again triggered corruption
prevention (on qcow2) due to very bad inconsistencies, bs->drv may be
NULL afterwards. Thus, bdrv_check() should check whether bs->drv is set.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.
This patch addresses bounce buffer allocations in block.c. While at it,
convert bdrv_commit() from plain g_malloc() to qemu_try_blockalign().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
This function returns NULL instead of aborting when an allocation fails.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
If bdrv_unref() is passed a NULL BDS pointer, it is safe to
exit with no operation. This will allow cleanup code to blindly
call bdrv_unref() on a BDS that has been initialized to NULL.
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When a BlockDriverState is associated with a storage controller
DeviceState we expect guest I/O. Use this opportunity to bump the
coroutine pool size by 64.
This patch ensures that the coroutine pool size scales with the number
of drives attached to the guest. It should increase coroutine pool
usage (which makes qemu_coroutine_create() fast) without hogging too
much memory when fewer drives are attached.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
bdrv_get_geometry() hides errors. Use bdrv_nb_sectors() or
bdrv_getlength() instead where that's obviously inappropriate.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
It returns a multiple of the sector size.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Instead of bdrv_getlength().
Aside: a few of these callers don't handle errors. I didn't
investigate whether they should.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Instead of bdrv_getlength().
Replace variables length, length2 by total_sectors, nb_sectors2.
Bonus: use total_sectors instead of the slightly unclean
bs->total_sectors.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Instead of bdrv_getlength().
Variable target_size is initially in bytes, then changes meaning to
sectors. Ugh. Replace by target_sectors.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
A call to retrieve the image size converts between bytes and sectors
several times:
* BlockDriver method bdrv_getlength() returns bytes.
* refresh_total_sectors() converts to sectors, rounding up, and stores
in total_sectors.
* bdrv_getlength() converts total_sectors back to bytes (now rounded
up to a multiple of the sector size).
* Callers wanting sectors rather bytes convert it right back.
Example: bdrv_get_geometry().
bdrv_nb_sectors() provides a way to omit the last two conversions.
It's exactly bdrv_getlength() with the conversion to bytes omitted.
It's functionally like bdrv_get_geometry() without its odd error
handling.
Reimplement bdrv_getlength() and bdrv_get_geometry() on top of
bdrv_nb_sectors().
The next patches will convert some users of bdrv_getlength() to
bdrv_nb_sectors().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
At least raw-posix relies on this because it can allocate bounce buffers
based on the request length, but access it using all of the qiov entries
later.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
If a read request goes across EOF, the block driver sees a shortened
request that stops at EOF (the rest is memsetted in block.c), however
the original qiov was used for this request.
This patch makes the qiov size match the request size, avoiding a
potential buffer overflow in raw-posix.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
bdrv_is_allocated() should return either 0 or 1 in successful cases.
We're lucky that currently, the callers that rely on this (e.g. because
they check for ret == 1) don't seem to break badly. They just might skip
some optimisation or in the case of qemu-io 'map' print separate lines
where a single line would suffice. In theory, a wrong allocation status
could lead to image corruption with certain operations, so let's fix
this quickly.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
This patch introduces three APIs so that following
patches can support queuing I/O requests and submitting them
as a batch for improving I/O performance.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
On some image chains, QEMU may not always be able to resolve the
filenames properly, when updating the backing file of an image
after a block commit.
For instance, certain relative pathnames may fail, or drives may
have been specified originally by file descriptor (e.g. /dev/fd/???),
or a relative protocol pathname may have been used.
In these instances, QEMU may lack the information to be able to make
the correct choice, but the user or management layer most likely does
have that knowledge.
With this extension to the block-commit api, the user is able to change
the backing file of the overlay image as part of the block-commit
operation.
This allows the change to be 'safe', in the sense that if the attempt
to write the overlay image metadata fails, then the block-commit
operation returns failure, without disrupting the guest.
If the commit top is the active layer, then specifying the backing
file string will be treated as an error (there is no overlay image
to modify in that case).
If a backing file string is not specified in the command, the backing
file string to use is determined in the same manner as it was
previously.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This is a small helper function, to determine if 'base' is in the
chain of BlockDriverState 'top'. It returns true if it is in the chain,
and false otherwise.
If either argument is NULL, it will also return false.
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This simplifies the function bdrv_find_overlay(). With this change,
bdrv_find_base() is just a subset of usage of bdrv_find_overlay(),
so this also takes advantage of that.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
When failure occurs, 'ret' need be set, or may return 0 to indicate
success. Previously, an error was set in errp, but 0 was returned
anyway. So let bdrv_append_temp_snapshot() return an error code and
use that for the bdrv_open() return value.
Also, error_propagate() need be called only one time within a function.
Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
drive-mirror will bdrv_swap the new BDS named node-name with the one
pointed by replaces when the mirroring is finished.
Signed-off-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If we check for the RESIZE blocker in bdrv_truncate(), that means a
commit will fail if the overlay layer is larger than the base, due to
the backing blocker.
This is a regression in behavior from 2.0; currently, commit will try to
grow the size of the base image to match the overlay size, if the
overlay size is larger.
By moving this into the QMP command qmp_block_resize(), it allows
usage of bdrv_truncate() within block jobs.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The only semantic change is that bs->open_flags gets BDRV_O_PROTOCOL set
now. This isn't useful, but it doesn't hurt either. The code that was
previously skipped by 'goto done' is automatically disabled because
protocol drivers don't support backing files (and if they did, this
would probably be a fix) and can't have snapshot_flags set.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Since we parse backing.* options to add a backing file from the command
line when the driver didn't assign one, it has been possible to have a
backing file for e.g. raw images (it just was never accessed).
This is obvious nonsense and should be rejected.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
This recursion was introduced in commit 505d7583 in order to allow
nesting image formats. It only ever takes effect when the user
explicitly specifies a driver name and that driver isn't suitable for
the protocol level.
We can check this earlier in bdrv_open() and if the explicitly
requested driver is a format driver, clear BDRV_O_PROTOCOL so that
another bs->file layer is opened.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
It doesn't do much any more, we can move the code to bdrv_open() now.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Eric Blake <eblake@redhat.com>
This moves the bdrv_open_file() call a bit down so that it can use the
bdrv_open() code that selects the right block driver.
The code between the old and the new call site is either common code
(the error message for an unknown driver has been unified now) or
doesn't run with cleared BDRV_O_PROTOCOL (added an if block in one
place, whereas the right path was already asserted in another place)
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
The "driver" entry in the options QDict is now only missing if we're
opening an image with format probing.
We also catch cases now where both the drv argument and a "driver"
option is specified, e.g. by specifying -drive format=qcow2,driver=raw
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
The idea of bdrv_fill_options() is to convert every parameter for
opening images, in particular the filename and flags, to entries in the
options QDict.
This patch starts with moving the filename parsing and driver probing
part from bdrv_file_open() to the new function.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
'buf' is not used actually, so remove it and related snprintf() statement.
Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
In order to let event defines use existing types later, instead of
redefine new ones, some old type defines for spice and vnc are changed,
and BlockErrorAction is moved from block.h to qapi schema. Note that
BlockErrorAction is not merged with BlockdevOnError.
At this point, VncInfo is not made a child of VncBasicInfo, because
VncBasicInfo has mandatory fields where VncInfo makes them optional.
Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
With virtio-blk dataplane, I/O errors might occur while QEMU is
not in the main I/O thread. However, it's invalid to call vm_stop
when we're neither in a VCPU thread nor in the main I/O thread,
even if we were to take the iothread mutex around it.
To avoid this problem, we can raise a request to the main I/O thread,
similar to what QEMU does when vm_stop is called from a CPU thread.
We know that bdrv_error_action is called from an AIO callback, and
the moment at which the callback will fire is not well-defined; it
depends on the moment at which the disk or OS finishes the operation,
which can happen at any time. Note that QEMU is certainly not in a CPU
thread and we do not need to call cpu_stop_current() like vm_stop() does.
However, we need to ensure that any action taken by management will
result in correct detection of the error _and_ a running VM. In particular:
- the event must be raised after the iostatus has been set, so that
"info block" will return an iostatus that matches the event.
- the VM must be stopped after the iostatus has been set, so that
"info block" will return an iostatus that matches the runstate.
The ordering between the STOP and BLOCK_IO_ERROR events is preserved;
BLOCK_IO_ERROR is documented to come first.
This makes bdrv_error_action() thread safe (assuming QMP events are,
which is attacked by a separate series).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Now that all backend drivers are using QemuOpts, remove all
QEMUOptionParameter related codes.
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Change block layer to support both QemuOpts and QEMUOptionParameter.
After this patch, it will change backend drivers one by one. At the end,
QEMUOptionParameter will be removed and only QemuOpts is kept.
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Block I/O throttling uses timers and currently always adds them to the
main loop. Throttling will break if bdrv_set_aio_context() is used to
move a BlockDriverState to a different AioContext.
This patch adds throttle_detach/attach_aio_context() interfaces so the
throttling timers and uses them to move timers to the new AioContext.
Note that bdrv_set_aio_context() already drains all requests so we're
sure no throttled requests are pending.
The test cases need to be updated since the throttle_init() interface
has changed.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Up until now all BlockDriverState instances have used the QEMU main loop
for fd handlers, timers, and BHs. This is not scalable on SMP guests
and hosts so we need to move to a model with multiple event loops on
different host CPUs.
bdrv_set_aio_context() assigns the AioContext event loop to use for a
particular BlockDriverState. It first detaches the entire
BlockDriverState graph from the current AioContext and then attaches to
the new AioContext.
This function will be used by virtio-blk data-plane to assign a
BlockDriverState to its IOThread AioContext. Make
bdrv_aio_set_context() public since data-plane should not include
block_int.h.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Modify bdrv_drain_all() to take into account that BlockDriverState
instances may be running in different AioContexts.
This patch changes the implementation of bdrv_drain_all() while
preserving the semantics. Previously kicking throttled requests and
checking for pending requests were done across all BlockDriverState
instances in sequence. Now we process each BlockDriverState in turn,
making sure to acquire and release its AioContext.
This prevents race conditions between the thread executing
bdrv_drain_all() and the thread running the AioContext.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
bdrv_close_all(), bdrv_commit_all(), bdrv_flush_all(),
bdrv_invalidate_cache_all(), and bdrv_clear_incoming_migration_all() are
called by main loop code and touch all BlockDriverState instances.
Some BlockDriverState instances may be running in another AioContext.
Make sure to acquire the AioContext before closing the BlockDriverState.
This will protect against race conditions once virtio-blk data-plane is
using the BlockDriverState from another AioContext event loop.
Note that this patch does not convert bdrv_drain_all() yet since that
conversion is non-trivial.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Drop the assumption that we're using the main AioContext. Convert
qemu_aio_wait() to aio_poll() and qemu_bh_new() to aio_bh_new() so the
BlockDriverState AioContext is used.
Note there is still one qemu_aio_wait() left in bdrv_create() but we do
not have a BlockDriverState there and only main loop code invokes this
function.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Introduced in commit da557a. Spotted by Coverity.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The above bdrv_set_backing_hd already does this.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This makes use of op_blocker and blocks all the operations except for
commit target, on each BlockDriverState->backing_hd.
The asserts for op_blocker in bdrv_swap are removed because with this
change, the target of block commit has at least the backing blocker of
its child, so the assertion is not true. Callers should do their check.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
We need to handle the coming backing_blocker properly, so don't open
code the assignment, instead, call bdrv_set_backing_hd to change
backing_hd.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This is the common but non-trivial steps to assign or change the
backing_hd of BDS.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This drops BlockDriverState.in_use with op_blockers:
- Call bdrv_op_block_all in place of bdrv_set_in_use(bs, 1).
- Call bdrv_op_unblock_all in place of bdrv_set_in_use(bs, 0).
- Check bdrv_op_is_blocked() in place of bdrv_in_use(bs).
The specific types are used, e.g. in place of starting block backup,
bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_BACKUP, ...).
There is one exception in block_job_create, where
bdrv_op_blocker_is_empty() is used, because we don't know the operation
type here. This doesn't matter because in a few commits away we will drop
the check and move it to callers that _do_ know the type.
- Check bdrv_op_blocker_is_empty() in place of assert(!bs->in_use).
Note: there is only bdrv_op_block_all and bdrv_op_unblock_all callers at
this moment. So although the checks are specific to op types, this
changes can still be seen as identical logic with previously with
in_use. The difference is error message are improved because of blocker
error info.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>