mirrors/qemu

mirror of https://github.com/qemu/qemu.git synced 2024-11-25 03:43:37 +08:00

Author	SHA1	Message	Date
Alberto Garcia	764ba3ae51	block: remove redundant check before g_slist_find() An empty GSList is represented by a NULL pointer, therefore it's a perfectly valid argument for g_slist_find() and there's no need to make any additional check. Signed-off-by: Alberto Garcia <berto@igalia.com> Message-id: 1435583533-5758-1-git-send-email-berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-07-02 10:06:23 +01:00
Peter Lieven	29c838cdc9	block/nfs: limit maximum readahead size to 1MB a malicious caller could otherwise specify a very large value via the URI and force libnfs to allocate a large amount of memory for the readahead buffer. Cc: qemu-stable@nongnu.org Signed-off-by: Peter Lieven <pl@kamp.de> Message-id: 1435317241-25585-1-git-send-email-pl@kamp.de Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-07-02 10:06:23 +01:00
Peter Lieven	9049736ec7	block/iscsi: restore compatiblity with libiscsi 1.9.0 RHEL7 and others are stuck with libiscsi 1.9.0 since there unfortunately was an ABI breakage after that release. Signed-off-by: Peter Lieven <pl@kamp.de> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1435313881-19366-1-git-send-email-pl@kamp.de Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-07-02 10:06:23 +01:00
Fam Zheng	508249952c	block: Fix dirty bitmap in bdrv_co_discard Unsetting dirty globally with discard is not very correct. The discard may zero out sectors (depending on can_write_zeroes_with_unmap), we should replicate this change to destination side to make sure that the guest sees the same data. Calling bdrv_reset_dirty also troubles mirror job because the hbitmap iterator doesn't expect unsetting of bits after current position. So let's do it the opposite way which fixes both problems: set the dirty bits if we are to discard it. Reported-by: wangxiaolong@ucloud.cn Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-07-02 10:06:23 +01:00
Fam Zheng	dcfb3beb51	mirror: Do zero write on target if sectors not allocated If guest discards a source cluster, mirroring with bdrv_aio_readv is overkill. Some protocols do zero upon discard, where it's best to use bdrv_aio_write_zeroes, otherwise, bdrv_aio_discard will be enough. Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-07-02 10:06:23 +01:00
Fam Zheng	0fc9f8ea28	qmp: Add optional bool "unmap" to drive-mirror If specified as "true", it allows discarding on target sectors where source is not allocated. Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-07-02 10:06:23 +01:00
Fam Zheng	ba3f0e2545	block: Add bdrv_get_block_status_above Like bdrv_is_allocated_above, this function follows the backing chain until seeing BDRV_BLOCK_ALLOCATED. Base is not included. Reimplement bdrv_is_allocated on top. [Initialized bdrv_co_get_block_status_above() ret to 0 to silence mingw64 compiler warning about the unitialized variable. assert(bs != base) prevents that case but I suppose the program could be compiled with -DNDEBUG. --Stefan] Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-07-02 10:03:50 +01:00
John Snow	4b80ab2b7d	qapi: Rename 'dirty-bitmap' mode to 'incremental' If we wish to make differential backups a feature that's easy to access, it might be pertinent to rename the "dirty-bitmap" mode to "incremental" to make it clear what /type/ of backup the dirty-bitmap is helping us perform. This is an API breaking change, but 2.4 has not yet gone live, so we have this flexibility. Signed-off-by: John Snow <jsnow@redhat.com> Message-id: 1433463642-21840-2-git-send-email-jsnow@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-07-02 09:20:18 +01:00
Jindřich Makovička	3e5feb6202	qcow2: Handle EAGAIN returned from update_refcount Fixes a crash during image compression Signed-off-by: Jindřich Makovička <makovick@gmail.com> Tested-by: Richard W.M. Jones <rjones@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-07-02 09:20:18 +01:00
Peter Lieven	5dd7a535b7	block/iscsi: add support for request timeouts libiscsi starting with 1.15 will properly support timeout of iscsi commands. The default will remain no timeout, but this can be changed via cmdline parameters, e.g.: qemu -iscsi timeout=30 -drive file=iscsi://... If a timeout occurs a reconnect is scheduled and the timed out command will be requeued for processing after a successful reconnect. The required API call iscsi_set_timeout is present since libiscsi 1.10 which was released in October 2013. However, due to some bugs in the libiscsi code the use is not recommended before version 1.15. Please note that this patch bumps the libiscsi requirement to 1.10 to have all function and macros defined. The patch fixes also a off-by-one error in the NOP timeout calculation which was fixed while touching these code parts. Signed-off-by: Peter Lieven <pl@kamp.de> Message-id: 1434455107-19328-1-git-send-email-pl@kamp.de Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-07-02 09:20:18 +01:00
Dimitris Aragiorgis	3307ed7b3f	raw-posix: Introduce hdev_is_sg() Until now, an SG device was identified only by checking if its path started with "/dev/sg". Then, hdev_open() would set the bs->sg flag accordingly. The patch relies on the actual properties of the device instead of the specified file path. To this end, test for an SG device (e.g. /dev/sg0) by ensuring that all of the following holds: - The specified file name corresponds to a character device - The device supports the SG_GET_VERSION_NUM ioctl - The device supports the SG_GET_SCSI_ID ioctl Signed-off-by: Dimitris Aragiorgis <dimara@arrikto.com> Message-id: 1435056300-14924-6-git-send-email-dimara@arrikto.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-06-23 15:08:52 +01:00
Dimitris Aragiorgis	a93a3982a6	raw-posix: Use DPRINTF for DEBUG_FLOPPY Get rid of several #ifdef DEBUG_FLOPPY and substitute them with DPRINTF. Signed-off-by: Dimitris Aragiorgis <dimara@arrikto.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1435056300-14924-5-git-send-email-dimara@arrikto.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-06-23 15:08:52 +01:00
Dimitris Aragiorgis	bcb225550d	raw-posix: DPRINTF instead of DEBUG_BLOCK_PRINT Building the QEMU tools fails if we #define DEBUG_BLOCK inside block/raw-posix.c. Here instead of adding qemu-log.o in block-obj-y so that DEBUG_BLOCK_PRINT can be used, we substitute the latter with a simple DPRINTF() (that does not cause bit-rot). Signed-off-by: Dimitris Aragiorgis <dimara@arrikto.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1435056300-14924-4-git-send-email-dimara@arrikto.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-06-23 15:08:52 +01:00
Dimitris Aragiorgis	1b6bc94d5d	Fix migration in case of scsi-generic During migration, QEMU uses fsync()/fdatasync() on the open file descriptor for read-write block devices to flush data just before stopping the VM. However, fsync() on a scsi-generic device returns -EINVAL which causes the migration to fail. This patch skips flushing data in case of an SG device, since submitting SCSI commands directly via an SG character device (e.g. /dev/sg0) bypasses the page cache completely, anyway. Note that fsync() not only flushes the page cache but also the disk cache. The scsi-generic device never sends flushes, and for migration it assumes that the same SCSI device is used by the destination host, so it does not issue any SCSI SYNCHRONIZE CACHE (10) command. Finally, remove the bdrv_is_sg() test from iscsi_co_flush() since this is now redundant (we flush the underlying protocol at the end of bdrv_co_flush() which, with this patch, we never reach). Signed-off-by: Dimitris Aragiorgis <dimara@arrikto.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1435056300-14924-3-git-send-email-dimara@arrikto.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-06-23 15:08:52 +01:00
Dimitris Aragiorgis	b192af8acc	block: Use bdrv_is_sg() everywhere Instead of checking bs->sg use bdrv_is_sg() consistently throughout the code. Signed-off-by: Dimitris Aragiorgis <dimara@arrikto.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1435056300-14924-2-git-send-email-dimara@arrikto.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-06-23 15:08:52 +01:00
Wolfgang Bumiller	d5941ddae8	vvfat: add a label option Until now the vvfat volume label was hardcoded to be "QEMU VVFAT", now you can pass a file.label=labelname option to the -drive to change it. The FAT structure defines the volume label to be limited to 11 bytes and is filled up spaces when shorter than that. The trailing spaces however aren't exposed to the user by operating systems. [Added missing comment '#' characters in block-core.json to fix build errors. --Stefan] Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com> Message-id: 1434706529-13895-2-git-send-email-w.bumiller@proxmox.com Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-06-23 15:06:17 +01:00
Alexander Yarygin	97b0385a34	block-backend: Introduce blk_drain() This patch introduces the blk_drain() function which allows to replace blk_drain_all() when only one BlockDriverState needs to be drained. Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Kevin Wolf <kwolf@redhat.com> Message-id: 1434537440-28236-2-git-send-email-yarygin@linux.vnet.ibm.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-06-23 15:06:16 +01:00
Alberto Garcia	2f388b93a1	throttle: Check current timers before updating any_timer_armed[] Calling throttle_group_config() cancels all timers from a particular BlockDriverState, so any_timer_armed[] should be updated accordingly. However, with the current code it may happen that a timer is armed in a different BlockDriverState from the same group, so any_timer_armed[] would be set to false in a situation where there is still a timer armed. The consequence is that we might end up with two timers armed. This should not have any noticeable impact however, since all accesses to the ThrottleGroup are protected by a lock, and the situation would become normal again shortly thereafter as soon as all timers have been fired. The correct way to solve this is to check that we're actually cancelling a timer before updating any_timer_armed[]. Signed-off-by: Alberto Garcia <berto@igalia.com> Message-id: 1434382875-3998-1-git-send-email-berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-06-23 15:06:16 +01:00
Alexander Yarygin	f406c03c09	block: Let bdrv_drain_all() to call aio_poll() for each AioContext After the commit `9b536adc` ("block: acquire AioContext in bdrv_drain_all()") the aio_poll() function got called for every BlockDriverState, in assumption that every device may have its own AioContext. If we have thousands of disks attached, there are a lot of BlockDriverStates but only a few AioContexts, leading to tons of unnecessary aio_poll() calls. This patch changes the bdrv_drain_all() function allowing it find shared AioContexts and to call aio_poll() only for unique ones. Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> Message-id: 1433936297-7098-4-git-send-email-yarygin@linux.vnet.ibm.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-06-23 15:06:16 +01:00
Markus Armbruster	cc7a8ea740	Include qapi/qmp/qerror.h exactly where needed In particular, don't include it into headers. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>	2015-06-22 18:20:41 +02:00
Markus Armbruster	d49b683644	qerror: Move #include out of qerror.h Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>	2015-06-22 18:20:40 +02:00
Markus Armbruster	4629ed1e98	qerror: Finally unused, clean up Remove it except for two things in qerror.h: * Two #include to be cleaned up separately to avoid cluttering this patch. * The QERR_ macros. Mark as obsolete. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>	2015-06-22 18:20:40 +02:00
Markus Armbruster	c6bd8c706a	qerror: Clean up QERR_ macros to expand into a single string These macros expand into error class enumeration constant, comma, string. Unclean. Has been that way since commit `13f59ae`. The error class is always ERROR_CLASS_GENERIC_ERROR since the previous commit. Clean up as follows: * Prepend every use of a QERR_ macro by ERROR_CLASS_GENERIC_ERROR, and delete it from the QERR_ macro. No change after preprocessing. * Rewrite error_set(ERROR_CLASS_GENERIC_ERROR, ...) into error_setg(...). Again, no change after preprocessing. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>	2015-06-22 18:20:40 +02:00
Eric Blake	fc48ffc39e	qobject: Use 'bool' for qbool We require a C99 compiler, so let's use 'bool' instead of 'int' when dealing with boolean values. There are few enough clients to fix them all in one pass. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Andreas Färber <afaerber@suse.de> Reviewed-by: Alberto Garcia <berto@igalia.com> Acked-by: Luiz Capitulino <lcapitulino@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2015-06-22 17:40:00 +02:00
Peter Maydell	f3e3b083d4	Block layer core and image format patches -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABAgAGBQJVevYFAAoJEH8JsnLIjy/W3jEP/0hiQ3rCRZ/he8s5maTdT+TR YSeHkB5rKpz0Uopn1DMn1QrIbUVzX7dyb+uf9zQ0/xRQIzf6k8uxqU/NWrdoF3NK qx91dGWedwnG+TEBIMbcR7nMrw4dP6kH7uPz/VWMXDHVLz0HIcD95qhKgs0mSY6J dWqex6ACjXM68zJU5IioagU9evV80WZE1S8z7zfixxtTBx5hCaTVbwalkaCxcrXw PbZle55rjI8B10+OzgBw0fq10nias+NTndU9CwNBboxmEtAjq8/mQ663vcWlmiFo 9a/hkda27Z5ut/0Tqk1v4uLHauylp++rrAabPBAuCFMKes6cdkddP15Q/r52aJ29 5meodQtbet1rGrM+Aq4vuSuWId71PGypEI/3URDdNfYFNISoeLLsk4lcQUu7VrDD sRX3Jt8SI3nkIgOnhPyi7NDPmafxFt8yRt5vM8MyR5ynF8NS/2hiAc3wqnbXGjUj a5GqDCefb1yM0R5HvksuFFt3OnXlKJQ3J+ksXNUJf9DSAZPauqWD696pcTeg8wyy 3PIGkczgUuKTVfFWd3THZxJLAo7ZuqvXBHHV8o1SeMBDxwh4FhTd8Kjvm3rUNFfl VDox4qwZ1AcxLrxgqazKU7sD9iWBDHURRcpOoUsBys7oxQZnhcmMp1fRlEkTOyrD HNiSNByqBrtkfeVzHlSe =QgYk -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging Block layer core and image format patches # gpg: Signature made Fri Jun 12 16:08:53 2015 BST using RSA key ID C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" * remotes/kevin/tags/for-upstream: (25 commits) block: Fix reopen flag inheritance block: Add BlockDriverState.inherits_from block: Add list of children to BlockDriverState queue.h: Add QLIST_FIX_HEAD_PTR() block: Drain requests before swapping nodes in bdrv_swap() block: Move flag inheritance to bdrv_open_inherit() block: Use QemuOpts in bdrv_open_common() block: Use macro for cache option names vmdk: Use bdrv_open_image() quorum: Use bdrv_open_image() check-qdict: Test cases for new functions qdict: Add qdict_{set,copy}_default() qdict: Add qdict_array_entries() iotests: Add tests for overriding BDRV_O_PROTOCOL block: driver should override flags in bdrv_open() block: Change bitmap truncate conditional to assertion block: record new size in bdrv_dirty_bitmap_truncate raw-posix: Fix .bdrv_co_get_block_status() for unaligned image size vmdk: Use vmdk_find_index_in_cluster everywhere vmdk: Fix index_in_cluster calculation in vmdk_co_get_block_status ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2015-06-15 10:43:06 +01:00
Kevin Wolf	67251a3113	block: Fix reopen flag inheritance When reopening an image, the block layer already takes care to reopen bs->file as well with recalculated inherited flags. The same must happen for any other child (most notably missing before this patch: backing files). If bs->file (or any other child) didn't originally inherit from bs, e.g. because it was created separately and then only referenced, it must not inherit flags on reopen either, so check the inherited_from field before propagation the reopen down. VMDK already reopened its extents manually; this code can now be dropped. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2015-06-12 17:04:59 +02:00
Kevin Wolf	f3930ed0bb	block: Move flag inheritance to bdrv_open_inherit() Instead of letting every caller of bdrv_open() determine the right flags for its child node manually and pass them to the function, pass the parent node and the role of the newly opened child (like backing file, protocol layer, etc.). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2015-06-12 17:04:59 +02:00
Kevin Wolf	a646836784	vmdk: Use bdrv_open_image() Besides standardising on a single interface for opening child nodes, this patch allows the user to specify options to individual extent nodes. Overriding file names isn't possible with this yet, so it's of limited usefulness, but still a step forward. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com>	2015-06-12 16:58:07 +02:00
Kevin Wolf	ea6828d81b	quorum: Use bdrv_open_image() Besides standardising on a single interface for opening child nodes, this simplifies the .bdrv_open() implementation of the quorum block driver by using block layer functionality for handling BlockdevRefs. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com>	2015-06-12 16:58:07 +02:00
Kevin Wolf	b8684454e1	raw-posix: Fix .bdrv_co_get_block_status() for unaligned image size Image files with an unaligned image size have a final hole that starts at EOF, i.e. in the middle of a sector. Currently, *pnum == 0 is returned when checking the status of this sector. In qemu-img, this triggers an assertion failure. In order to fix this, one type for the sector that contains EOF must be found. Treating a hole as data is safe, so this patch rounds the calculated number of data sectors up, so that a partial sector at EOF is treated as a full data sector. This fixes https://bugzilla.redhat.com/show_bug.cgi?id=1229394 Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Tested-by: Cole Robinson <crobinso@redhat.com>	2015-06-12 15:54:01 +02:00
Fam Zheng	90df601f06	vmdk: Use vmdk_find_index_in_cluster everywhere Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-06-12 15:54:01 +02:00
Fam Zheng	61f0ed1d54	vmdk: Fix index_in_cluster calculation in vmdk_co_get_block_status It has the similar issue with `b1649fae49`. Since the calculation is repeated for a few times already, introduce a function so it can be reused. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-06-12 15:54:01 +02:00
Max Reitz	bc85ef265a	qcow2: Add DEFAULT_L2_CACHE_CLUSTERS If a relatively large cluster size is chosen, the default of 1 MB L2 cache is not really appropriate. In this case, unless overridden by the user, the default cache size should not be determined by its size in bytes but by the number of L2 tables (clusters) it is supposed to contain. Note that without this patch, MIN_L2_CACHE_SIZE will effectively take over the same role. However, providing space for just two L2 tables is not enough to be the default. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-06-12 15:54:01 +02:00
Max Reitz	57e2166959	qcow2: Set MIN_L2_CACHE_SIZE to 2 The L2 cache must cover at least two L2 tables, because during COW two L2 tables are accessed simultaneously. Reported-by: Alexander Graf <agraf@suse.de> Cc: qemu-stable <qemu-stable@nongnu.org> Signed-off-by: Max Reitz <mreitz@redhat.com> Tested-by: Alexander Graf <agraf@suse.de> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-06-12 15:54:00 +02:00
Alberto Garcia	b8fe1694e5	throttle: add the name of the ThrottleGroup to BlockDeviceInfo Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 172df91f09c69c6f0440a697bbd1b3f95b077ee4.1433779731.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-06-12 14:00:00 +01:00
Alberto Garcia	db6283385c	throttle: acquire the ThrottleGroup lock in bdrv_swap() bdrv_swap() touches the fields of a BlockDriverState that are protected by the ThrottleGroup lock. Although those fields end up in their original place, they are temporarily swapped in the process, so there's a chance that an operation on a member of the same group happening on a different thread can try to use them. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: d92dc40d7c4f1fc5cda5cbbf4ffb7a4670b79d17.1433779731.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-06-12 14:00:00 +01:00
Alberto Garcia	76f4afb40f	throttle: Add throttle group support The throttle group support use a cooperative round robin scheduling algorithm. The principles of the algorithm are simple: - Each BDS of the group is used as a token in a circular way. - The active BDS computes if a wait must be done and arms the right timer. - If a wait must be done the token timer will be armed so the token will become the next active BDS. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: f0082a86f3ac01c46170f7eafe2101a92e8fde39.1433779731.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-06-12 14:00:00 +01:00
Alberto Garcia	2ff1f2e3a3	throttle: Add throttle group infrastructure Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 2fdb4de17210b733a13eb472c33cd08b45f8fd21.1433779731.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-06-12 14:00:00 +01:00
Benoît Canet	0e5b0a2d54	throttle: Extract timers from ThrottleState into a separate structure Group throttling will share ThrottleState between multiple bs. As a consequence the ThrottleState will be accessed by multiple aio context. Timers are tied to their aio context so they must go out of the ThrottleState structure. This commit paves the way for each bs of a common ThrottleState to have its own timer. Signed-off-by: Benoit Canet <benoit.canet@nodalink.com> Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 6cf9ea96d8b32ae2f8769cead38f68a6a0c8c909.1433779731.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-06-12 14:00:00 +01:00
Kevin Wolf	f4a769abaa	raw-posix: Fix .bdrv_co_get_block_status() for unaligned image size Image files with an unaligned image size have a final hole that starts at EOF, i.e. in the middle of a sector. Currently, *pnum == 0 is returned when checking the status of this sector. In qemu-img, this triggers an assertion failure. In order to fix this, one type for the sector that contains EOF must be found. Treating a hole as data is safe, so this patch rounds the calculated number of data sectors up, so that a partial sector at EOF is treated as a full data sector. This fixes https://bugzilla.redhat.com/show_bug.cgi?id=1229394 Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-id: 1433840108-9996-1-git-send-email-kwolf@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-06-12 13:58:33 +01:00
Markus Armbruster	8809cfc38e	blkdebug: Simplify passing of Error through qemu_opts_foreach() Cc: Kevin Wolf <kwolf@redhat.com> Cc: qemu-block@nongnu.org Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Acked-by: Kevin Wolf <kwolf@redhat.com>	2015-06-09 07:40:23 +02:00
Markus Armbruster	28d0de7a4f	QemuOpts: Convert qemu_opts_foreach() to Error Retain the function value for now, to permit selective conversion of its callers. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Acked-by: Kevin Wolf <kwolf@redhat.com>	2015-06-09 07:37:37 +02:00
Markus Armbruster	a4c7367f7d	QemuOpts: Drop qemu_opts_foreach() parameter abort_on_failure When the argument is non-zero, qemu_opts_foreach() stops on callback returning non-zero, and returns that value. When the argument is zero, it doesn't stop, and returns the bit-wise inclusive or of all the return values. Funky :) The callers that pass zero could just as well pass one, because their callbacks can't return anything but zero: * qemu_add_globals()'s callback qdev_add_one_global() * qemu_config_write()'s callback config_write_opts() * main()'s callbacks default_driver_check(), drive_enable_snapshot(), vnc_init_func() Drop the parameter, and always stop. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Acked-by: Kevin Wolf <kwolf@redhat.com>	2015-06-08 19:33:20 +02:00
Fam Zheng	44f192f364	iscsi: Remove pointless runtime check of macro value raw_bsd already has QEMU_BUILD_BUG_ON(BDRV_SECTOR_SIZE != 512), so iscsi should relax. Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2015-06-03 14:21:23 +03:00
Daniel P. Berrange	8336aafae1	qcow2/qcow: protect against uninitialized encryption key When a qcow[2] file is opened, if the header reports an encryption method, this is used to set the 'crypt_method_header' field on the BDRVQcow[2]State struct, and the 'encrypted' flag in the BDRVState struct. When doing I/O operations, the 'crypt_method' field on the BDRVQcow[2]State struct is checked to determine if encryption needs to be applied. The crypt_method_header value is copied into crypt_method when the bdrv_set_key() method is called. The QEMU code which opens a block device is expected to always do a check if (bdrv_is_encrypted(bs)) { bdrv_set_key(bs, ....key...); } If code forgets to do this, then 'crypt_method' is never set and so when I/O is performed, QEMU writes plain text data into a sector which is expected to contain cipher text, or when reading, will return cipher text instead of plain text. Change the qcow[2] code to consult bs->encrypted when deciding whether encryption is required, and assert(s->crypt_method) to protect against cases where the caller forgets to set the encryption key. Also put an assert in the set_key methods to protect against the case where the caller sets an encryption key on a block device that does not have encryption Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-05-22 17:08:01 +02:00
Alberto Garcia	d1b4efe5c4	qcow2: style fixes in qcow2-cache.c Fix pointer declaration to make it consistent with the rest of the code. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-05-22 17:08:01 +02:00
Alberto Garcia	a3f1afb43a	qcow2: make qcow2_cache_put() a void function This function never receives an invalid table pointer, so we can make it void and remove all the error checking code. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-05-22 17:08:01 +02:00
Alberto Garcia	812e4082ca	qcow2: use a hash to look for entries in the L2 cache The current cache algorithm traverses the array starting always from the beginning, so the average number of comparisons needed to perform a lookup is proportional to the size of the array. By using a hash of the offset as the starting point, lookups are faster and independent from the array size. The hash is computed using the cluster number of the table, multiplied by 4 to make it perform better when there are collisions. In my tests, using a cache with 2048 entries, this reduces the average number of comparisons per lookup from 430 to 2.5. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-05-22 17:08:01 +02:00
Alberto Garcia	fdfbca82a0	qcow2: remove qcow2_cache_find_entry_to_replace() A cache miss means that the whole array was traversed and the entry we were looking for was not found, so there's no need to traverse it again in order to select an entry to replace. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-05-22 17:08:01 +02:00
Alberto Garcia	2693310ecc	qcow2: use an LRU algorithm to replace entries from the L2 cache The current algorithm to evict entries from the cache gives always preference to those in the lowest positions. As the size of the cache increases, the chances of the later elements of being removed decrease exponentially. In a scenario with random I/O and lots of cache misses, entries in positions 8 and higher are rarely (if ever) evicted. This can be seen even with the default cache size, but with larger caches the problem becomes more obvious. Using an LRU algorithm makes the chances of being removed from the cache independent from the position. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-05-22 17:08:01 +02:00

1 2 3 4 5 ...

2019 Commits