mirrors/qemu

mirror of https://github.com/qemu/qemu.git synced 2024-11-25 11:53:39 +08:00

Author	SHA1	Message	Date
Pavel Butsykin	46b732cdf3	qcow2: add shrink image support This patch adds shrinking of the image file for qcow2. As a result, this allows us to reduce the virtual image size and free up space on the disk without copying the image. The image can be fragmented, and the shrink is done by punching holes in the image file. Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20170918124230.8152-4-pbutsykin@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-09-26 15:00:32 +02:00
Pavel Butsykin	f71c08ea8e	qcow2: add qcow2_cache_discard Whenever l2/refcount table clusters are discarded from the file we can automatically drop unnecessary content of the cache tables. This reduces the chance of evicting useful cache data and eliminates inconsistency between the data in the cache and the data in the file. Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20170918124230.8152-3-pbutsykin@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-09-26 15:00:32 +02:00
Kevin Wolf	e0995dc3da	block: Add reopen_queue to bdrv_child_perm() In the context of bdrv_reopen(), we'll have to look at the state of the graph as it will be after the reopen. This interface addition is in preparation for the change. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2017-09-26 14:46:23 +02:00
Thomas Huth	7a6ab45e19	block: Clean up some bad code in the vvfat driver Remove the unnecessary home-grown redefinition of the assert() macro here, and remove the unusable debug code at the end of the checkpoint() function. The code there uses assert() with side-effects (assignment to the "mapping" variable), which should be avoided. Looking more closely, it seems as it is apparently also only usable for one certain directory layout (with a file named USB.H in it) and thus is of no use for the rest of the world. Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-09-26 14:46:23 +02:00
Manos Pitsidianakis	43a5dc02fd	block/throttle-groups.c: allocate RestartData on the heap RestartData is the opaque data of the throttle_group_restart_queue_entry coroutine. By being stack allocated, it isn't available anymore if aio_co_enter schedules the coroutine with a bottom half and runs after throttle_group_restart_queue returns. Cc: qemu-stable@nongnu.org Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-09-26 14:46:23 +02:00
Fam Zheng	97ec9117c3	file-posix: Clear out first sector in hdev_create People get surprised when, after "qemu-img create -f raw /dev/sdX", they still see qcow2 with "qemu-img info", if previously the bdev had a qcow2 header. While this is natural because raw doesn't need to write any magic bytes during creation, hdev_create is free to clear out the first sector to make sure the stale qcow2 header doesn't cause such confusion. Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-09-26 14:46:23 +02:00
Vladimir Sementsov-Ogievskiy	a693437037	block/nbd-client: nbd_co_send_request: fix return code It's incorrect to return success rc >= 0 if we skip qio_channel_writev_all() call due to s->quit. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20170920124507.18841-4-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2017-09-25 08:21:26 -05:00
Vladimir Sementsov-Ogievskiy	9397067221	block/nbd-client: simplify check in nbd_co_receive_reply If we are woken up from while() loop in nbd_read_reply_entry handles must be equal. If we are woken up from nbd_recv_coroutines_wake_all s->quit must be true, so we do not need checking handles equality. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20170920124507.18841-3-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2017-09-25 08:21:26 -05:00
Vladimir Sementsov-Ogievskiy	319a56cde7	block/nbd-client: refactor nbd_co_receive_reply "NBDReply *reply" parameter of nbd_co_receive_reply is used only to pass return value for nbd_co_request (reply.error). Remove it and use function return value instead. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20170920124507.18841-2-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2017-09-25 08:21:25 -05:00
Eric Blake	cfa3ad635c	nbd-client: Use correct macro parenthesization If 'bs' is a complex expression, we were only casting the front half rather than the full expression. Luckily, none of the callers were passing bad arguments, but it's better to be robust up front. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20170918214649.17550-1-eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-09-25 08:21:25 -05:00
Paolo Bonzini	7c9e527659	scsi, file-posix: add support for persistent reservation management It is a common requirement for virtual machine to send persistent reservations, but this currently requires either running QEMU with CAP_SYS_RAWIO, or using out-of-tree patches that let an unprivileged QEMU bypass Linux's filter on SG_IO commands. As an alternative mechanism, the next patches will introduce a privileged helper to run persistent reservation commands without expanding QEMU's attack surface unnecessarily. The helper is invoked through a "pr-manager" QOM object, to which file-posix.c passes SG_IO requests for PERSISTENT RESERVE OUT and PERSISTENT RESERVE IN commands. For example: $ qemu-system-x86_64 -device virtio-scsi \ -object pr-manager-helper,id=helper0,path=/var/run/qemu-pr-helper.sock -drive if=none,id=hd,driver=raw,file.filename=/dev/sdb,file.pr-manager=helper0 -device scsi-block,drive=hd or: $ qemu-system-x86_64 -device virtio-scsi \ -object pr-manager-helper,id=helper0,path=/var/run/qemu-pr-helper.sock -blockdev node-name=hd,driver=raw,file.driver=host_device,file.filename=/dev/sdb,file.pr-manager=helper0 -device scsi-block,drive=hd Multiple pr-manager implementations are conceivable and possible, though only one is implemented right now. For example, a pr-manager could: - talk directly to the multipath daemon from a privileged QEMU (i.e. QEMU links to libmpathpersist); this makes reservation work properly with multipath, but still requires CAP_SYS_RAWIO - use the Linux IOC_PR_* ioctls (they require CAP_SYS_ADMIN though) - more interestingly, implement reservations directly in QEMU through file system locks or a shared database (e.g. sqlite) Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-22 01:06:51 +02:00
Alistair Francis	b62e39b469	General warn report fixups Tidy up some of the warn_report() messages after having converted them to use warn_report(). Signed-off-by: Alistair Francis <alistair.francis@xilinx.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-Id: <9cb1d23551898c9c9a5f84da6773e99871285120.1505158760.git.alistair.francis@xilinx.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-19 14:09:34 +02:00
Alistair Francis	8297be80f7	Convert multi-line fprintf() to warn_report() Convert all the multi-line uses of fprintf(stderr, "warning:"..."\n"... to use warn_report() instead. This helps standardise on a single method of printing warnings to the user. All of the warnings were changed using these commands: find ./* -type f -exec sed -i \ 'N; {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \ {} + find ./* -type f -exec sed -i \ 'N;N; {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \ {} + find ./* -type f -exec sed -i \ 'N;N;N; {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \ {} + find ./* -type f -exec sed -i \ 'N;N;N;N {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \ {} + find ./* -type f -exec sed -i \ 'N;N;N;N;N {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \ {} + find ./* -type f -exec sed -i \ 'N;N;N;N;N;N {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \ {} + find ./* -type f -exec sed -i \ 'N;N;N;N;N;N;N; {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \ {} + Indentation fixed up manually afterwards. Some of the lines were manually edited to reduce the line length to below 80 characters. Some of the lines with newlines in the middle of the string were also manually edited to avoid checkpatch errors. The #include lines were manually updated to allow the code to compile. Several of the warning messages can be improved after this patch; to keep this patch mechanical, this has been moved into a later patch. Signed-off-by: Alistair Francis <alistair.francis@xilinx.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Max Reitz <mreitz@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Peter Maydell <peter.maydell@linaro.org> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Anthony Perard <anthony.perard@citrix.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Aurelien Jarno <aurelien@aurel32.net> Cc: Yongbok Kim <yongbok.kim@imgtec.com> Cc: Cornelia Huck <cohuck@redhat.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Alexander Graf <agraf@suse.de> Cc: Jason Wang <jasowang@redhat.com> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: Gerd Hoffmann <kraxel@redhat.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-Id: <5def63849ca8f551630c6f2b45bcb1c482f765a6.1505158760.git.alistair.francis@xilinx.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-19 14:09:34 +02:00
Alistair Francis	2ab4b13563	Convert single line fprintf(.../n) to warn_report() Convert all the single line uses of fprintf(stderr, "warning:"..."\n"... to use warn_report() instead. This helps standardise on a single method of printing warnings to the user. All of the warnings were changed using this command: find ./* -type f -exec sed -i \ 's|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig' \ {} + Some of the lines were manually edited to reduce the line length to below 80 characters. The #include lines were manually updated to allow the code to compile. Signed-off-by: Alistair Francis <alistair.francis@xilinx.com> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Max Reitz <mreitz@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Michael Roth <mdroth@linux.vnet.ibm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: Aurelien Jarno <aurelien@aurel32.net> Cc: Yongbok Kim <yongbok.kim@imgtec.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: James Hogan <james.hogan@imgtec.com> [mips] Message-Id: <ae8f8a7f0a88ded61743dff2adade21f8122a9e7.1505158760.git.alistair.francis@xilinx.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-19 14:09:34 +02:00
Alistair Francis	55d527a94d	Convert remaining error_report() to warn_report() In a previous patch (`3dc6f86936`) we converted uses of error_report("warning:"... to use warn_report() instead. This was to help standardise on a single method of printing warnings to the user. There appear to have been some cases that slipped through in patch sets applied around the same time; this patch catches the few remaining cases. All of the warnings were changed using this command: find ./* -type f -exec sed -i \ 's|error_report(".*warning[,:] |warn_report("|Ig' {} + Indentation fixed up manually afterwards. Two messages were manually fixed up as well. Signed-off-by: Alistair Francis <alistair.francis@xilinx.com> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Max Reitz <mreitz@redhat.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Cornelia Huck <cohuck@redhat.com> Cc: Alexander Graf <agraf@suse.de> Cc: Richard Henderson <rth@twiddle.net> Cc: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-Id: <eec8cba0d5434bd828639e5e45f12182490ff47d.1505158760.git.alistair.francis@xilinx.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-19 14:09:34 +02:00
Paolo Bonzini	08e2c9f19c	scsi: move block/scsi.h to include/scsi/constants.h Complete the transition by renaming this header, which was shared by block/iscsi.c and the SCSI emulation code. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-19 14:09:31 +02:00
Paolo Bonzini	e5b5728cd3	scsi: move non-emulation specific code to scsi/ util/scsi.c includes some SCSI code that is shared by block/iscsi.c and hw/scsi, but the introduction of the persistent reservation helper will add many more instances of this. There is also include/block/scsi.h, which actually is not part of the core block layer. The persistent reservation manager will also need a home. A scsi/ directory provides one for both the aforementioned shared code and the PR manager code. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-19 14:09:11 +02:00
Fam Zheng	2875135807	scsi: Refactor scsi sense interpreting code So that it can be reused outside of iscsi.c. Also update MAINTAINERS to include the new files in SCSI section. Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20170821141008.19383-2-famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-09-19 14:09:11 +02:00
Peter Maydell	75be9a52b1	nbd patches for 2017-09-06 - Daniel P. Berrange: [0/2] Fix / skip recent iotests with LUKS driver - Eric Blake: [0/3] nbd: Use common read/write-all qio functions -----BEGIN PGP SIGNATURE----- Comment: Public key at http://people.redhat.com/eblake/eblake.gpg iQEcBAABCAAGBQJZsBGjAAoJEKeha0olJ0NqVRoH/iiNEB2SlZFFl5W++wf3Ekq/ lvtZjK3rxpvRXvy6LiRsYVs27Etc8E9aSw2UK6aaqgA3qR8g3zdmwUZb9w3slkeI OXedt0fS5IpQ4UP0ORUBb/LgyOgW3uA0UjHBTEAKl0SyvFPx+TrTZXxqQUqlAc9A lFaA0g71xvfqWWhXmt0PQjRr9bBEpe+4L4NgOypa+Z3xbBAektx390S8N/b/P8fC FNwAqBPTY5XAgJGnEhL9EUOdUWnVgoyG1MR63puJzULYi+2+TlpR2w030qRif75b h7TqYUvwKLnoqMyhBb5LmyhcqwNdphz/1DsEudk18XGuvC94WYkopC3rT7TPWLs= =vGUc -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/ericb/tags/pull-nbd-2017-09-06' into staging nbd patches for 2017-09-06 - Daniel P. Berrange: [0/2] Fix / skip recent iotests with LUKS driver - Eric Blake: [0/3] nbd: Use common read/write-all qio functions # gpg: Signature made Wed 06 Sep 2017 16:17:55 BST # gpg: using RSA key 0xA7A16B4A2527436A # gpg: Good signature from "Eric Blake <eblake@redhat.com>" # gpg: aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>" # gpg: aka "[jpeg image of size 6874]" # Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2 F3AA A7A1 6B4A 2527 436A * remotes/ericb/tags/pull-nbd-2017-09-06: nbd: Use new qio_channel_*_all() functions io: Add new qio_channel_read{, v}_all_eof functions io: Yield rather than wait when already in coroutine iotests: blacklist 194 with the luks driver iotests: rewrite 192 to use _launch_qemu to fix LUKS support Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-09-07 17:53:59 +01:00
Peter Maydell	8ee5f9b3ec	Block layer patches -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJZr/vJAAoJEH8JsnLIjy/WrNMP/RMlpIfzjPTIKl1qwdxEbtEe kdsQulnSILVAWnXldB6xiQ8/epO2oTP+8sE9VCAoblQfJjD6RgffF1YCC7h1ZyBX 182ZnhapIwprH5RLKz/kgjfkx5/bCYjqpQ3JzznKJHNXJOAexznrYJMcbA2agfII 5qijA06dDoMIQTz49J2vvFAHrRUq/JqK85Ao8Zk41GDHDan5OfvQwsgt+Wa0V3vz mV6G1UsWCe4pmrv7v7/buhkVypy/BYz7vu6N20+2o3GDLwHmsgfKogUiSAC1N3iR olkeKtXdplY17iO6VgVrmFdkvaja0XCxYJjXnL54x/f1lQQQc01wUFNrh6WoIQLO Bl+XZ0oEQpFKJeBlu9mbDvgit0AGYE/yaLkCnfRFOU15lW5rjwqpF8husU0ntUcI TzGWt21kG0EXisejLMGEzEkMwkdhTwX6U+U7x5pF+x+pwSdcREDekeFcVhsb42Y/ brTgZCXdf32eJ8gOSzFoBJ5KfFaCqKgA6lWAv/kLsVs8DN+MnAv3SJGRBr22854W yJC5e3yLh36RVemjBqbqsU9VMD/P8fB3nJQwZRMyQh5A3RNxrK1y6e4XIqRwGqcC aj4cT2GbLWFH+EJUVdSRELmrLJPLyj5a1lU28Dq6b2Q34f9Hvg8GjSBOFf+4Vx6C N/z6+8O1mDtXdGHuCbmI =qJjo -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging Block layer patches # gpg: Signature made Wed 06 Sep 2017 14:44:41 BST # gpg: using RSA key 0x7F09B272C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * remotes/kevin/tags/for-upstream: qcow2: move qcow2_store_persistent_dirty_bitmaps() before cache flushing qemu-iotests: add 184 for throttle filter driver block: add throttle block filter driver block: convert ThrottleGroup to object with QOM block: tidy ThrottleGroupMember initializations block: add aio_context field in ThrottleGroupMember block: move ThrottleGroup membership to ThrottleGroupMember block: document semantics of bdrv_co_preadv|pwritev qcow: Check failure of bdrv_getlength() and bdrv_truncate() qcow: Change signature of get_cluster_offset() block: add default implementations for bdrv_co_get_block_status() block: remove bdrv_truncate callback in blkdebug block: remove unused bdrv_media_changed block: pass bdrv_* methods to bs->file by default in block filters Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-09-07 10:45:18 +01:00
Eric Blake	030fa7f6f9	nbd: Use new qio_channel_*_all() functions Rather than open-coding our own read/write-all functions, we can make use of the recently-added qio code. It slightly changes the error message in one of the iotests. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20170905191114.5959-4-eblake@redhat.com> Reviewed-by: Daniel P. Berrange <berrange@redhat.com>	2017-09-06 10:11:54 -05:00
Pavel Butsykin	83a8c775a8	qcow2: move qcow2_store_persistent_dirty_bitmaps() before cache flushing After calling qcow2_inactivate(), all qcow2 caches must be flushed, but this may not happen, because the last call, qcow2_store_persistent_dirty_bitmaps(), can lead to marking the l2/refcount cache as dirty. Let's move qcow2_store_persistent_dirty_bitmaps() before the cache flushing to fix it. Cc: qemu-stable@nongnu.org Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-09-06 14:40:18 +02:00
Manos Pitsidianakis	d8e7d87ec4	block: add throttle block filter driver block/throttle.c uses existing I/O throttle infrastructure inside a block filter driver. I/O operations are intercepted in the filter's read/write coroutines, and referred to block/throttle-groups.c. The driver can be used with the syntax -drive driver=throttle,file.filename=foo.qcow2,throttle-group=bar which registers the throttle filter node with the ThrottleGroup 'bar'. The given group must be created beforehand with object-add or -object. Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-09-06 10:12:02 +02:00
Manos Pitsidianakis	432d889e55	block: convert ThrottleGroup to object with QOM ThrottleGroup is converted to an object. This will allow the future throttle block filter driver easy creation and configuration of throttle groups in QMP and CLI. A new QAPI struct, ThrottleLimits, is introduced to provide a shared struct for all throttle configuration needs in QMP. ThrottleGroups can be created via CLI as -object throttle-group,id=foo,x-iops-total=100,x-.. where x-* are individual limit properties. Since we can't add non-scalar properties in -object this interface must be used instead. However, setting these properties must be disabled after initialization because certain combinations of limits are forbidden and thus configuration changes should be done in one transaction. The individual properties will go away when support for non-scalar values in CLI is implemented and thus are marked as experimental. ThrottleGroup also has a `limits` property that uses the ThrottleLimits struct. It can be used to create ThrottleGroups or set the configuration in existing groups as follows: { "execute": "object-add", "arguments": { "qom-type": "throttle-group", "id": "foo", "props" : { "limits": { "iops-total": 100 } } } } { "execute" : "qom-set", "arguments" : { "path" : "foo", "property" : "limits", "value" : { "iops-total" : 99 } } } This also means a group's configuration can be fetched with qom-get. Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-09-05 18:12:21 +02:00
Manos Pitsidianakis	f738cfc843	block: tidy ThrottleGroupMember initializations Move the CoMutex and CoQueue inits inside throttle_group_register_tgm() which is called whenever a ThrottleGroupMember is initialized. There's no need for them to be separate. Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-09-05 16:47:52 +02:00
Manos Pitsidianakis	c61791fc23	block: add aio_context field in ThrottleGroupMember timer_cb() needs to know about the current Aio context of the throttle request that is woken up. In order to make ThrottleGroupMember backend agnostic, this information is stored in an aio_context field instead of accessing it from BlockBackend. Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-09-05 16:47:52 +02:00
Manos Pitsidianakis	022cdc9f40	block: move ThrottleGroup membership to ThrottleGroupMember This commit eliminates the 1:1 relationship between BlockBackend and throttle group state. Users will be able to create multiple throttle nodes, each with its own throttle group state, in the future. The throttle group state cannot be per-BlockBackend anymore, it must be per-throttle node. This is done by gathering ThrottleGroup membership details from BlockBackendPublic into ThrottleGroupMember and refactoring existing code to use the structure. Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-09-05 16:47:51 +02:00
Peter Maydell	d3e3447d3d	Merge QEMU I/O 2017/09/05 v2 -----BEGIN PGP SIGNATURE----- iQIcBAABCAAGBQJZrpcMAAoJEL6G67QVEE/fgCIP/jFlfM7PNYc3UK4ePTG2gEnX zg1rXa8tg35Pvo2xF0E7pw+tevu6wghJPRegPip1qh7tEGHIm421ct0F2MyqByP2 NlgPxJMw7+klIg/Pmmt64gCybD1Sm//aEt1vaiJvG9unLWfpedQzhkc7L1MTpB6d r2k4PEdZPp+sQ9tXe1fRWbba548GYx4VnrSbe68+2pDMKSykcY4AEX+Mzr78aP/T yCABwz5tVlqOLjUTFVoV+zyDK0va0GxWXZW167olfIeZye4nuvW4oo3Q/ruFG6eo a1B/XVDHDlqC31pXttAP0izq4yRNcZXfgMSjZyfGUS8wjKAzP81uyE1H2gNGciTo pbcEKNhW+sjUV6ooTyHzD5pvRc/8lt/DG/FzMTmNZq6piMswPsrMRaAoQ9COOq3s Y28xQngCaw05zKYPfPU30y04OcDAA8x5iBuRR+iZzJcJO33gA7+kUO447ib2E7qL aDRR7FVhjbVRWkF5QTxjqq/9cuIqSu7vqS+CTgIoo+VEtdFsU4DlKZ07j1JfPHq0 1Tq6mMeAfBPUAjMp/UCMK2eF/BuUoeXV4jd4uOGvk+JDaROzA5VVhyOAXZLX0m90 auTh1L19j2d/8SLl+XMsTZViG5zD0X/Nukw98l0OMBAhfewYhNL/5PW16FldYnS9 SwqHcFxNRGABOoTsOKB/ =mnn5 -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/berrange/tags/pull-qio-20170905-2' into staging Merge QEMU I/O 2017/09/05 v2 # gpg: Signature made Tue 05 Sep 2017 13:22:36 BST # gpg: using RSA key 0xBE86EBB415104FDF # gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>" # gpg: aka "Daniel P. Berrange <berrange@redhat.com>" # Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E 8E3F BE86 EBB4 1510 4FDF * remotes/berrange/tags/pull-qio-20170905-2: io: fix check for handshake completion in TLS test io: add new qio_channel_{readv, writev, read, write}_all functions io: fix typo in docs comment for qio_channel_read util: remove the obsolete non-blocking connect io: fix temp directory used by test-io-channel-tls test Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-09-05 14:14:33 +01:00
Cao jin	b258793258	util: remove the obsolete non-blocking connect The non-blocking connect mechanism is obsolete, and it doesn't work well for inet connections, because it calls getaddrinfo first and getaddrinfo blocks on DNS lookups. Since commits `e65c67e4` & `d984464e`, the non-blocking connect of migration goes through QIOChannel in a different manner (using a thread), and nobody uses this old non-blocking connect anymore. Any newly written code which needs a non-blocking connect should use the QIOChannel code, so we can drop NonBlockingConnectHandler as a concept entirely. Suggested-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com> Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2017-09-05 13:21:58 +01:00
Eric Blake	d7a753a148	qcow: Check failure of bdrv_getlength() and bdrv_truncate() Omitting the check for whether bdrv_getlength() and bdrv_truncate() failed meant that it was theoretically possible to return an incorrect offset to the caller. More likely, conditions for either of these functions to fail would also cause one of our other calls (such as bdrv_pread() or bdrv_pwrite_sync()) to also fail, but auditing that we are safe is difficult compared to just patching things to always forward on the error rather than ignoring it. Use osdep.h macros instead of open-coded rounding while in the area. Reported-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-09-04 18:33:00 +02:00
Eric Blake	56439e9d55	qcow: Change signature of get_cluster_offset() The old signature has an ambiguous meaning for a return of 0: either no allocation was requested or necessary, or an error occurred (but any errno associated with the error is lost to the caller, which then has to assume EIO). Better is to follow the example of qcow2, by changing the signature to have a separate return value that cleanly distinguishes between failure and success, along with a parameter that cleanly holds a 64-bit value. Then update all callers. While auditing that all return paths return a negative errno (rather than -1), I also simplified places where we can pass NULL rather than a local Error that just gets thrown away. Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-09-04 18:33:00 +02:00
Manos Pitsidianakis	f7cc69b326	block: add default implementations for bdrv_co_get_block_status() bdrv_co_get_block_status_from_file() and bdrv_co_get_block_status_from_backing() set *file to bs->file and bs->backing respectively, so that bdrv_co_get_block_status() can recurse to them. Future block drivers won't have to duplicate code to implement this. Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-09-04 18:31:13 +02:00
Manos Pitsidianakis	d8e12cd322	block: remove bdrv_truncate callback in blkdebug Now that bdrv_truncate is passed to bs->file by default, remove the callback from block/blkdebug.c and set is_filter to true. is_filter also gives access to other callbacks that are forwarded automatically to bs->file for filters. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-09-04 18:31:13 +02:00
Manos Pitsidianakis	f024aee867	block: remove unused bdrv_media_changed This function is not used anywhere, so remove it. Markus Armbruster adds: The i82078 floppy device model used to call bdrv_media_changed() to implement its media change bit when backed by a host floppy. This went away in `21fcf36` "fdc: simplify media change handling". Probably broke host floppy media change. Host floppy pass-through was dropped in commit `f709623`. bdrv_media_changed() has never been used for anything else. Remove it. (Source is Message-ID: <87y3ruaypm.fsf@dusky.pond.sub.org>) Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-09-04 18:31:13 +02:00
Marc-André Lureau	ebf677c849	qapi: drop the sentinel in enum array Now that all usages have been converted to user lookup helpers. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20170822132255.23945-14-marcandre.lureau@redhat.com> [Rebased, superfluous local variable dropped, missing check-qom-proplist.c update added] Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1503564371-26090-17-git-send-email-armbru@redhat.com>	2017-09-04 13:09:13 +02:00
Marc-André Lureau	f7abe0ecd4	qapi: Change data type of the FOO_lookup generated for enum FOO Currently, a FOO_lookup is an array of strings terminated by a NULL sentinel. A future patch will generate enums with "holes". NULL-termination will cease to work then. To prepare for that, store the length in the FOO_lookup by wrapping it in a struct and adding a member for the length. The sentinel will be dropped next. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20170822132255.23945-13-marcandre.lureau@redhat.com> [Basically redone] Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1503564371-26090-16-git-send-email-armbru@redhat.com> [Rebased]	2017-09-04 13:09:13 +02:00
Markus Armbruster	977c736f80	qapi: Mechanically convert FOO_lookup[...] to FOO_str(...) Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1503564371-26090-14-git-send-email-armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>	2017-09-04 13:09:13 +02:00
Markus Armbruster	5b5f825d44	qapi: Generate FOO_str() macro for QAPI enum FOO The next commit will put it to use. May look pointless now, but we're going to change the FOO_lookup's type, and then it'll help. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1503564371-26090-13-git-send-email-armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>	2017-09-04 13:09:13 +02:00
Marc-André Lureau	8d5fb199fb	quorum: Use qapi_enum_parse() in quorum_open() Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20170822132255.23945-12-marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> [Rebased, qemu_opt_get() factored out, commit message tweaked] Cc: Alberto Garcia <berto@igalia.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1503564371-26090-9-git-send-email-armbru@redhat.com>	2017-09-04 13:09:13 +02:00
Marc-André Lureau	f9509d1517	block: Use qemu_enum_parse() in blkdebug_debug_breakpoint() The error message on invalid blkdebug events changes from qemu-system-x86_64: LOCATION: Invalid event name "VALUE" to qemu-system-x86_64: LOCATION: invalid parameter value: VALUE Slight degradation, but the message is sub-par even before the patch. When complaining about a parameter value, both parameter name and value should be mentioned, as the value may well not be unique. Left for another day. Also left is the error message's unhelpful location: it points to the config=FILENAME rather than into that file. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20170822132255.23945-11-marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> [Rebased, commit message rewritten] Cc: Kevin Wolf <kwolf@redhat.com> Cc: Max Reitz <mreitz@redhat.com> Cc: qemu-block@nongnu.org Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1503564371-26090-8-git-send-email-armbru@redhat.com>	2017-09-04 13:09:13 +02:00
Markus Armbruster	06c60b6c46	qapi: Drop superfluous qapi_enum_parse() parameter max The lookup tables have a sentinel, no need to make callers pass their size. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1503564371-26090-3-git-send-email-armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> [Rebased, commit message corrected]	2017-09-04 13:09:13 +02:00
Peter Maydell	223cd0e13f	Merge remote-tracking branch 'remotes/elmarco/tags/tidy-pull-request' into staging # gpg: Signature made Thu 31 Aug 2017 11:29:33 BST # gpg: using RSA key 0xDAE8E10975969CE5 # gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>" # gpg: aka "Marc-André Lureau <marcandre.lureau@gmail.com>" # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: 87A9 BD93 3F87 C606 D276 F62D DAE8 E109 7596 9CE5 * remotes/elmarco/tags/tidy-pull-request: (29 commits) eepro100: replace g_malloc()+memcpy() with g_memdup() test-iov: replace g_malloc()+memcpy() with g_memdup() i386: replace g_malloc()+memcpy() with g_memdup() i386: introduce ELF_NOTE_SIZE macro decnumber: use DIV_ROUND_UP kvm: use DIV_ROUND_UP i386/dump: use DIV_ROUND_UP ppc: use DIV_ROUND_UP msix: use DIV_ROUND_UP usb-hub: use DIV_ROUND_UP q35: use DIV_ROUND_UP piix: use DIV_ROUND_UP virtio-serial: use DIV_ROUND_UP console: use DIV_ROUND_UP monitor: use DIV_ROUND_UP virtio-gpu: use DIV_ROUND_UP vga: use DIV_ROUND_UP ui: use DIV_ROUND_UP vnc: use DIV_ROUND_UP vvfat: use DIV_ROUND_UP ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-08-31 15:52:43 +01:00
Peter Maydell	1d2a8e0690	Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging # gpg: Signature made Thu 31 Aug 2017 09:21:49 BST # gpg: using RSA key 0x9CA4ABB381AB73C8 # gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" # gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" # Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8 * remotes/stefanha/tags/block-pull-request: qcow2: allocate cluster_cache/cluster_data on demand qemu-doc: Add UUID support in initiator name tests: migration/guestperf Python 2.6 argparse compatibility docker.py: Python 2.6 argparse compatibility scripts: add argparse module for Python 2.6 compatibility misc: Remove unused Error variables oslib-posix: Print errors before aborting on qemu_alloc_stack() throttle: Test the valid range of config values throttle: Make burst_length 64bit and add range checks throttle: Make LeakyBucket.avg and LeakyBucket.max integer types throttle: Remove throttle_fix_bucket() / throttle_unfix_bucket() throttle: Make throttle_is_valid() a bit less verbose throttle: Update the throttle_fix_bucket() documentation throttle: Fix wrong variable name in the header documentation nvme: Fix get/set number of queues feature, again Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-08-31 14:33:54 +01:00
Marc-André Lureau	78ee96de64	vvfat: use DIV_ROUND_UP I used the clang-tidy qemu-round check to generate the fix: https://github.com/elmarco/clang-tools-extra Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net>	2017-08-31 12:29:07 +02:00
Marc-André Lureau	13f1493f82	vpc: use DIV_ROUND_UP I used the clang-tidy qemu-round check to generate the fix: https://github.com/elmarco/clang-tools-extra Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net>	2017-08-31 12:29:07 +02:00
Marc-André Lureau	21cf3e1201	qcow2: use DIV_ROUND_UP I used the clang-tidy qemu-round check to generate the fix: https://github.com/elmarco/clang-tools-extra Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net>	2017-08-31 12:29:07 +02:00
Marc-André Lureau	6fb0022b48	dmg: use DIV_ROUND_UP I used the clang-tidy qemu-round check to generate the fix: https://github.com/elmarco/clang-tools-extra Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net>	2017-08-31 12:29:07 +02:00
Marc-André Lureau	cf7a09c1e4	vhdx: use QEMU_ALIGN_DOWN I used the clang-tidy qemu-round check to generate the fix: https://github.com/elmarco/clang-tools-extra Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net>	2017-08-31 12:29:07 +02:00
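The rounding cleanups in the rows above replace open-coded arithmetic with named helpers. As a rough illustration only, the integer arithmetic behind DIV_ROUND_UP and QEMU_ALIGN_DOWN can be sketched in Python. The function names below are hypothetical stand-ins for the C macros, and the sample values are made up for the example.

```python
def div_round_up(n, d):
    """Ceiling division for non-negative integers, like (n + d - 1) / d in C."""
    return (n + d - 1) // d


def align_down(n, m):
    """Round n down to the nearest multiple of m."""
    return n // m * m


# A 1000-byte buffer needs two 512-byte sectors.
print(div_round_up(1000, 512))
# Offset 1000 rounded down to a 512-byte boundary.
print(align_down(1000, 512))
```

Naming the operation makes the intent visible at the call site, which is presumably why an automated check was used to find the open-coded variants.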
Vladimir Sementsov-Ogievskiy	f35dff7e13	block/nbd-client: refactor request send/receive Add nbd_co_request, to remove code duplications in nbd_client_co_{pwrite,pread,...} functions. Also this is needed for further refactoring. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20170804151440.320927-8-vsementsov@virtuozzo.com> [eblake: make nbd_co_request a wrapper, rather than merging two existing functions] Signed-off-by: Eric Blake <eblake@redhat.com>	2017-08-30 13:00:38 -05:00
Vladimir Sementsov-Ogievskiy	07b1b99c78	block/nbd-client: rename nbd_recv_coroutines_enter_all Rename nbd_recv_coroutines_enter_all to nbd_recv_coroutines_wake_all, as it most probably just adds all recv coroutines into co_queue_wakeup, rather than directly entering them. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20170804151440.320927-9-vsementsov@virtuozzo.com> [eblake: tweak commit message] Signed-off-by: Eric Blake <eblake@redhat.com>	2017-08-30 13:00:38 -05:00
Vladimir Sementsov-Ogievskiy	6faa077772	block/nbd-client: get rid of ssize_t Use int variable for nbd_co_send_request return value (as nbd_co_send_request returns int). Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20170804151440.320927-6-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2017-08-30 13:00:38 -05:00
Stefan Hajnoczi	3c2d5183f9	nbd-client: avoid read_reply_co entry if send failed The following segfault is encountered if the NBD server closes the UNIX domain socket immediately after negotiation: Program terminated with signal SIGSEGV, Segmentation fault. #0 aio_co_schedule (ctx=0x0, co=0xd3c0ff2ef0) at util/async.c:441 441 QSLIST_INSERT_HEAD_ATOMIC(&ctx->scheduled_coroutines, (gdb) bt #0 0x000000d3c01a50f8 in aio_co_schedule (ctx=0x0, co=0xd3c0ff2ef0) at util/async.c:441 #1 0x000000d3c012fa90 in nbd_coroutine_end (bs=bs@entry=0xd3c0fec650, request=<optimized out>) at block/nbd-client.c:207 #2 0x000000d3c012fb58 in nbd_client_co_preadv (bs=0xd3c0fec650, offset=0, bytes=<optimized out>, qiov=0x7ffc10a91b20, flags=0) at block/nbd-client.c:237 #3 0x000000d3c0128e63 in bdrv_driver_preadv (bs=bs@entry=0xd3c0fec650, offset=offset@entry=0, bytes=bytes@entry=512, qiov=qiov@entry=0x7ffc10a91b20, flags=0) at block/io.c:836 #4 0x000000d3c012c3e0 in bdrv_aligned_preadv (child=child@entry=0xd3c0ff51d0, req=req@entry=0x7f31885d6e90, offset=offset@entry=0, bytes=bytes@entry=512, align=align@entry=1, qiov=qiov@entry=0x7ffc10a91b20, flags=0) at block/io.c:1086 #5 0x000000d3c012c6b8 in bdrv_co_preadv (child=0xd3c0ff51d0, offset=offset@entry=0, bytes=bytes@entry=512, qiov=qiov@entry=0x7ffc10a91b20, flags=flags@entry=0) at block/io.c:1182 #6 0x000000d3c011cc17 in blk_co_preadv (blk=0xd3c0ff4f80, offset=0, bytes=512, qiov=0x7ffc10a91b20, flags=0) at block/block-backend.c:1032 #7 0x000000d3c011ccec in blk_read_entry (opaque=0x7ffc10a91b40) at block/block-backend.c:1079 #8 0x000000d3c01bbb96 in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at util/coroutine-ucontext.c:79 #9 0x00007f3196cb8600 in __start_context () at /lib64/libc.so.6 The problem is that nbd_client_init() uses nbd_client_attach_aio_context() -> aio_co_schedule(new_context, client->read_reply_co). Execution of read_reply_co is deferred to a BH which doesn't run until later. 
In the meantime blk_co_preadv() can be called and nbd_coroutine_end() calls aio_wake() on read_reply_co. At this point in time read_reply_co's ctx isn't set because it has never been entered yet. This patch simplifies the nbd_co_send_request() -> nbd_co_receive_reply() -> nbd_coroutine_end() lifecycle to just nbd_co_send_request() -> nbd_co_receive_reply(). The request is "ended" if an error occurs at any point. Callers no longer have to invoke nbd_coroutine_end(). This cleanup also eliminates the segfault because we don't call aio_co_schedule() to wake up s->read_reply_co if sending the request failed. It is only necessary to wake up s->read_reply_co if a reply was received. Note this only happens with UNIX domain sockets on Linux. It doesn't seem possible to reproduce this with TCP sockets. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20170829122745.14309-2-stefanha@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2017-08-30 13:00:37 -05:00
Stefan Hajnoczi	3e4c705212	qcow2: allocate cluster_cache/cluster_data on demand Most qcow2 files are uncompressed so it is wasteful to allocate (32 + 1) * cluster_size + 512 bytes upfront. Allocate s->cluster_cache and s->cluster_data when the first read operation is performed on a compressed cluster. The buffers are freed in .bdrv_close(). .bdrv_open() no longer has any code paths that can allocate these buffers, so remove the free functions in the error code path. This patch can result in significant memory savings when many qcow2 disks are attached or backing file chains are long: Before 12.81% (1,023,193,088B) After 5.36% (393,893,888B) Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru> Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20170821135530.32344-1-stefanha@redhat.com Cc: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-08-30 18:02:10 +01:00
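To make the memory arithmetic in the row above concrete, here is a minimal Python sketch, not the QEMU code. It assumes a 64 KiB cluster size, and the class name, method names, and the split of the total between the two buffers are hypothetical choices made for the illustration.

```python
CLUSTER_SIZE = 64 * 1024  # assumed cluster size for this example


def upfront_bytes(cluster_size):
    """Bytes given by the commit message's formula: (32 + 1) * cluster_size + 512."""
    return (32 + 1) * cluster_size + 512


class CompressedReadBuffers:
    """Hypothetical holder that allocates its buffers on first use."""

    def __init__(self, cluster_size):
        self.cluster_size = cluster_size
        self.cluster_cache = None
        self.cluster_data = None

    def ensure_allocated(self):
        # Called only on the compressed-read path, so images without
        # compressed clusters never pay for these buffers.
        if self.cluster_cache is None:
            self.cluster_cache = bytearray(self.cluster_size)
        if self.cluster_data is None:
            # Assumed split: the remaining 32 clusters plus 512 bytes.
            self.cluster_data = bytearray(32 * self.cluster_size + 512)

    def close(self):
        # Drop both buffers, as a close handler would.
        self.cluster_cache = None
        self.cluster_data = None


print(upfront_bytes(CLUSTER_SIZE))  # bytes saved per image that never reads compressed data
```

Under the assumed cluster size this comes to roughly 2 MiB per open image, which adds up quickly with many disks or long backing chains.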
Alberto Garcia	c3a8fe331e	misc: Remove unused Error variables There are a few cases in which we pass an Error pointer to a function only to discard it immediately afterwards without checking it. In these cases we can simply remove the variable and pass NULL instead. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20170829120836.16091-1-berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-08-30 11:58:26 +01:00
Stefan Hajnoczi	40f4a21895	nbd-client: avoid spurious qio_channel_yield() re-entry The following scenario leads to an assertion failure in qio_channel_yield(): 1. Request coroutine calls qio_channel_yield() successfully when sending would block on the socket. It is now yielded. 2. nbd_read_reply_entry() calls nbd_recv_coroutines_enter_all() because nbd_receive_reply() failed. 3. Request coroutine is entered and returns from qio_channel_yield(). Note that the socket fd handler has not fired yet so ioc->write_coroutine is still set. 4. Request coroutine attempts to send the request body with nbd_rwv() but the socket would still block. qio_channel_yield() is called again and assert(!ioc->write_coroutine) is hit. The problem is that nbd_read_reply_entry() does not distinguish between request coroutines that are waiting to receive a reply and those that are not. This patch adds a per-request bool receiving flag so nbd_read_reply_entry() can avoid spurious aio_wake() calls. Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20170822125113.5025-1-stefanha@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Tested-by: Eric Blake <eblake@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2017-08-23 11:22:15 -05:00
Fam Zheng	045a2f8254	mirror: Mark target BB as "force allow inactivate" Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20170823134242.12080-4-famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2017-08-23 10:21:55 -05:00
Fam Zheng	ca2e214411	block-backend: Allow more "can inactivate" cases These two conditions correspond to mirror job's source and target, which need to be allowed as they are part of the non-shared storage migration workflow: failing to inactivate either will result in a failure during migration completion. Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20170823134242.12080-3-famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> [eblake: improve comment grammar] Signed-off-by: Eric Blake <eblake@redhat.com>	2017-08-23 10:21:55 -05:00
Fam Zheng	c16de8f59a	block-backend: Refactor inactivate check The logic will be fixed (extended), move it to a separate function. Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20170823134242.12080-2-famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2017-08-23 10:21:55 -05:00
Igor Mammedov	d0a180131c	fix build failure in nbd_read_reply_entry() travis builds fail at HEAD at rc3 master with block/nbd-client.c: In function ‘nbd_read_reply_entry’: block/nbd-client.c:110:8: error: ‘ret’ may be used uninitialized in this function [-Werror=uninitialized] fix it by initializing 'ret' to 0 Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-08-23 12:24:41 +01:00
Eric Blake	72b6ffc766	nbd-client: Fix regression when server sends garbage When we switched NBD to use coroutines for qemu 2.9 (in particular, commit `a12a712a`), we introduced a regression: if a server sends us garbage (such as a corrupted magic number), we quit the read loop but do not stop sending further queued commands, resulting in the client hanging when it never reads the response to those additional commands. In qemu 2.8, we properly detected that the server is no longer reliable, and cancelled all existing pending commands with EIO, then tore down the socket so that all further command attempts get EPIPE. Restore the proper behavior of quitting (almost) all communication with a broken server: Once we know we are out of sync or otherwise can't trust the server, we must assume that any further incoming data is unreliable and therefore end all pending commands with EIO, and quit trying to send any further commands. As an exception, we still (try to) send NBD_CMD_DISC to let the server know we are going away (in part, because it is easier to do that than to further refactor nbd_teardown_connection, and in part because it is the only command where we do not have to wait for a reply). Based on a patch by Vladimir Sementsov-Ogievskiy. 
A malicious server can be created with the following hack, followed by setting NBD_SERVER_DEBUG to a non-zero value in the environment when running qemu-nbd: | --- a/nbd/server.c | +++ b/nbd/server.c | @@ -919,6 +919,17 @@ static int nbd_send_reply(QIOChannel *ioc, NBDReply *reply, Error **errp) | stl_be_p(buf + 4, reply->error); | stq_be_p(buf + 8, reply->handle); | | + static int debug; | + static int count; | + if (!count++) { | + const char *str = getenv("NBD_SERVER_DEBUG"); | + if (str) { | + debug = atoi(str); | + } | + } | + if (debug && !(count % debug)) { | + buf[0] = 0; | + } | return nbd_write(ioc, buf, sizeof(buf), errp); | } Reported-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20170814213426.24681-1-eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-08-15 10:03:28 -05:00
Fam Zheng	5f7772c4d0	block-backend: Defer shared_perm tightening migration completion As in the case of nbd_export_new(), bdrv_invalidate_cache() can be called when migration is still in progress. In this case we are not ready to tighten the shared permissions fenced by blk->disable_perm. Defer to a VM state change handler. Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20170815130740.31229-4-famz@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2017-08-15 10:03:28 -05:00
Fam Zheng	2b218f5dbc	file-posix: Do runtime check for ofd lock API It is reported that on Windows Subsystem for Linux, ofd operations fail with -EINVAL. In other words, a QEMU binary built with system headers that export F_OFD_SETLK doesn't necessarily run in an environment that actually supports it: $ qemu-system-aarch64 ... -drive file=test.vhdx,if=none,id=hd0 \ -device virtio-blk-pci,drive=hd0 qemu-system-aarch64: -drive file=test.vhdx,if=none,id=hd0: Failed to unlock byte 100 qemu-system-aarch64: -drive file=test.vhdx,if=none,id=hd0: Failed to unlock byte 100 qemu-system-aarch64: -drive file=test.vhdx,if=none,id=hd0: Failed to lock byte 100 As a matter of fact this is not WSL specific. It can happen when running a QEMU compiled against a newer glibc on an older kernel, such as in a containerized environment. Let's do a runtime check to cope with that. Reported-by: Andrew Baumann <Andrew.Baumann@microsoft.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-08-11 14:12:44 +02:00
Eric Blake	d0d5d0e31a	qcow2: Check failure of bdrv_getlength() qcow2_co_pwritev_compressed() should not call bdrv_truncate() if determining the size failed. Reported-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-08-11 13:23:47 +02:00
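Several rows in this part of the log fix the same pattern: a length query that can fail is used without checking its result. A minimal sketch of the guarded form follows, assuming callables that return a negative errno value on failure. The names `get_length` and `truncate` are hypothetical placeholders and not QEMU functions.

```python
import errno


def truncate_to_length(get_length, truncate):
    """Truncate only when the length query succeeded.

    get_length and truncate are hypothetical callables that return a
    negative errno value on failure.
    """
    length = get_length()
    if length < 0:
        return length  # propagate the error and never truncate
    return truncate(length)


# A failing length query must not reach the truncate step.
calls = []
result = truncate_to_length(lambda: -errno.EIO, lambda n: calls.append(n) or 0)
print(result == -errno.EIO, calls)
```

Without the check, a negative length would be passed on as if it were a size, which is how an unchecked failure can end up damaging an image file.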
Eric Blake	c40fe9c06c	qcow2: Drop debugging dump_refcounts() It's been #if 0'd since its introduction in 2006, commit `585f8587`. We can revive dead code if we need it, but in the meantime, it has bit-rotted (for example, not checking for failure in bdrv_getlength()). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-08-11 13:23:45 +02:00
Eric Blake	81caa3cc3b	vpc: Check failure of bdrv_getlength() vpc_open() was checking for bdrv_getlength() failure in one, but not the other, location. Reported-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-08-11 13:23:40 +02:00
Jeff Cody	113fe792fd	block/nfs: fix mutex assertion in nfs_file_close() Commit `c096358e74` introduced assertion checks for when qemu_mutex() functions are called without the corresponding qemu_mutex_init() having initialized the mutex. This uncovered a latent bug in qemu's nfs driver - in nfs_client_close(), the NFSClient structure is overwritten with zeros, prior to the mutex being destroyed. Go ahead and destroy the mutex in nfs_client_close(), and change where we call qemu_mutex_init() so that it is correctly balanced. There are also a couple of memory leaks obscured by the memset, so this fixes those as well. Finally, we should be able to get rid of the memset(), as it isn't necessary. Cc: qemu-stable@nongnu.org Signed-off-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Peter Lieven <pl@kamp.de> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-08-08 15:19:16 +02:00
Denis V. Lunev	e5e6268348	parallels: drop check that bdrv_truncate() is working This would actually be strange and error-prone. If truncate() fails nowadays, something is fatally wrong. Let's check for that during the actual work. The only fallback case is when the file is not zero initialized. In this case we should switch to preallocation via fallocate(). Signed-off-by: Denis V. Lunev <den@openvz.org> CC: Markus Armbruster <armbru@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-08-08 15:19:16 +02:00
Denis V. Lunev	d8b83e37c3	parallels: respect error code of bdrv_getlength() in allocate_clusters() If we can not get the file length, the state of BDS is broken completely. Return error to the caller. Signed-off-by: Denis V. Lunev <den@openvz.org> CC: Markus Armbruster <armbru@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2017-08-08 15:19:16 +02:00
Denis V. Lunev	70d9110b44	block: respect error code from bdrv_getlength in handle_aiocb_write_zeroes The original idea behind the code in question was the following: we have failed to write zeroes with fallocate(FALLOC_FL_ZERO_RANGE) as the simplest approach and via fallocate(FALLOC_FL_PUNCH_HOLE)/fallocate(0). We have only one chance left: if the request lies beyond the end of the file. Thus we should calculate file length and respect the error code from that op. Signed-off-by: Denis V. Lunev <den@openvz.org> CC: Markus Armbruster <armbru@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2017-08-08 15:19:16 +02:00
Fam Zheng	0e51b9b7c7	vmdk: Fix error handling/reporting of vmdk_check Errors from the callees must be captured and propagated to our caller; ensure this for both find_extent() and bdrv_getlength(). Reported-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-08-08 15:19:16 +02:00
Kevin Wolf	809eb70ed6	block/null: Remove 'filename' option This option was only added to allow 'null-co://' and 'null-aio://' as filenames, its value never served any actual purpose and was ignored. Nevertheless it was accepted as '-drive driver=null,filename=foo'. The correct way to enable the protocol prefixes (and that without adding a useless -drive option) is implementing .bdrv_parse_filename. This is what this patch does. Technically, this is an incompatible change, but the null block driver is only used for benchmarking, testing and debugging, and an option without effect isn't likely to be used by anyone anyway, so no bad effects are to be expected. Reported-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-08-08 15:19:16 +02:00
Jeff Cody	95d729835f	block/vhdx: check error return of bdrv_truncate() Signed-off-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-08-08 14:37:00 +02:00
Jeff Cody	c6572fa0d2	block/vhdx: check error return of bdrv_flush() Reported-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-08-08 14:37:00 +02:00
Jeff Cody	27539ac531	block/vhdx: check for offset overflow to bdrv_truncate() VHDX uses uint64_t types for most offsets, following the VHDX spec. However, bdrv_truncate() takes an int64_t value for the truncating offset. Check for overflow before calling bdrv_truncate(). While we are here, replace the bit shifting with QEMU_ALIGN_UP as well. N.B.: For a compliant image this is not an issue, as the maximum VHDX image size is defined per the spec to be 64TB. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-08-08 14:37:00 +02:00
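The overflow concern in the row above can be illustrated with a short Python sketch: an unsigned 64-bit offset is aligned up and then checked against the largest signed 64-bit value before it would be handed to a truncate call. The function names and the 1 MiB alignment are assumptions for the example, not values taken from the commit.

```python
INT64_MAX = 2**63 - 1
MIB = 1024 * 1024


def align_up(n, m):
    """Round n up to the nearest multiple of m."""
    return (n + m - 1) // m * m


def checked_truncate_offset(offset_u64, alignment=MIB):
    """Return the aligned offset, or None if it does not fit in int64_t."""
    aligned = align_up(offset_u64, alignment)
    if aligned > INT64_MAX:
        return None
    return aligned


print(checked_truncate_offset(64 * 2**40))  # 64 TiB fits comfortably
print(checked_truncate_offset(2**63))       # too large for a signed 64-bit value
```

Python integers do not wrap, so the check here is a plain comparison; in C the alignment step itself can overflow, so the range check there has to happen before the arithmetic.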
Jeff Cody	3f910692c2	block/vhdx: check error return of bdrv_getlength() Calls to bdrv_getlength() were not checking for error. In vhdx.c, this can lead to truncating an image file, so it is a definite bug. In vhdx-log.c, the path for improper behavior is less clear, but it is best to check in any case. Some minor code movement of the log_guid initialization, as well. Reported-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-08-08 14:37:00 +02:00
Alberto Garcia	795be0621a	quorum: Set sectors-count to 0 when reporting a flush error The QUORUM_REPORT_BAD event has fields to report the sector in which the error was detected and the number of affected sectors starting from that one. This is important for read and write errors, but not for flush errors. For flush errors the current code reports the total size of the disk image. That is however not useful information in this case. Moreover, the bdrv_getlength() call can fail, and there's no good way of handling that failure. Since we're reporting useless information and we cannot even guarantee to do it in a consistent way, this patch changes the code to report 0 instead in all cases. Reported-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-08-08 14:37:00 +02:00
Daniel P. Berrange	f42cf447e2	block: move trace probes into bdrv_co_preadv|pwritev There are trace probes in bdrv_co_readv|writev, however, the block drivers are being gradually moved over to using the bdrv_co_preadv|pwritev functions instead. As a result some block drivers miss the current probes. Move the probes into bdrv_co_preadv|pwritev instead, so that they are triggered by more (all?) I/O code paths. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 20170804105036.11879-1-berrange@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-08-07 09:39:35 +01:00
Peter Maydell	3b64f272d3	Block layer patches for 2.10.0-rc1 Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging Block layer patches for 2.10.0-rc1 # gpg: Signature made Tue 01 Aug 2017 17:10:52 BST # gpg: using RSA key 0x7F09B272C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * remotes/kevin/tags/for-upstream: block/qapi: Remove redundant NULL check to silence Coverity qemu-iotests/059: Fix leaked image files qemu-iotests/063: Fix leaked image qemu-iotests/162: Fix leaked temporary files qemu-iotests/153: Fix leaked scratch images qemu-iotests/141: Fix image cleanup qemu-iotests: Remove blkdebug.conf after tests qemu-iotests/041: Fix leaked scratch images block: fix leaks in bdrv_open_driver() block: fix dangling bs->explicit_options in block.c iotests: Add test of recent fix to 'qemu-img measure' iotests: Check dirty bitmap statistics in 124 iotests: Redirect stderr to stdout in 186 iotests: Fix test 156 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-08-01 17:27:36 +01:00
Kevin Wolf	8e8eb0a903	block/qapi: Remove redundant NULL check to silence Coverity When skipping implicit nodes in bdrv_block_device_info(), we know that bs0 is always non-NULL; initially, because it's taken from a BdrvChild and a BdrvChild never has a NULL bs, and after the first iteration because implicit nodes always have a backing file. Remove the NULL check and add an assertion that the implicit node does indeed have a backing file. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com>	2017-08-01 18:09:33 +02:00
Vladimir Sementsov-Ogievskiy	8908eb1a4a	trace-events: fix code style: print 0x before hex numbers The only exceptions are groups of numbers separated by symbols '.', ' ', ':', '/', like 'ab.09.7d'. This patch is made by the following: > find . -name trace-events | xargs python script.py where script.py is the following python script: ========================= #!/usr/bin/env python import sys import re import fileinput rhex = '%[-+ *.0-9]*(?:[hljztL]|ll|hh)?(?:x|X|"\s*PRI[xX][^"]*"?)' rgroup = re.compile('((?:' + rhex + '[.:/ ])+' + rhex + ')') rbad = re.compile('(?<!0x)' + rhex) files = sys.argv[1:] for fname in files: for line in fileinput.input(fname, inplace=True): arr = re.split(rgroup, line) for i in range(0, len(arr), 2): arr[i] = re.sub(rbad, '0x\g<0>', arr[i]) sys.stdout.write(''.join(arr)) ========================= Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Message-id: 20170731160135.12101-5-vsementsov@virtuozzo.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-08-01 12:13:07 +01:00
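The rewrite described in the row above can be tried on individual strings with a small Python sketch in the spirit of the quoted script. The regular expressions here are a best-effort reading and may differ from what was actually run. The function name `add_hex_prefix` and the sample format strings are made up for the illustration.

```python
import re

# One printf-style hex conversion: optional flags/width, optional length
# modifier, then x or X, or a string-concatenated PRIx macro such as %"PRIx64".
HEX = r'%[-+ *.0-9]*(?:[hljztL]|ll|hh)?(?:x|X|"\s*PRI[xX][^"]*"?)'
# Two or more hex conversions joined by '.', ':', '/' or ' ' form a group
# (for example a MAC address) and are left untouched.
GROUP = re.compile('((?:' + HEX + '[.:/ ])+' + HEX + ')')
# A hex conversion that is not already preceded by "0x".
BAD = re.compile('(?<!0x)' + HEX)


def add_hex_prefix(line):
    """Prefix lone hex conversions with 0x, skipping separator-joined groups."""
    parts = GROUP.split(line)
    # With one capturing group, even indices hold the text outside groups.
    for i in range(0, len(parts), 2):
        parts[i] = BAD.sub(r'0x\g<0>', parts[i])
    return ''.join(parts)


print(add_hex_prefix('value %x'))
print(add_hex_prefix('mac %02x:%02x:%02x'))
```

Splitting on the group pattern first keeps the exception simple: grouped conversions never reach the substitution, so no extra lookahead logic is needed.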
Vladimir Sementsov-Ogievskiy	db73ee4bc8	trace-events: fix code style: %# -> 0x% In trace format '#' flag of printf is forbidden. Fix it to '0x%'. This patch is created by the following: check that we have a problem > find . -name trace-events | xargs grep '%#' | wc -l 56 check that there are no cases with additional printf flags before '#' > find . -name trace-events | xargs grep "%[-+ 0'I]+#" | wc -l 0 check that there is no wrong usage of '#' and '0x' together > find . -name trace-events | xargs grep '0x%#' | wc -l 0 fix the problem > find . -name trace-events | xargs sed -i 's/%#/0x%/g' [Eric Blake noted that xargs grep '%[-+ 0'I]+#' should be xargs grep "%[-+ 0'I]+#" instead so the shell quoting is correct. --Stefan] Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20170731160135.12101-3-vsementsov@virtuozzo.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-08-01 12:13:07 +01:00
Philippe Mathieu-Daudé	87e0331c5a	docs: fix broken paths to docs/devel/tracing.txt With the move of some docs/ to docs/devel/ on `ac06724a71`, no references were updated. Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2017-07-31 13:12:53 +03:00
Philippe Mathieu-Daudé	f80ac75d0e	qcow2: fix null pointer dereference It seems this assert() was somehow misplaced. block/qcow2-refcount.c:2193:42: warning: Array access (from variable 'on_disk_reftable') results in a null pointer dereference on_disk_reftable[refblock_index] = refblock_offset; ~~~~~~~~~~~~~~~~ ^ Reported-by: Clang Static Analyzer Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2017-07-31 13:06:38 +03:00
Vladimir Sementsov-Ogievskiy	b6b75a99da	qcow2-bitmap: fix bitmap_free Fix possible crash on error path in qcow2_remove_persistent_dirty_bitmap. Although bitmap_free was added in `88ddffae8f` the bug was introduced later in commit `469c71edc7` (when qcow2_remove_persistent_dirty_bitmap was added). Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20170714123341.373857-1-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-25 16:33:31 +02:00
Daniel P. Berrange	0696ae2c92	qcow: fix memory leaks related to encryption Fix leak of the 'encryptopts' string, which was mistakenly declared const. Fix leak of QemuOpts entry which should not have been deleted from the opts array. Reported by: coverity Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 20170714103105.5781-1-berrange@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-25 16:33:31 +02:00
Kevin Wolf	d3c8c67469	block: Skip implicit nodes in query-block/blockstats Commits `0db832f` and `6cdbceb` introduced the automatic insertion of filter nodes above the top layer of mirror and commit block jobs. The assumption made there was that since libvirt doesn't do node-level management of the block layer yet, it shouldn't be affected by added nodes. This is true as far as commands issued by libvirt are concerned. It only uses BlockBackend names to address nodes, so any operations it performs still operate on the root of the tree as intended. However, the assumption breaks down when you consider query commands, which return data for the wrong node now. These commands also return information on some child nodes (bs->file and/or bs->backing), which libvirt does make use of, and which refer to the wrong nodes, too. One of the consequences is that oVirt gets wrong information about the image size and stops the VM in response as long as a mirror or commit job is running: https://bugzilla.redhat.com/show_bug.cgi?id=1470634 This patch fixes the problem by hiding the implicit nodes created automatically by the mirror and commit block jobs in the output of query-block and BlockBackend-based query-blockstats as long as the user doesn't indicate that they are aware of those nodes by providing a node name for them in the QMP command to start the block job. The node-based commands query-named-block-nodes and query-blockstats with query-nodes=true still show all nodes, including implicit ones. This ensures that users that are capable of node-level management can still access the full information; users that only know BlockBackends won't use these commands. Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Tested-by: Eric Blake <eblake@redhat.com>	2017-07-24 15:06:04 +02:00
Eric Blake	24bae02b19	qcow2: Fix sector calculation in qcow2_measure() We used MAX() instead of the intended MIN() when computing how many sectors to view in the current loop iteration of qcow2_measure(), and passed in a value of INT_MAX sectors instead of our more usual limit of BDRV_REQUEST_MAX_SECTORS (the latter avoids 32-bit overflow on conversion to bytes). For small files, the bug is harmless: bdrv_get_block_status_above() clamps its *pnum answer to the BDS size, regardless of any insanely larger input request. However, for any file at least 2T in size, we can very easily end up going into an infinite loop (the maximum of 0x100000000 sectors and INT_MAX is a 64-bit quantity, which becomes 0 when assigned to int; once nb_sectors is 0, we never make progress). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-24 15:06:04 +02:00
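The overflow described in 24bae02b19 can be reproduced with a small Python sketch that mimics assigning a 64-bit value to a C `int`. It is a simplification of the loop in qcow2_measure(), not the actual code. The function names are hypothetical, and the value used for `BDRV_REQUEST_MAX_SECTORS` is an assumption chosen so that the byte count fits in 32 bits.

```python
import ctypes

INT_MAX = 2**31 - 1
BDRV_SECTOR_BITS = 9  # 512-byte sectors
# Assumed value for illustration: largest sector count whose byte size fits in int.
BDRV_REQUEST_MAX_SECTORS = INT_MAX >> BDRV_SECTOR_BITS


def to_c_int(value):
    """Truncate value the way an assignment to a 32-bit C int would."""
    return ctypes.c_int32(value).value


def nb_sectors_buggy(remaining_sectors):
    # MAX() picks the larger operand, which may not fit in an int.
    return to_c_int(max(remaining_sectors, INT_MAX))


def nb_sectors_fixed(remaining_sectors):
    # MIN() clamps the request, so the result always fits.
    return to_c_int(min(remaining_sectors, BDRV_REQUEST_MAX_SECTORS))


two_tib_in_sectors = (2 * 2**40) >> BDRV_SECTOR_BITS  # 0x100000000
```

With a 2 TiB image the buggy variant yields 0 sectors per iteration, so the loop never advances, while the fixed variant yields a positive, bounded count.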
Eric Blake	6c98c57af3	dirty-bitmap: Report BlockDirtyInfo.count in bytes, as documented We've been documenting the value in bytes since its introduction in commit `b9a9b3a4` (v1.3), where it was actually reported in bytes. Commit `e4654d2` (v2.0) then removed things from block/qapi.c, in preparation for a rewrite to a list of dirty sectors in the next commit `21b5683` in block.c, but the new code mistakenly started reporting in sectors. Fixes: https://bugzilla.redhat.com/1441460 CC: qemu-stable@nongnu.org Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-24 15:06:04 +02:00
Mark Cave-Ayland	c8115f8eb8	block/vpc: fix uninitialised variable compiler warning Since commit `cfc87e00` "block/vpc.c: Handle write failures in get_image_offset()" older versions of gcc (in this case 4.7) incorrectly warn that "ret" can be used uninitialised in vpc_co_pwritev(). Setting ret to 0 at the start of vpc_co_pwritev() prevents the warning in gcc 4.7 and enables compilation with -Werror to succeed. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1500625265-23844-1-git-send-email-mark.cave-ayland@ilande.co.uk Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-07-21 15:00:07 +01:00
Max Reitz	7c8730d45f	block/vvfat: Fix compiler warning with gcc 7 gcc 7 complains that the sprintf() might write a null byte beyond the end of the tail buffer. That is wrong, but we can silence it by making i unsigned (it can never be negative anyway, see the if condition right before). For some reason, this allows gcc to suddenly accurately calculate the range of i so we can give the tail[] array the exact size it needs to have (which is 8 bytes) without gcc complaining. In addition, let us convert the sprintf() to snprintf(), because that is always nicer, and add an assertion about the range of the return value afterwards so we can see that "8 - len" will never be negative and thus "entry->name + MIN(j, 8 - len)" will never be out of bounds. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-18 15:14:36 +02:00
Hervé Poussineau	f80256b7ee	vvfat: initialize memory after allocating it This prevents some host to guest memory content leaks. Fixes: https://bugs.launchpad.net/qemu/+bug/1599539 Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-18 15:14:36 +02:00
Hervé Poussineau	e03da26b71	vvfat: correctly parse non-ASCII short and long file names Write support works again when image contains non-ASCII names. It is either the case when user created a non-ASCII filename, or when initial directory contained a non-ASCII filename (since `0c36111f57`) Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-18 15:14:36 +02:00
Hervé Poussineau	63d261cb0d	vvfat: add a constant for bootsector name Also add links to related compatibility problems. Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-18 15:14:36 +02:00
Hervé Poussineau	8c4517fd6e	vvfat: add constants for special values of name[0] Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-18 15:14:36 +02:00
Kevin Wolf	ec18b0a93a	block: List anonymous device BBs in query-block Instead of listing only monitor-owned BlockBackends in query-block, also add those anonymous BlockBackends that are owned by a qdev device and as such under the control of the user. This allows using query-block to inspect BlockBackends for the modern configuration syntax with -blockdev and -device. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com>	2017-07-18 15:14:36 +02:00
Kevin Wolf	d5b68844e6	block/qapi: Use blk_all_next() for query-block This patch replaces the blk_next() loop in query-block by a blk_all_next() one so that we also get access to BlockBackends that aren't owned by the monitor. For now, the next thing we do is check whether each BB has a name, so there is no semantic difference. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com>	2017-07-18 15:14:36 +02:00
Kevin Wolf	a429b9b5f4	block: Make blk_all_next() public Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com>	2017-07-18 15:14:36 +02:00
Kevin Wolf	46eade7be8	block/qapi: Add qdev device name to query-block With -blockdev/-device, users can indirectly create anonymous BlockBackends, while the state of such backends is still of interest. As a preparation for making such BBs visible in query-block, make sure that they can be identified even without a name by adding the ID/QOM path of their qdev device to BlockInfo. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com>	2017-07-18 15:14:35 +02:00
Kevin Wolf	77beef8365	block: Make blk_get_attached_dev_id() public Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com>	2017-07-18 15:14:35 +02:00
Peter Maydell	cfc87e00c2	block/vpc.c: Handle write failures in get_image_offset() Coverity (CID 1355236) points out that get_image_offset() doesn't check that it actually succeeded in writing the updated block bitmap to the file. Check the error return from bdrv_pwrite_sync() and propagate an error response back up to the function which calls get_image_offset() for a write so that it can return the error to its caller. get_sector_offset() is only used for reads, but we move it to the same API for consistency. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-18 15:14:35 +02:00
Peter Maydell	9877860e7b	block/vmdk: Report failures in vmdk_read_cid() The function vmdk_read_cid() can fail if the read on the underlying block device fails, or if there's a format error in the VMDK file. However its API doesn't provide a mechanism to report these errors, and in some cases we were returning a CID of 0 and in some cases a CID of 0xffffffff, either of which might potentially be valid values. Change the function to return 0 on success or a negative errno, and return the CID via a uint32_t* argument. Update the callsites to handle and propagate the error appropriately. This fixes in passing a Coverity-spotted issue (CID 1350038) where we weren't checking the return value from sscanf(). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-18 15:14:35 +02:00
Manos Pitsidianakis	27e4cf1303	block: remove timer canceling in throttle_config() throttle_config() cancels the timers of the calling BlockBackend. This doesn't make sense because other BlockBackends in the group remain untouched. There's no need to cancel the timers in the one specific BlockBackend so let's not do that. Throttled requests will run as scheduled and future requests will follow the new configuration. This also allows a throttle group's configuration to be changed even when it has no members. Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-18 15:14:35 +02:00
Manos Pitsidianakis	dbe824cc57	block: add clock_type field to ThrottleGroup Clock type in throttling is currently inferred by the ThrottleTimer's clock type even though it is a per-ThrottleGroup property; it doesn't make sense to have different clock types in the same group. Moving this to a field in ThrottleGroup can simplify some of the throttle functions. Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-18 15:14:35 +02:00
Kevin Wolf	b1e1fa0c3a	commit: Add NULL check for overlay_bs I can't see how overlay_bs could become NULL with the current code, but other code in this function already checks it and we can make Coverity happy with this check, so let's add it. Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-07-18 15:14:35 +02:00
Peter Maydell	718d7f4f9c	-----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJZbNpiAAoJEJykq7OBq3PIKcIIAMEJpEiWQonSJZVV4fxqcbOF dXJSVxHqrrVUrcM8NY2zGXIwcGS8RBNZG+Yx/SEZgIljoYH4NFmbvKXWS2zgHSyr LUH0M6gYlQ/1vQTXwQrkJdtmgfc3xNVrQbanbynK3+aB1S5Y6pRGauDo8SqCBWu0 uLWkhcSQbG+OHD8Go5X1kZUSdpP8yOqKrxcNLe980ghi4HPMUydL3lbs4SwNlnRt NJIpMTGzJrL+CqyakIL+/PT9RBGCo4hllPD0CgX6HETEkuojxxXaqJIG+Tzj2FeU fXkoK1YQHEHLdVXnPkpModoPylhRqIQcPXoXt+aMvoLSM+bLooYXMYryMPoPDGI= =VgCF -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging # gpg: Signature made Mon 17 Jul 2017 16:40:18 BST # gpg: using RSA key 0x9CA4ABB381AB73C8 # gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" # gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" # Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8 * remotes/stefanha/tags/block-pull-request: block: fix shadowed variable in bdrv_co_pdiscard util/aio-win32: Only select on what we are actually waiting for Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-07-18 13:09:51 +01:00
Denis V. Lunev	593ed6f0a3	block: fix shadowed variable in bdrv_co_pdiscard We've had a shadowed 'ret' variable, which risks returning the wrong value, introduced in commit `b9c64947`. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20170710150559.30163-1-den@openvz.org CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Eric Blake <eblake@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-07-17 15:58:37 +01:00
Paolo Bonzini	5aca18a4ff	ssh: support I/O from any AioContext The coroutine may run in a different AioContext, causing the fd handler to busy wait. Fix this by resetting the handler in restart_coroutine, before the coroutine is restarted. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20170629132749.997-12-pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2017-07-17 11:34:20 +08:00
Paolo Bonzini	f1af3251f8	sheepdog: add queue_lock Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20170629132749.997-11-pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2017-07-17 11:34:20 +08:00
Paolo Bonzini	1f01e50b83	qed: protect table cache with CoMutex This makes the driver thread-safe. The CoMutex is dropped temporarily while accessing the data clusters or the backing file. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20170629132749.997-10-pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2017-07-17 11:34:11 +08:00
Paolo Bonzini	61c7887e0f	qed: introduce bdrv_qed_init_state This will be used in the next patch, which will call bdrv_qed_do_open with a CoMutex taken. bdrv_qed_init_state provides a nice place to initialize it. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20170629132749.997-9-pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2017-07-17 11:33:11 +08:00
Paolo Bonzini	61124f03ab	block: invoke .bdrv_drain callback in coroutine context and from AioContext This will let the callback take a CoMutex in the next patch. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20170629132749.997-8-pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2017-07-17 11:28:15 +08:00
Paolo Bonzini	e7569c1829	qed: move tail of qed_aio_write_main to qed_aio_write_{cow, alloc} This part is never called for in-place writes, move it away to avoid the "backwards" coding style typical of callback-based code. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20170629132749.997-7-pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2017-07-17 11:28:15 +08:00
Paolo Bonzini	254aee4dbb	vvfat: make it thread-safe Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20170629132749.997-6-pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2017-07-17 11:28:15 +08:00
Paolo Bonzini	778b087e51	vpc: make it thread-safe Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20170629132749.997-5-pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2017-07-17 11:28:15 +08:00
Paolo Bonzini	1e88663979	vdi: make it thread-safe The VirtualBox driver is using a mutex to order all allocating writes, but it is not protecting accesses to the bitmap because they implicitly happen under the AioContext mutex. Change this to use a CoRwlock explicitly. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20170629132749.997-4-pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2017-07-17 11:28:15 +08:00
Paolo Bonzini	a8c57408cd	qcow2: call CoQueue APIs under CoMutex Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20170629132749.997-2-pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2017-07-17 11:28:15 +08:00
Peter Maydell	6c6076662d	* gdbstub fixes (Alex) * IOMMU MemoryRegion subclass (Alexey) * Chardev hotswap (Anton) * NBD_OPT_GO support (Eric) * Misc bugfixes * DEFINE_PROP_LINK (minus the ARM patches - Fam) * MAINTAINERS updates (Philippe) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJZaJejAAoJEL/70l94x66DwQ4H/0NUvh/Zfs64wE1iuZJACc24 1za02fFaB50vFDwQKWbM0GkHzDxoXBHk4Rvn92p+VSxpKtaAX4GRwCvxRA5GeUtm GAYbdIJUe0UELepKExrlUVzQcK9VfljoJpK3dZkP5Zzx83L2PAI/SexrZRibN2Uf yRI60uvlsMWU12nenzdVnYORd+TWDNKele7BhMrX/FX9wxaS1PlnsnKZggy6CU7G 8dwZJAZJ/s5tRGXyXyAQzLm5JZQCLnA6jxya540TbPeciFgbvvS2ydIitZ54vSPO VtmZ1rSWfTEbNF5xGD1Ztu8aAENr5/I05l6IjxZd45BdUCW3HxeJkc+7lE0K4uk= =wnVs -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging * gdbstub fixes (Alex) * IOMMU MemoryRegion subclass (Alexey) * Chardev hotswap (Anton) * NBD_OPT_GO support (Eric) * Misc bugfixes * DEFINE_PROP_LINK (minus the ARM patches - Fam) * MAINTAINERS updates (Philippe) # gpg: Signature made Fri 14 Jul 2017 11:06:27 BST # gpg: using RSA key 0xBFFBD25F78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: (55 commits) spapr_rng: Convert to DEFINE_PROP_LINK cpu: Convert to DEFINE_PROP_LINK mips_cmgcr: Convert to DEFINE_PROP_LINK ivshmem: Convert to DEFINE_PROP_LINK dimm: Convert to DEFINE_PROP_LINK virtio-crypto: Convert to DEFINE_PROP_LINK virtio-rng: Convert to DEFINE_PROP_LINK virtio-scsi: Convert to DEFINE_PROP_LINK virtio-blk: Convert to DEFINE_PROP_LINK qdev: Add const qualifier to PropertyInfo definitions qmp: Use ObjectProperty.type if present qdev: Introduce DEFINE_PROP_LINK qdev: Introduce PropertyInfo.create qom: enforce readonly nature of link's check callback translate-all: remove redundant !tcg_enabled check in dump_exec_info vl: fix breakage of -tb-size nbd: Implement NBD_INFO_BLOCK_SIZE on client nbd: Implement NBD_INFO_BLOCK_SIZE on server nbd: Implement NBD_OPT_GO on client nbd: Implement NBD_OPT_GO on server ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-07-14 12:16:09 +01:00
Eric Blake	081dd1fe36	nbd: Implement NBD_INFO_BLOCK_SIZE on client The upstream NBD Protocol has defined a new extension to allow the server to advertise block sizes to the client, as well as a way for the client to inform the server whether it intends to obey block sizes. When using the block layer as the client, we will obey block sizes; but when used as 'qemu-nbd -c' to hand off to the kernel nbd module as the client, we are still waiting for the kernel to implement a way for us to learn if it will honor block sizes (perhaps by an addition to sysfs, rather than an ioctl), as well as any way to tell the kernel what additional block sizes to obey (NBD_SET_BLKSIZE appears to be accurate for the minimum size, but preferred and maximum sizes would probably be new ioctl()s), so until then, we need to make our request for block sizes conditional. When using ioctl(NBD_SET_BLKSIZE) to hand off to the kernel, use the minimum block size as the sector size if it is larger than 512, which also has the nice effect of cooperating with (non-qemu) servers that don't do read-modify-write when exposing a block device with 4k sectors; it might also allow us to visit a file larger than 2T on a 32-bit kernel. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20170707203049.534-10-eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-07-14 12:04:42 +02:00
Eric Blake	004a89fce9	nbd: Create struct for tracking export info The NBD Protocol is introducing some additional information about exports, such as minimum request size and alignment, as well as an advertised maximum request size. It will be easier to feed this information back to the block layer if we gather all the information into a struct, rather than adding yet more pointer parameters during negotiation. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20170707203049.534-2-eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-07-14 12:04:41 +02:00
Peter Maydell	a309b290aa	Error reporting patches for 2017-07-13 -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJZZ1/BAAoJEDhwtADrkYZTo7oP+gLj4B4kkp/DJnkzfuMMD1Ce ZPddZ8Z9RyXE4fS66sq1ODBQo5U+aQQZO7K234+jf8V4cKWW98lpVzLc3YdAHm2U ZF6Z9Rji5K4414ZsUcg92Zlovvdaji+mY0ooINav+4mqlONYrz29ntApWc0e0tGc e3tj4XDLhJrOM+mIx8vzixFlgSYj+6HgEiybYwolEK5svQbIQao3Y2omyb+zy0w0 RDT3XQnAAaZSOQAXcJGkhekkyMe0jMHOF0tULLx1uDQYctg9mUGlAGTZ5oTLgSve TCpSJwWCAx8XAJMkXyDRrdRFDLeUh6yGY7NTqAL3OuPSoAw9ygKrHyhTavxBJL+W rX7Qit3dmVrlZLviwNFQplAKYb10d08vBoKXmrnW5oVCmPEDvJIQfncbucpA/CNS ucdJ3RMLuDbbWdl+5tsL7jfiZAG7oSgAePTjN1rm0bDe5JN7NAU8WzHnKfE83iZq R+I3hofqGoiXSByYRLamZb+6nsURAxWPhcqcw7hdMsk7UI6dyZwWl9Fnm72w0BZK M5LHLkX0LYc+kZjiLKXlNK7Z50bXY0zKQpPCLH3nHA69iMiwVoozrjwa9iCKIxE+ 7ZlOfsu4ztExuicEyTr8b27CBrHjJjYDuFP0hroEOzqCKXUzegoq3oYMGP0doXxe o3xcwXVKT/1PudddyR4z =tByN -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/armbru/tags/pull-error-2017-07-13' into staging Error reporting patches for 2017-07-13 # gpg: Signature made Thu 13 Jul 2017 12:55:45 BST # gpg: using RSA key 0x3870B400EB918653 # gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" # gpg: aka "Markus Armbruster <armbru@pond.sub.org>" # Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653 * remotes/armbru/tags/pull-error-2017-07-13: Convert error_report_err() to warn_report_err() error: Implement the warn and free Error functions char-socket: Report TCP socket waiting as information Convert error_report() to warn_report() error: Functions to report warnings and informational messages util/qemu-error: Rename error_print_loc() to be more generic websock: Don't try to set errp directly block: Don't try to set errp directly xilinx: Fix latent error handling bug Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-07-14 09:36:40 +01:00
Alistair Francis	3dc6f86936	Convert error_report() to warn_report() Convert all uses of error_report("warning:"... to use warn_report() instead. This helps standardise on a single method of printing warnings to the user. All of the warnings were changed using these two commands: find ./* -type f -exec sed -i \ 's|error_report(".*warning[,:] |warn_report("|Ig' {} + Indentation fixed up manually afterwards. The test-qdev-global-props test case was manually updated to ensure that this patch passes make check (as the test cases are case sensitive). Signed-off-by: Alistair Francis <alistair.francis@xilinx.com> Suggested-by: Thomas Huth <thuth@redhat.com> Cc: Jeff Cody <jcody@redhat.com> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Max Reitz <mreitz@redhat.com> Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Lieven <pl@kamp.de> Cc: Josh Durgin <jdurgin@redhat.com> Cc: "Richard W.M. Jones" <rjones@redhat.com> Cc: Markus Armbruster <armbru@redhat.com> Cc: Peter Crosthwaite <crosthwaite.peter@gmail.com> Cc: Richard Henderson <rth@twiddle.net> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Greg Kurz <groug@kaod.org> Cc: Rob Herring <robh@kernel.org> Cc: Peter Maydell <peter.maydell@linaro.org> Cc: Peter Chubb <peter.chubb@nicta.com.au> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Marcel Apfelbaum <marcel@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: Alexander Graf <agraf@suse.de> Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Cornelia Huck <cohuck@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: Greg Kurz <groug@kaod.org> Acked-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed by: Peter Chubb <peter.chubb@data61.csiro.au> Acked-by: Max Reitz <mreitz@redhat.com> Acked-by: Marcel Apfelbaum <marcel@redhat.com> Message-Id: <e1cfa2cd47087c248dd24caca9c33d9af0c499b0.1499866456.git.alistair.francis@xilinx.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2017-07-13 13:49:58 +02:00
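The `sed` expression quoted in 3dc6f86936 can be approximated in Python as follows. This is a sketch of the substitution on a single line, assuming the expression uses `|` as its delimiter with the `I` (case-insensitive) and `g` flags. `convert_warning` is a hypothetical name, and the sample calls in the usage note are invented for illustration.

```python
import re

# In sed's basic regex syntax '(' is literal, so it is escaped here.
WARNING_CALL = re.compile(r'error_report\(".*warning[,:] ', re.IGNORECASE)


def convert_warning(line):
    """Rewrite error_report("...warning: ...") into warn_report("...")."""
    return WARNING_CALL.sub('warn_report("', line)
```

For example, `convert_warning('error_report("Warning: disk is full");')` gives `warn_report("disk is full");`, while a call without a "warning" prefix is left unchanged. Because `.*` is greedy, anything between the opening quote and the last "warning: " on the line is dropped, which is one reason a manual review pass after such a bulk edit is sensible.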
Max Reitz	772d1f973f	block/qcow2: falloc/full preallocating growth Implement the preallocation modes falloc and full for growing qcow2 images. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20170613202107.10125-15-mreitz@redhat.com Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:45:02 +02:00
Max Reitz	60c48a29b7	block/qcow2: Rename "fail_block" to just "fail" Now alloc_refcount_block() only contains a single fail label, so it makes more sense to just name it "fail" instead of "fail_block". Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20170613202107.10125-14-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:45:02 +02:00
Max Reitz	12cc30a8cb	block/qcow2: Add qcow2_refcount_area() This function creates a collection of self-describing refcount structures (including a new refcount table) at the end of a qcow2 image file. Optionally, these structures can also describe a number of additional clusters beyond themselves; this will be important for preallocated truncation, which will place the data clusters and L2 tables there. For now, we can use this function to replace the part of alloc_refcount_block() that grows the refcount table (from which it is actually derived). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20170613202107.10125-13-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:45:02 +02:00
Max Reitz	95b98f343b	block/qcow2: Metadata preallocation for truncate We can support PREALLOC_MODE_METADATA by invoking preallocate() in qcow2_truncate(). Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20170613202107.10125-12-mreitz@redhat.com Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:45:02 +02:00
Max Reitz	652fecd005	block/qcow2: Lock s->lock in preallocate() preallocate() is and will be called only from places that do not otherwise need to lock s->lock: Currently that is qcow2_create2(), as of a future patch it will be called from qcow2_truncate(), too. It therefore makes sense to move locking that mutex into preallocate() itself. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20170613202107.10125-11-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:45:02 +02:00
Max Reitz	7bc45dc172	block/qcow2: Generalize preallocate() This patch adds two new parameters to the preallocate() function so we will be able to use it not just for preallocating a new image but also for preallocated image growth. The offset parameter allows the caller to specify a virtual offset from which to start preallocating. For newly created images this is always 0, but for preallocating growth this will be the old image length. The new_length parameter specifies the supposed new length of the image (basically the "end offset" for preallocation). During image truncation, bdrv_getlength() will return the old image length so we cannot rely on its return value then. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20170613202107.10125-10-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:45:02 +02:00
Max Reitz	35d72602ec	block/file-posix: Preallocation for truncate By using raw_regular_truncate() in raw_truncate(), we can now easily support preallocation. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20170613202107.10125-9-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:45:01 +02:00
Max Reitz	d0bc9e5d5e	block/file-posix: Generalize raw_regular_truncate Currently, raw_regular_truncate() is intended for setting the size of a newly created file. However, we also want to use it for truncating an existing file in which case only the newly added space (when growing) should be preallocated. This also means that if resizing failed, we should try to restore the original file size. This is important when using preallocation. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20170613202107.10125-8-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:45:01 +02:00
Max Reitz	9f63b07ee7	block/file-posix: Extract raw_regular_truncate() This functionality is part of raw_create() which we will be able to reuse nicely in raw_truncate(). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20170613202107.10125-7-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:45:01 +02:00
Max Reitz	7dacd8bd3d	block/file-posix: Small fixes in raw_create() Variables should be declared at the start of a block, and if a certain parameter value is not supported it may be better to return -ENOTSUP instead of -EINVAL. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20170613202107.10125-6-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:45:01 +02:00
Max Reitz	3a691c50f1	block: Add PreallocMode to blk_truncate() blk_truncate() itself will pass that value to bdrv_truncate(), and all callers of blk_truncate() just set the parameter to PREALLOC_MODE_OFF for now. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20170613202107.10125-4-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:45:01 +02:00
Max Reitz	7ea37c3066	block: Add PreallocMode to bdrv_truncate() For block drivers that just pass a truncate request to the underlying protocol, we can now pass the preallocation mode instead of aborting if it is not PREALLOC_MODE_OFF. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20170613202107.10125-3-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:45:01 +02:00
Max Reitz	8243ccb743	block: Add PreallocMode to BD.bdrv_truncate() Add a PreallocMode parameter to the bdrv_truncate() function implemented by each block driver. Currently, we always pass PREALLOC_MODE_OFF and no driver accepts anything else. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20170613202107.10125-2-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:45:01 +02:00
Stefan Hajnoczi	c501c35220	qcow2: add bdrv_measure() support Use qcow2_calc_prealloc_size() to get the required file size. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 20170705125738.8777-7-stefanha@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:45:00 +02:00
Stefan Hajnoczi	0eb4a8c1df	qcow2: extract image creation option parsing The image creation options parsed by qcow2_create() are also needed to implement .bdrv_measure(). Extract the parsing code, including input validation. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 20170705125738.8777-6-stefanha@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:45:00 +02:00
Stefan Hajnoczi	7c5bcc4212	qcow2: make refcount size calculation conservative The refcount metadata size calculation is inaccurate and can produce numbers that are too small. This is bad because we should calculate a conservative number - one that is guaranteed to be large enough. This patch switches the approach to a fixed point calculation because the existing equation is hard to solve when inaccuracies are taken care of. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 20170705125738.8777-5-stefanha@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:45:00 +02:00
Stefan Hajnoczi	95c67e3bd7	qcow2: extract preallocation calculation function Calculating the preallocated image size will be needed to implement .bdrv_measure(). Extract the code out into a separate function. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 20170705125738.8777-4-stefanha@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:45:00 +02:00
Stefan Hajnoczi	a843a22a82	raw-format: add bdrv_measure() support Maximum size calculation is trivial for the raw format: it's just the requested image size (because there is no metadata). Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 20170705125738.8777-3-stefanha@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:45:00 +02:00
Vladimir Sementsov-Ogievskiy	615b5dcf2d	block: release persistent bitmaps on inactivate We should release them here to reload on invalidate cache. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20170628120530.31251-31-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:59 +02:00
Vladimir Sementsov-Ogievskiy	469c71edc7	qcow2: add .bdrv_remove_persistent_dirty_bitmap Realize .bdrv_remove_persistent_dirty_bitmap interface. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20170628120530.31251-29-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:59 +02:00
Vladimir Sementsov-Ogievskiy	56f364e6d7	block/dirty-bitmap: add bdrv_remove_persistent_dirty_bitmap Interface for removing persistent bitmap from its storage. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20170628120530.31251-28-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:59 +02:00
Vladimir Sementsov-Ogievskiy	a3b52535e8	qmp: add x-debug-block-dirty-bitmap-sha256 Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20170628120530.31251-26-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:59 +02:00
Vladimir Sementsov-Ogievskiy	da0eb242ad	qcow2: add .bdrv_can_store_new_dirty_bitmap Realize .bdrv_can_store_new_dirty_bitmap interface. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20170628120530.31251-23-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:59 +02:00
Vladimir Sementsov-Ogievskiy	169b879359	qcow2: store bitmaps on reopening image as read-only Store bitmaps and mark them read-only on reopening image as read-only. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20170628120530.31251-21-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:58 +02:00
Vladimir Sementsov-Ogievskiy	5f72826e7f	qcow2: add persistent dirty bitmaps support Store persistent dirty bitmaps in qcow2 image. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20170628120530.31251-20-vsementsov@virtuozzo.com [mreitz: Always assign ret in store_bitmap() in case of an error] Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:58 +02:00
Vladimir Sementsov-Ogievskiy	3dd10a06d1	block/dirty-bitmap: add bdrv_dirty_bitmap_next() Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20170628120530.31251-19-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:58 +02:00
Vladimir Sementsov-Ogievskiy	a88b179fdb	block: introduce persistent dirty bitmaps The new field BdrvDirtyBitmap.persistent means that the bitmap should be saved by the format driver in .bdrv_close and .bdrv_inactivate. No format driver supports it for now. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20170628120530.31251-18-vsementsov@virtuozzo.com [mreitz: Fixed indentation] Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:58 +02:00
Vladimir Sementsov-Ogievskiy	a0319aacd4	block/dirty-bitmap: add autoload field to BdrvDirtyBitmap Mirror AUTO flag from Qcow2 bitmap in BdrvDirtyBitmap. This will be needed in future, to save this flag back to Qcow2 for persistent bitmaps. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20170628120530.31251-16-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:58 +02:00
Vladimir Sementsov-Ogievskiy	1b6b0562db	qcow2: support .bdrv_reopen_bitmaps_rw Realize bdrv_reopen_bitmaps_rw interface. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20170628120530.31251-15-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:58 +02:00
Vladimir Sementsov-Ogievskiy	d1258dd0c8	qcow2: autoloading dirty bitmaps Auto loading bitmaps are bitmaps in Qcow2, with the AUTO flag set. They are loaded when the image is opened and become BdrvDirtyBitmaps for the corresponding drive. Extra data in bitmaps is not supported for now. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20170628120530.31251-12-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:58 +02:00
Vladimir Sementsov-Ogievskiy	d6883bc968	block/dirty-bitmap: add readonly field to BdrvDirtyBitmap It will be needed in following commits for persistent bitmaps. If bitmap is loaded from read-only storage (and we can't mark it "in use" in this storage) corresponding BdrvDirtyBitmap should be read-only. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20170628120530.31251-11-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:58 +02:00
Vladimir Sementsov-Ogievskiy	8bfc932e1e	block/dirty-bitmap: fix comment for BlockDirtyBitmap.disabled field Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20170628120530.31251-10-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:58 +02:00
Vladimir Sementsov-Ogievskiy	88ddffae8f	qcow2: add bitmaps extension Add bitmap extension as specified in docs/specs/qcow2.txt. For now, just mirror extension header into Qcow2 state and check constraints. Also, calculate refcounts for qcow2 bitmaps, to not break qemu-img check. For now, disable image resize if it has bitmaps. It will be fixed later. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20170628120530.31251-9-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:57 +02:00
Vladimir Sementsov-Ogievskiy	8a5bb1f114	qcow2-refcount: rename inc_refcounts() and make it public This is needed for the following patch, which will introduce refcounts checking for qcow2 bitmaps. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20170628120530.31251-8-vsementsov@virtuozzo.com [mreitz: s/inc_refcounts/qcow2_inc_refcounts_imrt/ in one more (new) place] Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:57 +02:00
Vladimir Sementsov-Ogievskiy	6bdc8b719a	block/dirty-bitmap: add deserialize_ones func Add bdrv_dirty_bitmap_deserialize_ones() function, which is needed for qcow2 bitmap loading, to handle unallocated bitmap parts, marked as all-ones. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20170628120530.31251-7-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:57 +02:00
Vladimir Sementsov-Ogievskiy	ba06ff1a5c	block: fix bdrv_dirty_bitmap_granularity signature Make getter signature const-correct. This allows other functions with const dirty bitmap parameter use bdrv_dirty_bitmap_granularity(). Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-id: 20170628120530.31251-6-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:57 +02:00
Daniel P. Berrange	0a12f6f80e	qcow2: report encryption specific image information Currently 'qemu-img info' reports a simple "encrypted: yes" field. This is not very useful now that qcow2 can support multiple encryption formats. Users want to know which format is in use and some data related to it. Wire up usage of the qcrypto_block_get_info() method so that 'qemu-img info' can report about the encryption format and parameters in use $ qemu-img create \ --object secret,id=sec0,data=123456 \ -o encrypt.format=luks,encrypt.key-secret=sec0 \ -f qcow2 demo.qcow2 1G Formatting 'demo.qcow2', fmt=qcow2 size=1073741824 \ encryption=off encrypt.format=luks encrypt.key-secret=sec0 \ cluster_size=65536 lazy_refcounts=off refcount_bits=16 $ qemu-img info demo.qcow2 image: demo.qcow2 file format: qcow2 virtual size: 1.0G (1073741824 bytes) disk size: 480K encrypted: yes cluster_size: 65536 Format specific information: compat: 1.1 lazy refcounts: false refcount bits: 16 encrypt: ivgen alg: plain64 hash alg: sha256 cipher alg: aes-256 uuid: 3fa930c4-58c8-4ef7-b3c5-314bb5af21f3 format: luks cipher mode: xts slots: [0]: active: true iters: 1839058 key offset: 4096 stripes: 4000 [1]: active: false key offset: 262144 [2]: active: false key offset: 520192 [3]: active: false key offset: 778240 [4]: active: false key offset: 1036288 [5]: active: false key offset: 1294336 [6]: active: false key offset: 1552384 [7]: active: false key offset: 1810432 payload offset: 2068480 master key iters: 438487 corrupt: false With the legacy "AES" encryption we just report the format name $ qemu-img create \ --object secret,id=sec0,data=123456 \ -o encrypt.format=aes,encrypt.key-secret=sec0 \ -f qcow2 demo.qcow2 1G Formatting 'demo.qcow2', fmt=qcow2 size=1073741824 \ encryption=off encrypt.format=aes encrypt.key-secret=sec0 \ cluster_size=65536 lazy_refcounts=off refcount_bits=16 $ ./qemu-img info demo.qcow2 image: demo.qcow2 file format: qcow2 virtual size: 1.0G (1073741824 bytes) disk size: 196K encrypted: yes cluster_size: 65536 Format specific information: compat: 1.1 lazy refcounts: false refcount bits: 16 encrypt: format: aes corrupt: false Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 20170623162419.26068-20-berrange@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:57 +02:00
Daniel P. Berrange	1cd9a787a2	block: pass option prefix down to crypto layer While the crypto layer uses a fixed option name "key-secret", the upper block layer may have a prefix on the options. e.g. "encrypt.key-secret", in order to avoid clashes between crypto option names & other block option names. To ensure the crypto layer can report accurate error messages, we must tell it what option name prefix was used. Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 20170623162419.26068-19-berrange@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:56 +02:00
Daniel P. Berrange	c01c214b69	block: remove all encryption handling APIs Now that all encryption keys must be provided upfront via the QCryptoSecret API and associated block driver properties there is no need for any explicit encryption handling APIs in the block layer. Encryption can be handled transparently within the block driver. We only retain an API for querying whether an image is encrypted or not, since that is a potentially useful piece of metadata to report to the user. Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 20170623162419.26068-18-berrange@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:56 +02:00
Daniel P. Berrange	4652b8f3e1	qcow2: add support for LUKS encryption format This adds support for using LUKS as an encryption format with the qcow2 file, using the new encrypt.format parameter to request "luks" format. e.g. # qemu-img create --object secret,data=123456,id=sec0 \ -f qcow2 -o encrypt.format=luks,encrypt.key-secret=sec0 \ test.qcow2 10G The legacy "encryption=on" parameter still results in creation of the old qcow2 AES format (and is equivalent to the new 'encrypt.format=aes'). e.g. the following are equivalent: # qemu-img create --object secret,data=123456,id=sec0 \ -f qcow2 -o encryption=on,encrypt.key-secret=sec0 \ test.qcow2 10G # qemu-img create --object secret,data=123456,id=sec0 \ -f qcow2 -o encrypt.format=aes,encrypt.key-secret=sec0 \ test.qcow2 10G With the LUKS format it is necessary to store the LUKS partition header and key material in the QCow2 file. This data can be many MB in size, so cannot go into the QCow2 header region directly. Thus the spec defines a FDE (Full Disk Encryption) header extension that specifies the offset of a set of clusters to hold the FDE headers, as well as the length of that region. The LUKS header is thus stored in these extra allocated clusters before the main image payload. Aside from all the cryptographic differences implied by use of the LUKS format, there is one further key difference between the use of legacy AES and LUKS encryption in qcow2. For LUKS, the initialization vectors are generated using the host physical sector as the input, rather than the guest virtual sector. This guarantees unique initialization vectors for all sectors when qcow2 internal snapshots are used, thus giving stronger protection against watermarking attacks. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 20170623162419.26068-14-berrange@redhat.com Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:56 +02:00
Daniel P. Berrange	b25b387fa5	qcow2: convert QCow2 to use QCryptoBlock for encryption This converts the qcow2 driver to make use of the QCryptoBlock APIs for encrypting image content, using the legacy QCow2 AES scheme. With this change it is now required to use the QCryptoSecret object for providing passwords, instead of the current block password APIs / interactive prompting. $QEMU \ -object secret,id=sec0,file=/home/berrange/encrypted.pw \ -drive file=/home/berrange/encrypted.qcow2,encrypt.key-secret=sec0 The test 087 could be simplified since there is no longer a difference in behaviour when using blockdev_add with encrypted images for the running vs stopped CPU state. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 20170623162419.26068-12-berrange@redhat.com Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:56 +02:00
Daniel P. Berrange	446d306d23	qcow2: make qcow2_encrypt_sectors encrypt in place Instead of requiring separate input/output buffers for encrypting data, change qcow2_encrypt_sectors() to assume use of a single buffer, encrypting in place. The current callers all used the same buffer for input/output already. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 20170623162419.26068-11-berrange@redhat.com Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:56 +02:00
Daniel P. Berrange	d85f4222b4	qcow: convert QCow to use QCryptoBlock for encryption This converts the qcow driver to make use of the QCryptoBlock APIs for encrypting image content. This is only wired up to permit use of the legacy QCow encryption format. Users who wish to have the strong LUKS format should switch to qcow2 instead. With this change it is now required to use the QCryptoSecret object for providing passwords, instead of the current block password APIs / interactive prompting. $QEMU \ -object secret,id=sec0,file=/home/berrange/encrypted.pw \ -drive file=/home/berrange/encrypted.qcow,encrypt.format=aes,\ encrypt.key-secret=sec0 Though note that running QEMU system emulators with the AES encryption is no longer supported, so while the above syntax is valid, QEMU will refuse to actually run the VM in this particular example. Likewise when creating images with the legacy AES-CBC format qemu-img create -f qcow \ --object secret,id=sec0,file=/home/berrange/encrypted.pw \ -o encrypt.format=aes,encrypt.key-secret=sec0 \ /home/berrange/encrypted.qcow 64M Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 20170623162419.26068-10-berrange@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:56 +02:00
Daniel P. Berrange	1fad1f9400	qcow: make encrypt_sectors encrypt in place Instead of requiring separate input/output buffers for encrypting data, change encrypt_sectors() to assume use of a single buffer, encrypting in place. One current caller uses the same buffer for input/output already and the other two callers are easily converted to do so. Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 20170623162419.26068-9-berrange@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:56 +02:00
Daniel P. Berrange	0cb8d47ba9	block: deprecate "encryption=on" in favor of "encrypt.format=aes" Historically the qcow & qcow2 image formats supported a property "encryption=on" to enable their built-in AES encryption. We'll soon be supporting LUKS for qcow2, so need a more general purpose way to enable encryption, with a choice of formats. This introduces an "encrypt.format" option, which will later be joined by a number of other "encrypt.XXX" options. The use of an "encrypt." prefix instead of "encrypt-" is done to facilitate mapping to a nested QAPI schema at a later date. e.g. the preferred syntax is now qemu-img create -f qcow2 -o encrypt.format=aes demo.qcow2 Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 20170623162419.26068-8-berrange@redhat.com Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:55 +02:00
Daniel P. Berrange	6aa837f7bd	qcow: require image size to be > 1 for new images The qcow driver refuses to open images which are less than 2 bytes in size, but will happily create such images. Add a check in the create path to avoid this discrepancy. Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 20170623162419.26068-5-berrange@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:55 +02:00
Daniel P. Berrange	4a47f85431	block: add ability to set a prefix for opt names When integrating the crypto support with qcow/qcow2, we don't want to use the bare LUKS option names "hash-alg", "key-secret", etc. We need to namespace them to match the nested QAPI schema. e.g. "encrypt.hash-alg", "encrypt.key-secret" so that they don't clash with any general qcow options at a later date. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 20170623162419.26068-3-berrange@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:55 +02:00
Daniel P. Berrange	306a06e5f7	block: expose crypto option names / defs to other drivers The block/crypto.c defines a set of QemuOpts that provide parameters for encryption. This will also be needed by the qcow/qcow2 integration, so expose the relevant pieces in a new block/crypto.h header. Some helper methods taking QemuOpts are changed to take QDict to simplify usage in other places. Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 20170623162419.26068-2-berrange@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:55 +02:00
Eric Blake	51b0a48888	block: Make bdrv_is_allocated_above() byte-based We are gradually moving away from sector-based interfaces, towards byte-based. In the common case, allocation is unlikely to ever use values that are not naturally sector-aligned, but it is possible that byte-based values will let us be more precise about allocation at the end of an unaligned file that can do byte-based access. Changing the signature of the function to use int64_t *pnum ensures that the compiler enforces that all callers are updated. For now, the io.c layer still assert()s that all callers are sector-aligned, but that can be relaxed when a later patch implements byte-based block status. Therefore, for the most part this patch is just the addition of scaling at the callers followed by inverse scaling at bdrv_is_allocated(). But some code, particularly stream_run(), gets a lot simpler because it no longer has to mess with sectors. Leave comments where we can further simplify by switching to byte-based iterations, once later patches eliminate the need for sector-aligned operations. For ease of review, bdrv_is_allocated() was tackled separately. Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:07 +02:00
Eric Blake	c00716beb3	block: Minimize raw use of bds->total_sectors bdrv_is_allocated_above() was relying on intermediate->total_sectors, which is a field that can have stale contents depending on the value of intermediate->has_variable_length. An audit shows that we are safe (we were first calling through bdrv_co_get_block_status() which in turn calls bdrv_nb_sectors() and therefore just refreshed the current length), but it's nicer to favor our accessor functions to avoid having to repeat such an audit, even if it means refresh_total_sectors() is called more frequently. Suggested-by: John Snow <jsnow@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Manos Pitsidianakis <el13635@mail.ntua.gr> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:07 +02:00
Eric Blake	d6a644bbfe	block: Make bdrv_is_allocated() byte-based We are gradually moving away from sector-based interfaces, towards byte-based. In the common case, allocation is unlikely to ever use values that are not naturally sector-aligned, but it is possible that byte-based values will let us be more precise about allocation at the end of an unaligned file that can do byte-based access. Changing the signature of the function to use int64_t *pnum ensures that the compiler enforces that all callers are updated. For now, the io.c layer still assert()s that all callers are sector-aligned on input and that *pnum is sector-aligned on return to the caller, but that can be relaxed when a later patch implements byte-based block status. Therefore, this code adds usages like DIV_ROUND_UP(,BDRV_SECTOR_SIZE) to callers that still want aligned values, where the call might reasonably give non-aligned results in the future; on the other hand, no rounding is needed for callers that should just continue to work with byte alignment. For the most part this patch is just the addition of scaling at the callers followed by inverse scaling at bdrv_is_allocated(). But some code, particularly bdrv_commit(), gets a lot simpler because it no longer has to mess with sectors; also, it is now possible to pass NULL if the caller does not care how much of the image is allocated beyond the initial offset. Leave comments where we can further simplify once a later patch eliminates the need for sector-aligned requests through bdrv_is_allocated(). For ease of review, bdrv_is_allocated_above() will be tackled separately. Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:07 +02:00
Eric Blake	6f8e35e241	backup: Switch backup_run() to byte-based We are gradually converting to byte-based interfaces, as they are easier to reason about than sector-based. Change the internal loop iteration of backups to track by bytes instead of sectors (although we are still guaranteed that we iterate by steps that are cluster-aligned). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:06 +02:00
Eric Blake	03f5d60bbf	backup: Switch backup_do_cow() to byte-based We are gradually converting to byte-based interfaces, as they are easier to reason about than sector-based. Convert another internal function (no semantic change). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:06 +02:00
Eric Blake	f6ac207893	backup: Switch block_backup.h to byte-based We are gradually converting to byte-based interfaces, as they are easier to reason about than sector-based. Continue by converting the public interface to backup jobs (no semantic change), including a change to CowRequest to track by bytes instead of cluster indices. Note that this does not change the difference between the public interface (starting point, and size of the subsequent range) and the internal interface (starting and end points). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Xie Changlong <xiechanglong@cmss.chinamobile.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:06 +02:00
Eric Blake	cf79cdf662	backup: Switch BackupBlockJob to byte-based We are gradually converting to byte-based interfaces, as they are easier to reason about than sector-based. Continue by converting an internal structure (no semantic change), and all references to tracking progress. Drop a redundant local variable bytes_per_cluster. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:06 +02:00
Eric Blake	e8a81e9cad	block: Drop unused bdrv_round_sectors_to_clusters() Now that the last user [mirror_iteration()] has converted to using bytes, we no longer need a function to round sectors to clusters. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:06 +02:00
Eric Blake	fb2ef7919b	mirror: Switch mirror_iteration() to byte-based We are gradually converting to byte-based interfaces, as they are easier to reason about than sector-based. Change the internal loop iteration of mirroring to track by bytes instead of sectors (although we are still guaranteed that we iterate by steps that are both sector-aligned and multiples of the granularity). Drop the now-unused mirror_clip_sectors(). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:06 +02:00
Eric Blake	ae4cc8777b	mirror: Switch mirror_do_read() to byte-based We are gradually converting to byte-based interfaces, as they are easier to reason about than sector-based. Convert another internal function, preserving all existing semantics, and adding one more assertion that things are still sector-aligned (so that conversions to sectors in mirror_read_complete don't need to round). Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:06 +02:00
Eric Blake	782d97efec	mirror: Switch mirror_cow_align() to byte-based We are gradually converting to byte-based interfaces, as they are easier to reason about than sector-based. Convert another internal function (no semantic change), and add mirror_clip_bytes() as a counterpart to mirror_clip_sectors(). Some of the conversion is a bit tricky, requiring temporaries to convert between units; it will be cleared up in a following patch. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:06 +02:00
Eric Blake	931e52607f	mirror: Update signature of mirror_clip_sectors() Rather than having a void function that modifies its input in-place as the output, change the signature to reduce a layer of indirection and return the result. Suggested-by: John Snow <jsnow@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:06 +02:00
Eric Blake	e6f2419389	mirror: Switch mirror_do_zero_or_discard() to byte-based We are gradually converting to byte-based interfaces, as they are easier to reason about than sector-based. Convert another internal function (no semantic change). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:06 +02:00
Eric Blake	b436982f04	mirror: Switch MirrorBlockJob to byte-based We are gradually converting to byte-based interfaces, as they are easier to reason about than sector-based. Continue by converting an internal structure (no semantic change), and all references to the buffer size. Add an assertion that our use of s->granularity >> BDRV_SECTOR_BITS (necessary for interaction with sector-based dirty bitmaps, until a later patch converts those to be byte-based) does not suffer from truncation problems. [checkpatch has a false positive on use of MIN() in this patch] Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:06 +02:00
Eric Blake	317a6676a2	commit: Switch commit_run() to byte-based We are gradually converting to byte-based interfaces, as they are easier to reason about than sector-based. Change the internal loop iteration of committing to track by bytes instead of sectors (although we are still guaranteed that we iterate by steps that are sector-aligned). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:06 +02:00
Eric Blake	d8a9858408	commit: Switch commit_populate() to byte-based We are gradually converting to byte-based interfaces, as they are easier to reason about than sector-based. Start by converting an internal function (no semantic change). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:06 +02:00
Eric Blake	d535435f4a	stream: Switch stream_run() to byte-based We are gradually converting to byte-based interfaces, as they are easier to reason about than sector-based. Change the internal loop iteration of streaming to track by bytes instead of sectors (although we are still guaranteed that we iterate by steps that are sector-aligned). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:06 +02:00
Eric Blake	158c649257	stream: Drop reached_end for stream_complete() stream_complete() skips the work of rewriting the backing file if the job was cancelled, if data->reached_end is false, or if there was an error detected (non-zero data->ret) during the streaming. But note that in stream_run(), data->reached_end is only set if the loop ran to completion, and data->ret is only 0 in two cases: either the loop ran to completion (possibly by cancellation, but stream_complete checks for that), or we took an early goto out because there is no bs->backing. Thus, we can preserve the same semantics without the use of reached_end, by merely checking for bs->backing (and logically, if there was no backing file, streaming is a no-op, so there is no backing file to rewrite). Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:06 +02:00
Eric Blake	8493211c02	stream: Switch stream_populate() to byte-based We are gradually converting to byte-based interfaces, as they are easier to reason about than sector-based. Start by converting an internal function (no semantic change). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:06 +02:00
Eric Blake	5cb1a49e01	trace: Show blockjob actions via bytes, not sectors Upcoming patches are going to switch to byte-based interfaces instead of sector-based. Even worse, trace_backup_do_cow_enter() had a weird mix of cluster and sector indices. The trace interface is low enough that there are no stability guarantees, and therefore nothing wrong with changing our units, even in cases like trace_backup_do_cow_skip() where we are not changing the trace output. So make the tracing uniformly use bytes. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:06 +02:00
Eric Blake	f3e4ce4af3	blockjob: Track job ratelimits via bytes, not sectors The user interface specifies job rate limits in bytes/second. It's pointless to have our internal representation track things in sectors/second, particularly since we want to move away from sector-based interfaces. Fix up a doc typo found while verifying that the ratelimit code handles the scaling difference. Repetition of expressions like 'n * BDRV_SECTOR_SIZE' will be cleaned up later when functions are converted to iterate over images by bytes rather than by sectors. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:06 +02:00
Hervé Poussineau	8b544293ef	vvfat: change OEM name to 'MSWIN4.1' According to specification: "'MSWIN4.1' is the recommended setting, because it is the setting least likely to cause compatibility problems. If you want to put something else in here, that is your option, but the result may be that some FAT drivers might not recognize the volume." Specification: "FAT: General overview of on-disk format" v1.03, page 9 Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:06 +02:00
Hervé Poussineau	78f002c901	vvfat: handle KANJI lead byte 0xe5 Specification: "FAT: General overview of on-disk format" v1.03, page 23 Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:05 +02:00
Hervé Poussineau	6817efea3a	vvfat: limit number of entries in root directory in FAT12/FAT16 FAT12/FAT16 root directory is two sectors in size, which allows only 512 directory entries. Prevent QEMU startup if too many files exist, instead of overflowing root directory. Also introduce variable root_entries, which will be required for FAT32. Fixes: https://bugs.launchpad.net/qemu/+bug/1599539/comments/4 Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:05 +02:00
Hervé Poussineau	339cebcc01	vvfat: correctly generate numeric-tail of short file names More specifically: - try without numeric-tail only if LFN didn't have invalid short chars - start at ~1 (instead of ~0) - handle case if numeric tail is more than one char (i.e. > 10) Windows 9x Scandisk no longer sees mismatches between short file names and long file names for non-ASCII filenames. Specification: "FAT: General overview of on-disk format" v1.03, page 31 Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:05 +02:00
Hervé Poussineau	0c36111f57	vvfat: correctly create base short names for non-ASCII filenames More specifically, create short name from filename and change blacklist of invalid chars to whitelist of valid chars. Windows 9x also now correctly sees long file names of filenames containing a space, but Scandisk still complains about mismatch between SFN and LFN. [kwolf: Build fix for this intermediate patch (it included declarations for variables that are only used in the next patch) ] Specification: "FAT: General overview of on-disk format" v1.03, pages 30-31 Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:05 +02:00
Hervé Poussineau	09ec4119fb	vvfat: correctly create long names for non-ASCII filenames Assume that input filename is encoded as UTF-8, so correctly create UTF-16 encoding. Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:05 +02:00
Hervé Poussineau	f82d92bb02	vvfat: always create . and .. entries at first and in that order readdir() doesn't always return . and .. entries at first and in that order. This leads to not creating them at first in the directory, which raises some errors on file system checking utilities like MS-DOS Scandisk. Specification: "FAT: General overview of on-disk format" v1.03, page 25 Fixes: https://bugs.launchpad.net/qemu/+bug/1599539 Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:05 +02:00
Hervé Poussineau	92e28d8220	vvfat: fix field names in FAT12/FAT16 and FAT32 boot sectors Specification: "FAT: General overview of on-disk format" v1.03, pages 11-13 Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:05 +02:00
Hervé Poussineau	4dc705dc7e	vvfat: introduce offset_to_bootsector, offset_to_fat and offset_to_root_dir - offset_to_bootsector is the number of sectors up to FAT bootsector - offset_to_fat is the number of sectors up to first File Allocation Table - offset_to_root_dir is the number of sectors up to root directory sector Replace first_sectors_number - 1 by offset_to_bootsector. Replace first_sectors_number by offset_to_fat. Replace faked_sectors by offset_to_root_dir. Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:05 +02:00
Hervé Poussineau	ad05b31857	vvfat: rename useless enumeration values MODE_FAKED and MODE_RENAMED are not and were never used. Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-10 13:18:05 +02:00
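
The byte-based conversions in the blockjob, stream, commit, mirror and backup commits listed above follow one recurring pattern: keep offsets and lengths in bytes, assert sector alignment wherever a sector-based helper is still called, and return a clipped length instead of modifying an argument in place. The following is only a minimal sketch of that pattern, not code from the QEMU tree. All `sketch_*` and `SKETCH_*` names are hypothetical; the 512-byte sector size is assumed to match QEMU's `BDRV_SECTOR_SIZE`.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror QEMU's BDRV_SECTOR_BITS / BDRV_SECTOR_SIZE (512-byte
 * sectors); the SKETCH_ names themselves are hypothetical. */
#define SKETCH_SECTOR_BITS 9
#define SKETCH_SECTOR_SIZE ((int64_t)1 << SKETCH_SECTOR_BITS)

/* Convert a sector count to a byte count. */
static inline int64_t sketch_sectors_to_bytes(int64_t sectors)
{
    return sectors * SKETCH_SECTOR_SIZE;
}

/* Convert a byte count back to sectors for a caller that is still
 * sector-based. The assertions document that the value is non-negative
 * and sector-aligned, so the shift never has to round. */
static inline int64_t sketch_bytes_to_sectors(int64_t bytes)
{
    assert(bytes >= 0);
    assert(bytes % SKETCH_SECTOR_SIZE == 0);
    return bytes >> SKETCH_SECTOR_BITS;
}

/* Clip a request so that it does not run past the end of the image.
 * The clipped length is returned rather than written through a pointer,
 * in the spirit of the mirror_clip_sectors() signature change. */
static inline int64_t sketch_clip_bytes(int64_t image_len, int64_t offset,
                                        int64_t bytes)
{
    int64_t remaining = image_len - offset;
    return bytes < remaining ? bytes : remaining;
}
```

With these helpers, a loop can advance `offset` in bytes and call `sketch_bytes_to_sectors()` only at the boundary to an interface that has not been converted yet.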

3532 Commits