mirrors/qemu

mirror of https://github.com/qemu/qemu.git synced 2024-11-29 06:43:37 +08:00

Author	SHA1	Message	Date
Fam Zheng	ecc983a507	block: Add copy offloading trace points A few trace points that can help reveal what is happening in a copy offloading I/O path. Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-07-10 16:01:52 +02:00
Fam Zheng	f8a30874ca	block: Prefix file driver trace points with "file_" Within one module, trace points usually have a common prefix named after the module name. paio_submit and paio_submit_co are the only two trace points so far in the two file protocol drivers. As we are adding more, having a common prefix here is better so that trace points can be enabled with a glob. Rename them. Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-07-10 16:01:51 +02:00
Cornelia Huck	44e8b4689c	Revert "block: Remove deprecated -drive option serial" This reverts commit `b008326744`. Hold off removing this for one more QEMU release (current libvirt release still uses it.) Signed-off-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-07-10 14:36:11 +02:00
Ari Sundholm	ba814c82bb	block/blklogwrites: Make sure the log sector size is not too small The sector size needs to be large enough to accommodate the data structures for the log super block and log write entries. This was previously not properly checked, which made it possible to cause QEMU to badly misbehave. Signed-off-by: Ari Sundholm <ari@tuxera.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-07-10 13:17:48 +02:00
Vladimir Sementsov-Ogievskiy	f8d59dfb40	block/backup: fix fleecing scheme: use serialized writes Fleecing scheme works as follows: we want a kind of temporary snapshot of active drive A. We create temporary image B, with B->backing = A. Then we start backup(sync=none) from A to B. From this point, B reads as point-in-time snapshot of A (A continues to be active drive, accepting guest IO). This scheme needs some additional synchronization between reads from B and backup COW operations, otherwise, the following situation is theoretically possible: (assume B is qcow2, client is NBD client, reading from B) 1. client starts reading and takes qcow2 mutex in qcow2_co_preadv, and goes up to l2 table loading (assume cache miss) 2. guest write => backup COW => qcow2 write => try to take qcow2 mutex => waiting 3. l2 table loaded, we see that cluster is UNALLOCATED, go to "case QCOW2_CLUSTER_UNALLOCATED" and unlock mutex before bdrv_co_preadv(bs->backing, ...) 4. aha, mutex unlocked, backup COW continues, and we finally finish guest write and change cluster in our active disk A 5. actually, do bdrv_co_preadv(bs->backing, ...) and read _new updated_ data. To avoid this, let's make backup writes serializing, to not intersect with reads from B. Note: we expand range of handled cases from (sync=none and B->backing = A) to just (A in backing chain of B), to finally allow safe reading from B during backup for all cases when A in backing chain of B, i.e. B formally looks like point-in-time snapshot of A. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-07-10 13:10:29 +02:00
Vladimir Sementsov-Ogievskiy	09d2f94846	block: add BDRV_REQ_SERIALISING flag Serialized writes should be used in copy-on-write of backup(sync=none) for image fleecing scheme. We need to change an assert in bdrv_aligned_pwritev, added in `28de2dcd88`. The assert may fail now, because call to wait_serialising_requests here may become first call to it for this request with serializing flag set. It occurs if the request is aligned (otherwise, we should already set serializing flag before calling bdrv_aligned_pwritev and correspondingly waited for all intersecting requests). However, for aligned requests, we should not care about outdating of previously read data, as there is no such data. Therefore, let's just update an assert to not care about aligned requests. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-07-10 13:10:25 +02:00
Vladimir Sementsov-Ogievskiy	67b51fb998	block: split flags in copy_range Pass read flags and write flags separately. This is needed to handle coming BDRV_REQ_NO_SERIALISING clearly in following patches. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-07-10 13:04:25 +02:00
Vladimir Sementsov-Ogievskiy	999658a05e	block/io: fix copy_range Here two things are fixed: 1. Architecture On each recursion step, we go to the child of src or dst, only for one of them. So, it's wrong to create tracked requests for both on each step. It leads to tracked requests duplication. 2. Wait for serializing requests on write path independently of BDRV_REQ_NO_SERIALISING Before commit `9ded4a0114` "backup: Use copy offloading", BDRV_REQ_NO_SERIALISING was used for only one case: read in copy-on-write operation during backup. Also, the flag was handled only on read path (in bdrv_co_preadv and bdrv_aligned_preadv). After `9ded4a0114`, flag is used for not waiting serializing operations on backup target (in same case of copy-on-write operation). This behavior change is unsubstantiated and potentially dangerous, let's drop it and add additional asserts and documentation. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-07-10 13:04:22 +02:00
Kevin Wolf	b0ddcbbb36	block: Fix copy-on-read crash with partial final cluster If the virtual disk size isn't aligned to full clusters, bdrv_co_do_copy_on_readv() may get pnum == 0 before having the full cluster completed, which will let it run into an assertion failure: qemu-io: block/io.c:1203: bdrv_co_do_copy_on_readv: Assertion `skip_bytes < pnum' failed. Check for EOF, assert that we read at least as much as the read request originally wanted to have (which is true at EOF because otherwise bdrv_check_byte_request() would already have returned an error) and return success early even though we couldn't copy the full cluster. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-07-10 10:36:15 +02:00
Kevin Wolf	4be6a6d118	block: Poll after drain on attaching a node Commit `dcf94a23b1` ('block: Don't poll in parent drain callbacks') removed polling in bdrv_child_cb_drained_begin() on the grounds that the original bdrv_drain() already will poll and BdrvChildRole.drained_begin calls must not cause graph changes (and therefore must not call aio_poll() or the recursion through the graph will break). This reasoning is correct for calls through bdrv_do_drained_begin(). However, BdrvChildRole.drained_begin is also called when a node that is already in a drained section (i.e. bdrv_do_drained_begin() has already returned and therefore can't poll any more) is attached to a new parent. In this case, we must explicitly poll to have all requests completed before the drained new child can be attached to the parent. In bdrv_replace_child_noperm(), we know that we're not inside the recursion of bdrv_do_drained_begin() because graph changes are not allowed there, and bdrv_replace_child_noperm() is a graph change. The call of BdrvChildRole.drained_begin() must therefore be followed by a BDRV_POLL_WHILE() that waits for the completion of requests. Reported-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-07-10 10:36:15 +02:00
Peter Maydell	1daf14ec9e	Block layer patches: - qcow2: Use worker threads for compression to improve performance of 'qemu-img convert -W' and compressed backup jobs - blklogwrites: New filter driver to log write requests to an image in the dm-log-writes format - file-posix: Fix image locking during image creation - crypto: Fix memory leak in error path - Error out instead of silently truncating node names Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging # gpg: Signature made Thu 05 Jul 2018 11:24:33 BST # gpg: using RSA key 7F09B272C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * remotes/kevin/tags/for-upstream: file-posix: Unlock FD after creation file-posix: Fix creation locking block/blklogwrites: Add an option for the update interval of the log superblock block/blklogwrites: Add an option for appending to an old log block/blklogwrites: Change log_sector_size from int64_t to uint64_t block/crypto: Fix memory leak in create error path block: Don't silently truncate node names block: Add blklogwrites block: Move two block permission constants to the relevant enum qcow2: add compress threads qcow2: refactor data compression qemu-img: allow compressed not-in-order writes Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2018-07-05 15:53:04 +01:00
Max Reitz	7c20c808a5	file-posix: Unlock FD after creation Closing the FD does not necessarily mean that it is unlocked. Fix this by relinquishing all permission locks before qemu_close(). Reported-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-07-05 11:07:58 +02:00
Max Reitz	d815efcaf0	file-posix: Fix creation locking raw_apply_lock_bytes() takes a bit mask of "permissions that are NOT shared". Also, make the "perm" and "shared" variables uint64_t, because I do not particularly like using ~ on signed integers (and other permission masks are usually uint64_t, too). Reported-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-07-05 11:07:58 +02:00
Ari Sundholm	1dce698ea8	block/blklogwrites: Add an option for the update interval of the log superblock This is a way to ensure that the log superblock is periodically updated. Before, this was only done on flush requests, which may not be enough if the VM exits abnormally, omitting the final flush. The default interval is 4096 write requests. Signed-off-by: Ari Sundholm <ari@tuxera.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-07-05 10:50:21 +02:00
Ari Sundholm	0878b3c113	block/blklogwrites: Add an option for appending to an old log Suggested by Kevin Wolf. May be useful when testing multiple batches of writes or doing long-term testing involving restarts of the VM. Signed-off-by: Ari Sundholm <ari@tuxera.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-07-05 10:50:20 +02:00
Ari Sundholm	2dacaf7c82	block/blklogwrites: Change log_sector_size from int64_t to uint64_t This was a simple oversight when working on intermediate versions of the original patch which introduced blklogwrites. Signed-off-by: Ari Sundholm <ari@tuxera.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-07-05 10:50:20 +02:00
Kevin Wolf	0b68589d17	block/crypto: Fix memory leak in create error path Fixes: Coverity CID 1393782 Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2018-07-05 10:29:19 +02:00
Aapo Vienamo	bfcc224e3c	block: Add blklogwrites Implements a block device write logging system, similar to Linux kernel device mapper dm-log-writes. The write operations that are performed on a block device are logged to a file or another block device. The write log format is identical to the dm-log-writes format. Currently, log markers are not supported. This functionality can be used for crash consistency and fs consistency testing. By implementing it in qemu, tests utilizing write logs can be used to test non-Linux drivers and older kernels. The driver accepts an optional parameter to set the sector size used for logging. This makes the driver require all requests to be aligned to this sector size and also makes offsets and sizes of writes in the log metadata to be expressed in terms of this value (the log format has a granularity of one sector for offsets and sizes). This allows accurate logging of writes to guest block devices that have unusual sector sizes. The implementation is based on the blkverify and blkdebug block drivers. Signed-off-by: Aapo Vienamo <aapo@tuxera.com> Signed-off-by: Ari Sundholm <ari@tuxera.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-07-05 10:29:19 +02:00
Vladimir Sementsov-Ogievskiy	ceb029cd6f	qcow2: add compress threads Do data compression in separate threads. This significantly improves performance for qemu-img convert with -W (allow async writes) and -c (compressed) options. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-07-05 10:29:19 +02:00
Vladimir Sementsov-Ogievskiy	2714f13d69	qcow2: refactor data compression Make a separate function for compression to be parallelized later. - use .avail_out field instead of .next_out to calculate size of compressed data. It looks more natural and it allows dest to remain a void pointer - set avail_out to be at least one byte less than input, to be sure to avoid inefficient compression earlier Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-07-05 10:29:00 +02:00
Vladimir Sementsov-Ogievskiy	58f72b965e	dirty-bitmap: fix double lock on bitmap enabling Bitmap lock/unlock were added to bdrv_enable_dirty_bitmap in `8b1402ce80`, but some places were not updated correspondingly, which leads to trying to take this lock twice, which is a deadlock. Fix this. Actually, iotest 199 (about dirty bitmap postcopy migration) is broken now, and this fixes it. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20180625165745.25259-3-vsementsov@virtuozzo.com Signed-off-by: John Snow <jsnow@redhat.com>	2018-07-04 02:12:49 -04:00
Vladimir Sementsov-Ogievskiy	92bcea40d3	block/dirty-bitmap: add bdrv_enable_dirty_bitmap_locked Add _locked version of bdrv_enable_dirty_bitmap, to fix dirty bitmap migration in the following patch. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20180625165745.25259-2-vsementsov@virtuozzo.com Signed-off-by: John Snow <jsnow@redhat.com>	2018-07-04 02:12:49 -04:00
Peter Maydell	a395717cbd	Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into staging # gpg: Signature made Tue 03 Jul 2018 04:42:11 BST # gpg: using RSA key BDBE7B27C0DE3057 # gpg: Good signature from "Jeffrey Cody <jcody@redhat.com>" # gpg: aka "Jeffrey Cody <jeff@codyprime.org>" # gpg: aka "Jeffrey Cody <codyprime@gmail.com>" # Primary key fingerprint: 9957 4B4D 3474 90E7 9D98 D624 BDBE 7B27 C0DE 3057 * remotes/cody/tags/block-pull-request: backup: Use copy offloading block: Honour BDRV_REQ_NO_SERIALISING in copy range block: Fix parameter checking in bdrv_co_copy_range_internal Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2018-07-03 11:49:51 +01:00
Peter Maydell	9b75dcb15f	nbd patches for 2018-07-02 Bug fixes and iotest exposure of fleecing via NBD (serving a read-only point-in-time view via blockdev-backup sync:none, as well as serving dirty bitmaps over NBD), including a new x-dirty-bitmap parameter when opening NBD clients as the counterpart to x-nbd-server-add-bitmap. Also a random fix for iscsi block_status spotted by Coverity that missed other miscellaneous trees. - Eric Blake: nbd/server: Fix dirty bitmap logic regression - Eric Blake: iscsi: Avoid potential for get_status overflow - John Snow/Vladimir Sementsov-Ogievskiy: 0/2 block: formalize and test fleecing - Eric Blake: 0/2 test NBD bitmap export Merge remote-tracking branch 'remotes/ericb/tags/pull-nbd-2018-07-02' into staging # gpg: Signature made Tue 03 Jul 2018 02:33:03 BST # gpg: using RSA key A7A16B4A2527436A # gpg: Good signature from "Eric Blake <eblake@redhat.com>" # gpg: aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>" # gpg: aka "[jpeg image of size 6874]" # Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2 F3AA A7A1 6B4A 2527 436A * remotes/ericb/tags/pull-nbd-2018-07-02: iotests: New test 223 for exporting dirty bitmap over NBD nbd/client: Add x-dirty-bitmap to query bitmap from server iotests: add 222 to test basic fleecing blockdev: enable non-root nodes for backup source iscsi: Avoid potential for get_status overflow nbd/server: Fix dirty bitmap logic regression Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2018-07-03 10:47:02 +01:00
Fam Zheng	9ded4a0114	backup: Use copy offloading The implementation is similar to the 'qemu-img convert'. In the beginning of the job, offloaded copy is attempted. If it fails, further I/O will go through the existing bounce buffer code path. Then, as Kevin pointed out, both this and qemu-img convert can benefit from a local check if one request fails because of, for example, the offset is beyond EOF, but another may well be accepted by the protocol layer. This will be implemented separately. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com> Message-id: 20180703023758.14422-4-famz@redhat.com Signed-off-by: Jeff Cody <jcody@redhat.com>	2018-07-02 23:23:45 -04:00
Fam Zheng	dee12de893	block: Honour BDRV_REQ_NO_SERIALISING in copy range This semantics is needed by drive-backup so implement it before using this API there. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com> Message-id: 20180703023758.14422-3-famz@redhat.com Signed-off-by: Jeff Cody <jcody@redhat.com>	2018-07-02 23:23:45 -04:00
Fam Zheng	d4d3e5a0d5	block: Fix parameter checking in bdrv_co_copy_range_internal src may be NULL if BDRV_REQ_ZERO_WRITE flag is set, in this case only check dst and dst->bs. This bug was introduced when moving in the request tracking code from bdrv_co_copy_range, in `37aec7d75e`. This especially fixes the possible segfault when initializing src_bs with a NULL src. Signed-off-by: Fam Zheng <famz@redhat.com> Message-id: 20180703023758.14422-2-famz@redhat.com Reviewed-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com>	2018-07-02 23:23:45 -04:00
Eric Blake	216ee3657e	nbd/client: Add x-dirty-bitmap to query bitmap from server In order to test that the NBD server is properly advertising dirty bitmaps, we need a bare minimum client that can request and read the context. Since feature freeze for 3.0 is imminent, this is the smallest workable patch, which replaces the qemu block status report with the results of the NBD server's dirty bitmap (making it very easy to use 'qemu-img map --output=json' to learn where the dirty portions are). Note that the NBD protocol defines a dirty section with the same bit but opposite sense that normal "base:allocation" uses to report an allocated section; so in qemu-img map output, "data":true corresponds to clean, "data":false corresponds to dirty. A more complete solution that allows dirty bitmaps to be queried at the same time as normal block status will be required before this addition can lose the x- prefix. Until then, the fact that this replaces normal status with dirty status means actions like 'qemu-img convert' will likely misbehave due to treating dirty regions of the file as if they are unallocated. The next patch adds an iotest to exercise this new code. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20180702191458.28741-2-eblake@redhat.com>	2018-07-02 15:27:38 -05:00
Eric Blake	8ee1cef459	iscsi: Avoid potential for get_status overflow Detected by Coverity: Multiplying two 32-bit int and assigning the result to a 64-bit number is a risk of overflow. Prior to the conversion to byte-based interfaces, the block layer took care of ensuring that a status request never exceeded 2G in the driver; but after that conversion, the block layer expects drivers to deal with any size request (the driver can always truncate the request size back down, as long as it makes progress). So, in the off-chance that someone makes a large request, we are at the mercy of whether iscsi_get_lba_status_task() will cap things to at most INT_MAX / iscsilun->block_size when it populates lbasd->num_blocks; since I could not easily audit that, it's better to be safe than sorry by just forcing a 64-bit multiply. Fixes: `92809c36` CC: qemu-stable@nongnu.org Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20180508212718.1482663-1-eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>	2018-07-02 14:28:26 -05:00
Philippe Mathieu-Daudé	f043568f54	vdi: Use definitions from "qemu/units.h" Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Stefan Weil <sw@weilnetz.de> Message-Id: <20180625124238.25339-3-f4bug@amsat.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-07-02 14:45:23 +02:00
Eric Blake	583c99d393	block: Remove unused sector-based vectored I/O We are gradually moving away from sector-based interfaces, towards byte-based. Now that all callers of vectored I/O have been converted to use our preferred byte-based bdrv_co_p{read,write}v(), we can delete the unused bdrv_co_{read,write}v(). Furthermore, this gets rid of the signature difference between the public bdrv_co_writev() and the callback .bdrv_co_writev (the latter still exists, because some drivers still need more work before they are fully byte-based). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-29 14:20:56 +02:00
Eric Blake	3a7404b31e	vhdx: Switch to byte-based calls We are gradually moving away from sector-based interfaces, towards byte-based. Make the change for the last few sector-based calls into the block layer from the vhdx driver. Ideally, the vhdx driver should switch to doing everything byte-based, but that's a more invasive change that requires a bit more auditing. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-29 14:20:56 +02:00
Eric Blake	04a11d87d1	replication: Switch to byte-based calls We are gradually moving away from sector-based interfaces, towards byte-based. Make the change for the last few sector-based calls into the block layer from the replication driver. Ideally, the replication driver should switch to doing everything byte-based, but that's a more invasive change that requires a bit more auditing. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-29 14:20:56 +02:00
Eric Blake	609841a3f8	qcow: Switch to a byte-based driver We are gradually moving away from sector-based interfaces, towards byte-based. The qcow driver is now ready to fully utilize the byte-based callback interface, as long as we override the default alignment to still be 512 (needed at least for asserts present because of encryption, but easier to do everywhere than to audit which sub-sector requests are handled correctly, especially since we no longer recommend qcow for new disk images). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-29 14:20:56 +02:00
Eric Blake	d1326a786d	qcow: Switch qcow_co_writev to byte-based calls We are gradually moving away from sector-based interfaces, towards byte-based. Make the change for the internals of the qcow driver write function, by iterating over offset/bytes instead of sector_num/nb_sectors, and with a rename of index_in_cluster and repurposing of n to track bytes instead of sectors. A later patch will then switch the qcow driver as a whole over to byte-based operation. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-29 14:20:56 +02:00
Eric Blake	a15312b017	qcow: Switch qcow_co_readv to byte-based calls We are gradually moving away from sector-based interfaces, towards byte-based. Make the change for the internals of the qcow driver read function, by iterating over offset/bytes instead of sector_num/nb_sectors, and with a rename of index_in_cluster and repurposing of n to track bytes instead of sectors. A later patch will then switch the qcow driver as a whole over to byte-based operation. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-29 14:20:56 +02:00
Eric Blake	787993a543	qcow: Switch get_cluster_offset to be byte-based We are gradually moving away from sector-based interfaces, towards byte-based. Make the change for the internal helper function get_cluster_offset(), by changing n_start and n_end to be byte offsets rather than sector indices within the cluster being allocated. However, assert that these values are still sector-aligned (at least qcrypto_block_encrypt() still wants that). For now we get that alignment for free because we still use sector-based driver callbacks. A later patch will then switch the qcow driver as a whole over to byte-based operation; but will still leave things at sector alignments as it is not worth auditing the qcow image format to worry about sub-sector requests. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-29 14:20:56 +02:00
Eric Blake	d08c2a245f	parallels: Switch to byte-based calls We are gradually moving away from sector-based interfaces, towards byte-based. Make the change for the last few sector-based calls into the block layer from the parallels driver. Ideally, the parallels driver should switch to doing everything byte-based, but that's a more invasive change that requires a bit more auditing. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-29 14:20:56 +02:00
Fam Zheng	c436e3d014	file-posix: Fix EINTR handling EINTR should be checked against errno, not ret. While fixing the bug, collect the branches with a switch block. Also, change the return value from -ENOTSUP to -ENOSPC when the actual issue is that the request range passes EOF, which should be distinguishable from the case of error == ENOSYS by the caller, so that it could still retry with other byte ranges, whereas it shouldn't retry anymore upon ENOSYS. Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-29 14:20:56 +02:00
Fam Zheng	1439b9c110	iscsi: Don't blindly use designator length in response for memcpy Per SCSI definition the designator_length we receive from INQUIRY is 8, 12 or at most 16, but we should be careful because the remote iscsi target may misbehave, otherwise we could have a buffer overflow. Reported-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-29 14:20:56 +02:00
Fam Zheng	e06f4639d8	qcow2: Fix src_offset in copy offloading Not updating src_offset will result in wrong data being written to dst image. Reported-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-29 14:20:56 +02:00
Kevin Wolf	33d70fb6fa	file-posix: Implement co versions of discard/flush This simplifies file-posix by implementing the coroutine variants of the discard and flush BlockDriver callbacks. These were the last remaining users of paio_submit(), which can be removed now. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2018-06-29 14:20:56 +02:00
Kevin Wolf	8b24cd1415	qcow2: Free allocated clusters on write error If we managed to allocate the clusters, but then failed to write the data, there's a good chance that we'll still be able to free the clusters again in order to avoid cluster leaks (the refcounts are cached, so even if we can't write them out right now, we may be able to do so when the VM is resumed after a werror=stop/enospc pause). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Tested-by: Eric Blake <eblake@redhat.com>	2018-06-29 14:20:56 +02:00
Markus Armbruster	796d323945	block/crypto: Simplify block_crypto_{open,create}_opts_init() block_crypto_open_opts_init() and block_crypto_create_opts_init() contain a virtual visit of QCryptoBlockOptions and QCryptoBlockCreateOptions less member "format", respectively. Change their callers to put member "format" in the QDict, so they can use the generated visitors for these types instead. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-29 14:20:56 +02:00
Fam Zheng	37aec7d75e	block: Move request tracking to children in copy offloading in_flight and tracked requests need to be tracked in every layer during recursion. For now the only user is qemu-img convert where overlapping requests and IOThreads don't exist, therefore this change doesn't make much difference from the user's point of view, but it is incorrect as part of the API. Fix it. Reported-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-29 14:20:56 +02:00
Fam Zheng	354d930dc6	qcow2: Remove dead check on !ret In the beginning of the function, we initialize the local variable to 0, and in the body of the function, we check the assigned values and exit the loop immediately. So here it can never be non-zero. Reported-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-29 14:20:56 +02:00
Kevin Wolf	93f4e2ff4b	file-posix: Make .bdrv_co_truncate asynchronous This moves the code to resize an image file to the thread pool to avoid blocking. Creating large images with preallocation with blockdev-create is now actually a background job instead of blocking the monitor (and most other things) until the preallocation has completed. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2018-06-29 14:20:56 +02:00
Kevin Wolf	1bc5f09f2e	block: Use tracked request for truncate When growing an image, block drivers (especially protocol drivers) may initialise the newly added area. I/O requests to the same area need to wait for this initialisation to be completed so that data writes don't get overwritten and reads don't read uninitialised data. To avoid overhead in the fast I/O path by adding new locking in the protocol drivers and to restrict the impact to requests that actually touch the new area, reuse the existing tracked request infrastructure in block/io.c and mark all truncate requests as serialising. With this change, it is safe for protocol drivers to make .bdrv_co_truncate actually asynchronous. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2018-06-29 14:20:56 +02:00
Kevin Wolf	3d9f2d2af6	block: Move bdrv_truncate() implementation to io.c This moves the bdrv_truncate() implementation from block.c to block/io.c so it can have access to the tracked requests infrastructure. This involves making refresh_total_sectors() public (in block_int.h). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2018-06-29 14:20:56 +02:00
Kevin Wolf	47e86b868d	qcow2: Remove coroutine trampoline for preallocate_co() All callers are coroutine_fns now, so we can just directly call preallocate_co(). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2018-06-29 14:20:56 +02:00
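
The overflow risk described in the iscsi get_status entry above (`8ee1cef459`) can be illustrated with a small model. This is only a sketch in Python, not QEMU's C code: the helper names are hypothetical, and the 32-bit case models the wraparound that is commonly observed, whereas signed overflow in C is formally undefined. It shows why widening before the multiply matters when the result is stored in a 64-bit value.

```python
def wrap_int32(value: int) -> int:
    """Reduce value to the signed 32-bit range (two's complement wraparound)."""
    return (value + 2**31) % 2**32 - 2**31


def length_with_32bit_multiply(num_blocks: int, block_size: int) -> int:
    # Hypothetical model of the risky pattern: the product is formed in
    # 32 bits first, and only the already-wrapped result is widened.
    return wrap_int32(num_blocks * block_size)


def length_with_64bit_multiply(num_blocks: int, block_size: int) -> int:
    # Hypothetical model of the safer pattern: one operand is widened
    # before multiplying, so the product keeps its full value.
    # (Python integers do not overflow, which stands in for int64 here.)
    return num_blocks * block_size


if __name__ == "__main__":
    blocks, size = 2**23, 512  # 4 GiB worth of 512-byte blocks
    print(length_with_32bit_multiply(blocks, size))
    print(length_with_64bit_multiply(blocks, size))
```

Both operands in the example fit comfortably in 32 bits; only their product does not, which is the situation the commit message guards against.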

3914 Commits