MigrateChannelList allows to connect accross multiple interfaces.
Add MigrateChannelList struct as argument to migration QAPIs.
We plan to include multiple channels in future, to connnect
multiple interfaces. Hence, we choose 'MigrateChannelList'
as the new argument over 'MigrateChannel' to make migration
QAPIs future proof.
Suggested-by: Aravind Retnakaran <aravind.retnakaran@nutanix.com>
Signed-off-by: Het Gala <het.gala@nutanix.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231023182053.8711-10-farosas@suse.de>
This patch introduces well defined MigrateAddress struct
and its related child objects.
The existing argument of 'migrate' and 'migrate-incoming' QAPI
- 'uri' is of type string. The current implementation follows
double encoding scheme for fetching migration parameters like
'uri' and this is not an ideal design.
Motive for intoducing struct level design is to prevent double
encoding of QAPI arguments, as Qemu should be able to directly
use the QAPI arguments without any level of encoding.
Note: this commit only adds the type, and actual uses comes
in later commits.
Fabiano fixed for "file" transport.
Suggested-by: Aravind Retnakaran <aravind.retnakaran@nutanix.com>
Signed-off-by: Het Gala <het.gala@nutanix.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231023182053.8711-2-farosas@suse.de>
Message-Id: <20231023182053.8711-3-farosas@suse.de>
Add the cpr-reboot migration mode. Usage:
$ qemu-system-$arch -monitor stdio ...
QEMU 8.1.50 monitor - type 'help' for more information
(qemu) migrate_set_capability x-ignore-shared on
(qemu) migrate_set_parameter mode cpr-reboot
(qemu) migrate -d file:vm.state
(qemu) info status
VM status: paused (postmigrate)
(qemu) quit
$ qemu-system-$arch -monitor stdio -incoming defer ...
QEMU 8.1.50 monitor - type 'help' for more information
(qemu) migrate_set_capability x-ignore-shared on
(qemu) migrate_set_parameter mode cpr-reboot
(qemu) migrate_incoming file:vm.state
(qemu) info status
VM status: running
In this mode, the migrate command saves state to a file, allowing one
to quit qemu, reboot to an updated kernel, and restart an updated version
of qemu. The caller must specify a migration URI that writes to and reads
from a file. Unlike normal mode, the use of certain local storage options
does not block the migration, but the caller must not modify guest block
devices between the quit and restart. To avoid saving guest RAM to the
file, the memory backend must be shared, and the @x-ignore-shared migration
capability must be set. Guest RAM must be non-volatile across reboot, such
as by backing it with a dax device, but this is not enforced. The restarted
qemu arguments must match those used to initially start qemu, plus the
-incoming option.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <1698263069-406971-6-git-send-email-steven.sistare@oracle.com>
Create a mode migration parameter that can be used to select alternate
migration algorithms. The default mode is normal, representing the
current migration algorithm, and does not need to be explicitly set.
No functional change until a new mode is added, except that the mode is
shown by the 'info migrate' command.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <1698263069-406971-2-git-send-email-steven.sistare@oracle.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231018115513.2163-6-quintela@redhat.com>
It is obsolete. It is better to use driver-mirror with NBD instead.
CC: Kevin Wolf <kwolf@redhat.com>
CC: Eric Blake <eblake@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Hanna Czenczek <hreitz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231018115513.2163-5-quintela@redhat.com>
Use blocked-mirror with NBD instead.
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231018115513.2163-4-quintela@redhat.com>
Use blockdev-mirror with NBD instead.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231018115513.2163-3-quintela@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231013104736.31722-2-quintela@redhat.com>
Migration bandwidth is a very important value to live migration. It's
because it's one of the major factors that we'll make decision on when to
switchover to destination in a precopy process.
This value is currently estimated by QEMU during the whole live migration
process by monitoring how fast we were sending the data. This can be the
most accurate bandwidth if in the ideal world, where we're always feeding
unlimited data to the migration channel, and then it'll be limited to the
bandwidth that is available.
However in reality it may be very different, e.g., over a 10Gbps network we
can see query-migrate showing migration bandwidth of only a few tens of
MB/s just because there are plenty of other things the migration thread
might be doing. For example, the migration thread can be busy scanning
zero pages, or it can be fetching dirty bitmap from other external dirty
sources (like vhost or KVM). It means we may not be pushing data as much
as possible to migration channel, so the bandwidth estimated from "how many
data we sent in the channel" can be dramatically inaccurate sometimes.
With that, the decision to switchover will be affected, by assuming that we
may not be able to switchover at all with such a low bandwidth, but in
reality we can.
The migration may not even converge at all with the downtime specified,
with that wrong estimation of bandwidth, keeping iterations forever with a
low estimation of bandwidth.
The issue is QEMU itself may not be able to avoid those uncertainties on
measuing the real "available migration bandwidth". At least not something
I can think of so far.
One way to fix this is when the user is fully aware of the available
bandwidth, then we can allow the user to help providing an accurate value.
For example, if the user has a dedicated channel of 10Gbps for migration
for this specific VM, the user can specify this bandwidth so QEMU can
always do the calculation based on this fact, trusting the user as long as
specified. It may not be the exact bandwidth when switching over (in which
case qemu will push migration data as fast as possible), but much better
than QEMU trying to wildly guess, especially when very wrong.
A new parameter "avail-switchover-bandwidth" is introduced just for this.
So when the user specified this parameter, instead of trusting the
estimated value from QEMU itself (based on the QEMUFile send speed), it
trusts the user more by using this value to decide when to switchover,
assuming that we'll have such bandwidth available then.
Note that specifying this value will not throttle the bandwidth for
switchover yet, so QEMU will always use the full bandwidth possible for
sending switchover data, assuming that should always be the most important
way to use the network at that time.
This can resolve issues like "unconvergence migration" which is caused by
hilarious low "migration bandwidth" detected for whatever reason.
Reported-by: Zhiyi Guo <zhguo@redhat.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231010221922.40638-1-peterx@redhat.com>
Display it as long as being set, irrelevant of FAILED status. E.g., it may
also be applicable to PAUSED stage of postcopy, to provide hint on what has
gone wrong.
The error_mutex seems to be overlooked when referencing the error, add it
to be very safe.
This will change QAPI behavior by showing up error message outside !FAILED
status, but it's intended and doesn't expect to break anyone.
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2018404
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231004220240.167175-2-peterx@redhat.com>
Currently query-dirty-rate uses QEMU_CLOCK_REALTIME as
the source for start-time field. This translates to
clock_gettime(CLOCK_MONOTONIC), i.e. number of seconds
since host boot. This is not very useful. The only
reasonable use case of start-time I can imagine is to
check whether previously completed measurements are
too old or not. But this makes sense only if start-time
is reported as host wall-clock time.
This patch replaces source of start-time from
QEMU_CLOCK_REALTIME to QEMU_CLOCK_HOST.
Signed-off-by: Andrei Gudkov <gudkov.andrei@huawei.com>
Reviewed-by: Hyman Huang <yong.huang@smartx.com>
Message-Id: <399861531e3b24a1ecea2ba453fb2c3d129fb03a.1693905328.git.gudkov.andrei@huawei.com>
Signed-off-by: Hyman Huang <yong.huang@smartx.com>
Reformat the dirty-limit migration doc comments to conform
to current conventions as commit a937b6aa73 (qapi: Reformat
doc comments to conform to current conventions).
Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
Message-ID: <169073570563.19893.2928364761104733482-1@git.sr.ht>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Whitespace tidied up]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Since commit a937b6aa73 (qapi: Reformat doc comments to conform to
current conventions), a number of comments not conforming to the
current formatting conventions were added. No problem, just sweep
the entire documentation once more.
To check the generated documentation does not change, I compared the
generated HTML before and after this commit with "wdiff -3". Finds no
differences. Comparing with diff is not useful, as the reflown
paragraphs are visible there.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20230720071610.1096458-7-armbru@redhat.com>
Has return zero for more than 10 years.
Specifically we introduced the field in 1.5.0
commit f1c72795af
Author: Peter Lieven <pl@kamp.de>
Date: Tue Mar 26 10:58:37 2013 +0100
migration: do not sent zero pages in bulk stage
during bulk stage of ram migration if a page is a
zero page do not send it at all.
the memory at the destination reads as zero anyway.
even if there is an madvise with QEMU_MADV_DONTNEED
at the target upon receipt of a zero page I have observed
that the target starts swapping if the memory is overcommitted.
it seems that the pages are dropped asynchronously.
this patch also updates QMP to return the number of
skipped pages in MigrationStats.
but removed its usage in 1.5.3
commit 9ef051e553
Author: Peter Lieven <pl@kamp.de>
Date: Mon Jun 10 12:14:19 2013 +0200
Revert "migration: do not sent zero pages in bulk stage"
Not sending zero pages breaks migration if a page is zero
at the source but not at the destination. This can e.g. happen
if different BIOS versions are used at source and destination.
It has also been reported that migration on pseries is completely
broken with this patch.
This effectively reverts commit f1c72795af.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20230612193344.3796-2-quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Extend query-migrate to provide throttle time and estimated
ring full time with dirty-limit capability enabled, through which
we can observe if dirty limit take effect during live migration.
Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-ID: <168733225273.5845.15871826788879741674-8@git.sr.ht>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Introduce migration dirty-limit capability, which can
be turned on before live migration and limit dirty
page rate durty live migration.
Introduce migrate_dirty_limit function to help check
if dirty-limit capability enabled during live migration.
Meanwhile, refactor vcpu_dirty_rate_stat_collect
so that period can be configured instead of hardcoded.
dirty-limit capability is kind of like auto-converge
but using dirty limit instead of traditional cpu-throttle
to throttle guest down. To enable this feature, turn on
the dirty-limit capability before live migration using
migrate-set-capabilities, and set the parameters
"x-vcpu-dirty-limit-period", "vcpu-dirty-limit" suitably
to speed up convergence.
Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <168618975839.6361.17407633874747688653-4@git.sr.ht>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Introduce "vcpu-dirty-limit" migration parameter used
to limit dirty page rate during live migration.
"vcpu-dirty-limit" and "x-vcpu-dirty-limit-period" are
two dirty-limit-related migration parameters, which can
be set before and during live migration by qmp
migrate-set-parameters.
This two parameters are used to help implement the dirty
page rate limit algo of migration.
Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <168618975839.6361.17407633874747688653-3@git.sr.ht>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Introduce "x-vcpu-dirty-limit-period" migration experimental
parameter, which is in the range of 1 to 1000ms and used to
make dirtyrate calculation period configurable.
Currently with the "x-vcpu-dirty-limit-period" varies, the
total time of live migration changes, test results show the
optimal value of "x-vcpu-dirty-limit-period" ranges from
500ms to 1000 ms. "x-vcpu-dirty-limit-period" should be made
stable once it proves best value can not be determined with
developer's experiments.
Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <168618975839.6361.17407633874747688653-2@git.sr.ht>
Signed-off-by: Juan Quintela <quintela@redhat.com>
So all the file is consistent.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20230612191604.2219-1-quintela@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Rewrote calc-dirty-rate documentation. Briefly described
different modes of dirty page rate measurement. Added some
examples. Fixed obvious grammar errors.
Signed-off-by: Andrei Gudkov <gudkov.andrei@huawei.com>
Message-Id: <fe7d32a621ebd69ef6974beb2499c0b5dccb9e19.1684854849.git.gudkov.andrei@huawei.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
[Prose tweaked and spacing corrected, as per review]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Migration downtime estimation is calculated based on bandwidth and
remaining migration data. This assumes that loading of migration data in
the destination takes a negligible amount of time and that downtime
depends only on network speed.
While this may be true for RAM, it's not necessarily true for other
migrated devices. For example, loading the data of a VFIO device in the
destination might require from the device to allocate resources, prepare
internal data structures and so on. These operations can take a
significant amount of time which can increase migration downtime.
This patch adds a new capability "switchover ack" that prevents the
source from stopping the VM and completing the migration until an ACK
is received from the destination that it's OK to do so.
This can be used by migrated devices in various ways to reduce downtime.
For example, a device can send initial precopy metadata to pre-allocate
resources in the destination and use this capability to make sure that
the pre-allocation is completed before the source VM is stopped, so it
will have full effect.
This new capability relies on the return path capability to communicate
from the destination back to the source.
The actual implementation of the capability will be added in the
following patches.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Tested-by: YangHang Liu <yanghliu@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
migrate_ignore_shared() is an optimization that avoids copying memory
that is visible and can be mapped on the target. However, a
memory-backend-ram or a memory-backend-memfd block with the RAM_SHARED
flag set is not migrated when migrate_ignore_shared() is true. This is
wrong, because the block has no named backing store, and its contents will
be lost. To fix, ignore shared memory iff it is a named file. Define a
new flag RAM_NAMED_FILE to distinguish this case.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <1686151116-253260-1-git-send-email-steven.sistare@oracle.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
We don't allow to use x-colo capability when replication is not
configured. So, no reason to build COLO when replication is disabled,
it's unusable in this case.
Note also that the check in migrate_caps_check() is not the only
restriction: some functions in migration/colo.c will just abort if
called with not defined CONFIG_REPLICATION, for example:
migration_iteration_finish()
case MIGRATION_STATUS_COLO:
migrate_start_colo_process()
colo_process_checkpoint()
abort()
It could probably make sense to have possibility to enable COLO without
REPLICATION, but this requires deeper audit of colo & replication code,
which may be done later if needed.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Acked-by: Dr. David Alan Gilbert <dave@treblig.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20230428194928.1426370-4-vsementsov@yandex-team.ru>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Change
# @name: Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed
# do eiusmod tempor incididunt ut labore et dolore magna aliqua.
to
# @name: Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed
# do eiusmod tempor incididunt ut labore et dolore magna aliqua.
See recent commit "qapi: Relax doc string @name: description
indentation rules" for rationale.
Reflow paragraphs to 70 columns width, and consistently use two spaces
to separate sentences.
To check the generated documentation does not change, I compared the
generated HTML before and after this commit with "wdiff -3". Finds no
differences. Comparing with diff is not useful, as the reflown
paragraphs are visible there.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20230428105429.1687850-18-armbru@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Acked-by: Lukas Straub <lukasstraub2@web.de>
[Straightforward conflicts in qapi/audio.json qapi/misc-target.json
qapi/run-state.json resolved]
MigrateSetParameters has a TODO comment sitting right behind its doc
comment. I wrote it this way to keep it out of the manual, but that
reason is not obvious.
The previous commit (sphinx/qapidoc: Do not emit TODO sections into
user manuals) lets me move it into the doc comment as a TODO section.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20230428105429.1687850-8-armbru@redhat.com>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Section tags are case sensitive and end with a colon. Screwing up
either gets them interpreted as ordinary paragraph. Fix a few.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230425064223.820979-15-armbru@redhat.com>
A few examples neglect to prefix QMP input with '->'. Fix that.
Two examples have extra space after '<-'. Delete it.
A few examples neglect to show output. Provide some. The example
output for query-vcpu-dirty-limit could use further improvement. Add
a TODO comment.
Use "Examples:" instead of "Example:" where multiple examples are
given.
One example section numbers its two examples. Not done elsewhere;
drop.
Another example section separates them with "or". Likewise.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230425064223.820979-8-armbru@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
query-cpu-definitions returns a list of CpuDefinitionInfo, but
documentation claims CpuDefInfo, which doesn't exist.
query-migrate-capabilities returns a list of
MigrationCapabilityStatus, but documentation claims
MigrationCapabilitiesStatus, which doesn't exist.
balloon and query-balloon can fail with KVMMissingCap, but
documentation claims KvmMissingCap, which doesn't exist.
Fix the documentation.
Fixes: e4e31c6324 (qapi: add query-cpu-definitions command (v2))
Fixes: bbf6da32b5 (Add migration capabilities)
Fixes: d72f326431 (qapi: Convert balloon)
Fixes: 96637bcdf9 (qapi: Convert query-balloon)
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20230425064223.820979-4-armbru@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
It has nothing to do with migration, except for the "migrate" in the
name of the command. Move it with the rest of the ui commands.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Introduce interface query-migrationthreads. The interface is used
to query information about migration threads and returns with
migration thread's name and its id.
Introduce threadinfo.c to manage threads with migration.
Signed-off-by: Jiang Jiacheng <jiangjiacheng@huawei.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
I've used real timestamp and changing them one by one so they would
not be all equal.
Problem was noticed when using the example as a test case for Go
bindings.
Signed-off-by: Victor Toso <victortoso@redhat.com>
Message-Id: <20220901085840.22520-11-victortoso@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Leonardo Bras <leobras@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20220711211112.18951-3-leobras@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Firstly, postcopy already preempts precopy due to the fact that we do
unqueue_page() first before looking into dirty bits.
However that's not enough, e.g., when there're host huge page enabled, when
sending a precopy huge page, a postcopy request needs to wait until the whole
huge page that is sending to finish. That could introduce quite some delay,
the bigger the huge page is the larger delay it'll bring.
This patch adds a new capability to allow postcopy requests to preempt existing
precopy page during sending a huge page, so that postcopy requests can be
serviced even faster.
Meanwhile to send it even faster, bypass the precopy stream by providing a
standalone postcopy socket for sending requested pages.
Since the new behavior will not be compatible with the old behavior, this will
not be the default, it's enabled only when the new capability is set on both
src/dst QEMUs.
This patch only adds the capability itself, the logic will be added in follow
up patches.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20220707185342.26794-2-peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Implement dirtyrate calculation periodically basing on
dirty-ring and throttle virtual CPU until it reachs the quota
dirty page rate given by user.
Introduce qmp commands "set-vcpu-dirty-limit",
"cancel-vcpu-dirty-limit", "query-vcpu-dirty-limit"
to enable, disable, query dirty page limit for virtual CPU.
Meanwhile, introduce corresponding hmp commands
"set_vcpu_dirty_limit", "cancel_vcpu_dirty_limit",
"info vcpu_dirty_limit" so the feature can be more usable.
"query-vcpu-dirty-limit" success depends on enabling dirty
page rate limit, so just add it to the list of skipped
command to ensure qmp-cmd-test run successfully.
Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <4143f26706d413dd29db0b672fe58b3d3fbe34bc.1656177590.git.huangy81@chinatelecom.cn>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
When originally implemented, zero_copy_send was designed as a Migration
paramenter.
But taking into account how is that supposed to work, and how
the difference between a capability and a parameter, it only makes sense
that zero-copy-send would work better as a capability.
Taking into account how recently the change got merged, it was decided
that it's still time to make it right, and convert zero_copy_send into
a Migration capability.
Signed-off-by: Leonardo Bras <leobras@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
dgilbert: always define the capability, even on non-Linux but error if
set; avoids build problems with the capability
(This replaces the 28th April through 10th May sets)
Compared to that last set it just has the Alpine
uring check that Leo has added; although that's also
now fixed upstream in Alpine.
It contains:
TLS test fixes from Dan
Zerocopy migration feature from Leo
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEERfXHG0oMt/uXep+pBRYzHrxb/ecFAmKCY80ACgkQBRYzHrxb
/eeEEhAAoUogch7ifxFItr1EA0AU6Sgd3Dcn8wY9pm0NySVg7OcIpk1H++A3CgIh
bubJSwRmpIxGw+5q5w5OvBukFCGYMlAK7J8k1tZmaqdKS8wD0ZwhpPyqTWd14Q/v
xXSGOQfHMMvbBILiXPjSkfNw8yKJhZr+lW39uMz/kZRwZUmTcrdKAT3Q8PW+1DI9
v3mNoFNXqtDlHcQ4nQ1TGk/RDO6oXDlTJwdnjoJT3Dopf8Jhl2etvZgVk2kOf4i5
LmJbSVBr5FNOhJ6P4WL4OEQFOiXXquKdfuGTXIGGhkrW2WkPZulQwB6uO4Gv1wf2
aj9bLDAFoPxFx2zYS6S/9L6rGeBMcTL9xHCfzyylM6YRjoscRdxXc67PClw71JUy
regsoSQej0FpmsGx0uuAsDjCELleVIjeYzuQo5OYOP1BCg/5unLIrMgkyQw7COJI
w+MIZq7IqvUTehU2yXpUGOqPkyDLBlib92dMRgqqG9r9UU7iL3BREbGW4ugW+GM2
a9k8W9HjyDIIODsdXy1ugPHgjr/arHDAPgYosJMLvjTfdJDcIldAw6CbCcqhCDES
UOjMVN9VS+716nY2AqvtEHxf47YwqmeRb+tg4SQ0dHLH5Pvfe2bk1sbZiiQpcelt
Bd88yeBOpcmdzJVur2V4fEZXu5JB/qt/jeJeQa82hS3k93PWm/w=
=Axhk
-----END PGP SIGNATURE-----
Merge tag 'pull-migration-20220516a' of https://gitlab.com/dagrh/qemu into staging
Migration pull 2022-05-16
(This replaces the 28th April through 10th May sets)
Compared to that last set it just has the Alpine
uring check that Leo has added; although that's also
now fixed upstream in Alpine.
It contains:
TLS test fixes from Dan
Zerocopy migration feature from Leo
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEERfXHG0oMt/uXep+pBRYzHrxb/ecFAmKCY80ACgkQBRYzHrxb
# /eeEEhAAoUogch7ifxFItr1EA0AU6Sgd3Dcn8wY9pm0NySVg7OcIpk1H++A3CgIh
# bubJSwRmpIxGw+5q5w5OvBukFCGYMlAK7J8k1tZmaqdKS8wD0ZwhpPyqTWd14Q/v
# xXSGOQfHMMvbBILiXPjSkfNw8yKJhZr+lW39uMz/kZRwZUmTcrdKAT3Q8PW+1DI9
# v3mNoFNXqtDlHcQ4nQ1TGk/RDO6oXDlTJwdnjoJT3Dopf8Jhl2etvZgVk2kOf4i5
# LmJbSVBr5FNOhJ6P4WL4OEQFOiXXquKdfuGTXIGGhkrW2WkPZulQwB6uO4Gv1wf2
# aj9bLDAFoPxFx2zYS6S/9L6rGeBMcTL9xHCfzyylM6YRjoscRdxXc67PClw71JUy
# regsoSQej0FpmsGx0uuAsDjCELleVIjeYzuQo5OYOP1BCg/5unLIrMgkyQw7COJI
# w+MIZq7IqvUTehU2yXpUGOqPkyDLBlib92dMRgqqG9r9UU7iL3BREbGW4ugW+GM2
# a9k8W9HjyDIIODsdXy1ugPHgjr/arHDAPgYosJMLvjTfdJDcIldAw6CbCcqhCDES
# UOjMVN9VS+716nY2AqvtEHxf47YwqmeRb+tg4SQ0dHLH5Pvfe2bk1sbZiiQpcelt
# Bd88yeBOpcmdzJVur2V4fEZXu5JB/qt/jeJeQa82hS3k93PWm/w=
# =Axhk
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 16 May 2022 07:46:37 AM PDT
# gpg: using RSA key 45F5C71B4A0CB7FB977A9FA90516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>" [full]
* tag 'pull-migration-20220516a' of https://gitlab.com/dagrh/qemu:
multifd: Implement zero copy write in multifd migration (multifd-zero-copy)
multifd: Send header packet without flags if zero-copy-send is enabled
multifd: multifd_send_sync_main now returns negative on error
migration: Add migrate_use_tls() helper
migration: Add zero-copy-send parameter for QMP/HMP for Linux
QIOChannelSocket: Implement io_writev zero copy flag & io_flush for CONFIG_LINUX
QIOChannel: Add flags on io_writev and introduce io_flush callback
meson.build: Fix docker-test-build@alpine when including linux/errqueue.h
tests: ensure migration status isn't reported as failed
tests: add multifd migration tests of TLS with x509 credentials
tests: add multifd migration tests of TLS with PSK credentials
tests: convert multifd migration tests to use common helper
tests: convert XBZRLE migration test to use common helper
tests: add migration tests of TLS with x509 credentials
tests: add migration tests of TLS with PSK credentials
tests: add more helper macros for creating TLS x509 certs
tests: fix encoding of IP addresses in x509 certs
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add property that allows zero-copy migration of memory pages
on the sending side, and also includes a helper function
migrate_use_zero_copy_send() to check if it's enabled.
No code is introduced to actually do the migration, but it allow
future implementations to enable/disable this feature.
On non-Linux builds this parameter is compiled-out.
Signed-off-by: Leonardo Bras <leobras@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20220513062836.965425-5-leobras@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Perfectly aligned things look pretty, but keeping them that
way as the schema evolves requires churn, and in some cases
newly-added lines are not aligned properly.
Overall, trying to align things is just not worth the trouble.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Message-Id: <20220503073737.84223-8-abologna@redhat.com>
Message-Id: <20220503073737.84223-9-abologna@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Two patches squashed together]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20220503073737.84223-5-abologna@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
This only affects readability. The generated documentation
doesn't change.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20220503073737.84223-4-abologna@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
The correct return type is ReplicationStatus, not
ReplicationResult.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Message-Id: <20220420153408.243584-3-abologna@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
The example shows {"command": ...}, which is wrong. Fix it to
{"execute": ...}.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20220401082028.3583296-1-armbru@redhat.com>
Reviewed-by: Victor Toso <victortoso@redhat.com>
The example output is missing the mandatory member @last-mode in the
return value. Fix it.
Signed-off-by: Victor Toso <victortoso@redhat.com>
Message-Id: <20220331190633.121077-7-victortoso@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Example output lacks mandatory member @timestamp. Provide it.
Example output is not properly formatted. Fixing it by:
- Adding '<-' to signalize it is receiving the data;
- Breaking lines similar to the other examples.
Signed-off-by: Victor Toso <victortoso@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-Id: <20220328140604.41484-8-victortoso@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
The examples for the snapshot-* and calc-dirty-rate commands document
that arguments for the commands are passed in a 'data' field.
This is wrong, passing them in a "data" field results in
the error:
{"error": {"class": "GenericError", "desc": "QMP input member 'data'
is unexpected"}}
Arguments are expected to be passed in an field called "arguments".
Replace "data" with "arguments" in the snapshot-* and calc-dirty-rate
command examples.
Signed-off-by: Fabian Holler <fabian.holler@simplesurance.de>
Message-Id: <20220222170116.63105-1-fabian.holler@simplesurance.de>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Provide information on the number of bytes copied in the pre-copy,
downtime and post-copy phases of migration.
Signed-off-by: David Edmondson <david.edmondson@oracle.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>