qemu/migration
Avihai Horon 4e1871c450 migration: Don't serialize devices in qemu_savevm_state_iterate()
Commit 90697be889 ("live migration: Serialize vmstate saving in stage
2") introduced device serialization in qemu_savevm_state_iterate(). The
rationale behind it was to first complete migration of slower changing
block devices and only then migrate the RAM, to avoid sending fast
changing RAM pages over and over.

That commit was added a long time ago, and while the serialization was
useful back then, it no longer is:
1. Block migration is deprecated, see commit 66db46ca83 ("migration:
   Deprecate block migration").
2. Today there are other iterative devices besides RAM and block, such
   as VFIO, which are registered for migration after RAM. With current
   serialization behavior, a fast changing device can block other
   devices from sending their data, which may prevent migration from
   converging in some cases.

The issue described in item 2 was observed in several VFIO migration
scenarios with switchover-ack capability enabled, where some workload on
the VM prevented RAM from ever reaching a hard zero, thus blocking VFIO
initial pre-copy data from being sent. Hence, the destination could not
ack switchover and migration could not converge.

Fix that by not serializing iterative devices in
qemu_savevm_state_iterate().
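
To illustrate the difference, here is a minimal toy model in C. It is
not the QEMU code: ToyDevice, toy_save_iterate(), toy_state_iterate()
and toy_simulate() are hypothetical names invented for this sketch, and
the byte counts are arbitrary. It assumes a "ram" device whose guest
re-dirties as much data as one pass can send, registered before a "vfio"
device that has a fixed amount of initial data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for an iterative migration device (not a QEMU type). */
typedef struct ToyDevice {
    const char *name;
    uint64_t pending;    /* bytes still waiting to be sent */
    uint64_t dirty_rate; /* bytes re-dirtied by the guest after each pass */
    uint64_t sent;       /* total bytes sent so far */
} ToyDevice;

/* Send up to 'chunk' bytes; return true once nothing is left pending. */
static bool toy_save_iterate(ToyDevice *dev, uint64_t chunk)
{
    uint64_t n = dev->pending < chunk ? dev->pending : chunk;

    dev->pending -= n;
    dev->sent += n;
    return dev->pending == 0;
}

/*
 * One pass over all devices in registration order.
 * With serialize == true the pass stops at the first device that still
 * has pending data (the old behaviour described above). With
 * serialize == false every device gets a turn in every pass.
 */
static void toy_state_iterate(ToyDevice *devs, size_t n, uint64_t chunk,
                              bool serialize)
{
    assert(devs != NULL);

    for (size_t i = 0; i < n; i++) {
        bool done = toy_save_iterate(&devs[i], chunk);

        if (serialize && !done) {
            break;
        }
    }
}

/*
 * Run 'rounds' passes, letting the guest re-dirty data between passes.
 * Returns the number of bytes the second ("vfio") device managed to send.
 */
static uint64_t toy_simulate(bool serialize, unsigned rounds)
{
    ToyDevice devs[] = {
        { .name = "ram",  .pending = 1000, .dirty_rate = 100, .sent = 0 },
        { .name = "vfio", .pending = 100,  .dirty_rate = 0,   .sent = 0 },
    };
    const size_t n = sizeof(devs) / sizeof(devs[0]);
    const uint64_t chunk = 100; /* bytes one device may send per pass */

    for (unsigned r = 0; r < rounds; r++) {
        toy_state_iterate(devs, n, chunk, serialize);
        for (size_t i = 0; i < n; i++) {
            devs[i].pending += devs[i].dirty_rate;
        }
    }
    return devs[1].sent;
}
```

Under these assumptions, toy_simulate(true, 50) returns 0, because the
first device never reaches zero and the second one never gets a turn,
while toy_simulate(false, 50) returns 100, because the second device's
initial data goes out on the first pass.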

Note that this still doesn't fully prevent device starvation. As
correctly pointed out by Peter [1], a fast changing device might
constantly consume all allocated bandwidth and block the following
devices. However, this scenario is likely to happen only if
max-bandwidth is low.

[1] https://lore.kernel.org/qemu-devel/Zd6iw9dBhW6wKNxx@x1n/

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240304105339.20713-2-avihaih@nvidia.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-11 14:41:40 -04:00
block-dirty-bitmap.c Replace "iothread lock" with "BQL" in comments 2024-01-08 10:45:43 -05:00
block.c Replace "iothread lock" with "BQL" in comments 2024-01-08 10:45:43 -05:00
block.h migration: disable auto-converge during bulk block migration 2017-09-27 11:27:14 +01:00
channel-block.c io: follow coroutine AioContext in qio_channel_yield() 2023-09-07 20:32:11 -05:00
channel-block.h migration: introduce a QIOChannel impl for BlockDriverState VMState 2022-06-22 19:33:43 +01:00
channel.c migration: Fix migration_channel_read_peek() error path 2024-01-04 09:52:42 +08:00
channel.h migration: check magic value for deciding the mapping of channels 2023-02-06 19:22:57 +01:00
colo-failover.c migration/colo: Improve an x-colo-lost-heartbeat error message 2023-02-23 14:10:17 +01:00
colo.c Replace "iothread lock" with "BQL" in comments 2024-01-08 10:45:43 -05:00
dirtyrate.c system/cpus: rename qemu_mutex_lock_iothread() to bql_lock() 2024-01-08 10:45:43 -05:00
dirtyrate.h migration/calc-dirty-rate: millisecond-granularity period 2023-10-10 08:03:50 +08:00
exec.c migration: simplify exec migration functions 2024-03-04 07:12:40 +01:00
exec.h migration: convert exec backend to accept MigrateAddress. 2023-11-02 11:35:04 +01:00
fd.c migration/multifd: Add mapped-ram support to fd: URI 2024-03-01 15:42:04 +08:00
fd.h migration/multifd: Add mapped-ram support to fd: URI 2024-03-01 15:42:04 +08:00
file.c migration/multifd: Add mapped-ram support to fd: URI 2024-03-01 15:42:04 +08:00
file.h migration/multifd: Support incoming mapped-ram stream format 2024-03-01 15:42:04 +08:00
global_state.c migration 1st pull for 9.0 2024-01-05 13:35:25 +00:00
meson.build migration: file URI 2023-10-04 13:16:58 +02:00
migration-hmp-cmds.c migration: Plug memory leak on HMP migrate error path 2024-01-29 11:02:12 +08:00
migration-stats.c migration: migration_rate_limit_reset() don't need the QEMUFile 2023-10-31 08:44:33 +01:00
migration-stats.h migration: Remove transferred atomic counter 2023-10-31 08:44:33 +01:00
migration.c migration/multifd: Add mapped-ram support to fd: URI 2024-03-01 15:42:04 +08:00
migration.h migration: stop vm for cpr 2024-02-28 11:31:28 +08:00
multifd-zlib.c migration/multifd: Decouple recv method from pages 2024-03-01 15:42:04 +08:00
multifd-zstd.c migration/multifd: Decouple recv method from pages 2024-03-01 15:42:04 +08:00
multifd.c migration/multifd: Document two places for mapped-ram 2024-03-04 08:31:11 +08:00
multifd.h migration/multifd: Support incoming mapped-ram stream format 2024-03-01 15:42:04 +08:00
options.c migration/multifd: Support outgoing mapped-ram stream format 2024-03-01 15:42:04 +08:00
options.h migration/ram: Introduce 'mapped-ram' migration capability 2024-03-01 15:42:04 +08:00
page_cache.c migration: Fix cache_init()'s "Failed to allocate" error messages 2021-02-08 11:19:51 +00:00
page_cache.h migration: Clean up signed vs. unsigned XBZRLE cache-size 2021-02-08 11:19:51 +00:00
postcopy-ram.c migration: remove error from notifier data 2024-02-28 11:31:28 +08:00
postcopy-ram.h migration: remove error from notifier data 2024-02-28 11:31:28 +08:00
qemu-file.c migration/qemu-file: add utility methods for working with seekable channels 2024-03-01 15:42:04 +08:00
qemu-file.h migration/qemu-file: add utility methods for working with seekable channels 2024-03-01 15:42:04 +08:00
ram-compress.c migration: Rename ram_compressed_pages() to compress_ram_pages() 2023-10-30 17:41:55 +01:00
ram-compress.h migration: Rename ram_compressed_pages() to compress_ram_pages() 2023-10-30 17:41:55 +01:00
ram.c Migartion pull request for 20240304 2024-03-05 11:19:58 +00:00
ram.h migration/multifd: Support outgoing mapped-ram stream format 2024-03-01 15:42:04 +08:00
rdma.c migration/rdma: define htonll/ntohll only if not predefined 2024-01-16 11:16:10 +08:00
rdma.h migration: convert rdma backend to accept MigrateAddress 2023-11-02 11:35:03 +01:00
savevm.c migration: Don't serialize devices in qemu_savevm_state_iterate() 2024-03-11 14:41:40 -04:00
savevm.h migration: Add .save_prepare() handler to struct SaveVMHandlers 2023-09-11 08:34:06 +02:00
socket.c migration/multifd: Drop unnecessary helper to destroy IOC 2024-02-28 11:31:28 +08:00
socket.h migration/multifd: Drop unnecessary helper to destroy IOC 2024-02-28 11:31:28 +08:00
target.c migration: Add migration prefix to functions in target.c 2023-09-11 08:34:06 +02:00
threadinfo.c migration/multifd: Protect accesses to migration_threads 2023-07-26 10:55:56 +02:00
threadinfo.h migration/multifd: Protect accesses to migration_threads 2023-07-26 10:55:56 +02:00
tls.c migration: Drop unused parameter for migration_tls_client_create() 2023-05-03 11:24:20 +02:00
tls.h migration: Drop unused parameter for migration_tls_client_create() 2023-05-03 11:24:20 +02:00
trace-events migration/multifd: Cleanup multifd_recv_sync_main 2024-03-01 15:42:04 +08:00
trace.h trace: switch position of headers to what Meson requires 2020-08-21 06:18:24 -04:00
vmstate-types.c Move CPU softfloat unions to cpu-float.h 2022-04-06 14:31:43 +02:00
vmstate.c migration: Make VMStateDescription.subsections const 2023-12-29 11:17:30 +11:00
xbzrle.c migration/xbzrle: Use i386 host/cpuinfo.h 2023-05-23 16:51:18 -07:00
xbzrle.h migration/xbzrle: Use i386 host/cpuinfo.h 2023-05-23 16:51:18 -07:00
yank_functions.c migration/yank: Use channel features 2024-01-29 11:02:12 +08:00
yank_functions.h migration: Move the yank unregister of channel_close out 2021-07-26 12:45:03 +01:00