Allow RAM MemoryRegion to be created from an offset in a file, instead
of allocating at offset of 0 by default. This is needed to synchronize
RAM between QEMU & remote process.
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 609996697ad8617e3b01df38accc5c208c24d74e.1611938319.git.jag.raman@oracle.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
v2
Dropped vmstate: Fix memory leak in vmstate_handle_alloc
Broke on Power
Added migration: only check page size match if RAM postcopy is enabled
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEERfXHG0oMt/uXep+pBRYzHrxb/ecFAmAhIE4ACgkQBRYzHrxb
/ecPuA/+Pgo++1ZSseJUgbLePwyTVc0jahdcvYEDmLUn8UM6ikBcBXBgUKHdkFW3
bjSSVgB/xxvXSiafBK4xFNrCqSgqMSr3DJcHmvWgv2wVARcYf6Z26Da53LZq1Qru
0tvRyb40Od1f9zb8Zj7e2Y3pjQ9ybLLbjfNhgnOBbQivqWkjZI31oV2KUCWY2+eV
T1BEwr6mgYepqhmeB6OvQZtaQVC5toirS6NajNF4nt0vZEIGIvK6/A9erCVU8Tze
5ch1J0MUqgc3q6ZSE/I9BHEy6MaL0X8G6H+ezjxdoRQtbt1iM/YqZJCSrXkAxiLC
ROohryb6qVk26+UYuana79faLwrw359WlkwNEE6SEIRSENu+6p7bgN3LZuCILCO7
xJEkeTgy6r40IGCkDC9aWa8pyLHpNX9gyLpGBHdIRD6zEOWaKNtzh7E2uo/T0ann
BpcfgQOsYN25hIHiiXnxozUREbx71VDfMq7GqGB6eC3u2+a3U6jpSJb1nNq5NB89
FJYLZy5Rbuy7OStMwfMsxRs7E63XvGgnwrN8FczU/pumCPX4lDYIpnocqinUmP8p
XubRQQVaVDSKIq1mvzw7iR/1NsP9vfYvnrAIv941f38NBmDKqdPuMOXR/qB/Kp2Y
jB7b1L5/JcXbWsQmK7fda9jmPzFwSO2cTeTiUonk9RfuuDEws0A=
=4tbe
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/dgilbert/tags/pull-migration-20210208a' into staging
Migration pull 2021-02-08
v2
Dropped vmstate: Fix memory leak in vmstate_handle_alloc
Broke on Power
Added migration: only check page size match if RAM postcopy is enabled
# gpg: Signature made Mon 08 Feb 2021 11:28:14 GMT
# gpg: using RSA key 45F5C71B4A0CB7FB977A9FA90516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>" [full]
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A 9FA9 0516 331E BC5B FDE7
* remotes/dgilbert/tags/pull-migration-20210208a: (27 commits)
migration: only check page size match if RAM postcopy is enabled
migration: introduce snapshot-{save, load, delete} QMP commands
iotests: fix loading of common.config from tests/ subdir
iotests: add support for capturing and matching QMP events
migration: introduce a delete_snapshot wrapper
migration: wire up support for snapshot device selection
migration: control whether snapshots are ovewritten
block: rename and alter bdrv_all_find_snapshot semantics
block: allow specifying name of block device for vmstate storage
block: add ability to specify list of blockdevs during snapshot
migration: stop returning errno from load_snapshot()
migration: Make save_snapshot() return bool, not 0/-1
block: push error reporting into bdrv_all_*_snapshot functions
migration: Display the migration blockers
migration: Add blocker information
migration: Fix a few absurdly defective error messages
migration: Fix cache_init()'s "Failed to allocate" error messages
migration: Clean up signed vs. unsigned XBZRLE cache-size
migration: Fix migrate-set-parameters argument validation
migration: introduce 'userfaultfd-wrlat.py' script
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
During migrations, after each iteration, cpu_throttle_set() is called,
which irrespective of input, re-arms the timer according to value of
new_throttle_pct. This causes cpu_throttle_thread() to be delayed in
getting scheduled and consqeuntly lets guest run for more time than what
the throttle value should allow. This leads to spikes in guest throughput
at high cpu-throttle percentage whenever cpu_throttle_set() is called.
A solution would be not to modify the timer immediately in
cpu_throttle_set(), instead, only modify throttle_percentage so that the
throttle would automatically adjust to the required percentage when
cpu_throttle_timer_tick() is invoked.
Manually tested the patch using following configuration:
Guest:
Centos7 (3.10.0-123.el7.x86_64)
Total Memory - 64GB , CPUs - 16
Tool used - stress (1.0.4)
Workload - stress --vm 32 --vm-bytes 1G --vm-keep
Migration Parameters:
Network Bandwidth - 500MBPS
cpu-throttle-initial - 99
Results:
With timer_mod(): fails to converge, continues indefinitely
Without timer_mod(): converges in 249 sec
Signed-off-by: Utkarsh Tripathi <utkarsh.tripathi@nutanix.com>
Message-Id: <1609420384-119407-1-git-send-email-utkarsh.tripathi@nutanix.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We passed an is_write flag to the fuzz_dma_read_cb function to
differentiate between the mapped DMA regions that need to be populated
with fuzzed data, and those that don't. We simply passed through the
address_space_map is_write parameter. The goal was to cut down on
unnecessarily populating mapped DMA regions, when they are not read
from.
Unfortunately, nothing precludes code from reading from regions mapped
with is_write=true. For example, see:
https://lists.gnu.org/archive/html/qemu-devel/2021-01/msg04729.html
This patch removes the is_write parameter to fuzz_dma_read_cb. As a
result, we will fill all mapped DMA regions with fuzzed data, ignoring
the specified transfer direction.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-Id: <20210120060255.558535-1-alxndr@bu.edu>
Modify load_snapshot/save_snapshot to accept the device list and vmstate
node name parameters previously added to the block layer.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210204124834.774401-9-berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
None of the callers care about the errno value since there is a full
Error object populated. This gives consistency with save_snapshot()
which already just returns a boolean value.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
[PMD: Return false/true instead of -1/0, document function]
Acked-by: Pavel Dovgalyuk <pavel.dovgalyuk@ispras.ru>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210204124834.774401-4-berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
The platform specific details of mechanisms for implementing
confidential guest support may require setup at various points during
initialization. Thus, it's not really feasible to have a single cgs
initialization hook, but instead each mechanism needs its own
initialization calls in arch or machine specific code.
However, to make it harder to have a bug where a mechanism isn't
properly initialized under some circumstances, we want to have a
common place, late in boot, where we verify that cgs has been
initialized if it was requested.
This patch introduces a ready flag to the ConfidentialGuestSupport
base type to accomplish this, which we verify in
qemu_machine_creation_done().
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Global properties have an @optional field, which allows to apply a given
property to a given type even if one of its subclasses doesn't support
it. This is especially used in the compat code when dealing with the
"disable-modern" and "disable-legacy" properties and the "virtio-pci"
type.
Allow object_register_sugar_prop() to set this field as well.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159738953558.377274.16617742952571083440.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
This will allow us to centralize the registration of
the cpus.c module accelerator operations (in accel/accel-softmmu.c),
and trigger it automatically using object hierarchy lookup from the
new accel_init_interfaces() initialization step, depending just on
which accelerators are available in the code.
Rename all tcg-cpus.c, kvm-cpus.c, etc to tcg-accel-ops.c,
kvm-accel-ops.c, etc, matching the object type names.
Signed-off-by: Claudio Fontana <cfontana@suse.de>
Message-Id: <20210204163931.7358-18-cfontana@suse.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
we cannot in principle make the TCG Operations field definitions
conditional on CONFIG_TCG in code that is included by both common_ss
and specific_ss modules.
Therefore, what we can do safely to restrict the TCG fields to TCG-only
builds, is to move all tcg cpu operations into a separate header file,
which is only included by TCG, target-specific code.
This leaves just a NULL pointer in the cpu.h for the non-TCG builds.
This also tidies up the code in all targets a bit, having all TCG cpu
operations neatly contained by a dedicated data struct.
Signed-off-by: Claudio Fontana <cfontana@suse.de>
Message-Id: <20210204163931.7358-16-cfontana@suse.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
commit 568496c0c0 ("cpu: Add callback to check architectural") and
commit 3826121d92 ("target-arm: Implement checking of fired")
introduced an ARM-specific hack for cpu_check_watchpoint.
Make debug_check_watchpoint optional, and move it to tcg_ops.
Signed-off-by: Claudio Fontana <cfontana@suse.de>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210204163931.7358-15-cfontana@suse.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
commit 4061200059 ("arm: Correctly handle watchpoints for BE32 CPUs")
introduced this ARM-specific, TCG-specific hack to adjust the address,
before checking it with cpu_check_watchpoint.
Make adjust_watchpoint_address optional and move it to tcg_ops.
Signed-off-by: Claudio Fontana <cfontana@suse.de>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210204163931.7358-14-cfontana@suse.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Move blk_exp_close_all() from bdrv_close() to qemu_cleanup(), before
bdrv_drain_all_begin().
Export drivers may have coroutines yielding at some point in the block
layer, so we need to shut them down before draining the block layer,
as otherwise they may get stuck blk_wait_while_drained().
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1900505
Signed-off-by: Sergio Lopez <slp@redhat.com>
Message-Id: <20210201125032.44713-3-slp@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There is currently no way to open(O_RDONLY) and mmap(PROT_READ) when
creating a memory region from a file. This functionality is needed since
the underlying host file may not allow writing.
Add a bool readonly argument to memory_region_init_ram_from_file() and
the APIs it calls.
Extend memory_region_init_ram_from_file() rather than introducing a
memory_region_init_rom_from_file() API so that callers can easily make a
choice between read/write and read-only at runtime without calling
different APIs.
No new RAMBlock flag is introduced for read-only because it's unclear
whether RAMBlocks need to know that they are read-only. Pass a bool
readonly argument instead.
Both of these design decisions can be changed in the future. It just
seemed like the simplest approach to me.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20210104171320.575838-2-stefanha@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The -msg timestamp=on|off option controls whether a timestamp is printed
with error_report() messages. The "-msg" name suggests that this option
has a wider effect than just error_report(). The next patch extends it
to the 'log' trace backend, so rename the variable from
error_with_timestamp to message_with_timestamp.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20210125113507.224287-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
These cases require a bit more thought to review; in each case, the
code was appending to a list, but not with a FOOList **tail variable.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210113221013.390592-6-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Flawed change to qmp_guest_network_get_interfaces() dropped]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Use qemu_opts_parse_noisily now that HMP does not call
vnc_parse anymore.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Message-Id: <20210120144235.345983-4-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When building with GCC 10.2 configured with --extra-cflags=-Os, we get:
softmmu/physmem.c: In function 'address_space_translate_for_iotlb':
softmmu/physmem.c:643:26: error: 'notifier' may be used uninitialized in this function [-Werror=maybe-uninitialized]
643 | notifier->active = true;
| ^
softmmu/physmem.c:608:23: note: 'notifier' was declared here
608 | TCGIOMMUNotifier *notifier;
| ^~~~~~~~
Initialize 'notifier' to silence the warning.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210117170411.4106949-1-f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The possible choices for panic, reset and watchdog actions are inconsistent.
"-action panic=poweroff" should be renamed to "-action panic=shutdown"
on the command line. This is because "-action panic=poweroff" and
"-action watchdog=poweroff" have slightly different semantics, the first
does an unorderly exit while the second goes through qemu_cleanup(). With
this change, -no-shutdown would not have to change "-action panic=pause"
"pause", just like it does not have to change the reset action.
"-action reboot=none" should be renamed to "-action reboot=reset".
This should be self explanatory, since for example "-action panic=none"
lets the guest proceed without taking any action.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Despite it's name it didn't actually clean-up so let us document
gdb_exit() better and use that.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210108224256.2321-9-alex.bennee@linaro.org>
We are shortly going to have a split rw/rx jit buffer. Depending
on the host, we need to flush the dcache at the rw data pointer and
flush the icache at the rx code pointer.
For now, the two passed pointers are identical, so there is no
effective change in behaviour.
Reviewed-by: Joelle van Dyne <j@getutm.app>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
It's common to want to print a human-readable indication of a clock's
frequency. Provide a utility function in the clock API to return a
string which is a displayable representation of the frequency,
and use it in qdev-monitor.c.
Before:
(qemu) info qtree
[...]
dev: xilinx,zynq_slcr, id ""
clock-in "ps_clk" freq_hz=3.333333e+07
mmio 00000000f8000000/0000000000001000
After:
dev: xilinx,zynq_slcr, id ""
clock-in "ps_clk" freq_hz=33.3 MHz
mmio 00000000f8000000/0000000000001000
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Luc Michel <luc@lmichel.fr>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20201215150929.30311-5-peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
This has been a tcg-specific function, but is also in use
by hardware accelerators via physmem.c. This can cause
link errors when tcg is disabled.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Joelle van Dyne <j@getutm.app>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20201214140314.18544-3-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Enable removing tcg/$tcg_arch from the include path when TCG is disabled.
Move translate-all.h to include/exec, since stubs exist for the functions
defined therein.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Due to the renumbering of text consoles when graphical consoles are
created, init_displaystate must be called after all QemuConsoles are
created, i.e. after devices are created.
vl.c calls it from qemu_init_displays, while qmp_x_exit_preconfig is
where devices are created. If qemu_init_displays is called before it,
the VGA graphical console does not come up.
Reported-by: Howard Spoelstra <hsp.cat7@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Silly patch extracted from the next one, which is already big enough.
Because there are already local variables named "accel", we will name
the global vl.c variable for "-M accel" accelerators instead. Rename
it already in configure_accelerators to be ready.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It has been marked as deprecated since QEMU v5.0, replaced by the
corresponding parameter of the -display option.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20201210155808.233895-5-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It has been marked as deprecated since QEMU v4.2, replaced by
the -overcommit option. Time to remove it now.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20201210155808.233895-4-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The '-tb-size' option (replaced by '-accel tcg,tb-size') is
deprecated since 5.0 (commit fe17413247). Remove it.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20201202112714.1223783-1-philmd@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20201210155808.233895-2-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In using the address_space_translate_internal API, address_space_cache_init
forgot one piece of advice that can be found in the code for
address_space_translate_internal:
/* MMIO registers can be expected to perform full-width accesses based only
* on their address, without considering adjacent registers that could
* decode to completely different MemoryRegions. When such registers
* exist (e.g. I/O ports 0xcf8 and 0xcf9 on most PC chipsets), MMIO
* regions overlap wildly. For this reason we cannot clamp the accesses
* here.
*
* If the length is small (as is the case for address_space_ldl/stl),
* everything works fine. If the incoming length is large, however,
* the caller really has to do the clamping through memory_access_size.
*/
address_space_cache_init is exactly one such case where "the incoming length
is large", therefore we need to clamp the resulting length---not to
memory_access_size though, since we are not doing an access yet, but to
the size of the resulting section. This ensures that subsequent accesses
to the cached MemoryRegionSection will be in range.
With this patch, the enclosed testcase notices that the used ring does
not fit into the MSI-X table and prints a "qemu-system-x86_64: Cannot map used"
error.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The current default action of pausing a guest after a panic event
is received leaves the responsibility to resume guest execution to the
management layer. The reasons for this behavior are discussed here:
https://lore.kernel.org/qemu-devel/52148F88.5000509@redhat.com/
However, in instances like the case of older guests (Linux and
Windows) using a pvpanic device but missing support for the
PVPANIC_CRASHLOADED event, and Windows guests using the hv-crash
enlightenment, it is desirable to allow the guests to continue
running after sending a PVPANIC_PANICKED event. This allows such
guests to proceed to capture a crash dump and automatically reboot
without intervention of a management layer.
Add an option to avoid stopping a VM after a panic event is received,
by passing:
-action panic=none
in the command line arguments, or during runtime by using an upcoming
QMP command.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Message-Id: <1607705564-26264-3-git-send-email-alejandro.j.jimenez@oracle.com>
[Do not fix panic action in the variable, instead modify -no-shutdown. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Several command line options currently in use are meant to modify
the behavior of QEMU in response to certain guest events like:
-no-reboot, -no-shutdown, -watchdog-action.
These can be grouped into a single option of the form:
-action event=action
Which can be used to specify the existing options above in the
following format:
-action reboot=none|shutdown
-action shutdown=poweroff|pause
-action watchdog=reset|shutdown|poweroff|pause|debug|none|inject-nmi
This is done in preparation for adding yet another option of this
type, which modifies the QEMU behavior when a guest panic occurs.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Message-Id: <1607705564-26264-2-git-send-email-alejandro.j.jimenez@oracle.com>
[Use QemuOpts help support, invoke QMP command. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a QMP command to allow for the behaviors specified by the
-no-reboot and -no-shutdown command line option to be set at runtime.
The new command is named set-action and takes optional arguments, named
after an event, that provide a corresponding action to take.
Example:
-> { "execute": "set-action",
"arguments": {
"reboot": "none",
"shutdown": "poweroff",
"watchdog": "debug" } }
<- { "return": {} }
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Message-Id: <1607705564-26264-4-git-send-email-alejandro.j.jimenez@oracle.com>
[Split the series differently, with -action based on the QMP command. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Compute the DIRTY_MEMORY_CODE bit in memory_region_get_dirty_log_mask
instead of memory_region_init_*. This makes it possible to allocate
memory backend objects at any time.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
qemu_finish_machine_init currently can only exit QEMU if it fails.
Prepare for giving it proper error propagation, and possibly for
adding a plugin_add monitor command that calls an accelerator
method.
While at it, make all errors from plugin_load look the same.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Machine options can be retrieved as properties of the machine object.
Encourage that by removing the "easy" accessor to machine options.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Generalize the qdev_hotplug variable to the different phases of
machine initialization. We would like to allow different
monitor commands depending on the phase.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
machine_init_done is not the right flag to check when preconfig
is taken into account; for example "./qemu-system-x86_64 -serial
mon:stdio -preconfig" does not print the QEMU monitor header until after
exit_preconfig. Add back a custom bool for mux character devices. This
partially undoes commit c7278b4355 ("chardev: introduce chr_machine_done
hook", 2018-03-12), but it keeps the cleaner logic using a function
pointer in ChardevClass.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
qdev_machine_creation_done is only setting a flag now. Extend it to
move more code out of vl.c. Leave only consistency checks and gdbserver
processing in qemu_machine_creation_done.
gdbserver_start can be moved after qdev_machine_creation_done because
it only does listen on the socket and creates some internal data
structures; it does not send any data (e.g. guest state) over the socket.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Now that there is no RUN_STATE_PRECONFIG anymore that can conflict with
RUN_STATE_INMIGRATE, we can allow -incoming defer with -preconfig.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move post-preconfig initialization to the x-exit-preconfig. If preconfig
is not requested, just exit preconfig mode immediately with the QMP
command.
As a result, the preconfig loop will run with accel_setup_post
and os_setup_post restrictions (xen_restrict, chroot, etc.)
already done.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The preconfig state is only used if -incoming is not specified, which
makes the RunState state machine more tricky than it need be. However
there is already an equivalent condition which works even with -incoming,
namely qdev_hotplug. Use it instead of a separate runstate.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move everything related to Property and PropertyInfo to
qdev-properties.[ch] to make it easier to refactor that code.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20201211220529.2290218-4-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
split up the CpusAccel tcg_cpus into three TCG variants:
tcg_cpus_rr (single threaded, round robin cpus)
tcg_cpus_icount (same as rr, but with instruction counting enabled)
tcg_cpus_mttcg (multi-threaded cpus)
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Claudio Fontana <cfontana@suse.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20201015143217.29337-2-cfontana@suse.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Move more of them into MachineState, in preparation for moving initialization
of the machine out of vl.c.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
serial_hd(i) is NULL if and only if i >= serial_max_hds(). Test
serial_hd(i) instead of bounding the loop at serial_max_hds(),
thus removing one more function that vl.c is expected to export.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Complement the previous patch by starting the VM with a QMP command.
The plan is that the user will be able to do the same, invoking two
commands "finish-machine-init" and "cont" instead of
"x-exit-preconfig".
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Make qemu_start_incoming_migration local to migration/migration.c.
By using the runstate instead of a separate flag, vl need not do
anything to setup deferred incoming migration.
qmp_migrate_incoming also does not need the deferred_incoming flag
anymore, because "-incoming PROTOCOL" will clear the "once" flag
before the main loop starts. Therefore, later invocations of
the migrate-incoming command will fail with the existing
"The incoming migration has already been started" error message.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The check has the same effect here, it only matters that it is performed
once all devices, both builtin and user-specified, have been created.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Displays should be available before the monitor starts, so that
it is possible to use the graphical console to interact with
the monitor itself.
This patch is quite ugly, but all this is temporary. The double
call to qemu_init_displays will go away as soon we can unify machine
initialization between the preconfig and "normal" flows, and so will
the preconfig_exit_requested variable (that is only preconfig_requested
remains).
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is a bit nasty: the machine is storing a string and later
resolving it. We probably want to remove the memdev property
and instead make this a memory-set command. "-M memdev" can be
handled a legacy option that is special cased by
machine_set_property.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
"Early" backends are created before the machine and can be used as
machine options.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move CHECKPOINT_INIT right before the machine initialization is
completed. Everything before is essentially an extension of
command line parsing.
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There is no need to load plugins in the middle of default device processing,
move -plugin handling just before preconfig is entered.
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Create it together with other default backends, even though the processing is
done later.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
qemu_opts_set is used to create default network backends and to
parse sugar options -kernel, -initrd, -append, -bios and -dtb.
These are very different uses:
I would *expect* a function named qemu_opts_set to set an option in a
merge-lists QemuOptsList, such as -kernel, and possibly to set an option
in a non-merge-lists QemuOptsList with non-NULL id, similar to -set.
However, it wouldn't *work* to use qemu_opts_set for the latter
because qemu_opts_set uses fail_if_exists==1. So, for non-merge-lists
QemuOptsList and non-NULL id, the semantics of qemu_opts_set (fail if the
(QemuOptsList, id) pair already exists) are debatable.
On the other hand, I would not expect qemu_opts_set to create a
non-merge-lists QemuOpts with a single option; which it does, though.
For this case of non-merge-lists QemuOptsList and NULL id, qemu_opts_set
hardly adds value over qemu_opts_parse. It does skip some parsing and
unescaping, but that's not needed when creating default network
backends.
So qemu_opts_set has warty behavior for non-merge-lists QemuOptsList
if id is non-NULL, and it's mostly pointless if id is NULL. My
solution to keeping the API as simple as possible is to limit
qemu_opts_set to merge-lists QemuOptsList. For them, it's useful (we
don't want comma-unescaping for -kernel) *and* has sane semantics.
Network backend creation is switched to qemu_opts_parse.
qemu_opts_set is now only used on merge-lists QemuOptsList... except
in the testcase, which is changed to use a merge-list QemuOptsList.
With this change we can also remove the id parameter. With the
parameter always NULL, we know that qemu_opts_create cannot fail
and can pass &error_abort to it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Keep the machine initialization sequence free of miscellaneous command
line parsing actions.
The only difference is that preallocation will always be done with one
thread if -smp is not provided; previously it was using mc->default_cpus,
which is almost always 1 anyway.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Just like -incoming. Later we will add support for "-incoming defer
-preconfig", because there are cases (Xen, block layer) that want
to look at RUNSTATE_INMIGRATE. -loadvm will remain mutually exclusive
with preconfig; the plan is to just do loadvm in the monitor, since
the user is already going to interact with it for preconfiguration.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The final part of qemu_init, starting with the completion of
board init, is already relatively clean. Split it out of
qemu_init so that qemu_init keeps only the messy parts.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Group a bunch of subsystem initializations that can be done right
after command line parsing. Remove initializations that can be done
simply as global variable initializers.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some very simple initialization routines can be nested in existing
subsystem-level functions, do that to simplify qemu_init.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Various options affect the global state of QEMU including the rest of
qemu_init, and they need to be called very early. Group them together
in a function that is called at the beginning.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There is no reason to prevent -preconfig -daemonize. Of course if
no monitor is defined there will be no way to start the VM,
but that is a user error.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Once smp_parse is done, the validation operates on the MachineState.
There is no reason for that code to be in vl.c.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
bios_name was a legacy variable used by machine code, but it is
no more.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20201026143028.3034018-16-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
address_space_write() returns a MemTxResult type.
Do not discard it, return it to the caller.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20201023151923.3243652-5-philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch adds support the kernel-irqchip option for
WHPX with on or off value. 'split' value is not supported
for the option. The option only works for the latest version
of Windows (ones that are coming out on Insiders). The
change maintains backward compatibility on older version of
Windows where this option is not supported.
Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Message-Id: <SN4PR2101MB0880B13258DA9251F8459F4DC0170@SN4PR2101MB0880.namprd21.prod.outlook.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Change to "expects a THING" where that's an obvious improvement
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20201113082626.2725812-11-armbru@redhat.com>
We check that it exist at device creation time, so we don't have to
check anywhere else.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20201118083748.1328-22-quintela@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We don't need to walk the opts by hand. qmp_opt_get() already does
that. And then we can remove the functions that did that walk.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20201118083748.1328-21-quintela@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Just put allthe logic inside the same if.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20201118083748.1328-20-quintela@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Once there, remove not needed cast.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20201118083748.1328-3-quintela@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Device IOTLB invalidations can unmap arbitrary ranges, eiter outside of
the memory region or even [0, ~0ULL] for all the space. The assertion
could be hit by a guest, and rhel7 guest effectively hit it.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20201116165506.31315-6-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This way we can tell between regular IOMMUTLBEntry (entry of IOMMU
hardware) and notifications.
In the notifications, we set explicitly if it is a MAPs or an UNMAP,
instead of trusting in entry permissions to differentiate them.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20201116165506.31315-3-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Previous name didn't reflect the iommu operation.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20201116165506.31315-2-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
It makes no sense to track dirty pages for those un-migratable memory
regions (e.g., Memory BAR region of the VFIO PCI device) and doing so
will potentially lead to some unpleasant issues during migration [1].
Skip dirty tracking for those regions by evaluating if the region is
migratable before setting dirty_log_mask (DIRTY_MEMORY_MIGRATION).
[1] https://lists.gnu.org/archive/html/qemu-devel/2020-11/msg03757.html
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Message-Id: <20201116132210.1730-1-yuzenghui@huawei.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There is no "version 2" of the "Lesser" General Public License.
It is either "GPL version 2.0" or "Lesser GPL version 2.1".
This patch replaces all occurrences of "Lesser GPL version 2" with
"Lesser GPL version 2.1" in comment section.
This patch contains all the files, whose maintainer I could not get
from ‘get_maintainer.pl’ script.
Signed-off-by: Chetan Pant <chetan4windows@gmail.com>
Message-Id: <20201023124424.20177-1-chetan4windows@gmail.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
[thuth: Adapted exec.c and qdev-monitor.c to new location]
Signed-off-by: Thomas Huth <thuth@redhat.com>