The shpc component is optional while ACPI hotplug is used
for hot-plugging PCI devices into a PCI-PCI bridge.
Disabling the shpc by default will make slot 0 usable at boot time
and not only for hot-plug, without loosing any functionality.
Older machines will have shpc enabled for compatibility reasons.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
msix_init() reports errors with error_report(), which is wrong when
it's used in realize(). The same issue was fixed for msi_init() in
commit 1108b2f. In order to make the API change as small as possible,
leave the return value check to later patch.
For some devices(like e1000e, vmxnet3, nvme) who won't fail because of
msix_init's failure, suppress the error report by passing NULL error
object.
Bonus: add comment for msix_init.
CC: Jiri Pirko <jiri@resnulli.us>
CC: Gerd Hoffmann <kraxel@redhat.com>
CC: Dmitry Fleytman <dmitry@daynix.com>
CC: Jason Wang <jasowang@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Hannes Reinecke <hare@suse.de>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Alex Williamson <alex.williamson@redhat.com>
CC: Markus Armbruster <armbru@redhat.com>
CC: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The Generic Root Port behaves almost the same as the
Intel's IOH device with id 3420, without having
Intel specific attributes.
The device has two purposes:
(1) Can be used on both X86 and ARM machines.
(2) It will allow us to tweak the behaviour
(e.g add vendor-specific PCI capabilities)
- something that obviously cannot be done
on a known device.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Andrea Bolognani <abologna@redhat.com>
The 'base' PCI Express Root Port includes
the common code to be re-used for all
Root Ports implementations. Most of the code
was taken from the current implementation
of Intel's IOH 3420 Root Port.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
It's a familiar pattern: some code uses ARRAY_SIZE, then refactoring
changes the argument from an array to a pointer to a dynamically
allocated buffer. Code keeps compiling but any ARRAY_SIZE calls now
return the size of the pointer divided by element size.
Let's add build time checks to ARRAY_SIZE before we allow more
of these in the code-base.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
QEMU_BUILD_BUG_ON uses a typedef in order to be safe
to use outside functions, but sometimes it's useful
to have a version that can be used within an expression.
Following what Linux does, introduce QEMU_BUILD_BUG_ON_ZERO
that return zero after checking condition at build time.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
There are theoretical concerns that some compilers might not trigger
build failures on attempts to define an array of size (x ? -1 : 1) where
x is a variable and make it a variable sized array instead. Let rewrite
using a struct with a negative bit field size instead as there are no
dynamic bit field sizes. This is similar to what Linux does.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Some headers use QEMU_BUILD_BUG_ON. This causes a problem
if the C file including that header happens to have
QEMU_BUILD_BUG_ON at the same line number.
Fix using a widely available extension: __COUNTER__.
If unavailable, provide a stub.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Those could probably be squashed with earlier patches, however I
couldn't easily identify them, test them or check if there are still
necessary on various platforms.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
This define is used by several character devices, place it in char
common header.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
A mechanical move, except that qemu_chr_write_all() needs to be declared
in char.h header to be used from chardev unit files.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Introduce rules in the top level Makefile that are able to generate
trace.[ch] files in every subdirectory which has a trace-events file.
The top level directory is handled specially, so instead of creating
trace.h, it creates trace-root.h. This allows sub-directories to
include the top level trace-root.h file, without ambiguity wrt to
the trace.g file in the current sub-dir.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170125161417.31949-7-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
All users include the trailing ; anyway, let's require that -
it seems cleaner.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
The class kind is necessary to lookup the chardev name in
qmp_chardev_add() after calling qemu_chr_new_from_opts() and to set
the appropriate ChardevBackend (mainly to free the right
fields).
qemu_chr_new_from_opts() can be changed to use a non-qmp function
using the chardev class typename. Introduce qemu_chardev_add() to be
called from qemu_chr_new_from_opts() and remove the class chardev kind
field. Set the backend->type in the parse callback (when non-common
fields are added).
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
CharDriver no longer exists, it has been replaced with Chardev.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
qemu_chr_new_from_opts() is modified to not need CharDriver backend[]
array, but uses instead objectified qmp_query_chardev_backends() and
char_get_class(). The alias field is moved outside in a ChardevAlias[],
similar to QDevAlias for devices.
"kind" and "parse" are moved to ChardevClass ("kind" is to be removed
next)
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Now it uses Object instance_finalize instead.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
This enables the ps2 controller to process mouse events for buttons 4 and 5.
Additionally, distinct definitions for the ps2 mouse button state are
introduced. The legacy definitions from console.h are not used anymore.
Signed-off-by: Fabian Lesniak <fabian@lesniak-it.de>
Message-id: 20161206190007.7539-3-fabian@lesniak-it.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Implements 128-bit left shift and right shift as well as their
testcases. By design, shift silently mods by 128, so the caller is
responsible to assert the shift range if necessary.
Left shift sets the overflow flag if any non-zero digit is shifted out.
Examples:
ulshift(&low, &high, 250, &overflow);
equivalent: n << 122
urshift(&low, &high, -2);
equivalent: n << 126
Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[dwg: Added test-shift128 to .gitignore]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
xscvdphp: VSX Scalar round & Convert Double-Precision format to
Half-Precision format
xscvhpdp: VSX Scalar Convert Half-Precision format to
Double-Precision format
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When passing through an USB storage device to a pseries guest, it
is currently not possible to automatically boot from the device
if the "bootindex" property has been specified, too (e.g. when using
"-device nec-usb-xhci -device usb-host,hostbus=1,hostaddr=2,bootindex=0"
at the command line). The problem is that QEMU builds a device tree path
like "/pci@800000020000000/usb@0/usb-host@1" and passes it to SLOF
in the /chosen/qemu,boot-list property. SLOF, however, probes the
USB device, recognizes that it is a storage device and thus changes
its name to "storage", and additionally adds a child node for the
SCSI LUN, so the correct boot path in SLOF is something like
"/pci@800000020000000/usb@0/storage@1/disk@101000000000000" instead.
So when we detect an USB mass storage device with SCSI interface,
we've got to adjust the firmware boot-device path properly that
SLOF can automatically boot from the device.
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1354177
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The H_SIGNAL_SYS_RESET hcall allows a guest CPU to raise a system reset
exception on CPUs within the same guest -- all CPUs, all-but-self, or a
specific CPU (including self).
This has not made its way to a PAPR release yet, but we have an hcall
number assigned.
H_SIGNAL_SYS_RESET = 0x380
Syntax:
hcall(uint64 H_SIGNAL_SYS_RESET, int64 target);
Generate a system reset NMI on the threads indicated by target.
Values for target:
-1 = target all online threads including the caller
-2 = target all online threads except for the caller
All other negative values: reserved
Positive values: The thread to be targeted, obtained from the value
of the "ibm,ppc-interrupt-server#s" property of the CPU in the OF
device tree.
Semantics:
- Invalid target: return H_Parameter.
- Otherwise: Generate a system reset NMI on target thread(s),
return H_Success.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
spapr_h_cas_compose_response() includes a cpu_update parameter which
controls whether it includes updated information on the CPUs in the device
tree fragment returned from the ibm,client-architecture-support (CAS) call.
Providing the updated information is essential when CAS has negotiated
compatibility options which require different cpu information to be
presented to the guest. However, it should be safe to provide in other
cases (it will just override the existing data in the device tree with
identical data). This simplifies the code by removing the parameter and
always providing the cpu update information.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Currently the pseries machine has two paths for constructing CPUs. On
newer machine type versions, which support cpu hotplug, it constructs
cpu core objects, which in turn construct CPU threads. For older machine
versions it individually constructs the CPU threads.
This division is going to make some future changes to the cpu construction
harder, so this patch unifies them. Now cpu core objects are always
created. This requires some updates to allow core objects to be created
without a full complement of threads (since older versions allowed a
number of cpus not a multiple of the threads-per-core). Likewise it needs
some changes to the cpu core hot/cold plug path so as not to choke on the
old machine types without hotplug support.
For good measure, we move the cpu construction to its own subfunction,
spapr_init_cpus().
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
The Xen HVM unplug protocol [1] specifies a mechanism to allow guests to
request unplug of 'aux' disks (which is stated to mean all IDE disks,
except the primary master). This patch adds support for that unplug request.
NOTE: The semantics of what happens if unplug of all disks and 'aux' disks
is simultaneously requests is not clear. The patch makes that
assumption that an 'all' request overrides an 'aux' request.
[1] http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=docs/misc/hvm-emulated-unplug.markdown
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
----
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Anthony Perard <anthony.perard@citrix.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: John Snow <jsnow@redhat.com>
Turn Chardev into Object.
qemu_chr_alloc() is replaced by the qemu_chardev_new() constructor. It
will call qemu_char_open() to open/intialize the chardev with the
ChardevCommon *backend settings.
The CharDriver::create() callback is turned into a ChardevClass::open()
which is called from the newly introduced qemu_chardev_open().
"chardev-gdb" and "chardev-hci" are internal chardev and aren't
creatable directly with -chardev. Use a new internal flag to disable
them. We may want to use TYPE_USER_CREATABLE interface instead, or
perhaps allow -chardev usage.
Although in general we keep typename and macros private, unless the type
is being used by some other file, in this patch, all types and common
helper macros for qemu-char.c are in char.h. This is to help transition
now (some types must be declared early, while some aren't shared) and
when splitting in several units. This is to be improved later.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Instead of registering a vc handler to allocate the Gtk VC Chardev,
overwrite the console.c char driver.
A later patch, when switching to QOM, will register a default console vc
QOM class if none has been registered before.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pick a uniform chardev type name.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use a single allocation for CharDriverState, this avoids extra
allocations & pointers, and is a step towards more object-oriented
CharDriver.
Gtk console is a bit peculiar, gd_vc_chr_set_echo() used to have a
temporary VirtualConsole to save the echo bit. Instead now, we consider
whether vcd->console is set or not, and restore the echo bit saved in
VCDriverState when calling gd_vc_vte_init().
The casts added are temporary, they are replaced with QOM type-safe
macros in a later patch in this series.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use a feature flag rather than a structure field for "replay".
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This allows to remove the "is_mux" field from CharDriverState.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This makes the code more declarative, and avoids duplicating the
information on all instances.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
No need to allocate & copy fields, let's use static const struct instead.
Add an alias field to the CharDriver structure to cover the cases where
we previously registered a driver twice under two names.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
Message-Id: <1484921496-11257-4-git-send-email-phil@philjordan.eu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 6f6071745b ("raw-posix: Fetch max sectors for host block device")
introduced a routine to call the kernel BLKSECTGET ioctl, which stores the
result back to user space. However, the size of the data returned depends
on the routine handling the ioctl. The (compat_)blkdev_ioctl returns a
short, while sg_ioctl returns an int. Thus, on big-endian systems, we can
find ourselves accidentally shifting the result to a much larger value.
(On s390x, a short is 16 bits while an int is 32 bits.)
Also, the two ioctl handlers return values in different scales (block
returns sectors, while sg returns bytes), so some tweaking of the outputs
is required such that hdev_get_max_transfer_length returns a value in a
consistent set of units.
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Message-Id: <20170120162527.66075-3-farman@linux.vnet.ibm.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Now that CPUs show up in the help text of "-device ?",
we should group them into an appropriate category.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1484917276-7107-1-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The generic edk2 SMM infrastructure prefers
EFI_SMM_CONTROL2_PROTOCOL.Trigger() to inject an SMI on each processor. If
Trigger() only brings the current processor into SMM, then edk2 handles it
in the following ways:
(1) If Trigger() is executed by the BSP (which is guaranteed before
ExitBootServices(), but is not necessarily true at runtime), then:
(a) If edk2 has been configured for "traditional" SMM synchronization,
then the BSP sends directed SMIs to the APs with APIC delivery,
bringing them into SMM individually. Then the BSP runs the SMI
handler / dispatcher.
(b) If edk2 has been configured for "relaxed" SMM synchronization,
then the APs that are not already in SMM are not brought in, and
the BSP runs the SMI handler / dispatcher.
(2) If Trigger() is executed by an AP (which is possible after
ExitBootServices(), and can be forced e.g. by "taskset -c 1
efibootmgr"), then the AP in question brings in the BSP with a
directed SMI, and the BSP runs the SMI handler / dispatcher.
The smaller problem with (1a) and (2) is that the BSP and AP
synchronization is slow. For example, the "taskset -c 1 efibootmgr"
command from (2) can take more than 3 seconds to complete, because
efibootmgr accesses non-volatile UEFI variables intensively.
The larger problem is that QEMU's current behavior diverges from the
behavior usually seen on physical hardware, and that keeps exposing
obscure corner cases, race conditions and other instabilities in edk2,
which generally expects / prefers a software SMI to affect all CPUs at
once.
Therefore introduce the "broadcast SMI" feature that causes QEMU to inject
the SMI on all VCPUs.
While the original posting of this patch
<http://lists.nongnu.org/archive/html/qemu-devel/2015-10/msg05658.html>
only intended to speed up (2), based on our recent "stress testing" of SMM
this patch actually provides functional improvements.
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20170126014416.11211-3-lersek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Introduce the following fw_cfg files:
- "etc/smi/supported-features": a little endian uint64_t feature bitmap,
presenting the features known by the host to the guest. Read-only for
the guest.
The content of this file will be determined via bit-granularity ICH9-LPC
device properties, to be introduced later. For now, the bitmask is left
zeroed. The bits will be set from machine type compat properties and on
the QEMU command line, hence this file is not migrated.
- "etc/smi/requested-features": a little endian uint64_t feature bitmap,
representing the features the guest would like to request. Read-write
for the guest.
The guest can freely (re)write this file, it has no direct consequence.
Initial value is zero. A nonzero value causes the SMI-related fw_cfg
files and fields that are under guest influence to be migrated.
- "etc/smi/features-ok": contains a uint8_t value, and it is read-only for
the guest. When the guest selects the associated fw_cfg key, the guest
features are validated against the host features. In case of error, the
negotiation doesn't proceed, and the "features-ok" file remains zero. In
case of success, the "features-ok" file becomes (uint8_t)1, and the
negotiated features are locked down internally (to which no further
changes are possible until reset).
The initial value is zero. A nonzero value causes the SMI-related
fw_cfg files and fields that are under guest influence to be migrated.
The C-language fields backing the "supported-features" and
"requested-features" files are uint8_t arrays. This is because they carry
guest-side representation (our choice is little endian), while
VMSTATE_UINT64() assumes / implies host-side endianness for any uint64_t
fields. If we migrate a guest between hosts with different endiannesses
(which is possible with TCG), then the host-side value is preserved, and
the host-side representation is translated. This would be visible to the
guest through fw_cfg, unless we used plain byte arrays. So we do.
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20170126014416.11211-2-lersek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Adding one more option "-f" for "info mtree" to dump the flat views of
all the address spaces.
This will be useful to debug the memory rendering logic, also it'll be
much easier with it to know what memory region is handling what address
range.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1484556005-29701-3-git-send-email-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch implements saving/restoring of static apic_delivered variable.
v8: saving static variable only for one of the APICs
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20170126123429.5412.94368.stgit@PASHA-ISP>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch implements initial vmstate creation or loading at the start
of record/replay. It is needed for rewinding the execution in the replay mode.
v4 changes:
- snapshots are not created by default anymore
v3 changes:
- added rrsnapshot option
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20170124071746.4572.61449.stgit@PASHA-ISP>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch introduces save_vmstate function to allow saving and loading
vmstates from the replay module.
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20170124071741.4572.13714.stgit@PASHA-ISP>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For configurations of the pflash_cfi01 device which set it up with a
device-width not equal to the width (ie where we are emulating
multiple narrow flash devices wired up in parallel), we were giving
incorrect values in the CFI data table:
(1) the sector length entry should specify the sector length for a
single device, not the length for the overall collection of
devices
(2) the number of blocks per device must not be divided by the
number of devices because the resulting device size would not
match the overall size
(3) this then means that the overall write block size must be
modified depending on the number of devices because the entry is
per device and when the guest writes into the flash it
calculates the write size by using the CFI entry (write size
per device) multiplied by the number of chips.
(It would alternatively be possible to modify the write
block size in the CFI table (currently hardcoded at 2048) and
leave the overall write block size alone.)
This commit corrects these bugs, and adds a hw-compat property
to retain the old behaviour on 2.8 and earlier versions. (The
only board we have which uses this sort of flash config and
has machine versioning is the "virt" board -- the PC uses a
single flash device and so behaviour is unaffected whether
using old-multiple-chip-handling or not.)
Here is a configuration example from the vexpress board:
VEXPRESS_FLASH_SIZE = 64M
VEXPRESS_FLASH_SECT_SIZE 256K
num-blocks = VEXPRESS_FLASH_SIZE / VEXPRESS_FLASH_SECT_SIZE = 256
sector-length = 256K
width = 4
device-width = 2
The code will fill the CFI entry with the following entries:
num-blocks = 256
sector-length = 128K
writeblock_size = 2048
This results in two chips, each with 256 * 128K = 32M device size and
a write block size of 2048.
A sector erase will be sent to both chips, thus 256K must be erased.
When the guest sends a block write command, it will write 4096 bytes
data at once (2048 per device).
Signed-off-by: David Engraf <david.engraf@sysgo.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: cleaned up and expanded commit message]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
hw/register.h provides macros like FIELD which make it easy to define
shift, mask and length constants for the fields within a register.
Unfortunately register.h also includes a lot of other things, some
of which will only compile in the softmmu build.
Pull the FIELD macro and friends out into a separate header file,
so they can be used in places like target/arm files which also
get built in the user-only configs.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1484937883-1068-5-git-send-email-peter.maydell@linaro.org
Bitmaps with a granularity of 58 or above can be neither serialized nor
deserialized (see the comment in the function added in this series for
an explanation). This patch adds a function so that we can check whether
a bitmap actually can be (de-)serialized at all, thus avoiding failing
the necessary assertion in hbitmap_serialization_granularity().
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20161115225746.3590-2-mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Add remaining bits of the Altera NiosII R1 support into qemu, which
is documentation, MAINTAINERS file entry, configure bits, arch_init
and configuration files for both linux-user (userland binaries) and
softmmu (hardware emulation).
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Chris Wulff <crwulff@gmail.com>
Cc: Jeff Da Silva <jdasilva@altera.com>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Sandra Loosemore <sandra@codesourcery.com>
Cc: Yves Vandervennet <yvanderv@altera.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Message-Id: <20170118220146.489-8-marex@denx.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Add missing bits for qemu-user required for emulating Altera Nios2
userspace binaries.
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Chris Wulff <crwulff@gmail.com>
Cc: Jeff Da Silva <jdasilva@altera.com>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Sandra Loosemore <sandra@codesourcery.com>
Cc: Yves Vandervennet <yvanderv@altera.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Message-Id: <20170118220146.489-4-marex@denx.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Add nios2 disassembler support. This patch is composed from binutils files
from commit "Opcodes and assembler support for Nios II R2". The files from
binutils used in this patch are:
include/opcode/nios2.h
include/opcode/nios2r1.h
include/opcode/nios2r2.h
opcodes/nios2-opc.c
opcodes/nios2-dis.c
Checkpatch says total: 114 errors, 0 warnings, 3609 lines checked , which
is caused by a different coding style in those files. These warnings and
errors are not addressed To let these files be easily synchronized between
binutils and qemu.
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Chris Wulff <crwulff@gmail.com>
Cc: Jeff Da Silva <jdasilva@altera.com>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Sandra Loosemore <sandra@codesourcery.com>
Cc: Yves Vandervennet <yvanderv@altera.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Message-Id: <20170118220146.489-2-marex@denx.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
A fix has been committed in upstream glib commit
210a9796f78eb90f76f1bd6a304e9fea05e97617.
(See also related bug https://bugzilla.gnome.org/show_bug.cgi?id=764415)
It is desirable to use the glib version instead of qemu copy, since it
provides more debugging facilities (G_MAIN_POLL_DEBUG etc), and
hopefully has a better maintainance. Hopefully, we can drop the qemu
copy in a few years.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
There is no need to have those functions as public API.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Add also a missing parenthesis in a comment.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Acked-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The vmstate_pci_device and vmstate_pcie_devices differ
just in the size of one buffer; combine the two using a _TEST
macro.
I think this is safe as long as everywhere which currently
uses either of these two uses the right type.
One thing that concerns me is that some places use pci_device_load/save
which does some irq mangling, but others just use the VMSTATE_PCI_DEVICE
macro - how are they getting the same irq mangling?
This passes a smoke test migrate of:
./x86_64-softmmu/qemu-system-x86_64 -M pc,accel=kvm -m 1024
./littlefed20.img -device e1000e -device virtio-net -device
e1000 -device virtio-rng -device megasas -device megasas-gen2 -device
ioh3420 -device nec-usb-xhci
to an unmodified qemu.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20161214195829.18241-1-dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
commit fe904ea824 fixed a case
which migration aborted QEMU because it didn't regain the control
of images while some errors happened.
Actually, there are another two cases can trigger the same error reports:
" bdrv_co_do_pwritev: Assertion `!(bs->open_flags & 0x0800)' failed",
Case 1, codes path:
migration_thread()
migration_completion()
bdrv_inactivate_all() ----------------> inactivate images
qemu_savevm_state_complete_precopy()
socket_writev_buffer() --------> error because destination fails
qemu_fflush() ----------------> set error on migration stream
-> qmp_migrate_cancel() ----------------> user cancelled migration concurrently
-> migrate_set_state() ------------------> set migrate CANCELLIN
migration_completion() -----------------> go on to fail_invalidate
if (s->state == MIGRATION_STATUS_ACTIVE) -> Jump this branch
Case 2, codes path:
migration_thread()
migration_completion()
bdrv_inactivate_all() ----------------> inactivate images
migreation_completion() finished
-> qmp_migrate_cancel() ---------------> user cancelled migration concurrently
qemu_mutex_lock_iothread();
qemu_bh_schedule (s->cleanup_bh);
As we can see from above, qmp_migrate_cancel can slip in whenever
migration_thread does not hold the global lock. If this happens after
bdrv_inactive_all() been called, the above error reports will appear.
To prevent this, we can call bdrv_invalidate_cache_all() in qmp_migrate_cancel()
directly if we find images become inactive.
Besides, bdrv_invalidate_cache_all() in migration_completion() doesn't have the
protection of big lock, fix it by add the missing qemu_mutex_lock_iothread();
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Message-Id: <1485244792-11248-1-git-send-email-zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
migrate_add_blocker should rightly fail if the '--only-migratable'
option was specified and the device in use should not be able to
perform the action which results in an unmigratable VM.
Make migrate_add_blocker return -EACCES in this case.
Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Message-Id: <1484566314-3987-6-git-send-email-ashijeetacharya@gmail.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
If a migration is already in progress and somebody attempts
to add a migration blocker, this should rightly fail.
Add an errp parameter and a retcode return value to migrate_add_blocker.
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Message-Id: <1484566314-3987-5-git-send-email-ashijeetacharya@gmail.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Acked-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Merged with recent 'Allow invtsc migration' change
Add a new option "--only-migratable" in qemu which will allow to add
only those devices which will not fail qemu after migration. Devices
set with the flag 'unmigratable' cannot be added when this option will
be used.
Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Message-Id: <1484566314-3987-3-git-send-email-ashijeetacharya@gmail.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Currently we cannot directly transfer a QTAILQ instance because of the
limitation in the migration code. Here we introduce an approach to
transfer such structures. We created VMStateInfo vmstate_info_qtailq
for QTAILQ. Similar VMStateInfo can be created for other data structures
such as list.
When a QTAILQ is migrated from source to target, it is appended to the
corresponding QTAILQ structure, which is assumed to have been properly
initialized.
This approach will be used to transfer pending_events and ccs_list in spapr
state.
We also create some macros in qemu/queue.h to access a QTAILQ using pointer
arithmetic. This ensures that we do not depend on the implementation
details about QTAILQ in the migration code.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Jianjun Duan <duanj@linux.vnet.ibm.com>
Message-Id: <1484852453-12728-3-git-send-email-duanj@linux.vnet.ibm.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Current migration code cannot handle some data structures such as
QTAILQ in qemu/queue.h. Here we extend the signatures of put/get
in VMStateInfo so that customized handling is supported. put now
will return int type.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Jianjun Duan <duanj@linux.vnet.ibm.com>
Message-Id: <1484852453-12728-2-git-send-email-duanj@linux.vnet.ibm.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
make sure that external callers won't try to modify
possible_cpus and owner of possible_cpus can access
it directly when it modifies it.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1484759609-264075-5-git-send-email-imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The existing default_config_files table in arch_init.c has a
single entry, making it completely unnecessary. The whole code
can be replaced by a single qemu_read_config_file() call in vl.c.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170117180051.11958-1-ehabkost@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Currently DNS resolution is done automatically as part
of the creation of a QIOChannelSocket object instance.
This works ok for network clients where you just end
up a single network socket, but for servers, the results
of DNS resolution may require creation of multiple
sockets.
Introducing a DNS resolver API allows DNS resolution
to be separated from the socket object creation. This
will make it practical to create multiple QIOChannelSocket
instances for servers.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Now that task objects have a directly associated error,
there's no need for an an Error **errp parameter to
the QIOTask thread worker function. It already has a
QIOTask object, so can directly set the error on it.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the QIOTaskFunc signature takes an Object * for
the source, and an Error * for any error. We also need to
be able to provide a result pointer. Rather than continue
to add parameters to QIOTaskFunc, remove the existing
ones and simply pass the QIOTask object instead. This
has methods to access all the other data items required
in the callback impl.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently when a task fails, the error is never explicitly
associated with the task object, it is just passed along
through the completion callback. This adds the ability to
explicitly associate an error with the task.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently there is no data associated with a successful
task completion. This adds an opaque pointer to the task
to store an arbitrary result.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The GDestroyNotify parameter is already a pointer, so does
not need a '*' suffix on the type.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Incrementing the reference in qio_task_get_source is
not necessary, since we're not running concurrently
with any other code touching the QIOTask. This
minimizes chances of further memory leaks.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
- rework of the zpci code, giving us proper multibus support
- introduction of the 2.9 machine
- fixes and improvements
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJYgdReAAoJEN7Pa5PG8C+vDggP/i3eviyb2mFlnIiwazlAfBuw
Uc6vBFDh/WWMthpzHl4PF+yujM3XbuvUN3VejdnqWLQ1PYq2p3n7rHNlR2XlBovu
f8l2LpPZGsj1VtAr1QGBj5ipOmRs3qydXY7EDCKORbKuPeor1VW7TbeaKbfpvpZM
rZHWMlV1UGA6kxM/B+zd9+kxBM3IYnHy3o+Gaq+cfuKyc0VRWRJmalqonjkR7EZj
InaIyOtGonpPTlMD1GTbM71Wx/NnCugYUEX1Eq4yHX4DV15rM3B83LgTJu72txzr
ObJmzT3XU2DKwtzo87Y6cWJ3GoxQQbwgiU6VL+l8JVtrzGfllpUdcdInQjSqxXp2
OW8NuV6Ie02YOrczBXbBAv46PKmoLTf63hvsC4f6nNLa2O6FqxAXzYGKtOpvgOq5
j1Q6VyzAb/vbyyW2lyMice4XJXGMxitaMGxvJG0lq/iscRpNdpz6E+dgkzO7lieF
+ETpDsGd5miMdsAUqmIREjBCCjOzOGpC4WX0mg8Te8LmR3Rt8WYIgWuowMvbq2iG
/qmv9a8ea2XqB+/g2ta+YqS9cPChsPJSN03Q0bo1244DMwBKuVwyXNsC9lRIkiHJ
4b1Msoseohv9D4ghU8q6gSOU+T5nxLRT1TWBByqhkONU1C4UyKHEblop/c1oHE5k
UZtiaQvyWFhVU4QtXeE8
=fzmu
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20170120-v2' into staging
First set of s390x patches for 2.9:
- rework of the zpci code, giving us proper multibus support
- introduction of the 2.9 machine
- fixes and improvements
# gpg: Signature made Fri 20 Jan 2017 09:11:58 GMT
# gpg: using RSA key 0xDECF6B93C6F02FAF
# gpg: Good signature from "Cornelia Huck <huckc@linux.vnet.ibm.com>"
# gpg: aka "Cornelia Huck <cornelia.huck@de.ibm.com>"
# Primary key fingerprint: C3D0 D66D C362 4FF6 A8C0 18CE DECF 6B93 C6F0 2FAF
* remotes/cohuck/tags/s390x-20170120-v2:
virtio-ccw: fix ring sizing
s390x/pci: merge msix init functions
s390x/pci: handle PCIBridge bus number
s390x/pci: use hashtable to look up zpci via fh
s390x/pci: PCI multibus bridge handling
s390x/pci: optimize calling s390_get_phb()
s390x/pci: change the device array to a list
s390x/pci: dynamically allocate iommu
s390x/pci: make S390PCIIOMMU inherit Object
s390x/kvm: use kvm_gsi_routing_enabled in flic
s390x: add compat machine for 2.9
s390x: remove double compat statement
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Version: GnuPG v1
iQEcBAABAgAGBQJYgXzxAAoJEO8Ells5jWIRgtAIAKuFrOBE/xJnjd/45sVKcx2j
fsohKHF8T/eLmt5sw+MhGtnM/oRJRUX8kGpA9AU8m6TCSaTYh2tOKX5lwrykuAzk
feqz2pqZFwiLWs5Ro7qEQIhMkqtFetODvKd05qnKnAldj8SC45czKxdghmSP/B+w
4nnDEdqVqUuUseDCa1mW1b4f6g1N93LbgChK7lK9Xqg+OqeEbQ7nLgVvcWvN7+Ea
DfDKWP8tjQ5QhjzFWc4wa9/Tx+0HI7Dn57fv98XdJMvm1kt/MdnO7QKAXWmHH5s/
6DX+NHgN0ZAn85gv/ufq1F9C4TstbAoZA9EOGhoBJ5ww8mueARB3L2iCj+OcS9A=
=gkbh
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging
# gpg: Signature made Fri 20 Jan 2017 02:58:57 GMT
# gpg: using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* remotes/jasowang/tags/net-pull-request:
tap: fix memory leak on failure in net_init_tap()
hw/pci: use-after-free in pci_nic_init_nofail when nic device fails to initialize
hw/net/dp8393x: Avoid unintentional sign extensions on addresses
m68k: QOMify the MCF Fast Ethernet Controller device
net: optimize checksum computation
docs: Fix description of the sentence
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
writeable fw cfg blobs which will be used for guest to host
communication
fixes and cleanups all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJYgSq0AAoJECgfDbjSjVRpHtwH/j/viN38ginAvuRiPssEiitb
VC3oO09siMx+rO97H7ur5cVcwiyMFxG90Dtmsptf3r46hzgUcv4meC4zzNG3Xds6
Iwsqy1m3nQDEL1dbU7XbhfbrWAGCiY1I+O2JRSvHQ8+HsmP6vOLxPPEQTlFRQIrk
k9HHlMHo2tYU0hhSOOoDDG/mBG8QcYgIaGleCMrVBlV/Q6w7lnD8XVgPWjEF5RsG
2SkbY+JQJlmt6qZpkbdQKox4cHFxlA8f6P9ne1o++gjVENhbe6KrDFhROE560Lbn
dtypZV6Y0Pt6SMrk+lR2Gd2DHI/10LhNVi/mz6o1HrCzmISJlIxIvXD6XmhqdPk=
=7hNY
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
virtio, vhost, pc: fixes, features
writeable fw cfg blobs which will be used for guest to host
communication
fixes and cleanups all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Thu 19 Jan 2017 21:08:04 GMT
# gpg: using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream:
virtio: force VIRTIO_F_IOMMU_PLATFORM
virtio: fix up max size checks
vhost: drop VHOST_F_DEVICE_IOTLB
update-linux-headers.sh: support __bitwise
virtio_crypto: header update
pci_regs: update to latest linux
virtio-mmio: switch to linux headers
virtio_mmio: add standard header file
virtio: drop an obsolete comment
fw-cfg: bump "x-file-slots" to 0x20 for 2.9+ machine types
pc: Add 2.9 machine-types
fw-cfg: turn FW_CFG_FILE_SLOTS into a device property
fw-cfg: support writeable blobs
vhost_net: device IOTLB support
virtio: disable notifications again after poll succeeded
Revert "virtio: turn vq->notification into a nested counter"
virtio-net: enable ioeventfd even if vhost=off
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
As noticed by David Gilbert, commit 6053a86 'kvmclock: reduce kvmclock
differences on migration' added 'x-mach-use-reliable-get-clock' and a
compatibility entry that turns it off; however it got merged after 2.8.0
was released but the entry has gone into PC_COMPAT_2_7 where it should
have gone into PC_COMPAT_2_8.
Fix it by moving the entry to PC_COMPAT_2_8.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20170118175343.GA26873@amt.cnet>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a board level property to the virt board which will
enable EL2 on the CPU if the user asks for it. The
default is not to provide EL2. If EL2 is enabled then
we will use SMC as our PSCI conduit, and report the
virtualization support in the GICv3 device tree node
and the ACPI tables.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1483977924-14522-19-git-send-email-peter.maydell@linaro.org
If we are giving the guest a CPU with EL2, it is likely to
want to use the HVC instruction itself, for instance for
providing PSCI to inner guest VMs. This makes using HVC
as the PSCI conduit for the outer QEMU a bad idea. We will
want to use SMC instead is this case: this makes sense
because QEMU's PSCI implementation is effectively an
emulation of functionality provided by EL3 firmware.
Add code to support selecting the PSCI conduit to use,
rather than hardcoding use of HVC.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-id: 1483977924-14522-15-git-send-email-peter.maydell@linaro.org
Implement the function which signals virtual interrupts to the
CPU as appropriate following CPU interface state changes.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1483977924-14522-13-git-send-email-peter.maydell@linaro.org
As the first step in adding support for the virtualization
extensions to the GICv3 emulation:
* add the necessary data fields to the state structures
* add the fields to the migration state, as a subsection
which is only present if virtualization is enabled
The use of a subsection means we retain migration
compatibility as EL2 is not enabled on any CPUs currently.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 1483977924-14522-8-git-send-email-peter.maydell@linaro.org
Wire the new VIRQ, VFIQ and maintenance interrupt lines from the
GIC to each CPU.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1483977924-14522-5-git-send-email-peter.maydell@linaro.org
Augment the GIC's QOM device interface by adding two
new sets of sysbus IRQ lines, to signal VIRQ and VFIQ to
each CPU.
We never use these, but it's helpful to keep the v2-and-earlier
GIC's external interface in line with that of the GICv3 to
avoid board code having to add extra code conditional on which
version of the GIC is in use.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1483977924-14522-3-git-send-email-peter.maydell@linaro.org
Augment the GICv3's QOM device interface by adding two
new sets of sysbus IRQ lines, to signal VIRQ and VFIQ to
each CPU.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 1483977924-14522-2-git-send-email-peter.maydell@linaro.org
The Aspeed SMC controllers have a mode (Command mode) in which
accesses to the flash content are no different than doing MMIOs. The
controller generates all the necessary commands to load (or store)
data in memory.
However, accesses are restricted to the segment window assigned the
the flash module by the controller. This window is defined by the
Segment Address Register.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Message-id: 1483979087-32663-8-git-send-email-clg@kaod.org
[PMM: Deleted now-unused aspeed_smc_is_usermode() function]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The SPI controller of the AST2400 SoC has less registers. So we can
adjust the size of the memory region holding the registers depending
on the controller type. We can also remove the guest_error logging
which is useless as the range of the region is strict enough.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Message-id: 1483979087-32663-7-git-send-email-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This is getting difficult to read. Also add a 'has_dma' field for each
controller type.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Message-id: 1483979087-32663-6-git-send-email-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Current code seems to assume ring size is
always decreased but this is not required by spec:
what spec says is just that size can not exceed
the maximum. Fix it up.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <1484256243-1982-1-git-send-email-mst@redhat.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
When running qemu-system-m68k with the "-net" parameter (for example
simply "-net nic -net user"), there is currently a confusing warning
message saying:
Warning: requested NIC (anonymous, model mcf_fec) was not created
(not supported by this machine?)
This seems to happen because the MCF NIC has never been adapted to
the currently expected QEMU device behavior. Thus let's QOMify the
NIC now to get rid of the warning message.
Signed-off-by: Thomas Huth <huth@tuxfamily.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Use the Intel HAX is kernel-based hardware acceleration module for
Windows (similar to KVM on Linux).
Based on the "target/i386: Add Intel HAX to android emulator" patch
from David Chou <david.j.chou@intel.com>
Signed-off-by: Vincent Palatin <vpalatin@chromium.org>
Message-Id: <7b9cae28a0c379ab459c7a8545c9a39762bd394f.1484045952.git.vpalatin@chromium.org>
[Drop hax_populate_ram stub. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
That's a forward port of the core HAX interface code from the
emu-2.2-release branch in the external/qemu-android repository as used by
the Android emulator.
The original commit was "target/i386: Add Intel HAX to android emulator"
saying:
"""
Backport of 2b3098ff27bab079caab9b46b58546b5036f5c0c
from studio-1.4-dev into emu-master-dev
Intel HAX (harware acceleration) will enhance android emulator performance
in Windows and Mac OS X in the systems powered by Intel processors with
"Intel Hardware Accelerated Execution Manager" package installed when
user runs android emulator with Intel target.
Signed-off-by: David Chou <david.j.chou@intel.com>
"""
It has been modified to build and run along with the current code base.
The formatting has been fixed to go through scripts/checkpatch.pl,
and the DPRINTF macros have been updated to get the instanciations checked by
the compiler.
The FPU registers saving/restoring has been updated to match the current
QEMU registers layout.
The implementation has been simplified by doing the following modifications:
- removing the code for supporting the hardware without Unrestricted Guest (UG)
mode (including all the code to fallback on TCG emulation).
- not including the Darwin support (which is not yet debugged/tested).
- simplifying the initialization by removing the leftovers from the Android
specific code, then trimming down the remaining logic.
- removing the unused MemoryListener callbacks.
Signed-off-by: Vincent Palatin <vpalatin@chromium.org>
Message-Id: <e1023837f8d0e4c470f6c4a3bf643971b2bca5be.1484045952.git.vpalatin@chromium.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move the generic cpu_synchronize_ functions to the common hw_accel.h header,
in order to prepare for the addition of a second hardware accelerator.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Vincent Palatin <vpalatin@chromium.org>
Message-Id: <f5c3cffe8d520011df1c2e5437bb814989b48332.1484045952.git.vpalatin@chromium.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
C11 allows errno to be clobbered by pretty much any library function
call, so in general callers need to take care to save errno before
calling other functions.
However, for error reporting functions this is rather awkward and can
make the code on the caller side more complicated than
necessary. error_setg_errno() already takes care of preserving errno
and some functions rely on that, so just promise that we continue to
do so in the future.
Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Message-Id: <1469611466-31574-1-git-send-email-silbe@linux.vnet.ibm.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Update header from latest linux driver. Session creation structs gain
padding to make them same size. Formatting cleanups.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
More precisely, the "x-file-slots" count is bumped for all machine types
that:
(a) use fw_cfg, and
(b) are not versioned (hence migration is not expected to work for them
across QEMU releases anyway), or have version 2.9.
This affects machine types implemented in the following source files:
- "hw/arm/virt.c". The "virt-*" machine type is versioned, and the <= 2.8
versions already depend on HW_COMPAT_2_8 (see commit e353aac51b).
Therefore adding the "x-file-slots" compat values to HW_COMPAT_2_8
suffices.
- "hw/i386/pc.c". The "pc-i440fx-*" (including "pc-*") and "pc-q35-*"
machine types are versioned. Modifying HW_COMPAT_2_8 is sufficient here
too (see commit "pc: Add 2.9 machine-types"). The "isapc" machtype is
not versioned. The "xenfv" machine type, which uses fw_cfg for direct
kernel booting, is also not versioned.
- "hw/ppc/mac_newworld.c". The "mac99" machine type is not versioned.
- "hw/ppc/mac_oldworld.c". The "g3beige" machine type is not versioned.
- "hw/sparc/sun4m.c". None of the 9 machine types defined in this file
appear versioned.
- "hw/sparc64/sun4u.c". None of the 3 machine types defined in this file
appear versioned.
Cc: "Gabriel L. Somlo" <somlo@cmu.edu>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: Anthony Perard <anthony.perard@citrix.com>
Cc: Artyom Tarasenko <atar4qemu@gmail.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Gabriel Somlo <somlo@cmu.edu>
Tested-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
We'd like to raise the value of FW_CFG_FILE_SLOTS. Doing it naively could
lead to problems with backward migration: a more recent QEMU (running an
older machine type) would allow the guest, in fw_cfg_select(), to select a
high key value that is unavailable in the same machine type implemented by
the older (target) QEMU. On the target host, fw_cfg_data_read() for
example could dereference nonexistent entries.
As first step, size the FWCfgState.entries[*] and FWCfgState.entry_order
arrays dynamically. All three array sizes will be influenced by the new
field FWCfgState.file_slots (and matching device property).
Make the following changes:
- Replace the FW_CFG_FILE_SLOTS macro with FW_CFG_FILE_SLOTS_MIN (minimum
count of fw_cfg file slots) in the header file. The value remains 0x10.
- Replace all uses of FW_CFG_FILE_SLOTS with a helper function called
fw_cfg_file_slots(), returning the new property.
- Eliminate the macro FW_CFG_MAX_ENTRY, and replace all its uses with a
helper function called fw_cfg_max_entry().
- In the MMIO- and IO-mapped realize functions both, allocate all three
arrays dynamically, based on the new property.
- The new property defaults to FW_CFG_FILE_SLOTS_MIN. This is going to be
customized in the following patches.
Cc: "Gabriel L. Somlo" <somlo@cmu.edu>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Gabriel Somlo <somlo@cmu.edu>
Tested-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Useful to send guest data back to QEMU.
Changes from Laszlo Ersek <lersek@redhat.com>:
- rebase the patch from Michael Tsirkin's original postings at [1] and [2]
to the following patches:
- loader: Allow a custom AddressSpace when loading ROMs
- loader: Add AddressSpace loading support to uImages
- loader: fix handling of custom address spaces when adding ROM blobs
- reject such writes immediately that would exceed the end of the array,
rather than performing a partial write before setting the error bit: see
the (len != dma.length) condition
- document the write interface
[1] http://lists.nongnu.org/archive/html/qemu-devel/2016-02/msg04968.html
[2] http://lists.nongnu.org/archive/html/qemu-devel/2016-03/msg02735.html
Cc: "Gabriel L. Somlo" <somlo@cmu.edu>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Michael Walle <michael@walle.cc>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Shannon Zhao <zhaoshenglong@huawei.com>
Cc: qemu-arm@nongnu.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Acked-by: Gabriel Somlo <somlo@cmu.edu>
Tested-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
This patches implements Device IOTLB support for vhost kernel. This is
done through:
1) switch to use dma helpers when map/unmap vrings from vhost codes
2) introduce a set of VhostOps to:
- setting up device IOTLB request callback
- processing device IOTLB request
- processing device IOTLB invalidation
2) kernel support for Device IOTLB API:
- allow vhost-net to query the IOMMU IOTLB entry through eventfd
- enable the ability for qemu to update a specified mapping of vhost
- through ioctl.
- enable the ability to invalidate a specified range of iova for the
device IOTLB of vhost through ioctl. In x86/intel_iommu case this is
triggered through iommu memory region notifier from device IOTLB
invalidation descriptor processing routine.
With all the above, kernel vhost_net can co-operate with userspace
IOMMU. For vhost-user, the support could be easily done on top by
implementing the VhostOps.
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJYeOOmAAoJEPvQ2wlanipE3ZUH/Rsfpl23kXCMmqoXEIhWXy+h
yf8ARWCmpU6UKfwb+sH4vLegBfU56f62vVkGQ6oaaAbuyQ4SxCUlZGMO/rqY8/TE
m57aM+VfEE+bIdinAtLjFM24EVp/exMfkeutK7ItzLv7GwlrBos0J5veyCuyJ15q
pccV24jrpbJGilEeJ2GblKp3r2I3dInQGauOQhtoP3MNjHmYNSQD7noSbdN/JiTR
9H2eV700pg3ZPaSfO+CTVQN+cHjK1FC6qLi6916YZY9llnSOnDAegBYgbwE1RIBw
AULpWrezYveKy71eFhHVtGxnPeCJ8J4GVECMK0P0cdxzprIXFh1kZezyM4bxAGk=
=sboI
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/stsquad/tags/pull-tcg-common-tlb-reset-20170113-r1' into staging
This is the same as the v3 posted except a re-base and a few extra signoffs
# gpg: Signature made Fri 13 Jan 2017 14:26:46 GMT
# gpg: using RSA key 0xFBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>"
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
* remotes/stsquad/tags/pull-tcg-common-tlb-reset-20170113-r1:
cputlb: drop flush_global flag from tlb_flush
cpu_common_reset: wrap TCG specific code in tcg_enabled()
qom/cpu: move tlb_flush to cpu_common_reset
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Copy the mechanism of hw/smbios/smbios-stub.c to implement an ACPI-stub
instead, so that -acpitable can be later extended to ARM.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Remove the useless is_external argument. Since the iohandler
AioContext is never used for block devices, aio_disable_external
is never called on it. This lets us remove stubs/iohandler.c.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
They are small, it is not worth stubbing them. Just include them
in user-mode emulators and unit tests as well.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-smbios command line options were accepted but silently ignored on
TARGET_ARM, due to a test for TARGET_I386 in arch_init.c.
Copy the mechanism of hw/pci/pci-stub.c to implement an smbios-stub
instead, enabled for all targets without CONFIG_SMBIOS.
Signed-off-by: Leif Lindholm <leif.lindholm@linaro.org>
Message-Id: <20161222151828.28292-1-leif.lindholm@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is complex, but I think it is reasonably documented in the source.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170112180800.21085-5-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This will make it possible to walk the list of bottom halves without
holding the AioContext lock---and in turn to call bottom half
handlers without holding the lock.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20170112180800.21085-4-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
A QemuLockCnt comprises a counter and a mutex, with primitives
to increment and decrement the counter, and to take and release the
mutex. It can be used to do lock-free visits to a data structure
whenever mutexes would be too heavy-weight and the critical section
is too long for RCU.
This could be implemented simply by protecting the counter with the
mutex, but QemuLockCnt is harder to misuse and more efficient.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170112180800.21085-3-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This will be used for AioHandlers too. There is going to be little
or no contention, so it is better to reuse the same lock.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20170112180800.21085-2-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
bdrv_io_plug and bdrv_io_unplug are only called (via their
BlockBackend equivalents) after starting asynchronous I/O.
bdrv_drain is not going to be called while they are running,
because---even if a coroutine runs for some reason---it will
only drain in the next iteration of the event loop through
bdrv_co_yield_to_drain.
So this mechanism is unnecessary, get rid of it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20161129113334.605-1-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
We have never has the concept of global TLB entries which would avoid
the flush so we never actually use this flag. Drop it and make clear
that tlb_flush is the sledge-hammer it has always been.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
[DG: ppc portions]
Acked-by: David Gibson <david@gibson.dropbear.id.au>
so it won't impose an additional limits on max_cpus limits
supported by different targets.
It removes global MAX_CPUMASK_BITS constant and need to
bump it up whenever max_cpus is being increased for
a target above MAX_CPUMASK_BITS value.
Use runtime max_cpus value instead to allocate sufficiently
sized node_cpu bitmasks in numa parser.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1479466974-249781-1-git-send-email-imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
[ehabkost: Added asserts to ensure cpu_index < max_cpus]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Considering 'id' is mandatory for user_creatable objects/backends
and user_creatable_add_type() always has it as an argument
regardless of where from it is called CLI/monitor or QMP,
Fix issue by adding 'id' property to hostmem backends and
set it in user_creatable_add_type() for every object that
implements 'id' property. Then later at query-memdev time
get 'id' from object directly.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1484052795-158195-4-git-send-email-imammedo@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Simplify code by dropping ~57LOC by merging user_creatable_add()
into user_creatable_add_opts() and using the later from monitor.
Along with it allocate opts_visitor_new() once in user_creatable_add_opts().
As result we have one less API func and a more readable/simple
user_creatable_add_opts() vs user_creatable_add().
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1484052795-158195-3-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
For builds with Mingw-w64 as it is included in Cygwin, there are two
header files which define KEY_EVENT with different values.
This results in lots of compiler warnings like this one:
CC vl.o
In file included from /qemu/include/ui/console.h:340:0,
from /qemu/vl.c:76:
/usr/i686-w64-mingw32/sys-root/mingw/include/curses.h:1522:0: warning: "KEY_EVENT" redefined
#define KEY_EVENT 0633 /* We were interrupted by an event */
In file included from /usr/share/mingw-w64/include/windows.h:74:0,
from /usr/share/mingw-w64/include/winsock2.h:23,
from /qemu/include/sysemu/os-win32.h:29,
from /qemu/include/qemu/osdep.h:100,
from /qemu/vl.c:24:
/usr/share/mingw-w64/include/wincon.h:101:0: note: this is the location of the previous definition
#define KEY_EVENT 0x1
QEMU only uses the KEY_EVENT macro from wincon.h.
Therefore we can undefine the macro coming from curses.h.
The explicit include statement for curses.h in ui/curses.c is not needed
and was removed.
Those two modifications fix the redefinition warnings.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Acked-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Message-id: 20161119185318.10564-1-sw@weilnetz.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
This adds two console functions, qemu_console_set_window_id and
qemu_graphic_console_get_window_id, to let graphical backend record the
window id in the QemuConsole structure, and let the baum driver read it.
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Message-id: 20161221003806.22412-2-samuel.thibault@ens-lyon.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Wayland always uses evdev as its input source, so QEMU
can use the existing evdev keymap data
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Tested-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20161201094117.16407-1-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
That reduces DSDT by 910 bytes when memory hotplug
isn't enabled.
While doing so drop intermediate variables/arguments
passing around ACPI_MEMORY_HOTPLUG_IO_LEN and making
it local to memory_hotplug.c, hardcoding it there as
it can't change.
Also don't pass around ACPI_MEMORY_HOTPLUG_BASE through
intermediate variables/arguments where it's not needed.
Instead initialize in module static variable when MMIO
region is mapped and use that within memory_hotplug.c
whenever it's required.
That way MMIO base specified only at one place and AML
with MMIO would always use the same value.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Move defines used locally only by memory_hotplug.c into it
from header files.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
>From this patch all the memory hotplug related AML
bits are consolidated in one place within DSTD.
Follow up patches will utilize that to simplify
memory hotplug related C/AML code.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
It consolidates memory hotplug AML in one place within DSDT
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
since static and dynamic parts of memory MHPD device are now
in the same table (DSDT), there is no point keeping
them scattered across the table, so consolidate it
in one place.
There aren't any functional change, only AML text movement
from externally refferenced MHPD scope directly into
MHPD device declaration.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
This patch allows advising guest with host MTU's by setting
host_mtu parameter.
If VIRTIO_NET_F_MTU has been successfully negotiated, MTU
value is passed to the backend.
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Aaron Conole <aconole@redhat.com
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch provides a way for virtio-net to notify the
backend about the host MTU set by the user.
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Aaron Conole <aconole@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch implements VHOST_USER_PROTOCOL_F_NET_MTU
protocol feature and VHOST_USER_NET_SET_MTU request so
that the backend gets notified of the user defined host
MTU.
If backend supports VHOST_USER_PROTOCOL_F_REPLY_ACK,
QEMU assumes MTU is valid if success is returned.
Vhost-net driver sends this request through a new
vhost_net_set_mtu vhost_ops entry.
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Aaron Conole <aconole@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add procedure for fast drop of queued packets, acting like
pop and push without mapping the buffers into memory.
Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Bring virtio queue to correct internal state for host-to-guest
operations when vhost is temporary stopped.
Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Now, AER capa version is fixed to v2, if assigned device isn't v2,
then this value will be inconsistent between guest and host
Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When user specify invalid value for property aer_log_max, device should
fail to create, and report appropriate message.
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Acked-by: Dmitry Fleytman <dmitry@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The ready flag should be set by the children of
cryptodev backend interface. Warp the setter/getter
functions for it.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This property is used to Tag the cryptodev backend
is used by virtio-crypto or not. Making cryptodev
can't be hot unplugged when it's in use. Cleanup
resources when cryptodev is finalized.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
This patch provides ATSR which was a requirement for software that
wants to enable ATS on endpoint devices behind a Root Port. This is
done simply by setting ALL_PORTS which indicates all PCI-Express Root
Ports support ATS transactions.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patches enable the Address Translation Service support for virtio
pci devices. This is needed for a guest visible Device IOTLB
implementation and will be required by vhost device IOTLB API
implementation for intel IOMMU.
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch enables device IOTLB support for intel iommu. The major
work is to implement QI device IOTLB descriptor processing and notify
the device through iommu notifier.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
This patch introduces a helper to query the iotlb entry for a
possible iova. This will be used by later device IOTLB API to enable
the capability for a dataplane (e.g vhost) to query the IOTLB.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Cc: Richard Henderson <rth@twiddle.net>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently, all virtio devices bypass IOMMU completely. This is because
address_space_memory is assumed and used during DMA emulation. This
patch converts the virtio core API to use DMA API. This idea is
- introducing a new transport specific helper to query the dma address
space. (only pci version is implemented).
- query and use this address space during virtio device guest memory
accessing when iommu platform (VIRTIO_F_IOMMU_PLATFORM) was enabled
for this device.
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Amit Shah <amit.shah@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-block@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
IOMMU needs to be migrated before all the PCI devices (in case there are
devices that will request for address translation). So marking it with a
priority higher than the default (which PCI devices and other belong).
Migration framework handled the rest.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
During migration, save state entries are saved/loaded without a specific
order - we just traverse the savevm_state.handlers list and do it one by
one. This might not be enough.
There are requirements that we need to load specific device's vmstate
first before others. For example, VT-d IOMMU contains DMA address
remapping information, which is required by all the PCI devices to do
address translations. We need to make sure IOMMU's device state is
loaded before the rest of the PCI devices, so that DMA address
translation can work properly.
This patch provide a VMStateDescription.priority value to allow specify
the priority of the saved states. The loadvm operation will be done with
those devices with higher vmsd priority.
Before this patch, we are possibly achieving the ordering requirement by
an assumption that the ordering will be the same with the ordering that
objects are created. A better way is to mark it out explicitly in the
VMStateDescription table, like what this patch does.
Current ordering logic is still naive and slow, but after all that's not
a critical path so IMO it's a workable solution for now.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
These files deal with the file protocol, not the raw format (the
file protocol is often used with other formats, and the raw
format is not forced to use the file protocol). Rename things
to make it a bit easier to follow.
Suggested-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In the context of asynchronous work, if we have a worker coroutine that
didn't yield, the parent coroutine cannot be reentered because it hasn't
yielded yet. In this case we don't even have to reenter the parent
because it will see that the work is already done and won't even yield.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
This is the ACPI equivalent to "hw/arm/virt: Don't incorrectly claim
architectural timer to be edge-triggered" which fixes the DT for
machine types 2.9 and later.
Signed-off-by: Andrew Jones <drjones@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20170102200153.28864-15-drjones@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
by moving VirtGuestInfo.fw_cfg to VirtMachineState. This is the
mach-virt equivalent of "pc: Move PcGuestInfo.fw_cfg to
PCMachineState" and "pc: Eliminate PcGuestInfo struct" combined.
Signed-off-by: Andrew Jones <drjones@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20170102200153.28864-14-drjones@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Now that we pass VirtMachineState, and guest-info is just part of
that state, we can remove all the redundant members and access
the VirtMachineState directly.
Signed-off-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20170102200153.28864-12-drjones@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Only two functions take VirtGuestInfo parameters. Now that guest-info
is part of VirtMachineState, and VirtMachineState is defined in the
virt header, pass that instead.
Signed-off-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20170102200153.28864-11-drjones@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In preparation to share more Virt machine state than just guest-info
with other mach-virt source files, move the State and Class structures
to virt.h
Signed-off-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20170102200153.28864-10-drjones@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
include/hw/arm/virt-acpi-build.h is only used for VirtGuestInfo,
which doesn't even necessarily have to be ACPI specific. Move
VirtGuestInfo to include/hw/arm/virt.h, allowing us to remove
include/hw/arm/virt-acpi-build.h, and to prepare for even more
code motion.
Signed-off-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20170102200153.28864-9-drjones@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Instead of allocating a new struct just for VirtGuestInfo and the
machine_done Notifier, place them inside VirtMachineState. This
is the mach-virt equivalent of "pc: Eliminate struct
PcGuestInfoState"
Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20170102200153.28864-8-drjones@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andrew Jones <drjones@redhat.com>
Message-id: 20170102200153.28864-5-drjones@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Also remove all unused flags.
Signed-off-by: Andrew Jones <drjones@redhat.com>
Message-id: 20170102200153.28864-4-drjones@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Also move the enabled flag definition from mach-virt code to
acpi common.
Signed-off-by: Andrew Jones <drjones@redhat.com>
Message-id: 20170102200153.28864-3-drjones@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add a return value to the event handler. Some I2C devices will
NAK if they have no data, so allow them to do this. This required
the following changes:
Go through all the event handlers and change them to return int
and return 0.
Modify i2c_start_transfer to terminate the transaction on a NAK.
Modify smbus handing to not assert if a NAK occurs on a second
operation, and terminate the transaction and return -1 instead.
Add some information on semantics to I2CSlaveClass.
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This patch adds support of recording and replaying network packets in
irount rr mode.
Record and replay for network interactions is performed with the network filter.
Each backend must have its own instance of the replay filter as follows:
-netdev user,id=net1 -device rtl8139,netdev=net1
-object filter-replay,id=replay,netdev=net1
Replay network filter is used to record and replay network packets. While
recording the virtual machine this filter puts all packets coming from
the outer world into the log. In replay mode packets from the log are
injected into the network device. All interactions with network backend
in replay mode are disabled.
v5 changes:
- using iov_to_buf function instead of loop
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Signed-off-by: Jason Wang <jasowang@redhat.com>
These parameters control the poll time self-tuning algorithm. They are
optional and will default to sane values if omitted.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20161201192652.9509-14-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This patch is based on the algorithm for the kvm.ko halt_poll_ns
parameter in Linux. The initial polling time is zero.
If the event loop is woken up within the maximum polling time it means
polling could be effective, so grow polling time.
If the event loop is woken up beyond the maximum polling time it means
polling is not effective, so shrink polling time.
If the event loop makes progress within the current polling time then
the sweet spot has been reached.
This algorithm adjusts the polling time so it can adapt to variations in
workloads. The goal is to reach the sweet spot while also recognizing
when polling would hurt more than help.
Two new trace events, poll_grow and poll_shrink, are added for observing
polling time adjustment.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20161201192652.9509-13-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The begin and end callbacks can be used to prepare for the polling loop
and clean up when polling stops. Note that they may only be called once
for multiple aio_poll() calls if polling continues to succeed. Once
polling fails the end callback is invoked before aio_poll() resumes file
descriptor monitoring.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20161201192652.9509-11-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Poll mode can be configured with -object iothread,poll-max-ns=NUM.
Polling is disabled with a value of 0 nanoseconds.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20161201192652.9509-7-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The AioContext event loop uses ppoll(2) or epoll_wait(2) to monitor file
descriptors or until a timer expires. In cases like virtqueues, Linux
AIO, and ThreadPool it is technically possible to wait for events via
polling (i.e. continuously checking for events without blocking).
Polling can be faster than blocking syscalls because file descriptors,
the process scheduler, and system calls are bypassed.
The main disadvantage to polling is that it increases CPU utilization.
In classic polling configuration a full host CPU thread might run at
100% to respond to events as quickly as possible. This patch implements
a timeout so we fall back to blocking syscalls if polling detects no
activity. After the timeout no CPU cycles are wasted on polling until
the next event loop iteration.
The run_poll_handlers_begin() and run_poll_handlers_end() trace events
are added to aid performance analysis and troubleshooting. If you need
to know whether polling mode is being used, trace these events to find
out.
Note that the AioContext is now re-acquired before disabling notify_me
in the non-polling case. This makes the code cleaner since notify_me
was enabled outside the non-polling AioContext release region. This
change is correct since it's safe to keep notify_me enabled longer
(disabling is an optimization) but potentially causes unnecessary
event_notifer_set() calls. I think the chance of performance regression
is small here.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20161201192652.9509-4-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The new AioPollFn io_poll() argument to aio_set_fd_handler() and
aio_set_event_handler() is used in the next patch.
Keep this code change separate due to the number of files it touches.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20161201192652.9509-3-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Polling mode will not call ppoll(2)/epoll_wait(2). Therefore we know
there are no fds ready and should avoid looping over fd handlers in
aio_dispatch().
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20161201192652.9509-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
There is not much differences with the A0 revision apart from the DDR
calibration.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Message-id: 1480434248-27138-10-git-send-email-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The size of the SRAM depends on the SoC model, so use a per-soc
definition when creating the region.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Message-id: 1480434248-27138-9-git-send-email-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Message-id: 1480434248-27138-4-git-send-email-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Check for KVM_CAP_ADJUST_CLOCK capability KVM_CLOCK_TSC_STABLE, which
indicates that KVM_GET_CLOCK returns a value as seen by the guest at
that moment.
For new machine types, use this value rather than reading
from guest memory.
This reduces kvmclock difference on migration from 5s to 0.1s
(when max_downtime == 5s).
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Message-Id: <20161121105052.598267440@redhat.com>
[Add comment explaining what is going on. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 87f68d3182 (block: drop aio
functions that operate on the main AioContext) drops qemu_aio_wait
function references mostly while leaves these behind, clean up them.
Signed-off-by: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
Message-Id: <1480566640-27264-3-git-send-email-baiyaowei@cmss.chinamobile.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 49cf57281b (vl: delay thread initialization after daemonization)
makes the global mutex is taken after daemonization instead before
daemonization by qemu_init_main_loop().
Signed-off-by: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
Message-Id: <1480566640-27264-2-git-send-email-baiyaowei@cmss.chinamobile.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It's timer to expire, not clock.
Signed-off-by: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
Message-Id: <1480566640-27264-1-git-send-email-baiyaowei@cmss.chinamobile.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Device models often have to perform multiple access to a single
memory region that is known in advance, but would to use "DMA-style"
functions instead of address_space_map/unmap. This can happen
for example when the data has to undergo endianness conversion.
Introduce a new data structure to cache the result of
address_space_translate without forcing usage of a host address
like address_space_map does.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Templatize the address_space_* and *_phys functions, so that we can add
similar functions in the next patch that work with a lightweight,
cache-like version of address_space_map/unmap.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We've currently got 18 architectures in QEMU, and thus 18 target-xxx
folders in the root folder of the QEMU source tree. More architectures
(e.g. RISC-V, AVR) are likely to be included soon, too, so the main
folder of the QEMU sources slowly gets quite overcrowded with the
target-xxx folders.
To disburden the main folder a little bit, let's move the target-xxx
folders into a dedicated target/ folder, so that target-xxx/ simply
becomes target/xxx/ instead.
Acked-by: Laurent Vivier <laurent@vivier.eu> [m68k part]
Acked-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> [tricore part]
Acked-by: Michael Walle <michael@walle.cc> [lm32 part]
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> [s390x part]
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> [s390x part]
Acked-by: Eduardo Habkost <ehabkost@redhat.com> [i386 part]
Acked-by: Artyom Tarasenko <atar4qemu@gmail.com> [sparc part]
Acked-by: Richard Henderson <rth@twiddle.net> [alpha part]
Acked-by: Max Filippov <jcmvbkbc@gmail.com> [xtensa part]
Reviewed-by: David Gibson <david@gibson.dropbear.id.au> [ppc part]
Acked-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> [crisµblaze part]
Acked-by: Guan Xuetao <gxt@mprc.pku.edu.cn> [unicore32 part]
Signed-off-by: Thomas Huth <thuth@redhat.com>
This patch makes virtio-gpu track host memory allocations for ressources
and applies a limit (configurable 256M by default). When exceeding the
limit virtio-gpu throws VIRTIO_GPU_RESP_ERR_OUT_OF_MEMORY errors (like
it already does today when pixman image allocations fail).
This patch covers 2d mode only. For 3d mode we have to figure how we
are going to handle this best. qemu doesn't track resources in case
virglrenderer is used, so I guess we should extend virglrenderer to
allow setting a limit, then let qemu set the limit and catch
virgl_renderer_resource_create failures.
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: 李强 <liqiang6-s@360.cn>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1480423356-22255-1-git-send-email-kraxel@redhat.com
This patch fixes a cross-version migration regression introduced
by commit d1b4259f ("virtio-bus: Plug devices after features are
negotiated").
The problem is encountered when host's vhost backend does not support
VIRTIO_F_VERSION_1, and migration is initiated from a v2.7 or prior
machine with virtio-pci modern capabilities enabled to a v2.8 machine.
In this case, modern capabilities get exposed to the guest by the source,
whereas the target will detect version 1 is not supported so will only
expose legacy capabilities.
The problem is fixed by introducing a new "x-ignore-backend-features"
property, which is set in v2.7 and prior compatibility modes. Doing this,
v2.7 machine keeps its broken behaviour (enabling modern while version
is not supported), and newer machines will behave correctly.
Reported-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Message-id: 20161214163035.3297-1-maxime.coquelin@redhat.com
Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* Commit 3e76099aac ("loader: Allow a custom AddressSpace when loading
ROMs") introduced the "Rom.as" field:
(1) It modified the utility callers of rom_insert() to take "as" as a
new parameter from *their* callers, and set "rom->as" from that
parameter. The functions covered were rom_add_file() and
rom_add_elf_program().
(2) It also modified rom_insert() itself, to auto-assign
"&address_space_memory", in case the external caller passed -- and
the utility caller forwarded -- as=NULL.
Except, commit 3e76099aac forgot to update the third utility caller of
rom_insert(), under point (1), namely rom_add_blob().
* Later, commit 5e774eb3bd ("loader: Add AddressSpace loading support
to uImages") added the load_uimage_as() function, and the
rom_add_blob_fixed_as() function-like macro, with the necessary changes
elsewhere to propagate the new "as" parameter to rom_add_blob():
load_uimage_as()
load_uboot_image()
rom_add_blob_fixed_as()
rom_add_blob()
At this point, the signature (and workings) of rom_add_blob() had been
broken already, and the rom_add_blob_fixed_as() macro passed its "_as"
parameter to rom_add_blob() as "callback_opaque". Given that the
"fw_callback" parameter itself was set to NULL (correctly), this did no
additional damage (the opaque arg would never be used), but ultimately
it broke the new functionality of load_uimage_as().
* The load_uimage_as() function would be put to use in one of the later
patches, commit e481a1f63c ("generic-loader: Add a generic loader").
* We can fix this only in a unified patch now. Append "AddressSpace *as"
to the signature of rom_add_blob(), and handle the new parameter. Pass
NULL from all current callers, except from rom_add_blob_fixed_as(),
where "_as" has to be bumped to the proper position.
* Note that rom_add_file() rejects the case when both "mr" and "as" are
passed in as non-NULL. The action that this is apparently supposed to
prevent is the
rom->mr = mr;
assignment (that's the only place where the "mr" parameter is used in
rom_add_file()). In rom_add_blob() though, we have no "mr" parameter,
and the actions done on the fw_cfg branch:
if (fw_file_name && fw_cfg) {
if (mc->rom_file_has_mr) {
data = rom_set_mr(rom, OBJECT(fw_cfg), devpath);
mr = rom->mr;
} else {
data = rom->data;
}
reflect those that are performed by rom_add_file() too (with mr==NULL):
if (rom->fw_file && fw_cfg) {
if ((!option_rom || mc->option_rom_has_mr) &&
mc->rom_file_has_mr) {
data = rom_set_mr(rom, OBJECT(fw_cfg), devpath);
} else {
data = rom->data;
}
Hence we need no additional restrictions in rom_add_blob().
* Stable is not affected as both problematic commits appeared first in
v2.8.0-rc0.
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alistair Francis <alistair.francis@xilinx.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Michael Walle <michael@walle.cc>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Shannon Zhao <zhaoshenglong@huawei.com>
Cc: qemu-arm@nongnu.org
Cc: qemu-devel@nongnu.org
Fixes: 3e76099aac
Fixes: 5e774eb3bd
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQExBAABCAAbBQJYPFEQFBxwYm9uemluaUByZWRoYXQuY29tAAoJEL/70l94x66D
yF4H/3oBEgzDF9HbnSklknGhkPnOvYnNVKtJbHgk4SnZ1FlPSJLohuz15mXxbr+R
0MzWyQliHiBsAX8sMdvVVHm6YVy9JSABnsefhPUgM++1gT3+EhFsToZ9cWsAYOp7
Q4/hMc66ne0N5SWKjTlCzHfBxw3sPDvOoNYSVYjJYeASTSDQuyyVxRRWMYBFSUnD
p4m7dJCz+my8YXz6diTY8csxFRGmt49EtxtQBU1wBrFc+m8qn4UKaTXoqfcDEBe6
RceS9OAWrddv1Ds4OM/ZgD0BikYehYYnq9THvjWuqhTjHdKKYNeZAodqFJicEZmF
aAIZmhTASQo4fHuImtUja5ggYtU=
=Ht4d
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'bonzini/tags/for-upstream' into staging
Small fixes for rc2.
# gpg: Signature made Mon 28 Nov 2016 03:45:20 PM GMT
# gpg: using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* bonzini/tags/for-upstream:
rules.mak: Use -r instead of -Wl, -r to fix building when PIE is default
migration/pcspk: Turn migration of pcspk off for 2.7 and older
migration/pcspk: Add a property to state if pcspk is migrated
pci-assign: sync MSI/MSI-X cap and table with PCIDevice
megasas: clean up and fix request completion/cancellation
megasas: do not call pci_dma_unmap after having freed the frame once
Message-id: 1480372837-109736-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
To keep backwards migration compatibility allow us to turn pcspk
migration off.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20161128133201.16104-3-dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Here's the first set of 2.8 hard freeze bugfixes for ppc.
The biggest thing here is a batch of fixes for migration breakages in
both 2.7 and current 2.8. Alas, there is at least one more migration
problem, which prevents memory unplug after a migration. I hoped to
include a fix for that here, but it turned out to have some problems
bigger than those it was solving. So, I expect at least one more hard
freeze pull request.
There are also a few other assorted bug fixes.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJYNP4mAAoJEGw4ysog2bOSouIQALsw0PNpduvEsUzgEZ6GOgFw
77jEawt4me+eCgB0oipj0Bz9ho2DIGeheiFrvU7vTsD/q00CDc5kZ6GNPlY43sGM
OzT65EyycQ7MDZFDfVgpmaHjXqIGVf5zZbyz8ZD5wU3w10DdRtrDogYcjb+ZQzCG
0vRnAkV/tuVkn9Z5ogWrdvhQa0/ER3Yk/BpTXoe4JFoLgViwydkI6yCSw5dwatEU
djprDinCsBziKDT03Z9wmiTGTvZk6iGHMJWPOLJOSTBd5v9pzdpxtuNrZrF1oOQd
pBE1qlNkCpnd+LLKyW+nsTdo1FyxUg0pg7kWqnSPwqm+KM09Phpp00FN69Hmz/DR
P+aMX9qKaTJoNPHklY15pmF/olIkcxVlidNKaqgKAbZZR5BuHF3YBVILWL8ZfaeE
n6Gw0GqJeTSW5mO81uikKTZt5kqOVChHbxXcxfVl/4vzk8TTS3fy5AW0IERbfgHN
NbBesSZejqL++xzVrfoVyfJV8nkF1M+08FITQdyXpkdYVB565e9YmlIaLpZ5a7It
gLVBqbAEOaC+5swlEyAp70h+nhjVN631+b8gs+bi9trrBL9IL8q3g7U0l7XKM0Zs
MU6nxV2zogbdVraiPv9KrwtOeUKXAPUJfe3fXRr4rBYTL7HK9CBQWjaGNVtFJPKk
vWybUBSmwF402OmslZKp
=cds8
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'dgibson/tags/ppc-for-2.8-20161123' into staging
ppc patch queue 2016-11-23
Here's the first set of 2.8 hard freeze bugfixes for ppc.
The biggest thing here is a batch of fixes for migration breakages in
both 2.7 and current 2.8. Alas, there is at least one more migration
problem, which prevents memory unplug after a migration. I hoped to
include a fix for that here, but it turned out to have some problems
bigger than those it was solving. So, I expect at least one more hard
freeze pull request.
There are also a few other assorted bug fixes.
# gpg: Signature made Wed 23 Nov 2016 02:25:42 AM GMT
# gpg: using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* dgibson/tags/ppc-for-2.8-20161123:
spapr: Fix 2.7<->2.8 migration of PCI host bridge
Revert "spapr: Fix migration of PCI host bridges from qemu-2.7"
target-ppc: Allow eventual removal of old migration mistakes
migration: Add VMSTATE_UINTTL_TEST()
target-ppc: Fix CPU migration from qemu-2.6 <-> later versions
ppc: Make uninorth interrupt swizzling identical to Grackle
target-ppc: fix index array of national digits
hw/char/spapr_vty: Return amount of free buffer entries in vty_can_receive()
ppc: BOOK3E: nothing should be done when MSR:PR is set
spapr: migration support for CAS-negotiated option vectors
tests/postcopy: Use KVM on ppc64 only if it is KVM-HV
Message-id: 1479869383-16162-1-git-send-email-david@gibson.dropbear.id.au
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
daa2369 "spapr_pci: Add a 64-bit MMIO window" subtly broke migration
from qemu-2.7 to the current version. It split the device's MMIO
window into two pieces for 32-bit and 64-bit MMIO.
The patch included backwards compatibility code to convert the old
property into the new format. However, the property value was also
transferred in the migration stream and compared with a (probably
unwise) VMSTATE_EQUAL. So, the "raw" value from 2.7 is compared to
the new style converted value from (pre-)2.8 giving a mismatch and
migration failure.
Along with the actual field that caused the breakage, there are
several other ill-advised VMSTATE_EQUAL()s. To fix forwards
migration, we read the values in the stream into scratch variables and
ignore them, instead of comparing for equality. To fix backwards
migration, we populate those scratch variables in pre_save() with
adjusted values to match the old behaviour.
To permit the eventual possibility of removing this cruft from the
stream, we only include these compatibility fields if a new
'pre-2.8-migration' property is set. We clear it on the pseries-2.8
machine type, which obviously can't be migrated backwards, but set it
on earlier machine type versions.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
include/migration/cpu.h defines VMSTATE_UINTTL() and several variants
for migrating target_ulong fields. It's defined in terms of
VMSTATE_UINT32() or VMSTATE_UINT64() as appropriate.
It doesn't, however, include a VMSTATE_UINTTL_TEST() variant, which
I'm going to need shortly. So, add it.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
With the additional of the OV5_HP_EVT option vector, we now have
certain functionality (namely, memory unplug) that checks at run-time
for whether or not the guest negotiated the option via CAS. Because
we don't currently migrate these negotiated values, we are unable
to unplug memory from a guest after it's been migrated until after
the guest is rebooted and CAS-negotiation is repeated.
This patch fixes this by adding CAS-negotiated options to the
migration stream. We do this using a subsection, since the
negotiated value of OV5_HP_EVT is the only option currently needed
to maintain proper functionality for a running guest.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
In the user emulation code path, tlb_vaddr_to_host erronesously passed
vaddr as the guest address to be translated, instead of addr, the parameter
which actually contained the guest address.
This resulted in incorrect addresses being used when emulating block copy
(mvc/mvpg) and block clear (xc) instructions for the s390x target.
Signed-off-by: Bobby Bingham <koorogi@koorogi.info>
Message-Id: <20161113050523.23909-1-koorogi@koorogi.info>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Create a qdev plugged to the xen-sysbus for each new backend device.
This device can be used as a parent for all needed devices of that
backend. The id of the new device will be "xen-<type>-<dev>" with
<type> being the xen backend type (e.g. "qdisk") and <dev> the xen
backend number of the type under which it is to be found in xenstore.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
In order to have an easy way to add a new qdev with a specific id
carve out the needed functionality from qdev_device_add() into a new
function qdev_set_id().
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Add a bus for Xen backend devices in order to be able to establish a
dedicated device path for pluggable devices.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Most notably this fixes a regression with vhost introduced by the pull before
last.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJYLyObAAoJECgfDbjSjVRptPoIAK/4SdEAqS9pnXPekPZpIddV
KHCFjj4Q68s22i0jpA1hxSXk1yQZIl56dnynU0DIAbCD1NYQIEmWx7uOJjppre9O
L64V2s2ItEagFBGFwQDoJnUDIyEhth8KRqsa36V2YWJXYOaH1Rx1QNb9tX9R0aeb
2lVwYE+yig1Gc/2PAYJrcKWwM3iwWrYW6ssycP2LEOGOhBCIrGZwDJkqv7ayDVL9
j4tH2eBRrOAzm8c3fybC3OZkeLqcQJnbVONmD8kV0Q0IphcFvloJQCvcefb/3Ox1
HAz57JxZfpxMZPVtvgU8Q+xzElz8noCXg+6lF/dx71CKicwXxg4lsMF1LyKHUoU=
=cuAn
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'mst/tags/for_upstream' into staging
virtio, vhost, pc: fixes
Most notably this fixes a regression with vhost introduced by the pull before
last.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Fri 18 Nov 2016 03:51:55 PM GMT
# gpg: using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* mst/tags/for_upstream:
acpi: Use apic_id_limit when calculating legacy ACPI table size
ipmi: fix qemu crash while migrating with ipmi
ivshmem: Fix 64 bit memory bar configuration
virtio: set ISR on dataplane notifications
virtio: access ISR atomically
virtio: introduce grab/release_ioeventfd to fix vhost
virtio-crypto: fix virtio_queue_set_notification() race
Message-id: 1479484366-7977-1-git-send-email-mst@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Dataplane has been omitting forever the step of setting ISR when
an interrupt is raised. This caused little breakage, because the
specification actually says that ISR may not be updated in MSI mode.
Some versions of the Windows drivers however didn't clear MSI mode
correctly, and proceeded using polling mode (using ISR, not the used
ring index!) for crashdump and hibernation. If it were just crashdump
and hibernation it would not be a big deal, but recent releases of
Windows do not really shut down, but rather log out and hibernate to
make the next startup faster. Hence, this manifested as a more serious
hang during shutdown with e.g. Windows 8.1 and virtio-win 1.8.0 RPMs.
Newer versions fixed this, while older versions do not use MSI at all.
The failure has always been there for virtio dataplane, but it became
visible after commits 9ffe337 ("virtio-blk: always use dataplane path
if ioeventfd is active", 2016-10-30) and ad07cd6 ("virtio-scsi: always
use dataplane path if ioeventfd is active", 2016-10-30) made virtio-blk
and virtio-scsi always use the dataplane code under KVM. The good news
therefore is that it was not a bug in the patches---they were doing
exactly what they were meant for, i.e. shake out remaining dataplane bugs.
The fix is not hard, so it's worth arranging for the broken drivers.
The virtio_should_notify+event_notifier_set pair that is common to
virtio-blk and virtio-scsi dataplane is replaced with a new public
function virtio_notify_irqfd that also sets ISR. The irqfd emulation
code now need not set ISR anymore, so virtio_irq is removed.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Following the recent refactoring of virtio notifiers [1], more specifically
the patch ed08a2a0b ("virtio: use virtio_bus_set_host_notifier to
start/stop ioeventfd") that uses virtio_bus_set_host_notifier [2]
by default, core virtio code requires 'ioeventfd_started' to be set
to true/false when the host notifiers are configured.
When vhost is stopped and started, however, there is a stop followed by
another start. Since ioeventfd_started was never set to true, the 'stop'
operation triggered by virtio_bus_set_host_notifier() will not result
in a call to virtio_pci_ioeventfd_assign(assign=false). This leaves
the memory regions with stale notifiers and results on the next start
triggering the following assertion:
kvm_mem_ioeventfd_add: error adding ioeventfd: File exists
Aborted
This patch reintroduces (hopefully in a cleaner way) the concept
that was present with ioeventfd_disabled before the refactoring.
When ioeventfd_grabbed>0, ioeventfd_started tracks whether ioeventfd
should be enabled or not, but ioeventfd is actually not started at
all until vhost releases the host notifiers.
[1] http://lists.nongnu.org/archive/html/qemu-devel/2016-10/msg07748.html
[2] http://lists.nongnu.org/archive/html/qemu-devel/2016-10/msg07760.html
Reported-by: Felipe Franciosi <felipe@nutanix.com>
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reported-by: Alex Williamson <alex.williamson@redhat.com>
Fixes: ed08a2a0b ("virtio: use virtio_bus_set_host_notifier to start/stop ioeventfd")
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Tested-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This reverts commit 080ac219cc.
Legacy FW_CFG_NB_CPUS will be reused instead of 'etc/boot-cpus'
fw_cfg file since it does the same and there is no point
to maintaing duplicate guest ABI, if it can be helped.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1479212236-183810-2-git-send-email-imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Lots of fixes all over the place.
Unfortunately, this does not yet fix a regression with vhost
introduced by the last pull, the issue is typically this error:
kvm_mem_ioeventfd_add: error adding ioeventfd: File exists
followed by QEMU aborting.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJYKyfxAAoJECgfDbjSjVRpI4oH/2ZBpUxT/neq4ezX0bou5+1R
lQ1m0VI1JDbay5uYw0Z0rUY7aLP0kX2XLu2jNBZg7fGz3+BPhqAoEjkGdlUran79
fEnuYCvMMokQNzvMaHv+lFXO/MuEJthdDeUJyi4B0NU0sseutiz8opXuSWIC8Ncx
pyqRb8AfgKnsUSRizEVfiOWI1fk+zsTFbSyUwajwKi5E7RNEuHwLiqk5VFhzrrTX
nLwnUvlH7NrcDfliam9ziadhguo5cwCE4jBlMpeHnW5tGalNRulvF5EgwXybIdrU
JaR6lzQabOcoaAuJJ/dgo336B1Ef3JA/hogqfTW4unJGL5VVkWT0HLZ9OV42NEg=
=ibZy
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
virtio, vhost, pc, pci: documentation, fixes and cleanups
Lots of fixes all over the place.
Unfortunately, this does not yet fix a regression with vhost
introduced by the last pull, the issue is typically this error:
kvm_mem_ioeventfd_add: error adding ioeventfd: File exists
followed by QEMU aborting.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* remotes/mst/tags/for_upstream: (28 commits)
docs: add PCIe devices placement guidelines
virtio: drop virtio_queue_get_ring_{size,addr}()
vhost: drop legacy vring layout bits
vhost: adapt vhost_verify_ring_mappings() to virtio 1 ring layout
nvdimm acpi: introduce NVDIMM_DSM_MEMORY_SIZE
nvdimm acpi: use aml_name_decl to define named object
nvdimm acpi: rename nvdimm_dsm_reserved_root
nvdimm acpi: fix two comments
nvdimm acpi: define DSM return codes
nvdimm acpi: rename nvdimm_acpi_hotplug
nvdimm acpi: cleanup nvdimm_build_fit
nvdimm acpi: rename nvdimm_plugged_device_list
docs: improve the doc of Read FIT method
nvdimm acpi: clean up nvdimm_build_acpi
pc: memhp: stop handling nvdimm hotplug in pc_dimm_unplug
pc: memhp: move nvdimm hotplug out of memory hotplug
nvdimm acpi: drop the lock of fit buffer
qdev: hotplug: drop HotplugHandler.post_plug callback
vhost: migration blocker only if shared log is used
virtio-net: mark VIRTIO_NET_F_GSO as legacy
...
Message-id: 1479237527-11846-1-git-send-email-mst@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
These are not used anymore.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The legacy vring layout is not used anymore as we use the separate
mappings even for legacy devices.
This patch simply removes it.
This also fixes a bug with virtio 1 devices when the vring descriptor table
is mapped at a higher address than the used vring because the following
function may return an insanely great value:
hwaddr virtio_queue_get_ring_size(VirtIODevice *vdev, int n)
{
return vdev->vq[n].vring.used - vdev->vq[n].vring.desc +
virtio_queue_get_used_size(vdev, n);
}
and the mapping fails.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
With virtio 1, the vring layout is split in 3 separate regions of
contiguous memory for the descriptor table, the available ring and the
used ring, as opposed with legacy virtio which uses a single region.
In case of memory re-mapping, the code ensures it doesn't affect the
vring mapping. This is done in vhost_verify_ring_mappings() which assumes
the device is legacy.
This patch changes vhost_verify_ring_mappings() to check the mappings of
each part of the vring separately.
This works for legacy mappings as well.
Cc: qemu-stable@nongnu.org
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Rename it to nvdimm_plug()
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
as they use completely different way to handle hotplug event
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
as there is a global lock to protect vm-exit handlers and
QMP/monitor, this lock can be dropped
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
as nvdimm acpi is okay to build fit when the nvdimm device
has not been 'realized'
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Legacy features are those that transitional devices only
expose on the legacy interface.
Allow different ones per device class.
Cc: qemu-stable@nongnu.org # dependency for the next patch
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
We should not use cpu_to_le16() here, instead each of device/function
value is stored in a 8 byte field.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The function does not fully initialize the returned VirtQueueElement and should
be used only internally from the virtio module.
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The function undoes the effect of virtqueue_pop and doesn't do anything
destructive or irreversible so virtqueue_unpop is a more fitting name.
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
iQIcBAABAgAGBQJYKoq1AAoJEL2+eyfA3jBXomcP/2TCNW44gcUoHBovV71q2O0T
IFx5/3M5hKrbRkTtXjl0jKaNwLIIwATHHUqpJ5hPXPFpD7TWht9hgP7K8kGGzjRa
fWBVYtbOOzrHs5DcaKMzPzQKkptJBY4gdEBFUaLMetwjzKOu/6psEq+FnG34CTwO
+PY9AirWtOP6q4Xd0FDbQmVNmMZv0AI+/b2UMuRhsmG0vfyqdurD/yG5dXqKY9UQ
O7eWSZyXolUJIbYI9rDteKJLoSDHfwy/+kDX4Hm79g9pRpaWsi484QYj/6y9orwP
wEJhDGLPEpKCSdGGtX2qq1teqmEvVm8j+sDUaiQcRMQm+z0EE8XuBV0e9SaCDgOo
gTRs1j6kkV65fPjyfdwbsVHl0q3D3KaLJiUdIbIuXqHEVU8Z2e4GWCUL0mEghSpo
tWCLWEM1m1YXpPSRS0pvHq6Qy1nGX6T5UxK8M1jvhYW/fNPft83TF274+rl5wZlq
3bu6EJMR+s/TR0k6RCFBWZfZ9IxrVW4zXZUcHs0iuaHNeXYwSx+fwRt+2dN2hWw+
Err4rCpHBu7zdsHSm4muZzvKp/iIpDQJiyT9r6FxOwfEH/dE+YtAnW9GxKNuXDl0
8QzVLtKaJOASil4wlpEQ+uRhZvYIesoHLXEpNbSvpLjn73CQ4p0Zs9vkbkG0SqhE
disn0383BEyJXCfd7iAv
=sHj+
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'jtc/tags/block-pull-request' into staging
# gpg: Signature made Tue 15 Nov 2016 04:10:29 AM GMT
# gpg: using RSA key 0xBDBE7B27C0DE3057
# gpg: Good signature from "Jeffrey Cody <jcody@redhat.com>"
# gpg: aka "Jeffrey Cody <jeff@codyprime.org>"
# gpg: aka "Jeffrey Cody <codyprime@gmail.com>"
# Primary key fingerprint: 9957 4B4D 3474 90E7 9D98 D624 BDBE 7B27 C0DE 3057
* jtc/tags/block-pull-request:
mirror: do not flush every time the disks are synced
block/curl: Do not wait for data beyond EOF
block/curl: Remember all sockets
block/curl: Fix return value from curl_read_cb
block/curl: Use BDRV_SECTOR_SIZE
block/curl: Drop TFTP "support"
qemu-iotests: avoid spurious failure on test 109
iotests: add transactional failure race test
blockjob: refactor backup_start as backup_job_create
blockjob: add block_job_start
blockjob: add .start field
blockjob: add .clean property
blockjob: fix dead pointer in txn list
Message-id: 1479183291-14086-1-git-send-email-jcody@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Refactor backup_start as backup_job_create, which only creates the job,
but does not automatically start it. The old interface, 'backup_start',
is not kept in favor of limiting the number of nearly-identical interfaces
that would have to be edited to keep up with QAPI changes in the future.
Callers that wish to synchronously start the backup_block_job can
instead just call block_job_start immediately after calling
backup_job_create.
Transactions are updated to use the new interface, calling block_job_start
only during the .commit phase, which helps prevent race conditions where
jobs may finish before we even finish building the transaction. This may
happen, for instance, during empty block backup jobs.
Reported-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1478587839-9834-6-git-send-email-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
Instead of automatically starting jobs at creation time via backup_start
et al, we'd like to return a job object pointer that can be started
manually at later point in time.
For now, add the block_job_start mechanism and start the jobs
automatically as we have been doing, with conversions job-by-job coming
in later patches.
Of note: cancellation of unstarted jobs will perform all the normal
cleanup as if the job had started, particularly abort and clean. The
only difference is that we will not emit any events, because the job
never actually started.
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1478587839-9834-5-git-send-email-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
Add an explicit start field to specify the entrypoint. We already have
ownership of the coroutine itself AND managing the lifetime of the
coroutine, let's take control of creation of the coroutine, too.
This will allow us to delay creation of the actual coroutine until we
know we'll actually start a BlockJob in block_job_start. This avoids
the sticky question of how to "un-create" a Coroutine that hasn't been
started yet.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1478587839-9834-4-git-send-email-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
Cleaning up after we have deferred to the main thread but before the
transaction has converged can be dangerous and result in deadlocks
if the job cleanup invokes any BH polling loops.
A job may attempt to begin cleaning up, but may induce another job to
enter its cleanup routine. The second job, part of our same transaction,
will block waiting for the first job to finish, so neither job may now
make progress.
To rectify this, allow jobs to register a cleanup operation that will
always run regardless of if the job was in a transaction or not, and
if the transaction job group completed successfully or not.
Move sensitive cleanup to this callback instead which is guaranteed to
be run only after the transaction has converged, which removes sensitive
timing constraints from said cleanup.
Furthermore, in future patches these cleanup operations will be performed
regardless of whether or not we actually started the job. Therefore,
cleanup callbacks should essentially confine themselves to undoing create
operations, e.g. setup actions taken in what is now backup_start.
Reported-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1478587839-9834-3-git-send-email-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
The XSCOM addresses for the core registers are encoded in a slightly
different way on POWER8 and POWER9.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
PnvChip is defined twice and this can confuse old compilers :
CC ppc64-softmmu/hw/ppc/pnv_xscom.o
In file included from qemu.git/hw/ppc/pnv.c:29:
qemu.git/include/hw/ppc/pnv.h:60: error: redefinition of typedef ‘PnvChip’
qemu.git/include/hw/ppc/pnv_xscom.h:24: note: previous declaration of ‘PnvChip’ was here
make[1]: *** [hw/ppc/pnv.o] Error 1
make[1]: *** Waiting for unfinished jobs....
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
All the variants for rol/ror have a bug in case where the shift == 0.
For example rol32, would generate:
return (word << 0) | (word >> 32);
Which though works, would be flagged as a runtime error on clang's
sanitizer.
Suggested-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* NBD write zeroes support (Eric)
* Memory backend fixes (Haozhong)
* Atomics fix (Alex)
* New AVX512 features (Luwei)
* "make check" logging fix (Paolo)
* Chardev refactoring fallout (Paolo)
* Small checkpatch improvements (Paolo, Jeff)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQExBAABCAAbBQJYGaRPFBxwYm9uemluaUByZWRoYXQuY29tAAoJEL/70l94x66D
XKgH/RgNtosBTqJsmphkS7wACFAFOf7Uq46ajoKfB66Pt1J/++pFQg4TApPYkb7j
KlKeKmXa7hb6+Jg8325H4zGkGno4kn2dE+OnznaB1xPKwiZVAMQVzQsagsEVqpno
k/5PBVRptIiuHQKyU29Go0CxbWJBTH0O14S7rDK4YDF0YMnuT280HQOI3jdu1igV
G/Q+CMgfk+yXf6GWHE8Z9sNq7n0ha8qgruA/X3NC7+pAvEsUcAP065zwLp9weYuK
W1MU68L7Ub4tRo0SVf1HFkDUNdMv4T4hg+wpGe1GwthJWexHu9x0YAQBy60ykJb6
NtHwjLwCUWtm7AiZD/btsOJPmjk=
=+Dt/
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* NBD bugfix (Changlong)
* NBD write zeroes support (Eric)
* Memory backend fixes (Haozhong)
* Atomics fix (Alex)
* New AVX512 features (Luwei)
* "make check" logging fix (Paolo)
* Chardev refactoring fallout (Paolo)
* Small checkpatch improvements (Paolo, Jeff)
# gpg: Signature made Wed 02 Nov 2016 08:31:11 AM GMT
# gpg: using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream: (30 commits)
main-loop: Suppress I/O thread warning under qtest
docs/rcu.txt: Fix minor typo
vl: exit qemu on guest panic if -no-shutdown is not set
checkpatch: allow spaces before parenthesis for 'coroutine_fn'
x86: add AVX512_4VNNIW and AVX512_4FMAPS features
slirp: fix CharDriver breakage
qemu-char: do not forward events through the mux until QEMU has started
nbd: Implement NBD_CMD_WRITE_ZEROES on client
nbd: Implement NBD_CMD_WRITE_ZEROES on server
nbd: Improve server handling of shutdown requests
nbd: Refactor conversion to errno to silence checkpatch
nbd: Support shorter handshake
nbd: Less allocation during NBD_OPT_LIST
nbd: Let client skip portions of server reply
nbd: Let server know when client gives up negotiation
nbd: Share common option-sending code in client
nbd: Send message along with server NBD_REP_ERR errors
nbd: Share common reply-sending code in server
nbd: Rename struct nbd_request and nbd_reply
nbd: Rename NbdClientSession to NBDClientSession
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Introduce this field to control whether ACPI build is enabled by a
particular machine or accelerator.
It defaults to true if the machine itself supports ACPI build. Xen
accelerator will disable it because Xen is in charge of building ACPI
tables for the guest.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Tested-by: Sander Eikelenboom <linux@eikelenboom.it>
Upstream NBD protocol recently added the ability to efficiently
write zeroes without having to send the zeroes over the wire,
along with a flag to control whether the client wants to allow
a hole.
Note that when it comes to requiring full allocation, vs.
permitting optimizations, the NBD spec intentionally picked a
different sense for the flag; the rules in qemu are:
MAY_UNMAP == 0: must write zeroes
MAY_UNMAP == 1: may use holes if reads will see zeroes
while in NBD, the rules are:
FLAG_NO_HOLE == 1: must write zeroes
FLAG_NO_HOLE == 0: may use holes if reads will see zeroes
In all cases, the 'may use holes' scenario is optional (the
server need not use a hole, and must not use a hole if
subsequent reads would not see zeroes).
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1476469998-28592-16-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
NBD commit 6d34500b clarified how clients and servers are supposed
to behave before closing a connection. It added NBD_REP_ERR_SHUTDOWN
(for the server to announce it is about to go away during option
haggling, so the client should quit sending NBD_OPT_* other than
NBD_OPT_ABORT) and ESHUTDOWN (for the server to announce it is about
to go away during transmission, so the client should quit sending
NBD_CMD_* other than NBD_CMD_DISC). It also clarified that
NBD_OPT_ABORT gets a reply, while NBD_CMD_DISC does not.
This patch merely adds the missing reply to NBD_OPT_ABORT and teaches
the client to recognize server errors. Actually teaching the server
to send NBD_REP_ERR_SHUTDOWN or ESHUTDOWN would require knowing that
the server has been requested to shut down soon (maybe we could do
that by installing a SIGINT handler in qemu-nbd, which transitions
from RUNNING to a new state that waits for the client to react,
rather than just out-right quitting - but that's a bigger task for
another day).
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1476469998-28592-15-git-send-email-eblake@redhat.com>
[Move dummy ESHUTDOWN to include/qemu/osdep.h. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The NBD Protocol allows the server and client to mutually agree
on a shorter handshake (omit the 124 bytes of reserved 0), via
the server advertising NBD_FLAG_NO_ZEROES and the client
acknowledging with NBD_FLAG_C_NO_ZEROES (only possible in
newstyle, whether or not it is fixed newstyle). It doesn't
shave much off the wire, but we might as well implement it.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alex Bligh <alex@alex.org.uk>
Message-Id: <1476469998-28592-13-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Rather than open-coding each option request, it's easier to
have common helper functions do the work. That in turn requires
having convenient packed types for handling option requests
and replies.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1476469998-28592-9-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Our coding convention prefers CamelCase names, and we already
have other existing structs with NBDFoo naming. Let's be
consistent, before later patches add even more structs.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1476469998-28592-6-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Current upstream NBD documents that requests have a 16-bit flags,
followed by a 16-bit type integer; although older versions mentioned
only a 32-bit field with masking to find flags. Since the protocol
is in network order (big-endian over the wire), the ABI is unchanged;
but dealing with the flags as a separate field rather than masking
will make it easier to add support for upcoming NBD extensions that
increase the number of both flags and commands.
Improve some comments in nbd.h based on the current upstream
NBD protocol (https://github.com/yoe/nbd/blob/master/doc/proto.md),
and touch some nearby code to keep checkpatch.pl happy.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1476469998-28592-3-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The NBD protocol allows servers to advertise a human-readable
description alongside an export name during NBD_OPT_LIST. Add
an option to pass through the user's string to the NBD client.
Doing this also makes it easier to test commit 200650d4, which
is the client counterpart of receiving the description.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1476469998-28592-2-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
_GPE.E04 is dedicated for nvdimm device hotplug
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The buffer is used to save the FIT info for all the presented nvdimm
devices which is updated after the nvdimm device is plugged or
unplugged. In the later patch, it will be used to construct NVDIMM
ACPI _FIT method which reflects the presented nvdimm devices after
nvdimm hotplug
As FIT buffer can not completely mapped into guest address space,
OSPM will exit to QEMU multiple times, however, there is the race
condition - FIT may be changed during these multiple exits, so that
some rules are introduced:
1) the user should hold the @lock to access the buffer and
2) mark @dirty whenever the buffer is updated.
@dirty is cleared for the first time OSPM gets fit buffer, if
dirty is detected in the later access, OSPM will restart the
access
As fit should be updated after nvdimm device is successfully realized
so that a new hotplug callback, post_hotplug, is introduced
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
For each NVDIMM present or intended to be supported by platform,
platform firmware also exposes an ACPI Namespace Device under
the root device
So it builds nvdimm devices for all slots to support vNVDIMM hotplug
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Make crypto operations are executed asynchronously,
so that other QEMU threads and monitor couldn't
be blocked at the virtqueue handling context.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We use an opaque point to the VirtIOCryptoReq which
can support different packets based on different
algorithms.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Introduces VirtIOCryptoReq structure to store
crypto request so that we can easily support
asynchronous crypto operation in the future.
At present, we only support cipher and algorithm
chaining.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Expose the capacity of algorithms supported by
virtio crypto device to the frontend driver using
pci configuration space.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Introduce the virtio crypto realization, I'll
finish the core code in the following patches. The
thoughts came from virtio net realization.
For more information see:
http://qemu-project.org/Features/VirtioCrypto
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reuse the existing locking provided by stdio to keep in_asm, cpu,
op, op_opt, op_ind, and out_asm as contiguous blocks.
While it isn't possible to interleave e.g. in_asm or op_opt logs
because of the TB lock protecting all code generation, it is
possible to interleave cpu logs, or to interleave a cpu dump with
an out_asm dump.
For mingw32, we appear to have no viable solution for this. The locking
functions are not properly exported from the system runtime library.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Implement error_vprintf to send the output of error_report to
the test log. This silences test-vmstate.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1477326663-67817-3-git-send-email-pbonzini@redhat.com>
Leave the implementation of error_vprintf and error_vprintf_unless_qmp
(the latter now trivially wrapped by error_printf_unless_qmp) to
libqemustub.a and monitor.c. This has two advantages: it lets us
remove the monitor_printf and monitor_vprintf stubs, and it lets
tests provide a different implementation of the functions that uses
g_test_message.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1477326663-67817-2-git-send-email-pbonzini@redhat.com>
(Trivial)
Fix wrong function names in documentation.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1477584421-1399-8-git-send-email-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
To make it a little more obvious which functions are intended to be
public interface and which are intended to be for use only by jobs
themselves, split the interface into "public" and "private" files.
Convert blockjobs (e.g. block/backup) to using the private interface.
Leave blockdev and others on the public interface.
There are remaining uses of private state by qemu-img, and several
cases in blockdev.c and block/io.c where we grab job->blk for the
purposes of acquiring an AIOContext.
These will be corrected in future patches.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1477584421-1399-7-git-send-email-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
BlockJobs will begin hiding their state in preparation for some
refactorings anyway, so let's internalize the user_pause mechanism
instead of leaving it to callers to correctly manage.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1477584421-1399-6-git-send-email-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
There's no reason to leave this to blockdev; we can do it in blockjobs
directly and get rid of an extra callback for most users.
All non-internal events, even those created outside of QMP, will
consistently emit events.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1477584421-1399-5-git-send-email-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
Bubble up the internal interface to commit and backup jobs, then switch
replication tasks over to using this methodology.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1477584421-1399-4-git-send-email-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
Add the ability to create jobs without an ID.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1477584421-1399-3-git-send-email-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
If jobs are not created directly by the user, do not allow them to be
seen by the user/management utility. At the moment, 'internal' jobs are
those that do not have an ID. As of this patch it is impossible to
create such jobs.
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1477584421-1399-2-git-send-email-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJYF2zfAAoJEH8JsnLIjy/WOsAP/1TLKU8ZZ/TLpONfUD6SYcdt
rPxhuxlxRl7k1xP/Y4oYa+Nl6Q9emgIcTA3V3hqKNEmaqxQWEM56Q+l2cFhaqqVm
CO/bNJ0vGDogAG0ahgPCs9XKx7IPByb/iQiE+FNL68ZPZAWIaEHYgoiCjukr6e8B
Q/OXN+WQLM4SIxnfLCYKSCycdGsaSTSx1T/8IjQb0DbpMFtDHOPkXcf/7AsiNJdP
wdcx0uDkFk3GK2hN4A/ODFQQpEoi6ehm/RaaiDqGEX93IWnFlXRpq4h+dih79jaI
Xuf9vt+uBKqTW5EiwY/lUmalf+zxj6wHNkvNze9paiacAvuT4N4Vq4+niDvRAbeI
gkeC2GNRdO0iD+iBMHKQQ0JelBn0y/B+txzurcbZd71d0GF3r6jC17BEwnM39veR
iIARK/dZEvygYRQknTG/EkbYHBdpWNmjKeVBr/08L7v7r9pN15huNx6Cmr+klz08
ibstYxyCdmhvAf+UFELDZh3MF+65dpv8+Sa2OYS1SeE7jO9IhkF1SSyaGtSglvtO
AN9xrJnaOBlLdVLycLpuH+1bcpXIYwMaoHeVsDmeBI5N9QPYlD73Y0ns7vENLR59
2kUtyhOORdaE+y69hjuRmNYIN8svszzDk6or95aGZB3r4yv7uu2w+sGyy2dR+pdZ
m+80c3xshudX9rUyHlJb
=Hglo
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches
# gpg: Signature made Mon 31 Oct 2016 16:10:07 GMT
# gpg: using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (29 commits)
qapi: allow blockdev-add for NFS
block/nfs: Introduce runtime_opts in NFS
block: Mention replication in BlockdevDriver enum docs
qemu-iotests: test 'offset' and 'size' options in raw driver
raw_bsd: add offset and size options
qemu-iotests: Test the 'base-node' parameter of 'block-stream'
block: Add 'base-node' parameter to the 'block-stream' command
qemu-iotests: Test streaming to a Quorum child
qemu-iotests: Add iotests.supports_quorum()
qemu-iotests: Test block-stream and block-commit in parallel
qemu-iotests: Test overlapping stream and commit operations
qemu-iotests: Test block-stream operations in parallel
qemu-iotests: Test streaming to an intermediate layer
docs: Document how to stream to an intermediate layer
block: Add QMP support for streaming to an intermediate layer
block: Support streaming to an intermediate layer
block: Block all intermediate nodes in commit_active_start()
block: Block all nodes involved in the block-commit operation
block: Check blockers in all nodes involved in a block-commit job
block: Use block_job_add_bdrv() in backup_start()
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
With a vfio assigned device we lay down a base MemoryRegion registered
as an IO region, giving us read & write accessors. If the region
supports mmap, we lay down a higher priority sub-region MemoryRegion
on top of the base layer initialized as a RAM device pointer to the
mmap. Finally, if we have any quirks for the device (ie. address
ranges that need additional virtualization support), we put another IO
sub-region on top of the mmap MemoryRegion. When this is flattened,
we now potentially have sub-page mmap MemoryRegions exposed which
cannot be directly mapped through KVM.
This is as expected, but a subtle detail of this is that we end up
with two different access mechanisms through QEMU. If we disable the
mmap MemoryRegion, we make use of the IO MemoryRegion and service
accesses using pread and pwrite to the vfio device file descriptor.
If the mmap MemoryRegion is enabled and results in one of these
sub-page gaps, QEMU handles the access as RAM, using memcpy to the
mmap. Using either pread/pwrite or the mmap directly should be
correct, but using memcpy causes us problems. I expect that not only
does memcpy not necessarily honor the original width and alignment in
performing a copy, but it potentially also uses processor instructions
not intended for MMIO spaces. It turns out that this has been a
problem for Realtek NIC assignment, which has such a quirk that
creates a sub-page mmap MemoryRegion access.
To resolve this, we disable memory_access_is_direct() for ram_device
regions since QEMU assumes that it can use memcpy for those regions.
Instead we access through MemoryRegionOps, which replaces the memcpy
with simple de-references of standard sizes to the host memory.
With this patch we attempt to provide unrestricted access to the RAM
device, allowing byte through qword access as well as unaligned
access. The assumption here is that accesses initiated by the VM are
driven by a device specific driver, which knows the device
capabilities. If unaligned accesses are not supported by the device,
we don't want them to work in a VM by performing multiple aligned
accesses to compose the unaligned access. A down-side of this
philosophy is that the xp command from the monitor attempts to use
the largest available access weidth, unaware of the underlying
device. Using memcpy had this same restriction, but at least now an
operator can dump individual registers, even if blocks of device
memory may result in access widths beyond the capabilities of a
given device (RTL NICs only support up to dword).
Reported-by: Thorsten Kohfeldt <thorsten.kohfeldt@gmx.de>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Setting skip_dump on a MemoryRegion allows us to modify one specific
code path, but the restriction we're trying to address encompasses
more than that. If we have a RAM MemoryRegion backed by a physical
device, it not only restricts our ability to dump that region, but
also affects how we should manipulate it. Here we recognize that
MemoryRegions do not change to sometimes allow dumps and other times
not, so we replace setting the skip_dump flag with a new initializer
so that we know exactly the type of region to which we're applying
this behavior.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
When a block job is created on a certain BlockDriverState, operations
are blocked there while the job exists. However, some block jobs may
involve additional BDSs, which must be blocked separately when the job
is created and unblocked manually afterwards.
This patch adds block_job_add_bdrv(), that simplifies this process by
keeping a list of BDSs that are involved in the specified block job.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bdrv_drain_all() doesn't allow the caller to do anything after all
pending requests have been completed but before block jobs are
resumed.
This patch splits bdrv_drain_all() into _begin() and _end() for that
purpose. It also adds aio_{disable,enable}_external() calls to disable
external clients in the meantime.
An important restriction of this split is that no new block jobs or
BlockDriverStates can be created between the bdrv_drain_all_begin()
and bdrv_drain_all_end() calls. This is not a concern now because
we'll only be using this in bdrv_reopen_multiple(), but it must be
dealt with if we ever have other uses cases in the future.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Make inet_connect_saddr() in util/qemu-sockets.c public in order to be
able to use it with InetSocketAddress sockets outside of
util/qemu-sockets.c independently.
Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This changes the *_run_on_cpu APIs (and helpers) to pass data in a
run_on_cpu_data type instead of a plain void *. This is because we
sometimes want to pass a target address (target_ulong) and this fails on
32 bit hosts emulating 64 bit guests.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20161027151030.20863-24-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQItBAABCAAXBQJYFc37EBxhbWl0QGtlcm5lbC5vcmcACgkQ6wtN/GV+9nDdqQ/9
H+di+q/zF6aOEuXRCN0ud9R81fZ7nLve3e9EbKCNZXnxN3M3+zQj0At+e/SBEoTc
rRwqJmNlRX3TWViWsAYmddgtopZ9R9DWwW/VsKjS2Ng230YShrA/o20hu2dJkwl3
CN4vAObzc/gxM59NWUlMnTXOG+Z9fI1NlEf0vZ2484a59KPVIE7W1zZccT1F8MNq
sfa3RlOGBchNO2Rfzrr6cFGGH7UTfWiftPs7EDOfN/YBcpld6V4DfPWdPP8r2DSX
gG7D3DJuuqPxZCjl0Nm1OjaIunYfVrRpMPMeNyo1+kTVbvvhDPFjah/MSWu8XJ2c
N5lSqikIAWfMbQVW9gDpQ0495eRQTWA7VIlCbwN6mqNxyBvQMA+licFq6UFFrCMp
quC3gO+daz3fKvUhi23TpebbqKLHA5OZA5ZvkjGDgkKPMCJyQRBZHLb81t5xsulZ
cXuzAOeRcXK9aEJvcrDgWwxzi3PN8zg74RF1ZV8gxM4DkHKohlsnbgshWqGkFh3M
+S5tEPqVlOlDO9juf6rlwNnVbWhFDFEGMKjI9XMTWwVWTREbqyP86fP66h2C4qc6
34yAHi5G2i43dxzHgpHC0MpU0XenO0EYdu+8Tcx35LSBSfkOeD7pU1DeZQgbI51m
ZQSnLDJqv2HVdoZT7vaIjwuhtEl84xqelFdVg+cY7Gw=
=sur7
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/amit-migration/tags/migration-for-2.8' into staging
Migration bits from the COLO project
# gpg: Signature made Sun 30 Oct 2016 10:39:55 GMT
# gpg: using RSA key 0xEB0B4DFC657EF670
# gpg: Good signature from "Amit Shah <amit@amitshah.net>"
# gpg: aka "Amit Shah <amit@kernel.org>"
# gpg: aka "Amit Shah <amitshah@gmx.net>"
# Primary key fingerprint: 48CA 3722 5FE7 F4A8 B337 2735 1E9A 3B5F 8540 83B6
# Subkey fingerprint: CC63 D332 AB8F 4617 4529 6534 EB0B 4DFC 657E F670
* remotes/amit-migration/tags/migration-for-2.8:
MAINTAINERS: Add maintainer for COLO framework related files
configure: Support enable/disable COLO feature
docs: Add documentation for COLO feature
COLO: Implement failover work for secondary VM
COLO: Implement the process of failover for primary VM
COLO: Introduce state to record failover process
COLO: Add 'x-colo-lost-heartbeat' command to trigger failover
COLO: Synchronize PVM's state to SVM periodically
COLO: Add checkpoint-delay parameter for migrate-set-parameters
COLO: Load VMState into QIOChannelBuffer before restore it
COLO: Send PVM state to secondary side when do checkpoint
COLO: Add a new RunState RUN_STATE_COLO
COLO: Introduce checkpointing protocol
COLO: Establish a new communicating path for COLO
migration: Switch to COLO process after finishing loadvm
migration: Enter into COLO mode after migration if COLO is enabled
COLO: migrate COLO related info to secondary node
migration: Introduce capability 'x-colo' to migration
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Version: GnuPG v2
iQEcBAABCAAGBQJYE2ULAAoJEMo1YkxqkXHGvvAH/iDPIAiwBXbndL3KhQTneSHn
ctd4I3VK1/VVTIBRJIetqETiWiAm/WoRhI9kBc/NrQxBFx3ko+fpSYFS2t6lJYnV
EX0vjTKjFhr05tOTQDH/SQtHdU5x/x2M8SsxqrCcTyLm5VDfdPeBlMBfSNMj/L2K
bwinANVEwr6LOM0h8weQ0SvOCa5MLII2p5ufGwKQmhUY5tgZvFlyPa+quDVisKoE
7CpLwWHmUQSNxUXSaru90osUJyk90wCcYxPpJN3YO1MHvpH4kG8DpZ8bnFqLAoNw
zkRdqIrlfntD+mKDqRU1y0GXxu9I4VK1UDcQyRFoSdMi2oHR+L018sQEjCYTAXo=
=n+CF
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/famz/tags/for-upstream' into staging
# gpg: Signature made Fri 28 Oct 2016 15:47:39 BST
# gpg: using RSA key 0xCA35624C6A9171C6
# gpg: Good signature from "Fam Zheng <famz@redhat.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 5003 7CB7 9706 0F76 F021 AD56 CA35 624C 6A91 71C6
* remotes/famz/tags/for-upstream:
aio: convert from RFifoLock to QemuRecMutex
qemu-thread: introduce QemuRecMutex
iothread: release AioContext around aio_poll
block: only call aio_poll on the current thread's AioContext
qemu-img: call aio_context_acquire/release around block job
qemu-io: acquire AioContext
block: prepare bdrv_reopen_multiple to release AioContext
replication: pass BlockDriverState to reopen_backing_file
iothread: detach all block devices before stopping them
aio: introduce qemu_get_current_aio_context
sheepdog: use BDRV_POLL_WHILE
nfs: use BDRV_POLL_WHILE
nfs: move nfs_set_events out of the while loops
block: introduce BDRV_POLL_WHILE
qed: Implement .bdrv_drain
block: change drain to look only at one child at a time
block: add BDS field to count in-flight requests
mirror: use bdrv_drained_begin/bdrv_drained_end
blockjob: introduce .drain callback for jobs
replication: interrupt failover if the main device is closed
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
softmmu requires more functions to be thread-safe, because translation
blocks can be invalidated from e.g. notdirty callbacks. Probably the
same holds for user-mode emulation, it's just that no one has ever
tried to produce a coherent locking there.
This patch will guide the introduction of more tb_lock and tb_unlock
calls for system emulation.
Note that after this patch some (most) of the mentioned functions are
still called outside tb_lock/tb_unlock. The next one will rectify this.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-Id: <20161027151030.20863-7-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This adds asserts to check the locking on the various translation
engines structures. There are two sets of structures that are protected
by locks.
The first the l1map and PageDesc structures used to track which
translation blocks are associated with which physical addresses. In
user-mode this is covered by the mmap_lock.
The second case are TB context related structures which are protected by
tb_lock which is also user-mode only.
Currently the asserts do nothing in SoftMMU mode but this will change
for MTTCG.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-Id: <20161027151030.20863-4-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Introduce the virtio_crypto.h which follows
virtio-crypto specification.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch adds session operation and crypto operation
stuff in the cryptodev backend, including function
pointers and corresponding structures.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
cryptodev backend interface is used to realize the active work for
virtual crypto device.
This patch only add the framework, doesn't include specific operations.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Of the three possible parameter combinations for
virtio_queue_set_host_notifier_fd_handler:
- assign=true/set_handler=true is only called from
virtio_device_start_ioeventfd
- assign=false/set_handler=false is called from
set_host_notifier_internal but it only does something when
reached from virtio_device_stop_ioeventfd_impl; otherwise
there is no EventNotifier set on qemu_get_aio_context().
- assign=true/set_handler=false is called from
set_host_notifier_internal, but it is not doing anything:
with the new start_ioeventfd and stop_ioeventfd methods,
there is never an EventNotifier set on qemu_get_aio_context()
at this point. This is enforced by the assertion in
virtio_bus_set_host_notifier.
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
ioeventfd_disabled was the only reason for the default
implementation of virtio_device_start_ioeventfd not to use
virtio_bus_set_host_notifier. This is now fixed, and the sole entry
point to set up ioeventfd can be virtio_bus_set_host_notifier.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Now that there is not anymore a switch from the generic ioeventfd handler
to the dataplane handler, virtio_bus_set_host_notifier(assign=true) is
always called with !bus->ioeventfd_started, hence virtio_bus_stop_ioeventfd
does nothing in this case. Move the invocation to vhost.c, which is the
only place that needs it.
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Make virtio_device_start_ioeventfd_impl use the same logic as
dataplane to set up the host notifier. This removes the need
for the set_handler argument in set_host_notifier_internal.
This is a first step towards using virtio_bus_set_host_notifier
as the sole entry point to set up ioeventfds. At least now
the functions have the same interface, but they still differ
in that virtio_bus_set_host_notifier sets ioeventfd_disabled.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This reverts commit 872dd82c83.
virtio_add_queue_aio is unused.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Override start_ioeventfd and stop_ioeventfd to start/stop the
whole dataplane logic. This has some positive side effects:
- no need anymore for virtio_add_queue_aio (i.e. a revert of
commit 1c627137c1)
- no need anymore to switch from generic ioeventfd handlers to
dataplane
It detects some errors better:
$ qemu-system-x86_64 -object iothread,id=io \
-device virtio-scsi-pci,ioeventfd=off,iothread=io
qemu-system-x86_64: -device virtio-scsi-pci,ioeventfd=off,iothread=io:
ioeventfd is required for iothread
while previously it would have started just fine.
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This will be used to forbid iothread configuration when the
proxy does not allow using ioeventfd. To simplify the implementation,
change the direction of the ioeventfd_disabled callback too.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Allow customization of the start and stop of ioeventfd. This will
allow direct start of dataplane without passing through the default
ioeventfd handlers, which in turn allows using the dataplane logic
instead of virtio_add_queue_aio. It will also enable some code
simplification, because the sole entry point to ioeventfd setup
will be virtio_bus_set_host_notifier.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This simplifies the code and removes the ioeventfd_started
and ioeventfd_set_started callback. The only difference is
in how virtio-ccw handles an error---it doesn't disable
ioeventfd forever anymore. It was the only backend to do
so, and if desired this behavior should be implemented in
virtio-bus.c.
Instead of ioeventfd_started, the ioeventfd_assign callback now
determines whether the virtio bus supports host notifiers.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This simplifies the code and removes the ioeventfd_set_disabled
callback.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Provide a vmsd pointer for VirtIO devices to use instead of the
load/save methods.
We'll eventually kill off the load/save methods.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
For primary side, if COLO gets failover request from users.
To be exact, gets 'x_colo_lost_heartbeat' command.
COLO thread will exit the loop while the failover BH does the
cleanup work and resumes VM.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
When handling failover, COLO processes differently according to
the different stage of failover process, here we introduce a global
atomic variable to record the status of failover.
We add four failover status to indicate the different stage of failover process.
You should use the helpers to get and set the value.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
We leave users to choose whatever heartbeat solution they want,
if the heartbeat is lost, or other errors they detect, they can use
experimental command 'x_colo_lost_heartbeat' to tell COLO to do failover,
COLO will do operations accordingly.
For example, if the command is sent to the Primary side,
the Primary side will exit COLO mode, does cleanup work,
and then, PVM will take over the service work. If sent to the Secondary side,
the Secondary side will run failover work, then takes over PVM's service work.
Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
Switch from normal migration loadvm process into COLO checkpoint process if
COLO mode is enabled.
We add three new members to struct MigrationIncomingState,
'have_colo_incoming_thread' and 'colo_incoming_thread' record the COLO
related thread for secondary VM, 'migration_incoming_co' records the
original migration incoming coroutine.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
Add a new migration state: MIGRATION_STATUS_COLO. Migration source side
enters this state after the first live migration successfully finished
if COLO is enabled by command 'migrate_set_capability x-colo on'.
We reuse migration thread, so the process of checkpointing will be handled
in migration thread.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
We can determine whether or not VM in destination should go into COLO mode
by referring to the info that was migrated.
We skip this section if COLO is not enabled (i.e.
migrate_set_capability colo off), so that, It doesn't break compatibility
with migration no matter whether users configure the --enable-colo/disable-colo
on the source/destination side or not;
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
We add helper function colo_supported() to indicate whether
colo is supported or not, with which we use to control whether or not
showing 'x-colo' string to users, they can use qmp command
'query-migrate-capabilities' or hmp command 'info migrate_capabilities'
to learn if colo is supported.
The default value for COLO (COarse-Grain LOck Stepping) is disabled.
Cc: Juan Quintela <quintela@redhat.com>
Cc: Amit Shah <amit.shah@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
Prepare xen_be_del_xendev to be shared with frontends:
* xen_be_del_xendev -> xen_pv_del_xendev
Signed-off-by: Emil Condrea <emilcondrea@gmail.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Quan Xu <xuquan8@huawei.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Prepare xen_be_find_xendev to be shared with frontends:
* xen_be_find_xendev -> xen_pv_find_xendev
Signed-off-by: Emil Condrea <emilcondrea@gmail.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Quan Xu <xuquan8@huawei.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>