PCMachineState.node_cpu was used for mapping APIC ID
to numa node id as CPU entries in SRAT used to be
built on sparse APIC ID bitmap (up to apic_id_limit).
However since commit
5803fce pc: acpi: SRAT: create only valid processor lapic entries
CPU entries in SRAT aren't build using apic bitmap
but using 0..maxcpus index instead which is also used
for creating numa_info[x].node_cpu map.
So instead of doing useless intermediate conversion from
1. node by cpu index -> node by apic id
i.e. numa_info[x].node_cpu -> PCMachineState.node_cpu
2. apic id -> srat entry PMX
PCMachineState.node_cpu[apic id] -> PMX value
use numa_info[x].node_cpu map directly like ARM does and do
1. numa_info[x].node_cpu -> PMX value using index
in range 0..maxcpus
and drop not necessary PCMachineState.node_cpu and related
code.
That also removes the last (not counting legacy hotplug)
dependency of ACPI code on apic_id_limit and need to allocate
huge sparse PCMachineState.node_cpu array in case of 32-bit
APIC IDs.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
For compatibility reasons PC/Q35 will start with legacy
CPU hotplug interface by default but with new CPU hotplug
AML code since 2.7 machine type. That way legacy firmware
that doesn't use QEMU generated ACPI tables will be
able to continue using legacy CPU hotplug interface.
While new machine type, with firmware supporting QEMU
provided ACPI tables, will generate new CPU hotplug AML,
which will switch to new CPU hotplug interface when
guest OS executes its _INI method on ACPI tables
loading.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it adds HW and AML parts for CPU_Device._OST method
handling to allow OSPM reports status of hot-(un)plug
operation.
And extends QMP command query-acpi-ospm-status to report
CPU's OST info along with already reported PC-DIMM devices.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it adds hw registers needed for handling CPU hot-remove and
corresponding AML methods to request and eject a CPU with
necessary hotplug callbacks in pc,piix4,ich9 code.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it adds hw registers needed for handling CPU hot-add and
corresponding AML methods to handle hot-add events on
guest side.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add madt_cpu callback to AcpiDeviceIfClass and use
it for generating LAPIC MADT entries for CPUs.
Later it will be used for generating x2APIC
entries in case of more than 255 CPUs and also
would be reused by ARM target when ACPI CPU hotplug
is introduced there.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
it adds CPU objects to DSDT with _STA method
and QEMU side of CPU hotplug interface initialization
with registers sufficient to handle _STA requests,
including necessary hotplug callbacks in piix4,ich9 code.
Hot-(un)plug hw/acpi parts will be added by
corresponding follow up patches.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
It will be used to select which hotplug call-back is called
and for switching from legacy mode into new one.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
It will be used by NVDIMM ACPI
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Implement ObjectType which is used by NVDIMM _DSM method in
later patch
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Introduce a parameter, 'label-size', which is the size of nvdimm label
data area which is reserved at the end of backend memory. It is required
at least 128k
Two callbacks, read_label_data() and write_label_data(), are used to
operate the label area
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This callback returns the MemoryRegion that is the memory of dimm should
be kept during live migration
nvdimm device is different with pc-dimm as its memory includes not only
the MemoryRegion directly mapping to guest's address space but also the
memory used as label data
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Use the ACPI table construction tools to create an ACPI entry
for IPMI. This adds a function called build_acpi_ipmi_devices
to add an DSDT entry for IPMI if IPMI is compiled in and an
IPMI device exists. It also adds a dummy function if IPMI
is not compiled in.
This conforms to section "C3-2 Locating IPMI System Interfaces in
ACPI Name Space" in the IPMI 2.0 specification.
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add an IPMI table entry to the SMBIOS.
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently outstanding patches for spapr, target-ppc and related
devices. This batch has:
* Significant new progress towards full support for hypervisor
mode
* Assorted bugfixes
* Some preliminary patches towards dynamic DMA window support
The last involves a change to memory.c, which Paolo has said I can
take through this tree.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJXa3gJAAoJEGw4ysog2bOSzRQQAI0K+F2CoKuiGJ3csr40edlA
rDvHJd0a3DvF5ez8YBY9iiG2nXaphz7LrsfkFUQ/UmqlIpVLNdJaJ8l/RnjwrVnP
coJ9sFkzzARQF39BybGYPOVMalgi/MhnN/+zm87JaSD+C7gihG+ghYR4sgLHWalb
xWDFLnZVg3uTC1IQzS14NMjqswQN1O0d9wyfYjBogAAEvP4daGKVTQCgpoVwGaXw
JNE4Sm1vVIqJtiYr48V4cNVQWodpTMoHdVPJ6LPXWBlJMd3K5BEy3QsFc+jra+kG
Uj8dayNJwu4CjIwDWz4kpL/H0XtLNxfC2/6n1khc53OEnG5SzrgN80yJBvAKzU5S
KDQemwyOd3R2YTItkHR++QhLqHVx39Hr3BsFGXGyugAaczD7v4NLx87xmMEQOyQB
ai+B4EpQqS/lEOCxyf7nRzgRxA5uB2My9L31peWt862G/LH+UixMWn7CwKW4aKul
CuV4kqVdVQHrWe966HNYnDwfGpmaEXF9W5uLu7GenhFtW6R6IRhT1oORiyqHRfEi
9lTVx/vvHK2U3Ie3RCu3ZYuSbwDhP482J8ysfSo1iCkbSXHwNNACkLQmh0s/WmOs
X2erIOUAja/Ku2DyZ6bXngXN/W/JR6Hw1IpEMeo6LxeK8VIm7F8PkCgrJCg88EEp
ZFxGGYkc3HqksgJy1tuV
=Da1O
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.7-20160623' into staging
ppc patch queue for 2016-06-23
Currently outstanding patches for spapr, target-ppc and related
devices. This batch has:
* Significant new progress towards full support for hypervisor
mode
* Assorted bugfixes
* Some preliminary patches towards dynamic DMA window support
The last involves a change to memory.c, which Paolo has said I can
take through this tree.
# gpg: Signature made Thu 23 Jun 2016 06:47:53 BST
# gpg: using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.7-20160623:
ppc: Disable huge page support if it is not available for main RAM
ppc: Add P7/P8 Power Management instructions
ppc: Move exception generation code out of line
ppc: Turn a bunch of booleans from int to bool
ppc: Add real mode CI load/store instructions for P7 and P8
ppc: Rework generation of priv and inval interrupts
ppc: Fix generation if ISI/DSI vs. HV mode
ppc: Fix POWER7 and POWER8 exception definitions
ppc: fix exception model for HV mode
ppc: define a default LPCR value
ppc: Fix rfi/rfid/hrfi/... emulation
memory: Add reporting of supported page sizes
ppc: Improve emulation of THRM registers
target-ppc: Fix rlwimi, rlwinm, rlwnm again
ppc64: disable gen_pause() for linux-user mode
tests: Use '+=' to add additional tests, not '='
powerpc/mm: Update the WIMG check during H_ENTER
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
USB devices in attached state are visible to the guest. This patch adds
a QOM property for this. Write access is opt-in per device. Some
devices manage attached state automatically (usb-host, usb-serial,
usb-redir), so we can't enable write access universally but have to do
it on a case by case base. So far, no device opts in.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1465984019-28963-4-git-send-email-kraxel@redhat.com
[ minor codestyle fix ]
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Every IOMMU has some granularity which MemoryRegionIOMMUOps::translate
uses when translating, however this information is not available outside
the translate context for various checks.
This adds a get_min_page_size callback to MemoryRegionIOMMUOps and
a wrapper for it so IOMMU users (such as VFIO) can know the minimum
actual page size supported by an IOMMU.
As IOMMU MR represents a guest IOMMU, this uses TARGET_PAGE_SIZE
as fallback.
This removes vfio_container_granularity() and uses new helper in
memory_region_iommu_replay() when replaying IOMMU mappings on added
IOMMU memory region.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
[dwg: Removed an unnecessary calculation]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Version: GnuPG v1
iQEcBAABAgAGBQJXaFInAAoJEJykq7OBq3PI6VsH/0Sfgbdo1RksYuQwb/y92sCW
EN+lxUZ+OLfgrc8PYgNZwfSM3rsfYhznL0MAXOeEe7Ahabi07w7DhGR8WvwfAOlI
G96FRuvrIPfv5u6U6fwS4CvG3TIHVLxfHKCsTpPUmH8U5CNx/x/tpjNiWN1dj6t+
sXybSjYHfZfiZy2tI9MFIFWCdxnF/pl0QAPhbRqc8Y/RQTDrPKRjLpz+nitN/u96
5TS7KlELyQuP91YMmLceYSmIkHbxW703h+iE2n4hov0uZCP8Jil+2Jsd3ziQSRlL
j6LqexQ2ViBGdDSfiZGYES2VPlsHOCwb4G+IgWBStfZg1ppaXENvcDzPrgrB+L4=
=eUnF
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging
# gpg: Signature made Mon 20 Jun 2016 21:29:27 BST
# gpg: using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8
* remotes/stefanha/tags/tracing-pull-request: (42 commits)
trace: split out trace events for linux-user/ directory
trace: split out trace events for qom/ directory
trace: split out trace events for target-ppc/ directory
trace: split out trace events for target-s390x/ directory
trace: split out trace events for target-sparc/ directory
trace: split out trace events for net/ directory
trace: split out trace events for audio/ directory
trace: split out trace events for ui/ directory
trace: split out trace events for hw/alpha/ directory
trace: split out trace events for hw/arm/ directory
trace: split out trace events for hw/acpi/ directory
trace: split out trace events for hw/vfio/ directory
trace: split out trace events for hw/s390x/ directory
trace: split out trace events for hw/pci/ directory
trace: split out trace events for hw/ppc/ directory
trace: split out trace events for hw/9pfs/ directory
trace: split out trace events for hw/i386/ directory
trace: split out trace events for hw/isa/ directory
trace: split out trace events for hw/sd/ directory
trace: split out trace events for hw/sparc/ directory
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The event is described in "trace-events". Note that the "MO_AMASK" flag
is not traced, since it does not seem to affect the visible semantics of
instructions.
[s/inline inline/inline/ to fix clang build.
--Stefan]
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 146549350711.18437.726780393247474362.stgit@fimbulvetr.bsc.es
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
When qemu_set_log_filename() detects an invalid file name, it reports
an error, closes the log file (if any), and starts logging to stderr
(unless daemonized or nothing is being logged).
This is wrong. Asking for an invalid log file on the command line
should be fatal. Asking for one in the monitor should fail without
messing up an existing logfile.
Fix by converting qemu_set_log_filename() to Error. Pass it
&error_fatal, except for hmp_logfile report errors.
This also permits testing without a subprocess, so do that.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1466011636-6112-4-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
g_error() is not an acceptable way to report errors to the user:
$ qemu-system-x86_64 -dfilter 1000+0
** (process:17187): ERROR **: Failed to parse range in: 1000+0
Trace/breakpoint trap (core dumped)
g_assert() isn't, either:
$ qemu-system-x86_64 -dfilter 1000x+64
**
ERROR:/work/armbru/qemu/util/log.c:180:qemu_set_dfilter_ranges: assertion failed: (e == range_op)
Aborted (core dumped)
Convert qemu_set_dfilter_ranges() to Error. Rework its deeply nested
control flow. Touch up the error messages. Call it with
&error_fatal.
This also permits testing without a subprocess, so do that.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1466011636-6112-3-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Block jobs that use additional BDSes or event loop resources need a
callback to get their affairs in order when the AioContext is switched.
Simple block jobs don't need an attach callback, they automatically work
thanks to the generic attach/detach notifiers that this patch adds.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1466096189-6477-7-git-send-email-stefanha@redhat.com
It's possible that an AioContext notifier user was close to finishing
when .detach_aio_context() or .attached_aio_context() is called. In
that case they may call bdrv_remove_aio_context_notifier() during the
callback.
Use safe iteration to avoid crashing when the notifier list is modified
during iteration. We must not only handle the case where the current
aio notifier is removed during a callback but also the one where any
other aio notifier is removed.
The next patch adds an AioContext notifier for block jobs and they
really could be terminating just as .detach_aio_context() is invoked.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1466096189-6477-6-git-send-email-stefanha@redhat.com
Block jobs are coroutines that usually perform I/O but sometimes also
sleep or yield. Currently only sleeping or yielded block jobs can be
paused. This means jobs that do not sleep or yield (using
block_job_yield()) are unaffected by block_job_pause().
Add block_job_pause_point() so that block jobs can mark quiescent points
that are suitable for pausing. This solves the problem that it can take
a block job a long time to pause if it is performing a long series of
I/O operations.
Transitioning to paused state involves a .pause()/.resume() callback.
These callbacks are used to ensure that I/O and event loop activity has
ceased while the job is at a pause point.
Note that this patch introduces a stricter pause state than previously.
The job->busy flag was incorrectly documented as a quiescent state
without I/O pending. This is violated by any job that has I/O pending
across sleep or block_job_yield(), like the mirror block job.
[Add missing block_job_should_pause() check to avoid deadlock after
job->driver->pause() in block_job_pause_point().
--Stefan]
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1466096189-6477-4-git-send-email-stefanha@redhat.com
The block_job_is_paused() function name is not great because callers
only use it to determine whether pausing has been requested. Rename it
to highlight those semantics and remove it from the public header file
as there are no external callers.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1466096189-6477-3-git-send-email-stefanha@redhat.com
In ACPI 5.1 Errata, it adds GIC version in GIC distributor structure.
This is useful for guest kernel to identify which version GIC hardware
is. Update GIC distributor structure and present GIC version in MADT
table.
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-id: 1465960955-17388-1-git-send-email-zhaoshenglong@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Value matching allows Linux to boot with CONFIG_NO_HZ_IDLE=y on the
palmetto-bmc machine. Two match registers are provided for each timer.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Message-id: 1465974248-20434-1-git-send-email-andrew@aj.id.au
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Implement the GICv3 logic to recalculate the highest priority pending
interrupt for each CPU after some part of the GIC state has changed.
We avoid unnecessary full recalculation where possible.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Tested-by: Shannon Zhao <shannon.zhao@linaro.org>
Message-id: 1465915112-29272-11-git-send-email-peter.maydell@linaro.org
This patch includes the device class itself, some ID register
value functions which will be needed by both distributor
and redistributor, and some skeleton functions for handling
interrupts coming in and going out, which will be filled in
in a subsequent patch.
Signed-off-by: Shlomo Pongratz <shlomo.pongratz@huawei.com>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Tested-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1465915112-29272-10-git-send-email-peter.maydell@linaro.org
[PMM: pulled this patch earlier in the sequence, and left
some code out of it for a later patch]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Move the GICv3 parent_irq and parent_fiq pointers into the
GICv3CPUState structure rather than giving them their own array.
This will make it easy to assert the IRQ and FIQ lines for a
particular CPU interface without having to know or calculate
the CPU index for the GICv3CPUState we are working on.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Tested-by: Shannon Zhao <shannon.zhao@linaro.org>
Message-id: 1465915112-29272-8-git-send-email-peter.maydell@linaro.org
Add state information to GICv3 object structure and implement
arm_gicv3_common_reset().
This commit includes accessor functions for the fields which are
stored as bitmaps in uint32_t arrays.
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Tested-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1465915112-29272-7-git-send-email-peter.maydell@linaro.org
[PMM: significantly overhauled:
* Add missing qom/cpu.h include
* Remove legacy-only state fields (we can add them later if/when we add
legacy emulation)
* Use arrays of uint32_t to store the various distributor bitmaps,
and provide accessor functions for the various set/test/etc operations
* Add various missing register offset #defines
* Accessor macros which combine distributor and redistributor behaviour
removed
* Fields in state structures renamed to match architectural register names
* Corrected the reset value for GICR_IENABLER0 since we don't support
legacy mode
* Added ARM_LINUX_BOOT_IF interface for "we are directly booting a kernel in
non-secure" so that we can fake up the firmware-mandated reconfiguration
only when we need it
]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
A half-shuffle operation takes a word with zeros in the high half:
0000 0000 0000 0000 ABCD EFGH IJKL MNOP
and spreads the bits out so they are in every other bit of the word:
0A0B 0C0D 0E0F 0G0H 0I0J 0K0L 0M0N 0O0P
A half-unshuffle performs the reverse operation.
Provide functions in bitops.h which implement these operations
for 32-bit and 64-bit inputs, and add tests for them.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Tested-by: Shannon Zhao <shannon.zhao@linaro.org>
Message-id: 1465915112-29272-3-git-send-email-peter.maydell@linaro.org
Define a VMSTATE_UINT64_2DARRAY macro, to go with the ones we
already have for other type sizes.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Tested-by: Shannon Zhao <shannon.zhao@linaro.org>
Message-id: 1465915112-29272-2-git-send-email-peter.maydell@linaro.org
If the same GlobalProperty struct is registered twice, the list
entry gets corrupted, making tqe_next points to itself, and
qdev_prop_set_globals() gets stuck in a loop. The bug can be
easily reproduced by running:
$ qemu-system-x86_64 -rtc-td-hack -rtc-td-hack
Change global_props to use GList instead of queue.h, making the
code simpler and able to deal with properties being registered
twice.
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Here's the current accumulated set of spapr, ppc and related patches.
* The big thing in here is CPU hotplug for spapr
- This includes a number of acked generic changes adding new
infrastructure for hotplugging cpu cores
* A number of TCG bug fixes are also included
* This adds a new testcase to make it harder to accidentally break
Macintosh (and other openbios) platforms
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJXY5oxAAoJEGw4ysog2bOSMu0QAN5e3H3qt2or5dnxS6/bC7QZ
9UrDfCFiJ1HppWfesj0lSAQvHlkEdkml8O7fxeG7bzHQDdaIwy4+q2RIvDiAgnOW
u1he5yhaN6PDLzr9zowtfUySWz6bixMyrBeY2I6/hOKgPKVeifryudmlRNJ5zc1R
edwiSI4saNBYT2+KlaaQW26g765Xk0CG0FjxAyTTPH62WsUvxB1haSKHd7QTn57c
ovdrCTHzfhXaQ0WmbDvIz2v0kY6lO71Re9DRFQmlZktVtGr6E07afr+2VzBTuYLb
r6iHyi/Ed7x+kncH/5F52K7HIddNEp0gbZJgWnfM0gUSIAnya65FP/Qgkc4RdGFd
yeR6V99s/i5dU5gxoFK0MKNn61/QLLUZJiS9l6XNs92mDZn5C03C/edINOv8tbRe
TG+/TghwoFfojodi3cEM+TrDZv7E73RgWjoLY/Eq29KOAuOQaWV0ctbPgnKOZApq
u2LwhPjzU8ae4LCA1c0sMZnMLnskplnNWhmz9sLyvsfy12Lau86/PnT43/rGZ2uu
lW1SEgrk1zpYkOdbyAB01ZOPw0bQVL8uHQpm7df4/RtiyJuWCG+nPBXJoWXq3hN2
VO5PEd33Ec8Cdln9modOwHOIkIgH5zMsG8yLcvHoAeDWccPUkr9Z1mv7vZ4ENQFO
JaOyIOC8JKJJgA/2DQA+
=AGZ4
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.7-20160617' into staging
ppc patch queue for 2016-06-17
Here's the current accumulated set of spapr, ppc and related patches.
* The big thing in here is CPU hotplug for spapr
- This includes a number of acked generic changes adding new
infrastructure for hotplugging cpu cores
* A number of TCG bug fixes are also included
* This adds a new testcase to make it harder to accidentally break
Macintosh (and other openbios) platforms
# gpg: Signature made Fri 17 Jun 2016 07:35:29 BST
# gpg: using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.7-20160617:
spapr: implement query-hotpluggable-cpus callback
hmp: Add 'info hotpluggable-cpus' HMP command
QMP: Add query-hotpluggable-cpus
spapr: CPU hot unplug support
spapr: CPU hotplug support
spapr: convert boot CPUs into CPU core devices
spapr: Move spapr_cpu_init() to spapr_cpu_core.c
spapr: Abstract CPU core device and type specific core devices
qom: API to get instance_size of a type
spapr_drc: Prevent detach racing against attach for CPU DR
xics,xics_kvm: Handle CPU unplug correctly
cpu: Abstract CPU core type
qdev: hotplug: Introduce HotplugHandler.pre_plug() callback
target-ppc: Fix rlwimi, rlwinm, rlwnm
vfio: Fix broken EEH
target-ppc: Bug in BookE wait instruction
ppc / sparc: Add a tester for checking whether OpenBIOS runs successfully
hw/ppc/spapr: Silence deprecation message in qtest mode
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Beginning of reconnect support for vhost-user.
Misc cleanups and fixes.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJXY0Q3AAoJECgfDbjSjVRpkVcH/2gTHRE9yUoWe6ROvPV67BKx
8Iy9GzJ3BMO3RolVZEA5KXIevn5TG+pV274BZEuXMD3AL/molv279p0o/gvBYoqq
V0jNH2MO+MV6D9OzhUXcgWSejvybF5W07ojPDU/hlgtFXPZFbJDyt95MWaLiilOg
cCtTuRqgrrRaypcnnk/CIDbC+Ek2kAYdgQHQbfj9ihle3TWO8R0bSXnFqSaqCIkM
4slMlv8y82fODeiO83nkpfAP1NCnfnRC8r8Gv7hbEUTlZQntavx5DuYdiIx6nsJE
W0g+Gpe1o0+jRuMnucGIUZvqzZ0e/I0wZuV16Nsfx+Rbd5+4CzTxZda5Qb05v7I=
=BHbJ
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc, pci, virtio: new features, cleanups, fixes
Beginning of reconnect support for vhost-user.
Misc cleanups and fixes.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Fri 17 Jun 2016 01:28:39 BST
# gpg: using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream:
MAINTAINERS: add Marcel to PCI
msi_init: change return value to 0 on success
fix some coding style problems
pci core: assert ENOSPC when add capability
test: start vhost-user reconnect test
tests: append i386 tests
vhost-net: save & restore vring enable state
vhost-net: save & restore vhost-user acked features
vhost-net: do not crash if backend is not present
vhost-user: disconnect on start failure
qemu-char: add qemu_chr_disconnect to close a fd accepted by listen fd
tests/vhost-user-bridge: workaround stale vring base
tests/vhost-user-bridge: add client mode
vhost-user: add ability to know vhost-user backend disconnection
pci: fix pci_requester_id()
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Conflicts:
tests/Makefile.include
It will allow mgmt to query present and hotpluggable CPU objects,
it is required from a target platform that wishes to support command
to implement and set MachineClass.query_hotpluggable_cpus callback,
which will return a list of possible CPU objects with options that
would be needed for hotplugging possible CPU objects.
There are:
'type': 'str' - QOM CPU object type for usage with device_add
'vcpus-count': 'int' - number of logical VCPU threads per
CPU object (mgmt needs to know)
and a set of optional fields that are to used for hotplugging a CPU
objects and would allows mgmt tools to know what/where it could be
hotplugged;
[node],[socket],[core],[thread]
For present CPUs there is a 'qom-path' field which would allow mgmt to
inspect whatever object/abstraction the target platform considers
as CPU object.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Remove the CPU core device by removing the underlying CPU thread devices.
Hot removal of CPU for sPAPR guests is achieved by sending the hot unplug
notification to the guest. Release the vCPU object after CPU hot unplug so
that vCPU fd can be parked and reused.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Set up device tree entries for the hotplugged CPU core and use the
exising RTAS event logging infrastructure to send CPU hotplug notification
to the guest.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Introduce sPAPRMachineClass.dr_cpu_enabled to indicate support for
CPU core hotplug. Initialize boot time CPUs as core deivces and prevent
topologies that result in partially filled cores. Both of these are done
only if CPU core hotplug is supported.
Note: An unrelated change in the call to xics_system_init() is done
in this patch as it makes sense to use the local variable smt introduced
in this patch instead of kvmppc_smt_threads() call here.
TODO: We derive sPAPR core type by looking at -cpu <model>. However
we don't take care of "compat=" feature yet for boot time as well
as hotplug CPUs.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Start consolidating CPU init related routines in spapr_cpu_core.c. As
part of this, move spapr_cpu_init() and its dependencies from spapr.c
to spapr_cpu_core.c
No functionality change in this patch.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
[dwg: Rename TIMEBASE_FREQ to SPAPR_TIMEBASE_FREQ, since it's now in a
public(ish) header]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Add sPAPR specific abastract CPU core device that is based on generic
CPU core device. Use this as base type to create sPAPR CPU specific core
devices.
TODO:
- Add core types for other remaining CPU types
- Handle CPU model alias correctly
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Add an API object_type_get_size(const char *typename) that returns the
instance_size of the give typename.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
If a CPU is hot removed while hotplug of the same is still in progress,
the guest crashes. Prevent this by ensuring that detach is done only
after attach has completed.
The existing code already prevents such race for PCI hotplug. However
given that CPU is a logical DR unlike PCI and starts with ISOLATED
state, we need a logic that works for CPU too.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
[Don't set awaiting_attach for PCI devices]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
XICS is setup for each CPU during initialization. Provide a routine
to undo the same when CPU is unplugged. While here, move ss->cs management
into xics from xics_kvm since there is nothing KVM specific in it.
Also ensure xics reset doesn't set irq for CPUs that are already unplugged.
This allows reboot of a VM that has undergone CPU hotplug and unplug
to work correctly.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Add an abstract CPU core type that could be used by machines that want
to define and hotplug CPUs in core granularity.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
[Integer core property]
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
[dwg: changed property names to 'core-id' and 'nr-threads']
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
pre_plug callback is to be called before device.realize() is executed.
This would allow to check/set device's properties from HotplugHandler.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
A driver may change the vring enable state at run time but vhost-user
backend may not be present (a contrived example is when the backend is
disconnected and the device is reconfigured after driver rebinding)
Restore the vring state when the vhost-user backend is started, so it
can process the ring.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Tested-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Victor Kaplansky <victork@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The initial vhost-user connection sets the features to be negotiated
with the driver. Renegotiation isn't possible without device reset.
To handle reconnection of vhost-user backend, ensure the same set of
features are provided, and reuse already acked features.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Tested-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Victor Kaplansky <victork@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The patch introduces qemu_chr_disconnect(). The function is used for
closing a fd accepted by listen fd. Though we already have qemu_chr_delete(),
but it closes not only accepted fd but also listen fd. This new function
is used when we still want to keep listen fd.
Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Tested-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Victor Kaplansky <victork@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This fix SID verification failure when IOMMU IR is enabled with PCI
bridges. Existing pci_requester_id() is more like getting BDF info
only. Renaming it to pci_get_bdf(). Meanwhile, we provide the correct
implementation to get requester ID. VT-d spec 5.1.1 is a good reference
to go, though it talks only about interrupt delivery, the rule works
exactly the same for non-interrupt cases.
Currently, there are three use cases for pci_requester_id():
- PCIX status bits: here we need BDF only, not requester ID. Replacing
with pci_get_bdf().
- PCIe Error injection and MSI delivery: for both these cases, we are
looking for requester IDs. Here we should use the new impl.
To avoid a PCI walk every time we send MSI message, one requester_id
cache field is added to PCIDevice to cache the result when initialize
PCI device.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
While doing DMA read into ESP command buffer 's->cmdbuf', it could
write past the 's->cmdbuf' area, if it was transferring more than 16
bytes. Increase the command buffer size to 32, which is maximum when
's->do_cmd' is set, and add a check on 'len' to avoid OOB access.
Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Declare a constant and use that when determining if an export
name fits within the constraints we are willing to support.
Note that upstream NBD recently documented that clients MUST
support export names of 256 bytes (not including trailing NUL),
and SHOULD support names up to 4096 bytes. 4096 is a bit big
(we would lose benefits of stack-allocation of a name array),
and we already have other limits in place (for example, qcow2
snapshot names are clamped around 1024). So for now, just
stick to the required minimum, as that's easier to audit than
a full-scale support for larger names.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1463006384-7734-12-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These structs are never used to represent the bytes that go over the
network. The big-endian network data is built into a uint8_t array
in nbd_{receive,send}_{request,reply}. Remove the unused magic field,
reorder the struct to avoid holes, and remove the packed attribute.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
qemu/osdep.h checks whether MAP_ANONYMOUS is defined, but this check
is bogus without a previous inclusion of sys/mman.h. Include it in
sysemu/os-posix.h and remove it from everywhere else.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently, we are trying to move the backing BDS from the source to the
target in bdrv_replace_in_backing_chain() which is called from
mirror_exit(). However, mirror_complete() already tries to open the
target's backing chain with a call to bdrv_open_backing_file().
First, we should only set the target's backing BDS once. Second, the
mirroring block job has a better idea of what to set it to than the
generic code in bdrv_replace_in_backing_chain() (in fact, the latter's
conditions on when to move the backing BDS from source to target are not
really correct).
Therefore, remove that code from bdrv_replace_in_backing_chain() and
leave it to mirror_complete().
Depending on what kind of mirroring is performed, we furthermore want to
use different strategies to open the target's backing chain:
- If blockdev-mirror is used, we can assume the user made sure that the
target already has the correct backing chain. In particular, we should
not try to open a backing file if the target does not have any yet.
- If drive-mirror with mode=absolute-paths is used, we can and should
reuse the already existing chain of nodes that the source BDS is in.
In case of sync=full, no backing BDS is required; with sync=top, we
just link the source's backing BDS to the target, and with sync=none,
we use the source BDS as the target's backing BDS.
We should not try to open these backing files anew because this would
lead to two BDSs existing per physical file in the backing chain, and
we would like to avoid such concurrent access.
- If drive-mirror with mode=existing is used, we have to use the
information provided in the physical image file which means opening
the target's backing chain completely anew, just as it has been done
already.
If the target's backing chain shares images with the source, this may
lead to multiple BDSs per physical image file. But since we cannot
reliably ascertain this case, there is nothing we can do about it.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20160610185750.30956-3-mreitz@redhat.com
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
It is always true for open images now.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
This allows drivers to share code between normal I/O and vmstate
accesses.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
This brings it in line with .bdrv_save_vmstate().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
We already have a byte-based bdrv_pwritev(), but the read counterpart
was still missing. This commit adds it.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
In a first step to convert the common I/O path to work on bytes rather
than sectors, this converts the copy-on-read logic that is used by
bdrv_aligned_preadv().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Add a new BDRV_REQ_MASK constant, and use it to make sure that
caller flags are always valid.
Tested with 'make check' and with qemu-iotests on both '-raw'
and '-qcow2'; the only failure turned up was fixed in the
previous commit.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Apply the following renames for starting incoming migration:
process_incoming_migration -> migration_fd_process_incoming
migration_set_incoming_channel -> migration_channel_process_incoming
migration_tls_set_incoming_channel -> migration_tls_channel_process_incoming
and for starting outgoing migration:
migration_set_outgoing_channel -> migration_channel_connect
migration_tls_set_outgoing_channel -> migration_tls_channel_connect
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1464776234-9910-3-git-send-email-berrange@redhat.com
Message-Id: <1464776234-9910-3-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
On the source, add a count of page requests received from the
destination.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Message-id: 1465816605-29488-4-git-send-email-dgilbert@redhat.com
Message-Id: <1465816605-29488-4-git-send-email-dgilbert@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
I looked at a dozen Intel CPU that have this CPUID and all of them
always had Core offset as 1 (a wasted bit when hyperthreading is
disabled) and Package offset at least 4 (wasted bits at <= 4 cores).
QEMU uses more compact IDs and it doesn't make much sense to change it
now. I keep the SMT and Core sub-leaves even if there is just one
thread/core; it makes the code simpler and there should be no harm.
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
This adds the DP and the DPDMA to the Zynq MP platform.
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Tested-By: Hyun Kwon <hyun.kwon@xilinx.com>
Message-id: 1465833014-21982-10-git-send-email-fred.konrad@greensocs.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This is the implementation of the DisplayPort.
It has an aux-bus to access dpcd and edid.
Graphic plane is connected to the channel 3.
Video plane is connected to the channel 0.
Audio stream are connected to the channels 4 and 5.
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Tested-By: Hyun Kwon <hyun.kwon@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 1465833014-21982-9-git-send-email-fred.konrad@greensocs.com
[PMM: fixed format strings]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This is the implementation of the DPDMA.
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Tested-By: Hyun Kwon <hyun.kwon@xilinx.com>
Message-id: 1465833014-21982-8-git-send-email-fred.konrad@greensocs.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Implement an I2C slave which implements DDC and returns the
EDID data for an attached monitor.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Tested-by: Hyun Kwon <hyun.kwon@xilinx.com>
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Message-id: 1465833014-21982-7-git-send-email-fred.konrad@greensocs.com
- Rebased on the current master.
- Modified for QOM.
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Tested-By: Hyun Kwon <hyun.kwon@xilinx.com>
[PMM: actually wire up the vmstate to dc->vmsd]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This introduces dpcd module.
It wires on a aux-bus and can be accessed by the driver to get lane-speed, etc.
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Tested-By: Hyun Kwon <hyun.kwon@xilinx.com>
Message-id: 1465833014-21982-6-git-send-email-fred.konrad@greensocs.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This introduces a new bus: aux-bus.
It contains an address space for aux slaves devices and a bridge to an I2C bus
for I2C through AUX transactions.
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Tested-By: Hyun Kwon <hyun.kwon@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 1465833014-21982-5-git-send-email-fred.konrad@greensocs.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Most of the control flow logic between send and recv (error checking
etc) is the same. Factor this out into a common send_recv() API.
This is then usable by clients, where the control logic for send
and receive differs only by a boolean. E.g.
if (send)
i2c_send(...):
else
i2c_recv(...);
becomes:
i2c_send_recv(... , send);
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Message-id: 1465833014-21982-4-git-send-email-fred.konrad@greensocs.com
Changes from FK:
* Rebased on master.
* Rebased on my i2c broadcast patch.
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add a virtual PMU device for virt machine while use PPI 7 for PMU
overflow interrupt number.
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-id: 1465267577-1808-3-git-send-email-zhaoshenglong@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.
This commit was created with scripts/clean-includes.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Let's introduce a CssDevId to handle device ids of the xx.x.xxxx
type used for channel devices. This has some benefits:
- We can use them in virtio-ccw and split the validity checks for
a channel device id in general from the constraint checking
within the virtio-ccw scope.
- We can reuse the device id type for future non-virtio channel
devices.
While we're at it, improve the validity checks and disallow e.g.
trailing characters.
Suggested-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Acked-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
According to the platform specification, under certain conditions,
pending IO interruptions have to be cleared. Let's add an interface
for that.
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Memory hotplug can fail for some combinations of RAM and maxmem when
DDW is enabled in the presence of devices like nec-usb-xhci. DDW depends
on maximum addressable memory returned by guest and this value is currently
being calculated wrongly by the guest kernel routine memory_hotplug_max().
While there is an attempt to fix the guest kernel, this patch works
around the problem within QEMU itself.
memory_hotplug_max() routine in the guest kernel arrives at max
addressable memory by multiplying lmb-size with the lmb-count obtained
from ibm,dynamic-memory property. There are two assumptions here:
- All LMBs are part of ibm,dynamic memory: This is not true for PowerKVM
where only hot-pluggable LMBs are present in this property.
- The memory area comprising of RAM and hotplug region is contiguous: This
needn't be true always for PowerKVM as there can be gap between
boot time RAM and hotplug region.
To work around this guest kernel bug, ensure that ibm,dynamic-memory
has information about all the LMBs (RMA, boot-time LMBs, future
hotpluggable LMBs, and dummy LMBs to cover the gap between RAM and
hotpluggable region).
RMA is represented separately by memory@0 node. Hence mark RMA LMBs
and also the LMBs for the gap b/n RAM and hotpluggable region as
reserved and as having no valid DRC so that these LMBs are not considered
by the guest.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This ensures that the underlying memory is marked dirty once the transfer
is complete and resolves cache coherency problems under MacOS 9.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We need the PPC_FEATURE2_HAS_HTM bit in a subsequent patch, so
add the PowerPC AT_HWCAP2 definitions.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
OpenSSL's libcrypto always defines AES symbols with the same names as
qemu's local aes code. This is problematic when enabling at least curl
as that frequently also uses libcrypto. It might not be noticed when
running, but if you try to statically link, everything falls down.
An example snippet:
LINK qemu-nbd
.../libcrypto.a(aes-x86_64.o): In function 'AES_encrypt':
(.text+0x460): multiple definition of 'AES_encrypt'
crypto/aes.o:aes.c:(.text+0x670): first defined here
.../libcrypto.a(aes-x86_64.o): In function 'AES_decrypt':
(.text+0x9f0): multiple definition of 'AES_decrypt'
crypto/aes.o:aes.c:(.text+0xb30): first defined here
.../libcrypto.a(aes-x86_64.o): In function 'AES_cbc_encrypt':
(.text+0xf90): multiple definition of 'AES_cbc_encrypt'
crypto/aes.o:aes.c:(.text+0xff0): first defined here
collect2: error: ld returned 1 exit status
.../qemu-2.6.0/rules.mak:105: recipe for target 'qemu-nbd' failed
make: *** [qemu-nbd] Error 1
The aes.h header has redefines already for FreeBSD, but go ahead and
enable that for everyone since there's no real good reason to not use
a namespace all the time.
Signed-off-by: Mike Frysinger <vapier@chromium.org>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This wrapper for machine_usb(current_machine) is not necessary,
replace all usages of usb_enabled() with machine_usb().
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: qemu-arm@nongnu.org
Cc: qemu-ppc@nongnu.org
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 1465419025-21519-3-git-send-email-ehabkost@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Having a fixed-size hash table for keeping track of all translation blocks
is suboptimal: some workloads are just too big or too small to get maximum
performance from the hash table. The MRU promotion policy helps improve
performance when the hash table is a little undersized, but it cannot
make up for severely undersized hash tables.
Furthermore, frequent MRU promotions result in writes that are a scalability
bottleneck. For scalability, lookups should only perform reads, not writes.
This is not a big deal for now, but it will become one once MTTCG matures.
The appended fixes these issues by using qht as the implementation of
the TB hash table. This solution is superior to other alternatives considered,
namely:
- master: implementation in QEMU before this patchset
- xxhash: before this patch, i.e. fixed buckets + xxhash hashing + MRU.
- xxhash-rcu: fixed buckets + xxhash + RCU list + MRU.
MRU is implemented here by adding an intermediate struct
that contains the u32 hash and a pointer to the TB; this
allows us, on an MRU promotion, to copy said struct (that is not
at the head), and put this new copy at the head. After a grace
period, the original non-head struct can be eliminated, and
after another grace period, freed.
- qht-fixed-nomru: fixed buckets + xxhash + qht without auto-resize +
no MRU for lookups; MRU for inserts.
The appended solution is the following:
- qht-dyn-nomru: dynamic number of buckets + xxhash + qht w/ auto-resize +
no MRU for lookups; MRU for inserts.
The plots below compare the considered solutions. The Y axis shows the
boot time (in seconds) of a debian jessie image with arm-softmmu; the X axis
sweeps the number of buckets (or initial number of buckets for qht-autoresize).
The plots in PNG format (and with errorbars) can be seen here:
http://imgur.com/a/Awgnq
Each test runs 5 times, and the entire QEMU process is pinned to a
single core for repeatability of results.
Host: Intel Xeon E5-2690
28 ++------------+-------------+-------------+-------------+------------++
A***** + + + master **A*** +
27 ++ * xxhash ##B###++
| A******A****** xxhash-rcu $$C$$$ |
26 C$$ A******A****** qht-fixed-nomru*%%D%%%++
D%%$$ A******A******A*qht-dyn-mru A*E****A
25 ++ %%$$ qht-dyn-nomru &&F&&&++
B#####% |
24 ++ #C$$$$$ ++
| B### $ |
| ## C$$$$$$ |
23 ++ # C$$$$$$ ++
| B###### C$$$$$$ %%%D
22 ++ %B###### C$$$$$$C$$$$$$C$$$$$$C$$$$$$C$$$$$$C
| D%%%%%%B###### @E@@@@@@ %%%D%%%@@@E@@@@@@E
21 E@@@@@@E@@@@@@F&&&@@@E@@@&&&D%%%%%%B######B######B######B######B######B
+ E@@@ F&&& + E@ + F&&& + +
20 ++------------+-------------+-------------+-------------+------------++
14 16 18 20 22 24
log2 number of buckets
Host: Intel i7-4790K
14.5 ++------------+------------+-------------+------------+------------++
A** + + + master **A*** +
14 ++ ** xxhash ##B###++
13.5 ++ ** xxhash-rcu $$C$$$++
| qht-fixed-nomru %%D%%% |
13 ++ A****** qht-dyn-mru @@E@@@++
| A*****A******A****** qht-dyn-nomru &&F&&& |
12.5 C$$ A******A******A*****A****** ***A
12 ++ $$ A*** ++
D%%% $$ |
11.5 ++ %% ++
B### %C$$$$$$ |
11 ++ ## D%%%%% C$$$$$ ++
| # % C$$$$$$ |
10.5 F&&&&&&B######D%%%%% C$$$$$$C$$$$$$C$$$$$$C$$$$$C$$$$$$ $$$C
10 E@@@@@@E@@@@@@B#####B######B######E@@@@@@E@@@%%%D%%%%%D%%%###B######B
+ F&& D%%%%%%B######B######B#####B###@@@D%%% +
9.5 ++------------+------------+-------------+------------+------------++
14 16 18 20 22 24
log2 number of buckets
Note that the original point before this patch series is X=15 for "master";
the little sensitivity to the increased number of buckets is due to the
poor hashing function in master.
xxhash-rcu has significant overhead due to the constant churn of allocating
and deallocating intermediate structs for implementing MRU. An alternative
would be do consider failed lookups as "maybe not there", and then
acquire the external lock (tb_lock in this case) to really confirm that
there was indeed a failed lookup. This, however, would not be enough
to implement dynamic resizing--this is more complex: see
"Resizable, Scalable, Concurrent Hash Tables via Relativistic
Programming" by Triplett, McKenney and Walpole. This solution was
discarded due to the very coarse RCU read critical sections that we have
in MTTCG; resizing requires waiting for readers after every pointer update,
and resizes require many pointer updates, so this would quickly become
prohibitive.
qht-fixed-nomru shows that MRU promotion is advisable for undersized
hash tables.
However, qht-dyn-mru shows that MRU promotion is not important if the
hash table is properly sized: there is virtually no difference in
performance between qht-dyn-nomru and qht-dyn-mru.
Before this patch, we're at X=15 on "xxhash"; after this patch, we're at
X=15 @ qht-dyn-nomru. This patch thus matches the best performance that we
can achieve with optimum sizing of the hash table, while keeping the hash
table scalable for readers.
The improvement we get before and after this patch for booting debian jessie
with arm-softmmu is:
- Intel Xeon E5-2690: 10.5% less time
- Intel i7-4790K: 5.2% less time
We could get this same improvement _for this particular workload_ by
statically increasing the size of the hash table. But this would hurt
workloads that do not need a large hash table. The dynamic (upward)
resizing allows us to start small and enlarge the hash table as needed.
A quick note on downsizing: the table is resized back to 2**15 buckets
on every tb_flush; this makes sense because it is not guaranteed that the
table will reach the same number of TBs later on (e.g. most bootup code is
thrown away after boot); it makes sense to grow the hash table as
more code blocks are translated. This also avoids the complication of
having to build downsizing hysteresis logic into qht.
Reviewed-by: Sergey Fedorov <serge.fedorov@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-15-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
This is a fast, scalable chained hash table with optional auto-resizing, allowing
reads that are concurrent with reads, and reads/writes that are concurrent
with writes to separate buckets.
A hash table with these features will be necessary for the scalability
of the ongoing MTTCG work; before those changes arrive we can already
benefit from the single-threaded speedup that qht also provides.
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-11-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Sometimes it is useful to have a quick histogram to represent a certain
distribution -- for example, when investigating a performance regression
in a hash table due to inadequate hashing.
The appended allows us to easily represent a distribution using Unicode
characters. Further, the data structure keeping track of the distribution
is so simple that obtaining its values for off-line processing is trivial.
Example, taking the last 10 commits to QEMU:
Characters in commit title Count
-----------------------------------
39 1
48 1
53 1
54 2
57 1
61 1
67 1
78 1
80 1
qdist_init(&dist);
qdist_inc(&dist, 39);
[...]
qdist_inc(&dist, 80);
char *str = qdist_pr(&dist, 9, QDIST_PR_LABELS);
// -> [39.0,43.6)▂▂ █▂ ▂ ▄[75.4,80.0]
g_free(str);
char *str = qdist_pr(&dist, 4, QDIST_PR_LABELS);
// -> [39.0,49.2)▁█▁▁[69.8,80.0]
g_free(str);
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-9-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
For some workloads such as arm bootup, tb_phys_hash is performance-critical.
The is due to the high frequency of accesses to the hash table, originated
by (frequent) TLB flushes that wipe out the cpu-private tb_jmp_cache's.
More info:
https://lists.nongnu.org/archive/html/qemu-devel/2016-03/msg05098.html
To dig further into this I modified an arm image booting debian jessie to
immediately shut down after boot. Analysis revealed that quite a bit of time
is unnecessarily spent in tb_phys_hash: the cause is poor hashing that
results in very uneven loading of chains in the hash table's buckets;
the longest observed chain had ~550 elements.
The appended addresses this with two changes:
1) Use xxhash as the hash table's hash function. xxhash is a fast,
high-quality hashing function.
2) Feed the hashing function with not just tb_phys, but also pc and flags.
This improves performance over using just tb_phys for hashing, since that
resulted in some hash buckets having many TB's, while others getting very few;
with these changes, the longest observed chain on a single hash bucket is
brought down from ~550 to ~40.
Tests show that the other element checked for in tb_find_physical,
cs_base, is always a match when tb_phys+pc+flags are a match,
so hashing cs_base is wasteful. It could be that this is an ARM-only
thing, though. UPDATE:
On Tue, Apr 05, 2016 at 08:41:43 -0700, Richard Henderson wrote:
> The cs_base field is only used by i386 (in 16-bit modes), and sparc (for a TB
> consisting of only a delay slot).
> It may well still turn out to be reasonable to ignore cs_base for hashing.
BTW, after this change the hash table should not be called "tb_hash_phys"
anymore; this is addressed later in this series.
This change gives consistent bootup time improvements. I tested two
host machines:
- Intel Xeon E5-2690: 11.6% less time
- Intel i7-4790K: 19.2% less time
Increasing the number of hash buckets yields further improvements. However,
using a larger, fixed number of buckets can degrade performance for other
workloads that do not translate as many blocks (600K+ for debian-jessie arm
bootup). This is dealt with later in this series.
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-8-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
This will be used by upcoming changes for hashing the tb hash.
Add this into a separate file to include the copyright notice from
xxhash.
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-7-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Signed-off-by: Guillaume Delbergue <guillaume.delbergue@greensocs.com>
[Rewritten. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[Emilio's additions: use TAS instead of atomic_xchg; emit acquire/release
barriers; return bool from trylock; call cpu_relax() while spinning;
optimize for uncontended locks by acquiring the lock with TAS instead
of TATAS; add qemu_spin_locked().]
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-6-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Taken from the linux kernel.
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-5-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
It is a more appropriate name, now that the mutex embedded
in the seqlock is gone.
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-4-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
This option is unused; besides, it bloats the struct when not needed.
Let's just let writers define their own locks elsewhere.
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-3-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>