The current vhost_user_set_mem_table_postcopy() implementation
populates each region of the VHOST_USER_SET_MEM_TABLE message without
first checking if there are more than VHOST_MEMORY_MAX_NREGIONS already
populated. This can cause memory corruption if too many regions are
added to the message during the postcopy step.
This change moves an existing assert up such that attempting to
construct a VHOST_USER_SET_MEM_TABLE message with too many memory
regions will gracefully bring down qemu instead of corrupting memory.
Signed-off-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Signed-off-by: Peter Turschmid <peter.turschm@nutanix.com>
Message-Id: <1579143426-18305-2-git-send-email-raphael.norwitz@nutanix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When multiqueue is enabled, a vhost_dev is created for each queue
pair. However, only one slave channel is needed.
Fixes: 4bbeeba023 (vhost-user: add slave-req-fd support)
Cc: marcandre.lureau@redhat.com
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Message-Id: <20200121214553.28459-1-amorenoz@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Adds the "virtio,pci-iommu" node in the host bridge node and
the RID mapping, excluding the IOMMU RID.
This is done in the virtio-iommu-pci hotplug handler which
gets called only if no firmware is loaded or if -no-acpi is
passed on the command line. As non DT integration is
not yet supported by the kernel we must make sure we
are in DT mode. This limitation will be removed as soon
as the topology description feature gets supported.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20200214132745.23392-10-eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
This patch adds virtio-iommu-pci, which is the pci proxy for
the virtio-iommu device.
Currently non DT integration is not yet supported by the kernel.
So the machine must implement a hotplug handler for the
virtio-iommu-pci device that creates the device tree iommu-map
bindings as documented in kernel documentation:
Documentation/devicetree/bindings/virtio/iommu.txt
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20200214132745.23392-9-eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add Migration support. We rely on recently added gtree and qlist
migration. We only migrate the domain gtree. The endpoint gtree
is re-constructed in a post-load operation.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20200214132745.23392-8-eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The event queue allows to report asynchronous errors.
The translate function now injects faults when relevant.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20200214132745.23392-7-eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch implements the translate callback
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20200214132745.23392-6-eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch implements virtio_iommu_map/unmap.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20200214132745.23392-5-eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch implements the endpoint attach/detach to/from
a domain.
Domain and endpoint internal datatypes are introduced.
Both are stored in RB trees. The domain owns a list of
endpoints attached to it. Also helpers to get/put
end points and domains are introduced.
As for the IOMMU memory regions, a callback is called on
PCI bus enumeration that initializes for a given device
on the bus hierarchy an IOMMU memory region. The PCI bus
hierarchy is stored locally in IOMMUPciBus and IOMMUDevice
objects.
At the time of the enumeration, the bus number may not be
computed yet.
So operations that will need to retrieve the IOMMUdevice
and its IOMMU memory region from the bus number and devfn,
once the bus number is garanteed to be frozen, use an array
of IOMMUPciBus, lazily populated.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20200214132745.23392-4-eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch adds the command payload decoding and
introduces the functions that will do the actual
command handling. Those functions are not yet implemented.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20200214132745.23392-3-eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patchs adds the skeleton for the virtio-iommu device.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20200214132745.23392-2-eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The virtqueue code sets up MemoryRegionCaches to access the virtqueue
guest RAM data structures. The code currently assumes that
VRingMemoryRegionCaches is initialized before device emulation code
accesses the virtqueue. An assertion will fail in
vring_get_region_caches() when this is not true. Device fuzzing found a
case where this assumption is false (see below).
Virtqueue guest RAM addresses can also be changed from a vCPU thread
while an IOThread is accessing the virtqueue. This breaks the same
assumption but this time the caches could become invalid partway through
the virtqueue code. The code fetches the caches RCU pointer multiple
times so we will need to validate the pointer every time it is fetched.
Add checks each time we call vring_get_region_caches() and treat invalid
caches as a nop: memory stores are ignored and memory reads return 0.
The fuzz test failure is as follows:
$ qemu -M pc -device virtio-blk-pci,id=drv0,drive=drive0,addr=4.0 \
-drive if=none,id=drive0,file=null-co://,format=raw,auto-read-only=off \
-drive if=none,id=drive1,file=null-co://,file.read-zeroes=on,format=raw \
-display none \
-qtest stdio
endianness
outl 0xcf8 0x80002020
outl 0xcfc 0xe0000000
outl 0xcf8 0x80002004
outw 0xcfc 0x7
write 0xe0000000 0x24 0x00ffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffab5cffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffab0000000001
inb 0x4
writew 0xe000001c 0x1
write 0xe0000014 0x1 0x0d
The following error message is produced:
qemu-system-x86_64: /home/stefanha/qemu/hw/virtio/virtio.c:286: vring_get_region_caches: Assertion `caches != NULL' failed.
The backtrace looks like this:
#0 0x00007ffff5520625 in raise () at /lib64/libc.so.6
#1 0x00007ffff55098d9 in abort () at /lib64/libc.so.6
#2 0x00007ffff55097a9 in _nl_load_domain.cold () at /lib64/libc.so.6
#3 0x00007ffff5518a66 in annobin_assert.c_end () at /lib64/libc.so.6
#4 0x00005555559073da in vring_get_region_caches (vq=<optimized out>) at qemu/hw/virtio/virtio.c:286
#5 vring_get_region_caches (vq=<optimized out>) at qemu/hw/virtio/virtio.c:283
#6 0x000055555590818d in vring_used_flags_set_bit (mask=1, vq=0x5555575ceea0) at qemu/hw/virtio/virtio.c:398
#7 virtio_queue_split_set_notification (enable=0, vq=0x5555575ceea0) at qemu/hw/virtio/virtio.c:398
#8 virtio_queue_set_notification (vq=vq@entry=0x5555575ceea0, enable=enable@entry=0) at qemu/hw/virtio/virtio.c:451
#9 0x0000555555908512 in virtio_queue_set_notification (vq=vq@entry=0x5555575ceea0, enable=enable@entry=0) at qemu/hw/virtio/virtio.c:444
#10 0x00005555558c697a in virtio_blk_handle_vq (s=0x5555575c57e0, vq=0x5555575ceea0) at qemu/hw/block/virtio-blk.c:775
#11 0x0000555555907836 in virtio_queue_notify_aio_vq (vq=0x5555575ceea0) at qemu/hw/virtio/virtio.c:2244
#12 0x0000555555cb5dd7 in aio_dispatch_handlers (ctx=ctx@entry=0x55555671a420) at util/aio-posix.c:429
#13 0x0000555555cb67a8 in aio_dispatch (ctx=0x55555671a420) at util/aio-posix.c:460
#14 0x0000555555cb307e in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:260
#15 0x00007ffff7bbc510 in g_main_context_dispatch () at /lib64/libglib-2.0.so.0
#16 0x0000555555cb5848 in glib_pollfds_poll () at util/main-loop.c:219
#17 os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:242
#18 main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:518
#19 0x00005555559b20c9 in main_loop () at vl.c:1683
#20 0x0000555555838115 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4441
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Cc: Michael Tsirkin <mst@redhat.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20200207104619.164892-1-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
use the new virtio_delete_queue function to cleanup.
Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com>
Message-Id: <20200224041336.30790-3-pannengyuan@huawei.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
virtio queues forgot to delete in unrealize, and aslo error path in
realize, this patch fix these memleaks, the leak stack is as follow:
Direct leak of 114688 byte(s) in 16 object(s) allocated from:
#0 0x7f24024fdbf0 in calloc (/lib64/libasan.so.3+0xcabf0)
#1 0x7f2401642015 in g_malloc0 (/lib64/libglib-2.0.so.0+0x50015)
#2 0x55ad175a6447 in virtio_add_queue /mnt/sdb/qemu/hw/virtio/virtio.c:2327
#3 0x55ad17570cf9 in vhost_user_blk_device_realize /mnt/sdb/qemu/hw/block/vhost-user-blk.c:419
#4 0x55ad175a3707 in virtio_device_realize /mnt/sdb/qemu/hw/virtio/virtio.c:3509
#5 0x55ad176ad0d1 in device_set_realized /mnt/sdb/qemu/hw/core/qdev.c:876
#6 0x55ad1781ff9d in property_set_bool /mnt/sdb/qemu/qom/object.c:2080
#7 0x55ad178245ae in object_property_set_qobject /mnt/sdb/qemu/qom/qom-qobject.c:26
#8 0x55ad17821eb4 in object_property_set_bool /mnt/sdb/qemu/qom/object.c:1338
#9 0x55ad177aeed7 in virtio_pci_realize /mnt/sdb/qemu/hw/virtio/virtio-pci.c:1801
Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20200224041336.30790-2-pannengyuan@huawei.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Similar to other virtio-deivces, ctrl_vq forgot to delete in virtio_crypto_device_unrealize, this patch fix it.
This device has aleardy maintained vq pointers. Thus, we use the new virtio_delete_queue function directly to do the cleanup.
The leak stack:
Direct leak of 10752 byte(s) in 3 object(s) allocated from:
#0 0x7f4c024b1970 in __interceptor_calloc (/lib64/libasan.so.5+0xef970)
#1 0x7f4c018be49d in g_malloc0 (/lib64/libglib-2.0.so.0+0x5249d)
#2 0x55a2f8017279 in virtio_add_queue /mnt/sdb/qemu-new/qemu_test/qemu/hw/virtio/virtio.c:2333
#3 0x55a2f8057035 in virtio_crypto_device_realize /mnt/sdb/qemu-new/qemu_test/qemu/hw/virtio/virtio-crypto.c:814
#4 0x55a2f8005d80 in virtio_device_realize /mnt/sdb/qemu-new/qemu_test/qemu/hw/virtio/virtio.c:3531
#5 0x55a2f8497d1b in device_set_realized /mnt/sdb/qemu-new/qemu_test/qemu/hw/core/qdev.c:891
#6 0x55a2f8b48595 in property_set_bool /mnt/sdb/qemu-new/qemu_test/qemu/qom/object.c:2238
#7 0x55a2f8b54fad in object_property_set_qobject /mnt/sdb/qemu-new/qemu_test/qemu/qom/qom-qobject.c:26
#8 0x55a2f8b4de2c in object_property_set_bool /mnt/sdb/qemu-new/qemu_test/qemu/qom/object.c:1390
#9 0x55a2f80609c9 in virtio_crypto_pci_realize /mnt/sdb/qemu-new/qemu_test/qemu/hw/virtio/virtio-crypto-pci.c:58
Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com>
Cc: "Gonglei (Arei)" <arei.gonglei@huawei.com>
Message-Id: <20200225075554.10835-5-pannengyuan@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Similar to other virtio-devices, rq_vq forgot to delete in
virtio_pmem_unrealize, this patch fix it. This device has already
maintained a vq pointer, thus we use the new virtio_delete_queue
function directly to do the cleanup.
Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com>
Message-Id: <20200225075554.10835-4-pannengyuan@huawei.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
use the new virtio_delete_queue function to cleanup.
Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20200225075554.10835-3-pannengyuan@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Similar to other virtio device(https://patchwork.kernel.org/patch/11399237/), virtio queues forgot to delete in unrealize, and aslo error path in realize, this patch fix these memleaks, the leak stack is as follow:
Direct leak of 57344 byte(s) in 1 object(s) allocated from:
#0 0x7f15784fb970 in __interceptor_calloc (/lib64/libasan.so.5+0xef970)
#1 0x7f157790849d in g_malloc0 (/lib64/libglib-2.0.so.0+0x5249d)
#2 0x55587a1bf859 in virtio_add_queue /mnt/sdb/qemu-new/qemu_test/qemu/hw/virtio/virtio.c:2333
#3 0x55587a2071d5 in vuf_device_realize /mnt/sdb/qemu-new/qemu_test/qemu/hw/virtio/vhost-user-fs.c:212
#4 0x55587a1ae360 in virtio_device_realize /mnt/sdb/qemu-new/qemu_test/qemu/hw/virtio/virtio.c:3531
#5 0x55587a63fb7b in device_set_realized /mnt/sdb/qemu-new/qemu_test/qemu/hw/core/qdev.c:891
#6 0x55587acf03f5 in property_set_bool /mnt/sdb/qemu-new/qemu_test/qemu/qom/object.c:2238
#7 0x55587acfce0d in object_property_set_qobject /mnt/sdb/qemu-new/qemu_test/qemu/qom/qom-qobject.c:26
#8 0x55587acf5c8c in object_property_set_bool /mnt/sdb/qemu-new/qemu_test/qemu/qom/object.c:1390
#9 0x55587a8e22a2 in pci_qdev_realize /mnt/sdb/qemu-new/qemu_test/qemu/hw/pci/pci.c:2095
#10 0x55587a63fb7b in device_set_realized /mnt/sdb/qemu-new/qemu_test/qemu/hw/core/qdev.c:891
#11 0x55587acf03f5 in property_set_bool /mnt/sdb/qemu-new/qemu_test/qemu/qom/object.c:2238
#12 0x55587acfce0d in object_property_set_qobject /mnt/sdb/qemu-new/qemu_test/qemu/qom/qom-qobject.c:26
#13 0x55587acf5c8c in object_property_set_bool /mnt/sdb/qemu-new/qemu_test/qemu/qom/object.c:1390
#14 0x55587a496d65 in qdev_device_add /mnt/sdb/qemu-new/qemu_test/qemu/qdev-monitor.c:679
Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20200225075554.10835-2-pannengyuan@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This series removes ad hoc RAM allocation API (memory_region_allocate_system_memory)
and consolidates it around hostmem backend. It allows to
* resolve conflicts between global -mem-prealloc and hostmem's "policy" option,
fixing premature allocation before binding policy is applied
* simplify complicated memory allocation routines which had to deal with 2 ways
to allocate RAM.
* reuse hostmem backends of a choice for main RAM without adding extra CLI
options to duplicate hostmem features. A recent case was -mem-shared, to
enable vhost-user on targets that don't support hostmem backends [1] (ex: s390)
* move RAM allocation from individual boards into generic machine code and
provide them with prepared MemoryRegion.
* clean up deprecated NUMA features which were tied to the old API (see patches)
- "numa: remove deprecated -mem-path fallback to anonymous RAM"
- (POSTPONED, waiting on libvirt side) "forbid '-numa node,mem' for 5.0 and newer machine types"
- (POSTPONED) "numa: remove deprecated implicit RAM distribution between nodes"
Introduce a new machine.memory-backend property and wrapper code that aliases
global -mem-path and -mem-alloc into automatically created hostmem backend
properties (provided memory-backend was not set explicitly given by user).
A bulk of trivial patches then follow to incrementally convert individual
boards to using machine.memory-backend provided MemoryRegion.
Board conversion typically involves:
* providing MachineClass::default_ram_size and MachineClass::default_ram_id
so generic code could create default backend if user didn't explicitly provide
memory-backend or -m options
* dropping memory_region_allocate_system_memory() call
* using convenience MachineState::ram MemoryRegion, which points to MemoryRegion
allocated by ram-memdev
On top of that for some boards:
* missing ram_size checks are added (typically it were boards with fixed ram size)
* ram_size fixups are replaced by checks and hard errors, forcing user to
provide correct "-m" values instead of ignoring it and continuing running.
After all boards are converted, the old API is removed and memory allocation
routines are cleaned up.
The goal is to reduce the amount of requests issued by a guest on
1M reads/writes. This rises the performance up to 4% on that kind of
disk access pattern.
The maximum chunk size to be used for the guest disk accessing is
limited with seg_max parameter, which represents the max amount of
pices in the scatter-geather list in one guest disk request.
Since seg_max is virqueue_size dependent, increasing the virtqueue
size increases seg_max, which, in turn, increases the maximum size
of data to be read/write from a guest disk.
More details in the original problem statment:
https://lists.gnu.org/archive/html/qemu-devel/2017-12/msg03721.html
Suggested-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
Message-id: 20200214074648.958-1-dplotnikov@virtuozzo.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Booting the r2d machine from flash fails because flash is not discovered.
Looking at the flattened memory tree, we see the following.
FlatView #1
AS "memory", root: system
AS "cpu-memory-0", root: system
AS "sh_pci_host", root: bus master container
Root memory region: system
0000000000000000-000000000000ffff (prio 0, i/o): io
0000000000010000-0000000000ffffff (prio 0, i/o): r2d.flash @0000000000010000
The overlapping memory region is sh_pci.isa, ie the ISA I/O region bridge.
This region is initially assigned to address 0xfe240000, but overwritten
with a write into the PCIIOBR register. This write is expected to adjust
the PCI memory window, but not to change the region's base adddress.
Peter Maydell provided the following detailed explanation.
"Section 22.3.7 and in particular figure 22.3 (of "SSH7751R user's manual:
hardware") are clear about how this is supposed to work: there is a window
at 0xfe240000 in the system register space for PCI I/O space. When the CPU
makes an access into that area, the PCI controller calculates the PCI
address to use by combining bits 0..17 of the system address with the
bits 31..18 value that the guest has put into the PCIIOBR. That is, writing
to the PCIIOBR changes which section of the IO address space is visible in
the 0xfe240000 window. Instead what QEMU's implementation does is move the
window to whatever value the guest writes to the PCIIOBR register -- so if
the guest writes 0 we put the window at 0 in system address space."
Fix the problem by calling memory_region_set_alias_offset() instead of
removing and re-adding the PCI ISA subregion on writes into PCIIOBR.
At the same time, in sh_pci_device_realize(), don't set iobr since
it is overwritten later anyway. Instead, pass the base address to
memory_region_add_subregion() directly.
Many thanks to Peter Maydell for the detailed problem analysis, and for
providing suggestions on how to fix the problem.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Message-id: 20200218201050.15273-1-linux@roeck-us.net
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Correct the number of dummy cycles required by the FAST_READ_4 command (to
be eight, one dummy byte).
Fixes: ef06ca3946 ("xilinx_spips: Add support for RX discard and RX drain")
Suggested-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 20200218113350.6090-1-frasse.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Instantiate EHCI and OHCI controllers on Allwinner A10. OHCI ports are
modeled as companions of the respective EHCI ports.
With this patch applied, USB controllers are discovered and instantiated
when booting the cubieboard machine with a recent Linux kernel.
ehci-platform 1c14000.usb: EHCI Host Controller
ehci-platform 1c14000.usb: new USB bus registered, assigned bus number 1
ehci-platform 1c14000.usb: irq 26, io mem 0x01c14000
ehci-platform 1c14000.usb: USB 2.0 started, EHCI 1.00
ehci-platform 1c1c000.usb: EHCI Host Controller
ehci-platform 1c1c000.usb: new USB bus registered, assigned bus number 2
ehci-platform 1c1c000.usb: irq 31, io mem 0x01c1c000
ehci-platform 1c1c000.usb: USB 2.0 started, EHCI 1.00
ohci-platform 1c14400.usb: Generic Platform OHCI controller
ohci-platform 1c14400.usb: new USB bus registered, assigned bus number 3
ohci-platform 1c14400.usb: irq 27, io mem 0x01c14400
ohci-platform 1c1c400.usb: Generic Platform OHCI controller
ohci-platform 1c1c400.usb: new USB bus registered, assigned bus number 4
ohci-platform 1c1c400.usb: irq 32, io mem 0x01c1c400
usb 2-1: new high-speed USB device number 2 using ehci-platform
usb-storage 2-1:1.0: USB Mass Storage device detected
scsi host1: usb-storage 2-1:1.0
usb 3-1: new full-speed USB device number 2 using ohci-platform
input: QEMU QEMU USB Mouse as /devices/platform/soc/1c14400.usb/usb3/3-1/3-1:1.0/0003:0627:0001.0001/input/input0
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Message-id: 20200217204812.9857-4-linux@roeck-us.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We'll use this property in a follow-up patch to insantiate an EHCI
bus with companion support.
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Message-id: 20200217204812.9857-3-linux@roeck-us.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We need to be able to use OHCISysBusState outside hcd-ohci.c, so move it
to its include file.
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Message-id: 20200217204812.9857-2-linux@roeck-us.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The isar_feature_aa32_pan and isar_feature_aa32_ats1e1 functions
are supposed to be testing fields in ID_MMFR3; but a cut-and-paste
error meant we were looking at MVFR0 instead.
Fix the functions to look at the right register; this requires
us to move at least id_mmfr3 to the ARMISARegisters struct; we
choose to move all the ID_MMFRn registers for consistency.
Fixes: 3d6ad6bb46
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200214175116.9164-19-peter.maydell@linaro.org
Instead of open-coding a check on the ID_DFR0 PerfMon ID register
field, create a standardly-named isar_feature for "does AArch32 have
a v8.1 PMUv3" and use it.
This entails moving the id_dfr0 field into the ARMISARegisters struct.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20200214175116.9164-9-peter.maydell@linaro.org
Up to now, the z2 machine only boots if a flash image is provided.
This is not really necessary; the machine can boot from initrd or from
SD without it. At the same time, having to provide dummy flash images
is a nuisance and does not add any real value. Make it optional.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200217210903.18602-1-linux@roeck-us.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Up to now, the mainstone machine only boots if two flash images are
provided. This is not really necessary; the machine can boot from initrd
or from SD without it. At the same time, having to provide dummy flash
images is a nuisance and does not add any real value. Make it optional.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200217210824.18513-1-linux@roeck-us.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Fix warning reported by Clang static code analyzer:
CC hw/misc/iotkit-secctl.o
hw/misc/iotkit-secctl.c:343:9: warning: Value stored to 'value' is never read
value &= 0x00f000f3;
^ ~~~~~~~~~~
Fixes: b3717c23e1
Reported-by: Clang Static Analyzer
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20200217132922.24607-1-f4bug@amsat.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This returns a fixed but non-zero value for the chip id.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200121013302.43839-3-joel@jms.id.au
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This splits the common write callback into separate ast2400 and ast2500
implementations. This makes it clearer when implementing differing
behaviour.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200121013302.43839-2-joel@jms.id.au
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The device tree blob returned by load_device_tree is malloced.
We should free it after cpu_physical_memory_write().
Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Chen Qun <kuhn.chenqun@huawei.com>
Message-Id: <20200218091154.21696-3-kuhn.chenqun@huawei.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We already detect if a device is being hot plugged before CAS to trigger
a CAS reboot and during migration to migrate the state of the associated
DRC. But hot unplugging a device is also an asynchronous operation that
requires the guest to take action. This means that if the guest is migrated
after the hot unplug event was sent but before it could release the device
with RTAS, the destination QEMU doesn't know about the pending unplug
operation and doesn't actually remove the device when the guest finally
releases it.
Similarly, if the unplug request is fired before CAS, the guest isn't
notified of the change, just like with hotplug. It ends up booting with
the device still present in the DT and configures it, just like it was
never removed. Even weirder, since the event is still queued, it will
be eventually processed when some other unrelated event is posted to
the guest.
Enhance spapr_drc_transient() to also return true if an unplug request is
pending. This fixes the issue at CAS with a CAS reboot request and
causes the DRC state to be migrated. Some extra care is still needed to
inform the destination that an unplug request is pending : migrate the
unplug_requested field of the DRC in an optional subsection. This might
break backwards migration, but this is still better than ending with
an inconsistent guest.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <158169248798.3465937.1108351365840514270.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We currently don't support hotplug of devices between boot and CAS. If
this happens a CAS reboot is triggered. We detect this during CAS using
the spapr_drc_needed() function which is essentially a VMStateDescription
.needed callback. Even if the condition for CAS reboot happens to be the
same as for DRC migration, it looks wrong to piggyback a migration helper
for this.
Introduce a helper with slightly more explicit name and use it in both CAS
and DRC migration code. Since a subsequent patch will enhance this helper
to cover the case of hot unplug, let's go for spapr_drc_transient(). While
here convert spapr_hotplugged_dev_before_cas() to the "transient" wording as
well.
This doesn't change any behaviour.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <158169248180.3465937.9531405453362718771.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
'fdt' forgot to clean both e500 and pnv when we call 'system_reset' on ppc,
this patch fix it. The leak stacks are as follow:
Direct leak of 4194304 byte(s) in 4 object(s) allocated from:
#0 0x7fafe37dd970 in __interceptor_calloc (/lib64/libasan.so.5+0xef970)
#1 0x7fafe2e3149d in g_malloc0 (/lib64/libglib-2.0.so.0+0x5249d)
#2 0x561876f7f80d in create_device_tree /mnt/sdb/qemu-new/qemu/device_tree.c:40
#3 0x561876b7ac29 in ppce500_load_device_tree /mnt/sdb/qemu-new/qemu/hw/ppc/e500.c:364
#4 0x561876b7f437 in ppce500_reset_device_tree /mnt/sdb/qemu-new/qemu/hw/ppc/e500.c:617
#5 0x56187718b1ae in qemu_devices_reset /mnt/sdb/qemu-new/qemu/hw/core/reset.c:69
#6 0x561876f6938d in qemu_system_reset /mnt/sdb/qemu-new/qemu/vl.c:1412
#7 0x561876f6a25b in main_loop_should_exit /mnt/sdb/qemu-new/qemu/vl.c:1645
#8 0x561876f6a398 in main_loop /mnt/sdb/qemu-new/qemu/vl.c:1679
#9 0x561876f7da8e in main /mnt/sdb/qemu-new/qemu/vl.c:4438
#10 0x7fafde16b812 in __libc_start_main ../csu/libc-start.c:308
#11 0x5618765c055d in _start (/mnt/sdb/qemu-new/qemu/build/ppc64-softmmu/qemu-system-ppc64+0x2b1555d)
Direct leak of 1048576 byte(s) in 1 object(s) allocated from:
#0 0x7fc0a6f1b970 in __interceptor_calloc (/lib64/libasan.so.5+0xef970)
#1 0x7fc0a656f49d in g_malloc0 (/lib64/libglib-2.0.so.0+0x5249d)
#2 0x55eb05acd2ca in pnv_dt_create /mnt/sdb/qemu-new/qemu/hw/ppc/pnv.c:507
#3 0x55eb05ace5bf in pnv_reset /mnt/sdb/qemu-new/qemu/hw/ppc/pnv.c:578
#4 0x55eb05f2f395 in qemu_system_reset /mnt/sdb/qemu-new/qemu/vl.c:1410
#5 0x55eb05f43850 in main /mnt/sdb/qemu-new/qemu/vl.c:4403
#6 0x7fc0a18a9812 in __libc_start_main ../csu/libc-start.c:308
#7 0x55eb0558655d in _start (/mnt/sdb/qemu-new/qemu/build/ppc64-softmmu/qemu-system-ppc64+0x2b1555d)
Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com>
Message-Id: <20200214033206.4395-1-pannengyuan@huawei.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This allows moving the kernel in the guest memory. The option is useful
for step debugging (as Linux is linked at 0x0); it also allows loading
grub which is normally linked to run at 0x20000.
This uses the existing kernel address by default.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20200203032943.121178-6-aik@ozlabs.ru>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We obviously don't want to print out an error message if addr points to
a valid register.
Reported-by: Coverity CID 1419391 Missing break in switch
Fixes: 9ae1329ee2 "ppc/pnv: Add models for POWER8 PHB3 PCIe Host bridge"
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <158153365202.3229002.11521084761048102466.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Obviously, we want to pass &local_err so that we can check it then
line below, not errp.
Reported-by: Coverity CID 1419395 'Constant' variable guards dead code
Fixes: 4f9924c4d4 "ppc/pnv: Add models for POWER9 PHB4 PCIe Host bridge"
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <158153364605.3229002.2796177658957390343.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
As reported by Coverity defect CID 1419397, the 'j' variable goes up to
63 and shouldn't be used to left shift a 32-bit integer.
The result of the operation goes to a 64-bit integer : use a 64-bit
constant.
Reported-by: Coverity CID 1419397 Bad bit shift operation
Fixes: 9ae1329ee2 "ppc/pnv: Add models for POWER8 PHB3 PCIe Host bridge"
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <158153364010.3229002.8004283672455615950.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This patch implements few of the necessary hcalls for the nvdimm support.
PAPR semantics is such that each NVDIMM device is comprising of multiple
SCM(Storage Class Memory) blocks. The guest requests the hypervisor to
bind each of the SCM blocks of the NVDIMM device using hcalls. There can
be SCM block unbind requests in case of driver errors or unplug(not
supported now) use cases. The NVDIMM label read/writes are done through
hcalls.
Since each virtual NVDIMM device is divided into multiple SCM blocks,
the bind, unbind, and queries using hcalls on those blocks can come
independently. This doesn't fit well into the qemu device semantics,
where the map/unmap are done at the (whole)device/object level granularity.
The patch doesnt actually bind/unbind on hcalls but let it happen at the
device_add/del phase itself instead.
The guest kernel makes bind/unbind requests for the virtual NVDIMM device
at the region level granularity. Without interleaving, each virtual NVDIMM
device is presented as a separate guest physical address range. So, there
is no way a partial bind/unbind request can come for the vNVDIMM in a
hcall for a subset of SCM blocks of a virtual NVDIMM. Hence it is safe to
do bind/unbind everything during the device_add/del.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Message-Id: <158131059899.2897.11515211602702956854.stgit@lep8c.aus.stglabs.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Add support for NVDIMM devices for sPAPR. Piggyback on existing nvdimm
device interface in QEMU to support virtual NVDIMM devices for Power.
Create the required DT entries for the device (some entries have
dummy values right now).
The patch creates the required DT node and sends a hotplug
interrupt to the guest. Guest is expected to undertake the normal
DR resource add path in response and start issuing PAPR SCM hcalls.
The device support is verified based on the machine version unlike x86.
This is how it can be used ..
Ex :
For coldplug, the device to be added in qemu command line as shown below
-object memory-backend-file,id=memnvdimm0,prealloc=yes,mem-path=/tmp/nvdimm0,share=yes,size=1073872896
-device nvdimm,label-size=128k,uuid=75a3cdd7-6a2f-4791-8d15-fe0a920e8e9e,memdev=memnvdimm0,id=nvdimm0,slot=0
For hotplug, the device to be added from monitor as below
object_add memory-backend-file,id=memnvdimm0,prealloc=yes,mem-path=/tmp/nvdimm0,share=yes,size=1073872896
device_add nvdimm,label-size=128k,uuid=75a3cdd7-6a2f-4791-8d15-fe0a920e8e9e,memdev=memnvdimm0,id=nvdimm0,slot=0
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
[Early implementation]
Message-Id: <158131058078.2897.12767731856697459923.stgit@lep8c.aus.stglabs.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
For ppc64, PAPR requires the nvdimm device to have UUID property
set in the device tree. Add an option to get it from the user.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <158131056931.2897.14057087440721445976.stgit@lep8c.aus.stglabs.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
nvdimm_device_list is required for parsing the list for devices
in subsequent patches. Move it to common utility area.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <158131055857.2897.15658377276504711773.stgit@lep8c.aus.stglabs.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We are going to add more init for the latest machine, so move the setup
to a function so we don't have to change the DEFINE_SPAPR_MACHINE macro
each time.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20200207064628.1196095-1-mst@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When PHB4 bridge has been added, the dependencies to PCIE_PORT has been
added to XIVE_SPAPR and indirectly to PSERIES.
The build of the PowerNV machine is fine while we also build the PSERIES
machine.
If we disable the PSERIES machine, the PowerNV build fails because the
PCI Express files are not built:
/usr/bin/ld: hw/ppc/pnv.o: in function `pnv_chip_power8_pic_print_info':
.../hw/ppc/pnv.c:623: undefined reference to `pnv_phb3_msi_pic_print_info'
/usr/bin/ld: hw/ppc/pnv.o: in function `pnv_chip_power9_pic_print_info':
.../hw/ppc/pnv.c:639: undefined reference to `pnv_phb4_pic_print_info'
/usr/bin/ld: ../hw/usb/hcd-ehci-pci.o: in function `usb_ehci_pci_write_config':
.../hw/usb/hcd-ehci-pci.c:129: undefined reference to `pci_default_write_config'
/usr/bin/ld: ../hw/usb/hcd-ehci-pci.o: in function `usb_ehci_pci_realize':
.../hw/usb/hcd-ehci-pci.c:68: undefined reference to `pci_allocate_irq'
/usr/bin/ld: .../hw/usb/hcd-ehci-pci.c:72: undefined reference to `pci_register_bar'
/usr/bin/ld: ../hw/usb/hcd-ehci-pci.o:(.data.rel+0x50): undefined reference to `vmstate_pci_device'
This patch fixes the problem by adding needed dependencies to POWERNV.
Fixes: 4f9924c4d4 ("ppc/pnv: Add models for POWER9 PHB4 PCIe Host bridge")
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20200205232016.588202-3-lvivier@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The "ibm,os-term" RTAS call has a single parameter which is a pointer to
a message from the guest kernel about the termination cause; this prints
it.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20200203032044.118585-1-aik@ozlabs.ru>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>