mirrors/qemu

mirror of https://github.com/qemu/qemu.git synced 2024-12-17 01:03:51 +08:00

Author	SHA1	Message	Date
Hanna Czenczek	cda83adc62	vhost-user: Interface for migration state transfer Add the interface for transferring the back-end's state during migration as defined previously in vhost-user.rst. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com> Message-Id: <20231016134243.68248-6-hreitz@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2023-11-07 03:39:10 -05:00
Steve Sistare	89415796f6	cpr: relax vhost migration blockers vhost blocks migration if logging is not supported to track dirty memory, and vhost-user blocks it if the log cannot be saved to a shm fd. vhost-vdpa blocks migration if both hosts do not support all the device's features using a shadow VQ, for tracking requests and dirty memory. vhost-scsi blocks migration if storage cannot be shared across hosts, or if state cannot be migrated. None of these conditions apply if the old and new qemu processes do not run concurrently, and if new qemu starts on the same host as old, which is the case for cpr. Narrow the scope of these blockers so they only apply to normal mode. They will not block cpr modes when they are added in subsequent patches. No functional change until a new mode is added. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <1698263069-406971-5-git-send-email-steven.sistare@oracle.com>	2023-11-01 16:13:59 +01:00
Stefan Hajnoczi	1b4a5a20da	virtio,pc,pci: features, cleanups infrastructure for vhost-vdpa shadow work piix south bridge rework reconnect for vhost-user-scsi dummy ACPI QTG DSM for cxl tests, cleanups, fixes all over the place Signed-off-by: Michael S. Tsirkin <mst@redhat.com> -----BEGIN PGP SIGNATURE----- iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmU06PMPHG1zdEByZWRo YXQuY29tAAoJECgfDbjSjVRpNIsH/0DlKti86VZLJ6PbNqsnKxoK2gg05TbEhPZU pQ+RPDaCHpFBsLC5qsoMJwvaEQFe0e49ZFemw7bXRzBxgmbbNnZ9ArCIPqT+rvQd 7UBmyC+kacVyybZatq69aK2BHKFtiIRlT78d9Izgtjmp8V7oyKoz14Esh8wkE+FT ypHUa70Addi6alNm6BVkm7bxZxi0Wrmf3THqF8ViYvufzHKl7JR5e17fKWEG0BqV 9W7AeHMnzJ7jkTvBGUw7g5EbzFn7hPLTbO4G/VW97k0puS4WRX5aIMkVhUazsRIa zDOuXCCskUWuRapiCwY0E4g7cCaT8/JR6JjjBaTgkjJgvo5Y8Eg= =ILek -----END PGP SIGNATURE----- Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging virtio,pc,pci: features, cleanups infrastructure for vhost-vdpa shadow work piix south bridge rework reconnect for vhost-user-scsi dummy ACPI QTG DSM for cxl tests, cleanups, fixes all over the place Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # -----BEGIN PGP SIGNATURE----- # # iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmU06PMPHG1zdEByZWRo # YXQuY29tAAoJECgfDbjSjVRpNIsH/0DlKti86VZLJ6PbNqsnKxoK2gg05TbEhPZU # pQ+RPDaCHpFBsLC5qsoMJwvaEQFe0e49ZFemw7bXRzBxgmbbNnZ9ArCIPqT+rvQd # 7UBmyC+kacVyybZatq69aK2BHKFtiIRlT78d9Izgtjmp8V7oyKoz14Esh8wkE+FT # ypHUa70Addi6alNm6BVkm7bxZxi0Wrmf3THqF8ViYvufzHKl7JR5e17fKWEG0BqV # 9W7AeHMnzJ7jkTvBGUw7g5EbzFn7hPLTbO4G/VW97k0puS4WRX5aIMkVhUazsRIa # zDOuXCCskUWuRapiCwY0E4g7cCaT8/JR6JjjBaTgkjJgvo5Y8Eg= # =ILek # -----END PGP SIGNATURE----- # gpg: Signature made Sun 22 Oct 2023 02:18:43 PDT # gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469 # gpg: issuer "mst@redhat.com" # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full] # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full] # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (62 commits) intel-iommu: Report interrupt remapping faults, fix return value MAINTAINERS: Add include/hw/intc/i8259.h to the PC chip section vhost-user: Fix protocol feature bit conflict tests/acpi: Update DSDT.cxl with QTG DSM hw/cxl: Add QTG _DSM support for ACPI0017 device tests/acpi: Allow update of DSDT.cxl hw/i386/cxl: ensure maxram is greater than ram size for calculating cxl range vhost-user: fix lost reconnect vhost-user-scsi: start vhost when guest kicks vhost-user-scsi: support reconnect to backend vhost: move and rename the conn retry times vhost-user-common: send get_inflight_fd once hw/i386/pc_piix: Make PIIX4 south bridge usable in PC machine hw/isa/piix: Implement multi-process QEMU support also for PIIX4 hw/isa/piix: Resolve duplicate code regarding PCI interrupt wiring hw/isa/piix: Reuse PIIX3's PCI interrupt triggering in PIIX4 hw/isa/piix: Rename functions to be shared for PCI interrupt triggering hw/isa/piix: Reuse PIIX3 base class' realize method in PIIX4 hw/isa/piix: Share PIIX3's base class with PIIX4 hw/isa/piix: Harmonize names of reset control memory regions ... Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2023-10-23 14:45:29 -07:00
Stefan Hajnoczi	c0c4f14729	virtio: call ->vhost_reset_device() during reset vhost-user-scsi has a VirtioDeviceClass->reset() function that calls ->vhost_reset_device(). The other vhost devices don't notify the vhost device upon reset. Stateful vhost devices may need to handle device reset in order to free resources or prevent stale device state from interfering after reset. Call ->vhost_device_reset() from virtio_reset() so that that vhost devices are notified of device reset. This patch affects behavior as follows: - vhost-kernel: No change in behavior since ->vhost_reset_device() is not implemented. - vhost-user: back-ends that negotiate VHOST_USER_PROTOCOL_F_RESET_DEVICE now receive a VHOST_USER_DEVICE_RESET message upon device reset. Otherwise there is no change in behavior. DPDK, SPDK, libvhost-user, and the vhost-user-backend crate do not negotiate VHOST_USER_PROTOCOL_F_RESET_DEVICE automatically. - vhost-vdpa: an extra SET_STATUS 0 call is made during device reset. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20231004014532.1228637-4-stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com>	2023-10-22 05:18:16 -04:00
Steve Sistare	c8a7fc5179	migration: simplify blockers Modify migrate_add_blocker and migrate_del_blocker to take an Error reason. This allows migration to own the Error object, so that if an error occurs in migrate_add_blocker, migration code can free the Error and clear the client handle, simplifying client code. It also simplifies the migrate_del_blocker call site. In addition, this is a pre-requisite for a proposed future patch that would add a mode argument to migration requests to support live update, and maintain a list of blockers for each mode. A blocker may apply to a single mode or to multiple modes, and passing Error will allow one Error object to be registered for multiple modes. No functional change. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Tested-by: Michael Galaxy <mgalaxy@akamai.com> Reviewed-by: Michael Galaxy <mgalaxy@akamai.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Message-ID: <1697634216-84215-1-git-send-email-steven.sistare@oracle.com>	2023-10-20 08:51:41 +02:00
David Hildenbrand	533f5d6679	memory,vhost: Allow for marking memory device memory regions unmergeable Let's allow for marking memory regions unmergeable, to teach flatview code and vhost to not merge adjacent aliases to the same memory region into a larger memory section; instead, we want separate aliases to stay separate such that we can atomically map/unmap aliases without affecting other aliases. This is desired for virtio-mem mapping device memory located on a RAM memory region via multiple aliases into a memory region container, resulting in separate memslots that can get (un)mapped atomically. As an example with virtio-mem, the layout would look something like this: [...] 0000000240000000-00000020bfffffff (prio 0, i/o): device-memory 0000000240000000-000000043fffffff (prio 0, i/o): virtio-mem 0000000240000000-000000027fffffff (prio 0, ram): alias memslot-0 @mem2 0000000000000000-000000003fffffff 0000000280000000-00000002bfffffff (prio 0, ram): alias memslot-1 @mem2 0000000040000000-000000007fffffff 00000002c0000000-00000002ffffffff (prio 0, ram): alias memslot-2 @mem2 0000000080000000-00000000bfffffff [...] Without unmergable memory regions, all three memslots would get merged into a single memory section. For example, when mapping another alias (e.g., virtio-mem-memslot-3) or when unmapping any of the mapped aliases, memory listeners will first get notified about the removal of the big memory section to then get notified about re-adding of the new (differently merged) memory section(s). In an ideal world, memory listeners would be able to deal with that atomically, like KVM nowadays does. However, (a) supporting this for other memory listeners (vhost-user, vfio) is fairly hard: temporary removal can result in all kinds of issues on concurrent access to guest memory; and (b) this handling is undesired, because temporarily removing+readding can consume quite some time on bigger memslots and is not efficient (e.g., vfio unpinning and repinning pages ...). Let's allow for marking a memory region unmergeable, such that we can atomically (un)map aliases to the same memory region, similar to (un)mapping individual DIMMs. Similarly, teach vhost code to not redo what flatview core stopped doing: don't merge such sections. Merging in vhost code is really only relevant for handling random holes in boot memory where; without this merging, the vhost-user backend wouldn't be able to mmap() some boot memory backed on hugetlb. We'll use this for virtio-mem next. Message-ID: <20230926185738.277351-18-david@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>	2023-10-12 14:15:22 +02:00
David Hildenbrand	a2335113ae	memory-device,vhost: Support automatic decision on the number of memslots We want to support memory devices that can automatically decide how many memslots they will use. In the worst case, they have to use a single memslot. The target use cases are virtio-mem and the hyper-v balloon. Let's calculate a reasonable limit such a memory device may use, and instruct the device to make a decision based on that limit. Use a simple heuristic that considers: * A memslot soft-limit for all memory devices of 256; also, to not consume too many memslots -- which could harm performance. * Actually still free and unreserved memslots * The percentage of the remaining device memory region that memory device will occupy. Further, while we properly check before plugging a memory device whether there still is are free memslots, we have other memslot consumers (such as boot memory, PCI BARs) that don't perform any checks and might dynamically consume memslots without any prior reservation. So we might succeed in plugging a memory device, but once we dynamically map a PCI BAR we would be in trouble. Doing accounting / reservation / checks for all such users is problematic (e.g., sometimes we might temporarily split boot memory into two memslots, triggered by the BIOS). We use the historic magic memslot number of 509 as orientation to when supporting 256 memory devices -> memslots (leaving 253 for boot memory and other devices) has been proven to work reliable. We'll fallback to suggesting a single memslot if we don't have at least 509 total memslots. Plugging vhost devices with less than 509 memslots available while we have memory devices plugged that consume multiple memslots due to automatic decisions can be problematic. Most configurations might just fail due to "limit < used + reserved", however, it can also happen that these memory devices would suddenly consume memslots that would actually be required by other memslot consumers (boot, PCI BARs) later. Note that this has always been sketchy with vhost devices that support only a small number of memslots; but we don't want to make it any worse.So let's keep it simple and simply reject plugging such vhost devices in such a configuration. Eventually, all vhost devices that want to be fully compatible with such memory devices should support a decent number of memslots (>= 509). Message-ID: <20230926185738.277351-13-david@redhat.com> Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>	2023-10-12 14:15:22 +02:00
David Hildenbrand	cd89c065b0	vhost: Add vhost_get_max_memslots() Let's add vhost_get_max_memslots(). Message-ID: <20230926185738.277351-12-david@redhat.com> Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>	2023-10-12 14:15:22 +02:00
David Hildenbrand	766aa0a654	memory-device,vhost: Support memory devices that dynamically consume memslots We want to support memory devices that have a dynamically managed memory region container as device memory region. This device memory region maps multiple RAM memory subregions (e.g., aliases to the same RAM memory region), whereby these subregions can be (un)mapped on demand. Each RAM subregion will consume a memslot in KVM and vhost, resulting in such a new device consuming memslots dynamically, and initially usually 0. We already track the number of used vs. required memslots for all memslots. From that, we can derive the number of reserved memslots that must not be used otherwise. The target use case is virtio-mem and the hyper-v balloon, which will dynamically map aliases to RAM memory region into their device memory region container. Properly document what's supported and what's not and extend the vhost memslot check accordingly. Message-ID: <20230926185738.277351-10-david@redhat.com> Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>	2023-10-12 14:15:22 +02:00
David Hildenbrand	8c49951c4a	vhost: Return number of free memslots Let's return the number of free slots instead of only checking if there is a free slot. Required to support memory devices that consume multiple memslots. This is a preparation for memory devices that consume multiple memslots. Message-ID: <20230926185738.277351-6-david@redhat.com> Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>	2023-10-12 14:15:22 +02:00
David Hildenbrand	309ebfa691	vhost: Remove vhost_backend_can_merge() callback Checking whether the memory regions are equal is sufficient: if they are equal, then most certainly the contained fd is equal. The whole vhost-user memslot handling is suboptimal and overly complicated. We shouldn't have to lookup a RAM memory regions we got notified about in vhost_user_get_mr_data() using a host pointer. But that requires a bigger rework -- especially an alternative vhost_set_mem_table() backend call that simply consumes MemoryRegionSections. For now, let's just drop vhost_backend_can_merge(). Message-ID: <20230926185738.277351-3-david@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>	2023-10-12 14:15:21 +02:00
David Hildenbrand	552b25229c	vhost: Rework memslot filtering and fix "used_memslot" tracking Having multiple vhost devices, some filtering out fd-less memslots and some not, can mess up the "used_memslot" accounting. Consequently our "free memslot" checks become unreliable and we might run out of free memslots at runtime later. An example sequence which can trigger a potential issue that involves different vhost backends (vhost-kernel and vhost-user) and hotplugged memory devices can be found at [1]. Let's make the filtering mechanism less generic and distinguish between backends that support private memslots (without a fd) and ones that only support shared memslots (with a fd). Track the used_memslots for both cases separately and use the corresponding value when required. Note: Most probably we should filter out MAP_PRIVATE fd-based RAM regions (for example, via memory-backend-memfd,...,shared=off or as default with memory-backend-file) as well. When not using MAP_SHARED, it might not work as expected. Add a TODO for now. [1] https://lkml.kernel.org/r/fad9136f-08d3-3fd9-71a1-502069c000cf@redhat.com Message-ID: <20230926185738.277351-2-david@redhat.com> Fixes: `988a27754b` ("vhost: allow backends to filter memory sections") Cc: Tiwei Bie <tiwei.bie@intel.com> Acked-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>	2023-10-12 14:15:21 +02:00
Thomas Huth	da3182887c	hw/virtio/vhost: Silence compiler warnings in vhost code when using -Wshadow Rename a variable in vhost_dev_sync_region() and remove a superfluous declaration in vhost_commit() to make this code compilable with "-Wshadow". Signed-off-by: Thomas Huth <thuth@redhat.com> Message-ID: <20231004114809.105672-1-thuth@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-By: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2023-10-06 12:47:57 +02:00
Li Feng	18f2971ce4	vhost: fix the fd leak When the vhost-user reconnect to the backend, the notifer should be cleanup. Otherwise, the fd resource will be exhausted. Fixes: `f9a09ca3ea` ("vhost: add support for configure interrupt") Signed-off-by: Li Feng <fengli@smartx.com> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com> Message-Id: <20230731121018.2856310-2-fengli@smartx.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Fiona Ebner <f.ebner@proxmox.com>	2023-08-03 16:06:49 -04:00
Viktor Prutyanov	ee071f67f7	vhost: register and change IOMMU flag depending on Device-TLB state The guest can disable or never enable Device-TLB. In these cases, it can't be used even if enabled in QEMU. So, check Device-TLB state before registering IOMMU notifier and select unmap flag depending on that. Also, implement a way to change IOMMU notifier flag if Device-TLB state is changed. Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2001312 Signed-off-by: Viktor Prutyanov <viktor@daynix.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20230626091258.24453-2-viktor@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2023-07-10 15:07:50 -04:00
Isaku Yamahata	8be0461d37	exec/memory: Add symbol for memory listener priority for device backend Add MEMORY_LISTENER_PRIORITY_DEV_BACKEND for the symbolic value for memory listener to replace the hard-coded value 10 for the device backend. No functional change intended. Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <8314d91688030d7004e96958f12e2c83fb889245.1687279702.git.isaku.yamahata@intel.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2023-06-28 14:27:59 +02:00
Laurent Vivier	92099aa4e9	vhost: fix vhost_dev_enable_notifiers() error case in vhost_dev_enable_notifiers(), if virtio_bus_set_host_notifier(true) fails, we call vhost_dev_disable_notifiers() that executes virtio_bus_set_host_notifier(false) on all queues, even on queues that have failed to be initialized. This triggers a core dump in memory_region_del_eventfd(): virtio_bus_set_host_notifier: unable to init event notifier: Too many open files (-24) vhost VQ 1 notifier binding failed: 24 .../softmmu/memory.c:2611: memory_region_del_eventfd: Assertion `i != mr->ioeventfd_nb' failed. Fix the problem by providing to vhost_dev_disable_notifiers() the number of queues to disable. Fixes: `8771589b6f` ("vhost: simplify vhost_dev_enable_notifiers") Cc: longpeng2@huawei.com Signed-off-by: Laurent Vivier <lvivier@redhat.com> Message-Id: <20230602162735.3670785-1-lvivier@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2023-06-26 09:50:00 -04:00
Prasad Pandit	77ece20ba0	vhost: release virtqueue objects in error path vhost_dev_start function does not release virtqueue objects when event_notifier_init() function fails. Release virtqueue objects and log a message about function failure. Signed-off-by: Prasad Pandit <pjp@fedoraproject.org> Message-Id: <20230529114333.31686-3-ppandit@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Fixes: `f9a09ca3ea` ("vhost: add support for configure interrupt") Reviewed-by: Peter Xu <peterx@redhat.com> Cc: qemu-stable@nongnu.org Acked-by: Jason Wang <jasowang@redhat.com>	2023-06-23 02:54:44 -04:00
Prasad Pandit	1e3ffb34f7	vhost: release memory_listener object in error path vhost_dev_start function does not release memory_listener object in case of an error. This may crash the guest when vhost is unable to set memory table: stack trace of thread 125653: Program terminated with signal SIGSEGV, Segmentation fault #0 memory_listener_register (qemu-kvm + 0x6cda0f) #1 vhost_dev_start (qemu-kvm + 0x699301) #2 vhost_net_start (qemu-kvm + 0x45b03f) #3 virtio_net_set_status (qemu-kvm + 0x665672) #4 qmp_set_link (qemu-kvm + 0x548fd5) #5 net_vhost_user_event (qemu-kvm + 0x552c45) #6 tcp_chr_connect (qemu-kvm + 0x88d473) #7 tcp_chr_new_client (qemu-kvm + 0x88cf83) #8 tcp_chr_accept (qemu-kvm + 0x88b429) #9 qio_net_listener_channel_func (qemu-kvm + 0x7ac07c) #10 g_main_context_dispatch (libglib-2.0.so.0 + 0x54e2f) Release memory_listener objects in the error path. Signed-off-by: Prasad Pandit <pjp@fedoraproject.org> Message-Id: <20230529114333.31686-2-ppandit@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Fixes: `c471ad0e9b` ("vhost_net: device IOTLB support") Cc: qemu-stable@nongnu.org Acked-by: Jason Wang <jasowang@redhat.com>	2023-06-23 02:54:44 -04:00
Philippe Mathieu-Daudé	4ee4667ded	hw/virtio: Remove unnecessary 'virtio-access.h' header None of these files use the VirtIO Load/Store API declared by "hw/virtio/virtio-access.h". This header probably crept in via copy/pasting, remove it. Note, "virtio-access.h" is target-specific, so any file including it also become tainted as target-specific. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Acked-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Thomas Huth <thuth@redhat.com> Message-Id: <20230524093744.88442-10-philmd@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>	2023-06-23 02:54:44 -04:00
Cindy Lu	74b5d2b56c	vhost: expose function vhost_dev_has_iommu() To support vIOMMU in vdpa, need to exposed the function vhost_dev_has_iommu, vdpa will use this function to check if vIOMMU enable. Signed-off-by: Cindy Lu <lulu@redhat.com> Message-Id: <20230510054631.2951812-2-lulu@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2023-05-19 10:30:46 -04:00
Peter Xu	560a997535	vhost: Drop unused eventfd_add\|del hooks These hooks were introduced in: `80a1ea3748` ("memory: move ioeventfd ops to MemoryListener", 2012-02-29) But they seem to be never used. Drop them. Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20230306193209.516011-1-peterx@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2023-04-21 04:25:52 -04:00
Eugenio Pérez	c3716f260b	vdpa: move vhost reset after get vring base The function vhost.c:vhost_dev_stop calls vhost operation vhost_dev_start(false). In the case of vdpa it totally reset and wipes the device, making the fetching of the vring base (virtqueue state) totally useless. The kernel backend does not use vhost_dev_start vhost op callback, but vhost-user do. A patch to make vhost_user_dev_start more similar to vdpa is desirable, but it can be added on top. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20230303172445.1089785-8-eperezma@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2023-03-07 12:38:59 -05:00
Longpeng	0fdc6b8509	vhost: configure all host notifiers in a single MR transaction This allows the vhost device to batch the setup of all its host notifiers. This significantly reduces the device starting time, e.g. the time spend on enabling notifiers reduce from 376ms to 9.1ms for a VM with 64 vCPUs and 3 vhost-vDPA generic devices (vdpa_sim_blk, 64vq per device) Signed-off-by: Longpeng <longpeng2@huawei.com> Message-Id: <20221227072015.3134-3-longpeng2@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2023-01-08 01:54:23 -05:00
Longpeng	8771589b6f	vhost: simplify vhost_dev_enable_notifiers Simplify the error path in vhost_dev_enable_notifiers by using vhost_dev_disable_notifiers directly. Signed-off-by: Longpeng <longpeng2@huawei.com> Message-Id: <20221227072015.3134-2-longpeng2@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2023-01-08 01:54:22 -05:00
Cindy Lu	f9a09ca3ea	vhost: add support for configure interrupt Add functions to support configure interrupt. The configure interrupt process will start in vhost_dev_start and stop in vhost_dev_stop. Also add the functions to support vhost_config_pending and vhost_config_mask. Signed-off-by: Cindy Lu <lulu@redhat.com> Message-Id: <20221222070451.936503-8-lulu@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2023-01-08 01:54:22 -05:00
Jason Wang	345cc1cbcb	vhost: fix vq dirty bitmap syncing when vIOMMU is enabled When vIOMMU is enabled, the vq->used_phys is actually the IOVA not GPA. So we need to translate it to GPA before the syncing otherwise we may hit the following crash since IOVA could be out of the scope of the GPA log size. This could be noted when using virtio-IOMMU with vhost using 1G memory. Fixes: `c471ad0e9b` ("vhost_net: device IOTLB support") Cc: qemu-stable@nongnu.org Tested-by: Lei Yang <leiyang@redhat.com> Reported-by: Yalan Zhang <yalzhang@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20221216033552.77087-1-jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-12-21 07:32:24 -05:00
Stefano Garzarella	4daa5054c5	vhost: enable vrings in vhost_dev_start() for vhost-user devices Commit `02b61f38d3` ("hw/virtio: incorporate backend features in features") properly negotiates VHOST_USER_F_PROTOCOL_FEATURES with the vhost-user backend, but we forgot to enable vrings as specified in docs/interop/vhost-user.rst: If ``VHOST_USER_F_PROTOCOL_FEATURES`` has not been negotiated, the ring starts directly in the enabled state. If ``VHOST_USER_F_PROTOCOL_FEATURES`` has been negotiated, the ring is initialized in a disabled state and is enabled by ``VHOST_USER_SET_VRING_ENABLE`` with parameter 1. Some vhost-user front-ends already did this by calling vhost_ops.vhost_set_vring_enable() directly: - backends/cryptodev-vhost.c - hw/net/virtio-net.c - hw/virtio/vhost-user-gpio.c But most didn't do that, so we would leave the vrings disabled and some backends would not work. We observed this issue with the rust version of virtiofsd [1], which uses the event loop [2] provided by the vhost-user-backend crate where requests are not processed if vring is not enabled. Let's fix this issue by enabling the vrings in vhost_dev_start() for vhost-user front-ends that don't already do this directly. Same thing also in vhost_dev_stop() where we disable vrings. [1] https://gitlab.com/virtio-fs/virtiofsd [2] https://github.com/rust-vmm/vhost/blob/240fc2966/crates/vhost-user-backend/src/event_loop.rs#L217 Fixes: `02b61f38d3` ("hw/virtio: incorporate backend features in features") Reported-by: German Maglione <gmaglione@redhat.com> Tested-by: German Maglione <gmaglione@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Acked-by: Raphael Norwitz <raphael.norwitz@nutanix.com> Message-Id: <20221123131630.52020-1-sgarzare@redhat.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20221130112439.2527228-3-alex.bennee@linaro.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-12-01 02:30:04 -05:00
Kangjie Xu	e1f101d9f6	vhost: expose vhost_virtqueue_stop() Expose vhost_virtqueue_stop(), we need to use it when resetting a virtqueue. Signed-off-by: Kangjie Xu <kangjie.xu@linux.alibaba.com> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20221017092558.111082-9-xuanzhuo@linux.alibaba.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-11-07 13:12:20 -05:00
Kangjie Xu	ff48b62809	vhost: expose vhost_virtqueue_start() Expose vhost_virtqueue_start(), we need to use it when restarting a virtqueue. Signed-off-by: Kangjie Xu <kangjie.xu@linux.alibaba.com> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20221017092558.111082-8-xuanzhuo@linux.alibaba.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-11-07 13:12:20 -05:00
Alex Bennée	a276123119	hw/virtio: add some vhost-user trace events These are useful for tracing the lifetime of vhost-user connections. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20220802095010.3330793-10-alex.bennee@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>	2022-10-07 09:41:51 -04:00
Alex Bennée	f20400ed0d	hw/virtio: gracefully handle unset vhost_dev vdev I've noticed asserts firing because we query the status of vdev after a vhost connection is closed down. Rather than faulting on the NULL indirect just quietly reply false. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20220728135503.1060062-3-alex.bennee@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-08-17 07:07:37 -04:00
Konstantin Khlebnikov	ae50ae0b91	vhost: setup error eventfd and dump errors Vhost has error notifications, let's log them like other errors. For each virt-queue setup eventfd for vring error notifications. Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> [vsementsov: rename patch, change commit message and dump error like other errors in the file] Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Message-Id: <20220623161325.18813-3-vsementsov@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Roman Kagan <rvkagan@yandex-team.ru>	2022-06-27 18:53:18 -04:00
Ni Xun	9ce305c8be	vhost: also check queue state in the vhost_dev_set_log error routine When check queue state in the vhost_dev_set_log routine, it miss the error routine check, this patch also check queue state in error case. Fixes: `1e5a050f57` ("check queue state in the vhost_dev_set_log routine") Signed-off-by: Ni Xun <richardni@tencent.com> Reviewed-by: Zhigang Lu <tonnylu@tencent.com> Message-Id: <OS0PR01MB57139163F3F3955960675B52EAA79@OS0PR01MB5713.jpnprd01.prod.outlook.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-06-16 12:54:58 -04:00
Jonah Palmer	c255488d67	virtio: add vhost support for virtio devices This patch adds a get_vhost() callback function for VirtIODevices that returns the device's corresponding vhost_dev structure, if the vhost device is running. This patch also adds a vhost_started flag for VirtIODevices. Previously, a VirtIODevice wouldn't be able to tell if its corresponding vhost device was active or not. Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com> Message-Id: <1648819405-25696-3-git-send-email-jonah.palmer@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-05-16 04:38:40 -04:00
Thomas Huth	55d71e0b78	Don't include sysemu/tcg.h if it is not necessary This header only defines the tcg_allowed variable and the tcg_enabled() function - which are not required in many files that include this header. Drop the #include statement there. Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20220315144107.1012530-1-thuth@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2022-04-20 12:12:47 -07:00
Marc-André Lureau	e03b56863d	Replace config-time define HOST_WORDS_BIGENDIAN Replace a config-time define with a compile time condition define (compatible with clang and gcc) that must be declared prior to its usage. This avoids having a global configure time define, but also prevents from bad usage, if the config header wasn't included before. This can help to make some code independent from qemu too. gcc supports __BYTE_ORDER__ from about 4.6 and clang from 3.2. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> [ For the s390x parts I'm involved in ] Acked-by: Halil Pasic <pasic@linux.ibm.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220323155743.1585078-7-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-04-06 10:50:37 +02:00
Sergio Lopez	ff5eb77b8a	vhost: use wfd on functions setting vring call fd When ioeventfd is emulated using qemu_pipe(), only EventNotifier's wfd can be used for writing. Use the recently introduced event_notifier_get_wfd() function to obtain the fd that our peer must use to signal the vring. Signed-off-by: Sergio Lopez <slp@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20220304100854.14829-3-slp@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-03-06 06:19:47 -05:00
Michael S. Tsirkin	a86d1a0a93	Revert "vhost: add support for configure interrupt" This reverts commit `f7220a7ce2`. Fixes: `f7220a7ce2` ("vhost: add support for configure interrupt") Cc: "Cindy Lu" <lulu@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-10 16:01:11 -05:00
Roman Kagan	5d33ae4b7a	vhost: stick to -errno error return convention The generic vhost code expects that many of the VhostOps methods in the respective backends set errno on errors. However, none of the existing backends actually bothers to do so. In a number of those methods errno from the failed call is clobbered by successful later calls to some library functions; on a few code paths the generic vhost code then negates and returns that errno, thus making failures look as successes to the caller. As a result, in certain scenarios (e.g. live migration) the device doesn't notice the first failure and goes on through its state transitions as if everything is ok, instead of taking recovery actions (break and reestablish the vhost-user connection, cancel migration, etc) before it's too late. To fix this, consolidate on the convention to return negated errno on failures throughout generic vhost, and use it for error propagation. Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru> Message-Id: <20211111153354.18807-10-rvkagan@yandex-team.ru> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-07 05:19:55 -05:00
Cindy Lu	f7220a7ce2	vhost: add support for configure interrupt Add functions to support configure interrupt. The configure interrupt process will start in vhost_dev_start and stop in vhost_dev_stop. Also add the functions to support vhost_config_pending and vhost_config_mask, for masked_config_notifier, we only use the notifier saved in vq 0. Signed-off-by: Cindy Lu <lulu@redhat.com> Message-Id: <20211104164827.21911-8-lulu@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-06 06:11:39 -05:00
Leonardo Garcia	f71d31fa81	hw/virtio/vhost: Fix typo in comment. Signed-off-by: Leonardo Garcia <lagarcia@br.ibm.com> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <a10a0ddab65b474ebea1e1141abe0f4aa463909b.1637668012.git.lagarcia@br.ibm.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2021-12-17 10:46:08 +01:00
Peter Xu	142518bda5	memory: Name all the memory listeners Provide a name field for all the memory listeners. It can be used to identify which memory listener is which. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Message-Id: <20210817013553.30584-2-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-09-30 15:30:24 +02:00
Jason Wang	ae4003738f	vhost: correctly detect the enabling IOMMU Vhost used to compare the dma_as against the address_space_memory to detect whether the IOMMU is enabled or not. This might not work well since the virito-bus may call get_dma_as if VIRTIO_F_IOMMU_PLATFORM is set without an actual IOMMU enabled when device is plugged. In the case of PCI where pci_get_address_space() is used, the bus master as is returned. So vhost actually tries to enable device IOTLB even if the IOMMU is not enabled. This will lead a lots of unnecessary transactions between vhost and Qemu and will introduce a huge drop of the performance. For PCI, an ideal approach is to use pci_device_iommu_address_space() just for get_dma_as. But Qemu may choose to initialize the IOMMU after the virtio-pci which lead a wrong address space is returned during device plugged. So this patch switch to use transport specific way via iommu_enabled() to detect the IOMMU during vhost start. In this case, we are fine since we know the IOMMU is initialized correctly. Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20210804034803.1644-4-jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-09-04 16:35:17 -04:00
Tiberiu Georgescu	9b1d929adb	hw/virtio: move vhost_set_backend_type() to vhost.c Just a small refactor patch. vhost_set_backend_type() gets called only in vhost.c, so we can move the function there and make it static. We can then extern the visibility of kernel_ops, to match the other VhostOps in vhost-backend.h. The VhostOps constants now make more sense in vhost.h Suggested-by: Raphael Norwitz <raphael.norwitz@nutanix.com> Signed-off-by: Tiberiu Georgescu <tiberiu.georgescu@nutanix.com> Message-Id: <20210809134015.67941-1-tiberiu.georgescu@nutanix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-09-04 09:07:46 -04:00
Markus Armbruster	998647dc8f	vhost: Clean up how VhostOpts method vhost_backend_init() fails vhost_user_backend_init() can fail without setting an error. Unclean. Its caller vhost_dev_init() compensates by substituting a generic error then. Goes back to commit `28770ff935` "vhost: Distinguish errors in vhost_backend_init()". Clean up by moving the generic error from vhost_dev_init() to all the failure paths that neglect to set an error. Cc: Kevin Wolf <kwolf@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20210720125408.387910-14-armbru@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>	2021-08-26 17:15:28 +02:00
Markus Armbruster	66647ed459	vhost: Clean up how VhostOpts method vhost_get_config() fails vhost_user_get_config() can fail without setting an error. Unclean. Its caller vhost_dev_get_config() compensates by substituting a generic error then. Goes back to commit `50de51387f` "vhost: Distinguish errors in vhost_dev_get_config()". Clean up by moving the generic error from vhost_dev_get_config() to all the failure paths that neglect to set an error. Cc: Kevin Wolf <kwolf@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20210720125408.387910-13-armbru@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> [Sign of error_setg_errno()'s second argument fixed in both calls]	2021-08-26 17:15:28 +02:00
Markus Armbruster	436c831a28	migration: Unify failure check for migrate_add_blocker() Most callers check the return value. Some check whether it set an error. Functionally equivalent, but the former tends to be easier on the eyes, so do that everywhere. Prior art: commit `c6ecec43b2` "qemu-option: Check return value instead of @err where convenient". Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20210720125408.387910-10-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com>	2021-08-26 17:15:28 +02:00
Kevin Wolf	50de51387f	vhost: Distinguish errors in vhost_dev_get_config() Instead of just returning 0/-1 and letting the caller make up a meaningless error message, add an Error parameter to allow reporting the real error and switch to 0/-errno so that different kind of errors can be distinguished in the caller. config_len in vhost_user_get_config() is defined by the device, so if it's larger than VHOST_USER_MAX_CONFIG_SIZE, this is a programming error. Turn the corresponding check into an assertion. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20210609154658.350308-6-kwolf@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2021-06-30 13:18:42 +02:00
Kevin Wolf	f2a6e6c4fa	vhost: Return 0/-errno in vhost_dev_init() Instead of just returning 0/-1 and letting the caller make up a meaningless error message, switch to 0/-errno so that different kinds of errors can be distinguished in the caller. This involves changing a few more callbacks in VhostOps to return 0/-errno: .vhost_set_owner(), .vhost_get_features() and .vhost_virtqueue_set_busyloop_timeout(). The implementations of these functions are trivial as they generally just send a message to the backend. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20210609154658.350308-4-kwolf@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2021-06-30 13:16:05 +02:00

1 2 3 4 5

205 Commits