This partially reverts commit bbd2d5a812.
This commit was misguided and broke using --disable-pie on any distro
that enables PIE by default in their compiler driver, including Debian
and its derivatives. Whilst -no-pie is not a linker flag, it is a
compiler driver flag that ensures -pie is not automatically passed by it
to the linker. Without it, all compile_prog checks will fail as any code
built with the explicit -fno-pie will fail to link with the implicit
default -pie due to trying to use position-dependent relocations. The
only bug that needed fixing was LDFLAGS_NOPIE being used as a flag for
the linker itself in pc-bios/optionrom/Makefile.
Note this does not reinstate exporting LDFLAGS_NOPIE, as it is unused,
since the only previous use was the one that should not have existed. I
have also updated the comment for the -fno-pie and -no-pie checks to
reflect what they're actually needed for.
Fixes: bbd2d5a812
Cc: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jessica Clarke <jrtc27@jrtc27.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Michael Roth <michael.roth@amd.com>
The CDE desktop on HP-UX 10 shows wrongly rendered pixels when the local screen
menu is closed. This bug was introduced by commit c7050f3f16
("hw/display/artist: Refactor x/y coordination extraction") which converted the
coordinate extraction in artist_vram_read() and artist_vram_write() to use the
ADDR_TO_X and ADDR_TO_Y macros, but forgot to right-shift the address by 2 as
it was done before.
Signed-off-by: Helge Deller <deller@gmx.de>
Fixes: c7050f3f16 ("hw/display/artist: Refactor x/y coordination extraction")
Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Sven Schnelle <svens@stackframe.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <YK1aPb8keur9W7h2@ls3530>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 01f750f5fe)
Signed-off-by: Michael Roth <michael.roth@amd.com>
We end up not copying the mmap_addr of all existing regions, resulting
in a SEGFAULT once we actually try to map/access anything within our
memory regions.
Fixes: 875b9fd97b ("Support individual region unmap in libvhost-user")
Cc: qemu-stable@nongnu.org
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Raphael Norwitz <raphael.norwitz@nutanix.com>
Cc: "Marc-André Lureau" <marcandre.lureau@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Coiby Xu <coiby.xu@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20211011201047.62587-1-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 6889eb2d43)
Signed-off-by: Michael Roth <michael.roth@amd.com>
In case of device resume after suspend, VQ notifier MR still valid.
Duplicated registrations explode memory block list and slow down device
resume.
Fixes: 44866521bd ("vhost-user: support registering external host notifiers")
Cc: tiwei.bie@intel.com
Cc: qemu-stable@nongnu.org
Cc: Yuwei Zhang <zhangyuwei.9149@bytedance.com>
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Message-Id: <20211008080215.590292-1-xuemingl@nvidia.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit a1ed9ef1de)
Signed-off-by: Michael Roth <michael.roth@amd.com>
If 'virgl_cmd_get_capset' set 'max_size' to 0,
the 'virgl_renderer_fill_caps' will write the data after the 'resp'.
This patch avoid this by checking the returned 'max_size'.
virtio-gpu fix: abd7f08b23 ("display: virtio-gpu-3d: check
virgl capabilities max_size")
Fixes: CVE-2021-3546
Reported-by: Li Qiang <liq3ea@163.com>
Reviewed-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Li Qiang <liq3ea@163.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20210516030403.107723-8-liq3ea@163.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 9f22893adc)
Signed-off-by: Michael Roth <michael.roth@amd.com>
The 'res->iov' will be leaked if the guest trigger following sequences:
virgl_cmd_create_resource_2d
virgl_resource_attach_backing
virgl_cmd_resource_unref
This patch fixes this.
Fixes: CVE-2021-3544
Reported-by: Li Qiang <liq3ea@163.com>
virtio-gpu fix: 5e8e3c4c75 ("virtio-gpu: fix resource leak
in virgl_cmd_resource_unref"
Signed-off-by: Li Qiang <liq3ea@163.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20210516030403.107723-6-liq3ea@163.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit f6091d86ba)
Signed-off-by: Michael Roth <michael.roth@amd.com>
If the guest trigger following sequences, the attach_backing will be leaked:
vg_resource_create_2d
vg_resource_attach_backing
vg_resource_unref
This patch fix this by freeing 'res->iov' in vg_resource_destroy.
Fixes: CVE-2021-3544
Reported-by: Li Qiang <liq3ea@163.com>
virtio-gpu fix: 5e8e3c4c75 ("virtio-gpu: fix resource leak
in virgl_cmd_resource_unref")
Reviewed-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Li Qiang <liq3ea@163.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20210516030403.107723-5-liq3ea@163.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit b7afebcf9e)
Signed-off-by: Michael Roth <michael.roth@amd.com>
Otherwise some of the 'resp' will be leaked to guest.
Fixes: CVE-2021-3545
Reported-by: Li Qiang <liq3ea@163.com>
virtio-gpu fix: 42a8dadc74 ("virtio-gpu: fix information leak
in getting capset info dispatch")
Signed-off-by: Li Qiang <liq3ea@163.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20210516030403.107723-2-liq3ea@163.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 121841b25d)
Signed-off-by: Michael Roth <michael.roth@amd.com>
usb-host and usb-redirect try to batch bulk transfers by combining many
small usb packets into a single, large transfer request, to reduce the
overhead and improve performance.
This patch adds a size limit of 1 MiB for those combined packets to
restrict the host resources the guest can bind that way.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-Id: <20210503132915.2335822-6-kraxel@redhat.com>
(cherry picked from commit 05a40b172e)
Signed-off-by: Michael Roth <michael.roth@amd.com>
Use autofree heap allocation instead.
Fixes: 4f4321c11f ("usb: use iovecs in USBPacket")
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210503132915.2335822-3-kraxel@redhat.com>
(cherry picked from commit 7ec54f9eb6)
Signed-off-by: Michael Roth <michael.roth@amd.com>
The device uses the guest-supplied stream number unchecked, which can
lead to guest-triggered out-of-band access to the UASDevice->data3 and
UASDevice->status3 fields. Add the missing checks.
Fixes: CVE-2021-3713
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reported-by: Chen Zhe <chenzhe@huawei.com>
Reported-by: Tan Jingguo <tanjingguo@huawei.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210818120505.1258262-2-kraxel@redhat.com>
(cherry picked from commit 13b250b12a)
Signed-off-by: Michael Roth <michael.roth@amd.com>
Apparently, we don't have to duplicate the string.
Fixes: 722a3c783e ("virtio-pci: Send qapi events when the virtio-mem size changes")
Cc: qemu-stable@nongnu.org
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20210929162445.64060-2-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 75b98cb9f6)
Signed-off-by: Michael Roth <michael.roth@amd.com>
HMP command "change vnc" can take the password as argument, or prompt
for it:
(qemu) change vnc password 123
(qemu) change vnc password
Password: ***
(qemu)
This regressed in commit cfb5387a1d "hmp: remove "change vnc TARGET"
command", v6.0.0.
(qemu) change vnc passwd 123
Password: ***
(qemu) change vnc passwd
(qemu)
The latter passes NULL to qmp_change_vnc_password(), which is a no-no.
Looks like it puts the display into "password required, but none set"
state.
The logic error is easy to miss in review, but testing should've
caught it.
Fix the obvious way.
Fixes: cfb5387a1d
Cc: qemu-stable@nongnu.org
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Message-Id: <20210909081219.308065-2-armbru@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
(cherry picked from commit 6193344f93)
Signed-off-by: Michael Roth <michael.roth@amd.com>
Both qemu and qemu-img use writeback cache mode by default, which is
already documented in qemu(1). qemu-nbd uses writethrough cache mode by
default, and the default cache mode is not documented.
According to the qemu-nbd(8):
--cache=CACHE
The cache mode to be used with the file. See the
documentation of the emulator's -drive cache=... option for
allowed values.
qemu(1) says:
The default mode is cache=writeback.
So users have no reason to assume that qemu-nbd is using writethough
cache mode. The only hint is the painfully slow writing when using the
defaults.
Looking in git history, it seems that qemu used writethrough in the past
to support broken guests that did not flush data properly, or could not
flush due to limitations in qemu. But qemu-nbd clients can use
NBD_CMD_FLUSH to flush data, so using writethrough does not help anyone.
Change the default cache mode to writback, and document the default and
available values properly in the online help and manual.
With this change converting image via qemu-nbd is 3.5 times faster.
$ qemu-img create dst.img 50g
$ qemu-nbd -t -f raw -k /tmp/nbd.sock dst.img
Before this change:
$ hyperfine -r3 "./qemu-img convert -p -f raw -O raw -T none -W fedora34.img nbd+unix:///?socket=/tmp/nbd.sock"
Benchmark #1: ./qemu-img convert -p -f raw -O raw -T none -W fedora34.img nbd+unix:///?socket=/tmp/nbd.sock
Time (mean ± σ): 83.639 s ± 5.970 s [User: 2.733 s, System: 6.112 s]
Range (min … max): 76.749 s … 87.245 s 3 runs
After this change:
$ hyperfine -r3 "./qemu-img convert -p -f raw -O raw -T none -W fedora34.img nbd+unix:///?socket=/tmp/nbd.sock"
Benchmark #1: ./qemu-img convert -p -f raw -O raw -T none -W fedora34.img nbd+unix:///?socket=/tmp/nbd.sock
Time (mean ± σ): 23.522 s ± 0.433 s [User: 2.083 s, System: 5.475 s]
Range (min … max): 23.234 s … 24.019 s 3 runs
Users can avoid the issue by using --cache=writeback[1] but the defaults
should give good performance for the common use case.
[1] https://bugzilla.redhat.com/1990656
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Message-Id: <20210813205519.50518-1-nsoffer@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 0961525705)
Signed-off-by: Michael Roth <michael.roth@amd.com>
When mergeable buffer is enabled, we try to set the num_buffers after
the virtqueue elem has been unmapped. This will lead several issues,
E.g a use after free when the descriptor has an address which belongs
to the non direct access region. In this case we use bounce buffer
that is allocated during address_space_map() and freed during
address_space_unmap().
Fixing this by storing the elems temporarily in an array and delay the
unmap after we set the the num_buffers.
This addresses CVE-2021-3748.
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Fixes: fbe78f4f55 ("virtio-net support")
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit bedd7e93d0)
Signed-off-by: Michael Roth <michael.roth@amd.com>
Currently all of the M-profile specific code in arm_cpu_reset() is
inside a !defined(CONFIG_USER_ONLY) ifdef block. This is
unintentional: it happened because originally the only
M-profile-specific handling was the setup of the initial SP and PC
from the vector table, which is system-emulation only. But then we
added a lot of other M-profile setup to the same "if (ARM_FEATURE_M)"
code block without noticing that it was all inside a not-user-mode
ifdef. This has generally been harmless, but with the addition of
v8.1M low-overhead-loop support we ran into a problem: the reset of
FPSCR.LTPSIZE to 4 was only being done for system emulation mode, so
if a user-mode guest tried to execute the LE instruction it would
incorrectly take a UsageFault.
Adjust the ifdefs so only the really system-emulation specific parts
are covered. Because this means we now run some reset code that sets
up initial values in the FPCCR and similar FPU related registers,
explicitly set up the registers controlling FPU context handling in
user-emulation mode so that the FPU works by design and not by
chance.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/613
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210914120725.24992-2-peter.maydell@linaro.org
(cherry picked from commit b62ceeaf80)
Signed-off-by: Michael Roth <michael.roth@amd.com>
The audio migration vmstate is empty, and always has been; we can't
just remove it though because an old qemu might send it us.
Changes with -audiodev now mean it's sometimes created when it didn't
used to be, and can confuse migration to old qemu.
Change it so that vmstate_audio is never sent; if it's received it
should still be accepted, and old qemu's shouldn't be too upset if it's
missing.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Tested-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210809170956.78536-1-dgilbert@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit da77adbaf6)
Signed-off-by: Michael Roth <michael.roth@amd.com>
OSS-Fuzz found sending illegal addresses when querying the write
protection bits triggers the assertion added in commit 84816fb63e
("hw/sd/sdcard: Assert if accessing an illegal group"):
qemu-fuzz-i386-target-generic-fuzz-sdhci-v3: ../hw/sd/sd.c:824: uint32_t sd_wpbits(SDState *, uint64_t):
Assertion `wpnum < sd->wpgrps_size' failed.
#3 0x7f62a8b22c91 in __assert_fail
#4 0x5569adcec405 in sd_wpbits hw/sd/sd.c:824:9
#5 0x5569adce5f6d in sd_normal_command hw/sd/sd.c:1389:38
#6 0x5569adce3870 in sd_do_command hw/sd/sd.c:1737:17
#7 0x5569adcf1566 in sdbus_do_command hw/sd/core.c💯16
#8 0x5569adcfc192 in sdhci_send_command hw/sd/sdhci.c:337:12
#9 0x5569adcfa3a3 in sdhci_write hw/sd/sdhci.c:1186:9
#10 0x5569adfb3447 in memory_region_write_accessor softmmu/memory.c:492:5
It is legal for the CMD30 to query for out-of-range addresses.
Such invalid addresses are simply ignored in the response (write
protection bits set to 0).
In commit 84816fb63e ("hw/sd/sdcard: Assert if accessing an illegal
group") we misplaced the assertion *before* we test the address is
in range. Move it *after*.
Include the qtest reproducer provided by Alexander Bulekov:
$ make check-qtest-i386
...
Running test qtest-i386/fuzz-sdcard-test
qemu-system-i386: ../hw/sd/sd.c:824: sd_wpbits: Assertion `wpnum < sd->wpgrps_size' failed.
Cc: qemu-stable@nongnu.org
Reported-by: OSS-Fuzz (Issue 29225)
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Fixes: 84816fb63e ("hw/sd/sdcard: Assert if accessing an illegal group")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/495
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210802235524.3417739-3-f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
(cherry picked from commit 4ac0b72bae)
*drop fuzz test additions, since sdcard fuzz test has functional
dependency on guest-visible change not flagged for stable:
59b63d78 ("hw/sd/sdcard: Check for valid address range in SEND_WRITE_PROT (CMD30)")
Signed-off-by: Michael Roth <michael.roth@amd.com>
Per the 'Physical Layer Simplified Specification Version 3.01',
Table 4-22: 'Block Oriented Write Protection Commands'
SEND_WRITE_PROT (CMD30)
If the card provides write protection features, this command asks
the card to send the status of the write protection bits [1].
[1] 32 write protection bits (representing 32 write protect groups
starting at the specified address) [...]
The last (least significant) bit of the protection bits corresponds
to the first addressed group. If the addresses of the last groups
are outside the valid range, then the corresponding write protection
bits shall be set to 0.
Split the if() statement (without changing the behaviour of the code)
to better position the description comment.
Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210802235524.3417739-2-f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
(cherry picked from commit 2a0396285d)
*context dependency for 4ac0b72bae ("hw/sd/sdcard: Fix assertion accessing out-of-range addresses with CMD30")
Signed-off-by: Michael Roth <michael.roth@amd.com>
Problem reported by openEuler fuzz-sig group.
The buff2frame_bas function (hw\net\can\can_sja1000.c)
infoleak(qemu5.x~qemu6.x) or stack-overflow(qemu 4.x).
Reported-by: Qiang Ning <ningqiang1@huawei.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Pavel Pisa <pisa@cmp.felk.cvut.cz>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 11744862f2)
Signed-off-by: Michael Roth <michael.roth@amd.com>
Postcopy never worked properly with 'free-page-hint=on', as there are
at least two issues:
1) With postcopy, the guest will never receive a VIRTIO_BALLOON_CMD_ID_DONE
and consequently won't release free pages back to the OS once
migration finishes.
The issue is that for postcopy, we won't do a final bitmap sync while
the guest is stopped on the source and
virtio_balloon_free_page_hint_notify() will only call
virtio_balloon_free_page_done() on the source during
PRECOPY_NOTIFY_CLEANUP, after the VM state was already migrated to
the destination.
2) Once the VM touches a page on the destination that has been excluded
from migration on the source via qemu_guest_free_page_hint() while
postcopy is active, that thread will stall until postcopy finishes
and all threads are woken up. (with older Linux kernels that won't
retry faults when woken up via userfaultfd, we might actually get a
SEGFAULT)
The issue is that the source will refuse to migrate any pages that
are not marked as dirty in the dirty bmap -- for example, because the
page might just have been sent. Consequently, the faulting thread will
stall, waiting for the page to be migrated -- which could take quite
a while and result in guest OS issues.
While we could fix 1) comparatively easily, 2) is harder to get right and
might require more involved RAM migration changes on source and destination
[1].
As it never worked properly, let's not start free page hinting in the
precopy notifier if the postcopy migration capability was enabled to fix
it easily. Capabilities cannot be enabled once migration is already
running.
Note 1: in the future we might either adjust migration code on the source
to track pages that have actually been sent or adjust
migration code on source and destination to eventually send
pages multiple times from the source and and deal with pages
that are sent multiple times on the destination.
Note 2: virtio-mem has similar issues, however, access to "unplugged"
memory by the guest is very rare and we would have to be very
lucky for it to happen during migration. The spec states
"The driver SHOULD NOT read from unplugged memory blocks ..."
and "The driver MUST NOT write to unplugged memory blocks".
virtio-mem will move away from virtio_balloon_free_page_done()
soon and handle this case explicitly on the destination.
[1] https://lkml.kernel.org/r/e79fd18c-aa62-c1d8-c7f3-ba3fc2c25fc8@redhat.com
Fixes: c13c4153f7 ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
Cc: qemu-stable@nongnu.org
Cc: Wei Wang <wei.w.wang@intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
Cc: Alexander Duyck <alexander.duyck@gmail.com>
Cc: Juan Quintela <quintela@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210708095339.20274-2-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
(cherry picked from commit fd51e54fa1)
Signed-off-by: Michael Roth <michael.roth@amd.com>
Jakub noticed[1] that, when using pin-based interrupts, the device will
unconditionally deasssert when any CQEs are acknowledged. However, the
pin should not be deasserted if other completion queues still holds
unacknowledged CQEs.
The bug is an artifact of commit ca247d3509 ("hw/block/nvme: fix
pin-based interrupt behavior") which fixed one bug but introduced
another. This is the third time someone tries to fix pin-based
interrupts (see commit 5e9aa92eb1 ("hw/block: Fix pin-based interrupt
behaviour of NVMe"))...
Third time's the charm, so fix it, again, by keeping track of how many
CQs have unacknowledged CQEs and only deassert when all are cleared.
[1]: <20210610114624.304681-1-jakub.jermar@kernkonzept.com>
Cc: qemu-stable@nongnu.org
Fixes: ca247d3509 ("hw/block/nvme: fix pin-based interrupt behavior")
Reported-by: Jakub Jermář <jakub.jermar@kernkonzept.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
(cherry picked from commit 83d7ed5c57)
*avoid dependency on 88eea45c ("hw/nvme: move nvme emulation out of hw/block")
Signed-off-by: Michael Roth <michael.roth@amd.com>
Qiang Liu reported that an access on an unknown address is triggered in
memory_region_set_enabled because a check on CAP.PMRS is missing for the
PMRCTL register write when no PMR is configured.
Cc: qemu-stable@nongnu.org
Fixes: 75c3c9de96 ("hw/block/nvme: disable PMR at boot up")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/362
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
(cherry picked from commit 2b02aabc9d)
Signed-off-by: Michael Roth <michael.roth@amd.com>
While QEMU coding style prefers lowercase hexadecimals in constants, the
NVMe subsystem uses the format from the NVMe specifications in comments,
i.e. 'h' suffix instead of '0x' prefix.
Fix this up across the code base.
Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
[k.jensen: updated message; added conversion in a couple of missing comments]
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
(cherry picked from commit 312c3531bb)
*context dependency for 2b02aabc9d
Signed-off-by: Michael Roth <michael.roth@amd.com>
Commit [1] moved _SUN variable from only hot-pluggable to
all devices. This made linux kernel enumerate extra slots
that weren't present before. If extra slot happens to be
be enumerated first and there is a device in th same slot
but on other bridge, linux kernel will add -N suffix to
slot name of the later, thus changing NIC name compared to
QEMU 5.2. This in some case confuses systemd, if it is
using SLOT NIC naming scheme and interface name becomes
not the same as it was under QEMU-5.2.
Reproducer QEMU CLI:
-M pc-i440fx-5.2 -nodefaults \
-device pci-bridge,chassis_nr=1,id=pci.1,bus=pci.0,addr=0x3 \
-device virtio-net-pci,id=nic1,bus=pci.1,addr=0x1 \
-device virtio-net-pci,id=nic2,bus=pci.1,addr=0x2 \
-device virtio-net-pci,id=nic3,bus=pci.1,addr=0x3
with RHEL8 guest produces following results:
v5.2:
kernel: virtio_net virtio0 ens1: renamed from eth0
kernel: virtio_net virtio2 ens3: renamed from eth2
kernel: virtio_net virtio1 enp1s2: renamed from eth1
(slot 2 is assigned to empty bus 0 slot and virtio1
is assigned to 2-2 slot, and renaming falls back,
for some reason, to path based naming scheme)
v6.0:
kernel: virtio_net virtio0 ens1: renamed from eth0
kernel: virtio_net virtio2 ens3: renamed from eth2
systemd-udevd[299]: Error changing net interface name 'eth1' to 'ens3': File exists
systemd-udevd[299]: could not rename interface '3' from 'eth1' to 'ens3': File exists
(with commit [1] kernel assigns virtio2 to 3-2 slot
since bridge advertises _SUN=0x3 and kernel assigns
slot 3 to bridge. Still it manages to rename virtio2
correctly to ens3, however systemd gets confused with virtio1
where slot allocation exactly the same (2-2) as in 5.2 case
and tries to rename it to ens3 which is rightfully taken by
virtio2)
I'm not sure what breaks in systemd interface renaming (it probably
should be investigated), but on QEMU side we can safely revert
_SUN to 5.2 behavior (i.e. avoid cold-plugged bridges and non
hot-pluggable device classes), without breaking acpi-index, which uses
slot numbers but it doesn't have to use _SUN, it could use an arbitrary
variable name that has the same slot value).
It will help existing VMs to keep networking with non trivial
configs in working order since systemd will do its interface
renaming magic as it used to do.
1)
Fixes: b7f23f62e4 (pci: acpi: add _DSM method to PCI devices)
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20210624204229.998824-3-imammedo@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: John Sucaet <john.sucaet@ekinops.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 7193d7cdd9)
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20210624204229.998824-2-imammedo@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: John Sucaet <john.sucaet@ekinops.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit a4344574fd)
Signed-off-by: Michael Roth <michael.roth@amd.com>
After yank feature was introduced in migration, whenever migration
is started using TLS, the following error happens in both source and
destination hosts:
(qemu) qemu-kvm: ../util/yank.c:107: yank_unregister_instance:
Assertion `QLIST_EMPTY(&entry->yankfns)' failed.
This happens because of a missing yank_unregister_function() when using
qio-channel-tls.
Fix this by also allowing TYPE_QIO_CHANNEL_TLS object type to perform
yank_unregister_function() in channel_close() and multifd_load_cleanup().
Also, inside migration_channel_connect() and
migration_channel_process_incoming() move yank_register_function() so
it only runs once on a TLS migration.
Fixes: b5eea99ec2 ("migration: Add yank feature", 2021-01-13)
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1964326
Signed-off-by: Leonardo Bras <leobras.c@gmail.com>
Reviewed-by: Lukas Straub <lukasstraub2@web.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Changes since v2:
- Dropped all references to ioc->master
- yank_register_function() and yank_unregister_function() now only run
once in a TLS migration.
Changes since v1:
- Cast p->c to QIOChannelTLS into multifd_load_cleanup()
Message-Id: <20210601054030.1153249-1-leobras.c@gmail.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
(cherry picked from commit 7de2e85653)
Signed-off-by: Michael Roth <michael.roth@amd.com>
Code consuming the "crypto/tlscreds*.h" APIs doesn't need
to access its internals. Move the structure definitions to
the "tlscredspriv.h" private header (only accessible by
implementations). The public headers (in include/) still
forward-declare the structures typedef.
Note, tlscreds.c and 3 of the 5 modified source files already
include "tlscredspriv.h", so only add it to tls-cipher-suites.c
and tlssession.c.
Removing the internals from the public header solves a bug
introduced by commit 7de2e85653 ("yank: Unregister function
when using TLS migration") which made migration/qemu-file-channel.c
include "io/channel-tls.h", itself sometime depends on GNUTLS,
leading to a build failure on OSX:
[2/35] Compiling C object libmigration.fa.p/migration_qemu-file-channel.c.o
FAILED: libmigration.fa.p/migration_qemu-file-channel.c.o
cc -Ilibmigration.fa.p -I. -I.. -Iqapi [ ... ] -o libmigration.fa.p/migration_qemu-file-channel.c.o -c ../migration/qemu-file-channel.c
In file included from ../migration/qemu-file-channel.c:29:
In file included from include/io/channel-tls.h:26:
In file included from include/crypto/tlssession.h:24:
include/crypto/tlscreds.h:28:10: fatal error: 'gnutls/gnutls.h' file not found
#include <gnutls/gnutls.h>
^~~~~~~~~~~~~~~~~
1 error generated.
Reported-by: Stefan Weil <sw@weilnetz.de>
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/407
Fixes: 7de2e85653 ("yank: Unregister function when using TLS migration")
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
(cherry picked from commit 678bcc3c2c)
Signed-off-by: Michael Roth <michael.roth@amd.com>
Avoid accessing QCryptoTLSCreds internals by using
the qcrypto_tls_creds_check_endpoint() helper.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
(cherry picked from commit 3c52bf0c60)
Signed-off-by: Michael Roth <michael.roth@amd.com>
Avoid accessing QCryptoTLSCreds internals by using
the qcrypto_tls_creds_check_endpoint() helper.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
(cherry picked from commit 5590f65fac)
Signed-off-by: Michael Roth <michael.roth@amd.com>
Avoid accessing QCryptoTLSCreds internals by using
the qcrypto_tls_creds_check_endpoint() helper.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
(cherry picked from commit 8612df2ebe)
Signed-off-by: Michael Roth <michael.roth@amd.com>
Avoid accessing QCryptoTLSCreds internals by using
the qcrypto_tls_creds_check_endpoint() helper.
Tested-by: Akihiko Odaki <akihiko.odaki@gmail.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
(cherry picked from commit 0279cd9535)
Signed-off-by: Michael Roth <michael.roth@amd.com>
Avoid accessing QCryptoTLSCreds internals by using
the qcrypto_tls_creds_check_endpoint() helper.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
(cherry picked from commit 7b3b616838)
Signed-off-by: Michael Roth <michael.roth@amd.com>
Introduce the qcrypto_tls_creds_check_endpoint() helper
to access QCryptoTLSCreds internal 'endpoint' field.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
(cherry picked from commit e9ac68083f)
Signed-off-by: Michael Roth <michael.roth@amd.com>
When the NVMe block driver was introduced (see commit bdd6a90a9e,
January 2018), Linux VFIO_IOMMU_MAP_DMA ioctl was only returning
-ENOMEM in case of error. The driver was correctly handling the
error path to recycle its volatile IOVA mappings.
To fix CVE-2019-3882, Linux commit 492855939bdb ("vfio/type1: Limit
DMA mappings per container", April 2019) added the -ENOSPC error to
signal the user exhausted the DMA mappings available for a container.
The block driver started to mis-behave:
qemu-system-x86_64: VFIO_MAP_DMA failed: No space left on device
(qemu)
(qemu) info status
VM status: paused (io-error)
(qemu) c
VFIO_MAP_DMA failed: No space left on device
(qemu) c
VFIO_MAP_DMA failed: No space left on device
(The VM is not resumable from here, hence stuck.)
Fix by handling the new -ENOSPC error (when DMA mappings are
exhausted) without any distinction to the current -ENOMEM error,
so we don't change the behavior on old kernels where the CVE-2019-3882
fix is not present.
An easy way to reproduce this bug is to restrict the DMA mapping
limit (65535 by default) when loading the VFIO IOMMU module:
# modprobe vfio_iommu_type1 dma_entry_limit=666
Cc: qemu-stable@nongnu.org
Cc: Fam Zheng <fam@euphon.net>
Cc: Maxim Levitsky <mlevitsk@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Reported-by: Michal Prívozník <mprivozn@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20210723195843.1032825-1-philmd@redhat.com
Fixes: bdd6a90a9e ("block: Add VFIO based NVMe driver")
Buglink: https://bugs.launchpad.net/qemu/+bug/1863333
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/65
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 15a730e7a3)
Signed-off-by: Michael Roth <michael.roth@amd.com>
libFuzzer triggered the following assertion:
cat << EOF | qemu-system-i386 -M pc-q35-5.0 \
-nographic -monitor none -serial none \
-qtest stdio -d guest_errors -trace pci\*
outl 0xcf8 0xf2000060
outl 0xcfc 0x8400056e
EOF
pci_cfg_write mch 00:0 @0x60 <- 0x8400056e
Aborted (core dumped)
This is because guest wrote MCH_HOST_BRIDGE_PCIEXBAR_LENGTH_RVD
(reserved value) to the PCIE XBAR register.
There is no indication on the datasheet about what occurs when
this value is written. Simply ignore it on QEMU (and report an
guest error):
pci_cfg_write mch 00:0 @0x60 <- 0x8400056e
Q35: Reserved PCIEXBAR LENGTH
pci_cfg_read mch 00:0 @0x0 -> 0x8086
pci_cfg_read mch 00:0 @0x0 -> 0x29c08086
...
Cc: qemu-stable@nongnu.org
Reported-by: Alexander Bulekov <alxndr@bu.edu>
BugLink: https://bugs.launchpad.net/qemu/+bug/1878641
Fixes: df2d8b3ed4 ("q35: Introduce q35 pc based chipset emulator")
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210526142438.281477-1-f4bug@amsat.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 9b0ca75e01)
Signed-off-by: Michael Roth <michael.roth@amd.com>
This function should have been updated for vector types
when they were introduced.
Fixes: d2fd745fe8
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/367
Cc: qemu-stable@nongnu.org
Tested-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit c1c091948a)
Signed-off-by: Michael Roth <michael.roth@amd.com>
We should not be aligning the offset in temp_allocate_frame,
because the odd offset produces an aligned address in the end.
Instead, pass the logical offset into tcg_set_frame and add
the stack bias last.
Cc: qemu-stable@nongnu.org
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 9defd1bdfb)
Signed-off-by: Michael Roth <michael.roth@amd.com>
Based on the description of error_setg(), the local variable err in
qemu_maybe_daemonize() should be initialized to NULL.
Without fix, the uninitialized *errp triggers assert failure which
doesn't show much valuable information.
Before the fix:
qemu-system-x86_64: ../util/error.c:59: error_setv: Assertion `*errp == NULL' failed.
After fix:
qemu-system-x86_64: cannot create PID file: Cannot open pid file: Permission denied
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-Id: <20210610084741.456260-1-zhenzhong.duan@intel.com>
Cc: qemu-stable@nongnu.org
Fixes: 0546c0609c ("vl: split various early command line options to a separate function", 2020-12-10)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 38f71349c7)
Signed-off-by: Michael Roth <michael.roth@amd.com>
In the vfio_migration_init(), the SaveVMHandler is registered for
VFIO device. But it lacks the operation of 'unregister'. It will
lead to 'Segmentation fault (core dumped)' in
qemu_savevm_state_setup(), if performing live migration after a
VFIO device is hot deleted.
Fixes: 7c2f5f75f9 (vfio: Register SaveVMHandlers for VFIO device)
Reported-by: Qixin Gan <ganqixin@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
Message-Id: <20210527123101.289-1-jiangkunkun@huawei.com>
Reviewed by: Kirti Wankhede <kwankhede@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
(cherry picked from commit 22fca190e2)
Signed-off-by: Michael Roth <michael.roth@amd.com>
Based on the description of error_setg(), the local variable err in
qemu_init_subsystems() should be initialized to NULL.
Fixes: efd7ab22fb ("vl: extract qemu_init_subsystems")
Cc: qemu-stable@nongnu.org
Signed-off-by: Peng Liang <liangpeng10@huawei.com>
Message-Id: <20210610131729.3906565-1-liangpeng10@huawei.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 6e1da3d305)
Signed-off-by: Michael Roth <michael.roth@amd.com>
When processing a command to select a target and send a CDB, the ESP device
maintains a sequence step register so that if an error occurs the host can
determine which part of the selection/CDB submission sequence failed.
The old Linux 2.6 driver is really pedantic here: it checks the sequence step
register even if a command succeeds and complains loudly on the console if the
sequence step register doesn't match the expected bus phase and interrupt flags.
This reason this mismatch occurs is because the ESP emulation currently doesn't
update the bus phase until the next TI (Transfer Information) command and so the
cleared sequence step register is considered invalid for the stale bus phase.
Normally this isn't an issue as the host only checks the sequence step register
if an error occurs but the old Linux 2.6 driver does this in several places
causing a large stream of "esp0: STEP_ASEL for tgt 0" messages to appear on the
console during the boot process.
Fix this by not clearing the sequence step register when reading the interrupt
register and clearing the DMA status, so the guest sees a valid sequence step
and bus phase combination at the end of the command phase. No other change is
required since the sequence step register is correctly updated throughout the
selection/CDB submission sequence once one of the select commands is issued.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Fixes: 1b9e48a5bd ("esp: implement non-DMA transfers in PDMA mode")
Message-Id: <20210518212511.21688-3-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit af947a3d85)
Signed-off-by: Michael Roth <michael.roth@amd.com>