Don't assume having an in-kernel irqchip means that GSI
routing is enabled.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Avi Kivity <avi@redhat.com>
Decouple another x86-specific assumption about what irqchips imply.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Avi Kivity <avi@redhat.com>
Instead of assuming that we can use irqfds if and only if
kvm_irqchip_in_kernel(), add a bool to the KVMState which
indicates this, and is set only on x86 and only if the
irqchip is in the kernel.
The kernel documentation implies that the only thing
you need to use KVM_IRQFD is that KVM_CAP_IRQFD is
advertised, but this seems to be untrue. In particular
the kernel does not (alas) return a sensible error if you
try to set up an irqfd when you haven't created an irqchip.
If it did we could remove all this nonsense and let the
kernel return the error code.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
kvm_allows_irq0_override() is a totally x86 specific concept:
move it to the target-specific source file where it belongs.
This means we need a new header file for the prototype:
kvm_i386.h, in line with the existing kvm_ppc.h.
While we are moving it, fix the return type to be 'bool' rather
than 'int'.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Avi Kivity <avi@redhat.com>
Rename the function kvm_irqchip_set_irq() to kvm_set_irq(),
since it can be used for sending (asynchronous) interrupts whether
there is a full irqchip model in the kernel or not. (We don't
include 'async' in the function name since asynchronous is the
normal case.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Avi Kivity <avi@redhat.com>
On x86 userspace delivers interrupts to the kernel asynchronously
(and therefore VCPU idle management is done in the kernel) if and
only if there is an in-kernel irqchip. On other architectures this
isn't necessarily true (they may always send interrupts
asynchronously), so define a new kvm_async_interrupts_enabled()
function instead of misusing kvm_irqchip_in_kernel().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Avi Kivity <avi@redhat.com>
The code creating the symlink from linux-headers/asm to the
architecture specific linux-headers/asm-$arch directory was
implicitly hardcoding a list of KVM supporting architectures.
Add a default case for the common "Linux architecture name and
QEMU CPU name match" case, so future architectures will only
need to add code if they've managed to get mismatched names.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Add a helper function for fetching max cpus supported by kvm.
Make QEMU exit with an error message if smp_cpus exceeds limit
of VCPU count retrieved by invoking this helper function.
Signed-off-by: Dunrong Huang <riegamaths@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
* kwolf/for-anthony:
qemu-img: use QemuOpts instead of QEMUOptionParameter in resize function
qemu-iotests: Be more flexible with image creation options
qemu-iotests: add 039 qcow2 lazy refcounts test
qemu-io: add "abort" command to simulate program crash
qcow2: implement lazy refcounts
qemu-iotests: ignore qemu-img create lazy_refcounts output
docs: add lazy refcounts bit to qcow2 specification
qcow2: introduce dirty bit
docs: add dirty bit to qcow2 specification
qemu-iotests: add qed.py image manipulation utility
qapi: generalize documentation of streaming commands
ide scsi: Mess with geometry only for hard disk devices
Commit 5931065907 is incomplete,
we'll arrive in the scsi command complete callback in CSW state
and must handle that case correctly.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
in_addr_t isn't available on mingw32. Just use an unsigned long instead. I
considered typedef'ing in_addr_t on mingw32 but this would potentially be
brittle if mingw32 did introduce the type.
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qemu-iotests already filters out image creation options that may be
present or not in order to get the same output in both cases. However,
often it only considers the default value of the option. Cover all valid
values instead so that ./check -o name=value can be used successfull for
all of them.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This tests establishes the basic post-conditions of the qcow2 lazy
refcounts features:
1. If the image was closed normally, it is marked clean.
2. If an allocating write was performed and the image was not closed
normally, then it is marked dirty.
a. Written data can be read back successfully.
b. The image file can be repaired and will be marked clean again.
c. The image file is automatically repaired when opened read/write.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Avoiding data loss and corruption is the top requirement for image file
formats. The qemu-io "abort" command makes it possible to simulate
program crashes and does not give the image format a chance to cleanly
shut down. This command is useful for data integrity test cases.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Lazy refcounts is a performance optimization for qcow2 that postpones
refcount metadata updates and instead marks the image dirty. In the
case of crash or power failure the image will be left in a dirty state
and repaired next time it is opened.
Reducing metadata I/O is important for cache=writethrough and
cache=directsync because these modes guarantee that data is on disk
after each write (hence we cannot take advantage of caching updates in
RAM). Refcount metadata is not needed for guest->file block address
translation and therefore does not need to be on-disk at the time of
write completion - this is the motivation behind the lazy refcount
optimization.
The lazy refcount optimization must be enabled at image creation time:
qemu-img create -f qcow2 -o compat=1.1,lazy_refcounts=on a.qcow2 10G
qemu-system-x86_64 -drive if=virtio,file=a.qcow2,cache=writethrough
Update qemu-iotests 031 and 036 since the extension header size changes
when we add feature bit table entries.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Hide the default lazy_refcounts=off output from qemu-img like we do with
other image creation options. This ensures that existing golden outputs
continue to pass despite the new option that has been added.
Note that this patch applies before the one that actually introduces the
lazy_refcounts=on|off option. This ensures git-bisect(1) continues to
work.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The lazy refcounts bit indicates that this image can take advantage of
the dirty bit and that refcount updates can be postponed.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch adds an incompatible feature bit to mark images that have not
been closed cleanly. When a dirty image file is opened a consistency
check and repair is performed.
Update qemu-iotests 031 and 036 since the extension header size changes
when we add feature bit table entries.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The dirty bit will make it possible to perform lazy refcount updates,
where the image file is not kept consistent all the time. Upon opening
a dirty image file, it is necessary to perform a consistency check and
repair any incorrect refcounts.
Therefore the dirty bit must be an incompatible feature bit. We don't
want old programs accessing a file with stale refcounts.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The qed.py utility can inspect and manipulate QED image files. It can
be used for testing to see the state of image metadata and also to
inject corruptions into the image file. It also has a scrubbing feature
to copy just the metadata out of an image file, allowing users to share
broken image files without revealing data in bug reports.
This has lived in my local repo for a long time but could be useful
to others.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Talk about background operations in general, rather than specifically
about streaming.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Legacy -drive cyls=... are now ignored completely when the drive
doesn't back a hard disk device. Before, they were first checked
against a hard disk's limits, then ignored.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Commit b1f416aa8d breaks vhost_net
because it always registers the virtio_pci_host_notifier_read() handler
function on the ioeventfd, even when vhost_net.ko is using the ioeventfd.
The result is both QEMU and vhost_net.ko polling on the same eventfd
and the virtio_net.ko guest driver seeing inconsistent results:
# ifconfig eth0 192.168.0.1 netmask 255.255.255.0
virtio_net virtio0: output:id 0 is not a head!
To fix this, proceed the same as we do for irqfd: add a parameter to
virtio_queue_set_host_notifier_fd_handler and in that case only set
the notifier, not the handler.
Cc: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Tested-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
* 'axp-next' of git://repo.or.cz/qemu/rth:
alpha-linux-user: Fix the getpriority syscall
alpha-linux-user: Properly handle the non-rt sigprocmask syscall.
alpha-linux-user: Fix a3 error return with v0 error bypass.
linux-user: Translate pipe2 flags; add to strace
linux-user: Allocate the right amount of space for non-fixed file maps
linux-user: Handle O_SYNC, O_NOATIME, O_CLOEXEC, O_PATH
linux-user: Sync fcntl.h bits with the kernel
alpha-linux-user: Handle TARGET_SSI_IEEE_RAISE_EXCEPTION properly
alpha-linux-user: Work around hosted mmap allocation problems
alpha-linux-user: Fix signal handling
Alpha uses unbiased priority values in the syscall, with the a3
return value signaling error conditions. Therefore, properly
interpret the libc getpriority as needed for the guest rather
than passing the host value through unchanged.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Name the syscall properly for QEMU, kernel source notwithstanding.
Fix syntax errors in the code thus enabled within do_syscall.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
If we let the kernel handle the implementation of mmap_find_vma,
via an anon mmap, we must use the size as indicated by the user
and not the size truncated to the filesize.
This happens often in ld.so, where we initially mmap the file to
the size of the text+data+bss to reserve an area, then mmap+fixed
over the top to properly handle data and bss.
Signed-off-by: Richard Henderson <rth@twiddle.net>
For each target, only define the bits that appear in
arch/target/include/asm/fcntl.h. Mirror the kernel's
asm-generic layout by handling anything undefined afterward.
Signed-off-by: Richard Henderson <rth@twiddle.net>
We weren't aggregating the exceptions, nor raising signals properly.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Proper signal numbers were not defined, and EXCP_INTERRUPT
was unhandled, leading to all sorts of subtle confusion.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Qualifier 'volatile' is not useful for applications, it's too strict
for single threaded code but does not give the real atomicity guarantees
needed for multithreaded code.
Drop them and now useless casts.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
The Xen 4.1 probe never uses the return value from xc_interface_open(),
so was provoking a compiler warning on newer gcc. Fix by not bothering
to put the return value anywhere.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
The Xen compile checks are currently run inside subshells. This
is unnecessary and has the effect that if do_cc() exits with
an error message then this only causes the subshell to exit,
not the whole of configure, which is confusing. Remove the
subshells, changing:
if ( cat ; compile_prog ) ; then ...
to
if cat && compile_prog ; then ...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Commit 0f66998 makes -enable-fips conditional on Linux hosts but then uses it
unconditionally in vl.c.
Fix this by moving the fips handling to os-posix.c and adding a condition.
Cc: Paul Moore <pmoore@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Changes so translation of remote address to the host's ip address in
the virtual network happens for all addresses in the 127.0.0.0/8
network, not just 127.0.0.1.
This fixes so that hostfwd bound to addresses such as 127.0.0.2 works.
Signed-off-by: Anders Waldenborg <anders@0x63.nu>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
* bonzini/scsi-next:
scsi: add support for ATA_PASSTHROUGH_xx scsi command
esp: add missing const on TypeInfo structures
esp: enable for all PCI machines
Revert "megasas: disable due to build breakage"
megasas: static SAS addresses
scsi-disk: fix compilation with DEBUG_SCSI
megasas: Update function megasys_scsi_uninit
SCSI: STARTSTOPUNIT only eject/load media if powercondition is 0
SCSI: Update the sense code for PREVENT REMOVAL errors
Correct the command names of opcode 0x85 and 0xa1, and calculate
their xfer size from CDB.
Signed-off-by: Cong Meng <mc@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
FIPS 140-2 requires disabling certain ciphers, including DES, which is used
by VNC to obscure passwords when they are sent over the network. The
solution for FIPS users is to disable the use of VNC password auth when the
host system is operating in FIPS compliance mode and the user has specified
'-enable-fips' on the QEMU command line.
This patch causes QEMU to emit a message to stderr when the host system is
running in FIPS mode and a VNC password was specified on the commend line.
If the system is not running in FIPS mode, or is running in FIPS mode but
VNC password authentication was not requested, QEMU operates normally.
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
* qmp/queue/qmp:
hmp: show the backing file depth
block: Use bdrv_get_backing_file_depth()
block: create bdrv_get_backing_file_depth()
qapi: qapi.py: allow the "'" character to be escaped
* afaerber-or/qom-cpu-4:
cpu: Move thread_kicked to CPUState
cpu: Move thread field into CPUState
cpu: Move CPU_COMMON_THREAD into CPUState
qemu-thread: Let qemu_thread_is_self() return bool