17937 Commits

Author SHA1 Message Date
Avi Kivity
26a83ad0e7 memory: remove MemoryRegion::backend_registered
backend_registered was used to lazify the process of registering an
mmio region, since it is different for the I/O address space and
the memory address space.  However, it also makes registration dependent
on the region being visible in the address space.  This is not the case
for "fake" regions, like watchpoints or IO_MEM_UNASSIGNED.

Remove backend_registered and always initialize the region.  If it turns
out to be part of the I/O address space, we've wasted an I/O slot, but
that's not too bad.  In any case this will be optimized later on.

Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2012-01-04 13:34:49 +02:00
Avi Kivity
acbbec5d43 memory: move mmio access to functions
Currently mmio access goes directly to the io_mem_{read,write} arrays.
In preparation for eliminating them, add indirection via a function.

Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2012-01-04 13:34:49 +02:00
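The indirection described in the commit above can be pictured with a minimal sketch. All names here (`mmio_read_sketch`, `io_read_table`, the slot count) are hypothetical and chosen for illustration; they are not QEMU's actual definitions. The point is only that callers go through a function, so the backing tables can later be replaced without touching them.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical handler types, not QEMU's real ones. */
typedef uint32_t (*MmioReadSketch)(void *opaque, uint64_t addr);
typedef void (*MmioWriteSketch)(void *opaque, uint64_t addr, uint32_t val);

#define NUM_IO_SLOTS_SKETCH 4

/* The tables that callers used to index directly. */
static MmioReadSketch  io_read_table[NUM_IO_SLOTS_SKETCH];
static MmioWriteSketch io_write_table[NUM_IO_SLOTS_SKETCH];
static void           *io_opaque_table[NUM_IO_SLOTS_SKETCH];

/* Accessor functions: the only place that knows about the tables. */
static uint32_t mmio_read_sketch(int slot, uint64_t addr)
{
    assert(slot >= 0 && slot < NUM_IO_SLOTS_SKETCH && io_read_table[slot]);
    return io_read_table[slot](io_opaque_table[slot], addr);
}

static void mmio_write_sketch(int slot, uint64_t addr, uint32_t val)
{
    assert(slot >= 0 && slot < NUM_IO_SLOTS_SKETCH && io_write_table[slot]);
    io_write_table[slot](io_opaque_table[slot], addr, val);
}

/* Example device: a single 32-bit register that ignores the address. */
static uint32_t reg_read_sketch(void *opaque, uint64_t addr)
{
    (void)addr;
    return *(uint32_t *)opaque;
}

static void reg_write_sketch(void *opaque, uint64_t addr, uint32_t val)
{
    (void)addr;
    *(uint32_t *)opaque = val;
}

static void register_demo_device_sketch(int slot, uint32_t *reg)
{
    assert(slot >= 0 && slot < NUM_IO_SLOTS_SKETCH);
    io_read_table[slot] = reg_read_sketch;
    io_write_table[slot] = reg_write_sketch;
    io_opaque_table[slot] = reg;
}
```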
Avi Kivity
f1f6e3b86e exec: make phys_page_find() return a temporary
Instead of returning a PhysPageDesc pointer, return a temporary.
This lets us move away from actually storing PhysPageDesc's, and
instead synthesising them when needed.

Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2012-01-04 13:34:49 +02:00
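As a rough illustration of returning a descriptor by value instead of a pointer into stored state, consider the sketch below. The struct layout, the table, and every name ending in `_sketch` or `_SKETCH` are assumptions made for the example, not the real `PhysPageDesc` or `phys_page_find()`. Because the caller receives a temporary, the storage behind the lookup is free to change shape.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_BITS_SKETCH   12
#define NUM_PAGES_SKETCH   16
#define UNASSIGNED_SKETCH  UINT64_MAX

/* Hypothetical descriptor, handed out by value. */
typedef struct {
    uint64_t phys_offset;
    uint64_t region_offset;
} PageDescSketch;

/* Compact backing storage; no PageDescSketch is stored anywhere. */
static uint64_t stored_offset[NUM_PAGES_SKETCH];
static bool     page_mapped[NUM_PAGES_SKETCH];

static void map_page_sketch(uint64_t index, uint64_t phys_offset)
{
    assert(index < NUM_PAGES_SKETCH);
    stored_offset[index] = phys_offset;
    page_mapped[index] = true;
}

/* Synthesise a descriptor on demand and return it as a temporary. */
static PageDescSketch page_find_sketch(uint64_t index)
{
    PageDescSketch pd = {
        .phys_offset = UNASSIGNED_SKETCH,
        .region_offset = index << PAGE_BITS_SKETCH,
    };

    if (index < NUM_PAGES_SKETCH && page_mapped[index]) {
        pd.phys_offset = stored_offset[index];
    }
    return pd;
}
```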
Avi Kivity
be675c9720 memory: move endianness compensation to memory core
Instead of doing device endianness compensation in cpu_register_io_memory(),
do it in the memory core.

Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2012-01-04 13:34:49 +02:00
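One way to picture endianness compensation done centrally is the sketch below: a value is byte-swapped only when the device's declared byte order differs from the target's. The enum, the function names, and the `target_big_endian` parameter are hypothetical stand-ins, not the memory core's actual interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical device byte-order declaration. */
typedef enum {
    DEV_NATIVE_ENDIAN_SKETCH,
    DEV_LITTLE_ENDIAN_SKETCH,
    DEV_BIG_ENDIAN_SKETCH,
} DevEndianSketch;

static uint32_t bswap32_sketch(uint32_t v)
{
    return ((v & 0x000000ffu) << 24) |
           ((v & 0x0000ff00u) << 8)  |
           ((v >> 8) & 0x0000ff00u)  |
           (v >> 24);
}

/* A swap is needed only when device and target byte orders disagree. */
static bool needs_swap_sketch(DevEndianSketch dev, bool target_big_endian)
{
    if (dev == DEV_NATIVE_ENDIAN_SKETCH) {
        return false;
    }
    return (dev == DEV_BIG_ENDIAN_SKETCH) != target_big_endian;
}

/* Applied once in the core, on every 32-bit device access. */
static uint32_t adjust_endianness32_sketch(uint32_t val, DevEndianSketch dev,
                                           bool target_big_endian)
{
    assert(dev <= DEV_BIG_ENDIAN_SKETCH);
    return needs_swap_sketch(dev, target_big_endian) ? bswap32_sketch(val) : val;
}
```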
Avi Kivity
7638e0d220 memory: obsolete more dirty memory related functions
No longer used outside memory.c and exec.c.

Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-01-04 13:34:49 +02:00
Avi Kivity
5a97065b01 xen: convert framebuffer dirty tracking to memory API
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-01-04 13:34:49 +02:00
Avi Kivity
8f77558f22 memory: obsolete cpu_physical_memory_[gs]et_dirty_tracking()
The getter is no longer used, so it is completely removed.

Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-01-04 13:34:49 +02:00
Avi Kivity
dc94a7ed61 Convert ram_load() to the memory API
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-01-04 13:34:48 +02:00
Avi Kivity
f09f2189d5 Remove support for version 3 ram_load
Version 3 ram_load depends on ram_addrs, which are not stable.  Version 4
was introduced in 0.13 (and RHEL 6), so this means live migration from 0.12
and earlier to 1.1 or later will not work.

Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-01-04 13:34:48 +02:00
Avi Kivity
8fec98b41b Sort RAMBlocks by ID for migration, not by ram_addr
ram_addr is (a) unstable (b) going away.  Sort by idstr instead.

Commit b2e0a138e initially introduced the sorting for the purpose
of improving debuggability.  After this patch, the order is still
stable, but perhaps less usable by a human.

Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-01-04 13:34:48 +02:00
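Sorting by a stable string identifier rather than by an address can be sketched with `qsort()` and `strcmp()`. The `BlockSketch` type and the example identifiers are invented for illustration and do not mirror the real RAMBlock structure.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a RAM block: an ID string plus an address. */
typedef struct {
    char     idstr[64];
    uint64_t offset;   /* unstable across runs, so not used as the sort key */
} BlockSketch;

static int cmp_idstr_sketch(const void *a, const void *b)
{
    const BlockSketch *x = a;
    const BlockSketch *y = b;
    return strcmp(x->idstr, y->idstr);
}

/* Order blocks by ID so both sides of a migration see the same sequence. */
static void sort_blocks_by_id_sketch(BlockSketch *blocks, size_t n)
{
    assert(blocks != NULL || n == 0);
    if (n > 1) {
        qsort(blocks, n, sizeof(*blocks), cmp_idstr_sketch);
    }
}
```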
Avi Kivity
71c510e26e Switch ram_save to the memory API
Avoid using ram_addr_t, instead use (MemoryRegion *, offset) pairs.

Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-01-04 13:34:48 +02:00
Avi Kivity
7c63736603 Store MemoryRegion in RAMBlock
As a step in moving live migration from RAMBlocks to MemoryRegions,
store the MemoryRegion in a RAMBlock.

Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-01-04 13:34:48 +02:00
Avi Kivity
c5705a7728 vmstate, memory: decouple vmstate from memory API
Currently creating a memory region automatically registers it for
live migration.  This differs from other state (which is enumerated
in a VMStateDescription structure) and ties the live migration code
into the memory core.

Decouple the two by introducing a separate API, vmstate_register_ram(),
for registering a RAM block for migration.  Currently the same
implementation is reused, but later it can be moved into a separate list,
and registrations can be moved to VMStateDescription blocks.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-01-04 13:34:48 +02:00
Avi Kivity
8991c79b57 memory: introduce memory_region_name()
Trivial accessor for the name attribute.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-01-04 13:34:47 +02:00
Anthony Liguori
f3c6a169a3 Merge remote-tracking branch 'qemu-kvm/memory/page_desc' into staging
* qemu-kvm/memory/page_desc: (22 commits)
  Remove cpu_get_physical_page_desc()
  sparc: avoid cpu_get_physical_page_desc()
  virtio-balloon: avoid cpu_get_physical_page_desc()
  vhost: avoid cpu_get_physical_page_desc()
  kvm: avoid cpu_get_physical_page_desc()
  memory: remove CPUPhysMemoryClient
  xen: convert to MemoryListener API
  memory: temporarily add memory_region_get_ram_addr()
  xen, vga: add API for registering the framebuffer
  vhost: convert to MemoryListener API
  kvm: convert to MemoryListener API
  kvm: switch kvm slots to use host virtual address instead of ram_addr_t
  memory: add API for observing updates to the physical memory map
  memory: replace cpu_physical_sync_dirty_bitmap() with a memory API
  framebuffer: drop use of cpu_physical_sync_dirty_bitmap()
  loader: remove calls to cpu_get_physical_page_desc()
  framebuffer: drop use of cpu_get_physical_page_desc()
  memory: introduce memory_region_find()
  memory: add memory_region_is_logging()
  memory: add memory_region_is_rom()
  ...
2012-01-03 14:39:05 -06:00
Avi Kivity
586c6230c0 Remove cpu_get_physical_page_desc()
No longer used.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-01-03 19:19:28 +02:00
Avi Kivity
cc4aa8307c sparc: avoid cpu_get_physical_page_desc()
This reaches into the innards of the memory core, which are being
changed.  Switch to a memory API version.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-01-03 19:19:28 +02:00
Avi Kivity
b7c28c74af virtio-balloon: avoid cpu_get_physical_page_desc()
This reaches into the innards of the memory core, which are being
changed.  Switch to a memory API version.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-01-03 19:19:28 +02:00
Avi Kivity
2817b260e3 vhost: avoid cpu_get_physical_page_desc()
This reaches into the innards of the memory core, which are being
changed.  Switch to a memory API version.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-01-03 19:19:28 +02:00
Avi Kivity
ffcde12f6c kvm: avoid cpu_get_physical_page_desc()
This reaches into the innards of the memory core, which are being
changed.  Switch to a memory API version.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-01-03 19:19:28 +02:00
Avi Kivity
dcd97e33af memory: remove CPUPhysMemoryClient
No longer used.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-01-03 19:19:27 +02:00
Avi Kivity
20581d2078 xen: convert to MemoryListener API
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-01-03 19:19:22 +02:00
Avi Kivity
8d3bc5178f Fix qapi code generation wrt parallel build
Make's multiple output syntax

  x.c x.h: x.template
       gen < x.template

actually invokes the command once for x.c and once for x.h (with differing $@
in each invocation).  During a parallel build, the two commands may be invoked
in parallel; this opens up a race, where the second invocation trashes a file
supposedly produced during the first, and now in use by a dependent command.

The various qapi code generators are susceptible to this; fix by making them
generate just one file per invocation.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-12-27 09:28:58 -06:00
Anthony Liguori
4e1ea514f9 Merge remote-tracking branch 'aneesh/for-upstream' into staging
* aneesh/for-upstream:
  scripts/analyse-9p-simpletrace.py:	Add symbolic names for 9p operations.
  hw/9pfs: iattr_valid flags are kernel internal flags map them to 9p values.
  hw/9pfs: Use the correct signed type for different variables
  hw/9pfs: replace iovec manipulation with QEMUIOVector
2011-12-27 08:53:35 -06:00
Anthony Liguori
ebdfc3c83c Merge remote-tracking branch 'bonzini/nbd-for-anthony' into staging
* bonzini/nbd-for-anthony: (26 commits)
  nbd: add myself as maintainer
  qemu-nbd: throttle requests
  qemu-nbd: asynchronous operation
  qemu-nbd: add client pointer to NBDRequest
  qemu-nbd: move client handling to nbd.c
  qemu-nbd: use common main loop
  link the main loop and its dependencies into the tools
  qemu-nbd: introduce NBDRequest
  qemu-nbd: introduce NBDExport
  qemu-nbd: introduce nbd_do_receive_request
  qemu-nbd: more robust handling of invalid requests
  qemu-nbd: introduce nbd_do_send_reply
  qemu-nbd: simplify nbd_trip
  move corking functions to osdep.c
  qemu-nbd: remove data_size argument to nbd_trip
  qemu-nbd: remove offset argument to nbd_trip
  Update ioctl order in nbd_init() to detect EBUSY
  nbd: add support for NBD_CMD_TRIM
  nbd: add support for NBD_CMD_FLUSH
  nbd: add support for NBD_CMD_FLAG_FUA
  ...
2011-12-27 08:52:42 -06:00
Gleb Natapov
a0fa82085e enable architectural PMU cpuid leaf for kvm
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-12-22 14:53:01 -02:00
Vasilis Liaskovitis
991dfefdee Set numa topology for max_cpus
qemu-kvm passes numa/SRAT topology information for smp_cpus to SeaBIOS. However
SeaBIOS always expects to set up max_cpus number of SRAT cpu entries
(MaxCountCPUs variable in build_srat function of SeaBIOS). When qemu-kvm runs
with smp_cpus != max_cpus (e.g. -smp 2,maxcpus=4), SeaBIOS will mistakenly use
memory SRAT info for setting up CPU SRAT entries for the offline CPUs. Wrong
SRAT memory entries are also created. This breaks NUMA in a guest.
Fix by setting up SRAT info for max_cpus in qemu-kvm.

Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-12-22 14:53:01 -02:00
Jan Kiszka
cce47516cd kvm: x86: Drop redundant apic base and tpr update from kvm_get_sregs
The latter was already commented out, the former is redundant as well.
We always get the latest changes after return from the guest via
kvm_arch_post_run.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-12-22 14:53:01 -02:00
Jan Kiszka
fabacc0f79 kvm: x86: Avoid runtime allocation of xsave buffer
Keep a per-VCPU xsave buffer for kvm_put/get_xsave instead of
continuously allocating and freeing it on state sync.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-12-22 14:53:01 -02:00
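The idea of keeping one buffer per VCPU, rather than allocating on every state sync, might look roughly like this. `VcpuSketch`, the helper names, and the 4096-byte size are assumptions for the example only; the real code works with KVM's own structures.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define XSAVE_AREA_SIZE_SKETCH 4096   /* assumed size, for illustration */

/* Hypothetical per-VCPU state holding a long-lived buffer. */
typedef struct {
    void *xsave_buf;   /* allocated once at init, reused on every sync */
} VcpuSketch;

static int vcpu_init_sketch(VcpuSketch *cpu)
{
    cpu->xsave_buf = calloc(1, XSAVE_AREA_SIZE_SKETCH);
    return cpu->xsave_buf ? 0 : -1;
}

/* Called on each put/get: clears and hands back the same buffer. */
static void *vcpu_prepare_xsave_sketch(VcpuSketch *cpu)
{
    assert(cpu->xsave_buf != NULL);
    memset(cpu->xsave_buf, 0, XSAVE_AREA_SIZE_SKETCH);
    return cpu->xsave_buf;
}

static void vcpu_destroy_sketch(VcpuSketch *cpu)
{
    free(cpu->xsave_buf);
    cpu->xsave_buf = NULL;
}
```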
Jan Kiszka
6b42494b21 kvm: x86: Use symbols for all xsave field
Field 0 (FCW+FSW) and 1 (FTW+FOP) were hard-coded so far.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-12-22 14:53:00 -02:00
Paolo Bonzini
44f76b289a nbd: add myself as maintainer
Not planning to do much else, hence listing it as "Odd Fixes".

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:59 +01:00
Paolo Bonzini
41996e3803 qemu-nbd: throttle requests
Limiting the number of in-flight requests is implemented very simply
with a can_read callback.  It does not require a semaphore, unlike the
client side in block/nbd.c, because we can throttle directly the creation
of coroutines.  The client side can have a coroutine created at any time
when an I/O request is made.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:59 +01:00
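Throttling through a can_read-style callback can be sketched as a simple counter check: while the in-flight count is at the limit, the callback reports that nothing should be read, so no new request is started. The limit of 16 and all names below are hypothetical, not taken from nbd.c.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_REQUESTS_SKETCH 16   /* assumed limit, for illustration */

/* Hypothetical client state: just the number of in-flight requests. */
typedef struct {
    int nb_requests;
} ThrottleClientSketch;

/* The event loop would poll this before reading a new request. */
static bool client_can_read_sketch(const ThrottleClientSketch *c)
{
    return c->nb_requests < MAX_REQUESTS_SKETCH;
}

/* Start a request only if the throttle allows it. */
static bool request_start_sketch(ThrottleClientSketch *c)
{
    if (!client_can_read_sketch(c)) {
        return false;
    }
    c->nb_requests++;
    return true;
}

/* Completing a request frees a slot, which re-enables reading. */
static void request_done_sketch(ThrottleClientSketch *c)
{
    assert(c->nb_requests > 0);
    c->nb_requests--;
}
```

No semaphore appears in this sketch because the only place a request can begin is behind the `client_can_read_sketch()` check.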
Paolo Bonzini
262db38871 qemu-nbd: asynchronous operation
Using coroutines enables asynchronous operation on both the network and
the block side.  Network can be owned by two coroutines at the same time,
one writing and one reading.  On the send side, mutual exclusion is
guaranteed by a CoMutex.  On the receive side, mutual exclusion is
guaranteed because new coroutines immediately start receiving data,
and no new coroutines are created as long as the previous one is receiving.

Between receive and send, qemu-nbd can have an arbitrary number of
in-flight block transfers.  Throttling is implemented by the next
patch.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:59 +01:00
Paolo Bonzini
72deddc5e6 qemu-nbd: add client pointer to NBDRequest
By attaching a client to an NBDRequest, we can avoid passing around the
socket descriptor and data buffer.

Also, we can now manage the reference count for the client in
nbd_request_get/put request instead of having to do it ourselves in
nbd_read.  This simplifies things when coroutines are used.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:59 +01:00
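Tying the client's reference count to the request lifetime can be sketched as a get/put pair: acquiring a request takes a reference on its client, and releasing the request drops it. The types and functions below are hypothetical simplifications, and the `closed` flag merely stands in for tearing down a real client.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical client with a reference count. */
typedef struct {
    int  refcount;
    bool closed;     /* stand-in for actually freeing the client */
} RefClientSketch;

/* Hypothetical request that remembers which client it belongs to. */
typedef struct {
    RefClientSketch *client;
} RefRequestSketch;

static void client_put_sketch(RefClientSketch *client)
{
    assert(client->refcount > 0);
    if (--client->refcount == 0) {
        client->closed = true;
    }
}

/* Creating a request pins the client for as long as the request lives. */
static RefRequestSketch *request_get_sketch(RefClientSketch *client)
{
    RefRequestSketch *req = malloc(sizeof(*req));
    if (req == NULL) {
        return NULL;
    }
    client->refcount++;
    req->client = client;
    return req;
}

/* Releasing a request drops the reference it held. */
static void request_put_sketch(RefRequestSketch *req)
{
    RefClientSketch *client = req->client;
    free(req);
    client_put_sketch(client);
}
```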
Paolo Bonzini
1743b51586 qemu-nbd: move client handling to nbd.c
This patch sets up the fd handler in nbd.c instead of qemu-nbd.c.  It
introduces NBDClient, which wraps the arguments to nbd_trip in a single
structure, so that we can add a notifier to it.  This way, qemu-nbd can
know about disconnections.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:59 +01:00
Paolo Bonzini
a61c67828d qemu-nbd: use common main loop
Using a single main loop for sockets will help yielding from the socket
coroutine back to the main loop, and later reentering it.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:59 +01:00
Paolo Bonzini
cbcfa0418f link the main loop and its dependencies into the tools
Using the main loop code from QEMU enables tools to operate fully
asynchronously.  Advantages include better Windows portability (for some
definition of portability) over glib's.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:58 +01:00
Paolo Bonzini
d9a7380658 qemu-nbd: introduce NBDRequest
Move the buffer from NBDExport to a new structure, so that it will be
possible to have multiple in-flight requests for the same export
(and for the same client too---we get that for free).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:58 +01:00
Paolo Bonzini
af49bbbe78 qemu-nbd: introduce NBDExport
Wrap the common parameters of nbd_trip and nbd_negotiate in a
single opaque struct.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:58 +01:00
Paolo Bonzini
a030b347aa qemu-nbd: introduce nbd_do_receive_request
Group the receiving of a response and the associated data into a new function.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:58 +01:00
Paolo Bonzini
fae6941629 qemu-nbd: more robust handling of invalid requests
Fail invalid requests with EINVAL instead of dropping them into
the void.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:58 +01:00
Paolo Bonzini
2204559203 qemu-nbd: introduce nbd_do_send_reply
Group the sending of a reply and the associated data into a new function.
Without corking, the caller would be forced to leave 12 free bytes at the
beginning of the data pointer.  Not too ugly, but still ugly. :)

Using nbd_do_send_reply everywhere will help when the routine will set up
the write handler that re-enters the send coroutine.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:58 +01:00
Paolo Bonzini
a478f6e595 qemu-nbd: simplify nbd_trip
Use TCP_CORK to remove a violation of encapsulation, that would later
require nbd_trip to know too much about an NBD reply.

We could also switch to sendmsg (qemu_co_sendv) later, it is even
easier once coroutines are in.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:58 +01:00
Paolo Bonzini
128aa58947 move corking functions to osdep.c
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:58 +01:00
Paolo Bonzini
3777b09fd7 qemu-nbd: remove data_size argument to nbd_trip
The size of the buffer is in practice part of the protocol.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:58 +01:00
Paolo Bonzini
94607e7a77 qemu-nbd: remove offset argument to nbd_trip
The argument is write-only.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:58 +01:00
Chunyan Liu
3e05c78551 Update ioctl order in nbd_init() to detect EBUSY
Update ioctl(s) in nbd_init() to detect device busy early.

Current nbd_init() issues NBD_CLEAR_SOCKET before NBD_SET_SOCKET, if issuing
"qemu-nbd -c /dev/nbd0 disk.img" twice, the second time won't detect EBUSY in
nbd_init(), but in nbd_client will report EBUSY and do clear socket (the 1st
time command will be affected too because of no socket any more.)

No change to previous version.

Signed-off-by: Chunyan Liu <cyliu@suse.com>

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:58 +01:00
Paolo Bonzini
7a706633e9 nbd: add support for NBD_CMD_TRIM
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:57 +01:00
Paolo Bonzini
1486d04a1b nbd: add support for NBD_CMD_FLUSH
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:57 +01:00
Paolo Bonzini
2c7989a9b1 nbd: add support for NBD_CMD_FLAG_FUA
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:57 +01:00