Update (thanks to Edgar, Thiemo, malc, Paul, Laurent and Andrzej)

git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5453 c046a42c-6fe2-441c-8c8c-71466251a162
blueswir1 2008-10-09 18:52:04 +00:00
parent 33256a25b3
commit 998a050186


@ -33,11 +33,12 @@
@menu
* intro_features:: Features
* intro_x86_emulation:: x86 emulation
* intro_x86_emulation:: x86 and x86-64 emulation
* intro_arm_emulation:: ARM emulation
* intro_mips_emulation:: MIPS emulation
* intro_ppc_emulation:: PowerPC emulation
* intro_sparc_emulation:: SPARC emulation
* intro_sparc_emulation:: Sparc32 and Sparc64 emulation
* intro_other_emulation:: Other CPU emulation
@end menu
@node intro_features
@ -51,17 +52,17 @@ QEMU has two operating modes:
@itemize @minus
@item
Full system emulation. In this mode, QEMU emulates a full system
(usually a PC), including a processor and various peripherals. It can
be used to launch a different Operating System without rebooting the
PC or to debug system code.
Full system emulation. In this mode (full platform virtualization),
QEMU emulates a full system (usually a PC), including a processor and
various peripherals. It can be used to launch several different
Operating Systems at once without rebooting the host machine or to
debug system code.
@item
User mode emulation (Linux host only). In this mode, QEMU can launch
Linux processes compiled for one CPU on another CPU. It can be used to
launch the Wine Windows API emulator (@url{http://www.winehq.org}) or
to ease cross-compilation and cross-debugging.
User mode emulation. In this mode (application level virtualization),
QEMU can launch processes compiled for one CPU on another CPU; however,
the Operating Systems must match. This can be used, for example, to ease
cross-compilation and cross-debugging.
@end itemize
As QEMU requires no host kernel driver to run, it is very safe and
@ -75,7 +76,10 @@ QEMU generic features:
@item Using dynamic translation to native code for reasonable speed.
@item Working on x86 and PowerPC hosts. Being tested on ARM, Sparc32, Alpha and S390.
@item
Working on x86, x86_64 and PowerPC32/64 hosts. Being tested on ARM,
HPPA, Sparc32 and Sparc64. Previous versions had some support for
Alpha and S390 hosts, but TCG (see below) doesn't support those yet.
@item Self-modifying code support.
@ -85,6 +89,10 @@ QEMU generic features:
in other projects (look at @file{qemu/tests/qruncom.c} to have an
example of user mode @code{libqemu} usage).
@item
Floating point library supporting both full software emulation and
native host FPU instructions.
@end itemize
QEMU user mode emulation features:
@ -96,20 +104,47 @@ QEMU user mode emulation features:
@item Accurate signal handling by remapping host signals to target signals.
@end itemize
Linux user emulator (Linux host only) can be used to launch the Wine
Windows API emulator (@url{http://www.winehq.org}). A Darwin user
emulator (Darwin hosts only) exists and a BSD user emulator for BSD
hosts is under development. It would also be possible to develop a
similar user emulator for Solaris.
QEMU full system emulation features:
@itemize
@item QEMU can either use a full software MMU for maximum portability or use the host system call mmap() to simulate the target MMU.
@item
QEMU uses a full software MMU for maximum portability.
@item
QEMU can optionally use an in-kernel accelerator, like kqemu and
kvm. The accelerators execute some of the guest code natively, while
continuing to emulate the rest of the machine.
@item
Various hardware devices can be emulated and in some cases, host
devices (e.g. serial and parallel ports, USB, drives) can be used
transparently by the guest Operating System. Host device passthrough
can be used for talking to external physical peripherals (e.g. a
webcam, modem or tape drive).
@item
Symmetric multiprocessing (SMP) even on a host with a single CPU. On an
SMP host system, QEMU can use only one CPU fully due to difficulty in
implementing atomic memory accesses efficiently.
@end itemize
@node intro_x86_emulation
@section x86 emulation
@section x86 and x86-64 emulation
QEMU x86 target features:
@itemize
@item The virtual x86 CPU supports 16 bit and 32 bit addressing with segmentation.
LDT/GDT and IDT are emulated. VM86 mode is also supported to run DOSEMU.
LDT/GDT and IDT are emulated. VM86 mode is also supported to run
DOSEMU. There is some support for MMX/3DNow!, SSE, SSE2, SSE3, SSSE3,
and SSE4 as well as x86-64 SVM.
@item Support of host page sizes bigger than 4KB in user mode emulation.
@ -124,9 +159,7 @@ Current QEMU limitations:
@itemize
@item No SSE/MMX support (yet).
@item No x86-64 support.
@item Limited x86-64 support.
@item IPC syscalls are missing.
@ -134,10 +167,6 @@ Current QEMU limitations:
memory access (yet). Hopefully, very few OSes seem to rely on that for
normal use.
@item On non x86 host CPUs, @code{double}s are used instead of the non standard
10 byte @code{long double}s of x86 for floating point emulation to get
maximum performances.
@end itemize
@node intro_arm_emulation
@ -193,7 +222,7 @@ FPU and MMU.
@end itemize
@node intro_sparc_emulation
@section SPARC emulation
@section Sparc32 and Sparc64 emulation
@itemize
@ -216,17 +245,34 @@ Current QEMU limitations:
@item Atomic instructions are not correctly implemented.
@item Sparc64 emulators are not usable for anything yet.
@item There are still some problems with Sparc64 emulators.
@end itemize
@node intro_other_emulation
@section Other CPU emulation
In addition to the above, QEMU supports emulation of other CPUs with
varying levels of success. These are:
@itemize
@item
Alpha
@item
CRIS
@item
M68k
@item
SH4
@end itemize
@node QEMU Internals
@chapter QEMU Internals
@menu
* QEMU compared to other emulators::
* Portable dynamic translation::
* Register allocation::
* Condition code optimisations::
* CPU state optimisations::
* Translation cache::
@ -234,6 +280,7 @@ Current QEMU limitations:
* Self-modifying code and translated code invalidation::
* Exception support::
* MMU emulation::
* Device emulation::
* Hardware interrupts::
* User emulation specific details::
* Bibliography::
@ -273,19 +320,23 @@ patches. However, user mode Linux requires heavy kernel patches while
QEMU accepts unpatched Linux kernels. The price to pay is that QEMU is
slower.
The new Plex86 [8] PC virtualizer is done in the same spirit as the
qemu-fast system emulator. It requires a patched Linux kernel to work
(you cannot launch the same kernel on your PC), but the patches are
really small. As it is a PC virtualizer (no emulation is done except
for some privileged instructions), it has the potential of being
faster than QEMU. The downside is that a complicated (and potentially
unsafe) host kernel patch is needed.
The Plex86 [8] PC virtualizer is done in the same spirit as the now
obsolete qemu-fast system emulator. It requires a patched Linux kernel
to work (you cannot launch the same kernel on your PC), but the
patches are really small. As it is a PC virtualizer (no emulation is
done except for some privileged instructions), it has the potential of
being faster than QEMU. The downside is that a complicated (and
potentially unsafe) host kernel patch is needed.
The commercial PC Virtualizers (VMWare [9], VirtualPC [10], TwoOStwo
[11]) are faster than QEMU, but they all need specific, proprietary
and potentially unsafe host drivers. Moreover, they are unable to
provide cycle exact simulation as an emulator can.
VirtualBox [12], Xen [13] and KVM [14] are based on QEMU. QEMU-SystemC
[15] uses QEMU to simulate a system where some hardware devices are
developed in SystemC.
@node Portable dynamic translation
@section Portable dynamic translation
@ -295,63 +346,51 @@ are very complicated and highly CPU dependent. QEMU uses some tricks
which make it relatively easily portable and simple while achieving good
performances.
The basic idea is to split every x86 instruction into fewer simpler
instructions. Each simple instruction is implemented by a piece of C
code (see @file{target-i386/op.c}). Then a compile time tool
(@file{dyngen}) takes the corresponding object file (@file{op.o})
to generate a dynamic code generator which concatenates the simple
instructions to build a function (see @file{op.h:dyngen_code()}).
In essence, the process is similar to [1], but more work is done at
compile time.
A key idea to get optimal performances is that constant parameters can
be passed to the simple operations. For that purpose, dummy ELF
relocations are generated with gcc for each constant parameter. Then,
the tool (@file{dyngen}) can locate the relocations and generate the
appropriate C code to resolve them when building the dynamic code.
That way, QEMU is no more difficult to port than a dynamic linker.
To go even faster, GCC static register variables are used to keep the
state of the virtual CPU.
@node Register allocation
@section Register allocation
Since QEMU uses fixed simple instructions, no efficient register
allocation can be done. However, because RISC CPUs have a lot of
registers, most of the virtual CPU state can be put in registers without
doing complicated register allocation.
After the release of version 0.9.1, QEMU switched to a new method of
generating code, Tiny Code Generator or TCG. TCG relaxes the
dependency on the exact version of the compiler used. The basic idea
is to split every target instruction into a couple of RISC-like TCG
ops (see @code{target-i386/translate.c}). Some optimizations can be
performed at this stage, including liveness analysis and trivial
constant expression evaluation. TCG ops are then implemented in the
host CPU back end, also known as TCG target (see
@code{tcg/i386/tcg-target.c}). For more information, please take a
look at @code{tcg/README}.
@node Condition code optimisations
@section Condition code optimisations
Good CPU condition codes emulation (@code{EFLAGS} register on x86) is a
critical point to get good performances. QEMU uses lazy condition code
evaluation: instead of computing the condition codes after each x86
instruction, it just stores one operand (called @code{CC_SRC}), the
result (called @code{CC_DST}) and the type of operation (called
@code{CC_OP}).
Lazy evaluation of CPU condition codes (@code{EFLAGS} register on x86)
is important for CPUs where every instruction sets the condition
codes. It tends to be less important on conventional RISC systems
where condition codes are only updated when explicitly requested.
Instead of computing the condition codes after each x86 instruction,
QEMU just stores one operand (called @code{CC_SRC}), the result
(called @code{CC_DST}) and the type of operation (called
@code{CC_OP}). When the condition codes are needed, the condition
codes can be calculated using this information. In addition, an
optimized calculation can be performed for some instruction types like
conditional branches.
@code{CC_OP} is almost never explicitly set in the generated code
because it is known at translation time.
In order to increase performances, a backward pass is performed on the
generated simple instructions (see
@code{target-i386/translate.c:optimize_flags()}). When it can be proved that
the condition codes are not needed by the next instructions, no
condition codes are computed at all.
The lazy condition code evaluation is used on x86, m68k and cris. ARM
uses a simplified variant for the N and Z flags.
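The lazy scheme described above can be sketched in a few lines of C. This is only an illustration under stated assumptions: the names @code{struct lazy_cc}, @code{emu_add32()}, @code{emu_sub32()}, @code{lazy_zf()} and @code{lazy_cf()} and the three operation tags are hypothetical and are not QEMU's actual definitions. Each emulated instruction records one operand, the result and the operation type; a flag is derived from these only when something asks for it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical operation tags; QEMU's real CC_OP values differ. */
enum cc_op { CC_OP_ADD32, CC_OP_SUB32, CC_OP_LOGIC32 };

/* Lazily evaluated flag state: one operand, the result, the operation. */
struct lazy_cc {
    uint32_t cc_src;   /* second source operand */
    uint32_t cc_dst;   /* result of the operation */
    enum cc_op cc_op;  /* which operation produced cc_dst */
};

/* Emulate a 32-bit add: record the inputs needed later, compute no flags. */
uint32_t emu_add32(struct lazy_cc *cc, uint32_t a, uint32_t b)
{
    uint32_t res = a + b;
    cc->cc_src = b;
    cc->cc_dst = res;
    cc->cc_op = CC_OP_ADD32;
    return res;
}

/* Emulate a 32-bit subtract in the same way. */
uint32_t emu_sub32(struct lazy_cc *cc, uint32_t a, uint32_t b)
{
    uint32_t res = a - b;
    cc->cc_src = b;
    cc->cc_dst = res;
    cc->cc_op = CC_OP_SUB32;
    return res;
}

/* Zero flag: depends only on the stored result. */
int lazy_zf(const struct lazy_cc *cc)
{
    return cc->cc_dst == 0;
}

/* Carry flag: reconstructed on demand from the stored values. */
int lazy_cf(const struct lazy_cc *cc)
{
    switch (cc->cc_op) {
    case CC_OP_ADD32:
        /* a + b wrapped around iff the result is below an operand. */
        return cc->cc_dst < cc->cc_src;
    case CC_OP_SUB32:
        /* Recover a = res + b; a borrow occurred iff a < b. */
        return (uint32_t)(cc->cc_dst + cc->cc_src) < cc->cc_src;
    case CC_OP_LOGIC32:
        return 0;  /* logical operations clear the carry flag */
    }
    assert(!"unknown cc_op");
    return 0;
}
```

A conditional branch that tests only the zero flag would call just @code{lazy_zf()}, so the carry computation is never executed for that instruction.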
@node CPU state optimisations
@section CPU state optimisations
The x86 CPU has many internal states which change the way it evaluates
instructions. In order to achieve a good speed, the translation phase
considers that some state information of the virtual x86 CPU cannot
change in it. For example, if the SS, DS and ES segments have a zero
base, then the translator does not even generate an addition for the
segment base.
The target CPUs have many internal states which change the way they
evaluate instructions. In order to achieve a good speed, the
translation phase considers that some state information of the virtual
CPU cannot change in it. The state is recorded in the Translation
Block (TB). If the state changes (e.g. privilege level), a new TB will
be generated and the previous TB won't be used anymore until the state
matches the state recorded in the previous TB. For example, if the SS,
DS and ES segments have a zero base, then the translator does not even
generate an addition for the segment base.
[The FPU stack pointer register is not handled that way yet].
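The idea of recording the assumed CPU state in each Translation Block can be illustrated with a small lookup sketch. It is a simplification under stated assumptions: @code{struct tb}, @code{struct tb_cache}, @code{tb_find()} and the direct-mapped table are hypothetical, and "translation" is reduced to filling in a slot. The point shown is that a block is reused only when both the guest program counter and the recorded state flags match.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TB_CACHE_SIZE 64  /* hypothetical size; must be a power of two */

/* A translated block together with the CPU state it was translated for. */
struct tb {
    uint32_t pc;     /* guest address of the first instruction */
    uint32_t flags;  /* state assumed constant, e.g. privilege level */
    int valid;
};

struct tb_cache {
    struct tb slots[TB_CACHE_SIZE];
    unsigned translations;  /* how many times a block had to be generated */
};

/* Return a block matching both pc and flags, translating on a miss. */
struct tb *tb_find(struct tb_cache *c, uint32_t pc, uint32_t flags)
{
    struct tb *tb;

    assert(c != NULL);
    /* The flags take part in the index, so the same pc translated under
       different states normally lands in different slots. */
    tb = &c->slots[(pc ^ flags) & (TB_CACHE_SIZE - 1)];
    if (!tb->valid || tb->pc != pc || tb->flags != flags) {
        /* Miss: a real translator would generate host code here. */
        tb->pc = pc;
        tb->flags = flags;
        tb->valid = 1;
        c->translations++;
    }
    return tb;
}
```

With a zero-initialized cache, looking up the same @code{pc} under flags 0, then 3, then 0 again causes two translations in this sketch: the first block is kept and found again once the state matches. A direct-mapped table can also evict blocks on index collisions, which this example does not try to avoid.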
@ -388,28 +427,20 @@ instruction cache invalidation is signaled by the application when code
is modified.
When translated code is generated for a basic block, the corresponding
host page is write protected if it is not already read-only (with the
system call @code{mprotect()}). Then, if a write access is done to the
page, Linux raises a SEGV signal. QEMU then invalidates all the
translated code in the page and enables write accesses to the page.
host page is write protected if it is not already read-only. Then, if
a write access is done to the page, Linux raises a SEGV signal. QEMU
then invalidates all the translated code in the page and enables write
accesses to the page.
Correct translated code invalidation is done efficiently by maintaining
a linked list of every translated block contained in a given page. Other
linked lists are also maintained to undo direct block chaining.
Although the overhead of doing @code{mprotect()} calls is important,
most MSDOS programs can be emulated at reasonable speed with QEMU and
DOSEMU.
Note that QEMU also invalidates pages of translated code when it detects
that memory mappings are modified with @code{mmap()} or @code{munmap()}.
When using a software MMU, the code invalidation is more efficient: if
a given code page is invalidated too often because of write accesses,
then a bitmap representing all the code inside the page is
built. Every store into that page checks the bitmap to see if the code
really needs to be invalidated. It avoids invalidating the code when
only data is modified in the page.
On RISC targets, correctly written software uses memory barriers and
cache flushes, so some of the protection above would not be
necessary. However, QEMU still requires that the generated code always
matches the target instructions in memory in order to handle
exceptions correctly.
@node Exception support
@section Exception support
@ -418,10 +449,9 @@ longjmp() is used when an exception such as division by zero is
encountered.
The host SIGSEGV and SIGBUS signal handlers are used to get invalid
memory accesses. The exact CPU state can be retrieved because all the
x86 registers are stored in fixed host registers. The simulated program
counter is found by retranslating the corresponding basic block and by
looking where the host program counter was at the exception point.
memory accesses. The simulated program counter is found by
retranslating the corresponding basic block and by looking where the
host program counter was at the exception point.
The virtual CPU cannot retrieve the exact @code{EFLAGS} register because
in some cases it is not computed because of condition code
@ -431,15 +461,10 @@ still be restarted in any cases.
@node MMU emulation
@section MMU emulation
For system emulation, QEMU uses the mmap() system call to emulate the
target CPU MMU. It works as long as the emulated OS does not use an area
reserved by the host OS (such as the area above 0xc0000000 on x86
Linux).
In order to be able to launch any OS, QEMU also supports a soft
MMU. In that mode, the MMU virtual to physical address translation is
done at every memory access. QEMU uses an address translation cache to
speed up the translation.
For system emulation QEMU supports a soft MMU. In that mode, the MMU
virtual to physical address translation is done at every memory
access. QEMU uses an address translation cache to speed up the
translation.
In order to avoid flushing the translated code each time the MMU
mappings change, QEMU uses a physically indexed translation cache. It
@ -448,6 +473,33 @@ means that each basic block is indexed with its physical address.
When MMU mappings change, only the chaining of the basic blocks is
reset (i.e. a basic block can no longer jump directly to another one).
@node Device emulation
@section Device emulation
Systems emulated by QEMU are organized by boards. At initialization
phase, each board instantiates a number of CPUs, devices, RAM and
ROM. Each device in turn can assign I/O ports or memory areas (for
MMIO) to its handlers. When the emulation starts, an access to the
ports or MMIO memory areas assigned to the device causes the
corresponding handler to be called.
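The handler dispatch described above can be sketched as a table of callbacks indexed by I/O port. This is a hedged illustration only: @code{ioport_register()}, @code{cpu_inb()}, @code{cpu_outb()} and the toy @code{latch_dev} device are hypothetical names whose signatures are not taken from the QEMU sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_PORTS 65536

typedef uint32_t (*ioport_read_fn)(void *opaque, uint32_t port);
typedef void (*ioport_write_fn)(void *opaque, uint32_t port, uint32_t val);

/* Handlers and device state assigned to one I/O port. */
struct ioport_slot {
    ioport_read_fn read;
    ioport_write_fn write;
    void *opaque;
};

static struct ioport_slot ioport_table[NUM_PORTS];

/* Called by a device at board initialization to claim a port range. */
void ioport_register(uint32_t start, uint32_t len,
                     ioport_read_fn read, ioport_write_fn write, void *opaque)
{
    uint32_t i;

    assert(start <= NUM_PORTS && len <= NUM_PORTS - start);
    for (i = 0; i < len; i++) {
        ioport_table[start + i].read = read;
        ioport_table[start + i].write = write;
        ioport_table[start + i].opaque = opaque;
    }
}

/* Guest byte read from a port: call the owning handler, if any. */
uint32_t cpu_inb(uint32_t port)
{
    struct ioport_slot *s = &ioport_table[port & (NUM_PORTS - 1)];
    return s->read ? (s->read(s->opaque, port) & 0xff) : 0xff;
}

/* Guest byte write to a port: writes to unassigned ports are ignored. */
void cpu_outb(uint32_t port, uint32_t val)
{
    struct ioport_slot *s = &ioport_table[port & (NUM_PORTS - 1)];
    if (s->write) {
        s->write(s->opaque, port, val & 0xff);
    }
}

/* Toy device: a single 8-bit latch register. */
struct latch_dev { uint8_t value; };

uint32_t latch_read(void *opaque, uint32_t port)
{
    (void)port;
    return ((struct latch_dev *)opaque)->value;
}

void latch_write(void *opaque, uint32_t port, uint32_t val)
{
    (void)port;
    ((struct latch_dev *)opaque)->value = (uint8_t)val;
}
```

A board would call @code{ioport_register(0x80, 1, latch_read, latch_write, &dev)} once during initialization; later guest accesses to port 0x80 reach the device through @code{cpu_inb()} and @code{cpu_outb()}. MMIO dispatch follows the same pattern with memory ranges instead of port numbers.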
RAM and ROM are handled more efficiently: only the offset to the host
memory needs to be added to the guest address.
The video RAM of VGA and other display cards is special: it can be
read or written directly like RAM, but write accesses cause the memory
to be marked with the VGA_DIRTY flag as well.
QEMU supports some device classes like serial and parallel ports, USB,
drives and network devices, by providing APIs for easier connection to
the generic, higher level implementations. The API hides the
implementation details from the devices, like native device use or
advanced block device formats like QCOW.
Usually the devices implement a reset method and register support for
saving and loading of the device state. The devices can also use
timers, especially together with the use of bottom halves (BHs).
@node Hardware interrupts
@section Hardware interrupts
@ -513,9 +565,9 @@ it is not very useful, it is an important test to show the power of the
emulator.
Achieving self-virtualization is not easy because there may be address
space conflicts. QEMU solves this problem by being an executable ELF
shared object as the ld-linux.so ELF interpreter. That way, it can be
relocated at load time.
space conflicts. QEMU user emulators solve this problem by being
executable ELF shared objects, like the ld-linux.so ELF interpreter. That
way, they can be relocated at load time.
@node Bibliography
@section Bibliography
@ -568,6 +620,22 @@ The VirtualPC PC virtualizer.
@url{http://www.twoostwo.org/},
The TwoOStwo PC virtualizer.
@item [12]
@url{http://virtualbox.org/},
The VirtualBox PC virtualizer.
@item [13]
@url{http://www.xen.org/},
The Xen hypervisor.
@item [14]
@url{http://kvm.qumranet.com/kvmwiki/Front_Page},
Kernel Based Virtual Machine (KVM).
@item [15]
@url{http://www.greensocs.com/projects/QEMUSystemC},
QEMU-SystemC, a hardware co-simulator.
@end table
@node Regression Tests