This version requires that host and guest have the same PAE status.
NX cap is not offered to the guest, yet.
Signed-off-by: Matias Zabaljauregui <zabaljauregui@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
replace LHCALL_SET_PMD with LHCALL_SET_PGD hypercall name
(That's really what it is, and the confusion gets worse with PAE support)
Signed-off-by: Matias Zabaljauregui <zabaljauregui@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Reported-by: Jeremy Fitzhardinge <jeremy@goop.org>
lguest never checked for pending interrupts when enabling interrupts, and
things still worked. However, it makes a significant difference to TCP
performance, so it's time we fixed it by introducing a pending_irq flag
and checking it on irq_restore and irq_enable.
These two routines are now too big to patch into the 8/10 bytes
patch space, so we drop that code.
Note: The high latency on interrupt delivery had a very curious
effect: once everything else was optimized, networking without GSO was
faster than networking with GSO, since more interrupts were sent and
hence a greater chance of one getting through to the Guest!
Note2: (Almost) Closing the same loophole for iret doesn't have any
measurable effect, so I'm leaving that patch for the moment.
Before:
1GB tcpblast Guest->Host: 30.7 seconds
1GB tcpblast Guest->Host (no GSO): 76.0 seconds
After:
1GB tcpblast Guest->Host: 6.8 seconds
1GB tcpblast Guest->Host (no GSO): 27.8 seconds
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Took some cycles to re-read the Lguest Journey end-to-end, fix some
rot and tighten some phrases.
Only comments change. No new jokes, but a couple of recycled old jokes.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
explicitly use ktime.h include
explicitly use hrtimer.h include
explicitly use sched.h include
This patch adds headers explicitly to lguest sources file,
to avoid depending on them being included somewhere else.
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We can save some lines of code by getting rid of
*lg = cpu... lines of code spread everywhere by now.
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
this patch makes the pgdir management per-vcpu. The pgdirs pool
is still guest-wide (although it'll probably need to grow when we
are really executing more vcpus), but the pgdidx index is gone,
since it makes no sense anymore. Instead, we use a per-vcpu
index.
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
this patch makes the pending_notify field, used to control
pending notifications, per-vcpu, instead of per-guest
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
lguest struct have room for some fields, namely, cr2, ts, esp1
and ss1, that are not really guest-wide, but rather, vcpu-wide.
This patch puts it in the vcpu struct
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
lguest uses tasks to control its running behaviour (like sending
breaks, controlling halted state, etc). In a per-vcpu environment,
each vcpu will have its own underlying task. So this patch
makes the infrastructure for that possible
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Here, I introduce per-vcpu timers. With this, we can have
local expiries, needed for accounting time in smp guests
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
this patch changes do_hcall() and do_async_hcall() interfaces (and obviously their
callers) to get a vcpu struct. Again, a vcpu services the hypercall, not the whole
guest
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Went through the documentation doing typo and content fixes. This
patch contains only comment and whitespace changes.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Jes complains that page table code still uses lgread_u32 even though
it now uses general kernel pte types. The best thing to do is to
generalize lgread_u32 and lgwrite_u32.
This means we lose the efficiency of getuser(). We could potentially
regain it if we used __copy_from_user instead of copy_from_user, but
I'm not certain that our range check is equivalent to access_ok() on
all platforms.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Jes Sorensen <jes@sgi.com>
This patch gets rid of the old lguest host I/O infrastructure and
replaces it with a single hypercall "LHCALL_NOTIFY" which takes an
address.
The main change is the removal of io.c: that mainly did inter-guest
I/O, which virtio doesn't yet support.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1) This allows us to get alot closer to booting bzImages.
2) It means we don't have to know page_offset.
3) The Guest needs to modify the boot pagetables to create the
PAGE_OFFSET mapping before jumping to C code.
4) guest_pa() walks the page tables rather than using page_offset.
5) We don't use page_offset to figure out whether to emulate: it was
always kinda quesationable, and won't work for instructions done
before remapping (bzImage unpacking in particular).
6) We still want the kernel address for tlb flushing: have the initial
hypercall give us that, too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is my first step in the migration of page_tables.c to the kernel
types and functions/macros (2.6.23-rc3). Seems to be working OK.
Signed-off-by: Matias Zabaljauregui <matias.zabaljauregui@cern.ch>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Clean up the hypercall code to make the code in hypercalls.c
architecture independent. First process the common hypercalls and
then call lguest_arch_do_hcall() if the call hasn't been handled.
Rename struct hcall_ring to hcall_args.
This patch requires the previous patch which reorganize the layout of
struct lguest_regs on i386 so they match the layout of struct
hcall_args.
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Currently we look at the "trapnum" to see if the Guest wants a
hypercall. But once the hypercall is done we have to reset trapnum to
a bogus value, otherwise if we exit to userspace and return, we'd run
the same hypercall twice (that was a nasty bug to find!).
This has two main effects:
1) When Jes's patch changes the hypercall args to be a generic "struct
hcall_args" we simply change the type of "lg->hcall". It's set by
arch code, so if it has to copy args or something it can do so, and
point "hcall" into lg->arch somewhere.
2) Async hypercalls only get run when an actual hypercall is pending.
This simplfies the code a little and is a more logical semantic.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Back when we had all the Guest state in the switcher, we had a fixed
array of them. This is no longer necessary.
If we switch the network code to using random_ether_addr (46 bits is
enough to avoid clashes), we can get rid of the concept of "guest id"
altogether.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In order to avoid problematic special linking of the Launcher, we give
the Host an offset: this means we can use any memory region in the
Launcher as Guest memory rather than insisting on mmap() at 0.
The result is quite pleasing: a number of casts are replaced with
simple additions.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Use copy_to_user() when copying a struct timespec to the guest -
put_user() cannot handle two long's in one go on a 64bit arch.
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jes Sorensen <jes@sgi.com>
Cc: Al Viro <viro@ftp.linux.org.uk>
A non-periodic clock_event_device and the "jiffies" clock don't mix well:
tick_handle_periodic() can go into an infinite loop.
Currently lguest guests use the jiffies clock when the TSC is
unusable. Instead, make the Host write the current time into the lguest
page on every interrupt. This doesn't cost much but is more precise
and at least as accurate as the jiffies clock. It also gets rid of
the GET_WALLCLOCK hypercall.
Also, delay setting sched_clock until our clock is set up, otherwise
the early printk timestamps can go backwards (not harmful, just ugly).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Documentation: The Host
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The netfilter code had very good documentation: the Netfilter Hacking HOWTO.
Noone ever read it.
So this time I'm trying something different, using a bit of Knuthiness.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is the code for the "lg.ko" module, which allows lguest guests to
be launched.
[akpm@linux-foundation.org: update for futex-new-private-futexes]
[akpm@linux-foundation.org: build fix]
[jmorris@namei.org: lguest: use hrtimers]
[akpm@linux-foundation.org: x86_64 build fix]
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andi Kleen <ak@suse.de>
Cc: Eric Dumazet <dada1@cosmosbay.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>