linux-next

mirror of https://github.com/edk2-porting/linux-next.git synced 2024-12-26 06:04:14 +08:00

Author	SHA1	Message	Date
Jeremy Fitzhardinge	d6182fbf04	xen64: allocate and manage user pagetables Because the x86_64 architecture does not enforce segment limits, Xen cannot protect itself with them as it does in 32-bit mode. Therefore, to protect itself, it runs the guest kernel in ring 3. Since it also runs the guest userspace in ring3, the guest kernel must maintain a second pagetable for its userspace, which does not map kernel space. Naturally, the guest kernel pagetables map both kernel and userspace. The userspace pagetable is attached to the corresponding kernel pagetable via the pgd's page->private field. It is allocated and freed at the same time as the kernel pgd via the paravirt_pgd_alloc/free hooks. Fortunately, the user pagetable is almost entirely shared with the kernel pagetable; the only difference is the pgd page itself. set_pgd will populate all entries in the kernel pagetable, and also set the corresponding user pgd entry if the address is less than STACK_TOP_MAX. The user pagetable must be pinned and unpinned with the kernel one, but because the pagetables are aliased, pgd_walk() only needs to be called on the kernel pagetable. The user pgd page is then pinned/unpinned along with the kernel pgd page. xen_write_cr3 must write both the kernel and user cr3s. The init_mm.pgd pagetable never has a user pagetable allocated for it, because it can never be used while running usermode. One awkward area is that early in boot the page structures are not available. No user pagetable can exist at that point, but it complicates the logic to avoid looking at the page structure. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-16 11:05:38 +02:00
Jeremy Fitzhardinge	5deb30d194	xen: rework pgd_walk to deal with 32/64 bit Rewrite pgd_walk to deal with 64-bit address spaces. There are two notible features of 64-bit workspaces: 1. The physical address is only 48 bits wide, with the upper 16 bits being sign extension; kernel addresses are negative, and userspace is positive. 2. The Xen hypervisor mapping is at the negative-most address, just above the sign-extension hole. 1. means that we can't easily use addresses when traversing the space, since we must deal with sign extension. This rewrite expresses everything in terms of pgd/pud/pmd indices, which means we don't need to worry about the exact configuration of the virtual memory space. This approach works equally well in 32-bit. To deal with 2, assume the hole is between the uppermost userspace address and PAGE_OFFSET. For 64-bit this skips the Xen mapping hole. For 32-bit, the hole is zero-sized. In all cases, the uppermost kernel address is FIXADDR_TOP. A side-effect of this patch is that the upper boundary is actually handled properly, exposing a long-standing bug in 32-bit, which failed to pin kernel pmd page. The kernel pmd is not shared, and so must be explicitly pinned, even though the kernel ptes are shared and don't need pinning. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-16 11:03:59 +02:00
Jeremy Fitzhardinge	836fe2f291	xen: use set_pte_vaddr Make Xen's set_pte_mfn() use set_pte_vaddr rather than copying it. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Mark McLoughlin <markmc@redhat.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-16 11:02:01 +02:00
Jeremy Fitzhardinge	ce803e705f	xen64: use arbitrary_virt_to_machine for xen_set_pmd When building initial pagetables in 64-bit kernel the pud/pmd pointer may be in ioremap/fixmap space, so we need to walk the pagetable to look up the physical address. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-16 11:01:17 +02:00
Jeremy Fitzhardinge	ebd879e397	xen: fix truncation of machine address arbitrary_virt_to_machine can truncate a machine address if its above 4G. Cast the problem away. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-16 11:01:03 +02:00
Jeremy Fitzhardinge	ce87b3d326	xen64: get active_mm from the pda x86_64 stores the active_mm in the pda, so fetch it from there. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-16 10:57:45 +02:00
Jeremy Fitzhardinge	f6e587325b	xen64: add extra pv_mmu_ops We need extra pv_mmu_ops for 64-bit, to deal with the extra level of pagetable. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-16 10:57:16 +02:00
Jeremy Fitzhardinge	cbcd79c2e5	x86: use __page_aligned_data/bss Update arch/x86's use of page-aligned variables. The change to arch/x86/xen/mmu.c fixes an actual bug, but the rest are cleanups and to set a precedent. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-16 10:54:39 +02:00
Eduardo Habkost	c1f2f09ef6	pvops-64: call paravirt_post_allocator_init() on setup_arch() Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-16 10:53:57 +02:00
Ingo Molnar	1a781a777b	Merge branch 'generic-ipi' into generic-ipi-for-linus Conflicts: arch/powerpc/Kconfig arch/s390/kernel/time.c arch/x86/kernel/apic_32.c arch/x86/kernel/cpu/perfctr-watchdog.c arch/x86/kernel/i8259_64.c arch/x86/kernel/ldt.c arch/x86/kernel/nmi_64.c arch/x86/kernel/smpboot.c arch/x86/xen/smp.c include/asm-x86/hw_irq_32.h include/asm-x86/hw_irq_64.h include/asm-x86/mach-default/irq_vectors.h include/asm-x86/mach-voyager/irq_vectors.h include/asm-x86/smp.h kernel/Makefile Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-15 21:55:59 +02:00
Ingo Molnar	6924d1ab8b	Merge branches 'x86/numa-fixes', 'x86/apic', 'x86/apm', 'x86/bitops', 'x86/build', 'x86/cleanups', 'x86/cpa', 'x86/cpu', 'x86/defconfig', 'x86/gart', 'x86/i8259', 'x86/intel', 'x86/irqstats', 'x86/kconfig', 'x86/ldt', 'x86/mce', 'x86/memtest', 'x86/pat', 'x86/ptemask', 'x86/resumetrace', 'x86/threadinfo', 'x86/timers', 'x86/vdso' and 'x86/xen' into x86/devel	2008-07-08 09:16:56 +02:00
Jeremy Fitzhardinge	d8355aca23	xen: fix address truncation in pte mfn<->pfn conversion When converting the page number in a pte/pmd/pud/pgd between machine and pseudo-physical addresses, the converted result was being truncated at 32-bits. This caused failures on machines with more than 4G of physical memory. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: "Christopher S. Aker" <caker@theshore.net> Cc: Ian Campbell <Ian.Campbell@eu.citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-04 11:31:20 +02:00
Jens Axboe	3b16cf8748	x86: convert to generic helpers for IPI function calls This converts x86, x86-64, and xen to use the new helpers for smp_call_function() and friends, and adds support for smp_call_function_single(). Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-06-26 11:21:54 +02:00
Jeremy Fitzhardinge	400d34944c	xen: add mechanism to extend existing multicalls Some Xen hypercalls accept an array of operations to work on. In general this is because its more efficient for the hypercall to the work all at once rather than as separate hypercalls (even batched as a multicall). This patch adds a mechanism (xen_mc_extend_args()) to allocate more argument space to the last-issued multicall, in order to extend its argument list. The user of this mechanism is xen/mmu.c, which uses it to extend the args array of mmu_update. This is particularly valuable when doing the update for a large mprotect, which goes via ptep_modify_prot_commit(), but it also manages to batch updates to pgd/pmds as well. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-25 15:17:34 +02:00
Jeremy Fitzhardinge	e57778a1e3	xen: implement ptep_modify_prot_start/commit Xen has a pte update function which will update a pte while preserving its accessed and dirty bits. This means that ptep_modify_prot_start() can be implemented as a simple read of the pte value. The hardware may update the pte in the meantime, but ptep_modify_prot_commit() updates it while preserving any changes that may have happened in the meantime. The updates in ptep_modify_prot_commit() are batched if we're currently in lazy mmu mode. The mmu_update hypercall can take a batch of updates to perform, but this code doesn't make particular use of that feature, in favour of using generic multicall batching to get them all into the hypervisor. The net effect of this is that each mprotect pte update turns from two expensive trap-and-emulate faults into they hypervisor into a single hypercall whose cost is amortized in a batched multicall. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-25 15:17:23 +02:00
Jeremy Fitzhardinge	2849914393	xen: remove support for non-PAE 32-bit Non-PAE operation has been deprecated in Xen for a while, and is rarely tested or used. xen-unstable has now officially dropped non-PAE support. Since Xen/pvops' non-PAE support has also been broken for a while, we may as well completely drop it altogether. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-24 17:00:55 +02:00
Jeremy Fitzhardinge	ebb9cfe20f	xen: don't drop NX bit Because NX is now enforced properly, we must put the hypercall page into the .text segment so that it is executable. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stable Kernel <stable@kernel.org> Cc: the arch/x86 maintainers <x86@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-20 14:56:41 +02:00
Jeremy Fitzhardinge	05345b0f00	xen: mask unwanted pte bits in __supported_pte_mask [ Stable: this isn't a bugfix in itself, but it's a pre-requiste for "xen: don't drop NX bit" ] Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stable Kernel <stable@kernel.org> Cc: the arch/x86 maintainers <x86@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-20 14:56:36 +02:00
Jeremy Fitzhardinge	a987b16cc6	xen: don't drop NX bit Because NX is now enforced properly, we must put the hypercall page into the .text segment so that it is executable. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stable Kernel <stable@kernel.org> Cc: the arch/x86 maintainers <x86@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-20 14:55:13 +02:00
Jeremy Fitzhardinge	eb179e443d	xen: mask unwanted pte bits in __supported_pte_mask [ Stable: this isn't a bugfix in itself, but it's a pre-requiste for "xen: don't drop NX bit" ] Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stable Kernel <stable@kernel.org> Cc: the arch/x86 maintainers <x86@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-20 14:55:11 +02:00
Ingo Molnar	688d22e23a	Merge branch 'linus' into x86/xen	2008-06-16 11:21:27 +02:00
Jeremy Fitzhardinge	e2426cf85f	xen: avoid hypercalls when updating unpinned pud/pmd When operating on an unpinned pagetable (ie, one under construction or destruction), it isn't necessary to use a hypercall to update a pud/pmd entry. Jan Beulich observed that a similar optimisation avoided many thousands of hypercalls while doing a kernel build. One tricky part is that early in the kernel boot there's no page structure, so we can't check to see if the page is pinned. In that case, we just always use the hypercall. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-02 13:24:40 +02:00
Ingo Molnar	15ce60056b	xen: export get_phys_to_machine -tip testing found the following xen-console symbols trouble: ERROR: "get_phys_to_machine" [drivers/video/xen-fbfront.ko] undefined! ERROR: "get_phys_to_machine" [drivers/net/xen-netfront.ko] undefined! ERROR: "get_phys_to_machine" [drivers/input/xen-kbdfront.ko] undefined! with: http://redhat.com/~mingo/misc/config-Mon_Jun__2_12_25_13_CEST_2008.bad	2008-06-02 13:20:11 +02:00
Ingo Molnar	b20aeccd6a	xen: fix early bootup crash on native hardware -tip tree auto-testing found the following early bootup hang: --------------> get_memcfg_from_srat: assigning address to rsdp RSD PTR v0 [Nvidia] BUG: Int 14: CR2 ffd00040 EDI 8092fbfe ESI ffd00040 EBP 80b0aee8 ESP 80b0aed0 EBX 000f76f0 EDX 0000000e ECX 00000003 EAX ffd00040 err 00000000 EIP 802c055a CS 00000060 flg 00010006 Stack: ffd00040 80bc78d0 80b0af6c 80b1dbfe 8093d8ba 00000008 80b42810 80b4ddb4 80b42842 00000000 80b0af1c 801079c8 808e724e 00000000 80b42871 802c0531 00000100 00000000 0003fff0 80b0af40 80129999 00040100 00040100 00000000 Pid: 0, comm: swapper Not tainted 2.6.26-rc4-sched-devel.git #570 [<802c055a>] ? strncmp+0x11/0x25 [<80b1dbfe>] ? get_memcfg_from_srat+0xb4/0x568 [<801079c8>] ? mcount_call+0x5/0x9 [<802c0531>] ? strcmp+0xa/0x22 [<80129999>] ? printk+0x38/0x3a [<80129999>] ? printk+0x38/0x3a [<8011b122>] ? memory_present+0x66/0x6f [<80b216b4>] ? setup_memory+0x13/0x40c [<80b16b47>] ? propagate_e820_map+0x80/0x97 [<80b1622a>] ? setup_arch+0x248/0x477 [<80129999>] ? printk+0x38/0x3a [<80b11759>] ? start_kernel+0x6e/0x2eb [<80b110fc>] ? i386_start_kernel+0xeb/0xf2 ======================= <------ with this config: http://redhat.com/~mingo/misc/config-Wed_May_28_01_33_33_CEST_2008.bad The thing is, the crash makes little sense at first sight. We crash on a benign-looking printk. The code around it got changed in -tip but checking those topic branches individually did not reproduce the bug. Bisection led to this commit: \| `d5edbc1f75` is first bad commit \| commit `d5edbc1f75` \| Author: Jeremy Fitzhardinge <jeremy@goop.org> \| Date: Mon May 26 23:31:22 2008 +0100 \| \| xen: add p2m mfn_list_list Which is somewhat surprising, as on native hardware Xen client side should have little to no side-effects. After some head scratching, it turns out the following happened: randconfig enabled the following Xen options: CONFIG_XEN=y CONFIG_XEN_MAX_DOMAIN_MEMORY=8 # CONFIG_XEN_BLKDEV_FRONTEND is not set # CONFIG_XEN_NETDEV_FRONTEND is not set CONFIG_HVC_XEN=y # CONFIG_XEN_BALLOON is not set which activated this piece of code in arch/x86/xen/mmu.c: > @@ -69,6 +69,13 @@ > __attribute__((section(".data.page_aligned"))) = > { [ 0 ... TOP_ENTRIES - 1] = &p2m_missing[0] }; > > +/* Arrays of p2m arrays expressed in mfns used for save/restore */ > +static unsigned long p2m_top_mfn[TOP_ENTRIES] > + __attribute__((section(".bss.page_aligned"))); > + > +static unsigned long p2m_top_mfn_list[TOP_ENTRIES / P2M_ENTRIES_PER_PAGE] > + __attribute__((section(".bss.page_aligned"))); The problem is, you must only put variables into .bss.page_aligned that have a _size_ that is _exactly_ page aligned. In this case the size of p2m_top_mfn_list is not page aligned: 80b8d000 b p2m_top_mfn 80b8f000 b p2m_top_mfn_list 80b8f008 b softirq_stack 80b97008 b hardirq_stack 80b9f008 b bm_pte So all subsequent variables get unaligned which, depending on luck, breaks the kernel in various funny ways. In this case what killed the kernel first was the misaligned bootmap pte page, resulting in that creative crash above. Anyway, this was a fun bug to track down :-) I think the moral is that .bss.page_aligned is a dangerous construct in its current form, and the symptoms of breakage are very non-trivial, so i think we need build-time checks to make sure all symbols in .bss.page_aligned are truly page aligned. The Xen fix below gets the kernel booting again. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-28 14:32:06 +02:00
Jeremy Fitzhardinge	0e91398f2a	xen: implement save/restore This patch implements Xen save/restore and migration. Saving is triggered via xenbus, which is polled in drivers/xen/manage.c. When a suspend request comes in, the kernel prepares itself for saving by: 1 - Freeze all processes. This is primarily to prevent any partially-completed pagetable updates from confusing the suspend process. If CONFIG_PREEMPT isn't defined, then this isn't necessary. 2 - Suspend xenbus and other devices 3 - Stop_machine, to make sure all the other vcpus are quiescent. The Xen tools require the domain to run its save off vcpu0. 4 - Within the stop_machine state, it pins any unpinned pgds (under construction or destruction), performs canonicalizes various other pieces of state (mostly converting mfns to pfns), and finally 5 - Suspend the domain Restore reverses the steps used to save the domain, ending when all the frozen processes are thawed. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-05-27 10:11:38 +02:00
Jeremy Fitzhardinge	d5edbc1f75	xen: add p2m mfn_list_list When saving a domain, the Xen tools need to remap all our mfns to portable pfns. In order to remap our p2m table, it needs to know where all its pages are, so maintain the references to the p2m table for it to use. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-05-27 10:11:37 +02:00
Jeremy Fitzhardinge	cf0923ea29	xen: efficiently support a holey p2m table When using sparsemem and memory hotplug, the kernel's pseudo-physical address space can be discontigious. Previously this was dealt with by having the upper parts of the radix tree stubbed off. Unfortunately, this is incompatible with save/restore, which requires a complete p2m table. The solution is to have a special distinguished all-invalid p2m leaf page, which we can point all the hole areas at. This allows the tools to see a complete p2m table, but it only costs a page for all memory holes. It also simplifies the code since it removes a few special cases. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-05-27 10:11:37 +02:00
Jeremy Fitzhardinge	8006ec3e91	xen: add configurable max domain size Add a config option to set the max size of a Xen domain. This is used to scale the size of the physical-to-machine array; it ends up using around 1 page/GByte, so there's no reason to be very restrictive. For a 32-bit guest, the default value of 8GB is probably sufficient; there's not much point in giving a 32-bit machine much more memory than that. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-05-27 10:11:37 +02:00
Jeremy Fitzhardinge	d451bb7aa8	xen: make phys_to_machine structure dynamic We now support the use of memory hotplug, so the physical to machine page mapping structure must be dynamic. This is implemented as a two-level radix tree structure, which allows us to efficiently incrementally allocate memory for the p2m table as new pages are added. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-05-27 10:11:37 +02:00
Jan Beulich	de067814d6	x86/xen: fix arbitrary_virt_to_machine() While I realize that the function isn't currently being used, I still think an obvious mistake like this should be corrected. Signed-off-by: Jan Beulich <jbeulich@novell.com> Acked-by: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-05-23 14:08:06 +02:00
Jeremy Fitzhardinge	3843fc2575	xen: remove support for non-PAE 32-bit Non-PAE operation has been deprecated in Xen for a while, and is rarely tested or used. xen-unstable has now officially dropped non-PAE support. Since Xen/pvops' non-PAE support has also been broken for a while, we may as well completely drop it altogether. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-05-22 18:42:49 +02:00
Christoph Lameter	d60cd46bbd	pageflags: use proper page flag functions in Xen Xen uses bitops to manipulate page flags. Make it use proper page flag functions. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Andy Whitcroft <apw@shadowen.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:22 -07:00
Jeremy Fitzhardinge	2bd50036b5	xen: allow set_pte_at on init_mm to be lockless The usual pagetable locking protocol doesn't seem to apply to updates to init_mm, so don't rely on preemption being disabled in xen_set_pte_at on init_mm. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:33 +02:00
Jeremy Fitzhardinge	947a69c90c	xen: unify pte operations We can fold the essentially common pte functions together now. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:31 +02:00
Jeremy Fitzhardinge	430442e38e	xen: make use of pte_t union pte_t always contains a "pte" field for the whole pte value, so make use of it. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:31 +02:00
Jeremy Fitzhardinge	abf33038ff	xen: use appropriate pte types Convert Xen pagetable handling to use appropriate *val_t types. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:31 +02:00
Mark McLoughlin	f64337062c	xen: refactor xen_{alloc,release}_{pt,pd}() Signed-off-by: Mark McLoughlin <markmc@redhat.com> Cc: xen-devel@lists.xensource.com Cc: Mark McLoughlin <markmc@redhat.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-04 18:36:48 +02:00
Harvey Harrison	da7bfc50f5	x86: sparse warnings in pageattr.c Adjust the definition of lookup_address to take an unsigned long level argument. Adjust callers in xen/mmu.c that pass in a dummy variable. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-09 23:24:08 +01:00
Ingo Molnar	f0646e43ac	x86: return the page table level in lookup_address() based on this patch from Andi Kleen: \| Subject: CPA: Return the page table level in lookup_address() \| From: Andi Kleen <ak@suse.de> \| \| Needed for the next change. \| \| And change all the callers. and ported it to x86.git. Signed-off-by: Andi Kleen <ak@suse.de> Acked-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:43 +01:00
Jeremy Fitzhardinge	a89780f3b8	xen: fix mismerge in masking pte flags Looks like a mismerge/misapply dropped one of the cases of pte flag masking for Xen. Also, only mask the flags for present ptes. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:39 +01:00
Jeremy Fitzhardinge	015c8dd0cb	xen: mask out PWT too The hypervisor doesn't allow PCD or PWT to be set on guest ptes, so make sure they're masked out. Also, fix up some previous mispatching. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:32:58 +01:00
Jeremy Fitzhardinge	c8e5393ab3	x86: page.h: make pte_t a union to always include Make sure pte_t, whatever its definition, has a pte element with type pteval_t. This allows common code to access it without needing to be specifically parameterised on what pagetable mode we're compiling for. For 32-bit, this means that pte_t becomes a union with "pte" and "{ pte_low, pte_high }" (PAE) or just "pte_low" (non-PAE). Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:32:57 +01:00
Jeremy Fitzhardinge	2c80b01bea	xen: mask _PAGE_PCD from ptes _PAGE_PCD maps a page with caching disabled, which is typically used for mapping harware registers. Xen never allows it to be set on a mapping, and unprivileged guests never need it since they can't see the real underlying hardware. However, some uncached mappings are made early when probing the (non-existent) APIC, and its OK to mask off the PCD flag in these cases. This became necessary because Xen started checking for this bit, rather than silently masking it off. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-29 09:24:52 -08:00
Jeremy Fitzhardinge	74260714c5	xen: lock pte pages while pinning/unpinning When a pagetable is created, it is made globally visible in the rmap prio tree before it is pinned via arch_dup_mmap(), and remains in the rmap tree while it is unpinned with arch_exit_mmap(). This means that other CPUs may race with the pinning/unpinning process, and see a pte between when it gets marked RO and actually pinned, causing any pte updates to fail with write-protect faults. As a result, all pte pages must be properly locked, and only unlocked once the pinning/unpinning process has finished. In order to avoid taking spinlocks for the whole pagetable - which may overflow the PREEMPT_BITS portion of preempt counter - it locks and pins each pte page individually, and then finally pins the whole pagetable. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Rik van Riel <riel@redhat.com> Cc: Hugh Dickens <hugh@veritas.com> Cc: David Rientjes <rientjes@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andi Kleen <ak@suse.de> Cc: Keir Fraser <keir@xensource.com> Cc: Jan Beulich <jbeulich@novell.com>	2007-10-16 11:51:30 -07:00
Jeremy Fitzhardinge	9f79991d41	xen: deal with stale cr3 values when unpinning pagetables When a pagetable is no longer in use, it must be unpinned so that its pages can be freed. However, this is only possible if there are no stray uses of the pagetable. The code currently deals with all the usual cases, but there's a rare case where a vcpu is changing cr3, but is doing so lazily, and the change hasn't actually happened by the time the pagetable is unpinned, even though it appears to have been completed. This change adds a second per-cpu cr3 variable - xen_current_cr3 - which tracks the actual state of the vcpu cr3. It is only updated once the actual hypercall to set cr3 has been completed. Other processors wishing to unpin a pagetable can check other vcpu's xen_current_cr3 values to see if any cross-cpu IPIs are needed to clean things up. [ Stable folks: 2.6.23 bugfix ] Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Stable Kernel <stable@kernel.org>	2007-10-16 11:51:30 -07:00
Jesper Juhl	d626a1f1cb	Clean up duplicate includes in arch/i386/xen/ This patch cleans up duplicate includes in arch/i386/xen/ Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>	2007-10-16 11:51:29 -07:00
Jeremy Fitzhardinge	8965c1c095	paravirt: clean up lazy mode handling Currently, the set_lazy_mode pv_op is overloaded with 5 functions: 1. enter lazy cpu mode 2. leave lazy cpu mode 3. enter lazy mmu mode 4. leave lazy mmu mode 5. flush pending batched operations This complicates each paravirt backend, since it needs to deal with all the possible state transitions, handling flushing, etc. In particular, flushing is quite distinct from the other 4 functions, and seems to just cause complication. This patch removes the set_lazy_mode operation, and adds "enter" and "leave" lazy mode operations on mmu_ops and cpu_ops. All the logic associated with enter and leaving lazy states is now in common code (basically BUG_ONs to make sure that no mode is current when entering a lazy mode, and make sure that the mode is current when leaving). Also, flush is handled in a common way, by simply leaving and re-entering the lazy mode. The result is that the Xen, lguest and VMI lazy mode implementations are much simpler. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Andi Kleen <ak@suse.de> Cc: Zach Amsden <zach@vmware.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Avi Kivity <avi@qumranet.com> Cc: Anthony Liguory <aliguori@us.ibm.com> Cc: "Glauber de Oliveira Costa" <glommer@gmail.com> Cc: Jun Nakajima <jun.nakajima@intel.com>	2007-10-16 11:51:29 -07:00
Thomas Gleixner	9702785a74	i386: move xen Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-11 11:16:51 +02:00

48 Commits