linux-next

mirror of https://github.com/edk2-porting/linux-next.git synced 2024-12-20 03:04:01 +08:00

Author	SHA1	Message	Date
Arjan van de Ven	651dab4264	Merge commit 'linus/master' into merge-linus Conflicts: arch/x86/kvm/i8254.c	2008-10-17 09:20:26 -07:00
Linus Torvalds	08d19f51f0	Merge branch 'kvm-updates/2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm * 'kvm-updates/2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm: (134 commits) KVM: ia64: Add intel iommu support for guests. KVM: ia64: add directed mmio range support for kvm guests KVM: ia64: Make pmt table be able to hold physical mmio entries. KVM: Move irqchip_in_kernel() from ioapic.h to irq.h KVM: Separate irq ack notification out of arch/x86/kvm/irq.c KVM: Change is_mmio_pfn to kvm_is_mmio_pfn, and make it common for all archs KVM: Move device assignment logic to common code KVM: Device Assignment: Move vtd.c from arch/x86/kvm/ to virt/kvm/ KVM: VMX: enable invlpg exiting if EPT is disabled KVM: x86: Silence various LAPIC-related host kernel messages KVM: Device Assignment: Map mmio pages into VT-d page table KVM: PIC: enhance IPI avoidance KVM: MMU: add "oos_shadow" parameter to disable oos KVM: MMU: speed up mmu_unsync_walk KVM: MMU: out of sync shadow core KVM: MMU: mmu_convert_notrap helper KVM: MMU: awareness of new kvm_mmu_zap_page behaviour KVM: MMU: mmu_parent_walk KVM: x86: trap invlpg KVM: MMU: sync roots on mmu reload ...	2008-10-16 15:36:00 -07:00
Harvey Harrison	80a914dc05	misc: replace __FUNCTION__ with __func__ __FUNCTION__ is gcc-specific, use __func__ Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-16 11:21:30 -07:00
Xiantao Zhang	3de42dc094	KVM: Separate irq ack notification out of arch/x86/kvm/irq.c Moving irq ack notification logic as common, and make it shared with ia64 side. Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 14:25:35 +02:00
Xiantao Zhang	8a98f6648a	KVM: Move device assignment logic to common code To share with other archs, this patch moves device assignment logic to common parts. Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 14:25:33 +02:00
Zhang xiantao	371c01b28e	KVM: Device Assignment: Move vtd.c from arch/x86/kvm/ to virt/kvm/ Preparation for kvm/ia64 VT-d support. Signed-off-by: Zhang xiantao <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 14:25:32 +02:00
Marcelo Tosatti	83dbc83a0d	KVM: VMX: enable invlpg exiting if EPT is disabled Manually disabling EPT via module option fails to re-enable INVLPG exiting. Reported-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 14:25:31 +02:00
Jan Kiszka	1b10bf31a5	KVM: x86: Silence various LAPIC-related host kernel messages KVM-x86 dumps a lot of debug messages that have no meaning for normal operation: - INIT de-assertion is ignored - SIPIs are sent and received - APIC writes are unaligned or < 4 byte long (Windows Server 2003 triggers this on SMP) Degrade them to true debug messages, keeping the host kernel log clean for real problems. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-10-15 14:25:30 +02:00
Weidong Han	e5fcfc821a	KVM: Device Assignment: Map mmio pages into VT-d page table Assigned device could DMA to mmio pages, so also need to map mmio pages into VT-d page table. Signed-off-by: Weidong Han <weidong.han@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-10-15 14:25:29 +02:00
Marcelo Tosatti	e48258009d	KVM: PIC: enhance IPI avoidance The PIC code makes little effort to avoid kvm_vcpu_kick(), resulting in unnecessary guest exits in some conditions. For example, if the timer interrupt is routed through the IOAPIC, IRR for IRQ 0 will get set but not cleared, since the APIC is handling the acks. This means that every time an interrupt < 16 is triggered, the priority logic will find IRQ0 pending and send an IPI to vcpu0 (in case IRQ0 is not masked, which is Linux's case). Introduce a new variable isr_ack to represent the IRQ's for which the guest has been signalled / cleared the ISR. Use it to avoid more than one IPI per trigger-ack cycle, in addition to the avoidance when ISR is set in get_priority(). Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-10-15 14:25:28 +02:00
Marcelo Tosatti	582801a95d	KVM: MMU: add "oos_shadow" parameter to disable oos Subject says it all. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-10-15 14:25:27 +02:00
Marcelo Tosatti	0074ff63eb	KVM: MMU: speed up mmu_unsync_walk Cache the unsynced children information in a per-page bitmap. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-10-15 14:25:26 +02:00
Marcelo Tosatti	4731d4c7a0	KVM: MMU: out of sync shadow core Allow guest pagetables to go out of sync. Instead of emulating write accesses to guest pagetables, or unshadowing them, we un-write-protect the page table and allow the guest to modify it at will. We rely on invlpg executions to synchronize individual ptes, and will synchronize the entire pagetable on tlb flushes. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-10-15 14:25:25 +02:00
Marcelo Tosatti	6844dec694	KVM: MMU: mmu_convert_notrap helper Need to convert shadow_notrap_nonpresent -> shadow_trap_nonpresent when unsyncing pages. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-10-15 14:25:24 +02:00
Marcelo Tosatti	0738541396	KVM: MMU: awareness of new kvm_mmu_zap_page behaviour kvm_mmu_zap_page will soon zap the unsynced children of a page. Restart list walk in such case. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-10-15 14:25:23 +02:00
Marcelo Tosatti	ad8cfbe3ff	KVM: MMU: mmu_parent_walk Introduce a function to walk all parents of a given page, invoking a handler. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-10-15 14:25:22 +02:00
Marcelo Tosatti	a7052897b3	KVM: x86: trap invlpg With pages out of sync invlpg needs to be trapped. For now simply nuke the entry. Untested on AMD. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-10-15 14:25:21 +02:00
Marcelo Tosatti	0ba73cdadb	KVM: MMU: sync roots on mmu reload Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-10-15 14:25:20 +02:00
Marcelo Tosatti	e8bc217aef	KVM: MMU: mode specific sync_page Examine guest pagetable and bring the shadow back in sync. Caller is responsible for local TLB flush before re-entering guest mode. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-10-15 14:25:19 +02:00
Marcelo Tosatti	38187c830c	KVM: MMU: do not write-protect large mappings There is not much point in write protecting large mappings. This can only happen when a page is shadowed during the window between is_largepage_backed and mmu_lock acquisition. Zap the entry instead, so the next pagefault will find a shadowed page via is_largepage_backed and fall back to 4k translations. Simplifies out of sync shadow. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-10-15 14:25:18 +02:00
Marcelo Tosatti	a378b4e64c	KVM: MMU: move local TLB flush to mmu_set_spte Since the sync page path can collapse flushes. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-10-15 14:25:17 +02:00
Marcelo Tosatti	1e73f9dd88	KVM: MMU: split mmu_set_spte Split the spte entry creation code into a new set_spte function. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-10-15 14:25:16 +02:00
Marcelo Tosatti	93a423e704	KVM: MMU: flush remote TLBs on large->normal entry overwrite It is necessary to flush all TLB's when a large spte entry is overwritten with a normal page directory pointer. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-10-15 14:25:15 +02:00
Gleb Natapov	af2152f545	KVM: don't enter guest after SIPI was received by a CPU The vcpu should process pending SIPI message before entering guest mode again. kvm_arch_vcpu_runnable() returns true if the vcpu is in SIPI state, so we can't call it here. Signed-off-by: Gleb Natapov <gleb@qumranet.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-10-15 14:25:09 +02:00
Harvey Harrison	2259e3a7a6	KVM: x86.c make kvm_load_realmode_segment static Noticed by sparse: arch/x86/kvm/x86.c:3591:5: warning: symbol 'kvm_load_realmode_segment' was not declared. Should it be static? Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-10-15 14:25:07 +02:00
Marcelo Tosatti	4c2155ce81	KVM: switch to get_user_pages_fast Convert gfn_to_pfn to use get_user_pages_fast, which can do lockless pagetable lookups on x86. Kernel compilation on 4-way guest is 3.7% faster on VMX. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-10-15 14:25:06 +02:00
Amit Shah	bfadaded0d	KVM: Device Assignment: Free device structures if IRQ allocation fails When an IRQ allocation fails, we free up the device structures and disable the device so that we can unregister the device in the userspace and not expose it to the guest at all. Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-10-15 14:25:04 +02:00
Ben-Ami Yassour	62c476c7c7	KVM: Device Assignment with VT-d Based on a patch by: Kay, Allen M <allen.m.kay@intel.com> This patch enables PCI device assignment based on VT-d support. When a device is assigned to the guest, the guest memory is pinned and the mapping is updated in the VT-d IOMMU. [Amit: Expose KVM_CAP_IOMMU so we can check if an IOMMU is present and also control enable/disable from userspace] Signed-off-by: Kay, Allen M <allen.m.kay@intel.com> Signed-off-by: Weidong Han <weidong.han@intel.com> Signed-off-by: Ben-Ami Yassour <benami@il.ibm.com> Signed-off-by: Amit Shah <amit.shah@qumranet.com> Acked-by: Mark Gross <mgross@linux.intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 14:25:04 +02:00
Guillaume Thouvenin	aa3a816b6d	KVM: x86 emulator: Use DstAcc for 'and' For instruction 'and al,imm' we use DstAcc instead of doing the emulation directly into the instruction's opcode. Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:16:14 +02:00
Guillaume Thouvenin	8a9fee67fb	KVM: x86 emulator: Add cmp al, imm and cmp ax, imm instructions (opcodes 3c, 3d) Add decode entries for these opcodes; execution is already implemented. Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:16:14 +02:00
Guillaume Thouvenin	9c9fddd0e7	KVM: x86 emulator: Add DstAcc operand type Add DstAcc operand type. That means that there are 4 bits now for DstMask. "In the good old days cpus would have only one register that was able to fully participate in arithmetic operations, typically called A for Accumulator. The x86 retains this tradition by having special, shorter encodings for the A register (like the cmp opcode), and even some instructions that only operate on A (like mul). SrcAcc and DstAcc would accommodate these instructions by decoding A into the corresponding 'struct operand'." -- Avi Kivity Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:16:14 +02:00
Sheng Yang	defed7ed92	x86: Move FEATURE_CONTROL bits to msr-index.h For MSR_IA32_FEATURE_CONTROL is already there. Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:16:14 +02:00
Sheng Yang	9ea542facb	KVM: VMX: Rename IA32_FEATURE_CONTROL bits Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:16:14 +02:00
Avi Kivity	ef46f18ea0	KVM: x86 emulator: fix jmp r/m64 instruction jmp r/m64 doesn't require the rex.w prefix to indicate the operand size is 64 bits. Set the Stack attribute (even though it doesn't involve the stack, really) to indicate this. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:27 +02:00
Jan Kiszka	4b92fe0c9d	KVM: VMX: Cleanup stalled INTR_INFO read Commit 1c0f4f5011829dac96347b5f84ba37c2252e1e08 left a useless access of VM_ENTRY_INTR_INFO_FIELD in vmx_intr_assist behind. Clean this up. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:26 +02:00
Marcelo Tosatti	9c3e4aab5a	KVM: x86: unhalt vcpu0 on reset Since "KVM: x86: do not execute halted vcpus", HLT by vcpu0 before system reset by the IO thread will hang the guest. Mark vcpu as runnable in such case. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:26 +02:00
Mohammed Gamal	d19292e457	KVM: x86 emulator: Add call near absolute instruction (opcode 0xff/2) Add call near absolute instruction. Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:26 +02:00
Marcelo Tosatti	d76901750a	KVM: x86: do not execute halted vcpus Offline or uninitialized vcpu's can be executed if requested to perform userspace work. Follow Avi's suggestion to handle halted vcpu's in the main loop, simplifying kvm_emulate_halt(). Introduce a new vcpu->requests bit to indicate events that promote state from halted to running. Also standardize vcpu wake sites. Signed-off-by: Marcelo Tosatti <mtosatti <at> redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:26 +02:00
Mohammed Gamal	a6a3034cb9	KVM: x86 emulator: Add in/out instructions (opcodes 0xe4-0xe7, 0xec-0xef) The patch adds in/out instructions to the x86 emulator. The instruction was encountered while running the BIOS while using the invalid guest state emulation patch. Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:25 +02:00
Avi Kivity	fa89a81766	KVM: Add statistics for guest irq injections These can help show whether a guest is making progress or not. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:25 +02:00
Sheng Yang	d40a1ee485	KVM: MMU: Modify kvm_shadow_walk.entry to accept u64 addr EPT is 4 level by default in 32pae(48 bits), but the addr parameter of kvm_shadow_walk->entry() only accepts unsigned long as virtual address, which is 32bit in 32pae. This results in SHADOW_PT_INDEX() overflow when trying to fetch the level 4 index. Fix it by extending kvm_shadow_walk->entry() to accept a 64bit addr parameter. Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:25 +02:00
Mohammed Gamal	fb4616f431	KVM: x86 emulator: Add std and cld instructions (opcodes 0xfc-0xfd) This adds the std and cld instructions to the emulator. Encountered while running the BIOS with invalid guest state emulation enabled. Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:25 +02:00
Joerg Roedel	a89c1ad270	KVM: add MC5_MISC msr read support Currently KVM implements MC0-MC4_MISC read support. When booting Linux this results in KVM warnings in the kernel log when the guest tries to read MC5_MISC. Fix these warnings with this patch. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:24 +02:00
Avi Kivity	48d1503949	KVM: SVM: No need to unprotect memory during event injection when using npt No memory is protected anyway. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:24 +02:00
Avi Kivity	3201b5d9f0	KVM: MMU: Fix setting the accessed bit on non-speculative sptes The accessed bit was accidentally turned on in a random flag word, rather than the spte itself, which was lucky, since it used the non-EPT compatible PT_ACCESSED_MASK. Fix by turning the bit on in the spte and changing it to use the portable accessed mask. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:24 +02:00
Avi Kivity	171d595d3b	KVM: MMU: Flush tlbs after clearing write permission when accessing dirty log Otherwise, the cpu may allow writes to the tracked pages, and we lose some display bits or fail to migrate correctly. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:24 +02:00
Avi Kivity	2245a28fe2	KVM: MMU: Add locking around kvm_mmu_slot_remove_write_access() It was generally safe due to slots_lock being held for write, but it wasn't very nice. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:24 +02:00
Avi Kivity	bc2d429979	KVM: MMU: Account for npt/ept/realmode page faults Now that two-dimensional paging is becoming common, account for tdp page faults. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:23 +02:00
Mohammed Gamal	a5e2e82b8b	KVM: x86 emulator: Add mov r, imm instructions (opcodes 0xb0-0xbf) The emulator only supported one instance of mov r, imm instruction (opcode 0xb8), this adds the rest of these instructions. Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:23 +02:00
Avi Kivity	acee3c04e8	KVM: Allocate guest memory as MAP_PRIVATE, not MAP_SHARED There is no reason to share internal memory slots with fork()ed instances. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:23 +02:00
Avi Kivity	abb9e0b8e3	KVM: MMU: Convert the paging mode shadow walk to use the generic walker Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:23 +02:00
Avi Kivity	140754bc80	KVM: MMU: Convert direct maps to use the generic shadow walker Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:23 +02:00
Avi Kivity	3d000db568	KVM: MMU: Add generic shadow walker We currently walk the shadow page tables in two places: direct map (for real mode and two dimensional paging) and paging mode shadow. Since we anticipate requiring a third walk (for invlpg), it makes sense to have a generic facility for shadow walk. This patch adds such a shadow walker, walks the page tables and calls a method for every spte encountered. The method can examine the spte, modify it, or even instantiate it. The walk can be aborted by returning nonzero from the method. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:23 +02:00
Avi Kivity	6c41f428b7	KVM: MMU: Infer shadow root level in direct_map() In all cases the shadow root level is available in mmu.shadow_root_level, so there is no need to pass it as a parameter. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:22 +02:00
Avi Kivity	6e37d3dc3e	KVM: MMU: Unify direct map 4K and large page paths The two paths are equivalent except for one argument, which is already available. Merge the two codepaths. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:22 +02:00
Avi Kivity	135f8c2b07	KVM: MMU: Move SHADOW_PT_INDEX to mmu.c It is not specific to the paging mode, so can be made global (and reusable). Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:22 +02:00
Avi Kivity	6eb06cb286	KVM: x86 emulator: remove bad ByteOp specifier from NEG descriptor Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:22 +02:00
roel kluin	41afa02587	KVM: x86 emulator: remove duplicate SrcImm Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:21 +02:00
Avi Kivity	f4bbd9aaaa	KVM: Load real mode segments correctly Real mode segments do not reference the GDT or LDT; they simply compute base = selector * 16. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:21 +02:00
Avi Kivity	a16b20da87	KVM: VMX: Change segment dpl at reset to 3 This is more emulation friendly, if not 100% correct. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:21 +02:00
Avi Kivity	5706be0daf	KVM: VMX: Change cs reset state to be a data segment Real mode cs is a data segment, not a code segment. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:21 +02:00
Harvey Harrison	ee032c993e	KVM: make irq ack notifier functions static sparse says: arch/x86/kvm/x86.c:107:32: warning: symbol 'kvm_find_assigned_dev' was not declared. Should it be static? arch/x86/kvm/i8254.c:225:6: warning: symbol 'kvm_pit_ack_irq' was not declared. Should it be static? Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:21 +02:00
Amit Shah	29c8fa32c5	KVM: Use kvm_set_irq to inject interrupts ... instead of using the pic and ioapic variants Signed-off-by: Amit Shah <amit.shah@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:21 +02:00
Amit Shah	94c935a1ee	KVM: SVM: Fix typo Fix typo in as-yet unused macro definition. Signed-off-by: Amit Shah <amit.shah@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:20 +02:00
Mohammed Gamal	a89a8fb93b	KVM: VMX: Modify mode switching and vmentry functions This patch modifies mode switching and vmentry function in order to drive invalid guest state emulation. Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:20 +02:00
Mohammed Gamal	ea953ef0ca	KVM: VMX: Add invalid guest state handler This adds the invalid guest state handler function which invokes the x86 emulator until getting the guest to a VMX-friendly state. [avi: leave atomic context if scheduling] [guillaume: return to atomic context correctly] Signed-off-by: Laurent Vivier <laurent.vivier@bull.net> Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net> Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:20 +02:00
Mohammed Gamal	04fa4d3211	KVM: VMX: Add module parameter and emulation flag. The patch adds the module parameter required to enable emulating invalid guest state, as well as the emulation_required flag used to drive emulation whenever needed. Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:20 +02:00
Mohammed Gamal	648dfaa7df	KVM: VMX: Add Guest State Validity Checks This patch adds functions to check whether guest state is VMX compliant. Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:20 +02:00
Amit Shah	6762b7299a	KVM: Device assignment: Check for privileges before assigning irq Even though we don't share irqs at the moment, we should ensure regular user processes don't try to allocate system resources. We check for capability to access IO devices (CAP_SYS_RAWIO) before we request_irq on behalf of the guest. Noticed by Avi. Signed-off-by: Amit Shah <amit.shah@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:20 +02:00
Avi Kivity	dc7404cea3	KVM: Handle spurious acks for PIT interrupts Spurious acks can be generated, for example if the PIC is being reset. Handle those acks gracefully rather than flooding the log with warnings. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:19 +02:00
Marcelo Tosatti	85428ac7c3	KVM: fix i8259 reset irq acking The irq ack during pic reset has three problems: - Ignores slave/master PIC, using gsi 0-8 for both. - Generates an ACK even if the APIC is in control. - Depends upon IMR being clear, which is broken if the irq was masked at the time it was generated. The last one causes the BIOS to hang after the first reboot of Windows installation, since PIT interrupts stop. [avi: fix check whether pic interrupts are seen by cpu] Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:19 +02:00
Avi Kivity	ecfc79c700	KVM: VMX: Use interrupt queue for !irqchip_in_kernel Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:19 +02:00
Marcelo Tosatti	29415c37f0	KVM: set debug registers after "schedulable" section The vcpu thread can be preempted after the guest_debug_pre() callback, resulting in invalid debug registers on the new vcpu. Move it inside the non-preemptable section. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:19 +02:00
Sheng Yang	464d17c8b7	KVM: VMX: Clean up magic number 0x66 in init_rmode_tss Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:19 +02:00
Dave Hansen	6ad18fba05	KVM: Reduce stack usage in kvm_pv_mmu_op() We're in a hot path. We can't use kmalloc() because it might impact performance. So, we just stick the buffer that we need into the kvm_vcpu_arch structure. This is used very often, so it is not really a waste. We also have to move the buffer structure's definition to the arch-specific x86 kvm header. Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:18 +02:00
Dave Hansen	b772ff362e	KVM: Reduce stack usage in kvm_arch_vcpu_ioctl() [sheng: fix KVM_GET_LAPIC using wrong size] Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com> Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:18 +02:00
Dave Hansen	f0d662759a	KVM: Reduce kvm stack usage in kvm_arch_vm_ioctl() On my machine with gcc 3.4, kvm uses ~2k of stack in a few select functions. This is mostly because gcc fails to notice that the different case: statements could have their stack usage combined. It overflows very nicely if interrupts happen during one of these large uses. This patch uses two methods for reducing stack usage. 1. dynamically allocate large objects instead of putting on the stack. 2. Use a union{} member for all of the case variables. This tricks gcc into combining them all into a single stack allocation. (There's also a comment on this) Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:18 +02:00
Ben-Ami Yassour	4d5c5d0fe8	KVM: pci device assignment Based on a patch from: Amit Shah <amit.shah@qumranet.com> This patch adds support for handling PCI devices that are assigned to the guest. The device to be assigned to the guest is registered in the host kernel and interrupt delivery is handled. If a device is already assigned, or the device driver for it is still loaded on the host, the device assignment is failed by conveying a -EBUSY reply to the userspace. Devices that share their interrupt line are not supported at the moment. By itself, this patch will not make devices work within the guest. The VT-d extension is required to enable the device to perform DMA. Another alternative is PVDMA. Signed-off-by: Amit Shah <amit.shah@qumranet.com> Signed-off-by: Ben-Ami Yassour <benami@il.ibm.com> Signed-off-by: Weidong Han <weidong.han@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:18 +02:00
Marcelo Tosatti	3cf57fed21	KVM: PIT: fix injection logic and count The PIT injection logic is problematic under the following cases: 1) If there is a higher priority vector to be delivered by the time kvm_pit_timer_intr_post is invoked ps->inject_pending won't be set. This opens the possibility for missing many PIT event injections (say if guest executes hlt at this point). 2) ps->inject_pending is racy with more than two vcpus. Since there's no locking around read/dec of pt->pending, two vcpu's can inject two interrupts for a single pt->pending count. Fix 1 by using an irq ack notifier: only reinject when the previous irq has been acked. Fix 2 with appropriate locking around manipulation of pending count and irq_ack by the injection / ack paths. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:17 +02:00
Marcelo Tosatti	f52447261b	KVM: irq ack notification Based on a patch from: Ben-Ami Yassour <benami@il.ibm.com> which was based on a patch from: Amit Shah <amit.shah@qumranet.com> Notify IRQ acking on PIC/APIC emulation. The previous patch missed two things: - Edge triggered interrupts on IOAPIC - PIC reset with IRR/ISR set should be equivalent to ack (LAPIC probably needs something similar). Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> CC: Amit Shah <amit.shah@qumranet.com> CC: Ben-Ami Yassour <benami@il.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:16 +02:00
Avi Kivity	564f15378f	KVM: Add irq ack notifier list This can be used by kvm subsystems that are interested in when interrupts are acked, for example time drift compensation. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:16 +02:00
Alexander Graf	b5e2fec0eb	KVM: Ignore DEBUGCTL MSRs with no effect Netware writes to DEBUGCTL and reads from the DEBUGCTL and LAST*IP MSRs without further checks and is really confused to receive a #GP during that. To make it happy we should just make them stubs, which is exactly what SVM already does. Writes to DEBUGCTL that are vendor-specific are resembled to behave as if the virtual CPU does not know them. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:15 +02:00
Avi Kivity	313dbd49dc	KVM: VMX: Avoid vmwrite(HOST_RSP) when possible Usually HOST_RSP retains its value across guest entries. Take advantage of this and avoid a vmwrite() when this is so. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:15 +02:00
Avi Kivity	80e31d4f61	KVM: SVM: Unify register save/restore across 32 and 64 bit hosts Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:14 +02:00
Avi Kivity	c801949ddf	KVM: VMX: Unify register save/restore across 32 and 64 bit hosts Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:14 +02:00
Jan Kiszka	77ab6db0a1	KVM: VMX: Reinject real mode exception As we execute real mode guests in VM86 mode, exceptions have to be reinjected appropriately when the guest triggered them. For this purpose the patch adopts the real-mode injection pattern used in vmx_inject_irq to vmx_queue_exception, additionally taking care that the IP is set correctly for #BP exceptions. Furthermore it extends handle_rmode_exception to reinject all those exceptions that can be raised in real mode. This fixes the execution of himem.exe from FreeDOS and also makes its debug.com work properly. Note that guest debugging in real mode is broken now. This has to be fixed by the scheduled debugging infrastructure rework (will be done once base patches for QEMU have been accepted). Signed-off-by: Jan Kiszka <jan.kiszka@web.de> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:14 +02:00
Jan Kiszka	19bd8afdc4	KVM: Consolidate XX_VECTOR defines Signed-off-by: Jan Kiszka <jan.kiszka@web.de> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:14 +02:00
Avi Kivity	7edd0ce058	KVM: Consolidate PIC isr clearing into a function Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:14 +02:00
Mohammed Gamal	60bd83a125	KVM: VMX: Remove redundant check in handle_rmode_exception Since checking for vcpu->arch.rmode.active is already done whenever we call handle_rmode_exception(), checking it inside the function is redundant. Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:13 +02:00
Avi Kivity	f7d9238f5d	KVM: VMX: Move interrupt post-processing to vmx_complete_interrupts() Instead of looking at failed injections in the vm entry path, move processing to the exit path in vmx_complete_interrupts(). This simplifies the logic and removes any state that is hidden in vmx registers. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:13 +02:00
Avi Kivity	937a7eaef9	KVM: Add a pending interrupt queue Similar to the exception queue, this holds interrupts that have been accepted by the virtual processor core but not yet injected. Not yet used. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:13 +02:00
Avi Kivity	35920a3569	KVM: VMX: Fix pending exception processing The vmx code assumes that IDT-Vectoring can only be set when an exception is injected due to the exception in question. That's not true, however: if the exception is injected correctly, and later another exception occurs but its delivery is blocked due to a fault, then we will incorrectly assume the first exception was not delivered. Fix by unconditionally dequeuing the pending exception, and requeuing it (or the second exception) if we see it in the IDT-Vectoring field. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:13 +02:00
Avi Kivity	26eef70c3e	KVM: Clear exception queue before emulating an instruction If we're emulating an instruction, either it will succeed, in which case any previously queued exception will be spurious, or we will requeue the same exception. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:13 +02:00
Avi Kivity	668f612fa0	KVM: VMX: Move nmi injection failure processing to vm exit path Instead of processing nmi injection failure in the vm entry path, move it to the vm exit path (vm_complete_interrupts()). This separates nmi injection from nmi post-processing, and moves the nmi state from the VT state into vcpu state (new variable nmi_injected specifying an injection in progress). Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:13 +02:00
Avi Kivity	cf393f7566	KVM: Move NMI IRET fault processing to new vmx_complete_interrupts() Currently most interrupt exit processing is handled on the entry path, which is confusing. Move the NMI IRET fault processing to a new function, vmx_complete_interrupts(), which is called on the vmexit path. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:12 +02:00
Avi Kivity	5b5c6a5a60	KVM: MMU: Simplify kvm_mmu_zap_page() The twisty maze of conditionals can be reduced. [joerg: fix tlb flushing] Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:12 +02:00
Avi Kivity	31aa2b44af	KVM: MMU: Separate the code for unlinking a shadow page from its parents Place into own function, in preparation for further cleanups. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:12 +02:00
Amit Shah	867767a365	KVM: Introduce kvm_set_irq to inject interrupts in guests This function injects an interrupt into the guest given the kvm struct, the (guest) irq number and the interrupt level. Signed-off-by: Amit Shah <amit.shah@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:12 +02:00
Marcelo Tosatti	5fdbf9765b	KVM: x86: accessors for guest registers As suggested by Avi, introduce accessors to read/write guest registers. This simplifies the ->cache_regs/->decache_regs interface, and improves register caching which is important for VMX, where the cost of vmcs_read/vmcs_write is significant. [avi: fix warnings] Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:13:57 +02:00
Sheng Yang	ca60dfbb69	KVM: VMX: Rename misnamed msr bits MSR_IA32_FEATURE_LOCKED is just a bit in fact, which shouldn't be prefixed with MSR_. So is MSR_IA32_FEATURE_VMXON_ENABLED. Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:13:57 +02:00


386 Commits