linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-11-16 16:54:20 +08:00

Author	SHA1	Message	Date
Dongxiao Xu	b923e62e4d	KVM: VMX: VMCLEAR/VMPTRLD usage changes Originally VMCLEAR/VMPTRLD is called on vcpu migration. To support hosted VMM coexistance, VMCLEAR is executed on vcpu schedule out, and VMPTRLD is executed on vcpu schedule in. This could also eliminate the IPI when doing VMCLEAR. Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-08-01 10:35:42 +03:00
Dongxiao Xu	92fe13be74	KVM: VMX: Some minor changes to code structure Do some preparations for vmm coexistence support. Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-08-01 10:35:42 +03:00
Dongxiao Xu	7725b89414	KVM: VMX: Define new functions to wrapper direct call of asm code Define vmcs_load() and kvm_cpu_vmxon() to avoid direct call of asm code. Also move VMXE bit operation out of kvm_cpu_vmxoff(). Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-08-01 10:35:41 +03:00
Avi Kivity	f0f5933a16	KVM: MMU: Fix free memory accounting race in mmu_alloc_roots() We drop the mmu lock between freeing memory and allocating the roots; this allows some other vcpu to sneak in and allocate memory. While the race is benign (resulting only in temporary overallocation, not oom) it is simple and easy to fix by moving the freeing close to the allocation. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-08-01 10:35:41 +03:00
Gleb Natapov	6d77dbfc88	KVM: inject #UD if instruction emulation fails and exit to userspace Do not kill VM when instruction emulation fails. Inject #UD and report failure to userspace instead. Userspace may choose to reenter guest if vcpu is in userspace (cpl == 3) in which case guest OS will kill offending process and continue running. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-08-01 10:35:40 +03:00
Gui Jianfeng	54a4f0239f	KVM: MMU: make kvm_mmu_zap_page() return the number of pages it actually freed Currently, kvm_mmu_zap_page() returning the number of freed children sp. This might confuse the caller, because caller don't know the actual freed number. Let's make kvm_mmu_zap_page() return the number of pages it actually freed. Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:39 +03:00
Gui Jianfeng	518c5a05e8	KVM: MMU: Fix debug output error in walk_addr() Fix a debug output error in walk_addr Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:39 +03:00
Gui Jianfeng	f3b8c964a9	KVM: MMU: mark page table dirty when a pte is actually modified Sometime cmpxchg_gpte doesn't modify gpte, in such case, don't mark page table page as dirty. Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:38 +03:00
Joerg Roedel	eec4b140c9	KVM: SVM: Allow EFER.LMSLE to be set with nested svm This patch enables setting of efer bit 13 which is allowed in all SVM capable processors. This is necessary for the SLES11 version of Xen 4.0 to boot with nested svm. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:38 +03:00
Joerg Roedel	3f10c846f8	KVM: SVM: Dump vmcb contents on failed vmrun This patch adds a function to dump the vmcb into the kernel log and calls it after a failed vmrun to ease debugging. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:38 +03:00
Avi Kivity	d94e1dc9af	KVM: Get rid of KVM_REQ_KICK KVM_REQ_KICK poisons vcpu->requests by having a bit set during normal operation. This causes the fast path check for a clear vcpu->requests to fail all the time, triggering tons of atomic operations. Fix by replacing KVM_REQ_KICK with a vcpu->guest_mode atomic. Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:37 +03:00
Gleb Natapov	54b8486f46	KVM: x86 emulator: do not inject exception directly into vcpu Return exception as a result of instruction emulation and handle injection in KVM code. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:37 +03:00
Gleb Natapov	95cb229530	KVM: x86 emulator: move interruptibility state tracking out of emulator Emulator shouldn't access vcpu directly. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:36 +03:00
Gleb Natapov	4d2179e1e9	KVM: x86 emulator: handle shadowed registers outside emulator Emulator shouldn't access vcpu directly. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:36 +03:00
Gleb Natapov	bdb475a323	KVM: x86 emulator: use shadowed register in emulate_sysexit() emulate_sysexit() should use shadowed registers copy instead of looking into vcpu state directly. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:36 +03:00
Gleb Natapov	ef050dc039	KVM: x86 emulator: set RFLAGS outside x86 emulator code Removes the need for set_flags() callback. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:35 +03:00
Gleb Natapov	95c5588652	KVM: x86 emulator: advance RIP outside x86 emulator code Return new RIP as part of instruction emulation result instead of updating KVM's RIP from x86 emulator code. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:35 +03:00
Gleb Natapov	3457e4192e	KVM: handle emulation failure case first If emulation failed return immediately. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:34 +03:00
Gleb Natapov	8fe681e984	KVM: do not inject #PF in (read\|write)_emulated() callbacks Return error to x86 emulator instead of injection exception behind its back. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:34 +03:00
Gleb Natapov	f181b96d4c	KVM: remove export of emulator_write_emulated() It is not called directly outside of the file it's defined in anymore. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:34 +03:00
Gleb Natapov	c3cd7ffaf5	KVM: x86 emulator: x86_emulate_insn() return -1 only in case of emulation failure Currently emulator returns -1 when emulation failed or IO is needed. Caller tries to guess whether emulation failed by looking at other variables. Make it easier for caller to recognise error condition by always returning -1 in case of failure. For this new emulator internal return value X86EMUL_IO_NEEDED is introduced. It is used to distinguish between error condition (which returns X86EMUL_UNHANDLEABLE) and condition that requires IO exit to userspace to continue emulation. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:33 +03:00
Gleb Natapov	411c35b7ef	KVM: fill in run->mmio details in (read\|write)_emulated function Fill in run->mmio details in (read\|write)_emulated function just like pio does. There is no point in filling only vcpu fields there just to copy them into vcpu->run a little bit later. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:33 +03:00
Gleb Natapov	e680080e65	KVM: x86 emulator: fix X86EMUL_RETRY_INSTR and X86EMUL_CMPXCHG_FAILED values Currently X86EMUL_PROPAGATE_FAULT, X86EMUL_RETRY_INSTR and X86EMUL_CMPXCHG_FAILED have the same value so caller cannot distinguish why function such as emulator_cmpxchg_emulated() (which can return both X86EMUL_PROPAGATE_FAULT and X86EMUL_CMPXCHG_FAILED) failed. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:32 +03:00
Gleb Natapov	338dbc9781	KVM: x86 emulator: make (get\|set)_dr() callback return error if it fails Make (get\|set)_dr() callback return error if it fails instead of injecting exception behind emulator's back. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:32 +03:00
Gleb Natapov	0f12244fe7	KVM: x86 emulator: make set_cr() callback return error if it fails Make set_cr() callback return error if it fails instead of injecting #GP behind emulator's back. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:31 +03:00
Gleb Natapov	79168fd1a3	KVM: x86 emulator: cleanup some direct calls into kvm to use existing callbacks Use callbacks from x86_emulate_ops to access segments instead of calling into kvm directly. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:31 +03:00
Gleb Natapov	5951c44237	KVM: x86 emulator: add get_cached_segment_base() callback to x86_emulate_ops On VMX it is expensive to call get_cached_descriptor() just to get segment base since multiple vmcs_reads are done instead of only one. Introduce new call back get_cached_segment_base() for efficiency. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:31 +03:00
Gleb Natapov	3fb1b5dbd3	KVM: x86 emulator: add (set\|get)_msr callbacks to x86_emulate_ops Add (set\|get)_msr callbacks to x86_emulate_ops instead of calling them directly. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:30 +03:00
Gleb Natapov	35aa5375d4	KVM: x86 emulator: add (set\|get)_dr callbacks to x86_emulate_ops Add (set\|get)_dr callbacks to x86_emulate_ops instead of calling them directly. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:30 +03:00
Gleb Natapov	414e6277fd	KVM: x86 emulator: handle "far address" source operand ljmp/lcall instruction operand contains address and segment. It can be 10 bytes long. Currently we decode it as two different operands. Fix it by introducing new kind of operand that can hold entire far address. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:30 +03:00
Gleb Natapov	b8a98945ea	KVM: x86 emulator: cleanup nop emulation Make it more explicit what we are checking for. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:29 +03:00
Gleb Natapov	f0c13ef1a8	KVM: x86 emulator: cleanup xchg emulation Dst operand is already initialized during decoding stage. No need to reinitialize. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:29 +03:00
Gleb Natapov	054fe9f6e3	KVM: x86 emulator: fix Move r/m16 to segment register decoding This instruction does not need generic decoding for its dst operand. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:28 +03:00
Gleb Natapov	9de4157367	KVM: x86 emulator: introduce read cache Introduce read cache which is needed for instruction that require more then one exit to userspace. After returning from userspace the instruction will be re-executed with cached read value. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:28 +03:00
Avi Kivity	1c11e71357	KVM: VMX: Avoid writing HOST_CR0 every entry cr0.ts may change between entries, so we copy cr0 to HOST_CR0 before each entry. That is slow, so instead, set HOST_CR0 to have TS set unconditionally (which is a safe value), and issue a clts() just before exiting vcpu context if the task indeed owns the fpu. Saves ~50 cycles/exit. Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:28 +03:00
Avi Kivity	08acfa1871	KVM: kvm_pdptr_read() may sleep Annotate it thusly. Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:27 +03:00
Takuya Yoshikawa	914ebccd2d	KVM: x86: avoid unnecessary bitmap allocation when memslot is clean Although we always allocate a new dirty bitmap in x86's get_dirty_log(), it is only used as a zero-source of copy_to_user() and freed right after that when memslot is clean. This patch uses clear_user() instead of doing this unnecessary zero-source allocation. Performance improvement: as we can expect easily, the time needed to allocate a bitmap is completely reduced. In my test, the improved ioctl was about 4 to 10 times faster than the original one for clean slots. Furthermore, reducing memory allocations and copies will produce good effects to caches too. Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:27 +03:00
Avi Kivity	c332c83ae7	KVM: VMX: Simplify vmx_get_nmi_mask() !! is not needed due to the cast to bool. Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:27 +03:00
Huang Ying	bf998156d2	KVM: Avoid killing userspace through guest SRAO MCE on unmapped pages In common cases, guest SRAO MCE will cause corresponding poisoned page be un-mapped and SIGBUS be sent to QEMU-KVM, then QEMU-KVM will relay the MCE to guest OS. But it is reported that if the poisoned page is accessed in guest after unmapping and before MCE is relayed to guest OS, userspace will be killed. The reason is as follows. Because poisoned page has been un-mapped, guest access will cause guest exit and kvm_mmu_page_fault will be called. kvm_mmu_page_fault can not get the poisoned page for fault address, so kernel and user space MMIO processing is tried in turn. In user MMIO processing, poisoned page is accessed again, then userspace is killed by force_sig_info. To fix the bug, kvm_mmu_page_fault send HWPOISON signal to QEMU-KVM and do not try kernel and user space MMIO processing for poisoned page. [xiao: fix warning introduced by avi] Reported-by: Max Asbock <masbock@linux.vnet.ibm.com> Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:26 +03:00
Jason Wessel	ba773f7c51	x86,kgdb: Fix hw breakpoint regression HW breakpoints events stopped working correctly with kgdb as a result of commit: `018cbffe68` (Merge commit 'v2.6.33' into perf/core). The regression occurred because the behavior changed for setting NOTIFY_STOP as the return value to the die notifier if the breakpoint was known to the HW breakpoint API. Because kgdb is using the HW breakpoint API to register HW breakpoints slots, it must also now implement the overflow_handler call back else kgdb does not get to see the events from the die notifier. The kgdb_ll_trap function will be changed to be general purpose code which can allow an easy way to implement the hw_breakpoint API overflow call back. Signed-off-by: Jason Wessel <jason.wessel@windriver.com> Acked-by: Dongdong Deng <dongdong.deng@windriver.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com>	2010-07-28 19:10:30 -05:00
Linus Torvalds	1a041a23da	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Do not try to disable hpet if it hasn't been initialized before x86, i8259: Only register sysdev if we have a real 8259 PIC	2010-07-26 16:02:07 -07:00
Linus Torvalds	ee13cbdec4	Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: [CPUFREQ] powernow-k8: Limit Pstate transition latency check [CPUFREQ] Fix PCC driver error path [CPUFREQ] fix double freeing in error path of pcc-cpufreq [CPUFREQ] pcc driver should check for pcch method before calling _OSC [CPUFREQ] fix memory leak in cpufreq_add_dev [CPUFREQ] revert "[CPUFREQ] remove rwsem lock from CPUFREQ_GOV_STOP call (second call site)"	2010-07-26 15:35:53 -07:00
Borislav Petkov	3581ced3b6	[CPUFREQ] powernow-k8: Limit Pstate transition latency check The Pstate transition latency check was added for broken F10h BIOSen which wrongly contain a value of 0 for transition and bus master latency. Fam11h and later, however, (will) have similar transition latency so extend that behavior for them too. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: Dave Jones <davej@redhat.com>	2010-07-26 15:25:34 -04:00
Matthew Garrett	179ee43465	[CPUFREQ] Fix PCC driver error path The PCC cpufreq driver unmaps the mailbox address range if any CPUs fail to initialise, but doesn't do anything to remove the registered CPUs from the cpufreq core resulting in failures further down the line. We're better off simply returning a failure - the cpufreq core will unregister us cleanly if we end up with no successfully registered CPUs. Tidy up the failure path and also add a sanity check to ensure that the firmware gives us a realistic frequency - the core deals badly with that being set to 0. Signed-off-by: Matthew Garrett <mjg@redhat.com> Cc: Naga Chumbalkar <nagananda.chumbalkar@hp.com> Signed-off-by: Dave Jones <davej@redhat.com>	2010-07-26 15:25:34 -04:00
Daniel J Blueman	3847d223f2	[CPUFREQ] fix double freeing in error path of pcc-cpufreq Prevent double freeing on error path. Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com> Signed-off-by: Dave Jones <davej@redhat.com>	2010-07-26 15:25:34 -04:00
Matthew Garrett	47f8bcf362	[CPUFREQ] pcc driver should check for pcch method before calling _OSC The pcc specification documents an _OSC method that's incompatible with the one defined as part of the ACPI spec. This shouldn't be a problem as both are supposed to be guarded with a UUID. Unfortunately approximately nobody (including HP, who wrote this spec) properly check the UUID on entry to the _OSC call. Right now this could result in surprising behaviour if the pcc driver performs an _OSC call on a machine that doesn't implement the pcc specification. Check whether the PCCH method exists first in order to reduce this probability. Signed-off-by: Matthew Garrett <mjg@redhat.com> Cc: Naga Chumbalkar <nagananda.chumbalkar@hp.com> Signed-off-by: Dave Jones <davej@redhat.com>	2010-07-26 15:25:34 -04:00
Linus Torvalds	20ba5efb9c	Merge branch 'kvm-updates/2.6.35' of git://git.kernel.org/pub/scm/virt/kvm/kvm * 'kvm-updates/2.6.35' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: Use kmalloc() instead of vmalloc() for KVM_[GS]ET_MSR KVM: MMU: fix conflict access permissions in direct sp	2010-07-26 08:18:18 -07:00
Len Brown	0e1cf38889	Merge branch 'bugzilla-16396' into release	2010-07-24 23:26:22 -04:00
Rafael J. Wysocki	72ad5d77fb	ACPI / Sleep: Allow the NVS saving to be skipped during suspend to RAM Commit `2a6b69765a` (ACPI: Store NVS state even when entering suspend to RAM) caused the ACPI suspend code save the NVS area during suspend and restore it during resume unconditionally, although it is known that some systems need to use acpi_sleep=s4_nonvs for hibernation to work. To allow the affected systems to avoid saving and restoring the NVS area during suspend to RAM and resume, introduce kernel command line option acpi_sleep=nonvs and make acpi_sleep=s4_nonvs work as its alias temporarily (add acpi_sleep=s4_nonvs to the feature removal file). Addresses https://bugzilla.kernel.org/show_bug.cgi?id=16396 . Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Reported-and-tested-by: tomas m <tmezzadra@gmail.com> Signed-off-by: Len Brown <len.brown@intel.com>	2010-07-24 23:26:09 -04:00
Stefano Stabellini	ff4878089e	x86: Do not try to disable hpet if it hasn't been initialized before hpet_disable is called unconditionally on machine reboot if hpet support is compiled in the kernel. hpet_disable only checks if the machine is hpet capable but doesn't make sure that hpet has been initialized. [ tglx: Made it a one liner and removed the redundant hpet_address check ] Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Venkatesh Pallipadi <venki@google.com> LKML-Reference: <alpine.DEB.2.00.1007211726240.22235@kaball-desktop> Cc: stable@kernel.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-07-23 12:53:00 +02:00

1 2 3 4 5 ...

10798 Commits