linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-11-15 16:24:13 +08:00

Author	SHA1	Message	Date
Glauber Costa	12f9a48f7b	KVM: x86: release kvmclock page on reset When a vcpu is reset, kvmclock page keeps being written to this days. This is wrong and inconsistent: a cpu reset should take it to its initial state. Signed-off-by: Glauber Costa <glommer@redhat.com> CC: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-03-17 13:08:28 -03:00
Huang Ying	f58c9df78c	mm: remove is_hwpoison_address Unused. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-03-17 13:08:27 -03:00
Huang Ying	fafc3dbaac	KVM: Replace is_hwpoison_address with __get_user_pages is_hwpoison_address only checks whether the page table entry is hwpoisoned, regardless the memory page mapped. While __get_user_pages will check both. QEMU will clear the poisoned page table entry (via unmap/map) to make it possible to allocate a new memory page for the virtual address across guest rebooting. But it is also possible that the underlying memory page is kept poisoned even after the corresponding page table entry is cleared, that is, a new memory page can not be allocated. __get_user_pages can catch these situations. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-03-17 13:08:27 -03:00
Huang Ying	69ebb83e13	mm: make __get_user_pages return -EHWPOISON for HWPOISON page optionally Make __get_user_pages return -EHWPOISON for HWPOISON page only if FOLL_HWPOISON is specified. With this patch, the interested callers can distinguish HWPOISON pages from general FAULT pages, while other callers will still get -EFAULT for all these pages, so the user space interface need not to be changed. This feature is needed by KVM, where UCR MCE should be relayed to guest for HWPOISON page, while instruction emulation and MMIO will be tried for general FAULT page. The idea comes from Andrew Morton. Signed-off-by: Huang Ying <ying.huang@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:27 -03:00
Huang Ying	0014bd990e	mm: export __get_user_pages In most cases, get_user_pages and get_user_pages_fast should be used to pin user pages in memory. But sometimes, some special flags except FOLL_GET, FOLL_WRITE and FOLL_FORCE are needed, for example in following patch, KVM needs FOLL_HWPOISON. To support these users, __get_user_pages is exported directly. There are some symbol name conflicts in infiniband driver, fixed them too. Signed-off-by: Huang Ying <ying.huang@intel.com> CC: Andrew Morton <akpm@linux-foundation.org> CC: Michel Lespinasse <walken@google.com> CC: Roland Dreier <roland@kernel.org> CC: Ralph Campbell <infinipath@qlogic.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-03-17 13:08:27 -03:00
john cooper	91c9c3eda4	KVM: x86: handle guest access to BBL_CR_CTL3 MSR A correction to Intel cpu model CPUID data (patch queued) caused winxp to BSOD when booted with a Penryn model. This was traced to the CPUID "model" field correction from 6 -> 23 (as is proper for a Penryn class of cpu). Only in this case does the problem surface. The cause for this failure is winxp accessing the BBL_CR_CTL3 MSR which is unsupported by current kvm, appears to be a legacy MSR not fully characterized yet existing in current silicon, and is apparently carried forward in MSR space to accommodate vintage code as here. It is not yet conclusive whether this MSR implements any of its legacy functionality or is just an ornamental dud for compatibility. While I found no silicon version specific documentation link to this MSR, a general description exists in Intel's developer's reference which agrees with the functional behavior of other bootloader/kernel code I've examined accessing BBL_CR_CTL3. Regrettably winxp appears to be setting bit #19 called out as "reserved" in the above document. So to minimally accommodate this MSR, kvm msr get will provide the equivalent mock data and kvm msr write will simply toss the guest passed data without interpretation. While this treatment of BBL_CR_CTL3 addresses the immediate problem, the approach may be modified pending clarification from Intel. Signed-off-by: john cooper <john.cooper@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-03-17 13:08:27 -03:00
Xiao Guangrong	3cba41307a	KVM: make make_all_cpus_request() lockless Now, we have 'vcpu->mode' to judge whether need to send ipi to other cpus, this way is very exact, so checking request bit is needless, then we can drop the spinlock let it's collateral Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:26 -03:00
Xiao Guangrong	6b7e2d0991	KVM: Add "exiting guest mode" state Currently we keep track of only two states: guest mode and host mode. This patch adds an "exiting guest mode" state that tells us that an IPI will happen soon, so unless we need to wait for the IPI, we can avoid it completely. Also 1: No need atomically to read/write ->mode in vcpu's thread 2: reorganize struct kvm_vcpu to make ->mode and ->requests in the same cache line explicitly Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:26 -03:00
Heiko Carstens	d48ead8b0b	KVM: fix build warning within __kvm_set_memory_region() on s390 Get rid of this warning: CC arch/s390/kvm/../../../virt/kvm/kvm_main.o arch/s390/kvm/../../../virt/kvm/kvm_main.c:596:12: warning: 'kvm_create_dirty_bitmap' defined but not used The only caller of the function is within a !CONFIG_S390 section, so add the same ifdef around kvm_create_dirty_bitmap() as well. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-03-17 13:08:26 -03:00
Jan Kiszka	9ca5231831	KVM: x86: Remove user space triggerable MCE error message This case is a pure user space error we do not need to record. Moreover, it can be misused to flood the kernel log. Remove it. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-03-17 13:08:26 -03:00
Xiao Guangrong	63f42e023e	KVM: fix rcu usage warning in kvm_arch_vcpu_ioctl_set_sregs() Fix: [ 1001.499596] =================================================== [ 1001.499599] [ INFO: suspicious rcu_dereference_check() usage. ] [ 1001.499601] --------------------------------------------------- [ 1001.499604] include/linux/kvm_host.h:301 invoked rcu_dereference_check() without protection! ...... [ 1001.499636] Pid: 6035, comm: qemu-system-x86 Not tainted 2.6.37-rc6+ #62 [ 1001.499638] Call Trace: [ 1001.499644] [] lockdep_rcu_dereference+0x9d/0xa5 [ 1001.499653] [] gfn_to_memslot+0x8d/0xc8 [kvm] [ 1001.499661] [] gfn_to_hva+0x16/0x3f [kvm] [ 1001.499669] [] kvm_read_guest_page+0x1e/0x5e [kvm] [ 1001.499681] [] kvm_read_guest_page_mmu+0x53/0x5e [kvm] [ 1001.499699] [] load_pdptrs+0x3f/0x9c [kvm] [ 1001.499705] [] ? vmx_set_cr0+0x507/0x517 [kvm_intel] [ 1001.499717] [] kvm_arch_vcpu_ioctl_set_sregs+0x1f3/0x3c0 [kvm] [ 1001.499727] [] kvm_vcpu_ioctl+0x6a5/0xbc5 [kvm] Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-03-17 13:08:26 -03:00
Avi Kivity	40712faeb8	KVM: VMX: Avoid atomic operation in vmx_vcpu_run Instead of exchanging the guest and host rcx, have separate storage for each. This allows us to avoid using the xchg instruction, which is is a little slower than normal operations. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-03-17 13:08:26 -03:00
Avi Kivity	1c696d0e1b	KVM: VMX: Simplify saving guest rcx in vmx_vcpu_run Change push top-of-stack pop guest-rcx pop dummy to pop guest-rcx which is the same thing, only simpler. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-03-17 13:08:25 -03:00
Rik van Riel	00c25bce02	KVM: VMX: increase ple_gap default to 128 On some CPUs, a ple_gap of 41 is simply insufficient to ever trigger PLE exits, even with the minimalistic PLE test from kvm-unit-tests. http://git.kernel.org/?p=virt/kvm/kvm-unit-tests.git;a=commitdiff;h=eda71b28fa122203e316483b35f37aaacd42f545 For example, the Xeon X5670 CPU needs a ple_gap of at least 48 in order to get pause loop exits: # modprobe kvm_intel ple_gap=47 # taskset 1 /usr/local/bin/qemu-system-x86_64 \ -device testdev,chardev=log -chardev stdio,id=log \ -kernel x86/vmexit.flat -append ple-round-robin -smp 2 VNC server running on `::1:5900' enabling apic enabling apic ple-round-robin 58298446 # rmmod kvm_intel # modprobe kvm_intel ple_gap=48 # taskset 1 /usr/local/bin/qemu-system-x86_64 \ -device testdev,chardev=log -chardev stdio,id=log \ -kernel x86/vmexit.flat -append ple-round-robin -smp 2 VNC server running on `::1:5900' enabling apic enabling apic ple-round-robin 36616 Increase the ple_gap to 128 to be on the safe side. Signed-off-by: Rik van Riel <riel@redhat.com> Acked-by: Zhai, Edwin <edwin.zhai@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:25 -03:00
Joerg Roedel	3781c01c15	KVM: SVM: Add support for perf-kvm This patch adds the necessary code to run perf-kvm on AMD machines. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:25 -03:00
Avi Kivity	a917949935	KVM: VMX: Avoid leaking fake realmode state to userspace When emulating real mode, we fake some state: - tr.base points to a fake vm86 tss - segment registers are made to conform to vm86 restrictions change vmx_get_segment() not to expose this fake state to userspace; instead, return the original state. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-03-17 13:08:25 -03:00
Avi Kivity	d0ba64f9b4	KVM: VMX: Save and restore tr selector across mode switches When emulating real mode we play with tr hidden state, but leave tr.selector alone. That works well, except for save/restore, since loading TR writes it to the hidden state in vmx->rmode. Fix by also saving and restoring the tr selector; this makes things more consistent and allows migration to work during the early boot stages of Windows XP. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-03-17 13:08:25 -03:00
Peter Tyser	bc9c1933d9	KVM: PPC: Fix SPRG get/set for Book3S and BookE Previously SPRGs 4-7 were improperly read and written in kvm_arch_vcpu_ioctl_get_regs() and kvm_arch_vcpu_ioctl_set_regs(); Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Peter Tyser <ptyser@xes-inc.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-03-17 13:08:25 -03:00
Sedat Dilek	775077a063	KVM guest: Fix section mismatch derived from kvm_guest_cpu_online() WARNING: arch/x86/built-in.o(.text+0x1bb74): Section mismatch in reference from the function kvm_guest_cpu_online() to the function .cpuinit.text:kvm_guest_cpu_init() The function kvm_guest_cpu_online() references the function __cpuinit kvm_guest_cpu_init(). This is often because kvm_guest_cpu_online lacks a __cpuinit annotation or the annotation of kvm_guest_cpu_init is wrong. This patch fixes the warning. Tested with linux-next (next-20101231) Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-03-17 13:08:25 -03:00
Avi Kivity	8234b22e1c	KVM: MMU: Don't flush shadow when enabling dirty tracking Instead, drop large mappings, which were the reason we dropped shadow. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-03-17 13:08:24 -03:00
Jan Kara	0c755de03e	Merge branch 'for_next' into for_linus	2011-03-17 16:44:22 +01:00
Dan Carpenter	1c389795c1	genirq: Fix incorrect unlock in __setup_irq() goto out_thread is called before we take the lock. It causes a gcc warning: "kernel/irq/manage.c:858: warning: ‘flags’ may be used uninitialized in this function" [ tglx: Moved unlock before free_cpumask_var() ] Signed-off-by: Dan Carpenter <error27@gmail.com> LKML-Reference: <20110317114307.GJ2008@bicker> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-03-17 15:52:30 +01:00
Thomas Gleixner	15825a5cd4	cris: Use generic show_interrupts() Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jesper Nilsson <jesper.nilsson@axis.com>	2011-03-17 15:52:25 +01:00
Thomas Gleixner	ee0401ec11	genirq: show_interrupts: Check desc->name before printing it blindly desc->name is not required and not used by all architectures. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-03-17 15:52:19 +01:00
Thomas Gleixner	6d05c80dd2	cris: Use accessor functions to set IRQ_PER_CPU flag Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-03-17 15:52:13 +01:00
Thomas Gleixner	368e2119c3	cris: Fix irq conversion fallout arch/cris/arch-v10/kernel/irq.c: In function 'init_IRQ': arch/cris/arch-v10/kernel/irq.c:202:3: error: implicit declaration of function 'set_irq_desc_and_handler' Should have been set_irq_chip_and_handler() Fix it and convert to the new function names while at it. Reported-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-03-17 15:52:06 +01:00
Borislav Petkov	d34a6ecd45	amd64_edac: Fix decode_syndrome types Those should all be unsigned. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:40 +01:00
Borislav Petkov	8c6717510f	amd64_edac: Fix DCT argument type Fix amd64_debug_display_dimm_sizes() arguments order per convention (pvt is always first). Also, the now second arg denotes the DCT so adjust its type. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:34 +01:00
Borislav Petkov	e761359a25	amd64_edac: Fix ranges signedness The dram ranges make sense only as an unsigned type. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:33 +01:00
Borislav Petkov	972ea17ab9	amd64_edac: Drop local variable Use the macro directly instead Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:32 +01:00
Borislav Petkov	71d2a32e8e	amd64_edac: Fix PCI config addressing types Adjust argument types to the PCI config API's types. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:31 +01:00
Borislav Petkov	151fa71c58	amd64_edac: Fix DRAM base macros Return unsigned u8 values only. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:30 +01:00
Borislav Petkov	b487c33e55	amd64_edac: Fix node id signedness A node id can never be negative since we use it as an index into the DRAM ranges array. This also makes one of the BUG_ON conditions redundant. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:28 +01:00
Borislav Petkov	d88977a9c4	amd64_edac: Drop redundant declarations Those were moved to the mce_amd.h header. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:27 +01:00
Borislav Petkov	df71a05324	amd64_edac: Enable driver on F15h Add the PCI device ids required for driver registration. Remove pvt->ctl_name and use the family descriptor directly, instead. Then, bump driver version and fixup its format. Finally, enable DRAM ECC decoding on F15h. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:26 +01:00
Borislav Petkov	a3b7db09a6	amd64_edac: Adjust ECC symbol size to F15h F15h has the same ECC symbol size options as F10h revD and later so adjust checks to that. Simplify code a bit. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:26 +01:00
Borislav Petkov	87b3e0e6e4	amd64_edac: Simplify scrubrate setting Drop per-instance variable and compute min scrubrate dynamically. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:25 +01:00
Borislav Petkov	cb293250c7	PCI: Rename CPU PCI id define With increasing number of PCI function ids, add the PCI function id in the define name instead of its symbolic name in the BKDG for more clarity. Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:25 +01:00
Borislav Petkov	41d8bfaba7	amd64_edac: Improve DRAM address mapping Drop static tables which map the bits in F2x80 to a chip select size in favor of functions doing the mapping with some bit fiddling. Also, add F15 support. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:24 +01:00
Borislav Petkov	5a5d237169	amd64_edac: Sanitize ->read_dram_ctl_register This function is relevant for F10h and higher, and it has only one callsite so drop its function pointer from the low_ops struct. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:23 +01:00
Borislav Petkov	b15f0fcab1	amd64_edac: Adjust sys_addr to chip select conversion routine to F15h F15h sys_addr to chip select mapping is almost identical to F10h's so reuse that. Rename functions on that path accordingly. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:22 +01:00
Borislav Petkov	355fba6005	amd64_edac: Beef up early exit reporting Add paranoid checks for the sys address before going off and decoding it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:22 +01:00
Borislav Petkov	614ec9d853	amd64_edac: Revamp online spare handling Replace per-DCT macros with smarter ones, drop hack and look for the spare rank on all chip selects on a channel. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:21 +01:00
Borislav Petkov	5d4b58e84a	amd64_edac: Fix channel interleave removal Remove the channel interleave select bit properly. See F2x110[DctSelIntLvAddr] for details. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:21 +01:00
Borislav Petkov	e2f79dbdfb	amd64_edac: Correct node interleaving removal When node interleaving is enabled, a subset of the addr[14:12] bits has to be removed in order to get the normalized DCT address of the DRAM channel. The actual number of bits to remove is determined by F1x[1, 0][7C:40][IntlvEn]. Do this correctly. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:20 +01:00
Borislav Petkov	95b0ef55cd	amd64_edac: Add support for interleaved region swapping On revC3 and revE Fam10h machines and later, non-interleaved graphics framebuffer memory under the 16G mark can be swapped with a region located at the bottom of memory so that the GPU can use the interleaved region and thus two channels. Add support for that. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:20 +01:00
Borislav Petkov	700466249f	amd64_edac: Unify get_error_address The address bits from MC4_STATUS differ only between K8 and the rest so no need for a per-family method. No functional change. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:19 +01:00
Borislav Petkov	f192c7b16c	amd64_edac: Simplify decoding path Use the struct mce directly instead of copying from it into a custom struct err_regs. No functionality change. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:19 +01:00
Borislav Petkov	7d20d14da1	amd64_edac: Adjust channel counting to F15h The only difference is that F10h used to sport ganged DCTs and F15h doesn't so adjust the F10h routine and reuse it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:19 +01:00
Borislav Petkov	5980bb9cd8	amd64_edac: Cleanup old defines cruft Remove unused defines, drop family names from define names. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-03-17 14:46:18 +01:00

... 3 4 5 6 7 ...

240724 Commits