linux/arch/x86
Kazuki Takiguchi f88a6977f8 KVM: x86/mmu: Fix race condition in direct_page_fault
commit 47b0c2e4c2 upstream.

make_mmu_pages_available() must be called with mmu_lock held for write.
However, if the TDP MMU is used, it will be called with mmu_lock held for
read.
This function does nothing unless shadow pages are used, so there is no
race unless nested TDP is used.
Since nested TDP uses shadow pages, old shadow pages may be zapped by this
function even when the TDP MMU is enabled.
Since shadow pages are never allocated by kvm_tdp_mmu_map(), a race
condition can be avoided by not calling make_mmu_pages_available() if the
TDP MMU is currently in use.

I encountered this when repeatedly starting and stopping nested VM.
It can be artificially caused by allocating a large number of nested TDP
SPTEs.

For example, the following BUG and general protection fault are caused in
the host kernel.

pte_list_remove: 00000000cd54fc10 many->many
------------[ cut here ]------------
kernel BUG at arch/x86/kvm/mmu/mmu.c:963!
invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
RIP: 0010:pte_list_remove.cold+0x16/0x48 [kvm]
Call Trace:
 <TASK>
 drop_spte+0xe0/0x180 [kvm]
 mmu_page_zap_pte+0x4f/0x140 [kvm]
 __kvm_mmu_prepare_zap_page+0x62/0x3e0 [kvm]
 kvm_mmu_zap_oldest_mmu_pages+0x7d/0xf0 [kvm]
 direct_page_fault+0x3cb/0x9b0 [kvm]
 kvm_tdp_page_fault+0x2c/0xa0 [kvm]
 kvm_mmu_page_fault+0x207/0x930 [kvm]
 npf_interception+0x47/0xb0 [kvm_amd]
 svm_invoke_exit_handler+0x13c/0x1a0 [kvm_amd]
 svm_handle_exit+0xfc/0x2c0 [kvm_amd]
 kvm_arch_vcpu_ioctl_run+0xa79/0x1780 [kvm]
 kvm_vcpu_ioctl+0x29b/0x6f0 [kvm]
 __x64_sys_ioctl+0x95/0xd0
 do_syscall_64+0x5c/0x90

general protection fault, probably for non-canonical address
0xdead000000000122: 0000 [#1] PREEMPT SMP NOPTI
RIP: 0010:kvm_mmu_commit_zap_page.part.0+0x4b/0xe0 [kvm]
Call Trace:
 <TASK>
 kvm_mmu_zap_oldest_mmu_pages+0xae/0xf0 [kvm]
 direct_page_fault+0x3cb/0x9b0 [kvm]
 kvm_tdp_page_fault+0x2c/0xa0 [kvm]
 kvm_mmu_page_fault+0x207/0x930 [kvm]
 npf_interception+0x47/0xb0 [kvm_amd]

CVE: CVE-2022-45869
Fixes: a2855afc7e ("KVM: x86/mmu: Allow parallel page faults for the TDP MMU")
Signed-off-by: Kazuki Takiguchi <takiguchi.kazuki171@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-12-08 11:28:43 +01:00
..
boot x86: link vdso and boot with -z noexecstack --no-warn-rwx-segments 2022-08-17 14:22:45 +02:00
configs x86/kbuild: Enable CONFIG_KALLSYMS_ALL=y in the defconfigs 2022-01-27 11:04:56 +01:00
crypto crypto: blake2s - remove shash module 2022-08-17 14:24:19 +02:00
entry x86/entry: Move CLD to the start of the idtentry macro 2022-08-31 17:16:34 +02:00
events perf/x86/intel/pt: Fix sampling using single range output 2022-11-26 09:24:48 +01:00
hyperv x86/hyperv: Restore VP assist page after cpu offlining/onlining 2022-12-02 17:41:03 +01:00
ia32 binfmt: remove in-tree usage of MAP_DENYWRITE 2021-09-03 18:42:01 +02:00
include x86/bugs: Make sure MSR_SPEC_CTRL is updated properly upon resume from S3 2022-12-08 11:28:42 +01:00
kernel x86/bugs: Make sure MSR_SPEC_CTRL is updated properly upon resume from S3 2022-12-08 11:28:42 +01:00
kvm KVM: x86/mmu: Fix race condition in direct_page_fault 2022-12-08 11:28:43 +01:00
lib x86/entry_32: Fix segment exceptions 2022-07-29 17:25:33 +02:00
math-emu x86: Prepare asm files for straight-line-speculation 2022-05-15 20:18:49 +02:00
mm x86/ioremap: Fix page aligned size calculation in __ioremap_caller() 2022-12-02 17:41:09 +01:00
net x86/extable: Extend extable functionality 2022-07-29 17:25:26 +02:00
pci x86/PCI: Fix ALi M1487 (IBC) PIRQ router link value interpretation 2022-06-09 10:22:46 +02:00
platform x86/olpc: fix 'logical not is only applied to the left hand side' 2022-08-17 14:24:18 +02:00
power x86/pm: Add enumeration check before spec MSRs save/restore setup 2022-12-02 17:41:09 +01:00
purgatory kernel.h: split out panic and oops helpers 2021-07-01 11:06:04 -07:00
ras
realmode x86/mm: Flush global TLB when switching to trampoline page-table 2022-01-27 11:04:35 +01:00
tools - Remove cc-option checks which are old and already supported by the 2021-08-30 13:27:16 -07:00
um arch: um: Mark the stack non-executable to fix a binutils warning 2022-10-12 09:53:27 +02:00
video
xen x86/entry: Work around Clang __bdos() bug 2022-10-26 12:35:31 +02:00
.gitignore
Kbuild
Kconfig x86/Kconfig: Drop check for -mabi=ms for CONFIG_EFI_STUB 2022-10-29 10:12:58 +02:00
Kconfig.assembler
Kconfig.cpu
Kconfig.debug arch: make TRACE_IRQFLAGS_NMI_SUPPORT generic 2022-08-17 14:23:00 +02:00
Makefile x86/realmode: build with -D__DISABLE_EXPORTS 2022-07-23 12:53:56 +02:00
Makefile_32.cpu x86/build: Do not add -falign flags unconditionally for clang 2021-09-19 10:35:53 +09:00
Makefile.um