linux/arch/x86/kvm
Sean Christopherson ce25681d59 KVM: x86/mmu: Protect marking SPs unsync when using TDP MMU with spinlock
Add yet another spinlock for the TDP MMU and take it when marking indirect
shadow pages unsync.  When using the TDP MMU and L1 is running L2(s) with
nested TDP, KVM may encounter shadow pages for the TDP entries managed by
L1 (controlling L2) when handling a TDP MMU page fault.  The unsync logic
is not thread safe, e.g. the kvm_mmu_page fields are not atomic, and
misbehaves when a shadow page is marked unsync via a TDP MMU page fault,
which runs with mmu_lock held for read, not write.

Lack of a critical section manifests most visibly as an underflow of
unsync_children in clear_unsync_child_bit() due to unsync_children being
corrupted when multiple CPUs write it without a critical section and
without atomic operations.  But underflow is the best case scenario.  The
worst case scenario is that unsync_children prematurely hits '0' and
leads to guest memory corruption due to KVM neglecting to properly sync
shadow pages.

Use an entirely new spinlock even though piggybacking tdp_mmu_pages_lock
would functionally be ok.  Usurping the lock could degrade performance when
building upper level page tables on different vCPUs, especially since the
unsync flow could hold the lock for a comparatively long time depending on
the number of indirect shadow pages and the depth of the paging tree.

For simplicity, take the lock for all MMUs, even though KVM could fairly
easily know that mmu_lock is held for write.  If mmu_lock is held for
write, there cannot be contention for the inner spinlock, and marking
shadow pages unsync across multiple vCPUs will be slow enough that
bouncing the kvm_arch cacheline should be in the noise.

Note, even though L2 could theoretically be given access to its own EPT
entries, a nested MMU must hold mmu_lock for write and thus cannot race
against a TDP MMU page fault.  I.e. the additional spinlock only _needs_ to
be taken by the TDP MMU, as opposed to being taken by any MMU for a VM
that is running with the TDP MMU enabled.  Holding mmu_lock for read also
prevents the indirect shadow page from being freed.  But as above, keep
it simple and always take the lock.

Alternative #1, the TDP MMU could simply pass "false" for can_unsync and
effectively disable unsync behavior for nested TDP.  Write protecting leaf
shadow pages is unlikely to noticeably impact traditional L1 VMMs, as such
VMMs typically don't modify TDP entries, but the same may not hold true for
non-standard use cases and/or VMMs that are migrating physical pages (from
L1's perspective).

Alternative #2, the unsync logic could be made thread safe.  In theory,
simply converting all relevant kvm_mmu_page fields to atomics and using
atomic bitops for the bitmap would suffice.  However, (a) an in-depth audit
would be required, (b) the code churn would be substantial, and (c) legacy
shadow paging would incur additional atomic operations in performance
sensitive paths for no benefit (to legacy shadow paging).

Fixes: a2855afc7e ("KVM: x86/mmu: Allow parallel page faults for the TDP MMU")
Cc: stable@vger.kernel.org
Cc: Ben Gardon <bgardon@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210812181815.3378104-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-13 03:32:14 -04:00
..
mmu KVM: x86/mmu: Protect marking SPs unsync when using TDP MMU with spinlock 2021-08-13 03:32:14 -04:00
svm KVM: SVM: improve the code readability for ASID management 2021-08-04 09:43:03 -04:00
vmx KVM: VMX: Remove vmx_msr_index from vmx.h 2021-07-15 10:19:41 -04:00
cpuid.c KVM: x86/pmu: Clear anythread deprecated bit when 0xa leaf is unsupported on the SVM 2021-07-14 12:17:56 -04:00
cpuid.h KVM: x86: Move reverse CPUID helpers to separate header file 2021-04-26 05:27:13 -04:00
debugfs.c KVM: nVMX: nSVM: Add a new VCPU statistic to show if VCPU is in guest mode 2021-06-17 13:09:36 -04:00
emulate.c KVM: x86: Drop "pre_" from enter/leave_smm() helpers 2021-06-17 13:09:35 -04:00
fpu.h KVM: x86: Move FPU register accessors into fpu.h 2021-06-17 13:09:24 -04:00
hyperv.c KVM: x86: hyper-v: Check if guest is allowed to use XMM registers for hypercall input 2021-08-03 06:16:40 -04:00
hyperv.h KVM: x86: hyper-v: Introduce KVM_CAP_HYPERV_ENFORCE_CPUID 2021-06-17 13:09:38 -04:00
i8254.c kvm: i8254: remove redundant assignment to pointer s 2020-06-11 12:35:18 -04:00
i8254.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
i8259.c KVM: x86: Refactor picdev_write() to prevent Spectre-v1/L1TF attacks 2020-01-27 19:59:37 +01:00
ioapic.c x86/kvm: fix vcpu-id indexed array sizes 2021-07-27 16:58:59 -04:00
ioapic.h x86/kvm: fix vcpu-id indexed array sizes 2021-07-27 16:58:59 -04:00
irq_comm.c x86: Fix various typos in comments 2021-03-18 15:31:53 +01:00
irq.c KVM: x86/xen: Add event channel interrupt vector upcall 2021-02-04 14:19:39 +00:00
irq.h kvm/x86: Remove redundant function implementations 2020-05-27 13:11:10 -04:00
Kconfig ARM: 2021-06-28 15:40:51 -07:00
kvm_cache_regs.h KVM: x86: Introduce KVM_GET_SREGS2 / KVM_SET_SREGS2 2021-06-17 13:09:47 -04:00
kvm_emulate.h KVM: x86: Drop "pre_" from enter/leave_smm() helpers 2021-06-17 13:09:35 -04:00
kvm_onhyperv.c KVM: x86: hyper-v: Move the remote TLB flush logic out of vmx 2021-06-17 13:09:36 -04:00
kvm_onhyperv.h KVM: x86: hyper-v: Move the remote TLB flush logic out of vmx 2021-06-17 13:09:36 -04:00
lapic.c KVM: LAPIC: Keep stored TMCCT register value 0 after KVM_SET_LAPIC 2021-06-17 14:26:11 -04:00
lapic.h KVM: x86: Add a return code to kvm_apic_accept_events 2021-06-17 13:09:31 -04:00
Makefile KVM: stats: Add fd-based API to read binary stats data 2021-06-24 11:47:57 -04:00
mmu.h KVM: x86/mmu: Get CR0.WP from MMU, not vCPU, in shadow page fault 2021-06-24 18:00:47 -04:00
mtrr.c KVM: x86: Add helper to consolidate "raw" reserved GPA mask calculations 2021-02-04 09:27:30 -05:00
pmu.c KVM: x86: use static calls to reduce kvm_x86_ops overhead 2021-02-04 05:27:30 -05:00
pmu.h x86: Fix various typos in comments 2021-03-18 15:31:53 +01:00
reverse_cpuid.h KVM: SEV: Mask CPUID[0x8000001F].eax according to supported features 2021-04-26 05:27:15 -04:00
trace.h KVM: x86: Introduce trace_kvm_hv_hypercall_done() 2021-08-03 06:16:40 -04:00
tss.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
x86.c KVM: x86: accept userspace interrupt only if no event is injected 2021-07-30 07:53:02 -04:00
x86.h KVM: x86/mmu: Use MMU role_regs to get LA57, and drop vCPU LA57 helper 2021-06-24 18:00:45 -04:00
xen.c KVM: x86: Rename GPR accessors to make mode-aware variants the defaults 2021-04-26 05:27:13 -04:00
xen.h KVM: x86/xen: Add support for vCPU runstate information 2021-03-02 14:30:54 -05:00