linux/Documentation
Lai Jiangshan b1bd5cba33 KVM: X86: MMU: Use the correct inherited permissions to get shadow page
When computing the access permissions of a shadow page, use the effective
permissions of the walk up to that point, i.e. the logical AND of its parents'
permissions.  Two guest PxE entries that point at the same table gfn need to
be shadowed with different shadow pages if their parents' permissions are
different.  KVM currently uses the effective permissions of the last
non-leaf entry for all non-leaf entries.  Because all non-leaf SPTEs have
full ("uwx") permissions, and the effective permissions are recorded only
in role.access and merged into the leaves, this can lead to incorrect
reuse of a shadow page and eventually to a missing guest protection page
fault.

For example, here is a shared pagetable:

   pgd[]   pud[]        pmd[]            virtual address pointers
                     /->pmd1(u--)->pte1(uw-)->page1 <- ptr1 (u--)
        /->pud1(uw-)--->pmd2(uw-)->pte2(uw-)->page2 <- ptr2 (uw-)
   pgd-|           (shared pmd[] as above)
        \->pud2(u--)--->pmd1(u--)->pte1(uw-)->page1 <- ptr3 (u--)
                     \->pmd2(uw-)->pte2(uw-)->page2 <- ptr4 (u--)

  pud1 and pud2 point to the same pmd table, so:
  - ptr1 and ptr3 point to the same page.
  - ptr2 and ptr4 point to the same page.

(pud1 and pud2 here are pud entries, while pmd1 and pmd2 here are pmd entries)
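
As an illustration only, the per-level effective permissions for the diagram
above can be sketched in Python.  This is a toy calculation, not KVM code: the
helper names (perms, fmt, effective_per_level) are made up for this sketch, and
the pgd entry is left out because the diagram gives no permissions for it.

```python
from itertools import accumulate

def perms(s):
    """Parse a string such as "uw-" into a set of permission flags."""
    return frozenset(s) - {"-"}

def fmt(p):
    """Render a set of flags back into the "uwx" notation used above."""
    return "".join(c if c in p else "-" for c in "uwx")

def effective_per_level(walk):
    """Effective permissions after each entry of a walk, outermost first.

    The value at each level is the logical AND (set intersection) of that
    entry's permissions with those of all its parents.
    """
    return [fmt(p) for p in accumulate(map(perms, walk), lambda a, b: a & b)]

# Walks taken from the diagram: pud entry, pmd entry, pte entry.
ptr2 = effective_per_level(["uw-", "uw-", "uw-"])  # pud1, pmd2, pte2
ptr4 = effective_per_level(["u--", "uw-", "uw-"])  # pud2, pmd2, pte2
print(ptr2)
print(ptr4)
```

pud1 and pud2 point at the same pmd table, but the first element of each
result differs ("uw-" versus "u--"), which is why the two entries need
different shadow pages.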

- First, the guest reads from ptr1 and KVM prepares a shadow
  page table with role.access=u--, from ptr1's pud1 and ptr1's pmd1.
  "u--" comes from the effective permissions of pgd, pud1 and
  pmd1, which are stored in pt->access.  "u--" is also used, instead
  of the correct "uw-", to get the shadow page for pud1.

- Then the guest writes to ptr2 and KVM reuses pud1, which is present.
  The hypervisor sets up a shadow page for ptr2 with pt->access "uw-",
  even though pud1's shadow pmd (because of the incorrect argument to
  kvm_mmu_get_page in the previous step) has role.access="u--".

- Then the guest reads from ptr3.  The hypervisor reuses pud1's
  shadow pmd for pud2, because both use "u--" for their permissions.
  Thus, the shadow pmd already includes entries for both pmd1 and pmd2.

- Finally, the guest writes to ptr4.  This causes no vmexit or page fault,
  because pud1's shadow page structures included a "uw-" page even though
  its role.access was "u--".

Any kind of shared pagetable can hit a similar problem in a virtual
machine without TDP enabled, if the permissions inherited from
different ancestors differ.

To fix the problem, change pt->access to be an array, so that no
access in it includes permissions ANDed from child ptes.
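
The four-step scenario and the effect of the fix can be replayed with a small
toy model.  This is only a sketch under simplifying assumptions, not the KVM
implementation: ToyShadowMMU, touch, run and the per_level_access flag are
hypothetical names, get_page only loosely mirrors kvm_mmu_get_page, shadow
pages are looked up by (guest table, role.access), and non-leaf shadow entries
carry no permissions of their own.

```python
FULL = frozenset("uwx")

def perms(s):
    """Parse a string such as "uw-" into a set of permission flags."""
    return frozenset(s) - {"-"}

def effective(entries):
    """Logical AND of the permissions of every entry given."""
    acc = FULL
    for p in entries:
        acc &= perms(p)
    return acc

class ToyShadowMMU:
    def __init__(self, per_level_access):
        # False: one access value (last non-leaf level) for all levels (old).
        # True:  one access value per level of the walk (fixed).
        self.per_level_access = per_level_access
        self.pages = {}  # (guest table, role.access) -> shadow page
        self.root = {}   # shadow page for the top-level table

    def get_page(self, table, access):
        """Reuse a shadow page with the same table and role.access, if any."""
        return self.pages.setdefault((table, access), {})

    def touch(self, walk, write):
        """Replay one guest access; return True if it succeeds.

        walk is [(entry name, permissions, table pointed to), ...],
        ending with the leaf pte.
        """
        nonleaf, (leaf, leaf_perms, _) = walk[:-1], walk[-1]
        pt_access = effective(p for _, p, _ in nonleaf)
        sp = self.root
        for i, (entry, _, table) in enumerate(nonleaf):
            if entry not in sp:  # present entries are reused without checks
                if self.per_level_access:
                    access = effective(p for _, p, _ in nonleaf[:i + 1])
                else:
                    access = pt_access
                sp[entry] = self.get_page(table, access)
            sp = sp[entry]
        if leaf not in sp:  # effective permissions are merged into the leaf
            sp[leaf] = pt_access & perms(leaf_perms)
        return (not write) or ("w" in sp[leaf])

def run(per_level_access):
    """Replay the four accesses from the example, in order."""
    mmu = ToyShadowMMU(per_level_access)
    ptr1 = [("pud1", "uw-", "pmd"), ("pmd1", "u--", "pt1"), ("pte1", "uw-", "page1")]
    ptr2 = [("pud1", "uw-", "pmd"), ("pmd2", "uw-", "pt2"), ("pte2", "uw-", "page2")]
    ptr3 = [("pud2", "u--", "pmd"), ("pmd1", "u--", "pt1"), ("pte1", "uw-", "page1")]
    ptr4 = [("pud2", "u--", "pmd"), ("pmd2", "uw-", "pt2"), ("pte2", "uw-", "page2")]
    return [mmu.touch(ptr1, write=False), mmu.touch(ptr2, write=True),
            mmu.touch(ptr3, write=False), mmu.touch(ptr4, write=True)]

print(run(per_level_access=False))  # old behaviour
print(run(per_level_access=True))   # per-level access array
```

In this model the old behaviour lets the final write to ptr4 succeed, because
pud2 reuses the shadow pmd that was created for pud1 with role.access "u--".
With a per-level access value, pud1 and pud2 get separate shadow pmd pages and
the write to ptr4 is refused.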

The test code is: https://lore.kernel.org/kvm/20210603050537.19605-1-jiangshanlai@gmail.com/
Remember to test it with TDP disabled.

The problem existed long before commit 41074d07c7 ("KVM: MMU:
Fix inherited permissions for emulated guest pte updates"), and it
is hard to tell which commit is the culprit.  So there is no fixes tag here.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Message-Id: <20210603052455.21023-1-jiangshanlai@gmail.com>
Cc: stable@vger.kernel.org
Fixes: cea0f0e7ea ("[PATCH] KVM: MMU: Shadow page table caching")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-08 12:29:53 -04:00
..
ABI Networking fixes for 5.13-rc1, including fixes from bpf, can 2021-05-08 08:31:46 -07:00
accounting
admin-guide Merge branch 'master' into next 2021-05-08 21:12:55 +10:00
arm It's been a relatively busy cycle in docsland, though more than usually 2021-04-26 13:22:43 -07:00
arm64 Assorted arm64 fixes and clean-ups, the most important: 2021-05-07 12:11:05 -07:00
block
bpf bpf: Document the pahole release info related to libbpf in bpf_devel_QA.rst 2021-04-23 17:11:58 -07:00
cdrom
core-api A few late-arriving documentation fixes, including some oprofile cleanup, a 2021-05-06 08:33:54 -07:00
cpu-freq
crypto
dev-tools scripts/gdb: add lx_current support for arm64 2021-05-07 00:26:33 -07:00
devicetree Kbuild updates for v5.13 (2nd) 2021-05-08 10:00:11 -07:00
doc-guide
driver-api VFIO updates for v5.13-rc1 pt2 2021-05-06 14:22:58 -07:00
fault-injection
fb Documentation: Add leading slash to some paths 2021-03-31 13:49:19 -06:00
features powerpc updates for 5.13 2021-04-30 12:22:28 -07:00
filesystems f2fs-for-5.13-rc1 2021-05-04 18:03:38 -07:00
firmware_class
firmware-guide Documentation: firmware-guide: gpio-properties: Add note to SPI CS case 2021-04-28 19:11:13 +02:00
fpga Documentation: fpga: dfl: Add description for DFL UIO support 2021-03-28 14:58:18 +02:00
gpu drm-misc-next for 5.13: 2021-04-07 17:32:12 +10:00
hid Documentation: Add leading slash to some paths 2021-03-31 13:49:19 -06:00
hwmon hwmon: Remove amd_energy driver 2021-04-20 06:52:08 -07:00
i2c
ia64
ide
iio iio: hrtimer: Allow sub Hz granularity 2021-03-25 19:13:49 +00:00
infiniband
input Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input 2021-05-06 23:37:55 -07:00
isdn
kbuild Kconfig updates for v5.13 2021-04-29 14:32:00 -07:00
kernel-hacking
leds Documentation: Add leading slash to some paths 2021-03-31 13:49:19 -06:00
litmus-tests
livepatch
locking
m68k
maintainer media: add a subsystem profile documentation 2021-03-22 08:56:42 +01:00
mhi
mips
misc-devices dw-xdata-pcie: Update outdated info and improve text format 2021-04-14 19:47:28 +02:00
netlabel
networking Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next 2021-04-28 11:59:31 -07:00
nios2
nvdimm
openrisc
parisc
PCI
pcmcia
power power supply and reset changes for the v5.13 series 2021-04-28 15:43:58 -07:00
powerpc powerpc updates for 5.13 2021-04-30 12:22:28 -07:00
process A few late-arriving documentation fixes, including some oprofile cleanup, a 2021-05-06 08:33:54 -07:00
RCU
riscv Documentation: riscv: Add documentation that describes the VM layout 2021-04-26 08:25:05 -07:00
s390 s390/pci: expose UID uniqueness guarantee 2021-04-05 11:30:57 +02:00
scheduler sched,doc: sched_debug_verbose cmdline should be sched_verbose 2021-05-06 15:33:26 +02:00
scsi for-5.13/block-2021-04-27 2021-04-28 14:27:12 -07:00
security Add Landlock, a new LSM from Mickaël Salaün <mic@linux.microsoft.com> 2021-05-01 18:50:44 -07:00
sh
sound
sparc
sphinx
sphinx-static
spi spi: Updates for v5.13 2021-04-26 16:32:11 -07:00
staging
target
timers
trace Documentation: trace: Add documentation for TRBE 2021-04-06 16:05:38 -06:00
translations A few late-arriving documentation fixes, including some oprofile cleanup, a 2021-05-06 08:33:54 -07:00
usb docs: usbip: Fix major fields and descriptions in protocol 2021-04-09 16:04:45 +02:00
userspace-api Add Landlock, a new LSM from Mickaël Salaün <mic@linux.microsoft.com> 2021-05-01 18:50:44 -07:00
virt KVM: X86: MMU: Use the correct inherited permissions to get shadow page 2021-06-08 12:29:53 -04:00
vm mm: gup: remove FOLL_SPLIT 2021-04-30 11:20:37 -07:00
w1
watchdog
x86 A few late-arriving documentation fixes, including some oprofile cleanup, a 2021-05-06 08:33:54 -07:00
xtensa
.gitignore
arch.rst
asm-annotations.rst
atomic_bitops.txt
atomic_t.txt
Changes
CodingStyle
conf.py
COPYING-logo
docutils.conf
dontdiff kbuild: generate Module.symvers only when vmlinux exists 2021-04-25 05:17:02 +09:00
index.rst
Kconfig
logo.gif
Makefile
memory-barriers.txt
SubmittingPatches
watch_queue.rst