linux/arch/x86
Rick Edgecombe fca4d413c5 x86/mm: Introduce _PAGE_SAVED_DIRTY
Some OSes have a greater dependence on software available bits in PTEs than
Linux. That left the hardware architects looking for a way to represent a
new memory type (shadow stack) within the existing bits. They chose to
repurpose a lightly-used state: Write=0,Dirty=1. So in order to support
shadow stack memory, Linux should avoid creating memory with this PTE bit
combination unless it intends for it to be shadow stack.

The reason it's lightly used is that Dirty=1 is normally set by HW
_before_ a write. A write with a Write=0 PTE would typically only generate
a fault, not set Dirty=1. Hardware can (rarely) both set Dirty=1 *and*
generate the fault, resulting in a Write=0,Dirty=1 PTE. Hardware which
supports shadow stacks will no longer exhibit this oddity.

So that leaves Write=0,Dirty=1 PTEs created in software. To avoid
inadvertently created shadow stack memory, in places where Linux normally
creates Write=0,Dirty=1, it can use the software-defined _PAGE_SAVED_DIRTY
in place of the hardware _PAGE_DIRTY. In other words, whenever Linux needs
to create Write=0,Dirty=1, it instead creates Write=0,SavedDirty=1 except
for shadow stack, which is Write=0,Dirty=1.

There are six bits left available to software in the 64-bit PTE after
consuming a bit for _PAGE_SAVED_DIRTY. For 32 bit, the same bit as
_PAGE_BIT_UFFD_WP is used, since user fault fd is not supported on 32
bit. This leaves one unused software bit on 32 bit (_PAGE_BIT_SOFT_DIRTY,
as this is also not supported on 32 bit).

Implement only the infrastructure for _PAGE_SAVED_DIRTY. Changes to
actually begin creating _PAGE_SAVED_DIRTY PTEs will follow once other
pieces are in place.

Since this SavedDirty shifting is done for all x86 CPUs, this leaves
the possibility for the hardware oddity to still create Write=0,Dirty=1
PTEs in rare cases. Since these CPUs also don't support shadow stack, this
will be harmless as it was before the introduction of SavedDirty.

Implement the shifting logic to be branchless. Embed the logic of whether
to do the shifting (including checking the Write bits) so that it can be
called by future callers that would otherwise need additional branching
logic. This efficiency allows the logic of when to do the shifting to be
centralized, making the code easier to reason about.

Co-developed-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Tested-by: Pengfei Xu <pengfei.xu@intel.com>
Tested-by: John Allen <john.allen@amd.com>
Tested-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/all/20230613001108.3040476-11-rick.p.edgecombe%40intel.com
2023-07-11 14:12:19 -07:00
..
boot - Fix a race window where load_unaligned_zeropad() could cause 2023-06-26 16:32:47 -07:00
coco - Some SEV and CC platform helpers cleanup and simplifications now that 2023-06-27 13:26:30 -07:00
configs x86/defconfig: Enable CONFIG_DEBUG_WX=y 2022-09-02 10:41:42 +02:00
crypto This push fixes an alignment crash in x86/aria. 2023-05-29 07:05:49 -04:00
entry - Yosry Ahmed brought back some cgroup v1 stats in OOM logs. 2023-06-28 10:28:11 -07:00
events Perf events changes for v6.5: 2023-06-27 14:43:02 -07:00
hyperv - Some SEV and CC platform helpers cleanup and simplifications now that 2023-06-27 13:26:30 -07:00
ia32 x86/signal/32: Merge native and compat 32-bit signal code 2022-10-19 09:58:49 +02:00
include x86/mm: Introduce _PAGE_SAVED_DIRTY 2023-07-11 14:12:19 -07:00
kernel x86/cpufeatures: Add CPU feature flags for shadow stacks 2023-07-11 14:12:18 -07:00
kvm ARM64: 2023-07-03 15:32:22 -07:00
lib Locking changes for v6.5: 2023-06-27 14:14:30 -07:00
math-emu x86/fpu: Include asm/fpu/regset.h 2023-05-18 11:56:18 -07:00
mm Merge branch 'expand-stack' 2023-06-28 20:35:21 -07:00
net bpf: Fix a bpf_jit_dump issue for x86_64 with sysctl bpf_jit_enable. 2023-06-12 16:47:18 +02:00
pci - Address -Wmissing-prototype warnings 2023-06-26 16:43:54 -07:00
platform A single regression fix for x86: 2023-07-01 11:40:01 -07:00
power x86/topology: Remove CPU0 hotplug option 2023-05-15 13:44:49 +02:00
purgatory hardening updates for v6.5-rc1 2023-06-27 21:24:18 -07:00
ras
realmode x86/realmode: Make stack lock work in trampoline_compat() 2023-05-30 14:11:47 +02:00
tools ELF: fix all "Elf" typos 2023-04-08 13:45:37 -07:00
um um: make stub data pages size tweakable 2023-04-20 23:08:43 +02:00
video drm changes for 6.5-rc1: 2023-06-29 11:00:17 -07:00
virt/vmx/tdx
xen mm: Move pte/pmd_mkwrite() callers with no VMA to _novma() 2023-07-11 14:10:57 -07:00
.gitignore
Kbuild
Kconfig x86/shstk: Add Kconfig option for shadow stack 2023-07-11 14:12:18 -07:00
Kconfig.assembler x86/shstk: Add Kconfig option for shadow stack 2023-07-11 14:12:18 -07:00
Kconfig.cpu x86/cpu: Remove X86_FEATURE_NAMES 2023-05-15 20:03:08 +02:00
Kconfig.debug docs: move x86 documentation into Documentation/arch/ 2023-03-30 12:58:51 -06:00
Makefile x86/unwind/orc: Add ELF section with ORC version identifier 2023-06-16 17:17:42 +02:00
Makefile_32.cpu
Makefile.postlink x86/build: Avoid relocation information in final vmlinux 2023-06-14 19:54:40 +02:00
Makefile.um um: Only disable SSE on clang to work around old GCC bugs 2023-04-04 09:57:05 +02:00