linux/arch/x86/kernel
Dave Hansen 62b5f7d013 mm/core, x86/mm/pkeys: Add execute-only protection keys support
Protection keys provide new page-based protection in hardware.
But, they have an interesting attribute: they only affect data
accesses and never affect instruction fetches.  That means that
if we set up some memory which is set as "access-disabled" via
protection keys, we can still execute from it.

This patch uses protection keys to set up mappings to do just that.
If a user calls:

	mmap(..., PROT_EXEC);
or
	mprotect(ptr, sz, PROT_EXEC);

(note PROT_EXEC-only without PROT_READ/WRITE), the kernel will
notice this, and set a special protection key on the memory.  It
also sets the appropriate bits in the Protection Keys User Rights
(PKRU) register so that the memory becomes unreadable and
unwritable.

I haven't found any userspace that does this today.  With this
facility in place, we expect userspace to move to use it
eventually.  Userspace _could_ start doing this today.  Any
PROT_EXEC calls get converted to PROT_READ inside the kernel, and
would transparently be upgraded to "true" PROT_EXEC with this
code.  IOW, userspace never has to do any PROT_EXEC runtime
detection.

This feature provides enhanced protection against leaking
executable memory contents.  This helps thwart attacks which are
attempting to find ROP gadgets on the fly.

But, the security provided by this approach is not comprehensive.
The PKRU register which controls access permissions is a normal
user register writable from unprivileged userspace.  An attacker
who can execute the 'wrpkru' instruction can easily disable the
protection provided by this feature.

The protection key that is used for execute-only support is
permanently dedicated at compile time.  This is fine for now
because there is currently no API to set a protection key other
than this one.

Despite there being a constant PKRU value across the entire
system, we do not set it unless this feature is in use in a
process.  That is to preserve the PKRU XSAVE 'init state',
which can lead to faster context switches.

PKRU *is* a user register and the kernel is modifying it.  That
means that code doing:

	pkru = rdpkru()
	pkru |= 0x100;
	mmap(..., PROT_EXEC);
	wrpkru(pkru);

could lose the bits in PKRU that enforce execute-only
permissions.  To avoid this, we suggest avoiding ever calling
mmap() or mprotect() when the PKRU value is expected to be
unstable.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chen Gang <gang.chen.5i5j@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Dave Hansen <dave@sr71.net>
Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Piotr Kwapulinski <kwapulinski.piotr@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Vladimir Murzin <vladimir.murzin@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: keescook@google.com
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/20160212210240.CB4BB5CA@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-18 19:46:33 +01:00
..
acpi x86, ACPI: Handle apic/x2apic entries in MADT in correct order 2015-10-15 01:31:06 +02:00
apic Merge branches 'x86/fpu', 'x86/mm' and 'x86/asm' into x86/pkeys 2016-02-16 09:37:37 +01:00
cpu x86/mm/pkeys: Actually enable Memory Protection Keys in the CPU 2016-02-18 19:46:30 +01:00
fpu mm/core, x86/mm/pkeys: Add execute-only protection keys support 2016-02-18 19:46:33 +01:00
kprobes Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-04-14 14:37:47 -07:00
.gitignore
alternative.c x86/alternatives: Make optimize_nops() interrupt safe and synced 2015-09-03 21:27:47 +02:00
amd_gart_64.c
amd_nb.c x86/gart: Check for GART support before accessing GART registers 2015-05-06 11:15:53 +02:00
apb_timer.c x86/asm/tsc: Rename native_read_tsc() to rdtsc() 2015-07-06 15:23:28 +02:00
aperture_64.c x86/gart: Check for GART support before accessing GART registers 2015-05-06 11:15:53 +02:00
apm_32.c apm32: Fix cputime == jiffies assumption 2015-07-29 15:44:58 +02:00
asm-offsets_32.c x86/syscalls: Add syscall entry qualifiers 2016-01-29 09:46:38 +01:00
asm-offsets_64.c x86/syscalls: Add syscall entry qualifiers 2016-01-29 09:46:38 +01:00
asm-offsets.c x86/paravirt: Remove the unused irq_enable_sysexit pv op 2015-11-23 10:48:16 +01:00
audit_64.c
bootflag.c x86: don't use module_init for non-modular core bootflag code 2015-06-16 14:12:34 -04:00
check.c Linux 4.2-rc8 2015-08-25 09:59:19 +02:00
cpuid.c new helpers: no_seek_end_llseek{,_size}() 2015-12-23 10:41:31 -05:00
crash_dump_32.c
crash_dump_64.c
crash.c perf, x86: Stop Intel PT before kdump starts 2015-11-23 09:58:26 +01:00
devicetree.c Replace module_init with equivalent device_initcall in non modules. 2015-07-02 10:30:48 -07:00
doublefault.c
dumpstack_32.c Merge branch 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-04-13 13:23:34 -07:00
dumpstack_64.c x86/kernel: Use kstack_end() in dumpstack_64.c 2015-02-23 18:37:13 +01:00
dumpstack.c Merge branch 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-04-13 13:23:34 -07:00
e820.c x86/cpufeature: Carve out X86_FEATURE_* 2016-01-30 11:22:17 +01:00
early_printk.c x86/early_printk: Set __iomem address space for IO 2015-10-13 21:45:56 +02:00
early-quirks.c Linux 4.4-rc2 2015-11-23 09:04:05 +01:00
espfix_64.c Merge branch 'x86/urgent' into x86/asm, before applying dependent patches 2015-07-31 10:23:35 +02:00
ftrace.c x86: ftrace: Fix the comments for ftrace_modify_code_direct() 2016-01-04 18:06:38 -05:00
head32.c x86: Store a per-cpu shadow copy of CR4 2015-02-04 12:10:42 +01:00
head64.c x86/platform/intel-mid: Enable 64-bit build 2016-01-19 08:39:56 +01:00
head_32.S Merge branches 'x86/fpu', 'x86/mm' and 'x86/asm' into x86/pkeys 2016-02-16 09:37:37 +01:00
head_64.S x86/asm: Remove unused L3_PAGE_OFFSET 2016-01-27 11:37:49 +01:00
head.c
hpet.c x86/cpufeature: Carve out X86_FEATURE_* 2016-01-30 11:22:17 +01:00
hw_breakpoint.c x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros 2015-12-19 11:49:55 +01:00
i386_ksyms_32.c preempt: Use preempt_schedule_context() as the official tracing preemption point 2015-06-07 15:57:42 +02:00
i8237.c
i8253.c clockevents/drivers/i8253: Migrate to new 'set-state' interface 2015-08-10 11:40:30 +02:00
i8259.c x86/irq: Probe for PIC presence before allocating descs for legacy IRQs 2015-11-07 10:37:37 +01:00
io_delay.c
ioport.c x86/asm/entry: Rename 'init_tss' to 'cpu_tss' 2015-03-06 08:32:58 +01:00
irq_32.c genirq: Remove irq argument from irq flow handlers 2015-09-16 15:47:51 +02:00
irq_64.c x86/irq: Drop unlikely before IS_ERR_OR_NULL 2015-10-01 11:08:56 +02:00
irq_work.c treewide: Remove old email address 2015-11-23 09:44:58 +01:00
irq.c x86/irq: Call irq_force_move_complete with irq descriptor 2016-01-15 13:44:01 +01:00
irqinit.c x86/irq: Store irq descriptor in vector array 2015-08-06 00:14:59 +02:00
jump_label.c jump_label: Rename JUMP_LABEL_{EN,DIS}ABLE to JUMP_LABEL_{JMP,NOP} 2015-08-03 11:34:12 +02:00
kdebugfs.c
kexec-bzimage64.c Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2015-09-08 12:41:25 -07:00
kgdb.c x86/kgdb: Replace bool_int_array[NR_CPUS] with bitmap 2015-09-28 10:13:31 +02:00
ksysfs.c
kvm.c The bulk of the changes here is for x86. And for once it's not 2015-06-24 09:36:49 -07:00
kvmclock.c x86/vdso: Remove pvclock fixmap machinery 2015-12-11 08:56:03 +01:00
ldt.c x86/mm: Factor out LDT init from context init 2016-02-18 19:46:31 +01:00
livepatch.c livepatch: Cleanup module page permission changes 2015-12-04 22:51:07 +01:00
machine_kexec_32.c
machine_kexec_64.c kexec: move some memembers and definitions within the scope of CONFIG_KEXEC_FILE 2016-01-20 17:09:18 -08:00
Makefile kexec: split kexec_load syscall from kexec core code 2015-09-10 13:29:01 -07:00
mcount_64.S x86/ftrace: Add comment on static function tracing 2015-11-19 11:07:49 +01:00
mmconf-fam10h_64.c
module.c x86/mm/KASLR: Propagate KASLR status to kernel proper 2015-04-03 15:26:15 +02:00
mpparse.c x86: Cleanup irq_domain ops 2015-04-24 15:36:55 +02:00
msr.c x86/cpufeature: Carve out X86_FEATURE_* 2016-01-30 11:22:17 +01:00
nmi_selftest.c
nmi.c x86/nmi: Save regs in crash dump on external NMI 2015-12-19 11:07:01 +01:00
paravirt_patch_32.c x86/paravirt: Remove the unused irq_enable_sysexit pv op 2015-11-23 10:48:16 +01:00
paravirt_patch_64.c x86/entry, x86/paravirt: Remove the unused usergs_sysret32 PV op 2015-11-23 10:48:16 +01:00
paravirt-spinlocks.c locking/pvqspinlock: Rename QUEUED_SPINLOCK to QUEUED_SPINLOCKS 2015-05-11 09:52:09 +02:00
paravirt.c Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-01-11 16:26:03 -08:00
pci-calgary_64.c x86/platform/calgary: Constify cal_chipset_ops structures 2015-11-29 08:50:58 +01:00
pci-dma.c mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd 2015-11-06 17:50:42 -08:00
pci-iommu_table.c
pci-nommu.c
pci-swiotlb.c x86/mm/64: Enable SWIOTLB if system has SRAT memory regions above MAX_DMA32_PFN 2015-12-06 12:46:31 +01:00
pcspeaker.c
perf_regs.c perf/x86/64: Report regs_user->ax too in get_regs_user() 2015-04-11 13:08:53 +02:00
pmem.c libnvdimm, e820: skip module loading when no type-12 2015-11-30 09:10:33 -08:00
probe_roms.c
process_32.c sched/core, sched/x86: Kill thread_info::saved_preempt_count 2015-10-06 17:08:18 +02:00
process_64.c x86/mm/pkeys: Dump PKRU with other kernel registers 2016-02-18 19:46:29 +01:00
process.c x86/vm86: Set thread.vm86 to NULL on fork/clone 2015-10-31 09:50:25 +01:00
ptrace.c arch/x86/kernel/ptrace.c: Remove unused arg_offs_table 2015-12-29 11:35:34 +01:00
pvclock.c x86/vdso: Remove pvclock fixmap machinery 2015-12-11 08:56:03 +01:00
quirks.c timers/x86/hpet: Type adjustments 2015-10-21 11:17:32 +02:00
reboot_fixups_32.c
reboot.c x86/reboot/quirks: Add iMac10,1 to pci_reboot_dmi_table[] 2016-01-12 12:27:36 +01:00
relocate_kernel_32.S x86/asm: Optimize unnecessarily wide TEST instructions 2015-03-07 11:12:43 +01:00
relocate_kernel_64.S x86/asm: Replace "MOVQ $imm, %reg" with MOVL 2015-04-01 13:17:39 +02:00
resource.c
rtc.c x86/paravirt: Prevent rtc_cmos platform device init on PV guests 2015-12-19 21:35:13 +01:00
setup_percpu.c
setup.c x86/mm/pkeys: Dump pkey from VMA in /proc/pid/smaps 2016-02-18 19:46:29 +01:00
signal_compat.c x86/compat: Move copy_siginfo_*_user32() to signal_compat.c 2015-07-06 15:28:55 +02:00
signal.c x86/signal: Cleanup get_nr_restart_syscall() 2016-01-19 12:55:47 +01:00
smp.c x86/smp: Remove single IPI wrapper 2015-11-05 13:07:54 +01:00
smpboot.c x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros 2015-12-19 11:49:55 +01:00
stacktrace.c
step.c Merge branch 'x86/urgent' into x86/asm to fix up conflicts and to pick up fixes 2015-08-18 09:39:47 +02:00
sys_x86_64.c x86/mm: Improve AMD Bulldozer ASLR workaround 2015-03-31 10:01:17 +02:00
sysfb_efi.c
sysfb_simplefb.c
sysfb.c
tboot.c
tce_64.c
test_nx.c
test_rodata.c treewide: Fix typo in printk messages 2015-03-06 23:05:39 +01:00
time.c x86/asm/entry: Change all 'user_mode_vm()' calls to 'user_mode()' 2015-03-23 11:14:17 +01:00
tls.c
tls.h
topology.c x86: Drop bogus __ref / __refdata annotations 2015-07-20 18:57:20 +02:00
trace_clock.c x86/asm/tsc: Add rdtsc_ordered() and use it in trivial call sites 2015-07-06 15:23:29 +02:00
tracepoint.c
traps.c Merge branches 'x86/fpu', 'x86/mm' and 'x86/asm' into x86/pkeys 2016-02-16 09:37:37 +01:00
tsc_msr.c
tsc_sync.c x86/asm/tsc/sync: Use rdtsc_ordered() in check_tsc_warp() and drop extra barriers 2015-07-06 15:23:29 +02:00
tsc.c x86/tsc: Remove unused tsc_pre_init() hook 2015-11-19 11:03:13 +01:00
uprobes.c uprobes/x86: Make arch_uretprobe_is_alive(RP_CHECK_CALL) more clever 2015-07-31 10:38:06 +02:00
verify_cpu.S x86/cpufeature: Carve out X86_FEATURE_* 2016-01-30 11:22:17 +01:00
vm86_32.c x86/cpufeature: Replace the old static_cpu_has() with safe variant 2016-01-30 11:22:18 +01:00
vmlinux.lds.S x86/alternatives: Add an auxilary section 2016-01-30 11:22:20 +01:00
vsmp_64.c x86: replace __init_or_module with __init in non-modular vsmp_64.c 2015-06-16 14:12:41 -04:00
x86_init.c x86/tsc: Remove unused tsc_pre_init() hook 2015-11-19 11:03:13 +01:00
x8664_ksyms_64.c preempt: Use preempt_schedule_context() as the official tracing preemption point 2015-06-07 15:57:42 +02:00