linux/mm
Pasha Tatashin 1eba86c096 mm: change page type prior to adding page table entry
Patch series "page table check", v3.

Ensure that some memory corruptions are prevented by checking at the
time of insertion of entries into user page tables that there is no
illegal sharing.

We have recently found a problem [1] that existed in kernel since 4.14.
The problem was caused by broken page ref count and led to memory
leaking from one process into another.  The problem was accidentally
detected by studying a dump of one process and noticing that one page
contains memory that should not belong to this process.

There are some other page->_refcount related problems that were recently
fixed: [2], [3] which potentially could also lead to illegal sharing.

In addition to hardening refcount [4] itself, this work is an attempt to
prevent this class of memory corruption issues.

It uses a simple state machine that is independent from regular MM logic
to check for illegal sharing at time pages are inserted and removed from
page tables.

[1] https://lore.kernel.org/all/xr9335nxwc5y.fsf@gthelen2.svl.corp.google.com
[2] https://lore.kernel.org/all/1582661774-30925-2-git-send-email-akaher@vmware.com
[3] https://lore.kernel.org/all/20210622021423.154662-3-mike.kravetz@oracle.com
[4] https://lore.kernel.org/all/20211221150140.988298-1-pasha.tatashin@soleen.com

This patch (of 4):

There are a few places where we first update the entry in the user page
table, and later change the struct page to indicate that this is
anonymous or file page.

In most places, however, we first configure the page metadata and then
insert entries into the page table.  Page table check, will use the
information from struct page to verify the type of entry is inserted.

Change the order in all places to first update struct page, and later to
update page table.

This means that we first do calls that may change the type of page (anon
or file):

	page_move_anon_rmap
	page_add_anon_rmap
	do_page_add_anon_rmap
	page_add_new_anon_rmap
	page_add_file_rmap
	hugepage_add_anon_rmap
	hugepage_add_new_anon_rmap

And after that do calls that add entries to the page table:

	set_huge_pte_at
	set_pte_at

Link: https://lkml.kernel.org/r/20211221154650.1047963-1-pasha.tatashin@soleen.com
Link: https://lkml.kernel.org/r/20211221154650.1047963-2-pasha.tatashin@soleen.com
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Paul Turner <pjt@google.com>
Cc: Wei Xu <weixugc@google.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Jiri Slaby <jirislaby@kernel.org>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-15 16:30:28 +02:00
..
damon mm/damon/dbgfs: fix 'struct pid' leaks in 'dbgfs_target_ids_write()' 2021-12-31 09:20:12 -08:00
kasan kasan: fix quarantine conflicting with init_on_free 2022-01-15 16:30:26 +02:00
kfence kfence: fix memory leak when cat kfence objects 2021-12-25 12:20:55 -08:00
backing-dev.c mm: bdi: initialize bdi_min_ratio when bdi is unregistered 2021-12-10 17:10:56 -08:00
balloon_compaction.c mm: fix typos in comments 2021-05-07 00:26:35 -07:00
bootmem_info.c mm/bootmem_info.c: mark __init on register_page_bootmem_info_section 2021-09-03 09:58:14 -07:00
cleancache.c
cma_debug.c mm/cma: change cma mutex to irq safe spinlock 2021-05-05 11:27:21 -07:00
cma_sysfs.c mm: cma: support sysfs 2021-05-05 11:27:24 -07:00
cma.c memblock: rename memblock_free to memblock_phys_free 2021-11-06 13:30:41 -07:00
cma.h mm: cma: support sysfs 2021-05-05 11:27:24 -07:00
compaction.c Merge branch 'akpm' (patches from Andrew) 2021-11-06 14:08:17 -07:00
debug_page_ref.c
debug_vm_pgtable.c mm/debug_vm_pgtable: update comments regarding migration swap entries 2022-01-15 16:30:26 +02:00
debug.c mm,fs: split dump_mapping() out from dump_page() 2022-01-15 16:30:26 +02:00
dmapool.c mm/dmapool: use DEVICE_ATTR_RO macro 2021-06-29 10:53:52 -07:00
early_ioremap.c mm/early_ioremap.c: remove redundant early_ioremap_shutdown() 2021-09-08 11:50:24 -07:00
fadvise.c
failslab.c
filemap.c filemap: remove PageHWPoison check from next_uptodate_page() 2021-12-10 17:10:55 -08:00
folio-compat.c mm/filemap: Add FGP_STABLE 2021-10-18 07:49:41 -04:00
frontswap.c mm/frontswap.c: use non-atomic '__set_bit()' when possible 2022-01-15 16:30:26 +02:00
gup_test.c selftests/vm: gup_test: test faulting in kernel, and verify pinnable pages 2021-05-05 11:27:26 -07:00
gup_test.h selftests/vm: gup_test: fix test flag 2021-05-05 11:27:26 -07:00
gup.c mm/gup.c: stricter check on THP migration entry during follow_pmd_mask 2022-01-15 16:30:26 +02:00
highmem.c Fixes for 5.16 folios: 2021-11-25 10:13:56 -08:00
hmm.c mm/hmm: bypass devmap pte when all pfn requested flags are fulfilled 2021-09-08 18:45:52 -07:00
huge_memory.c Memory folios 2021-11-01 08:47:59 -07:00
hugetlb_cgroup.c hugetlb_cgroup: remove unused hugetlb_cgroup_from_counter macro 2021-11-06 13:30:39 -07:00
hugetlb_vmemmap.c mm: hugetlb: introduce CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON 2021-06-30 20:47:26 -07:00
hugetlb_vmemmap.h mm: hugetlb: introduce nr_free_vmemmap_pages in the struct hstate 2021-06-30 20:47:25 -07:00
hugetlb.c mm: change page type prior to adding page table entry 2022-01-15 16:30:28 +02:00
hwpoison-inject.c mm: hwpoison: don't drop slab caches for offlining non-LRU page 2021-09-03 09:58:15 -07:00
init-mm.c mm: add setup_initial_init_mm() helper 2021-07-08 11:48:21 -07:00
internal.h mm: memcontrol: make cgroup_memory_nokmem static 2022-01-15 16:30:27 +02:00
interval_tree.c mm/interval_tree: add comments to improve code readability 2021-04-30 11:20:38 -07:00
io-mapping.c mm: add a io_mapping_map_user helper 2021-04-30 11:20:39 -07:00
ioremap.c mm: move ioremap_page_range to vmalloc.c 2021-09-08 11:50:24 -07:00
Kconfig mm: add a field to store names for private anonymous memory 2022-01-15 16:30:27 +02:00
Kconfig.debug
khugepaged.c Merge branch 'akpm' (patches from Andrew) 2021-11-06 14:08:17 -07:00
kmemleak.c kmemleak: fix kmemleak false positive report with HW tag-based kasan enable 2022-01-15 16:30:25 +02:00
ksm.c mm: move tlb_flush_pending inline helpers to mm_inline.h 2022-01-15 16:30:27 +02:00
list_lru.c mm: list_lru: only add memcg-aware lrus to the global lru list 2021-11-06 13:30:35 -07:00
maccess.c ARM: 9115/1: mm/maccess: fix unaligned copy_{from,to}_kernel_nofault 2021-08-20 11:39:25 +01:00
madvise.c mm: move anon_vma declarations to linux/mm_inline.h 2022-01-15 16:30:27 +02:00
Makefile mm/util: Add folio_mapping() and folio_file_mapping() 2021-09-27 09:27:30 -04:00
mapping_dirty_helpers.c mm: move tlb_flush_pending inline helpers to mm_inline.h 2022-01-15 16:30:27 +02:00
memblock.c arm64 fixes for -rc1 2021-11-10 11:29:30 -08:00
memcontrol.c memcg: add per-memcg vmalloc stat 2022-01-15 16:30:27 +02:00
memfd.c mm,hugetlb: remove mlock ulimit for SHM_HUGETLB 2021-11-09 10:02:48 -08:00
memory_hotplug.c treewide: Add missing includes masked by cgroup -> bpf dependency 2021-12-03 10:58:13 -08:00
memory-failure.c mm: shmem: don't truncate page if memory failure happens 2022-01-15 16:30:26 +02:00
memory.c mm: change page type prior to adding page table entry 2022-01-15 16:30:28 +02:00
mempolicy.c mm: add a field to store names for private anonymous memory 2022-01-15 16:30:27 +02:00
mempool.c mm: remove spurious blkdev.h includes 2021-10-18 06:17:01 -06:00
memremap.c mm/memremap: add ZONE_DEVICE support for compound pages 2022-01-15 16:30:25 +02:00
memtest.c
migrate.c mm: change page type prior to adding page table entry 2022-01-15 16:30:28 +02:00
mincore.c
mlock.c mm: add a field to store names for private anonymous memory 2022-01-15 16:30:27 +02:00
mm_init.c include/linux/page-flags-layout.h: cleanups 2021-04-30 11:20:42 -07:00
mmap_lock.c mm: mmap_lock: fix disabling preemption directly 2021-07-23 17:43:28 -07:00
mmap.c mm: protect free_pgtables with mmap_lock write lock in exit_mmap 2022-01-15 16:30:27 +02:00
mmu_gather.c mm: move tlb_flush_pending inline helpers to mm_inline.h 2022-01-15 16:30:27 +02:00
mmu_notifier.c mm/mmu_notifiers: ensure range_end() is paired with range_start() 2021-03-25 09:22:55 -07:00
mmzone.c
mprotect.c mm: add a field to store names for private anonymous memory 2022-01-15 16:30:27 +02:00
mremap.c mm, hugepages: add mremap() support for hugepage backed vma 2021-11-06 13:30:39 -07:00
msync.c mm/msync: exit early when the flags is an MS_ASYNC and start < vm_start 2021-04-30 11:20:37 -07:00
nommu.c Merge branch 'akpm' (patches from Andrew) 2021-11-06 14:08:17 -07:00
oom_kill.c mm/oom_kill: allow process_mrelease to run under mmap_lock protection 2022-01-15 16:30:28 +02:00
page_alloc.c mm/memremap: add ZONE_DEVICE support for compound pages 2022-01-15 16:30:25 +02:00
page_counter.c mm/page_counter: remove an incorrect call to propagate_protected_usage() 2022-01-15 16:30:27 +02:00
page_ext.c mm/page_ext.c: fix a comment 2021-11-06 13:30:34 -07:00
page_idle.c mm/idle_page_tracking: make PG_idle reusable 2021-09-08 11:50:24 -07:00
page_io.c for-5.16/block-2021-10-29 2021-11-01 09:19:50 -07:00
page_isolation.c mm/page_isolation: guard against possible putback unisolated page 2021-11-06 13:30:40 -07:00
page_owner.c mm/page_owner.c: modify the type of argument "order" in some functions 2021-11-11 09:34:35 -08:00
page_poison.c mm: page_poison: print page info when corruption is caught 2021-04-30 11:20:36 -07:00
page_reporting.c mm/page_reporting: allow driver to specify reporting order 2021-06-29 10:53:47 -07:00
page_reporting.h mm/page_reporting: export reporting order as module parameter 2021-06-29 10:53:47 -07:00
page_vma_mapped.c mm: device exclusive memory access 2021-07-01 11:06:03 -07:00
page-writeback.c folio: Add a function to get the host inode for a folio 2021-11-10 21:16:52 +00:00
pagewalk.c mm: pagewalk: fix walk for hugepage tables 2021-06-29 10:53:49 -07:00
percpu-internal.h Merge branch 'for-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu 2021-07-01 17:17:24 -07:00
percpu-km.c percpu: flush tlb in pcpu_reclaim_populated() 2021-07-04 18:30:17 +00:00
percpu-stats.c percpu: rework memcg accounting 2021-06-05 20:43:15 +00:00
percpu-vm.c percpu: flush tlb in pcpu_reclaim_populated() 2021-07-04 18:30:17 +00:00
percpu.c memblock: use memblock_free for freeing virtual pointers 2021-11-06 13:30:41 -07:00
pgalloc-track.h mm: fix typos in comments 2021-05-07 00:26:35 -07:00
pgtable-generic.c mm: move tlb_flush_pending inline helpers to mm_inline.h 2022-01-15 16:30:27 +02:00
process_vm_access.c mm/process_vm_access.c: remove duplicate include 2021-05-05 11:27:27 -07:00
ptdump.c mm: ptdump: fix build failure 2021-04-16 16:10:37 -07:00
readahead.c Merge branch 'akpm' (patches from Andrew) 2021-11-06 14:08:17 -07:00
rmap.c Merge branch 'akpm' (patches from Andrew) 2021-11-06 14:08:17 -07:00
rodata_test.c
secretmem.c mm/secretmem: avoid letting secretmem_users drop to zero 2021-10-28 17:18:55 -07:00
shmem.c shmem: fix a race between shmem_unused_huge_shrink and shmem_evict_inode 2022-01-15 16:30:26 +02:00
shuffle.c mm: eliminate "expecting prototype" kernel-doc warnings 2021-04-16 16:10:36 -07:00
shuffle.h mm/shuffle: fix section mismatch warning 2021-05-22 15:09:07 -10:00
slab_common.c mm: memcontrol: make cgroup_memory_nokmem static 2022-01-15 16:30:27 +02:00
slab.c mm: emit the "free" trace report before freeing memory in kmem_cache_free() 2021-11-20 10:35:54 -08:00
slab.h mm: slab: make slab iterator functions static 2022-01-15 16:30:25 +02:00
slob.c mm: emit the "free" trace report before freeing memory in kmem_cache_free() 2021-11-20 10:35:54 -08:00
slub.c mm/slub: fix endianness bug for alloc/free_traces attributes 2021-12-10 17:10:56 -08:00
sparse-vmemmap.c mm: remove redundant smp_wmb() 2021-11-06 13:30:36 -07:00
sparse.c memblock: use memblock_free for freeing virtual pointers 2021-11-06 13:30:41 -07:00
swap_cgroup.c
swap_slots.c treewide: Add missing includes masked by cgroup -> bpf dependency 2021-12-03 10:58:13 -08:00
swap_state.c mm/workingset: Convert workingset_refault() to take a folio 2021-10-18 07:49:40 -04:00
swap.c mm/swap.c:put_pages_list(): reinitialise the page list 2021-11-20 10:35:54 -08:00
swapfile.c mm: change page type prior to adding page table entry 2022-01-15 16:30:28 +02:00
truncate.c mm/truncate.c: remove unneeded variable 2022-01-15 16:30:26 +02:00
usercopy.c
userfaultfd.c mm: shmem: don't truncate page if memory failure happens 2022-01-15 16:30:26 +02:00
util.c mm: Remove folio_test_single 2021-11-17 10:36:35 -05:00
vmacache.c
vmalloc.c memcg: add per-memcg vmalloc stat 2022-01-15 16:30:27 +02:00
vmpressure.c mm/vmpressure: fix data-race with memcg->socket_pressure 2021-11-06 13:30:40 -07:00
vmscan.c mm: vmscan: reduce throttling due to a failure to make progress -fix 2021-12-31 13:12:55 -08:00
vmstat.c mm: vmstat.c: make extfrag_index show more pretty 2021-11-06 13:30:42 -07:00
workingset.c Merge branch 'akpm' (patches from Andrew) 2021-11-09 10:11:53 -08:00
z3fold.c mm/z3fold: add kerneldoc fields for z3fold_pool 2021-07-01 11:06:03 -07:00
zbud.c mm/zbud: add kerneldoc fields for zbud_pool 2021-07-01 11:06:03 -07:00
zpool.c mm: fix typos in comments 2021-05-07 00:26:35 -07:00
zsmalloc.c mm/zsmalloc.c: close race window between zs_pool_dec_isolated() and zs_unregister_migration() 2021-11-06 13:30:43 -07:00
zswap.c mm/zswap.c: fix two bugs in zswap_writeback_entry() 2021-06-30 20:47:31 -07:00