linux/mm
Peter Xu c1991e0705 hugetlb/userfaultfd: forbid huge pmd sharing when uffd enabled
Huge pmd sharing could bring problem to userfaultfd.  The thing is that
userfaultfd is running its logic based on the special bits on page table
entries, however the huge pmd sharing could potentially share page table
entries for different address ranges.  That could cause issues on
either:

 - When sharing huge pmd page tables for an uffd write protected range,
   the newly mapped huge pmd range will also be write protected
   unexpectedly, or,

 - When we try to write protect a range of huge pmd shared range, we'll
   first do huge_pmd_unshare() in hugetlb_change_protection(), however
   that also means the UFFDIO_WRITEPROTECT could be silently skipped for
   the shared region, which could lead to data loss.

While at it, a few other things are done altogether:

 - Move want_pmd_share() from mm/hugetlb.c into linux/hugetlb.h, because
   that's definitely something that arch code would like to use too

 - ARM64 currently directly check against
   CONFIG_ARCH_WANT_HUGE_PMD_SHARE when trying to share huge pmd. Switch
   to the want_pmd_share() helper.

 - Move vma_shareable() from huge_pmd_share() into want_pmd_share().

[peterx@redhat.com: fix build with !ARCH_WANT_HUGE_PMD_SHARE]
  Link: https://lkml.kernel.org/r/20210310185359.88297-1-peterx@redhat.com

Link: https://lkml.kernel.org/r/20210218231202.15426-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Cc: Adam Ruprecht <ruprecht@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Cannon Matthews <cannonmatthews@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chinwen Chang <chinwen.chang@mediatek.com>
Cc: David Rientjes <rientjes@google.com>
Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Michal Koutn" <mkoutny@suse.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Oliver Upton <oupton@google.com>
Cc: Shaohua Li <shli@fb.com>
Cc: Shawn Anastasio <shawn@anastas.io>
Cc: Steven Price <steven.price@arm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-05 11:27:20 -07:00
..
kasan kasan: record task_work_add() call stack 2021-04-30 11:20:42 -07:00
kfence kfence: make compatible with kmemleak 2021-03-25 09:22:55 -07:00
backing-dev.c mm/backing-dev.c: use might_alloc() 2021-02-26 09:41:01 -08:00
balloon_compaction.c
cleancache.c
cma_debug.c debugfs: make sure we can remove u32_array files cleanly 2020-07-10 13:54:00 -07:00
cma.c mm: cma: print region name on failure 2021-02-26 09:41:00 -08:00
cma.h mm: cma: use CMA_MAX_NAME to define the length of cma name array 2020-09-01 09:19:43 +02:00
compaction.c mm, compaction: make fast_isolate_freepages() stay within zone 2021-02-24 13:38:34 -08:00
debug_page_ref.c
debug_vm_pgtable.c mm: HUGE_VMAP arch support cleanup 2021-04-30 11:20:40 -07:00
debug.c mm/debug: improve memcg debugging 2021-02-24 13:38:27 -08:00
dmapool.c mm/dmapool: switch from strlcpy to strscpy 2021-04-30 11:20:39 -07:00
early_ioremap.c mm/early_ioremap.c: use __func__ instead of function name 2021-02-26 09:41:02 -08:00
fadvise.c mm, fadvise: improve the expensive remote LRU cache draining after FADV_DONTNEED 2020-10-13 18:38:29 -07:00
failslab.c
filemap.c dax: account DAX entries as nrpages 2021-05-05 11:27:19 -07:00
frontswap.c mm/frontswap: mark various intentional data races 2020-08-14 19:56:56 -07:00
gup_test.c mm/gup_test.c: mark gup_test_init as __init function 2020-12-15 12:13:38 -08:00
gup_test.h selftests/vm: gup_test: introduce the dump_pages() sub-test 2020-12-15 12:13:38 -08:00
gup.c mm: gup: remove FOLL_SPLIT 2021-04-30 11:20:37 -07:00
highmem.c mm/highmem: fix CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP 2021-03-25 09:22:55 -07:00
hmm.c mm: do page fault accounting in handle_mm_fault 2020-08-12 10:58:02 -07:00
huge_memory.c mm/memcg: rename mem_cgroup_split_huge_fixup to split_page_memcg and add nr_pages argument 2021-03-13 11:27:31 -08:00
hugetlb_cgroup.c hugetlb_cgroup: fix imbalanced css_get and css_put pair for shared mappings 2021-03-25 09:22:55 -07:00
hugetlb.c hugetlb/userfaultfd: forbid huge pmd sharing when uffd enabled 2021-05-05 11:27:20 -07:00
hwpoison-inject.c mm,hwpoison-inject: don't pin for hwpoison_filter 2020-10-16 11:11:16 -07:00
init-mm.c mm/gup: prevent gup_fast from racing with COW during fork 2020-12-15 12:13:39 -08:00
internal.h mm/page_alloc: combine __alloc_pages and __alloc_pages_nodemask 2021-04-30 11:20:42 -07:00
interval_tree.c mm/interval_tree: add comments to improve code readability 2021-04-30 11:20:38 -07:00
io-mapping.c mm: add a io_mapping_map_user helper 2021-04-30 11:20:39 -07:00
ioremap.c mm: move vmap_range from mm/ioremap.c to mm/vmalloc.c 2021-04-30 11:20:40 -07:00
Kconfig mm/Kconfig: remove default DISCONTIGMEM_MANUAL 2021-04-30 11:20:43 -07:00
Kconfig.debug mm, page_poison: remove CONFIG_PAGE_POISONING_ZERO 2020-12-15 12:13:46 -08:00
khugepaged.c mm,thp,shmem: make khugepaged obey tmpfs mount flags 2021-02-26 09:40:59 -08:00
kmemleak.c mm/kmemleak.c: fix a typo 2021-04-30 11:20:36 -07:00
ksm.c mm: cleanup kstrto*() usage 2020-12-15 12:13:47 -08:00
list_lru.c mm/list_lru.c: remove kvfree_rcu_local() 2021-02-24 13:38:30 -08:00
maccess.c uaccess: add force_uaccess_{begin,end} helpers 2020-08-12 10:57:59 -07:00
madvise.c mm/madvise: replace ptrace attach requirement for process_madvise 2021-03-13 11:27:30 -08:00
Makefile mm: add a io_mapping_map_user helper 2021-04-30 11:20:39 -07:00
mapping_dirty_helpers.c mm/mapping_dirty_helpers: guard hugepage pud's usage 2021-04-16 16:10:37 -07:00
memblock.c memblock: remove return value of memblock_free_all() 2021-02-22 13:01:23 -08:00
memcontrol.c mm: memcontrol: inline __memcg_kmem_{un}charge() into obj_cgroup_{un}charge_pages() 2021-04-30 11:20:38 -07:00
memfd.c
memory_hotplug.c arm64: mte: Map hotplugged memory as Normal Tagged 2021-03-10 10:56:46 +00:00
memory-failure.c mm/memory-failure: unnecessary amount of unmapping 2021-04-30 11:20:44 -07:00
memory.c mm: apply_to_pte_range warn and fail if a large pte is encountered 2021-04-30 11:20:39 -07:00
mempolicy.c mm/mempolicy: fix mpol_misplaced kernel-doc 2021-04-30 11:20:43 -07:00
mempool.c kasan, mm: integrate page_alloc init with HW_TAGS 2021-04-30 11:20:41 -07:00
memremap.c mm/memremap.c: fix improper SPDX comment style 2021-04-30 11:20:37 -07:00
memtest.c
migrate.c mm/page_alloc: combine __alloc_pages and __alloc_pages_nodemask 2021-04-30 11:20:42 -07:00
mincore.c inode: make init and permission helpers idmapped mount aware 2021-01-24 14:27:16 +01:00
mlock.c mm/mlock: stop counting mlocked pages when none vma is found 2021-02-26 09:41:01 -08:00
mm_init.c include/linux/page-flags-layout.h: cleanups 2021-04-30 11:20:42 -07:00
mmap_lock.c mm: mmap_lock: add tracepoints around lock acquisition 2020-12-15 12:13:41 -08:00
mmap.c Revert "mremap: don't allow MREMAP_DONTUNMAP on special_mappings and aio" 2021-04-30 11:20:39 -07:00
mmu_gather.c mm: eliminate "expecting prototype" kernel-doc warnings 2021-04-16 16:10:36 -07:00
mmu_notifier.c mm/mmu_notifiers: ensure range_end() is paired with range_start() 2021-03-25 09:22:55 -07:00
mmzone.c mm/lru: replace pgdat lru_lock with lruvec lock 2020-12-15 14:48:04 -08:00
mprotect.c mm/mprotect.c: optimize error detection in do_mprotect_pkey() 2021-02-24 13:38:30 -08:00
mremap.c Revert "mremap: don't allow MREMAP_DONTUNMAP on special_mappings and aio" 2021-04-30 11:20:39 -07:00
msync.c mm/msync: exit early when the flags is an MS_ASYNC and start < vm_start 2021-04-30 11:20:37 -07:00
nommu.c mm/nommu: Fix return type of filemap_map_pages() 2021-01-28 14:10:31 +00:00
oom_kill.c mm: eliminate "expecting prototype" kernel-doc warnings 2021-04-16 16:10:36 -07:00
page_alloc.c mm: page_alloc: ignore init_on_free=1 for debug_pagealloc=1 2021-04-30 11:20:43 -07:00
page_counter.c mm: page_counter: mitigate consequences of a page_counter underflow 2021-04-30 11:20:38 -07:00
page_ext.c mm: fix some spelling mistakes in comments 2020-12-15 22:46:19 -08:00
page_idle.c mm: page_idle_get_page() does not need lru_lock 2020-12-15 14:48:03 -08:00
page_io.c swap: fix swapfile read/write offset 2021-03-02 17:25:46 -07:00
page_isolation.c mm/page_isolation: do not isolate the max order page 2020-12-15 12:13:45 -08:00
page_owner.c mm: page_owner: detect page_owner recursion via task_struct 2021-04-30 11:20:36 -07:00
page_poison.c mm: page_poison: print page info when corruption is caught 2021-04-30 11:20:36 -07:00
page_reporting.c mm/page_reporting: use list_entry_is_head() in page_reporting_cycle() 2021-02-24 13:38:30 -08:00
page_reporting.h mm: introduce include/linux/pgtable.h 2020-06-09 09:39:13 -07:00
page_vma_mapped.c mm/page_vma_mapped.c: add colon to fix kernel-doc markups error for check_pte 2020-12-15 12:13:41 -08:00
page-writeback.c mm: page-writeback: simplify memcg handling in test_clear_page_writeback() 2021-04-30 11:20:37 -07:00
pagewalk.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
percpu-internal.h percpu: make pcpu_nr_empty_pop_pages per chunk type 2021-04-09 13:58:38 +00:00
percpu-km.c mm: memcg/percpu: account percpu memory to memory cgroups 2020-08-12 10:57:55 -07:00
percpu-stats.c percpu: make pcpu_nr_empty_pop_pages per chunk type 2021-04-09 13:58:38 +00:00
percpu-vm.c mm/vmalloc: remove unmap_kernel_range 2021-04-30 11:20:40 -07:00
percpu.c percpu: make pcpu_nr_empty_pop_pages per chunk type 2021-04-09 13:58:38 +00:00
pgalloc-track.h mm: move p?d_alloc_track to separate header file 2020-08-07 11:33:26 -07:00
pgtable-generic.c mm/pgtable-generic.c: optimize the VM_BUG_ON condition in pmdp_huge_clear_flush() 2021-02-24 13:38:30 -08:00
process_vm_access.c mm/process_vm_access.c: include compat.h 2021-01-12 18:12:54 -08:00
ptdump.c mm: ptdump: fix build failure 2021-04-16 16:10:37 -07:00
readahead.c mm: Implement readahead_control pageset expansion 2021-04-23 10:14:29 +01:00
rmap.c mm/rmap: correct obsolete comment of page_get_anon_vma() 2021-02-26 09:41:01 -08:00
rodata_test.c mm/rodata_test.c: fix missing function declaration 2020-08-21 09:52:53 -07:00
shmem.c shmem: allow reporting fanotify events with file handles on tmpfs 2021-04-19 16:03:48 +02:00
shuffle.c mm: eliminate "expecting prototype" kernel-doc warnings 2021-04-16 16:10:36 -07:00
shuffle.h mm/shuffle: remove dynamic reconfiguration 2020-08-07 11:33:29 -07:00
slab_common.c mm/slab_common: provide "slab_merge" option for !IS_ENABLED(CONFIG_SLAB_MERGE_DEFAULT) builds 2021-04-30 11:20:36 -07:00
slab.c kasan, mm: integrate slab init_on_free with HW_TAGS 2021-04-30 11:20:41 -07:00
slab.h kasan, mm: integrate slab init_on_alloc with HW_TAGS 2021-04-30 11:20:41 -07:00
slob.c mm: Don't build mm_dump_obj() on CONFIG_PRINTK=n kernels 2021-03-08 14:18:46 -08:00
slub.c kasan, mm: integrate slab init_on_free with HW_TAGS 2021-04-30 11:20:41 -07:00
sparse-vmemmap.c mm/sparse: only sub-section aligned range would be populated 2020-08-07 11:33:27 -07:00
sparse.c mm/sparse: add the missing sparse_buffer_fini() in error branch 2021-04-30 11:20:39 -07:00
swap_cgroup.c mm: memcontrol: make swap tracking an integral part of memory control 2020-06-03 20:09:48 -07:00
swap_slots.c mm/swap_slots.c: remove redundant NULL check 2021-02-24 13:38:28 -08:00
swap_state.c mm: stop accounting shadow entries 2021-05-05 11:27:19 -07:00
swap.c mm: remove pagevec_lookup_entries 2021-02-26 09:40:59 -08:00
swapfile.c swap: fix swapfile read/write offset 2021-03-02 17:25:46 -07:00
truncate.c mm: stop accounting shadow entries 2021-05-05 11:27:19 -07:00
usercopy.c mm/usercopy.c: delete duplicated word 2020-08-12 10:57:58 -07:00
userfaultfd.c hugetlb: pass vma into huge_pte_alloc() and huge_pmd_share() 2021-05-05 11:27:20 -07:00
util.c mm: move page_mapping_file to pagemap.h 2021-04-30 11:20:37 -07:00
vmacache.c kernel: better document the use_mm/unuse_mm API contract 2020-06-10 19:14:18 -07:00
vmalloc.c mm/vmalloc: remove an empty line 2021-04-30 11:20:40 -07:00
vmpressure.c mm: vmpressure: use mem_cgroup_is_root API 2020-04-02 09:35:31 -07:00
vmscan.c mm/vmscan: restore zone_reclaim_mode ABI 2021-02-24 13:38:34 -08:00
vmstat.c mm/vmstat.c: erase latency in vmstat_shepherd 2021-02-26 09:41:00 -08:00
workingset.c mm: stop accounting shadow entries 2021-05-05 11:27:19 -07:00
z3fold.c z3fold: prevent reclaim/free race for headless pages 2021-03-25 09:22:55 -07:00
zbud.c mm: set the sleep_mapped to true for zbud and z3fold 2021-02-26 09:41:01 -08:00
zpool.c mm/zswap: add the flag can_sleep_mapped 2021-02-26 09:41:01 -08:00
zsmalloc.c mm/zsmalloc.c: use page_private() to access page->private 2021-02-26 09:41:01 -08:00
zswap.c mm/zswap: add the flag can_sleep_mapped 2021-02-26 09:41:01 -08:00