linux/Documentation/mm
Peter Xu 5b485efcb6 mm/page_table_check: support userfault wr-protect entries
[ Upstream commit 8430557fc5 ]

Allow the page_table_check hooks to check userfaultfd wr-protect criteria
upon pgtable updates.  The rule is that a writable flag must never coexist
with the userfault wr-protect flag.
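
As a rough illustration of the rule, the following user-space sketch models
an entry as a plain 64-bit word and rejects any update that sets both bits.
The bit positions and the names MODEL_PTE_WRITE, MODEL_PTE_UFFD_WP,
model_uffd_wp_violation() and model_set_pte() are hypothetical and exist
only for this example; they are not the kernel's definitions, and the
kernel's actual reaction to a violation is not modelled here.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical bit positions, chosen for illustration only. */
#define MODEL_PTE_WRITE   (UINT64_C(1) << 1)
#define MODEL_PTE_UFFD_WP (UINT64_C(1) << 10)

/* True if the entry breaks the rule: writable and uffd-wp both set. */
static bool model_uffd_wp_violation(uint64_t pte)
{
	return (pte & MODEL_PTE_WRITE) && (pte & MODEL_PTE_UFFD_WP);
}

/*
 * Model of a check-on-update hook: the entry is validated at the point
 * where it is installed, so the offending caller is known right away.
 * Returns 0 if the entry was installed, -1 if it was rejected.
 */
static int model_set_pte(uint64_t *slot, uint64_t new_pte, const char *caller)
{
	if (model_uffd_wp_violation(new_pte)) {
		fprintf(stderr, "bad entry %#llx from %s\n",
			(unsigned long long)new_pte, caller);
		return -1;
	}
	*slot = new_pte;
	return 0;
}
```

Checking at update time, rather than during a later walk, is what lets the
report name the code path that produced the bad entry.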

This should be better than c2da319c2e, which only sanitized such issues
during a pgtable walk.  When the walk hit such an issue, there was little
chance of knowing where the writable bit came from [1], so even though the
walk exposed a kernel bug (which is still helpful for triaging), the bug
was not easy to track down and debug.

Now we switch to tracking the source.  This is also much easier with the
recent introduction of page table check.

There are some limitations to using the page table check here for
userfaultfd wr-protect purposes:

  - It is only active when page table check is explicitly enabled via
  config options and/or boot parameters, but that should be good enough to
  track at least syzbot issues, as syzbot should enable
  PAGE_TABLE_CHECK[_ENFORCED] for x86 [1].  We used to have DEBUG_VM, but
  it is now off for most distros; distros normally do not enable
  PAGE_TABLE_CHECK[_ENFORCED] either, so the situation is similar.

  - It works only conditionally with the ptep_modify_prot API.  It is
  bypassed when e.g. XEN PV is enabled, but it still works in most other
  scenarios, which should be the common cases, so this should be good
  enough.

  - The hugetlb check is a bit hairy, as the page table check cannot tell a
  hugetlb pte from a normal pte when trapping at set_pte_at(), because of
  the current design where hugetlb maps every layer to pte_t.  For example,
  the default set_huge_pte_at() can invoke set_pte_at() directly and lose
  the hugetlb context, treating the entry the same as a normal pte_t.  So
  far this is fine because huge_pte_uffd_wp() always equals pte_uffd_wp()
  wherever it is supported (x86 only).  It will become a bigger problem
  once _PAGE_UFFD_WP is defined differently at various pgtable levels,
  because then a single per-arch huge_pte_uffd_wp() will stop making sense
  first.  For now we can leave this for later too.
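
To make the hugetlb limitation concrete, here is a small user-space sketch
of a check that only ever sees a raw entry, next to one that is told the
pgtable level.  Everything in it is hypothetical: the HUGE_MODEL_* bits,
the enum and both functions are invented for the example, and the separate
PMD-level bit stands for a possible future layout, not for any existing
architecture.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical pgtable levels for the model. */
enum huge_model_level { HUGE_MODEL_LEVEL_PTE, HUGE_MODEL_LEVEL_PMD };

/* Hypothetical bit positions, chosen for illustration only. */
#define HUGE_MODEL_WRITE       (UINT64_C(1) << 1)
#define HUGE_MODEL_PTE_UFFD_WP (UINT64_C(1) << 10)
/* Assumed future layout in which the PMD level uses a different bit. */
#define HUGE_MODEL_PMD_UFFD_WP (UINT64_C(1) << 11)

/*
 * Level-blind check: all it gets is the raw entry, so it has to assume
 * the PTE-level uffd-wp bit.  Returns true if the entry looks valid.
 */
static bool huge_model_check_blind(uint64_t entry)
{
	return !((entry & HUGE_MODEL_WRITE) &&
		 (entry & HUGE_MODEL_PTE_UFFD_WP));
}

/* Level-aware check: picks the uffd-wp bit that matches the level. */
static bool huge_model_check_aware(uint64_t entry,
				   enum huge_model_level level)
{
	uint64_t wp = (level == HUGE_MODEL_LEVEL_PMD) ?
		      HUGE_MODEL_PMD_UFFD_WP : HUGE_MODEL_PTE_UFFD_WP;

	return !((entry & HUGE_MODEL_WRITE) && (entry & wp));
}
```

In this model a writable entry carrying HUGE_MODEL_PMD_UFFD_WP passes the
level-blind check but fails the level-aware one.  As long as every level
shares one bit, the two checks agree, which matches the situation described
above.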

This patch also removes commit c2da319c2e altogether, as we have something
better now.

[1] https://lore.kernel.org/all/000000000000dce0530615c89210@google.com/

Link: https://lkml.kernel.org/r/20240417212549.2766883-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-19 06:04:29 +02:00
damon Docs/mm/damon/design: update for DAMON monitoring target type DAMOS filter 2023-08-21 13:37:37 -07:00
active_mm.rst lazy tlb: allow lazy tlb mm refcounting to be configurable 2023-03-28 16:20:08 -07:00
arch_pgtable_helpers.rst mm: fix race between __split_huge_pmd_locked() and GUP-fast 2024-06-16 13:47:40 +02:00
balance.rst - Daniel Verkamp has contributed a memfd series ("mm/memfd: add 2023-02-23 17:09:35 -08:00
bootmem.rst
free_page_reporting.rst docs/mm: remove useless markup 2023-02-02 10:18:05 -07:00
highmem.rst Documentation work keeps chugging along; stuff for 6.6 includes: 2023-08-30 20:05:42 -07:00
hmm.rst docs/mm: remove references to hmm_mirror ops and clean typos 2023-08-28 12:41:17 -06:00
hugetlbfs_reserv.rst mm: convert free_huge_page() to free_huge_folio() 2023-08-21 14:28:43 -07:00
hwpoison.rst Documentation: Fix typos 2023-08-18 11:29:03 -06:00
index.rst mm: kill frontswap 2023-08-21 13:37:26 -07:00
ksm.rst docs/mm: remove useless markup 2023-02-02 10:18:05 -07:00
memory-model.rst docs/mm: remove useless markup 2023-02-02 10:18:05 -07:00
mmu_notifier.rst docs/mm: remove useless markup 2023-02-02 10:18:05 -07:00
multigen_lru.rst mm: multi-gen LRU: improve design doc 2023-03-28 16:20:07 -07:00
numa.rst docs/mm: remove useless markup 2023-02-02 10:18:05 -07:00
oom.rst
overcommit-accounting.rst
page_allocation.rst
page_cache.rst
page_frags.rst docs/mm: remove useless markup 2023-02-02 10:18:05 -07:00
page_migration.rst Documentation: Fix typos 2023-08-18 11:29:03 -06:00
page_owner.rst - Daniel Verkamp has contributed a memfd series ("mm/memfd: add 2023-02-23 17:09:35 -08:00
page_reclaim.rst docs/mm: Physical Memory: remove useless markup 2023-02-02 10:18:04 -07:00
page_table_check.rst mm/page_table_check: support userfault wr-protect entries 2024-08-19 06:04:29 +02:00
page_tables.rst Documentation/mm: Initial page table documentation 2023-06-16 08:13:13 -06:00
physical_memory.rst docs/mm: Physical Memory: Fix grammar 2023-04-11 16:16:50 -06:00
process_addrs.rst
remap_file_pages.rst docs/mm: remove useless markup 2023-02-02 10:18:05 -07:00
shmfs.rst
slab.rst
slub.rst - Daniel Verkamp has contributed a memfd series ("mm/memfd: add 2023-02-23 17:09:35 -08:00
split_page_table_lock.rst mm: remove pgtable_{pmd, pte}_page_{ctor, dtor}() wrappers 2023-08-21 13:37:58 -07:00
swap.rst
transhuge.rst - Daniel Verkamp has contributed a memfd series ("mm/memfd: add 2023-02-23 17:09:35 -08:00
unevictable-lru.rst Documentation: Fix typos 2023-08-18 11:29:03 -06:00
vmalloc.rst
vmalloced-kernel-stacks.rst
vmemmap_dedup.rst Documentation work keeps chugging along; stuff for 6.6 includes: 2023-08-30 20:05:42 -07:00
z3fold.rst docs/mm: remove useless markup 2023-02-02 10:18:05 -07:00
zsmalloc.rst mm: add orphaned kernel-doc to the rst files. 2023-08-24 16:20:31 -07:00