linux/Documentation/mm
Peter Xu 8430557fc5 mm/page_table_check: support userfault wr-protect entries
Allow page_table_check hooks to check over userfaultfd wr-protect criteria
upon pgtable updates.  The rule is no co-existance allowed for any
writable flag against userfault wr-protect flag.

This should be better than c2da319c2e, where we used to only sanitize such
issues during a pgtable walk, but when hitting such issue we don't have a
good chance to know where does that writable bit came from [1], so that
even the pgtable walk exposes a kernel bug (which is still helpful on
triaging) but not easy to track and debug.

Now we switch to track the source.  It's much easier too with the recent
introduction of page table check.

There are some limitations with using the page table check here for
userfaultfd wr-protect purpose:

  - It is only enabled with explicit enablement of page table check configs
  and/or boot parameters, but should be good enough to track at least
  syzbot issues, as syzbot should enable PAGE_TABLE_CHECK[_ENFORCED] for
  x86 [1].  We used to have DEBUG_VM but it's now off for most distros,
  while distros also normally not enable PAGE_TABLE_CHECK[_ENFORCED], which
  is similar.

  - It conditionally works with the ptep_modify_prot API.  It will be
  bypassed when e.g. XEN PV is enabled, however still work for most of the
  rest scenarios, which should be the common cases so should be good
  enough.

  - Hugetlb check is a bit hairy, as the page table check cannot identify
  hugetlb pte or normal pte via trapping at set_pte_at(), because of the
  current design where hugetlb maps every layers to pte_t... For example,
  the default set_huge_pte_at() can invoke set_pte_at() directly and lose
  the hugetlb context, treating it the same as a normal pte_t. So far it's
  fine because we have huge_pte_uffd_wp() always equals to pte_uffd_wp() as
  long as supported (x86 only).  It'll be a bigger problem when we'll
  define _PAGE_UFFD_WP differently at various pgtable levels, because then
  one huge_pte_uffd_wp() per-arch will stop making sense first.. as of now
  we can leave this for later too.

This patch also removes commit c2da319c2e altogether, as we have something
better now.

[1] https://lore.kernel.org/all/000000000000dce0530615c89210@google.com/

Link: https://lkml.kernel.org/r/20240417212549.2766883-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-05 17:53:41 -07:00
..
damon Docs/mm/damon/design: remove the details for pageout as paddr doesn't use MADV_PAGEOUT 2024-03-04 17:01:17 -08:00
active_mm.rst lazy tlb: allow lazy tlb mm refcounting to be configurable 2023-03-28 16:20:08 -07:00
allocation-profiling.rst memprofiling: documentation 2024-04-25 20:55:58 -07:00
arch_pgtable_helpers.rst Documentation/mm: drop pte_bad() descriptions from arch page table helpers 2023-12-10 16:51:40 -08:00
balance.rst - Daniel Verkamp has contributed a memfd series ("mm/memfd: add 2023-02-23 17:09:35 -08:00
bootmem.rst
free_page_reporting.rst docs/mm: remove useless markup 2023-02-02 10:18:05 -07:00
highmem.rst Documentation work keeps chugging along; stuff for 6.6 includes: 2023-08-30 20:05:42 -07:00
hmm.rst docs/mm: remove references to hmm_mirror ops and clean typos 2023-08-28 12:41:17 -06:00
hugetlbfs_reserv.rst mm: convert free_huge_page() to free_huge_folio() 2023-08-21 14:28:43 -07:00
hwpoison.rst Documentation: Fix typos 2023-08-18 11:29:03 -06:00
index.rst memprofiling: documentation 2024-04-25 20:55:58 -07:00
ksm.rst docs/mm: remove useless markup 2023-02-02 10:18:05 -07:00
memory-model.rst docs/mm: remove useless markup 2023-02-02 10:18:05 -07:00
mmu_notifier.rst docs/mm: remove useless markup 2023-02-02 10:18:05 -07:00
multigen_lru.rst mm: multi-gen LRU: improve design doc 2023-03-28 16:20:07 -07:00
numa.rst docs/mm: remove useless markup 2023-02-02 10:18:05 -07:00
oom.rst
overcommit-accounting.rst docs: mm: fix vm overcommit documentation for OVERCOMMIT_GUESS 2023-10-10 13:35:55 -06:00
page_allocation.rst
page_cache.rst ubifs: Convert ubifs_vm_page_mkwrite() to use a folio 2024-02-25 21:08:00 +01:00
page_frags.rst docs/mm: remove useless markup 2023-02-02 10:18:05 -07:00
page_migration.rst Documentation: Fix typos 2023-08-18 11:29:03 -06:00
page_owner.rst mm,page_owner: fix refcount imbalance 2024-04-16 15:39:49 -07:00
page_reclaim.rst docs/mm: Physical Memory: remove useless markup 2023-02-02 10:18:04 -07:00
page_table_check.rst mm/page_table_check: support userfault wr-protect entries 2024-05-05 17:53:41 -07:00
page_tables.rst Documentation/page_tables: Add info about MMU/TLB and Page Faults 2023-10-10 13:35:55 -06:00
physical_memory.rst docs/mm: Physical Memory: Fix grammar 2023-04-11 16:16:50 -06:00
process_addrs.rst
remap_file_pages.rst docs/mm: remove useless markup 2023-02-02 10:18:05 -07:00
shmfs.rst
slab.rst
slub.rst mm/slub: make the description of slab_min_objects helpful in doc 2024-01-22 10:31:08 +01:00
split_page_table_lock.rst mm: remove pgtable_{pmd, pte}_page_{ctor, dtor}() wrappers 2023-08-21 13:37:58 -07:00
swap.rst
transhuge.rst mm: track mapcount of large folios in single value 2024-05-05 17:53:28 -07:00
unevictable-lru.rst Documentation: stop referring to page_remove_rmap() 2023-12-29 11:58:54 -08:00
vmalloc.rst
vmalloced-kernel-stacks.rst
vmemmap_dedup.rst remove references to page->flags in documentation 2024-04-25 20:56:15 -07:00
z3fold.rst docs/mm: remove useless markup 2023-02-02 10:18:05 -07:00
zsmalloc.rst mm: add orphaned kernel-doc to the rst files. 2023-08-24 16:20:31 -07:00