linux/arch
Peter Xu 8430557fc5 mm/page_table_check: support userfault wr-protect entries
Allow the page_table_check hooks to check userfaultfd wr-protect criteria
upon pgtable updates.  The rule is that no writable flag may co-exist with
the userfault wr-protect flag.
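
As an illustration only, the rule can be modelled in a few lines of
standalone C.  This is a minimal sketch, not the code added by this patch:
the sketch_* names, the sketch_pte_t type and the bit positions are all
hypothetical and chosen just to make the example compile outside the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions, for illustration only; every real
 * architecture defines its own pte layout. */
#define SKETCH_PTE_WRITE   (UINT64_C(1) << 1)
#define SKETCH_PTE_UFFD_WP (UINT64_C(1) << 10)

/* Hypothetical stand-in for the kernel's pte_t. */
typedef struct { uint64_t val; } sketch_pte_t;

static inline sketch_pte_t sketch_mk_pte(uint64_t val)
{
	sketch_pte_t pte = { val };
	return pte;
}

static inline bool sketch_pte_write(sketch_pte_t pte)
{
	return (pte.val & SKETCH_PTE_WRITE) != 0;
}

static inline bool sketch_pte_uffd_wp(sketch_pte_t pte)
{
	return (pte.val & SKETCH_PTE_UFFD_WP) != 0;
}

/* True when the entry breaks the rule: uffd-wp set together with write. */
static inline bool sketch_uffd_wp_violation(sketch_pte_t pte)
{
	return sketch_pte_uffd_wp(pte) && sketch_pte_write(pte);
}

/* Model of a hook run on every pgtable update, so that a bad entry is
 * caught at the point where it is installed. */
static inline void sketch_check_pte_set(sketch_pte_t pte)
{
	assert(!sketch_uffd_wp_violation(pte));
}
```

Calling sketch_check_pte_set() from every place that installs an entry is
what lets a violation be reported at its source rather than during a later
pgtable walk.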

This should be better than c2da319c2e, where we used to only sanitize such
issues during a pgtable walk.  When hitting such an issue there, we don't
have a good chance of knowing where the writable bit came from [1], so even
though the pgtable walk exposes a kernel bug (which is still helpful for
triaging), the bug is not easy to track and debug.

Now we switch to track the source.  It's much easier too with the recent
introduction of page table check.

There are some limitations to using the page table check for userfaultfd
wr-protect purposes:

  - It is only active when page table check is explicitly enabled via config
  options and/or boot parameters, but that should be good enough to track at
  least syzbot issues, as syzbot should enable PAGE_TABLE_CHECK[_ENFORCED]
  for x86 [1].  We used to have DEBUG_VM, but it's now off for most distros;
  distros normally don't enable PAGE_TABLE_CHECK[_ENFORCED] either, so the
  situation is similar.

  - It conditionally works with the ptep_modify_prot API.  It will be
  bypassed when e.g. XEN PV is enabled, but it still works for most of the
  remaining scenarios, which should be the common cases, so it should be
  good enough.

  - The hugetlb check is a bit hairy, as the page table check cannot tell a
  hugetlb pte from a normal pte when trapping at set_pte_at(), because of
  the current design where hugetlb maps every layer to pte_t.  For example,
  the default set_huge_pte_at() can invoke set_pte_at() directly and lose
  the hugetlb context, treating it the same as a normal pte_t.  So far it's
  fine because huge_pte_uffd_wp() always equals pte_uffd_wp() as long as it
  is supported (x86 only).  It'll be a bigger problem when we define
  _PAGE_UFFD_WP differently at various pgtable levels, because then a
  single per-arch huge_pte_uffd_wp() will stop making sense first.  As of
  now we can leave this for later too.

This patch also removes commit c2da319c2e altogether, as we have something
better now.

[1] https://lore.kernel.org/all/000000000000dce0530615c89210@google.com/

Link: https://lkml.kernel.org/r/20240417212549.2766883-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-05 17:53:41 -07:00
..
alpha treewide: use initializer for struct vm_unmapped_area_info 2024-04-25 20:56:27 -07:00
arc treewide: use initializer for struct vm_unmapped_area_info 2024-04-25 20:56:27 -07:00
arm arm: mm: drop VM_FAULT_BADMAP/VM_FAULT_BADACCESS 2024-05-05 17:53:32 -07:00
arm64 arm64: mm: drop VM_FAULT_BADMAP/VM_FAULT_BADACCESS 2024-05-05 17:53:32 -07:00
csky csky: use initializer for struct vm_unmapped_area_info 2024-04-25 20:56:27 -07:00
hexagon hexagon: vmlinux.lds.S: handle attributes section 2024-03-26 11:07:23 -07:00
loongarch mm/treewide: rename CONFIG_HAVE_FAST_GUP to CONFIG_HAVE_GUP_FAST 2024-04-25 20:56:41 -07:00
m68k TTY/Serial driver update for 6.9-rc1 2024-03-21 12:44:10 -07:00
microblaze arch: define CONFIG_PAGE_SIZE_*KB on all architectures 2024-03-06 19:29:09 +01:00
mips mm/treewide: rename CONFIG_HAVE_FAST_GUP to CONFIG_HAVE_GUP_FAST 2024-04-25 20:56:41 -07:00
nios2 nios2: Only use built-in devicetree blob if configured to do so 2024-04-03 14:35:53 -05:00
openrisc OpenRISC updates for 6.9 2024-03-14 15:53:10 -07:00
parisc parisc: use initializer for struct vm_unmapped_area_info 2024-04-25 20:56:27 -07:00
powerpc mm/treewide: rename CONFIG_HAVE_FAST_GUP to CONFIG_HAVE_GUP_FAST 2024-04-25 20:56:41 -07:00
riscv mm/treewide: rename CONFIG_HAVE_FAST_GUP to CONFIG_HAVE_GUP_FAST 2024-04-25 20:56:41 -07:00
s390 mm: pass VMA instead of MM to follow_pte() 2024-05-05 17:53:27 -07:00
sh sh/mm/cache: use folio_mapped() in copy_from_user_page() 2024-05-05 17:53:30 -07:00
sparc treewide: use initializer for struct vm_unmapped_area_info 2024-04-25 20:56:27 -07:00
um mm: vmalloc: enable memory allocation profiling 2024-04-25 20:55:57 -07:00
x86 mm/page_table_check: support userfault wr-protect entries 2024-05-05 17:53:41 -07:00
xtensa xtensa/mm: convert check_tlb_entry() to sanity check folios 2024-05-05 17:53:31 -07:00
.gitignore
Kconfig Kconfig: add some hidden tabs on purpose 2024-04-12 10:05:10 -07:00