linux/Documentation/mm
Ryan Roberts 7a2bc8b34e mm: fix race between __split_huge_pmd_locked() and GUP-fast
commit 3a5a8d343e upstream.

__split_huge_pmd_locked() can be called for a present THP, devmap or
(non-present) migration entry.  It calls pmdp_invalidate() unconditionally
on the pmdp and only determines if it is present or not based on the
returned old pmd.  This is a problem for the migration entry case because
pmd_mkinvalid(), called by pmdp_invalidate(), must only be called for a
present pmd.

On arm64 at least, pmd_mkinvalid() will mark the pmd such that any future
call to pmd_present() will return true.  And therefore any lockless
pgtable walker could see the migration entry pmd in this state and start
interpreting the fields as if it were present, leading to BadThings (TM).
GUP-fast appears to be one such lockless pgtable walker.

x86 does not suffer the above problem, but instead pmd_mkinvalid() will
corrupt the offset field of the swap entry within the swap pte.  See link
below for discussion of that problem.

Fix all of this by only calling pmdp_invalidate() for a present pmd.  And
for good measure let's add a warning to all implementations of
pmdp_invalidate[_ad]().  I've manually reviewed all other
pmdp_invalidate[_ad]() call sites and believe all others to be conformant.
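
The failure mode and the guard can be illustrated with a small userspace
model.  This is only a sketch and not the kernel code: the toy_* names, the
bit layout and the warning counter are all invented for illustration and do
not match any real architecture.  It only shows why invalidating a
non-present entry can make it look present, and how checking for presence
first avoids that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Toy model (assumption): bit 0 means "hardware valid" and bit 1 is a
 * software marker meaning "present but temporarily invalid".  Real pmd
 * layouts differ per architecture.
 */
typedef uint64_t toy_pmd_t;

#define TOY_PMD_VALID        (1ULL << 0)
#define TOY_PMD_INVALID_MARK (1ULL << 1)

/* Counts how often invalidate was called on a non-present entry. */
static int toy_warn_count;

static bool toy_pmd_present(toy_pmd_t pmd)
{
	return (pmd & (TOY_PMD_VALID | TOY_PMD_INVALID_MARK)) != 0;
}

/* Only meaningful for a present entry: clears valid, sets the marker. */
static toy_pmd_t toy_pmd_mkinvalid(toy_pmd_t pmd)
{
	return (pmd & ~TOY_PMD_VALID) | TOY_PMD_INVALID_MARK;
}

/* Invalidate the entry and return the old value; warn on misuse. */
static toy_pmd_t toy_pmdp_invalidate(toy_pmd_t *pmdp)
{
	toy_pmd_t old = *pmdp;

	if (!toy_pmd_present(old))
		toy_warn_count++;	/* stands in for a kernel warning */
	*pmdp = toy_pmd_mkinvalid(old);
	return old;
}

/* Problematic pattern: invalidate first, inspect the old value later. */
static void toy_split_unconditional(toy_pmd_t *pmdp)
{
	(void)toy_pmdp_invalidate(pmdp);
}

/* Guarded pattern: only invalidate an entry that is present. */
static void toy_split_guarded(toy_pmd_t *pmdp)
{
	if (toy_pmd_present(*pmdp)) {
		toy_pmd_t old = toy_pmdp_invalidate(pmdp);

		assert(toy_pmd_present(old));
	}
}
```

In this model, a non-present value such as 0xABCD00 (standing in for a
migration entry) reports as present after toy_split_unconditional(), while
toy_split_guarded() leaves it untouched.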

This is a theoretical bug found during code review.  I don't have any test
case to trigger it in practice.

Link: https://lkml.kernel.org/r/20240501143310.1381675-1-ryan.roberts@arm.com
Link: https://lore.kernel.org/all/0dd7827a-6334-439a-8fd0-43c98e6af22b@arm.com/
Fixes: 84c3fc4e9c ("mm: thp: check pmd migration entry in common path")
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Aneesh Kumar K.V <aneesh.kumar@kernel.org>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16 13:41:38 +02:00
..
damon
active_mm.rst
arch_pgtable_helpers.rst mm: fix race between __split_huge_pmd_locked() and GUP-fast 2024-06-16 13:41:38 +02:00
balance.rst
bootmem.rst
free_page_reporting.rst
frontswap.rst
highmem.rst Documentation/mm: add details about kmap_local_page() and preemption 2022-08-08 18:06:46 -07:00
hmm.rst
hugetlbfs_reserv.rst
hwpoison.rst
index.rst mm: multi-gen LRU: design doc 2022-09-26 19:46:11 -07:00
ksm.rst ksm: add the ksm prefix to the names of the ksm private structures 2022-10-03 14:02:43 -07:00
memory-model.rst
mmu_notifier.rst
multigen_lru.rst mm: multi-gen LRU: rename lrugen->lists[] to lrugen->folios[] 2023-09-19 12:27:54 +02:00
numa.rst
oom.rst
overcommit-accounting.rst
page_allocation.rst
page_cache.rst
page_frags.rst
page_migration.rst
page_owner.rst A handful of relatively simple documentation fixes, plus a set of patches 2022-10-13 10:58:32 -07:00
page_reclaim.rst
page_table_check.rst mm: page_table_check: Make it dependent on EXCLUSIVE_SYSTEM_RAM 2023-06-14 11:15:29 +02:00
page_tables.rst
physical_memory.rst
process_addrs.rst
remap_file_pages.rst
shmfs.rst
slab.rst
slub.rst mm/slub: enable debugging memory wasting of kmalloc 2022-09-23 12:32:45 +02:00
split_page_table_lock.rst
swap.rst
transhuge.rst
unevictable-lru.rst Documentation/mm: modify page_referenced to folio_referenced 2022-09-29 13:16:08 -06:00
vmalloc.rst
vmalloced-kernel-stacks.rst
vmemmap_dedup.rst mm: hugetlb_vmemmap: move code comments to vmemmap_dedup.rst 2022-08-08 18:06:43 -07:00
z3fold.rst
zsmalloc.rst zsmalloc: document freeable stats 2023-04-13 16:55:35 +02:00