linux/mm
Zi Yan 7491f3f348 mm/rmap: do not add fully unmapped large folio to deferred split list
In __folio_remove_rmap(), a large folio is added to the deferred split list
if any page in the folio loses its final mapping.  But the folio may be
fully unmapped, in which case adding it to the deferred split list is
unnecessary.

For PMD-mapped THPs, that was not really an issue, because removing the
last PMD mapping in the absence of PTE mappings would not have added the
folio to the deferred split queue.

However, PTE-mapped THPs, which are now more prominent due to mTHP, are
always added to the deferred split queue.  One side effect is that the
THP_DEFERRED_SPLIT_PAGE stat for a PTE-mapped folio can be unintentionally
increased, making it look like there are many partially mapped folios,
although each folio is in fact being fully unmapped stepwise.

Core-mm now tries to batch-unmap consecutive PTEs of PTE-mapped THPs where
possible, starting from commit b06dc281aa ("mm/rmap: introduce
folio_remove_rmap_[pte|ptes|pmd]()").  When that happens, a whole PTE-mapped
folio is unmapped in one go and can avoid being added to the deferred split
list, reducing the THP_DEFERRED_SPLIT_PAGE noise.  But there will still be
noise when a complete PTE-mapped folio cannot be batch-unmapped in one go,
or where this type of batching is not implemented yet, e.g., migration.

To avoid the unnecessary addition, folio->_nr_pages_mapped is checked to
tell whether the whole folio is unmapped.  A folio that is already on the
deferred split list is skipped, too.
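
The check described above can be illustrated with a small user-space model. This is a sketch under stated assumptions, not the kernel code: struct folio_model, its fields, and unmap_ptes_should_queue() are hypothetical names invented for illustration, the real counter is an atomic, and every removed PTE is assumed to be the final mapping of its page.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state the commit message refers to. */
struct folio_model {
	int nr_pages_mapped;   /* models folio->_nr_pages_mapped */
	bool on_deferred_list; /* models "already on the deferred split list" */
};

/*
 * Remove nr_unmap PTE mappings from the model folio. Each one is assumed
 * to be the final mapping of its page. Returns true if the folio would be
 * queued for deferred split by this call.
 */
static bool unmap_ptes_should_queue(struct folio_model *f, int nr_unmap)
{
	assert(nr_unmap >= 0 && nr_unmap <= f->nr_pages_mapped);
	f->nr_pages_mapped -= nr_unmap;

	/* Partially mapped: some page was unmapped and some page is still mapped. */
	bool partially_mapped = nr_unmap > 0 && f->nr_pages_mapped > 0;

	/* Skip fully unmapped folios and folios that are already queued. */
	if (!partially_mapped || f->on_deferred_list)
		return false;

	f->on_deferred_list = true;
	return true;
}
```

In this model, unmapping all 512 pages of an order-9 folio in one call never queues it, while unmapping a single page first does queue it once, which corresponds to the remaining noise mentioned for the non-batched paths.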

Note: commit 98046944a159 ("mm: huge_memory: add the missing
folio_test_pmd_mappable() for THP split statistics") tried to exclude mTHP
deferred split stats from THP_DEFERRED_SPLIT_PAGE, but it does not fix the
above issue.  A fully unmapped PTE-mapped order-9 THP was still added to
deferred split list and counted as THP_DEFERRED_SPLIT_PAGE, since nr is
512 (non zero), level is RMAP_LEVEL_PTE, and inside deferred_split_folio()
the order-9 folio is folio_test_pmd_mappable().
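
The reasoning in the note can be traced with a small user-space model of the behaviour before this change. It is a sketch reconstructed from the description above, not the kernel source: the enum, the helper names, and PMD_ORDER_MODEL (which assumes 4 KiB base pages and a 2 MiB PMD, so order 9) are all hypothetical.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for RMAP_LEVEL_PTE and RMAP_LEVEL_PMD. */
enum rmap_level_model { LEVEL_PTE, LEVEL_PMD };

/* Assumption: 4 KiB base pages and a 2 MiB PMD, so a PMD maps an order-9 folio. */
#define PMD_ORDER_MODEL 9

/*
 * Old queueing decision as the note describes it: at PTE level, any
 * nonzero nr queued the folio, even if nothing remained mapped. At PMD
 * level, the folio was queued only if some pages were still mapped.
 */
static bool old_should_queue(enum rmap_level_model level, int nr, int still_mapped)
{
	if (nr == 0)
		return false;
	if (level == LEVEL_PTE)
		return true;
	return still_mapped > 0;
}

/* Models the folio_test_pmd_mappable() gate on THP_DEFERRED_SPLIT_PAGE. */
static bool counts_as_thp_deferred_split(int folio_order)
{
	return folio_order >= PMD_ORDER_MODEL;
}
```

With nr = 512 at PTE level and nothing left mapped, old_should_queue() still returns true, and an order-9 folio passes the statistics gate, so the fully unmapped folio was counted.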

Link: https://lkml.kernel.org/r/20240502132852.862138-1-zi.yan@sent.com
Signed-off-by: Zi Yan <ziy@nvidia.com>
Suggested-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Barry Song <baohua@kernel.org>
Reviewed-by: Lance Yang <ioworker0@gmail.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-05 17:53:56 -07:00
damon mm/damon/paddr: implement DAMOS filter type YOUNG 2024-05-05 17:53:55 -07:00
kasan fix missing vmalloc.h includes 2024-04-25 20:55:49 -07:00
kfence mm: introduce slabobj_ext to support slab object extensions 2024-04-25 20:55:51 -07:00
kmsan mm: kmsan: remove runtime checks from kmsan_unpoison_memory() 2024-02-22 10:24:41 -08:00
backing-dev.c writeback: support retrieving per group debug writeback stats of bdi 2024-05-05 17:53:51 -07:00
balloon_compaction.c
bootmem_info.c bootmem: use kmemleak_free_part_phys in put_page_bootmem 2023-10-25 16:47:13 -07:00
cma_debug.c
cma_sysfs.c mm/cma: add sysfs file 'release_pages_success' 2024-02-22 10:24:57 -08:00
cma.c mm/cma: drop incorrect alignment check in cma_init_reserved_mem 2024-04-25 20:56:42 -07:00
cma.h mm/cma: add sysfs file 'release_pages_success' 2024-02-22 10:24:57 -08:00
compaction.c memory: remove the now superfluous sentinel element from ctl_table array 2024-04-25 20:56:32 -07:00
debug_page_alloc.c mm: page_alloc: consolidate free page accounting 2024-04-25 20:56:04 -07:00
debug_page_ref.c
debug_vm_pgtable.c fix missing vmalloc.h includes 2024-04-25 20:55:49 -07:00
debug.c mm/debug: print only page mapcount (excluding folio entire mapcount) in __dump_folio() 2024-05-05 17:53:31 -07:00
dmapool_test.c
dmapool.c mm/mempool/dmapool: remove CONFIG_DEBUG_SLAB ifdefs 2023-12-05 11:17:58 +01:00
early_ioremap.c
fadvise.c
fail_page_alloc.c
failslab.c
filemap.c mm: filemap: batch mm counter updating in filemap_map_pages() 2024-05-05 17:53:36 -07:00
folio-compat.c mm: remove __set_page_dirty_nobuffers() 2024-04-25 20:56:25 -07:00
gup_test.c
gup_test.h
gup.c gup: use folios for gup_devmap 2024-05-05 17:53:49 -07:00
highmem.c x86/kexec: use pr_err() instead of kexec_dprintk() when an error occurs 2023-12-29 12:22:28 -08:00
hmm.c mm/treewide: replace pXd_huge() with pXd_leaf() 2024-04-25 20:55:46 -07:00
huge_memory.c mm: delay the check for a NULL anon_vma 2024-05-05 17:53:53 -07:00
hugetlb_cgroup.c mm/hugetlb: assert hugetlb_lock in __hugetlb_cgroup_commit_charge 2024-05-05 17:53:41 -07:00
hugetlb_vmemmap.c memory: remove the now superfluous sentinel element from ctl_table array 2024-04-25 20:56:32 -07:00
hugetlb_vmemmap.h mm: hugetlb_vmemmap: fix reference to nonexistent file 2023-10-25 16:47:14 -07:00
hugetlb.c mm: convert hugetlb_page_mapping_lock_write to folio 2024-05-05 17:53:46 -07:00
hwpoison-inject.c mm/memory-failure: convert shake_page() to shake_folio() 2024-05-05 17:53:45 -07:00
init-mm.c mm: Deprecate pasid field 2023-12-12 10:11:32 +01:00
internal.h mm/memory-failure: convert shake_page() to shake_folio() 2024-05-05 17:53:45 -07:00
interval_tree.c
io-mapping.c
ioremap.c
Kconfig mm/treewide: rename CONFIG_HAVE_FAST_GUP to CONFIG_HAVE_GUP_FAST 2024-04-25 20:56:41 -07:00
Kconfig.debug mm/slub: unify all sl[au]b parameters with "slab_$param" 2024-01-22 10:31:08 +01:00
khugepaged.c mm: simplify thp_vma_allowable_order 2024-05-05 17:53:53 -07:00
kmemleak.c mm/kmemleak: compact kmemleak_object further 2024-04-25 20:56:05 -07:00
ksm.c mm/memory-failure: pass the folio to collect_procs_ksm() 2024-05-05 17:53:47 -07:00
list_lru.c mm/zswap: stop lru list shrinking when encounter warm region 2024-02-22 10:24:54 -08:00
maccess.c
madvise.c mm/madvise: optimize lazyfreeing with mTHP in madvise_free 2024-05-05 17:53:43 -07:00
Makefile mm/kmemleak: disable KASAN instrumentation in kmemleak 2024-04-25 20:56:05 -07:00
mapping_dirty_helpers.c
memblock.c cxl fixes for 6.8-rc6 2024-02-24 15:53:40 -08:00
memcontrol.c memcg: fix data-race KCSAN bug in rstats 2024-05-05 17:53:50 -07:00
memfd.c mm/memfd: refactor memfd_tag_pins() and memfd_wait_for_pins() 2024-03-04 17:01:21 -08:00
memory_hotplug.c mm/hugetlb: rename dissolve_free_huge_pages() to dissolve_free_hugetlb_folios() 2024-05-05 17:53:35 -07:00
memory-failure.c memory-failure: remove calls to page_mapping() 2024-05-05 17:53:48 -07:00
memory-tiers.c memory tier: create CPUless memory tiers after obtaining HMAT info 2024-05-05 17:53:26 -07:00
memory.c mm: optimise vmf_anon_prepare() for VMAs without an anon_vma 2024-05-05 17:53:54 -07:00
mempolicy.c mm: add pmd_folio() 2024-04-25 20:56:19 -07:00
mempool.c mempool: hook up to memory allocation profiling 2024-04-25 20:55:56 -07:00
memremap.c mm: convert put_devmap_managed_page_refs() to put_devmap_managed_folio_refs() 2024-05-05 17:53:49 -07:00
memtest.c memtest: use {READ,WRITE}_ONCE in memory scanning 2024-03-13 12:12:21 -07:00
migrate_device.c migrate: expand the use of folio in __migrate_device_pages() 2024-05-05 17:53:48 -07:00
migrate.c mm: convert hugetlb_page_mapping_lock_write to folio 2024-05-05 17:53:46 -07:00
mincore.c
mlock.c mm: add pmd_folio() 2024-04-25 20:56:19 -07:00
mm_init.c mm: init_mlocked_on_free_v3 2024-04-25 20:56:29 -07:00
mm_slot.h
mmap_lock.c
mmap.c mm/mmap: make accountable_mapping return bool 2024-05-05 17:53:26 -07:00
mmu_gather.c mm/mmu_gather: improve cond_resched() handling with large folios and expensive page freeing 2024-02-22 15:27:17 -08:00
mmu_notifier.c
mmzone.c zswap: shrink zswap pool based on memory pressure 2023-12-12 10:57:02 -08:00
mprotect.c mm: support multi-size THP numa balancing 2024-04-25 20:56:30 -07:00
mremap.c mm: remove "prot" parameter from move_pte() 2024-04-25 20:56:24 -07:00
msync.c
nommu.c mm: remove follow_pfn 2024-04-25 20:56:12 -07:00
oom_kill.c memory: remove the now superfluous sentinel element from ctl_table array 2024-04-25 20:56:32 -07:00
page_alloc.c mm: page_alloc: allowing mTHP compaction to capture the freed page directly 2024-05-05 17:53:37 -07:00
page_counter.c
page_ext.c mm: make page_ext_get() take a const argument 2024-04-25 20:56:14 -07:00
page_idle.c
page_io.c mm: add per-order mTHP anon_swpout and anon_swpout_fallback counters 2024-05-05 17:53:35 -07:00
page_isolation.c mm: page_isolation: prepare for hygienic freelists 2024-04-25 20:56:04 -07:00
page_owner.c mm: introduce slabobj_ext to support slab object extensions 2024-04-25 20:55:51 -07:00
page_poison.c mm/page_poison: replace kmap_atomic() with kmap_local_page() 2023-12-10 16:51:50 -08:00
page_reporting.c mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER 2024-01-08 15:27:15 -08:00
page_reporting.h
page_table_check.c mm/page_table_check: support userfault wr-protect entries 2024-05-05 17:53:41 -07:00
page_vma_mapped.c mm: make page_mapped_in_vma conditional on CONFIG_MEMORY_FAILURE 2024-05-05 17:53:45 -07:00
page-writeback.c mm: remove stale comment __folio_mark_dirty 2024-05-05 17:53:53 -07:00
pagewalk.c mm: pagewalk: assert write mmap lock only for walking the user page tables 2023-12-10 16:51:53 -08:00
percpu-internal.h mm: percpu: add codetag reference into pcpuobj_ext 2024-04-25 20:55:56 -07:00
percpu-km.c
percpu-stats.c
percpu-vm.c percpu: clean up all mappings when pcpu_map_pages() fails 2024-04-25 20:55:49 -07:00
percpu.c mm: percpu: enable per-cpu allocation tagging 2024-04-25 20:55:56 -07:00
pgalloc-track.h
pgtable-generic.c
process_vm_access.c mm: fix process_vm_rw page counts 2023-12-10 16:51:39 -08:00
ptdump.c mm: ptdump: add check_wx_pages debugfs attribute 2024-02-22 10:24:47 -08:00
readahead.c mm/readahead: break read-ahead loop if filemap_add_folio return -ENOMEM 2024-04-25 20:56:07 -07:00
rmap.c mm/rmap: do not add fully unmapped large folio to deferred split list 2024-05-05 17:53:56 -07:00
rodata_test.c
secretmem.c
shmem_quota.c tmpfs: fix race on handling dquot rbtree 2024-03-26 11:07:23 -07:00
shmem.c mm: switch mm->get_unmapped_area() to a flag 2024-04-25 20:56:25 -07:00
show_mem.c lib: add memory allocations report in show_mem() 2024-04-25 20:55:57 -07:00
shrinker_debug.c mm: shrinker: convert shrinker_rwsem to mutex 2023-10-04 10:32:26 -07:00
shrinker.c mm: shrinker: use kvzalloc_node() from expand_one_shrinker_info() 2024-01-05 09:58:32 -08:00
shuffle.c
shuffle.h mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER 2024-01-08 15:27:15 -08:00
slab_common.c mm/slab: enable slab allocation tagging for kmalloc and friends 2024-04-25 20:55:55 -07:00
slab.h memcg: simple cleanup of stats update functions 2024-05-05 17:53:44 -07:00
slub.c mm, slab: move slab_memcg hooks to mm/memcontrol.c 2024-04-25 20:56:16 -07:00
sparse-vmemmap.c
sparse.c mm/sparse: guard the size of mem_section is power of 2 2024-05-05 17:53:40 -07:00
swap_cgroup.c
swap_slots.c mm: swap: update get_swap_pages() to take folio order 2024-04-25 20:56:37 -07:00
swap_state.c mm: remove struct page from get_shadow_from_swap_cache 2024-04-25 20:56:40 -07:00
swap.c mm: add kernel-doc for folio_mark_accessed() 2024-05-05 17:53:50 -07:00
swap.h mm/swap: fix race when skipping swapcache 2024-02-20 14:20:48 -08:00
swapfile.c mm: swapfile: check usable swap device in __folio_throttle_swaprate() 2024-05-05 17:53:42 -07:00
truncate.c mm: convert pagecache_isize_extended to use a folio 2024-04-25 20:56:43 -07:00
usercopy.c
userfaultfd.c mm: fix some minor per-VMA lock issues in userfaultfd 2024-05-05 17:53:54 -07:00
util.c mm: switch mm->get_unmapped_area() to a flag 2024-04-25 20:56:25 -07:00
vmalloc.c mm: vmalloc: dump page owner info if page is already mapped 2024-05-05 17:53:51 -07:00
vmpressure.c eventfd: simplify eventfd_signal() 2023-11-28 14:08:38 +01:00
vmscan.c mm: add per-order mTHP anon_swpout and anon_swpout_fallback counters 2024-05-05 17:53:35 -07:00
vmstat.c mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER 2024-01-08 15:27:15 -08:00
workingset.c mm: move mapping_set_update out of <linux/swap.h> 2024-02-21 11:36:50 +05:30
z3fold.c mm: zpool: return pool size in pages 2024-04-25 20:55:48 -07:00
zbud.c mm: zpool: return pool size in pages 2024-04-25 20:55:48 -07:00
zpool.c mm: zpool: return pool size in pages 2024-04-25 20:55:48 -07:00
zsmalloc.c mm: zpool: return pool size in pages 2024-04-25 20:55:48 -07:00
zswap.c mm: zswap: remove same_filled module params 2024-05-05 17:53:38 -07:00