linux/mm
Mike Kravetz 34d9e35b13 hugetlb: add demote bool to gigantic page routines
The routines remove_hugetlb_page and destroy_compound_gigantic_page will
remove a gigantic page and make the set of base pages ready to be
returned to a lower level allocator.  In the process of doing this, they
make all base pages reference counted.

The routine prep_compound_gigantic_page creates a gigantic page from a
set of base pages.  It assumes that all these base pages are reference
counted.

During demotion, a gigantic page will be split into huge pages of a
smaller size.  This logically involves use of the routines,
remove_hugetlb_page, and destroy_compound_gigantic_page followed by
prep_compound*_page for each smaller huge page.

When pages are reference counted (ref count >= 0), additional
speculative ref counts could be taken as described in previous commits
[1] and [2].  This could result in errors while demoting a huge page.
Quite a bit of code would need to be created to handle all possible
issues.

Instead of dealing with the possibility of speculative ref counts, avoid
the possibility by keeping ref counts at zero during the demote process.
Add a boolean 'demote' to the routines remove_hugetlb_page,
destroy_compound_gigantic_page and prep_compound_gigantic_page.  If the
boolean is set, the remove and destroy routines will not reference count
pages and the prep routine will not expect reference counted pages.

'*_for_demote' wrappers of the routines will be added in a subsequent
patch where this functionality is used.

[1] https://lore.kernel.org/linux-mm/20210622021423.154662-3-mike.kravetz@oracle.com/
[2] https://lore.kernel.org/linux-mm/20210809184832.18342-3-mike.kravetz@oracle.com/

Link: https://lkml.kernel.org/r/20211007181918.136982-5-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev>
Cc: Nghia Le <nghialm78@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:39 -07:00
..
damon mm/damon/core-test: fix wrong expectations for 'damon_split_regions_of()' 2021-10-28 17:18:55 -07:00
kasan kasan: arm64: fix pcpu_page_first_chunk crash with KASAN_VMALLOC 2021-11-06 13:30:37 -07:00
kfence Merge branch 'akpm' (patches from Andrew) 2021-09-08 12:55:35 -07:00
backing-dev.c mm: simplify bdi refcounting 2021-11-06 13:30:34 -07:00
balloon_compaction.c mm: fix typos in comments 2021-05-07 00:26:35 -07:00
bootmem_info.c mm/bootmem_info.c: mark __init on register_page_bootmem_info_section 2021-09-03 09:58:14 -07:00
cleancache.c
cma_debug.c mm/cma: change cma mutex to irq safe spinlock 2021-05-05 11:27:21 -07:00
cma_sysfs.c mm: cma: support sysfs 2021-05-05 11:27:24 -07:00
cma.c mm/cma: add cma_pages_valid to determine if pages are in CMA 2021-11-06 13:30:39 -07:00
cma.h mm: cma: support sysfs 2021-05-05 11:27:24 -07:00
compaction.c Merge branch 'akpm' (patches from Andrew) 2021-09-08 12:55:35 -07:00
debug_page_ref.c
debug_vm_pgtable.c mm: debug_vm_pgtable: don't use __P000 directly 2021-11-06 13:30:33 -07:00
debug.c mm/debug: sync up latest migrate_reason to migrate_reason_names 2021-09-24 16:13:35 -07:00
dmapool.c mm/dmapool: use DEVICE_ATTR_RO macro 2021-06-29 10:53:52 -07:00
early_ioremap.c mm/early_ioremap.c: remove redundant early_ioremap_shutdown() 2021-09-08 11:50:24 -07:00
fadvise.c
failslab.c
filemap.c mm: filemap: coding style cleanup for filemap_map_pmd() 2021-11-06 13:30:38 -07:00
frontswap.c mm/mempool: minor coding style tweaks 2021-05-05 11:27:27 -07:00
gup_test.c selftests/vm: gup_test: test faulting in kernel, and verify pinnable pages 2021-05-05 11:27:26 -07:00
gup_test.h selftests/vm: gup_test: fix test flag 2021-05-05 11:27:26 -07:00
gup.c mm/gup: further simplify __gup_device_huge() 2021-11-06 13:30:34 -07:00
highmem.c mm: in_irq() cleanup 2021-09-08 11:50:24 -07:00
hmm.c mm/hmm: bypass devmap pte when all pfn requested flags are fulfilled 2021-09-08 18:45:52 -07:00
huge_memory.c mm: filemap: check if THP has hwpoisoned subpage for PMD page fault 2021-10-28 17:18:55 -07:00
hugetlb_cgroup.c hugetlb: make free_huge_page irq safe 2021-05-05 11:27:22 -07:00
hugetlb_vmemmap.c mm: hugetlb: introduce CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON 2021-06-30 20:47:26 -07:00
hugetlb_vmemmap.h mm: hugetlb: introduce nr_free_vmemmap_pages in the struct hstate 2021-06-30 20:47:25 -07:00
hugetlb.c hugetlb: add demote bool to gigantic page routines 2021-11-06 13:30:39 -07:00
hwpoison-inject.c mm: hwpoison: don't drop slab caches for offlining non-LRU page 2021-09-03 09:58:15 -07:00
init-mm.c mm: add setup_initial_init_mm() helper 2021-07-08 11:48:21 -07:00
internal.h mm: introduce pmd_install() helper 2021-11-06 13:30:36 -07:00
interval_tree.c mm/interval_tree: add comments to improve code readability 2021-04-30 11:20:38 -07:00
io-mapping.c mm: add a io_mapping_map_user helper 2021-04-30 11:20:39 -07:00
ioremap.c mm: move ioremap_page_range to vmalloc.c 2021-09-08 11:50:24 -07:00
Kconfig mm: disable NUMA_BALANCING_DEFAULT_ENABLED and TRANSPARENT_HUGEPAGE on PREEMPT_RT 2021-11-06 13:30:33 -07:00
Kconfig.debug
khugepaged.c mm: khugepaged: skip huge page collapse for special files 2021-10-28 17:18:55 -07:00
kmemleak.c mm/kmemleak: allow __GFP_NOLOCKDEP passed to kmemleak's gfp 2021-09-08 18:45:53 -07:00
ksm.c mm/ksm: remove old GCC 4.9+ check 2021-09-13 10:18:28 -07:00
list_lru.c mm: list_lru: only add memcg-aware lrus to the global lru list 2021-11-06 13:30:35 -07:00
maccess.c ARM: 9115/1: mm/maccess: fix unaligned copy_{from,to}_kernel_nofault 2021-08-20 11:39:25 +01:00
madvise.c Merge branch 'akpm' (patches from Andrew) 2021-09-03 10:08:28 -07:00
Makefile mm: introduce Data Access MONitor (DAMON) 2021-09-08 11:50:24 -07:00
mapping_dirty_helpers.c mm/mapping_dirty_helpers: remove double Note in kerneldoc 2021-07-01 11:06:02 -07:00
memblock.c memblock: exclude MEMBLOCK_NOMAP regions from kmemleak 2021-10-21 18:30:49 -10:00
memcontrol.c memcg: prohibit unconditional exceeding the limit of dying tasks 2021-11-06 13:30:35 -07:00
memfd.c Reimplement RLIMIT_MEMLOCK on top of ucounts 2021-04-30 14:14:02 -05:00
memory_hotplug.c Merge branch 'akpm' (patches from Andrew) 2021-09-08 12:55:35 -07:00
memory-failure.c mm: hwpoison: handle non-anonymous THP correctly 2021-11-06 13:30:38 -07:00
memory.c mm: remove redundant smp_wmb() 2021-11-06 13:30:36 -07:00
mempolicy.c mm/vmalloc: introduce alloc_pages_bulk_array_mempolicy to accelerate memory allocation 2021-11-06 13:30:37 -07:00
mempool.c kasan: use separate (un)poison implementation for integrated init 2021-06-04 19:32:21 +01:00
memremap.c mm/memory_hotplug: remove nid parameter from arch_remove_memory() 2021-09-08 11:50:23 -07:00
memtest.c
migrate.c mm/migrate: fix CPUHP state to update node demotion order 2021-10-18 20:22:03 -10:00
mincore.c
mlock.c mm: introduce memfd_secret system call to create "secret" memory areas 2021-07-08 11:48:21 -07:00
mm_init.c include/linux/page-flags-layout.h: cleanups 2021-04-30 11:20:42 -07:00
mmap_lock.c mm: mmap_lock: fix disabling preemption directly 2021-07-23 17:43:28 -07:00
mmap.c mm/mmap.c: fix a data race of mm->total_vm 2021-11-06 13:30:35 -07:00
mmu_gather.c mm: eliminate "expecting prototype" kernel-doc warnings 2021-04-16 16:10:36 -07:00
mmu_notifier.c mm/mmu_notifiers: ensure range_end() is paired with range_start() 2021-03-25 09:22:55 -07:00
mmzone.c
mprotect.c mm/mprotect.c: avoid repeated assignment in do_mprotect_pkey() 2021-11-06 13:30:36 -07:00
mremap.c mm/mremap: don't account pages in vma_to_resize() 2021-11-06 13:30:36 -07:00
msync.c mm/msync: exit early when the flags is an MS_ASYNC and start < vm_start 2021-04-30 11:20:37 -07:00
nommu.c Merge tag 'denywrite-for-5.15' of git://github.com/davidhildenbrand/linux 2021-09-04 11:35:47 -07:00
oom_kill.c mm, oom: do not trigger out_of_memory from the #PF 2021-11-06 13:30:35 -07:00
page_alloc.c mm/page_alloc: use clamp() to simplify code 2021-11-06 13:30:38 -07:00
page_counter.c mm: page_counter: mitigate consequences of a page_counter underflow 2021-04-30 11:20:38 -07:00
page_ext.c mm/page_ext.c: fix a comment 2021-11-06 13:30:34 -07:00
page_idle.c mm/idle_page_tracking: make PG_idle reusable 2021-09-08 11:50:24 -07:00
page_io.c swap: fix swapfile read/write offset 2021-03-02 17:25:46 -07:00
page_isolation.c Merge branch 'akpm' (patches from Andrew) 2021-09-08 12:55:35 -07:00
page_owner.c mm: remove pfn_valid_within() and CONFIG_HOLES_IN_ZONE 2021-09-08 11:50:22 -07:00
page_poison.c mm: page_poison: print page info when corruption is caught 2021-04-30 11:20:36 -07:00
page_reporting.c mm/page_reporting: allow driver to specify reporting order 2021-06-29 10:53:47 -07:00
page_reporting.h mm/page_reporting: export reporting order as module parameter 2021-06-29 10:53:47 -07:00
page_vma_mapped.c mm: device exclusive memory access 2021-07-01 11:06:03 -07:00
page-writeback.c Merge branch 'akpm' (patches from Andrew) 2021-09-03 10:08:28 -07:00
pagewalk.c mm: pagewalk: fix walk for hugepage tables 2021-06-29 10:53:49 -07:00
percpu-internal.h Merge branch 'for-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu 2021-07-01 17:17:24 -07:00
percpu-km.c percpu: flush tlb in pcpu_reclaim_populated() 2021-07-04 18:30:17 +00:00
percpu-stats.c percpu: rework memcg accounting 2021-06-05 20:43:15 +00:00
percpu-vm.c percpu: flush tlb in pcpu_reclaim_populated() 2021-07-04 18:30:17 +00:00
percpu.c Merge branch 'akpm' (patches from Andrew) 2021-09-08 12:55:35 -07:00
pgalloc-track.h mm: fix typos in comments 2021-05-07 00:26:35 -07:00
pgtable-generic.c mm/thp: fix __split_huge_pmd_locked() on shmem migration entry 2021-06-16 09:24:42 -07:00
process_vm_access.c mm/process_vm_access.c: remove duplicate include 2021-05-05 11:27:27 -07:00
ptdump.c mm: ptdump: fix build failure 2021-04-16 16:10:37 -07:00
readahead.c mm: Protect operations adding pages to page cache with invalidate_lock 2021-07-13 13:14:27 +02:00
rmap.c Merge branch 'akpm' (patches from Andrew) 2021-09-08 12:55:35 -07:00
rodata_test.c
secretmem.c mm/secretmem: avoid letting secretmem_users drop to zero 2021-10-28 17:18:55 -07:00
shmem.c mm: shmem: don't truncate page if memory failure happens 2021-11-06 13:30:38 -07:00
shuffle.c mm: eliminate "expecting prototype" kernel-doc warnings 2021-04-16 16:10:36 -07:00
shuffle.h mm/shuffle: fix section mismatch warning 2021-05-22 15:09:07 -10:00
slab_common.c mm: slub: move flush_cpu_slab() invocations __free_slab() invocations out of IRQ context 2021-09-04 01:12:23 +02:00
slab.c mm/slab.c: remove useless lines in enable_cpucache() 2021-11-06 13:30:32 -07:00
slab.h mm/memcg: fix NULL pointer dereference in memcg_slab_free_hook() 2021-07-30 10:14:39 -07:00
slob.c mm: Don't build mm_dump_obj() on CONFIG_PRINTK=n kernels 2021-03-08 14:18:46 -08:00
slub.c mm, slub: use prefetchw instead of prefetch 2021-11-06 13:30:33 -07:00
sparse-vmemmap.c mm: remove redundant smp_wmb() 2021-11-06 13:30:36 -07:00
sparse.c mm: introduce memmap_alloc() to unify memory map allocation 2021-09-03 09:58:15 -07:00
swap_cgroup.c
swap_slots.c mm: Replace deprecated CPU-hotplug functions. 2021-08-28 01:46:17 +02:00
swap_state.c Revert "mm: swap: check if swap backing device is congested or not" 2021-08-20 11:31:42 -07:00
swap.c mm: optimise put_pages_list() 2021-11-06 13:30:35 -07:00
swapfile.c mm/swapfile: fix an integer overflow in swap_show() 2021-11-06 13:30:35 -07:00
truncate.c Merge branch 'akpm' (patches from Andrew) 2021-09-03 10:08:28 -07:00
usercopy.c
userfaultfd.c mm: shmem: don't truncate page if memory failure happens 2021-11-06 13:30:38 -07:00
util.c mm: fix uninitialized use in overcommit_policy_handler 2021-09-24 16:13:35 -07:00
vmacache.c
vmalloc.c mm/vmalloc: introduce alloc_pages_bulk_array_mempolicy to accelerate memory allocation 2021-11-06 13:30:37 -07:00
vmpressure.c mm/vmpressure: replace vmpressure_to_css() with vmpressure_to_memcg() 2021-09-03 09:58:17 -07:00
vmscan.c mm,vmscan: fix divide by zero in get_scan_count 2021-09-08 18:45:53 -07:00
vmstat.c mm/page_alloc.c: show watermark_boost of zone in zoneinfo 2021-11-06 13:30:38 -07:00
workingset.c memcg: flush lruvec stats in the refault 2021-09-23 10:09:13 -07:00
z3fold.c mm/z3fold: add kerneldoc fields for z3fold_pool 2021-07-01 11:06:03 -07:00
zbud.c mm/zbud: add kerneldoc fields for zbud_pool 2021-07-01 11:06:03 -07:00
zpool.c mm: fix typos in comments 2021-05-07 00:26:35 -07:00
zsmalloc.c mm/zsmalloc.c: improve readability for async_free_zspage() 2021-07-01 11:06:02 -07:00
zswap.c mm/zswap.c: fix two bugs in zswap_writeback_entry() 2021-06-30 20:47:31 -07:00