linux/mm
David Hildenbrand 503b158fc3 mm/memory_hotplug: initialize memmap of !ZONE_DEVICE with PageOffline() instead of PageReserved()
We currently initialize the memmap such that PG_reserved is set and the
refcount of the page is 1.  In virtio-mem code, we have to manually clear
that PG_reserved flag to make memory offlining with partially hotplugged
memory blocks possible: has_unmovable_pages() would otherwise bail out on
such pages.

We want to avoid PG_reserved where possible and move to typed pages
instead.  We also want to further enlighten memory offlining code
about PG_offline: offline pages in an online memory section.  One example
is handling managed page count adjustments in a cleaner way during memory
offlining.

So let's initialize the pages with PG_offline instead of PG_reserved. 
generic_online_page()->__free_pages_core() will now clear that flag before
handing that memory to the buddy.

Note that the page refcount is still 1 and would forbid offlining of such
memory except when special care is taken during GOING_OFFLINE as currently
only implemented by virtio-mem.

With this change, we can now get non-PageReserved() pages in the XEN
balloon list.  From what I can tell, that can already happen via
decrease_reservation(), so that should be fine.

HV-balloon should not really observe a change: partial online memory
blocks still cannot get surprise-offlined, because the refcount of these
PageOffline() pages is 1.

Update virtio-mem, HV-balloon and XEN-balloon code to be aware that
hotplugged pages are now PageOffline() instead of PageReserved() before
they are handed over to the buddy.

We'll leave the ZONE_DEVICE case alone for now.

Note that self-hosted vmemmap pages will no longer be marked as
reserved.  This matches ordinary vmemmap pages allocated from the buddy
during memory hotplug.  Now, only vmemmap pages allocated from
memblock during early boot will be marked reserved.  Existing
PageReserved() checks seem to be handling all relevant cases correctly
even after this change.

Link: https://lkml.kernel.org/r/20240607090939.89524-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Oscar Salvador <osalvador@suse.de> [generic memory-hotplug bits]
Cc: Alexander Potapenko <glider@google.com>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Eugenio Pérez <eperezma@redhat.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Marco Elver <elver@google.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-03 19:30:18 -07:00
damon mm/damon/lru_sort: remove unnecessary online tuning handling code 2024-07-03 19:30:15 -07:00
kasan kasan: fix bad call to unpoison_slab_object 2024-06-24 20:52:09 -07:00
kfence mm/kfence: add MODULE_DESCRIPTION() 2024-07-03 19:29:58 -07:00
kmsan mm: pass meminit_context to __free_pages_core() 2024-07-03 19:30:18 -07:00
backing-dev.c writeback: support retrieving per group debug writeback stats of bdi 2024-05-05 17:53:51 -07:00
balloon_compaction.c mm: remove MIGRATE_SYNC_NO_COPY mode 2024-07-03 19:30:00 -07:00
bootmem_info.c bootmem: use kmemleak_free_part_phys in put_page_bootmem 2023-10-25 16:47:13 -07:00
cma_debug.c
cma_sysfs.c mm/cma: add sysfs file 'release_pages_success' 2024-02-22 10:24:57 -08:00
cma.c mm/cma: drop incorrect alignment check in cma_init_reserved_mem 2024-04-25 20:56:42 -07:00
cma.h mm/cma: add sysfs file 'release_pages_success' 2024-02-22 10:24:57 -08:00
compaction.c mm: handle profiling for fake memory allocations during compaction 2024-06-24 20:52:09 -07:00
debug_page_alloc.c mm: page_alloc: consolidate free page accounting 2024-04-25 20:56:04 -07:00
debug_page_ref.c
debug_vm_pgtable.c mm/debug_vm_pgtable: drop RANDOM_ORVALUE trick 2024-06-15 10:43:08 -07:00
debug.c mm/debug: print only page mapcount (excluding folio entire mapcount) in __dump_folio() 2024-05-05 17:53:31 -07:00
dmapool_test.c mm/dmapool: add MODULE_DESCRIPTION() 2024-07-03 19:29:58 -07:00
dmapool.c mm/mempool/dmapool: remove CONFIG_DEBUG_SLAB ifdefs 2023-12-05 11:17:58 +01:00
early_ioremap.c
execmem.c mm/execmem, arch: convert remaining overrides of module_alloc to execmem 2024-05-14 00:31:43 -07:00
fadvise.c
fail_page_alloc.c
failslab.c
filemap.c mm/filemap: reinitialize folio->_mapcount directly 2024-07-03 19:30:17 -07:00
folio-compat.c mm: remove page_mapping() 2024-07-03 19:29:59 -07:00
gup_test.c
gup_test.h
gup.c mm: remove page_mkclean() 2024-07-03 19:30:17 -07:00
highmem.c mm/highmem: make nr_free_highpages() return "unsigned long" 2024-07-03 19:30:06 -07:00
hmm.c mm/treewide: replace pXd_huge() with pXd_leaf() 2024-04-25 20:55:46 -07:00
huge_memory.c mm/huge_memory.c: fix used-uninitialized 2024-07-03 19:30:16 -07:00
hugetlb_cgroup.c mm/hugetlb_cgroup: switch to the new cftypes 2024-07-03 19:30:10 -07:00
hugetlb_vmemmap.c mm: report per-page metadata information 2024-07-03 19:30:09 -07:00
hugetlb_vmemmap.h mm: hugetlb_vmemmap: fix reference to nonexistent file 2023-10-25 16:47:14 -07:00
hugetlb.c mm/hugetlb: guard dequeue_hugetlb_folio_nodemask against NUMA_NO_NODE uses 2024-07-03 19:30:10 -07:00
hwpoison-inject.c mm/hwpoison: add MODULE_DESCRIPTION() 2024-07-03 19:29:58 -07:00
init-mm.c mm: Deprecate pasid field 2023-12-12 10:11:32 +01:00
internal.h mm: pass meminit_context to __free_pages_core() 2024-07-03 19:30:18 -07:00
interval_tree.c
io-mapping.c
ioremap.c mm: ioremap: remove unneeded ioremap_allowed and iounmap_allowed 2023-08-18 10:12:36 -07:00
Kconfig mm/zsmalloc: use a proper page type 2024-07-03 19:30:16 -07:00
Kconfig.debug mm/slub: unify all sl[au]b parameters with "slab_$param" 2024-01-22 10:31:08 +01:00
khugepaged.c khugepaged: simplify the allocation of slab caches 2024-07-03 19:30:15 -07:00
kmemleak.c mm: lift gfp_kmemleak_mask() to gfp.h 2024-05-19 14:40:44 -07:00
ksm.c mm: ksm: drop KSM_KMEM_CACHE() 2024-07-03 19:30:15 -07:00
list_lru.c mm/zswap: stop lru list shrinking when encounter warm region 2024-02-22 10:24:54 -08:00
maccess.c
madvise.c mm/madvise: add MF_ACTION_REQUIRED to madvise(MADV_HWPOISON) 2024-07-03 19:29:57 -07:00
Makefile mseal: add mseal syscall 2024-05-23 19:40:26 -07:00
mapping_dirty_helpers.c mm: fix clean_record_shared_mapping_range kernel-doc 2023-08-24 16:20:30 -07:00
memblock.c memblock: use numa_valid_node() helper to check for invalid node ID 2024-06-16 10:17:57 +03:00
memcontrol.c mm: memcontrol: add VM_BUG_ON_FOLIO() to catch lru folio in mem_cgroup_migrate() 2024-07-03 19:30:13 -07:00
memfd.c mm/memfd: refactor memfd_tag_pins() and memfd_wait_for_pins() 2024-03-04 17:01:21 -08:00
memory_hotplug.c mm/memory_hotplug: initialize memmap of !ZONE_DEVICE with PageOffline() instead of PageReserved() 2024-07-03 19:30:18 -07:00
memory-failure.c mm/memory-failure: correct comment in me_swapcache_dirty 2024-07-03 19:30:12 -07:00
memory-tiers.c memory tier: create CPUless memory tiers after obtaining HMAT info 2024-05-05 17:53:26 -07:00
memory.c mm: set pte writable while pte_soft_dirty() is true in do_swap_page() 2024-07-03 19:30:07 -07:00
mempolicy.c mm: mempolicy: use folio_alloc_mpol() in alloc_migration_target_by_mpol() 2024-07-03 19:29:53 -07:00
mempool.c mm: fix xyz_noprof functions calling profiled functions 2024-06-05 19:19:26 -07:00
memremap.c mm: convert put_devmap_managed_page_refs() to put_devmap_managed_folio_refs() 2024-05-05 17:53:49 -07:00
memtest.c memtest: use {READ,WRITE}_ONCE in memory scanning 2024-03-13 12:12:21 -07:00
migrate_device.c mm: migrate_device: unify migrate folio for MIGRATE_SYNC_NO_COPY 2024-07-03 19:30:00 -07:00
migrate.c mm: remove MIGRATE_SYNC_NO_COPY mode 2024-07-03 19:30:00 -07:00
mincore.c mm/swap: reduce swap cache search space 2024-07-03 19:29:56 -07:00
mlock.c mm/mlock: implement folio_mlock_step() using folio_pte_batch() 2024-07-03 19:30:09 -07:00
mm_init.c mm/memory_hotplug: initialize memmap of !ZONE_DEVICE with PageOffline() instead of PageReserved() 2024-07-03 19:30:18 -07:00
mm_slot.h
mmap_lock.c
mmap.c mm: batch unlink_file_vma calls in free_pgd_range 2024-07-03 19:29:58 -07:00
mmu_gather.c mm/mmu_gather: improve cond_resched() handling with large folios and expensive page freeing 2024-02-22 15:27:17 -08:00
mmu_notifier.c mmu_notifier: remove the .change_pte() callback 2024-04-11 13:18:36 -04:00
mmzone.c zswap: shrink zswap pool based on memory pressure 2023-12-12 10:57:02 -08:00
mprotect.c mm: introduce pmd|pte_needs_soft_dirty_wp helpers for softdirty write-protect 2024-07-03 19:30:07 -07:00
mremap.c mm: remove page_mkclean() 2024-07-03 19:30:17 -07:00
mseal.c mseal: add mseal syscall 2024-05-23 19:40:26 -07:00
msync.c
nommu.c The usual shower of singleton fixes and minor series all over MM, 2024-05-19 09:21:03 -07:00
oom_kill.c memory: remove the now superfluous sentinel element from ctl_table array 2024-04-25 20:56:32 -07:00
page_alloc.c mm/memory_hotplug: initialize memmap of !ZONE_DEVICE with PageOffline() instead of PageReserved() 2024-07-03 19:30:18 -07:00
page_counter.c
page_ext.c mm: report per-page metadata information 2024-07-03 19:30:09 -07:00
page_idle.c
page_io.c mm: zswap: handle incorrect attempts to load large folios 2024-07-03 19:30:09 -07:00
page_isolation.c mm: page_isolation: prepare for hygienic freelists 2024-04-25 20:56:04 -07:00
page_owner.c mm/page-owner: use gfp_nested_mask() instead of open coded masking 2024-05-19 14:40:44 -07:00
page_poison.c mm/page_poison: replace kmap_atomic() with kmap_local_page() 2023-12-10 16:51:50 -08:00
page_reporting.c mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER 2024-01-08 15:27:15 -08:00
page_reporting.h
page_table_check.c mm/page_table_check: fix crash on ZONE_DEVICE 2024-06-15 10:43:04 -07:00
page_vma_mapped.c mm: make page_mapped_in_vma conditional on CONFIG_MEMORY_FAILURE 2024-05-05 17:53:45 -07:00
page-writeback.c mm: avoid overflows in dirty throttling logic 2024-07-03 19:30:15 -07:00
pagewalk.c mm: pagewalk: assert write mmap lock only for walking the user page tables 2023-12-10 16:51:53 -08:00
percpu-internal.h mm: percpu: add codetag reference into pcpuobj_ext 2024-04-25 20:55:56 -07:00
percpu-km.c
percpu-stats.c
percpu-vm.c percpu: clean up all mappings when pcpu_map_pages() fails 2024-04-25 20:55:49 -07:00
percpu.c mm: percpu: enable per-cpu allocation tagging 2024-04-25 20:55:56 -07:00
pgalloc-track.h
pgtable-generic.c mm: fix race between __split_huge_pmd_locked() and GUP-fast 2024-05-07 10:37:00 -07:00
process_vm_access.c mm: fix process_vm_rw page counts 2023-12-10 16:51:39 -08:00
ptdump.c mm: ptdump: add check_wx_pages debugfs attribute 2024-02-22 10:24:47 -08:00
readahead.c The usual shower of singleton fixes and minor series all over MM, 2024-05-19 09:21:03 -07:00
rmap.c mm/vmscan: avoid split lazyfree THP during shrink_folio_list() 2024-07-03 19:30:08 -07:00
rodata_test.c
secretmem.c mm/secretmem: use a folio in secretmem_fault() 2023-08-21 13:38:02 -07:00
shmem_quota.c tmpfs: fix race on handling dquot rbtree 2024-03-26 11:07:23 -07:00
shmem.c mm: shmem: add mTHP counters for anonymous shmem 2024-07-03 19:30:04 -07:00
show_mem.c lib: add memory allocations report in show_mem() 2024-04-25 20:55:57 -07:00
shrinker_debug.c mm: shrinker: convert shrinker_rwsem to mutex 2023-10-04 10:32:26 -07:00
shrinker.c mm: shrinker: use kvzalloc_node() from expand_one_shrinker_info() 2024-01-05 09:58:32 -08:00
shuffle.c
shuffle.h mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER 2024-01-08 15:27:15 -08:00
slab_common.c The usual shower of singleton fixes and minor series all over MM, 2024-05-19 09:21:03 -07:00
slab.h The usual shower of singleton fixes and minor series all over MM, 2024-05-19 09:21:03 -07:00
slub.c mm/slab: fix 'variable obj_exts set but not used' warning 2024-06-24 20:52:09 -07:00
sparse-vmemmap.c mm: report per-page metadata information 2024-07-03 19:30:09 -07:00
sparse.c mm: report per-page metadata information 2024-07-03 19:30:09 -07:00
swap_cgroup.c
swap_slots.c mm: swap: update get_swap_pages() to take folio order 2024-04-25 20:56:37 -07:00
swap_state.c mm: swap: remove 'synchronous' argument to swap_read_folio() 2024-07-03 19:30:06 -07:00
swap.c mm: add kernel-doc for folio_mark_accessed() 2024-05-05 17:53:50 -07:00
swap.h mm: swap: remove 'synchronous' argument to swap_read_folio() 2024-07-03 19:30:06 -07:00
swapfile.c mm: remove the implementation of swap_free() and always use swap_free_nr() 2024-07-03 19:30:01 -07:00
truncate.c mm/vmscan: update stale references to shrink_page_list 2024-07-03 19:29:52 -07:00
usercopy.c
userfaultfd.c mm: userfaultfd: use swap() in double_pt_lock() 2024-07-03 19:30:03 -07:00
util.c hardening fixes for v6.10-rc5 2024-06-17 12:00:22 -07:00
vmalloc.c mm/vmalloc: use __this_cpu_try_cmpxchg() in preload_this_cpu_lock() 2024-07-03 19:30:02 -07:00
vmpressure.c eventfd: simplify eventfd_signal() 2023-11-28 14:08:38 +01:00
vmscan.c mm: rename alloc_demote_folio to alloc_migrate_folio 2024-07-03 19:30:12 -07:00
vmstat.c mm: report per-page metadata information 2024-07-03 19:30:09 -07:00
workingset.c mm: cleanup WORKINGSET_NODES in workingset 2024-05-07 10:36:59 -07:00
z3fold.c mm: zpool: return pool size in pages 2024-04-25 20:55:48 -07:00
zbud.c mm: zpool: return pool size in pages 2024-04-25 20:55:48 -07:00
zpool.c mm: zpool: return pool size in pages 2024-04-25 20:55:48 -07:00
zsmalloc.c mm/zsmalloc: use a proper page type 2024-07-03 19:30:16 -07:00
zswap.c mm: zswap: handle incorrect attempts to load large folios 2024-07-03 19:30:09 -07:00