linux/mm
Huang Ying 5b85593709 migrate_pages: organize stats with struct migrate_pages_stats
Patch series "migrate_pages(): batch TLB flushing", v5.

Now, migrate_pages() migrates folios one by one, like the fake code as
follows,

  for each folio
    unmap
    flush TLB
    copy
    restore map

If multiple folios are passed to migrate_pages(), there are opportunities
to batch the TLB flushing and copying.  That is, we can change the code to
something as follows,

  for each folio
    unmap
  for each folio
    flush TLB
  for each folio
    copy
  for each folio
    restore map

The total number of TLB flushing IPI can be reduced considerably.  And we
may use some hardware accelerator such as DSA to accelerate the folio
copying.

So in this patch, we refactor the migrate_pages() implementation and
implement the TLB flushing batching.  Base on this, hardware accelerated
folio copying can be implemented.

If too many folios are passed to migrate_pages(), in the naive batched
implementation, we may unmap too many folios at the same time.  The
possibility for a task to wait for the migrated folios to be mapped again
increases.  So the latency may be hurt.  To deal with this issue, the max
number of folios be unmapped in batch is restricted to no more than
HPAGE_PMD_NR in the unit of page.  That is, the influence is at the same
level of THP migration.

We use the following test to measure the performance impact of the
patchset,

On a 2-socket Intel server,

 - Run pmbench memory accessing benchmark

 - Run `migratepages` to migrate pages of pmbench between node 0 and
   node 1 back and forth.

With the patch, the TLB flushing IPI reduces 99.1% during the test and
the number of pages migrated successfully per second increases 291.7%.

Xin Hao helped to test the patchset on an ARM64 server with 128 cores,
2 NUMA nodes.  Test results show that the page migration performance
increases up to 78%.


This patch (of 9):

Define struct migrate_pages_stats to organize the various statistics in
migrate_pages().  This makes it easier to collect and consume the
statistics in multiple functions.  This will be needed in the following
patches in the series.

Link: https://lkml.kernel.org/r/20230213123444.155149-1-ying.huang@intel.com
Link: https://lkml.kernel.org/r/20230213123444.155149-2-ying.huang@intel.com
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Xin Hao <xhao@linux.alibaba.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Bharata B Rao <bharata@amd.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16 20:43:52 -08:00
..
damon mm/damon/dbgfs: print DAMON debugfs interface deprecation message 2023-02-13 15:54:33 -08:00
kasan lib/stacktrace, kasan, kmsan: rework extra_bits interface 2023-02-16 20:43:51 -08:00
kfence hardening updates for v6.2-rc1 2022-12-14 12:20:00 -08:00
kmsan lib/stacktrace, kasan, kmsan: rework extra_bits interface 2023-02-16 20:43:51 -08:00
backing-dev.c mm: add /sys/class/bdi/<bdi>/min_ratio_fine knob 2022-11-30 15:59:06 -08:00
balloon_compaction.c mm: Convert all PageMovable users to movable_operations 2022-08-02 12:34:03 -04:00
bootmem_info.c bootmem: remove the vmemmap pages from kmemleak in put_page_bootmem 2022-08-28 14:02:45 -07:00
cma_debug.c mm/cma_debug: show complete cma name in debugfs directories 2022-09-11 20:25:50 -07:00
cma_sysfs.c
cma.c mm/cma: fix potential memory loss on cma_declare_contiguous_nid 2023-02-02 22:33:24 -08:00
cma.h
compaction.c mm, compaction: finish pageblocks on complete migration failure 2023-02-02 22:33:30 -08:00
debug_page_ref.c
debug_vm_pgtable.c mm: remove __HAVE_ARCH_PTE_SWP_EXCLUSIVE 2023-02-02 22:33:11 -08:00
debug.c mm: export dump_mm() 2023-02-09 16:51:40 -08:00
dmapool.c
early_ioremap.c
fadvise.c mm: support POSIX_FADV_NOREUSE 2023-01-18 17:12:57 -08:00
failslab.c mm: fix unexpected changes to {failslab|fail_page_alloc}.attr 2022-11-22 18:50:44 -08:00
filemap.c filemap: add mapping_read_folio_gfp() 2023-02-09 16:51:42 -08:00
folio-compat.c mm: add folio_add_new_anon_rmap() 2023-02-02 22:32:56 -08:00
frontswap.c frontswap: don't call ->init if no ops are registered 2022-09-26 12:14:34 -07:00
gup_test.c mm/gup_test: free memory allocated via kvcalloc() using kvfree() 2022-12-15 16:37:48 -08:00
gup_test.h mm/gup_test: start/stop/read functionality for PIN LONGTERM test 2022-11-08 17:37:15 -08:00
gup.c mm/hugetlb: convert isolate_hugetlb to folios 2023-02-13 15:54:26 -08:00
highmem.c highmem: fix kmap_to_page() for kmap_local_page() addresses 2022-10-12 18:51:51 -07:00
hmm.c mm/hugetlb: make walk_hugetlb_range() safe to pmd unshare 2023-01-18 17:12:39 -08:00
huge_memory.c mm/gup: simplify the external interface functions and consolidate invariants 2023-02-09 16:51:44 -08:00
hugetlb_cgroup.c mm/hugetlb: increase use of folios in alloc_huge_page() 2023-02-13 15:54:27 -08:00
hugetlb_vmemmap.c mm/hugetlb_vmemmap: remap head page to newly allocated page 2022-11-30 15:58:47 -08:00
hugetlb_vmemmap.h mm: hugetlb_vmemmap: improve hugetlb_vmemmap code readability 2022-08-08 18:06:43 -07:00
hugetlb.c mm/hugetlb: convert hugetlb_wp() to take in a folio 2023-02-13 15:54:29 -08:00
hwpoison-inject.c mm/hwpoison: add __init/__exit annotations to module init/exit funcs 2022-10-03 14:03:05 -07:00
init-mm.c mm: remove rb tree. 2022-09-26 19:46:16 -07:00
internal.h mm/gup: move private gup FOLL_ flags to internal.h 2023-02-09 16:51:45 -08:00
interval_tree.c
io-mapping.c
ioremap.c mm: ioremap: Add ioremap/iounmap_allowed() 2022-06-27 12:22:31 +01:00
Kconfig zsmalloc: set default zspage chain size to 8 2023-02-02 22:33:23 -08:00
Kconfig.debug mm: move KMEMLEAK's Kconfig items from lib to mm 2023-02-02 22:33:26 -08:00
khugepaged.c mm/khugepaged: fix invalid page access in release_pte_pages() 2023-02-13 15:54:26 -08:00
kmemleak.c lib/stackdepot, mm: rename stack_depot_want_early_init 2023-02-16 20:43:49 -08:00
ksm.c Merge branch 'mm-hotfixes-stable' into mm-stable 2023-02-10 15:34:48 -08:00
list_lru.c mm: kmem: make mem_cgroup_from_obj() vmalloc()-safe 2022-06-16 19:48:31 -07:00
maccess.c maccess: Fix writing offset in case of fault in strncpy_from_kernel_nofault() 2022-11-11 11:44:46 -08:00
madvise.c mm: replace vma->vm_flags direct modifications with modifier calls 2023-02-09 16:51:39 -08:00
Makefile mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol 2022-10-03 14:03:36 -07:00
mapping_dirty_helpers.c mm/mmu_notifier: remove unused mmu_notifier_range_update_to_read_only export 2023-02-02 22:32:54 -08:00
memblock.c mm: Always release pages to the buddy allocator in memblock_free_late(). 2023-01-08 18:49:33 +02:00
memcontrol.c mm: multi-gen LRU: section for memcg LRU 2023-02-02 22:33:27 -08:00
memfd.c mm/memfd: add write seals when apply SEAL_EXEC to executable memfd 2023-01-18 17:12:37 -08:00
memory_hotplug.c mm/hugetlb: convert isolate_hugetlb to folios 2023-02-13 15:54:26 -08:00
memory-failure.c mm/hugetlb: convert isolate_hugetlb to folios 2023-02-13 15:54:26 -08:00
memory-tiers.c memory tier: release the new_memtier in find_create_memory_tier() 2023-02-09 16:51:40 -08:00
memory.c Merge branch 'mm-hotfixes-stable' into mm-stable 2023-02-10 15:34:48 -08:00
mempolicy.c mm/mempolicy: convert migrate_page_add() to migrate_folio_add() 2023-02-13 15:54:31 -08:00
mempool.c mempool: do not use ksize() for poisoning 2022-11-30 15:58:41 -08:00
memremap.c mm/memremap.c: fix outdated comment in devm_memremap_pages 2023-02-09 16:51:46 -08:00
memtest.c
migrate_device.c mm/mmu_notifier: remove unused mmu_notifier_range_update_to_read_only export 2023-02-02 22:32:54 -08:00
migrate.c migrate_pages: organize stats with struct migrate_pages_stats 2023-02-16 20:43:52 -08:00
mincore.c mm: convert find_get_incore_page() to filemap_get_incore_folio() 2022-11-08 17:37:18 -08:00
mlock.c mm: introduce vm_flags_reset_once to replace WRITE_ONCE vm_flags updates 2023-02-09 16:51:41 -08:00
mm_init.c memory: move hotplug memory notifier priority to same file for easy sorting 2022-11-08 17:37:17 -08:00
mm_slot.h mm: introduce common struct mm_slot 2022-10-03 14:02:43 -07:00
mmap_lock.c
mmap.c mm: introduce __vm_flags_mod and use it in untrack_pfn 2023-02-09 16:51:40 -08:00
mmu_gather.c mm: mmu_gather: allow more than one batch of delayed rmaps 2022-12-11 18:12:21 -08:00
mmu_notifier.c mm/mmu_notifier: remove unused mmu_notifier_range_update_to_read_only export 2023-02-02 22:32:54 -08:00
mmzone.c mm: multi-gen LRU: groundwork 2022-09-26 19:46:09 -07:00
mprotect.c mm: replace vma->vm_flags direct modifications with modifier calls 2023-02-09 16:51:39 -08:00
mremap.c mm: replace vma->vm_flags direct modifications with modifier calls 2023-02-09 16:51:39 -08:00
msync.c mm/msync: use vma_find() instead of vma linked list 2022-09-26 19:46:25 -07:00
nommu.c mm: replace vma->vm_flags direct modifications with modifier calls 2023-02-09 16:51:39 -08:00
oom_kill.c mm/mmu_notifier: remove unused mmu_notifier_range_update_to_read_only export 2023-02-02 22:32:54 -08:00
page_alloc.c mm/page_alloc: reduce fallbacks to (MIGRATE_PCPTYPES - 1) 2023-02-09 16:51:41 -08:00
page_counter.c mm: page_counter: remove unneeded atomic ops for low/min 2022-09-11 20:26:01 -07:00
page_ext.c mm/page_ext: init page_ext early if there are no deferred struct pages 2023-02-02 22:33:22 -08:00
page_idle.c mm: page_idle: convert page idle to use a folio 2023-01-18 17:12:52 -08:00
page_io.c block: remove ->rw_page 2023-02-02 22:33:34 -08:00
page_isolation.c mm/page_isolation: fix clang deadcode warning 2022-10-28 13:37:22 -07:00
page_owner.c lib/stackdepot, mm: rename stack_depot_want_early_init 2023-02-16 20:43:49 -08:00
page_poison.c
page_reporting.c mm/page_reporting: replace rcu_access_pointer() with rcu_dereference_protected() 2023-01-18 17:12:50 -08:00
page_reporting.h
page_table_check.c mm/page_ext: do not allocate space for page_ext->flags if not needed 2023-02-02 22:33:11 -08:00
page_vma_mapped.c mm/hugetlb: introduce hugetlb_walk() 2023-01-18 17:12:39 -08:00
page-writeback.c fs: convert writepage_t callback to pass a folio 2023-02-02 22:33:34 -08:00
pagewalk.c mm/hugetlb: introduce hugetlb_walk() 2023-01-18 17:12:39 -08:00
percpu-internal.h percpu: improve percpu_alloc_percpu event trace 2022-05-13 07:20:18 -07:00
percpu-km.c
percpu-stats.c
percpu-vm.c
percpu.c mm/percpu.c: remove the lcm code since block size is fixed at page size 2022-11-07 22:59:18 -08:00
pgalloc-track.h
pgtable-generic.c mm: avoid unnecessary flush on change_huge_pmd() 2022-05-13 07:20:05 -07:00
process_vm_access.c use less confusing names for iov_iter direction initializers 2022-11-25 13:01:55 -05:00
ptdump.c mm: pagewalk: Fix race between unmap and page walker 2022-09-03 10:13:13 -07:00
readahead.c readahead: convert readahead_expand() to use a folio 2023-02-02 22:33:21 -08:00
rmap.c mm/hugetlb: convert hugetlb fault paths to use alloc_hugetlb_folio() 2023-02-13 15:54:29 -08:00
rodata_test.c mm/rodata_test: use PAGE_ALIGNED() helper 2022-10-03 14:03:05 -07:00
secretmem.c mm: replace vma->vm_flags direct modifications with modifier calls 2023-02-09 16:51:39 -08:00
shmem.c shmem: fix W=1 build warnings with CONFIG_SHMEM=n 2023-02-09 16:51:42 -08:00
shrinker_debug.c mm: shrinkers: fix deadlock in shrinker debugfs 2023-02-09 15:56:51 -08:00
shuffle.c mm/shuffle: convert module_param_call to module_param_cb 2022-10-03 14:03:07 -07:00
shuffle.h
slab_common.c mm/kasan: simplify and refine kasan_cache code 2023-01-18 17:12:55 -08:00
slab.c mm: introduce folio_is_pfmemalloc 2023-01-18 17:12:57 -08:00
slab.h mm/slab: add is_kmalloc_cache() helper function 2023-01-18 17:12:54 -08:00
slob.c Merge branch 'slab/for-6.1/kmalloc_size_roundup' into slab/for-next 2022-09-29 11:30:55 +02:00
slub.c lib/stackdepot, mm: rename stack_depot_want_early_init 2023-02-16 20:43:49 -08:00
sparse-vmemmap.c mm/sparse-vmemmap: generalise vmemmap_populate_hugepages() 2022-12-11 18:12:12 -08:00
sparse.c mm/sparse: fix "unused function 'pgdat_to_phys'" warning 2023-02-02 22:33:29 -08:00
swap_cgroup.c mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled 2022-10-03 14:03:36 -07:00
swap_slots.c mm/swap: convert put_swap_page() to put_swap_folio() 2022-10-03 14:02:46 -07:00
swap_state.c swap_state: update shadow_nodes for anonymous page 2023-02-02 22:33:24 -08:00
swap.c mm: clean up mlock_page / munlock_page references in comments 2023-02-02 22:33:20 -08:00
swap.h mm: remove the __swap_writepage return value 2023-02-02 22:33:33 -08:00
swapfile.c Merge branch 'mm-hotfixes-stable' into mm-stable 2023-02-10 15:34:48 -08:00
truncate.c folio-compat: remove lru_cache_add() 2022-12-11 18:12:13 -08:00
usercopy.c mm: use kstrtobool() instead of strtobool() 2022-11-30 15:58:45 -08:00
userfaultfd.c mm/uffd: detect pgtable allocation failures 2023-01-18 17:12:53 -08:00
util.c mm: fix typo in __vm_enough_memory warning 2023-02-13 15:54:33 -08:00
vmalloc.c mm/vmalloc: skip the uninitilized vmalloc areas 2023-02-09 16:51:43 -08:00
vmpressure.c
vmscan.c Merge branch 'mm-hotfixes-stable' into mm-stable 2023-02-10 15:34:48 -08:00
vmstat.c mm: vmscan: split khugepaged stats from direct reclaim stats 2022-11-30 15:58:41 -08:00
workingset.c swap_state: update shadow_nodes for anonymous page 2023-02-02 22:33:24 -08:00
z3fold.c mm: remove PageMovable export 2023-01-18 17:12:57 -08:00
zbud.c zpool: clean out dead code 2022-12-11 18:12:10 -08:00
zpool.c zpool: clean out dead code 2022-12-11 18:12:10 -08:00
zsmalloc.c zsmalloc: make zspage chain size configurable 2023-02-02 22:33:23 -08:00
zswap.c zswap: fix writeback lock ordering for zsmalloc 2022-12-11 18:12:09 -08:00