linux/mm
Alex Sierra 6077c943be mm: rename is_pinnable_page() to is_longterm_pinnable_page()
Patch series "Add MEMORY_DEVICE_COHERENT for coherent device memory
mapping", v9.

This patch series introduces MEMORY_DEVICE_COHERENT, a type of memory
owned by a device that can be mapped into CPU page tables like
MEMORY_DEVICE_GENERIC and can also be migrated like MEMORY_DEVICE_PRIVATE.

This patch series is mostly self-contained except for a few places where
it needs to update other subsystems to handle the new memory type.

System stability and performance are not affected according to our ongoing
testing, including xfstests.

How it works: The system BIOS advertises the GPU device memory (aka VRAM)
as SPM (special purpose memory) in the UEFI system address map.

The amdgpu driver registers the memory with devmap as
MEMORY_DEVICE_COHERENT using devm_memremap_pages.  The initial user for
this hardware page migration capability is the Frontier supercomputer
project.  This functionality is not AMD-specific.  We expect other GPU
vendors to find this functionality useful, and possibly other hardware
types in the future.

Our test nodes in the lab are similar to the Frontier configuration, with
.5 TB of system memory plus 256 GB of device memory split across 4 GPUs,
all in a single coherent address space.  Page migration is expected to
improve application efficiency significantly.  We will report empirical
results as they become available.

Coherent device type pages at gup are now migrated back to system memory
if they are being pinned long-term (FOLL_LONGTERM).  The reason is, that
long-term pinning would interfere with the device memory manager owning
the device-coherent pages (e.g.  evictions in TTM).  These series
incorporate Alistair Popple patches to do this migration from
pin_user_pages() calls.  hmm_gup_test has been added to hmm-test to test
different get user pages calls.

This series includes handling of device-managed anonymous pages returned
by vm_normal_pages.  Although they behave like normal pages for purposes
of mapping in CPU page tables and for COW, they do not support LRU lists,
NUMA migration or THP.

We also introduced a FOLL_LRU flag that adds the same behaviour to
follow_page and related APIs, to allow callers to specify that they expect
to put pages on an LRU list.


This patch (of 14):

is_pinnable_page() and folio_is_pinnable() are renamed to
is_longterm_pinnable_page() and folio_is_longterm_pinnable() respectively.
These functions are used in the FOLL_LONGTERM flag context.

Link: https://lkml.kernel.org/r/20220715150521.18165-1-alex.sierra@amd.com
Link: https://lkml.kernel.org/r/20220715150521.18165-2-alex.sierra@amd.com
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-17 17:14:27 -07:00
..
damon mm/damon/lru_sort: fix potential memory leak in damon_lru_sort_init() 2022-07-17 17:14:27 -07:00
kasan kasan: fix zeroing vmalloc memory with HW_TAGS 2022-07-03 18:08:39 -07:00
kfence mm/kfence: select random number before taking raw lock 2022-06-16 19:11:31 -07:00
backing-dev.c init: Initialize noop_backing_dev_info early 2022-06-16 10:55:57 +02:00
balloon_compaction.c mm/balloon_compaction: make balloon page compaction callbacks static 2022-03-28 16:52:57 -04:00
bootmem_info.c bootmem: Use page->index instead of page->freelist 2022-01-06 12:27:03 +01:00
cma_debug.c
cma_sysfs.c
cma.c Revert "mm/cma.c: remove redundant cma_mutex lock" 2022-05-13 15:11:26 -07:00
cma.h mm/cma: provide option to opt out from exposing pages on activation failure 2022-03-22 15:57:09 -07:00
compaction.c mm, docs: fix comments that mention mem_hotplug_end() 2022-07-03 18:08:50 -07:00
debug_page_ref.c
debug_vm_pgtable.c docs: rename Documentation/vm to Documentation/mm 2022-06-27 12:52:53 -07:00
debug.c mm: unexport page_init_poison 2022-03-24 19:06:45 -07:00
dmapool.c mm/dmapool.c: revert "make dma pool to use kmalloc_node" 2022-01-15 16:30:28 +02:00
early_ioremap.c mm/early_ioremap: declare early_memremap_pgprot_adjust() 2022-03-22 15:57:11 -07:00
fadvise.c riscv: compat: syscall: Add compat_sys_call_table implementation 2022-04-26 13:36:25 -07:00
failslab.c mm: fix missing handler for __GFP_NOWARN 2022-05-19 14:08:55 -07:00
filemap.c filemap: Handle sibling entries in filemap_get_read_batch() 2022-06-20 16:37:45 -04:00
folio-compat.c fs: Remove aop flags parameter from grab_cache_page_write_begin() 2022-05-08 14:28:19 -04:00
frontswap.c docs: rename Documentation/vm to Documentation/mm 2022-06-27 12:52:53 -07:00
gup_test.c mm: rename is_pinnable_page() to is_longterm_pinnable_page() 2022-07-17 17:14:27 -07:00
gup_test.h
gup.c mm: rename is_pinnable_page() to is_longterm_pinnable_page() 2022-07-17 17:14:27 -07:00
highmem.c highmem: fix checks in __kmap_local_sched_{in,out} 2022-04-08 14:20:36 -10:00
hmm.c mm: teach core mm about pte markers 2022-05-13 07:20:09 -07:00
huge_memory.c mm: shrinkers: provide shrinkers with names 2022-07-03 18:08:40 -07:00
hugetlb_cgroup.c hugetlb: add hugetlb.*.numa_stat file 2022-01-15 16:30:29 +02:00
hugetlb_vmemmap.c mm: memory_hotplug: make hugetlb_optimize_vmemmap compatible with memmap_on_memory 2022-07-03 18:08:49 -07:00
hugetlb_vmemmap.h mm: hugetlb_vmemmap: cleanup CONFIG_HUGETLB_PAGE_FREE_VMEMMAP* 2022-04-28 23:16:15 -07:00
hugetlb.c mm: rename is_pinnable_page() to is_longterm_pinnable_page() 2022-07-17 17:14:27 -07:00
hwpoison-inject.c mm/memory-failure: disable unpoison once hw error happens 2022-06-16 19:11:32 -07:00
init-mm.c kernel/fork: Initialize mm's PASID 2022-02-14 19:51:47 +01:00
internal.h mm: split free page with properly free memory accounting and without race 2022-05-27 09:33:43 -07:00
interval_tree.c
io-mapping.c
ioremap.c
Kconfig docs: rename Documentation/vm to Documentation/mm 2022-06-27 12:52:53 -07:00
Kconfig.debug Two followon fixes for the post-5.19 series "Use pageblock_order for cma 2022-05-27 11:40:49 -07:00
khugepaged.c mm/khugepaged: try to free transhuge swapcache when possible 2022-07-03 18:08:52 -07:00
kmemleak.c mm/kmemleak: prevent soft lockup in first object iteration loop of kmemleak_scan() 2022-06-16 19:48:32 -07:00
ksm.c docs: rename Documentation/vm to Documentation/mm 2022-06-27 12:52:53 -07:00
list_lru.c mm: kmem: make mem_cgroup_from_obj() vmalloc()-safe 2022-06-16 19:48:31 -07:00
maccess.c asm-generic updates for 5.18 2022-03-23 18:03:08 -07:00
madvise.c mm/madvise: minor cleanup for swapin_walk_pmd_entry() 2022-07-03 18:08:49 -07:00
Makefile mm: shrinkers: introduce debugfs interface for memory shrinkers 2022-07-03 18:08:40 -07:00
mapping_dirty_helpers.c mm: move tlb_flush_pending inline helpers to mm_inline.h 2022-01-15 16:30:27 +02:00
memblock.c mm: kmemleak: remove kmemleak_not_leak_phys() and the min_count argument to kmemleak_alloc_phys() 2022-06-16 19:48:30 -07:00
memcontrol.c mm: memcontrol: introduce mem_cgroup_ino() and mem_cgroup_get_from_ino() 2022-07-03 18:08:40 -07:00
memfd.c memfd: fix F_SEAL_WRITE after shmem huge page allocated 2022-03-05 11:08:32 -08:00
memory_hotplug.c mm: memory_hotplug: make hugetlb_optimize_vmemmap compatible with memmap_on_memory 2022-07-03 18:08:49 -07:00
memory-failure.c mm/swap: convert delete_from_swap_cache() to take a folio 2022-07-03 18:08:48 -07:00
memory.c mm: avoid unnecessary page fault retires on shared memory types 2022-06-16 19:48:27 -07:00
mempolicy.c mm/mempolicy: fix get_nodes out of bound access 2022-07-03 18:08:39 -07:00
mempool.c mm/mempool: use might_alloc() 2022-06-16 19:48:30 -07:00
memremap.c mm/memremap: fix memunmap_pages() race with get_dev_pagemap() 2022-06-16 19:48:31 -07:00
memtest.c
migrate_device.c mm: remember exclusively mapped anonymous pages with PG_anon_exclusive 2022-05-09 18:20:44 -07:00
migrate.c mm/migration: fix potential pte_unmap on an not mapped pte 2022-07-03 18:08:37 -07:00
mincore.c mm: teach core mm about pte markers 2022-05-13 07:20:09 -07:00
mlock.c mm/munlock: protect the per-CPU pagevec by a local_lock_t 2022-04-01 11:46:09 -07:00
mm_init.c
mmap_lock.c
mmap.c docs: rename Documentation/vm to Documentation/mm 2022-06-27 12:52:53 -07:00
mmu_gather.c mm/mmu_gather: limit free batch count and add schedule point in tlb_batch_pages_flush 2022-04-28 23:16:12 -07:00
mmu_notifier.c mm/mmu_notifier.c: fix race in mmu_interval_notifier_remove() 2022-04-21 20:01:10 -07:00
mmzone.c Folio changes for 5.18 2022-03-22 17:03:12 -07:00
mprotect.c mm/mprotect: try avoiding write faults for exclusive anonymous pages when changing protection 2022-07-03 18:08:44 -07:00
mremap.c Yang Shi has improved the behaviour of khugepaged collapsing of readonly 2022-05-26 12:32:41 -07:00
msync.c
nommu.c no-MMU: expose vmalloc_huge() for alloc_large_system_hash() 2022-04-25 10:11:49 -07:00
oom_kill.c mm/oom_kill.c: fix vm_oom_kill_table[] ifdeffery 2022-06-01 15:57:16 -07:00
page_alloc.c mm/page_alloc: make the annotations of available memory more accurate 2022-07-03 18:08:50 -07:00
page_counter.c mm/page_counter: remove an incorrect call to propagate_protected_usage() 2022-01-15 16:30:27 +02:00
page_ext.c mm: use for_each_online_node and node_online instead of open coding 2022-04-29 14:36:58 -07:00
page_idle.c mm: don't be stuck to rmap lock on reclaim path 2022-05-19 14:08:54 -07:00
page_io.c Yang Shi has improved the behaviour of khugepaged collapsing of readonly 2022-05-26 12:32:41 -07:00
page_isolation.c mm/page_isolation.c: fix one kernel-doc comment 2022-06-16 19:11:30 -07:00
page_owner.c Yang Shi has improved the behaviour of khugepaged collapsing of readonly 2022-05-26 12:32:41 -07:00
page_poison.c
page_reporting.c
page_reporting.h
page_table_check.c Six hotfixes. One from Miaohe Lin is considered a minor thing so it isn't 2022-05-27 11:29:35 -07:00
page_vma_mapped.c mm/page_vma_mapped.c: check possible huge PMD map with transhuge_vma_suitable() 2022-07-03 18:08:37 -07:00
page-writeback.c sysctl changes for v5.19-rc1 2022-05-26 16:57:20 -07:00
pagewalk.c
percpu-internal.h percpu: improve percpu_alloc_percpu event trace 2022-05-13 07:20:18 -07:00
percpu-km.c
percpu-stats.c mm: use vmalloc_array and vcalloc for array allocations 2022-03-08 09:30:46 -05:00
percpu-vm.c
percpu.c percpu: improve percpu_alloc_percpu event trace 2022-05-13 07:20:18 -07:00
pgalloc-track.h
pgtable-generic.c mm: avoid unnecessary flush on change_huge_pmd() 2022-05-13 07:20:05 -07:00
process_vm_access.c
ptdump.c mm: sparsemem: use page table lock to protect kernel pmd operations 2022-03-22 15:57:08 -07:00
readahead.c filemap: Fix serialization adding transparent huge pages to page cache 2022-06-23 12:22:00 -04:00
rmap.c mm: hugetlb: kill set_huge_swap_pte_at() 2022-07-03 18:08:50 -07:00
rodata_test.c
secretmem.c secretmem: Convert to free_folio 2022-05-09 23:12:53 -04:00
shmem.c mm/swap: convert delete_from_swap_cache() to take a folio 2022-07-03 18:08:48 -07:00
shrinker_debug.c mm: shrinkers: add scan interface for shrinker debugfs 2022-07-03 18:08:41 -07:00
shuffle.c
shuffle.h
slab_common.c Yang Shi has improved the behaviour of khugepaged collapsing of readonly 2022-05-26 12:32:41 -07:00
slab.c mm/slab: delete cache_alloc_debugcheck_before() 2022-06-16 19:48:29 -07:00
slab.h slab changes for 5.19 2022-05-25 10:24:04 -07:00
slob.c mm: make minimum slab alignment a runtime property 2022-05-13 07:20:07 -07:00
slub.c mm/slub: add missing TID updates on slab deactivation 2022-06-13 17:41:36 +02:00
sparse-vmemmap.c mm: sparsemem: drop unexpected word 'a' in comments 2022-07-03 18:08:50 -07:00
sparse.c mm: memory_hotplug: enumerate all supported section flags 2022-07-03 18:08:49 -07:00
swap_cgroup.c mm: use vmalloc_array and vcalloc for array allocations 2022-03-08 09:30:46 -05:00
swap_slots.c mm/swap: remove buggy cache->nr check in refill_swap_slots_cache 2022-05-19 14:08:51 -07:00
swap_state.c mm/swap: convert __delete_from_swap_cache() to a folio 2022-07-03 18:08:48 -07:00
swap.c mm: convert destroy_compound_page() to destroy_large_folio() 2022-07-03 18:08:48 -07:00
swap.h mm/khugepaged: try to free transhuge swapcache when possible 2022-07-03 18:08:52 -07:00
swapfile.c mm/swap: convert delete_from_swap_cache() to take a folio 2022-07-03 18:08:48 -07:00
truncate.c Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
usercopy.c usercopy: Make usercopy resilient against ridiculously large copies 2022-06-13 09:54:52 -07:00
userfaultfd.c mm/uffd: enable write protection for shmem & hugetlbfs 2022-05-13 07:20:11 -07:00
util.c docs: rename Documentation/vm to Documentation/mm 2022-06-27 12:52:53 -07:00
vmacache.c
vmalloc.c mm/vmalloc: extend __find_vmap_area() with one more argument 2022-07-03 18:08:41 -07:00
vmpressure.c mm/vmpressure: fix data-race with memcg->socket_pressure 2021-11-06 13:30:40 -07:00
vmscan.c mm, docs: fix comments that mention mem_hotplug_end() 2022-07-03 18:08:50 -07:00
vmstat.c Bitmap patches for 5.19-rc1 2022-06-04 14:04:27 -07:00
workingset.c mm: shrinkers: provide shrinkers with names 2022-07-03 18:08:40 -07:00
z3fold.c mm/z3fold: fix z3fold_page_migrate races with z3fold_map 2022-05-27 09:33:44 -07:00
zbud.c
zpool.c zpool: remove the list of pools_head 2022-01-15 16:30:31 +02:00
zsmalloc.c mm: shrinkers: provide shrinkers with names 2022-07-03 18:08:40 -07:00
zswap.c zswap: memcg accounting 2022-05-19 14:08:53 -07:00