linux/include/asm-generic
David Hildenbrand d7f861b9c4 mm/mmu_gather: add __tlb_remove_folio_pages()
Add __tlb_remove_folio_pages(), which will remove multiple consecutive
pages that belong to the same large folio, instead of only a single page. 
We'll be using this function when optimizing unmapping/zapping of large
folios that are mapped by PTEs.

We're using the remaining spare bit in an encoded_page to indicate that
the next enoced page in an array contains actually shifted "nr_pages". 
Teach swap/freeing code about putting multiple folio references, and
delayed rmap handling to remove page ranges of a folio.

This extension allows for still gathering almost as many small folios as
we used to (-1, because we have to prepare for a possibly bigger next
entry), but still allows for gathering consecutive pages that belong to
the same large folio.

Note that we don't pass the folio pointer, because it is not required for
now.  Further, we don't support page_size != PAGE_SIZE, it won't be
required for simple PTE batching.

We have to provide a separate s390 implementation, but it's fairly
straight forward.

Another, more invasive and likely more expensive, approach would be to use
folio+range or a PFN range instead of page+nr_pages.  But, we should do
that consistently for the whole mmu_gather.  For now, let's keep it simple
and add "nr_pages" only.

Note that it is now possible to gather significantly more pages: In the
past, we were able to gather ~10000 pages, now we can also gather ~5000
folio fragments that span multiple pages.  A folio fragment on x86-64 can
span up to 512 pages (2 MiB THP) and on arm64 with 64k in theory 8192
pages (512 MiB THP).  Gathering more memory is not considered something we
should worry about, especially because these are already corner cases.

While we can gather more total memory, we won't free more folio fragments.
As long as page freeing time primarily only depends on the number of
involved folios, there is no effective change for !preempt configurations.
However, we'll adjust tlb_batch_pages_flush() separately to handle corner
cases where page freeing time grows proportionally with the actual memory
size.

Link: https://lkml.kernel.org/r/20240214204435.167852-9-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yin Fengwei <fengwei.yin@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-02-22 15:27:17 -08:00
..
bitops mm: delete checks for xor_unlock_is_negative_byte() 2023-10-18 14:34:17 -07:00
vdso lib/vdso: Avoid highres update if clocksource is not VDSO capable 2020-02-17 20:12:17 +01:00
access_ok.h uaccess: remove CONFIG_SET_FS 2022-02-25 09:36:06 +01:00
agp.h char/agp: introduce asm-generic/agp.h 2023-02-13 22:13:29 +01:00
archrandom.h random: handle archrandom with multiple longs 2022-07-25 13:26:14 +02:00
asm-offsets.h
asm-prototypes.h
atomic64.h locking/atomic: delete !ARCH_ATOMIC remnants 2021-05-26 13:20:52 +02:00
atomic.h locking/atomic: make atomic*_{cmp,}xchg optional 2023-06-05 09:57:14 +02:00
audit_change_attr.h
audit_dir_write.h
audit_read.h
audit_signal.h
audit_write.h
barrier.h asm-generic: Add memory barrier dma_mb() 2022-06-23 18:34:58 +01:00
bitops.h include: move find.h from asm_generic to linux 2022-01-15 08:47:31 -08:00
bitsperlong.h lib: extend the scope of small_const_nbits() macro 2021-05-06 19:24:11 -07:00
bug.h panic: make function declarations visible 2023-06-09 17:44:15 -07:00
cache.h
cacheflush.h mm: Introduce flush_cache_vmap_early() 2023-12-14 00:23:17 -08:00
cfi.h cfi: Flip headers 2023-12-15 16:25:55 -08:00
checksum.h asm-generic: Improve csum_fold 2024-01-17 17:52:29 -08:00
cmpxchg-local.h asm-generic: Fix 32 bit __generic_cmpxchg_local 2024-01-05 23:19:14 +01:00
cmpxchg.h asm-generic: avoid __generic_cmpxchg_local warnings 2023-04-04 17:58:11 +02:00
compat.h asm-generic: compat: fix compat_arg_u64() and compat_arg_u64_dual() 2022-11-01 10:20:11 +11:00
current.h asm-generic: current: Don't include thread-info.h if building asm 2023-08-26 22:38:49 +02:00
delay.h
device.h
div64.h ARM: 9117/1: asm-generic: div64: Remove always-true __div64_const32_is_OK() 2021-08-20 11:39:28 +01:00
dma-mapping.h dma-mapping: no need to pass a bus_type into get_arch_dma_ops() 2023-02-15 12:35:20 +01:00
dma.h
early_ioremap.h mm/early_ioremap.c: remove redundant early_ioremap_shutdown() 2021-09-08 11:50:24 -07:00
emergency-restart.h
error-injection.h docs: fault-injection: add requirements of error injectable functions 2023-02-02 22:50:00 -08:00
exec.h
export.h ia64,export.h: replace EXPORT_DATA_SYMBOL* with EXPORT_SYMBOL* 2023-06-22 21:21:06 +09:00
extable.h
fb.h fbdev: Replace fb_pgprotect() with pgprot_framebuffer() 2023-10-12 09:20:46 +02:00
fixmap.h
flat.h binfmt_flat: remove the persistent argument from flat_get_addr_from_rp 2019-06-24 09:16:47 +10:00
ftrace.h
futex.h futex: Fix additional regressions 2021-12-11 23:31:51 +01:00
getorder.h asm-generic: force inlining of get_order() to work around gcc10 poor decision 2020-12-15 22:46:15 -08:00
hardirq.h irqstat: Move declaration into asm-generic/hardirq.h 2020-11-23 10:31:06 +01:00
hugetlb.h mm: hugetlb: add huge page size param to set_huge_pte_at() 2023-09-29 17:20:47 -07:00
hw_irq.h
hyperv-tlfs.h x86/hyperv: Add smp support for SEV-SNP guest 2023-08-22 00:38:20 +00:00
int-ll64.h
io.h mm: ioremap: remove unneeded ioremap_allowed and iounmap_allowed 2023-08-18 10:12:36 -07:00
ioctl.h
iomap.h asm-generic/iomap.h: remove ARCH_HAS_IOREMAP_xx macros 2023-08-18 10:12:32 -07:00
irq_regs.h
irq_work.h
irq.h
irqflags.h
Kbuild cfi: Flip headers 2023-12-15 16:25:55 -08:00
kdebug.h
kmap_size.h mm/highmem: Provide and use CONFIG_DEBUG_KMAP_LOCAL 2020-11-24 14:42:08 +01:00
kprobes.h treewide: Convert macro and uses of __section(foo) to __section("foo") 2020-10-25 14:51:49 -07:00
kvm_para.h
kvm_types.h KVM: Move x86's version of struct kvm_mmu_memory_cache to common code 2020-07-09 13:29:42 -04:00
linkage.h
local64.h locking/generic: Wire up local{,64}_try_cmpxchg() 2023-04-29 09:09:09 +02:00
local.h locking/generic: Wire up local{,64}_try_cmpxchg() 2023-04-29 09:09:09 +02:00
logic_io.h logic_io instance of iounmap() needs volatile on argument 2021-12-21 21:31:08 +01:00
mcs_spinlock.h
memory_model.h mm, arch: add generic implementation of pfn_valid() for FLATMEM 2023-02-09 16:51:41 -08:00
mm_hooks.h mm: remove arch_bprm_mm_init() hook 2020-01-23 10:41:16 -08:00
mmiowb_types.h
mmiowb.h asm-generic/mmiowb: Allow mmiowb_set_pending() when preemptible() 2020-07-17 10:02:03 +01:00
mmu_context.h asm-generic: add generic MMU versions of mmu context functions 2020-10-26 16:45:03 +01:00
mmu.h
module.h
module.lds.h kbuild: preprocess module linker script 2020-09-25 00:36:41 +09:00
mshyperv.h hyperv: reduce size of ms_hyperv_info 2023-09-22 20:28:16 +00:00
msi.h genirq: Get rid of GENERIC_MSI_IRQ_DOMAIN 2022-11-17 15:15:20 +01:00
nommu_context.h asm-generic: add generic MMU versions of mmu context functions 2020-10-26 16:45:03 +01:00
numa.h arm64: irq: set the correct node for VMAP stack 2023-12-05 14:26:50 +00:00
page.h asm-generic/page.h: Make pfn accessors static inlines 2023-05-29 11:27:08 +02:00
param.h
parport.h
pci_iomap.h PCI: Stub __pci_ioport_map() for arches that don't support it at all 2022-07-29 12:01:00 -05:00
pci.h asm-generic: Add new pci.h and use it 2022-07-22 17:34:57 -05:00
percpu.h percpu: Fix self-assignment of __old in raw_cpu_generic_try_cmpxchg() 2023-06-08 10:28:39 +02:00
pgalloc.h mm: add statistics for PUD level pagetable 2023-10-06 14:44:10 -07:00
pgtable_uffd.h userfaultfd: wp: add pmd_swp_*uffd_wp() helpers 2020-04-07 10:43:39 -07:00
pgtable-nop4d.h mm: rename p4d_page_vaddr to p4d_pgtable and make it return pud_t * 2021-07-08 11:48:22 -07:00
pgtable-nopmd.h riscv/mm: fix two page table check related issues 2022-05-19 14:08:48 -07:00
pgtable-nopud.h mm: rename p4d_page_vaddr to p4d_pgtable and make it return pud_t * 2021-07-08 11:48:22 -07:00
preempt.h riscv: support PREEMPT_DYNAMIC with static keys 2023-08-31 00:18:34 -07:00
qrwlock_types.h locking/qrwlock: Change "queue rwlock" to "queued rwlock" 2022-05-11 16:27:04 +02:00
qrwlock.h asm-generic changes for 5.19 2022-05-26 10:50:30 -07:00
qspinlock_types.h locking/qspinlock: Do not include atomic.h from qspinlock_types.h 2020-07-29 16:14:19 +02:00
qspinlock.h asm-generic: qspinlock: fix queued_spin_value_unlocked() implementation 2023-11-22 09:32:49 -08:00
resource.h
rwonce.h asm/rwonce: Don't pull <asm/barrier.h> into 'asm-generic/rwonce.h' 2020-07-21 10:50:36 +01:00
seccomp.h seccomp: Use -1 marker for end of mode 1 syscall list 2020-07-10 16:01:52 -07:00
sections.h asm-generic: sections: refactor memory_intersects 2022-08-28 14:02:45 -07:00
serial.h
set_memory.h
shmparam.h
signal.h asm-generic: Remove empty #ifdef SA_RESTORER 2022-09-10 09:56:53 +02:00
simd.h
softirq_stack.h asm-generic: Conditionally enable do_softirq_own_stack() via Kconfig. 2022-09-05 17:20:55 +02:00
spinlock_types.h asm-generic: ticket-lock: New generic ticket-based spinlock 2022-05-11 11:49:38 -07:00
spinlock.h asm-generic: ticket-lock: Optimize arch_spin_value_unlocked() 2023-09-21 10:17:00 +02:00
statfs.h
string.h
switch_to.h
syscall.h ptrace: Create ptrace_report_syscall_{entry,exit} in ptrace.h 2022-03-10 13:35:08 -06:00
syscalls.h
timex.h
tlb.h mm/mmu_gather: add __tlb_remove_folio_pages() 2024-02-22 15:27:17 -08:00
tlbflush.h
topology.h mm: replace CONFIG_NEED_MULTIPLE_NODES with CONFIG_NUMA 2021-06-29 10:53:55 -07:00
trace_clock.h
uaccess.h uaccess: remove CONFIG_SET_FS 2022-02-25 09:36:06 +01:00
unaligned.h asm-generic: make sparse happy with odd-sized put_unaligned_*() 2024-01-08 09:48:35 -08:00
user.h
vermagic.h arch: split MODULE_ARCH_VERMAGIC definitions out to <asm/vermagic.h> 2020-04-23 10:50:26 +09:00
vga.h
vmlinux.lds.h kunit: add KUNIT_INIT_TABLE to init linker section 2023-12-18 13:21:15 -07:00
vtime.h
word-at-a-time.h word-at-a-time: use the same return type for has_zero regardless of endianness 2023-08-02 10:23:36 -07:00
xor.h lib/xor: make xor prototypes more friendly to compiler vectorization 2022-02-11 20:39:39 +11:00