linux/mm
Qian Cai 1a9f219157 mm/hotplug: treat CMA pages as unmovable
has_unmovable_pages() is used by allocating CMA and gigantic pages as
well as the memory hotplug.  The later doesn't know how to offline CMA
pool properly now, but if an unused (free) CMA page is encountered, then
has_unmovable_pages() happily considers it as a free memory and
propagates this up the call chain.  Memory offlining code then frees the
page without a proper CMA tear down which leads to an accounting issues.
Moreover if the same memory range is onlined again then the memory never
gets back to the CMA pool.

State after memory offline:

 # grep cma /proc/vmstat
 nr_free_cma 205824

 # cat /sys/kernel/debug/cma/cma-kvm_cma/count
 209920

Also, kmemleak still think those memory address are reserved below but
have already been used by the buddy allocator after onlining.  This
patch fixes the situation by treating CMA pageblocks as unmovable except
when has_unmovable_pages() is called as part of CMA allocation.

  Offlined Pages 4096
  kmemleak: Cannot insert 0xc000201f7d040008 into the object search tree (overlaps existing)
  Call Trace:
    dump_stack+0xb0/0xf4 (unreliable)
    create_object+0x344/0x380
    __kmalloc_node+0x3ec/0x860
    kvmalloc_node+0x58/0x110
    seq_read+0x41c/0x620
    __vfs_read+0x3c/0x70
    vfs_read+0xbc/0x1a0
    ksys_read+0x7c/0x140
    system_call+0x5c/0x70
  kmemleak: Kernel memory leak detector disabled
  kmemleak: Object 0xc000201cc8000000 (size 13757317120):
  kmemleak:   comm "swapper/0", pid 0, jiffies 4294937297
  kmemleak:   min_count = -1
  kmemleak:   count = 0
  kmemleak:   flags = 0x5
  kmemleak:   checksum = 0
  kmemleak:   backtrace:
       cma_declare_contiguous+0x2a4/0x3b0
       kvm_cma_reserve+0x11c/0x134
       setup_arch+0x300/0x3f8
       start_kernel+0x9c/0x6e8
       start_here_common+0x1c/0x4b0
  kmemleak: Automatic memory scanning thread ended

[cai@lca.pw: use is_migrate_cma_page() and update commit log]
  Link: http://lkml.kernel.org/r/20190416170510.20048-1-cai@lca.pw
Link: http://lkml.kernel.org/r/20190413002623.8967-1-cai@lca.pw
Signed-off-by: Qian Cai <cai@lca.pw>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19 09:46:05 -07:00
..
kasan kasan: fix variable 'tag' set but not used warning 2019-03-29 10:01:36 -07:00
backing-dev.c writeback: synchronize sync(2) against cgroup writeback membership switches 2019-01-22 14:39:38 -07:00
balloon_compaction.c virtio_balloon: fix deadlock on OOM 2017-11-14 23:57:38 +02:00
cleancache.c mm: use octal not symbolic permissions 2018-06-15 07:55:25 +09:00
cma_debug.c mm/cma_debug.c: remove static scoped cma_debugfs_root 2019-03-05 21:07:20 -08:00
cma.c memblock: emphasize that memblock_alloc_range() returns a physical address 2019-03-12 10:04:01 -07:00
cma.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
compaction.c mm/compaction.c: abort search if isolation fails 2019-04-04 11:56:15 +01:00
debug_page_ref.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
debug.c mm/debug.c: fix __dump_page when mapping->host is not set 2019-03-29 10:01:37 -07:00
dmapool.c docs/core-api/mm: fix return value descriptions in mm/ 2019-03-05 21:07:20 -08:00
early_ioremap.c mm/early_ioremap: Fix boot hang with earlyprintk=efi,keep 2017-12-11 14:54:44 +01:00
fadvise.c vfs: implement readahead(2) using POSIX_FADV_WILLNEED 2018-08-30 20:01:32 +02:00
failslab.c mm: no need to check return value of debugfs_create functions 2019-03-05 21:07:17 -08:00
filemap.c filemap: add a comment about FAULT_FLAG_RETRY_NOWAIT behavior 2019-03-15 11:26:07 -07:00
frame_vector.c mm/frame_vector.c: release a semaphore in 'get_vaddr_frames()' 2017-12-14 16:00:48 -08:00
frontswap.c mm: use octal not symbolic permissions 2018-06-15 07:55:25 +09:00
gup_benchmark.c mm: no need to check return value of debugfs_create functions 2019-03-05 21:07:17 -08:00
gup.c Merge branch 'page-refs' (page ref overflow) 2019-04-14 15:09:40 -07:00
highmem.c mm: convert totalram_pages and totalhigh_pages variables to atomic 2018-12-28 12:11:47 -08:00
hmm.c mm/hmm: convert to use vm_fault_t 2019-03-12 10:04:00 -07:00
huge_memory.c mm/huge_memory.c: fix modifying of page protection by insert_pfn_pmd() 2019-04-05 16:02:31 -10:00
hugetlb_cgroup.c mm: rename page_counter's count/limit into usage/max 2018-06-07 17:34:35 -07:00
hugetlb.c Merge branch 'page-refs' (page ref overflow) 2019-04-14 15:09:40 -07:00
hwpoison-inject.c mm/memory_failure: Remove unused trapno from memory_failure 2018-01-23 12:17:42 -06:00
init-mm.c mm: Allocate the mm_cpumask (mm->cpu_bitmap[]) dynamically based on nr_cpu_ids 2018-07-17 09:35:30 +02:00
internal.h mm, compaction: capture a page under direct compaction 2019-03-05 21:07:17 -08:00
interval_tree.c mm/interval_tree.c: use vma_pages() helper 2018-01-31 17:18:37 -08:00
Kconfig ksm: replace jhash2 with xxhash 2018-12-28 12:11:46 -08:00
Kconfig.debug mm/page_owner: move config option to mm/Kconfig.debug 2019-03-05 21:07:18 -08:00
khugepaged.c mm: memcontrol: expose THP events on a per-memcg basis 2019-03-05 21:07:19 -08:00
kmemleak-test.c
kmemleak.c kmemleak: powerpc: skip scanning holes in the .bss section 2019-04-05 16:02:30 -10:00
ksm.c mm: ksm: do not block on page lock when searching stable tree 2019-03-05 21:07:19 -08:00
list_lru.c numa: make "nr_node_ids" unsigned int 2019-03-05 21:07:19 -08:00
maccess.c Revert "x86/fault: BUG() when uaccess helpers fault on kernel addresses" 2019-02-25 09:10:51 -08:00
madvise.c mm/mmu_notifier: use structure for invalidate_range_start/end calls v2 2018-12-28 12:11:50 -08:00
Makefile mm: remove nobootmem 2018-10-31 08:54:16 -07:00
memblock.c mm: memblock: update comments and kernel-doc 2019-03-12 10:04:02 -07:00
memcontrol.c mm: writeback: use exact memcg dirty counts 2019-04-05 16:02:31 -10:00
memfd.c mm/memfd: add an F_SEAL_FUTURE_WRITE seal to memfd 2019-03-05 21:07:19 -08:00
memory_hotplug.c mm/memory_hotplug.c: fix notification in offline error path 2019-03-29 10:01:37 -07:00
memory-failure.c mm: hwpoison: fix thp split handing in soft_offline_in_use_page() 2019-03-05 21:07:13 -08:00
memory.c mm/memory.c: fix modifying of page protection by insert_pfn() 2019-03-29 10:01:37 -07:00
mempolicy.c mm: mempolicy: make mbind() return -EIO when MPOL_MF_STRICT is specified 2019-03-29 10:01:37 -07:00
mempool.c docs/core-api/mm: fix return value descriptions in mm/ 2019-03-05 21:07:20 -08:00
memtest.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
migrate.c mm/migrate.c: add missing flush_dcache_page for non-mapped page migrate 2019-03-29 10:01:37 -07:00
mincore.c Revert "Change mincore() to count "mapped" pages rather than "cached" pages" 2019-01-24 09:04:37 +13:00
mlock.c mm: remove zone_lru_lock() function, access ->lru_lock directly 2019-03-05 21:07:21 -08:00
mm_init.c mm: convert totalram_pages and totalhigh_pages variables to atomic 2018-12-28 12:11:47 -08:00
mmap.c mm: fix some typos in mm directory 2019-03-05 21:07:18 -08:00
mmu_context.c sched/headers: Prepare to move the task_lock()/unlock() APIs to <linux/sched/task.h> 2017-03-02 08:42:38 +01:00
mmu_gather.c mm: Replace call_rcu_sched() with call_rcu() 2018-11-27 09:21:46 -08:00
mmu_notifier.c mm/mmu_notifier: use structure for invalidate_range_start/end calls v2 2018-12-28 12:11:50 -08:00
mmzone.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
mprotect.c mm: update ptep_modify_prot_commit to take old pte value as arg 2019-03-05 21:07:18 -08:00
mremap.c mm,mremap: bail out earlier in mremap_to under map pressure 2019-03-05 21:07:21 -08:00
msync.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
nommu.c mm/gup: cache dev_pagemap while pinning pages 2018-10-26 16:38:15 -07:00
oom_kill.c mm,oom: don't kill global init via memory.oom.group 2019-03-05 21:07:19 -08:00
page_alloc.c mm/hotplug: treat CMA pages as unmovable 2019-04-19 09:46:05 -07:00
page_counter.c memcg: introduce memory.min 2018-06-07 17:34:36 -07:00
page_ext.c memblock: drop memblock_alloc_*_nopanic() variants 2019-03-12 10:04:02 -07:00
page_idle.c mm: remove zone_lru_lock() function, access ->lru_lock directly 2019-03-05 21:07:21 -08:00
page_io.c mm/page_io.c: fix polled swap page in 2019-01-04 13:13:48 -08:00
page_isolation.c mm/page_isolation.c: fix a wrong flag in set_migratetype_isolate() 2019-03-29 10:01:37 -07:00
page_owner.c mm: no need to check return value of debugfs_create functions 2019-03-05 21:07:17 -08:00
page_poison.c page_poison: play nicely with KASAN 2019-03-05 21:07:13 -08:00
page_vma_mapped.c mm/rmap: map_pte() was not handling private ZONE_DEVICE page properly 2018-10-31 08:54:11 -07:00
page-writeback.c docs/core-api/mm: fix return value descriptions in mm/ 2019-03-05 21:07:20 -08:00
pagewalk.c mm: kernel-doc: add missing parameter descriptions 2018-04-05 21:36:27 -07:00
percpu-internal.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
percpu-km.c percpu: km: no need to consider pcpu_group_offsets[0] 2019-02-26 13:47:58 -08:00
percpu-stats.c treewide: Use array_size() in vmalloc() 2018-06-12 16:19:22 -07:00
percpu-vm.c percpu: allow select gfp to be passed to underlying allocators 2018-02-18 05:33:01 -08:00
percpu.c memblock: drop memblock_alloc_*_nopanic() variants 2019-03-12 10:04:02 -07:00
pgtable-generic.c x86/mm: Page size aware flush_tlb_mm_range() 2018-10-09 16:51:11 +02:00
process_vm_access.c mm: docs: add blank lines to silence sphinx "Unexpected indentation" errors 2018-02-06 18:32:48 -08:00
quicklist.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
readahead.c docs/core-api/mm: fix return value descriptions in mm/ 2019-03-05 21:07:20 -08:00
rmap.c mm: remove zone_lru_lock() function, access ->lru_lock directly 2019-03-05 21:07:21 -08:00
rodata_test.c mm: fix RODATA_TEST failure "rodata_test: test data was not read only" 2017-10-03 17:54:24 -07:00
shmem.c mm: swapoff: shmem_unuse() stop eviction without igrab() 2019-04-19 09:46:04 -07:00
slab_common.c mm: add support for kmem caches in DMA32 zone 2019-03-29 10:01:37 -07:00
slab.c slab: store tagged freelist for off-slab slabmgmt 2019-04-19 09:46:04 -07:00
slab.h mm: add support for kmem caches in DMA32 zone 2019-03-29 10:01:37 -07:00
slob.c slab: __GFP_ZERO is incompatible with a constructor 2018-06-07 17:34:34 -07:00
slub.c mm: add support for kmem caches in DMA32 zone 2019-03-29 10:01:37 -07:00
sparse-vmemmap.c mm: remove include/linux/bootmem.h 2018-10-31 08:54:16 -07:00
sparse.c mm/hotplug: fix offline undo_isolate_page_range() 2019-03-29 10:01:37 -07:00
swap_cgroup.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
swap_slots.c mm, swap, get_swap_pages: use entry_size instead of cluster in parameter 2018-08-22 10:52:44 -07:00
swap_state.c mm: swap: add comment for swap_vma_readahead 2019-03-05 21:07:16 -08:00
swap.c mm: remove zone_lru_lock() function, access ->lru_lock directly 2019-03-05 21:07:21 -08:00
swapfile.c mm: swapoff: shmem_unuse() stop eviction without igrab() 2019-04-19 09:46:04 -07:00
truncate.c docs/core-api/mm: fix return value descriptions in mm/ 2019-03-05 21:07:20 -08:00
usercopy.c mm/usercopy.c: no check page span for stack objects 2019-01-08 17:15:11 -08:00
userfaultfd.c hugetlbfs: revert "use i_mmap_rwsem for more pmd sharing synchronization" 2019-01-08 17:15:11 -08:00
util.c mm/util.c: fix strndup_user() comment 2019-04-05 16:02:31 -10:00
vmacache.c mm: get rid of vmacache_flush_all() entirely 2018-09-13 15:18:04 -10:00
vmalloc.c docs/core-api/mm: fix return value descriptions in mm/ 2019-03-05 21:07:20 -08:00
vmpressure.c mm/vmpressure.c: convert to use match_string() helper 2018-06-07 17:34:36 -07:00
vmscan.c mm: remove zone_lru_lock() function, access ->lru_lock directly 2019-03-05 21:07:21 -08:00
vmstat.c mm/vmstat.c: fix /proc/vmstat format for CONFIG_DEBUG_TLBFLUSH=y CONFIG_SMP=n 2019-04-19 09:46:04 -07:00
workingset.c mm/workingset: remove unused @mapping argument in workingset_eviction() 2019-03-05 21:07:21 -08:00
z3fold.c z3fold: fix possible reclaim races 2018-11-18 10:15:09 -08:00
zbud.c mm: docs: fix parameter names mismatch 2018-02-06 18:32:48 -08:00
zpool.c mm/zpool.c: zpool_evictable: fix mismatch in parameter name and kernel-doc 2018-02-21 15:35:43 -08:00
zsmalloc.c mm/zsmalloc.c: fix fall-through annotation 2018-10-26 16:26:35 -07:00
zswap.c mm: convert totalram_pages and totalhigh_pages variables to atomic 2018-12-28 12:11:47 -08:00