linux/mm
Michal Hocko 2d070eab2e mm: consider zone which is not fully populated to have holes
__pageblock_pfn_to_page has two users currently, set_zone_contiguous
which checks whether the given zone contains holes and
pageblock_pfn_to_page which then carefully returns a first valid page
from the given pfn range for the given zone.  This doesn't handle zones
which are not fully populated though.  Memory pageblocks can be offlined
or might not have been onlined yet.  In such a case the zone should be
considered to have holes otherwise pfn walkers can touch and play with
offline pages.

Current callers of pageblock_pfn_to_page in compaction seem to work
properly right now because they only isolate PageBuddy
(isolate_freepages_block) or PageLRU resp.  __PageMovable
(isolate_migratepages_block) which will be always false for these pages.
It would be safer to skip these pages altogether, though.

In order to do this patch adds a new memory section state
(SECTION_IS_ONLINE) which is set in memory_present (during boot time) or
in online_pages_range during the memory hotplug.  Similarly
offline_mem_sections clears the bit and it is called when the memory
range is offlined.

pfn_to_online_page helper is then added which check the mem section and
only returns a page if it is onlined already.

Use the new helper in __pageblock_pfn_to_page and skip the whole page
block in such a case.

[mhocko@suse.com: check valid section number in pfn_to_online_page (Vlastimil),
 mark sections online after all struct pages are initialized in
 online_pages_range (Vlastimil)]
  Link: http://lkml.kernel.org/r/20170518164210.GD18333@dhcp22.suse.cz
Link: http://lkml.kernel.org/r/20170515085827.16474-8-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Daniel Kiper <daniel.kiper@oracle.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Reza Arbab <arbab@linux.vnet.ibm.com>
Cc: Tobias Regnery <tobias.regnery@gmail.com>
Cc: Toshi Kani <toshi.kani@hpe.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-06 16:24:32 -07:00
..
kasan Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-05-10 10:30:46 -07:00
backing-dev.c bdi: Drop 'parent' argument from bdi_register[_va]() 2017-04-20 12:09:55 -06:00
balloon_compaction.c mm: balloon: use general non-lru movable page feature 2016-07-26 16:19:19 -07:00
bootmem.c mm/bootmem.c: cosmetic improvement of code readability 2017-02-22 16:41:29 -08:00
cleancache.c fs: switch ->s_uuid to uuid_t 2017-06-05 16:59:12 +02:00
cma_debug.c cma: Store a name in the cma structure 2017-04-18 20:41:12 +02:00
cma.c cma: Introduce cma_for_each_area 2017-04-18 20:41:12 +02:00
cma.h cma: Store a name in the cma structure 2017-04-18 20:41:12 +02:00
compaction.c mm, compaction: finish whole pageblock to reduce fragmentation 2017-05-08 17:15:10 -07:00
debug_page_ref.c mm/page_ref: add tracepoint to track down page reference manipulation 2016-03-17 15:09:34 -07:00
debug.c mm, debug: print raw struct page data in __dump_page() 2016-12-12 18:55:08 -08:00
dmapool.c lib/vsprintf.c: remove %Z support 2017-02-27 18:43:47 -08:00
early_ioremap.c mm/early_ioremap: use offset_in_page macro 2015-11-05 19:34:48 -08:00
fadvise.c mm: fadvise: avoid expensive remote LRU cache draining after FADV_DONTNEED 2016-12-20 09:48:46 -08:00
failslab.c mm: fault-inject take over bootstrap kmem_cache check 2016-03-15 16:55:16 -07:00
filemap.c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-07-03 13:08:04 -07:00
frame_vector.c treewide: use kv[mz]alloc* rather than opencoded variants 2017-05-08 17:15:13 -07:00
frontswap.c mm, frontswap: convert frontswap_enabled to static key 2016-07-26 16:19:19 -07:00
gup.c Merge branch 'linus' into x86/mm, to pick up fixes 2017-06-22 10:57:28 +02:00
highmem.c mm/highmem: make nr_free_highpages() handles all highmem zones by itself 2016-05-19 19:12:14 -07:00
huge_memory.c mm, THP, swap: check whether THP can be split firstly 2017-07-06 16:24:31 -07:00
hugetlb_cgroup.c mm, hugetlb_cgroup: round limit_in_bytes down to hugepage size 2016-05-20 17:58:30 -07:00
hugetlb.c mm/hugetlb: report -EHWPOISON not -EFAULT when FOLL_HWPOISON is specified 2017-06-02 15:07:38 -07:00
hwpoison-inject.c mm: hwpoison: call shake_page() unconditionally 2017-05-03 15:52:12 -07:00
init-mm.c mm: Add a user_ns owner to mm_struct and fix ptrace permission checks 2016-11-22 11:49:48 -06:00
internal.h mm, compaction: finish whole pageblock to reduce fragmentation 2017-05-08 17:15:10 -07:00
interval_tree.c
Kconfig mm, THP, swap: delay splitting THP during swap out 2017-07-06 16:24:31 -07:00
Kconfig.debug mm: enable page poisoning early at boot 2017-05-03 15:52:10 -07:00
khugepaged.c mm, thp: remove cond_resched from __collapse_huge_page_copy 2017-06-23 16:15:55 -07:00
kmemcheck.c mm: Rename SLAB_DESTROY_BY_RCU to SLAB_TYPESAFE_BY_RCU 2017-04-18 11:42:36 -07:00
kmemleak-test.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
kmemleak.c mm: fix section name for .data..ro_after_init 2017-03-31 17:13:30 -07:00
ksm.c ksm: optimize refile of stable_node_dup at the head of the chain 2017-07-06 16:24:31 -07:00
list_lru.c mm/list_lru.c: avoid error-path NULL pointer deref 2016-10-27 18:43:42 -07:00
maccess.c x86: remove more uaccess_32.h complexity 2016-05-22 17:21:27 -07:00
madvise.c mm/madvise: move up the behavior parameter validation 2017-05-03 15:52:11 -07:00
Makefile percpu: expose statistics about percpu memory via debugfs 2017-06-20 15:31:38 -04:00
memblock.c mm: consider memblock reservations for deferred memory initialization sizing 2017-06-02 15:07:38 -07:00
memcontrol.c mm, THP, swap: delay splitting THP during swap out 2017-07-06 16:24:31 -07:00
memory_hotplug.c mm: consider zone which is not fully populated to have holes 2017-07-06 16:24:32 -07:00
memory-failure.c mm/memory-failure.c: use compound_head() flags for huge pages 2017-06-17 06:37:05 +09:00
memory.c mm: larger stack guard gap, between vmas 2017-06-19 21:50:20 +08:00
mempolicy.c mm/mempolicy.c: fix error handling in set_mempolicy and mbind. 2017-04-08 10:57:55 -07:00
mempool.c sched/wait: Rename wait_queue_t => wait_queue_entry_t 2017-06-20 12:18:27 +02:00
memtest.c memtest: remove unused header files 2015-09-08 15:35:28 -07:00
migrate.c mm: make rmap_one boolean function 2017-05-03 15:52:10 -07:00
mincore.c mm: remove shmem_mapping() shmem_zero_setup() duplicates 2017-02-24 17:46:56 -08:00
mlock.c mlock: fix mlock count can not decrease in race condition 2017-06-02 15:07:38 -07:00
mm_init.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
mmap.c mm/mmap.c: mark protection_map as __ro_after_init 2017-07-06 16:24:31 -07:00
mmu_context.c sched/headers: Prepare to move the task_lock()/unlock() APIs to <linux/sched/task.h> 2017-03-02 08:42:38 +01:00
mmu_notifier.c mm: Use static initialization for "srcu" 2017-04-18 11:38:22 -07:00
mmzone.c mm/mmzone.c: swap likely to unlikely as code logic is different for next_zones_zonelist() 2017-02-22 16:41:29 -08:00
mprotect.c mm: convert generic code to 5-level paging 2017-03-09 11:48:47 -08:00
mremap.c mm: convert generic code to 5-level paging 2017-03-09 11:48:47 -08:00
msync.c mm/msync: use offset_in_page macro 2015-11-05 19:34:48 -08:00
nobootmem.c mm/nobootmem.c: return 0 when start_pfn equals end_pfn 2017-07-06 16:24:31 -07:00
nommu.c mm, vmalloc: use __GFP_HIGHMEM implicitly 2017-05-08 17:15:13 -07:00
oom_kill.c oom: improve oom disable handling 2017-05-03 15:52:10 -07:00
page_alloc.c mm: consider zone which is not fully populated to have holes 2017-07-06 16:24:32 -07:00
page_counter.c mm: page_counter: let page_counter_try_charge() return bool 2015-11-05 19:34:48 -08:00
page_ext.c mm: enable page poisoning early at boot 2017-05-03 15:52:10 -07:00
page_idle.c mm: make rmap_one boolean function 2017-05-03 15:52:10 -07:00
page_io.c block: switch bios to blk_status_t 2017-06-09 09:27:32 -06:00
page_isolation.c mm, page_alloc: count movable pages when stealing from pageblock 2017-05-08 17:15:10 -07:00
page_owner.c mm/page_owner: don't define fields on struct page_ext by hard-coding 2016-10-07 18:46:27 -07:00
page_poison.c mm: enable page poisoning early at boot 2017-05-03 15:52:10 -07:00
page_vma_mapped.c mm: fix page_vma_mapped_walk() for ksm pages 2017-04-08 00:47:48 -07:00
page-writeback.c Add GETFSMAP support; some performance improvements for very large 2017-05-08 11:30:05 -07:00
pagewalk.c mm: convert generic code to 5-level paging 2017-03-09 11:48:47 -08:00
percpu-internal.h percpu: fix early calls for spinlock in pcpu_stats 2017-06-21 13:53:52 -04:00
percpu-km.c percpu: fix static checker warnings in pcpu_destroy_chunk 2017-06-29 11:23:38 -04:00
percpu-stats.c percpu: expose statistics about percpu memory via debugfs 2017-06-20 15:31:38 -04:00
percpu-vm.c percpu: fix static checker warnings in pcpu_destroy_chunk 2017-06-29 11:23:38 -04:00
percpu.c percpu: resolve err may not be initialized in pcpu_alloc 2017-06-21 12:00:45 -04:00
pgtable-generic.c mm: convert generic code to 5-level paging 2017-03-09 11:48:47 -08:00
process_vm_access.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/mm.h> 2017-03-02 08:42:28 +01:00
quicklist.c fix Christoph's email addresses 2016-03-17 15:09:34 -07:00
readahead.c mm: don't cap request size based on read-ahead setting 2016-12-12 18:55:08 -08:00
rmap.c mm, x86/mm: Make the batched unmap TLB flush API more generic 2017-05-24 10:18:27 +02:00
rodata_test.c mm: remove rodata_test_data export, add pr_fmt 2017-05-03 15:52:09 -07:00
shmem.c mm, THP, swap: unify swap slot free functions to put_swap_page 2017-07-06 16:24:31 -07:00
slab_common.c mm: allow slab_nomerge to be set at build time 2017-07-06 16:24:31 -07:00
slab.c mm/slab.c: replace open-coded round-up code with ALIGN 2017-07-06 16:24:30 -07:00
slab.h mm: Rename SLAB_DESTROY_BY_RCU to SLAB_TYPESAFE_BY_RCU 2017-04-18 11:42:36 -07:00
slob.c mm: Rename SLAB_DESTROY_BY_RCU to SLAB_TYPESAFE_BY_RCU 2017-04-18 11:42:36 -07:00
slub.c mm/slub.c: wrap kmem_cache->cpu_partial in config CONFIG_SLUB_CPU_PARTIAL 2017-07-06 16:24:30 -07:00
sparse-vmemmap.c mm: convert generic code to 5-level paging 2017-03-09 11:48:47 -08:00
sparse.c mm: consider zone which is not fully populated to have holes 2017-07-06 16:24:32 -07:00
swap_cgroup.c mm, THP, swap: delay splitting THP during swap out 2017-07-06 16:24:31 -07:00
swap_slots.c mm, THP, swap: delay splitting THP during swap out 2017-07-06 16:24:31 -07:00
swap_state.c mm, THP, swap: move anonymous THP split logic to vmscan 2017-07-06 16:24:31 -07:00
swap.c mm: move MADV_FREE pages into LRU_INACTIVE_FILE list 2017-05-03 15:52:08 -07:00
swapfile.c mm, THP, swap: unify swap slot free functions to put_swap_page 2017-07-06 16:24:31 -07:00
truncate.c mm: fix data corruption due to stale mmap reads 2017-05-12 15:57:15 -07:00
usercopy.c mm/usercopy: Drop extra is_vmalloc_or_module() check 2017-04-05 12:30:18 -07:00
userfaultfd.c mm: convert generic code to 5-level paging 2017-03-09 11:48:47 -08:00
util.c mm: clarify why we want kmalloc before falling backto vmallock 2017-06-02 15:07:37 -07:00
vmacache.c sched/headers: Prepare to move 'init_task' and 'init_thread_union' from <linux/sched.h> to <linux/sched/task.h> 2017-03-02 08:42:38 +01:00
vmalloc.c mm/vmalloc.c: huge-vmap: fail gracefully on unexpected huge vmap mappings 2017-06-23 16:15:55 -07:00
vmpressure.c mm: correct the comment when reclaimed pages exceed the scanned pages 2017-06-17 06:37:05 +09:00
vmscan.c mm, THP, swap: enable THP swap optimization only if has compound map 2017-07-06 16:24:31 -07:00
vmstat.c mm/vmstat.c: standardize file operations variable names 2017-07-06 16:24:31 -07:00
workingset.c mm: memcontrol: use node page state naming scheme for memcg 2017-05-03 15:52:11 -07:00
z3fold.c z3fold: fix page locking in z3fold_alloc() 2017-04-13 18:24:20 -07:00
zbud.c mm/zbud.c: use list_last_entry() instead of list_tail_entry() 2016-01-15 11:40:52 -08:00
zpool.c mm: zsmalloc: constify struct zs_pool name 2015-11-06 17:50:42 -08:00
zsmalloc.c zsmalloc: expand class bit 2017-04-13 18:24:21 -07:00
zswap.c zswap: don't param_set_charp while holding spinlock 2017-02-27 18:43:45 -08:00