linux/mm
Xu Yu 478d134e95 mm/huge_memory: do not overkill when splitting huge_zero_page
Kernel panic when injecting memory_failure for the global huge_zero_page,
when CONFIG_DEBUG_VM is enabled, as follows.

  Injecting memory failure for pfn 0x109ff9 at process virtual address 0x20ff9000
  page:00000000fb053fc3 refcount:2 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x109e00
  head:00000000fb053fc3 order:9 compound_mapcount:0 compound_pincount:0
  flags: 0x17fffc000010001(locked|head|node=0|zone=2|lastcpupid=0x1ffff)
  raw: 017fffc000010001 0000000000000000 dead000000000122 0000000000000000
  raw: 0000000000000000 0000000000000000 00000002ffffffff 0000000000000000
  page dumped because: VM_BUG_ON_PAGE(is_huge_zero_page(head))
  ------------[ cut here ]------------
  kernel BUG at mm/huge_memory.c:2499!
  invalid opcode: 0000 [#1] PREEMPT SMP PTI
  CPU: 6 PID: 553 Comm: split_bug Not tainted 5.18.0-rc1+ #11
  Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 3288b3c 04/01/2014
  RIP: 0010:split_huge_page_to_list+0x66a/0x880
  Code: 84 9b fb ff ff 48 8b 7c 24 08 31 f6 e8 9f 5d 2a 00 b8 b8 02 00 00 e9 e8 fb ff ff 48 c7 c6 e8 47 3c 82 4c b
  RSP: 0018:ffffc90000dcbdf8 EFLAGS: 00010246
  RAX: 000000000000003c RBX: 0000000000000001 RCX: 0000000000000000
  RDX: 0000000000000000 RSI: ffffffff823e4c4f RDI: 00000000ffffffff
  RBP: ffff88843fffdb40 R08: 0000000000000000 R09: 00000000fffeffff
  R10: ffffc90000dcbc48 R11: ffffffff82d68448 R12: ffffea0004278000
  R13: ffffffff823c6203 R14: 0000000000109ff9 R15: ffffea000427fe40
  FS:  00007fc375a26740(0000) GS:ffff88842fd80000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007fc3757c9290 CR3: 0000000102174006 CR4: 00000000003706e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
  try_to_split_thp_page+0x3a/0x130
  memory_failure+0x128/0x800
  madvise_inject_error.cold+0x8b/0xa1
  __x64_sys_madvise+0x54/0x60
  do_syscall_64+0x35/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xae
  RIP: 0033:0x7fc3754f8bf9
  Code: 01 00 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 8
  RSP: 002b:00007ffeda93a1d8 EFLAGS: 00000217 ORIG_RAX: 000000000000001c
  RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fc3754f8bf9
  RDX: 0000000000000064 RSI: 0000000000003000 RDI: 0000000020ff9000
  RBP: 00007ffeda93a200 R08: 0000000000000000 R09: 0000000000000000
  R10: 00000000ffffffff R11: 0000000000000217 R12: 0000000000400490
  R13: 00007ffeda93a2e0 R14: 0000000000000000 R15: 0000000000000000

We think that raising BUG is overkilling for splitting huge_zero_page, the
huge_zero_page can't be met from normal paths other than memory failure,
but memory failure is a valid caller.  So we tend to replace the BUG to
WARN + returning -EBUSY, and thus the panic above won't happen again.

Link: https://lkml.kernel.org/r/f35f8b97377d5d3ede1bc5ac3114da888c57cbce.1651052574.git.xuyu@linux.alibaba.com
Fixes: d173d5417f ("mm/memory-failure.c: skip huge_zero_page in memory_failure()")
Fixes: 6a46079cf5 ("HWPOISON: The high level memory error handler in the VM v7")
Signed-off-by: Xu Yu <xuyu@linux.alibaba.com>
Suggested-by: Yang Shi <shy828301@gmail.com>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-28 23:14:43 -07:00
..
damon mm/damon: prevent activated scheme from sleeping by deactivated schemes 2022-04-01 11:46:09 -07:00
kasan kasan: prevent cpu_quarantine corruption when CPU offline and cache shrink occur at same time 2022-04-27 13:28:48 -07:00
kfence mm, kfence: support kmem_dump_obj() for KFENCE objects 2022-04-15 14:49:55 -07:00
backing-dev.c remove congestion tracking framework 2022-03-22 15:57:01 -07:00
balloon_compaction.c mm/balloon_compaction: make balloon page compaction callbacks static 2022-03-28 16:52:57 -04:00
bootmem_info.c bootmem: Use page->index instead of page->freelist 2022-01-06 12:27:03 +01:00
cma_debug.c mm/cma: change cma mutex to irq safe spinlock 2021-05-05 11:27:21 -07:00
cma_sysfs.c mm: cma: support sysfs 2021-05-05 11:27:24 -07:00
cma.c mm/cma: provide option to opt out from exposing pages on activation failure 2022-03-22 15:57:09 -07:00
cma.h mm/cma: provide option to opt out from exposing pages on activation failure 2022-03-22 15:57:09 -07:00
compaction.c mm: compaction: fix compiler warning when CONFIG_COMPACTION=n 2022-04-15 14:49:55 -07:00
debug_page_ref.c
debug_vm_pgtable.c mm/debug_vm_pgtable: remove pte entry from the page table 2022-02-04 09:25:04 -08:00
debug.c mm: unexport page_init_poison 2022-03-24 19:06:45 -07:00
dmapool.c mm/dmapool.c: revert "make dma pool to use kmalloc_node" 2022-01-15 16:30:28 +02:00
early_ioremap.c mm/early_ioremap: declare early_memremap_pgprot_adjust() 2022-03-22 15:57:11 -07:00
fadvise.c remove inode_congested() 2022-03-22 15:57:01 -07:00
failslab.c
filemap.c tmpfs: fix regressions from wider use of ZERO_PAGE 2022-04-15 14:49:54 -07:00
folio-compat.c mm/rmap: Convert rmap_walk() to take a folio 2022-03-21 13:01:35 -04:00
frontswap.c frontswap: remove support for multiple ops 2022-01-22 08:33:38 +02:00
gup_test.c selftests/vm: gup_test: test faulting in kernel, and verify pinnable pages 2021-05-05 11:27:26 -07:00
gup_test.h selftests/vm: gup_test: fix test flag 2021-05-05 11:27:26 -07:00
gup.c mm/munlock: add lru_add_drain() to fix memcg_stat_test 2022-04-01 11:46:09 -07:00
highmem.c highmem: fix checks in __kmap_local_sched_{in,out} 2022-04-08 14:20:36 -10:00
hmm.c mm/hmm.c: remove unneeded local variable ret 2022-03-22 15:57:12 -07:00
huge_memory.c mm/huge_memory: do not overkill when splitting huge_zero_page 2022-04-28 23:14:43 -07:00
hugetlb_cgroup.c hugetlb: add hugetlb.*.numa_stat file 2022-01-15 16:30:29 +02:00
hugetlb_vmemmap.c mm: hugetlb: replace hugetlb_free_vmemmap_enabled with a static_key 2022-03-22 15:57:08 -07:00
hugetlb_vmemmap.h mm: hugetlb: introduce nr_free_vmemmap_pages in the struct hstate 2021-06-30 20:47:25 -07:00
hugetlb.c mm/hwpoison: fix race between hugetlb free/demotion and memory_failure_hugetlb() 2022-04-21 20:01:09 -07:00
hwpoison-inject.c mm/hwpoison: avoid the impact of hwpoison_filter() return value on mce handler 2022-03-22 15:57:07 -07:00
init-mm.c kernel/fork: Initialize mm's PASID 2022-02-14 19:51:47 +01:00
internal.h mm/munlock: protect the per-CPU pagevec by a local_lock_t 2022-04-01 11:46:09 -07:00
interval_tree.c
io-mapping.c
ioremap.c mm: move ioremap_page_range to vmalloc.c 2021-09-08 11:50:24 -07:00
Kconfig mm: generalize ARCH_HAS_FILTER_PGPROT 2022-03-24 19:06:51 -07:00
Kconfig.debug mm: page table check 2022-01-15 16:30:28 +02:00
khugepaged.c mm/khugepaged: remove reuse_swap_page() usage 2022-03-24 19:06:51 -07:00
kmemleak.c mm: kmemleak: take a full lowmem check in kmemleak_*_phys() 2022-04-15 14:49:56 -07:00
ksm.c Folio changes for 5.18 2022-03-22 17:03:12 -07:00
list_lru.c mm/list_lru.c: revert "mm/list_lru: optimize memcg_reparent_list_lru_node()" 2022-04-08 14:20:36 -10:00
maccess.c asm-generic updates for 5.18 2022-03-23 18:03:08 -07:00
madvise.c Revert "mm: madvise: skip unmapped vma holes passed to process_madvise" 2022-04-01 11:46:09 -07:00
Makefile mm: move the migrate_vma_* device migration code into its own file 2022-03-03 12:47:33 -05:00
mapping_dirty_helpers.c mm: move tlb_flush_pending inline helpers to mm_inline.h 2022-01-15 16:30:27 +02:00
memblock.c memblock: test suite and a small cleanup 2022-03-27 13:36:06 -07:00
memcontrol.c memcg: sync flush only if periodic flush is delayed 2022-04-21 20:01:09 -07:00
memfd.c memfd: fix F_SEAL_WRITE after shmem huge page allocated 2022-03-05 11:08:32 -08:00
memory_hotplug.c Folio changes for 5.18 2022-03-22 17:03:12 -07:00
memory-failure.c Revert "mm/memory-failure.c: skip huge_zero_page in memory_failure()" 2022-04-28 23:14:43 -07:00
memory.c mm,hwpoison: unmap poisoned page before invalidation 2022-04-01 11:46:09 -07:00
mempolicy.c Merge branch 'akpm' (patches from Andrew) 2022-04-08 14:31:41 -10:00
mempool.c mm: remove spurious blkdev.h includes 2021-10-18 06:17:01 -06:00
memremap.c mm: delete __ClearPageWaiters() 2022-03-24 19:06:45 -07:00
memtest.c
migrate_device.c mm/migrate: Convert remove_migration_ptes() to folios 2022-03-21 13:01:35 -04:00
migrate.c mm: migrate: use thp_order instead of HPAGE_PMD_ORDER for new page allocation. 2022-04-08 14:20:36 -10:00
mincore.c
mlock.c mm/munlock: protect the per-CPU pagevec by a local_lock_t 2022-04-01 11:46:09 -07:00
mm_init.c include/linux/page-flags-layout.h: cleanups 2021-04-30 11:20:42 -07:00
mmap_lock.c mm: mmap_lock: fix disabling preemption directly 2021-07-23 17:43:28 -07:00
mmap.c mm, hugetlb: allow for "high" userspace addresses 2022-04-21 20:01:09 -07:00
mmu_gather.c mm: move tlb_flush_pending inline helpers to mm_inline.h 2022-01-15 16:30:27 +02:00
mmu_notifier.c mm/mmu_notifier.c: fix race in mmu_interval_notifier_remove() 2022-04-21 20:01:10 -07:00
mmzone.c Folio changes for 5.18 2022-03-22 17:03:12 -07:00
mprotect.c memory tiering: skip to scan fast memory 2022-03-22 15:57:09 -07:00
mremap.c mmmremap.c: avoid pointless invalidate_range_start/end on mremap(old_size=0) 2022-04-08 14:20:36 -10:00
msync.c
nommu.c no-MMU: expose vmalloc_huge() for alloc_large_system_hash() 2022-04-25 10:11:49 -07:00
oom_kill.c oom_kill.c: futex: delay the OOM reaper to allow time for proper futex cleanup 2022-04-21 20:01:10 -07:00
page_alloc.c page_alloc: use vmalloc_huge for large system hash 2022-04-24 10:00:54 -07:00
page_counter.c mm/page_counter: remove an incorrect call to propagate_protected_usage() 2022-01-15 16:30:27 +02:00
page_ext.c mm: make some vars and functions static or __init 2022-01-15 16:30:31 +02:00
page_idle.c mm/rmap: Constify the rmap_walk_control argument 2022-03-21 13:01:35 -04:00
page_io.c mm: fix unexpected zeroed page mapping with zram swap 2022-04-15 14:49:55 -07:00
page_isolation.c Revert "mm/page_isolation: unset migratetype directly for non Buddy page" 2022-02-04 09:25:04 -08:00
page_owner.c mm/page_owner.c: record tgid 2022-03-24 19:06:44 -07:00
page_poison.c
page_reporting.c mm/page_reporting: allow driver to specify reporting order 2021-06-29 10:53:47 -07:00
page_reporting.h mm/page_reporting: export reporting order as module parameter 2021-06-29 10:53:47 -07:00
page_table_check.c mm/page_table_check.c: use strtobool for param parsing 2022-03-22 15:57:11 -07:00
page_vma_mapped.c mm/rmap: Fix handling of hugetlbfs pages in page_vma_mapped_walk 2022-04-07 10:11:20 -04:00
page-writeback.c mm: warn on deleting redirtied only if accounted 2022-03-24 19:06:51 -07:00
pagewalk.c mm: pagewalk: fix walk for hugepage tables 2021-06-29 10:53:49 -07:00
percpu-internal.h mm: memcg/percpu: account extra objcg space to memory cgroups 2022-01-15 16:30:31 +02:00
percpu-km.c percpu: flush tlb in pcpu_reclaim_populated() 2021-07-04 18:30:17 +00:00
percpu-stats.c mm: use vmalloc_array and vcalloc for array allocations 2022-03-08 09:30:46 -05:00
percpu-vm.c percpu: flush tlb in pcpu_reclaim_populated() 2021-07-04 18:30:17 +00:00
percpu.c bitmap patches for 5.17-rc1 2022-01-23 06:20:44 +02:00
pgalloc-track.h mm: fix typos in comments 2021-05-07 00:26:35 -07:00
pgtable-generic.c mm: move tlb_flush_pending inline helpers to mm_inline.h 2022-01-15 16:30:27 +02:00
process_vm_access.c mm/process_vm_access.c: remove duplicate include 2021-05-05 11:27:27 -07:00
ptdump.c mm: sparsemem: use page table lock to protect kernel pmd operations 2022-03-22 15:57:08 -07:00
readahead.c readahead: Update comments 2022-04-01 14:40:42 -04:00
rmap.c mm/munlock: protect the per-CPU pagevec by a local_lock_t 2022-04-01 11:46:09 -07:00
rodata_test.c
secretmem.c mm/secretmem: fix panic when growing a memfd_secret 2022-04-15 14:49:54 -07:00
shmem.c tmpfs: fix regressions from wider use of ZERO_PAGE 2022-04-15 14:49:54 -07:00
shuffle.c
shuffle.h mm/shuffle: fix section mismatch warning 2021-05-22 15:09:07 -10:00
slab_common.c mm, kfence: support kmem_dump_obj() for KFENCE objects 2022-04-15 14:49:55 -07:00
slab.c mm, kfence: support kmem_dump_obj() for KFENCE objects 2022-04-15 14:49:55 -07:00
slab.h mm, kfence: support kmem_dump_obj() for KFENCE objects 2022-04-15 14:49:55 -07:00
slob.c mm, kfence: support kmem_dump_obj() for KFENCE objects 2022-04-15 14:49:55 -07:00
slub.c mm, kfence: support kmem_dump_obj() for KFENCE objects 2022-04-15 14:49:55 -07:00
sparse-vmemmap.c mm: sparsemem: move vmemmap related to HugeTLB to CONFIG_HUGETLB_PAGE_FREE_VMEMMAP 2022-03-22 15:57:08 -07:00
sparse.c mm/sparse: make mminit_validate_memmodel_limits() static 2022-03-22 15:57:05 -07:00
swap_cgroup.c mm: use vmalloc_array and vcalloc for array allocations 2022-03-08 09:30:46 -05:00
swap_slots.c treewide: Add missing includes masked by cgroup -> bpf dependency 2021-12-03 10:58:13 -08:00
swap_state.c Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
swap.c mm/munlock: protect the per-CPU pagevec by a local_lock_t 2022-04-01 11:46:09 -07:00
swapfile.c mm/swapfile: remove stale reuse_swap_page() 2022-03-24 19:06:51 -07:00
truncate.c Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
usercopy.c Merge branch 'akpm' (patches from Andrew) 2022-03-22 16:11:53 -07:00
userfaultfd.c userfaultfd: mark uffd_wp regardless of VM_WRITE flag 2022-04-21 20:01:09 -07:00
util.c kvmalloc: use vmalloc_huge for vmalloc allocations 2022-04-24 10:05:38 -07:00
vmacache.c
vmalloc.c mm/vmalloc: huge vmalloc backing pages should be split rather than compound 2022-04-22 09:20:16 -07:00
vmpressure.c mm/vmpressure: fix data-race with memcg->socket_pressure 2021-11-06 13:30:40 -07:00
vmscan.c Folio changes for 5.18 2022-03-22 17:03:12 -07:00
vmstat.c mm: only re-generate demotion targets when a numa node changes its N_CPU state 2022-03-22 15:57:11 -07:00
workingset.c memcg: sync flush only if periodic flush is delayed 2022-04-21 20:01:09 -07:00
z3fold.c mm/z3fold: add kerneldoc fields for z3fold_pool 2021-07-01 11:06:03 -07:00
zbud.c mm/zbud: add kerneldoc fields for zbud_pool 2021-07-01 11:06:03 -07:00
zpool.c zpool: remove the list of pools_head 2022-01-15 16:30:31 +02:00
zsmalloc.c zsmalloc: replace get_cpu_var with local_lock 2022-01-22 08:33:37 +02:00
zswap.c mm/zswap.c: allow handling just same-value filled pages 2022-03-22 15:57:11 -07:00