Go to file
Baolin Wang 9128bfbc5c mm: migrate: fix getting incorrect page mapping during page migration
[ Upstream commit d1adb25df7 ]

When running stress-ng testing, we found below kernel crash after a few hours:

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
pc : dentry_name+0xd8/0x224
lr : pointer+0x22c/0x370
sp : ffff800025f134c0
......
Call trace:
  dentry_name+0xd8/0x224
  pointer+0x22c/0x370
  vsnprintf+0x1ec/0x730
  vscnprintf+0x2c/0x60
  vprintk_store+0x70/0x234
  vprintk_emit+0xe0/0x24c
  vprintk_default+0x3c/0x44
  vprintk_func+0x84/0x2d0
  printk+0x64/0x88
  __dump_page+0x52c/0x530
  dump_page+0x14/0x20
  set_migratetype_isolate+0x110/0x224
  start_isolate_page_range+0xc4/0x20c
  offline_pages+0x124/0x474
  memory_block_offline+0x44/0xf4
  memory_subsys_offline+0x3c/0x70
  device_offline+0xf0/0x120
  ......

After analyzing the vmcore, I found this issue is caused by page migration.
The scenario is that, one thread is doing page migration, and we will use the
target page's ->mapping field to save 'anon_vma' pointer between page unmap and
page move, and now the target page is locked and refcount is 1.

Currently, there is another stress-ng thread performing memory hotplug,
attempting to offline the target page that is being migrated. It discovers that
the refcount of this target page is 1, preventing the offline operation, thus
proceeding to dump the page. However, page_mapping() of the target page may
return an incorrect file mapping to crash the system in dump_mapping(), since
the target page->mapping only saves 'anon_vma' pointer without setting
PAGE_MAPPING_ANON flag.

There are seveval ways to fix this issue:
(1) Setting the PAGE_MAPPING_ANON flag for target page's ->mapping when saving
'anon_vma', but this can confuse PageAnon() for PFN walkers, since the target
page has not built mappings yet.
(2) Getting the page lock to call page_mapping() in __dump_page() to avoid crashing
the system, however, there are still some PFN walkers that call page_mapping()
without holding the page lock, such as compaction.
(3) Using target page->private field to save the 'anon_vma' pointer and 2 bits
page state, just as page->mapping records an anonymous page, which can remove
the page_mapping() impact for PFN walkers and also seems a simple way.

So I choose option 3 to fix this issue, and this can also fix other potential
issues for PFN walkers, such as compaction.

Link: https://lkml.kernel.org/r/e60b17a88afc38cb32f84c3e30837ec70b343d2b.1702641709.git.baolin.wang@linux.alibaba.com
Fixes: 64c8902ed4 ("migrate_pages: split unmap_and_move() to _unmap() and _move()")
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Xu Yu <xuyu@linux.alibaba.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-31 16:19:10 -08:00
arch LoongArch/smp: Call rcutree_report_cpu_starting() earlier 2024-01-31 16:18:57 -08:00
block block: ensure we hold a queue reference when using queue limits 2024-01-25 15:35:57 -08:00
certs certs: Reference revocation list for all keyrings 2023-08-17 20:12:41 +00:00
crypto crypto: api - Disallow identical driver names 2024-01-31 16:18:49 -08:00
Documentation drm: Allow drivers to indicate the damage helpers to ignore damage clips 2024-01-31 16:19:08 -08:00
drivers thermal: gov_power_allocator: avoid inability to reset a cdev 2024-01-31 16:19:10 -08:00
fs pipe: wakeup wr_wait after setting max_usage 2024-01-31 16:19:09 -08:00
include media: v4l2-cci: Add support for little-endian encoded registers 2024-01-31 16:19:10 -08:00
init rootfs: Fix support for rootfstype= when root= is given 2024-01-25 15:35:46 -08:00
io_uring io_uring: adjust defer tw counting 2024-01-25 15:36:00 -08:00
ipc Add x86 shadow stack support 2023-08-31 12:20:12 -07:00
kernel bpf: Add bpf_sock_addr_set_sun_path() to allow writing unix sockaddr from bpf 2024-01-31 16:19:04 -08:00
lib crypto: lib/mpi - Fix unexpected pointer access in mpi_ec_init 2024-01-31 16:18:49 -08:00
LICENSES LICENSES: Add the copyleft-next-0.3.1 license 2022-11-08 15:44:01 +01:00
mm mm: migrate: fix getting incorrect page mapping during page migration 2024-01-31 16:19:10 -08:00
net net/bpf: Avoid unused "sin_addr_len" warning when CONFIG_CGROUP_BPF is not set 2024-01-31 16:19:09 -08:00
rust rust: Ignore preserve-most functions 2024-01-25 15:35:41 -08:00
samples vfio/mtty: Overhaul mtty interrupt handling 2024-01-10 17:16:55 +01:00
scripts scripts/get_abi: fix source path leak 2024-01-31 16:18:55 -08:00
security lsm: new security_file_ioctl_compat() hook 2024-01-31 16:18:54 -08:00
sound soundwire: fix initializing sysfs for same devices on different buses 2024-01-31 16:18:47 -08:00
tools selftests: bonding: do not test arp/ns target with mode balance-alb/tlb 2024-01-31 16:19:05 -08:00
usr initramfs: Encode dependency on KBUILD_BUILD_TIMESTAMP 2023-06-06 17:54:49 +09:00
virt ARM: 2023-09-07 13:52:20 -07:00
.clang-format iommu: Add for_each_group_device() 2023-05-23 08:15:51 +02:00
.cocciconfig
.get_maintainer.ignore get_maintainer: add Alan to .get_maintainer.ignore 2022-08-20 15:17:44 -07:00
.gitattributes .gitattributes: set diff driver for Rust source code files 2023-05-31 17:48:25 +02:00
.gitignore kbuild: rpm-pkg: rename binkernel.spec to kernel.spec 2023-07-25 00:59:33 +09:00
.mailmap 20 hotfixes. 12 are cc:stable and the remainder address post-6.5 issues 2023-10-24 09:52:16 -10:00
.rustfmt.toml rust: add .rustfmt.toml 2022-09-28 09:02:20 +02:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS USB: Remove Wireless USB and UWB documentation 2023-08-09 14:17:32 +02:00
Kbuild Kbuild updates for v6.1 2022-10-10 12:00:45 -07:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS Char/Misc driver fixes for 6.6-final 2023-10-28 07:51:27 -10:00
Makefile Linux 6.6.14 2024-01-25 15:36:01 -08:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.