Go to file
Ilya Lipnitskiy e720e7d0e9 mm: fix race by making init_zero_pfn() early_initcall
There are code paths that rely on zero_pfn to be fully initialized
before core_initcall.  For example, wq_sysfs_init() is a core_initcall
function that eventually results in a call to kernel_execve, which
causes a page fault with a subsequent mmput.  If zero_pfn is not
initialized by then it may not get cleaned up properly and result in an
error:

  BUG: Bad rss-counter state mm:(ptrval) type:MM_ANONPAGES val:1

Here is an analysis of the race as seen on a MIPS device. On this
particular MT7621 device (Ubiquiti ER-X), zero_pfn is PFN 0 until
initialized, at which point it becomes PFN 5120:

  1. wq_sysfs_init calls into kobject_uevent_env at core_initcall:
       kobject_uevent_env+0x7e4/0x7ec
       kset_register+0x68/0x88
       bus_register+0xdc/0x34c
       subsys_virtual_register+0x34/0x78
       wq_sysfs_init+0x1c/0x4c
       do_one_initcall+0x50/0x1a8
       kernel_init_freeable+0x230/0x2c8
       kernel_init+0x10/0x100
       ret_from_kernel_thread+0x14/0x1c

  2. kobject_uevent_env() calls call_usermodehelper_exec() which executes
     kernel_execve asynchronously.

  3. Memory allocations in kernel_execve cause a page fault, bumping the
     MM reference counter:
       add_mm_counter_fast+0xb4/0xc0
       handle_mm_fault+0x6e4/0xea0
       __get_user_pages.part.78+0x190/0x37c
       __get_user_pages_remote+0x128/0x360
       get_arg_page+0x34/0xa0
       copy_string_kernel+0x194/0x2a4
       kernel_execve+0x11c/0x298
       call_usermodehelper_exec_async+0x114/0x194

  4. In case zero_pfn has not been initialized yet, zap_pte_range does
     not decrement the MM_ANONPAGES RSS counter and the BUG message is
     triggered shortly afterwards when __mmdrop checks the ref counters:
       __mmdrop+0x98/0x1d0
       free_bprm+0x44/0x118
       kernel_execve+0x160/0x1d8
       call_usermodehelper_exec_async+0x114/0x194
       ret_from_kernel_thread+0x14/0x1c

To avoid races such as described above, initialize init_zero_pfn at
early_initcall level.  Depending on the architecture, ZERO_PAGE is
either constant or gets initialized even earlier, at paging_init, so
there is no issue with initializing zero_pfn earlier.

Link: https://lkml.kernel.org/r/CALCv0x2YqOXEAy2Q=hafjhHCtTHVodChv1qpM=niAXOpqEbt7w@mail.gmail.com
Signed-off-by: Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: stable@vger.kernel.org
Tested-by: 周琰杰 (Zhou Yanjie) <zhouyanjie@wanyeetech.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-03-30 09:46:12 -07:00
arch - fixed compile error with option MIPS_ELF_APPENDED_DTB 2021-03-30 08:57:50 -07:00
block block: don't create too many partitions 2021-03-27 09:22:18 -06:00
certs certs: Replace K{U,G}IDT_INIT() with GLOBAL_ROOT_{U,G}ID 2021-01-21 16:16:10 +00:00
crypto crypto: mips/poly1305 - enable for all MIPS processors 2021-03-08 11:52:17 +01:00
Documentation arm64 fixes for -rc5 2021-03-25 11:07:40 -07:00
drivers xen: branch for v5.12-rc6 2021-03-30 08:55:47 -07:00
fs 5 cifs/smb3 fixes, 2 for stable, includes an important fix for encryption and an ACL fix, as well as a fix for possible reflink data corruption 2021-03-28 12:06:21 -07:00
include Fix the non-debug mutex_lock_io_nested() method to map to mutex_lock_io() instead of mutex_lock(). 2021-03-28 12:12:22 -07:00
init Merge branch 'akpm' (patches from Andrew) 2021-03-14 12:23:34 -07:00
ipc fs: make helpers idmap mount aware 2021-01-24 14:27:20 +01:00
kernel io_uring-5.12-2021-03-27 2021-03-28 11:42:05 -07:00
lib Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-03-24 18:16:04 -07:00
LICENSES LICENSES: Add the CC-BY-4.0 license 2020-12-08 10:33:27 -07:00
mm mm: fix race by making init_zero_pfn() early_initcall 2021-03-30 09:46:12 -07:00
net Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-03-24 18:16:04 -07:00
samples Merge git://git.kernel.org:/pub/scm/linux/kernel/git/netdev/net 2021-03-09 17:15:56 -08:00
scripts kbuild: fix ld-version.sh to not be affected by locale 2021-03-13 11:12:13 +09:00
security integrity-v5.12-fix 2021-03-25 16:46:43 -07:00
sound sound fixes for 5.12-rc4 2021-03-19 09:53:32 -07:00
tools Some more perf tools fixes for v5.12: 2021-03-28 13:22:54 -07:00
usr Kbuild updates for v5.12 2021-02-25 10:17:31 -08:00
virt KVM: x86/mmu: Consider the hva in mmu_notifier retry 2021-02-22 13:16:53 -05:00
.clang-format cxl for 5.12 2021-02-24 09:38:36 -08:00
.cocciconfig scripts: add Linux .cocciconfig for coccinelle 2016-07-22 12:13:39 +02:00
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes .gitattributes: use 'dts' diff driver for dts files 2019-12-04 19:44:11 -08:00
.gitignore clang-lto series for v5.12-rc1 2021-02-23 09:28:51 -08:00
.mailmap Merge branch 'akpm' (patches from Andrew) 2021-03-25 11:43:43 -07:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS treewide: Miguel has moved 2021-02-26 09:41:03 -08:00
Kbuild kbuild: rename hostprogs-y/always to hostprogs/always-y 2020-02-04 01:53:07 +09:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS ARM: SoC fixes for v5.12 2021-03-26 11:19:38 -07:00
Makefile Linux 5.12-rc5 2021-03-28 15:48:16 -07:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.