Go to file
Minchan Kim e914d8f003 mm: fix unexpected zeroed page mapping with zram swap
Two processes under CLONE_VM cloning, user process can be corrupted by
seeing zeroed page unexpectedly.

      CPU A                        CPU B

  do_swap_page                do_swap_page
  SWP_SYNCHRONOUS_IO path     SWP_SYNCHRONOUS_IO path
  swap_readpage valid data
    swap_slot_free_notify
      delete zram entry
                              swap_readpage zeroed(invalid) data
                              pte_lock
                              map the *zero data* to userspace
                              pte_unlock
  pte_lock
  if (!pte_same)
    goto out_nomap;
  pte_unlock
  return and next refault will
  read zeroed data

The swap_slot_free_notify is bogus for CLONE_VM case since it doesn't
increase the refcount of swap slot at copy_mm so it couldn't catch up
whether it's safe or not to discard data from backing device.  In the
case, only the lock it could rely on to synchronize swap slot freeing is
page table lock.  Thus, this patch gets rid of the swap_slot_free_notify
function.  With this patch, CPU A will see correct data.

      CPU A                        CPU B

  do_swap_page                do_swap_page
  SWP_SYNCHRONOUS_IO path     SWP_SYNCHRONOUS_IO path
                              swap_readpage original data
                              pte_lock
                              map the original data
                              swap_free
                                swap_range_free
                                  bd_disk->fops->swap_slot_free_notify
  swap_readpage read zeroed data
                              pte_unlock
  pte_lock
  if (!pte_same)
    goto out_nomap;
  pte_unlock
  return
  on next refault will see mapped data by CPU B

The concern of the patch would increase memory consumption since it
could keep wasted memory with compressed form in zram as well as
uncompressed form in address space.  However, most of cases of zram uses
no readahead and do_swap_page is followed by swap_free so it will free
the compressed form from in zram quickly.

Link: https://lkml.kernel.org/r/YjTVVxIAsnKAXjTd@google.com
Fixes: 0bcac06f27 ("mm, swap: skip swapcache for swapin of synchronous device")
Reported-by: Ivan Babrou <ivan@cloudflare.com>
Tested-by: Ivan Babrou <ivan@cloudflare.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org>	[4.14+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-15 14:49:55 -07:00
arch s390 updates for 5.18-rc3 2022-04-14 12:17:25 -07:00
block for-5.18/block-2022-04-01 2022-04-01 16:20:00 -07:00
certs Kbuild updates for v5.18 2022-03-31 11:59:03 -07:00
crypto for-5.18/64bit-pi-2022-03-25 2022-03-26 12:01:35 -07:00
Documentation Networking fixes for 5.18-rc3, including fixes from wireless and 2022-04-14 11:58:19 -07:00
drivers Networking fixes for 5.18-rc3, including fixes from wireless and 2022-04-14 11:58:19 -07:00
fs for-5.18-rc2-tag 2022-04-14 10:58:27 -07:00
include mm, kfence: support kmem_dump_obj() for KFENCE objects 2022-04-15 14:49:55 -07:00
init Kbuild updates for v5.18 2022-03-31 11:59:03 -07:00
ipc fs: allocate inode by using alloc_inode_sb() 2022-03-22 15:57:03 -07:00
kernel irq_work: use kasan_record_aux_stack_noalloc() record callstack 2022-04-15 14:49:55 -07:00
lib Driver core changes for 5.18-rc2 2022-04-10 09:55:09 -10:00
LICENSES LICENSES/LGPL-2.1: Add LGPL-2.1-or-later as valid identifiers 2021-12-16 14:33:10 +01:00
mm mm: fix unexpected zeroed page mapping with zram swap 2022-04-15 14:49:55 -07:00
net Networking fixes for 5.18-rc3, including fixes from wireless and 2022-04-14 11:58:19 -07:00
samples dma-mapping updates for Linux 5.18 2022-03-29 08:50:14 -07:00
scripts hardening fixes for v5.18-rc3 2022-04-12 14:29:40 -10:00
security hardening updates for v5.18-rc1-fix1 2022-03-31 11:43:01 -07:00
sound sound fixes for 5.18-rc3 2022-04-14 11:08:12 -07:00
tools x86: 2022-04-12 14:16:33 -10:00
usr Kbuild updates for v5.18 2022-03-31 11:59:03 -07:00
virt KVM/arm64 fixes for 5.18, take #1 2022-04-08 12:30:04 -04:00
.clang-format genirq/msi: Make interrupt allocation less convoluted 2021-12-16 22:22:20 +01:00
.cocciconfig scripts: add Linux .cocciconfig for coccinelle 2016-07-22 12:13:39 +02:00
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes .gitattributes: use 'dts' diff driver for dts files 2019-12-04 19:44:11 -08:00
.gitignore .gitignore: ignore only top-level modules.builtin 2021-05-02 00:43:35 +09:00
.mailmap mailmap: update Vasily Averin's email address 2022-04-08 14:20:36 -10:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS MAINTAINERS: replace a Microchip AT91 maintainer 2022-02-09 11:30:01 +01:00
Kbuild kbuild: rename hostprogs-y/always to hostprogs/always-y 2020-02-04 01:53:07 +09:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS MAINTAINERS: Broadcom internal lists aren't maintainers 2022-04-15 14:49:54 -07:00
Makefile Linux 5.18-rc2 2022-04-10 14:21:36 -10:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.