linux/init
Johannes Weiner 2d1c498072 mm: memcontrol: make swap tracking an integral part of memory control
Without swap page tracking, users that are otherwise memory controlled can
easily escape their containment and allocate significant amounts of memory
that they're not being charged for.  That's because swap does readahead,
but without the cgroup records of who owned the page at swapout, readahead
pages don't get charged until somebody actually faults them into their
page table and we can identify an owner task.  This can be maliciously
exploited with MADV_WILLNEED, which triggers arbitrary readahead
allocations without charging the pages.

Make swap swap page tracking an integral part of memcg and remove the
Kconfig options.  In the first place, it was only made configurable to
allow users to save some memory.  But the overhead of tracking cgroup
ownership per swap page is minimal - 2 byte per page, or 512k per 1G of
swap, or 0.04%.  Saving that at the expense of broken containment
semantics is not something we should present as a coequal option.

The swapaccount=0 boot option will continue to exist, and it will
eliminate the page_counter overhead and hide the swap control files, but
it won't disable swap slot ownership tracking.

This patch makes sure we always have the cgroup records at swapin time;
the next patch will fix the actual bug by charging readahead swap pages at
swapin time rather than at fault time.

v2: fix double swap charge bug in cgroup1/cgroup2 code gating

[hannes@cmpxchg.org: fix crash with cgroup_disable=memory]
  Link: http://lkml.kernel.org/r/20200521215855.GB815153@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Alex Shi <alex.shi@linux.alibaba.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Roman Gushchin <guro@fb.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
Link: http://lkml.kernel.org/r/20200508183105.225460-16-hannes@cmpxchg.org
Debugged-by: Hugh Dickins <hughd@google.com>
Debugged-by: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-03 20:09:48 -07:00
..
calibrate.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
do_mounts_initrd.c x86/setup: Add an initrdmem= option to specify initrd physical address 2020-04-27 09:28:16 +02:00
do_mounts_md.c init/: remove ineffective sparse disabling 2018-08-22 10:52:49 -07:00
do_mounts_rd.c init/: remove ineffective sparse disabling 2018-08-22 10:52:49 -07:00
do_mounts.c block: remove __bdevname 2020-03-24 07:57:07 -06:00
do_mounts.h fs: add do_mknodat() helper and ksys_mknod() wrapper; remove in-kernel calls to syscall 2018-04-02 20:15:56 +02:00
init_task.c arm64 updates for 5.8 2020-06-01 15:18:27 -07:00
initramfs.c gcc-10: mark more functions __init to avoid section mismatch warnings 2020-05-09 17:50:03 -07:00
Kconfig mm: memcontrol: make swap tracking an integral part of memory control 2020-06-03 20:09:48 -07:00
main.c padata: initialize earlier 2020-06-03 20:09:45 -07:00
Makefile kbuild: do not pass $(KBUILD_CFLAGS) to scripts/mkcompile_h 2020-04-09 00:13:45 +09:00
noinitramfs.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 167 2019-05-30 11:26:39 -07:00
version.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00