linux/Documentation
Jakub Kicinski 4b82ab4f28 mm/memcg: automatically penalize tasks with high swap use
Add a memory.swap.high knob, which can be used to protect the system
from SWAP exhaustion.  The mechanism used for penalizing is similar to
memory.high penalty (sleep on return to user space).

That is not to say that the knob itself is equivalent to memory.high.
The objective is more to protect the system from potentially buggy tasks
consuming a lot of swap and impacting other tasks, or even bringing the
whole system to stand still with complete SWAP exhaustion.  Hopefully
without the need to find per-task hard limits.

Slowing misbehaving tasks down gradually allows user space oom killers
or other protection mechanisms to react.  oomd and earlyoom already do
killing based on swap exhaustion, and memory.swap.high protection will
help implement such userspace oom policies more reliably.

We can use one counter for number of pages allocated under pressure to
save struct task space and avoid two separate hierarchy walks on the hot
path.  The exact overage is calculated on return to user space, anyway.

Take the new high limit into account when determining if swap is "full".
Borrowing the explanation from Johannes:

  The idea behind "swap full" is that as long as the workload has plenty
  of swap space available and it's not changing its memory contents, it
  makes sense to generously hold on to copies of data in the swap device,
  even after the swapin.  A later reclaim cycle can drop the page without
  any IO.  Trading disk space for IO.

  But the only two ways to reclaim a swap slot is when they're faulted
  in and the references go away, or by scanning the virtual address space
  like swapoff does - which is very expensive (one could argue it's too
  expensive even for swapoff, it's often more practical to just reboot).

  So at some point in the fill level, we have to start freeing up swap
  slots on fault/swapin.  Otherwise we could eventually run out of swap
  slots while they're filled with copies of data that is also in RAM.

  We don't want to OOM a workload because its available swap space is
  filled with redundant cache.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Chris Down <chris@chrisdown.name>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Link: http://lkml.kernel.org/r/20200527195846.102707-5-kuba@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02 10:59:09 -07:00
..
ABI Printk changes for 5.8 2020-06-01 12:13:30 -07:00
accounting doc: cgroup: improve formatting of references 2020-03-02 12:57:03 -07:00
admin-guide mm/memcg: automatically penalize tasks with high swap use 2020-06-02 10:59:09 -07:00
arm docs: arm: tcm: Fix a few typos 2020-02-19 02:42:21 -07:00
arm64 Documentation: arm64: fix amu.rst doc warnings 2020-04-23 17:05:22 +01:00
block block: Document genhd capability flags 2020-03-12 07:47:22 -06:00
bpf bpf: lsm: Add Documentation 2020-03-30 01:35:12 +02:00
cdrom
core-api Printk changes for 5.8 2020-06-01 12:13:30 -07:00
cpu-freq docs: cpu-freq: convert cpufreq-stats.txt to ReST 2020-03-06 00:01:02 +01:00
crypto crypto: algapi - make unregistration functions return void 2019-12-20 14:58:35 +08:00
dev-tools linux-kselftest-kunit-5.7-rc1 2020-04-01 16:11:40 -07:00
devicetree Fixes and new features for pstore 2020-06-01 12:07:34 -07:00
doc-guide Documentation: build warnings related to missing blank lines after explicit markups has been fixed 2020-02-05 10:30:03 -07:00
driver-api A handful of late-arriving fixes for the documentation tree. 2020-04-10 17:53:43 -07:00
fault-injection
fb fbdev: fbmem: allow overriding the number of bootup logos 2020-01-03 14:27:40 +01:00
features documentation: vm: Advertise support for pte_special in riscv 2020-02-19 02:41:01 -07:00
filesystems mm/writeback: discard NR_UNSTABLE_NFS, use NR_WRITEBACK instead 2020-06-02 10:59:08 -07:00
firmware_class
firmware-guide Documentation: firmware-guide: ACPI: fix table alignment in namespace.rst 2020-04-08 14:27:48 +02:00
fpga Documentation: fpga: dfl: add descriptions for thermal/power management interfaces 2019-10-16 19:18:26 -07:00
gpu UAPI Changes: 2020-03-19 10:40:27 +10:00
hid
hwmon hwmon: Add Baikal-T1 PVT sensor driver 2020-05-28 07:59:45 -07:00
i2c i2c: convert SMBus alert setup function to return an ERRPTR 2020-03-10 12:19:52 +01:00
ia64
ide
iio
infiniband
input
isdn isdn: capi: dead code removal 2019-12-11 09:12:38 +01:00
kbuild Documentation: kbuild: fix the section title format 2020-04-23 10:53:19 +09:00
kernel-hacking docs: locking: Drop :c:func: throughout 2020-03-20 17:16:24 -06:00
leds
livepatch livepatch: Documentation of the new API for tracking system state changes 2019-11-01 13:08:24 +01:00
locking Documentation/locking/locktypes: Minor copy editor fixes 2020-03-28 12:47:34 +01:00
m68k
maintainer Add a maintainer entry profile for documentation 2020-01-24 09:48:39 -07:00
media media updates for v5.7-rc1 2020-03-30 13:42:05 -07:00
mhi docs: Add documentation for MHI bus 2020-03-19 07:41:04 +01:00
mips docs: mips: remove no longer needed au1xxx_ide.rst documentation 2020-03-24 15:53:48 +01:00
misc-devices Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2020-04-01 14:47:40 -07:00
netlabel
networking ice: cleanup language in ice.rst for fw.app 2020-05-01 15:30:24 -07:00
nios2
nvdimm docs: nvdimm: use ReST notation for subsection 2020-01-24 09:54:42 -07:00
openrisc openrisc: configs: Cleanup CONFIG_CROSS_COMPILE 2020-01-31 22:10:58 +09:00
parisc
PCI pci-v5.7-changes 2020-04-03 14:25:02 -07:00
pcmcia
power Merge branches 'pm-core', 'pm-sleep', 'pm-acpi' and 'pm-domains' 2020-03-30 14:46:58 +02:00
powerpc powerpc updates for 5.7 2020-04-05 11:12:59 -07:00
process checkpatch/coding-style: deprecate 80-column warning 2020-05-31 11:00:42 -07:00
RCU doc: Add rcutorture scripting to torture.txt 2020-02-27 07:03:14 -08:00
riscv It has been a relatively quiet cycle for documentation, but there's still a 2020-01-29 15:27:31 -08:00
s390
scheduler Documentation/scheduler: fix links in sched-stats 2019-10-29 04:35:41 -06:00
scsi SCSI misc on 20200402 2020-04-02 17:03:53 -07:00
security crypto: lib/sha1 - rename "sha" to "sha1" 2020-05-08 15:32:17 +10:00
sh
sound ALSA: hda/realtek - Remove now-unnecessary XPS 13 headphone noise fixups 2020-03-31 10:54:06 +02:00
sparc
sphinx docs: Fix empty parallelism argument 2020-02-25 03:11:04 -07:00
sphinx-static doc-rst: Reduce CSS padding around Field 2019-10-02 10:03:06 -06:00
spi
target docs: prevent warnings due to autosectionlabel 2020-03-20 17:01:29 -06:00
timers
trace New tracing features: 2020-04-05 10:36:18 -07:00
translations media updates for v5.7-rc1 2020-03-30 13:42:05 -07:00
usb usb: raw-gadget: documentation updates 2020-05-14 12:30:18 +03:00
userspace-api docs: userspace: ioctl-number: remove mc146818rtc conflict 2020-02-13 11:42:02 -07:00
virt docs/virt/kvm: Document configuring and running nested guests 2020-05-06 05:45:47 -04:00
vm Documentation/vm/slub.rst: s/Toggle/Enable/ 2020-06-02 10:59:06 -07:00
w1 docs: w1: Fix a typo in omap-hdq.rst 2019-12-30 11:58:02 -07:00
watchdog linux-watchdog 5.4-rc1 tag 2019-09-27 11:17:38 -07:00
x86 Documentation/x86, efi/x86: Clarify EFI handover protocol and its requirements 2020-04-14 08:32:15 +02:00
xtensa
.gitignore .gitignore: add SPDX License Identifier 2020-03-25 11:50:48 +01:00
asm-annotations.rst Documentation: Call out example SYM_FUNC_* usage as x86-specific 2020-01-16 12:53:16 -07:00
atomic_bitops.txt
atomic_t.txt
bus-virt-phys-mapping.txt
Changes
CodingStyle
conf.py docs: conf.py: avoid thousands of duplicate label warning on Sphinx 2020-03-20 17:01:34 -06:00
COPYING-logo
crc32.txt
debugging-via-ohci1394.txt
digsig.txt
DMA-API-HOWTO.txt
DMA-API.txt
DMA-attributes.txt dma-mapping: remove the DMA_ATTR_WRITE_BARRIER flag 2019-11-14 12:01:54 -04:00
DMA-ISA-LPC.txt
docutils.conf
dontdiff modpost: dump missing namespaces into a single modules.nsdeps file 2019-11-11 20:10:01 +09:00
futex-requeue-pi.txt
hwspinlock.txt
index.rst Char/Misc driver patches for 5.7-rc1 2020-04-03 13:22:40 -07:00
IPMI.txt
IRQ-affinity.txt
IRQ-domain.txt
IRQ.txt
irqflags-tracing.txt
Kconfig
kprobes.txt
kref.txt docs: kref: Clarify the use of two kref_put() in example code 2020-02-25 03:39:10 -07:00
logo.gif
lzo.txt
mailbox.txt
Makefile Kbuild updates for v5.7 2020-03-31 16:03:39 -07:00
memory-barriers.txt Documentation/memory-barriers: Fix typos 2020-02-27 07:03:14 -08:00
nommu-mmap.txt
percpu-rw-semaphore.txt
pi-futex.txt
preempt-locking.txt
rbtree.txt
remoteproc.txt remoteproc: Add elf64 support in elf loader 2020-03-25 22:29:40 -07:00
robust-futex-ABI.txt threads: Update PID limit comment according to futex UAPI change 2020-03-21 17:48:13 +01:00
robust-futexes.txt
rpmsg.txt
speculation.txt
static-keys.txt
SubmittingPatches
tee.txt Documentation: tee: add AMD-TEE driver details 2020-01-04 13:49:51 +08:00
this_cpu_ops.txt
unaligned-memory-access.txt
xz.txt