linux/Documentation
Roman Gushchin cf11e85fc0 mm: hugetlb: optionally allocate gigantic hugepages using cma
Commit 944d9fec8d ("hugetlb: add support for gigantic page allocation
at runtime") has added the run-time allocation of gigantic pages.

However it actually works only at early stages of the system loading,
when the majority of memory is free.  After some time the memory gets
fragmented by non-movable pages, so the chances to find a contiguous 1GB
block are getting close to zero.  Even dropping caches manually doesn't
help a lot.

At large scale rebooting servers in order to allocate gigantic hugepages
is quite expensive and complex.  At the same time keeping some constant
percentage of memory in reserved hugepages even if the workload isn't
using it is a big waste: not all workloads can benefit from using 1 GB
pages.

The following solution can solve the problem:
1) On boot time a dedicated cma area* is reserved. The size is passed
   as a kernel argument.
2) Run-time allocations of gigantic hugepages are performed using the
   cma allocator and the dedicated cma area

In this case gigantic hugepages can be allocated successfully with a
high probability, however the memory isn't completely wasted if nobody
is using 1GB hugepages: it can be used for pagecache, anon memory, THPs,
etc.

* On a multi-node machine a per-node cma area is allocated on each node.
  Following gigantic hugetlb allocation are using the first available
  numa node if the mask isn't specified by a user.

Usage:
1) configure the kernel to allocate a cma area for hugetlb allocations:
   pass hugetlb_cma=10G as a kernel argument

2) allocate hugetlb pages as usual, e.g.
   echo 10 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages

If the option isn't enabled or the allocation of the cma area failed,
the current behavior of the system is preserved.

x86 and arm-64 are covered by this patch, other architectures can be
trivially added later.

The patch contains clean-ups and fixes proposed and implemented by Aslan
Bakirov and Randy Dunlap.  It also contains ideas and suggestions
proposed by Rik van Riel, Michal Hocko and Mike Kravetz.  Thanks!

Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Andreas Schaufler <andreas.schaufler@gmx.de>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Michal Hocko <mhocko@kernel.org>
Cc: Aslan Bakirov <aslan@fb.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Link: http://lkml.kernel.org/r/20200407163840.92263-3-guro@fb.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-10 15:36:21 -07:00
..
ABI f2fs-for-5.7-rc1 2020-04-07 13:48:26 -07:00
accounting doc: cgroup: improve formatting of references 2020-03-02 12:57:03 -07:00
admin-guide mm: hugetlb: optionally allocate gigantic hugepages using cma 2020-04-10 15:36:21 -07:00
arm docs: arm: tcm: Fix a few typos 2020-02-19 02:42:21 -07:00
arm64 arm64 updates for 5.7: 2020-03-31 10:05:01 -07:00
block block: Document genhd capability flags 2020-03-12 07:47:22 -06:00
bpf bpf: lsm: Add Documentation 2020-03-30 01:35:12 +02:00
cdrom docs: add some directories to the main documentation index 2019-07-15 11:03:03 -03:00
core-api mm: add pagemap.h to the fine documentation 2020-04-02 09:35:29 -07:00
cpu-freq docs: cpu-freq: convert cpufreq-stats.txt to ReST 2020-03-06 00:01:02 +01:00
crypto crypto: algapi - make unregistration functions return void 2019-12-20 14:58:35 +08:00
dev-tools linux-kselftest-kunit-5.7-rc1 2020-04-01 16:11:40 -07:00
devicetree linux-watchdog 5.7-rc1 tag 2020-04-08 21:29:10 -07:00
doc-guide Documentation: build warnings related to missing blank lines after explicit markups has been fixed 2020-02-05 10:30:03 -07:00
driver-api - Convert tsens configuration DT binding to yaml (Rajeshwari) 2020-04-07 20:00:16 -07:00
fault-injection docs: add some directories to the main documentation index 2019-07-15 11:03:03 -03:00
fb fbdev: fbmem: allow overriding the number of bootup logos 2020-01-03 14:27:40 +01:00
features documentation: vm: Advertise support for pte_special in riscv 2020-02-19 02:41:01 -07:00
filesystems 9p pull request for inclusion in 5.7 (take 2) 2020-04-08 21:51:14 -07:00
firmware_class
firmware-guide docs: firmware-guide: ACPI: Replace dma_request_slave_channel() with dma_request_chan() 2019-12-19 22:58:12 +01:00
fpga Documentation: fpga: dfl: add descriptions for thermal/power management interfaces 2019-10-16 19:18:26 -07:00
gpu UAPI Changes: 2020-03-19 10:40:27 +10:00
hid docs: add some documentation dirs to the driver-api book 2019-07-15 11:03:02 -03:00
hwmon docs: hwmon: Update documentation for isl68137 pmbus driver 2020-03-22 16:42:54 -07:00
i2c i2c: convert SMBus alert setup function to return an ERRPTR 2020-03-10 12:19:52 +01:00
ia64 docs: add SPDX tags to new index files 2019-07-15 11:03:03 -03:00
ide docs: add some directories to the main documentation index 2019-07-15 11:03:03 -03:00
iio docs: add some documentation dirs to the driver-api book 2019-07-15 11:03:02 -03:00
infiniband Documentation/infiniband: update name of some functions 2019-09-13 16:55:55 -03:00
input Input: docs: fix spelling mistake "potocol" -> "protocol" 2019-08-06 11:24:49 -06:00
isdn isdn: capi: dead code removal 2019-12-11 09:12:38 +01:00
kbuild Kbuild updates for v5.7 2020-03-31 16:03:39 -07:00
kernel-hacking docs: locking: Drop :c:func: throughout 2020-03-20 17:16:24 -06:00
leds leds: core: Add support for composing LED class device names 2019-07-25 20:07:52 +02:00
livepatch livepatch: Documentation of the new API for tracking system state changes 2019-11-01 13:08:24 +01:00
locking Documentation/locking/locktypes: Minor copy editor fixes 2020-03-28 12:47:34 +01:00
m68k docs: README.buddha: convert to ReST and add to m68k book 2019-07-31 13:30:10 -06:00
maintainer Add a maintainer entry profile for documentation 2020-01-24 09:48:39 -07:00
media media updates for v5.7-rc1 2020-03-30 13:42:05 -07:00
mhi docs: Add documentation for MHI bus 2020-03-19 07:41:04 +01:00
mips docs: mips: remove no longer needed au1xxx_ide.rst documentation 2020-03-24 15:53:48 +01:00
misc-devices Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2020-04-01 14:47:40 -07:00
netlabel docs: add some directories to the main documentation index 2019-07-15 11:03:03 -03:00
networking Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next 2020-03-31 17:29:33 -07:00
nios2 docs: nios2: add it to the main Documentation body 2019-07-31 13:31:51 -06:00
nvdimm docs: nvdimm: use ReST notation for subsection 2020-01-24 09:54:42 -07:00
openrisc openrisc: configs: Cleanup CONFIG_CROSS_COMPILE 2020-01-31 22:10:58 +09:00
parisc docs: parisc: convert to ReST and add to documentation body 2019-07-31 13:30:15 -06:00
PCI pci-v5.7-changes 2020-04-03 14:25:02 -07:00
pcmcia docs: add some directories to the main documentation index 2019-07-15 11:03:03 -03:00
power Merge branches 'pm-core', 'pm-sleep', 'pm-acpi' and 'pm-domains' 2020-03-30 14:46:58 +02:00
powerpc powerpc updates for 5.7 2020-04-05 11:12:59 -07:00
process Char/Misc driver patches for 5.7-rc1 2020-04-03 13:22:40 -07:00
RCU doc: Add rcutorture scripting to torture.txt 2020-02-27 07:03:14 -08:00
riscv It has been a relatively quiet cycle for documentation, but there's still a 2020-01-29 15:27:31 -08:00
s390 Documentation/s390: remove outdated debugging390 documentation 2019-08-21 12:41:43 +02:00
scheduler Documentation/scheduler: fix links in sched-stats 2019-10-29 04:35:41 -06:00
scsi SCSI misc on 20200402 2020-04-02 17:03:53 -07:00
security docs: prevent warnings due to autosectionlabel 2020-03-20 17:01:29 -06:00
sh docs: remove extra conf.py files 2019-07-17 06:57:52 -03:00
sound ALSA: hda/realtek - Remove now-unnecessary XPS 13 headphone noise fixups 2020-03-31 10:54:06 +02:00
sparc docs: add arch doc directories to the index 2019-07-15 11:03:01 -03:00
sphinx docs: Fix empty parallelism argument 2020-02-25 03:11:04 -07:00
sphinx-static doc-rst: Reduce CSS padding around Field 2019-10-02 10:03:06 -06:00
spi spi: docs: convert to ReST and add it to the kABI bookset 2019-07-31 14:13:13 -06:00
target docs: prevent warnings due to autosectionlabel 2020-03-20 17:01:29 -06:00
timers docs: add some directories to the main documentation index 2019-07-15 11:03:03 -03:00
trace New tracing features: 2020-04-05 10:36:18 -07:00
translations media updates for v5.7-rc1 2020-03-30 13:42:05 -07:00
usb usb: gadget: add raw-gadget interface 2020-03-15 11:34:48 +02:00
userspace-api docs: userspace: ioctl-number: remove mc146818rtc conflict 2020-02-13 11:42:02 -07:00
virt ARM: 2020-04-02 15:13:15 -07:00
vm mm/zswap: allow setting default status, compressor and allocator in Kconfig 2020-04-07 10:43:41 -07:00
w1 docs: w1: Fix a typo in omap-hdq.rst 2019-12-30 11:58:02 -07:00
watchdog linux-watchdog 5.4-rc1 tag 2019-09-27 11:17:38 -07:00
x86 Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2020-03-31 11:04:05 -07:00
xtensa docs: add arch doc directories to the index 2019-07-15 11:03:01 -03:00
.gitignore .gitignore: add SPDX License Identifier 2020-03-25 11:50:48 +01:00
asm-annotations.rst Documentation: Call out example SYM_FUNC_* usage as x86-specific 2020-01-16 12:53:16 -07:00
atomic_bitops.txt
atomic_t.txt Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-07-08 16:12:03 -07:00
bus-virt-phys-mapping.txt
Changes
CodingStyle
conf.py docs: conf.py: avoid thousands of duplicate label warning on Sphinx 2020-03-20 17:01:34 -06:00
COPYING-logo docs: logo.txt: rename it to COPYING-logo 2019-07-15 09:20:27 -03:00
crc32.txt
debugging-via-ohci1394.txt
digsig.txt
DMA-API-HOWTO.txt docs: DMA-API-HOWTO.txt: fix an unmarked code block 2019-07-15 09:20:24 -03:00
DMA-API.txt dma-mapping: remove dma_release_declared_memory 2019-09-04 11:13:19 +02:00
DMA-attributes.txt dma-mapping: remove the DMA_ATTR_WRITE_BARRIER flag 2019-11-14 12:01:54 -04:00
DMA-ISA-LPC.txt
docutils.conf
dontdiff modpost: dump missing namespaces into a single modules.nsdeps file 2019-11-11 20:10:01 +09:00
futex-requeue-pi.txt
hwspinlock.txt hwspinlock: add the 'in_atomic' API 2019-06-29 21:08:14 -07:00
index.rst Char/Misc driver patches for 5.7-rc1 2020-04-03 13:22:40 -07:00
IPMI.txt
IRQ-affinity.txt
IRQ-domain.txt
IRQ.txt
irqflags-tracing.txt
Kconfig
kprobes.txt
kref.txt docs: kref: Clarify the use of two kref_put() in example code 2020-02-25 03:39:10 -07:00
logo.gif
lzo.txt
mailbox.txt
Makefile Kbuild updates for v5.7 2020-03-31 16:03:39 -07:00
memory-barriers.txt Documentation/memory-barriers: Fix typos 2020-02-27 07:03:14 -08:00
nommu-mmap.txt
percpu-rw-semaphore.txt
pi-futex.txt docs: locking: convert docs to ReST and rename to *.rst 2019-07-15 08:53:27 -03:00
preempt-locking.txt
rbtree.txt docs: rbtree.txt: fix Sphinx build warnings 2019-07-15 09:20:24 -03:00
remoteproc.txt remoteproc: Add elf64 support in elf loader 2020-03-25 22:29:40 -07:00
robust-futex-ABI.txt threads: Update PID limit comment according to futex UAPI change 2020-03-21 17:48:13 +01:00
robust-futexes.txt
rpmsg.txt
speculation.txt
static-keys.txt
SubmittingPatches
tee.txt Documentation: tee: add AMD-TEE driver details 2020-01-04 13:49:51 +08:00
this_cpu_ops.txt
unaligned-memory-access.txt
xz.txt