linux/Documentation
Michal Hocko 7550c60798 mm, proc: be more verbose about unstable VMA flags in /proc/<pid>/smaps
Patch series "THP eligibility reporting via proc".

This series of three patches aims at making THP eligibility reporting much
more robust and long term sustainable.  The trigger for the change is a
regression report [2] and the long follow up discussion.  In short the
specific application didn't have good API to query whether a particular
mapping can be backed by THP so it has used VMA flags to workaround that.
These flags represent a deep internal state of VMAs and as such they
should be used by userspace with a great deal of caution.

A similar has happened for [3] when users complained that VM_MIXEDMAP is
no longer set on DAX mappings.  Again a lack of a proper API led to an
abuse.

The first patch in the series tries to emphasise that that the semantic of
flags might change and any application consuming those should be really
careful.

The remaining two patches provide a more suitable interface to address [2]
and provide a consistent API to query the THP status both for each VMA and
process wide as well.  [1]

http://lkml.kernel.org/r/20181120103515.25280-1-mhocko@kernel.org [2]
http://lkml.kernel.org/r/http://lkml.kernel.org/r/alpine.DEB.2.21.1809241054050.224429@chino.kir.corp.google.com
[3] http://lkml.kernel.org/r/20181002100531.GC4135@quack2.suse.cz

This patch (of 3):

Even though vma flags exported via /proc/<pid>/smaps are explicitly
documented to be not guaranteed for future compatibility the warning
doesn't go far enough because it doesn't mention semantic changes to those
flags.  And they are important as well because these flags are a deep
implementation internal to the MM code and the semantic might change at
any time.

Let's consider two recent examples:
http://lkml.kernel.org/r/20181002100531.GC4135@quack2.suse.cz
: commit e1fb4a0864 "dax: remove VM_MIXEDMAP for fsdax and device dax" has
: removed VM_MIXEDMAP flag from DAX VMAs. Now our testing shows that in the
: mean time certain customer of ours started poking into /proc/<pid>/smaps
: and looks at VMA flags there and if VM_MIXEDMAP is missing among the VMA
: flags, the application just fails to start complaining that DAX support is
: missing in the kernel.

http://lkml.kernel.org/r/alpine.DEB.2.21.1809241054050.224429@chino.kir.corp.google.com
: Commit 1860033237 ("mm: make PR_SET_THP_DISABLE immediately active")
: introduced a regression in that userspace cannot always determine the set
: of vmas where thp is ineligible.
: Userspace relies on the "nh" flag being emitted as part of /proc/pid/smaps
: to determine if a vma is eligible to be backed by hugepages.
: Previous to this commit, prctl(PR_SET_THP_DISABLE, 1) would cause thp to
: be disabled and emit "nh" as a flag for the corresponding vmas as part of
: /proc/pid/smaps.  After the commit, thp is disabled by means of an mm
: flag and "nh" is not emitted.
: This causes smaps parsing libraries to assume a vma is eligible for thp
: and ends up puzzling the user on why its memory is not backed by thp.

In both cases userspace was relying on a semantic of a specific VMA flag.
The primary reason why that happened is a lack of a proper interface.
While this has been worked on and it will be fixed properly, it seems that
our wording could see some refinement and be more vocal about semantic
aspect of these flags as well.

Link: http://lkml.kernel.org/r/20181211143641.3503-2-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Jan Kara <jack@suse.cz>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Paul Oppenheimer <bepvte@gmail.com>
Cc: William Kucharski <william.kucharski@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-28 12:11:50 -08:00
..
ABI zram: writeback throttle 2018-12-28 12:11:49 -08:00
accelerators ocxl: Document new OCXL IOCTLs 2018-06-03 20:40:33 +10:00
accounting psi: cgroup support 2018-10-26 16:26:32 -07:00
acpi ACPI: property: graph: Update graph documentation to use generic references 2018-07-23 12:44:52 +02:00
admin-guide selinux/stable-4.21 PR 20181224 2018-12-27 12:01:58 -08:00
aoe
arm ARM: SoC device tree updates for 4.20 2018-10-29 15:05:20 -07:00
arm64 arm64 festive updates for 4.21 2018-12-25 17:41:56 -08:00
auxdisplay Doc: misc-devices: move lcd-panel-cgram.txt to auxdisplay/ 2018-04-12 16:08:02 +02:00
backlight
block Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00
blockdev zram: writeback throttle 2018-12-28 12:11:49 -08:00
bpf Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next 2018-08-07 11:02:05 -07:00
bus-devices
cdrom Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00
cgroup-v1 Merge branch 'for-4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2018-10-25 17:15:46 -07:00
cma
connector
console Documentation: corrections to console/console.txt 2018-08-10 16:09:40 -06:00
core-api XArray: Add xa_cmpxchg_irq and xa_cmpxchg_bh 2018-12-06 08:26:17 -05:00
cpu-freq Documentation: cpu-freq: Frequencies aren't always sorted 2018-11-07 13:29:04 +01:00
cpuidle Documentation: admin-guide: PM: Add cpuidle document 2018-12-03 10:03:36 +01:00
crypto crypto: skcipher - remove remnants of internal IV generators 2018-12-23 11:52:45 +08:00
dev-tools kasan: update documentation 2018-12-28 12:11:44 -08:00
device-mapper This is a fairly typical cycle for documentation. There's some welcome 2018-10-24 18:01:11 +01:00
devicetree Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2018-12-27 13:53:32 -08:00
doc-guide Documentation/sphinx: allow "functions" with no parameters 2018-06-30 07:52:42 -06:00
driver-api docs: driver-api: Add I3C documentation 2018-11-12 10:33:49 +01:00
driver-model gpio: Add devm_gpiod_unhinge() 2018-12-11 01:04:23 +00:00
early-userspace initramfs: move gen_initramfs_list.sh from scripts/ to usr/ 2018-08-22 23:21:44 +09:00
EDID
extcon
fault-injection Documentation: nvme: Documentation for nvme fault injection 2018-03-26 08:53:43 -06:00
fb This is a fairly typical cycle for documentation. There's some welcome 2018-10-24 18:01:11 +01:00
features MIPS: Enable IOREMAP_PROT config option for MIPS cpus 2018-11-05 10:15:28 -08:00
filesystems mm, proc: be more verbose about unstable VMA flags in /proc/<pid>/smaps 2018-12-28 12:11:50 -08:00
firmware_class
fmc Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00
fpga docs: fpga: add a document for FPGA Device Feature List (DFL) Framework Overview 2018-07-15 13:55:44 +02:00
gpio Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00
gpu Final changes to drm-misc-next for v4.21: 2018-12-07 11:23:05 +10:00
hid
hwmon hwmon: (ina3221) Read channel input source info from DT 2018-10-10 20:37:13 -07:00
i2c i2c: add i2c bus driver for NVIDIA GPU 2018-11-09 17:46:43 +01:00
ia64 ia64: doc: tweak whitespace for 'console=' parameter 2018-03-05 14:41:38 -08:00
ide Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00
iio
infiniband Documentation/ABI: update infiniband sysfs interfaces 2018-02-23 08:18:33 -07:00
input Revert "Input: Add the REL_WHEEL_HI_RES event code" 2018-11-22 08:57:44 +01:00
ioctl drm pull for 4.20-rc1 2018-10-28 17:49:53 -07:00
isdn Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00
kbuild kbuild: remove unused cc-fullversion variable 2018-11-02 00:15:26 +09:00
kdump
kernel-hacking doc:it_IT: translation for kernel-hacking 2018-07-26 16:21:09 -06:00
laptops platform-drivers-x86 for v4.20-1 2018-11-01 08:42:21 -07:00
leds Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00
lightnvm
livepatch livepatch: Remove not longer valid limitations from the documentation 2018-05-24 15:37:57 +02:00
locking This is a fairly typical cycle for documentation. There's some welcome 2018-10-24 18:01:11 +01:00
m68k Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00
maintainer docs: Fix more broken references 2018-06-15 18:11:26 -03:00
md
media media fixes for v4.20-rc8 2018-12-25 13:11:30 -08:00
memory-devices
mic
mips Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00
misc-devices pci_endpoint_test: Add 2 ioctl commands 2018-07-19 11:46:57 +01:00
mmc Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00
mtd Documentation: mtd: remove stale pxa3xx NAND controller documentation 2018-09-04 23:37:38 +02:00
namespaces
netlabel Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00
networking Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next 2018-12-20 18:20:26 -08:00
nfc
nios2
nvdimm
nvmem Documentation: nvmem: document cell tables and lookup entries 2018-09-28 15:14:54 +02:00
openrisc
parisc Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00
PCI pci-v4.20-changes 2018-10-25 06:50:48 -07:00
pcmcia pcmcia: remove long deprecated pcmcia_request_exclusive_irq() function 2018-08-18 12:30:42 -07:00
perf Documentation: perf: Add documentation for ThunderX2 PMU uncore driver 2018-12-06 12:29:47 +00:00
phy
platform
power This is a fairly typical cycle for documentation. There's some welcome 2018-10-24 18:01:11 +01:00
powerpc powerpc/fadump: Reservationless firmware assisted dump 2018-12-21 11:32:49 +11:00
pps
process The Compiler Attributes series 2018-11-01 18:34:46 -07:00
pti
ptp ptp: Fix documentation to match code. 2018-03-26 12:13:21 -04:00
rapidio Documentation: rapidio: move sysfs interface to ABI 2018-02-23 08:25:45 -07:00
RCU doc: Fix "struction" typo in RCU memory-ordering documentation 2018-11-12 08:56:25 -08:00
riscv perf: riscv: Add Document for Future Porting Guide 2018-06-04 14:02:11 -07:00
s390 KVM updates for v4.20 2018-10-25 17:57:35 -07:00
scheduler This is a fairly typical cycle for documentation. There's some welcome 2018-10-24 18:01:11 +01:00
scsi SCSI misc on 20181024 2018-10-25 07:40:30 -07:00
security Merge branch 'next-keys2' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2018-11-01 15:23:59 -07:00
serial TTY/Serial patches for 4.20-rc1 2018-10-29 10:42:20 -07:00
sh
sound ALSA: doc: Brush up the old writing-an-alsa-driver 2018-10-18 10:30:01 +02:00
sparc sparc64: Add support for ADI (Application Data Integrity) 2018-03-18 07:38:48 -07:00
sphinx Documentation/sphinx: allow "functions" with no parameters 2018-06-30 07:52:42 -06:00
sphinx-static docs: improve readability for people with poorer eyesight 2018-10-07 09:16:50 -06:00
spi Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00
sysctl mm: reclaim small amounts of memory when an external fragmentation event occurs 2018-12-28 12:11:48 -08:00
target
thermal thermal: Add cooling device's statistics in sysfs 2018-04-02 21:49:01 +08:00
timers Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00
trace The biggest change here is the updates to kprobes 2018-10-30 09:49:56 -07:00
translations This was a moderately busy cycle for docs, with the usual collection of 2018-08-14 14:29:31 -07:00
usb USB-serial updates for v4.19-rc1 2018-07-20 21:47:15 +02:00
userspace-api x86/speculation: Add prctl() control for indirect branch speculation 2018-11-28 11:57:13 +01:00
virtual x86/kvm/hyper-v: Introduce KVM_GET_SUPPORTED_HV_CPUID 2018-12-14 17:59:54 +01:00
vm Merge drm/drm-next into drm-intel-next-queued 2018-11-20 13:14:08 +02:00
w1 Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00
watchdog documentation: watchdog: add documentation for armada-37xx-wdt 2018-10-13 15:19:40 +02:00
wimax
x86 Merge branch 'x86-cache-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-12-26 12:17:43 -08:00
xilinx Documentation: xilinx: Add documentation for eemi APIs 2018-10-09 13:26:05 +02:00
xtensa
.gitignore
atomic_bitops.txt locking/atomic/bitops: Document and clarify ordering semantics for failed test_and_{}_bit() 2018-02-13 14:55:53 +01:00
atomic_t.txt
bt8xxgpio.txt
btmrvl.txt
bus-virt-phys-mapping.txt
Changes
clearing-warn-once.txt
CodingStyle
conf.py This is a fairly typical cycle for documentation. There's some welcome 2018-10-24 18:01:11 +01:00
cpu-load.txt
cputopology.txt
crc32.txt
dcdbas.txt
debugging-modules.txt
debugging-via-ohci1394.txt
dell_rbu.txt Documentation: remove stale firmware API reference 2018-05-14 16:44:41 +02:00
digsig.txt
DMA-API-HOWTO.txt
DMA-API.txt
DMA-attributes.txt
DMA-ISA-LPC.txt
docutils.conf
dontdiff
efi-stub.txt efi_stub: update documentation on dtb= parameter 2018-09-09 14:46:44 -06:00
eisa.txt
flexible-arrays.txt
futex-requeue-pi.txt
gcc-plugins.txt
highuid.txt
hw_random.txt
hwspinlock.txt
index.rst docs: tidy up TOCs and refs to license-rules.rst 2018-08-31 16:50:50 -06:00
intel_txt.txt
Intel-IOMMU.txt
io_ordering.txt
io-mapping.txt
iostats.txt block: Track DISCARD statistics and output them in stat and diskstat 2018-07-18 08:44:22 -06:00
IPMI.txt
IRQ-affinity.txt
IRQ-domain.txt
IRQ.txt
irqflags-tracing.txt
isa.txt
isapnp.txt
kernel-per-CPU-kthreads.txt doc: Update removal of RCU-bh/sched update machinery 2018-08-30 10:59:48 -07:00
kobject.txt
kprobes.txt kprobes/Documentation: Fix various typos 2018-06-22 11:10:55 +02:00
kref.txt
ldm.txt
lockup-watchdogs.txt
logo.gif
logo.txt
lsm.txt
lzo.txt
mailbox.txt
Makefile
memory-barriers.txt locking/memory-barriers: Replace smp_cond_acquire() with smp_cond_load_acquire() 2018-10-02 10:28:05 +02:00
men-chameleon-bus.txt
nommu-mmap.txt Documentation: nommu-map: Fix duplicate word typo 2018-06-26 09:01:27 -06:00
ntb.txt
numastat.txt
padata.txt
parport-lowlevel.txt
percpu-rw-semaphore.txt
phy.txt
pi-futex.txt
pnp.txt
preempt-locking.txt Documentation: preempt-locking: Use better example 2018-10-12 11:35:47 -06:00
pwm.txt
rbtree.txt
remoteproc.txt
rfkill.txt rfkill: Fix several typos in documentation 2018-06-15 13:36:08 +02:00
robust-futex-ABI.txt
robust-futexes.txt
rpmsg.txt
rtc.txt
SAK.txt
sgi-ioc4.txt
siphash.txt
SM501.txt
smsc_ece1099.txt
speculation.txt
static-keys.txt
SubmittingPatches
svga.txt
switchtec.txt NTB: switchtec_ntb: Update switchtec documentation with prerequisites for NTB 2018-10-11 11:28:53 -05:00
sync_file.txt
tee.txt
this_cpu_ops.txt
unaligned-memory-access.txt
vfio-mediated-device.txt vfio/mdev: Check globally for duplicate devices 2018-06-08 10:24:27 -06:00
vfio.txt vfio: fix documentation 2018-05-08 09:16:41 -06:00
video-output.txt
xillybus.txt
xz.txt
zorro.txt