mirror of https://github.com/edk2-porting/linux-next.git synced 2025-01-01 00:54:15 +08:00
linux-next/arch/x86/mm
Ian Campbell 1431559200 x86, mm: Allow highmem user page tables to be disabled at boot time
Distros generally (I looked at Debian, RHEL5 and SLES11) seem to
enable CONFIG_HIGHPTE for any x86 configuration which has highmem
enabled. This means that the overhead applies even to machines which
have a fairly modest amount of high memory and which therefore do not
really benefit from allocating PTEs in high memory but still pay the
price of the additional mapping operations.

Running kernbench on a 4G box I found that with CONFIG_HIGHPTE=y but
no actual highptes being allocated there was a reduction in system
time used from 59.737s to 55.9s.

With CONFIG_HIGHPTE=y and highmem PTEs being allocated:
  Average Optimal load -j 4 Run (std deviation):
  Elapsed Time 175.396 (0.238914)
  User Time 515.983 (5.85019)
  System Time 59.737 (1.26727)
  Percent CPU 263.8 (71.6796)
  Context Switches 39989.7 (4672.64)
  Sleeps 42617.7 (246.307)

With CONFIG_HIGHPTE=y but with no highmem PTEs being allocated:
  Average Optimal load -j 4 Run (std deviation):
  Elapsed Time 174.278 (0.831968)
  User Time 515.659 (6.07012)
  System Time 55.9 (1.07799)
  Percent CPU 263.8 (71.266)
  Context Switches 39929.6 (4485.13)
  Sleeps 42583.7 (373.039)
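
The system-time saving quoted above can be checked directly from the two
result tables. The following is only a small sketch of that arithmetic; the
variable and function names are illustrative and are not part of the patch.

```python
# System time (seconds) copied from the two kernbench tables above.
sys_highpte = 59.737   # CONFIG_HIGHPTE=y, highmem PTEs being allocated
sys_lowpte = 55.9      # CONFIG_HIGHPTE=y, no highmem PTEs being allocated


def reduction(before, after):
    """Return the (absolute, relative) reduction going from before to after."""
    saved = before - after
    return saved, saved / before


saved, frac = reduction(sys_highpte, sys_lowpte)
print(f"system time saved: {saved:.3f}s ({frac:.1%})")
```

The difference is roughly 3.8s, or about 6.4% of system time, against
reported standard deviations of about 1.1s to 1.3s for the two runs.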

This patch allows the user to control the allocation of PTEs in
highmem from the command line ("userpte=nohigh") but retains the
status quo as the default.

It is possible that some simple heuristic could be developed which
allows auto-tuning of this option; however, I don't have a sufficiently
large machine available to me to perform any particularly meaningful
experiments. We could probably handwave up an argument for a threshold
at 16G of total RAM.

Assuming 768M of lowmem we have 196608 potential lowmem PTE
pages. Each page can map 2M of RAM in a PAE-enabled configuration,
meaning a maximum of 384G of RAM could potentially be mapped using
lowmem PTEs.
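
The figures in the paragraph above can be reproduced as follows. This is a
sketch under stated assumptions: 4 KiB pages and 8-byte PTEs (as used with
PAE) are assumed here rather than stated in the text, and all names are
illustrative.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

PAGE_SIZE = 4 * KIB   # assumption: 4 KiB pages
PTE_SIZE = 8          # assumption: 8-byte page table entries under PAE
LOWMEM = 768 * MIB    # lowmem figure taken from the text

# If all of lowmem held PTE pages, how many pages would that be?
pte_pages = LOWMEM // PAGE_SIZE

# Each PTE page holds PAGE_SIZE / PTE_SIZE entries, each mapping one page.
ptes_per_page = PAGE_SIZE // PTE_SIZE
mapped_per_pte_page = ptes_per_page * PAGE_SIZE

# Upper bound on RAM mappable using lowmem PTE pages only.
max_mappable = pte_pages * mapped_per_pte_page

print(pte_pages)                    # 196608
print(mapped_per_pte_page // MIB)   # 2
print(max_mappable // GIB)          # 384
```

This is an upper bound only: it ignores every other consumer of lowmem and
assumes no page sharing between address spaces.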

Even allowing a generous factor of 10 to account for other required
lowmem allocations, generous slop to account for page sharing (which
reduces the total amount of RAM mappable by a given number of PT
pages) and other inaccuracies in the estimations it would seem that
even a 32G machine would not have a particularly pressing need for
highmem PTEs. I think 32G could be considered to be at the upper bound
of what might be sensible on a 32 bit machine (although I think in
practice 64G is still supported).

It seems questionable whether HIGHPTE is even a win for any amount of RAM
you would sensibly run a 32 bit kernel on rather than going 64 bit.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
LKML-Reference: <1266403090-20162-1-git-send-email-ian.campbell@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-02-25 10:28:19 +01:00
..
kmemcheck x86, kmemcheck: Use KERN_WARNING for error reporting 2009-12-28 10:28:35 +01:00
dump_pagetables.c x86: remove (null) in /sys kernel_page_tables 2009-04-14 11:50:22 +02:00
extable.c x86, 64-bit: Move K8 B step iret fixup to fault entry asm 2009-10-12 18:29:46 +02:00
fault.c Merge branch 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-12-05 15:33:27 -08:00
gup.c x86, doc: Fix minor spelling error in arch/x86/mm/gup.c 2010-02-02 16:00:44 -08:00
highmem_32.c Merge branch 'kvm-updates/2.6.32' of git://git.kernel.org/pub/scm/virt/kvm/kvm 2009-09-14 17:43:43 -07:00
hugetlbpage.c x86: ignore VM_LOCKED when determining if hugetlb-backed page tables can be shared or not 2009-05-29 08:40:03 -07:00
init_32.c mm: make totalhigh_pages unsigned long 2010-01-11 09:34:03 -08:00
init_64.c memory hotplug: fix a bug on /dev/mem for 64-bit kernels 2010-02-02 18:11:23 -08:00
init.c x86, mm: Report state of NX protections during boot 2009-11-16 13:44:59 -08:00
iomap_32.c x86, pat: Add PAT reserve free to io_mapping* APIs 2009-08-26 15:41:16 -07:00
ioremap.c Merge branch 'linus' into x86/mm 2010-02-17 18:28:05 +01:00
k8topology_64.c x86: Move find_smp_config() earlier and avoid bootmem usage 2009-11-24 12:10:51 +01:00
kmmio.c hw-breakpoints, perf: Fix broken mmiotrace due to dr6 by reference change 2010-01-17 08:01:44 +01:00
Makefile x86: split NX setup into separate file to limit unstack-protected code 2009-09-21 13:56:58 -07:00
memtest.c x86: memtest: use pointers of equal type for comparison 2009-06-11 16:26:35 +02:00
mmap.c x86: Increase MIN_GAP to include randomized stack 2009-09-10 17:00:12 -07:00
mmio-mod.c x86: Fix build warning in arch/x86/mm/mmio-mod.c 2009-12-14 08:55:43 +01:00
numa_32.c x86: Export k8 physical topology 2009-10-12 22:56:45 +02:00
numa_64.c x86, numa: Use near(er) online node instead of roundrobin for NUMA 2009-11-23 10:06:24 +01:00
numa.c cpumask: convert node_to_cpumask_map[] to cpumask_var_t 2009-03-13 14:35:31 +01:00
pageattr-test.c x86: make sure the CPA test code's use of _PAGE_UNUSED1 is obvious 2008-09-05 17:09:57 +02:00
pageattr.c x86, pageattr: Make set_memory_(x|nx) aware of NX support 2009-11-16 13:44:58 -08:00
pat.c vfs: Implement proper O_SYNC semantics 2009-12-10 15:02:50 +01:00
pf_in.c x86: fix mmiotrace 8-bit register decoding 2008-10-14 10:33:50 +02:00
pf_in.h x86 mmiotrace: move files into arch/x86/mm/. 2008-05-24 11:25:37 +02:00
pgtable_32.c x86/32: no need to use set_pte_present in set_pte_vaddr 2009-03-19 14:04:18 +01:00
pgtable.c x86, mm: Allow highmem user page tables to be disabled at boot time 2010-02-25 10:28:19 +01:00
physaddr.c x86: split __phys_addr out into separate file 2009-09-10 11:48:55 -07:00
physaddr.h x86: split __phys_addr out into separate file 2009-09-10 11:48:55 -07:00
setup_nx.c x86, mm: Report state of NX protections during boot 2009-11-16 13:44:59 -08:00
srat_32.c x86: Fix checking of SRAT when node 0 ram is not from 0 2009-12-16 16:43:37 -08:00
srat_64.c x86: Set hotpluggable nodes in nodes_possible_map 2010-01-23 06:21:57 +01:00
testmmiotrace.c testmmiotrace.c: Add and use pr_fmt(fmt) 2009-10-12 08:05:41 +02:00
tlb.c x86: Convert tlbstate_lock to raw_spinlock 2010-02-17 18:28:59 +01:00