License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it,
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information.
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier should be applied
to a file was done in a spreadsheet of side by side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few thousand files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging were:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source.
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, the file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier                           # files
---------------------------------------------------|-------
GPL-2.0                                             11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note", otherwise it was "GPL-2.0". Results of that were:
SPDX license identifier                           # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note                       930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier                           # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note                       270
GPL-2.0+ WITH Linux-syscall-note                      169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
LGPL-2.1+ WITH Linux-syscall-note                      15
GPL-1.0+ WITH Linux-syscall-note                       14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
LGPL-2.0+ WITH Linux-syscall-note                       4
LGPL-2.1 WITH Linux-syscall-note                        3
((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
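The heuristics above can be sketched roughly as follows. This is only
an illustration, not the spreadsheet process that was actually used:
the function name, the NEEDS_REVIEW marker and the prefix test for "GPL
family" are assumptions, and compound expressions such as
"((GPL-2.0 WITH Linux-syscall-note) OR MIT)" are not handled.

```python
from typing import Optional

SYSCALL_NOTE = "WITH Linux-syscall-note"
GPL_FAMILY_PREFIXES = ("GPL-", "LGPL-")  # assumed test for "GPL family"
NEEDS_REVIEW = "MANUAL-REVIEW"  # hypothetical marker, not an SPDX identifier


def conclude_license(scan_a: Optional[str], scan_b: Optional[str],
                     is_uapi: bool) -> str:
    """Combine the results of two independent scanners for one file.

    scan_a and scan_b are the detected license identifiers, or None when
    a scanner found no license traces in the file.
    """
    if scan_a is None and scan_b is None:
        # No traces at all: the top level COPYING license applies.
        concluded = "GPL-2.0"
    elif scan_a == scan_b:
        # Both scanners agree: that becomes the concluded license.
        concluded = scan_a
    else:
        # Disagreement: a human has to inspect the file.
        return NEEDS_REVIEW

    # uapi headers with a GPL family license carry the syscall note.
    if is_uapi and concluded.startswith(GPL_FAMILY_PREFIXES):
        concluded = f"{concluded} {SYSCALL_NOTE}"
    return concluded
```

Anything returned as NEEDS_REVIEW corresponds to the manual inspection
step described above, where unclear cases were confirmed with lawyers or
flagged for later.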
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there were new insights. The
Windriver scanner is based in part on an older version of FOSSology, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with the SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In the initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
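As a rough illustration of the "different comment types" point, the
sketch below picks a comment style from the file name. It is not the
script Thomas and Greg used; the function name and the set of file types
it recognises are assumptions made for this example.

```python
import os


def spdx_line(path: str, license_id: str) -> str:
    """Return the SPDX tag line in the comment style the file expects."""
    tag = f"SPDX-License-Identifier: {license_id}"
    name = os.path.basename(path)
    ext = os.path.splitext(name)[1]
    if ext == ".c":
        # C source files take a // comment.
        return f"// {tag}"
    if ext == ".h":
        # Header files take a /* */ comment.
        return f"/* {tag} */"
    if name.startswith(("Makefile", "Kconfig")):
        # Make and config files use # comments.
        return f"# {tag}"
    # Anything else is left for a human to decide in this sketch.
    raise ValueError(f"unknown file type: {path}")
```

For arch/powerpc/Kconfig and "GPL-2.0" this yields the same
"# SPDX-License-Identifier: GPL-2.0" line that this patch put at the
top of the file.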
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-01 22:07:57 +08:00
# SPDX-License-Identifier: GPL-2.0
2007-06-13 00:30:17 +08:00
source "arch/powerpc/platforms/Kconfig.cputype"
2007-03-19 18:53:53 +08:00

2010-10-22 08:17:55 +08:00
config 32BIT
	bool
	default y if PPC32

2005-09-26 14:04:21 +08:00
config 64BIT
	bool
	default y if PPC64

2021-12-21 00:38:12 +08:00
config LIVEPATCH_64
	def_bool PPC64
	depends on LIVEPATCH

2005-09-26 14:04:21 +08:00
config MMU
	bool
	default y

2017-04-20 22:36:20 +08:00
config ARCH_MMAP_RND_BITS_MAX
	# On Book3S 64, the default virtual address space for 64-bit processes
	# is 2^47 (128TB). As a maximum, allow randomisation to consume up to
	# 32T of address space (2^45), which should ensure a reasonable gap
	# between bottom-up and top-down allocations for applications that
	# consume "normal" amounts of address space. Book3S 64 only supports 64K
	# and 4K page sizes.
	default 29 if PPC_BOOK3S_64 && PPC_64K_PAGES # 29 = 45 (32T) - 16 (64K)
	default 33 if PPC_BOOK3S_64 # 33 = 45 (32T) - 12 (4K)
	#
	# On all other 64-bit platforms (currently only Book3E), the virtual
	# address space is 2^46 (64TB). Allow randomisation to consume up to 16T
	# of address space (2^44). Only 4K page sizes are supported.
	default 32 if 64BIT # 32 = 44 (16T) - 12 (4K)
	#
	# For 32-bit, use the compat values, as they're the same.
	default ARCH_MMAP_RND_COMPAT_BITS_MAX

config ARCH_MMAP_RND_BITS_MIN
	# Allow randomisation to consume up to 1GB of address space (2^30).
	default 14 if 64BIT && PPC_64K_PAGES # 14 = 30 (1GB) - 16 (64K)
	default 18 if 64BIT # 18 = 30 (1GB) - 12 (4K)
	#
	# For 32-bit, use the compat values, as they're the same.
	default ARCH_MMAP_RND_COMPAT_BITS_MIN

config ARCH_MMAP_RND_COMPAT_BITS_MAX
	# Total virtual address space for 32-bit processes is 2^31 (2GB).
	# Allow randomisation to consume up to 512MB of address space (2^29).
	default 11 if PPC_256K_PAGES # 11 = 29 (512MB) - 18 (256K)
	default 13 if PPC_64K_PAGES # 13 = 29 (512MB) - 16 (64K)
2019-07-04 00:04:13 +08:00
	default 15 if PPC_16K_PAGES # 15 = 29 (512MB) - 14 (16K)
2017-04-20 22:36:20 +08:00
	default 17 # 17 = 29 (512MB) - 12 (4K)

config ARCH_MMAP_RND_COMPAT_BITS_MIN
	# Total virtual address space for 32-bit processes is 2^31 (2GB).
	# Allow randomisation to consume up to 8MB of address space (2^23).
	default 5 if PPC_256K_PAGES # 5 = 23 (8MB) - 18 (256K)
	default 7 if PPC_64K_PAGES # 7 = 23 (8MB) - 16 (64K)
	default 9 if PPC_16K_PAGES # 9 = 23 (8MB) - 14 (16K)
	default 11 # 11 = 23 (8MB) - 12 (4K)

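The defaults in the ARCH_MMAP_RND_* entries all follow one piece of
arithmetic spelled out in their trailing comments: the number of random
bits is log2 of the address span given over to randomisation, minus the
page shift. A small sketch of that calculation follows; the helper name
and the PAGE_SHIFT table are illustrative assumptions, not kernel code.

```python
# log2 of each page size that appears in the Kconfig comments
PAGE_SHIFT = {"4K": 12, "16K": 14, "64K": 16, "256K": 18}


def rnd_bits(span_log2: int, page_size: str) -> int:
    """Random bits needed to cover 2**span_log2 bytes in whole pages."""
    return span_log2 - PAGE_SHIFT[page_size]


# Book3S 64 maximum: 32T of randomisation (2^45) with 64K or 4K pages
book3s_64k_max = rnd_bits(45, "64K")
book3s_4k_max = rnd_bits(45, "4K")

# 32-bit compat minimum: 8MB of randomisation (2^23) with 16K pages
compat_16k_min = rnd_bits(23, "16K")
```

Randomisation works in units of pages, so a larger page size leaves
fewer bits to randomise within the same span.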
2009-10-14 03:44:44 +08:00
config NR_IRQS
	int "Number of virtual interrupt numbers"
2020-12-11 01:14:44 +08:00
	range 32 1048576
2009-10-14 03:44:44 +08:00
	default "512"
	help
	  This defines the number of virtual interrupt numbers the kernel
	  can manage. Virtual interrupt numbers are what you see in
	  /proc/interrupts. If you configure your system to have too few,
	  drivers will fail to load or worse - handle with care.

2016-12-20 02:30:08 +08:00
config NMI_IPI
	bool
2017-07-13 05:35:52 +08:00
	depends on SMP && (DEBUGGER || KEXEC_CORE || HARDLOCKUP_DETECTOR)
2016-12-20 02:30:08 +08:00
	default y

2017-08-01 20:00:52 +08:00
config PPC_WATCHDOG
	bool
	depends on HARDLOCKUP_DETECTOR
	depends on HAVE_HARDLOCKUP_DETECTOR_ARCH
	default y
	help
	  This is a placeholder when the powerpc hardlockup detector
	  watchdog is selected (arch/powerpc/kernel/watchdog.c). It is
2020-12-07 23:54:20 +08:00
	  selected via the generic lockup detector menu which is why we
2017-08-01 20:00:52 +08:00
	  have no standalone config option for it here.

2008-04-17 12:35:00 +08:00
config STACKTRACE_SUPPORT
	bool
	default y

2008-04-17 12:35:01 +08:00
config LOCKDEP_SUPPORT
	bool
	default y

2008-01-30 20:31:20 +08:00
config GENERIC_LOCKBREAK
	bool
	default y
2019-10-25 00:04:58 +08:00
	depends on SMP && PREEMPTION
2008-01-30 20:31:20 +08:00

2006-03-26 17:39:33 +08:00
config GENERIC_HWEIGHT
	bool
	default y

2005-09-26 14:04:21 +08:00
config PPC
	bool
	default y
2017-03-06 19:53:59 +08:00
	#
	# Please keep this list sorted alphabetically.
	#
32-bit userspace ABI: introduce ARCH_32BIT_OFF_T config option
All new 32-bit architectures should have a 64-bit userspace off_t type, but
existing architectures have 32-bit ones.
To enforce the rule, a new config option is added to arch/Kconfig that defaults
ARCH_32BIT_OFF_T to be disabled for new 32-bit architectures. All existing
32-bit architectures enable it explicitly.
The new option affects force_o_largefile() behaviour. Namely, if userspace
off_t is 64 bits long, we have no reason to prevent the user from opening
big files.
Note that even if an architecture has only 64-bit off_t in the kernel
(arc, c6x, h8300, hexagon, nios2, openrisc, and unicore32),
a libc may use 32-bit off_t, and therefore want to limit the file size
to 4GB unless specified differently in the open flags.
Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Yury Norov <ynorov@marvell.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-05-16 16:18:49 +08:00
	select ARCH_32BIT_OFF_T if PPC32
2021-05-05 09:38:17 +08:00
	select ARCH_ENABLE_MEMORY_HOTPLUG
	select ARCH_ENABLE_MEMORY_HOTREMOVE
2021-04-22 01:06:42 +08:00
	select ARCH_HAS_COPY_MC if PPC64
2022-02-17 04:05:28 +08:00
	select ARCH_HAS_CURRENT_STACK_POINTER
2018-12-19 15:09:39 +08:00
	select ARCH_HAS_DEBUG_VIRTUAL
2021-03-18 11:48:55 +08:00
	select ARCH_HAS_DEBUG_VM_PGTABLE
2021-07-09 00:49:43 +08:00
	select ARCH_HAS_DEBUG_WX if STRICT_KERNEL_RWX
2017-03-06 19:53:59 +08:00
	select ARCH_HAS_DEVMEM_IS_ALLOWED
2021-04-22 01:06:42 +08:00
	select ARCH_HAS_DMA_MAP_DIRECT if PPC_PSERIES
include/linux/string.h: add the option of fortified string.h functions
This adds support for compiling with a rough equivalent to the glibc
_FORTIFY_SOURCE=1 feature, providing compile-time and runtime buffer
overflow checks for string.h functions when the compiler determines the
size of the source or destination buffer at compile-time. Unlike glibc,
it covers buffer reads in addition to writes.
GNU C __builtin_*_chk intrinsics are avoided because they would force a
much more complex implementation. They aren't designed to detect read
overflows and offer no real benefit when using an implementation based
on inline checks. Inline checks don't add up to much code size and
allow full use of the regular string intrinsics while avoiding the need
for a bunch of _chk functions and per-arch assembly to avoid wrapper
overhead.
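The idea of an inline check that covers both reads and writes can be
modelled in a few lines. The sketch below is only a model written in
Python: in the kernel the sizes come from the compiler via
__builtin_object_size() and the check is inlined into the string.h
wrappers, whereas here len() stands in for the known buffer size and the
names FortifyError and fortified_memcpy are made up for the example.

```python
class FortifyError(RuntimeError):
    """Raised when a copy would run past a buffer of known size."""


def fortified_memcpy(dst: bytearray, src: bytes, n: int) -> None:
    """Copy n bytes from src to dst after checking both buffer sizes."""
    if n > len(dst):
        # Write overflow: destination is smaller than the requested copy.
        raise FortifyError("write overflow")
    if n > len(src):
        # Read overflow: this is the case the message says glibc's
        # _FORTIFY_SOURCE does not cover.
        raise FortifyError("read overflow")
    dst[:n] = src[:n]
```

A call such as fortified_memcpy(bytearray(4), b"abcdefgh", 8) fails the
first check, while copying 8 bytes out of a 4-byte source into a large
enough destination fails the second.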
This detects various overflows at compile-time in various drivers and
some non-x86 core kernel code. There will likely be issues caught in
regular use at runtime too.
Future improvements left out of initial implementation for simplicity,
as it's all quite optional and can be done incrementally:
* Some of the fortified string functions (strncpy, strcat), don't yet
place a limit on reads from the source based on __builtin_object_size of
the source buffer.
* Extending coverage to more string functions like strlcat.
* It should be possible to optionally use __builtin_object_size(x, 1) for
some functions (C strings) to detect intra-object overflows (like
glibc's _FORTIFY_SOURCE=2), but for now this takes the conservative
approach to avoid likely compatibility issues.
* The compile-time checks should be made available via a separate config
option which can be enabled by default (or always enabled) once enough
time has passed to get the issues it catches fixed.
Kees said:
"This is great to have. While it was out-of-tree code, it would have
blocked at least CVE-2016-3858 from being exploitable (improper size
argument to strlcpy()). I've sent a number of fixes for
out-of-bounds-reads that this detected upstream already"
[arnd@arndb.de: x86: fix fortified memcpy]
Link: http://lkml.kernel.org/r/20170627150047.660360-1-arnd@arndb.de
[keescook@chromium.org: avoid panic() in favor of BUG()]
Link: http://lkml.kernel.org/r/20170626235122.GA25261@beast
[keescook@chromium.org: move from -mm, add ARCH_HAS_FORTIFY_SOURCE, tweak Kconfig help]
Link: http://lkml.kernel.org/r/20170526095404.20439-1-danielmicay@gmail.com
Link: http://lkml.kernel.org/r/1497903987-21002-8-git-send-email-keescook@chromium.org
Signed-off-by: Daniel Micay <danielmicay@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-13 05:36:10 +08:00
	select ARCH_HAS_FORTIFY_SOURCE
2017-03-06 19:53:59 +08:00
	select ARCH_HAS_GCOV_PROFILE_ALL
2019-07-12 11:57:28 +08:00
	select ARCH_HAS_HUGEPD if HUGETLB_PAGE
2021-04-22 01:06:42 +08:00
	select ARCH_HAS_KCOV
	select ARCH_HAS_MEMBARRIER_CALLBACKS
	select ARCH_HAS_MEMBARRIER_SYNC_CORE
2021-12-01 22:41:52 +08:00
	select ARCH_HAS_MEMREMAP_COMPAT_ALIGN if PPC_64S_HASH_MMU
2019-02-22 22:45:42 +08:00
	select ARCH_HAS_MMIOWB if PPC64
2021-04-22 01:06:42 +08:00
	select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
2018-01-10 23:21:13 +08:00
	select ARCH_HAS_PHYS_TO_DMA
2019-07-31 14:31:41 +08:00
	select ARCH_HAS_PMEM_API
2019-07-17 07:30:47 +08:00
	select ARCH_HAS_PTE_DEVMAP if PPC_BOOK3S_64
2018-06-08 08:06:08 +08:00
	select ARCH_HAS_PTE_SPECIAL
2019-08-23 00:44:05 +08:00
	select ARCH_HAS_SCALED_CPUTIME if VIRT_CPU_ACCOUNTING_NATIVE && PPC_BOOK3S_64
2021-06-09 09:34:23 +08:00
	select ARCH_HAS_SET_MEMORY
2021-10-15 18:02:42 +08:00
	select ARCH_HAS_STRICT_KERNEL_RWX if (PPC_BOOK3S || PPC_8xx || 40x) && !HIBERNATION
2021-10-15 18:02:49 +08:00
	select ARCH_HAS_STRICT_KERNEL_RWX if FSL_BOOKE && !HIBERNATION && !RANDOMIZE_BASE
2022-01-17 18:06:39 +08:00
	select ARCH_HAS_STRICT_MODULE_RWX if ARCH_HAS_STRICT_KERNEL_RWX
2017-03-06 19:53:59 +08:00
	select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
2019-07-31 14:31:41 +08:00
	select ARCH_HAS_UACCESS_FLUSHCACHE
2017-03-06 19:53:59 +08:00
	select ARCH_HAS_UBSAN_SANITIZE_ALL
	select ARCH_HAVE_NMI_SAFE_CMPXCHG
2019-05-14 08:22:59 +08:00
	select ARCH_KEEP_MEMBLOCK
2013-10-08 10:15:32 +08:00
	select ARCH_MIGHT_HAVE_PC_PARPORT
2014-01-02 03:32:26 +08:00
	select ARCH_MIGHT_HAVE_PC_SERIO
2018-01-04 23:35:25 +08:00
	select ARCH_OPTIONAL_KERNEL_RWX if ARCH_HAS_STRICT_KERNEL_RWX
2021-09-25 01:13:53 +08:00
	select ARCH_OPTIONAL_KERNEL_RWX_DEFAULT
2021-03-16 15:57:15 +08:00
	select ARCH_STACKWALK
2017-03-06 19:53:59 +08:00
	select ARCH_SUPPORTS_ATOMIC_RMW
2021-10-15 18:02:42 +08:00
	select ARCH_SUPPORTS_DEBUG_PAGEALLOC if PPC_BOOK3S || PPC_8xx || 40x
2017-03-06 19:53:59 +08:00
	select ARCH_USE_BUILTIN_BSWAP
	select ARCH_USE_CMPXCHG_LOCKREF if PPC64
2021-04-30 13:55:15 +08:00
	select ARCH_USE_MEMTEST
powerpc/64s: Implement queued spinlocks and rwlocks
These have shown significantly improved performance and fairness when
spinlock contention is moderate to high on very large systems.
With this series including subsequent patches, on a 16 socket 1536
thread POWER9, a stress test such as same-file open/close from all
CPUs gets big speedups, 11620op/s aggregate with simple spinlocks vs
384158op/s (33x faster), where the difference in throughput between
the fastest and slowest thread goes from 7x to 1.4x.
Thanks to the fast path being identical in terms of atomics and
barriers (after a subsequent optimisation patch), single threaded
performance is not changed (no measurable difference).
On smaller systems, performance and fairness seem to be generally
improved. Using dbench on tmpfs as a test (that starts to run into
kernel spinlock contention), a 2-socket OpenPOWER POWER9 system was
tested with bare metal and KVM guest configurations. Results can be
found here:
https://github.com/linuxppc/issues/issues/305#issuecomment-663487453
Observations are:
- Queued spinlocks are equal when contention is insignificant, as
expected and as measured with microbenchmarks.
- When there is contention, on bare metal queued spinlocks have better
throughput and max latency at all points.
- When virtualised, queued spinlocks are slightly worse approaching
peak throughput, but significantly better throughput and max latency
at all points beyond peak, until queued spinlock maximum latency
rises when clients are 2x vCPUs.
The regressions haven't been analysed very well yet. There are a lot
of things that can be tuned, particularly the paravirtualised locking,
but the numbers already look like a good net win even on relatively
small systems.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200724131423.1362108-4-npiggin@gmail.com
2020-07-24 21:14:20 +08:00
	select ARCH_USE_QUEUED_RWLOCKS if PPC_QUEUED_SPINLOCKS
	select ARCH_USE_QUEUED_SPINLOCKS if PPC_QUEUED_SPINLOCKS
2022-04-10 01:17:36 +08:00
	select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT
2017-03-06 19:53:59 +08:00
	select ARCH_WANT_IPC_PARSE_VERSION
2020-09-14 12:52:17 +08:00
	select ARCH_WANT_IRQS_OFF_ACTIVATE_MM
2020-11-20 04:46:56 +08:00
	select ARCH_WANT_LD_ORPHAN_WARN
2017-01-15 05:32:50 +08:00
	select ARCH_WEAK_RELEASE_ACQUIRE
2013-03-07 02:11:51 +08:00
	select BINFMT_ELF
2019-12-04 08:46:31 +08:00
	select BUILDTIME_TABLE_SORT
2017-03-06 19:53:59 +08:00
	select CLONE_BACKWARDS
2021-11-05 11:50:42 +08:00
	select CPUMASK_OFFSTACK if NR_CPUS >= 8192
2017-03-06 19:53:59 +08:00
	select DCACHE_WORD_ACCESS if PPC64 && CPU_LITTLE_ENDIAN
2020-07-08 18:22:47 +08:00
	select DMA_OPS_BYPASS if PPC64
2021-04-22 01:06:42 +08:00
	select DMA_OPS if PPC64
2018-03-27 12:29:06 +08:00
	select DYNAMIC_FTRACE if FUNCTION_TRACER
2017-03-06 19:53:59 +08:00
	select EDAC_ATOMIC_SCRUB
	select EDAC_SUPPORT
	select GENERIC_ATOMIC64 if PPC32
	select GENERIC_CLOCKEVENTS_BROADCAST if SMP
	select GENERIC_CMOS_UPDATE
	select GENERIC_CPU_AUTOPROBE
2018-07-28 07:06:34 +08:00
	select GENERIC_CPU_VULNERABILITIES if PPC_BARRIER_NOSPEC
2019-09-12 21:49:43 +08:00
	select GENERIC_EARLY_IOREMAP
2021-04-01 00:48:47 +08:00
	select GENERIC_GETTIMEOFDAY
2017-03-06 19:53:59 +08:00
	select GENERIC_IRQ_SHOW
	select GENERIC_IRQ_SHOW_LEVEL
2018-11-16 03:05:32 +08:00
	select GENERIC_PCI_IOMAP if PCI
2021-07-09 00:49:43 +08:00
	select GENERIC_PTDUMP
2017-03-06 19:53:59 +08:00
	select GENERIC_SMP_IDLE_THREAD
powerpc: Convert VDSO update function to use new update_vsyscall interface
This converts the powerpc VDSO time update function to use the new
interface introduced in commit 576094b7f0aa ("time: Introduce new
GENERIC_TIME_VSYSCALL", 2012-09-11). Where the old interface gave
us the time as of the last update in seconds and whole nanoseconds,
with the new interface we get the nanoseconds part effectively in
a binary fixed-point format with tk->tkr_mono.shift bits to the
right of the binary point.
With the old interface, the fractional nanoseconds got truncated,
meaning that the value returned by the VDSO clock_gettime function
would have about 1ns of jitter in it compared to the value computed
by the generic timekeeping code in the kernel.
The powerpc VDSO time functions (clock_gettime and gettimeofday)
already work in units of 2^-32 seconds, or 0.23283 ns, because that
makes it simple to split the result into seconds and fractional
seconds, and represent the fractional seconds in either microseconds
or nanoseconds. This is good enough accuracy for now, so this patch
avoids changing how the VDSO works or the interface in the VDSO data
page.
This patch converts the powerpc update_vsyscall_old to be called
update_vsyscall and use the new interface. We convert the fractional
second to units of 2^-32 seconds without truncating to whole nanoseconds.
(There is still a conversion to whole nanoseconds for any legacy users
of the vdso_data/systemcfg stamp_xtime field.)
In addition, this improves the accuracy of the computation of tb_to_xs
for those systems with high-frequency timebase clocks (>= 268.5 MHz)
by doing the right shift in two parts, one before the multiplication and
one after, rather than doing the right shift before the multiplication.
(We can't do all of the right shift after the multiplication unless we
use 128-bit arithmetic.)
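To make the accuracy argument concrete, here is a small sketch of the
two conversions into units of 2^-32 seconds. It uses arbitrary-precision
Python integers rather than the kernel's 64-bit arithmetic, and the
function names and sample shift are assumptions made for illustration.

```python
NSEC_PER_SEC = 10**9

# Length of one 2^-32 second unit, in nanoseconds (about 0.23283 ns).
UNIT_NS = NSEC_PER_SEC / 2**32


def to_units_new(shifted_ns: int, shift: int) -> int:
    """Convert fixed-point nanoseconds directly to 2^-32 s units.

    shifted_ns carries `shift` fractional bits to the right of the
    binary point, as delivered by the new interface.
    """
    return (shifted_ns << 32) // (NSEC_PER_SEC << shift)


def to_units_old(shifted_ns: int, shift: int) -> int:
    """Same conversion, but truncating to whole nanoseconds first."""
    whole_ns = shifted_ns >> shift
    return (whole_ns << 32) // NSEC_PER_SEC


SHIFT = 8                       # hypothetical clocksource shift
SAMPLE = (500 << SHIFT) + 200   # 500 + 200/256 nanoseconds
new_units = to_units_new(SAMPLE, SHIFT)
old_units = to_units_old(SAMPLE, SHIFT)
```

Because the old path drops the fractional nanoseconds before scaling, it
can come out a few 2^-32 s units lower than the direct conversion, which
is the jitter described above.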
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Acked-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-05-27 16:04:52 +08:00
	select GENERIC_TIME_VSYSCALL
2021-04-01 00:48:47 +08:00
	select GENERIC_VDSO_TIME_NS
2017-03-06 19:53:59 +08:00
	select HAVE_ARCH_AUDITSYSCALL
2021-05-03 17:17:55 +08:00
	select HAVE_ARCH_HUGE_VMALLOC if HAVE_ARCH_HUGE_VMAP
2021-07-01 09:48:12 +08:00
	select HAVE_ARCH_HUGE_VMAP if PPC_RADIX_MMU || PPC_8xx
2017-03-06 19:53:59 +08:00
	select HAVE_ARCH_JUMP_LABEL
2021-03-23 23:47:59 +08:00
	select HAVE_ARCH_JUMP_LABEL_RELATIVE
2020-05-28 18:17:04 +08:00
	select HAVE_ARCH_KASAN if PPC32 && PPC_PAGE_SHIFT <= 14
	select HAVE_ARCH_KASAN_VMALLOC if PPC32 && PPC_PAGE_SHIFT <= 14
2021-10-15 18:02:42 +08:00
	select HAVE_ARCH_KFENCE if PPC_BOOK3S_32 || PPC_8xx || 40x
2021-04-22 01:06:42 +08:00
	select HAVE_ARCH_KGDB
2017-04-20 22:36:20 +08:00
	select HAVE_ARCH_MMAP_RND_BITS
	select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
2019-01-15 12:18:56 +08:00
	select HAVE_ARCH_NVRAM_OPS
2017-03-06 19:53:59 +08:00
	select HAVE_ARCH_SECCOMP_FILTER
	select HAVE_ARCH_TRACEHOOK
2019-08-19 13:54:20 +08:00
	select HAVE_ASM_MODVERSIONS
2017-03-06 19:53:59 +08:00
	select HAVE_CONTEXT_TRACKING if PPC64
2021-04-22 01:06:42 +08:00
	select HAVE_C_RECORDMCOUNT
2017-03-06 19:53:59 +08:00
	select HAVE_DEBUG_KMEMLEAK
	select HAVE_DEBUG_STACKOVERFLOW
2009-01-07 02:49:17 +08:00
	select HAVE_DYNAMIC_FTRACE
2021-12-21 00:38:28 +08:00
	select HAVE_DYNAMIC_FTRACE_WITH_ARGS if MPROFILE_KERNEL || PPC32
2021-10-28 20:24:04 +08:00
	select HAVE_DYNAMIC_FTRACE_WITH_REGS if MPROFILE_KERNEL || PPC32
2021-03-23 00:37:52 +08:00
	select HAVE_EBPF_JIT
2017-03-06 19:53:59 +08:00
	select HAVE_EFFICIENT_UNALIGNED_ACCESS if !(CPU_LITTLE_ENDIAN && POWER7_CPU)
2019-07-12 11:57:14 +08:00
	select HAVE_FAST_GUP
2017-03-06 19:53:59 +08:00
	select HAVE_FTRACE_MCOUNT_RECORD
2022-05-09 13:36:08 +08:00
	select HAVE_FUNCTION_DESCRIPTORS if PPC64_ELF_ABI_V1
2018-06-07 17:52:02 +08:00
	select HAVE_FUNCTION_ERROR_INJECTION
2009-02-12 09:06:43 +08:00
	select HAVE_FUNCTION_GRAPH_TRACER
2017-03-06 19:53:59 +08:00
	select HAVE_FUNCTION_TRACER
2018-05-28 17:22:05 +08:00
	select HAVE_GCC_PLUGINS if GCC_VERSION >= 50200 # plugin support on gcc <= 5.1 is buggy on PPC
2020-11-26 21:10:05 +08:00
	select HAVE_GENERIC_VDSO
2021-04-22 01:06:42 +08:00
	select HAVE_HARDLOCKUP_DETECTOR_ARCH if PPC_BOOK3S_64 && SMP
	select HAVE_HARDLOCKUP_DETECTOR_PERF if PERF_EVENTS && HAVE_PERF_EVENTS_NMI && !HAVE_HARDLOCKUP_DETECTOR_ARCH
2017-03-06 19:53:59 +08:00
	select HAVE_HW_BREAKPOINT if PERF_EVENTS && (PPC_BOOK3S || PPC_8xx)
2008-07-24 12:27:08 +08:00
	select HAVE_IOREMAP_PROT
2017-03-06 19:53:59 +08:00
	select HAVE_IRQ_EXIT_ON_IRQ_STACK
2021-04-22 01:06:42 +08:00
	select HAVE_IRQ_TIME_ACCOUNTING
2017-03-06 19:53:59 +08:00
	select HAVE_KERNEL_GZIP
2019-06-14 18:16:24 +08:00
	select HAVE_KERNEL_LZMA if DEFAULT_UIMAGE
2019-06-14 18:16:25 +08:00
	select HAVE_KERNEL_LZO if DEFAULT_UIMAGE
2019-02-01 04:59:04 +08:00
	select HAVE_KERNEL_XZ if PPC_BOOK3S || 44x
2008-02-03 04:10:35 +08:00
	select HAVE_KPROBES
2017-04-19 20:52:26 +08:00
	select HAVE_KPROBES_ON_FTRACE
2008-03-05 06:28:37 +08:00
	select HAVE_KRETPROBES
2018-05-09 21:00:01 +08:00
	select HAVE_LD_DEAD_CODE_DATA_ELIMINATION
2021-12-21 00:38:12 +08:00
	select HAVE_LIVEPATCH if HAVE_DYNAMIC_FTRACE_WITH_REGS
2017-03-06 19:53:59 +08:00
	select HAVE_MOD_ARCH_SPECIFIC
2017-07-13 05:35:52 +08:00
	select HAVE_NMI if PERF_EVENTS || (PPC64 && PPC_BOOK3S)
2021-04-20 22:02:07 +08:00
	select HAVE_OPTPROBES
perf: Do the big rename: Performance Counters -> Performance Events
Bye-bye Performance Counters, welcome Performance Events!
In the past few months the perfcounters subsystem has outgrown its
initial role of counting hardware events, and has become (and is
becoming) a much broader generic event enumeration, reporting, logging,
monitoring, analysis facility.
Naming its core object 'perf_counter' and naming the subsystem
'perfcounters' has become more and more of a misnomer. With pending
code like hw-breakpoints support the 'counter' name is less and
less appropriate.
All in all, we've decided to rename the subsystem to 'performance
events' and to propagate this rename through all fields, variables
and API names. (in an ABI compatible fashion)
The word 'event' is also a bit shorter than 'counter' - which makes
it slightly more convenient to write/handle as well.
Thanks go to Stephane Eranian who first observed this misnomer and
suggested a rename.
User-space tooling and ABI compatibility is not affected - this patch
should be function-invariant. (Also, defconfigs were not touched to
keep the size down.)
This patch has been generated via the following script:
  FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')
  sed -i \
    -e 's/PERF_EVENT_/PERF_RECORD_/g' \
    -e 's/PERF_COUNTER/PERF_EVENT/g' \
    -e 's/perf_counter/perf_event/g' \
    -e 's/nb_counters/nb_events/g' \
    -e 's/swcounter/swevent/g' \
    -e 's/tpcounter_event/tp_event/g' \
    $FILES
  for N in $(find . -name perf_counter.[ch]); do
    M=$(echo $N | sed 's/perf_counter/perf_event/g')
    mv $N $M
  done
  FILES=$(find . -name perf_event.*)
  sed -i \
    -e 's/COUNTER_MASK/REG_MASK/g' \
    -e 's/COUNTER/EVENT/g' \
    -e 's/\<event\>/event_id/g' \
    -e 's/counter/event/g' \
    -e 's/Counter/Event/g' \
    $FILES
... to keep it as correct as possible. This script can also be
used by anyone who has pending perfcounters patches - it converts
a Linux kernel tree over to the new naming. We tried to time this
change to the point in time where the amount of pending patches
is the smallest: the end of the merge window.
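One detail of the first sed invocation is that the order of the
expressions matters: the old PERF_EVENT_ prefix has to be moved to
PERF_RECORD_ before PERF_COUNTER is renamed to PERF_EVENT, otherwise
the freshly renamed names would be rewritten a second time. The sketch
below replays the same substitutions with Python's re module; the rule
list is copied from the script, while the rename() helper and the two
sample identifiers are only illustrative.

```python
import re

# (pattern, replacement) pairs in the order used by the first sed call
RENAMES = [
    (r"PERF_EVENT_", "PERF_RECORD_"),
    (r"PERF_COUNTER", "PERF_EVENT"),
    (r"perf_counter", "perf_event"),
    (r"nb_counters", "nb_events"),
    (r"swcounter", "swevent"),
    (r"tpcounter_event", "tp_event"),
]


def rename(text: str, rules=RENAMES) -> str:
    """Apply each substitution in turn, like successive sed -e expressions."""
    for pattern, replacement in rules:
        text = re.sub(pattern, replacement, text)
    return text


in_order = rename("PERF_COUNTER_IOC_ENABLE")
# Applying the rules in the opposite order renames the result twice.
reversed_order = rename("PERF_COUNTER_IOC_ENABLE", list(reversed(RENAMES)))
```

With the original order the sample ends up with the PERF_EVENT_ prefix;
with the order reversed it is pushed on to PERF_RECORD_, which is not
what the rename intends.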
Namespace clashes were fixed up in a preparatory patch - and some
stylistic fallout will be fixed up in a subsequent patch.
( NOTE: 'counters' are still the proper terminology when we deal
with hardware registers - and these sed scripts are a bit
over-eager in renaming them. I've undone some of that, but
in case there's something left where 'counter' would be
better than 'event' we can undo that on an individual basis
instead of touching an otherwise nicely automated patch. )
Suggested-by: Stephane Eranian <eranian@google.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Paul Mackerras <paulus@samba.org>
Reviewed-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: <linux-arch@vger.kernel.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-21 18:02:48 +08:00
	select HAVE_PERF_EVENTS
2017-03-06 19:53:59 +08:00
	select HAVE_PERF_EVENTS_NMI if PPC64
2016-02-20 13:02:46 +08:00
	select HAVE_PERF_REGS
2016-04-28 17:31:08 +08:00
	select HAVE_PERF_USER_STACK_DUMP
2010-04-07 16:10:20 +08:00
	select HAVE_REGS_AND_STACK_ACCESS_API
2021-03-16 15:57:13 +08:00
	select HAVE_RELIABLE_STACKTRACE
2021-04-22 01:06:42 +08:00
	select HAVE_RSEQ
mm: percpu: generalize percpu related config
Patch series "mm: percpu: Cleanup percpu first chunk function".
When adding support for the page-mapping percpu first chunk allocator on
arm64, we found a lot of duplicated code in the percpu embed/page first chunk
allocators. This patchset aims to clean that up and should cause no functional
change.
The current support status of 'embed' and 'page' across architectures is shown
below,
embed: NEED_PER_CPU_EMBED_FIRST_CHUNK
page: NEED_PER_CPU_PAGE_FIRST_CHUNK
           embed   page
------------------------
arm64        Y      Y
mips         Y      N
powerpc      Y      Y
riscv        Y      N
sparc        Y      Y
x86          Y      Y
------------------------
There are two interfaces about percpu first chunk allocator,
extern int __init pcpu_embed_first_chunk(size_t reserved_size, size_t dyn_size,
size_t atom_size,
pcpu_fc_cpu_distance_fn_t cpu_distance_fn,
- pcpu_fc_alloc_fn_t alloc_fn,
- pcpu_fc_free_fn_t free_fn);
+ pcpu_fc_cpu_to_node_fn_t cpu_to_nd_fn);
extern int __init pcpu_page_first_chunk(size_t reserved_size,
- pcpu_fc_alloc_fn_t alloc_fn,
- pcpu_fc_free_fn_t free_fn,
- pcpu_fc_populate_pte_fn_t populate_pte_fn);
+ pcpu_fc_cpu_to_node_fn_t cpu_to_nd_fn);
The pcpu_fc_alloc_fn_t/pcpu_fc_free_fn_t callbacks are removed; generic
pcpu_fc_alloc() and pcpu_fc_free() functions are provided instead and are
called from pcpu_embed/page_first_chunk().
1) For pcpu_embed_first_chunk(), a pcpu_fc_cpu_to_node_fn_t must be
provided when the arch supports NUMA.
2) For pcpu_page_first_chunk(), the pcpu_fc_populate_pte_fn_t is removed too;
a generic pcpu_populate_pte() marked '__weak' is provided. If an arch
(like x86) needs a different function to populate the pte, it should
provide its own implementation.
[1] https://github.com/kevin78/linux.git percpu-cleanup
This patch (of 4):
The HAVE_SETUP_PER_CPU_AREA/NEED_PER_CPU_EMBED_FIRST_CHUNK/
NEED_PER_CPU_PAGE_FIRST_CHUNK/USE_PERCPU_NUMA_NODE_ID configs have
duplicate definitions on the platforms that use them.
Move them into mm, drop the redundant definitions and instead just
select them on applicable platforms.
Link: https://lkml.kernel.org/r/20211216112359.103822-1-wangkefeng.wang@huawei.com
Link: https://lkml.kernel.org/r/20211216112359.103822-2-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64]
Cc: Will Deacon <will@kernel.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-20 10:07:41 +08:00
|
|
|
select HAVE_SETUP_PER_CPU_AREA if PPC64
|
2021-02-10 07:40:52 +08:00
|
|
|
select HAVE_SOFTIRQ_ON_OWN_STACK
|
2021-04-22 01:06:42 +08:00
|
|
|
select HAVE_STACKPROTECTOR if PPC32 && $(cc-option,-mstack-protector-guard=tls -mstack-protector-guard-reg=r2)
|
|
|
|
select HAVE_STACKPROTECTOR if PPC64 && $(cc-option,-mstack-protector-guard=tls -mstack-protector-guard-reg=r13)
|
powerpc/32: Add support for out-of-line static calls
Add support for out-of-line static calls on PPC32. This change
improves performance of calls to global function pointers by
using direct calls instead of indirect calls.
The trampoline is initially populated with a 'blr' or branch to target,
followed by an unreachable long jump sequence.
In order to cater for parallel execution, the trampoline needs to
be updated in a way that ensures it remains consistent at all times.
This means we can't use the traditional lis/addi to load r12 with
the target address, otherwise there would be a window during which
the first instruction contains the upper part of the new target
address while the second instruction still contains the lower part of
the old target address. To avoid that the target address is stored
just after the 'bctr' and loaded from there with a single instruction.
Then, depending on the target distance, arch_static_call_transform()
will either replace the first instruction by a direct 'bl <target>' or
'nop' in order to have the trampoline fall through the long jump
sequence.
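The distance test behind that choice can be sketched as follows. This is a minimal illustration, not the kernel's implementation: the helper names are hypothetical, and only the instruction encodings (primary opcode 18 for an I-form branch, a signed 26-bit word-aligned offset, the LK bit for 'bl', and 0x60000000 for 'nop') are taken as given. The addresses come from the disassembly further down in this message, where the trampoline's first word is a plain 'b' and the call site uses 'bl'.

```python
NOP = 0x60000000  # 'ori 0,0,0', the canonical PowerPC nop


def in_branch_range(offset):
    """I-form branches carry a signed 26-bit, word-aligned byte offset (+/- 32 Mbytes)."""
    return -0x2000000 <= offset <= 0x1FFFFFC and offset % 4 == 0


def encode_branch(src, dst, link=False):
    """Encode 'b dst' (or 'bl dst' when link is set) for an instruction at address src."""
    offset = dst - src
    if not in_branch_range(offset):
        raise ValueError("target out of direct branch range")
    return 0x48000000 | (offset & 0x03FFFFFC) | (1 if link else 0)


def first_trampoline_word(tramp, target):
    """Hypothetical helper: direct branch when reachable, else nop to fall
    through into the long jump sequence."""
    if in_branch_range(target - tramp):
        return encode_branch(tramp, target)
    return NOP


if __name__ == "__main__":
    # Trampoline at c0004a00 branching to __traceiter_irq_entry at c0004ae0.
    print(hex(first_trampoline_word(0xC0004A00, 0xC0004AE0)))
    # Call site at c00056c4 doing 'bl' to the trampoline.
    print(hex(encode_branch(0xC00056C4, 0xC0004A00, link=True)))
```

A target more than 32 Mbytes away, such as a module loaded far from the kernel text, would make first_trampoline_word() return the nop instead.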
For the special case of __static_call_return0(), to avoid the risk of
a far branch, a version of it is inlined at the end of the trampoline.
Performance-wise the long jump sequence is probably not better than
the indirect calls set by GCC when we don't use static calls, but
such calls are unlikely to be required on powerpc32: With most
configurations the kernel size is far below 32 Mbytes so only
modules may happen to be too far. And even modules are likely to
be close enough as they are allocated below the kernel core and
as close as possible to the kernel text.
static_call selftest is running successfully with this change.
With this patch, __do_irq() has the following sequence to trace
irq entries:
c0004a00 <__SCT__tp_func_irq_entry>:
c0004a00: 48 00 00 e0 b c0004ae0 <__traceiter_irq_entry>
c0004a04: 3d 80 c0 00 lis r12,-16384
c0004a08: 81 8c 4a 1c lwz r12,18972(r12)
c0004a0c: 7d 89 03 a6 mtctr r12
c0004a10: 4e 80 04 20 bctr
c0004a14: 38 60 00 00 li r3,0
c0004a18: 4e 80 00 20 blr
c0004a1c: 00 00 00 00 .long 0x0
...
c0005654 <__do_irq>:
...
c0005664: 7c 7f 1b 78 mr r31,r3
...
c00056a0: 81 22 00 00 lwz r9,0(r2)
c00056a4: 39 29 00 01 addi r9,r9,1
c00056a8: 91 22 00 00 stw r9,0(r2)
c00056ac: 3d 20 c0 af lis r9,-16209
c00056b0: 81 29 74 cc lwz r9,29900(r9)
c00056b4: 2c 09 00 00 cmpwi r9,0
c00056b8: 41 82 00 10 beq c00056c8 <__do_irq+0x74>
c00056bc: 80 69 00 04 lwz r3,4(r9)
c00056c0: 7f e4 fb 78 mr r4,r31
c00056c4: 4b ff f3 3d bl c0004a00 <__SCT__tp_func_irq_entry>
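In the trampoline listing above, the 'lis'/'lwz' pair shows how the long jump sequence reads its target with a single load: 'lis' only supplies a fixed upper half, and the 'lwz' fetches the full target from a word stored inside the trampoline. A small sketch, assuming nothing beyond the instruction words printed in the listing (the function names are illustrative), reconstructs the address that is read:

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def lis_lwz_address(lis_word, lwz_word):
    """Address read by 'lis rX,hi' followed by 'lwz rY,lo(rX)' on a 32-bit CPU.

    Both instructions keep their 16-bit immediate in the low half of the word.
    """
    hi = sign_extend16(lis_word)
    lo = sign_extend16(lwz_word)
    return ((hi << 16) + lo) & 0xFFFFFFFF


if __name__ == "__main__":
    # Trampoline: lis r12,-16384 ; lwz r12,18972(r12)
    print(hex(lis_lwz_address(0x3D80C000, 0x818C4A1C)))
    # __do_irq: lis r9,-16209 ; lwz r9,29900(r9)
    print(hex(lis_lwz_address(0x3D20C0AF, 0x812974CC)))
```

For the trampoline pair the result is c0004a1c, which is the '.long' slot at the end of the trampoline in the listing.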
Before this patch, __do_irq() was doing the following to trace irq
entries:
c0005700 <__do_irq>:
...
c0005710: 7c 7e 1b 78 mr r30,r3
...
c000574c: 93 e1 00 0c stw r31,12(r1)
c0005750: 81 22 00 00 lwz r9,0(r2)
c0005754: 39 29 00 01 addi r9,r9,1
c0005758: 91 22 00 00 stw r9,0(r2)
c000575c: 3d 20 c0 af lis r9,-16209
c0005760: 83 e9 f4 cc lwz r31,-2868(r9)
c0005764: 2c 1f 00 00 cmpwi r31,0
c0005768: 41 82 00 24 beq c000578c <__do_irq+0x8c>
c000576c: 81 3f 00 00 lwz r9,0(r31)
c0005770: 80 7f 00 04 lwz r3,4(r31)
c0005774: 7d 29 03 a6 mtctr r9
c0005778: 7f c4 f3 78 mr r4,r30
c000577c: 4e 80 04 21 bctrl
c0005780: 85 3f 00 0c lwzu r9,12(r31)
c0005784: 2c 09 00 00 cmpwi r9,0
c0005788: 40 82 ff e4 bne c000576c <__do_irq+0x6c>
Besides the fact of now using a direct 'bl' instead of a
'load/mtctr/bctr' sequence, we can also see that we get one less
register on the stack.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/6ec2a7865ed6a5ec54ab46d026785bafe1d837ea.1630484892.git.christophe.leroy@csgroup.eu
2021-09-01 16:30:21 +08:00
|
|
|
select HAVE_STATIC_CALL if PPC32
|
2017-03-06 19:53:59 +08:00
|
|
|
select HAVE_SYSCALL_TRACEPOINTS
|
|
|
|
select HAVE_VIRT_CPU_ACCOUNTING
|
2021-05-08 19:12:55 +08:00
|
|
|
select HUGETLB_PAGE_SIZE_VARIABLE if PPC_BOOK3S_64 && HUGETLB_PAGE
|
2018-04-03 21:47:59 +08:00
|
|
|
select IOMMU_HELPER if PPC64
|
2012-02-16 16:37:49 +08:00
|
|
|
select IRQ_DOMAIN
|
2011-10-05 10:30:51 +08:00
|
|
|
select IRQ_FORCED_THREADING
|
2021-04-22 01:06:42 +08:00
|
|
|
select MMU_GATHER_PAGE_SIZE
|
|
|
|
select MMU_GATHER_RCU_TABLE_FREE
|
2012-09-28 13:01:03 +08:00
|
|
|
select MODULES_USE_ELF_RELA
|
2018-07-30 15:37:21 +08:00
|
|
|
select NEED_DMA_MAP_STATE if PPC64 || NOT_COHERENT_CACHE
|
2022-01-20 10:07:41 +08:00
|
|
|
select NEED_PER_CPU_EMBED_FIRST_CHUNK if PPC64
|
|
|
|
select NEED_PER_CPU_PAGE_FIRST_CHUNK if PPC64
|
2018-04-05 15:44:52 +08:00
|
|
|
select NEED_SG_DMA_LENGTH
|
2017-03-06 19:53:59 +08:00
|
|
|
select OF
|
2020-01-26 19:52:47 +08:00
|
|
|
select OF_DMA_DEFAULT_COHERENT if !NOT_COHERENT_CACHE
|
2017-03-06 19:53:59 +08:00
|
|
|
select OF_EARLY_FLATTREE
|
|
|
|
select OLD_SIGACTION if PPC32
|
|
|
|
select OLD_SIGSUSPEND
|
2018-11-16 03:05:33 +08:00
|
|
|
select PCI_DOMAINS if PCI
|
2020-09-28 18:13:07 +08:00
|
|
|
select PCI_MSI_ARCH_FALLBACKS if PCI_MSI
|
2018-11-16 03:05:34 +08:00
|
|
|
select PCI_SYSCALL if PCI
|
2019-06-04 11:00:37 +08:00
|
|
|
select PPC_DAWR if PPC64
|
2018-04-23 16:36:38 +08:00
|
|
|
select RTC_LIB
|
2017-03-06 19:53:59 +08:00
|
|
|
select SPARSE_IRQ
|
2021-06-09 09:34:29 +08:00
|
|
|
select STRICT_KERNEL_RWX if STRICT_MODULE_RWX
|
2017-03-06 19:53:59 +08:00
|
|
|
select SYSCTL_EXCEPTION_TRACE
|
2019-01-31 18:08:58 +08:00
|
|
|
select THREAD_INFO_IN_TASK
|
2021-07-31 13:22:32 +08:00
|
|
|
select TRACE_IRQFLAGS_SUPPORT
|
2017-03-06 19:53:59 +08:00
|
|
|
select VIRT_TO_BUS if !PPC64
|
|
|
|
#
|
|
|
|
# Please keep this list sorted alphabetically.
|
|
|
|
#
|
2005-09-26 14:04:21 +08:00
|
|
|
|
2018-07-28 07:06:34 +08:00
|
|
|
config PPC_BARRIER_NOSPEC
|
2019-07-04 00:04:13 +08:00
|
|
|
bool
|
|
|
|
default y
|
|
|
|
depends on PPC_BOOK3S_64 || PPC_FSL_BOOK3E
|
2018-07-28 07:06:34 +08:00
|
|
|
|
2005-09-26 14:04:21 +08:00
|
|
|
config EARLY_PRINTK
|
|
|
|
bool
|
2005-11-23 14:57:25 +08:00
|
|
|
default y
|
2005-09-26 14:04:21 +08:00
|
|
|
|
2013-11-26 07:23:11 +08:00
|
|
|
config PANIC_TIMEOUT
|
|
|
|
int
|
|
|
|
default 180
|
|
|
|
|
2005-09-26 14:04:21 +08:00
|
|
|
config COMPAT
|
2020-03-20 18:20:17 +08:00
|
|
|
bool "Enable support for 32bit binaries"
|
|
|
|
depends on PPC64
|
2021-05-19 04:58:57 +08:00
|
|
|
depends on !CC_IS_CLANG || CLANG_VERSION >= 120000
|
2020-03-20 18:20:17 +08:00
|
|
|
default y if !CPU_LITTLE_ENDIAN
|
[PATCH v3] ipc: provide generic compat versions of IPC syscalls
When using the "compat" APIs, architectures will generally want to
be able to make direct syscalls to msgsnd(), shmctl(), etc., and
in the kernel we would want them to be handled directly by
compat_sys_xxx() functions, as is true for other compat syscalls.
However, for historical reasons, several of the existing compat IPC
syscalls do not do this. semctl() expects a pointer to the fourth
argument, instead of the fourth argument itself. msgsnd(), msgrcv()
and shmat() expect arguments in different order.
This change adds an ARCH_WANT_OLD_COMPAT_IPC config option that can be
set to preserve this behavior for ports that use it (x86, sparc, powerpc,
s390, and mips). No actual semantics are changed for those architectures,
and there is only a minimal amount of code refactoring in ipc/compat.c.
Newer architectures like tile (and perhaps future architectures such
as arm64 and unicore64) should not select this option, and thus can
avoid having any IPC-specific code at all in their architecture-specific
compat layer. In the same vein, if this option is not selected, IPC_64
mode is assumed, since that's what the <asm-generic> headers expect.
The workaround code in "tile" for msgsnd() and msgrcv() is removed
with this change; it also fixes the bug that shmat() and semctl() were
not being properly handled.
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2012-03-16 01:13:38 +08:00
|
|
|
select ARCH_WANT_OLD_COMPAT_IPC
|
2012-12-26 08:27:42 +08:00
|
|
|
select COMPAT_OLD_SIGACTION
|
2005-09-26 14:04:21 +08:00
|
|
|
|
|
|
|
config SYSVIPC_COMPAT
|
|
|
|
bool
|
|
|
|
depends on COMPAT && SYSVIPC
|
|
|
|
default y
|
|
|
|
|
2008-11-11 16:05:16 +08:00
|
|
|
config SCHED_OMIT_FRAME_POINTER
|
2005-09-26 14:04:21 +08:00
|
|
|
bool
|
|
|
|
default y
|
|
|
|
|
|
|
|
config ARCH_MAY_HAVE_PC_FDC
|
|
|
|
bool
|
2014-08-19 05:13:41 +08:00
|
|
|
default PCI
|
2005-09-26 14:04:21 +08:00
|
|
|
|
2006-01-11 11:43:56 +08:00
|
|
|
config PPC_UDBG_16550
|
|
|
|
bool
|
|
|
|
|
|
|
|
config GENERIC_TBSYNC
|
|
|
|
bool
|
|
|
|
default y if PPC32 && SMP
|
|
|
|
|
2021-10-27 19:29:31 +08:00
|
|
|
config AUDIT_ARCH
|
|
|
|
bool
|
|
|
|
default y
|
|
|
|
|
2006-12-08 19:30:41 +08:00
|
|
|
config GENERIC_BUG
|
|
|
|
bool
|
|
|
|
default y
|
|
|
|
depends on BUG
|
|
|
|
|
2020-12-01 08:52:03 +08:00
|
|
|
config GENERIC_BUG_RELATIVE_POINTERS
|
|
|
|
def_bool y
|
|
|
|
depends on GENERIC_BUG
|
|
|
|
|
2007-03-20 02:18:02 +08:00
|
|
|
config SYS_SUPPORTS_APM_EMULATION
|
2007-05-23 22:51:46 +08:00
|
|
|
default y if PMAC_APM_EMU
|
2007-03-20 02:18:02 +08:00
|
|
|
bool
|
|
|
|
|
2011-04-15 02:29:16 +08:00
|
|
|
config EPAPR_BOOT
|
|
|
|
bool
|
|
|
|
help
|
|
|
|
Used to allow a board to specify it wants an ePAPR compliant wrapper.
|
|
|
|
|
2006-01-17 00:53:22 +08:00
|
|
|
config DEFAULT_UIMAGE
|
|
|
|
bool
|
|
|
|
help
|
|
|
|
Used to allow a board to specify it wants a uImage built by default
|
|
|
|
|
2007-12-08 09:12:39 +08:00
|
|
|
config ARCH_HIBERNATION_POSSIBLE
|
|
|
|
bool
|
2007-05-03 20:31:38 +08:00
|
|
|
default y
|
|
|
|
|
2007-12-08 09:14:00 +08:00
|
|
|
config ARCH_SUSPEND_POSSIBLE
|
|
|
|
def_bool y
|
2009-09-16 05:43:57 +08:00
|
|
|
depends on ADB_PMU || PPC_EFIKA || PPC_LITE5200 || PPC_83xx || \
|
2012-07-20 20:42:36 +08:00
|
|
|
(PPC_85xx && !PPC_E500MC) || PPC_86xx || PPC_PSERIES \
|
|
|
|
|| 44x || 40x
|
2007-12-08 09:14:00 +08:00
|
|
|
|
2019-04-11 11:34:46 +08:00
|
|
|
config ARCH_SUSPEND_NONZERO_CPU
|
|
|
|
def_bool y
|
|
|
|
depends on PPC_POWERNV || PPC_PSERIES
|
|
|
|
|
2006-11-11 14:24:53 +08:00
|
|
|
config PPC_DCR_NATIVE
|
|
|
|
bool
|
|
|
|
|
|
|
|
config PPC_DCR_MMIO
|
|
|
|
bool
|
|
|
|
|
|
|
|
config PPC_DCR
|
|
|
|
bool
|
|
|
|
depends on PPC_DCR_NATIVE || PPC_DCR_MMIO
|
|
|
|
default y
|
|
|
|
|
2006-11-11 14:25:08 +08:00
|
|
|
config PPC_OF_PLATFORM_PCI
|
|
|
|
bool
|
2007-12-21 12:37:07 +08:00
|
|
|
depends on PCI
|
2006-11-11 14:25:08 +08:00
|
|
|
depends on PPC64 # not supported on 32 bits yet
|
|
|
|
|
2012-08-24 05:31:32 +08:00
|
|
|
config ARCH_SUPPORTS_UPROBES
|
|
|
|
def_bool y
|
|
|
|
|
2010-02-08 19:50:57 +08:00
|
|
|
config PPC_ADV_DEBUG_REGS
|
|
|
|
bool
|
|
|
|
depends on 40x || BOOKE
|
|
|
|
default y
|
|
|
|
|
|
|
|
config PPC_ADV_DEBUG_IACS
|
|
|
|
int
|
|
|
|
depends on PPC_ADV_DEBUG_REGS
|
|
|
|
default 4 if 44x
|
|
|
|
default 2
|
|
|
|
|
|
|
|
config PPC_ADV_DEBUG_DACS
|
|
|
|
int
|
|
|
|
depends on PPC_ADV_DEBUG_REGS
|
|
|
|
default 2
|
|
|
|
|
|
|
|
config PPC_ADV_DEBUG_DVCS
|
|
|
|
int
|
|
|
|
depends on PPC_ADV_DEBUG_REGS
|
|
|
|
default 2 if 44x
|
|
|
|
default 0
|
|
|
|
|
|
|
|
config PPC_ADV_DEBUG_DAC_RANGE
|
|
|
|
bool
|
|
|
|
depends on PPC_ADV_DEBUG_REGS && 44x
|
|
|
|
default y
|
|
|
|
|
2019-06-04 11:00:37 +08:00
|
|
|
config PPC_DAWR
|
|
|
|
bool
|
|
|
|
|
2015-04-15 06:45:57 +08:00
|
|
|
config PGTABLE_LEVELS
|
|
|
|
int
|
|
|
|
default 2 if !PPC64
|
|
|
|
default 4
|
|
|
|
|
[POWERPC] 4xx: PLB to PCI Express support
This adds to the previous 2 patches the support for the 4xx PCI Express
cells as found in the 440SPe revA, revB and 405EX.
Unfortunately, due to significant differences between these, and other
interesting "features" of those pieces of HW, the code isn't as simple
as it is for PCI and PCI-X and some of the functions differ significantly
between the 3 implementations. Thus, not only can this code support only
those 3 implementations for now, refusing to operate on any other,
but there are also added ifdefs to avoid the bloat of building a fairly large
amount of code on platforms that don't need it.
Also, this code currently only supports fully initializing root complex
nodes, not endpoint. Some more code will have to be lifted from the
arch/ppc implementation to add the endpoint support, though it's mostly
differences in memory mapping, and the question on how to represent
endpoint mode PCI in the device-tree is thus open.
Many thanks to Stefan Roese for testing & fixing up the 405EX bits !
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stefan Roese <sr@denx.de>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2007-12-21 12:39:24 +08:00
|
|
|
source "arch/powerpc/sysdev/Kconfig"
|
2007-03-16 22:32:17 +08:00
|
|
|
source "arch/powerpc/platforms/Kconfig"
|
2005-09-26 14:04:21 +08:00
|
|
|
|
|
|
|
menu "Kernel options"
|
|
|
|
|
|
|
|
config HIGHMEM
|
|
|
|
bool "High memory support"
|
|
|
|
depends on PPC32
|
2020-11-03 17:27:27 +08:00
|
|
|
select KMAP_LOCAL
|
2005-09-26 14:04:21 +08:00
|
|
|
|
2018-12-11 19:01:04 +08:00
|
|
|
source "kernel/Kconfig.hz"
|
2005-09-26 14:04:21 +08:00
|
|
|
|
|
|
|
config MATH_EMULATION
|
|
|
|
bool "Math emulation"
|
2021-06-18 11:43:41 +08:00
|
|
|
depends on 4xx || PPC_8xx || PPC_MPC832x || BOOKE || PPC_MICROWATT
|
2020-08-19 01:19:17 +08:00
|
|
|
select PPC_FPU_REGS
|
2019-07-04 00:04:13 +08:00
|
|
|
help
|
2005-09-26 14:04:21 +08:00
|
|
|
Some PowerPC chips designed for embedded applications do not have
|
|
|
|
a floating-point unit and therefore do not implement the
|
|
|
|
floating-point instructions in the PowerPC instruction set. If you
|
|
|
|
say Y here, the kernel will include code to emulate a floating-point
|
|
|
|
unit, which will allow programs that use floating-point
|
|
|
|
instructions to run.
|
|
|
|
|
2013-06-09 15:01:24 +08:00
|
|
|
This is also useful to emulate missing (optional) instructions
|
|
|
|
such as fsqrt on cores that do have an FPU but do not implement
|
|
|
|
them (such as Freescale BookE).
|
|
|
|
|
2013-07-16 19:57:15 +08:00
|
|
|
choice
|
|
|
|
prompt "Math emulation options"
|
|
|
|
default MATH_EMULATION_FULL
|
|
|
|
depends on MATH_EMULATION
|
|
|
|
|
|
|
|
config MATH_EMULATION_FULL
|
|
|
|
bool "Emulate all the floating point instructions"
|
2019-07-04 00:04:13 +08:00
|
|
|
help
|
2013-07-16 19:57:15 +08:00
|
|
|
Selecting this option enables the kernel to emulate
|
|
|
|
all the floating point instructions. If your SoC doesn't have
|
|
|
|
an FPU, you should select this.
|
|
|
|
|
|
|
|
config MATH_EMULATION_HW_UNIMPLEMENTED
|
|
|
|
bool "Just emulate the FPU unimplemented instructions"
|
2019-07-04 00:04:13 +08:00
|
|
|
help
|
2013-07-16 19:57:15 +08:00
|
|
|
Select this if you know there is a hardware FPU on your
|
|
|
|
SoC, but some floating point instructions are not implemented by it.
|
|
|
|
|
|
|
|
endchoice
|
|
|
|
|
2013-02-14 00:21:43 +08:00
|
|
|
config PPC_TRANSACTIONAL_MEM
|
2019-07-04 00:04:13 +08:00
|
|
|
bool "Transactional Memory support for POWERPC"
|
|
|
|
depends on PPC_BOOK3S_64
|
|
|
|
depends on SMP
|
|
|
|
select ALTIVEC
|
|
|
|
select VSX
|
|
|
|
help
|
|
|
|
Support user-mode Transactional Memory on POWERPC.
|
2013-02-14 00:21:43 +08:00
|
|
|
|
2019-11-25 11:06:31 +08:00
|
|
|
config PPC_UV
|
|
|
|
bool "Ultravisor support"
|
|
|
|
depends on KVM_BOOK3S_HV_POSSIBLE
|
2020-01-09 17:20:47 +08:00
|
|
|
depends on DEVICE_PRIVATE
|
2019-11-25 11:06:31 +08:00
|
|
|
default n
|
|
|
|
help
|
|
|
|
This option paravirtualizes the kernel to run on POWER platforms that
|
|
|
|
support the Protected Execution Facility (PEF). On such platforms,
|
|
|
|
the ultravisor firmware runs at a privilege level above the
|
|
|
|
hypervisor.
|
|
|
|
|
|
|
|
If unsure, say "N".
|
|
|
|
|
2017-05-29 15:39:40 +08:00
|
|
|
config LD_HEAD_STUB_CATCH
|
|
|
|
bool "Reserve 256 bytes to cope with linker stubs in HEAD text" if EXPERT
|
|
|
|
depends on PPC64
|
|
|
|
help
|
|
|
|
Very large kernels can cause linker branch stubs to be generated by
|
|
|
|
code in head_64.S, which moves the head text sections out of their
|
|
|
|
specified location. This option can work around the problem.
|
|
|
|
|
|
|
|
If unsure, say "N".
|
|
|
|
|
2016-03-03 12:27:00 +08:00
|
|
|
config MPROFILE_KERNEL
|
2020-04-22 17:26:12 +08:00
|
|
|
depends on PPC64 && CPU_LITTLE_ENDIAN && FUNCTION_TRACER
|
2018-05-30 20:19:22 +08:00
|
|
|
def_bool $(success,$(srctree)/arch/powerpc/tools/gcc-check-mprofile-kernel.sh $(CC) -I$(srctree)/include -D__KERNEL__)
|
2016-03-03 12:27:00 +08:00
|
|
|
|
2005-09-26 14:04:21 +08:00
|
|
|
config HOTPLUG_CPU
|
|
|
|
bool "Support for enabling/disabling CPUs"
|
2013-05-21 11:49:35 +08:00
|
|
|
depends on SMP && (PPC_PSERIES || \
|
2020-01-29 10:22:25 +08:00
|
|
|
PPC_PMAC || PPC_POWERNV || FSL_SOC_BOOKE)
|
2019-07-04 00:04:13 +08:00
|
|
|
help
|
2005-09-26 14:04:21 +08:00
|
|
|
Say Y here to be able to disable and re-enable individual
|
|
|
|
CPUs at runtime on SMP machines.
|
|
|
|
|
|
|
|
Say N if you are unsure.
|
|
|
|
|
powerpc/64s: Implement queued spinlocks and rwlocks
These have shown significantly improved performance and fairness when
spinlock contention is moderate to high on very large systems.
With this series including subsequent patches, on a 16 socket 1536
thread POWER9, a stress test such as same-file open/close from all
CPUs gets big speedups, 11620op/s aggregate with simple spinlocks vs
384158op/s (33x faster), where the difference in throughput between
the fastest and slowest thread goes from 7x to 1.4x.
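As a quick check of the figures quoted above, the arithmetic can be reproduced directly. The variable names are illustrative; the numbers are the ones from this message:

```python
simple_ops = 11620    # op/s aggregate with simple spinlocks
queued_ops = 384158   # op/s aggregate with queued spinlocks

# Aggregate throughput gain of queued over simple spinlocks.
speedup = queued_ops / simple_ops

# Fastest/slowest thread spread went from 7x to 1.4x.
spread_improvement = 7 / 1.4

if __name__ == "__main__":
    print(f"throughput speedup: {speedup:.1f}x")
    print(f"fairness spread reduced by a factor of {spread_improvement:.1f}")
```

The throughput ratio rounds to the 33x stated in the text.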
Thanks to the fast path being identical in terms of atomics and
barriers (after a subsequent optimisation patch), single threaded
performance is not changed (no measurable difference).
On smaller systems, performance and fairness seem to be generally
improved. Using dbench on tmpfs as a test (that starts to run into
kernel spinlock contention), a 2-socket OpenPOWER POWER9 system was
tested with bare metal and KVM guest configurations. Results can be
found here:
https://github.com/linuxppc/issues/issues/305#issuecomment-663487453
Observations are:
- Queued spinlocks are equal when contention is insignificant, as
expected and as measured with microbenchmarks.
- When there is contention, on bare metal queued spinlocks have better
throughput and max latency at all points.
- When virtualised, queued spinlocks are slightly worse approaching
peak throughput, but significantly better throughput and max latency
at all points beyond peak, until queued spinlock maximum latency
rises when clients are 2x vCPUs.
The regressions haven't been analysed very well yet; there are a lot
of things that can be tuned, particularly the paravirtualised locking,
but the numbers already look like a good net win even on relatively
small systems.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200724131423.1362108-4-npiggin@gmail.com
2020-07-24 21:14:20 +08:00
|
|
|
config PPC_QUEUED_SPINLOCKS
|
2021-01-18 20:34:51 +08:00
|
|
|
bool "Queued spinlocks" if EXPERT
|
2020-07-24 21:14:20 +08:00
|
|
|
depends on SMP
|
2021-01-18 20:34:51 +08:00
|
|
|
default PPC_BOOK3S_64
|
2020-07-24 21:14:20 +08:00
|
|
|
help
|
|
|
|
Say Y here to use queued spinlocks which give better scalability and
|
|
|
|
fairness on large SMP and NUMA systems without harming single threaded
|
|
|
|
performance.
|
|
|
|
|
2009-11-26 01:23:25 +08:00
|
|
|
config ARCH_CPU_PROBE_RELEASE
|
|
|
|
def_bool y
|
|
|
|
depends on HOTPLUG_CPU
|
|
|
|
|
2013-11-15 12:20:50 +08:00
|
|
|
config PPC64_SUPPORTS_MEMORY_FAILURE
|
|
|
|
bool "Add support for memory hwpoison"
|
|
|
|
depends on PPC_BOOK3S_64
|
|
|
|
default "y" if PPC_POWERNV
|
|
|
|
select ARCH_SUPPORTS_MEMORY_FAILURE
|
|
|
|
|
2005-09-26 14:04:21 +08:00
|
|
|
config KEXEC
|
2013-01-17 10:53:25 +08:00
|
|
|
bool "kexec system call"
|
2015-10-07 11:48:22 +08:00
|
|
|
depends on (PPC_BOOK3S || FSL_BOOKE || (44x && !SMP)) || PPC_BOOK3E
|
2015-09-10 06:38:55 +08:00
|
|
|
select KEXEC_CORE
|
2005-09-26 14:04:21 +08:00
|
|
|
help
|
|
|
|
kexec is a system call that implements the ability to shutdown your
|
|
|
|
current kernel, and to start another kernel. It is like a reboot
|
2006-06-29 13:32:47 +08:00
|
|
|
but it is independent of the system firmware. And like a reboot
|
2005-09-26 14:04:21 +08:00
|
|
|
you can start any kernel with it, not just Linux.
|
|
|
|
|
2006-06-29 13:32:47 +08:00
|
|
|
The name comes from the similarity to the exec system call.
|
2005-09-26 14:04:21 +08:00
|
|
|
|
|
|
|
It is an ongoing process to be certain the hardware in a machine
|
|
|
|
is properly shutdown, so do not be surprised if this code does not
|
2013-08-21 03:38:03 +08:00
|
|
|
initially work for you. As of this writing the exact hardware
|
|
|
|
interface is strongly in flux, so no good recommendation can be
|
|
|
|
made.
|
2005-09-26 14:04:21 +08:00
|
|
|
|
2016-11-29 20:45:53 +08:00
|
|
|
config KEXEC_FILE
|
|
|
|
bool "kexec file based system call"
|
|
|
|
select KEXEC_CORE
|
2021-02-22 01:49:26 +08:00
|
|
|
select HAVE_IMA_KEXEC if IMA
|
2016-11-29 20:45:53 +08:00
|
|
|
select BUILD_BIN2C
|
2019-08-24 03:49:13 +08:00
|
|
|
select KEXEC_ELF
|
2016-11-29 20:45:53 +08:00
|
|
|
depends on PPC64
|
|
|
|
depends on CRYPTO=y
|
|
|
|
depends on CRYPTO_SHA256=y
|
|
|
|
help
|
|
|
|
This is a new version of the kexec system call. This call is
|
|
|
|
file based and takes in file descriptors as system call arguments
|
|
|
|
for kernel and initramfs as opposed to a list of segments as is the
|
|
|
|
case for the older kexec call.
|
|
|
|
|
kexec_file: make use of purgatory optional
Patch series "kexec_file, x86, powerpc: refactoring for other
architectures", v2.
This is a preparatory patchset for adding kexec_file support on arm64.
It was originally included in an arm64 patch set[1], but Philipp is also
working on their kexec_file support on s390[2] and some changes are now
conflicting.
So these common parts were extracted and put into a separate patch set
for better integration. What's more, my original patch#4 was split into
a few small chunks for easier review after Dave's comment.
As such, the resulting code is basically identical with my original, and
the only *visible* differences are:
- renaming of _kexec_kernel_image_probe() and _kimage_file_post_load_cleanup()
- change one of types of arguments at prepare_elf64_headers()
Those, unfortunately, require a couple of trivial changes on the rest
(#1, #6 to #13) of my arm64 kexec_file patch set[1].
Patch #1 makes the use of purgatory optional, which is particularly useful
for arm64.
Patch #2 commonalizes arch_kexec_kernel_{image_probe, image_load,
verify_sig}() and arch_kimage_file_post_load_cleanup() across
architectures.
Patches #3-#7 generalize parse_elf64_headers(), along with
exclude_mem_range(), so that they can be reused as widely as possible.
[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2018-February/561182.html
[2] http://lkml.iu.edu//hypermail/linux/kernel/1802.1/02596.html
This patch (of 7):
On arm64, the crash dump kernel's usable memory is protected by *unmapping*
it from kernel virtual space, unlike other architectures where the region
is just made read-only. It is therefore highly unlikely that the region is
accidentally corrupted, which justifies dropping the digest check code
from purgatory as well. The resulting code is simple enough that it does
not require the somewhat ugly re-linking/relocation step, i.e.
arch_kexec_apply_relocations_add().
Please see:
http://lists.infradead.org/pipermail/linux-arm-kernel/2017-December/545428.html
All that the purgatory does is shuffle arguments and jump into the new
kernel, yet we would still need to reserve space for a hash value
(purgatory_sha256_digest) that is never checked.
As such, it doesn't make sense to have trampoline code between the old
kernel and the new kernel on arm64.
This patch introduces a new configuration, ARCH_HAS_KEXEC_PURGATORY, and
allows related code to be compiled in only if necessary.
[takahiro.akashi@linaro.org: fix trivial screwup]
Link: http://lkml.kernel.org/r/20180309093346.GF25863@linaro.org
Link: http://lkml.kernel.org/r/20180306102303.9063-2-takahiro.akashi@linaro.org
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Acked-by: Dave Young <dyoung@redhat.com>
Tested-by: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-14 06:35:45 +08:00
|
|
|
config ARCH_HAS_KEXEC_PURGATORY
|
|
|
|
def_bool KEXEC_FILE
|
|
|
|
|
2016-07-13 09:14:39 +08:00
|
|
|
config RELOCATABLE
|
|
|
|
bool "Build a relocatable kernel"
|
2016-10-19 11:16:00 +08:00
|
|
|
depends on PPC64 || (FLATMEM && (44x || FSL_BOOKE))
|
2016-07-13 09:14:39 +08:00
|
|
|
select NONSTATIC_KERNEL
|
modversions: treat symbol CRCs as 32 bit quantities
The modversion symbol CRCs are emitted as ELF symbols, which allows us
to easily populate the kcrctab sections by relying on the linker to
associate each kcrctab slot with the correct value.
This has a couple of downsides:
- Given that the CRCs are treated as memory addresses, we waste 4 bytes
for each CRC on 64 bit architectures,
- On architectures that support runtime relocation, a R_<arch>_RELATIVE
relocation entry is emitted for each CRC value, which identifies it
as a quantity that requires fixing up based on the actual runtime
load offset of the kernel. This results in corrupted CRCs unless we
explicitly undo the fixup (and this is currently being handled in the
core module code)
- Such runtime relocation entries take up 24 bytes of __init space
each, resulting in a x8 overhead in [uncompressed] kernel size for
CRCs.
Switching to explicit 32 bit values on 64 bit architectures fixes most
of these issues, given that 32 bit values are not treated as quantities
that require fixing up based on the actual runtime load offset. Note
that on some ELF64 architectures [such as PPC64], these 32-bit values
are still emitted as [absolute] runtime relocatable quantities, even if
the value resolves to a build time constant. Since relative relocations
are always resolved at build time, this patch enables MODULE_REL_CRCS on
powerpc when CONFIG_RELOCATABLE=y, which turns the absolute CRC
references into relative references into .rodata where the actual CRC
value is stored.
So redefine all CRC fields and variables as u32, and redefine the
__CRC_SYMBOL() macro for 64 bit builds to emit the CRC reference using
inline assembler (which is necessary since 64-bit C code cannot use
32-bit types to hold memory addresses, even if they are ultimately
resolved using values that do not exceed 0xffffffff). To avoid
potential problems with legacy 32-bit architectures using legacy
toolchains, the equivalent C definition of the kcrctab entry is retained
for 32-bit architectures.
Note that this mostly reverts commit d4703aefdbc8 ("module: handle ppc64
relocating kcrctabs when CONFIG_RELOCATABLE=y")
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-03 17:54:06 +08:00
|
|
|
select MODULE_REL_CRCS if MODVERSIONS
|
2016-07-13 09:14:39 +08:00
|
|
|
help
|
|
|
|
This builds a kernel image that is capable of running at the
|
|
|
|
location the kernel is loaded at. For ppc32, there are no
|
|
|
|
alignment restrictions, and this feature is a superset of
|
|
|
|
DYNAMIC_MEMSTART and hence overrides it. For ppc64, we should use
|
|
|
|
16k-aligned base address. The kernel is linked as a
|
|
|
|
position-independent executable (PIE) and contains dynamic relocations
|
|
|
|
which are processed early in the bootup process.
|
|
|
|
|
|
|
|
One use is for the kexec on panic case where the recovery kernel
|
|
|
|
must live at a different physical address than the primary
|
|
|
|
kernel.
|
|
|
|
|
|
|
|
Note: If CONFIG_RELOCATABLE=y, then the kernel runs from the address
|
|
|
|
it has been loaded at and the compile time physical address
|
|
|
|
CONFIG_PHYSICAL_START is ignored. However CONFIG_PHYSICAL_START
|
|
|
|
setting can still be useful to bootwrappers that need to know the
|
|
|
|
load address of the kernel (eg. u-boot/mkimage).
|
|
|
|
|
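The modversions commit message quoted in the RELOCATABLE entry above gives 24 bytes per relocation entry and an x8 size overhead for CRCs. The sketch below is one reading of that arithmetic, assuming the standard Elf64_Rela layout of three 64-bit fields and a pointer-sized slot per relocated CRC. The helper names are illustrative and do not come from the kernel.

```python
import struct

# Elf64_Rela has three 64-bit fields: r_offset, r_info, r_addend.
ELF64_RELA_FORMAT = "<QQq"
RELA_SIZE = struct.calcsize(ELF64_RELA_FORMAT)  # bytes per relocation entry
POINTER_SLOT = 8  # CRC emitted as a 64-bit address-like quantity
U32_SLOT = 4      # CRC emitted as a plain 32-bit value


def crc_table_bytes(num_symbols: int, relocated: bool) -> int:
    """Bytes spent on CRCs for num_symbols exported symbols."""
    per_symbol = POINTER_SLOT + RELA_SIZE if relocated else U32_SLOT
    return num_symbols * per_symbol


def overhead_factor() -> float:
    """Cost of a relocated 64-bit CRC relative to a plain 32-bit CRC."""
    return (POINTER_SLOT + RELA_SIZE) / U32_SLOT
```

Under these assumptions each relocated CRC costs 8 + 24 = 32 bytes against 4 bytes for a u32, which matches the x8 figure in the message.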
2019-09-20 17:45:40 +08:00
|
|
|
config RANDOMIZE_BASE
|
|
|
|
bool "Randomize the address of the kernel image"
|
|
|
|
depends on (FSL_BOOKE && FLATMEM && PPC32)
|
|
|
|
depends on RELOCATABLE
|
|
|
|
help
|
|
|
|
Randomizes the virtual address at which the kernel image is
|
|
|
|
loaded, as a security feature that deters exploit attempts
|
|
|
|
relying on knowledge of the location of kernel internals.
|
|
|
|
|
|
|
|
If unsure, say Y.
|
|
|
|
|
2016-10-14 15:31:33 +08:00
|
|
|
config RELOCATABLE_TEST
|
|
|
|
bool "Test relocatable kernel"
|
|
|
|
depends on (PPC64 && RELOCATABLE)
|
|
|
|
help
|
|
|
|
This runs the relocatable kernel at the address it was initially
|
|
|
|
loaded at, which tends to be non-zero and therefore tests the
|
|
|
|
relocation code.
|
|
|
|
|
2006-01-15 05:48:25 +08:00
|
|
|
config CRASH_DUMP
|
2017-05-09 06:56:24 +08:00
|
|
|
bool "Build a dump capture kernel"
|
2018-11-17 18:24:58 +08:00
|
|
|
depends on PPC64 || PPC_BOOK3S_32 || FSL_BOOKE || (44x && !SMP)
|
2016-10-19 11:16:00 +08:00
|
|
|
select RELOCATABLE if PPC64 || 44x || FSL_BOOKE
|
2006-01-15 05:48:25 +08:00
|
|
|
help
|
2017-05-09 06:56:24 +08:00
|
|
|
Build a kernel suitable for use as a dump capture kernel.
|
2008-10-22 01:38:10 +08:00
|
|
|
The same kernel binary can be used as production kernel and dump
|
|
|
|
capture kernel.
|
2006-01-15 05:48:25 +08:00
|
|
|
|
2012-02-16 09:14:22 +08:00
|
|
|
config FA_DUMP
|
|
|
|
bool "Firmware-assisted dump"
|
2019-09-11 22:50:26 +08:00
|
|
|
depends on PPC64 && (PPC_RTAS || PPC_POWERNV)
|
2017-05-09 06:56:24 +08:00
|
|
|
select CRASH_CORE
|
|
|
|
select CRASH_DUMP
|
2008-03-22 07:50:50 +08:00
|
|
|
help
|
2012-02-16 09:14:22 +08:00
|
|
|
A robust mechanism to get reliable kernel crash dump with
|
|
|
|
assistance from firmware. This approach does not use kexec,
|
2017-05-09 06:56:24 +08:00
|
|
|
instead firmware assists in booting the capture kernel
|
2012-02-16 09:14:22 +08:00
|
|
|
while preserving memory contents. Firmware-assisted dump
|
|
|
|
is meant to be a kdump replacement offering robustness and
|
|
|
|
speed not possible without system firmware assistance.
|
2008-03-22 07:50:50 +08:00
|
|
|
|
2019-09-11 22:50:26 +08:00
|
|
|
If unsure, say "y". Only special kernels like petitboot may
|
|
|
|
need to say "N" here.
|
2008-03-22 07:50:50 +08:00
|
|
|
|
2019-09-11 22:56:03 +08:00
|
|
|
config PRESERVE_FA_DUMP
|
|
|
|
bool "Preserve Firmware-assisted dump"
|
|
|
|
depends on PPC64 && PPC_POWERNV && !FA_DUMP
|
|
|
|
help
|
|
|
|
On a kernel with FA_DUMP disabled, this option helps to preserve
|
|
|
|
crash data from a previously crashed kernel. Useful when the next
|
|
|
|
memory preserving kernel boot would process this crash data.
|
|
|
|
Petitboot kernel is the typical usecase for this option.
|
|
|
|
|
2019-09-11 22:56:33 +08:00
|
|
|
config OPAL_CORE
|
|
|
|
bool "Export OPAL memory as /sys/firmware/opal/core"
|
|
|
|
depends on PPC64 && PPC_POWERNV
|
|
|
|
help
|
|
|
|
This option uses the MPIPL support in firmware to provide an
|
|
|
|
ELF core of OPAL memory after a crash. The ELF core is exported
|
|
|
|
as /sys/firmware/opal/core file which is helpful in debugging
|
|
|
|
OPAL crashes using GDB.
|
2008-03-22 07:50:50 +08:00
|
|
|
|
2005-09-26 14:04:21 +08:00
|
|
|
config IRQ_ALL_CPUS
|
|
|
|
bool "Distribute interrupts on all CPUs by default"
|
2013-05-15 17:21:01 +08:00
|
|
|
depends on SMP
|
2005-09-26 14:04:21 +08:00
|
|
|
help
|
|
|
|
This option gives the kernel permission to distribute IRQs across
|
|
|
|
multiple CPUs. Saying N here will route all IRQs to the first
|
|
|
|
CPU. Generally saying Y is safe, although some problems have been
|
|
|
|
reported with SMP Power Macintoshes with this option enabled.
|
|
|
|
|
2005-10-29 08:46:58 +08:00
|
|
|
config NUMA
|
2020-11-24 20:05:47 +08:00
|
|
|
bool "NUMA Memory Allocation and Scheduler Support"
|
2020-11-24 20:05:45 +08:00
|
|
|
depends on PPC64 && SMP
|
2020-11-24 20:05:46 +08:00
|
|
|
default y if PPC_PSERIES || PPC_POWERNV
|
mm: percpu: generalize percpu related config
Patch series "mm: percpu: Cleanup percpu first chunk function".
While adding support for the page-mapping percpu first chunk allocator on
arm64, we found a lot of duplicated code in the percpu embed/page first
chunk allocators. This patchset aims to clean it up and should cause no
functional change.
The current support status of 'embed' and 'page' across architectures is
shown below:
  embed: NEED_PER_CPU_EMBED_FIRST_CHUNK
  page:  NEED_PER_CPU_PAGE_FIRST_CHUNK
             embed   page
  ------------------------
  arm64        Y      Y
  mips         Y      N
  powerpc      Y      Y
  riscv        Y      N
  sparc        Y      Y
  x86          Y      Y
  ------------------------
There are two interfaces about percpu first chunk allocator,
extern int __init pcpu_embed_first_chunk(size_t reserved_size, size_t dyn_size,
size_t atom_size,
pcpu_fc_cpu_distance_fn_t cpu_distance_fn,
- pcpu_fc_alloc_fn_t alloc_fn,
- pcpu_fc_free_fn_t free_fn);
+ pcpu_fc_cpu_to_node_fn_t cpu_to_nd_fn);
extern int __init pcpu_page_first_chunk(size_t reserved_size,
- pcpu_fc_alloc_fn_t alloc_fn,
- pcpu_fc_free_fn_t free_fn,
- pcpu_fc_populate_pte_fn_t populate_pte_fn);
+ pcpu_fc_cpu_to_node_fn_t cpu_to_nd_fn);
The pcpu_fc_alloc_fn_t and pcpu_fc_free_fn_t callbacks are removed. Generic
pcpu_fc_alloc() and pcpu_fc_free() functions are provided instead and are
called from pcpu_embed_first_chunk() and pcpu_page_first_chunk().
1) For pcpu_embed_first_chunk(), a pcpu_fc_cpu_to_node_fn_t must be
provided when the arch supports NUMA.
2) For pcpu_page_first_chunk(), pcpu_fc_populate_pte_fn_t is removed too.
A generic pcpu_populate_pte() marked '__weak' is provided; if an arch
(like x86) needs a different function to populate PTEs, it should provide
its own implementation.
[1] https://github.com/kevin78/linux.git percpu-cleanup
This patch (of 4):
The HAVE_SETUP_PER_CPU_AREA/NEED_PER_CPU_EMBED_FIRST_CHUNK/
NEED_PER_CPU_PAGE_FIRST_CHUNK/USE_PERCPU_NUMA_NODE_ID configs have
duplicate definitions on the platforms that use them.
Move them into mm, drop the redundant definitions, and instead just
select them on the applicable platforms.
Link: https://lkml.kernel.org/r/20211216112359.103822-1-wangkefeng.wang@huawei.com
Link: https://lkml.kernel.org/r/20211216112359.103822-2-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64]
Cc: Will Deacon <will@kernel.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-20 10:07:41 +08:00
|
|
|
select USE_PERCPU_NUMA_NODE_ID
|
2020-11-24 20:05:47 +08:00
|
|
|
help
|
|
|
|
Enable NUMA (Non-Uniform Memory Access) support.
|
|
|
|
|
|
|
|
The kernel will try to allocate memory used by a CPU on the
|
|
|
|
local memory controller of the CPU and add some more
|
|
|
|
NUMA awareness to the kernel.
|
2005-10-29 08:46:58 +08:00
|
|
|
|
2006-04-11 13:53:53 +08:00
|
|
|
config NODES_SHIFT
|
|
|
|
int
|
2009-09-22 03:56:43 +08:00
|
|
|
default "8" if PPC64
|
2006-04-11 13:53:53 +08:00
|
|
|
default "4"
|
2021-06-29 10:43:01 +08:00
|
|
|
depends on NUMA
|
2006-04-11 13:53:53 +08:00
|
|
|
|
2014-05-17 07:41:20 +08:00
|
|
|
config HAVE_MEMORYLESS_NODES
|
|
|
|
def_bool y
|
|
|
|
depends on NUMA
|
|
|
|
|
2005-09-26 14:04:21 +08:00
|
|
|
config ARCH_SELECT_MEMORY_MODEL
|
|
|
|
def_bool y
|
|
|
|
depends on PPC64
|
|
|
|
|
|
|
|
config ARCH_FLATMEM_ENABLE
|
2005-11-30 03:20:55 +08:00
|
|
|
def_bool y
|
|
|
|
depends on (PPC64 && !NUMA) || PPC32
|
2005-09-26 14:04:21 +08:00
|
|
|
|
2005-11-11 11:22:35 +08:00
|
|
|
config ARCH_SPARSEMEM_ENABLE
|
2005-09-26 14:04:21 +08:00
|
|
|
def_bool y
|
2005-11-30 03:20:55 +08:00
|
|
|
depends on PPC64
|
2007-10-16 16:24:17 +08:00
|
|
|
select SPARSEMEM_VMEMMAP_ENABLE
|
2005-09-26 14:04:21 +08:00
|
|
|
|
2005-11-11 11:22:35 +08:00
|
|
|
config ARCH_SPARSEMEM_DEFAULT
|
2005-09-26 14:04:21 +08:00
|
|
|
def_bool y
|
2017-04-05 14:10:48 +08:00
|
|
|
depends on PPC_BOOK3S_64
|
2005-09-26 14:04:21 +08:00
|
|
|
|
2016-11-15 18:59:38 +08:00
|
|
|
config ILLEGAL_POINTER_VALUE
|
|
|
|
hex
|
|
|
|
# This is roughly half way between the top of user space and the bottom
|
|
|
|
# of kernel space, which seems about as good as we can get.
|
|
|
|
default 0x5deadbeef0000000 if PPC64
|
|
|
|
default 0
|
|
|
|
|
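The comment in the ILLEGAL_POINTER_VALUE entry above says the PPC64 default sits roughly half way between the top of user space and the bottom of kernel space. The sketch below checks that claim numerically. Both boundary addresses are assumptions made for illustration (kernel space starting at 0xc000000000000000 and a 4 PB user address space); the real limits depend on the kernel configuration.

```python
ILLEGAL_POINTER_VALUE = 0x5DEADBEEF0000000  # PPC64 default from the entry above
KERNEL_BASE = 0xC000000000000000            # assumed bottom of kernel space
USER_TOP = 1 << 52                          # assumed top of user space (4 PB)


def position_in_gap(value: int, user_top: int = USER_TOP,
                    kernel_base: int = KERNEL_BASE) -> float:
    """Return where value sits in the gap: 0.0 at user_top, 1.0 at kernel_base."""
    return (value - user_top) / (kernel_base - user_top)
```

With these assumed boundaries the default lands a little below the midpoint of the gap, which is consistent with the "roughly half way" wording.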
2005-11-08 01:39:48 +08:00
|
|
|
config ARCH_MEMORY_PROBE
|
|
|
|
def_bool y
|
|
|
|
depends on MEMORY_HOTPLUG
|
|
|
|
|
2008-12-11 09:55:41 +08:00
|
|
|
choice
|
|
|
|
prompt "Page size"
|
2021-10-15 08:16:49 +08:00
|
|
|
default PPC_64K_PAGES if PPC_BOOK3S_64
|
2008-12-11 09:55:41 +08:00
|
|
|
default PPC_4K_PAGES
|
2005-11-07 08:06:55 +08:00
|
|
|
help
|
2008-12-11 09:55:41 +08:00
|
|
|
Select the kernel logical page size. Increasing the page size
|
|
|
|
will reduce software overhead at each page boundary, allow
|
|
|
|
hardware prefetch mechanisms to be more effective, and allow
|
|
|
|
larger dma transfers increasing IO efficiency and reducing
|
|
|
|
overhead. However the utilization of memory will increase.
|
|
|
|
For example, each cached file will use a multiple of the
|
|
|
|
page size to hold its contents and the difference between the
|
|
|
|
end of file and the end of page is wasted.
|
|
|
|
|
|
|
|
Some dedicated systems, such as software raid serving with
|
|
|
|
accelerated calculations, have shown significant increases.
|
|
|
|
|
|
|
|
If you configure a 64 bit kernel for 64k pages but the
|
|
|
|
processor does not support them, then the kernel will simulate
|
|
|
|
them with 4k pages, loading them on demand, but with the
|
|
|
|
reduced software overhead and larger internal fragmentation.
|
|
|
|
For the 32 bit kernel, a large page option will not be offered
|
|
|
|
unless it is supported by the configured processor.
|
|
|
|
|
|
|
|
If unsure, choose 4K_PAGES.
|
|
|
|
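The help text above notes that a cached file occupies a whole number of pages, so the space between the end of the file and the end of its last page is wasted. A minimal sketch of that calculation follows; the function names are illustrative and the file size used in the note is an arbitrary example.

```python
def pages_needed(file_size: int, page_size: int) -> int:
    """Number of pages required to hold file_size bytes (ceiling division)."""
    return -(-file_size // page_size)


def wasted_bytes(file_size: int, page_size: int) -> int:
    """Bytes between the end of the file and the end of its last page."""
    return pages_needed(file_size, page_size) * page_size - file_size
```

For a 10000-byte file this gives 2288 wasted bytes with 4k pages and 55536 wasted bytes with 64k pages, which is the memory-utilization cost the help text describes.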
|
|
|
|
config PPC_4K_PAGES
|
|
|
|
bool "4k page size"
|
2016-01-30 01:02:49 +08:00
|
|
|
select HAVE_ARCH_SOFT_DIRTY if PPC_BOOK3S_64
|
2008-12-11 09:55:41 +08:00
|
|
|
|
|
|
|
config PPC_16K_PAGES
|
2015-08-07 14:19:46 +08:00
|
|
|
bool "16k page size"
|
2018-11-29 22:07:21 +08:00
|
|
|
depends on 44x || PPC_8xx
|
2008-12-11 09:55:41 +08:00
|
|
|
|
|
|
|
config PPC_64K_PAGES
|
2015-08-07 14:19:46 +08:00
|
|
|
bool "64k page size"
|
2019-02-08 20:34:16 +08:00
|
|
|
depends on 44x || PPC_BOOK3S_64
|
2016-01-30 01:02:49 +08:00
|
|
|
select HAVE_ARCH_SOFT_DIRTY if PPC_BOOK3S_64
|
2008-12-11 09:55:41 +08:00
|
|
|
|
powerpc/44x: Support for 256KB PAGE_SIZE
This patch adds support for 256KB pages on ppc44x-based boards.
For simplification of implementation with 256KB pages we still assume
2-level paging. As a side effect this leads to wasting extra memory space
reserved for PTE tables: only 1/4 of pages allocated for PTEs are
actually used. But this may be an acceptable trade-off to achieve the
high performance we have with big PAGE_SIZEs in some applications (e.g.
RAID).
Also with 256KB PAGE_SIZE we increase THREAD_SIZE up to 32KB to minimize
the risk of stack overflows in the cases of on-stack arrays whose size
depends on the page size (e.g. multipage BIOs, NTFS, etc.).
With 256KB PAGE_SIZE we need to decrease the PKMAP_ORDER at least down
to 9, otherwise all high memory (2 ^ 10 * PAGE_SIZE == 256MB) will be
occupied by PKMAP addresses, leaving no place for vmalloc. We do not
separate PKMAP_ORDER for 256K from 16K/64K PAGE_SIZE here; actually that
value of 10 in support for 16K/64K had been selected rather intuitively.
Thus now for all cases of PAGE_SIZE on ppc44x (including the default, 4KB,
one) we have 512 pages for PKMAP.
Because the ELF standard supports only page sizes up to 64K, you should
use binutils later than 2.17.50.0.3 with '-zmax-page-size' set to 256K
for building applications that are to be run with the 256KB-page sized
kernel. If using older binutils, you should patch them as follows:
--- binutils/bfd/elf32-ppc.c.orig
+++ binutils/bfd/elf32-ppc.c
-#define ELF_MAXPAGESIZE 0x10000
+#define ELF_MAXPAGESIZE 0x40000
One more restriction we currently have with 256KB page sizes is the
inability to use shmem safely, so, for now, 256KB pages are available only
if you turn the CONFIG_SHMEM option off (another variant is to use BROKEN).
Though, if you need shmem with 256KB pages, you can always remove the !SHMEM
dependency in 'config PPC_256K_PAGES', and use the workaround available here:
http://lkml.org/lkml/2008/12/19/20
Signed-off-by: Yuri Tikhonov <yur@emcraft.com>
Signed-off-by: Ilya Yanok <yanok@emcraft.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2009-01-29 09:40:44 +08:00
|
|
|
config PPC_256K_PAGES
|
2021-01-20 15:49:14 +08:00
|
|
|
bool "256k page size (Requires non-standard binutils settings)"
|
|
|
|
depends on 44x && !PPC_47x
|
2009-01-29 09:40:44 +08:00
|
|
|
help
|
|
|
|
Make the page size 256k.
|
|
|
|
|
2021-01-20 15:49:14 +08:00
|
|
|
The kernel will only be able to run applications that have been
|
|
|
|
compiled with '-zmax-page-size' set to 256K (the default is 64K) using
|
|
|
|
binutils later than 2.17.50.0.3, or by patching the ELF_MAXPAGESIZE
|
|
|
|
definition from 0x10000 to 0x40000 in older versions.
|
2009-01-29 09:40:44 +08:00
|
|
|
|
2008-12-11 09:55:41 +08:00
|
|
|
endchoice
|
2005-11-07 08:06:55 +08:00
|
|
|
|
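The 256KB PAGE_SIZE commit message quoted in the choice block above argues that PKMAP_ORDER must drop from 10 to 9, because 2^10 pages of 256KB would cover 256MB of address space. The sketch below reproduces that arithmetic. The function name is illustrative; only the order and page-size values come from the message.

```python
MIB = 1 << 20


def pkmap_span_bytes(pkmap_order: int, page_shift: int) -> int:
    """Address space covered by 2**pkmap_order PKMAP pages of 2**page_shift bytes."""
    return (1 << pkmap_order) << page_shift
```

With an order of 9 there are 512 PKMAP pages, which span 128MB at a 256KB page size and only 2MB at the default 4KB page size.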
2019-02-22 03:08:46 +08:00
|
|
|
config PPC_PAGE_SHIFT
|
|
|
|
int
|
|
|
|
default 18 if PPC_256K_PAGES
|
|
|
|
default 16 if PPC_64K_PAGES
|
|
|
|
default 14 if PPC_16K_PAGES
|
|
|
|
default 12
|
|
|
|
|
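PPC_PAGE_SHIFT above is the base-2 logarithm of the page size selected in the "Page size" choice. The sketch below shows how the shift yields the page size, and how an address splits into a page number and an offset. The dictionary and function names are illustrative; the shift values are the defaults listed in the entry, with 12 being the unconditional fallback used for 4k pages.

```python
PAGE_SHIFT_BY_OPTION = {
    "PPC_256K_PAGES": 18,
    "PPC_64K_PAGES": 16,
    "PPC_16K_PAGES": 14,
    "PPC_4K_PAGES": 12,  # reached through the unconditional "default 12"
}


def page_size_bytes(option: str) -> int:
    """Page size in bytes for the given page-size option."""
    return 1 << PAGE_SHIFT_BY_OPTION[option]


def page_number(addr: int, option: str) -> int:
    """Index of the page containing addr."""
    return addr >> PAGE_SHIFT_BY_OPTION[option]


def page_offset(addr: int, option: str) -> int:
    """Offset of addr within its page."""
    return addr & (page_size_bytes(option) - 1)
```

For example, address 0x12345 falls in page 0x12 at offset 0x345 with 4k pages, and in page 1 at offset 0x2345 with 64k pages.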
2017-02-24 08:52:09 +08:00
|
|
|
config THREAD_SHIFT
|
|
|
|
int "Thread shift" if EXPERT
|
|
|
|
range 13 15
|
|
|
|
default "15" if PPC_256K_PAGES
|
|
|
|
default "14" if PPC64
|
2020-04-08 23:58:49 +08:00
|
|
|
default "14" if KASAN
|
2017-02-24 08:52:09 +08:00
|
|
|
default "13"
|
|
|
|
help
|
|
|
|
Used to define the stack size. The default is almost always what you
|
|
|
|
want. Only change this if you know what you are doing.
|
|
|
|
|
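The THREAD_SHIFT entry above lists several conditional defaults. In Kconfig, when more than one default is visible, the first one defined is the one that takes effect. The sketch below models that rule for this entry and converts the result into a stack size. The resolver is a simplified illustration rather than a Kconfig implementation: conditions are treated as single symbol names, which is all this entry needs.

```python
from typing import Optional, Sequence, Set, Tuple

# (value, condition) pairs in the order they appear in the THREAD_SHIFT entry.
# A condition of None marks the unconditional default.
THREAD_SHIFT_DEFAULTS: Sequence[Tuple[int, Optional[str]]] = (
    (15, "PPC_256K_PAGES"),
    (14, "PPC64"),
    (14, "KASAN"),
    (13, None),
)


def resolve_default(defaults: Sequence[Tuple[int, Optional[str]]],
                    enabled: Set[str]) -> int:
    """Return the first default whose condition holds for the enabled symbols."""
    for value, condition in defaults:
        if condition is None or condition in enabled:
            return value
    raise ValueError("no default applies")


def thread_size(enabled: Set[str]) -> int:
    """Stack size in bytes implied by the default THREAD_SHIFT."""
    return 1 << resolve_default(THREAD_SHIFT_DEFAULTS, enabled)
```

With no symbols enabled the stack is 8KB; PPC64 or KASAN raises it to 16KB, and PPC_256K_PAGES takes precedence over both and gives 32KB.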
2019-02-22 03:08:50 +08:00
|
|
|
config DATA_SHIFT_BOOL
|
2020-05-19 13:49:25 +08:00
|
|
|
bool "Set custom data alignment"
|
2019-02-22 03:08:50 +08:00
|
|
|
depends on ADVANCED_OPTIONS
|
2021-03-04 22:35:09 +08:00
|
|
|
depends on STRICT_KERNEL_RWX || DEBUG_PAGEALLOC || KFENCE
|
2021-10-15 18:02:49 +08:00
|
|
|
depends on PPC_BOOK3S_32 || (PPC_8xx && !PIN_TLB_DATA && !STRICT_KERNEL_RWX) || \
|
|
|
|
FSL_BOOKE
|
2019-02-22 03:08:50 +08:00
|
|
|
help
|
|
|
|
This option allows you to set the kernel data alignment. When
|
|
|
|
RAM is mapped by blocks, the alignment needs to fit the size and
|
|
|
|
number of possible blocks. The default should be OK for most configs.
|
|
|
|
|
|
|
|
Say N here unless you know what you are doing.
|
2019-02-22 03:08:47 +08:00
|
|
|
|
|
|
|
config DATA_SHIFT
|
2019-02-22 03:08:50 +08:00
|
|
|
int "Data shift" if DATA_SHIFT_BOOL
|
2019-02-22 03:08:47 +08:00
|
|
|
default 24 if STRICT_KERNEL_RWX && PPC64
|
2021-03-04 22:35:09 +08:00
|
|
|
range 17 28 if (STRICT_KERNEL_RWX || DEBUG_PAGEALLOC || KFENCE) && PPC_BOOK3S_32
|
|
|
|
range 19 23 if (STRICT_KERNEL_RWX || DEBUG_PAGEALLOC || KFENCE) && PPC_8xx
|
2021-10-15 18:02:49 +08:00
|
|
|
range 20 24 if (STRICT_KERNEL_RWX || DEBUG_PAGEALLOC || KFENCE) && PPC_FSL_BOOKE
|
2019-02-22 03:08:49 +08:00
|
|
|
default 22 if STRICT_KERNEL_RWX && PPC_BOOK3S_32
|
2021-03-04 22:35:09 +08:00
|
|
|
default 18 if (DEBUG_PAGEALLOC || KFENCE) && PPC_BOOK3S_32
|
2019-02-22 03:08:52 +08:00
|
|
|
default 23 if STRICT_KERNEL_RWX && PPC_8xx
|
2021-03-04 22:35:09 +08:00
|
|
|
default 23 if (DEBUG_PAGEALLOC || KFENCE) && PPC_8xx && PIN_TLB_DATA
|
|
|
|
default 19 if (DEBUG_PAGEALLOC || KFENCE) && PPC_8xx
|
2021-10-15 18:02:49 +08:00
|
|
|
default 24 if STRICT_KERNEL_RWX && FSL_BOOKE
|
2019-02-22 03:08:47 +08:00
|
|
|
default PPC_PAGE_SHIFT
|
2019-02-22 03:08:50 +08:00
|
|
|
help
|
|
|
|
On Book3S 32 (603+), DBATs are used to map kernel text and rodata RO.
|
|
|
|
The smaller the alignment, the greater the number of DBATs needed.
|
2019-02-22 03:08:47 +08:00
|
|
|
|
2019-02-22 03:08:52 +08:00
|
|
|
On 8xx, large pages (512kb or 8M) are used to map kernel linear
|
|
|
|
memory. Aligning to 8M reduces TLB misses as only 8M pages are used
|
2020-05-19 13:49:25 +08:00
|
|
|
in that case. If PIN_TLB is selected, it must be aligned to 8M as
|
|
|
|
8M pages will be pinned.
|
2019-02-22 03:08:52 +08:00
|
|
|
|
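The DATA_SHIFT help text above states that a smaller alignment requires more DBATs on Book3S 32. The sketch below illustrates why, using a deliberately simplified model: the read-only region starts at offset 0, ends at its size rounded up to 2^DATA_SHIFT, and must be covered by naturally aligned power-of-two blocks of at most 256MB. The block-size limit and the region size used in the note are assumptions for illustration, not values taken from the kernel.

```python
def blocks_needed(region_bytes: int, data_shift: int,
                  max_block_shift: int = 28) -> int:
    """Count naturally aligned power-of-two blocks covering the rounded region."""
    alignment = 1 << data_shift
    end = -(-region_bytes // alignment) * alignment  # round up to the alignment
    addr = 0
    count = 0
    while addr < end:
        # Largest power-of-two block that fits in the remaining span ...
        size = 1 << min((end - addr).bit_length() - 1, max_block_shift)
        # ... shrunk until addr is naturally aligned to it.
        while addr % size:
            size >>= 1
        addr += size
        count += 1
    return count
```

For a hypothetical 9,800,000-byte read-only region, a shift of 24 needs one block, a shift of 22 needs two, and a shift of 17 needs four.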
2008-04-11 09:11:56 +08:00
|
|
|
config FORCE_MAX_ZONEORDER
|
|
|
|
int "Maximum zone order"
|
2016-02-19 13:38:47 +08:00
|
|
|
range 8 9 if PPC64 && PPC_64K_PAGES
|
2009-07-21 23:25:53 +08:00
|
|
|
default "9" if PPC64 && PPC_64K_PAGES
|
powerpc/mm: Update FORCE_MAX_ZONEORDER range to allow hugetlb w/4K
For hugetlb to work with 4K page size, we need MAX_ORDER to be 13 or
more. When switching from a 64K page size to 4K linux page size using
make oldconfig, we end up with a CONFIG_FORCE_MAX_ZONEORDER value of 9.
This results in a 16M hugepage being considered as a gigantic huge page
which in turn results in failure to setup hugepages if gigantic hugepage
support is not enabled.
This also results in kernel crash with 4K radix configuration. We
hit the below BUG_ON on radix:
kernel BUG at mm/huge_memory.c:364!
Oops: Exception in kernel mode, sig: 5 [#1]
SMP NR_CPUS=2048 NUMA PowerNV
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.8.0-rc1-00006-gbae9cc6 #1
task: c0000000f1af8000 task.stack: c0000000f1aec000
NIP: c000000000c5fa0c LR: c000000000c5f9d8 CTR: c000000000c5f9a4
REGS: c0000000f1aef920 TRAP: 0700 Not tainted (4.8.0-rc1-00006-gbae9cc6)
MSR: 9000000102029033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE,TM[E]> CR: 24000844 XER: 00000000
CFAR: c000000000c5f9e0 SOFTE: 1
....
NIP [c000000000c5fa0c] hugepage_init+0x68/0x238
LR [c000000000c5f9d8] hugepage_init+0x34/0x238
Fixes: a7ee539584acf ("powerpc/Kconfig: Update config option based on page size")
Cc: stable@vger.kernel.org # v4.7+
Reported-by: Santhosh <santhog4@linux.vnet.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-20 01:31:33 +08:00
|
|
|
range 13 13 if PPC64 && !PPC_64K_PAGES
|
2009-07-21 23:25:53 +08:00
|
|
|
default "13" if PPC64 && !PPC_64K_PAGES
|
|
|
|
range 9 64 if PPC32 && PPC_16K_PAGES
|
|
|
|
default "9" if PPC32 && PPC_16K_PAGES
|
|
|
|
range 7 64 if PPC32 && PPC_64K_PAGES
|
|
|
|
default "7" if PPC32 && PPC_64K_PAGES
|
|
|
|
range 5 64 if PPC32 && PPC_256K_PAGES
|
|
|
|
default "5" if PPC32 && PPC_256K_PAGES
|
2008-09-24 12:29:08 +08:00
|
|
|
range 11 64
|
2008-04-11 09:11:56 +08:00
|
|
|
default "11"
|
|
|
|
help
|
|
|
|
The kernel memory allocator divides physically contiguous memory
|
|
|
|
blocks into "zones", where each zone is a power of two number of
|
|
|
|
pages. This option selects the largest power of two that the kernel
|
|
|
|
keeps in the memory allocator. If you need to allocate very large
|
|
|
|
blocks of physically contiguous memory, then you may need to
|
|
|
|
increase this value.
|
|
|
|
|
|
|
|
This config option is actually maximum order plus one. For example,
|
|
|
|
a value of 11 means that the largest free memory block is 2^10 pages.
|
|
|
|
|
|
|
|
The page size is not necessarily 4KB. For example, on 64-bit
|
|
|
|
systems, 64KB pages can be enabled via CONFIG_PPC_64K_PAGES. Keep
|
|
|
|
this in mind when choosing a value for this option.
|
|
|
|
|
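The FORCE_MAX_ZONEORDER help text above says the option is the maximum order plus one, and that the page size must be kept in mind. The sketch below computes the largest contiguous block for a given setting. It also applies one reading of the quoted commit message, namely that a huge page counts as "gigantic" when it is larger than that block. The function names and the gigantic criterion are illustrative assumptions.

```python
MIB = 1 << 20


def largest_block_bytes(force_max_zoneorder: int, page_shift: int) -> int:
    """Largest free block: 2**(value - 1) pages of 2**page_shift bytes."""
    return (1 << (force_max_zoneorder - 1)) << page_shift


def is_gigantic(hugepage_bytes: int, force_max_zoneorder: int,
                page_shift: int) -> bool:
    """Assumed criterion: the huge page does not fit in the largest block."""
    return hugepage_bytes > largest_block_bytes(force_max_zoneorder, page_shift)
```

A value of 9 with 64k pages and a value of 13 with 4k pages both allow 16MB blocks, whereas a value of 9 carried over to 4k pages allows only 1MB, so a 16M huge page would be treated as gigantic.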
[POWERPC] Provide a way to protect 4k subpages when using 64k pages
Using 64k pages on 64-bit PowerPC systems makes life difficult for
emulators that are trying to emulate an ISA, such as x86, which uses a
smaller page size, since the emulator can no longer use the MMU and
the normal system calls for controlling page protections. Of course,
the emulator can emulate the MMU by checking and possibly remapping
the address for each memory access in software, but that is pretty
slow.
This provides a facility for such programs to control the access
permissions on individual 4k sub-pages of 64k pages. The idea is
that the emulator supplies an array of protection masks to apply to a
specified range of virtual addresses. These masks are applied at the
level where hardware PTEs are inserted into the hardware page table
based on the Linux PTEs, so the Linux PTEs are not affected. Note
that this new mechanism does not allow any access that would otherwise
be prohibited; it can only prohibit accesses that would otherwise be
allowed. This new facility is only available on 64-bit PowerPC and
only when the kernel is configured for 64k pages.
The masks are supplied using a new subpage_prot system call, which
takes a starting virtual address and length, and a pointer to an array
of protection masks in memory. The array has a 32-bit word per 64k
page to be protected; each 32-bit word consists of 16 2-bit fields,
for which 0 allows any access (that is otherwise allowed), 1 prevents
write accesses, and 2 or 3 prevent any access.
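
As a rough illustration of the mask layout described above, the sketch below packs sixteen 2-bit fields into one 32-bit word. The helper names are hypothetical, and the bit ordering (first 4k subpage in the most significant two bits) is an assumption; the text above does not state the ordering.

```python
ALLOW_ALL = 0   # any access that is otherwise allowed
NO_WRITE = 1    # prevent write accesses
NO_ACCESS = 2   # 2 or 3 prevent any access

SUBPAGES_PER_PAGE = 16  # one 64k page holds sixteen 4k subpages


def pack_mask(fields):
    """Pack sixteen 2-bit protection fields into a 32-bit word.

    ASSUMPTION: fields[0] (the lowest-addressed subpage) ends up in
    the most significant two bits of the word.
    """
    if len(fields) != SUBPAGES_PER_PAGE:
        raise ValueError("exactly 16 fields are required")
    word = 0
    for field in fields:
        if not 0 <= field <= 3:
            raise ValueError("each field must fit in 2 bits")
        word = (word << 2) | field
    return word


def field_for(word, subpage):
    """Extract the 2-bit field for one subpage (same ordering assumption)."""
    shift = 30 - 2 * subpage
    return (word >> shift) & 0x3


# Make every subpage of one 64k page read-only.
print(hex(pack_mask([NO_WRITE] * SUBPAGES_PER_PAGE)))
```

One such word would be supplied per 64k page in the range passed to the system call.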
Implicit in this is that the regions of the address space that are
protected are switched to use 4k hardware pages rather than 64k
hardware pages (on machines with hardware 64k page support). In fact
the whole process is switched to use 4k hardware pages when the
subpage_prot system call is used, but this could be improved in future
to switch only the affected segments.
The subpage protection bits are stored in a 3 level tree akin to the
page table tree. The top level of this tree is stored in a structure
that is appended to the top level of the page table tree, i.e., the
pgd array. Since it will often only be 32-bit addresses (below 4GB)
that are protected, the pointers to the first four bottom level pages
are also stored in this structure (each bottom level page contains the
protection bits for 1GB of address space), so the protection bits for
addresses below 4GB can be accessed with one fewer load than those
for higher addresses.
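
The sizes quoted above are consistent with each other, as this small check shows. It assumes a bottom level page is one 64k page filled with the 32-bit words described earlier; `locate` is a hypothetical helper that only models the shortcut for addresses below 4GB, not the full three-level walk.

```python
PAGE_SIZE = 64 * 1024   # kernel configured for 64k pages
WORD_BYTES = 4          # one 32-bit mask word per 64k page
GIB = 1 << 30

# Words that fit in one bottom level page, and the address space they cover.
WORDS_PER_BOTTOM_PAGE = PAGE_SIZE // WORD_BYTES
BYTES_COVERED = WORDS_PER_BOTTOM_PAGE * PAGE_SIZE


def locate(addr):
    """Return (bottom page index, word index) for an address below 4GB."""
    if not 0 <= addr < 4 * GIB:
        raise ValueError("only the low-4GB shortcut is modelled here")
    page_number = addr // PAGE_SIZE
    return divmod(page_number, WORDS_PER_BOTTOM_PAGE)


print(BYTES_COVERED == GIB)   # one bottom level page covers 1GB
print(locate(GIB))            # first word of the second bottom level page
```

Four directly stored pointers are thus enough to reach every mask word for the low 4GB.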
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-01-24 05:35:13 +08:00
|
|
|
config PPC_SUBPAGE_PROT
|
2020-07-03 09:19:58 +08:00
|
|
|
bool "Support setting protections for 4k subpages (subpage_prot syscall)"
|
|
|
|
default n
|
2021-12-01 22:41:51 +08:00
|
|
|
depends on PPC_64S_HASH_MMU && PPC_64K_PAGES
|
2008-01-24 05:35:13 +08:00
|
|
|
help
|
2020-07-03 09:19:58 +08:00
|
|
|
This option adds support for a system call to allow user programs
|
2008-01-24 05:35:13 +08:00
|
|
|
to set access permissions (read/write, readonly, or no access)
|
|
|
|
on the 4k subpages of each 64k page.
|
|
|
|
|
2020-07-03 09:19:58 +08:00
|
|
|
If unsure, say N here.
|
|
|
|
|
2020-08-22 02:55:57 +08:00
|
|
|
config PPC_PROT_SAO_LPAR
|
|
|
|
bool "Support PROT_SAO mappings in LPARs"
|
|
|
|
depends on PPC_BOOK3S_64
|
|
|
|
help
|
|
|
|
This option adds support for PROT_SAO mappings from userspace
|
|
|
|
inside LPARs on supported CPUs.
|
|
|
|
|
|
|
|
This may cause issues when performing guest migration from
|
|
|
|
a CPU that supports SAO to one that does not.
|
|
|
|
|
|
|
|
If unsure, say N here.
|
|
|
|
|
2014-10-08 16:54:50 +08:00
|
|
|
config PPC_COPRO_BASE
|
|
|
|
bool
|
|
|
|
|
2005-09-26 14:04:21 +08:00
|
|
|
config SCHED_SMT
|
|
|
|
bool "SMT (Hyperthreading) scheduler support"
|
|
|
|
depends on PPC64 && SMP
|
|
|
|
help
|
|
|
|
SMT scheduler support improves the CPU scheduler's decision making
|
|
|
|
when dealing with POWER5 CPUs at the cost of slightly increased
|
|
|
|
overhead in some places. If unsure say N here.
|
|
|
|
|
2012-09-10 08:35:26 +08:00
|
|
|
config PPC_DENORMALISATION
|
|
|
|
bool "PowerPC denormalisation exception handling"
|
|
|
|
depends on PPC_BOOK3S_64
|
2013-07-31 14:31:26 +08:00
|
|
|
default "y" if PPC_POWERNV
|
2019-07-04 00:04:13 +08:00
|
|
|
help
|
2012-09-10 08:35:26 +08:00
|
|
|
Add support for handling denormalisation of single precision
|
|
|
|
values. Useful for bare metal only. If unsure say Y here.
|
|
|
|
|
2005-09-26 14:04:21 +08:00
|
|
|
config CMDLINE
|
2020-06-12 06:42:19 +08:00
|
|
|
string "Initial kernel command string"
|
2019-04-27 00:23:27 +08:00
|
|
|
default ""
|
2005-09-26 14:04:21 +08:00
|
|
|
help
|
|
|
|
On some platforms, there is currently no way for the boot loader to
|
|
|
|
pass arguments to the kernel. For these platforms, you can supply
|
|
|
|
some command-line options at build time by entering them here. In
|
|
|
|
most cases you will need to specify the root device here.
|
|
|
|
|
2019-08-02 06:50:06 +08:00
|
|
|
choice
|
|
|
|
prompt "Kernel command line type" if CMDLINE != ""
|
|
|
|
default CMDLINE_FROM_BOOTLOADER
|
|
|
|
|
|
|
|
config CMDLINE_FROM_BOOTLOADER
|
|
|
|
bool "Use bootloader kernel arguments if available"
|
|
|
|
help
|
|
|
|
Uses the command-line options passed by the boot loader. If
|
|
|
|
the boot loader doesn't provide any, the default kernel command
|
|
|
|
string provided in CMDLINE will be used.
|
|
|
|
|
|
|
|
config CMDLINE_EXTEND
|
|
|
|
bool "Extend bootloader kernel arguments"
|
|
|
|
help
|
|
|
|
The command-line arguments provided by the boot loader will be
|
|
|
|
appended to the default kernel command string.
|
|
|
|
|
2014-02-21 04:48:17 +08:00
|
|
|
config CMDLINE_FORCE
|
|
|
|
bool "Always use the default kernel command string"
|
|
|
|
help
|
|
|
|
Always use the default kernel command string, even if the boot
|
|
|
|
loader passes other arguments to the kernel.
|
|
|
|
This is useful if you cannot or don't want to change the
|
|
|
|
command-line options your boot loader passes to the kernel.
|
|
|
|
|
2019-08-02 06:50:06 +08:00
|
|
|
endchoice
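
The three choices above can be summarised with a small model. This is a sketch of the behaviour the help texts describe, not the kernel's implementation; the function name and the whitespace handling when joining strings are assumptions.

```python
def effective_cmdline(mode, cmdline, bootloader_args):
    """Model which command line the kernel ends up with.

    mode            -- name of the selected choice entry
    cmdline         -- the build-time CMDLINE string
    bootloader_args -- arguments passed by the boot loader ("" if none)
    """
    if mode == "CMDLINE_FROM_BOOTLOADER":
        # Boot loader arguments win; CMDLINE is only a fallback.
        return bootloader_args or cmdline
    if mode == "CMDLINE_EXTEND":
        # Boot loader arguments are appended to the default string.
        return " ".join(part for part in (cmdline, bootloader_args) if part)
    if mode == "CMDLINE_FORCE":
        # Boot loader arguments are ignored entirely.
        return cmdline
    raise ValueError("unknown mode: " + mode)


for mode in ("CMDLINE_FROM_BOOTLOADER", "CMDLINE_EXTEND", "CMDLINE_FORCE"):
    print(mode, "->", effective_cmdline(mode, "root=/dev/sda1", "quiet"))
```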
|
|
|
|
|
2008-07-09 23:41:52 +08:00
|
|
|
config EXTRA_TARGETS
|
|
|
|
string "Additional default image types"
|
|
|
|
help
|
|
|
|
List additional targets to be built by the bootwrapper here (separated
|
|
|
|
by spaces). This is useful for targets that depend on device tree
|
|
|
|
files in the .dts directory.
|
|
|
|
|
|
|
|
Targets in this list will be built as part of the default build
|
|
|
|
target, or when the user does a 'make zImage' or a
|
|
|
|
'make zImage.initrd'.
|
|
|
|
|
|
|
|
If unsure, leave blank.
|
|
|
|
|
2008-01-16 12:17:00 +08:00
|
|
|
config ARCH_WANTS_FREEZER_CONTROL
|
|
|
|
def_bool y
|
|
|
|
depends on ADB_PMU
|
|
|
|
|
2018-12-11 19:01:04 +08:00
|
|
|
source "kernel/power/Kconfig"
|
2005-09-26 14:04:21 +08:00
|
|
|
|
2018-01-19 09:50:24 +08:00
|
|
|
config PPC_MEM_KEYS
|
|
|
|
prompt "PowerPC Memory Protection Keys"
|
|
|
|
def_bool y
|
|
|
|
depends on PPC_BOOK3S_64
|
2021-12-01 22:41:51 +08:00
|
|
|
depends on PPC_64S_HASH_MMU
|
2018-01-19 09:50:24 +08:00
|
|
|
select ARCH_USES_HIGH_VMA_FLAGS
|
|
|
|
select ARCH_HAS_PKEYS
|
|
|
|
help
|
|
|
|
Memory Protection Keys provides a mechanism for enforcing
|
|
|
|
page-based protections, but without requiring modification of the
|
|
|
|
page tables when an application changes protection domains.
|
|
|
|
|
2019-06-08 02:54:31 +08:00
|
|
|
For details, see Documentation/core-api/protection-keys.rst
|
2018-01-19 09:50:24 +08:00
|
|
|
|
|
|
|
If unsure, say y.
|
|
|
|
|
2019-11-06 07:00:22 +08:00
|
|
|
config PPC_SECURE_BOOT
|
|
|
|
prompt "Enable secure boot support"
|
|
|
|
bool
|
2020-09-24 09:49:22 +08:00
|
|
|
depends on PPC_POWERNV || PPC_PSERIES
|
2019-10-31 11:31:27 +08:00
|
|
|
depends on IMA_ARCH_POLICY
|
2020-03-09 08:57:51 +08:00
|
|
|
imply IMA_SECURE_AND_OR_TRUSTED_BOOT
|
2019-11-06 07:00:22 +08:00
|
|
|
help
|
|
|
|
Systems with firmware secure boot enabled need to define security
|
|
|
|
policies to extend secure boot to the OS. This config allows a user
|
|
|
|
to enable OS secure boot on systems that have firmware support for
|
|
|
|
it. If in doubt say N.
|
|
|
|
|
2019-11-11 11:10:34 +08:00
|
|
|
config PPC_SECVAR_SYSFS
|
|
|
|
bool "Enable sysfs interface for POWER secure variables"
|
|
|
|
default y
|
|
|
|
depends on PPC_SECURE_BOOT
|
|
|
|
depends on SYSFS
|
|
|
|
help
|
|
|
|
POWER secure variables are managed and controlled by firmware.
|
|
|
|
These variables are exposed to userspace via sysfs to enable
|
|
|
|
read/write operations on these variables. Say Y if you have
|
|
|
|
secure boot enabled and want to expose variables to userspace.
|
|
|
|
|
powerpc/rtas: Restrict RTAS requests from userspace
A number of userspace utilities depend on making calls to RTAS to retrieve
information and update various things.
The existing API through which we expose RTAS to userspace exposes more
RTAS functionality than we actually need, through the sys_rtas syscall,
which allows root (or anyone with CAP_SYS_ADMIN) to make any RTAS call they
want with arbitrary arguments.
Many RTAS calls take the address of a buffer as an argument, and it's up to
the caller to specify the physical address of the buffer as an argument. We
allocate a buffer (the "RMO buffer") in the Real Memory Area that RTAS can
access, and then expose the physical address and size of this buffer in
/proc/powerpc/rtas/rmo_buffer. Userspace is expected to read this address,
poke at the buffer using /dev/mem, and pass an address in the RMO buffer to
the RTAS call.
However, there's nothing stopping the caller from specifying whatever
address they want in the RTAS call, and it's easy to construct a series of
RTAS calls that can overwrite arbitrary bytes (even without /dev/mem
access).
Additionally, there are some RTAS calls that do potentially dangerous
things and for which there are no legitimate userspace use cases.
In the past, this would not have been a particularly big deal as it was
assumed that root could modify all system state freely, but with Secure
Boot and lockdown we need to care about this.
We can't fundamentally change the ABI at this point, however we can address
this by implementing a filter that checks RTAS calls against a list
of permitted calls and forces the caller to use addresses within the RMO
buffer.
The list is based off the list of calls that are used by the librtas
userspace library, and has been tested with a number of existing userspace
RTAS utilities. For compatibility with any applications we are not aware of
that require other calls, the filter can be turned off at build time.
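
The filtering idea described above can be sketched as follows. This is a toy model under stated assumptions, not the kernel's filter: the call names are placeholders (not real RTAS tokens), and the rule layout (`buf_idx`, `size_idx`) is invented for illustration.

```python
from typing import NamedTuple, Optional, Sequence


class Rule(NamedTuple):
    buf_idx: Optional[int]    # index of the buffer-address argument, if any
    size_idx: Optional[int]   # index of the buffer-length argument, if any


# Hypothetical allowlist; real entries would mirror what librtas uses.
ALLOWED = {
    "example-query": Rule(buf_idx=None, size_idx=None),
    "example-read-buffer": Rule(buf_idx=0, size_idx=1),
}


def in_rmo(rmo_base: int, rmo_size: int, addr: int, length: int) -> bool:
    """True if [addr, addr + length) lies wholly inside the RMO buffer."""
    return length >= 0 and rmo_base <= addr and addr + length <= rmo_base + rmo_size


def call_permitted(name: str, args: Sequence[int],
                   rmo_base: int, rmo_size: int) -> bool:
    rule = ALLOWED.get(name)
    if rule is None:
        return False              # not on the list of permitted calls
    if rule.buf_idx is None:
        return True               # call takes no buffer address
    addr = args[rule.buf_idx]
    length = args[rule.size_idx] if rule.size_idx is not None else 1
    return in_rmo(rmo_base, rmo_size, addr, length)


print(call_permitted("example-read-buffer", [0x1000, 64], 0x1000, 4096))
print(call_permitted("example-read-buffer", [0x9000, 64], 0x1000, 4096))
```

The two checks correspond to the two halves of the description: membership in the permitted list, and confinement of buffer addresses to the RMO buffer.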
Cc: stable@vger.kernel.org
Reported-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200820044512.7543-1-ajd@linux.ibm.com
2020-08-20 12:45:12 +08:00
|
|
|
config PPC_RTAS_FILTER
|
|
|
|
bool "Enable filtering of RTAS syscalls"
|
|
|
|
default y
|
|
|
|
depends on PPC_RTAS
|
|
|
|
help
|
|
|
|
The RTAS syscall API has security issues that could be used to
|
|
|
|
compromise system integrity. This option enforces restrictions on the
|
|
|
|
RTAS calls and arguments passed by userspace programs to mitigate
|
|
|
|
these issues.
|
|
|
|
|
|
|
|
Say Y unless you know what you are doing and the filter is causing
|
|
|
|
problems for you.
|
|
|
|
|
2005-09-26 14:04:21 +08:00
|
|
|
endmenu
|
|
|
|
|
|
|
|
config ISA_DMA_API
|
|
|
|
bool
|
2012-02-22 22:10:12 +08:00
|
|
|
default PCI
|
2005-09-26 14:04:21 +08:00
|
|
|
|
|
|
|
menu "Bus options"
|
|
|
|
|
|
|
|
config ISA
|
|
|
|
bool "Support for ISA-bus hardware"
|
2013-03-27 08:47:03 +08:00
|
|
|
depends on PPC_CHRP
|
2005-10-26 14:47:42 +08:00
|
|
|
select PPC_I8259
|
2005-09-26 14:04:21 +08:00
|
|
|
help
|
|
|
|
Find out whether you have ISA slots on your motherboard. ISA is the
|
|
|
|
name of a bus system, i.e. the way the CPU talks to the other stuff
|
|
|
|
inside your box. If you have an Apple machine, say N here; if you
|
2013-03-27 08:47:03 +08:00
|
|
|
have an IBM RS/6000 or pSeries machine, say Y. If you have an
|
|
|
|
embedded board, consult your board documentation.
|
2005-09-26 14:04:21 +08:00
|
|
|
|
|
|
|
config GENERIC_ISA_DMA
|
|
|
|
bool
|
2010-07-15 15:38:16 +08:00
|
|
|
depends on ISA_DMA_API
|
2005-09-26 14:04:21 +08:00
|
|
|
default y
|
|
|
|
|
2005-10-26 14:36:55 +08:00
|
|
|
config PPC_INDIRECT_PCI
|
|
|
|
bool
|
|
|
|
depends on PCI
|
2006-01-15 06:57:39 +08:00
|
|
|
default y if 40x || 44x
|
2005-10-26 14:36:55 +08:00
|
|
|
|
2005-09-26 14:04:21 +08:00
|
|
|
config SBUS
|
|
|
|
bool
|
|
|
|
|
2006-01-11 11:43:56 +08:00
|
|
|
config FSL_SOC
|
|
|
|
bool
|
|
|
|
|
2007-07-10 18:44:34 +08:00
|
|
|
config FSL_PCI
|
2019-07-04 00:04:13 +08:00
|
|
|
bool
|
2019-02-13 15:01:22 +08:00
|
|
|
select ARCH_HAS_DMA_SET_MASK
|
2007-07-10 18:44:34 +08:00
|
|
|
select PPC_INDIRECT_PCI
|
2009-01-29 03:25:29 +08:00
|
|
|
select PCI_QUIRKS
|
2007-07-10 18:44:34 +08:00
|
|
|
|
2009-09-16 05:43:57 +08:00
|
|
|
config FSL_PMC
|
|
|
|
bool
|
|
|
|
default y
|
|
|
|
depends on SUSPEND && (PPC_85xx || PPC_86xx)
|
|
|
|
help
|
|
|
|
Freescale MPC85xx/MPC86xx power management controller support
|
|
|
|
(suspend/resume). For MPC83xx see platforms/83xx/suspend.c
|
|
|
|
|
2010-10-08 18:25:27 +08:00
|
|
|
config PPC4xx_CPM
|
|
|
|
bool
|
|
|
|
default y
|
|
|
|
depends on SUSPEND && (44x || 40x)
|
|
|
|
help
|
|
|
|
PPC4xx Clock Power Management (CPM) support (suspend/resume).
|
|
|
|
It also enables support for two different idle states (idle-wait
|
|
|
|
and idle-doze).
|
|
|
|
|
2008-03-26 19:39:50 +08:00
|
|
|
config 4xx_SOC
|
|
|
|
bool
|
|
|
|
|
2008-04-12 01:03:40 +08:00
|
|
|
config FSL_LBC
|
2010-10-18 15:22:31 +08:00
|
|
|
bool "Freescale Local Bus support"
|
2008-04-12 01:03:40 +08:00
|
|
|
help
|
2010-10-18 15:22:31 +08:00
|
|
|
Enables reporting of errors from the Freescale local bus
|
|
|
|
controller. Also contains some common code used by
|
|
|
|
drivers for specific local bus peripherals.
|
2008-04-12 01:03:40 +08:00
|
|
|
|
2008-05-24 00:38:54 +08:00
|
|
|
config FSL_GTM
|
|
|
|
bool
|
|
|
|
depends on PPC_83xx || QUICC_ENGINE || CPM2
|
|
|
|
help
|
|
|
|
Freescale General-purpose Timers support
|
|
|
|
|
2005-09-26 14:04:21 +08:00
|
|
|
config PCI_8260
|
|
|
|
bool
|
|
|
|
depends on PCI && 8260
|
2005-10-26 14:36:55 +08:00
|
|
|
select PPC_INDIRECT_PCI
|
2005-09-26 14:04:21 +08:00
|
|
|
default y
|
|
|
|
|
2011-03-24 07:43:03 +08:00
|
|
|
config FSL_RIO
|
|
|
|
bool "Freescale Embedded SRIO Controller support"
|
2018-11-16 03:05:36 +08:00
|
|
|
depends on RAPIDIO = y && HAVE_RAPIDIO
|
2011-03-24 07:43:03 +08:00
|
|
|
default "n"
|
2019-07-04 00:04:13 +08:00
|
|
|
help
|
2011-03-24 07:43:03 +08:00
|
|
|
Include support for RapidIO controller on Freescale embedded
|
|
|
|
processors (MPC8548, MPC8641, etc).
|
|
|
|
|
2005-09-26 14:04:21 +08:00
|
|
|
endmenu
|
|
|
|
|
2011-12-15 06:57:15 +08:00
|
|
|
config NONSTATIC_KERNEL
|
|
|
|
bool
|
|
|
|
|
2005-09-26 14:04:21 +08:00
|
|
|
menu "Advanced setup"
|
|
|
|
depends on PPC32
|
|
|
|
|
|
|
|
config ADVANCED_OPTIONS
|
|
|
|
bool "Prompt for advanced kernel configuration options"
|
|
|
|
help
|
|
|
|
This option will enable prompting for a variety of advanced kernel
|
|
|
|
configuration options. These options can cause the kernel to not
|
|
|
|
work if they are set incorrectly, but can be used to optimize certain
|
|
|
|
aspects of kernel memory management.
|
|
|
|
|
|
|
|
Unless you know what you are doing, say N here.
|
|
|
|
|
|
|
|
comment "Default settings for advanced configuration options are used"
|
|
|
|
depends on !ADVANCED_OPTIONS
|
|
|
|
|
|
|
|
config LOWMEM_SIZE_BOOL
|
|
|
|
bool "Set maximum low memory"
|
|
|
|
depends on ADVANCED_OPTIONS
|
|
|
|
help
|
|
|
|
This option allows you to set the maximum amount of memory which
|
|
|
|
will be used as "low memory", that is, memory which the kernel can
|
|
|
|
access directly, without having to set up a kernel virtual mapping.
|
|
|
|
This can be useful in optimizing the layout of kernel virtual
|
|
|
|
memory.
|
|
|
|
|
|
|
|
Say N here unless you know what you are doing.
|
|
|
|
|
|
|
|
config LOWMEM_SIZE
|
|
|
|
hex "Maximum low memory size (in bytes)" if LOWMEM_SIZE_BOOL
|
|
|
|
default "0x30000000"
|
|
|
|
|
2008-12-09 11:34:58 +08:00
|
|
|
config LOWMEM_CAM_NUM_BOOL
|
|
|
|
bool "Set number of CAMs to use to map low memory"
|
|
|
|
depends on ADVANCED_OPTIONS && FSL_BOOKE
|
|
|
|
help
|
|
|
|
This option allows you to set the maximum number of CAM slots that
|
|
|
|
will be used to map low memory. There are a limited number of slots
|
|
|
|
available and an even more limited number that will fit in the L1 MMU.
|
|
|
|
However, using more entries will allow mapping more low memory. This
|
|
|
|
can be useful in optimizing the layout of kernel virtual memory.
|
|
|
|
|
|
|
|
Say N here unless you know what you are doing.
|
|
|
|
|
|
|
|
config LOWMEM_CAM_NUM
|
2009-03-31 20:05:50 +08:00
|
|
|
depends on FSL_BOOKE
|
2008-12-09 11:34:58 +08:00
|
|
|
int "Number of CAMs to use to map low memory" if LOWMEM_CAM_NUM_BOOL
|
2021-10-15 18:02:49 +08:00
|
|
|
default 3 if !STRICT_KERNEL_RWX
|
|
|
|
default 9 if DATA_SHIFT >= 24
|
|
|
|
default 12 if DATA_SHIFT >= 22
|
|
|
|
default 15
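
Kconfig evaluates `default` lines in order and uses the first one whose condition holds, so the four defaults above behave like a chain of if statements. A minimal sketch of that selection, with a hypothetical function name:

```python
def lowmem_cam_num_default(strict_kernel_rwx: bool, data_shift: int) -> int:
    """Mirror the ordered defaults of LOWMEM_CAM_NUM shown above."""
    if not strict_kernel_rwx:
        return 3
    if data_shift >= 24:
        return 9
    if data_shift >= 22:
        return 12
    return 15


# Without STRICT_KERNEL_RWX the first default always wins.
print(lowmem_cam_num_default(False, 24))
# With STRICT_KERNEL_RWX the result depends on DATA_SHIFT.
print(lowmem_cam_num_default(True, 22))
```

Reordering the `default` lines would change the result, since a later, more general condition would never be reached if placed after an earlier match.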
|
2008-12-09 11:34:58 +08:00
|
|
|
|
2011-12-15 06:57:15 +08:00
|
|
|
config DYNAMIC_MEMSTART
|
2013-01-17 10:53:25 +08:00
|
|
|
bool "Enable page aligned dynamic load address for kernel"
|
|
|
|
depends on ADVANCED_OPTIONS && FLATMEM && (FSL_BOOKE || 44x)
|
2011-12-15 06:57:15 +08:00
|
|
|
select NONSTATIC_KERNEL
|
|
|
|
help
|
|
|
|
This option enables the kernel to be loaded at any page aligned
|
2019-07-04 00:04:13 +08:00
|
|
|
physical address. The kernel creates a mapping from KERNELBASE to
|
2011-12-15 06:57:15 +08:00
|
|
|
the address where the kernel is loaded. The page size here implies
|
|
|
|
the TLB page size of the mapping for kernel on the particular platform.
|
|
|
|
Please refer to the init code for finding the TLB page size.
|
|
|
|
|
|
|
|
DYNAMIC_MEMSTART is an easy way of implementing a pseudo-RELOCATABLE
|
|
|
|
kernel image, where the only restriction is the page aligned kernel
|
2019-07-04 00:04:13 +08:00
|
|
|
load address. When this option is enabled, the compile time physical
|
2011-12-15 06:57:15 +08:00
|
|
|
address CONFIG_PHYSICAL_START is ignored.
|
|
|
|
|
2011-12-15 06:58:12 +08:00
|
|
|
This option is overridden by CONFIG_RELOCATABLE.
|
|
|
|
|
2008-04-22 02:22:34 +08:00
|
|
|
config PAGE_OFFSET_BOOL
|
|
|
|
bool "Set custom page offset address"
|
|
|
|
depends on ADVANCED_OPTIONS
|
|
|
|
help
|
|
|
|
This option allows you to set the kernel virtual address at which
|
|
|
|
the kernel will map low memory. This can be useful in optimizing
|
|
|
|
the virtual memory layout of the system.
|
|
|
|
|
|
|
|
Say N here unless you know what you are doing.
|
|
|
|
|
|
|
|
config PAGE_OFFSET
|
|
|
|
hex "Virtual address of memory base" if PAGE_OFFSET_BOOL
|
|
|
|
default "0xc0000000"
|
|
|
|
|
2005-09-26 14:04:21 +08:00
|
|
|
config KERNEL_START_BOOL
|
|
|
|
bool "Set custom kernel base address"
|
|
|
|
depends on ADVANCED_OPTIONS
|
|
|
|
help
|
|
|
|
This option allows you to set the kernel virtual address at which
|
2008-04-22 02:22:34 +08:00
|
|
|
the kernel will be loaded. Normally this should match PAGE_OFFSET;
|
|
|
|
however there are times (like kdump) that one might not want them
|
|
|
|
to be the same.
|
2005-09-26 14:04:21 +08:00
|
|
|
|
|
|
|
Say N here unless you know what you are doing.
|
|
|
|
|
|
|
|
config KERNEL_START
|
|
|
|
hex "Virtual address of kernel base" if KERNEL_START_BOOL
|
2008-04-22 02:22:34 +08:00
|
|
|
default PAGE_OFFSET if PAGE_OFFSET_BOOL
|
2011-12-15 06:57:15 +08:00
|
|
|
default "0xc2000000" if CRASH_DUMP && !NONSTATIC_KERNEL
|
2005-09-26 14:04:21 +08:00
|
|
|
default "0xc0000000"
|
|
|
|
|
2008-04-22 02:22:34 +08:00
|
|
|
config PHYSICAL_START_BOOL
|
|
|
|
bool "Set physical address where the kernel is loaded"
|
|
|
|
depends on ADVANCED_OPTIONS && FLATMEM && FSL_BOOKE
|
|
|
|
help
|
|
|
|
This gives the physical address where the kernel is loaded.
|
|
|
|
|
|
|
|
Say N here unless you know what you are doing.
|
|
|
|
|
|
|
|
config PHYSICAL_START
|
|
|
|
hex "Physical address where the kernel is loaded" if PHYSICAL_START_BOOL
|
2018-11-17 18:25:07 +08:00
|
|
|
default "0x02000000" if PPC_BOOK3S && CRASH_DUMP && !NONSTATIC_KERNEL
|
2008-04-22 02:22:34 +08:00
|
|
|
default "0x00000000"
|
|
|
|
|
|
|
|
config PHYSICAL_ALIGN
|
|
|
|
hex
|
2008-12-09 11:34:59 +08:00
|
|
|
default "0x04000000" if FSL_BOOKE
|
2008-04-22 02:22:34 +08:00
|
|
|
help
|
|
|
|
This value sets the alignment restriction on the physical address
|
|
|
|
where the kernel is loaded and run from. The kernel is compiled for an
|
|
|
|
address which meets the above alignment restriction.
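
For a power-of-two alignment such as the FSL_BOOKE default of 0x04000000 (64MB), the restriction amounts to the low bits of the load address being zero. A minimal check, with a hypothetical helper name:

```python
PHYSICAL_ALIGN = 0x04000000  # FSL_BOOKE default shown above


def is_aligned(addr: int, align: int = PHYSICAL_ALIGN) -> bool:
    """True if addr is a multiple of align (align must be a power of two)."""
    if align <= 0 or align & (align - 1):
        raise ValueError("alignment must be a power of two")
    return addr & (align - 1) == 0


print(is_aligned(0x08000000))  # multiple of 64MB
print(is_aligned(0x02000000))  # 32MB, not a multiple of 64MB
```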
|
|
|
|
|
2005-09-26 14:04:21 +08:00
|
|
|
config TASK_SIZE_BOOL
|
|
|
|
bool "Set custom user task size"
|
|
|
|
depends on ADVANCED_OPTIONS
|
|
|
|
help
|
|
|
|
This option allows you to set the amount of virtual address space
|
|
|
|
allocated to user tasks. This can be useful in optimizing the
|
|
|
|
virtual memory layout of the system.
|
|
|
|
|
|
|
|
Say N here unless you know what you are doing.
|
|
|
|
|
|
|
|
config TASK_SIZE
|
|
|
|
hex "Size of user task space" if TASK_SIZE_BOOL
|
2013-03-27 08:47:03 +08:00
|
|
|
default "0x80000000" if PPC_8xx
|
2021-04-01 21:30:43 +08:00
|
|
|
default "0xb0000000" if PPC_BOOK3S_32
|
2007-10-12 02:40:21 +08:00
|
|
|
default "0xc0000000"
|
2005-09-26 14:04:21 +08:00
|
|
|
endmenu
|
|
|
|
|
2005-09-30 14:16:52 +08:00
|
|
|
if PPC64
|
powerpc: Work around gcc miscompilation of __pa() on 64-bit
On 64-bit, __pa(&static_var) gets miscompiled by recent versions of
gcc as something like:
addis 3,2,.LANCHOR1+4611686018427387904@toc@ha
addi 3,3,.LANCHOR1+4611686018427387904@toc@l
This ends up effectively ignoring the offset, since its bottom 32 bits
are zero, and means that the result of __pa() still has 0xC in the top
nibble. This happens with gcc 4.8.1, at least.
To work around this, for 64-bit we make __pa() use an AND operator,
and for symmetry, we make __va() use an OR operator. Using an AND
operator rather than a subtraction ends up with slightly shorter code
since it can be done with a single clrldi instruction, whereas it
takes three instructions to form the constant (-PAGE_OFFSET) and add
it on. (Note that MEMORY_START is always 0 on 64-bit.)
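
The equivalence this workaround relies on can be demonstrated with plain integer arithmetic. The sketch below models 64-bit registers with Python integers; the function names are hypothetical, and the 60-bit mask is an assumption taken from the requirement that PAGE_OFFSET have zeroes in its bottom 60 bits.

```python
MASK64 = (1 << 64) - 1
PAGE_OFFSET = 0xC000000000000000
LOW_60_BITS = (1 << 60) - 1


def pa_and(vaddr: int) -> int:
    """__pa() as an AND: clear the top nibble."""
    return vaddr & LOW_60_BITS


def va_or(paddr: int) -> int:
    """__va() as an OR: set the PAGE_OFFSET bits."""
    return (paddr | PAGE_OFFSET) & MASK64


def pa_sub(vaddr: int) -> int:
    """__pa() as a 64-bit subtraction, i.e. adding -PAGE_OFFSET."""
    return (vaddr - PAGE_OFFSET) & MASK64


# -PAGE_OFFSET is the large constant seen in the miscompiled code,
# and its bottom 32 bits are all zero.
neg_offset = (-PAGE_OFFSET) & MASK64
print(neg_offset, hex(neg_offset & 0xFFFFFFFF))

vaddr = 0xC000000000C5FA0C
print(hex(pa_and(vaddr)), pa_and(vaddr) == pa_sub(vaddr))
```

For addresses in the linear mapping, where the top nibble is 0xC and MEMORY_START is 0, the AND and the subtraction give the same result.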
CC: <stable@vger.kernel.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-27 14:07:49 +08:00
|
|
|
# This value must have zeroes in the bottom 60 bits otherwise lots will break
|
2008-04-22 02:22:34 +08:00
|
|
|
config PAGE_OFFSET
|
|
|
|
hex
|
|
|
|
default "0xc000000000000000"
|
2005-09-30 14:16:52 +08:00
|
|
|
config KERNEL_START
|
|
|
|
hex
|
2005-09-30 15:24:15 +08:00
|
|
|
default "0xc000000000000000"
|
2008-04-22 02:22:34 +08:00
|
|
|
config PHYSICAL_START
|
|
|
|
hex
|
|
|
|
default "0x00000000"
|
2005-09-30 14:16:52 +08:00
|
|
|
endif
|
|
|
|
|
2013-10-11 11:07:57 +08:00
|
|
|
config ARCH_RANDOM
|
|
|
|
def_bool n
|
|
|
|
|
2007-09-16 18:53:25 +08:00
|
|
|
config PPC_LIB_RHEAP
|
|
|
|
bool
|
|
|
|
|
2008-04-17 12:28:09 +08:00
|
|
|
source "arch/powerpc/kvm/Kconfig"
|
powerpc/livepatch: Add live patching support on ppc64le
Add the kconfig logic & assembly support for handling live patched
functions. This depends on DYNAMIC_FTRACE_WITH_REGS, which in turn
depends on the new -mprofile-kernel ftrace ABI, which is only supported
currently on ppc64le.
Live patching is handled by a special ftrace handler. This means it runs
from ftrace_caller(). The live patch handler modifies the NIP so as to
redirect the return from ftrace_caller() to the new patched function.
However there is one particularly tricky case we need to handle.
If a function A calls another function B, and it is known at link time
that they share the same TOC, then A will not save or restore its TOC,
and will call the local entry point of B.
When we live patch B, we replace it with a new function C, which may
not have the same TOC as A. At live patch time it's too late to modify A
to do the TOC save/restore, so the live patching code must interpose
itself between A and C, and do the TOC save/restore that A omitted.
An additional complication is that the livepatch code cannot create a
stack frame in order to save the TOC. That is because if C takes > 8
arguments, or is varargs, A will have written the arguments for C in
A's stack frame.
To solve this, we introduce a "livepatch stack" which grows upward from
the base of the regular stack, and is used to store the TOC & LR when
calling a live patched function.
When the patched function returns, we retrieve the real LR & TOC from
the livepatch stack, restore them, and pop the livepatch "stack frame".
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Torsten Duwe <duwe@suse.de>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
2016-03-24 19:04:05 +08:00
|
|
|
|
|
|
|
source "kernel/livepatch/Kconfig"
|