The pseries platform uses 32bit default DMA window (always 4K pages) and
optional 64bit DMA window available via DDW ("Dynamic DMA Windows"),
64K or 2M pages. For ages the default one was not removed and a huge
window was created in addition. Things changed with SRIOV-enabled
PowerVM which creates a default-and-bigger DMA window in 64bit space
(still using 4K pages) for IOV VFs so certain OSes do not need to use
the DDW API in order to utilize all available TCE budget.
Linux on the other hand removes the default window and creates a bigger
one (with more TCEs or/and a bigger page size - 64K/2M) in a bid to map
the entire RAM, and if the new window size is smaller than that - it
still uses this new bigger window. The result is that the default window
is removed but the "ibm,dma-window" property is not.
When kdump is invoked, the existing code tries reusing the existing 64bit
DMA window which location and parameters are stored in the device tree but
this fails as the new property does not make it to the kdump device tree
blob. So the code falls back to the default window which does not exist
anymore although the device tree says that it does. The result of that
is that PCI devices become unusable and cannot be used for kdumping.
This preserves the DMA64 and DIRECT64 properties in the device tree blob
for the crash kernel. Since the crash kernel setup is done after device
drivers are loaded and probed, the proper DMA config is stored at least
for boot time devices.
Because DDW window is optional and the code configures the default window
first, the existing code creates an IOMMU table descriptor for
the non-existing default DMA window. It is harmless for kdump as it does
not touch the actual window (only reads what is mapped and marks those IO
pages as used) but it is bad for kexec which clears it thinking it is
a smaller default window rather than a bigger DDW window.
This removes the "ibm,dma-window" property from the device tree after
a bigger window is created and the crash kernel setup picks it up.
Fixes: 381ceda88c ("powerpc/pseries/iommu: Make use of DDW for indirect mapping")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220629060614.1680476-1-aik@ozlabs.ru
During kdump, two set of NMI IPIs are sent to secondary CPUs, if
'crash_kexec_post_notifiers' option is set. The first set of NMI IPIs
to stop the CPUs and the other set to collect register data. Instead,
capture register data for secondary CPUs while stopping them itself.
Also, fallback to smp_send_stop() in case the function gets called
without kdump configured.
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220630064942.192283-1-hbathini@linux.ibm.com
Trying to remove asm/ppc_asm.h from all places that don't need it
leads to several failures linked to firmware_has_feature().
To fix it, include asm/firmware.h in all files using
firmware_has_feature()
All users found with:
git grep -L "firmware\.h" ` git grep -l "firmware_has_feature("`
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/11956ec181a034b51a881ac9c059eea72c679a73.1651828453.git.christophe.leroy@csgroup.eu
This marks more files and functions that can possibly be called in
real mode as not to be instrumented by KASAN. Most were found by
inspection, except for get_pseries_errorlog() which was reported as
causing a crash in testing.
Reported-by: Nageswara R Sastry <rnsastry@linux.ibm.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/YoX1kZPnmUX4RZEK@cleo
The kexec code paths involve code that necessarily run in real mode, as
CPUs are disabled and control is transferred to the new kernel. Disable
address sanitization for the kexec code and the functions called in real
mode on CPUs being disabled.
[paulus@ozlabs.org: combined a few work-in-progress commits of
Daniel's and wrote the commit message.]
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
[mpe: Move pseries_machine_kexec() into kexec.c so setup.c can be instrumented]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/YoTFSQ2TUSEaDdVC@cleo
Various spelling mistakes in comments.
Detected with the help of Coccinelle.
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220430185654.5855-1-Julia.Lawall@inria.fr
We originally added asm-prototypes.h in commit 42f5b4cacd ("powerpc:
Introduce asm-prototypes.h"). It's purpose was for prototypes of C
functions that are only called from asm, in order to fix sparse
warnings about missing prototypes.
A few months later Nick added a different use case in
commit 4efca4ed05 ("kbuild: modversions for EXPORT_SYMBOL() for asm")
for C prototypes for exported asm functions. This is basically the
inverse of our original usage.
Since then we've added various prototypes to asm-prototypes.h for both
reasons, meaning we now need to unstitch it all.
Dispatch prototypes of C functions into relevant headers and keep
only the prototypes for functions defined in assembly.
For the time being, leave prom_init() there because moving it
into asm/prom.h or asm/setup.h conflicts with
drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadowrom.o
This will be fixed later by untaggling asm/pci.h and asm/prom.h
or by renaming the function in shadowrom.c
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/62d46904eca74042097acf4cb12c175e3067f3d1.1646413435.git.christophe.leroy@csgroup.eu
On large config LPARs (having 192 and more cores), Linux fails to boot
due to insufficient memory in the first memblock. It is due to the
memory reservation for the crash kernel which starts at 128MB offset of
the first memblock. This memory reservation for the crash kernel doesn't
leave enough space in the first memblock to accommodate other essential
system resources.
The crash kernel start address was set to 128MB offset by default to
ensure that the crash kernel get some memory below the RMA region which
is used to be of size 256MB. But given that the RMA region size can be
512MB or more, setting the crash kernel offset to mid of RMA size will
leave enough space for the kernel to allocate memory for other system
resources.
Since the above crash kernel offset change is only applicable to the LPAR
platform, the LPAR feature detection is pushed before the crash kernel
reservation. The rest of LPAR specific initialization will still
be done during pseries_probe_fw_features as usual.
This patch is dependent on changes to paca allocation for boot CPU. It
expect boot CPU to discover 1T segment support which is introduced by
the patch posted here:
https://lists.ozlabs.org/pipermail/linuxppc-dev/2022-January/239175.html
Reported-by: Abdul haleem <abdhalee@linux.vnet.ibm.com>
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220204085601.107257-1-sourabhjain@linux.ibm.com
Some functions defined in `arch/powerpc/kernel` (and one in `arch/powerpc/
kexec`) are deserving of an `__init` macro attribute. These functions are
only called by other initialization functions and therefore should inherit
the attribute.
Also, change function declarations in header files to include `__init`.
Signed-off-by: Nick Child <nick.child@ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211216220035.605465-2-nick.child@ibm.com
Compiling out hash support code when CONFIG_PPC_64S_HASH_MMU=n saves
128kB kernel image size (90kB text) on powernv_defconfig minus KVM,
350kB on pseries_defconfig minus KVM, 40kB on a tiny config.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Fixup defined(ARCH_HAS_MEMREMAP_COMPAT_ALIGN), which needs CONFIG.
Fix radix_enabled() use in setup_initial_memory_limit(). Add some
stubs to reduce number of ifdefs.]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211201144153.2456614-18-npiggin@gmail.com
ppc_md.iommu_save() is not set anymore by any platform after
commit c40785ad30 ("powerpc/dart: Use a cachable DART").
So iommu_save() has become a nop and can be removed.
ppc_md.show_percpuinfo() is not set anymore by any platform after
commit 4350147a81 ("[PATCH] ppc64: SMU based macs cpufreq support").
Last users of ppc_md.rtc_read_val() and ppc_md.rtc_write_val() were
removed by commit 0f03a43b8f ("[POWERPC] Remove todc code from
ARCH=powerpc")
Last user of kgdb_map_scc() was removed by commit 17ce452f7e ("kgdb,
powerpc: arch specific powerpc kgdb support").
ppc.machine_kexec_prepare() has not been used since
commit 8ee3e0d696 ("powerpc: Remove the main legacy iSerie platform
code"). This allows the removal of machine_kexec_prepare() and the
rename of default_machine_kexec_prepare() into machine_kexec_prepare()
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Daniel Axtens <dja@axtens.net>
[mpe: Drop prototype for default_machine_kexec_prepare() as noted by dja]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/24d4ca0ada683c9436a5f812a7aeb0a1362afa2b.1630398606.git.christophe.leroy@csgroup.eu
Fix following coccicheck warning:
./arch/powerpc/kexec/file_load_64.c:698:1-22: WARNING: Function
for_each_node_by_type should have of_node_put() before goto
Early exits from for_each_node_by_type should decrement the
node reference counter.
Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Acked-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211018015418.10182-1-wanjiabing@vivo.com
- Convert pseries & powernv to use MSI IRQ domains.
- Rework the pseries CPU numbering so that CPUs that are removed, and later re-added, are
given a CPU number on the same node as previously, when possible.
- Add support for a new more flexible device-tree format for specifying NUMA distances.
- Convert powerpc to GENERIC_PTDUMP.
- Retire sbc8548 and sbc8641d board support.
- Various other small features and fixes.
Thanks to: Alexey Kardashevskiy, Aneesh Kumar K.V, Anton Blanchard, Cédric Le Goater,
Christophe Leroy, Emmanuel Gil Peyrot, Fabiano Rosas, Fangrui Song, Finn Thain, Gautham R.
Shenoy, Hari Bathini, Joel Stanley, Jordan Niethe, Kajol Jain, Laurent Dufour, Leonardo
Bras, Lukas Bulwahn, Marc Zyngier, Masahiro Yamada, Michal Suchanek, Nathan Chancellor,
Nicholas Piggin, Parth Shah, Paul Gortmaker, Pratik R. Sampat, Randy Dunlap, Sebastian
Andrzej Siewior, Srikar Dronamraju, Wan Jiabing, Xiongwei Song, Zheng Yongjun.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmEyHTYTHG1wZUBlbGxl
cm1hbi5pZC5hdQAKCRBR6+o8yOGlgDo3D/9aXMVP2wsEMNB0XhTiJ1UUdi311Uq9
PvkAaGZH14ZqZLVigeiD3gt6YzTH0cEuGj6qgwsJrPDjF8FESnMbBsprMLr5/qE1
itWRGMAMCFaeTcB9ogYVJkzwg6RN2ZgIqoq4NVswNSXoAQGWb+1bvXq3RnXXNuGR
TQmLL02poNC6nX0YbRaQoT1Xx4nfUTiKHhU+Aok9uOCMJIyYZVATR6Qafb7/j7tO
UvjwOHztbu84lcJOGmSnw4LcmwNORLuP9IwR0r+O1M3ijEZqDo9TPkvtSz8HZwjU
mxdJwhrUmN0euMcghuiFxW+1XG2eM49ugsdJugiezG2RaIijbIp0nAIvdeaKAgT1
OSSwvWCQ0fkTPyLXE+O6tVqMhlUMdqQlRcyNwmN9svIip9VnwGNq3vA4ePlJm6Fi
i0i/tLqVNlJwFokZ7blW5g8SRgGRuFfXd5XUYLFvy5Teez+/7b1mW95gPQZSJ8kV
Tbx2e0nHAPX4hCAxJ1AB3/zTlnjY+4+WJ9bD5XdgXkeVE8PPh1BEkulhMi1R1OMj
57D1W6OgsBu/Pze78wjAvwO8+NAb1T/2mv2Bd/LY6Q+7hNDqOOhuajyBTxbH41FG
sqx5bKjKOwgTybfV9A0Eo0e4FQBX07yXltBFHaPlyA4sOsIhM59+PxNrEwN1eZrQ
LVVsdBXg8pHxrw==
=EbN0
-----END PGP SIGNATURE-----
Merge tag 'powerpc-5.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
- Convert pseries & powernv to use MSI IRQ domains.
- Rework the pseries CPU numbering so that CPUs that are removed, and
later re-added, are given a CPU number on the same node as
previously, when possible.
- Add support for a new more flexible device-tree format for specifying
NUMA distances.
- Convert powerpc to GENERIC_PTDUMP.
- Retire sbc8548 and sbc8641d board support.
- Various other small features and fixes.
Thanks to Alexey Kardashevskiy, Aneesh Kumar K.V, Anton Blanchard,
Cédric Le Goater, Christophe Leroy, Emmanuel Gil Peyrot, Fabiano Rosas,
Fangrui Song, Finn Thain, Gautham R. Shenoy, Hari Bathini, Joel
Stanley, Jordan Niethe, Kajol Jain, Laurent Dufour, Leonardo Bras, Lukas
Bulwahn, Marc Zyngier, Masahiro Yamada, Michal Suchanek, Nathan
Chancellor, Nicholas Piggin, Parth Shah, Paul Gortmaker, Pratik R.
Sampat, Randy Dunlap, Sebastian Andrzej Siewior, Srikar Dronamraju, Wan
Jiabing, Xiongwei Song, and Zheng Yongjun.
* tag 'powerpc-5.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (154 commits)
powerpc/bug: Cast to unsigned long before passing to inline asm
powerpc/ptdump: Fix generic ptdump for 64-bit
KVM: PPC: Fix clearing never mapped TCEs in realmode
powerpc/pseries/iommu: Rename "direct window" to "dma window"
powerpc/pseries/iommu: Make use of DDW for indirect mapping
powerpc/pseries/iommu: Find existing DDW with given property name
powerpc/pseries/iommu: Update remove_dma_window() to accept property name
powerpc/pseries/iommu: Reorganize iommu_table_setparms*() with new helper
powerpc/pseries/iommu: Add ddw_property_create() and refactor enable_ddw()
powerpc/pseries/iommu: Allow DDW windows starting at 0x00
powerpc/pseries/iommu: Add ddw_list_new_entry() helper
powerpc/pseries/iommu: Add iommu_pseries_alloc_table() helper
powerpc/kernel/iommu: Add new iommu_table_in_use() helper
powerpc/pseries/iommu: Replace hard-coded page shift
powerpc/numa: Update cpu_cpu_map on CPU online/offline
powerpc/numa: Print debug statements only when required
powerpc/numa: convert printk to pr_xxx
powerpc/numa: Drop dbg in favour of pr_debug
powerpc/smp: Enable CACHE domain for shared processor
powerpc/smp: Update cpu_core_map on all PowerPc systems
...
for_each_node_by_type should have of_node_put() before return.
Generated by: scripts/coccinelle/iterators/for_each_child.cocci
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: kernel test robot <lkp@intel.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/alpine.DEB.2.22.394.2108031654080.17639@hadrien
All NMI contexts are handled the same as the safe context: store the
message and defer printing. There is no need to have special NMI
context tracking for this. Using in_nmi() is enough.
There are several parts of the kernel that are manually calling into
the printk NMI context tracking in order to cause general printk
deferred printing:
arch/arm/kernel/smp.c
arch/powerpc/kexec/crash.c
kernel/trace/trace.c
For arm/kernel/smp.c and powerpc/kexec/crash.c, provide a new
function pair printk_deferred_enter/exit that explicitly achieves the
same objective.
For ftrace, remove the printk context manipulation completely. It was
added in commit 03fc7f9c99 ("printk/nmi: Prevent deadlock when
accessing the main log buffer in NMI"). The purpose was to enforce
storing messages directly into the ring buffer even in NMI context.
It really should have only modified the behavior in NMI context.
There is no need for a special behavior any longer. All messages are
always stored directly now. The console deferring is handled
transparently in vprintk().
Signed-off-by: John Ogness <john.ogness@linutronix.de>
[pmladek@suse.com: Remove special handling in ftrace.c completely.
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20210715193359.25946-5-john.ogness@linutronix.de
- A big series refactoring parts of our KVM code, and converting some to C.
- Support for ARCH_HAS_SET_MEMORY, and ARCH_HAS_STRICT_MODULE_RWX on some CPUs.
- Support for the Microwatt soft-core.
- Optimisations to our interrupt return path on 64-bit.
- Support for userspace access to the NX GZIP accelerator on PowerVM on Power10.
- Enable KUAP and KUEP by default on 32-bit Book3S CPUs.
- Other smaller features, fixes & cleanups.
Thanks to: Andy Shevchenko, Aneesh Kumar K.V, Arnd Bergmann, Athira Rajeev, Baokun Li,
Benjamin Herrenschmidt, Bharata B Rao, Christophe Leroy, Daniel Axtens, Daniel Henrique
Barboza, Finn Thain, Geoff Levand, Haren Myneni, Jason Wang, Jiapeng Chong, Joel Stanley,
Jordan Niethe, Kajol Jain, Nathan Chancellor, Nathan Lynch, Naveen N. Rao, Nicholas
Piggin, Nick Desaulniers, Paul Mackerras, Russell Currey, Sathvika Vasireddy, Shaokun
Zhang, Stephen Rothwell, Sudeep Holla, Suraj Jitindar Singh, Tom Rix, Vaibhav Jain,
YueHaibing, Zhang Jianhua, Zhen Lei.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmDfFS4THG1wZUBlbGxl
cm1hbi5pZC5hdQAKCRBR6+o8yOGlgFxHEAC88NJ+Gz87LiTQFt6QjhziBaJUd+sY
uqADPRROr4P50O8PjYZbMi2qbXzOlLkZO4wJWX7jpZ1F9KmbPNqY2shD8h4ahyge
F/uqzBW1FXBJfnDEKdU2MzalkeTP+dwxLZyouUamjDCGNLFjOV4x/Fft5otOdXjO
k9uO6yoGyOkWYzjC+Y/irNPlIDDByB/+bD92Cb52Y2mXMDDEnx4JzbtkeJW+8udT
Sjn3bWzeL+dz5GehjMKwK4+SptNiyQGOgM8FwtnKUMvgzxv04DqCGjr9YC12L2Z7
VoFZc4GzVgtf8DZg4fJ3KG5aG2nH3Tui7jc9lUckdrxixDAZw5wSG7CQ39gFb/5+
7A4fEJk4Z3h5llibwxAZrC7wV8ZDDXn8oRFzRcOJjfxYaD+ohOOyWHIebwkdiXYx
nfYI7sBcScDLXeBvHtDra2GJpbFSpVL3S/QNhhi1vKVNrFSyAgbAybcVL2xPLZ6+
8Mh7A8xt+hf2bo9AXuYJDo9mwXWfg1093d0kT+AslcRhZioBk18c2AiZLIz0FzuL
Ua/e5FPb99x9LSdcZHvaAXBoHT2iTgDyCyDa3gkIesyuRX6ggHoFcVQuvdDcbJ9d
H8LK+Tahy1Y+E5b6KdtU8mDEGE+QG+CWLnwQ6YSCaL/MYgaFzNa32Jdj1fmztSBC
cttP43kHZ7ljTw==
=zo4d
-----END PGP SIGNATURE-----
Merge tag 'powerpc-5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
- A big series refactoring parts of our KVM code, and converting some
to C.
- Support for ARCH_HAS_SET_MEMORY, and ARCH_HAS_STRICT_MODULE_RWX on
some CPUs.
- Support for the Microwatt soft-core.
- Optimisations to our interrupt return path on 64-bit.
- Support for userspace access to the NX GZIP accelerator on PowerVM on
Power10.
- Enable KUAP and KUEP by default on 32-bit Book3S CPUs.
- Other smaller features, fixes & cleanups.
Thanks to: Andy Shevchenko, Aneesh Kumar K.V, Arnd Bergmann, Athira
Rajeev, Baokun Li, Benjamin Herrenschmidt, Bharata B Rao, Christophe
Leroy, Daniel Axtens, Daniel Henrique Barboza, Finn Thain, Geoff Levand,
Haren Myneni, Jason Wang, Jiapeng Chong, Joel Stanley, Jordan Niethe,
Kajol Jain, Nathan Chancellor, Nathan Lynch, Naveen N. Rao, Nicholas
Piggin, Nick Desaulniers, Paul Mackerras, Russell Currey, Sathvika
Vasireddy, Shaokun Zhang, Stephen Rothwell, Sudeep Holla, Suraj Jitindar
Singh, Tom Rix, Vaibhav Jain, YueHaibing, Zhang Jianhua, and Zhen Lei.
* tag 'powerpc-5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (218 commits)
powerpc: Only build restart_table.c for 64s
powerpc/64s: move ret_from_fork etc above __end_soft_masked
powerpc/64s/interrupt: clean up interrupt return labels
powerpc/64/interrupt: add missing kprobe annotations on interrupt exit symbols
powerpc/64: enable MSR[EE] in irq replay pt_regs
powerpc/64s/interrupt: preserve regs->softe for NMI interrupts
powerpc/64s: add a table of implicit soft-masked addresses
powerpc/64e: remove implicit soft-masking and interrupt exit restart logic
powerpc/64e: fix CONFIG_RELOCATABLE build warnings
powerpc/64s: fix hash page fault interrupt handler
powerpc/4xx: Fix setup_kuep() on SMP
powerpc/32s: Fix setup_{kuap/kuep}() on SMP
powerpc/interrupt: Use names in check_return_regs_valid()
powerpc/interrupt: Also use exit_must_hard_disable() on PPC32
powerpc/sysfs: Replace sizeof(arr)/sizeof(arr[0]) with ARRAY_SIZE
powerpc/ptrace: Refactor regs_set_return_{msr/ip}
powerpc/ptrace: Move set_return_regs_changed() before regs_set_return_{msr/ip}
powerpc/stacktrace: Fix spurious "stale" traces in raise_backtrace_ipi()
powerpc/pseries/vas: Include irqdomain.h
powerpc: mark local variables around longjmp as volatile
...
After removal of DISCINTIGMEM the NEED_MULTIPLE_NODES and NUMA
configuration options are equivalent.
Drop CONFIG_NEED_MULTIPLE_NODES and use CONFIG_NUMA instead.
Done with
$ sed -i 's/CONFIG_NEED_MULTIPLE_NODES/CONFIG_NUMA/' \
$(git grep -wl CONFIG_NEED_MULTIPLE_NODES)
$ sed -i 's/NEED_MULTIPLE_NODES/NUMA/' \
$(git grep -wl NEED_MULTIPLE_NODES)
with manual tweaks afterwards.
[rppt@linux.ibm.com: fix arm boot crash]
Link: https://lkml.kernel.org/r/YMj9vHhHOiCVN4BF@linux.ibm.com
Link: https://lkml.kernel.org/r/20210608091316.3622-9-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
gcc-11 points out that modifying local variables next to a
longjmp/setjmp may cause undefined behavior:
arch/powerpc/kexec/crash.c: In function 'crash_kexec_prepare_cpus.constprop':
arch/powerpc/kexec/crash.c:108:22: error: variable 'ncpus' might be clobbered by 'longjmp' or 'vfork' [-Werror=clobbere
d]
arch/powerpc/kexec/crash.c:109:13: error: variable 'tries' might be clobbered by 'longjmp' or 'vfork' [-Werror=clobbere
d]
arch/powerpc/xmon/xmon.c: In function 'xmon_print_symbol':
arch/powerpc/xmon/xmon.c:3625:21: error: variable 'name' might be clobbered by 'longjmp' or 'vfork' [-Werror=clobbered]
arch/powerpc/xmon/xmon.c: In function 'stop_spus':
arch/powerpc/xmon/xmon.c:4057:13: error: variable 'i' might be clobbered by 'longjmp' or 'vfork' [-Werror=clobbered]
arch/powerpc/xmon/xmon.c: In function 'restart_spus':
arch/powerpc/xmon/xmon.c:4098:13: error: variable 'i' might be clobbered by 'longjmp' or 'vfork' [-Werror=clobbered]
arch/powerpc/xmon/xmon.c: In function 'dump_opal_msglog':
arch/powerpc/xmon/xmon.c:3008:16: error: variable 'pos' might be clobbered by 'longjmp' or 'vfork' [-Werror=clobbered]
arch/powerpc/xmon/xmon.c: In function 'show_pte':
arch/powerpc/xmon/xmon.c:3207:29: error: variable 'tsk' might be clobbered by 'longjmp' or 'vfork' [-Werror=clobbered]
arch/powerpc/xmon/xmon.c: In function 'show_tasks':
arch/powerpc/xmon/xmon.c:3302:29: error: variable 'tsk' might be clobbered by 'longjmp' or 'vfork' [-Werror=clobbered]
arch/powerpc/xmon/xmon.c: In function 'xmon_core':
arch/powerpc/xmon/xmon.c:494:13: error: variable 'cmd' might be clobbered by 'longjmp' or 'vfork' [-Werror=clobbered]
arch/powerpc/xmon/xmon.c:860:21: error: variable 'bp' might be clobbered by 'longjmp' or 'vfork' [-Werror=clobbered]
arch/powerpc/xmon/xmon.c:860:21: error: variable 'bp' might be clobbered by 'longjmp' or 'vfork' [-Werror=clobbered]
arch/powerpc/xmon/xmon.c:492:48: error: argument 'fromipi' might be clobbered by 'longjmp' or 'vfork' [-Werror=clobbered]
According to the documentation, marking these as 'volatile' is
sufficient to avoid the problem, and it shuts up the warning.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210429080708.1520360-1-arnd@kernel.org
kexec_file_load() uses initial_boot_params in setting up the device tree
for the kernel to be loaded. Though initial_boot_params holds info about
CPUs at the time of boot, it doesn't account for hot added CPUs.
So, kexec'ing with kexec_file_load() syscall leaves the kexec'ed kernel
with inaccurate CPU info.
If kdump kernel is loaded with kexec_file_load() syscall and the system
crashes on a hot added CPU, the capture kernel hangs failing to identify
the boot CPU, with no output.
To avoid this from happening, extract current CPU info from of_root
device node and use it for setting up the fdt in kexec_file_load case.
Fixes: 6ecd0163d3 ("powerpc/kexec_file: Add appropriate regions for memory reserve map")
Cc: stable@vger.kernel.org # v5.9+
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Reviewed-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210429060256.199714-1-sourabhjain@linux.ibm.com
- Enable KFENCE for 32-bit.
- Implement EBPF for 32-bit.
- Convert 32-bit to do interrupt entry/exit in C.
- Convert 64-bit BookE to do interrupt entry/exit in C.
- Changes to our signal handling code to use user_access_begin/end() more extensively.
- Add support for time namespaces (CONFIG_TIME_NS)
- A series of fixes that allow us to reenable STRICT_KERNEL_RWX.
- Other smaller features, fixes & cleanups.
Thanks to: Alexey Kardashevskiy, Andreas Schwab, Andrew Donnellan, Aneesh Kumar K.V,
Athira Rajeev, Bhaskar Chowdhury, Bixuan Cui, Cédric Le Goater, Chen Huang, Chris
Packham, Christophe Leroy, Christopher M. Riedl, Colin Ian King, Dan Carpenter, Daniel
Axtens, Daniel Henrique Barboza, David Gibson, Davidlohr Bueso, Denis Efremov,
dingsenjie, Dmitry Safonov, Dominic DeMarco, Fabiano Rosas, Ganesh Goudar, Geert
Uytterhoeven, Geetika Moolchandani, Greg Kurz, Guenter Roeck, Haren Myneni, He Ying,
Jiapeng Chong, Jordan Niethe, Laurent Dufour, Lee Jones, Leonardo Bras, Li Huafei,
Madhavan Srinivasan, Mahesh Salgaonkar, Masahiro Yamada, Nathan Chancellor, Nathan
Lynch, Nicholas Piggin, Oliver O'Halloran, Paul Menzel, Pu Lehui, Randy Dunlap, Ravi
Bangoria, Rosen Penev, Russell Currey, Santosh Sivaraj, Sebastian Andrzej Siewior,
Segher Boessenkool, Shivaprasad G Bhat, Srikar Dronamraju, Stephen Rothwell, Thadeu Lima
de Souza Cascardo, Thomas Gleixner, Tony Ambardar, Tyrel Datwyler, Vaibhav Jain,
Vincenzo Frascino, Xiongwei Song, Yang Li, Yu Kuai, Zhang Yunkai.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmCLV1kTHG1wZUBlbGxl
cm1hbi5pZC5hdQAKCRBR6+o8yOGlgLUyD/4jrTolG4sVec211hYO+0VuJzoqN4Cf
j2CA2Ju39butnSMiq4LJUPRB7QRZY1OofkoNFpZeDQspjfZXPz2ulpYAz+SxHWE2
ReHPmWH1rOABlUPXFboePF4OLwmAs9eR5mN2z9HpKXbT3k78HaToLqiONyB4fVCr
Q5TkJeRn/Y7ZJLdyPLTpczHHleQ8KoM6kT7ncXnTm6p97JOBJSrGaJ5N/8X5a4+e
6jtgB7Pvw8jNDShSr8BDLBgBZZcmoTiuG8KfgwRZ+m+mKB1yI2X8S/a54w/lDi9g
UcSv3jQcFLJuW+T/pYe4R330uWDYa0cwjJOtMmsJ98S4EYOevoe9fZuL97qNshme
xtBr4q1i03G1icYOJJ8dXtvabG2rUzj8t1SCDpwYfrynzTWVRikiQYTXUBhRSFoK
nsoklvKd2IZa485XYJ2ljSyClMy8S4yJJ9RuzZ94DTXDSJUesKuyRWGnso4mhkcl
wvl4wwMTJvnCMKVo6dsJyV24QWfd6dABxzm04uPA94CKhG33UwK8252jXVeaohSb
WSO7qWBONgDXQLJ0mXRcEYa9NHvFS4Jnp6APbxnHr1gS+K+PNkD4gPBf34FoyN0E
9s27kvEYk5vr8APUclETF6+FkbGUD5bFbusjt3hYloFpAoHQ/k5pFVDsOZNPA8sW
fDIRp05KunDojw==
=dfKL
-----END PGP SIGNATURE-----
Merge tag 'powerpc-5.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
- Enable KFENCE for 32-bit.
- Implement EBPF for 32-bit.
- Convert 32-bit to do interrupt entry/exit in C.
- Convert 64-bit BookE to do interrupt entry/exit in C.
- Changes to our signal handling code to use user_access_begin/end()
more extensively.
- Add support for time namespaces (CONFIG_TIME_NS)
- A series of fixes that allow us to reenable STRICT_KERNEL_RWX.
- Other smaller features, fixes & cleanups.
Thanks to Alexey Kardashevskiy, Andreas Schwab, Andrew Donnellan, Aneesh
Kumar K.V, Athira Rajeev, Bhaskar Chowdhury, Bixuan Cui, Cédric Le
Goater, Chen Huang, Chris Packham, Christophe Leroy, Christopher M.
Riedl, Colin Ian King, Dan Carpenter, Daniel Axtens, Daniel Henrique
Barboza, David Gibson, Davidlohr Bueso, Denis Efremov, dingsenjie,
Dmitry Safonov, Dominic DeMarco, Fabiano Rosas, Ganesh Goudar, Geert
Uytterhoeven, Geetika Moolchandani, Greg Kurz, Guenter Roeck, Haren
Myneni, He Ying, Jiapeng Chong, Jordan Niethe, Laurent Dufour, Lee
Jones, Leonardo Bras, Li Huafei, Madhavan Srinivasan, Mahesh Salgaonkar,
Masahiro Yamada, Nathan Chancellor, Nathan Lynch, Nicholas Piggin,
Oliver O'Halloran, Paul Menzel, Pu Lehui, Randy Dunlap, Ravi Bangoria,
Rosen Penev, Russell Currey, Santosh Sivaraj, Sebastian Andrzej Siewior,
Segher Boessenkool, Shivaprasad G Bhat, Srikar Dronamraju, Stephen
Rothwell, Thadeu Lima de Souza Cascardo, Thomas Gleixner, Tony Ambardar,
Tyrel Datwyler, Vaibhav Jain, Vincenzo Frascino, Xiongwei Song, Yang Li,
Yu Kuai, and Zhang Yunkai.
* tag 'powerpc-5.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (302 commits)
powerpc/signal32: Fix erroneous SIGSEGV on RT signal return
powerpc: Avoid clang uninitialized warning in __get_user_size_allowed
powerpc/papr_scm: Mark nvdimm as unarmed if needed during probe
powerpc/kvm: Fix build error when PPC_MEM_KEYS/PPC_PSERIES=n
powerpc/kasan: Fix shadow start address with modules
powerpc/kernel/iommu: Use largepool as a last resort when !largealloc
powerpc/kernel/iommu: Align size for IOMMU_PAGE_SIZE() to save TCEs
powerpc/44x: fix spelling mistake in Kconfig "varients" -> "variants"
powerpc/iommu: Annotate nested lock for lockdep
powerpc/iommu: Do not immediately panic when failed IOMMU table allocation
powerpc/iommu: Allocate it_map by vmalloc
selftests/powerpc: remove unneeded semicolon
powerpc/64s: remove unneeded semicolon
powerpc/eeh: remove unneeded semicolon
powerpc/selftests: Add selftest to test concurrent perf/ptrace events
powerpc/selftests/perf-hwbreak: Add testcases for 2nd DAWR
powerpc/selftests/perf-hwbreak: Coalesce event creation code
powerpc/selftests/ptrace-hwbreak: Add testcases for 2nd DAWR
powerpc/configs: Add IBMVNIC to some 64-bit configs
selftests/powerpc: Add uaccess flush test
...
Uninitialized local variable "elf_info" would be passed to
kexec_free_elf_info() if kexec_build_elf_info() returns an error
in elf64_load().
If kexec_build_elf_info() returns an error, return the error
immediately.
Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20210421163610.23775-2-nramas@linux.microsoft.com
There are a few "goto out;" statements before the local variable "fdt"
is initialized through the call to of_kexec_alloc_and_setup_fdt() in
elf64_load(). This will result in an uninitialized "fdt" being passed
to kvfree() in this function if there is an error before the call to
of_kexec_alloc_and_setup_fdt().
If there is any error after fdt is allocated, but before it is
saved in the arch specific kimage struct, free the fdt.
Fixes: 3c985d31ad ("powerpc: Use common of_kexec_alloc_and_setup_fdt()")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20210421163610.23775-1-nramas@linux.microsoft.com
Define macros to list ppc interrupt types in interttupt.h, replace the
reference of the trap hex values with these macros.
Referred the hex numbers in arch/powerpc/kernel/exceptions-64e.S,
arch/powerpc/kernel/exceptions-64s.S, arch/powerpc/kernel/head_*.S,
arch/powerpc/kernel/head_booke.h and arch/powerpc/include/asm/kvm_asm.h.
Signed-off-by: Xiongwei Song <sxwjean@gmail.com>
[mpe: Resolve conflicts in nmi_disables_ftrace(), fix 40x build]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1618398033-13025-1-git-send-email-sxwjean@me.com
kexec_fdt_totalsize_ppc64() includes the base FDT size in its size
calculation, but commit 3c985d31ad ("powerpc: Use common
of_kexec_alloc_and_setup_fdt()") changed the kexec code to use the
generic function of_kexec_alloc_and_setup_fdt() which already includes
the base FDT size. That change made the code overestimate the size a bit
by counting twice the space required for the kernel command line and
/chosen properties.
Therefore change kexec_fdt_totalsize_ppc64() to calculate just the extra
space needed by the kdump kernel, and change the function name so that it
better reflects what the function is now doing.
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Reviewed-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Reviewed-by: Hari Bathini <hbathini@linux.ibm.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
[robh: reword commit msg as no longer a fix from merging to branches]
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20210220005204.1417200-1-bauerman@linux.ibm.com
delete_fdt_mem_rsv() defined in "arch/powerpc/kexec/file_load.c"
has been renamed to fdt_find_and_del_mem_rsv(), and moved to
"drivers/of/kexec.c".
Remove delete_fdt_mem_rsv() in "arch/powerpc/kexec/file_load.c".
Co-developed-by: Prakhar Srivastava <prsriva@linux.microsoft.com>
Signed-off-by: Prakhar Srivastava <prsriva@linux.microsoft.com>
Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20210221174930.27324-13-nramas@linux.microsoft.com
The functions defined in "arch/powerpc/kexec/ima.c" handle setting up
and freeing the resources required to carry over the IMA measurement
list from the current kernel to the next kernel across kexec system call.
These functions do not have architecture specific code, but are
currently limited to powerpc.
Move remove_ima_buffer() and setup_ima_buffer() calls into
of_kexec_alloc_and_setup_fdt() defined in "drivers/of/kexec.c".
Move the remaining architecture independent functions from
"arch/powerpc/kexec/ima.c" to "drivers/of/kexec.c".
Delete "arch/powerpc/kexec/ima.c" and "arch/powerpc/include/asm/ima.h".
Remove references to the deleted files and functions in powerpc and
in ima.
Co-developed-by: Prakhar Srivastava <prsriva@linux.microsoft.com>
Signed-off-by: Prakhar Srivastava <prsriva@linux.microsoft.com>
Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Tested-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20210221174930.27324-11-nramas@linux.microsoft.com
The fields ima_buffer_addr and ima_buffer_size in "struct kimage_arch"
for powerpc are used to carry forward the IMA measurement list across
kexec system call. These fields are not architecture specific, but are
currently limited to powerpc.
arch_ima_add_kexec_buffer() defined in "arch/powerpc/kexec/ima.c"
sets ima_buffer_addr and ima_buffer_size for the kexec system call.
This function does not have architecture specific code, but is
currently limited to powerpc.
Move ima_buffer_addr and ima_buffer_size to "struct kimage".
Set ima_buffer_addr and ima_buffer_size in ima_add_kexec_buffer()
in security/integrity/ima/ima_kexec.c.
Co-developed-by: Prakhar Srivastava <prsriva@linux.microsoft.com>
Signed-off-by: Prakhar Srivastava <prsriva@linux.microsoft.com>
Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Suggested-by: Will Deacon <will@kernel.org>
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20210221174930.27324-9-nramas@linux.microsoft.com
The code for setting up the /chosen node in the device tree
and updating the memory reservation for the next kernel has been
moved to of_kexec_alloc_and_setup_fdt() defined in "drivers/of/kexec.c".
Use the common of_kexec_alloc_and_setup_fdt() to setup the device tree
and update the memory reservation for kexec for powerpc.
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20210221174930.27324-8-nramas@linux.microsoft.com
ELF related fields elf_headers, elf_headers_sz, and elfcorehdr_addr
have been moved from 'struct kimage_arch' to 'struct kimage' as
elf_headers, elf_headers_sz, and elf_load_addr respectively.
Use the ELF fields defined in 'struct kimage'.
Suggested-by: Rob Herring <robh@kernel.org>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20210221174930.27324-4-nramas@linux.microsoft.com
On systems with large amount of memory, loading kdump kernel through
kexec_file_load syscall may fail with the below error:
"Failed to update fdt with linux,drconf-usable-memory property"
This happens because the size estimation for kdump kernel's FDT does
not account for the additional space needed to setup usable memory
properties. Fix it by accounting for the space needed to include
linux,usable-memory & linux,drconf-usable-memory properties while
estimating kdump kernel's FDT size.
Fixes: 6ecd0163d3 ("powerpc/kexec_file: Add appropriate regions for memory reserve map")
Cc: stable@vger.kernel.org # v5.9+
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/161243826811.119001.14083048209224609814.stgit@hbathini
There are several occurrences of the following pattern:
for_each_memblock(memory, reg) {
start = __pfn_to_phys(memblock_region_memory_base_pfn(reg);
end = __pfn_to_phys(memblock_region_memory_end_pfn(reg));
/* do something with start and end */
}
Using for_each_mem_range() iterator is more appropriate in such cases and
allows simpler and cleaner code.
[akpm@linux-foundation.org: fix arch/arm/mm/pmsa-v7.c build]
[rppt@linux.ibm.com: mips: fix cavium-octeon build caused by memblock refactoring]
Link: http://lkml.kernel.org/r/20200827124549.GD167163@linux.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Emil Renner Berthing <kernel@esmil.dk>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: https://lkml.kernel.org/r/20200818151634.14343-13-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently for_each_mem_range() and for_each_mem_range_rev() iterators are
the most generic way to traverse memblock regions. As such, they have 8
parameters and they are hardly convenient to users. Most users choose to
utilize one of their wrappers and the only user that actually needs most
of the parameters is memblock itself.
To avoid yet another naming for memblock iterators, rename the existing
for_each_mem_range[_rev]() to __for_each_mem_range[_rev]() and add a new
for_each_mem_range[_rev]() wrappers with only index, start and end
parameters.
The new wrapper nicely fits into init_unavailable_mem() and will be used
in upcoming changes to simplify memblock traversals.
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> [MIPS]
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Emil Renner Berthing <kernel@esmil.dk>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: https://lkml.kernel.org/r/20200818151634.14343-11-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kernels built with CONFIG_PPC_EARLY_DEBUG_OPAL enabled expects r8 & r9
to be filled with OPAL base & entry addresses respectively. Setting
these registers allows the kernel to perform OPAL calls before the
device tree is parsed.
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/159602303975.575379.5032301944162937479.stgit@hbathini
The kexec purgatory has to run in real mode. Only the first memory
block maybe accessible in real mode. And, unlike the case with panic
kernel, no memory is set aside for regular kexec load. Another thing
to note is, the memory for crashkernel is reserved at an offset of
128MB. So, when crashkernel memory is reserved, the memory ranges to
load kexec segments shrink further as the generic code only looks for
memblock free memory ranges and in all likelihood only a tiny bit of
memory from 0 to 128MB would be available to load kexec segments.
With kdump being used by default in general, kexec file load is likely
to fail almost always. This can be fixed by changing the memory hole
lookup logic for regular kexec to use the same method as kdump. This
would mean that most kexec segments will overlap with crashkernel
memory region. That should still be ok as the pages, whose destination
address isn't available while loading, are placed in an intermediate
location till a flush to the actual destination address happens during
kexec boot sequence.
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Tested-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/159602302326.575379.14038896654942043093.stgit@hbathini
While initrd, elfcorehdr and backup regions are already added to the
reserve map, there are a few missing regions that need to be added to
the memory reserve map. Add them here. And now that all the changes to
load panic kernel are in place, claim likewise.
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Tested-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/159602300473.575379.4218568032039284448.stgit@hbathini
Prepare elf headers for the crashing kernel's core file using
crash_prepare_elf64_headers() and pass on this info to kdump kernel by
updating its command line with elfcorehdr parameter. Also, add
elfcorehdr location to reserve map to avoid it from being stomped on
while booting.
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Tested-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
[mpe: Ensure cmdline is nul terminated]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/159602298855.575379.15819225623219909517.stgit@hbathini
Though kdump kernel boots from loaded address, the first 64KB of it is
copied down to real 0. So, setup a backup region and let purgatory
copy the first 64KB of crashed kernel into this backup region before
booting into kdump kernel. Update reserve map with backup region and
crashed kernel's memory to avoid kdump kernel from accidentially using
that memory.
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/159602294718.575379.16216507537038008623.stgit@hbathini
Kdump kernel, used for capturing the kernel core image, is supposed
to use only specific memory regions to avoid corrupting the image to
be captured. The regions are crashkernel range - the memory reserved
explicitly for kdump kernel, memory used for the tce-table, the OPAL
region and RTAS region as applicable. Restrict kdump kernel memory
to use only these regions by setting up usable-memory DT property.
Also, tell the kdump kernel to run at the loaded address by setting
the magic word at 0x5c.
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Tested-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/159602284284.575379.6962016255404325493.stgit@hbathini
crashkernel region could have an overlap with special memory regions
like OPAL, RTAS, TCE table & such. These regions are referred to as
excluded memory ranges. Setup these ranges during image probe in order
to avoid them while finding the buffer for different kdump segments.
Override arch_kexec_locate_mem_hole() to locate a memory hole taking
these ranges into account.
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/159602281047.575379.6636807148335160795.stgit@hbathini
In kexec case, the kernel to be loaded uses the same memory layout as
the running kernel. So, passing on the DT of the running kernel would
be good enough.
But in case of kdump, different memory ranges are needed to manage
loading the kdump kernel, booting into it and exporting the elfcore of
the crashing kernel. The ranges are exclude memory ranges, usable
memory ranges, reserved memory ranges and crash memory ranges.
Exclude memory ranges specify the list of memory ranges to avoid while
loading kdump segments. Usable memory ranges list the memory ranges
that could be used for booting kdump kernel. Reserved memory ranges
list the memory regions for the loading kernel's reserve map. Crash
memory ranges list the memory ranges to be exported as the crashing
kernel's elfcore.
Add helper functions for setting up the above mentioned memory ranges.
This helpers facilitate in understanding the subsequent changes better
and make it easy to setup the different memory ranges listed above, as
and when appropriate.
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Tested-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/159602279194.575379.8526552316948643550.stgit@hbathini
Some of the kexec_file_load code isn't PPC64 specific. Move PPC64
specific code from kexec/file_load.c to kexec/file_load_64.c. Also,
rename purgatory/trampoline.S to purgatory/trampoline_64.S in the same
spirit. No functional changes.
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Tested-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com>
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/159602276920.575379.10390965946438306388.stgit@hbathini
As we kexec across kernels that use AMR/IAMR for different purposes
we need to ensure that new kernels get kexec'd with a reset value
of AMR/IAMR. For ex: the new kernel can use key 0 for kernel mapping and the old
AMR value prevents access to key 0.
This patch also removes reset if IAMR and AMOR in kexec_sequence. Reset of AMOR
is not needed and the IAMR reset is partial (it doesn't do the reset
on secondary cpus) and is redundant with this patch.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200709032946.881753-19-aneesh.kumar@linux.ibm.com
Currently, if printk lock (logbuf_lock) is held by other thread during
crash, there is a chance of deadlocking the crash on next printk, and
blocking a possibly desired kdump.
At the start of default_machine_crash_shutdown, make printk enter
NMI context, as it will use per-cpu buffers to store the message,
and avoid locking logbuf_lock.
Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Leonardo Bras <leobras.c@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200512214533.93878-1-leobras.c@gmail.com