Following the pattern of identity domains, just assign the BLOCKED domain
global statics to a value in ops. Update the core code to use the global
static directly.
Update powerpc to use the new scheme and remove its empty domain_alloc
callback.
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Sven Peter <sven@svenpeter.dev>
Link: https://lore.kernel.org/r/1-v2-bff223cf6409+282-dart_paging_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
While powerpc doesn't use the seq_buf readpos, it did explicitly
initialise it for no good reason.
Link: https://lore.kernel.org/linux-trace-kernel/20231024145600.739451-1-willy@infradead.org
Cc: Christoph Hellwig <hch@lst.de>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Kees Cook <keescook@chromium.org>
Fixes: d0ed46b603 ("tracing: Move readpos from seq_buf to trace_seq")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Erhard reported that his G5 was crashing with v6.6-rc kernels:
mpic: Setting up HT PICs workarounds for U3/U4
BUG: Unable to handle kernel data access at 0xfeffbb62ffec65fe
Faulting instruction address: 0xc00000000005dc40
Oops: Kernel access of bad area, sig: 11 [#1]
BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G T 6.6.0-rc3-PMacGS #1
Hardware name: PowerMac11,2 PPC970MP 0x440101 PowerMac
NIP: c00000000005dc40 LR: c000000000066660 CTR: c000000000007730
REGS: c0000000022bf510 TRAP: 0380 Tainted: G T (6.6.0-rc3-PMacGS)
MSR: 9000000000001032 <SF,HV,ME,IR,DR,RI> CR: 44004242 XER: 00000000
IRQMASK: 3
GPR00: 0000000000000000 c0000000022bf7b0 c0000000010c0b00 00000000000001ac
GPR04: 0000000003c80000 0000000000000300 c0000000f20001ae 0000000000000300
GPR08: 0000000000000006 feffbb62ffec65ff 0000000000000001 0000000000000000
GPR12: 9000000000001032 c000000002362000 c000000000f76b80 000000000349ecd8
GPR16: 0000000002367ba8 0000000002367f08 0000000000000006 0000000000000000
GPR20: 00000000000001ac c000000000f6f920 c0000000022cd985 000000000000000c
GPR24: 0000000000000300 00000003b0a3691d c0003e008030000e 0000000000000000
GPR28: c00000000000000c c0000000f20001ee feffbb62ffec65fe 00000000000001ac
NIP hash_page_do_lazy_icache+0x50/0x100
LR __hash_page_4K+0x420/0x590
Call Trace:
hash_page_mm+0x364/0x6f0
do_hash_fault+0x114/0x2b0
data_access_common_virt+0x198/0x1f0
--- interrupt: 300 at mpic_init+0x4bc/0x10c4
NIP: c000000002020a5c LR: c000000002020a04 CTR: 0000000000000000
REGS: c0000000022bf9f0 TRAP: 0300 Tainted: G T (6.6.0-rc3-PMacGS)
MSR: 9000000000001032 <SF,HV,ME,IR,DR,RI> CR: 24004248 XER: 00000000
DAR: c0003e008030000e DSISR: 40000000 IRQMASK: 1
...
NIP mpic_init+0x4bc/0x10c4
LR mpic_init+0x464/0x10c4
--- interrupt: 300
pmac_setup_one_mpic+0x258/0x2dc
pmac_pic_init+0x28c/0x3d8
init_IRQ+0x90/0x140
start_kernel+0x57c/0x78c
start_here_common+0x1c/0x20
A bisect pointed to the breakage beginning with commit 9fee28baa6 ("powerpc:
implement the new page table range API").
Analysis of the oops pointed to a struct page with a corrupted
compound_head being loaded via page_folio() -> _compound_head() in
hash_page_do_lazy_icache().
The access by the mpic code is to an MMIO address, so the expectation
is that the struct page for that address would be initialised by
init_unavailable_range(), as pointed out by Aneesh.
Instrumentation showed that was not the case, which eventually lead to
the realisation that pfn_valid() was returning false for that address,
causing the struct page to not be initialised.
Because the system is using FLATMEM, the version of pfn_valid() in
memory_model.h is used:
static inline int pfn_valid(unsigned long pfn)
{
...
return pfn >= pfn_offset && (pfn - pfn_offset) < max_mapnr;
}
Which relies on max_mapnr being initialised. Early in boot max_mapnr is
zero meaning no PFNs are valid.
max_mapnr is initialised in mem_init() called via:
start_kernel()
mm_core_init() # init/main.c:928
mem_init()
But that is too late for the usage in init_unavailable_range() called via:
start_kernel()
setup_arch() # init/main.c:893
paging_init()
free_area_init()
init_unavailable_range()
Although max_mapnr is currently set in mem_init(), the value is actually
already available much earlier, as soon as mem_topology_setup() has
completed, which is also before paging_init() is called. So move the
initialisation there, which causes paging_init() to correctly initialise
the struct page and fixes the bug.
This bug seems to have been lurking for years, but went unnoticed
because the pre-folio code was inspecting the uninitialised page->flags
but not dereferencing it.
Thanks to Erhard and Aneesh for help debugging.
Reported-by: Erhard Furtner <erhard_f@mailbox.org>
Closes: https://lore.kernel.org/all/20230929132750.3cd98452@yea/
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231023112500.1550208-1-mpe@ellerman.id.au
A thread started via eg. user_mode_thread() runs in the kernel to begin
with and then may later return to userspace. While it's running in the
kernel it has a pt_regs at the base of its kernel stack, but that
pt_regs is all zeroes.
If the thread oopses in that state, it leads to an ugly stack trace with
a big block of zero GPRs, as reported by Joel:
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.5.0-rc7-00004-gf7757129e3de-dirty #3
Hardware name: IBM PowerNV (emulated by qemu) POWER9 0x4e1200 opal:v7.0 PowerNV
Call Trace:
[c0000000036afb00] [c0000000010dd058] dump_stack_lvl+0x6c/0x9c (unreliable)
[c0000000036afb30] [c00000000013c524] panic+0x178/0x424
[c0000000036afbd0] [c000000002005100] mount_root_generic+0x250/0x324
[c0000000036afca0] [c0000000020057d0] prepare_namespace+0x2d4/0x344
[c0000000036afd20] [c0000000020049c0] kernel_init_freeable+0x358/0x3ac
[c0000000036afdf0] [c0000000000111b0] kernel_init+0x30/0x1a0
[c0000000036afe50] [c00000000000debc] ret_from_kernel_user_thread+0x14/0x1c
--- interrupt: 0 at 0x0
NIP: 0000000000000000 LR: 0000000000000000 CTR: 0000000000000000
REGS: c0000000036afe80 TRAP: 0000 Not tainted (6.5.0-rc7-00004-gf7757129e3de-dirty)
MSR: 0000000000000000 <> CR: 00000000 XER: 00000000
CFAR: 0000000000000000 IRQMASK: 0
GPR00: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR12: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR28: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
NIP [0000000000000000] 0x0
LR [0000000000000000] 0x0
--- interrupt: 0
The all-zero pt_regs looks ugly and conveys no useful information, other
than its presence. So detect that case and just show the presence of the
frame by printing the interrupt marker, eg:
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.5.0-rc3-00126-g18e9506562a0-dirty #301
Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1202 0xf000005 of:SLOF,HEAD hv:linux,kvm pSeries
Call Trace:
[c000000003aabb00] [c000000001143db8] dump_stack_lvl+0x6c/0x9c (unreliable)
[c000000003aabb30] [c00000000014c624] panic+0x178/0x424
[c000000003aabbd0] [c0000000020050fc] mount_root_generic+0x250/0x324
[c000000003aabca0] [c0000000020057cc] prepare_namespace+0x2d4/0x344
[c000000003aabd20] [c0000000020049bc] kernel_init_freeable+0x358/0x3ac
[c000000003aabdf0] [c0000000000111b0] kernel_init+0x30/0x1a0
[c000000003aabe50] [c00000000000debc] ret_from_kernel_user_thread+0x14/0x1c
--- interrupt: 0 at 0x0
To avoid ever suppressing a valid pt_regs make sure the pt_regs has a
zero MSR and TRAP value, and is located at the very base of the stack.
Fixes: 6895dfc047 ("powerpc: copy_thread fill in interrupt frame marker and back chain")
Reported-by: Joel Stanley <joel@jms.id.au>
Reported-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230824064210.907266-1-mpe@ellerman.id.au
Sparse reports a warning when casting to an int. There is no need to
cast in the first place, so drop them.
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231011053711.93427-12-bgray@linux.ibm.com
Sparse reports dereferencing an __iomem pointer. These routines
are clearly low level handlers for IO memory, so force cast away
the __iomem annotation to tell sparse the dereferences are safe.
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231011053711.93427-11-bgray@linux.ibm.com
Sparse reports several endianness warnings on variables and functions
that are consistently treated as big endian. There are no
multi-endianness shenanigans going on here so fix these low hanging
fruit up in one patch.
All changes are just type annotations; no endianness switching
operations are introduced by this patch.
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231011053711.93427-7-bgray@linux.ibm.com
Sparse reports several function implementations annotated with extern.
This is clearly incorrect, likely just copied from an actual extern
declaration in another file.
Fix the sparse warnings by removing extern.
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231011053711.93427-6-bgray@linux.ibm.com
Sparse reports several uses of 0 for pointer arguments and comparisons.
Replace with NULL to better convey the intent. Remove entirely if a
comparison to follow the kernel style of implicit boolean conversions.
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231011053711.93427-5-bgray@linux.ibm.com
On 603 MMU, TLB missed are handled by SW and there are separated
DTLB and ITLB. It is therefore possible to implement execute-only
protection by not loading DTLB when read access is not permitted.
To do that, _PAGE_READ flag is needed but there is no bit available
for it in PTE. On the other hand the only real use of _PAGE_USER is
to implement PAGE_NONE by clearing _PAGE_USER.
As _PAGE_NONE can also be implemented by clearing _PAGE_READ, remove
_PAGE_USER and add _PAGE_READ. Then use the virtual address to know
whether user rights or kernel rights are to be used.
With that change, 603 MMU now honors execute-only protection.
For hash (604) MMU it is more tricky because hash table is common to
load/store and execute. Nevertheless it is still possible to check
whether _PAGE_READ is set before loading hash table for a load/store
access. At least it can't be read unless it is executed first.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/b7702dd5a041ec59055ed2880f4952e94c087a2e.1695659959.git.christophe.leroy@csgroup.eu
Several places, _PAGE_RW maps to write permission and don't
always imply read. To make it more clear, do as book3s/64 in
commit c7d54842de ("powerpc/mm: Use _PAGE_READ to indicate
Read access") and use _PAGE_WRITE when more relevant.
For the time being _PAGE_WRITE is equivalent to _PAGE_RW but that
will change when _PAGE_READ gets added in following patches.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/5798782869fe4d2698f104948dabd17657b89395.1695659959.git.christophe.leroy@csgroup.eu
_PAGE_USER is used to select the zone. Today zone 0 is kernel
and zone 1 is user.
To implement _PAGE_NONE, _PAGE_USER is cleared, leading to no access
for user but kernel still has access to the page so it's possible for
a user application to write in that page by using a kernel function
as trampoline.
What is really wanted is to have user rights on pages below TASK_SIZE
and no user rights on pages above TASK_SIZE. Use zones for that.
There are 16 zones so lets use the 4 upper address bits to set the
zone and declare zone rights based on TASK_SIZE.
Then drop _PAGE_USER and reuse it as _PAGE_READ that will be checked
in Data TLB miss handler. That will properly handle PAGE_NONE for
both kernel and user.
In addition, it partially implements execute-only right. The
implementation won't be complete because once a TLB has been loaded
via the Instruction TLB miss handler, it will be possible to read
the page. But at least it can't be read unless it is executed first.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/2a13e3ba8a5dec43143cc1f9a91ec71ea1529f3c.1695659959.git.christophe.leroy@csgroup.eu
44x MMU has 6 page protection bits:
- R, W, X for supervisor
- R, W, X for user
It means that it can support X without R.
To do that, _PAGE_READ flag is needed but there is no bit available
for it in PTE. On the other hand the only real use of _PAGE_USER is
to implement PAGE_NONE by clearing _PAGE_USER.
As _PAGE_NONE can also be implemented by clearing _PAGE_READ,
remove _PAGE_USER and add _PAGE_READ. In order to insert bits in
one go during TLB miss, move _PAGE_ACCESSED and put _PAGE_READ
just after _PAGE_DIRTY so that _PAGE_DIRTY is copied into SW and
_PAGE_READ into SR at once.
With that change, 44x now also honors execute-only protection.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/043e17987b260b99b45094138c6cb2e89e63d499.1695659959.git.christophe.leroy@csgroup.eu
e500 MMU has 6 page protection bits:
- R, W, X for supervisor
- R, W, X for user
It means that it can support X without R.
To do that, _PAGE_READ flag is needed.
With 32 bits PTE there is no bit available for it in PTE. On the
other hand the only real use of _PAGE_USER is to implement PAGE_NONE
by clearing _PAGE_USER. As _PAGE_NONE can also be implemented by
clearing _PAGE_READ, remove _PAGE_USER and add _PAGE_READ. Move
_PAGE_PRESENT into bit 30 so that _PAGE_READ can match SR bit.
With 64 bits PTE _PAGE_USER is already the combination of SR and UR
so all we need to do is to rename it _PAGE_READ.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/0849ab6bf7ae2af23f94b0457fa40d0ea3983fe4.1695659959.git.christophe.leroy@csgroup.eu
Several places, _PAGE_RW maps to write permission and don't
always imply read. To make it more clear, do as book3s/64 in
commit c7d54842de ("powerpc/mm: Use _PAGE_READ to indicate
Read access") and use _PAGE_WRITE when more relevant.
For the time being _PAGE_WRITE is equivalent to _PAGE_RW but that
will change when _PAGE_READ gets added in following patches.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/1f79b88db54d030ada776dc9845e0e88345bfc28.1695659959.git.christophe.leroy@csgroup.eu
I noticed that commit 0db5b61e0d ("fbdev/vga16fb: Create
EGA/VGA devices in sysfb code") broke vga16fb on non-x86 platforms,
because the sysfb code never creates a vga-framebuffer device when
screen_info.orig_video_isVGA is set to '1' instead of VIDEO_TYPE_VGAC.
However, it turns out that the only architecture that has allowed
building vga16fb in the past 20 years is powerpc, and this only worked
on two 32-bit platforms and never on 64-bit powerpc. The last machine
that actually used this was removed in linux-3.10, so this is all dead
code and can be removed.
The big-endian support in vga16fb.c could also be removed, but I'd just
leave this in place.
Fixes: 933ee7119f ("powerpc: remove PReP platform")
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20231009211845.3136536-8-arnd@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
While reworking the x86 topology code Thomas tripped over creating a 'DIE' domain
for the package mask. :-)
Since these names are CONFIG_SCHED_DEBUG=y only, rename them to make the
name less ambiguous.
[ Shrikanth Hegde: rename on s390 as well. ]
[ Valentin Schneider: also rename it in the comments. ]
[ mingo: port to recent kernels & find all remaining occurances. ]
Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Valentin Schneider <vschneid@redhat.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Acked-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20230712141056.GI3100107@hirez.programming.kicks-ass.net
Eddie reported that newer kernels were crashing during boot on his 476
FSP2 system:
kernel tried to execute user page (b7ee2000) - exploit attempt? (uid: 0)
BUG: Unable to handle kernel instruction fetch
Faulting instruction address: 0xb7ee2000
Oops: Kernel access of bad area, sig: 11 [#1]
BE PAGE_SIZE=4K FSP-2
Modules linked in:
CPU: 0 PID: 61 Comm: mount Not tainted 6.1.55-d23900f.ppcnf-fsp2 #1
Hardware name: ibm,fsp2 476fpe 0x7ff520c0 FSP-2
NIP: b7ee2000 LR: 8c008000 CTR: 00000000
REGS: bffebd83 TRAP: 0400 Not tainted (6.1.55-d23900f.ppcnf-fs p2)
MSR: 00000030 <IR,DR> CR: 00001000 XER: 20000000
GPR00: c00110ac bffebe63 bffebe7e bffebe88 8c008000 00001000 00000d12 b7ee2000
GPR08: 00000033 00000000 00000000 c139df10 48224824 1016c314 10160000 00000000
GPR16: 10160000 10160000 00000008 00000000 10160000 00000000 10160000 1017f5b0
GPR24: 1017fa50 1017f4f0 1017fa50 1017f740 1017f630 00000000 00000000 1017f4f0
NIP [b7ee2000] 0xb7ee2000
LR [8c008000] 0x8c008000
Call Trace:
Instruction dump:
XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
---[ end trace 0000000000000000 ]---
The problem is in ret_from_syscall where the check for
icache_44x_need_flush is done. When the flush is needed the code jumps
out-of-line to do the flush, and then intends to jump back to continue
the syscall return.
However the branch back to label 1b doesn't return to the correct
location, instead branching back just prior to the return to userspace,
causing bogus register values to be used by the rfi.
The breakage was introduced by commit 6f76a01173
("powerpc/syscall: implement system call entry/exit logic in C for PPC32") which
inadvertently removed the "1" label and reused it elsewhere.
Fix it by adding named local labels in the correct locations. Note that
the return label needs to be outside the ifdef so that CONFIG_PPC_47x=n
compiles.
Fixes: 6f76a01173 ("powerpc/syscall: implement system call entry/exit logic in C for PPC32")
Cc: stable@vger.kernel.org # v5.12+
Reported-by: Eddie James <eajames@linux.ibm.com>
Tested-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/linuxppc-dev/fdaadc46-7476-9237-e104-1d2168526e72@linux.ibm.com/
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Link: https://msgid.link/20231010114750.847794-1-mpe@ellerman.id.au
This commit comes at the tail end of a greater effort to remove the
empty elements at the end of the ctl_table arrays (sentinels) which
will reduce the overall build time size of the kernel and run time
memory bloat by ~64 bytes per sentinel (further information Link :
https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo@bombadil.infradead.org/)
Remove sentinel from powersave_nap_ctl_table and nmi_wd_lpm_factor_ctl_table.
This removal is safe because register_sysctl implicitly uses ARRAY_SIZE()
in addition to checking for the sentinel.
Signed-off-by: Joel Granados <j.granados@samsung.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
and fix all in-tree references.
Architecture-specific documentation is being moved into Documentation/arch/
as a way of cleaning up the top-level documentation directory and making
the docs hierarchy more closely match the source hierarchy.
Signed-off-by: Costa Shulyupin <costa.shul@redhat.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20230826165737.2101199-1-costa.shul@redhat.com
Clang 17 reports:
arch/powerpc/kernel/traps.c:1167:19: error: unused function '__parse_fpscr' [-Werror,-Wunused-function]
__parse_fpscr() is called from two sites. First call is guarded
by #ifdef CONFIG_PPC_FPU_REGS
Second call is guarded by CONFIG_MATH_EMULATION which selects
CONFIG_PPC_FPU_REGS.
So only define __parse_fpscr() when CONFIG_PPC_FPU_REGS is defined.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202309210327.WkqSd5Bq-lkp@intel.com/
Fixes: b6254ced4d ("powerpc/signal: Don't manage floating point regs when no FPU")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/5de2998c57f3983563b27b39228ea9a7229d4110.1695385984.git.christophe.leroy@csgroup.eu
commit c35559f94e ("x86/shstk: Introduce map_shadow_stack syscall")
recently added support for map_shadow_stack() but it is limited to x86
only for now. There is a possibility that other architectures (namely,
arm64 and RISC-V), that are implementing equivalent support for shadow
stacks, might need to add support for it.
Independent of that, reserving arch-specific syscall numbers in the
syscall tables of all architectures is good practice and would help
avoid future conflicts. map_shadow_stack() is marked as a conditional
syscall in sys_ni.c. Adding it to the syscall tables of other
architectures is harmless and would return ENOSYS when exercised.
Note, map_shadow_stack() was assigned #453 during the merge process
since #452 was taken by fchmodat2().
For Powerpc, map it to sys_ni_syscall() as is the norm for Powerpc
syscall tables.
For Alpha, map_shadow_stack() takes up #563 as Alpha still diverges from
the common syscall numbering system in the other architectures.
Link: https://lore.kernel.org/lkml/20230515212255.GA562920@debug.ba.rivosinc.com/
Link: https://lore.kernel.org/lkml/b402b80b-a7c6-4ef0-b977-c0f5f582b78a@sirena.org.uk/
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
rcu_report_dead() and rcutree_migrate_callbacks() have their headers in
rcupdate.h while those are pure rcutree calls, like the other CPU-hotplug
functions.
Also rcu_cpu_starting() and rcu_report_dead() have different naming
conventions while they mirror each other's effects.
Fix the headers and propose a naming that relates both functions and
aligns with the prefix of other rcutree CPU-hotplug functions.
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Add two parameters 'low_size' and 'high' to function parse_crashkernel(),
later crashkernel=,high|low parsing will be added. Make adjustments in
all call sites of parse_crashkernel() in arch.
Link: https://lkml.kernel.org/r/20230914033142.676708-3-bhe@redhat.com
Signed-off-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen Jiahao <chenjiahao16@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
commit 'be65de6b03aa ("fs: Remove dcookies support")' removed the
syscall definition for lookup_dcookie. However, syscall tables still
point to the old sys_lookup_dcookie() definition. Update syscall tables
of all architectures to directly point to sys_ni_syscall() instead.
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Namhyung Kim <namhyung@kernel.org> # for perf
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
PowerPC has a 'btext' font used for the console which is almost identical
to the shared font_sun8x16, so use it rather than duplicating the data.
They were actually identical until about a decade ago when
commit bcfbeecea1 ("drivers: console: font_: Change a glyph from
"broken bar" to "vertical line"")
which changed the | in the shared font to be a solid
bar rather than a broken bar. That's the only difference.
This was originally spotted by the PMF source code analyser, which
noticed that sparc does the same thing with the same data, and they
also share a bunch of functions to manipulate the data. I've previously
posted a near identical patch for sparc.
Tested very lightly with a boot without FS in qemu.
Signed-off-by: "Dr. David Alan Gilbert" <linux@treblig.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230825142754.1487900-1-linux@treblig.org
POWER is using the set_platform_dma_ops() callback to hook up its private
dma_ops, but this is buired under some indirection and is weirdly
happening for a BLOCKED domain as well.
For better documentation create a PLATFORM domain to manage the dma_ops,
since that is what it is for, and make the BLOCKED domain an alias for
it. BLOCKED is required for VFIO.
Also removes the leaky allocation of the BLOCKED domain by using a global
static.
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/3-v8-81230027b2fa+9d-iommu_all_defdom_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The changes to copy_thread() made in commit eed7c420aa ("powerpc:
copy_thread differentiate kthreads and user mode threads") inadvertently
broke arch_stack_walk_reliable() because it has knowledge of the stack
layout.
Fix it by changing the condition to match the new logic in
copy_thread(). The changes make the comments about the stack layout
incorrect, rather than rephrasing them just refer the reader to
copy_thread().
Also the comment about the stack backchain is no longer true, since
commit edbd0387f3 ("powerpc: copy_thread add a back chain to the
switch stack frame"), so remove that as well.
Fixes: eed7c420aa ("powerpc: copy_thread differentiate kthreads and user mode threads")
Reported-by: Joe Lawrence <joe.lawrence@redhat.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230921232441.1181843-1-mpe@ellerman.id.au
Finish off the 'simple' futex2 syscall group by adding
sys_futex_requeue(). Unlike sys_futex_{wait,wake}() its arguments are
too numerous to fit into a regular syscall. As such, use struct
futex_waitv to pass the 'source' and 'destination' futexes to the
syscall.
This syscall implements what was previously known as FUTEX_CMP_REQUEUE
and uses {val, uaddr, flags} for source and {uaddr, flags} for
destination.
This design explicitly allows requeueing between different types of
futex by having a different flags word per uaddr.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Link: https://lore.kernel.org/r/20230921105248.511860556@noisy.programming.kicks-ass.net
To complement sys_futex_waitv()/wake(), add sys_futex_wait(). This
syscall implements what was previously known as FUTEX_WAIT_BITSET
except it uses 'unsigned long' for the value and bitmask arguments,
takes timespec and clockid_t arguments for the absolute timeout and
uses FUTEX2 flags.
The 'unsigned long' allows FUTEX2_SIZE_U64 on 64bit platforms.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Link: https://lore.kernel.org/r/20230921105248.164324363@noisy.programming.kicks-ass.net
To complement sys_futex_waitv() add sys_futex_wake(). This syscall
implements what was previously known as FUTEX_WAKE_BITSET except it
uses 'unsigned long' for the bitmask and takes FUTEX2 flags.
The 'unsigned long' allows FUTEX2_SIZE_U64 on 64bit platforms.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Link: https://lore.kernel.org/r/20230921105247.936205525@noisy.programming.kicks-ass.net
Upstream Linux never had a "README.legal" file, but it was present
in early source releases of Linux/m68k. It contained a simple copyright
notice and a link to a version of the "COPYING" file that predated the
addition of the "only valid GPL version is v2" clause.
Get rid of the references to non-existent files by replacing the
boilerplate with SPDX license identifiers.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/d91725ff1ed5d4b6ba42474e2ebfeebe711cba23.1695031668.git.geert@linux-m68k.org
Syzkaller reported a sleep in atomic context bug relating to the HASHCHK
handler logic:
BUG: sleeping function called from invalid context at arch/powerpc/kernel/traps.c:1518
in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 25040, name: syz-executor
preempt_count: 0, expected: 0
RCU nest depth: 0, expected: 0
no locks held by syz-executor/25040.
irq event stamp: 34
hardirqs last enabled at (33): [<c000000000048b38>] prep_irq_for_enabled_exit arch/powerpc/kernel/interrupt.c:56 [inline]
hardirqs last enabled at (33): [<c000000000048b38>] interrupt_exit_user_prepare_main+0x148/0x600 arch/powerpc/kernel/interrupt.c:230
hardirqs last disabled at (34): [<c00000000003e6a4>] interrupt_enter_prepare+0x144/0x4f0 arch/powerpc/include/asm/interrupt.h:176
softirqs last enabled at (0): [<c000000000281954>] copy_process+0x16e4/0x4750 kernel/fork.c:2436
softirqs last disabled at (0): [<0000000000000000>] 0x0
CPU: 15 PID: 25040 Comm: syz-executor Not tainted 6.5.0-rc5-00001-g3ccdff6bb06d #3
Hardware name: IBM,9105-22A POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1040.00 (NL1040_021) hv:phyp pSeries
Call Trace:
[c0000000a8247ce0] [c00000000032b0e4] __might_resched+0x3b4/0x400 kernel/sched/core.c:10189
[c0000000a8247d80] [c0000000008c7dc8] __might_fault+0xa8/0x170 mm/memory.c:5853
[c0000000a8247dc0] [c00000000004160c] do_program_check+0x32c/0xb20 arch/powerpc/kernel/traps.c:1518
[c0000000a8247e50] [c000000000009b2c] program_check_common_virt+0x3bc/0x3c0
To determine if a trap was caused by a HASHCHK instruction, we inspect
the user instruction that triggered the trap. However this may sleep
if the page needs to be faulted in (get_user_instr() reaches
__get_user(), which calls might_fault() and triggers the bug message).
Move the HASHCHK handler logic to after we allow IRQs, which is fine
because we are only interested in HASHCHK if it's a user space trap.
Fixes: 5bcba4e6c1 ("powerpc/dexcr: Handle hashchk exception")
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230915034604.45393-1-bgray@linux.ibm.com
It can be easy to miss that the notifier mechanism invokes the callbacks
in an atomic context, so add some comments to that effect on the two
handlers we register here.
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230829063457.54157-4-bgray@linux.ibm.com
This is called in an atomic context, so is not allowed to sleep if a
user page needs to be faulted in and has nowhere it can be deferred to.
The pagefault_disabled() function is documented as preventing user
access methods from sleeping.
In practice the page will be mapped in nearly always because we are
reading the instruction that just triggered the watchpoint trap.
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230829063457.54157-3-bgray@linux.ibm.com
thread_change_pc() uses CPU local data, so must be protected from
swapping CPUs while it is reading the breakpoint struct.
The error is more noticeable after 1e60f3564b ("powerpc/watchpoints:
Track perf single step directly on the breakpoint"), which added an
unconditional __this_cpu_read() call in thread_change_pc(). However the
existing __this_cpu_read() that runs if a breakpoint does need to be
re-inserted has the same issue.
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230829063457.54157-2-bgray@linux.ibm.com
Currently, is_kdump_kernel() returns true in crash dump capture kernel
for both kdump and fadump crash dump capturing methods, as both these
methods set elfcorehdr_addr. Some restrictions enforced for crash dump
capture kernel, based on is_kdump_kernel(), are specifically meant for
kdump case and not desirable for fadump - eg. IO queues restriction in
device drivers. So, define is_kdump_kernel() to return false when f/w
assisted dump is active.
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Acked-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230912082950.856977-2-hbathini@linux.ibm.com
- Add HOTPLUG_SMT support (/sys/devices/system/cpu/smt) and honour the
configured SMT state when hotplugging CPUs into the system.
- Combine final TLB flush and lazy TLB mm shootdown IPIs when using the Radix
MMU to avoid a broadcast TLBIE flush on exit.
- Drop the exclusion between ptrace/perf watchpoints, and drop the now unused
associated arch hooks.
- Add support for the "nohlt" command line option to disable CPU idle.
- Add support for -fpatchable-function-entry for ftrace, with GCC >= 13.1.
- Rework memory block size determination, and support 256MB size on systems
with GPUs that have hotpluggable memory.
- Various other small features and fixes.
Thanks to: Andrew Donnellan, Aneesh Kumar K.V, Arnd Bergmann, Athira Rajeev,
Benjamin Gray, Christophe Leroy, Frederic Barrat, Gautam Menghani, Geoff Levand,
Hari Bathini, Immad Mir, Jialin Zhang, Joel Stanley, Jordan Niethe, Justin
Stitt, Kajol Jain, Kees Cook, Krzysztof Kozlowski, Laurent Dufour, Liang He,
Linus Walleij, Mahesh Salgaonkar, Masahiro Yamada, Michal Suchanek, Nageswara
R Sastry, Nathan Chancellor, Nathan Lynch, Naveen N Rao, Nicholas Piggin, Nick
Desaulniers, Omar Sandoval, Randy Dunlap, Reza Arbab, Rob Herring, Russell
Currey, Sourabh Jain, Thomas Gleixner, Trevor Woerner, Uwe Kleine-König, Vaibhav
Jain, Xiongfeng Wang, Yuan Tan, Zhang Rui, Zheng Zengkai.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmTwgbwTHG1wZUBlbGxl
cm1hbi5pZC5hdQAKCRBR6+o8yOGlgFmpD/432vipeoqvkAYsyK0xi/Y3GcY0wcyd
WJApLXXadEbtKQrgXQ6sowWqalg5thYnQCRarg/tXKK/po3KfgwkPjGDpOL+cIdr
12QVN2XJm9VmJ1wYJxzk+yXx4F43AdmMdr94qWAGufbTHezwb4UpzVR1NxtFrOE/
X5TNsC2+2mdZY/ZaNHS5vsTIFv3EhQfqgjZPlIAdLn6CGc8xWT514Q/uHA8+ytM/
HL7Hqs33DoPSvgTa5TT/2E0d0k5nO3P5KObzAjpYlireTPaBi51mpKGewcrtm0o2
v3cBlbfx3C7pe9ZhKBK9BH8cjynfiqsVZ9/lCw/7eBNdm9tHuzG0jeS7Db9tCZXS
fM7G2R7SoIusPTqxlBmkU5DpYslwrHiVgCyy3ijxkoA/fakVwh/GgTcMsRt73IY6
n6DsUvWwuYHCIeIiHmHQJqCqCRtV+aMzU3AbbBHOjtdIanhlW16M686dEsgCirh7
akRVRD5VqKaqXs34PpkRL89Xv3wZRjl6XZ3hZFfCjSYXfpXDXhgSToIskpHYhKL8
gpY7WtG9YQP05Xz5HRCx6EluaZVeKe0lZi6fezX7Mi9AygJQO8FfXqP1mHBlEq40
ThWtvL9D89RV6lADqqFN20XepgvKNOyAXcE4szvsnIZYUSPmZQZSPxx+DHtROaLP
jX3ifxtxJp92pQ==
=5g7K
-----END PGP SIGNATURE-----
Merge tag 'powerpc-6.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
- Add HOTPLUG_SMT support (/sys/devices/system/cpu/smt) and honour the
configured SMT state when hotplugging CPUs into the system
- Combine final TLB flush and lazy TLB mm shootdown IPIs when using the
Radix MMU to avoid a broadcast TLBIE flush on exit
- Drop the exclusion between ptrace/perf watchpoints, and drop the now
unused associated arch hooks
- Add support for the "nohlt" command line option to disable CPU idle
- Add support for -fpatchable-function-entry for ftrace, with GCC >=
13.1
- Rework memory block size determination, and support 256MB size on
systems with GPUs that have hotpluggable memory
- Various other small features and fixes
Thanks to Andrew Donnellan, Aneesh Kumar K.V, Arnd Bergmann, Athira
Rajeev, Benjamin Gray, Christophe Leroy, Frederic Barrat, Gautam
Menghani, Geoff Levand, Hari Bathini, Immad Mir, Jialin Zhang, Joel
Stanley, Jordan Niethe, Justin Stitt, Kajol Jain, Kees Cook, Krzysztof
Kozlowski, Laurent Dufour, Liang He, Linus Walleij, Mahesh Salgaonkar,
Masahiro Yamada, Michal Suchanek, Nageswara R Sastry, Nathan Chancellor,
Nathan Lynch, Naveen N Rao, Nicholas Piggin, Nick Desaulniers, Omar
Sandoval, Randy Dunlap, Reza Arbab, Rob Herring, Russell Currey, Sourabh
Jain, Thomas Gleixner, Trevor Woerner, Uwe Kleine-König, Vaibhav Jain,
Xiongfeng Wang, Yuan Tan, Zhang Rui, and Zheng Zengkai.
* tag 'powerpc-6.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (135 commits)
macintosh/ams: linux/platform_device.h is needed
powerpc/xmon: Reapply "Relax frame size for clang"
powerpc/mm/book3s64: Use 256M as the upper limit with coherent device memory attached
powerpc/mm/book3s64: Fix build error with SPARSEMEM disabled
powerpc/iommu: Fix notifiers being shared by PCI and VIO buses
powerpc/mpc5xxx: Add missing fwnode_handle_put()
powerpc/config: Disable SLAB_DEBUG_ON in skiroot
powerpc/pseries: Remove unused hcall tracing instruction
powerpc/pseries: Fix hcall tracepoints with JUMP_LABEL=n
powerpc: dts: add missing space before {
powerpc/eeh: Use pci_dev_id() to simplify the code
powerpc/64s: Move CPU -mtune options into Kconfig
powerpc/powermac: Fix unused function warning
powerpc/pseries: Rework lppaca_shared_proc() to avoid DEBUG_PREEMPT
powerpc: Don't include lppaca.h in paca.h
powerpc/pseries: Move hcall_vphn() prototype into vphn.h
powerpc/pseries: Move VPHN constants into vphn.h
cxl: Drop unused detach_spa()
powerpc: Drop zalloc_maybe_bootmem()
powerpc/powernv: Use struct opal_prd_msg in more places
...
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQQdXVVFGN5XqKr1Hj7LwZzRsCrn5QUCZO0WoxQcem9oYXJAbGlu
dXguaWJtLmNvbQAKCRDLwZzRsCrn5alsAP0UZQIKI2zEjFdtucgClcSouflIOC5i
Hvtgv3qVFXPZQwEA2H/SGjigtH5NruVXECDZdrIfaGGvBhyeY72lbswXfQ0=
=Gu8i
-----END PGP SIGNATURE-----
Merge tag 'integrity-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity
Pull integrity subsystem updates from Mimi Zohar:
- With commit 099f26f22f ("integrity: machine keyring CA
configuration") certificates may be loaded onto the IMA keyring,
directly or indirectly signed by keys on either the "builtin" or the
"machine" keyrings.
With the ability for the system/machine owner to sign the IMA policy
itself without needing to recompile the kernel, update the IMA
architecture specific policy rules to require the IMA policy itself
be signed.
[ As commit 099f26f22f was upstreamed in linux-6.4, updating the
IMA architecture specific policy now to require signed IMA policies
may break userspace expectations. ]
- IMA only checked the file data hash was not on the system blacklist
keyring for files with an appended signature (e.g. kernel modules,
Power kernel image).
Check all file data hashes regardless of how it was signed
- Code cleanup, and a kernel-doc update
* tag 'integrity-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
kexec_lock: Replace kexec_mutex() by kexec_lock() in two comments
ima: require signed IMA policy when UEFI secure boot is enabled
integrity: Always reference the blacklist keyring with appraisal
ima: Remove deprecated IMA_TRUSTED_KEYRING Kconfig
- allow dynamic sizing of the swiotlb buffer, to cater for secure
virtualization workloads that require all I/O to be bounce buffered
(Petr Tesarik)
- move a declaration to a header (Arnd Bergmann)
- check for memory region overlap in dma-contiguous (Binglei Wang)
- remove the somewhat dangerous runtime swiotlb-xen enablement and
unexport is_swiotlb_active (Christoph Hellwig, Juergen Gross)
- per-node CMA improvements (Yajun Deng)
-----BEGIN PGP SIGNATURE-----
iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmTuDHkLHGhjaEBsc3Qu
ZGUACgkQD55TZVIEUYOqvhAApMk2/ceTgVH17sXaKE822+xKvgv377O6TlggMeGG
W4zA0KD69DNz0AfaaCc5U5f7n8Ld/YY1RsvkHW4b3jgw+KRTeQr0jjitBgP5kP2M
A1+qxdyJpCTwiPt9s2+JFVPeyZ0s52V6OJODKRG3s0ore55R+U09VySKtASON+q3
GMKfWqQteKC+thg7NkrQ7JUixuo84oICws+rZn4K9ifsX2O0HYW6aMW0feRfZjJH
r0TgqZc4RdPTSaF22oapR9Ls39+7hp/pBvoLm5sBNA3cl5C3X4VWo9ERMU1jW9h+
VYQv39NycUspgskWJmpbU06/+ooYqQlwHSR/vdNusmFIvxo4tf6/UX72YO5F8Dar
ap0wYGauiEwTjSnhVxPTXk3obWyWEsgFAeRnPdTlH2CNmv38QZU2HLb8eU1pcXxX
j+WI2Ewy9z22uBVYiPOKpdW1jkSfmlmfPp/8SbAdua7I3YQ90rQN6AvU06zAi/cL
NQTgO81E4jPkygqAVgS/LeYziWAQ73yM7m9ExThtTgqFtHortwhJ4Fd8XKtvtvEb
viXAZ/WZtQBv/CIKAW98NhgIDP/SPOT8ym6V35WK+kkNFMS6LMSQUfl9GgbHGyFa
n9icMm7BmbDtT1+AKNafG9En4DtAf9M9QNidAVOyfrsIk6S0gZoZwvIStkA7on8a
cNY=
=kVVr
-----END PGP SIGNATURE-----
Merge tag 'dma-mapping-6.6-2023-08-29' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-maping updates from Christoph Hellwig:
- allow dynamic sizing of the swiotlb buffer, to cater for secure
virtualization workloads that require all I/O to be bounce buffered
(Petr Tesarik)
- move a declaration to a header (Arnd Bergmann)
- check for memory region overlap in dma-contiguous (Binglei Wang)
- remove the somewhat dangerous runtime swiotlb-xen enablement and
unexport is_swiotlb_active (Christoph Hellwig, Juergen Gross)
- per-node CMA improvements (Yajun Deng)
* tag 'dma-mapping-6.6-2023-08-29' of git://git.infradead.org/users/hch/dma-mapping:
swiotlb: optimize get_max_slots()
swiotlb: move slot allocation explanation comment where it belongs
swiotlb: search the software IO TLB only if the device makes use of it
swiotlb: allocate a new memory pool when existing pools are full
swiotlb: determine potential physical address limit
swiotlb: if swiotlb is full, fall back to a transient memory pool
swiotlb: add a flag whether SWIOTLB is allowed to grow
swiotlb: separate memory pool data from other allocator data
swiotlb: add documentation and rename swiotlb_do_find_slots()
swiotlb: make io_tlb_default_mem local to swiotlb.c
swiotlb: bail out of swiotlb_init_late() if swiotlb is already allocated
dma-contiguous: check for memory region overlap
dma-contiguous: support numa CMA for specified node
dma-contiguous: support per-numa CMA for all architectures
dma-mapping: move arch_dma_set_mask() declaration to header
swiotlb: unexport is_swiotlb_active
x86: always initialize xen-swiotlb when xen-pcifront is enabling
xen/pci: add flag for PCI passthrough being possible
("refactor Kconfig to consolidate KEXEC and CRASH options").
- kernel.h slimming work from Andy Shevchenko ("kernel.h: Split out a
couple of macros to args.h").
- gdb feature work from Kuan-Ying Lee ("Add GDB memory helper
commands").
- vsprintf inclusion rationalization from Andy Shevchenko
("lib/vsprintf: Rework header inclusions").
- Switch the handling of kdump from a udev scheme to in-kernel handling,
by Eric DeVolder ("crash: Kernel handling of CPU and memory hot
un/plug").
- Many singleton patches to various parts of the tree
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZO2GpAAKCRDdBJ7gKXxA
juW3AQD1moHzlSN6x9I3tjm5TWWNYFoFL8af7wXDJspp/DWH/AD/TO0XlWWhhbYy
QHy7lL0Syha38kKLMXTM+bN6YQHi9AU=
=WJQa
-----END PGP SIGNATURE-----
Merge tag 'mm-nonmm-stable-2023-08-28-22-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
- An extensive rework of kexec and crash Kconfig from Eric DeVolder
("refactor Kconfig to consolidate KEXEC and CRASH options")
- kernel.h slimming work from Andy Shevchenko ("kernel.h: Split out a
couple of macros to args.h")
- gdb feature work from Kuan-Ying Lee ("Add GDB memory helper
commands")
- vsprintf inclusion rationalization from Andy Shevchenko
("lib/vsprintf: Rework header inclusions")
- Switch the handling of kdump from a udev scheme to in-kernel
handling, by Eric DeVolder ("crash: Kernel handling of CPU and memory
hot un/plug")
- Many singleton patches to various parts of the tree
* tag 'mm-nonmm-stable-2023-08-28-22-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (81 commits)
document while_each_thread(), change first_tid() to use for_each_thread()
drivers/char/mem.c: shrink character device's devlist[] array
x86/crash: optimize CPU changes
crash: change crash_prepare_elf64_headers() to for_each_possible_cpu()
crash: hotplug support for kexec_load()
x86/crash: add x86 crash hotplug support
crash: memory and CPU hotplug sysfs attributes
kexec: exclude elfcorehdr from the segment digest
crash: add generic infrastructure for crash hotplug support
crash: move a few code bits to setup support of crash hotplug
kstrtox: consistently use _tolower()
kill do_each_thread()
nilfs2: fix WARNING in mark_buffer_dirty due to discarded buffer reuse
scripts/bloat-o-meter: count weak symbol sizes
treewide: drop CONFIG_EMBEDDED
lockdep: fix static memory detection even more
lib/vsprintf: declare no_hash_pointers in sprintf.h
lib/vsprintf: split out sprintf() and friends
kernel/fork: stop playing lockless games for exe_file replacement
adfs: delete unused "union adfs_dirtail" definition
...
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZOXT7QAKCRCRxhvAZXjc
ort3AP0VIK/oJk5skgjpinQrCfvtVz0XOtawuBtn0f1weIfb6AD9Hg1rqOKnQD5z
dkvn3xaEr3gPOVzqU5SvFwVoCM0cMwA=
=24Ha
-----END PGP SIGNATURE-----
Merge tag 'v6.6-vfs.fchmodat2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull fchmodat2 system call from Christian Brauner:
"This adds the fchmodat2() system call. It is a revised version of the
fchmodat() system call, adding a missing flag argument. Support for
both AT_SYMLINK_NOFOLLOW and AT_EMPTY_PATH are included.
Adding this system call revision has been a longstanding request but
so far has always fallen through the cracks. While the kernel
implementation of fchmodat() does not have a flag argument the libc
provided POSIX-compliant fchmodat(3) version does. Both glibc and musl
have to implement a workaround in order to support AT_SYMLINK_NOFOLLOW
(see [1] and [2]).
The workaround is brittle because it relies not just on O_PATH and
O_NOFOLLOW semantics and procfs magic links but also on our rather
inconsistent symlink semantics.
This gives userspace a proper fchmodat2() system call that libcs can
use to properly implement fchmodat(3) and allows them to get rid of
their hacks. In this case it will immediately benefit them as the
current workaround is already defunct because of aformentioned
inconsistencies.
In addition to AT_SYMLINK_NOFOLLOW, give userspace the ability to use
AT_EMPTY_PATH with fchmodat2(). This is already possible with
fchownat() so there's no reason to not also support it for
fchmodat2().
The implementation is simple and comes with selftests. Implementation
of the system call and wiring up the system call are done as separate
patches even though they could arguably be one patch. But in case
there are merge conflicts from other system call additions it can be
beneficial to have separate patches"
Link: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/fchmodat.c;h=17eca54051ee28ba1ec3f9aed170a62630959143;hb=a492b1e5ef7ab50c6fdd4e4e9879ea5569ab0a6c#l35 [1]
Link: https://git.musl-libc.org/cgit/musl/tree/src/stat/fchmodat.c?id=718f363bc2067b6487900eddc9180c84e7739f80#n28 [2]
* tag 'v6.6-vfs.fchmodat2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
selftests: fchmodat2: remove duplicate unneeded defines
fchmodat2: add support for AT_EMPTY_PATH
selftests: Add fchmodat2 selftest
arch: Register fchmodat2, usually as syscall 452
fs: Add fchmodat2()
Non-functional cleanup of a "__user * filename"
fail_iommu_setup() registers the fail_iommu_bus_notifier struct to both
PCI and VIO buses. struct notifier_block is a linked list node, so this
causes any notifiers later registered to either bus type to also be
registered to the other since they share the same node.
This causes issues in (at least) the vgaarb code, which registers a
notifier for PCI buses. pci_notify() ends up being called on a vio
device, converted with to_pci_dev() even though it's not a PCI device,
and finally makes a bad access in vga_arbiter_add_pci_device() as
discovered with KASAN:
BUG: KASAN: slab-out-of-bounds in vga_arbiter_add_pci_device+0x60/0xe00
Read of size 4 at addr c000000264c26fdc by task swapper/0/1
Call Trace:
dump_stack_lvl+0x1bc/0x2b8 (unreliable)
print_report+0x3f4/0xc60
kasan_report+0x244/0x698
__asan_load4+0xe8/0x250
vga_arbiter_add_pci_device+0x60/0xe00
pci_notify+0x88/0x444
notifier_call_chain+0x104/0x320
blocking_notifier_call_chain+0xa0/0x140
device_add+0xac8/0x1d30
device_register+0x58/0x80
vio_register_device_node+0x9ac/0xce0
vio_bus_scan_register_devices+0xc4/0x13c
__machine_initcall_pseries_vio_device_init+0x94/0xf0
do_one_initcall+0x12c/0xaa8
kernel_init_freeable+0xa48/0xba8
kernel_init+0x64/0x400
ret_from_kernel_thread+0x5c/0x64
Fix this by creating separate notifier_block structs for each bus type.
Fixes: d6b9a81b2a ("powerpc: IOMMU fault injection")
Reported-by: Nageswara R Sastry <rnsastry@linux.ibm.com>
Signed-off-by: Russell Currey <ruscur@russell.cc>
Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com>
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
[mpe: Add #ifdef to fix CONFIG_IBMVIO=n build]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230322035322.328709-1-ruscur@russell.cc
The only callers of zalloc_maybe_bootmem() are PCI setup routines. These
used to be called early during boot before slab setup, and also during
runtime due to hotplug.
But commit 5537fcb319 ("powerpc/pci: Add ppc_md.discover_phbs()")
moved the boot-time calls later, after slab setup, meaning there's no
longer any need for zalloc_maybe_bootmem(), kzalloc() can be used in all
cases.
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230823055430.752550-1-mpe@ellerman.id.au
GCC v13.1 updated support for -fpatchable-function-entry on ppc64le to
emit nops after the local entry point, rather than before it. This
allows us to use this in the kernel for ftrace purposes. A new script is
added under arch/powerpc/tools/ to help detect if nops are emitted after
the function local entry point, or before the global entry point.
With -fpatchable-function-entry, we no longer have the profiling
instructions generated at function entry, so we only need to validate
the presence of two nops at the ftrace location in ftrace_init_nop(). We
patch the preceding instruction with 'mflr r0' to match the
-mprofile-kernel ABI for subsequent ftrace use.
This changes the profiling instructions used on ppc32. The default -pg
option emits an additional 'stw' instruction after 'mflr r0' and before
the branch to _mcount 'bl _mcount'. This is very similar to the original
-mprofile-kernel implementation on ppc64le, where an additional 'std'
instruction was used to save LR to its save location in the caller's
stackframe. Subsequently, this additional store was removed in later
compiler versions for performance reasons. The same reasons apply for
ppc32 so we only patch in a 'mflr r0'.
Signed-off-by: Naveen N Rao <naveen@kernel.org>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/68586d22981a2c3bb45f27a2b621173d10a7d092.1687166935.git.naveen@kernel.org
Implement ftrace_replace_code() to consolidate logic from the different
ftrace patching routines: ftrace_make_nop(), ftrace_make_call() and
ftrace_modify_call(). Note that ftrace_make_call() is still required
primarily to handle patching modules during their load time. The other
two routines should no longer be called.
This lays the groundwork to enable better control in patching ftrace
locations, including the ability to nop-out preceding profiling
instructions when ftrace is disabled.
Signed-off-by: Naveen N Rao <naveen@kernel.org>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/c28f852225646b0561bbf3c1d22d03f041ace8e0.1687166935.git.naveen@kernel.org
Now that we validate the ftrace location during initialization in
ftrace_init_nop(), we can simplify ftrace_modify_call() to patch-in the
updated branch instruction without worrying about the instructions
surrounding the ftrace location. Note that we continue to ensure we
have the expected branch instruction at the ftrace location before
patching it with the updated branch destination.
Signed-off-by: Naveen N Rao <naveen@kernel.org>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/06275720939f8ee4c2f61c9e9a3e89b1fa3c441d.1687166935.git.naveen@kernel.org
Now that we validate the ftrace location during initialization in
ftrace_init_nop(), we can simplify ftrace_make_call() to replace the nop
without worrying about the instructions surrounding the ftrace location.
Note that we continue to ensure that we have a nop at the ftrace
location before patching it.
Signed-off-by: Naveen N Rao <naveen@kernel.org>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/2d28866d2f556488a663981abe5621511efb207b.1687166935.git.naveen@kernel.org
Now that we validate the ftrace location during initialization in
ftrace_init_nop(), we can simplify ftrace_make_nop() to patch-in the nop
without worrying about the instructions surrounding the ftrace location.
Note that we continue to ensure that we have a bl to
ftrace_[regs_]caller at the ftrace location before nop-ing it out.
Signed-off-by: Naveen N Rao <naveen@kernel.org>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/e12ccbf28c50c3a07fb614f4d392e55f7098a729.1687166935.git.naveen@kernel.org
Currently, we validate instructions around the ftrace location every
time we have to enable/disable ftrace. Introduce ftrace_init_nop() to
instead perform all the validation during ftrace initialization. This
allows us to simply patch the necessary instructions during
enabling/disabling ftrace.
Signed-off-by: Naveen N Rao <naveen@kernel.org>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/f373684081e8e98be09b7f44d2d93069768324dc.1687166935.git.naveen@kernel.org
Commit 67361cf807 ("powerpc/ftrace: Handle large kernel configs")
added ftrace support for ppc64 kernel images with a text section larger
than 32MB. The patch did two things:
1. Add stubs at the end of .text to branch into ftrace_[regs_]caller for
functions that were out of branch range.
2. Re-purpose linker-generated long branches to _mcount to instead branch
to ftrace_[regs_]caller.
Before that, we only supported kernel .text up to ~32MB. With the above,
we now support up to ~96MB:
- The first 32MB of kernel text can branch directly into
ftrace_[regs_]caller since that symbol is usually at the beginning.
- The modified long_branch from (2) above is used by the next 32MB of
kernel text.
- The next 32MB of kernel text can use the stub at the end of text to
branch back to ftrace_[regs_]caller.
While re-purposing the long branch works in practice, it still restricts
ftrace to kernel text up to ~96MB. The stub at the end of kernel text
from (1) already enables us to extend ftrace support for kernel text
up to 64MB, which fulfils the original requirement. Further, once we
switch to -fpatchable-function-entry, there will not be a long branch
that we can use.
Stop re-purposing the linker-generated long branches for ftrace to
simplify the code. If there are good reasons to support ftrace on
kernels beyond 64MB, we can consider adding support by using
-fpatchable-function-entry.
Signed-off-by: Naveen N Rao <naveen@kernel.org>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/33fa3be97f8e1f2171254ef2e1b0d5c8836c11fd.1687166935.git.naveen@kernel.org
ftrace_low.S has just the _mcount stub and return_to_handler(). Merge
this back into ftrace_mprofile.S and ftrace_64_pg.S to keep all ftrace
code together, and to allow those to evolve independently.
ftrace_mprofile.S is also not an entirely accurate name since this also
holds ppc32 code. This will be all the more incorrect once support for
-fpatchable-function-entry is added. Rename files here to more
accurately describe the code:
- ftrace_mprofile.S is renamed to ftrace_entry.S
- ftrace_pg.c is renamed to ftrace_64_pg.c
- ftrace_64_pg.S is rename to ftrace_64_pg_entry.S
Signed-off-by: Naveen N Rao <naveen@kernel.org>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/b900c9a8bba9d6c3c295e0f99886acf3e5bf6f7b.1687166935.git.naveen@kernel.org
Commit 67361cf807 ("powerpc/ftrace: Handle large kernel configs")
added ftrace support for ppc64 kernel images with a text section larger
than 32MB. The approach itself isn't specific to ppc64, so extend the
same to also work on ppc32.
While at it, reduce the space reserved for the stub from 64 bytes to 32
bytes since the different stub variants are all less than 8
instructions.
To reduce use of #ifdef, a stub implementation is provided for
kernel_toc_address() and -SZ_2G is cast to 'long long' to prevent
errors on ppc32.
Signed-off-by: Naveen N Rao <naveen@kernel.org>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/9fa3258cbb9105cf8a0a8135214d44ffbc75fe84.1687166935.git.naveen@kernel.org
Instead of keying off DYNAMIC_FTRACE_WITH_REGS, use FTRACE_REGS_ADDR to
identify the proper ftrace trampoline address to use.
Signed-off-by: Naveen N Rao <naveen@kernel.org>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/6045a280a57a7ea937a5bb13ccac747026dbfb07.1687166935.git.naveen@kernel.org
Since we now support DYNAMIC_FTRACE_WITH_ARGS across ppc32 and ppc64
ELFv2, we can simplify function_graph tracer support code in ftrace.c
Signed-off-by: Naveen N Rao <naveen@kernel.org>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/4dc92c4b1ed444dc62b748ae7327acdb9e096864.1687166935.git.naveen@kernel.org
ELFv1 support is deprecated and on the way out. Pre -mprofile-kernel
ftrace support (-pg only) is very limited and is retained primarily for
clang builds. It won't be necessary once clang lands support for
-fpatchable-function-entry.
Copy the existing ftrace code supporting these into ftrace_pg.c.
ftrace.c can then be refactored and enhanced with a focus on ppc32 and
ppc64 ELFv2.
Signed-off-by: Naveen N Rao <naveen@kernel.org>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/1eb6cc6c3141ddb77a2a25f8a9e83d83ff312b02.1687166935.git.naveen@kernel.org
The APIs that allow backtracing across CPUs have always had a way to
exclude the current CPU. This convenience means callers didn't need to
find a place to allocate a CPU mask just to handle the common case.
Let's extend the API to take a CPU ID to exclude instead of just a
boolean. This isn't any more complex for the API to handle and allows the
hardlockup detector to exclude a different CPU (the one it already did a
trace for) without needing to find space for a CPU mask.
Arguably, this new API also encourages safer behavior. Specifically if
the caller wants to avoid tracing the current CPU (maybe because they
already traced the current CPU) this makes it more obvious to the caller
that they need to make sure that the current CPU ID can't change.
[akpm@linux-foundation.org: fix trigger_allbutcpu_cpu_backtrace() stub]
Link: https://lkml.kernel.org/r/20230804065935.v4.1.Ia35521b91fc781368945161d7b28538f9996c182@changeid
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: kernel test robot <lkp@intel.com>
Cc: Lecopzer Chen <lecopzer.chen@mediatek.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Pingfan Liu <kernelfans@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Invoke ibm,os-term call with rtas_call_unlocked(), without using the
RTAS spinlock, to avoid deadlock in the unlikely event of a machine
crash while making an RTAS call.
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Reviewed-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230609071404.425529-1-hbathini@linux.ibm.com
In case fadump_reserve_mem() fails to reserve memory, the
reserve_dump_area_size variable will retain the reserve area size. This
will lead to /sys/kernel/fadump/mem_reserved node displaying an incorrect
memory reserved by fadump.
To fix this problem, reserve dump area size variable is set to 0 if fadump
failed to reserve memory.
Fixes: 8255da95e5 ("powerpc/fadump: release all the memory above boot memory size")
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Acked-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230704050715.203581-1-sourabhjain@linux.ibm.com
Building ppc40x_defconfig throws the following error:
CC arch/powerpc/kernel/traps.o
arch/powerpc/kernel/traps.c:2232:29: warning: no previous prototype for 'WatchdogHandler' [-Wmissing-prototypes]
2232 | void __attribute__ ((weak)) WatchdogHandler(struct pt_regs *regs)
| ^~~~~~~~~~~~~~~
This function was imported by commit 14cf11af6c ("powerpc: Merge
enough to start building in arch/powerpc.") as a weak function but
never defined and/or called outside traps.c
As it has only one caller fold it inside its caller and remove it.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/38fe1078eb403eef74dc8f29387636fd7ecdf43c.1692276041.git.christophe.leroy@csgroup.eu
With hardened usercopy enabled (CONFIG_HARDENED_USERCOPY=y), using the
/proc/powerpc/rtas/firmware_update interface to prepare a system
firmware update yields a BUG():
kernel BUG at mm/usercopy.c:102!
Oops: Exception in kernel mode, sig: 5 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
Modules linked in:
CPU: 0 PID: 2232 Comm: dd Not tainted 6.5.0-rc3+ #2
Hardware name: IBM,8408-E8E POWER8E (raw) 0x4b0201 0xf000004 of:IBM,FW860.50 (SV860_146) hv:phyp pSeries
NIP: c0000000005991d0 LR: c0000000005991cc CTR: 0000000000000000
REGS: c0000000148c76a0 TRAP: 0700 Not tainted (6.5.0-rc3+)
MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE> CR: 24002242 XER: 0000000c
CFAR: c0000000001fbd34 IRQMASK: 0
[ ... GPRs omitted ... ]
NIP usercopy_abort+0xa0/0xb0
LR usercopy_abort+0x9c/0xb0
Call Trace:
usercopy_abort+0x9c/0xb0 (unreliable)
__check_heap_object+0x1b4/0x1d0
__check_object_size+0x2d0/0x380
rtas_flash_write+0xe4/0x250
proc_reg_write+0xfc/0x160
vfs_write+0xfc/0x4e0
ksys_write+0x90/0x160
system_call_exception+0x178/0x320
system_call_common+0x160/0x2c4
The blocks of the firmware image are copied directly from user memory
to objects allocated from flash_block_cache, so flash_block_cache must
be created using kmem_cache_create_usercopy() to mark it safe for user
access.
Fixes: 6d07d1cd30 ("usercopy: Restrict non-usercopy caches to size 0")
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
[mpe: Trim and indent oops]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230810-rtas-flash-vs-hardened-usercopy-v2-1-dcf63793a938@linux.ibm.com
objtool reports the following warning:
arch/powerpc/kernel/ptrace/ptrace-view.o: warning: objtool:
gpr32_set_common+0x23c (.text+0x860): redundant UACCESS disable
gpr32_set_common() conditionally opens and closes UACCESS based on
whether kbuf pointer is NULL or not. This is wackelig.
Split gpr32_set_common() in two fonctions, one for user one for
kernel.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[mpe: Fix oops in gpr32_set_common_user() due to NULL kbuf]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/b8d6ae4483fcfd17524e79d803c969694a85cc02.1687428075.git.christophe.leroy@csgroup.eu
ptrace and perf watchpoints were considered incompatible in
commit 29da4f91c0 ("powerpc/watchpoint: Don't allow concurrent perf
and ptrace events"), but the logic in that commit doesn't really apply.
Ptrace doesn't automatically single step; the ptracer must request this
explicitly. And the ptracer can do so regardless of whether a
ptrace/perf watchpoint triggered or not: it could single step every
instruction if it wanted to. Whatever stopped the ptracee before
executing the instruction that would trigger the perf watchpoint is no
longer relevant by this point.
To get correct behaviour when perf and ptrace are watching the same
data we must ignore the perf watchpoint. After all, ptrace has
before-execute semantics, and perf is after-execute, so perf doesn't
actually care about the watchpoint trigger at this point in time.
Pausing before execution does not mean we will actually end up executing
the instruction.
Importantly though, we don't remove the perf watchpoint yet. This is
key.
The ptracer is free to do whatever it likes right now. E.g., it can
continue the process, single step. or even set the child PC somewhere
completely different.
If it does try to execute the instruction though, without reinserting
the watchpoint (in which case we go back to the start of this example),
the perf watchpoint would immediately trigger. This time there is no
ptrace watchpoint, so we can safely perform a single step and increment
the perf counter. Upon receiving the single step exception, the existing
code already handles propagating or consuming it based on whether
another subsystem (e.g. ptrace) requested a single step. Again, this is
needed with or without perf/ptrace exclusion, because ptrace could be
single stepping this instruction regardless of if a watchpoint is
involved.
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230801011744.153973-6-bgray@linux.ibm.com
We only remove watchpoints when they have the perf_single_step flag set,
so we can reinsert them during the first iteration.
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230801011744.153973-5-bgray@linux.ibm.com
There is a bug in the current watchpoint tracking logic, where the
teardown in arch_unregister_hw_breakpoint() uses bp->ctx->task, which it
does not have a reference of and parallel threads may be in the process
of destroying. This was partially addressed in commit fb822e6076
("powerpc/hw_breakpoint: Fix oops when destroying hw_breakpoint event"),
but the underlying issue of accessing a struct member in an unknown
state still remained. Syzkaller managed to trigger a null pointer
derefernce due to the race between the task destructor and checking the
pointer and dereferencing it in the loop.
While this null pointer dereference could be fixed by using READ_ONCE
to access the task up front, that just changes the error to manipulating
possbily freed memory.
Instead, the breakpoint logic needs to be reworked to remove any
dependency on a context or task struct during breakpoint removal.
The reason we have this currently is to clear thread.last_hit_ubp. This
member is used to differentiate the perf DAWR single-step sequence from
other causes of single-step, such as userspace just calling
ptrace(PTRACE_SINGLESTEP, ...). We need to differentiate them because,
when the single step interrupt is received, we need to know whether to
re-insert the DAWR breakpoint (perf) or not (ptrace / other).
arch_unregister_hw_breakpoint() needs to clear this information to
prevent dangling pointers to possibly freed memory. These pointers are
dereferenced in single_step_dabr_instruction() without a way to check
their validity.
This patch moves the tracking of this information to the breakpoint
itself. This means we no longer have to do anything special to clean up.
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230801011744.153973-4-bgray@linux.ibm.com
info is cheap to retrieve, and is likely optimised by the compiler
anyway. On the other hand, propagating it across the functions makes it
possible to be inconsistent and adds needless complexity.
Remove it, and invoke counter_arch_bp() when we need to work with it.
As we don't persist it, we just use the local bp array to track whether
we are ignoring a breakpoint.
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230801011744.153973-3-bgray@linux.ibm.com
The behaviour of the thread_change_pc() function is a bit cryptic
without being more familiar with how the watchpoint logic handles
perf's after-execute semantics.
Expand the comment to explain why we can re-insert the breakpoint and
unset the perf_single_step flag.
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230801011744.153973-2-bgray@linux.ibm.com
Commit ddb5cdbafa ("kbuild: generate KSYMTAB entries by modpost")
deprecated <asm/export.h>, which is now a wrapper of <linux/export.h>.
Replace #include <asm/export.h> with #include <linux/export.h>.
After all the <asm/export.h> lines are converted, <asm/export.h> and
<asm-generic/export.h> will be removed.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
[mpe: Fixup selftests that stub asm/export.h]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230806150954.394189-2-masahiroy@kernel.org
There is no EXPORT_SYMBOL line there, hence #include <asm/export.h>
is unneeded.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230806150954.394189-1-masahiroy@kernel.org
Add support for HOTPLUG_SMT, which enables the generic sysfs SMT support
files in /sys/devices/system/cpu/smt, as well as the "nosmt" boot
parameter.
Implement the recently added hooks to allow partial SMT states, allow
any number of threads per core.
Tie the config symbol to HOTPLUG_CPU, which enables it on the major
platforms that support SMT. If there are other platforms that want the
SMT support that can be tweaked in future.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
[ldufour: remove topology_smt_supported]
[ldufour: remove topology_smt_threads_supported]
[ldufour: select CONFIG_SMT_NUM_THREADS_DYNAMIC]
[ldufour: update kernel-parameters.txt]
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
Link: https://msgid.link/20230705145143.40545-10-ldufour@linux.ibm.com
There are a few warnings in powerpc64 defconfig builds after -Wmissing-prototypes
gets promoted from W=1 to the default warning set:
arch/powerpc/mm/book3s64/pgtable.c:422:6: error: no previous prototype for 'arch_report_meminfo' [-Werror=missing-prototypes]
arch/powerpc/platforms/cell/ras.c:275:5: error: no previous prototype for 'cbe_sysreset_hack' [-Werror=missing-prototypes]
arch/powerpc/platforms/cell/spu_manage.c:29:21: error: no previous prototype for 'spu_devnode' [-Werror=missing-prototypes]
arch/powerpc/platforms/pasemi/time.c:12:17: error: no previous prototype for 'pas_get_boot_time' [-Werror=missing-prototypes]
arch/powerpc/platforms/powermac/feature.c:1532:13: error: no previous prototype for 'g5_phy_disable_cpu1' [-Werror=missing-prototypes]
arch/powerpc/platforms/86xx/pic.c:28:13: error: no previous prototype for 'mpc86xx_init_irq' [-Werror=missing-prototypes]
drivers/pci/pci-sysfs.c:936:13: error: no previous prototype for 'pci_adjust_legacy_attr' [-Werror=missing-prototypes]
Address these by including the right header files or marking the
functions static. The audit.c one is a bit tricky since compat_audit.h
cannot include regular kernel headers tht have conflicting types on
32-bit powerpc.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
[mpe: Drop change to __vmemmap_free() which only exists in mm]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230727122720.2558065-1-arnd@kernel.org
The DT of_device.h and of_platform.h date back to the separate
of_platform_bus_type before it as merged into the regular platform bus.
As part of that merge prepping Arm DT support 13 years ago, they
"temporarily" include each other. They also include platform_device.h
and of.h. As a result, there's a pretty much random mix of those include
files used throughout the tree. In order to detangle these headers and
replace the implicit includes with struct declarations, users need to
explicitly include the correct includes.
Signed-off-by: Rob Herring <robh@kernel.org>
[mpe: Fixup maple/setup.c which needs platform_device]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230724210247.778034-1-robh@kernel.org
init_mm mm_cpumask and context.active_cpus is not maintained at boot
and hotplug. This seems to be harmless because init_mm does not have a
userspace and so never gets user TLBs flushed, but it looks odd and it
prevents some sanity checks being added.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230524060821.148015-2-npiggin@gmail.com
On book3s/32 KUAP is performed at segment level. At the moment,
when enabling userspace access, only current segment is modified.
Then if a write is performed on another user segment, a fault is
taken and all other user segments get enabled for userspace
access. This then require special attention when disabling
userspace access.
Having a userspace write access crossing a segment boundary is
unlikely. Having a userspace write access crossing a segment boundary
back and forth is even more unlikely. So, instead of enabling
userspace access on all segments when a write fault occurs, just
change which segment has userspace access enabled in order to
eliminate the case when more than one segment has userspace access
enabled. That simplifies userspace access deactivation.
There is however a corner case which is even more unlikely but has
to be handled anyway: an unaligned access which is crossing a
segment boundary. That would definitely require at least having
userspace access enabled on the two segments. To avoid complicating
the likely case for a so unlikely happening, handle such situation
like an alignment exception and emulate the store.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/8de8580513c1a6e880bad1ba9a69d3efad3d4fa5.1689091022.git.christophe.leroy@csgroup.eu
All but book3s/64 use a static branch key for disabling kuap.
book3s/64 uses an mmu feature.
Refactor all targets to use MMU_FTR_KUAP like book3s/64.
For PPC32 that implies updating mmu features fixups once KUAP
has been initialised.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/6b3d7c977bad73378ea368bc6818e9c94ea95ab0.1689091022.git.christophe.leroy@csgroup.eu
Commit 273df864cf ("ima: Check against blacklisted hashes for files with
modsig") introduced an appraise_flag option for referencing the blacklist
keyring. Any matching binary found on this keyring fails signature
validation. This flag only works with module appended signatures.
An important part of a PKI infrastructure is to have the ability to do
revocation at a later time should a vulnerability be found. Expand the
revocation flag usage to all appraisal functions. The flag is now
enabled by default. Setting the flag with an IMA policy has been
deprecated. Without a revocation capability like this in place, only
authenticity can be maintained. With this change, integrity can now be
achieved with digital signature based IMA appraisal.
Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com>
Reviewed-by: Nayna Jain <nayna@linux.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
When booting on e6500 with an ELF v2 ABI kernel, the secondary threads do
not start correctly:
[ 0.051118] smp: Bringing up secondary CPUs ...
[ 5.072700] Processor 1 is stuck.
This occurs because the startup code is written to use function
descriptors when loading the entry point for the secondary threads. When
building with ELF v2 ABI there are no function descriptors, and the code
loads junk values for the entry point address.
Fix it by using ppc_function_entry() in C, and DOTSYM() in asm, both of
which work correctly for ELF v2 ABI as well as ELF v1 ABI kernels.
Fixes: 8c5fa3b5c4 ("powerpc/64: Make ELFv2 the default for big-endian builds")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230801102650.48705-1-mpe@ellerman.id.au
This function has a __weak definition and an override that is only used on
freescale powerpc chips. The powerpc definition however does not see the
declaration that is in a .c file:
arch/powerpc/kernel/dma-mask.c:7:6: error: no previous prototype for 'arch_dma_set_mask' [-Werror=missing-prototypes]
Move it into the linux/dma-map-ops.h header where the other arch_dma_* functions
are declared.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
With ppc64 -mprofile-kernel and ppc32 -pg, profiling instructions to
call into ftrace are emitted right at function entry. The instruction
sequence used is minimal to reduce overhead. Crucially, a stackframe is
not created for the function being traced. This breaks stack unwinding
since the function being traced does not have a stackframe for itself.
As such, it never shows up in the backtrace:
/sys/kernel/debug/tracing # echo 1 > /proc/sys/kernel/stack_tracer_enabled
/sys/kernel/debug/tracing # cat stack_trace
Depth Size Location (17 entries)
----- ---- --------
0) 4144 32 ftrace_call+0x4/0x44
1) 4112 432 get_page_from_freelist+0x26c/0x1ad0
2) 3680 496 __alloc_pages+0x290/0x1280
3) 3184 336 __folio_alloc+0x34/0x90
4) 2848 176 vma_alloc_folio+0xd8/0x540
5) 2672 272 __handle_mm_fault+0x700/0x1cc0
6) 2400 208 handle_mm_fault+0xf0/0x3f0
7) 2192 80 ___do_page_fault+0x3e4/0xbe0
8) 2112 160 do_page_fault+0x30/0xc0
9) 1952 256 data_access_common_virt+0x210/0x220
10) 1696 400 0xc00000000f16b100
11) 1296 384 load_elf_binary+0x804/0x1b80
12) 912 208 bprm_execve+0x2d8/0x7e0
13) 704 64 do_execveat_common+0x1d0/0x2f0
14) 640 160 sys_execve+0x54/0x70
15) 480 64 system_call_exception+0x138/0x350
16) 416 416 system_call_common+0x160/0x2c4
Fix this by having ftrace create a dummy stackframe for the function
being traced. With this, backtraces now capture the function being
traced:
/sys/kernel/debug/tracing # cat stack_trace
Depth Size Location (17 entries)
----- ---- --------
0) 3888 32 _raw_spin_trylock+0x8/0x70
1) 3856 576 get_page_from_freelist+0x26c/0x1ad0
2) 3280 64 __alloc_pages+0x290/0x1280
3) 3216 336 __folio_alloc+0x34/0x90
4) 2880 176 vma_alloc_folio+0xd8/0x540
5) 2704 416 __handle_mm_fault+0x700/0x1cc0
6) 2288 96 handle_mm_fault+0xf0/0x3f0
7) 2192 48 ___do_page_fault+0x3e4/0xbe0
8) 2144 192 do_page_fault+0x30/0xc0
9) 1952 608 data_access_common_virt+0x210/0x220
10) 1344 16 0xc0000000334bbb50
11) 1328 416 load_elf_binary+0x804/0x1b80
12) 912 64 bprm_execve+0x2d8/0x7e0
13) 848 176 do_execveat_common+0x1d0/0x2f0
14) 672 192 sys_execve+0x54/0x70
15) 480 64 system_call_exception+0x138/0x350
16) 416 416 system_call_common+0x160/0x2c4
This results in two additional stores in the ftrace entry code, but
produces reliable backtraces.
Fixes: 153086644f ("powerpc/ftrace: Add support for -mprofile-kernel ftrace ABI")
Cc: stable@vger.kernel.org
Signed-off-by: Naveen N Rao <naveen@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230621051349.759567-1-naveen@kernel.org
This registers the new fchmodat2 syscall in most places as nuber 452,
with alpha being the exception where it's 562. I found all these sites
by grepping for fspick, which I assume has found me everything.
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Alexey Gladkov <legion@kernel.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Message-Id: <a677d521f048e4ca439e7080a5328f21eb8e960e.1689092120.git.legion@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
This partly reverts commit 1e688dd2a3.
That commit aimed at optimising the code around generation of
WARN_ON/BUG_ON but this leads to a lot of dead code erroneously
generated by GCC.
That dead code becomes a problem when we start using objtool validation
because objtool will abort validation with a warning as soon as it
detects unreachable code. This is because unreachable code might
be the indication that objtool doesn't properly decode object text.
text data bss dec hex filename
9551585 3627834 224376 13403795 cc8693 vmlinux.before
9535281 3628358 224376 13388015 cc48ef vmlinux.after
Once this change is reverted, in a standard configuration (pmac32 +
function tracer) the text is reduced by 16k which is around 1.7%
We already had problem with it when starting to use objtool on powerpc
as a replacement for recordmcount, see commit 93e3f45a26 ("powerpc:
Fix __WARN_FLAGS() for use with Objtool")
There is also a problem with at least GCC 12, on ppc64_defconfig +
CONFIG_CC_OPTIMIZE_FOR_SIZE=y + CONFIG_DEBUG_SECTION_MISMATCH=y :
LD .tmp_vmlinux.kallsyms1
powerpc64-linux-ld: net/ipv4/tcp_input.o:(__ex_table+0xc4): undefined reference to `.L2136'
make[2]: *** [scripts/Makefile.vmlinux:36: vmlinux] Error 1
make[1]: *** [/home/chleroy/linux-powerpc/Makefile:1238: vmlinux] Error 2
Taking into account that other problems are encountered with that
'asm goto' in WARN_ON(), including build failures, keeping that
change is not worth it allthough it is primarily a compiler bug.
Revert it for now.
mpe: Retain EMIT_WARN_ENTRY as a synonym for EMIT_BUG_ENTRY to reduce
churn, as there are now nearly as many uses of EMIT_WARN_ENTRY as
EMIT_BUG_ENTRY.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Naveen N Rao <naveen@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230712134552.534955-1-mpe@ellerman.id.au
Since commit aec0ba7472 ("powerpc/64: Use -mprofile-kernel for big
endian ELFv2 kernels"), this file is checked by objtool. Fix warnings
such as:
arch/powerpc/kernel/idle_64e.o: warning: objtool: .text+0x20: unannotated intra-function call
arch/powerpc/kernel/exceptions-64e.o: warning: objtool: .text+0x218: unannotated intra-function call
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230622112451.735268-1-mpe@ellerman.id.au
Nageswara reported that /proc/self/status was showing "vulnerable" for
the Speculation_Store_Bypass feature on Power10, eg:
$ grep Speculation_Store_Bypass: /proc/self/status
Speculation_Store_Bypass: vulnerable
But at the same time the sysfs files, and lscpu, were showing "Not
affected".
This turns out to simply be a bug in the reporting of the
Speculation_Store_Bypass, aka. PR_SPEC_STORE_BYPASS, case.
When SEC_FTR_STF_BARRIER was added, so that firmware could communicate
the vulnerability was not present, the code in ssb_prctl_get() was not
updated to check the new flag.
So add the check for SEC_FTR_STF_BARRIER being disabled. Rather than
adding the new check to the existing if block and expanding the comment
to cover both cases, rewrite the three cases to be separate so they can
be commented separately for clarity.
Fixes: 84ed26fd00 ("powerpc/security: Add a security feature for STF barrier")
Cc: stable@vger.kernel.org # v5.14+
Reported-by: Nageswara R Sastry <rnsastry@linux.ibm.com>
Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230517074945.53188-1-mpe@ellerman.id.au
Here is the big set of tty/serial driver updates for 6.5-rc1.
Included in here are:
- tty_audit code cleanups from Jiri
- more 8250 cleanups from Ilpo
- samsung_tty driver bugfixes
- 8250 lock port updates
- usual fsl_lpuart driver updates and fixes
- other small serial driver fixes and updates, full details in the
shortlog.
All of these have been in linux-next for a while with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZKKVTQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+yl+wACgwWX8fdrkBHASPVkcYOn8xa27E08AnjNz2Y8K
vvOII6EEYKwFjEkjAoIX
=By18
-----END PGP SIGNATURE-----
Merge tag 'tty-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial driver updates from Greg KH:
"Here is the big set of tty/serial driver updates for 6.5-rc1.
Included in here are:
- tty_audit code cleanups from Jiri
- more 8250 cleanups from Ilpo
- samsung_tty driver bugfixes
- 8250 lock port updates
- usual fsl_lpuart driver updates and fixes
- other small serial driver fixes and updates, full details in the
shortlog
All of these have been in linux-next for a while with no reported
issues"
* tag 'tty-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (58 commits)
tty_audit: make data of tty_audit_log() const
tty_audit: make tty pointers in exposed functions const
tty_audit: make icanon a bool
tty_audit: invert the condition in tty_audit_log()
tty_audit: use kzalloc() in tty_audit_buf_alloc()
tty_audit: use TASK_COMM_LEN for task comm
Revert "8250: add support for ASIX devices with a FIFO bug"
serial: atmel: don't enable IRQs prematurely
tty: serial: Add Nuvoton ma35d1 serial driver support
tty: serial: fsl_lpuart: add earlycon for imx8ulp platform
tty: serial: imx: fix rs485 rx after tx
selftests: tty: add selftest for tty timestamp updates
tty: tty_io: update timestamps on all device nodes
tty: fix hang on tty device with no_room set
serial: core: fix -EPROBE_DEFER handling in init
serial: 8250_omap: Use force_suspend and resume for system suspend
tty: serial: samsung_tty: Use abs() to simplify some code
tty: serial: samsung_tty: Fix a memory leak in s3c24xx_serial_getclk() when iterating clk
tty: serial: samsung_tty: Fix a memory leak in s3c24xx_serial_getclk() in case of error
serial: 8250: Apply FSL workarounds also without SERIAL_8250_CONSOLE
...
- Remove the deprecated rule to build *.dtbo from *.dts
- Refactor section mismatch detection in modpost
- Fix bogus ARM section mismatch detections
- Fix error of 'make gtags' with O= option
- Add Clang's target triple to KBUILD_CPPFLAGS to fix a build error with
the latest LLVM version
- Rebuild the built-in initrd when KBUILD_BUILD_TIMESTAMP is changed
- Ignore more compiler-generated symbols for kallsyms
- Fix 'make local*config' to handle the ${CONFIG_FOO} form in Makefiles
- Enable more kernel-doc warnings with W=2
- Refactor <linux/export.h> by generating KSYMTAB data by modpost
- Deprecate <asm/export.h> and <asm-generic/export.h>
- Remove the EXPORT_DATA_SYMBOL macro
- Move the check for static EXPORT_SYMBOL back to modpost, which makes
the build faster
- Re-implement CONFIG_TRIM_UNUSED_KSYMS with one-pass algorithm
- Warn missing MODULE_DESCRIPTION when building modules with W=1
- Make 'make clean' robust against too long argument error
- Exclude more objects from GCOV to fix CFI failures with GCOV
- Allow 'make modules_install' to install modules.builtin and
modules.builtin.modinfo even when CONFIG_MODULES is disabled
- Include modules.builtin and modules.builtin.modinfo in the linux-image
Debian package even when CONFIG_MODULES is disabled
- Revive "Entering directory" logging for the latest Make version
-----BEGIN PGP SIGNATURE-----
iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmSf6B0VHG1hc2FoaXJv
eUBrZXJuZWwub3JnAAoJED2LAQed4NsGS2wP/1izzNJ/64XmQoyBDhZCbuOl7ODF
n4wgVJnsJmRnD/RxXR/AZ0JZwQHhzpGISWQM61rVIf/RVFOB7Apx1HpmomKUUjrL
Yc53wLfhTEizGgwttP6tusLM3RO6jkuMKhjC4rllc0tDLJ3zCcwAjSyiOQQ9PBcH
txwAb8r4/TZUzDDCJ0d98WdhIsNDca/ISeRXKHMiIkfvHe+6yizDKu25Y4B6BL5g
0VPJ9nVJZ+XVwRqdVR+UQoPYGZzZ/O2NqAtU7n4PpBKvFfLACILJW+aBDAz9SqN7
RSxn1ahxwq0vrhlB9bSrQRj3N0g8zsi7/xShEZSnGLCbyxYilr5Gq8C59+QxOIJf
5lGBwZlEgn5aWH+D9abwjEI/QOQbTI9kX09sVzweulGCN9iJlJqyIGsB0Ri0/S2R
c/n7c8nLwnWnGF/+LXYvkrak8L9YRKori//YYf9zdvh4h1c2/0SS0nDoC29DhDru
Am7YmhBAkJXXX3NUB2gLvtdp94GSumqefHeSJ5Sp9v/+f2Ft7ruY2ouJC81xDa4p
nNpvolAq2txlZ9t5OU7x7DQiuCWYSws0W7PJ9FBhyHJchf21UHbcm97/HfDoU8rN
ioLQGm+h+g6oZt8pArk45wccjkR3ydpEFDWenYbTEr2o3zLfeKigZps5uhCK3DW2
gnVk50VNagkzrzvA
=Rc1z
-----END PGP SIGNATURE-----
Merge tag 'kbuild-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- Remove the deprecated rule to build *.dtbo from *.dts
- Refactor section mismatch detection in modpost
- Fix bogus ARM section mismatch detections
- Fix error of 'make gtags' with O= option
- Add Clang's target triple to KBUILD_CPPFLAGS to fix a build error
with the latest LLVM version
- Rebuild the built-in initrd when KBUILD_BUILD_TIMESTAMP is changed
- Ignore more compiler-generated symbols for kallsyms
- Fix 'make local*config' to handle the ${CONFIG_FOO} form in Makefiles
- Enable more kernel-doc warnings with W=2
- Refactor <linux/export.h> by generating KSYMTAB data by modpost
- Deprecate <asm/export.h> and <asm-generic/export.h>
- Remove the EXPORT_DATA_SYMBOL macro
- Move the check for static EXPORT_SYMBOL back to modpost, which makes
the build faster
- Re-implement CONFIG_TRIM_UNUSED_KSYMS with one-pass algorithm
- Warn missing MODULE_DESCRIPTION when building modules with W=1
- Make 'make clean' robust against too long argument error
- Exclude more objects from GCOV to fix CFI failures with GCOV
- Allow 'make modules_install' to install modules.builtin and
modules.builtin.modinfo even when CONFIG_MODULES is disabled
- Include modules.builtin and modules.builtin.modinfo in the
linux-image Debian package even when CONFIG_MODULES is disabled
- Revive "Entering directory" logging for the latest Make version
* tag 'kbuild-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (72 commits)
modpost: define more R_ARM_* for old distributions
kbuild: revive "Entering directory" for Make >= 4.4.1
kbuild: set correct abs_srctree and abs_objtree for package builds
scripts/mksysmap: Ignore prefixed KCFI symbols
kbuild: deb-pkg: remove the CONFIG_MODULES check in buildeb
kbuild: builddeb: always make modules_install, to install modules.builtin*
modpost: continue even with unknown relocation type
modpost: factor out Elf_Sym pointer calculation to section_rel()
modpost: factor out inst location calculation to section_rel()
kbuild: Disable GCOV for *.mod.o
kbuild: Fix CFI failures with GCOV
kbuild: make clean rule robust against too long argument error
script: modpost: emit a warning when the description is missing
kbuild: make modules_install copy modules.builtin(.modinfo)
linux/export.h: rename 'sec' argument to 'license'
modpost: show offset from symbol for section mismatch warnings
modpost: merge two similar section mismatch warnings
kbuild: implement CONFIG_TRIM_UNUSED_KSYMS without recursion
modpost: use null string instead of NULL pointer for default namespace
modpost: squash sym_update_namespace() into sym_add_exported()
...
-----BEGIN PGP SIGNATURE-----
iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmSdtQYUHGJoZWxnYWFz
QGdvb2dsZS5jb20ACgkQWYigwDrT+vxp7A/+KmoBOm9ytiM2HPiq6pmHiJ9zF/DP
0jvKqDlc0BkmCyRw+/woxA5ZQgDnIYXxxt31toeSu+n31p6AR4wZ14Q5HBahABMw
O/NUXmLAhYaczcp8hK4lS4Opz65+MaDHomu5fNuD7j7CagIwu20MegSEoyo35YC1
nDRN0IVYRRy/58wW509deQi/3U04TuC3kflc/iBToa/34g77L9ESoxpsZuAzo7wI
nc9DF28H6PkuOtnp26iVufqkeYD3wfE5VAtaIZhyO+/WkhcTTUU5JB2hgFSACr0G
qJTofncvGQrRYTNS7aIYPVrtTZpSMmPS08tgZc+iDkTr8iKth1jcPf3YHLpenQzx
2B197BUOLENOiWJFPIAe2TAgoGYYBgKhnnwcPHCHoinvVLTO82wUUW5qfn/GckgY
WNYmS4PofkjlbJH3JdyHdH+vsL0VRzAmkhH+k6F+V02T78Lk+QdXKegLbr5yzRwh
YF/GjX0JYjnONQeL1LQQ/4hfiA1VzmoXbHvXc0XJew6d/RYMon5G5xBAAZ8tnUHC
PX6WMd/CG8RBxFv7IsPh6hTGKEXw4/1ElynPVP/ZEBGVelHda4Sm4G5nJ6/r2cwK
VICfsSb9Nw76STGHpVeQB2O2Y/yZ1xIwWRiAsCndsYOlnGi+knRVxH5UH/1aQPOE
N3DsyT0s/sZaJvc=
=xpTd
-----END PGP SIGNATURE-----
Merge tag 'pci-v6.5-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull pci updates from Bjorn Helgaas:
"Enumeration:
- Export pcie_retrain_link() for use outside ASPM
- Add Data Link Layer Link Active Reporting as another way for
pcie_retrain_link() to determine the link is up
- Work around link training failures (especially on the ASMedia
ASM2824 switch) by training first at 2.5GT/s and then attempting
higher rates
Resource management:
- When we coalesce host bridge windows, remove invalidated resources
from the resource tree so future allocations work correctly
Hotplug:
- Cancel bringup sequence if card is not present, to keep from
blinking Power Indicator indefinitely
- Reassign bridge resources if necessary for ACPI hotplug
Driver binding:
- Convert platform_device .remove() callbacks to return void instead
of a mostly useless int
Power management:
- Reduce wait time for secondary bus to be ready to speed up resume
- Avoid putting EloPOS E2/S2/H2 (as well as Elo i2) PCIe Ports in
D3cold
- Call _REG when transitioning D-states so AML that uses the PCI
config space OpRegion works, which fixes some ASMedia GPIO
controllers after resume
Virtualization:
- Delay extra 250ms after FLR of Solidigm P44 Pro NVMe to avoid KVM
hang when guest is rebooted
- Add function 1 DMA alias quirk for Marvell 88SE9235
Error handling:
- Unexport pci_save_aer_state() since it's only used in drivers/pci/
- Drop recommendation for drivers to configure AER Capability, since
the PCI core does this for all devices
ASPM:
- Disable ASPM on MFD function removal to avoid use-after-free
- Tighten up pci_enable_link_state() and pci_disable_link_state()
interfaces so they don't enable/disable states the driver didn't
specify
- Avoid link retraining race that can happen if ASPM sets link
control parameters while the link is in the midst of training for
some other reason
Endpoint framework:
- Change "PCI Endpoint Virtual NTB driver" Kconfig prompt to be
different from "PCI Endpoint NTB driver"
- Automatically create a function specific attributes group for
endpoint drivers to avoid reference counting issues
- Fix many EPC test issues
- Return pci_epf_type_add_cfs() error if EPF has no driver
- Add kernel-doc for pci_epc_raise_irq() and pci_epc_map_msi_irq()
MSI vector parameters
- Pass EPF device ID to driver probe functions
- Return -EALREADY if EPC has already been started/stopped
- Add linkdown notifier support and use it in qcom-ep
- Add Bus Master Enable event support and use it in qcom-ep
- Add Qualcomm Modem Host Interface (MHI) endpoint driver
- Add Layerscape PME interrupt handling to manage link-up
notification
Cadence PCIe controller driver:
- Wait for link retrain to complete when working around the J721E
i2085 erratum with Gen2 mode
Faraday FTPC100 PCI controller driver:
- Release clock resources on error paths
Freescale i.MX6 PCIe controller driver:
- Save and restore Root Port MSI control to work around hardware defect
Intel VMD host bridge driver:
- Reset VMD config register between soft reboots
- Capture pci_reset_bus() return value instead of printing junk when
it fails
Qualcomm PCIe controller driver:
- Add SDX65 endpoint compatible string to DT binding
- Disable register write access after init for IP v2.3.3, v2.9.0
- Use DWC helpers for enabling/disabling writes to DBI registers
- Hide slot hotplug capability for IP v1.0.0, v1.9.0, v2.1.0, v2.3.2,
v2.3.3, v2.7.0, v2.9.0
- Reuse v2.3.2 post-init sequence for v2.4.0
Renesas R-Car PCIe controller driver:
- Remove unused static pcie_base and pcie_dev
Rockchip PCIe controller driver:
- Remove writes to unused registers
- Write endpoint Device ID using correct register
- Assert PCI Configuration Enable bit after probe so endpoint
responds instead of generating Request Retry Status messages
- Poll waiting for PHY PLLs to lock
- Update RK3399 example DT binding to be valid
- Use RK3399 PCIE_CLIENT_LEGACY_INT_CTRL to generate INTx instead of
manually generating PCIe message
- Use multiple windows to avoid address translation conflicts
- Use u32 (not u16) when accessing 32-bit registers
- Hide MSI-X Capability, since RK3399 can't generate MSI-X
- Set endpoint controller required alignment to 256
Synopsys DesignWare PCIe controller driver:
- Wait for link to come up only if we've initiated link training
Miscellaneous:
- Add pci_clear_master() stub for non-CONFIG_PCI"
* tag 'pci-v6.5-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (116 commits)
Documentation: PCI: correct spelling
PCI: vmd: Fix uninitialized variable usage in vmd_enable_domain()
PCI: xgene-msi: Convert to platform remove callback returning void
PCI: tegra: Convert to platform remove callback returning void
PCI: rockchip-host: Convert to platform remove callback returning void
PCI: mvebu: Convert to platform remove callback returning void
PCI: mt7621: Convert to platform remove callback returning void
PCI: mediatek-gen3: Convert to platform remove callback returning void
PCI: mediatek: Convert to platform remove callback returning void
PCI: iproc: Convert to platform remove callback returning void
PCI: hisi-error: Convert to platform remove callback returning void
PCI: dwc: Convert to platform remove callback returning void
PCI: j721e: Convert to platform remove callback returning void
PCI: brcmstb: Convert to platform remove callback returning void
PCI: altera-msi: Convert to platform remove callback returning void
PCI: altera: Convert to platform remove callback returning void
PCI: aardvark: Convert to platform remove callback returning void
PCI: rcar: Use correct product family name for Renesas R-Car
PCI: layerscape: Add the endpoint linkup notifier support
PCI: endpoint: pci-epf-vntb: Fix typo in comments
...
- Extend KCSAN support to 32-bit and BookE. Add some KCSAN annotations.
- Make ELFv2 ABI the default for 64-bit big-endian kernel builds, and use
the -mprofile-kernel option (kernel specific ftrace ABI) for big endian
ELFv2 kernels.
- Add initial Dynamic Execution Control Register (DEXCR) support, and allow
the ROP protection instructions to be used on Power 10.
- Various other small features and fixes.
Thanks to: Aditya Gupta, Aneesh Kumar K.V, Benjamin Gray, Brian King,
Christophe Leroy, Colin Ian King, Dmitry Torokhov, Gaurav Batra, Jean Delvare,
Joel Stanley, Marco Elver, Masahiro Yamada, Nageswara R Sastry, Nathan
Chancellor, Naveen N Rao, Nayna Jain, Nicholas Piggin, Paul Gortmaker, Randy
Dunlap, Rob Herring, Rohan McLure, Russell Currey, Sachin Sant, Timothy
Pearson, Tom Rix, Uwe Kleine-König.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmSeqQMTHG1wZUBlbGxl
cm1hbi5pZC5hdQAKCRBR6+o8yOGlgKukD/sGUceX6gIc7UcjWhL1ZCVco0bsgLjT
JrY1NenisGKjwwRd/o+2+h3ziJDoO5AsQfT72EiNLhaYJhnlb1d0vXzsvN0THc+2
W5RrxAZUNhBy+c7gSSEJjy8+vBIwSQAliQLChHGOSejGCj94SxF5+zjUFvSX458I
z0+ZQK+Fiw5NcpzEnBT0XPnLzap74a7TL0JcG1MLbj2QtHXhbfjIlkkPDX3kK0Gw
xbelFy38X7KKbQsXXYSTCGqwRdJ3yqu21nEsjRuo2yT5H5rQbjCNggkMOL1DecDd
ULGxit/z13Pt1Ad3oe+6FF17ggOiCG0F75DONASjFDthFYx6NQffkJS1n1VZauQj
jU6LtWeD3HkgIYm6Udjq+LaSmkAmn5a+9tsElE/K+V1WG4rKyMeVmE3z/tCJG0l2
yhLKyFs+glXN/LiWHyX0mrQIIVZdRK237X1qXJuIvAuB7Drm5duXFAHR8pdJD0dg
H23OhoO2FvLxb9GvnzGxqjdazzattctz31wU/1RgnPxumYkJ9PlBcXn9h1uXa8/K
rDZFJADsQhEfRCjmLG3GIaFWqZdc4Cn+ZUk4iHkjPDFDL05Fq7JYHIuwteiN6/wP
NHRvtKdJisu583NI9RN9300JykrEqjSRbMOWlc3vuFwbRLGioXvWhWlIZ3/t58jG
R8s+f0nKSPr+fg==
=ssit
-----END PGP SIGNATURE-----
Merge tag 'powerpc-6.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
- Extend KCSAN support to 32-bit and BookE. Add some KCSAN annotations
- Make ELFv2 ABI the default for 64-bit big-endian kernel builds, and
use the -mprofile-kernel option (kernel specific ftrace ABI) for big
endian ELFv2 kernels
- Add initial Dynamic Execution Control Register (DEXCR) support, and
allow the ROP protection instructions to be used on Power 10
- Various other small features and fixes
Thanks to Aditya Gupta, Aneesh Kumar K.V, Benjamin Gray, Brian King,
Christophe Leroy, Colin Ian King, Dmitry Torokhov, Gaurav Batra, Jean
Delvare, Joel Stanley, Marco Elver, Masahiro Yamada, Nageswara R Sastry,
Nathan Chancellor, Naveen N Rao, Nayna Jain, Nicholas Piggin, Paul
Gortmaker, Randy Dunlap, Rob Herring, Rohan McLure, Russell Currey,
Sachin Sant, Timothy Pearson, Tom Rix, and Uwe Kleine-König.
* tag 'powerpc-6.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (76 commits)
powerpc: remove checks for binutils older than 2.25
powerpc: Fail build if using recordmcount with binutils v2.37
powerpc/iommu: TCEs are incorrectly manipulated with DLPAR add/remove of memory
powerpc/iommu: Only build sPAPR access functions on pSeries
powerpc: powernv: Annotate data races in opal events
powerpc: Mark writes registering ipi to host cpu through kvm and polling
powerpc: Annotate accesses to ipi message flags
powerpc: powernv: Fix KCSAN datarace warnings on idle_state contention
powerpc: Mark [h]ssr_valid accesses in check_return_regs_valid
powerpc: qspinlock: Enforce qnode writes prior to publishing to queue
powerpc: qspinlock: Mark accesses to qnode lock checks
powerpc/powernv/pci: Remove last IODA1 defines
powerpc/powernv/pci: Remove MVE code
powerpc/powernv/pci: Remove ioda1 support
powerpc: 52xx: Make immr_id DT match tables static
powerpc: mpc512x: Remove open coded "ranges" parsing
powerpc: fsl_soc: Use of_range_to_resource() for "ranges" parsing
powerpc: fsl: Use of_property_read_reg() to parse "reg"
powerpc: fsl_rio: Use of_range_to_resource() for "ranges" parsing
macintosh: Use of_property_read_reg() to parse "reg"
...
top-level directories.
- Douglas Anderson has added a new "buddy" mode to the hardlockup
detector. It permits the detector to work on architectures which
cannot provide the required interrupts, by having CPUs periodically
perform checks on other CPUs.
- Zhen Lei has enhanced kexec's ability to support two crash regions.
- Petr Mladek has done a lot of cleanup on the hard lockup detector's
Kconfig entries.
- And the usual bunch of singleton patches in various places.
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZJelTAAKCRDdBJ7gKXxA
juDkAP0VXWynzkXoojdS/8e/hhi+htedmQ3v2dLZD+vBrctLhAEA7rcH58zAVoWa
2ejqO6wDrRGUC7JQcO9VEjT0nv73UwU=
=F293
-----END PGP SIGNATURE-----
Merge tag 'mm-nonmm-stable-2023-06-24-19-23' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-mm updates from Andrew Morton:
- Arnd Bergmann has fixed a bunch of -Wmissing-prototypes in top-level
directories
- Douglas Anderson has added a new "buddy" mode to the hardlockup
detector. It permits the detector to work on architectures which
cannot provide the required interrupts, by having CPUs periodically
perform checks on other CPUs
- Zhen Lei has enhanced kexec's ability to support two crash regions
- Petr Mladek has done a lot of cleanup on the hard lockup detector's
Kconfig entries
- And the usual bunch of singleton patches in various places
* tag 'mm-nonmm-stable-2023-06-24-19-23' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (72 commits)
kernel/time/posix-stubs.c: remove duplicated include
ocfs2: remove redundant assignment to variable bit_off
watchdog/hardlockup: fix typo in config HARDLOCKUP_DETECTOR_PREFER_BUDDY
powerpc: move arch_trigger_cpumask_backtrace from nmi.h to irq.h
devres: show which resource was invalid in __devm_ioremap_resource()
watchdog/hardlockup: define HARDLOCKUP_DETECTOR_ARCH
watchdog/sparc64: define HARDLOCKUP_DETECTOR_SPARC64
watchdog/hardlockup: make HAVE_NMI_WATCHDOG sparc64-specific
watchdog/hardlockup: declare arch_touch_nmi_watchdog() only in linux/nmi.h
watchdog/hardlockup: make the config checks more straightforward
watchdog/hardlockup: sort hardlockup detector related config values a logical way
watchdog/hardlockup: move SMP barriers from common code to buddy code
watchdog/buddy: simplify the dependency for HARDLOCKUP_DETECTOR_PREFER_BUDDY
watchdog/buddy: don't copy the cpumask in watchdog_next_cpu()
watchdog/buddy: cleanup how watchdog_buddy_check_hardlockup() is called
watchdog/hardlockup: remove softlockup comment in touch_nmi_watchdog()
watchdog/hardlockup: in watchdog_hardlockup_check() use cpumask_copy()
watchdog/hardlockup: don't use raw_cpu_ptr() in watchdog_hardlockup_kick()
watchdog/hardlockup: HAVE_NMI_WATCHDOG must implement watchdog_hardlockup_probe()
watchdog/hardlockup: keep kernel.nmi_watchdog sysctl as 0444 if probe fails
...
- Yosry has also eliminated cgroup's atomic rstat flushing.
- Nhat Pham adds the new cachestat() syscall. It provides userspace
with the ability to query pagecache status - a similar concept to
mincore() but more powerful and with improved usability.
- Mel Gorman provides more optimizations for compaction, reducing the
prevalence of page rescanning.
- Lorenzo Stoakes has done some maintanance work on the get_user_pages()
interface.
- Liam Howlett continues with cleanups and maintenance work to the maple
tree code. Peng Zhang also does some work on maple tree.
- Johannes Weiner has done some cleanup work on the compaction code.
- David Hildenbrand has contributed additional selftests for
get_user_pages().
- Thomas Gleixner has contributed some maintenance and optimization work
for the vmalloc code.
- Baolin Wang has provided some compaction cleanups,
- SeongJae Park continues maintenance work on the DAMON code.
- Huang Ying has done some maintenance on the swap code's usage of
device refcounting.
- Christoph Hellwig has some cleanups for the filemap/directio code.
- Ryan Roberts provides two patch series which yield some
rationalization of the kernel's access to pte entries - use the provided
APIs rather than open-coding accesses.
- Lorenzo Stoakes has some fixes to the interaction between pagecache
and directio access to file mappings.
- John Hubbard has a series of fixes to the MM selftesting code.
- ZhangPeng continues the folio conversion campaign.
- Hugh Dickins has been working on the pagetable handling code, mainly
with a view to reducing the load on the mmap_lock.
- Catalin Marinas has reduced the arm64 kmalloc() minimum alignment from
128 to 8.
- Domenico Cerasuolo has improved the zswap reclaim mechanism by
reorganizing the LRU management.
- Matthew Wilcox provides some fixups to make gfs2 work better with the
buffer_head code.
- Vishal Moola also has done some folio conversion work.
- Matthew Wilcox has removed the remnants of the pagevec code - their
functionality is migrated over to struct folio_batch.
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZJejewAKCRDdBJ7gKXxA
joggAPwKMfT9lvDBEUnJagY7dbDPky1cSYZdJKxxM2cApGa42gEA6Cl8HRAWqSOh
J0qXCzqaaN8+BuEyLGDVPaXur9KirwY=
=B7yQ
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2023-06-24-19-15' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull mm updates from Andrew Morton:
- Yosry Ahmed brought back some cgroup v1 stats in OOM logs
- Yosry has also eliminated cgroup's atomic rstat flushing
- Nhat Pham adds the new cachestat() syscall. It provides userspace
with the ability to query pagecache status - a similar concept to
mincore() but more powerful and with improved usability
- Mel Gorman provides more optimizations for compaction, reducing the
prevalence of page rescanning
- Lorenzo Stoakes has done some maintanance work on the
get_user_pages() interface
- Liam Howlett continues with cleanups and maintenance work to the
maple tree code. Peng Zhang also does some work on maple tree
- Johannes Weiner has done some cleanup work on the compaction code
- David Hildenbrand has contributed additional selftests for
get_user_pages()
- Thomas Gleixner has contributed some maintenance and optimization
work for the vmalloc code
- Baolin Wang has provided some compaction cleanups,
- SeongJae Park continues maintenance work on the DAMON code
- Huang Ying has done some maintenance on the swap code's usage of
device refcounting
- Christoph Hellwig has some cleanups for the filemap/directio code
- Ryan Roberts provides two patch series which yield some
rationalization of the kernel's access to pte entries - use the
provided APIs rather than open-coding accesses
- Lorenzo Stoakes has some fixes to the interaction between pagecache
and directio access to file mappings
- John Hubbard has a series of fixes to the MM selftesting code
- ZhangPeng continues the folio conversion campaign
- Hugh Dickins has been working on the pagetable handling code, mainly
with a view to reducing the load on the mmap_lock
- Catalin Marinas has reduced the arm64 kmalloc() minimum alignment
from 128 to 8
- Domenico Cerasuolo has improved the zswap reclaim mechanism by
reorganizing the LRU management
- Matthew Wilcox provides some fixups to make gfs2 work better with the
buffer_head code
- Vishal Moola also has done some folio conversion work
- Matthew Wilcox has removed the remnants of the pagevec code - their
functionality is migrated over to struct folio_batch
* tag 'mm-stable-2023-06-24-19-15' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (380 commits)
mm/hugetlb: remove hugetlb_set_page_subpool()
mm: nommu: correct the range of mmap_sem_read_lock in task_mem()
hugetlb: revert use of page_cache_next_miss()
Revert "page cache: fix page_cache_next/prev_miss off by one"
mm/vmscan: fix root proactive reclaim unthrottling unbalanced node
mm: memcg: rename and document global_reclaim()
mm: kill [add|del]_page_to_lru_list()
mm: compaction: convert to use a folio in isolate_migratepages_block()
mm: zswap: fix double invalidate with exclusive loads
mm: remove unnecessary pagevec includes
mm: remove references to pagevec
mm: rename invalidate_mapping_pagevec to mapping_try_invalidate
mm: remove struct pagevec
net: convert sunrpc from pagevec to folio_batch
i915: convert i915_gpu_error to use a folio_batch
pagevec: rename fbatch_count()
mm: remove check_move_unevictable_pages()
drm: convert drm_gem_put_pages() to use a folio_batch
i915: convert shmem_sg_free_table() to use a folio_batch
scatterlist: add sg_set_folio()
...