Recently, ld.lld moved from '--undefined-version' to
'--no-undefined-version' as the default, which breaks the compat vDSO
build:
ld.lld: error: version script assignment of 'LINUX_4.15' to symbol '__vdso_gettimeofday' failed: symbol not defined
ld.lld: error: version script assignment of 'LINUX_4.15' to symbol '__vdso_clock_gettime' failed: symbol not defined
ld.lld: error: version script assignment of 'LINUX_4.15' to symbol '__vdso_clock_getres' failed: symbol not defined
These symbols are not present in the compat vDSO or the regular vDSO for
32-bit but they are unconditionally included in the version section of
the linker script, which is prohibited with '--no-undefined-version'.
Fix this issue by only including the symbols that are actually exported
in the version section of the linker script.
Link: https://github.com/ClangBuiltLinux/linux/issues/1756
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20221108171324.3377226-1-nathan@kernel.org/
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Currently, RISC-V sets up reserved memory using the "early" copy of the
device tree. As a result, when trying to get a reserved memory region
using of_reserved_mem_lookup(), the pointer to reserved memory regions
is using the early, pre-virtual-memory address which causes a kernel
panic when trying to use the buffer's name:
Unable to handle kernel paging request at virtual address 00000000401c31ac
Oops [#1]
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 6.0.0-rc1-00001-g0d9d6953d834 #1
Hardware name: Microchip PolarFire-SoC Icicle Kit (DT)
epc : string+0x4a/0xea
ra : vsnprintf+0x1e4/0x336
epc : ffffffff80335ea0 ra : ffffffff80338936 sp : ffffffff81203be0
gp : ffffffff812e0a98 tp : ffffffff8120de40 t0 : 0000000000000000
t1 : ffffffff81203e28 t2 : 7265736572203a46 s0 : ffffffff81203c20
s1 : ffffffff81203e28 a0 : ffffffff81203d22 a1 : 0000000000000000
a2 : ffffffff81203d08 a3 : 0000000081203d21 a4 : ffffffffffffffff
a5 : 00000000401c31ac a6 : ffff0a00ffffff04 a7 : ffffffffffffffff
s2 : ffffffff81203d08 s3 : ffffffff81203d00 s4 : 0000000000000008
s5 : ffffffff000000ff s6 : 0000000000ffffff s7 : 00000000ffffff00
s8 : ffffffff80d9821a s9 : ffffffff81203d22 s10: 0000000000000002
s11: ffffffff80d9821c t3 : ffffffff812f3617 t4 : ffffffff812f3617
t5 : ffffffff812f3618 t6 : ffffffff81203d08
status: 0000000200000100 badaddr: 00000000401c31ac cause: 000000000000000d
[<ffffffff80338936>] vsnprintf+0x1e4/0x336
[<ffffffff80055ae2>] vprintk_store+0xf6/0x344
[<ffffffff80055d86>] vprintk_emit+0x56/0x192
[<ffffffff80055ed8>] vprintk_default+0x16/0x1e
[<ffffffff800563d2>] vprintk+0x72/0x80
[<ffffffff806813b2>] _printk+0x36/0x50
[<ffffffff8068af48>] print_reserved_mem+0x1c/0x24
[<ffffffff808057ec>] paging_init+0x528/0x5bc
[<ffffffff808031ae>] setup_arch+0xd0/0x592
[<ffffffff8080070e>] start_kernel+0x82/0x73c
early_init_fdt_scan_reserved_mem() takes no arguments as it operates on
initial_boot_params, which is populated by early_init_dt_verify(). On
RISC-V, early_init_dt_verify() is called twice. Once, directly, in
setup_arch() if CONFIG_BUILTIN_DTB is not enabled and once indirectly,
very early in the boot process, by parse_dtb() when it calls
early_init_dt_scan_nodes().
This first call uses dtb_early_va to set initial_boot_params, which is
not usable later in the boot process when
early_init_fdt_scan_reserved_mem() is called. On arm64 for example, the
corresponding call to early_init_dt_scan_nodes() uses fixmap addresses
and doesn't suffer the same fate.
Move early_init_fdt_scan_reserved_mem() further along the boot sequence,
after the direct call to early_init_dt_verify() in setup_arch() so that
the names use the correct virtual memory addresses. The above supposed
that CONFIG_BUILTIN_DTB was not set, but should work equally in the case
where it is - unflatted_and_copy_device_tree() also updates
initial_boot_params.
Reported-by: Valentina Fernandez <valentina.fernandezalanis@microchip.com>
Reported-by: Evgenii Shatokhin <e.shatokhin@yadro.com>
Link: https://lore.kernel.org/linux-riscv/f8e67f82-103d-156c-deb0-d6d6e2756f5e@microchip.com/
Fixes: 922b0375fc ("riscv: Fix memblock reservation for device tree blob")
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Tested-by: Evgenii Shatokhin <e.shatokhin@yadro.com>
Link: https://lore.kernel.org/r/20221107151524.3941467-1-conor.dooley@microchip.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Even after commit 89fd4a1df8 ("riscv: jump_label: mark arguments as
const to satisfy asm constraints"), building with CC_OPTIMIZE_FOR_SIZE
+ LLVM=1 can reproduce below build error:
CC arch/riscv/kernel/vdso/vgettimeofday.o
In file included from <built-in>:4:
In file included from lib/vdso/gettimeofday.c:5:
In file included from include/vdso/datapage.h:17:
In file included from include/vdso/processor.h:10:
In file included from arch/riscv/include/asm/vdso/processor.h:7:
In file included from include/linux/jump_label.h:112:
arch/riscv/include/asm/jump_label.h:42:3: error:
invalid operand for inline asm constraint 'i'
" .option push \n\t"
^
1 error generated.
I think the problem is when "-Os" is passed as CFLAGS, it's removed by
"CFLAGS_REMOVE_vgettimeofday.o = $(CC_FLAGS_FTRACE) -Os" which is
introduced in commit e05d57dcb8 ("riscv: Fixup __vdso_gettimeofday
broke dynamic ftrace"), thus no optimization at all for vgettimeofday.c
arm64 does remove "-Os" as well, but it forces "-O2" after removing
"-Os".
I compared the generated vgettimeofday.o with "-O2" and "-Os",
I think no big performance difference. So let's tell the kbuild not
to remove "-Os" rather than follow arm64 style.
vdso related performance can be improved a lot when building kernel with
CC_OPTIMIZE_FOR_SIZE after this commit, ("-Os" VS no optimization)
Fixes: e05d57dcb8 ("riscv: Fixup __vdso_gettimeofday broke dynamic ftrace")
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20221031182943.2453-1-jszhang@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Commit 78e5a33994 ("cpumask: fix checking valid cpu range") has
started issuing warnings[*] when cpu indices equal to nr_cpu_ids - 1
are passed to cpumask_next* functions. seq_read_iter() and cpuinfo's
start and next seq operations implement a pattern like
n = cpumask_next(n - 1, mask);
show(n);
while (1) {
++n;
n = cpumask_next(n - 1, mask);
if (n >= nr_cpu_ids)
break;
show(n);
}
which will issue the warning when reading /proc/cpuinfo. Ensure no
warning is generated by validating the cpu index before calling
cpumask_next().
[*] Warnings will only appear with DEBUG_PER_CPU_MAPS enabled.
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Acked-by: Yury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/r/20221014155845.1986223-2-ajones@ventanamicro.com/
Fixes: 78e5a33994 ("cpumask: fix checking valid cpu range")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
* A handful of DT updates for the PolarFire SOC.
* A fix to correct the handling of write-only mappings.
* m{vetndor,arcd,imp}id is now in /proc/cpuinfo
* The SiFive L2 cache controller support has been refactored to also
support L3 caches.
There's also a handful of fixes, cleanups and improvements throughout
the tree.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmNJiegTHHBhbG1lckBk
YWJiZWx0LmNvbQAKCRAuExnzX7sYiaW+D/9nN7JMKD4KbED8MUgRVcs+TS4MQk5J
QkjmAXme7w2H1+T5mxNWHk0QvEC6qMu2JQWHeott0ROYPkbXoNtuOOkzcsZhVaeb
rjWH/WC5QhEeMPDc1qc0AmxVkOa937f2NkGtoEvhW6SMWvStHpefdOHo/ij106Re
7wxkcj0fjgn/zPmDltRbWSMyFZWJec5DvZ3AB4NYHnc2ycr8Z7HAG08rjZkdkIjQ
zYGsCkUtrk4qShikKK3cSelW6hH1/FdM+bAo0rzt1frmTw1FLaFOsZZjOaVYekzi
l0jxsb8FRBWyKBFqcagukjZwFy6D+7Q+masb0cXY03eZpEVrroA4bHiPkuZ22Ms/
7ol/I5FvTyrj2R4zd70ziYVF3usO78t5HC4AIQmFl25TNcQdYQd8X28yh12iG6QN
Pa0lh/EOr5idaT2+TErhzRepICnK1Nj9y0H5TZxYljLAhH9j0d/8+Iw88vzJnPga
vek7unZ3BzkooLVIpfpkT8vC94MA0hoP849MVFQDtZgZhPMbjIkN91HrDiw9ktV2
SK9cuPndfrs5nW1WKu8F0cDziusbMHv4F51TPVRu/dFcIkspv+aLojAThgvcm42K
55LgvDgLjo7P3PUghiDXUoPZXvL19t5Bnq2/E47rlSTvvGssIu/onZO6BkxybAm6
BkHuchr8TGBWMg==
=iCOr
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-6.1-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull more RISC-V updates from Palmer Dabbelt:
- DT updates for the PolarFire SOC
- a fix to correct the handling of write-only mappings
- m{vetndor,arcd,imp}id is now in /proc/cpuinfo
- the SiFive L2 cache controller support has been refactored to also
support L3 caches
- misc fixes, cleanups and improvements throughout the tree
* tag 'riscv-for-linus-6.1-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (42 commits)
MAINTAINERS: add RISC-V's patchwork
RISC-V: Make port I/O string accessors actually work
riscv: enable software resend of irqs
RISC-V: Re-enable counter access from userspace
riscv: vdso: fix NULL deference in vdso_join_timens() when vfork
riscv: Add cache information in AUX vector
soc: sifive: ccache: define the macro for the register shifts
soc: sifive: ccache: use pr_fmt() to remove CCACHE: prefixes
soc: sifive: ccache: reduce printing on init
soc: sifive: ccache: determine the cache level from dts
soc: sifive: ccache: Rename SiFive L2 cache to Composable cache.
dt-bindings: sifive-ccache: change Sifive L2 cache to Composable cache
riscv: check for kernel config option in t-head memory types errata
riscv: use BIT() marco for cpufeature probing
riscv: use BIT() macros in t-head errata init
riscv: drop some idefs from CMO initialization
riscv: cleanup svpbmt cpufeature probing
riscv: Pass -mno-relax only on lld < 15.0.0
RISC-V: Avoid dereferening NULL regs in die()
dt-bindings: riscv: add new riscv,isa strings for emulators
...
I'm merging this in as a single commit as it's a dependency for some
other work.
* commit '3baca1a4d490484fcd555413f1fec85b2e071912':
RISC-V: Add mvendorid, marchid, and mimpid to /proc/cpuinfo output
Commit 2139619bca ("riscv: mmap with PROT_WRITE but no PROT_READ is
invalid") made mmap() reject mappings with only PROT_WRITE set in an
attempt to fix an observed inconsistency in behavior when attempting
to read from a PROT_WRITE-only mapping. The root cause of this behavior
was actually that while RISC-V's protection_map maps VM_WRITE to
readable PTE permissions (since write-only PTEs are considered reserved
by the privileged spec), the page fault handler considered loads from
VM_WRITE-only VMAs illegal accesses. Fix the underlying cause by
handling faults in VM_WRITE-only VMAs (patch 1) and then re-enable
use of mmap(PROT_WRITE) (patch 2), making RISC-V's behavior consistent
with all other architectures that don't support write-only PTEs.
* remotes/palmer/riscv-wonly:
riscv: Allow PROT_WRITE-only mmap()
riscv: Make VM_WRITE imply VM_READ
Link: https://lore.kernel.org/r/20220915193702.2201018-1-abrestic@rivosinc.com/
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Heiko Stuebner <heiko@sntech.de> says:
As noted by some people, some parts of the recently added extensions
(svpbmt, zicbom) + t-head errata could use some styling upgrades.
So this series provides these.
changes in v2:
- add patch also converting cpufeature probe to BIT()
- update commit message in patch1 (Conor)
Heiko Stuebner (5):
riscv: cleanup svpbmt cpufeature probing
riscv: drop some idefs from CMO initialization
riscv: use BIT() macros in t-head errata init
riscv: use BIT() marco for cpufeature probing
riscv: check for kernel config option in t-head memory types errata
arch/riscv/errata/thead/errata.c | 14 ++++++-----
arch/riscv/include/asm/cacheflush.h | 2 ++
arch/riscv/kernel/cpufeature.c | 39 ++++++++++++-----------------
3 files changed, 26 insertions(+), 29 deletions(-)
Link: https://lore.kernel.org/r/20220905111027.2463297-1-heiko@sntech.de
* b4-shazam-merge:
riscv: check for kernel config option in t-head memory types errata
riscv: use BIT() marco for cpufeature probing
riscv: use BIT() macros in t-head errata init
riscv: drop some idefs from CMO initialization
riscv: cleanup svpbmt cpufeature probing
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Wrapping things in #ifdefs makes the code harder to read
while we also have IS_ENABLED() macros to do this in regular code
and the extension detection is not _that_ runtime critical.
So define a stub for riscv_noncoherent_supported() in the
non-CONFIG_RISCV_DMA_NONCOHERENT case and move the code to
us IS_ENABLED.
Suggested-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20220905111027.2463297-3-heiko@sntech.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
For better readability (and compile time coverage) use IS_ENABLED
instead of ifdef and drop the new unneeded switch statement.
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20220905111027.2463297-2-heiko@sntech.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
I don't think we can actually die() without a regs pointer, but the
compiler was warning about a NULL check after a dereference. It seems
prudent to just avoid the possibly-NULL dereference, given that when
die()ing the system is already toast so who knows how we got there.
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20220920200037.6727-1-palmer@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
* Fixes for single-stepping in the presence of an async
exception as well as the preservation of PSTATE.SS
* Better handling of AArch32 ID registers on AArch64-only
systems
* Fixes for the dirty-ring API, allowing it to work on
architectures with relaxed memory ordering
* Advertise the new kvmarm mailing list
* Various minor cleanups and spelling fixes
RISC-V:
* Improved instruction encoding infrastructure for
instructions not yet supported by binutils
* Svinval support for both KVM Host and KVM Guest
* Zihintpause support for KVM Guest
* Zicbom support for KVM Guest
* Record number of signal exits as a VCPU stat
* Use generic guest entry infrastructure
x86:
* Misc PMU fixes and cleanups.
* selftests: fixes for Hyper-V hypercall
* selftests: fix nx_huge_pages_test on TDP-disabled hosts
* selftests: cleanups for fix_hypercall_test
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmM7OcMUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroPAFgf/Rqc9hrXZVdbh2OZ+gScSsFsPK1zO
DISUksLcXaYVYYsvQAEg/N2BPz3XbmO4jA+z8bIUrYTA7fC98we2C4jfR+EaX/fO
+/Kzf0lAgu/nQZyFzUya+1jRsZqvVbC/HmDCI2kzN4u78e/LZ7NVcMijdV/ly6ib
cq0b0LLqJHe/fcpJ806JZP3p5sndQhDmlUkZ2AWZf6CUKSEFcufbbYkt+84ZK4PL
N9mEqXYQ3DXClLQmIBv+NZhtGlmADkWDE4BNouw8dVxhaXH7Hw/jfBHdb6SSHMRe
tQ6Src1j8AYOhf5J35SMudgkbGcMelm0yeZ7Sizk+5Ft0EmdbZsnkvsGdQ==
=4RA+
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull more kvm updates from Paolo Bonzini:
"The main batch of ARM + RISC-V changes, and a few fixes and cleanups
for x86 (PMU virtualization and selftests).
ARM:
- Fixes for single-stepping in the presence of an async exception as
well as the preservation of PSTATE.SS
- Better handling of AArch32 ID registers on AArch64-only systems
- Fixes for the dirty-ring API, allowing it to work on architectures
with relaxed memory ordering
- Advertise the new kvmarm mailing list
- Various minor cleanups and spelling fixes
RISC-V:
- Improved instruction encoding infrastructure for instructions not
yet supported by binutils
- Svinval support for both KVM Host and KVM Guest
- Zihintpause support for KVM Guest
- Zicbom support for KVM Guest
- Record number of signal exits as a VCPU stat
- Use generic guest entry infrastructure
x86:
- Misc PMU fixes and cleanups.
- selftests: fixes for Hyper-V hypercall
- selftests: fix nx_huge_pages_test on TDP-disabled hosts
- selftests: cleanups for fix_hypercall_test"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (57 commits)
riscv: select HAVE_POSIX_CPU_TIMERS_TASK_WORK
RISC-V: KVM: Use generic guest entry infrastructure
RISC-V: KVM: Record number of signal exits as a vCPU stat
RISC-V: KVM: add __init annotation to riscv_kvm_init()
RISC-V: KVM: Expose Zicbom to the guest
RISC-V: KVM: Provide UAPI for Zicbom block size
RISC-V: KVM: Make ISA ext mappings explicit
RISC-V: KVM: Allow Guest use Zihintpause extension
RISC-V: KVM: Allow Guest use Svinval extension
RISC-V: KVM: Use Svinval for local TLB maintenance when available
RISC-V: Probe Svinval extension form ISA string
RISC-V: KVM: Change the SBI specification version to v1.0
riscv: KVM: Apply insn-def to hlv encodings
riscv: KVM: Apply insn-def to hfence encodings
riscv: Introduce support for defining instructions
riscv: Add X register names to gpr-nums
KVM: arm64: Advertise new kvmarm mailing list
kvm: vmx: keep constant definition format consistent
kvm: mmu: fix typos in struct kvm_arch
KVM: selftests: Fix nx_huge_pages_test on TDP-disabled hosts
...
When CONFIG_CMDLINE_FORCE is enabled, cmdline provided by
CONFIG_CMDLINE are always used. This allows CONFIG_CMDLINE to be
used regardless of the result of device tree scanning.
This especially fixes the case where a device tree without the
chosen node is supplied to the kernel. In such cases,
early_init_dt_scan would return true. But inside
early_init_dt_scan_chosen, the cmdline won't be updated as there
is no chosen node in the device tree. As a result, CONFIG_CMDLINE
is not copied into boot_command_line even if CONFIG_CMDLINE_FORCE
is enabled. This commit allows properly update boot_command_line
in this situation.
Fixes: 8fd6e05c74 ("arch: riscv: support kernel command line forcing when no DTB passed")
Signed-off-by: Wenting Zhang <zephray@outlook.com>
Reviewed-by: Björn Töpel <bjorn@kernel.org>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/PSBPR04MB399135DFC54928AB958D0638B1829@PSBPR04MB3991.apcprd04.prod.outlook.com
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
linux-next for a couple of months without, to my knowledge, any negative
reports (or any positive ones, come to that).
- Also the Maple Tree from Liam R. Howlett. An overlapping range-based
tree for vmas. It it apparently slight more efficient in its own right,
but is mainly targeted at enabling work to reduce mmap_lock contention.
Liam has identified a number of other tree users in the kernel which
could be beneficially onverted to mapletrees.
Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat
(https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com).
This has yet to be addressed due to Liam's unfortunately timed
vacation. He is now back and we'll get this fixed up.
- Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses
clang-generated instrumentation to detect used-unintialized bugs down to
the single bit level.
KMSAN keeps finding bugs. New ones, as well as the legacy ones.
- Yang Shi adds a userspace mechanism (madvise) to induce a collapse of
memory into THPs.
- Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to support
file/shmem-backed pages.
- userfaultfd updates from Axel Rasmussen
- zsmalloc cleanups from Alexey Romanov
- cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and memory-failure
- Huang Ying adds enhancements to NUMA balancing memory tiering mode's
page promotion, with a new way of detecting hot pages.
- memcg updates from Shakeel Butt: charging optimizations and reduced
memory consumption.
- memcg cleanups from Kairui Song.
- memcg fixes and cleanups from Johannes Weiner.
- Vishal Moola provides more folio conversions
- Zhang Yi removed ll_rw_block() :(
- migration enhancements from Peter Xu
- migration error-path bugfixes from Huang Ying
- Aneesh Kumar added ability for a device driver to alter the memory
tiering promotion paths. For optimizations by PMEM drivers, DRM
drivers, etc.
- vma merging improvements from Jakub Matěn.
- NUMA hinting cleanups from David Hildenbrand.
- xu xin added aditional userspace visibility into KSM merging activity.
- THP & KSM code consolidation from Qi Zheng.
- more folio work from Matthew Wilcox.
- KASAN updates from Andrey Konovalov.
- DAMON cleanups from Kaixu Xia.
- DAMON work from SeongJae Park: fixes, cleanups.
- hugetlb sysfs cleanups from Muchun Song.
- Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY0HaPgAKCRDdBJ7gKXxA
joPjAQDZ5LlRCMWZ1oxLP2NOTp6nm63q9PWcGnmY50FjD/dNlwEAnx7OejCLWGWf
bbTuk6U2+TKgJa4X7+pbbejeoqnt5QU=
=xfWx
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- Yu Zhao's Multi-Gen LRU patches are here. They've been under test in
linux-next for a couple of months without, to my knowledge, any
negative reports (or any positive ones, come to that).
- Also the Maple Tree from Liam Howlett. An overlapping range-based
tree for vmas. It it apparently slightly more efficient in its own
right, but is mainly targeted at enabling work to reduce mmap_lock
contention.
Liam has identified a number of other tree users in the kernel which
could be beneficially onverted to mapletrees.
Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat
at [1]. This has yet to be addressed due to Liam's unfortunately
timed vacation. He is now back and we'll get this fixed up.
- Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses
clang-generated instrumentation to detect used-unintialized bugs down
to the single bit level.
KMSAN keeps finding bugs. New ones, as well as the legacy ones.
- Yang Shi adds a userspace mechanism (madvise) to induce a collapse of
memory into THPs.
- Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to
support file/shmem-backed pages.
- userfaultfd updates from Axel Rasmussen
- zsmalloc cleanups from Alexey Romanov
- cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and
memory-failure
- Huang Ying adds enhancements to NUMA balancing memory tiering mode's
page promotion, with a new way of detecting hot pages.
- memcg updates from Shakeel Butt: charging optimizations and reduced
memory consumption.
- memcg cleanups from Kairui Song.
- memcg fixes and cleanups from Johannes Weiner.
- Vishal Moola provides more folio conversions
- Zhang Yi removed ll_rw_block() :(
- migration enhancements from Peter Xu
- migration error-path bugfixes from Huang Ying
- Aneesh Kumar added ability for a device driver to alter the memory
tiering promotion paths. For optimizations by PMEM drivers, DRM
drivers, etc.
- vma merging improvements from Jakub Matěn.
- NUMA hinting cleanups from David Hildenbrand.
- xu xin added aditional userspace visibility into KSM merging
activity.
- THP & KSM code consolidation from Qi Zheng.
- more folio work from Matthew Wilcox.
- KASAN updates from Andrey Konovalov.
- DAMON cleanups from Kaixu Xia.
- DAMON work from SeongJae Park: fixes, cleanups.
- hugetlb sysfs cleanups from Muchun Song.
- Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core.
Link: https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com [1]
* tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (555 commits)
hugetlb: allocate vma lock for all sharable vmas
hugetlb: take hugetlb vma_lock when clearing vma_lock->vma pointer
hugetlb: fix vma lock handling during split vma and range unmapping
mglru: mm/vmscan.c: fix imprecise comments
mm/mglru: don't sync disk for each aging cycle
mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol
mm: memcontrol: use do_memsw_account() in a few more places
mm: memcontrol: deprecate swapaccounting=0 mode
mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled
mm/secretmem: remove reduntant return value
mm/hugetlb: add available_huge_pages() func
mm: remove unused inline functions from include/linux/mm_inline.h
selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory
selftests/vm: add file/shmem MADV_COLLAPSE selftest for cleared pmd
selftests/vm: add thp collapse shmem testing
selftests/vm: add thp collapse file and tmpfs testing
selftests/vm: modularize thp collapse memory operations
selftests/vm: dedup THP helpers
mm/khugepaged: add tracepoint to hpage_collapse_scan_file()
mm/madvise: add file and shmem support to MADV_COLLAPSE
...
- Remove potentially incomplete targets when Kbuid is interrupted by
SIGINT etc. in case GNU Make may miss to do that when stderr is piped
to another program.
- Rewrite the single target build so it works more correctly.
- Fix rpm-pkg builds with V=1.
- List top-level subdirectories in ./Kbuild.
- Ignore auto-generated __kstrtab_* and __kstrtabns_* symbols in kallsyms.
- Avoid two different modules in lib/zstd/ having shared code, which
potentially causes building the common code as build-in and modular
back-and-forth.
- Unify two modpost invocations to optimize the build process.
- Remove head-y syntax in favor of linker scripts for placing particular
sections in the head of vmlinux.
- Bump the minimal GNU Make version to 3.82.
- Clean up misc Makefiles and scripts.
-----BEGIN PGP SIGNATURE-----
iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmM+4vcVHG1hc2FoaXJv
eUBrZXJuZWwub3JnAAoJED2LAQed4NsGY2IQAInr0JUNnkkxwUSXtOcQuA3IK8RJ
FbU9HXJRoV9H+7+l3SMlN7mIbrs5eE5fTY3iwQ3CVe139d1+1q7nvTMRv8owywJx
GBgzswncuu1lk7iQQ//CxiqMwSCG8GJdYn1uDVy4I5jg3o+DtFZJtyq2Wb7pqsMm
ZhZ4PozRN+idYQJSF6Vx/zEVLHI7quMBwfe4CME8/0Kg2+hnYzbXV/aUf0ED2emq
zdCMDQgIOK5AhY+8qgMXKYnBUJMTqBp6LoR4p3ApfUkwRFY0sGa0/LK3U/B22OE7
uWyR4fCUExGyerlcHEVev+9eBfmsLLPyqlchNwpSDOPf5OSdnKmgqJEBR/Cvx0eh
URerPk7EHxyH3G8yi+cU2GtofNTGc5RHPRgJE2ADsQEi5TAUKGmbXMlsFRL/51Vn
lTANZObBNa1d4enljF6TfTL5nuccOa+DKvXnH9fQ49t0QdtSikv6J/lGwilwm1Sr
BctmCsySPuURZfkpI9OQnLuouloMXl9f7Q/+S39haS/tSgvPpyITyO71nxDnXn/s
BbFObZJUk9QkqOACjBP1hNErTLt83uBxQ9z+rDCw/SbLIe4nw0wyneuygfHI5rI8
3RZB2DbGauuJHX2Zs6YGS14SLSY33IsLqKR1/Vy3LrPvOHuEvNiOR8LITq5E0YCK
OffK2Y5cIlXR0QWf
=DHiN
-----END PGP SIGNATURE-----
Merge tag 'kbuild-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- Remove potentially incomplete targets when Kbuid is interrupted by
SIGINT etc in case GNU Make may miss to do that when stderr is piped
to another program.
- Rewrite the single target build so it works more correctly.
- Fix rpm-pkg builds with V=1.
- List top-level subdirectories in ./Kbuild.
- Ignore auto-generated __kstrtab_* and __kstrtabns_* symbols in
kallsyms.
- Avoid two different modules in lib/zstd/ having shared code, which
potentially causes building the common code as build-in and modular
back-and-forth.
- Unify two modpost invocations to optimize the build process.
- Remove head-y syntax in favor of linker scripts for placing
particular sections in the head of vmlinux.
- Bump the minimal GNU Make version to 3.82.
- Clean up misc Makefiles and scripts.
* tag 'kbuild-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (41 commits)
docs: bump minimal GNU Make version to 3.82
ia64: simplify esi object addition in Makefile
Revert "kbuild: Check if linker supports the -X option"
kbuild: rebuild .vmlinux.export.o when its prerequisite is updated
kbuild: move modules.builtin(.modinfo) rules to Makefile.vmlinux_o
zstd: Fixing mixed module-builtin objects
kallsyms: ignore __kstrtab_* and __kstrtabns_* symbols
kallsyms: take the input file instead of reading stdin
kallsyms: drop duplicated ignore patterns from kallsyms.c
kbuild: reuse mksysmap output for kallsyms
mksysmap: update comment about __crc_*
kbuild: remove head-y syntax
kbuild: use obj-y instead extra-y for objects placed at the head
kbuild: hide error checker logs for V=1 builds
kbuild: re-run modpost when it is updated
kbuild: unify two modpost invocations
kbuild: move vmlinux.o rule to the top Makefile
kbuild: move .vmlinux.objs rule to Makefile.modpost
kbuild: list sub-directories in ./Kbuild
Makefile.compiler: replace cc-ifversion with compiler-specific macros
...
* Improvements to the CPU topology subsystem, which fix some issues
where RISC-V would report bad topology information.
* The default NR_CPUS has increased to XLEN, and the maximum
configurable value is 512.
* The CD-ROM filesystems have been enabled in the defconfig.
* Support for THP_SWAP has been added for rv64 systems.
There are also a handful of cleanups and fixes throughout the tree.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmNAWgwTHHBhbG1lckBk
YWJiZWx0LmNvbQAKCRAuExnzX7sYicSiEACmuB9WuGZmAasKvmPgz7thyLqakg7/
cE4YK0MxgJxkhsXzYSAv1Fn+WUfX7DSzhK4OOM5wEngAYul7QoFdc84MF0DYKO+E
InjdOvVavzUsWYqETNCuMHPRK6xyzvfHCqqBDDxKHx5jUoicCQfFwJyHLw+cvouR
7WSJoFdvOEV01QyN5Qw9bQp7ASx61ZZX1yE6OAPc2/EJlDEA2QSnjBAi4M+n2ZCx
ZsQz+Dp9RfSU8/nIr13oGiL3Zm+kyXwdOS/8PaDqtrkyiGh6+vSeGqZZwRLVITP/
oUxqGEgnn2eFBD1y8vjsQNWMLWoi9Av4746Fxr8CEHX+jX1cp9CCkU2OkkLxaFcv
6XFtXPJIh/UjzVgPmjZxK+ArEX28QOM5IVyBFxsSl0dNtvyVqKpBXCV1RQ+fFHkO
ntHF3ZxibqOn8ZJmziCn0nzWSOqugNTdAhD4dJAbl58RB/IQtQT0OnHpmpXCG3xh
+/JBzy//xkr7u2HMqU69PzwPtWwZrENUV6jl5SHUDUoW8pySng2Pl4pbmTFqgWty
JTfc5EdyWOWyshhoSCtK2//bnVFryl2ntwGr3LIZrZxkiUiOeYjn+C/YedXZIRob
yy2CN+QanW/FXdIa4GMNeGc9sGDApd3/RtP+8L9mV1kWK6OE0EVskkI1UMCGXrIP
5JoE1jLMVhjcKQ==
=LJg6
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-6.1-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Palmer Dabbelt:
- Improvements to the CPU topology subsystem, which fix some issues
where RISC-V would report bad topology information.
- The default NR_CPUS has increased to XLEN, and the maximum
configurable value is 512.
- The CD-ROM filesystems have been enabled in the defconfig.
- Support for THP_SWAP has been added for rv64 systems.
There are also a handful of cleanups and fixes throughout the tree.
* tag 'riscv-for-linus-6.1-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: enable THP_SWAP for RV64
RISC-V: Print SSTC in canonical order
riscv: compat: s/failed/unsupported if compat mode isn't supported
RISC-V: Increase range and default value of NR_CPUS
cpuidle: riscv-sbi: Fix CPU_PM_CPU_IDLE_ENTER_xyz() macro usage
perf: RISC-V: throttle perf events
perf: RISC-V: exclude invalid pmu counters from SBI calls
riscv: enable CD-ROM file systems in defconfig
riscv: topology: fix default topology reporting
arm64: topology: move store_cpu_topology() to shared code
- implement EFI boot support for LoongArch
- implement generic EFI compressed boot support for arm64, RISC-V and
LoongArch, none of which implement a decompressor today
- measure the kernel command line into the TPM if measured boot is in
effect
- refactor the EFI stub code in order to isolate DT dependencies for
architectures other than x86
- avoid calling SetVirtualAddressMap() on arm64 if the configured size
of the VA space guarantees that doing so is unnecessary
- move some ARM specific code out of the generic EFI source files
- unmap kernel code from the x86 mixed mode 1:1 page tables
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEE+9lifEBpyUIVN1cpw08iOZLZjyQFAmM5mfEACgkQw08iOZLZ
jySnJwv9G2nBheSlK9bbWKvCpnDvVIExtlL+mg1wB64oxPrGiWRgjxeyA9+92bT0
Y6jYfKbGOGKnxkEJQl19ik6C3JfEwtGm4SnOVp4+osFeDRB7lFemfcIYN5dqz111
wkZA/Y15rnz3tZeGaXnq2jMoFuccQDXPJtOlqbdVqFQ5Py6YT92uMyuI079pN0T+
GSu7VVOX+SBsv4nGaUKIpSVwAP0gXkS/7s7CTf47QiR2+j8WMTlQEYZVjOKZjMJZ
/7hXY2/mduxnuVuT7cfx0mpZKEryUREJoBL5nDzjTnlhLb5X8cHKiaE1lx0aJ//G
JYTR8lDklJZl/7RUw/IW/YodcKcofr3F36NMzWB5vzM+KHOOpv4qEZhoGnaXv94u
auqhzYA83heaRjz7OISlk6kgFxdlIRE1VdrkEBXSlQeCQUv1woS+ZNVGYcKqgR0B
48b31Ogm2A0pAuba89+U9lz/n33lhIDtYvJqLO6AAPLGiVacD9ZdapN5kMftVg/1
SfhFqNzy
=d8Ps
-----END PGP SIGNATURE-----
Merge tag 'efi-next-for-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI updates from Ard Biesheuvel:
"A bit more going on than usual in the EFI subsystem. The main driver
for this has been the introduction of the LoonArch architecture last
cycle, which inspired some cleanup and refactoring of the EFI code.
Another driver for EFI changes this cycle and in the future is
confidential compute.
The LoongArch architecture does not use either struct bootparams or DT
natively [yet], and so passing information between the EFI stub and
the core kernel using either of those is undesirable. And in general,
overloading DT has been a source of issues on arm64, so using DT for
this on new architectures is a to avoid for the time being (even if we
might converge on something DT based for non-x86 architectures in the
future). For this reason, in addition to the patch that enables EFI
boot for LoongArch, there are a number of refactoring patches applied
on top of which separate the DT bits from the generic EFI stub bits.
These changes are on a separate topich branch that has been shared
with the LoongArch maintainers, who will include it in their pull
request as well. This is not ideal, but the best way to manage the
conflicts without stalling LoongArch for another cycle.
Another development inspired by LoongArch is the newly added support
for EFI based decompressors. Instead of adding yet another
arch-specific incarnation of this pattern for LoongArch, we are
introducing an EFI app based on the existing EFI libstub
infrastructure that encapulates the decompression code we use on other
architectures, but in a way that is fully generic. This has been
developed and tested in collaboration with distro and systemd folks,
who are eager to start using this for systemd-boot and also for arm64
secure boot on Fedora. Note that the EFI zimage files this introduces
can also be decompressed by non-EFI bootloaders if needed, as the
image header describes the location of the payload inside the image,
and the type of compression that was used. (Note that Fedora's arm64
GRUB is buggy [0] so you'll need a recent version or switch to
systemd-boot in order to use this.)
Finally, we are adding TPM measurement of the kernel command line
provided by EFI. There is an oversight in the TCG spec which results
in a blind spot for command line arguments passed to loaded images,
which means that either the loader or the stub needs to take the
measurement. Given the combinatorial explosion I am anticipating when
it comes to firmware/bootloader stacks and firmware based attestation
protocols (SEV-SNP, TDX, DICE, DRTM), it is good to set a baseline now
when it comes to EFI measured boot, which is that the kernel measures
the initrd and command line. Intermediate loaders can measure
additional assets if needed, but with the baseline in place, we can
deploy measured boot in a meaningful way even if you boot into Linux
straight from the EFI firmware.
Summary:
- implement EFI boot support for LoongArch
- implement generic EFI compressed boot support for arm64, RISC-V and
LoongArch, none of which implement a decompressor today
- measure the kernel command line into the TPM if measured boot is in
effect
- refactor the EFI stub code in order to isolate DT dependencies for
architectures other than x86
- avoid calling SetVirtualAddressMap() on arm64 if the configured
size of the VA space guarantees that doing so is unnecessary
- move some ARM specific code out of the generic EFI source files
- unmap kernel code from the x86 mixed mode 1:1 page tables"
* tag 'efi-next-for-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: (24 commits)
efi/arm64: libstub: avoid SetVirtualAddressMap() when possible
efi: zboot: create MemoryMapped() device path for the parent if needed
efi: libstub: fix up the last remaining open coded boot service call
efi/arm: libstub: move ARM specific code out of generic routines
efi/libstub: measure EFI LoadOptions
efi/libstub: refactor the initrd measuring functions
efi/loongarch: libstub: remove dependency on flattened DT
efi: libstub: install boot-time memory map as config table
efi: libstub: remove DT dependency from generic stub
efi: libstub: unify initrd loading between architectures
efi: libstub: remove pointless goto kludge
efi: libstub: simplify efi_get_memory_map() and struct efi_boot_memmap
efi: libstub: avoid efi_get_memory_map() for allocating the virt map
efi: libstub: drop pointless get_memory_map() call
efi: libstub: fix type confusion for load_options_size
arm64: efi: enable generic EFI compressed boot
loongarch: efi: enable generic EFI compressed boot
riscv: efi: enable generic EFI compressed boot
efi/libstub: implement generic EFI zboot
efi/libstub: move efi_system_table global var into separate object
...
This got out of order during a merge conflict, fix it by putting the
entries in the correct order.
Fixes: 7ab52f75a9 ("RISC-V: Add Sstc extension support")
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220920204518.10988-1-palmer@rivosinc.com/
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
When compat mode isn't supported(I believe this is the most case now),
kernel will emit somthing as:
[ 0.050407] riscv: ELF compat mode failed
This msg may make users think there's something wrong with the kernel
itself, replace "failed" with "unsupported" to make it clear. In fact
this is the real compat_mode_supported meaning. After the patch, the
msg would be:
[ 0.050407] riscv: ELF compat mode unsupported
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Acked-by: Guo Ren <guoren@kernel.org>
Link: https://lore.kernel.org/r/20220821141819.3804-1-jszhang@kernel.org/
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Identifying the underlying RISC-V implementation can be important
for some of the user space applications. For example, the perf tool
uses arch specific CPU implementation id (i.e. CPUID) to select a
JSON file describing custom perf events on a CPU.
Currently, there is no way to identify RISC-V implementation so we
add mvendorid, marchid, and mimpid to /proc/cpuinfo output.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com>
Tested-by: Nikita Shubin <n.shubin@yadro.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20220727043829.151794-1-apatel@ventanamicro.com/
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
The objects placed at the head of vmlinux need special treatments:
- arch/$(SRCARCH)/Makefile adds them to head-y in order to place
them before other archives in the linker command line.
- arch/$(SRCARCH)/kernel/Makefile adds them to extra-y instead of
obj-y to avoid them going into built-in.a.
This commit gets rid of the latter.
Create vmlinux.a to collect all the objects that are unconditionally
linked to vmlinux. The objects listed in head-y are moved to the head
of vmlinux.a by using 'ar m'.
With this, arch/$(SRCARCH)/kernel/Makefile can consistently use obj-y
for builtin objects.
There is no *.o that is directly linked to vmlinux. Drop unneeded code
in scripts/clang-tools/gen_compile_commands.py.
$(AR) mPi needs 'T' to workaround the llvm-ar bug. The fix was suggested
by Nathan Chancellor [1].
[1]: https://lore.kernel.org/llvm/YyjjT5gQ2hGMH0ni@dev-arch.thelio-3990X/
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
Just like other ISA extensions, we allow callers/users to detect the
presence of Svinval extension from ISA string.
Signed-off-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
Remove the linked list use in favour of the vma iterator.
Link: https://lkml.kernel.org/r/20220906194824.2110408-67-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Tested-by: Yu Zhao <yuzhao@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Commit 2139619bca ("riscv: mmap with PROT_WRITE but no PROT_READ is
invalid") made mmap() return EINVAL if PROT_WRITE was set wihtout
PROT_READ with the justification that a write-only PTE is considered a
reserved PTE permission bit pattern in the privileged spec. This check
is unnecessary since we let VM_WRITE imply VM_READ on RISC-V, and it is
inconsistent with other architectures that don't support write-only PTEs,
creating a potential software portability issue. Just remove the check
altogether and let PROT_WRITE imply PROT_READ as is the case on other
architectures.
Note that this also allows PROT_WRITE|PROT_EXEC mappings which were
disallowed prior to the aforementioned commit; PROT_READ is implied in
such mappings as well.
Fixes: 2139619bca ("riscv: mmap with PROT_WRITE but no PROT_READ is invalid")
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Andrew Bresticker <abrestic@rivosinc.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220915193702.2201018-3-abrestic@rivosinc.com/
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
The stub is used in different execution environments, but on arm64,
RISC-V and LoongArch, we still use the core kernel's implementation of
memcpy and memset, as they are just a branch instruction away, and can
generally be reused even from code such as the EFI stub that runs in a
completely different address space.
KAsan complicates this slightly, resulting in the need for some hacks to
expose the uninstrumented, __ prefixed versions as the normal ones, as
the latter are instrumented to include the KAsan checks, which only work
in the core kernel.
Unfortunately, #define'ing memcpy to __memcpy when building C code does
not guarantee that no explicit memcpy() calls will be emitted. And with
the upcoming zboot support, which consists of a separate binary which
therefore needs its own implementation of memcpy/memset anyway, it's
better to provide one explicitly instead of linking to the existing one.
Given that EFI exposes implementations of memmove() and memset() via the
boot services table, let's wire those up in the appropriate way, and
drop the references to the core kernel ones.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
riscv has an equivalent of arm bug fixed by 653d48b221 ("arm: fix
really nasty sigreturn bug"); if signal gets caught by an interrupt that
hits when we have the right value in a0 (-513), *and* another signal
gets delivered upon sigreturn() (e.g. included into the blocked mask for
the first signal and posted while the handler had been running), the
syscall restart logics will see regs->cause equal to EXC_SYSCALL (we are
in a syscall, after all) and a0 already restored to its original value
(-513, which happens to be -ERESTARTNOINTR) and assume that we need to
apply the usual syscall restart logics.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Fixes: e2c0cdfba7 ("RISC-V: User-facing API")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/YxJEiSq%2FCGaL6Gm9@ZenIV/
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
This fixes two issues: I truncated the warning's hart ID when porting to
the 64-bit hart ID code, and the original code's warning handling could
fire on an uninitialized hart ID.
The biggest change here is that riscv_cbom_block_size is no longer
initialized, as IMO the default isn't sane: there's nothing in the ISA
that mandates any specific cache block size, so falling back to one will
just silently produce the wrong answer on some systems. This also
changes the probing order so the cache block size is known before
enabling Zicbom support.
CC: stable@vger.kernel.org
CC: Andrew Jones <ajones@ventanamicro.com>
CC: Heiko Stuebner <heiko@sntech.de>
CC: Atish Patra <atishp@rivosinc.com>
Fixes: 3aefb2ee5b ("riscv: implement Zicbom-based CMO instructions + the t-head variant")
Fixes: 1631ba1259 ("riscv: Add support for non-coherent devices using zicbom extension")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
[Conor: fixed the redefinition errors]
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220912224800.998121-1-mail@conchuod.ie
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
This contains a pair of fixes for build-time warnings.
* 'riscv-variable_fixes_without_kvm' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux.git:
riscv: traps: add missing prototype
riscv: signal: fix missing prototype warning
Sparse complains:
arch/riscv/kernel/traps.c:213:6: warning: symbol 'shadow_stack' was not declared. Should it be static?
The variable is used in entry.S, so declare shadow_stack there
alongside SHADOW_OVERFLOW_STACK_SIZE.
Fixes: 31da94c25a ("riscv: add VMAP_STACK overflow detection")
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220814141237.493457-5-mail@conchuod.ie
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Fix the warning:
arch/riscv/kernel/signal.c:316:27: warning: no previous prototype for function 'do_notify_resume' [-Wmissing-prototypes]
asmlinkage __visible void do_notify_resume(struct pt_regs *regs,
All other functions in the file are static & none of the existing
headers stood out as an obvious location. Create signal.h to hold the
declaration.
Fixes: e2c0cdfba7 ("RISC-V: User-facing API")
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220814141237.493457-4-mail@conchuod.ie
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
riscv_isa_ext_keys[] is an array of static keys used in the unified
ISA extension framework. The keys added to this array may be used
anywhere, including in modules. Ensure the keys remain writable by
placing them in the data section.
The need to change riscv_isa_ext_keys[]'s section was found when the
kvm module started failing to load. Commit 8eb060e101 ("arch/riscv:
add Zihintpause support") adds a static branch check for a newly
added isa-ext key to cpu_relax(), which kvm uses.
Fixes: c360cbec35 ("riscv: introduce unified static key mechanism for ISA extensions")
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Cc: stable@vger.kernel.org
Reported-by: Ron Economos <re@w6rz.net>
Reported-by: Anup Patel <apatel@ventanamicro.com>
Reported-by: Conor Dooley <conor.dooley@microchip.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20220816163058.3004536-1-ajones@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
RISC-V has no sane defaults to fall back on where there is no cpu-map
in the devicetree.
Without sane defaults, the package, core and thread IDs are all set to
-1. This causes user-visible inaccuracies for tools like hwloc/lstopo
which rely on the sysfs cpu topology files to detect a system's
topology.
On a PolarFire SoC, which should have 4 harts with a thread each,
lstopo currently reports:
Machine (793MB total)
Package L#0
NUMANode L#0 (P#0 793MB)
Core L#0
L1d L#0 (32KB) + L1i L#0 (32KB) + PU L#0 (P#0)
L1d L#1 (32KB) + L1i L#1 (32KB) + PU L#1 (P#1)
L1d L#2 (32KB) + L1i L#2 (32KB) + PU L#2 (P#2)
L1d L#3 (32KB) + L1i L#3 (32KB) + PU L#3 (P#3)
Adding calls to store_cpu_topology() in {boot,smp} hart bringup code
results in the correct topolgy being reported:
Machine (793MB total)
Package L#0
NUMANode L#0 (P#0 793MB)
L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0 + PU L#0 (P#0)
L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1 + PU L#1 (P#1)
L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2 + PU L#2 (P#2)
L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3 + PU L#3 (P#3)
CC: stable@vger.kernel.org # 456797da79: arm64: topology: move store_cpu_topology() to shared code
Fixes: 03f11f03db ("RISC-V: Parse cpu topology during boot.")
Reported-by: Brice Goglin <Brice.Goglin@inria.fr>
Link: https://github.com/open-mpi/hwloc/issues/536
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
There's still a handful of new features in here, but there are a lot of
fixes/cleanups as well:
* Support for the Zicbom for explicit cache-block management, along with
the necessary bits to make the non-standard cache management ops on
the Allwinner D1 function.
* Support for the Zihintpause extension, which codifies a go-slow
instruction used for cpu_relax().
* Support for the Sstc extension for supervisor-mode timer/counter
management.
* Many device tree fixes and cleanups, including a large set for the
Canaan device trees.
* A handful of fixes and cleanups for the PMU driver.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmL21egTHHBhbG1lckBk
YWJiZWx0LmNvbQAKCRAuExnzX7sYiSsiD/9cCN/7ndt4v7N65PUya+mVYC9VPppB
d/UC74M0mMUHQbtdtHzlCZVHW0pxc6Pc8oDTWLviKxNSHa6LQkLQJ/RZZz4YlH91
V/vh6DCZv9TRfHJS2E6jMUKEAVAiGg+723gE5EqLc5uapIBrvmiluQwBIQcu8dj1
egfdxJH3IVrEZWwROrYtffDgw4sipENuch5v4yhk4vH0bMlatcIM+hMpPZgOfbgX
xip4K4B/HTAJRn5vunrlCQzYdg+g9l5iEy73A/A9HfzOFCMTMJFp1zHvIrzLUjKC
79MZza3GJLpwMG4C1j8u+qOL01wVrQcA5gNp+14UuUedl/jHaceZwBkCL4cmFyGP
LE94Ed7+6cIJJH/NTcNfOOSD9byOePjfan+qJIlRBxZGbHKt+Ip6Lu2FGeftpXah
MlhhN5S1nGTuFpn7XGRsYrB/VLBD/KWsLxvWBZZWsSYwHwnFA9ZTUbcuQfxTJ+Qq
mH9wZIZ/z8MaEjKcdooIPHjKl+CFjDpVYWge83/t12LLYC9ryTM4vIlltZ84bs6i
2CMSNjBRSuPa7FQPHW+a6CWAjz2Lv1u9jCSH0iI62ytZR5/zinI39tv6LvwbypbY
VfWBnrtLNLlZmNbk13ODV64ayhTpZoZXWGL2TvkJklBqEPqda+9/nopiqb8a8jFZ
yEmKFdEqrb1LAg==
=caji
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-5.20-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull more RISC-V updates from Palmer Dabbelt:
"There's still a handful of new features in here, but there are a lot
of fixes/cleanups as well:
- Support for the Zicbom extension for explicit cache-block
management, along with the necessary bits to make the non-standard
cache management ops on the Allwinner D1 function
- Support for the Zihintpause extension, which codifies a go-slow
instruction used for cpu_relax()
- Support for the Sstc extension for supervisor-mode timer/counter
management
- Many device tree fixes and cleanups, including a large set for the
Canaan device trees
- A handful of fixes and cleanups for the PMU driver"
* tag 'riscv-for-linus-5.20-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (43 commits)
dt-bindings: gpio: sifive: add gpio-line-names
wireguard: selftests: set CONFIG_NONPORTABLE on riscv32
RISC-V: KVM: Support sstc extension
RISC-V: Improve SBI definitions
RISC-V: Move counter info definition to sbi header file
RISC-V: Fix SBI PMU calls for RV32
RISC-V: Update user page mapping only once during start
RISC-V: Fix counter restart during overflow for RV32
RISC-V: Prefer sstc extension if available
RISC-V: Enable sstc extension parsing from DT
RISC-V: Add SSTC extension CSR details
riscv:uprobe fix SR_SPIE set/clear handling
dt-bindings: riscv: fix SiFive l2-cache's cache-sets
riscv: ensure cpu_ops_sbi is declared
RISC-V: cpu_ops_spinwait.c should include head.h
RISC-V: Declare cpu_ops_spinwait in <asm/cpu_ops.h>
riscv: dts: starfive: correct number of external interrupts
riscv: dts: sifive unmatched: Add PWM controlled LEDs
riscv/purgatory: Omit use of bin2c
riscv/purgatory: hard-code obj-y in Makefile
...
This series implements Sstc extension support which was ratified
recently. Before the Sstc extension, an SBI call is necessary to
generate timer interrupts as only M-mode have access to the timecompare
registers. Thus, there is significant latency to generate timer
interrupts at kernel. For virtualized enviornments, its even worse as
the KVM handles the SBI call and uses a software timer to emulate the
timecomapre register.
Sstc extension solves both these problems by defining a
stimecmp/vstimecmp at supervisor (host/guest) level. It allows kernel to
program a timer and recieve interrupt without supervisor execution
enviornment (M-mode/HS mode) intervention.
* palmer/riscv-sstc:
RISC-V: Prefer sstc extension if available
RISC-V: Enable sstc extension parsing from DT
RISC-V: Add SSTC extension CSR details
The ISA extension framework now allows parsing any multi-letter
ISA extension.
Enable that for sstc extension.
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20220722165047.519994-3-atishp@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
In riscv the process of uprobe going to clear spie before exec
the origin insn,and set spie after that.But When access the page
which origin insn has been placed a page fault may happen and
irq was disabled in arch_uprobe_pre_xol function,It cause a WARN
as follows.
There is no need to clear/set spie in arch_uprobe_pre/post/abort_xol.
We can just remove it.
[ 31.684157] BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:1488
[ 31.684677] in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 76, name: work
[ 31.684929] preempt_count: 0, expected: 0
[ 31.685969] CPU: 2 PID: 76 Comm: work Tainted: G
[ 31.686542] Hardware name: riscv-virtio,qemu (DT)
[ 31.686797] Call Trace:
[ 31.687053] [<ffffffff80006442>] dump_backtrace+0x30/0x38
[ 31.687699] [<ffffffff80812118>] show_stack+0x40/0x4c
[ 31.688141] [<ffffffff8081817a>] dump_stack_lvl+0x44/0x5c
[ 31.688396] [<ffffffff808181aa>] dump_stack+0x18/0x20
[ 31.688653] [<ffffffff8003e454>] __might_resched+0x114/0x122
[ 31.688948] [<ffffffff8003e4b2>] __might_sleep+0x50/0x7a
[ 31.689435] [<ffffffff80822676>] down_read+0x30/0x130
[ 31.689728] [<ffffffff8000b650>] do_page_fault+0x166/x446
[ 31.689997] [<ffffffff80003c0c>] ret_from_exception+0x0/0xc
Fixes: 74784081aa ("riscv: Add uprobes supported")
Signed-off-by: Yipeng Zou <zouyipeng@huawei.com>
Reviewed-by: Guo Ren <guoren@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220721065820.245755-1-zouyipeng@huawei.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Sparse complains that cpu_ops_sbi is used undeclared:
arch/riscv/kernel/cpu_ops_sbi.c:17:29: warning: symbol 'cpu_ops_sbi' was not declared. Should it be static?
Fix the warning by adding cpu_ops_sbi to cpu_ops_sbi.h & including that
where used.
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20220714080235.3853374-1-conor.dooley@microchip.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Running sparse shows cpu_ops_spinwait.c is missing two definitions
found in head.h, so include it to stop the following warnings:
arch/riscv/kernel/cpu_ops_spinwait.c:15:6: warning: symbol '__cpu_spinwait_stack_pointer' was not declared. Should it be static?
arch/riscv/kernel/cpu_ops_spinwait.c:16:6: warning: symbol '__cpu_spinwait_task_pointer' was not declared. Should it be static?
Signed-off-by: Ben Dooks <ben.dooks@sifive.com>
Link: https://lore.kernel.org/r/20220713215306.94675-1-ben.dooks@sifive.com
Fixes: c78f94f35c ("RISC-V: Use __cpu_up_stack/task_pointer only for spinwait method")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
The cpu_ops_spinwait is used in a couple of places in arch/riscv
and is causing a sparse warning due to no declaration. Add this
to <asm/cpu_ops.h> with the others to fix the following:
arch/riscv/kernel/cpu_ops_spinwait.c:16:29: warning: symbol 'cpu_ops_spinwait' was not declared. Should it be static?
Signed-off-by: Ben Dooks <ben.dooks@sifive.com>
Link: https://lore.kernel.org/r/20220714071811.187491-1-ben.dooks@sifive.com
[Palmer: Drop the extern from cpu_ops.c]
Fixes: 2ffc48fc70 ("RISC-V: Move spinwait booting method to its own config")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
A handful of fixes to our kexec/crash kernel support that allow crash
tool to function.
Link: https://lore.kernel.org/r/mhng-f5fdaa37-e99a-4214-a297-ec81f0fed0c1@palmer-mbp2014
* commit 'f9293ad46d8ba9909187a37b7215324420ad4596':
RISC-V: Add modules to virtual kernel memory layout dump
RISC-V: Fixup schedule out issue in machine_crash_shutdown()
RISC-V: Fixup get incorrect user mode PC for kernel mode regs
RISC-V: kexec: Fixup use of smp_processor_id() in preemptible context
This series is based on the alternatives changes done in my svpbmt
series and thus also depends on Atish's isa-extension parsing series.
It implements using the cache-management instructions from the Zicbom-
extension to handle cache flush, etc actions on platforms needing them.
SoCs using cpu cores from T-Head like the Allwinne D1 implement a
different set of cache instructions. But while they are different,
instructions they provide the same functionality, so a variant can easly
hook into the existing alternatives mechanism on those.
[Palmer: Some minor fixups, including a RISCV_ISA_ZICBOM dependency on
MMU that's probably not strictly necessary. The Zicbom support will
trip up sparse for users that have new toolchains, I just sent a patch.]
Link: https://lore.kernel.org/all/20220706231536.2041855-1-heiko@sntech.de/
Link: https://lore.kernel.org/linux-sparse/20220811033138.20676-1-palmer@rivosinc.com/T/#u
* palmer/riscv-zicbom:
riscv: implement cache-management errata for T-Head SoCs
riscv: Add support for non-coherent devices using zicbom extension
dt-bindings: riscv: document cbom-block-size
of: also handle dma-noncoherent in of_dma_is_coherent()
fatfs, autofs, squashfs, procfs, etc.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCYu9BeQAKCRDdBJ7gKXxA
jp1DAP4mjCSvAwYzXklrIt+Knv3CEY5oVVdS+pWOAOGiJpldTAD9E5/0NV+VmlD9
kwS/13j38guulSlXRzDLmitbg81zAAI=
=Zfum
-----END PGP SIGNATURE-----
Merge tag 'mm-nonmm-stable-2022-08-06-2' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc updates from Andrew Morton:
"Updates to various subsystems which I help look after. lib, ocfs2,
fatfs, autofs, squashfs, procfs, etc. A relatively small amount of
material this time"
* tag 'mm-nonmm-stable-2022-08-06-2' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (72 commits)
scripts/gdb: ensure the absolute path is generated on initial source
MAINTAINERS: kunit: add David Gow as a maintainer of KUnit
mailmap: add linux.dev alias for Brendan Higgins
mailmap: update Kirill's email
profile: setup_profiling_timer() is moslty not implemented
ocfs2: fix a typo in a comment
ocfs2: use the bitmap API to simplify code
ocfs2: remove some useless functions
lib/mpi: fix typo 'the the' in comment
proc: add some (hopefully) insightful comments
bdi: remove enum wb_congested_state
kernel/hung_task: fix address space of proc_dohung_task_timeout_secs
lib/lzo/lzo1x_compress.c: replace ternary operator with min() and min_t()
squashfs: support reading fragments in readahead call
squashfs: implement readahead
squashfs: always build "file direct" version of page actor
Revert "squashfs: provide backing_dev_info in order to disable read-ahead"
fs/ocfs2: Fix spelling typo in comment
ia64: old_rr4 added under CONFIG_HUGETLB_PAGE
proc: fix test for "vsyscall=xonly" boot option
...