Commit Graph

260 Commits

Author SHA1 Message Date
Linus Torvalds
18fd049731 arm64 updates for 6.1:
- arm64 perf: DDR PMU driver for Alibaba's T-Head Yitian 710 SoC, SVE
   vector granule register added to the user regs together with SVE perf
   extensions documentation.
 
 - SVE updates: add HWCAP for SVE EBF16, update the SVE ABI documentation
   to match the actual kernel behaviour (zeroing the registers on syscall
   rather than "zeroed or preserved" previously).
 
 - More conversions to automatic system registers generation.
 
 - vDSO: use self-synchronising virtual counter access in gettimeofday()
   if the architecture supports it.
 
 - arm64 stacktrace cleanups and improvements.
 
 - arm64 atomics improvements: always inline assembly, remove LL/SC
   trampolines.
 
 - Improve the reporting of EL1 exceptions: rework BTI and FPAC exception
   handling, better EL1 undefs reporting.
 
 - Cortex-A510 erratum 2658417: remove BF16 support due to incorrect
   result.
 
 - arm64 defconfig updates: build CoreSight as a module, enable options
   necessary for docker, memory hotplug/hotremove, enable all PMUs
   provided by Arm.
 
 - arm64 ptrace() support for TPIDR2_EL0 (register provided with the SME
   extensions).
 
 - arm64 ftraces updates/fixes: fix module PLTs with mcount, remove
   unused function.
 
 - kselftest updates for arm64: simple HWCAP validation, FP stress test
   improvements, validation of ZA regs in signal handlers, include larger
   SVE and SME vector lengths in signal tests, various cleanups.
 
 - arm64 alternatives (code patching) improvements to robustness and
   consistency: replace cpucap static branches with equivalent
   alternatives, associate callback alternatives with a cpucap.
 
 - Miscellaneous updates: optimise kprobe performance of patching
   single-step slots, simplify uaccess_mask_ptr(), move MTE registers
   initialisation to C, support huge vmalloc() mappings, run softirqs on
   the per-CPU IRQ stack, compat (arm32) misalignment fixups for
   multiword accesses.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmM9W4cACgkQa9axLQDI
 XvEy3w/+LJ3KCFowWiz5gTAWikjv+UVssHjLMJixn47V7hsEFQ26Xnam/438rTMI
 kE95u6DHUpw2SMIxKzFRO7oI5cQtP+cWGwTtOUnjVO+U1oN+HqDOIbO9DbylWDcU
 eeeqMMmawMfTPuZrYklpOhXscsorbrKIvYBg7wHYOcwBYV3EPhWr89lwMvTVRuyJ
 qpX628KlkGMaBcONNhv3nS3qZcAOs0oHQCAVS4C8czLDL+vtJlumXUS3xr1Mqm72
 xtFe7sje8Djr2kZ8mzh0GbFiZEBoBD3F/l7ayq8gVRaVpToUt8sk36Stjs4LojF1
 6imuAfji/5TItkScq5KhGqj6MIugwp/eUVbRN74OLNTYx7msF1ZADNFQ+Q0UuY0H
 SYK13KvmOji0xjS8qAfhqrwNB79sk3fb+zF9LjETbdz4ZJCgg9gcFbSUTY0DvMfS
 MXZk/jVeB07olA8xYbjh0BRt4UV9xU628FPQzK5k7e4Nzl4jSvgtJZCZanfuVtjy
 /ZS1vbN8o7tQLBAlVnw+Exi/VedkKxkkMgm8tPKsMgERTFDx0Pc4Gs72hRpDnPWT
 MRbeCCGleAf3JQ5vF0coBDNOCEVvweQgShHOyHTz0GyhWXLCFx3RJICo5I4EIpps
 LLUk4JK0fO3LVrf1AEpu5ZP4+Sact0zfsH3gB7qyLPYFDmjDXD8=
 =jl3Z
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:

 - arm64 perf: DDR PMU driver for Alibaba's T-Head Yitian 710 SoC, SVE
   vector granule register added to the user regs together with SVE perf
   extensions documentation.

 - SVE updates: add HWCAP for SVE EBF16, update the SVE ABI
   documentation to match the actual kernel behaviour (zeroing the
   registers on syscall rather than "zeroed or preserved" previously).

 - More conversions to automatic system registers generation.

 - vDSO: use self-synchronising virtual counter access in gettimeofday()
   if the architecture supports it.

 - arm64 stacktrace cleanups and improvements.

 - arm64 atomics improvements: always inline assembly, remove LL/SC
   trampolines.

 - Improve the reporting of EL1 exceptions: rework BTI and FPAC
   exception handling, better EL1 undefs reporting.

 - Cortex-A510 erratum 2658417: remove BF16 support due to incorrect
   result.

 - arm64 defconfig updates: build CoreSight as a module, enable options
   necessary for docker, memory hotplug/hotremove, enable all PMUs
   provided by Arm.

 - arm64 ptrace() support for TPIDR2_EL0 (register provided with the SME
   extensions).

 - arm64 ftraces updates/fixes: fix module PLTs with mcount, remove
   unused function.

 - kselftest updates for arm64: simple HWCAP validation, FP stress test
   improvements, validation of ZA regs in signal handlers, include
   larger SVE and SME vector lengths in signal tests, various cleanups.

 - arm64 alternatives (code patching) improvements to robustness and
   consistency: replace cpucap static branches with equivalent
   alternatives, associate callback alternatives with a cpucap.

 - Miscellaneous updates: optimise kprobe performance of patching
   single-step slots, simplify uaccess_mask_ptr(), move MTE registers
   initialisation to C, support huge vmalloc() mappings, run softirqs on
   the per-CPU IRQ stack, compat (arm32) misalignment fixups for
   multiword accesses.

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (126 commits)
  arm64: alternatives: Use vdso/bits.h instead of linux/bits.h
  arm64/kprobe: Optimize the performance of patching single-step slot
  arm64: defconfig: Add Coresight as module
  kselftest/arm64: Handle EINTR while reading data from children
  kselftest/arm64: Flag fp-stress as exiting when we begin finishing up
  kselftest/arm64: Don't repeat termination handler for fp-stress
  ARM64: reloc_test: add __init/__exit annotations to module init/exit funcs
  arm64/mm: fold check for KFENCE into can_set_direct_map()
  arm64: ftrace: fix module PLTs with mcount
  arm64: module: Remove unused plt_entry_is_initialized()
  arm64: module: Make plt_equals_entry() static
  arm64: fix the build with binutils 2.27
  kselftest/arm64: Don't enable v8.5 for MTE selftest builds
  arm64: uaccess: simplify uaccess_mask_ptr()
  arm64: asm/perf_regs.h: Avoid C++-style comment in UAPI header
  kselftest/arm64: Fix typo in hwcap check
  arm64: mte: move register initialization to C
  arm64: mm: handle ARM64_KERNEL_USES_PMD_MAPS in vmemmap_populate()
  arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()
  arm64/sve: Add Perf extensions documentation
  ...
2022-10-06 11:51:49 -07:00
Catalin Marinas
53630a1f61 Merge branch 'for-next/misc' into for-next/core
* for-next/misc:
  : Miscellaneous patches
  arm64/kprobe: Optimize the performance of patching single-step slot
  ARM64: reloc_test: add __init/__exit annotations to module init/exit funcs
  arm64/mm: fold check for KFENCE into can_set_direct_map()
  arm64: uaccess: simplify uaccess_mask_ptr()
  arm64: mte: move register initialization to C
  arm64: mm: handle ARM64_KERNEL_USES_PMD_MAPS in vmemmap_populate()
  arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()
  arm64: support huge vmalloc mappings
  arm64: spectre: increase parameters that can be used to turn off bhb mitigation individually
  arm64: run softirqs on the per-CPU IRQ stack
  arm64: compat: Implement misalignment fixups for multiword loads
2022-09-30 09:18:26 +01:00
Mike Rapoport
b9dd04a20f arm64/mm: fold check for KFENCE into can_set_direct_map()
KFENCE requires linear map to be mapped at page granularity, so that it
is possible to protect/unprotect single pages, just like with
rodata_full and DEBUG_PAGEALLOC.

Instead of repating

	can_set_direct_map() || IS_ENABLED(CONFIG_KFENCE)

make can_set_direct_map() handle the KFENCE case.

This also prevents potential false positives in kernel_page_present()
that may return true for non-present page if CONFIG_KFENCE is enabled.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20220921074841.382615-1-rppt@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-29 17:59:28 +01:00
Kefeng Wang
739e49e0fc arm64: mm: handle ARM64_KERNEL_USES_PMD_MAPS in vmemmap_populate()
Directly check ARM64_SWAPPER_USES_SECTION_MAPS to choose base page
or PMD level huge page mapping in vmemmap_populate() to simplify
code a bit.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Link: https://lore.kernel.org/r/20220920014951.196191-1-wangkefeng.wang@huawei.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-22 16:25:51 +01:00
Mark Rutland
61d2d1808b arm64: mm: don't acquire mutex when rewriting swapper
Since commit:

  47546a1912 ("arm64: mm: install KPTI nG mappings with MMU enabled)"

... when building with CONFIG_DEBUG_ATOMIC_SLEEP=y and booting under
QEMU TCG with '-cpu max', there's a boot-time splat:

| BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580
| in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 15, name: migration/0
| preempt_count: 1, expected: 0
| RCU nest depth: 0, expected: 0
| no locks held by migration/0/15.
| irq event stamp: 28
| hardirqs last  enabled at (27): [<ffff8000091ed180>] _raw_spin_unlock_irq+0x3c/0x7c
| hardirqs last disabled at (28): [<ffff8000081b8d74>] multi_cpu_stop+0x150/0x18c
| softirqs last  enabled at (0): [<ffff80000809a314>] copy_process+0x594/0x1964
| softirqs last disabled at (0): [<0000000000000000>] 0x0
| CPU: 0 PID: 15 Comm: migration/0 Not tainted 6.0.0-rc3-00002-g419b42ff7eef #3
| Hardware name: linux,dummy-virt (DT)
| Stopper: multi_cpu_stop+0x0/0x18c <- stop_cpus.constprop.0+0xa0/0xfc
| Call trace:
|  dump_backtrace.part.0+0xd0/0xe0
|  show_stack+0x1c/0x5c
|  dump_stack_lvl+0x88/0xb4
|  dump_stack+0x1c/0x38
|  __might_resched+0x180/0x230
|  __might_sleep+0x4c/0xa0
|  __mutex_lock+0x5c/0x450
|  mutex_lock_nested+0x30/0x40
|  create_kpti_ng_temp_pgd+0x4fc/0x6d0
|  kpti_install_ng_mappings+0x2b8/0x3b0
|  cpu_enable_non_boot_scope_capabilities+0x7c/0xd0
|  multi_cpu_stop+0xa0/0x18c
|  cpu_stopper_thread+0x88/0x11c
|  smpboot_thread_fn+0x1ec/0x290
|  kthread+0x118/0x120
|  ret_from_fork+0x10/0x20

Since commit:

  ee017ee353 ("arm64/mm: avoid fixmap race condition when create pud mapping")

... once the kernel leave the SYSTEM_BOOTING state, the fixmap pagetable
entries are protected by the fixmap_lock mutex.

The new KPTI rewrite code uses __create_pgd_mapping() to create a
temporary pagetable. This happens in atomic context, after secondary
CPUs are brought up and the kernel has left the SYSTEM_BOOTING state.
Hence we try to acquire a mutex in atomic context, which is generally
unsound (though benign in this case as the mutex should be free and all
other CPUs are quiescent).

This patch avoids the issue by pulling the mutex out of alloc_init_pud()
and calling it at a higher level in the pagetable manipulation code.
This allows it to be used without locking where one CPU is known to be
in exclusive control of the machine, even after having left the
SYSTEM_BOOTING state.

Fixes: 47546a1912 ("arm64: mm: install KPTI nG mappings with MMU enabled")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220920134731.1625740-1-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-09-22 12:55:39 +01:00
Mark Brown
6ca2b9ca45 arm64/sysreg: Add _EL1 into ID_AA64PFR1_EL1 constant names
Our standard is to include the _EL1 in the constant names for registers but
we did not do that for ID_AA64PFR1_EL1, update to do so in preparation for
conversion to automatic generation. No functional change.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com>
Link: https://lore.kernel.org/r/20220905225425.1871461-8-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-09 10:59:02 +01:00
Mark Rutland
2e8cff0a0e arm64: fix rodata=full
On arm64, "rodata=full" has been suppored (but not documented) since
commit:

  c55191e96c ("arm64: mm: apply r/o permissions of VM areas to its linear alias as well")

As it's necessary to determine the rodata configuration early during
boot, arm64 has an early_param() handler for this, whereas init/main.c
has a __setup() handler which is run later.

Unfortunately, this split meant that since commit:

  f9a40b0890 ("init/main.c: return 1 from handled __setup() functions")

... passing "rodata=full" would result in a spurious warning from the
__setup() handler (though RO permissions would be configured
appropriately).

Further, "rodata=full" has been broken since commit:

  0d6ea3ac94 ("lib/kstrtox.c: add "false"/"true" support to kstrtobool()")

... which caused strtobool() to parse "full" as false (in addition to
many other values not documented for the "rodata=" kernel parameter.

This patch fixes this breakage by:

* Moving the core parameter parser to an __early_param(), such that it
  is available early.

* Adding an (optional) arch hook which arm64 can use to parse "full".

* Updating the documentation to mention that "full" is valid for arm64.

* Having the core parameter parser handle "on" and "off" explicitly,
  such that any undocumented values (e.g. typos such as "ful") are
  reported as errors rather than being silently accepted.

Note that __setup() and early_param() have opposite conventions for
their return values, where __setup() uses 1 to indicate a parameter was
handled and early_param() uses 0 to indicate a parameter was handled.

Fixes: f9a40b0890 ("init/main.c: return 1 from handled __setup() functions")
Fixes: 0d6ea3ac94 ("lib/kstrtox.c: add "false"/"true" support to kstrtobool()")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jagdish Gediya <jvgediya@linux.ibm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220817154022.3974645-1-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-08-23 11:02:02 +01:00
Will Deacon
f96d67a8af Merge branch 'for-next/boot' into for-next/core
* for-next/boot: (34 commits)
  arm64: fix KASAN_INLINE
  arm64: Add an override for ID_AA64SMFR0_EL1.FA64
  arm64: Add the arm64.nosve command line option
  arm64: Add the arm64.nosme command line option
  arm64: Expose a __check_override primitive for oddball features
  arm64: Allow the idreg override to deal with variable field width
  arm64: Factor out checking of a feature against the override into a macro
  arm64: Allow sticky E2H when entering EL1
  arm64: Save state of HCR_EL2.E2H before switch to EL1
  arm64: Rename the VHE switch to "finalise_el2"
  arm64: mm: fix booting with 52-bit address space
  arm64: head: remove __PHYS_OFFSET
  arm64: lds: use PROVIDE instead of conditional definitions
  arm64: setup: drop early FDT pointer helpers
  arm64: head: avoid relocating the kernel twice for KASLR
  arm64: kaslr: defer initialization to initcall where permitted
  arm64: head: record CPU boot mode after enabling the MMU
  arm64: head: populate kernel page tables with MMU and caches on
  arm64: head: factor out TTBR1 assignment into a macro
  arm64: idreg-override: use early FDT mapping in ID map
  ...
2022-07-25 10:59:15 +01:00
Will Deacon
02eab44c71 Merge branch 'for-next/misc' into for-next/core
* for-next/misc:
  arm64/mm: use GENMASK_ULL for TTBR_BADDR_MASK_52
  arm64: numa: Don't check node against MAX_NUMNODES
  arm64: mm: Remove assembly DMA cache maintenance wrappers
  arm64/mm: Define defer_reserve_crashkernel()
  arm64: fix oops in concurrently setting insn_emulation sysctls
  arm64: Do not forget syscall when starting a new thread.
  arm64: boot: add zstd support
2022-07-25 10:56:57 +01:00
Anshuman Khandual
4890cc18f9 arm64/mm: Define defer_reserve_crashkernel()
Crash kernel memory reservation gets deferred, when either CONFIG_ZONE_DMA
or CONFIG_ZONE_DMA32 config is enabled on the platform. This deferral also
impacts overall linear mapping creation including the crash kernel itself.
Just encapsulate this deferral check in a new helper for better clarity.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20220705062556.1845734-1-anshuman.khandual@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-07-05 11:43:47 +01:00
Ard Biesheuvel
005e12676a arm64: head: record CPU boot mode after enabling the MMU
In order to avoid having to touch memory with the MMU and caches
disabled, and therefore having to invalidate it from the caches
explicitly, just defer storing the value until after the MMU has been
turned on, unless we are giving up with an error.

While at it, move the associated variable definitions into C code.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220624150651.1358849-19-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24 17:18:10 +01:00
Ard Biesheuvel
c3cee924bd arm64: head: cover entire kernel image in initial ID map
As a first step towards avoiding the need to create, tear down and
recreate the kernel virtual mapping with MMU and caches disabled, start
by expanding the ID map so it covers the page tables as well as all
executable code. This will allow us to populate the page tables with the
MMU and caches on, and call KASLR init code before setting up the
virtual mapping.

Since this ID map is only needed at boot, create it as a temporary set
of page tables, and populate the permanent ID map after enabling the MMU
and caches. While at it, switch to read-only attributes for the where
possible, as writable permissions are only needed for the initial kernel
page tables. Note that on 4k granule configurations, the permanent ID
map will now be reduced to a single page rather than a 2M block mapping.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220624150651.1358849-13-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24 17:18:10 +01:00
Ard Biesheuvel
1682c45b92 arm64: mm: provide idmap pointer to cpu_replace_ttbr1()
In preparation for changing the way we initialize the permanent ID map,
update cpu_replace_ttbr1() so we can use it with the initial ID map as
well.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220624150651.1358849-11-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24 17:18:10 +01:00
Ard Biesheuvel
ebd9aea1f2 arm64: head: drop idmap_ptrs_per_pgd
The assignment of idmap_ptrs_per_pgd lacks any cache invalidation, even
though it is updated with the MMU and caches disabled. However, we never
bother to read the value again except in the very next instruction, and
so we can just drop the variable entirely.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20220624150651.1358849-5-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24 17:18:09 +01:00
Ard Biesheuvel
e8d13cced5 arm64: head: move assignment of idmap_t0sz to C code
Setting idmap_t0sz involves fiddling with the caches if done with the
MMU off. Since we will be creating an initial ID map with the MMU and
caches off, and the permanent ID map with the MMU and caches on, let's
move this assignment of idmap_t0sz out of the startup code, and replace
it with a macro that simply issues the three instructions needed to
calculate the value wherever it is needed before the MMU is turned on.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220624150651.1358849-4-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24 17:18:09 +01:00
Ard Biesheuvel
0d9b1ffefa arm64: mm: make vabits_actual a build time constant if possible
Currently, we only support 52-bit virtual addressing on 64k pages
configurations, and in all other cases, vabits_actual is guaranteed to
equal VA_BITS (== VA_BITS_MIN). So get rid of the variable entirely in
that case.

While at it, move the assignment out of the asm entry code - it has no
need to be there.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220624150651.1358849-3-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24 17:18:09 +01:00
Ard Biesheuvel
475031b6ed arm64: head: move kimage_vaddr variable into C file
This variable definition does not need to be in head.S so move it out.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20220624150651.1358849-2-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24 17:18:09 +01:00
Ard Biesheuvel
1c9a8e8768 arm64: entry: simplify trampoline data page
Get rid of some clunky open coded arithmetic on section addresses, by
emitting the trampoline data variables into a separate, dedicated r/o
data section, and putting it at the next page boundary. This way, we can
access the literals via single LDR instruction.

While at it, get rid of other, implicit literals, and use ADRP/ADD or
MOVZ/MOVK sequences, as appropriate. Note that the latter are only
supported for CONFIG_RELOCATABLE=n (which is usually the case if
CONFIG_RANDOMIZE_BASE=n), so update the CPP conditionals to reflect
this.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220622161010.3845775-1-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24 13:08:30 +01:00
Ard Biesheuvel
47546a1912 arm64: mm: install KPTI nG mappings with MMU enabled
In cases where we unmap the kernel while running in user space, we rely
on ASIDs to distinguish the minimal trampoline from the full kernel
mapping, and this means we must use non-global attributes for those
mappings, to ensure they are scoped by ASID and will not hit in the TLB
inadvertently.

We only do this when needed, as this is generally more costly in terms
of TLB pressure, and so we boot without these non-global attributes, and
apply them to all existing kernel mappings once all CPUs are up and we
know whether or not the non-global attributes are needed. At this point,
we cannot simply unmap and remap the entire address space, so we have to
update all existing block and page descriptors in place.

Currently, we go through a lot of trouble to perform these updates with
the MMU and caches off, to avoid violating break before make (BBM) rules
imposed by the architecture. Since we make changes to page tables that
are not covered by the ID map, we gain access to those descriptors by
disabling translations altogether. This means that the stores to memory
are issued with device attributes, and require extra care in terms of
coherency, which is costly. We also rely on the ID map to access a
shared flag, which requires the ID map to be executable and writable at
the same time, which is another thing we'd prefer to avoid.

So let's switch to an approach where we replace the kernel mapping with
a minimal mapping of a few pages that can be used for a minimal, ad-hoc
fixmap that we can use to map each page table in turn as we traverse the
hierarchy.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220609174320.4035379-3-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-23 18:26:13 +01:00
Linus Torvalds
9030fb0bb9 Folio changes for 5.18
- Rewrite how munlock works to massively reduce the contention
    on i_mmap_rwsem (Hugh Dickins):
    https://lore.kernel.org/linux-mm/8e4356d-9622-a7f0-b2c-f116b5f2efea@google.com/
  - Sort out the page refcount mess for ZONE_DEVICE pages (Christoph Hellwig):
    https://lore.kernel.org/linux-mm/20220210072828.2930359-1-hch@lst.de/
  - Convert GUP to use folios and make pincount available for order-1
    pages. (Matthew Wilcox)
  - Convert a few more truncation functions to use folios (Matthew Wilcox)
  - Convert page_vma_mapped_walk to use PFNs instead of pages (Matthew Wilcox)
  - Convert rmap_walk to use folios (Matthew Wilcox)
  - Convert most of shrink_page_list() to use a folio (Matthew Wilcox)
  - Add support for creating large folios in readahead (Matthew Wilcox)
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEejHryeLBw/spnjHrDpNsjXcpgj4FAmI4ucgACgkQDpNsjXcp
 gj69Wgf6AwqwmO5Tmy+fLScDPqWxmXJofbocae1kyoGHf7Ui91OK4U2j6IpvAr+g
 P/vLIK+JAAcTQcrSCjymuEkf4HkGZOR03QQn7maPIEe4eLrZRQDEsmHC1L9gpeJp
 s/GMvDWiGE0Tnxu0EOzfVi/yT+qjIl/S8VvqtCoJv1HdzxitZ7+1RDuqImaMC5MM
 Qi3uHag78vLmCltLXpIOdpgZhdZexCdL2Y/1npf+b6FVkAJRRNUnA0gRbS7YpoVp
 CbxEJcmAl9cpJLuj5i5kIfS9trr+/QcvbUlzRxh4ggC58iqnmF2V09l2MJ7YU3XL
 v1O/Elq4lRhXninZFQEm9zjrri7LDQ==
 =n9Ad
 -----END PGP SIGNATURE-----

Merge tag 'folio-5.18c' of git://git.infradead.org/users/willy/pagecache

Pull folio updates from Matthew Wilcox:

 - Rewrite how munlock works to massively reduce the contention on
   i_mmap_rwsem (Hugh Dickins):

     https://lore.kernel.org/linux-mm/8e4356d-9622-a7f0-b2c-f116b5f2efea@google.com/

 - Sort out the page refcount mess for ZONE_DEVICE pages (Christoph
   Hellwig):

     https://lore.kernel.org/linux-mm/20220210072828.2930359-1-hch@lst.de/

 - Convert GUP to use folios and make pincount available for order-1
   pages. (Matthew Wilcox)

 - Convert a few more truncation functions to use folios (Matthew
   Wilcox)

 - Convert page_vma_mapped_walk to use PFNs instead of pages (Matthew
   Wilcox)

 - Convert rmap_walk to use folios (Matthew Wilcox)

 - Convert most of shrink_page_list() to use a folio (Matthew Wilcox)

 - Add support for creating large folios in readahead (Matthew Wilcox)

* tag 'folio-5.18c' of git://git.infradead.org/users/willy/pagecache: (114 commits)
  mm/damon: minor cleanup for damon_pa_young
  selftests/vm/transhuge-stress: Support file-backed PMD folios
  mm/filemap: Support VM_HUGEPAGE for file mappings
  mm/readahead: Switch to page_cache_ra_order
  mm/readahead: Align file mappings for non-DAX
  mm/readahead: Add large folio readahead
  mm: Support arbitrary THP sizes
  mm: Make large folios depend on THP
  mm: Fix READ_ONLY_THP warning
  mm/filemap: Allow large folios to be added to the page cache
  mm: Turn can_split_huge_page() into can_split_folio()
  mm/vmscan: Convert pageout() to take a folio
  mm/vmscan: Turn page_check_references() into folio_check_references()
  mm/vmscan: Account large folios correctly
  mm/vmscan: Optimise shrink_page_list for non-PMD-sized folios
  mm/vmscan: Free non-shmem folios without splitting them
  mm/rmap: Constify the rmap_walk_control argument
  mm/rmap: Convert rmap_walk() to take a folio
  mm: Turn page_anon_vma() into folio_anon_vma()
  mm/rmap: Turn page_lock_anon_vma_read() into folio_lock_anon_vma_read()
  ...
2022-03-22 17:03:12 -07:00
Will Deacon
641d804157 Merge branch 'for-next/spectre-bhb' into for-next/core
Merge in the latest Spectre mess to fix up conflicts with what was
already queued for 5.18 when the embargo finally lifted.

* for-next/spectre-bhb: (21 commits)
  arm64: Do not include __READ_ONCE() block in assembly files
  arm64: proton-pack: Include unprivileged eBPF status in Spectre v2 mitigation reporting
  arm64: Use the clearbhb instruction in mitigations
  KVM: arm64: Allow SMCCC_ARCH_WORKAROUND_3 to be discovered and migrated
  arm64: Mitigate spectre style branch history side channels
  arm64: proton-pack: Report Spectre-BHB vulnerabilities as part of Spectre-v2
  arm64: Add percpu vectors for EL1
  arm64: entry: Add macro for reading symbol addresses from the trampoline
  arm64: entry: Add vectors that have the bhb mitigation sequences
  arm64: entry: Add non-kpti __bp_harden_el1_vectors for mitigations
  arm64: entry: Allow the trampoline text to occupy multiple pages
  arm64: entry: Make the kpti trampoline's kpti sequence optional
  arm64: entry: Move trampoline macros out of ifdef'd section
  arm64: entry: Don't assume tramp_vectors is the start of the vectors
  arm64: entry: Allow tramp_alias to access symbols after the 4K boundary
  arm64: entry: Move the trampoline data page before the text page
  arm64: entry: Free up another register on kpti's tramp_exit path
  arm64: entry: Make the trampoline cleanup optional
  KVM: arm64: Allow indirect vectors to be used without SPECTRE_V3A
  arm64: spectre: Rename spectre_v4_patch_fw_mitigation_conduit
  ...
2022-03-14 19:08:31 +00:00
Will Deacon
20fd2ed10f Merge branch 'for-next/mm' into for-next/core
* for-next/mm:
  Documentation: vmcoreinfo: Fix htmldocs warning
  arm64/mm: Drop use_1G_block()
  arm64: avoid flushing icache multiple times on contiguous HugeTLB
  arm64: crash_core: Export MODULES, VMALLOC, and VMEMMAP ranges
  arm64/hugetlb: Define __hugetlb_valid_size()
  arm64/mm: avoid fixmap race condition when create pud mapping
  arm64/mm: Consolidate TCR_EL1 fields
2022-03-14 19:01:18 +00:00
Vijay Balakrishna
031495635b arm64: Do not defer reserve_crashkernel() for platforms with no DMA memory zones
The following patches resulted in deferring crash kernel reservation to
mem_init(), mainly aimed at platforms with DMA memory zones (no IOMMU),
in particular Raspberry Pi 4.

commit 1a8e1cef76 ("arm64: use both ZONE_DMA and ZONE_DMA32")
commit 8424ecdde7 ("arm64: mm: Set ZONE_DMA size based on devicetree's dma-ranges")
commit 0a30c53573 ("arm64: mm: Move reserve_crashkernel() into mem_init()")
commit 2687275a58 ("arm64: Force NO_BLOCK_MAPPINGS if crashkernel reservation is required")

Above changes introduced boot slowdown due to linear map creation for
all the memory banks with NO_BLOCK_MAPPINGS, see discussion[1].  The proposed
changes restore crash kernel reservation to earlier behavior thus avoids
slow boot, particularly for platforms with IOMMU (no DMA memory zones).

Tested changes to confirm no ~150ms boot slowdown on our SoC with IOMMU
and 8GB memory.  Also tested with ZONE_DMA and/or ZONE_DMA32 configs to confirm
no regression to deferring scheme of crash kernel memory reservation.
In both cases successfully collected kernel crash dump.

[1] https://lore.kernel.org/all/9436d033-579b-55fa-9b00-6f4b661c2dd7@linux.microsoft.com/

Signed-off-by: Vijay Balakrishna <vijayb@linux.microsoft.com>
Cc: stable@vger.kernel.org
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Link: https://lore.kernel.org/r/1646242689-20744-1-git-send-email-vijayb@linux.microsoft.com
[will: Add #ifdef CONFIG_KEXEC_CORE guards to fix 'crashk_res' references in allnoconfig build]
Signed-off-by: Will Deacon <will@kernel.org>
2022-03-08 10:22:33 +00:00
Anshuman Khandual
1310222c27 arm64/mm: Drop use_1G_block()
pud_sect_supported() already checks for PUD level block mapping support i.e
on ARM64_4K_PAGES config. Hence pud_sect_supported(), along with some other
required alignment checks can help completely drop use_1G_block().

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/1644988012-25455-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-03-07 21:44:22 +00:00
Christoph Hellwig
dc90f0846d mm: don't include <linux/memremap.h> in <linux/mm.h>
Move the check for the actual pgmap types that need the free at refcount
one behavior into the out of line helper, and thus avoid the need to
pull memremap.h into mm.h.

Link: https://lkml.kernel.org/r/20220210072828.2930359-7-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: "Sierra Guiza, Alejandro (Alex)" <alex.sierra@amd.com>

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Chaitanya Kulkarni <kch@nvidia.com>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-03-03 12:47:33 -05:00
James Morse
a9c406e646 arm64: entry: Allow the trampoline text to occupy multiple pages
Adding a second set of vectors to .entry.tramp.text will make it
larger than a single 4K page.

Allow the trampoline text to occupy up to three pages by adding two
more fixmap slots. Previous changes to tramp_valias allowed it to reach
beyond a single page.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
2022-02-15 17:40:28 +00:00
Jianyong Wu
ee017ee353 arm64/mm: avoid fixmap race condition when create pud mapping
The 'fixmap' is a global resource and is used recursively by
create pud mapping(), leading to a potential race condition in the
presence of a concurrent call to alloc_init_pud():

kernel_init thread                          virtio-mem workqueue thread
==================                          ===========================

  alloc_init_pud(...)                       alloc_init_pud(...)
  pudp = pud_set_fixmap_offset(...)         pudp = pud_set_fixmap_offset(...)
  READ_ONCE(*pudp)
  pud_clear_fixmap(...)
                                            READ_ONCE(*pudp) // CRASH!

As kernel may sleep during creating pud mapping, introduce a mutex lock to
serialise use of the fixmap entries by alloc_init_pud(). However, there is
no need for locking in early boot stage and it doesn't work well with
KASLR enabled when early boot. So, enable lock when system_state doesn't
equal to "SYSTEM_BOOTING".

Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Fixes: f471044545 ("arm64: mm: use fixmap when creating page tables")
Link: https://lore.kernel.org/r/20220201114400.56885-1-jianyong.wu@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-02-15 15:45:24 +00:00
Linus Torvalds
89fa0be0a0 arm64 fixes for -rc1
- Fix double-evaluation of 'pte' macro argument when using 52-bit PAs
 
 - Fix signedness of some MTE prctl PR_* constants
 
 - Fix kmemleak memory usage by skipping early pgtable allocations
 
 - Fix printing of CPU feature register strings
 
 - Remove redundant -nostdlib linker flag for vDSO binaries
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmGL8ukQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNB0XCADIFxVeLM3YMPsnDC6ttmGp/w1TXh5RoNBm
 IY5utdu8ybdrdCYO//Z6i7rQiYwPBRbR8Xts1LAfZDVuD/IPKqxiY+wjMe9FYu0U
 lb14dnT5iI7L3l0yo/5s1EC8FC/pfCMwi50+trAS5H5jo2wg/+6B/bgbymzTd1ZB
 vzVVzyGv1L/3xKHT4uEDjY1jMPRBH4Ugx2bJ1tzRzsC3Gz0D7ZeVd1Dg3ftPMsqI
 MA4Q5QeE9MsafcV8sKDRLnzNSYEr50goA0xtCre81VmJDNcm/y6UybUCF75KU72M
 kAOImXs0+wrDqNIocYgYlSWRu9xjw8qjoxjnyqikFx47HBesjY8T
 =pHQk
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:

 - Fix double-evaluation of 'pte' macro argument when using 52-bit PAs

 - Fix signedness of some MTE prctl PR_* constants

 - Fix kmemleak memory usage by skipping early pgtable allocations

 - Fix printing of CPU feature register strings

 - Remove redundant -nostdlib linker flag for vDSO binaries

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: pgtable: make __pte_to_phys/__phys_to_pte_val inline functions
  arm64: Track no early_pgtable_alloc() for kmemleak
  arm64: mte: change PR_MTE_TCF_NONE back into an unsigned long
  arm64: vdso: remove -nostdlib compiler flag
  arm64: arm64_ftr_reg->name may not be a human-readable string
2021-11-10 11:29:30 -08:00
Qian Cai
c6975d7cab arm64: Track no early_pgtable_alloc() for kmemleak
After switched page size from 64KB to 4KB on several arm64 servers here,
kmemleak starts to run out of early memory pool due to a huge number of
those early_pgtable_alloc() calls:

  kmemleak_alloc_phys()
  memblock_alloc_range_nid()
  memblock_phys_alloc_range()
  early_pgtable_alloc()
  init_pmd()
  alloc_init_pud()
  __create_pgd_mapping()
  __map_memblock()
  paging_init()
  setup_arch()
  start_kernel()

Increased the default value of DEBUG_KMEMLEAK_MEM_POOL_SIZE by 4 times
won't be enough for a server with 200GB+ memory. There isn't much
interesting to check memory leaks for those early page tables and those
early memory mappings should not reference to other memory. Hence, no
kmemleak false positives, and we can safely skip tracking those early
allocations from kmemleak like we did in the commit fed84c7852
("mm/memblock.c: skip kmemleak for kasan_init()") without needing to
introduce complications to automatically scale the value depends on the
runtime memory size etc. After the patch, the default value of
DEBUG_KMEMLEAK_MEM_POOL_SIZE becomes sufficient again.

Signed-off-by: Qian Cai <quic_qiancai@quicinc.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Link: https://lore.kernel.org/r/20211105150509.7826-1-quic_qiancai@quicinc.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-11-08 10:05:22 +00:00
Linus Torvalds
512b7931ad Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton:
 "257 patches.

  Subsystems affected by this patch series: scripts, ocfs2, vfs, and
  mm (slab-generic, slab, slub, kconfig, dax, kasan, debug, pagecache,
  gup, swap, memcg, pagemap, mprotect, mremap, iomap, tracing, vmalloc,
  pagealloc, memory-failure, hugetlb, userfaultfd, vmscan, tools,
  memblock, oom-kill, hugetlbfs, migration, thp, readahead, nommu, ksm,
  vmstat, madvise, memory-hotplug, rmap, zsmalloc, highmem, zram,
  cleanups, kfence, and damon)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (257 commits)
  mm/damon: remove return value from before_terminate callback
  mm/damon: fix a few spelling mistakes in comments and a pr_debug message
  mm/damon: simplify stop mechanism
  Docs/admin-guide/mm/pagemap: wordsmith page flags descriptions
  Docs/admin-guide/mm/damon/start: simplify the content
  Docs/admin-guide/mm/damon/start: fix a wrong link
  Docs/admin-guide/mm/damon/start: fix wrong example commands
  mm/damon/dbgfs: add adaptive_targets list check before enable monitor_on
  mm/damon: remove unnecessary variable initialization
  Documentation/admin-guide/mm/damon: add a document for DAMON_RECLAIM
  mm/damon: introduce DAMON-based Reclamation (DAMON_RECLAIM)
  selftests/damon: support watermarks
  mm/damon/dbgfs: support watermarks
  mm/damon/schemes: activate schemes based on a watermarks mechanism
  tools/selftests/damon: update for regions prioritization of schemes
  mm/damon/dbgfs: support prioritization weights
  mm/damon/vaddr,paddr: support pageout prioritization
  mm/damon/schemes: prioritize regions within the quotas
  mm/damon/selftests: support schemes quotas
  mm/damon/dbgfs: support quotas of schemes
  ...
2021-11-06 14:08:17 -07:00
Mike Rapoport
3ecc68349b memblock: rename memblock_free to memblock_phys_free
Since memblock_free() operates on a physical range, make its name
reflect it and rename it to memblock_phys_free(), so it will be a
logical counterpart to memblock_phys_alloc().

The callers are updated with the below semantic patch:

    @@
    expression addr;
    expression size;
    @@
    - memblock_free(addr, size);
    + memblock_phys_free(addr, size);

Link: https://lkml.kernel.org/r/20210930185031.18648-6-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Juergen Gross <jgross@suse.com>
Cc: Shahab Vahedi <Shahab.Vahedi@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:41 -07:00
Sudarshan Rajagopalan
8fac67ca23 arm64: mm: update max_pfn after memory hotplug
After new memory blocks have been hotplugged, max_pfn and max_low_pfn
needs updating to reflect on new PFNs being hot added to system.
Without this patch, debug-related functions that use max_pfn such as
get_max_dump_pfn() or read_page_owner() will not work with any page in
memory that is hot-added after boot.

Fixes: 4ab2150615 ("arm64: Add memory hotplug support")
Signed-off-by: Sudarshan Rajagopalan <quic_sudaraja@quicinc.com>
Signed-off-by: Chris Goldsworthy <quic_cgoldswo@quicinc.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Georgi Djakov <quic_c_gdjako@quicinc.com>
Tested-by: Georgi Djakov <quic_c_gdjako@quicinc.com>
Link: https://lore.kernel.org/r/a51a27ee7be66024b5ce626310d673f24107bcb8.1632853776.git.quic_cgoldswo@quicinc.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-09-29 17:11:06 +01:00
David Hildenbrand
65a2aa5f48 mm/memory_hotplug: remove nid parameter from arch_remove_memory()
The parameter is unused, let's remove it.

Link: https://lkml.kernel.org/r/20210712124052.26491-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc]
Acked-by: Heiko Carstens <hca@linux.ibm.com>	[s390]
Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Laurent Dufour <ldufour@linux.ibm.com>
Cc: Sergei Trofimovich <slyfox@gentoo.org>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Michel Lespinasse <michel@lespinasse.org>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Cc: Joe Perches <joe@perches.com>
Cc: Pierre Morel <pmorel@linux.ibm.com>
Cc: Jia He <justin.he@arm.com>
Cc: Anton Blanchard <anton@ozlabs.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Len Brown <lenb@kernel.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Scott Cheloha <cheloha@linux.ibm.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-08 11:50:23 -07:00
Jonathan Marek
d8a719059b Revert "mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge"
This reverts commit c742199a01.

c742199a01 ("mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge")
breaks arm64 in at least two ways for configurations where PUD or PMD
folding occur:

  1. We no longer install huge-vmap mappings and silently fall back to
     page-granular entries, despite being able to install block entries
     at what is effectively the PGD level.

  2. If the linear map is backed with block mappings, these will now
     silently fail to be created in alloc_init_pud(), causing a panic
     early during boot.

The pgtable selftests caught this, although a fix has not been
forthcoming and Christophe is AWOL at the moment, so just revert the
change for now to get a working -rc3 on which we can queue patches for
5.15.

A simple revert breaks the build for 32-bit PowerPC 8xx machines, which
rely on the default function definitions when the corresponding
page-table levels are folded, since commit a6a8f7c4aa ("powerpc/8xx:
add support for huge pages on VMAP and VMALLOC"), eg:

  powerpc64-linux-ld: mm/vmalloc.o: in function `vunmap_pud_range':
  linux/mm/vmalloc.c:362: undefined reference to `pud_clear_huge'

To avoid that, add stubs for pud_clear_huge() and pmd_clear_huge() in
arch/powerpc/mm/nohash/8xx.c as suggested by Christophe.

Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Fixes: c742199a01 ("mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge")
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org>
[mpe: Fold in 8xx.c changes from Christophe and mention in change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/linux-arm-kernel/CAMuHMdXShORDox-xxaeUfDW3wx2PeggFSqhVSHVZNKCGK-y_vQ@mail.gmail.com/
Link: https://lore.kernel.org/r/20210717160118.9855-1-jonathan@marek.ca
Link: https://lore.kernel.org/r/87r1fs1762.fsf@mpe.ellerman.id.au
Signed-off-by: Will Deacon <will@kernel.org>
2021-07-21 11:28:09 +01:00
Mike Rapoport
6d47c23b16 set_memory: allow querying whether set_direct_map_*() is actually enabled
On arm64, set_direct_map_*() functions may return 0 without actually
changing the linear map.  This behaviour can be controlled using kernel
parameters, so we need a way to determine at runtime whether calls to
set_direct_map_invalid_noflush() and set_direct_map_default_noflush() have
any effect.

Extend set_memory API with can_set_direct_map() function that allows
checking if calling set_direct_map_*() will actually change the page
table, replace several occurrences of open coded checks in arm64 with the
new function and provide a generic stub for architectures that always
modify page tables upon calls to set_direct_map APIs.

[arnd@arndb.de: arm64: kfence: fix header inclusion ]

Link: https://lkml.kernel.org/r/20210518072034.31572-4-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christopher Lameter <cl@linux.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Elena Reshetova <elena.reshetova@intel.com>
Cc: Hagen Paul Pfeifer <hagen@jauu.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Bottomley <jejb@linux.ibm.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Palmer Dabbelt <palmerdabbelt@google.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rick Edgecombe <rick.p.edgecombe@intel.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tycho Andersen <tycho@tycho.ws>
Cc: Will Deacon <will@kernel.org>
Cc: kernel test robot <lkp@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-08 11:48:20 -07:00
Linus Torvalds
71bd934101 Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:
 "190 patches.

  Subsystems affected by this patch series: mm (hugetlb, userfaultfd,
  vmscan, kconfig, proc, z3fold, zbud, ras, mempolicy, memblock,
  migration, thp, nommu, kconfig, madvise, memory-hotplug, zswap,
  zsmalloc, zram, cleanups, kfence, and hmm), procfs, sysctl, misc,
  core-kernel, lib, lz4, checkpatch, init, kprobes, nilfs2, hfs,
  signals, exec, kcov, selftests, compress/decompress, and ipc"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (190 commits)
  ipc/util.c: use binary search for max_idx
  ipc/sem.c: use READ_ONCE()/WRITE_ONCE() for use_global_lock
  ipc: use kmalloc for msg_queue and shmid_kernel
  ipc sem: use kvmalloc for sem_undo allocation
  lib/decompressors: remove set but not used variabled 'level'
  selftests/vm/pkeys: exercise x86 XSAVE init state
  selftests/vm/pkeys: refill shadow register after implicit kernel write
  selftests/vm/pkeys: handle negative sys_pkey_alloc() return code
  selftests/vm/pkeys: fix alloc_random_pkey() to make it really, really random
  kcov: add __no_sanitize_coverage to fix noinstr for all architectures
  exec: remove checks in __register_bimfmt()
  x86: signal: don't do sas_ss_reset() until we are certain that sigframe won't be abandoned
  hfsplus: report create_date to kstat.btime
  hfsplus: remove unnecessary oom message
  nilfs2: remove redundant continue statement in a while-loop
  kprobes: remove duplicated strong free_insn_page in x86 and s390
  init: print out unknown kernel parameters
  checkpatch: do not complain about positive return values starting with EPOLL
  checkpatch: improve the indented label test
  checkpatch: scripts/spdxcheck.py now requires python3
  ...
2021-07-02 12:08:10 -07:00
Mike Rapoport
873ba46391 arm64: decouple check whether pfn is in linear map from pfn_valid()
The intended semantics of pfn_valid() is to verify whether there is a
struct page for the pfn in question and nothing else.

Yet, on arm64 it is used to distinguish memory areas that are mapped in
the linear map vs those that require ioremap() to access them.

Introduce a dedicated pfn_is_map_memory() wrapper for
memblock_is_map_memory() to perform such check and use it where
appropriate.

Using a wrapper allows to avoid cyclic include dependencies.

While here also update style of pfn_valid() so that both pfn_valid() and
pfn_is_map_memory() declarations will be consistent.

Link: https://lkml.kernel.org/r/20210511100550.28178-4-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-30 20:47:29 -07:00
Christophe Leroy
c742199a01 mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge
For architectures with no PMD and/or no PUD, add stubs similar to what we
have for architectures without P4D.

[christophe.leroy@csgroup.eu: arm64: define only {pud/pmd}_{set/clear}_huge when useful]
  Link: https://lkml.kernel.org/r/73ec95f40cafbbb69bdfb43a7f53876fd845b0ce.1620990479.git.christophe.leroy@csgroup.eu
[christophe.leroy@csgroup.eu: x86: define only {pud/pmd}_{set/clear}_huge when useful]
  Link: https://lkml.kernel.org/r/7fbf1b6bc3e15c07c24fa45278d57064f14c896b.1620930415.git.christophe.leroy@csgroup.eu

Link: https://lkml.kernel.org/r/5ac5976419350e8e048d463a64cae449eb3ba4b0.1620795204.git.christophe.leroy@csgroup.eu
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-30 20:47:26 -07:00
Linus Torvalds
9840cfcb97 arm64 updates for 5.14
- Optimise SVE switching for CPUs with 128-bit implementations.
 
  - Fix output format from SVE selftest.
 
  - Add support for versions v1.2 and 1.3 of the SMC calling convention.
 
  - Allow Pointer Authentication to be configured independently for
    kernel and userspace.
 
  - PMU driver cleanups for managing IRQ affinity and exposing event
    attributes via sysfs.
 
  - KASAN optimisations for both hardware tagging (MTE) and out-of-line
    software tagging implementations.
 
  - Relax frame record alignment requirements to facilitate 8-byte
    alignment with KASAN and Clang.
 
  - Cleanup of page-table definitions and removal of unused memory types.
 
  - Reduction of ARCH_DMA_MINALIGN back to 64 bytes.
 
  - Refactoring of our instruction decoding routines and addition of some
    missing encodings.
 
  - Move entry code moved into C and hardened against harmful compiler
    instrumentation.
 
  - Update booting requirements for the FEAT_HCX feature, added to v8.7
    of the architecture.
 
  - Fix resume from idle when pNMI is being used.
 
  - Additional CPU sanity checks for MTE and preparatory changes for
    systems where not all of the CPUs support 32-bit EL0.
 
  - Update our kernel string routines to the latest Cortex Strings
    implementation.
 
  - Big cleanup of our cache maintenance routines, which were confusingly
    named and inconsistent in their implementations.
 
  - Tweak linker flags so that GDB can understand vmlinux when using RELR
    relocations.
 
  - Boot path cleanups to enable early initialisation of per-cpu
    operations needed by KCSAN.
 
  - Non-critical fixes and miscellaneous cleanup.
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmDUh1YQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNDaUCAC+2Jy2Yopd94uBPYajGybM0rqCUgE7b5n1
 A7UzmQ6fia2hwqCPmxGG+sRabovwN7C1bKrUCc03RIbErIa7wum1edeyqmF/Aw44
 DUDY1MAOSZaFmX8L62QCvxG1hfdLPtGmHMd1hdXvxYK7PCaigEFnzbLRWTtgE+Ok
 JhdvNfsoeITJObHnvYPF3rV3NAbyYni9aNJ5AC/qb3dlf6XigEraXaMj29XHKfwc
 +vmn+25oqFkLHyFeguqIoK+vUQAy/8TjFfjX83eN3LZknNhDJgWS1Iq1Nm+Vxt62
 RvDUUecWJjAooCWgmil6pt0enI+q6E8LcX3A3cWWrM6psbxnYzkU
 =I6KS
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
 "There's a reasonable amount here and the juicy details are all below.

  It's worth noting that the MTE/KASAN changes strayed outside of our
  usual directories due to core mm changes and some associated changes
  to some other architectures; Andrew asked for us to carry these [1]
  rather that take them via the -mm tree.

  Summary:

   - Optimise SVE switching for CPUs with 128-bit implementations.

   - Fix output format from SVE selftest.

   - Add support for versions v1.2 and 1.3 of the SMC calling
     convention.

   - Allow Pointer Authentication to be configured independently for
     kernel and userspace.

   - PMU driver cleanups for managing IRQ affinity and exposing event
     attributes via sysfs.

   - KASAN optimisations for both hardware tagging (MTE) and out-of-line
     software tagging implementations.

   - Relax frame record alignment requirements to facilitate 8-byte
     alignment with KASAN and Clang.

   - Cleanup of page-table definitions and removal of unused memory
     types.

   - Reduction of ARCH_DMA_MINALIGN back to 64 bytes.

   - Refactoring of our instruction decoding routines and addition of
     some missing encodings.

   - Move entry code moved into C and hardened against harmful compiler
     instrumentation.

   - Update booting requirements for the FEAT_HCX feature, added to v8.7
     of the architecture.

   - Fix resume from idle when pNMI is being used.

   - Additional CPU sanity checks for MTE and preparatory changes for
     systems where not all of the CPUs support 32-bit EL0.

   - Update our kernel string routines to the latest Cortex Strings
     implementation.

   - Big cleanup of our cache maintenance routines, which were
     confusingly named and inconsistent in their implementations.

   - Tweak linker flags so that GDB can understand vmlinux when using
     RELR relocations.

   - Boot path cleanups to enable early initialisation of per-cpu
     operations needed by KCSAN.

   - Non-critical fixes and miscellaneous cleanup"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (150 commits)
  arm64: tlb: fix the TTL value of tlb_get_level
  arm64: Restrict undef hook for cpufeature registers
  arm64/mm: Rename ARM64_SWAPPER_USES_SECTION_MAPS
  arm64: insn: avoid circular include dependency
  arm64: smp: Bump debugging information print down to KERN_DEBUG
  drivers/perf: fix the missed ida_simple_remove() in ddr_perf_probe()
  perf/arm-cmn: Fix invalid pointer when access dtc object sharing the same IRQ number
  arm64: suspend: Use cpuidle context helpers in cpu_suspend()
  PSCI: Use cpuidle context helpers in psci_cpu_suspend_enter()
  arm64: Convert cpu_do_idle() to using cpuidle context helpers
  arm64: Add cpuidle context save/restore helpers
  arm64: head: fix code comments in set_cpu_boot_mode_flag
  arm64: mm: drop unused __pa(__idmap_text_start)
  arm64: mm: fix the count comments in compute_indices
  arm64/mm: Fix ttbr0 values stored in struct thread_info for software-pan
  arm64: mm: Pass original fault address to handle_mm_fault()
  arm64/mm: Drop SECTION_[SHIFT|SIZE|MASK]
  arm64/mm: Use CONT_PMD_SHIFT for ARM64_MEMSTART_SHIFT
  arm64/mm: Drop SWAPPER_INIT_MAP_SIZE
  arm64: Conditionally configure PTR_AUTH key of the kernel.
  ...
2021-06-28 14:04:24 -07:00
Anshuman Khandual
2062d44da3 arm64/mm: Rename ARM64_SWAPPER_USES_SECTION_MAPS
ARM64_SWAPPER_USES_SECTION_MAPS implies that a PMD level huge page mappings
are used for swapper, idmap and vmemmap. Lets make it PMD explicit removing
any possible confusion with generic memory sections and also bit generic as
it's applicable for idmap and vmemmap mappings as well. Hence rename it as
ARM64_KERNEL_USES_PMD_MAPS instead.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/1623991622-24294-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-06-21 18:22:23 +01:00
Anshuman Khandual
4aaa87ab3d arm64/mm: Drop SECTION_[SHIFT|SIZE|MASK]
SECTION_[SHIFT|SIZE|MASK] are essentially PMD_[SHIFT|SIZE|MASK]. But these
create confusion being similar to generic sparsemem memory sections, which
are derived from SECTION_SIZE_BITS. Section references have always implied
PMD level block mapping. Instead just use all PMD level macros which would
make it explicit and also remove confusion with sparsmem memory sections.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/1623658706-7182-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-06-15 12:08:39 +01:00
Anshuman Khandual
40221c7376 arm64/mm: Make vmemmap_free() available only with CONFIG_MEMORY_HOTPLUG
vmemmap_free() callsites (mm/sparse.c) and declaration (include/linux/mm.h)
are protected with CONFIG_MEMORY_HOTPLUG. This function is not required if
CONFIG_MEMORY_HOTPLUG is not enabled. Hence move the config wrapper outside
the function definition.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/1621842030-23256-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-05-25 19:23:56 +01:00
Jisheng Zhang
e69012400b arm64: mm: don't use CON and BLK mapping if KFENCE is enabled
When we added KFENCE support for arm64, we intended that it would
force the entire linear map to be mapped at page granularity, but we
only enforced this in arch_add_memory() and not in map_mem(), so
memory mapped at boot time can be mapped at a larger granularity.

When booting a kernel with KFENCE=y and RODATA_FULL=n, this results in
the following WARNING at boot:

[    0.000000] ------------[ cut here ]------------
[    0.000000] WARNING: CPU: 0 PID: 0 at mm/memory.c:2462 apply_to_pmd_range+0xec/0x190
[    0.000000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.13.0-rc1+ #10
[    0.000000] Hardware name: linux,dummy-virt (DT)
[    0.000000] pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO BTYPE=--)
[    0.000000] pc : apply_to_pmd_range+0xec/0x190
[    0.000000] lr : __apply_to_page_range+0x94/0x170
[    0.000000] sp : ffffffc010573e20
[    0.000000] x29: ffffffc010573e20 x28: ffffff801f400000 x27: ffffff801f401000
[    0.000000] x26: 0000000000000001 x25: ffffff801f400fff x24: ffffffc010573f28
[    0.000000] x23: ffffffc01002b710 x22: ffffffc0105fa450 x21: ffffffc010573ee4
[    0.000000] x20: ffffff801fffb7d0 x19: ffffff801f401000 x18: 00000000fffffffe
[    0.000000] x17: 000000000000003f x16: 000000000000000a x15: ffffffc01060b940
[    0.000000] x14: 0000000000000000 x13: 0098968000000000 x12: 0000000098968000
[    0.000000] x11: 0000000000000000 x10: 0000000098968000 x9 : 0000000000000001
[    0.000000] x8 : 0000000000000000 x7 : ffffffc010573ee4 x6 : 0000000000000001
[    0.000000] x5 : ffffffc010573f28 x4 : ffffffc01002b710 x3 : 0000000040000000
[    0.000000] x2 : ffffff801f5fffff x1 : 0000000000000001 x0 : 007800005f400705
[    0.000000] Call trace:
[    0.000000]  apply_to_pmd_range+0xec/0x190
[    0.000000]  __apply_to_page_range+0x94/0x170
[    0.000000]  apply_to_page_range+0x10/0x20
[    0.000000]  __change_memory_common+0x50/0xdc
[    0.000000]  set_memory_valid+0x30/0x40
[    0.000000]  kfence_init_pool+0x9c/0x16c
[    0.000000]  kfence_init+0x20/0x98
[    0.000000]  start_kernel+0x284/0x3f8

Fixes: 840b239863 ("arm64, kfence: enable KFENCE for ARM64")
Cc: <stable@vger.kernel.org> # 5.12.x
Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Marco Elver <elver@google.com>
Tested-by: Marco Elver <elver@google.com>
Link: https://lore.kernel.org/r/20210525104551.2ec37f77@xhacker.debian
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-05-25 14:04:38 +01:00
Linus Torvalds
51595e3b49 Assorted arm64 fixes and clean-ups, the most important:
- Restore terminal stack frame records. Their previous removal caused
   traces which cross secondary_start_kernel to terminate one entry too
   late, with a spurious "0" entry.
 
 - Fix boot warning with pseudo-NMI due to the way we manipulate the PMR
   register.
 
 - ACPI fixes: avoid corruption of interrupt mappings on watchdog probe
   failure (GTDT), prevent unregistering of GIC SGIs.
 
 - Force SPARSEMEM_VMEMMAP as the only memory model, it saves with having
   to test all the other combinations.
 
 - Documentation fixes and updates: tagged address ABI exceptions on
   brk/mmap/mremap(), event stream frequency, update booting requirements
   on the configuration of traps.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmCVba0ACgkQa9axLQDI
 XvEClxAAsqigp+Mnotdr8YUOuXLjHWU41EMShV6WbFcmlViEyZxxtZ5qavw19T3L
 rPxb8hq9QqI8kCd+j4MAU7cdc0ry+047njJmQ3Va0WeiDsbgEfPvLWPguDbeDFXW
 EjKKib+F/u58IffDkn6rVA7ZVPgYHRH+8yw6EdApp0BN4JuxEFzGBzG4EWKXnNHH
 IOu4IIXlbLX+U1kTtUFR4u6i4uBs2pZdEYzo1NF/Joacg14F01CBRuh8U04eeWFD
 HF4pWd4eCl/bLYPurF1rOi1dIUyrPuaPgNInGEdSaocD0hIvQH0r0wyIt+aMmqvK
 9Jm+dDEGeLxQn2nDrXfyldYG5EbFa3OplkUt2MVDDMWwN2Gpsjlnf/ucff/SBT/N
 7D6AL2OH6KDDCsNgU1JH9H6rAlh4nWJcsMBrWmP7aQtBMRyccQLywrt4HXB8cy7E
 +MyhTit05P3lpsrK2uZSFujK35Ts8hxywA7lAlU7YP4ADKu3Noc6qXSaxZRe+1Gb
 O5k3Qdcih0VLE843PjJj8f8fW1ysJW5J60cK9BaZxpB77gNufKkh/hS6YAiA8qkt
 PT3J0jk/cgGvwKK54rW52dG7qvDImgUMGkXGKQnEimgb62DatCZ4ZOPC+UoiDiqO
 SEd1DSW0Lt1VxVIulAjatVgzIJGM0jGCm9L7/vBguR0+Lahakbg=
 =vYok
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull more arm64 updates from Catalin Marinas:
 "A mix of fixes and clean-ups that turned up too late for the first
  pull request:

   - Restore terminal stack frame records. Their previous removal caused
     traces which cross secondary_start_kernel to terminate one entry
     too late, with a spurious "0" entry.

   - Fix boot warning with pseudo-NMI due to the way we manipulate the
     PMR register.

   - ACPI fixes: avoid corruption of interrupt mappings on watchdog
     probe failure (GTDT), prevent unregistering of GIC SGIs.

   - Force SPARSEMEM_VMEMMAP as the only memory model, it saves with
     having to test all the other combinations.

   - Documentation fixes and updates: tagged address ABI exceptions on
     brk/mmap/mremap(), event stream frequency, update booting
     requirements on the configuration of traps"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: kernel: Update the stale comment
  arm64: Fix the documented event stream frequency
  arm64: entry: always set GIC_PRIO_PSR_I_SET during entry
  arm64: Explicitly document boot requirements for SVE
  arm64: Explicitly require that FPSIMD instructions do not trap
  arm64: Relax booting requirements for configuration of traps
  arm64: cpufeatures: use min and max
  arm64: stacktrace: restore terminal records
  arm64/vdso: Discard .note.gnu.property sections in vDSO
  arm64: doc: Add brk/mmap/mremap() to the Tagged Address ABI Exceptions
  psci: Remove unneeded semicolon
  ACPI: irq: Prevent unregistering of GIC SGIs
  ACPI: GTDT: Don't corrupt interrupt mappings on watchdow probe failure
  arm64: Show three registers per line
  arm64: remove HAVE_DEBUG_BUGVERBOSE
  arm64: alternative: simplify passing alt_region
  arm64: Force SPARSEMEM_VMEMMAP as the only memory management model
  arm64: vdso32: drop -no-integrated-as flag
2021-05-07 12:11:05 -07:00
Nicholas Piggin
168a633314 arm64: inline huge vmap supported functions
This allows unsupported levels to be constant folded away, and so
p4d_free_pud_page can be removed because it's no longer linked to.

Link: https://lkml.kernel.org/r/20210317062402.533919-9-npiggin@gmail.com
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ding Tianhong <dingtianhong@huawei.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-04-30 11:20:40 -07:00
Nicholas Piggin
bbc180a5ad mm: HUGE_VMAP arch support cleanup
This changes the awkward approach where architectures provide init
functions to determine which levels they can provide large mappings for,
to one where the arch is queried for each call.

This removes code and indirection, and allows constant-folding of dead
code for unsupported levels.

This also adds a prot argument to the arch query.  This is unused
currently but could help with some architectures (e.g., some powerpc
processors can't map uncacheable memory with large pages).

Link: https://lkml.kernel.org/r/20210317062402.533919-7-npiggin@gmail.com
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Ding Tianhong <dingtianhong@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64]
Cc: Will Deacon <will@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-04-30 11:20:40 -07:00
Linus Torvalds
31a24ae89c arm64 updates for 5.13:
- MTE asynchronous support for KASan. Previously only synchronous
   (slower) mode was supported. Asynchronous is faster but does not allow
   precise identification of the illegal access.
 
 - Run kernel mode SIMD with softirqs disabled. This allows using NEON in
   softirq context for crypto performance improvements. The conditional
   yield support is modified to take softirqs into account and reduce the
   latency.
 
 - Preparatory patches for Apple M1: handle CPUs that only have the VHE
   mode available (host kernel running at EL2), add FIQ support.
 
 - arm64 perf updates: support for HiSilicon PA and SLLC PMU drivers, new
   functions for the HiSilicon HHA and L3C PMU, cleanups.
 
 - Re-introduce support for execute-only user permissions but only when
   the EPAN (Enhanced Privileged Access Never) architecture feature is
   available.
 
 - Disable fine-grained traps at boot and improve the documented boot
   requirements.
 
 - Support CONFIG_KASAN_VMALLOC on arm64 (only with KASAN_GENERIC).
 
 - Add hierarchical eXecute Never permissions for all page tables.
 
 - Add arm64 prctl(PR_PAC_{SET,GET}_ENABLED_KEYS) allowing user programs
   to control which PAC keys are enabled in a particular task.
 
 - arm64 kselftests for BTI and some improvements to the MTE tests.
 
 - Minor improvements to the compat vdso and sigpage.
 
 - Miscellaneous cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmB5xkkACgkQa9axLQDI
 XvEBgRAAsr6r8gsBQJP3FDHmbtbVf2ej5QJTCOAQAGHbTt0JH7Pk03pWSBr7h5nF
 vsddRDxxeDgB6xd7jWP7EvDaPxHeB0CdSj5gG8EP/ZdOm8sFAwB1ZIHWikgUgSwW
 nu6R28yXTMSj+EkyFtahMhTMJ1EMF4sCPuIgAo59ST5w/UMMqLCJByOu4ej6RPKZ
 aeSJJWaDLBmbgnTKWxRvCc/MgIx4J/LAHWGkdpGjuMK6SLp38Kdf86XcrklXtzwf
 K30ZYeoKq8zZ+nFOsK9gBVlOlocZcbS1jEbN842jD6imb6vKLQtBWrKk9A6o4v5E
 XulORWcSBhkZb3ItIU9+6SmelUExf0VeVlSp657QXYPgquoIIGvFl6rCwhrdGMGO
 bi6NZKCfJvcFZJoIN1oyhuHejgZSBnzGEcvhvzNdg7ItvOCed7q3uXcGHz/OI6tL
 2TZKddzHSEMVfTo0D+RUsYfasZHI1qAiQ0mWVC31c+YHuRuW/K/jlc3a5TXlSBUa
 Dwu0/zzMLiqx65ISx9i7XNMrngk55uzrS6MnwSByPoz4M4xsElZxt3cbUxQ8YAQz
 jhxTHs1Pwes8i7f4n61ay/nHCFbmVvN/LlsPRpZdwd8JumThLrDolF3tc6aaY0xO
 hOssKtnGY4Xvh/WitfJ5uvDb1vMObJKTXQEoZEJh4hlNQDxdeUE=
 =6NGI
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:

 - MTE asynchronous support for KASan. Previously only synchronous
   (slower) mode was supported. Asynchronous is faster but does not
   allow precise identification of the illegal access.

 - Run kernel mode SIMD with softirqs disabled. This allows using NEON
   in softirq context for crypto performance improvements. The
   conditional yield support is modified to take softirqs into account
   and reduce the latency.

 - Preparatory patches for Apple M1: handle CPUs that only have the VHE
   mode available (host kernel running at EL2), add FIQ support.

 - arm64 perf updates: support for HiSilicon PA and SLLC PMU drivers,
   new functions for the HiSilicon HHA and L3C PMU, cleanups.

 - Re-introduce support for execute-only user permissions but only when
   the EPAN (Enhanced Privileged Access Never) architecture feature is
   available.

 - Disable fine-grained traps at boot and improve the documented boot
   requirements.

 - Support CONFIG_KASAN_VMALLOC on arm64 (only with KASAN_GENERIC).

 - Add hierarchical eXecute Never permissions for all page tables.

 - Add arm64 prctl(PR_PAC_{SET,GET}_ENABLED_KEYS) allowing user programs
   to control which PAC keys are enabled in a particular task.

 - arm64 kselftests for BTI and some improvements to the MTE tests.

 - Minor improvements to the compat vdso and sigpage.

 - Miscellaneous cleanups.

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (86 commits)
  arm64/sve: Add compile time checks for SVE hooks in generic functions
  arm64/kernel/probes: Use BUG_ON instead of if condition followed by BUG.
  arm64: pac: Optimize kernel entry/exit key installation code paths
  arm64: Introduce prctl(PR_PAC_{SET,GET}_ENABLED_KEYS)
  arm64: mte: make the per-task SCTLR_EL1 field usable elsewhere
  arm64/sve: Remove redundant system_supports_sve() tests
  arm64: fpsimd: run kernel mode NEON with softirqs disabled
  arm64: assembler: introduce wxN aliases for wN registers
  arm64: assembler: remove conditional NEON yield macros
  kasan, arm64: tests supports for HW_TAGS async mode
  arm64: mte: Report async tag faults before suspend
  arm64: mte: Enable async tag check fault
  arm64: mte: Conditionally compile mte_enable_kernel_*()
  arm64: mte: Enable TCO in functions that can read beyond buffer limits
  kasan: Add report for async mode
  arm64: mte: Drop arch_enable_tagging()
  kasan: Add KASAN mode kernel parameter
  arm64: mte: Add asynchronous mode support
  arm64: Get rid of CONFIG_ARM64_VHE
  arm64: Cope with CPUs stuck in VHE mode
  ...
2021-04-26 10:25:03 -07:00
Catalin Marinas
782276b4d0 arm64: Force SPARSEMEM_VMEMMAP as the only memory management model
Currently arm64 allows a choice of FLATMEM, SPARSEMEM and
SPARSEMEM_VMEMMAP. However, only the latter is tested regularly. FLATMEM
does not seem to boot in certain configurations (guest under KVM with
Qemu as a VMM). Since the reduction of the SECTION_SIZE_BITS to 27 (4K
pages) or 29 (64K page), there's little argument against the memory
wasted by the mem_map array with SPARSEMEM.

Make SPARSEMEM_VMEMMAP the only available option, non-selectable, and
remove the corresponding #ifdefs under arch/arm64/.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Link: https://lore.kernel.org/r/20210420093559.23168-1-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-04-23 14:18:21 +01:00
Pavel Tatashin
ee7febce05 arm64: mm: correct the inside linear map range during hotplug check
Memory hotplug may fail on systems with CONFIG_RANDOMIZE_BASE because the
linear map range is not checked correctly.

The start physical address that linear map covers can be actually at the
end of the range because of randomization. Check that and if so reduce it
to 0.

This can be verified on QEMU with setting kaslr-seed to ~0ul:

memstart_offset_seed = 0xffff
START: __pa(_PAGE_OFFSET(vabits_actual)) = ffff9000c0000000
END:   __pa(PAGE_END - 1) =  1000bfffffff

Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
Fixes: 58284a901b ("arm64/mm: Validate hotplug range before creating linear mapping")
Tested-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20210216150351.129018-2-pasha.tatashin@soleen.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-03-22 12:47:40 +00:00
Ard Biesheuvel
87143f404f arm64: mm: use XN table mapping attributes for the linear region
The way the arm64 kernel virtual address space is constructed guarantees
that swapper PGD entries are never shared between the linear region on
the one hand, and the vmalloc region on the other, which is where all
kernel text, module text and BPF text mappings reside.

This means that mappings in the linear region (which never require
executable permissions) never share any table entries at any level with
mappings that do require executable permissions, and so we can set the
table-level PXN attributes for all table entries that are created while
setting up mappings in the linear region. Since swapper's PGD level page
table is mapped r/o itself, this adds another layer of robustness to the
way the kernel manages its own page tables. While at it, set the UXN
attribute as well for all kernel mappings created at boot.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20210310104942.174584-3-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-03-19 18:50:45 +00:00