Commit Graph

1314 Commits

Author SHA1 Message Date
Linus Torvalds
f23cdfcd04 IOMMU Updates for Linux v6.1:
Including:
 
 	- Removal of the bus_set_iommu() interface which became
 	  unnecesary because of IOMMU per-device probing
 
 	- Make the dma-iommu.h header private
 
 	- Intel VT-d changes from Lu Baolu:
 	  - Decouple PASID and PRI from SVA
 	  - Add ESRTPS & ESIRTPS capability check
 	  - Cleanups
 
 	- Apple DART support for the M1 Pro/MAX SOCs
 
 	- Support for AMD IOMMUv2 page-tables for the DMA-API layer. The
 	  v2 page-tables are compatible with the x86 CPU page-tables.
 	  Using them for DMA-API prepares support for hardware-assisted
 	  IOMMU virtualization
 
 	- Support for MT6795 Helio X10 M4Us in the Mediatek IOMMU driver
 
 	- Some smaller fixes and cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmNEC5oACgkQK/BELZcB
 GuNcOQ/6A5SXmcvDRLYZW1ENM5Z6xsZ1LabSZkjhYSpmbJyu8Uny/Z2aRWqxPMLJ
 hJeHTsWSLhrTq1VfjFhELHB3kgT2DRr7H3LXXaMNC6qz690EcavX1wKX2AxH0m22
 8YrktkyAmFQ3BG6rsQLdlMMasLph/x06ix/xO9opQZVFdj/fV0Jx7ekX1JK+U3hx
 MI96i5W3G5PBVHBypAvjxSlmA4saj9Fhk7l3IZL7py9AOKz7NypuwWRs+86PMBiO
 EzLt5aF4g8pmKChF/c9BsoIbjBYvTG/s3NbycIng0ACc2SOvf+EvtoVZQclWifbT
 lwti9PLdsoVUnPOZHLYOTx4xSf/UyoLVzaLxJ52aoXnNYe2qaX5DANXhT2mWIY/Y
 z1mzOkShmK7WF7a8arRyqJeLJ4SvDx8GrbvLiom3DAzmqVHzzFGadHtt5fvGYN4F
 Jet/JIN3HjECQbamqtPBpWquBFhLmgusPksIiyMFscRvYdZqkaVkTkElcF3WqAMm
 QkeecfoTQ9Vdtdz44ZVLRjKpS77yRZmHshp1r/rfSI+9Ok8uRI+xmmcyrAI6ElqH
 DH14tLHPzw694rTHF+bTCd+pPMGOoFLi0xAfUXAeGWm1uzC1JIRrVu5JeQNOUOSD
 5SQDXB7dPrhXngaws5Fx2u3amCO3688mslcGgM7q54kC+LyVo0E=
 =h0sT
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu updates from Joerg Roedel:

 - remove the bus_set_iommu() interface which became unnecesary because
   of IOMMU per-device probing

 - make the dma-iommu.h header private

 - Intel VT-d changes from Lu Baolu:
	  - Decouple PASID and PRI from SVA
	  - Add ESRTPS & ESIRTPS capability check
	  - Cleanups

 - Apple DART support for the M1 Pro/MAX SOCs

 - support for AMD IOMMUv2 page-tables for the DMA-API layer.

   The v2 page-tables are compatible with the x86 CPU page-tables. Using
   them for DMA-API prepares support for hardware-assisted IOMMU
   virtualization

 - support for MT6795 Helio X10 M4Us in the Mediatek IOMMU driver

 - some smaller fixes and cleanups

* tag 'iommu-updates-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (59 commits)
  iommu/vt-d: Avoid unnecessary global DMA cache invalidation
  iommu/vt-d: Avoid unnecessary global IRTE cache invalidation
  iommu/vt-d: Rename cap_5lp_support to cap_fl5lp_support
  iommu/vt-d: Remove pasid_set_eafe()
  iommu/vt-d: Decouple PASID & PRI enabling from SVA
  iommu/vt-d: Remove unnecessary SVA data accesses in page fault path
  dt-bindings: iommu: arm,smmu-v3: Relax order of interrupt names
  iommu: dart: Support t6000 variant
  iommu/io-pgtable-dart: Add DART PTE support for t6000
  iommu/io-pgtable: Add DART subpage protection support
  iommu/io-pgtable: Move Apple DART support to its own file
  iommu/mediatek: Add support for MT6795 Helio X10 M4Us
  iommu/mediatek: Introduce new flag TF_PORT_TO_ADDR_MT8173
  dt-bindings: mediatek: Add bindings for MT6795 M4U
  iommu/iova: Fix module config properly
  iommu/amd: Fix sparse warning
  iommu/amd: Remove outdated comment
  iommu/amd: Free domain ID after domain_flush_pages
  iommu/amd: Free domain id in error path
  iommu/virtio: Fix compile error with viommu_capable()
  ...
2022-10-10 13:20:53 -07:00
Linus Torvalds
18fd049731 arm64 updates for 6.1:
- arm64 perf: DDR PMU driver for Alibaba's T-Head Yitian 710 SoC, SVE
   vector granule register added to the user regs together with SVE perf
   extensions documentation.
 
 - SVE updates: add HWCAP for SVE EBF16, update the SVE ABI documentation
   to match the actual kernel behaviour (zeroing the registers on syscall
   rather than "zeroed or preserved" previously).
 
 - More conversions to automatic system registers generation.
 
 - vDSO: use self-synchronising virtual counter access in gettimeofday()
   if the architecture supports it.
 
 - arm64 stacktrace cleanups and improvements.
 
 - arm64 atomics improvements: always inline assembly, remove LL/SC
   trampolines.
 
 - Improve the reporting of EL1 exceptions: rework BTI and FPAC exception
   handling, better EL1 undefs reporting.
 
 - Cortex-A510 erratum 2658417: remove BF16 support due to incorrect
   result.
 
 - arm64 defconfig updates: build CoreSight as a module, enable options
   necessary for docker, memory hotplug/hotremove, enable all PMUs
   provided by Arm.
 
 - arm64 ptrace() support for TPIDR2_EL0 (register provided with the SME
   extensions).
 
 - arm64 ftraces updates/fixes: fix module PLTs with mcount, remove
   unused function.
 
 - kselftest updates for arm64: simple HWCAP validation, FP stress test
   improvements, validation of ZA regs in signal handlers, include larger
   SVE and SME vector lengths in signal tests, various cleanups.
 
 - arm64 alternatives (code patching) improvements to robustness and
   consistency: replace cpucap static branches with equivalent
   alternatives, associate callback alternatives with a cpucap.
 
 - Miscellaneous updates: optimise kprobe performance of patching
   single-step slots, simplify uaccess_mask_ptr(), move MTE registers
   initialisation to C, support huge vmalloc() mappings, run softirqs on
   the per-CPU IRQ stack, compat (arm32) misalignment fixups for
   multiword accesses.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmM9W4cACgkQa9axLQDI
 XvEy3w/+LJ3KCFowWiz5gTAWikjv+UVssHjLMJixn47V7hsEFQ26Xnam/438rTMI
 kE95u6DHUpw2SMIxKzFRO7oI5cQtP+cWGwTtOUnjVO+U1oN+HqDOIbO9DbylWDcU
 eeeqMMmawMfTPuZrYklpOhXscsorbrKIvYBg7wHYOcwBYV3EPhWr89lwMvTVRuyJ
 qpX628KlkGMaBcONNhv3nS3qZcAOs0oHQCAVS4C8czLDL+vtJlumXUS3xr1Mqm72
 xtFe7sje8Djr2kZ8mzh0GbFiZEBoBD3F/l7ayq8gVRaVpToUt8sk36Stjs4LojF1
 6imuAfji/5TItkScq5KhGqj6MIugwp/eUVbRN74OLNTYx7msF1ZADNFQ+Q0UuY0H
 SYK13KvmOji0xjS8qAfhqrwNB79sk3fb+zF9LjETbdz4ZJCgg9gcFbSUTY0DvMfS
 MXZk/jVeB07olA8xYbjh0BRt4UV9xU628FPQzK5k7e4Nzl4jSvgtJZCZanfuVtjy
 /ZS1vbN8o7tQLBAlVnw+Exi/VedkKxkkMgm8tPKsMgERTFDx0Pc4Gs72hRpDnPWT
 MRbeCCGleAf3JQ5vF0coBDNOCEVvweQgShHOyHTz0GyhWXLCFx3RJICo5I4EIpps
 LLUk4JK0fO3LVrf1AEpu5ZP4+Sact0zfsH3gB7qyLPYFDmjDXD8=
 =jl3Z
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:

 - arm64 perf: DDR PMU driver for Alibaba's T-Head Yitian 710 SoC, SVE
   vector granule register added to the user regs together with SVE perf
   extensions documentation.

 - SVE updates: add HWCAP for SVE EBF16, update the SVE ABI
   documentation to match the actual kernel behaviour (zeroing the
   registers on syscall rather than "zeroed or preserved" previously).

 - More conversions to automatic system registers generation.

 - vDSO: use self-synchronising virtual counter access in gettimeofday()
   if the architecture supports it.

 - arm64 stacktrace cleanups and improvements.

 - arm64 atomics improvements: always inline assembly, remove LL/SC
   trampolines.

 - Improve the reporting of EL1 exceptions: rework BTI and FPAC
   exception handling, better EL1 undefs reporting.

 - Cortex-A510 erratum 2658417: remove BF16 support due to incorrect
   result.

 - arm64 defconfig updates: build CoreSight as a module, enable options
   necessary for docker, memory hotplug/hotremove, enable all PMUs
   provided by Arm.

 - arm64 ptrace() support for TPIDR2_EL0 (register provided with the SME
   extensions).

 - arm64 ftraces updates/fixes: fix module PLTs with mcount, remove
   unused function.

 - kselftest updates for arm64: simple HWCAP validation, FP stress test
   improvements, validation of ZA regs in signal handlers, include
   larger SVE and SME vector lengths in signal tests, various cleanups.

 - arm64 alternatives (code patching) improvements to robustness and
   consistency: replace cpucap static branches with equivalent
   alternatives, associate callback alternatives with a cpucap.

 - Miscellaneous updates: optimise kprobe performance of patching
   single-step slots, simplify uaccess_mask_ptr(), move MTE registers
   initialisation to C, support huge vmalloc() mappings, run softirqs on
   the per-CPU IRQ stack, compat (arm32) misalignment fixups for
   multiword accesses.

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (126 commits)
  arm64: alternatives: Use vdso/bits.h instead of linux/bits.h
  arm64/kprobe: Optimize the performance of patching single-step slot
  arm64: defconfig: Add Coresight as module
  kselftest/arm64: Handle EINTR while reading data from children
  kselftest/arm64: Flag fp-stress as exiting when we begin finishing up
  kselftest/arm64: Don't repeat termination handler for fp-stress
  ARM64: reloc_test: add __init/__exit annotations to module init/exit funcs
  arm64/mm: fold check for KFENCE into can_set_direct_map()
  arm64: ftrace: fix module PLTs with mcount
  arm64: module: Remove unused plt_entry_is_initialized()
  arm64: module: Make plt_equals_entry() static
  arm64: fix the build with binutils 2.27
  kselftest/arm64: Don't enable v8.5 for MTE selftest builds
  arm64: uaccess: simplify uaccess_mask_ptr()
  arm64: asm/perf_regs.h: Avoid C++-style comment in UAPI header
  kselftest/arm64: Fix typo in hwcap check
  arm64: mte: move register initialization to C
  arm64: mm: handle ARM64_KERNEL_USES_PMD_MAPS in vmemmap_populate()
  arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()
  arm64/sve: Add Perf extensions documentation
  ...
2022-10-06 11:51:49 -07:00
Linus Torvalds
865dad2022 kcfi updates for v6.1-rc1
This replaces the prior support for Clang's standard Control Flow
 Integrity (CFI) instrumentation, which has required a lot of special
 conditions (e.g. LTO) and work-arounds. The current implementation
 ("Kernel CFI") is specific to C, directly designed for the Linux kernel,
 and takes advantage of architectural features like x86's IBT. This
 series retains arm64 support and adds x86 support. Additional "generic"
 architectural support is expected soon:
 https://github.com/samitolvanen/llvm-project/commits/kcfi_generic
 
 - treewide: Remove old CFI support details
 
 - arm64: Replace Clang CFI support with Clang KCFI support
 
 - x86: Introduce Clang KCFI support
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmM4aAUWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJkgWD/4mUgb7xewNIG/+fuipGd620Iao
 K0T8q4BNxLNRltOxNc3Q0WMDCggX0qJGCeds7EdFQJQOGxWcbifM8MAS4idAGM0G
 fc3Gxl1imC/oF6goCAbQgndA6jYFIWXGsv8LsRjAXRidWLFr3GFAqVqYJyokSySr
 8zMQsEDuF4I1gQnOhEWdtPZbV3MQ4ZjfFzpv+33agbq6Gb72vKvDh3G6g2VXlxjt
 1qnMtS+eEpbBU65cJkOi4MSLgymWbnIAeTMb0dbsV4kJ08YoTl8uz1B+weeH6GgT
 WP73ZJ4nqh1kkkT9EqS9oKozNB9fObhvCokEuAjuQ7i1eCEZsbShvRc0iL7OKTGG
 UfuTJa5qQ4h7Z0JS35FCSJETa+fcG0lTyEd133nLXLMZP9K2antf+A6O//fd0J1V
 Jg4VN7DQmZ+UNGOzRkL6dTtQUy4PkxhniIloaClfSYXxhNirA+v//sHTnTK3z2Bl
 6qceYqmFmns2Laual7+lvnZgt6egMBcmAL/MOdbU74+KIR9Xw76wxQjifktHX+WF
 FEUQkUJDB5XcUyKlbvHoqobRMxvEZ8RIlC5DIkgFiPRE3TI0MqfzNSFnQ/6+lFNg
 Y0AS9HYJmcj8sVzAJ7ji24WPFCXzsbFn6baJa9usDNbWyQZokYeiv7ZPNPHPDVrv
 YEBP6aYko0lVSUS9qw==
 =Li4D
 -----END PGP SIGNATURE-----

Merge tag 'kcfi-v6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull kcfi updates from Kees Cook:
 "This replaces the prior support for Clang's standard Control Flow
  Integrity (CFI) instrumentation, which has required a lot of special
  conditions (e.g. LTO) and work-arounds.

  The new implementation ("Kernel CFI") is specific to C, directly
  designed for the Linux kernel, and takes advantage of architectural
  features like x86's IBT. This series retains arm64 support and adds
  x86 support.

  GCC support is expected in the future[1], and additional "generic"
  architectural support is expected soon[2].

  Summary:

   - treewide: Remove old CFI support details

   - arm64: Replace Clang CFI support with Clang KCFI support

   - x86: Introduce Clang KCFI support"

Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107048 [1]
Link: https://github.com/samitolvanen/llvm-project/commits/kcfi_generic [2]

* tag 'kcfi-v6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (22 commits)
  x86: Add support for CONFIG_CFI_CLANG
  x86/purgatory: Disable CFI
  x86: Add types to indirectly called assembly functions
  x86/tools/relocs: Ignore __kcfi_typeid_ relocations
  kallsyms: Drop CONFIG_CFI_CLANG workarounds
  objtool: Disable CFI warnings
  objtool: Preserve special st_shndx indexes in elf_update_symbol
  treewide: Drop __cficanonical
  treewide: Drop WARN_ON_FUNCTION_MISMATCH
  treewide: Drop function_nocfi
  init: Drop __nocfi from __init
  arm64: Drop unneeded __nocfi attributes
  arm64: Add CFI error handling
  arm64: Add types to indirect called assembly functions
  psci: Fix the function type for psci_initcall_t
  lkdtm: Emit an indirect call for CFI tests
  cfi: Add type helper macros
  cfi: Switch to -fsanitize=kcfi
  cfi: Drop __CFI_ADDRESSABLE
  cfi: Remove CONFIG_CFI_CLANG_SHADOW
  ...
2022-10-03 17:11:07 -07:00
Catalin Marinas
53630a1f61 Merge branch 'for-next/misc' into for-next/core
* for-next/misc:
  : Miscellaneous patches
  arm64/kprobe: Optimize the performance of patching single-step slot
  ARM64: reloc_test: add __init/__exit annotations to module init/exit funcs
  arm64/mm: fold check for KFENCE into can_set_direct_map()
  arm64: uaccess: simplify uaccess_mask_ptr()
  arm64: mte: move register initialization to C
  arm64: mm: handle ARM64_KERNEL_USES_PMD_MAPS in vmemmap_populate()
  arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()
  arm64: support huge vmalloc mappings
  arm64: spectre: increase parameters that can be used to turn off bhb mitigation individually
  arm64: run softirqs on the per-CPU IRQ stack
  arm64: compat: Implement misalignment fixups for multiword loads
2022-09-30 09:18:26 +01:00
Mike Rapoport
b9dd04a20f arm64/mm: fold check for KFENCE into can_set_direct_map()
KFENCE requires linear map to be mapped at page granularity, so that it
is possible to protect/unprotect single pages, just like with
rodata_full and DEBUG_PAGEALLOC.

Instead of repating

	can_set_direct_map() || IS_ENABLED(CONFIG_KFENCE)

make can_set_direct_map() handle the KFENCE case.

This also prevents potential false positives in kernel_page_present()
that may return true for non-present page if CONFIG_KFENCE is enabled.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20220921074841.382615-1-rppt@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-29 17:59:28 +01:00
Sami Tolvanen
c50d32859e arm64: Add types to indirect called assembly functions
With CONFIG_CFI_CLANG, assembly functions indirectly called from C
code must be annotated with type identifiers to pass CFI checking. Use
SYM_TYPED_FUNC_START for the indirectly called functions, and ensure
we emit `bti c` also with SYM_TYPED_FUNC_START.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Kees Cook <keescook@chromium.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220908215504.3686827-10-samitolvanen@google.com
2022-09-26 10:13:13 -07:00
Joerg Roedel
38713c6028 Merge branches 'apple/dart', 'arm/mediatek', 'arm/omap', 'arm/smmu', 'virtio', 'x86/vt-d', 'x86/amd' and 'core' into next 2022-09-26 15:52:31 +02:00
Peter Collingbourne
973b9e3733 arm64: mte: move register initialization to C
If FEAT_MTE2 is disabled via the arm64.nomte command line argument on a
CPU that claims to support FEAT_MTE2, the kernel will use Tagged Normal
in the MAIR. If we interpret arm64.nomte to mean that the CPU does not
in fact implement FEAT_MTE2, setting the system register like this may
lead to UNSPECIFIED behavior. Fix it by arranging for MAIR to be set
in the C function cpu_enable_mte which is called based on the sanitized
version of the system register.

There is no need for the rest of the MTE-related system register
initialization to happen from assembly, with the exception of TCR_EL1,
which must be set to include at least TBI1 because the secondary CPUs
access KASan-allocated data structures early. Therefore, make the TCR_EL1
initialization unconditional and move the rest of the initialization to
cpu_enable_mte so that we no longer have a dependency on the unsanitized
ID register value.

Co-developed-by: Evgenii Stepanov <eugenis@google.com>
Signed-off-by: Peter Collingbourne <pcc@google.com>
Signed-off-by: Evgenii Stepanov <eugenis@google.com>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: kernel test robot <lkp@intel.com>
Fixes: 3b714d24ef ("arm64: mte: CPU feature detection and initial sysreg configuration")
Cc: <stable@vger.kernel.org> # 5.10.x
Link: https://lore.kernel.org/r/20220915222053.3484231-1-eugenis@google.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-22 18:02:50 +01:00
Kefeng Wang
739e49e0fc arm64: mm: handle ARM64_KERNEL_USES_PMD_MAPS in vmemmap_populate()
Directly check ARM64_SWAPPER_USES_SECTION_MAPS to choose base page
or PMD level huge page mapping in vmemmap_populate() to simplify
code a bit.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Link: https://lore.kernel.org/r/20220920014951.196191-1-wangkefeng.wang@huawei.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-22 16:25:51 +01:00
Will Deacon
c44094eee3 arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()
arch_dma_prep_coherent() is called when preparing a non-cacheable region
for a consistent DMA buffer allocation. Since the buffer pages may
previously have been written via a cacheable mapping and consequently
allocated as dirty cachelines, the purpose of this function is to remove
these dirty lines from the cache, writing them back so that the
non-coherent device is able to see them.

On arm64, this operation can be achieved with a clean to the point of
coherency; a subsequent invalidation is not required and serves little
purpose in the presence of a cacheable alias (e.g. the linear map),
since clean lines can be speculatively fetched back into the cache after
the invalidation operation has completed.

Relax the cache maintenance in arch_dma_prep_coherent() so that only a
clean, and not a clean-and-invalidate operation is performed.

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220823122111.17439-1-will@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-22 15:09:31 +01:00
Mark Rutland
61d2d1808b arm64: mm: don't acquire mutex when rewriting swapper
Since commit:

  47546a1912 ("arm64: mm: install KPTI nG mappings with MMU enabled)"

... when building with CONFIG_DEBUG_ATOMIC_SLEEP=y and booting under
QEMU TCG with '-cpu max', there's a boot-time splat:

| BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580
| in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 15, name: migration/0
| preempt_count: 1, expected: 0
| RCU nest depth: 0, expected: 0
| no locks held by migration/0/15.
| irq event stamp: 28
| hardirqs last  enabled at (27): [<ffff8000091ed180>] _raw_spin_unlock_irq+0x3c/0x7c
| hardirqs last disabled at (28): [<ffff8000081b8d74>] multi_cpu_stop+0x150/0x18c
| softirqs last  enabled at (0): [<ffff80000809a314>] copy_process+0x594/0x1964
| softirqs last disabled at (0): [<0000000000000000>] 0x0
| CPU: 0 PID: 15 Comm: migration/0 Not tainted 6.0.0-rc3-00002-g419b42ff7eef #3
| Hardware name: linux,dummy-virt (DT)
| Stopper: multi_cpu_stop+0x0/0x18c <- stop_cpus.constprop.0+0xa0/0xfc
| Call trace:
|  dump_backtrace.part.0+0xd0/0xe0
|  show_stack+0x1c/0x5c
|  dump_stack_lvl+0x88/0xb4
|  dump_stack+0x1c/0x38
|  __might_resched+0x180/0x230
|  __might_sleep+0x4c/0xa0
|  __mutex_lock+0x5c/0x450
|  mutex_lock_nested+0x30/0x40
|  create_kpti_ng_temp_pgd+0x4fc/0x6d0
|  kpti_install_ng_mappings+0x2b8/0x3b0
|  cpu_enable_non_boot_scope_capabilities+0x7c/0xd0
|  multi_cpu_stop+0xa0/0x18c
|  cpu_stopper_thread+0x88/0x11c
|  smpboot_thread_fn+0x1ec/0x290
|  kthread+0x118/0x120
|  ret_from_fork+0x10/0x20

Since commit:

  ee017ee353 ("arm64/mm: avoid fixmap race condition when create pud mapping")

... once the kernel leave the SYSTEM_BOOTING state, the fixmap pagetable
entries are protected by the fixmap_lock mutex.

The new KPTI rewrite code uses __create_pgd_mapping() to create a
temporary pagetable. This happens in atomic context, after secondary
CPUs are brought up and the kernel has left the SYSTEM_BOOTING state.
Hence we try to acquire a mutex in atomic context, which is generally
unsound (though benign in this case as the mutex should be free and all
other CPUs are quiescent).

This patch avoids the issue by pulling the mutex out of alloc_init_pud()
and calling it at a higher level in the pagetable manipulation code.
This allows it to be used without locking where one CPU is known to be
in exclusive control of the machine, even after having left the
SYSTEM_BOOTING state.

Fixes: 47546a1912 ("arm64: mm: install KPTI nG mappings with MMU enabled")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220920134731.1625740-1-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-09-22 12:55:39 +01:00
Mark Brown
2e75b393ff arm64/sysreg: Standardise naming for MTE feature enumeration
In preparation for conversion to automatic generation refresh the names
given to the items in the MTE feture enumeration to reflect our standard
pattern for naming, corresponding to the architecture feature names they
reflect. No functional change.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com>
Link: https://lore.kernel.org/r/20220905225425.1871461-17-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-09 10:59:04 +01:00
Mark Brown
07d7d848b9 arm64/sysreg: Standardise naming of ID_AA64MMFR0_EL1.ASIDBits
For some reason we refer to ID_AA64MMFR0_EL1.ASIDBits as ASID. Add BITS
into the name, bringing the naming into sync with DDI0487H.a. Due to the
large amount of MixedCase in this register which isn't really consistent
with either the kernel style or the majority of the architecture the use of
upper case is preserved. No functional changes.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com>
Link: https://lore.kernel.org/r/20220905225425.1871461-10-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-09 10:59:02 +01:00
Mark Brown
6ca2b9ca45 arm64/sysreg: Add _EL1 into ID_AA64PFR1_EL1 constant names
Our standard is to include the _EL1 in the constant names for registers but
we did not do that for ID_AA64PFR1_EL1, update to do so in preparation for
conversion to automatic generation. No functional change.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com>
Link: https://lore.kernel.org/r/20220905225425.1871461-8-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-09 10:59:02 +01:00
Mark Brown
2d987e64e8 arm64/sysreg: Add _EL1 into ID_AA64MMFR0_EL1 definition names
Normally we include the full register name in the defines for fields within
registers but this has not been followed for ID registers. In preparation
for automatic generation of defines add the _EL1s into the defines for
ID_AA64MMFR0_EL1 to follow the convention. No functional changes.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com>
Link: https://lore.kernel.org/r/20220905225425.1871461-5-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-09 10:59:02 +01:00
Robin Murphy
fa49364cd5 iommu/dma: Move public interfaces to linux/iommu.h
The iommu-dma layer is now mostly encapsulated by iommu_dma_ops, with
only a couple more public interfaces left pertaining to MSI integration.
Since these depend on the main IOMMU API header anyway, move their
declarations there, taking the opportunity to update the half-baked
comments to proper kerneldoc along the way.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/9cd99738f52094e6bed44bfee03fa4f288d20695.1660668998.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-09-07 14:47:00 +02:00
Ard Biesheuvel
3fc24ef32d arm64: compat: Implement misalignment fixups for multiword loads
The 32-bit ARM kernel implements fixups on behalf of user space when
using LDM/STM or LDRD/STRD instructions on addresses that are not 32-bit
aligned. This is not something that is supported by the architecture,
but was done anyway to increase compatibility with user space software,
which mostly targeted x86 at the time and did not care about aligned
accesses.

This feature is one of the remaining impediments to being able to switch
to 64-bit kernels on 64-bit capable hardware running 32-bit user space,
so let's implement it for the arm64 compat layer as well.

Note that the intent is to implement the exact same handling of
misaligned multi-word loads and stores as the 32-bit kernel does,
including what appears to be missing support for user space programs
that rely on SETEND to switch to a different byte order and back. Also,
like the 32-bit ARM version, we rely on the faulting address reported by
the CPU to infer the memory address, instead of decoding the instruction
fully to obtain this information.

This implementation is taken from the 32-bit ARM tree, with all pieces
removed that deal with instructions other than LDRD/STRD and LDM/STM, or
that deal with alignment exceptions taken in kernel mode.

Cc: debian-arm@lists.debian.org
Cc: Vagrant Cascadian <vagrant@debian.org>
Cc: Riku Voipio <riku.voipio@iki.fi>
Cc: Steve McIntyre <steve@einval.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20220701135322.3025321-1-ardb@kernel.org
[catalin.marinas@arm.com: change the option to 'default n']
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-06 09:34:53 +01:00
Mark Rutland
2e8cff0a0e arm64: fix rodata=full
On arm64, "rodata=full" has been suppored (but not documented) since
commit:

  c55191e96c ("arm64: mm: apply r/o permissions of VM areas to its linear alias as well")

As it's necessary to determine the rodata configuration early during
boot, arm64 has an early_param() handler for this, whereas init/main.c
has a __setup() handler which is run later.

Unfortunately, this split meant that since commit:

  f9a40b0890 ("init/main.c: return 1 from handled __setup() functions")

... passing "rodata=full" would result in a spurious warning from the
__setup() handler (though RO permissions would be configured
appropriately).

Further, "rodata=full" has been broken since commit:

  0d6ea3ac94 ("lib/kstrtox.c: add "false"/"true" support to kstrtobool()")

... which caused strtobool() to parse "full" as false (in addition to
many other values not documented for the "rodata=" kernel parameter.

This patch fixes this breakage by:

* Moving the core parameter parser to an __early_param(), such that it
  is available early.

* Adding an (optional) arch hook which arm64 can use to parse "full".

* Updating the documentation to mention that "full" is valid for arm64.

* Having the core parameter parser handle "on" and "off" explicitly,
  such that any undocumented values (e.g. typos such as "ful") are
  reported as errors rather than being silently accepted.

Note that __setup() and early_param() have opposite conventions for
their return values, where __setup() uses 1 to indicate a parameter was
handled and early_param() uses 0 to indicate a parameter was handled.

Fixes: f9a40b0890 ("init/main.c: return 1 from handled __setup() functions")
Fixes: 0d6ea3ac94 ("lib/kstrtox.c: add "false"/"true" support to kstrtobool()")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jagdish Gediya <jvgediya@linux.ibm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220817154022.3974645-1-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-08-23 11:02:02 +01:00
Muchun Song
2da1c30929 mm: hugetlb_vmemmap: delete hugetlb_optimize_vmemmap_enabled()
Patch series "Simplify hugetlb vmemmap and improve its readability", v2.

This series aims to simplify hugetlb vmemmap and improve its readability.


This patch (of 8):

The name hugetlb_optimize_vmemmap_enabled() a bit confusing as it tests
two conditions (enabled and pages in use).  Instead of coming up to an
appropriate name, we could just delete it.  There is already a discussion
about deleting it in thread [1].

There is only one user of hugetlb_optimize_vmemmap_enabled() outside of
hugetlb_vmemmap, that is flush_dcache_page() in arch/arm64/mm/flush.c. 
However, it does not need to call hugetlb_optimize_vmemmap_enabled() in
flush_dcache_page() since HugeTLB pages are always fully mapped and only
head page will be set PG_dcache_clean meaning only head page's flag may
need to be cleared (see commit cf5a501d98).  So it is easy to remove
hugetlb_optimize_vmemmap_enabled().

Link: https://lore.kernel.org/all/c77c61c8-8a5a-87e8-db89-d04d8aaab4cc@oracle.com/ [1]
Link: https://lkml.kernel.org/r/20220628092235.91270-2-songmuchun@bytedance.com
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Xiongchun Duan <duanxiongchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-08-08 18:06:42 -07:00
Linus Torvalds
6614a3c316 - The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe
Lin, Yang Shi, Anshuman Khandual and Mike Rapoport
 
 - Some kmemleak fixes from Patrick Wang and Waiman Long
 
 - DAMON updates from SeongJae Park
 
 - memcg debug/visibility work from Roman Gushchin
 
 - vmalloc speedup from Uladzislau Rezki
 
 - more folio conversion work from Matthew Wilcox
 
 - enhancements for coherent device memory mapping from Alex Sierra
 
 - addition of shared pages tracking and CoW support for fsdax, from
   Shiyang Ruan
 
 - hugetlb optimizations from Mike Kravetz
 
 - Mel Gorman has contributed some pagealloc changes to improve latency
   and realtime behaviour.
 
 - mprotect soft-dirty checking has been improved by Peter Xu
 
 - Many other singleton patches all over the place
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCYuravgAKCRDdBJ7gKXxA
 jpqSAQDrXSdII+ht9kSHlaCVYjqRFQz/rRvURQrWQV74f6aeiAD+NHHeDPwZn11/
 SPktqEUrF1pxnGQxqLh1kUFUhsVZQgE=
 =w/UH
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2022-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:
 "Most of the MM queue. A few things are still pending.

  Liam's maple tree rework didn't make it. This has resulted in a few
  other minor patch series being held over for next time.

  Multi-gen LRU still isn't merged as we were waiting for mapletree to
  stabilize. The current plan is to merge MGLRU into -mm soon and to
  later reintroduce mapletree, with a view to hopefully getting both
  into 6.1-rc1.

  Summary:

   - The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe
     Lin, Yang Shi, Anshuman Khandual and Mike Rapoport

   - Some kmemleak fixes from Patrick Wang and Waiman Long

   - DAMON updates from SeongJae Park

   - memcg debug/visibility work from Roman Gushchin

   - vmalloc speedup from Uladzislau Rezki

   - more folio conversion work from Matthew Wilcox

   - enhancements for coherent device memory mapping from Alex Sierra

   - addition of shared pages tracking and CoW support for fsdax, from
     Shiyang Ruan

   - hugetlb optimizations from Mike Kravetz

   - Mel Gorman has contributed some pagealloc changes to improve
     latency and realtime behaviour.

   - mprotect soft-dirty checking has been improved by Peter Xu

   - Many other singleton patches all over the place"

 [ XFS merge from hell as per Darrick Wong in

   https://lore.kernel.org/all/YshKnxb4VwXycPO8@magnolia/ ]

* tag 'mm-stable-2022-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (282 commits)
  tools/testing/selftests/vm/hmm-tests.c: fix build
  mm: Kconfig: fix typo
  mm: memory-failure: convert to pr_fmt()
  mm: use is_zone_movable_page() helper
  hugetlbfs: fix inaccurate comment in hugetlbfs_statfs()
  hugetlbfs: cleanup some comments in inode.c
  hugetlbfs: remove unneeded header file
  hugetlbfs: remove unneeded hugetlbfs_ops forward declaration
  hugetlbfs: use helper macro SZ_1{K,M}
  mm: cleanup is_highmem()
  mm/hmm: add a test for cross device private faults
  selftests: add soft-dirty into run_vmtests.sh
  selftests: soft-dirty: add test for mprotect
  mm/mprotect: fix soft-dirty check in can_change_pte_writable()
  mm: memcontrol: fix potential oom_lock recursion deadlock
  mm/gup.c: fix formatting in check_and_migrate_movable_page()
  xfs: fail dax mount if reflink is enabled on a partition
  mm/memcontrol.c: remove the redundant updating of stats_flush_threshold
  userfaultfd: don't fail on unrecognized features
  hugetlb_cgroup: fix wrong hugetlb cgroup numa stat
  ...
2022-08-05 16:32:45 -07:00
Linus Torvalds
97a77ab14f EFI updates for v5.20
- Enable mirrored memory for arm64
 - Fix up several abuses of the efivar API
 - Refactor the efivar API in preparation for moving the 'business logic'
   part of it into efivarfs
 - Enable ACPI PRM on arm64
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE+9lifEBpyUIVN1cpw08iOZLZjyQFAmLhuDIACgkQw08iOZLZ
 jyS9IQv/Wc2nhjN50S3gfrL+68/el/hGdP/J0FK5BOOjNosG2t1ZNYZtSthXqpPH
 hRrTU2m6PpQUalRpFDyLiHkJvdBFQe4VmvrzBa3TIBIzyflLQPJzkWrqThPchV+B
 qi4lmCtTDNIEJmayewqx1wWA+QmUiyI5zJ8wrZp84LTctBPL75seVv0SB20nqai0
 3/I73omB2RLVGpCpeWvb++vePXL8euFW3FEwCTM8hRboICjORTyIZPy8Y5os+3xT
 UgrIgVDOtn1Xwd4tK0qVwjOVA51east4Fcn3yGOrL40t+3SFm2jdpAJOO3UvyNPl
 vkbtjvXsIjt3/oxreKxXHLbamKyueWIfZRyCLsrg6wrr96oypPk6ID4iDCQoen/X
 Zf0VjM2vmvSd4YgnEIblOfSBxVg48cHJA4iVHVxFodNTrVnzGGFYPTmNKmJqo+Xn
 JeUILM7jlR4h/t0+cTTK3Busu24annTuuz5L5rjf4bUm6pPf4crb1yJaFWtGhlpa
 er233D6O
 =zI0R
 -----END PGP SIGNATURE-----

Merge tag 'efi-next-for-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi

Pull EFI updates from Ard Biesheuvel:

 - Enable mirrored memory for arm64

 - Fix up several abuses of the efivar API

 - Refactor the efivar API in preparation for moving the 'business
   logic' part of it into efivarfs

 - Enable ACPI PRM on arm64

* tag 'efi-next-for-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: (24 commits)
  ACPI: Move PRM config option under the main ACPI config
  ACPI: Enable Platform Runtime Mechanism(PRM) support on ARM64
  ACPI: PRM: Change handler_addr type to void pointer
  efi: Simplify arch_efi_call_virt() macro
  drivers: fix typo in firmware/efi/memmap.c
  efi: vars: Drop __efivar_entry_iter() helper which is no longer used
  efi: vars: Use locking version to iterate over efivars linked lists
  efi: pstore: Omit efivars caching EFI varstore access layer
  efi: vars: Add thin wrapper around EFI get/set variable interface
  efi: vars: Don't drop lock in the middle of efivar_init()
  pstore: Add priv field to pstore_record for backend specific use
  Input: applespi - avoid efivars API and invoke EFI services directly
  selftests/kexec: remove broken EFI_VARS secure boot fallback check
  brcmfmac: Switch to appropriate helper to load EFI variable contents
  iwlwifi: Switch to proper EFI variable store interface
  media: atomisp_gmin_platform: stop abusing efivar API
  efi: efibc: avoid efivar API for setting variables
  efi: avoid efivars layer when loading SSDTs from variables
  efi: Correct comment on efi_memmap_alloc
  memblock: Disable mirror feature if kernelcore is not specified
  ...
2022-08-03 14:38:02 -07:00
Linus Torvalds
0cec3f24a7 arm64 updates for 5.20
- Remove unused generic cpuidle support (replaced by PSCI version)
 
 - Fix documentation describing the kernel virtual address space
 
 - Handling of some new CPU errata in Arm implementations
 
 - Rework of our exception table code in preparation for handling
   machine checks (i.e. RAS errors) more gracefully
 
 - Switch over to the generic implementation of ioremap()
 
 - Fix lockdep tracking in NMI context
 
 - Instrument our memory barrier macros for KCSAN
 
 - Rework of the kPTI G->nG page-table repainting so that the MMU remains
   enabled and the boot time is no longer slowed to a crawl for systems
   which require the late remapping
 
 - Enable support for direct swapping of 2MiB transparent huge-pages on
   systems without MTE
 
 - Fix handling of MTE tags with allocating new pages with HW KASAN
 
 - Expose the SMIDR register to userspace via sysfs
 
 - Continued rework of the stack unwinder, particularly improving the
   behaviour under KASAN
 
 - More repainting of our system register definitions to match the
   architectural terminology
 
 - Improvements to the layout of the vDSO objects
 
 - Support for allocating additional bits of HWCAP2 and exposing
   FEAT_EBF16 to userspace on CPUs that support it
 
 - Considerable rework and optimisation of our early boot code to reduce
   the need for cache maintenance and avoid jumping in and out of the
   kernel when handling relocation under KASLR
 
 - Support for disabling SVE and SME support on the kernel command-line
 
 - Support for the Hisilicon HNS3 PMU
 
 - Miscellanous cleanups, trivial updates and minor fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmLeccUQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNCysB/4ml92RJLhVwRAofbtFfVgVz3JLTSsvob9x
 Z7FhNDxfM/G32wKtOHU9tHkGJ+PMVWOPajukzxkMhxmilfTyHBbiisNWVRjKQxj4
 wrd07DNXPIv3bi8SWzS1y2y8ZqujZWjNJlX8SUCzEoxCVtuNKwrh96kU1jUjrkFZ
 kBo4E4wBWK/qW29nClGSCgIHRQNJaB/jvITlQhkqIb0pwNf3sAUzW7QoF1iTZWhs
 UswcLh/zC4q79k9poegdCt8chV5OBDLtLPnMxkyQFvsLYRp3qhyCSQQY/BxvO5JS
 jT9QR6d+1ewET9BFhqHlIIuOTYBCk3xn/PR9AucUl+ZBQd2tO4B1
 =LVH0
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
 "Highlights include a major rework of our kPTI page-table rewriting
  code (which makes it both more maintainable and considerably faster in
  the cases where it is required) as well as significant changes to our
  early boot code to reduce the need for data cache maintenance and
  greatly simplify the KASLR relocation dance.

  Summary:

   - Remove unused generic cpuidle support (replaced by PSCI version)

   - Fix documentation describing the kernel virtual address space

   - Handling of some new CPU errata in Arm implementations

   - Rework of our exception table code in preparation for handling
     machine checks (i.e. RAS errors) more gracefully

   - Switch over to the generic implementation of ioremap()

   - Fix lockdep tracking in NMI context

   - Instrument our memory barrier macros for KCSAN

   - Rework of the kPTI G->nG page-table repainting so that the MMU
     remains enabled and the boot time is no longer slowed to a crawl
     for systems which require the late remapping

   - Enable support for direct swapping of 2MiB transparent huge-pages
     on systems without MTE

   - Fix handling of MTE tags with allocating new pages with HW KASAN

   - Expose the SMIDR register to userspace via sysfs

   - Continued rework of the stack unwinder, particularly improving the
     behaviour under KASAN

   - More repainting of our system register definitions to match the
     architectural terminology

   - Improvements to the layout of the vDSO objects

   - Support for allocating additional bits of HWCAP2 and exposing
     FEAT_EBF16 to userspace on CPUs that support it

   - Considerable rework and optimisation of our early boot code to
     reduce the need for cache maintenance and avoid jumping in and out
     of the kernel when handling relocation under KASLR

   - Support for disabling SVE and SME support on the kernel
     command-line

   - Support for the Hisilicon HNS3 PMU

   - Miscellanous cleanups, trivial updates and minor fixes"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (136 commits)
  arm64: Delay initialisation of cpuinfo_arm64::reg_{zcr,smcr}
  arm64: fix KASAN_INLINE
  arm64/hwcap: Support FEAT_EBF16
  arm64/cpufeature: Store elf_hwcaps as a bitmap rather than unsigned long
  arm64/hwcap: Document allocation of upper bits of AT_HWCAP
  arm64: enable THP_SWAP for arm64
  arm64/mm: use GENMASK_ULL for TTBR_BADDR_MASK_52
  arm64: errata: Remove AES hwcap for COMPAT tasks
  arm64: numa: Don't check node against MAX_NUMNODES
  drivers/perf: arm_spe: Fix consistency of SYS_PMSCR_EL1.CX
  perf: RISC-V: Add of_node_put() when breaking out of for_each_of_cpu_node()
  docs: perf: Include hns3-pmu.rst in toctree to fix 'htmldocs' WARNING
  arm64: kasan: Revert "arm64: mte: reset the page tag in page->flags"
  mm: kasan: Skip page unpoisoning only if __GFP_SKIP_KASAN_UNPOISON
  mm: kasan: Skip unpoisoning of user pages
  mm: kasan: Ensure the tags are visible before the tag in page->flags
  drivers/perf: hisi: add driver for HNS3 PMU
  drivers/perf: hisi: Add description for HNS3 PMU driver
  drivers/perf: riscv_pmu_sbi: perf format
  perf/arm-cci: Use the bitmap API to allocate bitmaps
  ...
2022-08-01 10:37:00 -07:00
Will Deacon
f96d67a8af Merge branch 'for-next/boot' into for-next/core
* for-next/boot: (34 commits)
  arm64: fix KASAN_INLINE
  arm64: Add an override for ID_AA64SMFR0_EL1.FA64
  arm64: Add the arm64.nosve command line option
  arm64: Add the arm64.nosme command line option
  arm64: Expose a __check_override primitive for oddball features
  arm64: Allow the idreg override to deal with variable field width
  arm64: Factor out checking of a feature against the override into a macro
  arm64: Allow sticky E2H when entering EL1
  arm64: Save state of HCR_EL2.E2H before switch to EL1
  arm64: Rename the VHE switch to "finalise_el2"
  arm64: mm: fix booting with 52-bit address space
  arm64: head: remove __PHYS_OFFSET
  arm64: lds: use PROVIDE instead of conditional definitions
  arm64: setup: drop early FDT pointer helpers
  arm64: head: avoid relocating the kernel twice for KASLR
  arm64: kaslr: defer initialization to initcall where permitted
  arm64: head: record CPU boot mode after enabling the MMU
  arm64: head: populate kernel page tables with MMU and caches on
  arm64: head: factor out TTBR1 assignment into a macro
  arm64: idreg-override: use early FDT mapping in ID map
  ...
2022-07-25 10:59:15 +01:00
Will Deacon
c436500d9f Merge branch 'for-next/mte' into for-next/core
* for-next/mte:
  arm64: kasan: Revert "arm64: mte: reset the page tag in page->flags"
  mm: kasan: Skip page unpoisoning only if __GFP_SKIP_KASAN_UNPOISON
  mm: kasan: Skip unpoisoning of user pages
  mm: kasan: Ensure the tags are visible before the tag in page->flags
2022-07-25 10:57:08 +01:00
Will Deacon
02eab44c71 Merge branch 'for-next/misc' into for-next/core
* for-next/misc:
  arm64/mm: use GENMASK_ULL for TTBR_BADDR_MASK_52
  arm64: numa: Don't check node against MAX_NUMNODES
  arm64: mm: Remove assembly DMA cache maintenance wrappers
  arm64/mm: Define defer_reserve_crashkernel()
  arm64: fix oops in concurrently setting insn_emulation sysctls
  arm64: Do not forget syscall when starting a new thread.
  arm64: boot: add zstd support
2022-07-25 10:56:57 +01:00
Will Deacon
8184a8bc1c Merge branch 'for-next/kpti' into for-next/core
* for-next/kpti:
  arm64: correct the effect of mitigations off on kpti
  arm64: entry: simplify trampoline data page
  arm64: mm: install KPTI nG mappings with MMU enabled
  arm64: kpti-ng: simplify page table traversal logic
2022-07-25 10:56:49 +01:00
Will Deacon
84d8857af4 Merge branch 'for-next/ioremap' into for-next/core
* for-next/ioremap:
  arm64: Add HAVE_IOREMAP_PROT support
  arm64: mm: Convert to GENERIC_IOREMAP
  mm: ioremap: Add ioremap/iounmap_allowed()
  mm: ioremap: Setup phys_addr of struct vm_struct
  mm: ioremap: Use more sensible name in ioremap_prot()
  ARM: mm: kill unused runtime hook arch_iounmap()
2022-07-25 10:56:23 +01:00
Anshuman Khandual
42251045cc arm64/mm: move protection_map[] inside the platform
This moves protection_map[] inside the platform and makes it a static.

Link: https://lkml.kernel.org/r/20220711070600.2378316-6-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Brian Cain <bcain@quicinc.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-17 17:14:37 -07:00
Baolin Wang
1bcdb769f9 arm64/hugetlb: implement arm64 specific hugetlb_mask_last_page
The HugeTLB address ranges are linearly scanned during fork, unmap and
remap operations, and the linear scan can skip to the end of range mapped
by the page table page if hitting a non-present entry, which can help to
speed linear scanning of the HugeTLB address ranges.

So hugetlb_mask_last_page() is introduced to help to update the address in
the loop of HugeTLB linear scanning with getting the last huge page mapped
by the associated page table page[1], when a non-present entry is
encountered.

Considering ARM64 specific cont-pte/pmd size HugeTLB, this patch
implemented an ARM64 specific hugetlb_mask_last_page() to help this case.

[1] https://lore.kernel.org/linux-mm/20220527225849.284839-1-mike.kravetz@oracle.com/

[baolin.wang@linux.alibaba.com: fix build]
  Link: https://lkml.kernel.org/r/a14e7b39-6a8a-4609-b4a1-84ac574f5c96@linux.alibaba.com
Link: https://lkml.kernel.org/r/20220621235620.291305-3-mike.kravetz@oracle.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Muchun Song <songmuchun@bytedance.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: James Houghton <jthoughton@google.com>
Cc: kernel test robot <lkp@intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rolf Eike Beer <eike-kernel@sf-tec.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-17 17:14:34 -07:00
Catalin Marinas
20794545c1 arm64: kasan: Revert "arm64: mte: reset the page tag in page->flags"
This reverts commit e5b8d92189.

Pages mapped in user-space with PROT_MTE have the allocation tags either
zeroed or copied/restored to some user values. In order for the kernel
to access such pages via page_address(), resetting the tag in
page->flags was necessary. This tag resetting was deferred to
set_pte_at() -> mte_sync_page_tags() but it can race with another CPU
reading the flags (via page_to_virt()):

P0 (mte_sync_page_tags):	P1 (memcpy from virt_to_page):
				  Rflags!=0xff
  Wflags=0xff
  DMB (doesn't help)
  Wtags=0
				  Rtags=0   // fault

Since now the post_alloc_hook() function resets the page->flags tag when
unpoisoning is skipped for user pages (including the __GFP_ZEROTAGS
case), revert the arm64 commit calling page_kasan_tag_reset().

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Peter Collingbourne <pcc@google.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Acked-by: Andrey Konovalov <andreyknvl@gmail.com>
Link: https://lore.kernel.org/r/20220610152141.2148929-5-catalin.marinas@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-07-07 10:48:37 +01:00
Catalin Marinas
70c248aca9 mm: kasan: Skip unpoisoning of user pages
Commit c275c5c6d5 ("kasan: disable freed user page poisoning with HW
tags") added __GFP_SKIP_KASAN_POISON to GFP_HIGHUSER_MOVABLE. A similar
argument can be made about unpoisoning, so also add
__GFP_SKIP_KASAN_UNPOISON to user pages. To ensure the user page is
still accessible via page_address() without a kasan fault, reset the
page->flags tag.

With the above changes, there is no need for the arm64
tag_clear_highpage() to reset the page->flags tag.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Peter Collingbourne <pcc@google.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Link: https://lore.kernel.org/r/20220610152141.2148929-3-catalin.marinas@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-07-07 10:48:37 +01:00
Will Deacon
7eacf1858b arm64: mm: Remove assembly DMA cache maintenance wrappers
Remove the __dma_{flush,map,unmap}_area assembly wrappers and call the
appropriate cache maintenance functions directly from the DMA mapping
callbacks.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220610151228.4562-3-will@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-07-05 13:06:31 +01:00
Anshuman Khandual
4890cc18f9 arm64/mm: Define defer_reserve_crashkernel()
Crash kernel memory reservation gets deferred, when either CONFIG_ZONE_DMA
or CONFIG_ZONE_DMA32 config is enabled on the platform. This deferral also
impacts overall linear mapping creation including the crash kernel itself.
Just encapsulate this deferral check in a new helper for better clarity.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20220705062556.1845734-1-anshuman.khandual@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-07-05 11:43:47 +01:00
Qi Zheng
18f3962953 mm: hugetlb: kill set_huge_swap_pte_at()
Commit e5251fd430 ("mm/hugetlb: introduce set_huge_swap_pte_at()
helper") add set_huge_swap_pte_at() to handle swap entries on
architectures that support hugepages consisting of contiguous ptes.  And
currently the set_huge_swap_pte_at() is only overridden by arm64.

set_huge_swap_pte_at() provide a sz parameter to help determine the number
of entries to be updated.  But in fact, all hugetlb swap entries contain
pfn information, so we can find the corresponding folio through the pfn
recorded in the swap entry, then the folio_size() is the number of entries
that need to be updated.

And considering that users will easily cause bugs by ignoring the
difference between set_huge_swap_pte_at() and set_huge_pte_at().  Let's
handle swap entries in set_huge_pte_at() and remove the
set_huge_swap_pte_at(), then we can call set_huge_pte_at() anywhere, which
simplifies our coding.

Link: https://lkml.kernel.org/r/20220626145717.53572-1-zhengqi.arch@bytedance.com
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Acked-by: Muchun Song <songmuchun@bytedance.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-03 18:08:50 -07:00
Will Deacon
4109823037 arm64: hugetlb: Restore TLB invalidation for BBM on contiguous ptes
Commit fb396bb459 ("arm64/hugetlb: Drop TLB flush from get_clear_flush()")
removed TLB invalidation from get_clear_flush() [now get_clear_contig()]
on the basis that the core TLB invalidation code is aware of hugetlb
mappings backed by contiguous page-table entries and will cover the
correct virtual address range.

However, this change also resulted in the TLB invalidation being removed
from the "break" step in the break-before-make (BBM) sequence used
internally by huge_ptep_set_{access_flags,wrprotect}(), therefore
making the BBM sequence unsafe irrespective of later invalidation.

Although the architecture is desperately unclear about how exactly
contiguous ptes should be updated in a live page-table, restore TLB
invalidation to our BBM sequence under the assumption that BBM is the
right thing to be doing in the first place.

Fixes: fb396bb459 ("arm64/hugetlb: Drop TLB flush from get_clear_flush()")
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20220629095349.25748-1-will@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-07-01 18:29:26 +01:00
Ard Biesheuvel
0aaa68532e arm64: mm: fix booting with 52-bit address space
Joey reports that booting 52-bit VA capable builds on 52-bit VA capable
CPUs is broken since commit 0d9b1ffefa ("arm64: mm: make vabits_actual
a build time constant if possible"). This is due to the fact that the
primary CPU reads the vabits_actual variable before it has been
assigned.

The reason for deferring the assignment of vabits_actual was that we try
to perform as few stores to memory as we can with the MMU and caches
off, due to the cache coherency issues it creates.

Since __cpu_setup() [which is where the read of vabits_actual occurs] is
also called on the secondary boot path, we cannot just read the CPU ID
registers directly, given that the size of the VA space is decided by
the capabilities of the primary CPU. So let's read vabits_actual only on
the secondary boot path, and read the CPU ID registers directly on the
primary boot path, by making it a function parameter of __cpu_setup().

To ensure that all users of vabits_actual (including kasan_early_init())
observe the correct value, move the assignment of vabits_actual back
into asm code, but still defer it to after the MMU and caches have been
enabled.

Cc: Will Deacon <will@kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Fixes: 0d9b1ffefa ("arm64: mm: make vabits_actual a build time constant if possible")
Reported-by: Joey Gouly <joey.gouly@arm.com>
Co-developed-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220701111045.2944309-1-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-07-01 15:19:07 +01:00
Tong Tiangen
bacac63702 arm64: extable: cleanup redundant extable type EX_TYPE_FIXUP
Currently, extable type EX_TYPE_FIXUP is no place to use, We can safely
remove it.

Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20220621072638.1273594-7-tongtiangen@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-28 12:11:47 +01:00
Tong Tiangen
4953fc3d32 arm64: extable: add new extable type EX_TYPE_KACCESS_ERR_ZERO support
Currently, The extable type EX_TYPE_UACCESS_ERR_ZERO is used by
__get/put_kernel_nofault(), but those helpers are not uaccess type, so we
add a new extable type EX_TYPE_KACCESS_ERR_ZERO which can be used by
__get/put_kernel_no_fault().

This is also to prepare for distinguishing the two types in machine check
safe process.

Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20220621072638.1273594-2-tongtiangen@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-28 12:11:00 +01:00
akpm
46a3b11253 Merge branch 'master' into mm-stable 2022-06-27 10:31:34 -07:00
Kefeng Wang
893dea9ccd arm64: Add HAVE_IOREMAP_PROT support
With ioremap_prot() definition from generic ioremap, also move
pte_pgprot() from hugetlbpage.c into pgtable.h, then arm64 could
have HAVE_IOREMAP_PROT, which will enable generic_access_phys()
code, it is useful for debug, eg, gdb.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Link: https://lore.kernel.org/r/20220607125027.44946-7-wangkefeng.wang@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-27 12:22:31 +01:00
Kefeng Wang
f23eab0bfa arm64: mm: Convert to GENERIC_IOREMAP
Add hook for arm64's special operation when ioremap(), then
ioremap_wc/np/cache is converted to use ioremap_prot() from
GENERIC_IOREMAP, update the Copyright and kill the unused
inclusions.

Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Link: https://lore.kernel.org/r/20220607125027.44946-6-wangkefeng.wang@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-27 12:22:31 +01:00
Ard Biesheuvel
005e12676a arm64: head: record CPU boot mode after enabling the MMU
In order to avoid having to touch memory with the MMU and caches
disabled, and therefore having to invalidate it from the caches
explicitly, just defer storing the value until after the MMU has been
turned on, unless we are giving up with an error.

While at it, move the associated variable definitions into C code.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220624150651.1358849-19-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24 17:18:10 +01:00
Ard Biesheuvel
c3cee924bd arm64: head: cover entire kernel image in initial ID map
As a first step towards avoiding the need to create, tear down and
recreate the kernel virtual mapping with MMU and caches disabled, start
by expanding the ID map so it covers the page tables as well as all
executable code. This will allow us to populate the page tables with the
MMU and caches on, and call KASLR init code before setting up the
virtual mapping.

Since this ID map is only needed at boot, create it as a temporary set
of page tables, and populate the permanent ID map after enabling the MMU
and caches. While at it, switch to read-only attributes for the where
possible, as writable permissions are only needed for the initial kernel
page tables. Note that on 4k granule configurations, the permanent ID
map will now be reduced to a single page rather than a 2M block mapping.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220624150651.1358849-13-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24 17:18:10 +01:00
Ard Biesheuvel
1682c45b92 arm64: mm: provide idmap pointer to cpu_replace_ttbr1()
In preparation for changing the way we initialize the permanent ID map,
update cpu_replace_ttbr1() so we can use it with the initial ID map as
well.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220624150651.1358849-11-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24 17:18:10 +01:00
Ard Biesheuvel
ebd9aea1f2 arm64: head: drop idmap_ptrs_per_pgd
The assignment of idmap_ptrs_per_pgd lacks any cache invalidation, even
though it is updated with the MMU and caches disabled. However, we never
bother to read the value again except in the very next instruction, and
so we can just drop the variable entirely.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20220624150651.1358849-5-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24 17:18:09 +01:00
Ard Biesheuvel
e8d13cced5 arm64: head: move assignment of idmap_t0sz to C code
Setting idmap_t0sz involves fiddling with the caches if done with the
MMU off. Since we will be creating an initial ID map with the MMU and
caches off, and the permanent ID map with the MMU and caches on, let's
move this assignment of idmap_t0sz out of the startup code, and replace
it with a macro that simply issues the three instructions needed to
calculate the value wherever it is needed before the MMU is turned on.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220624150651.1358849-4-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24 17:18:09 +01:00
Ard Biesheuvel
0d9b1ffefa arm64: mm: make vabits_actual a build time constant if possible
Currently, we only support 52-bit virtual addressing on 64k pages
configurations, and in all other cases, vabits_actual is guaranteed to
equal VA_BITS (== VA_BITS_MIN). So get rid of the variable entirely in
that case.

While at it, move the assignment out of the asm entry code - it has no
need to be there.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220624150651.1358849-3-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24 17:18:09 +01:00
Ard Biesheuvel
475031b6ed arm64: head: move kimage_vaddr variable into C file
This variable definition does not need to be in head.S so move it out.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20220624150651.1358849-2-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24 17:18:09 +01:00
Ard Biesheuvel
1c9a8e8768 arm64: entry: simplify trampoline data page
Get rid of some clunky open coded arithmetic on section addresses, by
emitting the trampoline data variables into a separate, dedicated r/o
data section, and putting it at the next page boundary. This way, we can
access the literals via single LDR instruction.

While at it, get rid of other, implicit literals, and use ADRP/ADD or
MOVZ/MOVK sequences, as appropriate. Note that the latter are only
supported for CONFIG_RELOCATABLE=n (which is usually the case if
CONFIG_RANDOMIZE_BASE=n), so update the CPP conditionals to reflect
this.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220622161010.3845775-1-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24 13:08:30 +01:00
Ard Biesheuvel
47546a1912 arm64: mm: install KPTI nG mappings with MMU enabled
In cases where we unmap the kernel while running in user space, we rely
on ASIDs to distinguish the minimal trampoline from the full kernel
mapping, and this means we must use non-global attributes for those
mappings, to ensure they are scoped by ASID and will not hit in the TLB
inadvertently.

We only do this when needed, as this is generally more costly in terms
of TLB pressure, and so we boot without these non-global attributes, and
apply them to all existing kernel mappings once all CPUs are up and we
know whether or not the non-global attributes are needed. At this point,
we cannot simply unmap and remap the entire address space, so we have to
update all existing block and page descriptors in place.

Currently, we go through a lot of trouble to perform these updates with
the MMU and caches off, to avoid violating break before make (BBM) rules
imposed by the architecture. Since we make changes to page tables that
are not covered by the ID map, we gain access to those descriptors by
disabling translations altogether. This means that the stores to memory
are issued with device attributes, and require extra care in terms of
coherency, which is costly. We also rely on the ID map to access a
shared flag, which requires the ID map to be executable and writable at
the same time, which is another thing we'd prefer to avoid.

So let's switch to an approach where we replace the kernel mapping with
a minimal mapping of a few pages that can be used for a minimal, ad-hoc
fixmap that we can use to map each page table in turn as we traverse the
hierarchy.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220609174320.4035379-3-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-23 18:26:13 +01:00