Commit Graph

1301 Commits

Author SHA1 Message Date
Peter Collingbourne
973b9e3733 arm64: mte: move register initialization to C
If FEAT_MTE2 is disabled via the arm64.nomte command line argument on a
CPU that claims to support FEAT_MTE2, the kernel will use Tagged Normal
in the MAIR. If we interpret arm64.nomte to mean that the CPU does not
in fact implement FEAT_MTE2, setting the system register like this may
lead to UNSPECIFIED behavior. Fix it by arranging for MAIR to be set
in the C function cpu_enable_mte which is called based on the sanitized
version of the system register.

There is no need for the rest of the MTE-related system register
initialization to happen from assembly, with the exception of TCR_EL1,
which must be set to include at least TBI1 because the secondary CPUs
access KASan-allocated data structures early. Therefore, make the TCR_EL1
initialization unconditional and move the rest of the initialization to
cpu_enable_mte so that we no longer have a dependency on the unsanitized
ID register value.

Co-developed-by: Evgenii Stepanov <eugenis@google.com>
Signed-off-by: Peter Collingbourne <pcc@google.com>
Signed-off-by: Evgenii Stepanov <eugenis@google.com>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: kernel test robot <lkp@intel.com>
Fixes: 3b714d24ef ("arm64: mte: CPU feature detection and initial sysreg configuration")
Cc: <stable@vger.kernel.org> # 5.10.x
Link: https://lore.kernel.org/r/20220915222053.3484231-1-eugenis@google.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-22 18:02:50 +01:00
Kefeng Wang
739e49e0fc arm64: mm: handle ARM64_KERNEL_USES_PMD_MAPS in vmemmap_populate()
Directly check ARM64_SWAPPER_USES_SECTION_MAPS to choose base page
or PMD level huge page mapping in vmemmap_populate() to simplify
code a bit.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Link: https://lore.kernel.org/r/20220920014951.196191-1-wangkefeng.wang@huawei.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-22 16:25:51 +01:00
Will Deacon
c44094eee3 arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()
arch_dma_prep_coherent() is called when preparing a non-cacheable region
for a consistent DMA buffer allocation. Since the buffer pages may
previously have been written via a cacheable mapping and consequently
allocated as dirty cachelines, the purpose of this function is to remove
these dirty lines from the cache, writing them back so that the
non-coherent device is able to see them.

On arm64, this operation can be achieved with a clean to the point of
coherency; a subsequent invalidation is not required and serves little
purpose in the presence of a cacheable alias (e.g. the linear map),
since clean lines can be speculatively fetched back into the cache after
the invalidation operation has completed.

Relax the cache maintenance in arch_dma_prep_coherent() so that only a
clean, and not a clean-and-invalidate operation is performed.

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220823122111.17439-1-will@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-22 15:09:31 +01:00
Ard Biesheuvel
3fc24ef32d arm64: compat: Implement misalignment fixups for multiword loads
The 32-bit ARM kernel implements fixups on behalf of user space when
using LDM/STM or LDRD/STRD instructions on addresses that are not 32-bit
aligned. This is not something that is supported by the architecture,
but was done anyway to increase compatibility with user space software,
which mostly targeted x86 at the time and did not care about aligned
accesses.

This feature is one of the remaining impediments to being able to switch
to 64-bit kernels on 64-bit capable hardware running 32-bit user space,
so let's implement it for the arm64 compat layer as well.

Note that the intent is to implement the exact same handling of
misaligned multi-word loads and stores as the 32-bit kernel does,
including what appears to be missing support for user space programs
that rely on SETEND to switch to a different byte order and back. Also,
like the 32-bit ARM version, we rely on the faulting address reported by
the CPU to infer the memory address, instead of decoding the instruction
fully to obtain this information.

This implementation is taken from the 32-bit ARM tree, with all pieces
removed that deal with instructions other than LDRD/STRD and LDM/STM, or
that deal with alignment exceptions taken in kernel mode.

Cc: debian-arm@lists.debian.org
Cc: Vagrant Cascadian <vagrant@debian.org>
Cc: Riku Voipio <riku.voipio@iki.fi>
Cc: Steve McIntyre <steve@einval.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20220701135322.3025321-1-ardb@kernel.org
[catalin.marinas@arm.com: change the option to 'default n']
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-06 09:34:53 +01:00
Mark Rutland
2e8cff0a0e arm64: fix rodata=full
On arm64, "rodata=full" has been suppored (but not documented) since
commit:

  c55191e96c ("arm64: mm: apply r/o permissions of VM areas to its linear alias as well")

As it's necessary to determine the rodata configuration early during
boot, arm64 has an early_param() handler for this, whereas init/main.c
has a __setup() handler which is run later.

Unfortunately, this split meant that since commit:

  f9a40b0890 ("init/main.c: return 1 from handled __setup() functions")

... passing "rodata=full" would result in a spurious warning from the
__setup() handler (though RO permissions would be configured
appropriately).

Further, "rodata=full" has been broken since commit:

  0d6ea3ac94 ("lib/kstrtox.c: add "false"/"true" support to kstrtobool()")

... which caused strtobool() to parse "full" as false (in addition to
many other values not documented for the "rodata=" kernel parameter.

This patch fixes this breakage by:

* Moving the core parameter parser to an __early_param(), such that it
  is available early.

* Adding an (optional) arch hook which arm64 can use to parse "full".

* Updating the documentation to mention that "full" is valid for arm64.

* Having the core parameter parser handle "on" and "off" explicitly,
  such that any undocumented values (e.g. typos such as "ful") are
  reported as errors rather than being silently accepted.

Note that __setup() and early_param() have opposite conventions for
their return values, where __setup() uses 1 to indicate a parameter was
handled and early_param() uses 0 to indicate a parameter was handled.

Fixes: f9a40b0890 ("init/main.c: return 1 from handled __setup() functions")
Fixes: 0d6ea3ac94 ("lib/kstrtox.c: add "false"/"true" support to kstrtobool()")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jagdish Gediya <jvgediya@linux.ibm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220817154022.3974645-1-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-08-23 11:02:02 +01:00
Muchun Song
2da1c30929 mm: hugetlb_vmemmap: delete hugetlb_optimize_vmemmap_enabled()
Patch series "Simplify hugetlb vmemmap and improve its readability", v2.

This series aims to simplify hugetlb vmemmap and improve its readability.


This patch (of 8):

The name hugetlb_optimize_vmemmap_enabled() a bit confusing as it tests
two conditions (enabled and pages in use).  Instead of coming up to an
appropriate name, we could just delete it.  There is already a discussion
about deleting it in thread [1].

There is only one user of hugetlb_optimize_vmemmap_enabled() outside of
hugetlb_vmemmap, that is flush_dcache_page() in arch/arm64/mm/flush.c. 
However, it does not need to call hugetlb_optimize_vmemmap_enabled() in
flush_dcache_page() since HugeTLB pages are always fully mapped and only
head page will be set PG_dcache_clean meaning only head page's flag may
need to be cleared (see commit cf5a501d98).  So it is easy to remove
hugetlb_optimize_vmemmap_enabled().

Link: https://lore.kernel.org/all/c77c61c8-8a5a-87e8-db89-d04d8aaab4cc@oracle.com/ [1]
Link: https://lkml.kernel.org/r/20220628092235.91270-2-songmuchun@bytedance.com
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Xiongchun Duan <duanxiongchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-08-08 18:06:42 -07:00
Linus Torvalds
6614a3c316 - The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe
Lin, Yang Shi, Anshuman Khandual and Mike Rapoport
 
 - Some kmemleak fixes from Patrick Wang and Waiman Long
 
 - DAMON updates from SeongJae Park
 
 - memcg debug/visibility work from Roman Gushchin
 
 - vmalloc speedup from Uladzislau Rezki
 
 - more folio conversion work from Matthew Wilcox
 
 - enhancements for coherent device memory mapping from Alex Sierra
 
 - addition of shared pages tracking and CoW support for fsdax, from
   Shiyang Ruan
 
 - hugetlb optimizations from Mike Kravetz
 
 - Mel Gorman has contributed some pagealloc changes to improve latency
   and realtime behaviour.
 
 - mprotect soft-dirty checking has been improved by Peter Xu
 
 - Many other singleton patches all over the place
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCYuravgAKCRDdBJ7gKXxA
 jpqSAQDrXSdII+ht9kSHlaCVYjqRFQz/rRvURQrWQV74f6aeiAD+NHHeDPwZn11/
 SPktqEUrF1pxnGQxqLh1kUFUhsVZQgE=
 =w/UH
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2022-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:
 "Most of the MM queue. A few things are still pending.

  Liam's maple tree rework didn't make it. This has resulted in a few
  other minor patch series being held over for next time.

  Multi-gen LRU still isn't merged as we were waiting for mapletree to
  stabilize. The current plan is to merge MGLRU into -mm soon and to
  later reintroduce mapletree, with a view to hopefully getting both
  into 6.1-rc1.

  Summary:

   - The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe
     Lin, Yang Shi, Anshuman Khandual and Mike Rapoport

   - Some kmemleak fixes from Patrick Wang and Waiman Long

   - DAMON updates from SeongJae Park

   - memcg debug/visibility work from Roman Gushchin

   - vmalloc speedup from Uladzislau Rezki

   - more folio conversion work from Matthew Wilcox

   - enhancements for coherent device memory mapping from Alex Sierra

   - addition of shared pages tracking and CoW support for fsdax, from
     Shiyang Ruan

   - hugetlb optimizations from Mike Kravetz

   - Mel Gorman has contributed some pagealloc changes to improve
     latency and realtime behaviour.

   - mprotect soft-dirty checking has been improved by Peter Xu

   - Many other singleton patches all over the place"

 [ XFS merge from hell as per Darrick Wong in

   https://lore.kernel.org/all/YshKnxb4VwXycPO8@magnolia/ ]

* tag 'mm-stable-2022-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (282 commits)
  tools/testing/selftests/vm/hmm-tests.c: fix build
  mm: Kconfig: fix typo
  mm: memory-failure: convert to pr_fmt()
  mm: use is_zone_movable_page() helper
  hugetlbfs: fix inaccurate comment in hugetlbfs_statfs()
  hugetlbfs: cleanup some comments in inode.c
  hugetlbfs: remove unneeded header file
  hugetlbfs: remove unneeded hugetlbfs_ops forward declaration
  hugetlbfs: use helper macro SZ_1{K,M}
  mm: cleanup is_highmem()
  mm/hmm: add a test for cross device private faults
  selftests: add soft-dirty into run_vmtests.sh
  selftests: soft-dirty: add test for mprotect
  mm/mprotect: fix soft-dirty check in can_change_pte_writable()
  mm: memcontrol: fix potential oom_lock recursion deadlock
  mm/gup.c: fix formatting in check_and_migrate_movable_page()
  xfs: fail dax mount if reflink is enabled on a partition
  mm/memcontrol.c: remove the redundant updating of stats_flush_threshold
  userfaultfd: don't fail on unrecognized features
  hugetlb_cgroup: fix wrong hugetlb cgroup numa stat
  ...
2022-08-05 16:32:45 -07:00
Linus Torvalds
97a77ab14f EFI updates for v5.20
- Enable mirrored memory for arm64
 - Fix up several abuses of the efivar API
 - Refactor the efivar API in preparation for moving the 'business logic'
   part of it into efivarfs
 - Enable ACPI PRM on arm64
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE+9lifEBpyUIVN1cpw08iOZLZjyQFAmLhuDIACgkQw08iOZLZ
 jyS9IQv/Wc2nhjN50S3gfrL+68/el/hGdP/J0FK5BOOjNosG2t1ZNYZtSthXqpPH
 hRrTU2m6PpQUalRpFDyLiHkJvdBFQe4VmvrzBa3TIBIzyflLQPJzkWrqThPchV+B
 qi4lmCtTDNIEJmayewqx1wWA+QmUiyI5zJ8wrZp84LTctBPL75seVv0SB20nqai0
 3/I73omB2RLVGpCpeWvb++vePXL8euFW3FEwCTM8hRboICjORTyIZPy8Y5os+3xT
 UgrIgVDOtn1Xwd4tK0qVwjOVA51east4Fcn3yGOrL40t+3SFm2jdpAJOO3UvyNPl
 vkbtjvXsIjt3/oxreKxXHLbamKyueWIfZRyCLsrg6wrr96oypPk6ID4iDCQoen/X
 Zf0VjM2vmvSd4YgnEIblOfSBxVg48cHJA4iVHVxFodNTrVnzGGFYPTmNKmJqo+Xn
 JeUILM7jlR4h/t0+cTTK3Busu24annTuuz5L5rjf4bUm6pPf4crb1yJaFWtGhlpa
 er233D6O
 =zI0R
 -----END PGP SIGNATURE-----

Merge tag 'efi-next-for-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi

Pull EFI updates from Ard Biesheuvel:

 - Enable mirrored memory for arm64

 - Fix up several abuses of the efivar API

 - Refactor the efivar API in preparation for moving the 'business
   logic' part of it into efivarfs

 - Enable ACPI PRM on arm64

* tag 'efi-next-for-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: (24 commits)
  ACPI: Move PRM config option under the main ACPI config
  ACPI: Enable Platform Runtime Mechanism(PRM) support on ARM64
  ACPI: PRM: Change handler_addr type to void pointer
  efi: Simplify arch_efi_call_virt() macro
  drivers: fix typo in firmware/efi/memmap.c
  efi: vars: Drop __efivar_entry_iter() helper which is no longer used
  efi: vars: Use locking version to iterate over efivars linked lists
  efi: pstore: Omit efivars caching EFI varstore access layer
  efi: vars: Add thin wrapper around EFI get/set variable interface
  efi: vars: Don't drop lock in the middle of efivar_init()
  pstore: Add priv field to pstore_record for backend specific use
  Input: applespi - avoid efivars API and invoke EFI services directly
  selftests/kexec: remove broken EFI_VARS secure boot fallback check
  brcmfmac: Switch to appropriate helper to load EFI variable contents
  iwlwifi: Switch to proper EFI variable store interface
  media: atomisp_gmin_platform: stop abusing efivar API
  efi: efibc: avoid efivar API for setting variables
  efi: avoid efivars layer when loading SSDTs from variables
  efi: Correct comment on efi_memmap_alloc
  memblock: Disable mirror feature if kernelcore is not specified
  ...
2022-08-03 14:38:02 -07:00
Linus Torvalds
0cec3f24a7 arm64 updates for 5.20
- Remove unused generic cpuidle support (replaced by PSCI version)
 
 - Fix documentation describing the kernel virtual address space
 
 - Handling of some new CPU errata in Arm implementations
 
 - Rework of our exception table code in preparation for handling
   machine checks (i.e. RAS errors) more gracefully
 
 - Switch over to the generic implementation of ioremap()
 
 - Fix lockdep tracking in NMI context
 
 - Instrument our memory barrier macros for KCSAN
 
 - Rework of the kPTI G->nG page-table repainting so that the MMU remains
   enabled and the boot time is no longer slowed to a crawl for systems
   which require the late remapping
 
 - Enable support for direct swapping of 2MiB transparent huge-pages on
   systems without MTE
 
 - Fix handling of MTE tags with allocating new pages with HW KASAN
 
 - Expose the SMIDR register to userspace via sysfs
 
 - Continued rework of the stack unwinder, particularly improving the
   behaviour under KASAN
 
 - More repainting of our system register definitions to match the
   architectural terminology
 
 - Improvements to the layout of the vDSO objects
 
 - Support for allocating additional bits of HWCAP2 and exposing
   FEAT_EBF16 to userspace on CPUs that support it
 
 - Considerable rework and optimisation of our early boot code to reduce
   the need for cache maintenance and avoid jumping in and out of the
   kernel when handling relocation under KASLR
 
 - Support for disabling SVE and SME support on the kernel command-line
 
 - Support for the Hisilicon HNS3 PMU
 
 - Miscellanous cleanups, trivial updates and minor fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmLeccUQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNCysB/4ml92RJLhVwRAofbtFfVgVz3JLTSsvob9x
 Z7FhNDxfM/G32wKtOHU9tHkGJ+PMVWOPajukzxkMhxmilfTyHBbiisNWVRjKQxj4
 wrd07DNXPIv3bi8SWzS1y2y8ZqujZWjNJlX8SUCzEoxCVtuNKwrh96kU1jUjrkFZ
 kBo4E4wBWK/qW29nClGSCgIHRQNJaB/jvITlQhkqIb0pwNf3sAUzW7QoF1iTZWhs
 UswcLh/zC4q79k9poegdCt8chV5OBDLtLPnMxkyQFvsLYRp3qhyCSQQY/BxvO5JS
 jT9QR6d+1ewET9BFhqHlIIuOTYBCk3xn/PR9AucUl+ZBQd2tO4B1
 =LVH0
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
 "Highlights include a major rework of our kPTI page-table rewriting
  code (which makes it both more maintainable and considerably faster in
  the cases where it is required) as well as significant changes to our
  early boot code to reduce the need for data cache maintenance and
  greatly simplify the KASLR relocation dance.

  Summary:

   - Remove unused generic cpuidle support (replaced by PSCI version)

   - Fix documentation describing the kernel virtual address space

   - Handling of some new CPU errata in Arm implementations

   - Rework of our exception table code in preparation for handling
     machine checks (i.e. RAS errors) more gracefully

   - Switch over to the generic implementation of ioremap()

   - Fix lockdep tracking in NMI context

   - Instrument our memory barrier macros for KCSAN

   - Rework of the kPTI G->nG page-table repainting so that the MMU
     remains enabled and the boot time is no longer slowed to a crawl
     for systems which require the late remapping

   - Enable support for direct swapping of 2MiB transparent huge-pages
     on systems without MTE

   - Fix handling of MTE tags with allocating new pages with HW KASAN

   - Expose the SMIDR register to userspace via sysfs

   - Continued rework of the stack unwinder, particularly improving the
     behaviour under KASAN

   - More repainting of our system register definitions to match the
     architectural terminology

   - Improvements to the layout of the vDSO objects

   - Support for allocating additional bits of HWCAP2 and exposing
     FEAT_EBF16 to userspace on CPUs that support it

   - Considerable rework and optimisation of our early boot code to
     reduce the need for cache maintenance and avoid jumping in and out
     of the kernel when handling relocation under KASLR

   - Support for disabling SVE and SME support on the kernel
     command-line

   - Support for the Hisilicon HNS3 PMU

   - Miscellanous cleanups, trivial updates and minor fixes"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (136 commits)
  arm64: Delay initialisation of cpuinfo_arm64::reg_{zcr,smcr}
  arm64: fix KASAN_INLINE
  arm64/hwcap: Support FEAT_EBF16
  arm64/cpufeature: Store elf_hwcaps as a bitmap rather than unsigned long
  arm64/hwcap: Document allocation of upper bits of AT_HWCAP
  arm64: enable THP_SWAP for arm64
  arm64/mm: use GENMASK_ULL for TTBR_BADDR_MASK_52
  arm64: errata: Remove AES hwcap for COMPAT tasks
  arm64: numa: Don't check node against MAX_NUMNODES
  drivers/perf: arm_spe: Fix consistency of SYS_PMSCR_EL1.CX
  perf: RISC-V: Add of_node_put() when breaking out of for_each_of_cpu_node()
  docs: perf: Include hns3-pmu.rst in toctree to fix 'htmldocs' WARNING
  arm64: kasan: Revert "arm64: mte: reset the page tag in page->flags"
  mm: kasan: Skip page unpoisoning only if __GFP_SKIP_KASAN_UNPOISON
  mm: kasan: Skip unpoisoning of user pages
  mm: kasan: Ensure the tags are visible before the tag in page->flags
  drivers/perf: hisi: add driver for HNS3 PMU
  drivers/perf: hisi: Add description for HNS3 PMU driver
  drivers/perf: riscv_pmu_sbi: perf format
  perf/arm-cci: Use the bitmap API to allocate bitmaps
  ...
2022-08-01 10:37:00 -07:00
Will Deacon
f96d67a8af Merge branch 'for-next/boot' into for-next/core
* for-next/boot: (34 commits)
  arm64: fix KASAN_INLINE
  arm64: Add an override for ID_AA64SMFR0_EL1.FA64
  arm64: Add the arm64.nosve command line option
  arm64: Add the arm64.nosme command line option
  arm64: Expose a __check_override primitive for oddball features
  arm64: Allow the idreg override to deal with variable field width
  arm64: Factor out checking of a feature against the override into a macro
  arm64: Allow sticky E2H when entering EL1
  arm64: Save state of HCR_EL2.E2H before switch to EL1
  arm64: Rename the VHE switch to "finalise_el2"
  arm64: mm: fix booting with 52-bit address space
  arm64: head: remove __PHYS_OFFSET
  arm64: lds: use PROVIDE instead of conditional definitions
  arm64: setup: drop early FDT pointer helpers
  arm64: head: avoid relocating the kernel twice for KASLR
  arm64: kaslr: defer initialization to initcall where permitted
  arm64: head: record CPU boot mode after enabling the MMU
  arm64: head: populate kernel page tables with MMU and caches on
  arm64: head: factor out TTBR1 assignment into a macro
  arm64: idreg-override: use early FDT mapping in ID map
  ...
2022-07-25 10:59:15 +01:00
Will Deacon
c436500d9f Merge branch 'for-next/mte' into for-next/core
* for-next/mte:
  arm64: kasan: Revert "arm64: mte: reset the page tag in page->flags"
  mm: kasan: Skip page unpoisoning only if __GFP_SKIP_KASAN_UNPOISON
  mm: kasan: Skip unpoisoning of user pages
  mm: kasan: Ensure the tags are visible before the tag in page->flags
2022-07-25 10:57:08 +01:00
Will Deacon
02eab44c71 Merge branch 'for-next/misc' into for-next/core
* for-next/misc:
  arm64/mm: use GENMASK_ULL for TTBR_BADDR_MASK_52
  arm64: numa: Don't check node against MAX_NUMNODES
  arm64: mm: Remove assembly DMA cache maintenance wrappers
  arm64/mm: Define defer_reserve_crashkernel()
  arm64: fix oops in concurrently setting insn_emulation sysctls
  arm64: Do not forget syscall when starting a new thread.
  arm64: boot: add zstd support
2022-07-25 10:56:57 +01:00
Will Deacon
8184a8bc1c Merge branch 'for-next/kpti' into for-next/core
* for-next/kpti:
  arm64: correct the effect of mitigations off on kpti
  arm64: entry: simplify trampoline data page
  arm64: mm: install KPTI nG mappings with MMU enabled
  arm64: kpti-ng: simplify page table traversal logic
2022-07-25 10:56:49 +01:00
Will Deacon
84d8857af4 Merge branch 'for-next/ioremap' into for-next/core
* for-next/ioremap:
  arm64: Add HAVE_IOREMAP_PROT support
  arm64: mm: Convert to GENERIC_IOREMAP
  mm: ioremap: Add ioremap/iounmap_allowed()
  mm: ioremap: Setup phys_addr of struct vm_struct
  mm: ioremap: Use more sensible name in ioremap_prot()
  ARM: mm: kill unused runtime hook arch_iounmap()
2022-07-25 10:56:23 +01:00
Anshuman Khandual
42251045cc arm64/mm: move protection_map[] inside the platform
This moves protection_map[] inside the platform and makes it a static.

Link: https://lkml.kernel.org/r/20220711070600.2378316-6-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Brian Cain <bcain@quicinc.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-17 17:14:37 -07:00
Baolin Wang
1bcdb769f9 arm64/hugetlb: implement arm64 specific hugetlb_mask_last_page
The HugeTLB address ranges are linearly scanned during fork, unmap and
remap operations, and the linear scan can skip to the end of range mapped
by the page table page if hitting a non-present entry, which can help to
speed linear scanning of the HugeTLB address ranges.

So hugetlb_mask_last_page() is introduced to help to update the address in
the loop of HugeTLB linear scanning with getting the last huge page mapped
by the associated page table page[1], when a non-present entry is
encountered.

Considering ARM64 specific cont-pte/pmd size HugeTLB, this patch
implemented an ARM64 specific hugetlb_mask_last_page() to help this case.

[1] https://lore.kernel.org/linux-mm/20220527225849.284839-1-mike.kravetz@oracle.com/

[baolin.wang@linux.alibaba.com: fix build]
  Link: https://lkml.kernel.org/r/a14e7b39-6a8a-4609-b4a1-84ac574f5c96@linux.alibaba.com
Link: https://lkml.kernel.org/r/20220621235620.291305-3-mike.kravetz@oracle.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Muchun Song <songmuchun@bytedance.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: James Houghton <jthoughton@google.com>
Cc: kernel test robot <lkp@intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rolf Eike Beer <eike-kernel@sf-tec.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-17 17:14:34 -07:00
Catalin Marinas
20794545c1 arm64: kasan: Revert "arm64: mte: reset the page tag in page->flags"
This reverts commit e5b8d92189.

Pages mapped in user-space with PROT_MTE have the allocation tags either
zeroed or copied/restored to some user values. In order for the kernel
to access such pages via page_address(), resetting the tag in
page->flags was necessary. This tag resetting was deferred to
set_pte_at() -> mte_sync_page_tags() but it can race with another CPU
reading the flags (via page_to_virt()):

P0 (mte_sync_page_tags):	P1 (memcpy from virt_to_page):
				  Rflags!=0xff
  Wflags=0xff
  DMB (doesn't help)
  Wtags=0
				  Rtags=0   // fault

Since now the post_alloc_hook() function resets the page->flags tag when
unpoisoning is skipped for user pages (including the __GFP_ZEROTAGS
case), revert the arm64 commit calling page_kasan_tag_reset().

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Peter Collingbourne <pcc@google.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Acked-by: Andrey Konovalov <andreyknvl@gmail.com>
Link: https://lore.kernel.org/r/20220610152141.2148929-5-catalin.marinas@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-07-07 10:48:37 +01:00
Catalin Marinas
70c248aca9 mm: kasan: Skip unpoisoning of user pages
Commit c275c5c6d5 ("kasan: disable freed user page poisoning with HW
tags") added __GFP_SKIP_KASAN_POISON to GFP_HIGHUSER_MOVABLE. A similar
argument can be made about unpoisoning, so also add
__GFP_SKIP_KASAN_UNPOISON to user pages. To ensure the user page is
still accessible via page_address() without a kasan fault, reset the
page->flags tag.

With the above changes, there is no need for the arm64
tag_clear_highpage() to reset the page->flags tag.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Peter Collingbourne <pcc@google.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Link: https://lore.kernel.org/r/20220610152141.2148929-3-catalin.marinas@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-07-07 10:48:37 +01:00
Will Deacon
7eacf1858b arm64: mm: Remove assembly DMA cache maintenance wrappers
Remove the __dma_{flush,map,unmap}_area assembly wrappers and call the
appropriate cache maintenance functions directly from the DMA mapping
callbacks.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220610151228.4562-3-will@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-07-05 13:06:31 +01:00
Anshuman Khandual
4890cc18f9 arm64/mm: Define defer_reserve_crashkernel()
Crash kernel memory reservation gets deferred, when either CONFIG_ZONE_DMA
or CONFIG_ZONE_DMA32 config is enabled on the platform. This deferral also
impacts overall linear mapping creation including the crash kernel itself.
Just encapsulate this deferral check in a new helper for better clarity.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20220705062556.1845734-1-anshuman.khandual@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-07-05 11:43:47 +01:00
Qi Zheng
18f3962953 mm: hugetlb: kill set_huge_swap_pte_at()
Commit e5251fd430 ("mm/hugetlb: introduce set_huge_swap_pte_at()
helper") add set_huge_swap_pte_at() to handle swap entries on
architectures that support hugepages consisting of contiguous ptes.  And
currently the set_huge_swap_pte_at() is only overridden by arm64.

set_huge_swap_pte_at() provide a sz parameter to help determine the number
of entries to be updated.  But in fact, all hugetlb swap entries contain
pfn information, so we can find the corresponding folio through the pfn
recorded in the swap entry, then the folio_size() is the number of entries
that need to be updated.

And considering that users will easily cause bugs by ignoring the
difference between set_huge_swap_pte_at() and set_huge_pte_at().  Let's
handle swap entries in set_huge_pte_at() and remove the
set_huge_swap_pte_at(), then we can call set_huge_pte_at() anywhere, which
simplifies our coding.

Link: https://lkml.kernel.org/r/20220626145717.53572-1-zhengqi.arch@bytedance.com
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Acked-by: Muchun Song <songmuchun@bytedance.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-03 18:08:50 -07:00
Will Deacon
4109823037 arm64: hugetlb: Restore TLB invalidation for BBM on contiguous ptes
Commit fb396bb459 ("arm64/hugetlb: Drop TLB flush from get_clear_flush()")
removed TLB invalidation from get_clear_flush() [now get_clear_contig()]
on the basis that the core TLB invalidation code is aware of hugetlb
mappings backed by contiguous page-table entries and will cover the
correct virtual address range.

However, this change also resulted in the TLB invalidation being removed
from the "break" step in the break-before-make (BBM) sequence used
internally by huge_ptep_set_{access_flags,wrprotect}(), therefore
making the BBM sequence unsafe irrespective of later invalidation.

Although the architecture is desperately unclear about how exactly
contiguous ptes should be updated in a live page-table, restore TLB
invalidation to our BBM sequence under the assumption that BBM is the
right thing to be doing in the first place.

Fixes: fb396bb459 ("arm64/hugetlb: Drop TLB flush from get_clear_flush()")
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20220629095349.25748-1-will@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-07-01 18:29:26 +01:00
Ard Biesheuvel
0aaa68532e arm64: mm: fix booting with 52-bit address space
Joey reports that booting 52-bit VA capable builds on 52-bit VA capable
CPUs is broken since commit 0d9b1ffefa ("arm64: mm: make vabits_actual
a build time constant if possible"). This is due to the fact that the
primary CPU reads the vabits_actual variable before it has been
assigned.

The reason for deferring the assignment of vabits_actual was that we try
to perform as few stores to memory as we can with the MMU and caches
off, due to the cache coherency issues it creates.

Since __cpu_setup() [which is where the read of vabits_actual occurs] is
also called on the secondary boot path, we cannot just read the CPU ID
registers directly, given that the size of the VA space is decided by
the capabilities of the primary CPU. So let's read vabits_actual only on
the secondary boot path, and read the CPU ID registers directly on the
primary boot path, by making it a function parameter of __cpu_setup().

To ensure that all users of vabits_actual (including kasan_early_init())
observe the correct value, move the assignment of vabits_actual back
into asm code, but still defer it to after the MMU and caches have been
enabled.

Cc: Will Deacon <will@kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Fixes: 0d9b1ffefa ("arm64: mm: make vabits_actual a build time constant if possible")
Reported-by: Joey Gouly <joey.gouly@arm.com>
Co-developed-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220701111045.2944309-1-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-07-01 15:19:07 +01:00
Tong Tiangen
bacac63702 arm64: extable: cleanup redundant extable type EX_TYPE_FIXUP
Currently, extable type EX_TYPE_FIXUP is no place to use, We can safely
remove it.

Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20220621072638.1273594-7-tongtiangen@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-28 12:11:47 +01:00
Tong Tiangen
4953fc3d32 arm64: extable: add new extable type EX_TYPE_KACCESS_ERR_ZERO support
Currently, The extable type EX_TYPE_UACCESS_ERR_ZERO is used by
__get/put_kernel_nofault(), but those helpers are not uaccess type, so we
add a new extable type EX_TYPE_KACCESS_ERR_ZERO which can be used by
__get/put_kernel_no_fault().

This is also to prepare for distinguishing the two types in machine check
safe process.

Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20220621072638.1273594-2-tongtiangen@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-28 12:11:00 +01:00
akpm
46a3b11253 Merge branch 'master' into mm-stable 2022-06-27 10:31:34 -07:00
Kefeng Wang
893dea9ccd arm64: Add HAVE_IOREMAP_PROT support
With ioremap_prot() definition from generic ioremap, also move
pte_pgprot() from hugetlbpage.c into pgtable.h, then arm64 could
have HAVE_IOREMAP_PROT, which will enable generic_access_phys()
code, it is useful for debug, eg, gdb.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Link: https://lore.kernel.org/r/20220607125027.44946-7-wangkefeng.wang@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-27 12:22:31 +01:00
Kefeng Wang
f23eab0bfa arm64: mm: Convert to GENERIC_IOREMAP
Add hook for arm64's special operation when ioremap(), then
ioremap_wc/np/cache is converted to use ioremap_prot() from
GENERIC_IOREMAP, update the Copyright and kill the unused
inclusions.

Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Link: https://lore.kernel.org/r/20220607125027.44946-6-wangkefeng.wang@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-27 12:22:31 +01:00
Ard Biesheuvel
005e12676a arm64: head: record CPU boot mode after enabling the MMU
In order to avoid having to touch memory with the MMU and caches
disabled, and therefore having to invalidate it from the caches
explicitly, just defer storing the value until after the MMU has been
turned on, unless we are giving up with an error.

While at it, move the associated variable definitions into C code.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220624150651.1358849-19-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24 17:18:10 +01:00
Ard Biesheuvel
c3cee924bd arm64: head: cover entire kernel image in initial ID map
As a first step towards avoiding the need to create, tear down and
recreate the kernel virtual mapping with MMU and caches disabled, start
by expanding the ID map so it covers the page tables as well as all
executable code. This will allow us to populate the page tables with the
MMU and caches on, and call KASLR init code before setting up the
virtual mapping.

Since this ID map is only needed at boot, create it as a temporary set
of page tables, and populate the permanent ID map after enabling the MMU
and caches. While at it, switch to read-only attributes for the where
possible, as writable permissions are only needed for the initial kernel
page tables. Note that on 4k granule configurations, the permanent ID
map will now be reduced to a single page rather than a 2M block mapping.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220624150651.1358849-13-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24 17:18:10 +01:00
Ard Biesheuvel
1682c45b92 arm64: mm: provide idmap pointer to cpu_replace_ttbr1()
In preparation for changing the way we initialize the permanent ID map,
update cpu_replace_ttbr1() so we can use it with the initial ID map as
well.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220624150651.1358849-11-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24 17:18:10 +01:00
Ard Biesheuvel
ebd9aea1f2 arm64: head: drop idmap_ptrs_per_pgd
The assignment of idmap_ptrs_per_pgd lacks any cache invalidation, even
though it is updated with the MMU and caches disabled. However, we never
bother to read the value again except in the very next instruction, and
so we can just drop the variable entirely.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20220624150651.1358849-5-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24 17:18:09 +01:00
Ard Biesheuvel
e8d13cced5 arm64: head: move assignment of idmap_t0sz to C code
Setting idmap_t0sz involves fiddling with the caches if done with the
MMU off. Since we will be creating an initial ID map with the MMU and
caches off, and the permanent ID map with the MMU and caches on, let's
move this assignment of idmap_t0sz out of the startup code, and replace
it with a macro that simply issues the three instructions needed to
calculate the value wherever it is needed before the MMU is turned on.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220624150651.1358849-4-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24 17:18:09 +01:00
Ard Biesheuvel
0d9b1ffefa arm64: mm: make vabits_actual a build time constant if possible
Currently, we only support 52-bit virtual addressing on 64k pages
configurations, and in all other cases, vabits_actual is guaranteed to
equal VA_BITS (== VA_BITS_MIN). So get rid of the variable entirely in
that case.

While at it, move the assignment out of the asm entry code - it has no
need to be there.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220624150651.1358849-3-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24 17:18:09 +01:00
Ard Biesheuvel
475031b6ed arm64: head: move kimage_vaddr variable into C file
This variable definition does not need to be in head.S so move it out.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20220624150651.1358849-2-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24 17:18:09 +01:00
Ard Biesheuvel
1c9a8e8768 arm64: entry: simplify trampoline data page
Get rid of some clunky open coded arithmetic on section addresses, by
emitting the trampoline data variables into a separate, dedicated r/o
data section, and putting it at the next page boundary. This way, we can
access the literals via single LDR instruction.

While at it, get rid of other, implicit literals, and use ADRP/ADD or
MOVZ/MOVK sequences, as appropriate. Note that the latter are only
supported for CONFIG_RELOCATABLE=n (which is usually the case if
CONFIG_RANDOMIZE_BASE=n), so update the CPP conditionals to reflect
this.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220622161010.3845775-1-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24 13:08:30 +01:00
Ard Biesheuvel
47546a1912 arm64: mm: install KPTI nG mappings with MMU enabled
In cases where we unmap the kernel while running in user space, we rely
on ASIDs to distinguish the minimal trampoline from the full kernel
mapping, and this means we must use non-global attributes for those
mappings, to ensure they are scoped by ASID and will not hit in the TLB
inadvertently.

We only do this when needed, as this is generally more costly in terms
of TLB pressure, and so we boot without these non-global attributes, and
apply them to all existing kernel mappings once all CPUs are up and we
know whether or not the non-global attributes are needed. At this point,
we cannot simply unmap and remap the entire address space, so we have to
update all existing block and page descriptors in place.

Currently, we go through a lot of trouble to perform these updates with
the MMU and caches off, to avoid violating break before make (BBM) rules
imposed by the architecture. Since we make changes to page tables that
are not covered by the ID map, we gain access to those descriptors by
disabling translations altogether. This means that the stores to memory
are issued with device attributes, and require extra care in terms of
coherency, which is costly. We also rely on the ID map to access a
shared flag, which requires the ID map to be executable and writable at
the same time, which is another thing we'd prefer to avoid.

So let's switch to an approach where we replace the kernel mapping with
a minimal mapping of a few pages that can be used for a minimal, ad-hoc
fixmap that we can use to map each page table in turn as we traverse the
hierarchy.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220609174320.4035379-3-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-23 18:26:13 +01:00
Ard Biesheuvel
c7eff738cf arm64: kpti-ng: simplify page table traversal logic
Simplify the KPTI G-to-nG asm helper code by:
- pulling the 'table bit' test into the get/put macros so we can combine
  them and incorporate the entire loop;
- moving the 'table bit' test after the update of bit #11 so we no
  longer need separate next_xxx and skip_xxx labels;
- redefining the pmd/pud register aliases and the next_pmd/next_pud
  labels instead of branching to them if the number of configured page
  table levels is less than 3 or 4, respectively.

No functional change intended, except for the fact that we now descend
into a next level table after setting bit #11 on its descriptor but this
should make no difference in practice.

While at it, switch to .L prefixed local labels so they don't clutter up
the symbol tables, kallsyms, etc, and clean up the indentation for
legibility.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20220609174320.4035379-2-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-23 18:26:13 +01:00
Will Deacon
c50f11c619 arm64: mm: Don't invalidate FROM_DEVICE buffers at start of DMA transfer
Invalidating the buffer memory in arch_sync_dma_for_device() for
FROM_DEVICE transfers

When using the streaming DMA API to map a buffer prior to inbound
non-coherent DMA (i.e. DMA_FROM_DEVICE), we invalidate any dirty CPU
cachelines so that they will not be written back during the transfer and
corrupt the buffer contents written by the DMA. This, however, poses two
potential problems:

  (1) If the DMA transfer does not write to every byte in the buffer,
      then the unwritten bytes will contain stale data once the transfer
      has completed.

  (2) If the buffer has a virtual alias in userspace, then stale data
      may be visible via this alias during the period between performing
      the cache invalidation and the DMA writes landing in memory.

Address both of these issues by cleaning (aka writing-back) the dirty
lines in arch_sync_dma_for_device(DMA_FROM_DEVICE) instead of discarding
them using invalidation.

Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220606152150.GA31568@willie-the-truck
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220610151228.4562-2-will@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-06-17 19:06:06 +01:00
Peter Xu
d92725256b mm: avoid unnecessary page fault retires on shared memory types
I observed that for each of the shared file-backed page faults, we're very
likely to retry one more time for the 1st write fault upon no page.  It's
because we'll need to release the mmap lock for dirty rate limit purpose
with balance_dirty_pages_ratelimited() (in fault_dirty_shared_page()).

Then after that throttling we return VM_FAULT_RETRY.

We did that probably because VM_FAULT_RETRY is the only way we can return
to the fault handler at that time telling it we've released the mmap lock.

However that's not ideal because it's very likely the fault does not need
to be retried at all since the pgtable was well installed before the
throttling, so the next continuous fault (including taking mmap read lock,
walk the pgtable, etc.) could be in most cases unnecessary.

It's not only slowing down page faults for shared file-backed, but also add
more mmap lock contention which is in most cases not needed at all.

To observe this, one could try to write to some shmem page and look at
"pgfault" value in /proc/vmstat, then we should expect 2 counts for each
shmem write simply because we retried, and vm event "pgfault" will capture
that.

To make it more efficient, add a new VM_FAULT_COMPLETED return code just to
show that we've completed the whole fault and released the lock.  It's also
a hint that we should very possibly not need another fault immediately on
this page because we've just completed it.

This patch provides a ~12% perf boost on my aarch64 test VM with a simple
program sequentially dirtying 400MB shmem file being mmap()ed and these are
the time it needs:

  Before: 650.980 ms (+-1.94%)
  After:  569.396 ms (+-1.38%)

I believe it could help more than that.

We need some special care on GUP and the s390 pgfault handler (for gmap
code before returning from pgfault), the rest changes in the page fault
handlers should be relatively straightforward.

Another thing to mention is that mm_account_fault() does take this new
fault as a generic fault to be accounted, unlike VM_FAULT_RETRY.

I explicitly didn't touch hmm_vma_fault() and break_ksm() because they do
not handle VM_FAULT_RETRY even with existing code, so I'm literally keeping
them as-is.

Link: https://lkml.kernel.org/r/20220530183450.42886-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vineet Gupta <vgupta@kernel.org>
Acked-by: Guo Ren <guoren@kernel.org>
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>	[arm part]
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Stafford Horne <shorne@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Brian Cain <bcain@quicinc.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Richard Weinberger <richard@nod.at>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Janosch Frank <frankja@linux.ibm.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Will Deacon <will@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Chris Zankel <chris@zankel.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Rich Felker <dalias@libc.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Helge Deller <deller@gmx.de>
Cc: Yoshinori Sato <ysato@users.osdn.me>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-06-16 19:48:27 -07:00
Ma Wupeng
c0b978fedf arm64: mm: Only remove nomap flag for initrd
Commit 177e15f0c1 ("arm64: add the initrd region to the linear mapping explicitly")
remove all the flags of the memory used by initrd. This is fine since
MEMBLOCK_MIRROR is not used in arm64.

However with mirrored feature introduced to arm64, this will clear the mirrored
flag used by initrd, which will lead to error log printed by
find_zone_movable_pfns_for_nodes() if the lower 4G range has some non-mirrored
memory.

To solve this problem, only MEMBLOCK_NOMAP flag will be removed via
memblock_clear_nomap().

Signed-off-by: Ma Wupeng <mawupeng1@huawei.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220614092156.1972846-5-mawupeng1@huawei.com
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-06-15 12:14:15 +02:00
Oleksandr Tyshchenko
9bf22421dc arm/xen: Introduce xen_setup_dma_ops()
This patch introduces new helper and places it in new header.
The helper's purpose is to assign any Xen specific DMA ops in
a single place. For now, we deal with xen-swiotlb DMA ops only.
The one of the subsequent commits in current series will add
xen-grant DMA ops case.

Also re-use the xen_swiotlb_detect() check on Arm32.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
[For arm64]
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/1654197833-25362-2-git-send-email-olekstysh@gmail.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-06-06 08:54:33 +02:00
Baolin Wang
e68b823ab0 arm64/hugetlb: Fix building errors in huge_ptep_clear_flush()
Fix the arm64 build error which was caused by commit ae07562909 ("mm:
change huge_ptep_clear_flush() to return the original pte") interacting
with commit fb396bb459 ("arm64/hugetlb: Drop TLB flush from
get_clear_flush()"):

  arch/arm64/mm/hugetlbpage.c: In function ‘huge_ptep_clear_flush’:
  arch/arm64/mm/hugetlbpage.c:515:9: error: implicit declaration of function ‘get_clear_flush’; did you mean ‘ptep_clear_flush’? [-Werror=implicit-function-declaration]
    515 |  return get_clear_flush(vma->vm_mm, addr, ptep, pgsize, ncontig);
        |         ^~~~~~~~~~~~~~~
        |         ptep_clear_flush

Due to the new get_clear_contig() has dropped TLB flush, we should add
an explicit TLB flush in huge_ptep_clear_flush() to keep original
semantics when changing to use new get_clear_contig().

Fixes: fb396bb459 ("arm64/hugetlb: Drop TLB flush from get_clear_flush()").
Fixes: ae07562909 ("mm: change huge_ptep_clear_flush() to return the original pte")
Reported-and-tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Reported-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-05-27 10:56:35 -07:00
Linus Torvalds
98931dd95f Yang Shi has improved the behaviour of khugepaged collapsing of readonly
file-backed transparent hugepages.
 
 Johannes Weiner has arranged for zswap memory use to be tracked and
 managed on a per-cgroup basis.
 
 Munchun Song adds a /proc knob ("hugetlb_optimize_vmemmap") for runtime
 enablement of the recent huge page vmemmap optimization feature.
 
 Baolin Wang contributes a series to fix some issues around hugetlb
 pagetable invalidation.
 
 Zhenwei Pi has fixed some interactions between hwpoisoned pages and
 virtualization.
 
 Tong Tiangen has enabled the use of the presently x86-only
 page_table_check debugging feature on arm64 and riscv.
 
 David Vernet has done some fixup work on the memcg selftests.
 
 Peter Xu has taught userfaultfd to handle write protection faults against
 shmem- and hugetlbfs-backed files.
 
 More DAMON development from SeongJae Park - adding online tuning of the
 feature and support for monitoring of fixed virtual address ranges.  Also
 easier discovery of which monitoring operations are available.
 
 Nadav Amit has done some optimization of TLB flushing during mprotect().
 
 Neil Brown continues to labor away at improving our swap-over-NFS support.
 
 David Hildenbrand has some fixes to anon page COWing versus
 get_user_pages().
 
 Peng Liu fixed some errors in the core hugetlb code.
 
 Joao Martins has reduced the amount of memory consumed by device-dax's
 compound devmaps.
 
 Some cleanups of the arch-specific pagemap code from Anshuman Khandual.
 
 Muchun Song has found and fixed some errors in the TLB flushing of
 transparent hugepages.
 
 Roman Gushchin has done more work on the memcg selftests.
 
 And, of course, many smaller fixes and cleanups.  Notably, the customary
 million cleanup serieses from Miaohe Lin.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCYo52xQAKCRDdBJ7gKXxA
 jtJFAQD238KoeI9z5SkPMaeBRYSRQmNll85mxs25KapcEgWgGQD9FAb7DJkqsIVk
 PzE+d9hEfirUGdL6cujatwJ6ejYR8Q8=
 =nFe6
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2022-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:
 "Almost all of MM here. A few things are still getting finished off,
  reviewed, etc.

   - Yang Shi has improved the behaviour of khugepaged collapsing of
     readonly file-backed transparent hugepages.

   - Johannes Weiner has arranged for zswap memory use to be tracked and
     managed on a per-cgroup basis.

   - Munchun Song adds a /proc knob ("hugetlb_optimize_vmemmap") for
     runtime enablement of the recent huge page vmemmap optimization
     feature.

   - Baolin Wang contributes a series to fix some issues around hugetlb
     pagetable invalidation.

   - Zhenwei Pi has fixed some interactions between hwpoisoned pages and
     virtualization.

   - Tong Tiangen has enabled the use of the presently x86-only
     page_table_check debugging feature on arm64 and riscv.

   - David Vernet has done some fixup work on the memcg selftests.

   - Peter Xu has taught userfaultfd to handle write protection faults
     against shmem- and hugetlbfs-backed files.

   - More DAMON development from SeongJae Park - adding online tuning of
     the feature and support for monitoring of fixed virtual address
     ranges. Also easier discovery of which monitoring operations are
     available.

   - Nadav Amit has done some optimization of TLB flushing during
     mprotect().

   - Neil Brown continues to labor away at improving our swap-over-NFS
     support.

   - David Hildenbrand has some fixes to anon page COWing versus
     get_user_pages().

   - Peng Liu fixed some errors in the core hugetlb code.

   - Joao Martins has reduced the amount of memory consumed by
     device-dax's compound devmaps.

   - Some cleanups of the arch-specific pagemap code from Anshuman
     Khandual.

   - Muchun Song has found and fixed some errors in the TLB flushing of
     transparent hugepages.

   - Roman Gushchin has done more work on the memcg selftests.

  ... and, of course, many smaller fixes and cleanups. Notably, the
  customary million cleanup serieses from Miaohe Lin"

* tag 'mm-stable-2022-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (381 commits)
  mm: kfence: use PAGE_ALIGNED helper
  selftests: vm: add the "settings" file with timeout variable
  selftests: vm: add "test_hmm.sh" to TEST_FILES
  selftests: vm: check numa_available() before operating "merge_across_nodes" in ksm_tests
  selftests: vm: add migration to the .gitignore
  selftests/vm/pkeys: fix typo in comment
  ksm: fix typo in comment
  selftests: vm: add process_mrelease tests
  Revert "mm/vmscan: never demote for memcg reclaim"
  mm/kfence: print disabling or re-enabling message
  include/trace/events/percpu.h: cleanup for "percpu: improve percpu_alloc_percpu event trace"
  include/trace/events/mmflags.h: cleanup for "tracing: incorrect gfp_t conversion"
  mm: fix a potential infinite loop in start_isolate_page_range()
  MAINTAINERS: add Muchun as co-maintainer for HugeTLB
  zram: fix Kconfig dependency warning
  mm/shmem: fix shmem folio swapoff hang
  cgroup: fix an error handling path in alloc_pagecache_max_30M()
  mm: damon: use HPAGE_PMD_SIZE
  tracing: incorrect isolate_mote_t cast in mm_vmscan_lru_isolate
  nodemask.h: fix compilation error with GCC12
  ...
2022-05-26 12:32:41 -07:00
Linus Torvalds
3f306ea2e1 dma-mapping updates for Linux 5.19
- don't over-decrypt memory (Robin Murphy)
  - takes min align mask into account for the swiotlb max mapping size
    (Tianyu Lan)
  - use GFP_ATOMIC in dma-debug (Mikulas Patocka)
  - fix DMA_ATTR_NO_KERNEL_MAPPING on xen/arm (me)
  - don't fail on highmem CMA pages in dma_direct_alloc_pages (me)
  - cleanup swiotlb initialization and share more code with swiotlb-xen
    (me, Stefano Stabellini)
 -----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmKObTQLHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYObmA//dIcDB/q4iFGD+WJh4MhM+asx0ZsdF2OJz42WEhgT
 Z9duOrgcneEQundCamqJP9rNTs980LHDA8uWQC5rZEc9vxuRVOdS7bSgYRUwWh6B
 r0ZjOsvQCn+ChoZML8uyk4rfmEINq+EvJuec3G5fgecZOhPuJS2i2uzzv5cHwqgP
 ChC0fwyZlkfdECXgvZXbEoCJLfTgGNlziN6Ai8dirSoqgEQUoCsY89/M7OiEBvV2
 R4XUWD7OvQERfB4t6xLuUHyzf9PAuWB+OiblRVNeAmK3lMjxVrc3k4kIowgklnzD
 8hfmphAa9Zou3zdfi6Gd4fiQRHRVOwKVp1rtqUmJ+lPSiwyMzu64z9ld2+2qac0h
 V4sSr/yJkhxnBT4/0MkTChvhnRobisackpUzNRpiM4ck7cNVb7eAvkISsbH+pWI9
 aEexPhbyskjlV+GOyM4QL4ygG0dpXY0HSyoh6uaSVsaXMycnWIsJCPidXxV1HGV0
 q2/RLHuHwYxia8cYCF01/DQvwOKSjwbU0zModxtRezGD5GYh2C0a+SrA1aX+qiTu
 yGJCs2UHtSQstAt78tTVp499YeDeL/oGSQkPAu8zyRkSczzF+CncGTuXyoJbAWyK
 otcgERWljgZ4scxjfu1uacfoVhKQ7nOu7hiJokL0U80FESAennLC3ZlocvB9h/ff
 HNA=
 =n2rk
 -----END PGP SIGNATURE-----

Merge tag 'dma-mapping-5.19-2022-05-25' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping updates from Christoph Hellwig:

 - don't over-decrypt memory (Robin Murphy)

 - takes min align mask into account for the swiotlb max mapping size
   (Tianyu Lan)

 - use GFP_ATOMIC in dma-debug (Mikulas Patocka)

 - fix DMA_ATTR_NO_KERNEL_MAPPING on xen/arm (me)

 - don't fail on highmem CMA pages in dma_direct_alloc_pages (me)

 - cleanup swiotlb initialization and share more code with swiotlb-xen
   (me, Stefano Stabellini)

* tag 'dma-mapping-5.19-2022-05-25' of git://git.infradead.org/users/hch/dma-mapping: (23 commits)
  dma-direct: don't over-decrypt memory
  swiotlb: max mapping size takes min align mask into account
  swiotlb: use the right nslabs-derived sizes in swiotlb_init_late
  swiotlb: use the right nslabs value in swiotlb_init_remap
  swiotlb: don't panic when the swiotlb buffer can't be allocated
  dma-debug: change allocation mode from GFP_NOWAIT to GFP_ATIOMIC
  dma-direct: don't fail on highmem CMA pages in dma_direct_alloc_pages
  swiotlb-xen: fix DMA_ATTR_NO_KERNEL_MAPPING on arm
  x86: remove cruft from <asm/dma-mapping.h>
  swiotlb: remove swiotlb_init_with_tbl and swiotlb_init_late_with_tbl
  swiotlb: merge swiotlb-xen initialization into swiotlb
  swiotlb: provide swiotlb_init variants that remap the buffer
  swiotlb: pass a gfp_mask argument to swiotlb_init_late
  swiotlb: add a SWIOTLB_ANY flag to lift the low memory restriction
  swiotlb: make the swiotlb_init interface more useful
  x86: centralize setting SWIOTLB_FORCE when guest memory encryption is enabled
  x86: remove the IOMMU table infrastructure
  MIPS/octeon: use swiotlb_init instead of open coding it
  arm/xen: don't check for xen_initial_domain() in xen_create_contiguous_region
  swiotlb: rename swiotlb_late_init_with_default_size
  ...
2022-05-25 19:18:36 -07:00
Linus Torvalds
143a6252e1 arm64 updates for 5.19:
- Initial support for the ARMv9 Scalable Matrix Extension (SME). SME
   takes the approach used for vectors in SVE and extends this to provide
   architectural support for matrix operations. No KVM support yet, SME
   is disabled in guests.
 
 - Support for crashkernel reservations above ZONE_DMA via the
   'crashkernel=X,high' command line option.
 
 - btrfs search_ioctl() fix for live-lock with sub-page faults.
 
 - arm64 perf updates: support for the Hisilicon "CPA" PMU for monitoring
   coherent I/O traffic, support for Arm's CMN-650 and CMN-700
   interconnect PMUs, minor driver fixes, kerneldoc cleanup.
 
 - Kselftest updates for SME, BTI, MTE.
 
 - Automatic generation of the system register macros from a 'sysreg'
   file describing the register bitfields.
 
 - Update the type of the function argument holding the ESR_ELx register
   value to unsigned long to match the architecture register size
   (originally 32-bit but extended since ARMv8.0).
 
 - stacktrace cleanups.
 
 - ftrace cleanups.
 
 - Miscellaneous updates, most notably: arm64-specific huge_ptep_get(),
   avoid executable mappings in kexec/hibernate code, drop TLB flushing
   from get_clear_flush() (and rename it to get_clear_contig()),
   ARCH_NR_GPIO bumped to 2048 for ARCH_APPLE.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmKH19IACgkQa9axLQDI
 XvEFWg//bf0p6zjeNaOJmBbyVFsXsVyYiEaLUpFPUs3oB+81s2YZ+9i1rgMrNCft
 EIDQ9+/HgScKxJxnzWf68heMdcBDbk76VJtLALExbge6owFsjByQDyfb/b3v/bLd
 ezAcGzc6G5/FlI1IP7ct4Z9MnQry4v5AG8lMNAHjnf6GlBS/tYNAqpmj8HpQfgRQ
 ZbhfZ8Ayu3TRSLWL39NHVevpmxQm/bGcpP3Q9TtjUqg0r1FQ5sK/LCqOksueIAzT
 UOgUVYWSFwTpLEqbYitVqgERQp9LiLoK5RmNYCIEydfGM7+qmgoxofSq5e2hQtH2
 SZM1XilzsZctRbBbhMit1qDBqMlr/XAy/R5FO0GauETVKTaBhgtj6mZGyeC9nU/+
 RGDljaArbrOzRwMtSuXF+Fp6uVo5spyRn1m8UT/k19lUTdrV9z6EX5Fzuc4Mnhed
 oz4iokbl/n8pDObXKauQspPA46QpxUYhrAs10B/ELc3yyp/Qj3jOfzYHKDNFCUOq
 HC9mU+YiO9g2TbYgCrrFM6Dah2E8fU6/cR0ZPMeMgWK4tKa+6JMEINYEwak9e7M+
 8lZnvu3ntxiJLN+PrPkiPyG+XBh2sux1UfvNQ+nw4Oi9xaydeX7PCbQVWmzTFmHD
 q7UPQ8220e2JNCha9pULS8cxDLxiSksce06DQrGXwnHc1Ir7T04=
 =0DjE
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:

 - Initial support for the ARMv9 Scalable Matrix Extension (SME).

   SME takes the approach used for vectors in SVE and extends this to
   provide architectural support for matrix operations. No KVM support
   yet, SME is disabled in guests.

 - Support for crashkernel reservations above ZONE_DMA via the
   'crashkernel=X,high' command line option.

 - btrfs search_ioctl() fix for live-lock with sub-page faults.

 - arm64 perf updates: support for the Hisilicon "CPA" PMU for
   monitoring coherent I/O traffic, support for Arm's CMN-650 and
   CMN-700 interconnect PMUs, minor driver fixes, kerneldoc cleanup.

 - Kselftest updates for SME, BTI, MTE.

 - Automatic generation of the system register macros from a 'sysreg'
   file describing the register bitfields.

 - Update the type of the function argument holding the ESR_ELx register
   value to unsigned long to match the architecture register size
   (originally 32-bit but extended since ARMv8.0).

 - stacktrace cleanups.

 - ftrace cleanups.

 - Miscellaneous updates, most notably: arm64-specific huge_ptep_get(),
   avoid executable mappings in kexec/hibernate code, drop TLB flushing
   from get_clear_flush() (and rename it to get_clear_contig()),
   ARCH_NR_GPIO bumped to 2048 for ARCH_APPLE.

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (145 commits)
  arm64/sysreg: Generate definitions for FAR_ELx
  arm64/sysreg: Generate definitions for DACR32_EL2
  arm64/sysreg: Generate definitions for CSSELR_EL1
  arm64/sysreg: Generate definitions for CPACR_ELx
  arm64/sysreg: Generate definitions for CONTEXTIDR_ELx
  arm64/sysreg: Generate definitions for CLIDR_EL1
  arm64/sve: Move sve_free() into SVE code section
  arm64: Kconfig.platforms: Add comments
  arm64: Kconfig: Fix indentation and add comments
  arm64: mm: avoid writable executable mappings in kexec/hibernate code
  arm64: lds: move special code sections out of kernel exec segment
  arm64/hugetlb: Implement arm64 specific huge_ptep_get()
  arm64/hugetlb: Use ptep_get() to get the pte value of a huge page
  arm64: kdump: Do not allocate crash low memory if not needed
  arm64/sve: Generate ZCR definitions
  arm64/sme: Generate defintions for SVCR
  arm64/sme: Generate SMPRI_EL1 definitions
  arm64/sme: Automatically generate SMPRIMAP_EL2 definitions
  arm64/sme: Automatically generate SMIDR_EL1 defines
  arm64/sme: Automatically generate defines for SMCR
  ...
2022-05-23 21:06:11 -07:00
Catalin Marinas
0616ea3f1b Merge branch 'for-next/esr-elx-64-bit' into for-next/core
* for-next/esr-elx-64-bit:
  : Treat ESR_ELx as a 64-bit register.
  KVM: arm64: uapi: Add kvm_debug_exit_arch.hsr_high
  KVM: arm64: Treat ESR_EL2 as a 64-bit register
  arm64: Treat ESR_ELx as a 64-bit register
  arm64: compat: Do not treat syscall number as ESR_ELx for a bad syscall
  arm64: Make ESR_ELx_xVC_IMM_MASK compatible with assembly
2022-05-20 18:51:54 +01:00
Catalin Marinas
e003d5335c Merge branch 'for-next/sysreg-gen' into for-next/core
* for-next/sysreg-gen: (32 commits)
  : Automatic system register definition generation.
  arm64/sysreg: Generate definitions for FAR_ELx
  arm64/sysreg: Generate definitions for DACR32_EL2
  arm64/sysreg: Generate definitions for CSSELR_EL1
  arm64/sysreg: Generate definitions for CPACR_ELx
  arm64/sysreg: Generate definitions for CONTEXTIDR_ELx
  arm64/sysreg: Generate definitions for CLIDR_EL1
  arm64/sve: Generate ZCR definitions
  arm64/sme: Generate defintions for SVCR
  arm64/sme: Generate SMPRI_EL1 definitions
  arm64/sme: Automatically generate SMPRIMAP_EL2 definitions
  arm64/sme: Automatically generate SMIDR_EL1 defines
  arm64/sme: Automatically generate defines for SMCR
  arm64/sysreg: Support generation of RAZ fields
  arm64/sme: Remove _EL0 from name of SVCR - FIXME sysreg.h
  arm64/sme: Standardise bitfield names for SVCR
  arm64/sme: Drop SYS_ from SMIDR_EL1 defines
  arm64/fp: Rename SVE and SME LEN field name to _WIDTH
  arm64/fp: Make SVE and SME length register definition match architecture
  arm64/sysreg: fix odd line spacing
  arm64/sysreg: improve comment for regs without fields
  ...
2022-05-20 18:50:57 +01:00
Catalin Marinas
201729d53a Merge branches 'for-next/sme', 'for-next/stacktrace', 'for-next/fault-in-subpage', 'for-next/misc', 'for-next/ftrace' and 'for-next/crashkernel', remote-tracking branch 'arm64/for-next/perf' into for-next/core
* arm64/for-next/perf:
  perf/arm-cmn: Decode CAL devices properly in debugfs
  perf/arm-cmn: Fix filter_sel lookup
  perf/marvell_cn10k: Fix tad_pmu_event_init() to check pmu type first
  drivers/perf: hisi: Add Support for CPA PMU
  drivers/perf: hisi: Associate PMUs in SICL with CPUs online
  drivers/perf: arm_spe: Expose saturating counter to 16-bit
  perf/arm-cmn: Add CMN-700 support
  perf/arm-cmn: Refactor occupancy filter selector
  perf/arm-cmn: Add CMN-650 support
  dt-bindings: perf: arm-cmn: Add CMN-650 and CMN-700
  perf: check return value of armpmu_request_irq()
  perf: RISC-V: Remove non-kernel-doc ** comments

* for-next/sme: (30 commits)
  : Scalable Matrix Extensions support.
  arm64/sve: Move sve_free() into SVE code section
  arm64/sve: Make kernel FPU protection RT friendly
  arm64/sve: Delay freeing memory in fpsimd_flush_thread()
  arm64/sme: More sensibly define the size for the ZA register set
  arm64/sme: Fix NULL check after kzalloc
  arm64/sme: Add ID_AA64SMFR0_EL1 to __read_sysreg_by_encoding()
  arm64/sme: Provide Kconfig for SME
  KVM: arm64: Handle SME host state when running guests
  KVM: arm64: Trap SME usage in guest
  KVM: arm64: Hide SME system registers from guests
  arm64/sme: Save and restore streaming mode over EFI runtime calls
  arm64/sme: Disable streaming mode and ZA when flushing CPU state
  arm64/sme: Add ptrace support for ZA
  arm64/sme: Implement ptrace support for streaming mode SVE registers
  arm64/sme: Implement ZA signal handling
  arm64/sme: Implement streaming SVE signal handling
  arm64/sme: Disable ZA and streaming mode when handling signals
  arm64/sme: Implement traps and syscall handling for SME
  arm64/sme: Implement ZA context switching
  arm64/sme: Implement streaming SVE context switching
  ...

* for-next/stacktrace:
  : Stacktrace cleanups.
  arm64: stacktrace: align with common naming
  arm64: stacktrace: rename stackframe to unwind_state
  arm64: stacktrace: rename unwinder functions
  arm64: stacktrace: make struct stackframe private to stacktrace.c
  arm64: stacktrace: delete PCS comment
  arm64: stacktrace: remove NULL task check from unwind_frame()

* for-next/fault-in-subpage:
  : btrfs search_ioctl() live-lock fix using fault_in_subpage_writeable().
  btrfs: Avoid live-lock in search_ioctl() on hardware with sub-page faults
  arm64: Add support for user sub-page fault probing
  mm: Add fault_in_subpage_writeable() to probe at sub-page granularity

* for-next/misc:
  : Miscellaneous patches.
  arm64: Kconfig.platforms: Add comments
  arm64: Kconfig: Fix indentation and add comments
  arm64: mm: avoid writable executable mappings in kexec/hibernate code
  arm64: lds: move special code sections out of kernel exec segment
  arm64/hugetlb: Implement arm64 specific huge_ptep_get()
  arm64/hugetlb: Use ptep_get() to get the pte value of a huge page
  arm64: mm: Make arch_faults_on_old_pte() check for migratability
  arm64: mte: Clean up user tag accessors
  arm64/hugetlb: Drop TLB flush from get_clear_flush()
  arm64: Declare non global symbols as static
  arm64: mm: Cleanup useless parameters in zone_sizes_init()
  arm64: fix types in copy_highpage()
  arm64: Set ARCH_NR_GPIO to 2048 for ARCH_APPLE
  arm64: cputype: Avoid overflow using MIDR_IMPLEMENTOR_MASK
  arm64: document the boot requirements for MTE
  arm64/mm: Compute PTRS_PER_[PMD|PUD] independently of PTRS_PER_PTE

* for-next/ftrace:
  : ftrace cleanups.
  arm64/ftrace: Make function graph use ftrace directly
  ftrace: cleanup ftrace_graph_caller enable and disable

* for-next/crashkernel:
  : Support for crashkernel reservations above ZONE_DMA.
  arm64: kdump: Do not allocate crash low memory if not needed
  docs: kdump: Update the crashkernel description for arm64
  of: Support more than one crash kernel regions for kexec -s
  of: fdt: Add memory for devices by DT property "linux,usable-memory-range"
  arm64: kdump: Reimplement crashkernel=X
  arm64: Use insert_resource() to simplify code
  kdump: return -ENOENT if required cmdline option does not exist
2022-05-20 18:50:35 +01:00
Ard Biesheuvel
01142791b0 arm64: mm: avoid writable executable mappings in kexec/hibernate code
The temporary mappings of the low-level kexec and hibernate helpers are
created with both writable and executable attributes, which is not
necessary here, and generally best avoided. So use read-only, executable
attributes instead.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20220429131347.3621090-3-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-17 09:32:45 +01:00