arm64 updates for 6.12

ACPI:
 * Enable PMCG erratum workaround for HiSilicon HIP10 and 11 platforms.
 * Ensure arm64-specific IORT header is covered by MAINTAINERS.
 
 CPU Errata:
 * Enable workaround for hardware access/dirty issue on Ampere-1A cores.
 
 Memory management:
 * Define PHYSMEM_END to fix a crash in the amdgpu driver.
 * Avoid tripping over invalid kernel mappings on the kexec() path.
 * Userspace support for the Permission Overlay Extension (POE) using
   protection keys.
 
 Perf and PMUs:
 * Add support for the "fixed instruction counter" extension in the CPU
   PMU architecture.
 * Extend and fix the event encodings for Apple's M1 CPU PMU.
 * Allow LSM hooks to decide on SPE permissions for physical profiling.
 * Add support for the CMN S3 and NI-700 PMUs.
 
 Confidential Computing:
 * Add support for booting an arm64 kernel as a protected guest under
   Android's "Protected KVM" (pKVM) hypervisor.
 
 Selftests:
 * Fix vector length issues in the SVE/SME sigreturn tests
 * Fix build warning in the ptrace tests.
 
 Timers:
 * Add support for PR_{G,S}ET_TSC so that 'rr' can deal with
   non-determinism arising from the architected counter.
 
 Miscellaneous:
 * Rework our IPI-based CPU stopping code to try NMIs if regular IPIs
   don't succeed.
 * Minor fixes and cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmbkVNEQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNKeIB/9YtbN7JMgsXktM94GP03r3tlFF36Y1S51S
 +zdDZclAVZCTCZN+PaFeAZ/+ah2EQYrY6rtDoHUSEMQdF9kH+ycuIPDTwaJ4Qkam
 QKXMpAgtY/4yf2rX4lhDF8rEvkhLDsu7oGDhqUZQsA33GrMBHfgA3oqpYwlVjvGq
 gkm7olTo9LdWAxkPpnjGrjB6Mv5Dq8dJRhW+0Q5AntI5zx3RdYGJZA9GUSzyYCCt
 FIYOtMmWPkQ0kKxIVxOxAOm/ubhfyCs2sjSfkaa3vtvtt+Yjye1Xd81rFciIbPgP
 QlK/Mes2kBZmjhkeus8guLI5Vi7tx3DQMkNqLXkHAAzOoC4oConE
 =6osL
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
 "The highlights are support for Arm's "Permission Overlay Extension"
  using memory protection keys, support for running as a protected guest
  on Android as well as perf support for a bunch of new interconnect
  PMUs.

  Summary:

  ACPI:
   - Enable PMCG erratum workaround for HiSilicon HIP10 and 11
     platforms.
   - Ensure arm64-specific IORT header is covered by MAINTAINERS.

  CPU Errata:
   - Enable workaround for hardware access/dirty issue on Ampere-1A
     cores.

  Memory management:
   - Define PHYSMEM_END to fix a crash in the amdgpu driver.
   - Avoid tripping over invalid kernel mappings on the kexec() path.
   - Userspace support for the Permission Overlay Extension (POE) using
     protection keys.

  Perf and PMUs:
   - Add support for the "fixed instruction counter" extension in the
     CPU PMU architecture.
   - Extend and fix the event encodings for Apple's M1 CPU PMU.
   - Allow LSM hooks to decide on SPE permissions for physical
     profiling.
   - Add support for the CMN S3 and NI-700 PMUs.

  Confidential Computing:
   - Add support for booting an arm64 kernel as a protected guest under
     Android's "Protected KVM" (pKVM) hypervisor.

  Selftests:
   - Fix vector length issues in the SVE/SME sigreturn tests
   - Fix build warning in the ptrace tests.

  Timers:
   - Add support for PR_{G,S}ET_TSC so that 'rr' can deal with
     non-determinism arising from the architected counter.

  Miscellaneous:
   - Rework our IPI-based CPU stopping code to try NMIs if regular IPIs
     don't succeed.
   - Minor fixes and cleanups"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (94 commits)
  perf: arm-ni: Fix an NULL vs IS_ERR() bug
  arm64: hibernate: Fix warning for cast from restricted gfp_t
  arm64: esr: Define ESR_ELx_EC_* constants as UL
  arm64: pkeys: remove redundant WARN
  perf: arm_pmuv3: Use BR_RETIRED for HW branch event if enabled
  MAINTAINERS: List Arm interconnect PMUs as supported
  perf: Add driver for Arm NI-700 interconnect PMU
  dt-bindings/perf: Add Arm NI-700 PMU
  perf/arm-cmn: Improve format attr printing
  perf/arm-cmn: Clean up unnecessary NUMA_NO_NODE check
  arm64/mm: use lm_alias() with addresses passed to memblock_free()
  mm: arm64: document why pte is not advanced in contpte_ptep_set_access_flags()
  arm64: Expose the end of the linear map in PHYSMEM_END
  arm64: trans_pgd: mark PTEs entries as valid to avoid dead kexec()
  arm64/mm: Delete __init region from memblock.reserved
  perf/arm-cmn: Support CMN S3
  dt-bindings: perf: arm-cmn: Add CMN S3
  perf/arm-cmn: Refactor DTC PMU register access
  perf/arm-cmn: Make cycle counts less surprising
  perf/arm-cmn: Improve build-time assertion
  ...
This commit is contained in:
Linus Torvalds 2024-09-16 06:55:07 +02:00
commit 114143a595
125 changed files with 3395 additions and 893 deletions

View File

@ -0,0 +1,17 @@
====================================
Arm Network-on Chip Interconnect PMU
====================================
NI-700 and friends implement a distinct PMU for each clock domain within the
interconnect. Correspondingly, the driver exposes multiple PMU devices named
arm_ni_<x>_cd_<y>, where <x> is an (arbitrary) instance identifier and <y> is
the clock domain ID within that particular instance. If multiple NI instances
exist within a system, the PMU devices can be correlated with the underlying
hardware instance via sysfs parentage.
Each PMU exposes base event aliases for the interface types present in its clock
domain. These require qualifying with the "eventid" and "nodeid" parameters
to specify the event code to count and the interface at which to count it
(per the configured hardware ID as reflected in the xxNI_NODE_INFO register).
The exception is the "cycles" alias for the PMU cycle counter, which is encoded
with the PMU node type and needs no further qualification.

View File

@ -46,16 +46,16 @@ Some of the events only exist for specific configurations.
DesignWare Cores (DWC) PCIe PMU Driver
=======================================
This driver adds PMU devices for each PCIe Root Port named based on the BDF of
This driver adds PMU devices for each PCIe Root Port named based on the SBDF of
the Root Port. For example,
30:03.0 PCI bridge: Device 1ded:8000 (rev 01)
0001:30:03.0 PCI bridge: Device 1ded:8000 (rev 01)
the PMU device name for this Root Port is dwc_rootport_3018.
the PMU device name for this Root Port is dwc_rootport_13018.
The DWC PCIe PMU driver registers a perf PMU driver, which provides
description of available events and configuration options in sysfs, see
/sys/bus/event_source/devices/dwc_rootport_{bdf}.
/sys/bus/event_source/devices/dwc_rootport_{sbdf}.
The "format" directory describes format of the config fields of the
perf_event_attr structure. The "events" directory provides configuration
@ -66,16 +66,16 @@ The "perf list" command shall list the available events from sysfs, e.g.::
$# perf list | grep dwc_rootport
<...>
dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/ [Kernel PMU event]
dwc_rootport_13018/Rx_PCIe_TLP_Data_Payload/ [Kernel PMU event]
<...>
dwc_rootport_3018/rx_memory_read,lane=?/ [Kernel PMU event]
dwc_rootport_13018/rx_memory_read,lane=?/ [Kernel PMU event]
Time Based Analysis Event Usage
-------------------------------
Example usage of counting PCIe RX TLP data payload (Units of bytes)::
$# perf stat -a -e dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/
$# perf stat -a -e dwc_rootport_13018/Rx_PCIe_TLP_Data_Payload/
The average RX/TX bandwidth can be calculated using the following formula:
@ -88,7 +88,7 @@ Lane Event Usage
Each lane has the same event set and to avoid generating a list of hundreds
of events, the user need to specify the lane ID explicitly, e.g.::
$# perf stat -a -e dwc_rootport_3018/rx_memory_read,lane=4/
$# perf stat -a -e dwc_rootport_13018/rx_memory_read,lane=4/
The driver does not support sampling, therefore "perf record" will not
work. Per-task (without "-a") perf sessions are not supported.

View File

@ -28,7 +28,9 @@ The "identifier" sysfs file allows users to identify the version of the
PMU hardware device.
The "bus" sysfs file allows users to get the bus number of Root Ports
monitored by PMU.
monitored by PMU. Furthermore users can get the Root Ports range in
[bdf_min, bdf_max] from "bdf_min" and "bdf_max" sysfs attributes
respectively.
Example usage of perf::

View File

@ -16,6 +16,7 @@ Performance monitor support
starfive_starlink_pmu
arm-ccn
arm-cmn
arm-ni
xgene-pmu
arm_dsu_pmu
thunderx2-pmu

View File

@ -365,6 +365,8 @@ HWCAP2_SME_SF8DP2
HWCAP2_SME_SF8DP4
Functionality implied by ID_AA64SMFR0_EL1.SF8DP4 == 0b1.
HWCAP2_POE
Functionality implied by ID_AA64MMFR3_EL1.S1POE == 0b0001.
4. Unused AT_HWCAP bits
-----------------------

View File

@ -55,6 +55,8 @@ stable kernels.
+----------------+-----------------+-----------------+-----------------------------+
| Ampere | AmpereOne | AC03_CPU_38 | AMPERE_ERRATUM_AC03_CPU_38 |
+----------------+-----------------+-----------------+-----------------------------+
| Ampere | AmpereOne AC04 | AC04_CPU_10 | AMPERE_ERRATUM_AC03_CPU_38 |
+----------------+-----------------+-----------------+-----------------------------+
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A510 | #2457168 | ARM64_ERRATUM_2457168 |
+----------------+-----------------+-----------------+-----------------------------+
@ -249,8 +251,8 @@ stable kernels.
+----------------+-----------------+-----------------+-----------------------------+
| Hisilicon | Hip08 SMMU PMCG | #162001800 | N/A |
+----------------+-----------------+-----------------+-----------------------------+
| Hisilicon | Hip08 SMMU PMCG | #162001900 | N/A |
| | Hip09 SMMU PMCG | | |
| Hisilicon | Hip{08,09,10,10C| #162001900 | N/A |
| | ,11} SMMU PMCG | | |
+----------------+-----------------+-----------------+-----------------------------+
+----------------+-----------------+-----------------+-----------------------------+
| Qualcomm Tech. | Kryo/Falkor v1 | E1003 | QCOM_FALKOR_ERRATUM_1003 |

View File

@ -16,6 +16,7 @@ properties:
- arm,cmn-600
- arm,cmn-650
- arm,cmn-700
- arm,cmn-s3
- arm,ci-700
reg:

View File

@ -0,0 +1,30 @@
# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
%YAML 1.2
---
$id: http://devicetree.org/schemas/perf/arm,ni.yaml#
$schema: http://devicetree.org/meta-schemas/core.yaml#
title: Arm NI (Network-on-Chip Interconnect) Performance Monitors
maintainers:
- Robin Murphy <robin.murphy@arm.com>
properties:
compatible:
const: arm,ni-700
reg:
items:
- description: Complete configuration register space
interrupts:
minItems: 1
maxItems: 32
description: Overflow interrupts, one per clock domain, in order of domain ID
required:
- compatible
- reg
- interrupts
additionalProperties: false

View File

@ -44,3 +44,101 @@ Provides a discovery mechanism for other KVM/arm64 hypercalls.
----------------------------------------
See ptp_kvm.rst
``ARM_SMCCC_KVM_FUNC_HYP_MEMINFO``
----------------------------------
Query the memory protection parameters for a pKVM protected virtual machine.
+---------------------+-------------------------------------------------------------+
| Presence: | Optional; pKVM protected guests only. |
+---------------------+-------------------------------------------------------------+
| Calling convention: | HVC64 |
+---------------------+----------+--------------------------------------------------+
| Function ID: | (uint32) | 0xC6000002 |
+---------------------+----------+----+---------------------------------------------+
| Arguments: | (uint64) | R1 | Reserved / Must be zero |
| +----------+----+---------------------------------------------+
| | (uint64) | R2 | Reserved / Must be zero |
| +----------+----+---------------------------------------------+
| | (uint64) | R3 | Reserved / Must be zero |
+---------------------+----------+----+---------------------------------------------+
| Return Values: | (int64) | R0 | ``INVALID_PARAMETER (-3)`` on error, else |
| | | | memory protection granule in bytes |
+---------------------+----------+----+---------------------------------------------+
``ARM_SMCCC_KVM_FUNC_MEM_SHARE``
--------------------------------
Share a region of memory with the KVM host, granting it read, write and execute
permissions. The size of the region is equal to the memory protection granule
advertised by ``ARM_SMCCC_KVM_FUNC_HYP_MEMINFO``.
+---------------------+-------------------------------------------------------------+
| Presence: | Optional; pKVM protected guests only. |
+---------------------+-------------------------------------------------------------+
| Calling convention: | HVC64 |
+---------------------+----------+--------------------------------------------------+
| Function ID: | (uint32) | 0xC6000003 |
+---------------------+----------+----+---------------------------------------------+
| Arguments: | (uint64) | R1 | Base IPA of memory region to share |
| +----------+----+---------------------------------------------+
| | (uint64) | R2 | Reserved / Must be zero |
| +----------+----+---------------------------------------------+
| | (uint64) | R3 | Reserved / Must be zero |
+---------------------+----------+----+---------------------------------------------+
| Return Values: | (int64) | R0 | ``SUCCESS (0)`` |
| | | +---------------------------------------------+
| | | | ``INVALID_PARAMETER (-3)`` |
+---------------------+----------+----+---------------------------------------------+
``ARM_SMCCC_KVM_FUNC_MEM_UNSHARE``
----------------------------------
Revoke access permission from the KVM host to a memory region previously shared
with ``ARM_SMCCC_KVM_FUNC_MEM_SHARE``. The size of the region is equal to the
memory protection granule advertised by ``ARM_SMCCC_KVM_FUNC_HYP_MEMINFO``.
+---------------------+-------------------------------------------------------------+
| Presence: | Optional; pKVM protected guests only. |
+---------------------+-------------------------------------------------------------+
| Calling convention: | HVC64 |
+---------------------+----------+--------------------------------------------------+
| Function ID: | (uint32) | 0xC6000004 |
+---------------------+----------+----+---------------------------------------------+
| Arguments: | (uint64) | R1 | Base IPA of memory region to unshare |
| +----------+----+---------------------------------------------+
| | (uint64) | R2 | Reserved / Must be zero |
| +----------+----+---------------------------------------------+
| | (uint64) | R3 | Reserved / Must be zero |
+---------------------+----------+----+---------------------------------------------+
| Return Values: | (int64) | R0 | ``SUCCESS (0)`` |
| | | +---------------------------------------------+
| | | | ``INVALID_PARAMETER (-3)`` |
+---------------------+----------+----+---------------------------------------------+
``ARM_SMCCC_KVM_FUNC_MMIO_GUARD``
----------------------------------
Request that a given memory region is handled as MMIO by the hypervisor,
allowing accesses to this region to be emulated by the KVM host. The size of the
region is equal to the memory protection granule advertised by
``ARM_SMCCC_KVM_FUNC_HYP_MEMINFO``.
+---------------------+-------------------------------------------------------------+
| Presence: | Optional; pKVM protected guests only. |
+---------------------+-------------------------------------------------------------+
| Calling convention: | HVC64 |
+---------------------+----------+--------------------------------------------------+
| Function ID: | (uint32) | 0xC6000007 |
+---------------------+----------+----+---------------------------------------------+
| Arguments: | (uint64) | R1 | Base IPA of MMIO memory region |
| +----------+----+---------------------------------------------+
| | (uint64) | R2 | Reserved / Must be zero |
| +----------+----+---------------------------------------------+
| | (uint64) | R3 | Reserved / Must be zero |
+---------------------+----------+----+---------------------------------------------+
| Return Values: | (int64) | R0 | ``SUCCESS (0)`` |
| | | +---------------------------------------------+
| | | | ``INVALID_PARAMETER (-3)`` |
+---------------------+----------+----+---------------------------------------------+

View File

@ -334,6 +334,7 @@ L: linux-acpi@vger.kernel.org
L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
S: Maintained
F: drivers/acpi/arm64
F: include/linux/acpi_iort.h
ACPI FOR RISC-V (ACPI/riscv)
M: Sunil V L <sunilvl@ventanamicro.com>
@ -1752,6 +1753,17 @@ F: drivers/mtd/maps/physmap-versatile.*
F: drivers/power/reset/arm-versatile-reboot.c
F: drivers/soc/versatile/
ARM INTERCONNECT PMU DRIVERS
M: Robin Murphy <robin.murphy@arm.com>
S: Supported
F: Documentation/admin-guide/perf/arm-cmn.rst
F: Documentation/admin-guide/perf/arm-ni.rst
F: Documentation/devicetree/bindings/perf/arm,cmn.yaml
F: Documentation/devicetree/bindings/perf/arm,ni.yaml
F: drivers/perf/arm-cmn.c
F: drivers/perf/arm-ni.c
F: tools/perf/pmu-events/arch/arm64/arm/cmn/
ARM KOMEDA DRM-KMS DRIVER
M: Liviu Dudau <liviu.dudau@arm.com>
S: Supported

View File

@ -127,6 +127,12 @@ static inline u32 read_pmuver(void)
return (dfr0 >> 24) & 0xf;
}
static inline bool pmuv3_has_icntr(void)
{
/* FEAT_PMUv3_ICNTR not accessible for 32-bit */
return false;
}
static inline void write_pmcr(u32 val)
{
write_sysreg(val, PMCR);
@ -152,6 +158,13 @@ static inline u64 read_pmccntr(void)
return read_sysreg(PMCCNTR);
}
static inline void write_pmicntr(u64 val) {}
static inline u64 read_pmicntr(void)
{
return 0;
}
static inline void write_pmcntenset(u32 val)
{
write_sysreg(val, PMCNTENSET);
@ -177,6 +190,13 @@ static inline void write_pmccfiltr(u32 val)
write_sysreg(val, PMCCFILTR);
}
static inline void write_pmicfiltr(u64 val) {}
static inline u64 read_pmicfiltr(void)
{
return 0;
}
static inline void write_pmovsclr(u32 val)
{
write_sysreg(val, PMOVSR);

View File

@ -7,4 +7,6 @@
void kvm_init_hyp_services(void);
bool kvm_arm_hyp_service_available(u32 func_id);
static inline void kvm_arch_init_hyp_services(void) { };
#endif

View File

@ -34,6 +34,7 @@ config ARM64
select ARCH_HAS_KERNEL_FPU_SUPPORT if KERNEL_MODE_NEON
select ARCH_HAS_KEEPINITRD
select ARCH_HAS_MEMBARRIER_SYNC_CORE
select ARCH_HAS_MEM_ENCRYPT
select ARCH_HAS_NMI_SAFE_THIS_CPU_OPS
select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
select ARCH_HAS_PTE_DEVMAP
@ -423,7 +424,7 @@ config AMPERE_ERRATUM_AC03_CPU_38
default y
help
This option adds an alternative code sequence to work around Ampere
erratum AC03_CPU_38 on AmpereOne.
errata AC03_CPU_38 and AC04_CPU_10 on AmpereOne.
The affected design reports FEAT_HAFDBS as not implemented in
ID_AA64MMFR1_EL1.HAFDBS, but (V)TCR_ELx.{HA,HD} are not RES0
@ -2137,6 +2138,29 @@ config ARM64_EPAN
if the cpu does not implement the feature.
endmenu # "ARMv8.7 architectural features"
menu "ARMv8.9 architectural features"
config ARM64_POE
prompt "Permission Overlay Extension"
def_bool y
select ARCH_USES_HIGH_VMA_FLAGS
select ARCH_HAS_PKEYS
help
The Permission Overlay Extension is used to implement Memory
Protection Keys. Memory Protection Keys provides a mechanism for
enforcing page-based protections, but without requiring modification
of the page tables when an application changes protection domains.
For details, see Documentation/core-api/protection-keys.rst
If unsure, say y.
config ARCH_PKEY_BITS
int
default 3
endmenu # "ARMv8.9 architectural features"
config ARM64_SVE
bool "ARM Scalable Vector Extension support"
default y

View File

@ -33,6 +33,14 @@ static inline void write_pmevtypern(int n, unsigned long val)
PMEVN_SWITCH(n, WRITE_PMEVTYPERN);
}
#define RETURN_READ_PMEVTYPERN(n) \
return read_sysreg(pmevtyper##n##_el0)
static inline unsigned long read_pmevtypern(int n)
{
PMEVN_SWITCH(n, RETURN_READ_PMEVTYPERN);
return 0;
}
static inline unsigned long read_pmmir(void)
{
return read_cpuid(PMMIR_EL1);
@ -46,6 +54,14 @@ static inline u32 read_pmuver(void)
ID_AA64DFR0_EL1_PMUVer_SHIFT);
}
static inline bool pmuv3_has_icntr(void)
{
u64 dfr1 = read_sysreg(id_aa64dfr1_el1);
return !!cpuid_feature_extract_unsigned_field(dfr1,
ID_AA64DFR1_EL1_PMICNTR_SHIFT);
}
static inline void write_pmcr(u64 val)
{
write_sysreg(val, pmcr_el0);
@ -71,22 +87,32 @@ static inline u64 read_pmccntr(void)
return read_sysreg(pmccntr_el0);
}
static inline void write_pmcntenset(u32 val)
static inline void write_pmicntr(u64 val)
{
write_sysreg_s(val, SYS_PMICNTR_EL0);
}
static inline u64 read_pmicntr(void)
{
return read_sysreg_s(SYS_PMICNTR_EL0);
}
static inline void write_pmcntenset(u64 val)
{
write_sysreg(val, pmcntenset_el0);
}
static inline void write_pmcntenclr(u32 val)
static inline void write_pmcntenclr(u64 val)
{
write_sysreg(val, pmcntenclr_el0);
}
static inline void write_pmintenset(u32 val)
static inline void write_pmintenset(u64 val)
{
write_sysreg(val, pmintenset_el1);
}
static inline void write_pmintenclr(u32 val)
static inline void write_pmintenclr(u64 val)
{
write_sysreg(val, pmintenclr_el1);
}
@ -96,12 +122,27 @@ static inline void write_pmccfiltr(u64 val)
write_sysreg(val, pmccfiltr_el0);
}
static inline void write_pmovsclr(u32 val)
static inline u64 read_pmccfiltr(void)
{
return read_sysreg(pmccfiltr_el0);
}
static inline void write_pmicfiltr(u64 val)
{
write_sysreg_s(val, SYS_PMICFILTR_EL0);
}
static inline u64 read_pmicfiltr(void)
{
return read_sysreg_s(SYS_PMICFILTR_EL0);
}
static inline void write_pmovsclr(u64 val)
{
write_sysreg(val, pmovsclr_el0);
}
static inline u32 read_pmovsclr(void)
static inline u64 read_pmovsclr(void)
{
return read_sysreg(pmovsclr_el0);
}

View File

@ -832,6 +832,12 @@ static inline bool system_supports_lpa2(void)
return cpus_have_final_cap(ARM64_HAS_LPA2);
}
static inline bool system_supports_poe(void)
{
return IS_ENABLED(CONFIG_ARM64_POE) &&
alternative_has_cap_unlikely(ARM64_HAS_S1POE);
}
int do_emulate_mrs(struct pt_regs *regs, u32 sys_reg, u32 rt);
bool try_emulate_mrs(struct pt_regs *regs, u32 isn);

View File

@ -143,6 +143,7 @@
#define APPLE_CPU_PART_M2_AVALANCHE_MAX 0x039
#define AMPERE_CPU_PART_AMPERE1 0xAC3
#define AMPERE_CPU_PART_AMPERE1A 0xAC4
#define MICROSOFT_CPU_PART_AZURE_COBALT_100 0xD49 /* Based on r0p0 of ARM Neoverse N2 */
@ -212,6 +213,7 @@
#define MIDR_APPLE_M2_BLIZZARD_MAX MIDR_CPU_MODEL(ARM_CPU_IMP_APPLE, APPLE_CPU_PART_M2_BLIZZARD_MAX)
#define MIDR_APPLE_M2_AVALANCHE_MAX MIDR_CPU_MODEL(ARM_CPU_IMP_APPLE, APPLE_CPU_PART_M2_AVALANCHE_MAX)
#define MIDR_AMPERE1 MIDR_CPU_MODEL(ARM_CPU_IMP_AMPERE, AMPERE_CPU_PART_AMPERE1)
#define MIDR_AMPERE1A MIDR_CPU_MODEL(ARM_CPU_IMP_AMPERE, AMPERE_CPU_PART_AMPERE1A)
#define MIDR_MICROSOFT_AZURE_COBALT_100 MIDR_CPU_MODEL(ARM_CPU_IMP_MICROSOFT, MICROSOFT_CPU_PART_AZURE_COBALT_100)
/* Fujitsu Erratum 010001 affects A64FX 1.0 and 1.1, (v0r0 and v1r0) */

View File

@ -165,42 +165,53 @@
mrs x1, id_aa64dfr0_el1
ubfx x1, x1, #ID_AA64DFR0_EL1_PMSVer_SHIFT, #4
cmp x1, #3
b.lt .Lset_debug_fgt_\@
b.lt .Lskip_spe_fgt_\@
/* Disable PMSNEVFR_EL1 read and write traps */
orr x0, x0, #(1 << 62)
.Lset_debug_fgt_\@:
.Lskip_spe_fgt_\@:
msr_s SYS_HDFGRTR_EL2, x0
msr_s SYS_HDFGWTR_EL2, x0
mov x0, xzr
mrs x1, id_aa64pfr1_el1
ubfx x1, x1, #ID_AA64PFR1_EL1_SME_SHIFT, #4
cbz x1, .Lset_pie_fgt_\@
cbz x1, .Lskip_debug_fgt_\@
/* Disable nVHE traps of TPIDR2 and SMPRI */
orr x0, x0, #HFGxTR_EL2_nSMPRI_EL1_MASK
orr x0, x0, #HFGxTR_EL2_nTPIDR2_EL0_MASK
.Lset_pie_fgt_\@:
.Lskip_debug_fgt_\@:
mrs_s x1, SYS_ID_AA64MMFR3_EL1
ubfx x1, x1, #ID_AA64MMFR3_EL1_S1PIE_SHIFT, #4
cbz x1, .Lset_fgt_\@
cbz x1, .Lskip_pie_fgt_\@
/* Disable trapping of PIR_EL1 / PIRE0_EL1 */
orr x0, x0, #HFGxTR_EL2_nPIR_EL1
orr x0, x0, #HFGxTR_EL2_nPIRE0_EL1
.Lset_fgt_\@:
.Lskip_pie_fgt_\@:
mrs_s x1, SYS_ID_AA64MMFR3_EL1
ubfx x1, x1, #ID_AA64MMFR3_EL1_S1POE_SHIFT, #4
cbz x1, .Lskip_poe_fgt_\@
/* Disable trapping of POR_EL0 */
orr x0, x0, #HFGxTR_EL2_nPOR_EL0
.Lskip_poe_fgt_\@:
msr_s SYS_HFGRTR_EL2, x0
msr_s SYS_HFGWTR_EL2, x0
msr_s SYS_HFGITR_EL2, xzr
mrs x1, id_aa64pfr0_el1 // AMU traps UNDEF without AMU
ubfx x1, x1, #ID_AA64PFR0_EL1_AMU_SHIFT, #4
cbz x1, .Lskip_fgt_\@
cbz x1, .Lskip_amu_fgt_\@
msr_s SYS_HAFGRTR_EL2, xzr
.Lskip_amu_fgt_\@:
.Lskip_fgt_\@:
.endm

View File

@ -10,63 +10,63 @@
#include <asm/memory.h>
#include <asm/sysreg.h>
#define ESR_ELx_EC_UNKNOWN (0x00)
#define ESR_ELx_EC_WFx (0x01)
#define ESR_ELx_EC_UNKNOWN UL(0x00)
#define ESR_ELx_EC_WFx UL(0x01)
/* Unallocated EC: 0x02 */
#define ESR_ELx_EC_CP15_32 (0x03)
#define ESR_ELx_EC_CP15_64 (0x04)
#define ESR_ELx_EC_CP14_MR (0x05)
#define ESR_ELx_EC_CP14_LS (0x06)
#define ESR_ELx_EC_FP_ASIMD (0x07)
#define ESR_ELx_EC_CP10_ID (0x08) /* EL2 only */
#define ESR_ELx_EC_PAC (0x09) /* EL2 and above */
#define ESR_ELx_EC_CP15_32 UL(0x03)
#define ESR_ELx_EC_CP15_64 UL(0x04)
#define ESR_ELx_EC_CP14_MR UL(0x05)
#define ESR_ELx_EC_CP14_LS UL(0x06)
#define ESR_ELx_EC_FP_ASIMD UL(0x07)
#define ESR_ELx_EC_CP10_ID UL(0x08) /* EL2 only */
#define ESR_ELx_EC_PAC UL(0x09) /* EL2 and above */
/* Unallocated EC: 0x0A - 0x0B */
#define ESR_ELx_EC_CP14_64 (0x0C)
#define ESR_ELx_EC_BTI (0x0D)
#define ESR_ELx_EC_ILL (0x0E)
#define ESR_ELx_EC_CP14_64 UL(0x0C)
#define ESR_ELx_EC_BTI UL(0x0D)
#define ESR_ELx_EC_ILL UL(0x0E)
/* Unallocated EC: 0x0F - 0x10 */
#define ESR_ELx_EC_SVC32 (0x11)
#define ESR_ELx_EC_HVC32 (0x12) /* EL2 only */
#define ESR_ELx_EC_SMC32 (0x13) /* EL2 and above */
#define ESR_ELx_EC_SVC32 UL(0x11)
#define ESR_ELx_EC_HVC32 UL(0x12) /* EL2 only */
#define ESR_ELx_EC_SMC32 UL(0x13) /* EL2 and above */
/* Unallocated EC: 0x14 */
#define ESR_ELx_EC_SVC64 (0x15)
#define ESR_ELx_EC_HVC64 (0x16) /* EL2 and above */
#define ESR_ELx_EC_SMC64 (0x17) /* EL2 and above */
#define ESR_ELx_EC_SYS64 (0x18)
#define ESR_ELx_EC_SVE (0x19)
#define ESR_ELx_EC_ERET (0x1a) /* EL2 only */
#define ESR_ELx_EC_SVC64 UL(0x15)
#define ESR_ELx_EC_HVC64 UL(0x16) /* EL2 and above */
#define ESR_ELx_EC_SMC64 UL(0x17) /* EL2 and above */
#define ESR_ELx_EC_SYS64 UL(0x18)
#define ESR_ELx_EC_SVE UL(0x19)
#define ESR_ELx_EC_ERET UL(0x1a) /* EL2 only */
/* Unallocated EC: 0x1B */
#define ESR_ELx_EC_FPAC (0x1C) /* EL1 and above */
#define ESR_ELx_EC_SME (0x1D)
#define ESR_ELx_EC_FPAC UL(0x1C) /* EL1 and above */
#define ESR_ELx_EC_SME UL(0x1D)
/* Unallocated EC: 0x1E */
#define ESR_ELx_EC_IMP_DEF (0x1f) /* EL3 only */
#define ESR_ELx_EC_IABT_LOW (0x20)
#define ESR_ELx_EC_IABT_CUR (0x21)
#define ESR_ELx_EC_PC_ALIGN (0x22)
#define ESR_ELx_EC_IMP_DEF UL(0x1f) /* EL3 only */
#define ESR_ELx_EC_IABT_LOW UL(0x20)
#define ESR_ELx_EC_IABT_CUR UL(0x21)
#define ESR_ELx_EC_PC_ALIGN UL(0x22)
/* Unallocated EC: 0x23 */
#define ESR_ELx_EC_DABT_LOW (0x24)
#define ESR_ELx_EC_DABT_CUR (0x25)
#define ESR_ELx_EC_SP_ALIGN (0x26)
#define ESR_ELx_EC_MOPS (0x27)
#define ESR_ELx_EC_FP_EXC32 (0x28)
#define ESR_ELx_EC_DABT_LOW UL(0x24)
#define ESR_ELx_EC_DABT_CUR UL(0x25)
#define ESR_ELx_EC_SP_ALIGN UL(0x26)
#define ESR_ELx_EC_MOPS UL(0x27)
#define ESR_ELx_EC_FP_EXC32 UL(0x28)
/* Unallocated EC: 0x29 - 0x2B */
#define ESR_ELx_EC_FP_EXC64 (0x2C)
#define ESR_ELx_EC_FP_EXC64 UL(0x2C)
/* Unallocated EC: 0x2D - 0x2E */
#define ESR_ELx_EC_SERROR (0x2F)
#define ESR_ELx_EC_BREAKPT_LOW (0x30)
#define ESR_ELx_EC_BREAKPT_CUR (0x31)
#define ESR_ELx_EC_SOFTSTP_LOW (0x32)
#define ESR_ELx_EC_SOFTSTP_CUR (0x33)
#define ESR_ELx_EC_WATCHPT_LOW (0x34)
#define ESR_ELx_EC_WATCHPT_CUR (0x35)
#define ESR_ELx_EC_SERROR UL(0x2F)
#define ESR_ELx_EC_BREAKPT_LOW UL(0x30)
#define ESR_ELx_EC_BREAKPT_CUR UL(0x31)
#define ESR_ELx_EC_SOFTSTP_LOW UL(0x32)
#define ESR_ELx_EC_SOFTSTP_CUR UL(0x33)
#define ESR_ELx_EC_WATCHPT_LOW UL(0x34)
#define ESR_ELx_EC_WATCHPT_CUR UL(0x35)
/* Unallocated EC: 0x36 - 0x37 */
#define ESR_ELx_EC_BKPT32 (0x38)
#define ESR_ELx_EC_BKPT32 UL(0x38)
/* Unallocated EC: 0x39 */
#define ESR_ELx_EC_VECTOR32 (0x3A) /* EL2 only */
#define ESR_ELx_EC_VECTOR32 UL(0x3A) /* EL2 only */
/* Unallocated EC: 0x3B */
#define ESR_ELx_EC_BRK64 (0x3C)
#define ESR_ELx_EC_BRK64 UL(0x3C)
/* Unallocated EC: 0x3D - 0x3F */
#define ESR_ELx_EC_MAX (0x3F)
#define ESR_ELx_EC_MAX UL(0x3F)
#define ESR_ELx_EC_SHIFT (26)
#define ESR_ELx_EC_WIDTH (6)

View File

@ -155,8 +155,6 @@ extern void cpu_enable_sme2(const struct arm64_cpu_capabilities *__unused);
extern void cpu_enable_fa64(const struct arm64_cpu_capabilities *__unused);
extern void cpu_enable_fpmr(const struct arm64_cpu_capabilities *__unused);
extern u64 read_smcr_features(void);
/*
* Helpers to translate bit indices in sve_vq_map to VQ values (and
* vice versa). This allows find_next_bit() to be used to find the

View File

@ -157,6 +157,7 @@
#define KERNEL_HWCAP_SME_SF8FMA __khwcap2_feature(SME_SF8FMA)
#define KERNEL_HWCAP_SME_SF8DP4 __khwcap2_feature(SME_SF8DP4)
#define KERNEL_HWCAP_SME_SF8DP2 __khwcap2_feature(SME_SF8DP2)
#define KERNEL_HWCAP_POE __khwcap2_feature(POE)
/*
* This yields a mask that user programs can use to figure out what

View File

@ -7,4 +7,15 @@
void kvm_init_hyp_services(void);
bool kvm_arm_hyp_service_available(u32 func_id);
#ifdef CONFIG_ARM_PKVM_GUEST
void pkvm_init_hyp_services(void);
#else
static inline void pkvm_init_hyp_services(void) { };
#endif
static inline void kvm_arch_init_hyp_services(void)
{
pkvm_init_hyp_services();
};
#endif

View File

@ -271,6 +271,10 @@ __iowrite64_copy(void __iomem *to, const void *from, size_t count)
* I/O memory mapping functions.
*/
typedef int (*ioremap_prot_hook_t)(phys_addr_t phys_addr, size_t size,
pgprot_t *prot);
int arm64_ioremap_prot_hook_register(const ioremap_prot_hook_t hook);
#define ioremap_prot ioremap_prot
#define _PAGE_IOREMAP PROT_DEVICE_nGnRE

View File

@ -10,6 +10,7 @@
#include <asm/hyp_image.h>
#include <asm/insn.h>
#include <asm/virt.h>
#include <asm/sysreg.h>
#define ARM_EXIT_WITH_SERROR_BIT 31
#define ARM_EXCEPTION_CODE(x) ((x) & ~(1U << ARM_EXIT_WITH_SERROR_BIT))
@ -259,7 +260,7 @@ extern u64 __kvm_get_mdcr_el2(void);
asm volatile( \
" mrs %1, spsr_el2\n" \
" mrs %2, elr_el2\n" \
"1: at "at_op", %3\n" \
"1: " __msr_s(at_op, "%3") "\n" \
" isb\n" \
" b 9f\n" \
"2: msr spsr_el2, %1\n" \

View File

@ -446,6 +446,8 @@ enum vcpu_sysreg {
GCR_EL1, /* Tag Control Register */
TFSRE0_EL1, /* Tag Fault Status Register (EL0) */
POR_EL0, /* Permission Overlay Register 0 (EL0) */
/* 32bit specific registers. */
DACR32_EL2, /* Domain Access Control Register */
IFSR32_EL2, /* Instruction Fault Status Register */
@ -517,6 +519,8 @@ enum vcpu_sysreg {
VNCR(PIR_EL1), /* Permission Indirection Register 1 (EL1) */
VNCR(PIRE0_EL1), /* Permission Indirection Register 0 (EL1) */
VNCR(POR_EL1), /* Permission Overlay Register 1 (EL1) */
VNCR(HFGRTR_EL2),
VNCR(HFGWTR_EL2),
VNCR(HFGITR_EL2),
@ -1330,12 +1334,12 @@ void kvm_arch_vcpu_load_debug_state_flags(struct kvm_vcpu *vcpu);
void kvm_arch_vcpu_put_debug_state_flags(struct kvm_vcpu *vcpu);
#ifdef CONFIG_KVM
void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr);
void kvm_clr_pmu_events(u32 clr);
void kvm_set_pmu_events(u64 set, struct perf_event_attr *attr);
void kvm_clr_pmu_events(u64 clr);
bool kvm_set_pmuserenr(u64 val);
#else
static inline void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr) {}
static inline void kvm_clr_pmu_events(u32 clr) {}
static inline void kvm_set_pmu_events(u64 set, struct perf_event_attr *attr) {}
static inline void kvm_clr_pmu_events(u64 clr) {}
static inline bool kvm_set_pmuserenr(u64 val)
{
return false;

View File

@ -0,0 +1,15 @@
/* SPDX-License-Identifier: GPL-2.0-only */
#ifndef __ASM_MEM_ENCRYPT_H
#define __ASM_MEM_ENCRYPT_H
struct arm64_mem_crypt_ops {
int (*encrypt)(unsigned long addr, int numpages);
int (*decrypt)(unsigned long addr, int numpages);
};
int arm64_mem_crypt_ops_register(const struct arm64_mem_crypt_ops *ops);
int set_memory_encrypted(unsigned long addr, int numpages);
int set_memory_decrypted(unsigned long addr, int numpages);
#endif /* __ASM_MEM_ENCRYPT_H */

View File

@ -110,6 +110,8 @@
#define PAGE_END (_PAGE_END(VA_BITS_MIN))
#endif /* CONFIG_KASAN */
#define PHYSMEM_END __pa(PAGE_END - 1)
#define MIN_THREAD_SHIFT (14 + KASAN_THREAD_SHIFT)
/*

View File

@ -7,7 +7,7 @@
#include <uapi/asm/mman.h>
static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
unsigned long pkey __always_unused)
unsigned long pkey)
{
unsigned long ret = 0;
@ -17,6 +17,14 @@ static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
if (system_supports_mte() && (prot & PROT_MTE))
ret |= VM_MTE;
#ifdef CONFIG_ARCH_HAS_PKEYS
if (system_supports_poe()) {
ret |= pkey & BIT(0) ? VM_PKEY_BIT0 : 0;
ret |= pkey & BIT(1) ? VM_PKEY_BIT1 : 0;
ret |= pkey & BIT(2) ? VM_PKEY_BIT2 : 0;
}
#endif
return ret;
}
#define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)

View File

@ -25,6 +25,7 @@ typedef struct {
refcount_t pinned;
void *vdso;
unsigned long flags;
u8 pkey_allocation_map;
} mm_context_t;
/*
@ -63,7 +64,6 @@ static inline bool arm64_kernel_unmapped_at_el0(void)
extern void arm64_memblock_init(void);
extern void paging_init(void);
extern void bootmem_init(void);
extern void __iomem *early_io_map(phys_addr_t phys, unsigned long virt);
extern void create_mapping_noalloc(phys_addr_t phys, unsigned long virt,
phys_addr_t size, pgprot_t prot);
extern void create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,

View File

@ -15,12 +15,12 @@
#include <linux/sched/hotplug.h>
#include <linux/mm_types.h>
#include <linux/pgtable.h>
#include <linux/pkeys.h>
#include <asm/cacheflush.h>
#include <asm/cpufeature.h>
#include <asm/daifflags.h>
#include <asm/proc-fns.h>
#include <asm-generic/mm_hooks.h>
#include <asm/cputype.h>
#include <asm/sysreg.h>
#include <asm/tlbflush.h>
@ -175,9 +175,36 @@ init_new_context(struct task_struct *tsk, struct mm_struct *mm)
{
atomic64_set(&mm->context.id, 0);
refcount_set(&mm->context.pinned, 0);
/* pkey 0 is the default, so always reserve it. */
mm->context.pkey_allocation_map = BIT(0);
return 0;
}
static inline void arch_dup_pkeys(struct mm_struct *oldmm,
struct mm_struct *mm)
{
/* Duplicate the oldmm pkey state in mm: */
mm->context.pkey_allocation_map = oldmm->context.pkey_allocation_map;
}
static inline int arch_dup_mmap(struct mm_struct *oldmm, struct mm_struct *mm)
{
arch_dup_pkeys(oldmm, mm);
return 0;
}
static inline void arch_exit_mmap(struct mm_struct *mm)
{
}
static inline void arch_unmap(struct mm_struct *mm,
unsigned long start, unsigned long end)
{
}
#ifdef CONFIG_ARM64_SW_TTBR0_PAN
static inline void update_saved_ttbr0(struct task_struct *tsk,
struct mm_struct *mm)
@ -267,6 +294,23 @@ static inline unsigned long mm_untag_mask(struct mm_struct *mm)
return -1UL >> 8;
}
/*
* Only enforce protection keys on the current process, because there is no
* user context to access POR_EL0 for another address space.
*/
static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
bool write, bool execute, bool foreign)
{
if (!system_supports_poe())
return true;
/* allow access if the VMA is not one from this process */
if (foreign || vma_is_foreign(vma))
return true;
return por_el0_allows_pkey(vma_pkey(vma), write, execute);
}
#include <asm-generic/mmu_context.h>
#endif /* !__ASSEMBLY__ */

View File

@ -135,7 +135,6 @@
/*
* Section
*/
#define PMD_SECT_VALID (_AT(pmdval_t, 1) << 0)
#define PMD_SECT_USER (_AT(pmdval_t, 1) << 6) /* AP[1] */
#define PMD_SECT_RDONLY (_AT(pmdval_t, 1) << 7) /* AP[2] */
#define PMD_SECT_S (_AT(pmdval_t, 3) << 8)
@ -199,6 +198,16 @@
#define PTE_PI_IDX_2 53 /* PXN */
#define PTE_PI_IDX_3 54 /* UXN */
/*
* POIndex[2:0] encoding (Permission Overlay Extension)
*/
#define PTE_PO_IDX_0 (_AT(pteval_t, 1) << 60)
#define PTE_PO_IDX_1 (_AT(pteval_t, 1) << 61)
#define PTE_PO_IDX_2 (_AT(pteval_t, 1) << 62)
#define PTE_PO_IDX_MASK GENMASK_ULL(62, 60)
/*
* Memory Attribute override for Stage-2 (MemAttr[3:0])
*/

View File

@ -154,10 +154,10 @@ static inline bool __pure lpa2_is_enabled(void)
#define PIE_E0 ( \
PIRx_ELx_PERM(pte_pi_index(_PAGE_EXECONLY), PIE_X_O) | \
PIRx_ELx_PERM(pte_pi_index(_PAGE_READONLY_EXEC), PIE_RX) | \
PIRx_ELx_PERM(pte_pi_index(_PAGE_SHARED_EXEC), PIE_RWX) | \
PIRx_ELx_PERM(pte_pi_index(_PAGE_READONLY), PIE_R) | \
PIRx_ELx_PERM(pte_pi_index(_PAGE_SHARED), PIE_RW))
PIRx_ELx_PERM(pte_pi_index(_PAGE_READONLY_EXEC), PIE_RX_O) | \
PIRx_ELx_PERM(pte_pi_index(_PAGE_SHARED_EXEC), PIE_RWX_O) | \
PIRx_ELx_PERM(pte_pi_index(_PAGE_READONLY), PIE_R_O) | \
PIRx_ELx_PERM(pte_pi_index(_PAGE_SHARED), PIE_RW_O))
#define PIE_E1 ( \
PIRx_ELx_PERM(pte_pi_index(_PAGE_EXECONLY), PIE_NONE_O) | \

View File

@ -34,6 +34,7 @@
#include <asm/cmpxchg.h>
#include <asm/fixmap.h>
#include <asm/por.h>
#include <linux/mmdebug.h>
#include <linux/mm_types.h>
#include <linux/sched.h>
@ -149,6 +150,24 @@ static inline pteval_t __phys_to_pte_val(phys_addr_t phys)
#define pte_accessible(mm, pte) \
(mm_tlb_flush_pending(mm) ? pte_present(pte) : pte_valid(pte))
static inline bool por_el0_allows_pkey(u8 pkey, bool write, bool execute)
{
u64 por;
if (!system_supports_poe())
return true;
por = read_sysreg_s(SYS_POR_EL0);
if (write)
return por_elx_allows_write(por, pkey);
if (execute)
return por_elx_allows_exec(por, pkey);
return por_elx_allows_read(por, pkey);
}
/*
* p??_access_permitted() is true for valid user mappings (PTE_USER
* bit set, subject to the write permission check). For execute-only
@ -156,8 +175,11 @@ static inline pteval_t __phys_to_pte_val(phys_addr_t phys)
* not set) must return false. PROT_NONE mappings do not have the
* PTE_VALID bit set.
*/
#define pte_access_permitted(pte, write) \
#define pte_access_permitted_no_overlay(pte, write) \
(((pte_val(pte) & (PTE_VALID | PTE_USER)) == (PTE_VALID | PTE_USER)) && (!(write) || pte_write(pte)))
#define pte_access_permitted(pte, write) \
(pte_access_permitted_no_overlay(pte, write) && \
por_el0_allows_pkey(FIELD_GET(PTE_PO_IDX_MASK, pte_val(pte)), write, false))
#define pmd_access_permitted(pmd, write) \
(pte_access_permitted(pmd_pte(pmd), (write)))
#define pud_access_permitted(pud, write) \
@ -373,10 +395,11 @@ static inline void __sync_cache_and_tags(pte_t pte, unsigned int nr_pages)
/*
* If the PTE would provide user space access to the tags associated
* with it then ensure that the MTE tags are synchronised. Although
* pte_access_permitted() returns false for exec only mappings, they
* don't expose tags (instruction fetches don't check tags).
* pte_access_permitted_no_overlay() returns false for exec only
* mappings, they don't expose tags (instruction fetches don't check
* tags).
*/
if (system_supports_mte() && pte_access_permitted(pte, false) &&
if (system_supports_mte() && pte_access_permitted_no_overlay(pte, false) &&
!pte_special(pte) && pte_tagged(pte))
mte_sync_tags(pte, nr_pages);
}
@ -1103,7 +1126,8 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
*/
const pteval_t mask = PTE_USER | PTE_PXN | PTE_UXN | PTE_RDONLY |
PTE_PRESENT_INVALID | PTE_VALID | PTE_WRITE |
PTE_GP | PTE_ATTRINDX_MASK;
PTE_GP | PTE_ATTRINDX_MASK | PTE_PO_IDX_MASK;
/* preserve the hardware dirty information */
if (pte_hw_dirty(pte))
pte = set_pte_bit(pte, __pgprot(PTE_DIRTY));

View File

@ -0,0 +1,106 @@
/* SPDX-License-Identifier: GPL-2.0 */
/*
* Copyright (C) 2023 Arm Ltd.
*
* Based on arch/x86/include/asm/pkeys.h
*/
#ifndef _ASM_ARM64_PKEYS_H
#define _ASM_ARM64_PKEYS_H
#define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2)
#define arch_max_pkey() 8
int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
unsigned long init_val);
static inline bool arch_pkeys_enabled(void)
{
return system_supports_poe();
}
static inline int vma_pkey(struct vm_area_struct *vma)
{
return (vma->vm_flags & ARCH_VM_PKEY_FLAGS) >> VM_PKEY_SHIFT;
}
static inline int arch_override_mprotect_pkey(struct vm_area_struct *vma,
int prot, int pkey)
{
if (pkey != -1)
return pkey;
return vma_pkey(vma);
}
static inline int execute_only_pkey(struct mm_struct *mm)
{
// Execute-only mappings are handled by EPAN/FEAT_PAN3.
return -1;
}
#define mm_pkey_allocation_map(mm) (mm)->context.pkey_allocation_map
#define mm_set_pkey_allocated(mm, pkey) do { \
mm_pkey_allocation_map(mm) |= (1U << pkey); \
} while (0)
#define mm_set_pkey_free(mm, pkey) do { \
mm_pkey_allocation_map(mm) &= ~(1U << pkey); \
} while (0)
static inline bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
{
/*
* "Allocated" pkeys are those that have been returned
* from pkey_alloc() or pkey 0 which is allocated
* implicitly when the mm is created.
*/
if (pkey < 0 || pkey >= arch_max_pkey())
return false;
return mm_pkey_allocation_map(mm) & (1U << pkey);
}
/*
* Returns a positive, 3-bit key on success, or -1 on failure.
*/
static inline int mm_pkey_alloc(struct mm_struct *mm)
{
/*
* Note: this is the one and only place we make sure
* that the pkey is valid as far as the hardware is
* concerned. The rest of the kernel trusts that
* only good, valid pkeys come out of here.
*/
u8 all_pkeys_mask = GENMASK(arch_max_pkey() - 1, 0);
int ret;
if (!arch_pkeys_enabled())
return -1;
/*
* Are we out of pkeys? We must handle this specially
* because ffz() behavior is undefined if there are no
* zeros.
*/
if (mm_pkey_allocation_map(mm) == all_pkeys_mask)
return -1;
ret = ffz(mm_pkey_allocation_map(mm));
mm_set_pkey_allocated(mm, ret);
return ret;
}
static inline int mm_pkey_free(struct mm_struct *mm, int pkey)
{
if (!mm_pkey_is_allocated(mm, pkey))
return -EINVAL;
mm_set_pkey_free(mm, pkey);
return 0;
}
#endif /* _ASM_ARM64_PKEYS_H */

View File

@ -0,0 +1,33 @@
/* SPDX-License-Identifier: GPL-2.0 */
/*
* Copyright (C) 2023 Arm Ltd.
*/
#ifndef _ASM_ARM64_POR_H
#define _ASM_ARM64_POR_H
#define POR_BITS_PER_PKEY 4
#define POR_ELx_IDX(por_elx, idx) (((por_elx) >> ((idx) * POR_BITS_PER_PKEY)) & 0xf)
static inline bool por_elx_allows_read(u64 por, u8 pkey)
{
u8 perm = POR_ELx_IDX(por, pkey);
return perm & POE_R;
}
static inline bool por_elx_allows_write(u64 por, u8 pkey)
{
u8 perm = POR_ELx_IDX(por, pkey);
return perm & POE_W;
}
static inline bool por_elx_allows_exec(u64 por, u8 pkey)
{
u8 perm = POR_ELx_IDX(por, pkey);
return perm & POE_X;
}
#endif /* _ASM_ARM64_POR_H */

View File

@ -184,6 +184,7 @@ struct thread_struct {
u64 sctlr_user;
u64 svcr;
u64 tpidr2_el0;
u64 por_el0;
};
static inline unsigned int thread_get_vl(struct thread_struct *thread,
@ -402,5 +403,10 @@ long get_tagged_addr_ctrl(struct task_struct *task);
#define GET_TAGGED_ADDR_CTRL() get_tagged_addr_ctrl(current)
#endif
int get_tsc_mode(unsigned long adr);
int set_tsc_mode(unsigned int val);
#define GET_TSC_CTL(adr) get_tsc_mode((adr))
#define SET_TSC_CTL(val) set_tsc_mode((val))
#endif /* __ASSEMBLY__ */
#endif /* __ASM_PROCESSOR_H */

View File

@ -3,6 +3,7 @@
#ifndef _ASM_ARM64_SET_MEMORY_H
#define _ASM_ARM64_SET_MEMORY_H
#include <asm/mem_encrypt.h>
#include <asm-generic/set_memory.h>
bool can_set_direct_map(void);

View File

@ -403,7 +403,6 @@
#define SYS_PMCNTENCLR_EL0 sys_reg(3, 3, 9, 12, 2)
#define SYS_PMOVSCLR_EL0 sys_reg(3, 3, 9, 12, 3)
#define SYS_PMSWINC_EL0 sys_reg(3, 3, 9, 12, 4)
#define SYS_PMSELR_EL0 sys_reg(3, 3, 9, 12, 5)
#define SYS_PMCEID0_EL0 sys_reg(3, 3, 9, 12, 6)
#define SYS_PMCEID1_EL0 sys_reg(3, 3, 9, 12, 7)
#define SYS_PMCCNTR_EL0 sys_reg(3, 3, 9, 13, 0)
@ -1077,6 +1076,9 @@
#define POE_RXW UL(0x7)
#define POE_MASK UL(0xf)
/* Initial value for Permission Overlay Extension for EL0 */
#define POR_EL0_INIT POE_RXW
#define ARM64_FEATURE_FIELD_BITS 4
/* Defined for compatibility only, do not add new users. */

View File

@ -81,6 +81,7 @@ void arch_setup_new_exec(void);
#define TIF_SME 27 /* SME in use */
#define TIF_SME_VL_INHERIT 28 /* Inherit SME vl_onexec across exec */
#define TIF_KERNEL_FPSTATE 29 /* Task is in a kernel mode FPSIMD section */
#define TIF_TSC_SIGSEGV 30 /* SIGSEGV on counter-timer access */
#define _TIF_SIGPENDING (1 << TIF_SIGPENDING)
#define _TIF_NEED_RESCHED (1 << TIF_NEED_RESCHED)
@ -97,6 +98,7 @@ void arch_setup_new_exec(void);
#define _TIF_SVE (1 << TIF_SVE)
#define _TIF_MTE_ASYNC_FAULT (1 << TIF_MTE_ASYNC_FAULT)
#define _TIF_NOTIFY_SIGNAL (1 << TIF_NOTIFY_SIGNAL)
#define _TIF_TSC_SIGSEGV (1 << TIF_TSC_SIGSEGV)
#define _TIF_WORK_MASK (_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
_TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE | \

View File

@ -25,6 +25,7 @@ try_emulate_armv8_deprecated(struct pt_regs *regs, u32 insn)
void force_signal_inject(int signal, int code, unsigned long address, unsigned long err);
void arm64_notify_segfault(unsigned long addr);
void arm64_force_sig_fault(int signo, int code, unsigned long far, const char *str);
void arm64_force_sig_fault_pkey(unsigned long far, const char *str, int pkey);
void arm64_force_sig_mceerr(int code, unsigned long far, short lsb, const char *str);
void arm64_force_sig_ptrace_errno_trap(int errno, unsigned long far, const char *str);

View File

@ -52,6 +52,7 @@
#define VNCR_PIRE0_EL1 0x290
#define VNCR_PIRE0_EL2 0x298
#define VNCR_PIR_EL1 0x2A0
#define VNCR_POR_EL1 0x2A8
#define VNCR_ICH_LR0_EL2 0x400
#define VNCR_ICH_LR1_EL2 0x408
#define VNCR_ICH_LR2_EL2 0x410

View File

@ -122,5 +122,6 @@
#define HWCAP2_SME_SF8FMA (1UL << 60)
#define HWCAP2_SME_SF8DP4 (1UL << 61)
#define HWCAP2_SME_SF8DP2 (1UL << 62)
#define HWCAP2_POE (1UL << 63)
#endif /* _UAPI__ASM_HWCAP_H */

View File

@ -7,4 +7,13 @@
#define PROT_BTI 0x10 /* BTI guarded page */
#define PROT_MTE 0x20 /* Normal Tagged mapping */
/* Override any generic PKEY permission defines */
#define PKEY_DISABLE_EXECUTE 0x4
#define PKEY_DISABLE_READ 0x8
#undef PKEY_ACCESS_MASK
#define PKEY_ACCESS_MASK (PKEY_DISABLE_ACCESS |\
PKEY_DISABLE_WRITE |\
PKEY_DISABLE_READ |\
PKEY_DISABLE_EXECUTE)
#endif /* ! _UAPI__ASM_MMAN_H */

View File

@ -98,6 +98,13 @@ struct esr_context {
__u64 esr;
};
#define POE_MAGIC 0x504f4530
struct poe_context {
struct _aarch64_ctx head;
__u64 por_el0;
};
/*
* extra_context: describes extra space in the signal frame for
* additional structures that don't fit in sigcontext.__reserved[].
@ -320,10 +327,10 @@ struct zt_context {
((sizeof(struct za_context) + (__SVE_VQ_BYTES - 1)) \
/ __SVE_VQ_BYTES * __SVE_VQ_BYTES)
#define ZA_SIG_REGS_SIZE(vq) ((vq * __SVE_VQ_BYTES) * (vq * __SVE_VQ_BYTES))
#define ZA_SIG_REGS_SIZE(vq) (((vq) * __SVE_VQ_BYTES) * ((vq) * __SVE_VQ_BYTES))
#define ZA_SIG_ZAV_OFFSET(vq, n) (ZA_SIG_REGS_OFFSET + \
(SVE_SIG_ZREG_SIZE(vq) * n))
(SVE_SIG_ZREG_SIZE(vq) * (n)))
#define ZA_SIG_CONTEXT_SIZE(vq) \
(ZA_SIG_REGS_OFFSET + ZA_SIG_REGS_SIZE(vq))
@ -334,7 +341,7 @@ struct zt_context {
#define ZT_SIG_REGS_OFFSET sizeof(struct zt_context)
#define ZT_SIG_REGS_SIZE(n) (ZT_SIG_REG_BYTES * n)
#define ZT_SIG_REGS_SIZE(n) (ZT_SIG_REG_BYTES * (n))
#define ZT_SIG_CONTEXT_SIZE(n) \
(sizeof(struct zt_context) + ZT_SIG_REGS_SIZE(n))

View File

@ -456,6 +456,14 @@ static const struct midr_range erratum_spec_ssbs_list[] = {
};
#endif
#ifdef CONFIG_AMPERE_ERRATUM_AC03_CPU_38
static const struct midr_range erratum_ac03_cpu_38_list[] = {
MIDR_ALL_VERSIONS(MIDR_AMPERE1),
MIDR_ALL_VERSIONS(MIDR_AMPERE1A),
{},
};
#endif
const struct arm64_cpu_capabilities arm64_errata[] = {
#ifdef CONFIG_ARM64_WORKAROUND_CLEAN_CACHE
{
@ -772,7 +780,7 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
{
.desc = "AmpereOne erratum AC03_CPU_38",
.capability = ARM64_WORKAROUND_AMPERE_AC03_CPU_38,
ERRATA_MIDR_ALL_VERSIONS(MIDR_AMPERE1),
ERRATA_MIDR_RANGE_LIST(erratum_ac03_cpu_38_list),
},
#endif
{

View File

@ -466,6 +466,8 @@ static const struct arm64_ftr_bits ftr_id_aa64mmfr2[] = {
};
static const struct arm64_ftr_bits ftr_id_aa64mmfr3[] = {
ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_POE),
FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64MMFR3_EL1_S1POE_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64MMFR3_EL1_S1PIE_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64MMFR3_EL1_TCRX_SHIFT, 4, 0),
ARM64_FTR_END,
@ -2348,6 +2350,14 @@ static void cpu_enable_mops(const struct arm64_cpu_capabilities *__unused)
sysreg_clear_set(sctlr_el1, 0, SCTLR_EL1_MSCEn);
}
#ifdef CONFIG_ARM64_POE
static void cpu_enable_poe(const struct arm64_cpu_capabilities *__unused)
{
sysreg_clear_set(REG_TCR2_EL1, 0, TCR2_EL1x_E0POE);
sysreg_clear_set(CPACR_EL1, 0, CPACR_ELx_E0POE);
}
#endif
/* Internal helper functions to match cpu capability type */
static bool
cpucap_late_cpu_optional(const struct arm64_cpu_capabilities *cap)
@ -2870,6 +2880,16 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
.matches = has_nv1,
ARM64_CPUID_FIELDS_NEG(ID_AA64MMFR4_EL1, E2H0, NI_NV1)
},
#ifdef CONFIG_ARM64_POE
{
.desc = "Stage-1 Permission Overlay Extension (S1POE)",
.capability = ARM64_HAS_S1POE,
.type = ARM64_CPUCAP_BOOT_CPU_FEATURE,
.matches = has_cpuid_feature,
.cpu_enable = cpu_enable_poe,
ARM64_CPUID_FIELDS(ID_AA64MMFR3_EL1, S1POE, IMP)
},
#endif
{},
};
@ -3034,6 +3054,9 @@ static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = {
HWCAP_CAP(ID_AA64FPFR0_EL1, F8DP2, IMP, CAP_HWCAP, KERNEL_HWCAP_F8DP2),
HWCAP_CAP(ID_AA64FPFR0_EL1, F8E4M3, IMP, CAP_HWCAP, KERNEL_HWCAP_F8E4M3),
HWCAP_CAP(ID_AA64FPFR0_EL1, F8E5M2, IMP, CAP_HWCAP, KERNEL_HWCAP_F8E5M2),
#ifdef CONFIG_ARM64_POE
HWCAP_CAP(ID_AA64MMFR3_EL1, S1POE, IMP, CAP_HWCAP, KERNEL_HWCAP_POE),
#endif
{},
};

View File

@ -143,6 +143,7 @@ static const char *const hwcap_str[] = {
[KERNEL_HWCAP_SME_SF8FMA] = "smesf8fma",
[KERNEL_HWCAP_SME_SF8DP4] = "smesf8dp4",
[KERNEL_HWCAP_SME_SF8DP2] = "smesf8dp2",
[KERNEL_HWCAP_POE] = "poe",
};
#ifdef CONFIG_COMPAT
@ -280,7 +281,7 @@ const struct seq_operations cpuinfo_op = {
};
static struct kobj_type cpuregs_kobj_type = {
static const struct kobj_type cpuregs_kobj_type = {
.sysfs_ops = &kobj_sysfs_ops,
};

View File

@ -407,7 +407,7 @@ int swsusp_arch_resume(void)
void *, phys_addr_t, phys_addr_t);
struct trans_pgd_info trans_info = {
.trans_alloc_page = hibernate_page_alloc,
.trans_alloc_arg = (void *)GFP_ATOMIC,
.trans_alloc_arg = (__force void *)GFP_ATOMIC,
};
/*

View File

@ -43,6 +43,7 @@
#include <linux/stacktrace.h>
#include <asm/alternative.h>
#include <asm/arch_timer.h>
#include <asm/compat.h>
#include <asm/cpufeature.h>
#include <asm/cacheflush.h>
@ -271,12 +272,21 @@ static void flush_tagged_addr_state(void)
clear_thread_flag(TIF_TAGGED_ADDR);
}
static void flush_poe(void)
{
if (!system_supports_poe())
return;
write_sysreg_s(POR_EL0_INIT, SYS_POR_EL0);
}
void flush_thread(void)
{
fpsimd_flush_thread();
tls_thread_flush();
flush_ptrace_hw_breakpoint(current);
flush_tagged_addr_state();
flush_poe();
}
void arch_release_task_struct(struct task_struct *tsk)
@ -371,6 +381,9 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
if (system_supports_tpidr2())
p->thread.tpidr2_el0 = read_sysreg_s(SYS_TPIDR2_EL0);
if (system_supports_poe())
p->thread.por_el0 = read_sysreg_s(SYS_POR_EL0);
if (stack_start) {
if (is_compat_thread(task_thread_info(p)))
childregs->compat_sp = stack_start;
@ -472,27 +485,63 @@ static void entry_task_switch(struct task_struct *next)
}
/*
* ARM erratum 1418040 handling, affecting the 32bit view of CNTVCT.
* Ensure access is disabled when switching to a 32bit task, ensure
* access is enabled when switching to a 64bit task.
* Handle sysreg updates for ARM erratum 1418040 which affects the 32bit view of
* CNTVCT, various other errata which require trapping all CNTVCT{,_EL0}
* accesses and prctl(PR_SET_TSC). Ensure access is disabled iff a workaround is
* required or PR_TSC_SIGSEGV is set.
*/
static void erratum_1418040_thread_switch(struct task_struct *next)
static void update_cntkctl_el1(struct task_struct *next)
{
if (!IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040) ||
!this_cpu_has_cap(ARM64_WORKAROUND_1418040))
return;
struct thread_info *ti = task_thread_info(next);
if (is_compat_thread(task_thread_info(next)))
if (test_ti_thread_flag(ti, TIF_TSC_SIGSEGV) ||
has_erratum_handler(read_cntvct_el0) ||
(IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040) &&
this_cpu_has_cap(ARM64_WORKAROUND_1418040) &&
is_compat_thread(ti)))
sysreg_clear_set(cntkctl_el1, ARCH_TIMER_USR_VCT_ACCESS_EN, 0);
else
sysreg_clear_set(cntkctl_el1, 0, ARCH_TIMER_USR_VCT_ACCESS_EN);
}
static void erratum_1418040_new_exec(void)
static void cntkctl_thread_switch(struct task_struct *prev,
struct task_struct *next)
{
if ((read_ti_thread_flags(task_thread_info(prev)) &
(_TIF_32BIT | _TIF_TSC_SIGSEGV)) !=
(read_ti_thread_flags(task_thread_info(next)) &
(_TIF_32BIT | _TIF_TSC_SIGSEGV)))
update_cntkctl_el1(next);
}
static int do_set_tsc_mode(unsigned int val)
{
bool tsc_sigsegv;
if (val == PR_TSC_SIGSEGV)
tsc_sigsegv = true;
else if (val == PR_TSC_ENABLE)
tsc_sigsegv = false;
else
return -EINVAL;
preempt_disable();
erratum_1418040_thread_switch(current);
update_thread_flag(TIF_TSC_SIGSEGV, tsc_sigsegv);
update_cntkctl_el1(current);
preempt_enable();
return 0;
}
static void permission_overlay_switch(struct task_struct *next)
{
if (!system_supports_poe())
return;
current->thread.por_el0 = read_sysreg_s(SYS_POR_EL0);
if (current->thread.por_el0 != next->thread.por_el0) {
write_sysreg_s(next->thread.por_el0, SYS_POR_EL0);
}
}
/*
@ -528,8 +577,9 @@ struct task_struct *__switch_to(struct task_struct *prev,
contextidr_thread_switch(next);
entry_task_switch(next);
ssbs_thread_switch(next);
erratum_1418040_thread_switch(next);
cntkctl_thread_switch(prev, next);
ptrauth_thread_switch_user(next);
permission_overlay_switch(next);
/*
* Complete any pending TLB or cache maintenance on this CPU in case
@ -645,7 +695,7 @@ void arch_setup_new_exec(void)
current->mm->context.flags = mmflags;
ptrauth_thread_init_user();
mte_thread_init_user();
erratum_1418040_new_exec();
do_set_tsc_mode(PR_TSC_ENABLE);
if (task_spec_ssb_noexec(current)) {
arch_prctl_spec_ctrl_set(current, PR_SPEC_STORE_BYPASS,
@ -754,3 +804,26 @@ int arch_elf_adjust_prot(int prot, const struct arch_elf_state *state,
return prot;
}
#endif
int get_tsc_mode(unsigned long adr)
{
unsigned int val;
if (is_compat_task())
return -EINVAL;
if (test_thread_flag(TIF_TSC_SIGSEGV))
val = PR_TSC_SIGSEGV;
else
val = PR_TSC_ENABLE;
return put_user(val, (unsigned int __user *)adr);
}
int set_tsc_mode(unsigned int val)
{
if (is_compat_task())
return -EINVAL;
return do_set_tsc_mode(val);
}

View File

@ -1440,6 +1440,39 @@ static int tagged_addr_ctrl_set(struct task_struct *target, const struct
}
#endif
#ifdef CONFIG_ARM64_POE
static int poe_get(struct task_struct *target,
const struct user_regset *regset,
struct membuf to)
{
if (!system_supports_poe())
return -EINVAL;
return membuf_write(&to, &target->thread.por_el0,
sizeof(target->thread.por_el0));
}
static int poe_set(struct task_struct *target, const struct
user_regset *regset, unsigned int pos,
unsigned int count, const void *kbuf, const
void __user *ubuf)
{
int ret;
long ctrl;
if (!system_supports_poe())
return -EINVAL;
ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &ctrl, 0, -1);
if (ret)
return ret;
target->thread.por_el0 = ctrl;
return 0;
}
#endif
enum aarch64_regset {
REGSET_GPR,
REGSET_FPR,
@ -1469,6 +1502,9 @@ enum aarch64_regset {
#ifdef CONFIG_ARM64_TAGGED_ADDR_ABI
REGSET_TAGGED_ADDR_CTRL,
#endif
#ifdef CONFIG_ARM64_POE
REGSET_POE
#endif
};
static const struct user_regset aarch64_regsets[] = {
@ -1628,6 +1664,16 @@ static const struct user_regset aarch64_regsets[] = {
.set = tagged_addr_ctrl_set,
},
#endif
#ifdef CONFIG_ARM64_POE
[REGSET_POE] = {
.core_note_type = NT_ARM_POE,
.n = 1,
.size = sizeof(long),
.align = sizeof(long),
.regset_get = poe_get,
.set = poe_set,
},
#endif
};
static const struct user_regset_view user_aarch64_view = {

View File

@ -61,6 +61,7 @@ struct rt_sigframe_user_layout {
unsigned long za_offset;
unsigned long zt_offset;
unsigned long fpmr_offset;
unsigned long poe_offset;
unsigned long extra_offset;
unsigned long end_offset;
};
@ -185,6 +186,8 @@ struct user_ctxs {
u32 zt_size;
struct fpmr_context __user *fpmr;
u32 fpmr_size;
struct poe_context __user *poe;
u32 poe_size;
};
static int preserve_fpsimd_context(struct fpsimd_context __user *ctx)
@ -258,6 +261,32 @@ static int restore_fpmr_context(struct user_ctxs *user)
return err;
}
static int preserve_poe_context(struct poe_context __user *ctx)
{
int err = 0;
__put_user_error(POE_MAGIC, &ctx->head.magic, err);
__put_user_error(sizeof(*ctx), &ctx->head.size, err);
__put_user_error(read_sysreg_s(SYS_POR_EL0), &ctx->por_el0, err);
return err;
}
static int restore_poe_context(struct user_ctxs *user)
{
u64 por_el0;
int err = 0;
if (user->poe_size != sizeof(*user->poe))
return -EINVAL;
__get_user_error(por_el0, &(user->poe->por_el0), err);
if (!err)
write_sysreg_s(por_el0, SYS_POR_EL0);
return err;
}
#ifdef CONFIG_ARM64_SVE
static int preserve_sve_context(struct sve_context __user *ctx)
@ -621,6 +650,7 @@ static int parse_user_sigframe(struct user_ctxs *user,
user->za = NULL;
user->zt = NULL;
user->fpmr = NULL;
user->poe = NULL;
if (!IS_ALIGNED((unsigned long)base, 16))
goto invalid;
@ -671,6 +701,17 @@ static int parse_user_sigframe(struct user_ctxs *user,
/* ignore */
break;
case POE_MAGIC:
if (!system_supports_poe())
goto invalid;
if (user->poe)
goto invalid;
user->poe = (struct poe_context __user *)head;
user->poe_size = size;
break;
case SVE_MAGIC:
if (!system_supports_sve() && !system_supports_sme())
goto invalid;
@ -857,6 +898,9 @@ static int restore_sigframe(struct pt_regs *regs,
if (err == 0 && system_supports_sme2() && user.zt)
err = restore_zt_context(&user);
if (err == 0 && system_supports_poe() && user.poe)
err = restore_poe_context(&user);
return err;
}
@ -980,6 +1024,13 @@ static int setup_sigframe_layout(struct rt_sigframe_user_layout *user,
return err;
}
if (system_supports_poe()) {
err = sigframe_alloc(user, &user->poe_offset,
sizeof(struct poe_context));
if (err)
return err;
}
return sigframe_alloc_end(user);
}
@ -1042,6 +1093,14 @@ static int setup_sigframe(struct rt_sigframe_user_layout *user,
err |= preserve_fpmr_context(fpmr_ctx);
}
if (system_supports_poe() && err == 0 && user->poe_offset) {
struct poe_context __user *poe_ctx =
apply_user_offset(user, user->poe_offset);
err |= preserve_poe_context(poe_ctx);
}
/* ZA state if present */
if (system_supports_sme() && err == 0 && user->za_offset) {
struct za_context __user *za_ctx =
@ -1178,6 +1237,9 @@ static void setup_return(struct pt_regs *regs, struct k_sigaction *ka,
sme_smstop();
}
if (system_supports_poe())
write_sysreg_s(POR_EL0_INIT, SYS_POR_EL0);
if (ka->sa.sa_flags & SA_RESTORER)
sigtramp = ka->sa.sa_restorer;
else

View File

@ -68,7 +68,7 @@ enum ipi_msg_type {
IPI_RESCHEDULE,
IPI_CALL_FUNC,
IPI_CPU_STOP,
IPI_CPU_CRASH_STOP,
IPI_CPU_STOP_NMI,
IPI_TIMER,
IPI_IRQ_WORK,
NR_IPI,
@ -85,6 +85,8 @@ static int ipi_irq_base __ro_after_init;
static int nr_ipi __ro_after_init = NR_IPI;
static struct irq_desc *ipi_desc[MAX_IPI] __ro_after_init;
static bool crash_stop;
static void ipi_setup(int cpu);
#ifdef CONFIG_HOTPLUG_CPU
@ -823,7 +825,7 @@ static const char *ipi_types[MAX_IPI] __tracepoint_string = {
[IPI_RESCHEDULE] = "Rescheduling interrupts",
[IPI_CALL_FUNC] = "Function call interrupts",
[IPI_CPU_STOP] = "CPU stop interrupts",
[IPI_CPU_CRASH_STOP] = "CPU stop (for crash dump) interrupts",
[IPI_CPU_STOP_NMI] = "CPU stop NMIs",
[IPI_TIMER] = "Timer broadcast interrupts",
[IPI_IRQ_WORK] = "IRQ work interrupts",
[IPI_CPU_BACKTRACE] = "CPU backtrace interrupts",
@ -867,9 +869,9 @@ void arch_irq_work_raise(void)
}
#endif
static void __noreturn local_cpu_stop(void)
static void __noreturn local_cpu_stop(unsigned int cpu)
{
set_cpu_online(smp_processor_id(), false);
set_cpu_online(cpu, false);
local_daif_mask();
sdei_mask_local_cpu();
@ -883,21 +885,26 @@ static void __noreturn local_cpu_stop(void)
*/
void __noreturn panic_smp_self_stop(void)
{
local_cpu_stop();
local_cpu_stop(smp_processor_id());
}
#ifdef CONFIG_KEXEC_CORE
static atomic_t waiting_for_crash_ipi = ATOMIC_INIT(0);
#endif
static void __noreturn ipi_cpu_crash_stop(unsigned int cpu, struct pt_regs *regs)
{
#ifdef CONFIG_KEXEC_CORE
/*
* Use local_daif_mask() instead of local_irq_disable() to make sure
* that pseudo-NMIs are disabled. The "crash stop" code starts with
* an IRQ and falls back to NMI (which might be pseudo). If the IRQ
* finally goes through right as we're timing out then the NMI could
* interrupt us. It's better to prevent the NMI and let the IRQ
* finish since the pt_regs will be better.
*/
local_daif_mask();
crash_save_cpu(regs, cpu);
atomic_dec(&waiting_for_crash_ipi);
set_cpu_online(cpu, false);
local_irq_disable();
sdei_mask_local_cpu();
if (IS_ENABLED(CONFIG_HOTPLUG_CPU))
@ -962,14 +969,12 @@ static void do_handle_IPI(int ipinr)
break;
case IPI_CPU_STOP:
local_cpu_stop();
break;
case IPI_CPU_CRASH_STOP:
if (IS_ENABLED(CONFIG_KEXEC_CORE)) {
case IPI_CPU_STOP_NMI:
if (IS_ENABLED(CONFIG_KEXEC_CORE) && crash_stop) {
ipi_cpu_crash_stop(cpu, get_irq_regs());
unreachable();
} else {
local_cpu_stop(cpu);
}
break;
@ -1024,8 +1029,7 @@ static bool ipi_should_be_nmi(enum ipi_msg_type ipi)
return false;
switch (ipi) {
case IPI_CPU_STOP:
case IPI_CPU_CRASH_STOP:
case IPI_CPU_STOP_NMI:
case IPI_CPU_BACKTRACE:
case IPI_KGDB_ROUNDUP:
return true;
@ -1138,47 +1142,10 @@ static inline unsigned int num_other_online_cpus(void)
void smp_send_stop(void)
{
unsigned long timeout;
if (num_other_online_cpus()) {
cpumask_t mask;
cpumask_copy(&mask, cpu_online_mask);
cpumask_clear_cpu(smp_processor_id(), &mask);
if (system_state <= SYSTEM_RUNNING)
pr_crit("SMP: stopping secondary CPUs\n");
smp_cross_call(&mask, IPI_CPU_STOP);
}
/* Wait up to one second for other CPUs to stop */
timeout = USEC_PER_SEC;
while (num_other_online_cpus() && timeout--)
udelay(1);
if (num_other_online_cpus())
pr_warn("SMP: failed to stop secondary CPUs %*pbl\n",
cpumask_pr_args(cpu_online_mask));
sdei_mask_local_cpu();
}
#ifdef CONFIG_KEXEC_CORE
void crash_smp_send_stop(void)
{
static int cpus_stopped;
static unsigned long stop_in_progress;
cpumask_t mask;
unsigned long timeout;
/*
* This function can be called twice in panic path, but obviously
* we execute this only once.
*/
if (cpus_stopped)
return;
cpus_stopped = 1;
/*
* If this cpu is the only one alive at this point in time, online or
* not, there are no stop messages to be sent around, so just back out.
@ -1186,31 +1153,98 @@ void crash_smp_send_stop(void)
if (num_other_online_cpus() == 0)
goto skip_ipi;
/* Only proceed if this is the first CPU to reach this code */
if (test_and_set_bit(0, &stop_in_progress))
return;
/*
* Send an IPI to all currently online CPUs except the CPU running
* this code.
*
* NOTE: we don't do anything here to prevent other CPUs from coming
* online after we snapshot `cpu_online_mask`. Ideally, the calling code
* should do something to prevent other CPUs from coming up. This code
* can be called in the panic path and thus it doesn't seem wise to
* grab the CPU hotplug mutex ourselves. Worst case:
* - If a CPU comes online as we're running, we'll likely notice it
* during the 1 second wait below and then we'll catch it when we try
* with an NMI (assuming NMIs are enabled) since we re-snapshot the
* mask before sending an NMI.
* - If we leave the function and see that CPUs are still online we'll
* at least print a warning. Especially without NMIs this function
* isn't foolproof anyway so calling code will just have to accept
* the fact that there could be cases where a CPU can't be stopped.
*/
cpumask_copy(&mask, cpu_online_mask);
cpumask_clear_cpu(smp_processor_id(), &mask);
atomic_set(&waiting_for_crash_ipi, num_other_online_cpus());
if (system_state <= SYSTEM_RUNNING)
pr_crit("SMP: stopping secondary CPUs\n");
smp_cross_call(&mask, IPI_CPU_CRASH_STOP);
/* Wait up to one second for other CPUs to stop */
/*
* Start with a normal IPI and wait up to one second for other CPUs to
* stop. We do this first because it gives other processors a chance
* to exit critical sections / drop locks and makes the rest of the
* stop process (especially console flush) more robust.
*/
smp_cross_call(&mask, IPI_CPU_STOP);
timeout = USEC_PER_SEC;
while ((atomic_read(&waiting_for_crash_ipi) > 0) && timeout--)
while (num_other_online_cpus() && timeout--)
udelay(1);
if (atomic_read(&waiting_for_crash_ipi) > 0)
/*
* If CPUs are still online, try an NMI. There's no excuse for this to
* be slow, so we only give them an extra 10 ms to respond.
*/
if (num_other_online_cpus() && ipi_should_be_nmi(IPI_CPU_STOP_NMI)) {
smp_rmb();
cpumask_copy(&mask, cpu_online_mask);
cpumask_clear_cpu(smp_processor_id(), &mask);
pr_info("SMP: retry stop with NMI for CPUs %*pbl\n",
cpumask_pr_args(&mask));
smp_cross_call(&mask, IPI_CPU_STOP_NMI);
timeout = USEC_PER_MSEC * 10;
while (num_other_online_cpus() && timeout--)
udelay(1);
}
if (num_other_online_cpus()) {
smp_rmb();
cpumask_copy(&mask, cpu_online_mask);
cpumask_clear_cpu(smp_processor_id(), &mask);
pr_warn("SMP: failed to stop secondary CPUs %*pbl\n",
cpumask_pr_args(&mask));
}
skip_ipi:
sdei_mask_local_cpu();
}
#ifdef CONFIG_KEXEC_CORE
void crash_smp_send_stop(void)
{
/*
* This function can be called twice in panic path, but obviously
* we execute this only once.
*
* We use this same boolean to tell whether the IPI we send was a
* stop or a "crash stop".
*/
if (crash_stop)
return;
crash_stop = 1;
smp_send_stop();
sdei_handler_abort();
}
bool smp_crash_stop_failed(void)
{
return (atomic_read(&waiting_for_crash_ipi) > 0);
return num_other_online_cpus() != 0;
}
#endif

View File

@ -273,6 +273,12 @@ void arm64_force_sig_fault(int signo, int code, unsigned long far,
force_sig_fault(signo, code, (void __user *)far);
}
void arm64_force_sig_fault_pkey(unsigned long far, const char *str, int pkey)
{
arm64_show_signal(SIGSEGV, str);
force_sig_pkuerr((void __user *)far, pkey);
}
void arm64_force_sig_mceerr(int code, unsigned long far, short lsb,
const char *str)
{
@ -601,19 +607,27 @@ static void ctr_read_handler(unsigned long esr, struct pt_regs *regs)
static void cntvct_read_handler(unsigned long esr, struct pt_regs *regs)
{
if (test_thread_flag(TIF_TSC_SIGSEGV)) {
force_sig(SIGSEGV);
} else {
int rt = ESR_ELx_SYS64_ISS_RT(esr);
pt_regs_write_reg(regs, rt, arch_timer_read_counter());
arm64_skip_faulting_instruction(regs, AARCH64_INSN_SIZE);
}
}
static void cntfrq_read_handler(unsigned long esr, struct pt_regs *regs)
{
if (test_thread_flag(TIF_TSC_SIGSEGV)) {
force_sig(SIGSEGV);
} else {
int rt = ESR_ELx_SYS64_ISS_RT(esr);
pt_regs_write_reg(regs, rt, arch_timer_get_rate());
arm64_skip_faulting_instruction(regs, AARCH64_INSN_SIZE);
}
}
static void mrs_handler(unsigned long esr, struct pt_regs *regs)
{

View File

@ -14,6 +14,7 @@
static inline bool __translate_far_to_hpfar(u64 far, u64 *hpfar)
{
int ret;
u64 par, tmp;
/*
@ -27,7 +28,9 @@ static inline bool __translate_far_to_hpfar(u64 far, u64 *hpfar)
* saved the guest context yet, and we may return early...
*/
par = read_sysreg_par();
if (!__kvm_at("s1e1r", far))
ret = system_supports_poe() ? __kvm_at(OP_AT_S1E1A, far) :
__kvm_at(OP_AT_S1E1R, far);
if (!ret)
tmp = read_sysreg_par();
else
tmp = SYS_PAR_EL1_F; /* back to the guest */

View File

@ -16,9 +16,15 @@
#include <asm/kvm_hyp.h>
#include <asm/kvm_mmu.h>
static inline bool ctxt_has_s1poe(struct kvm_cpu_context *ctxt);
static inline void __sysreg_save_common_state(struct kvm_cpu_context *ctxt)
{
ctxt_sys_reg(ctxt, MDSCR_EL1) = read_sysreg(mdscr_el1);
// POR_EL0 can affect uaccess, so must be saved/restored early.
if (ctxt_has_s1poe(ctxt))
ctxt_sys_reg(ctxt, POR_EL0) = read_sysreg_s(SYS_POR_EL0);
}
static inline void __sysreg_save_user_state(struct kvm_cpu_context *ctxt)
@ -66,6 +72,17 @@ static inline bool ctxt_has_tcrx(struct kvm_cpu_context *ctxt)
return kvm_has_feat(kern_hyp_va(vcpu->kvm), ID_AA64MMFR3_EL1, TCRX, IMP);
}
static inline bool ctxt_has_s1poe(struct kvm_cpu_context *ctxt)
{
struct kvm_vcpu *vcpu;
if (!system_supports_poe())
return false;
vcpu = ctxt_to_vcpu(ctxt);
return kvm_has_feat(kern_hyp_va(vcpu->kvm), ID_AA64MMFR3_EL1, S1POE, IMP);
}
static inline void __sysreg_save_el1_state(struct kvm_cpu_context *ctxt)
{
ctxt_sys_reg(ctxt, SCTLR_EL1) = read_sysreg_el1(SYS_SCTLR);
@ -80,6 +97,9 @@ static inline void __sysreg_save_el1_state(struct kvm_cpu_context *ctxt)
ctxt_sys_reg(ctxt, PIR_EL1) = read_sysreg_el1(SYS_PIR);
ctxt_sys_reg(ctxt, PIRE0_EL1) = read_sysreg_el1(SYS_PIRE0);
}
if (ctxt_has_s1poe(ctxt))
ctxt_sys_reg(ctxt, POR_EL1) = read_sysreg_el1(SYS_POR);
}
ctxt_sys_reg(ctxt, ESR_EL1) = read_sysreg_el1(SYS_ESR);
ctxt_sys_reg(ctxt, AFSR0_EL1) = read_sysreg_el1(SYS_AFSR0);
@ -120,6 +140,10 @@ static inline void __sysreg_save_el2_return_state(struct kvm_cpu_context *ctxt)
static inline void __sysreg_restore_common_state(struct kvm_cpu_context *ctxt)
{
write_sysreg(ctxt_sys_reg(ctxt, MDSCR_EL1), mdscr_el1);
// POR_EL0 can affect uaccess, so must be saved/restored early.
if (ctxt_has_s1poe(ctxt))
write_sysreg_s(ctxt_sys_reg(ctxt, POR_EL0), SYS_POR_EL0);
}
static inline void __sysreg_restore_user_state(struct kvm_cpu_context *ctxt)
@ -158,6 +182,9 @@ static inline void __sysreg_restore_el1_state(struct kvm_cpu_context *ctxt)
write_sysreg_el1(ctxt_sys_reg(ctxt, PIR_EL1), SYS_PIR);
write_sysreg_el1(ctxt_sys_reg(ctxt, PIRE0_EL1), SYS_PIRE0);
}
if (ctxt_has_s1poe(ctxt))
write_sysreg_el1(ctxt_sys_reg(ctxt, POR_EL1), SYS_POR);
}
write_sysreg_el1(ctxt_sys_reg(ctxt, ESR_EL1), SYS_ESR);
write_sysreg_el1(ctxt_sys_reg(ctxt, AFSR0_EL1), SYS_AFSR0);

View File

@ -233,7 +233,7 @@ void kvm_pmu_vcpu_init(struct kvm_vcpu *vcpu)
int i;
struct kvm_pmu *pmu = &vcpu->arch.pmu;
for (i = 0; i < ARMV8_PMU_MAX_COUNTERS; i++)
for (i = 0; i < KVM_ARMV8_PMU_MAX_COUNTERS; i++)
pmu->pmc[i].idx = i;
}
@ -260,7 +260,7 @@ void kvm_pmu_vcpu_destroy(struct kvm_vcpu *vcpu)
{
int i;
for (i = 0; i < ARMV8_PMU_MAX_COUNTERS; i++)
for (i = 0; i < KVM_ARMV8_PMU_MAX_COUNTERS; i++)
kvm_pmu_release_perf_event(kvm_vcpu_idx_to_pmc(vcpu, i));
irq_work_sync(&vcpu->arch.pmu.overflow_work);
}
@ -291,7 +291,7 @@ void kvm_pmu_enable_counter_mask(struct kvm_vcpu *vcpu, u64 val)
if (!(kvm_vcpu_read_pmcr(vcpu) & ARMV8_PMU_PMCR_E) || !val)
return;
for (i = 0; i < ARMV8_PMU_MAX_COUNTERS; i++) {
for (i = 0; i < KVM_ARMV8_PMU_MAX_COUNTERS; i++) {
struct kvm_pmc *pmc;
if (!(val & BIT(i)))
@ -323,7 +323,7 @@ void kvm_pmu_disable_counter_mask(struct kvm_vcpu *vcpu, u64 val)
if (!kvm_vcpu_has_pmu(vcpu) || !val)
return;
for (i = 0; i < ARMV8_PMU_MAX_COUNTERS; i++) {
for (i = 0; i < KVM_ARMV8_PMU_MAX_COUNTERS; i++) {
struct kvm_pmc *pmc;
if (!(val & BIT(i)))
@ -910,10 +910,10 @@ u8 kvm_arm_pmu_get_max_counters(struct kvm *kvm)
struct arm_pmu *arm_pmu = kvm->arch.arm_pmu;
/*
* The arm_pmu->num_events considers the cycle counter as well.
* Ignore that and return only the general-purpose counters.
* The arm_pmu->cntr_mask considers the fixed counter(s) as well.
* Ignore those and return only the general-purpose counters.
*/
return arm_pmu->num_events - 1;
return bitmap_weight(arm_pmu->cntr_mask, ARMV8_PMU_MAX_GENERAL_COUNTERS);
}
static void kvm_arm_set_pmu(struct kvm *kvm, struct arm_pmu *arm_pmu)

View File

@ -5,6 +5,8 @@
*/
#include <linux/kvm_host.h>
#include <linux/perf_event.h>
#include <linux/perf/arm_pmu.h>
#include <linux/perf/arm_pmuv3.h>
static DEFINE_PER_CPU(struct kvm_pmu_events, kvm_pmu_events);
@ -35,7 +37,7 @@ struct kvm_pmu_events *kvm_get_pmu_events(void)
* Add events to track that we may want to switch at guest entry/exit
* time.
*/
void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr)
void kvm_set_pmu_events(u64 set, struct perf_event_attr *attr)
{
struct kvm_pmu_events *pmu = kvm_get_pmu_events();
@ -51,7 +53,7 @@ void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr)
/*
* Stop tracking events
*/
void kvm_clr_pmu_events(u32 clr)
void kvm_clr_pmu_events(u64 clr)
{
struct kvm_pmu_events *pmu = kvm_get_pmu_events();
@ -62,79 +64,32 @@ void kvm_clr_pmu_events(u32 clr)
pmu->events_guest &= ~clr;
}
#define PMEVTYPER_READ_CASE(idx) \
case idx: \
return read_sysreg(pmevtyper##idx##_el0)
#define PMEVTYPER_WRITE_CASE(idx) \
case idx: \
write_sysreg(val, pmevtyper##idx##_el0); \
break
#define PMEVTYPER_CASES(readwrite) \
PMEVTYPER_##readwrite##_CASE(0); \
PMEVTYPER_##readwrite##_CASE(1); \
PMEVTYPER_##readwrite##_CASE(2); \
PMEVTYPER_##readwrite##_CASE(3); \
PMEVTYPER_##readwrite##_CASE(4); \
PMEVTYPER_##readwrite##_CASE(5); \
PMEVTYPER_##readwrite##_CASE(6); \
PMEVTYPER_##readwrite##_CASE(7); \
PMEVTYPER_##readwrite##_CASE(8); \
PMEVTYPER_##readwrite##_CASE(9); \
PMEVTYPER_##readwrite##_CASE(10); \
PMEVTYPER_##readwrite##_CASE(11); \
PMEVTYPER_##readwrite##_CASE(12); \
PMEVTYPER_##readwrite##_CASE(13); \
PMEVTYPER_##readwrite##_CASE(14); \
PMEVTYPER_##readwrite##_CASE(15); \
PMEVTYPER_##readwrite##_CASE(16); \
PMEVTYPER_##readwrite##_CASE(17); \
PMEVTYPER_##readwrite##_CASE(18); \
PMEVTYPER_##readwrite##_CASE(19); \
PMEVTYPER_##readwrite##_CASE(20); \
PMEVTYPER_##readwrite##_CASE(21); \
PMEVTYPER_##readwrite##_CASE(22); \
PMEVTYPER_##readwrite##_CASE(23); \
PMEVTYPER_##readwrite##_CASE(24); \
PMEVTYPER_##readwrite##_CASE(25); \
PMEVTYPER_##readwrite##_CASE(26); \
PMEVTYPER_##readwrite##_CASE(27); \
PMEVTYPER_##readwrite##_CASE(28); \
PMEVTYPER_##readwrite##_CASE(29); \
PMEVTYPER_##readwrite##_CASE(30)
/*
* Read a value direct from PMEVTYPER<idx> where idx is 0-30
* or PMCCFILTR_EL0 where idx is ARMV8_PMU_CYCLE_IDX (31).
* or PMxCFILTR_EL0 where idx is 31-32.
*/
static u64 kvm_vcpu_pmu_read_evtype_direct(int idx)
{
switch (idx) {
PMEVTYPER_CASES(READ);
case ARMV8_PMU_CYCLE_IDX:
return read_sysreg(pmccfiltr_el0);
default:
WARN_ON(1);
}
if (idx == ARMV8_PMU_CYCLE_IDX)
return read_pmccfiltr();
else if (idx == ARMV8_PMU_INSTR_IDX)
return read_pmicfiltr();
return 0;
return read_pmevtypern(idx);
}
/*
* Write a value direct to PMEVTYPER<idx> where idx is 0-30
* or PMCCFILTR_EL0 where idx is ARMV8_PMU_CYCLE_IDX (31).
* or PMxCFILTR_EL0 where idx is 31-32.
*/
static void kvm_vcpu_pmu_write_evtype_direct(int idx, u32 val)
{
switch (idx) {
PMEVTYPER_CASES(WRITE);
case ARMV8_PMU_CYCLE_IDX:
write_sysreg(val, pmccfiltr_el0);
break;
default:
WARN_ON(1);
}
if (idx == ARMV8_PMU_CYCLE_IDX)
write_pmccfiltr(val);
else if (idx == ARMV8_PMU_INSTR_IDX)
write_pmicfiltr(val);
else
write_pmevtypern(idx, val);
}
/*
@ -145,7 +100,7 @@ static void kvm_vcpu_pmu_enable_el0(unsigned long events)
u64 typer;
u32 counter;
for_each_set_bit(counter, &events, 32) {
for_each_set_bit(counter, &events, ARMPMU_MAX_HWEVENTS) {
typer = kvm_vcpu_pmu_read_evtype_direct(counter);
typer &= ~ARMV8_PMU_EXCLUDE_EL0;
kvm_vcpu_pmu_write_evtype_direct(counter, typer);
@ -160,7 +115,7 @@ static void kvm_vcpu_pmu_disable_el0(unsigned long events)
u64 typer;
u32 counter;
for_each_set_bit(counter, &events, 32) {
for_each_set_bit(counter, &events, ARMPMU_MAX_HWEVENTS) {
typer = kvm_vcpu_pmu_read_evtype_direct(counter);
typer |= ARMV8_PMU_EXCLUDE_EL0;
kvm_vcpu_pmu_write_evtype_direct(counter, typer);
@ -176,7 +131,7 @@ static void kvm_vcpu_pmu_disable_el0(unsigned long events)
void kvm_vcpu_pmu_restore_guest(struct kvm_vcpu *vcpu)
{
struct kvm_pmu_events *pmu;
u32 events_guest, events_host;
u64 events_guest, events_host;
if (!kvm_arm_support_pmu_v3() || !has_vhe())
return;
@ -197,7 +152,7 @@ void kvm_vcpu_pmu_restore_guest(struct kvm_vcpu *vcpu)
void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu)
{
struct kvm_pmu_events *pmu;
u32 events_guest, events_host;
u64 events_guest, events_host;
if (!kvm_arm_support_pmu_v3() || !has_vhe())
return;

View File

@ -18,6 +18,7 @@
#include <linux/printk.h>
#include <linux/uaccess.h>
#include <asm/arm_pmuv3.h>
#include <asm/cacheflush.h>
#include <asm/cputype.h>
#include <asm/debug-monitors.h>
@ -893,7 +894,7 @@ static u64 reset_pmevtyper(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
static u64 reset_pmselr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
{
reset_unknown(vcpu, r);
__vcpu_sys_reg(vcpu, r->reg) &= ARMV8_PMU_COUNTER_MASK;
__vcpu_sys_reg(vcpu, r->reg) &= PMSELR_EL0_SEL_MASK;
return __vcpu_sys_reg(vcpu, r->reg);
}
@ -985,7 +986,7 @@ static bool access_pmselr(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
else
/* return PMSELR.SEL field */
p->regval = __vcpu_sys_reg(vcpu, PMSELR_EL0)
& ARMV8_PMU_COUNTER_MASK;
& PMSELR_EL0_SEL_MASK;
return true;
}
@ -1053,8 +1054,8 @@ static bool access_pmu_evcntr(struct kvm_vcpu *vcpu,
if (pmu_access_event_counter_el0_disabled(vcpu))
return false;
idx = __vcpu_sys_reg(vcpu, PMSELR_EL0)
& ARMV8_PMU_COUNTER_MASK;
idx = SYS_FIELD_GET(PMSELR_EL0, SEL,
__vcpu_sys_reg(vcpu, PMSELR_EL0));
} else if (r->Op2 == 0) {
/* PMCCNTR_EL0 */
if (pmu_access_cycle_counter_el0_disabled(vcpu))
@ -1104,7 +1105,7 @@ static bool access_pmu_evtyper(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
if (r->CRn == 9 && r->CRm == 13 && r->Op2 == 1) {
/* PMXEVTYPER_EL0 */
idx = __vcpu_sys_reg(vcpu, PMSELR_EL0) & ARMV8_PMU_COUNTER_MASK;
idx = SYS_FIELD_GET(PMSELR_EL0, SEL, __vcpu_sys_reg(vcpu, PMSELR_EL0));
reg = PMEVTYPER0_EL0 + idx;
} else if (r->CRn == 14 && (r->CRm & 12) == 12) {
idx = ((r->CRm & 3) << 3) | (r->Op2 & 7);
@ -1562,6 +1563,9 @@ static u64 __kvm_read_sanitised_id_reg(const struct kvm_vcpu *vcpu,
case SYS_ID_AA64MMFR2_EL1:
val &= ~ID_AA64MMFR2_EL1_CCIDX_MASK;
break;
case SYS_ID_AA64MMFR3_EL1:
val &= ID_AA64MMFR3_EL1_TCRX | ID_AA64MMFR3_EL1_S1POE;
break;
case SYS_ID_MMFR4_EL1:
val &= ~ARM64_FEATURE_MASK(ID_MMFR4_EL1_CCIDX);
break;
@ -2261,6 +2265,15 @@ static bool access_zcr_el2(struct kvm_vcpu *vcpu,
return true;
}
static unsigned int s1poe_visibility(const struct kvm_vcpu *vcpu,
const struct sys_reg_desc *rd)
{
if (kvm_has_feat(vcpu->kvm, ID_AA64MMFR3_EL1, S1POE, IMP))
return 0;
return REG_HIDDEN;
}
/*
* Architected system registers.
* Important: Must be sorted ascending by Op0, Op1, CRn, CRm, Op2
@ -2424,7 +2437,8 @@ static const struct sys_reg_desc sys_reg_descs[] = {
ID_AA64MMFR2_EL1_IDS |
ID_AA64MMFR2_EL1_NV |
ID_AA64MMFR2_EL1_CCIDX)),
ID_SANITISED(ID_AA64MMFR3_EL1),
ID_WRITABLE(ID_AA64MMFR3_EL1, (ID_AA64MMFR3_EL1_TCRX |
ID_AA64MMFR3_EL1_S1POE)),
ID_SANITISED(ID_AA64MMFR4_EL1),
ID_UNALLOCATED(7,5),
ID_UNALLOCATED(7,6),
@ -2498,6 +2512,8 @@ static const struct sys_reg_desc sys_reg_descs[] = {
{ SYS_DESC(SYS_MAIR_EL1), access_vm_reg, reset_unknown, MAIR_EL1 },
{ SYS_DESC(SYS_PIRE0_EL1), NULL, reset_unknown, PIRE0_EL1 },
{ SYS_DESC(SYS_PIR_EL1), NULL, reset_unknown, PIR_EL1 },
{ SYS_DESC(SYS_POR_EL1), NULL, reset_unknown, POR_EL1,
.visibility = s1poe_visibility },
{ SYS_DESC(SYS_AMAIR_EL1), access_vm_reg, reset_amair_el1, AMAIR_EL1 },
{ SYS_DESC(SYS_LORSA_EL1), trap_loregion },
@ -2584,6 +2600,8 @@ static const struct sys_reg_desc sys_reg_descs[] = {
.access = access_pmovs, .reg = PMOVSSET_EL0,
.get_user = get_pmreg, .set_user = set_pmreg },
{ SYS_DESC(SYS_POR_EL0), NULL, reset_unknown, POR_EL0,
.visibility = s1poe_visibility },
{ SYS_DESC(SYS_TPIDR_EL0), NULL, reset_unknown, TPIDR_EL0 },
{ SYS_DESC(SYS_TPIDRRO_EL0), NULL, reset_unknown, TPIDRRO_EL0 },
{ SYS_DESC(SYS_TPIDR2_EL0), undef_access },
@ -4574,8 +4592,6 @@ void kvm_calculate_traps(struct kvm_vcpu *vcpu)
kvm->arch.fgu[HFGxTR_GROUP] = (HFGxTR_EL2_nAMAIR2_EL1 |
HFGxTR_EL2_nMAIR2_EL1 |
HFGxTR_EL2_nS2POR_EL1 |
HFGxTR_EL2_nPOR_EL1 |
HFGxTR_EL2_nPOR_EL0 |
HFGxTR_EL2_nACCDATA_EL1 |
HFGxTR_EL2_nSMPRI_EL1_MASK |
HFGxTR_EL2_nTPIDR2_EL0_MASK);
@ -4610,6 +4626,10 @@ void kvm_calculate_traps(struct kvm_vcpu *vcpu)
kvm->arch.fgu[HFGxTR_GROUP] |= (HFGxTR_EL2_nPIRE0_EL1 |
HFGxTR_EL2_nPIR_EL1);
if (!kvm_has_feat(kvm, ID_AA64MMFR3_EL1, S1POE, IMP))
kvm->arch.fgu[HFGxTR_GROUP] |= (HFGxTR_EL2_nPOR_EL1 |
HFGxTR_EL2_nPOR_EL0);
if (!kvm_has_feat(kvm, ID_AA64PFR0_EL1, AMU, IMP))
kvm->arch.fgu[HAFGRTR_GROUP] |= ~(HAFGRTR_EL2_RES0 |
HAFGRTR_EL2_RES1);

View File

@ -1,7 +1,7 @@
# SPDX-License-Identifier: GPL-2.0
obj-y := dma-mapping.o extable.o fault.o init.o \
cache.o copypage.o flush.o \
ioremap.o mmap.o pgd.o mmu.o \
ioremap.o mmap.o pgd.o mem_encrypt.o mmu.o \
context.o proc.o pageattr.o fixmap.o
obj-$(CONFIG_ARM64_CONTPTE) += contpte.o
obj-$(CONFIG_HUGETLB_PAGE) += hugetlbpage.o

View File

@ -421,6 +421,12 @@ int contpte_ptep_set_access_flags(struct vm_area_struct *vma,
ptep = contpte_align_down(ptep);
start_addr = addr = ALIGN_DOWN(addr, CONT_PTE_SIZE);
/*
* We are not advancing entry because __ptep_set_access_flags()
* only consumes access flags from entry. And since we have checked
* for the whole contpte block and returned early, pte_same()
* within __ptep_set_access_flags() is likely false.
*/
for (i = 0; i < CONT_PTES; i++, ptep++, addr += PAGE_SIZE)
__ptep_set_access_flags(vma, addr, ptep, entry, 0);

View File

@ -23,6 +23,7 @@
#include <linux/sched/debug.h>
#include <linux/highmem.h>
#include <linux/perf_event.h>
#include <linux/pkeys.h>
#include <linux/preempt.h>
#include <linux/hugetlb.h>
@ -486,6 +487,23 @@ static void do_bad_area(unsigned long far, unsigned long esr,
}
}
static bool fault_from_pkey(unsigned long esr, struct vm_area_struct *vma,
unsigned int mm_flags)
{
unsigned long iss2 = ESR_ELx_ISS2(esr);
if (!system_supports_poe())
return false;
if (esr_fsc_is_permission_fault(esr) && (iss2 & ESR_ELx_Overlay))
return true;
return !arch_vma_access_permitted(vma,
mm_flags & FAULT_FLAG_WRITE,
mm_flags & FAULT_FLAG_INSTRUCTION,
false);
}
static bool is_el0_instruction_abort(unsigned long esr)
{
return ESR_ELx_EC(esr) == ESR_ELx_EC_IABT_LOW;
@ -511,6 +529,7 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
unsigned long addr = untagged_addr(far);
struct vm_area_struct *vma;
int si_code;
int pkey = -1;
if (kprobe_page_fault(regs, esr))
return 0;
@ -575,6 +594,16 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
goto bad_area;
}
if (fault_from_pkey(esr, vma, mm_flags)) {
pkey = vma_pkey(vma);
vma_end_read(vma);
fault = 0;
si_code = SEGV_PKUERR;
count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
goto bad_area;
}
fault = handle_mm_fault(vma, addr, mm_flags | FAULT_FLAG_VMA_LOCK, regs);
if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
vma_end_read(vma);
@ -610,7 +639,16 @@ retry:
goto bad_area;
}
if (fault_from_pkey(esr, vma, mm_flags)) {
pkey = vma_pkey(vma);
mmap_read_unlock(mm);
fault = 0;
si_code = SEGV_PKUERR;
goto bad_area;
}
fault = handle_mm_fault(vma, addr, mm_flags, regs);
/* Quick path to respond to signals */
if (fault_signal_pending(fault, regs)) {
if (!user_mode(regs))
@ -669,7 +707,22 @@ bad_area:
arm64_force_sig_mceerr(BUS_MCEERR_AR, far, lsb, inf->name);
} else {
/*
* The pkey value that we return to userspace can be different
* from the pkey that caused the fault.
*
* 1. T1 : mprotect_key(foo, PAGE_SIZE, pkey=4);
* 2. T1 : set POR_EL0 to deny access to pkey=4, touches, page
* 3. T1 : faults...
* 4. T2: mprotect_key(foo, PAGE_SIZE, pkey=5);
* 5. T1 : enters fault handler, takes mmap_lock, etc...
* 6. T1 : reaches here, sees vma_pkey(vma)=5, when we really
* faulted on a pte with its pkey=4.
*/
/* Something tried to access memory that out of memory map */
if (si_code == SEGV_PKUERR)
arm64_force_sig_fault_pkey(far, inf->name, pkey);
else
arm64_force_sig_fault(SIGSEGV, si_code, far, inf->name);
}

View File

@ -414,8 +414,16 @@ void __init mem_init(void)
void free_initmem(void)
{
free_reserved_area(lm_alias(__init_begin),
lm_alias(__init_end),
void *lm_init_begin = lm_alias(__init_begin);
void *lm_init_end = lm_alias(__init_end);
WARN_ON(!IS_ALIGNED((unsigned long)lm_init_begin, PAGE_SIZE));
WARN_ON(!IS_ALIGNED((unsigned long)lm_init_end, PAGE_SIZE));
/* Delete __init region from memblock.reserved. */
memblock_free(lm_init_begin, lm_init_end - lm_init_begin);
free_reserved_area(lm_init_begin, lm_init_end,
POISON_FREE_INITMEM, "unused kernel");
/*
* Unmap the __init region but leave the VM area in place. This

View File

@ -3,10 +3,22 @@
#include <linux/mm.h>
#include <linux/io.h>
static ioremap_prot_hook_t ioremap_prot_hook;
int arm64_ioremap_prot_hook_register(ioremap_prot_hook_t hook)
{
if (WARN_ON(ioremap_prot_hook))
return -EBUSY;
ioremap_prot_hook = hook;
return 0;
}
void __iomem *ioremap_prot(phys_addr_t phys_addr, size_t size,
unsigned long prot)
{
unsigned long last_addr = phys_addr + size - 1;
pgprot_t pgprot = __pgprot(prot);
/* Don't allow outside PHYS_MASK */
if (last_addr & ~PHYS_MASK)
@ -16,7 +28,16 @@ void __iomem *ioremap_prot(phys_addr_t phys_addr, size_t size,
if (WARN_ON(pfn_is_map_memory(__phys_to_pfn(phys_addr))))
return NULL;
return generic_ioremap_prot(phys_addr, size, __pgprot(prot));
/*
* If a hook is registered (e.g. for confidential computing
* purposes), call that now and barf if it fails.
*/
if (unlikely(ioremap_prot_hook) &&
WARN_ON(ioremap_prot_hook(phys_addr, size, &pgprot))) {
return NULL;
}
return generic_ioremap_prot(phys_addr, size, pgprot);
}
EXPORT_SYMBOL(ioremap_prot);

View File

@ -0,0 +1,50 @@
// SPDX-License-Identifier: GPL-2.0-only
/*
* Implementation of the memory encryption/decryption API.
*
* Since the low-level details of the operation depend on the
* Confidential Computing environment (e.g. pKVM, CCA, ...), this just
* acts as a top-level dispatcher to whatever hooks may have been
* registered.
*
* Author: Will Deacon <will@kernel.org>
* Copyright (C) 2024 Google LLC
*
* "Hello, boils and ghouls!"
*/
#include <linux/bug.h>
#include <linux/compiler.h>
#include <linux/err.h>
#include <linux/mm.h>
#include <asm/mem_encrypt.h>
static const struct arm64_mem_crypt_ops *crypt_ops;
int arm64_mem_crypt_ops_register(const struct arm64_mem_crypt_ops *ops)
{
if (WARN_ON(crypt_ops))
return -EBUSY;
crypt_ops = ops;
return 0;
}
int set_memory_encrypted(unsigned long addr, int numpages)
{
if (likely(!crypt_ops) || WARN_ON(!PAGE_ALIGNED(addr)))
return 0;
return crypt_ops->encrypt(addr, numpages);
}
EXPORT_SYMBOL_GPL(set_memory_encrypted);
int set_memory_decrypted(unsigned long addr, int numpages)
{
if (likely(!crypt_ops) || WARN_ON(!PAGE_ALIGNED(addr)))
return 0;
return crypt_ops->decrypt(addr, numpages);
}
EXPORT_SYMBOL_GPL(set_memory_decrypted);

View File

@ -102,6 +102,17 @@ pgprot_t vm_get_page_prot(unsigned long vm_flags)
if (vm_flags & VM_MTE)
prot |= PTE_ATTRINDX(MT_NORMAL_TAGGED);
#ifdef CONFIG_ARCH_HAS_PKEYS
if (system_supports_poe()) {
if (vm_flags & VM_PKEY_BIT0)
prot |= PTE_PO_IDX_0;
if (vm_flags & VM_PKEY_BIT1)
prot |= PTE_PO_IDX_1;
if (vm_flags & VM_PKEY_BIT2)
prot |= PTE_PO_IDX_2;
}
#endif
return __pgprot(prot);
}
EXPORT_SYMBOL(vm_get_page_prot);

View File

@ -25,6 +25,7 @@
#include <linux/vmalloc.h>
#include <linux/set_memory.h>
#include <linux/kfence.h>
#include <linux/pkeys.h>
#include <asm/barrier.h>
#include <asm/cputype.h>
@ -1549,3 +1550,47 @@ void __cpu_replace_ttbr1(pgd_t *pgdp, bool cnp)
cpu_uninstall_idmap();
}
#ifdef CONFIG_ARCH_HAS_PKEYS
int arch_set_user_pkey_access(struct task_struct *tsk, int pkey, unsigned long init_val)
{
u64 new_por = POE_RXW;
u64 old_por;
u64 pkey_shift;
if (!system_supports_poe())
return -ENOSPC;
/*
* This code should only be called with valid 'pkey'
* values originating from in-kernel users. Complain
* if a bad value is observed.
*/
if (WARN_ON_ONCE(pkey >= arch_max_pkey()))
return -EINVAL;
/* Set the bits we need in POR: */
new_por = POE_RXW;
if (init_val & PKEY_DISABLE_WRITE)
new_por &= ~POE_W;
if (init_val & PKEY_DISABLE_ACCESS)
new_por &= ~POE_RW;
if (init_val & PKEY_DISABLE_READ)
new_por &= ~POE_R;
if (init_val & PKEY_DISABLE_EXECUTE)
new_por &= ~POE_X;
/* Shift the bits in to the correct place in POR for pkey: */
pkey_shift = pkey * POR_BITS_PER_PKEY;
new_por <<= pkey_shift;
/* Get old POR and mask off any old bits in place: */
old_por = read_sysreg_s(SYS_POR_EL0);
old_por &= ~(POE_MASK << pkey_shift);
/* Write old part along with new part: */
write_sysreg_s(old_por | new_por, SYS_POR_EL0);
return 0;
}
#endif

View File

@ -36,8 +36,6 @@
#define TCR_KASLR_FLAGS 0
#endif
#define TCR_SMP_FLAGS TCR_SHARED
/* PTWs cacheable, inner/outer WBWA */
#define TCR_CACHE_FLAGS TCR_IRGN_WBWA | TCR_ORGN_WBWA
@ -469,7 +467,7 @@ SYM_FUNC_START(__cpu_setup)
tcr .req x16
mov_q mair, MAIR_EL1_SET
mov_q tcr, TCR_T0SZ(IDMAP_VA_BITS) | TCR_T1SZ(VA_BITS_MIN) | TCR_CACHE_FLAGS | \
TCR_SMP_FLAGS | TCR_TG_FLAGS | TCR_KASLR_FLAGS | TCR_ASID16 | \
TCR_SHARED | TCR_TG_FLAGS | TCR_KASLR_FLAGS | TCR_ASID16 | \
TCR_TBI0 | TCR_A1 | TCR_KASAN_SW_FLAGS | TCR_MTE_FLAGS
tcr_clear_errata_bits tcr, x9, x5

View File

@ -42,14 +42,16 @@ static void _copy_pte(pte_t *dst_ptep, pte_t *src_ptep, unsigned long addr)
* the temporary mappings we use during restore.
*/
__set_pte(dst_ptep, pte_mkwrite_novma(pte));
} else if ((debug_pagealloc_enabled() ||
is_kfence_address((void *)addr)) && !pte_none(pte)) {
} else if (!pte_none(pte)) {
/*
* debug_pagealloc will removed the PTE_VALID bit if
* the page isn't in use by the resume kernel. It may have
* been in use by the original kernel, in which case we need
* to put it back in our copy to do the restore.
*
* Other cases include kfence / vmalloc / memfd_secret which
* may call `set_direct_map_invalid_noflush()`.
*
* Before marking this entry valid, check the pfn should
* be mapped.
*/

View File

@ -45,6 +45,7 @@ HAS_MOPS
HAS_NESTED_VIRT
HAS_PAN
HAS_S1PIE
HAS_S1POE
HAS_RAS_EXTN
HAS_RNG
HAS_SB

View File

@ -2029,6 +2029,31 @@ Sysreg FAR_EL1 3 0 6 0 0
Field 63:0 ADDR
EndSysreg
Sysreg PMICNTR_EL0 3 3 9 4 0
Field 63:0 ICNT
EndSysreg
Sysreg PMICFILTR_EL0 3 3 9 6 0
Res0 63:59
Field 58 SYNC
Field 57:56 VS
Res0 55:32
Field 31 P
Field 30 U
Field 29 NSK
Field 28 NSU
Field 27 NSH
Field 26 M
Res0 25
Field 24 SH
Field 23 T
Field 22 RLK
Field 21 RLU
Field 20 RLH
Res0 19:16
Field 15:0 evtCount
EndSysreg
Sysreg PMSCR_EL1 3 0 9 9 0
Res0 63:8
Field 7:6 PCT
@ -2153,6 +2178,11 @@ Field 4 P
Field 3:0 ALIGN
EndSysreg
Sysreg PMSELR_EL0 3 3 9 12 5
Res0 63:5
Field 4:0 SEL
EndSysreg
SysregFields CONTEXTIDR_ELx
Res0 63:32
Field 31:0 PROCID

View File

@ -1026,6 +1026,10 @@ config PPC_MEM_KEYS
If unsure, say y.
config ARCH_PKEY_BITS
int
default 5
config PPC_SECURE_BOOT
prompt "Enable secure boot support"
bool

View File

@ -1889,6 +1889,10 @@ config X86_INTEL_MEMORY_PROTECTION_KEYS
If unsure, say y.
config ARCH_PKEY_BITS
int
default 4
choice
prompt "TSX enable mode"
depends on CPU_SUP_INTEL

View File

@ -822,7 +822,7 @@ static struct iommu_iort_rmr_data *iort_rmr_alloc(
return NULL;
/* Create a copy of SIDs array to associate with this rmr_data */
sids_copy = kmemdup(sids, num_sids * sizeof(*sids), GFP_KERNEL);
sids_copy = kmemdup_array(sids, num_sids, sizeof(*sids), GFP_KERNEL);
if (!sids_copy) {
kfree(rmr_data);
return NULL;
@ -1703,6 +1703,13 @@ static struct acpi_platform_list pmcg_plat_info[] __initdata = {
/* HiSilicon Hip09 Platform */
{"HISI ", "HIP09 ", 0, ACPI_SIG_IORT, greater_than_or_equal,
"Erratum #162001900", IORT_SMMU_V3_PMCG_HISI_HIP09},
/* HiSilicon Hip10/11 Platform uses the same SMMU IP with Hip09 */
{"HISI ", "HIP10 ", 0, ACPI_SIG_IORT, greater_than_or_equal,
"Erratum #162001900", IORT_SMMU_V3_PMCG_HISI_HIP09},
{"HISI ", "HIP10C ", 0, ACPI_SIG_IORT, greater_than_or_equal,
"Erratum #162001900", IORT_SMMU_V3_PMCG_HISI_HIP09},
{"HISI ", "HIP11 ", 0, ACPI_SIG_IORT, greater_than_or_equal,
"Erratum #162001900", IORT_SMMU_V3_PMCG_HISI_HIP09},
{ }
};

View File

@ -39,6 +39,8 @@ void __init kvm_init_hyp_services(void)
pr_info("hypervisor services detected (0x%08lx 0x%08lx 0x%08lx 0x%08lx)\n",
res.a3, res.a2, res.a1, res.a0);
kvm_arch_init_hyp_services();
}
bool kvm_arm_hyp_service_available(u32 func_id)

View File

@ -48,6 +48,13 @@ config ARM_CMN
Support for PMU events monitoring on the Arm CMN-600 Coherent Mesh
Network interconnect.
config ARM_NI
tristate "Arm NI-700 PMU support"
depends on ARM64 || COMPILE_TEST
help
Support for PMU events monitoring on the Arm NI-700 Network-on-Chip
interconnect and family.
config ARM_PMU
depends on ARM || ARM64
bool "ARM PMU framework"

View File

@ -3,6 +3,7 @@ obj-$(CONFIG_ARM_CCI_PMU) += arm-cci.o
obj-$(CONFIG_ARM_CCN) += arm-ccn.o
obj-$(CONFIG_ARM_CMN) += arm-cmn.o
obj-$(CONFIG_ARM_DSU_PMU) += arm_dsu_pmu.o
obj-$(CONFIG_ARM_NI) += arm-ni.o
obj-$(CONFIG_ARM_PMU) += arm_pmu.o arm_pmu_platform.o
obj-$(CONFIG_ARM_PMU_ACPI) += arm_pmu_acpi.o
obj-$(CONFIG_ARM_PMUV3) += arm_pmuv3.o

View File

@ -400,7 +400,7 @@ static irqreturn_t ali_drw_pmu_isr(int irq_num, void *data)
}
/* clear common counter intr status */
clr_status = FIELD_PREP(ALI_DRW_PMCOM_CNT_OV_INTR_MASK, 1);
clr_status = FIELD_PREP(ALI_DRW_PMCOM_CNT_OV_INTR_MASK, status);
writel(clr_status,
drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_CLR);
}

View File

@ -47,33 +47,66 @@
* implementations, we'll have to introduce per cpu-type tables.
*/
enum m1_pmu_events {
M1_PMU_PERFCTR_UNKNOWN_01 = 0x01,
M1_PMU_PERFCTR_CPU_CYCLES = 0x02,
M1_PMU_PERFCTR_INSTRUCTIONS = 0x8c,
M1_PMU_PERFCTR_UNKNOWN_8d = 0x8d,
M1_PMU_PERFCTR_UNKNOWN_8e = 0x8e,
M1_PMU_PERFCTR_UNKNOWN_8f = 0x8f,
M1_PMU_PERFCTR_UNKNOWN_90 = 0x90,
M1_PMU_PERFCTR_UNKNOWN_93 = 0x93,
M1_PMU_PERFCTR_UNKNOWN_94 = 0x94,
M1_PMU_PERFCTR_UNKNOWN_95 = 0x95,
M1_PMU_PERFCTR_UNKNOWN_96 = 0x96,
M1_PMU_PERFCTR_UNKNOWN_97 = 0x97,
M1_PMU_PERFCTR_UNKNOWN_98 = 0x98,
M1_PMU_PERFCTR_UNKNOWN_99 = 0x99,
M1_PMU_PERFCTR_UNKNOWN_9a = 0x9a,
M1_PMU_PERFCTR_UNKNOWN_9b = 0x9b,
M1_PMU_PERFCTR_UNKNOWN_9c = 0x9c,
M1_PMU_PERFCTR_RETIRE_UOP = 0x1,
M1_PMU_PERFCTR_CORE_ACTIVE_CYCLE = 0x2,
M1_PMU_PERFCTR_L1I_TLB_FILL = 0x4,
M1_PMU_PERFCTR_L1D_TLB_FILL = 0x5,
M1_PMU_PERFCTR_MMU_TABLE_WALK_INSTRUCTION = 0x7,
M1_PMU_PERFCTR_MMU_TABLE_WALK_DATA = 0x8,
M1_PMU_PERFCTR_L2_TLB_MISS_INSTRUCTION = 0xa,
M1_PMU_PERFCTR_L2_TLB_MISS_DATA = 0xb,
M1_PMU_PERFCTR_MMU_VIRTUAL_MEMORY_FAULT_NONSPEC = 0xd,
M1_PMU_PERFCTR_SCHEDULE_UOP = 0x52,
M1_PMU_PERFCTR_INTERRUPT_PENDING = 0x6c,
M1_PMU_PERFCTR_MAP_STALL_DISPATCH = 0x70,
M1_PMU_PERFCTR_MAP_REWIND = 0x75,
M1_PMU_PERFCTR_MAP_STALL = 0x76,
M1_PMU_PERFCTR_MAP_INT_UOP = 0x7c,
M1_PMU_PERFCTR_MAP_LDST_UOP = 0x7d,
M1_PMU_PERFCTR_MAP_SIMD_UOP = 0x7e,
M1_PMU_PERFCTR_FLUSH_RESTART_OTHER_NONSPEC = 0x84,
M1_PMU_PERFCTR_INST_ALL = 0x8c,
M1_PMU_PERFCTR_INST_BRANCH = 0x8d,
M1_PMU_PERFCTR_INST_BRANCH_CALL = 0x8e,
M1_PMU_PERFCTR_INST_BRANCH_RET = 0x8f,
M1_PMU_PERFCTR_INST_BRANCH_TAKEN = 0x90,
M1_PMU_PERFCTR_INST_BRANCH_INDIR = 0x93,
M1_PMU_PERFCTR_INST_BRANCH_COND = 0x94,
M1_PMU_PERFCTR_INST_INT_LD = 0x95,
M1_PMU_PERFCTR_INST_INT_ST = 0x96,
M1_PMU_PERFCTR_INST_INT_ALU = 0x97,
M1_PMU_PERFCTR_INST_SIMD_LD = 0x98,
M1_PMU_PERFCTR_INST_SIMD_ST = 0x99,
M1_PMU_PERFCTR_INST_SIMD_ALU = 0x9a,
M1_PMU_PERFCTR_INST_LDST = 0x9b,
M1_PMU_PERFCTR_INST_BARRIER = 0x9c,
M1_PMU_PERFCTR_UNKNOWN_9f = 0x9f,
M1_PMU_PERFCTR_UNKNOWN_bf = 0xbf,
M1_PMU_PERFCTR_UNKNOWN_c0 = 0xc0,
M1_PMU_PERFCTR_UNKNOWN_c1 = 0xc1,
M1_PMU_PERFCTR_UNKNOWN_c4 = 0xc4,
M1_PMU_PERFCTR_UNKNOWN_c5 = 0xc5,
M1_PMU_PERFCTR_UNKNOWN_c6 = 0xc6,
M1_PMU_PERFCTR_UNKNOWN_c8 = 0xc8,
M1_PMU_PERFCTR_UNKNOWN_ca = 0xca,
M1_PMU_PERFCTR_UNKNOWN_cb = 0xcb,
M1_PMU_PERFCTR_L1D_TLB_ACCESS = 0xa0,
M1_PMU_PERFCTR_L1D_TLB_MISS = 0xa1,
M1_PMU_PERFCTR_L1D_CACHE_MISS_ST = 0xa2,
M1_PMU_PERFCTR_L1D_CACHE_MISS_LD = 0xa3,
M1_PMU_PERFCTR_LD_UNIT_UOP = 0xa6,
M1_PMU_PERFCTR_ST_UNIT_UOP = 0xa7,
M1_PMU_PERFCTR_L1D_CACHE_WRITEBACK = 0xa8,
M1_PMU_PERFCTR_LDST_X64_UOP = 0xb1,
M1_PMU_PERFCTR_LDST_XPG_UOP = 0xb2,
M1_PMU_PERFCTR_ATOMIC_OR_EXCLUSIVE_SUCC = 0xb3,
M1_PMU_PERFCTR_ATOMIC_OR_EXCLUSIVE_FAIL = 0xb4,
M1_PMU_PERFCTR_L1D_CACHE_MISS_LD_NONSPEC = 0xbf,
M1_PMU_PERFCTR_L1D_CACHE_MISS_ST_NONSPEC = 0xc0,
M1_PMU_PERFCTR_L1D_TLB_MISS_NONSPEC = 0xc1,
M1_PMU_PERFCTR_ST_MEMORY_ORDER_VIOLATION_NONSPEC = 0xc4,
M1_PMU_PERFCTR_BRANCH_COND_MISPRED_NONSPEC = 0xc5,
M1_PMU_PERFCTR_BRANCH_INDIR_MISPRED_NONSPEC = 0xc6,
M1_PMU_PERFCTR_BRANCH_RET_INDIR_MISPRED_NONSPEC = 0xc8,
M1_PMU_PERFCTR_BRANCH_CALL_INDIR_MISPRED_NONSPEC = 0xca,
M1_PMU_PERFCTR_BRANCH_MISPRED_NONSPEC = 0xcb,
M1_PMU_PERFCTR_L1I_TLB_MISS_DEMAND = 0xd4,
M1_PMU_PERFCTR_MAP_DISPATCH_BUBBLE = 0xd6,
M1_PMU_PERFCTR_L1I_CACHE_MISS_DEMAND = 0xdb,
M1_PMU_PERFCTR_FETCH_RESTART = 0xde,
M1_PMU_PERFCTR_ST_NT_UOP = 0xe5,
M1_PMU_PERFCTR_LD_NT_UOP = 0xe6,
M1_PMU_PERFCTR_UNKNOWN_f5 = 0xf5,
M1_PMU_PERFCTR_UNKNOWN_f6 = 0xf6,
M1_PMU_PERFCTR_UNKNOWN_f7 = 0xf7,
@ -97,33 +130,33 @@ enum m1_pmu_events {
*/
static const u16 m1_pmu_event_affinity[M1_PMU_PERFCTR_LAST + 1] = {
[0 ... M1_PMU_PERFCTR_LAST] = ANY_BUT_0_1,
[M1_PMU_PERFCTR_UNKNOWN_01] = BIT(7),
[M1_PMU_PERFCTR_CPU_CYCLES] = ANY_BUT_0_1 | BIT(0),
[M1_PMU_PERFCTR_INSTRUCTIONS] = BIT(7) | BIT(1),
[M1_PMU_PERFCTR_UNKNOWN_8d] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_8e] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_8f] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_90] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_93] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_94] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_95] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_96] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_97] = BIT(7),
[M1_PMU_PERFCTR_UNKNOWN_98] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_99] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_9a] = BIT(7),
[M1_PMU_PERFCTR_UNKNOWN_9b] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_9c] = ONLY_5_6_7,
[M1_PMU_PERFCTR_RETIRE_UOP] = BIT(7),
[M1_PMU_PERFCTR_CORE_ACTIVE_CYCLE] = ANY_BUT_0_1 | BIT(0),
[M1_PMU_PERFCTR_INST_ALL] = BIT(7) | BIT(1),
[M1_PMU_PERFCTR_INST_BRANCH] = ONLY_5_6_7,
[M1_PMU_PERFCTR_INST_BRANCH_CALL] = ONLY_5_6_7,
[M1_PMU_PERFCTR_INST_BRANCH_RET] = ONLY_5_6_7,
[M1_PMU_PERFCTR_INST_BRANCH_TAKEN] = ONLY_5_6_7,
[M1_PMU_PERFCTR_INST_BRANCH_INDIR] = ONLY_5_6_7,
[M1_PMU_PERFCTR_INST_BRANCH_COND] = ONLY_5_6_7,
[M1_PMU_PERFCTR_INST_INT_LD] = ONLY_5_6_7,
[M1_PMU_PERFCTR_INST_INT_ST] = BIT(7),
[M1_PMU_PERFCTR_INST_INT_ALU] = BIT(7),
[M1_PMU_PERFCTR_INST_SIMD_LD] = ONLY_5_6_7,
[M1_PMU_PERFCTR_INST_SIMD_ST] = ONLY_5_6_7,
[M1_PMU_PERFCTR_INST_SIMD_ALU] = BIT(7),
[M1_PMU_PERFCTR_INST_LDST] = BIT(7),
[M1_PMU_PERFCTR_INST_BARRIER] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_9f] = BIT(7),
[M1_PMU_PERFCTR_UNKNOWN_bf] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_c0] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_c1] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_c4] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_c5] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_c6] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_c8] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_ca] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_cb] = ONLY_5_6_7,
[M1_PMU_PERFCTR_L1D_CACHE_MISS_LD_NONSPEC] = ONLY_5_6_7,
[M1_PMU_PERFCTR_L1D_CACHE_MISS_ST_NONSPEC] = ONLY_5_6_7,
[M1_PMU_PERFCTR_L1D_TLB_MISS_NONSPEC] = ONLY_5_6_7,
[M1_PMU_PERFCTR_ST_MEMORY_ORDER_VIOLATION_NONSPEC] = ONLY_5_6_7,
[M1_PMU_PERFCTR_BRANCH_COND_MISPRED_NONSPEC] = ONLY_5_6_7,
[M1_PMU_PERFCTR_BRANCH_INDIR_MISPRED_NONSPEC] = ONLY_5_6_7,
[M1_PMU_PERFCTR_BRANCH_RET_INDIR_MISPRED_NONSPEC] = ONLY_5_6_7,
[M1_PMU_PERFCTR_BRANCH_CALL_INDIR_MISPRED_NONSPEC] = ONLY_5_6_7,
[M1_PMU_PERFCTR_BRANCH_MISPRED_NONSPEC] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_f5] = ONLY_2_4_6,
[M1_PMU_PERFCTR_UNKNOWN_f6] = ONLY_2_4_6,
[M1_PMU_PERFCTR_UNKNOWN_f7] = ONLY_2_4_6,
@ -133,9 +166,8 @@ static const u16 m1_pmu_event_affinity[M1_PMU_PERFCTR_LAST + 1] = {
static const unsigned m1_pmu_perf_map[PERF_COUNT_HW_MAX] = {
PERF_MAP_ALL_UNSUPPORTED,
[PERF_COUNT_HW_CPU_CYCLES] = M1_PMU_PERFCTR_CPU_CYCLES,
[PERF_COUNT_HW_INSTRUCTIONS] = M1_PMU_PERFCTR_INSTRUCTIONS,
/* No idea about the rest yet */
[PERF_COUNT_HW_CPU_CYCLES] = M1_PMU_PERFCTR_CORE_ACTIVE_CYCLE,
[PERF_COUNT_HW_INSTRUCTIONS] = M1_PMU_PERFCTR_INST_ALL,
};
/* sysfs definitions */
@ -154,8 +186,8 @@ static ssize_t m1_pmu_events_sysfs_show(struct device *dev,
PMU_EVENT_ATTR_ID(name, m1_pmu_events_sysfs_show, config)
static struct attribute *m1_pmu_event_attrs[] = {
M1_PMU_EVENT_ATTR(cycles, M1_PMU_PERFCTR_CPU_CYCLES),
M1_PMU_EVENT_ATTR(instructions, M1_PMU_PERFCTR_INSTRUCTIONS),
M1_PMU_EVENT_ATTR(cycles, M1_PMU_PERFCTR_CORE_ACTIVE_CYCLE),
M1_PMU_EVENT_ATTR(instructions, M1_PMU_PERFCTR_INST_ALL),
NULL,
};
@ -400,7 +432,7 @@ static irqreturn_t m1_pmu_handle_irq(struct arm_pmu *cpu_pmu)
regs = get_irq_regs();
for (idx = 0; idx < cpu_pmu->num_events; idx++) {
for_each_set_bit(idx, cpu_pmu->cntr_mask, M1_PMU_NR_COUNTERS) {
struct perf_event *event = cpuc->events[idx];
struct perf_sample_data data;
@ -560,7 +592,7 @@ static int m1_pmu_init(struct arm_pmu *cpu_pmu, u32 flags)
cpu_pmu->reset = m1_pmu_reset;
cpu_pmu->set_event_filter = m1_pmu_set_event_filter;
cpu_pmu->num_events = M1_PMU_NR_COUNTERS;
bitmap_set(cpu_pmu->cntr_mask, 0, M1_PMU_NR_COUNTERS);
cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] = &m1_pmu_events_attr_group;
cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] = &m1_pmu_format_attr_group;
return 0;

View File

@ -24,14 +24,6 @@
#define CMN_NI_NODE_ID GENMASK_ULL(31, 16)
#define CMN_NI_LOGICAL_ID GENMASK_ULL(47, 32)
#define CMN_NODEID_DEVID(reg) ((reg) & 3)
#define CMN_NODEID_EXT_DEVID(reg) ((reg) & 1)
#define CMN_NODEID_PID(reg) (((reg) >> 2) & 1)
#define CMN_NODEID_EXT_PID(reg) (((reg) >> 1) & 3)
#define CMN_NODEID_1x1_PID(reg) (((reg) >> 2) & 7)
#define CMN_NODEID_X(reg, bits) ((reg) >> (3 + (bits)))
#define CMN_NODEID_Y(reg, bits) (((reg) >> 3) & ((1U << (bits)) - 1))
#define CMN_CHILD_INFO 0x0080
#define CMN_CI_CHILD_COUNT GENMASK_ULL(15, 0)
#define CMN_CI_CHILD_PTR_OFFSET GENMASK_ULL(31, 16)
@ -43,6 +35,9 @@
#define CMN_MAX_XPS (CMN_MAX_DIMENSION * CMN_MAX_DIMENSION)
#define CMN_MAX_DTMS (CMN_MAX_XPS + (CMN_MAX_DIMENSION - 1) * 4)
/* Currently XPs are the node type we can have most of; others top out at 128 */
#define CMN_MAX_NODES_PER_EVENT CMN_MAX_XPS
/* The CFG node has various info besides the discovery tree */
#define CMN_CFGM_PERIPH_ID_01 0x0008
#define CMN_CFGM_PID0_PART_0 GENMASK_ULL(7, 0)
@ -50,24 +45,28 @@
#define CMN_CFGM_PERIPH_ID_23 0x0010
#define CMN_CFGM_PID2_REVISION GENMASK_ULL(7, 4)
#define CMN_CFGM_INFO_GLOBAL 0x900
#define CMN_CFGM_INFO_GLOBAL 0x0900
#define CMN_INFO_MULTIPLE_DTM_EN BIT_ULL(63)
#define CMN_INFO_RSP_VC_NUM GENMASK_ULL(53, 52)
#define CMN_INFO_DAT_VC_NUM GENMASK_ULL(51, 50)
#define CMN_INFO_DEVICE_ISO_ENABLE BIT_ULL(44)
#define CMN_CFGM_INFO_GLOBAL_1 0x908
#define CMN_CFGM_INFO_GLOBAL_1 0x0908
#define CMN_INFO_SNP_VC_NUM GENMASK_ULL(3, 2)
#define CMN_INFO_REQ_VC_NUM GENMASK_ULL(1, 0)
/* XPs also have some local topology info which has uses too */
#define CMN_MXP__CONNECT_INFO(p) (0x0008 + 8 * (p))
#define CMN__CONNECT_INFO_DEVICE_TYPE GENMASK_ULL(4, 0)
#define CMN__CONNECT_INFO_DEVICE_TYPE GENMASK_ULL(5, 0)
#define CMN_MAX_PORTS 6
#define CI700_CONNECT_INFO_P2_5_OFFSET 0x10
/* PMU registers occupy the 3rd 4KB page of each node's region */
#define CMN_PMU_OFFSET 0x2000
/* ...except when they don't :( */
#define CMN_S3_DTM_OFFSET 0xa000
#define CMN_S3_PMU_OFFSET 0xd900
/* For most nodes, this is all there is */
#define CMN_PMU_EVENT_SEL 0x000
@ -78,7 +77,8 @@
/* Technically this is 4 bits wide on DNs, but we only use 2 there anyway */
#define CMN__PMU_OCCUP1_ID GENMASK_ULL(34, 32)
/* HN-Ps are weird... */
/* Some types are designed to coexist with another device in the same node */
#define CMN_CCLA_PMU_EVENT_SEL 0x008
#define CMN_HNP_PMU_EVENT_SEL 0x008
/* DTMs live in the PMU space of XP registers */
@ -123,27 +123,28 @@
/* The DTC node is where the magic happens */
#define CMN_DT_DTC_CTL 0x0a00
#define CMN_DT_DTC_CTL_DT_EN BIT(0)
#define CMN_DT_DTC_CTL_CG_DISABLE BIT(10)
/* DTC counters are paired in 64-bit registers on a 16-byte stride. Yuck */
#define _CMN_DT_CNT_REG(n) ((((n) / 2) * 4 + (n) % 2) * 4)
#define CMN_DT_PMEVCNT(n) (CMN_PMU_OFFSET + _CMN_DT_CNT_REG(n))
#define CMN_DT_PMCCNTR (CMN_PMU_OFFSET + 0x40)
#define CMN_DT_PMEVCNT(dtc, n) ((dtc)->pmu_base + _CMN_DT_CNT_REG(n))
#define CMN_DT_PMCCNTR(dtc) ((dtc)->pmu_base + 0x40)
#define CMN_DT_PMEVCNTSR(n) (CMN_PMU_OFFSET + 0x50 + _CMN_DT_CNT_REG(n))
#define CMN_DT_PMCCNTRSR (CMN_PMU_OFFSET + 0x90)
#define CMN_DT_PMEVCNTSR(dtc, n) ((dtc)->pmu_base + 0x50 + _CMN_DT_CNT_REG(n))
#define CMN_DT_PMCCNTRSR(dtc) ((dtc)->pmu_base + 0x90)
#define CMN_DT_PMCR (CMN_PMU_OFFSET + 0x100)
#define CMN_DT_PMCR(dtc) ((dtc)->pmu_base + 0x100)
#define CMN_DT_PMCR_PMU_EN BIT(0)
#define CMN_DT_PMCR_CNTR_RST BIT(5)
#define CMN_DT_PMCR_OVFL_INTR_EN BIT(6)
#define CMN_DT_PMOVSR (CMN_PMU_OFFSET + 0x118)
#define CMN_DT_PMOVSR_CLR (CMN_PMU_OFFSET + 0x120)
#define CMN_DT_PMOVSR(dtc) ((dtc)->pmu_base + 0x118)
#define CMN_DT_PMOVSR_CLR(dtc) ((dtc)->pmu_base + 0x120)
#define CMN_DT_PMSSR (CMN_PMU_OFFSET + 0x128)
#define CMN_DT_PMSSR(dtc) ((dtc)->pmu_base + 0x128)
#define CMN_DT_PMSSR_SS_STATUS(n) BIT(n)
#define CMN_DT_PMSRR (CMN_PMU_OFFSET + 0x130)
#define CMN_DT_PMSRR(dtc) ((dtc)->pmu_base + 0x130)
#define CMN_DT_PMSRR_SS_REQ BIT(0)
#define CMN_DT_NUM_COUNTERS 8
@ -198,10 +199,11 @@ enum cmn_model {
CMN650 = 2,
CMN700 = 4,
CI700 = 8,
CMNS3 = 16,
/* ...and then we can use bitmap tricks for commonality */
CMN_ANY = -1,
NOT_CMN600 = -2,
CMN_650ON = CMN650 | CMN700,
CMN_650ON = CMN650 | CMN700 | CMNS3,
};
/* Actual part numbers and revision IDs defined by the hardware */
@ -210,6 +212,7 @@ enum cmn_part {
PART_CMN650 = 0x436,
PART_CMN700 = 0x43c,
PART_CI700 = 0x43a,
PART_CMN_S3 = 0x43e,
};
/* CMN-600 r0px shouldn't exist in silicon, thankfully */
@ -261,6 +264,7 @@ enum cmn_node_type {
CMN_TYPE_HNS = 0x200,
CMN_TYPE_HNS_MPAM_S,
CMN_TYPE_HNS_MPAM_NS,
CMN_TYPE_APB = 0x1000,
/* Not a real node type */
CMN_TYPE_WP = 0x7770
};
@ -280,8 +284,11 @@ struct arm_cmn_node {
u16 id, logid;
enum cmn_node_type type;
/* XP properties really, but replicated to children for convenience */
u8 dtm;
s8 dtc;
u8 portid_bits:4;
u8 deviceid_bits:4;
/* DN/HN-F/CXHA */
struct {
u8 val : 4;
@ -307,8 +314,9 @@ struct arm_cmn_dtm {
struct arm_cmn_dtc {
void __iomem *base;
void __iomem *pmu_base;
int irq;
int irq_friend;
s8 irq_friend;
bool cc_active;
struct perf_event *counters[CMN_DT_NUM_COUNTERS];
@ -357,49 +365,33 @@ struct arm_cmn {
static int arm_cmn_hp_state;
struct arm_cmn_nodeid {
u8 x;
u8 y;
u8 port;
u8 dev;
};
static int arm_cmn_xyidbits(const struct arm_cmn *cmn)
{
return fls((cmn->mesh_x - 1) | (cmn->mesh_y - 1) | 2);
return fls((cmn->mesh_x - 1) | (cmn->mesh_y - 1));
}
static struct arm_cmn_nodeid arm_cmn_nid(const struct arm_cmn *cmn, u16 id)
static struct arm_cmn_nodeid arm_cmn_nid(const struct arm_cmn_node *dn)
{
struct arm_cmn_nodeid nid;
if (cmn->num_xps == 1) {
nid.x = 0;
nid.y = 0;
nid.port = CMN_NODEID_1x1_PID(id);
nid.dev = CMN_NODEID_DEVID(id);
} else {
int bits = arm_cmn_xyidbits(cmn);
nid.x = CMN_NODEID_X(id, bits);
nid.y = CMN_NODEID_Y(id, bits);
if (cmn->ports_used & 0xc) {
nid.port = CMN_NODEID_EXT_PID(id);
nid.dev = CMN_NODEID_EXT_DEVID(id);
} else {
nid.port = CMN_NODEID_PID(id);
nid.dev = CMN_NODEID_DEVID(id);
}
}
nid.dev = dn->id & ((1U << dn->deviceid_bits) - 1);
nid.port = (dn->id >> dn->deviceid_bits) & ((1U << dn->portid_bits) - 1);
return nid;
}
static struct arm_cmn_node *arm_cmn_node_to_xp(const struct arm_cmn *cmn,
const struct arm_cmn_node *dn)
{
struct arm_cmn_nodeid nid = arm_cmn_nid(cmn, dn->id);
int xp_idx = cmn->mesh_x * nid.y + nid.x;
int id = dn->id >> (dn->portid_bits + dn->deviceid_bits);
int bits = arm_cmn_xyidbits(cmn);
int x = id >> bits;
int y = id & ((1U << bits) - 1);
return cmn->xps + xp_idx;
return cmn->xps + cmn->mesh_x * y + x;
}
static struct arm_cmn_node *arm_cmn_node(const struct arm_cmn *cmn,
enum cmn_node_type type)
@ -423,15 +415,27 @@ static enum cmn_model arm_cmn_model(const struct arm_cmn *cmn)
return CMN700;
case PART_CI700:
return CI700;
case PART_CMN_S3:
return CMNS3;
default:
return 0;
};
}
static int arm_cmn_pmu_offset(const struct arm_cmn *cmn, const struct arm_cmn_node *dn)
{
if (cmn->part == PART_CMN_S3) {
if (dn->type == CMN_TYPE_XP)
return CMN_S3_DTM_OFFSET;
return CMN_S3_PMU_OFFSET;
}
return CMN_PMU_OFFSET;
}
static u32 arm_cmn_device_connect_info(const struct arm_cmn *cmn,
const struct arm_cmn_node *xp, int port)
{
int offset = CMN_MXP__CONNECT_INFO(port);
int offset = CMN_MXP__CONNECT_INFO(port) - arm_cmn_pmu_offset(cmn, xp);
if (port >= 2) {
if (cmn->part == PART_CMN600 || cmn->part == PART_CMN650)
@ -444,7 +448,7 @@ static u32 arm_cmn_device_connect_info(const struct arm_cmn *cmn,
offset += CI700_CONNECT_INFO_P2_5_OFFSET;
}
return readl_relaxed(xp->pmu_base - CMN_PMU_OFFSET + offset);
return readl_relaxed(xp->pmu_base + offset);
}
static struct dentry *arm_cmn_debugfs;
@ -478,20 +482,25 @@ static const char *arm_cmn_device_type(u8 type)
case 0x17: return "RN-F_C_E|";
case 0x18: return " RN-F_E |";
case 0x19: return "RN-F_E_E|";
case 0x1a: return " HN-S |";
case 0x1b: return " LCN |";
case 0x1c: return " MTSX |";
case 0x1d: return " HN-V |";
case 0x1e: return " CCG |";
case 0x20: return " RN-F_F |";
case 0x21: return "RN-F_F_E|";
case 0x22: return " SN-F_F |";
default: return " ???? |";
}
}
static void arm_cmn_show_logid(struct seq_file *s, int x, int y, int p, int d)
static void arm_cmn_show_logid(struct seq_file *s, const struct arm_cmn_node *xp, int p, int d)
{
struct arm_cmn *cmn = s->private;
struct arm_cmn_node *dn;
u16 id = xp->id | d | (p << xp->deviceid_bits);
for (dn = cmn->dns; dn->type; dn++) {
struct arm_cmn_nodeid nid = arm_cmn_nid(cmn, dn->id);
int pad = dn->logid < 10;
if (dn->type == CMN_TYPE_XP)
@ -500,7 +509,7 @@ static void arm_cmn_show_logid(struct seq_file *s, int x, int y, int p, int d)
if (dn->type < CMN_TYPE_HNI)
continue;
if (nid.x != x || nid.y != y || nid.port != p || nid.dev != d)
if (dn->id != id)
continue;
seq_printf(s, " %*c#%-*d |", pad + 1, ' ', 3 - pad, dn->logid);
@ -521,6 +530,7 @@ static int arm_cmn_map_show(struct seq_file *s, void *data)
y = cmn->mesh_y;
while (y--) {
int xp_base = cmn->mesh_x * y;
struct arm_cmn_node *xp = cmn->xps + xp_base;
u8 port[CMN_MAX_PORTS][CMN_MAX_DIMENSION];
for (x = 0; x < cmn->mesh_x; x++)
@ -528,16 +538,14 @@ static int arm_cmn_map_show(struct seq_file *s, void *data)
seq_printf(s, "\n%-2d |", y);
for (x = 0; x < cmn->mesh_x; x++) {
struct arm_cmn_node *xp = cmn->xps + xp_base + x;
for (p = 0; p < CMN_MAX_PORTS; p++)
port[p][x] = arm_cmn_device_connect_info(cmn, xp, p);
port[p][x] = arm_cmn_device_connect_info(cmn, xp + x, p);
seq_printf(s, " XP #%-3d|", xp_base + x);
}
seq_puts(s, "\n |");
for (x = 0; x < cmn->mesh_x; x++) {
s8 dtc = cmn->xps[xp_base + x].dtc;
s8 dtc = xp[x].dtc;
if (dtc < 0)
seq_puts(s, " DTC ?? |");
@ -554,10 +562,10 @@ static int arm_cmn_map_show(struct seq_file *s, void *data)
seq_puts(s, arm_cmn_device_type(port[p][x]));
seq_puts(s, "\n 0|");
for (x = 0; x < cmn->mesh_x; x++)
arm_cmn_show_logid(s, x, y, p, 0);
arm_cmn_show_logid(s, xp + x, p, 0);
seq_puts(s, "\n 1|");
for (x = 0; x < cmn->mesh_x; x++)
arm_cmn_show_logid(s, x, y, p, 1);
arm_cmn_show_logid(s, xp + x, p, 1);
}
seq_puts(s, "\n-----+");
}
@ -585,7 +593,7 @@ static void arm_cmn_debugfs_init(struct arm_cmn *cmn, int id) {}
struct arm_cmn_hw_event {
struct arm_cmn_node *dn;
u64 dtm_idx[4];
u64 dtm_idx[DIV_ROUND_UP(CMN_MAX_NODES_PER_EVENT * 2, 64)];
s8 dtc_idx[CMN_MAX_DTCS];
u8 num_dns;
u8 dtm_offset;
@ -599,6 +607,7 @@ struct arm_cmn_hw_event {
bool wide_sel;
enum cmn_filter_select filter_sel;
};
static_assert(sizeof(struct arm_cmn_hw_event) <= offsetof(struct hw_perf_event, target));
#define for_each_hw_dn(hw, dn, i) \
for (i = 0, dn = hw->dn; i < hw->num_dns; i++, dn++)
@ -609,7 +618,6 @@ struct arm_cmn_hw_event {
static struct arm_cmn_hw_event *to_cmn_hw(struct perf_event *event)
{
BUILD_BUG_ON(sizeof(struct arm_cmn_hw_event) > offsetof(struct hw_perf_event, target));
return (struct arm_cmn_hw_event *)&event->hw;
}
@ -790,8 +798,8 @@ static umode_t arm_cmn_event_attr_is_visible(struct kobject *kobj,
CMN_EVENT_ATTR(CMN_ANY, cxha_##_name, CMN_TYPE_CXHA, _event)
#define CMN_EVENT_CCRA(_name, _event) \
CMN_EVENT_ATTR(CMN_ANY, ccra_##_name, CMN_TYPE_CCRA, _event)
#define CMN_EVENT_CCHA(_name, _event) \
CMN_EVENT_ATTR(CMN_ANY, ccha_##_name, CMN_TYPE_CCHA, _event)
#define CMN_EVENT_CCHA(_model, _name, _event) \
CMN_EVENT_ATTR(_model, ccha_##_name, CMN_TYPE_CCHA, _event)
#define CMN_EVENT_CCLA(_name, _event) \
CMN_EVENT_ATTR(CMN_ANY, ccla_##_name, CMN_TYPE_CCLA, _event)
#define CMN_EVENT_CCLA_RNI(_name, _event) \
@ -1149,42 +1157,43 @@ static struct attribute *arm_cmn_event_attrs[] = {
CMN_EVENT_CCRA(wdb_alloc, 0x59),
CMN_EVENT_CCRA(ssb_alloc, 0x5a),
CMN_EVENT_CCHA(rddatbyp, 0x61),
CMN_EVENT_CCHA(chirsp_up_stall, 0x62),
CMN_EVENT_CCHA(chidat_up_stall, 0x63),
CMN_EVENT_CCHA(snppcrd_link0_stall, 0x64),
CMN_EVENT_CCHA(snppcrd_link1_stall, 0x65),
CMN_EVENT_CCHA(snppcrd_link2_stall, 0x66),
CMN_EVENT_CCHA(reqtrk_occ, 0x67),
CMN_EVENT_CCHA(rdb_occ, 0x68),
CMN_EVENT_CCHA(rdbyp_occ, 0x69),
CMN_EVENT_CCHA(wdb_occ, 0x6a),
CMN_EVENT_CCHA(snptrk_occ, 0x6b),
CMN_EVENT_CCHA(sdb_occ, 0x6c),
CMN_EVENT_CCHA(snphaz_occ, 0x6d),
CMN_EVENT_CCHA(reqtrk_alloc, 0x6e),
CMN_EVENT_CCHA(rdb_alloc, 0x6f),
CMN_EVENT_CCHA(rdbyp_alloc, 0x70),
CMN_EVENT_CCHA(wdb_alloc, 0x71),
CMN_EVENT_CCHA(snptrk_alloc, 0x72),
CMN_EVENT_CCHA(sdb_alloc, 0x73),
CMN_EVENT_CCHA(snphaz_alloc, 0x74),
CMN_EVENT_CCHA(pb_rhu_req_occ, 0x75),
CMN_EVENT_CCHA(pb_rhu_req_alloc, 0x76),
CMN_EVENT_CCHA(pb_rhu_pcie_req_occ, 0x77),
CMN_EVENT_CCHA(pb_rhu_pcie_req_alloc, 0x78),
CMN_EVENT_CCHA(pb_pcie_wr_req_occ, 0x79),
CMN_EVENT_CCHA(pb_pcie_wr_req_alloc, 0x7a),
CMN_EVENT_CCHA(pb_pcie_reg_req_occ, 0x7b),
CMN_EVENT_CCHA(pb_pcie_reg_req_alloc, 0x7c),
CMN_EVENT_CCHA(pb_pcie_rsvd_req_occ, 0x7d),
CMN_EVENT_CCHA(pb_pcie_rsvd_req_alloc, 0x7e),
CMN_EVENT_CCHA(pb_rhu_dat_occ, 0x7f),
CMN_EVENT_CCHA(pb_rhu_dat_alloc, 0x80),
CMN_EVENT_CCHA(pb_rhu_pcie_dat_occ, 0x81),
CMN_EVENT_CCHA(pb_rhu_pcie_dat_alloc, 0x82),
CMN_EVENT_CCHA(pb_pcie_wr_dat_occ, 0x83),
CMN_EVENT_CCHA(pb_pcie_wr_dat_alloc, 0x84),
CMN_EVENT_CCHA(CMN_ANY, rddatbyp, 0x61),
CMN_EVENT_CCHA(CMN_ANY, chirsp_up_stall, 0x62),
CMN_EVENT_CCHA(CMN_ANY, chidat_up_stall, 0x63),
CMN_EVENT_CCHA(CMN_ANY, snppcrd_link0_stall, 0x64),
CMN_EVENT_CCHA(CMN_ANY, snppcrd_link1_stall, 0x65),
CMN_EVENT_CCHA(CMN_ANY, snppcrd_link2_stall, 0x66),
CMN_EVENT_CCHA(CMN_ANY, reqtrk_occ, 0x67),
CMN_EVENT_CCHA(CMN_ANY, rdb_occ, 0x68),
CMN_EVENT_CCHA(CMN_ANY, rdbyp_occ, 0x69),
CMN_EVENT_CCHA(CMN_ANY, wdb_occ, 0x6a),
CMN_EVENT_CCHA(CMN_ANY, snptrk_occ, 0x6b),
CMN_EVENT_CCHA(CMN_ANY, sdb_occ, 0x6c),
CMN_EVENT_CCHA(CMN_ANY, snphaz_occ, 0x6d),
CMN_EVENT_CCHA(CMN_ANY, reqtrk_alloc, 0x6e),
CMN_EVENT_CCHA(CMN_ANY, rdb_alloc, 0x6f),
CMN_EVENT_CCHA(CMN_ANY, rdbyp_alloc, 0x70),
CMN_EVENT_CCHA(CMN_ANY, wdb_alloc, 0x71),
CMN_EVENT_CCHA(CMN_ANY, snptrk_alloc, 0x72),
CMN_EVENT_CCHA(CMN_ANY, db_alloc, 0x73),
CMN_EVENT_CCHA(CMN_ANY, snphaz_alloc, 0x74),
CMN_EVENT_CCHA(CMN_ANY, pb_rhu_req_occ, 0x75),
CMN_EVENT_CCHA(CMN_ANY, pb_rhu_req_alloc, 0x76),
CMN_EVENT_CCHA(CMN_ANY, pb_rhu_pcie_req_occ, 0x77),
CMN_EVENT_CCHA(CMN_ANY, pb_rhu_pcie_req_alloc, 0x78),
CMN_EVENT_CCHA(CMN_ANY, pb_pcie_wr_req_occ, 0x79),
CMN_EVENT_CCHA(CMN_ANY, pb_pcie_wr_req_alloc, 0x7a),
CMN_EVENT_CCHA(CMN_ANY, pb_pcie_reg_req_occ, 0x7b),
CMN_EVENT_CCHA(CMN_ANY, pb_pcie_reg_req_alloc, 0x7c),
CMN_EVENT_CCHA(CMN_ANY, pb_pcie_rsvd_req_occ, 0x7d),
CMN_EVENT_CCHA(CMN_ANY, pb_pcie_rsvd_req_alloc, 0x7e),
CMN_EVENT_CCHA(CMN_ANY, pb_rhu_dat_occ, 0x7f),
CMN_EVENT_CCHA(CMN_ANY, pb_rhu_dat_alloc, 0x80),
CMN_EVENT_CCHA(CMN_ANY, pb_rhu_pcie_dat_occ, 0x81),
CMN_EVENT_CCHA(CMN_ANY, pb_rhu_pcie_dat_alloc, 0x82),
CMN_EVENT_CCHA(CMN_ANY, pb_pcie_wr_dat_occ, 0x83),
CMN_EVENT_CCHA(CMN_ANY, pb_pcie_wr_dat_alloc, 0x84),
CMN_EVENT_CCHA(CMNS3, chirsp1_up_stall, 0x85),
CMN_EVENT_CCLA(rx_cxs, 0x21),
CMN_EVENT_CCLA(tx_cxs, 0x22),
@ -1271,15 +1280,11 @@ static ssize_t arm_cmn_format_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
struct arm_cmn_format_attr *fmt = container_of(attr, typeof(*fmt), attr);
int lo = __ffs(fmt->field), hi = __fls(fmt->field);
if (lo == hi)
return sysfs_emit(buf, "config:%d\n", lo);
if (!fmt->config)
return sysfs_emit(buf, "config:%d-%d\n", lo, hi);
return sysfs_emit(buf, "config:%*pbl\n", 64, &fmt->field);
return sysfs_emit(buf, "config%d:%d-%d\n", fmt->config, lo, hi);
return sysfs_emit(buf, "config%d:%*pbl\n", fmt->config, 64, &fmt->field);
}
#define _CMN_FORMAT_ATTR(_name, _cfg, _fld) \
@ -1415,7 +1420,7 @@ static u32 arm_cmn_wp_config(struct perf_event *event, int wp_idx)
static void arm_cmn_set_state(struct arm_cmn *cmn, u32 state)
{
if (!cmn->state)
writel_relaxed(0, cmn->dtc[0].base + CMN_DT_PMCR);
writel_relaxed(0, CMN_DT_PMCR(&cmn->dtc[0]));
cmn->state |= state;
}
@ -1424,7 +1429,7 @@ static void arm_cmn_clear_state(struct arm_cmn *cmn, u32 state)
cmn->state &= ~state;
if (!cmn->state)
writel_relaxed(CMN_DT_PMCR_PMU_EN | CMN_DT_PMCR_OVFL_INTR_EN,
cmn->dtc[0].base + CMN_DT_PMCR);
CMN_DT_PMCR(&cmn->dtc[0]));
}
static void arm_cmn_pmu_enable(struct pmu *pmu)
@ -1459,18 +1464,19 @@ static u64 arm_cmn_read_dtm(struct arm_cmn *cmn, struct arm_cmn_hw_event *hw,
static u64 arm_cmn_read_cc(struct arm_cmn_dtc *dtc)
{
u64 val = readq_relaxed(dtc->base + CMN_DT_PMCCNTR);
void __iomem *pmccntr = CMN_DT_PMCCNTR(dtc);
u64 val = readq_relaxed(pmccntr);
writeq_relaxed(CMN_CC_INIT, dtc->base + CMN_DT_PMCCNTR);
writeq_relaxed(CMN_CC_INIT, pmccntr);
return (val - CMN_CC_INIT) & ((CMN_CC_INIT << 1) - 1);
}
static u32 arm_cmn_read_counter(struct arm_cmn_dtc *dtc, int idx)
{
u32 val, pmevcnt = CMN_DT_PMEVCNT(idx);
void __iomem *pmevcnt = CMN_DT_PMEVCNT(dtc, idx);
u32 val = readl_relaxed(pmevcnt);
val = readl_relaxed(dtc->base + pmevcnt);
writel_relaxed(CMN_COUNTER_INIT, dtc->base + pmevcnt);
writel_relaxed(CMN_COUNTER_INIT, pmevcnt);
return val - CMN_COUNTER_INIT;
}
@ -1481,7 +1487,7 @@ static void arm_cmn_init_counter(struct perf_event *event)
u64 count;
for_each_hw_dtc_idx(hw, i, idx) {
writel_relaxed(CMN_COUNTER_INIT, cmn->dtc[i].base + CMN_DT_PMEVCNT(idx));
writel_relaxed(CMN_COUNTER_INIT, CMN_DT_PMEVCNT(&cmn->dtc[i], idx));
cmn->dtc[i].counters[idx] = event;
}
@ -1564,9 +1570,12 @@ static void arm_cmn_event_start(struct perf_event *event, int flags)
int i;
if (type == CMN_TYPE_DTC) {
i = hw->dtc_idx[0];
writeq_relaxed(CMN_CC_INIT, cmn->dtc[i].base + CMN_DT_PMCCNTR);
cmn->dtc[i].cc_active = true;
struct arm_cmn_dtc *dtc = cmn->dtc + hw->dtc_idx[0];
writel_relaxed(CMN_DT_DTC_CTL_DT_EN | CMN_DT_DTC_CTL_CG_DISABLE,
dtc->base + CMN_DT_DTC_CTL);
writeq_relaxed(CMN_CC_INIT, CMN_DT_PMCCNTR(dtc));
dtc->cc_active = true;
} else if (type == CMN_TYPE_WP) {
u64 val = CMN_EVENT_WP_VAL(event);
u64 mask = CMN_EVENT_WP_MASK(event);
@ -1595,8 +1604,10 @@ static void arm_cmn_event_stop(struct perf_event *event, int flags)
int i;
if (type == CMN_TYPE_DTC) {
i = hw->dtc_idx[0];
cmn->dtc[i].cc_active = false;
struct arm_cmn_dtc *dtc = cmn->dtc + hw->dtc_idx[0];
dtc->cc_active = false;
writel_relaxed(CMN_DT_DTC_CTL_DT_EN, dtc->base + CMN_DT_DTC_CTL);
} else if (type == CMN_TYPE_WP) {
for_each_hw_dn(hw, dn, i) {
void __iomem *base = dn->pmu_base + CMN_DTM_OFFSET(hw->dtm_offset);
@ -1784,7 +1795,8 @@ static int arm_cmn_event_init(struct perf_event *event)
/* ...but the DTM may depend on which port we're watching */
if (cmn->multi_dtm)
hw->dtm_offset = CMN_EVENT_WP_DEV_SEL(event) / 2;
} else if (type == CMN_TYPE_XP && cmn->part == PART_CMN700) {
} else if (type == CMN_TYPE_XP &&
(cmn->part == PART_CMN700 || cmn->part == PART_CMN_S3)) {
hw->wide_sel = true;
}
@ -1815,10 +1827,7 @@ static int arm_cmn_event_init(struct perf_event *event)
}
if (!hw->num_dns) {
struct arm_cmn_nodeid nid = arm_cmn_nid(cmn, nodeid);
dev_dbg(cmn->dev, "invalid node 0x%x (%d,%d,%d,%d) type 0x%x\n",
nodeid, nid.x, nid.y, nid.port, nid.dev, type);
dev_dbg(cmn->dev, "invalid node 0x%x type 0x%x\n", nodeid, type);
return -EINVAL;
}
@ -1921,7 +1930,7 @@ static int arm_cmn_event_add(struct perf_event *event, int flags)
arm_cmn_claim_wp_idx(dtm, event, d, wp_idx, i);
writel_relaxed(cfg, dtm->base + CMN_DTM_WPn_CONFIG(wp_idx));
} else {
struct arm_cmn_nodeid nid = arm_cmn_nid(cmn, dn->id);
struct arm_cmn_nodeid nid = arm_cmn_nid(dn);
if (cmn->multi_dtm)
nid.port %= 2;
@ -2010,7 +2019,7 @@ static int arm_cmn_pmu_online_cpu(unsigned int cpu, struct hlist_node *cpuhp_nod
cmn = hlist_entry_safe(cpuhp_node, struct arm_cmn, cpuhp_node);
node = dev_to_node(cmn->dev);
if (node != NUMA_NO_NODE && cpu_to_node(cmn->cpu) != node && cpu_to_node(cpu) == node)
if (cpu_to_node(cmn->cpu) != node && cpu_to_node(cpu) == node)
arm_cmn_migrate(cmn, cpu);
return 0;
}
@ -2043,7 +2052,7 @@ static irqreturn_t arm_cmn_handle_irq(int irq, void *dev_id)
irqreturn_t ret = IRQ_NONE;
for (;;) {
u32 status = readl_relaxed(dtc->base + CMN_DT_PMOVSR);
u32 status = readl_relaxed(CMN_DT_PMOVSR(dtc));
u64 delta;
int i;
@ -2065,7 +2074,7 @@ static irqreturn_t arm_cmn_handle_irq(int irq, void *dev_id)
}
}
writel_relaxed(status, dtc->base + CMN_DT_PMOVSR_CLR);
writel_relaxed(status, CMN_DT_PMOVSR_CLR(dtc));
if (!dtc->irq_friend)
return ret;
@ -2119,15 +2128,16 @@ static int arm_cmn_init_dtc(struct arm_cmn *cmn, struct arm_cmn_node *dn, int id
{
struct arm_cmn_dtc *dtc = cmn->dtc + idx;
dtc->base = dn->pmu_base - CMN_PMU_OFFSET;
dtc->pmu_base = dn->pmu_base;
dtc->base = dtc->pmu_base - arm_cmn_pmu_offset(cmn, dn);
dtc->irq = platform_get_irq(to_platform_device(cmn->dev), idx);
if (dtc->irq < 0)
return dtc->irq;
writel_relaxed(CMN_DT_DTC_CTL_DT_EN, dtc->base + CMN_DT_DTC_CTL);
writel_relaxed(CMN_DT_PMCR_PMU_EN | CMN_DT_PMCR_OVFL_INTR_EN, dtc->base + CMN_DT_PMCR);
writeq_relaxed(0, dtc->base + CMN_DT_PMCCNTR);
writel_relaxed(0x1ff, dtc->base + CMN_DT_PMOVSR_CLR);
writel_relaxed(CMN_DT_PMCR_PMU_EN | CMN_DT_PMCR_OVFL_INTR_EN, CMN_DT_PMCR(dtc));
writeq_relaxed(0, CMN_DT_PMCCNTR(dtc));
writel_relaxed(0x1ff, CMN_DT_PMOVSR_CLR(dtc));
return 0;
}
@ -2168,10 +2178,12 @@ static int arm_cmn_init_dtcs(struct arm_cmn *cmn)
continue;
xp = arm_cmn_node_to_xp(cmn, dn);
dn->portid_bits = xp->portid_bits;
dn->deviceid_bits = xp->deviceid_bits;
dn->dtc = xp->dtc;
dn->dtm = xp->dtm;
if (cmn->multi_dtm)
dn->dtm += arm_cmn_nid(cmn, dn->id).port / 2;
dn->dtm += arm_cmn_nid(dn).port / 2;
if (dn->type == CMN_TYPE_DTC) {
int err = arm_cmn_init_dtc(cmn, dn, dtc_idx++);
@ -2213,7 +2225,7 @@ static void arm_cmn_init_node_info(struct arm_cmn *cmn, u32 offset, struct arm_c
node->id = FIELD_GET(CMN_NI_NODE_ID, reg);
node->logid = FIELD_GET(CMN_NI_LOGICAL_ID, reg);
node->pmu_base = cmn->base + offset + CMN_PMU_OFFSET;
node->pmu_base = cmn->base + offset + arm_cmn_pmu_offset(cmn, node);
if (node->type == CMN_TYPE_CFG)
level = 0;
@ -2271,7 +2283,17 @@ static int arm_cmn_discover(struct arm_cmn *cmn, unsigned int rgn_offset)
reg = readl_relaxed(cfg_region + CMN_CFGM_PERIPH_ID_23);
cmn->rev = FIELD_GET(CMN_CFGM_PID2_REVISION, reg);
/*
* With the device isolation feature, if firmware has neglected to enable
* an XP port then we risk locking up if we try to access anything behind
* it; however we also have no way to tell from Non-Secure whether any
* given port is disabled or not, so the only way to win is not to play...
*/
reg = readq_relaxed(cfg_region + CMN_CFGM_INFO_GLOBAL);
if (reg & CMN_INFO_DEVICE_ISO_ENABLE) {
dev_err(cmn->dev, "Device isolation enabled, not continuing due to risk of lockup\n");
return -ENODEV;
}
cmn->multi_dtm = reg & CMN_INFO_MULTIPLE_DTM_EN;
cmn->rsp_vc_num = FIELD_GET(CMN_INFO_RSP_VC_NUM, reg);
cmn->dat_vc_num = FIELD_GET(CMN_INFO_DAT_VC_NUM, reg);
@ -2341,18 +2363,27 @@ static int arm_cmn_discover(struct arm_cmn *cmn, unsigned int rgn_offset)
arm_cmn_init_dtm(dtm++, xp, 0);
/*
* Keeping track of connected ports will let us filter out
* unnecessary XP events easily. We can also reliably infer the
* "extra device ports" configuration for the node ID format
* from this, since in that case we will see at least one XP
* with port 2 connected, for the HN-D.
* unnecessary XP events easily, and also infer the per-XP
* part of the node ID format.
*/
for (int p = 0; p < CMN_MAX_PORTS; p++)
if (arm_cmn_device_connect_info(cmn, xp, p))
xp_ports |= BIT(p);
if (cmn->multi_dtm && (xp_ports & 0xc))
if (cmn->num_xps == 1) {
xp->portid_bits = 3;
xp->deviceid_bits = 2;
} else if (xp_ports > 0x3) {
xp->portid_bits = 2;
xp->deviceid_bits = 1;
} else {
xp->portid_bits = 1;
xp->deviceid_bits = 2;
}
if (cmn->multi_dtm && (xp_ports > 0x3))
arm_cmn_init_dtm(dtm++, xp, 1);
if (cmn->multi_dtm && (xp_ports & 0x30))
if (cmn->multi_dtm && (xp_ports > 0xf))
arm_cmn_init_dtm(dtm++, xp, 2);
cmn->ports_used |= xp_ports;
@ -2407,10 +2438,13 @@ static int arm_cmn_discover(struct arm_cmn *cmn, unsigned int rgn_offset)
case CMN_TYPE_CXHA:
case CMN_TYPE_CCRA:
case CMN_TYPE_CCHA:
case CMN_TYPE_CCLA:
case CMN_TYPE_HNS:
dn++;
break;
case CMN_TYPE_CCLA:
dn->pmu_base += CMN_CCLA_PMU_EVENT_SEL;
dn++;
break;
/* Nothing to see here */
case CMN_TYPE_MPAM_S:
case CMN_TYPE_MPAM_NS:
@ -2418,6 +2452,7 @@ static int arm_cmn_discover(struct arm_cmn *cmn, unsigned int rgn_offset)
case CMN_TYPE_CXLA:
case CMN_TYPE_HNS_MPAM_S:
case CMN_TYPE_HNS_MPAM_NS:
case CMN_TYPE_APB:
break;
/*
* Split "optimised" combination nodes into separate
@ -2428,7 +2463,7 @@ static int arm_cmn_discover(struct arm_cmn *cmn, unsigned int rgn_offset)
case CMN_TYPE_HNP:
case CMN_TYPE_CCLA_RNI:
dn[1] = dn[0];
dn[0].pmu_base += CMN_HNP_PMU_EVENT_SEL;
dn[0].pmu_base += CMN_CCLA_PMU_EVENT_SEL;
dn[1].type = arm_cmn_subtype(dn->type);
dn += 2;
break;
@ -2603,6 +2638,7 @@ static const struct of_device_id arm_cmn_of_match[] = {
{ .compatible = "arm,cmn-600", .data = (void *)PART_CMN600 },
{ .compatible = "arm,cmn-650" },
{ .compatible = "arm,cmn-700" },
{ .compatible = "arm,cmn-s3" },
{ .compatible = "arm,ci-700" },
{}
};

781
drivers/perf/arm-ni.c Normal file
View File

@ -0,0 +1,781 @@
// SPDX-License-Identifier: GPL-2.0
// Copyright (C) 2022-2024 Arm Limited
// NI-700 Network-on-Chip PMU driver
#include <linux/acpi.h>
#include <linux/bitfield.h>
#include <linux/interrupt.h>
#include <linux/io.h>
#include <linux/io-64-nonatomic-lo-hi.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/of.h>
#include <linux/perf_event.h>
#include <linux/platform_device.h>
#include <linux/slab.h>
/* Common registers */
#define NI_NODE_TYPE 0x000
#define NI_NODE_TYPE_NODE_ID GENMASK(31, 16)
#define NI_NODE_TYPE_NODE_TYPE GENMASK(15, 0)
#define NI_CHILD_NODE_INFO 0x004
#define NI_CHILD_PTR(n) (0x008 + (n) * 4)
#define NI700_PMUSELA 0x00c
/* Config node */
#define NI_PERIPHERAL_ID0 0xfe0
#define NI_PIDR0_PART_7_0 GENMASK(7, 0)
#define NI_PERIPHERAL_ID1 0xfe4
#define NI_PIDR1_PART_11_8 GENMASK(3, 0)
#define NI_PERIPHERAL_ID2 0xfe8
#define NI_PIDR2_VERSION GENMASK(7, 4)
/* PMU node */
#define NI_PMEVCNTR(n) (0x008 + (n) * 8)
#define NI_PMCCNTR_L 0x0f8
#define NI_PMCCNTR_U 0x0fc
#define NI_PMEVTYPER(n) (0x400 + (n) * 4)
#define NI_PMEVTYPER_NODE_TYPE GENMASK(12, 9)
#define NI_PMEVTYPER_NODE_ID GENMASK(8, 0)
#define NI_PMCNTENSET 0xc00
#define NI_PMCNTENCLR 0xc20
#define NI_PMINTENSET 0xc40
#define NI_PMINTENCLR 0xc60
#define NI_PMOVSCLR 0xc80
#define NI_PMOVSSET 0xcc0
#define NI_PMCFGR 0xe00
#define NI_PMCR 0xe04
#define NI_PMCR_RESET_CCNT BIT(2)
#define NI_PMCR_RESET_EVCNT BIT(1)
#define NI_PMCR_ENABLE BIT(0)
#define NI_NUM_COUNTERS 8
#define NI_CCNT_IDX 31
/* Event attributes */
#define NI_CONFIG_TYPE GENMASK_ULL(15, 0)
#define NI_CONFIG_NODEID GENMASK_ULL(31, 16)
#define NI_CONFIG_EVENTID GENMASK_ULL(47, 32)
#define NI_EVENT_TYPE(event) FIELD_GET(NI_CONFIG_TYPE, (event)->attr.config)
#define NI_EVENT_NODEID(event) FIELD_GET(NI_CONFIG_NODEID, (event)->attr.config)
#define NI_EVENT_EVENTID(event) FIELD_GET(NI_CONFIG_EVENTID, (event)->attr.config)
enum ni_part {
PART_NI_700 = 0x43b,
PART_NI_710AE = 0x43d,
};
enum ni_node_type {
NI_GLOBAL,
NI_VOLTAGE,
NI_POWER,
NI_CLOCK,
NI_ASNI,
NI_AMNI,
NI_PMU,
NI_HSNI,
NI_HMNI,
NI_PMNI,
};
struct arm_ni_node {
void __iomem *base;
enum ni_node_type type;
u16 id;
u32 num_components;
};
struct arm_ni_unit {
void __iomem *pmusela;
enum ni_node_type type;
u16 id;
bool ns;
union {
__le64 pmusel;
u8 event[8];
};
};
struct arm_ni_cd {
void __iomem *pmu_base;
u16 id;
int num_units;
int irq;
int cpu;
struct hlist_node cpuhp_node;
struct pmu pmu;
struct arm_ni_unit *units;
struct perf_event *evcnt[NI_NUM_COUNTERS];
struct perf_event *ccnt;
};
struct arm_ni {
struct device *dev;
void __iomem *base;
enum ni_part part;
int id;
int num_cds;
struct arm_ni_cd cds[] __counted_by(num_cds);
};
#define cd_to_ni(cd) container_of((cd), struct arm_ni, cds[(cd)->id])
#define pmu_to_cd(p) container_of((p), struct arm_ni_cd, pmu)
#define cd_for_each_unit(cd, u) \
for (struct arm_ni_unit *u = cd->units; u < cd->units + cd->num_units; u++)
static int arm_ni_hp_state;
struct arm_ni_event_attr {
struct device_attribute attr;
enum ni_node_type type;
};
#define NI_EVENT_ATTR(_name, _type) \
(&((struct arm_ni_event_attr[]) {{ \
.attr = __ATTR(_name, 0444, arm_ni_event_show, NULL), \
.type = _type, \
}})[0].attr.attr)
static ssize_t arm_ni_event_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
struct arm_ni_event_attr *eattr = container_of(attr, typeof(*eattr), attr);
if (eattr->type == NI_PMU)
return sysfs_emit(buf, "type=0x%x\n", eattr->type);
return sysfs_emit(buf, "type=0x%x,eventid=?,nodeid=?\n", eattr->type);
}
static umode_t arm_ni_event_attr_is_visible(struct kobject *kobj,
struct attribute *attr, int unused)
{
struct device *dev = kobj_to_dev(kobj);
struct arm_ni_cd *cd = pmu_to_cd(dev_get_drvdata(dev));
struct arm_ni_event_attr *eattr;
eattr = container_of(attr, typeof(*eattr), attr.attr);
cd_for_each_unit(cd, unit) {
if (unit->type == eattr->type && unit->ns)
return attr->mode;
}
return 0;
}
static struct attribute *arm_ni_event_attrs[] = {
NI_EVENT_ATTR(asni, NI_ASNI),
NI_EVENT_ATTR(amni, NI_AMNI),
NI_EVENT_ATTR(cycles, NI_PMU),
NI_EVENT_ATTR(hsni, NI_HSNI),
NI_EVENT_ATTR(hmni, NI_HMNI),
NI_EVENT_ATTR(pmni, NI_PMNI),
NULL
};
static const struct attribute_group arm_ni_event_attrs_group = {
.name = "events",
.attrs = arm_ni_event_attrs,
.is_visible = arm_ni_event_attr_is_visible,
};
struct arm_ni_format_attr {
struct device_attribute attr;
u64 field;
};
#define NI_FORMAT_ATTR(_name, _fld) \
(&((struct arm_ni_format_attr[]) {{ \
.attr = __ATTR(_name, 0444, arm_ni_format_show, NULL), \
.field = _fld, \
}})[0].attr.attr)
static ssize_t arm_ni_format_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
struct arm_ni_format_attr *fmt = container_of(attr, typeof(*fmt), attr);
return sysfs_emit(buf, "config:%*pbl\n", 64, &fmt->field);
}
static struct attribute *arm_ni_format_attrs[] = {
NI_FORMAT_ATTR(type, NI_CONFIG_TYPE),
NI_FORMAT_ATTR(nodeid, NI_CONFIG_NODEID),
NI_FORMAT_ATTR(eventid, NI_CONFIG_EVENTID),
NULL
};
static const struct attribute_group arm_ni_format_attrs_group = {
.name = "format",
.attrs = arm_ni_format_attrs,
};
static ssize_t arm_ni_cpumask_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
struct arm_ni_cd *cd = pmu_to_cd(dev_get_drvdata(dev));
return cpumap_print_to_pagebuf(true, buf, cpumask_of(cd->cpu));
}
static struct device_attribute arm_ni_cpumask_attr =
__ATTR(cpumask, 0444, arm_ni_cpumask_show, NULL);
static ssize_t arm_ni_identifier_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
struct arm_ni *ni = cd_to_ni(pmu_to_cd(dev_get_drvdata(dev)));
u32 reg = readl_relaxed(ni->base + NI_PERIPHERAL_ID2);
int version = FIELD_GET(NI_PIDR2_VERSION, reg);
return sysfs_emit(buf, "%03x%02x\n", ni->part, version);
}
static struct device_attribute arm_ni_identifier_attr =
__ATTR(identifier, 0444, arm_ni_identifier_show, NULL);
static struct attribute *arm_ni_other_attrs[] = {
&arm_ni_cpumask_attr.attr,
&arm_ni_identifier_attr.attr,
NULL
};
static const struct attribute_group arm_ni_other_attr_group = {
.attrs = arm_ni_other_attrs,
NULL
};
static const struct attribute_group *arm_ni_attr_groups[] = {
&arm_ni_event_attrs_group,
&arm_ni_format_attrs_group,
&arm_ni_other_attr_group,
NULL
};
static void arm_ni_pmu_enable(struct pmu *pmu)
{
writel_relaxed(NI_PMCR_ENABLE, pmu_to_cd(pmu)->pmu_base + NI_PMCR);
}
static void arm_ni_pmu_disable(struct pmu *pmu)
{
writel_relaxed(0, pmu_to_cd(pmu)->pmu_base + NI_PMCR);
}
struct arm_ni_val {
unsigned int evcnt;
unsigned int ccnt;
};
static bool arm_ni_val_count_event(struct perf_event *evt, struct arm_ni_val *val)
{
if (is_software_event(evt))
return true;
if (NI_EVENT_TYPE(evt) == NI_PMU) {
val->ccnt++;
return val->ccnt <= 1;
}
val->evcnt++;
return val->evcnt <= NI_NUM_COUNTERS;
}
static int arm_ni_validate_group(struct perf_event *event)
{
struct perf_event *sibling, *leader = event->group_leader;
struct arm_ni_val val = { 0 };
if (leader == event)
return 0;
arm_ni_val_count_event(event, &val);
if (!arm_ni_val_count_event(leader, &val))
return -EINVAL;
for_each_sibling_event(sibling, leader) {
if (!arm_ni_val_count_event(sibling, &val))
return -EINVAL;
}
return 0;
}
static int arm_ni_event_init(struct perf_event *event)
{
struct arm_ni_cd *cd = pmu_to_cd(event->pmu);
if (event->attr.type != event->pmu->type)
return -ENOENT;
if (is_sampling_event(event))
return -EINVAL;
event->cpu = cd->cpu;
if (NI_EVENT_TYPE(event) == NI_PMU)
return arm_ni_validate_group(event);
cd_for_each_unit(cd, unit) {
if (unit->type == NI_EVENT_TYPE(event) &&
unit->id == NI_EVENT_NODEID(event) && unit->ns) {
event->hw.config_base = (unsigned long)unit;
return arm_ni_validate_group(event);
}
}
return -EINVAL;
}
static u64 arm_ni_read_ccnt(struct arm_ni_cd *cd)
{
u64 l, u_old, u_new;
int retries = 3; /* 1st time unlucky, 2nd improbable, 3rd just broken */
u_new = readl_relaxed(cd->pmu_base + NI_PMCCNTR_U);
do {
u_old = u_new;
l = readl_relaxed(cd->pmu_base + NI_PMCCNTR_L);
u_new = readl_relaxed(cd->pmu_base + NI_PMCCNTR_U);
} while (u_new != u_old && --retries);
WARN_ON(!retries);
return (u_new << 32) | l;
}
static void arm_ni_event_read(struct perf_event *event)
{
struct arm_ni_cd *cd = pmu_to_cd(event->pmu);
struct hw_perf_event *hw = &event->hw;
u64 count, prev;
bool ccnt = hw->idx == NI_CCNT_IDX;
do {
prev = local64_read(&hw->prev_count);
if (ccnt)
count = arm_ni_read_ccnt(cd);
else
count = readl_relaxed(cd->pmu_base + NI_PMEVCNTR(hw->idx));
} while (local64_cmpxchg(&hw->prev_count, prev, count) != prev);
count -= prev;
if (!ccnt)
count = (u32)count;
local64_add(count, &event->count);
}
static void arm_ni_event_start(struct perf_event *event, int flags)
{
struct arm_ni_cd *cd = pmu_to_cd(event->pmu);
writel_relaxed(1U << event->hw.idx, cd->pmu_base + NI_PMCNTENSET);
}
static void arm_ni_event_stop(struct perf_event *event, int flags)
{
struct arm_ni_cd *cd = pmu_to_cd(event->pmu);
writel_relaxed(1U << event->hw.idx, cd->pmu_base + NI_PMCNTENCLR);
if (flags & PERF_EF_UPDATE)
arm_ni_event_read(event);
}
static void arm_ni_init_ccnt(struct arm_ni_cd *cd)
{
local64_set(&cd->ccnt->hw.prev_count, S64_MIN);
lo_hi_writeq_relaxed(S64_MIN, cd->pmu_base + NI_PMCCNTR_L);
}
static void arm_ni_init_evcnt(struct arm_ni_cd *cd, int idx)
{
local64_set(&cd->evcnt[idx]->hw.prev_count, S32_MIN);
writel_relaxed(S32_MIN, cd->pmu_base + NI_PMEVCNTR(idx));
}
static int arm_ni_event_add(struct perf_event *event, int flags)
{
struct arm_ni_cd *cd = pmu_to_cd(event->pmu);
struct hw_perf_event *hw = &event->hw;
struct arm_ni_unit *unit;
enum ni_node_type type = NI_EVENT_TYPE(event);
u32 reg;
if (type == NI_PMU) {
if (cd->ccnt)
return -ENOSPC;
hw->idx = NI_CCNT_IDX;
cd->ccnt = event;
arm_ni_init_ccnt(cd);
} else {
hw->idx = 0;
while (cd->evcnt[hw->idx]) {
if (++hw->idx == NI_NUM_COUNTERS)
return -ENOSPC;
}
cd->evcnt[hw->idx] = event;
unit = (void *)hw->config_base;
unit->event[hw->idx] = NI_EVENT_EVENTID(event);
arm_ni_init_evcnt(cd, hw->idx);
lo_hi_writeq_relaxed(le64_to_cpu(unit->pmusel), unit->pmusela);
reg = FIELD_PREP(NI_PMEVTYPER_NODE_TYPE, type) |
FIELD_PREP(NI_PMEVTYPER_NODE_ID, NI_EVENT_NODEID(event));
writel_relaxed(reg, cd->pmu_base + NI_PMEVTYPER(hw->idx));
}
if (flags & PERF_EF_START)
arm_ni_event_start(event, 0);
return 0;
}
static void arm_ni_event_del(struct perf_event *event, int flags)
{
struct arm_ni_cd *cd = pmu_to_cd(event->pmu);
struct hw_perf_event *hw = &event->hw;
arm_ni_event_stop(event, PERF_EF_UPDATE);
if (hw->idx == NI_CCNT_IDX)
cd->ccnt = NULL;
else
cd->evcnt[hw->idx] = NULL;
}
static irqreturn_t arm_ni_handle_irq(int irq, void *dev_id)
{
struct arm_ni_cd *cd = dev_id;
irqreturn_t ret = IRQ_NONE;
u32 reg = readl_relaxed(cd->pmu_base + NI_PMOVSCLR);
if (reg & (1U << NI_CCNT_IDX)) {
ret = IRQ_HANDLED;
if (!(WARN_ON(!cd->ccnt))) {
arm_ni_event_read(cd->ccnt);
arm_ni_init_ccnt(cd);
}
}
for (int i = 0; i < NI_NUM_COUNTERS; i++) {
if (!(reg & (1U << i)))
continue;
ret = IRQ_HANDLED;
if (!(WARN_ON(!cd->evcnt[i]))) {
arm_ni_event_read(cd->evcnt[i]);
arm_ni_init_evcnt(cd, i);
}
}
writel_relaxed(reg, cd->pmu_base + NI_PMOVSCLR);
return ret;
}
static int arm_ni_init_cd(struct arm_ni *ni, struct arm_ni_node *node, u64 res_start)
{
struct arm_ni_cd *cd = ni->cds + node->id;
const char *name;
int err;
cd->id = node->id;
cd->num_units = node->num_components;
cd->units = devm_kcalloc(ni->dev, cd->num_units, sizeof(*(cd->units)), GFP_KERNEL);
if (!cd->units)
return -ENOMEM;
for (int i = 0; i < cd->num_units; i++) {
u32 reg = readl_relaxed(node->base + NI_CHILD_PTR(i));
void __iomem *unit_base = ni->base + reg;
struct arm_ni_unit *unit = cd->units + i;
reg = readl_relaxed(unit_base + NI_NODE_TYPE);
unit->type = FIELD_GET(NI_NODE_TYPE_NODE_TYPE, reg);
unit->id = FIELD_GET(NI_NODE_TYPE_NODE_ID, reg);
switch (unit->type) {
case NI_PMU:
reg = readl_relaxed(unit_base + NI_PMCFGR);
if (!reg) {
dev_info(ni->dev, "No access to PMU %d\n", cd->id);
devm_kfree(ni->dev, cd->units);
return 0;
}
unit->ns = true;
cd->pmu_base = unit_base;
break;
case NI_ASNI:
case NI_AMNI:
case NI_HSNI:
case NI_HMNI:
case NI_PMNI:
unit->pmusela = unit_base + NI700_PMUSELA;
writel_relaxed(1, unit->pmusela);
if (readl_relaxed(unit->pmusela) != 1)
dev_info(ni->dev, "No access to node 0x%04x%04x\n", unit->id, unit->type);
else
unit->ns = true;
break;
default:
/*
* e.g. FMU - thankfully bits 3:2 of FMU_ERR_FR0 are RES0 so
* can't alias any of the leaf node types we're looking for.
*/
dev_dbg(ni->dev, "Mystery node 0x%04x%04x\n", unit->id, unit->type);
break;
}
}
res_start += cd->pmu_base - ni->base;
if (!devm_request_mem_region(ni->dev, res_start, SZ_4K, dev_name(ni->dev))) {
dev_err(ni->dev, "Failed to request PMU region 0x%llx\n", res_start);
return -EBUSY;
}
writel_relaxed(NI_PMCR_RESET_CCNT | NI_PMCR_RESET_EVCNT,
cd->pmu_base + NI_PMCR);
writel_relaxed(U32_MAX, cd->pmu_base + NI_PMCNTENCLR);
writel_relaxed(U32_MAX, cd->pmu_base + NI_PMOVSCLR);
writel_relaxed(U32_MAX, cd->pmu_base + NI_PMINTENSET);
cd->irq = platform_get_irq(to_platform_device(ni->dev), cd->id);
if (cd->irq < 0)
return cd->irq;
err = devm_request_irq(ni->dev, cd->irq, arm_ni_handle_irq,
IRQF_NOBALANCING | IRQF_NO_THREAD,
dev_name(ni->dev), cd);
if (err)
return err;
cd->cpu = cpumask_local_spread(0, dev_to_node(ni->dev));
cd->pmu = (struct pmu) {
.module = THIS_MODULE,
.parent = ni->dev,
.attr_groups = arm_ni_attr_groups,
.capabilities = PERF_PMU_CAP_NO_EXCLUDE,
.task_ctx_nr = perf_invalid_context,
.pmu_enable = arm_ni_pmu_enable,
.pmu_disable = arm_ni_pmu_disable,
.event_init = arm_ni_event_init,
.add = arm_ni_event_add,
.del = arm_ni_event_del,
.start = arm_ni_event_start,
.stop = arm_ni_event_stop,
.read = arm_ni_event_read,
};
name = devm_kasprintf(ni->dev, GFP_KERNEL, "arm_ni_%d_cd_%d", ni->id, cd->id);
if (!name)
return -ENOMEM;
err = cpuhp_state_add_instance_nocalls(arm_ni_hp_state, &cd->cpuhp_node);
if (err)
return err;
err = perf_pmu_register(&cd->pmu, name, -1);
if (err)
cpuhp_state_remove_instance_nocalls(arm_ni_hp_state, &cd->cpuhp_node);
return err;
}
static void arm_ni_probe_domain(void __iomem *base, struct arm_ni_node *node)
{
u32 reg = readl_relaxed(base + NI_NODE_TYPE);
node->base = base;
node->type = FIELD_GET(NI_NODE_TYPE_NODE_TYPE, reg);
node->id = FIELD_GET(NI_NODE_TYPE_NODE_ID, reg);
node->num_components = readl_relaxed(base + NI_CHILD_NODE_INFO);
}
static int arm_ni_probe(struct platform_device *pdev)
{
struct arm_ni_node cfg, vd, pd, cd;
struct arm_ni *ni;
struct resource *res;
void __iomem *base;
static atomic_t id;
int num_cds;
u32 reg, part;
/*
* We want to map the whole configuration space for ease of discovery,
* but the PMU pages are the only ones for which we can honestly claim
* exclusive ownership, so we'll request them explicitly once found.
*/
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
base = devm_ioremap(&pdev->dev, res->start, resource_size(res));
if (!base)
return -ENOMEM;
arm_ni_probe_domain(base, &cfg);
if (cfg.type != NI_GLOBAL)
return -ENODEV;
reg = readl_relaxed(cfg.base + NI_PERIPHERAL_ID0);
part = FIELD_GET(NI_PIDR0_PART_7_0, reg);
reg = readl_relaxed(cfg.base + NI_PERIPHERAL_ID1);
part |= FIELD_GET(NI_PIDR1_PART_11_8, reg) << 8;
switch (part) {
case PART_NI_700:
case PART_NI_710AE:
break;
default:
dev_WARN(&pdev->dev, "Unknown part number: 0x%03x, this may go badly\n", part);
break;
}
num_cds = 0;
for (int v = 0; v < cfg.num_components; v++) {
reg = readl_relaxed(cfg.base + NI_CHILD_PTR(v));
arm_ni_probe_domain(base + reg, &vd);
for (int p = 0; p < vd.num_components; p++) {
reg = readl_relaxed(vd.base + NI_CHILD_PTR(p));
arm_ni_probe_domain(base + reg, &pd);
num_cds += pd.num_components;
}
}
ni = devm_kzalloc(&pdev->dev, struct_size(ni, cds, num_cds), GFP_KERNEL);
if (!ni)
return -ENOMEM;
ni->dev = &pdev->dev;
ni->base = base;
ni->num_cds = num_cds;
ni->part = part;
ni->id = atomic_fetch_inc(&id);
for (int v = 0; v < cfg.num_components; v++) {
reg = readl_relaxed(cfg.base + NI_CHILD_PTR(v));
arm_ni_probe_domain(base + reg, &vd);
for (int p = 0; p < vd.num_components; p++) {
reg = readl_relaxed(vd.base + NI_CHILD_PTR(p));
arm_ni_probe_domain(base + reg, &pd);
for (int c = 0; c < pd.num_components; c++) {
int ret;
reg = readl_relaxed(pd.base + NI_CHILD_PTR(c));
arm_ni_probe_domain(base + reg, &cd);
ret = arm_ni_init_cd(ni, &cd, res->start);
if (ret)
return ret;
}
}
}
return 0;
}
static void arm_ni_remove(struct platform_device *pdev)
{
struct arm_ni *ni = platform_get_drvdata(pdev);
for (int i = 0; i < ni->num_cds; i++) {
struct arm_ni_cd *cd = ni->cds + i;
if (!cd->pmu_base)
continue;
writel_relaxed(0, cd->pmu_base + NI_PMCR);
writel_relaxed(U32_MAX, cd->pmu_base + NI_PMINTENCLR);
perf_pmu_unregister(&cd->pmu);
cpuhp_state_remove_instance_nocalls(arm_ni_hp_state, &cd->cpuhp_node);
}
}
#ifdef CONFIG_OF
static const struct of_device_id arm_ni_of_match[] = {
{ .compatible = "arm,ni-700" },
{}
};
MODULE_DEVICE_TABLE(of, arm_ni_of_match);
#endif
#ifdef CONFIG_ACPI
static const struct acpi_device_id arm_ni_acpi_match[] = {
{ "ARMHCB70" },
{}
};
MODULE_DEVICE_TABLE(acpi, arm_ni_acpi_match);
#endif
static struct platform_driver arm_ni_driver = {
.driver = {
.name = "arm-ni",
.of_match_table = of_match_ptr(arm_ni_of_match),
.acpi_match_table = ACPI_PTR(arm_ni_acpi_match),
},
.probe = arm_ni_probe,
.remove = arm_ni_remove,
};
static void arm_ni_pmu_migrate(struct arm_ni_cd *cd, unsigned int cpu)
{
perf_pmu_migrate_context(&cd->pmu, cd->cpu, cpu);
irq_set_affinity(cd->irq, cpumask_of(cpu));
cd->cpu = cpu;
}
static int arm_ni_pmu_online_cpu(unsigned int cpu, struct hlist_node *cpuhp_node)
{
struct arm_ni_cd *cd;
int node;
cd = hlist_entry_safe(cpuhp_node, struct arm_ni_cd, cpuhp_node);
node = dev_to_node(cd_to_ni(cd)->dev);
if (cpu_to_node(cd->cpu) != node && cpu_to_node(cpu) == node)
arm_ni_pmu_migrate(cd, cpu);
return 0;
}
static int arm_ni_pmu_offline_cpu(unsigned int cpu, struct hlist_node *cpuhp_node)
{
struct arm_ni_cd *cd;
unsigned int target;
int node;
cd = hlist_entry_safe(cpuhp_node, struct arm_ni_cd, cpuhp_node);
if (cpu != cd->cpu)
return 0;
node = dev_to_node(cd_to_ni(cd)->dev);
target = cpumask_any_and_but(cpumask_of_node(node), cpu_online_mask, cpu);
if (target >= nr_cpu_ids)
target = cpumask_any_but(cpu_online_mask, cpu);
if (target < nr_cpu_ids)
arm_ni_pmu_migrate(cd, target);
return 0;
}
static int __init arm_ni_init(void)
{
int ret;
ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN,
"perf/arm/ni:online",
arm_ni_pmu_online_cpu,
arm_ni_pmu_offline_cpu);
if (ret < 0)
return ret;
arm_ni_hp_state = ret;
ret = platform_driver_register(&arm_ni_driver);
if (ret)
cpuhp_remove_multi_state(arm_ni_hp_state);
return ret;
}
static void __exit arm_ni_exit(void)
{
platform_driver_unregister(&arm_ni_driver);
cpuhp_remove_multi_state(arm_ni_hp_state);
}
module_init(arm_ni_init);
module_exit(arm_ni_exit);
MODULE_AUTHOR("Robin Murphy <robin.murphy@arm.com>");
MODULE_DESCRIPTION("Arm NI-700 PMU driver");
MODULE_LICENSE("GPL v2");

View File

@ -522,7 +522,7 @@ static void armpmu_enable(struct pmu *pmu)
{
struct arm_pmu *armpmu = to_arm_pmu(pmu);
struct pmu_hw_events *hw_events = this_cpu_ptr(armpmu->hw_events);
bool enabled = !bitmap_empty(hw_events->used_mask, armpmu->num_events);
bool enabled = !bitmap_empty(hw_events->used_mask, ARMPMU_MAX_HWEVENTS);
/* For task-bound events we may be called on other CPUs */
if (!cpumask_test_cpu(smp_processor_id(), &armpmu->supported_cpus))
@ -742,7 +742,7 @@ static void cpu_pm_pmu_setup(struct arm_pmu *armpmu, unsigned long cmd)
struct perf_event *event;
int idx;
for (idx = 0; idx < armpmu->num_events; idx++) {
for_each_set_bit(idx, armpmu->cntr_mask, ARMPMU_MAX_HWEVENTS) {
event = hw_events->events[idx];
if (!event)
continue;
@ -772,7 +772,7 @@ static int cpu_pm_pmu_notify(struct notifier_block *b, unsigned long cmd,
{
struct arm_pmu *armpmu = container_of(b, struct arm_pmu, cpu_pm_nb);
struct pmu_hw_events *hw_events = this_cpu_ptr(armpmu->hw_events);
bool enabled = !bitmap_empty(hw_events->used_mask, armpmu->num_events);
bool enabled = !bitmap_empty(hw_events->used_mask, ARMPMU_MAX_HWEVENTS);
if (!cpumask_test_cpu(smp_processor_id(), &armpmu->supported_cpus))
return NOTIFY_DONE;
@ -924,8 +924,9 @@ int armpmu_register(struct arm_pmu *pmu)
if (ret)
goto out_destroy;
pr_info("enabled with %s PMU driver, %d counters available%s\n",
pmu->name, pmu->num_events,
pr_info("enabled with %s PMU driver, %d (%*pb) counters available%s\n",
pmu->name, bitmap_weight(pmu->cntr_mask, ARMPMU_MAX_HWEVENTS),
ARMPMU_MAX_HWEVENTS, &pmu->cntr_mask,
has_nmi ? ", using NMIs" : "");
kvm_host_pmu_init(pmu);

View File

@ -59,7 +59,7 @@ static int pmu_parse_percpu_irq(struct arm_pmu *pmu, int irq)
static bool pmu_has_irq_affinity(struct device_node *node)
{
return !!of_find_property(node, "interrupt-affinity", NULL);
return of_property_present(node, "interrupt-affinity");
}
static int pmu_parse_irq_affinity(struct device *dev, int i)

View File

@ -451,13 +451,6 @@ static const struct attribute_group armv8_pmuv3_caps_attr_group = {
.attrs = armv8_pmuv3_caps_attrs,
};
/*
* Perf Events' indices
*/
#define ARMV8_IDX_CYCLE_COUNTER 0
#define ARMV8_IDX_COUNTER0 1
#define ARMV8_IDX_CYCLE_COUNTER_USER 32
/*
* We unconditionally enable ARMv8.5-PMU long event counter support
* (64-bit events) where supported. Indicate if this arm_pmu has long
@ -489,19 +482,12 @@ static bool armv8pmu_event_is_chained(struct perf_event *event)
return !armv8pmu_event_has_user_read(event) &&
armv8pmu_event_is_64bit(event) &&
!armv8pmu_has_long_event(cpu_pmu) &&
(idx != ARMV8_IDX_CYCLE_COUNTER);
(idx < ARMV8_PMU_MAX_GENERAL_COUNTERS);
}
/*
* ARMv8 low level PMU access
*/
/*
* Perf Event to low level counters mapping
*/
#define ARMV8_IDX_TO_COUNTER(x) \
(((x) - ARMV8_IDX_COUNTER0) & ARMV8_PMU_COUNTER_MASK)
static u64 armv8pmu_pmcr_read(void)
{
return read_pmcr();
@ -514,21 +500,19 @@ static void armv8pmu_pmcr_write(u64 val)
write_pmcr(val);
}
static int armv8pmu_has_overflowed(u32 pmovsr)
static int armv8pmu_has_overflowed(u64 pmovsr)
{
return pmovsr & ARMV8_PMU_OVERFLOWED_MASK;
return !!(pmovsr & ARMV8_PMU_OVERFLOWED_MASK);
}
static int armv8pmu_counter_has_overflowed(u32 pmnc, int idx)
static int armv8pmu_counter_has_overflowed(u64 pmnc, int idx)
{
return pmnc & BIT(ARMV8_IDX_TO_COUNTER(idx));
return !!(pmnc & BIT(idx));
}
static u64 armv8pmu_read_evcntr(int idx)
{
u32 counter = ARMV8_IDX_TO_COUNTER(idx);
return read_pmevcntrn(counter);
return read_pmevcntrn(idx);
}
static u64 armv8pmu_read_hw_counter(struct perf_event *event)
@ -557,7 +541,7 @@ static bool armv8pmu_event_needs_bias(struct perf_event *event)
return false;
if (armv8pmu_has_long_event(cpu_pmu) ||
idx == ARMV8_IDX_CYCLE_COUNTER)
idx >= ARMV8_PMU_MAX_GENERAL_COUNTERS)
return true;
return false;
@ -585,8 +569,10 @@ static u64 armv8pmu_read_counter(struct perf_event *event)
int idx = hwc->idx;
u64 value;
if (idx == ARMV8_IDX_CYCLE_COUNTER)
if (idx == ARMV8_PMU_CYCLE_IDX)
value = read_pmccntr();
else if (idx == ARMV8_PMU_INSTR_IDX)
value = read_pmicntr();
else
value = armv8pmu_read_hw_counter(event);
@ -595,9 +581,7 @@ static u64 armv8pmu_read_counter(struct perf_event *event)
static void armv8pmu_write_evcntr(int idx, u64 value)
{
u32 counter = ARMV8_IDX_TO_COUNTER(idx);
write_pmevcntrn(counter, value);
write_pmevcntrn(idx, value);
}
static void armv8pmu_write_hw_counter(struct perf_event *event,
@ -620,15 +604,16 @@ static void armv8pmu_write_counter(struct perf_event *event, u64 value)
value = armv8pmu_bias_long_counter(event, value);
if (idx == ARMV8_IDX_CYCLE_COUNTER)
if (idx == ARMV8_PMU_CYCLE_IDX)
write_pmccntr(value);
else if (idx == ARMV8_PMU_INSTR_IDX)
write_pmicntr(value);
else
armv8pmu_write_hw_counter(event, value);
}
static void armv8pmu_write_evtype(int idx, unsigned long val)
{
u32 counter = ARMV8_IDX_TO_COUNTER(idx);
unsigned long mask = ARMV8_PMU_EVTYPE_EVENT |
ARMV8_PMU_INCLUDE_EL2 |
ARMV8_PMU_EXCLUDE_EL0 |
@ -638,7 +623,7 @@ static void armv8pmu_write_evtype(int idx, unsigned long val)
mask |= ARMV8_PMU_EVTYPE_TC | ARMV8_PMU_EVTYPE_TH;
val &= mask;
write_pmevtypern(counter, val);
write_pmevtypern(idx, val);
}
static void armv8pmu_write_event_type(struct perf_event *event)
@ -658,24 +643,26 @@ static void armv8pmu_write_event_type(struct perf_event *event)
armv8pmu_write_evtype(idx - 1, hwc->config_base);
armv8pmu_write_evtype(idx, chain_evt);
} else {
if (idx == ARMV8_IDX_CYCLE_COUNTER)
if (idx == ARMV8_PMU_CYCLE_IDX)
write_pmccfiltr(hwc->config_base);
else if (idx == ARMV8_PMU_INSTR_IDX)
write_pmicfiltr(hwc->config_base);
else
armv8pmu_write_evtype(idx, hwc->config_base);
}
}
static u32 armv8pmu_event_cnten_mask(struct perf_event *event)
static u64 armv8pmu_event_cnten_mask(struct perf_event *event)
{
int counter = ARMV8_IDX_TO_COUNTER(event->hw.idx);
u32 mask = BIT(counter);
int counter = event->hw.idx;
u64 mask = BIT(counter);
if (armv8pmu_event_is_chained(event))
mask |= BIT(counter - 1);
return mask;
}
static void armv8pmu_enable_counter(u32 mask)
static void armv8pmu_enable_counter(u64 mask)
{
/*
* Make sure event configuration register writes are visible before we
@ -688,7 +675,7 @@ static void armv8pmu_enable_counter(u32 mask)
static void armv8pmu_enable_event_counter(struct perf_event *event)
{
struct perf_event_attr *attr = &event->attr;
u32 mask = armv8pmu_event_cnten_mask(event);
u64 mask = armv8pmu_event_cnten_mask(event);
kvm_set_pmu_events(mask, attr);
@ -697,7 +684,7 @@ static void armv8pmu_enable_event_counter(struct perf_event *event)
armv8pmu_enable_counter(mask);
}
static void armv8pmu_disable_counter(u32 mask)
static void armv8pmu_disable_counter(u64 mask)
{
write_pmcntenclr(mask);
/*
@ -710,7 +697,7 @@ static void armv8pmu_disable_counter(u32 mask)
static void armv8pmu_disable_event_counter(struct perf_event *event)
{
struct perf_event_attr *attr = &event->attr;
u32 mask = armv8pmu_event_cnten_mask(event);
u64 mask = armv8pmu_event_cnten_mask(event);
kvm_clr_pmu_events(mask);
@ -719,18 +706,17 @@ static void armv8pmu_disable_event_counter(struct perf_event *event)
armv8pmu_disable_counter(mask);
}
static void armv8pmu_enable_intens(u32 mask)
static void armv8pmu_enable_intens(u64 mask)
{
write_pmintenset(mask);
}
static void armv8pmu_enable_event_irq(struct perf_event *event)
{
u32 counter = ARMV8_IDX_TO_COUNTER(event->hw.idx);
armv8pmu_enable_intens(BIT(counter));
armv8pmu_enable_intens(BIT(event->hw.idx));
}
static void armv8pmu_disable_intens(u32 mask)
static void armv8pmu_disable_intens(u64 mask)
{
write_pmintenclr(mask);
isb();
@ -741,13 +727,12 @@ static void armv8pmu_disable_intens(u32 mask)
static void armv8pmu_disable_event_irq(struct perf_event *event)
{
u32 counter = ARMV8_IDX_TO_COUNTER(event->hw.idx);
armv8pmu_disable_intens(BIT(counter));
armv8pmu_disable_intens(BIT(event->hw.idx));
}
static u32 armv8pmu_getreset_flags(void)
static u64 armv8pmu_getreset_flags(void)
{
u32 value;
u64 value;
/* Read */
value = read_pmovsclr();
@ -786,9 +771,12 @@ static void armv8pmu_enable_user_access(struct arm_pmu *cpu_pmu)
struct pmu_hw_events *cpuc = this_cpu_ptr(cpu_pmu->hw_events);
/* Clear any unused counters to avoid leaking their contents */
for_each_clear_bit(i, cpuc->used_mask, cpu_pmu->num_events) {
if (i == ARMV8_IDX_CYCLE_COUNTER)
for_each_andnot_bit(i, cpu_pmu->cntr_mask, cpuc->used_mask,
ARMPMU_MAX_HWEVENTS) {
if (i == ARMV8_PMU_CYCLE_IDX)
write_pmccntr(0);
else if (i == ARMV8_PMU_INSTR_IDX)
write_pmicntr(0);
else
armv8pmu_write_evcntr(i, 0);
}
@ -842,7 +830,7 @@ static void armv8pmu_stop(struct arm_pmu *cpu_pmu)
static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
{
u32 pmovsr;
u64 pmovsr;
struct perf_sample_data data;
struct pmu_hw_events *cpuc = this_cpu_ptr(cpu_pmu->hw_events);
struct pt_regs *regs;
@ -869,7 +857,7 @@ static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
* to prevent skews in group events.
*/
armv8pmu_stop(cpu_pmu);
for (idx = 0; idx < cpu_pmu->num_events; ++idx) {
for_each_set_bit(idx, cpu_pmu->cntr_mask, ARMPMU_MAX_HWEVENTS) {
struct perf_event *event = cpuc->events[idx];
struct hw_perf_event *hwc;
@ -908,7 +896,7 @@ static int armv8pmu_get_single_idx(struct pmu_hw_events *cpuc,
{
int idx;
for (idx = ARMV8_IDX_COUNTER0; idx < cpu_pmu->num_events; idx++) {
for_each_set_bit(idx, cpu_pmu->cntr_mask, ARMV8_PMU_MAX_GENERAL_COUNTERS) {
if (!test_and_set_bit(idx, cpuc->used_mask))
return idx;
}
@ -924,7 +912,9 @@ static int armv8pmu_get_chain_idx(struct pmu_hw_events *cpuc,
* Chaining requires two consecutive event counters, where
* the lower idx must be even.
*/
for (idx = ARMV8_IDX_COUNTER0 + 1; idx < cpu_pmu->num_events; idx += 2) {
for_each_set_bit(idx, cpu_pmu->cntr_mask, ARMV8_PMU_MAX_GENERAL_COUNTERS) {
if (!(idx & 0x1))
continue;
if (!test_and_set_bit(idx, cpuc->used_mask)) {
/* Check if the preceding even counter is available */
if (!test_and_set_bit(idx - 1, cpuc->used_mask))
@ -946,14 +936,27 @@ static int armv8pmu_get_event_idx(struct pmu_hw_events *cpuc,
/* Always prefer to place a cycle counter into the cycle counter. */
if ((evtype == ARMV8_PMUV3_PERFCTR_CPU_CYCLES) &&
!armv8pmu_event_get_threshold(&event->attr)) {
if (!test_and_set_bit(ARMV8_IDX_CYCLE_COUNTER, cpuc->used_mask))
return ARMV8_IDX_CYCLE_COUNTER;
if (!test_and_set_bit(ARMV8_PMU_CYCLE_IDX, cpuc->used_mask))
return ARMV8_PMU_CYCLE_IDX;
else if (armv8pmu_event_is_64bit(event) &&
armv8pmu_event_want_user_access(event) &&
!armv8pmu_has_long_event(cpu_pmu))
return -EAGAIN;
}
/*
* Always prefer to place a instruction counter into the instruction counter,
* but don't expose the instruction counter to userspace access as userspace
* may not know how to handle it.
*/
if ((evtype == ARMV8_PMUV3_PERFCTR_INST_RETIRED) &&
!armv8pmu_event_get_threshold(&event->attr) &&
test_bit(ARMV8_PMU_INSTR_IDX, cpu_pmu->cntr_mask) &&
!armv8pmu_event_want_user_access(event)) {
if (!test_and_set_bit(ARMV8_PMU_INSTR_IDX, cpuc->used_mask))
return ARMV8_PMU_INSTR_IDX;
}
/*
* Otherwise use events counters
*/
@ -978,15 +981,7 @@ static int armv8pmu_user_event_idx(struct perf_event *event)
if (!sysctl_perf_user_access || !armv8pmu_event_has_user_read(event))
return 0;
/*
* We remap the cycle counter index to 32 to
* match the offset applied to the rest of
* the counter indices.
*/
if (event->hw.idx == ARMV8_IDX_CYCLE_COUNTER)
return ARMV8_IDX_CYCLE_COUNTER_USER;
return event->hw.idx;
return event->hw.idx + 1;
}
/*
@ -1061,14 +1056,16 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
static void armv8pmu_reset(void *info)
{
struct arm_pmu *cpu_pmu = (struct arm_pmu *)info;
u64 pmcr;
u64 pmcr, mask;
bitmap_to_arr64(&mask, cpu_pmu->cntr_mask, ARMPMU_MAX_HWEVENTS);
/* The counter and interrupt enable registers are unknown at reset. */
armv8pmu_disable_counter(U32_MAX);
armv8pmu_disable_intens(U32_MAX);
armv8pmu_disable_counter(mask);
armv8pmu_disable_intens(mask);
/* Clear the counters we flip at guest entry/exit */
kvm_clr_pmu_events(U32_MAX);
kvm_clr_pmu_events(mask);
/*
* Initialize & Reset PMNC. Request overflow interrupt for
@ -1089,14 +1086,14 @@ static int __armv8_pmuv3_map_event_id(struct arm_pmu *armpmu,
if (event->attr.type == PERF_TYPE_HARDWARE &&
event->attr.config == PERF_COUNT_HW_BRANCH_INSTRUCTIONS) {
if (test_bit(ARMV8_PMUV3_PERFCTR_PC_WRITE_RETIRED,
armpmu->pmceid_bitmap))
return ARMV8_PMUV3_PERFCTR_PC_WRITE_RETIRED;
if (test_bit(ARMV8_PMUV3_PERFCTR_BR_RETIRED,
armpmu->pmceid_bitmap))
return ARMV8_PMUV3_PERFCTR_BR_RETIRED;
if (test_bit(ARMV8_PMUV3_PERFCTR_PC_WRITE_RETIRED,
armpmu->pmceid_bitmap))
return ARMV8_PMUV3_PERFCTR_PC_WRITE_RETIRED;
return HW_OP_UNSUPPORTED;
}
@ -1211,10 +1208,15 @@ static void __armv8pmu_probe_pmu(void *info)
probe->present = true;
/* Read the nb of CNTx counters supported from PMNC */
cpu_pmu->num_events = FIELD_GET(ARMV8_PMU_PMCR_N, armv8pmu_pmcr_read());
bitmap_set(cpu_pmu->cntr_mask,
0, FIELD_GET(ARMV8_PMU_PMCR_N, armv8pmu_pmcr_read()));
/* Add the CPU cycles counter */
cpu_pmu->num_events += 1;
set_bit(ARMV8_PMU_CYCLE_IDX, cpu_pmu->cntr_mask);
/* Add the CPU instructions counter */
if (pmuv3_has_icntr())
set_bit(ARMV8_PMU_INSTR_IDX, cpu_pmu->cntr_mask);
pmceid[0] = pmceid_raw[0] = read_pmceid0();
pmceid[1] = pmceid_raw[1] = read_pmceid1();

View File

@ -41,7 +41,7 @@
/*
* Cache if the event is allowed to trace Context information.
* This allows us to perform the check, i.e, perfmon_capable(),
* This allows us to perform the check, i.e, perf_allow_kernel(),
* in the context of the event owner, once, during the event_init().
*/
#define SPE_PMU_HW_FLAGS_CX 0x00001
@ -50,7 +50,7 @@ static_assert((PERF_EVENT_FLAG_ARCH & SPE_PMU_HW_FLAGS_CX) == SPE_PMU_HW_FLAGS_C
static void set_spe_event_has_cx(struct perf_event *event)
{
if (IS_ENABLED(CONFIG_PID_IN_CONTEXTIDR) && perfmon_capable())
if (IS_ENABLED(CONFIG_PID_IN_CONTEXTIDR) && !perf_allow_kernel(&event->attr))
event->hw.flags |= SPE_PMU_HW_FLAGS_CX;
}
@ -745,9 +745,8 @@ static int arm_spe_pmu_event_init(struct perf_event *event)
set_spe_event_has_cx(event);
reg = arm_spe_event_to_pmscr(event);
if (!perfmon_capable() &&
(reg & (PMSCR_EL1_PA | PMSCR_EL1_PCT)))
return -EACCES;
if (reg & (PMSCR_EL1_PA | PMSCR_EL1_PCT))
return perf_allow_kernel(&event->attr);
return 0;
}

View File

@ -64,6 +64,7 @@ enum armv6_counters {
ARMV6_CYCLE_COUNTER = 0,
ARMV6_COUNTER0,
ARMV6_COUNTER1,
ARMV6_NUM_COUNTERS
};
/*
@ -254,7 +255,7 @@ armv6pmu_handle_irq(struct arm_pmu *cpu_pmu)
*/
armv6_pmcr_write(pmcr);
for (idx = 0; idx < cpu_pmu->num_events; ++idx) {
for_each_set_bit(idx, cpu_pmu->cntr_mask, ARMV6_NUM_COUNTERS) {
struct perf_event *event = cpuc->events[idx];
struct hw_perf_event *hwc;
@ -391,7 +392,8 @@ static void armv6pmu_init(struct arm_pmu *cpu_pmu)
cpu_pmu->start = armv6pmu_start;
cpu_pmu->stop = armv6pmu_stop;
cpu_pmu->map_event = armv6_map_event;
cpu_pmu->num_events = 3;
bitmap_set(cpu_pmu->cntr_mask, 0, ARMV6_NUM_COUNTERS);
}
static int armv6_1136_pmu_init(struct arm_pmu *cpu_pmu)

View File

@ -649,24 +649,12 @@ static struct attribute_group armv7_pmuv2_events_attr_group = {
/*
* Perf Events' indices
*/
#define ARMV7_IDX_CYCLE_COUNTER 0
#define ARMV7_IDX_COUNTER0 1
#define ARMV7_IDX_COUNTER_LAST(cpu_pmu) \
(ARMV7_IDX_CYCLE_COUNTER + cpu_pmu->num_events - 1)
#define ARMV7_MAX_COUNTERS 32
#define ARMV7_COUNTER_MASK (ARMV7_MAX_COUNTERS - 1)
#define ARMV7_IDX_CYCLE_COUNTER 31
#define ARMV7_IDX_COUNTER_MAX 31
/*
* ARMv7 low level PMNC access
*/
/*
* Perf Event to low level counters mapping
*/
#define ARMV7_IDX_TO_COUNTER(x) \
(((x) - ARMV7_IDX_COUNTER0) & ARMV7_COUNTER_MASK)
/*
* Per-CPU PMNC: config reg
*/
@ -725,19 +713,17 @@ static inline int armv7_pmnc_has_overflowed(u32 pmnc)
static inline int armv7_pmnc_counter_valid(struct arm_pmu *cpu_pmu, int idx)
{
return idx >= ARMV7_IDX_CYCLE_COUNTER &&
idx <= ARMV7_IDX_COUNTER_LAST(cpu_pmu);
return test_bit(idx, cpu_pmu->cntr_mask);
}
static inline int armv7_pmnc_counter_has_overflowed(u32 pmnc, int idx)
{
return pmnc & BIT(ARMV7_IDX_TO_COUNTER(idx));
return pmnc & BIT(idx);
}
static inline void armv7_pmnc_select_counter(int idx)
{
u32 counter = ARMV7_IDX_TO_COUNTER(idx);
asm volatile("mcr p15, 0, %0, c9, c12, 5" : : "r" (counter));
asm volatile("mcr p15, 0, %0, c9, c12, 5" : : "r" (idx));
isb();
}
@ -787,29 +773,25 @@ static inline void armv7_pmnc_write_evtsel(int idx, u32 val)
static inline void armv7_pmnc_enable_counter(int idx)
{
u32 counter = ARMV7_IDX_TO_COUNTER(idx);
asm volatile("mcr p15, 0, %0, c9, c12, 1" : : "r" (BIT(counter)));
asm volatile("mcr p15, 0, %0, c9, c12, 1" : : "r" (BIT(idx)));
}
static inline void armv7_pmnc_disable_counter(int idx)
{
u32 counter = ARMV7_IDX_TO_COUNTER(idx);
asm volatile("mcr p15, 0, %0, c9, c12, 2" : : "r" (BIT(counter)));
asm volatile("mcr p15, 0, %0, c9, c12, 2" : : "r" (BIT(idx)));
}
static inline void armv7_pmnc_enable_intens(int idx)
{
u32 counter = ARMV7_IDX_TO_COUNTER(idx);
asm volatile("mcr p15, 0, %0, c9, c14, 1" : : "r" (BIT(counter)));
asm volatile("mcr p15, 0, %0, c9, c14, 1" : : "r" (BIT(idx)));
}
static inline void armv7_pmnc_disable_intens(int idx)
{
u32 counter = ARMV7_IDX_TO_COUNTER(idx);
asm volatile("mcr p15, 0, %0, c9, c14, 2" : : "r" (BIT(counter)));
asm volatile("mcr p15, 0, %0, c9, c14, 2" : : "r" (BIT(idx)));
isb();
/* Clear the overflow flag in case an interrupt is pending. */
asm volatile("mcr p15, 0, %0, c9, c12, 3" : : "r" (BIT(counter)));
asm volatile("mcr p15, 0, %0, c9, c12, 3" : : "r" (BIT(idx)));
isb();
}
@ -853,15 +835,12 @@ static void armv7_pmnc_dump_regs(struct arm_pmu *cpu_pmu)
asm volatile("mrc p15, 0, %0, c9, c13, 0" : "=r" (val));
pr_info("CCNT =0x%08x\n", val);
for (cnt = ARMV7_IDX_COUNTER0;
cnt <= ARMV7_IDX_COUNTER_LAST(cpu_pmu); cnt++) {
for_each_set_bit(cnt, cpu_pmu->cntr_mask, ARMV7_IDX_COUNTER_MAX) {
armv7_pmnc_select_counter(cnt);
asm volatile("mrc p15, 0, %0, c9, c13, 2" : "=r" (val));
pr_info("CNT[%d] count =0x%08x\n",
ARMV7_IDX_TO_COUNTER(cnt), val);
pr_info("CNT[%d] count =0x%08x\n", cnt, val);
asm volatile("mrc p15, 0, %0, c9, c13, 1" : "=r" (val));
pr_info("CNT[%d] evtsel=0x%08x\n",
ARMV7_IDX_TO_COUNTER(cnt), val);
pr_info("CNT[%d] evtsel=0x%08x\n", cnt, val);
}
}
#endif
@ -958,7 +937,7 @@ static irqreturn_t armv7pmu_handle_irq(struct arm_pmu *cpu_pmu)
*/
regs = get_irq_regs();
for (idx = 0; idx < cpu_pmu->num_events; ++idx) {
for_each_set_bit(idx, cpu_pmu->cntr_mask, ARMPMU_MAX_HWEVENTS) {
struct perf_event *event = cpuc->events[idx];
struct hw_perf_event *hwc;
@ -1027,7 +1006,7 @@ static int armv7pmu_get_event_idx(struct pmu_hw_events *cpuc,
* For anything other than a cycle counter, try and use
* the events counters
*/
for (idx = ARMV7_IDX_COUNTER0; idx < cpu_pmu->num_events; ++idx) {
for_each_set_bit(idx, cpu_pmu->cntr_mask, ARMV7_IDX_COUNTER_MAX) {
if (!test_and_set_bit(idx, cpuc->used_mask))
return idx;
}
@ -1073,7 +1052,7 @@ static int armv7pmu_set_event_filter(struct hw_perf_event *event,
static void armv7pmu_reset(void *info)
{
struct arm_pmu *cpu_pmu = (struct arm_pmu *)info;
u32 idx, nb_cnt = cpu_pmu->num_events, val;
u32 idx, val;
if (cpu_pmu->secure_access) {
asm volatile("mrc p15, 0, %0, c1, c1, 1" : "=r" (val));
@ -1082,7 +1061,7 @@ static void armv7pmu_reset(void *info)
}
/* The counter and interrupt enable registers are unknown at reset. */
for (idx = ARMV7_IDX_CYCLE_COUNTER; idx < nb_cnt; ++idx) {
for_each_set_bit(idx, cpu_pmu->cntr_mask, ARMPMU_MAX_HWEVENTS) {
armv7_pmnc_disable_counter(idx);
armv7_pmnc_disable_intens(idx);
}
@ -1161,20 +1140,22 @@ static void armv7pmu_init(struct arm_pmu *cpu_pmu)
static void armv7_read_num_pmnc_events(void *info)
{
int *nb_cnt = info;
int nb_cnt;
struct arm_pmu *cpu_pmu = info;
/* Read the nb of CNTx counters supported from PMNC */
*nb_cnt = (armv7_pmnc_read() >> ARMV7_PMNC_N_SHIFT) & ARMV7_PMNC_N_MASK;
nb_cnt = (armv7_pmnc_read() >> ARMV7_PMNC_N_SHIFT) & ARMV7_PMNC_N_MASK;
bitmap_set(cpu_pmu->cntr_mask, 0, nb_cnt);
/* Add the CPU cycles counter */
*nb_cnt += 1;
set_bit(ARMV7_IDX_CYCLE_COUNTER, cpu_pmu->cntr_mask);
}
static int armv7_probe_num_events(struct arm_pmu *arm_pmu)
{
return smp_call_function_any(&arm_pmu->supported_cpus,
armv7_read_num_pmnc_events,
&arm_pmu->num_events, 1);
arm_pmu, 1);
}
static int armv7_a8_pmu_init(struct arm_pmu *cpu_pmu)
@ -1524,7 +1505,7 @@ static void krait_pmu_reset(void *info)
{
u32 vval, fval;
struct arm_pmu *cpu_pmu = info;
u32 idx, nb_cnt = cpu_pmu->num_events;
u32 idx;
armv7pmu_reset(info);
@ -1538,7 +1519,7 @@ static void krait_pmu_reset(void *info)
venum_post_pmresr(vval, fval);
/* Reset PMxEVNCTCR to sane default */
for (idx = ARMV7_IDX_CYCLE_COUNTER; idx < nb_cnt; ++idx) {
for_each_set_bit(idx, cpu_pmu->cntr_mask, ARMV7_IDX_COUNTER_MAX) {
armv7_pmnc_select_counter(idx);
asm volatile("mcr p15, 0, %0, c9, c15, 0" : : "r" (0));
}
@ -1562,7 +1543,7 @@ static int krait_event_to_bit(struct perf_event *event, unsigned int region,
* Lower bits are reserved for use by the counters (see
* armv7pmu_get_event_idx() for more info)
*/
bit += ARMV7_IDX_COUNTER_LAST(cpu_pmu) + 1;
bit += bitmap_weight(cpu_pmu->cntr_mask, ARMV7_IDX_COUNTER_MAX);
return bit;
}
@ -1845,7 +1826,7 @@ static void scorpion_pmu_reset(void *info)
{
u32 vval, fval;
struct arm_pmu *cpu_pmu = info;
u32 idx, nb_cnt = cpu_pmu->num_events;
u32 idx;
armv7pmu_reset(info);
@ -1860,7 +1841,7 @@ static void scorpion_pmu_reset(void *info)
venum_post_pmresr(vval, fval);
/* Reset PMxEVNCTCR to sane default */
for (idx = ARMV7_IDX_CYCLE_COUNTER; idx < nb_cnt; ++idx) {
for_each_set_bit(idx, cpu_pmu->cntr_mask, ARMV7_IDX_COUNTER_MAX) {
armv7_pmnc_select_counter(idx);
asm volatile("mcr p15, 0, %0, c9, c15, 0" : : "r" (0));
}
@ -1883,7 +1864,7 @@ static int scorpion_event_to_bit(struct perf_event *event, unsigned int region,
* Lower bits are reserved for use by the counters (see
* armv7pmu_get_event_idx() for more info)
*/
bit += ARMV7_IDX_COUNTER_LAST(cpu_pmu) + 1;
bit += bitmap_weight(cpu_pmu->cntr_mask, ARMV7_IDX_COUNTER_MAX);
return bit;
}

View File

@ -53,6 +53,8 @@ enum xscale_counters {
XSCALE_COUNTER2,
XSCALE_COUNTER3,
};
#define XSCALE1_NUM_COUNTERS 3
#define XSCALE2_NUM_COUNTERS 5
static const unsigned xscale_perf_map[PERF_COUNT_HW_MAX] = {
PERF_MAP_ALL_UNSUPPORTED,
@ -168,7 +170,7 @@ xscale1pmu_handle_irq(struct arm_pmu *cpu_pmu)
regs = get_irq_regs();
for (idx = 0; idx < cpu_pmu->num_events; ++idx) {
for_each_set_bit(idx, cpu_pmu->cntr_mask, XSCALE1_NUM_COUNTERS) {
struct perf_event *event = cpuc->events[idx];
struct hw_perf_event *hwc;
@ -364,7 +366,8 @@ static int xscale1pmu_init(struct arm_pmu *cpu_pmu)
cpu_pmu->start = xscale1pmu_start;
cpu_pmu->stop = xscale1pmu_stop;
cpu_pmu->map_event = xscale_map_event;
cpu_pmu->num_events = 3;
bitmap_set(cpu_pmu->cntr_mask, 0, XSCALE1_NUM_COUNTERS);
return 0;
}
@ -500,7 +503,7 @@ xscale2pmu_handle_irq(struct arm_pmu *cpu_pmu)
regs = get_irq_regs();
for (idx = 0; idx < cpu_pmu->num_events; ++idx) {
for_each_set_bit(idx, cpu_pmu->cntr_mask, XSCALE2_NUM_COUNTERS) {
struct perf_event *event = cpuc->events[idx];
struct hw_perf_event *hwc;
@ -719,7 +722,8 @@ static int xscale2pmu_init(struct arm_pmu *cpu_pmu)
cpu_pmu->start = xscale2pmu_start;
cpu_pmu->stop = xscale2pmu_stop;
cpu_pmu->map_event = xscale_map_event;
cpu_pmu->num_events = 5;
bitmap_set(cpu_pmu->cntr_mask, 0, XSCALE2_NUM_COUNTERS);
return 0;
}

View File

@ -107,6 +107,7 @@ struct dwc_pcie_vendor_id {
static const struct dwc_pcie_vendor_id dwc_pcie_vendor_ids[] = {
{.vendor_id = PCI_VENDOR_ID_ALIBABA },
{.vendor_id = PCI_VENDOR_ID_QCOM },
{} /* terminator */
};
@ -556,10 +557,10 @@ static int dwc_pcie_register_dev(struct pci_dev *pdev)
{
struct platform_device *plat_dev;
struct dwc_pcie_dev_info *dev_info;
u32 bdf;
u32 sbdf;
bdf = PCI_DEVID(pdev->bus->number, pdev->devfn);
plat_dev = platform_device_register_data(NULL, "dwc_pcie_pmu", bdf,
sbdf = (pci_domain_nr(pdev->bus) << 16) | PCI_DEVID(pdev->bus->number, pdev->devfn);
plat_dev = platform_device_register_data(NULL, "dwc_pcie_pmu", sbdf,
pdev, sizeof(*pdev));
if (IS_ERR(plat_dev))
@ -611,15 +612,15 @@ static int dwc_pcie_pmu_probe(struct platform_device *plat_dev)
struct pci_dev *pdev = plat_dev->dev.platform_data;
struct dwc_pcie_pmu *pcie_pmu;
char *name;
u32 bdf, val;
u32 sbdf, val;
u16 vsec;
int ret;
vsec = pci_find_vsec_capability(pdev, pdev->vendor,
DWC_PCIE_VSEC_RAS_DES_ID);
pci_read_config_dword(pdev, vsec + PCI_VNDR_HEADER, &val);
bdf = PCI_DEVID(pdev->bus->number, pdev->devfn);
name = devm_kasprintf(&plat_dev->dev, GFP_KERNEL, "dwc_rootport_%x", bdf);
sbdf = plat_dev->id;
name = devm_kasprintf(&plat_dev->dev, GFP_KERNEL, "dwc_rootport_%x", sbdf);
if (!name)
return -ENOMEM;
@ -650,7 +651,7 @@ static int dwc_pcie_pmu_probe(struct platform_device *plat_dev)
ret = cpuhp_state_add_instance(dwc_pcie_pmu_hp_state,
&pcie_pmu->cpuhp_node);
if (ret) {
pci_err(pdev, "Error %d registering hotplug @%x\n", ret, bdf);
pci_err(pdev, "Error %d registering hotplug @%x\n", ret, sbdf);
return ret;
}
@ -663,7 +664,7 @@ static int dwc_pcie_pmu_probe(struct platform_device *plat_dev)
ret = perf_pmu_register(&pcie_pmu->pmu, name, -1);
if (ret) {
pci_err(pdev, "Error %d registering PMU @%x\n", ret, bdf);
pci_err(pdev, "Error %d registering PMU @%x\n", ret, sbdf);
return ret;
}
ret = devm_add_action_or_reset(&plat_dev->dev, dwc_pcie_unregister_pmu,
@ -726,7 +727,6 @@ static struct platform_driver dwc_pcie_pmu_driver = {
static int __init dwc_pcie_pmu_init(void)
{
struct pci_dev *pdev = NULL;
bool found = false;
int ret;
for_each_pci_dev(pdev) {
@ -738,11 +738,7 @@ static int __init dwc_pcie_pmu_init(void)
pci_dev_put(pdev);
return ret;
}
found = true;
}
if (!found)
return -ENODEV;
ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN,
"perf/dwc_pcie_pmu:online",

View File

@ -141,6 +141,22 @@ static ssize_t bus_show(struct device *dev, struct device_attribute *attr, char
}
static DEVICE_ATTR_RO(bus);
static ssize_t bdf_min_show(struct device *dev, struct device_attribute *attr, char *buf)
{
struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(dev_get_drvdata(dev));
return sysfs_emit(buf, "%#04x\n", pcie_pmu->bdf_min);
}
static DEVICE_ATTR_RO(bdf_min);
static ssize_t bdf_max_show(struct device *dev, struct device_attribute *attr, char *buf)
{
struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(dev_get_drvdata(dev));
return sysfs_emit(buf, "%#04x\n", pcie_pmu->bdf_max);
}
static DEVICE_ATTR_RO(bdf_max);
static struct hisi_pcie_reg_pair
hisi_pcie_parse_reg_value(struct hisi_pcie_pmu *pcie_pmu, u32 reg_off)
{
@ -208,7 +224,7 @@ static void hisi_pcie_pmu_writeq(struct hisi_pcie_pmu *pcie_pmu, u32 reg_offset,
static u64 hisi_pcie_pmu_get_event_ctrl_val(struct perf_event *event)
{
u64 port, trig_len, thr_len, len_mode;
u64 reg = HISI_PCIE_INIT_SET;
u64 reg = 0;
/* Config HISI_PCIE_EVENT_CTRL according to event. */
reg |= FIELD_PREP(HISI_PCIE_EVENT_M, hisi_pcie_get_real_event(event));
@ -452,10 +468,24 @@ static void hisi_pcie_pmu_set_period(struct perf_event *event)
struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(event->pmu);
struct hw_perf_event *hwc = &event->hw;
int idx = hwc->idx;
u64 orig_cnt, cnt;
orig_cnt = hisi_pcie_pmu_read_counter(event);
local64_set(&hwc->prev_count, HISI_PCIE_INIT_VAL);
hisi_pcie_pmu_writeq(pcie_pmu, HISI_PCIE_CNT, idx, HISI_PCIE_INIT_VAL);
hisi_pcie_pmu_writeq(pcie_pmu, HISI_PCIE_EXT_CNT, idx, HISI_PCIE_INIT_VAL);
/*
* The counter maybe unwritable if the target event is unsupported.
* Check this by comparing the counts after setting the period. If
* the counts stay unchanged after setting the period then update
* the hwc->prev_count correctly. Otherwise the final counts user
* get maybe totally wrong.
*/
cnt = hisi_pcie_pmu_read_counter(event);
if (orig_cnt == cnt)
local64_set(&hwc->prev_count, cnt);
}
static void hisi_pcie_pmu_enable_counter(struct hisi_pcie_pmu *pcie_pmu, struct hw_perf_event *hwc)
@ -749,6 +779,8 @@ static const struct attribute_group hisi_pcie_pmu_format_group = {
static struct attribute *hisi_pcie_pmu_bus_attrs[] = {
&dev_attr_bus.attr,
&dev_attr_bdf_max.attr,
&dev_attr_bdf_min.attr,
NULL
};

View File

@ -9,6 +9,8 @@ config TSM_REPORTS
source "drivers/virt/coco/efi_secret/Kconfig"
source "drivers/virt/coco/pkvm-guest/Kconfig"
source "drivers/virt/coco/sev-guest/Kconfig"
source "drivers/virt/coco/tdx-guest/Kconfig"

View File

@ -4,5 +4,6 @@
#
obj-$(CONFIG_TSM_REPORTS) += tsm.o
obj-$(CONFIG_EFI_SECRET) += efi_secret/
obj-$(CONFIG_ARM_PKVM_GUEST) += pkvm-guest/
obj-$(CONFIG_SEV_GUEST) += sev-guest/
obj-$(CONFIG_INTEL_TDX_GUEST) += tdx-guest/

View File

@ -0,0 +1,10 @@
config ARM_PKVM_GUEST
bool "Arm pKVM protected guest driver"
depends on ARM64
help
Protected guests running under the pKVM hypervisor on arm64
are isolated from the host and must issue hypercalls to enable
interaction with virtual devices. This driver implements
support for probing and issuing these hypercalls.
If unsure, say 'N'.

View File

@ -0,0 +1,2 @@
# SPDX-License-Identifier: GPL-2.0-only
obj-$(CONFIG_ARM_PKVM_GUEST) += arm-pkvm-guest.o

View File

@ -0,0 +1,127 @@
// SPDX-License-Identifier: GPL-2.0-only
/*
* Support for the hypercall interface exposed to protected guests by
* pKVM.
*
* Author: Will Deacon <will@kernel.org>
* Copyright (C) 2024 Google LLC
*/
#include <linux/arm-smccc.h>
#include <linux/array_size.h>
#include <linux/io.h>
#include <linux/mem_encrypt.h>
#include <linux/mm.h>
#include <linux/pgtable.h>
#include <asm/hypervisor.h>
static size_t pkvm_granule;
static int arm_smccc_do_one_page(u32 func_id, phys_addr_t phys)
{
phys_addr_t end = phys + PAGE_SIZE;
while (phys < end) {
struct arm_smccc_res res;
arm_smccc_1_1_invoke(func_id, phys, 0, 0, &res);
if (res.a0 != SMCCC_RET_SUCCESS)
return -EPERM;
phys += pkvm_granule;
}
return 0;
}
static int __set_memory_range(u32 func_id, unsigned long start, int numpages)
{
void *addr = (void *)start, *end = addr + numpages * PAGE_SIZE;
while (addr < end) {
int err;
err = arm_smccc_do_one_page(func_id, virt_to_phys(addr));
if (err)
return err;
addr += PAGE_SIZE;
}
return 0;
}
static int pkvm_set_memory_encrypted(unsigned long addr, int numpages)
{
return __set_memory_range(ARM_SMCCC_VENDOR_HYP_KVM_MEM_UNSHARE_FUNC_ID,
addr, numpages);
}
static int pkvm_set_memory_decrypted(unsigned long addr, int numpages)
{
return __set_memory_range(ARM_SMCCC_VENDOR_HYP_KVM_MEM_SHARE_FUNC_ID,
addr, numpages);
}
static const struct arm64_mem_crypt_ops pkvm_crypt_ops = {
.encrypt = pkvm_set_memory_encrypted,
.decrypt = pkvm_set_memory_decrypted,
};
static int mmio_guard_ioremap_hook(phys_addr_t phys, size_t size,
pgprot_t *prot)
{
phys_addr_t end;
pteval_t protval = pgprot_val(*prot);
/*
* We only expect MMIO emulation for regions mapped with device
* attributes.
*/
if (protval != PROT_DEVICE_nGnRE && protval != PROT_DEVICE_nGnRnE)
return 0;
phys = PAGE_ALIGN_DOWN(phys);
end = phys + PAGE_ALIGN(size);
while (phys < end) {
const int func_id = ARM_SMCCC_VENDOR_HYP_KVM_MMIO_GUARD_FUNC_ID;
int err;
err = arm_smccc_do_one_page(func_id, phys);
if (err)
return err;
phys += PAGE_SIZE;
}
return 0;
}
void pkvm_init_hyp_services(void)
{
int i;
struct arm_smccc_res res;
const u32 funcs[] = {
ARM_SMCCC_KVM_FUNC_HYP_MEMINFO,
ARM_SMCCC_KVM_FUNC_MEM_SHARE,
ARM_SMCCC_KVM_FUNC_MEM_UNSHARE,
};
for (i = 0; i < ARRAY_SIZE(funcs); ++i) {
if (!kvm_arm_hyp_service_available(funcs[i]))
return;
}
arm_smccc_1_1_invoke(ARM_SMCCC_VENDOR_HYP_KVM_HYP_MEMINFO_FUNC_ID,
0, 0, 0, &res);
if (res.a0 > PAGE_SIZE) /* Includes error codes */
return;
pkvm_granule = res.a0;
arm64_mem_crypt_ops_register(&pkvm_crypt_ops);
if (kvm_arm_hyp_service_available(ARM_SMCCC_KVM_FUNC_MMIO_GUARD))
arm64_ioremap_prot_hook_register(&mmio_guard_ioremap_hook);
}

View File

@ -976,7 +976,9 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma)
[ilog2(VM_PKEY_BIT0)] = "",
[ilog2(VM_PKEY_BIT1)] = "",
[ilog2(VM_PKEY_BIT2)] = "",
#if VM_PKEY_BIT3
[ilog2(VM_PKEY_BIT3)] = "",
#endif
#if VM_PKEY_BIT4
[ilog2(VM_PKEY_BIT4)] = "",
#endif

View File

@ -10,7 +10,7 @@
#include <linux/perf_event.h>
#include <linux/perf/arm_pmuv3.h>
#define ARMV8_PMU_CYCLE_IDX (ARMV8_PMU_MAX_COUNTERS - 1)
#define KVM_ARMV8_PMU_MAX_COUNTERS 32
#if IS_ENABLED(CONFIG_HW_PERF_EVENTS) && IS_ENABLED(CONFIG_KVM)
struct kvm_pmc {
@ -19,14 +19,14 @@ struct kvm_pmc {
};
struct kvm_pmu_events {
u32 events_host;
u32 events_guest;
u64 events_host;
u64 events_guest;
};
struct kvm_pmu {
struct irq_work overflow_work;
struct kvm_pmu_events events;
struct kvm_pmc pmc[ARMV8_PMU_MAX_COUNTERS];
struct kvm_pmc pmc[KVM_ARMV8_PMU_MAX_COUNTERS];
int irq_num;
bool created;
bool irq_level;

View File

@ -115,6 +115,70 @@
/* KVM "vendor specific" services */
#define ARM_SMCCC_KVM_FUNC_FEATURES 0
#define ARM_SMCCC_KVM_FUNC_PTP 1
/* Start of pKVM hypercall range */
#define ARM_SMCCC_KVM_FUNC_HYP_MEMINFO 2
#define ARM_SMCCC_KVM_FUNC_MEM_SHARE 3
#define ARM_SMCCC_KVM_FUNC_MEM_UNSHARE 4
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_5 5
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_6 6
#define ARM_SMCCC_KVM_FUNC_MMIO_GUARD 7
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_8 8
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_9 9
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_10 10
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_11 11
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_12 12
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_13 13
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_14 14
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_15 15
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_16 16
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_17 17
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_18 18
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_19 19
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_20 20
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_21 21
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_22 22
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_23 23
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_24 24
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_25 25
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_26 26
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_27 27
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_28 28
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_29 29
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_30 30
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_31 31
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_32 32
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_33 33
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_34 34
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_35 35
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_36 36
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_37 37
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_38 38
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_39 39
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_40 40
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_41 41
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_42 42
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_43 43
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_44 44
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_45 45
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_46 46
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_47 47
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_48 48
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_49 49
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_50 50
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_51 51
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_52 52
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_53 53
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_54 54
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_55 55
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_56 56
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_57 57
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_58 58
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_59 59
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_60 60
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_61 61
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_62 62
#define ARM_SMCCC_KVM_FUNC_PKVM_RESV_63 63
/* End of pKVM hypercall range */
#define ARM_SMCCC_KVM_FUNC_FEATURES_2 127
#define ARM_SMCCC_KVM_NUM_FUNCS 128
@ -137,6 +201,30 @@
ARM_SMCCC_OWNER_VENDOR_HYP, \
ARM_SMCCC_KVM_FUNC_PTP)
#define ARM_SMCCC_VENDOR_HYP_KVM_HYP_MEMINFO_FUNC_ID \
ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
ARM_SMCCC_SMC_64, \
ARM_SMCCC_OWNER_VENDOR_HYP, \
ARM_SMCCC_KVM_FUNC_HYP_MEMINFO)
#define ARM_SMCCC_VENDOR_HYP_KVM_MEM_SHARE_FUNC_ID \
ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
ARM_SMCCC_SMC_64, \
ARM_SMCCC_OWNER_VENDOR_HYP, \
ARM_SMCCC_KVM_FUNC_MEM_SHARE)
#define ARM_SMCCC_VENDOR_HYP_KVM_MEM_UNSHARE_FUNC_ID \
ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
ARM_SMCCC_SMC_64, \
ARM_SMCCC_OWNER_VENDOR_HYP, \
ARM_SMCCC_KVM_FUNC_MEM_UNSHARE)
#define ARM_SMCCC_VENDOR_HYP_KVM_MMIO_GUARD_FUNC_ID \
ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
ARM_SMCCC_SMC_64, \
ARM_SMCCC_OWNER_VENDOR_HYP, \
ARM_SMCCC_KVM_FUNC_MMIO_GUARD)
/* ptp_kvm counter type ID */
#define KVM_PTP_VIRT_COUNTER 0
#define KVM_PTP_PHYS_COUNTER 1

View File

@ -335,11 +335,15 @@ extern unsigned int kobjsize(const void *objp);
#ifdef CONFIG_ARCH_HAS_PKEYS
# define VM_PKEY_SHIFT VM_HIGH_ARCH_BIT_0
# define VM_PKEY_BIT0 VM_HIGH_ARCH_0 /* A protection key is a 4-bit value */
# define VM_PKEY_BIT1 VM_HIGH_ARCH_1 /* on x86 and 5-bit value on ppc64 */
# define VM_PKEY_BIT0 VM_HIGH_ARCH_0
# define VM_PKEY_BIT1 VM_HIGH_ARCH_1
# define VM_PKEY_BIT2 VM_HIGH_ARCH_2
#if CONFIG_ARCH_PKEY_BITS > 3
# define VM_PKEY_BIT3 VM_HIGH_ARCH_3
#ifdef CONFIG_PPC
#else
# define VM_PKEY_BIT3 0
#endif
#if CONFIG_ARCH_PKEY_BITS > 4
# define VM_PKEY_BIT4 VM_HIGH_ARCH_4
#else
# define VM_PKEY_BIT4 0
@ -378,8 +382,8 @@ extern unsigned int kobjsize(const void *objp);
#endif
#if defined(CONFIG_ARM64_MTE)
# define VM_MTE VM_HIGH_ARCH_0 /* Use Tagged memory for access control */
# define VM_MTE_ALLOWED VM_HIGH_ARCH_1 /* Tagged memory permitted */
# define VM_MTE VM_HIGH_ARCH_4 /* Use Tagged memory for access control */
# define VM_MTE_ALLOWED VM_HIGH_ARCH_5 /* Tagged memory permitted */
#else
# define VM_MTE VM_NONE
# define VM_MTE_ALLOWED VM_NONE

View File

@ -17,10 +17,14 @@
#ifdef CONFIG_ARM_PMU
/*
* The ARMv7 CPU PMU supports up to 32 event counters.
* The Armv7 and Armv8.8 or less CPU PMU supports up to 32 event counters.
* The Armv8.9/9.4 CPU PMU supports up to 33 event counters.
*/
#ifdef CONFIG_ARM
#define ARMPMU_MAX_HWEVENTS 32
#else
#define ARMPMU_MAX_HWEVENTS 33
#endif
/*
* ARM PMU hw_event flags
*/
@ -96,7 +100,7 @@ struct arm_pmu {
void (*stop)(struct arm_pmu *);
void (*reset)(void *);
int (*map_event)(struct perf_event *event);
int num_events;
DECLARE_BITMAP(cntr_mask, ARMPMU_MAX_HWEVENTS);
bool secure_access; /* 32-bit ARM only */
#define ARMV8_PMUV3_MAX_COMMON_EVENTS 0x40
DECLARE_BITMAP(pmceid_bitmap, ARMV8_PMUV3_MAX_COMMON_EVENTS);

View File

@ -6,8 +6,9 @@
#ifndef __PERF_ARM_PMUV3_H
#define __PERF_ARM_PMUV3_H
#define ARMV8_PMU_MAX_COUNTERS 32
#define ARMV8_PMU_COUNTER_MASK (ARMV8_PMU_MAX_COUNTERS - 1)
#define ARMV8_PMU_MAX_GENERAL_COUNTERS 31
#define ARMV8_PMU_CYCLE_IDX 31
#define ARMV8_PMU_INSTR_IDX 32 /* Not accessible from AArch32 */
/*
* Common architectural and microarchitectural event numbers.
@ -227,8 +228,10 @@
*/
#define ARMV8_PMU_OVSR_P GENMASK(30, 0)
#define ARMV8_PMU_OVSR_C BIT(31)
#define ARMV8_PMU_OVSR_F BIT_ULL(32) /* arm64 only */
/* Mask for writable bits is both P and C fields */
#define ARMV8_PMU_OVERFLOWED_MASK (ARMV8_PMU_OVSR_P | ARMV8_PMU_OVSR_C)
#define ARMV8_PMU_OVERFLOWED_MASK (ARMV8_PMU_OVSR_P | ARMV8_PMU_OVSR_C | \
ARMV8_PMU_OVSR_F)
/*
* PMXEVTYPER: Event selection reg

View File

@ -1602,13 +1602,7 @@ static inline int perf_is_paranoid(void)
return sysctl_perf_event_paranoid > -1;
}
static inline int perf_allow_kernel(struct perf_event_attr *attr)
{
if (sysctl_perf_event_paranoid > 1 && !perfmon_capable())
return -EACCES;
return security_perf_event_open(attr, PERF_SECURITY_KERNEL);
}
int perf_allow_kernel(struct perf_event_attr *attr);
static inline int perf_allow_cpu(struct perf_event_attr *attr)
{

Some files were not shown because too many files have changed in this diff Show More