PCIe PMU Root Complex Integrated End Point(RCiEP) device is supported
to sample bandwidth, latency, buffer occupation etc.
Each PMU RCiEP device monitors multiple Root Ports, and each RCiEP is
registered as a PMU in /sys/bus/event_source/devices, so users can
select target PMU, and use filter to do further sets.
Filtering options contains:
event - select the event.
port - select target Root Ports. Information of Root Ports are
shown under sysfs.
bdf - select requester_id of target EP device.
trig_len - set trigger condition for starting event statistics.
trig_mode - set trigger mode. 0 means starting to statistic when bigger
than trigger condition, and 1 means smaller.
thr_len - set threshold for statistics.
thr_mode - set threshold mode. 0 means count when bigger than threshold,
and 1 means smaller.
Acked-by: Krzysztof Wilczyński <kw@linux.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Reviewed-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/20211202080633.2919-3-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
This driver adds support for Last-level cache tag-and-data unit
(LLC-TAD) PMU that is featured in some of the Marvell's CN10K
infrastructure silicons.
The LLC is divided into 2N slices distributed across N Mesh tiles
in a single-socket configuration. The driver always configures the
same counter for all of the TADs. The user would end up effectively
reserving one of eight counters in every TAD to look across all TADs.
The occurrences of events are aggregated and presented to the user
at the end of an application run. The driver does not provide a way
for the user to partition TADs so that different TADs are used for
different applications.
The event counters are zeroed to start event counting to avoid any
rollover issues. TAD perf counters are 64-bit, so it's not currently
possible to overflow event counters at current mesh and core
frequencies.
To measure tad pmu events use perf tool stat command. For instance:
perf stat -e tad_dat_msh_in_dss,tad_req_msh_out_any <application>
perf stat -e tad_alloc_any,tad_hit_any,tad_tag_rd <application>
Signed-off-by: Bhaskara Budiredla <bbudiredla@marvell.com>
Link: https://lore.kernel.org/r/20211115043506.6679-2-bbudiredla@marvell.com
Signed-off-by: Will Deacon <will@kernel.org>
The SMMU_PMCG_IIDR register was not present in older revisions of the
Arm SMMUv3 spec. On Arm Ltd. implementations, the IIDR value consists of
fields from several PIDR registers, allowing us to present a
standardized identifier to userspace.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Link: https://lore.kernel.org/r/20211117144844.241072-4-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
Add device-tree support to the SMMUv3 PMCG driver.
Signed-off-by: Jay Chen <jkchen@linux.alibaba.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20211117144844.241072-3-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
In general, detailed performance analysis will require knoweldge of the
the SoC beyond the CMN itself - e.g. which actual CPUs/peripherals/etc.
are connected to each node. However for certain development and bringup
tasks it can be useful to have a quick overview of the CMN internal
topology to hand too. Add a debugfs file to map this out.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/159fd4d7e19fb3c8801a8cb64ee73ec50f55903c.1638530442.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
The second generation of CMN IPs add new node types and significantly
expand the configuration space with options for extra device ports on
edge XPs, either plumbed into the regular DTM or with extra dedicated
DTMs to monitor them, plus larger (and smaller) mesh sizes. Add basic
support for pulling this new information out of the hardware, piping
it around as necessary, and handling (most of) the new choices.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/e58b495bcc7deec3882be4bac910ed0bf6979674.1638530442.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
In preparation for supporting newer CMN products, let's introduce a
means to differentiate the features and events which are specific to a
particular IP from those which remain common to the whole family. The
newer designs have also smoothed off some of the rough edges in terms
of discoverability, so separate out the parts of the flow which have
effectively now become CMN-600 quirks.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/9f6368cdca4c821d801138939508a5bba54ccabb.1638530442.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
With the value of CMN_MAX_DTMS increasing significantly, our validation
data structure is set to get quite big. Technically we could pack it at
least twice as densely, since we only need around 19 bits of information
per DTM, but that makes the code even more mind-bogglingly impenetrable,
and even half of "quite big" may still be uncomfortably large for a
stack frame (~1KB). Just move it to an off-stack allocation instead.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/0cabff2e5839ddc0979e757c55515966f65359e4.1638530442.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
In cases where we do know which DTC domain a node belongs to, we can
skip initialising or reading the global count in DTCs where we know
it won't change. The machinery to achieve that is mostly in place
already, so finish hooking it up by converting the vestigial domain
tracking to propagate suitable bitmaps all the way through to events.
Note that this does not allow allocating such an unused counter to a
different event on that DTC, because that is a flippin' nightmare.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/51d930fd945ef51c81f5889ccca055c302b0a1d0.1638530442.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
When multiple nodes of the same type are connected to the same XP
(particularly in CAL configurations), it seems that they are likely
to be consecutive in logical ID. Therefore, we're likely to gain a
small benefit from an easy tweak to optimise out consecutive reads
of the same set of DTM counters for an aggregated event.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/7777d77c2df17693cd3dabb6e268906e15238d82.1638530442.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Untangle DTMs from XPs into a dedicated abstraction. This helps make
things a little more obvious and robust, but primarily paves the way
for further development where new IPs can grow extra DTMs per XP.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/9cca18b1b98f482df7f1aaf3d3213e7f39500423.1638530442.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Refactor the places where we scan through the set of nodes to switch
from explicit array indexing to pointer-based iteration. This leads to
slightly simpler object code, but also makes the source less dense and
more pleasant for further development. It also unearths an almost-bug
in arm_cmn_event_init() where we've been depending on the "array index"
of NULL relative to cmn->dns being a sufficiently large number, yuck.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/ee0c9eda9a643f46001ac43aadf3f0b1fd5660dd.1638530442.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Add a bit more abstraction for the places where we decompose node IDs.
This will help keep things nice and manageable when we come to add yet
more variables which affect the node ID format. Also use the opportunity
to move the rest of the low-level node management helpers back up to the
logical place they were meant to be - how they ended up buried right in
the middle of the event-related definitions is somewhat of a mystery...
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/a2242a8c3c96056c13a04ae87bf2047e5e64d2d9.1638530442.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Although CMN is currently (and overwhelmingly likely to remain) deployed
in arm64-only (modulo userspace) systems, the 64-bit "dependency" for
compile-testing was just laziness due to heavy reliance on readq/writeq
accessors. Since we only need one extra include for robustness in that
regard, let's pull that in, widen the compile-test coverage, and fix up
the smattering of type laziness that that brings to light.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/baee9ee0d0bdad8aaeb70f5a4b98d8fd4b1f5786.1638530442.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
On a system with multiple CMN meshes, ideally we'd want to access each
PMU from within its own mesh, rather than with a long CML round-trip,
wherever feasible. Since such a system is likely to be presented as
multiple NUMA nodes, let's also hope a proximity domain is specified
for each CMN programming interface, and use that to guide our choice
of IRQ affinity to favour a node-local CPU where possible.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/32438b0d016e0649d882d47d30ac2000484287b9.1638530442.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Attempting to migrate the PMU context after we've unregistered the PMU
device, or especially if we never successfully registered it in the
first place, is a woefully bad idea. It's also fundamentally pointless
anyway. Make sure to unregister an instance from the hotplug handler
*without* invoking the teardown callback.
Fixes: 0ba64770a2 ("perf: Add Arm CMN-600 PMU driver")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/2c221d745544774e4b07583b65b5d4d94f7e0fe4.1638530442.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
- Update the ACPICA code in the kernel to upstream revision 20210930
including the following changes:
* Fix system-wide resume issue caused by evaluating control
methods too early in the resume path (Rafael Wysocki).
* Add support for Windows 2020 _OSI string (Mario Limonciello).
* Add Generic Port Affinity type for SRAT (Alison Schofield).
* Add disassembly support for the NHLT ACPI table (Bob Moore).
- Avoid flushing caches before entering C3 type of idle states on
AMD processors (Deepak Sharma).
- Avoid enumerating CPUs that are not present and not online-capable
according to the platform firmware (Mario Limonciello).
- Add DMI-based mechanism to quirk IRQ overrides and use it for two
platforms (Hui Wang).
- Change the configuration of unused ACPI device objects to reflect
the D3cold power state after enumerating devices (Rafael Wysocki).
- Update MAINTAINERS information regarding ACPI (Rafael Wysocki).
- Fix typo in ACPI Kconfig (Masanari Iid).
- Use sysfs_emit() instead of snprintf() in some places (Qing Wang).
- Make the association of ACPI device objects with PCI devices more
straightforward and simplify the code doing that for all devices
in general (Rafael Wysocki).
- Use acpi_device_adr() in acpi_find_child_device() instead of
evaluating _ADR (Rafael Wysocki).
- Drop duplicate device IDs from PNP device IDs list (Krzysztof
Kozlowski).
- Allow acpi_idle_play_dead() to use C3 on AMD processors (Richard
Gong).
- Use ACPI_COMPANION() to simplify code in some drivers (Rafael
Wysocki).
- Check the states of all ACPI power resources during initialization
to avoid dealing with power resources in unknown states (Rafael
Wysocki).
- Fix ACPI power resource issues related to sharing wakeup power
resources (Rafael Wysocki).
- Avoid registering redundant suspend_ops (Rafael Wysocki).
- Report battery charging state as "full" if it appears to be over
the design capacity (André Almeida).
- Quirk GK45 mini PC to skip reading _PSR in the AC driver (Stefan
Schaeckeler).
- Mark apei_hest_parse() static (Christoph Hellwig).
- Relax platform response timeout to 1 second after instructing it
to inject an error (Shuai Xue).
- Make the PRM code handle memory allocation and remapping failures
more gracefully and drop some unnecessary blank lines from that
code (Aubrey Li).
- Fix spelling mistake in the ACPI documentation (Colin Ian King).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmGBkbESHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxCf8QALHX0PjsKC9cFs+hHVoI+T9VdVa/2pY4
8vMjxYqkano4ti2xu1Fo/gaECnpgtVH+mV6iXPSAf80Wjgxnrx8K7y4qyUR4POBR
l/5jA+XuFOyBRCSIHjJrfGof3/bTCt63LuyOc1WxiFzhkIuKeiI7Bmw6pvxrcXZw
Ehe0VZK8eZGHR2py7RlZ68IJ32uAdlVKjlXyOFWfm9pjKtYit49WV/rjY0ldHBou
vq8JP4EEmCKhyYAuzRprkPR2itm5fYI3lceG3UaIq5uWCj0IlaWcuv6c7K6N6QA1
bHjCUSWPMCoHKQbZoX5MEEAjIJKdO81Rj3PGjNL1uYSz806ZmNcxav1twebm1lEa
h6GbcFDaY9USMEl1D8s7h5Lm9cM33cen4IW9Ms5nMjugkdAr4KyLOA8CgpTbb1Ih
vBrlB4CzxEqgpa+YAvJd7/r3FckauGthkciEjOCFpUxeZpKW5kweLIwFpPHKaqot
KXtE+wwacxmkwYiJwEg141Bp4oFfy6iBcDn8xOapjq89DFEAn03VuvAO33Q+2KPL
A2RtOaV2tSxxNDUss9wENI4lF5sc2JvBECXc1MZCBBNQqxTrwT562UwMhgUXD1Vy
8LvdrYAOZzS6QizeHQC27FQ/uwQ5/uEqaYfjMv+JQjaZZU7YeG3vekADqiEstfWF
UUj72AE8a/Cs
=Ni9a
-----END PGP SIGNATURE-----
Merge tag 'acpi-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI updates from Rafael Wysocki:
"These update the ACPICA code in the kernel to the most recent upstream
revision, address some issues related to the ACPI power resources
management, simplify the enumeration of PCI devices having ACPI
companions, add new quirks, fix assorted problems, update the
ACPI-related information in maintainers and clean up code in several
places.
Specifics:
- Update the ACPICA code in the kernel to upstream revision 20210930
including the following changes:
- Fix system-wide resume issue caused by evaluating control
methods too early in the resume path (Rafael Wysocki).
- Add support for Windows 2020 _OSI string (Mario Limonciello).
- Add Generic Port Affinity type for SRAT (Alison Schofield).
- Add disassembly support for the NHLT ACPI table (Bob Moore).
- Avoid flushing caches before entering C3 type of idle states on AMD
processors (Deepak Sharma).
- Avoid enumerating CPUs that are not present and not online-capable
according to the platform firmware (Mario Limonciello).
- Add DMI-based mechanism to quirk IRQ overrides and use it for two
platforms (Hui Wang).
- Change the configuration of unused ACPI device objects to reflect
the D3cold power state after enumerating devices (Rafael Wysocki).
- Update MAINTAINERS information regarding ACPI (Rafael Wysocki).
- Fix typo in ACPI Kconfig (Masanari Iid).
- Use sysfs_emit() instead of snprintf() in some places (Qing Wang).
- Make the association of ACPI device objects with PCI devices more
straightforward and simplify the code doing that for all devices in
general (Rafael Wysocki).
- Use acpi_device_adr() in acpi_find_child_device() instead of
evaluating _ADR (Rafael Wysocki).
- Drop duplicate device IDs from PNP device IDs list (Krzysztof
Kozlowski).
- Allow acpi_idle_play_dead() to use C3 on AMD processors (Richard
Gong).
- Use ACPI_COMPANION() to simplify code in some drivers (Rafael
Wysocki).
- Check the states of all ACPI power resources during initialization
to avoid dealing with power resources in unknown states (Rafael
Wysocki).
- Fix ACPI power resource issues related to sharing wakeup power
resources (Rafael Wysocki).
- Avoid registering redundant suspend_ops (Rafael Wysocki).
- Report battery charging state as "full" if it appears to be over
the design capacity (André Almeida).
- Quirk GK45 mini PC to skip reading _PSR in the AC driver (Stefan
Schaeckeler).
- Mark apei_hest_parse() static (Christoph Hellwig).
- Relax platform response timeout to 1 second after instructing it to
inject an error (Shuai Xue).
- Make the PRM code handle memory allocation and remapping failures
more gracefully and drop some unnecessary blank lines from that
code (Aubrey Li).
- Fix spelling mistake in the ACPI documentation (Colin Ian King)"
* tag 'acpi-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (36 commits)
ACPI: glue: Use acpi_device_adr() in acpi_find_child_device()
perf: qcom_l2_pmu: ACPI: Use ACPI_COMPANION() directly
ACPI: APEI: mark apei_hest_parse() static
ACPI: APEI: EINJ: Relax platform response timeout to 1 second
gpio-amdpt: ACPI: Use the ACPI_COMPANION() macro directly
nouveau: ACPI: Use the ACPI_COMPANION() macro directly
ACPI: resources: Add one more Medion model in IRQ override quirk
ACPI: AC: Quirk GK45 to skip reading _PSR
ACPI: PM: sleep: Do not set suspend_ops unnecessarily
ACPI: PRM: Handle memory allocation and memory remap failure
ACPI: PRM: Remove unnecessary blank lines
ACPI: PM: Turn off wakeup power resources on _DSW/_PSW errors
ACPI: PM: Fix sharing of wakeup power resources
ACPI: PM: Turn off unused wakeup power resources
ACPI: PM: Check states of power resources during initialization
ACPI: replace snprintf() in "show" functions with sysfs_emit()
ACPI: LPSS: Use ACPI_COMPANION() directly
ACPI: scan: Release PM resources blocked by unused objects
ACPI: battery: Accept charges over the design capacity as full
ACPICA: Update version to 20210930
...
- Support for the Arm8.6 timer extensions, including a self-synchronising
view of the system registers to elide some expensive ISB instructions.
- Exception table cleanup and rework so that the fixup handlers appear
correctly in backtraces.
- A handful of miscellaneous changes, the main one being selection of
CONFIG_HAVE_POSIX_CPU_TIMERS_TASK_WORK.
- More mm and pgtable cleanups.
- KASAN support for "asymmetric" MTE, where tag faults are reported
synchronously for loads (via an exception) and asynchronously for
stores (via a register).
- Support for leaving the MMU enabled during kexec relocation, which
significantly speeds up the operation.
- Minor improvements to our perf PMU drivers.
- Improvements to the compat vDSO build system, particularly when
building with LLVM=1.
- Preparatory work for handling some Coresight TRBE tracing errata.
- Cleanup and refactoring of the SVE code to pave the way for SME
support in future.
- Ensure SCS pages are unpoisoned immediately prior to freeing them
when KASAN is enabled for the vmalloc area.
- Try moving to the generic pfn_valid() implementation again now that
the DMA mapping issue from last time has been resolved.
- Numerous improvements and additions to our FPSIMD and SVE selftests.
-----BEGIN PGP SIGNATURE-----
iQFDBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmF74ZYQHHdpbGxAa2Vy
bmVsLm9yZwAKCRC3rHDchMFjNI/eB/UZYAtmNi6xC5StPaETyMLeZph9BV/IqIFq
N71ds7MFzlX/agR6MwLbH2tBHezBtlQ90O732Jjz8zAec2cHd+7sx/w82JesX7PB
IuOfqP78rvtU4ZkKe1Rcd96QtYvbtNAqcRhIo95OzfV9xwuzkvdXI+ZTYhtCfCuZ
GozCqQoJtnNDayMtfzbDSXyJLNJc/qnIcUQhrt3vg12zbF3BcHxnmp0nBcHCqZEo
lDJYufju7p87kCzaFYda2WhlI3t+NThqKOiZ332wQfqzNcr+rw1Y4jWbnCfrdLtI
JfHT9yiuHDmFSYaJrk7NU8kftW31NV70bbhD7rZ+DQCVndl0lRc=
=3R3j
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
"There's the usual summary below, but the highlights are support for
the Armv8.6 timer extensions, KASAN support for asymmetric MTE, the
ability to kexec() with the MMU enabled and a second attempt at
switching to the generic pfn_valid() implementation.
Summary:
- Support for the Arm8.6 timer extensions, including a
self-synchronising view of the system registers to elide some
expensive ISB instructions.
- Exception table cleanup and rework so that the fixup handlers
appear correctly in backtraces.
- A handful of miscellaneous changes, the main one being selection of
CONFIG_HAVE_POSIX_CPU_TIMERS_TASK_WORK.
- More mm and pgtable cleanups.
- KASAN support for "asymmetric" MTE, where tag faults are reported
synchronously for loads (via an exception) and asynchronously for
stores (via a register).
- Support for leaving the MMU enabled during kexec relocation, which
significantly speeds up the operation.
- Minor improvements to our perf PMU drivers.
- Improvements to the compat vDSO build system, particularly when
building with LLVM=1.
- Preparatory work for handling some Coresight TRBE tracing errata.
- Cleanup and refactoring of the SVE code to pave the way for SME
support in future.
- Ensure SCS pages are unpoisoned immediately prior to freeing them
when KASAN is enabled for the vmalloc area.
- Try moving to the generic pfn_valid() implementation again now that
the DMA mapping issue from last time has been resolved.
- Numerous improvements and additions to our FPSIMD and SVE
selftests"
[ armv8.6 timer updates were in a shared branch and already came in
through -tip in the timer pull - Linus ]
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (85 commits)
arm64: Select POSIX_CPU_TIMERS_TASK_WORK
arm64: Document boot requirements for FEAT_SME_FA64
arm64/sve: Fix warnings when SVE is disabled
arm64/sve: Add stub for sve_max_virtualisable_vl()
arm64: errata: Add detection for TRBE write to out-of-range
arm64: errata: Add workaround for TSB flush failures
arm64: errata: Add detection for TRBE overwrite in FILL mode
arm64: Add Neoverse-N2, Cortex-A710 CPU part definition
selftests: arm64: Factor out utility functions for assembly FP tests
arm64: vmlinux.lds.S: remove `.fixup` section
arm64: extable: add load_unaligned_zeropad() handler
arm64: extable: add a dedicated uaccess handler
arm64: extable: add `type` and `data` fields
arm64: extable: use `ex` for `exception_table_entry`
arm64: extable: make fixup_exception() return bool
arm64: extable: consolidate definitions
arm64: gpr-num: support W registers
arm64: factor out GPR numbering helpers
arm64: kvm: use kvm_exception_table_entry
arm64: lib: __arch_copy_to_user(): fold fixups into body
...
The ACPI_HANDLE() macro is a wrapper arond the ACPI_COMPANION()
macro and the ACPI handle produced by the former comes from the
ACPI device object produced by the latter, so it is way more
straightforward to evaluate the latter directly instead of passing
the handle produced by the former to acpi_bus_get_device().
Modify l2_cache_pmu_probe_cluster() accordingly (no intentional
functional impact).
While at it, rename the ACPI device pointer to adev for more
clarity.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Improve build test cover by allowing some drivers to build under
COMPILE_TEST where possible.
Some notes:
- Mostly a dependency on CONFIG_ACPI is not really required for only
building (but left untouched), but is required for TX2 which uses ACPI
functions which have no stubs
- XGENE required 64b dependency as it relies on some unsigned long perf
struct fields being 64b
- I don't see why TX2 requires NUMA to build, but left untouched
- Added an explicit dependency on GENERIC_MSI_IRQ_DOMAIN for
ARM_SMMU_V3_PMU, which is required for platform MSI functions
Signed-off-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1633085326-156653-3-git-send-email-john.garry@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
A LSL of 32 requires > 32b value to hold the result. However in
tx2_uncore_event_update(), 1UL << 32 currently only works as unsigned
long is 64b on a 64b system.
If we want to compile test for a 32b system, we need unsigned long long,
whose min size is 64b.
Signed-off-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1633085326-156653-2-git-send-email-john.garry@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
The PA PMU counter offset was correct in [1] and the driver has
already been verified. We want to keep the register offset using
lower case character in later version that is consistent with
the existed driver. Since there was no functional change, we
didn't do more test. However there is typo when modified the PA
PMU counter offset by mistake, so fix this bad mistake.
[1] https://www.spinics.net/lists/arm-kernel/msg865263.html
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/20210928123022.23467-1-zhangshaokun@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
Russell reported that since 5.13, KVM's probing of the PMU has
started to fail on his HW. As it turns out, there is an implicit
ordering dependency between the architectural PMU probing code and
and KVM's own probing. If, due to probe ordering reasons, KVM probes
before the PMU driver, it will fail to detect the PMU and prevent it
from being advertised to guests as well as the VMM.
Obviously, this is one probing too many, and we should be able to
deal with any ordering.
Add a callback from the PMU code into KVM to advertise the registration
of a host CPU PMU, allowing for any probing order.
Fixes: 5421db1be3 ("KVM: arm64: Divorce the perf code from oprofile helpers")
Reported-by: "Russell King (Oracle)" <linux@armlinux.org.uk>
Tested-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/YUYRKVflRtUytzy5@shell.armlinux.org.uk
Cc: stable@vger.kernel.org
ddr_perf_probe() misses to call ida_simple_remove() in an error path.
Jump to cpuhp_state_err to fix it.
Signed-off-by: Jing Xiangfeng <jingxiangfeng@huawei.com>
Reviewed-by: Dong Aisheng <aisheng.dong@nxp.com>
Link: https://lore.kernel.org/r/20210617122614.166823-1-jingxiangfeng@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
When multiple dtcs share the same IRQ number, the irq_friend which
used to refer to dtc object gets calculated incorrect which leads
to invalid pointer.
Fixes: 0ba64770a2 ("perf: Add Arm CMN-600 PMU driver")
Signed-off-by: Tuan Phan <tuanphan@os.amperecomputing.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/1623946129-3290-1-git-send-email-tuanphan@os.amperecomputing.com
Signed-off-by: Will Deacon <will@kernel.org>
Use common macro PMU_EVENT_ATTR_ID to simplify IMX8_DDR_PMU_EVENT_ATTR
Reviewed by Frank Li <Frank .li@nxp.com>
Cc: Frank Li <Frank.li@nxp.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Link: https://lore.kernel.org/r/1623220863-58233-7-git-send-email-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Use common macro PMU_EVENT_ATTR_ID to simplify XGENE_PMU_EVENT_ATTR
Cc: Khuong Dinh <khuong@os.amperecomputing.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Link: https://lore.kernel.org/r/1623220863-58233-6-git-send-email-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Use common macro PMU_EVENT_ATTR_ID to simplify L3CACHE_EVENT_ATTR
Cc: Andy Gross <agross@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Link: https://lore.kernel.org/r/1623220863-58233-5-git-send-email-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Use common macro PMU_EVENT_ATTR_ID to simplify L2CACHE_EVENT_ATTR
Cc: Andy Gross <agross@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Link: https://lore.kernel.org/r/1623220863-58233-4-git-send-email-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
With global filtering, we only allow an event to be scheduled if its
filter settings exactly match those of any existing events, therefore
it is pointless to reapply the filter in that case. Much worse, though,
is that in doing that we trample the event type of counter 0 if it's
already active, and never touch the appropriate PMEVTYPERn so the new
event is likely not counting the right thing either. Don't do that.
CC: stable@vger.kernel.org
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/32c80c0e46237f49ad8da0c9f8864e13c4a803aa.1623153312.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
These are only put in an array of pointers to const attribute_group
structs. Make them const like the other static attribute_group structs
to allow the compiler to put them in read-only memory.
Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Link: https://lore.kernel.org/r/20210605221514.73449-1-rikard.falkeborn@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
There is a error message within devm_ioremap_resource
already, so remove the dev_err call to avoid redundant
error message.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com>
Link: https://lore.kernel.org/r/20210608084816.1046485-1-chenxiaosong2@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
'Data source' is a new function for HHA PMU and config / clear
interface was wrong by mistake. 'HHA_DATSRC_CTRL' register is
mainly used for data source configuration, if we enable bit0
as driver, it will go on count the event and we didn't check
it carefully. So fix the issue and do as the initial purpose.
Fixes: 932f6a99f9 ("drivers/perf: hisi: Add new functions for HHA PMU")
Reported-by: kernel test robot <lkp@intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/1622709291-37996-1-git-send-email-zhangshaokun@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
request_irq() after setting IRQ_NOAUTOEN as below
irq_set_status_flags(irq, IRQ_NOAUTOEN); request_irq(dev, irq...); can
be replaced by request_irq() with IRQF_NO_AUTOEN flag.
this patch is made base on "add IRQF_NO_AUTOEN for request_irq" which
is being merged: https://lore.kernel.org/patchwork/patch/1388765/
Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/1622595642-61678-3-git-send-email-tiantao6@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
request_irq() after setting IRQ_NOAUTOEN as below
irq_set_status_flags(irq, IRQ_NOAUTOEN);
request_irq(dev, irq...);
can be replaced by request_irq() with IRQF_NO_AUTOEN flag.
this patch is made base on "add IRQF_NO_AUTOEN for request_irq" which
is being merged: https://lore.kernel.org/patchwork/patch/1388765/
Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/1622595642-61678-2-git-send-email-tiantao6@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
Use DEVICE_ATTR_RO() helper instead of plain DEVICE_ATTR(),
which makes the code a bit shorter and easier to read.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Link: https://lore.kernel.org/r/20210528061738.23392-1-yuehaibing@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Use DEVICE_ATTR_RO() helper instead of plain DEVICE_ATTR(),
which makes the code a bit shorter and easier to read.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Link: https://lore.kernel.org/r/20210528014940.4184-1-yuehaibing@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Use DEVICE_ATTR_RO() helper instead of plain DEVICE_ATTR(),
which makes the code a bit shorter and easier to read.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Link: https://lore.kernel.org/r/20210528014749.24068-1-yuehaibing@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Use DEVICE_ATTR_RO helper instead of plain DEVICE_ATTR,
which makes the code a bit shorter and easier to read.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Link: https://lore.kernel.org/r/20210528014130.7708-1-yuehaibing@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Fix some coding style issues reported by checkpatch.pl, including
following types:
ERROR: need consistent spacing around '-' (ctx:WxV)
ERROR: space required before the open parenthesis '('
Signed-off-by: Junhao He <hejunhao2@hisilicon.com>
Signed-off-by: Jay Fang <f.fangjian@huawei.com>
Link: https://lore.kernel.org/r/1620736054-58412-5-git-send-email-f.fangjian@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Fix some coding style issues reported by checkpatch.pl, including
following types:
ERROR: spaces required around that '=' (ctx:VxW)
WARNING: Possible unnecessary 'out of memory' message
Signed-off-by: Junhao He <hejunhao2@hisilicon.com>
Signed-off-by: Jay Fang <f.fangjian@huawei.com>
Link: https://lore.kernel.org/r/1620736054-58412-3-git-send-email-f.fangjian@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Fix some coding style issues reported by checkpatch.pl, including
following types:
WARNING: void function return statements are not generally useful
WARNING: Possible unnecessary 'out of memory' message
Signed-off-by: Junhao He <hejunhao2@hisilicon.com>
Signed-off-by: Jay Fang <f.fangjian@huawei.com>
Link: https://lore.kernel.org/r/1620736054-58412-2-git-send-email-f.fangjian@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
There is a error message within devm_ioremap_resource
already, so remove the dev_err call to avoid redundant
error message.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zou Wei <zou_wei@huawei.com>
Link: https://lore.kernel.org/r/1620715364-107460-1-git-send-email-zou_wei@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
These drivers use irq_set_affinity_hint() to set the affinity for the PMU
interrupts, which relies on the undocumented side effect that this function
actually sets the affinity under the hood.
Setting an hint is clearly not a guarantee and for these PMU interrupts an
affinity hint, which is supposed to guide userspace for setting affinity,
is beyond pointless, because the affinity of these interrupts cannot be
modified from user space.
Aside of that the error checks are bogus because the only error which is
returned from irq_set_affinity_hint() is when there is no irq descriptor
for the interrupt number, but not when the affinity set fails. That's on
purpose because the hint can point to an offline CPU.
Replace the mindless abuse with irq_set_affinity().
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20210518093118.813375875@linutronix.de
Signed-off-by: Will Deacon <will@kernel.org>
The driver uses irq_set_affinity_hint() to set the affinity for the PMU
interrupts, which relies on the undocumented side effect that this function
actually sets the affinity under the hood.
Setting an hint is clearly not a guarantee and for these PMU interrupts an
affinity hint, which is supposed to guide userspace for setting affinity,
is beyond pointless, because the affinity of these interrupts cannot be
modified from user space.
Aside of that the error checks are bogus because the only error which is
returned from irq_set_affinity_hint() is when there is no irq descriptor
for the interrupt number, but not when the affinity set fails. That's on
purpose because the hint can point to an offline CPU.
Replace the mindless abuse with irq_set_affinity().
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Frank Li <Frank.li@nxp.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Pengutronix Kernel Team <kernel@pengutronix.de>
Cc: Fabio Estevam <festevam@gmail.com>
Cc: NXP Linux Team <linux-imx@nxp.com>
Cc: linux-arm-kernel@lists.infradead.org
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20210518093118.699566062@linutronix.de
Signed-off-by: Will Deacon <will@kernel.org>
The driver uses irq_set_affinity_hint() to set the affinity for the PMU
interrupts, which relies on the undocumented side effect that this function
actually sets the affinity under the hood.
Setting an hint is clearly not a guarantee and for these PMU interrupts an
affinity hint, which is supposed to guide userspace for setting affinity,
is beyond pointless, because the affinity of these interrupts cannot be
modified from user space.
Aside of that the error checks are bogus because the only error which is
returned from irq_set_affinity_hint() is when there is no irq descriptor
for the interrupt number, but not when the affinity set fails. That's on
purpose because the hint can point to an offline CPU.
Replace the mindless abuse with irq_set_affinity().
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20210518093118.603636289@linutronix.de
Signed-off-by: Will Deacon <will@kernel.org>
The driver uses irq_set_affinity_hint() to set the affinity for the PMU
interrupts, which relies on the undocumented side effect that this function
actually sets the affinity under the hood.
Setting an hint is clearly not a guarantee and for these PMU interrupts an
affinity hint, which is supposed to guide userspace for setting affinity,
is beyond pointless, because the affinity of these interrupts cannot be
modified from user space.
Aside of that the error checks are bogus because the only error which is
returned from irq_set_affinity_hint() is when there is no irq descriptor
for the interrupt number, but not when the affinity set fails. That's on
purpose because the hint can point to an offline CPU.
Replace the mindless abuse with irq_set_affinity().
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20210518093118.505110632@linutronix.de
Signed-off-by: Will Deacon <will@kernel.org>
The driver uses irq_set_affinity_hint() to set the affinity for the PMU
interrupts, which relies on the undocumented side effect that this function
actually sets the affinity under the hood.
Setting an hint is clearly not a guarantee and for these PMU interrupts an
affinity hint, which is supposed to guide userspace for setting affinity,
is beyond pointless, because the affinity of these interrupts cannot be
modified from user space.
Aside of that the error checks are bogus because the only error which is
returned from irq_set_affinity_hint() is when there is no irq descriptor
for the interrupt number, but not when the affinity set fails. That's on
purpose because the hint can point to an offline CPU.
Replace the mindless abuse with irq_set_affinity().
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20210518093118.395086573@linutronix.de
Signed-off-by: Will Deacon <will@kernel.org>
The driver uses irq_set_affinity_hint() to set the affinity for the PMU
interrupts, which relies on the undocumented side effect that this function
actually sets the affinity under the hood.
Setting an hint is clearly not a guarantee and for these PMU interrupts an
affinity hint, which is supposed to guide userspace for setting affinity,
is beyond pointless, because the affinity of these interrupts cannot be
modified from user space.
Aside of that the error checks are bogus because the only error which is
returned from irq_set_affinity_hint() is when there is no irq descriptor
for the interrupt number, but not when the affinity set fails. That's on
purpose because the hint can point to an offline CPU.
Replace the mindless abuse with irq_set_affinity().
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20210518093118.277228577@linutronix.de
Signed-off-by: Will Deacon <will@kernel.org>
The driver uses irq_set_affinity_hint() to set the affinity for the PMU
interrupts, which relies on the undocumented side effect that this function
actually sets the affinity under the hood.
Setting an hint is clearly not a guarantee and for these PMU interrupts an
affinity hint, which is supposed to guide userspace for setting affinity,
is beyond pointless, because the affinity of these interrupts cannot be
modified from user space.
Aside of that the error checks are bogus because the only error which is
returned from irq_set_affinity_hint() is when there is no irq descriptor
for the interrupt number, but not when the affinity set fails. That's on
purpose because the hint can point to an offline CPU.
Replace the mindless abuse with irq_set_affinity().
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20210518093118.128250213@linutronix.de
Signed-off-by: Will Deacon <will@kernel.org>
- Stage-2 isolation for the host kernel when running in protected mode
- Guest SVE support when running in nVHE mode
- Force W^X hypervisor mappings in nVHE mode
- ITS save/restore for guests using direct injection with GICv4.1
- nVHE panics now produce readable backtraces
- Guest support for PTP using the ptp_kvm driver
- Performance improvements in the S2 fault handler
x86:
- Optimizations and cleanup of nested SVM code
- AMD: Support for virtual SPEC_CTRL
- Optimizations of the new MMU code: fast invalidation,
zap under read lock, enable/disably dirty page logging under
read lock
- /dev/kvm API for AMD SEV live migration (guest API coming soon)
- support SEV virtual machines sharing the same encryption context
- support SGX in virtual machines
- add a few more statistics
- improved directed yield heuristics
- Lots and lots of cleanups
Generic:
- Rework of MMU notifier interface, simplifying and optimizing
the architecture-specific code
- Some selftests improvements
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmCJ13kUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroM1HAgAqzPxEtiTPTFeFJV5cnPPJ3dFoFDK
y/juZJUQ1AOtvuWzzwuf175ewkv9vfmtG6rVohpNSkUlJYeoc6tw7n8BTTzCVC1b
c/4Dnrjeycr6cskYlzaPyV6MSgjSv5gfyj1LA5UEM16LDyekmaynosVWY5wJhju+
Bnyid8l8Utgz+TLLYogfQJQECCrsU0Wm//n+8TWQgLf1uuiwshU5JJe7b43diJrY
+2DX+8p9yWXCTz62sCeDWNahUv8AbXpMeJ8uqZPYcN1P0gSEUGu8xKmLOFf9kR7b
M4U1Gyz8QQbjd2lqnwiWIkvRLX6gyGVbq2zH0QbhUe5gg3qGUX7JjrhdDQ==
=AXUi
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
"This is a large update by KVM standards, including AMD PSP (Platform
Security Processor, aka "AMD Secure Technology") and ARM CoreSight
(debug and trace) changes.
ARM:
- CoreSight: Add support for ETE and TRBE
- Stage-2 isolation for the host kernel when running in protected
mode
- Guest SVE support when running in nVHE mode
- Force W^X hypervisor mappings in nVHE mode
- ITS save/restore for guests using direct injection with GICv4.1
- nVHE panics now produce readable backtraces
- Guest support for PTP using the ptp_kvm driver
- Performance improvements in the S2 fault handler
x86:
- AMD PSP driver changes
- Optimizations and cleanup of nested SVM code
- AMD: Support for virtual SPEC_CTRL
- Optimizations of the new MMU code: fast invalidation, zap under
read lock, enable/disably dirty page logging under read lock
- /dev/kvm API for AMD SEV live migration (guest API coming soon)
- support SEV virtual machines sharing the same encryption context
- support SGX in virtual machines
- add a few more statistics
- improved directed yield heuristics
- Lots and lots of cleanups
Generic:
- Rework of MMU notifier interface, simplifying and optimizing the
architecture-specific code
- a handful of "Get rid of oprofile leftovers" patches
- Some selftests improvements"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (379 commits)
KVM: selftests: Speed up set_memory_region_test
selftests: kvm: Fix the check of return value
KVM: x86: Take advantage of kvm_arch_dy_has_pending_interrupt()
KVM: SVM: Skip SEV cache flush if no ASIDs have been used
KVM: SVM: Remove an unnecessary prototype declaration of sev_flush_asids()
KVM: SVM: Drop redundant svm_sev_enabled() helper
KVM: SVM: Move SEV VMCB tracking allocation to sev.c
KVM: SVM: Explicitly check max SEV ASID during sev_hardware_setup()
KVM: SVM: Unconditionally invoke sev_hardware_teardown()
KVM: SVM: Enable SEV/SEV-ES functionality by default (when supported)
KVM: SVM: Condition sev_enabled and sev_es_enabled on CONFIG_KVM_AMD_SEV=y
KVM: SVM: Append "_enabled" to module-scoped SEV/SEV-ES control variables
KVM: SEV: Mask CPUID[0x8000001F].eax according to supported features
KVM: SVM: Move SEV module params/variables to sev.c
KVM: SVM: Disable SEV/SEV-ES if NPT is disabled
KVM: SVM: Free sev_asid_bitmap during init if SEV setup fails
KVM: SVM: Zero out the VMCB array used to track SEV ASID association
x86/sev: Drop redundant and potentially misleading 'sev_enabled'
KVM: x86: Move reverse CPUID helpers to separate header file
KVM: x86: Rename GPR accessors to make mode-aware variants the defaults
...
Nearly all of the messages we can log from the platform device code
relate to the specific PMU device and the properties we're parsing from
its DT node. In some cases we use %pOF to point at where something was
wrong, but even that is inconsistent. Let's convert these logs to the
appropriate dev_printk variants, so that every issue specific to the
device and/or its DT description is clearly and instantly attributable,
particularly if there is more than one PMU node present in the DT.
The local refactoring in a couple of functions invites some extra
cleanup in the process - the init_fn matching can be streamlined, and
the PMU registration failure message moved to the appropriate place and
log level.
CC: Tian Tao <tiantao6@hisilicon.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/10a4aacdf071d0c03d061c408a5899e5b32cc0a6.1616774562.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
By virtue of using platform_irq_get_optional() under the covers,
platform_irq_count() needs the target interrupt controller to be
available and may return -EPROBE_DEFER if it isn't. Let's use
dev_err_probe() to avoid a spurious error log (and help debug any
deferral issues) in that case.
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/073d5e0d3ed1f040592cb47ca6fe3759f40cc7d1.1616774562.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
On HiSilicon Hip09 platform, there is a PA (Protocol Adapter) module on
each chip SICL (Super I/O Cluster) which incorporates three Hydra interface
and facilitates the cache coherency between the dies on the chip. While PA
uncore PMU model is the same as other Hip09 PMU modules and many PMU events
are supported. Let's support the PMU driver using the HiSilicon uncore PMU
framework.
PA PMU supports the following filter functions:
* tracetag_en: allows user to count events according to tt_req or
tt_core set in L3C PMU. It's the same as other PMUs.
* srcid_cmd & srcid_msk: allows user to filter statistics that come from
specific CCL/ICL by configuration source ID.
* tgtid_cmd & tgtid_msk: it is the similar function to srcid_cmd &
srcid_msk. Both are used to check where the data comes from or go to.
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Co-developed-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/1615186237-22263-9-git-send-email-zhangshaokun@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
HiSilicon's Hip09 is comprised by multi-dies that can be connected by SLLC
module (Skyros Link Layer Controller), its has separate PMU registers which
the driver can program it freely and interrupt is supported to handle
counter overflow. Let's support its driver under the framework of HiSilicon
uncore PMU driver.
SLLC PMU supports the following filter functions:
* tracetag_en: allows user to count data according to tt_req or
tt_core set in L3C PMU.
* srcid_cmd & srcid_msk: allows user to filter statistics that come from
specific CCL/ICL by configuration source ID.
* tgtid_hi & tgtid_lo: it also supports event statistics that these
operations will go to the CCL/ICL by configuration target ID or
target ID range. It's the same as source ID with 11-bit width in
the SoC. More introduction is added in documentation:
Documentation/admin-guide/perf/hisi-pmu.rst
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Co-developed-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/1615186237-22263-8-git-send-email-zhangshaokun@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
DDRC PMU's events are useful for performance profiling, but the events
are limited and counter is fixed. On HiSilicon Hip09 platform, PMU
counters are the programmable and more events are supported. Let's
add the DDRC PMU v2 driver.
Bandwidth events are exposed directly in driver and some more events
will listed in JSON file later.
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Co-developed-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/1615186237-22263-7-git-send-email-zhangshaokun@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
On HiSilicon Hip09 platform, some new functions are also supported on
HHA PMU.
* tracetag_en: it is the abbreviation of tracetag enable and allows user
to count events according to tt_req or tt_core set in L3C PMU.
* datasrc_skt: it is the abbreviation of data source from another
socket and it is used in the multi-chips. It's the same as L3C PMU.
* srcid_cmd & srcid_msk: pair of the fields are used to filter
statistics that come from the specific CCL/ICL by the configuration.
These are the abbreviation of source ID command and mask. The source
ID is 11-bit and detailed descriptions are documented in
Documentation/admin-guide/perf/hisi-pmu.rst.
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Co-developed-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/1615186237-22263-6-git-send-email-zhangshaokun@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
On HiSilicon Hip09 platform, some new functions are enhanced on L3C PMU:
* tt_req: it is the abbreviation of tracetag request and allows user to
count only read/write/atomic operations. tt_req is 3-bit and details are
listed in the hisi-pmu document.
$# perf stat -a -e hisi_sccl3_l3c0/config=0x02,tt_req=0x4/ sleep 5
* tt_core: it is the abbreviation of tracetag core and allows user to
filter by core/thread within the cluster, it is a 8-bit bitmap that each
bit represents the corresponding core/thread in this L3C.
$# perf stat -a -e hisi_sccl3_l3c0/config=0x02,tt_core=0xf/ sleep 5
* datasrc_cfg: it is the abbreviation of data source configuration and
allows user to check where the data comes from, such as: from local DDR,
cross-die DDR or cross-socket DDR. Its is 5-bit and represents different
data source in the SoC.
$# perf stat -a -e hisi_sccl3_l3c0/dat_access,datasrc_cfg=0xe/ sleep 5
* datasrc_skt: it is the abbreviation of data source from another socket
and is used in the multi-chips, if user wants to check the cross-socket
datat source, it shall be added in perf command. Only one bit is used to
control this.
$# perf stat -a -e hisi_sccl3_l3c0/dat_access,datasrc_cfg=0x10,datasrc_skt=1/ sleep 5
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Co-developed-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/1615186237-22263-5-git-send-email-zhangshaokun@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
For HiSilicon uncore PMU, more versions are supported and some variables
shall be added suffix to distinguish the version which are prepared for
the new drivers.
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Co-developed-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/1615186237-22263-4-git-send-email-zhangshaokun@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
On HiSilicon uncore PMU drivers, interrupt handling function and interrupt
registration function are very similar in differents PMU modules. Let's
refactor the frame.
Two new callbacks are added for the HW accessors:
* hisi_uncore_ops::get_int_status returns a bitmap of events which
have overflowed and raised an interrupt
* hisi_uncore_ops::clear_int_status clears the overflow status for a
specific event
These callback functions are used by a common IRQ handler,
hisi_uncore_pmu_isr().
One more function hisi_uncore_pmu_init_irq() is added to replace each
PMU initialization IRQ interface and simplify the code.
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Co-developed-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/1615186237-22263-3-git-send-email-zhangshaokun@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
The sanity check for counter index has been done in the function
hisi_uncore_pmu_get_event_idx, so remove the redundant interface
hisi_uncore_pmu_counter_valid() and sanity check.
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Co-developed-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/1615186237-22263-2-git-send-email-zhangshaokun@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
For each PMU event, there is a SMMU_EVENT_ATTR(xx, XX) and
&smmu_event_attr_xx.attr.attr. Let's redefine the SMMU_EVENT_ATTR
to simplify the smmu_pmu_events.
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Link: https://lore.kernel.org/r/1612789498-12957-1-git-send-email-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
sprintf does not know the PAGE_SIZE maximum of the temporary buffer
used for sysfs content and it's possible to overrun the buffer length.
Use sysfs_emit() function to ensures that no overrun is done.
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Link: https://lore.kernel.org/r/1616148273-16374-4-git-send-email-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Fix the following coccicheck warning:
./drivers/perf/hisilicon/hisi_uncore_pmu.c:128:8-16: WARNING: use scnprintf or sprintf.
./drivers/perf/fsl_imx8_ddr_perf.c:173:8-16: WARNING: use scnprintf or sprintf.
./drivers/perf/arm_spe_pmu.c:129:8-16: WARNING: use scnprintf or sprintf.
./drivers/perf/arm_smmu_pmu.c:563:8-16: WARNING: use scnprintf or sprintf.
./drivers/perf/arm_dsu_pmu.c:149:8-16: WARNING: use scnprintf or sprintf.
./drivers/perf/arm_dsu_pmu.c:139:8-16: WARNING: use scnprintf or sprintf.
./drivers/perf/arm-cmn.c:563:8-16: WARNING: use scnprintf or sprintf.
./drivers/perf/arm-cmn.c:351:8-16: WARNING: use scnprintf or sprintf.
./drivers/perf/arm-ccn.c:224:8-16: WARNING: use scnprintf or sprintf.
./drivers/perf/arm-cci.c:708:8-16: WARNING: use scnprintf or sprintf.
./drivers/perf/arm-cci.c:699:8-16: WARNING: use scnprintf or sprintf.
./drivers/perf/arm-cci.c:528:8-16: WARNING: use scnprintf or sprintf.
./drivers/perf/arm-cci.c:309:8-16: WARNING: use scnprintf or sprintf.
Signed-off-by: Zihao Tang <tangzihao1@hisilicon.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Link: https://lore.kernel.org/r/1616148273-16374-2-git-send-email-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Fix to return negative error code -ENOMEM from the error handling
case instead of 0, as done elsewhere in this function.
Fixes: 53c218da22 ("driver/perf: Add PMU driver for the ARM DMC-620 memory controller")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Link: https://lore.kernel.org/r/20210312080421.277562-1-weiyongjun1@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Including:
- ARM SMMU and Mediatek updates from Will Deacon:
- Support for MT8192 IOMMU from Mediatek
- Arm v7s io-pgtable extensions for MT8192
- Removal of TLBI_ON_MAP quirk
- New Qualcomm compatible strings
- Allow SVA without hardware broadcast TLB maintenance
on SMMUv3
- Virtualization Host Extension support for SMMUv3 (SVA)
- Allow SMMUv3 PMU (perf) driver to be built
independently from IOMMU
- Some tidy-up in IOVA and core code
- Conversion of the AMD IOMMU code to use the generic
IO-page-table framework
- Intel VT-d updates from Lu Baolu:
- Audit capability consistency among different IOMMUs
- Add SATC reporting structure support
- Add iotlb_sync_map callback support
- SDHI Support for Renesas IOMMU driver
- Misc Cleanups and other small improvments
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmAz2AcACgkQK/BELZcB
GuMflRAAyOfXzFqfmniYKxXmxAhOIlibCoojXzecItifdyaFkvUhxCadRg5u3dLH
IeACDvUiaA5VQnVhjI0a7gCckKdq5YDwwAS+GQICUKNEZtWwrHwm2QTRBmL9MOlA
v+iTrhYCqZWIAPe16BP5L4u6q4JWS2N9oNmDp6ia1VIhPjfsHU+gXpYKSxiLicmV
VECAHJk5/JrwKXBP2nMg3ZqGz9RoJc2CzC4zvKu7ADDB9Zl+pXs74mb4ta81Y3G4
vf07G/ORYJjbMskz5KcmYw2897I9ejMrNaHrYNjlh2IDpGqmoCJ4+8vuVO0zslrm
GzMOHMshaI653BmuRDHyczwNrNxMxSX3NOeR2fp76d3MSouK7RoFMr7ghMAegp1u
qmSrqFbgnOT4cdzN8QpPyU22lmVHtQHm0P4EpZZzC95dtJo1nzt8BFrDjPdJDOyZ
D7oKvZq+OA6MtjCN4gZ2ClxQoiUZ8E/jP1uGIknpzR1oeWnyEFtx8aI9q4yRjNcp
n8UR7wFqbOIV0O/QC7UlEp/xSpG7BDN4BkvIgbH2qe6LmRmejCrnUWv6pkmPc9sI
wFgV/Qnh9oo7yf6zvmCsi1r0kmfGLPRe8GB9+eN3wY3xxzDOLJEJNdVghWaP8Rz7
MBcL/0u8+fMfQlRiOPX64BSIF7fFPY0r0+erINawf1VUbyjUsUE=
=xSX1
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu updates from Joerg Roedel:
- ARM SMMU and Mediatek updates from Will Deacon:
- Support for MT8192 IOMMU from Mediatek
- Arm v7s io-pgtable extensions for MT8192
- Removal of TLBI_ON_MAP quirk
- New Qualcomm compatible strings
- Allow SVA without hardware broadcast TLB maintenance on SMMUv3
- Virtualization Host Extension support for SMMUv3 (SVA)
- Allow SMMUv3 PMU perf driver to be built independently from IOMMU
- Some tidy-up in IOVA and core code
- Conversion of the AMD IOMMU code to use the generic IO-page-table
framework
- Intel VT-d updates from Lu Baolu:
- Audit capability consistency among different IOMMUs
- Add SATC reporting structure support
- Add iotlb_sync_map callback support
- SDHI support for Renesas IOMMU driver
- Misc cleanups and other small improvments
* tag 'iommu-updates-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (94 commits)
iommu/amd: Fix performance counter initialization
MAINTAINERS: repair file pattern in MEDIATEK IOMMU DRIVER
iommu/mediatek: Fix error code in probe()
iommu/mediatek: Fix unsigned domid comparison with less than zero
iommu/vt-d: Parse SATC reporting structure
iommu/vt-d: Add new enum value and structure for SATC
iommu/vt-d: Add iotlb_sync_map callback
iommu/vt-d: Move capability check code to cap_audit files
iommu/vt-d: Audit IOMMU Capabilities and add helper functions
iommu/vt-d: Fix 'physical' typos
iommu: Properly pass gfp_t in _iommu_map() to avoid atomic sleeping
iommu/vt-d: Fix compile error [-Werror=implicit-function-declaration]
driver/perf: Remove ARM_SMMU_V3_PMU dependency on ARM_SMMU_V3
MAINTAINERS: Add entry for MediaTek IOMMU
iommu/mediatek: Add mt8192 support
iommu/mediatek: Remove unnecessary check in attach_device
iommu/mediatek: Support master use iova over 32bit
iommu/mediatek: Add iova reserved function
iommu/mediatek: Support for multi domains
iommu/mediatek: Add get_domain_id from dev->dma_range_map
...
Set "suppress_bind_attrs" to true, so that bind/unbind can be
disabled via sysfs and prevent unbinding ARM_DMC620_PMU drivers
during perf sampling.
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Link: https://lore.kernel.org/r/1612252686-50329-1-git-send-email-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
- Support for MT8192 IOMMU from Mediatek
- Arm v7s io-pgtable extensions for MT8192
- Removal of TLBI_ON_MAP quirk
- New Qualcomm compatible strings
- Allow SVA without hardware broadcast TLB maintenance on SMMUv3
- Virtualization Host Extension support for SMMUv3 (SVA)
- Allow SMMUv3 PMU (perf) driver to be built independently from IOMMU
- Misc cleanups
-----BEGIN PGP SIGNATURE-----
iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmAYD/QQHHdpbGxAa2Vy
bmVsLm9yZwAKCRC3rHDchMFjNLZjCACBK2bCmkVJfs39YeeMhHWVRyiiMpIuCvja
7WuIqeXnbtAg19hkwPqns+XYeTTnoeo/aSZUS9z8DIDUqADMES+1UUMfVf2mfL3A
VD1pGRtQezYS+rDOc5dL9m1CcJ2RRY5tfQEjK8Fnm5CItvylNXbURH1nfu1tMVW7
fn1tMNKCTnY7FkABttKyR4a1XIyFX+fPdSrqd5PjQ7D6rK2ZYrorMf55WpmQEfFu
3er4F3Xz8nJeTvswPUMNB3ibvcdICM9VcSwFHpElTq2dbHUDDz3uFQwyiXrutEmb
Xz4UGabjS/ZEPaty6y7VxdYEbD4Q2dzii57jKKtV49N36iCvkSR2
=5KXT
-----END PGP SIGNATURE-----
Merge tag 'arm-smmu-updates' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux into arm/smmu
Arm SMMU updates for 5.12
- Support for MT8192 IOMMU from Mediatek
- Arm v7s io-pgtable extensions for MT8192
- Removal of TLBI_ON_MAP quirk
- New Qualcomm compatible strings
- Allow SVA without hardware broadcast TLB maintenance on SMMUv3
- Virtualization Host Extension support for SMMUv3 (SVA)
- Allow SMMUv3 PMU (perf) driver to be built independently from IOMMU
- Misc cleanups
The ARM_SMMU_V3_PMU dependency on ARM_SMMU_V3_PMU was added with the idea
that a SMMUv3 PMCG would only exist on a system with an associated SMMUv3.
However it is not the job of Kconfig to make these sorts of decisions (even
if it were true), so remove the dependency.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/1612175042-56866-1-git-send-email-john.garry@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Although it's neat to avoid the suffix for the typical case of a
single PMU, it means systems with multiple CMN instances end up with
inconsistent naming. I think it also breaks perf tool's "uncore alias"
logic if the common instance prefix is also the full name of one.
Avoid any surprises by not trying to be clever and simply numbering
every instance, even when it might technically prove redundant.
Fixes: 0ba64770a2 ("perf: Add Arm CMN-600 PMU driver")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/649a2281233f193d59240b13ed91b57337c77b32.1611839564.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
The only usage is to put their addresses in an array of pointers to
const struct attribute group. Make them const to allow the compiler
to put them in read-only memory.
Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Link: https://lore.kernel.org/r/20210117212847.21319-5-rikard.falkeborn@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
The only usage is to put their addresses in an array of pointers to
const struct attribute group. Make them const to allow the compiler
to put them in read-only memory.
Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Link: https://lore.kernel.org/r/20210117212847.21319-4-rikard.falkeborn@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
The only usage is to put their addresses in an array of pointers to
const struct attribute group. Make them const to allow the compiler
to put them in read-only memory.
Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Link: https://lore.kernel.org/r/20210117212847.21319-3-rikard.falkeborn@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
The only usage is to put their addresses in an array of pointers to
const struct attribute group. Make them const to allow the compiler
to put them in read-only memory.
Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Link: https://lore.kernel.org/r/20210117212847.21319-2-rikard.falkeborn@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
Armv8.3 extends the SPE by adding:
- Alignment field in the Events packet, and filtering on this event
using PMSEVFR_EL1.
- Support for the Scalable Vector Extension (SVE).
The main additions for SVE are:
- Recording the vector length for SVE operations in the Operation Type
packet. It is not possible to filter on vector length.
- Incomplete predicate and empty predicate fields in the Events packet,
and filtering on these events using PMSEVFR_EL1.
Update the check of pmsevfr for empty/partial predicated SVE and
alignment event in SPE driver.
Signed-off-by: Wei Li <liwei391@huawei.com>
Link: https://lore.kernel.org/r/20201203141609.14148-1-liwei391@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
This reverts commit 367c820ef0.
lockup_detector_init() makes heavy use of per-cpu variables and must be
called with preemption disabled. Usually, it's handled early during boot
in kernel_init_freeable(), before SMP has been initialised.
Since we do not know whether or not our PMU interrupt can be signalled
as an NMI until considerably later in the boot process, the Arm PMU
driver attempts to re-initialise the lockup detector off the back of a
device_initcall(). Unfortunately, this is called from preemptible
context and results in the following splat:
| BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
| caller is debug_smp_processor_id+0x20/0x2c
| CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.10.0+ #276
| Hardware name: linux,dummy-virt (DT)
| Call trace:
| dump_backtrace+0x0/0x3c0
| show_stack+0x20/0x6c
| dump_stack+0x2f0/0x42c
| check_preemption_disabled+0x1cc/0x1dc
| debug_smp_processor_id+0x20/0x2c
| hardlockup_detector_event_create+0x34/0x18c
| hardlockup_detector_perf_init+0x2c/0x134
| watchdog_nmi_probe+0x18/0x24
| lockup_detector_init+0x44/0xa8
| armv8_pmu_driver_init+0x54/0x78
| do_one_initcall+0x184/0x43c
| kernel_init_freeable+0x368/0x380
| kernel_init+0x1c/0x1cc
| ret_from_fork+0x10/0x30
Rather than bodge this with raw_smp_processor_id() or randomly disabling
preemption, simply revert the culprit for now until we figure out how to
do this properly.
Reported-by: Lecopzer Chen <lecopzer.chen@mediatek.com>
Signed-off-by: Will Deacon <will@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: Sumit Garg <sumit.garg@linaro.org>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Link: https://lore.kernel.org/r/20201221162249.3119-1-lecopzer.chen@mediatek.com
Link: https://lore.kernel.org/r/20210112221855.10666-1-will@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The DDR Perf for i.MX8 is a system PMU whose AXI ID would different from
SoC to SoC. Need expose system PMU identifier for userspace which refer
to /sys/bus/event_source/devices/<PMU DEVICE>/identifier.
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/20201130114202.26057-3-qiangqing.zhang@nxp.com
Signed-off-by: Will Deacon <will@kernel.org>
With the recent feature added to enable perf events to use pseudo NMIs
as interrupts on platforms which support GICv3 or later, its now been
possible to enable hard lockup detector (or NMI watchdog) on arm64
platforms. So enable corresponding support.
One thing to note here is that normally lockup detector is initialized
just after the early initcalls but PMU on arm64 comes up much later as
device_initcall(). So we need to re-initialize lockup detection once
PMU has been initialized.
Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
Acked-by: Alexandru Elisei <alexandru.elisei@arm.com>
Link: https://lore.kernel.org/r/1602060704-10921-1-git-send-email-sumit.garg@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
DDR Perf driver only supports free-running event counters(counter1/2/3)
now, this patch adds support for stop event counters.
Legacy SoCs:
Cycle counter(counter0) is a special counter, only count cycles. When
cycle counter overflow, it will lock all counters and generate an
interrupt. In ddr_perf_irq_handler, disable cycle counter then all
counters would stop at the same time, update all counters' count, then
enable cycle counter that all counters count again. During this process,
only clear cycle counter, no need to clear event counters since they are
free-running counters. They would continue counting after overflow and
do/while loop from ddr_perf_event_update can handle event counters
overflow case.
i.MX8MP:
Almost all is the same as legacy SoCs, the only difference is that, event
counters are not free-running any more. Like cycle counter, when event
counters overflow, they would stop counting unless clear the counter,
and no interrupt generate for event counters. So we should clear event
counters that let them re-count when cycle counter overflow, which ensure
event counters will not lose data.
This patch adds stop event counters support which would be compatible to
free-running event counters. We use the cycle counter to stop overflow
of the event counters.
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Link: https://lore.kernel.org/r/20201027104451.15434-1-qiangqing.zhang@nxp.com
Signed-off-by: Will Deacon <will@kernel.org>
SMMU_PMCG_IIDR was added in the SMMUv3.3 spec.
For the perf tool to know the specific HW implementation, expose the
PMCG_IIDR contents only when set.
Signed-off-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1602149181-237415-5-git-send-email-john.garry@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
To allow userspace to identify the specific implementation of the device,
add an "identifier" sysfs file.
Encoding is as follows (same for all uncore drivers):
hi1620: 0x0
hi1630: 0x30
Signed-off-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1602149181-237415-2-git-send-email-john.garry@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
DMC-620 PMU supports total 10 counters which each is
independently programmable to different events and can
be started and stopped individually.
Currently, it only supports ACPI. Other platforms feel free to test and add
support for device tree.
Usage example:
#perf stat -e arm_dmc620_10008c000/clk_cycle_count/ -C 0
Get perf event for clk_cycle_count counter.
#perf stat -e arm_dmc620_10008c000/clkdiv2_allocate,mask=0x1f,match=0x2f,
incr=2,invert=1/ -C 0
The above example shows how to specify mask, match, incr,
invert parameters for clkdiv2_allocate event.
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Tuan Phan <tuanphan@os.amperecomputing.com>
Link: https://lore.kernel.org/r/1604518246-6198-1-git-send-email-tuanphan@os.amperecomputing.com
Signed-off-by: Will Deacon <will@kernel.org>
The node type field is an enum type, so print it as a 32-bit quantity
rather than as an unsigned short.
Link: https://lore.kernel.org/r/202009302350.QIzfkx62-lkp@intel.com
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Will Deacon <will@kernel.org>
Ensure that the 'irq' field of 'struct arm_cmn_dtc' is a signed int
so that it can be compared '< 0'.
Link: https://lore.kernel.org/r/20200929170835.GA15956@embeddedor
Addresses-Coverity-ID: 1497488 ("Unsigned compared against 0")
Fixes: 0ba64770a2 ("perf: Add Arm CMN-600 PMU driver")
Reported-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
Add required PMU interrupt operations for NMIs. Request interrupt lines as
NMIs when possible, otherwise fall back to normal interrupts.
NMIs are only supported on the arm64 architecture with a GICv3 irqchip.
[Alexandru E.: Added that NMIs only work on arm64 + GICv3, print message
when PMU is using NMIs]
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Tested-by: Sumit Garg <sumit.garg@linaro.org> (Developerbox)
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20200924110706.254996-8-alexandru.elisei@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Currently the PMU interrupt can either be a normal irq or a percpu irq.
Supporting NMI will introduce two cases for each existing one. It becomes
a mess of 'if's when managing the interrupt.
Define sets of callbacks for operations commonly done on the interrupt. The
appropriate set of callbacks is selected at interrupt request time and
simplifies interrupt enabling/disabling and freeing.
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Tested-by: Sumit Garg <sumit.garg@linaro.org> (Developerbox)
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20200924110706.254996-7-alexandru.elisei@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Initial driver for PMU event counting on the Arm CMN-600 interconnect.
CMN sports an obnoxiously complex distributed PMU system as part of
its debug and trace features, which can do all manner of things like
sampling, cross-triggering and generating CoreSight trace. This driver
covers the PMU functionality, plus the relevant aspects of watchpoints
for simply counting matching flits.
Tested-by: Tsahi Zidenberg <tsahee@amazon.com>
Tested-by: Tuan Phan <tuanphan@os.amperecomputing.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
In tx2_uncore_pmu_init_dev(), a call to acpi_dev_get_resources() is used
to create a list _CRS resources which is searched for the device base
address. There is an error check following this:
if (!rentry->res)
return NULL
In no case, will rentry->res be NULL, so the test is useless. Even
if the test worked, it comes before the resource list memory is
freed. None of this really matters as long as the ACPI table has
the memory resource. Let's clean it up so that it makes sense and
will give a meaningful error should firmware leave out the memory
resource.
Fixes: 69c32972d5 ("drivers/perf: Add Cavium ThunderX2 SoC UNCORE PMU driver")
Signed-off-by: Mark Salter <msalter@redhat.com>
Link: https://lore.kernel.org/r/20200915204110.326138-2-msalter@redhat.com
Signed-off-by: Will Deacon <will@kernel.org>
This splat was reported on newer Fedora kernels booting on certain
X-gene based machines:
xgene-pmu APMC0D83:00: X-Gene PMU version 3
Unable to handle kernel read from unreadable memory at virtual \
address 0000000000004006
...
Call trace:
string+0x50/0x100
vsnprintf+0x160/0x750
devm_kvasprintf+0x5c/0xb4
devm_kasprintf+0x54/0x60
__devm_ioremap_resource+0xdc/0x1a0
devm_ioremap_resource+0x14/0x20
acpi_get_pmu_hw_inf.isra.0+0x84/0x15c
acpi_pmu_dev_add+0xbc/0x21c
acpi_ns_walk_namespace+0x16c/0x1e4
acpi_walk_namespace+0xb4/0xfc
xgene_pmu_probe_pmu_dev+0x7c/0xe0
xgene_pmu_probe.part.0+0x2c0/0x310
xgene_pmu_probe+0x54/0x64
platform_drv_probe+0x60/0xb4
really_probe+0xe8/0x4a0
driver_probe_device+0xe4/0x100
device_driver_attach+0xcc/0xd4
__driver_attach+0xb0/0x17c
bus_for_each_dev+0x6c/0xb0
driver_attach+0x30/0x40
bus_add_driver+0x154/0x250
driver_register+0x84/0x140
__platform_driver_register+0x54/0x60
xgene_pmu_driver_init+0x28/0x34
do_one_initcall+0x40/0x204
do_initcalls+0x104/0x144
kernel_init_freeable+0x198/0x210
kernel_init+0x20/0x12c
ret_from_fork+0x10/0x18
Code: 91000400 110004e1 eb08009f 540000c0 (38646846)
---[ end trace f08c10566496a703 ]---
This is due to use of an uninitialized local resource struct in the xgene
pmu driver. The thunderx2_pmu driver avoids this by using the resource list
constructed by acpi_dev_get_resources() rather than using a callback from
that function. The callback in the xgene driver didn't fully initialize
the resource. So get rid of the callback and search the resource list as
done by thunderx2.
Fixes: 832c927d11 ("perf: xgene: Add APM X-Gene SoC Performance Monitoring Unit driver")
Signed-off-by: Mark Salter <msalter@redhat.com>
Link: https://lore.kernel.org/r/20200915204110.326138-1-msalter@redhat.com
Signed-off-by: Will Deacon <will@kernel.org>
Add support for probing device from ACPI node.
Each DSU ACPI node and its associated cpus are inside a cluster node.
Signed-off-by: Tuan Phan <tuanphan@os.amperecomputing.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/1600106656-9542-1-git-send-email-tuanphan@os.amperecomputing.com
Signed-off-by: Will Deacon <will@kernel.org>
MODULE_*** is used in HiSilicon uncore PMU drivers and is provided by
linux/module.h, but the header file is not directly included. Add the
missing include.
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/1599186097-18599-1-git-send-email-zhangshaokun@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
framework we just have some minor tweaks and a debugfs feature, so not much to
see there. The driver updates are fairly well split between AT91 and Qualcomm
clk support. Adding those two drivers together equals about 50% of the
diffstat. Otherwise, the big amount of work this time was on supporting
Broadcom's Raspberry Pi firmware clks. See below for some more highlights.
Core:
- Document clk_hw_round_rate() so it gets some more use
- Remove unused __clk_get_flags()
- Add a prepare/enable debugfs feature similar to rate setting
New Drivers:
- Add support for SAMA7G5 SoC clks
- Enable CPU clks on Qualcomm IPQ6018 SoCs
- Enable CPU clks on Qualcomm MSM8996 SoCs
- GPU clk support for Qualcomm SM8150 and SM8250 SoCs
- Audio clks on Qualcomm SC7180 SoCs
- Microchip Sparx5 DPLL clk
- Add support for the new Renesas RZ/G2H (R8A774E1) SoC
Updates:
- Make defines for bcm63xx-gate clks to use in DT
- Support BCM2711 SoC firmware clks
- Add HDMI clks for BCM2711 SoCs
- Add RTC related clks on Ingenic SoCs
- Support USB PHY clks on Ingenic SoCs
- Support gate clks on BCM6318 SoCs
- RMU and DMAC/GPIO clock support for Actions Semi S500 SoCs
- Use poll_timeout functions in Rockchip clk driver
- Support Rockchip rk3288w SoC variant
- Mark mac_lbtest critical on Rockchip rk3188
- Add CAAM clock support for i.MX vf610 driver
- Add MU root clock support for i.MX imx8mp driver
- Amlogic g12: add neural network accelerator clock sources
- Amlogic meson8: remove critical flag for main PLL divider
- Amlogic meson8: add video decoder clock gates
- Convert one more Renesas DT binding to json-schema
- Enhance critical clock handling on Renesas platforms to only consider
clocks that were enabled at boot time
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEE9L57QeeUxqYDyoaDrQKIl8bklSUFAl8tspERHHNib3lkQGtl
cm5lbC5vcmcACgkQrQKIl8bklSX0og//TXp134IBXVZ2Z5Wca4J41itv7tPGJFsX
EslQ/eMm6xhGEqqrA6BVsQ228JF9rUmg1fbvdl82UPK7Zyp9P2fo6CMC65ngDTky
B1rLZOaKitER9JdL9DmmnIR772pI//rAMdIeVC/Jn5RR7OpP4lgY6+qvK0pJjVFl
8aK3uHY+UWVFtqZxuJEAAyZBq1+bREqmVwNC1my6kmMIf0j0KwcGhrZgASWWtjQK
4TRmroehLC4FBqnJ78Y3E6UAOBlz6C7XnP38qJge2672Ef7QhXZU7AGrGiTtxc4h
5fB6MNMF+5QGel54qR1eH+JxaEoKsAjLaX1VBr7hAHrGIQ26dBFHFPsPvWDA+VVR
4bwXJKz2objAWyqlMVM/cVV3q6uDixuScdrw2aAiojmV7ZvdZXflacdmZuS5v7e4
sh86llN+lF0YrViYz33z+up0risCAi089xVo7Z99VgyLe2DR8TE+4/d6DQTFRNxl
m66m6mlB9pPPgIV028SAf/zmzyoVWEarwdEAQPjJC6KVRbnR9mFlhSiTQMmCsSvU
zHRxVcInc+8qIJY6VJ552UFu/JAan/AFl5knb+kW21a7hM8p81H9HuSnS4aHDDq4
yETPU3XSMYz8ARHX4lKo1UIB6+k9iF2CWdCP+gXHNP+QYSAsRBIJsonjul9lLUCi
8i6mUGkS0OU=
=QLRg
-----END PGP SIGNATURE-----
Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk updates from Stephen Boyd:
"It looks like a smaller batch of clk updates this time around.
In the core framework we just have some minor tweaks and a debugfs
feature, so not much to see there. The driver updates are fairly well
split between AT91 and Qualcomm clk support. Adding those two drivers
together equals about 50% of the diffstat.
Otherwise, the big amount of work this time was on supporting
Broadcom's Raspberry Pi firmware clks.
Highlights:
Core:
- Document clk_hw_round_rate() so it gets some more use
- Remove unused __clk_get_flags()
- Add a prepare/enable debugfs feature similar to rate setting
New Drivers:
- Add support for SAMA7G5 SoC clks
- Enable CPU clks on Qualcomm IPQ6018 SoCs
- Enable CPU clks on Qualcomm MSM8996 SoCs
- GPU clk support for Qualcomm SM8150 and SM8250 SoCs
- Audio clks on Qualcomm SC7180 SoCs
- Microchip Sparx5 DPLL clk
- Add support for the new Renesas RZ/G2H (R8A774E1) SoC
Updates:
- Make defines for bcm63xx-gate clks to use in DT
- Support BCM2711 SoC firmware clks
- Add HDMI clks for BCM2711 SoCs
- Add RTC related clks on Ingenic SoCs
- Support USB PHY clks on Ingenic SoCs
- Support gate clks on BCM6318 SoCs
- RMU and DMAC/GPIO clock support for Actions Semi S500 SoCs
- Use poll_timeout functions in Rockchip clk driver
- Support Rockchip rk3288w SoC variant
- Mark mac_lbtest critical on Rockchip rk3188
- Add CAAM clock support for i.MX vf610 driver
- Add MU root clock support for i.MX imx8mp driver
- Amlogic g12: add neural network accelerator clock sources
- Amlogic meson8: remove critical flag for main PLL divider
- Amlogic meson8: add video decoder clock gates
- Convert one more Renesas DT binding to json-schema
- Enhance critical clock handling on Renesas platforms to only
consider clocks that were enabled at boot time"
* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (79 commits)
clk: qcom: gcc: Make disp gpll0 branch aon for sc7180/sdm845
ipq806x: gcc: add support for child probe
clk: qcom: msm8996: Make symbol 'cpu_msm8996_clks' static
clk: qcom: ipq8074: Add correct index for PCIe clocks
clk: <linux/clk-provider.h>: drop a duplicated word
clk: renesas: cpg-mssr: Add r8a774e1 support
dt-bindings: clock: renesas,cpg-mssr: Document r8a774e1
clk: Drop duplicate selection in Kconfig
clk: qcom: smd: Add support for MSM8992/4 rpm clocks
clk: qcom: ipq8074: Add missing clocks for pcie
dt-bindings: clock: qcom: ipq8074: Add missing bindings for PCIe
Replace HTTP links with HTTPS ones: Common CLK framework
clk: qcom: Add CPU clock driver for msm8996
dt-bindings: clk: qcom: Add bindings for CPU clock for msm8996
soc: qcom: Separate kryo l2 accessors from PMU driver
clk: meson: meson8b: add the vclk2_en gate clock
clk: meson: meson8b: add the vclk_en gate clock
clk: qcom: Fix return value check in apss_ipq6018_probe()
clk: bcm: dvp: Add missing module informations
clk: meson: meson8b: Drop CLK_IS_CRITICAL from fclk_div2
...
- Removal of the tremendously unpopular read_barrier_depends() barrier,
which is a NOP on all architectures apart from Alpha, in favour of
allowing architectures to override READ_ONCE() and do whatever dance
they need to do to ensure address dependencies provide LOAD ->
LOAD/STORE ordering. This work also offers a potential solution if
compilers are shown to convert LOAD -> LOAD address dependencies into
control dependencies (e.g. under LTO), as weakly ordered architectures
will effectively be able to upgrade READ_ONCE() to smp_load_acquire().
The latter case is not used yet, but will be discussed further at LPC.
- Make the MSI/IOMMU input/output ID translation PCI agnostic, augment
the MSI/IOMMU ACPI/OF ID mapping APIs to accept an input ID
bus-specific parameter and apply the resulting changes to the device
ID space provided by the Freescale FSL bus.
- arm64 support for TLBI range operations and translation table level
hints (part of the ARMv8.4 architecture version).
- Time namespace support for arm64.
- Export the virtual and physical address sizes in vmcoreinfo for
makedumpfile and crash utilities.
- CPU feature handling cleanups and checks for programmer errors
(overlapping bit-fields).
- ACPI updates for arm64: disallow AML accesses to EFI code regions and
kernel memory.
- perf updates for arm64.
- Miscellaneous fixes and cleanups, most notably PLT counting
optimisation for module loading, recordmcount fix to ignore
relocations other than R_AARCH64_CALL26, CMA areas reserved for
gigantic pages on 16K and 64K configurations.
- Trivial typos, duplicate words.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAl8oTcsACgkQa9axLQDI
XvEj6hAAkn39mO5xrR/Vhpg3DyFPk63ZlMSX9SsOeVyaLbovT6stTs1XAZXPpnkt
rV3gwACyGSrqH6+uey9pHgHJuPF2TdrGEVK08yVKo9KGW/6yXSIncdKFE4jUJ/WJ
wF5j7eMET2aGzcpm5AlzMmq6HOrKB8nZac9H8/x6H+Ox2WdgJkEjOkDvyqACUyum
N3FsTZkWj2pIkTXHNgDZ8KjxVLO8HlFaB2hkxFDl9NPlX2UTCQJ8Tg1KiPLafKaK
gUvH4usQDFdb5RU/UWogre37J4emO0ZTApZOyju+U+PMMWlWVHjZ4isUIS9zz/AE
JNZ23dnKZX2HrYa5p8HZx175zwj/vXUqUHCZPLvQXaAudCEhF8BVljPiG0e80FV5
GHFUgUbylKspp01I/9L+2JvsG96Mr0e+P3Sx7L2HTI42cmtoSa14+MpoSRj7zlft
Qcl8hfrVOjCjUnFRHa/1y1cGvnD9GbgnKJR7zgVxl9bD/Jd48r1HUtwRORZCzWFr
mRPVbPS72fWxMzMV9DZYJm02jJY9kLX2BMl49njbB8MhAhzOvrMVzoVVtMMeRFLR
XHeJpmg36W09FiRGe7LRXlkXIhCQzQG2bJfiphuupCfhjRAitPoq8I925G6Pig60
c8RWaXGU7PrEsdMNrL83vekvGKgqrkoFkRVtsCoQ2X6Hvu/XdYI=
=mh79
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 and cross-arch updates from Catalin Marinas:
"Here's a slightly wider-spread set of updates for 5.9.
Going outside the usual arch/arm64/ area is the removal of
read_barrier_depends() series from Will and the MSI/IOMMU ID
translation series from Lorenzo.
The notable arm64 updates include ARMv8.4 TLBI range operations and
translation level hint, time namespace support, and perf.
Summary:
- Removal of the tremendously unpopular read_barrier_depends()
barrier, which is a NOP on all architectures apart from Alpha, in
favour of allowing architectures to override READ_ONCE() and do
whatever dance they need to do to ensure address dependencies
provide LOAD -> LOAD/STORE ordering.
This work also offers a potential solution if compilers are shown
to convert LOAD -> LOAD address dependencies into control
dependencies (e.g. under LTO), as weakly ordered architectures will
effectively be able to upgrade READ_ONCE() to smp_load_acquire().
The latter case is not used yet, but will be discussed further at
LPC.
- Make the MSI/IOMMU input/output ID translation PCI agnostic,
augment the MSI/IOMMU ACPI/OF ID mapping APIs to accept an input ID
bus-specific parameter and apply the resulting changes to the
device ID space provided by the Freescale FSL bus.
- arm64 support for TLBI range operations and translation table level
hints (part of the ARMv8.4 architecture version).
- Time namespace support for arm64.
- Export the virtual and physical address sizes in vmcoreinfo for
makedumpfile and crash utilities.
- CPU feature handling cleanups and checks for programmer errors
(overlapping bit-fields).
- ACPI updates for arm64: disallow AML accesses to EFI code regions
and kernel memory.
- perf updates for arm64.
- Miscellaneous fixes and cleanups, most notably PLT counting
optimisation for module loading, recordmcount fix to ignore
relocations other than R_AARCH64_CALL26, CMA areas reserved for
gigantic pages on 16K and 64K configurations.
- Trivial typos, duplicate words"
Link: http://lkml.kernel.org/r/20200710165203.31284-1-will@kernel.org
Link: http://lkml.kernel.org/r/20200619082013.13661-1-lorenzo.pieralisi@arm.com
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (82 commits)
arm64: use IRQ_STACK_SIZE instead of THREAD_SIZE for irq stack
arm64/mm: save memory access in check_and_switch_context() fast switch path
arm64: sigcontext.h: delete duplicated word
arm64: ptrace.h: delete duplicated word
arm64: pgtable-hwdef.h: delete duplicated words
bus: fsl-mc: Add ACPI support for fsl-mc
bus/fsl-mc: Refactor the MSI domain creation in the DPRC driver
of/irq: Make of_msi_map_rid() PCI bus agnostic
of/irq: make of_msi_map_get_device_domain() bus agnostic
dt-bindings: arm: fsl: Add msi-map device-tree binding for fsl-mc bus
of/device: Add input id to of_dma_configure()
of/iommu: Make of_map_rid() PCI agnostic
ACPI/IORT: Add an input ID to acpi_dma_configure()
ACPI/IORT: Remove useless PCI bus walk
ACPI/IORT: Make iort_msi_map_rid() PCI agnostic
ACPI/IORT: Make iort_get_device_domain IRQ domain agnostic
ACPI/IORT: Make iort_match_node_callback walk the ACPI namespace for NC
arm64: enable time namespace support
arm64/vdso: Restrict splitting VVAR VMA
arm64/vdso: Handle faults on timens page
...
Forcefully unbinding PMU drivers during perf sampling will lead to
a kernel panic, because the perf upper-layer framework call a NULL
pointer in this situation.
To solve this issue, "suppress_bind_attrs" should be set to true, so
that bind/unbind can be disabled via sysfs and prevent unbinding PMU
drivers during perf sampling.
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1594975763-32966-1-git-send-email-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
The driver provides kernel level API for other drivers
to access the MSM8996 L2 cache registers.
Separating the L2 access code from the PMU driver and
making it public to allow other drivers use it.
The accesses must be separated with a single spinlock,
maintained in this driver.
Signed-off-by: Ilia Lin <ilialin@codeaurora.org>
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Link: https://lore.kernel.org/r/1593766185-16346-2-git-send-email-loic.poulain@linaro.org
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
- Fix SCS debug check to report max stack usage in bytes as advertised
- Fix typo: CONFIG_FTRACE_WITH_REGS => CONFIG_DYNAMIC_FTRACE_WITH_REGS
- Fix incorrect mask in HiSilicon L3C perf PMU driver
- Fix compat vDSO compilation under some toolchain configurations
- Fix false UBSAN warning from ACPI IORT parsing code
- Fix booting under bootloaders that ignore TEXT_OFFSET
- Annotate debug initcall function with '__init'
-----BEGIN PGP SIGNATURE-----
iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl7iMe8QHHdpbGxAa2Vy
bmVsLm9yZwAKCRC3rHDchMFjNIp5B/46kdFZ1M8VSsGxtZMzLVZBR4MWzjx1wBD3
Zzvcg5x0aLAvg+VephmQ5cBiQE78/KKISUdTKndevJ9feVhzz8kxbOhLB88o14+L
Pk63p4jol8v7cJHiqcsBgSLR6MDAiY+4epsgeFA7WkO9cf529UIMO1ea2TCx0KbT
tKniZghX5I485Fu2RHtZGLGBxQXqFBcDJUok3/IoZnp2SDyUxrzHPViFL9fHHzCb
FNSEJijcoHfrIKiG4bPssKICmvbtcNysembDlJeyZ+5qJXqotty2M3OK+We7vPrg
Ne5O/tQoeCt4lLuW40yEmpQzodNLG8D+isC6cFvspmPXSyHflSCz
=EtmQ
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"arm64 fixes that came in during the merge window.
There will probably be more to come, but it doesn't seem like it's
worth me sitting on these in the meantime.
- Fix SCS debug check to report max stack usage in bytes as advertised
- Fix typo: CONFIG_FTRACE_WITH_REGS => CONFIG_DYNAMIC_FTRACE_WITH_REGS
- Fix incorrect mask in HiSilicon L3C perf PMU driver
- Fix compat vDSO compilation under some toolchain configurations
- Fix false UBSAN warning from ACPI IORT parsing code
- Fix booting under bootloaders that ignore TEXT_OFFSET
- Annotate debug initcall function with '__init'"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: warn on incorrect placement of the kernel by the bootloader
arm64: acpi: fix UBSAN warning
arm64: vdso32: add CONFIG_THUMB2_COMPAT_VDSO
drivers/perf: hisi: Fix wrong value for all counters enable
arm64: ftrace: Change CONFIG_FTRACE_WITH_REGS to CONFIG_DYNAMIC_FTRACE_WITH_REGS
arm64: debug: mark a function as __init to save some memory
scs: Report SCS usage in bytes rather than number of entries
In L3C uncore PMU drivers, bit16 is used to control all counters enable &
disable. Wrong value is given in the driver and its default value is 1'b1,
it can work because each PMU counter has its own control bits too.
Let's fix the wrong value.
Fixes: 2940bc4333 ("perf: hisi: Add support for HiSilicon SoC L3C PMU driver")
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/1591350221-32275-1-git-send-email-zhangshaokun@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
- Branch Target Identification (BTI)
* Support for ARMv8.5-BTI in both user- and kernel-space. This
allows branch targets to limit the types of branch from which
they can be called and additionally prevents branching to
arbitrary code, although kernel support requires a very recent
toolchain.
* Function annotation via SYM_FUNC_START() so that assembly
functions are wrapped with the relevant "landing pad"
instructions.
* BPF and vDSO updates to use the new instructions.
* Addition of a new HWCAP and exposure of BTI capability to
userspace via ID register emulation, along with ELF loader
support for the BTI feature in .note.gnu.property.
* Non-critical fixes to CFI unwind annotations in the sigreturn
trampoline.
- Shadow Call Stack (SCS)
* Support for Clang's Shadow Call Stack feature, which reserves
platform register x18 to point at a separate stack for each
task that holds only return addresses. This protects function
return control flow from buffer overruns on the main stack.
* Save/restore of x18 across problematic boundaries (user-mode,
hypervisor, EFI, suspend, etc).
* Core support for SCS, should other architectures want to use it
too.
* SCS overflow checking on context-switch as part of the existing
stack limit check if CONFIG_SCHED_STACK_END_CHECK=y.
- CPU feature detection
* Removed numerous "SANITY CHECK" errors when running on a system
with mismatched AArch32 support at EL1. This is primarily a
concern for KVM, which disabled support for 32-bit guests on
such a system.
* Addition of new ID registers and fields as the architecture has
been extended.
- Perf and PMU drivers
* Minor fixes and cleanups to system PMU drivers.
- Hardware errata
* Unify KVM workarounds for VHE and nVHE configurations.
* Sort vendor errata entries in Kconfig.
- Secure Monitor Call Calling Convention (SMCCC)
* Update to the latest specification from Arm (v1.2).
* Allow PSCI code to query the SMCCC version.
- Software Delegated Exception Interface (SDEI)
* Unexport a bunch of unused symbols.
* Minor fixes to handling of firmware data.
- Pointer authentication
* Add support for dumping the kernel PAC mask in vmcoreinfo so
that the stack can be unwound by tools such as kdump.
* Simplification of key initialisation during CPU bringup.
- BPF backend
* Improve immediate generation for logical and add/sub
instructions.
- vDSO
- Minor fixes to the linker flags for consistency with other
architectures and support for LLVM's unwinder.
- Clean up logic to initialise and map the vDSO into userspace.
- ACPI
- Work around for an ambiguity in the IORT specification relating
to the "num_ids" field.
- Support _DMA method for all named components rather than only
PCIe root complexes.
- Minor other IORT-related fixes.
- Miscellaneous
* Initialise debug traps early for KGDB and fix KDB cacheflushing
deadlock.
* Minor tweaks to early boot state (documentation update, set
TEXT_OFFSET to 0x0, increase alignment of PE/COFF sections).
* Refactoring and cleanup
-----BEGIN PGP SIGNATURE-----
iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl7U9csQHHdpbGxAa2Vy
bmVsLm9yZwAKCRC3rHDchMFjNLBHCACs/YU4SM7Om5f+7QnxIKao5DBr2CnGGvdC
yTfDghFDTLQVv3MufLlfno3yBe5G8sQpcZfcc+hewfcGoMzVZXu8s7LzH6VSn9T9
jmT3KjDMrg0RjSHzyumJp2McyelTk0a4FiKArSIIKsJSXUyb1uPSgm7SvKVDwEwU
JGDzL9IGilmq59GiXfDzGhTZgmC37QdwRoRxDuqtqWQe5CHoRXYexg87HwBKOQxx
HgU9L7ehri4MRZfpyjaDrr6quJo3TVnAAKXNBh3mZAskVS9ZrfKpEH0kYWYuqybv
znKyHRecl/rrGePV8RTMtrwnSdU26zMXE/omsVVauDfG9hqzqm+Q
=w3qi
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
"A sizeable pile of arm64 updates for 5.8.
Summary below, but the big two features are support for Branch Target
Identification and Clang's Shadow Call stack. The latter is currently
arm64-only, but the high-level parts are all in core code so it could
easily be adopted by other architectures pending toolchain support
Branch Target Identification (BTI):
- Support for ARMv8.5-BTI in both user- and kernel-space. This allows
branch targets to limit the types of branch from which they can be
called and additionally prevents branching to arbitrary code,
although kernel support requires a very recent toolchain.
- Function annotation via SYM_FUNC_START() so that assembly functions
are wrapped with the relevant "landing pad" instructions.
- BPF and vDSO updates to use the new instructions.
- Addition of a new HWCAP and exposure of BTI capability to userspace
via ID register emulation, along with ELF loader support for the
BTI feature in .note.gnu.property.
- Non-critical fixes to CFI unwind annotations in the sigreturn
trampoline.
Shadow Call Stack (SCS):
- Support for Clang's Shadow Call Stack feature, which reserves
platform register x18 to point at a separate stack for each task
that holds only return addresses. This protects function return
control flow from buffer overruns on the main stack.
- Save/restore of x18 across problematic boundaries (user-mode,
hypervisor, EFI, suspend, etc).
- Core support for SCS, should other architectures want to use it
too.
- SCS overflow checking on context-switch as part of the existing
stack limit check if CONFIG_SCHED_STACK_END_CHECK=y.
CPU feature detection:
- Removed numerous "SANITY CHECK" errors when running on a system
with mismatched AArch32 support at EL1. This is primarily a concern
for KVM, which disabled support for 32-bit guests on such a system.
- Addition of new ID registers and fields as the architecture has
been extended.
Perf and PMU drivers:
- Minor fixes and cleanups to system PMU drivers.
Hardware errata:
- Unify KVM workarounds for VHE and nVHE configurations.
- Sort vendor errata entries in Kconfig.
Secure Monitor Call Calling Convention (SMCCC):
- Update to the latest specification from Arm (v1.2).
- Allow PSCI code to query the SMCCC version.
Software Delegated Exception Interface (SDEI):
- Unexport a bunch of unused symbols.
- Minor fixes to handling of firmware data.
Pointer authentication:
- Add support for dumping the kernel PAC mask in vmcoreinfo so that
the stack can be unwound by tools such as kdump.
- Simplification of key initialisation during CPU bringup.
BPF backend:
- Improve immediate generation for logical and add/sub instructions.
vDSO:
- Minor fixes to the linker flags for consistency with other
architectures and support for LLVM's unwinder.
- Clean up logic to initialise and map the vDSO into userspace.
ACPI:
- Work around for an ambiguity in the IORT specification relating to
the "num_ids" field.
- Support _DMA method for all named components rather than only PCIe
root complexes.
- Minor other IORT-related fixes.
Miscellaneous:
- Initialise debug traps early for KGDB and fix KDB cacheflushing
deadlock.
- Minor tweaks to early boot state (documentation update, set
TEXT_OFFSET to 0x0, increase alignment of PE/COFF sections).
- Refactoring and cleanup"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (148 commits)
KVM: arm64: Move __load_guest_stage2 to kvm_mmu.h
KVM: arm64: Check advertised Stage-2 page size capability
arm64/cpufeature: Add get_arm64_ftr_reg_nowarn()
ACPI/IORT: Remove the unused __get_pci_rid()
arm64/cpuinfo: Add ID_MMFR4_EL1 into the cpuinfo_arm64 context
arm64/cpufeature: Add remaining feature bits in ID_AA64PFR1 register
arm64/cpufeature: Add remaining feature bits in ID_AA64PFR0 register
arm64/cpufeature: Add remaining feature bits in ID_AA64ISAR0 register
arm64/cpufeature: Add remaining feature bits in ID_MMFR4 register
arm64/cpufeature: Add remaining feature bits in ID_PFR0 register
arm64/cpufeature: Introduce ID_MMFR5 CPU register
arm64/cpufeature: Introduce ID_DFR1 CPU register
arm64/cpufeature: Introduce ID_PFR2 CPU register
arm64/cpufeature: Make doublelock a signed feature in ID_AA64DFR0
arm64/cpufeature: Drop TraceFilt feature exposure from ID_DFR0 register
arm64/cpufeature: Add explicit ftr_id_isar0[] for ID_ISAR0 register
arm64: mm: Add asid_gen_match() helper
firmware: smccc: Fix missing prototype warning for arm_smccc_version_init
arm64: vdso: Fix CFI directives in sigreturn trampoline
arm64: vdso: Don't prefix sigreturn trampoline with a BTI C instruction
...
Currently when trying to remove the SMMUv3 PMU module we get a
WARN_ON_ONCE from free_irq(), because the affinity hint set during probe
hasn't been properly cleared.
[ 238.878383] WARNING: CPU: 0 PID: 175 at kernel/irq/manage.c:1744 free_irq+0x324/0x358
...
[ 238.897263] Call trace:
[ 238.897998] free_irq+0x324/0x358
[ 238.898792] devm_irq_release+0x18/0x28
[ 238.899189] release_nodes+0x1b0/0x228
[ 238.899984] devres_release_all+0x38/0x60
[ 238.900779] device_release_driver_internal+0x10c/0x1d0
[ 238.901574] driver_detach+0x50/0xe0
[ 238.902368] bus_remove_driver+0x5c/0xd8
[ 238.903448] driver_unregister+0x30/0x60
[ 238.903958] platform_driver_unregister+0x14/0x20
[ 238.905075] arm_smmu_pmu_exit+0x1c/0xecc [arm_smmuv3_pmu]
[ 238.905547] __arm64_sys_delete_module+0x14c/0x260
[ 238.906342] el0_svc_common.constprop.0+0x74/0x178
[ 238.907355] do_el0_svc+0x24/0x90
[ 238.907932] el0_sync_handler+0x11c/0x198
[ 238.908979] el0_sync+0x158/0x180
Just like the other perf drivers, clear the affinity hint before
releasing the device.
Fixes: 7d839b4b9e ("perf/smmuv3: Add arm64 smmuv3 pmu driver")
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Link: https://lore.kernel.org/r/20200422084805.237738-1-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
This patch lets HiSilicon uncore PMU driver can be built as modules.
A common module and three specific uncore PMU driver modules will be built.
Export necessary functions in hisi_uncore_pmu module, and change
irq_set_affinity to irq_set_affinity_hint to pass compile.
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
Tested-by: Qi Liu <liuqi115@huawei.com>
Reviewed-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/1588820305-174479-1-git-send-email-wangzhou1@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
Open access to monitoring for CAP_PERFMON privileged process. Providing
the access under CAP_PERFMON capability singly, without the rest of
CAP_SYS_ADMIN credentials, excludes chances to misuse the credentials
and makes operation more secure.
CAP_PERFMON implements the principle of least privilege for performance
monitoring and observability operations (POSIX IEEE 1003.1e 2.2.2.39
principle of least privilege: A security design principle that states
that a process or program be granted only those privileges (e.g.,
capabilities) necessary to accomplish its legitimate function, and only
for the time that such privileges are actually required)
For backward compatibility reasons access to the monitoring remains open
for CAP_SYS_ADMIN privileged processes but CAP_SYS_ADMIN usage for
secure monitoring is discouraged with respect to CAP_PERFMON capability.
Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Reviewed-by: James Morris <jamorris@linux.microsoft.com>
Acked-by: Will Deacon <will@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Igor Lubashev <ilubashe@akamai.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: intel-gfx@lists.freedesktop.org
Cc: linux-doc@vger.kernel.org
Cc: linux-man@vger.kernel.org
Cc: linux-security-module@vger.kernel.org
Cc: selinux@vger.kernel.org
Link: http://lore.kernel.org/lkml/4ec1d6f7-548c-8d1c-f84a-cebeb9674e4e@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
- In-kernel Pointer Authentication support (previously only offered to
user space).
- ARM Activity Monitors (AMU) extension support allowing better CPU
utilisation numbers for the scheduler (frequency invariance).
- Memory hot-remove support for arm64.
- Lots of asm annotations (SYM_*) in preparation for the in-kernel
Branch Target Identification (BTI) support.
- arm64 perf updates: ARMv8.5-PMU 64-bit counters, refactoring the PMU
init callbacks, support for new DT compatibles.
- IPv6 header checksum optimisation.
- Fixes: SDEI (software delegated exception interface) double-lock on
hibernate with shared events.
- Minor clean-ups and refactoring: cpu_ops accessor, cpu_do_switch_mm()
converted to C, cpufeature finalisation helper.
- sys_mremap() comment explaining the asymmetric address untagging
behaviour.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAl6DVyIACgkQa9axLQDI
XvHkqRAAiZA2EYKiQL4M1DJ1cNTADjT7xKX9+UtYBXj7GMVhgVWdunpHVE6qtfgk
cT6avmKrS/6PDqizJgr+Z1yX8x3Kvs57G4BvmIUKIw97mkdewvFQ9JKv6VA1vb86
7Qrl1WzqsGg5Kj9uUfI4h+ZoT1H4C/9PQeFxJwgZRtF9DxRh8O7VeZI+JCu8Aub2
lIkjI8rh+EpTsGT9h/PMGWUcawnKQloZ1/F+GfMAuYBvIv2RNN2xVreJtTmm4NyJ
VcpL0KCNyAI2lGdaJg5nBLRDyGuXDm5i+PLsCSXMquI4fie00txXeD8sjbeuO0ks
YTJ0EhmUUhbSE17go+SxYiEFE0v09i+lD5ud+B4Vmojp0KTczTta9VSgURlbb2/9
n9biq5G3PPDNIrZqiTT2Tf4AMz1350nkbzL2gzKecM5aIzR/u3y5yII5CgfZtFnj
7bGbyFpFpcqI7UaISPsNCxmknbTt/7ff0WM3+7SbecxI3AD2mnxsOdN9JTLyhDp+
owjyiaWxl5zMWF9DhplLG/9BKpNWSxh3skazdOdELd8GTq2MbJlXrVG2XgXTAOh3
y1s6RQrfw8zXh8TSqdmmzauComXIRWTum/sbVB3U8Z3AUsIeq/NTSbN5X9JyIbOP
HOabhlVhhkI6omN1grqPX4jwUiZLZoNfn7Ez4q71549KVK/uBtA=
=LJVX
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
"The bulk is in-kernel pointer authentication, activity monitors and
lots of asm symbol annotations. I also queued the sys_mremap() patch
commenting the asymmetry in the address untagging.
Summary:
- In-kernel Pointer Authentication support (previously only offered
to user space).
- ARM Activity Monitors (AMU) extension support allowing better CPU
utilisation numbers for the scheduler (frequency invariance).
- Memory hot-remove support for arm64.
- Lots of asm annotations (SYM_*) in preparation for the in-kernel
Branch Target Identification (BTI) support.
- arm64 perf updates: ARMv8.5-PMU 64-bit counters, refactoring the
PMU init callbacks, support for new DT compatibles.
- IPv6 header checksum optimisation.
- Fixes: SDEI (software delegated exception interface) double-lock on
hibernate with shared events.
- Minor clean-ups and refactoring: cpu_ops accessor,
cpu_do_switch_mm() converted to C, cpufeature finalisation helper.
- sys_mremap() comment explaining the asymmetric address untagging
behaviour"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (81 commits)
mm/mremap: Add comment explaining the untagging behaviour of mremap()
arm64: head: Convert install_el2_stub to SYM_INNER_LABEL
arm64: Introduce get_cpu_ops() helper function
arm64: Rename cpu_read_ops() to init_cpu_ops()
arm64: Declare ACPI parking protocol CPU operation if needed
arm64: move kimage_vaddr to .rodata
arm64: use mov_q instead of literal ldr
arm64: Kconfig: verify binutils support for ARM64_PTR_AUTH
lkdtm: arm64: test kernel pointer authentication
arm64: compile the kernel with ptrauth return address signing
kconfig: Add support for 'as-option'
arm64: suspend: restore the kernel ptrauth keys
arm64: __show_regs: strip PAC from lr in printk
arm64: unwind: strip PAC from kernel addresses
arm64: mask PAC bits of __builtin_return_address
arm64: initialize ptrauth keys for kernel booting task
arm64: initialize and switch ptrauth kernel keys
arm64: enable ptrauth earlier
arm64: cpufeature: handle conflicts based on capability
arm64: cpufeature: Move cpu capability helpers inside C file
...
snprintf() is a hard-to-use function, it's especially difficult to use
it for concatenating substrings in a buffer with a limited size.
Since snprintf() returns the would-be-output size, not the actual
size, the subsequent use of snprintf() may point to the incorrect
position easily. Although the current code doesn't actually overflow
the buffer, it's an incorrect usage.
This patch replaces such snprintf() calls with a safer version,
scnprintf().
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Will Deacon <will@kernel.org>
Fix bogus NULL checks on the return value of acpi_cpu_get_madt_gicc()
by checking for a 0 'gicc->performance_interrupt' value instead.
Signed-off-by: Liguang Zhang <zhangliguang@linux.alibaba.com>
Signed-off-by: Will Deacon <will@kernel.org>
When disabling a counter from ddr_perf_event_stop(), the counter value
is reset to 0 at the same time.
Preserve the counter value by performing a read-modify-write of the
PMU register and clearing only the enable bit.
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: Will Deacon <will@kernel.org>
We already check that the 'nr_pages' is > 2, so there's no need to check
that it's != 0 later on.
Signed-off-by: Liguang Zhang <zhangliguang@linux.alibaba.com>
Signed-off-by: Will Deacon <will@kernel.org>
Even though a SMMUv3 PMCG implementation may use an MSI as the form of
interrupt source, the kernel would still complain that it does not find
the wired (GSIV) interrupt in this case:
root@(none)$ dmesg | grep arm-smmu-v3-pmcg | grep "not found"
[ 59.237219] arm-smmu-v3-pmcg arm-smmu-v3-pmcg.8.auto: IRQ index 0 not found
[ 59.322841] arm-smmu-v3-pmcg arm-smmu-v3-pmcg.9.auto: IRQ index 0 not found
[ 59.422155] arm-smmu-v3-pmcg arm-smmu-v3-pmcg.10.auto: IRQ index 0 not found
[ 59.539014] arm-smmu-v3-pmcg arm-smmu-v3-pmcg.11.auto: IRQ index 0 not found
[ 59.640329] arm-smmu-v3-pmcg arm-smmu-v3-pmcg.12.auto: IRQ index 0 not found
[ 59.743112] arm-smmu-v3-pmcg arm-smmu-v3-pmcg.13.auto: IRQ index 0 not found
[ 59.880577] arm-smmu-v3-pmcg arm-smmu-v3-pmcg.14.auto: IRQ index 0 not found
[ 60.017528] arm-smmu-v3-pmcg arm-smmu-v3-pmcg.15.auto: IRQ index 0 not found
Use platform_get_irq_optional() to silence the warning.
If neither interrupt source is found, then the driver will still warn that
IRQ setup errored and the probe will fail.
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Will Deacon <will@kernel.org>
This driver allocates a dynamic cpu hotplug state but never releases it.
If reloaded in a loop it will quickly trigger a WARN message:
"No more dynamic states available for CPU hotplug"
Fix by calling cpuhp_remove_multi_state on remove like several other
perf pmu drivers.
Also fix the cleanup logic on probe error paths: add the missing
cpuhp_remove_multi_state call and properly check the return value from
cpuhp_state_add_instant_nocalls.
Fixes: 9a66d36cc7 ("drivers/perf: imx_ddr: Add DDR performance counter support to perf")
Acked-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
Signed-off-by: Will Deacon <will@kernel.org>
hisi_read_sccl_and_ccl_id is not readable and its comment is a little
confused, so simplify the function and its comment as Mark's suggestion.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Will Deacon <will@kernel.org>
In smmu_pmu_probe(), there is put_cpu() in the error path,
which is wrong because we use raw_smp_processor_id() to
get the cpu ID, not get_cpu(), remove it.
While we are at it, kill 'out_cpuhp_err' altogether and
just return err if we fail to add the hotplug instance.
Acked-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
For some HiSilicon platform, the originally designed SCCL_ID and CCL_ID
are not satisfied with much rich topology when the MT is set, so we
extend the SCCL_ID to MPIDR[aff3] and CCL_ID to MPIDR[aff2]. Let's
update this for HiSilicon uncore PMU driver.
Cc: John Garry <john.garry@huawei.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Will Deacon <will@kernel.org>
caps/filter indicates whether HW supports AXI ID filter or not.
caps/enhanced_filter indicates whether HW supports enhanced AXI ID filter
or not.
Users can check filter features from userspace with these attributions.
Suggested-by: Will Deacon <will@kernel.org>
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
[will: reworked cap switch to be less error-prone]
Signed-off-by: Will Deacon <will@kernel.org>
With DDR_CAP_AXI_ID_FILTER quirk, indicating HW supports AXI ID filter
which only can get bursts from DDR transaction, i.e. DDR read/write
requests.
This patch add DDR_CAP_AXI_ID_ENHANCED_FILTER quirk, indicating HW
supports AXI ID filter which can get bursts and bytes from DDR
transaction at the same time. We hope PMU always return bytes in the
driver due to it is more meaningful for users.
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: Will Deacon <will@kernel.org>
CCPI2 is a low-latency high-bandwidth serial interface for inter socket
connectivity of ThunderX2 processors.
CCPI2 PMU supports up to 8 counters per socket. Counters are
independently programmable to different events and can be started and
stopped individually. The CCPI2 counters are 64-bit and do not overflow
in normal operation.
Signed-off-by: Ganapatrao Prabhakerrao Kulkarni <gkulkarni@marvell.com>
Signed-off-by: Will Deacon <will@kernel.org>
Add compatible string for the ARM CCN-512 interconnect
Acked-by: Pawel Moll <pawel.moll@arm.com>
Signed-off-by: Marek Bykowski <marek.bykowski@gmail.com>
Signed-off-by: Boleslaw Malecki <boleslaw.malecki@tieto.com>
Signed-off-by: Will Deacon <will@kernel.org>
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Will Deacon <will@kernel.org>
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Will Deacon <will@kernel.org>
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Will Deacon <will@kernel.org>
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Will Deacon <will@kernel.org>
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Will Deacon <will@kernel.org>
* for-next/52-bit-kva: (25 commits)
Support for 52-bit virtual addressing in kernel space
* for-next/cpu-topology: (9 commits)
Move CPU topology parsing into core code and add support for ACPI 6.3
* for-next/error-injection: (2 commits)
Support for function error injection via kprobes
* for-next/perf: (8 commits)
Support for i.MX8 DDR PMU and proper SMMUv3 group validation
* for-next/psci-cpuidle: (7 commits)
Move PSCI idle code into a new CPUidle driver
* for-next/rng: (4 commits)
Support for 'rng-seed' property being passed in the devicetree
* for-next/smpboot: (3 commits)
Reduce fragility of secondary CPU bringup in debug configurations
* for-next/tbi: (10 commits)
Introduce new syscall ABI with relaxed requirements for pointer tags
* for-next/tlbi: (6 commits)
Handle spurious page faults arising from kernel space
AXI filtering is used by events 0x41 and 0x42 to count reads or writes
with an ARID or AWID matching a specified filter. The filter is exposed
to userspace as an (ID, MASK) pair, where each set bit in the mask
causes the corresponding bit in the ID to be ignored when matching
against the ID of memory transactions for the purposes of incrementing
the counter.
For example:
# perf stat -a -e imx8_ddr0/axid-read,axi_mask=0xff,axi_id=0x800/ cmd
will count all read transactions from AXI IDs 0x800 - 0x8ff. If the
'axi_mask' is omitted, then it is treated as 0x0 which means that the
'axi_id' will be matched exactly.
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: Will Deacon <will@kernel.org>
With global filtering, it becomes possible for users to construct
self-contradictory groups with conflicting filters. Make sure we
cover that when initially validating events.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Ensure that a group will actually fit into the available counters.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
We don't need dev_err() messages when platform_get_irq() fails now that
platform_get_irq() prints an error message itself when something goes
wrong. Let's remove these prints with a simple semantic patch.
// <smpl>
@@
expression ret;
struct platform_device *E;
@@
ret =
(
platform_get_irq(E, ...)
|
platform_get_irq_byname(E, ...)
);
if ( \( ret < 0 \| ret <= 0 \) )
{
(
-if (ret != -EPROBE_DEFER)
-{ ...
-dev_err(...);
-... }
|
...
-dev_err(...);
)
...
}
// </smpl>
While we're here, remove braces on if statements that only have one
statement (manually).
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Will Deacon <will@kernel.org>
This is required for automatic probing when driver is built as a module.
Fixes: 9a66d36cc7 ("drivers/perf: imx_ddr: Add DDR performance counter support to perf")
Acked-by: Frank Li <frank.li@nxp.com>
Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
Signed-off-by: Will Deacon <will@kernel.org>
Handling of the CPU_PM_ENTER_FAILED transition in the Arm PMU PM
notifier code incorrectly skips restoration of the counters. Fix the
logic so that CPU_PM_ENTER_FAILED follows the same path as CPU_PM_EXIT.
Cc: <stable@vger.kernel.org>
Fixes: da4e4f18af ("drivers/perf: arm_pmu: implement CPU_PM notifier")
Reported-by: Anders Roxell <anders.roxell@linaro.org>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
The perf infrastructure is used for userspace to track issues.
At least a good part of what's described here is related to
it.
So, add it to the admin-guide.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Rename the perf documentation files to ReST, add an
index for them and adjust in order to produce a nice html
output via the Sphinx build system.
At its new index.rst, let's add a :orphan: while this is not linked to
the main index.rst file, in order to avoid build warnings.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
- arm64 support for syscall emulation via PTRACE_SYSEMU{,_SINGLESTEP}
- Wire up VM_FLUSH_RESET_PERMS for arm64, allowing the core code to
manage the permissions of executable vmalloc regions more strictly
- Slight performance improvement by keeping softirqs enabled while
touching the FPSIMD/SVE state (kernel_neon_begin/end)
- Expose a couple of ARMv8.5 features to user (HWCAP): CondM (new XAFLAG
and AXFLAG instructions for floating point comparison flags
manipulation) and FRINT (rounding floating point numbers to integers)
- Re-instate ARM64_PSEUDO_NMI support which was previously marked as
BROKEN due to some bugs (now fixed)
- Improve parking of stopped CPUs and implement an arm64-specific
panic_smp_self_stop() to avoid warning on not being able to stop
secondary CPUs during panic
- perf: enable the ARM Statistical Profiling Extensions (SPE) on ACPI
platforms
- perf: DDR performance monitor support for iMX8QXP
- cache_line_size() can now be set from DT or ACPI/PPTT if provided to
cope with a system cache info not exposed via the CPUID registers
- Avoid warning on hardware cache line size greater than
ARCH_DMA_MINALIGN if the system is fully coherent
- arm64 do_page_fault() and hugetlb cleanups
- Refactor set_pte_at() to avoid redundant READ_ONCE(*ptep)
- Ignore ACPI 5.1 FADTs reported as 5.0 (infer from the 'arm_boot_flags'
introduced in 5.1)
- CONFIG_RANDOMIZE_BASE now enabled in defconfig
- Allow the selection of ARM64_MODULE_PLTS, currently only done via
RANDOMIZE_BASE (and an erratum workaround), allowing modules to spill
over into the vmalloc area
- Make ZONE_DMA32 configurable
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAl0eHqcACgkQa9axLQDI
XvFyNA/+L+bnkz8m3ncydlqqfXomQn4eJJVQ8Uksb0knJz+1+3CUxxbO4ry4jXZN
fMkbggYrDPRKpDbsUl0lsRipj7jW9bqan+N37c3SWqCkgb6HqDaHViwxdx6Ec/Uk
gHudozDSPh/8c7hxGcSyt/CFyuW6b+8eYIQU5rtIgz8aVY2BypBvS/7YtYCbIkx0
w4CFleRTK1zXD5mJQhrc6jyDx659sVkrAvdhf6YIymOY8nBTv40vwdNo3beJMYp8
Po/+0Ixu+VkHUNtmYYZQgP/AGH96xiTcRnUqd172JdtRPpCLqnLqwFokXeVIlUKT
KZFMDPzK+756Ayn4z4huEePPAOGlHbJje8JVNnFyreKhVVcCotW7YPY/oJR10bnc
eo7yD+DxABTn+93G2yP436bNVa8qO1UqjOBfInWBtnNFJfANIkZweij/MQ6MjaTA
o7KtviHnZFClefMPoiI7HDzwL8XSmsBDbeQ04s2Wxku1Y2xUHLx4iLmadwLQ1ZPb
lZMTZP3N/T1554MoURVA1afCjAwiqU3bt1xDUGjbBVjLfSPBAn/25IacsG9Li9AF
7Rp1M9VhrfLftjFFkB2HwpbhRASOxaOSx+EI3kzEfCtM2O9I1WHgP3rvCdc3l0HU
tbK0/IggQicNgz7GSZ8xDlWPwwSadXYGLys+xlMZEYd3pDIOiFc=
=0TDT
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
- arm64 support for syscall emulation via PTRACE_SYSEMU{,_SINGLESTEP}
- Wire up VM_FLUSH_RESET_PERMS for arm64, allowing the core code to
manage the permissions of executable vmalloc regions more strictly
- Slight performance improvement by keeping softirqs enabled while
touching the FPSIMD/SVE state (kernel_neon_begin/end)
- Expose a couple of ARMv8.5 features to user (HWCAP): CondM (new
XAFLAG and AXFLAG instructions for floating point comparison flags
manipulation) and FRINT (rounding floating point numbers to integers)
- Re-instate ARM64_PSEUDO_NMI support which was previously marked as
BROKEN due to some bugs (now fixed)
- Improve parking of stopped CPUs and implement an arm64-specific
panic_smp_self_stop() to avoid warning on not being able to stop
secondary CPUs during panic
- perf: enable the ARM Statistical Profiling Extensions (SPE) on ACPI
platforms
- perf: DDR performance monitor support for iMX8QXP
- cache_line_size() can now be set from DT or ACPI/PPTT if provided to
cope with a system cache info not exposed via the CPUID registers
- Avoid warning on hardware cache line size greater than
ARCH_DMA_MINALIGN if the system is fully coherent
- arm64 do_page_fault() and hugetlb cleanups
- Refactor set_pte_at() to avoid redundant READ_ONCE(*ptep)
- Ignore ACPI 5.1 FADTs reported as 5.0 (infer from the
'arm_boot_flags' introduced in 5.1)
- CONFIG_RANDOMIZE_BASE now enabled in defconfig
- Allow the selection of ARM64_MODULE_PLTS, currently only done via
RANDOMIZE_BASE (and an erratum workaround), allowing modules to spill
over into the vmalloc area
- Make ZONE_DMA32 configurable
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (54 commits)
perf: arm_spe: Enable ACPI/Platform automatic module loading
arm_pmu: acpi: spe: Add initial MADT/SPE probing
ACPI/PPTT: Add function to return ACPI 6.3 Identical tokens
ACPI/PPTT: Modify node flag detection to find last IDENTICAL
x86/entry: Simplify _TIF_SYSCALL_EMU handling
arm64: rename dump_instr as dump_kernel_instr
arm64/mm: Drop [PTE|PMD]_TYPE_FAULT
arm64: Implement panic_smp_self_stop()
arm64: Improve parking of stopped CPUs
arm64: Expose FRINT capabilities to userspace
arm64: Expose ARMv8.5 CondM capability to userspace
arm64: defconfig: enable CONFIG_RANDOMIZE_BASE
arm64: ARM64_MODULES_PLTS must depend on MODULES
arm64: bpf: do not allocate executable memory
arm64/kprobes: set VM_FLUSH_RESET_PERMS on kprobe instruction pages
arm64/mm: wire up CONFIG_ARCH_HAS_SET_DIRECT_MAP
arm64: module: create module allocations without exec permissions
arm64: Allow user selection of ARM64_MODULE_PLTS
acpi/arm64: ignore 5.1 FADTs that are reported as 5.0
arm64: Allow selecting Pseudo-NMI again
...
Lets add the MODULE_TABLE and platform id_table entries so that
the SPE driver can attach to the ACPI platform device created by
the core pmu code.
Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
ACPI 6.3 adds additional fields to the MADT GICC
structure to describe SPE PPI's. We pick these out
of the cached reference to the madt_gicc structure
similarly to the core PMU code. We then create a platform
device referring to the IRQ and let the user/module loader
decide whether to load the SPE driver.
Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Based on 2 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation #
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 4122 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Enrico Weigelt <info@metux.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Based on 1 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation this program is
distributed in the hope that it will be useful but without any
warranty without even the implied warranty of merchantability or
fitness for a particular purpose see the gnu general public license
for more details you should have received a copy of the gnu general
public license along with this program if not see http www gnu org
licenses
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 503 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Enrico Weigelt <info@metux.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190602204653.811534538@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add DDR performance monitor support for iMX8QXP. The PMU consists of 3
programmable event counters and a single dedicated cycle counter.
Example usage:
$ perf stat -a -e \
imx8_ddr0/read-cycles/,imx8_ddr0/write-cycles/,imx8_ddr0/precharge/ ls
- or -
$ perf stat -a -e \
imx8_ddr0/cycles/,imx8_ddr0/read-access/,imx8_ddr0/write-access/ ls
Other events are supported, and advertised via perf list.
Reviewed-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
[will: rewrote commit message/kconfig and used #defines for dev/cpuhp names]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Based on 1 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 and
only version 2 as published by the free software foundation this
program is distributed in the hope that it will be useful but
without any warranty without even the implied warranty of
merchantability or fitness for a particular purpose see the gnu
general public license for more details
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 294 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190529141900.825281744@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Based on 1 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation this program is
distributed in the hope that it will be useful but without any
warranty without even the implied warranty of merchantability or
fitness for a particular purpose see the gnu general public license
for more details
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 655 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Richard Fontana <rfontana@redhat.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190527070034.575739538@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
- Fix SPE probe failure when backing auxbuf with high-order pages
- Fix handling of DMA allocations from outside of the vmalloc area
- Fix generation of build-id ELF section for vDSO object
- Disable huge I/O mappings if kernel page table dumping is enabled
- A few other minor fixes (comments, kconfig etc)
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAlzlRT0ACgkQt6xw3ITB
YzRGOwgArUDryBedDdkxAvjx7fk8O+qjtWctAhdPtyuXIvVLOc3tpiKlayCguF/a
clqr4qAfxswoDLHRMwhh7xdv955A2vraHQWlzvGUj2O2M4mG8RdbVJLm3NxpA09m
dufjSuFcwxcou2c4rXbSXSB4AYJXPmQJiad04VsWj68+TVehy0P45zaPcjHsPNPI
D9sTa9XhBlNa0qpJG7tP9T8FS/QP/hpWHn8v0z/DQ4QetKRTstkpwD5kmJox8WmM
Bw593bvQQ2+5q9g+z0FM3M/7yHwTJw2RLnnIb29YsW8MxM3rUeqt+FMA2OALBgbi
0m7WoTZwO9hDQuPU1DDvZUtw3iOpeg==
=buiS
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
- Fix SPE probe failure when backing auxbuf with high-order pages
- Fix handling of DMA allocations from outside of the vmalloc area
- Fix generation of build-id ELF section for vDSO object
- Disable huge I/O mappings if kernel page table dumping is enabled
- A few other minor fixes (comments, kconfig etc)
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: vdso: Explicitly add build-id option
arm64/mm: Inhibit huge-vmap with ptdump
arm64: Print physical address of page table base in show_pte()
arm64: don't trash config with compat symbol if COMPAT is disabled
arm64: assembler: Update comment above cond_yield_neon() macro
drivers/perf: arm_spe: Don't error on high-order pages for aux buf
arm64/iommu: handle non-remapped addresses in ->mmap and ->get_sgtable
Based on 2 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license as published by
the free software foundation either version 2 of the license or at
your option any later version this program is distributed in the
hope that it will be useful but without any warranty without even
the implied warranty of merchantability or fitness for a particular
purpose see the gnu general public license for more details you
should have received a copy of the gnu general public license along
with this program if not see http www gnu org licenses
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license as published by
the free software foundation either version 2 of the license or at
your option any later version this program is distributed in the
hope that it will be useful but without any warranty without even
the implied warranty of merchantability or fitness for a particular
purpose see the gnu general public license for more details [based]
[from] [clk] [highbank] [c] you should have received a copy of the
gnu general public license along with this program if not see http
www gnu org licenses
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-or-later
has been chosen to replace the boilerplate/reference in 355 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Jilayne Lovejoy <opensource@jilayne.com>
Reviewed-by: Steve Winslow <swinslow@gmail.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190519154041.837383322@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add SPDX license identifiers to all Make/Kconfig files which:
- Have no license information of any form
These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:
GPL-2.0-only
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add SPDX license identifiers to all files which:
- Have no license information of any form
- Have EXPORT_.*_SYMBOL_GPL inside which was used in the
initial scan/conversion to ignore the file
These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:
GPL-2.0-only
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since commit 5768402fd9 ("perf/ring_buffer: Use high order allocations
for AUX buffers optimistically"), the perf core tends to back aux buffer
allocations with high-order pages with the order encoded in the
PagePrivate data. The Arm SPE driver explicitly rejects such pages,
causing the perf tool to fail with:
| failed to mmap with 12 (Cannot allocate memory)
In actual fact, we can simply treat these pages just like any other
since the perf core takes care to populate the page array appropriately.
In theory we could try to map with PMDs where possible, but for now,
let's just get things working again.
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Fixes: 5768402fd9 ("perf/ring_buffer: Use high order allocations for AUX buffers optimistically")
Reported-by: Hanjun Guo <guohanjun@huawei.com>
Tested-by: Hanjun Guo <guohanjun@huawei.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Like arm-cci, arm-ccn has the same issue of disabling preemption around
operations which can take mutexes. Again, remove the definite bug by
simply not trying to fight the theoretical races. And since we are
touching the hotplug handling code, take the opportunity to streamline
it, as there's really no need to store a full-sized cpumask to keep
track of a single CPU ID.
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Uncore PMU drivers face an awkward cyclic dependency wherein:
- They have to pick a valid online CPU to associate with before
registering the PMU device, since it will get exposed to userspace
immediately.
- The PMU registration has to be be at least partly complete before
hotplug events can be handled, since trying to migrate an
uninitialised context would be bad.
- The hotplug handler has to be ready as soon as a CPU is chosen, lest
it go offline without the user-visible cpumask value getting updated.
The arm-cci driver has tried to solve this by using get_cpu() to pick
the current CPU and prevent it from disappearing while both
registrations are performed, but that results in taking mutexes with
preemption disabled, which makes certain configurations very unhappy:
[ 1.983337] BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:2004
[ 1.983340] in_atomic(): 1, irqs_disabled(): 0, pid: 1, name: swapper/0
[ 1.983342] Preemption disabled at:
[ 1.983353] [<ffffff80089801f4>] cci_pmu_probe+0x1dc/0x488
[ 1.983360] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.20-rt8-yocto-preempt-rt #1
[ 1.983362] Hardware name: ZynqMP ZCU102 Rev1.0 (DT)
[ 1.983364] Call trace:
[ 1.983369] dump_backtrace+0x0/0x158
[ 1.983372] show_stack+0x24/0x30
[ 1.983378] dump_stack+0x80/0xa4
[ 1.983383] ___might_sleep+0x138/0x160
[ 1.983386] __might_sleep+0x58/0x90
[ 1.983391] __rt_mutex_lock_state+0x30/0xc0
[ 1.983395] _mutex_lock+0x24/0x30
[ 1.983400] perf_pmu_register+0x2c/0x388
[ 1.983404] cci_pmu_probe+0x2bc/0x488
[ 1.983409] platform_drv_probe+0x58/0xa8
It is not feasible to resolve all the possible races outside of the perf
core itself, so address the immediate bug by following the example of
nearly every other PMU driver and not even trying to do so. Registering
the hotplug notifier first should minimise the window in which things
can go wrong, so that's about as much as we can reasonably do here. This
also revealed an additional race in assigning the global pointer too
late relative to the hotplug notifier, which gets fixed in the process.
Reported-by: Li, Meng <Meng.Li@windriver.com>
Tested-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
HiSilicon erratum 162001800 describes the limitation of
SMMUv3 PMCG implementation on HiSilicon Hip08 platforms.
On these platforms, the PMCG event counter registers
(SMMU_PMCG_EVCNTRn) are read only and as a result it
is not possible to set the initial counter period value
on event monitor start.
To work around this, the current value of the counter
is read and used for delta calculations. OEM information
from ACPI header is used to identify the affected hardware
platforms.
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
[will: update silicon-errata.txt and add reason string to acpi match]
Signed-off-by: Will Deacon <will.deacon@arm.com>
This adds support for MSI-based counter overflow interrupt.
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Adds a new driver to support the SMMUv3 PMU and add it into the
perf events framework.
Each SMMU node may have multiple PMUs associated with it, each of
which may support different events.
SMMUv3 PMCG devices are named as smmuv3_pmcg_<phys_addr_page> where
<phys_addr_page> is the physical page address of the SMMU PMCG
wrapped to 4K boundary. For example, the PMCG at 0xff88840000 is
named smmuv3_pmcg_ff88840
Filtering by stream id is done by specifying filtering parameters
with the event. options are:
filter_enable - 0 = no filtering, 1 = filtering enabled
filter_span - 0 = exact match, 1 = pattern match
filter_stream_id - pattern to filter against
Example: perf stat -e smmuv3_pmcg_ff88840/transaction,filter_enable=1,
filter_span=1,filter_stream_id=0x42/ -a netperf
Applies filter pattern 0x42 to transaction events, which means events
matching stream ids 0x42 & 0x43 are counted as only upper StreamID
bits are required to match the given filter. Further filtering
information is available in the SMMU documentation.
SMMU events are not attributable to a CPU, so task mode and sampling
are not supported.
Signed-off-by: Neil Leeder <nleeder@codeaurora.org>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
[will: fold in review feedback from Robin]
[will: rewrite Kconfig text and allow building as a module]
Signed-off-by: Will Deacon <will.deacon@arm.com>
- Pseudo NMI support for arm64 using GICv3 interrupt priorities
- uaccess macros clean-up (unsafe user accessors also merged but
reverted, waiting for objtool support on arm64)
- ptrace regsets for Pointer Authentication (ARMv8.3) key management
- inX() ordering w.r.t. delay() on arm64 and riscv (acks in place by the
riscv maintainers)
- arm64/perf updates: PMU bindings converted to json-schema, unused
variable and misleading comment removed
- arm64/debug fixes to ensure checking of the triggering exception level
and to avoid the propagation of the UNKNOWN FAR value into the si_code
for debug signals
- Workaround for Fujitsu A64FX erratum 010001
- lib/raid6 ARM NEON optimisations
- NR_CPUS now defaults to 256 on arm64
- Minor clean-ups (documentation/comments, Kconfig warning, unused
asm-offsets, clang warnings)
- MAINTAINERS update for list information to the ARM64 ACPI entry
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAlyCl0cACgkQa9axLQDI
XvEyKxAAiogBZLbyhcy8bTUHVzVoJE0FyAkdO2wWnnaff2Ohkhy1Y/npv33IeK2q
RknxqDIx2DUUVPJNRZGoI/WwBtTZdKaAnW4rIKG84yC1eAkFcd96WQasaZzcp1qY
HmvbJiYXM0bh+0J7i3Wgry/QzOkrltJFJW2kp6Wd5aFE+R1WyWyxT6d+Fp0J3vlA
bT70jlpBK6LXEOmmBS+04Ml02+8MvaGxIl8EInBHSfDLRLErj5E8n41rRHKUiSWz
maWI+kVoLYwOE68xiZlDftUBEeQpUSWgg2nxeK+640QSl1wJmVcRcY9nm6TZeMG2
AiZTR9a7cP5rrdSN5suUmb7d4AMMVlVMisGDlwb+9oCxeTRDzg0uwACaVgHfPqQr
UeBdHbL9nStN7uBH23H8L9mKk+tqpFmk0sgzdrKejOwysAiqWV8aazb/Na3qnVRl
J1B5opxMnGOsjXmHvtG/tiZl281Uwz5ZmzfLmIY3gUZgUgdA3511Egp0ry5y1dzJ
SkYC4Hmzb2ybQvXGIDDa3OzCwXXiqyqKsO+O8Egg1k4OIwbp3w+NHE7gKeA+dMgD
gjN7zEalCUi46Q28xiCPEb+88BpQ18czIWGQLb9mAnmYeZPjqqenXKXuRHr4lgVe
jPURJ/vqvFEglZJN1RDuQHKzHEcm5f2XE566sMZYdSoeiUCb0QM=
=2U56
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
- Pseudo NMI support for arm64 using GICv3 interrupt priorities
- uaccess macros clean-up (unsafe user accessors also merged but
reverted, waiting for objtool support on arm64)
- ptrace regsets for Pointer Authentication (ARMv8.3) key management
- inX() ordering w.r.t. delay() on arm64 and riscv (acks in place by
the riscv maintainers)
- arm64/perf updates: PMU bindings converted to json-schema, unused
variable and misleading comment removed
- arm64/debug fixes to ensure checking of the triggering exception
level and to avoid the propagation of the UNKNOWN FAR value into the
si_code for debug signals
- Workaround for Fujitsu A64FX erratum 010001
- lib/raid6 ARM NEON optimisations
- NR_CPUS now defaults to 256 on arm64
- Minor clean-ups (documentation/comments, Kconfig warning, unused
asm-offsets, clang warnings)
- MAINTAINERS update for list information to the ARM64 ACPI entry
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (54 commits)
arm64: mmu: drop paging_init comments
arm64: debug: Ensure debug handlers check triggering exception level
arm64: debug: Don't propagate UNKNOWN FAR into si_code for debug signals
Revert "arm64: uaccess: Implement unsafe accessors"
arm64: avoid clang warning about self-assignment
arm64: Kconfig.platforms: fix warning unmet direct dependencies
lib/raid6: arm: optimize away a mask operation in NEON recovery routine
lib/raid6: use vdupq_n_u8 to avoid endianness warnings
arm64: io: Hook up __io_par() for inX() ordering
riscv: io: Update __io_[p]ar() macros to take an argument
asm-generic/io: Pass result of I/O accessor to __io_[p]ar()
arm64: Add workaround for Fujitsu A64FX erratum 010001
arm64: Rename get_thread_info()
arm64: Remove documentation about TIF_USEDFPU
arm64: irqflags: Fix clang build warnings
arm64: Enable the support of pseudo-NMIs
arm64: Skip irqflags tracing for NMI in IRQs disabled context
arm64: Skip preemption when exiting an NMI
arm64: Handle serror in NMI context
irqchip/gic-v3: Allow interrupts to be set as pseudo-NMI
...
When pmu::setup_aux() is called the coresight PMU needs to know which
sink to use for the session by looking up the information in the
event's attr::config2 field.
As such simply replace the cpu information by the complete perf_event
structure and change all affected customers.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Suzuki Poulouse <suzuki.poulose@arm.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-s390@vger.kernel.org
Link: http://lkml.kernel.org/r/20190131184714.20388-2-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/perf/xgene_pmu.c: In function 'xgene_perf_stop':
drivers/perf/xgene_pmu.c:1055:6: warning:
variable 'config' set but not used [-Wunused-but-set-variable]
It never used since introduction.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
For drivers that do not support context exclusion let's advertise the
PERF_PMU_CAP_NO_EXCLUDE capability. This ensures that perf will
prevent us from handling events where any exclusion flags are set.
Let's also remove the now unnecessary check for exclusion flags.
This change means that qcom_{l2|l3}_pmu will now also indicate that
they do not support exclude_{host|guest} and that xgene_pmu does
not also support exclude_idle and exclude_hv.
Note that for qcom_l2_pmu we now implictly return -EINVAL instead
of -EOPNOTSUPP. This change will result in the perf userspace
utility retrying the perf_event_open system call with fallback
event attributes that do not fail.
Signed-off-by: Andrew Murray <andrew.murray@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: robin.murphy@arm.com
Cc: suzuki.poulose@arm.com
Link: https://lkml.kernel.org/r/1547128414-50693-9-git-send-email-andrew.murray@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
PERF_PMU_CAP_NO_EXCLUDE capability. This ensures that perf will
prevent us from handling events where any exclusion flags are set.
Let's also remove the now unnecessary check for exclusion flags.
Signed-off-by: Andrew Murray <andrew.murray@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: robin.murphy@arm.com
Cc: suzuki.poulose@arm.com
Link: https://lkml.kernel.org/r/1547128414-50693-8-git-send-email-andrew.murray@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The ARM PMU driver can be used to represent a variety of ARM based
PMUs. Some of these PMUs do not provide support for context
exclusion, where this is the case we advertise the
PERF_PMU_CAP_NO_EXCLUDE capability to ensure that perf prevents us
from handling events where any exclusion flags are set.
Where an ARM PMU driver has the set_event_filter function implemented,
we rely on it to perform exclusion checks. At present some of these
functions do not test for all of the available exclude flags.
Signed-off-by: Andrew Murray <andrew.murray@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: robin.murphy@arm.com
Cc: suzuki.poulose@arm.com
Link: https://lkml.kernel.org/r/1547128414-50693-6-git-send-email-andrew.murray@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
For DDRC PMU, each PMU counter is fixed-purpose. There is a mismatch
between perf list and driver definition on rw_chg event.
# perf list | grep chg
hisi_sccl1_ddrc0/rnk_chg/ [Kernel PMU event]
hisi_sccl1_ddrc0/rw_chg/ [Kernel PMU event]
But the register offset of rw_chg event is not defined in the driver,
meanwhile bnk_chg register offset is mis-defined, let's fixup it.
Fixes: 904dcf03f0 ("perf: hisi: Add support for HiSilicon SoC DDRC PMU driver")
Cc: stable@vger.kernel.org
Cc: John Garry <john.garry@huawei.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Reported-by: Weijian Huang <huangweijian4@hisilicon.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
This patch adds a perf driver for the PMU UNCORE devices DDR4 Memory
Controller(DMC) and Level 3 Cache(L3C). Each PMU supports up to 4
counters. All counters lack overflow interrupt and are
sampled periodically.
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
[will: consistent enum cpuhp_state naming]
Signed-off-by: Will Deacon <will.deacon@arm.com>
devm_kasprintf() may return NULL on failure of internal allocation
thus the assignment to 'name' is not safe if unchecked. If NULL
is passed in for name then perf_pmu_register() would not fail
but rather silently jump to skip_type which is not the intent
here. As perf_pmu_register() may also return -ENOMEM returning
-ENOMEM in the (unlikely) failure case of devm_kasprintf() should
be fine here as well.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Fixes: d5d9696b03 ("drivers/perf: Add support for ARMv8.2 Statistical Profiling Extension")
Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
[will: reworded error message]
Signed-off-by: Will Deacon <will.deacon@arm.com>
If the CPU assigned to the xgene PMU is taken offline, then subsequent
perf invocations on the PMU will fail:
# echo 0 > /sys/devices/system/cpu/cpu0/online
# perf stat -a -e l3c0/cycle-count/,l3c0/write/ sleep 1
Error:
The sys_perf_event_open() syscall returned with 19 (No such device) for event (l3c0/cycle-count/).
/bin/dmesg may provide additional information.
No CONFIG_PERF_EVENTS=y kernel support configured?
This patch implements a hotplug notifier in the xgene PMU driver so that
the PMU context is migrated to another online CPU should its assigned
CPU disappear.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Hoan Tran <hoan.tran@amperecomputing.com>
[will: Made naming of new cpuhp_state enum entry consistent]
Signed-off-by: Will Deacon <will.deacon@arm.com>
When built as a module, the spe driver isn't automatically
loaded on DT systems. Add the MODULE_DEVICE_TABLE entry.
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
- Core mmu_gather changes which allow tracking the levels of page-table
being cleared together with the arm64 low-level flushing routines
- Support for the new ARMv8.5 PSTATE.SSBS bit which can be used to
mitigate Spectre-v4 dynamically without trapping to EL3 firmware
- Introduce COMPAT_SIGMINSTKSZ for use in compat_sys_sigaltstack
- Optimise emulation of MRS instructions to ID_* registers on ARMv8.4
- Support for Common Not Private (CnP) translations allowing threads of
the same CPU to share the TLB entries
- Accelerated crc32 routines
- Move swapper_pg_dir to the rodata section
- Trap WFI instruction executed in user space
- ARM erratum 1188874 workaround (arch_timer)
- Miscellaneous fixes and clean-ups
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAlvKGdEACgkQa9axLQDI
XvGSQBAAiOH6aQABL4TB7c5KIc7C+Unjm6QCFCoaeGWoHuemnM6cFJ7RQsi0GqnP
dVEX5V/FKfmeTWO5g24Ah+MbTm3Bt6+81gywAmi1rrHhmCaCIPjT7xDqy/WsLlvt
7WtgegSGvQ7DIMj2dbfFav6+ra67qAiYZTc46jvuynVl6DrE3BCiyTDbXAWt2nzP
Xf3un4AHRbg3UEMUZTLqU5q4z0tbM6rEAZru8O0UOTnD2q7uttUqW3Ab7fpuEkkj
lEVrMWD3h8SJg+Df9CbXmCNOjh4VhwBwDb5LgO8vA/AcyV/YLEF5b2OUAk/28qwo
0GBwjqRyI4+YQ9LPg41MhGzrlnta0HCdYoeNLgLQZiDcUkuSfGhoA+MNZNOR8B08
sCWF7F6f8UIQm8KMMBiYYdlVyUYgHLsWE/1+CyeLV0oIoWT5k3c+Xe3pho9KpVb0
Co04TqMlqalry0sbevHz5c55H7iWIjB1Tpo3SxM105dVJVibXRPXkz+WZ5iPO+xa
ex2j1kjNdA/AUzrSCZ5lh22zhg0WsfwD++E5meAaJMxieim8FeZDRga43rowJ0BA
zMbSNB/+NDFZ9EhC40VaUfKk8Tkgiug9J5swv0+v7hy1QLDyydHhbOecTuIueauM
6taiT2Iuov5yFng1eonYj4htvouVF4WOhPGthFPJMOcrB9mLMhs=
=3Mc8
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
"Apart from some new arm64 features and clean-ups, this also contains
the core mmu_gather changes for tracking the levels of the page table
being cleared and a minor update to the generic
compat_sys_sigaltstack() introducing COMPAT_SIGMINSKSZ.
Summary:
- Core mmu_gather changes which allow tracking the levels of
page-table being cleared together with the arm64 low-level flushing
routines
- Support for the new ARMv8.5 PSTATE.SSBS bit which can be used to
mitigate Spectre-v4 dynamically without trapping to EL3 firmware
- Introduce COMPAT_SIGMINSTKSZ for use in compat_sys_sigaltstack
- Optimise emulation of MRS instructions to ID_* registers on ARMv8.4
- Support for Common Not Private (CnP) translations allowing threads
of the same CPU to share the TLB entries
- Accelerated crc32 routines
- Move swapper_pg_dir to the rodata section
- Trap WFI instruction executed in user space
- ARM erratum 1188874 workaround (arch_timer)
- Miscellaneous fixes and clean-ups"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (78 commits)
arm64: KVM: Guests can skip __install_bp_hardening_cb()s HYP work
arm64: cpufeature: Trap CTR_EL0 access only where it is necessary
arm64: cpufeature: Fix handling of CTR_EL0.IDC field
arm64: cpufeature: ctr: Fix cpu capability check for late CPUs
Documentation/arm64: HugeTLB page implementation
arm64: mm: Use __pa_symbol() for set_swapper_pgd()
arm64: Add silicon-errata.txt entry for ARM erratum 1188873
Revert "arm64: uaccess: implement unsafe accessors"
arm64: mm: Drop the unused cpu parameter
MAINTAINERS: fix bad sdei paths
arm64: mm: Use #ifdef for the __PAGETABLE_P?D_FOLDED defines
arm64: Fix typo in a comment in arch/arm64/mm/kasan_init.c
arm64: xen: Use existing helper to check interrupt status
arm64: Use daifflag_restore after bp_hardening
arm64: daifflags: Use irqflags functions for daifflags
arm64: arch_timer: avoid unused function warning
arm64: Trap WFI executed in userspace
arm64: docs: Document SSBS HWCAP
arm64: docs: Fix typos in ELF hwcaps
arm64/kprobes: remove an extra semicolon in arch_prepare_kprobe
...
It doesn't make sense for a perf event to be configured as a CHAIN event
in isolation, so extend the arm_pmu structure with a ->filter_match()
function to allow the backend PMU implementation to reject CHAIN events
early.
Cc: <stable@vger.kernel.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
In preparation to remove the node name pointer from struct device_node,
convert printf users to use the %pOFn format specifier.
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Here is the bit set of char/misc drivers for 4.19-rc1
There is a lot here, much more than normal, seems like everyone is
writing new driver subsystems these days... Anyway, major things here
are:
- new FSI driver subsystem, yet-another-powerpc low-level
hardware bus
- gnss, finally an in-kernel GPS subsystem to try to tame all of
the crazy out-of-tree drivers that have been floating around
for years, combined with some really hacky userspace
implementations. This is only for GNSS receivers, but you
have to start somewhere, and this is great to see.
Other than that, there are new slimbus drivers, new coresight drivers,
new fpga drivers, and loads of DT bindings for all of these and existing
drivers.
Full details of everything is in the shortlog.
All of these have been in linux-next for a while with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCW3g7ew8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ykfBgCeOG0RkSI92XVZe0hs/QYFW9kk8JYAnRBf3Qpm
cvW7a+McOoKz/MGmEKsi
=TNfn
-----END PGP SIGNATURE-----
Merge tag 'char-misc-4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver updates from Greg KH:
"Here is the bit set of char/misc drivers for 4.19-rc1
There is a lot here, much more than normal, seems like everyone is
writing new driver subsystems these days... Anyway, major things here
are:
- new FSI driver subsystem, yet-another-powerpc low-level hardware
bus
- gnss, finally an in-kernel GPS subsystem to try to tame all of the
crazy out-of-tree drivers that have been floating around for years,
combined with some really hacky userspace implementations. This is
only for GNSS receivers, but you have to start somewhere, and this
is great to see.
Other than that, there are new slimbus drivers, new coresight drivers,
new fpga drivers, and loads of DT bindings for all of these and
existing drivers.
All of these have been in linux-next for a while with no reported
issues"
* tag 'char-misc-4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (255 commits)
android: binder: Rate-limit debug and userspace triggered err msgs
fsi: sbefifo: Bump max command length
fsi: scom: Fix NULL dereference
misc: mic: SCIF Fix scif_get_new_port() error handling
misc: cxl: changed asterisk position
genwqe: card_base: Use true and false for boolean values
misc: eeprom: assignment outside the if statement
uio: potential double frees if __uio_register_device() fails
eeprom: idt_89hpesx: clean up an error pointer vs NULL inconsistency
misc: ti-st: Fix memory leak in the error path of probe()
android: binder: Show extra_buffers_size in trace
firmware: vpd: Fix section enabled flag on vpd_section_destroy
platform: goldfish: Retire pdev_bus
goldfish: Use dedicated macros instead of manual bit shifting
goldfish: Add missing includes to goldfish.h
mux: adgs1408: new driver for Analog Devices ADGS1408/1409 mux
dt-bindings: mux: add adi,adgs1408
Drivers: hv: vmbus: Cleanup synic memory free path
Drivers: hv: vmbus: Remove use of slow_virt_to_phys()
Drivers: hv: vmbus: Reset the channel callback in vmbus_onoffer_rescind()
...
Pull in arm perf updates, including support for 64-bit (chained) event
counters and some non-critical fixes for some of the system PMU drivers.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Instead of checking the return value of platform_get_resource(), we can
use devm_ioremap_resource() which has the NULL pointer check and the
memory region requesting. devm_ioremap_resource is designed to replace
calls to devm_request_mem_region followed by devm_ioremap, so let's use
the same.
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
MT bit in MPIDR_EL1 is now supported in certain HiSilicon platforms, so
the mapping between sccl_id/ccl_id and affinity level needs to be updated
from the generic encoding we originally used.
Cc: John Garry <john.garry@huawei.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
[will: fixed comment]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Add support for 64bit event by using chained event counters
and 64bit cycle counters.
PMUv3 allows chaining a pair of adjacent 32-bit counters, effectively
forming a 64-bit counter. The low/even counter is programmed to count
the event of interest, and the high/odd counter is programmed to count
the CHAIN event, taken when the low/even counter overflows.
For CPU cycles, when 64bit mode is requested, the cycle counter
is used in 64bit mode. If the cycle counter is not available,
falls back to chaining.
Cc: Will Deacon <will.deacon@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The armpmu uses get_event_idx callback to allocate an event
counter for a given event, which marks the selected counter
as "used". Now, when we delete the counter, the arm_pmu goes
ahead and clears the "used" bit and then invokes the "clear_event_idx"
call back, which kind of splits the job between the core code
and the backend. To keep things tidy, mandate the implementation
of clear_event_idx() and add it for exisiting backends.
This will be useful for adding the chained event support, where
we leave the event idx maintenance to the backend.
Also, when an event is removed from the PMU, reset the hw.idx
to indicate that a counter is not allocated for this event,
to help the backends do better checks. This will be also used
for the chain counter support.
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Each PMU has a set of 32bit event counters. But in some
special cases, the events could be counted using counters
which are effectively 64bit wide.
e.g, Arm V8 PMUv3 has a 64 bit cycle counter which can count
only the CPU cycles. Also, the PMU can chain the event counters
to effectively count as a 64bit counter.
Add support for tracking the events that uses 64bit counters.
This only affects the periods set for each counter in the core
driver.
Cc: Will Deacon <will.deacon@arm.com>
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Each PMU defines their max_period of the counter as the maximum
value that can be counted. Since all the PMU backends support
32bit counters by default, let us remove the redundant field.
No functional changes.
Cc: Will Deacon <will.deacon@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
At over 4000 #includes, <linux/platform_device.h> is the 9th most
#included header file in the Linux kernel. It does not need
<linux/mod_devicetable.h>, so drop that header and explicitly add
<linux/mod_devicetable.h> to source files that need it.
4146 #include <linux/platform_device.h>
After this patch, there are 225 files that use <linux/mod_devicetable.h>,
for a reduction of around 3900 times that <linux/mod_devicetable.h>
does not have to be read & parsed.
225 #include <linux/mod_devicetable.h>
This patch was build-tested on 20 different arch-es.
It also makes these drivers SubmitChecklist#1 compliant.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kbuild test robot <lkp@intel.com> # drivers/media/platform/vimc/
Reported-by: kbuild test robot <lkp@intel.com> # drivers/pinctrl/pinctrl-u300.c
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If a PMU doesn't have any IRQs, we should return 0 from
armpmu_request_irqs(), rather than uninitialised stack.
Signed-off-by: Will Deacon <will.deacon@arm.com>
In the quest to remove all stack VLA usage from the kernel[1], this
removes the VLA in favor of a maximum size and adds a sanity check
at registration time. The sizes are all explicitly enumerated already,
so this just collects them into macros.
[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
This patch fixes the below parser error of the IOB SLOW PMU.
# perf stat -a -e iob-slow0/cycle-count/ sleep 1
evenf syntax error: 'iob-slow0/cycle-count/'
\___ parser error
It replaces the "-" character by "_" character inside the PMU name.
Signed-off-by: Hoan Tran <hoan.tran@amperecomputing.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
When the arm-cci driver is enabled, but both CONFIG_ARM_CCI5xx_PMU and
CONFIG_ARM_CCI400_PMU are not, we get a warning about how parts of
the driver are never used:
drivers/perf/arm-cci.c:1454:29: error: 'cci_pmu_models' defined but not used [-Werror=unused-variable]
drivers/perf/arm-cci.c:693:16: error: 'cci_pmu_event_show' defined but not used [-Werror=unused-function]
drivers/perf/arm-cci.c:685:16: error: 'cci_pmu_format_show' defined but not used [-Werror=unused-function]
Marking all three functions as __maybe_unused avoids the warnings in
randconfig builds. I'm doing this lacking any ideas for a better fix.
Fixes: 3de6be7a3d ("drivers/bus: Split Arm CCI driver")
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Since commit bddb9b68d3 ("drivers/perf: commonise PERF_EVENTS
dependency"), all perf drivers depend on PERF_EVENTS config under a
common menu.
Config ARM_SPE_PMU still declares explicitly a dependency on
PERF_EVENTS, which is unneeded, so remove it.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The ARM CCN PMU driver uses dev_warn() to complain about parameters in
the user-provided perf_event_attr. This means that under normal
operation (e.g. a single invocation of the perf tool), a number of
messages warnings may be logged to dmesg.
Tools may issue multiple syscalls to probe for feature support, and
multiple applications (from multiple users) can attempt to open events
simultaneously, so this is not very helpful, even if a user happens to
have access to dmesg. Worse, this can push important information out of
the dmesg ring buffer, and can significantly slow down syscall fuzzers,
vastly increasing the time it takes to find critical bugs.
Demote the dev_warn() instances to dev_dbg(), as is the case for all
other PMU drivers under drivers/perf/. Users who wish to debug PMU event
initialisation can enable dynamic debug to receive these messages.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Fill in the few extra bits and annotations needed to make the driver
work properly as a module, and jiggle the Kconfig to expose the
driver-level ARM_CCI_PMU option.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The CCI PMU driver bears some legacy remnants of the arm_pmu framework
from when it was split in c6f85cb430 ("bus: cci: move away from
arm_pmu framework"). In particular this perf_pmu_{dis,en}able() dance
around pmu->add which was fixed for arm_pmu in a9e469d1c8
("drivers/perf: arm_pmu: remove pointless PMU disabling").
For the exact same reasons (i.e. perf core already does this around the
call anyway), give cci_pmu_add() the exact same change, which also
prevents having to export those core functions to build it as a module.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The CCI/CCN drivers are licensed under GPLv2, but the MODULE_LICENSE()
tags are using the bare "GPL" string implying GPLv2 or later. Fix them
to match their actual file license.
Acked-by: Pawel Moll <pawel.moll@arm.com>
Acked-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The arm_pmu::handle_irq() callback has the same prototype as a generic
IRQ handler, taking the IRQ number and a void pointer argument which it
must convert to an arm_pmu pointer.
This means that all arm_pmu::handle_irq() take an IRQ number they never
use, and all must explicitly cast the void pointer to an arm_pmu
pointer.
Instead, let's change arm_pmu::handle_irq to take an arm_pmu pointer,
allowing these casts to be removed. The redundant IRQ number parameter
is also removed.
Suggested-by: Hoeun Ryu <hoeun.ryu@lge.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Since sampling events are rejected up-front by cci_pmu_event_init(), it
doesn't make much sense to go fiddling with the sampling period later.
This would seem to be just another leftover artefact of the arm_pmu
framwork, and as such can go.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
We should get drvdata from struct device directly. Going via
platform_device is an unneeded step back and forth.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The main addition this time around is the new ARM "SCMI" framework,
which is the latest in a series of standards coming from ARM to do power
management in a platform independent way. This has been through many
review cycles, and it relies on a rather interesting way of using the
mailbox subsystem, but in the end I agreed that Sudeep's version was
the best we could do after all.
Other changes include:
- the ARM CCN driver is moved out of drivers/bus into drivers/perf,
which makes more sense. Similarly, the performance monitoring
portion of the CCI driver are moved the same way and cleaned up
a little more.
- a series of updates to the SCPI framework
- support for the Mediatek mt7623a SoC in drivers/soc
- support for additional NVIDIA Tegra hardware in drivers/soc
- a new reset driver for Socionext Uniphier
- lesser bug fixes in drivers/soc, drivers/tee, drivers/memory, and
drivers/firmware and drivers/reset across platforms
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJaxiNzAAoJEGCrR//JCVInYhYP/2kPhc5t/kszA1bcklcbO9dY
eX37Ra/RR4yQ5yeQZVIZ4UkUovxk9PmG2tM4K5oJaTDsz5pPEgavVOOr3sbfj6vb
4O9auTeysEQlHcbVdNFum0YS2gUY2YD7D12DTRorotLxCqod184ccWXq0XGfIWaY
l3YRrcL/lPlqmyS3z/GNx9oNygOMUzEfXfIQYICyzHuYiLBUGnkKC1vIb+Hx1TDq
Cxk++AUqH13Mss24O2A2QQh+oBHj2BybDLLqwcC5PSpsUbFrVCfzG54l43mig32T
NOxV0Qnml2wAtU4H0QcgtSgwRimHD0YOiX8ssquvDDiqTqM5G+llSTGkEbYe+AUW
4GIZYoBOwGkfEXS+tyymHe9yfc5h1OLYAeFU1jRm723c7phanuu67rPn35YC8UMK
zSql10JpkAGNzMikrxxb6wnis951w2UFlzhgZQ6ItA/nRq3l+oEQA0Qiljv965nz
DVLsD5+gdhK6GBctkzlsD5HFn6GjM8JilnsOVPHD765nKnVBSxKiXRLV228XVug2
rChF1FhQqLnM54jCMqHZX5fS9SbSgtYswHqIXpVw6GmJkqq/Ly10yGR0vuWD+uyn
BV7q5AKpGrwm6wZkMM2uZ1VdUtWzn856AbkqrvX/QhmJcX4McuqaLUrC8bSOj1ty
KeVil0akq3nU+xHl5Ojs
=Pmsx
-----END PGP SIGNATURE-----
Merge tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC driver updates from Arnd Bergmann:
"The main addition this time around is the new ARM "SCMI" framework,
which is the latest in a series of standards coming from ARM to do
power management in a platform independent way.
This has been through many review cycles, and it relies on a rather
interesting way of using the mailbox subsystem, but in the end I
agreed that Sudeep's version was the best we could do after all.
Other changes include:
- the ARM CCN driver is moved out of drivers/bus into drivers/perf,
which makes more sense. Similarly, the performance monitoring
portion of the CCI driver are moved the same way and cleaned up a
little more.
- a series of updates to the SCPI framework
- support for the Mediatek mt7623a SoC in drivers/soc
- support for additional NVIDIA Tegra hardware in drivers/soc
- a new reset driver for Socionext Uniphier
- lesser bug fixes in drivers/soc, drivers/tee, drivers/memory, and
drivers/firmware and drivers/reset across platforms"
* tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (87 commits)
reset: uniphier: add ethernet reset control support for PXs3
reset: stm32mp1: Enable stm32mp1 reset driver
dt-bindings: reset: add STM32MP1 resets
reset: uniphier: add Pro4/Pro5/PXs2 audio systems reset control
reset: imx7: add 'depends on HAS_IOMEM' to fix unmet dependency
reset: modify the way reset lookup works for board files
reset: add support for non-DT systems
clk: scmi: use devm_of_clk_add_hw_provider() API and drop scmi_clocks_remove
firmware: arm_scmi: prevent accessing rate_discrete uninitialized
hwmon: (scmi) return -EINVAL when sensor information is unavailable
amlogic: meson-gx-socinfo: Update soc ids
soc/tegra: pmc: Use the new reset APIs to manage reset controllers
soc: mediatek: update power domain data of MT2712
dt-bindings: soc: update MT2712 power dt-bindings
cpufreq: scmi: add thermal dependency
soc: mediatek: fix the mistaken pointer accessed when subdomains are added
soc: mediatek: add SCPSYS power domain driver for MediaTek MT7623A SoC
soc: mediatek: avoid hardcoded value with bus_prot_mask
dt-bindings: soc: add header files required for MT7623A SCPSYS dt-binding
dt-bindings: soc: add SCPSYS binding for MT7623 and MT7623A SoC
...
Nothing particularly stands out here, probably because people were tied
up with spectre/meltdown stuff last time around. Still, the main pieces
are:
- Rework of our CPU features framework so that we can whitelist CPUs that
don't require kpti even in a heterogeneous system
- Support for the IDC/DIC architecture extensions, which allow us to elide
instruction and data cache maintenance when writing out instructions
- Removal of the large memory model which resulted in suboptimal codegen
by the compiler and increased the use of literal pools, which could
potentially be used as ROP gadgets since they are mapped as executable
- Rework of forced signal delivery so that the siginfo_t is well-formed
and handling of show_unhandled_signals is consolidated and made
consistent between different fault types
- More siginfo cleanup based on the initial patches from Eric Biederman
- Workaround for Cortex-A55 erratum #1024718
- Some small ACPI IORT updates and cleanups from Lorenzo Pieralisi
- Misc cleanups and non-critical fixes
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABCgAGBQJaw1TCAAoJELescNyEwWM0gyQIAJVMK4QveBW+LwF96NYdZo16
p90Aa+nqKelh/s93govQArDMv1gxyuXdFlQZVOGPQHfqpz6RhJWmBA2tFsUbQrUc
OBcioPrRihqTmKBe+1r1XORwZxkVX6GGmCn0LYpPR7I3TjxXZpvxqaxGxiUvHkci
yVxWlDTyN/7eL3akhCpCDagN3Fxwk3QnJLqE3fxOFMlY7NvQcmUxcITiUl/s469q
xK6SWH9SRH1JK8jTHPitwUBiU//3FfCqSI9HLEdDIDoTuPcVM8UetWvi4QzrzJL1
UYg8lmU0CXNmflDzZJDaMf+qFApOrGxR0YVPpBzlQvxe0JIY69g48f+JzDPz8nc=
=+gNa
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
"Nothing particularly stands out here, probably because people were
tied up with spectre/meltdown stuff last time around. Still, the main
pieces are:
- Rework of our CPU features framework so that we can whitelist CPUs
that don't require kpti even in a heterogeneous system
- Support for the IDC/DIC architecture extensions, which allow us to
elide instruction and data cache maintenance when writing out
instructions
- Removal of the large memory model which resulted in suboptimal
codegen by the compiler and increased the use of literal pools,
which could potentially be used as ROP gadgets since they are
mapped as executable
- Rework of forced signal delivery so that the siginfo_t is
well-formed and handling of show_unhandled_signals is consolidated
and made consistent between different fault types
- More siginfo cleanup based on the initial patches from Eric
Biederman
- Workaround for Cortex-A55 erratum #1024718
- Some small ACPI IORT updates and cleanups from Lorenzo Pieralisi
- Misc cleanups and non-critical fixes"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (70 commits)
arm64: uaccess: Fix omissions from usercopy whitelist
arm64: fpsimd: Split cpu field out from struct fpsimd_state
arm64: tlbflush: avoid writing RES0 bits
arm64: cmpxchg: Include linux/compiler.h in asm/cmpxchg.h
arm64: move percpu cmpxchg implementation from cmpxchg.h to percpu.h
arm64: cmpxchg: Include build_bug.h instead of bug.h for BUILD_BUG
arm64: lse: Include compiler_types.h and export.h for out-of-line LL/SC
arm64: fpsimd: include <linux/init.h> in fpsimd.h
drivers/perf: arm_pmu_platform: do not warn about affinity on uniprocessor
perf: arm_spe: include linux/vmalloc.h for vmap()
Revert "arm64: Revert L1_CACHE_SHIFT back to 6 (64-byte cache line size)"
arm64: cpufeature: Avoid warnings due to unused symbols
arm64: Add work around for Arm Cortex-A55 Erratum 1024718
arm64: Delay enabling hardware DBM feature
arm64: Add MIDR encoding for Arm Cortex-A55 and Cortex-A35
arm64: capabilities: Handle shared entries
arm64: capabilities: Add support for checks based on a list of MIDRs
arm64: Add helpers for checking CPU MIDR against a range
arm64: capabilities: Clean up midr range helpers
arm64: capabilities: Change scope of VHE to Boot CPU feature
...
If there is exactly one CPU present, there is no ambiguity: do not warn
that PMU setup would need to guess IRQ affinity.
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Alexander Monakov <amonakov@ispras.ru>
Signed-off-by: Will Deacon <will.deacon@arm.com>
On linux-next, I get a build failure in some configurations:
drivers/perf/arm_spe_pmu.c: In function 'arm_spe_pmu_setup_aux':
drivers/perf/arm_spe_pmu.c:857:14: error: implicit declaration of function 'vmap'; did you mean 'swap'? [-Werror=implicit-function-declaration]
buf->base = vmap(pglist, nr_pages, VM_MAP, PAGE_KERNEL);
^~~~
swap
drivers/perf/arm_spe_pmu.c:857:37: error: 'VM_MAP' undeclared (first use in this function); did you mean 'VM_MPX'?
buf->base = vmap(pglist, nr_pages, VM_MAP, PAGE_KERNEL);
^~~~~~
VM_MPX
drivers/perf/arm_spe_pmu.c:857:37: note: each undeclared identifier is reported only once for each function it appears in
drivers/perf/arm_spe_pmu.c: In function 'arm_spe_pmu_free_aux':
drivers/perf/arm_spe_pmu.c:878:2: error: implicit declaration of function 'vunmap'; did you mean 'iounmap'? [-Werror=implicit-function-declaration]
vmap() is declared in linux/vmalloc.h, so we should include that header file.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
[will: add additional missing #includes reported by Mark]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Now that all the grouping is done with RB trees, we no longer need
group_entry and can replace the whole thing with sibling_list.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Carrillo-Cisneros <davidcc@google.com>
Cc: Dmitri Prokhorov <Dmitry.Prohorov@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Valery Cherepennikov <valery.cherepennikov@intel.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Depending directly on the bus driver's global cci_ctrl_base variable
is a little unpleasant, and exporting it to allow the PMU driver to
be modular would be even more so. Let's make things a little better
abstracted by adding the control register block to the cci_pmu instance
data alongside the PMU register block, and communicating the mapped
address from the bus driver via platform data.
It's not practical to try the same thing for the bus driver itself,
given that the globals are entangled with the hairy assembly code for
port control, so we leave them be there. It would however be prudent
to move them to the __ro_after_init section in passing, since the
addresses really should never be changing once set.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Since I am the self-appointed of_device_get_match_data() police, it's
only right that I should clean up this driver while I'm otherwise
touching it. This also reveals that we're passing around a struct
platform_device in places where we only ever care about its regular
device, so straighten that out in the process.
Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Realistically, systems with multiple CCIs are unlikely to ever exist,
and since the driver only actually supports a single instance anyway
there's really no need to do the multi-instance hotplug state dance.
Take the opportunity to simplify the hotplug-related code all over,
addressing the context-migration TODO in the process for good measure.
Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The arm-cci driver is really two entirely separate drivers; one for MCPM
port control and the other for the performance monitors. Since they are
already relatively self-contained, let's take the plunge and move the
PMU parts out to drivers/perf where they belong these days. For non-MCPM
systems this leaves a small dependency on the remaining "bus" stub for
initial probing and discovery, but we end up with something that still
fits the general pattern of its fellow system PMU drivers to ease future
maintenance.
Moving code to a new file also offers a perfect excuse to modernise the
license/copyright headers and clean up some funky linewraps on the way.
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Suzuki Poulose <suzuki.poulose@arm.com>
Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The arm-ccn driver is purely a perf driver for the CCN PMU, not a bus
driver in the sense of the other residents of drivers/bus/, so let's
move it to the appropriate place for SoC PMU drivers. Not to mention
moving the documentation accordingly as well.
Acked-by: Pawel Moll <pawel.moll@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Commit 6de3f79112 ("arm_pmu: explicitly enable/disable SPIs at hotplug")
moved all of the arm_pmu IRQ enable/disable calls to the CPU hotplug hooks,
regardless of whether they are implemented as PPIs or SPIs. This can
lead to us sleeping from atomic context due to disable_irq blocking:
| BUG: sleeping function called from invalid context at kernel/irq/manage.c:112
| in_atomic(): 1, irqs_disabled(): 128, pid: 15, name: migration/1
| no locks held by migration/1/15.
| irq event stamp: 192
| hardirqs last enabled at (191): [<00000000803c2507>]
| _raw_spin_unlock_irq+0x2c/0x4c
| hardirqs last disabled at (192): [<000000007f57ad28>] multi_cpu_stop+0x9c/0x140
| softirqs last enabled at (0): [<0000000004ee1b58>]
| copy_process.isra.77.part.78+0x43c/0x1504
| softirqs last disabled at (0): [< (null)>] (null)
| CPU: 1 PID: 15 Comm: migration/1 Not tainted 4.16.0-rc3-salvator-x #1651
| Hardware name: Renesas Salvator-X board based on r8a7796 (DT)
| Call trace:
| dump_backtrace+0x0/0x140
| show_stack+0x14/0x1c
| dump_stack+0xb4/0xf0
| ___might_sleep+0x1fc/0x218
| __might_sleep+0x70/0x80
| synchronize_irq+0x40/0xa8
| disable_irq+0x20/0x2c
| arm_perf_teardown_cpu+0x80/0xac
Since the interrupt is always CPU-affine and this code is running with
interrupts disabled, we can just use disable_irq_nosync as we know there
isn't a concurrent invocation of the handler to worry about.
Fixes: 6de3f79112 ("arm_pmu: explicitly enable/disable SPIs at hotplug")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
We can't request IRQs in atomic context, so for ACPI systems we'll have
to request them up-front, and later associate them with CPUs.
This patch reorganises the arm_pmu code to do so. As we no longer have
the arm_pmu structure at probe time, a number of prototypes need to be
adjusted, requiring changes to the common arm_pmu code and arm_pmu
platform code.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
To support ACPI systems, we need to request IRQs before we know the
associated PMU, and thus we need some percpu variable that the IRQ
handler can find the PMU from.
As we're going to request IRQs without the PMU, we can't rely on the
arm_pmu::active_irqs mask, and similarly need to track requested IRQs
with a percpu variable.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
[will: made armpmu_count_irq_users static]
Signed-off-by: Will Deacon <will.deacon@arm.com>
To support ACPI systems, we need to request IRQs before CPUs are
hotplugged, and thus we need to request IRQs before we know their
associated PMU.
This is problematic if a PMU IRQ is pending out of reset, as it may be
taken before we know the PMU, and thus the IRQ handler won't be able to
handle it, leaving it screaming.
To avoid such problems, lets request all IRQs in a disabled state, and
explicitly enable/disable them at hotplug time, when we're sure the PMU
has been probed.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The arm_pmu platform code explicitly checks for mismatched PPIs at probe
time, while the ACPI code leaves this to the core code. Future
refactoring will make this difficult for the core code to check, so
let's have the ACPI code check this explicitly.
As before, upon a failure we'll continue on without an interrupt. Ho
hum.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
In ACPI systems, we don't know the makeup of CPUs until we hotplug them
on, and thus have to allocate the PMU datastructures at hotplug time.
Thus, we must use GFP_ATOMIC allocations.
Let's add an armpmu_alloc_atomic() that we can use in this case.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The armpmu_{request,free}_irqs() helpers are only used by
arm_pmu_platform.c, so let's fold them in and make them static.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Now that we have no platforms passing platform data to the arm_pmu code,
we can get rid of the platdata and associated hooks, paving the way for
rework of our IRQ handling.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
with bitmap_{from,to}_arr32 over the kernel. Additionally to it:
* __check_eq_bitmap() now takes single nbits argument.
* __check_eq_u32_array is not used in new test but may be used in
future. So I don't remove it here, but annotate as __used.
Tested on arm64 and 32-bit BE mips.
[arnd@arndb.de: perf: arm_dsu_pmu: convert to bitmap_from_arr32]
Link: http://lkml.kernel.org/r/20180201172508.5739-2-ynorov@caviumnetworks.com
[ynorov@caviumnetworks.com: fix net/core/ethtool.c]
Link: http://lkml.kernel.org/r/20180205071747.4ekxtsbgxkj5b2fz@yury-thinkpad
Link: http://lkml.kernel.org/r/20171228150019.27953-2-ynorov@caviumnetworks.com
Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: David Decotigny <decot@googlers.com>,
Cc: David S. Miller <davem@davemloft.net>,
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We set dsu_pmu->num_counters to -1, when the DSU is allocated
but not initialised when none of the CPUs are active in the DSU.
However, we use an unsigned field for num_counters. Switch this
to a signed field.
Fixes: 7520fa9924 ("perf: ARM DynamIQ Shared Unit PMU support")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Support for the Cluster PMU part of the ARM DynamIQ Shared Unit (DSU).
* 'for-next/perf' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux:
perf: ARM DynamIQ Shared Unit PMU support
dt-bindings: Document devicetree binding for ARM DSU PMU
arm_pmu: Use of_cpu_node_to_id helper
arm64: Use of_cpu_node_to_id helper for CPU topology parsing
irqchip: gic-v3: Use of_cpu_node_to_id helper
coresight: of: Use of_cpu_node_to_id helper
of: Add helper for mapping device node to logical CPU number
perf: Export perf_event_update_userpage
Add support for the Cluster PMU part of the ARM DynamIQ Shared Unit (DSU).
The DSU integrates one or more cores with an L3 memory system, control
logic, and external interfaces to form a multicore cluster. The PMU
allows counting the various events related to L3, SCU etc, along with
providing a cycle counter.
The PMU can be accessed via system registers, which are common
to the cores in the same cluster. The PMU registers follow the
semantics of the ARMv8 PMU, mostly, with the exception that
the counters record the cluster wide events.
This driver is mostly based on the ARMv8 and CCI PMU drivers.
The driver only supports ARM64 at the moment. It can be extended
to support ARM32 by providing register accessors like we do in
arch/arm64/include/arm_dsu_pmu.h.
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Use the new generic helper, of_cpu_node_to_id(), to map a
a phandle to the logical CPU number while parsing the
PMU irq affinity.
Cc: Will Deacon <will.deacon@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
When running with the kernel unmapped whilst at EL0, the virtually-addressed
SPE buffer is also unmapped, which can lead to buffer faults if userspace
profiling is enabled and potentially also when writing back kernel samples
unless an expensive drain operation is performed on exception return.
For now, fail the SPE driver probe when arm64_kernel_unmapped_at_el0().
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Tested-by: Shanker Donthineni <shankerd@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Plenty of acronym soup here:
- Initial support for the Scalable Vector Extension (SVE)
- Improved handling for SError interrupts (required to handle RAS events)
- Enable GCC support for 128-bit integer types
- Remove kernel text addresses from backtraces and register dumps
- Use of WFE to implement long delay()s
- ACPI IORT updates from Lorenzo Pieralisi
- Perf PMU driver for the Statistical Profiling Extension (SPE)
- Perf PMU driver for Hisilicon's system PMUs
- Misc cleanups and non-critical fixes
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABCgAGBQJaCcLqAAoJELescNyEwWM0JREH/2FbmD/khGzEtP8LW+o9D8iV
TBM02uWQxS1bbO1pV2vb+512YQO+iWfeQwJH9Jv2FZcrMvFv7uGRnYgAnJuXNGrl
W+LL6OhN22A24LSawC437RU3Xe7GqrtONIY/yLeJBPablfcDGzPK1eHRA0pUzcyX
VlyDruSHWX44VGBPV6JRd3x0vxpV8syeKOjbRvopRfn3Nwkbd76V3YSfEgwoTG5W
ET1sOnXLmHHdeifn/l1Am5FX1FYstpcd7usUTJ4Oto8y7e09tw3bGJCD0aMJ3vow
v1pCUWohEw7fHqoPc9rTrc1QEnkdML4vjJvMPUzwyTfPrN+7uEuMIEeJierW+qE=
=0qrg
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
"The big highlight is support for the Scalable Vector Extension (SVE)
which required extensive ABI work to ensure we don't break existing
applications by blowing away their signal stack with the rather large
new vector context (<= 2 kbit per vector register). There's further
work to be done optimising things like exception return, but the ABI
is solid now.
Much of the line count comes from some new PMU drivers we have, but
they're pretty self-contained and I suspect we'll have more of them in
future.
Plenty of acronym soup here:
- initial support for the Scalable Vector Extension (SVE)
- improved handling for SError interrupts (required to handle RAS
events)
- enable GCC support for 128-bit integer types
- remove kernel text addresses from backtraces and register dumps
- use of WFE to implement long delay()s
- ACPI IORT updates from Lorenzo Pieralisi
- perf PMU driver for the Statistical Profiling Extension (SPE)
- perf PMU driver for Hisilicon's system PMUs
- misc cleanups and non-critical fixes"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (97 commits)
arm64: Make ARMV8_DEPRECATED depend on SYSCTL
arm64: Implement __lshrti3 library function
arm64: support __int128 on gcc 5+
arm64/sve: Add documentation
arm64/sve: Detect SVE and activate runtime support
arm64/sve: KVM: Hide SVE from CPU features exposed to guests
arm64/sve: KVM: Treat guest SVE use as undefined instruction execution
arm64/sve: KVM: Prevent guests from using SVE
arm64/sve: Add sysctl to set the default vector length for new processes
arm64/sve: Add prctl controls for userspace vector length management
arm64/sve: ptrace and ELF coredump support
arm64/sve: Preserve SVE registers around EFI runtime service calls
arm64/sve: Preserve SVE registers around kernel-mode NEON use
arm64/sve: Probe SVE capabilities and usable vector lengths
arm64: cpufeature: Move sys_caps_initialised declarations
arm64/sve: Backend logic for setting the vector length
arm64/sve: Signal handling support
arm64/sve: Support vector length resetting for new processes
arm64/sve: Core task context handling
arm64/sve: Low-level CPU setup
...
When the PMU driver is built as a module, the perf expects the
pmu->module to be valid, so that the driver is prevented from
being unloaded while it is in use. Fix the SPE pmu driver to
fill in this field.
Cc: Will Deacon <will.deacon@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Merge in ARM PMU and perf updates for 4.15:
- Support for the Statistical Profiling Extension
- Support for Hisilicon's SoC PMU
Signed-off-by: Will Deacon <will.deacon@arm.com>
arm_pmu interrupts are maked as PERCPU even when these are not local
physical interrupts to a single CPU. When using non-local interrupts,
interrupts marked as PERCPU will not get freed not disabled properly
by the PMU driver.
Check if interrupts are local to a single CPU with PERCPU_DEVID since
this is what the PMU driver really needs to know.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
This patch adds support for DDRC PMU driver in HiSilicon SoC chip, Each
DDRC has own control, counter and interrupt registers and is an separate
PMU. For each DDRC PMU, it has 8-fixed-purpose counters which have been
mapped to 8-events by hardware, it assumes that counter index is equal
to event code (0 - 7) in DDRC PMU driver. Interrupt is supported to
handle counter (32-bits) overflow.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Anurup M <anurup.m@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
L3 cache coherence is maintained by Hydra Home Agent (HHA) in HiSilicon
SoC. This patch adds support for HHA PMU driver, Each HHA has own
control, counter and interrupt registers and is an separate PMU. For
each HHA PMU, it has 16-programable counters and each counter is
free-running. Interrupt is supported to handle counter (48-bits)
overflow.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Anurup M <anurup.m@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
This patch adds support for L3C PMU driver in HiSilicon SoC chip, Each
L3C has own control, counter and interrupt registers and is an separate
PMU. For each L3C PMU, it has 8-programable counters and each counter
is free-running. Interrupt is supported to handle counter (48-bits)
overflow.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Anurup M <anurup.m@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
This patch adds support HiSilicon SoC uncore PMU driver framework and
interfaces.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Anurup M <anurup.m@huawei.com>
[will: Fix leader accounting in uncore group validation]
Signed-off-by: Will Deacon <will.deacon@arm.com>
The ARMv8.2 architecture introduces the optional Statistical Profiling
Extension (SPE).
SPE can be used to profile a population of operations in the CPU pipeline
after instruction decode. These are either architected instructions (i.e.
a dynamic instruction trace) or CPU-specific uops and the choice is fixed
statically in the hardware and advertised to userspace via caps/. Sampling
is controlled using a sampling interval, similar to a regular PMU counter,
but also with an optional random perturbation to avoid falling into patterns
where you continuously profile the same instruction in a hot loop.
After each operation is decoded, the interval counter is decremented. When
it hits zero, an operation is chosen for profiling and tracked within the
pipeline until it retires. Along the way, information such as TLB lookups,
cache misses, time spent to issue etc is captured in the form of a sample.
The sample is then filtered according to certain criteria (e.g. load
latency) that can be specified in the event config (described under
format/) and, if the sample satisfies the filter, it is written out to
memory as a record, otherwise it is discarded. Only one operation can
be sampled at a time.
The in-memory buffer is linear and virtually addressed, raising an
interrupt when it fills up. The PMU driver handles these interrupts to
give the appearance of a ring buffer, as expected by the AUX code.
The in-memory trace-like format is self-describing (though not parseable
in reverse) and written as a series of records, with each record
corresponding to a sample and consisting of a sequence of packets. These
packets are defined by the architecture, although some have CPU-specific
fields for recording information specific to the microarchitecture.
As a simple example, a record generated for a branch instruction may
consist of the following packets:
0 (Address) : Virtual PC of the branch instruction
1 (Type) : Conditional direct branch
2 (Counter) : Number of cycles taken from Dispatch to Issue
3 (Address) : Virtual branch target + condition flags
4 (Counter) : Number of cycles taken from Dispatch to Complete
5 (Events) : Mispredicted as not-taken
6 (END) : End of record
It is also possible to toggle properties such as timestamp packets in
each record.
This patch adds support for SPE in the form of a new perf driver.
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
acpi_disabled has been checked in armv8_pmu_driver_init and it shall
be ZERO in arm_pmu_acpi_probe, clean up this unnecessary check.
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Add event names so that common events can be
specified symbolically, for example:
l2cache_0/total-reads/,l2cache_0/cycles/
Event names are displayed in 'perf list'.
Signed-off-by: Neil Leeder <nleeder@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Free memory region, if arm_pmu_acpi_probe is not successful.
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Rather than continue adding CPU-specific event maps, instead look up by
default in the PMUv3 event map and only fallback to the CPU-specific maps
if either the event isn't described by PMUv3, or it is described but
the PMCEID registers say that it is unsupported by the current CPU.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Managed resources in the driver should be automatically cleaned up on
driver detach. It's unnecessary to manually free/unmmap these resources.
One of the manual cleanup causes static checkers to complain.
The bug is reported by Dan Carpenter <dan.carpenter@oracle.com> in [1]
[1] https://www.spinics.net/lists/arm-kernel/msg593012.html
This patch gets rid of all the unnecessary manual cleanup and properly
unregister all the registered PMU devices by the driver on driver detach.
Signed-off-by: Tai Nguyen <ttnguyen@apm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Since the PMU register interface is banked per CPU, CPU PMU interrrupts
cannot be handled by a CPU other than the one with the PMU asserting the
interrupt. This means that migrating PMU SPIs, as we do during a CPU
hotplug operation doesn't make any sense and can lead to the IRQ being
disabled entirely if we route a spurious IRQ to the new affinity target.
This has been observed in practice on AMD Seattle, where CPUs on the
non-boot cluster appear to take a spurious PMU IRQ when coming online,
which is routed to CPU0 where it cannot be handled.
This patch passes IRQF_PERCPU for PMU SPIs and forcefully sets their
affinity prior to requesting them, ensuring that they cannot
be migrated during hotplug events. This interacts badly with the DB8500
erratum workaround that ping-pongs the interrupt affinity from the handler,
so we avoid passing IRQF_PERCPU in that case by allowing the IRQ flags
to be overridden in the platdata.
Fixes: 3cf7ee98b8 ("drivers/perf: arm_pmu: move irq request/free into probe")
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The check for column exclusion did not verify that the event being
checked was an L2 event, and not a software event.
Software events should not be checked for column exclusion.
This resulted in a group with both software and L2 events sometimes
incorrectly rejecting the L2 event for column exclusion and
not counting it.
Add a check for PMU type before applying column exclusion logic.
Fixes: 21bdbb7102 ("perf: add qcom l2 cache perf events driver")
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Neil Leeder <nleeder@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Now that we have a custom printf format specifier, convert users of
full_name to use %pOF instead. This is preparation to remove storing
of the full path string for each node.
Signed-off-by: Rob Herring <robh@kernel.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Will Deacon <will.deacon@arm.com>
This patch adds support for SoC-wide (AKA uncore) Performance Monitoring
Unit version 3.
It can support up to
- 2 IOB PMU instances
- 8 L3C PMU instances
- 2 MCB PMU instances
- 8 MCU PMU instances
and these PMUs support 64 bit counter
Signed-off-by: Hoan Tran <hotran@apm.com>
[Mark: stop counters in _xgene_pmu_isr()]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
[will: make xgene_pmu_v3_ops static]
Signed-off-by: Will Deacon <will.deacon@arm.com>
This patch moves PMU leaf functions into a function pointer structure.
It helps code maintain and expasion easier.
Signed-off-by: Hoan Tran <hotran@apm.com>
[Mark: remove redundant cast]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
[will: make xgene_pmu_ops static]
Signed-off-by: Will Deacon <will.deacon@arm.com>
This patch parses PMU Subnode from a match table.
Signed-off-by: Hoan Tran <hotran@apm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>