Currently the PMU device appears directly under /sys/devices/
Only root busses should appear there, so instead assign the pmu->dev
parent to be the PCI device.
Link: https://lore.kernel.org/linux-cxl/ZCLI9A40PJsyqAmq@kroah.com/
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20240412161057.14099-31-Jonathan.Cameron@huawei.com
When perf_init_event() calls perf_try_init_event() to init pmu driver,
searches for the next pmu driver only when the return value is -ENOENT.
Therefore, hisi_ptt_pmu_event_init() needs to check the type at the
beginning of the function.
Otherwise, in the case of perf-task mode, perf_try_init_event() returns
-EOPNOTSUPP and skips subsequent pmu drivers, causes perf_init_event() to
fail.
Fixes: ff0de066b4 ("hwtracing: hisi_ptt: Add trace function support for HiSilicon PCIe Tune and Trace device")
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Reviewed-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20240108121906.3514820-1-yangjihong1@huawei.com
In the current implementation, there're 4*4MiB trace buffer and hardware
will fill the buffer one by one. The driver will get notified if one
buffer is full and then copy data to the AUX buffer. If there's no
enough room for the next trace buffer, we'll commit the AUX buffer to
the perf core and try to apply a new one. In a typical configuration
the AUX buffer will be 16MiB, so we'll commit the data after the whole
AUX buffer is occupied. Then the driver cannot apply a new AUX buffer
immediately until the committed data is consumed by userspace and then
there's room in the AUX buffer again.
This patch tries to optimize this by commit the data after one single
trace buffer is filled. Since there's still room in the AUX buffer,
driver can apply a new one without failure and don't need to wait for
the userspace to consume the data.
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20231010084731.30450-4-yangyicong@huawei.com
On trace end we disable the hardware but leave the interrupt
unmasked. Mask the interrupt to make the process reverse to
the start. No actual issue since hardware should send no
interrupt after disabled.
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20231010084731.30450-2-yangyicong@huawei.com
PTT is an uncore PMU and shouldn't be attached to any task. Block
the usage in pmu::event_init().
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20231010084731.30450-5-yangyicong@huawei.com
Handle the trace interrupt in the hardirq context, make sure the irq
core won't threaded it by declaring IRQF_NO_THREAD and userspace won't
balance it by declaring IRQF_NOBALANCING. Otherwise we may violate the
synchronization requirements of the perf core, referenced to the
change of arm-ccn PMU
commit 0811ef7e2f ("bus: arm-ccn: fix PMU interrupt flags").
In the interrupt handler we mainly doing 2 things:
- Copy the data from the local DMA buffer to the AUX buffer
- Commit the data in the AUX buffer
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
[ Fixed commit description to suppress checkpatch warning ]
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20231010084731.30450-3-yangyicong@huawei.com
When start trace with perf option "-C $cpu" and immediately stop it
with SIGTERM or others, the perf core will invoke pmu::read() while
the driver doesn't implement it. Add a dummy pmu::read() to avoid
any issues.
Fixes: ff0de066b4 ("hwtracing: hisi_ptt: Add trace function support for HiSilicon PCIe Tune and Trace device")
Signed-off-by: Junhao He <hejunhao3@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20231010084731.30450-6-yangyicong@huawei.com
PCI core API pci_dev_id() can be used to get the BDF number for a pci
device. We don't need to compose it mannually using PCI_DEVID(). Use
pci_dev_id() to simplify the code a little bit.
Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Yicong Yang <yangyicong@hisilicon.com>
Reviewed-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230808030835.167538-1-wangxiongfeng2@huawei.com
We're using pci_irq_vector() to obtain the interrupt number and then
bind it to the CPU start perf under the protection of spinlock in
pmu::start(). pci_irq_vector() might sleep since [1] because it will
call msi_domain_get_virq() to get the MSI interrupt number and it
needs to acquire dev->msi.data->mutex. Getting a mutex will sleep on
contention. So use pci_irq_vector() in an atomic context is problematic.
This patch cached the interrupt number in the probe() and uses the
cached data instead to avoid potential sleep.
[1] commit 82ff8e6b78 ("PCI/MSI: Use msi_get_virq() in pci_get_vector()")
Fixes: ff0de066b4 ("hwtracing: hisi_ptt: Add trace function support for HiSilicon PCIe Tune and Trace device")
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230621092804.15120-6-yangyicong@huawei.com
The PTT trace collects PCIe TLP headers from the PCIe link and don't
have the ability to exclude certain context. It doesn't support itrace
as well. So replace PERF_PMU_CAP_ITRACE with PERF_PMU_CAP_NO_EXCLUDE.
This will greatly save the storage of final data. Tested tracing idle
link for ~15s, without this patch we'll collect ~28.682MB data for
additional information and with this patch it reduced to ~0.226MB.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Tested-by: Junhao He <hejunhao3@huawei.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230621092804.15120-5-yangyicong@huawei.com
The PTT can only filter the traced TLP headers by the Root Ports or the
Requester ID of the Endpoint, which are located on the same PCIe core of
the PTT device. The filter value used is derived from the BDF number of
the supported Root Port or the Endpoint. It's not friendly enough for the
users since it requires the user to be familiar enough with the platform
and calculate the filter value manually.
This patch export the available filters through sysfs. Each available
filters is presented as an individual file with the name of the BDF
number of the related PCIe device. The files are created under
$(PTT PMU dir)/available_root_port_filters and
$(PTT PMU dir)/available_requester_filters respectively. The filter
value can be known by reading the related file.
Then the users can easily know the available filters for trace and get
the filter values without calculating.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230621092804.15120-4-yangyicong@huawei.com
The PCIe devices supported by the PTT trace can be removed/rescanned by
hotplug or through sysfs. Add support for dynamically updating the
available filter list by registering a PCI bus notifier block. Then user
can always get latest information about available tracing filters and
driver can block the invalid filters of which related devices no longer
exist in the system.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230621092804.15120-3-yangyicong@huawei.com
Factor out the allocation and release of filters. This will make it easier
to extend and manage the function of the filter.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230621092804.15120-2-yangyicong@huawei.com
The PTT device can only support the devices on the same PCIe core,
within BDF range [lower_bdf, upper_bdf]. It's not correct to assume
the devices on the root bus are from the same PCIe core, there are
cases that root ports from different PCIe core are sharing the same
bus. So check when initializing the filters list.
Fixes: ff0de066b4 ("hwtracing: hisi_ptt: Add trace function support for HiSilicon PCIe Tune and Trace device")
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20230112112201.16283-1-yangyicong@huawei.com
drivers/hwtracing/ptt/hisi_ptt.c:13:10: fatal error: linux/dma-iommu.h: No such file or directory
13 | #include <linux/dma-iommu.h>
| ^~~~~~~~~~~~~~~~~~~
Caused by:
commit ff0de066b4 ("hwtracing: hisi_ptt: Add trace function support for HiSilicon PCIe Tune and Trace device")
interacting with:
commit f2042ed21d ("iommu/dma: Make header private")
from the iommu tree.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Yicong Yang <yangyicong@hisilicon.com>
[Fixed subject line and added changelog text]
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Add tune function for the HiSilicon Tune and Trace device. The interface
of tune is exposed through sysfs attributes of PTT PMU device.
Acked-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20220816114414.4092-4-yangyicong@huawei.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
HiSilicon PCIe tune and trace device(PTT) is a PCIe Root Complex integrated
Endpoint(RCiEP) device, providing the capability to dynamically monitor and
tune the PCIe traffic and trace the TLP headers.
Add the driver for the device to enable the trace function. Register PMU
device of PTT trace, then users can use trace through perf command. The
driver makes use of perf AUX trace function and support the following
events to configure the trace:
- filter: select Root port or Endpoint to trace
- type: select the type of traced TLP headers
- direction: select the direction of traced TLP headers
- format: select the data format of the traced TLP headers
This patch initially add basic trace support of PTT device.
Acked-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20220816114414.4092-3-yangyicong@huawei.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>