linux/drivers/pci/pcie/Kconfig
Robert Richter 0a867568bb PCI/AER: Forward RCH downstream port-detected errors to the CXL.mem dev handler
In Restricted CXL Device (RCD) mode a CXL device is exposed as an
RCiEP, but CXL downstream and upstream ports are not enumerated and
not visible in the PCIe hierarchy. [1] Protocol and link errors from
these non-enumerated ports are signaled as internal AER errors, either
Uncorrectable Internal Errors (UIE) or Corrected Internal Errors (CIE),
via an RCEC.

Restricted CXL host (RCH) downstream port-detected errors have the
Requester ID of the RCEC set in the RCEC's AER Error Source ID
register. A CXL handler must then inspect the error status in various
CXL registers residing in the dport's component register space (CXL
RAS capability) or the dport's RCRB (PCIe AER extended
capability). [2]

Errors showing up in the RCEC's error handler must be handled and
connected to the CXL subsystem. Implement this by forwarding the error
to all CXL devices below the RCEC. Since the entire CXL device is
controlled only using PCIe Configuration Space of device 0, function
0, only pass it there [3]. Error handling is limited to currently
supported devices with the Memory Device class code set (CXL Type 3
Device, PCI_CLASS_MEMORY_CXL, 502h). Downstream port errors are
handled in the device's cxl_pci driver. Support for other CXL Device
Types (e.g. a CXL.cache Device) can be added later.
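
The forwarding rule described above can be sketched in plain userspace C. This is an illustration only, not the kernel implementation: the struct and the function names (sketch_dev, is_cxl_mem_dev, forward_rch_error, sketch_handler) are hypothetical, and only the class code 502h and the device 0, function 0 rule are taken from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Class code of a CXL memory device (502h), as cited in the text. */
#define SKETCH_CLASS_MEMORY_CXL 0x0502u

/* Hypothetical stand-in for a PCI device found below an RCEC. */
struct sketch_dev {
	uint8_t device;       /* PCI device number */
	uint8_t function;     /* PCI function number */
	uint16_t class_code;  /* base class << 8 | sub-class */
	void (*err_handler)(struct sketch_dev *dev); /* NULL if no driver is bound */
	int handled;          /* bumped by the example handler below */
};

/* True only for device 0, function 0 with the CXL memory class code. */
int is_cxl_mem_dev(const struct sketch_dev *dev)
{
	assert(dev != NULL);
	return dev->device == 0 && dev->function == 0 &&
	       dev->class_code == SKETCH_CLASS_MEMORY_CXL;
}

/*
 * Walk all devices below the RCEC and call the handler of each matching
 * device. Devices without a handler are skipped, since the CXL driver
 * might not be present. Returns the number of handlers invoked.
 */
int forward_rch_error(struct sketch_dev *devs, size_t n)
{
	int called = 0;

	for (size_t i = 0; i < n; i++) {
		if (!is_cxl_mem_dev(&devs[i]) || !devs[i].err_handler)
			continue;
		devs[i].err_handler(&devs[i]);
		called++;
	}
	return called;
}

/* Example handler: only records that it ran. */
void sketch_handler(struct sketch_dev *dev)
{
	dev->handled++;
}
```

In this sketch a device on function 1, or one with a different class code, is silently skipped, which mirrors the "only pass it there" restriction.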

To handle downstream port errors in addition to errors directed to the
CXL endpoint device, a handler must also inspect the CXL RAS and PCIe
AER capabilities of the CXL downstream port the device is connected
to.

Since CXL downstream port errors are signaled using internal errors,
the handler requires those errors to be unmasked. This is the subject
of a follow-on patch.
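
As a rough illustration of what unmasking means at the register level, the sketch below models the AER mask and status words as plain integers. It assumes the internal error bit positions of the PCIe AER capability (bit 22 in the uncorrectable registers, bit 14 in the correctable ones, which Linux names PCI_ERR_UNC_INTN and PCI_ERR_COR_INTERNAL). The SKETCH_* macros and both function names are hypothetical and are not taken from the patch.

```c
#include <stdint.h>

/* Assumed bit positions of the internal error bits in the AER registers. */
#define SKETCH_UNC_INTERNAL 0x00400000u /* Uncorrectable Internal Error, bit 22 */
#define SKETCH_COR_INTERNAL 0x00004000u /* Corrected Internal Error, bit 14 */

/* Return the mask value with the internal error bit cleared (unmasked). */
uint32_t unmask_internal(uint32_t mask, int correctable)
{
	uint32_t bit = correctable ? SKETCH_COR_INTERNAL : SKETCH_UNC_INTERNAL;

	return mask & ~bit;
}

/*
 * Model of signaling: an internal error reaches the handler only if its
 * status bit is set and the matching mask bit is clear.
 */
int internal_error_signaled(uint32_t status, uint32_t mask, int correctable)
{
	uint32_t bit = correctable ? SKETCH_COR_INTERNAL : SKETCH_UNC_INTERNAL;

	return (status & ~mask & bit) != 0;
}
```

With the mask bit still set, a pending internal error is not signaled in this model, which is why the handler depends on the unmasking step.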

The reason for choosing this implementation is that the AER service
driver claims the RCEC device, but does not allow a custom handler to
be registered for it to support CXL. Hard-wiring the RCEC to a CXL
handler does not work, as the CXL subsystem might not always be
present. The alternative, extending portdrv to allow the registration
of a custom RCEC error handler, isn't worth doing, as CXL would be its
only user. Instead, just check for a CXL RCEC and pass the error down
to the connected CXL device's error handler. With this approach the
code can be implemented entirely in the PCIe AER driver and is
independent of the CXL subsystem. The CXL driver only provides the
handler.

[1] CXL 3.0 spec, 9.11.8 CXL Devices Attached to an RCH
[2] CXL 3.0 spec, 12.2.1.1 RCH Downstream Port-detected Errors
[3] CXL 3.0 spec, 8.1.3 PCIe DVSEC for CXL Devices

Co-developed-by: Terry Bowman <terry.bowman@amd.com>
Signed-off-by: Terry Bowman <terry.bowman@amd.com>
Signed-off-by: Robert Richter <rrichter@amd.com>
Cc: Oliver O'Halloran <oohall@gmail.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-pci@vger.kernel.org
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20231018171713.1883517-18-rrichter@amd.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-10-27 20:13:39 -07:00


# SPDX-License-Identifier: GPL-2.0
#
# PCI Express Port Bus Configuration
#
config PCIEPORTBUS
	bool "PCI Express Port Bus support"
	default y if USB4
	help
	  This enables PCI Express Port Bus support. Users can then enable
	  support for Native Hot-Plug, Advanced Error Reporting, Power
	  Management Events, and Downstream Port Containment.

#
# Include service Kconfig here
#
config HOTPLUG_PCI_PCIE
	bool "PCI Express Hotplug driver"
	depends on HOTPLUG_PCI && PCIEPORTBUS
	default y if USB4
	help
	  Say Y here if you have a motherboard that supports PCIe native
	  hotplug.

	  Thunderbolt/USB4 PCIe tunneling depends on native PCIe hotplug.

	  When in doubt, say N.

config PCIEAER
	bool "PCI Express Advanced Error Reporting support"
	depends on PCIEPORTBUS
	select RAS
	help
	  This enables PCI Express Root Port Advanced Error Reporting
	  (AER) driver support. Error reporting messages sent to Root
	  Port will be handled by PCI Express AER driver.

config PCIEAER_INJECT
	tristate "PCI Express error injection support"
	depends on PCIEAER
	select GENERIC_IRQ_INJECTION
	help
	  This enables PCI Express Root Port Advanced Error Reporting
	  (AER) software error injector.

	  Debugging AER code is quite difficult because it is hard
	  to trigger various real hardware errors. Software-based
	  error injection can fake almost all kinds of errors with the
	  help of a user space helper tool aer-inject, which can be
	  gotten from:
	     https://git.kernel.org/cgit/linux/kernel/git/gong.chen/aer-inject.git/

config PCIEAER_CXL
	bool "PCI Express CXL RAS support"
	default y
	depends on PCIEAER && CXL_PCI
	help
	  Enables CXL error handling.

	  If unsure, say Y.

#
# PCI Express ECRC
#
config PCIE_ECRC
	bool "PCI Express ECRC settings control"
	depends on PCIEAER
	help
	  Used to override firmware/bios settings for PCI Express ECRC
	  (transaction layer end-to-end CRC checking).

	  When in doubt, say N.

#
# PCI Express ASPM
#
config PCIEASPM
	bool "PCI Express ASPM control" if EXPERT
	default y
	help
	  This enables OS control over PCI Express ASPM (Active State
	  Power Management) and Clock Power Management. ASPM supports
	  state L0/L0s/L1.

	  ASPM is initially set up by the firmware. With this option enabled,
	  Linux can modify this state in order to disable ASPM on known-bad
	  hardware or configurations and enable it when known-safe.

	  ASPM can be disabled or enabled at runtime via
	  /sys/module/pcie_aspm/parameters/policy

	  When in doubt, say Y.

choice
	prompt "Default ASPM policy"
	default PCIEASPM_DEFAULT
	depends on PCIEASPM

config PCIEASPM_DEFAULT
	bool "BIOS default"
	depends on PCIEASPM
	help
	  Use the BIOS defaults for PCI Express ASPM.

config PCIEASPM_POWERSAVE
	bool "Powersave"
	depends on PCIEASPM
	help
	  Enable PCI Express ASPM L0s and L1 where possible, even if the
	  BIOS did not.

config PCIEASPM_POWER_SUPERSAVE
	bool "Power Supersave"
	depends on PCIEASPM
	help
	  Same as PCIEASPM_POWERSAVE, except it also enables L1 substates where
	  possible. This would result in higher power savings while staying in L1
	  where the components support it.

config PCIEASPM_PERFORMANCE
	bool "Performance"
	depends on PCIEASPM
	help
	  Disable PCI Express ASPM L0s and L1, even if the BIOS enabled them.
endchoice

config PCIE_PME
	def_bool y
	depends on PCIEPORTBUS && PM

config PCIE_DPC
	bool "PCI Express Downstream Port Containment support"
	depends on PCIEPORTBUS && PCIEAER
	help
	  This enables PCI Express Downstream Port Containment (DPC)
	  driver support. DPC events from Root and Downstream ports
	  will be handled by the DPC driver. If your system doesn't
	  have this capability or you do not want to use this feature,
	  it is safe to answer N.

config PCIE_PTM
	bool "PCI Express Precision Time Measurement support"
	help
	  This enables PCI Express Precision Time Measurement (PTM)
	  support.

	  This is only useful if you have devices that support PTM, but it
	  is safe to enable even if you don't.

config PCIE_EDR
	bool "PCI Express Error Disconnect Recover support"
	depends on PCIE_DPC && ACPI
	help
	  This option adds Error Disconnect Recover support as specified
	  in the Downstream Port Containment Related Enhancements ECN to
	  the PCI Firmware Specification r3.2. Enable this if you want to
	  support hybrid DPC model which uses both firmware and OS to
	  implement DPC.