# SPDX-License-Identifier: GPL-2.0-only
menuconfig CXL_BUS
	tristate "CXL (Compute Express Link) Devices Support"
	depends on PCI
	select FW_LOADER
	select FW_UPLOAD
	select PCI_DOE
	help
	  CXL is a bus that is electrically compatible with PCI Express, but
	  layers three protocols on that signalling (CXL.io, CXL.cache, and
	  CXL.mem). The CXL.cache protocol allows devices to hold cachelines
	  locally, the CXL.mem protocol allows devices to be fully coherent
	  memory targets, and the CXL.io protocol is equivalent to PCI Express.
	  Say 'y' to enable support for the configuration and management of
	  devices supporting these protocols.

if CXL_BUS

config CXL_PCI
	tristate "PCI manageability"
	default CXL_BUS
	help
	  The CXL specification defines a "CXL memory device" sub-class in the
	  PCI "memory controller" base class of devices. Devices identified by
	  this class code provide support for volatile and / or persistent
	  memory to be mapped into the system address map (Host-managed Device
	  Memory (HDM)).

	  Say 'y/m' to enable a driver that will attach to CXL memory expander
	  devices enumerated by the memory device class code for configuration
	  and management primarily via the mailbox interface. See Chapter 2.3
	  Type 3 CXL Device in the CXL 2.0 specification for more details.

	  If unsure say 'm'.

config CXL_MEM_RAW_COMMANDS
	bool "RAW Command Interface for Memory Devices"
	depends on CXL_PCI
	help
	  Enable CXL RAW command interface.

	  The CXL driver ioctl interface may assign a kernel ioctl command
	  number for each specification defined opcode. At any given point in
	  time the number of opcodes that the specification defines and a device
	  may implement may exceed the kernel's set of associated ioctl function
	  numbers. The mismatch is either by omission, because the specification
	  is too new, or by design. When prototyping new hardware, or developing
	  or debugging the driver, it is useful to be able to submit any
	  possible command to the hardware, even commands that may crash the
	  kernel due to their potential impact to memory currently in use by the
	  kernel.

	  If developing CXL hardware or the driver say Y, otherwise say N.

config CXL_ACPI
	tristate "CXL ACPI: Platform Support"
	depends on ACPI
	default CXL_BUS
	select ACPI_TABLE_LIB
	help
	  Enable support for host managed device memory (HDM) resources
	  published by a platform's ACPI CXL memory layout description. See
	  Chapter 9.14.1 CXL Early Discovery Table (CEDT) in the CXL 2.0
	  specification, and CXL Fixed Memory Window Structures (CEDT.CFMWS)
	  (https://www.computeexpresslink.org/spec-landing). The CXL core
	  consumes these resources to publish the root of a cxl_port decode
	  hierarchy to map regions that represent System RAM, or Persistent
	  Memory regions to be managed by LIBNVDIMM.

	  If unsure say 'm'.

config CXL_PMEM
	tristate "CXL PMEM: Persistent Memory Support"
	depends on LIBNVDIMM
	default CXL_BUS
	help
	  In addition to typical memory resources a platform may also advertise
	  support for persistent memory attached via CXL. This support is
	  managed via a bridge driver from CXL to the LIBNVDIMM subsystem.
	  Say 'y/m' to enable support for enumerating and provisioning the
	  persistent memory capacity of CXL memory expanders.

	  If unsure say 'm'.

config CXL_MEM
	tristate "CXL: Memory Expansion"
	depends on CXL_PCI
	default CXL_BUS
	help
	  The CXL.mem protocol allows a device to act as a provider of "System
	  RAM" and/or "Persistent Memory" that is fully coherent as if the
	  memory were attached to the typical CPU memory controller. This is
	  known as HDM "Host-managed Device Memory".

	  Say 'y/m' to enable a driver that will attach to CXL.mem devices for
	  memory expansion and control of HDM. See Chapter 9.13 in the CXL 2.0
	  specification for a detailed description of HDM.

	  If unsure say 'm'.

config CXL_PORT
	default CXL_BUS
	tristate

config CXL_SUSPEND
	def_bool y
	depends on SUSPEND && CXL_MEM

cxl/region: Add region creation support

CXL 2.0 allows for dynamic provisioning of new memory regions (system
physical address resources like "System RAM" and "Persistent Memory").
Whereas DDR and PMEM resources are conveyed statically at boot, CXL
allows for assembling and instantiating new regions from the available
capacity of CXL memory expanders in the system.

Sysfs with an "echo $region_name > $create_region_attribute" interface
is chosen as the mechanism to initiate the provisioning process. This
was chosen over ioctl() and netlink() to keep the configuration
interface entirely in a pseudo-fs interface, and it was chosen over
configfs since, aside from this one creation event, the interface is
read-mostly. I.e. configfs supports cases where an object is designed to
be provisioned each boot, like an iSCSI storage target, and CXL region
creation is mostly for PMEM regions which are created usually once
per-lifetime of a server instance. This is an improvement over nvdimm
that pre-created "seed" devices that tended to confuse users looking to
determine which devices are active and which are idle.

Recall that the major change that CXL brings over previous persistent
memory architectures is the ability to dynamically define new regions.
Compare that to drivers like 'nfit' where the region configuration is
statically defined by platform firmware.

Regions are created as a child of a root decoder that encompasses an
address space with constraints. When created through sysfs, the root
decoder is explicit. When created from an LSA's region structure a root
decoder will possibly need to be inferred by the driver.

Upon region creation through sysfs, a vacant region is created with a
unique name. Regions have a number of attributes that must be configured
before the region can be bound to the driver where HDM decoder
programming is completed.

An example of creating a new region:

- Allocate a new region name:
    region=$(cat /sys/bus/cxl/devices/decoder0.0/create_pmem_region)

- Create a new region by name:
    while
    region=$(cat /sys/bus/cxl/devices/decoder0.0/create_pmem_region)
    ! echo $region > /sys/bus/cxl/devices/decoder0.0/create_pmem_region
    do true; done

- Region now exists in sysfs:
    stat -t /sys/bus/cxl/devices/decoder0.0/$region

- Delete the region, and name:
    echo $region > /sys/bus/cxl/devices/decoder0.0/delete_region

Signed-off-by: Ben Widawsky <bwidawsk@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165784333909.1758207.794374602146306032.stgit@dwillia2-xfh.jf.intel.com
[djbw: simplify locking, reword changelog]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-06-09 01:28:34 +08:00

config CXL_REGION
	bool "CXL: Region Support"
	default CXL_BUS
	# For MAX_PHYSMEM_BITS
	depends on SPARSEMEM
	select MEMREGION
	select GET_FREE_REGION
	help
	  Enable the CXL core to enumerate and provision CXL regions. A CXL
	  region is defined by one or more CXL expanders that decode a given
	  system-physical address range. For CXL regions established by
	  platform-firmware this option enables memory error handling to
	  identify the devices participating in a given interleaved memory
	  range. Otherwise, platform-firmware managed CXL is enabled by being
	  placed in the system address map and does not need a driver.

	  If unsure say 'y'

config CXL_REGION_INVALIDATION_TEST
	bool "CXL: Region Cache Management Bypass (TEST)"
	depends on CXL_REGION
	help
	  CXL Region management and security operations potentially invalidate
	  the content of CPU caches without notifying those caches to
	  invalidate the affected cachelines. The CXL Region driver attempts
	  to invalidate caches when those events occur. If that invalidation
	  fails the region will fail to enable. Cache invalidation fails when
	  the CPU does not provide a cache invalidation mechanism. For
	  example, usage of wbinvd is restricted to bare metal x86. However,
	  for testing purposes toggling this option can disable that data
	  integrity safety and proceed with enabling regions when there might
	  be conflicting contents in the CPU cache.

	  If unsure, or if this kernel is meant for production environments,
	  say N.

perf: CXL Performance Monitoring Unit driver

CXL rev 3.0 introduces a standard performance monitoring hardware
block to CXL. Instances are discovered using CXL Register Locator DVSEC
entries. Each CXL component may have multiple PMUs.

This initial driver supports a subset of types of counter.
It supports counters that are either fixed or configurable, but requires
that they support the ability to freeze and write value whilst frozen.

Development done with QEMU model which will be posted shortly.

Example:

$ perf stat -a -e cxl_pmu_mem0.0/h2d_req_snpcur/ -e cxl_pmu_mem0.0/h2d_req_snpdata/ -e cxl_pmu_mem0.0/clock_ticks/ sleep 1

Performance counter stats for 'system wide':

 96,757,023,244,321      cxl_pmu_mem0.0/h2d_req_snpcur/
 96,757,023,244,365      cxl_pmu_mem0.0/h2d_req_snpdata/
193,514,046,488,653      cxl_pmu_mem0.0/clock_ticks/

       1.090539600 seconds time elapsed

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20230526095824.16336-5-Jonathan.Cameron@huawei.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-05-26 17:58:23 +08:00

config CXL_PMU
	tristate "CXL Performance Monitoring Unit"
	default CXL_BUS
	depends on PERF_EVENTS
	help
	  Support performance monitoring as defined in CXL rev 3.0
	  section 13.2: Performance Monitoring. CXL components may have
	  one or more CXL Performance Monitoring Units (CPMUs).

	  Say 'y/m' to enable a driver that will attach to performance
	  monitoring units and provide standard perf based interfaces.

	  If unsure say 'm'.

endif
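
# Illustrative sketch only, not part of the upstream file: the .config
# values one might expect when CXL_BUS is built as a module and the
# dependencies named above (PCI, ACPI, LIBNVDIMM, SPARSEMEM, SUSPEND,
# PERF_EVENTS) are all satisfied. It assumes the tristate options follow
# their "default CXL_BUS" and that the bool options promote a modular
# default or dependency to 'y'. Actual values depend on the architecture
# and on the rest of the configuration.
#
#   CONFIG_CXL_BUS=m
#   CONFIG_CXL_PCI=m
#   # CONFIG_CXL_MEM_RAW_COMMANDS is not set
#   CONFIG_CXL_ACPI=m
#   CONFIG_CXL_PMEM=m
#   CONFIG_CXL_MEM=m
#   CONFIG_CXL_PORT=m
#   CONFIG_CXL_SUSPEND=y
#   CONFIG_CXL_REGION=y
#   # CONFIG_CXL_REGION_INVALIDATION_TEST is not set
#   CONFIG_CXL_PMU=m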