linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-11-30 23:54:04 +08:00

Author	SHA1	Message	Date
Dan Williams	ee817acaa0	Merge branch 'for-6.3/cxl-ram-region' into cxl/next Pick up some fixes from exposure of for-6.3/cxl-ram-region in linux-next.	2023-02-14 14:58:26 -08:00
Dave Jiang	248529edc8	cxl: add RAS status unmasking for CXL By default the CXL RAS mask registers bits are defaulted to 1's and suppress all error reporting. If the kernel has negotiated ownership of error handling for CXL then unmask the mask registers by writing 0s. PCI_EXP_DEVCTL capability is checked to see uncorrectable or correctable errors bits are set before unmasking the respective errors. Acked-by: Bjorn Helgaas <bhelgaas@google.com> # pci_regs.h Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/167639402301.778884.12556849214955646539.stgit@djiang5-mobl3.local Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-02-14 14:12:54 -08:00
Dave Jiang	1922a6dc05	cxl: remove unnecessary calling of pci_enable_pcie_error_reporting() With this [1] commit upstream, pci_enable_pci_error_report() is no longer necessary for the driver to call. Remove call and related cleanups. [1]: `f26e58bf6f` ("PCI/AER: Enable error reporting when AER is native") Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/167632012093.4153151.5360778069735064322.stgit@djiang5-mobl3.local Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-02-14 14:05:30 -08:00
Arnd Bergmann	7abcb0b106	cxl: avoid returning uninitialized error code The new cxl_add_to_region() function returns an uninitialized value on success: drivers/cxl/core/region.c:2628:6: error: variable 'rc' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized] if (IS_ERR(cxlr)) { ^~~~~~~~~~~~ drivers/cxl/core/region.c:2654:9: note: uninitialized use occurs here return rc; Simplify the logic to have the rc variable always initialized in the same place. Fixes: `a32320b71f` ("cxl/region: Add region autodiscovery") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20230213101220.3821689-1-arnd@kernel.org Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-02-14 08:36:34 -08:00
Dan Williams	f57aec443c	cxl/pmem: Fix nvdimm registration races A loop of the form: while true; do modprobe cxl_pci; modprobe -r cxl_pci; done ...fails with the following crash signature: BUG: kernel NULL pointer dereference, address: 0000000000000040 [..] RIP: 0010:cxl_internal_send_cmd+0x5/0xb0 [cxl_core] [..] Call Trace: <TASK> cxl_pmem_ctl+0x121/0x240 [cxl_pmem] nvdimm_get_config_data+0xd6/0x1a0 [libnvdimm] nd_label_data_init+0x135/0x7e0 [libnvdimm] nvdimm_probe+0xd6/0x1c0 [libnvdimm] nvdimm_bus_probe+0x7a/0x1e0 [libnvdimm] really_probe+0xde/0x380 __driver_probe_device+0x78/0x170 driver_probe_device+0x1f/0x90 __device_attach_driver+0x85/0x110 bus_for_each_drv+0x7d/0xc0 __device_attach+0xb4/0x1e0 bus_probe_device+0x9f/0xc0 device_add+0x445/0x9c0 nd_async_device_register+0xe/0x40 [libnvdimm] async_run_entry_fn+0x30/0x130 ...namely that the bottom half of async nvdimm device registration runs after the CXL has already torn down the context that cxl_pmem_ctl() needs. Unlike the ACPI NFIT case that benefits from launching multiple nvdimm device registrations in parallel from those listed in the table, CXL is already marked PROBE_PREFER_ASYNCHRONOUS. So provide for a synchronous registration path to preclude this scenario. Fixes: `21083f5152` ("cxl/pmem: Register 'pmem' / cxl_nvdimm devices") Cc: <stable@vger.kernel.org> Reported-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-02-13 17:01:05 -08:00
Dan Williams	b8b9ffced0	Merge branch 'for-6.3/cxl-ram-region' into cxl/next Include the support for enumerating and provisioning ram regions for v6.3. This also include a default policy change for ram / volatile device-dax instances to assign them to the dax_kmem driver by default.	2023-02-10 18:11:01 -08:00
Dan Williams	dfd423e0a3	Merge branch 'for-6.3/cxl' into cxl/next Pick up some final miscellaneous updates for v6.3 including support for communicating 'exclusive' and 'enabled' state of commands.	2023-02-10 18:05:59 -08:00
Ira Weiny	814a15f3b4	cxl/uapi: Tag commands from cxl_query_cmd() It was pointed out that commands not supported by the device or excluded by the kernel were being returned in cxl_query_cmd().[1] While libcxl correctly handles failing commands, it is more efficient to not issue an invalid command in the first place. This can't be done without additional information being returned from cxl_query_cmd(). In addition, information about the availability of commands can be useful for debugging. Add flags to struct cxl_command_info which reflect if a command is enabled and/or exclusive to the kernel. [1] https://lore.kernel.org/all/63b4ec4e37cc1_5178e2941d@dwillia2-xfh.jf.intel.com.notmuch/ Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20221222-cxl-misc-v4-3-62f701c1cdd1@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-02-10 17:54:45 -08:00
Ira Weiny	860334e590	cxl/mem: Remove unused CXL_CMD_FLAG_NONE define CXL_CMD_FLAG_NONE is not used, remove it. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20221222-cxl-misc-v4-1-62f701c1cdd1@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-02-10 17:54:45 -08:00
Dan Williams	09d09e04d2	cxl/dax: Create dax devices for CXL RAM regions While platform firmware takes some responsibility for mapping the RAM capacity of CXL devices present at boot, the OS is responsible for mapping the remainder and hot-added devices. Platform firmware is also responsible for identifying the platform general purpose memory pool, typically DDR attached DRAM, and arranging for the remainder to be 'Soft Reserved'. That reservation allows the CXL subsystem to route the memory to core-mm via memory-hotplug (dax_kmem), or leave it for dedicated access (device-dax). The new 'struct cxl_dax_region' object allows for a CXL memory resource (region) to be published, but also allow for udev and module policy to act on that event. It also prevents cxl_core.ko from having a module loading dependency on any drivers/dax/ modules. Tested-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/167602003896.1924368.10335442077318970468.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-02-10 17:33:45 -08:00
Dan Williams	3d8f7ccaa6	tools/testing/cxl: Define a fixed volatile configuration to parse Take two endpoints attached to the first switch on the first host-bridge in the cxl_test topology and define a pre-initialized region. This is a x2 interleave underneath a x1 CXL Window. $ modprobe cxl_test $ # cxl list -Ru { "region":"region3", "resource":"0xf010000000", "size":"512.00 MiB (536.87 MB)", "interleave_ways":2, "interleave_granularity":4096, "decode_state":"commit" } Tested-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/167602000547.1924368.11613151863880268868.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-02-10 17:33:04 -08:00
Dan Williams	a32320b71f	cxl/region: Add region autodiscovery Region autodiscovery is an asynchronous state machine advanced by cxl_port_probe(). After the decoders on an endpoint port are enumerated they are scanned for actively enabled instances. Each active decoder is flagged for auto-assembly CXL_DECODER_F_AUTO and attached to a region. If a region does not already exist for the address range setting of the decoder one is created. That creation process may race with other decoders of the same region being discovered since cxl_port_probe() is asynchronous. A new 'struct cxl_root_decoder' lock, @range_lock, is introduced to mitigate that race. Once all decoders have arrived, "p->nr_targets == p->interleave_ways", they are sorted by their relative decode position. The sort algorithm involves finding the point in the cxl_port topology where one leg of the decode leads to deviceA and the other deviceB. At that point in the topology the target order in the 'struct cxl_switch_decoder' indicates the relative position of those endpoint decoders in the region. >From that point the region goes through the same setup and validation steps as user-created regions, but instead of programming the decoders it validates that driver would have written the same values to the decoders as were already present. Tested-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/167601999958.1924368.9366954455835735048.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-02-10 17:32:55 -08:00
Dan Williams	32ce3f185b	cxl/port: Split endpoint and switch port probe Jonathan points out that the shared code between the switch and endpoint case is small. Before adding another is_cxl_endpoint() conditional, just split the two cases. Rather than duplicate the "Couldn't enumerate decoders" error message take the opportunity to improve the error messages in devm_cxl_enumerate_decoders(). Reported-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/167601999378.1924368.15071142145866277623.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-02-10 17:32:50 -08:00
Dan Williams	45d235c56b	cxl/region: Enable CONFIG_CXL_REGION to be toggled Add help text and a label so the CXL_REGION config option can be toggled. This is mainly to enable compile testing without region support. Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Gregory Price <gregory.price@memverge.com> Tested-by: Fan Ni <fan.ni@samsung.com> Link: https://lore.kernel.org/r/167601998765.1924368.258370414771847699.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-02-10 17:32:43 -08:00
Dan Williams	93c177fd6f	kernel/range: Uplevel the cxl subsystem's range_contains() helper In support of the CXL subsystem's use of 'struct range' to track decode address ranges, add a common range_contains() implementation with identical semantics as resource_contains(); The existing 'range_contains()' in lib/stackinit_kunit.c is namespaced with a 'stackinit_' prefix. Cc: Kees Cook <keescook@chromium.org> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Gregory Price <gregory.price@memverge.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Fan Ni <fan.ni@samsung.com> Link: https://lore.kernel.org/r/167601998163.1924368.6067392174077323935.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-02-10 17:32:37 -08:00
Dan Williams	9995576cef	cxl/region: Move region-position validation to a helper In preparation for region autodiscovery, that needs all devices discovered before their relative position in the region can be determined, consolidate all position dependent validation in a helper. Recall that in the on-demand region creation flow the end-user picks the position of a given endpoint decoder in a region. In the autodiscovery case the position of an endpoint decoder can only be determined after all other endpoint decoders that claim to decode the region's address range have been enumerated and attached. So, in the autodiscovery case endpoint decoders may be attached before their relative position is known. Once all decoders arrive, then positions can be determined and validated with cxl_region_validate_position() the same as user initiated on-demand creation. Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Tested-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/167601997584.1924368.4615769326126138969.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-02-10 17:32:30 -08:00
Dan Williams	86987c7662	cxl/region: Cleanup target list on attach error Jonathan noticed that the target list setup is not unwound completely upon error. Undo all the setup in the 'err_decrement:' exit path. Fixes: `27b3f8d138` ("cxl/region: Program target lists") Reported-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com> Link: http://lore.kernel.org/r/20230208123031.00006990@Huawei.com Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/167601996980.1924368.390423634911157277.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-02-10 17:32:21 -08:00
Dan Williams	3528b1e101	cxl/region: Refactor attach_target() for autodiscovery Region autodiscovery is the process of kernel creating 'struct cxl_region' object to represent active CXL memory ranges it finds already active in hardware when the driver loads. Typically this happens when platform firmware establishes CXL memory regions and then publishes them in the memory map. However, this can also happen in the case of kexec-reboot after the kernel has created regions. In the autodiscovery case the region creation process starts with a known endpoint decoder. Refactor attach_target() into a helper that is suitable to be called from either sysfs, for runtime region creation, or from cxl_port_probe() after it has enumerated all endpoint decoders. The cxl_port_probe() context is an async device-core probing context, so it is not appropriate to allow SIGTERM to interrupt the assembly process. Refactor attach_target() to take @cxled and @state as arguments where @state indicates whether waiting from the region rwsem is interruptible or not. No behavior change is intended. Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Fan Ni <fan.ni@samsung.com> Link: https://lore.kernel.org/r/167601996393.1924368.2202255054618600069.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-02-10 17:32:15 -08:00
Dan Williams	6e09926418	cxl/region: Add volatile region creation support Expand the region creation infrastructure to enable 'ram' (volatile-memory) regions. The internals of create_pmem_region_store() and create_pmem_region_show() are factored out into helpers __create_region() and __create_region_show() for the 'ram' case to reuse. Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Gregory Price <gregory.price@memverge.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Fan Ni <fan.ni@samsung.com> Link: https://lore.kernel.org/r/167601995775.1924368.352616146815830591.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-02-10 17:32:10 -08:00
Dan Williams	1b9b7a6fd6	cxl/region: Validate region mode vs decoder mode In preparation for a new region mode, do not, for example, allow 'ram' decoders to be assigned to 'pmem' regions and vice versa. Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Gregory Price <gregory.price@memverge.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Fan Ni <fan.ni@samsung.com> Link: https://lore.kernel.org/r/167601995111.1924368.7459128614177994602.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-02-10 17:32:05 -08:00
Dan Williams	a8e7d558f7	cxl/region: Support empty uuids for non-pmem regions Shipping versions of the cxl-cli utility expect all regions to have a 'uuid' attribute. In preparation for 'ram' regions, update the 'uuid' attribute to return an empty string which satisfies the current expectations of 'cxl list -R'. Otherwise, 'cxl list -R' fails in the presence of regions with the 'uuid' attribute missing. Force the attribute to be read-only as there is no facility or expectation for a 'ram' region to recall its uuid from one boot to the next. Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Tested-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/167601994558.1924368.12612811533724694444.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-02-10 17:32:01 -08:00
Dan Williams	7d505f982f	cxl/region: Add a mode attribute for regions In preparation for a new region type, "ram" regions, add a mode attribute to clarify the mode of the decoders that can be added to a region. Share the internals of mode_show() (for decoders) with the region case. Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Gregory Price <gregory.price@memverge.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Fan Ni <fan.ni@samsung.com> Link: https://lore.kernel.org/r/167601993930.1924368.4305018565539515665.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-02-10 17:31:58 -08:00
Dan Williams	2345df5424	cxl/memdev: Fix endpoint port removal Testing of ram region support [1], stimulates a long standing bug in cxl_detach_ep() where some cxl_ep_remove() cleanup is skipped due to inability to walk ports after dports have been unregistered. That results in a failure to re-register a memdev after the port is re-enabled leading to a crash like the following: cxl_port_setup_targets: cxl region4: cxl_host_bridge.0:port4 iw: 1 ig: 256 general protection fault, ... [..] RIP: 0010:cxl_region_setup_targets+0x897/0x9e0 [cxl_core] dev_name at include/linux/device.h:700 (inlined by) cxl_port_setup_targets at drivers/cxl/core/region.c:1155 (inlined by) cxl_region_setup_targets at drivers/cxl/core/region.c:1249 [..] Call Trace: <TASK> attach_target+0x39a/0x760 [cxl_core] ? __mutex_unlock_slowpath+0x3a/0x290 cxl_add_to_region+0xb8/0x340 [cxl_core] ? lockdep_hardirqs_on+0x7d/0x100 discover_region+0x4b/0x80 [cxl_port] ? __pfx_discover_region+0x10/0x10 [cxl_port] device_for_each_child+0x58/0x90 cxl_port_probe+0x10e/0x130 [cxl_port] cxl_bus_probe+0x17/0x50 [cxl_core] Change the port ancestry walk to be by depth rather than by dport. This ensures that even if a port has unregistered its dports a deferred memdev cleanup will still be able to cleanup the memdev's interest in that port. The parent_port->dev.driver check is only needed for determining if the bottom up removal beat the top-down removal, but cxl_ep_remove() can always proceed given the port is pinned. That is, the two sources of cxl_ep_remove() are in cxl_detach_ep() and cxl_port_release(), and cxl_port_release() can not run if cxl_detach_ep() holds a reference. Fixes: `2703c16c75` ("cxl/core/port: Add switch port enumeration") Link: http://lore.kernel.org/r/167564534874.847146.5222419648551436750.stgit@dwillia2-xfh.jf.intel.com [1] Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Link: https://lore.kernel.org/r/167601992789.1924368.8083994227892600608.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-02-10 17:29:09 -08:00
Davidlohr Bueso	d874297bc7	cxl/mem: Correct full ID range allocation For ID allocations we want 0-(max-1), ie: smatch complains: error: Calling ida_alloc_range() with a 'max' argument which is a power of 2. -1 missing? Correct this and also replace the call to use the max() flavor instead. Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20230208181944.240261-1-dave@stgolabs.net Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-02-09 08:46:10 -08:00
Dan Williams	dbe9f7d1e1	Merge branch 'for-6.3/cxl-events' into cxl/next Add the CXL event and interrupt support for the v6.3 update.	2023-02-07 11:14:06 -08:00
Dan Williams	5485eb9559	Merge branch 'for-6.3/cxl' into cxl/next Merge the general CXL updates with fixes targeting v6.2-rc for v6.3. Resolve a conflict with the fix and move of cxl_report_and_clear() from pci.c to core/pci.c.	2023-02-07 11:12:24 -08:00
Dan Williams	711442e29f	cxl/region: Fix passthrough-decoder detection A passthrough decoder is a decoder that maps only 1 target. It is a special case because it does not impose any constraints on the interleave-math as compared to a decoder with multiple targets. Extend the passthrough case to multi-target-capable decoders that only have one target selected. I.e. the current code was only considering passthrough ports which are only a subset of the potential passthrough decoder scenarios. Fixes: `e4f6dfa9ef` ("cxl/region: Fix 'distance' calculation with passthrough ports") Cc: <stable@vger.kernel.org> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/167564540422.847146.13816934143225777888.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-02-07 11:04:30 -08:00
Fan Ni	4fa4302d6d	cxl/region: Fix null pointer dereference for resetting decoder Not all decoders have a reset callback. The CXL specification allows a host bridge with a single root port to have no explicit HDM decoders. Currently the region driver assumes there are none. As such the CXL core creates a special pass through decoder instance without a commit/reset callback. Prior to this patch, the ->reset() callback was called unconditionally when calling cxl_region_decode_reset. Thus a configuration with 1 Host Bridge, 1 Root Port, and one directly attached CXL type 3 device or multiple CXL type 3 devices attached to downstream ports of a switch can cause a null pointer dereference. Before the fix, a kernel crash was observed when we destroy the region, and a pass through decoder is reset. The issue can be reproduced as below, 1) create a region with a CXL setup which includes a HB with a single root port under which a memdev is attached directly. 2) destroy the region with cxl destroy-region regionX -f. Fixes: `176baefb2e` ("cxl/hdm: Commit decoder state to hardware") Cc: <stable@vger.kernel.org> Signed-off-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Gregory Price <gregory.price@memverge.com> Reviewed-by: Gregory Price <gregory.price@memverge.com> Link: https://lore.kernel.org/r/20221215170909.2650271-1-fan.ni@samsung.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-02-06 17:33:50 -08:00
Dan Williams	5a84711fd7	cxl/pci: Fix irq oneshot expectations The IRQ core expects that users of the default hardirq handler specify IRQF_ONESHOT to keep interrupts disabled until the threaded handler runs. That meets the CXL driver's expectations since it is an edge triggered MSI and this flag would have been passed by default using pci_request_irq() instead of devm_request_threaded_irq(). Fixes: `a49aa8141b` ("cxl/mem: Wire up event interrupts") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Julia Lawall <julia.lawall@lip6.fr> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-01-30 15:39:26 -08:00
Jonathan Cameron	fa8843451b	cxl/pci: Set the device timestamp CXL r3.0 section 8.2.9.4.2 "Set Timestamp" recommends that the host sets the timestamp after every Conventional or CXL Reset to ensure accurate timestamps. This should include on initial boot up. The time base that is being set is used by a device for the poison list overflow timestamp and all event timestamps. Note that the command is optional and if not supported and the device cannot return accurate timestamps it will fill the fields in with an appropriate marker (see the specification description of each timestamp). Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230130151327.32415-1-Jonathan.Cameron@huawei.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-01-30 11:30:51 -08:00
Jonathan Cameron	7ebf38c911	cxl/mbox: Add missing parameter to docs. Kernel-doc should be complete, so add documentation for the status parameter. Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20230130153437.3153-1-Jonathan.Cameron@huawei.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-01-30 11:24:16 -08:00
Robert Richter	623c075133	cxl/mbox: Fix Payload Length check for Get Log command Commit `2aeaf663b8` introduced strict checking for variable length payload size validation. The payload length of received data must match the size of the requested data by the caller except for the case where the min_out value is set. The Get Log command does not have a header with a length field set. The Log size is determined by the Get Supported Logs command (CXL 3.0, 8.2.9.5.1). However, the actual size can be smaller and the number of valid bytes in the payload output must be determined reading the Payload Length field (CXL 3.0, Table 8-36, Note 2). Two issues arise: The command can successfully complete with a payload length of zero. And, the valid payload length must then also be consumed by the caller. Change cxl_xfer_log() to pass the number of payload bytes back to the caller to determine the number of log entries. Implement the payload handling as a special case where mbox_cmd->size_out is consulted when cxl_internal_send_cmd() returns -EIO. A WARN_ONCE() is added to check that -EIO is only returned in case of an unexpected output size. Logs can be bigger than the maximum payload length and multiple Get Log commands can be issued. If the received payload size is smaller than the maximum payload size we can assume all valid bytes have been fetched. Stop sending further Get Log commands then. On that occasion, change debug messages to also report the opcodes of supported commands. The variable payload commands GET_LSA and SET_LSA are not affected by this strict check: SET_LSA cannot be broken because SET_LSA does not return an output payload, and GET_LSA never expects short reads. Fixes: `2aeaf663b8` ("cxl/mbox: Add variable output size validation for internal commands") Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230119094934.86067-1-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-01-27 13:58:11 -08:00
Greg Kroah-Hartman	2a81ada32f	driver core: make struct bus_type.uevent() take a const * The uevent() callback in struct bus_type should not be modifying the device that is passed into it, so mark it as a const * and propagate the function signature changes out into all relevant subsystems that use this callback. Acked-by: Rafael J. Wysocki <rafael@kernel.org> Acked-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20230111113018.459199-16-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-01-27 13:45:52 +01:00
Greg Kroah-Hartman	a9b12f8b4e	driver core: make struct device_type.devnode() take a const * The devnode() callback in struct device_type should not be modifying the device that is passed into it, so mark it as a const * and propagate the function signature changes out into all relevant subsystems that use this callback. Cc: Jens Axboe <axboe@kernel.dk> Cc: Alison Schofield <alison.schofield@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Ben Widawsky <bwidawsk@kernel.org> Cc: Jeremy Kerr <jk@ozlabs.org> Cc: Joel Stanley <joel@jms.id.au> Cc: Alistar Popple <alistair@popple.id.au> Cc: Eddie James <eajames@linux.ibm.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Jilin Yuan <yuanjilin@cdjrlc.com> Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Cc: Won Chung <wonchung@google.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Acked-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20230111113018.459199-7-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-01-27 13:45:38 +01:00
Ira Weiny	95b4947992	cxl/mem: Trace Memory Module Event Record CXL rev 3.0 section 8.2.9.2.1.3 defines the Memory Module Event Record. Determine if the event read is memory module record and if so trace the record. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20221216-cxl-ev-log-v7-5-2316a5c8f7d8@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-01-26 16:51:07 -08:00
Ira Weiny	2d6c1e6d60	cxl/mem: Trace DRAM Event Record CXL rev 3.0 section 8.2.9.2.1.2 defines the DRAM Event Record. Determine if the event read is a DRAM event record and if so trace the record. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20221216-cxl-ev-log-v7-4-2316a5c8f7d8@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-01-26 16:51:07 -08:00
Ira Weiny	d54a531a43	cxl/mem: Trace General Media Event Record CXL rev 3.0 section 8.2.9.2.1.1 defines the General Media Event Record. Determine if the event read is a general media record and if so trace the record as a General Media Event Record. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20221216-cxl-ev-log-v7-3-2316a5c8f7d8@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-01-26 16:51:07 -08:00
Davidlohr Bueso	a49aa8141b	cxl/mem: Wire up event interrupts Currently the only CXL features targeted for irq support require their message numbers to be within the first 16 entries. The device may however support less than 16 entries depending on the support it provides. Attempt to allocate these 16 irq vectors. If the device supports less then the PCI infrastructure will allocate that number. Upon successful allocation, users can plug in their respective isr at any point thereafter. CXL device events are signaled via interrupts. Each event log may have a different interrupt message number. These message numbers are reported in the Get Event Interrupt Policy mailbox command. Add interrupt support for event logs. Interrupts are allocated as shared interrupts. Therefore, all or some event logs can share the same message number. In addition all logs are queried on any interrupt in order of the most to least severe based on the status register. Finally place all event configuration logic into cxl_event_config(). Previously the logic was a simple 'read all' on start up. But interrupts must be configured prior to any reads to ensure no events are missed. A single event configuration function results in a cleaner over all implementation. Cc: Bjorn Helgaas <helgaas@kernel.org> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Co-developed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20221216-cxl-ev-log-v7-2-2316a5c8f7d8@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-01-26 16:51:07 -08:00
Randy Dunlap	cbbd05d036	cxl: fix spelling mistakes Correct spelling mistakes (reported by codespell). Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Alison Schofield <alison.schofield@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Ben Widawsky <bwidawsk@kernel.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: linux-cxl@vger.kernel.org Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/r/20230125032221.21277-1-rdunlap@infradead.org Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-01-26 15:57:42 -08:00
Robert Richter	ee611e5e66	cxl/mbox: Add debug messages for enabled mailbox commands Only unsupported mailbox commands are reported in debug messages. A list of enabled commands is useful too. Change debug messages to also report the opcodes of enabled commands. Esp. if card initialization fails there is no way to get this information from userland. On that occasion also add missing trailing newlines. Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230125085728.234697-1-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-01-26 15:57:30 -08:00
Ira Weiny	6ebe28f9ec	cxl/mem: Read, trace, and clear events on driver load CXL devices have multiple event logs which can be queried for CXL event records. Devices are required to support the storage of at least one event record in each event log type. Devices track event log overflow by incrementing a counter and tracking the time of the first and last overflow event seen. Software queries events via the Get Event Record mailbox command; CXL rev 3.0 section 8.2.9.2.2 and clears events via CXL rev 3.0 section 8.2.9.2.3 Clear Event Records mailbox command. If the result of negotiating CXL Error Reporting Control is OS control, read and clear all event logs on driver load. Ensure a clean slate of events by reading and clearing the events on driver load. The status register is not used because a device may continue to trigger events and the only requirement is to empty the log at least once. This allows for the required transition from empty to non-empty for interrupt generation. Handling of interrupts is in a follow on patch. The device can return up to 1MB worth of event records per query. Allocate a shared large buffer to handle the max number of records based on the mailbox payload size. This patch traces a raw event record and leaves specific event record type tracing to subsequent patches. Macros are created to aid in tracing the common CXL Event header fields. Each record is cleared explicitly. A clear all bit is specified but is only valid when the log overflows. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20221216-cxl-ev-log-v7-1-2316a5c8f7d8@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-01-26 13:17:51 -08:00
Dan Williams	19398821b2	cxl/pmem: Fix nvdimm unregistration when cxl_pmem driver is absent The cxl_pmem.ko module houses the driver for both cxl_nvdimm_bridge objects and cxl_nvdimm objects. When the core creates a cxl_nvdimm it arranges for it to be autoremoved when the bridge goes down. However, if the bridge never initialized because the cxl_pmem.ko module never loaded, it sets up a the following crash scenario: BUG: kernel NULL pointer dereference, address: 0000000000000478 [..] RIP: 0010:cxl_nvdimm_probe+0x99/0x140 [cxl_pmem] [..] Call Trace: <TASK> cxl_bus_probe+0x17/0x50 [cxl_core] really_probe+0xde/0x380 __driver_probe_device+0x78/0x170 driver_probe_device+0x1f/0x90 __driver_attach+0xd2/0x1c0 bus_for_each_dev+0x79/0xc0 bus_add_driver+0x1b1/0x200 driver_register+0x89/0xe0 cxl_pmem_init+0x50/0xff0 [cxl_pmem] It turns out the recent rework to simplify nvdimm probing obviated the need to unregister cxl_nvdimm objects at cxl_nvdimm_bridge ->remove() time. Leave the cxl_nvdimm device registered until the hosting cxl_memdev departs. The alternative is that the cxl_memdev needs to be reattached whenever the cxl_nvdimm_bridge attach state cycles, which is awkward and unnecessary. The only requirement is to make sure that when the cxl_nvdimm_bridge goes away any dependent cxl_nvdimm objects are shutdown. Handle that in unregister_nvdimm_bus(). With these registration entanglements removed there is no longer a need to pre-load the cxl_pmem module in cxl_acpi. Fixes: `cb9cfff82f` ("cxl/acpi: Simplify cxl_nvdimm_bridge probing") Reported-by: Gregory Price <gregory.price@memverge.com> Debugged-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/167426077263.3955046.9695309346988027311.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-01-25 15:35:26 -08:00
Dan Williams	172738bbcc	cxl/port: Link the 'parent_dport' in portX/ and endpointX/ sysfs Similar to the justification in: `1b58b4cac6` ("cxl/port: Record parent dport when adding ports") ...userspace wants to know the routing information for ports for calculating the memdev order for region creation among other things. Cache the information the kernel discovers at enumeration time in a 'parent_dport' attribute to save userspace the time of trawling sysfs to recover the same information. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/167124082375.1626103.6047000000121298560.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-01-25 15:32:57 -08:00
Dan Williams	af3ea9ab61	cxl/region: Clarify when a cxld->commit() callback is mandatory Both cxl_switch_decoders() and cxl_endpoint_decoders() are considered by cxl_region_decode_commit(). Flag cases where cxl_switch_decoders with multiple targets, or cxl_endpoint_decoders do not have a commit callback set. The switch case is unlikely to happen since switches are only enumerated by the CXL core, but the endpoint case may support decoders defined by drivers outside of drivers/cxl, like accerator drivers. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/167124081824.1626103.1555704405392757219.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-01-25 15:29:17 -08:00
Robert Richter	852db33c6c	cxl/pci: Show opcode in debug messages when sending a command For debugging it is very helpful to see which commands are sent. Add it to the debug message. Signed-off-by: Robert Richter <rrichter@amd.com> Link: https://lore.kernel.org/r/20230103210151.1126873-1-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-01-24 17:52:54 -08:00
Dave Jiang	2ec1b17f74	cxl: fix cxl_report_and_clear() RAS UE addr mis-assignment 'addr' that contains RAS UE register address is re-assigned to RAS_CAP_CONTROL offset if there are multiple UE errors. Use different addr variable to avoid the reassignment mistake. Fixes: `2905cb5236` ("cxl/pci: Add (hopeful) error handling support") Reported-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/167302318779.580155.15233596744650706167.stgit@djiang5-mobl3.local Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-01-09 12:55:54 -08:00
Davidlohr Bueso	e520d52d7c	cxl/region: Only warn about cpu_cache_invalidate_memregion() once No need for more than once per module load. Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20221215183836.24136-1-dave@stgolabs.net Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-01-05 15:01:45 -08:00
Dan Williams	4a20bc3e20	cxl/pci: Move tracepoint definitions to drivers/cxl/core/ CXL is using tracepoints for reporting RAS capability register payloads for AER events, and has plans to use tracepoints for the output payload of Get Poison List and Get Event Records commands. For organization purposes it would be nice to keep those all under a single + local CXL trace system. This also organization also potentially helps in the future when CXL drivers expand beyond generic memory expanders, however that would also entail a move away from the expander-specific cxl_dev_state context, save that for later. Note that the powerpc-specific drivers/misc/cxl/ also defines a 'cxl' trace system, however, it is unlikely that a single platform will ever load both drivers simultaneously. Cc: Steven Rostedt <rostedt@goodmis.org> Tested-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/167051869176.436579.9728373544811641087.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-01-04 17:11:11 -08:00
Fan Ni	f04facfb99	cxl/region: Fix memdev reuse check Due to a typo, the check of whether or not a memdev has already been used as a target for the region (above code piece) will always be skipped. Given a memdev with more than one HDM decoder, an interleaved region can be created that maps multiple HPAs to the same DPA. According to CXL spec 3.0 8.1.3.8.4, "Aliasing (mapping more than one Host Physical Address (HPA) to a single Device Physical Address) is forbidden." Fix this by using existing iterator for memdev reuse check. Cc: <stable@vger.kernel.org> Fixes: `384e624bb2` ("cxl/region: Attach endpoint decoders") Signed-off-by: Fan Ni <fan.ni@samsung.com> Link: https://lore.kernel.org/r/20221107212153.745993-1-fan.ni@samsung.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-08 13:03:47 -08:00
Dan Williams	9b5f77efb0	cxl/pci: Remove endian confusion readl() already handles endian conversion. That's the main difference between readl() and __raw_readl(). This is benign on little-endian systems, but big endian systems will end up byte-swabbing twice. Fixes: `2905cb5236` ("cxl/pci: Add (hopeful) error handling support") Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/167030092025.4045167.10651070153523351093.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-06 14:38:12 -08:00
Dan Williams	372ab3bc37	cxl/pci: Add some type-safety to the AER trace points The first argument to the CXL AER trace points is the source device. Pass a 'const struct device ' rather than a 'const char ' for more type precision / safety. Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Steven Rostedt <rostedt@goodmis.org> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/167030091477.4045167.15174636482098463885.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-06 14:37:52 -08:00
Dan Williams	7fe898041f	cxl/security: Drop security command ioctl uapi CXL PMEM security operations are routed through the NVDIMM sysfs interface. For this reason the corresponding commands are marked "exclusive" to preclude collisions between the ioctl ABI and the sysfs ABI. However, a better way to preclude that collision is to simply remove the ioctl ABI (command-id definitions) for those operations. Now that cxl_internal_send_cmd() (formerly cxl_mbox_send_cmd()) no longer needs to talk the cxl_mem_commands array, all of the uapi definitions for the security commands can be dropped. These never appeared in a released kernel, so no regression risk. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/167030056464.4044561.11486507095384253833.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-06 14:36:02 -08:00
Dan Williams	2aeaf663b8	cxl/mbox: Add variable output size validation for internal commands cxl_internal_send_cmd() skips output size validation for variable output commands which is not ideal. Most of the time internal usages want to fail if the output size does not match what was requested. For other commands where the caller cannot predict the size there is usually a a header that conveys how much vaild data is in the payload. For those cases add @min_out as a parameter to specify what the minimum response payload needs to be for the caller to parse the rest of the payload. In this patch only Get Supported Logs has that behavior, but going forward records retrieval commands like Get Poison List and Get Event Records can use @min_out to retrieve a variable amount of records. Critically, this validation scheme skips the needs to interrogate the cxl_mem_commands array which in turn frees up the implementation to support internal command enabling without also enabling external / user commands. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/167030055918.4044561.10339573829837910505.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-06 14:36:02 -08:00
Dan Williams	5331cdf44d	cxl/mbox: Enable cxl_mbox_send_cmd() users to validate output size Internally cxl_mbox_send_cmd() converts all passed-in parameters to a 'struct cxl_mbox_cmd' instance and sends that to cxlds->mbox_send(). It then teases the possibilty that the caller can validate the output size. However, they cannot since the resulting output size is not conveyed to the called. Fix that by making the caller pass in a constructed 'struct cxl_mbox_cmd'. This prepares for a future patch to add output size validation on a per-command basis. Given the change in signature, also change the name to differentiate it from the user command submission path that performs more validation before generating the 'struct cxl_mbox_cmd' instance to execute. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/167030055370.4044561.17788093375112783036.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-06 14:36:02 -08:00
Dan Williams	f5ee4cc19c	cxl/security: Fix Get Security State output payload endian handling Multi-byte integer values in CXL mailbox payloads are little endian. Add a definition of the Get Security State output payload and convert the value before testing flags. Fixes: `3282811555` ("cxl/pmem: Introduce nvdimm_security_ops with ->get_flags() operation") Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/167030054822.4044561.4917796262037689553.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-06 14:36:02 -08:00
Dave Jiang	c99b2e8cf7	cxl: update names for interleave ways conversion macros Change names for interleave ways macros to clearly indicate which variable is encoded and which is the actual ways value. ways == interleave ways eiw == encoded interleave ways Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/167027516228.3124679.11265039496968588580.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-05 18:17:16 -08:00
Dave Jiang	83351ddb78	cxl: update names for interleave granularity conversion macros Change names for granularity macros to clearly indicate which variable is encoded and which is the actual granularity. granularity == interleave granularity eig == encoded interleave granularity Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/167027493237.3124429.8948852388671827664.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-05 18:17:16 -08:00
Robert Richter	d3cdf4585f	cxl/acpi: Warn about an invalid CHBCR in an existing CHBS entry After parsing for a CHBCR in cxl_get_chbcr() the case of (ctx.chbcr == CXL_RESOURCE_NONE) is a slighly different error reason than the !ctx.chbcr case. In the first case the CHBS was found but the CHBCR was invalid or something else failed to determine it, while in the latter case no CHBS entry exists at all. Update the warning message to reflect this. The log messages for both cases can be differentiated now and the reason for a failure can be determined better. Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/167027170051.3542509.10494781536638424397.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-05 15:37:09 -08:00
Alison Schofield	14628aec84	cxl/acpi: Fail decoder add if CXIMS for HBIG is missing The BIOS provided CXIMS (CXL XOR Interleave Math Structure) is required for calculating a targets position in an interleave list during region creation. The CXL driver expects to discover a CXIMS that matches the HBIG (Host Bridge Interleave Granularity) and stores the xormaps found in that CXIMS for retrieval during region creation. If there is no CXIMS for an HBIG, no maps are stored. That leads to a NULL pointer dereference at xormap retrieval during region creation. Add a check during ACPI probe for the case of no matching CXIMS. Emit an error message and fail to add the decoder. Fixes: `f9db85bfec` ("cxl/acpi: Support CXL XOR Interleave Math (CXIMS)") Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20221205002951.1788783-1-alison.schofield@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-05 12:33:20 -08:00
Colin Ian King	cb4cdf74bd	cxl/region: Fix spelling mistake "memergion" -> "memregion" There is a spelling mistake in a dev_warn message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Link: https://lore.kernel.org/r/20221205091819.1943564-1-colin.i.king@gmail.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-05 12:33:20 -08:00
Dan Williams	397cd26581	cxl/regs: Fix sparse warning The 0day robot belatedly points out that @addr is not properly tagged as an iomap pointer: "drivers/cxl/core/regs.c:332:14: sparse: sparse: incorrect type in assignment (different address spaces) @@ expected void addr @@ got void [noderef] __iomem @@" Fixes: 1168271ca054 ("cxl/acpi: Extract component registers of restricted hosts from RCRB") Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Robert Richter <rrichter@amd.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/r/167008768190.2516013.11918622906007677341.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-05 12:33:20 -08:00
Dan Williams	02fedf1466	Merge branch 'for-6.2/cxl-xor' into for-6.2/cxl Pick up support for "XOR" interleave math when parsing ACPI CFMWS window structures. Fix up conflicts with the RCH emulation already pending in cxl/next.	2022-12-05 12:32:11 -08:00
Dan Williams	e0f6fa0d42	Merge branch 'for-6.2/cxl-aer' into for-6.2/cxl Pick up CXL AER handling and correctable error extensions. Resolve conflicts with cxl_pmem_wq reworks and RCH support.	2022-12-05 12:31:30 -08:00
Dan Williams	95dddcb5e8	Merge branch 'for-6.2/cxl-security' into for-6.2/cxl Pick CXL PMEM security commands for v6.2. Resolve conflicts with the removal of the cxl_pmem_wq.	2022-12-05 12:30:38 -08:00
Dan Williams	0a19bfc8de	cxl/port: Add RCD endpoint port enumeration Unlike a CXL memory expander in a VH topology that has at least one intervening 'struct cxl_port' instance between itself and the CXL root device, an RCD attaches one-level higher. For example: VH ┌──────────┐ │ ACPI0017 │ │ root0 │ └─────┬────┘ │ ┌─────┴────┐ │ dport0 │ ┌─────┤ ACPI0016 ├─────┐ │ │ port1 │ │ │ └────┬─────┘ │ │ │ │ ┌──┴───┐ ┌──┴───┐ ┌───┴──┐ │dport0│ │dport1│ │dport2│ │ RP0 │ │ RP1 │ │ RP2 │ └──────┘ └──┬───┘ └──────┘ │ ┌───┴─────┐ │endpoint0│ │ port2 │ └─────────┘ ...vs: RCH ┌──────────┐ │ ACPI0017 │ │ root0 │ └────┬─────┘ │ ┌───┴────┐ │ dport0 │ │ACPI0016│ └───┬────┘ │ ┌────┴─────┐ │endpoint0 │ │ port1 │ └──────────┘ So arrange for endpoint port in the RCH/RCD case to appear directly connected to the host-bridge in its singular role as a dport. Compare that to the VH case where the host-bridge serves a dual role as a 'cxl_dport' for the CXL root device and a 'cxl_port' upstream port for the Root Ports in the Root Complex that are modeled as 'cxl_dport' instances in the CXL topology. Another deviation from the VH case is that RCDs may need to look up their component registers from the Root Complex Register Block (RCRB). That platform firmware specified RCRB area is cached by the cxl_acpi driver and conveyed via the host-bridge dport to the cxl_mem driver to perform the cxl_rcrb_to_component() lookup for the endpoint port (See 9.11.8 CXL Devices Attached to an RCH for the lookup of the upstream port component registers). Tested-by: Robert Richter <rrichter@amd.com> Link: https://lore.kernel.org/r/166993045621.1882361.1730100141527044744.stgit@dwillia2-xfh.jf.intel.com Reviewed-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Camerom <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-05 10:32:26 -08:00
Dan Williams	7592d935b7	cxl/mem: Move devm_cxl_add_endpoint() from cxl_core to cxl_mem tl;dr: Clean up an unnecessary export and enable cxl_test. An RCD (Restricted CXL Device), in contrast to a typical CXL device in a VH topology, obtains its component registers from the bottom half of the associated CXL host bridge RCRB (Root Complex Register Block). In turn this means that cxl_rcrb_to_component() needs to be called from devm_cxl_add_endpoint(). Presently devm_cxl_add_endpoint() is part of the CXL core, but the only user is the CXL mem module. Move it from cxl_core to cxl_mem to not only get rid of an unnecessary export, but to also enable its call out to cxl_rcrb_to_component(), in a subsequent patch, to be mocked by cxl_test. Recall that cxl_test can only mock exported symbols, and since cxl_rcrb_to_component() is itself inside the core, all callers must be outside of cxl_core to allow cxl_test to mock it. Reviewed-by: Robert Richter <rrichter@amd.com> Link: https://lore.kernel.org/r/166993045072.1882361.13944923741276843683.stgit@dwillia2-xfh.jf.intel.com Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-05 10:32:26 -08:00
Alison Schofield	f9db85bfec	cxl/acpi: Support CXL XOR Interleave Math (CXIMS) When the CFMWS is using XOR math, parse the corresponding CXIMS structure and store the xormaps in the root decoder structure. Use the xormaps in a new lookup, cxl_hb_xor(), to find a targets entry in the host bridge interleave target list. Defined in CXL Specfication 3.0 Section: 9.17.1 Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/5794813acdf7b67cfba3609c6aaff46932fa38d0.1669847017.git.alison.schofield@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-03 16:54:35 -08:00
Dave Jiang	6155ccc9dd	cxl/pci: Add callback to log AER correctable error Add AER error handler callback to read the RAS capability structure correctable error (CE) status register for the CXL device. Log the error as a trace event and clear the error. For CXL devices, the driver also needs to write back to the status register to clear the unmasked correctable errors. See CXL spec rev3.0 8.2.4.16 for RAS capability structure CE Status Register. Suggested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/166985287203.2871899.13605149073500556137.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-03 13:40:56 -08:00
Dan Williams	2905cb5236	cxl/pci: Add (hopeful) error handling support Add nominal error handling that tears down CXL.mem in response to error notifications that imply a device reset. Given some CXL.mem may be operating as System RAM, there is a high likelihood that these error events are fatal. However, if the system survives the notification the expectation is that the driver behavior is equivalent to a hot-unplug and re-plug of an endpoint. Note that this does not change the mask values from the default. That awaits CXL _OSC support to determine whether platform firmware is in control of the mask registers. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/166974413966.1608150.15522782911404473932.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-03 13:40:17 -08:00
Dave Jiang	2f6e9c3051	cxl/pci: add tracepoint events for CXL RAS Add tracepoint events for recording the CXL uncorrectable and correctable errors. For uncorrectable errors, there is additional data of 512B from the header log register (CXL spec rev3 8.2.4.16.7). The trace event will intake a dynamic array that will dump the entire Header Log data. If multiple errors are set in the status register, then the 'first error' field (CXL spec rev3 v8.2.4.16.6) is read from the Error Capabilities and Control Register in order to determine the error. This implementation does not include CXL IDE Error details. Cc: Steven Rostedt <rostedt@goodmis.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Link: https://lore.kernel.org/r/166974413388.1608150.5875712482260436188.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-03 13:40:17 -08:00
Dan Williams	bd09626b39	cxl/pci: Find and map the RAS Capability Structure The RAS Capability Structure has some ancillary information that may be relevant with respect to AER events, link and protcol error status registers. Map the RAS Capability Registers in support of defining a 'struct pci_error_handlers' instance for the cxl_pci driver. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/166974412803.1608150.7096566580400947001.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-03 13:40:17 -08:00
Dan Williams	a1554e9cac	cxl/pci: Prepare for mapping RAS Capability Structure The RAS Capabilitiy Structure is a CXL Component register capability block. Unlike the HDM Decoder Capability, it will be referenced by the cxl_pci driver in response to PCIe AER events. Due to this it is no longer the case that cxl_map_component_regs() can assume that it should map all component registers. Plumb a bitmask of capability ids to map through cxl_map_component_regs(). For symmetry cxl_probe_device_regs() is updated to populate @id in 'struct cxl_reg_map' even though cxl_map_device_regs() does not have a need to map a subset of the device registers per caller. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/166974412214.1608150.11487843455070795378.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-03 13:40:17 -08:00
Dan Williams	920d8d2c60	cxl/port: Limit the port driver to just the HDM Decoder Capability Update the port driver to use cxl_map_component_registers() so that the component register block can be shared between the cxl_pci driver and the cxl_port driver. I.e. stop the port driver from reserving the entire component register block for itself via request_region() when it only needs the HDM Decoder Capability subset. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/166974411625.1608150.7149373371599960307.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-03 13:40:16 -08:00
Dan Williams	6c7f4f1e51	cxl/core/regs: Make cxl_map_{component, device}_regs() device generic There is no need to carry the barno and the block offset through the stack, just convert them to a resource base immediately. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/166974411035.1608150.8605988708101648442.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-03 13:40:16 -08:00
Dan Williams	43a2fb3aef	cxl/pci: Kill cxl_map_regs() The component registers are currently unused by the cxl_pci driver. Only the physical address base of the component registers is conveyed to the cxl_mem driver. Just call cxl_map_device_registers() directly. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/166974410443.1608150.15855499736133349600.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-03 13:40:16 -08:00
Dan Williams	1191ca102d	cxl/pci: Cleanup cxl_map_device_regs() Use a loop to reduce the duplicated code in cxl_map_device_regs(). This is in preparation for deleting cxl_map_regs(). Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/166974409867.1608150.14886452053935226038.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-03 13:40:16 -08:00
Dan Williams	af2dfef854	cxl/pci: Cleanup repeated code in cxl_probe_regs() helpers Rather then duplicating the setting of valid, length, and offset for each type, just convey a pointer to the register map to common code. Yes, the change in cxl_probe_component_regs() does not save any lines of code, but it is preparation for adding another component register type to map (RAS Capability Structure). Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/166974409293.1608150.17661353937678581423.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-03 13:40:16 -08:00
Robert Richter	d5b1a27143	cxl/acpi: Extract component registers of restricted hosts from RCRB A downstream port must be connected to a component register block. For restricted hosts the base address is determined from the RCRB. The RCRB is provided by the host's CEDT CHBS entry. Rework CEDT parser to get the RCRB and add code to extract the component register block from it. RCRB's BAR[0..1] point to the component block containing CXL subsystem component registers. MEMBAR extraction follows the PCI base spec here, esp. 64 bit extraction and memory range alignment (6.0, 7.5.1.2.1). The RCRB base address is cached in the cxl_dport per-host bridge so that the upstream port component registers can be retrieved later by an RCD (RCIEP) associated with the host bridge. Note: Right now the component register block is used for HDM decoder capability only which is optional for RCDs. If unsupported by the RCD, the HDM init will fail. It is future work to bypass it in this case. Co-developed-by: Terry Bowman <terry.bowman@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Signed-off-by: Robert Richter <rrichter@amd.com> Link: https://lore.kernel.org/r/Y4dsGZ24aJlxSfI1@rric.localdomain [djbw: introduce devm_cxl_add_rch_dport()] Link: https://lore.kernel.org/r/166993044524.1882361.2539922887413208807.stgit@dwillia2-xfh.jf.intel.com Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-03 00:40:29 -08:00
Dan Williams	d18bc74ace	cxl/region: Manage CPU caches relative to DPA invalidation events A "DPA invalidation event" is any scenario where the contents of a DPA (Device Physical Address) is modified in a way that is incoherent with CPU caches, or if the HPA (Host Physical Address) to DPA association changes due to a remapping event. PMEM security events like Unlock and Passphrase Secure Erase already manage caches through LIBNVDIMM, so that leaves HPA to DPA remap events that need cache management by the CXL core. Those only happen when the boot time CXL configuration has changed. That event occurs when userspace attaches an endpoint decoder to a region configuration, and that region is subsequently activated. The implications of not invalidating caches between remap events is that reads from the region at different points in time may return different results due to stale cached data from the previous HPA to DPA mapping. Without a guarantee that the region contents after cxl_region_probe() are written before being read (a layering-violation assumption that cxl_region_probe() can not make) the CXL subsystem needs to ensure that reads that precede writes see consistent results. A CONFIG_CXL_REGION_INVALIDATION_TEST option is added to support debug and unit testing of the CXL implementation in QEMU or other environments where cpu_cache_has_invalidate_memregion() returns false. This may prove too restrictive for QEMU where the HDM decoders are emulated, but in that case the CXL subsystem needs some new mechanism / indication that the HDM decoder is emulated and not a passthrough of real hardware. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/166993222098.1995348.16604163596374520890.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-03 00:03:57 -08:00
Dan Williams	07cb5f705b	cxl/pmem: Enforce keyctl ABI for PMEM security Preclude the possibility of user tooling sending device secrets in the clear into the kernel by marking the security commands as exclusive. This mandates the usage of the keyctl ABI for managing the device passphrase. Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/166993221008.1995348.11651567302609703175.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-02 23:52:32 -08:00
Dan Williams	bf3e5da8cb	cxl/region: Fix missing probe failure cxl_region_probe() allows for regions not in the 'commit' state to be enabled. Fail probe when the region is not committed otherwise the kernel may indicate that an address range is active when none of the decoders are active. Fixes: `8d48817df6` ("cxl/region: Add region driver boiler plate") Cc: <stable@vger.kernel.org> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/166993220462.1995348.1698008475198427361.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-02 23:52:32 -08:00
Dave Jiang	b5807c80b5	cxl: add dimm_id support for __nvdimm_create() Set the cxlds->serial as the dimm_id to be fed to __nvdimm_create(). The security code uses that as the key description for the security key of the memory device. The nvdimm unlock code cannot find the respective key without the dimm_id. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/166863357043.80269.4337575149671383294.stgit@djiang5-desk3.ch.intel.com Link: https://lore.kernel.org/r/166983620459.2734609.10175456773200251184.stgit@djiang5-desk3.ch.intel.com Link: https://lore.kernel.org/r/166993219918.1995348.10786511454826454601.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-02 23:52:32 -08:00
Robert Richter	1dedb6f3cf	cxl/ACPI: Register CXL host ports by bridge device A port of a CXL host bridge links to the bridge's ACPI device (&adev->dev) with its corresponding uport/dport device (uport_dev and dport_dev respectively). The device is not a direct parent device in the PCI topology as pdev->dev.parent points to a PCI bridge's (struct pci_host_bridge) device. The following CXL memory device hierarchy would be valid for an endpoint once an RCD EP would be enabled (note this will be done in a later patch): VH mode: cxlmd->dev.parent->parent ^^^\^^^^^^\ ^^^^^^\ \ \ pci_dev (Type 1, Downstream Port) \ pci_dev (Type 0, PCI Express Endpoint) cxl mem device RCD mode: cxlmd->dev.parent->parent ^^^\^^^^^^\ ^^^^^^\ \ \ pci_host_bridge \ pci_dev (Type 0, RCiEP) cxl mem device In VH mode a downstream port is created by port enumeration and thus always exists. Now, in RCD mode the host bridge also already exists but it references to an ACPI device. A port lookup by the PCI device's parent device will fail as a direct link to the registered port is missing. The ACPI device of the bridge must be determined first. To prevent this, change port registration of a CXL host to use the bridge device instead. Do this also for the VH case as port topology will better reflect the PCI topology then. Signed-off-by: Robert Richter <rrichter@amd.com> [djbw: rebase on brige mocking] Reviewed-by: Robert Richter <rrichter@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/166993043978.1882361.16238060349889579369.stgit@dwillia2-xfh.jf.intel.com Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-02 23:23:06 -08:00
Dan Williams	8b3b1c0dc5	tools/testing/cxl: Make mock CEDT parsing more robust Accept any cxl_test topology device as the first argument in cxl_chbs_context. This is in preparation for reworking the detection of the component registers across VH and RCH topologies. Move mock_acpi_table_parse_cedt() beneath the definition of is_mock_port() and use is_mock_port() instead of the explicit mock cxl_acpi device check. Acked-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Robert Richter <rrichter@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/166993043433.1882361.17651413716599606118.stgit@dwillia2-xfh.jf.intel.com Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-02 23:15:16 -08:00
Dan Williams	4029c32fb6	cxl/acpi: Move rescan to the workqueue Now that the cxl_mem driver has a need to take the root device lock, the cxl_bus_rescan() needs to run outside of the root lock context. That need arises from RCH topologies and the locking that the cxl_mem driver does to attach a descendant to an upstream port. In the RCH case the lock needed is the CXL root device lock [1]. Link: http://lore.kernel.org/r/166993045621.1882361.1730100141527044744.stgit@dwillia2-xfh.jf.intel.com [1] Tested-by: Robert Richter <rrichter@amd.com> Link: http://lore.kernel.org/r/166993042884.1882361.5633723613683058881.stgit@dwillia2-xfh.jf.intel.com Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-02 23:10:20 -08:00
Dan Williams	03ff079aa6	cxl/pmem: Remove the cxl_pmem_wq and related infrastructure Now that cxl_nvdimm and cxl_pmem_region objects are torn down sychronously with the removal of either the bridge, or an endpoint, the cxl_pmem_wq infrastructure can be jettisoned. Tested-by: Robert Richter <rrichter@amd.com> Link: https://lore.kernel.org/r/166993042335.1882361.17022872468068436287.stgit@dwillia2-xfh.jf.intel.com Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-02 23:07:22 -08:00
Dan Williams	f17b558d66	cxl/pmem: Refactor nvdimm device registration, delete the workqueue The three objects 'struct cxl_nvdimm_bridge', 'struct cxl_nvdimm', and 'struct cxl_pmem_region' manage CXL persistent memory resources. The bridge represents base platform resources, the nvdimm represents one or more endpoints, and the region is a collection of nvdimms that contribute to an assembled address range. Their relationship is such that a region is torn down if any component endpoints are removed. All regions and endpoints are torn down if the foundational bridge device goes down. A workqueue was deployed to manage these interdependencies, but it is difficult to reason about, and fragile. A recent attempt to take the CXL root device lock in the cxl_mem driver was reported by lockdep as colliding with the flush_work() in the cxl_pmem flows. Instead of the workqueue, arrange for all pmem/nvdimm devices to be torn down immediately and hierarchically. A similar change is made to both the 'cxl_nvdimm' and 'cxl_pmem_region' objects. For bisect-ability both changes are made in the same patch which unfortunately makes the patch bigger than desired. Arrange for cxl_memdev and cxl_region to register a cxl_nvdimm and cxl_pmem_region as a devres release action of the bridge device. Additionally, include a devres release action of the cxl_memdev or cxl_region device that triggers the bridge's release action if an endpoint exits before the bridge. I.e. this allows either unplugging the bridge, or unplugging and endpoint to result in the same cleanup actions. To keep the patch smaller the cleanup of the now defunct workqueue infrastructure is saved for a follow-on patch. Tested-by: Robert Richter <rrichter@amd.com> Link: https://lore.kernel.org/r/166993041773.1882361.16444301376147207609.stgit@dwillia2-xfh.jf.intel.com Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-02 23:07:22 -08:00
Dan Williams	16d53cb0d6	cxl/region: Drop redundant pmem region release handling Now that a cxl_nvdimm object can only experience ->remove() via an unregistration event (because the cxl_nvdimm bind attributes are suppressed), additional cleanups are possible. It is already the case that the removal of a cxl_memdev object triggers ->remove() on any associated region. With that mechanism in place there is no need for the cxl_nvdimm removal to trigger the same. Just rely on cxl_region_detach() to tear down the whole cxl_pmem_region. Tested-by: Robert Richter <rrichter@amd.com> Link: https://lore.kernel.org/r/166993041215.1882361.6321535567798911286.stgit@dwillia2-xfh.jf.intel.com Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-02 23:06:29 -08:00
Dan Williams	cb9cfff82f	cxl/acpi: Simplify cxl_nvdimm_bridge probing The 'struct cxl_nvdimm_bridge' object advertises platform CXL PMEM resources. It coordinates with libnvdimm to attach nvdimm devices and regions for each corresponding CXL object. That coordination is complicated, i.e. difficult to reason about, and it turns out redundant. It is already the case that the CXL core knows how to tear down a cxl_region when a cxl_memdev goes through ->remove(), so that pathway can be extended to directly cleanup cxl_nvdimm and cxl_pmem_region objects. Towards the goal of ripping out the cxl_nvdimm_bridge state machine, arrange for cxl_acpi to optionally pre-load the cxl_pmem driver so that the nvdimm bridge is active synchronously with devm_cxl_add_nvdimm_bridge(), and remove all the bind attributes for the cxl_nvdimm* objects since the cxl root device and cxl_memdev bind attributes are sufficient. Tested-by: Robert Richter <rrichter@amd.com> Link: https://lore.kernel.org/r/166993040668.1882361.7450361097265836752.stgit@dwillia2-xfh.jf.intel.com Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-01 15:52:36 -08:00
Dave Jiang	452996fa07	cxl/pmem: add provider name to cxl pmem dimm attribute group Add provider name in order to associate cxl test dimm from cxl_test to the cxl pmem device when going through sysfs for security testing. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/166983618174.2734609.15600031015423828810.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-01 12:42:35 -08:00
Dave Jiang	bd429e5355	cxl/pmem: add id attribute to CXL based nvdimm Add an id group attribute for CXL based nvdimm object. The addition allows ndctl to display the "unique id" for the nvdimm. The serial number for the CXL memory device will be used for this id. [ { "dev":"nmem10", "id":"0x4", "security":"disabled" }, ] The id attribute is needed by the ndctl security key management to setup a keyblob with a unique file name tied to the mem device. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/166983617029.2734609.8251308562882142281.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-01 12:42:35 -08:00
Dave Jiang	dcedadfae2	nvdimm/cxl/pmem: Add support for master passphrase disable security command The original nvdimm_security_ops ->disable() only supports user passphrase for security disable. The CXL spec introduced the disabling of master passphrase. Add a ->disable_master() callback to support this new operation and leaving the old ->disable() mechanism alone. A "disable_master" command is added for the sysfs attribute in order to allow command to be issued from userspace. ndctl will need enabling in order to utilize this new operation. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/166983616454.2734609.14204031148234398086.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-01 12:42:35 -08:00
Dave Jiang	3b502e886d	cxl/pmem: Add "Passphrase Secure Erase" security command support Create callback function to support the nvdimm_security_ops() ->erase() callback. Translate the operation to send "Passphrase Secure Erase" security command for CXL memory device. When the mem device is secure erased, cpu_cache_invalidate_memregion() is called in order to invalidate all CPU caches before attempting to access the mem device again. See CXL 3.0 spec section 8.2.9.8.6.6 for reference. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/166983615293.2734609.10358657600295932156.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-01 12:42:35 -08:00
Dave Jiang	2bb692f7a6	cxl/pmem: Add "Unlock" security command support Create callback function to support the nvdimm_security_ops() ->unlock() callback. Translate the operation to send "Unlock" security command for CXL mem device. When the mem device is unlocked, cpu_cache_invalidate_memregion() is called in order to invalidate all CPU caches before attempting to access the mem device. See CXL rev3.0 spec section 8.2.9.8.6.4 for reference. Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/166983614167.2734609.15124543712487741176.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-01 12:42:35 -08:00
Dave Jiang	a072f7b797	cxl/pmem: Add "Freeze Security State" security command support Create callback function to support the nvdimm_security_ops() ->freeze() callback. Translate the operation to send "Freeze Security State" security command for CXL memory device. See CXL rev3.0 spec section 8.2.9.8.6.5 for reference. Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/166983613019.2734609.10645754779802492122.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-01 12:42:35 -08:00
Dave Jiang	c4ef680d0b	cxl/pmem: Add Disable Passphrase security command support Create callback function to support the nvdimm_security_ops ->disable() callback. Translate the operation to send "Disable Passphrase" security command for CXL memory device. The operation supports disabling a passphrase for the CXL persistent memory device. In the original implementation of nvdimm_security_ops, this operation only supports disabling of the user passphrase. This is due to the NFIT version of disable passphrase only supported disabling of user passphrase. The CXL spec allows disabling of the master passphrase as well which nvidmm_security_ops does not support yet. In this commit, the callback function will only support user passphrase. See CXL rev3.0 spec section 8.2.9.8.6.3 for reference. Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/166983611878.2734609.10602135274526390127.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-01 12:42:35 -08:00
Dave Jiang	997469407f	cxl/pmem: Add "Set Passphrase" security command support Create callback function to support the nvdimm_security_ops ->change_key() callback. Translate the operation to send "Set Passphrase" security command for CXL memory device. The operation supports setting a passphrase for the CXL persistent memory device. It also supports the changing of the currently set passphrase. The operation allows manipulation of a user passphrase or a master passphrase. See CXL rev3.0 spec section 8.2.9.8.6.2 for reference. However, the spec leaves a gap WRT master passphrase usages. The spec does not define any ways to retrieve the status of if the support of master passphrase is available for the device, nor does the commands that utilize master passphrase will return a specific error that indicates master passphrase is not supported. If using a device does not support master passphrase and a command is issued with a master passphrase, the error message returned by the device will be ambiguous. Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/166983610751.2734609.4445075071552032091.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-12-01 12:42:35 -08:00
Dave Jiang	3282811555	cxl/pmem: Introduce nvdimm_security_ops with ->get_flags() operation Add nvdimm_security_ops support for CXL memory device with the introduction of the ->get_flags() callback function. This is part of the "Persistent Memory Data-at-rest Security" command set for CXL memory device support. The ->get_flags() function provides the security state of the persistent memory device defined by the CXL 3.0 spec section 8.2.9.8.6.1. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/166983609611.2734609.13231854299523325319.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-11-30 16:30:47 -08:00
Adam Manzanares	3b39fd6cf1	cxl: Replace HDM decoder granularity magic numbers When reviewing the CFMWS parsing code that deals with the HDM decoders, I noticed a couple of magic numbers. This commit replaces these magic numbers with constants defined by the CXL 3.0 specification. v2: - Change references to CXL 3.0 specification (David) - CXL_DECODER_MAX_GRANULARITY_ORDER -> CXL_DECODER_MAX_ENCODED_IG (Dan) Signed-off-by: Adam Manzanares <a.manzanares@samsung.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20220829220249.243888-1-a.manzanares@samsung.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-11-14 12:11:27 -08:00
Robert Richter	b51d767521	cxl/acpi: Improve debug messages in cxl_acpi_probe() In cxl_acpi_probe() the iterator bus_for_each_dev() walks through all CXL hosts. Since all dev_*() debug messages point to the ACPI0017 device which is the CXL root for all hosts, the device information is pointless as it is always the same device. Change this to use the host device for this instead. Also, add additional host specific information such as CXL support, UID and CHBCR. This is an example log: acpi ACPI0016:00: UID found: 4 acpi ACPI0016:00: CHBCR found: 0x28090000000 acpi ACPI0016:00: dport added to root0 acpi ACPI0016:00: host-bridge: ACPI0016:00 pci0000:7f: host supports CXL Signed-off-by: Robert Richter <rrichter@amd.com> Link: https://lore.kernel.org/r/20221018132341.76259-6-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-11-14 12:11:27 -08:00
Robert Richter	58eef878fc	cxl: Unify debug messages when calling devm_cxl_add_dport() CXL dports are added in a couple of code paths using devm_cxl_add_dport(). Debug messages are individually generated, but are incomplete and inconsistent. Change this by moving its generation to devm_cxl_add_dport(). This unifies the messages and reduces code duplication. Also, generate messages on failure. Use a __devm_cxl_add_dport() wrapper to keep the readability of the error exits. Signed-off-by: Robert Richter <rrichter@amd.com> Link: https://lore.kernel.org/r/20221018132341.76259-5-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-11-14 10:37:08 -08:00
Robert Richter	f3cd264c4e	cxl: Unify debug messages when calling devm_cxl_add_port() CXL ports are added in a couple of code paths using devm_cxl_add_port(). Debug messages are individually generated, but are incomplete and inconsistent. Change this by moving its generation to devm_cxl_add_port(). This unifies the messages and reduces code duplication. Also, generate messages on failure. Use a __devm_cxl_add_port() wrapper to keep the readability of the error exits. Signed-off-by: Robert Richter <rrichter@amd.com> Link: https://lore.kernel.org/r/20221018132341.76259-4-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-11-14 10:37:08 -08:00
Robert Richter	3bb80da51b	cxl/core: Check physical address before mapping it in devm_cxl_iomap_block() The physical base address of a CXL range can be invalid and is then set to CXL_RESOURCE_NONE. In general software shall prevent such situations, but it is hard to proof this may never happen. E.g. in add_port_attach_ep() there this the following: component_reg_phys = find_component_registers(uport_dev); port = devm_cxl_add_port(&parent_port->dev, uport_dev, component_reg_phys, parent_dport); find_component_registers() and subsequent functions (e.g. cxl_regmap_to_base()) may return CXL_RESOURCE_NONE. But it is written to port without any further check in cxl_port_alloc(): port->component_reg_phys = component_reg_phys; It is then later directly used in devm_cxl_setup_hdm() to map io ranges with devm_cxl_iomap_block(). Just an example... Check this condition. Also do not fail silently like an ioremap() failure, use a WARN_ON_ONCE() for it. Signed-off-by: Robert Richter <rrichter@amd.com> Link: https://lore.kernel.org/r/20221018132341.76259-3-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-11-14 10:37:08 -08:00
Robert Richter	fa89248e66	cxl/core: Remove duplicate declaration of devm_cxl_iomap_block() The function devm_cxl_iomap_block() is only used in the core code. There are two declarations in header files of it, in drivers/cxl/core/core.h and drivers/cxl/cxl.h. Remove its unused declaration in drivers/cxl/cxl.h. Fixing build error in regs.c found by kernel test robot by including "core.h" there. Signed-off-by: Robert Richter <rrichter@amd.com> Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/20221018132341.76259-2-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-11-14 10:37:08 -08:00
Ira Weiny	487d828d75	cxl/doe: Request exclusive DOE access The PCIE Data Object Exchange (DOE) mailbox is a protocol run over configuration cycles. It assumes one initiator at a time. While the kernel has control of the mailbox user space writes could interfere with the kernel access. Mark DOE mailbox config space exclusive when iterated by the CXL driver. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20220926215711.2893286-3-ira.weiny@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-11-14 10:07:22 -08:00
Dan Williams	8f401ec1c8	cxl/region: Recycle region ids At region creation time the next region-id is atomically cached so that there is predictability of region device names. If that region is destroyed and then a new one is created the region id increments. That ends up looking like a memory leak, or is otherwise surprising that identifiers roll forward even after destroying all previously created regions. Try to reuse rather than free old region ids at region release time. While this fixes a cosmetic issue, the needlessly advancing memory region-id gives the appearance of a memory leak, hence the "Fixes" tag, but no "Cc: stable" tag. Cc: Ben Widawsky <bwidawsk@kernel.org> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Fixes: `779dd20cfb` ("cxl/region: Add region creation support") Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Link: https://lore.kernel.org/r/166752186062.947915.13200195701224993317.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-11-04 16:03:43 -07:00
Dan Williams	e4f6dfa9ef	cxl/region: Fix 'distance' calculation with passthrough ports When programming port decode targets, the algorithm wants to ensure that two devices are compatible to be programmed as peers beneath a given port. A compatible peer is a target that shares the same dport, and where that target's interleave position also routes it to the same dport. Compatibility is determined by the device's interleave position being >= to distance. For example, if a given dport can only map every Nth position then positions less than N away from the last target programmed are incompatible. The @distance for the host-bridge's cxl_port in a simple dual-ported host-bridge configuration with 2 direct-attached devices is 1, i.e. An x2 region divided by 2 dports to reach 2 region targets. An x4 region under an x2 host-bridge would need 2 intervening switches where the @distance at the host bridge level is 2 (x4 region divided by 2 switches to reach 4 devices). However, the distance between peers underneath a single ported host-bridge is always zero because there is no limit to the number of devices that can be mapped. In other words, there are no decoders to program in a passthrough, all descendants are mapped and distance only starts matters for the intervening descendant ports of the passthrough port. Add tracking for the number of dports mapped to a port, and use that to detect the passthrough case for calculating @distance. Cc: <stable@vger.kernel.org> Reported-by: Bobo WL <lmw.bobo@gmail.com> Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: http://lore.kernel.org/r/20221010172057.00001559@huawei.com Fixes: `27b3f8d138` ("cxl/region: Program target lists") Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Link: https://lore.kernel.org/r/166752185440.947915.6617495912508299445.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-11-04 16:01:24 -07:00
Dan Williams	4d07ae22e7	cxl/pmem: Fix cxl_pmem_region and cxl_memdev leak When a cxl_nvdimm object goes through a ->remove() event (device physically removed, nvdimm-bridge disabled, or nvdimm device disabled), then any associated regions must also be disabled. As highlighted by the cxl-create-region.sh test [1], a single device may host multiple regions, but the driver was only tracking one region at a time. This leads to a situation where only the last enabled region per nvdimm device is cleaned up properly. Other regions are leaked, and this also causes cxl_memdev reference leaks. Fix the tracking by allowing cxl_nvdimm objects to track multiple region associations. Cc: <stable@vger.kernel.org> Link: https://github.com/pmem/ndctl/blob/main/test/cxl-create-region.sh [1] Reported-by: Vishal Verma <vishal.l.verma@intel.com> Fixes: `04ad63f086` ("cxl/region: Introduce cxl_pmem_region objects") Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Link: https://lore.kernel.org/r/166752183647.947915.2045230911503793901.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-11-04 15:58:35 -07:00
Dan Williams	0d9e734018	cxl/region: Fix cxl_region leak, cleanup targets at region delete When a region is deleted any targets that have been previously assigned to that region hold references to it. Trigger those references to drop by detaching all targets at unregister_region() time. Otherwise that region object will leak as userspace has lost the ability to detach targets once region sysfs is torn down. Cc: <stable@vger.kernel.org> Fixes: `b9686e8c8e` ("cxl/region: Enable the assignment of endpoint decoders to regions") Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Link: https://lore.kernel.org/r/166752183055.947915.17681995648556534844.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-11-04 15:58:35 -07:00
Dan Williams	a90accb358	cxl/region: Fix region HPA ordering validation Some regions may not have any address space allocated. Skip them when validating HPA order otherwise a crash like the following may result: devm_cxl_add_region: cxl_acpi cxl_acpi.0: decoder3.4: created region9 BUG: kernel NULL pointer dereference, address: 0000000000000000 [..] RIP: 0010:store_targetN+0x655/0x1740 [cxl_core] [..] Call Trace: <TASK> kernfs_fop_write_iter+0x144/0x200 vfs_write+0x24a/0x4d0 ksys_write+0x69/0xf0 do_syscall_64+0x3a/0x90 store_targetN+0x655/0x1740: alloc_region_ref at drivers/cxl/core/region.c:676 (inlined by) cxl_port_attach_region at drivers/cxl/core/region.c:850 (inlined by) cxl_region_attach at drivers/cxl/core/region.c:1290 (inlined by) attach_target at drivers/cxl/core/region.c:1410 (inlined by) store_targetN at drivers/cxl/core/region.c:1453 Cc: <stable@vger.kernel.org> Fixes: `384e624bb2` ("cxl/region: Attach endpoint decoders") Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/166752182461.947915.497032805239915067.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-11-04 15:58:35 -07:00
Yu Zhe	4f1aa35f1f	cxl/pmem: Use size_add() against integer overflow "struct_size() + n" may cause a integer overflow, use size_add() to handle it. Signed-off-by: Yu Zhe <yuzhe@nfschina.com> Link: https://lore.kernel.org/r/20220927070247.23148-1-yuzhe@nfschina.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-11-03 11:20:46 -07:00
Vishal Verma	71ee71d7ad	cxl/region: Fix decoder allocation crash When an intermediate port's decoders have been exhausted by existing regions, and creating a new region with the port in question in it's hierarchical path is attempted, cxl_port_attach_region() fails to find a port decoder (as would be expected), and drops into the failure / cleanup path. However, during cleanup of the region reference, a sanity check attempts to dereference the decoder, which in the above case didn't exist. This causes a NULL pointer dereference BUG. To fix this, refactor the decoder allocation and de-allocation into helper routines, and in this 'free' routine, check that the decoder, @cxld, is valid before attempting any operations on it. Cc: <stable@vger.kernel.org> Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Fixes: `384e624bb2` ("cxl/region: Attach endpoint decoders") Link: https://lore.kernel.org/r/20221101074100.1732003-1-vishal.l.verma@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-11-01 15:33:07 -07:00
Jonathan Cameron	f010c75c05	cxl/pmem: Fix failure to account for 8 byte header for writes to the device LSA. Writes to the device must include an offset and size as defined in CXL 2.0 8.2.9.5.2.4 Set LSA (Opcode 4103h) Fixes tag is non obvious as this code has been through several reworks and variable names + wasn't in use until the addition of the region code. Due to a bug in QEMU CXL emulation this overrun resulted in QEMU crashing. Reported-by: Bobo WL <lmw.bobo@gmail.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Fixes: `60b8f17215` ("cxl/pmem: Translate NVDIMM label commands to CXL label commands") Link: https://lore.kernel.org/r/20220815154044.24733-3-Jonathan.Cameron@huawei.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-10-20 16:28:53 -07:00
Jonathan Cameron	2816e24b05	cxl/region: Fix null pointer dereference due to pass through decoder commit Not all decoders have a commit callback. The CXL specification allows a host bridge with a single root port to have no explicit HDM decoders. Currently the region driver assumes there are none. As such the CXL core creates a special pass through decoder instance without a commit callback. Prior to this patch, the ->commit() callback was called unconditionally. Thus a configuration with 1 Host Bridge, 1 Root Port, 1 switch with multiple downstream ports below which there are multiple CXL type 3 devices results in a situation where committing the region causes a null pointer dereference. Reported-by: Bobo WL <lmw.bobo@gmail.com> Fixes: `176baefb2e` ("cxl/hdm: Commit decoder state to hardware") Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Link: https://lore.kernel.org/r/20220818164210.2084-1-Jonathan.Cameron@huawei.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-10-20 16:28:53 -07:00
Jonathan Cameron	cf00b33058	cxl/mbox: Add a check on input payload size A bug in the LSA code resulted in transfers slightly larger than the mailbox size. Let us make it easier to catch similar issues in future by adding a low level check. Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20220815154044.24733-2-Jonathan.Cameron@huawei.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-10-20 16:28:53 -07:00
Dan Williams	1cd8a2537e	cxl/hdm: Fix skip allocations vs multiple pmem allocations Vishal notes that when attempting to define a second pmem region on a device the DPA allocation fails with a message of the form: decoder11.1: failed to reserve skipped space Recall that the skip setting is used when there is a pmem allocation in the presence of free ram DPA space. The first pmem allocation skips over the free ram and subsequent pmem allocations do not require a skip. The bug is that a skip is still attempted and the DPA reservation code flags the double skip allocation conflict. Fixes: `cf880423b6` ("cxl/hdm: Add support for allocating DPA to an endpoint decoder") Reported-by: Vishal Verma <vishal.l.verma@intel.com> Tested-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/165973754730.1558392.15466392461645857658.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-08-05 16:11:38 -07:00
Dan Williams	4d8e4ea5bb	cxl/region: Disallow region granularity != window granularity The endpoint decode granularity must be <= the window granularity otherwise capacity in the endpoints is lost in the decode. Consider an attempt to have a region granularity of 512 with 4 devices within a window that maps 2 host bridges at a granularity of 256 bytes: HPA DPA Offset HB Port EP 0x0 0x0 0 0 0 0x100 0x0 1 0 2 0x200 0x100 0 0 0 0x300 0x100 1 0 2 0x400 0x200 0 1 1 0x500 0x200 1 1 3 0x600 0x300 0 1 1 0x700 0x300 1 1 3 0x800 0x400 0 0 0 0x900 0x400 1 0 2 0xA00 0x500 0 0 0 0xB00 0x500 1 0 2 Notice how endpoint0 maps HPA 0x0 and 0x200 correctly, but then at HPA 0x800 it results in DPA 0x200-0x400 on being skipped. Fix this by restricing the region granularity to be equal to the window granularity resulting in the following for a x4 region under a x2 window at a granularity of 256. HPA DPA Offset HB Port EP 0x0 0x0 0 0 0 0x100 0x0 1 0 2 0x200 0x0 0 1 1 0x300 0x0 1 1 3 0x400 0x100 0 0 0 0x500 0x100 1 0 2 0x600 0x100 0 1 1 0x700 0x100 1 1 3 Not that it ever made practical sense to support region granularity > window granularity. The window rotates host bridges causing endpoints to never see a consecutive stream of requests at the desired granularity without breaks to issue cycles to the other host bridge. Fixes: `80d10a6cee` ("cxl/region: Add interleave geometry attributes") Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/165973127171.1526540.9923273539049172976.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-08-05 16:10:04 -07:00
Dan Williams	298d44d04b	cxl/region: Fix x1 interleave to greater than x1 interleave routing In cases where the decode fans out as it traverses downstream, the interleave granularity needs to increment to identify the port selector bits out of the remaining address bits. For example, recall that with an x2 parent port intereleave (IW == 1), the downstream decode for children of those ports will either see address bit IG+8 always set, or address bit IG+8 always clear. So if the child port needs to select a downstream port it can only use address bits starting at IG+9 (where IG and IW are the CXL encoded values for interleave granularity (ilog2(ig) - 8) and ways (ilog2(iw))). When the parent port interleave is x1 no such masking occurs and the child port can maintain the granularity that was routed to the parent port. Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/165973126583.1526540.657948655360009242.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-08-05 16:10:03 -07:00
Dan Williams	910bc55da8	cxl/region: Move HPA setup to cxl_region_attach() A recent bug fix added the setup of the endpoint decoder interleave geometry settings to cxl_region_attach(). Move the HPA setup there as well to keep all endpoint decoder parameter setting in a central location. For symmetry, move endpoint HPA teardown to cxl_region_detach(), and for switches move HPA setup / teardown to cxl_port_{setup,reset}_targets(). Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/165973126020.1526540.14701949254436069807.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-08-05 16:10:03 -07:00
Dan Williams	2901c8bded	cxl/region: Fix decoder interleave programming Jonathan notes: "Curiously interleave ways = 1 for the EPs which is obviously wrong" ...while testing the latest CXL development branch on QEMU. It turns out the region creation process failed to program the endpoint decoders. This was missed because the default settings of x1 at 4K intereleave still results in the region appearing to function. Jonathan caught the bug by reverse mapping the translations that need to happen for the QEMU support. Link: https://lore.kernel.org/r/62e95fdf9f6e2_30440294e4@dwillia2-xfh.jf.intel.com.notmuch Fixes: `384e624bb2` ("cxl/region: Attach endpoint decoders") Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165951146336.967013.11160153960900111443.stgit@dwillia2-xfh.jf.intel.com Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-08-05 08:41:19 -07:00
Bagas Sanjaya	038e6eb803	cxl/region: describe targets and nr_targets members of cxl_region_params Sphinx reported undescribed parameters in cxl_region_params struct: ./drivers/cxl/cxl.h:376: warning: Function parameter or member 'targets' not described in 'cxl_region_params' ./drivers/cxl/cxl.h:376: warning: Function parameter or member 'nr_targets' not described in 'cxl_region_params' Describe these members. Fixes: `b9686e8c8e` ("cxl/region: Enable the assignment of endpoint decoders to regions") Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20220804075448.98241-3-bagasdotme@gmail.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-08-05 08:41:02 -07:00
Bagas Sanjaya	f13da0d9c3	cxl/regions: add padding for cxl_rr_ep_add nested lists Sphinx reported indentation warnings: Documentation/driver-api/cxl/memory-devices:457: ./drivers/cxl/core/region.c:732: WARNING: Unexpected indentation. Documentation/driver-api/cxl/memory-devices:457: ./drivers/cxl/core/region.c:733: WARNING: Block quote ends without a blank line; unexpected unindent. Documentation/driver-api/cxl/memory-devices:457: ./drivers/cxl/core/region.c:735: WARNING: Unexpected indentation. These warnings above are due to missing blank line padding in the nested list in kernel-doc comment for cxl_rr_ep_add(). Add the paddings to fix the warnings. Fixes: `384e624bb2` ("cxl/region: Attach endpoint decoders") Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20220804075448.98241-2-bagasdotme@gmail.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-08-05 08:41:02 -07:00
Dan Carpenter	9fd2cf4d6f	cxl/region: Fix IS_ERR() vs NULL check The nvdimm_pmem_region_create() function returns NULL on error. It does not return error pointers. Fixes: `04ad63f086` ("cxl/region: Introduce cxl_pmem_region objects") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/Yuo65lq2WtfdGJ0X@kili Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-08-05 08:41:02 -07:00
Dan Williams	e29a8995d6	cxl/region: Fix region reference target accounting Dan reports: The error handling in cxl_port_attach_region() looks like it might have a similar bug. The cxl_rr->nr_targets++; might want a --. That function is more complicated. Indeed cxl_rr->nr_targets leaks when cxl_rr_ep_add() fails, but that flow is not clear. Fix the bug and the clarity by separating the 'new' region-reference case from the 'extend' region-reference case. This also moves the host-physical-address (HPA) validation, that the HPA of a new region being accounted to the port is greater than the HPA of all other regions associated with the port, to alloc_region_ref(). Introduce @nr_targets_inc to track when the error exit path needs to clean up cxl_rr->nr_targets. Fixes: `384e624bb2` ("cxl/region: Attach endpoint decoders") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: http://lore.kernel.org/r/165939482134.252363.1915691883146696327.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-08-05 08:41:02 -07:00
Dan Williams	69c9961387	cxl/region: Fix region commit uninitialized variable warning 0day robot reports: drivers/cxl/core/region.c:196 cxl_region_decode_commit() error: uninitialized symbol 'rc'. The re-checking of loop termination conditions to determine "success" makes it hard to see that @rc is initialized in all cases. Remove those to make it explicit that @rc reflects a commit error and that the rest of logic is concerned with unwinding committed decoders. This change potentially results in cxl_region_decode_reset() being called with @count == 0 where it was not called before, but cxl_region_decode_reset() treats that as a nop. Fixes: `176baefb2e` ("cxl/hdm: Commit decoder state to hardware") Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: http://lore.kernel.org/r/165951148105.967013.14191992449932268431.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-08-05 08:41:02 -07:00
Dan Williams	8d42854257	cxl/region: Fix port setup uninitialized variable warnings 0day robot reports: drivers/cxl/core/region.c:1068 cxl_port_setup_targets() error: uninitialized symbol 'eiw'. drivers/cxl/core/region.c:1068 cxl_port_setup_targets() error: uninitialized symbol 'peig'. drivers/cxl/core/region.c:1068 cxl_port_setup_targets() error: uninitialized symbol 'peiw'. ...which are all valid reports. Add debug statement to consume the, albeit unexpected, errors. Fixes: `27b3f8d138` ("cxl/region: Program target lists") Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165951147487.967013.929590444907251028.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-08-05 08:41:02 -07:00
Dan Williams	817b279467	cxl/region: Stop initializing interleave granularity In preparation for a patch that validates that the region ways setting is compatible with the granularity setting, the initial granularity setting needs to start at zero to indicate "unset". Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/r/165853777484.2430596.3423921169034844397.stgit@dwillia2-xfh.jf.intel.com [djbw: fix up unused variable] Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-08-01 15:38:06 -07:00
Dan Williams	4d5c42a80b	cxl/hdm: Fix DPA reservation vs cxl_endpoint_decoder lifetime After adding support for emulating platform firmware established DPA reservations, the cxl-topology.sh [1] unit test started crashing with the following signature: general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6bc3: 0000 [#1] PREEMPT SMP [..] RIP: 0010:to_cxl_port+0x8/0x60 [cxl_core] [..] Call Trace: <TASK> __cxl_dpa_release+0x1b/0xd0 [cxl_core] cxl_dpa_release+0x1d/0x30 [cxl_core] release_nodes+0x63/0x90 devres_release_all+0x88/0xc0 ...i.e. a use after free of a 'struct cxl_endpoint_decoder' object. This results from the ordering of init_hdm_decoder() before add_hdm_decoder() where, at release time, the decoder is unregistered and released before the DPA reservation. Fix this by extending the life of the object until all DPA reservations have been released which also preserves platform decoder settings being settled by the time the decoder is published in sysfs (KOBJ_ADD time). Note that the @len == 0 case in __cxl_dpa_reserve() is avoided in practice as this function is only called for committed decoders and new non-zero DPA allocations. Link: https://github.com/pmem/ndctl/blob/pending/test/cxl-topology.sh [1] Fixes: `9c57cde0dc` ("cxl/hdm: Enumerate allocated DPA") Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Link: https://lore.kernel.org/r/165896020625.3546860.12390103413706292760.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-08-01 15:36:33 -07:00
Dan Williams	e77483055c	cxl/acpi: Minimize granularity for x1 interleaves The kernel enforces that region granularity is >= to the top-level interleave-granularity for the given CXL window. However, when the CXL window interleave is x1, i.e. non-interleaved at the host bridge level, then the specified granularity does not matter. Override the window specified granularity to the CXL minimum so that any valid region granularity is >= to the root granularity. Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/r/165853776917.2430596.16823264262010844458.stgit@dwillia2-xfh.jf.intel.com [djbw: add CXL_DECODER_MIN_GRANULARITY per vishal] Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-08-01 15:36:33 -07:00
Dan Williams	2bde6dbebc	cxl/region: Delete 'region' attribute from root decoders For switch and endpoint decoders the relationship of decoders to regions is 1:1. However, for root decoders the relationship is 1:N. Also, regions are already children of root decoders, so the 1:N relationship is observed by walking the following glob: /sys/bus/cxl/devices/$decoder/region* Hide the vestigial 'region' attribute for root decoders. Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/r/165853776328.2430596.4647259305040072751.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-08-01 15:36:33 -07:00
Dan Williams	a53c28b6ae	cxl/acpi: Autoload driver for 'cxl_acpi' test devices In support of CXL unit tests in the ndctl project, arrange for the cxl_acpi driver to load in response to the registration of cxl_test devices. Reported-by: Dave Jiang <dave.jiang@intel.com> Tested-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Link: https://lore.kernel.org/r/165853775783.2430596.13637998086505316619.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-08-01 15:36:32 -07:00
Dan Carpenter	5e42bcbc3f	cxl/region: decrement ->nr_targets on error in cxl_region_attach() The ++ needs a match -- on the clean up path. If the p->nr_targets value gets to be more than 16 it leads to uninitialized data in cxl_port_setup_targets(). drivers/cxl/core/region.c:995 cxl_port_setup_targets() error: uninitialized symbol 'eiw'. Fixes: `27b3f8d138` ("cxl/region: Program target lists") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/YuepCvUAoCtdpcoO@kili Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-08-01 14:24:34 -07:00
Dan Carpenter	c7e3548cac	cxl/region: prevent underflow in ways_to_cxl() The "ways" variable comes from the user. The ways_to_cxl() function has an upper bound but it doesn't check for negatives. Make the "ways" variable an unsigned int to fix this bug. Fixes: `80d10a6cee` ("cxl/region: Add interleave geometry attributes") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/Yueo3NV2hFCXx1iV@kili [djbw: fixup interleave_ways_store() to only accept unsigned input] Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-08-01 12:12:33 -07:00
Dan Carpenter	88ab1dde79	cxl/region: uninitialized variable in alloc_hpa() This should check "p->res" instead of "res" (which is uninitialized). Fixes: `23a22cd1c9` ("cxl/region: Allocate HPA capacity to regions") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/Yueor88I/DkVSOtL@kili Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-08-01 12:12:33 -07:00
Dan Williams	04ad63f086	cxl/region: Introduce cxl_pmem_region objects The LIBNVDIMM subsystem is a platform agnostic representation of system NVDIMM / persistent memory resources. To date, the CXL subsystem's interaction with LIBNVDIMM has been to register an nvdimm-bridge device and cxl_nvdimm objects to proxy CXL capabilities into existing LIBNVDIMM subsystem mechanics. With regions the approach is the same. Create a new cxl_pmem_region object to proxy CXL region details into a LIBNVDIMM definition. With this enabling LIBNVDIMM can partition CXL persistent memory regions with legacy namespace labels. A follow-on patch will add CXL region label and CXL namespace label support to persist region configurations across driver reload / system-reset events. Co-developed-by: Ben Widawsky <bwidawsk@kernel.org> Signed-off-by: Ben Widawsky <bwidawsk@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165784340111.1758207.3036498385188290968.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-26 12:23:01 -07:00
Dan Williams	99183d26ed	cxl/pmem: Fix offline_nvdimm_bus() to offline by bridge Be careful to only disable cxl_pmem objects related to a given cxl_nvdimm_bridge. Otherwise, offline_nvdimm_bus() reaches across CXL domains and disables more than is expected. Fixes: `21083f5152` ("cxl/pmem: Register 'pmem' / cxl_nvdimm devices") Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165784339569.1758207.1557084545278004577.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-26 12:23:01 -07:00
Dan Williams	8d48817df6	cxl/region: Add region driver boiler plate The CXL region driver is responsible for routing fully formed CXL regions to one of libnvdimm, for persistent memory regions, device-dax for volatile memory regions, or just act as an enumeration placeholder if the region was setup and configuration locked by platform firmware. In the platform-firmware-setup case the expectation is that region is already accounted in the system memory map, i.e. already enabled as "System RAM". For now, just attach to CXL regions in the CXL_CONFIG_COMMIT state, and take no further action. Given this driver is just a small / simple router, include it in the core rather than its own module. Co-developed-by: Ben Widawsky <bwidawsk@kernel.org> Signed-off-by: Ben Widawsky <bwidawsk@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20220624041950.559155-18-dan.j.williams@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-26 12:23:01 -07:00
Dan Williams	176baefb2e	cxl/hdm: Commit decoder state to hardware After all the soft validation of the region has completed, convey the region configuration to hardware while being careful to commit decoders in specification mandated order. In addition to programming the endpoint decoder base-address, interleave ways and granularity, the switch decoder target lists are also established. While the kernel can enforce spec-mandated commit order, it can not enforce spec-mandated reset order. For example, the kernel can't stop someone from removing an endpoint device that is occupying decoderN in a switch decoder where decoderN+1 is also committed. To reset decoderN, decoderN+1 must be torn down first. That "tear down the world" implementation is saved for a follow-on patch. Callback operations are provided for the 'commit' and 'reset' operations. While those callbacks may prove useful for CXL accelerators (Type-2 devices with memory) the primary motivation is to enable a simple way for cxl_test to intercept those operations. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165784338418.1758207.14659830845389904356.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-25 12:18:07 -07:00
Dan Williams	27b3f8d138	cxl/region: Program target lists Once the region's interleave geometry (ways, granularity, size) is established and all the endpoint decoder targets are assigned, the next phase is to program all the intermediate decoders. Specifically, each CXL switch in the path between the endpoint and its CXL host-bridge (including the logical switch internal to the host-bridge) needs to have its decoders programmed and the target list order assigned. The difficulty in this implementation lies in determining which endpoint decoder ordering combinations are valid. Consider the cxl_test case of 2 host bridges, each of those host-bridges attached to 2 switches, and each of those switches attached to 2 endpoints for a potential 8-way interleave. The x2 interleave at the host-bridge level requires that all even numbered endpoint decoder positions be located on the "left" hand side of the topology tree, and the odd numbered positions on the other. The endpoints that are peers on the same switch need to have a position that can be routed with a dedicated address bit per-endpoint. See check_last_peer() for the details. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165784337827.1758207.132121746122685208.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-25 12:18:07 -07:00
Dan Williams	384e624bb2	cxl/region: Attach endpoint decoders CXL regions (interleave sets) are made up of a set of memory devices where each device maps a portion of the interleave with one of its decoders (see CXL 2.0 8.2.5.12 CXL HDM Decoder Capability Structure). As endpoint decoders are identified by a provisioning tool they can be added to a region provided the region interleave properties are set (way, granularity, HPA) and DPA has been assigned to the decoder. The attach event triggers several validation checks, for example: - is the DPA sized appropriately for the region - is the decoder reachable via the host-bridges identified by the region's root decoder - is the device already active in a different region position slot - are there already regions with a higher HPA active on a given port (per CXL 2.0 8.2.5.12.20 Committing Decoder Programming) ...and the attach event affords an opportunity to collect data and resources relevant to later programming the target lists in switch decoders, for example: - allocate a decoder at each cxl_port in the decode chain - for a given switch port, how many the region's endpoints are hosted through the port - how many unique targets (next hops) does a port need to map to reach those endpoints The act of reconciling this information and deploying it to the decoder configuration is saved for a follow-on patch. Co-developed-by: Ben Widawsky <bwidawsk@kernel.org> Signed-off-by: Ben Widawsky <bwidawsk@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165784337277.1758207.4108508181328528703.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-25 12:18:07 -07:00
Dan Williams	6aa41144e7	cxl/acpi: Add a host-bridge index lookup mechanism The ACPI CXL Fixed Memory Window Structure (CFMWS) defines multiple methods to determine which host bridge provides access to a given endpoint relative to that device's position in the interleave. The "Interleave Arithmetic" defines either a "standard modulo" / round-random algorithm, or "xormap" based algorithm which can be defined as a non-linear transform. Given that there are already more options beyond "standard modulo" and that "xormap" may turn out to be ACPI CXL specific, provide a callback for the region provisioning code to map endpoint positions back to expected host bridge id (cxl_dport target). For now just support the simple modulo math case and save the xormap for a follow-on change. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20220624041950.559155-14-dan.j.williams@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-25 12:18:07 -07:00
Dan Williams	b9686e8c8e	cxl/region: Enable the assignment of endpoint decoders to regions The region provisioning process involves allocating DPA to a set of endpoint decoders, and HPA plus the region geometry to a region device. Then the decoder is assigned to the region. At this point several validation steps can be performed to validate that the decoder is suitable to participate in the region. Co-developed-by: Ben Widawsky <bwidawsk@kernel.org> Signed-off-by: Ben Widawsky <bwidawsk@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/r/165784336184.1758207.16403282029203949622.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-25 12:18:07 -07:00
Dan Williams	23a22cd1c9	cxl/region: Allocate HPA capacity to regions After a region's interleave parameters (ways and granularity) are set, add a way for regions to allocate HPA (host physical address space) from the free capacity in their parent root-decoder. The allocator for this capacity reuses the 'struct resource' based allocator used for CONFIG_DEVICE_PRIVATE. Once the tuple of "ways, granularity, [uuid], and size" is set the region configuration transitions to the CXL_CONFIG_INTERLEAVE_ACTIVE state which is a precursor to allowing endpoint decoders to be added to a region. Co-developed-by: Ben Widawsky <bwidawsk@kernel.org> Signed-off-by: Ben Widawsky <bwidawsk@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165784335630.1758207.420216490941955417.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-25 12:18:06 -07:00
Ben Widawsky	80d10a6cee	cxl/region: Add interleave geometry attributes Add ABI to allow the number of devices that comprise a region to be set as well as the interleave granularity for the region. Signed-off-by: Ben Widawsky <bwidawsk@kernel.org> [djbw: reword changelog] Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20220624041950.559155-11-dan.j.williams@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-25 12:18:06 -07:00
Ben Widawsky	dd5ba0ebbd	cxl/region: Add a 'uuid' attribute The process of provisioning a region involves triggering the creation of a new region object, pouring in the configuration, and then binding that configured object to the region driver to start its operation. For persistent memory regions the CXL specification mandates that it identified by a uuid. Add an ABI for userspace to specify a region's uuid. Signed-off-by: Ben Widawsky <bwidawsk@kernel.org> [djbw: simplify locking] Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165784334465.1758207.8224025435884752570.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-25 12:18:06 -07:00
Ben Widawsky	779dd20cfb	cxl/region: Add region creation support CXL 2.0 allows for dynamic provisioning of new memory regions (system physical address resources like "System RAM" and "Persistent Memory"). Whereas DDR and PMEM resources are conveyed statically at boot, CXL allows for assembling and instantiating new regions from the available capacity of CXL memory expanders in the system. Sysfs with an "echo $region_name > $create_region_attribute" interface is chosen as the mechanism to initiate the provisioning process. This was chosen over ioctl() and netlink() to keep the configuration interface entirely in a pseudo-fs interface, and it was chosen over configfs since, aside from this one creation event, the interface is read-mostly. I.e. configfs supports cases where an object is designed to be provisioned each boot, like an iSCSI storage target, and CXL region creation is mostly for PMEM regions which are created usually once per-lifetime of a server instance. This is an improvement over nvdimm that pre-created "seed" devices that tended to confuse users looking to determine which devices are active and which are idle. Recall that the major change that CXL brings over previous persistent memory architectures is the ability to dynamically define new regions. Compare that to drivers like 'nfit' where the region configuration is statically defined by platform firmware. Regions are created as a child of a root decoder that encompasses an address space with constraints. When created through sysfs, the root decoder is explicit. When created from an LSA's region structure a root decoder will possibly need to be inferred by the driver. Upon region creation through sysfs, a vacant region is created with a unique name. Regions have a number of attributes that must be configured before the region can be bound to the driver where HDM decoder program is completed. An example of creating a new region: - Allocate a new region name: region=$(cat /sys/bus/cxl/devices/decoder0.0/create_pmem_region) - Create a new region by name: while region=$(cat /sys/bus/cxl/devices/decoder0.0/create_pmem_region) ! echo $region > /sys/bus/cxl/devices/decoder0.0/create_pmem_region do true; done - Region now exists in sysfs: stat -t /sys/bus/cxl/devices/decoder0.0/$region - Delete the region, and name: echo $region > /sys/bus/cxl/devices/decoder0.0/delete_region Signed-off-by: Ben Widawsky <bwidawsk@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165784333909.1758207.794374602146306032.stgit@dwillia2-xfh.jf.intel.com [djbw: simplify locking, reword changelog] Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-21 17:19:25 -07:00
Dan Williams	7f8faf96a2	cxl/mem: Enumerate port targets before adding endpoints The port scanning algorithm in devm_cxl_enumerate_ports() walks up the topology and adds cxl_port objects starting from the root down to the endpoint. When those ports are initially created they know all their dports, but they do not know the downstream cxl_port instance that represents the next descendant in the topology. Rework create_endpoint() into devm_cxl_add_endpoint() that enumerates the downstream cxl_port topology into each port's 'struct cxl_ep' record for each endpoint it that the port is an ancestor. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20220624041950.559155-7-dan.j.williams@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-21 17:19:25 -07:00
Ben Widawsky	538831f1be	cxl/hdm: Add sysfs attributes for interleave ways + granularity The region provisioning flow involves selecting interleave ways + granularity settings for a region, and then programming the decoder topology to meet those constraints, if possible. For example, root decoders set the minimum interleave ways + granularity for any hosted regions. Given decoder programming is not atomic and collisions can occur between multiple requesting regions userspace will be responsible for conflict resolution and it needs these attributes to make those decisions. Signed-off-by: Ben Widawsky <bwidawsk@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165784332235.1758207.7185062713652694607.stgit@dwillia2-xfh.jf.intel.com [djbw: reword changelog, make read-only, add sysfs ABI documentaion] Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-21 17:19:25 -07:00
Dan Williams	391785859e	cxl/port: Move dport tracking to an xarray Reduce the complexity and the overhead of walking the topology to determine endpoint connectivity to root decoder interleave configurations. Note that cxl_detach_ep(), after it determines that the last @ep has departed and decides to delete the port, now needs to walk the dport array with the device_lock() held to remove entries. Previously list_splice_init() could be used atomically delete all dport entries at once and then perform entry tear down outside the lock. There is no list_splice_init() equivalent for the xarray. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165784331647.1758207.6345820282285119339.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-21 17:19:24 -07:00
Dan Williams	256d0e9ee4	cxl/port: Move 'cxl_ep' references to an xarray per port In preparation for region provisioning that needs to walk the topology by endpoints, use an xarray to record endpoint interest in a given port. In addition to being more space and time efficient it also reduces the complexity of the implementation by moving locking internal to the xarray implementation. It also allows for a single cxl_ep reference to be recorded in multiple xarrays. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20220624041950.559155-2-dan.j.williams@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-21 17:19:24 -07:00
Dan Williams	1b58b4cac6	cxl/port: Record parent dport when adding ports At the time that cxl_port instances are being created, cache the dport from the parent port that points to this new child port. This will be useful for region provisioning when walking the tree to calculate decoder targets, and saves rewalking the dport list after the fact to build this information. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20220624041950.559155-1-dan.j.williams@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-21 17:19:24 -07:00
Dan Williams	de516b4011	cxl/port: Record dport in endpoint references Recall that the primary role of the cxl_mem driver is to probe if the given endpoint is connected to a CXL port topology. In that process it walks its device ancestry to its PCI root port. If that root port is also a CXL root port then the probe process adds cxl_port object instances at switch in the path between to the root and the endpoint. As those cxl_port instances are added, or if a previous enumeration attempt already created the port, a 'struct cxl_ep' instance is registered with that port to track the endpoints interested in that port. At the time the cxl_ep is registered the downstream egress path from the port to the endpoint is known. Take the opportunity to record that information as it will be needed for dynamic programming of decoder targets during region provisioning. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165784329944.1758207.15203961796832072116.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-21 17:19:24 -07:00
Dan Williams	cf880423b6	cxl/hdm: Add support for allocating DPA to an endpoint decoder The region provisioning flow will roughly follow a sequence of: 1/ Allocate DPA to a set of decoders 2/ Allocate HPA to a region 3/ Associate decoders with a region and validate that the DPA allocations and topologies match the parameters of the region. For now, this change (step 1) arranges for DPA capacity to be allocated and deleted from non-committed decoders based on the decoder's mode / partition selection. Capacity is allocated from the lowest DPA in the partition and any 'pmem' allocation blocks out all remaining ram capacity in its 'skip' setting. DPA allocations are enforced in decoder instance order. I.e. decoder N + 1 always starts at a higher DPA than instance N, and deleting allocations must proceed from the highest-instance allocated decoder to the lowest. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165784329399.1758207.16732038126938632700.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-21 17:19:24 -07:00
Dan Williams	0c33b39352	cxl/hdm: Track next decoder to allocate The CXL specification enforces that endpoint decoders are committed in hw instance id order. In preparation for adding dynamic DPA allocation, record the hw instance id in endpoint decoders, and enforce allocations to occur in hw instance id order. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165784328827.1758207.9627538529944559954.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-21 17:19:23 -07:00
Dan Williams	2c8669033f	cxl/hdm: Add 'mode' attribute to decoder objects Recall that the Device Physical Address (DPA) space of a CXL Memory Expander is potentially partitioned into a volatile and persistent portion. A decoder maps a Host Physical Address (HPA) range to a DPA range and that translation depends on the value of all previous (lower instance number) decoders before the current one. In preparation for allowing dynamic provisioning of regions, decoders need an ABI to indicate which DPA partition a decoder targets. This ABI needs to be prepared for the possibility that some other agent committed and locked a decoder that spans the partition boundary. Add 'decoderX.Y/mode' to endpoint decoders that indicates which partition 'ram' / 'pmem' the decoder targets, or 'mixed' if the decoder currently spans the partition boundary. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165603881967.551046.6007594190951596439.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-21 17:19:23 -07:00
Dan Williams	9c57cde0dc	cxl/hdm: Enumerate allocated DPA In preparation for provisioning CXL regions, add accounting for the DPA space consumed by existing regions / decoders. Recall, a CXL region is a memory range comprised from one or more endpoint devices contributing a mapping of their DPA into HPA space through a decoder. Record the DPA ranges covered by committed decoders at initial probe of endpoint ports relative to a per-device resource tree of the DPA type (pmem or volatile-ram). The cxl_dpa_rwsem semaphore is introduced to globally synchronize DPA state across all endpoints and their decoders at once. The vast majority of DPA operations are reads as region creation is expected to be as rare as disk partitioning and volume creation. The device_lock() for this synchronization is specifically avoided for concern of entangling with sysfs attribute removal. Co-developed-by: Ben Widawsky <bwidawsk@kernel.org> Signed-off-by: Ben Widawsky <bwidawsk@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165784327682.1758207.7914919426043855876.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-21 17:19:12 -07:00
Dan Williams	3bf65915ce	cxl/core: Define a 'struct cxl_endpoint_decoder' Previously the target routing specifics of switch decoders and platform CXL window resource tracking of root decoders were factored out of 'struct cxl_decoder'. While switch decoders translate from SPA to downstream ports, endpoint decoders translate from SPA to DPA. This patch, 3 of 3, adds a 'struct cxl_endpoint_decoder' that tracks an endpoint-specific Device Physical Address (DPA) resource. For now this just defines ->dpa_res, a follow-on patch will handle requesting DPA resource ranges from a device-DPA resource tree. Co-developed-by: Ben Widawsky <bwidawsk@kernel.org> Signed-off-by: Ben Widawsky <bwidawsk@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165784327088.1758207.15502834501671201192.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-21 08:41:20 -07:00
Dan Williams	0f157c7fa1	cxl/core: Define a 'struct cxl_root_decoder' Previously the target routing specifics of switch decoders were factored out of 'struct cxl_decoder' into 'struct cxl_switch_decoder'. This patch, 2 of 3, adds a 'struct cxl_root_decoder' as a superset of a switch decoder that also track the associated CXL window platform resource. Note that the reason the resource for a given root decoder needs to be looked up after the fact (i.e. after cxl_parse_cfmws() and add_cxl_resource()) is because add_cxl_resource() may have merged CXL windows in order to keep them at the top of the resource tree / decode hierarchy. Co-developed-by: Ben Widawsky <bwidawsk@kernel.org> Signed-off-by: Ben Widawsky <bwidawsk@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165784326541.1758207.9915663937394448341.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-21 08:40:47 -07:00
Dan Williams	974854ab07	cxl/acpi: Track CXL resources in iomem_resource Recall that CXL capable address ranges, on ACPI platforms, are published in the CEDT.CFMWS (CXL Early Discovery Table: CXL Fixed Memory Window Structures). These windows represent both the actively mapped capacity and the potential address space that can be dynamically assigned to a new CXL decode configuration (region / interleave-set). CXL endpoints like DDR DIMMs can be mapped at any physical address including 0 and legacy ranges. There is an expectation and requirement that the /proc/iomem interface and the iomem_resource tree in the kernel reflect the full set of platform address ranges. I.e. that every address range that platform firmware and bus drivers enumerate be reflected as an iomem_resource entry. The hard requirement to do this for CXL arises from the fact that facilities like CONFIG_DEVICE_PRIVATE expect to be able to treat empty iomem_resource ranges as free for software to use as proxy address space. Without CXL publishing its potential address ranges in iomem_resource, the CONFIG_DEVICE_PRIVATE mechanism may inadvertently steal capacity reserved for runtime provisioning of new CXL regions. So, iomem_resource needs to know about both active and potential CXL resource ranges. The active CXL resources might already be reflected in iomem_resource as "System RAM". insert_resource_expand_to_fit() handles re-parenting "System RAM" underneath a CXL window. The "_expand_to_fit()" behavior handles cases where a CXL window is not a strict superset of an existing entry in the iomem_resource tree. The "_expand_to_fit()" behavior is acceptable from the perspective of resource allocation. The expansion happens because a conflicting resource range is already populated, which means the resource boundary expansion does not result in any additional free CXL address space being made available. CXL address space allocation is always bounded by the orginal unexpanded address range. However, the potential for expansion does mean that something like walk_iomem_res_desc(IORES_DESC_CXL...) can only return fuzzy answers on corner case platforms that cause the resource tree to expand a CXL window resource over a range that is not decoded by CXL. This would be an odd platform configuration, but if it becomes a problem in practice the CXL subsytem could just publish an API that returns definitive answers. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: David Hildenbrand <david@redhat.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Christoph Hellwig <hch@lst.de> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/165784325943.1758207.5310344844375305118.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-21 08:40:01 -07:00
Dan Williams	e636479e2f	cxl/core: Define a 'struct cxl_switch_decoder' Currently 'struct cxl_decoder' contains the superset of attributes needed for all decoder types. Before more type-specific attributes are added to the common definition, reorganize 'struct cxl_decoder' into type specific objects. This patch, the first of three, factors out a cxl_switch_decoder type. See the new kdoc for what a 'struct cxl_switch_decoder' represents in a CXL topology. Co-developed-by: Ben Widawsky <bwidawsk@kernel.org> Signed-off-by: Ben Widawsky <bwidawsk@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/r/165784325340.1758207.5064717153608954960.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-21 08:34:16 -07:00
Ira Weiny	c97006046c	cxl/port: Read CDAT table The per-device CDAT data provides performance data that is relevant for mapping which CXL devices can participate in which CXL ranges by QTG (QoS Throttling Group) (per ECN: CXL 2.0 CEDT CFMWS & QTG_DSM) [1]. The QTG association specified in the ECN is advisory. Until the cxl_acpi driver grows support for invoking the QTG _DSM method the CDAT data is only of interest to userspace that may need it for debug purposes. Search the DOE mailboxes available, query CDAT data, cache the data and make it available via a sysfs binary attribute per endpoint at: /sys/bus/cxl/devices/endpointX/CDAT ...similar to other ACPI-structured table data in /sys/firmware/ACPI/tables. The CDAT is relative to 'struct cxl_port' objects since switches in addition to endpoints can host a CDAT instance. Switch CDAT support is not implemented. This does not support table updates at runtime. It will always provide whatever was there when first cached. It is also the case that table updates are not expected outside of explicit DPA address map affecting commands like Set Partition with the immediate flag set. Given that the driver does not support Set Partition with the immediate flag set there is no current need for update support. Link: https://www.computeexpresslink.org/spec-landing [1] Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> [djbw: drop in-kernel parsing infra for now, and other minor fixups] Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20220719205249.566684-7-ira.weiny@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-19 15:38:05 -07:00
Ira Weiny	3eddcc9385	cxl/pci: Create PCI DOE mailbox's for memory devices DOE mailbox objects will be needed for various mailbox communications with each memory device. Iterate each DOE mailbox capability and create PCI DOE mailbox objects as found. It is not anticipated that this is the final resting place for the iteration of the DOE devices. The support of switch ports will drive this code into the PCIe side. In this imagined architecture the CXL port driver would then query into the PCI device for the DOE mailbox array. For now creating the mailboxes in the CXL port is good enough for the endpoints. Later PCIe ports will need to support this to support switch ports more generically. Cc: Dan Williams <dan.j.williams@intel.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Lukas Wunner <lukas@wunner.de> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20220719205249.566684-5-ira.weiny@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-19 15:38:04 -07:00
Dan Williams	b060edfd8c	cxl/pmem: Delete unused nvdimm attribute While there is a need to go from a LIBNVDIMM 'struct nvdimm' to a CXL 'struct cxl_nvdimm', there is no use case to go the other direction. Likely this is a leftover from an early version of the referenced commit before it implemented devm for releasing the created nvdimm. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20220624041950.559155-19-dan.j.williams@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-11 12:34:55 -07:00
Dan Williams	9e9e44017d	cxl/hdm: Initialize decoder type for memory expander devices Unless and until accelerator (type-2) drivers start registering for CXL.mem mapping services from the CXL subsystem core, initialize idle HDM decoders to the "expander" type. I.e. the only CXL devices using the CXL core presently are those implementing the CXL 2.0 Type-3 memory expander device class code that the cxl_pci driver claims. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20220624041950.559155-6-dan.j.williams@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-10 13:41:49 -07:00
Dan Williams	ee80001083	cxl/port: Cache CXL host bridge data Region creation has need for checking host-bridge connectivity when adding endpoints to regions. Record, at port creation time, the host-bridge to provide a useful shortcut from any location in the topology to the most-significant ancestor. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20220624041950.559155-4-dan.j.williams@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-10 12:10:07 -07:00
Dan Williams	e7ad1bf683	tools/testing/cxl: Add partition support In support of testing DPA allocation mechanisms in the CXL core, the cxl_test environment needs to support establishing and retrieving the 'pmem partition boundary. Replace the platform_device_add_resources() method for delineating DPA within an endpoint with an emulated DEV_SIZE amount of partitionable capacity. Set DEV_SIZE such that an endpoint has enough capacity to simultaneously participate in 8 distinct regions. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165603887411.551046.13234212587991192347.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-10 10:29:26 -07:00
Dan Williams	cc2a487870	cxl/mem: Add a debugfs version of 'iomem' for DPA, 'dpamem' Dump the device-physical-address map for a CXL expander in /proc/iomem style format. E.g.: cat /sys/kernel/debug/cxl/mem1/dpamem 00000000-0fffffff : ram 10000000-1fffffff : pmem Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165603885318.551046.8308248564880066726.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-10 10:10:30 -07:00
Dan Williams	9b99ecf5a3	cxl/debug: Move debugfs init to cxl_core_init() In preparation for a new cxl debugfs file, move 'cxl' directory establishment and teardown to the core and let subsequent init routines reference that setup. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165603884654.551046.4962104601691723080.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-10 09:57:28 -07:00
Ben Widawsky	14e473e1a7	cxl/hdm: Require all decoders to be enumerated In preparation for region provisioning all device decoders need to be enumerated since DPA allocations are calculated by summing the capacities of all decoders in a set. I.e. the programming for decoder[N] depends on the state of decoder[N-1], so skipping over decoders that fail to initialize prevents accurate DPA accounting. Signed-off-by: Ben Widawsky <bwidawsk@kernel.org> [djbw: reword changelog] Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165603879664.551046.6863805202478861026.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-09 19:45:32 -07:00
Dan Williams	d3b75029f3	cxl/mem: Convert partition-info to resources To date the per-device-partition DPA range information has only been used for enumeration purposes. In preparation for allocating regions from available DPA capacity, convert those ranges into DPA-type resource trees. With resources and the new add_dpa_res() helper some open coded end address calculations and debug prints can be cleaned. The 'cxlds->pmem_res' and 'cxlds->ram_res' resources are child resources of the total-device DPA space and they in turn will host DPA allocations from cxl_endpoint_decoder instances (tracked by cxled->dpa_res). Cc: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165603878921.551046.8127845916514734142.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-09 19:43:30 -07:00
Dan Williams	419af595b1	cxl: Introduce cxl_to_{ways,granularity} Interleave granularity and ways have CXL specification defined encodings. Promote the conversion helpers to a common header, and use them to replace other open-coded instances. Force caller to consider the error case of the conversion similarly to other conversion helpers like kstrto*(). Co-developed-by: Ben Widawsky <bwidawsk@kernel.org> Signed-off-by: Ben Widawsky <bwidawsk@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165603875016.551046.17236943065932132355.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-09 16:42:44 -07:00
Dan Williams	885d3bed6d	cxl/core: Drop is_cxl_decoder() This helper was only used to identify the object type for lockdep purposes. Now that lockdep support is done with explicit lock classes, this helper can be dropped. Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Adam Manzanares <a.manzanares@samsung.com> Link: https://lore.kernel.org/r/165603874340.551046.15491766127759244728.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-09 16:41:02 -07:00
Dan Williams	e50fe01e1f	cxl/core: Drop ->platform_res attribute for root decoders Root decoders are responsible for hosting the available host address space for endpoints and regions to claim. The tracking of that available capacity can be done in iomem_resource directly. As a result, root decoders no longer need to host their own resource tree. The current ->platform_res attribute was added prematurely. Otherwise, ->hpa_range fills the role of conveying the current decode range of the decoder. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Adam Manzanares <a.manzanares@samsung.com> Link: https://lore.kernel.org/r/165603873619.551046.791596854070136223.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-09 16:23:37 -07:00
Dan Williams	e8b7ea58ab	cxl/core: Rename ->decoder_range ->hpa_range In preparation for growing a ->dpa_range attribute for endpoint decoders, rename the current ->decoder_range to the more descriptive ->hpa_range. Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Adam Manzanares <a.manzanares@samsung.com> Link: https://lore.kernel.org/r/165603872867.551046.2170426227407458814.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-09 16:22:41 -07:00
Ben Widawsky	04ed37a2ba	cxl/hdm: Use local hdm variable Save a few characters and use the already initialized local variable. Signed-off-by: Ben Widawsky <bwidawsk@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Adam Manzanares <a.manzanares@samsung.com> Link: https://lore.kernel.org/r/165603872171.551046.913207574344536475.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-09 16:21:19 -07:00
Dan Williams	fe80f1ad59	cxl/port: Keep port->uport valid for the entire life of a port The upcoming region provisioning implementation has a need to dereference port->uport during the port unregister flow. Specifically, endpoint decoders need to be able to lookup their corresponding memdev via port->uport. The existing ->dead flag was added for cases where the core was committed to tearing down the port, but needed to drop locks before calling device_unregister(). Reuse that flag to indicate to delete_endpoint() that it has no "release action" work to do as unregister_port() will handle it. Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Adam Manzanares <a.manzanares@samsung.com> Link: https://lore.kernel.org/r/165603871491.551046.6682199179541194356.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-07-09 13:06:50 -07:00
Vishal Verma	e35f571890	cxl/mbox: Fix missing variable payload checks in cmd size validation The conversion of command sizes to unsigned missed a couple of checks against variable size payloads during command validation, which made all variable payload commands unconditionally fail. Add the checks back using the new CXL_VARIABLE_PAYLOAD scheme. Fixes: `26f89535a5` ("cxl/mbox: Use type __u32 for mailbox payload sizes") Cc: <stable@vger.kernel.org> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Alison Schofield <alison.schofield@intel.com> Reported-by: Abhi Cs <abhi.cs@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Link: https://lore.kernel.org/r/20220628220109.633564-1-vishal.l.verma@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-06-28 22:03:18 -07:00
Alison Schofield	8a66487506	cxl/mbox: Use __le32 in get,set_lsa mailbox structures CXL specification defines these as little endian. Fixes: `60b8f17215` ("cxl/pmem: Translate NVDIMM label commands to CXL label commands") Reported-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/r/20220225221456.1025635-1-alison.schofield@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-06-21 14:09:00 -07:00
Ben Widawsky	8ae3cebc17	cxl/core: Use is_endpoint_decoder Save some characters and directly check decoder type rather than port type. There's no need to check if the port is an endpoint port since, by this point, cxl_endpoint_decoder_alloc() has a specified type. Reviewed by: Adam Manzanares <a.manzanares@samsung.com> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-06-21 14:09:00 -07:00
Jonathan Cameron	db9a3a35d3	cxl: Fix cleanup of port devices on failure to probe driver. The device is created, and then there is a check if a driver succesfully bound to it. In event of failing the bind (e.g. failure in cxl_port_probe()) the device is left registered. When a bus rescan later occurs, fresh devices are created leading to a multiple device representing the same underlying hardware. Bad things may follow and at very least we have far too many devices. Fix by ensuring autoremove is registered if the device create succeeds, but doesn't depend on sucessful binding to a driver. Bug was observed as side effect of incorrect ownership in [PATCH v9 6/9] cxl/port: Read CDAT table but will result from any failure to in cxl_port_probe(). Fixes: `8dd2bc0f8e` ("cxl/mem: Add the cxl_mem driver") Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20220609134519.11668-1-Jonathan.Cameron@huawei.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-06-21 14:09:00 -07:00
Dan Williams	34e37b4c43	cxl/port: Enable HDM Capability after validating DVSEC Ranges CXL memory expanders that support the CXL 2.0 memory device class code include an "HDM Decoder Capability" mechanism to supplant the "CXL DVSEC Range" mechanism originally defined in CXL 1.1. Both mechanisms depend on a "mem_enable" bit being set in configuration space before either mechanism activates. When the HDM Decoder Capability is enabled the CXL DVSEC Range settings are ignored. Previously, the cxl_mem driver was relying on platform-firmware to set "mem_enable". That is an invalid assumption as there is no requirement that platform-firmware sets the bit before the driver sees a device, especially in hot-plug scenarios. Additionally, ACPI-platforms that support CXL 2.0 devices also support the ACPI CEDT (CXL Early Discovery Table). That table outlines the platform permissible address ranges for CXL operation. So, there is a need for the driver to set "mem_enable", and there is information available to determine the validity of the CXL DVSEC Ranges. Arrange for the driver to optionally enable the HDM Decoder Capability if "mem_enable" was not set by platform firmware, or the CXL DVSEC Range configuration was invalid. Be careful to only disable memory decode if the kernel was the one to enable it. In other words, if CXL is backing all of kernel memory at boot the device needs to maintain "mem_enable" and "HDM Decoder enable" all the way up to handoff back to platform firmware (e.g. ACPI S5 state entry may require CXL memory to stay active). Fixes: `560f785590` ("cxl/pci: Retrieve CXL DVSEC memory info") Cc: Dan Carpenter <dan.carpenter@oracle.com> [dan: fix early terminiation of range-allowed loop] Cc: Ariel Sibley <ariel.sibley@microchip.com> [ariel: Memory_size must be non-zero] Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/165307136375.2499769.861793697156744166.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-05-20 12:30:53 -07:00
Dan Williams	fcfbc93cc3	cxl/port: Reuse 'struct cxl_hdm' context for hdm init The port driver maps component registers for port operations. Reuse that mapping for HDM Decoder Capability setup / enable. Move devm_cxl_setup_hdm() before cxl_hdm_decode_init() and plumb @cxlhdm through the hdm init helpers. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165291691712.1426646.14336397551571515480.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-05-19 08:50:42 -07:00
Dan Williams	5e5f4ad52f	cxl/port: Move endpoint HDM Decoder Capability init to port driver The responsibility for establishing HDM Decoder Capability based operation is more closely tied to port enabling than memdev enabling which is concerned with port enumeration. This later enables reusing @cxlhdm for probing / controlling "global enable" for the HDM Decoder Capability. For now, just do the nominal move. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165291691167.1426646.7936109077255288258.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-05-19 08:50:41 -07:00
Dan Williams	92804edb11	cxl/pci: Drop @info argument to cxl_hdm_decode_init() Now that nothing external to cxl_hdm_decode_init() considers 'struct cxl_endpoint_dvec_info' move it internal to cxl_hdm_decode_init(). Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165291690612.1426646.7866084245521113414.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-05-19 08:50:41 -07:00
Dan Williams	a12562bb70	cxl/mem: Merge cxl_dvsec_ranges() and cxl_hdm_decode_init() In preparation for changing how the driver handles 'mem_enable' in the CXL DVSEC control register. Merge the contents of cxl_hdm_decode_init() into cxl_dvsec_ranges() and rename the combined function cxl_hdm_decode_init(). The possible cleanups and fixes that result from this merge are saved for a follow-on change. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/165291690027.1426646.10249756632415633752.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-05-19 08:50:41 -07:00
Dan Williams	dd2d42ad6f	cxl/mem: Skip range enumeration if mem_enable clear When a device does not have mem_enable set then the current range settings are moot. Skip the enumeration and cause cxl_hdm_decode_init() to proceed directly to enable the HDM Decoder Capability. Fixes: `560f785590` ("cxl/pci: Retrieve CXL DVSEC memory info") Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165291689442.1426646.18012291761753694336.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-05-19 08:50:41 -07:00
Dan Williams	14d7887407	cxl/mem: Consolidate CXL DVSEC Range enumeration in the core In preparation for fixing the setting of the 'mem_enabled' bit in CXL DVSEC Control register, move all CXL DVSEC range enumeration into the same source file. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165291688886.1426646.15046138604010482084.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-05-19 08:50:41 -07:00
Dan Williams	2e4ba0ec97	cxl/pci: Move cxl_await_media_ready() to the core Allow cxl_await_media_ready() to be mocked for testing purposes rather than carrying the maintenance burden of an indirect function call in the mainline driver. With the move cxl_await_media_ready() can no longer reuse the mailbox timeout override, so add a media_ready_timeout module parameter to the core to backfill. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165291688340.1426646.4755627801983775011.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-05-19 08:50:41 -07:00
Dan Williams	75b7ae2999	cxl/mem: Validate port connectivity before dvsec ranges In preparation for validating DVSEC ranges against the platform declared CXL memory ranges (ACPI CFMWS) move port enumeration before the endpoint's decoder validation. Ultimately this logic will move to the port driver, but create a bisect point before that larger move. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165291687749.1426646.18091538443879226995.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-05-19 08:50:41 -07:00
Dan Williams	76a4121e86	cxl/mem: Fix cxl_mem_probe() error exit The addition of cxl_mem_active() broke error exit scenarios for cxl_mem_probe(). Return early rather than proceed with disabling suspend, and update the label name since it is no longer a terminal "out" label that exits the function. Fixes: `9ea4dcf498` ("PM: CXL: Disable suspend") Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165291687176.1426646.15449254938752532784.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-05-19 08:50:41 -07:00
Dan Williams	194d5edadf	cxl/pci: Drop wait_for_valid() from cxl_await_media_ready() A check mem_info_valid already happens in __cxl_dvsec_ranges(). Rely on that instead of calling wait_for_valid again. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165291686632.1426646.7479581732894574486.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-05-19 08:50:40 -07:00
Dan Williams	1e14c9fbb5	cxl/pci: Consolidate wait_for_media() and wait_for_media_ready() Now that wait_for_media() does nothing supplemental to wait_for_media_ready() just promote wait_for_media_ready() to a common helper and drop wait_for_media(). Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165291686046.1426646.4390664747934592185.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-05-19 08:50:40 -07:00
Dan Williams	2bcf3bbd34	cxl/mem: Drop mem_enabled check from wait_for_media() Media ready is asserted by the device independent of whether mem_enabled was ever set. Drop this check to allow for dropping wait_for_media() in favor of ->wait_media_ready(). Fixes: `8dd2bc0f8e` ("cxl/mem: Add the cxl_mem driver") Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/165291685501.1426646.10372821863672431074.stgit@dwillia2-xfh Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-05-19 08:50:40 -07:00
Dan Williams	38a34e1076	cxl: Drop cxl_device_lock() Now that all CXL subsystem locking is validated with custom lock classes, there is no need for the custom usage of the lockdep_mutex. Cc: Alison Schofield <alison.schofield@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Ben Widawsky <ben.widawsky@intel.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/165055520383.3745911.53447786039115271.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-04-28 14:01:55 -07:00
Dan Williams	d864b8ea64	cxl/acpi: Add root device lockdep validation The CXL "root" device, ACPI0017, is an attach point for coordinating platform level CXL resources and is the parent device for a CXL port topology tree. As such it has distinct locking rules relative to other CXL subsystem objects, but because it is an ACPI device the lock class is established well before it is given to the cxl_acpi driver. However, the lockdep API does support changing the lock class "live" for situations like this. Add a device_lock_set_class() helper that a driver can use in ->probe() to set a custom lock class, and device_lock_reset_class() to return to the default "no validate" class before the custom lock class key goes out of scope after ->remove(). Note the helpers are all macros to support dead code elimination in the CONFIG_PROVE_LOCKING=n case, however device_set_lock_class() still needs #ifdef CONFIG_PROVE_LOCKING since lockdep_match_class() explicitly does not have a helper in the CONFIG_PROVE_LOCKING=n case (see comment in lockdep.h). The lockdep API needs 2 small tweaks to prevent "unused" warnings for the @key argument to lock_set_class(), and a new lock_set_novalidate_class() is added to supplement lockdep_set_novalidate_class() in the cases where the lock class is converted while the lock is held. Suggested-by: Peter Zijlstra <peterz@infradead.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Will Deacon <will@kernel.org> Cc: Waiman Long <longman@redhat.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Alison Schofield <alison.schofield@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Ben Widawsky <ben.widawsky@intel.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/165100081305.1528964.11138612430659737238.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-04-28 14:01:54 -07:00
Dan Williams	3750d01318	cxl: Replace lockdep_mutex with local lock classes In response to an attempt to expand dev->lockdep_mutex for device_lock() validation [1], Peter points out [2] that the lockdep API already has the ability to assign a dedicated lock class per subsystem device-type. Use lockdep_set_class() to override the default device_lock() '__lockdep_no_validate__' class for each CXL subsystem device-type. This enables lockdep to detect deadlocks and recursive locking within the device-driver core and the subsystem. The lockdep_set_class_and_subclass() API is used for port objects that recursively lock the 'cxl_port_key' class by hierarchical topology depth. Link: https://lore.kernel.org/r/164982968798.684294.15817853329823976469.stgit@dwillia2-desk3.amr.corp.intel.com [1] Link: https://lore.kernel.org/r/Ylf0dewci8myLvoW@hirez.programming.kicks-ass.net [2] Suggested-by: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Will Deacon <will@kernel.org> Cc: Waiman Long <longman@redhat.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Alison Schofield <alison.schofield@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Ben Widawsky <ben.widawsky@intel.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/165055519317.3745911.7342499516839702840.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-04-28 14:01:54 -07:00
Dan Carpenter	35e01667c8	cxl/mbox: fix logical vs bitwise typo This should be bitwise & instead of &&. Fixes: `6179045ccc` ("cxl/mbox: Block immediate mode in SET_PARTITION_INFO command") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/YmpgkbbQ1Yxu36uO@kili Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-04-28 10:18:31 -07:00
Alison Schofield	280302f0e8	cxl/mbox: Replace NULL check with IS_ERR() after vmemdup_user() vmemdup_user() returns an ERR_PTR() on failure. Use IS_ERR() to check the return value. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20220407010915.1211258-1-alison.schofield@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-04-22 16:12:04 -07:00
Alison Schofield	26f89535a5	cxl/mbox: Use type __u32 for mailbox payload sizes Payload sizes for mailbox commands are expected to be positive values coming from userspace. The documentation correctly describes these as always unsigned values. The mailbox and send structures that support the mailbox commands however, use __s32 types for the payloads. Replace __s32 with __u32 in the mailbox and send command structures and update usages. Kernel users of the interface already block all negative values and there is no known ability for userspace to have grown a dependency on submitting negative values to the kernel. The known user of the IOCTL, the CXL command line interface (cxl-cli) already enforces positive size values. A Smatch warning of a signedness uncovered this issue. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/r/20220414051246.1244575-1-alison.schofield@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-04-22 16:12:04 -07:00
Dan Williams	9ea4dcf498	PM: CXL: Disable suspend The CXL specification claims S3 support at a hardware level, but at a system software level there are some missing pieces. Section 9.4 (CXL 2.0) rightly claims that "CXL mem adapters may need aux power to retain memory context across S3", but there is no enumeration mechanism for the OS to determine if a given adapter has that support. Moreover the save state and resume image for the system may inadvertantly end up in a CXL device that needs to be restored before the save state is recoverable. I.e. a circular dependency that is not resolvable without a third party save-area. Arrange for the cxl_mem driver to fail S3 attempts. This still nominaly allows for suspend, but requires unbinding all CXL memory devices before the suspend to ensure the typical DRAM flow is taken. The cxl_mem unbind flow is intended to also tear down all CXL memory regions associated with a given cxl_memdev. It is reasonable to assume that any device participating in a System RAM range published in the EFI memory map is covered by aux power and save-area outside the device itself. So this restriction can be minimized in the future once pre-existing region enumeration support arrives, and perhaps a spec update to clarify if the EFI memory map is sufficent for determining the range of devices managed by platform-firmware for S3 support. Per Rafael, if the CXL configuration prevents suspend then it should fail early before tasks are frozen, and mem_sleep should stop showing 'mem' as an option [1]. Effectively CXL augments the platform suspend ->valid() op since, for example, the ACPI ops are not aware of the CXL / PCI dependencies. Given the split role of platform firmware vs OS provisioned CXL memory it is up to the cxl_mem driver to determine if the CXL configuration has elements that platform firmware may not be prepared to restore. Link: https://lore.kernel.org/r/CAJZ5v0hGVN_=3iU8OLpHY3Ak35T5+JcBM-qs8SbojKrpd0VXsA@mail.gmail.com [1] Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Len Brown <len.brown@intel.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/165066828317.3907920.5690432272182042556.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-04-22 16:09:42 -07:00
Dan Williams	35ee1f4990	cxl/mem: Replace redundant debug message with a comment cxl_mem_probe() already emits a log message when HDM operation can not be established. Delete the similar one in cxl_hdm_decode_init(). What is less obvious is why global_ctrl being enabled makes positive values of info->ranges irrelevant, and the Linux behavior with respect to the spec recommendation to mirror CXL Range registers with HDM Decoder Base + Size registers. Cc: Ben Widawsky <ben.widawsky@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Link: https://lore.kernel.org/r/164944616743.454665.7055846627973202403.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-04-12 19:11:58 -07:00
Dan Williams	31e624a77e	cxl/mem: Rename cxl_dvsec_decode_init() to cxl_hdm_decode_init() cxl_dvsec_decode_init() is tasked with checking whether legacy DVSEC range based decode is in effect, or whether HDM can be enabled / already is enabled. As such it either succeeds or fails and that result is the return value. The @do_hdm_init variable is misleading in the case where HDM operation is already found to be active, so just call it @retval. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Link: https://lore.kernel.org/r/164730736435.3806189.2537160791687837469.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-04-12 19:11:58 -07:00
Dan Williams	36bfc6ad50	cxl/pci: Make cxl_dvsec_ranges() failure not fatal to cxl_pci cxl_dvsec_ranges(), the helper for enumerating the presence of an active legacy CXL.mem configuration on a CXL 2.0 Memory Expander, is not fatal for cxl_pci because there is still value to enable mailbox operations even if CXL.mem operation is disabled. Recall that the reason cxl_pci does this initialization and not cxl_mem is to preserve the useful property (for unit testing) that cxl_mem is cxl_memdev + mmio generic, and does not require access to a 'struct pci_dev' to issue config cycles. Update 'struct cxl_endpoint_dvsec_info' to carry either a positive number of non-zero size legacy CXL DVSEC ranges, or the negative error code from __cxl_dvsec_ranges() in its @ranges member. Reported-by: Krzysztof Zach <krzysztof.zach@intel.com> Fixes: `560f785590` ("cxl/pci: Retrieve CXL DVSEC memory info") Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Link: https://lore.kernel.org/r/164730735869.3806189.4032428192652531946.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-04-12 19:11:58 -07:00
Dan Williams	fbaf2b079d	cxl/mem: Make cxl_dvsec_range() init failure fatal In preparation for the cxl_pci driver to continue operation after cxl_dvsec_range() failure, update cxl_mem to check for negative error codes in info->ranges. Treat that condition as fatal regardless of the state of the HDM configuration since cxl_mem needs positive confirmation that legacy ranges were not established by platform firmware or another agent. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com. Reviewed-by: Ben Widawsky <ben.widawsky@intel.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Link: https://lore.kernel.org/r/164730735324.3806189.4167509857771192422.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-04-12 19:11:58 -07:00
Dan Williams	e39f9be08d	cxl/pci: Add debug for DVSEC range init failures In preparation for not treating DVSEC range initialization failures as fatal to cxl_pci_probe() add individual dev_dbg() statements for each of the major failure reasons in cxl_dvsec_ranges(). The rationale for cxl_dvsec_ranges() failure not being fatal is that there is still value for cxl_pci to enable mailbox operations even if CXL.mem operation is disabled. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ben Widawsky <ben.widawsky@intel.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Link: https://lore.kernel.org/r/164730734812.3806189.2726330688692684104.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-04-12 19:11:58 -07:00
Dan Williams	e08063fb87	cxl/mem: Drop DVSEC vs EFI Memory Map sanity check When the driver finds legacy DVSEC ranges active on a CXL Memory Expander it indicates that platform firmware is not aware of, or is deliberately disabling common CXL 2.0 operation. In this case Linux generally has no choice, but to leave the device alone. The driver attempts to validate that the DVSEC range is in the EFI memory map. Remove that logic since there is no requirement that the BIOS publish DVSEC ranges in the EFI Memory Map. In the future the driver will want to permanently reserve this capacity out of the available CFMWS capacity and hide it from request_free_mem_region(), but it serves no purpose to warn about the range not appearing in the EFI Memory Map. Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ben Widawsky <ben.widawsky@intel.com> Link: https://lore.kernel.org/r/164730734246.3806189.13995924771963139898.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-04-12 19:11:57 -07:00
Davidlohr Bueso	c43e036d6f	cxl/mbox: Use new return_code handling Use the global cxl_mbox_cmd_rc table to improve debug messaging in __cxl_pci_mbox_send_cmd() and allow cxl_mbox_send_cmd() to map to proper kernel style errno codes - this patch continues to use -ENXIO only so no change in semantics. Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed by: Adam Manzanares <a.manzanares@samsung.com> Link: https://lore.kernel.org/r/20220404021216.66841-5-dave@stgolabs.net Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-04-12 16:07:02 -07:00
Davidlohr Bueso	92fcc1abab	cxl/mbox: Improve handling of mbox_cmd hw return codes Upon a completed command the caller is still expected to check the actual return_code register to ensure it succeed. This adds, per the spec, the potential command return codes. It maps the hardware return code with the kernel's errno style, and by default continues to use -ENXIO (Command completed, but device reported an error). Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed by: Adam Manzanares <a.manzanares@samsung.com> Link: https://lore.kernel.org/r/20220404021216.66841-4-dave@stgolabs.net Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-04-12 16:07:02 -07:00
Davidlohr Bueso	cbe83a2052	cxl/pci: Use CXL_MBOX_SUCCESS to check against mbox_cmd return code Also mention the need for the caller to check against any errors from the hardware in return_code. Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed by: Adam Manzanares <a.manzanares@samsung.com> Link: https://lore.kernel.org/r/20220404021216.66841-3-dave@stgolabs.net Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-04-12 16:07:01 -07:00
Davidlohr Bueso	ee92c7e261	cxl/mbox: Drop mbox_mutex comment ... we have lockdep for this. Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed by: Adam Manzanares <a.manzanares@samsung.com> Link: https://lore.kernel.org/r/20220404021216.66841-2-dave@stgolabs.net Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-04-12 16:07:01 -07:00
Alison Schofield	6aa657f416	cxl/pmem: Remove CXL SET_PARTITION_INFO from exclusive_cmds list With SET_PARTITION_INFO on the exclusive_cmds list for the CXL_PMEM driver, userspace cannot execute a set-partition command without first unbinding the pmem driver from the device. When userspace requests a partition change to take effect on the next reboot this unbind requirement is unnecessarily restrictive. The driver does not need to enforce an unbind because partitions will not change until the next reboot. Of course, userspace still needs to be aware that changing the size of persistent capacity on the next reboot will result in the loss of data stored. That can happen regardless of whether it is presently bound at the time of issuing the set-partition command. When userspace requests a partition change to take effect immediately, restrictions are needed. The CXL_MEM driver currently blocks the usage of immediate mode, making the presence of SET_PARTITION_INFO, in this exclusive commands list, redundant. In the future, when the CXL_MEM driver adds support for immediate changes to device partitions it will ensure that the partition change will not affect any active decode. That means the work will not fall right back here, onto the CXL_PMEM driver. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/r/accc6abc878f0662093b81490a1a052f2ff6f06e.1648687552.git.alison.schofield@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-04-12 16:07:01 -07:00
Alison Schofield	6179045ccc	cxl/mbox: Block immediate mode in SET_PARTITION_INFO command User space may send the SET_PARTITION_INFO mailbox command using the IOCTL interface. Inspect the input payload and fail if the immediate flag is set. This is the first instance of the driver inspecting an input payload from user space. Assume there will be more such cases and implement with an extensible helper. In order for the kernel to react to an immediate partition change it needs to assert that the change will not affect any active decode. At a minimum this requires validating that the device is using HDM decoders instead of the CXL DVSEC for decode, and that none of the active HDM decoders are affected by the partition change. For now, just fail until that support arrives. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/241821186c363833980adbc389e2c547bc5a6395.1648687552.git.alison.schofield@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-04-12 16:07:01 -07:00
Alison Schofield	2dd5600a0e	cxl/mbox: Move cxl_mem_command param to a local variable cxl_validate_command_from_user() is now the single point of validation for mailbox commands coming from user space. Previously, it returned a a cxl_mem_command, but that was not sufficient when validation of the actual mailbox command became a requirement. Now, it returns a fully validated cxl_mbox_cmd. Remove the extraneous cxl_mem_command parameter. Define and use a local version only. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/c11a437896d914daf36f5ac8ec62f999c5ec2da7.1648687552.git.alison.schofield@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-04-12 16:07:01 -07:00
Alison Schofield	d97fe8eec2	cxl/mbox: Make handle_mailbox_cmd_from_user() use a mbox param Previously, handle_mailbox_cmd_from_user(), constructed the mailbox command and dispatched it to the hardware. The construction work has moved to the validation path. handle_mailbox_cmd_from_user() now expects a fully validated mbox param. Make it's caller, cxl_send_cmd(), deliver it. Update the comments and dereferencing of the new mbox parameter. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/77050ba512d6c30eccf7505467509e460dd325a0.1648687552.git.alison.schofield@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-04-12 16:07:01 -07:00
Alison Schofield	82b8ba2953	cxl/mbox: Remove dependency on cxl_mem_command for a debug msg In preparation for removing access to struct cxl_mem_command, change this debug message to use cxl_mbox_cmd fields instead. Retrieve the pretty command name from cxl_mbox_cmd using a new opcode to command name helper. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/57265751d336a6e95f5ca31a9c77189408b05742.1648687552.git.alison.schofield@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-04-12 16:07:01 -07:00
Alison Schofield	9ae016aeb7	cxl/mbox: Construct a users cxl_mbox_cmd in the validation path This is a step in refactoring the handling of user space mailbox commands. The intent is to have all the validation work originate in cxl_validate_cmd_from_user(). Move the construction and validation of a mailbox command to the validation path. Continue to pass both the out_cmd and the mbox_cmd until handle_mbox_cmd_from_user() learns how to use a mbox_cmd param. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/c9fbdad968a2b619f9108bb6c37cef1a853cdf5a.1648687552.git.alison.schofield@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-04-12 16:07:00 -07:00
Alison Schofield	63cf60b7e0	cxl/mbox: Move build of user mailbox cmd to a helper functions In preparation for moving the construction of a mailbox command to the validation path, extract the work into a helper functions. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/493d7618a846d787c3ae28778935ca35e2b85eed.1648687552.git.alison.schofield@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-04-12 16:07:00 -07:00
Alison Schofield	39ed8da4f3	cxl/mbox: Move raw command warning to raw command validation This move serves two purposes: 1) Emit the warning in the raw command validation path, and 2) Remove the dependency on the struct cxl_mem_command in handle_mailbox_cmd_from_user() in preparation for a refactor of that function. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/df5f0e0ec8afa1f75299aa86b4226ab4479ef325.1648687552.git.alison.schofield@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-04-12 16:07:00 -07:00
Alison Schofield	6dd0e5cc87	cxl/mbox: Move cxl_mem_command construction to helper funcs Sanitizing and constructing a cxl_mem_command from a userspace command is part of the validation process prior to submitting the command to a CXL device. Move this work to helper functions: cxl_to_mem_cmd(), cxl_to_mem_cmd_raw(). This declutters cxl_validate_cmd_from_user() in preparation for adding new validation steps. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/7d9b826f29262e3a484cb4bb7b63872134d60bd7.1648687552.git.alison.schofield@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-04-12 16:07:00 -07:00
Dan Williams	d28820419c	cxl/pci: Drop shadowed variable 0day reports that wait_for_media_ready() declares an @rc variable twice. >> drivers/cxl/pci.c:439:7: warning: Local variable 'rc' shadows outer variable [shadowVariable] int rc; ^ drivers/cxl/pci.c:431:6: note: Shadowed declaration int rc, i; ^ drivers/cxl/pci.c:439:7: note: Shadow variable int rc; ^ Cc: Randy Dunlap <rdunlap@infradead.org> Fixes: `523e594d9c` ("cxl/pci: Implement wait for media active") Acked-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Link: https://lore.kernel.org/r/164944636936.455177.14136200464724208233.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-04-08 12:59:43 -07:00
Wan Jiabing	05e815539f	cxl/core/port: Fix NULL but dereferenced coccicheck error Fix the following coccicheck warning: ./drivers/cxl/core/port.c:913:21-24: ERROR: port is NULL but dereferenced. The put_device() is only relevent in the is_cxl_root() case. Fixes: `2703c16c75` ("cxl/core/port: Add switch port enumeration") Signed-off-by: Wan Jiabing <wanjiabing@vivo.com> Link: https://lore.kernel.org/r/20220307094158.404882-1-wanjiabing@vivo.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-03-22 10:51:17 -07:00
Dan Williams	74be98774d	cxl/port: Hold port reference until decoder release KASAN + DEBUG_KOBJECT_RELEASE reports a potential use-after-free in cxl_decoder_release() where it goes to reference its parent, a cxl_port, to free its id back to port->decoder_ida. BUG: KASAN: use-after-free in to_cxl_port+0x18/0x90 [cxl_core] Read of size 8 at addr ffff888119270908 by task kworker/35:2/379 CPU: 35 PID: 379 Comm: kworker/35:2 Tainted: G OE 5.17.0-rc2+ #198 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 Workqueue: events kobject_delayed_cleanup Call Trace: <TASK> dump_stack_lvl+0x59/0x73 print_address_description.constprop.0+0x1f/0x150 ? to_cxl_port+0x18/0x90 [cxl_core] kasan_report.cold+0x83/0xdf ? to_cxl_port+0x18/0x90 [cxl_core] to_cxl_port+0x18/0x90 [cxl_core] cxl_decoder_release+0x2a/0x60 [cxl_core] device_release+0x5f/0x100 kobject_cleanup+0x80/0x1c0 The device core only guarantees parent lifetime until all children are unregistered. If a child needs a parent to complete its ->release() callback that child needs to hold a reference to extend the lifetime of the parent. Fixes: `40ba17afdf` ("cxl/acpi: Introduce cxl_decoder objects") Reported-by: Ben Widawsky <ben.widawsky@intel.com> Tested-by: Ben Widawsky <ben.widawsky@intel.com> Reviewed-by: Ben Widawsky <ben.widawsky@intel.com> Link: https://lore.kernel.org/r/164505751190.4175768.13324905271463416712.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-02-17 16:51:13 -08:00
Dan Williams	41ae9105f5	cxl/port: Fix endpoint refcount leak An endpoint can be unregistered via two paths. Either its parent port is unregistered, or the memdev that registered the endpoint is removed. The memdev remove path is responsible for synchronizing against the parent ->remove() event and if the memdev remove path wins, manually trigger unregister_port() via devm_release_action(). Until that race is resolved the memdev remove path holds a reference on the endpoint. If the parent port for the endpoint can not be found that is an indication that the endpoint has already been registered. Be sure to drop the reference in all exit paths from delete_endpoint(). Fixes: `8dd2bc0f8e` ("cxl/mem: Add the cxl_mem driver") Link: https://lore.kernel.org/r/164454148209.3429624.12905500880311609053.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-02-17 16:50:30 -08:00
Dan Williams	e6e17cc6ed	cxl/core: Fix cxl_device_lock() class detection If cxl_device_lock() is used on a non-CXL device the expectation is that the lock class will fall back to CXL_ANON_LOCK. Instead it crashes when trying to determine if the device is a 'decoder'. Specifically when the device has a NULL type pointer. Just check for NULL before de-referencing ->release. Fixes: `3c5b903955` ("cxl: Prove CXL locking") Reported-by: Ben Widawsky <ben.widawsky@intel.com> Reviewed-by: Ben Widawsky <ben.widawsky@intel.com> Link: https://lore.kernel.org/r/164439225406.2941117.3927102269866914339.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-02-11 13:27:18 -08:00
Dan Williams	5c3c067b60	cxl/core/port: Fix unregister_port() lock assertion The device_lock_assert() in unregister_port() fails to pick the right device leading to splats like the following from: echo "ACPI0017:00" > /sys/bus/platform/drivers/cxl_acpi/unbind WARNING: CPU: 32 PID: 1147 at include/linux/device.h:787 unregister_port+0x49/0x50 [cxl_c [..] RIP: 0010:unregister_port+0x49/0x50 [cxl_core] [..] Call Trace: <TASK> release_nodes+0x63/0x80 devres_release_all+0x8b/0xc0 __device_release_driver+0x190/0x240 device_driver_detach+0x3e/0xa0 unbind_store+0x113/0x130 Fix it up to assert on the device_lock() for ACPI0017 for root and 1st level ports, and parent ports for all the rest. Fixes: `54cdbf845c` ("cxl/port: Add a driver for 'struct cxl_port' objects") Reported-by: Ben Widawsky <ben.widawsky@intel.com> Reviewed-by: Ben Widawsky <ben.widawsky@intel.com> Link: https://lore.kernel.org/r/164439224893.2941117.18331456248117887720.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-02-11 13:26:49 -08:00
Jonathan Cameron	74b0fe8040	cxl/regs: Fix size of CXL Capability Header Register In CXL 2.0, 8.2.5.1 CXL Capability Header Register: this register is given as 32 bits. 8.2.3 which covers the CXL 2.0 Component registers, including the CXL Capability Header Register states that access restrictions specified in Section 8.2.2 apply. 8.2.2 includes: * A 32 bit register shall be accessed as a 4 Byte quantity. ... If these rules are not followed, the behavior is undefined. Discovered during review of CXL QEMU emulation. Alex Bennée pointed out there was a comment saying that 4 byte registers must be read with a 4 byte read, but 8 byte reads were being emulated. https://lore.kernel.org/qemu-devel/87bkzyd3c7.fsf@linaro.org/ Fixing that, led to this code failing. Whilst a given hardware implementation 'might' work with an 8 byte read, it should not be relied upon. The QEMU emulation v5 will return 0 and log the wrong access width. The code moved, so one fixes tag for where this will directly apply and also a reference to the earlier introduction of the code for backports. Fixes: `0f06157e01` ("cxl/core: Move register mapping infrastructure") Fixes: `08422378c4` ("cxl/pci: Add HDM decoder capabilities") Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Ben Widawsky <ben.widawsky@intel.com> Link: https://lore.kernel.org/r/20220201153437.2873-1-Jonathan.Cameron@huawei.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-02-08 23:15:33 -08:00
Dan Williams	7004cc9d15	cxl/core/port: Handle invalid decoders In case init_hdm_decoder() finds invalid settings, skip to the next valid decoder. Only fail port enumeration if zero valid decoders are found. This protects the driver init against broken hardware and / or future interleave capabilities. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/164317464918.3438644.12371149695618136198.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-02-08 23:15:10 -08:00
Dan Williams	0909b4e528	cxl/core/port: Fix / relax decoder target enumeration If the decoder is not presently active the target_list may not be accurate. Perform a best effort mapping and assume that it will be fixed up when the decoder is enabled. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/164317464406.3438644.6609329492458460242.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-02-08 23:15:10 -08:00
Ben Widawsky	9b71e1c9c3	cxl/core/port: Add endpoint decoders Recall that a CXL Port is any object that publishes a CXL HDM Decoder Capability structure. That is Host Bridge and Switches that have been enabled so far. Now, add decoder support to the 'endpoint' CXL Ports registered by the cxl_mem driver. They mostly share the same enumeration as Bridges and Switches, but witout a target list. The target of endpoint decode is device-internal DPA space, not another downstream port. Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> [djbw: clarify changelog, hookup enumeration in the port driver] Link: https://lore.kernel.org/r/164386092069.765089.14895687988217608642.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-02-08 22:57:32 -08:00
Dan Williams	8aea0ef19f	cxl/core: Move target_list out of base decoder attributes In preparation for introducing endpoint decoder objects, move the target_list attribute out of the common set since it has no meaning for endpoint decoders. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ben Widawsky <ben.widawsky@intel.com> Link: https://lore.kernel.org/r/164298430100.3018233.4715072508880290970.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-02-08 22:57:32 -08:00
Ben Widawsky	8dd2bc0f8e	cxl/mem: Add the cxl_mem driver At this point the subsystem can enumerate all CXL ports (CXL.mem decode resources in upstream switch ports and host bridges) in a system. The last mile is connecting those ports to endpoints. The cxl_mem driver connects an endpoint device to the platform CXL.mem protoctol decode-topology. At ->probe() time it walks its device-topology-ancestry and adds a CXL Port object at every Upstream Port hop until it gets to CXL root. The CXL root object is only present after a platform firmware driver registers platform CXL resources. For ACPI based platform this is managed by the ACPI0017 device and the cxl_acpi driver. The ports are registered such that disabling a given port automatically unregisters all descendant ports, and the chain can only be registered after the root is established. Given ACPI device scanning may run asynchronously compared to PCI device scanning the root driver is tasked with rescanning the bus after the root successfully probes. Conversely if any ports in a chain between the root and an endpoint becomes disconnected it subsequently triggers the endpoint to unregister. Given lock depenedencies the endpoint unregistration happens in a workqueue asynchronously. If userspace cares about synchronizing delayed work after port events the /sys/bus/cxl/flush attribute is available for that purpose. Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> [djbw: clarify changelog, rework hotplug support] Link: https://lore.kernel.org/r/164398782997.903003.9725273241627693186.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-02-08 22:57:32 -08:00
Dan Williams	2703c16c75	cxl/core/port: Add switch port enumeration So far the platorm level CXL resources have been enumerated by the cxl_acpi driver, and cxl_pci has gathered all the pre-requisite information it needs to fire up a cxl_mem driver. However, the first thing the cxl_mem driver will be tasked to do is validate that all the PCIe Switches in its ancestry also have CXL capabilities and an CXL.mem link established. Provide a common mechanism for a CXL.mem endpoint driver to enumerate all the ancestor CXL ports in the topology and validate CXL.mem connectivity. Multiple endpoints may end up racing to establish a shared port in the topology. This race is resolved via taking the device-lock on a parent CXL Port before establishing a new child. The winner of the race establishes the port, the loser simply registers its interest in the port via 'struct cxl_ep' place-holder reference. At endpoint teardown the same parent port lock is taken as 'struct cxl_ep' references are deleted. Last endpoint to drop its reference unregisters the port. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/164398731146.902644.1029761300481366248.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-02-08 22:57:32 -08:00
Dan Williams	cf1f6877b0	cxl/memdev: Add numa_node attribute While CXL memory targets will have their own memory target node, individual memory devices may be affinitized like other PCI devices. Emit that attribute for memdevs. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/164298428430.3018233.16409089892707993289.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-02-08 22:57:32 -08:00
Dan Williams	bcc79ea343	cxl/pci: Emit device serial number Per the CXL specification (8.1.12.2 Memory Device PCIe Capabilities and Extended Capabilities) the Device Serial Number capability is mandatory. Emit it for user tooling to identify devices. It is reasonable to ask whether the attribute should be added to the list of PCI sysfs device attributes. The PCI layer can optionally emit it too, but the CXL subsystem is aiming to preserve its independence and the possibility of CXL topologies with non-PCI devices in it. To date that has only proven useful for the 'cxl_test' model, but as can be seen with seen with ACPI0016 devices, sometimes all that is needed is a platform firmware table to point to CXL Component Registers in MMIO space to define a "CXL" device. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/164366608838.196598.16856227191534267098.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-02-08 22:57:32 -08:00
Ben Widawsky	523e594d9c	cxl/pci: Implement wait for media active CXL 2.0 8.1.3.8.2 states: Memory_Active: When set, indicates that the CXL Range 1 memory is fully initialized and available for software use. Must be set within Range 1. Memory_Active_Timeout of deassertion of reset to CXL device if CXL.mem HwInit Mode=1 Unfortunately, Memory_Active can take quite a long time depending on media size (up to 256s per 2.0 spec). Provide a callback for the eventual establishment of CXL.mem operations via the 'cxl_mem' driver the 'struct cxl_memdev'. The implementation waits for 60s by default for now and can be overridden by the mbox_ready_time module parameter. Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> [djbw: switch to sleeping wait] Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/164298427373.3018233.9309741847039301834.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-02-08 22:57:31 -08:00
Ben Widawsky	560f785590	cxl/pci: Retrieve CXL DVSEC memory info Before CXL 2.0 HDM Decoder Capability mechanisms can be utilized in a device the driver must determine that the device is ready for CXL.mem operation and that platform firmware, or some other agent, has established an active decode via the legacy CXL 1.1 decoder mechanism. This legacy mechanism is defined in the CXL DVSEC as a set of range registers and status bits that take time to settle after a reset. Validate the CXL memory decode setup via the DVSEC and cache it for later consideration by the cxl_mem driver (to be added). Failure to validate is not fatal to the cxl_pci driver since that is only providing CXL command support over PCI.mmio, and might be needed to rectify CXL DVSEC validation problems. Any potential ranges that the device is already claiming via DVSEC need to be reconciled with the dynamic provisioning ranges provided by platform firmware (like ACPI CEDT.CFMWS). Leave that reconciliation to the cxl_mem driver. [djbw: shorten defines] [djbw: change precise spin wait to generous msleep] Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> [djbw: clarify changelog] Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/164375911821.559935.7375160041663453400.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-02-08 22:57:31 -08:00
Ben Widawsky	06e279e5eb	cxl/pci: Cache device DVSEC offset The PCIe device DVSEC, defined in the CXL 2.0 spec, 8.1.3 is required to be implemented by CXL 2.0 endpoint devices. In preparation for consuming this information in a new cxl_mem driver, retrieve the CXL DVSEC position and warn about the implications of not finding it. Allow for mailbox operation even if the CXL DVSEC is missing. Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/164375309615.513620.7874131241128599893.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-02-08 22:57:31 -08:00
Ben Widawsky	4112a08dd3	cxl/pci: Store component register base in cxlds In preparation for defining a cxl_port object to represent the decoder resources of a memory expander capture the component register base address. The port driver uses the component register base to enumerate the HDM Decoder Capability structure. Unlike other cxl_port objects the endpoint port decodes from upstream SPA to downstream DPA rather than upstream port to downstream port. Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> [djbw: clarify changelog] Link: https://lore.kernel.org/r/164375084181.484304.3919737667590006795.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-02-08 22:57:30 -08:00
Dan Williams	664bf11583	cxl/core/port: Remove @host argument for dport + decoder enumeration Now that dport and decoder enumeration is centralized in the port driver, the @host argument for these helpers can be made implicit. For the root port the host is the port's uport device (ACPI0017 for cxl_acpi), and for all other descendant ports the devm context is the parent of @port. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ben Widawsky <ben.widawsky@intel.com> Link: https://lore.kernel.org/r/164375043390.484143.17617734732003230076.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-02-08 22:57:30 -08:00
Ben Widawsky	54cdbf845c	cxl/port: Add a driver for 'struct cxl_port' objects The need for a CXL port driver and a dedicated cxl_bus_type is driven by a need to simultaneously support 2 independent physical memory decode domains (cache coherent CXL.mem and uncached PCI.mmio) that also intersect at a single PCIe device node. A CXL Port is a device that advertises a CXL Component Register block with an "HDM Decoder Capability Structure". >From Documentation/driver-api/cxl/memory-devices.rst: Similar to how a RAID driver takes disk objects and assembles them into a new logical device, the CXL subsystem is tasked to take PCIe and ACPI objects and assemble them into a CXL.mem decode topology. The need for runtime configuration of the CXL.mem topology is also similar to RAID in that different environments with the same hardware configuration may decide to assemble the topology in contrasting ways. One may choose performance (RAID0) striping memory across multiple Host Bridges and endpoints while another may opt for fault tolerance and disable any striping in the CXL.mem topology. The port driver identifies whether an endpoint Memory Expander is connected to a CXL topology. If an active (bound to the 'cxl_port' driver) CXL Port is not found at every PCIe Switch Upstream port and an active "root" CXL Port then the device is just a plain PCIe endpoint only capable of participating in PCI.mmio and DMA cycles, not CXL.mem coherent interleave sets. The 'cxl_port' driver lets the CXL subsystem leverage driver-core infrastructure for setup and teardown of register resources and communicating device activation status to userspace. The cxl_bus_type can rendezvous the async arrival of platform level CXL resources (via the 'cxl_acpi' driver) with the asynchronous enumeration of Memory Expander endpoints, while also implementing a hierarchical locking model independent of the associated 'struct pci_dev' locking model. The locking for dport and decoder enumeration is now handled in the core rather than callers. For now the port driver only enumerates and registers CXL resources (downstream port metadata and decoder resources) later it will be used to take action on its decoders in response to CXL.mem region provisioning requests. Note1: cxlpci.h has long depended on pci.h, but port.c was the first to not include pci.h. Carry that dependency in cxlpci.h. Note2: cxl port enumeration and probing complicates CXL subsystem init to the point that it helps to have centralized debug logging of probe events in cxl_bus_probe(). Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Co-developed-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/164374948116.464348.1772618057599155408.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-02-08 22:57:30 -08:00
Dan Williams	83fbdbe4c1	cxl/core: Emit modalias for CXL devices In order to enable libkmod lookups for CXL device objects to their corresponding module, add 'modalias' to the base attribute of CXL devices. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ben Widawsky <ben.widawsky@intel.com> Link: https://lore.kernel.org/r/164298424120.3018233.15611905873808708542.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-02-08 22:57:30 -08:00
Dan Williams	d17d0540a0	cxl/core/hdm: Add CXL standard decoder enumeration to the core Unlike the decoder enumeration for "root decoders" described by platform firmware, standard decoders can be enumerated from the component registers space once the base address has been identified (via PCI, ACPI, or another mechanism). Add common infrastructure for HDM (Host-managed-Device-Memory) Decoder enumeration and share it between host-bridge, upstream switch port, and cxl_test defined decoders. The locking model for switch level decoders is to hold the port lock over the enumeration. This facilitates moving the dport and decoder enumeration to a 'port' driver. For now, the only enumerator of decoder resources is the cxl_acpi root driver. Co-developed-by: Ben Widawsky <ben.widawsky@intel.com> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/164374688404.395335.9239248252443123526.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-02-08 22:57:30 -08:00
Dan Williams	98d2d3a264	cxl/core: Generalize dport enumeration in the core The core houses infrastructure for decoder resources. A CXL port's dports are more closely related to decoder infrastructure than topology enumeration. Implement generic PCI based dport enumeration in the core, i.e. arrange for existing root port enumeration from cxl_acpi to share code with switch port enumeration which just amounts to a small difference in a pci_walk_bus() invocation once the appropriate 'struct pci_bus' has been retrieved. Set the convention that decoder objects are registered after all dports are enumerated. This enables userspace to know when the CXL core is finished establishing 'dportX' links underneath the 'portX' object. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/164368114191.354031.5270501846455462665.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-02-08 22:57:30 -08:00
Dan Williams	af9cae9fac	cxl/pci: Rename pci.h to cxlpci.h Similar to the mem.h rename, if the core wants to reuse definitions from drivers/cxl/pci.h it is unable to use <pci.h> as that collides with archs that have an arch/$arch/include/asm/pci.h, like MIPS. Reported-by: kernel test robot <lkp@intel.com> Acked-by: Ben Widawsky <ben.widawsky@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/164298422510.3018233.14693126572756675563.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-02-08 22:57:30 -08:00
Dan Williams	c978f1b10a	cxl/port: Up-level cxl_add_dport() locking requirements to the caller In preparation for moving dport enumeration into the core, require the port device lock to be acquired by the caller. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/164367759016.324231.105551648350470000.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-02-08 22:57:29 -08:00
Dan Williams	a46cfc0f01	cxl/pmem: Introduce a find_cxl_root() helper In preparation for switch port enumeration while also preserving the potential for multi-domain / multi-root CXL topologies. Introduce a 'struct device' generic mechanism for retrieving a root CXL port, if one is registered. Note that the only known multi-domain CXL configurations are running the cxl_test unit test on a system that also publishes an ACPI0017 device. With this in hand the nvdimm-bridge lookup can be with device_find_child() instead of bus_find_device() + custom mocked lookup infrastructure in cxl_test. The mechanism looks for a 2nd level port since the root level topology is platform-firmware specific and the 2nd level down follows standard PCIe topology expectations. The cxl_acpi 2nd level is associated with a PCIe Root Port. Reported-by: Ben Widawsky <ben.widawsky@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/164367562182.225521.9488555616768096049.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-02-08 22:57:29 -08:00
Dan Williams	5ff7316f6f	cxl/port: Introduce cxl_port_to_pci_bus() Add a helper for converting a PCI enumerated cxl_port into the pci_bus that hosts its dports. For switch ports this is trivial, but for root ports there is no generic way to go from a platform defined host bridge device, like ACPI0016 to its corresponding pci_bus. Rather than spill ACPI goop outside of the cxl_acpi driver, just arrange for it to register an xarray translation from the uport device to the corresponding pci_bus. This is in preparation for centralizing dport enumeration in the core. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ben Widawsky <ben.widawsky@intel.com> Link: https://lore.kernel.org/r/164364745633.85488.9744017377155103992.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-02-08 22:57:29 -08:00
Dan Williams	86c8ea0f3b	cxl/core/port: Use dedicated lock for decoder target list Lockdep reports: ====================================================== WARNING: possible circular locking dependency detected 5.16.0-rc1+ #142 Tainted: G OE ------------------------------------------------------ cxl/1220 is trying to acquire lock: ffff979b85475460 (kn->active#144){++++}-{0:0}, at: __kernfs_remove+0x1ab/0x1e0 but task is already holding lock: ffff979b87ab38e8 (&dev->lockdep_mutex#2/4){+.+.}-{3:3}, at: cxl_remove_ep+0x50c/0x5c0 [cxl_core] ...where cxl_remove_ep() is a helper that wants to delete ports while holding a lock on the host device for that port. That sets up a lockdep violation whereby target_list_show() can not rely holding the decoder's device lock while walking the target_list. Switch to a dedicated seqlock for this purpose. Reported-by: Ben Widawsky <ben.widawsky@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/164367209095.208169.1171673319121271280.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-02-08 22:57:29 -08:00
Dan Williams	3c5b903955	cxl: Prove CXL locking When CONFIG_PROVE_LOCKING is enabled the 'struct device' definition gets an additional mutex that is not clobbered by lockdep_set_novalidate_class() like the typical device_lock(). This allows for local annotation of subsystem locks with mutex_lock_nested() per the subsystem's object/lock hierarchy. For CXL, this primarily needs the ability to lock ports by depth and child objects of ports by their parent parent-port lock. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ben Widawsky <ben.widawsky@intel.com> Link: https://lore.kernel.org/r/164365853422.99383.1052399160445197427.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-02-08 22:57:29 -08:00
Ben Widawsky	53fa1bff34	cxl/core: Track port depth In preparation for proving CXL subsystem usage of the device_lock() order track the depth of ports with the expectation that shallower port locks can be held over deeper port locks. Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/164298419321.3018233.4469731547378993606.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2022-02-08 22:57:29 -08:00

... 3 4 5 6 7 ...

565 Commits