linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-12-27 04:54:41 +08:00

Author	SHA1	Message	Date
Maor Gottlieb	18731642d4	RDMA/mlx5: Expose UAPI to query DM Expose UAPI to query MEMIC DM, this will let user space application that didn't allocate the DM but has access to by owning the matching command FD to retrieve its information. Link: https://lore.kernel.org/r/20210411122924.60230-8-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-04-13 19:36:37 -03:00
Maor Gottlieb	cea85fa5db	RDMA/mlx5: Add support in MEMIC operations MEMIC buffer, in addition to regular read and write operations, can support atomic operations from the host. Introduce and implement new UAPI to allocate address space for MEMIC operations such as atomic. This includes: 1. Expose new IOCTL for request mapping of MEMIC operation. 2. Hold the operations address in a list, so same operation to same DM will be allocated only once. 3. Manage refcount on the mlx5_ib_dm object, so it would be keep valid until all addresses were unmapped. Link: https://lore.kernel.org/r/20210411122924.60230-7-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-04-13 19:36:36 -03:00
Maor Gottlieb	39cc792ff2	RDMA/mlx5: Add support to MODIFY_MEMIC command Add two functions to allocate and deallocate MEMIC operations by using the MODIFY_MEMIC command. Link: https://lore.kernel.org/r/20210411122924.60230-6-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-04-13 19:36:36 -03:00
Maor Gottlieb	251b9d7887	RDMA/mlx5: Re-organize the DM code 1. Inline the checks from check_dm_type_support() into their respective allocation functions. 2. Fix use after free when driver fails to copy the MEMIC address to the user by moving the allocation code into their respective functions, hence we avoid the explicit call to free the DM in the error flow. 3. Split mlx5_ib_dm struct to memic and icm proper typesafety throughout. Fixes: `dc2316eba7` ("IB/mlx5: Fix device memory flows") Link: https://lore.kernel.org/r/20210411122924.60230-5-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-04-13 19:36:36 -03:00
Maor Gottlieb	831df88381	RDMA/mlx5: Move all DM logic to separate file Move all device memory related code to a separate file. Link: https://lore.kernel.org/r/20210411122924.60230-4-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-04-13 19:36:36 -03:00
Mark Bloch	3a46f4fb55	net/mlx5: E-Switch, Refactor send to vport to be more generic Now that each representor stores a pointer to the managing E-Switch use that information when creating the send-to-vport rules. Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-03-12 13:07:46 -08:00
Mark Bloch	658cfceb62	RDMA/mlx5: Use representor E-Switch when getting netdev and metadata Now that a pointer to the managing E-Switch is stored in the representor use it. Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Acked-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-03-12 13:07:23 -08:00
Leon Romanovsky	f91803998c	RDMA/mlx5: Set correct kernel-doc identifier The W=1 allmodconfig build produces the following warning: drivers/infiniband/hw/mlx5/odp.c:1086: warning: wrong kernel-doc identifier on line: * Parse a series of data segments for page fault handling. Fix it by changing /** to be /* as it is written in kernel-doc documentation. Fixes: `5e769e444d` ("RDMA/hw/mlx5/odp: Fix formatting and add missing descriptions in 'pagefault_data_segments()'") Link: https://lore.kernel.org/r/20210302074214.1054299-2-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-03-03 13:21:24 -04:00
YueHaibing	3a9b3d4536	IB/mlx5: Add missing error code Set err to -ENOMEM if kzalloc fails instead of 0. Fixes: `7597385371` ("IB/mlx5: Enable subscription for device events over DEVX") Link: https://lore.kernel.org/r/20210222122343.19720-1-yuehaibing@huawei.com Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-03-01 14:49:09 -04:00
Jason Gunthorpe	7289e26f39	Linux 5.11 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmAppPgeHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGeXYH/imZPBd4A1jIMehN 5HV2A53Z+MXmmaMuGj9X1KV6vsf55/xB+IhOoFdtRAIsO8c2yYSCO8i4+4R0XfYA +/YFJeq672rojQnmh6XbpR8dugaAV7CUHy6n7KDsyvtT6EOCpwFSwkOb4X3tBRX6 TlYgm2d/xgV/wRHSgLVugK0MdFCLMAnyb7mkPfar9QrMgG1BiDKLq07xmwnS23On TkqpJ9yZ/rJpUrrUqQYPShSO/FmA+fSfWs0CDv7EIrJ40LUScD6PZxSHWTIHtjLk E4jFda6wuqLRVWsBwaBzUIdD0zk7X5quHRzEpbC5ga16SK6yrWvE5YJJXCguIEuZ f3FMRYs= =CAjn -----END PGP SIGNATURE----- Merge tag 'v5.11' into rdma.git for-next Linux 5.11 Merged to resolve conflicts with RDMA rc commits - drivers/infiniband/sw/rxe/rxe_net.c The final logic is to call rxe_get_dev_from_net() again with the master netdev if the packet was rx'd on a vlan. To keep the elimination of the local variables requires a trivial edit to the code in -rc Link: https://lore.kernel.org/r/20210210131542.215ea67c@canb.auug.org.au Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-02-18 11:19:29 -04:00
Jason Gunthorpe	68ad4d1cc6	Merge branch 'mlx5_timestamp' into rdma.git for-next Leon Romanovsky says: ==================== Add an extra timestamp format for mlx5_ib device. ==================== Based on the mlx5-next branch at git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux due to dependencies. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> * branch 'mlx5_timestamp': RDMA/mlx5: Fail QP creation if the device can not support the CQE TS net/mlx5: Add new timestamp mode bits	2021-02-16 14:49:36 -04:00
Aharon Landau	2fe8d4b878	RDMA/mlx5: Fail QP creation if the device can not support the CQE TS In ConnectX6Dx device, HW can work in real time timestamp mode according to the device capabilities per RQ/SQ/QP. When the flag IB_UVERBS_CQ_FLAGS_TIMESTAMP_COMPLETION is set, the user expect to get TS on the CQEs in free running format, so we need to fail the QP creation if the current mode of the device doesn't support it. Link: https://lore.kernel.org/r/20210209131107.698833-3-leon@kernel.org Signed-off-by: Aharon Landau <aharonl@nvidia.com> Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-02-16 14:48:47 -04:00
Tal Gilboa	7232c132d1	RDMA/mlx5: Allow CQ creation without attached EQs The traditional DevX CQ creation flow goes through mlx5_core_create_cq() which checks that the given EQN corresponds to an existing EQ and attaches a devx handler to the EQN for the CQ. In some cases the EQ will not be a kernel EQ, but will be controlled by modify CQ, don't block creating these just because the EQN can't be found in the kernel. Link: https://lore.kernel.org/r/20210211085549.1277674-1-leon@kernel.org Signed-off-by: Tal Gilboa <talgi@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-02-16 14:42:59 -04:00
Patrisious Haddad	c70f51de85	RDMA/mlx5: Support 400Gbps IB rate in mlx5 driver Support 400Gbps IB rate in mlx5 driver. Link: https://lore.kernel.org/r/20210209130429.698237-1-leon@kernel.org Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Mark Zhang <markzhang@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-02-09 14:00:35 -04:00
Yishai Hadas	db72438c93	RDMA/mlx5: Cleanup the synchronize_srcu() from the ODP flow Cleanup the synchronize_srcu() from the ODP flow as it was found to be a very heavy time consumer as part of dereg_mr. For example de-registration of 10000 ODP MRs each with size of 2M hugepage took 19.6 sec comparing de-registration of same number of non ODP MRs that took 172 ms. The new locking scheme uses the wait_event() mechanism which follows the use count of the MR instead of using synchronize_srcu(). By that change, the time required for the above test took 95 ms which is even better than the non ODP flow. Once fully dropped the srcu usage, had to come with a lock to protect the XA access. As part of using the above mechanism we could also clean the num_deferred_work stuff and follow the use count instead. Link: https://lore.kernel.org/r/20210202071309.2057998-1-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-02-08 20:31:11 -04:00
Parav Pandit	131796524f	IB/mlx5: Use rdma_for_each_port for port iteration Instead of open coding the loop for port iteration, use rdma_for_each_port macro provided by core. To use such macro, early initialization of phys_port_cnt is needed. Hence, initialize such constant early enough in the init stage. Whichever functions are called with port using rdma_for_each_port(), convert their port type from u8 to unsigned int to match the core API. Link: https://lore.kernel.org/r/20210203130133.4057329-6-leon@kernel.org Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-02-05 12:06:01 -04:00
Parav Pandit	7416790e22	RDMA/core: Introduce and use API to read port immutable data Currently mlx5 driver caches port GID table length for 2 ports. It is also cached by IB core as port immutable data. When mlx5 representor ports are present, which are usually more than 2, invalid access to port_caps array can happen while validating the GID table length which is only for 2 ports. To avoid this, take help of the IB cores port immutable data by exposing an API to read the port immutable fields. Remove mlx5 driver's internal cache, thereby reduce code and data. Link: https://lore.kernel.org/r/20210203130133.4057329-5-leon@kernel.org Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-02-05 12:06:01 -04:00
Parav Pandit	7a58779edd	IB/mlx5: Improve query port for representor port Improve query port functionality for representor port as below. 1. RoCE Qkey violation counters are not applicable for representor port. 2. Avoid setting gid_tbl_len twice for representor port. 3. Avoid setting ip_gids and IB_PORT_CM_SUP property for representor port as GID table is empty and CM support is not present in representor mode. Link: https://lore.kernel.org/r/20210203130133.4057329-4-leon@kernel.org Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-02-05 12:06:00 -04:00
Parav Pandit	2019d70e91	IB/mlx5: Avoid calling query device for reading pkey table length Pkey table length for all the ports of the device is the same. Currently get_ports_cap() reads and stores it for each port by querying the device which reads more than just pkey table length. For representor device ports which can be in range of hundreds, it queries is for each such port and end up returning same value for all the ports. When in representor mode, modify QP accesses pkey port caps for a port index that can be outside of the port_caps table. Hence, simplify the logic to query the max pkey table length only once during device initialization sequence. Link: https://lore.kernel.org/r/20210203130133.4057329-3-leon@kernel.org Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-02-05 12:06:00 -04:00
Parav Pandit	3ce60f443b	IB/mlx5: Move mlx5_port_caps from mlx5_core_dev to mlx5_ib_dev mlx5_port_caps are RDMA specific capabilities. These are not used by the mlx5_core_device at all. Move them to mlx5_ib_dev where it is used and reduce the scope of it to multiple drivers. Link: https://lore.kernel.org/r/20210203130133.4057329-2-leon@kernel.org Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-02-05 12:06:00 -04:00
Parav Pandit	d6fd59e14e	IB/mlx5: Support default partition key for representor port Representor port has only one default pkey. Hence have simpler query pkey callback or it. Link: https://lore.kernel.org/r/20210127150010.1876121-4-leon@kernel.org Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-02-02 19:31:44 -04:00
Parav Pandit	d286ac1d05	IB/mlx5: Return appropriate error code instead of ENOMEM When mlx5_ib_stage_init_init() fails, return the error code related to failure instead of -ENOMEM. Fixes: `16c1975f10` ("IB/mlx5: Create profile infrastructure to add and remove stages") Link: https://lore.kernel.org/r/20210127150010.1876121-8-leon@kernel.org Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-02-02 13:21:18 -04:00
Mark Bloch	2614488d1f	RDMA/mlx5: Allow creating all QPs even when non RDMA profile is used The cited commit disallowed creating any QP which isn't raw ethernet, reg umr or the special UD qp for testing WC, this proved too strict. While modify can't be done (no GIDS/GID table for example) just creating a QP is okay. This patch partially reverts the bellow mentioned commit and places the restriction at the modify QP stage and not at the creation. DEVX commands should be used to manipulate such QPs. Fixes: `42caf9cb59` ("RDMA/mlx5: Allow only raw Ethernet QPs when RoCE isn't enabled") Link: https://lore.kernel.org/r/20210125120709.836718-1-leon@kernel.org Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-01-28 15:19:43 -04:00
Lee Jones	30cd9fc5e7	RDMA/hw/mlx5/qp: Demote non-conformant kernel-doc header Fixes the following W=1 kernel build warning(s): drivers/infiniband/hw/mlx5/qp.c:5384: warning: Function parameter or member 'qp' not described in 'mlx5_ib_qp_set_counter' drivers/infiniband/hw/mlx5/qp.c:5384: warning: Function parameter or member 'counter' not described in 'mlx5_ib_qp_set_counter' Link: https://lore.kernel.org/r/20210121094519.2044049-3-lee.jones@linaro.org Cc: Leon Romanovsky <leon@kernel.org> Cc: Doug Ledford <dledford@redhat.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: linux-rdma@vger.kernel.org Signed-off-by: Lee Jones <lee.jones@linaro.org> Acked-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-01-22 14:37:29 -04:00
Lee Jones	5e769e444d	RDMA/hw/mlx5/odp: Fix formatting and add missing descriptions in 'pagefault_data_segments()' Fixes the following W=1 kernel build warning(s): drivers/infiniband/hw/mlx5/odp.c:1062: warning: Function parameter or member 'dev' not described in 'pagefault_data_segments' drivers/infiniband/hw/mlx5/odp.c:1062: warning: Function parameter or member 'pfault' not described in 'pagefault_data_segments' drivers/infiniband/hw/mlx5/odp.c:1062: warning: Function parameter or member 'wqe' not described in 'pagefault_data_segments' drivers/infiniband/hw/mlx5/odp.c:1062: warning: Function parameter or member 'wqe_end' not described in 'pagefault_data_segments' drivers/infiniband/hw/mlx5/odp.c:1062: warning: Function parameter or member 'bytes_mapped' not described in 'pagefault_data_segments' drivers/infiniband/hw/mlx5/odp.c:1062: warning: Function parameter or member 'total_wqe_bytes' not described in 'pagefault_data_segments' drivers/infiniband/hw/mlx5/odp.c:1062: warning: Function parameter or member 'receive_queue' not described in 'pagefault_data_segments' Link: https://lore.kernel.org/r/20210121094519.2044049-2-lee.jones@linaro.org Cc: Leon Romanovsky <leon@kernel.org> Cc: Doug Ledford <dledford@redhat.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: linux-rdma@vger.kernel.org Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-01-22 14:37:29 -04:00
Jianxin Xiong	90da7dc820	RDMA/mlx5: Support dma-buf based userspace memory region Implement the new driver method 'reg_user_mr_dmabuf'. Utilize the core functions to import dma-buf based memory region and update the mappings. Add code to handle dma-buf related page fault. Link: https://lore.kernel.org/r/1608067636-98073-5-git-send-email-jianxin.xiong@intel.com Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Acked-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Acked-by: Christian Koenig <christian.koenig@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-01-20 16:49:25 -04:00
Parav Pandit	de641d74fb	Revert "RDMA/mlx5: Fix devlink deadlock on net namespace deletion" This reverts commit `fbdd0049d9`. Due to commit in fixes tag, netdevice events were received only in one net namespace of mlx5_core_dev. Due to this when netdevice events arrive in net namespace other than net namespace of mlx5_core_dev, they are missed. This results in empty GID table due to RDMA device being detached from its net device. Hence, revert back to receive netdevice events in all net namespaces to restore back RDMA functionality in non init_net net namespace. The deadlock will have to be addressed in another patch. Fixes: `fbdd0049d9` ("RDMA/mlx5: Fix devlink deadlock on net namespace deletion") Link: https://lore.kernel.org/r/20210117092633.10690-1-leon@kernel.org Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-01-19 20:22:59 -04:00
Parav Pandit	559a3eacc4	IB/mlx5: Make function static mlx5_query_mad_ifc_smp_attr_node_info() is internal to mad.c Hence, make it static. Link: https://lore.kernel.org/r/20210113121703.559778-5-leon@kernel.org Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-01-19 20:14:14 -04:00
Parav Pandit	ab40530a2e	IB/mlx5: Add mutex destroy call to cap_mask_mutex mutex mutex_destroy() call for device's cap_mask_mutex mutex is missing, let's add it to annotate destruction. Fixes: `e126ba97db` ("mlx5: Add driver for Mellanox Connect-IB adapters") Link: https://lore.kernel.org/r/20210113121703.559778-4-leon@kernel.org Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-01-19 20:14:14 -04:00
Yishai Hadas	1368ead04c	RDMA/mlx5: Use strict get/set operations for obj_id Use strict get/set operations for obj_id based on the specific object type. This comes to prevent any miss match between the general header to the legacy header commands. Link: https://lore.kernel.org/r/20201230130121.180350-4-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-01-19 10:40:54 -04:00
Yishai Hadas	8798e4ad0a	RDMA/mlx5: Use the correct obj_id upon DEVX TIR creation Use the correct obj_id upon DEVX TIR creation by strictly taking the tirn 24 bits and not the general obj_id which is 32 bits. Fixes: `7efce3691d` ("IB/mlx5: Add obj create and destroy functionality") Link: https://lore.kernel.org/r/20201230130121.180350-2-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-01-19 10:40:54 -04:00
Mark Bloch	1c3aa6bd0b	RDMA/mlx5: Fix wrong free of blue flame register on error If the allocation of the fast path blue flame register fails, the driver should free the regular blue flame register allocated a statement above, not the one that it just failed to allocate. Fixes: `16c1975f10` ("IB/mlx5: Create profile infrastructure to add and remove stages") Link: https://lore.kernel.org/r/20210113121703.559778-6-leon@kernel.org Reported-by: Hans Petter Selasky <hanss@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-01-14 12:51:29 -04:00
Parav Pandit	2cb091f629	IB/mlx5: Fix error unwinding when set_has_smi_cap fails When set_has_smi_cap() fails, multiport master cleanup is missed. Fix it by doing the correct error unwinding goto. Fixes: `a989ea01cb` ("RDMA/mlx5: Move SMI caps logic") Link: https://lore.kernel.org/r/20210113121703.559778-3-leon@kernel.org Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-01-14 12:51:18 -04:00
Linus Torvalds	009bd55dfc	RDMA 5.11 pull request A smaller set of patches, nothing stands out as being particularly major this cycle: - Driver bug fixes and updates: bnxt_re, cxgb4, rxe, hns, i40iw, cxgb4, mlx4 and mlx5 - Bug fixes and polishing for the new rts ULP - Cleanup of uverbs checking for allowed driver operations - Use sysfs_emit all over the place - Lots of bug fixes and clarity improvements for hns - hip09 support for hns - NDR and 50/100Gb signaling rates - Remove dma_virt_ops and go back to using the IB DMA wrappers - mlx5 optimizations for contiguous DMA regions -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAl/aNXUACgkQOG33FX4g mxqlMQ/+O6UhxKnDAnMB+HzDGvOm+KXNHOQBuzxz4ZWXqtUrW8WU5ca3PhXovc4z /QX0HhMhQmVsva5mjp1OGVATxQ2E+yasqFLg4QXAFWFR3N7s0u/sikE9i1DoPvOC lsmLTeRauCFaE4mJD5nvYwm+riECX0GmyVVW7v6V05xwAp0hwdhyU7Kb6Yh3lxsE umTz+onPNJcD6Tc4snziyC5QEp5ebEjAaj4dVI1YPR5X0c2RwC5E1CIDI6u4OQ2k j7/+Kvo8LNdYNERGiR169x6c1L7WS6dYnGMMeXRgyy0BVbVdRGDnvCV9VRmF66w5 99fHfDjNMNmqbGNt/4/gwNdVrR9aI4jMZWCh7SmsguX6XwNOlhYldy3x3WnlkfkQ e4O0huJceJqcB2Uya70GqufnAetRXsbjzcvWxpR5YAwRmcRkm1f6aGK3BxPjWEbr BbYRpiKMxxT4yTe65BuuThzx6g4pNQHe0z3BM/dzMJQAX+PZcs1CPQR8F8PbCrZR Ad7qw4HJ587PoSxPi3toVMpYZRP6cISh1zx9q/JCj8cxH9Ri4MovUCS3cF63Ny3B 1LJ2q0x8FuLLjgZJogKUyEkS8OO6q7NL8WumjvrYWWx19+jcYsV81jTRGSkH3bfY F7Esv5K2T1F2gVsCe1ZFFplQg6ja1afIcc+LEl8cMJSyTdoSub4= =9t8b -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma Pull rdma updates from Jason Gunthorpe: "A smaller set of patches, nothing stands out as being particularly major this cycle. The biggest item would be the new HIP09 HW support from HNS, otherwise it was pretty quiet for new work here: - Driver bug fixes and updates: bnxt_re, cxgb4, rxe, hns, i40iw, cxgb4, mlx4 and mlx5 - Bug fixes and polishing for the new rts ULP - Cleanup of uverbs checking for allowed driver operations - Use sysfs_emit all over the place - Lots of bug fixes and clarity improvements for hns - hip09 support for hns - NDR and 50/100Gb signaling rates - Remove dma_virt_ops and go back to using the IB DMA wrappers - mlx5 optimizations for contiguous DMA regions" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (147 commits) RDMA/cma: Don't overwrite sgid_attr after device is released RDMA/mlx5: Fix MR cache memory leak RDMA/rxe: Use acquire/release for memory ordering RDMA/hns: Simplify AEQE process for different types of queue RDMA/hns: Fix inaccurate prints RDMA/hns: Fix incorrect symbol types RDMA/hns: Clear redundant variable initialization RDMA/hns: Fix coding style issues RDMA/hns: Remove unnecessary access right set during INIT2INIT RDMA/hns: WARN_ON if get a reserved sl from users RDMA/hns: Avoid filling sl in high 3 bits of vlan_id RDMA/hns: Do shift on traffic class when using RoCEv2 RDMA/hns: Normalization the judgment of some features RDMA/hns: Limit the length of data copied between kernel and userspace RDMA/mlx4: Remove bogus dev_base_lock usage RDMA/uverbs: Fix incorrect variable type RDMA/core: Do not indicate device ready when device enablement fails RDMA/core: Clean up cq pool mechanism RDMA/core: Update kernel documentation for ib_create_named_qp() MAINTAINERS: SOFT-ROCE: Change Zhu Yanjun's email address ...	2020-12-16 13:42:26 -08:00
Maor Gottlieb	e899389029	RDMA/mlx5: Fix MR cache memory leak If the MR cache entry invalidation failed, then we detach this entry from the cache, therefore we must to free the memory as well. Allcation backtrace for the leaker: [<00000000d8e423b0>] alloc_cache_mr+0x23/0xc0 [mlx5_ib] [<000000001f21304c>] create_cache_mr+0x3f/0xf0 [mlx5_ib] [<000000009d6b45dc>] mlx5_ib_alloc_implicit_mr+0x41/0×210 [mlx5_ib] [<00000000879d0d68>] mlx5_ib_reg_user_mr+0x9e/0×6e0 [mlx5_ib] [<00000000be74bf89>] create_qp+0x2fc/0xf00 [ib_uverbs] [<000000001a532d22>] ib_uverbs_handler_UVERBS_METHOD_COUNTERS_READ+0x1d9/0×230 [ib_uverbs] [<0000000070f46001>] rdma_alloc_commit_uobject+0xb5/0×120 [ib_uverbs] [<000000006d8a0b38>] uverbs_alloc+0x2b/0xf0 [ib_uverbs] [<00000000075217c9>] ksysioctl+0x234/0×7d0 [<00000000eb5c120b>] __x64_sys_ioctl+0x16/0×20 [<00000000db135b48>] do_syscall_64+0x59/0×2e0 Fixes: `1769c4c575` ("RDMA/mlx5: Always remove MRs from the cache before destroying them") Link: https://lore.kernel.org/r/20201213132940.345554-2-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-12-14 15:14:27 -04:00
Tom Rix	7f1d2dfa30	RDMA/mlx5: Remove unneeded semicolon A semicolon is not needed after a switch statement. Link: https://lore.kernel.org/r/20201031134638.2135060-1-trix@redhat.com Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-12-10 12:00:27 -04:00
Maor Gottlieb	ca991a7d14	RDMA/mlx5: Assign dev to DM MR Currently, DM MR registration flow doesn't set the mlx5_ib_dev pointer and can cause a NULL pointer dereference if userspace dumps the MR via rdma tool. Assign the IB device together with the other fields and remove the redundant reference of mlx5_ib_dev from mlx5_ib_mr. Cc: stable@vger.kernel.org Fixes: `6c29f57ea4` ("IB/mlx5: Device memory mr registration support") Link: https://lore.kernel.org/r/20201203190807.127189-1-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-12-07 15:52:54 -04:00
Jason Gunthorpe	ef3642c4f5	RDMA/mlx5: Fix error unwinds for rereg_mr This is all a giant train wreck of error handling, in many cases the MR is left in some corrupted state where continuing on is going to lead to chaos, or various unwinds/order is missed. rereg had three possible completely different actions, depending on flags and various details about the MR. Split the three actions into three functions, and call the right action from the start. For each action carefully design the error handling to fit the action: - UMR access/PD update is a simple UMR, if it fails the MR isn't changed, so do nothing - PAS update over UMR is multiple UMR operations. To keep everything sane revoke access to the MKey while it is being changed and restore it once the MR is correct. - Recreating the mkey should completely build a parallel MR with a fully loaded PAS then swap and destroy the old one. If it fails the original should be left untouched. This is handled in the core code. Directly call the normal MR creation functions, possibly re-using the existing umem. Add support for working with ODP MRs. The READ/WRITE access flags can be changed by UMR and we can trivially convert to/from ODP MRs using the logic to build a completely new MR. This new logic also fixes various problems with MRs continuing to work while their PAS lists are no longer valid, eg during a page size change. Link: https://lore.kernel.org/r/20201130075839.278575-6-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-12-07 14:06:23 -04:00
Jason Gunthorpe	38f8ff5b44	RDMA/mlx5: Reorganize mlx5_ib_reg_user_mr() This function handles an ODP and regular MR flow all mushed together, even though the two flows are quite different. Split them into two dedicated functions. Link: https://lore.kernel.org/r/20201130075839.278575-5-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-12-07 14:06:23 -04:00
Jason Gunthorpe	6e0954b11c	RDMA/uverbs: Allow drivers to create a new HW object during rereg_mr mlx5 has an ugly flow where it tries to allocate a new MR and replace the existing MR in the same memory during rereg. This is very complicated and buggy. Instead of trying to replace in-place inside the driver, provide support from uverbs to change the entire HW object assigned to a handle during rereg_mr. Since destroying a MR is allowed to fail (ie if a MW is pointing at it) and can't be detected in advance, the algorithm creates a completely new uobject to hold the new MR and swaps the IDR entries of the two objects. The old MR in the temporary IDR entry is destroyed, and if it fails rereg_mr succeeds and destruction is deferred to FD release. This complexity is why this cannot live in a driver safely. Link: https://lore.kernel.org/r/20201130075839.278575-4-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-12-07 14:06:23 -04:00
Jason Gunthorpe	adac4cb3c1	RDMA/uverbs: Check ODP in ib_check_mr_access() as well No reason only one caller checks this. This properly blocks ODP from the rereg flow if the device does not support ODP. Link: https://lore.kernel.org/r/20201130075839.278575-3-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-12-07 14:06:23 -04:00
Leon Romanovsky	04b222f957	RDMA/mlx5: Remove IB representors dead code Delete dead code. Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>	2020-12-06 07:43:54 +02:00
Leon Romanovsky	e87114022e	net/mlx5: Simplify eswitch mode check Provide mlx5_core device instead of "priv" pointer while checking eswith mode. Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>	2020-12-06 07:43:54 +02:00
Leon Romanovsky	93f8244431	RDMA/mlx5: Convert mlx5_ib to use auxiliary bus The conversion to auxiliary bus solves long standing issue with existing mlx5_ib<->mlx5_core coupling. It required to have both modules in initramfs if one of them needed for the boot. Signed-off-by: Leon Romanovsky <leonro@nvidia.com>	2020-12-06 07:43:50 +02:00
Leon Romanovsky	2b1f747071	RDMA/core: Allow drivers to disable restrack DB Driver QP types are special case with no IBTA restrictions. For example, EFA implemented creation of this QP type as regular one, while mlx5 separated create to two step: create and modify. That separation causes to the situation where DC QP (mlx5) is always added to the same xarray index zero. This change allows to drivers like mlx5 simply disable restrack DB tracking, but it doesn't disable kref on the memory. Fixes: `52e0a118a2` ("RDMA/restrack: Track driver QP types in resource tracker") Link: https://lore.kernel.org/r/20201117070148.1974114-3-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-11-27 11:38:46 -04:00
Parav Pandit	7ec3df174f	RDMA/mlx5: Use PCI device for dma mappings DMA operation of the IB device is done using ib_device->dma_device. Instead of accessing parent of the IB device, use the PCI dma device which is setup to ib_device->dma_device during IB device registration. Link: https://lore.kernel.org/r/20201125064628.8431-1-leon@kernel.org Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-11-26 15:49:05 -04:00
Leon Romanovsky	d4b2d19dc5	RDMA/mlx5: Silence the overflow warning while building offset mask Coverity reports "Potentially overflowing expression ..." warning, which is correct thing to complain from the compiler point of view, but this is not possible in the current code. Still, this is a small error as there are some future situations that might need to use a 32 bit offset. Use ULL so the calculation works up to 63. Fixes: `b045db62f6` ("RDMA/mlx5: Use ib_umem_find_best_pgoff() for SRQ") Link: https://lore.kernel.org/r/20201125061704.6580-1-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-11-26 15:49:05 -04:00
Jason Gunthorpe	d0b7721c5e	RDMA/mlx5: Check for ERR_PTR from uverbs_zalloc() The return code from uverbs_zalloc() was wrongly checked, it is ERR_PTR not NULL like other allocators: drivers/infiniband/hw/mlx5/devx.c:2110 devx_umem_reg_cmd_alloc() warn: passing zero to 'PTR_ERR' Fixes: `878f7b31c3` ("RDMA/mlx5: Use ib_umem_find_best_pgsz() for devx") Link: https://lore.kernel.org/r/0-v1-4d05ccc1c223+173-devx_err_ptr_jgg@nvidia.com Reported-by: kernel test robot <lkp@intel.com> Acked-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-11-26 15:49:04 -04:00
Avihai Horon	f957d4d09a	RDMA/mlx5: Enable querying AH for XRC QP types Address handle is set for connected QP types such as RC and UC, and thus can also be queried. Since XRC QP types INI and TGT are connected, it should be possible to query their address handle as well. Until now it was not the case, and although the firmware supported it, the driver allowed querying the address handle only for RC and UC. Hence, we enable it now for INI and TGT QPs as well. Link: https://lore.kernel.org/r/20201115121425.139833-2-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Avihai Horon <avihaih@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-11-26 11:58:19 -04:00
Gustavo A. R. Silva	808b2c925d	IB/mlx5: Fix fall-through warnings for Clang In preparation to enable -Wimplicit-fallthrough for Clang, fix a warning by explicitly adding the new pseudo-keyword fallthrough; instead of letting the code fall through to the next case. Link: https://lore.kernel.org/r/2b0c87362bc86f6adfe56a5a6685837b71022bbf.1605896059.git.gustavoars@kernel.org Link: https://github.com/KSPP/linux/issues/115 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Acked-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-11-23 15:54:11 -04:00

1 2 3 4 5 ...

1500 Commits