linux/drivers/infiniband/core
Yamin Friedman c7ff819aef RDMA/core: Introduce shared CQ pool API
Allow a ULP to ask the core to provide a completion queue based on a
least-used search on a per-device CQ pools. The device CQ pools grow in a
lazy fashion when more CQs are requested.

This feature reduces the amount of interrupts when using many QPs.  Using
shared CQs allows for more effcient completion handling. It also reduces
the amount of overhead needed for CQ contexts.

Test setup:
Intel(R) Xeon(R) Platinum 8176M CPU @ 2.10GHz servers.
Running NVMeoF 4KB read IOs over ConnectX-5EX across Spectrum switch.
TX-depth = 32. The patch was applied in the nvme driver on both the target
and initiator. Four controllers are accessed from each core. In the
current test case we have exposed sixteen NVMe namespaces using four
different subsystems (four namespaces per subsystem) from one NVM port.
Each controller allocated X queues (RDMA QPs) and attached to Y CQs.
Before this series we had X == Y, i.e for four controllers we've created
total of 4X QPs and 4X CQs. In the shared case, we've created 4X QPs and
only X CQs which means that we have four controllers that share a
completion queue per core. Until fourteen cores there is no significant
change in performance and the number of interrupts per second is less than
a million in the current case.
==================================================
|Cores|Current KIOPs  |Shared KIOPs  |improvement|
|-----|---------------|--------------|-----------|
|14   |2332           |2723          |16.7%      |
|-----|---------------|--------------|-----------|
|20   |2086           |2712          |30%        |
|-----|---------------|--------------|-----------|
|28   |1971           |2669          |35.4%      |
|=================================================
|Cores|Current avg lat|Shared avg lat|improvement|
|-----|---------------|--------------|-----------|
|14   |767us          |657us         |14.3%      |
|-----|---------------|--------------|-----------|
|20   |1225us         |943us         |23%        |
|-----|---------------|--------------|-----------|
|28   |1816us         |1341us        |26.1%      |
========================================================
|Cores|Current interrupts|Shared interrupts|improvement|
|-----|------------------|-----------------|-----------|
|14   |1.6M/sec          |0.4M/sec         |72%        |
|-----|------------------|-----------------|-----------|
|20   |2.8M/sec          |0.6M/sec         |72.4%      |
|-----|------------------|-----------------|-----------|
|28   |2.9M/sec          |0.8M/sec         |63.4%      |
====================================================================
|Cores|Current 99.99th PCTL lat|Shared 99.99th PCTL lat|improvement|
|-----|------------------------|-----------------------|-----------|
|14   |67ms                    |6ms                    |90.9%      |
|-----|------------------------|-----------------------|-----------|
|20   |5ms                     |6ms                    |-10%       |
|-----|------------------------|-----------------------|-----------|
|28   |8.7ms                   |6ms                    |25.9%      |
|===================================================================

Performance improvement with sixteen disks (sixteen CQs per core) is
comparable.

Link: https://lore.kernel.org/r/1590568495-101621-3-git-send-email-yaminf@mellanox.com
Signed-off-by: Yamin Friedman <yaminf@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-29 16:09:02 -03:00
..
addr.c RDMA/addr: Mark addr_resolve as might_sleep() 2020-05-12 21:32:52 -03:00
agent.c RDMA: Mark if destroy address handle is in a sleepable context 2018-12-19 16:28:03 -07:00
agent.h
cache.c IB/core: Fix potential NULL pointer dereference in pkey cache 2020-05-12 11:47:48 -03:00
cgroup.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 288 2019-06-05 17:36:37 +02:00
cm_msgs.h RDMA/cm: Remove CM message structs 2020-01-25 15:11:37 -04:00
cm.c RDMA/cm: Send and receive ECE parameter over the wire 2020-05-27 16:05:05 -03:00
cma_configfs.c IB/cma: Fix ports memory leak in cma_configfs 2020-05-22 15:37:19 -03:00
cma_priv.h RDMA/ucma: Extend ucma_connect to receive ECE parameters 2020-05-27 16:05:05 -03:00
cma_trace.c RDMA/cma: Add trace points in RDMA Connection Manager 2020-01-07 16:10:53 -04:00
cma_trace.h RDMA/cma: Add trace points in RDMA Connection Manager 2020-01-07 16:10:53 -04:00
cma.c RDMA/cma: Provide ECE reject reason 2020-05-27 16:05:05 -03:00
core_priv.h RDMA/core: Introduce shared CQ pool API 2020-05-29 16:09:02 -03:00
counters.c RDMA/counter: Prevent auto-binding a QP which are not tracked with res 2019-12-12 15:38:15 -05:00
cq.c RDMA/core: Introduce shared CQ pool API 2020-05-29 16:09:02 -03:00
device.c RDMA/core: Introduce shared CQ pool API 2020-05-29 16:09:02 -03:00
fmr_pool.c RDMA: Delete DEBUG code 2019-08-20 13:27:53 -04:00
ib_core_uverbs.c RDMA/core: Ensure that rdma_user_mmap_entry_remove() is a fence 2020-01-25 14:48:33 -04:00
iwcm.c RDMA/iwcm: Fix iwcm work deallocation 2020-03-04 14:28:25 -04:00
iwcm.h
iwpm_msg.c RDMA/iwpm: Delete unnecessary checks before the macro call "dev_kfree_skb" 2019-08-27 13:09:23 -03:00
iwpm_util.c RDMA/iwpm: Delete unnecessary checks before the macro call "dev_kfree_skb" 2019-08-27 13:09:23 -03:00
iwpm_util.h infiniband: fix core/ipwm_util.h kernel-doc warnings 2019-10-22 14:45:31 -03:00
lag.c RDMA/core: Consider flow label when building skb 2020-05-06 16:51:43 -03:00
mad_priv.h RDMA: Replace zero-length array with flexible-array member 2020-02-20 13:33:51 -04:00
mad_rmpp.c RDMA: Mark if destroy address handle is in a sleepable context 2018-12-19 16:28:03 -07:00
mad_rmpp.h
mad.c RDMA: Allow ib_client's to fail when add() is called 2020-05-06 11:57:33 -03:00
Makefile IB/uverbs: Introduce create/destroy QP commands over ioctl 2020-05-21 20:39:36 -03:00
mr_pool.c Linux 5.2-rc6 2019-06-28 21:18:23 -03:00
multicast.c RDMA: Allow ib_client's to fail when add() is called 2020-05-06 11:57:33 -03:00
netlink.c IB/core: Avoid deadlock during netlink message handling 2019-10-24 20:49:37 -03:00
nldev.c RDMA/core: Fix double put of resource 2020-05-12 11:47:48 -03:00
opa_smi.h RDMA: Start use ib_device_ops 2018-12-12 07:40:16 -07:00
packer.c
rdma_core.c RDMA/core: Allow the ioctl layer to abort a fully created uobject 2020-05-21 20:10:46 -03:00
rdma_core.h IB/uverbs: Introduce create/destroy QP commands over ioctl 2020-05-21 20:39:36 -03:00
restrack.c RDMA/restrack: Remove PID namespace support 2019-10-23 15:58:31 -03:00
restrack.h RDMA/restrack: Remove PID namespace support 2019-10-23 15:58:31 -03:00
roce_gid_mgmt.c drivers: use in_dev_for_each_ifa_rtnl/rcu 2019-06-02 18:06:26 -07:00
rw.c RDMA/rw: use DIV_ROUND_UP to calculate nr_ops 2020-04-15 11:34:49 -03:00
sa_query.c RDMA/core: Use sizeof_field() helper 2020-05-27 13:46:05 -03:00
sa.h RDMA/core: Annotate timeout as unsigned long 2018-10-16 13:34:01 -04:00
security.c RDMA/core: Ensure security pkey modify is not lost 2020-03-24 19:53:25 -03:00
smi.c
smi.h RDMA: Start use ib_device_ops 2018-12-12 07:40:16 -07:00
sysfs.c RDMA/core: Fix several reference count leaks. 2020-05-29 15:35:49 -03:00
trace.c RDMA/core: Trace points for diagnosing completion queue issues 2020-01-07 16:10:53 -04:00
ucma.c RDMA/cma: Provide ECE reject reason 2020-05-27 16:05:05 -03:00
ud_header.c RDMA/core: Use sizeof_field() helper 2020-05-27 13:46:05 -03:00
umem_odp.c RDMA/odp: Fix leaking the tgid for implicit ODP 2020-03-10 14:29:07 -03:00
umem.c RDMA/core: Add weak ordering dma attr to dma mapping 2020-02-13 13:38:02 -04:00
user_mad.c RDMA: Allow ib_client's to fail when add() is called 2020-05-06 11:57:33 -03:00
uverbs_cmd.c RDMA/core: Use sizeof_field() helper 2020-05-27 13:46:05 -03:00
uverbs_ioctl.c RDMA/core: Use sizeof_field() helper 2020-05-27 13:46:05 -03:00
uverbs_main.c IB/uverbs: Refactor related objects to use their own asynchronous event FD 2020-05-21 20:34:53 -03:00
uverbs_marshall.c IB/cm: Replace members of sa_path_rec with 'struct sgid_attr *' 2018-06-25 14:19:57 -06:00
uverbs_std_types_async_fd.c RDMA/uverbs: Move IB_EVENT_DEVICE_FATAL to destroy_uobj 2020-05-12 17:02:25 -03:00
uverbs_std_types_counters.c IB: When attrs.udata/ufile is available use that instead of uobject 2019-04-08 13:05:25 -03:00
uverbs_std_types_cq.c IB/uverbs: Extend CQ to get its own asynchronous event FD 2020-05-21 20:34:53 -03:00
uverbs_std_types_device.c RDMA/core: Add the core support field to METHOD_GET_CONTEXT 2020-01-16 15:55:46 -04:00
uverbs_std_types_dm.c IB: When attrs.udata/ufile is available use that instead of uobject 2019-04-08 13:05:25 -03:00
uverbs_std_types_flow_action.c IB: When attrs.udata/ufile is available use that instead of uobject 2019-04-08 13:05:25 -03:00
uverbs_std_types_mr.c RDMA/core: Allow the ioctl layer to abort a fully created uobject 2020-05-21 20:10:46 -03:00
uverbs_std_types_qp.c IB/uverbs: Introduce create/destroy QP commands over ioctl 2020-05-21 20:39:36 -03:00
uverbs_std_types_srq.c IB/uverbs: Introduce create/destroy SRQ commands over ioctl 2020-05-21 20:39:35 -03:00
uverbs_std_types_wq.c IB/uverbs: Introduce create/destroy WQ commands over ioctl 2020-05-21 20:39:35 -03:00
uverbs_std_types.c IB/uverbs: Introduce create/destroy QP commands over ioctl 2020-05-21 20:39:36 -03:00
uverbs_uapi.c IB/uverbs: Introduce create/destroy QP commands over ioctl 2020-05-21 20:39:36 -03:00
uverbs.h IB/uverbs: Extend CQ to get its own asynchronous event FD 2020-05-21 20:34:53 -03:00
verbs.c RDMA/core: Add protection for shared CQs used by ULPs 2020-05-29 15:40:51 -03:00