According to the IB Specification, srq_limit shouldn't be configured
during SRQ creation. If a user set srq_limit at this time, the driver
should forced it to zero, or the result of creating SRQ will conflict with
the result of querying SRQ.
Fixes: c7bcb13442 ("RDMA/hns: Add SRQ support for hip08 kernel mode")
Link: https://lore.kernel.org/r/1611997090-48820-4-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
If a user posts WR by wr_list, the head pointer of idx_queue won't be
updated until all wqes are filled, so the judgment of whether head equals
to tail will get a wrong result. Fix above issue and move the head and
tail pointer from the srq structure into the idx_queue structure. After
idx_queue is filled with wqe idx, the head pointer of it will increase.
Fixes: c7bcb13442 ("RDMA/hns: Add SRQ support for hip08 kernel mode")
Link: https://lore.kernel.org/r/1611997090-48820-3-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
The RQ/SRQ of HIP08 needs one special sge to stop receive reliably. So the
driver needs to allocate at least one SGE when creating RQ/SRQ and ensure
that at least one SGE is filled with the special value during post_recv.
Besides, the kernel driver should only do this for kernel ULP. For
userspace ULP, the userspace driver will allocate the reserved SGE in
buffer, and the kernel driver just needs to pin the corresponding size of
memory based on the userspace driver's requirements.
Link: https://lore.kernel.org/r/1611997090-48820-2-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
When creating or re-registering an MR, storing the PDN, access flag and
IOVA information ASAP can simplify the number of parameters passed into
the subsequent process.
Link: https://lore.kernel.org/r/1611395282-991-3-git-send-email-liweihang@huawei.com
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Split the hns_roce_mtr_create() into serval small functions, remove unused
member in 'struct hns_roce_buf_attr' and delete unnecessary MTR page count
check flow to make the MTR creation related codes clearer.
Link: https://lore.kernel.org/r/1611395282-991-2-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Fixes the following W=1 kernel build warning(s):
drivers/infiniband/hw/hns/hns_roce_mr.c:1003: warning: Function parameter or member 'hr_dev' not described in 'hns_roce_mtr_create'
Link: https://lore.kernel.org/r/20210121094519.2044049-6-lee.jones@linaro.org
Cc: Lijun Ou <oulijun@huawei.com>
Cc: Weihang Li <liweihang@huawei.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Fixes the following W=1 kernel build warning(s):
drivers/infiniband/hw/hns/hns_roce_hw_v1.c:1398: warning: Function parameter or member 'dereset' not described in 'hns_roce_v1_reset'
drivers/infiniband/hw/hns/hns_roce_hw_v1.c:1398: warning: Excess function parameter 'enable' description in 'hns_roce_v1_reset'
Link: https://lore.kernel.org/r/20210121094519.2044049-5-lee.jones@linaro.org
Cc: Lijun Ou <oulijun@huawei.com>
Cc: Weihang Li <liweihang@huawei.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Wei Hu <xavier.huwei@huawei.com>
Cc: Nenglong Zhao <zhaonenglong@hisilicon.com>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
GFP_KERNEL may cause ida_alloc_range() to sleep, but the spinlock covering
this function is not allowed to sleep, so the spinlock needs to be changed
to mutex.
As there is a certain chance of memory allocation failure, GFP_ATOMIC is
not suitable for QP allocation scenarios.
Fixes: 71586dd200 ("RDMA/hns: Create QP with selected QPN for bank load balance")
Link: https://lore.kernel.org/r/1611048513-28663-1-git-send-email-liweihang@huawei.com
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
In order to improve performance by balancing the load between different
banks of cache, the CQC cache is desigend to choose one of 4 banks
according to lower 2 bits of CQN. The hns driver needs to count the number
of CQ on each bank and then assigns the CQ being created to the bank with
the minimum load first.
Link: https://lore.kernel.org/r/1610008589-35770-1-git-send-email-liweihang@huawei.com
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
This change fixes the checkpatch warning described in
commit cbacb5ab0a ("docs: printk-formats: Stop encouraging use of
unnecessary %h[xudi] and %hh[xudi]")
Standard integer promotion is already done and %hx and %hhx is useless so
do not encourage the use of %hh[xudi] or %h[xudi].
Link: https://lore.kernel.org/r/20201223193041.122850-1-trix@redhat.com
Signed-off-by: Tom Rix <trix@redhat.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
There is no need to get queue number repeatly for different queues from an
AEQE entity, as they are the same. Furthermore, redefine the AEQE
structure to make the codes more readable.
In addition, HNS_ROCE_EVENT_TYPE_CEQ_OVERFLOW is removed because the
hardware never reports this event.
Link: https://lore.kernel.org/r/1607650657-35992-12-git-send-email-liweihang@huawei.com
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Some %d in print format string should be %u, and some prints miss the
useful errno or are in nonstandard format. Just fix above issues.
Link: https://lore.kernel.org/r/1607650657-35992-11-git-send-email-liweihang@huawei.com
Signed-off-by: Yixing Liu <liuyixing1@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Types of some fields, variables and parameters of some functions should be
unsigned.
Link: https://lore.kernel.org/r/1607650657-35992-10-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
There is no need to initialize some variable because they will be assigned
with a value later.
Link: https://lore.kernel.org/r/1607650657-35992-9-git-send-email-liweihang@huawei.com
Signed-off-by: Xinhao Liu <liuxinhao5@hisilicon.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Just format the code without modifying anything, including fixing some
redundant and missing blanks and spaces and changing the variable
definition order.
Link: https://lore.kernel.org/r/1607650657-35992-8-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
As the qp access right is checked and setted in common function
hns_roce_v2_set_opt_fields(), there is no need to set again for a special
case INIT2INIT.
Fixes: 926a01dc00 ("RDMA/hns: Add QP operations support for hip08 SoC")
Fixes: 7db82697b8 ("RDMA/hns: Add support for extended atomic in userspace")
Link: https://lore.kernel.org/r/1607650657-35992-7-git-send-email-liweihang@huawei.com
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
According to the RoCE v1 specification, the sl (service level) 0-7 are
mapped directly to priorities 0-7 respectively, sl 8-15 are reserved. The
driver should verify whether the value of sl is larger than 7, if so, an
exception should be returned.
Link: https://lore.kernel.org/r/1607650657-35992-6-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Only the low 12 bits of vlan_id is valid, and service level has been
filled in Address Vector. So there is no need to fill sl in vlan_id in
Address Vector.
Fixes: 7406c0036f ("RDMA/hns: Only record vlan info for HIP08")
Link: https://lore.kernel.org/r/1607650657-35992-5-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
The high 6 bits of traffic class in GRH is DSCP (Differentiated Services
Codepoint), the driver should shift it before the hardware gets it when
using RoCEv2.
Fixes: 606bf89e98 ("RDMA/hns: Refactor for hns_roce_v2_modify_qp function")
Fixes: fba429fcf9 ("RDMA/hns: Fix missing fields in address vector")
Link: https://lore.kernel.org/r/1607650657-35992-4-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Whether to enable the these features should better depend on the enable
flags, not the value of related fields.
Fixes: 5c1f167af1 ("RDMA/hns: Init SRQ table for hip08")
Fixes: 3cb2c996c9 ("RDMA/hns: Add support for SCCC in size of 64 Bytes")
Link: https://lore.kernel.org/r/1607650657-35992-3-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
For ib_copy_from_user(), the length of udata may not be the same as that
of cmd. For ib_copy_to_user(), the length of udata may not be the same as
that of resp. So limit the length to prevent out-of-bounds read and write
operations from ib_copy_from_user() and ib_copy_to_user().
Fixes: de77503a59 ("RDMA/hns: RDMA/hns: Assign rq head pointer when enable rq record db")
Fixes: 633fb4d9fd ("RDMA/hns: Use structs to describe the uABI instead of opencoding")
Fixes: ae85bf92ef ("RDMA/hns: Optimize qp param setup flow")
Fixes: 6fd610c573 ("RDMA/hns: Support 0 hop addressing for SRQ buffer")
Fixes: 9d9d4ff788 ("RDMA/hns: Update the kernel header file of hns")
Link: https://lore.kernel.org/r/1607650657-35992-2-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
These flags will be returned to the userspace through ABI, so they should
be defined in hns-abi.h. Furthermore, there is no need to include
hns-abi.h in every source files, it just needs to be included in the
common header file.
Link: https://lore.kernel.org/r/1606872560-17823-1-git-send-email-liweihang@huawei.com
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
mlx5 has an ugly flow where it tries to allocate a new MR and replace the
existing MR in the same memory during rereg. This is very complicated and
buggy. Instead of trying to replace in-place inside the driver, provide
support from uverbs to change the entire HW object assigned to a handle
during rereg_mr.
Since destroying a MR is allowed to fail (ie if a MW is pointing at it)
and can't be detected in advance, the algorithm creates a completely new
uobject to hold the new MR and swaps the IDR entries of the two objects.
The old MR in the temporary IDR entry is destroyed, and if it fails
rereg_mr succeeds and destruction is deferred to FD release. This
complexity is why this cannot live in a driver safely.
Link: https://lore.kernel.org/r/20201130075839.278575-4-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
The variable 'cnt' is used to represent the max number of sge an SQ WQE
can use at first, then it means how many extended sge an SQ has. In
addition, this function has no need to return a value. So refactor and
encapsulate the parts of getting number of extended sge a WQE can use to
make it easier to understand.
Link: https://lore.kernel.org/r/1606558959-48510-4-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Page alignment is required when setting the number of extended sge
according to the hardware's achivement. If the space of needed extended
sge is greater than one page, the roundup_pow_of_two() can ensure
that. But if the needed extended sge isn't 0 and can not be filled in a
whole page, the driver should align it specifically.
Fixes: 54d6638765 ("RDMA/hns: Optimize WQE buffer size calculating process")
Link: https://lore.kernel.org/r/1606558959-48510-3-git-send-email-liweihang@huawei.com
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
One RC SQ WQE can store 2 sges but UD can't, so ignore 2 valid sges of
wr.sglist for RC which have been filled in WQE before setting extended
sge. Either of RC and UD can not contain 0-length sges, so these 0-length
sges should be skipped.
Fixes: 54d6638765 ("RDMA/hns: Optimize WQE buffer size calculating process")
Link: https://lore.kernel.org/r/1606558959-48510-2-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl/EM9oeHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiG/3kH/RNkFyTlHlUkZpJx
8Ks2yWgUln7YhZcmOaG/IcIyWnhCgo3l35kiaH7XxM+rPMZzidp51MHUllaTAQDc
u+5EFHMJsmTWUfE8ocHPb1cPdYEDSoVr6QUsixbL9+uADpRz+VZVtWMb89EiyMrC
wvLIzpnqY5UNriWWBxD0hrmSsT4g9XCsauer4k2KB+zvebwg6vFOMCFLFc2qz7fb
ABsrPFqLZOMp+16chGxyHP7LJ6ygI/Hwf7tPW8ppv4c+hes4HZg7yqJxXhV02QbJ
s10s6BTcEWMqKg/T6L/VoScsMHWUcNdvrr3uuPQhgup240XdmB1XO8rOKddw27e7
VIjrjNw=
=4ZaP
-----END PGP SIGNATURE-----
Merge tag 'v5.10-rc6' into rdma.git for-next
For dependencies in following patches
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Stash is a mechanism that uses the core information carried by the ARM AXI
bus to access the L3 cache. It can be used to improve the performance by
increasing the hit ratio of L3 cache. QPs need to enable stash by default.
Link: https://lore.kernel.org/r/1606374251-21512-3-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Stash is a mechanism that uses the core information carried by the ARM AXI
bus to access the L3 cache. It can be used to improve the performance by
increasing the hit ratio of L3 cache. CQs need to enable stash by default.
Link: https://lore.kernel.org/r/1606374251-21512-2-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
In order to improve performance by balancing the load between different
banks of cache, the QPC cache is desigend to choose one of 8 banks
according to lower 3 bits of QPN. The hns driver needs to count the number
of QP on each bank and then assigns the QP being created to the bank with
the minimum load first.
Link: https://lore.kernel.org/r/1606220649-1465-1-git-send-email-liweihang@huawei.com
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
The loopback flag will be set to 1 by the hardware when the source mac
address is same as the destination mac address. So the driver don't need
to compare them.
Fixes: d6a3627e31 ("RDMA/hns: Optimize wqe buffer set flow for post send")
Link: https://lore.kernel.org/r/1605526408-6936-4-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Traffic class and hop limit in address vector is not assigned from GRH,
but it will be filled into UD SQ WQE. So the hardware will get a wrong
value.
Fixes: 82e620d9c3 ("RDMA/hns: Modify the data structure of hns_roce_av")
Link: https://lore.kernel.org/r/1605526408-6936-3-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
When a memory window is bound to a memory region, the local write access
should be set for its mtpt table.
Fixes: c7c2819140 ("RDMA/hns: Add MW support for hip08")
Link: https://lore.kernel.org/r/1606386372-21094-1-git-send-email-liweihang@huawei.com
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
The maximum number of retransmission should be returned when querying QP,
not the value of retransmission counter.
Fixes: 99fcf82521 ("RDMA/hns: Fix the wrong value of rnr_retry when querying qp")
Fixes: 926a01dc00 ("RDMA/hns: Add QP operations support for hip08 SoC")
Link: https://lore.kernel.org/r/1606382977-21431-1-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
The SRQ capacity is got from the firmware, whose field should be ended at
bit 19.
Fixes: ba6bb7e974 ("RDMA/hns: Add interfaces to get pf capabilities from firmware")
Link: https://lore.kernel.org/r/1606382812-23636-1-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Add a group of flags to control the 'struct hns_roce_buf' allocation
flow, this is used to support the caller running in atomic context.
Link: https://lore.kernel.org/r/1605347916-15964-1-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
A return statement is omitted after getting HEM table, then the newly
allocated pointer will be freed directly, which will cause a calltrace
when the driver was removed.
Fixes: d6d91e4621 ("RDMA/hns: Add support for configuring GMV table")
Link: https://lore.kernel.org/r/1605180582-46504-1-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Add a interface to fill GMV(SGID/SMAC/VLAN) table for HIP09, all of above
source address information is stored as an entry in GMV table. The users
just need to provide the index to the hardware when POST SEND.
Link: https://lore.kernel.org/r/1603508836-33054-3-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
HIP09 supports to store SGID/SMAC/VLAN together in a table named GMV. The
driver needs to allocate memory for it and tell the information about this
region to hardware.
Link: https://lore.kernel.org/r/1603508836-33054-2-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
The doorbell needs to store PI information into QPC, so the RoCEE should
wait for the results of storing, that is, it needs two bus operations to
complete a doorbell. When ROCEE is in SDI mode, multiple doorbells may be
interlocked because the RoCEE can only handle bus operations serially. So a
flag to mark if HIP09 is working in SDI mode is added. When the SDI flag is
set, the ROCEE will ignore the PI information of the doorbell, continue to
fetch wqe and verify its validity by it's owner_bit.
Link: https://lore.kernel.org/r/1603195493-22741-1-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Allowing userspace to invoke these commands is probably going to crash
these drivers as they are not tested and not expecting to use them on a
user object.
For example pvrdma touches cq->ring_state which is not initialized for
user QPs.
These commands are effected:
- IB_USER_VERBS_CMD_REQ_NOTIFY_CQ is ibv_cmd_req_notify_cq() in
rdma-core, only hfi1, ipath and rxe calls it.
- IB_USER_VERBS_CMD_POLL_CQ is ibv_cmd_poll_cq() in rdma-core, only
ipath and hfi1 calls it.
- IB_USER_VERBS_CMD_POST_SEND/RECV is ibv_cmd_post_send/recv() in
rdma-core, only ipath and hfi1 call them.
- IB_USER_VERBS_CMD_POST_SRQ_RECV is ibv_cmd_post_srq_recv() in
rdma-core, only ipath and hfi1 calls it.
- IB_USER_VERBS_CMD_PEEK_CQ isn't even implemented anywhere
- IB_USER_VERBS_CMD_CREATE/DESTROY_AH is ibv_cmd_create/destroy_ah() in
rdma-core, only bnxt_re, efa, hfi1, ipath, mlx5, orcrdma, and rxe call
it.
Link: https://lore.kernel.org/r/10-v1-caa70ba3d1ab+1436e-ucmd_mask_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Each driver should check that the QP attrs create_flags is supported.
Unfortuantely when create_flags was added to the QP attrs the drivers were
not updated. uverbs_ex_cmd_mask was used to block it - even though kernel
drivers use these flags too.
Check that flags is zero in all drivers that don't use it, remove
IB_USER_VERBS_EX_CMD_CREATE_QP from uverbs_ex_cmd_mask. Fix the error code
to be EOPNOTSUPP.
Link: https://lore.kernel.org/r/8-v1-caa70ba3d1ab+1436e-ucmd_mask_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Each driver should check that the CQ attrs is supported. Unfortuantely
when flags was added to the CQ attrs the drivers were not updated,
uverbs_ex_cmd_mask was used to block it. This was missed when create CQ
was converted to ioctl, so non-zero flags could have been passed into
drivers.
Check that flags is zero in all drivers that don't use it, remove
IB_USER_VERBS_EX_CMD_CREATE_CQ from uverbs_ex_cmd_mask.
Fixes: 41b2a71fc8 ("IB/uverbs: Move ioctl path of create_cq and destroy_cq to a new file")
Link: https://lore.kernel.org/r/7-v1-caa70ba3d1ab+1436e-ucmd_mask_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Each driver should check that it can support the provided attr_mask during
modify_qp. IB_USER_VERBS_EX_CMD_MODIFY_QP was being used to block
modify_qp_ex because the driver didn't check RATE_LIMIT.
Link: https://lore.kernel.org/r/6-v1-caa70ba3d1ab+1436e-ucmd_mask_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
uverbs was blocking srq_types the driver doesn't support based on the
CREATE_XSRQ cmd_mask. Fix all drivers to check for supported srq_types
during create_srq and move CREATE_XSRQ to the core code.
Link: https://lore.kernel.org/r/5-v1-caa70ba3d1ab+1436e-ucmd_mask_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
These functions all depend on the driver providing a specific op:
- REREG_MR is rereg_user_mr(). bnxt_re set this without providing the op
- ATTACH/DEATCH_MCAST is attach_mcast()/detach_mcast(). usnic set this
without providing the op
- OPEN_QP doesn't involve the driver but requires a XRCD. qedr provides
xrcd but forgot to set it, usnic doesn't provide XRCD but set it anyhow.
- OPEN/CLOSE_XRCD are the ops alloc_xrcd()/dealloc_xrcd()
- CREATE_SRQ/DESTROY_SRQ are the ops create_srq()/destroy_srq()
- QUERY/MODIFY_SRQ is op query_srq()/modify_srq(). hns sets this but
sometimes supplies a NULL op.
- RESIZE_CQ is op resize_cq(). bnxt_re sets this boes doesn't supply an op
- ALLOC/DEALLOC_MW is alloc_mw()/dealloc_mw(). cxgb4 provided an
(now deleted) implementation but no userspace
All drivers were checked that no drivers provide the op without also
setting uverbs_cmd_mask so this should have no functional change.
Link: https://lore.kernel.org/r/4-v1-caa70ba3d1ab+1436e-ucmd_mask_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Since a while now the uverbs layer checks if the driver implements a
function before allowing the ucmd to proceed. This largely obsoletes the
cmd_mask stuff, but there is some tricky bits in drivers preventing it
from being removed.
Remove the easy elements of uverbs_ex_cmd_mask by pre-setting them in the
core code. These are triggered soley based on the related ops function
pointer.
query_device_ex is not triggered based on an op, but all drivers already
implement something compatible with the extension, so enable it globally
too.
Link: https://lore.kernel.org/r/2-v1-caa70ba3d1ab+1436e-ucmd_mask_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
The code in setup_dma_device has become rather convoluted, move all of
this to the drivers. Drives now pass in a DMA capable struct device which
will be used to setup DMA, or drivers must fully configure the ibdev for
DMA and pass in NULL.
Other than setting the masks in rvt all drivers were doing this already
anyhow.
mthca, mlx4 and mlx5 were already setting up maximum DMA segment size for
DMA based on their hardweare limits in:
__mthca_init_one()
dma_set_max_seg_size (1G)
__mlx4_init_one()
dma_set_max_seg_size (1G)
mlx5_pci_init()
set_dma_caps()
dma_set_max_seg_size (2G)
Other non software drivers (except usnic) were extended to UINT_MAX [1, 2]
instead of 2G as was before.
[1] https://lore.kernel.org/linux-rdma/20200924114940.GE9475@nvidia.com/
[2] https://lore.kernel.org/linux-rdma/20200924114940.GE9475@nvidia.com/
Link: https://lore.kernel.org/r/20201008082752.275846-1-leon@kernel.org
Link: https://lore.kernel.org/r/6b2ed339933d066622d5715903870676d8cc523a.1602590106.git.mchehab+huawei@kernel.org
Suggested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Some code was removed but the variables were still there, and some
parameters have been changed to be queried from firmware. So the
definitions of them are no longer needed.
Fixes: 2a3d923f87 ("RDMA/hns: Replace magic numbers with #defines")
Fixes: 82e620d9c3 ("RDMA/hns: Modify the data structure of hns_roce_av")
Fixes: 8254746978 ("IB/hns: Implement the add_gid/del_gid and optimize the GIDs management")
Fixes: 21b97f5387 ("RDMA/hns: Fixup qp release bug")
Link: https://lore.kernel.org/r/1601371934-40003-1-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
GSI QP can't be created from the user space, hence the udata check is
always false (udata == NULL). Remove that check and simplify the flow.
Link: https://lore.kernel.org/r/20200926102450.2966017-9-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
HIP08 supports RC inline up to size of 32 Bytes, and all data should be
put into SQWQE. For HIP09, this capability is extended to 1024 Bytes, if
length of data is longer than 32 Bytes, they will be filled into extended
sge space.
Link: https://lore.kernel.org/r/1599744069-9968-1-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
The hardware will add AckReq flag in BTH header according to the value of
ack_req_freq to request ACK from responder for the packets with this flag.
It should be greater than or equal to lp_pktn_ini instead of using a fixed
value.
Fixes: 7b9bd73ed1 ("RDMA/hns: Fix wrong assignment of lp_pktn_ini in QPC")
Link: https://lore.kernel.org/r/1600509802-44382-8-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
The rnr_retry returned to the user is not correct, it should be got from
another fields in QPC.
Fixes: bfe860351e ("RDMA/hns: Fix cast from or to restricted __le32 for driver")
Link: https://lore.kernel.org/r/1600509802-44382-7-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
calc_pg_sz() may gets a data calculation overflow if the PAGE_SIZE is 64 KB
and hop_num is 2. It is because that all variables involved in calculation
are defined in type of int. So change the type of bt_chunk_size,
buf_chunk_size and obj_per_chunk_default to u64.
Fixes: ba6bb7e974 ("RDMA/hns: Add interfaces to get pf capabilities from firmware")
Link: https://lore.kernel.org/r/1600509802-44382-6-git-send-email-liweihang@huawei.com
Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
According to the RoCE v1 specification, the sl (service level) 0-7 are
mapped directly to priorities 0-7 respectively, sl 8-15 are reserved. The
driver should verify whether the the value of sl is larger than 7, if so,
an exception should be returned.
Fixes: 926a01dc00 ("RDMA/hns: Add QP operations support for hip08 SoC")
Link: https://lore.kernel.org/r/1600509802-44382-5-git-send-email-liweihang@huawei.com
Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
HIP08 doesn't support modifying the maximum number of outstanding WR in an
SRQ. However, the driver does not return a failure message, and users may
mistakenly think that the resizing is executed successfully. So the driver
needs to intercept this operation.
Fixes: ffb1308b88 ("RDMA/hns: Move SRQ code to the reasonable place")
Link: https://lore.kernel.org/r/1600509802-44382-3-git-send-email-liweihang@huawei.com
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
According to the IB specifications, the verbs should return an immediate
error when the users set an unsupported opcode. Furthermore, refactor codes
about opcode in process of post_send to make the difference between opcodes
clearer.
Link: https://lore.kernel.org/r/1600509802-44382-2-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
For HIP09, size of SCCC (Soft Congestion Control Context) is increased to
64 Bytes from 32 Bytes. The hardware will get the configuration of SCCC
from driver instead of using a fixed value.
Link: https://lore.kernel.org/r/1600245806-56321-5-git-send-email-liweihang@huawei.com
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
The new version of RoCEE supports using QPC in size of 256B or 512B, so
that HIP09 can supports new congestion control algorithms by using QPC in
larger size.
Link: https://lore.kernel.org/r/1600245806-56321-4-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
The new version of RoCEE supports using CQE in size of 32B or 64B. The
performance of bus can be improved by using larger size of CQE.
Link: https://lore.kernel.org/r/1600245806-56321-3-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
The new version of RoCEE supports using CEQE in size of 4B or 64B, AEQE in
size of 16B or 64B. The performance of bus can be improved by using larger
size of EQE.
Link: https://lore.kernel.org/r/1600245806-56321-2-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
hip06 does not support IB_WR_LOCAL_INV, so the ps_opcode should be set to
an invalid value instead of being left uninitialized.
Fixes: 9a4435375c ("IB/hns: Add driver files for hns RoCE driver")
Fixes: a2f3d4479f ("RDMA/hns: Avoid unncessary initialization")
Link: https://lore.kernel.org/r/1600350615-115217-1-git-send-email-oulijun@huawei.com
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Move allocation and destruction of memory windows under ib_core
responsibility and clean drivers to ensure that no updates to MW
ib_core structures are done in driver layer.
Link: https://lore.kernel.org/r/20200902081623.746359-2-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
mtr_umem_page_count() does the same thing, replace it with the core code.
Also, ib_umem_find_best_pgsz() should always be called to check that the
umem meets the page_size requirement. If there is a limited set of
page_sizes that work it the pgsz_bitmap should be set to that set. 0 is a
failure and the umem cannot be used.
Lightly tidy the control flow to implement this flow properly.
Link: https://lore.kernel.org/r/12-v2-270386b7e60b+28f4-umem_1_jgg@nvidia.com
Acked-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
This helper does the same as rdma_for_each_block(), except it works on a
umem. This simplifies most of the call sites.
Link: https://lore.kernel.org/r/4-v2-270386b7e60b+28f4-umem_1_jgg@nvidia.com
Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Like any other verbs objects, CQ shouldn't fail during destroy, but
mlx5_ib didn't follow this contract with mixed IB verbs objects with
DEVX. Such mix causes to the situation where FW and kernel are fully
interdependent on the reference counting of each side.
Kernel verbs and drivers that don't have DEVX flows shouldn't fail.
Fixes: e39afe3d6d ("RDMA: Convert CQ allocations to be under core responsibility")
Link: https://lore.kernel.org/r/20200907120921.476363-7-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
In similar way to other IB objects, restore the ability to return error on
SRQ destroy. Strictly speaking, this change is not necessary, and provided
here to ensure a symmetrical interface like other destroy functions.
Fixes: 68e326dea1 ("RDMA: Handle SRQ allocations by IB/core")
Link: https://lore.kernel.org/r/20200907120921.476363-5-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Like any other IB verbs objects, AH are refcounted by ib_core. The release
of those objects are controlled by ib_core with promise that AH destroy
can't fail.
Being SW object for now, this change makes dealloc_ah() to behave like any
other destroy IB flows.
Fixes: d345691471 ("RDMA: Handle AH allocations by IB/core")
Link: https://lore.kernel.org/r/20200907120921.476363-3-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
The IB verbs objects are counted by the kernel and ib_core ensures that
deallocate PD will success so it will be called once all other objects
that depends on PD will be released. This is achieved by managing various
reference counters on such objects.
The mlx5 driver didn't follow this standard flow when allowed DEVX objects
that are not managed by ib_core to be interleaved with the ones under
ib_core responsibility.
In such interleaved scenarios deallocate command can fail and ib_core will
leave uobject in internal DB and attempt to clean it later to free
resources anyway.
This change partially restores returned value from dealloc_pd() for all
drivers, but keeping in mind that non-DEVX devices and kernel verbs paths
shouldn't fail.
Fixes: 21a428a019 ("RDMA: Handle PD allocations by IB/core")
Link: https://lore.kernel.org/r/20200907120921.476363-2-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl9ML+IeHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGA8EIAIy/kTbFS0yrE9yV
hb98oX0z9+EU9YQg9vhaRWwPd+rJF/JMQZLqYcwbhjG9abaUL3T3fEcSAefMHw8E
LAt+hYzA38dHt7tqhsFQX3vV1VorvDVICBVN0yRPRWKKikq4OPIHzaAR9tleGAF5
8btQisl1PjN+obwYmLuNb6aX16OCwAF+uXOwehcoJs9dvMNhwtXRzfOflWzOvOo6
tE0bHErlylLDfLv4ZzEfczTdks4QJZ7C0xLSf3oN9AAynW42Xnhct4hi8qZY/hCf
CMaqeN4hdpub6TvQIqBdDqMMjEXGFgeNSnAEBQY9VpvUqz8NTu6sQxwgJEKDF5tg
d81lv2c=
=uW/F
-----END PGP SIGNATURE-----
Merge tag 'v5.9-rc3' into rdma.git for-next
Required due to dependencies in following patches.
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
The UDP source port number in RoCE v2 is used to create entropy for
network routers (ECMP), load balancers and 802.3ad link aggregation
switching that are not aware of RoCE IB headers. Considering that the IB
core has achieved a new interface to get a hashed value of it, the fixed
value of it in QPC and UD WQE in hns driver could be fixed and the port
number is to be set dynamically now.
For QPC of RC, the value could be hashed from flow_lable if the user pass
it in or from remote qpn and local qpn. For WQE of UD, it is set according
to fl or as a random value.
Link: https://lore.kernel.org/r/1598002289-8611-1-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
It should be considered an illegal operation if the ULP attempts to modify
a QP from another state to the current hardware state. Otherwise, the ULP
can modify some fields of QPC at any time. For example, for a QP in state
of RTS, modify it from RTR to RTS can change the PSN, which is always not
as expected.
Fixes: 9a4435375c ("IB/hns: Add driver files for hns RoCE driver")
Link: https://lore.kernel.org/r/1598353674-24270-1-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
This patch caused some issues on SEND operation, and it should be reverted
to make the drivers work correctly. There will be a better solution that
has been tested carefully to solve the original problem.
This reverts commit 711195e57d.
Fixes: 711195e57d ("RDMA/hns: Reserve one sge in order to avoid local length error")
Link: https://lore.kernel.org/r/1597829984-20223-1-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Smaller set of RDMA updates. A smaller number of 'big topics' with the
majority of changes being driver updates.
- Driver updates for hfi1, rxe, mlx5, hns, qedr, usnic, bnxt_re
- Removal of dead or redundant code across the drivers
- RAW resource tracker dumps to include a device specific data blob for
device objects to aide device debugging
- Further advance the IOCTL interface, remove the ability to turn it off.
Add QUERY_CONTEXT, QUERY_MR, and QUERY_PD commands
- Remove stubs related to devices with no pkey table
- A shared CQ scheme to allow multiple ULPs to share the CQ rings of a
device to give higher performance
- Several more static checker, syzkaller and rare crashers fixed
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAl8sSA0ACgkQOG33FX4g
mxpp1w/8Df/KIB38PVHpKraIW10bX03KsXwoskMYCA+ITYWM5ce+P7YF+yXXGs69
Vh2vUYHlr1RvqXQkq3Y3LjzCPKTYFuNFVQRZF1LrfbfOpSS9aoQqoxwgKs08dibm
YDeRwueWneksWhXeEZLA0QoKd4kEWrScA/n7VGYQ4YcWw8FLKa9t6OMSGivCrFLu
QA+sA9nytrvMWC5uJUCdeVwlRnoaICPYHmM5yafOykPyEciRw2jU1kzTRVy5Z0Hu
iCsXm2lJPcVoMgSjW6SgktY3oBkQeSu3ZZesT3eTM6FJsoDYkuSiKjNmWSZjW1zv
x6CFGjVVin41rN4FMTeqqnwYoML9Q/obbyHvBHs5MTd5J8tLDhesQj3Ev7CUaUed
b0s38v+oEL1w22nkOChfeyfh7eLcy3yiszqvkIU9ABk8mF0p1guGQYsfguzbsq0K
3ZRw/361SxCUBvU6P8CdQbIJlhkH+Un7d81qyt+rhLgaZYm/N+d8auIKUxP1jCxh
q9hss2Cj2U9eZsA/wGNqV1LNazfEAAj/5qjItMirbRd90FL8h+AP2LfJfC7p+id3
3BfOui0JbZqNTTl4ftTxPuxtWDEdTPgwi7JvQd/be9HRlSV8DYCSMUzYFn8A+Zya
cbxjxFuBJWmF+y9csDIVBTdFi+j9hO6notw+G89NznuB3QlPl50=
=0z2L
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma updates from Jason Gunthorpe:
"A quiet cycle after the larger 5.8 effort. Substantially cleanup and
driver work with a few smaller features this time.
- Driver updates for hfi1, rxe, mlx5, hns, qedr, usnic, bnxt_re
- Removal of dead or redundant code across the drivers
- RAW resource tracker dumps to include a device specific data blob
for device objects to aide device debugging
- Further advance the IOCTL interface, remove the ability to turn it
off. Add QUERY_CONTEXT, QUERY_MR, and QUERY_PD commands
- Remove stubs related to devices with no pkey table
- A shared CQ scheme to allow multiple ULPs to share the CQ rings of
a device to give higher performance
- Several more static checker, syzkaller and rare crashers fixed"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (121 commits)
RDMA/mlx5: Fix flow destination setting for RDMA TX flow table
RDMA/rxe: Remove pkey table
RDMA/umem: Add a schedule point in ib_umem_get()
RDMA/hns: Fix the unneeded process when getting a general type of CQE error
RDMA/hns: Fix error during modify qp RTS2RTS
RDMA/hns: Delete unnecessary memset when allocating VF resource
RDMA/hns: Remove redundant parameters in set_rc_wqe()
RDMA/hns: Remove support for HIP08_A
RDMA/hns: Refactor hns_roce_v2_set_hem()
RDMA/hns: Remove redundant hardware opcode definitions
RDMA/netlink: Remove CAP_NET_RAW check when dump a raw QP
RDMA/include: Replace license text with SPDX tags
RDMA/rtrs: remove WQ_MEM_RECLAIM for rtrs_wq
RDMA/rtrs-clt: add an additional random 8 seconds before reconnecting
RDMA/cma: Execute rdma_cm destruction from a handler properly
RDMA/cma: Remove unneeded locking for req paths
RDMA/cma: Using the standard locking pattern when delivering the removal event
RDMA/cma: Simplify DEVICE_REMOVAL for internal_id
RDMA/efa: Add EFA 0xefa1 PCI ID
RDMA/efa: User/kernel compatibility handshake mechanism
...
If the hns ROCEE reports a general error CQE (types not specified by the IB
General Specifications), it's no need to change the QP state to error, and
the driver should just skip it.
Fixes: 7c044adca2 ("RDMA/hns: Simplify the cqe code of poll cq")
Link: https://lore.kernel.org/r/1595932941-40613-8-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
One qp state migrations legal configuration was deleted mistakenly.
Fixes: 357f342946 ("RDMA/hns: Simplify the state judgment code of qp")
Link: https://lore.kernel.org/r/1595932941-40613-7-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
The hns_roce_cmq_setup_basic_desc() can clear the whole desc, so removes
these redundant memset operations.
Link: https://lore.kernel.org/r/1595932941-40613-6-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
There are some functions called by set_rc_wqe() use two parameters:
"void *wqe" and "struct hns_roce_v2_rc_send_wqe *rc_sq_wqe", but the first
one can be got from the second one. So remove the redundant wqe from
related functions.
Link: https://lore.kernel.org/r/1595932941-40613-5-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
HIP08_A is an temporary version and all features of it are supported by
HIP08_B. So remove the relevant code.
Link: https://lore.kernel.org/r/1595932941-40613-4-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
The parts about preparing and sending mailbox to hardware is not strongly
related to other codes in hns_roce_v2_set_hem(), and can be encapsulated
into a separate function.
Link: https://lore.kernel.org/r/1595932941-40613-3-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
HNS_ROCE_SQ_OPCODE_XXXs and HNS_ROCE_V2_WQE_OP_XXXs have same values, so
remove a set of redundant definitions. In addition, remove the suffix of
HNS_ROCE_V2_WQE_OP_BIND_MW_TYPE.
Link: https://lore.kernel.org/r/1595932941-40613-2-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
ROCE uses "VA % buf_page_size" to caclulate the offset in the PBL's first
page, the actual PA corresponding to the MR's VA is equal to MR's PA plus
this offset. The first PA in PBL has already been aligned to PAGE_SIZE
after calling ib_umem_get(), but the MR's VA may not. If the buf_page_size
is smaller than the PAGE_SIZE, this will lead the HW to access the wrong
memory because the offset is smaller than expected.
Fixes: 9b2cf76c9f ("RDMA/hns: Optimize PBL buffer allocation process")
Link: https://lore.kernel.org/r/1594726935-45666-1-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
The RoCE Engine will schedule to another QP after one has sent
(2 ^ lp_pktn_ini) packets. lp_pktn_ini is set in QPC and should be
calculated from 2 factors:
1. current MTU as a integer
2. the RoCE Engine's maximum slice length 64KB
But the driver use MTU as a enum ib_mtu and the max inline capability, the
lp_pktn_ini will be much bigger than expected which may cause traffic of
some QPs to never get scheduled.
Fixes: b713128de7 ("RDMA/hns: Adjust lp_pktn_ini dynamically")
Link: https://lore.kernel.org/r/1594726138-49294-1-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
If hns ROCEE is set to level-0 addressing, the length of the entire buffer
can be used as the page size. The driver needn't to split the buffer into
small units because all pages are continuous.
Link: https://lore.kernel.org/r/1593525696-12570-1-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Allocating an MR flow can only be initiated by kernel users, and not from
userspace so a udata parameter is redundant.
Link: https://lore.kernel.org/r/20200706120343.10816-4-galpress@amazon.com
Signed-off-by: Gal Pressman <galpress@amazon.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
In order to avoid double multiplexing of the resource when it is a CQ, add
a dedicated callback function.
Link: https://lore.kernel.org/r/20200623113043.1228482-6-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
If a IMP reset caused by some hardware errors and hns RoCE driver reset
occurred at the same time, there is a possiblity that the IMP will stop
dealing with command and users can't use the hardware. The logs are as
follows:
hns3 0000:fd:00.1: cleaned 0, need to clean 1
hns3 0000:fd:00.1: firmware version query failed -11
hns3 0000:fd:00.1: Cmd queue init failed
hns3 0000:fd:00.1: Upgrade reset level
hns3 0000:fd:00.1: global reset interrupt
The hns NIC driver divides the reset process into 3 status:
initialization, hardware resetting and softwaring restting. RoCE driver
gets reset status by interfaces provided by NIC driver and commands will
not be sent to the IMP if the driver is in any above status. The main
reason for this issue is that there is a time gap between status 1 and 2,
if the RoCE driver sends commands to the IMP during this gap, the IMP will
stop working because it is not ready.
To eliminate the time gap, the hns NIC driver has added a new interface in
commit a4de02287a ("net: hns3: provide .get_cmdq_stat interface for the
client"), so RoCE driver can ensure that no commands will be sent during
resetting.
Link: https://lore.kernel.org/r/1592314778-52822-1-git-send-email-liweihang@huawei.com
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
ibmr.device is assigned after MR is successfully registered, but both
write_mtpt() and frmr_write_mtpt() accesses it during the mr registration
process, which may cause the following error when trying to register MR in
userspace and pbl_hop_num is set to 0.
pc : hns_roce_mtr_find+0xa0/0x200 [hns_roce]
lr : set_mtpt_pbl+0x54/0x118 [hns_roce_hw_v2]
sp : ffff00023e73ba20
x29: ffff00023e73ba20 x28: ffff00023e73bad8
x27: 0000000000000000 x26: 0000000000000000
x25: 0000000000000002 x24: 0000000000000000
x23: ffff00023e73bad0 x22: 0000000000000000
x21: ffff0000094d9000 x20: 0000000000000000
x19: ffff8020a6bdb2c0 x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000
x15: 0000000000000000 x14: 0000000000000000
x13: 0140000000000000 x12: 0040000000000041
x11: ffff000240000000 x10: 0000000000001000
x9 : 0000000000000000 x8 : ffff802fb7558480
x7 : ffff802fb7558480 x6 : 000000000003483d
x5 : ffff00023e73bad0 x4 : 0000000000000002
x3 : ffff00023e73bad8 x2 : 0000000000000000
x1 : 0000000000000000 x0 : ffff0000094d9708
Call trace:
hns_roce_mtr_find+0xa0/0x200 [hns_roce]
set_mtpt_pbl+0x54/0x118 [hns_roce_hw_v2]
hns_roce_v2_write_mtpt+0x14c/0x168 [hns_roce_hw_v2]
hns_roce_mr_enable+0x6c/0x148 [hns_roce]
hns_roce_reg_user_mr+0xd8/0x130 [hns_roce]
ib_uverbs_reg_mr+0x14c/0x2e0 [ib_uverbs]
ib_uverbs_write+0x27c/0x3e8 [ib_uverbs]
__vfs_write+0x60/0x190
vfs_write+0xac/0x1c0
ksys_write+0x6c/0xd8
__arm64_sys_write+0x24/0x30
el0_svc_common+0x78/0x130
el0_svc_handler+0x38/0x78
el0_svc+0x8/0xc
Solve above issue by adding a pointer of structure hns_roce_dev as a
parameter of write_mtpt() and frmr_write_mtpt(), so that both of these
functions can access it before finishing MR's registration.
Fixes: 9b2cf76c9f ("RDMA/hns: Optimize PBL buffer allocation process")
Link: https://lore.kernel.org/r/1592314629-51715-1-git-send-email-liweihang@huawei.com
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Since commit 84af7a6194 ("checkpatch: kconfig: prefer 'help' over
'---help---'"), the number of '---help---' has been gradually
decreasing, but there are still more than 2400 instances.
This commit finishes the conversion. While I touched the lines,
I also fixed the indentation.
There are a variety of indentation styles found.
a) 4 spaces + '---help---'
b) 7 spaces + '---help---'
c) 8 spaces + '---help---'
d) 1 space + 1 tab + '---help---'
e) 1 tab + '---help---' (correct indentation)
f) 1 tab + 1 space + '---help---'
g) 1 tab + 2 spaces + '---help---'
In order to convert all of them to 1 tab + 'help', I ran the
following commend:
$ find . -name 'Kconfig*' | xargs sed -i 's/^[[:space:]]*---help---/\thelp/'
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
The "dmac" variable is used before it is initialized.
Fixes: 494c3b3122 ("RDMA/hns: Refactor the QP context filling process related to WQE buffer configure")
Link: https://lore.kernel.org/r/20200529083918.GA1298465@mwanda
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The pointer raq is being assigned twice. Fix this by removing one of the
redundant assignments.
Fixes: 14ba87304b ("RDMA/hns: Remove redundant type cast for general pointers")
Link: https://lore.kernel.org/r/20200528150427.420624-1-colin.king@canonical.com
Addressses-Coverity: ("Evaluation order violation")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Instead of i with the sge number of wr will make the comparision more
clear, that is, when the sge number in wr is small than the maximum
supported sge number in the queue, then a stop sge needed to be filled at
the end of sges in wr.
Link: https://lore.kernel.org/r/1590152579-32364-5-git-send-email-liweihang@huawei.com
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The redundant parameters "hr_dev" need to be removed from
free_kernel_wrid() and free_srq_wrid().
Link: https://lore.kernel.org/r/1590152579-32364-3-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
There is no need to do a type cast on genernal pointers, they could be
assigned to any type of variables. In addition, optimize initialization of
some variables and adjust order of them.
Link: https://lore.kernel.org/r/1590152579-32364-2-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Currently, the MTR region is configed before hns_roce_mtr_map() is
invoked, but in some scenarios, the region is configed at MTR creation,
the caller need to store this config and call hns_roce_mtr_map() later. So
optimize the usage by wrapping the MTR region config into MTR.
Link: https://lore.kernel.org/r/1589982799-28728-10-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Split the code related to WQE buffer configure from the QPC filling
process into two functions: config_qp_sq_buf() and config_qp_rq_buf(),
this will make the code more readable.
Link: https://lore.kernel.org/r/1589982799-28728-9-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
It's easier to understand and maintain enable flags of cq using a single
field in type of u32 than defining a field for every flags in the
structure hns_roce_cq, and we can add new flags for features more
conveniently in the future.
Link: https://lore.kernel.org/r/1589982799-28728-3-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The hardware can truncate PI/CI when posting or polling, the driver does
not need to do truncation. Therefore keep the software's PI/CI consistent
with it in the hardware.
Link: https://lore.kernel.org/r/1589982799-28728-2-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
When rq/srq sge length is smaller than sq sge length, it will produce a
local length error and may cause the bus to hang. Therefore, for rq wqe
and srq wqe, one reserved sge pointing to a reserved mr is used to avoid
this error.
Link: https://lore.kernel.org/r/1588931159-56875-10-git-send-email-liweihang@huawei.com
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
srq_context is a local variables and is only used to get some fields from
buffer of mailbox. It's meaningless to copy mailbox's buffer's contents
back to it.
Link: https://lore.kernel.org/r/1588931159-56875-8-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The length information should be stored in the struct ib_mr object,
otherwise the length value of a valid mr object would always be 0.
Link: https://lore.kernel.org/r/1588931159-56875-7-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
For ilog2(x), if x is 0 and not a constant variable, it will return
-1. And there will be an error as below:
hns3 0000:7d:00.0 hns_0: Local work queue 0x8 catast error, sub_event type is: 2
So modify to_hr_hem_entries_shift() to return 0 if conut is 0.
Fixes: 54d6638765 ("RDMA/hns: Optimize WQE buffer size calculating process")
Link: https://lore.kernel.org/r/1588931159-56875-6-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
When allocating eq buffer, the size of base address page should be defined
by eqe_ba_pg_sz instead of srqwqe_ba_pg_sz.
Fixes: 477a0a3870 ("RDMA/hns: Optimize 0 hop addressing for EQE buffer")
Link: https://lore.kernel.org/r/1588931159-56875-4-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The firmware has reduced the number of descriptions of command
HNS_ROCE_OPC_QUERY_PF_TIMER_RES to 1. The driver needs to adapt, otherwise
the hardware will report error 4(CMD_NEXT_ERR).
Fixes: 0e40dc2f70 ("RDMA/hns: Add timer allocation support for hip08")
Link: https://lore.kernel.org/r/1588931159-56875-3-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The qkey queried through the query ud qp verb is a fixed value and it
should be read from qp context.
Fixes: 926a01dc00 ("RDMA/hns: Add QP operations support for hip08 SoC")
Link: https://lore.kernel.org/r/1588931159-56875-2-git-send-email-liweihang@huawei.com
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
It's easier to understand and maintain enable flags of qp using a single
field in type of unsigned long than defining a field for every flags in
the structure hns_roce_qp, and we can add new flags for features more
conveniently in the future.
Link: https://lore.kernel.org/r/1588674607-25337-4-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
These caps are assigned in query_pf_caps() or set_default_caps(), and
should not be assigned out of these two functions.
Link: https://lore.kernel.org/r/1588242691-12913-4-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
lp_pktn_ini means the number of loopback slice packets for long messages,
it should depend on MTU(fixed to 4096B currently) and max size of SQ
inline.
Link: https://lore.kernel.org/r/1588242691-12913-3-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
There is a comments with some chinese semicolons that cause encoding
issues each time hns_roc_hw_v2.h was modified from a IDE. So fix this by
using correct symbols.
Link: https://lore.kernel.org/r/1588242691-12913-2-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Optimize the SRQ's WQE buffer parameters calculating process to make the
codes more readable by using new functions about multi-hop addressing to
calculating capabilities of SRQ.
Link: https://lore.kernel.org/r/1588071823-40200-6-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Just move the SRQ related code to more reasonable place, and unify format
of some prints.
Link: https://lore.kernel.org/r/1588071823-40200-5-git-send-email-liweihang@huawei.com
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Optimize the QP's WQE buffer parameters calculating process to make the
codes more readable mainly by merging calculation of extended sge space of
kernel and userspace. In addition, add some inline functions to simply
codes about multi-hop addressing.
Link: https://lore.kernel.org/r/1588071823-40200-4-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The MTT (Memory Translate Table) interface is no longer used to configure
the buffer address to BT (Base Address Table) that requires driver
mapping. Because the MTT is not compatible with multi-hop addressing of
the hip08, it is replaced by MTR (Memory Translate Region) interface, and
all the MTT functions should be removed.
Link: https://lore.kernel.org/r/1588071823-40200-3-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
PBL table has its own implementation for multi-hop addressing currently,
but for the hardware, all table's addressing use the same logic, there is
no need to implement repeatedly. So optimize the PBL buffer allocation
process by using the mtr's interfaces.
Link: https://lore.kernel.org/r/1588071823-40200-2-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Following patch adds additional argument to the create AH function, so it
make sense to group ah_attr and flags arguments in struct.
Link: https://lore.kernel.org/r/20200430192146.12863-13-maorg@mellanox.com
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Acked-by: Devesh Sharma <devesh.sharma@broadcom.com>
Acked-by: Gal Pressman <galpress@amazon.com>
Acked-by: Weihang Li <liweihang@huawei.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Encapsulate codes to get status of cqe into a function and use map table
instead of switch-case to reduce cyclomatic complexity of
hns_roce_v2_poll_one().
Link: https://lore.kernel.org/r/1586938475-37049-5-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Removes the unnecessary memset opertaion and adjust style of some lines in
hns_roce_v2_set_mac().
Link: https://lore.kernel.org/r/1586938475-37049-3-git-send-email-liweihang@huawei.com
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Remove the unnecessary memset operation and adjust style of some lines in
hns_roce_config_link_table().
Link: https://lore.kernel.org/r/1586938475-37049-2-git-send-email-liweihang@huawei.com
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Add the zero hop addressing support by using mtr interface for CQE buffer,
so the hns driver can support addressing hopnum between 0 to 3 for CQE.
Link: https://lore.kernel.org/r/1586779091-51410-7-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Add the zero hop addressing support by using mtr interface for SRQ buffer,
so the hns driver can support addressing hopnum between 0 to 3 for SRQ.
Link: https://lore.kernel.org/r/1586779091-51410-6-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Add the zero hop addressing support by using new mtr interface for WQE
buffer and simple mtr invoking process, so WQE buffer can support hopnum
between 0 to 3.
Link: https://lore.kernel.org/r/1586779091-51410-5-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Use the new mtr interface to simple the hop 0 addressing and multihop
addressing process.
Link: https://lore.kernel.org/r/1586779091-51410-4-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
When the value of nbufs is 1, the buffer is in direct mode, which may cause
confusion. So optimizes current codes to make it easier to maintain and
understand.
Link: https://lore.kernel.org/r/1586779091-51410-3-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Currently, WQE and EQE table have already used the mtr interface to config
and access memory by multi-hop addressing when hopnum is from 1 to 3. But
if hopnum is 0, each table need write its own but repetitive logic, and
many duplicate code exists in the mtr interfaces invoke process.
So wraps the public logic as 3 functions: hns_roce_mtr_create(),
hns_roce_mtr_destroy() and hns_roce_mtr_map() to support hopnum ranges from
0 to 3. In addition, makes the mtr interfaces easier to use.
Link: https://lore.kernel.org/r/1586779091-51410-2-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
There is a potential execution path in which variable *ret* is returned
without being properly initialized, previously.
Fix this by initializing variable *ret* to 0.
Link: https://lore.kernel.org/r/20200328023539.GA32016@embeddedor
Addresses-Coverity-ID: 1491917 ("Uninitialized scalar variable")
Fixes: 2f49de21f3 ("RDMA/hns: Optimize mhop get flow for multi-hop addressing")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Acked-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Just reduce the default number to 64 for backward compatibility, the
driver can still get this configuration from the firmware.
Link: https://lore.kernel.org/r/1585194018-4381-3-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The original value means sending 16 packets at a time, and it should be
configured to 0 which means sending 1 packet instead. It is modified to
reduce the number of PFC frames to make sure the performance meets
expectations when flow control is enabled on hip08.
Link: https://lore.kernel.org/r/1585194018-4381-2-git-send-email-liweihang@huawei.com
Signed-off-by: Jihua Tao <taojihua4@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The field smac in ib_wc was used for create AH and then it will be treated
as destination mac address in UD sqwqe, but related code about filling smac
into AH has been removed in core. Actually, the dmac in UD sqwqe is parsed
from the dgid in grh which is passed in by ULP now, so this assignment
should be removed.
Link: https://lore.kernel.org/r/1584674622-52773-10-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Before calling modify_qp_reset_to_init(), the entire qpc mask has been
cleared, so it is no longer necessary to clear the specific fields in the
mask.
Link: https://lore.kernel.org/r/1584674622-52773-9-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
ceq and aeq is a ring buffer, consumer index of them will be set to zero
after reaching the maximum value. The warning should be removed or it may
mislead the users.
Link: https://lore.kernel.org/r/1584674622-52773-8-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The capbilities of hardware should be got at first and then used in
hns_roce_alloc_vf_resource(). Also removes an unnecessary if ... else
condition in it.
Link: https://lore.kernel.org/r/1584674622-52773-5-git-send-email-liweihang@huawei.com
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Use ibdev_err/dbg/warn() instead of dev_err/dbg/warn(), and modify some
prints into format of "failed to do something, ret = n".
Link: https://lore.kernel.org/r/1584674622-52773-2-git-send-email-liweihang@huawei.com
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Optimizes hns_roce_table_mhop_get() by encapsulating code about clearing
hem into clear_mhop_hem(), which will make the code flow clearer.
Link: https://lore.kernel.org/r/1584417324-2255-3-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Depth of qp shouldn't be allowed to be set to zero, after ensuring that,
subsequent process can be simplified. And when qp is changed from reset to
reset, the capability of minimum qp depth was used to identify hardware of
hip06, it should be changed into a more readable form.
Link: https://lore.kernel.org/r/1584006624-11846-1-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Splits hns_roce_v2_post_send() into three sub-functions: set_rc_wqe(),
set_ud_wqe() and update_sq_db() to simplify the code.
Link: https://lore.kernel.org/r/1583839084-31579-6-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Currently, before the qp is created, a page size needs to be calculated
for the base address table to store all base addresses in the mtr. As a
result, the parameter configuration of the mtr is complex. So integrate
the process of calculating the base table page size into the hem related
interface to simplify the process of using mtr.
Link: https://lore.kernel.org/r/1583839084-31579-5-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Simplify the wr opcode conversion from ib to hns by using a map table
instead of the switch-case statement.
Link: https://lore.kernel.org/r/1583839084-31579-4-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Encapsulates the wqe buffer process details for datagram seg, fast mr seg
and atomic seg.
Link: https://lore.kernel.org/r/1583839084-31579-3-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
There are serval global functions related to wqe buffer in the hns driver
and are called in different files. These symbols cannot directly represent
the namespace they belong to. So add prefix 'hns_roce_' to 3 wqe buffer
related global functions: get_recv_wqe(), get_send_wqe(), and
get_send_extend_sge().
Link: https://lore.kernel.org/r/1583839084-31579-2-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
These judgments were used to keep the compatibility with older versions of
userspace that don't have the field named "cap_flags" in structure
hns_roce_ib_create_cq_resp. But it will be wrong to compare outlen with
the size of resp if another new field were added in resp. oulen should be
compared with the end offset of cap_flags in resp.
Fixes: 4f8f0d5e33 ("RDMA/hns: Package the flow of creating cq")
Link: https://lore.kernel.org/r/1583845569-47257-1-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The proper return code is "-EOPNOTSUPP" when the requested QP type is
not supported by the provider.
Link: https://lore.kernel.org/r/20200130082049.463-1-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Encapsulate the kernel qp doorbell allocation related code into 2
functions: alloc_qp_db() and free_qp_db().
Link: https://lore.kernel.org/r/1582526258-13825-8-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Encapsulate the kernel qp wrid allocation related code into 2 functions:
alloc_kernel_wrid() and free_kernel_wrid().
Link: https://lore.kernel.org/r/1582526258-13825-7-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Encapsulate qp buffer allocation related code into 3 functions:
alloc_qp_buf(), map_wqe_buf() and free_qp_buf().
Link: https://lore.kernel.org/r/1582526258-13825-5-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Encapsulate the code associated with the qp number assignment into
alloc_qpn() and free_qpn().
Link: https://lore.kernel.org/r/1582526258-13825-4-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Rename the qp context related functions and adjusts the code location to
distinguish between the qp context and the entire qp.
Link: https://lore.kernel.org/r/1582526258-13825-3-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Wrap the duplicate code in hip08 and hip06 qp destruction process as
hns_roce_qp_destroy() to simply the qp destroy flow.
Link: https://lore.kernel.org/r/1582526258-13825-2-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
There are two paths to update qp producer index into hardware now, one
path is doorbell in post verbs (send and recv), the another is mailbox in
modify qp verb which is called by flush process. This will lead the
hardware to be broken to correctly generate flush cqe. With stopping
doorbell update and holding qp spinlock in modify qp during flush process,
the problem can be solved.
Fixes: 0425e3e6e0 ("RDMA/hns: Support flush cqe for hip08 in kernel space")
Link: https://lore.kernel.org/r/1582367158-27030-3-git-send-email-liuyixian@huawei.com
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Set revisions that equal to or higher than HIP08_B as default to maintain
backward compatibility.
Link: https://lore.kernel.org/r/1582363039-10714-1-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
HiP08 RoCE hardware lacks ability(a known hardware problem) to flush
outstanding WQEs if QP state gets into errored mode for some reason. To
overcome this hardware problem and as a workaround, when QP is detected to
be in errored state during various legs like post send, post receive
etc[1], flush needs to be performed from the driver.
The earlier patch[1] sent to solve the hardware limitation explained in
the cover-letter had a bug in the software flushing leg. It acquired mutex
while modifying QP state to errored state and while conveying it to the
hardware using the mailbox. This caused leg to sleep while holding
spin-lock and caused crash.
Suggested Solution: we have proposed to defer the flushing of the QP in
the Errored state using the workqueue to get around with the limitation of
our hardware.
This patch specifically adds the calls to the flush handler from where
parts of the code like post_send/post_recv etc. when the QP state gets
into the errored mode.
[1] https://patchwork.kernel.org/patch/10534271/
Link: https://lore.kernel.org/r/1580983005-13899-3-git-send-email-liuyixian@huawei.com
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Reviewed-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
HiP08 RoCE hardware lacks ability(a known hardware problem) to flush
outstanding WQEs if QP state gets into errored mode for some reason. To
overcome this hardware problem and as a workaround, when QP is detected to
be in errored state during various legs like post send, post receive etc
[1], flush needs to be performed from the driver.
The earlier patch[1] sent to solve the hardware limitation explained in
the cover-letter had a bug in the software flushing leg. It acquired mutex
while modifying QP state to errored state and while conveying it to the
hardware using the mailbox. This caused leg to sleep while holding
spin-lock and caused crash.
Suggested Solution:
we have proposed to defer the flushing of the QP in the Errored state
using the workqueue to get around with the limitation of our hardware.
This patch adds the framework of the workqueue and the flush handler
function.
[1] https://patchwork.kernel.org/patch/10534271/
Link: https://lore.kernel.org/r/1580983005-13899-2-git-send-email-liuyixian@huawei.com
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Reviewed-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The eqe has a private multi-hop addressing implementation, but there is
already a set of interfaces in the hns driver that can achieve this.
So, simplify the eqe buffer allocation process by using the mtr interface
and remove large amount of repeated logic.
Link: https://lore.kernel.org/r/20200126145835.11368-1-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Some magic numbers are hard to understand, so replace them with macros or
add some comments for them.
Link: https://lore.kernel.org/r/20200126145504.9700-1-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Yixing Liu <liuyixing1@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The following series extends MR creation routines to allow creation of
user MRs through kernel ULPs as a proxy. The immediate use case is to
allow RDS to work over FS-DAX, which requires ODP (on-demand-paging)
MRs to be created and such MRs were not possible to create prior this
series.
The first part of this patchset extends RDMA to have special verb
ib_reg_user_mr(). The common use case that uses this function is a
userspace application that allocates memory for HCA access but the
responsibility to register the memory at the HCA is on an kernel ULP.
This ULP acts as an agent for the userspace application.
The second part provides advise MR functionality for ULPs. This is
integral part of ODP flows and used to trigger pagefaults in advance
to prepare memory before running working set.
The third part is actual user of those in-kernel APIs.
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQT1m3YD37UfMCUQBNwp8NhrnBAZsQUCXiVO8AAKCRAp8NhrnBAZ
scTrAP9gb0d3qv0IOtHw5aGI1DAgjTUn/SzUOnsjDEn7DIoh9gEA2+ZmaEyLXKrl
+UcZb31auy5P8ueJYokRLhLAyRcOIAg=
=yaHb
-----END PGP SIGNATURE-----
Merge tag 'rds-odp-for-5.5' into rdma.git for-next
From https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma
Leon Romanovsky says:
====================
Use ODP MRs for kernel ULPs
The following series extends MR creation routines to allow creation of
user MRs through kernel ULPs as a proxy. The immediate use case is to
allow RDS to work over FS-DAX, which requires ODP (on-demand-paging)
MRs to be created and such MRs were not possible to create prior this
series.
The first part of this patchset extends RDMA to have special verb
ib_reg_user_mr(). The common use case that uses this function is a
userspace application that allocates memory for HCA access but the
responsibility to register the memory at the HCA is on an kernel ULP.
This ULP acts as an agent for the userspace application.
The second part provides advise MR functionality for ULPs. This is
integral part of ODP flows and used to trigger pagefaults in advance
to prepare memory before running working set.
The third part is actual user of those in-kernel APIs.
====================
* tag 'rds-odp-for-5.5':
net/rds: Use prefetch for On-Demand-Paging MR
net/rds: Handle ODP mr registration/unregistration
net/rds: Detect need of On-Demand-Paging memory registration
RDMA/mlx5: Fix handling of IOVA != user_va in ODP paths
IB/mlx5: Mask out unsupported ODP capabilities for kernel QPs
RDMA/mlx5: Don't fake udata for kernel path
IB/mlx5: Add ODP WQE handlers for kernel QPs
IB/core: Add interface to advise_mr for kernel users
IB/core: Introduce ib_reg_user_mr
IB: Allow calls to ib_umem_get from kernel ULPs
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
So far the assumption was that ib_umem_get() and ib_umem_odp_get()
are called from flows that start in UVERBS and therefore has a user
context. This assumption restricts flows that are initiated by ULPs
and need the service that ib_umem_get() provides.
This patch changes ib_umem_get() and ib_umem_odp_get() to get IB device
directly by relying on the fact that both UVERBS and ULPs sets that
field correctly.
Reviewed-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
To support extended atomic operations including cmp & swap and fetch & add
of 8 bytes, 16 bytes, 32 bytes, 64 bytes in userspace, some field in qpc
should be configured.
Link: https://lore.kernel.org/r/1579052546-11746-1-git-send-email-liweihang@huawei.com
Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Get pf capabilities from firmware according to different hardwares, if it
fails, all capabilities will be set with a default value.
Link: https://lore.kernel.org/r/1578738761-3176-4-git-send-email-liweihang@huawei.com
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
pf capabilities are set by default for hip08 previously which should
depends on different types of hardware. So add new interfaces to get them
from firmware.
Link: https://lore.kernel.org/r/1578738761-3176-3-git-send-email-liweihang@huawei.com
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
In struct hns_roce_caps, max_srq_sg and max_srqwqes is unused, and
max_srqs has the same effect with num_srqs. So remove them from this
structrue.
Link: https://lore.kernel.org/r/1578738761-3176-2-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
When hardware is in resetting stage, we may can't poll back all the
expected work completions as the hardware won't generate cqe anymore.
This patch allows the driver to compose the expected wc instead of the
hardware during resetting stage. Once the hardware finished resetting, we
can poll cq from hardware again.
Link: https://lore.kernel.org/r/1578572412-25756-1-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Driver should first check whether the sge is valid, then fill the valid
sge and the caculated total into hardware, otherwise invalid sges will
cause an error.
Fixes: 52e3b42a2f ("RDMA/hns: Filter for zero length of sge in hip08 kernel mode")
Fixes: 7bdee4158b ("RDMA/hns: Fill sq wqe context of ud type in hip08")
Link: https://lore.kernel.org/r/1578571852-13704-1-git-send-email-liweihang@huawei.com
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Fix some coding style issuses without changing logic of codes, most of the
modification is unreasonable line breaks and alignments.
Link: https://lore.kernel.org/r/1578313276-29080-8-git-send-email-liweihang@huawei.com
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
There are already necessary prints in outer function, prints in
hns_roce_function_clear() may confuse users. So these prints is removed.
Link: https://lore.kernel.org/r/1578313276-29080-6-git-send-email-liweihang@huawei.com
Signed-off-by: Yixing Liu <liuyixing1@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Current state and new state of qp won't be configured when modifying qp,
so these two redundant parameters should be removed.
Link: https://lore.kernel.org/r/1578313276-29080-5-git-send-email-liweihang@huawei.com
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>