9196 Commits

Author SHA1 Message Date
Yishai Hadas
719598c98d IB/mlx5: Update the supported DEVX commands
Update the supported DEVX commands. This includes additions to the
query/modify command lists and to the encoding handling.

In addition, a valid range for general commands was added to be used for
future commands.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-12-04 13:46:42 -05:00
Yishai Hadas
fb98153bbf IB/mlx5: Enforce DEVX privilege by firmware
Enforce DEVX privilege by firmware. This enables future device
functionality without the need to make driver changes unless a new
privilege type is introduced.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-12-04 13:46:42 -05:00
Yishai Hadas
34613eb1d2 IB/mlx5: Enable modify and query verbs objects via DEVX
Enables modify and query verbs objects via the DEVX interface.
To support this the above DEVX handlers were changed to get any
object type via the UVERBS_IDR_ANY_OBJECT mechanism.

The type checking and handling is done per object as part of the
driver code.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-12-04 13:46:42 -05:00
Yishai Hadas
04ca16cc19 IB/core: Enable getting an object type from a given uobject
Enable getting an object type from a given uobject. The type is saved
upon tree merging and is returned by a helper function.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-12-04 13:46:41 -05:00
Yishai Hadas
4d7e8cc574 IB/core: Introduce UVERBS_IDR_ANY_OBJECT
Introduce the UVERBS_IDR_ANY_OBJECT type to match any IDR object.

Once used, the infrastructure skips checking for the IDR type; the check
becomes the driver handler's responsibility.

This enables drivers to get, in a given method, an object of various
types.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-12-04 13:46:41 -05:00
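
The UVERBS_IDR_ANY_OBJECT entry above moves type checking from the core into the driver handler. The following is a minimal user-space sketch of that split, not the kernel code: `obj_type`, `struct uobject`, `core_check_type()` and `driver_query()` are hypothetical names chosen for illustration.

```c
#include <errno.h>

/* Hypothetical object types; OBJ_ANY plays the role of the wildcard. */
enum obj_type { OBJ_CQ, OBJ_QP, OBJ_SRQ, OBJ_MR, OBJ_ANY };

struct uobject {
	enum obj_type type;
	int id;
};

/* Core lookup: enforce the requested type unless the wildcard is used. */
static int core_check_type(const struct uobject *obj, enum obj_type wanted)
{
	if (wanted == OBJ_ANY)
		return 0;
	return obj->type == wanted ? 0 : -EINVAL;
}

/* Driver handler: with the wildcard, validating the type is its job. */
static int driver_query(const struct uobject *obj)
{
	switch (obj->type) {
	case OBJ_CQ:
	case OBJ_QP:
	case OBJ_SRQ:
		return 0;	/* types this hypothetical handler supports */
	default:
		return -EINVAL;
	}
}
```

With the wildcard, an object of any type passes the core check, so a handler that forgets its own check would accept unsupported objects.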
Doug Ledford
f33cb7e760 Merge 'mlx5-next' into mlx5-devx
The enhanced devx support series needs commit:
9d43faac02 ("net/mlx5: Update mlx5_ifc with DEVX UCTX capabilities bits")

Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-12-04 13:36:57 -05:00
Leon Romanovsky
36ff48805a RDMA/mlx5: Unfold modify RMP function
There is no need to perform modify_rmp in two separate functions,
where one of them uses the stack as a placeholder for data while the
other allocates it dynamically. Combine those two functions into one
call.

Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-12-04 09:26:00 +02:00
Leon Romanovsky
a1eb180238 RDMA/mlx5: Unfold create RMP function
There is no need to perform create_rmp in two separate functions, where
one of them uses the stack as a placeholder for data while the other
allocates it dynamically. Combine those two functions into one.

Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-12-04 09:25:55 +02:00
Leon Romanovsky
f3da6577da RDMA/mlx5: Initialize SRQ tables on mlx5_ib
Transfer initialization and cleanup from mlx5_priv struct of
mlx5_core_dev to be part of mlx5_ib_dev. This completes removal
of SRQ from mlx5_core.

Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-12-04 09:25:50 +02:00
Leon Romanovsky
b4990804e1 RDMA/mlx5: Update SRQ functions signatures to mlx5_ib format
Reflect the move of SRQ code from mlx5_core to mlx5_ib by updating
function signatures so that they do not require mlx5_core_dev as an
input, because all operations in mlx5_ib are supposed to use mlx5_ib_dev.

Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-12-04 09:25:45 +02:00
Leon Romanovsky
81773ce5f0 RDMA/mlx5: Use stages for callback to setup and release DEVX
Reuse existing infrastructure to initialize and release DEVX uid.
The DevX interface is intended for user space access, so it is supposed
to be initialized before ib_register_device(). Also, it isn't supported
in switchdev mode, so there is no need to initialize it in that mode.

Fixes: 76dc5a8406 ("IB/mlx5: Manage device uid for DEVX white list commands")
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-12-04 09:23:53 +02:00
Leon Romanovsky
c48d386b2b RDMA/mlx5: Remove SRQ signature global flag
SRQ signature is not supported, hence there is no need for a special
static global variable to announce it.

Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-12-04 09:14:37 +02:00
Leon Romanovsky
f02d0d6e53 net/mlx5: Move SRQ functions to RDMA part
There is no need to keep SRQ, which is an RDMA object, in mlx5_core.
In this patch, we partially move the execution code, while the next
patches will move the table initialization/release logic too.

Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-12-04 09:14:30 +02:00
Leon Romanovsky
6cd0014ab9 net/mlx5: Align SRQ licenses and copyright information
Ensure that both the RDMA and netdev parts of the SRQ implementation
have the same copyright and license information, annotated by SPDX
tags.

Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-12-04 09:14:09 +02:00
Leon Romanovsky
ffd321e4b7 RDMA/nldev: Export to user space number of contexts
[leonro@server ~]$ rdma res show
1: mlx5_0: pd 3 cq 5 qp 4 cm_id 0 mr 0 ctx 0

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-12-03 14:58:25 -05:00
Leon Romanovsky
12d23a9198 RDMA/uverbs: Annotate alloc/deallloc paths with context tracking
Add restrack annotations to track allocations of ucontexts.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-12-03 14:58:25 -05:00
Leon Romanovsky
606152107b RDMA/restrack: Track ucontext
Add the ability to track allocated ib_ucontext objects, which are a
limited resource and worth making visible to users.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-12-03 14:58:25 -05:00
Jason Gunthorpe
974d6b4b2b RDMA/uverbs: Use only attrs for the write() handler signature
All of the old arguments can be derived from the uverbs_attr_bundle
structure, so get rid of the redundant arguments. Most of the prior work
has been removing users of the arguments to allow this to be a simple
patch.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-12-03 12:01:58 -05:00
Jason Gunthorpe
ece9ca97cc RDMA/uverbs: Do not check the input length on create_cq/qp paths
If the user did not provide a long enough command buffer then the missing
bytes are forced to zero. There is no reason to check the length if a zero
value is OK.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-12-03 12:01:58 -05:00
Jason Gunthorpe
c3bea3d2dc RDMA/uverbs: Use the iterator for ib_uverbs_unmarshall_recv()
This has a very complicated memory layout, with two flex arrays. Use
the iterator API to make reading it clearer.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-12-03 12:01:58 -05:00
Jason Gunthorpe
335708c751 RDMA/uverbs: Add a simple iterator interface for reading the command
Several methods have a command with a trailing flex array, and they
all open code some extraction scheme. Centralize this into a simple
iterator API.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-12-03 12:01:58 -05:00
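
The iterator entry above describes centralizing the extraction of trailing flex arrays. As a rough illustration of the idea only, here is a user-space sketch of a bounds-checked cursor; `struct cmd_iter` and its functions are hypothetical names, not the kernel's API, and the -ENOSPC return value is an assumption.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cursor over the bytes that follow the fixed command header. */
struct cmd_iter {
	const unsigned char *cur;
	const unsigned char *end;
};

static void cmd_iter_init(struct cmd_iter *it, const void *buf, size_t len)
{
	it->cur = buf;
	it->end = it->cur + len;
}

/* Copy the next len bytes out; fail without advancing if they are missing. */
static int cmd_iter_read(struct cmd_iter *it, void *dst, size_t len)
{
	if ((size_t)(it->end - it->cur) < len)
		return -ENOSPC;
	memcpy(dst, it->cur, len);
	it->cur += len;
	return 0;
}

/* Non-zero once every byte of the command has been consumed. */
static int cmd_iter_done(const struct cmd_iter *it)
{
	return it->cur == it->end;
}
```

A handler with two flex arrays would call `cmd_iter_read()` once per array and then use `cmd_iter_done()` to reject trailing garbage, instead of computing offsets by hand.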
Jason Gunthorpe
7eebced1ba RDMA/uverbs: Simplify ib_uverbs_ex_query_device
We truncate the response structure if there is not enough room in the
user buffer so there is no reason to have all the mess with finely managing
response_length. Just fully fill the attrs and truncate on copy.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-12-03 12:01:58 -05:00
Jason Gunthorpe
40efca7a46 RDMA/uverbs: Fill in the response for IB_USER_VERBS_EX_CMD_MODIFY_QP
A response struct was defined, and userspace is providing it (but not
checking it). Fill it in and write it out.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-12-03 12:01:58 -05:00
Jason Gunthorpe
29a29d1852 RDMA/uverbs: Use uverbs_request() and core for write_ex handlers
The write_ex handlers have this horrible boilerplate in every function to
do the zero extend/zero check and min size checks. This is now handled in
the core code via the meta-data, and the zero checks are handled by
uverbs_request(). Replace all the occurrences.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-12-03 12:01:58 -05:00
Jason Gunthorpe
3c2c20947d RDMA/uverbs: Use uverbs_request() for request copying
This function properly zero-extends, and zero-checks if the user
buffer is not the same size as the kernel command struct.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-12-03 12:01:58 -05:00
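
The uverbs_request() entry above mentions zero-extending and zero-checking when the user buffer and the kernel struct differ in size. The sketch below models that rule in user space; `copy_request()` is a hypothetical stand-in, and the -EOPNOTSUPP error code is an assumption rather than something stated in the commit message.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model of the copy rule:
 *  - shorter user buffer: the missing tail of the struct reads as zero
 *  - longer user buffer: the extra bytes must all be zero
 */
static int copy_request(void *req, size_t req_len,
			const void *user, size_t user_len)
{
	const unsigned char *u = user;
	size_t n = user_len < req_len ? user_len : req_len;
	size_t i;

	/* Zero-check: reject non-zero bytes the kernel struct cannot hold. */
	for (i = req_len; i < user_len; i++)
		if (u[i] != 0)
			return -EOPNOTSUPP;	/* assumed error code */

	/* Zero-extend: copy what was given, clear the rest. */
	memcpy(req, u, n);
	memset((unsigned char *)req + n, 0, req_len - n);
	return 0;
}
```

This rule lets an older user space talk to a newer kernel and the reverse, as long as fields unknown to one side are left at zero.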
Jason Gunthorpe
9a0738575f RDMA/uverbs: Use uverbs_response() for remaining response copying
This function properly truncates and zero-fills the response which is the
standard used by the ioctl uAPI when working with user data.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-12-03 12:01:58 -05:00
Jason Gunthorpe
931373a118 RDMA/uverbs: Get rid of the 'callback' scheme in the compat path
There is no reason for this. For response processing we simply need to
copy, truncate, and zero fill the response into whatever output buffer
was provided. Add a function uverbs_response() that does this
consistently.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-12-03 12:01:58 -05:00
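
The entry above describes response processing as copy, truncate, and zero fill. A minimal user-space sketch of that behaviour follows; `copy_response()` is a hypothetical name used for illustration and is not the kernel's uverbs_response().

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model: write resp into the caller's output buffer.
 * A short buffer gets a truncated response; a long buffer gets the
 * response followed by zeros. Returns the number of response bytes copied.
 */
static size_t copy_response(void *out, size_t out_len,
			    const void *resp, size_t resp_len)
{
	size_t n = out_len < resp_len ? out_len : resp_len;

	memcpy(out, resp, n);
	memset((unsigned char *)out + n, 0, out_len - n);
	return n;
}
```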
Jason Gunthorpe
c2a939fda4 RDMA/uverbs: Use uverbs_attr_bundle to pass ucore for write/write_ex
This creates a consistent way to access the two core buffers across write
and write_ex handlers.

Remove the open coded ucore conversion in the write/ex compatibility
handlers.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-12-03 11:57:41 -05:00
Jason Gunthorpe
bbb28ad903 RDMA/uverbs: Remove out_len checks that are now done by the core
write() methods must work with fixed sized structures as that is the only
way to know where the udata segment starts. The common udata code now
rejects any write() that has a response buffer shorter than the core's
response.

Thus all the checks of out_len for write methods are redundant and can be
removed.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-12-03 11:57:41 -05:00
Saeed Mahameed
09e574fa76 IB/mlx5: Handle raw delay drop general event
Handle the FW general event rq delay drop as it is received from FW via
the mlx5 notifiers API, instead of handling the processed software version
of that event. After this patch we can safely remove all software-processed
FW event types and definitions.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-11-29 16:40:32 -08:00
Saeed Mahameed
134e9349ec IB/mlx5: Handle raw port change event rather than the software version
Use the FW version of the port change event as forwarded via new mlx5
notifiers API.

After this patch, processed software version of the port change event
will become deprecated and will be totally removed in downstream
patches.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-11-29 16:40:31 -08:00
Saeed Mahameed
df097a278c IB/mlx5: Use the new mlx5 core notifier API
Remove the deprecated mlx5_interface->event mlx5_ib callback and use new
mlx5 notifier API to subscribe for mlx5 events.

For native mlx5_ib devices profiles pf_profile/nic_rep_profile register
the notifier callback mlx5_ib_handle_event which treats the notifier
context as mlx5_ib_dev.

For vport representors, don't register any notifier; same as before, they
don't receive any mlx5 events.

For slave port (mlx5_ib_multiport_info) register a different notifier
callback mlx5_ib_event_slave_port, which knows that the event is coming
for mlx5_ib_multiport_info and prepares the event job accordingly.
Before this, the event handler work had to ask mlx5_core whether this is
a slave port via mlx5_core_is_mp_slave(work->dev); now that is not needed
anymore.
mlx5_ib_multiport_info notifier registration is done on
mlx5_ib_bind_slave_port and de-registration is done on
mlx5_ib_unbind_slave_port.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-11-29 16:40:31 -08:00
Guy Levi
34f4c9554d IB/mlx5: Use fragmented QP's buffer for in-kernel users
The current implementation of create QP requires contiguous memory. Such a
requirement is problematic once the memory is fragmented or the system is
low on memory, as it causes failures in dma_zalloc_coherent().

This patch takes advantage of the new mlx5_core API which allocates a
fragmented buffer. This makes the QP creation much more resilient to
memory fragmentation. Data-path code was adapted to the fact that WQEs can
cross buffers.

We also use the opportunity to fix some cosmetic legacy coding convention
errors which were in the feature scope.

Signed-off-by: Guy Levi <guyle@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-11-29 17:12:13 -07:00
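
The fragmented QP buffer entry above notes that data-path code had to adapt because WQEs can cross buffers. The sketch below shows the general addressing idea in user space; `struct frag_buf` and its helpers are hypothetical and assume equally sized fragments, which the commit message does not specify.

```c
#include <stddef.h>

/* Hypothetical layout: a logical buffer split into equally sized fragments. */
struct frag_buf {
	unsigned char **frags;	/* one pointer per fragment */
	size_t frag_size;	/* bytes per fragment */
};

/* Resolve a byte offset in the logical buffer to an address in a fragment. */
static void *frag_buf_addr(const struct frag_buf *fb, size_t offset)
{
	return fb->frags[offset / fb->frag_size] + offset % fb->frag_size;
}

/* Bytes that can be accessed from offset before the fragment ends. */
static size_t frag_buf_contig(const struct frag_buf *fb, size_t offset)
{
	return fb->frag_size - offset % fb->frag_size;
}
```

A copy longer than `frag_buf_contig()` has to be split and continued at the start of the next fragment, which is the case a contiguous allocation never had to handle.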
Guy Levi
20e5a59b2e IB/mlx5: Use fragmented SRQ's buffer for in-kernel users
The current implementation of create SRQ requires contiguous memory. Such
a requirement is problematic once the memory is fragmented or the system
is low on memory, as it causes failures in dma_zalloc_coherent().

This patch takes advantage of the new mlx5_core API which allocates a
fragmented buffer, and makes the SRQ creation much more resilient to
memory fragmentation. Data-path code was adapted to the fact that WQEs can
cross buffers.

Signed-off-by: Guy Levi <guyle@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-11-29 17:12:13 -07:00
Chuck Lever
b024dd0eba rxe: IB_WR_REG_MR does not capture MR's iova field
FRWR memory registration is done with a series of calls and WRs.
1. ULP invokes ib_dma_map_sg()
2. ULP invokes ib_map_mr_sg()
3. ULP posts an IB_WR_REG_MR on the Send queue

Step 2 generates an iova. It is permissible for ULPs to change this
iova (with certain restrictions) between steps 2 and 3.

rxe_map_mr_sg captures the MR's iova but later when rxe processes the
REG_MR WR, it ignores the MR's iova field. If a ULP alters the MR's iova
after step 2 but before step 3, rxe never captures that change.

When the remote sends an RDMA Read targeting that MR, rxe looks up the
R_key, but the altered iova does not match the iova stored in the MR,
causing the RDMA Read request to fail.

Reported-by: Anna Schumaker <schumaker.anna@gmail.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-11-29 17:10:06 -07:00
Mark Bloch
bfc5d83918 RDMA/mlx5: Attach a DEVX counter via raw flow creation
Allow a user to attach a DEVX counter via mlx5 raw flow creation. In order
to attach a counter we introduce a new attribute:

MLX5_IB_ATTR_CREATE_FLOW_ARR_COUNTERS_DEVX

A counter can be attached to multiple flow steering rules.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-11-29 16:51:33 -07:00
Leon Romanovsky
67810e8c3c RDMA/qib: Remove all occurrences of BUG_ON()
The QIB driver was added in 2010 with many BUG_ON() calls; most of them
were cleaned out over years of development and use.

It looks like it is now safe to remove the rest of the BUG_ONs.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-11-29 15:59:40 -07:00
Colin Ian King
d12c416dd1 IB/usnic: fix spelling mistake "miniumum" -> "minimum"
There is a spelling mistake in a usnic_err error message, fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-11-29 15:59:40 -07:00
kbuild test robot
90849f4d05 RDMA/uverbs: fix ptr_ret.cocci warnings
drivers/infiniband/core/uverbs_cmd.c:1095:1-3: WARNING: PTR_ERR_OR_ZERO can be used

 Use PTR_ERR_OR_ZERO rather than if(IS_ERR(...)) + PTR_ERR

Generated by: scripts/coccinelle/api/ptr_ret.cocci

Fixes: 7106a97697 ("RDMA/uverbs: Make write() handlers return 0 on success")
Signed-off-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-11-29 15:59:40 -07:00
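
The coccinelle warning above suggests replacing an `if (IS_ERR(...))` plus `PTR_ERR` pair with `PTR_ERR_OR_ZERO`. The user-space sketch below re-creates simplified stand-ins for the kernel's error-pointer helpers to show that the two forms are equivalent; the helper bodies here are approximations written for this example, and `check_long()` and `check_short()` are hypothetical.

```c
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified stand-ins for the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long err) { return (void *)(intptr_t)err; }
static inline long PTR_ERR(const void *p) { return (long)(intptr_t)p; }
static inline int IS_ERR(const void *p)
{
	/* Error pointers occupy the top MAX_ERRNO addresses. */
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}
static inline int PTR_ERR_OR_ZERO(const void *p)
{
	return IS_ERR(p) ? (int)PTR_ERR(p) : 0;
}

/* Form flagged by the warning. */
static int check_long(const void *obj)
{
	if (IS_ERR(obj))
		return PTR_ERR(obj);
	return 0;
}

/* Equivalent form suggested by the warning. */
static int check_short(const void *obj)
{
	return PTR_ERR_OR_ZERO(obj);
}
```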
Colin Ian King
901018f29e RDMA/drivers: Fix spelling mistake "initalize" -> "initialize"
Fix spelling mistake in usnic_err error message

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-11-29 15:58:22 -07:00
Jason Gunthorpe
07f05f40d9 RDMA/uverbs: Use uverbs_attr_bundle to pass udata for ioctl()
Have the core code initialize the driver_udata if the method has a udata
description. This is done using the same create_udata the handler was
supposed to call.

This makes ioctl consistent with the write and write_ex paths.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-11-26 16:48:07 -07:00
Jason Gunthorpe
3a6532c9af RDMA/uverbs: Use uverbs_attr_bundle to pass udata for write
Now that we have metadata describing the command format the core code can
directly compute the udata pointers and all the really ugly
ib_uverbs_init_udata() calls can be removed from the handlers.

This means all the write() handlers are no longer sensitive to the layout
of the command buffer.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-11-26 16:48:07 -07:00
Jason Gunthorpe
ef87df2c7a RDMA/uverbs: Use uverbs_attr_bundle to pass udata for write_ex
The core code needs to compute the udata so we may as well pass it in the
uverbs_attr_bundle instead of on the stack. This converts the simple case
of write_ex() which already has a core calculation.

Also change the write() path to use the attrs for ib_uverbs_init_udata()
instead of on the stack. This lets the write to write_ex compatibility
path continue to follow the lead of the _ex path.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-11-26 16:48:07 -07:00
Jason Gunthorpe
da0f60df7b RDMA/uverbs: Prohibit write() calls with too small buffers
The size meta-data in the prior patch describes the smallest acceptable
buffer for the write() interface. Globally check this in the core code.

This is necessary in the case of write() methods that have a driver udata
to prevent computing a negative udata buffer length.

The return code of -ENOSPC is chosen here as some of the handlers already
use this code; however, many other handlers use EINVAL.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-11-26 16:48:07 -07:00
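
The entry above explains that a global minimum-size check prevents computing a negative udata buffer length. A minimal user-space sketch of that check follows; `struct cmd_sizes` and `check_write_len()` are hypothetical names, and only the -ENOSPC return code is taken from the commit message.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-command metadata: size of the fixed (core) request. */
struct cmd_sizes {
	size_t req_size;
};

/*
 * Reject buffers smaller than the core request struct, then compute how
 * many bytes are left for the driver udata. Without the check, the
 * subtraction would underflow for a short buffer.
 */
static int check_write_len(const struct cmd_sizes *s, size_t in_len,
			   size_t *udata_len)
{
	if (in_len < s->req_size)
		return -ENOSPC;
	*udata_len = in_len - s->req_size;
	return 0;
}
```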
Jason Gunthorpe
669dac1e00 RDMA/uverbs: Add structure size info to write commands
We need the structure sizes to compute the location of the udata in the
core code. Annotate the sizes into the new macro language.

This is generated largely by script and checked by comparing against the
similar list in rdma-core.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-11-26 16:48:07 -07:00
Jason Gunthorpe
15a1b4becb RDMA/uverbs: Do not pass ib_uverbs_file to ioctl methods
The uverbs_attr_bundle already contains this pointer, and most methods
don't actually need it. Get rid of the redundant function argument.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-11-26 16:48:07 -07:00
Jason Gunthorpe
7106a97697 RDMA/uverbs: Make write() handlers return 0 on success
Currently they return the command length, while all other handlers return
0. This makes the write path closer to the write_ex and ioctl path.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-11-26 16:48:07 -07:00
Jason Gunthorpe
8313c10fa8 RDMA/uverbs: Replace ib_uverbs_file with uverbs_attr_bundle for write
Now that we can add meta-data to the description of write() methods we
need to pass the uverbs_attr_bundle into all write based handlers so
future patches can use it as a container for any new data transferred out
of the core.

This is the first step to bringing the write() and ioctl() methods to a
common interface signature.

This is a simple search/replace, and we push the attr down into the uobj
and other APIs to keep changes minimal.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-11-26 16:48:07 -07:00
Colin Ian King
d2c9d9abe1 IB/qib: fix spelling mistake "colescing" -> "coalescing"
There is a spelling mistake in the module description text, fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-11-26 16:45:06 -07:00
Parav Pandit
01b671170d RDMA/core: Sync unregistration with netlink commands
When the rdma device is getting removed, get resource info can race with
device removal, as below:

      CPU-0                                  CPU-1
    --------                               --------
    rdma_nl_rcv_msg()
       nldev_res_get_cq_dumpit()
          mutex_lock(device_lock);
          get device reference
          mutex_unlock(device_lock);        [..]
                                            ib_unregister_device()
                                            /* Valid reference to
                                             * device->dev exists.
                                             */
                                             ib_dealloc_device()

          [..]
          provider->fill_res_entry();

Even though the device object is not freed, fill_res_entry() can get called
on a device which doesn't have a driver anymore. The kernel core device
reference count is not sufficient, as this only keeps the structure valid
and doesn't guarantee the driver is still loaded.

A similar race can occur between device renaming and device removal, where
device_rename() tries to rename an unregistered device. While this is fine
for devices of a class which is not net namespace aware, it is incorrect
for a net namespace aware class coming in a subsequent series. If a class
is net namespace aware, then the call trace [1] below is observed in the
above situation.

Therefore, to avoid the race, keep a reference count and let device
unregistration wait until all netlink users drop the reference.

[1] Call trace:
kernfs: ns required in 'infiniband' for 'mlx5_0'
WARNING: CPU: 18 PID: 44270 at fs/kernfs/dir.c:842 kernfs_find_ns+0x104/0x120
libahci i2c_core mlxfw libata dca [last unloaded: devlink]
RIP: 0010:kernfs_find_ns+0x104/0x120
Call Trace:
kernfs_find_and_get_ns+0x2e/0x50
sysfs_rename_link_ns+0x40/0xb0
device_rename+0xb2/0xf0
ib_device_rename+0xb3/0x100 [ib_core]
nldev_set_doit+0x165/0x190 [ib_core]
rdma_nl_rcv_msg+0x249/0x250 [ib_core]
? netlink_deliver_tap+0x8f/0x3e0
rdma_nl_rcv+0xd6/0x120 [ib_core]
netlink_unicast+0x17c/0x230
netlink_sendmsg+0x2f0/0x3e0
sock_sendmsg+0x30/0x40
__sys_sendto+0xdc/0x160

Fixes: da5c850782 ("RDMA/nldev: add driver-specific resource tracking")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-11-22 12:39:26 -07:00
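
The entry above fixes the race by keeping a reference count and making unregistration wait until all netlink users drop their references. The user-space sketch below models that pattern with a pthread mutex and condition variable, which is a substitution made for this example rather than the kernel's mechanism; `struct tracked_dev` and its functions are hypothetical.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical device with a user count guarded by a mutex. */
struct tracked_dev {
	pthread_mutex_t lock;
	pthread_cond_t idle;	/* signalled when users drops to zero */
	unsigned int users;
	int registered;
};

static void tracked_dev_init(struct tracked_dev *d)
{
	pthread_mutex_init(&d->lock, NULL);
	pthread_cond_init(&d->idle, NULL);
	d->users = 0;
	d->registered = 1;
}

/* Take a reference; returns 0 if the device is already being removed. */
static int tracked_dev_try_get(struct tracked_dev *d)
{
	int ok;

	pthread_mutex_lock(&d->lock);
	ok = d->registered;
	if (ok)
		d->users++;
	pthread_mutex_unlock(&d->lock);
	return ok;
}

/* Drop a reference and wake a waiting unregister when it was the last. */
static void tracked_dev_put(struct tracked_dev *d)
{
	pthread_mutex_lock(&d->lock);
	if (--d->users == 0)
		pthread_cond_broadcast(&d->idle);
	pthread_mutex_unlock(&d->lock);
}

/* Stop new users, then wait until every existing user has left. */
static void tracked_dev_unregister(struct tracked_dev *d)
{
	pthread_mutex_lock(&d->lock);
	d->registered = 0;
	while (d->users != 0)
		pthread_cond_wait(&d->idle, &d->lock);
	pthread_mutex_unlock(&d->lock);
}
```

In this model a netlink-style handler brackets its work with `tracked_dev_try_get()` and `tracked_dev_put()`, so driver callbacks never run after `tracked_dev_unregister()` returns.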