linux-next

mirror of https://github.com/edk2-porting/linux-next.git synced 2024-12-28 07:04:00 +08:00

Author	SHA1	Message	Date
Eyal Itkin	647bf3d8a8	IB/rxe: Fix mem_check_range integer overflow Update the range check to avoid integer-overflow in edge case. Resolves CVE 2016-8636. Signed-off-by: Eyal Itkin <eyal.itkin@gmail.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-08 12:28:30 -05:00
Eyal Itkin	628f07d33c	IB/rxe: Fix resid update Update the response's resid field when larger than MTU, instead of only updating the local resid variable. Fixes: `8700e3e7c4` ("Soft RoCE driver") Signed-off-by: Eyal Itkin <eyal.itkin@gmail.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-08 12:28:30 -05:00
David S. Miller	501ec18757	Merge tag 'mlx5-updates-2017-01-31' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2017-01-31 This series includes some updates to mlx5 core and ethernet driver. We got one patch from Or to fix some static checker warnings. The second patch, from Dan, adds support for 128B cache lines in the HCA; it configures the hardware to use 128B alignment only on systems with 128B cache lines, and otherwise keeps the current default of 64B. From me three patches to support no inline copy on TX on ConnectX-5 and later HCAs. Starting with two small infrastructure changes and refactoring patches followed by two patches to add the actual support for both xmit ndo and XDP xmit routines. Last patch is a simple fix to return a mistakenly removed pointer from the SQ structure, which was removed in a previous submission of mlx5 4K UAR. Saeed. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-07 13:44:08 -05:00
Christoph Hellwig	b6a05c823f	scsi: remove eh_timed_out methods in the transport template Instead define the timeout behavior purely based on the host_template eh_timed_out method and wire up the existing transport implementations in the host templates. This also clears up the confusion that the transport template method overrides the host template one, so some drivers have to re-override the transport template one. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-02-06 19:10:03 -05:00
Parav Pandit	d0d7b10b05	net-next: treewide use is_vlan_dev() helper function. This patch uses the is_vlan_dev() helper instead of the open-coded flag comparison, which is exactly what is_vlan_dev() does. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Jon Maxwell <jmaxwell37@gmail.com> Acked-by: Johannes Thumshirn <jth@kernel.org> Acked-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-06 16:33:29 -05:00
Saeed Mahameed	2b31f7ae5f	net/mlx5: TX WQE update Add new TX WQE fields for ConnectX-5 vlan insertion support, type and vlan_tci. When type = MLX5_ETH_WQE_INSERT_VLAN, the HW will insert the vlan and prio fields (vlan_tci) into the packet. Those bits and the inline header fields are mutually exclusive, and valid only when: MLX5_CAP_ETH(mdev, wqe_inline_mode) == MLX5_CAP_INLINE_MODE_NOT_REQUIRED and MLX5_CAP_ETH(mdev, wqe_vlan_insert), which will be set in ConnectX-5 and later HW generations. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com>	2017-02-06 18:20:16 +02:00
David S. Miller	4e8f2fc1a5	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Two trivial overlapping changes conflicts in MPLS and mlx5. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-28 10:33:06 -05:00
Yuval Shaia	24dc831b77	IB/core: Add inline function to validate port Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-27 14:33:59 -05:00
Bart Van Assche	2bce1a6d22	IB/srpt: Accept GUIDs as port names Port and ACL information must be configured before an initiator logs in. Make it possible to configure this information before a subnet prefix has been assigned to a port by not only accepting GIDs as target port and initiator port names but by also accepting port GUIDs. Add a 'priv' member to struct se_wwn to allow target drivers to associate their own data with struct se_wwn. Reported-by: Doug Ledford <dledford@redhat.com> References: http://www.spinics.net/lists/linux-rdma/msg39505.html Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-27 14:31:50 -05:00
Christophe Jaillet	a3dd3a48a5	IB/cma: Fix reversed test This test looks reversed. We should log an error message only if 'ib_attach_mcast()' fails. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-27 14:29:20 -05:00
Jack Morgenstein	b4cfe3971f	RDMA/cma: Fix unknown symbol when CONFIG_IPV6 is not enabled If IPV6 has not been enabled in the underlying kernel, we must avoid calling IPV6 procedures in rdma_cm.ko. This requires using "IS_ENABLED(CONFIG_IPV6)" in "if" statements surrounding any code which calls external IPV6 procedures. In the instance fixed here, procedure cma_bind_addr() called ipv6_addr_type() -- which resulted in calling external procedure __ipv6_addr_type(). Fixes: `6c26a77124` ("RDMA/cma: fix IPv6 address resolution") Cc: <stable@vger.kernel.org> # v4.2+ Cc: Spencer Baugh <sbaugh@catern.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-27 14:29:04 -05:00
Zhu Yanjun	5c37077fd0	IB/ipoib: Remove the unnecessary error check The function ipoib_mcast_start_thread/ipoib_ib_dev_up always return zero. As such, in the function ipoib_open, err_stop will never be reached. So remove this err_stop and change the return type of the function ipoib_mcast_start_thread/ipoib_ib_dev_up to void. Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 16:22:24 -05:00
Shiraz Saleem	3f9fade5e7	i40iw: Set maj_err and min_err in i40iw_sc_cqp_create Set maj_err and min_err in i40iw_sc_cqp_create so that it returns correct values for all return cases. This also addresses an uninitialized variable warning for maj_err and min_err in i40iw_create_cqp. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Reported-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 16:20:37 -05:00
Leon Romanovsky	564649b4ea	IB/qib: Remove empty function Commit `f06267104d` ("RDMA: Update workqueue usage") removed content of qib_qsfp_deinit(...) and left it empty. This patch deletes all leftovers of that function. Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 16:20:37 -05:00
Jack Wang	21d6454a39	RDMA/core: create struct ib_port_cache As Jason suggested, we have 4 elements for per port arrays, it's better to have a separate structure to represent them. It simplifies code a bit, ~ 30 lines of code less :) Signed-off-by: Jack Wang <jinpu.wang@profitbricks.com> Reviewed-by: Michael Wang <yun.wang@profitbricks.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 16:20:37 -05:00
Zhu Yanjun	dfc0e55506	IB/ipoib: function interface change The ipoib_ib_dev_down/ipoib_ib_dev_stop return zero unconditionally and the callers never check the returned values, change the return type to void and remove the redundant return values. Reviewed-by: Shan Hai <shan.hai@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 16:20:37 -05:00
Dan Carpenter	820cd30ac2	i40iw: fix some indenting in i40iw_sc_vsi_init() The debug printk was indented more than it should have been and we can remove an unnecessary line break. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 16:20:37 -05:00
Moni Shoua	19b752a19d	IB/cma: Allow port reuse for rdma_id When allocating a port number for binding to a rdma_id, assuming the allocation is not for a specific port, the rule is to allow only ports that were not in use before by any other rdma_id. This condition is too strong to achieve the goal of a unique 5 tuple rdma_id. Instead, we can compare current rdma_id with other rdma_id for difference in one of destination port, source address and destination address to allow port reuse. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 16:20:37 -05:00
Moni Shoua	498683c6a7	IB/cma: Add debug messages to error flows Print debug messages to the kernel log to add more information about RDMA_CM events that indicate an error. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 16:20:37 -05:00
Zhu Yanjun	f7534f45dc	IB/ipoib: Remove unnecessary returned value check In the function ipoib_set_dev_features, the returned value is always 0. As such, it is not necessary to check the returned value. This is not a bug. It is a trivial problem. Reviewed-by: Guanglei Li <guanglei.li@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 16:20:37 -05:00
Colin Ian King	506f71d181	IB/isert: fix spelling mistake: "teminating" -> "terminating" Trivial fix to spelling mistake in isert_warn message Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 16:20:37 -05:00
Yonatan Cohen	2d4b21e0a2	IB/rxe: Prevent from completer to operate on non valid QP On UD QP completer tasklet is scheduled for each packet sent. If it is followed by a destroy_qp(), the kernel panic will happen as the completer tries to operate on a destroyed QP. Fixes: `8700e3e7c4` ("Soft RoCE driver") Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 16:17:32 -05:00
Maor Gottlieb	f39f775218	IB/rxe: Fix rxe dev insertion to rxe_dev_list The first argument of list_add_tail is the new item and the second is the head of the list. Fix the code to pass arguments in the right order, otherwise not all the rxe devices will be removed during teardown. Fixes: `8700e3e7c4` ('Soft RoCE driver') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 16:17:25 -05:00
Kenneth Lee	828f6fa65c	IB/umem: Release pid in error and ODP flow 1. Release pid before entering the ODP flow 2. Release pid when memory allocation fails Fixes: `87773dd56d` ("IB: ib_umem_release() should decrement mm->pinned_vm from ib_umem_get") Fixes: `8ada2c1c0c` ("IB/core: Add support for on demand paging regions") Signed-off-by: Kenneth Lee <liguozhu@hisilicon.com> Reviewed-by: Haggai Eran <haggaie@mellanox.com> Reviewed-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 15:44:31 -05:00
Ram Amrani	f449c7a2d8	RDMA/qedr: Dispatch port active event from qedr_add Relying on qede to trigger qedr on startup is problematic. When probing both if qedr loads slowly then qede can assume qedr is missing and not trigger it. This patch adds a triggering from qedr and protects against a race via an atomic bit. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 15:35:08 -05:00
Ram Amrani	9c1e0228ab	RDMA/qedr: Fix and simplify memory leak in PD alloc Free the PD if no internal resources were available. Move userspace code under the relevant 'if'. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 15:35:07 -05:00
Ram Amrani	af2b14b8b8	RDMA/qedr: Fix RDMA CM loopback The loopback logic in RDMA CM packets compares Ethernet addresses and was accidentally inverted. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 15:35:02 -05:00
Ram Amrani	1a59075197	RDMA/qedr: Fix formatting Remove standalone ';'. List function's parameters in a single line. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 15:35:01 -05:00
Ram Amrani	27a4b1a6d6	RDMA/qedr: Mark three functions as static mark qedr_get_state_from_ibqp(), __qedr_alloc_mr() and __qedr_post_send() as static since they are only used in the same file. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 15:34:56 -05:00
Ram Amrani	933e6dcaa0	RDMA/qedr: Don't reset QP when queues aren't flushed Fail QP state transition from error to reset if SQ/RQ are not empty and still in the process of flushing out the queued work entries. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 15:34:55 -05:00
Ram Amrani	c78c314961	RDMA/qedr: Don't spam dmesg if QP is in error state It is normal to flush CQEs if the QP is in error state. Hence there's no use in printing a message per CQE to dmesg. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 15:34:54 -05:00
Ram Amrani	91bff997db	RDMA/qedr: Remove CQ spinlock from CM completion handlers There is only a single event queue that triggers the completion events for the RDMA CM and it is being processed serially. This means that inherently there can be no parallelism of CQ completion handler callbacks, hence the lock is redundant. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 15:34:43 -05:00
Ram Amrani	59e8970b37	RDMA/qedr: Return max inline data in QP query result Return the maximum supported amount of inline data, not the qp's current configured inline data size, when filling out the results of a query qp call. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 15:34:37 -05:00
Ram Amrani	865cea40b6	RDMA/qedr: Return success when not changing QP state If the user is requesting us to change the QP state to the same state that it is already in, return success instead of failure. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 15:34:36 -05:00
Amrani, Ram	097b615965	RDMA/qedr: Fix MTU returned from QP query MTU value returned from QP query should include overhead. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 15:34:30 -05:00
Amrani, Ram	d3f4aadd61	RDMA/core: Add the function ib_mtu_int_to_enum As the functionality to convert the MTU from a number to enum_ib_mtu is ubiquitous, define a dedicated function and remove the duplicated code. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 15:34:22 -05:00
Ganesh Goudar	bab572f1d4	iw_cxgb4: Guard against null cm_id in dump_ep/qp Endpoints that are aborting can have already dereferenced the cm_id and set ep->com.cm_id to NULL. So guard against that in dump_ep() and dump_qp(). Also create a common function for setting up ip address pointers since the same logic is needed in several places. Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 14:44:01 -05:00
Yuval Shaia	f57e8ca50e	IB/mad: Add port_num to error message Print the invalid port number to ease troubleshooting. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 14:20:42 -05:00
Yuval Shaia	1dd70ea360	IB/vmw_pvrdma: Remove unused qp_type Remove the unused qp_type parameter from function's args Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Adit Ranadive <aditr@vmware.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 14:20:42 -05:00
Yuval Shaia	6c6e51a617	IB/core: Fix typo in comment Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 14:19:48 -05:00
Adit Ranadive	ff89b070b7	IB/vmw_pvrdma: Fix incorrect cleanup on pvrdma_pci_probe error path If the interrupt allocation failed we should start freeing the CQ rings rather than unregistering the netdev notifier. Fixes: `29c8d9eba5` ("IB: Add vmw_pvrdma driver") Signed-off-by: Adit Ranadive <aditr@vmware.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 14:15:28 -05:00
Adit Ranadive	7d211c81e9	IB/vmw_pvrdma: Don't leak info from alloc_ucontext Clear out the user response struct correctly. Fixes: `29c8d9eba5` ("IB: Add vmw_pvrdma driver") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Adit Ranadive <aditr@vmware.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 14:15:28 -05:00
Or Gerlitz	7898489880	IB/mlx5: Enable Eth VFs to query their min-inline value for user-space For some mlx5 HW models (CX4, CX4Lx), the VF driver needs to put part of the packet headers on the TX descriptor so the e-switch can do proper matching and steering. This is called "min-inline"; it is advertised to the VF by the FW and also enforced by the HW, such that if VFs don't obey, their packets are dropped. SRIOV VF libmlx5 instances should take into account the min-inline value of their vports. To that end, we provide this value through the vendor response part of init_ucontext command. The min inline value is reported in a way which will let newer libmlx5 instances realize that they are running over an older kernel and act accordingly (e.g. apply some educated guess). Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-01-24 21:14:06 +02:00
Bart Van Assche	0bbb3b7496	IB/rxe, IB/rdmavt: Use dma_virt_ops instead of duplicating it Make the rxe and rdmavt drivers use dma_virt_ops. Update the comments that refer to the source files removed by this patch. Remove struct ib_dma_mapping_ops. Remove ib_device.dma_ops. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Andrew Boyer <andrew.boyer@dell.com> Cc: Dennis Dalessandro <dennis.dalessandro@intel.com> Cc: Jonathan Toppins <jtoppins@redhat.com> Cc: Alex Estrin <alex.estrin@intel.com> Cc: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:31:32 -05:00
Bart Van Assche	99db949403	IB/core: Remove ib_device.dma_device Add code in ib_register_device() for copying the DMA masks. Use &ib_device.dev in DMA mapping operations instead of dma_device. Remove ib_device.dma_device because due to this and previous patches it is no longer used. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:26:17 -05:00
Bart Van Assche	e3dfa60c0a	IB/srpt: Modify a debug statement Since a later patch will remove ib_device.dma_device and since knowing the value of that pointer is not too important, remove dma_device from the debug output. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:26:17 -05:00
Bart Van Assche	dee2b82a5f	IB/srp: Switch from dma_device to dev.parent Prepare for removal of ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:26:17 -05:00
Bart Van Assche	61118cecf2	IB/iser: Switch from dma_device to dev.parent Prepare for removal of ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:26:17 -05:00
Bart Van Assche	db97ed0a2e	IB/IPoIB: Switch from dma_device to dev.parent Prepare for removal of ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:26:17 -05:00
Bart Van Assche	85e9f1dbbd	IB/rxe: Switch from dma_device to dev.parent Prepare for removal of ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:26:17 -05:00
Bart Van Assche	a62ef9a7d2	IB/vmw_pvrdma: Switch from dma_device to dev.parent Prepare for removal of ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Adit Ranadive <aditr@vmware.com> Cc: VMware PV-Drivers <pv-drivers@vmware.com> Acked-by: Adit Ranadive <aditr@vmware.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:23:35 -05:00
Bart Van Assche	6b06d52dbe	IB/usnic: Switch from dma_device to dev.parent Prepare for removal of ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Christian Benvenuti <benve@cisco.com> Cc: Dave Goodell <dgoodell@cisco.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:23:35 -05:00
Bart Van Assche	989ab358f7	IB/qib: Switch from dma_device to dev.parent Prepare for removal of ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Latif <faisal.latif@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:23:35 -05:00
Bart Van Assche	69117101f9	IB/qedr: Switch from dma_device to dev.parent Prepare for removal of ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Ram Amrani <Ram.Amrani@cavium.com> Cc: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:23:35 -05:00
Bart Van Assche	e6a73f2672	IB/ocrdma: Switch from dma_device to dev.parent Prepare for removal of ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Selvin Xavier <selvin.xavier@avagotech.com> Cc: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:23:35 -05:00
Bart Van Assche	a487a0bff3	IB/nes: Remove a superfluous assignment statement Prepare for removal of ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:23:35 -05:00
Bart Van Assche	26e372705f	IB/mthca: Switch from dma_device to dev.parent Prepare for removal of ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:23:35 -05:00
Bart Van Assche	9b0c289ec4	IB/mlx5: Switch from dma_device to dev.parent Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Matan Barak <matanb@mellanox.com> Cc: Leon Romanovsky <leonro@mellanox.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:23:35 -05:00
Bart Van Assche	d66c88a8fc	IB/mlx4: Switch from dma_device to dev.parent Prepare for removal of ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:23:35 -05:00
Bart Van Assche	f2296adccf	IB/i40iw: Remove a superfluous assignment statement Due to a previous patch initializing ib_device.dev.parent is sufficient and initializing dma_device is no longer needed. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Faisal Latif <faisal.latif@intel.com> Cc: Shiraz Saleem <shiraz.saleem@intel.com> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:23:35 -05:00
Bart Van Assche	fecd02eb2c	IB/hns: Switch from dma_device to dev.parent Prepare for removal of ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Lijun Ou <oulijun@huawei.com> Cc: Wei Hu(Xavier) <xavier.huwei@huawei.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:23:35 -05:00
Bart Van Assche	3067771c51	IB/hfi1: Switch from dma_device to dev.parent Prepare for removal of ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Mike Marciniszyn <mike.marciniszyn@intel.com> Cc: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:23:35 -05:00
Bart Van Assche	d08868a15a	IB/cxgb4: Set dev.parent instead of dma_device Prepare for removal of ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Hariprasad S <hariprasad@chelsio.com> Acked-by: Steve Wise <swise@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:23:35 -05:00
Bart Van Assche	91f734b4f3	IB/cxgb3: Set dev.parent instead of dma_device Prepare for removal of ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Steve Wise <swise@chelsio.com> Acked-by: Steve Wise <swise@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:23:35 -05:00
Bart Van Assche	1e35a0880f	IB/core: Use dev.parent instead of dma_device Prepare for removal of ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:23:35 -05:00
Bart Van Assche	97a9ea8480	IB/core: Initialize ib_device.dev.parent earlier Move the ib_device.dev.parent initialization code from ib_device_register_sysfs() to ib_register_device(). Additionally, allow HBA drivers to set ib_device.dev.parent without setting ib_device.dma_device. This is the first step towards removing ib_device.dma_device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:23:35 -05:00
Bart Van Assche	5f0cb80134	IB/qib: Remove DMA mapping code The qib DMA mapping code is no longer built since commit `eb636ac0e4` ("IB/qib: Remove dma.c and use rdmavt version of dma functions"). Hence remove it. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Mike Marciniszyn <mike.marciniszyn@intel.com> Cc: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:23:35 -05:00
Bart Van Assche	e6d356d3cd	IB/hf1: Remove DMA mapping code The hfi1 DMA mapping code has never been built in any upstream kernel. Hence remove it. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Dennis Dalessandro <dennis.dalessandro@intel.com> Cc: Dean Luick <dean.luick@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:23:35 -05:00
Bart Van Assche	5657933dbb	treewide: Move dma_ops from struct dev_archdata into struct device Some but not all architectures provide set_dma_ops(). Move dma_ops from struct dev_archdata into struct device such that it becomes possible on all architectures to configure dma_ops per device. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Juergen Gross <jgross@suse.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-arch@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: Russell King <linux@armlinux.org.uk> Cc: x86@kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 12:23:35 -05:00
Max Gurtovoy	83236f0157	IB/iser: remove unused variable from iser_conn struct max_sectors calculation was fixed in commit: `9c674815d3` ("IB/iser: Fix max_sectors calculation"). Thus, iser_conn variable scsi_max_sectors is not needed anymore. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Tested-by: Raju Rangoju <rajur@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 11:37:45 -05:00
Max Gurtovoy	1e5db6c31a	IB/iser: Fix sg_tablesize calculation For devices that can register a page list bigger than USHRT_MAX, we actually take the wrong value for sg_tablesize. E.g. for CX4, max_fast_reg_page_list_len is 65536 (bigger than USHRT_MAX), so we set sg_tablesize to 0 by mistake. Therefore, each IO bigger than 4k is split into "< 4k" chunks, which causes performance degradation. Remove the wrong sg_tablesize assignment, and use the value that was set in the address resolution handler, with the needed casting. Cc: <stable@vger.kernel.org> # v4.5+ Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 11:37:45 -05:00
Israel Rukshin	0a475ef422	IB/srp: fix invalid indirect_sg_entries parameter value After setting the indirect_sg_entries module_param to a huge value (e.g. 500,000), srp_alloc_req_data() fails to allocate indirect descriptors for the request ring (kmalloc fails). This commit enforces the maximum value of indirect_sg_entries to be SG_MAX_SEGMENTS, as stated in the module param description. Fixes: `65e8617fba` (scsi: rename SCSI_MAX_{SG, SG_CHAIN}_SEGMENTS) Fixes: `c07d424d61` (IB/srp: add support for indirect tables that don't fit in SRP_CMD) Cc: stable@vger.kernel.org # 4.7+ Signed-off-by: Israel Rukshin <israelr@mellanox.com> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Laurence Oberman <loberman@redhat.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 11:30:14 -05:00
Israel Rukshin	ad8e66b4a8	IB/srp: fix mr allocation when the device supports sg gaps If the device supports arbitrary sg list mapping (device cap IB_DEVICE_SG_GAPS_REG set), we allocate the memory regions with IB_MR_TYPE_SG_GAPS. Fixes: `509c5f33f4` ("IB/srp: Prevent mapping failures") Cc: <stable@vger.kernel.org> # 4.7+ Signed-off-by: Israel Rukshin <israelr@mellanox.com> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-24 11:03:17 -05:00
Mohamad Haj Yahia	105433659d	net/mlx5: Add support to s-tag in mlx5 firmware interface Add svlan_tag and rename vlan_tag to cvlan_tag in flow table entry match param. Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>	2017-01-19 23:19:55 +02:00
Peter Zijlstra	2c935bc572	locking/atomic, kref: Add kref_read() Since we need to change the implementation, stop exposing internals. Provide kref_read() to read the current reference count; typically used for debug messages. Kills two anti-patterns: atomic_read(&kref->refcount) kref->refcount.counter Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-01-14 11:37:18 +01:00
Jack Wang	102c5ce082	RDMA/cma: use cached port state when bind loopback Signed-off-by: Jack Wang <jinpu.wang@profitbricks.com> Reviewed-by: Michael Wang <yun.wang@profitbricks.com> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 23:00:04 -05:00
Jack Wang	93b1f29de7	RDMA/cma: resolve to first active ib port When we try to resolve a dest addr, if we don't give src addr, cma core will try to resolve to our source ib device automatically. The current logic only checks if a given port has the same subnet_prefix as our dest, which is not enough if we use default well known subnet_prefix on our active port, as it will be the same as the subnet_prefix on inactive ports and we might match against an inactive port by accident. To resolve this, we should also check if port is active before we resolve it as a suitable src address for a given dest. Signed-off-by: Jack Wang <jinpu.wang@profitbricks.com> Reviewed-by: Michael Wang <yun.wang@profitbricks.com> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 23:00:04 -05:00
Jack Wang	9e2c3f1c7f	RDMA/core: export ib_get_cached_port_state Export function for rdma_cm, patch for rdma_cm to follow. Signed-off-by: Jack Wang <jinpu.wang@profitbricks.com> Reviewed-by: Michael Wang <yun.wang@profitbricks.com> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 23:00:00 -05:00
Jack Wang	aaaca121c7	RDMA/core: add port state cache We need a port state cache in ib_core, later we will use in rdma_cm. Signed-off-by: Jack Wang <jinpu.wang@profitbricks.com> Reviewed-by: Michael Wang <yun.wang@profitbricks.com> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 22:59:55 -05:00
Feras Daoud	27d41d29c7	IB/ipoib: Change list_del to list_del_init in the tx object Since the ipoib_cm_tx_start and ipoib_cm_tx_reap functions belong to different work queues, they can run in parallel. In this case, if ipoib_cm_tx_reap calls list_del and releases the lock, ipoib_cm_tx_start may acquire it and call list_del_init on the already deleted object. Changing list_del to list_del_init in ipoib_cm_tx_reap fixes the problem. Fixes: `839fcaba35` ("IPoIB: Connected mode experimental support") Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 14:01:06 -05:00
Feras Daoud	c586071d1d	IB/ipoib: Replace list_del of the neigh->list with list_del_init In order to resolve a situation where a few processes delete the same list element in sequence and cause a panic, list_del is replaced with list_del_init. In this case, if the first process that calls list_del releases the lock before acquiring it again, other processes that acquire the lock will call list_del_init. Fixes: `b63b70d877` ("IPoIB: Use a private hash table for path lookup") Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 14:01:05 -05:00
Feras Daoud	13ee429a02	IB/ipoib: Use debug prints instead of warnings in RNR WC status If a receive request has not been posted to the work queue, the incoming message is rejected and the peer will receive a receiver-not-ready (RNR) error. In IPoIB, IB_WC_RNR_RETRY_EXC_ERR error is part of the life cycle therefore ipoib_cm_handle_tx_wc function will print to debug instead of warnings. Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 14:01:05 -05:00
Feras Daoud	d32b9a81d7	IB/ipoib: Add detailed error message to dev_queue_xmit call Add a detailed return code to dev_queue_xmit function when calling to requeue packet via __skb_dequeue. Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 14:01:04 -05:00
Feras Daoud	89a3987ab7	IB/ipoib: rtnl_unlock can not come after free_netdev The ipoib_vlan_add function calls rtnl_unlock after free_netdev, rtnl_unlock not only releases the lock, but also calls netdev_run_todo. The latter function browses the net_todo_list array and completes the unregistration of all its net_device instances. If we call free_netdev before rtnl_unlock, then netdev_run_todo call over the freed device causes panic. To fix, move rtnl_unlock call before free_netdev call. Fixes: `9baa0b0364` ("IB/ipoib: Add rtnl_link_ops support") Cc: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 14:01:04 -05:00
Feras Daoud	0a0007f283	IB/ipoib: Fix deadlock between rmmod and set_mode When calling set_mode from sysfs, the call flow takes the sysfs lock first and then tries to take rtnl_lock (when calling ipoib_set_mode). On the other hand, the rmmod call flow takes the rtnl_lock first (when calling unregister_netdev) and then tries to take the sysfs lock. Deadlock a->b, b->a. The problem starts when ipoib_set_mode releases its rtnl_lock and tries to take it again after that. set_mode: [<ffffffff8104f2bd>] ? check_preempt_curr+0x6d/0x90 [<ffffffff814fee8e>] __mutex_lock_slowpath+0x13e/0x180 [<ffffffff81448655>] ? __rtnl_unlock+0x15/0x20 [<ffffffff814fed2b>] mutex_lock+0x2b/0x50 [<ffffffff81448675>] rtnl_lock+0x15/0x20 [<ffffffffa02ad807>] ipoib_set_mode+0x97/0x160 [ib_ipoib] [<ffffffffa02b5f5b>] set_mode+0x3b/0x80 [ib_ipoib] [<ffffffff8134b840>] dev_attr_store+0x20/0x30 [<ffffffff811f0fe5>] sysfs_write_file+0xe5/0x170 [<ffffffff8117b068>] vfs_write+0xb8/0x1a0 [<ffffffff8117ba81>] sys_write+0x51/0x90 [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b rmmod: [<ffffffff81279ffc>] ? put_dec+0x10c/0x110 [<ffffffff8127a2ee>] ? number+0x2ee/0x320 [<ffffffff814fe6a5>] schedule_timeout+0x215/0x2e0 [<ffffffff8127cc04>] ? vsnprintf+0x484/0x5f0 [<ffffffff8127b550>] ? string+0x40/0x100 [<ffffffff814fe323>] wait_for_common+0x123/0x180 [<ffffffff81060250>] ? default_wake_function+0x0/0x20 [<ffffffff8119661e>] ? 
ifind_fast+0x5e/0xb0 [<ffffffff814fe43d>] wait_for_completion+0x1d/0x20 [<ffffffff811f2e68>] sysfs_addrm_finish+0x228/0x270 [<ffffffff811f2fb3>] sysfs_remove_dir+0xa3/0xf0 [<ffffffff81273f66>] kobject_del+0x16/0x40 [<ffffffff8134cd14>] device_del+0x184/0x1e0 [<ffffffff8144e59b>] netdev_unregister_kobject+0xab/0xc0 [<ffffffff8143c05e>] rollback_registered+0xae/0x130 [<ffffffff8143c102>] unregister_netdevice+0x22/0x70 [<ffffffff8143c16e>] unregister_netdev+0x1e/0x30 [<ffffffffa02a91b0>] ipoib_remove_one+0xe0/0x120 [ib_ipoib] [<ffffffffa01ed95f>] ib_unregister_device+0x4f/0x100 [ib_core] [<ffffffffa021f5e1>] mlx4_ib_remove+0x41/0x180 [mlx4_ib] [<ffffffffa01ab771>] mlx4_remove_device+0x71/0x90 [mlx4_core] Fixes: `862096a8bb` ("IB/ipoib: Add more rtnl_link_ops callbacks") Cc: <stable@vger.kernel.org> # v3.6+ Cc: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 14:01:03 -05:00
Feras Daoud	1c3098cdb0	IB/ipoib: Fix deadlock over vlan_mutex This patch fixes a deadlock while executing ipoib_vlan_delete. The function takes the vlan_rwsem semaphore and calls unregister_netdevice. The latter function calls ipoib_mcast_stop_thread, which causes a workqueue flush. When the queue has one of the ipoib_ib_dev_flush_xxx events, a deadlock occurs because these events also try to take the same vlan_rwsem semaphore. To fix, unregister_netdevice should be called after releasing the semaphore. Fixes: `cbbe1efa49` ("IPoIB: Fix deadlock between ipoib_open() and child interface create") Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 14:01:02 -05:00
Feras Daoud	80b5b35aba	IB/ipoib: Set device connection mode only when needed When changing the connection mode, the ipoib_set_mode function did not check whether the previous connection mode equals the new one. This commit adds the required check and returns 0 if the new mode equals the previous one. Fixes: `839fcaba35` ("IPoIB: Connected mode experimental support") Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 14:01:02 -05:00
Feras Daoud	29da686dff	IB/ipoib: When given an invalid UD MTU, give debug msg In datagram mode, the IB UD (Unreliable Datagram) transport is used so the MTU of the interface is equal to the IB L2 MTU minus the IPoIB encapsulation header. Any request to change the MTU value above the maximum range will change the MTU to the max allowed, but will not show any warning message. An ipoib_warn is issued in such cases, letting the user know that even though the value is legal, it can't be currently applied. Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 13:59:56 -05:00
ssh10	db287ec5cb	RDMA/ocrdma: Replace BUG() with BUG_ON() Replace BUG() with BUG_ON() using coccinelle Signed-off-by: Shyam Saini <mayhs11saini@gmail.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 12:21:52 -05:00
ssh10	b462b06eb6	RDMA/cxgb4: Use AF_INET for sin_family field Elsewhere the sin_family field holds a value with a name of the form AF_..., so it seems reasonable to do so here as well. Also the values of PF_INET and AF_INET are the same. The semantic patch that makes this change is as follows: // <smpl> @@ struct sockaddr_in sip; @@ ( sip.sin_family == - PF_INET + AF_INET \| sip.sin_family != - PF_INET + AF_INET \| sip.sin_family = - PF_INET + AF_INET ) // </smpl> Signed-off-by: Shyam Saini <mayhs11saini@gmail.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 12:21:52 -05:00
Amrani, Ram	df15856132	RDMA/qedr: restructure functions that create/destroy QPs Simplify function and sub-function flow of QP creation and destruction. This also serves as a preparation for SRQ and iWARP support. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Reviewed-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 12:21:41 -05:00
Geliang Tang	bb75f33cf0	RDMA/qib: use rb_entry() To make the code clearer, use rb_entry() instead of container_of() to deal with rbtree. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Acked-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 11:38:41 -05:00
Cao jin	e8f4eb3bfa	RDMA/hfi1: drop pci_link_reset() In AER recovery, pci_error_handlers.link_reset() is never called, drop it now. Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 11:38:41 -05:00
Cao jin	850d08721a	RDMA/qib: drop qib_pci_link_reset() In AER recovery, pci_error_handlers.link_reset() is never called, drop it now. Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 11:38:41 -05:00
Kees Cook	7f6856b789	RDMA/i40iw: use designated initializers Prepare to mark sensitive kernel structures for randomization by making sure they're using designated initializers. These were identified during allyesconfig builds of x86, arm, and arm64, with most initializer fixes extracted from grsecurity. Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 11:38:41 -05:00
Kees Cook	6554c9f7f7	RDMA/nes: use designated initializers Prepare to mark sensitive kernel structures for randomization by making sure they're using designated initializers. These were identified during allyesconfig builds of x86, arm, and arm64, with most initializer fixes extracted from grsecurity. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-12 11:38:41 -05:00
Bart Van Assche	c5540a0195	IB/rxe: Fix an skb leak Additionally, make it easier to detect skb leaks by issuing a warning if a leak occurs. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Cc: Andrew Boyer <andrew.boyer@dell.com> Cc: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 16:52:47 -05:00
Bart Van Assche	839f5ac0d8	IB/rxe: Remove a pointless indirection layer Neither rxe->ifc_ops nor any of the function pointers in struct rxe_ifc_ops ever change. Hence remove the rxe->ifc_ops indirection mechanism. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Andrew Boyer <andrew.boyer@dell.com> Cc: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 16:52:47 -05:00
Bart Van Assche	ab17654476	IB/rxe: Fix reference leaks in memory key invalidation code Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Andrew Boyer <andrew.boyer@dell.com> Cc: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 16:52:47 -05:00
Bart Van Assche	b3a4599610	IB/rxe: Fix a MR reference leak in check_rkey() Avoid that calling check_rkey() for mem->state == RXE_MEM_STATE_FREE triggers an MR reference leak. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Andrew Boyer <andrew.boyer@dell.com> Cc: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 16:52:47 -05:00
Bart Van Assche	18d3451c0d	IB/rxe: Generate a completion for all failed work requests Change do_complete() such that an error completion is not only generated if a QP is in the error state but also if a work request failed. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Andrew Boyer <andrew.boyer@dell.com> Cc: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 16:52:47 -05:00
Bart Van Assche	723ec9ae2a	IB/rxe: Introduce functions for queue draining This change makes the code easier to read and avoids that code is duplicated. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Andrew Boyer <andrew.boyer@dell.com> Cc: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 16:52:47 -05:00
Bart Van Assche	642c7cbcaf	IB/rxe: Add a runtime check in alloc_index() Since index values equal to or above 'range' can trigger memory corruption, complain if index >= range. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Andrew Boyer <andrew.boyer@dell.com> Cc: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 16:52:47 -05:00
Bart Van Assche	43553b47c3	IB/rxe: Issue warnings once It is strongly recommended to report kernel warnings once instead of every time a condition is hit. Hence change WARN_ON() into WARN_ON_ONCE() / BUILD_BUG_ON() as appropriate. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Andrew Boyer <andrew.boyer@dell.com> Cc: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 16:52:47 -05:00
Bart Van Assche	32404fb764	IB/rxe: Let the compiler check the type of the cleanup functions Change the argument type of these functions from void * into struct rxe_pool_entry *. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Andrew Boyer <andrew.boyer@dell.com> Cc: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 16:52:47 -05:00
Bart Van Assche	046ef24d25	IB/rxe: Enable type checking on SKB_TO_PKT() and PKT_TO_SKB() arguments Let the compiler check the type of the arguments passed to SKB_TO_PKT() and PKT_TO_SKB(). Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Andrew Boyer <andrew.boyer@dell.com> Cc: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 16:52:47 -05:00
Bart Van Assche	967335ab90	IB/rxe: Remove superfluous casts Casting a pointer to 'void *' explicitly is not necessary in C code. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Andrew Boyer <andrew.boyer@dell.com> Cc: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 16:52:47 -05:00
Bart Van Assche	175f1244c1	IB/rxe: Remove an unused variable and an unused argument The variable 'av' is not used so remove it. Since that change removes the last user of the 'wqe' argument, remove that argument too. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Andrew Boyer <andrew.boyer@dell.com> Cc: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 16:52:47 -05:00
Bart Van Assche	c8b82182cb	IB/rxe: Remove an unused function Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Andrew Boyer <andrew.boyer@dell.com> Cc: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 16:52:47 -05:00
Bart Van Assche	2bec3baded	IB/rxe: Constify the pool name Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Andrew Boyer <andrew.boyer@dell.com> Cc: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 16:52:47 -05:00
Bart Van Assche	8d8f083720	IB/rxe: Suppress sparse warnings Avoid that sparse complains about using 0 as a pointer, about missing function declarations and also avoid that sparse complains about endianness. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Andrew Boyer <andrew.boyer@dell.com> Cc: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 16:52:47 -05:00
Selvin Xavier	69ae543969	RDMA: Adding ethertype ETH_P_IBOE Update if_ether.h with the ethertype for InfiniBand over Ethernet packets. Also remove the occurrences of 0x8915 from the InfiniBand vendor drivers. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 14:05:11 -05:00
Steve Wise	3bcf96e018	iw_cxgb4: do not send RX_DATA_ACK CPLs after close/abort Function rx_data(), which handles ingress CPL_RX_DATA messages, was always sending an RX_DATA_ACK with the goal of updating the credits. However, if the RDMA connection is moved out of FPDU mode abruptly, then it is possible for iw_cxgb4 to process queued RX_DATA CPLs after HW has aborted the connection. These CPLs should not trigger RX_DATA_ACKS. If they do, HW can see a READ after DELETE of the DB_LE hash entry for the tid and post a LE_DB HashTblMemCrcError. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 14:01:38 -05:00
Steve Wise	c12a67fec8	iw_cxgb4: free EQ queue memory on last deref Commit `ad61a4c7a9` ("iw_cxgb4: don't block in destroy_qp awaiting the last deref") introduced a bug where the RDMA QP EQ queue memory (and QIDs) are possibly freed before the underlying connection has been fully shut down. The result is a possible DMA read issued by HW after the queue memory has been unmapped and freed. This results in possible WR corruption in the worst case, system bus errors if an IOMMU is in use, and SGE "bad WR" errors reported at the very least. The fix is to defer unmap/free of queue memory and QID resources until the QP struct has been fully dereferenced. To do this, the c4iw_ucontext must also be kept around until the last QP that references it is fully freed. In addition, since the last QP deref can happen in an IRQ disabled context, we need a new workqueue thread to do the final unmap/free of the EQ queue memory. Fixes: `ad61a4c7a9` ("iw_cxgb4: don't block in destroy_qp awaiting the last deref") Cc: stable@vger.kernel.org Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 14:01:38 -05:00
Steve Wise	4fe7c2962e	iw_cxgb4: refactor sq/rq drain logic With the addition of the IB/Core drain API, iw_cxgb4 supported drain by watching the CQs when the QP was out of RTS and signalling "drain complete" when the last CQE is polled. This, however, doesn't fully support the drain semantics. Namely, the drain logic is supposed to signal "drain complete" only when the application has _processed_ the last CQE, not just removed them from the CQ. Thus a small timing hole exists that can cause touch after free type bugs in applications using the drain API (nvmf, iSER, for example). So iw_cxgb4 needs a better solution. The iWARP Verbs spec mandates that "_at some point_ after the QP is moved to ERROR", the iWARP driver MUST synchronously fail post_send and post_recv calls. iw_cxgb4 was currently not allowing any posts once the QP is in ERROR. This was in part due to the fact that the HW queues for the QP in ERROR state are disabled at this point, so there wasn't much else to do but fail the post operation synchronously. This restriction is what drove the first drain implementation in iw_cxgb4 that has the above mentioned flaw. This patch changes iw_cxgb4 to allow post_send and post_recv WRs after the QP is moved to ERROR state for kernel mode users, thus still adhering to the Verbs spec for user mode users, but allowing flush WRs for kernel users. Since the HW queues are disabled, we just synthesize a CQE for this post, queue it to the SW CQ, and then call the CQ event handler. This enables proper drain operations for the various storage applications. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-01-10 14:01:38 -05:00
Parav Pandit	43579b5f2c	IB/core: added support to use rdma cgroup controller Added support APIs for IB core to register/unregister every IB/RDMA device with rdma cgroup for tracking rdma resources. IB core registers with rdma cgroup controller. Added support APIs for uverbs layer to make use of rdma controller. Added uverbs layer to perform resource charge/uncharge functionality. Added support during query_device uverb operation to ensure it returns resource limits by honoring rdma cgroup configured limits. Signed-off-by: Parav Pandit <pandit.parav@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2017-01-10 11:14:27 -05:00
David S. Miller	bda65b4255	mlx5 4K UAR The following series of patches optimizes the usage of the UAR area which is contained within the BAR 0-1. Previous versions of the firmware and the driver assumed each system page contains a single UAR. This patch set will query the firmware for a new capability that, if published, means that the firmware can support UARs of fixed 4K regardless of system page size. In the case of powerpc, where page size equals 64KB, this means we can utilize 16 UARs per system page. Since user space processes by default consume eight UARs per context, this means that with this change a process will need a single system page to fulfill that requirement and in fact make use of more UARs, which is better in terms of performance. In addition to optimizing user-space processes, we introduce an allocator that can be used by kernel consumers to allocate blue flame registers (which are areas within a UAR that are used to write doorbells). This provides further optimization on using the UAR area since the Ethernet driver makes use of a single blue flame register per system page and now it will use two blue flame registers per 4K. The series also makes changes to naming conventions, and now the terms used in the driver code match the terms used in the PRM (Programmer's Reference Manual). Thus, what used to be called UUAR (micro UAR) is now called BFREG (blue flame register). In order to support compatibility between different versions of library/driver/firmware, the library now has a means to notify the kernel driver that it supports the new scheme, and the kernel can notify the library if it supports this extension. So mixed versions of libraries can run concurrently without any issues. Thanks, Eli and Matan Merge tag 'mlx5-4kuar-for-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 17:09:31 -05:00
Eli Cohen	30aa60b3bd	IB/mlx5: Support 4k UAR for libmlx5 Add fields to structs to convey to kernel an indication whether the library supports multi UARs per page and return to the library the size of a UAR based on the queried value. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-01-09 20:25:09 +02:00
Eli Cohen	b037c29a80	IB/mlx5: Allow future extension of libmlx5 input data The current check requires that new fields in struct mlx5_ib_alloc_ucontext_req_v2 that are not known to the driver be zero. This was introduced so that new libraries passing additional information to the kernel through struct mlx5_ib_alloc_ucontext_req_v2 will be notified by old kernels that do not support their request by failing the operation. This scheme is problematic since it requires libmlx5 to issue the requests with descending input size for struct mlx5_ib_alloc_ucontext_req_v2. To avoid this, we require that new features obey the following rules: If the feature requires one or more fields in the response and at least one of the fields can be encoded such that a zero value means the kernel ignored the request, then this field will provide the indication to the library. If no response is required or if zero is a valid response, a new field should be added that indicates to the library whether its request was processed. Fixes: `b368d7cb8c` ('IB/mlx5: Add hca_core_clock_offset to udata in init_ucontext') Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-01-09 20:25:09 +02:00
Eli Cohen	5fe9dec0d0	IB/mlx5: Use blue flame register allocator in mlx5_ib Make use of the blue flame registers allocator at mlx5_ib. Since blue flame was not really supported we remove all the code that is related to blue flame and we let all consumers to use the same blue flame register. Once blue flame is supported we will add the code. As part of this patch we also move the definition of struct mlx5_bf to mlx5_ib.h as it is only used by mlx5_ib. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-01-09 20:25:08 +02:00
Eli Cohen	0b80c14f00	IB/mlx5: Fix retrieval of index to first hi class bfreg Fix the function retrieving the index of the first high latency class blue flame register. High latency class bfregs are located right above medium latency class bfregs. Fixes: `c1be5232d2` ('IB/mlx5: Fix micro UAR allocator') Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-01-08 11:21:26 +02:00
Eli Cohen	2f5ff26478	mlx5: Fix naming convention with respect to UARs This establishes a solid naming convention for UARs. A UAR (User Access Region) can have size identical to a system page or can be fixed 4KB depending on a value queried by firmware. Each UAR always has 4 blue flame registers which are used to post doorbells to send queues. In addition, a UAR has a section used for posting doorbells to CQs or EQs. In this patch we change names to reflect these conventions. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-01-08 11:21:26 +02:00
Eli Cohen	f4044dac63	IB/mlx5: Fix error handling order in create_kernel_qp Make sure order of cleanup is exactly the opposite of initialization. Fixes: `9603b61de1` ('mlx5: Move pci device handling from mlx5_ib to mlx5_core') Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-01-08 11:21:26 +02:00
Eli Cohen	de8d6e02ef	IB/mlx5: Fix kernel to user leak prevention logic The logic was broken as it failed to update the response length for architectures with PAGE_SIZE larger than 4kB. As a result further extension of the ucontext response struct would fail. Fixes: `d69e3bcf79` ('IB/mlx5: Mmap the HCA's core clock register to user-space') Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-01-08 11:21:26 +02:00
David S. Miller	76eb75be79	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2017-01-05 11:03:07 -05:00
Artemy Kovalyov	aa8e08d2f5	IB/mlx5: Improve MR check Add "type" field to mlx5_core MKEY struct. Check whether page fault happens on MKEY corresponding to MR. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-02 15:51:20 -05:00
Artemy Kovalyov	17d2f88f92	IB/mlx5: Add ODP atomics support Handle ODP atomic operations. When the initiator of an RDMA atomic operation uses an ODP MR to provide source data, handle the page fault properly. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-02 15:51:20 -05:00
Artemy Kovalyov	d9aaed8387	{net,IB}/mlx5: Refactor page fault handling * Update page fault event according to last specification. * Separate code path for page fault EQ, completion EQ and async EQ. * Move page fault handling work queue from mlx5_ib static variable into mlx5_core page fault EQ. * Allocate memory to store ODP event dynamically as the events arrive, since in atomic context - use mempool. * Make mlx5_ib page fault handler run in process context. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-02 15:51:20 -05:00
Artemy Kovalyov	7d0cc6edcc	IB/mlx5: Add MR cache for large UMR regions In this change we turn mlx5_ib_update_mtt() into generic mlx5_ib_update_xlt() to perform HCA translation table modifications, supporting both atomic and process contexts and not limited by the number of modified entries. Using this function we increase preallocated MRs up to 16GB. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-02 15:51:20 -05:00
Artemy Kovalyov	c438fde1c2	IB/mlx5: Add support for big MRs Make use of extended UMR translation offset. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-02 15:51:20 -05:00
Artemy Kovalyov	3161625589	IB/mlx5: Refactor UMR post send format * Update struct mlx5_wqe_umr_ctrl_seg. * Currently UMR send_flags target only certain use cases: enabling/disabling cached MRs, modifying XLT for ODP. Making the flags independent makes UMR more flexible, allowing arbitrary manipulations. * Since different UMR formats have different entry sizes, a UMR request should receive the exact size of the translation table update instead of the number of entries. Rename field npages to xlt_size in struct mlx5_umr_wr and update relevant code accordingly. * Add support of length64 bit. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-02 15:51:20 -05:00
Binoy Jayan	d5ea2df9ce	IB/mlx5: Add helper mlx5_ib_post_send_wait Clean up the following common code (to post a list of work requests to the send queue of the specified QP) at various places and add a helper function 'mlx5_ib_post_send_wait' to implement the same. - Initialize 'mlx5_ib_umr_context' on stack - Assign "mlx5_umr_wr:wr:wr_cqe" to umr_context.cqe - Acquire the semaphore - Call ib_post_send with a single ib_send_wr - wait_for_completion() - Check for umr_context.status - Release the semaphore Signed-off-by: Binoy Jayan <binoy.jayan@linaro.org> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-02 15:51:20 -05:00
Leon Romanovsky	9f885201f2	IB/mlx5: Reorder code in query device command The order of features exposed by private mlx5-abi.h file is CQE zipping, packet pacing and multi-packet WQE. The internal order implemented in mlx5_ib_query_device() is multi-packet WQE, CQE zipping and packet pacing. Such difference hurts code readability, so let's sync, while mlx5-abi.h (exposed to userspace) is the primary order. This commit doesn't change any functionality. Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-02 15:51:20 -05:00
Jack Morgenstein	10b1c04e92	net/mlx4_core: Fix raw qp flow steering rules under SRIOV Demoting simple flow steering rule priority (for DPDK) was achieved by wrapping FW commands MLX4_QP_FLOW_STEERING_ATTACH/DETACH for the PF as well, and forcing the priority to MLX4_DOMAIN_NIC in the wrapper function for the PF and all VFs. In function mlx4_ib_create_flow(), this change caused the main rule creation for the PF to be wrapped, while it left the associated tunnel steering rule creation unwrapped for the PF. This mismatch caused rule deletion failures in mlx4_ib_destroy_flow() for the PF when the detach wrapper function did not find the associated tunnel-steering rule (since creation of that rule for the PF did not go through the wrapper function). Fix this by setting MLX4_QP_FLOW_STEERING_ATTACH/DETACH to be "native" (so that the PF invocation does not go through the wrapper), and perform the required priority demotion for the PF in the mlx4_ib_create_flow() code path. Fixes: `48564135cb` ("net/mlx4_core: Demote simple multicast and broadcast flow steering rules") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-29 14:17:40 -05:00
Linus Torvalds	7c0f6ba682	Replace <asm/uaccess.h> with <linux/uaccess.h> globally This was entirely automated, using the script by Al: PATT='^[[:blank:]]#[[:blank:]]include[[:blank:]]*<asm/uaccess.h>' sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \ $(git grep -l "$PATT"\|grep -v ^include/linux/uaccess.h) to do the replacement at the end of the merge window. Requested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-12-24 11:46:01 -08:00
Linus Torvalds	296915912d	First round of -rc fixes for 4.10 kernel - Series of qedr fixes - Series of rxe fixes - One isolated i40iw fix - One isolated cma fix - One isolated cxgb4 fix Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma fixes from Doug Ledford: "First round of -rc fixes for 4.10 kernel: - a series of qedr fixes - a series of rxe fixes - one i40iw fix - one cma fix - one cxgb4 fix" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: IB/rxe: Don't check for null ptr in send() IB/rxe: Drop future atomic/read packets rather than retrying IB/rxe: Use BTH_PSN_MASK when ACKing duplicate sends qedr: Always notify the verb consumer of flushed CQEs qedr: clear the vendor error field in the work completion qedr: post_send/recv according to QP state qedr: ignore inline flag in read verbs qedr: modify QP state to error when destroying it qedr: return correct value on modify qp qedr: return error if destroy CQ failed qedr: configure the number of CQEs on CQ creation i40iw: Set 128B as the only supported RQ WQE size IB/cma: Fix a race condition in iboe_addr_get_sgid() IB/rxe: Fix a memory leak in rxe_qp_cleanup() iw_cxgb4: set correct FetchBurstMax for QPs	2016-12-23 10:38:48 -08:00
Andrew Boyer	5cc8fabc5e	IB/rxe: Don't check for null ptr in send() pkt->qp was already dereferenced earlier in the function. Fixes Smatch complaint: drivers/infiniband/sw/rxe/rxe_net.c:458 send() warn: variable dereferenced before check 'pkt->qp' (see line 441) Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-22 11:36:12 -05:00
Andrew Boyer	cbf1f9a46c	IB/rxe: Drop future atomic/read packets rather than retrying If the completer is in the middle of a large read operation, one lost packet can cause havoc. Going to COMPST_ERROR_RETRY will cause the requester to resend the request. After that, any packet from the first attempt still in the receive queue will be interpreted as an error, restarting the error/retry sequence. The transfer will quickly exhaust its retries. This behavior is very noticeable when doing 512KB reads on a QEMU system configured with 1500B MTU. Also, a resent request here will prompt the responder on the other side to immediately start resending, but the resent packets will get stuck in the already-loaded receive queue and will never be processed. Rather than erroring out every time an unexpected future packet arrives, just drop it. Eventually the retry timer will send a duplicate request; the completer will be able to make progress since the queue will start relatively empty. Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-22 11:36:12 -05:00
Andrew Boyer	37b3619394	IB/rxe: Use BTH_PSN_MASK when ACKing duplicate sends Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-22 11:36:12 -05:00
Amrani, Ram	74c3875c3d	qedr: Always notify the verb consumer of flushed CQEs Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Reviewed-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-22 11:36:12 -05:00
Amrani, Ram	27035a1b37	qedr: clear the vendor error field in the work completion We clear the vendor error field in the work completion so that if a work completion is erroneous the field won't confuse the caller. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Reviewed-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-22 11:36:12 -05:00
Amrani, Ram	922d9a40d3	qedr: post_send/recv according to QP state Enable posting to SQ only in RTS, ERR and SQD QP state. Enable posting to RQ in ERR QP state. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Reviewed-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-22 11:36:12 -05:00
Amrani, Ram	8b0cabc650	qedr: ignore inline flag in read verbs In the current implementation a read verb with IB_SEND_INLINE may be illegally configured. In this fix we ignore the inline bit in the case of a read verb. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Reviewed-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-22 11:36:12 -05:00
Amrani, Ram	b4c2cc48aa	qedr: modify QP state to error when destroying it Current code didn't modify the QP state to error because it queried the QP state as a bitmap while it isn't. So the code never got executed. This patch fixes this and queries for each QP state respectively and not at once via a bitmask. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Reviewed-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-22 11:36:12 -05:00
Amrani, Ram	d6ebbf29c3	qedr: return correct value on modify qp Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Reviewed-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-22 11:36:12 -05:00
Amrani, Ram	a121135973	qedr: return error if destroy CQ failed Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Reviewed-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-22 11:36:12 -05:00
Amrani, Ram	c7eb3bced7	qedr: configure the number of CQEs on CQ creation Configure ibcq->cqe when a CQ is created. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Reviewed-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-22 11:36:12 -05:00
Chien Tin Tung	61f51b7b20	i40iw: Set 128B as the only supported RQ WQE size RQ WQE size other than 128B is not supported. Correct RQ size calculation to use 128B only. Since this breaks ABI, add additional code to provide compatibility with v4 user provider, libi40iw. Signed-off-by: Chien Tin Tung <chien.tin.tung@intel.com> Signed-off-by: Henry Orosco <henry.orosco@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-22 11:36:12 -05:00
Bart Van Assche	e259934d4d	IB/rxe: Fix a memory leak in rxe_qp_cleanup() A socket is associated with every QP by the rxe driver but sock_release() is never called. Add a call to sock_release() in rxe_qp_cleanup(). Fixes: commit 8700e3e7c48A5 ("Add Soft RoCE driver") Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Moni Shoua <monis@mellanox.com> Cc: Kamal Heib <kamalh@mellanox.com> Cc: Amir Vadai <amirv@mellanox.com> Cc: Haggai Eran <haggaie@mellanox.com> Cc: <stable@vger.kernel.org> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-18 13:35:19 -05:00
Steve Wise	b414fa01c3	iw_cxgb4: set correct FetchBurstMax for QPs The current QP FetchBurstMax value is 256B, which is incorrect since a WR can exceed that value. The result being a partial WR fetched by hardware, and a fatal "bad WR" error posted by the SGE. So bump the FetchBurstMax to 512B. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-12-18 13:35:19 -05:00
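
Note on b4c2cc48aa ("qedr: modify QP state to error when destroying it"): the message says the QP state was queried as a bitmap although it is not one. The sketch below illustrates that general class of bug and is not the qedr code. The enum, its values, and the function names are hypothetical, chosen only to show why bitwise tests on sequentially numbered states give wrong answers and why each state has to be compared individually.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, sequentially numbered states (NOT flag bits).
 * Illustrative only; not the qedr or IB verbs definitions. */
enum demo_qp_state {
	DEMO_QPS_RESET = 0,
	DEMO_QPS_INIT  = 1,
	DEMO_QPS_RTR   = 2,
	DEMO_QPS_RTS   = 3,
	DEMO_QPS_SQD   = 4,
	DEMO_QPS_ERR   = 5,
};

/* Buggy: ORs sequential values together as if they were a bitmask.
 * RTS | SQD == 3 | 4 == 7, so any state with a low bit set matches. */
bool demo_is_rts_or_sqd_bitmask(enum demo_qp_state s)
{
	return (s & (DEMO_QPS_RTS | DEMO_QPS_SQD)) != 0;
}

/* Correct: compare against each state individually. */
bool demo_is_rts_or_sqd_explicit(enum demo_qp_state s)
{
	return s == DEMO_QPS_RTS || s == DEMO_QPS_SQD;
}

/* Small self-check showing where the two versions disagree. */
void demo_self_check(void)
{
	/* INIT (1) is neither RTS nor SQD, yet the bitmask test matches. */
	assert(demo_is_rts_or_sqd_bitmask(DEMO_QPS_INIT));
	assert(!demo_is_rts_or_sqd_explicit(DEMO_QPS_INIT));

	/* The explicit test accepts exactly the two intended states. */
	assert(demo_is_rts_or_sqd_explicit(DEMO_QPS_RTS));
	assert(demo_is_rts_or_sqd_explicit(DEMO_QPS_SQD));
	assert(!demo_is_rts_or_sqd_explicit(DEMO_QPS_ERR));
}
```

Depending on the expression, a bitmask-style test on such an enum can match too often, as here, or never match, which is the symptom the commit message reports ("the code never got executed").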


6486 Commits