The device stores IPv6 addresses that are used for encapsulation in
linear memory that is managed by the driver.
Changing the remote address of an ip6gre net device never worked
properly, but since the cited commit the following reproducer [1] results
in a warning [2] and a memory leak [3]. The problem is that the
new remote address is never added by the driver to its hash table (and
therefore to the device) and the old address is never removed from it.
Fix by programming the new address when the configuration of the ip6gre
net device changes and removing the old one. If the address did not
change, the above simply increments the reference count of the address
and then decrements it.
[1]
# ip link add name bla up type ip6gre local 2001:db8:1::1 remote 2001:db8:2::1 tos inherit ttl inherit
# ip link set dev bla type ip6gre remote 2001:db8:3::1
# ip link del dev bla
# devlink dev reload pci/0000:01:00.0
[2]
WARNING: CPU: 0 PID: 1682 at drivers/net/ethernet/mellanox/mlxsw/spectrum.c:3002 mlxsw_sp_ipv6_addr_put+0x140/0x1d0
Modules linked in:
CPU: 0 UID: 0 PID: 1682 Comm: ip Not tainted 6.12.0-rc3-custom-g86b5b55bc835 #151
Hardware name: Nvidia SN5600/VMOD0013, BIOS 5.13 05/31/2023
RIP: 0010:mlxsw_sp_ipv6_addr_put+0x140/0x1d0
[...]
Call Trace:
<TASK>
mlxsw_sp_router_netdevice_event+0x55f/0x1240
notifier_call_chain+0x5a/0xd0
call_netdevice_notifiers_info+0x39/0x90
unregister_netdevice_many_notify+0x63e/0x9d0
rtnl_dellink+0x16b/0x3a0
rtnetlink_rcv_msg+0x142/0x3f0
netlink_rcv_skb+0x50/0x100
netlink_unicast+0x242/0x390
netlink_sendmsg+0x1de/0x420
____sys_sendmsg+0x2bd/0x320
___sys_sendmsg+0x9a/0xe0
__sys_sendmsg+0x7a/0xd0
do_syscall_64+0x9e/0x1a0
entry_SYSCALL_64_after_hwframe+0x77/0x7f
[3]
unreferenced object 0xffff898081f597a0 (size 32):
comm "ip", pid 1626, jiffies 4294719324
hex dump (first 32 bytes):
20 01 0d b8 00 02 00 00 00 00 00 00 00 00 00 01 ...............
21 49 61 83 80 89 ff ff 00 00 00 00 01 00 00 00 !Ia.............
backtrace (crc fd9be911):
[<00000000df89c55d>] __kmalloc_cache_noprof+0x1da/0x260
[<00000000ff2a1ddb>] mlxsw_sp_ipv6_addr_kvdl_index_get+0x281/0x340
[<000000009ddd445d>] mlxsw_sp_router_netdevice_event+0x47b/0x1240
[<00000000743e7757>] notifier_call_chain+0x5a/0xd0
[<000000007c7b9e13>] call_netdevice_notifiers_info+0x39/0x90
[<000000002509645d>] register_netdevice+0x5f7/0x7a0
[<00000000c2e7d2a9>] ip6gre_newlink_common.isra.0+0x65/0x130
[<0000000087cd6d8d>] ip6gre_newlink+0x72/0x120
[<000000004df7c7cc>] rtnl_newlink+0x471/0xa20
[<0000000057ed632a>] rtnetlink_rcv_msg+0x142/0x3f0
[<0000000032e0d5b5>] netlink_rcv_skb+0x50/0x100
[<00000000908bca63>] netlink_unicast+0x242/0x390
[<00000000cdbe1c87>] netlink_sendmsg+0x1de/0x420
[<0000000011db153e>] ____sys_sendmsg+0x2bd/0x320
[<000000003b6d53eb>] ___sys_sendmsg+0x9a/0xe0
[<00000000cae27c62>] __sys_sendmsg+0x7a/0xd0
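As a rough illustration of the fix described above (the helpers are the ones
seen in the traces, but the handler name and exact flow here are hypothetical,
not the actual driver change): reference the new remote address first and only
then release the old one, so an unchanged address merely gains and then drops a
reference.
static int mlxsw_sp_ipip_remote_addr_update_sketch(struct mlxsw_sp *mlxsw_sp,
						   const struct in6_addr *old_addr,
						   const struct in6_addr *new_addr)
{
	u32 kvdl_index;
	int err;

	/* Program (or take a reference on) the new remote address first. */
	err = mlxsw_sp_ipv6_addr_kvdl_index_get(mlxsw_sp, new_addr, &kvdl_index);
	if (err)
		return err;

	/* Only then release the old address; if old == new, this simply
	 * drops the extra reference taken above.
	 */
	mlxsw_sp_ipv6_addr_put(mlxsw_sp, old_addr);
	return 0;
}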
Fixes: cf42911523 ("mlxsw: spectrum_ipip: Use common hash table for IPv6 address mapping")
Reported-by: Maksym Yaremchuk <maksymy@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/e91012edc5a6cb9df37b78fd377f669381facfcb.1729866134.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Non-coherent architectures, like ARM, may require invalidating caches
before the device can use the DMA-mapped memory, which means that before
posting pages to the device, drivers should sync the memory for the device.
Syncing for the device can be configured as a page pool responsibility.
Set the relevant flag and define max_len for the sync.
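For illustration only (the include path, pool size and buffer length are
assumptions, not the driver's actual configuration), delegating
sync-for-device to the page pool looks roughly like this:
#include <net/page_pool/types.h>

/* Sketch: let the page pool sync pages for the device before they are
 * handed to hardware, bounding the synced area with max_len.
 */
static struct page_pool *rx_page_pool_create_sketch(struct device *dev,
						    u32 max_rx_buf_len)
{
	struct page_pool_params pp_params = {
		.flags		= PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
		.pool_size	= 256,			/* illustrative */
		.nid		= NUMA_NO_NODE,
		.dev		= dev,
		.dma_dir	= DMA_FROM_DEVICE,
		.max_len	= max_rx_buf_len,	/* sync at most this much */
	};

	return page_pool_create(&pp_params);
}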
Cc: Jiri Pirko <jiri@resnulli.us>
Fixes: b5b60bb491 ("mlxsw: pci: Use page pool for Rx buffers allocation")
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/92e01f05c4f506a4f0a9b39c10175dcc01994910.1729866134.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When an Rx packet is received, drivers should sync the pages for the CPU,
to ensure the CPU reads the data written by the device and not stale
data from its cache.
Add the missing sync call in the Rx path, syncing the actual length of
data for each fragment.
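A minimal sketch of such a sync (offset and length handling are simplified;
the helper is the page pool API's sync-for-CPU routine):
/* Sketch: sync only the bytes the device actually wrote for this
 * fragment before the CPU touches them.
 */
static void rx_frag_sync_for_cpu_sketch(struct page_pool *pool,
					struct page *page,
					u32 offset, u32 frag_len)
{
	page_pool_dma_sync_for_cpu(pool, page, offset, frag_len);
}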
Cc: Jiri Pirko <jiri@resnulli.us>
Fixes: b5b60bb491 ("mlxsw: pci: Use page pool for Rx buffers allocation")
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/461486fac91755ca4e04c2068c102250026dcd0b.1729866134.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
A Tx header should be pushed for each packet transmitted via
Spectrum ASICs. The cited commit moved the call to skb_cow_head() from
mlxsw_sp_port_xmit() to the functions which handle the Tx header.
When mlxsw_sp->ptp_ops->txhdr_construct() is used to handle the Tx
header, and txhdr_construct() is mlxsw_sp_ptp_txhdr_construct(), there is
no call to skb_cow_head() before pushing the Tx header size to the SKB.
This flow is relevant for Spectrum-1 and Spectrum-4, for PTP packets.
Add the missing call to skb_cow_head() to make sure that there is
enough room to push the Tx header and that the SKB header is not cloned
and can be modified.
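A minimal sketch of the intended ordering (the PTP-specific header contents
are omitted; MLXSW_TXHDR_LEN is the Spectrum Tx header size):
static int ptp_txhdr_construct_sketch(struct sk_buff *skb)
{
	int err;

	/* Make sure there is headroom and the header is not shared. */
	err = skb_cow_head(skb, MLXSW_TXHDR_LEN);
	if (err)
		return err;

	/* Only now is it safe to push and fill the Tx header. */
	memset(skb_push(skb, MLXSW_TXHDR_LEN), 0, MLXSW_TXHDR_LEN);
	return 0;
}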
An additional set will be sent to net-next to centralize the handling of
the Tx header by pushing it to every packet just before transmission.
Cc: Richard Cochran <richardcochran@gmail.com>
Fixes: 24157bc69f ("mlxsw: Send PTP packets as data packets to overcome a limitation")
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/5145780b07ebbb5d3b3570f311254a3a2d554a44.1729866134.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
xa_err() should be used to extract the error encoded in the return
value of xa_store().
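A minimal sketch of the pattern (names are illustrative):
static int nh_counter_store_sketch(struct xarray *xa, unsigned long index,
				   void *counter)
{
	/* xa_store() encodes failures in its return value; extract them
	 * with xa_err() instead of comparing the pointer directly.
	 */
	return xa_err(xa_store(xa, index, counter, GFP_KERNEL));
}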
Fixes: 44c2fbebe1 ("mlxsw: spectrum_router: Share nexthop counters in resilient groups")
Signed-off-by: Yuan Can <yuancan@huawei.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Tested-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20241017023223.74180-1-yuancan@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The command bitmask has a dedicated bit for the MANAGE_PAGES command, but
this bit isn't initialized during command bitmask initialization, only
when MANAGE_PAGES is invoked.
In addition, mlx5_cmd_trigger_completions() tries to trigger
completion for the MANAGE_PAGES command as well.
Hence, in case a health error occurred before any MANAGE_PAGES command
has been invoked (for example, during mlx5_enable_hca()),
mlx5_cmd_trigger_completions() will try to trigger completion for the
MANAGE_PAGES command, which will result in a null-ptr-deref error [1].
Fix it by initializing the command bitmask correctly.
While at it, rewrite the code for better readability.
[1]
BUG: KASAN: null-ptr-deref in mlx5_cmd_trigger_completions+0x1db/0x600 [mlx5_core]
Write of size 4 at addr 0000000000000214 by task kworker/u96:2/12078
CPU: 10 PID: 12078 Comm: kworker/u96:2 Not tainted 6.9.0-rc2_for_upstream_debug_2024_04_07_19_01 #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Workqueue: mlx5_health0000:08:00.0 mlx5_fw_fatal_reporter_err_work [mlx5_core]
Call Trace:
<TASK>
dump_stack_lvl+0x7e/0xc0
kasan_report+0xb9/0xf0
kasan_check_range+0xec/0x190
mlx5_cmd_trigger_completions+0x1db/0x600 [mlx5_core]
mlx5_cmd_flush+0x94/0x240 [mlx5_core]
enter_error_state+0x6c/0xd0 [mlx5_core]
mlx5_fw_fatal_reporter_err_work+0xf3/0x480 [mlx5_core]
process_one_work+0x787/0x1490
? lockdep_hardirqs_on_prepare+0x400/0x400
? pwq_dec_nr_in_flight+0xda0/0xda0
? assign_work+0x168/0x240
worker_thread+0x586/0xd30
? rescuer_thread+0xae0/0xae0
kthread+0x2df/0x3b0
? kthread_complete_and_exit+0x20/0x20
ret_from_fork+0x2d/0x70
? kthread_complete_and_exit+0x20/0x20
ret_from_fork_asm+0x11/0x20
</TASK>
Fixes: 9b98d395b8 ("net/mlx5: Start health poll at earlier stage of driver load")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Currently, the mlx5 driver does not enforce that the vector index is lower
than the maximum number of supported completion vectors when requesting a
new completion EQ. Thus, mlx5_comp_eqn_get() fails when trying to
acquire an IRQ with an improper vector index.
To prevent the case above, enforce that the vector index is
valid and lower than the maximum in mlx5_comp_eqn_get() before handling the
request.
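A rough sketch of such a check (helper and parameter names are hypothetical):
static int comp_eqn_get_sketch(unsigned int vecidx,
			       unsigned int max_comp_vectors, int *eqn)
{
	/* Reject out-of-range vector indices up front. */
	if (vecidx >= max_comp_vectors)
		return -EINVAL;

	/* ... look up or create the completion EQ and return its EQN ... */
	return 0;
}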
Fixes: f14c1a14e6 ("net/mlx5: Allocate completion EQs dynamically")
Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The HWS BWC API uses one lock per queue and usually acquires one of
them, except when doing changes which require locking all queues in
order. Naturally, lockdep isn't too happy about acquiring the same lock
class multiple times, so inform it that each queue lock is a different
class to avoid false positives.
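One common way to do this (a sketch with hypothetical structure names) is to
give every dynamically allocated lock its own lockdep key:
struct bwc_queue_lock_sketch {
	struct mutex		lock;
	struct lock_class_key	key;	/* one lockdep class per queue */
};

static void bwc_queue_locks_init_sketch(struct bwc_queue_lock_sketch *locks,
					unsigned int num_queues)
{
	unsigned int i;

	for (i = 0; i < num_queues; i++) {
		mutex_init(&locks[i].lock);
		/* Dynamically allocated keys must be registered, and
		 * unregistered again when the locks are destroyed.
		 */
		lockdep_register_key(&locks[i].key);
		lockdep_set_class(&locks[i].lock, &locks[i].key);
	}
}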
Fixes: 2ca62599aa ("net/mlx5: HWS, added send engine and context handling")
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
hws_send_queues_bwc_locks_destroy() destroyed more queue locks than were
allocated, occasionally leading to memory corruption and warnings such
as DEBUG_LOCKS_WARN_ON(mutex_is_locked(lock)) in __mutex_destroy, because
sometimes the 'mutex' being destroyed was random memory.
The severity of this problem is proportional to the number of queues
configured because the code overreaches beyond the end of the
bwc_send_queue_locks array by 2x its length.
Fix that by using the correct number of bwc queues.
Fixes: 2ca62599aa ("net/mlx5: HWS, added send engine and context handling")
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Fix an error flow bug that could lead to a double free of a buffer
when the driver fails to calculate a suitable definer layout.
Fixes: 74a778b4a6 ("net/mlx5: HWS, added definers handling")
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Itamar Gozlan <igozlan@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Remove the wrong access to the num_of_rules field of the matcher.
This is a regular u32 variable, but it was accessed as if it were atomic.
This fixes the following CI warnings:
mlx5hws_bwc.c:708:17: warning: large atomic operation may incur significant performance penalty;
the access size (4 bytes) exceeds the max lock-free size (0 bytes) [-Watomic-alignment]
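A minimal sketch of the change (field placement is illustrative): since
num_of_rules is a plain u32 updated under the matcher's existing locking,
a regular increment is enough and no atomic read-modify-write is needed.
static void bwc_matcher_rule_added_sketch(u32 *num_of_rules)
{
	/* Caller holds the relevant queue lock. */
	(*num_of_rules)++;
}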
Fixes: 510f9f61a1 ("net/mlx5: HWS, added API and enabled HWS support")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202409291101.6NdtMFVC-lkp@intel.com/
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Itamar Gozlan <igozlan@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Several new features here:
virtio-balloon supports new stats
vdpa supports setting mac address
vdpa/mlx5 suspend/resume as well as MKEY ops are now faster
virtio_fs supports new sysfs entries for queue info
virtio/vsock performance has been improved
Fixes, cleanups all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio updates from Michael Tsirkin:
"Several new features here:
- virtio-balloon supports new stats
- vdpa supports setting mac address
- vdpa/mlx5 suspend/resume as well as MKEY ops are now faster
- virtio_fs supports new sysfs entries for queue info
- virtio/vsock performance has been improved
And fixes, cleanups all over the place"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (34 commits)
vsock/virtio: avoid queuing packets when intermediate queue is empty
vsock/virtio: refactor virtio_transport_send_pkt_work
fw_cfg: Constify struct kobj_type
vdpa/mlx5: Postpone MR deletion
vdpa/mlx5: Introduce init/destroy for MR resources
vdpa/mlx5: Rename mr_mtx -> lock
vdpa/mlx5: Extract mr members in own resource struct
vdpa/mlx5: Rename function
vdpa/mlx5: Delete direct MKEYs in parallel
vdpa/mlx5: Create direct MKEYs in parallel
MAINTAINERS: add virtio-vsock driver in the VIRTIO CORE section
virtio_fs: add sysfs entries for queue information
virtio_fs: introduce virtio_fs_put_locked helper
vdpa: Remove unused declarations
vdpa/mlx5: Parallelize VQ suspend/resume for CVQ MQ command
vdpa/mlx5: Small improvement for change_num_qps()
vdpa/mlx5: Keep notifiers during suspend but ignore
vdpa/mlx5: Parallelize device resume
vdpa/mlx5: Parallelize device suspend
vdpa/mlx5: Use async API for vq modify commands
...
With larger RQ sizes and small MTU sizes, the hd_per_wq variable
can overflow, as in the following case:
$> ethtool --set-ring eth1 rx 8192
$> ip link set dev eth1 mtu 144
$> ethtool --features eth1 rx-gro-hw on
... yields in dmesg:
mlx5_core 0000:08:00.1: mlx5_cmd_out_err:808:(pid 194797): CREATE_MKEY(0x200) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x3bf6f), err(-22)
because hd_per_wq is 64K, which overflows to 0 and makes the command
fail.
This patch increases the variable size to 32 bits.
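A small user-space demonstration of the truncation (the per-WQE header count
of 8 is only an assumption for illustration; the point is that 8192 * 8 =
65536 does not fit in a u16):
#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint32_t wq_size = 8192;	/* ethtool rx ring size */
	uint32_t hd_per_wqe = 8;	/* assumed, for illustration */
	uint16_t as_u16 = wq_size * hd_per_wqe;	/* wraps to 0 */
	uint32_t as_u32 = wq_size * hd_per_wqe;	/* 65536 */

	printf("u16: %u, u32: %u\n", as_u16, as_u32);
	return 0;
}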
Fixes: 99be56171f ("net/mlx5e: SHAMPO, Re-enable HW-GRO")
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Fix all the 'E2BIG' returns in error flows to the negative
'-E2BIG', as negative error codes are used everywhere in the HWS code.
This also fixes the following smatch warnings:
"warn: was negative '-E2BIG' intended?"
Fixes: 74a778b4a6 ("net/mlx5: HWS, added definers handling")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/all/f8c77688-7d83-4937-baba-ac844dfe2e0b@stanley.mountain/
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
In mlx5e_tir_builder_alloc(), kvzalloc() may return NULL,
which is then dereferenced on the next line when the modify
field is accessed.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
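A minimal sketch of the fix (the builder struct is abbreviated here):
struct tir_builder_sketch {
	bool modify;
	/* ... command input buffer, etc. ... */
};

static struct tir_builder_sketch *tir_builder_alloc_sketch(bool modify)
{
	struct tir_builder_sketch *builder;

	builder = kvzalloc(sizeof(*builder), GFP_KERNEL);
	if (!builder)
		return NULL;		/* bail out before touching fields */

	builder->modify = modify;
	return builder;
}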
Fixes: a6696735d6 ("net/mlx5e: Convert TIR to a dedicated object")
Signed-off-by: Elena Salomatkina <esalomatkina@ispras.ru>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Collecting crdump involves reading VSC registers from the PCI config space
of the mlx5 device, which can take a long time to complete. This might
result in starving other threads waiting to run on the CPU.
Numbers I got from testing ConnectX-5 Ex MCX516A-CDAT in the lab:
- mlx5_vsc_gw_read_block_fast() was called with length = 1310716.
- mlx5_vsc_gw_read_fast() reads 4 bytes at a time. It was not used to
read the entire 1310716 bytes. It was called 53813 times because
there are jumps in read_addr.
- On average mlx5_vsc_gw_read_fast() took 35284.4ns.
- In total mlx5_vsc_wait_on_flag() called vsc_read() 54707 times.
The average time for each call was 17548.3ns. In some instances
vsc_read() was called more than one time when the flag was not set.
As expected, the thread released the CPU after 16 iterations in
mlx5_vsc_wait_on_flag().
- Total time to read crdump was 35284.4ns * 53813 ~= 1.898s.
It was seen in the field that crdump can take more than 5 seconds to
complete. During that time mlx5_vsc_wait_on_flag() did not release the
CPU because it did not complete 16 iterations. It is believed that the PCI
config reads were slow. Adding cond_resched() every 128 register reads
improves the situation. In the common case, where crdump takes ~1.8989s,
the thread yields the CPU every ~4.51ms. If crdump takes ~5s, the thread
yields the CPU every ~18.0ms.
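A rough sketch of the pattern (the read callback stands in for the driver's
VSC register-read helper):
static int vsc_read_block_sketch(void *ctx, u32 *data, unsigned int dwords,
				 int (*read_dword)(void *ctx, u32 addr, u32 *val))
{
	unsigned int i;
	int err;

	for (i = 0; i < dwords; i++) {
		err = read_dword(ctx, i * 4, &data[i]);
		if (err)
			return err;

		/* Yield the CPU periodically during long dumps. */
		if ((i + 1) % 128 == 0)
			cond_resched();
	}
	return 0;
}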
Fixes: 8b9d8baae1 ("net/mlx5: Add Crdump support")
Reviewed-by: Yuanyuan Zhong <yzhong@purestorage.com>
Signed-off-by: Mohamed Khalfella <mkhalfella@purestorage.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Remove the erroneous unmap in case no DMA mapping was established
The multi-packet WQE transmit code attempts to obtain a DMA mapping for
the skb. This could fail, e.g. under memory pressure, when the IOMMU
driver just can't allocate more memory for page tables. While the code
tries to handle this in the path below the err_unmap label, it erroneously
unmaps one entry from the sq's FIFO list of active mappings. Since the
current map attempt failed, this unmap removes some random DMA mapping
that might still be required. If the PCI function now presents that IOVA,
the IOMMU may assume a rogue DMA access and, e.g. on s390, put the PCI
function in an error state.
The erroneous behavior was seen in a stress-test environment that created
memory pressure.
Fixes: 5af75c747e ("net/mlx5e: Enhanced TX MPWQE for SKBs")
Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Acked-by: Maxim Mikityanskiy <maxtram95@gmail.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Currently, commands that qualify as throttled can't be used via the
async API. That's because acquiring the throttle semaphore can sleep,
but the async API can't.
This patch allows throttling in the async API by using the trylock
variant of the semaphore; upon failure (semaphore at 0) it returns EBUSY
to signal to the caller that they need to wait for the completion of
previously issued commands.
Furthermore, make sure that the semaphore is released in the callback.
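A minimal sketch of that scheme (semaphore placement and names are
illustrative):
static int throttled_cmd_submit_async_sketch(struct semaphore *throttle_sem)
{
	/* The async path must not sleep, so take the semaphore
	 * tentatively and let the caller retry later if it is exhausted.
	 */
	if (down_trylock(throttle_sem))
		return -EBUSY;

	/* ... post the command asynchronously ... */
	return 0;
}

static void throttled_cmd_complete_sketch(struct semaphore *throttle_sem)
{
	/* Completion callback: release the slot taken at submit time. */
	up(throttle_sem);
}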
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Cc: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Message-Id: <20240816090159.1967650-2-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Usual collection of small improvements and fixes:
- Bug fixes and minor improvments in cxgb4, siw, mlx5, rxe, efa, rts, hfi,
erdma, hns, irdma
- Code cleanups/typos/etc. Tidy alloc_ordered_workqueue() calls
- Multipath PCI for mlx5
- Variable size work queue, SRQ changes, and relaxed ordering for new bnxt HW
- New ODP fault resolution FW protocol in mlx5
- New "rdma monitor" netlink mechanism
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma updates from Jason Gunthorpe:
"Usual collection of small improvements and fixes, nothing especially
stands out to me here.
The new multipath PCI feature is a sign of things to come, I think we
will see more of this in the next 10 years. Broadcom and HNS continue
to update their drivers for their new HW generations.
Summary:
- Bug fixes and minor improvments in cxgb4, siw, mlx5, rxe, efa, rts,
hfi, erdma, hns, irdma
- Code cleanups/typos/etc. Tidy alloc_ordered_workqueue() calls
- Multipath PCI for mlx5
- Variable size work queue, SRQ changes, and relaxed ordering for new
bnxt HW
- New ODP fault resolution FW protocol in mlx5
- New 'rdma monitor' netlink mechanism"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (99 commits)
RDMA/bnxt_re: Remove the unused variable en_dev
RDMA/nldev: Add missing break in rdma_nl_notify_err_msg()
RDMA/irdma: fix error message in irdma_modify_qp_roce()
RDMA/cxgb4: Added NULL check for lookup_atid
RDMA/hns: Fix ah error counter in sw stat not increasing
RDMA/bnxt_re: Recover the device when FW error is detected
RDMA/bnxt_re: Group all operations under add_device and remove_device
RDMA/bnxt_re: Use the aux device for L2 ULP callbacks
RDMA/bnxt_re: Change aux driver data to en_info to hold more information
RDMA/nldev: Expose whether RDMA monitoring is supported
RDMA/nldev: Add support for RDMA monitoring
RDMA/mlx5: Use IB set_netdev and get_netdev functions
RDMA/device: Remove optimization in ib_device_get_netdev()
RDMA/mlx5: Initialize phys_port_cnt earlier in RDMA device creation
RDMA/mlx5: Obtain upper net device only when needed
RDMA/mlx5: Check RoCE LAG status before getting netdev
RDMA/mlx5: Consider the query_vuid cap for data_direct
net/mlx5: Handle memory scheme ODP capabilities
RDMA/mlx5: Add implicit MR handling to ODP memory scheme
RDMA/mlx5: Add handling for memory scheme page fault events
...
- Update some thermal drivers to eliminate thermal_zone_get_trip()
calls from them and get rid of that function (Rafael Wysocki).
- Update the thermal sysfs code to store trip point attributes in trip
descriptors and get to trip points via attribute pointers (Rafael
Wysocki).
- Move the computation of the low and high boundaries for
thermal_zone_set_trips() to __thermal_zone_device_update() (Daniel
Lezcano).
- Introduce a debugfs-based facility for thermal core testing (Rafael
Wysocki).
- Replace the thermal zone .bind() and .unbind() callbacks for binding
cooling devices to thermal zones with one .should_bind() callback
used for deciding whether or not a given cooling devices should be
bound to a given trip point in a given thermal zone (Rafael Wysocki).
- Eliminate code that has no more users after the other changes, drop
some redundant checks from the thermal core and clean it up (Rafael
Wysocki).
- Fix rounding of delay jiffies in the thermal core (Rafael Wysocki).
- Refuse to accept trip point temperature or hysteresis that would lead
to an invalid threshold value when setting them via sysfs (Rafael
Wysocki).
- Adjust states of all uninitialized instances in the .manage()
callback of the Bang-bang thermal governor (Rafael Wysocki).
- Drop a couple of redundant checks along with the code depending on
them from the thermal core (Rafael Wysocki).
- Rearrange the thermal core to avoid redundant checks and simplify
control flow in a couple of code paths (Rafael Wysocki).
- Add power domain DT bindings for new Amlogic SoCs (Georges Stark).
- Switch from CONFIG_PM_SLEEP guards to pm_sleep_ptr() in the ST
driver and add a Kconfig dependency on THERMAL_OF subsystem for the
STi driver (Raphael Gallais-Pou).
- Simplify the error code path in the probe functions in the brcmstb
driver with the help of dev_err_probe() (Yan Zhen).
- Make imx_sc_thermal use dev_err_probe() (Alexander Stein).
- Remove trailing space after \n newline in the Renesas driver (Colin
Ian King).
- Add DT binding compatible string for the SA8255p to the tsens thermal
driver (Nikunj Kela).
- Use the devm_clk_get_enabled() helpers to simplify the init routine
in the sprd thermal driver (Huan Yang).
- Remove __maybe_unused notations for the functions by using the new
RUNTIME_PM_OPS() and SYSTEM_SLEEP_PM_OPS() macros on the IMx and
Qoriq drivers (Fabio Estevam)
- Remove unused declarations from the ti-soc-thermal driver's header
file as the functions in question were removed previously (Zhang
Zekun).
Merge tag 'thermal-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control updates from Rafael Wysocki:
"These mostly continue to rework the thermal core and the thermal zone
driver interface to make the code more straightforward and reduce
bloat.
The most significant piece of this work is a change of the code
related to binding cooling devices to thermal zones which, among other
things, replaces two previously existing thermal zone operations with
one allowing driver implementations to be much simpler.
There is also a new thermal core testing module allowing mock thermal
zones to be created and controlled via debugfs in order to exercise
the thermal core functionality. It is expected to be used for
implementing thermal core self tests in the future.
Apart from the above, there are assorted thermal driver updates.
Specifics:
- Update some thermal drivers to eliminate thermal_zone_get_trip()
calls from them and get rid of that function (Rafael Wysocki)
- Update the thermal sysfs code to store trip point attributes in
trip descriptors and get to trip points via attribute pointers
(Rafael Wysocki)
- Move the computation of the low and high boundaries for
thermal_zone_set_trips() to __thermal_zone_device_update() (Daniel
Lezcano)
- Introduce a debugfs-based facility for thermal core testing (Rafael
Wysocki)
- Replace the thermal zone .bind() and .unbind() callbacks for
binding cooling devices to thermal zones with one .should_bind()
callback used for deciding whether or not a given cooling devices
should be bound to a given trip point in a given thermal zone
(Rafael Wysocki)
- Eliminate code that has no more users after the other changes, drop
some redundant checks from the thermal core and clean it up (Rafael
Wysocki)
- Fix rounding of delay jiffies in the thermal core (Rafael Wysocki)
- Refuse to accept trip point temperature or hysteresis that would
lead to an invalid threshold value when setting them via sysfs
(Rafael Wysocki)
- Adjust states of all uninitialized instances in the .manage()
callback of the Bang-bang thermal governor (Rafael Wysocki)
- Drop a couple of redundant checks along with the code depending on
them from the thermal core (Rafael Wysocki)
- Rearrange the thermal core to avoid redundant checks and simplify
control flow in a couple of code paths (Rafael Wysocki)
- Add power domain DT bindings for new Amlogic SoCs (Georges Stark)
- Switch from CONFIG_PM_SLEEP guards to pm_sleep_ptr() in the ST
driver and add a Kconfig dependency on THERMAL_OF subsystem for the
STi driver (Raphael Gallais-Pou)
- Simplify the error code path in the probe functions in the brcmstb
driver with the help of dev_err_probe() (Yan Zhen)
- Make imx_sc_thermal use dev_err_probe() (Alexander Stein)
- Remove trailing space after \n newline in the Renesas driver (Colin
Ian King)
- Add DT binding compatible string for the SA8255p to the tsens
thermal driver (Nikunj Kela)
- Use the devm_clk_get_enabled() helpers to simplify the init routine
in the sprd thermal driver (Huan Yang)
- Remove __maybe_unused notations for the functions by using the new
RUNTIME_PM_OPS() and SYSTEM_SLEEP_PM_OPS() macros on the IMx and
Qoriq drivers (Fabio Estevam)
- Remove unused declarations from the ti-soc-thermal driver's header
file as the functions in question were removed previously (Zhang
Zekun)"
* tag 'thermal-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (48 commits)
thermal: core: Drop thermal_zone_device_is_enabled()
thermal: core: Check passive delay in monitor_thermal_zone()
thermal: core: Drop dead code from monitor_thermal_zone()
thermal: core: Drop redundant lockdep_assert_held()
thermal: gov_bang_bang: Adjust states of all uninitialized instances
thermal: sysfs: Add sanity checks for trip temperature and hysteresis
thermal/drivers/imx_sc_thermal: Use dev_err_probe
thermal/drivers/ti-soc-thermal: Remove unused declarations
thermal/drivers/imx: Remove __maybe_unused notations
thermal/drivers/qoriq: Remove __maybe_unused notations
thermal/drivers/sprd: Use devm_clk_get_enabled() helpers
dt-bindings: thermal: tsens: document support on SA8255p
thermal/drivers/renesas: Remove trailing space after \n newline
thermal/drivers/brcmstb_thermal: Simplify with dev_err_probe()
thermal/drivers/sti: Depend on THERMAL_OF subsystem
thermal/drivers/st: Switch from CONFIG_PM_SLEEP guards to pm_sleep_ptr()
dt-bindings: thermal: amlogic,thermal: add optional power-domains
thermal: core: Drop tz field from struct thermal_instance
thermal: core: Drop redundant checks from thermal_bind_cdev_to_trip()
thermal: core: Rename cdev-to-thermal-zone bind/unbind functions
...
There is a copy-and-paste bug, so this code checks "sq->dep_wqe" where
"sq->wr_priv" was intended. It could result in a NULL pointer
dereference.
Fixes: 2ca62599aa ("net/mlx5: HWS, added send engine and context handling")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/da822315-02b7-4f5b-9c86-0d5176c5069d@stanley.mountain
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The IB layer provides a common interface to store and get net
devices associated to an IB device port (ib_device_set_netdev()
and ib_device_get_netdev()).
Previously, mlx5_ib stored and managed the associated net devices
internally.
Replace internal net device management in mlx5_ib with
ib_device_set_netdev() when attaching/detaching a net device and
ib_device_get_netdev() when retrieving the net device.
Export ib_device_get_netdev().
For mlx5 representors/PFs/VFs and lag creation we replace the netdev
assignments with the IB set/get netdev functions.
In active-backup lag mode, the active slave net device is stored in the
lag itself. To ensure the net device stored in a lag bond IB device is
the active slave, we implement the following:
- mlx5_core: when modifying the slave of a bond we send the internal driver event
MLX5_DRIVER_EVENT_ACTIVE_BACKUP_LAG_CHANGE_LOWERSTATE.
- mlx5_ib: when catching the event, call ib_device_set_netdev()
This patch also ensures the correct IB events are sent in switchdev lag.
While at it, when in multiport eswitch mode, only a single IB device is
created for all ports. The said IB device will receive all netdev events
of its VFs once loaded, thus to avoid overwriting the mapping of PF IB
device to PF netdev, ignore NETDEV_REGISTER events if the ib device has
already been mapped to a netdev.
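For reference, the common helpers look roughly like this in use (wrapper
names are hypothetical and error handling is trimmed):
static int bind_port_netdev_sketch(struct ib_device *ibdev,
				   struct net_device *ndev, u32 port)
{
	/* Store the netdev for this IB port in the common IB layer. */
	return ib_device_set_netdev(ibdev, ndev, port);
}

static struct net_device *lookup_port_netdev_sketch(struct ib_device *ibdev,
						    u32 port)
{
	/* Returns a held reference (or NULL); the caller must dev_put(). */
	return ib_device_get_netdev(ibdev, port);
}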
Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com>
Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Link: https://patch.msgid.link/20240909173025.30422-6-michaelgur@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
When running over new FW that supports the new memory scheme ODP, set
the cap in the FW to signal that we are working in the new scheme.
In the memory scheme ODP, the per_transport_service capabilities are RO
for the driver, so we skip setting them.
Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Link: https://patch.msgid.link/20240909100504.29797-9-michaelgur@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
mlx5e_free_rq previously cleaned resources in an order that was not the
reverse of the resource allocation order in mlx5e_alloc_rq.
Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20240911201757.1505453-16-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When SHAMPO can't identify the protocol/header of a packet, it will
yield a packet that is not split - the whole packet is in the data part.
Count these events in packets and bytes.
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20240911201757.1505453-15-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add a new command status MLX5_CMD_STAT_NOT_READY to handle cases
where the firmware is not ready.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Link: https://patch.msgid.link/20240911201757.1505453-14-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
SFs didn't allow configuring IRQ affinity for their vectors. Allow users
to configure the affinity of the SFs' IRQs.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Link: https://patch.msgid.link/20240911201757.1505453-13-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
A sync reset request is nacked by the driver when the PCIe bridge connected
to the mlx5 device has the HotPlug interrupt enabled. However, when using
the hot reset method, this check can be skipped as HotPlug is supported by
this reset method.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20240911201757.1505453-12-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
On devices that support sync reset for firmware activation using hot
reset, the driver queries the required reset method while handling the
sync reset request. If the required reset method is hot reset, the
driver will use pci_reset_bus() to reset the PCI link instead of the
link toggle.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20240911201757.1505453-11-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Native capability for some steering engines lacks support for adding an
additional match with the same value to the same flow group. To accommodate
the NO APPEND flag in these scenarios, we include the new rule in the
existing flow table entry (fte) without immediate hardware commitment. When
a request is made to delete the corresponding hardware rule, we then commit
the pending rule to hardware.
Only one pending rule is supported because NO_APPEND is primarily used
during replacement operations. In this scenario, a rule is initially added.
When it needs replacement, the new rule is added with NO_APPEND set. Only
after the insertion of the new rule is the original rule deleted.
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20240911201757.1505453-9-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Introduce a dedicated structure to encapsulate flow context, actions,
destination count, and modification mask. This refactoring lays the
groundwork for forthcoming patches that will integrate the NO APPEND
software logic. Future modifications should focus solely on these
specific fields.
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20240911201757.1505453-8-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Counter is in struct fte, remove it.
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20240911201757.1505453-7-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Downstream patches will need this as we might not want to reset
it when a pending rule is connected to the FTE.
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20240911201757.1505453-6-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
As preparation for HW Steering support, where the function
get_root_namespace() is needed to get root FDB, make it an API function
and rename it to mlx5_get_root_namespace().
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Link: https://patch.msgid.link/20240911201757.1505453-5-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
As preparation for HW steering support in fs core level, move SW
steering helper function that can be reused by HW steering to fs_cmd.h.
The function mlx5_fs_cmd_is_fw_term_table() checks if a flow table is a
flow steering termination table and so should be handled by FW steering.
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20240911201757.1505453-4-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Fix all the '-ret' returns in error flows to 'ret',
as the internal functions already return negative error values
(e.g. -EINVAL).
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20240911201757.1505453-3-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Change all the function comments to adhere to kernel-doc formatting.
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20240911201757.1505453-2-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Cross-merge networking fixes after downstream PR.
No conflicts (sort of) and no adjacent changes.
This merge reverts commit b3c9e65eb2 ("net: hsr: remove seqnr_lock")
from net, as it was superseded by
commit 430d67bdcb ("net: hsr: Use the seqnr lock for frames received via interlink port.")
in net-next.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Expose IFC bits to support the new memory scheme on-demand paging.
Change the macro that reads ODP capabilities so it can read from the
new IFC layout, and align the code in upper layers so it still compiles.
Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Link: https://patch.msgid.link/20240909100504.29797-3-michaelgur@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Currently, trying to set the bridge mode attribute when numvfs=0 leads to a
crash:
bridge link set dev eth2 hwmode vepa
[ 168.967392] BUG: kernel NULL pointer dereference, address: 0000000000000030
[...]
[ 168.969989] RIP: 0010:mlx5_add_flow_rules+0x1f/0x300 [mlx5_core]
[...]
[ 168.976037] Call Trace:
[ 168.976188] <TASK>
[ 168.978620] _mlx5_eswitch_set_vepa_locked+0x113/0x230 [mlx5_core]
[ 168.979074] mlx5_eswitch_set_vepa+0x7f/0xa0 [mlx5_core]
[ 168.979471] rtnl_bridge_setlink+0xe9/0x1f0
[ 168.979714] rtnetlink_rcv_msg+0x159/0x400
[ 168.980451] netlink_rcv_skb+0x54/0x100
[ 168.980675] netlink_unicast+0x241/0x360
[ 168.980918] netlink_sendmsg+0x1f6/0x430
[ 168.981162] ____sys_sendmsg+0x3bb/0x3f0
[ 168.982155] ___sys_sendmsg+0x88/0xd0
[ 168.985036] __sys_sendmsg+0x59/0xa0
[ 168.985477] do_syscall_64+0x79/0x150
[ 168.987273] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 168.987773] RIP: 0033:0x7f8f7950f917
(esw->fdb_table.legacy.vepa_fdb is null)
The bridge mode is only relevant when there are multiple functions per
port. Therefore, prevent setting and getting this setting when there are no
VFs.
Note that after this change, there are no settings to change on the PF
interface using `bridge link` when there are no VFs, so the interface no
longer appears in the `bridge link` output.
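An illustrative sketch of the guard (the accessor used here is an assumption,
not the exact driver code):
static int esw_bridge_setlink_guard_sketch(struct mlx5_eswitch *esw)
{
	/* Bridge mode only makes sense with multiple functions per port;
	 * bail out when no VFs are configured.
	 */
	if (!esw || !esw->esw_funcs.num_vfs)
		return -EOPNOTSUPP;

	return 0;
}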
Fixes: 4b89251de0 ("net/mlx5: Support ndo bridge_setlink and getlink")
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Before creating a scheduling element in a NIC or E-Switch scheduler,
ensure that the requested element type is supported. If the element is
of type Transmit Scheduling Arbiter (TSAR), also verify that the
specific TSAR type is supported.
Fixes: 214baf2287 ("net/mlx5e: Support HTB offload")
Fixes: 85c5f7c920 ("net/mlx5: E-switch, Create QoS on demand")
Fixes: 0fe132eac3 ("net/mlx5: E-switch, Allow to add vports to rate groups")
Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Ensure the scheduling element type and TSAR type are explicitly
initialized in the QoS rate group creation.
This prevents potential issues due to default values.
Fixes: 1ae258f8b3 ("net/mlx5: E-switch, Introduce rate limiting groups API")
Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Add MLX5E_400GAUI_8_400GBASE_CR8 to the extended modes
in ptys2ext_ethtool_table, since it was missing.
Fixes: 6a89737241 ("net/mlx5: ethtool, Add ethtool support for 50Gbps per lane link modes")
Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Add MLX5E_1000BASE_T and MLX5E_100BASE_TX to the legacy
modes in ptys2legacy_ethtool_table, since they were missing.
Fixes: 665bc53969 ("net/mlx5e: Use new ethtool get/set link ksettings API")
Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Add the upcoming ConnectX-9 device ID to the table of supported
PCI device IDs.
Fixes: f908a35b22 ("net/mlx5: Update the list of the PCI supported devices")
Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>