Commit Graph

39629 Commits

Author SHA1 Message Date
Guangbin Huang
276e604216 net: hns3: PF enable promisc for VF when mac table is overflow
If unicast mac address table is full, and user add a new mac address, the
unicast promisc needs to be enabled for the new unicast mac address can be
used. So does the multicast promisc.

Now this feature has been implemented for PF, and VF should be implemented
too. When the mac table of VF is overflow, PF will enable promisc for this
VF.

Fixes: 1e6e76101f ("net: hns3: configure promisc mode for VF asynchronously")
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-29 11:03:54 +01:00
Jian Shen
108b3c7810 net: hns3: fix show wrong state when add existing uc mac address
Currently, if function adds an existing unicast mac address, eventhough
driver will not add this address into hardware, but it will return 0 in
function hclge_add_uc_addr_common(). It will cause the state of this
unicast mac address is ACTIVE in driver, but it should be in TO-ADD state.

To fix this problem, function hclge_add_uc_addr_common() returns -EEXIST
if mac address is existing, and delete two error log to avoid printing
them all the time after this modification.

Fixes: 72110b5674 ("net: hns3: return 0 and print warning when hit duplicate MAC")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-29 11:03:54 +01:00
Jian Shen
0472e95ffe net: hns3: fix mixed flag HCLGE_FLAG_MQPRIO_ENABLE and HCLGE_FLAG_DCB_ENABLE
HCLGE_FLAG_MQPRIO_ENABLE is supposed to set when enable
multiple TCs with tc mqprio, and HCLGE_FLAG_DCB_ENABLE is
supposed to set when enable multiple TCs with ets. But
the driver mixed the flags when updating the tm configuration.

Furtherly, PFC should be available when HCLGE_FLAG_MQPRIO_ENABLE
too, so remove the unnecessary limitation.

Fixes: 5a5c909174 ("net: hns3: add support for tc mqprio offload")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-29 11:03:54 +01:00
Jian Shen
d82650be60 net: hns3: don't rollback when destroy mqprio fail
For destroy mqprio is irreversible in stack, so it's unnecessary
to rollback the tc configuration when destroy mqprio failed.
Otherwise, it may cause the configuration being inconsistent
between driver and netstack.

As the failure is usually caused by reset, and the driver will
restore the configuration after reset, so it can keep the
configuration being consistent between driver and hardware.

Fixes: 5a5c909174 ("net: hns3: add support for tc mqprio offload")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-29 11:03:54 +01:00
Jian Shen
a8e76fefe3 net: hns3: remove tc enable checking
Currently, in function hns3_nic_set_real_num_queue(), the
driver doesn't report the queue count and offset for disabled
tc. If user enables multiple TCs, but only maps user
priorities to partial of them, it may cause the queue range
of the unmapped TC being displayed abnormally.

Fix it by removing the tc enable checking, ensure the queue
count is not zero.

With this change, the tc_en is useless now, so remove it.

Fixes: a75a8efa00 ("net: hns3: Fix tc setup when netdev is first up")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-29 11:03:54 +01:00
Jian Shen
5b09e88e1b net: hns3: do not allow call hns3_nic_net_open repeatedly
hns3_nic_net_open() is not allowed to called repeatly, but there
is no checking for this. When doing device reset and setup tc
concurrently, there is a small oppotunity to call hns3_nic_net_open
repeatedly, and cause kernel bug by calling napi_enable twice.

The calltrace information is like below:
[ 3078.222780] ------------[ cut here ]------------
[ 3078.230255] kernel BUG at net/core/dev.c:6991!
[ 3078.236224] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[ 3078.243431] Modules linked in: hns3 hclgevf hclge hnae3 vfio_iommu_type1 vfio_pci vfio_virqfd vfio pv680_mii(O)
[ 3078.258880] CPU: 0 PID: 295 Comm: kworker/u8:5 Tainted: G           O      5.14.0-rc4+ #1
[ 3078.269102] Hardware name:  , BIOS KpxxxFPGA 1P B600 V181 08/12/2021
[ 3078.276801] Workqueue: hclge hclge_service_task [hclge]
[ 3078.288774] pstate: 60400009 (nZCv daif +PAN -UAO -TCO BTYPE=--)
[ 3078.296168] pc : napi_enable+0x80/0x84
tc qdisc sho[w  3d0e7v8 .e3t0h218 79] lr : hns3_nic_net_open+0x138/0x510 [hns3]

[ 3078.314771] sp : ffff8000108abb20
[ 3078.319099] x29: ffff8000108abb20 x28: 0000000000000000 x27: ffff0820a8490300
[ 3078.329121] x26: 0000000000000001 x25: ffff08209cfc6200 x24: 0000000000000000
[ 3078.339044] x23: ffff0820a8490300 x22: ffff08209cd76000 x21: ffff0820abfe3880
[ 3078.349018] x20: 0000000000000000 x19: ffff08209cd76900 x18: 0000000000000000
[ 3078.358620] x17: 0000000000000000 x16: ffffc816e1727a50 x15: 0000ffff8f4ff930
[ 3078.368895] x14: 0000000000000000 x13: 0000000000000000 x12: 0000259e9dbeb6b4
[ 3078.377987] x11: 0096a8f7e764eb40 x10: 634615ad28d3eab5 x9 : ffffc816ad8885b8
[ 3078.387091] x8 : ffff08209cfc6fb8 x7 : ffff0820ac0da058 x6 : ffff0820a8490344
[ 3078.396356] x5 : 0000000000000140 x4 : 0000000000000003 x3 : ffff08209cd76938
[ 3078.405365] x2 : 0000000000000000 x1 : 0000000000000010 x0 : ffff0820abfe38a0
[ 3078.414657] Call trace:
[ 3078.418517]  napi_enable+0x80/0x84
[ 3078.424626]  hns3_reset_notify_up_enet+0x78/0xd0 [hns3]
[ 3078.433469]  hns3_reset_notify+0x64/0x80 [hns3]
[ 3078.441430]  hclge_notify_client+0x68/0xb0 [hclge]
[ 3078.450511]  hclge_reset_rebuild+0x524/0x884 [hclge]
[ 3078.458879]  hclge_reset_service_task+0x3c4/0x680 [hclge]
[ 3078.467470]  hclge_service_task+0xb0/0xb54 [hclge]
[ 3078.475675]  process_one_work+0x1dc/0x48c
[ 3078.481888]  worker_thread+0x15c/0x464
[ 3078.487104]  kthread+0x160/0x170
[ 3078.492479]  ret_from_fork+0x10/0x18
[ 3078.498785] Code: c8027c81 35ffffa2 d50323bf d65f03c0 (d4210000)
[ 3078.506889] ---[ end trace 8ebe0340a1b0fb44 ]---

Once hns3_nic_net_open() is excute success, the flag
HNS3_NIC_STATE_DOWN will be cleared. So add checking for this
flag, directly return when HNS3_NIC_STATE_DOWN is no set.

Fixes: e888402789 ("net: hns3: call hns3_nic_net_open() while doing HNAE3_UP_CLIENT")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-29 11:03:54 +01:00
Feng Zhou
513e605d7a ixgbe: Fix NULL pointer dereference in ixgbe_xdp_setup
The ixgbe driver currently generates a NULL pointer dereference with
some machine (online cpus < 63). This is due to the fact that the
maximum value of num_xdp_queues is nr_cpu_ids. Code is in
"ixgbe_set_rss_queues"".

Here's how the problem repeats itself:
Some machine (online cpus < 63), And user set num_queues to 63 through
ethtool. Code is in the "ixgbe_set_channels",
	adapter->ring_feature[RING_F_FDIR].limit = count;

It becomes 63.

When user use xdp, "ixgbe_set_rss_queues" will set queues num.
	adapter->num_rx_queues = rss_i;
	adapter->num_tx_queues = rss_i;
	adapter->num_xdp_queues = ixgbe_xdp_queues(adapter);

And rss_i's value is from
	f = &adapter->ring_feature[RING_F_FDIR];
	rss_i = f->indices = f->limit;

So "num_rx_queues" > "num_xdp_queues", when run to "ixgbe_xdp_setup",
	for (i = 0; i < adapter->num_rx_queues; i++)
		if (adapter->xdp_ring[i]->xsk_umem)

It leads to panic.

Call trace:
[exception RIP: ixgbe_xdp+368]
RIP: ffffffffc02a76a0  RSP: ffff9fe16202f8d0  RFLAGS: 00010297
RAX: 0000000000000000  RBX: 0000000000000020  RCX: 0000000000000000
RDX: 0000000000000000  RSI: 000000000000001c  RDI: ffffffffa94ead90
RBP: ffff92f8f24c0c18   R8: 0000000000000000   R9: 0000000000000000
R10: ffff9fe16202f830  R11: 0000000000000000  R12: ffff92f8f24c0000
R13: ffff9fe16202fc01  R14: 000000000000000a  R15: ffffffffc02a7530
ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 7 [ffff9fe16202f8f0] dev_xdp_install at ffffffffa89fbbcc
 8 [ffff9fe16202f920] dev_change_xdp_fd at ffffffffa8a08808
 9 [ffff9fe16202f960] do_setlink at ffffffffa8a20235
10 [ffff9fe16202fa88] rtnl_setlink at ffffffffa8a20384
11 [ffff9fe16202fc78] rtnetlink_rcv_msg at ffffffffa8a1a8dd
12 [ffff9fe16202fcf0] netlink_rcv_skb at ffffffffa8a717eb
13 [ffff9fe16202fd40] netlink_unicast at ffffffffa8a70f88
14 [ffff9fe16202fd80] netlink_sendmsg at ffffffffa8a71319
15 [ffff9fe16202fdf0] sock_sendmsg at ffffffffa89df290
16 [ffff9fe16202fe08] __sys_sendto at ffffffffa89e19c8
17 [ffff9fe16202ff30] __x64_sys_sendto at ffffffffa89e1a64
18 [ffff9fe16202ff38] do_syscall_64 at ffffffffa84042b9
19 [ffff9fe16202ff50] entry_SYSCALL_64_after_hwframe at ffffffffa8c0008c

So I fix ixgbe_max_channels so that it will not allow a setting of queues
to be higher than the num_online_cpus(). And when run to ixgbe_xdp_setup,
take the smaller value of num_rx_queues and num_xdp_queues.

Fixes: 4a9b32f30f ("ixgbe: fix potential RX buffer starvation for AF_XDP")
Signed-off-by: Feng Zhou <zhoufeng.zf@bytedance.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-29 10:51:51 +01:00
David S. Miller
49f01349d1 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/nex
t-queue

Tony Nguyen says:

====================
100GbE Intel Wired LAN Driver Updates 2021-09-28

This series contains updates to ice driver only.

Dave adds support for QoS DSCP allowing for DSCP to TC mapping via APP
TLVs.

Ani adds enforcement of DSCP to only supported devices with the
introduction of a feature bitmap and corrects messaging of unsupported
modules based on link mode.

Jake refactors devlink info functions to be void as the functions no
longer return errors.

Jeff fixes a macro name to properly reflect the value.

Len Baker converts a kzalloc allocation to, the preferred, kcalloc.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-29 10:30:56 +01:00
Naveen Mamindlapalli
43510ef4dd octeontx2-nicvf: Add PTP hardware clock support to NIX VF
This patch adds PTP PHC support to NIX VF interfaces. This enables
a VF to run PTP master/slave instance. PTP block being a shared
hardware resource it is recommended to avoid running multiple
PTP instances in the system which will impact the PTP clock
accuracy.

Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-29 10:27:33 +01:00
Rakesh Babu
ffd2f89ad0 octeontx2-pf: Enable promisc/allmulti match MCAM entries.
Whenever the interface is brought up/down then set_rx_mode
function is called by the stack which enables promisc/allmulti
MCAM entries. But there are cases when driver brings
interface down and then up such as while changing number
of channels. In these cases promisc/allmulti MCAM entries
are left disabled as set_rx_mode callback is not called.
This patch enables these MCAM entries in all such cases.

Signed-off-by: Rakesh Babu <rsaladi2@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-29 10:27:33 +01:00
Len Baker
30cba287eb ice: Prefer kcalloc over open coded arithmetic
As noted in the "Deprecated Interfaces, Language Features, Attributes,
and Conventions" documentation [1], size calculations (especially
multiplication) should not be performed in memory allocator (or similar)
function arguments due to the risk of them overflowing. This could lead
to values wrapping around and a smaller allocation being made than the
caller was expecting. Using those allocations could lead to linear
overflows of heap memory and other misbehaviors.

In this case this is not actually dynamic sizes: both sides of the
multiplication are constant values. However it is best to refactor this
anyway, just to keep the open-coded math idiom out of code.

So, use the purpose specific kcalloc() function instead of the argument
size * count in the kzalloc() function.

[1] https://www.kernel.org/doc/html/v5.14/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments

Signed-off-by: Len Baker <len.baker@gmx.com>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-09-28 09:42:04 -07:00
Jeff Guo
b37e4e94c1 ice: Fix macro name for IPv4 fragment flag
In IPv4 header, fragment flags indicate whether the packet needs
to be fragmented or not. The value 0x20 represents MF (More Fragment); fix
the macro name to match this.

Signed-off-by: Ting Xu <ting.xu@intel.com>
Signed-off-by: Jeff Guo <jia.guo@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-09-28 09:42:04 -07:00
Jacob Keller
0128cc6e92 ice: refactor devlink getter/fallback functions to void
After commit a8f89fa277 ("ice: do not abort devlink info if board
identifier can't be found"), the getter/fallback() functions no longer
report an error. Convert the interface to a void so that it is no
longer possible to add a version field that is fatal. This makes
sense, because we should not fail to report other versions just
because one of the version pieces could not be found.

Finally, clean up the getter functions line wrapping so that none of
them take more than 80 columns, as is the usual style for networking
files.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-09-28 09:42:04 -07:00
Anirudh Venkataramanan
4fc5fbee5c ice: Fix link mode handling
The messaging for unsupported module detection is different for
lenient mode and strict mode. Update the code to print the right
messaging for a given link mode.

Media topology conflict is not an error in lenient mode, so return
an error code only if not in lenient mode.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-09-28 09:42:04 -07:00
Anirudh Venkataramanan
40b247608b ice: Add feature bitmap, helpers and a check for DSCP
DSCP a.k.a L3 QoS is only supported on certain devices. To enforce this,
this patch introduces a bitmap of features and helper functions.

The feature bitmap is set based on device IDs on driver init. Currently,
DSCP is the only feature in this bitmap, but there will be more in the
future. In the DCB netlink flow, check if the feature bit is set before
exercising DSCP.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-09-28 09:42:04 -07:00
Dave Ertman
2a87bd73e5 ice: Add DSCP support
Implement code to handle submission of APP TLV's
containing DSCP to TC mapping.

The first such mapping received on an interface
will cause that PF to switch to L3 DSCP QoS mode,
apply the default config for that mode, and apply
the received mapping.

Only one such mapping will be allowed per DSCP value,
and when the last DSCP mapping is deleted, the PF
will switch back into L2 VLAN QoS mode, applying the
appropriate default QoS settings.

L3 DSCP QoS mode will only be allowed in SW DCBx
mode, in other words, when the FW LLDP engine is
disabled.  Commands that break this mutual exclusivity
will be blocked.

Co-developed-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-09-28 09:42:04 -07:00
Arnd Bergmann
1e0083bd07 gve: DQO: avoid unused variable warnings
The use of dma_unmap_addr()/dma_unmap_len() in the driver causes
multiple warnings when these macros are defined as empty, e.g.
in an ARCH=i386 allmodconfig build:

drivers/net/ethernet/google/gve/gve_tx_dqo.c: In function 'gve_tx_add_skb_no_copy_dqo':
drivers/net/ethernet/google/gve/gve_tx_dqo.c:494:40: error: unused variable 'buf' [-Werror=unused-variable]
  494 |                 struct gve_tx_dma_buf *buf =

This is not how the NEED_DMA_MAP_STATE macros are meant to work,
as they rely on never using local variables or a temporary structure
like gve_tx_dma_buf.

Remote the gve_tx_dma_buf definition and open-code the contents
in all places to avoid the warning. This causes some rather long
lines but otherwise ends up making the driver slightly smaller.

Fixes: a57e5de476 ("gve: DQO: Add TX path")
Link: https://lore.kernel.org/netdev/20210723231957.1113800-1-bcf@google.com/
Link: https://lore.kernel.org/netdev/20210721151100.2042139-1-arnd@kernel.org/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-28 15:24:36 +01:00
Geetha sowjanya
af3826db74 octeontx2-pf: Use hardware register for CQE count
Current driver uses software CQ head pointer to poll on CQE
header in memory to determine if CQE is valid. Software needs
to make sure, that the reads of the CQE do not get re-ordered
so much that it ends up with an inconsistent view of the CQE.
To ensure that DMB barrier after read to first CQE cacheline
and before reading of the rest of the CQE is needed.
But having barrier for every CQE read will impact the performance,
instead use hardware CQ head and tail pointers to find the
valid number of CQEs.

Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-28 14:10:24 +01:00
Yi Guo
99bbc4ae69 octeontx2-af: Add external ptp input clock
PTP hardware block can be configured to utilize
the external clock. Also the current ptp timestamp
can be captured when external trigger is applied on
a gpio pin. These features are required in scenarios
like connecting a external timing device to the chip
for time synchronization. The timing device provides
the clock and trigger(PPS signal) to the PTP block.
This patch does the following:
1. configures PTP block to use external clock
frequency and timestamp capture on external event.
2. sends PTP_REQ_EXTTS events to kernel ptp phc susbsytem
with captured timestamps
3. aligns PPS edge to adjusted ptp clock in the ptp device
by setting the PPS_THRESH to the reminder of the last
timestamp value captured by external PPS

Signed-off-by: Yi Guo <yig@marvell.com>
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-28 13:50:37 +01:00
Subbaraya Sundeep
e266f66393 octeontx2-af: Use ptp input clock info from firmware data
The input clock frequency of PTP block is figured
out from hardware reset block currently. The firmware
data already has this info in sclk. Hence simplify
ptp driver to use sclk from firmware data.

Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-28 13:50:37 +01:00
Hariprasad Kelam
d148920868 octeontx2-af: cn10k: RPM hardware timestamp configuration
MAC on CN10K support hardware timestamping such that 8 bytes addition
header is prepended to incoming packets. This patch does necessary
configuration to enable Hardware time stamping upon receiving request
from PF netdev interfaces.

Timestamp configuration is different on MAC (CGX) Octeontx2 silicon
and MAC (RPM) OcteonTX3 CN10k. Based on silicon variant appropriate
fn() pointer is called. Refactor MAC specific mbox messages to remove
unnecessary gaps in mboxids.

Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-28 13:50:37 +01:00
Harman Kalra
e37e08fffc octeontx2-af: Reset PTP config in FLR handler
Upon receiving ptp config request from netdev interface , Octeontx2 MAC
block CGX is configured to append timestamp to every incoming packet
and NPC config is updated with DMAC offset change.

Currently this configuration is not reset in FLR handler. This patch
resets the same.

Signed-off-by: Harman Kalra <hkalra@marvell.com>
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-28 13:50:37 +01:00
Arnd Bergmann
c894b51e2a net: hns3: fix hclge_dbg_dump_tm_pg() stack usage
This function copies strings around between multiple buffers
including a large on-stack array that causes a build warning
on 32-bit systems:

drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c: In function 'hclge_dbg_dump_tm_pg':
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c:782:1: error: the frame size of 1424 bytes is larger than 1400 bytes [-Werror=frame-larger-than=]

The function can probably be cleaned up a lot, to go back to
printing directly into the output buffer, but dynamically allocating
the structure is a simpler workaround for now.

Fixes: 04d96139dd ("net: hns3: refine function hclge_dbg_dump_tm_pri()")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-28 13:31:47 +01:00
Randy Dunlap
103bde372f net: sun: SUNVNET_COMMON should depend on INET
When CONFIG_INET is not set, there are failing references to IPv4
functions, so make this driver depend on INET.

Fixes these build errors:

sparc64-linux-ld: drivers/net/ethernet/sun/sunvnet_common.o: in function `sunvnet_start_xmit_common':
sunvnet_common.c:(.text+0x1a68): undefined reference to `__icmp_send'
sparc64-linux-ld: drivers/net/ethernet/sun/sunvnet_common.o: in function `sunvnet_poll_common':
sunvnet_common.c:(.text+0x358c): undefined reference to `ip_send_check'

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Aaron Young <aaron.young@oracle.com>
Cc: Rashmi Narasimhan <rashmi.narasimhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-28 13:20:21 +01:00
Shannon Nelson
c23bb54f28 ionic: fix gathering of debug stats
Don't print stats for which we haven't reserved space as it can
cause nasty memory bashing and related bad behaviors.

Fixes: aa620993b1 ("ionic: pull per-q stats work out of queue loops")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-28 13:19:57 +01:00
David S. Miller
3fb2a54b41 Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/t
nguy/net-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2021-09-27

This series contains updates to e100 driver only.

Jake corrects under allocation of register buffer due to incorrect
calculations and fixes buffer overrun of register dump.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-28 13:18:33 +01:00
Arnd Bergmann
51bb08dd04 net: ks8851: fix link error
An object file cannot be built for both loadable module and built-in
use at the same time:

arm-linux-gnueabi-ld: drivers/net/ethernet/micrel/ks8851_common.o: in function `ks8851_probe_common':
ks8851_common.c:(.text+0xf80): undefined reference to `__this_module'

Change the ks8851_common code to be a standalone module instead,
and use Makefile logic to ensure this is built-in if at least one
of its two users is.

Fixes: 797047f875 ("net: ks8851: Implement Parallel bus operations")
Link: https://lore.kernel.org/netdev/20210125121937.3900988-1-arnd@kernel.org/
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Marek Vasut <marex@denx.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-28 13:11:20 +01:00
Arnd Bergmann
d68c2e1d19 net: stmmac: fix off-by-one error in sanity check
My previous patch had an off-by-one error in the added sanity
check, the arrays are MTL_MAX_{RX,TX}_QUEUES long, so if that
index is that number, it has overflown.

The patch silenced the warning anyway because the strings could
no longer overlap with the input, but they could still overlap
with other fields.

Fixes: 3e0d5699a9 ("net: stmmac: fix gcc-10 -Wrestrict warning")
Reported-by: Russell King (Oracle) <linux@armlinux.org.uk>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-28 13:09:02 +01:00
Arnd Bergmann
861f40fa0e am65-cpsw: avoid null pointer arithmetic
clang warns about arithmetic on NULL pointers:

drivers/net/ethernet/ti/am65-cpsw-ethtool.c:71:2: error: performing pointer subtraction with a null pointer has undefined behavior [-Werror,-Wnull-pointer-subtraction]
        AM65_CPSW_REGDUMP_REC(AM65_CPSW_REGDUMP_MOD_NUSS, 0x0, 0x1c),
        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/net/ethernet/ti/am65-cpsw-ethtool.c:64:29: note: expanded from macro 'AM65_CPSW_REGDUMP_REC'
        .hdr.len = (((u32 *)(end)) - ((u32 *)(start)) + 1) * sizeof(u32) * 2 + \
                                   ^ ~~~~~~~~~~~~~~~~

The expression here is easily changed to a calculation based on integers
that is no less readable.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-28 13:06:09 +01:00
MichelleJin
d7cade5137 net/mlx5e: check return value of rhashtable_init
When rhashtable_init() fails, it returns -EINVAL.
However, since error return value of rhashtable_init is not checked,
it can cause use of uninitialized pointers.
So, fix unhandled errors of rhashtable_init.

Signed-off-by: MichelleJin <shjy180909@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-28 12:59:24 +01:00
Magnus Karlsson
6aab0bb0c5 i40e: Use the xsk batched rx allocation interface
Use the new xsk batched rx allocation interface for the zero-copy data
path. As the array of struct xdp_buff pointers kept by the driver is
really a ring that wraps, the allocation routine is modified to detect
a wrap and in that case call the allocation function twice. The
allocation function cannot deal with wrapped rings, only arrays. As we
now know exactly how many buffers we get and that there is no
wrapping, the allocation function can be simplified even more as all
if-statements in the allocation loop can be removed, improving
performance.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210922075613.12186-6-magnus.karlsson@gmail.com
2021-09-28 00:18:35 +02:00
Magnus Karlsson
db804cfc21 ice: Use the xsk batched rx allocation interface
Use the new xsk batched rx allocation interface for the zero-copy data
path. As the array of struct xdp_buff pointers kept by the driver is
really a ring that wraps, the allocation routine is modified to detect
a wrap and in that case call the allocation function twice. The
allocation function cannot deal with wrapped rings, only arrays. As we
now know exactly how many buffers we get and that there is no
wrapping, the allocation function can be simplified even more as all
if-statements in the allocation loop can be removed, improving
performance.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210922075613.12186-5-magnus.karlsson@gmail.com
2021-09-28 00:18:35 +02:00
Magnus Karlsson
57f7f8b6bc ice: Use xdp_buf instead of rx_buf for xsk zero-copy
In order to use the new xsk batched buffer allocation interface, a
pointer to an array of struct xsk_buff pointers need to be provided so
that the function can put the result of the allocation there. In the
ice driver, we already have a ring that stores pointers to
xdp_buffs. This is only used for the xsk zero-copy driver and is a
union with the structure that is used for the regular non zero-copy
path. Unfortunately, that structure is larger than the xdp_buffs
pointers which mean that there will be a stride (of 20 bytes) between
each xdp_buff pointer. And feeding this into the xsk_buff_alloc_batch
interface will not work since it assumes a regular array of xdp_buff
pointers (each 8 bytes with 0 bytes in-between them on a 64-bit
system).

To fix this, remove the xdp_buff pointer from the rx_buf union and
move it one step higher to the union above which only has pointers to
arrays in it. This solves the problem and we can directly feed the SW
ring of xdp_buff pointers straight into the allocation function in the
next patch when that interface is used. This will improve performance.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210922075613.12186-4-magnus.karlsson@gmail.com
2021-09-28 00:18:34 +02:00
Jacob Keller
51032e6f17 e100: fix buffer overrun in e100_get_regs
The e100_get_regs function is used to implement a simple register dump
for the e100 device. The data is broken into a couple of MAC control
registers, and then a series of PHY registers, followed by a memory dump
buffer.

The total length of the register dump is defined as (1 + E100_PHY_REGS)
* sizeof(u32) + sizeof(nic->mem->dump_buf).

The logic for filling in the PHY registers uses a convoluted inverted
count for loop which counts from E100_PHY_REGS (0x1C) down to 0, and
assigns the slots 1 + E100_PHY_REGS - i. The first loop iteration will
fill in [1] and the final loop iteration will fill in [1 + 0x1C]. This
is actually one more than the supposed number of PHY registers.

The memory dump buffer is then filled into the space at
[2 + E100_PHY_REGS] which will cause that memcpy to assign 4 bytes past
the total size.

The end result is that we overrun the total buffer size allocated by the
kernel, which could lead to a panic or other issues due to memory
corruption.

It is difficult to determine the actual total number of registers
here. The only 8255x datasheet I could find indicates there are 28 total
MDI registers. However, we're reading 29 here, and reading them in
reverse!

In addition, the ethtool e100 register dump interface appears to read
the first PHY register to determine if the device is in MDI or MDIx
mode. This doesn't appear to be documented anywhere within the 8255x
datasheet. I can only assume it must be in register 28 (the extra
register we're reading here).

Lets not change any of the intended meaning of what we copy here. Just
extend the space by 4 bytes to account for the extra register and
continue copying the data out in the same order.

Change the E100_PHY_REGS value to be the correct total (29) so that the
total register dump size is calculated properly. Fix the offset for
where we copy the dump buffer so that it doesn't overrun the total size.

Re-write the for loop to use counting up instead of the convoluted
down-counting. Correct the mdio_read offset to use the 0-based register
offsets, but maintain the bizarre reverse ordering so that we have the
ABI expected by applications like ethtool. This requires and additional
subtraction of 1. It seems a bit odd but it makes the flow of assignment
into the register buffer easier to follow.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: Felicitas Hetzelt <felicitashetzelt@gmail.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-09-27 08:57:30 -07:00
Jacob Keller
4329c8dc11 e100: fix length calculation in e100_get_regs_len
commit abf9b90205 ("e100: cleanup unneeded math") tried to simplify
e100_get_regs_len and remove a double 'divide and then multiply'
calculation that the e100_reg_regs_len function did.

This change broke the size calculation entirely as it failed to account
for the fact that the numbered registers are actually 4 bytes wide and
not 1 byte. This resulted in a significant under allocation of the
register buffer used by e100_get_regs.

Fix this by properly multiplying the register count by u32 first before
adding the size of the dump buffer.

Fixes: abf9b90205 ("e100: cleanup unneeded math")
Reported-by: Felicitas Hetzelt <felicitashetzelt@gmail.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-09-27 08:57:29 -07:00
Doug Berger
2d8bdf525d net: bcmgenet: add support for ethtool flow control
This commit extends the supported ethtool operations to allow MAC
level flow control to be configured for the bcmgenet driver.

The ethtool utility can be used to change the configuration of
auto-negotiated symmetric and asymmetric modes as well as manually
configuring support for RX and TX Pause frames individually.

Signed-off-by: Doug Berger <opendmb@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-27 16:34:52 +01:00
Doug Berger
fc13d8c037 net: bcmgenet: pull mac_config from adjust_link
This commit separates out the MAC configuration that occurs on a
PHY state change into a function named bcmgenet_mac_config().

This allows the function to be called directly elsewhere.

Signed-off-by: Doug Berger <opendmb@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-27 16:34:52 +01:00
Doug Berger
fcb5dfe7dc net: bcmgenet: remove old link state values
The PHY state machine has been fixed to only call the adjust_link
callback when the link state has changed. Therefore the old link
state variables are no longer needed to detect a change in link
state.

This commit effectively reverts
commit 5ad6e6c508 ("net: bcmgenet: improve bcmgenet_mii_setup()")

Signed-off-by: Doug Berger <opendmb@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-27 16:34:51 +01:00
Doug Berger
50e356686f net: bcmgenet: remove netif_carrier_off from adjust_link
The bcmgenet_mii_setup() function is registered as the adjust_link
callback from the phylib for the GENET driver.

The phylib always sets the netif_carrier according to phydev->link
prior to invoking the adjust_link callback, so there is no need to
repeat that in the link down case within the network driver.

Signed-off-by: Doug Berger <opendmb@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-27 16:34:51 +01:00
Leon Romanovsky
0d98ff22de net: ethernet: ti: Move devlink registration to be last devlink command
This change prevents from users to access device before devlink is
fully configured.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-27 16:31:59 +01:00
Leon Romanovsky
1b8e0bdbea qed: Move devlink registration to be last devlink command
This change prevents from users to access device before devlink is
fully configured.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-27 16:31:59 +01:00
Leon Romanovsky
7911c8bd54 ionic: Move devlink registration to be last devlink command
This change prevents from users to access device before devlink is
fully configured.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-27 16:31:59 +01:00
Leon Romanovsky
4f2a81c40c nfp: Move delink_register to be last command
Open user space access to the devlink after driver is probed.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-27 16:31:59 +01:00
Leon Romanovsky
67d78e7f76 net: mscc: ocelot: delay devlink registration to the end
Open access to the devlink interface when the driver fully initialized.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-27 16:31:59 +01:00
Leon Romanovsky
b2ab483fcb mlxsw: core: Register devlink instance last
Make sure that devlink is open to receive user input when all
parameters are initialized.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-27 16:31:59 +01:00
Leon Romanovsky
64ea2d0e72 net/mlx5: Accept devlink user input after driver initialization complete
The change of devlink_alloc() to accept device makes sure that device
is fully initialized and device_register() does nothing except allowing
users to use that devlink instance.

Such change ensures that no user input will be usable till that point and
it eliminates the need to worry about internal locking as long as devlink_register
is called last since all accesses to the devlink are during initialization.

This change fixes the following lockdep warning.

 ======================================================
 WARNING: possible circular locking dependency detected
 5.14.0-rc2+ #27 Not tainted
 ------------------------------------------------------
 devlink/265 is trying to acquire lock:
 ffff8880133c2bc0 (&dev->intf_state_mutex){+.+.}-{3:3}, at: mlx5_unload_one+0x1e/0xa0 [mlx5_core]
 but task is already holding lock:
 ffffffff8362b468 (devlink_mutex){+.+.}-{3:3}, at: devlink_nl_pre_doit+0x2b/0x8d0
 which lock already depends on the new lock.
 the existing dependency chain (in reverse order) is:

 -> #1 (devlink_mutex){+.+.}-{3:3}:
        __mutex_lock+0x149/0x1310
        devlink_register+0xe7/0x280
        mlx5_devlink_register+0x118/0x480 [mlx5_core]
        mlx5_init_one+0x34b/0x440 [mlx5_core]
        probe_one+0x480/0x6e0 [mlx5_core]
        pci_device_probe+0x2a0/0x4a0
        really_probe+0x1cb/0xba0
        __driver_probe_device+0x18f/0x470
        driver_probe_device+0x49/0x120
        __driver_attach+0x1ce/0x400
        bus_for_each_dev+0x11e/0x1a0
        bus_add_driver+0x309/0x570
        driver_register+0x20f/0x390
        0xffffffffa04a0062
        do_one_initcall+0xd5/0x400
        do_init_module+0x1c8/0x760
        load_module+0x7d9d/0xa4b0
        __do_sys_finit_module+0x118/0x1a0
        do_syscall_64+0x3d/0x90
        entry_SYSCALL_64_after_hwframe+0x44/0xae

 -> #0 (&dev->intf_state_mutex){+.+.}-{3:3}:
        __lock_acquire+0x2999/0x5a40
        lock_acquire+0x1a9/0x4a0
        __mutex_lock+0x149/0x1310
        mlx5_unload_one+0x1e/0xa0 [mlx5_core]
        mlx5_devlink_reload_down+0x185/0x2b0 [mlx5_core]
        devlink_reload+0x1f2/0x640
        devlink_nl_cmd_reload+0x6c3/0x10d0
        genl_family_rcv_msg_doit+0x1e9/0x2f0
        genl_rcv_msg+0x27f/0x4a0
        netlink_rcv_skb+0x11e/0x340
        genl_rcv+0x24/0x40
        netlink_unicast+0x433/0x700
        netlink_sendmsg+0x6fb/0xbe0
        sock_sendmsg+0xb0/0xe0
        __sys_sendto+0x192/0x240
        __x64_sys_sendto+0xdc/0x1b0
        do_syscall_64+0x3d/0x90
        entry_SYSCALL_64_after_hwframe+0x44/0xae

 other info that might help us debug this:

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(devlink_mutex);
                                lock(&dev->intf_state_mutex);
                                lock(devlink_mutex);
   lock(&dev->intf_state_mutex);

  *** DEADLOCK ***

 3 locks held by devlink/265:
  #0: ffffffff836371d0 (cb_lock){++++}-{3:3}, at: genl_rcv+0x15/0x40
  #1: ffffffff83637288 (genl_mutex){+.+.}-{3:3}, at: genl_rcv_msg+0x31a/0x4a0
  #2: ffffffff8362b468 (devlink_mutex){+.+.}-{3:3}, at: devlink_nl_pre_doit+0x2b/0x8d0

 stack backtrace:
 CPU: 0 PID: 265 Comm: devlink Not tainted 5.14.0-rc2+ #27
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 Call Trace:
  dump_stack_lvl+0x45/0x59
  check_noncircular+0x268/0x310
  ? print_circular_bug+0x460/0x460
  ? __kernel_text_address+0xe/0x30
  ? alloc_chain_hlocks+0x1e6/0x5a0
  __lock_acquire+0x2999/0x5a40
  ? lockdep_hardirqs_on_prepare+0x3e0/0x3e0
  ? add_lock_to_list.constprop.0+0x6c/0x530
  lock_acquire+0x1a9/0x4a0
  ? mlx5_unload_one+0x1e/0xa0 [mlx5_core]
  ? lock_release+0x6c0/0x6c0
  ? lockdep_hardirqs_on_prepare+0x3e0/0x3e0
  ? lock_is_held_type+0x98/0x110
  __mutex_lock+0x149/0x1310
  ? mlx5_unload_one+0x1e/0xa0 [mlx5_core]
  ? lock_is_held_type+0x98/0x110
  ? mlx5_unload_one+0x1e/0xa0 [mlx5_core]
  ? find_held_lock+0x2d/0x110
  ? mutex_lock_io_nested+0x1160/0x1160
  ? mlx5_lag_is_active+0x72/0x90 [mlx5_core]
  ? lock_downgrade+0x6d0/0x6d0
  ? do_raw_spin_lock+0x12e/0x270
  ? rwlock_bug.part.0+0x90/0x90
  ? mlx5_unload_one+0x1e/0xa0 [mlx5_core]
  mlx5_unload_one+0x1e/0xa0 [mlx5_core]
  mlx5_devlink_reload_down+0x185/0x2b0 [mlx5_core]
  ? netlink_broadcast_filtered+0x308/0xac0
  ? mlx5_devlink_info_get+0x1f0/0x1f0 [mlx5_core]
  ? __build_skb_around+0x110/0x2b0
  ? __alloc_skb+0x113/0x2b0
  devlink_reload+0x1f2/0x640
  ? devlink_unregister+0x1e0/0x1e0
  ? security_capable+0x51/0x90
  devlink_nl_cmd_reload+0x6c3/0x10d0
  ? devlink_nl_cmd_get_doit+0x1e0/0x1e0
  ? devlink_nl_pre_doit+0x72/0x8d0
  genl_family_rcv_msg_doit+0x1e9/0x2f0
  ? __lock_acquire+0x15e2/0x5a40
  ? genl_family_rcv_msg_attrs_parse.constprop.0+0x240/0x240
  ? mutex_lock_io_nested+0x1160/0x1160
  ? security_capable+0x51/0x90
  genl_rcv_msg+0x27f/0x4a0
  ? genl_get_cmd+0x3c0/0x3c0
  ? lock_acquire+0x1a9/0x4a0
  ? devlink_nl_cmd_get_doit+0x1e0/0x1e0
  ? lock_release+0x6c0/0x6c0
  netlink_rcv_skb+0x11e/0x340
  ? genl_get_cmd+0x3c0/0x3c0
  ? netlink_ack+0x930/0x930
  genl_rcv+0x24/0x40
  netlink_unicast+0x433/0x700
  ? netlink_attachskb+0x750/0x750
  ? __alloc_skb+0x113/0x2b0
  netlink_sendmsg+0x6fb/0xbe0
  ? netlink_unicast+0x700/0x700
  ? netlink_unicast+0x700/0x700
  sock_sendmsg+0xb0/0xe0
  __sys_sendto+0x192/0x240
  ? __x64_sys_getpeername+0xb0/0xb0
  ? do_sys_openat2+0x10a/0x370
  ? down_write_nested+0x150/0x150
  ? do_user_addr_fault+0x215/0xd50
  ? __x64_sys_openat+0x11f/0x1d0
  ? __x64_sys_open+0x1a0/0x1a0
  __x64_sys_sendto+0xdc/0x1b0
  ? syscall_enter_from_user_mode+0x1d/0x50
  do_syscall_64+0x3d/0x90
  entry_SYSCALL_64_after_hwframe+0x44/0xae
 RIP: 0033:0x7f50b50b6b3a
 Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 76 c3 0f 1f 44 00 00 55 48 83 ec 30 44 89 4c
 RSP: 002b:00007fff6c0d3f38 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
 RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007f50b50b6b3a
 RDX: 0000000000000038 RSI: 000055763ac08440 RDI: 0000000000000003
 RBP: 000055763ac08410 R08: 00007f50b5192200 R09: 000000000000000c
 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
 R13: 0000000000000000 R14: 000055763ac08410 R15: 000055763ac08440
 mlx5_core 0000:00:09.0: firmware version: 4.8.9999
 mlx5_core 0000:00:09.0: 0.000 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x255 link)
 mlx5_core 0000:00:09.0 eth1: Link up

Fixes: a6f3b62386 ("net/mlx5: Move devlink registration before interfaces load")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-27 16:31:59 +01:00
Leon Romanovsky
1e72685916 net/mlx4: Move devlink_register to be the last initialization command
Refactor the code to make sure that devlink_register() is the last
command during initialization stage.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-27 16:31:59 +01:00
Leon Romanovsky
4beb0c241b net/prestera: Split devlink and traps registrations to separate routines
Separate devlink registrations and traps registrations so devlink will
be registered when driver is fully initialized.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-27 16:31:59 +01:00
Leon Romanovsky
1d264db405 octeontx2: Move devlink registration to be last devlink command
This change prevents from users to access device before devlink is fully
configured. This change allows us to delete call to devlink_params_publish()
and impossible check during unregister flow.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-27 16:31:59 +01:00
Leon Romanovsky
838cefd5e5 ice: Open devlink when device is ready
Move devlink_registration routine to be the last command, when the
device is fully initialized.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-27 16:31:59 +01:00
Leon Romanovsky
44691f5352 net: hinic: Open device for the user access when it is ready
Move devlink registration to be the last command in device activation,
so it opens the driver to accept such devlink commands from the user
when it is fully initialized.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-27 16:31:59 +01:00
Leon Romanovsky
bbb9ae25fc dpaa2-eth: Register devlink instance at the end of probe
Move devlink_register to be the last command in the initialization
sequence.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-27 16:31:59 +01:00
Leon Romanovsky
8d44b5cf60 liquidio: Overcome missing device lock protection in init/remove flows
The liquidio driver is broken by design. It initialize PCI devices
in separate delayed works. It causes to the situation where device lock
is dropped during initialize and remove sequences.

That lock is part of driver/core and needed to protect from races during
init, destroy and bus invocations.

In addition to lack of locking protection, it has incorrect order of
destroy flows and very questionable synchronization scheme based on
atomic_t.

This change doesn't fix that driver but makes sure that rest of the
netdev subsystem doesn't suffer from such basic protection by adding
device_lock over devlink_*() APIs and by moving devlink_register()
to be last command in setup_nic_devices().

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-27 16:31:58 +01:00
Leon Romanovsky
5df290e7a7 bnxt_en: Register devlink instance at the end devlink configuration
Move devlink_register() to be last command in devlink configuration
sequence, so no user space access will be possible till devlink instance
is fully operable. As part of this change, the devlink_params_publish
call is removed as not needed.

This change fixes forgotten devlink_params_unpublish() too.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-27 16:31:58 +01:00
Arnd Bergmann
ef5d6356e2 cxgb: avoid open-coded offsetof()
clang-14 does not like the custom offsetof() macro in vsc7326:

drivers/net/ethernet/chelsio/cxgb/vsc7326.c:597:3: error: performing pointer subtraction with a null pointer has undefined behavior [-Werror,-Wnull-pointer-subtraction]
                HW_STAT(RxUnicast, RxUnicastFramesOK),
                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/net/ethernet/chelsio/cxgb/vsc7326.c:594:56: note: expanded from macro 'HW_STAT'
        { reg, (&((struct cmac_statistics *)NULL)->stat_name) - (u64 *)NULL }

Rewrite this to use the version provided by the kernel.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-27 14:04:23 +01:00
Arnd Bergmann
3e0d5699a9 net: stmmac: fix gcc-10 -Wrestrict warning
gcc-10 and later warn about a theoretical array overrun when
accessing priv->int_name_rx_irq[i] with an out of bounds value
of 'i':

drivers/net/ethernet/stmicro/stmmac/stmmac_main.c: In function 'stmmac_request_irq_multi_msi':
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:3528:17: error: 'snprintf' argument 4 may overlap destination object 'dev' [-Werror=restrict]
 3528 |                 snprintf(int_name, int_name_len, "%s:%s-%d", dev->name, "tx", i);
      |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:3404:60: note: destination object referenced by 'restrict'-qualified argument 1 was declared here
 3404 | static int stmmac_request_irq_multi_msi(struct net_device *dev)
      |                                         ~~~~~~~~~~~~~~~~~~~^~~

The warning is a bit strange since it's not actually about the array
bounds but rather about possible string operations with overlapping
arguments, but it's not technically wrong.

Avoid the warning by adding an extra bounds check.

Fixes: 8532f613bc ("net: stmmac: introduce MSI Interrupt routines for mac, safety, RX & TX")
Link: https://lore.kernel.org/all/20210421134743.3260921-1-arnd@kernel.org/
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-27 13:50:46 +01:00
Christian Lamparter
584351c31d net: ethernet: emac: utilize of_net's of_get_mac_address()
of_get_mac_address() reads the same "local-mac-address" property.
... But goes above and beyond by checking the MAC value properly.

printk+message seems outdated too,
so let's put dev_err in the queue.

Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-27 13:28:48 +01:00
Yang Li
867d1ac99f net: sparx5: fix resource_size.cocci warnings
Use resource_size function on resource object
instead of explicit computation.

Clean up coccicheck warning:
./drivers/net/ethernet/microchip/sparx5/sparx5_main.c:237:19-22: ERROR:
Missing resource_size with iores [ idx ]

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-27 13:27:45 +01:00
Cai Huoqing
4247ef0269 ibmveth: Use dma_alloc_coherent() instead of kmalloc/dma_map_single()
Replacing kmalloc/kfree/dma_map_single/dma_unmap_single()
with dma_alloc_coherent/dma_free_coherent() helps to reduce
code size, and simplify the code, and coherent DMA will not
clear the cache every time.

Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-27 13:25:25 +01:00
Desnes A. Nunes do Rosario
2974b8a691 Revert "ibmvnic: check failover_pending in login response"
This reverts commit d437f5aa23.

Code has been duplicated through commit <273c29e944bd> "ibmvnic: check
failover_pending in login response"

Signed-off-by: Desnes A. Nunes do Rosario <desnesn@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-27 13:21:53 +01:00
Cai Huoqing
f947fcaffd net: cisco: Fix a function name in comments
Use dma_alloc_coherent() instead of pci_alloc_consistent(),
because only dma_alloc_coherent() is called here.

Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Reviewed-by: Govindarajulu Varadarajan <gvaradar@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-27 13:19:36 +01:00
Cai Huoqing
005552854f net: smsc: Fix function names in print messages and comments
Use dma_xxx_xxx() instead of pci_xxx_xxx(),
because the pci function wrappers are not called here.

Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-27 12:44:33 +01:00
Cai Huoqing
e7e9d2088d net: sis: Fix a function name in comments
Use dma_alloc_coherent() instead of pci_alloc_consistent(),
because only dma_alloc_coherent() is called here.

Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-27 12:44:33 +01:00
Cai Huoqing
8b58cba44e net: broadcom: Fix a function name in comments
Use dma_alloc_coherent() instead of pci_alloc_consistent(),
because only dma_alloc_coherent() is called here.

Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-27 12:44:33 +01:00
Cai Huoqing
8d04c7b964 net: atl1c: Fix a function name in print messages
Use dma_map_single() instead of pci_map_single(),
because the pci function wrappers are not called here.

Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-27 12:44:32 +01:00
Matthew Hagan
763716a55c net: bgmac-platform: handle mac-address deferral
This patch is a replication of Christian Lamparter's "net: bgmac-bcma:
handle deferred probe error due to mac-address" patch for the
bgmac-platform driver [1].

As is the case with the bgmac-bcma driver, this change is to cover the
scenario where the MAC address cannot yet be discovered due to reliance
on an nvmem provider which is yet to be instantiated, resulting in a
random address being assigned that has to be manually overridden.

[1] https://lore.kernel.org/netdev/20210919115725.29064-1-chunkeey@gmail.com

Signed-off-by: Matthew Hagan <mnhagan88@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-27 12:28:15 +01:00
Colin Ian King
44b6aa2ef6 net: hns: Fix spelling mistake "maped" -> "mapped"
There is a spelling mistake in a dev_err error message. Fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-27 12:21:16 +01:00
Kiran Kumar K
edadeb38dc octeontx2-af: Optimize KPU1 processing for variable-length headers
Optimized KPU1 entry processing for variable-length custom L2 headers
of size 24B, 90B by
	- Moving LA LTYPE parsing for 24B and 90B headers to PKIND.
	- Removing LA flags assignment for 24B and 90B headers.
	- Reserving a PKIND 55 to parse variable length headers.

Also, new mailbox(NPC_SET_PKIND) added to configure PKIND with
corresponding variable-length offset, mask, and shift count
(NPC_AF_KPUX_ENTRYX_ACTION0).

Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-26 11:26:17 +01:00
Kiran Kumar K
2fae469ae2 octeontx2-af: Limit KPU parsing for GTPU packets
With current KPU profile, while parsing GTPU packets, GTPU payload
is also being parsed and GTPU PDU payload is being treated as IPV4
data, which is not correct. In case of GTPU packets, parsing should
be stopped after identifying the GTPU. Adding changes to limit KPU
profile parsing for GTPU payload.

Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-26 11:26:17 +01:00
Dima Chumak
05000bbba1 net/mlx5e: Enable TC offload for ingress MACVLAN
Support offloading of TC rules that filter ingress traffic from a MACVLAN
device, which is attached to uplink representor.

Signed-off-by: Dima Chumak <dchumak@nvidia.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-09-24 11:46:57 -07:00
Dima Chumak
fca572f2bc net/mlx5e: Enable TC offload for egress MACVLAN
Support offloading of TC rules that mirror/redirect egress traffic to a
MACVLAN device, which is attached to mlx5 representor net device.

Signed-off-by: Dima Chumak <dchumak@nvidia.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-09-24 11:46:56 -07:00
Roi Dayan
7990b1b5e8 net/mlx5e: loopback test is not supported in switchdev mode
In switchdev mode we insert steering rules to eswitch that
make sure packets can't be looped back.
Modify the self tests infra and have flags per test.
Add a flag for tests that needs to be skipped in switchdev mode.

Before this commit:

$ ethtool --test enp8s0f0
 The test result is FAIL
 The test extra info:
 Link Test        0
 Speed Test       0
 Health Test      0
 Loopback Test    1

After this commit:

$ ethtool --test enp8s0f0
 The test result is PASS
 The test extra info:
 Link Test        0
 Speed Test       0
 Health Test      0

Example output in dmesg:

enp8s0f0: Self test begin..
enp8s0f0:         [0] Link Test start..
enp8s0f0:         [0] Link Test end: result(0)
enp8s0f0:         [1] Speed Test start..
enp8s0f0:         [1] Speed Test end: result(0)
enp8s0f0:         [2] Health Test start..
enp8s0f0:         [2] Health Test end: result(0)
enp8s0f0: Self test out: status flags(0x1)

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-09-24 11:46:55 -07:00
Roi Dayan
c50775d0e2 net/mlx5e: Use NL_SET_ERR_MSG_MOD() for errors parsing tunnel attributes
This to be consistent and adds the module name to the error message.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-09-24 11:46:55 -07:00
Roi Dayan
f3e02e479d net/mlx5e: Use tc sample stubs instead of ifdefs in source file
Instead of having sparse ifdefs in source files use a single
ifdef in the tc sample header file and use stubs.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-09-24 11:46:54 -07:00
Roi Dayan
1cc35b707c net/mlx5e: Remove redundant priv arg from parse_pedit_to_reformat()
The priv argument is not being used. remove it.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-09-24 11:46:54 -07:00
Roi Dayan
6b50cf45b6 net/mlx5e: Check action fwd/drop flag exists also for nic flows
The driver should add offloaded rules with either a fwd or drop action.
The check existed in parsing fdb flows but not when parsing nic flows.
Move the test into actions_match_supported() which is called for
checking nic flows and fdb flows.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-09-24 11:46:53 -07:00
Roi Dayan
7f8770c716 net/mlx5e: Set action fwd flag when parsing tc action goto
Do it when parsing like in other actions instead of when
checking if goto is supported in current scenario.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-09-24 11:46:53 -07:00
Roi Dayan
475fb86ac9 net/mlx5e: Remove incorrect addition of action fwd flag
A user is expected to explicit request a fwd or drop action.
It is not correct to implicit add a fwd action for the user,
when modify header action flag exists.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-09-24 11:46:52 -07:00
Roi Dayan
1836d78015 net/mlx5e: Use correct return type
modify_header_match_supported() should return type bool but
it returns the value returned by is_action_keys_supported()
which is type int.

is_action_keys_supported() always returns either -EOPNOTSUPP
or 0 and it shouldn't change as the purpose of the function
is checking for support. so just make the function return
a bool type.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-09-24 11:46:52 -07:00
Aya Levin
6c2509d446 net/mlx5e: Add error flow for ethtool -X command
Prior to this patch, ethtool -X fail but the user receives a success
status. Try to roll-back when failing and return success status
accordingly.

Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-09-24 11:46:51 -07:00
Yevgeny Kliteynik
c228dce262 net/mlx5: DR, Fix code indentation in dr_ste_v1
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-09-24 11:46:51 -07:00
Leon Romanovsky
e6a54d6f22 qed: Don't ignore devlink allocation failures
devlink is a software interface that doesn't depend on any hardware
capabilities. The failure in SW means memory issues, wrong parameters,
programmer error e.t.c.

Like any other such interface in the kernel, the returned status of
devlink APIs should be checked and propagated further and not ignored.

Fixes: 755f982bb1 ("qed/qede: make devlink survive recovery")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-24 14:12:57 +01:00
Leon Romanovsky
2ff04286a9 ice: Delete always true check of PF pointer
PF pointer is always valid when PCI core calls its .shutdown() and
.remove() callbacks. There is no need to check it again.

Fixes: 837f08fdec ("ice: Add basic driver framework for Intel(R) E800 Series")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-24 14:12:57 +01:00
Leon Romanovsky
61415c3db3 bnxt_en: Properly remove port parameter support
This driver doesn't have any port parameters and registers
devlink port parameters with empty table. Remove the useless
calls to devlink_port_params_register and _unregister.

Fixes: da203dfa89 ("Revert "devlink: Add a generic wake_on_lan port parameter"")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-24 14:12:56 +01:00
Leon Romanovsky
e624c70e11 bnxt_en: Check devlink allocation and registration status
devlink is a software interface that doesn't depend on any hardware
capabilities. The failure in SW means memory issues, wrong parameters,
programmer error e.t.c.

Like any other such interface in the kernel, the returned status of
devlink APIs should be checked and propagated further and not ignored.

Fixes: 4ab0c6a8ff ("bnxt_en: add support to enable VF-representors")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-24 14:12:56 +01:00
Joshua Roys
a8551c9b75 net: mlx4: Add support for XDP_REDIRECT
Signed-off-by: Joshua Roys <roysjosh@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-24 14:10:35 +01:00
Vladimir Oltean
325fd36ae7 net: enetc: fix the incorrect clearing of IF_MODE bits
The enetc phylink .mac_config handler intends to clear the IFMODE field
(bits 1:0) of the PM0_IF_MODE register, but incorrectly clears all the
other fields instead.

For normal operation, the bug was inconsequential, due to the fact that
we write the PM0_IF_MODE register in two stages, first in
phylink .mac_config (which incorrectly cleared out a bunch of stuff),
then we update the speed and duplex to the correct values in
phylink .mac_link_up.

Judging by the code (not tested), it looks like maybe loopback mode was
broken, since this is one of the settings in PM0_IF_MODE which is
incorrectly cleared.

Fixes: c76a97218d ("net: enetc: force the RGMII speed and duplex instead of operating in inband mode")
Reported-by: Pavel Machek (CIP) <pavel@denx.de>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-24 14:03:04 +01:00
Amit Cohen
ba1c71324b mlxsw: Add support for IP-in-IP with IPv6 underlay for Spectrum-2 and above
Currently, mlxsw driver supports IP-in-IP only with IPv4 underlay.
Add support for IPv6 underlay for Spectrum-2 and above.

Most of the configurations are same to IPv4, the main difference between
IPv4 and IPv6 is related to saving IP addresses.
IPv6 addresses are saved as part of KVD and the relevant registers hold
pointer to them.
Add API for that as part of ipip_ops, so then only Spectrum-2 and above
will save IPv6 addresses in this way.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-24 10:26:52 +01:00
Amit Cohen
8d4f10463c mlxsw: spectrum_router: Increase parsing depth for IPv6 decapsulation
The Spectrum ASIC has a configurable limit on how deep into the packet
it parses. By default, the limit is 96 bytes.

For IP-in-IP packets, with IPv6 outer and inner headers, the default
parsing depth is not enough and without increasing it such packets cannot
be properly decapsulated.

Use the existing API to set parsing depth, call it once for each
decapsulation entry when it is created/destroyed.
There is no need to protect the code with new mutex because 'router->lock'
is already taken in these code paths.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-24 10:26:52 +01:00
Amit Cohen
53eedd61de mlxsw: Add IPV6_ADDRESS kvdl entry type
Add support for allocating and freeing KVD entries for IPv6 addresses.

These addresses are programmed by the RIPS register and referenced by
the RATR and RTDP registers for IPv6 underlay encapsulation and
decapsulation, respectively.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-24 10:26:52 +01:00
Amit Cohen
713e8502fd mlxsw: spectrum_ipip: Add mlxsw_sp_ipip_gre6_ops
Add operations for IP-in-IP GRE6. For now the function can_offload()
returns false and the other functions warn as they should never be called.

A later patch will add dedicated operations for Spectrum-2 which will
support IPv6 underlay, so use 'mlxsw_sp1_' prefix for the new operations.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-24 10:26:52 +01:00
Amit Cohen
a82feba686 mlxsw: Create separate ipip_ops_arr for different ASICs
Currently, there is support for IP-in-IP only with IPv4 underlay for all
supported Spectrum ASICs.
The next patches will add support for IPv6 underlay only for Spectrum-2
and above.

Add infrastructure for splitting IP-in-IP support between different
ASICs - create separate ipip_ops_arr and add ipips_init function to set the
right ops.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-24 10:26:52 +01:00
Amit Cohen
36c2ab890b mlxsw: reg: Add support for ritr_loopback_ipip6_pack()
The RITR register is used to configure the router interface table.

For IP-in-IP, it stores the underlay source IP address for encapsulation
and also the ingress RIF for the underlay lookup.

Add support for IPv6 IP-in-IP configuration.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-24 10:26:52 +01:00
Amit Cohen
c729ae8d6c mlxsw: reg: Add support for ratr_ipip6_entry_pack()
The RATR register is used to configure the Router Adjacency (next-hop)
Table.

For IP-in-IP entry, underlay destination IPv4 is saved as part of this
register and underlay destination IPv6 is saved by RIPS register and RATR
saves pointer to it.

Add function for setting IPv6 IP-in-IP configuration.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-24 10:26:51 +01:00
Amit Cohen
a917bb271d mlxsw: reg: Add support for rtdp_ipip6_pack()
The RTDP register is used for configuring the tunnel decapsulation
properties of NVE and IP-in-IP.

Linux tunnels verify packets before decapsulation based on the packet's
source IP, which must match tunnel remote IP.
RTDP is used to configure decapsulation so that it filters out packets that
are not IPv6 or have the wrong source IP or wrong GRE key.

For IP-in-IP entry, source IPv4 is saved as part of this register and
source IPv6 is saved by RIPS register and RTDP saves pointer to it.

Create common function for configuring both IPv4 and IPv6 and add
dedicated functions for each protocol.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-24 10:26:51 +01:00
Amit Cohen
dd8a9552d4 mlxsw: reg: Add Router IP version Six Register
The RIPS register is used to store IPv6 addresses for use by the NVE and
IP-in-IP.

For IPv6 underlay support, RATR register needs to hold a pointer to the
remote IPv6 address for encapsulation and RTDP register needs to hold a
pointer to the local IPv6 address for decapsulation check.

Add the required register for saving IPv6 addresses.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-24 10:26:51 +01:00
Amit Cohen
59bf980dd9 mlxsw: Take tunnel's type into account when searching underlay device
The function __mlxsw_sp_ipip_netdev_ul_dev_get() returns the underlay
device that corresponds to the overlay device that it gets.
Currently, this function assumes that the tunnel is IPv4 GRE, because it
is the only one that is supported by mlxsw driver.

This assumption will no longer be correct when IPv6 GRE support is added,
resulting in wrong underlay device being returned.
Instead, check 'ol_dev->type' and return the underlay device accordingly.

Move the function to spectrum_ipip.c because spectrum_router.c should not
be aware to tunnel type.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-24 10:26:51 +01:00
Amit Cohen
80ef2abcdd mlxsw: spectrum_ipip: Create common function for mlxsw_sp_ipip_ol_netdev_change_gre()
The function mlxsw_sp_ipip_ol_netdev_change_gre4() contains code that
can be shared between IPv4 and IPv6.
The only difference is the way that arguments are taken from tunnel
parameters, which are different between IPv4 and IPv6.

For that, add structure 'mlxsw_sp_ipip_parms' to hold all the required
parameters for the function and save it as part of
'struct mlxsw_sp_ipip_entry' instead of the existing structure that is
not shared between IPv4 and IPv6. Add new operation as part of
'mlxsw_sp_ipip_ops' to initialize the new structure.

Then mlxsw_sp_ipip_ol_netdev_change_gre{4,6}() will prepare the new
structure and both will call the same function.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-24 10:26:51 +01:00
Amit Cohen
8aba32cea3 mlxsw: spectrum_router: Fix arguments alignment
Suppress the following checkpatch.pl check [1] by adding a variable to
store the IP-in-IP options. Noticed while adding equivalent IPv6 code in
subsequent patches.

[1]
CHECK: Alignment should match open parenthesis
+               mlxsw_reg_ritr_loopback_ipip4_pack(ritr_pl, lb_cf.lb_ipipt,
+
+ MLXSW_REG_RITR_LOOPBACK_IPIP_OPTIONS_GRE_KEY_PRESET,

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-24 10:26:51 +01:00
Amit Cohen
aa6fd8f177 mlxsw: spectrum_ipip: Pass IP tunnel parameters by reference and as 'const'
Currently, there are several functions that handle 'struct ip_tunnel_parm'
and 'struct __ip6_tnl_parm'.

Change the mentioned functions to get IP tunnel parameters by reference
and as 'const'.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-24 10:26:51 +01:00
Amit Cohen
45bce5c99d mlxsw: spectrum_router: Create common function for fib_entry_type_unset() code
mlxsw_sp_fib4_entry_type_unset() is not specific for IPv4 FIB entry,
move the code to mlxsw_sp_fib_entry_type_unset(), and call this function
from mlxsw_sp_fib4_entry_type_unset() so then it will be used for IPv6
also.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-24 10:26:51 +01:00
Jakub Kicinski
2fcd14d0f7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
net/mptcp/protocol.c
  977d293e23 ("mptcp: ensure tx skbs always have the MPTCP ext")
  efe686ffce ("mptcp: ensure tx skbs always have the MPTCP ext")

same patch merged in both trees, keep net-next.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-09-23 11:19:49 -07:00
Sudarsana Reddy Kalluru
4d88c339c4 atlantic: Fix issue in the pm resume flow.
After fixing hibernation resume flow, another usecase was found which
should be explicitly handled - resume when device is in "down" state.
Invoke aq_nic_init jointly with aq_nic_start only if ndev was already
up during suspend/hibernate. We still need to perform nic_deinit() if
caller requests for it, to handle the freeze/resume scenarios.

Fixes: 57f780f1c4 ("atlantic: Fix driver resume flow.")
Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-23 13:24:14 +01:00
Aya Levin
fdbccea419 net/mlx4_en: Don't allow aRFS for encapsulated packets
Driver doesn't support aRFS for encapsulated packets, return early error
in such a case.

Fixes: 1eb8c695bd ("net/mlx4_en: Add accelerated RFS support")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-23 13:17:39 +01:00
Vladimir Oltean
acc64f52af net: mscc: ocelot: fix forwarding from BLOCKING ports remaining enabled
The blamed commit made the fatally incorrect assumption that ports which
aren't in the FORWARDING STP state should not have packets forwarded
towards them, and that is all that needs to be done.

However, that logic alone permits BLOCKING ports to forward to
FORWARDING ports, which of course allows packet storms to occur when
there is an L2 loop.

The ocelot_get_bridge_fwd_mask should not only ask "what can the bridge
do for you", but "what can you do for the bridge". This way, only
FORWARDING ports forward to the other FORWARDING ports from the same
bridging domain, and we are still compatible with the idea of multiple
bridges.

Fixes: df291e54cc ("net: ocelot: support multiple bridges")
Suggested-by: Colin Foster <colin.foster@in-advantage.com>
Reported-by: Colin Foster <colin.foster@in-advantage.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-23 13:15:31 +01:00
Felix Fietkau
e68daf61ed net: ethernet: mtk_eth_soc: avoid creating duplicate offload entries
Sometimes multiple CLS_REPLACE calls are issued for the same connection.
rhashtable_insert_fast does not check for these duplicates, so multiple
hardware flow entries can be created.
Fix this by checking for an existing entry early

Fixes: 502e84e238 ("net: ethernet: mtk_eth_soc: add flow offloading support")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-23 13:14:19 +01:00
Shai Malin
1ea7812326 qed: rdma - don't wait for resources under hw error recovery flow
If the HW device is during recovery, the HW resources will never return,
hence we shouldn't wait for the CID (HW context ID) bitmaps to clear.
This fix speeds up the error recovery flow.

Fixes: 64515dc899 ("qed: Add infrastructure for error detection and recovery")
Signed-off-by: Michal Kalderon <mkalderon@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Shai Malin <smalin@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-22 14:38:34 +01:00
Ido Schimmel
e3a3aae74d mlxsw: spectrum_router: Start using new trap adjacency entry
Start using the trap adjacency entry that was added in the previous
patch and remove the existing one which is no longer needed.

Note that the name of the old entry was inaccurate as the entry did not
discard packets, but trapped them.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-22 14:35:02 +01:00
Ido Schimmel
4bdf80bcb7 mlxsw: spectrum_router: Add trap adjacency entry upon first nexthop group
In commit 0c3cbbf96d ("mlxsw: Add specific trap for packets routed via
invalid nexthops"), mlxsw started allocating a new adjacency entry
during driver initialization, to trap packets routed via invalid
nexthops.

This behavior was later altered in commit 983db6198f ("mlxsw:
spectrum_router: Allocate discard adjacency entry when needed") to only
allocate the entry upon the first route that requires it. The motivation
for the change is explained in the commit message.

The problem with the current behavior is that the entry shows up as a
"leak" in a new BPF resource monitoring tool [1]. This is caused by the
asymmetry of the allocation/free scheme. While the entry is allocated
upon the first route that requires it, it is only freed during
de-initialization of the driver.

Instead, track the number of active nexthop groups and allocate the
adjacency entry upon the creation of the first group. Free it when the
number of active groups reaches zero.

The next patch will convert mlxsw to start using the new entry and
remove the old one.

[1] https://github.com/Mellanox/mlxsw/tree/master/Debugging/libbpf-tools/resmon

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-22 14:35:01 +01:00
Leon Romanovsky
db4278c55f devlink: Make devlink_register to be void
devlink_register() can't fail and always returns success, but all drivers
are obligated to check returned status anyway. This adds a lot of boilerplate
code to handle impossible flow.

Make devlink_register() void and simplify the drivers that use that
API call.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Acked-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Vladimir Oltean <olteanv@gmail.com> # dsa
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-22 14:15:12 +01:00
Florian Fainelli
c3a4c69360 net: bcmgenet: Request APD, DLL disable and IDDQ-SR
When interfacing with a Broadcom PHY, request the auto-power down, DLL
disable and IDDQ-SR modes to be enabled.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-21 10:58:35 +01:00
Yufeng Mo
5126b9d3d4 net: hns3: fix a return value error in hclge_get_reset_status()
hclge_get_reset_status() should return the tqp reset status.
However, if the CMDQ fails, the caller will take it as tqp reset
success status by mistake. Therefore, uses a parameters to get
the tqp reset status instead.

Fixes: 46a3df9f97 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-20 13:28:39 +01:00
liaoguojia
ef39d63260 net: hns3: check vlan id before using it
The input parameters may not be reliable, so check the vlan id before
using it, otherwise may set wrong vlan id into hardware.

Fixes: dc8131d846 ("net: hns3: Fix for packet loss due wrong filter config in VLAN tbls")
Signed-off-by: liaoguojia <liaoguojia@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-20 13:28:39 +01:00
Yufeng Mo
63b1279d99 net: hns3: check queue id range before using
The input parameters may not be reliable. Before using the
queue id, we should check this parameter. Otherwise, memory
overwriting may occur.

Fixes: d341001846 ("net: hns3: refactor the mailbox message between PF and VF")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-20 13:28:39 +01:00
Jiaran Zhang
311c0aaa9b net: hns3: fix misuse vf id and vport id in some logs
vport_id include PF and VFs, vport_id = 0 means PF, other values mean VFs.
So the actual vf id is equal to vport_id minus 1.

Some VF print logs are actually vport, and logs of vf id actually use
vport id, so this patch fixes them.

Fixes: ac887be5b0 ("net: hns3: change print level of RAS error log from warning to error")
Fixes: adcf738b80 ("net: hns3: cleanup some print format warning")
Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-20 13:28:39 +01:00
Jian Shen
91bc0d5272 net: hns3: fix inconsistent vf id print
The vf id from ethtool is added 1 before configured to driver.
So it's necessary to minus 1 when printing it, in order to
keep consistent with user's configuration.

Fixes: dd74f815dd ("net: hns3: Add support for rule add/delete for flow director")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-20 13:28:39 +01:00
Jian Shen
e184cec5e2 net: hns3: fix change RSS 'hfunc' ineffective issue
When user change rss 'hfunc' without set rss 'hkey' by ethtool
-X command, the driver will ignore the 'hfunc' for the hkey is
NULL. It's unreasonable. So fix it.

Fixes: 46a3df9f97 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Fixes: 374ad29176 ("net: hns3: Add RSS general configuration support for VF")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-20 13:28:39 +01:00
Michael Chan
5bed8b0704 bnxt_en: Fix TX timeout when TX ring size is set to the smallest
The smallest TX ring size we support must fit a TX SKB with MAX_SKB_FRAGS
+ 1.  Because the first TX BD for a packet is always a long TX BD, we
need an extra TX BD to fit this packet.  Define BNXT_MIN_TX_DESC_CNT with
this value to make this more clear.  The current code uses a minimum
that is off by 1.  Fix it using this constant.

The tx_wake_thresh to determine when to wake up the TX queue is half the
ring size but we must have at least BNXT_MIN_TX_DESC_CNT for the next
packet which may have maximum fragments.  So the comparison of the
available TX BDs with tx_wake_thresh should be >= instead of > in the
current code.  Otherwise, at the smallest ring size, we will never wake
up the TX queue and will cause TX timeout.

Fixes: c0c050c58d ("bnxt_en: New Broadcom ethernet driver.")
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadocm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-20 10:08:36 +01:00
Aleksander Jan Bajkowski
998ac35801 net: lantiq: add support for jumbo frames
Add support for jumbo frames. Full support for jumbo frames requires
changes in the DSA switch driver (lantiq_gswip.c).

Tested on BT Hone Hub 5A.

Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-20 10:07:52 +01:00
Hariprasad Kelam
14e94f9445 octeontx2-af: verify CQ context updates
As per HW errata AQ modification to CQ could be discarded on heavy
traffic. This patch implements workaround for the same after each
CQ write by AQ check whether the requested fields (except those
which HW can update eg: avg_level) are properly updated or not.

If CQ context is not updated then perform AQ write again.

Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-19 14:05:50 +01:00
Lama Kayal
72a3c58d18 net/mlx4_en: Resolve bad operstate value
Any link state change that's done prior to net device registration
isn't reflected on the state, thus the operational state is left
obsolete, with 'UNKNOWN' status.

To resolve the issue, query link state from FW upon open operations
to ensure operational state is updated.

Fixes: c27a02cd94 ("mlx4_en: Add driver for Mellanox ConnectX 10GbE NIC")
Signed-off-by: Lama Kayal <lkayal@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-19 13:21:04 +01:00
Christian Lamparter
029497e66b net: bgmac-bcma: handle deferred probe error due to mac-address
Due to the inclusion of nvmem handling into the mac-address getter
function of_get_mac_address() by
commit d01f449c00 ("of_net: add NVMEM support to of_get_mac_address")
it is now possible to get a -EPROBE_DEFER return code. Which did cause
bgmac to assign a random ethernet address.

This exact issue happened on my Meraki MR32. The nvmem provider is
an EEPROM (at24c64) which gets instantiated once the module
driver is loaded... This happens once the filesystem becomes available.

With this patch, bgmac_probe() will propagate the -EPROBE_DEFER error.
Then the driver subsystem will reschedule the probe at a later time.

Cc: Petr Štetiar <ynezz@true.cz>
Cc: Michael Walle <michael@walle.cc>
Fixes: d01f449c00 ("of_net: add NVMEM support to of_get_mac_address")
Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-19 13:14:05 +01:00
Krzysztof Kozlowski
fdb4758385 net: freescale: drop unneeded MODULE_ALIAS
The MODULE_DEVICE_TABLE already creates proper alias for platform
driver.  Having another MODULE_ALIAS causes the alias to be duplicated.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-19 13:01:40 +01:00
Colin Foster
ba68e99419 net: mscc: ocelot: remove buggy duplicate write to DEV_CLOCK_CFG
When updating ocelot to use phylink, a second write to DEV_CLOCK_CFG was
mistakenly left in. It used the variable "speed" which, previously, would
would have been assigned a value of OCELOT_SPEED_1000. In phylink the
variable is be SPEED_1000, which is invalid for the
DEV_CLOCK_LINK_SPEED macro. Removing it as unnecessary and buggy.

Fixes: e6e12df625 ("net: mscc: ocelot: convert to phylink")
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-19 12:59:52 +01:00
Colin Foster
163957c43d net: mscc: ocelot: remove buggy and useless write to ANA_PFC_PFC_CFG
A useless write to ANA_PFC_PFC_CFG was left in while refactoring ocelot to
phylink. Since priority flow control is disabled, writing the speed has no
effect.

Further, it was using ethtool.h SPEED_ instead of OCELOT_SPEED_ macros,
which are incorrectly offset for GENMASK.

Lastly, for priority flow control to properly function, some scenarios
would rely on the rate adaptation from the PCS while the MAC speed would
be fixed. So it isn't used, and even if it was, neither "speed" nor
"mac_speed" are necessarily the correct values to be used.

Fixes: e6e12df625 ("net: mscc: ocelot: convert to phylink")
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-19 12:59:52 +01:00
Randy Dunlap
8775851107 igc: fix build errors for PTP
When IGC=y and PTP_1588_CLOCK=m, the ptp_*() interface family is
not available to the igc driver. Make this driver depend on
PTP_1588_CLOCK_OPTIONAL so that it will build without errors.

Various igc commits have used ptp_*() functions without checking
that PTP_1588_CLOCK is enabled. Fix all of these here.

Fixes these build errors:

ld: drivers/net/ethernet/intel/igc/igc_main.o: in function `igc_msix_other':
igc_main.c:(.text+0x6494): undefined reference to `ptp_clock_event'
ld: igc_main.c:(.text+0x64ef): undefined reference to `ptp_clock_event'
ld: igc_main.c:(.text+0x6559): undefined reference to `ptp_clock_event'
ld: drivers/net/ethernet/intel/igc/igc_ethtool.o: in function `igc_ethtool_get_ts_info':
igc_ethtool.c:(.text+0xc7a): undefined reference to `ptp_clock_index'
ld: drivers/net/ethernet/intel/igc/igc_ptp.o: in function `igc_ptp_feature_enable_i225':
igc_ptp.c:(.text+0x330): undefined reference to `ptp_find_pin'
ld: igc_ptp.c:(.text+0x36f): undefined reference to `ptp_find_pin'
ld: drivers/net/ethernet/intel/igc/igc_ptp.o: in function `igc_ptp_init':
igc_ptp.c:(.text+0x11cd): undefined reference to `ptp_clock_register'
ld: drivers/net/ethernet/intel/igc/igc_ptp.o: in function `igc_ptp_stop':
igc_ptp.c:(.text+0x12dd): undefined reference to `ptp_clock_unregister'
ld: drivers/platform/x86/dell/dell-wmi-privacy.o: in function `dell_privacy_wmi_probe':

Fixes: 64433e5bf4 ("igc: Enable internal i225 PPS")
Fixes: 60dbede0c4 ("igc: Add support for ethtool GET_TS_INFO command")
Fixes: 87938851b6 ("igc: enable auxiliary PHC functions for the i225")
Fixes: 5f2958052c ("igc: Add basic skeleton for PTP")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Ederson de Souza <ederson.desouza@intel.com>
Cc: Tony Nguyen <anthony.l.nguyen@intel.com>
Cc: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: intel-wired-lan@lists.osuosl.org
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-19 12:20:53 +01:00
Russell King
9eb7b5e7cb net: dpaa2-mac: add support for more ethtool 10G link modes
Phylink documentation says:
  Note that the PHY may be able to transform from one connection
  technology to another, so, eg, don't clear 1000BaseX just
  because the MAC is unable to BaseX mode. This is more about
  clearing unsupported speeds and duplex settings. The port modes
  should not be cleared; phylink_set_port_modes() will help with this.

So add the missing 10G modes.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Acked-by: Marek Behún <kabel@kernel.org>
Acked-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-19 12:11:40 +01:00
Claudiu Manoil
9f7afa05c9 enetc: Fix uninitialized struct dim_sample field usage
The only struct dim_sample member that does not get
initialized by dim_update_sample() is comp_ctr. (There
is special API to initialize comp_ctr:
dim_update_sample_with_comps(), and it is currently used
only for RDMA.) comp_ctr is used to compute curr_stats->cmps
and curr_stats->cpe_ratio (see dim_calc_stats()) which in
turn are consumed by the rdma_dim_*() API.  Therefore,
functionally, the net_dim*() API consumers are not affected.
Nevertheless, fix the computation of statistics based
on an uninitialized variable, even if the mentioned statistics
are not used at the moment.

Fixes: ae0e6a5d16 ("enetc: Add adaptive interrupt coalescing")
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-19 12:10:26 +01:00
Claudiu Manoil
7237a494de enetc: Fix illegal access when reading affinity_hint
irq_set_affinity_hit() stores a reference to the cpumask_t
parameter in the irq descriptor, and that reference can be
accessed later from irq_affinity_hint_proc_show(). Since
the cpu_mask parameter passed to irq_set_affinity_hit() has
only temporary storage (it's on the stack memory), later
accesses to it are illegal. Thus reads from the corresponding
procfs affinity_hint file can result in paging request oops.

The issue is fixed by the get_cpu_mask() helper, which provides
a permanent storage for the cpumask_t parameter.

Fixes: d4fd0404c1 ("enetc: Introduce basic PF and VF ENETC ethernet drivers")
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-19 12:10:26 +01:00
Claudiu Beznea
0f4f6d7332 net: macb: enable mii on rgmii for sama7g5
Both MAC IPs available on SAMA7G5 support MII on RGMII feature.
Enable these by adding proper capability to proper macb_config
objects.

Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-18 14:14:39 +01:00
Claudiu Beznea
1a9b5a26da net: macb: add support for mii on rgmii
Cadence IP has option to enable MII support on RGMII interface. This
could be selected though bit 28 of network control register. This option
is not enabled on all the IP versions thus add a software capability to
be selected by the proper implementation of this IP.

Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-18 14:14:39 +01:00
Claudiu Beznea
d7b3485f1c net: macb: align for OSSMODE offset
Align for OSSMODE offset.

Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-18 14:14:39 +01:00
Claudiu Beznea
1dac0084d4 net: macb: add description for SRTSM
Add description for SRTSM bit.

Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-18 14:14:38 +01:00
Florian Fainelli
b972b54a68 net: bcmgenet: Patch PHY interface for dedicated PHY driver
When we are using a dedicated PHY driver (not the Generic PHY driver)
chances are that it is going to configure RGMII delays and do that in a
way that is incompatible with our incorrect interpretation of the
phy_interface value.

Add a quirk in order to reverse the PHY_INTERFACE_MODE_RGMII to the
value of PHY_INTERFACE_MODE_RGMII_ID such that the MAC continues to be
configured the way it used to be, but the PHY driver can account for
adding delays. Conversely when PHY_INTERFACE_MODE_RGMII_TXID is
specified, return PHY_INTERFACE_MODE_RGMII_RXID to the PHY since we will
have enabled a TXC MAC delay (id_mode_dis=0, meaning there is a delay
inserted).

This is not considered a bug fix at this point since it only affects
Broadcom STB platforms shipping with a Device Tree blob that is not
updatable in the field (quite a few devices out there) and which was
generated using the scripted Device Tree environment shipped with those
platforms' SDK.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-18 11:52:07 +01:00
Heiner Kallweit
0efcc3f201 sky2: Stop printing VPD info to debugfs
Sky2 is parsing the VPD and adds the parsed information to its debugfs
file. This isn't needed in kernel, userspace tools like lspci can be
used to display such information nicely. Therefore remove this from
the driver.

lspci -vv:

Capabilities: [50] Vital Product Data
	Product Name: Marvell Yukon 88E8070 Gigabit Ethernet Controller
	Read-only fields:
		[PN] Part number: Yukon 88E8070
		[EC] Engineering changes: Rev. 1.0
		[MN] Manufacture ID: Marvell
		[SN] Serial number: AbCdEfG970FD4
		[CP] Extended capability: 01 10 cc 03
		[RV] Reserved: checksum good, 9 byte(s) reserved
	Read/write fields:
		[RW] Read-write area: 1 byte(s) free
	End

Relevant part in debugfs file:

0000:01:00.0 Product Data
Marvell Yukon 88E8070 Gigabit Ethernet Controller
 Part Number: Yukon 88E8070
 Engineering Level: Rev. 1.0
 Manufacturer: Marvell
 Serial Number: AbCdEfG970FD4

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Link: https://lore.kernel.org/r/bbaee8ab-9b2e-de04-ee7b-571e094cc5fe@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-09-17 18:27:43 -07:00
Krzysztof Kozlowski
5ef8a02915 net: microchip: encx24j600: drop unneeded MODULE_ALIAS
The MODULE_DEVICE_TABLE already creates proper alias for spi driver.
Having another MODULE_ALIAS causes the alias to be duplicated.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-17 14:21:00 +01:00
Cai Huoqing
b20b54fb00 net: stmmac: dwmac-visconti: Make use of the helper function dev_err_probe()
When possible use dev_err_probe help to properly deal with the
PROBE_DEFER error, the benefit is that DEFER issue will be logged
in the devices_deferred debugfs file.
And using dev_err_probe() can reduce code size, and the error value
gets printed.

Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Acked-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-17 14:02:40 +01:00
Colin Ian King
3503e673db octeontx2-af: Remove redundant initialization of variable blkaddr
The variable blkaddr is being initialized with a value that is never
read, it is being updated later on in a for-loop. The assignment is
redundant and can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-17 14:00:51 +01:00
Colin Ian King
d853f1d3c9 octeontx2-af: Fix uninitialized variable val
In the case where the condition !is_rvu_otx2(rvu) is false variable
val is not initialized and can contain a garbage value. Fix this by
initializing val to zero and bit-wise or'ing in BIT_ULL(51) to val
for the true condition case of !is_rvu_otx2(rvu).

Addresses-Coverity: ("Uninitialized scalar variable")
Fixes: 4b5a3ab17c ("octeontx2-af: Hardware configuration for inline IPsec")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-17 14:00:51 +01:00
Vladimir Oltean
3c9cfb5269 net: update NXP copyright text
NXP Legal insists that the following are not fine:

- Saying "NXP Semiconductors" instead of "NXP", since the company's
  registered name is "NXP"

- Putting a "(c)" sign in the copyright string

- Putting a comma in the copyright string

The only accepted copyright string format is "Copyright <year-range> NXP".

This patch changes the copyright headers in the networking files that
were sent by me, or derived from code sent by me.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-17 13:52:17 +01:00
Hao Chen
6042d4348a net: e1000e: solve insmod 'Unknown symbol mutex_lock' error
After I turn on the CONFIG_LOCK_STAT=y, insmod e1000e.ko will report:
[    5.641579] e1000e: Unknown symbol mutex_lock (err -2)
[   90.775705] e1000e: Unknown symbol mutex_lock (err -2)
[  132.252339] e1000e: Unknown symbol mutex_lock (err -2)

This problem fixed after include <linux/mutex.h>.

Signed-off-by: Hao Chen <chenhaoa@uniontech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-17 11:11:25 +01:00
Cai Huoqing
61524e43ab net: netsec: Make use of the helper function dev_err_probe()
When possible use dev_err_probe help to properly deal with the
PROBE_DEFER error, the benefit is that DEFER issue will be logged
in the devices_deferred debugfs file.
And using dev_err_probe() can reduce code size, and the error value
gets printed.

Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-17 09:42:29 +01:00
Jakub Kicinski
561bed688b Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts!

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-09-16 13:58:38 -07:00
Linus Torvalds
fc0c0548c1 Networking fixes for 5.15-rc2, including fixes from bpf.
Current release - regressions:
 
  - vhost_net: fix OoB on sendmsg() failure
 
  - mlx5: bridge, fix uninitialized variable usage
 
  - bnxt_en: fix error recovery regression
 
 Current release - new code bugs:
 
  - bpf, mm: fix lockdep warning triggered by stack_map_get_build_id_offset()
 
 Previous releases - regressions:
 
  - r6040: restore MDIO clock frequency after MAC reset
 
  - tcp: fix tp->undo_retrans accounting in tcp_sacktag_one()
 
  - dsa: flush switchdev workqueue before tearing down CPU/DSA ports
 
 Previous releases - always broken:
 
  - ptp: dp83640: don't define PAGE0, avoid compiler warning
 
  - igc: fix tunnel segmentation offloads
 
  - phylink: update SFP selected interface on advertising changes
 
  - stmmac: fix system hang caused by eee_ctrl_timer during suspend/resume
 
  - mlx5e: fix mutual exclusion between CQE compression and HW TS
 
 Misc:
 
  - bpf, cgroups: fix cgroup v2 fallback on v1/v2 mixed mode
 
  - sfc: fallback for lack of xdp tx queues
 
  - hns3: add option to turn off page pool feature
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmFDnXIACgkQMUZtbf5S
 Irs+pA//UvT0UqL+G/9He9eVMaX/7mnvoShbU/hHaymLuvJE4dx6B3qw8im5fQ0i
 b7ovy1dxjMjSQI4E+2FleajRBgsdXvWgMCopFZr7h35+qPXziLxTTztwsaOIjD8U
 42HBpdwFrVtdrOhSsiKsGKe8KXqIehHhFAp4vQrRlsvsz1Yk/93o6wYaXaXqR6pQ
 qlTVgI7nAuLAw4VfFZ3k7iAiQsyrXrn5kUXq5/Wy6kK7iFnIN7X9j+epTJP/EXHn
 +17SW/iGnjUfwXsRSC2ysgICtOPatu48Cke5hq5XOGqXAtRVhhLfMDiZlr8a6+z4
 gS+QETakBXLrAfvYK6neLjlMhg1FX+Y952xd3XIU4jz+DgIh03r2Nd1xBsp17Yhu
 RRBc96w2KxuHQ6USEXbi5dY1K6o9DK52m/WRmnqXnaNX4ADeesgcrxXHqZ6O6I5R
 +ddvVWylaHxvngCQSL4DvZdYNJVoKjdlo6fHrwD1qu0Uv0rGz6O+AEI9o8DkK5Af
 ZhFBETvjTKzjCykd3UlaRo/ibYChIPYwpGTK+TTI9rUm8OWNw1+sC/Z14bx0/3Vl
 RB53rkJ93/qi6Wb9ljUxM8CPGViuPvMkQJjpJOaktrj0GOgQz3EBdDkIQMaplZx5
 r2AeHW7X5VGxCWMRg4NS/x7Do1jh8m/ch8/EulOAB5fG7KQPfYs=
 =xd6p
 -----END PGP SIGNATURE-----

Merge tag 'net-5.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from bpf.

  Current release - regressions:

   - vhost_net: fix OoB on sendmsg() failure

   - mlx5: bridge, fix uninitialized variable usage

   - bnxt_en: fix error recovery regression

  Current release - new code bugs:

   - bpf, mm: fix lockdep warning triggered by stack_map_get_build_id_offset()

  Previous releases - regressions:

   - r6040: restore MDIO clock frequency after MAC reset

   - tcp: fix tp->undo_retrans accounting in tcp_sacktag_one()

   - dsa: flush switchdev workqueue before tearing down CPU/DSA ports

  Previous releases - always broken:

   - ptp: dp83640: don't define PAGE0, avoid compiler warning

   - igc: fix tunnel segmentation offloads

   - phylink: update SFP selected interface on advertising changes

   - stmmac: fix system hang caused by eee_ctrl_timer during suspend/resume

   - mlx5e: fix mutual exclusion between CQE compression and HW TS

  Misc:

   - bpf, cgroups: fix cgroup v2 fallback on v1/v2 mixed mode

   - sfc: fallback for lack of xdp tx queues

   - hns3: add option to turn off page pool feature"

* tag 'net-5.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (67 commits)
  mlxbf_gige: clear valid_polarity upon open
  igc: fix tunnel offloading
  net/{mlx5|nfp|bnxt}: Remove unnecessary RTNL lock assert
  net: wan: wanxl: define CROSS_COMPILE_M68K
  selftests: nci: replace unsigned int with int
  net: dsa: flush switchdev workqueue before tearing down CPU/DSA ports
  Revert "net: phy: Uniform PHY driver access"
  net: dsa: destroy the phylink instance on any error in dsa_slave_phy_setup
  ptp: dp83640: don't define PAGE0
  bnx2x: Fix enabling network interfaces without VFs
  Revert "Revert "ipv4: fix memory leaks in ip_cmsg_send() callers""
  tcp: fix tp->undo_retrans accounting in tcp_sacktag_one()
  net-caif: avoid user-triggerable WARN_ON(1)
  bpf, selftests: Add test case for mixed cgroup v1/v2
  bpf, selftests: Add cgroup v1 net_cls classid helpers
  bpf, cgroups: Fix cgroup v2 fallback on v1/v2 mixed mode
  bpf: Add oversize check before call kvcalloc()
  net: hns3: fix the timing issue of VF clearing interrupt sources
  net: hns3: fix the exception when query imp info
  net: hns3: disable mac in flr process
  ...
2021-09-16 13:05:42 -07:00
Linus Torvalds
db71f8fb44 3com 3c515: make it compile on 64-bit architectures
This driver isn't enabled most places because of the ISA config
dependency, but alpha still has it.  And I think the 'Jensen' actually
did have an ISA slot.

However, it doesn't build cleanly, because the "Vortex bus master" code
just casts the skb->data pointer to 'int':

        outl((int) (skb->data), ioaddr + Wn7_MasterAddr);

which is all kinds of broken.  Even on a good old traditional PC/AT it
would be broken because the high bits will be random kernel address
bits, but presumably the hardware ignores those bits.  I mean, it's ISA.
We're talking 16MB dma limits. The "good old days".

Make the build happy with this kind of craziness by using the proper
isa_virt_to_bus() handling that the full bus master code uses anyway
(the Vortex bus mastering is a limited special case).

Who knows, this might even work.

Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-16 11:14:47 -07:00
Srujana Challa
4b5a3ab17c octeontx2-af: Hardware configuration for inline IPsec
On OcteonTX2/CN10K SoC, the admin function (AF) is the only one
with all priviliges to configure HW and alloc resources, PFs and
it's VFs have to request AF via mailbox for all their needs.
This patch adds new mailbox messages for CPT PFs and VFs to configure
HW resources for inline-IPsec.

Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Srujana Challa <schalla@marvell.com>
Signed-off-by: Vidya Sagar Velumuri <vvelumuri@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-16 14:37:38 +01:00
David Thompson
ee8a9600b5 mlxbf_gige: clear valid_polarity upon open
The network interface managed by the mlxbf_gige driver can
get into a problem state where traffic does not flow.
In this state, the interface will be up and enabled, but
will stop processing received packets.  This problem state
will happen if three specific conditions occur:
    1) driver has received more than (N * RxRingSize) packets but
       less than (N+1 * RxRingSize) packets, where N is an odd number
       Note: the command "ethtool -g <interface>" will display the
       current receive ring size, which currently defaults to 128
    2) the driver's interface was disabled via "ifconfig oob_net0 down"
       during the window described in #1.
    3) the driver's interface is re-enabled via "ifconfig oob_net0 up"

This patch ensures that the driver's "valid_polarity" field is
cleared during the open() method so that it always matches the
receive polarity used by hardware.  Without this fix, the driver
needs to be unloaded and reloaded to correct this problem state.

Fixes: f92e1869d7 ("Add Mellanox BlueField Gigabit Ethernet driver")
Reviewed-by: Asmaa Mnebhi <asmaa@nvidia.com>
Signed-off-by: David Thompson <davthompson@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-16 14:31:58 +01:00
Hariprasad Kelam
63f85c401e octeontx2-pf: CN10K: Hide RPM stats over ethtool
CN10K MAC block (RPM) differs in number of stats compared to Octeontx2
MAC block (CGX). RPM supports stats for each class of PFC and error
packets etc. It would be difficult for user to read stats from ethtool
and map to their definition.

New debugfs file is already added to read RPM stats along with their
definition. This patch adds proper checks such that RPM stats will not
be part of ethtool.

Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-16 14:31:05 +01:00
Paolo Abeni
40ee363c84 igc: fix tunnel offloading
Checking tunnel offloading, it turns out that offloading doesn't work
as expected.  The following script allows to reproduce the issue.
Call it as `testscript DEVICE LOCALIP REMOTEIP NETMASK'

=== SNIP ===
if [ $# -ne 4 ]
then
  echo "Usage $0 DEVICE LOCALIP REMOTEIP NETMASK"
  exit 1
fi
DEVICE="$1"
LOCAL_ADDRESS="$2"
REMOTE_ADDRESS="$3"
NWMASK="$4"
echo "Driver: $(ethtool -i ${DEVICE} | awk '/^driver:/{print $2}') "
ethtool -k "${DEVICE}" | grep tx-udp
echo
echo "Set up NIC and tunnel..."
ip addr add "${LOCAL_ADDRESS}/${NWMASK}" dev "${DEVICE}"
ip link set "${DEVICE}" up
sleep 2
ip link add vxlan1 type vxlan id 42 \
		   remote "${REMOTE_ADDRESS}" \
		   local "${LOCAL_ADDRESS}" \
		   dstport 0 \
		   dev "${DEVICE}"
ip addr add fc00::1/64 dev vxlan1
ip link set vxlan1 up
sleep 2
rm -f vxlan.pcap
echo "Running tcpdump and iperf3..."
( nohup tcpdump -i any -w vxlan.pcap >/dev/null 2>&1 ) &
sleep 2
iperf3 -c fc00::2 >/dev/null
pkill tcpdump
echo
echo -n "Max. Paket Size: "
tcpdump -r vxlan.pcap -nnle 2>/dev/null \
| grep "${LOCAL_ADDRESS}.*> ${REMOTE_ADDRESS}.*OTV" \
| awk '{print $8}' | awk -F ':' '{print $1}' \
| sort -n | tail -1
echo
ip link del vxlan1
ip addr del ${LOCAL_ADDRESS}/${NWMASK} dev "${DEVICE}"
=== SNAP ===

The expected outcome is

  Max. Paket Size: 64904

This is what you see on igb, the code igc has been taken from.
However, on igc the output is

  Max. Paket Size: 1516

so the GSO aggregate packets are segmented by the kernel before calling
igc_xmit_frame.  Inside the subsequent call to igc_tso, the check for
skb_is_gso(skb) fails and the function returns prematurely.

It turns out that this occurs because the feature flags aren't set
entirely correctly in igc_probe.  In contrast to the original code
from igb_probe, igc_probe neglects to set the flags required to allow
tunnel offloading.

Setting the same flags as igb fixes the issue on igc.

Fixes: 34428dff36 ("igc: Add GSO partial support")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Tested-by: Corinna Vinschen <vinschen@redhat.com>
Acked-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-16 14:29:58 +01:00
Eli Cohen
7c3a0a018e net/{mlx5|nfp|bnxt}: Remove unnecessary RTNL lock assert
Remove the assert from the callback priv lookup function since it does
not require RTNL lock and is already protected by flow_indr_block_lock.

This will avoid warnings from being emitted to dmesg if the driver
registers its callback after an ingress qdisc was created for a
netdevice.

The warnings started after the following patch was merged:
commit 74fc4f8287 ("net: Fix offloading indirect devices dependency on qdisc order creation")

Signed-off-by: Eli Cohen <elic@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-16 14:09:30 +01:00
Cai Huoqing
52583c8d8b net: thunderx: Make use of the helper function dev_err_probe()
When possible use dev_err_probe help to properly deal with the
PROBE_DEFER error, the benefit is that DEFER issue will be logged
in the devices_deferred debugfs file.
And using dev_err_probe() can reduce code size, and simplify the code.

Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-16 13:35:35 +01:00
Cai Huoqing
4fd3ff3b29 net: hinic: Make use of the helper function dev_err_probe()
When possible use dev_err_probe help to properly deal with the
PROBE_DEFER error, the benefit is that DEFER issue will be logged
in the devices_deferred debugfs file.
And using dev_err_probe() can reduce code size, and simplify the code.

Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-16 13:35:35 +01:00
Cai Huoqing
015a22f46b net: ethoc: Make use of the helper function dev_err_probe()
When possible use dev_err_probe help to properly deal with the
PROBE_DEFER error, the benefit is that DEFER issue will be logged
in the devices_deferred debugfs file.
And using dev_err_probe() can reduce code size, and simplify the code.

Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-16 13:35:35 +01:00
Cai Huoqing
a72691ee19 net: enetc: Make use of the helper function dev_err_probe()
When possible use dev_err_probe help to properly deal with the
PROBE_DEFER error, the benefit is that DEFER issue will be logged
in the devices_deferred debugfs file.
And using dev_err_probe() can reduce code size, and simplify the code.

Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-16 13:35:35 +01:00
Cai Huoqing
9eda994d4b net: chelsio: cxgb4vf: Make use of the helper function dev_err_probe()
When possible use dev_err_probe help to properly deal with the
PROBE_DEFER error, the benefit is that DEFER issue will be logged
in the devices_deferred debugfs file.
And using dev_err_probe() can reduce code size, and simplify the code.

Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-16 13:35:35 +01:00
Cai Huoqing
b0ab7096dd net: atl1e: Make use of the helper function dev_err_probe()
When possible use dev_err_probe help to properly deal with the
PROBE_DEFER error, the benefit is that DEFER issue will be logged
in the devices_deferred debugfs file.
And using dev_err_probe() can reduce code size, and simplify the code.

Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-16 13:35:35 +01:00
Cai Huoqing
d502933c30 net: atl1c: Make use of the helper function dev_err_probe()
When possible use dev_err_probe help to properly deal with the
PROBE_DEFER error, the benefit is that DEFER issue will be logged
in the devices_deferred debugfs file.
And using dev_err_probe() can reduce code size, and simplify the code.

Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-16 13:35:35 +01:00
Cai Huoqing
95b5fc03c1 net: arc_emac: Make use of the helper function dev_err_probe()
When possible use dev_err_probe help to properly deal with the
PROBE_DEFER error, the benefit is that DEFER issue will be logged
in the devices_deferred debugfs file.
And using dev_err_probe() can reduce code size, and simplify the code.

Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-16 13:35:35 +01:00
Guenter Roeck
dff2d13114 net: i825xx: Use absolute_pointer for memcpy from fixed memory location
gcc 11.x reports the following compiler warning/error.

  drivers/net/ethernet/i825xx/82596.c: In function 'i82596_probe':
  arch/m68k/include/asm/string.h:72:25: error:
	'__builtin_memcpy' reading 6 bytes from a region of size 0 [-Werror=stringop-overread]

Use absolute_pointer() to work around the problem.

Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-15 12:04:28 -07:00
Ido Schimmel
49fd3b645d mlxsw: Add support for transceiver modules reset
Implement support for ethtool_ops::reset in order to reset transceiver
modules. The module backing the netdev is reset when the 'ETH_RESET_PHY'
flag is set. After a successful reset, the flag is cleared by the driver
and other flags are ignored. This is in accordance with the interface
documentation:

"The reset() operation must clear the flags for the components which
were actually reset. On successful return, the flags indicate the
components which were not reset, either because they do not exist in the
hardware or because they cannot be reset independently. The driver must
never reset any components that were not requested."

Reset is useful in order to allow a module to transition out of a fault
state. From section 6.3.2.12 in CMIS 5.0: "Except for a power cycle, the
only exit path from the ModuleFault state is to perform a module reset
by taking an action that causes the ResetS transition signal to become
TRUE (see Table 6-11)".

An error is returned when the netdev is administratively up:

 # ip link set dev swp11 up

 # ethtool --reset swp11 phy
 ETHTOOL_RESET 0x40
 Cannot issue ETHTOOL_RESET: Invalid argument

 # ip link set dev swp11 down

 # ethtool --reset swp11 phy
 ETHTOOL_RESET 0x40
 Components reset:     0x40

An error is returned when the module is shared by multiple ports (split
ports) and the "phy-shared" flag is not set:

 # devlink port split swp11 count 4

 # ethtool --reset swp11s0 phy
 ETHTOOL_RESET 0x40
 Cannot issue ETHTOOL_RESET: Invalid argument

 # ethtool --reset swp11s0 phy-shared
 ETHTOOL_RESET 0x400000
 Components reset:     0x400000

 # devlink port unsplit swp11s0

 # ethtool --reset swp11 phy
 ETHTOOL_RESET 0x40
 Components reset:     0x40

An error is also returned when one of the ports using the module is
administratively up:

 # devlink port split swp11 count 4

 # ip link set dev swp11s1 up

 # ethtool --reset swp11s0 phy-shared
 ETHTOOL_RESET 0x400000
 Cannot issue ETHTOOL_RESET: Invalid argument

 # ip link set dev swp11s1 down

 # ethtool --reset swp11s0 phy-shared
 ETHTOOL_RESET 0x400000
 Components reset:     0x400000

Reset is performed by writing to the "rst" bit of the PMAOS register,
which instructs the firmware to assert the reset signal connected to the
module for a fixed amount of time.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 16:17:16 +01:00
Ido Schimmel
8f4ebdb0a2 mlxsw: Make PMAOS pack function more generic
The PMAOS register has enable bits (e.g., PMAOS.ee) that allow changing
only a subset of the fields, which is exactly what subsequent patches
will need to do. Instead of passing multiple arguments to its pack
function, only pass the module index and let the rest be set by the
different callers.

No functional changes intended.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 16:17:16 +01:00
Ido Schimmel
ef23841bb9 mlxsw: reg: Add fields to PMAOS register
The Ports Module Administrative and Operational Status (PMAOS) register
configures and retrieves the per-module status. Extend it with fields
required to support various module settings such as reset and power
mode.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 16:17:16 +01:00
Ido Schimmel
896f399be0 mlxsw: Track per-module port status
In the common port module core, track the number of logical ports that
are mapped to the port module and the number of logical ports using it
that are administratively up.

This will be used by later patches to potentially veto and control
certain operations on the module, such as reset and setting its power
mode.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 16:17:16 +01:00
Ido Schimmel
196bff2927 mlxsw: spectrum: Do not return an error in mlxsw_sp_port_module_unmap()
The return value is never checked. Allows us to simplify a later patch.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 16:17:15 +01:00
Ido Schimmel
06277ca238 mlxsw: spectrum: Do not return an error in ndo_stop()
The return value is not checked by the networking stack. Allows us to
simplify a later patch.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 16:17:15 +01:00
Ido Schimmel
bd6e43f595 mlxsw: core_env: Convert 'module_info_lock' to a mutex
After the previous patch, the lock is always taken in process context so
it can be converted to a mutex. It is needed for future changes where we
will need to be able to sleep when holding the lock.

Convert the lock to a mutex.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 16:17:15 +01:00
Ido Schimmel
163f3d2dd0 mlxsw: core_env: Defer handling of module temperature warning events
Module temperature events are currently handled in softIRQ context,
requiring the 'module_info_lock' to be a spin lock. In future patchsets
we will need to be able to hold the lock while sleeping.

Therefore, defer handling of these events using a work queue so that the
next patch will be able to convert the lock to a mutex.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 16:17:15 +01:00
Ido Schimmel
25a91f835a mlxsw: core: Remove mlxsw_core_is_initialized()
After the previous patch, the switch driver is always initialized last,
making this function redundant.

Remove it.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 16:17:15 +01:00
Ido Schimmel
3d7a6f6779 mlxsw: core: Initialize switch driver last
Commit 961cf99a07 ("mlxsw: core: Re-order initialization sequence")
changed the initialization sequence so that the switch driver (e.g.,
mlxsw_spectrum) is initialized before registration with the hwmon and
thermal subsystems.

This was done in order to avoid situations where hwmon/thermal code uses
features not supported by current firmware version, which is only
validated as part of switch driver initialization.

Later, commit b79cb787ac ("mlxsw: Move fw flashing code into core.c")
moved firmware validation and flashing code from the switch driver to
mlxsw_core so that it is performed before driver initialization.

Therefore, change the initialization sequence back to its original form.

In addition to being more straightforward, it will allow us to simplify
parts of the code in subsequent patches and future patchsets.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 16:17:15 +01:00
Leon Romanovsky
e9310aed8e net/mlx5: Publish and unpublish all devlink parameters at once
The devlink parameters were published in two steps despite being static
and known in advance.

First step was to use devlink_params_publish() which iterated over all
known up to that point parameters and sent notification messages.
In second step, the call was devlink_param_publish() that looped over
same parameters list and sent notification for new parameters.

In order to simplify the API, move devlink_params_publish() to be called
when all parameters were already added and save the need to iterate over
parameters list again.

As a side effect, this change fixes the error unwind flow in which
parameters were not marked as unpublished.

Fixes: 82e6c96f04 ("net/mlx5: Register to devlink ingress VLAN filter trap")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 16:12:55 +01:00
Sukadev Bhattiprolu
bbd809305b ibmvnic: Reuse tx pools when possible
Rather than releasing the tx pools on every close and reallocating
them on open, reuse the tx pools unless the pool parameters (number
of pools, size of each pool or size of each buffer in a pool) have
changed.

If the pool parameters changed, then release the old pools (if
any) and allocate new ones.

Specifically release tx pools, if:
	- adapter is removed,
	- pool parameters change during reset,
	- we encounter an error when opening the adapter in response
	  to a user request (in ibmvnic_open()).

and don't release them:
	- in __ibmvnic_close() or
	- on errors in __ibmvnic_open()

in the hope that we can reuse them during this or next reset.

With these changes reset_tx_pools() can be dropped because its
optimization is now included in init_tx_pools() itself.

cleanup_tx_pools() releases all the skbs associated with the pool and
is called from ibmvnic_cleanup(), which is called on every reset. Since
we want to reuse skbs across resets, move cleanup_tx_pools() out of
ibmvnic_cleanup() and call it only when user closes the adapter.

Add two new adapter fields, ->prev_mtu, ->prev_tx_pool_size to track the
previous values and use them to decide whether to reuse or realloc the
pools.

Reviewed-by: Rick Lindsley <ricklind@linux.vnet.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:12:24 +01:00
Sukadev Bhattiprolu
489de956e7 ibmvnic: Reuse rx pools when possible
Rather than releasing the rx pools on and reallocating them on every
reset, reuse the rx pools unless the pool parameters (number of pools,
size of each pool or size of each buffer in a pool) have changed.

If the pool parameters changed, then release the old pools (if any)
and allocate new ones.

Specifically release rx pools, if:
	- adapter is removed,
	- pool parameters change during reset,
	- we encounter an error when opening the adapter in response
	  to a user request (in ibmvnic_open()).

and don't release them:
	- in __ibmvnic_close() or
	- on errors in __ibmvnic_open()

in the hope that we can reuse them on the next reset.

With these, reset_rx_pools() can be dropped because its optimzation is
now included in init_rx_pools() itself.

cleanup_rx_pools() releases all the skbs associated with the pool and
is called from ibmvnic_cleanup(), which is called on every reset. Since
we want to reuse skbs across resets, move cleanup_rx_pools() out of
ibmvnic_cleanup() and call it only when user closes the adapter.

Add two new adapter fields, ->prev_rx_buf_sz, ->prev_rx_pool_size to
keep track of the previous values and use them to decide whether to
reuse or realloc the pools.

Reviewed-by: Rick Lindsley <ricklind@linux.vnet.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:12:24 +01:00
Sukadev Bhattiprolu
f8ac0bfa7d ibmvnic: Reuse LTB when possible
Reuse the long term buffer during a reset as long as its size has
not changed. If the size has changed, free it and allocate a new
one of the appropriate size.

When we do this, alloc_long_term_buff() and reset_long_term_buff()
become identical. Drop reset_long_term_buff().

Reviewed-by: Rick Lindsley <ricklind@linux.vnet.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:12:24 +01:00
Sukadev Bhattiprolu
129854f061 ibmvnic: Use bitmap for LTB map_ids
In a follow-on patch, we will reuse long term buffers when possible.
When doing so we have to be careful to properly assign map ids. We
can no longer assign them sequentially because a lower map id may be
available and we could wrap at 255 and collide with an in-use map id.

Instead, use a bitmap to track active map ids and to find a free map id.
Don't need to take locks here since the map_id only changes during reset
and at that time only the reset worker thread should be using the adapter.

Noticed this when analyzing an error Dany Madden ran into with the
patch set.

Reported-by: Dany Madden <drt@linux.ibm.com>
Reviewed-by: Rick Lindsley <ricklind@linux.vnet.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:12:24 +01:00
Sukadev Bhattiprolu
0d1af4fa71 ibmvnic: init_tx_pools move loop-invariant code
In init_tx_pools() move some loop-invariant code out of the loop.

Reviewed-by: Rick Lindsley <ricklind@linux.vnet.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:12:23 +01:00
Sukadev Bhattiprolu
8243c7ed6d ibmvnic: Use/rename local vars in init_tx_pools
Use/rename local variables in init_tx_pools() for consistency with
init_rx_pools() and for readability. Also add some comments

Reviewed-by: Rick Lindsley <ricklind@linux.vnet.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:12:23 +01:00
Sukadev Bhattiprolu
0df7b9ad8f ibmvnic: Use/rename local vars in init_rx_pools
To make the code more readable, use/rename some local variables.
Basically we have a set of pools, num_pools. Each pool has a set of
buffers, pool_size and each buffer is of size buff_size.

pool_size is a bit ambiguous (whether size in bytes or buffers). Add
a comment in the header file to make it explicit.

Reviewed-by: Rick Lindsley <ricklind@linux.vnet.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:12:23 +01:00
Sukadev Bhattiprolu
0f2bf3188c ibmvnic: Fix up some comments and messages
Add/update some comments/function headers and fix up some messages.

Reviewed-by: Rick Lindsley <ricklind@linux.vnet.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:12:23 +01:00
Sukadev Bhattiprolu
38106b2c43 ibmvnic: Consolidate code in replenish_rx_pool()
For better readability, consolidate related code in replenish_rx_pool()
and add some comments.

Reviewed-by: Rick Lindsley <ricklind@linux.vnet.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:12:23 +01:00
Aleksander Jan Bajkowski
14d4e308e0 net: lantiq: configure the burst length in ethernet drivers
Configure the burst length in Ethernet drivers. This improves
Ethernet performance by 58%. According to the vendor BSP,
8W burst length is supported by ar9 and newer SoCs.

The NAT benchmark results on xRX200 (Down/Up):
* 2W: 330 Mb/s
* 4W: 432 Mb/s    372 Mb/s
* 8W: 520 Mb/s    389 Mb/s

Tested on xRX200 and xRX330.

Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:02:01 +01:00
Adrian Bunk
52ce14c134 bnx2x: Fix enabling network interfaces without VFs
This function is called to enable SR-IOV when available,
not enabling interfaces without VFs was a regression.

Fixes: 65161c3555 ("bnx2x: Fix missing error code in bnx2x_iov_init_one()")
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Reported-by: YunQiang Su <wzssyqa@gmail.com>
Tested-by: YunQiang Su <wzssyqa@gmail.com>
Cc: stable@vger.kernel.org
Acked-by: Shai Malin <smalin@marvell.com>
Link: https://lore.kernel.org/r/20210912190523.27991-1-bunk@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-09-14 19:26:48 -07:00
Guangbin Huang
5c56ff486d net: hns3: PF support get multicast MAC address space assigned by firmware
The new firmware supports to divides the whole multicast MAC address space
equally to functions of all PFs, and calculates the space size of each PF
according to its function number.

To support this feature, PF driver adds querying multicast MAC address
space size from firmware and limits used number according to space size.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-14 14:32:29 +01:00
Guangbin Huang
e435a6b531 net: hns3: PF support get unicast MAC address space assigned by firmware
Currently, there are two ways for PF to set the unicast MAC address space
size: specified by config parameters in firmware or set to default value.

That's mean if the config parameters in firmware is zero, driver will
divide the whole unicast MAC address space equally to 8 PFs. However, in
this case, the unicast MAC address space will be wasted a lot when the
hardware actually has less then 8 PFs. And in the other hand, if one PF has
much more VFs than other PFs, then each function of this PF will has much
less address space than other PFs.

In order to ameliorate the above two situations, introduce the third way
of unicast MAC address space assignment: firmware divides the whole unicast
MAC address space equally to functions of all PFs, and calculates the space
size of each PF according to its function number. PF queries the space size
by the querying device specification command when in initialization
process.

The third way assignment is lower priority than specified by config
parameters, only if the config parameters is zero can be used, and if
firmware does not support the third way assignment, then driver still
divides the whole unicast MAC address space equally to 8 PFs.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-14 14:32:29 +01:00
Heiner Kallweit
01649011cc r8169: remove support for chip version RTL_GIGA_MAC_VER_27
This patch is a follow-up to beb401ec50 ("r8169: deprecate support for
RTL_GIGA_MAC_VER_27") that came with 5.12. Nobody complained, so let's
remove support for this chip version.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-14 14:13:58 +01:00
Jiri Pirko
cd92d79d5f mlxsw: reg: Remove PMTM register
It is not used anymore, remove it.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-14 12:44:16 +01:00
Jiri Pirko
32ada69bba mlxsw: spectrum: Use PMTDB register to obtain split info
Newly introduced PMTDB register is there to provide all needed info
about particular requested port split configuration. Use it instead of
figuring the info out manually in the driver.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-14 12:44:16 +01:00
Jiri Pirko
78f824b335 mlxsw: reg: Add Port Module To local DataBase Register
The PMTDB register allows to query the possible module<->local port
mapping than can be used in PMLP. It does not represent the actual/current
mapping of the local to module. Actual mapping is only defined by PMLP.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-14 12:44:16 +01:00
Jiri Pirko
1dbfc9d765 mlxsw: spectrum: Use PLLP to get front panel number and split number
Instead of relying on the values coming from the PMLP register, use PLLP
to get the information about port front panel number and split number.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-14 12:44:16 +01:00
Jiri Pirko
ed403777f6 mlxsw: reg: Add Port Local port to Label Port mapping Register
The PLLP register returns the mapping from Local Port into Label Port.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-14 12:44:16 +01:00
Jiri Pirko
fec2386162 mlxsw: spectrum: Move port SWID set before core port init
During port creation, mlxsw_core_port_init() is called with the front
panel port number and the split port sub-number. Currently, this
information is determined by the driver without firmware assistance.

Subsequent patches are going to query this information from firmware,
but this requires the port to assigned to SWID.

Therefore, move port SWID assignment before mlxsw_core_port_init().

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-14 12:44:16 +01:00
Jiri Pirko
13eb056ee5 mlxsw: spectrum: Move port module mapping before core port init
During port creation, mlxsw_core_port_init() is called with the front
panel port number and the split port sub-number. Currently, this
information is determined by the driver without firmware assistance.

Subsequent patches are going to query this information from firmware,
but this requires the port to be mapped to a module.

Therefore, move port mapping before mlxsw_core_port_init().

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-14 12:44:16 +01:00
Jiri Pirko
847371ce04 mlxsw: spectrum: Bump minimum FW version to xx.2008.3326
Add latest verified version of Nvidia Spectrum-family switch firmware,
for Spectrum (13.2008.3326), Spectrum-2 (29.2008.3326) and Spectrum-3
(30.2008.3326).

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-14 12:44:16 +01:00
Jiaran Zhang
427900d27d net: hns3: fix the timing issue of VF clearing interrupt sources
Currently, the VF does not clear the interrupt source immediately after
receiving the interrupt. As a result, if the second interrupt task is
triggered when processing the first interrupt task, clearing the
interrupt source before exiting will clear the interrupt sources of the
two tasks at the same time. As a result, no interrupt is triggered for
the second task. The VF detects the missed message only when the next
interrupt is generated.

Clearing it immediately after executing check_evt_cause ensures that:
1. Even if two interrupt tasks are triggered at the same time, they can
be processed.
2. If the second task is triggered during the processing of the first
task and the interrupt source is not cleared, the interrupt is reported
after vector0 is enabled.

Fixes: b90fcc5bd9 ("net: hns3: add reset handling for VF when doing Core/Global/IMP reset")
Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-13 14:25:37 +01:00
Jiaran Zhang
472430a7b0 net: hns3: fix the exception when query imp info
When the command for querying imp info is issued to the firmware,
if the firmware does not support the command, the returned value
of bd num is 0.
Add protection mechanism before alloc memory to prevent apply for
0-length memory.

Fixes: 0b198b0d80 ("net: hns3: refactor dump m7 info of debugfs")
Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-13 14:25:37 +01:00
Yufeng Mo
b81d894874 net: hns3: disable mac in flr process
The firmware will not disable mac in flr process. Therefore, the driver
needs to proactively disable mac during flr, which is the same as the
function reset.

Fixes: 35d93a3004 ("net: hns3: adjust the process of PF reset")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-13 14:25:37 +01:00
Yufeng Mo
1dc839ec09 net: hns3: change affinity_mask to numa node range
Currently, affinity_mask is set to a single cpu. As a result,
irqbalance becomes invalid in SUBSET or EXACT mode. To solve
this problem, change affinity_mask to numa node range. In this
way, irqbalance can be performed on the cpu of the numa node.

Fixes: 0812545487 ("net: hns3: add interrupt affinity support for misc interrupt")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-13 14:25:37 +01:00
Yufeng Mo
d18e81183b net: hns3: pad the short tunnel frame before sending to hardware
The hardware cannot handle short tunnel frames below 65 bytes,
and will cause vlan tag missing problem. So pads packet size to
65 bytes for tunnel frames to fix this bug.

Fixes: 3db084d28dc0("net: hns3: Fix for vxlan tx checksum bug")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-13 14:25:37 +01:00
Yunsheng Lin
f7ec554b73 net: hns3: add option to turn off page pool feature
When page pool is added to the hns3 driver, it is always
enabled unconditionally, which means spilt page handling
in the hns3 driver is dead code.

As there is a requirement to test the performance between
spilt page handling in driver and page pool, so add a module
param to support disabling the page pool.

When the page pool is proved to perform better in most case,
the spilt page handling in driver can be removed.

Fixes: 93188e9642 ("net: hns3: support skb's frag page recycling based on page pool")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-13 14:25:37 +01:00
Len Baker
9eb4c320be nfp: Prefer struct_size over open coded arithmetic
As noted in the "Deprecated Interfaces, Language Features, Attributes,
and Conventions" documentation [1], size calculations (especially
multiplication) should not be performed in memory allocator (or similar)
function arguments due to the risk of them overflowing. This could lead
to values wrapping around and a smaller allocation being made than the
caller was expecting. Using those allocations could lead to linear
overflows of heap memory and other misbehaviors.

So, use the struct_size() helper to do the arithmetic instead of the
argument "size + count * size" in the kzalloc() function.

[1] https://www.kernel.org/doc/html/v5.14/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments

Signed-off-by: Len Baker <len.baker@gmx.com>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-13 13:02:38 +01:00
Shai Malin
f55e36d5ab qed: Improve the stack space of filter_config()
As it was reported and discussed in: https://lore.kernel.org/lkml/CAHk-=whF9F89vsfH8E9TGc0tZA-yhzi2Di8wOtquNB5vRkFX5w@mail.gmail.com/
This patch improves the stack space of qede_config_rx_mode() by
splitting filter_config() to 3 functions and removing the
union qed_filter_type_params.

Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Shai Malin <smalin@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-13 12:41:32 +01:00