Commit Graph

106067 Commits

Author SHA1 Message Date
Taehee Yoo
cbc21dc1cf amt: add data plane of amt interface
Before forwarding multicast traffic, the amt interface establishes between
gateway and relay. In order to establish, amt defined some message type
and those message flow looks like the below.

                      Gateway                  Relay
                      -------                  -----
                         :        Request        :
                     [1] |           N           |
                         |---------------------->|
                         |    Membership Query   | [2]
                         |    N,MAC,gADDR,gPORT  |
                         |<======================|
                     [3] |   Membership Update   |
                         |   ({G:INCLUDE({S})})  |
                         |======================>|
                         |                       |
    ---------------------:-----------------------:---------------------
   |                     |                       |                     |
   |                     |    *Multicast Data    |  *IP Packet(S,G)    |
   |                     |      gADDR,gPORT      |<-----------------() |
   |    *IP Packet(S,G)  |<======================|                     |
   | ()<-----------------|                       |                     |
   |                     |                       |                     |
    ---------------------:-----------------------:---------------------
                         ~                       ~
                         ~        Request        ~
                     [4] |           N'          |
                         |---------------------->|
                         |   Membership Query    | [5]
                         | N',MAC',gADDR',gPORT' |
                         |<======================|
                     [6] |                       |
                         |       Teardown        |
                         |   N,MAC,gADDR,gPORT   |
                         |---------------------->|
                         |                       | [7]
                         |   Membership Update   |
                         |  ({G:INCLUDE({S})})   |
                         |======================>|
                         |                       |
    ---------------------:-----------------------:---------------------
   |                     |                       |                     |
   |                     |    *Multicast Data    |  *IP Packet(S,G)    |
   |                     |     gADDR',gPORT'     |<-----------------() |
   |    *IP Packet (S,G) |<======================|                     |
   | ()<-----------------|                       |                     |
   |                     |                       |                     |
    ---------------------:-----------------------:---------------------
                         |                       |
                         :                       :

1. Discovery
 - Sent by Gateway to Relay
 - To find Relay unique ip address
2. Advertisement
 - Sent by Relay to Gateway
 - Contains the unique IP address
3. Request
 - Sent by Gateway to Relay
 - Solicit to receive 'Query' message.
4. Query
 - Sent by Relay to Gateway
 - Contains General Query message.
5. Update
 - Sent by  Gateway to Relay
 - Contains report message.
6. Multicast Data
 - Sent by Relay to Gateway
 - encapsulated multicast traffic.
7. Teardown
 - Not supported at this time.

Except for the Teardown message, it supports all messages.

In the next patch, IGMP/MLD logic will be added.

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:36:08 +00:00
Taehee Yoo
b9022b53ad amt: add control plane of amt interface
It adds definitions and control plane code for AMT.
this is very similar to udp tunneling interfaces such as gtp, vxlan, etc.
In the next patch, data plane code will be added.

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:36:08 +00:00
Jakub Kicinski
a66f64b808 netdevsim: rename 'driver' entry points
Rename functions serving as driver entry points
from nsim_dev_... to nsim_drv_... this makes the
API boundary between bus and dev clearer.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:29:41 +00:00
Jakub Kicinski
a3353ec325 netdevsim: move max vf config to dev
max_vfs is a strange little beast because the file
hangs off of nsim's debugfs, but it configures a field
in the bus device. Move it to dev.c, let's look at it
as if the device driver was imposing VF limit based
on FW info (like pci_sriov_set_totalvfs()).

Again, when moving refactor the function not to hold
the vfs lock pointlessly while parsing the input.
Wrap the access from the read side in READ_ONCE()
to appease concurrency checkers. Do not check if
return value from snprintf() is negative...

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:29:41 +00:00
Jakub Kicinski
1c401078bc netdevsim: move details of vf config to dev
Since "eswitch" configuration was added bus.c contains
a lot of device details which really belong to dev.c.

Restructure the code while moving it.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:29:41 +00:00
Jakub Kicinski
5e388f3dc3 netdevsim: move vfconfig to nsim_dev
When netdevsim got split into the faux bus vfconfig ended
up in the bus device (think pci_dev) which is strange because
it contains very networky not to say netdevy information.
Move it to nsim_dev, which is the driver "priv" structure
for the device.

To make sure we don't race with probe/remove take
the device lock (much like PCI).

While at it remove the NULL-checking of vfconfigs.
It appears to be pointless.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:29:41 +00:00
Jakub Kicinski
26c37d89f6 netdevsim: take rtnl_lock when assigning num_vfs
Legacy VF NDOs look at num_vfs and then based on that
index into vfconfig. If we don't rtnl_lock() num_vfs
may get set to 0 and vfconfig freed/replaced while
the NDO is running.

We don't need to protect replacing vfconfig since it's
only done when num_vfs is 0.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:29:41 +00:00
Dexuan Cui
635096a86e net: mana: Support hibernation and kexec
Implement the suspend/resume/shutdown callbacks for hibernation/kexec.

Add mana_gd_setup() and mana_gd_cleanup() for some common code, and
use them in the mand_gd_* callbacks.

Reuse mana_probe/remove() for the hibernation path.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:21:49 +00:00
Dexuan Cui
62ea8b77ed net: mana: Improve the HWC error handling
Currently when the HWC creation fails, the error handling is flawed,
e.g. if mana_hwc_create_channel() -> mana_hwc_establish_channel() fails,
the resources acquired in mana_hwc_init_queues() is not released.

Enhance mana_hwc_destroy_channel() to do the proper cleanup work and
call it accordingly.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:21:49 +00:00
Dexuan Cui
3c37f35735 net: mana: Report OS info to the PF driver
The PF driver might use the OS info for statistical purposes.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:21:49 +00:00
Dexuan Cui
6c7ea69653 net: mana: Fix the netdev_err()'s vPort argument in mana_init_port()
Use the correct port index rather than 0.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:21:49 +00:00
Yu Xiao
f7536ffb09 nfp: flower: Allow ipv6gretap interface for offloading
The tunnel_type check only allows for "netif_is_gretap", but for
OVS the port is actually "netif_is_ip6gretap" when setting up GRE
for ipv6, which means offloading request was rejected before.

Therefore, adding "netif_is_ip6gretap" allow ipv6gretap interface
for offloading.

Signed-off-by: Yu Xiao <yu.xiao@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:09:55 +00:00
David S. Miller
ebed1cf5b8 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
100GbE Intel Wired LAN Driver Updates 2021-10-29

This series contains updates to ice and iavf drivers and virtchnl header
file.

Brett removes vlan_promisc argument from a function call for ice driver.
In the virtchnl header file he removes an unused, reserved define and
converts raw value defines to instead use the BIT macro.

Marcin adds syncing of MAC addresses when creating switchdev VFs to
remove error messages on link up and stops showing buffer information
for port representors to remove duplicated entries being displayed for
ice driver.

Karen introduces a helper to go from pci_dev to iavf_adapter in the
iavf driver.

Przemyslaw fixes an issue where iavf was attempting to free IRQs before
calling disable.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:05:20 +00:00
David S. Miller
2aec919f8d mlx5-updates-2021-10-29
1) Minor trivial refactoring and improvements
 2) Check for unsupported parameters fields in SW steering
 3) Support TC offload for OVS internal port, from Ariel, see below.
 
 Ariel Levkovich says:
 
 =====================
 
 Support HW offload of TC rules involving OVS internal port
 device type as the filter device or the destination
 device.
 
 The support is for flows which explicitly use the internal
 port as source or destination device as well as indirect offload
 for flows performing tunnel set or unset via a tunnel device
 and the internal port is the tunnel overlay device.
 
 Since flows with internal port as source port are added
 as egress rules while redirecting to internal port is done
 as an ingress redirect, the series introduces the necessary
 changes in mlx5_core driver to support the new types of flows
 and actions.
 
 =====================
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmF8X0sACgkQSD+KveBX
 +j5PsQf/RfsE+spW0yJriJQ6Et+o+/CYR+AQYat5MaXjRw8uMz6uBcfXWCIBbYjw
 OwNP4ZagWXIHMkelj2Ap0Qlu4yqkUBy1A0le7HcAzOeje1vc9BObS15w9pJvQ9cp
 br3ZK5VZnQccSfF/LQpSjlGhD9083kETA2uXlCz7vitn8MVaya6ue6GU+wFC4Wnz
 LjOJ4PMXCEfhpA+efD0nD4EK6FJjqvJoVQkxWNmgOW7yg5PcyWXZD/tsDZUI8DGl
 0GlnM6W2H8bC0YhW01cnOsWPU+vtMLCsaF0YKqsLhnWUsaYSD5lXPIHqH6VpucZ7
 LSv/c2U9pBnWkf7UoFyuEeQxAhz1rg==
 =Uxa9
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2021-10-29' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2021-10-29

1) Minor trivial refactoring and improvements
2) Check for unsupported parameters fields in SW steering
3) Support TC offload for OVS internal port, from Ariel, see below.

Ariel Levkovich says:

=====================

Support HW offload of TC rules involving OVS internal port
device type as the filter device or the destination
device.

The support is for flows which explicitly use the internal
port as source or destination device as well as indirect offload
for flows performing tunnel set or unset via a tunnel device
and the internal port is the tunnel overlay device.

Since flows with internal port as source port are added
as egress rules while redirecting to internal port is done
as an ingress redirect, the series introduces the necessary
changes in mlx5_core driver to support the new types of flows
and actions.

=====================

====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 12:53:24 +00:00
Jakub Kicinski
6d40edcf4e Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
1GbE Intel Wired LAN Driver Updates 2021-10-29

This series contains updates to igc driver only.

Sasha removes an unnecessary media type check, adds a new device ID, and
changes a device reset to a port reset command.
====================

Link: https://lore.kernel.org/r/20211029174101.2970935-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-29 21:23:18 -07:00
Leon Romanovsky
d269287761 bnxt_en: Remove not used other ULP define
There is only one bnxt ULP in the upstream kernel and definition
for other ULP can be safely removed.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/3a8ea720b28ec4574648012d2a00208f1144eff5.1635527693.git.leonro@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-29 21:21:09 -07:00
Jakub Kicinski
5c59579100 Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
40GbE Intel Wired LAN Driver Updates 2021-10-29

This series contains updates to i40e, ice, igb, and ixgbevf drivers.

Yang Li simplifies return statements of bool values for i40e and ice.

Jan Kundrát corrects problems with I2C bit-banging for igb.

Colin Ian King removes unneeded variable initialization for ixgbevf.
====================

Link: https://lore.kernel.org/r/20211029164641.2714265-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-29 21:19:30 -07:00
Jakub Kicinski
ba064e4cf9 netdevsim: remove max_vfs dentry
Commit d395381909 ("netdevsim: Add max_vfs to bus_dev")
added this file and saved the dentry for no apparent reason.

Link: https://lore.kernel.org/r/20211028211753.22612-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-29 21:15:08 -07:00
Ariel Levkovich
b16eb3c81f net/mlx5: Support internal port as decap route device
When performing route device lookup for decap action, support
the case of ovs internal port as the lookup result.

In such case, an internal port struct is mapped and attached
to the flow attributes so that the source port matching of the
rule will match on the internal port's metadata value.

Signed-off-by: Ariel Levkovich <lariel@nvidia.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-29 13:53:31 -07:00
Ariel Levkovich
5e99427217 net/mlx5e: Term table handling of internal port rules
Adjust termination table logic to handle rules which
involve internal port as filter or forwarding device.

For cases where the rule forwards from internal port
to uplink, always choose to go via termination table.
This is because it is not known from where the packet
originally arrived to the internal port and it is possible
that it came from the uplink itself, in which case
a term table is required to perform hairpin.
If the packet arrived from a vport, going via term
table has no effect.

For cases where the rule forwards to an internal port
from uplink the rep pointer will point to the uplink rep,
avoid going via termination table as it is not required.

Signed-off-by: Ariel Levkovich <lariel@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-29 13:53:31 -07:00
Ariel Levkovich
166f431ec6 net/mlx5e: Add indirect tc offload of ovs internal port
Register callbacks for tc blocks of ovs internal port devices.

This allows an indirect offloading rules that apply on
such devices as the filter device.

In case a rule is added to a tc block of an internal port,
the mlx5 driver will implicitly add a matching on the internal
port's unique vport metadata value to the rule's matching list.
Therefore, only packets that previously hit a rule that redirects
to an internal port and got the vport metadata overwritten to the
internal port's unique metadata, can match on such indirect rule.

Offloading of both ingress and egress tc blocks of internal ports
is supported as opposed to other devices where only ingress block
offloading is supported.

Signed-off-by: Ariel Levkovich <lariel@nvidia.com>
Reviewed-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-29 13:53:30 -07:00
Ariel Levkovich
100ad4e2d7 net/mlx5e: Offload internal port as encap route device
When pefroming encap action, a route lookup is performed
to find the routing device the packet should be forwarded
to after the encapsulation. This is the device that has the
local tunnel ip address.

This change adds support to offload an encap rule where the
route device ends up being an ovs internal port.
In such case, the driver will add a HW rule that will encapsulate
the packet with the tunnel header and will overwrite the vport
metadata in reg_c0 to the internal port metadata value.
Finally, the packet will be forwarded to the root table to be
processed again with the indication that it came from an internal
port.

Signed-off-by: Ariel Levkovich <lariel@nvidia.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-29 13:53:30 -07:00
Ariel Levkovich
27484f7170 net/mlx5e: Offload tc rules that redirect to ovs internal port
Allow offloading rules that redirect to ovs internal port
ingress and egress.

To support redirect to ingress device, offloading of REDIRECT_INGRESS
action is added.

When a tc rule redirects to ovs internal port, the hw rule will
overwrite the input vport value in reg_c0 with a new vport metadata
value that is mapped for this internal port using the internal
port mapping api that is introduce in previous patches.
After that the hw rule will redirect the packet to the root table
to continue processing with the new vport metadata value.

The new vport metadata value indicates that this packet is now
arriving through an internal port and therefore should be processed
using rules that apply on the same internal port as the filter device.
Therefore, following rules that apply on this internal port will have
to match on the same vport metadata value as part of their matching
keys to make sure the packet belongs to the internal port.

Signed-off-by: Ariel Levkovich <lariel@nvidia.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-29 13:53:30 -07:00
Ariel Levkovich
dbac71f229 net/mlx5e: Accept action skbedit in the tc actions list
Setting the skb packet type field to host is usually
done when performing forwarding to ingress device.

This is required since the receive handling that is used
by the redirect to ingress action checks whether the packet
doesn't belong to this host and drops the packet in such case.

In order to be able to offload action redirect ingress, tc offload
code needs to accept the skbedit ptype action as well.

There's no special handling in HW for such action since it will
be followed by a redirect action and therefore, this code
only allows us to accept such action in the actions list but
not performing anything specific in HW for it.

Signed-off-by: Ariel Levkovich <lariel@nvidia.com>
Reviewed-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-29 13:53:29 -07:00
Ariel Levkovich
4f4edcc2b8 net/mlx5: E-Switch, Add ovs internal port mapping to metadata support
Adding infrastructure to map ovs internal port device to vport
match metadata to support offload of rules with internal port as
the filter device or as the destination device.

The infrastructure allows adding and removing internal port device
to an eswitch database and getting a unique vport metadata value to
be placed and match on in reg_c0 when offloading rules that are coming
from or going to an internal port.

The new int port metadata can be written to the source port register
in HW to indicate that current source port of the packet is the
internal port and not one of the actual HW vports (uplink or VF).
Using this method, it is possible to offload TC rules with an OVS
internal port as their destination port (overwriting the src vport
register) or as the filter port (matching on the value of the src
vport register and making sure it matches to the internal port's
value).

There is also a need to handle a miss case where the packet's
src port value was changed in HW to an internal port but a following
rule which matches on this new src port value wasn't found in HW.

In such case, the packet will be forwarded to the driver with
metadata which allows driver to restore the info of the internal
port's netdevice. Once this info is restored, the uplink driver
can forward the packet to the relevant netdevice in SW.

Signed-off-by: Ariel Levkovich <lariel@nvidia.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-29 13:53:29 -07:00
Ariel Levkovich
189ce08ebf net/mlx5e: Use generic name for the forwarding dev pointer
Rename tun_dev to fwd_dev within mlx5e_tc_update_priv struct
since future implementation may introduce other device types
which the handler is forwarding to.

Signed-off-by: Ariel Levkovich <lariel@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-29 13:53:29 -07:00
Ariel Levkovich
28e7606fa8 net/mlx5e: Refactor rx handler of represetor device
Move the ownership of skb forwarding to network stack to the
tc update_skb handler as different cases will require different
handling of the skb.

While the tc handler will take care of the various cases and
properly handle the handover of the skb to the network stack
and freeing the skb, the main rx handler will be kept clean
from branches and usage of flags.

Signed-off-by: Ariel Levkovich <lariel@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-29 13:53:29 -07:00
Muhammad Sammar
941f19798a net/mlx5: DR, Add check for unsupported fields in match param
When a matcher is being built, we "consume" (clear) mask fields one by one,
and to verify that we do support all the required fields we check if the
whole mask was consumed, else the matching request includes unsupported
fields.

Signed-off-by: Muhammad Sammar <muhammads@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
2021-10-29 13:53:28 -07:00
Paul Blakey
504e157248 net/mlx5: Allow skipping counter refresh on creation
CT creates a counter for each CT rule, and for each such counter,
fs_counters tries to queue mlx5_fc_stats_work() work again via
mod_delayed_work(0) call to refresh all counters. This call has a
large performance impact when reaching high insertion rate and
accounts for ~8% of the insertion time when using software steering.

Allow skipping the refresh of all counters during counter creation.
Change CT to use this refresh skipping for it's counters.

Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-29 13:53:28 -07:00
Raed Salem
428ffea071 net/mlx5e: IPsec: Refactor checksum code in tx data path
Part of code that is related solely to IPsec is always compiled in the
driver code regardless if the IPsec functionality is enabled or disabled
in the driver code, this will add unnecessary branch in case IPsec is
disabled at Tx data path.

Move IPsec related code to IPsec related file such that in case of IPsec
is disabled and because of unlikely macro the compiler should be able to
optimize and omit the checksum IPsec code all together from Tx data path

Signed-off-by: Raed Salem <raeds@nvidia.com>
Reviewed-by: Emeel Hakim <ehakim@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-29 13:53:28 -07:00
Paul Blakey
ae2ee3be99 net/mlx5: CT: Remove warning of ignore_flow_level support for VFs
ignore_flow_level isn't supported for VFs, and so it causes
post_act and ct to warn about it.

Instead of disabling CT for VFs, and a driver update will be need
to enable CT again once firmware support this, remove this warning
specifically for VFs. This way, it could be automatically enabled on
future firmwares where VFs support ignore_flow_level capability.

Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-29 13:53:27 -07:00
Nathan Chancellor
1aec85974a net/mlx5: Add esw assignment back in mlx5e_tc_sample_unoffload()
Clang warns:

drivers/net/ethernet/mellanox/mlx5/core/en/tc/sample.c:635:34: error: variable 'esw' is uninitialized when used here [-Werror,-Wuninitialized]
        mlx5_eswitch_del_offloaded_rule(esw, sample_flow->pre_rule, sample_flow->pre_attr);
                                        ^~~
drivers/net/ethernet/mellanox/mlx5/core/en/tc/sample.c:626:26: note: initialize the variable 'esw' to silence this warning
        struct mlx5_eswitch *esw;
                                ^
                                 = NULL
1 error generated.

It appears that the assignment should have been shuffled instead of
removed outright like in mlx5e_tc_sample_offload(). Add it back so there
is no use of esw uninitialized.

Fixes: a64c5edbd2 ("net/mlx5: Remove unnecessary checks for slow path flag")
Link: https://github.com/ClangBuiltLinux/linux/issues/1494
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-29 13:53:27 -07:00
Przemyslaw Patynowski
605ca7c5c6 iavf: Fix kernel BUG in free_msi_irqs
Fix driver not freeing VF's traffic irqs, prior to calling
pci_disable_msix in iavf_remove.
There were possible 2 erroneous states in which, iavf_close would
not be called.
One erroneous state is fixed by allowing netdev to register, when state
is already running. It was possible for VF adapter to enter state loop
from running to resetting, where iavf_open would subsequently fail.
If user would then unload driver/remove VF pci, iavf_close would not be
called, as the netdev was not registered, leaving traffic pcis still
allocated.
Fixed this by breaking loop, allowing netdev to open device when adapter
state is __IAVF_RUNNING and it is not explicitily downed.
Other possiblity is entering to iavf_remove from __IAVF_RESETTING state,
where iavf_close would not free irqs, but just return 0.
Fixed this by checking for last adapter state and then removing irqs.

Kernel panic:
[ 2773.628585] kernel BUG at drivers/pci/msi.c:375!
...
[ 2773.631567] RIP: 0010:free_msi_irqs+0x180/0x1b0
...
[ 2773.640939] Call Trace:
[ 2773.641572]  pci_disable_msix+0xf7/0x120
[ 2773.642224]  iavf_reset_interrupt_capability.part.41+0x15/0x30 [iavf]
[ 2773.642897]  iavf_remove+0x12e/0x500 [iavf]
[ 2773.643578]  pci_device_remove+0x3b/0xc0
[ 2773.644266]  device_release_driver_internal+0x103/0x1f0
[ 2773.644948]  pci_stop_bus_device+0x69/0x90
[ 2773.645576]  pci_stop_and_remove_bus_device+0xe/0x20
[ 2773.646215]  pci_iov_remove_virtfn+0xba/0x120
[ 2773.646862]  sriov_disable+0x2f/0xe0
[ 2773.647531]  ice_free_vfs+0x2f8/0x350 [ice]
[ 2773.648207]  ice_sriov_configure+0x94/0x960 [ice]
[ 2773.648883]  ? _kstrtoull+0x3b/0x90
[ 2773.649560]  sriov_numvfs_store+0x10a/0x190
[ 2773.650249]  kernfs_fop_write+0x116/0x190
[ 2773.650948]  vfs_write+0xa5/0x1a0
[ 2773.651651]  ksys_write+0x4f/0xb0
[ 2773.652358]  do_syscall_64+0x5b/0x1a0
[ 2773.653075]  entry_SYSCALL_64_after_hwframe+0x65/0xca

Fixes: 22ead37f8a ("i40evf: Add longer wait after remove module")
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-29 13:11:53 -07:00
Karen Sornek
247aa001b7 iavf: Add helper function to go from pci_dev to adapter
Add helper function to go from pci_dev to adapter to make work simple -
to go from a pci_dev to the adapter structure and make netdev assignment
instead of having to go to the net_device then the adapter.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Karen Sornek <karen.sornek@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-29 13:11:53 -07:00
Marcin Szycik
bfaaba99e6 ice: Hide bus-info in ethtool for PRs in switchdev mode
Disable showing bus-info information for port representors in switchdev
mode. This fixes a bug that caused displaying wrong netdev descriptions in
lshw tool - one port representor displayed PF branding string, and in turn
one PF displayed a "generic" description. The bug occurs when many devices
show the same bus-info in ethtool, which was the case in switchdev mode (PF
and its port representors displayed the same bus-info). The bug occurs only
if a port representor netdev appears before PF netdev in /proc/net/dev.

In the examples below:
ens6fX is PF
ens6fXvY is VF
ethX is port representor
One irrelevant column was removed from output

Before:
$ sudo lshw -c net -businfo
Bus info          Device      Description
=========================================
pci@0000:02:00.0  eth102       Ethernet Controller E810-XXV for SFP
pci@0000:02:00.1  ens6f1       Ethernet Controller E810-XXV for SFP
pci@0000:02:01.0  ens6f0v0     Ethernet Adaptive Virtual Function
pci@0000:02:01.1  ens6f0v1     Ethernet Adaptive Virtual Function
pci@0000:02:01.2  ens6f0v2     Ethernet Adaptive Virtual Function
pci@0000:02:00.0  ens6f0       Ethernet interface

Notice that eth102 and ens6f0 have the same bus-info and their descriptions
are swapped.

After:
$ sudo lshw -c net -businfo
Bus info          Device      Description
=========================================
pci@0000:02:00.0  ens6f0      Ethernet Controller E810-XXV for SFP
pci@0000:02:00.1  ens6f1      Ethernet Controller E810-XXV for SFP
pci@0000:02:01.0  ens6f0v0    Ethernet Adaptive Virtual Function
pci@0000:02:01.1  ens6f0v1    Ethernet Adaptive Virtual Function
pci@0000:02:01.2  ens6f0v2    Ethernet Adaptive Virtual Function

Fixes: 7aae80cef7 ("ice: add port representor ethtool ops and stats")
Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-29 11:43:15 -07:00
Marcin Szycik
c79bb28e19 ice: Clear synchronized addrs when adding VFs in switchdev mode
When spawning VFs in switchdev mode, internal filter list of VSIs is
cleared, which includes MAC rules. However MAC entries stay on netdev's
multicast list, which causes error message when bringing link up after
spawning VFs ("Failed to delete MAC filters"). __dev_mc_sync() is
called and tries to unsync addresses that were already removed
internally when adding VFs.

This can be reproduced with:
1) Load ice driver
2) Change PF to switchdev mode
3) Bring PF link up
4) Bring PF link down
5) Create a VF on PF
6) Bring PF link up

Added clearing of netdev's multicast (and also unicast) list when
spawning VFs in switchdev mode, so the state of internal rule list and
netdev's MAC list is consistent.

Fixes: 1a1c40df2e ("ice: set and release switchdev environment")
Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-29 10:56:24 -07:00
Brett Creeley
29e71f41e7 ice: Remove boolean vlan_promisc flag from function
Currently, the vlan_promisc flag is used exclusively by VF VSI to
determine whether or not to toggle VLAN pruning along with
trusted/true-promiscuous mode. This is not needed for a couple of
reasons. First, trusted/true-promiscuous mode is only supposed to allow
all MAC filters within VLANs that a VF has added filters for, so VLAN
pruning should not be disabled. Second, the boolean argument makes the
function confusing and unintuitive. Remove this flag.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-29 10:48:16 -07:00
Sasha Neftin
e377a063e2 igc: Change Device Reset to Port Reset
The _reset_hw_base method switched from port reset (CTRL[26]) to device
reset (CTRL[29]) since the FW was receiving an interrupt on CTRL[29].
FW code was later modified to also receive an interrupt on CTRL[26].
Since certain HW values are not reset to default by CTRL[29], we go back
to CTRL[26] for the HW reset, as it meets all current requirements.

This reverts commit bb4265ec24 ("igc: Update the MAC reset flow").

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-29 10:36:58 -07:00
Sasha Neftin
8f20571db5 igc: Add new device ID
Add new device ID for the next step of the silicon and
reflect the I226_LMVP part.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-29 09:51:26 -07:00
Sasha Neftin
8643d0b6b3 igc: Remove media type checking on the PHY initialization
i225 devices only have copper phy media type. There is no point in
checking phy media type during the phy initialization. This patch cleans
up a pointless check.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Dvora Fuxbrumer <dvorax.fuxbrumer@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-29 09:51:21 -07:00
Colin Ian King
1b9abade3e net: ixgbevf: Remove redundant initialization of variable ret_val
The variable ret_val is being initialized with a value that is never
read, it is being updated later on. The assignment is redundant and
can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-29 09:42:59 -07:00
Jan Kundrát
a97f8783a9 igb: unbreak I2C bit-banging on i350
The driver tried to use Linux' native software I2C bus master
(i2c-algo-bits) for exporting the I2C interface that talks to the SFP
cage(s) towards userspace. As-is, however, the physical SCL/SDA pins
were not moving at all, staying at logical 1 all the time.

The main culprit was the I2CPARAMS register where igb was not setting
the I2CBB_EN bit. That meant that all the careful signal bit-banging was
actually not being propagated to the chip pads (I verified this with a
scope).

The bit-banging was not correct either, because I2C is supposed to be an
open-collector bus, and the code was driving both lines via a totem
pole. The code was also trying to do operations which did not make any
sense with the i2c-algo-bits, namely manipulating both SDA and SCL from
igb_set_i2c_data (which is only supposed to set SDA). I'm not sure if
that was meant as an optimization, or was just flat out wrong, but given
that the i2c-algo-bits is set up to work with a totally dumb GPIO-ish
implementation underneath, there's no need for this code to be smart.

The open-drain vs. totem-pole is fixed by the usual trick where the
logical zero is implemented via regular output mode and outputting a
logical 0, and the logical high is implemented via the IO pad configured
as an input (thus floating), and letting the mandatory pull-up resistors
do the rest. Anything else is actually wrong on I2C where all devices
are supposed to have open-drain connection to the bus.

The missing I2CBB_EN is set (along with a safe initial value of the
GPIOs) just before registering this software I2C bus.

The chip datasheet mentions HW-implemented I2C transactions (SFP EEPROM
reads and writes) as well, but I'm not touching these for simplicity.

Tested on a LR-Link LRES2203PF-2SFP (which is an almost-miniPCIe form
factor card, a cable, and a module with two SFP cages). There was one
casualty, an old broken SFP we had laying around, which was used to
solder some thin wires as a DIY I2C breakout. Thanks for your service.
With this patch in place, I can `i2cdump -y 3 0x51 c` and read back data
which make sense. Yay.

Signed-off-by: Jan Kundrát <jan.kundrat@cesnet.cz>
See-also: https://www.spinics.net/lists/netdev/msg490554.html
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-29 09:42:59 -07:00
Yang Li
3c6f3ae3bb intel: Simplify bool conversion
Fix the following coccicheck warning:
./drivers/net/ethernet/intel/i40e/i40e_xsk.c:229:35-40: WARNING:
conversion to bool not needed here
./drivers/net/ethernet/intel/ice/ice_xsk.c:399:35-40: WARNING:
conversion to bool not needed here

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-29 09:42:33 -07:00
Jakub Kicinski
28131d896d wireless-drivers-next patches for v5.16
Fourth set of patches for v5.16. Mostly fixes this time, wcn36xx and
 iwlwifi have some new features but nothing really out of ordinary. We
 have one conflict with kspp tree.
 
 Conflicts:
 
 kspp tree has a conflict in drivers/net/wireless/intel/iwlwifi/fw/api/tx.h:
 
 https://lkml.kernel.org/r/20211028192934.01520d7e@canb.auug.org.au
 
 Major changes:
 
 ath11k
 
 * fix QCA6390 A-MSDU handling (CVE-2020-24588)
 
 wcn36xx
 
 * enable hardware scan offload for 5Ghz band
 
 * add missing 5GHz channels 136 and 144
 
 iwlwifi
 
 * support a new ACPI table revision
 
 * improvements in the device selection code
 
 * new hardware support
 
 * support for WiFi 6E enablement via BIOS
 
 * support firmware API version 67
 
 * support for 160MHz in ranging measurements
 -----BEGIN PGP SIGNATURE-----
 
 iQFJBAABCgAzFiEEiBjanGPFTz4PRfLobhckVSbrbZsFAmF7+wUVHGt2YWxvQGNv
 ZGVhdXJvcmEub3JnAAoJEG4XJFUm622bRGoH/0XrfEwzH4iR3j/xRPTBMJjBZO/Z
 ZN0PQ2L8402c7iG9M0psSFOdqFe8vtKzuuV367ifQzwmGFoQIzAckL9nCA1yFUBg
 EXF3nP0/mb8R0w6rzjkLUQ9EejLLeX35Kh6B1oLgglPdLvE+Yv6/Iqs1yOHwHZFF
 RBHqyBc9YcQBuhah4JPqIXr8tUCkRgWwK/VvCwhoGDWimBlH7LhgOHTFQhVJ+Z8b
 /RsLS2iIv4wGkiQX9cwilmX3QSz/jK0od1FXHW347v+nm96Fs3d2F1juAyhE6io2
 xN1qzEU4SQ90byxhCsBOvZC6JTpRJvw49XME76MrzaLP+FuZ8I2zdAe7wd8=
 =Y1xX
 -----END PGP SIGNATURE-----

Merge tag 'wireless-drivers-next-2021-10-29' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next

Kalle Valo says:

====================
wireless-drivers-next patches for v5.16

Fourth set of patches for v5.16. Mostly fixes this time, wcn36xx and
iwlwifi have some new features but nothing really out of ordinary.
We have one conflict with kspp tree.

Major changes:

ath11k
 * fix QCA6390 A-MSDU handling (CVE-2020-24588)

wcn36xx
 * enable hardware scan offload for 5Ghz band
 * add missing 5GHz channels 136 and 144

iwlwifi
 * support a new ACPI table revision
 * improvements in the device selection code
 * new hardware support
 * support for WiFi 6E enablement via BIOS
 * support firmware API version 67
 * support for 160MHz in ranging measurements

====================

Link: https://lore.kernel.org/r/20211029134707.DE2B0C4360D@smtp.codeaurora.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-29 08:58:40 -07:00
Arnd Bergmann
7444d706be ifb: fix building without CONFIG_NET_CLS_ACT
The driver no longer depends on this option, but it fails to
build if it's disabled because the skb->tc_skip_classify is
hidden behind an #ifdef:

drivers/net/ifb.c:81:8: error: no member named 'tc_skip_classify' in 'struct sk_buff'
                skb->tc_skip_classify = 1;

Use the same #ifdef around the assignment.

Fixes: 046178e726 ("ifb: Depend on netfilter alternatively to tc")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-29 14:01:11 +01:00
Volodymyr Mytnyk
bb5dbf2cc6 net: marvell: prestera: add firmware v4.0 support
Add firmware (FW) version 4.0 support for Marvell Prestera
driver.

Major changes have been made to new v4.0 FW ABI to add support
of new features, introduce the stability of the FW ABI and ensure
better forward compatibility for the future driver vesrions.

Current v4.0 FW feature set support does not expect any changes
to ABI, as it was defined and tested through long period of time.
The ABI may be extended in case of new features, but it will not
break the backward compatibility.

ABI major changes done in v4.0:
- L1 ABI, where MAC and PHY API configuration are split.
- ACL has been split to low-level TCAM and Counters ABI
  to provide more HW ACL capabilities for future driver
  versions.

To support backward support, the addition compatibility layer is
required in the driver which will have two different codebase under
"if FW-VER elif FW-VER else" conditions that will be removed
in the future anyway, So, the idea was to break backward support
and focus on more stable FW instead of supporting old version
with very minimal and limited set of features/capabilities.

Improve FW msg validation:
 * Use __le64, __le32, __le16 types in msg to/from FW to
   catch endian mismatch by sparse.
 * Use BUILD_BUG_ON for structures sent/recv to/from FW.

Co-developed-by: Vadym Kochan <vkochan@marvell.com>
Signed-off-by: Vadym Kochan <vkochan@marvell.com>
Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu>
Signed-off-by: Taras Chornyi <tchornyi@marvell.com>
Signed-off-by: Volodymyr Mytnyk <vmytnyk@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-29 13:55:29 +01:00
Jean Sacren
5bd663212f net: bareudp: fix duplicate checks of data[] expressions
Both !data[IFLA_BAREUDP_PORT] and !data[IFLA_BAREUDP_ETHERTYPE] are
checked.  We should remove the checks of data[IFLA_BAREUDP_PORT] and
data[IFLA_BAREUDP_ETHERTYPE] that follow since they are always true.

Put both statements together in group and balance the space on both
sides of '=' sign.

Signed-off-by: Jean Sacren <sakiwit@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-29 13:41:28 +01:00
Jean Sacren
c4cb8d0ac7 net: netxen: fix code indentation
Remove additional character in the source to properly indent if branch.

Signed-off-by: Jean Sacren <sakiwit@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-29 13:34:42 +01:00
Yuiko Oshino
a1f1627540 net: ethernet: microchip: lan743x: Increase rx ring size to improve rx performance
Increase the rx ring size (LAN743X_RX_RING_SIZE) to improve rx performance on some platforms.
Tested on x86 PC with EVB-LAN7430.
The iperf3.7 TCPIP improved from 881 Mbps to 922 Mbps, and UDP improved from 817 Mbps to 936 Mbps.

Signed-off-by: Yuiko Oshino <yuiko.oshino@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-29 13:30:20 +01:00
David S. Miller
704bc986ff Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
100GbE Intel Wired LAN Driver Updates 2021-10-28

This series contains updates to ice driver only.

Michal adds support for eswitch drop and redirect filters from and to
tunnel devices. From meaning from uplink to VF and to means from VF to
uplink. This is accomplished by adding support for indirect TC tunnel
notifications and adding appropriate training packets and match fields
for UDP tunnel headers. He also adds returning virtchannel responses for
blocked operations as returning a response is still needed.

Marcin sets netdev min and max MTU values on port representors to allow
for MTU changes over default values.

Brett adds detecting and reporting of PHY firmware load issues for devices
which support this.

Nathan Chancellor fixes a clang warning for implicit fallthrough.

Wang Hai fixes a return value for failed allocation.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-29 12:28:11 +01:00