linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-11-26 05:34:13 +08:00

Author	SHA1	Message	Date
Jakub Kicinski	be7bbea114	net/tls: use the full sk_proto pointer Since we already have the pointer to the full original sk_proto stored use that instead of storing all individual callback pointers as well. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-05 09:49:49 +02:00
Dave Taht	842841ece5	Convert usage of IN_MULTICAST to ipv4_is_multicast IN_MULTICAST's primary intent is as a uapi macro. Elsewhere in the kernel we use ipv4_is_multicast consistently. This patch unifies linux's multicast checks to use that function rather than this macro. Signed-off-by: Dave Taht <dave.taht@gmail.com> Reviewed-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-05 09:38:32 +02:00
Colin Ian King	9367fa0841	net/sched: cbs: remove redundant assignment to variable port_rate Variable port_rate is being initialized with a value that is never read and is being re-assigned a little later on. The assignment is redundant and hence can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-05 09:37:02 +02:00
David S. Miller	e7ac4ea0fe	Merge branch 'ionic-Add-ionic-driver' Shannon Nelson says: ==================== ionic: Add ionic driver This is a patch series that adds the ionic driver, supporting the Pensando ethernet device. In this initial patchset we implement basic transmit and receive. Later patchsets will add more advanced features. Our thanks to Saeed Mahameed, David Miller, Andrew Lunn, Michal Kubecek, Jacub Kicinski, Jiri Pirko, Yunsheng Lin, and the ever present kbuild test robots for their comments and suggestions. New in v7: - stop Tx queue if no descriptor space left after a Tx - return ETIMEDOUT if the module data can't be copied out safely - remove unnecessary synchronize_irq() before free_irq() - use eth_prepare_mac_addr_change() and eth_commit_mac_addr_change() helpers - propagate error out of ionic_dl_info_get() New in v6: - added a new patch with devlink info tags for ASIC and general FW - use the new devlink info tags in the driver - fixed up TxRx cleanup on setup failure - allow for possible 0 address from dma mapping of Tx buffers - remove a few more unnecessary debugfs error checks - use innocuous hardcoded strings in the identify message - removed a couple of unused functions and definitions - fix a leak in the error handling of port_info setup - changed from BUILD_BUG_ON() to static_assert() New in v5: - code reorganized for more sane layout, with a side benefit of getting rid of a "defined but not used" complaint after patch 5 - added "ionic_" prefix to struct definitions and fixed up remaining reverse christmas tree formatting (I think I got them all...) - ndo_open and ndo_stop reworked for better error recovery - interrupt coalescing enabled at driver start - unnecessary log messaging removed from events - double copy added in the module prom read to assure a clean copy - added BQL counting - fixed a TSO unmap issue found in testing - generalize a bit-flag wait with timeout - added devlink into earlier code and dropped patch 19 New in v4: - use devlink struct alloc for ionic device specific struct - add support for devlink_port - fixup devlink fixed vs running version usage - use bitmap_copy() instead of memcpy() for link_ksettings - don't bother to zero out the advertising bits before copying in the support bits - drop unknown xcvr types (will be expanded on later) - flap the connection to force auto-negotiation - use is_power_of_2() rather than open code - simplify set/get_pauseparam use of pause->autoneg - add a couple comments about NIC status data updated in DMA spaces New in v3: - use le32_to_cpu() on queue_count[] values in debugfs - dma_free_coherent() can handle NULL pointers - remove unused SS_TEST from ethtool handlers - one more case of stop the tx ring if there is no room - remove a couple of stray // comments New in v2: - removed debugfs error checking and cut down on debugfs use - remove redundant bounds checking on incoming values for mtu and ethtool - don't alloc rx_filter memory until the match type has been checked - free the ionic struct on remove - simplified link_up and netif_carrier_ok comparison - put stats into ethtool -S, out of debugfs - moved dev_cmd and dev_info dumping to ethtool -d, out of debugfs - added devlink support - used kernel's rss init routines rather than open code - set the Kbuild dependant on 64BIT - cut down on some unnecessary log messaging - cleaned up ionic_get_link_ksettings - cleaned up other little code bits here and there ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-05 09:24:44 +02:00
Shannon Nelson	8c15440bce	ionic: Add coalesce and other features Interrupt coalescing, tunable copybreak value, and tx timeout. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-05 09:24:44 +02:00
Shannon Nelson	aa3198819b	ionic: Add RSS support Add code to manipulate through ethtool the RSS configuration used by the NIC. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-05 09:24:44 +02:00
Shannon Nelson	e470355bd9	ionic: Add driver stats Add in the detailed statistics for ethtool -S that the driver keeps as it processes packets. Display of the additional debug statistics can be enabled through the ethtool priv-flags feature. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-05 09:24:44 +02:00
Shannon Nelson	1a371ea1b7	ionic: Add netdev-event handling When the netdev gets a new name from userland, pass that name down to the NIC for internal tracking. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-05 09:24:44 +02:00
Shannon Nelson	0f3154e6bc	ionic: Add Tx and Rx handling Add both the Tx and Rx queue setup and handling. The related stats display comes later. Instead of using the generic napi routines used by the slow-path commands, the Tx and Rx paths are simplified and inlined in one file in order to get better compiler optimizations. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-05 09:24:44 +02:00
Shannon Nelson	4d03e00a21	ionic: Add initial ethtool support Add in the basic ethtool callbacks for device information and control. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-05 09:24:44 +02:00
Shannon Nelson	8d61aad4e8	ionic: Add async link status check and basic stats Add code to handle the link status event, and wire up the basic netdev hardware stats. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-05 09:24:44 +02:00
Shannon Nelson	2a654540be	ionic: Add Rx filter and rx_mode ndo support Add the Rx filtering and rx_mode NDO callbacks. Also add the deferred work thread handling needed to manage the filter requests outside of the netif_addr_lock spinlock. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-05 09:24:44 +02:00
Shannon Nelson	c1e329ebec	ionic: Add management of rx filters Set up the infrastructure for managing Rx filters. We can't ask the hardware for what filters it has, so we keep a local list of filters that we've pushed into the HW. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-05 09:24:43 +02:00
Shannon Nelson	beead698b1	ionic: Add the basic NDO callbacks for netdev support Set up the initial NDO structure and callbacks for netdev to use, and register the netdev. This will allow us to do a few basic operations on the device, but no traffic yet. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-05 09:24:43 +02:00
Shannon Nelson	77ceb68e29	ionic: Add notifyq support The AdminQ is fine for sending messages and requests to the NIC, but we also need to have events published from the NIC to the driver. The NotifyQ handles this for us, using the same interrupt as AdminQ. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-05 09:24:43 +02:00
Shannon Nelson	938962d552	ionic: Add adminq action Add AdminQ specific message requests and completion handling. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-05 09:24:43 +02:00
Shannon Nelson	1d062b7b6f	ionic: Add basic adminq support Most of the NIC configuration happens through the AdminQ message queue. NAPI is used for basic interrupt handling and message queue management. These routines are set up to be shared among different types of queues when used in slow-path handling. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-05 09:24:43 +02:00
Shannon Nelson	6461b446f2	ionic: Add interrupts and doorbells The ionic interrupt model is based on interrupt control blocks accessed through the PCI BAR. Doorbell registers are used by the driver to signal to the NIC that requests are waiting on the message queues. Interrupts are used by the NIC to signal to the driver that answers are waiting on the completion queues. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-05 09:24:43 +02:00
Shannon Nelson	1a58e19646	ionic: Add basic lif support The LIF is the Logical Interface, which represents the external connections. The NIC can multiplex many LIFs to a single port, but in most setups, LIF0 is the primary control for the port. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-05 09:24:43 +02:00
Shannon Nelson	04436595c4	ionic: Add port management commands The port management commands apply to the physical port associated with the PCI device, which might be shared among several logical interfaces. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-05 09:24:43 +02:00
Shannon Nelson	fbfb803153	ionic: Add hardware init and device commands The ionic device has a small set of PCI registers, including a device control and data space, and a large set of message commands. Also adds new DEVLINK_INFO_VERSION_GENERIC tags for ASIC_ID, ASIC_REV, and FW. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-05 09:24:43 +02:00
Shannon Nelson	df69ba4321	ionic: Add basic framework for IONIC Network device driver This patch adds a basic driver framework for the Pensando IONIC network device. There is no functionality right now other than the ability to load and unload. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-05 09:24:43 +02:00
Shannon Nelson	7d5aa9a524	devlink: Add new info version tags for ASIC and FW The current tag set is still rather small and needs a couple more tags to help with ASIC identification and to have a more generic FW version. Cc: Jiri Pirko <jiri@resnulli.us> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-05 09:24:43 +02:00
David S. Miller	0d622143d1	Merge branch 'net-dsa-mt7530-PHYLINK-and-port-5' René van Dorst says: ==================== net: dsa: mt7530: Convert to PHYLINK and add support for port 5 1. net: dsa: mt7530: Convert to PHYLINK API This patch converts mt7530 to PHYLINK API. 2. dt-bindings: net: dsa: mt7530: Add support for port 5 3. net: dsa: mt7530: Add support for port 5 These 2 patches adding support for port 5 of the switch. v2->v3: * Removed 'status = "okay"' lines in patch #2 * Change a port 5 setup message in a debug message in patch #3 * Added ack-by and tested-by tags v1->v2: * Mostly phylink improvements after review. rfc -> v1: * Mostly phylink improvements after review. * Drop phy isolation patches. Adds no value for now. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-05 00:28:23 +02:00
René van Dorst	38f790a805	net: dsa: mt7530: Add support for port 5 Adding support for port 5. Port 5 can muxed/interface to: - internal 5th GMAC of the switch; can be used as 2nd CPU port or as extra port with an external phy for a 6th ethernet port. - internal PHY of port 0 or 4; Used in most applications so that port 0 or 4 is the WAN port and interfaces with the 2nd GMAC of the SOC. Signed-off-by: René van Dorst <opensource@vdorst.com> Tested-by: Frank Wunderlich <frank-w@public-files.de> Acked-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-05 00:28:23 +02:00
René van Dorst	4f358cbd05	dt-bindings: net: dsa: mt7530: Add support for port 5 MT7530 port 5 has many modes/configurations. Update the documentation how to use port 5. Signed-off-by: René van Dorst <opensource@vdorst.com> Cc: devicetree@vger.kernel.org Cc: Rob Herring <robh@kernel.org> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-05 00:28:23 +02:00
René van Dorst	ca366d6c88	net: dsa: mt7530: Convert to PHYLINK API Convert mt7530 to PHYLINK API Signed-off-by: René van Dorst <opensource@vdorst.com> Tested-by: Frank Wunderlich <frank-w@public-files.de> Acked-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-05 00:28:23 +02:00
Hayes Wang	771efeda39	r8152: modify rtl8152_set_speed function First, for AUTONEG_DISABLE, we only need to modify MII_BMCR. Second, add advertising parameter for rtl8152_set_speed(). Add RTL_ADVERTISED_xxx for advertising parameter of rtl8152_set_speed(). Then, the advertising settings from ethtool could be saved. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-05 00:26:24 +02:00
David S. Miller	472e12e7ff	Merge branch 'dpaa2-eth-Add-new-statistics-counters' Ioana Radulescu says: ==================== dpaa2-eth: Add new statistics counters Recent firmware versions offer access to more DPNI statistics counters. Add the relevant ones to ethtool interface stats. Also we can now make use of a new counter for in flight egress frames to avoid sleeping an arbitrary amount of time in the ndo_stop routine. v2: in patch 2/3, treat separately the error case for unsupported statistics pages ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-05 00:24:06 +02:00
Ioana Radulescu	52b6a4ffe2	dpaa2-eth: Poll Tx pending frames counter on if down Starting with firmware version MC10.18.0, a new counter for in flight Tx frames is offered. Use it when bringing down the interface to determine when all pending Tx frames have been processed by hardware instead of sleeping a fixed amount of time. Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-05 00:24:06 +02:00
Ioana Radulescu	d84c3a4ded	dpaa2-eth: Add new DPNI statistics counters Recent firmware versions expose more DPNI counters. Export relevant ones via ethtool -S. Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-05 00:24:06 +02:00
Ioana Radulescu	ae90a6f0d9	dpaa2-eth: Minor refactoring in ethtool stats As we prepare to read more pages from the DPNI stat counters, reorganize the code a bit to make it easier to extend. Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-05 00:24:06 +02:00
David S. Miller	2c1f9e2634	Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 100GbE Intel Wired LAN Driver Updates 2019-09-03 This series contains updates to ice driver only. Anirudh adds the ability for the driver to handle EMP resets correctly by adding the logic to the existing ice_reset_subtask(). Jeb fixes up the logic to properly free up the resources for a switch rule whether or not it was successful in the removal. Brett fixes up the reporting of ITR values to let the user know odd ITR values are not allowed. Fixes the driver to only disable VLAN pruning on VLAN deletion when the VLAN being deleted is the last VLAN on the VF VSI. Chinh updates the driver to determine the TSA value from the priority value when in CEE mode. Bruce aligns the driver with the hardware specification by ensuring that a PF reset is done as part of the unload logic. Also update the driver unloading field, based on the latest hardware specification, which allows us to remove an unnecessary endian conversion. Moves #defines based on their need in the code. Jesse adds the current state of auto-negotiation in the link up message. In addition, adds additional information to inform the user of an issue with the topology/configuration of the link. Usha updates the driver to allow the maximum TCs that the firmware supports, rather than hard coding to a set value. Dave updates the DCB initialization flow to handle the case of an actual error during DCB init. Updated the driver to report the current stats, even when the netdev is down, which aligns with our other drivers. Mitch fixes the VF reset code flows to ensure that it properly calls ice_dis_vsi_txq() to notify the firmware that the VF is being reset. Michal fixes the driver so the DCB is not enabled when the SW LLDP is activated, which was causing a communication issue with other NICs. The problem lies in that DCB was being enabled without checking the number of TCs. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-03 21:51:25 -07:00
David S. Miller	94810bd365	mlx5-updates-2019-09-01 (Software steering support) Abstract: -------- Mellanox ConnetX devices supports packet matching, packet modification and redirection. These functionalities are also referred to as flow-steering. To configure a steering rule, the rule is written to the device owned memory, this memory is accessed and cached by the device when processing a packet. Steering rules are constructed from multiple steering entries (STE). Rules are configured using the Firmware command interface. The Firmware processes the given driver command and translates them to STEs, then writes them to the device memory in the current steering tables. This process is slow due to the architecture of the command interface and the processing complexity of each rule. The highlight of this patchset is to cut the middle man (The firmware) and do steering rules programming into device directly from the driver, with no firmware intervention whatsoever. Motivation: ----------- Software (driver managed) steering allows for high rule insertion rates compared to the FW steering described above, this is achieved by using internal RDMA writes to the device owned memory instead of the slow command interface to program steering rules. Software (driver managed) steering, doesn't depend on new FW for new steering functionality, new implementations can be done in the driver skipping the FW layer. Performance: ------------ The insertion rate on a single core using the new approach allows programming ~300K rules per sec. (Done via direct raw test to the new mlx5 sw steering layer, without any kernel layer involved). Test: TC L2 rules 33K/s with Software steering (this patchset). 5K/s with FW and current driver. This will improve OVS based solution performance. Architecture and implementation details: ---------------------------------------- Software steering will be dynamically selected via devlink device parameter. Example: $ devlink dev param show pci/0000:06:00.0 name flow_steering_mode pci/0000:06:00.0: name flow_steering_mode type driver-specific values: cmode runtime value smfs mlx5 software steering module a.k.a (DR - Direct Rule) is implemented and contained in mlx5/core/steering directory and controlled by MLX5_SW_STEERING kconfig flag. mlx5 core steering layer (fs_core) already provides a shim layer for implementing different steering mechanisms, software steering will leverage that as seen at the end of this series. When Software Steering for a specific steering domain (NIC/RDMA/Vport/ESwitch, etc ..) is supported, it will cause rules targeting this domain to be created using SW steering instead of FW. The implementation includes: Domain - The steering domain is the object that all other object resides in. It holds the memory allocator, send engine, locks and other shared data needed by lower objects such as table, matcher, rule, action. Each domain can contain multiple tables. Domain is equivalent to namespaces e.g (NIC/RDMA/Vport/ESwitch, etc ..) as implemented currently in mlx5_core fs_core (flow steering core). Table - Table objects are used for holding multiple matchers, each table has a level used to prevent processing loops. Packets are being directed to this table once it is set as the root table, this is done by fs_core using a FW command. A packet is being processed inside the table matcher by matcher until a successful hit, otherwise the packet will perform the default action. Matcher - Matchers objects are used to specify the fields mask for matching when processing a packet. A matcher belongs to a table, each matcher can hold multiple rules, each rule with different matching values corresponding to the matcher mask. Each matcher has a priority used for rule processing order inside the table. Action - Action objects are created to specify different steering actions such as count, reformat (encapsulate, decapsulate, ...), modify header, forward to table and many other actions. When creating a rule a sequence of actions can be provided to be executed on a successful match. Rule - Rule objects are used to specify a specific match on packets as well as the actions that should be executed. A rule belongs to a matcher. STE - This layer is used to hold the specific STE format for the device and to convert the requested rule to STEs. Each rule is constructed of an STE chain, Multiple rules construct a steering graph. Each node in the graph is a hash table containing multiple STEs. The index of each STE in the hash table is being calculated using a CRC32 hash function. Memory pool - Used for managing and caching device owned memory for rule insertion. The memory is being allocated using DM (device memory) API. Communication with device - layer for standard RDMA operation using RC QP to configure the device steering. Command utility - This module holds all of the FW commands that are required for SW steering to function. Patch planning and files: ------------------------- 1) First patch, adds the support to Add flow steering actions to fs_cmd shim layer. 2) Next 12 patch will add a file per each Software steering functionality/module as described above. (See patches with title: DR, ) 3) Add CONFIG_MLX5_SW_STEERING for software steering support and enable build with the new files 4) Next two patches will add the support for software steering in mlx5 steering shim layer net/mlx5: Add API to set the namespace steering mode net/mlx5: Add direct rule fs_cmd implementation 5) Last two patches will add the new devlink parameter to select mlx5 steering mode, will be valid only for switchdev mode for now. Two modes are supported: 1. DMFS - Device managed flow steering 2. SMFS - Software/Driver managed flow steering. In the DMFS mode, the HW steering entities are created through the FW. In the SMFS mode this entities are created though the driver directly. The driver will use the devlink steering mode only if the steering domain supports it, for now SMFS will manages only the switchdev eswitch steering domain. User command examples: - Set SMFS flow steering mode:: $ devlink dev param set pci/0000:06:00.0 name flow_steering_mode value "smfs" cmode runtime - Read device flow steering mode:: $ devlink dev param show pci/0000:06:00.0 name flow_steering_mode pci/0000:06:00.0: name flow_steering_mode type driver-specific values: cmode runtime value smfs -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl1uxPAACgkQSD+KveBX +j5AkggAymoYqG2G+s8cLa4vQFySaD1Td3VzzWglp7PlpDBE3UcSoMAMg/gIDU1D 8F04PeCsJ6snt1ICk56vPNyAEHWfWeBUd56+QK5lEJBuwozyFvBh6HP81Bnr6T/n n6uTx45ljAFQPTHJjEOLBPSzEXecLu07+mvpzSoW0F3ehfGbELhL1IkVobr/RELx z4xZW9uM2vm5ylheWvjf4V1S/SvokgJazW9+4fh//rl8tfXgun5IfPoS0hqKie1/ h5sjcMSYkYR4gLVqrhKmBYHmHVl/h0TYROckW8iC/+XX7ailSo9uPG7lPa6cm+GE 7Bajlbz4oD/K5RWoByo+q+dmyjeVhQ== =M9bS -----END PGP SIGNATURE----- Merge tag 'mlx5-updates-2019-09-01-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2019-09-01 (Software steering support) Abstract: -------- Mellanox ConnetX devices supports packet matching, packet modification and redirection. These functionalities are also referred to as flow-steering. To configure a steering rule, the rule is written to the device owned memory, this memory is accessed and cached by the device when processing a packet. Steering rules are constructed from multiple steering entries (STE). Rules are configured using the Firmware command interface. The Firmware processes the given driver command and translates them to STEs, then writes them to the device memory in the current steering tables. This process is slow due to the architecture of the command interface and the processing complexity of each rule. The highlight of this patchset is to cut the middle man (The firmware) and do steering rules programming into device directly from the driver, with no firmware intervention whatsoever. Motivation: ----------- Software (driver managed) steering allows for high rule insertion rates compared to the FW steering described above, this is achieved by using internal RDMA writes to the device owned memory instead of the slow command interface to program steering rules. Software (driver managed) steering, doesn't depend on new FW for new steering functionality, new implementations can be done in the driver skipping the FW layer. Performance: ------------ The insertion rate on a single core using the new approach allows programming ~300K rules per sec. (Done via direct raw test to the new mlx5 sw steering layer, without any kernel layer involved). Test: TC L2 rules 33K/s with Software steering (this patchset). 5K/s with FW and current driver. This will improve OVS based solution performance. Architecture and implementation details: ---------------------------------------- Software steering will be dynamically selected via devlink device parameter. Example: $ devlink dev param show pci/0000:06:00.0 name flow_steering_mode pci/0000:06:00.0: name flow_steering_mode type driver-specific values: cmode runtime value smfs mlx5 software steering module a.k.a (DR - Direct Rule) is implemented and contained in mlx5/core/steering directory and controlled by MLX5_SW_STEERING kconfig flag. mlx5 core steering layer (fs_core) already provides a shim layer for implementing different steering mechanisms, software steering will leverage that as seen at the end of this series. When Software Steering for a specific steering domain (NIC/RDMA/Vport/ESwitch, etc ..) is supported, it will cause rules targeting this domain to be created using SW steering instead of FW. The implementation includes: Domain - The steering domain is the object that all other object resides in. It holds the memory allocator, send engine, locks and other shared data needed by lower objects such as table, matcher, rule, action. Each domain can contain multiple tables. Domain is equivalent to namespaces e.g (NIC/RDMA/Vport/ESwitch, etc ..) as implemented currently in mlx5_core fs_core (flow steering core). Table - Table objects are used for holding multiple matchers, each table has a level used to prevent processing loops. Packets are being directed to this table once it is set as the root table, this is done by fs_core using a FW command. A packet is being processed inside the table matcher by matcher until a successful hit, otherwise the packet will perform the default action. Matcher - Matchers objects are used to specify the fields mask for matching when processing a packet. A matcher belongs to a table, each matcher can hold multiple rules, each rule with different matching values corresponding to the matcher mask. Each matcher has a priority used for rule processing order inside the table. Action - Action objects are created to specify different steering actions such as count, reformat (encapsulate, decapsulate, ...), modify header, forward to table and many other actions. When creating a rule a sequence of actions can be provided to be executed on a successful match. Rule - Rule objects are used to specify a specific match on packets as well as the actions that should be executed. A rule belongs to a matcher. STE - This layer is used to hold the specific STE format for the device and to convert the requested rule to STEs. Each rule is constructed of an STE chain, Multiple rules construct a steering graph. Each node in the graph is a hash table containing multiple STEs. The index of each STE in the hash table is being calculated using a CRC32 hash function. Memory pool - Used for managing and caching device owned memory for rule insertion. The memory is being allocated using DM (device memory) API. Communication with device - layer for standard RDMA operation using RC QP to configure the device steering. Command utility - This module holds all of the FW commands that are required for SW steering to function. Patch planning and files: ------------------------- 1) First patch, adds the support to Add flow steering actions to fs_cmd shim layer. 2) Next 12 patch will add a file per each Software steering functionality/module as described above. (See patches with title: DR, ) 3) Add CONFIG_MLX5_SW_STEERING for software steering support and enable build with the new files 4) Next two patches will add the support for software steering in mlx5 steering shim layer net/mlx5: Add API to set the namespace steering mode net/mlx5: Add direct rule fs_cmd implementation 5) Last two patches will add the new devlink parameter to select mlx5 steering mode, will be valid only for switchdev mode for now. Two modes are supported: 1. DMFS - Device managed flow steering 2. SMFS - Software/Driver managed flow steering. In the DMFS mode, the HW steering entities are created through the FW. In the SMFS mode this entities are created though the driver directly. The driver will use the devlink steering mode only if the steering domain supports it, for now SMFS will manages only the switchdev eswitch steering domain. User command examples: - Set SMFS flow steering mode:: $ devlink dev param set pci/0000:06:00.0 name flow_steering_mode value "smfs" cmode runtime - Read device flow steering mode:: $ devlink dev param show pci/0000:06:00.0 name flow_steering_mode pci/0000:06:00.0: name flow_steering_mode type driver-specific values: cmode runtime value smfs ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-09-03 21:46:13 -07:00
Brett Creeley	cd186e5151	ice: Only disable VLAN pruning for the VF when all VLANs are removed Currently if the VF adds a VLAN, VLAN pruning will be enabled for that VSI. Also, when a VLAN gets deleted it will disable VLAN pruning even if other VLAN(s) exists for the VF. Fix this by only disabling VLAN pruning on the VF VSI when removing the last VF (i.e. vf->num_vlan == 0). Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-03 17:17:13 -07:00
Michal Swiatkowski	03bba02016	ice: Remove enable DCB when SW LLDP is activated Remove code that enables DCB in initialization when SW LLDP is activated. DCB flag is set or reset before in ice_init_pf_dcb based on number of TCs. So there is not need to overwrite it. Setting DCB without checking number of TCs can cause communication problems with other cards. Host card sends packet with VLAN priority tag, but client card doesn't strip this tag and ping doesn't work. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-03 17:14:37 -07:00
Dave Ertman	3d57fd10f2	ice: Report stats when VSI is down There is currently a check in get_ndo_stats that returns before updating stats if the VSI is down or there are no Tx or Rx queues. This causes the netdev to report zero stats with the netdev is down. Remove the check so that the behavior of reporting stats is the same as it was in IXGBE. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-03 17:07:50 -07:00
Mitch Williams	06914ac20a	ice: Always notify FW of VF reset The call to ice_dis_vsi_txq() acts as the notification to the firmware that the VF is being reset. Because of this, we need to make this call every time we reset, regardless of whatever else we do to stop the Tx queues. Without this change, VF resets would fail to complete on interfaces that were up and running. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-03 17:04:14 -07:00
Dave Ertman	473ca57488	ice: Correctly handle return values for init DCB In the init path for DCB, the call to ice_init_dcb() can return a non-zero value for either an actual error, or due to the FW lldp engine being stopped. We are currently treating all non-zero values only as an indication that the FW LLDP engine is stopped. Check for an actual error in the DCB init flow. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-03 17:02:23 -07:00
Usha Ketineni	a257f188b7	ice: Limit Max TCs on devices with more than 4 ports This patch limits the max TCs set by the driver to the value provided by the firmware as per the capabilities of the device. Otherwise, hard coding to 8 TC max would fail the device configurations with more than 4 ports. Signed-off-by: Usha Ketineni <usha.k.ketineni@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-03 16:35:58 -07:00
Tony Nguyen	6a025730e0	ice: Cleanup defines in ice_type.h Conventionally, if the #defines/other are not needed by other header files being included, #includes are done first followed by #defines and other stuff. Move the #defines before the #includes to follow this convention. Suggested by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-03 16:32:30 -07:00
Jesse Brandeburg	2e0ab37c04	ice: print extra message if topology issue The driver needs to inform the user if there is an issue with the topology / configuration of the link. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-03 16:27:45 -07:00
Jesse Brandeburg	432609887a	ice: add print of autoneg state to link message Print the state of auto-negotiation when printing the Link up message. Adds new text to the "NIC Link is up" line like Autoneg: <True \| False> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-03 16:25:34 -07:00
Bruce Allan	7404e84a23	ice: update driver unloading field for Queue Shutdown AQ command According to recent specification versions, the field in the Queue Shutdown AdminQ command consisting of the "driver unloading" indication is not a 4 byte field (it is byte.bit 16.0). Change it to a byte and remove the unnecessary endian conversion. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-03 16:23:35 -07:00
Bruce Allan	18057cb357	ice: add needed PFR during driver unload According to the specification, a PF Reset must be done as part of the driver unload flow. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-03 16:18:52 -07:00
Chinh T Cao	d24ef08a9d	ice: Deduce TSA value from the priority value in the CEE mode In CEE mode, the TSA information can be derived from the reported priority value. Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-03 16:16:36 -07:00
Brett Creeley	567af267fa	ice: Report what the user set for coalesce [tx\|rx]-usecs Currently if the user sets an odd value for [tx\|rx]-usecs we align the value because the hardware only understands ITR values in multiples of 2. This seems misleading because we are essentially telling the user that the ITR value is odd, when in fact we have changed it internally. Fix this by reporting that setting odd ITR values is not allowed. Also, while making changes to ice_set_rc_coalesce() I noticed a bit of code/error duplication. Make the necessary changes to remove the duplication. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-03 16:11:10 -07:00
Jeb Cramer	8132e17dfb	ice: Fix resource leak in ice_remove_rule_internal() We don't free s_rule if ice_aq_sw_rules() returns a non-zero status. If it returned a zero status, s_rule would be freed right after, so this implies it should be freed within the scope of the function regardless. Signed-off-by: Jeb Cramer <jeb.j.cramer@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-03 16:08:54 -07:00
Anirudh Venkataramanan	03af840650	ice: Fix EMP reset handling ice_reset_subtask needs to handle EMP resets as well, as EMP resets can be triggered by the firmware. This patch adds the logic to do this. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-03 13:47:12 -07:00
Maor Gottlieb	e890acd5ff	net/mlx5: Add devlink flow_steering_mode parameter Add new parameter (flow_steering_mode) to control the flow steering mode of the driver. Two modes are supported: 1. DMFS - Device managed flow steering 2. SMFS - Software/Driver managed flow steering. In the DMFS mode, the HW steering entities are created through the FW. In the SMFS mode this entities are created though the driver directly. The driver will use the devlink steering mode only if the steering domain supports it, for now SMFS will manages only the switchdev eswitch steering domain. User command examples: - Set SMFS flow steering mode:: $ devlink dev param set pci/0000:06:00.0 name flow_steering_mode value "smfs" cmode runtime - Read device flow steering mode:: $ devlink dev param show pci/0000:06:00.0 name flow_steering_mode pci/0000:06:00.0: name flow_steering_mode type driver-specific values: cmode runtime value smfs Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-09-03 12:54:24 -07:00

1 2 3 4 5 ...

858807 Commits