korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-12-24 03:24:55 +08:00

Author	SHA1	Message	Date
Tariq Toukan	f01cc58c18	net/mlx5e: Support multiple RSS contexts Add support to multiple RSS contexts. Resources of the non-default RSS contexts are allocated and created on demand. Each RSS context can be controlled and configured separately, via the implemented ethtool ops. Here we limit the num of total contexts to 16. We do not enforce any kind of new limitation over the indirection table content. More specifically, two separate contexts can be configured to fully or partially point to the same set of receive rings. The default RSS context (index 0) is created with its full set of TIRs. All other contexts are created with an empty set, then TIRs are added upon first usage when steering rules are added. We use a reference counting mechanism to make sure an RSS context is not removed before the rules pointing to it. Block ethtool set_channels operations when multiple RSS contexts exist, as currently the kernel doesn't protect against inconsistent channels configs that break non-default RSS contexts. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-08-16 16:17:28 -07:00
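The lifetime rules described in f01cc58c18 can be sketched as a small model. This is not the mlx5e code: the class and method names (`RssContextTable`, `add_rule`, and so on) are hypothetical. Only the limit of 16 contexts, the always-present default context at index 0, the reference counting against steering rules, and the blocking of set_channels while extra contexts exist come from the commit message.

```python
MAX_RSS_CONTEXTS = 16  # total limit stated in the commit message


class RssContextTable:
    """Toy model of reference-counted RSS contexts (hypothetical names)."""

    def __init__(self):
        # Index 0 is the default context and always exists.
        self.refcount = {0: 0}

    def create(self):
        """Allocate the lowest free non-default index, or fail at the limit."""
        for idx in range(1, MAX_RSS_CONTEXTS):
            if idx not in self.refcount:
                self.refcount[idx] = 0
                return idx
        raise RuntimeError("no free RSS context")

    def add_rule(self, idx):
        """A steering rule pointing at the context takes a reference."""
        self.refcount[idx] += 1

    def del_rule(self, idx):
        """Removing a steering rule drops its reference."""
        if self.refcount[idx] == 0:
            raise ValueError("no rule references this context")
        self.refcount[idx] -= 1

    def destroy(self, idx):
        """Refuse to remove the default context or one still in use."""
        if idx == 0:
            raise ValueError("default context cannot be removed")
        if self.refcount[idx] > 0:
            raise RuntimeError("context busy: rules still point to it")
        del self.refcount[idx]

    def can_set_channels(self):
        """set_channels is blocked while non-default contexts exist."""
        return len(self.refcount) == 1
```

In this model a context can only be destroyed after every rule that points to it has been deleted, which mirrors the ordering guarantee the commit message describes.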
Tariq Toukan	49095f641b	net/mlx5e: Dynamically allocate TIRs in RSS contexts Move from static to dynamic memory allocations for TIR. This is in preparation to supporting on-demand TIR operations in downstream patches, where every RSS context will be init with an empty set of TIRs. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-08-16 16:17:27 -07:00
Tariq Toukan	25307a91cb	net/mlx5e: Convert RSS to a dedicated object Code related to RSS is now encapsulated into a dedicated object and put into new files en/rss.{c,h}. All usages are converted. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-08-16 16:17:27 -07:00
Tariq Toukan	713ba5e5f6	net/mlx5e: Introduce abstraction of RSS context Bring all fields that define and maintain RSS behavior together into a new structure. Align all usages with this new structure. Keep it hidden within rx_res.c. This helps supporting multiple RSS contexts in downstream patch. Use dynamic allocations for the RSS context. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-08-16 16:17:27 -07:00
Tariq Toukan	fc651ff910	net/mlx5e: Introduce TIR create/destroy API in rx_res Take TIR control operations in rx_res into functions. This is in preparation to supporting on-demand TIR operations in downstream patches. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-08-16 16:17:26 -07:00
Tariq Toukan	6e5fea5196	net/mlx5e: Do not try enable RSS when resetting indir table All calls to mlx5e_rx_res_rss_set_indir_uniform() occur while the RSS state is inactive, i.e. the RQT is pointing to the drop RQ, not to the channels' RQs. It means that the "apply" part of the function is not called. Remove this part from the function, and document the change. It will be useful for next patches in the series, allows code simplifications when multiple RSS contexts are introduced. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-08-16 16:17:26 -07:00
Antoine Tenart	1b3f78df6a	bonding: improve nl error msg when device can't be enslaved because of IFF_MASTER Use a more user friendly netlink error message when a device can't be enslaved because it has IFF_MASTER, by not referring directly to a kernel internal flag. Signed-off-by: Antoine Tenart <atenart@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-16 14:03:30 +01:00
David S. Miller	ab6361382f	Merge branch 'bridge-mcast-fixes' Nikolay Aleksandrov says: ==================== net: bridge: mcast: fixes for mcast querier state These three fix querier state dumping. The first patch can be considered a minor behaviour improvement, it avoids dumping querier state when mcast snooping is disabled. The second patch was a report of sizeof(0) used for nested netlink attribute size which should be just 0, and the third patch accounts for IPv6 querier state size when allocating skb for notifications. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-16 13:58:00 +01:00
Nikolay Aleksandrov	175e669247	net: bridge: mcast: account for ipv6 size when dumping querier state We need to account for the IPv6 attributes when dumping querier state. Fixes: 5e924fe6ccfd ("net: bridge: mcast: dump ipv6 querier state") Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-16 13:58:00 +01:00
Nikolay Aleksandrov	cdda378bd8	net: bridge: mcast: drop sizeof for nest attribute's zero size This was a dumb error I made instead of writing nla_total_size(0) for a nest attribute, I wrote nla_total_size(sizeof(0)). Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Fixes: 606433fe3e11 ("net: bridge: mcast: dump ipv4 querier state") Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-16 13:58:00 +01:00
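The size difference behind cdda378bd8 can be worked out with a few lines of arithmetic. The sketch below re-implements the netlink attribute size helpers in Python for illustration only. The constants are assumptions that mirror the usual Linux definitions (4-byte attribute header, 4-byte alignment) and a 4-byte `int`, so that `sizeof(0)` is 4.

```python
NLA_ALIGNTO = 4   # assumed netlink attribute alignment
NLA_HDRLEN = 4    # assumed aligned size of struct nlattr (two 16-bit fields)
SIZEOF_INT = 4    # assumed sizeof(0), i.e. sizeof(int)


def nla_align(length):
    """Round length up to the next multiple of NLA_ALIGNTO."""
    return (length + NLA_ALIGNTO - 1) & ~(NLA_ALIGNTO - 1)


def nla_total_size(payload):
    """Attribute header plus payload, padded to the alignment."""
    return nla_align(NLA_HDRLEN + payload)


# A nest attribute carries no payload of its own, only the header.
correct = nla_total_size(0)
# The buggy form reserved room for an int payload that is never written.
buggy = nla_total_size(SIZEOF_INT)
print(correct, buggy)
```

Under these assumptions the buggy expression over-reserves one aligned word per nest attribute. That is harmless for correctness of the dump but is not the intended size.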
Nikolay Aleksandrov	f137b7d4ec	net: bridge: mcast: don't dump querier state if snooping is disabled A minor improvement to avoid dumping mcast ctx querier state if snooping is disabled for that context (either bridge or vlan). Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-16 13:57:59 +01:00
David S. Miller	23a44b77e0	Merge branch 'stmmac-per-queue-stats' Vijayakannan Ayyathurai says: ==================== net: stmmac: Add ethtool per-queue statistic Adding generic ethtool per-queue statistic framework to display the statistics for each rx/tx queue. In future, users can avail it to add more per-queue specific counters. Number of rx/tx queues displayed is depending on the available rx/tx queues in that particular MAC config and this number is limited up to the MTL_MAX_{RX|TX}_QUEUES defined in the driver. Ethtool per-queue statistic display will look like below, when users start adding more counters. Example - 1: q0_tx_statA: q0_tx_statB: q0_tx_statC: | q0_tx_statX: . . . qMAX_tx_statA: qMAX_tx_statB: qMAX_tx_statC: | qMAX_tx_statX: q0_rx_statA: q0_rx_statB: q0_rx_statC: | q0_rx_statX: . . . qMAX_rx_statA: qMAX_rx_statB: qMAX_rx_statC: | qMAX_rx_statX: Example - 2: Ping test using the tx queue 3. $ tc qdisc add dev enp0s30f4 root mqprio num_tc 2 map 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 queues 3@0 1@3 hw 0 Statistic before ping: --------------------- $ ethtool -S enp0s30f4 [ snip ] q3_tx_pkt_n: 7916 q3_tx_irq_n: 316 [ snip ] $ cat /proc/interrupts [ snip ] 143: 0 0 0 316 0 0 0 0 IR-PCI-MSI 499719-edge enp0s30f4:tx-3 [ snip ] $ ping -I enp0s30f4 192.168.1.10 -i 0.01 -c 100 > /dev/null Statistic after ping: --------------------- $ ethtool -S enp0s30f4 [ snip ] q3_tx_pkt_n: 8016 q3_tx_irq_n: 320 [ snip ] $ cat /proc/interrupts [ snip ] 143: 0 0 0 320 0 0 0 0 IR-PCI-MSI 499719-edge enp0s30f4:tx-3 [ snip ] ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-16 11:36:04 +01:00
Vijayakannan Ayyathurai	af9bf70154	net: stmmac: add ethtool per-queue irq statistic support Adding ethtool per-queue statistics support to show number of interrupts generated at DMA tx and DMA rx. All the counters are incremented at dwmac4_dma_interrupt function. Signed-off-by: Vijayakannan Ayyathurai <vijayakannan.ayyathurai@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-16 11:36:04 +01:00
Vijayakannan Ayyathurai	68e9c5dee1	net: stmmac: add ethtool per-queue statistic framework Adding generic ethtool per-queue statistic framework to display the statistics for each rx/tx queue. In future, users can avail it to add more per-queue specific counters. Number of rx/tx queues displayed is depending on the available rx/tx queues in that particular MAC config and this number is limited up to the MTL_MAX_{RX|TX}_QUEUES defined in the driver. Ethtool per-queue statistic display will look like below, when users start adding more counters. Example: q0_tx_statA: q0_tx_statB: q0_tx_statC: | q0_tx_statX: . . . qMAX_tx_statA: qMAX_tx_statB: qMAX_tx_statC: | qMAX_tx_statX: q0_rx_statA: q0_rx_statB: q0_rx_statC: | q0_rx_statX: . . . qMAX_rx_statA: qMAX_rx_statB: qMAX_rx_statC: | qMAX_rx_statX: In addition, this patch has the support on displaying the number of packets received and transmitted per queue. Signed-off-by: Vijayakannan Ayyathurai <vijayakannan.ayyathurai@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-16 11:36:04 +01:00
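The naming layout in 68e9c5dee1 (all tx queues first, then all rx queues, each queue listing its counters) can be illustrated with a short generator. This is a sketch, not the stmmac implementation: the function name and its parameters are hypothetical, and the `max_tx`/`max_rx` arguments stand in for the driver's MTL_MAX_{RX|TX}_QUEUES limits, whose values are not given in the log.

```python
def per_queue_stat_names(tx_queues, rx_queues, stats, max_tx, max_rx):
    """Build ethtool-style per-queue counter names in display order.

    tx_queues / rx_queues: queues available in the MAC configuration.
    stats: counter suffixes, e.g. ["pkt_n", "irq_n"].
    max_tx / max_rx: upper limits on the number of queues shown.
    """
    names = []
    for direction, count, limit in (
        ("tx", tx_queues, max_tx),
        ("rx", rx_queues, max_rx),
    ):
        # Never show more queues than the configured maximum.
        for queue in range(min(count, limit)):
            for stat in stats:
                names.append(f"q{queue}_{direction}_{stat}")
    return names


for name in per_queue_stat_names(2, 1, ["pkt_n", "irq_n"], 8, 8):
    print(name)
```

With counters named `pkt_n` and `irq_n`, the output uses the same pattern as the `q3_tx_pkt_n` and `q3_tx_irq_n` entries shown in the series cover letter.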
Voon Weifeng	1975df880b	net: stmmac: fix INTR TBU status affecting irq count statistic DMA channel status "Transmit buffer unavailable(TBU)" bit is not considered as a successful dma tx. Hence, it should not affect all the irq count statistic. Fixes: `1103d3a553` ("net: stmmac: dwmac4: Also use TBU interrupt to clean TX path") Signed-off-by: Voon Weifeng <weifeng.voon@intel.com> Signed-off-by: Vijayakannan Ayyathurai <vijayakannan.ayyathurai@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-16 11:36:04 +01:00
Vladimir Oltean	022522aca4	net: dsa: sja1105: reorganize probe, remove, setup and teardown ordering The sja1105 driver's initialization and teardown sequence is a chaotic mess that has gathered a lot of cruft over time. It works because there is no strict dependency between the functions, but it could be improved. The basic principle that teardown should be the exact reverse of setup is obviously not held. We have initialization steps (sja1105_tas_setup, sja1105_flower_setup) in the probe method that are torn down in the DSA .teardown method instead of driver unbind time. We also have code after the dsa_register_switch() call, which implicitly means after the .setup() method has finished, which is pretty unusual. Also, sja1105_teardown() has calls set up in a different order than the error path of sja1105_setup(): see the reversed ordering between sja1105_ptp_clock_unregister and sja1105_mdiobus_unregister. Also, sja1105_static_config_load() is called towards the end of sja1105_setup(), but sja1105_static_config_free() is also towards the end of the error path and teardown path. The static_config_load() call should be earlier. Also, making and breaking the connections between struct sja1105_port and struct dsa_port could be refactored into dedicated functions, makes the code easier to follow. We move some code from the DSA .setup() method into the probe method, like the device tree parsing, and we move some code from the probe method into the DSA .setup() method to be symmetric with its placement in the DSA .teardown() method, which is nice because the unbind function has a single call to dsa_unregister_switch(). Example of the latter type of code movement are the connections between ports mentioned above, they are now in the .setup() method. 
Finally, due to fact that the kthread_init_worker() call is no longer in sja1105_probe() - located towards the bottom of the file - but in sja1105_setup() - located much higher - there is an inverse ordering with the worker function declaration, sja1105_port_deferred_xmit. To avoid that, the entire sja1105_setup() and sja1105_teardown() functions are moved towards the bottom of the file. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-16 11:24:53 +01:00
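The principle stated in 022522aca4, that teardown should be the exact reverse of setup and that the error path should unwind the same way, can be sketched generically. The helper names below are hypothetical and the steps are plain callables. Nothing here is taken from the sja1105 driver itself.

```python
def run_setup(steps):
    """Run (name, setup, teardown) steps in order.

    If a step fails, the steps that already completed are torn down in
    reverse order before the error is re-raised.
    Returns the list of completed (name, teardown) pairs.
    """
    done = []
    try:
        for name, setup, teardown in steps:
            setup()
            done.append((name, teardown))
    except Exception:
        for _name, teardown in reversed(done):
            teardown()
        raise
    return done


def run_teardown(done):
    """Undo completed steps in the exact reverse of setup order."""
    for _name, teardown in reversed(done):
        teardown()
```

Keeping a single ordered list means the normal teardown path and the error path cannot drift apart, which is the kind of mismatch the commit message points out between sja1105_teardown() and the error path of sja1105_setup().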
Heiner Kallweit	c07c8ffc70	r8169: rename rtl_csi_access_enable to rtl_set_aspm_entry_latency Rename the function to reflect what it's doing. Also add a description of the register values as kindly provided by Realtek. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-16 11:21:54 +01:00
David S. Miller	793ee362b0	Merge branch 'ocelot-phylink' Vladimir Oltean says: ==================== Convert ocelot to phylink The ocelot switchdev and felix dsa drivers are interesting because they target the same class of hardware switches but used in different modes. Colin has an interesting use case where he wants to use a hardware switch supported by the ocelot switchdev driver with the felix dsa driver. So far, the existing hardware revisions were similar between the ocelot and felix drivers, but not completely identical. With identical hardware, it is absurd that the felix driver uses phylink while the ocelot driver uses phylib - this should not be one of the differences between the switchdev and dsa driver, and we could eliminate it. Colin will need the common phylink support in ocelot and felix when adding a phylink_pcs driver for the PCS1G block inside VSC7514, which will make the felix driver work with either the NXP or the Microchip PCS. As usual, Alex, Horatiu, sorry for bugging you, but it would be appreciated if you could give this a quick run on actual VSC7514 hardware (which I don't have) to make sure I'm not introducing any breakage. ==================== Fixes: `0f06a6787e` ("samples: Add an IPv6 "-6" option to the pktgen scripts") Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-16 11:19:34 +01:00
Vladimir Oltean	e6e12df625	net: mscc: ocelot: convert to phylink The felix DSA driver, which is a wrapper over the same hardware class as ocelot, is integrated with phylink, but ocelot is using the plain PHY library. It makes sense to bring together the two implementations, which is what this patch achieves. This is a large patch and hard to break up, but it does the following: The existing ocelot_adjust_link writes some registers, and felix_phylink_mac_link_up writes some registers, some of them are common, but both functions write to some registers to which the other doesn't. The main reasons for this are: - Felix switches so far have used an NXP PCS so they had no need to write the PCS1G registers that ocelot_adjust_link writes - Felix switches have the MAC fixed at 1G, so some of the MAC speed changes actually break the link and must be avoided. The naming conventions for the functions introduced in this patch are: - vsc7514_phylink_{mac_config,validate} are specific to the Ocelot instantiations and placed in ocelot_net.c which is built only for the ocelot switchdev driver. - ocelot_phylink_mac_link_{up,down} are shared between the ocelot switchdev driver and the felix DSA driver (they are put in the common lib). One by one, the registers written by ocelot_adjust_link are: DEV_MAC_MODE_CFG - felix_phylink_mac_link_up had no need to write this register since its out-of-reset value was fine and did not need changing. The write is moved to the common ocelot_phylink_mac_link_up and on felix it is guarded by a quirk bit that makes the written value identical with the out-of-reset one DEV_PORT_MISC - runtime invariant, was moved to vsc7514_phylink_mac_config PCS1G_MODE_CFG - same as above PCS1G_SD_CFG - same as above PCS1G_CFG - same as above PCS1G_ANEG_CFG - same as above PCS1G_LB_CFG - same as above DEV_MAC_ENA_CFG - both ocelot_adjust_link and ocelot_port_disable touched this. felix_phylink_mac_link_{up,down} also do. 
We go with what felix does and put it in ocelot_phylink_mac_link_up. DEV_CLOCK_CFG - ocelot_adjust_link and felix_phylink_mac_link_up both write this, but to different values. Move to the common ocelot_phylink_mac_link_up and make sure via the quirk that the old values are preserved for both. ANA_PFC_PFC_CFG - ocelot_adjust_link wrote this, felix_phylink_mac_link_up did not. Runtime invariant, speed does not matter since PFC is disabled via the RX_PFC_ENA bits which are cleared. Move to vsc7514_phylink_mac_config. QSYS_SWITCH_PORT_MODE_PORT_ENA - both ocelot_adjust_link and felix_phylink_mac_link_{up,down} wrote this. Ocelot also wrote this register from ocelot_port_disable. Keep what felix did, move in ocelot_phylink_mac_link_{up,down} and delete ocelot_port_disable. ANA_POL_FLOWC - same as above SYS_MAC_FC_CFG - same as above, except slight behavior change. Whereas ocelot always enabled RX and TX flow control, felix listened to phylink (for the most part, at least - see the 2500base-X comment). The registers which only felix_phylink_mac_link_up wrote are: SYS_PAUSE_CFG_PAUSE_ENA - this is why I am not sure that flow control worked on ocelot. Now it should, since the code is shared with felix where it does. ANA_PORT_PORT_CFG - this is a Frame Analyzer block register, phylink should be the one touching them, deleted. Other changes: - The old phylib registration code was in mscc_ocelot_init_ports. It is hard to work with 2 levels of indentation already in, and with hard to follow teardown logic. The new phylink registration code was moved inside ocelot_probe_port(), right between alloc_etherdev() and register_netdev(). It could not be done before (=> outside of) ocelot_probe_port() because ocelot_probe_port() allocates the struct ocelot_port which we then use to assign ocelot_port->phy_mode to. It is more preferable to me to have all PHY handling logic inside the same function. 
- On the same topic: struct ocelot_port_private :: serdes is only used in ocelot_port_open to set the SERDES protocol to Ethernet. This is logically a runtime invariant and can be done just once, when the port registers with phylink. We therefore don't even need to keep the serdes reference inside struct ocelot_port_private, or to use the devm variant of of_phy_get(). - Phylink needs a valid phy-mode for phylink_create() to succeed, and the existing device tree bindings in arch/mips/boot/dts/mscc/ocelot_pcb120.dts don't define one for the internal PHY ports. So we patch PHY_INTERFACE_MODE_NA into PHY_INTERFACE_MODE_INTERNAL. - There was a strategically placed: switch (priv->phy_mode) { case PHY_INTERFACE_MODE_NA: continue; which made the code skip the serdes initialization for the internal PHY ports. Frankly that is not all that obvious, so now we explicitly initialize the serdes under an "if" condition and not rely on code jumps, so everything is clearer. - There was a write of OCELOT_SPEED_1000 to DEV_CLOCK_CFG for QSGMII ports. Since that is in fact the default value for the register field DEV_CLOCK_CFG_LINK_SPEED, I can only guess the intention was to clear the adjacent fields, MAC_TX_RST and MAC_RX_RST, aka take the port out of reset, which does match the comment. I don't even want to know why this code is placed there, but if there is indeed an issue that all ports that share a QSGMII lane must all be up, then this logic is already buggy, since mscc_ocelot_init_ports iterates using for_each_available_child_of_node, so nobody prevents the user from putting a 'status = "disabled";' for some QSGMII ports which would break the driver's assumption. In any case, in the eventuality that I'm right, we would have yet another issue if ocelot_phylink_mac_link_down would reset those ports and that would be forbidden, so since the ocelot_adjust_link logic did not do that (maybe for a reason), add another quirk to preserve the old logic. 
The ocelot driver teardown goes through all ports in one fell swoop. When initialization of one port fails, the ocelot->ports[port] pointer for that is reset to NULL, and teardown is done only for non-NULL ports, so there is no reason to do partial teardowns, let the central mscc_ocelot_release_ports() do its job. Tested bind, unbind, rebind, link up, link down, speed change on mock-up hardware (modified the driver to probe on Felix VSC9959). Also regression tested the felix DSA driver. Could not test the Ocelot specific bits (PCS1G, SERDES, device tree bindings). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-16 11:19:34 +01:00
Vladimir Oltean	46efe4efb9	net: dsa: felix: stop calling ocelot_port_{enable,disable} ocelot_port_enable touches ANA_PORT_PORT_CFG, which has the following fields: - LOCKED_PORTMOVE_CPU, LEARNDROP, LEARNCPU, LEARNAUTO, RECV_ENA, all of which are written with their hardware default values, also runtime invariants. So it makes no sense to write these during every .ndo_open. - PORTID_VAL: this field has an out-of-reset value of zero for all ports and must be initialized by software. Additionally, the ocelot_setup_logical_port_ids() code path sets up different logical port IDs for the ports in a hardware LAG, and we absolutely don't want .ndo_open to interfere there and reset those values. So in fact the write from ocelot_port_enable can better be moved to ocelot_init_port, and the .ndo_open hook deleted. ocelot_port_disable touches DEV_MAC_ENA_CFG and QSYS_SWITCH_PORT_MODE_PORT_ENA, in an attempt to undo what ocelot_adjust_link did. But since .ndo_stop does not get called each time the link falls (i.e. this isn't a substitute for .phylink_mac_link_down), felix already does better at this by writing those registers already in felix_phylink_mac_link_down. So keep ocelot_port_disable (for now, until ocelot is converted to phylink too), and just delete the felix call to it, which is not necessary. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-16 11:19:34 +01:00
Changbin Du	e871ee6941	s390/net: replace in_irq() with in_hardirq() Replace the obsolete and ambiguous macro in_irq() with the new macro in_hardirq(). Signed-off-by: Changbin Du <changbin.du@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-16 11:15:47 +01:00
Vladimir Oltean	b2b8913341	net: dsa: tag_8021q: fix notifiers broadcast when they shouldn't, and vice versa During the development of the blamed patch, the "bool broadcast" argument of dsa_port_tag_8021q_vlan_{add,del} was originally called "bool local", and the meaning was the exact opposite. Due to a rookie mistake where the patch was modified at the last minute without retesting, the instances of dsa_port_tag_8021q_vlan_{add,del} are called with the wrong values. During setup and teardown, cross-chip notifiers should not be broadcast to all DSA trees, while during bridging, they should. Fixes: `724395f4dc` ("net: dsa: tag_8021q: don't broadcast during setup/teardown") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-16 11:14:18 +01:00
Randy Dunlap	944f510176	ptp: ocp: don't allow on S390 Fix kconfig warning on arch/s390/: WARNING: unmet direct dependencies detected for SERIAL_8250 Depends on [n]: TTY [=y] && HAS_IOMEM [=y] && !S390 [=y] Selected by [m]: - PTP_1588_CLOCK_OCP [=m] && PTP_1588_CLOCK [=m] && HAS_IOMEM [=y] && PCI [=y] && SPI [=y] && I2C [=m] && MTD [=m] Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-16 11:13:29 +01:00
Rao Shoaib	19eed72107	af_unix: check socket state when queuing OOB edumazet@google.com pointed out that queue_oob does not check socket state after acquiring the lock. He also pointed to an incorrect usage of kfree_skb and an unnecessary setting of skb length. This patch addresses those issues. Signed-off-by: Rao Shoaib <Rao.Shoaib@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-16 11:12:37 +01:00
Song Yoong Siang	6164659ff7	net: phy: marvell: Add WAKE_PHY support to WOL event Add Wake-on-PHY feature support by enabling the Link Up Event. Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-16 11:05:52 +01:00
Wong Vee Khee	849d2f83f5	net: pcs: xpcs: Add Pause Mode support for SGMII and 2500BaseX SGMII/2500BaseX supports Pause frame as defined in the IEEE802.3x Flow Control standardization. Add this as a supported feature under the xpcs_sgmii_features struct. Cc: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-16 11:03:22 +01:00
David S. Miller	5fa5fb8b3b	Merge branch 'pktgen-samples' samples: pktgen: enhance the usability of pktgen samples This patchset improves the usability of pktgen samples by adding an option for propagating the environment variables of a normal user to sudo. It also adds the missing IPv6 option to pktgen scripts. Currently, all pktgen samples are able to use environment variables instead of optional parameters. However, this doesn't work appropriately when running samples as a normal user. These are the results of running a sample as root and as a normal user: // running as root # DEV=eth0 DEST_IP=10.1.0.1 DST_MAC=00:11:22:33:44:55 ./pktgen_sample01_simple.sh -v -n 1 Running... ctrl^C to stop // running as normal user $ DEV=eth0 DEST_IP=10.1.0.1 DST_MAC=00:11:22:33:44:55 ./pktgen_sample01_simple.sh -v -n 1 [...] ERROR: Please specify output device The reason why passing the environment variable doesn't work properly when running samples as a normal user is that the environment variables of the normal user don't propagate to sudo (root_check_run_with_sudo). So the first commit solves this issue by using the "-E" (--preserve-env) option of "sudo", which passes the normal user's existing environment variables. Also, "sample04" and "sample05" are not working properly when running with the IPv6 option parameter ("-6"), because the commit `0f06a6787e` ("samples: Add an IPv6 "-6" option to the pktgen scripts") omitted the addition of this option to these samples. So the second commit adds the missing IPv6 option to pktgen scripts. ==================== Fixes: `0f06a6787e` ("samples: Add an IPv6 "-6" option to the pktgen scripts") Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-16 11:02:09 +01:00
Juhee Kang	0f0c4f1b72	samples: pktgen: add missing IPv6 option to pktgen scripts Currently, "sample04" and "sample05" are not working properly when running with an IPv6 option("-6"). The commit `0f06a6787e` ("samples: Add an IPv6 "-6" option to the pktgen scripts") has omitted the addition of this option at "sample04" and "sample05". In order to support IPv6 option, this commit adds logic related to IPv6 option. Fixes: `0f06a6787e` ("samples: Add an IPv6 "-6" option to the pktgen scripts") Signed-off-by: Juhee Kang <claudiajkang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-16 11:02:09 +01:00
Juhee Kang	7caeabd726	samples: pktgen: pass the environment variable of normal user to sudo All pktgen samples can use environment variables instead of option parameters (e.g. $DEV can be used instead of the '-i' option). These are the results of running a sample as root and as a normal user: // running as root # DEV=eth0 DEST_IP=10.1.0.1 DST_MAC=00:11:22:33:44:55 ./pktgen_sample01_simple.sh -v -n 1 Running... ctrl^C to stop // running as normal user $ DEV=eth0 DEST_IP=10.1.0.1 DST_MAC=00:11:22:33:44:55 ./pktgen_sample01_simple.sh -v -n 1 [...] ERROR: Please specify output device These results show that the sample doesn't work properly when it runs as a normal user, because the sample is restarted by the function (root_check_run_with_sudo) to run with sudo. In this process, the environment variables of the normal user don't propagate to sudo. This can be solved by using the "-E" (--preserve-env) option of "sudo", which preserves the normal user's existing environment variables. So this commit adds the "-E" option in the function (root_check_run_with_sudo). Signed-off-by: Juhee Kang <claudiajkang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-16 11:02:09 +01:00
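The re-exec step described in 7caeabd726 can be illustrated as a pure function. The pktgen samples are shell scripts, so this is only a model of the idea and not the code of root_check_run_with_sudo. The function name `sudo_argv` is hypothetical, and the only detail taken from the commit message is the addition of `-E` so that variables such as DEV survive the switch to root.

```python
def sudo_argv(argv, euid):
    """Return the command line to re-run under sudo, or None if already root.

    argv: the original command line (script path plus arguments).
    euid: effective user id of the caller; 0 means root.
    """
    if euid == 0:
        # Already root: the environment is used directly, nothing to re-run.
        return None
    # -E (--preserve-env) asks sudo to keep the caller's environment.
    return ["sudo", "-E"] + list(argv)


print(sudo_argv(["./pktgen_sample01_simple.sh", "-v", "-n", "1"], 1000))
```

A caller would pass its own argument vector and effective uid (for example `sys.argv` and `os.geteuid()`) and execute the returned list when it is not None. Whether sudo actually preserves the variables also depends on the local sudoers policy.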
David S. Miller	cbbb7abdd0	Merge branch 'ipq-mdio' Luo Jie says: ==================== net: mdio: Add IPQ MDIO reset related function This patch series add the MDIO reset features, which includes configuring MDIO clock source frequency and indicating CMN_PLL that ethernet LDO has been ready, this ethernet LDO is dedicated in the IPQ5018 platform. Specify more chipset IPQ40xx, IPQ807x, IPQ60xx and IPQ50xx supported by this MDIO driver. Changes in v3: * simplify the function ipq_mdio_reset. Changes in v2: * Addressed review comments (Andrew Lunn). * Remove the IS_ERR(). * make binding patch part of series. * document the property 'reg' and 'clock'. Changes in v1: * make MDIO_IPQ4019 unchanged for backwards compatibility. * remove the PHY reset functions ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-16 10:30:27 +01:00
Luo Jie	2a4c32e767	dt-bindings: net: Add the properties for ipq4019 MDIO The newly added "reg" resource property is for configuring the ethernet LDO in the IPQ5018 chipset; the "clocks" property is for configuring the MDIO clock source frequency. Signed-off-by: Luo Jie <luoj@codeaurora.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-16 10:30:27 +01:00
Luo Jie	c76ee26306	MDIO: Kconfig: Specify more IPQ chipset supported The IPQ MDIO driver currently supports the chipset IPQ40xx, IPQ807x, IPQ60xx and IPQ50xx. Add the compatible 'qcom,ipq5018-mdio' because of ethernet LDO dedicated to the IPQ5018 platform. Signed-off-by: Luo Jie <luoj@codeaurora.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-16 10:30:27 +01:00
Luo Jie	23a890d493	net: mdio: Add the reset function for IPQ MDIO driver 1. configure the MDIO clock source frequency. 2. the LDO resource is needed to configure the ethernet LDO available for CMN_PLL. Signed-off-by: Luo Jie <luoj@codeaurora.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-16 10:30:27 +01:00
Cai Huoqing	e4637f6212	MAINTAINERS: Remove the ipx network layer info Commit 47595e32869f ("MAINTAINERS: Mark some staging directories") marked the ipx network layer as obsolete in the MAINTAINERS file in Jan 2018. Now, after it has been exposed to refactoring for 3 years, remove the ipx network layer info from MAINTAINERS. Additionally, no module depends on ipx.h except a broken staging driver (r8188eu). Signed-off-by: Cai Huoqing <caihuoqing@baidu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-14 14:16:13 +01:00
Cai Huoqing	6c9b408447	net: Remove net/ipx.h and uapi/linux/ipx.h header files Commit 47595e32869f ("MAINTAINERS: Mark some staging directories") marked the ipx network layer as obsolete in the MAINTAINERS file in Jan 2018. Now, after it has been exposed to refactoring for 3 years, delete the uapi/linux/ipx.h and net/ipx.h header files for good. Additionally, no module depends on ipx.h except a broken staging driver (r8188eu). Signed-off-by: Cai Huoqing <caihuoqing@baidu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-14 14:16:12 +01:00
David S. Miller	fda4e19d50	Merge branch 'iupa-last-things-before-pm-conversion' Alex Elder says: ==================== net: ipa: last things before PM conversion This series contains a few remaining changes needed before fully switching over to using runtime power management rather than the previous "IPA clock" mechanism. The first patch moves the calls to enable and disable the IPA interrupt as a system wakeup interrupt into "ipa_clock.c" with the rest of the power-related code. The second adds a flag to make it possible to distinguish runtime suspend from system suspend. The third and fourth patches arrange for the ->start_xmit path to resume hardware if necessary, to ensure it is powered. If power is not active, the TX queue is stopped, and arrangements are made for the queue to be restarted once hardware power is active again. The fifth patch keeps the TX queue active during suspend. This isn't necessary for system suspend but it's important for runtime suspend. And the last patch makes it so we don't hold the hardware active while the modem network device is open. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-14 14:13:39 +01:00
Alex Elder	8dc181f2cd	net: ipa: don't hold clock reference while netdev open Currently a clock reference is taken whenever the ->ndo_open callback for the modem netdev is called. That reference is dropped when the device is closed, in ipa_stop(). We no longer need this, because ipa_start_xmit() now handles the situation where the hardware power state is not active. Drop the clock reference in ipa_open() when we're done, and take a new reference in ipa_stop() before we begin closing the interface. Finally (and unrelated, but trivial), change the return type of ipa_start_xmit() to be netdev_tx_t instead of int. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-14 14:13:39 +01:00
Alex Elder	8dcf8bb30f	net: ipa: don't stop TX on suspend Currently we stop the modem netdev transmit queue when suspending the hardware. For system suspend this ensured we'd never attempt to transmit while attempting to suspend the modem endpoints. For runtime suspend, the IPA hardware might get suspended while the system is operating. In that case we want an attempt to transmit a packet to cause the hardware to resume if necessary. But if we disable the queue this cannot happen. So stop disabling the queue on suspend. In case we end up disabling it in ipa_start_xmit() (see the previous commit), we still arrange to start the TX queue on resume. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-14 14:13:38 +01:00
Alex Elder	6b51f802d6	net: ipa: ensure hardware has power in ipa_start_xmit() We need to ensure the hardware is powered when we transmit a packet. But if it's not, we can't block to wait for it. So asynchronously request power in ipa_start_xmit(), and only proceed if the return value indicates the power state is active. If the hardware is not active, a runtime resume request will have been initiated. In that case, stop the network stack from further transmit attempts until the resume completes. Return NETDEV_TX_BUSY, to retry sending the packet once the queue is restarted. If the power request returns an error (other than -EINPROGRESS, which just means a resume requested elsewhere isn't complete), just drop the packet. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-14 14:13:38 +01:00
Alex Elder	a96e73fa12	net: ipa: re-enable transmit in PM WQ context Create a new work structure in the modem private data, and use it to re-enable the modem network device transmit queue when resuming. This is needed by the next patch, which stops the TX queue if IPA power isn't active when a transmit request arrives. Packets will start arriving the instant the TX queue is enabled, but resuming isn't complete until ipa_modem_resume() returns. This way we're sure to be resumed before transmits are allowed again. Cancel it before calling ipa_stop() in ipa_modem_stop() to ensure the transmit queue restart completes before it gets stopped there. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-14 14:13:38 +01:00
Alex Elder	b9c532c11c	net: ipa: distinguish system from runtime suspend Add a new flag that is set when the hardware is suspended due to a system suspend operation, distinguishing it from runtime suspend. Use it in the SUSPEND IPA interrupt handler to determine whether to trigger a system resume because of the event. Define new suspend and resume power management callback functions to set and clear the new flag, respectively. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-14 14:13:38 +01:00
Alex Elder	d430fe4bac	net: ipa: enable wakeup in ipa_power_setup() Move the call to enable the IPA interrupt as a wakeup interrupt into ipa_power_setup(), disable it in ipa_power_teardown(). Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-14 14:13:38 +01:00
David S. Miller	8db102a6f4	Merge branch 'bridgge-mcast' Nikolay Aleksandrov says: ==================== net: bridge: mcast: dump querier state This set adds the ability to dump the current multicast querier state. This is extremely useful when debugging multicast issues, we've had many cases of unexpected queriers causing strange behaviour and mcast test failures. The first patch changes the querier struct to record a port device's ifindex instead of a pointer to the port itself so we can later retrieve it, I chose this way because it's much simpler and doesn't require us to do querier port ref counting, it is best effort anyway. Then patch 02 makes the querier address/port updates consistent via a combination of multicast_lock and seqcount, so readers can only use seqcount to get a consistent snapshot of address and port. Patch 03 is a minor cleanup in preparation for the dump support, it consolidates IPv4 and IPv6 querier selection paths as they share most of the logic (except address comparisons of course). Finally the last three patches add the new querier state dumping support, for the bridge's global multicast context we embed the BRIDGE_QUERIER_xxx attributes into IFLA_BR_MCAST_QUERIER_STATE and for the per-vlan global mcast contexts we embed them into BRIDGE_VLANDB_GOPTS_MCAST_QUERIER_STATE. The structure is: [IFLA_BR_MCAST_QUERIER_STATE / BRIDGE_VLANDB_GOPTS_MCAST_QUERIER_STATE] `[BRIDGE_QUERIER_IP_ADDRESS] - ip address of the querier `[BRIDGE_QUERIER_IP_PORT] - bridge port ifindex where the querier was seen (set only if external querier) `[BRIDGE_QUERIER_IP_OTHER_TIMER] - other querier timeout `[BRIDGE_QUERIER_IPV6_ADDRESS] - ip address of the querier `[BRIDGE_QUERIER_IPV6_PORT] - bridge port ifindex where the querier was seen (set only if external querier) `[BRIDGE_QUERIER_IPV6_OTHER_TIMER] - other querier timeout Later we can also add IGMP version of seen queriers and last seen values from the queries. ====================	2021-08-14 14:02:43 +01:00
Nikolay Aleksandrov	ddc649d158	net: bridge: vlan: dump mcast ctx querier state Use the new mcast querier state dump infrastructure and export vlans' mcast context querier state embedded in attribute BRIDGE_VLANDB_GOPTS_MCAST_QUERIER_STATE. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-14 14:02:43 +01:00
Nikolay Aleksandrov	85b4108211	net: bridge: mcast: dump ipv6 querier state Add support for dumping global IPv6 querier state, we dump the state only if our own querier is enabled or there has been another external querier which has won the election. For the bridge global state we use a new attribute IFLA_BR_MCAST_QUERIER_STATE and embed the state inside. The structure is: [IFLA_BR_MCAST_QUERIER_STATE] `[BRIDGE_QUERIER_IPV6_ADDRESS] - ip address of the querier `[BRIDGE_QUERIER_IPV6_PORT] - bridge port ifindex where the querier was seen (set only if external querier) `[BRIDGE_QUERIER_IPV6_OTHER_TIMER] - other querier timeout IPv4 and IPv6 attributes are embedded at the same level of IFLA_BR_MCAST_QUERIER_STATE. If we didn't dump anything we cancel the nest and return. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-14 14:02:43 +01:00
Nikolay Aleksandrov	c7fa1d9b1f	net: bridge: mcast: dump ipv4 querier state Add support for dumping global IPv4 querier state, we dump the state only if our own querier is enabled or there has been another external querier which has won the election. For the bridge global state we use a new attribute IFLA_BR_MCAST_QUERIER_STATE and embed the state inside. The structure is: [IFLA_BR_MCAST_QUERIER_STATE] `[BRIDGE_QUERIER_IP_ADDRESS] - ip address of the querier `[BRIDGE_QUERIER_IP_PORT] - bridge port ifindex where the querier was seen (set only if external querier) `[BRIDGE_QUERIER_IP_OTHER_TIMER] - other querier timeout Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-14 14:02:43 +01:00
Nikolay Aleksandrov	c3fb3698f9	net: bridge: mcast: consolidate querier selection for ipv4 and ipv6 We can consolidate both functions as they share almost the same logic. This is easier to maintain and we have a single querier update function. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-14 14:02:43 +01:00
Nikolay Aleksandrov	67b746f94f	net: bridge: mcast: make sure querier port/address updates are consistent Use a sequence counter to make sure port/address updates can be read consistently without requiring the bridge multicast_lock. We need to zero out the port and address when the other querier has expired and we're about to select ourselves as querier. br_multicast_read_querier will be used later when dumping querier state. Updates are done only with the multicast spinlock and softirqs disabled, while reads are done from process context and from softirqs (due to notifications). Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-14 14:02:43 +01:00
Nikolay Aleksandrov	bb18ef8e7e	net: bridge: mcast: record querier port device ifindex instead of pointer Currently when a querier port is detected its net_bridge_port pointer is recorded, but it's used only for comparisons so it's fine to have stale pointer, in order to dereference and use the port pointer a proper accounting of its usage must be implemented adding unnecessary complexity. To solve the problem we can just store the netdevice ifindex instead of the port pointer and retrieve the bridge port. It is a best effort and the device needs to be validated that is still part of that bridge before use, but that is small price to pay for avoiding querier reference counting for each port/vlan. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-14 14:02:43 +01:00
David S. Miller	2fa16787c4	Merge branch 'devlink-cleanup-for-delay-event' Leon Romanovsky says: ==================== Devlink cleanup for delay event series Jakub's request to make sure that devlink events are delayed and not printed until they are fully accessible [1] requires us to implement a delayed event notification system in devlink. In order to do it, I moved some of my patches (xarray etc.) from the future series to be before "Move devlink_register to be near devlink_reload_enable" [2]. That allows us to rely on the DEVLINK_REGISTERED xarray mark to decide whether to print an event or not. The other patches are simple cleanups which are needed anyway. [1] https://lore.kernel.org/lkml/20210811071817.4af5ab34@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com [2] https://lore.kernel.org/lkml/cover.1628599239.git.leonro@nvidia.com Next in the queue: * Delay event series * Move devlink_register to be near devlink_reload_enable * Extension of devlink_ops to be set dynamically * devlink_reload_* delete * Devlink locks rework to use xarray and reference counting * ???? ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-14 13:59:10 +01:00

1031974 Commits