linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-11-15 00:04:15 +08:00

Author	SHA1	Message	Date
David S. Miller	e3dd762772	Merge branch 'mlxsw-fw_load_policy' Ido Schimmel says: ==================== mlxsw: Add 'fw_load_policy' devlink parameter Shalom says: Currently, drivers do not have the ability to control the firmware loading policy and they always use their own fixed policy. This prevents drivers from running the device with a different firmware version for testing and/or debugging purposes. For example, testing a firmware bug fix. For these situations, the new devlink generic parameter, 'fw_load_policy', gives the ability to control this option and allows drivers to run with a different firmware version than required by the driver. Patch #1 adds the new parameter to devlink. The other two patches, #2 and #3, add support for this parameter in the mlxsw driver. Example: # Query the devlink parameters supported by the device $ devlink dev param show pci/0000:03:00.0: name fw_load_policy type generic values: cmode driverinit value driver # Flash new firmware using ethtool $ ethtool -f swp1 mellanox/mlxsw_spectrum-13.1703.4.mfa2 # Toggle parameter $ devlink dev param set pci/0000:03:00.0 name fw_load_policy value flash cmode driverinit # devlink reset $ devlink dev reload pci/0000:03:00.0 # Query firmware version to show changes took effect $ ethtool -i swp1 driver: mlxsw_spectrum version: 1.0 firmware-version: 13.1703.4 expansion-rom-version: bus-info: 0000:03:00.0 supports-statistics: yes supports-test: no supports-eeprom-access: no supports-register-dump: no supports-priv-flags: no iproute2 patches available here: https://github.com/tshalom/iproute2-next v2: * Change 'fw_version_check' to 'fw_load_policy' with values 'driver' and 'flash' (Jakub) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-03 13:55:44 -08:00
Shalom Toledo	064501c5b6	mlxsw: spectrum: Load firmware version based on devlink parameter Load firmware version based on 'fw_load_policy' devlink parameter. The driver supports these two options: * DEVLINK_PARAM_FW_LOAD_POLICY_VALUE_DRIVER (0) Default, load firmware version preferred by the driver * DEVLINK_PARAM_FW_LOAD_POLICY_VALUE_FLASH (1) Load firmware currently stored in flash The second option, 'flash', allows the device to run with a different firmware version than preferred by the driver for testing and/or debugging purposes. For example, testing a firmware bug fix. Signed-off-by: Shalom Toledo <shalomt@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-03 13:55:43 -08:00
Shalom Toledo	03bffcad49	mlxsw: core: Reset firmware after flash during driver initialization After flashing new firmware during the driver initialization flow (reload or not), the driver should do a firmware reset when it gets -EAGAIN in order to load the new one. Signed-off-by: Shalom Toledo <shalomt@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-03 13:55:43 -08:00
Shalom Toledo	846e980a87	devlink: Add 'fw_load_policy' generic parameter Many drivers load the device's firmware image during the initialization flow either from the flash or from the disk. Currently this option is not controlled by the user and the driver decides from where to load the firmware image. 'fw_load_policy' gives the ability to control this option which allows the user to choose between different loading policies supported by the driver. This parameter can be useful while testing and/or debugging the device. For example, testing a firmware bug fix. Signed-off-by: Shalom Toledo <shalomt@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-03 13:55:43 -08:00
Heiner Kallweit	6915bf3b00	net: phy: don't allow __set_phy_supported to add unsupported modes Currently __set_phy_supported allows to add modes w/o checking whether the PHY supports them. This is wrong, it should never add modes but only remove modes we don't want to support. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-03 13:50:06 -08:00
Nathan Chancellor	97e6c858a2	net: usb: aqc111: Initialize wol_cfg with memset in aqc111_suspend Clang warns: drivers/net/usb/aqc111.c:1326:37: warning: suggest braces around initialization of subobject [-Wmissing-braces] struct aqc111_wol_cfg wol_cfg = { 0 }; ^ {} 1 warning generated. Use memset to initialize the object to take compiler instrumentation out of the equation. Fixes: `e58ba4544c` ("net: usb: aqc111: Add support for wake on LAN by MAGIC packet") Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 17:26:15 -08:00
YueHaibing	315c9e8301	net: qualcomm: rmnet: Remove set but not used variable 'cmd' Fixes gcc '-Wunused-but-set-variable' warning: drivers/net/ethernet/qualcomm/rmnet/rmnet_map_command.c: In function 'rmnet_map_do_flow_control': drivers/net/ethernet/qualcomm/rmnet/rmnet_map_command.c:23:36: warning: variable 'cmd' set but not used [-Wunused-but-set-variable] struct rmnet_map_control_command *cmd; 'cmd' not used anymore now, should also be removed. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 17:24:01 -08:00
Nicolas Dichtel	26d31925cd	tun: implement carrier change The userspace may need to control the carrier state. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Didier Pallard <didier.pallard@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 17:16:38 -08:00
Paolo Abeni	bf1c3ab8d3	net: reorder flowi_common fields to avoid holes the flowi* structures are used and memsetted by server functions in critical path. Currently flowi_common has a couple of holes that we can eliminate reordering the struct fields. As a side effect, both flowi4 and flowi6 shrink by 8 bytes. Before: pahole -EC flowi_common struct flowi_common { // ... /* size: 40, cachelines: 1, members: 10 */ /* sum members: 32, holes: 1, sum holes: 4 */ /* padding: 4 */ /* last cacheline: 40 bytes */ }; pahole -EC flowi6 struct flowi6 { // ... /* size: 88, cachelines: 2, members: 6 */ /* padding: 4 */ /* last cacheline: 24 bytes */ }; pahole -EC flowi4 struct flowi4 { // ... /* size: 56, cachelines: 1, members: 4 */ /* padding: 4 */ /* last cacheline: 56 bytes */ }; After: struct flowi_common { // ... /* size: 32, cachelines: 1, members: 10 */ /* last cacheline: 32 bytes */ }; struct flowi6 { // ... /* size: 80, cachelines: 2, members: 6 */ /* padding: 4 */ /* last cacheline: 16 bytes */ }; struct flowi4 { // ... /* size: 48, cachelines: 1, members: 4 */ /* padding: 4 */ /* last cacheline: 48 bytes */ }; Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 17:12:39 -08:00
David S. Miller	f4bb495cde	Merge branch 'mlxsw-Add-VxLAN-support-with-VLAN-aware-bridges' Ido Schimmel says: ==================== mlxsw: Add VxLAN support with VLAN-aware bridges Commit `53e50a6ec2` ("Merge branch 'mlxsw-Add-VxLAN-support'") added mlxsw support for VxLAN when the VxLAN device was enslaved to VLAN-unaware bridges. This patchset extends mlxsw to also support VxLAN with VLAN-aware bridges. With VLAN-aware bridges, the VxLAN device's VNI is mapped to the VLAN that is configured as 'pvid untagged' on the corresponding bridge port. To prevent ambiguity, mlxsw forbids configurations in which the same VLAN is configured as 'pvid untagged' on multiple VxLAN devices. Patches #1-#2 add the necessary APIs in mlxsw and the bridge driver. Patches #3-#4 perform small refactoring in order to prepare mlxsw for VLAN-aware support. Patch #5 finally enables the enslavement of VxLAN devices to a VLAN-aware bridge. Among other things, it extends mlxsw to handle switchdev notifications about VLAN add / delete on a VxLAN device enslaved to an offloaded VLAN-aware bridge. Patches #6-#8 add selftests to test the new functionality. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 17:06:29 -08:00
Ido Schimmel	b5166d7a92	selftests: forwarding: Add VxLAN test with a VLAN-aware bridge The test is very similar to its VLAN-unaware counterpart (vxlan_bridge_1d.sh), but instead of using multiple VLAN-unaware bridges, a single VLAN-aware bridge is used with multiple VLANs. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 17:06:29 -08:00
Ido Schimmel	f07232375d	selftests: mlxsw: Add a test for VxLAN configuration with a VLAN-aware bridge Extend the existing VLAN-unaware tests with their VLAN-aware counterparts. This includes sanitization of invalid configuration and offload indication on the local route performing decapsulation and the FDB entries performing encapsulation. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 17:06:29 -08:00
Ido Schimmel	bbe210615d	selftests: mlxsw: Consider VLAN-aware bridges as valid Previous patches add the ability to work with VLAN-aware bridges and VxLAN devices, so make sure such configuration no longer fails. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 17:06:29 -08:00
Ido Schimmel	d70e42b22d	mlxsw: spectrum: Enable VxLAN enslavement to VLAN-aware bridges Commit `1c30d1836a` ("mlxsw: spectrum: Enable VxLAN enslavement to bridges") enabled the enslavement of VxLAN devices to bridges that have mlxsw ports (or their upper) as slaves. This patch extends mlxsw to also support VLAN-aware bridges. The patch is similar in nature to mentioned commit, but there is one major difference. With VLAN-aware bridges, the VxLAN device's VNI is mapped to the VLAN that is configured as PVID and egress untagged on the bridge port. Therefore, the driver is extended to listen to VLAN configuration on VxLAN devices of interest and enable / disable NVE encapsulation on the corresponding 802.1Q FIDs. To prevent ambiguity, the driver makes sure that a given VLAN is not configured as PVID and egress untagged on multiple VxLAN devices. This sanitization takes place both when a port is enslaved to a bridge with existing VxLAN devices and when a VLAN is added to / removed from a VxLAN device of interest. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 17:06:29 -08:00
Ido Schimmel	48fde46606	mlxsw: spectrum_switchdev: Prepare function for VLAN-aware bridges The vxlan_join() function resolves the FID on which the VNI should be set and then sets the VNI. Currently, the FID is simply resolved according to the ifindex of the bridge device to which the VxLAN device is enslaved. This works because only VLAN-unaware bridges are supported. With VLAN-aware bridges the FID would need to be resolved based on the VLAN to which the VNI is mapped. Add the VLAN ID to the argument list of the function. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 17:06:29 -08:00
Ido Schimmel	b03fa9e7e0	mlxsw: spectrum_switchdev: Unify VxLAN leave function The function mlxsw_sp_bridge_vxlan_leave() is currently split between VLAN-aware and VLAN-unaware bridges, but actually both types can use the same function. The function needs to resolve the FID that corresponds to the VxLAN device and disable NVE encapsulation on it. Instead of looking up the FID differently for VLAN-aware and VLAN-unaware bridges, we can always use the VxLAN device's VNI. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 17:06:29 -08:00
Ido Schimmel	5a8fb370be	mlxsw: spectrum_fid: Add API to lookup 802.1Q FIDs without creating them In a similar fashion to commit `564c6d727a` ("mlxsw: spectrum_fid: Add APIs to lookup FID without creating it"), add a corresponding API to lookup 802.1Q FIDs. This is a prerequisite to VxLAN support with VLAN-aware bridges and will allow us to resolve a 802.1Q FID by its VLAN when an FDB entry is added on the bridge port of the VxLAN device. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 17:06:29 -08:00
Ido Schimmel	5a6db04ca8	net: bridge: Extend br_vlan_get_pvid() for bridge ports Currently, the function only works for the bridge device itself, but subsequent patches will need to be able to query the PVID of a given bridge port, so extend the function. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 17:06:28 -08:00
David S. Miller	734317d93e	Merge branch 'qed-Doorbell-overflow-recovery' Ariel Elior says: ==================== qed*: Doorbell overflow recovery Doorbell Overflow If sufficient CPU cores will send doorbells at a sufficiently high rate, they can cause an overflow in the doorbell queue block message fifo. When fill level reaches maximum, the device stops accepting all doorbells from that PF until a recovery procedure has taken place. Doorbell Overflow Recovery The recovery procedure basically means resending the last doorbell for every doorbelling entity. A doorbelling entity is anything which may send doorbells: L2 tx ring, rdma sq/rq/cq, light l2, vf l2 tx ring, spq, etc. This relies on the design assumption that all doorbells are aggregative, so last doorbell carries the information of all previous doorbells. APIs All doorbelling entities need to register with the mechanism before sending doorbells. The registration entails providing the doorbell address the entity would be using, and a virtual address where last doorbell data can be found. Typically fastpath structures already have this construct. Executing the recovery procedure Handling the attentions, iterating over all the registered entities and resending their doorbells, is all handled within qed core module. Relevance All doorbelling entities in all protocols need to register with the mechanism, via the new APIs. Technically this is quite simple (just call the API). Some protocol fastpath implementation may not have the doorbell data stored anywhere (compute it from scratch every time) and will have to add such a place. This is rare and is also better practice (save some cycles on the fastpath). Performance Penalty No performance penalty should incur as a result of this feature. If anything performance can improve by avoiding recalculation of doorbell data every time doorbell is sent (in some flows).
Add the database used to register doorbelling entities, and APIs for adding and deleting entries, and logic for traversing the database and doorbelling once on behalf of all entities. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:45:13 -08:00
Ariel Elior	bd4db888ab	qede: Register l2 queues with doorbell overflow recovery mechanism All L2 queues funnel through this flow, so this would cover the regular RSS queues, as well as queues created for VFs, mqos queues, xdp queues, etc. Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:45:13 -08:00
Ariel Elior	0e1f10447e	qed: Expose the doorbell overflow recovery mechanism to the protocol drivers Most of the doorbelling entities are outside of the core module. L2 queues, Roce queues, iscsi and fcoe all need to register. Make the APIs available for these drivers. Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:45:13 -08:00
Ariel Elior	b78d5400bd	qed: Register light L2 queues with doorbell overflow recovery mechanism Light L2 queues are doorbelling entities. Modify the implementation to keep the doorbell data necessary for doorbelling in well known location instead of recomputing every time. Register the LL2 queue with doorbell recovery mechanism. Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:45:13 -08:00
Ariel Elior	9ecd8c3fea	qed: Register slowpath queue doorbell with doorbell overflow recovery mechanism Slow path queue is a doorbelling entity. Register it with the overflow mechanism. Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:45:13 -08:00
Ariel Elior	a1b469b8b1	qed: Use the doorbell overflow recovery mechanism in case of doorbell overflow In case of an attention from the doorbell queue block, analyze the HW indications. In case of a doorbell overflow, execute a doorbell recovery. Since there can be spurious indications (race conditions between multiple PFs), schedule a periodic task for checking whether a doorbell overflow may have been missed. After a set time with no indications, terminate the periodic task. Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:45:13 -08:00
Ariel Elior	36907cd5cd	qed: Add doorbell overflow recovery mechanism Add the database used to register doorbelling entities, and APIs for adding and deleting entries, and logic for traversing the database and doorbelling once on behalf of all entities. Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:45:12 -08:00
David S. Miller	dd354208dc	Merge branch 'rtnetlink-avoid-a-warning-in-rtnl_newlink' Jakub Kicinski says: ==================== rtnetlink: avoid a warning in rtnl_newlink() I've been hoping for some time that someone more competent would fix the stack frame size warning in rtnl_newlink(), but looks like I'll have to take a stab at it myself :) That's the only warning I see in most of my builds. First patch refactors away a somewhat surprising if (1) code block. Reindentation will most likely cause cherry-pick problems but OTOH rtnl_newlink() doesn't seem to be changed often, so perhaps we can risk it in the name of cleaner code? Second patch fixes the warning in simplest possible way. I was pondering if there is any more clever solution, but I can't see it.. rtnl_newlink() is quite long with a lot of possible execution paths so doing memory allocations half way through leads to very ugly results. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:33:35 -08:00
Jakub Kicinski	a293974590	rtnetlink: avoid frame size warning in rtnl_newlink() Standard kernel compilation produces the following warning: net/core/rtnetlink.c: In function ‘rtnl_newlink’: net/core/rtnetlink.c:3232:1: warning: the frame size of 1288 bytes is larger than 1024 bytes [-Wframe-larger-than=] } ^ This should not really be an issue, as rtnl_newlink() stack is generally quite shallow. Fix the warning by allocating attributes with kmalloc() in a wrapper and passing it down to rtnl_newlink(), avoiding complexities on error paths. Alternatively we could kmalloc() some structure within rtnl_newlink(), slave attributes look like a good candidate. In practice it adds to already rather high complexity and length of the function. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:33:34 -08:00
Jakub Kicinski	420d031822	rtnetlink: remove a level of indentation in rtnl_newlink() rtnl_newlink() used to create VLAs based on link kind. Since commit `ccf8dbcd06` ("rtnetlink: Remove VLA usage") statically sized array is created on the stack, so there is no more use for a separate code block that used to be the VLA's live range. While at it christmas tree the variables. Note that there is a goto-based retry so to be on the safe side the variables can no longer be initialized in place. It doesn't seem to matter, logically, but why make the code harder to read.. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:33:34 -08:00
David S. Miller	74315c393f	Merge branch 'nfp-update-TX-path-to-enable-repr-offloads' Jakub Kicinski says: ==================== nfp: update TX path to enable repr offloads This set starts with three micro optimizations to the TX path. The improvement is measurable, but below 1% of CPU utilization. Patches 4 - 9 add basic TX offloads to representor devices, like checksum offload or TSO, and remove the unnecessary TX lock and Qdisc (our representors are software constructs on top of the PF). The last 2 patches add more info to error messages - id of command which failed and exact location of incorrect TLVs, very useful for debugging. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:30:45 -08:00
Jakub Kicinski	6db3a9dcf0	nfp: report more info when reconfiguration fails FW reconfiguration timeouts are a common indicator of FW trouble. To make debugging easier print requested update and control word when reconfiguration fails. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:30:45 -08:00
Jakub Kicinski	9571d98775	nfp: add offset to all TLV parsing errors When troubleshooting incorrect FW capabilities it's useful to know where the faulty TLV is located. Add offset to all error messages. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:30:44 -08:00
Jakub Kicinski	51a6588e8c	nfp: add offloads on representors FW/HW can generally support the standard networking offloads on representors without any trouble. Add the ability for FW to advertise which features should be available on representors. Because representors are muxed on top of the vNIC we need to listen on feature changes of their lower devices, and update their features appropriately. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:30:44 -08:00
Jakub Kicinski	71844fac1e	nfp: add locking around representor changes Up until now we never needed to hold any networking locks around representor accesses, we only accessed them when device was reconfigured (under nfp pf->lock) or on fast path (under RCU). Now we want to be able to iterate over all representors during notifications, so make sure representor assignment is done under RTNL lock. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:30:44 -08:00
Jakub Kicinski	fbf60e377d	nfp: run don't require Qdiscs on representor netdevs Our representors are software devices built on top of the PF vNIC, the queuing should only happen at the vNIC netdevice. Allow representors to run qdisc-less. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:30:44 -08:00
Jakub Kicinski	9db8bbcb9b	nfp: run representor TX locklessly Our representors are software devices built on top of the PF vNIC, the only state they have are per-cpu stats, so make the TX run locklessly. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:30:44 -08:00
Jakub Kicinski	d7cc825225	nfp: avoid oversized TSO headers with metadata prepend In preparation for TSO over representors make sure the port id prepend will always fit in the frame. The current max header length is 255, which is ample, so assume worst case scenario of 8 byte prepend and save ourselves the conditionals. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:30:44 -08:00
Jakub Kicinski	b54ad0eaad	nfp: correct descriptor offsets in presence of metadata The TSO-related offsets in the descriptor should not include the length of the prepended metadata. Adjust them. Note that this could not have caused issues in the past as we don't support TSO with metadata prepend as of this patch. Signed-off-by: Michael Rapson <michael.rapson@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:30:44 -08:00
Jakub Kicinski	8b5ddf1e51	nfp: move queue variable init nd_q is only used at the very end of nfp_net_tx(), there is no need to initialize it early. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:30:44 -08:00
Jakub Kicinski	de31049a48	nfp: move temporary variables in nfp_net_tx_complete() Move temporary variables in scope of the loop in nfp_net_tx_complete(), and add a temp for txbuf software structure. This saves us 0.2% of CPU. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:30:44 -08:00
Jakub Kicinski	9586274967	nfp: copy only the relevant part of the TX descriptor for frags Chained descriptors for fragments need to duplicate all the descriptor fields of the skb head, so we copy the descriptor and then modify the relevant fields. This is wasteful, because the top half of the descriptor will get overwritten entirely while the bottom half is not modified at all. Copy only the bottom half. This saves us 0.3% of CPU in a GSO test. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:30:44 -08:00
Eric Dumazet	6015c71e65	tcp: md5: add tcp_md5_needed jump label Most linux hosts never set up TCP MD5 keys. We can avoid a cache line miss (accessing tp->md5sig_info) on RX and TX using a jump label. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:28:03 -08:00
David S. Miller	2f69555315	Merge branch 'tcp-take-a-bit-more-care-of-backlog-stress' Eric Dumazet says: ==================== tcp: take a bit more care of backlog stress While working on the SACK compression issue Jean-Louis Dupond reported, we found that his linux box was suffering very hard from tail drops on the socket backlog queue. First patch hints the compiler about sack flows being the norm. Second patch changes non-sack code in preparation of the ack compression. Third patch fixes tcp_space() to take backlog into account. Fourth patch is attempting coalescing when a new packet must be added to the backlog queue. Cooking bigger skbs helps to keep backlog list smaller and speeds its handling when user thread finally releases the socket lock. v3: Neal/Yuchung feedback addressed : Do not aggregate if any skb has URG bit set. Do not aggregate if the skbs have different ECE/CWR bits v2: added feedback from Neal : tcp: take care of compressed acks in tcp_add_reno_sack() added : tcp: hint compiler about sack flows added : tcp: make tcp_space() aware of socket backlog ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:26:54 -08:00
Eric Dumazet	4f693b55c3	tcp: implement coalescing on backlog queue In case GRO is not as efficient as it should be or disabled, we might have a user thread trapped in __release_sock() while softirq handler floods packets up to the point we have to drop. This patch balances work done from user thread and softirq, to give more chances to __release_sock() to complete its work before new packets are added to the backlog. This also helps if we receive many ACK packets, since GRO does not aggregate them. This patch brings ~60% throughput increase on a receiver without GRO, but the spectacular gain is really on 1000x release_sock() latency reduction I have measured. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:26:54 -08:00
Eric Dumazet	85bdf7db5b	tcp: make tcp_space() aware of socket backlog Jean-Louis Dupond reported poor iscsi TCP receive performance that we tracked to backlog drops. Apparently we fail to send window updates reflecting the fact that we are under stress. Note that we might lack a proper window increase when backlog is fully processed, since __release_sock() clears sk->sk_backlog.len _after_ all skbs have been processed. This should not matter in practice. If we had a significant load through socket backlog, we are in a dangerous situation. Reported-by: Jean-Louis Dupond <jean-louis@dupond.be> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Tested-by: Jean-Louis Dupond<jean-louis@dupond.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:26:54 -08:00
Eric Dumazet	19119f298b	tcp: take care of compressed acks in tcp_add_reno_sack() Neal pointed out that non sack flows might suffer from ACK compression added in the following patch ("tcp: implement coalescing on backlog queue") Instead of tweaking tcp_add_backlog() we can take into account how many ACK were coalesced, this information will be available in skb_shinfo(skb)->gso_segs Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:26:53 -08:00
Eric Dumazet	ebeef4bccc	tcp: hint compiler about sack flows Tell the compiler that most TCP flows are using SACK these days. There is no need to add the unlikely() clause in tcp_is_reno(), the compiler is able to infer it. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:26:53 -08:00
Geneviève Bastien	b0e3f1bdf9	net: Add trace events for all receive exit points Trace events are already present for the receive entry points, to indicate how the reception entered the stack. This patch adds the corresponding exit trace events that will bound the reception such that all events occurring between the entry and the exit can be considered as part of the reception context. This greatly helps for dependency and root cause analyses. Without this, it is not possible with tracepoint instrumentation to determine whether a sched_wakeup event following a netif_receive_skb event is the result of the packet reception or a simple coincidence after further processing by the thread. It is possible using other mechanisms like kretprobes, but considering the "entry" points are already present, it would be good to add the matching exit events. In addition to linking packets with wakeups, the entry/exit event pair can also be used to perform network stack latency analyses. Signed-off-by: Geneviève Bastien <gbastien@versatic.net> CC: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> CC: Steven Rostedt <rostedt@goodmis.org> CC: Ingo Molnar <mingo@redhat.com> CC: David S. Miller <davem@davemloft.net> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> (tracing side) Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:23:25 -08:00
Edward Cree	91c459561b	net/flow_dissector: correct comments on enum flow_dissector_key_id There are no such structs flow_dissector_key_flow_vlan or flow_dissector_key_flow_tags, the actual structs used are struct flow_dissector_key_vlan and struct flow_dissector_key_tags. So correct the comments against FLOW_DISSECTOR_KEY_VLAN, FLOW_DISSECTOR_KEY_FLOW_LABEL and FLOW_DISSECTOR_KEY_CVLAN to refer to those. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:21:52 -08:00
Ganesh Goudar	1b974aa43a	cxgb4: number of VFs supported is not always 16 Total number of VFs supported by PF is used to determine the last byte of VF's mac address. Number of VFs supported is not always 16, use the variable nvfs to get the number of VFs supported rather than hard coding it to 16. Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-30 13:09:36 -08:00
David S. Miller	93029d7d40	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== bpf-next 2018-11-30 The following pull-request contains BPF updates for your net-next tree. (Getting out bit earlier this time to pull in a dependency from bpf.) The main changes are: 1) Add libbpf ABI versioning and document API naming conventions as well as ABI versioning process, from Andrey. 2) Add a new sk_msg_pop_data() helper for sk_msg based BPF programs that is used in conjunction with sk_msg_push_data() for adding / removing meta data to the msg data, from John. 3) Optimize convert_bpf_ld_abs() for 0 offset and fix various lib and testsuite build failures on 32 bit, from David. 4) Make BPF prog dump for !JIT identical to how we dump subprogs when JIT is in use, from Yonghong. 5) Rename btf_get_from_id() to make it more conform with libbpf API naming conventions, from Martin. 6) Add a missing BPF kselftest config item, from Naresh. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-29 18:15:07 -08:00
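
Commit bf1c3ab8d3 in the list above shrinks flowi_common by reordering its fields so that alignment holes disappear. The idea can be sketched with Python's ctypes module, which lays out structures the way the platform C compiler would. This is only an illustration under stated assumptions: FlowBefore, FlowAfter and their field names are hypothetical and do not reproduce the real flowi_common members, and the holes() helper is a rough stand-in for the hole report that pahole prints.

```python
import ctypes


class FlowBefore(ctypes.Structure):
    # Hypothetical layout: 1-byte fields interleaved with 4-byte fields,
    # so each 1-byte field is followed by padding up to the next aligned slot.
    _fields_ = [
        ("flags", ctypes.c_uint8),
        ("mark", ctypes.c_uint32),
        ("proto", ctypes.c_uint8),
        ("ident", ctypes.c_uint32),
    ]


class FlowAfter(ctypes.Structure):
    # Same hypothetical members, widest first, so the two 1-byte fields
    # sit next to each other and no padding is needed between members.
    _fields_ = [
        ("mark", ctypes.c_uint32),
        ("ident", ctypes.c_uint32),
        ("flags", ctypes.c_uint8),
        ("proto", ctypes.c_uint8),
    ]


def holes(struct_type):
    """Return (offset, length) for every gap between consecutive members."""
    gaps = []
    end = 0  # first byte not yet covered by a member
    for entry in struct_type._fields_:
        field = getattr(struct_type, entry[0])
        if field.offset > end:
            gaps.append((end, field.offset - end))
        end = field.offset + field.size
    return gaps


for layout in (FlowBefore, FlowAfter):
    print(layout.__name__, ctypes.sizeof(layout), holes(layout))
```

On ABIs where a 32-bit integer is 4-byte aligned, the first layout occupies 16 bytes with two 3-byte holes, while the reordered one occupies 12 bytes with no holes between members. Trailing padding can still remain after the last member, which matches the "padding: 4" lines that survive in the "After" output of the commit message.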


798680 Commits