linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-09-22 22:11:38 +08:00

Author	SHA1	Message	Date
Parav Pandit	4445abbd13	net/mlx5: SF, use recent sysfs api Use sysfs_emit() which is aware of PAGE_SIZE buffer. Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-08-11 11:14:32 -07:00
Shay Drory	2d0b41a376	net/mlx5: Refcount mlx5_irq with integer Currently, all access to mlx5 IRQs is done under a lock. Hence, there isn't a reason to have kref in struct mlx5_irq. Switch it to integer. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-08-11 11:14:31 -07:00
Shay Drory	68fefb7089	net/mlx5: Change SF missing dedicated MSI-X err message to dbg When the MSI-X vectors allocated are not enough for SFs to have dedicated MSI-X, the kernel log buffer has too many entries. Hence only enable such log with debug level. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-08-11 11:14:31 -07:00
Shay Drory	211f4f99ed	net/mlx5: Align mlx5_irq structure mlx5_irq structure has holes due to incorrect position of fields in it. Make them naturally aligned. pahole output after alignment: struct mlx5_irq { struct atomic_notifier_head nh; /* 0 72 */ /* --- cacheline 1 boundary (64 bytes) was 8 bytes ago --- */ cpumask_var_t mask; /* 72 8 */ char name[32]; /* 80 32 */ struct mlx5_irq_pool *pool; /* 112 8 */ struct kref kref; /* 120 4 */ u32 index; /* 124 4 */ /* --- cacheline 2 boundary (128 bytes) --- */ int irqn; /* 128 4 */ /* size: 136, cachelines: 3, members: 7 */ /* padding: 4 */ /* last cacheline: 8 bytes */ }; Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-08-11 11:14:31 -07:00
Leon Romanovsky	8e792700b9	net/mlx5: Delete impossible dev->state checks New mlx5_core device structure is allocated through devlink_alloc with kzalloc and that ensures that all fields are equal to zero and it includes ->state too. That means that checks of that field in mlx5_init_one() are completely redundant, because that function is called only once in the beginning of mlx5_core_dev lifetime. PCI: .probe() -> probe_one() -> mlx5_init_one() The recovery flow can't run at that time or before it, because the relevant work is initialized later in mlx5_init_once(). Such initialization flow ensures that dev->state can't be MLX5_DEVICE_STATE_UNINITIALIZED at all, so remove such impossible checks. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-08-11 11:14:30 -07:00
Maor Gottlieb	90b85d4e31	net/mlx5: Fix inner TTC table creation Fix typo of the cited commit that calls to mlx5_create_ttc_table, instead of mlx5_create_inner_ttc_table. Fixes: `f4b45940e9` ("net/mlx5: Embed mlx5_ttc_table") Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-08-11 11:14:30 -07:00
Cai Huoqing	39c538d644	net/mlx5: Fix typo in comments Fix typo: vectores ==> vectors realeased ==> released erros ==> errors namepsace ==> namespace trafic ==> traffic proccessed ==> processed retore ==> restore Currenlty ==> Currently crated ==> created chane ==> change cannnot ==> cannot usuallly ==> usually failes ==> fails importent ==> important reenabled ==> re-enabled alocation ==> allocation recived ==> received tanslation ==> translation Signed-off-by: Cai Huoqing <caihuoqing@baidu.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-08-11 11:14:30 -07:00
David S. Miller	88be326349	Merge branch 'dsa-tagger-helpers' Vladimir Oltean says: ==================== DSA tagger helpers The goal of this series is to minimize the use of memmove and skb->data in the DSA tagging protocol drivers. Unfiltered access to this level of information is not very friendly to drive-by contributors, and sometimes is also not the easiest to review. For starters, I have converted the most common form of DSA tagging protocols: the DSA headers which are placed where the EtherType is. The helper functions introduced by this series are: - dsa_alloc_etype_header - dsa_strip_etype_header - dsa_etype_header_pos_rx - dsa_etype_header_pos_tx This series is just a resend as non-RFC of v1. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 14:44:59 +01:00
Vladimir Oltean	a72808b658	net: dsa: create a helper for locating EtherType DSA headers on TX Create a similar helper for locating the offset to the DSA header relative to skb->data, and make the existing EtherType header taggers use it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 14:44:58 +01:00
Vladimir Oltean	5d928ff486	net: dsa: create a helper for locating EtherType DSA headers on RX It seems that protocol tagging driver writers are always surprised about the formula they use to reach their EtherType header on RX, which becomes apparent from the fact that there are comments in multiple drivers that mention the same information. Create a helper that returns a void pointer to skb->data - 2, as well as centralize the explanation why that is the case. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 14:44:58 +01:00
Vladimir Oltean	6bef794da6	net: dsa: create a helper which allocates space for EtherType DSA headers Hide away the memmove used by DSA EtherType header taggers to shift the MAC SA and DA to the left to make room for the header, after they've called skb_push(). The call to skb_push() is still left explicit in drivers, to be symmetric with dsa_strip_etype_header, and because not all callers can be refactored to do it (for example, brcm_tag_xmit_ll has common code for a pre-Ethernet DSA tag and an EtherType DSA tag). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 14:44:58 +01:00
Vladimir Oltean	f1dacd7aea	net: dsa: create a helper that strips EtherType DSA headers on RX All header taggers open-code a memmove that is not all that obvious, and we can hide the details behind a helper function, since the only thing specific to the driver is the length of the header tag. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 14:44:58 +01:00
David S. Miller	1a8e628c8a	Merge branch 'devlink-aux-devices' Parav Pandit says: ==================== devlink: Control auxiliary devices Currently, for mlx5 multi-function device, a user is not able to control which functionality to enable/disable. For example, each PCI PF, VF, SF function by default has netdevice, RDMA and vdpa-net devices always enabled. Hence, enable user to control which device functionality to enable/disable. This is achieved by using existing devlink params [1] to enable/disable eth, rdma and vdpa net functionality control knob. For example, a user interested in only the vdpa device function runs: $ devlink dev param set pci/0000:06:00.0 name enable_rdma value false cmode driverinit $ devlink dev param set pci/0000:06:00.0 name enable_eth value false cmode driverinit $ devlink dev param set pci/0000:06:00.0 name enable_vnet value true cmode driverinit $ devlink dev reload pci/0000:06:00.0 Reload command honors parameters set, initializes the device that user has composed using devlink dev params and resources. Devices before reload (tree diagram): mlx5_core.sf.4 (subfunction device) has three children: mlx5_core.eth.4 (SF eth aux dev) with enp6s0f0s88 (SF netdev); mlx5_core.rdma.4 (SF rdma aux dev) with mlx5_0 (SF rdma device); mlx5_core.vnet.4 (SF vnet aux dev) with auxiliary/mlx5_core.sf.4 (vdpa net mgmt device). Above example reconfigures the device with only VDPA functionality. Devices after reload (tree diagram): mlx5_core.sf.4 (subfunction device) has one child, mlx5_core.vnet.4 (SF vnet aux dev); no eth, no rdma aux devices. Above parameters enable user to compose the device as needed based on the use case. Since devlink params are done on the devlink instance, these knobs are uniformly usable for PCI PF, VF and SF devices. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 14:34:22 +01:00
Parav Pandit	70862a5d60	net/mlx5: Support enable_vnet devlink dev param Let the user disable the VDPA net auxiliary device when it is not required. For example: $ devlink dev param set pci/0000:06:00.0 name enable_vnet value false cmode driverinit $ devlink dev reload pci/0000:06:00.0 At this point the devlink instance does not create the auxiliary device mlx5_core.vnet.2 for the VDPA net functionality. Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 14:34:22 +01:00
Parav Pandit	87158cedf0	net/mlx5: Support enable_rdma devlink dev param Let the user disable the RDMA auxiliary device when it is not required. For example: $ devlink dev param set pci/0000:06:00.0 name enable_rdma value false cmode driverinit $ devlink dev reload pci/0000:06:00.0 At this point the devlink instance does not create the auxiliary device mlx5_core.rdma.2 for the RDMA functionality. Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 14:34:22 +01:00
Parav Pandit	a17beb28ed	net/mlx5: Support enable_eth devlink dev param Let the user disable the Ethernet auxiliary device when it is not required. For example: $ devlink dev param set pci/0000:06:00.0 name enable_eth value false cmode driverinit $ devlink dev reload pci/0000:06:00.0 At this point the devlink instance does not create the mlx5_core.eth.2 auxiliary device for the Ethernet functionality. Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 14:34:21 +01:00
Parav Pandit	6f35723864	net/mlx5: Fix unpublish devlink parameters Cleanup routine missed to unpublish the parameters. Add it. Fixes: `e890acd5ff` ("net/mlx5: Add devlink flow_steering_mode parameter") Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 14:34:21 +01:00
Parav Pandit	9c4a7665b4	devlink: Add APIs to publish, unpublish individual parameter Enable drivers to publish/unpublish individual parameter. Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 14:34:21 +01:00
Parav Pandit	b40c51efef	devlink: Add API to register and unregister single parameter Currently device configuration parameters can be registered only as an array, so a constant array must be registered. A single driver supporting multiple devices, each with different device capabilities, ends up registering all parameters even if a device doesn't support them. One possible workaround is for a driver to register multiple single-entry arrays to overcome this limitation. Better is to provide an API that enables a driver to register/unregister a single parameter. This also helps in two ways: (1) it reduces the memory of devlink_param_entry by avoiding registering parameters which are not supported by the device; (2) it avoids generating multiple parameter add, delete, publish, unpublish, init value notifications for such unsupported parameters. Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 14:34:21 +01:00
Parav Pandit	699784f7b7	devlink: Create a helper function for one parameter registration Create and use a helper function for one parameter registration. Subsequent patch also will reuse this for driver facing routine to register a single parameter. Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 14:34:21 +01:00
Parav Pandit	076b2a9dbb	devlink: Add new "enable_vnet" generic device param Add a new generic device parameter to enable/disable creation of the VDPA net auxiliary device and associated device functionality in the devlink instance. A user who prefers to disable such functionality can do so as in the example below. $ devlink dev param set pci/0000:06:00.0 name enable_vnet value false cmode driverinit $ devlink dev reload pci/0000:06:00.0 At this point the devlink instance does not create the auxiliary device for the VDPA net functionality. Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 14:34:21 +01:00
Parav Pandit	8ddaabee3c	devlink: Add new "enable_rdma" generic device param Add a new generic device parameter to enable/disable creation of the RDMA auxiliary device and associated device functionality in the devlink instance. A user who prefers to disable such functionality can do so as in the example below. $ devlink dev param set pci/0000:06:00.0 name enable_rdma value false cmode driverinit $ devlink dev reload pci/0000:06:00.0 At this point the devlink instance does not create the auxiliary device for the RDMA functionality. Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 14:34:21 +01:00
Parav Pandit	f13a5ad881	devlink: Add new "enable_eth" generic device param Add a new generic device parameter to enable/disable creation of the Ethernet auxiliary device and associated device functionality in the devlink instance. A user who prefers to disable such functionality can do so as in the example below. $ devlink dev param set pci/0000:06:00.0 name enable_eth value false cmode driverinit $ devlink dev reload pci/0000:06:00.0 At this point the devlink instance does not create the auxiliary device for the Ethernet functionality. Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 14:34:21 +01:00
David S. Miller	e9c130ad66	Merge branch 'bridge-global-mcast' Nikolay Aleksandrov says: ==================== net: bridge: vlan: add global mcast options This is the first follow-up set after the support for per-vlan multicast contexts which extends global vlan options to support bridge's multicast config per-vlan, it enables user-space to change and dump the already existing bridge vlan multicast context options. The global option patches (01 - 09 and 12-13) follow a similar pattern of changing current mcast functions to take multicast context instead of a port/bridge directly. Option equality checks have been added for dumping vlan range compression. The last 2 patches extend the mcast router dump support so it can be re-used when dumping vlan config. patches 01 - 09: add support for various mcast options patches 10 - 11: prepare for per-vlan querier control patches 12 - 13: add support for querier control and router control patches 14 - 15: add support for dumping per-vlan router ports Next patch-sets: - per-port/vlan router option config - iproute2 support for all new vlan options - selftests ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 13:34:41 +01:00
Nikolay Aleksandrov	dc002875c2	net: bridge: vlan: use br_rports_fill_info() to export mcast router ports Embed the standard multicast router port export by br_rports_fill_info() into a new global vlan attribute BRIDGE_VLANDB_GOPTS_MCAST_ROUTER_PORTS. In order to have the same format for the global bridge mcast context and the per-vlan mcast context we need a double-nesting: - BRIDGE_VLANDB_GOPTS_MCAST_ROUTER_PORTS - MDBA_ROUTER Currently we don't compare router lists, if any router port exists in the bridge mcast contexts we consider their option sets as different and export them separately. In addition we export the router port vlan id when dumping similar to the router port notification format. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 13:34:41 +01:00
Nikolay Aleksandrov	e04d377ff6	net: bridge: mcast: use the proper multicast context when dumping router ports When we are dumping the router ports of a vlan mcast context we need to use the bridge/vlan and port/vlan's multicast contexts to check if IPv4/IPv6 router port is present and later to dump the vlan id. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 13:34:41 +01:00
Nikolay Aleksandrov	a97df080b6	net: bridge: vlan: add support for mcast router global option Add support to change and retrieve global vlan multicast router state which is used for the bridge itself. We just need to pass multicast context to br_multicast_set_router instead of bridge device and the rest of the logic remains the same. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 13:34:41 +01:00
Nikolay Aleksandrov	62938182c3	net: bridge: vlan: add support for mcast querier global option Add support to change and retrieve global vlan multicast querier state. We just need to pass multicast context to br_multicast_set_querier instead of bridge device and the rest of the logic remains the same. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 13:34:41 +01:00
Nikolay Aleksandrov	cb486ce995	net: bridge: mcast: querier and query state affect only current context type It is a minor optimization and better behaviour to make sure querier and query sending routines affect only the matching multicast context depending if vlan snooping is enabled (vlan ctx vs bridge ctx). It also avoids sending unnecessary extra query packets. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 13:34:41 +01:00
Nikolay Aleksandrov	4d5b4e84c7	net: bridge: mcast: move querier state to the multicast context We need to have the querier state per multicast context in order to have per-vlan control, so remove the internal option bit and move it to the multicast context. Also annotate the lockless reads of the new variable. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 13:34:41 +01:00
Nikolay Aleksandrov	941121ee22	net: bridge: vlan: add support for mcast startup query interval global option Add support to change and retrieve global vlan multicast startup query interval option. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 13:34:41 +01:00
Nikolay Aleksandrov	425214508b	net: bridge: vlan: add support for mcast query response interval global option Add support to change and retrieve global vlan multicast query response interval option. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 13:34:41 +01:00
Nikolay Aleksandrov	d6c08aba4f	net: bridge: vlan: add support for mcast query interval global option Add support to change and retrieve global vlan multicast query interval option. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 13:34:41 +01:00
Nikolay Aleksandrov	cd9269d463	net: bridge: vlan: add support for mcast querier interval global option Add support to change and retrieve global vlan multicast querier interval option. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 13:34:41 +01:00
Nikolay Aleksandrov	2da0aea21f	net: bridge: vlan: add support for mcast membership interval global option Add support to change and retrieve global vlan multicast membership interval option. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 13:34:41 +01:00
Nikolay Aleksandrov	77f6ababa2	net: bridge: vlan: add support for mcast last member interval global option Add support to change and retrieve global vlan multicast last member interval option. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 13:34:41 +01:00
Nikolay Aleksandrov	50725f6e6b	net: bridge: vlan: add support for mcast startup query count global option Add support to change and retrieve global vlan multicast startup query count option. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 13:34:41 +01:00
Nikolay Aleksandrov	931ba87d20	net: bridge: vlan: add support for mcast last member count global option Add support to change and retrieve global vlan multicast last member count option. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 13:34:41 +01:00
Nikolay Aleksandrov	df271cd641	net: bridge: vlan: add support for mcast igmp/mld version global options Add support to change and retrieve global vlan IGMP/MLD versions. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 13:34:41 +01:00
David S. Miller	6899192f64	Merge branch 'ipa-runtime-pm' Alex Elder says: ==================== net: ipa: use runtime PM reference counting This series does further rework of the IPA clock code so that we rely on some of the core runtime power management code (including its reference counting) instead. The first patch makes ipa_clock_get() act like pm_runtime_get_sync(). The second patch makes system suspend occur regardless of the current reference count value, which is again more like how the runtime PM core code behaves. The third patch creates functions to encapsulate all hardware suspend and resume activity. The fourth uses those functions as the ->runtime_suspend and ->runtime_resume power callbacks. With that in place, ipa_clock_get() and ipa_clock_put() are changed to use runtime PM get and put functions when needed. The fifth patch eliminates an extra clock reference previously used to control system suspend. The sixth eliminates the "IPA clock" reference count and mutex. The final patch replaces the one call to ipa_clock_get_additional() with a call to pm_runtime_get_if_active(), making the former unnecessary. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 13:31:56 +01:00
Alex Elder	0d08026ac6	net: ipa: kill ipa_clock_get_additional() Now that ipa_clock_get_additional() is a trivial wrapper around pm_runtime_get_if_active(), just open-code it in its only caller and delete the function. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 13:31:56 +01:00
Alex Elder	a71aeff3dd	net: ipa: kill IPA clock reference count The runtime power management core code maintains a usage count. This count mirrors the IPA clock reference count, and there's no need to maintain both. So get rid of the IPA clock reference count and just rely on the runtime PM usage count to determine when the hardware should be suspended or resumed. Use pm_runtime_get_if_active() in ipa_clock_get_additional(). We care whether power is active, regardless of whether it's in use, so pass true for its ign_usage_count argument. The IPA clock mutex is just used to make enabling/disabling the clock and updating the reference count occur atomically. Without the reference count, there's no need for the mutex, so get rid of that too. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 13:31:56 +01:00
Alex Elder	a3d3e759a4	net: ipa: get rid of extra clock reference Suspending the IPA hardware is now managed by the runtime PM core code. The ->runtime_idle callback returns a non-zero value, so it will never suspend except when forced. As a result, there's no need to take an extra "do not suspend" clock reference. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 13:31:56 +01:00
Alex Elder	63de79f031	net: ipa: use runtime PM core Use the runtime power management core to cause hardware suspend and resume to occur. Enable it in ipa_clock_init() (without autosuspend), and disable it in ipa_clock_exit(). Use ipa_runtime_suspend() as the ->runtime_suspend power operation, and arrange for it to be called by having ipa_clock_get() call pm_runtime_get_sync() when the first clock reference is taken. Similarly, use ipa_runtime_resume() as the ->runtime_resume power operation, and pm_runtime_put() when the last IPA clock reference is dropped. Introduce ipa_runtime_idle() as the ->runtime_idle power operation, and have it return a non-zero value; this way suspend will never occur except when forced. Use pm_runtime_force_suspend() and pm_runtime_force_resume() as the system suspend and resume callbacks, and remove ipa_suspend() and ipa_resume(). Store a pointer to the device structure passed to ipa_clock_init(), so it can be used by ipa_clock_exit() to disable runtime power management. For now we preserve IPA clock reference counting. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 13:31:56 +01:00
Alex Elder	2abb0c7f98	net: ipa: resume in ipa_clock_get() Introduce ipa_runtime_suspend() and ipa_runtime_resume(), which encapsulate the activities necessary for suspending and resuming the IPA hardware. Call these functions from ipa_clock_get() and ipa_clock_put() when the first reference is taken or last one is dropped. When the very first clock reference is taken (for ipa_config()), setup isn't complete yet, so (as before) only the core clock gets enabled. When the last clock reference is dropped (after ipa_deconfig()), ipa_teardown() will have made the setup_complete flag false, so there too, the core clock will be stopped without affecting GSI or the endpoints. Otherwise these new functions will perform the desired suspend and resume actions once setup is complete. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 13:31:56 +01:00
Alex Elder	1016c6b8c6	net: ipa: disable clock in suspend Disable the IPA clock rather than dropping a reference to it in the system suspend callback. This forces the suspend to occur without affecting existing references. Similarly, enable the clock rather than taking a reference in ipa_resume(), forcing a resume without changing the reference count. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 13:31:55 +01:00
Alex Elder	7ebd168c3b	net: ipa: have ipa_clock_get() return a value We currently assume no errors occur when enabling or disabling the IPA core clock and interconnects. And although this commit exposes errors that could occur, we generally assume this won't happen in practice. This commit changes ipa_clock_get() and ipa_clock_put() so each returns a value. The values returned are meant to mimic what the runtime power management functions return, so we can set up error handling here before we make the switch. Have ipa_clock_get() increment the reference count even if it returns an error, to match the behavior of pm_runtime_get(). More details follow. When taking a reference in ipa_clock_get(), return 0 for the first reference, 1 for subsequent references, or a negative error code if an error occurs. Note that if ipa_clock_get() returns an error, we must not touch hardware; in some cases such errors now cause entire blocks of code to be skipped. When dropping a reference in ipa_clock_put(), we return 0 or an error code. The error would come from ipa_clock_disable(), which now returns what ipa_interconnect_disable() returns (either 0 or a negative error code). For now, callers ignore the return value; if an error occurs, a message will have already been logged, and little more can actually be done to improve the situation. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 13:31:55 +01:00
David S. Miller	6f45933dfe	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for net-next: 1) Use nfnetlink_unicast() instead of netlink_unicast() in nft_compat. 2) Remove call to nf_ct_l4proto_find() in flowtable offload timeout fixup. 3) CLUSTERIP registers ARP hook on demand, from Florian. 4) Use clusterip_net to store pernet warning, also from Florian. 5) Remove struct netns_xt, from Florian Westphal. 6) Enable ebtables hooks in initns on demand, from Florian. 7) Allow to filter conntrack netlink dump per status bits, from Florian Westphal. 8) Register x_tables hooks in initns on demand, from Florian. 9) Remove queue_handler from per-netns structure, again from Florian. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-11 10:22:26 +01:00
Lahav Schlesinger	d3432bf10f	net: Support filtering interfaces on no master Currently there's support for filtering neighbours/links for interfaces which have a specific master device (using the IFLA_MASTER/NDA_MASTER attributes). This patch adds support for filtering interfaces/neighbours dump for interfaces that don't have a master. Signed-off-by: Lahav Schlesinger <lschlesinger@drivenets.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20210810090658.2778960-1-lschlesinger@drivenets.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-08-10 16:03:34 -07:00
Mark Bloch	a5397d68b2	net/sched: cls_api, reset flags on replay tc_new_tfilter() can replay a request if it got EAGAIN. The cited commit didn't account for this when it converted TC action ->init() API to use flags instead of parameters. This can lead to passing stale flags down the call chain which results in trying to lock rtnl when it's already locked, deadlocking the entire system. Fix by making sure to reset flags on each replay. ============================================ WARNING: possible recursive locking detected 5.14.0-rc3-custom-49011-g3d2bbb4f104d #447 Not tainted -------------------------------------------- tc/37605 is trying to acquire lock: ffffffff841df2f0 (rtnl_mutex){+.+.}-{3:3}, at: tc_setup_cb_add+0x14b/0x4d0 but task is already holding lock: ffffffff841df2f0 (rtnl_mutex){+.+.}-{3:3}, at: tc_new_tfilter+0xb12/0x22e0 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(rtnl_mutex); lock(rtnl_mutex); *** DEADLOCK *** May be due to missing lock nesting notation 1 lock held by tc/37605: #0: ffffffff841df2f0 (rtnl_mutex){+.+.}-{3:3}, at: tc_new_tfilter+0xb12/0x22e0 stack backtrace: CPU: 0 PID: 37605 Comm: tc Not tainted 5.14.0-rc3-custom-49011-g3d2bbb4f104d #447 Hardware name: Mellanox Technologies Ltd. MSN2010/SA002610, BIOS 5.6.5 08/24/2017 Call Trace: dump_stack_lvl+0x8b/0xb3 __lock_acquire.cold+0x175/0x3cb lock_acquire+0x1a4/0x4f0 __mutex_lock+0x136/0x10d0 fl_hw_replace_filter+0x458/0x630 [cls_flower] fl_change+0x25f2/0x4a64 [cls_flower] tc_new_tfilter+0xa65/0x22e0 rtnetlink_rcv_msg+0x86c/0xc60 netlink_rcv_skb+0x14d/0x430 netlink_unicast+0x539/0x7e0 netlink_sendmsg+0x84d/0xd80 ____sys_sendmsg+0x7ff/0x970 ___sys_sendmsg+0xf8/0x170 __sys_sendmsg+0xea/0x1b0 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f7b93b6c0a7 Code: 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> RSP: 002b:00007ffe365b3818 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7b93b6c0a7 RDX: 0000000000000000 RSI: 00007ffe365b3880 RDI: 0000000000000003 RBP: 00000000610a75f6 R08: 0000000000000001 R09: 0000000000000000 R10: fffffffffffff3a9 R11: 0000000000000246 R12: 0000000000000001 R13: 0000000000000000 R14: 00007ffe365b7b58 R15: 00000000004822c0 Fixes: `695176bfe5` ("net_sched: refactor TC action init API") Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/20210810034305.63997-1-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-08-10 16:01:17 -07:00
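
Note on the 'dsa-tagger-helpers' series (a72808b658, 5d928ff486, 6bef794da6, f1dacd7aea): the messages describe a memmove that shifts the MAC DA and SA to the left to make room for an EtherType-position tag on TX, and the reverse move that strips the tag on RX. The sketch below is only a user-space Python model of that byte movement, not the kernel code. The function names, the `start` offset convention (index of the destination MAC in a flat buffer) and the 4-byte tag are assumptions made for illustration; the kernel helpers work relative to skb->data instead.

```python
ETH_ALEN = 6            # bytes in one MAC address
MACS = 2 * ETH_ALEN     # destination MAC + source MAC


def _check(buf: bytearray, src: int, dst: int) -> None:
    # Guard both 12-byte windows so the slice copy never resizes the buffer.
    if min(src, dst) < 0 or max(src, dst) + MACS > len(buf):
        raise ValueError("MAC addresses would fall outside the buffer")


def alloc_etype_header(buf: bytearray, start: int, tag_len: int) -> int:
    """TX model: consume tag_len bytes of headroom (the skb_push() step) and
    move DA+SA left, opening a gap just before the original EtherType."""
    new_start = start - tag_len
    _check(buf, start, new_start)
    # The right-hand slice is copied first, so overlap is safe, like memmove.
    buf[new_start:new_start + MACS] = buf[start:start + MACS]
    return new_start


def strip_etype_header(buf: bytearray, start: int, tag_len: int) -> int:
    """RX model: move DA+SA right over the tag so the frame is untagged."""
    new_start = start + tag_len
    _check(buf, start, new_start)
    buf[new_start:new_start + MACS] = buf[start:start + MACS]
    return new_start


def etype_header_pos(start: int) -> int:
    """Offset of the tag: it sits where the EtherType normally would."""
    return start + MACS


if __name__ == "__main__":
    headroom, tag = 4, b"\xaa\xbb\xcc\xdd"  # hypothetical 4-byte tag
    frame = bytes(range(1, 13)) + b"\x08\x00" + b"payload"
    buf = bytearray(headroom) + bytearray(frame)

    start = alloc_etype_header(buf, headroom, len(tag))
    pos = etype_header_pos(start)
    buf[pos:pos + len(tag)] = tag
    print("tagged:  ", buf[start:].hex())

    start = strip_etype_header(buf, start, len(tag))
    print("stripped:", buf[start:].hex())
```

In this model the only per-tagger input is the tag length, which matches the observation in f1dacd7aea that the header length is the only driver-specific detail.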

1031469 Commits