linux-next

mirror of https://github.com/edk2-porting/linux-next.git synced 2024-12-29 15:43:59 +08:00

Author	SHA1	Message	Date
Vladimir Oltean	39e222bfd7	net: dsa: unregister cross-chip notifier after ds->ops->teardown To be symmetric with the error unwind path of dsa_switch_setup(), call dsa_switch_unregister_notifier() after ds->ops->teardown. The implication is that ds->ops->teardown cannot emit cross-chip notifiers. For example, currently the dsa_tag_8021q_unregister() call from sja1105_teardown() does not propagate to the entire tree due to this reason. However I cannot find an actual issue caused by this, observed using code inspection. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20211012123735.2545742-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-13 13:36:01 -07:00
Anders Roxell	6312d52838	marvell: octeontx2: build error: unknown type name 'u64' Building an arm64 allmodconfig kernel, the following build error shows up: In file included from drivers/crypto/marvell/octeontx2/cn10k_cpt.c:4: include/linux/soc/marvell/octeontx2/asm.h:38:15: error: unknown type name 'u64' 38 | static inline u64 otx2_atomic64_fetch_add(u64 incr, u64 *ptr) | ^~~ Include linux/types.h in asm.h so the compiler knows what the type 'u64' is. Fixes: `af3826db74` ("octeontx2-pf: Use hardware register for CQE count") Signed-off-by: Anders Roxell <anders.roxell@linaro.org> Link: https://lore.kernel.org/r/20211013135743.3826594-1-anders.roxell@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-13 13:25:36 -07:00
Jakub Kicinski	13b5ffa0e2	net: remove single-byte netdev->dev_addr writes Make the drivers which use single-byte netdev addresses (netdev->addr_len == 1) use the appropriate address setting helpers. arcnet copies from int variables and io reads a lot, so add a helper for arcnet drivers to use. Similar helper could be reused for phonet and appletalk but there isn't any good central location where we could put it, and netdevice.h is already very crowded. Acked-by: Sebastian Reichel <sebastian.reichel@collabora.com> # for HSI Link: https://lore.kernel.org/r/20211012142757.4124842-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-13 10:03:59 -07:00
Jakub Kicinski	400f17d330	Merge branch 'net-use-dev_addr_set-in-hamradio-and-ip-tunnels' Jakub Kicinski says: ==================== net: use dev_addr_set() in hamradio and ip tunnels Commit `406f42fa0d` ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it go through appropriate helpers. ==================== Link: https://lore.kernel.org/r/20211012160634.4152690-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-13 09:41:40 -07:00
Jakub Kicinski	5a1b7e1a53	ip: use dev_addr_set() in tunnels Use dev_addr_set() instead of writing to netdev->dev_addr directly in ip tunnels drivers. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-13 09:41:37 -07:00
Jakub Kicinski	20c3d9e45b	hamradio: use dev_addr_set() for setting device address Use dev_addr_set() instead of writing to netdev->dev_addr directly in hamradio drivers. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-13 09:41:37 -07:00
Jakub Kicinski	40af35fdf7	netdevice: demote the type of some dev_addr_set() helpers __dev_addr_set() and dev_addr_mod() are pretty low level, let the arguments be void, there's no chance for confusion in callers converted to use them. Keep u8 in dev_addr_set() because some of the callers are converted from a loop and we want to make sure assignments are not from an array of a different type. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-13 09:41:37 -07:00
Jakub Kicinski	fe83fe739d	Merge branch 'net-constify-dev_addr-passing-for-protocols' Jakub Kicinski says: ==================== net: constify dev_addr passing for protocols Commit `406f42fa0d` ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it go through appropriate helpers. netdev->dev_addr will be made const to prevent direct writes. This set sprinkles const across variables and arguments in protocol code which are used to hold references to netdev->dev_addr. ==================== Link: https://lore.kernel.org/r/20211012155840.4151590-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-13 09:40:49 -07:00
Jakub Kicinski	1bfcd1cc54	decnet: constify dev_addr passing In preparation for netdev->dev_addr being constant make all relevant arguments in decnet constant. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-13 09:40:46 -07:00
Jakub Kicinski	6cf8628072	tipc: constify dev_addr passing In preparation for netdev->dev_addr being constant make all relevant arguments in tipc constant. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-13 09:40:46 -07:00
Jakub Kicinski	1a8a23d2da	ipv6: constify dev_addr passing In preparation for netdev->dev_addr being constant make all relevant arguments in ndisc constant. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-13 09:40:46 -07:00
Jakub Kicinski	2ef6db76ba	llc/snap: constify dev_addr passing In preparation for netdev->dev_addr being constant make all relevant arguments in LLC and SNAP constant. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-13 09:40:46 -07:00
Jakub Kicinski	db95732446	rose: constify dev_addr passing In preparation for netdev->dev_addr being constant make all relevant arguments in rose constant. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-13 09:40:45 -07:00
Jakub Kicinski	c045ad2cc0	ax25: constify dev_addr passing In preparation for netdev->dev_addr being constant make all relevant arguments in AX25 constant. Modify callers as well (netrom, rose). Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-13 09:40:45 -07:00
Jakub Kicinski	5f3b8acee9	Merge branch 'add-functional-support-for-gigabit-ethernet-driver' Biju Das says: ==================== Add functional support for Gigabit Ethernet driver The DMAC and EMAC blocks of Gigabit Ethernet IP found on RZ/G2L SoC are similar to the R-Car Ethernet AVB IP. The Gigabit Ethernet IP consists of Ethernet controller (E-MAC), Internal TCP/IP Offload Engine (TOE) and Dedicated Direct memory access controller (DMAC). With a few changes in the driver we can support both IPs. This patch series aims to add functional support for the Gigabit Ethernet driver by filling in all the stubs except set_features. The set_features patch will be sent as a separate RFC patch along with the rx_checksum patch, as it needs further discussion related to HW checksum. With this series, we can boot the kernel with the rootFS mounted on NFS on RZ/G2L platforms. ==================== Link: https://lore.kernel.org/r/20211012163613.30030-1-biju.das.jz@bp.renesas.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-13 09:09:00 -07:00
Biju Das	9404092646	ravb: Fix typo AVB->DMAC Fix the typo AVB->DMAC in comment, as the code following the comment is for DMAC on Gigabit Ethernet IP. Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Suggested-by: Sergey Shtylyov <s.shtylyov@omp.ru> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-13 09:08:57 -07:00
Biju Das	3d6b24a2ad	ravb: Update ravb_emac_init_gbeth() This patch enables Receive/Transmit port of TOE and removes the setting of promiscuous bit from EMAC configuration mode register. This patch also update EMAC configuration mode comment from "PAUSE prohibition" to "EMAC Mode: PAUSE prohibition; Duplex; TX; RX; CRC Pass Through". Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-13 09:08:57 -07:00
Biju Das	95e99b1048	ravb: Document PFRI register bit Document PFRI register bit, as it is documented on R-Car Gen3 and RZ/G2L hardware manuals. Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Suggested-by: Sergey Shtylyov <s.shtylyov@omp.ru> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-13 09:08:57 -07:00
Biju Das	1091da579d	ravb: Rename "nc_queue" feature bit Rename the feature bit "nc_queue" with "nc_queues" as AVB DMAC has RX and TX NC queues. There is no functional change. Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Suggested-by: Sergey Shtylyov <s.shtylyov@omp.ru> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-13 09:08:57 -07:00
Biju Das	030634f37d	ravb: Optimize ravb_emac_init_gbeth function Optimize CXR31 register initialization on ravb_emac_init_gbeth function. Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Suggested-by: Sergey Shtylyov <s.shtylyov@omp.ru> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-13 09:08:57 -07:00
Biju Das	4ea3167bad	ravb: Rename "tsrq" variable Rename the variable "tsrq" with "tccr_mask" as we are passing TCCR mask to the ravb_wait() function. There is no functional change. Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Suggested-by: Sergey Shtylyov <s.shtylyov@omp.ru> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-13 09:08:56 -07:00
Biju Das	0ee65bc14f	ravb: Add support to retrieve stats for GbEthernet Add support for retrieving stats information for GbEthernet. Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-13 09:08:56 -07:00
Biju Das	b6a4ee6e74	ravb: Add carrier_counters to struct ravb_hw_info RZ/G2L E-MAC supports carrier counters. Add a carrier_counter hw feature bit to struct ravb_hw_info to add this feature only for RZ/G2L. Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-13 09:08:56 -07:00
Biju Das	1c59eb678c	ravb: Fillup ravb_rx_gbeth() stub Fillup ravb_rx_gbeth() function to support RZ/G2L. This patch also renames ravb_rcar_rx to ravb_rx_rcar to be consistent with the naming convention used in sh_eth driver. Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-13 09:08:56 -07:00
Biju Das	16a6e245a9	ravb: Fillup ravb_rx_ring_format_gbeth() stub Fillup ravb_rx_ring_format_gbeth() function to support RZ/G2L. This patch also renames ravb_rx_ring_format to ravb_rx_ring_format_rcar to be consistent with the naming convention used in sh_eth driver. Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-13 09:08:56 -07:00
Biju Das	2458b8edb8	ravb: Fillup ravb_rx_ring_free_gbeth() stub Fillup ravb_rx_ring_free_gbeth() function to support RZ/G2L. This patch also renames ravb_rx_ring_free to ravb_rx_ring_free_rcar to be consistent with the naming convention used in sh_eth driver. Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-13 09:08:55 -07:00
Biju Das	3d4e37df88	ravb: Fillup ravb_alloc_rx_desc_gbeth() stub Fillup ravb_alloc_rx_desc_gbeth() function to support RZ/G2L. This patch also renames ravb_alloc_rx_desc to ravb_alloc_rx_desc_rcar to be consistent with the naming convention used in sh_eth driver. Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-13 09:08:55 -07:00
Biju Das	2e95e08ac0	ravb: Add rx_max_buf_size to struct ravb_hw_info R-Car AVB-DMAC has maximum 2K size on RX buffer, whereas on RZ/G2L it is 8K. We need to allow for changing the MTU within the limit of the maximum size of a descriptor. Add a rx_max_buf_size variable to struct ravb_hw_info to handle this difference. Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-13 09:08:55 -07:00
Biju Das	23144a9156	ravb: Use ALIGN macro for max_rx_len Use ALIGN macro for calculating the value for max_rx_len. Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Suggested-by: Sergey Shtylyov <s.shtylyov@omp.ru> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-13 09:08:55 -07:00
Jean Sacren	50515cac8d	net: qed_debug: fix check of false (grc_param < 0) expression The type of enum dbg_grc_params has the enumerator list starting from 0. When grc_param is declared by enum dbg_grc_params, (grc_param < 0) is always false. We should remove the check of this expression. Signed-off-by: Jean Sacren <sakiwit@gmail.com> Acked-by: Shai Malin <smalin@marvell.com> Link: https://lore.kernel.org/r/20211012074645.12864-1-sakiwit@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-13 08:42:01 -07:00
Ioana Ciornei	edce2a93dd	net: enetc: include ip6_checksum.h for csum_ipv6_magic For those architectures which do not define _HAVE_ARCH_IPV6_CSUM, we need to include ip6_checksum.h which provides the csum_ipv6_magic() function. Fixes: `fb8629e2cb` ("net: enetc: add support for software TSO") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20211012121358.16641-1-ioana.ciornei@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-13 07:23:16 -07:00
Shannon Nelson	d1f24712a8	ionic: no devlink_unregister if not registered Don't try to unregister the devlink if it hasn't been registered yet. This bit of error cleanup code got missed in the recent devlink registration changes. Fixes: `7911c8bd54` ("ionic: Move devlink registration to be last devlink command") Signed-off-by: Shannon Nelson <snelson@pensando.io> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20211012231520.72582-1-snelson@pensando.io Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-12 17:39:33 -07:00
Jakub Kicinski	0e258cec0b	Merge branch 'devlink-reload-simplification' Leon Romanovsky says: ==================== devlink reload simplification Simplify devlink reload APIs. ==================== Link: https://lore.kernel.org/r/cover.1634044267.git.leonro@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-12 16:29:21 -07:00
Leon Romanovsky	82465bec3e	devlink: Delete reload enable/disable interface Commit `a0c76345e3` ("devlink: disallow reload operation during device cleanup") added devlink_reload_{enable,disable}() APIs to prevent reload operation from racing with device probe/dismantle. After recent changes to move devlink_register() to the end of device probe and devlink_unregister() to the beginning of device dismantle, these races can no longer happen. Reload operations will be denied if the devlink instance is unregistered and devlink_unregister() will block until all in-flight operations are done. Therefore, remove these devlink_reload_{enable,disable}() APIs. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-12 16:29:17 -07:00
Leon Romanovsky	96869f193c	net/mlx5: Set devlink reload feature bit for supported devices only Multiport slave device doesn't support devlink reload, so instead of complicating initialization flow with devlink_reload_enable() which will be removed in next patch, don't set DEVLINK_F_RELOAD feature bit for such devices. This fixes an error where reload counters are exposed (and equal zero) for a mode that is not supported at all. Fixes: `d89ddaae17` ("net/mlx5: Disable devlink reload for multi port slave device") Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-12 16:29:17 -07:00
Leon Romanovsky	bd032e35c5	devlink: Allow control devlink ops behavior through feature mask Introduce new devlink call to set feature mask to control devlink behavior during device initialization phase after devlink_alloc() is already called. This allows us to set reload ops based on device property which is not known at the beginning of driver initialization. For the sake of simplicity, this API lacks any type of locking and needs to be called before devlink_register() to make sure that no parallel access to the ops is possible at this stage. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-12 16:29:17 -07:00
Leon Romanovsky	b88f7b1203	devlink: Annotate devlink API calls Initial annotation patch to separate calls that need to be executed before or after devlink_register(). Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-12 16:29:17 -07:00
Leon Romanovsky	2bc50987dc	devlink: Move netdev_to_devlink helpers to devlink.c Both netdev_to_devlink and netdev_to_devlink_port are used in devlink.c only, so move them in order to reduce their scope. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-12 16:29:16 -07:00
Leon Romanovsky	21314638c9	devlink: Reduce struct devlink exposure The declaration of struct devlink in general header provokes the situation where internal fields can be accidentally used by the driver authors. In order to reduce such possible situations, let's reduce the namespace exposure of struct devlink. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-12 16:29:16 -07:00
Jakub Kicinski	177c92353b	ethernet: tulip: avoid duplicate variable name on sparc I recently added a variable called addr to tulip_init_one() but for sparc there's already a variable called that half way thru the function. Rename it to fix build. Fixes: `ca87931755` ("ethernet: tulip: remove direct netdev->dev_addr writes") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-10-12 12:12:13 +01:00
Hao Chen	850bfb912a	net: hns3: debugfs add support dumping page pool info Add a file node "page_pool_info" for debugfs, then cat this file node to dump page pool info as below: QUEUE_ID ALLOCATE_CNT FREE_CNT POOL_SIZE(PAGE_NUM) ORDER NUMA_ID MAX_LEN 0 512 0 512 0 2 4K 1 512 0 512 0 2 4K 2 512 0 512 0 2 4K 3 512 0 512 0 2 4K 4 512 0 512 0 2 4K Signed-off-by: Hao Chen <chenhao288@hisilicon.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-10-12 11:31:15 +01:00
Jakub Kicinski	25b90c1910	tulip: fix setting device address from rom I missed removing i from the array index when converting from a loop to a direct copy. Fixes: `ca87931755` ("ethernet: tulip: remove direct netdev->dev_addr writes") Reported-by: Joe Perches <joe@perches.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-10-12 11:29:16 +01:00
David S. Miller	2ed08b5ead	Merge branch 'Managed-Neighbor-Entries' Daniel Borkmann says: ==================== Managed Neighbor Entries This series adds a couple of fixes related to NTF_EXT_LEARNED and NTF_USE neighbor flags, extends the UAPI with a new NDA_FLAGS_EXT netlink attribute in order to be able to add new neighbor flags from user space given all current struct ndmsg / ndm_flags bits are used up. Finally, the core of this series adds a new NTF_EXT_MANAGED flag to neighbors, which allows user space control planes to add 'managed' neighbor entries. Meaning, user space may either transition existing entries or can push down new L3 entries without lladdr into the kernel where the latter will periodically try to keep such NTF_EXT_MANAGED managed entries in reachable state. Main use case for this series are XDP / tc BPF load-balancers which make use of the bpf_fib_lookup() helper for backends. For more details, please see individual patches. Thanks! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-10-12 11:27:48 +01:00
Daniel Borkmann	7482e3841d	net, neigh: Add NTF_MANAGED flag for managed neighbor entries Allow a user space control plane to insert entries with a new NTF_EXT_MANAGED flag. The flag then indicates to the kernel that the neighbor entry should be periodically probed for keeping the entry in NUD_REACHABLE state iff possible. The use case for this is targeting XDP or tc BPF load-balancers which use the bpf_fib_lookup() BPF helper in order to piggyback on neighbor resolution for their backends. Given they cannot be resolved in fast-path, a control plane inserts the L3 (without L2) entries manually into the neighbor table and lets the kernel do the neighbor resolution either on the gateway or on the backend directly in case the latter resides in the same L2. This avoids to deal with L2 in the control plane and to rebuild what the kernel already does best anyway. NTF_EXT_MANAGED can be combined with NTF_EXT_LEARNED in order to avoid GC eviction. The kernel then adds NTF_MANAGED flagged entries to a per-neighbor table which gets triggered by the system work queue to periodically call neigh_event_send() for performing the resolution. The implementation allows migration from/to NTF_MANAGED neighbor entries, so that already existing entries can be converted by the control plane if needed. Potentially, we could make the interval for periodically calling neigh_event_send() configurable; right now it's set to DELAY_PROBE_TIME which is also in line with mlxsw which has similar driver-internal infrastructure `c723c735fa` ("mlxsw: spectrum_router: Periodically update the kernel's neigh table"). In future, the latter could possibly reuse the NTF_MANAGED neighbors as well. Example: # ./ip/ip n replace 192.168.178.30 dev enp5s0 managed extern_learn # ./ip/ip n 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a managed extern_learn REACHABLE [...] 
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Roopa Prabhu <roopa@nvidia.com> Link: https://linuxplumbersconf.org/event/11/contributions/953/ Signed-off-by: David S. Miller <davem@davemloft.net>	2021-10-12 11:27:47 +01:00
Roopa Prabhu	2c611ad97a	net, neigh: Extend neigh->flags to 32 bit to allow for extensions Currently, all bits in struct ndmsg's ndm_flags are used up with the most recent addition of `435f2e7cc0` ("net: bridge: add support for sticky fdb entries"). This makes it impossible to extend the neighboring subsystem with new NTF_* flags: struct ndmsg { __u8 ndm_family; __u8 ndm_pad1; __u16 ndm_pad2; __s32 ndm_ifindex; __u16 ndm_state; __u8 ndm_flags; __u8 ndm_type; }; There are ndm_pad{1,2} attributes which are not used. However, due to uncareful design, the kernel does not enforce them to be zero upon new neighbor entry addition, and given they've been around forever, it is not possible to reuse them today due to risk of breakage. One option to overcome this limitation is to add a new NDA_FLAGS_EXT attribute for extended flags. In struct neighbour, there is a 3 byte hole between protocol and ha_lock, which allows neigh->flags to be extended from 8 to 32 bits while still being on the same cacheline as before. This also allows for all future NTF_* flags being in neigh->flags rather than yet another flags field. Unknown flags in NDA_FLAGS_EXT will be rejected by the kernel. Co-developed-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Roopa Prabhu <roopa@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-10-12 11:27:47 +01:00
Daniel Borkmann	3dc20f4762	net, neigh: Enable state migration between NUD_PERMANENT and NTF_USE Currently, it is not possible to migrate a neighbor entry between NUD_PERMANENT state and NTF_USE flag with a dynamic NUD state from a user space control plane. Similarly, it is not possible to add/remove NTF_EXT_LEARNED flag from an existing neighbor entry in combination with NTF_USE flag. This is due to the latter directly calling into neigh_event_send() without any meta data updates as happening in __neigh_update(). Thus, to enable this use case, extend the latter with a NEIGH_UPDATE_F_USE flag where we break the NUD_PERMANENT state in particular so that a latter neigh_event_send() is able to re-resolve a neighbor entry. Before fix, NUD_PERMANENT -> NUD_* & NTF_USE: # ./ip/ip n replace 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a # ./ip/ip n 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a PERMANENT [...] # ./ip/ip n replace 192.168.178.30 dev enp5s0 use extern_learn # ./ip/ip n 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a PERMANENT [...] As can be seen, despite the admin-triggered replace, the entry remains in the NUD_PERMANENT state. After fix, NUD_PERMANENT -> NUD_* & NTF_USE: # ./ip/ip n replace 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a # ./ip/ip n 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a PERMANENT [...] # ./ip/ip n replace 192.168.178.30 dev enp5s0 use extern_learn # ./ip/ip n 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a extern_learn REACHABLE [...] # ./ip/ip n 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a extern_learn STALE [...] # ./ip/ip n replace 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a # ./ip/ip n 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a PERMANENT [...] After the fix, the admin-triggered replace switches to a dynamic state from the NTF_USE flag which triggered a new neighbor resolution. Likewise, we can transition back from there, if needed, into NUD_PERMANENT. 
Similar before/after behavior can be observed for the transitions below: Before fix, NTF_USE -> NTF_USE | NTF_EXT_LEARNED -> NTF_USE: # ./ip/ip n replace 192.168.178.30 dev enp5s0 use # ./ip/ip n 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a REACHABLE [...] # ./ip/ip n replace 192.168.178.30 dev enp5s0 use extern_learn # ./ip/ip n 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a REACHABLE [...] After fix, NTF_USE -> NTF_USE | NTF_EXT_LEARNED -> NTF_USE: # ./ip/ip n replace 192.168.178.30 dev enp5s0 use # ./ip/ip n 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a REACHABLE [...] # ./ip/ip n replace 192.168.178.30 dev enp5s0 use extern_learn # ./ip/ip n 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a extern_learn REACHABLE [...] # ./ip/ip n replace 192.168.178.30 dev enp5s0 use # ./ip/ip n 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a REACHABLE [..] Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Roopa Prabhu <roopa@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-10-12 11:27:47 +01:00
Daniel Borkmann	e4400bbf5b	net, neigh: Fix NTF_EXT_LEARNED in combination with NTF_USE The NTF_EXT_LEARNED neigh flag is usually propagated back to user space upon dump of the neighbor table. However, when used in combination with NTF_USE flag this is not the case despite exempting the entry from the garbage collector. This results in inconsistent state since entries are typically marked in neigh->flags with NTF_EXT_LEARNED, but here they are not. Fix it by propagating the creation flag to ___neigh_create(). Before fix: # ./ip/ip n replace 192.168.178.30 dev enp5s0 use extern_learn # ./ip/ip n 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a REACHABLE [...] After fix: # ./ip/ip n replace 192.168.178.30 dev enp5s0 use extern_learn # ./ip/ip n 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a extern_learn REACHABLE [...] Fixes: `9ce33e4653` ("neighbour: support for NTF_EXT_LEARNED flag") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Roopa Prabhu <roopa@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-10-12 11:27:47 +01:00
Len Baker	7bb39a3944	net: hns: Prefer struct_size over open coded arithmetic As noted in the "Deprecated Interfaces, Language Features, Attributes, and Conventions" documentation [1], size calculations (especially multiplication) should not be performed in memory allocator (or similar) function arguments due to the risk of them overflowing. This could lead to values wrapping around and a smaller allocation being made than the caller was expecting. Using those allocations could lead to linear overflows of heap memory and other misbehaviors. So, take the opportunity to refactor the hnae_handle structure to switch the last member to flexible array, changing the code accordingly. Also, fix the comment in the hnae_vf_cb structure to inform that the ae_handle member must be the last member. Then, use the struct_size() helper to do the arithmetic instead of the argument "size + count * size" in the kzalloc() function. This code was detected with the help of Coccinelle and audited and fixed manually. [1] https://www.kernel.org/doc/html/latest/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments Signed-off-by: Len Baker <len.baker@gmx.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-10-12 11:23:11 +01:00
David S. Miller	249ae9495b	Merge branch 'mlxsw-ECN-mirroring' Ido Schimmel says: ==================== mlxsw: Add support for ECN mirroring Petr says: Patches in this set have been floating around for some time now together with trap_fwd support. That will however need more work, time for which is nowhere to be found, apparently. Instead, this patchset enables offload of only packet mirroring on RED mark qevent, enabling mirroring of ECN-marked packets. Formally it enables offload of filters added to blocks bound to the RED qevent mark if: - The switch ASIC is Spectrum-2 or above. - Only a single filter is attached at the block, at chain 0 (the default), and its classifier is matchall. - The filter has hw_stats set to disabled. - The filter has a single action, which is mirror. This differs from early_drop qevent offload, which supports mirroring and trapping. However trapping in context of ECN-marked packets is not suitable, because the HW does not drop the packet, as the trap action implies. And there is as of now no way to express only the part of trapping that transfers the packet to the SW datapath, sans the HW-datapath drop. The patchset progresses as follows: Patch #1 is an extack propagation. Mirroring of ECN-marked packets is configured in the ASIC through an ECN trigger, which is considered "egress", unlike the EARLY_DROP trigger. In patch #2, add a helper to classify triggers as ingress. As clarified above, traps cannot be offloaded on mark qevent. Similarly, given a trap_fwd action, it would not be offloadable on early_drop qevent. In patch #3, introduce support for tracking actions permissible on a given block. Patch #4 actually adds the mark qevent offload. In patch #5, fix a small style issue in one of the selftests, and in patch #6 add mark offload selftests. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-10-12 11:19:35 +01:00
Petr Machata	0cd6fa99a0	selftests: mlxsw: RED: Add selftests for the mark qevent Add do_mark_test(), which is to do_ecn_test() like do_drop_test() is to do_red_test(): meant to test that actions on the RED mark qevent block are offloaded, and executed on ECN-marked packets. The test splits install_qdisc() into its constituents, install_root_qdisc() and install_qdisc_tcX(). This is in order to test that when mirroring is enabled on one TC, the other TC does not mirror. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-10-12 11:19:35 +01:00
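
Commit 2c611ad97a in the log above describes widening neigh->flags from 8 to 32 bits and rejecting unknown bits passed in NDA_FLAGS_EXT. The following is a minimal userspace sketch of that idea, not the kernel's code: the function name, the constants, and the choice to place the extended bits directly above the legacy 8 bits are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constants for illustration; not copied from kernel headers. */
#define LEGACY_FLAGS_BITS 8u             /* width of the old 8-bit flags field */
#define EXT_FLAG_MANAGED  (1u << 0)      /* the only extended flag we "know" */
#define EXT_FLAGS_KNOWN   (EXT_FLAG_MANAGED)

/*
 * Merge the legacy 8-bit flags with the extended flags into one 32-bit
 * word. Returns false, leaving *out untouched, when the extended flags
 * contain a bit outside the known set.
 */
static bool merge_neigh_flags(uint8_t legacy, uint32_t ext, uint32_t *out)
{
    if (ext & ~EXT_FLAGS_KNOWN)
        return false;                    /* unknown extended flag: reject */

    /* Assumed layout: legacy bits low, extended bits shifted above them. */
    *out = (uint32_t)legacy | (ext << LEGACY_FLAGS_BITS);
    return true;
}
```

Rejecting unknown bits up front is what keeps the remaining bits available for future flags: user space cannot come to depend on stray values being silently accepted, which is the problem the commit message describes for the ndm_pad{1,2} fields.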


1044873 Commits
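
Commit 7bb39a3944 in the log above replaces an open-coded "size + count * size" allocator argument with struct_size(). The sketch below is a userspace approximation of that pattern, assuming a hypothetical `struct handle` with a flexible array member and a hypothetical helper `flex_size()`. It is not the kernel macro, only an illustration of saturating the size on overflow so that the allocation fails instead of coming back too small.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a handle that ends in a flexible array. */
struct handle {
    size_t q_num;
    void *qs[];                 /* flexible array member, must be last */
};

/*
 * Compute base + count * elem, saturating to SIZE_MAX if the arithmetic
 * would wrap. An allocator given SIZE_MAX is expected to fail rather
 * than hand back a buffer smaller than the caller intended.
 */
static size_t flex_size(size_t base, size_t elem, size_t count)
{
    if (elem != 0 && count > (SIZE_MAX - base) / elem)
        return SIZE_MAX;
    return base + count * elem;
}

/* Allocate a zeroed handle with room for q_num queue pointers. */
static struct handle *handle_alloc(size_t q_num)
{
    size_t bytes = flex_size(sizeof(struct handle), sizeof(void *), q_num);
    struct handle *h = calloc(1, bytes);

    if (h)
        h->q_num = q_num;
    return h;
}
```

A caller would use `handle_alloc(4)` and index `h->qs[0]` through `h->qs[3]`. Given a very large `q_num`, the size saturates instead of wrapping to a small value.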