Commit Graph

1281211 Commits

Author SHA1 Message Date
Sebastian Andrzej Siewior
585aa621af net/tcp_sigpool: Use nested-BH locking for sigpool_scratch.
sigpool_scratch is a per-CPU variable and relies on disabled BH for its
locking. Without per-CPU locking in local_bh_disable() on PREEMPT_RT
this data structure requires explicit locking.

Make a struct with a pad member (original sigpool_scratch) and a
local_lock_t and use local_lock_nested_bh() for locking. This change
adds only lockdep coverage and does not alter the functional behaviour
for !PREEMPT_RT.

Cc: David Ahern <dsahern@kernel.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/20240620132727.660738-6-bigeasy@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-24 16:41:22 -07:00
Sebastian Andrzej Siewior
bdacf3e349 net: Use nested-BH locking for napi_alloc_cache.
napi_alloc_cache is a per-CPU variable and relies on disabled BH for its
locking. Without per-CPU locking in local_bh_disable() on PREEMPT_RT
this data structure requires explicit locking.

Add a local_lock_t to the data structure and use local_lock_nested_bh()
for locking. This change adds only lockdep coverage and does not alter
the functional behaviour for !PREEMPT_RT.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/20240620132727.660738-5-bigeasy@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-24 16:41:22 -07:00
Sebastian Andrzej Siewior
43d7ca2907 net: Use __napi_alloc_frag_align() instead of open coding it.
The else condition within __netdev_alloc_frag_align() is an open coded
__napi_alloc_frag_align().

Use __napi_alloc_frag_align() instead of open coding it.
Move fragsz assignment before page_frag_alloc_align() invocation because
__napi_alloc_frag_align() also contains this statement.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/20240620132727.660738-4-bigeasy@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-24 16:41:22 -07:00
Sebastian Andrzej Siewior
c5bcab7558 locking/local_lock: Add local nested BH locking infrastructure.
Add local_lock_nested_bh() locking. It is based on local_lock_t and the
naming follows the preempt_disable_nested() example.

For !PREEMPT_RT + !LOCKDEP it is a per-CPU annotation for locking
assumptions based on local_bh_disable(). The macro is optimized away
during compilation.
For !PREEMPT_RT + LOCKDEP the local_lock_nested_bh() is reduced to
the usual lock-acquire plus lockdep_assert_in_softirq() - ensuring that
BH is disabled.

For PREEMPT_RT local_lock_nested_bh() acquires the specified per-CPU
lock. It does not disable CPU migration because it relies on
local_bh_disable() disabling CPU migration.
With LOCKDEP it performans the usual lockdep checks as with !PREEMPT_RT.
Due to include hell the softirq check has been moved spinlock.c.

The intention is to use this locking in places where locking of a per-CPU
variable relies on BH being disabled. Instead of treating disabled
bottom halves as a big per-CPU lock, PREEMPT_RT can use this to reduce
the locking scope to what actually needs protecting.
A side effect is that it also documents the protection scope of the
per-CPU variables.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/20240620132727.660738-3-bigeasy@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-24 16:41:22 -07:00
Sebastian Andrzej Siewior
07e4fd4c05 locking/local_lock: Introduce guard definition for local_lock.
Introduce lock guard definition for local_lock_t. There are no users
yet.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/20240620132727.660738-2-bigeasy@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-24 16:41:22 -07:00
Lukas Bulwahn
568ebdaba6 MAINTAINERS: adjust file entry in FREESCALE QORIQ DPAA FMAN DRIVER
Commit 243996d172 ("dt-bindings: net: Convert fsl-fman to yaml") splits
the previous dt text file into four yaml files. It adjusts a corresponding
file entry in MAINTAINERS from txt to yaml, but this adjustment misses
that the file was split and renamed.

Hence, ./scripts/get_maintainer.pl --self-test=patterns complains about a
broken reference.

Adjust the file entry to match the four yaml files resulting from this
commit above.

Signed-off-by: Lukas Bulwahn <lukas.bulwahn@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-24 11:34:39 +01:00
David S. Miller
84562f9953 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
ice: prepare representor for SF support

Michal Swiatkowski says:

This is a series to prepare port representor for supporting also
subfunctions. We need correct devlink locking and the possibility to
update parent VSI after port representor is created.

Refactor how devlink lock is taken to suite the subfunction use case.

VSI configuration needs to be done after port representor is created.
Port representor needs only allocated VSI. It doesn't need to be
configured before.

VSI needs to be reconfigured when update function is called.

The code for this patchset was split from (too big) patchset [1].

[1] https://lore.kernel.org/netdev/20240213072724.77275-1-michal.swiatkowski@linux.intel.com/
---
Originally from https://lore.kernel.org/netdev/20240605-next-2024-06-03-intel-next-batch-v2-0-39c23963fa78@intel.com/
Changes:
- delete ice_repr_get_by_vsi() from header
- rephrase commit message in moving devlink locking
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-23 12:50:19 +01:00
Sean Anderson
185d72112b net: xilinx: axienet: Enable multicast by default
We support multicast addresses, so enable it by default.

Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-22 11:35:55 +01:00
Jakub Kicinski
e9212f9dd1 linux-can-next-for-6.11-20240621
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEUEC6huC2BN0pvD5fKDiiPnotvG8FAmZ1MB4THG1rbEBwZW5n
 dXRyb25peC5kZQAKCRAoOKI+ei28b+QQB/9wDkLdjgRsvI3c32IXxTrOmuX97Pm2
 jYaHXmH1QkOKK0CXkPLEGjzpxSVXO6/d+ooeINkKmeF9VrtxxYWuE9pEsulyPz4y
 dbeK5oAZdhtS6CEnArzbNHR6mdP1IcmJ3uKojsV71t+4GTdZw4WHe6LNgC4ua7n8
 oYOpgbEV/YP/AyP8vO1L65M+tI69/0jeDpidtDQkNgWepE/Kh0W0ASfqjYrBgvZq
 eycUmLTCigKNhG8fNJVHejyrnLynY2QBqwM7M5yYw1q8lX4W8Yf1y55VCqzPz63D
 RiYVdVdG720jk9Ux9ZqxAKBPyzACnV6CG/MnzIgIdpV3O1aEf1WDkU2Y
 =rRB9
 -----END PGP SIGNATURE-----

Merge tag 'linux-can-next-for-6.11-20240621' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
pull-request: can-next 2024-06-21

The first 2 patches are by Andy Shevchenko, one cleans up the includes
in the mcp251x driver, the other one updates the sja100 plx_pci driver
to make use of predefines PCI subvendor ID.

Mans Rullgard's patch cleans up the Kconfig help text of for the slcan
driver.

Oliver Hartkopp provides a patch to update the documentation, which
removes the ISO 15675-2 specification version where possible.

The next 2 patches are by Harini T and update the documentation of the
xilinx_can driver.

Francesco Valla provides documentation for the ISO 15765-2 protocol.

A patch by Dr. David Alan Gilbert removes an unused struct from the
mscan driver.

12 patches are by Martin Jocic. The first three add support for 3 new
devices to the kvaser_usb driver. The remaining 9 first clean up the
kvaser_pciefd driver, and then add support for MSI.

Krzysztof Kozlowski contributes 3 patches simplifies the CAN SPI
drivers by making use of spi_get_device_match_data().

The last patch is by Martin Hundebøll, which reworks the m_can driver
to not enable the CAN transceiver during probe.

* tag 'linux-can-next-for-6.11-20240621' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next: (24 commits)
  can: m_can: don't enable transceiver when probing
  can: mcp251xfd: simplify with spi_get_device_match_data()
  can: mcp251x: simplify with spi_get_device_match_data()
  can: hi311x: simplify with spi_get_device_match_data()
  can: kvaser_pciefd: Add MSI interrupts
  can: kvaser_pciefd: Move reset of DMA RX buffers to the end of the ISR
  can: kvaser_pciefd: Change name of return code variable
  can: kvaser_pciefd: Rename board_irq to pci_irq
  can: kvaser_pciefd: Add unlikely
  can: kvaser_pciefd: Add inline
  can: kvaser_pciefd: Remove unnecessary comment
  can: kvaser_pciefd: Skip redundant NULL pointer check in ISR
  can: kvaser_pciefd: Group #defines together
  can: kvaser_usb: Add support for Kvaser Mini PCIe 1xCAN
  can: kvaser_usb: Add support for Kvaser USBcan Pro 5xCAN
  can: kvaser_usb: Add support for Vining 800
  can: mscan: remove unused struct 'mscan_state'
  Documentation: networking: document ISO 15765-2
  can: xilinx_can: Document driver description to list all supported IPs
  can: isotp: remove ISO 15675-2 specification version where possible
  ...
====================

Link: https://patch.msgid.link/20240621080201.305471-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-21 18:06:41 -07:00
Michal Swiatkowski
fff5cca345 ice: update representor when VSI is ready
In case of reset of VF VSI can be reallocated. To handle this case it
should be properly updated.

Reload representor as vsi->vsi_num can be different than the one stored
when representor was created.

Instead of only changing antispoof do whole VSI configuration for
eswitch.

Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-06-21 08:51:58 -07:00
Michal Swiatkowski
4d364df2b5 ice: move VSI configuration outside repr setup
It is needed because subfunction port representor shouldn't configure
the source VSI during representor creation.

Move the code to separate function and call it only in case the VF port
representor is being created.

Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-06-21 08:51:50 -07:00
Michal Swiatkowski
8d2f518c0c ice: move devlink locking outside the port creation
In case of subfunction lock will be taken for whole port creation and
removing. Do the same in VF case.

Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-06-21 07:45:38 -07:00
Michal Swiatkowski
639ac8ce8b ice: store representor ID in bridge port
It is used to get representor structure during cleaning.

Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-06-21 07:44:33 -07:00
Taehee Yoo
3226607302 selftests: net: change shebang to bash in amt.sh
amt.sh is written in bash, not sh.
So, shebang should be bash.

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Acked-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-21 14:27:22 +01:00
Niklas Söderlund
b0d3969d2b net: ethernet: rtsn: Add support for Renesas Ethernet-TSN
Add initial support for Renesas Ethernet-TSN End-station device of R-Car
V4H. The Ethernet End-station can connect to an Ethernet network using a
10 Mbps, 100 Mbps, or 1 Gbps full-duplex link via MII/GMII/RMII/RGMII.
Depending on the connected PHY.

The driver supports Rx checksum and offload and hardware timestamps.

While full power management and suspend/resume is not yet supported the
driver enables runtime PM in order to enable the module clock. While
explicit clock management using clk_enable() would suffice for the
supported SoC, the module could be reused on SoCs where the module is
part of a power domain.

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-21 12:24:54 +01:00
David S. Miller
d7527fe98d Merge branch 'qca8k-cleanup-and-port-isolation'
Matthias Schiffer says:

====================
net: dsa: qca8k: cleanup and port isolation

A small cleanup patch, and basically the same changes that were just
accepted for mt7530 to implement port isolation.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-21 12:22:59 +01:00
Matthias Schiffer
422b64025e net: dsa: qca8k: add support for bridge port isolation
Remove a pair of ports from the port matrix when both ports have the
isolated flag set.

Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-21 12:22:59 +01:00
Matthias Schiffer
412e1775f4 net: dsa: qca8k: factor out bridge join/leave logic
Most of the logic in qca8k_port_bridge_join() and qca8k_port_bridge_leave()
is the same. Refactor to reduce duplication and prepare for reusing the
code for implementing bridge port isolation.

dsa_port_offloads_bridge_dev() is used instead of
dsa_port_offloads_bridge(), passing the bridge in as a struct netdevice *,
as we won't have a struct dsa_bridge in qca8k_port_bridge_flags().

The error handling is changed slightly in the bridge leave case,
returning early and emitting an error message when a regmap access fails.
This shouldn't matter in practice, as there isn't much we can do if
communication with the switch breaks down in the middle of reconfiguration.

Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-21 12:22:59 +01:00
Matthias Schiffer
e85d3e6fea net: dsa: qca8k: do not write port mask twice in bridge join/leave
qca8k_port_bridge_join() set QCA8K_PORT_LOOKUP_CTRL() for i == port twice,
once in the loop handling all other port's masks, and finally at the end
with the accumulated port_mask.

The first time it would incorrectly set the port's own bit in the mask,
only to correct the mistake a moment later. qca8k_port_bridge_leave() had
the same issue, but here the regmap_clear_bits() was a no-op rather than
setting an unintended value.

Remove the duplicate assignment by skipping the whole loop iteration for
i == port. The unintended bit setting doesn't seem to have any negative
effects (even when not reverted right away), so the change is submitted
as a simple cleanup rather than a fix.

Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-21 12:22:59 +01:00
David S. Miller
28ba5c1171 Merge branch 'net-mscc-miim-switch-reset'
Herve Codina says:

====================
Handle switch reset in mscc-miim

These two patches were previously sent as part of a bigger series:
  https://lore.kernel.org/lkml/20240527161450.326615-1-herve.codina@bootlin.com/

v1 and v2 iterations were handled during the v1 and v2 reviews of this
bigger series. As theses two patches are now ready to be applied, they
were extracted from the bigger series and sent alone in this current
series.

This current v3 series takes into account feedback received during the
bigger series v2 review.

Changes v2 -> v3
  - patch 1
    Drop one useless sentence.
    Add 'Reviewed-by: Andrew Lunn <andrew@lunn.ch>'
    Add 'Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>'

  - patch 2
   Add 'Reviewed-by: Andrew Lunn <andrew@lunn.ch>'

Changes v1 -> v2 (as part of the bigger series iterations)
  - Patch 1
    Improve the reset property description

  - Patch 2
    Fix a wrong reverse x-mass tree declaration
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-21 12:12:43 +01:00
Herve Codina
9e6d33937b net: mdio: mscc-miim: Handle the switch reset
The mscc-miim device can be impacted by the switch reset, at least when
this device is part of the LAN966x PCI device.

Handle this newly added (optional) resets property.

Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-21 12:12:42 +01:00
Herve Codina
e5efa3ff41 dt-bindings: net: mscc-miim: Add resets property
Add the (optional) resets property.
The mscc-miim device is impacted by the switch reset especially when the
mscc-miim device is used as part of the LAN966x PCI device.

Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-21 12:12:42 +01:00
David S. Miller
4fce809e40 Merge branch 'l2tp-sk_user_data'
James Chapman says:

====================
l2tp: don't use the tunnel socket's sk_user_data in datapath

This series refactors l2tp to not use the tunnel socket's sk_user_data
in the datapath. The main reasons for doing this are

 * to allow for simplifying internal socket cleanup code (to be done
   in a later series)
 * to support multiple L2TPv3 UDP tunnels using the same 5-tuple
   address

When handling received UDP frames, l2tp's current approach is to look
up a session in a per-tunnel list. l2tp uses the tunnel socket's
sk_user_data to derive the tunnel context from the receiving socket.

But this results in the socket and tunnel lifetimes being very tightly
coupled and the tunnel/socket cleanup paths being complicated. The
latter has historically been a source of l2tp bugs and makes the code
more difficult to maintain. Also, if sockets are aliased, we can't
trust that the socket's sk_user_data references the right tunnel
anyway. Hence the desire to not use sk_user_data in the datapath.

The new approach is to lookup sessions in a per-net session list
without first deriving the tunnel:

 * For L2TPv2, the l2tp header has separate tunnel ID and session ID
   fields which can be trivially combined to make a unique 32-bit key
   for per-net session lookup.

 * For L2TPv3, there is no tunnel ID in the packet header, only a
   session ID, which should be unique over all tunnels so can be used
   as a key for per-net session lookup. However, this only works when
   the L2TPv3 session ID really is unique over all tunnels. At least
   one L2TPv3 application is known to use the same session ID in
   different L2TPv3 UDP tunnels, relying on UDP address/ports to
   distinguish them. This worked previously because sessions in UDP
   tunnels were kept in a per-tunnel list. To retain support for this,
   L2TPv3 session ID collisions are managed using a separate per-net
   session hlist, keyed by ID and sk. When looking up a session by ID,
   if there's more than one match, sk is used to find the right one.

L2TPv3 sessions in IP-encap tunnels are already looked up by session
ID in a per-net list. This work has UDP sessions also use the per-net
session list, while allowing for session ID collisions. The existing
per-tunnel hlist becomes a plain list since it is used only in
management and cleanup paths to walk a list of sessions in a given
tunnel.

For better performance, the per-net session lists use IDR. Separate
IDRs are used for L2TPv2 and L2TPv3 sessions to avoid potential key
collisions.

These changes pass l2tp regression tests and improve data forwarding
performance by about 10% in some of my test setups.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-21 11:33:34 +01:00
James Chapman
d18d3f0a24 l2tp: replace hlist with simple list for per-tunnel session list
The per-tunnel session list is no longer used by the
datapath. However, we still need a list of sessions in the tunnel for
l2tp_session_get_nth, which is used by management code. (An
alternative might be to walk each session IDR list, matching only
sessions of a given tunnel.)

Replace the per-tunnel hlist with a per-tunnel list. In functions
which walk a list of sessions of a tunnel, walk this list instead.

Signed-off-by: James Chapman <jchapman@katalix.com>
Reviewed-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-21 11:33:34 +01:00
James Chapman
8c6245af4f l2tp: drop the now unused l2tp_tunnel_get_session
All users of l2tp_tunnel_get_session are now gone so it can be
removed.

Signed-off-by: James Chapman <jchapman@katalix.com>
Reviewed-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-21 11:33:34 +01:00
James Chapman
5f77c18ea5 l2tp: use IDR for all session lookups
Add generic session getter which uses IDR. Replace all users of
l2tp_tunnel_get_session which uses the per-tunnel session list to use
the generic getter.

Signed-off-by: James Chapman <jchapman@katalix.com>
Reviewed-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-21 11:33:34 +01:00
James Chapman
c37e0138ca l2tp: don't use sk_user_data in l2tp_udp_encap_err_recv
If UDP sockets are aliased, sk might be the wrong socket. There's no
benefit to using sk_user_data to do some checks on the associated
tunnel context. Just report the error anyway, like udp core does.

Signed-off-by: James Chapman <jchapman@katalix.com>
Reviewed-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-21 11:33:33 +01:00
James Chapman
ff6a2ac23c l2tp: refactor udp recv to lookup to not use sk_user_data
Modify UDP decap to not use the tunnel pointer which comes from the
sock's sk_user_data when parsing the L2TP header. By looking up the
destination session using only the packet contents we avoid potential
UDP 5-tuple aliasing issues which arise from depending on the socket
that received the packet.

Drop the useless error messages on short packet or on failing to find
a session since the tunnel pointer might point to a different tunnel
if multiple sockets use the same 5-tuple.

Short packets (those not big enough to contain an L2TP header) are no
longer counted in the tunnel's invalid counter because we can't derive
the tunnel until we parse the l2tp header to lookup the session.

l2tp_udp_encap_recv was a small wrapper around l2tp_udp_recv_core which
used sk_user_data to derive a tunnel pointer in an RCU-safe way. But
we no longer need the tunnel pointer, so remove that code and combine
the two functions.

Signed-off-by: James Chapman <jchapman@katalix.com>
Reviewed-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-21 11:33:33 +01:00
James Chapman
2a3339f6c9 l2tp: store l2tpv2 sessions in per-net IDR
L2TPv2 sessions are currently kept in a per-tunnel hashlist, keyed by
16-bit session_id. When handling received L2TPv2 packets, we need to
first derive the tunnel using the 16-bit tunnel_id or sock, then
lookup the session in a per-tunnel hlist using the 16-bit session_id.

We want to avoid using sk_user_data in the datapath and double lookups
on every packet. So instead, use a per-net IDR to hold L2TPv2
sessions, keyed by a 32-bit value derived from the 16-bit tunnel_id
and session_id. This will allow the L2TPv2 UDP receive datapath to
lookup a session with a single lookup without deriving the tunnel
first.

L2TPv2 sessions are held in their own IDR to avoid potential
key collisions with L2TPv3 sessions.

Signed-off-by: James Chapman <jchapman@katalix.com>
Reviewed-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-21 11:33:33 +01:00
James Chapman
aa5e17e1f5 l2tp: store l2tpv3 sessions in per-net IDR
L2TPv3 sessions are currently held in one of two fixed-size hash
lists: either a per-net hashlist (IP-encap), or a per-tunnel hashlist
(UDP-encap), keyed by the L2TPv3 32-bit session_id.

In order to lookup L2TPv3 sessions in UDP-encap tunnels efficiently
without finding the tunnel first via sk_user_data, UDP sessions are
now kept in a per-net session list, keyed by session ID. Convert the
existing per-net hashlist to use an IDR for better performance when
there are many sessions and have L2TPv3 UDP sessions use the same IDR.

Although the L2TPv3 RFC states that the session ID alone identifies
the session, our implementation has allowed the same session ID to be
used in different L2TP UDP tunnels. To retain support for this, a new
per-net session hashtable is used, keyed by the sock and session
ID. If on creating a new session, a session already exists with that
ID in the IDR, the colliding sessions are added to the new hashtable
and the existing IDR entry is flagged. When looking up sessions, the
approach is to first check the IDR and if no unflagged match is found,
check the new hashtable. The sock is made available to session getters
where session ID collisions are to be considered. In this way, the new
hashtable is used only for session ID collisions so can be kept small.

For managing session removal, we need a list of colliding sessions
matching a given ID in order to update or remove the IDR entry of the
ID. This is necessary to detect session ID collisions when future
sessions are created. The list head is allocated on first collision
of a given ID and refcounted.

Signed-off-by: James Chapman <jchapman@katalix.com>
Reviewed-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-21 11:33:33 +01:00
James Chapman
a744e2d03a l2tp: remove unused list_head member in l2tp_tunnel
Remove an unused variable in struct l2tp_tunnel which was left behind
by commit c4d48a58f3 ("l2tp: convert l2tp_tunnel_list to idr").

Signed-off-by: James Chapman <jchapman@katalix.com>
Reviewed-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-21 11:33:33 +01:00
Sai Krishna
39c469188b octeontx2-pf: Add ucast filter count configurability via devlink.
The existing method of reserving unicast filter count leads to wasted
MCAM entries if the functionality is not used or fewer entries are used.
Furthermore, the amount of MCAM entries differs amongst Octeon SoCs.
We implemented a means to adjust the UC filter count via devlink,
allowing for better use of MCAM entries across Netdev apps.

commands:

To get the current unicast filter count
 # devlink dev param show pci/0002:02:00.0 name unicast_filter_count

To change/set the unicast filter count
 # devlink dev param  set  pci/0002:02:00.0  name unicast_filter_count
 value 5 cmode runtime

Signed-off-by: Sai Krishna <saikrishnag@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-21 11:28:47 +01:00
Jakub Kicinski
4558645d13 docs: net: document guidance of implementing the SR-IOV NDOs
New drivers were prevented from adding ndo_set_vf_* callbacks
over the last few years. This was expected to result in broader
switchdev adoption, but seems to have had little effect.

Based on recent netdev meeting there is broad support for allowing
adding those ops.

There is a problem with the current API supporting a limited number
of VFs (100+, which is less than some modern HW supports).
We can try to solve it by adding similar functionality on devlink
ports, but that'd be another API variation to maintain.
So a netlink attribute reshuffling is a more likely outcome.

Document the guidance, make it clear that the API is frozen.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-21 10:18:45 +01:00
Lukasz Majewski
dcec8d291d net: dsa: ksz_common: Allow only up to two HSR HW offloaded ports for KSZ9477
The KSZ9477 allows HSR in-HW offloading for any of two selected ports.
This patch adds check if one tries to use more than two ports with
HSR offloading enabled.

The problem is with RedBox configuration (HSR-SAN) - when configuring:
ip link add name hsr0 type hsr slave1 lan1 slave2 lan2 interlink lan3 \
	supervision 45 version 1

The lan1 (port0) and lan2 (port1) are correctly configured as ports, which
can use HSR offloading on ksz9477.

However, when we do already have two bits set in hsr_ports, we need to
return (-ENOTSUPP), so the interlink port (lan3) would be used with
SW based HSR RedBox support.

Otherwise, I do see some strange network behavior, as some HSR frames are
visible on non-HSR network and vice versa.

This causes the switch connected to interlink port (lan3) to drop frames
and no communication is possible.

Moreover, conceptually - the interlink (i.e. HSR-SAN port - lan3/port2)
shall be only supported in software as it is also possible to use ksz9477
with only SW based HSR (i.e. port0/1 -> hsr0 with offloading, port2 ->
HSR-SAN/interlink, port4/5 -> hsr1 with SW based HSR).

Fixes: 5055cccfc2 ("net: hsr: Provide RedBox support (HSR-SAN)")
Signed-off-by: Lukasz Majewski <lukma@denx.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-21 10:15:46 +01:00
Csókás, Bence
c32fe1986f net: fec: Fix FEC_ECR_EN1588 being cleared on link-down
FEC_ECR_EN1588 bit gets cleared after MAC reset in `fec_stop()`, which
makes all 1588 functionality shut down, and all the extended registers
disappear, on link-down, making the adapter fall back to compatibility
"dumb mode". However, some functionality needs to be retained (e.g. PPS)
even without link.

Fixes: 6605b730c0 ("FEC: Add time stamping code and a PTP hardware clock")
Cc: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/netdev/5fa9fadc-a89d-467a-aae9-c65469ff5fe1@lunn.ch/
Signed-off-by: Csókás, Bence <csokas.bence@prolan.hu>
Reviewed-by: Wei Fang <wei.fang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-21 10:12:32 +01:00
David S. Miller
a0c6359df6 Merge branch 'bnxt_en-netdev_queue_mgmt_ops'
David Wei says:

====================
bnxt_en: implement netdev_queue_mgmt_ops

Implement netdev_queue_mgmt_ops for bnxt added in [1]. This will be used
in the io_uring ZC Rx patchset to configure queues with a custom page
pool w/ a special memory provider for zero copy support.

The first two patches prep the driver, while the final patch adds the
implementation.

Any arbitrary Rx queue can be reset without affecting other queues. V2
and prior of this patchset was thought to only support resetting queues
not in the main RSS context. Upon further testing I realised moving
queues out and calling bnxt_hwrm_vnic_update() wasn't necessary.

I didn't include the netdev core API using this netdev_queue_mgmt_ops
because Mina is adding it in his devmem TCP series [2]. But I'm happy to
include it if folks want to include a user with this series.

I tested this series on BCM957504-N1100FY4 with FW 229.1.123.0. I
manually injected failures at all the places that can return an errno
and confirmed that the device/queue is never left in a broken state.

[1]: https://lore.kernel.org/netdev/20240501232549.1327174-2-shailend@google.com/
[2]: https://lore.kernel.org/netdev/20240607005127.3078656-2-almasrymina@google.com/

v3:
 - tested w/o bnxt_hwrm_vnic_update() and it works on any queue
 - removed unneeded code

v2:
 - fix broken build
 - remove unused var in bnxt_init_one_rx_ring()
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-21 10:10:34 +01:00
David Wei
2d694c27d3 bnxt_en: implement netdev_queue_mgmt_ops
Implement netdev_queue_mgmt_ops for bnxt added in [1].

Two bnxt_rx_ring_info structs are allocated to hold the new/old queue
memory. Queue memory is copied from/to the main bp->rx_ring[idx]
bnxt_rx_ring_info.

Queue memory is pre-allocated in bnxt_queue_mem_alloc() into a clone,
and then copied into bp->rx_ring[idx] in bnxt_queue_mem_start().

Similarly, when bp->rx_ring[idx] is stopped its queue memory is copied
into a clone, and then freed later in bnxt_queue_mem_free().

I tested this patchset with netdev_rx_queue_restart(), including
inducing errors in all places that returns an error code. In all cases,
the queue is left in a good working state.

Rx queues are created/destroyed using bnxt_hwrm_rx_ring_alloc() and
bnxt_hwrm_rx_ring_free(), which issue HWRM_RING_ALLOC and HWRM_RING_FREE
commands respectively to the firmware. By the time a HWRM_RING_FREE
response is received, there won't be any more completions from that
queue.

Thanks to Somnath for helping me with this patch. With their permission
I've added them as Acked-by.

[1]: https://lore.kernel.org/netdev/20240501232549.1327174-2-shailend@google.com/

Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: David Wei <dw@davidwei.uk>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-21 10:10:33 +01:00
David Wei
88f56254a2 bnxt_en: split rx ring helpers out from ring helpers
To prepare for queue API implementation, split rx ring functions out
from ring helpers. These new helpers will be called from queue API
implementation.

Signed-off-by: David Wei <dw@davidwei.uk>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-21 10:10:33 +01:00
David S. Miller
9413b1be0b Merge branch 'net-cleanup-arc-emac'
Johan Jonker says:

====================
cleanup arc emac

The Rockchip emac binding for rk3036/rk3066/rk3188 has been converted to YAML
with the ethernet-phy node in a mdio node. This requires some driver fixes
by someone that can do hardware testing.

In order to make a future fix easier make the driver 'Rockchip only'
by removing the obsolete part of the arc emac driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-21 10:07:18 +01:00
Johan Jonker
8a3913c8e0 dt-bindings: net: remove arc_emac.txt
The last real user nSIM_700 of the "snps,arc-emac" compatible string in
a driver was removed in 2019. The use of this string in the combined DT of
rk3066a/rk3188 as place holder has also been replaced, so
remove arc_emac.txt

Signed-off-by: Johan Jonker <jbx6244@gmail.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-21 10:07:18 +01:00
Johan Jonker
a119aec5bf net: ethernet: arc: remove emac_arc driver
The last real user nSIM_700 of the "snps,arc-emac" compatible string in
a driver was removed in 2019. The use of this string in the combined DT of
rk3066a/rk3188 as place holder has also been replaced, so
remove emac_arc.c to clean up some code.

Signed-off-by: Johan Jonker <jbx6244@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-21 10:07:17 +01:00
Johan Jonker
a8bd4d7af7 ARM: dts: rockchip: rk3xxx: fix emac node
In the combined DT of rk3066a/rk3188 the emac node uses as place holder
the compatible string "snps,arc-emac". The last real user nSIM_700
of the "snps,arc-emac" compatible string in a driver was removed in 2019.
Rockchip emac nodes don't make use of this common fall back string.
In order to removed unused driver code replace this string with
"rockchip,rk3066-emac".
As we are there remove the blank lines and sort.

Signed-off-by: Johan Jonker <jbx6244@gmail.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-21 10:07:17 +01:00
Martin Hundebøll
cd5a46ce6f can: m_can: don't enable transceiver when probing
The m_can driver sets and clears the CCCR.INIT bit during probe (both
when testing the NON-ISO bit, and when configuring the chip). After
clearing the CCCR.INIT bit, the transceiver enters normal mode, where it
affects the CAN bus (i.e. it ACKs frames). This can cause troubles when
the m_can node is only used for monitoring the bus, as one cannot setup
listen-only mode before the device is probed.

Rework the probe flow, so that the CCCR.INIT bit is only cleared when
upping the device. First, the tcan4x5x driver is changed to stay in
standby mode during/after probe. This in turn requires changes when
setting bits in the CCCR register, as its CSR and CSA bits are always
high in standby mode.

Signed-off-by: Martin Hundebøll <martin@geanix.com>
Reviewed-by: Markus Schneider-Pargmann <msp@baylibre.com>
Tested-by: Markus Schneider-Pargmann <msp@baylibre.com>
Link: https://lore.kernel.org/all/20240607105210.155435-1-martin@geanix.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-06-21 09:47:24 +02:00
Marc Kleine-Budde
3da74c5145 Merge patch series "can: hi311x: simplify with spi_get_device_match_data()"
Simplify SPI driver by making use of spi_get_device_match_data().

Link: https://lore.kernel.org/all/20240606142424.129709-1-krzysztof.kozlowski@linaro.org
[mkl: add intermediate cast to uintptr_t]
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-06-21 09:47:18 +02:00
Krzysztof Kozlowski
9cdae370c4 can: mcp251xfd: simplify with spi_get_device_match_data()
Use spi_get_device_match_data() helper to simplify a bit the driver.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/all/20240606142424.129709-3-krzysztof.kozlowski@linaro.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-06-21 09:46:40 +02:00
Krzysztof Kozlowski
d4383d67a2 can: mcp251x: simplify with spi_get_device_match_data()
Use spi_get_device_match_data() helper to simplify a bit the driver.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/all/20240606142424.129709-2-krzysztof.kozlowski@linaro.org
[mkl: add intermediate cast to uintptr_t]
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-06-21 09:46:39 +02:00
Krzysztof Kozlowski
1562a49d00 can: hi311x: simplify with spi_get_device_match_data()
Use spi_get_device_match_data() helper to simplify a bit the driver.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/all/20240606142424.129709-1-krzysztof.kozlowski@linaro.org
[mkl: add intermediate cast to uintptr_t]
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-06-21 09:46:38 +02:00
Marc Kleine-Budde
3ba5caf39d Merge patch series "can: kvaser_pciefd: Support MSI interrupts"
Martin Jocic <martin.jocic@kvaser.com> says:

This patch series adds support for MSI interrupts. It depends on the
patch series can: kvaser_pciefd: Minor improvements and cleanups.

Link: https://lore.kernel.org/all/20240620181320.235465-1-martin.jocic@kvaser.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-06-21 09:46:34 +02:00
Martin Jocic
dd1f05ba2a can: kvaser_pciefd: Add MSI interrupts
Use MSI interrupts with fallback to INTx interrupts.

Signed-off-by: Martin Jocic <martin.jocic@kvaser.com>
Link: https://lore.kernel.org/all/20240620181320.235465-3-martin.jocic@kvaser.com
[mkl: kvaser_pciefd_probe(): call pci_free_irq_vectors() unconditionally]
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-06-21 09:46:02 +02:00
Martin Jocic
48f827d4f4 can: kvaser_pciefd: Move reset of DMA RX buffers to the end of the ISR
A new interrupt is triggered by resetting the DMA RX buffers.
Since MSI interrupts are faster than legacy interrupts, the reset
of the DMA buffers must be moved to the very end of the ISR,
otherwise a new MSI interrupt will be masked by the current one.

Signed-off-by: Martin Jocic <martin.jocic@kvaser.com>
Link: https://lore.kernel.org/all/20240620181320.235465-2-martin.jocic@kvaser.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-06-21 09:46:00 +02:00