Commit Graph

968617 Commits

Author SHA1 Message Date
Martin Schiller
62480b992b net/lapb: fix t1 timer handling for LAPB_STATE_0
1. DTE interface changes immediately to LAPB_STATE_1 and start sending
   SABM(E).

2. DCE interface sends N2-times DM and changes to LAPB_STATE_1
   afterwards if there is no response in the meantime.

Signed-off-by: Martin Schiller <ms@dev.tdt.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-27 17:22:51 -08:00
Martin Schiller
a4989fa911 net/lapb: support netdev events
This patch allows layer2 (LAPB) to react to netdev events itself and
avoids the detour via layer3 (X.25).

1. Establish layer2 on NETDEV_UP events, if the carrier is already up.

2. Call lapb_disconnect_request() on NETDEV_GOING_DOWN events to signal
   the peer that the connection will go down.
   (Only when the carrier is up.)

3. When a NETDEV_DOWN event occur, clear all queues, enter state
   LAPB_STATE_0 and stop all timers.

4. The NETDEV_CHANGE event makes it possible to handle carrier loss and
   detection.

   In case of Carrier Loss, clear all queues, enter state LAPB_STATE_0
   and stop all timers.

   In case of Carrier Detection, we start timer t1 on a DCE interface,
   and on a DTE interface we change to state LAPB_STATE_1 and start
   sending SABM(E).

Signed-off-by: Martin Schiller <ms@dev.tdt.de>
Acked-by: Xie He <xie.he.0141@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-27 17:22:51 -08:00
Martin Schiller
7eed751b3b net/x25: handle additional netdev events
1. Add / remove x25_link_device by NETDEV_REGISTER/UNREGISTER and also
   by NETDEV_POST_TYPE_CHANGE/NETDEV_PRE_TYPE_CHANGE.

   This change is needed so that the x25_neigh struct for an interface
   is already created when it shows up and is kept independently if the
   interface goes UP or DOWN.

   This is used in an upcomming commit, where x25 params of an neighbour
   will get configurable through ioctls.

2. NETDEV_CHANGE event makes it possible to handle carrier loss and
   detection. If carrier is lost, clean up everything related to this
   neighbour by calling x25_link_terminated().

3. Also call x25_link_terminated() for NETDEV_DOWN events and remove the
   call to x25_clear_forward_by_dev() in x25_route_device_down(), as
   this is already called by x25_kill_by_neigh() which gets called by
   x25_link_terminated().

4. Do nothing for NETDEV_UP and NETDEV_GOING_DOWN events, as these will
   be handled in layer 2 (LAPB) and layer3 (X.25) will be informed by
   layer2 when layer2 link is established and layer3 link should be
   initiated.

Signed-off-by: Martin Schiller <ms@dev.tdt.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-27 17:22:51 -08:00
Jakub Kicinski
f5d709ffde Merge branch 'mlxsw-update-adjacency-index-more-efficiently'
Ido Schimmel says:

====================
mlxsw: Update adjacency index more efficiently

The device supports an operation that allows the driver to issue one
request to update the adjacency index for all the routes in a given
virtual router (VR) from old index and size to new ones. This is useful
in case the configuration of a certain nexthop group is updated and its
adjacency index changes.

Currently, the driver does not use this operation in an efficient
manner. It iterates over all the routes using the nexthop group and
issues an update request for the VR if it is not the same as the
previous VR.

Instead, this patch set tracks the VRs in which the nexthop group is
used and issues one request for each VR.

Example:

8k IPv6 routes were added in an alternating manner to two VRFs. All the
routes are using the same nexthop object ('nhid 1').

Before:

 Performance counter stats for 'ip nexthop replace id 1 via 2001:db8:1::2 dev swp3':

            16,385      devlink:devlink_hwmsg

       4.255933213 seconds time elapsed

       0.000000000 seconds user
       0.666923000 seconds sys

Number of EMAD transactions corresponds to number of routes using the
nexthop group.

After:

 Performance counter stats for 'ip nexthop replace id 1 via 2001:db8:1::2 dev swp3':

                 3      devlink:devlink_hwmsg

       0.077655094 seconds time elapsed

       0.000000000 seconds user
       0.076698000 seconds sys

Number of EMAD transactions corresponds to number of VRFs / VRs.

Patch set overview:

Patch #1 is a fix for a bug introduced in previous submission. Detected
by Coverity.

Patches #2 and #3 are preparations.

Patch #4 tracks the VRs a nexthop group is member of.

Patch #5 uses the membership tracking from the previous patch to issue
one update request per each VR.
====================

Link: https://lore.kernel.org/r/20201125193505.1052466-1-idosch@idosch.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-27 17:17:37 -08:00
Ido Schimmel
ff47fa13c9 mlxsw: spectrum_router: Update adjacency index more efficiently
The device supports an operation that allows the driver to issue one
request to update the adjacency index for all the routes in a given
virtual router (VR) from old index and size to new ones. This is useful
in case the configuration of a certain nexthop group is updated and its
adjacency index changes.

Currently, the driver does not use this operation in an efficient
manner. It iterates over all the routes using the nexthop group and
issues an update request for the VR if it is not the same as the
previous VR.

Instead, use the VR tracking added in the previous patch to update the
adjacency index once for each VR currently using the nexthop group.

Example:

8k IPv6 routes were added in an alternating manner to two VRFs. All the
routes are using the same nexthop object ('nhid 1').

Before:

# perf stat -e devlink:devlink_hwmsg --filter='incoming==0' -- ip nexthop replace id 1 via 2001:db8:1::2 dev swp3

 Performance counter stats for 'ip nexthop replace id 1 via 2001:db8:1::2 dev swp3':

            16,385      devlink:devlink_hwmsg

       4.255933213 seconds time elapsed

       0.000000000 seconds user
       0.666923000 seconds sys

Number of EMAD transactions corresponds to number of routes using the
nexthop group.

After:

# perf stat -e devlink:devlink_hwmsg --filter='incoming==0' -- ip nexthop replace id 1 via 2001:db8:1::2 dev swp3

 Performance counter stats for 'ip nexthop replace id 1 via 2001:db8:1::2 dev swp3':

                 3      devlink:devlink_hwmsg

       0.077655094 seconds time elapsed

       0.000000000 seconds user
       0.076698000 seconds sys

Number of EMAD transactions corresponds to number of VRFs / VRs.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-27 17:17:33 -08:00
Ido Schimmel
d2141a42b9 mlxsw: spectrum_router: Track nexthop group virtual router membership
For each nexthop group, track in which virtual routers (VRs) the group is
used. This is going to be used by the next patch to perform a more
efficient adjacency index update whenever the group's adjacency index
changes.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-27 17:17:33 -08:00
Ido Schimmel
9a4ab10c74 mlxsw: spectrum_router: Rollback virtual router adjacency pointer update
In the rare case where the adjacency pointer cannot be updated for a
given virtual router, rollback the operation so that virtual routers
that are already using the new index will use the old one again.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-27 17:17:33 -08:00
Ido Schimmel
40e4413d5d mlxsw: spectrum_router: Pass virtual router parameters directly instead of pointer
mlxsw_sp_adj_index_mass_update_vr() only needs the virtual router's
identifier and protocol, so pass them directly. In a subsequent patch
the caller will not have access to the pointer.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-27 17:17:33 -08:00
Ido Schimmel
1c2c5eb6e1 mlxsw: spectrum_router: Fix error handling issue
Return error to the caller instead of suppressing it.

Fixes: e3ddfb45ba ("mlxsw: spectrum_router: Allow returning errors from mlxsw_sp_nexthop_group_refresh()")
Addresses-Coverity: ("Error handling issues  (CHECKED_RETURN)")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-27 17:17:32 -08:00
Jakub Kicinski
4be074e6dd Merge branch 'net-sched-fix-over-mtu-packet-of-defrag-in'
wenxu says:

====================
net/sched: fix over mtu packet of defrag in

Currently kernel tc subsystem can do conntrack in act_ct. But when several
fragment packets go through the act_ct, function tcf_ct_handle_fragments
will defrag the packets to a big one. But the last action will redirect
mirred to a device which maybe lead the reassembly big packet over the mtu
of target device.

The first patch fix miss init the qdisc_skb_cb->mru
The send one refactor the hanle of xmit in act_mirred and prepare for the
third one
The last one add implict packet fragment support to fix the over mtu for
defrag in act_ct.
====================

Link: https://lore.kernel.org/r/1606276883-6825-1-git-send-email-wenxu@ucloud.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-27 14:36:04 -08:00
wenxu
c129412f74 net/sched: sch_frag: add generic packet fragment support.
Currently kernel tc subsystem can do conntrack in cat_ct. But when several
fragment packets go through the act_ct, function tcf_ct_handle_fragments
will defrag the packets to a big one. But the last action will redirect
mirred to a device which maybe lead the reassembly big packet over the mtu
of target device.

This patch add support for a xmit hook to mirred, that gets executed before
xmiting the packet. Then, when act_ct gets loaded, it configs that hook.
The frag xmit hook maybe reused by other modules.

Signed-off-by: wenxu <wenxu@ucloud.cn>
Acked-by: Cong Wang <cong.wang@bytedance.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-27 14:36:02 -08:00
wenxu
fa6d639930 net/sched: act_mirred: refactor the handle of xmit
This one is prepare for the next patch.

Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-27 14:36:02 -08:00
wenxu
aadaca9e7c net/sched: fix miss init the mru in qdisc_skb_cb
The mru in the qdisc_skb_cb should be init as 0. Only defrag packets in the
act_ct will set the value.

Fixes: 038ebb1a71 ("net/sched: act_ct: fix miss set mru for ovs after defrag in act_ct")
Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-27 14:36:02 -08:00
Jakub Kicinski
fb3158ea61 Merge branch 'add-chacha20-poly1305-cipher-to-kernel-tls'
Vadim Fedorenko says:

====================
Add CHACHA20-POLY1305 cipher to Kernel TLS

RFC 7905 defines usage of ChaCha20-Poly1305 in TLS connections. This
cipher is widely used nowadays and it's good to have a support for it
in TLS connections in kernel.
====================

Link: https://lore.kernel.org/r/1606231490-653-1-git-send-email-vfedorenko@novek.ru
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-27 14:32:40 -08:00
Vadim Fedorenko
4f336e88a8 selftests/tls: add CHACHA20-POLY1305 to tls selftests
Add new cipher as a variant of standard tls selftests

Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-27 14:32:37 -08:00
Vadim Fedorenko
74ea610602 net/tls: add CHACHA20-POLY1305 configuration
Add ChaCha-Poly specific configuration code.

Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-27 14:32:37 -08:00
Vadim Fedorenko
a6acbe6235 net/tls: add CHACHA20-POLY1305 specific behavior
RFC 7905 defines special behavior for ChaCha-Poly TLS sessions.
The differences are in the calculation of nonce and the absence
of explicit IV. This behavior is like TLSv1.3 partly.

Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-27 14:32:37 -08:00
Vadim Fedorenko
923c40c465 net/tls: add CHACHA20-POLY1305 specific defines and structures
To provide support for ChaCha-Poly cipher we need to define
specific constants and structures.

Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-27 14:32:37 -08:00
Vadim Fedorenko
6942a284fb net/tls: make inline helpers protocol-aware
Inline functions defined in tls.h have a lot of AES-specific
constants. Remove these constants and change argument to struct
tls_prot_info to have an access to cipher type in later patches

Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-27 14:32:37 -08:00
Jakub Kicinski
594e31bceb Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
40GbE Intel Wired LAN Driver Updates 2020-11-24

This series contains updates to i40e and igbvf drivers.

Marek removes a redundant assignment for i40e.

Stefan Assmann corrects reporting of VF link speed for i40e.

Karen revises a couple of error messages to warnings for igbvf as they
could be misinterpreted as issues when they are not.

v2: Dropped PTP patch as it's being updated.

* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  igbvf: Refactor traces
  i40e: report correct VF link speed when link state is set to enable
  i40e: remove redundant assignment
====================

Link: https://lore.kernel.org/r/20201124165245.2844118-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 18:11:25 -08:00
Jakub Kicinski
64088b2ac1 Merge branch 'net-dsa-mv88e6xxx-serdes-link-without-phy'
Chris Packham says:

====================
net: dsa: mv88e6xxx: serdes link without phy

This small series gets my hardware into a working state. The key points are to
make sure we don't force the link and that we ask the MAC for the link status.
I also have updated my dts to say `phy-mode = "1000base-x";` and `managed =
"in-band-status";`

I've dropped the patch for the 88E6123 as it's a distraction and I lack
hardware to do any proper testing with it. Earlier versions are on the mailing
list if anyone wants to pick it up in the future.

I notice there's a series for mv88e6393x circulating on the netdev mailing
list. As patch #1 is adding a new device specific op either this series will
need updating to cover the mv88e6393x or the mv88e6393x series will need
updating for the new op depenting on which lands first.
====================

Link: https://lore.kernel.org/r/20201124043440.28400-1-chris.packham@alliedtelesis.co.nz
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 17:58:09 -08:00
Chris Packham
0fd5d79efa net: dsa: mv88e6xxx: Handle error in serdes_get_regs
If the underlying read operation failed we would end up writing stale
data to the supplied buffer. This would end up with the last
successfully read value repeating. Fix this by only writing the data
when we know the read was good. This will mean that failed values will
return 0xffff.

Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 17:58:06 -08:00
Chris Packham
5c19bc8b57 net: dsa: mv88e6xxx: Add serdes interrupt support for MV88E6097
The MV88E6097 presents the serdes interrupts for ports 8 and 9 via the
Switch Global 2 registers. There is no additional layer of
enablinh/disabling the serdes interrupts like other mv88e6xxx switches.
Even though most of the serdes behaviour is the same as the MV88E6185
that chip does not provide interrupts for serdes events so unlike
earlier commits the functions added here are specific to the MV88E6097.

Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 17:58:06 -08:00
Chris Packham
f5be107c33 net: dsa: mv88e6xxx: Support serdes ports on MV88E6097/6095/6185
Implement serdes_power, serdes_get_lane and serdes_pcs_get_state ops for
the MV88E6097/6095/6185 so that ports 8 & 9 can be supported as serdes
ports and directly connected to other network interfaces or to SFPs
without a PHY.

Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 17:58:06 -08:00
Chris Packham
4efe766290 net: dsa: mv88e6xxx: Don't force link when using in-band-status
When a port is configured with 'managed = "in-band-status"' switch chips
like the 88E6390 need to propagate the SERDES link state to the MAC
because the link state is not correctly detected. This causes problems
on the 88E6185/88E6097 where the link partner won't see link state
changes because we're forcing the link.

To address this introduce a new device specific op port_sync_link() and
push the logic from mv88e6xxx_mac_link_up() into that. Provide an
implementation for the 88E6185 like devices which doesn't force the
link.

Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 17:58:06 -08:00
Jakub Kicinski
0f614511fa Merge branch 'dt-bindings-net-dsa-microchip-convert-ksz-bindings-to-yaml'
Christian Eggers says:

====================
dt-bindings: net: dsa: microchip: convert KSZ bindings to yaml

These patches are orginally from the series

"net: dsa: microchip: PTP support for KSZ956x"

As the the device tree conversion to yaml is not really related to the
PTP patches and the original series is going to take more time than
I expected, I would like to split this.

Changes (original series -> v1)
--------------------------------
- dts: moved "allOf" below "maintainers"
- dts: use "unevaluatedProperties" instead of "additionalProperties"
- dts: removed "spi-cpha" and "spi-cpol" flags as the hardware is fixed
- ksz8795: setup SPI for mode 3
- ksz9477: dito
====================

Link: https://lore.kernel.org/r/20201120112107.16334-1-ceggers@arri.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 17:55:27 -08:00
Christian Eggers
8c4599f498 net: dsa: microchip: ksz8795: setup SPI mode
This should be done in the device driver instead of the device tree.

Signed-off-by: Christian Eggers <ceggers@arri.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 17:53:45 -08:00
Christian Eggers
9ed602bac9 net: dsa: microchip: ksz9477: setup SPI mode
This should be done in the device driver instead of the device tree.

Signed-off-by: Christian Eggers <ceggers@arri.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 17:53:45 -08:00
Christian Eggers
44e53c8882 net: dsa: microchip: support for "ethernet-ports" node
The dsa.yaml device tree binding allows "ethernet-ports" (preferred) and
"ports".

Signed-off-by: Christian Eggers <ceggers@arri.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 17:53:45 -08:00
Christian Eggers
4f36d97786 dt-bindings: net: dsa: convert ksz bindings document to yaml
Convert the bindings document for Microchip KSZ Series Ethernet switches
from txt to yaml. Removed spi-cpha and spi-cpol flags is this should be
handled by the device driver.

Signed-off-by: Christian Eggers <ceggers@arri.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 17:53:45 -08:00
Jakub Kicinski
0e1f1cc89a Merge branch 'add-an-assert-in-napi_consume_skb'
Yunsheng Lin says:

====================
Add an assert in napi_consume_skb()

This patch introduces a lockdep_assert_in_softirq() interface and
uses it to assert the case when napi_consume_skb() is not called in
the softirq context.
====================

Link: https://lore.kernel.org/r/1606214969-97849-1-git-send-email-linyunsheng@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 15:08:39 -08:00
Yunsheng Lin
6454eca81e net: Use lockdep_assert_in_softirq() in napi_consume_skb()
Use napi_consume_skb() to assert the case when it is not called
in a atomic softirq context.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 15:08:38 -08:00
Yunsheng Lin
8b5536ad12 lockdep: Introduce in_softirq lockdep assert
The current semantic for napi_consume_skb() is that caller need
to provide non-zero budget when calling from NAPI context, and
breaking this semantic will cause hard to debug problem, because
_kfree_skb_defer() need to run in atomic context in order to push
the skb to the particular cpu' napi_alloc_cache atomically.

So add the lockdep_assert_in_softirq() to assert when the running
context is not in_softirq, in_softirq means softirq is serving or
BH is disabled, which has a ambiguous semantics due to the BH
disabled confusion, so add a comment to emphasize that.

And the softirq context can be interrupted by hard IRQ or NMI
context, lockdep_assert_in_softirq() need to assert about hard
IRQ or NMI context too.

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 15:08:34 -08:00
Rikard Falkeborn
b5094a3b53 soc: qcom: ipa: Constify static qmi structs
These are only used as input arguments to qmi_handle_init() which
accepts const pointers to both qmi_ops and qmi_msg_handler. Make them
const to allow the compiler to put them in read-only memory.

Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Acked-by: Alex Elder <elder@linaro.org>
Link: https://lore.kernel.org/r/20201122234031.33432-2-rikard.falkeborn@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 13:54:41 -08:00
Paolo Abeni
fd8976790a mptcp: be careful on MPTCP-level ack.
We can enter the main mptcp_recvmsg() loop even when
no subflows are connected. As note by Eric, that would
result in a divide by zero oops on ack generation.

Address the issue by checking the subflow status before
sending the ack.

Additionally protect mptcp_recvmsg() against invocation
with weird socket states.

v1 -> v2:
 - removed unneeded inline keyword - Jakub

Reported-and-suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Fixes: ea4ca586b1 ("mptcp: refine MPTCP-level ack scheduling")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/5370c0ae03449239e3d1674ddcfb090cf6f20abe.1606253206.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 13:36:16 -08:00
Horatiu Vultur
bfd042321a bridge: mrp: Implement LC mode for MRP
Extend MRP to support LC mode(link check) for the interconnect port.
This applies only to the interconnect ring.

Opposite to RC mode(ring check) the LC mode is using CFM frames to
detect when the link goes up or down and based on that the userspace
will need to react.
One advantage of the LC mode over RC mode is that there will be fewer
frames in the normal rings. Because RC mode generates InTest on all
ports while LC mode sends CFM frame only on the interconnect port.

All 4 nodes part of the interconnect ring needs to have the same mode.
And it is not possible to have running LC and RC mode at the same time
on a node.

Whenever the MIM starts it needs to detect the status of the other 3
nodes in the interconnect ring so it would send a frame called
InLinkStatus, on which the clients needs to reply with their link
status.

This patch adds InLinkStatus frame type and extends existing rules on
how to forward this frame.

Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Link: https://lore.kernel.org/r/20201124082525.273820-1-horatiu.vultur@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 13:33:35 -08:00
Vlad Buslov
f460019b4c net: sched: alias action flags with TCA_ACT_ prefix
Currently both filter and action flags use same "TCA_" prefix which makes
them hard to distinguish to code and confusing for users. Create aliases
for existing action flags constants with "TCA_ACT_" prefix.

Signed-off-by: Vlad Buslov <vlad@buslov.dev>
Link: https://lore.kernel.org/r/20201124164054.893168-1-vlad@buslov.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 12:34:44 -08:00
Florian Westphal
b6d69fc8e8 mptcp: put reference in mptcp timeout timer
On close this timer might be scheduled. mptcp uses sk_reset_timer for
this, so the a reference on the mptcp socket is taken.

This causes a refcount leak which can for example be reproduced
with 'mp_join_server_v4.pkt' from the mptcp-packetdrill repo.

The leak has nothing to do with join requests, v1_mp_capable_bind_no_cs.pkt
works too when replacing the last ack mpcapable to v1 instead of v0.

unreferenced object 0xffff888109bba040 (size 2744):
  comm "packetdrill", [..]
  backtrace:
    [..] sk_prot_alloc.isra.0+0x2b/0xc0
    [..] sk_clone_lock+0x2f/0x740
    [..] mptcp_sk_clone+0x33/0x1a0
    [..] subflow_syn_recv_sock+0x2b1/0x690 [..]

Fixes: e16163b6e2 ("mptcp: refactor shutdown and close")
Cc: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/20201124162446.11448-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 12:32:45 -08:00
Antonio Borneo
4826d2c4fc net: phy: realtek: read actual speed on rtl8211f to detect downshift
The rtl8211f supports downshift and before commit 5502b218e0
("net: phy: use phy_resolve_aneg_linkmode in genphy_read_status")
the read-back of register MII_CTRL1000 was used to detect the
negotiated link speed.
The code added in commit d445dff2df ("net: phy: realtek: read
actual speed to detect downshift") is working fine also for this
phy and it's trivial re-using it to restore the downshift
detection on rtl8211f.

Add the phy specific read_status() pointing to the existing
function rtlgen_read_status().

Signed-off-by: Antonio Borneo <antonio.borneo@st.com>
Link: https://lore.kernel.org/r/478f871a-583d-01f1-9cc5-2eea56d8c2a7@huawei.com
Tested-by: Yonglong Liu <liuyonglong@huawei.com>
Link: https://lore.kernel.org/r/20201124230756.887925-1-antonio.borneo@st.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 12:29:40 -08:00
Jakub Kicinski
16d07c38c4 Merge branch 'net-ptp-use-common-defines-for-ptp-message-types-in-further-drivers'
Christian Eggers says:

====================
net: ptp: use common defines for PTP message types in further drivers

This series replaces further driver internal enumeration / uses of magic
numbers with the newly introduced PTP_MSGTYPE_* defines.
====================

Link: https://lore.kernel.org/r/20201124074418.2609-1-ceggers@arri.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 12:23:27 -08:00
Christian Eggers
298722166a net: phy: mscc: use new PTP_MSGTYPE_* defines
Use recently introduced PTP_MSGTYPE_SYNC and PTP_MSGTYPE_DELAY_REQ
defines instead of a driver internal enumeration.

Signed-off-by: Christian Eggers <ceggers@arri.de>
Reviewed-by: Antoine Tenart <atenart@kernel.org>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 12:23:21 -08:00
Christian Eggers
37e9d0559a mlxsw: spectrum_ptp: use PTP wide message type definitions
Use recently introduced PTP wide defines instead of a driver internal
enumeration.

Signed-off-by: Christian Eggers <ceggers@arri.de>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Cc: Petr Machata <petrm@mellanox.com>
Cc: Jiri Pirko <jiri@nvidia.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 12:23:14 -08:00
Christian Eggers
651c814f3c net: phy: dp83640: use new PTP_MSGTYPE_SYNC define
Replace use of magic number with recently introduced define.

Signed-off-by: Christian Eggers <ceggers@arri.de>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 12:23:06 -08:00
Jakub Kicinski
062547380d Merge branch 'net-phy-add-support-for-shared-interrupts-part-3'
Ioana Ciornei says:

====================
net: phy: add support for shared interrupts (part 3)

This patch set aims to actually add support for shared interrupts in
phylib and not only for multi-PHY devices. While we are at it,
streamline the interrupt handling in phylib.

For a bit of context, at the moment, there are multiple phy_driver ops
that deal with this subject:

- .config_intr() - Enable/disable the interrupt line.

- .ack_interrupt() - Should quiesce any interrupts that may have been
  fired.  It's also used by phylib in conjunction with .config_intr() to
  clear any pending interrupts after the line was disabled, and before
  it is going to be enabled.

- .did_interrupt() - Intended for multi-PHY devices with a shared IRQ
  line and used by phylib to discern which PHY from the package was the
  one that actually fired the interrupt.

- .handle_interrupt() - Completely overrides the default interrupt
  handling logic from phylib. The PHY driver is responsible for checking
  if any interrupt was fired by the respective PHY and choose
  accordingly if it's the one that should trigger the link state machine.

From my point of view, the interrupt handling in phylib has become
somewhat confusing with all these callbacks that actually read the same
PHY register - the interrupt status.  A more streamlined approach would
be to just move the responsibility to write an interrupt handler to the
driver (as any other device driver does) and make .handle_interrupt()
the only way to deal with interrupts.

Another advantage with this approach would be that phylib would gain
support for shared IRQs between different PHY (not just multi-PHY
devices), something which at the moment would require extending every
PHY driver anyway in order to implement their .did_interrupt() callback
and duplicate the same logic as in .ack_interrupt(). The disadvantage
of making .did_interrupt() mandatory would be that we are slightly
changing the semantics of the phylib API and that would increase
confusion instead of reducing it.

What I am proposing is the following:

- As a first step, make the .ack_interrupt() callback optional so that
  we do not break any PHY driver amid the transition.

- Every PHY driver gains a .handle_interrupt() implementation that, for
  the most part, would look like below:

	irq_status = phy_read(phydev, INTR_STATUS);
	if (irq_status < 0) {
		phy_error(phydev);
		return IRQ_NONE;
	}

	if (!(irq_status & irq_mask))
		return IRQ_NONE;

	phy_trigger_machine(phydev);

	return IRQ_HANDLED;

- Remove each PHY driver's implementation of the .ack_interrupt() by
  actually taking care of quiescing any pending interrupts before
  enabling/after disabling the interrupt line.

- Finally, after all drivers have been ported, remove the
  .ack_interrupt() and .did_interrupt() callbacks from phy_driver.

This patch set is part 3 (and final) of the entire change set and it
addresses the remaining PHY drivers that have not been migrated
previosly. Also, it finally removed the .did_interrupt() and
.ack_interrupt() callbacks since they are of no use anymore.

I do not have access to most of these PHY's, therefore I Cc-ed the
latest contributors to the individual PHY drivers in order to have
access, hopefully, to more regression testing.
====================

Link: https://lore.kernel.org/r/20201123153817.1616814-1-ciorneiioana@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 11:18:41 -08:00
Ioana Ciornei
6527b93842 net: phy: remove the .did_interrupt() and .ack_interrupt() callback
Now that all the PHY drivers have been migrated to directly implement
the generic .handle_interrupt() callback for a seamless support of
shared IRQs and all the .config_inter() implementations clear any
pending interrupts, we can safely remove the two callbacks.

With this patch, phylib has a proper support for shared IRQs (and not
just for multi-PHY devices. A PHY driver must implement both the
.handle_interrupt() and .config_intr() callbacks for the IRQs to be
actually used.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 11:18:38 -08:00
Ioana Ciornei
a1a4417458 net: phy: qsemi: remove the use of .ack_interrupt()
In preparation of removing the .ack_interrupt() callback, we must replace
its occurrences (aka phy_clear_interrupt), from the 2 places where it is
called from (phy_enable_interrupts and phy_disable_interrupts), with
equivalent functionality.

This means that clearing interrupts now becomes something that the PHY
driver is responsible of doing, before enabling interrupts and after
clearing them. Make this driver follow the new contract.

Also, add a comment describing the multiple step interrupt
acknoledgement process.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 11:18:38 -08:00
Ioana Ciornei
efc3d9de7f net: phy: qsemi: implement generic .handle_interrupt() callback
In an attempt to actually support shared IRQs in phylib, we now move the
responsibility of triggering the phylib state machine or just returning
IRQ_NONE, based on the IRQ status register, to the PHY driver. Having
3 different IRQ handling callbacks (.handle_interrupt(),
.did_interrupt() and .ack_interrupt() ) is confusing so let the PHY
driver implement directly an IRQ handler like any other device driver.
Make this driver follow the new convention.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 11:18:38 -08:00
Ioana Ciornei
aa2d603ac8 net: phy: ti: remove the use of .ack_interrupt()
In preparation of removing the .ack_interrupt() callback, we must replace
its occurrences (aka phy_clear_interrupt), from the 2 places where it is
called from (phy_enable_interrupts and phy_disable_interrupts), with
equivalent functionality.

This means that clearing interrupts now becomes something that the PHY
driver is responsible of doing, before enabling interrupts and after
clearing them. Make this driver follow the new contract.

Cc: Dan Murphy <dmurphy@ti.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 11:18:38 -08:00
Ioana Ciornei
1d1ae3c6ca net: phy: ti: implement generic .handle_interrupt() callback
In an attempt to actually support shared IRQs in phylib, we now move the
responsibility of triggering the phylib state machine or just returning
IRQ_NONE, based on the IRQ status register, to the PHY driver. Having
3 different IRQ handling callbacks (.handle_interrupt(),
.did_interrupt() and .ack_interrupt() ) is confusing so let the PHY
driver implement directly an IRQ handler like any other device driver.
Make this driver follow the new convention.

Cc: Dan Murphy <dmurphy@ti.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 11:18:38 -08:00
Ioana Ciornei
a4d7742149 net: phy: national: remove the use of the .ack_interrupt()
In preparation of removing the .ack_interrupt() callback, we must replace
its occurrences (aka phy_clear_interrupt), from the 2 places where it is
called from (phy_enable_interrupts and phy_disable_interrupts), with
equivalent functionality.

This means that clearing interrupts now becomes something that the PHY
driver is responsible of doing, before enabling interrupts and after
clearing them. Make this driver follow the new contract.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 11:18:37 -08:00